No Leap Seconds here June 30, 2012 11:18 PM

I take it that Metafilter dodged the bullet tonight?

A lot of major sites seem to have been downed by the leap second and I'm hearing competing rumors about why. Some say that anything RedHat based is in trouble, others say it's just a chain reaction from AWS dying.

Either way, the site seems okay...?
posted by Tell Me No Lies to Uptime at 11:18 PM (78 comments total) 2 users marked this as a favorite

I knew something seemed a split second off.
posted by JohnnyGunn at 11:21 PM on June 30, 2012 [4 favorites]


You may have spoken too soon. It won't be 23:59:60 in MeFi's home timezone (Pacific) for another 34 minutes...
posted by oneswellfoop at 11:26 PM on June 30, 2012


RHEL 6 & Debian Squeeze seem to be the primary culprits.
posted by ferdinand.bardamu at 11:27 PM on June 30, 2012


When there's a leap second, does the ball that marks the occasion drop or rise?
posted by zippy at 11:28 PM on June 30, 2012 [1 favorite]


Although I run Debian Squeeze & I've had no problem...
posted by ferdinand.bardamu at 11:31 PM on June 30, 2012


I don't know about this... are you sure it's not the storm problem?
posted by taz (staff) at 11:44 PM on June 30, 2012 [4 favorites]


Yeah, between those two incidents this weekend, my pager's been going off like ... something that goes off a lot.
posted by ChrisR at 11:46 PM on June 30, 2012 [1 favorite]


Can somebody please explain here why the leap second is a problem?
posted by iamkimiam at 11:51 PM on June 30, 2012 [1 favorite]


Uh oh. I never stocked up on water and bread!! We're all gonna die!!!1!
posted by deborah at 11:56 PM on June 30, 2012 [2 favorites]


Because for many versions of the Linux kernel -- a majority of the ones in production use today -- there is a bug that causes the kernel to panic (crash) when NTP, which is a time synchronization protocol, tells it there's a leap second. [1] - serverfault analysis, [2] - Linux kernel mailing list discussion
posted by ChrisR at 11:57 PM on June 30, 2012 [2 favorites]


are you sure it's not the storm problem?

I'm not sure of anything at this point. LinkedIn IT was thinking it might be the leap second.
posted by Tell Me No Lies at 11:57 PM on June 30, 2012


It affects other things too, things that you don't necessarily realize run Linux. I can't really say what those things are for various reasons, but trust me, Linux runs a lot of things now.
posted by ChrisR at 11:58 PM on June 30, 2012 [1 favorite]


The whole night's fucked. These three kids came over for a sleepover, but two of them were forced by their parents because the real guest had separation issues. So we ordered pizza, but they brought their own food and wouldn't touch it. Then one of them got a "headache" and then they all went home, crushing my kid in a manner that only a wet tennis ball could fix.

But now we have two mostly uneaten pizzas, so if Fred Meyer is running its POS terminals on RHEL 6 or whatever, we're good for at least tomorrow.
posted by mph at 12:00 AM on July 1, 2012 [3 favorites]


We're alive!! *kisses server*

Yay!!!!!1!!! *hugs everyone*
posted by deborah at 12:01 AM on July 1, 2012


It is now 12 midnight in Oregon. Every body ok?
posted by Cranberry at 12:01 AM on July 1, 2012 [1 favorite]


Yay!!

I would like an uneaten pizza please. Starving.
posted by taz (staff) at 12:02 AM on July 1, 2012 [1 favorite]


Can somebody please explain here why the leap second is a problem?

In general terms it's because nobody ever thinks about leap seconds. You write code assuming 24 hours in a day, 60 minutes in an hour, and 60 seconds in a minute.

Usually that's not a problem, but there are some rare edge cases where it bites you. Today we appear to be seeing some of them.
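
To make that concrete, here is a minimal sketch in Python (my choice of language, nobody upthread mentioned it; `try_leap_second` is a made-up helper name). The standard `datetime` type only accepts seconds 0 through 59, so 23:59:60 cannot even be represented:

```python
from datetime import datetime


def try_leap_second():
    """Try to build the leap second 2012-06-30 23:59:60 as a datetime."""
    try:
        return datetime(2012, 6, 30, 23, 59, 60)
    except ValueError as exc:
        # datetime requires 0 <= second <= 59, so the leap second is rejected.
        return exc


result = try_leap_second()
print(type(result).__name__, result)
```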
posted by Tell Me No Lies at 12:02 AM on July 1, 2012


Oh, and happy July
posted by Cranberry at 12:03 AM on July 1, 2012


IF taz doesn't eat it all, I'd like a slice of pizza too. Please.
posted by Cranberry at 12:04 AM on July 1, 2012


Um. Holy crap. My birthday is next week.
posted by Night_owl at 12:05 AM on July 1, 2012


*bares teeth, snarls*
posted by taz (staff) at 12:05 AM on July 1, 2012 [2 favorites]


/em feeds the taz
posted by Night_owl at 12:06 AM on July 1, 2012


crushing my kid in a manner that only a wet tennis ball could fix.

I am not familiar with that idiom.

Two of my three guests also left early tonight but that was because they realized they would be billing beaucoup unscheduled consulting hours this evening.
posted by Tell Me No Lies at 12:07 AM on July 1, 2012 [1 favorite]


Have it all, taz. I'll go make some toast.
posted by Cranberry at 12:21 AM on July 1, 2012 [1 favorite]


Yawn. Yesterday was SUCH a long day. Goodnight.
posted by Cranberry at 12:31 AM on July 1, 2012 [2 favorites]


The leap second is inserted at midnight UTC, so as far as I can tell it shouldn't matter which timezone you're in.
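
A quick sketch of what that means for the Pacific coast, in Python. The assumption here is that Pacific Daylight Time can be treated as a fixed UTC-7 offset on this date; the variable names are just for illustration:

```python
from datetime import datetime, timedelta, timezone

# Assumption: Pacific Daylight Time is a fixed UTC-7 offset on this date.
PDT = timezone(timedelta(hours=-7), "PDT")

# The last ordinary second before the leap second, expressed in UTC.
before_leap_utc = datetime(2012, 6, 30, 23, 59, 59, tzinfo=timezone.utc)

# The same instant on a Pacific wall clock: late afternoon, not midnight.
before_leap_pacific = before_leap_utc.astimezone(PDT)

print(before_leap_pacific.isoformat())
```

So the extra second lands a little before 5 PM Pacific, hours before local midnight.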
posted by goodnewsfortheinsane at 12:34 AM on July 1, 2012 [4 favorites]


Oh. I guess that explains why I felt so tired when I went to sleep slightly past midnight. Nothing to do with all the cleaning I did yesterday.

In other news, I also made some cake last night and it turned out really really good. Feel free to pop in if you want some.
posted by daniel_charms at 12:56 AM on July 1, 2012 [1 favorite]


I take it that Metafilter dodged the bullet tonight?

I agree with your premise, but not your reasoning.
posted by mazola at 1:02 AM on July 1, 2012


I am not familiar with that idiom.

It was raining, but the boy didn't want to be in the house. So we walked over to his school playground, and on the way we found a tennis ball. We spent 30 minutes throwing it against the back wall, seeing what sort of patterns we could make. I guess he considers it a pretty big treat to play in the rain, and having the tennis ball seemed to complete the experience.
posted by mph at 1:31 AM on July 1, 2012 [23 favorites]


oneswellfoop writes "You may have spoken too soon. It won't be 23:59:60 in MeFi's home timezone (Pacific) for another 34 minutes..."

Isn't the leap second in UTC and therefore happening everywhere at the same time?
posted by Mitheral at 1:56 AM on July 1, 2012


mph, I don't know you or anything, but I think someone might have replaced your kid with a dog.
posted by item at 2:58 AM on July 1, 2012 [33 favorites]




I love imperfection.
posted by Kerasia at 4:16 AM on July 1, 2012


I love imperfection

Yeah, me to.
posted by the quidnunc kid at 7:10 AM on July 1, 2012 [2 favorites]


Shit.
posted by the quidnunc kid at 7:11 AM on July 1, 2012


Yeah, shit, I thought you did that on purpos.
posted by scratch at 7:16 AM on July 1, 2012


Bullitt was pretty good, and it's always fun to see McQueen at the top of his game. Why would we want to dodge him (unless he's driving, of course)?
posted by GenjiandProust at 7:24 AM on July 1, 2012


I've long had kind of a hobbyist interest in NTP, as I have kind of an exact time fetish, and so for example I run the Meinberg NTP server on my Win 7 machine and all my other PCs and devices sync to it (though I've had frustrations with an iOS implementation; a straight port of ntpd doesn't seem to work for me, very possibly because I'm too lazy to figure out why). But, as this is just a hobbyish interest and I'm not a sysadmin or anything, there's lots I don't know or worry about. I'm not sure what happened on my PC yesterday.

Someone had mentioned in a recent thread (about typical errors programmers make concerning time) that they thought that NTP skews across that 59th second to elongate it to two seconds in order to avoid such problems. Apparently that's not the case? Surely it's not an implementation decision, as how to handle leap seconds seems like something that would be defined in the protocol.
posted by Ivan Fyodorovich at 7:30 AM on July 1, 2012


I would like an uneaten pizza please. Starving.

They are definitely preferable to the eaten kind.
posted by ricochet biscuit at 7:41 AM on July 1, 2012 [2 favorites]


MetaFilter: I can't really say what those things are for various reasons, but trust me
posted by hippybear at 7:47 AM on July 1, 2012 [1 favorite]


Metafilter: preferable to the eaten kind
posted by parmanparman at 7:52 AM on July 1, 2012


> I have kind of an exact time fetish

Rule #34 strikes again.
posted by languagehat at 7:52 AM on July 1, 2012 [10 favorites]


MetaFilter: seems like something that would be defined in the protocol
posted by subbes at 7:54 AM on July 1, 2012


dammit languagehat you broke the streak
posted by subbes at 7:54 AM on July 1, 2012


The leap second issue was an issue with specific Linux kernels, I think, but Metafilter is running on Windows Server 2008, so it has another set of issues entirely. :)
posted by Pronoiac at 8:00 AM on July 1, 2012


Oh man, I slept terribly, and I was gonna blame the (granted, mild by national standards) heat and humidity, and the random firecrackers and the neighbor playing dance music at two in the morning and the sudden rain and the random sirens, but I think maybe I'll pin it on the leap second.
posted by cortex (staff) at 8:02 AM on July 1, 2012 [2 favorites]


/em feeds the taz -posted by Night_owl at 2:06 AM on July 1


My god man, it's after midnight. You never feed a taz AFTER MIDNIGHT.
posted by Atreides at 8:04 AM on July 1, 2012 [1 favorite]


Or get her wet.
posted by cjorgensen at 8:09 AM on July 1, 2012 [2 favorites]


Taz should be kept dry and in a shoebox under the bed, otherwise hijinks occur and a wizened Chinese guy will threaten to take her back.
posted by arcticseal at 8:16 AM on July 1, 2012 [1 favorite]


I was disarming a bomb last night, and boy did that leap second come in handy.
posted by TwelveTwo at 8:37 AM on July 1, 2012 [7 favorites]


I'm just glad to find out that the reason why the school server locked up hard and wouldn't talk to me today (not even with a local console) probably wasn't faulty hardware.
posted by flabdablet at 8:38 AM on July 1, 2012


the (granted, mild by national standards) heat and humidity, and the random firecrackers and the neighbor playing dance music at two in the morning and the sudden rain and the random sirens

What were you doing in midtown Atlanta last night? Granted, my neighbors were also playing "Call Me Maybe" every fourth song or so. I was going to go over there and fuss, but this heat wave has us in an air quality Code Purple so we're not even supposed to go outside.

damn guv'mint keeping us from berating our neighbors about their taste in latenight music
posted by catlet at 9:05 AM on July 1, 2012


Just checked all my checked out test clients at work; I have 17 Redhat machines running RHEL 4, 5 and 6 and they all seemed happy.
posted by octothorpe at 9:08 AM on July 1, 2012


Yeah, we're not running any of the systems that had leap second problems. Shew.
posted by pb (staff) at 9:44 AM on July 1, 2012 [1 favorite]


Yeah, we're not running any of the systems that had leap second problems. Shew.

Not anymore, you mean. Cut to a building engulfed in flames.
posted by TwelveTwo at 10:31 AM on July 1, 2012 [2 favorites]


NTP does specifically cover leap seconds, but I thought this note was interesting:

In the date and timestamp formats, the prime epoch, or base date of
era 0, is 0 h 1 January 1900 UTC, when all bits are zero. It should
be noted that strictly speaking, UTC did not exist prior to 1 January
1972, but it is convenient to assume it has existed for all eternity,
even if all knowledge of historic leap seconds has been lost.

Honestly I'm not sure what the implications of that are. Probably nothing.
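
In day-to-day code the 1900 epoch mostly shows up as a fixed offset from the Unix epoch. A rough Python sketch, where `ntp_to_unix` is a hypothetical helper and only the whole-seconds part of an era-0 timestamp is handled:

```python
from datetime import datetime

# Seconds between the NTP prime epoch (1900-01-01) and the Unix epoch
# (1970-01-01), counting every day as exactly 86400 seconds.
NTP_TO_UNIX = int((datetime(1970, 1, 1) - datetime(1900, 1, 1)).total_seconds())


def ntp_to_unix(ntp_seconds):
    """Convert the seconds field of an era-0 NTP timestamp to Unix time."""
    return ntp_seconds - NTP_TO_UNIX


print(NTP_TO_UNIX)
```

Note that the offset itself ignores leap seconds, which fits the "assume UTC has existed for all eternity" spirit of the note.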
posted by Tell Me No Lies at 10:46 AM on July 1, 2012 [3 favorites]


Honestly I'm not sure what the implications of that are.

Doom.

Probably nothing.

Dooooooooooooooooom.
posted by kmz at 11:39 AM on July 1, 2012 [8 favorites]


The Time Gnomes have to work so much more during leap year, moving everything just a little, all precisely, setting every frame so the camera can go off without showing any work, and all so we perceive time as continuous. Adding a leap second on top of all that? Ugh, it is no wonder I stubbed my toe this morning.
posted by TwelveTwo at 11:42 AM on July 1, 2012


I couldn't sleep last night because of the snakes.
posted by cjorgensen at 5:10 PM on July 1, 2012 [2 favorites]


We were on the 4th floor of a Portland Red Lion and the thump-thump-thump bass from the 6th floor lounge kept me awake. I spent the time being strangely mesmerized by the CNBC show "Princess" (we don't have cable at home so when I do watch TV--usually in a hotel--I have a hard time not being mesmerized by it, anyway, but watching spoiled-brat reality show people having to wash dishes in a restaurant was especially captivating). Then I couldn't get comfortable and when I finally did fall asleep, the baby in the next room started crying.

Damn leap second.
posted by WorkingMyWayHome at 6:19 PM on July 1, 2012


Look second before you leap second.
posted by flapjax at midnite at 9:11 PM on July 1, 2012 [1 favorite]


Anybody else's Snow Leopard install just freeze and require a reboot? Totally unresponsive - just a beach ball.
posted by obiwanwasabi at 1:35 AM on July 2, 2012


This just underlines that programmers should never be left unsupervised when anything requires them to interact with time in their code.
posted by maxwelton at 1:55 AM on July 2, 2012


It shouldn't even be left to committees, because those will often contain enough drones to make sure things stay mucked up for a very long time even if there are people on them who actually know what they're on about.

POSIX time is an abomination and needs to be fixed. NTP should distribute TAI, not UTC, systems should maintain leap second tables much as they currently do for time zones, and nobody should ever write clock-time-based code except by using standard libraries that have been rigorously peer reviewed and thoroughly debugged.
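
A minimal sketch of the table idea, in Python. The table is deliberately abbreviated to the last few published TAI-UTC values, and `LEAP_TABLE` and `tai_minus_utc` are names I made up for illustration, not anything a real system ships:

```python
from bisect import bisect_right
from datetime import datetime, timezone

# (UTC instant from which the offset applies, TAI - UTC in seconds).
# Abbreviated: a real table would go back to 1972 and be updated centrally.
LEAP_TABLE = [
    (datetime(1999, 1, 1, tzinfo=timezone.utc), 32),
    (datetime(2006, 1, 1, tzinfo=timezone.utc), 33),
    (datetime(2009, 1, 1, tzinfo=timezone.utc), 34),
    (datetime(2012, 7, 1, tzinfo=timezone.utc), 35),
]


def tai_minus_utc(utc_instant):
    """Look up the TAI - UTC offset in effect at a given UTC instant."""
    starts = [start for start, _ in LEAP_TABLE]
    index = bisect_right(starts, utc_instant) - 1
    if index < 0:
        raise ValueError("instant predates this abbreviated table")
    return LEAP_TABLE[index][1]


print(tai_minus_utc(datetime(2012, 6, 30, 12, tzinfo=timezone.utc)))
print(tai_minus_utc(datetime(2012, 7, 1, 12, tzinfo=timezone.utc)))
```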
posted by flabdablet at 4:07 AM on July 2, 2012


Well something's wrong. I woke up today and my hovercraft was full of eels.
posted by Splunge at 5:28 AM on July 2, 2012 [2 favorites]


Leap Second sounds like the name of some cruddy business book parable. Second Mouse Lite.
posted by dirtdirt at 5:30 AM on July 2, 2012


Who Moved My Leap Second?
posted by item at 5:53 AM on July 2, 2012 [2 favorites]


So yes, there was a kernel bug that caused panics on versions 2.6.29 and earlier. LKML thread with the gory details.

The site I work on runs newer kernels than that, but there was another bug introduced in 2008 that led to futex calls repeatedly timing out. The upshot of this was that every piece of software we had that used futexes (read: Java and a bunch of other things) promptly started spinlocking instead of polling, our CPU usage shot through the roof, and we had to restart pretty much all of our infrastructure.

All this because of a patch intended to speed up handling leap second insertion (a once-every-four-or-five-years event) by less than a millisecond.

Sometimes I hate my job.
posted by spitefulcrow at 9:24 AM on July 2, 2012 [1 favorite]


POSIX time is an abomination and needs to be fixed.

Agreed.

NTP should distribute TAI, not UTC

Disagree. UTC is the correct clock. NTP correctly flags the upcoming tick, which means your clock is always right to the civil clock, unless you do something stupid, like the Linux kernel did. GPS also announces the upcoming second, and if you read the ephemeris, can also give you the new conversions to UTC, and funny enough, TAI, even though TAI->GPS won't change. GPS is defined as TAI-19 seconds.

If you're going to use a non-civil timescale, use GPS, which gives you TAI for free and can be directly read into stratum-1 NTP servers by just reading the uncorrected GPS time and phase locking to the PPS second.
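
For the arithmetic, a small Python sketch using the commonly published relation TAI = GPS + 19 s and the TAI-UTC value of 35 s in effect from 2012-07-01. The function and constant names are made up for illustration:

```python
# TAI runs a constant 19 seconds ahead of GPS time.
TAI_MINUS_GPS = 19

# TAI - UTC as of 2012-07-01; this one changes with every leap second.
TAI_MINUS_UTC = 35


def gps_to_tai(gps_seconds):
    """Shift a GPS-timescale reading onto TAI (fixed offset)."""
    return gps_seconds + TAI_MINUS_GPS


def gps_minus_utc(tai_minus_utc=TAI_MINUS_UTC):
    """How far GPS time is ahead of UTC for a given TAI - UTC value."""
    return tai_minus_utc - TAI_MINUS_GPS


print(gps_to_tai(0), gps_minus_utc())
```

Only the second number moves when a leap second is inserted; the 19 stays put.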

But the world runs on UTC, the civil master clocks present an image of UTC, and that's the correct clock to present.

IMHO, of course.

systems should maintain leap second tables much as they currently do for time zones

One more update, no thank you. Again, leap seconds are a solved problem. You can't fix timezones, but all you need to do is *simply* spot the leap flag, and jump on the next leap second day. That's it. Idiots not handling it right should just *shut up* and let NTP do the job.

Yes, NTP cheats a bit, by simply not stepping the clock at 23:59:59, but it works if you need it, and if you don't, you can spot that and tack on 23:59:60. Solved problem, and it's only because people insist on fixing it again that it breaks. There are more precise systems if you need them. (Hint: Did you think that you might need them? You do not. You *know* if you need better precision.)
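
Spotting the flag really is small. A Python sketch of decoding the first byte of an NTP packet header, where the top two bits are the leap indicator; `decode_first_byte` and the sample byte `0x64` are my own illustration, not taken from any real capture:

```python
LEAP_MEANINGS = {
    0: "no warning",
    1: "last minute of the day has 61 seconds",
    2: "last minute of the day has 59 seconds",
    3: "unknown (clock not synchronized)",
}


def decode_first_byte(first_byte):
    """Split the first NTP header byte into (leap indicator, version, mode)."""
    leap = (first_byte >> 6) & 0b11      # top 2 bits
    version = (first_byte >> 3) & 0b111  # next 3 bits
    mode = first_byte & 0b111            # low 3 bits
    return leap, version, mode


# 0x64 is binary 01 100 100: leap indicator 1, version 4, mode 4 (server).
leap, version, mode = decode_first_byte(0x64)
print(LEAP_MEANINGS[leap], version, mode)
```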

and nobody should ever write clock-time-based code except by using standard libraries that have been rigorously peer reviewed and thoroughly debugged.

I cannot favorite that enough. My favorite was when Android phones couldn't autofocus on certain days because the autofocus code checked the clock, and it was a homebuilt clock routine.

The sheer number of fails there still takes my breath away.
posted by eriko at 11:09 AM on July 2, 2012 [1 favorite]


Disagree. UTC is the correct clock ... the world runs on UTC, the civil master clocks present an image of UTC, and that's the correct clock to present.

Present. That's the key word right there. UTC should certainly be available as a presentation choice, and should be handled just like any other timezone.

The original idea of the Unix clock - a simple integer based on a well-defined epoch that incremented once per second, to be converted to human-readable form as required - was, to my way of thinking, an excellent and beautiful idea. It just worked, regardless of the bizarre ways that humans choose to think about time, and regardless of the common habit of trying to treat times and dates as the same thing*. When I found out that POSIX actually mandates fiddling about with that clock to deal with leap seconds I had trouble believing it was true.

As you point out, the GPS clock now works pretty much that way, basically because if it worked any other way then GPS would be unnecessarily hard to do. It seems to me that kernel time clocks should also go back to doing that.

One more update, no thank you.

There's already a leap seconds file included in the existing TZ updates. Here's a way to use it whose basic approach strikes me as absolutely sound.

The only thing I don't like about it is the need to edit the leapseconds file to get rid of the leap seconds inserted before UTC->GPS synchronization. Given an accessible group of stratum 1 NTP servers handing out TAI instead of UTC, that wouldn't be necessary.
posted by flabdablet at 6:07 PM on July 2, 2012


*Given that times are derived from atomic clocks, while dates are derived from orbital movements, it should be obvious that they're not the same thing. Attempts to bend reality to prop up the illusion that 1 day = 86400 seconds are always going to cause trouble. I say we leave reality as it is, encapsulate all the associated politics in standard libraries, and then use them.
posted by flabdablet at 6:14 PM on July 2, 2012


Attempts to bend reality to prop up the illusion that 1 day = 86400 seconds are always going to cause trouble.

We're not bending reality to say that 1d=86.4ksec. We're bending reality because we insist that 0800 be in the morning and 2000 be in the evening.

Otherwise, we could drop leap years, leap seconds and time zones. The *whole reason* for them is that we want 0000 in the middle of the night, 1200 in the middle of the day, summer in the middle of June in the Northern Hemisphere, and a white Christmas there as well.

Indeed, if we'd stop trying to fake out the stratum-1 servers, we'd be fine. But no...we keep pretending that second-after-second is "natural."

There is nothing natural about the second. Not one goddamn thing. We have, figuratively*, pulled it out of our ass because 86.4K of them sort of kind of make a day, and, alas, the planet isn't that stable. Indeed, the entire problem with UTC/TAI is caused by the fact that we stopped defining the second as 1/86400 of a day and started defining it as the duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of 133Cs.

And, absolutely worst, you are making the fundamental mistake of the hacker. Do not put the smarts at the endpoint. That means there are millions of endpoints that have to run the right code, that have to be updated with changes (And they all won't be. You know it, I know it.) This is right up there with hardcoding the NTP server into the firmware in terms of dumb.

Broadcast the fundamental civilian standard from the top, and let the few that actually need to derive a continuous time scale deal with that. The real world? The real world is on UTC, they just read the bits, then display them. They're simple, and they work right. The vast majority of clocks need to display UTC. Send UTC, and deal with the damn few exceptions *as exceptions*. And if Linus and gang had a clue, they wouldn't have a kernel crash because of the change. But no....they had to try to put the smarts at the endpoint. Idiots.

Broadcasting TAI means you're making the exception -- needing TAI -- as the rule. Yes, that makes it easier on you, the programmer, but your job is to be smart. Learn how to deal with it and stop whinging because someone made you work a little harder to handle your exception, or stop writing time code.

And, seriously, if you need TAI, you get a GPS clock and add 19 seconds to the raw number, or if you need more accuracy than that, well, you're a national standards laboratory or working with one, and you already know how to get it, and if you don't, you need to put down the computer and call a real expert on time.

Again -- there are six billion plus people on this planet who care about an offset of UTC. Not TAI. Not UT1. Not the GPS timescale. UTC is the clock of the real world, and the core sources -- the top tier labs, the stratum-1 NTP servers, need to be broadcasting the standard clock of the real world. Just like they are now.

Present the expected default, and let those who need the exception handle it.

Dammit.

Want to hear about screwed up exceptions? NIST has made a clock source that they literally can not tell you how accurate it is, because it's more accurate than the duration of 1 period of the radiation corresponding to the transition between the two hyperfine levels of the ground state of 133Cs. The error is, quite literally, $UNDEF. They're going to have to come up with a new standard second to describe this. Which means, of course, unless we are fantastically lucky, we are about to have to slew TAI, when we go from the cesium based TAI to the new TAI. This will, of course, slew UT1 and UTC, but not GPS, because the GPS epoch is TAI as of 06-Jan-1980 00:00:00+19 seconds.

Is your code ready for that? Or are you smart and lazy and letting NTP deal with it?


* Yes, I originally wrote literally.
posted by eriko at 8:24 PM on July 2, 2012 [1 favorite]


Do not put the smarts at the endpoint

Point taken. But my objection is that what's currently at the endpoints is not so much smart as perverse, broken and wrong.

We're not bending reality to say that 1d=86.4ksec

Given that we have decided that the SI second is no longer derived from the length of the mean solar day, then we are trying to bend reality when we insist that every day should consist of 86400 of those seconds. Times and dates stopped being the same thing when the SI second was defined.

UTC deals with this by making the date a derived quantity and occasionally declaring a day to be 86401 seconds long. For POSIX time the days are the primary quantity and the seconds are derived; there are always 86400 POSIX seconds in a POSIX day.
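
Here is a small Python sketch of what that looks like around this weekend's leap second. `calendar.timegm` does plain 86400-seconds-per-day arithmetic, which is exactly the POSIX rule; the variable names are mine:

```python
import calendar

# POSIX timestamps on either side of the 2012 leap second.
before = calendar.timegm((2012, 6, 30, 23, 59, 59))
after = calendar.timegm((2012, 7, 1, 0, 0, 0))

# timegm just does arithmetic, so 23:59:60 lands on the same number
# as 00:00:00 of the next day.
leap = calendar.timegm((2012, 6, 30, 23, 59, 60))

posix_elapsed = after - before  # what POSIX time says passed
si_elapsed = posix_elapsed + 1  # what actually passed, counting 23:59:60

print(before, leap, after, posix_elapsed, si_elapsed)
```

Two SI seconds went by, POSIX counted one, and the leap second has no timestamp of its own.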

To my way of thinking it is just ugly and wrong for POSIX time to be based on UTC with deleted leap seconds. It should have been based on UT2R, and there should be a bunch of stratum-2 NTP servers available that hand out UT2R time. This is essentially the approach taken by Google with its "leap smear", but with more emphasis placed on keeping the duration of seconds consistent than on having them phase-locked to UTC.

If we want seconds that have the property of being as close to exactly the same duration each as we can possibly arrange, then we should be using NTP or something more precise than NTP to distribute TAI, plus centrally administered authoritative tables to define the date that any given TAI second belongs in or the "wall" time it should be displayed as, should that be required.

Trying to use the same infrastructure for both accurate timekeeping and accurate datekeeping can only ever work properly if the endpoints are smart.
posted by flabdablet at 9:41 PM on July 2, 2012


systems should maintain leap second tables

Uhh... leap seconds are usually only scheduled a few months in advance. I'm not sure keeping all these tables updated would be any prettier than the existing situation.
posted by Tell Me No Lies at 12:42 AM on July 3, 2012


A few months is more warning than is often given for Daylight Saving changes, so dealing with updates with that kind of urgency is already common practice.
posted by flabdablet at 7:26 AM on July 3, 2012


What happened while I was out on vacation? a Time Shift?
posted by infini at 11:45 AM on July 3, 2012


Someone had mentioned in a recent thread (about typical errors programmers make concerning time) that they thought that NTP skews across that 59th second to elongate it to two seconds in order to avoid such problems.

Google has a patched NTP that does this (slews it out milliseconds at a time over the course of a day, I think), but it's not the normal behavior.
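
As a rough idea of the approach, here is a hypothetical linear smear in Python. To be clear, this is not Google's implementation, and the 24-hour window and the name `smear_offset` are my assumptions; it only shows how one extra second could be spread thinly across a window:

```python
def smear_offset(seconds_into_window, window=86400.0):
    """Portion of the extra second absorbed so far, in seconds (0.0 to 1.0).

    Assumes a simple linear ramp over `window` seconds.
    """
    clamped = min(max(seconds_into_window, 0.0), window)
    return clamped / window


# Halfway through the window, the smeared clock is half a second behind
# where an unsmeared clock would be.
print(smear_offset(43200.0))
```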
posted by kenko at 12:22 PM on July 3, 2012


eriko is the smartest person I never met.

flabdablet is pretty dang sharp too.

Also: NERD FIGHT!!!

posted by slogger at 1:44 PM on July 3, 2012

