* Harlan Stenn <stenn@ntp.org>
Matthew Huff writes:
A backward step is a known issue and something that people are more comfortable dealing with as it can happen on any machine with a noisy clock crystal.
A clock crystal has to be REALLY bad for ntpd to need to step the clock.
Having 61 seconds in a minute or 86401 seconds in a day is a different story.
Yeah, leap years suck too.
And those jumps around daylight savings time.
Hi Harlan,

Leap years and DST adjustments have never caused us any major issues. It seems these code paths are well tested and work fine.

The leap second in 2012, however... total and utter carnage. Application servers, databases, etc. falling over like dominoes. All hands on deck in the middle of the night to clean up. It took days before we stopped finding broken stuff.

Maybe all the bugs from 2012 have been fixed. Maybe they haven't. Maybe new ones have been introduced. I'm not terribly optimistic. One example I'm aware of: Cisco Nexus 5010/5020 switches need software that was released as late as the 29th of April this year in order to be immune to the crash-and-burn leap second bug CSCub38654. The official «Cisco Suggested release based on software quality, stability and longevity» is older. Go figure.

In any case, we're certainly not going to risk it. So our plan is to disconnect our local stratum-2s from their upstreams on June 29th so they (and, more crucially, their downstream clients) remain oblivious to the leap second. Come July 1st, we'll reconnect them. The clients' clocks will be 1s (plus any drift) off at that point, but as we're running ntpd with the "-x" option, that shouldn't cause backwards stepping.

Running with slightly incorrect clocks for a few days is a small price to pay to avoid a repeat of 2012's mayhem.

Tore
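
For anyone weighing up the same plan, here is a rough back-of-envelope sketch of how long the 1s offset would take to slew away after reconnecting. It assumes ntpd's documented maximum slew rate of 500 ppm and that "-x" raises the step threshold from 128 ms to 600 s, so a 1s offset is slewed rather than stepped. The function name and constants are only illustrative, and the result is a lower bound: the clock discipline does not slew at the maximum rate the whole time, and local drift eats into it.

```python
# Rough estimate only: minimum time for ntpd to slew out a given offset.
# Assumption: slewing proceeds at ntpd's maximum rate of 500 ppm
# (500 microseconds of correction per second of elapsed time).
MAX_SLEW_PPM = 500.0

# Assumption: with "-x" the step threshold is 600 s instead of 0.128 s.
STEP_THRESHOLD_X_S = 600.0


def slew_seconds(offset_s, slew_ppm=MAX_SLEW_PPM):
    """Lower bound on the elapsed seconds needed to slew out offset_s."""
    return abs(offset_s) * 1e6 / slew_ppm


def will_slew(offset_s, threshold_s=STEP_THRESHOLD_X_S):
    """True if the offset is below the step threshold, so no step occurs."""
    return abs(offset_s) < threshold_s


if __name__ == "__main__":
    offset = 1.0  # the missed leap second
    t = slew_seconds(offset)
    print(f"slewed, not stepped: {will_slew(offset)}")
    print(f"at least {t:.0f} s (about {t / 60:.0f} min) to converge")
```

So the clients should be back within tolerance in well under a day after the upstreams are reconnected, assuming drift during the disconnected period stays small.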