The IERS will be adding a second to time again on my birthday; 2015-06-30T23:59:59 2015-06-30T23:59:60 2015-07-01T00:00:00 Have fun, everyone. :-) Cheers, -- jra -- Jay R. Ashworth Baylink jra@baylink.com Designer The Things I Think RFC 2100 Ashworth & Associates http://www.bcp38.info 2000 Land Rover DII St Petersburg FL USA BCP38: Ask For It By Name! +1 727 647 1274
So you need to wait one more second before you may pop the bottle? :) On Fri, June 19, 2015 7:06 pm, Jay Ashworth wrote:
The IERS will be adding a second to time again on my birthday;
2015-06-30T23:59:59 2015-06-30T23:59:60 2015-07-01T00:00:00
Have fun, everyone. :-)
Cheers, -- jra
On (2015-06-19 13:06 -0400), Jay Ashworth wrote: Hey,
The IERS will be adding a second to time again on my birthday;
2015-06-30T23:59:60
Hopefully this is the last leap second we'll ever see. Non-monotonic time is an abomination, and very, very few programs measuring the passage of time are correct. Even those which are correct usually are not portable; most languages do not even offer monotonic time in their standard libraries. Canada, China, England and Germany, shame on you for opposing leapsecondless UTC. Next year hopefully GPSTIME. TAI and UTC are the same thing, with different static offset. -- ++ytti
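To illustrate the point about measuring the passage of time, here is a minimal Python sketch. Python is one of the languages that does ship a monotonic clock (`time.monotonic()`, in the standard library since 3.3); the helper name `measure` is made up for this example. Unlike `time.time()`, the monotonic clock is not affected when the system clock is stepped.

```python
import time

def measure(fn):
    """Run fn() and return (result, elapsed_seconds).

    Uses time.monotonic(), which never goes backwards, instead of
    time.time(), which follows the (steppable) system clock.
    """
    start = time.monotonic()
    result = fn()
    elapsed = time.monotonic() - start
    return result, elapsed

result, elapsed = measure(lambda: sum(range(100_000)))
print(f"took {elapsed:.6f} s")
```

An interval computed this way cannot come out negative, whatever NTP does to the wall clock in the meantime.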
Saku Ytti writes:
Hopefully this is last leap second we'll ever see. Non-monotonic time is an abomination and very very few programs measuring passage of time are correct. Even those which are, usually are not portable, most languages do not even offer monotonic time in standard libraries. Canada, China, England and Germany, shame on you for opposing leapsecondless UTC.
It's a problem with POSIX, not UTC. UTC is monotonic.
Next year hopefully GPSTIME. TAI and UTC are the same thing, with different static offset.
The General Timestamp API that Network Time Foundation is working on can solve this problem. People use different timescales for different reasons. The Agile folks like the "pigs and chickens" analogy: in a bacon and egg breakfast, the chicken is invested while the pig is committed. It's lame for a chicken to dictate to a pig. It's lame to change an existing Standard. Leave that one alone and choose to follow a new/different Standard. If you don't have a system that can properly handle leapseconds, there are several solutions to this, including: - implement DLM's leap second process in the kernel, described over 20 years ago - use the posix-right timezone files - help Network Time Foundation get the General Timestamp API implemented and deployed, which will let folks use whatever timescale they want. -- Harlan Stenn <stenn@ntp.org> http://networktimefoundation.org - be a member!
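The "problem with POSIX, not UTC" remark can be made concrete with a small Python sketch. It assumes only the standard `calendar.timegm`, which applies the POSIX seconds-since-epoch formula (every day counted as 86400 seconds). Under that formula the leap second 23:59:60 and the following midnight collapse to the same number.

```python
import calendar

# Broken-down UTC fields: (year, month, day, hour, minute, second)
leap_second = (2015, 6, 30, 23, 59, 60)
next_midnight = (2015, 7, 1, 0, 0, 0)

# calendar.timegm uses the POSIX formula, which assumes 86400-second days,
# so it has no distinct value left over for second number 60.
t_leap = calendar.timegm(leap_second)
t_next = calendar.timegm(next_midnight)

print(t_leap, t_next, t_leap == t_next)
```

UTC gives the two instants different labels; the POSIX counter gives them the same value, which is why a clock that follows it has to repeat or stretch a second.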
On Sat 2015-06-20T10:48:17 +0300, Saku Ytti hath writ:
You're right. Hopefully POSIX will become monotonic next year, by removal of leaps from UTC.
Probably not. The ITU-R has outlined four methods for this issue, see http://www.acma.gov.au/Industry/Spectrum/Spectrum-planning/International-pla... where of method A1, A2, B, C1, C2, and D not all of them remove the leap second from UTC. In any case, previous draft proposals have all specified a 5 year interval from deciding to change until the change happens, so we should plan for 5 more years of leap seconds no matter what. -- Steve Allen <sla@ucolick.org> WGS-84 (GPS) UCO/Lick Observatory--ISB Natural Sciences II, Room 165 Lat +36.99855 1156 High Street Voice: +1 831 459 3046 Lng -122.06015 Santa Cruz, CA 95064 http://www.ucolick.org/~sla/ Hgt +250 m
----- Original Message -----
- use the posix-right timezone files
What; not posixly-correct? Cheers, -- jr ':-)' a
On Sat, 20 Jun 2015 11:32:53 -0400, Jay Ashworth said:
----- Original Message -----
- use the posix-right timezone files
What; not posixly-correct?
I wonder how many of us are old enough to remember what that environment variable *used* to be called before political correctness became important.
----- Original Message -----
From: "Valdis Kletnieks" <Valdis.Kletnieks@vt.edu>
On Sat, 20 Jun 2015 11:32:53 -0400, Jay Ashworth said:
----- Original Message -----
- use the posix-right timezone files
What; not posixly-correct?
I wonder how many of us are old enough to remember what that environment variable *used* to be called before political correctness became important.
There are so many layers in that observation that I'm lost. Was posixly-correct a purposeful pun on politically correct, and I've missed it all these decades? Or was it named something else earlier than that, which wasn't itself politically correct? Cheers, -- jra
On Sat, 20 Jun 2015 19:06:29 -0400, Jay Ashworth said:
----- Original Message -----
From: "Valdis Kletnieks" <Valdis.Kletnieks@vt.edu>
I wonder how many of us are old enough to remember what that environment variable *used* to be called before political correctness became important.
There are so many layers in that observation that I'm lost.
Was posixly-correct a purposeful pun on politically correct, and I've missed it all these decades?
Or was it named something else earlier than that, which wasn't itself politically correct?
I'll let the perpetrator, Richard Stallman, explain. It was a kerfuffle regarding whether /bin/du should use units of 1,000 or 1024. http://karmak.org/archive/2003/01/12-14-99.epl.html
On Sun, Jun 21, 2015 at 1:06 AM, <Valdis.Kletnieks@vt.edu> wrote:
On Sat, 20 Jun 2015 19:06:29 -0400, Jay Ashworth said: [snip] I'll let the perpetrator, Richard Stallman, explain. It was a kerfluffle regarding whether /bin/du should use units of 1,000 or 1024.
It's not 1024 vs 1000; it's 1024 vs 512. If it's "du" or "df"; the display is supposed to be the number of 512-Byte blocks..... Thankfully, the -k and -g options were added to display in Kilobyte or Gigabyte units which are more human understandable and familiar. Some of the GNU utilities play fast and loose on the spec and default to 1024-byte blocks. If you set POSIXLY_CORRECT in the environment, they will show in 512 byte blocks, or the disk sector size in bytes, instead, like they are "supposed to" -- -JH
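The unit difference is plain arithmetic, sketched below in Python. This is not how du or df compute their output (they report allocated blocks from the filesystem); the function name `blocks` and the sample size are invented for illustration.

```python
def blocks(size_bytes: int, block_size: int) -> int:
    """Number of block_size units needed to hold size_bytes (rounded up)."""
    return -(-size_bytes // block_size)

size = 10_000  # hypothetical size in bytes

posix_units = blocks(size, 512)   # 512-byte units, as the spec describes
gnu_units = blocks(size, 1024)    # 1024-byte units, the GNU default

print(posix_units, gnu_units)
```

The same amount of data reads roughly twice as large in 512-byte units, which is why the default matters to anyone parsing the output.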
----- Original Message -----
From: "Jimmy Hess" <mysidia@gmail.com>
On Sun, Jun 21, 2015 at 1:06 AM, <Valdis.Kletnieks@vt.edu> wrote:
On Sat, 20 Jun 2015 19:06:29 -0400, Jay Ashworth said: [snip] I'll let the perpetrator, Richard Stallman, explain. It was a kerfluffle regarding whether /bin/du should use units of 1,000 or 1024.
It's not 1024 vs 1000; it's 1024 vs 512.
If it's "du" or "df"; the display is supposed to be the number of 512-Byte blocks..... [ ... ] If you set POSIXLY_CORRECT in the environment, they will show in 512 byte blocks, or the disk sector size in bytes, instead, like they are "supposed to"
Yes, but Valdis' "politically correct" reference goes to the original name of the environment variable, which I once knew, but had forgotten, was POSIX_ME_HARDER. Cheers, -- jra
Harlan Stenn <stenn@ntp.org> wrote:
It's a problem with POSIX, not UTC.
UTC is monotonic.
The problems are that UTC is unpredictable, and it breaks the standard labelling of points in time that was used for hundreds (arguably thousands) of years before 1972. Tony. -- f.anthony.n.finch <dot@dotat.at> http://dotat.at/ Irish Sea: Northwesterly 4 or 5, occasionally 6 at first, becoming variable 4. Slight or moderate. Mainly fair. Good.
On Mon, Jun 22, 2015 at 01:15:41PM +0100, Tony Finch <dot@dotat.at> wrote a message of 15 lines which said:
The problems are that UTC is unpredictable,
That's because the earth rotation is unpredictable. Any time based on this buggy planet's movements will be unpredictable. Let's patch it now!
On 22 Jun 2015, at 12:27 , Stephane Bortzmeyer <bortzmeyer@nic.fr> wrote:
On Mon, Jun 22, 2015 at 01:15:41PM +0100, Tony Finch <dot@dotat.at> wrote a message of 15 lines which said:
The problems are that UTC is unpredictable,
That's because the earth rotation is unpredictable. Any time based on this buggy planet's movements will be unpredictable. Let's patch it now!
So we need a new center of the universe and switch to stardate and thus solve the 32bit UNIX time problem for real this time?
On Mon, Jun 22, 2015 at 12:38:28PM +0000, Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net> wrote a message of 17 lines which said:
So we need a new center of the universe and switch to stardate and thus solve the 32bit UNIX time problem for real this time?
Or simply use TAI which is the obvious time reference for Internet devices. Using UTC in routers is madness. Routers and Internet servers should use TAI internally and use UTC only when communicating with humans (the inferior life form which crawls on the Earth surface and cares about things like whether the sun is high at noon, for outside picnics).
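A rough Python sketch of the "TAI internally, UTC for humans" idea follows. It assumes a leap second table has to be shipped with the system. The table is deliberately abbreviated to two entries (TAI-UTC was 35 s from 2012-07-01 and is 36 s from 2015-07-01), and the names `TAI_MINUS_UTC` and `utc_to_tai` are hypothetical.

```python
from datetime import datetime, timedelta

# (UTC instant at which the offset takes effect, TAI-UTC in seconds).
# Abbreviated table for illustration; a real one goes back to 1972.
TAI_MINUS_UTC = [
    (datetime(2012, 7, 1), 35),
    (datetime(2015, 7, 1), 36),
]

def utc_to_tai(utc: datetime) -> datetime:
    """Convert a UTC label to the corresponding TAI label."""
    offset = None
    for effective, seconds in TAI_MINUS_UTC:
        if utc >= effective:
            offset = seconds
    if offset is None:
        raise ValueError("instant predates this abbreviated table")
    return utc + timedelta(seconds=offset)

before = utc_to_tai(datetime(2015, 6, 30, 23, 59, 59))
after = utc_to_tai(datetime(2015, 7, 1, 0, 0, 0))
print((after - before).total_seconds())
```

The two UTC labels look one second apart, but two TAI seconds elapse between them; the extra one is the leap second that a TAI-based counter simply counts through.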
On (2015-06-22 14:44 +0200), Stephane Bortzmeyer wrote:
Or simply use TAI which is the obvious time reference for Internet devices. Using UTC in routers is madness. Routers and Internet servers should use TAI internally and use UTC only when communicating with humans (the inferior life form which crawls on the Earth surface and cares about things like whether the sun is high at noon, for outside picnics).
I couldn't agree more. But out of curiosity does anyone have scoop why TAI exists? I believe GPSTIME predates it, which appears analogous to TAI. -- ++ytti
On Mon, Jun 22, 2015 at 8:44 AM, Stephane Bortzmeyer <bortzmeyer@nic.fr> wrote:
On Mon, Jun 22, 2015 at 12:38:28PM +0000, Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net> wrote a message of 17 lines which said:
So we need a new center of the universe and switch to stardate and thus solve the 32bit UNIX time problem for real this time?
Or simply use TAI which is the obvious time reference for Internet devices. Using UTC in routers is madness. Routers and Internet servers should use TAI internally and use UTC only when communicating with humans (the inferior life form which crawls on the Earth surface and cares about things like whether the sun is high at noon, for outside picnics).
If the Earth's core ever decides to have some real fun and causes there to be a negative leap second (there is historical precedent for this, albeit before the existence of UTC and atomic time) there would be a minute with only 59 seconds, and I would expect things to break in new and creative ways. We live in a relatively narrow slice of time (a few decades) where this is a possibility, but it is a possibility. (Note that the number of leap seconds per year has _slowed_ recently, corresponding to a speed up in the long term averaged rotation of the Earth. If that speed up of the Earth's rotation were to happen again, negative leap seconds would be inevitable.) The drift between the Earth's time and atomic time will just get worse over longer time frames (see http://www.ucolick.org/~sla/leapsecs/year2100.html ). Even if UTC - TAI is just fixed (i.e., no more leap seconds), that is just pushing the problem down the road, and our grandkids will have to deal with leap minutes, or our remoter descendants with leap hours.
It's a problem with POSIX, not UTC.
Yes. I remember this being raised by people at the USNO back in the early 1990's, but there was no interest in changing POSIX. Too much installed base was the reason stated IIRC. My opinion is (and has been since the early 90's) that the computer / Internet world should just adopt IAT as the time system in use. That is the best time we have, and it will never have steps. Yes, that means that you would need something like a time zone file to set your system clock by hand from local (UTC) time, but, then, we already have to have time zone files. Regards Marshall Eubanks
I do feel sorry for you unix/linux users having a problem in year 2038.... fortunately I get another ~ 8 years... my Amiga gets its first big problem in 2046 ;-) http://web.archive.org/web/19981203142814/http://www.amiga.com/092098-y2k.ht... alan PS if i get to see the 2078 issue I'll be old enough to fuss about other things than a 2 digit date display..and I'm sure if I'm around until 7 February, 2114, 06:28:16 I'll have more to worry about than an old Amiga finally reaching the end of its useful life...unless its actually driving my life support system! ;-)
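The rollover dates mentioned above can be checked with a short Python sketch. It assumes a 32-bit seconds counter, the Unix epoch of 1970-01-01 and the Amiga epoch of 1978-01-01, and it ignores leap seconds (as `datetime` does); the function name `rollover` is made up here.

```python
from datetime import datetime, timedelta

def rollover(epoch: datetime, bits: int, signed: bool) -> datetime:
    """First instant that no longer fits in a seconds counter of this width."""
    limit = 2 ** (bits - 1) if signed else 2 ** bits
    return epoch + timedelta(seconds=limit)

unix_epoch = datetime(1970, 1, 1)
amiga_epoch = datetime(1978, 1, 1)

unix_signed = rollover(unix_epoch, 32, signed=True)
amiga_signed = rollover(amiga_epoch, 32, signed=True)
amiga_unsigned = rollover(amiga_epoch, 32, signed=False)

print(unix_signed, amiga_signed, amiga_unsigned)
```

The eight-year head start is just the difference between the two epochs.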
Stephane Bortzmeyer <bortzmeyer@nic.fr> wrote:
That's because the earth rotation is unpredictable. Any time based on this buggy planet's movements will be unpredictable. Let's patch it now!
http://mm.icann.org/pipermail/tz/2015-May/022280.html http://mm.icann.org/pipermail/tz/2015-May/022281.html http://mm.icann.org/pipermail/tz/2015-May/022282.html Tony. -- f.anthony.n.finch <dot@dotat.at> http://dotat.at/ Northwest Faeroes, Southeast Iceland: Northeasterly 3 or 4. Moderate, becoming mainly slight. Mainly fair. Good, occasionally poor in Southeast Iceland.
On Mon, Jun 22, 2015, 08:29 Stephane Bortzmeyer <bortzmeyer@nic.fr> wrote:
On Mon, Jun 22, 2015 at 01:15:41PM +0100, Tony Finch <dot@dotat.at> wrote a message of 15 lines which said:
The problems are that UTC is unpredictable,
That's because the earth rotation is unpredictable. Any time based on this buggy planet's movements will be unpredictable. Let's patch it now!
So, what we should do is make clocks move .99999 slower half of the year (and then speed back up) so that we're really in line with earth's rotational time. I mean we've got the computers to do it (I think most RTC only go down to thousandths so it'll still need a little skewing but I'm sure we'll manage).
Ps - if anyone actually does this, I'm going postal.
shawn wilson <ag4ve.us@gmail.com> wrote:
So, what we should do is make clocks move. 99999 slower half of the year (and then speed back up) so that we're really in line with earth's rotational time.
That's how UTC worked in the 1960s. ftp://maia.usno.navy.mil/ser7/tai-utc.dat It causes problems for systems that have a tight coupling between time and frequency - broadcast, cellular, etc. Tony. -- f.anthony.n.finch <dot@dotat.at> http://dotat.at/ Fair Isle, Southeast Faeroes: Northeasterly 5 or 6 backing northerly 4 or 5. Moderate, occasionally rough at first. Mainly fair. Good.
On Mon, Jun 22, 2015 at 7:17 AM, shawn wilson <ag4ve.us@gmail.com> wrote:
So, what we should do is make clocks move. 99999 slower half of the year (and then speed back up) so that we're really in line with earth's rotational time. I mean we've got the computers to do it (I think most RTC only go down to thousandths so it'll still need a little skewing but I'm sure we'll manage).
Ps - if anyone actually does this, I'm going postal.
http://googleblog.blogspot.com/2011/09/time-technology-and-leaping-seconds.h... comes dangerously close to your modest proposal. Damian
On 25 Jun 2015, at 03:14, Damian Menscher via NANOG <nanog@nanog.org> wrote:
http://googleblog.blogspot.com/2011/09/time-technology-and-leaping-seconds.h... comes dangerously close to your modest proposal.
Damian
I wonder why Google hasn't published the patch yet. Leap smear sounds like the sane way to do leap seconds, and it wouldn't break software at all, because time adjustments in the sub-second area are proven to work quite well. Btw. there seem to be a couple of public Google timeservers; I wonder whether one could just sync time from there to get leap smearing. time[1-4].google.com Also this update looks like it would smooth the process: https://rhn.redhat.com/errata/RHBA-2015-1159.html https://bugzilla.redhat.com/show_bug.cgi?id=1214752 -Stefan
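For readers unfamiliar with the idea, a leap smear spreads the extra second over a long window instead of inserting it all at once. The Python sketch below uses a plain linear ramp over an assumed 24-hour window. It is only an illustration of the concept, not Google's published method or chrony's implementation, and the name `smear_offset` is hypothetical.

```python
def smear_offset(seconds_into_window: float, window: float = 86400.0) -> float:
    """Fraction of the leap second already absorbed, for a linear smear.

    window is the smear length in seconds (24 hours is an assumption here).
    """
    if seconds_into_window <= 0:
        return 0.0
    if seconds_into_window >= window:
        return 1.0
    return seconds_into_window / window

samples = [smear_offset(t) for t in (0, 21600, 43200, 86400)]
print(samples)
```

With these numbers the clock runs slow by about 11.6 parts per million during the window, far below what most software notices, but every client has to smear the same way for the clocks to agree.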
On Wed, Jun 24, 2015 at 9:48 PM, Stefan Schlesinger <sts@ono.at> wrote:
On 25 Jun 2015, at 03:14, Damian Menscher via NANOG <nanog@nanog.org> wrote:
http://googleblog.blogspot.com/2011/09/time-technology-and-leaping-seconds.h...
comes dangerously close to your modest proposal.
I wonder why Google hasn't published the patch yet. Leap smear sounds like the sane way to do leap seconds, and it wouldn't break software at all, because time adjustments in the sub-second area are proven to work quite well.
Btw. there seem to be a couple of public Google timeservers; I wonder whether one could just sync time from there to get leap smearing.
I'd be cautious about that approach. I don't think they've been advertised for public use, so they could go away without notice. Also, definitely don't mix them with normal servers, as that would just confuse your clocks (which might smear *and* leap or something equally insane). Damian
* Stefan Schlesinger <sts@ono.at>
On 25 Jun 2015, at 03:14, Damian Menscher via NANOG <nanog@nanog.org> wrote:
http://googleblog.blogspot.com/2011/09/time-technology-and-leaping-seconds.h... comes dangerously close to your modest proposal.
I wonder why Google hasn't published the patch yet. Leap smear sounds like the sane way to do leap seconds, and it wouldn't break software at all, because time adjustments in the sub-second area are proven to work quite well.
It's implemented in chronyd versions 2.0 and up, for what it's worth. The required config directive is "leapsecmode slew". There's a nice blog post explaining how this feature, as well as some other approaches on how to deal with the leap second, work here: http://developerblog.redhat.com/2015/06/01/five-different-ways-handle-leap-s... Tore
Damian Menscher via NANOG <nanog@nanog.org> wrote:
http://googleblog.blogspot.com/2011/09/time-technology-and-leaping-seconds.h... comes dangerously close to your modest proposal.
Also http://developerblog.redhat.com/2015/06/01/five-different-ways-handle-leap-s... Tony. -- f.anthony.n.finch <dot@dotat.at> http://dotat.at/ Southwest Viking: Northwesterly 3 or 4, veering southeasterly 4 or 5 later. Slight or moderate. Fair. Good.
Tony Finch writes:
Harlan Stenn <stenn@ntp.org> wrote:
It's a problem with POSIX, not UTC.
UTC is monotonic.
The problems are that UTC is unpredictable, and it breaks the standard labelling of points in time that was used for hundreds (arguably thousands) of years before 1972.
You mean back when seconds were rubbery, and before the earth's rotational speed could be easily and accurately measured, or at least when the wobbles at that level of accuracy became so noticeable that they could no longer be ignored? H
On Jun 19, 2015 2:05 PM, "Saku Ytti" <saku@ytti.fi> wrote:
On (2015-06-19 13:06 -0400), Jay Ashworth wrote:
Hey,
The IERS will be adding a second to time again on my birthday;
2015-06-30T23:59:60
Hopefully this is last leap second we'll ever see. Non-monotonic time is an abomination and very very few programs measuring passage of time are correct. Even those which are, usually are not portable, most languages do not even offer monotonic time in standard libraries. Canada, China, England and Germany, shame on you for opposing leapsecondless UTC.
Next year hopefully GPSTIME. TAI and UTC are the same thing, with different static offset.
Unlikely but here's hoping. I mean letting computers figure out slower earth rotation on the fly would seem more accurate than leap seconds anyway. And then all of us who do earthly things and would like simpler libraries could live in peace.
shawn wilson writes:
... I mean letting computers figure out slower earth rotation on the fly would seem more accurate than leap seconds anyway. And then all of us who do earthly things and would like simpler libraries could live in peace.
Really? Have you looked in to those calculations, and I'm only talking about the allegedly predictable parts of those calculations, not things like the jetstream, the circumpolar currents, or earthquakes. H
On Sat, Jun 20, 2015, 14:16 Harlan Stenn <stenn@ntp.org> wrote:
shawn wilson writes:
... I mean letting computers figure out slower earth rotation on the fly would seem more accurate than leap seconds anyway. And then all of us who do earthly things and would like simpler libraries could live in peace.
Really? Have you looked in to those calculations, and I'm only talking about the allegedly predictable parts of those calculations, not things like the jetstream, the circumpolar currents, or earthquakes.
Ok, forget that point - AFAIK, the only things that matter wrt time are agreement on interval/counter and epoch, and stability. Right now we only have agreement on interval. So while I'd prefer a consistent epoch and counter, I'll live with whatever as long as we have access to broad agreement and stability (like this doesn't hit NANOG every time with "uh oh").
Subject: REMINDER: LEAP SECOND Date: Fri, Jun 19, 2015 at 01:06:22PM -0400 Quoting Jay Ashworth (jra@baylink.com):
The IERS will be adding a second to time again on my birthday;
This time around there are a number of Vendor C devices that will fail in spectacular ways if not upgraded with a pretty new release -- Nexus and ASR1K being the two most "interesting" among those I've reviewed. http://www.cisco.com/web/about/doing_business/leap-second.html#~ProductInfor... -- Måns Nilsson primary/secondary/besserwisser/machina MN-1334-RIPE +46 705 989668 I'd like some JUNK FOOD ... and then I want to be ALONE --
The universal workaround is to simply disable NTP on your devices sometime on Leap-Second eve. This will let the clocks free-run over the one-second push, an event of which they will be blissfully ignorant. When you re-enable NTP after The Leap, normal, non-destructive, NTP convergence will occur. Better, if you have a master NTP site clock, you need only disable its upstream NTP feed to isolate all the subsidiary devices. If you don't have such a master clock, this is an excellent time to set one up. I have found the Time Machines TM1000A GPS time server very inexpensive and super reliable: http://www.newegg.com/Product/Product.aspx?Item=0N6-001Y-00007 -mel
On Jun 19, 2015, at 11:08 AM, Måns Nilsson <mansaxel@besserwisser.org> wrote:
Subject: REMINDER: LEAP SECOND Date: Fri, Jun 19, 2015 at 01:06:22PM -0400 Quoting Jay Ashworth (jra@baylink.com):
The IERS will be adding a second to time again on my birthday;
This time around there are a number of Vendor C devices that will fail in spectacular ways if not upgraded with a pretty new release -- Nexus and ASR1K being the two most "interesting" among those I've reviewed.
http://www.cisco.com/web/about/doing_business/leap-second.html#~ProductInfor...
-- Måns Nilsson primary/secondary/besserwisser/machina MN-1334-RIPE +46 705 989668 I'd like some JUNK FOOD ... and then I want to be ALONE --
On Fri, Jun 19, 2015 at 06:29:34PM +0000, Mel Beckman wrote:
The universal workaround is to simply disable NTP on your devices sometime on Leap-Second eve. This will let the clocks free-run over the one-second push, an event of which they will be blissfully ignorant. When you re-enable NTP after The Leap, normal, non-destructive, NTP convergence will occur.
<randy>I encourage all my competitors to use this approach.</randy> If you're more than 128 ms off when NTP is flipped back on, it will still probably step the clock, then start slewing it. So you've skipped the leap per se, but your clocks will still jump forward quite a bit. This might isolate you from any leap second related failures, but it does not protect you against the system clock being stepped. If the leap pending information data persists, you might not even be isolated from any leap second failures. You could manage to upset the system clock even more. Are your time servers correctly armed for the leap?
Better, if you have a master NTP site clock, you need only disable its upstream NTP feed to isolate all the subsidiary devices. If you don't have such a master clock, this is an excellent time to set one up. I have found the Time Machines TM1000A GPS time server very inexpensive and super reliable:
http://www.newegg.com/Product/Product.aspx?Item=0N6-001Y-00007
$20 says that doesn't leap correctly. A lot of the inexpensive units appear to be using NMEA speaking GPS modules, and there's no real way to get leap information out of them. Many of them may ignore the timestamps and just use the PPS, in which case they may persist a second behind the world for quite some time. --msa
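The step-versus-slew distinction above can be sketched in a few lines of Python. The 128 ms threshold is the figure quoted in this message; the 500 ppm maximum slew rate is an assumption about a typical ntpd setup, and the function names are invented for the example.

```python
STEP_THRESHOLD = 0.128   # seconds; the 128 ms figure quoted above
MAX_SLEW_RATE = 500e-6   # assumed maximum slew rate of 500 ppm

def correction(offset_seconds: float) -> str:
    """Return 'step' if the offset is too large to slew away, else 'slew'."""
    if abs(offset_seconds) > STEP_THRESHOLD:
        return "step"
    return "slew"

def slew_duration(offset_seconds: float) -> float:
    """Seconds needed to remove the offset by slewing at the maximum rate."""
    return abs(offset_seconds) / MAX_SLEW_RATE

# A clock that free-ran through the leap second is about 1 s off afterwards.
print(correction(1.0), correction(0.05), slew_duration(1.0))
```

Under these assumptions a one-second error is well past the threshold, so re-enabling NTP steps the clock; slewing it away instead would take on the order of 2000 seconds.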
Bad idea. When restarting ntpd your clocks will likely be off by a second, which will cause a backward step, which will force the problem you claim to be avoiding. There are plenty of ways to solve this problem, and you just get to choose what you want to risk/pay. -- Harlan Stenn <stenn@ntp.org> http://networktimefoundation.org - be a member!
On 19 June 2015 at 23:58, Harlan Stenn <stenn@ntp.org> wrote:
Bad idea.
When restarting ntpd your clocks will likely be off by a second, which will cause a backward step, which will force the problem you claim to be avoiding.
If you are afraid that your routers will crash due to the leapsecond, then it would help to disable the thing that you think will crash them. Even if the router crashes when you enable it later on. Because then you can have one router crash at a time and have it happen in a service window where you are ready for it. Instead of having all routers in your whole network crash at exactly the same time. Regards, Baldur
Baldur Norddahl writes:
On 19 June 2015 at 23:58, Harlan Stenn <stenn@ntp.org> wrote:
Bad idea.
When restarting ntpd your clocks will likely be off by a second, which will cause a backward step, which will force the problem you claim to be avoiding.
If you are afraid that your routers will crash due to the leapsecond, then it would help to disable the thing that you think will crash them. Even if the router crashes when you enable it later on. Because then you can have one router crash at a time and have it happen in a service window where you are ready for it. Instead of having all routers in your whole network crash at exactly the same time.
That seems fair, as long as you turn off the time stuff only on your routers, and I'm assuming this is on routers that don't have supported software. H
Harlan, This is cisco's recommended workaround, the ultimate conclusion of an exhaustive study of all Cisco firmware and after detailed post mortem analysis of two previous Leap seconds: https://tools.cisco.com/bugsearch/bug/CSCut33302
GSS Leap second update CSCut33302
Description
Symptom: There are periodic leap second events which can add or delete a second to global time. When the leap second update occurs the GSS might hang and have to be reload or the kernel could crash and the GSS would reboot.
Conditions: The leap second update will be propagated via Network Time Protocol (NTP) or via manually setting the clock.
Workaround: Workaround, Turn off NTP prior to leap second and turn it back on afterward.
Further Problem Description: None
Or, in the immortal words of The IT Crowd: "Turn it off and on again!" If you run non-IOS server software of such fragility that it can't tolerate time slewing, just shut it down and power back up after The Leap. That's what your competitors are doing :) -mel beckman
On Jun 19, 2015, at 4:15 PM, Harlan Stenn <stenn@ntp.org> wrote:
Baldur Norddahl writes:
On 19 June 2015 at 23:58, Harlan Stenn <stenn@ntp.org> wrote:
Bad idea.
When restarting ntpd your clocks will likely be off by a second, which will cause a backward step, which will force the problem you claim to be avoiding.
If you are afraid that your routers will crash due to the leapsecond, then it would help to disable the thing that you think will crash them. Even if the router crashes when you enable it later on. Because then you can have one router crash at a time and have it happen in a service window where you are ready for it. Instead of having all routers in your whole network crash at exactly the same time.
That seems fair, as long as you turn off the time stuff only on your routers, and I'm assuming this is on routers that don't have supported software. H
Mel Beckman writes:
Harlan,
This is cisco's recommended workaround, the ultimate conclusion of an exhaustive study of all Cisco firmware and after detailed post mortem analysis of two previous Leap seconds:
Fair enough. And I've been trying to get Cisco to work with us for very many years. They have yet to show any interest. But they'd be paying us for that. We have no leverage with them. But folks who are paying Cisco for support? For the number of years Cisco has been using NTP and for the number of product lines that use it, they could certainly do better. I know they were current when I did the port for the MDS switch line, years ago. -- Harlan Stenn <stenn@ntp.org> http://networktimefoundation.org - be a member!
I had problems with Leap Second with mikrotik in versions 6.29.1, 6.28, 6.5 and other versions. Configured NTP Client in all of them. Anyone else had this problem?
On Jun 19, 2015, at 19:30, Baldur Norddahl <baldur.norddahl@gmail.com> wrote:
On 19 June 2015 at 23:58, Harlan Stenn <stenn@ntp.org> wrote:
Bad idea.
When restarting ntpd your clocks will likely be off by a second, which will cause a backward step, which will force the problem you claim to be avoiding.
If you are afraid that your routers will crash due to the leapsecond, then it would help to disable the thing that you think will crash them. Even if the router crashes when you enable it later on. Because then you can have one router crash at a time and have it happen in a service window where you are ready for it. Instead of having all routers in your whole network crash at exactly the same time.
Regards,
Baldur
It looks to have only affected the CCR line and only those running the NTP and not the SNTP package. ----- Mike Hammett Intelligent Computing Solutions http://www.ics-il.com Midwest Internet Exchange http://www.midwest-ix.com ----- Original Message ----- From: "Guilherme Ganascim" <guilherme.ganascim@persistelecom.com.br> To: nanog@nanog.org Sent: Tuesday, June 30, 2015 8:08:28 PM Subject: Re: REMINDER: LEAP SECOND I had problems with Leap Second with mikrotik in versions 6.29.1, 6.28, 6.5 and other versions. Configured NTP Client in all of them. Anyone else had this problem?
On Wed, Jul 1, 2015 at 10:17 AM, Mike Hammett <nanog@ics-il.net> wrote:
It looks to have only affected the CCR line and only those running the NTP and not the SNTP package.
That's Mikrotik's position, but reports of some users contradict their version (both in the need for NTP and for only affecting CCR line), although the NTP package is clearly a contributing factor. Rubens
No, I'm surprised we know the kernels. They're a pretty closed company. All we can do is enter IPs for the client side and turn it on\off server side. Well, and broadcast\multicast\manycast. ----- Mike Hammett Intelligent Computing Solutions http://www.ics-il.com Midwest Internet Exchange http://www.midwest-ix.com ----- Original Message ----- From: "Harlan Stenn" <stenn@ntp.org> To: "Mike Hammett" <nanog@ics-il.net> Cc: nanog@nanog.org Sent: Wednesday, July 1, 2015 7:43:43 PM Subject: Re: REMINDER: LEAP SECOND Mike Hammett writes:
It looks to have only affected the CCR line and only those running the NTP and not the SNTP package.
Any idea what version of NTP or what their configuration looked like? H
On Wed, Jul 1, 2015 at 11:15 AM, Michel Luczak <frnog@shrd.fr> wrote:
I had problems with Leap Second with mikrotik in versions 6.29.1, 6.28, 6.5 and other versions.
Configured NTP Client in all of them.
Anyone else had this problem?
Apparently 6.27 was the safe version to have (no issues on our CRS and CCR routers).
Not quite. Reported crashes included 6.27, so it's possible that some other mitigating factor helped avoid the crash (like using SNTP instead of NTP, although there seem to be people with crashes using SNTP or no SNTP/NTP at all). Variations also include whether the hardware watchdog was able to reboot the box or it just froze (including the front display not responding). Rubens
Once upon a time, Rubens Kuhl <rubensk@gmail.com> said:
Not quite. Reported crashes included 6.27, so it's possible that some other mitigating factor helped avoid the crash (like using SNTP instead of NTP, although there seem to be people with crashes using SNTP or no SNTP/NTP at all).
These are running Linux kernels, right? Anybody know which version? I know the last couple of leap seconds hit (different) bugs in the Linux kernel. The 2012 bug was timer related and confused some user-space applications, but the 2008 bug could cause a kernel deadlock (which this sounds like). -- Chris Adams <cma@cmadams.net>
v5 is 2.4, v6 3.3.5 ----- Mike Hammett Intelligent Computing Solutions http://www.ics-il.com Midwest Internet Exchange http://www.midwest-ix.com
Once upon a time, Mike Hammett <nanog@ics-il.net> said:
v5 is 2.4, v6 3.3.5
Don't know why a 3.3.5 kernel would have deadlocked; don't think there are any known issues that would cause that, unless there are Mikrotik specific patches that caused the problem. I believe the bug from the 2008 leap second was present in kernels in 2.4 up through 2.6.26 (although Red Hat at least patched it in their older version long-term support kernels). -- Chris Adams <cma@cmadams.net>
The only v6 ones that are sure to have had the problem are based on tilera chips and one of two NTP packages available. *shrugs* ----- Mike Hammett Intelligent Computing Solutions http://www.ics-il.com Midwest Internet Exchange http://www.midwest-ix.com
On Wed, Jul 1, 2015 at 3:17 PM, Chris Adams <cma@cmadams.net> wrote:
Once upon a time, Mike Hammett <nanog@ics-il.net> said:
v5 is 2.4, v6 3.3.5
Don't know why a 3.3.5 kernel would have deadlocked; don't think there are any known issues that would cause that, unless there are Mikrotik specific patches that caused the problem.
I believe the bug from the 2008 leap second was present in kernels in 2.4 up through 2.6.26 (although Red Hat at least patched it in their older version long-term support kernels).
3.3 was listed as buggy in this regard as well. Rubens
http://forum.mikrotik.com/viewtopic.php?f=2&t=98138#p488731 ----- Mike Hammett Intelligent Computing Solutions http://www.ics-il.com Midwest Internet Exchange http://www.midwest-ix.com
On 6/19/15 2:58 PM, Harlan Stenn wrote:
Bad idea.
When restarting ntpd your clocks will likely be off by a second, which will cause a backward step, which will force the problem you claim to be avoiding.
There are plenty of ways to solve this problem, and you just get to choose what you want to risk/pay.
You misunderstand the problem. :) The problem is not "clock skips backward one second," because most of the time that's not what happens. The problem is that most software does not handle it well when the clock ticks ... :59 :60 :00 instead of ticking directly from :59 to :00. THAT problem is avoided by temporarily turning off NTP and then turning it back on again when "the coast is clear." Most software can handle the "clock skips forward or backwards one second" problem fairly robustly, and as Baldur pointed out by doing the reset in a controlled manner you greatly reduce your overall risk. Doug -- I am conducting an experiment in the efficacy of PGP/MIME signatures. This message should be signed. If it is not, or the signature does not validate, please let me know how you received this message (direct, or to a list) and the mail software you use. Thanks!
Doug Barton writes:
On 6/19/15 2:58 PM, Harlan Stenn wrote:
Bad idea.
When restarting ntpd your clocks will likely be off by a second, which will cause a backward step, which will force the problem you claim to be avoiding.
There are plenty of ways to solve this problem, and you just get to choose what you want to risk/pay.
You misunderstand the problem. :) The problem is not "clock skips backward one second," because most of the time that's not what happens. The problem is that most software does not handle it well when the clock ticks ... :59 :60 :00 instead of ticking directly from :59 to :00.
POSIX NEVER shows :60.
THAT problem is avoided by temporarily turning off NTP and then turning it back on again when "the coast is clear." Most software can handle the "clock skips forward or backwards one second" problem fairly robustly, and as Baldur pointed out by doing the reset in a controlled manner you greatly reduce your overall risk.
Time going backwards is deadly to a number of applications. But apparently not to applications you care about. You're also not doing anything where somebody is going to get sued because a timestamp is off by a second. There are people for whom this is a very real risk. H
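[Editorial illustration] The remark that POSIX never shows :60 can be checked with a short Python sketch. It assumes only the standard library's `calendar.timegm`, which applies the POSIX seconds-since-epoch arithmetic without validating the seconds field; the variable names are illustrative and not from the thread.

```python
import calendar

# Broken-down UTC times around the June 2015 leap second.
before_leap = (2015, 6, 30, 23, 59, 59, 0, 0, 0)
leap_second = (2015, 6, 30, 23, 59, 60, 0, 0, 0)
after_leap = (2015, 7, 1, 0, 0, 0, 0, 0, 0)

# calendar.timegm uses the POSIX formula, which assumes that every day
# has exactly 86400 seconds.
t_before = calendar.timegm(before_leap)
t_leap = calendar.timegm(leap_second)
t_after = calendar.timegm(after_leap)

# 23:59:60 and 00:00:00 collapse onto the same POSIX timestamp.
print(t_before, t_leap, t_after)
```

Because the leap second and the following midnight map to the same number, a POSIX clock has to repeat, hold, or smear a second somewhere, which is the behaviour being argued about in this thread.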
Harlan, Help me understand why there is a serious risk of going back in time. I acknowledge that there is a remote chance of a backstep, but the probability seems very low. Suppose I disable my NTP service five minutes before a positive leap second occurs, so that no server in my network can query it. These servers will then run on their own internal clocks. Then, five minutes after the leap second, I re-engage NTP. Assuming a high degree of local oscillator fidelity, imagine the clock drift is zero. The result is that NTP will report one second older than the time currently in my server, i.e. exactly five minutes after the 23:59:60 leap second. Thus even on systems, such as Unix, where 23:59:60 does not exist in the UTC implementation, the timestamp the server sees from NTP is not the potentially code-crashing 23:59:60, but a perfectly rational 00:05:01. This is what my server’s NTP client compares with its internal clock of 00:04:59. NTP's target time is in the future, so there is no risk of going back in time. NTP gradually increments the local time to converge on NTP’s time. In the alternative case of a negative leap second, following the NTP clock discipline algorithm, the NTP client amortizes the one-second reverse jump, specifically in order to avoid setting the clock backward: the local time will be gradually adjusted again via the clock discipline algorithm until local and NTP times converge. Although the offset is more than the 125ms step threshold, stepping a full one second backward is still statistically unlikely. It may be that I’ve misread the NTP specification in RFC 5905 and its antecedents, as well as the leap second historical records of problems. But the disabling-NTP-prior-to-leap workaround seems to bypass all the documented leap-second live lock hangs and other bugs.
-mel
This stuff can make my head explode. When a leap second is added, like on 30 June 2015 at the last second of the day, POSIX insists that the day still have 86400 seconds in it. This makes the day longer by one second, so time has to either slow down or move backwards. The "dumb" way to do this is to step the clock back by 1 second at the instant before the stroke of midnight. The allegedly better way to do this would be to stop the clock a bit before midnight, and hold the time for 1 second. To continue providing monotonic time, every time somebody says "what time is it" during that holding period one would want to bump the time by the smallest amount possible, usually 1 nanosecond (assuming the kernel keeps time in nanoseconds). Ideally you wouldn't want to add enough nanoseconds to cause the clock to roll over into the next day "too early". But apparently nobody has implemented this, even though Prof. Mills described it in RFC-i-forget about 20 years ago. This is mostly because POSIX deals with absolute time and not relative time. In the unlikely event of a leap second deletion, there would be no 23:59:59, so when the clock is about to strike 23:59:59 it's OK to add an extra second to the clock to effectively have the time "jump" from 23:59:58 to 00:00:00. This is still a monotonic increment in time. Whatever you decide to do, I recommend you actually test it out to see if it behaves the way you think it will. You have a whole week still. H
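[Editorial illustration] The "hold the clock and bump by 1 nanosecond per read" idea described above can be sketched as a toy model in Python. This is only a sketch under stated assumptions, not the kernel algorithm from the RFC Harlan refers to: the class `HoldingClock`, the `raw_ns` counter (a hypothetical uniformly ticking source with no leap seconds) and the 1 ms margin before midnight are all invented for illustration.

```python
NS_PER_S = 1_000_000_000


class HoldingClock:
    """Toy clock that holds for one inserted second and never goes backwards.

    `raw_ns` is a hypothetical uniformly ticking counter (no leap seconds);
    the value returned by read() is the POSIX-style time handed to callers.
    """

    def __init__(self, hold_at_ns, hold_len_ns=NS_PER_S):
        self.hold_at_ns = hold_at_ns    # reading at which the clock freezes
        self.hold_len_ns = hold_len_ns  # how long the freeze lasts
        self.last_ns = None             # last value handed out

    def read(self, raw_ns):
        if raw_ns < self.hold_at_ns:
            reported = raw_ns                     # before the hold
        elif raw_ns < self.hold_at_ns + self.hold_len_ns:
            reported = self.hold_at_ns            # frozen during the hold
        else:
            reported = raw_ns - self.hold_len_ns  # after: one second absorbed
        # Every caller must see time advance, so bump by 1 ns if needed.
        if self.last_ns is not None and reported <= self.last_ns:
            reported = self.last_ns + 1
        self.last_ns = reported
        return reported


MIDNIGHT_NS = 86_400 * NS_PER_S       # nanoseconds into the day
HOLD_AT_NS = MIDNIGHT_NS - 1_000_000  # freeze 1 ms before midnight (arbitrary)
clock = HoldingClock(HOLD_AT_NS)
raw_samples = [HOLD_AT_NS - 5, HOLD_AT_NS, HOLD_AT_NS + 10,
               HOLD_AT_NS + 20, HOLD_AT_NS + NS_PER_S + 5]
readings = [clock.read(r) for r in raw_samples]
offsets = [r - HOLD_AT_NS for r in readings]
print(offsets)
```

The two reads that land inside the hold come back 1 ns apart, so callers still see strictly increasing time. The margin before midnight bounds how many reads can happen during the hold before the clock would roll into the next day "too early"; the 1 ms used here is arbitrary.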
Harlan, Why should your head explode? Possibly you’re overthinking the problem. And there is no reason (or simple way I can envision) to test my plan, as you advise, in advance. I will just block NTP in my border router temporarily. No need to make a mountain out of this molehill. Cisco, and many other NTP client gear vendors, recommend this approach, and they’ve published extensive research on the matter. -mel
On 23/06/2015 10:25, Mel Beckman wrote:
Why should your head explode? Possibly you’re overthinking the problem.
The problems don't relate to Harlan overthinking the problem. They relate to developers underthinking the problem and assuming that all clocks are monotonic and that certain rules apply, e.g. that there are 60 seconds in a minute, 86400 seconds in a day and so forth. Mostly applications are not time sensitive, but sometimes they are. When they are, and when the developer assumes something which isn't true, unexpected things might happen. assert()s can be triggered, time synchronisation lost with third party applications, unexpected and untested code paths could be used, etc. Blocking NTP at the NTP edge will probably work fine for most situations. Bear in mind that your NTP edge is not necessarily the same as your network edge. E.g. you might have internal GPS / radio sources which could unexpectedly inject the leap second. The larger the network, the more likely this is to happen. Most organisations have network fossils and ntp is an excellent source of these. I.e. systems which work away for years without any problems before one day accidentally triggering meltdown because some developer didn't understand the subtleties of clock monotonicity. Nick
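[Editorial illustration] The monotonicity assumption described above usually bites when code measures elapsed time by subtracting two wall-clock readings. A minimal Python sketch of the safer pattern follows, assuming Python 3.3 or later where `time.monotonic()` is available; the helper name `timed` is made up for this example.

```python
import time


def timed(fn):
    """Run fn() and return (result, elapsed_seconds).

    time.monotonic() is not affected by steps of the wall clock, so the
    elapsed value cannot go negative if the system time is set backwards
    while fn() is running. Subtracting two time.time() readings offers no
    such guarantee.
    """
    start = time.monotonic()
    result = fn()
    elapsed = time.monotonic() - start
    return result, elapsed


result, elapsed = timed(lambda: sum(range(1000)))
print(result, elapsed)
```

Wall-clock time is still the right source for timestamps that must be compared across machines; the monotonic clock is only meaningful for intervals on one host.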
On Jun 23, 2015 6:26 AM, "Nick Hilliard" <nick@foobar.org> wrote:
Blocking NTP at the NTP edge will probably work fine for most situations. Bear in mind that your NTP edge is not necessarily the same as your network edge. E.g. you might have internal GPS / radio sources which could unexpectedly inject the leap second. The larger the network, the more likely this is to happen. Most organisations have network fossils and ntp is an excellent source of these. I.e. systems which work away for years without any problems before one day accidentally triggering meltdown because some developer didn't understand the subtleties of clock monotonicity.
NTP causes jumps - not skews, right?
On 23/06/2015 18:23, shawn wilson wrote:
NTP causes jumps - not skews, right?
This is implementation dependent. For normal clock differences on ntpd, if you start it with the -x parameter, it will always slew and never step. If you start ntpd without the -x parameter, if the calculated correct time after slewing is out by > 128ms relative to other ntp servers, then after 900 seconds, it will step to the correct time. However in the case of leap seconds, if the operating system implements ntp kernel discipline, then the ntp server will immediately step by the leap second (forwards or backwards), as soon as it receives the leap second notification. It does this on the basis that the kernel supports leap seconds, therefore it's probably the right thing to do. Nick
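[Editorial illustration] The step-versus-slew rule for ordinary offsets, as described in the message above, can be written as a small decision function. This Python sketch encodes only that description (128 ms threshold, 900 second wait, `-x` meaning slew only); it is not ntpd's actual code, and the function and parameter names are hypothetical.

```python
STEP_THRESHOLD_S = 0.128  # threshold quoted in the message above
STEPOUT_S = 900           # wait quoted in the message above


def correction(offset_s, seconds_over_threshold, slew_only=False):
    """Return 'slew' or 'step' following the rules as described above.

    offset_s               -- measured offset from the reference, in seconds
    seconds_over_threshold -- how long the offset has exceeded the threshold
    slew_only              -- models starting ntpd with -x
    """
    if slew_only:
        return "slew"
    if abs(offset_s) > STEP_THRESHOLD_S and seconds_over_threshold >= STEPOUT_S:
        return "step"
    return "slew"


print(correction(1.0, 900))                  # large, persistent offset
print(correction(1.0, 900, slew_only=True))  # same offset with -x
print(correction(0.05, 5000))                # small offset
```

The leap-second path with kernel discipline, which the message says steps immediately, is deliberately left out of this sketch.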
shawn wilson writes:
On Jun 23, 2015 6:26 AM, "Nick Hilliard" <nick@foobar.org> wrote:
Blocking NTP at the NTP edge will probably work fine for most situations. Bear in mind that your NTP edge is not necessarily the same as your network edge. E.g. you might have internal GPS / radio sources which could unexpectedly inject the leap second. The larger the network, the more likely this is to happen. Most organisations have network fossils and ntp is an excellent source of these. I.e. systems which work away for years without any problems before one day accidentally triggering meltdown because some developer didn't understand the subtleties of clock monotonicity.
NTP causes jumps - not skews, right?
Left to its default condition, ntp will step/jump a change in excess of 128msec. If you want to slew the clock instead, a 1 second correction will take a little over 33 minutes' time to apply. I don't understand why people believe that stopping ntpd for a few minutes while the leap second is applied will help. If the system clock keeps good time, it will *still* be about 1 second ahead when ntpd is restarted, and that will trigger a backward step which is fatal to a number of applications. H
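[Editorial illustration] The "a little over 33 minutes" figure above follows from simple arithmetic if one assumes a maximum slew rate of 500 parts per million. That rate is an assumption added here, not something stated in the message. A short Python check:

```python
MAX_SLEW_PPM = 500  # assumed maximum slew rate, in parts per million


def slew_duration_s(offset_s, slew_ppm=MAX_SLEW_PPM):
    """Seconds of real time needed to slew away offset_s at slew_ppm."""
    return abs(offset_s) * 1_000_000 / slew_ppm


duration = slew_duration_s(1.0)
minutes, seconds = divmod(round(duration), 60)
print(f"{minutes} min {seconds} s")
```

At that rate the clock gains or loses 0.5 ms per second, so a one-second correction needs 2000 seconds.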
A backward step is a known issue and something that people are more comfortable dealing with as it can happen on any machine with a noisy clock crystal. Having 61 seconds in a minute or 86401 seconds in a day is a different story.
Matthew Huff writes:
A backward step is a known issue and something that people are more comfortable dealing with as it can happen on any machine with a noisy clock crystal.
A clock crystal has to be REALLY bad for ntpd to need to step the clock.
Having 61 seconds in a minute or 86401 seconds in a day is a different story.
Yeah, leap years suck too. And those jumps around daylight savings time. H
* Harlan Stenn <stenn@ntp.org>
Matthew Huff writes:
A backward step is a known issue and something that people are more comfortable dealing with as it can happen on any machine with a noisy clock crystal.
A clock crystal has to be REALLY bad for ntpd to need to step the clock.
Having 61 seconds in a minute or 86401 seconds in a day is a different story.
Yeah, leap years suck too.
And those jumps around daylight savings time.
Hi Harlan, Leap years and DST adjustments have never caused us any major issues. It seems these code paths are well tested and work fine. The leap second in 2012 however ... total and utter carnage. Application servers, databases, etc. falling over like dominoes. All hands on deck in the middle of the night to clean up. It took days before we stopped finding broken stuff. Maybe all the bugs from 2012 have been fixed. Maybe they haven't. Maybe new ones have been introduced. I'm not terribly optimistic. One example I'm aware of: Cisco Nexus 5010/5020 switches need software that was released as late as 29th of April this year in order to be immune to the crash&burn leap second bug CSCub38654. The official «Cisco Suggested release based on software quality, stability and longevity» is older. Go figure. In any case, we're certainly not going to risk it. So our plan is to disconnect our local stratum-2s from their upstreams on June 29th so they (and more crucially, their downstream clients) remain oblivious to the leap second. Come July 1st, we'll reconnect them. The clients' clocks will be 1s (plus any drift) off at that point, but as we're running ntpd with the "-x" option, that shouldn't cause backwards stepping. Running with slightly incorrect clocks for a few days is a small price to pay to avoid a repeat of 2012's mayhem. Tore
In your letter dated Wed, 24 Jun 2015 08:33:14 +0200 you wrote:
Leap years and DST adjustments have never caused us any major issues. It seems these code paths are well tested and work fine.
I seem to remember that they were not tested that well on a certain brand of mobile devices a few years back... In any case, we can abstract from time zones and DST by using UTC internally and then converting to local time in the UI. For UTC the analog approach would be to keep time in TAI internally and convert to UTC when required. There is however a big problem with that. UTC as a time scale is not predictable. There is no way of computing the number of seconds between 2000-01-01 00:00 and 2100-01-01 00:00 because that value is undefined. The net result is that representing, say, 2020-01-01 00:00 as a TAI timestamp is impossible until about 6 months before that date. One way forward for people who for some reason feel attached to representing the rotation of the earth in civil time is to have a scheme for leap seconds just like leap years. For example, insert a leap second every 18 months. And then revise that scheme once a century to cope with unexpected changes in the earth's rotation. (Or just get rid of them altogether and move to a different time zone every 4000 years).
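[Editorial illustration] The predictability problem described above shows up as soon as the TAI-to-UTC conversion is written down: it needs a table of announced leap seconds, and that table stops a few months into the future. The Python sketch below is illustrative only. It lists just the two entries around the date of this thread, and `VALID_UNTIL` is a hypothetical announcement horizon rather than an official value.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# (UTC instant from which the offset applies, TAI-UTC in seconds).
# Only the entries around the date of this thread, for brevity.
LEAP_TABLE = [
    (datetime(2012, 7, 1, tzinfo=UTC), 35),
    (datetime(2015, 7, 1, tzinfo=UTC), 36),
]

# Hypothetical horizon: beyond this, no leap second decision is known yet.
VALID_UNTIL = datetime(2015, 12, 31, tzinfo=UTC)


def tai_minus_utc(utc_dt):
    """Return TAI-UTC in seconds, or raise if the table cannot answer."""
    if utc_dt > VALID_UNTIL:
        raise ValueError("leap seconds not yet announced for this date")
    offset = None
    for effective, seconds in LEAP_TABLE:
        if utc_dt >= effective:
            offset = seconds
    if offset is None:
        raise ValueError("date precedes the table")
    return offset


before = tai_minus_utc(datetime(2015, 6, 30, 12, tzinfo=UTC))
after = tai_minus_utc(datetime(2015, 7, 1, tzinfo=UTC))
try:
    tai_minus_utc(datetime(2020, 1, 1, tzinfo=UTC))
    future_known = True
except ValueError:
    future_known = False
print(before, after, future_known)
```

Refusing to answer past the horizon, rather than silently reusing the last offset, makes the limitation visible to callers: a timestamp for 2020 cannot be converted until the intervening leap seconds have been announced.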
Philip Homburg <pch-nanog@u-1.phicoh.com> wrote:
For UTC the analog approach would be to keep time in TAI internally and convert to UTC when required.
This is much less of a solution than you might hope, because most APIs, protocols, and data formats require UT. (Usually not UTC but a representation isomorphic to traditional UT which ignores leap seconds.) Tony. -- f.anthony.n.finch <dot@dotat.at> http://dotat.at/ Trafalgar: Variable 3 or 4, but northwesterly 4 or 5 in southeast. Slight, occasionally moderate. Mainly fair. Mainly good.
In your letter dated Wed, 24 Jun 2015 14:05:34 +0100 you wrote:
Philip Homburg <pch-nanog@u-1.phicoh.com> wrote:
For UTC the analog approach would be to keep time in TAI internally and convert to UTC when required.
This is much less of a solution than you might hope, because most APIs, protocols, and data formats require UT. (Usually not UTC but a representation isomorphic to traditional UT which ignores leap seconds.)
Supporting legacy formats can be annoying. In some cases it would be no problem. For example NTP. If there is a defined way to convert between TAI and UTC then converting TAI to NTP timestamps is easy except during an actual leap second. Which is not really a problem. Unix systems would probably need a few new system calls to accept time in TAI. File formats like tar are unlikely to matter much: find a consistent way of encoding time around the leap second and most likely nobody will care. In any case, it would be nice if future formats and systems could have a sensible time keeping system.
On Wed, Jun 24, 2015 at 08:33:14AM +0200, Tore Anderson wrote:
Leap years and DST adjustments have never caused us any major issues. It seems these code paths are well tested and work fine.
I've seen quite a few people that for whatever reason insist on running systems in local time zones struggle with the DST reverse step. It's not nearly as much of a non-issue as you claim.
The leap second in 2012 however ... total and utter carnage. Application servers, databases, etc. falling over like dominoes. All hands on deck in the middle of the night to clean up. It took days before we stopped finding broken stuff.
"Total and utter carnage" is a bit of a stretch. Linux hosts that ran applications dependent on nanosleeps needed reboots. Note that this wasn't an issue in 2009, because the poorly tested change in question hadn't yet been made to the Linux kernel. (Even in 2012, my personal hosts, running a different operating system sailed through it just fine.) At any time, you might have a bad operational day for any number of reasons. Sure, that one was annoying, but to my knowledge nobody died, and a lot of hosts that probably needed one anyway got a reboot. Certainly, lately, I've seen a lot of Linux hosts rebooted more than once for security patching. #opslife? Cheers, --msa
Once upon a time, Majdi S. Abbas <msa@latt.net> said:
"Total and utter carnage" is a bit of a stretch. Linux hosts that ran applications dependent on nanosleeps needed reboots. Note that this wasn't an issue in 2009, because the poorly tested change in question hadn't yet been made to the Linux kernel.
In 2009, there was a different problem. If the system was under sufficient kernel-related load (such as disk I/O), the kernel's attempt to print an informational message that a leap second had been added caused a kernel deadlock, immediately killing the system. I don't remember any widespread Linux-related leap second issues before that though. -- Chris Adams <cma@cmadams.net>
* Majdi S. Abbas
On Wed, Jun 24, 2015 at 08:33:14AM +0200, Tore Anderson wrote:
Leap years and DST adjustments have never caused us any major issues. It seems these code paths are well tested and work fine.
I've seen quite a few people that for whatever reason insist on running systems in local time zones struggle with the DST reverse step. It's not nearly as much of a non-issue as you claim.
Read again, and note the word "us". I am describing my and my employer's experience with past DST changes and leap years, and those have indeed been completely uneventful. YMMV.
The leap second in 2012 however ... total and utter carnage. Application servers, databases, etc. falling over like dominoes. All hands on deck in the middle of the night to clean up. It took days before we stopped finding broken stuff.
"Total and utter carnage" is a bit of a stretch.
As above, I am speaking only about how the 2012 leap second went down in our infrastructure. I stand by how I described the event. Again, YMMV. If you plan on letting your infrastructure deal with the upcoming leap second head-on, I wish you the best of luck. Hopefully all the bugs from 2012 have been fixed. I, however, certainly have no intention of being the one to find out otherwise. Tore
Does anyone know what the latest that we can run our NTP servers and not distribute the LEAP_SECOND flag to the NTP clients?
* Matthew Huff
Does anyone know what the latest that we can run our NTP servers and not distribute the LEAP_SECOND flag to the NTP clients?
From http://support.ntp.org/bin/view/Support/NTPRelatedDefinitions: Leap Indicator This is a two-bit code warning of an impending leap second to be inserted in the NTP timescale. The bits are set before 23:59 on the day of insertion and reset after 00:00 on the following day. This causes the number of seconds (rollover interval) in the day of insertion to be increased or decreased by one. So the answer to your question is, AIUI, 2015-06-29 23:59:59. Tore
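[Editorial illustration] For readers who want to check whether a server is already advertising the Leap Indicator, the two bits sit at the top of the first byte of an NTP packet, followed by a 3-bit version and a 3-bit mode. The Python sketch below decodes that byte from a hand-made 48-byte sample. It does no network I/O, and the function name and sample packets are made up for illustration.

```python
LEAP_MEANINGS = {
    0: "no warning",
    1: "last minute of the day has 61 seconds",
    2: "last minute of the day has 59 seconds",
    3: "unknown (clock unsynchronized)",
}


def parse_first_byte(packet):
    """Return (leap_indicator, version, mode) from an NTP packet."""
    first = packet[0]
    leap_indicator = first >> 6    # top two bits
    version = (first >> 3) & 0x7   # next three bits
    mode = first & 0x7             # low three bits
    return leap_indicator, version, mode


# Hand-made samples: 0x64 is binary 01 100 100, 0x24 is binary 00 100 100.
with_warning = bytes([0x64]) + bytes(47)
without_warning = bytes([0x24]) + bytes(47)

li, vn, mode = parse_first_byte(with_warning)
print(li, vn, mode, LEAP_MEANINGS[li])
print(parse_first_byte(without_warning))
```

A value of 1 in a server reply (mode 4) is what downstream clients act on, so watching this field on your own servers is one way to see when the flag actually starts to propagate.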
I saw that, but it says the bits are set "before 23:59" on the day of insertion, but I was hoping that I could shut it down later than 23:59:59 of the previous day (8pm EST). The reason is FINRA regulations. We have to have the time synced once per trading day before the open according to the regulations. We could manually run ntpdate on 100+ servers including 50+ windows servers, but that's not a great solution. ---- Matthew Huff | 1 Manhattanville Rd Director of Operations | Purchase, NY 10577 OTA Management LLC | Phone: 914-460-4039 aim: matthewbhuff | Fax: 914-694-5669
* Matthew Huff
I saw that, but it says the bits are set "before 23:59" on the day of insertion, but I was hoping that I could shut it down later than 23:59:59 of the previous day (8pm EST). The reason is FINRA regulations. We have to have the time synced once per trading day before the open according to the regulations.
Again AIUI, and I'm no NTP expert so I hope someone corrects me if I'm wrong: If you don't configure the "leapfile" ntpd option, the Leap Indicator flag will flow down to your servers from the stratum-1 servers you're synchronising from (directly or indirectly). So what I think you could do is, on the 29th, remove all your upstream servers from your NTP server's config, and set "fudge 127.127.1.0 stratum 3" or something like that so that clients will still want to sync to it. At that point, your NTP server's clock chip will be the reference clock, which might be drift-prone. To work around that, you could at 8pm on the 30th stop ntpd, manually sync the system clock with ntpdate, and start ntpd again. That should keep your NTP server's clock reasonably synchronised while it provides your clients with (a Leap Indicator-free) NTP service. I make no guarantees that the above will work the way I think it will, though... Try it at your own risk. Tore
That won't work. Being internally sync'ed isn't good enough for FINRA. All the machines must be synced to an external accurate source at least once per trading day. Our plan is to disable our two stratum 1 servers, and our 3 stratum 2 servers before the leap second turnover, but to be 100% safe we would need to do that 24 hours before, but that would be a violation of FINRA regulations. It looks like the safest thing for us to do is to keep our NTP servers running and deal with any crashes/issues. That's better than having to deal with FINRA. ---- Matthew Huff | 1 Manhattanville Rd Director of Operations | Purchase, NY 10577 OTA Management LLC | Phone: 914-460-4039 aim: matthewbhuff | Fax: 914-694-5669
* Matthew Huff
That won't work. Being internally sync'ed isn't good enough for FINRA. All the machines must be synced to an external accurate source at least once per trading day.
That was why I proposed running ntpdate on your (upstream-free since the 29th) NTP server(s) sometime on the 30th. That would synchronise its local clock with an external accurate source, without learning the Leap Indicator.
Our plan is to disable our two stratum 1 servers, and our 3 stratum 2 servers before the leap second turnover, but to be 100% safe we would need to do that 24 hours before, but that would be a violation of FINRA regulations.
If you run your own stratum-1 servers, can't you just opt not to configure "leapfile"? Assuming your own organisation is the only user of those servers, that is (certainly don't do that if it's a public server). After the leap second has passed, you can proceed to correct things. Your clients will then be 1s ahead of correct time, and will need to step/slew their clocks to get in sync. But maybe that's OK as far as FINRA's concerned...
It looks like the safest thing for us to do is to keep our NTP servers running and deal with any crashes/issues. That's better than having to deal with FINRA.
Maybe. I have no experience with FINRA. :-) Tore
Yo Tore! On Wed, 24 Jun 2015 21:57:28 +0200 Tore Anderson <tore@fud.no> wrote:
If you run your own stratum-1 servers, can't you just opt not to configure "leapfile"?
Depends on what your Stratum-1 is synchronized to. Some GPS time sources pass on the leap indicator to NTP. For example, the SiRF-3 GPS, connected by way of gpsd to ntpd, will pass on the leap second. At least in theory. :-) RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem@rellim.com Tel:+1(541)382-8588
Once upon a time, Gary E. Miller <gem@rellim.com> said:
Depends on what your Stratum-1 is synchronized to. Some GPS time sources pass on the leap indicator to NTP. For example, the SiRF-3 GPS, connected by way of gpsd to ntpd, will pass on the leap second.
Yep, my ancient old SVeeSix has been showing the leap second for several months; the notification was added to the GPS signal a while back, either at the start of the year or the start of the quarter (in theory, leap seconds can be added/removed quarterly). -- Chris Adams <cma@cmadams.net>
On 06/24/2015 12:44 PM, Matthew Huff wrote:
It looks like the safest thing for us to do is to keep our NTP servers running and deal with any crashes/issues. That's better than having to deal with FINRA.
For what it's worth, Red Hat pushed updates to NTP and to TZDATA. You might want to check the documentation to see if the updates include sane handling of the leap second. (I run CentOS 7.1 and Fedora 20, which is where I saw the updates during my morning maintenance.)
Yes, the clock has to be bad. Been there, done that, especially with early Sun x86 servers. Leap years and DST are both things that people and developers are aware of outside of technology; leap seconds, not so much.
On Jun 23, 2015, at 11:33 PM, Harlan Stenn <stenn@ntp.org> wrote:
Matthew Huff writes:
A backward step is a known issue and something that people are more comfortable dealing with as it can happen on any machine with a noisy clock crystal.
A clock crystal has to be REALLY bad for ntpd to need to step the clock.
Having 61 seconds in a minute or 86401 seconds in a day is a different story.
Yeah, leap years suck too.
And those jumps around daylight savings time.
H
On Jun 22, 2015, at 7:06 PM, Harlan Stenn <stenn@ntp.org> wrote:
Time going backwards is deadly to a number of applications.
But apparently not to applications you care about.
Oh it is a problem, and most handle it very ungracefully, such as dovecot, which just dies: http://wiki.dovecot.org/TimeMovedBackwards The issue is also that most people are used to using rc.local or rc.* type scripts to spawn a daemon, which leads to the next major sysadmin/developer problem, which is handling error cases improperly or not at all. (86401 is perhaps an error case that people should test for). - Jared
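On the "time moved backwards" failure mode: a minimal sketch, assuming the application only needs to measure how much time has passed rather than know the wall-clock time. It uses Python's time.monotonic(), which is not affected by steps of the system clock; the helper name timed() is made up for illustration.

```python
import time


def timed(work):
    """Run work() and return (result, elapsed_seconds).

    The interval is taken from the monotonic clock, so a wall-clock
    step (ntpdate, a leap second, an operator fixing the date) cannot
    make the measured duration negative.
    """
    start = time.monotonic()
    result = work()
    elapsed = time.monotonic() - start
    return result, elapsed


result, elapsed = timed(lambda: sum(range(1000)))
print(result, elapsed >= 0.0)
```

The same subtraction done with time.time() can come out negative if the clock is stepped back in between, which is the kind of case a daemon has to either handle or avoid.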
Guys, if we don't have NTP enabled on our Linux machines, will we still have a problem with the leap second?
If you don’t have NTP enabled, your clock may be wrong, so it likely won’t impact you. I’ve always had trouble getting NTP to work right over the years for a variety of reasons. Just set something in cron to ntpdate -u your host on July 1st and you should be good. - Jared
On Jun 23, 2015, at 9:14 AM, Leonardo Oliveira Ortiz <leonardo.ortiz@marisolsa.com> wrote:
Guys, if we don't have NTP enabled on our Linux machines, will we still have a problem with the leap second?
----- Original Message -----
From: "Harlan Stenn" <stenn@ntp.org>
You misunderstand the problem. :) The problem is not "clock skips backward one second," because most of the time that's not what happens. The problem is that most software does not handle it well when the clock ticks ... :59 :60 :00 instead of ticking directly from :59 to :00.
POSIX NEVER shows :60.
Then I hope POSIX does not claim to represent UTC, because UTC does, no? (IE: somewhere between "a bit" and "a lot" more expansion was called for there; most of us don't have ntp.org email addresses. :-) Cheers, -- jra -- Jay R. Ashworth Baylink jra@baylink.com Designer The Things I Think RFC 2100 Ashworth & Associates http://www.bcp38.info 2000 Land Rover DII St Petersburg FL USA BCP38: Ask For It By Name! +1 727 647 1274
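To make the ":59 :60 :00" point concrete: a small sketch, assuming Python's standard library as the example runtime. The broken-down struct_time can carry a seconds value of 60, while the datetime type restricts seconds to 0..59 and so cannot represent the leap second at all. The helper name is made up for illustration.

```python
import time
from datetime import datetime

STAMP = "2015-06-30T23:59:60"
FORMAT = "%Y-%m-%dT%H:%M:%S"

# time.strptime returns a struct_time, whose tm_sec field may be 60.
parsed = time.strptime(STAMP, FORMAT)
print(parsed.tm_sec)


def datetime_accepts_leap_second():
    """Report whether datetime can hold second=60."""
    try:
        datetime(2015, 6, 30, 23, 59, 60)
    except ValueError:
        return False
    return True


print(datetime_accepts_leap_second())
```

Software that funnels every timestamp through a type like datetime has no way to say "23:59:60", whatever the underlying timescale does.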
Yo Jay! On Tue, 23 Jun 2015 22:02:50 -0400 (EDT) Jay Ashworth <jra@baylink.com> wrote:
----- Original Message -----
From: "Harlan Stenn" <stenn@ntp.org>
You misunderstand the problem. :) The problem is not "clock skips backward one second," because most of the time that's not what happens. The problem is that most software does not handle it well when the clock ticks ... :59 :60 :00 instead of ticking directly from :59 to :00.
POSIX NEVER shows :60.
Then I hope POSIX does not claim to represent UTC, because UTC does, no?
POSIX-1:2001 clearly allows 61 seconds in a minute. The POSIX-1:2001 docs are here:

http://pubs.opengroup.org/onlinepubs/009695399/basedefs/time.h.html

From the Description: "The <time.h> header shall declare the structure tm, which shall include at least the following members: int tm_sec Seconds [0,60]."

From the Application Usage: "The range [0,60] for tm_sec allows for the occasional leap second."

From the Rationale: "The range [0,60] seconds allows for positive or negative leap seconds."

But, from the section on "Seconds Since the Epoch":

http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_...

POSIX seconds is defined as:

    tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 +
        (tm_year-70)*31536000 + ((tm_year-69)/4)*86400 -
        ((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400

Summed up with: "The relationship between the actual time of day and the current value for seconds since the Epoch is unspecified."

Which basically says that if you are gonna split hairs on leap seconds, things will be undefined.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem@rellim.com Tel:+1(541)382-8588
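The formula quoted above can be worked through for this particular leap second. A minimal sketch, assuming a direct transcription of the POSIX expression into Python with integer division; the function name posix_seconds() is made up for illustration. It shows that 2015-06-30T23:59:60 and 2015-07-01T00:00:00 map to the same "seconds since the Epoch" value, so the count cannot tell the two instants apart.

```python
import calendar


def posix_seconds(tm_sec, tm_min, tm_hour, tm_yday, tm_year):
    """Seconds since the Epoch per the POSIX formula.

    tm_year is years since 1900 and tm_yday is the zero-based day of
    the year; // is integer division, as in the C expression.
    """
    return (tm_sec + tm_min * 60 + tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000
            + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)


# 2015-06-30 is day 180 (zero-based) of a non-leap year; 2015 - 1900 = 115.
leap_second = posix_seconds(60, 59, 23, 180, 115)   # 2015-06-30T23:59:60
next_midnight = posix_seconds(0, 0, 0, 181, 115)    # 2015-07-01T00:00:00

print(leap_second, next_midnight)
# Cross-check against the standard library's UTC conversion.
print(next_midnight == calendar.timegm((2015, 7, 1, 0, 0, 0)))
```

Both calls give 1435708800, which is why a POSIX clock has to repeat, stall, or smear a second somewhere around the insertion.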
participants (34)
- Alan Buxey
- Alexander Maassen
- Baldur Norddahl
- Bjoern A. Zeeb
- Chris Adams
- Damian Menscher
- Doug Barton
- Gary E. Miller
- Guilherme Ganascim
- Harlan Stenn
- Jared Mauch
- Jay Ashworth
- Jimmy Hess
- Leonardo Oliveira Ortiz
- Majdi S. Abbas
- Marshall Eubanks
- Matthew Huff
- Mel Beckman
- Michel Luczak
- Mike Hammett
- Måns Nilsson
- Nick Hilliard
- Philip Homburg
- Randy Bush
- Rubens Kuhl
- Saku Ytti
- shawn wilson
- Stefan Schlesinger
- Stephane Bortzmeyer
- Stephen Satchell
- Steve Allen
- Tony Finch
- Tore Anderson
- Valdis.Kletnieks@vt.edu