Someone didn't get the leap second memo...
[root@hayden ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l  20d   64    0    0.000    0.000   0.000
-clock.xmission. 132.163.4.103    2 u  169  256  377   66.078   -1.302   0.164
xclock.sjc.he.ne 10.200.208.2     2 u   13  256  315   65.689  999.633   2.015
+tock.usshc.com  .GPS.            1 u   87  256  377   26.930   -0.550   0.121
*ntp.your.org    .CDMA.           1 u   43  256  217   23.339    0.544   0.069

Our batch system went belly up, but other than that, no other apparent leap second issues.

----
Matthew Huff             | 1 Manhattanville Rd
Director of Operations   | Purchase, NY 10577
OTA Management LLC       | Phone: 914-460-4039
aim: matthewbhuff        | Fax:   914-694-5669
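In the ntpq output above, clock.sjc.he.ne carries the "x" tally code (falseticker) and an offset of 999.633 ms, i.e. roughly one second away from the other peers, which is what a server that skipped the leap second would look like. A minimal sketch of how such peers could be spotted automatically follows. It assumes the standard `ntpq -p` column layout with the tally code in the first character of each row; the function names, the sample constant and the 50 ms tolerance are illustrative choices, not anything from the thread.

```python
# Sketch only: flags peers whose offset sits close to +/- 1 second.
# Assumes unmodified `ntpq -p` output (tally code in column 0, 10 fields per row).

SAMPLE = "\n".join((
    "     remote           refid      st t when poll reach   delay   offset  jitter",
    "==============================================================================",
    " LOCAL(0)        .LOCL.          10 l  20d   64    0    0.000    0.000   0.000",
    "-clock.xmission. 132.163.4.103    2 u  169  256  377   66.078   -1.302   0.164",
    "xclock.sjc.he.ne 10.200.208.2     2 u   13  256  315   65.689  999.633   2.015",
    "+tock.usshc.com  .GPS.            1 u   87  256  377   26.930   -0.550   0.121",
    "*ntp.your.org    .CDMA.           1 u   43  256  217   23.339    0.544   0.069",
))


def parse_ntpq_peers(text):
    """Return one dict per peer row found after the ==== separator."""
    peers = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("="):
            in_table = True          # peer rows begin after the separator
            continue
        if not in_table or not line.strip():
            continue
        tally, fields = line[0], line[1:].split()
        if len(fields) != 10:
            continue                 # not a row this sketch understands
        peers.append({
            "tally": tally,          # ' ', 'x', '-', '+', '*', ...
            "remote": fields[0],
            "refid": fields[1],
            "stratum": int(fields[2]),
            "offset_ms": float(fields[8]),
        })
    return peers


def missed_leap_second(peers, tolerance_ms=50.0):
    """Peers whose absolute offset is within tolerance_ms of 1000 ms."""
    return [p for p in peers
            if abs(abs(p["offset_ms"]) - 1000.0) <= tolerance_ms]


if __name__ == "__main__":
    for peer in missed_leap_second(parse_ntpq_peers(SAMPLE)):
        print(peer["tally"], peer["remote"], peer["offset_ms"])
```

In practice the text would come from running `ntpq -p` on the host rather than from a pasted sample, and the tolerance would need tuning for peers with large jitter.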
Had a set of Cisco ASR1004s running 15.4(3)S1 (on IOS-XE 03.13.01.S) all restart at around midnight UTC, and all with `Last reload reason: Watchdog`, with those boxes being at separate DCs in different regions. I'm assuming when I call TAC I'll get a "whoops; sorry".

--
Hugo Slabbert       | email, xmpp/jabber: hugo@slabnet.com
pgp key: B178313E   | also on Signal

On Sun 2017-Jan-01 01:02:24 +0000, Matthew Huff <mhuff@ox.com> wrote:
> Our batch system went belly up, but other than that, no other apparent leap second issues.
We had some ASR1001s routers reboot.
Looks like we hit this bug: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvb01730

On Sat, Dec 31, 2016 at 5:47 PM, Hugo Slabbert <hugo@slabnet.com> wrote:
> Had a set of Cisco ASR1004s running 15.4(3)S1 (on IOS-XE 03.13.01.S) all restart at around midnight UTC, and all with `Last reload reason: Watchdog`, with those boxes being at separate DCs in different regions. I'm assuming when I call TAC I'll get a "whoops; sorry".
We manage a couple of these too, running an affected version. :(
I've applied the suggested workaround, but not the recommended 24h in advance of the unexpected leap second. I'll find out tomorrow if that mattered.

On Sat, 31 Dec 2016, Richard Hicks wrote:
> We had some ASR1001s routers reboot.
> Looks like we hit this bug: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvb01730
----------------------------------------------------------------------
 Jon Lewis, MCP :)           |  I route
                             |  therefore you are
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
That bug note indicates all 3.13.* are vulnerable to this, but our 3.13.3 lab router seemed ok, no reload. Logged:

Dec 31 23:59:59: %IOSXE-5-PLATFORM: R0/0: kernel: Clock: inserting leap second 23:59:60 UTC

But the release notes seem to indicate that bugs:

CSCut82336 ASR1002-X: Handle leap second in ToD IN
CSCut65374 PTP Leap Second: ASR1002-X incorporate leap second addition 6/30/15

are open caveats in 3.13.3, and resolved in 3.13.4.

I guess I'll find out which of our 300+ ASR 1000s misbehaved on Tuesday. They're pretty much all on 3.13.5. Hopefully none had an issue. There are a lot of PSIRTs with ASRs, that's the main reason we're on later 3.13 versions.

Chuck

-----Original Message-----
From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Richard Hicks
Sent: Saturday, December 31, 2016 9:17 PM
To: Hugo Slabbert <hugo@slabnet.com>
Cc: nanog@nanog.org
Subject: Re: Someone didn't get the leap second memo...

We had some ASR1001s routers reboot.
Looks like we hit this bug: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvb01730
participants (5)
- Chuck Church
- Hugo Slabbert
- Jon Lewis
- Matthew Huff
- Richard Hicks