Nationwide Routing issues with Wiltel
Is anyone experiencing problems with the Wiltel backbone, or does anyone know of any issues with it? I called their NOC and was told they are experiencing a nationwide routing problem that they are working on, but I couldn't get any further details.
On Mon, 26 Jun 2006, Vincent India wrote:
Is anyone experiencing problems with the Wiltel backbone, or does anyone know of any issues with it? I called their NOC and was told they are experiencing a nationwide routing problem that they are working on, but I couldn't get any further details.
Told me it was related to the L3/Wiltel integration. Most of the breaks I've been seeing seem to be at the point where most of my traffic has been going from Wiltel to L3 in DC or so. Oh, and that the "former Wiltel" tier 1 guys had 40+ calls in the hold queue...

---
david raistrick        http://www.netmeister.org/news/learn2quote.html
drais@atlasta.net      http://www.expita.com/nomime.html
On Mon, 26 Jun 2006, david raistrick wrote:
Told me it was related to the L3/Wiltel integration. Most of the breaks I've
To add the notable quotes: "running scripts updating BGP configs" and "don't know why they're doing it in the middle of the day"

..d

---
david raistrick        http://www.netmeister.org/news/learn2quote.html
drais@atlasta.net      http://www.expita.com/nomime.html
On Mon, Jun 26, 2006 at 04:55:04PM -0400, david raistrick wrote:
On Mon, 26 Jun 2006, Vincent India wrote:
Is anyone experiencing problems with the Wiltel backbone, or does anyone know of any issues with it? I called their NOC and was told they are experiencing a nationwide routing problem that they are working on, but I couldn't get any further details.
Told me it was related to the L3/Wiltel integration. Most of the breaks I've been seeing seem to be at the point where most of my traffic has been going from Wiltel to L3 in DC or so. Oh, and that the "former Wiltel" tier 1 guys had 40+ calls in the hold queue...
I can confirm this as well, although have no proof (at this point) that Layer3 is necessarily to blame. We (or rather the company I work for) are seeing similar between MCI/UU/Verizon and WCG when reaching some of our clients:

 4. 0.so-4-0-0.CL1.LAX15.ALTER.NET      0.0%   102   10.9  11.3  10.9  17.1   0.7
 5. POS6-0.GW1.LAX15.ALTER.NET          0.0%   102   10.7  11.1  10.7  13.9   0.5
 6. wcgGigELAX-gw.customer.alter.net   98.0%   102  135.4 147.0 135.4 158.5  16.3
 7. anhmca1wcx2-pos6-1-oc48.wcg.net    98.0%   102  162.9 156.8 150.8 162.9   8.6

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB  |
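The per-hop columns above are mtr's standard report output (loss, packets sent, last/avg/best/worst latency and jitter). For anyone who wants to collect the same kind of report toward a path they suspect is affected, a minimal sketch in Python, assuming mtr is installed locally and using a placeholder target rather than any host from the trace above:

import subprocess

# Placeholder target (TEST-NET-1); substitute a host reached over the path you suspect.
TARGET = "192.0.2.1"

# mtr report mode: -r prints a per-hop summary after the run finishes,
# -c 102 sends 102 probes per hop (the same sample size as the report above).
result = subprocess.run(
    ["mtr", "-r", "-c", "102", TARGET],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)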
On Mon, June 26, 2006 3:03 pm, Jeremy Chadwick wrote:
I can confirm this as well, although have no proof (at this point) that Layer3 is necessarily to blame. We (or rather the company I work for) are seeing similar between MCI/UU/Verizon and WCG when reaching some of our clients:
From the Cogent NOC:
"Yes, we became aware of a problem within the Level 3 network that affected all routing from all affiliated ISPs on the Internet. Many of our own customers called because they were being affected. At approximately 3:45 PM EST, the Cogent NOC was able to contact the Level 3 NOC and obtain information that the issue should be resolved within 15-20 minutes from that time. The specifics were not released either to our NOC or from Level 3's NOC."
FYI, Level 3 had a pre-existing peering issue with UUNet/Alternet at the end of last week that was due to issues on an OC48. If that hasn't been resolved, it is a separate (or compounding) issue from today's Wiltel outage. We have been told by L3 that it was an outage on the legacy Wiltel network. From what we gather, they had to reroute a lot of that traffic onto other parts of their network, causing increased total traffic and, in turn, latency, but that has not been confirmed to us.

-Scott

-----Original Message-----
From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Jeremy Chadwick
Sent: Monday, June 26, 2006 5:04 PM
To: nanog@merit.edu
Subject: Re: Nationwide Routing issues with Wiltel

On Mon, Jun 26, 2006 at 04:55:04PM -0400, david raistrick wrote:
On Mon, 26 Jun 2006, Vincent India wrote:
Is anyone experiencing problems with the Wiltel backbone, or does anyone know of any issues with it? I called their NOC and was told they are experiencing a nationwide routing problem that they are working on, but I couldn't get any further details.
Told me it was related to the L3/Wiltel integration. Most of the breaks I've been seeing seem to be at the point where most of my traffic has been going from Wiltel to L3 in DC or so. Oh, and that the "former Wiltel"
tier 1 guys had 40+ calls in the hold queue...
I can confirm this as well, although have no proof (at this point) that Layer3 is necessarily to blame. We (or rather the company I work for) are seeing similar between MCI/UU/Verizon and WCG when reaching some of our clients:

 4. 0.so-4-0-0.CL1.LAX15.ALTER.NET      0.0%   102   10.9  11.3  10.9  17.1   0.7
 5. POS6-0.GW1.LAX15.ALTER.NET          0.0%   102   10.7  11.1  10.7  13.9   0.5
 6. wcgGigELAX-gw.customer.alter.net   98.0%   102  135.4 147.0 135.4 158.5  16.3
 7. anhmca1wcx2-pos6-1-oc48.wcg.net    98.0%   102  162.9 156.8 150.8 162.9   8.6

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB  |
On Mon, 26 Jun 2006, Vincent India wrote:
Is anyone experiencing problems with the Wiltel backbone, or does anyone know of any issues with it? I called their NOC and was told they are experiencing a nationwide routing problem that they are working on, but I couldn't get any further details.
I have a box sitting in a colo off a WCG circuit in Columbus, OH; traceroutes from the west coast were dying a few hops short of the colo facility, but I'm not a direct customer of WCG, so calling them for info would have been pointless...

--
Steve Sobol, Professional Geek   ** Java/VB/VC/PHP/Perl ** Linux/*BSD/Windows
Apple Valley, California         PGP:0xE3AE35ED

It's all fun and games until someone starts a bonfire in the living room.
Steve Sobol wrote:
On Mon, 26 Jun 2006, Vincent India wrote:
Is anyone experiencing problems with the Wiltel backbone, or does anyone know of any issues with it? I called their NOC and was told they are experiencing a nationwide routing problem that they are working on, but I couldn't get any further details.
I have a box sitting in a colo off a WCG circuit in Columbus, OH; traceroutes from the west coast were dying a few hops short of the colo facility, but I'm not a direct customer of WCG, so calling them for info would have been pointless...
As a customer, we were not able to get through to L3 on the phone. Apparently prefix filtering wasn't working so well either, given that AS27251 was managing to announce 38/8, 64/8, and 67/8 with L3 happily passing it along.

--
Mark Radabaugh
Amplex
mark@amplex.net   419.837.5015
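As a rough illustration of what the prefix filtering and max-prefix limits discussed in this thread are meant to catch, here is a minimal sketch in Python. The thresholds are invented for the example and the announcement data is just the leak reported above; this is not any carrier's actual policy:

import ipaddress

# Hypothetical thresholds; real policies are per-peer and far more detailed.
MAX_PREFIXES_PER_SESSION = 500   # example max-prefix limit on a customer session
MIN_PREFIX_LENGTH = 9            # treat /8 or broader as implausible from a customer

# The announcement reported above.
received = {"AS27251": ["38.0.0.0/8", "64.0.0.0/8", "67.0.0.0/8"]}

for asn, prefixes in received.items():
    if len(prefixes) > MAX_PREFIXES_PER_SESSION:
        print(f"{asn}: over the max-prefix limit; the session would be shut down")
        continue
    for prefix in prefixes:
        if ipaddress.ip_network(prefix).prefixlen < MIN_PREFIX_LENGTH:
            print(f"{asn}: rejecting {prefix} (too broad to be a legitimate customer route)")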
On Mon, Jun 26, 2006 at 04:39:17PM -0400, Vincent India wrote:
Is anyone experiencing problems with the Wiltel backbone, or does anyone know of any issues with it? I called their NOC and was told they are experiencing a nationwide routing problem that they are working on, but I couldn't get any further details.
Was anyone able to get an RFO or post-mortem for this?

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB  |
On Tue, 27 Jun 2006, Jeremy Chadwick wrote:
Was anyone able to get an RFO or post-mortem for this?
"An inaccurate set of BGP policies were distributed to routers connected to AS791 1 when an automated update script ran at 1100 MDT. The update script regularly ran every two hours to update the network with current BGP information. Due to the scheduled shutdown of the legacy BGP policy server and subsequent con version to the Level3 route registry engine, the old server policy server was sh utdown. In addition, the scripts used to update routes on the network were to be disabled. One of these scripts wasn t disabled as intended. As a result, the script ran as scheduled at 1300MDT and consequently pushed partial configurations to production routers because the script was unable to communicate with decommissioned policy server. Incorrect policies were exchanged between AS7911 s customers and peers resulted in increased latency; as large route blocks attem pted to traverse individual customer connections. Repair Updated configurations were pushed to all the routers, individual connections were cleaned up and BGP sessions were restored. In addition, the automated BGP script has been shut-off. Maximum pre-fix list limits have been established across the network as a risk mitigation step. " --- david raistrick http://www.netmeister.org/news/learn2quote.html drais@atlasta.net http://www.expita.com/nomime.html
participants (7)

- Berkman, Scott
- Chris Stone
- david raistrick
- Jeremy Chadwick
- Mark Radabaugh
- Steve Sobol
- Vincent India