Re: Carrier Circus (was RE: Intermedia (ICIX) brokenness...)
On Fri, May 04, 2001 at 12:18:18PM -0700, Jonathan Disher wrote:
Personally, I'm still trying to figure out why Exodus, in all their apparent wisdom (or lack thereof), has stopped using the GBLX OC-48's in the former GlobalCenter facilities (or at least SNV3), and is now shuttling all its traffic out a single Exodus OC-12. Prior to yesterday these traces would've shown gblx.net routers (on different IPs), and would never have touched an exodus backbone...
Hrm, let's think about that for a moment, shall we? Could it be, perhaps, that Exodus purchased GlobalCenter and is integrating those facilities into their network? Could it also be that Exodus has a well-designed network where most of the traffic is quickly sent to peers and an OC48 backbone is not required? I don't see any congestion on that OC12, so perhaps that is the case? I also don't see a damn thing wrong with the traceroute you provided, and an OC12 peer to UU is pretty good. Was there some other complaint, or do you just not like it when your traceroute changes?
Of course, this is probably a move I should've expected from Exodus, after the mongolian flustercluck that was the AS change in SNV3. You'd think they would do something like that carefully, as you can -seriously- bone customers. But noooooo. One of our junior admins made the change (since I was out of town, but hey, it's cut and paste!). He, and all of the other affected customers in SNV3 on the conference call, were left on hold for about half an hour (plus the call started half an hour late), whereupon the exodus engineering team popped back in and said "We're done with our side, you guys go ahead!".
Actually, I was awake for that. I guess your junior engineer wasn't able to figure out that if he simply put in an additional neighbor statement with the new AS, your downtime would have been less than 30 seconds as BGP came back up. 30 second outages are pretty light in the history of GCTR and GBLX outages. If you can't handle maintenance, then you should have set up static routes out or multihomed, but you shouldn't blame your stupidity or lack of forethought on other networks.
Now. Does it seem logical to kill connectivity over BOTH of your hosting routers at once, thus killing every single BGP-running customer you have that isn't physically in their cage at the time? Or would it seem better to do what I assumed they'd do, which is do one router, wait for everyone to make changes, then do the other?
ASN changes are not exactly easy or frequent, but I seem to recall that one going over rather smoothly. Customers were given ample warning and a conference call was set up to handle any outstanding issues, of which there were none.
I guess this is what happens when I assume intelligence at a hosting/backbone provider.
Or when we assume intelligent posts to nanog...

-- 
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177 (67 29 D7 BC E8 18 3E DA B2 46 B3 D8 14 36 FE B6)
On Fri, 4 May 2001, Richard A. Steenbergen wrote:
Personally, I'm still trying to figure out why Exodus, in all their apparent wisdom (or lack thereof), has stopped using the GBLX OC-48's in the former GlobalCenter facilities (or at least SNV3), and is now shuttling all its traffic out a single Exodus OC-12. Prior to
network where most of the traffic is quickly sent to peers and an OC48 backbone is not required? I don't see any congestion on that OC12, so perhaps that is the case? I also don't see a damn thing wrong with the
Just to give correct information: from SNV3, Exodus has an OC48 and an OC12 to two other datacenters in Santa Clara and an OC48 to Chicago. There is enough bandwidth to carry the traffic. No peering or backbone link was hurt during this move.

Thanks
Christian

Yes, I work for Exodus.
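For anyone wanting to put numbers on the links listed above, here is a rough sketch of the arithmetic. It assumes only the standard SONET relationship that an OC-n circuit runs at n x 51.84 Mbit/s line rate (usable payload is somewhat lower). The link labels are made up for illustration, and nothing in the thread says how much traffic these links actually carry.

```python
# SONET OC-n line rate is n x 51.84 Mbit/s (the OC-1 base rate).
# Kept in integer kbit/s to avoid floating-point rounding.
OC1_KBITS = 51_840


def oc_rate_kbits(n: int) -> int:
    """Return the line rate of an OC-n circuit in kbit/s."""
    if n <= 0:
        raise ValueError("OC level must be a positive integer")
    return n * OC1_KBITS


# Links out of SNV3 as described in the post. The labels are hypothetical;
# only the OC levels come from the thread.
snv3_links = {
    "santa-clara-1": 48,
    "santa-clara-2": 12,
    "chicago": 48,
}

total_kbits = sum(oc_rate_kbits(n) for n in snv3_links.values())

for name, level in snv3_links.items():
    print(f"{name}: OC-{level} = {oc_rate_kbits(level) / 1000:.2f} Mbit/s")
print(f"aggregate line rate: {total_kbits / 1000:.2f} Mbit/s")
```

This is line rate only. Whether it is "enough" depends on the actual load, which is not given here.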
On Fri, May 04, 2001 at 03:47:09PM -0400, Richard A. Steenbergen wrote:
30 second outages are pretty light in the history of GCTR and GBLX outages.
I would definitely have to agree. Personally, I think this speaks volumes. I have spent 4 hours on the phone with GCTR trying to find someone capable of understanding that HSRP does not work when you put interfaces in the same subnet on different VLANs..... I have spent 5 hours waiting for GCTR to diagnose a simple failed linecard. That was just the diagnosis; it took another 7 hours or so to actually replace it. You're right about multihoming; anyone in a GCTR facility should /definitely/ be multihomed. --msa
participants (3)
- Christian Nielsen
- Majdi S. Abbas
- Richard A. Steenbergen