On Sat, Oct 08, 2005 at 07:24:06AM -0700, JC Dill wrote:
> Cogent was a "tier 1" until prior de-peering incidents left them unable to reach other networks. They solved this by buying filtered transit thru Verio to reach the networks they couldn't reach via peering.
For the record, Cogent was never a Tier 1. They have never had Sprint peering (unless you count the 30 seconds between acquisition of a company that did have it, and the depeering notice, years ago). Cogent's history of depeering debacles, at least as best as I can remember them, is:

ATDN (AS1668) depeers Cogent, December 18 2002.
http://www.cctec.com/maillists/nanog/historical/0212/msg00366.html
http://www.cctec.com/maillists/nanog/historical/0212/msg00412.html

ATDN is in the process of shutting off legacy transit and peering on its path to tier-1-dom, and disconnects Cogent due to ratio (also technically Cogent is still on a trial peering session). At this time, Cogent is a full transit customer of AboveNet (AS6461), ATDN is still a full transit customer of Level 3, and Cogent is a peer of Level 3.

Following the depeering, Cogent shifts 100% of the traffic to their (3) peers, which become severely congested nearly 24/7 for several weeks. Despite being able to send some traffic to AboveNet transit, they decide to leave traffic congested to (3) to see if ATDN will repeer (not knowing that AOL customers don't know what peering is and thus won't be nearly as vocal as Cogent's customers). Traffic stays congested until Cogent's peering capacity with (3) is upgraded. ATDN later switches their routing with (3) from transit to customer-only routes (removing the last of their transit paths), at which point Cogent shifts traffic to newly acquired Verio transit to reach them.

Teleglobe (AS6453) depeers Cogent, some time in Feb 2005?
Don't ask me why but I can't find a NANOG thread discussing this.

Teleglobe depeers Cogent due to various ratio and market pressure issues. Of note is that Cogent has recently entered the Canadian market where Teleglobe has a strong presence, and has started giving away free or nearly free transit to large inbound networks. Teleglobe is a Sprint customer, and Cogent reaches Sprint through Verio.
Teleglobe is caught completely off-guard when Cogent refuses to accept the route via Sprint transit, and blocks traffic between the networks. This continues for several days, until eventually routes are leaked/added from Teleglobe to SAVVIS (AS3561), who Cogent peers with. This continues for a few days more until Teleglobe finally agrees to repeer Cogent.

France Telecom (OpenTransit/AS5511) depeers Cogent, April 14 2005
http://www.merit.edu/mail.archives/nanog/2005-04/msg00484.html

FT depeers Cogent due to, well, a variety of issues and general unhappiness surrounding Cogent's entrance into their markets through the purchase of Lambdanet. FT is a Sprint customer, and Cogent is already receiving Sprint routes via Verio but intentionally blocks these routes so that they have no path to FT.

The rumored resolution to the dispute is that an FT customer sues Cogent in France, and a French judge either does or is about to fine the hell out of Cogent unless connectivity is restored. At this point Cogent caves, and begins accepting the routes via Sprint (via Verio).

Of course I am certain there are a lot more depeerings (both from and to Cogent) that did not make the news, but these are the big notable events that dramatically impacted connectivity.

For anyone keeping score, the last two times Cogent was depeered, it responded by intentionally blocking connectivity to the network in question, despite the fact that both of those networks were Sprint customers and thus perfectly reachable under the Sprint transit Cogent gets from Verio. While no one has come forward to say if the Cogent/Verio agreement is structured for full transit or only Sprint/ATDN routes, Cogent has certainly set a precedent for intentionally disrupting connectivity in response to depeering, as a scare tactic to keep other networks from depeering them.
> L3 was hoping to force Cogent to increase that transit to include the traffic destined for L3's customers, thus raising Cogent's transport costs at no additional (transport) cost to L3.
As I've already pointed out, L3 depeering Cogent is in fact a major revenue loss for L3. Not only will they not make any money off of Cogent (since we both know Cogent will NEVER give them money for direct transit), but Cogent will heavily depref them and shift many many gigabits of traffic away from L3 and onto their competitors, traffic that L3 was previously billing their customers for. They'll also lose customers during the unreachability, and even if Cogent buckles and buys transit they'll lose some outbound traffic from their multihomed customers, due to a longer AS-path length to reach Cogent and many of Cogent's routes (11k of them, remember).

Let me be perfectly clear here: under absolutely no line of logic will L3 see an increase in revenue from this, period. If you think they will, you don't understand how the Internet works. What L3 will see from this is a REDUCTION IN BILLABLE TRAFFIC AND BACKBONE UTILIZATION.
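The AS-path point can be sketched in a few lines of Python. This is a heavily simplified stand-in for BGP route selection (local-pref, then AS-path length, nothing else), not a real implementation. AS174 (Cogent) and AS3356 (Level 3) are the real AS numbers; `OTHER_PROVIDER` and `TRANSIT` are hypothetical placeholders from the documentation range, and the assumption that the customer's second provider still reaches Cogent directly is mine.

```python
from dataclasses import dataclass

COGENT = 174            # real AS number
LEVEL3 = 3356           # real AS number
OTHER_PROVIDER = 64500  # hypothetical second upstream of a multihomed customer
TRANSIT = 64501         # hypothetical transit Cogent might buy after depeering


@dataclass
class Route:
    via: str         # which upstream the customer learned the route from
    local_pref: int  # customer's local preference for that upstream
    as_path: tuple   # AS path as seen by the customer


def best_route(routes):
    """Simplified selection: highest local-pref, then shortest AS path.

    Real BGP has further tie-breakers; on a tie this sketch just keeps
    the first candidate.
    """
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# Before the depeering: both upstreams reach Cogent directly.
before = [
    Route("Level 3", 100, (LEVEL3, COGENT)),
    Route("other provider", 100, (OTHER_PROVIDER, COGENT)),
]

# After: Level 3 only hears Cogent's routes through a transit network,
# so the path via Level 3 is one AS hop longer.
after = [
    Route("Level 3", 100, (LEVEL3, TRANSIT, COGENT)),
    Route("other provider", 100, (OTHER_PROVIDER, COGENT)),
]

print("before:", best_route(before).via)
print("after: ", best_route(after).via)
```

Under these assumptions the customer's outbound traffic toward Cogent moves off Level 3 once the path through it grows longer, which is traffic Level 3 no longer bills for.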
>> 3) Possible traffic issues. Was Cogent guilty of not transporting the Level3-bound packets within the Cogent network to the closest point-of-entry peer to the host in the Level3 network, therefore "costing" Level3 transit of their own packets?
>
> Possible, in fact probable. Most ISPs hand off traffic to peers under a "hot potato" policy, they hand it off at the closest point where they connect. If the traffic is equal in both directions then this works. If the traffic is not equal, then this lowers the cost of the network that has high outbound traffic, as the other network bears the brunt of the total cost for transporting the combined traffic between their respective customers.
Do you know why people hot potato traffic? Because MEDs suck. In addition to the obvious aggregation issues (for example, how do you assign a MED value to 4.0.0.0/8 when it is used around the world), they usually end up producing sub-optimal routing. IGP cost is a view of what it costs YOU to get the packet off your network. MED values set to the opposing network's IGP cost are a view of what it costs THEM to get the packet off their network. Neither is a complete view of reality, and the MED view just happens to be worse.

Consider a simple scenario. You operate a major network, you peer with someone who operates a major network, you both intelligently aggregate your prefixes and work with your customers to make certain that everything in BGP maps to a specific geographic region, and you both interconnect with each other in the usual "maximum reasonable extent possible" locations (New York, Ashburn, Chicago, Dallas, San Jose, Los Angeles, Seattle, Atlanta, Miami) across the US.

Now let's say you have a customer who is in Chicago, and they're sending data to a customer in, oh let's say Denver. In hot-potato routing, you get the packet off your network in Chicago, and then the other network uses its more complete and detailed understanding of where this packet is going within its own network to know that Chicago->Denver is a straight shot. In a cold potato situation however, you are only looking at the other network's IGP cost, not your own. Denver is pretty much dead center in the middle of San Jose, Chicago, and Dallas, and which one is "closer" is really up for grabs. On the vast majority of networks, Dallas is actually closest by IGP cost, with San Jose a close second, and Chicago a close third.
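To put rough numbers on the Chicago-to-Denver scenario, here is a minimal Python sketch. Every cost value is made up for illustration; only the ordering of the other network's costs (Dallas closest to Denver, then San Jose, then Chicago) follows the description above. The names `my_cost`, `their_cost`, `hot_potato_exit`, `cold_potato_exit` and `total_cost` are hypothetical, and both cost tables are put on one common scale purely so they can be compared here.

```python
# Hypothetical IGP costs on one shared scale (real networks have no such
# shared scale; this is only for illustration).

# What it costs MY network to carry the packet from the Chicago customer
# to each interconnection point.
my_cost = {"Chicago": 1, "Dallas": 20, "San Jose": 45}

# What it costs the OTHER network to carry the packet from each
# interconnection point to its customer in Denver. Ordering as described:
# Dallas closest, San Jose a close second, Chicago a close third.
their_cost = {"Chicago": 24, "Dallas": 20, "San Jose": 22}


def hot_potato_exit(mine):
    """Exit where it is cheapest for my own network (lowest own IGP cost)."""
    return min(mine, key=mine.get)


def cold_potato_exit(theirs):
    """Exit the other network asks for (lowest MED, i.e. their IGP cost)."""
    return min(theirs, key=theirs.get)


def total_cost(exit_point):
    """Combined haul: mine to the exit plus theirs from the exit to Denver."""
    return my_cost[exit_point] + their_cost[exit_point]


hot = hot_potato_exit(my_cost)
cold = cold_potato_exit(their_cost)
print(f"hot potato exits at {hot}, combined cost {total_cost(hot)}")
print(f"cold potato exits at {cold}, combined cost {total_cost(cold)}")
```

With these invented numbers, honoring MEDs sends the packet out through Dallas even though the Chicago handoff has the lower combined cost, because the MED decision never looks at my side of the haul.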
If you're cold potato'ing to try and improve routing, even under the most ideal conditions possible (which given the current financial state of the carriers involved RARELY happens these days), you're going to end up hauling packets to some out of the way place like SJC or DFW, and then the other network is going to end up hauling packets back to Denver. You both lose the "saving money by hauling traffic less" game, and your customers lose in suboptimal routing.

The heart of the problem is that you need to consider your cost + their cost to have MEDs be effective, even if you solved all the implementation issues that you will never practically solve. Unfortunately since two networks have no way to coordinate metrics on the same "scale" (my 43ms may be 4300 IGP cost, your 43ms may be 43, and joe bob's 43ms cost may be 9182), you have no reliable way to "add" the two costs.

Now, networks who are looking for equity in the ratios have a choice. They can either:

* Spend thousands of man hours deaggregating (and then listen to you complain about polluting the routing table with prefixes)

* Spend millions of dollars deploying more gear into more interconnection locations in areas of network presence but not peering presence (Denver, St Louis, Kansas City, New Orleans, Tampa, Phoenix, etc etc etc), all in areas without well defined peering locations where they are likely to end up in buildings across the block but which cost thousands of dollars to connect, or

* Establish these as smaller interconnections across telco circuits, again spending thousands of dollars a month more in circuits, hundreds of thousands of dollars in ports, tens of thousands of man hours managing capacity at five dozen new interconnections around the world, all while reducing to almost zero the ability for a major interconnection to fail over to another major interconnection during a maintenance, fiber cut, network event, etc.

* Break their customers' routing in the process of doing all this.

-OR-

* Depeer said network, expect that they will buy transit from Verio or any of the other dozens of networks who provide this service, and that whoever ends up interconnecting with them to deliver the traffic will have equitable traffic.

Now, which one do you think they're going to pick?
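On the earlier point about metric scales, a short Python sketch of why you can't just add your IGP cost to the other network's MED. The scale factors mirror the "43ms may be 4300" and "43ms may be 43" figures above; the latencies, exit names and function names are hypothetical.

```python
# Each network expresses the same latency in its own arbitrary IGP units.
MY_UNITS_PER_MS = 100    # my 43 ms shows up as an IGP cost of 4300
THEIR_UNITS_PER_MS = 1   # their 43 ms shows up as a MED of 43

# Hypothetical latencies in ms: (my haul to the exit, their haul onward).
latency_ms = {
    "exit A": (1, 40),
    "exit B": (5, 10),
}


def naive_sum(exit_point):
    """Add my raw IGP cost to their raw MED, ignoring the scale mismatch."""
    mine_ms, theirs_ms = latency_ms[exit_point]
    return mine_ms * MY_UNITS_PER_MS + theirs_ms * THEIR_UNITS_PER_MS


def true_total_ms(exit_point):
    """What the sum would be if both sides used the same unit."""
    mine_ms, theirs_ms = latency_ms[exit_point]
    return mine_ms + theirs_ms


naive_best = min(latency_ms, key=naive_sum)
true_best = min(latency_ms, key=true_total_ms)
print("naive sum picks:", naive_best)
print("common scale picks:", true_best)
```

Because my units dwarf theirs, the naive sum is effectively just my own IGP cost and picks exit A, while the same latencies on a common scale favor exit B. Without an agreed scale there is no way to tell from the raw numbers which case you are in.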
> There are ways to deal with it though, like cold potato routing.
Spoken like someone who has never dealt with the reality of running a large network, or dealt with customers wondering why you are routing their traffic across the country and back again. Anyone who values the quality of their connectivity will stick to arm-chair engineering and not actually build a network this way.

-- 
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)