Hi all - I understand that there is a real glut of AP transoceanic capacity, particularly on the Japan-US cable, where twice as much capacity is idle as is in use. This has sent the price point down to historic levels, O($28K/mo for STM-1), or less than $200/Mbps for transport! This is approaching an attractive price point for long distance peering so, just for grins,...

Are there transport providers that can provide a price point around $100/Mbps for transport capacity from Tokyo to the U.S. (LAX/SJO)?

What are the technical issues with extreme long distance (transoceanic) peering? In particular, what are the issues interconnecting layer 2 switches across the ocean for the purposes of providing a global peering cloud using:

0) vanilla circuit transport to interconnect the switches
1) MPLS framed ethernet service to interconnect the switches
2) tunnelled transport over transit to interconnect the switches

Thanks in advance.

Bill
Rearranged slightly.
What are the technical issues with extreme long distance (transoceanic) peering?
In particular, what are the issues interconnecting layer 2 switches across the ocean for the purposes of providing a global peering cloud using:
In the generic sense, the issues are largely the same as interconnecting to the L2 switch in the customer's cage (operated by the customer as an aggregation device) or the L2 switch implementing another exchange fabric in the same building or metro. Complex L2 topologies are hard to manage, especially when the devices that implement that topology are not all managed by the same entity. L2 has poor policy knobs for implementing administrative boundaries; if such a boundary is involved, the events that you need to anticipate are largely the same whether the switches are separated by a cage wall or an ocean.

The auto-configuring properties of L2 fabrics (like MAC learning, or spanning tree) that make it an attractive edge technology can be very detrimental to smooth operation when more than one set of hands operates the controls. An exchange point is, quite literally, an implementation of an administrative boundary; the desire of customers to use L2 devices themselves (for aggregation of cheap copper interfaces toward expensive optical ones, use of combined L2/L3 gear, or whatever other reason) means that the L2 fabric of any Ethernet-based exchange has potentially as many hands on the controls as there are customers.

So, good operational sense for the exchange point operator means exercising as much control over those auto-configuring properties as is possible; turning them off or turning automatic functions into manual ones. Did I mention that L2 has lousy knobs for policy control? (They're getting a little better, so perhaps whatever is a notch better than "lousy" is appropriate). One of the ones that you have to turn off is spanning tree (read the thread from a few months back on the hospital network that melted down for background material). That means that you have to build a loop-free topology out of reliable links, which you can get with all three of the technologies you mention, but you have to build a loop-free topology.
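To make the spanning-tree point concrete, here is a toy Python sketch (not any real switch implementation; the four-switch full-mesh topology and single-broadcast scenario are invented purely for illustration) of what flooding does in a redundant L2 topology once loop prevention is turned off: copies of a single broadcast frame multiply at every forwarding step.

```python
# Toy model of four L2 switches in a full mesh with spanning tree disabled
# (hypothetical topology, for illustration only). Every switch floods a
# broadcast out all inter-switch ports except the one it arrived on, so
# with redundant paths a single frame multiplies without bound: a storm.

def simulate(ticks):
    switches = range(4)
    # frames in flight, as (receiving switch, switch it came from);
    # switch 0 floods one broadcast to its three neighbors to start
    in_flight = [(s, 0) for s in switches if s != 0]
    copies_per_tick = []
    for _ in range(ticks):
        copies_per_tick.append(len(in_flight))
        next_flight = []
        for sw, came_from in in_flight:
            # flood out every inter-switch port except the ingress port
            for nbr in switches:
                if nbr != sw and nbr != came_from:
                    next_flight.append((nbr, sw))
        in_flight = next_flight
    return copies_per_tick

# In-flight copies double every tick
print(simulate(4))  # [3, 6, 12, 24]
```

With spanning tree on, redundant links are blocked and the frame dies out after one flood; with it off, as the exchange operator must do, the loop-free property has to come from the topology itself.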
In order to use inter-metro connectivity efficiently, you are limited to building L2 patches that each implement pairwise connectivity between two metros. That makes this:
0) vanilla circuit transport to interconnect the switches
hard, because your interior connectivity is dedicated to one of those pairwise combinations (hard, but not impossible, assuming you have some capex to throw at the problem). The pairwise limitation also, indirectly, puts the kibosh on using this fabric as a means to pretend that a network has a backbone in order to qualify for peering that it wouldn't get otherwise. That leaves these two:
1) MPLS framed ethernet service to interconnect the switches 2) tunnelled transport over transit to interconnect the switches
which will carry the exchange point traffic over an L3 (okay, so MPLS is "L2.5") network; in addition, you get the benefit of being able to have all the L3 knobs available in the interior to do traffic engineering. Both options perform better when the interior MTU can accommodate the encapsulation protocol plus payload without fragmenting, so someone is operating routers with SONET interfaces in this equation.

Cui bono?

- The operator of the L3 network that carries the inter-EP fabric gets revenue.
- The people who peer using this L2 fabric get to avoid some transit, but I would argue that it is only to reach providers that are similarly desirous of avoiding transit, since this won't help the upwardly mobile with the geographic diversity component of getting to the "next" tier.

Who loses?

- Transit providers who came to the exchange point for the purpose of picking up transit sales.
- If the exchange point operator is the one carrying the traffic, they lose for competing with their customers in the previous bullet; they will have taken the first steps on the path from being an exchange point operator to being a network-plus-colo provider (where they'll compete with the network-plus-colo providers just coming out of bankruptcy with all their debt scraped off).

So far, there has been an assumption that the provider of inter-EP connectivity is willing to portion it out in a manner that is usage-insensitive for the participants. I don't believe that the glut of capacity or the other expenses that come with operating an L3 network have driven the costs so low that the resulting product is "too cheap to meter."
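The MTU remark above reduces to simple arithmetic; a quick sketch (header sizes below are common defaults, assumed here for illustration, and ignore extras like VLAN tags or control words that a real deployment would also budget for):

```python
# Back-of-the-envelope check of whether an encapsulated Ethernet frame
# fits inside the interior MTU without fragmenting. A ~4470-byte
# POS/SONET interior easily carries a full Ethernet payload plus
# encapsulation; an untuned 1500-byte Ethernet interior does not.

ETH_HEADER = 14   # dst MAC + src MAC + ethertype
MPLS_LABEL = 4    # one MPLS label stack entry
PAYLOAD = 1500    # standard Ethernet payload

def fits(interior_mtu, labels=2):
    """True if payload + Ethernet header + label stack fits the MTU."""
    frame = PAYLOAD + ETH_HEADER + labels * MPLS_LABEL
    return frame <= interior_mtu

print(fits(4470))  # SONET/POS interior: True
print(fits(1500))  # untuned Ethernet interior: False
```

Hence the observation that someone in this equation is running routers with SONET interfaces: the big interior MTU is what makes the encapsulation painless.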
If that is the case, then delivering robust, auditable service is better implemented by connecting the customers up to the L3 gear and implementing their L2 connections as pairwise virtual circuits between customers (so you can be sure you're not paying remote rates to talk to a local customer, or billing local rates to traffic that a customer exchanges over the wide area). We effectively just took the exchange point switch fabric out of the equation and layered the peering fabric as VCs (MPLS or tunneled Ethernet) on a transit network where each customer is connected on a separately auditable interface. At that point:

- the people doing the peering might just as well use their transit, unless they work together (in pairwise combinations, so this doesn't scale well) to get on-net pricing from a common transit provider.
- the people doing the peering might as well just buy a circuit, if they're going so cheap, or build out their own network in hopes of making the grade with respect to the geographic diversity component of other people's peering requirements.
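The "pairwise combinations, so this doesn't scale well" point is just the handshake count; a quick sketch (illustrative only):

```python
# Dedicated pairwise links (circuits, VCs, or bilateral deals) needed to
# fully mesh n endpoints grow quadratically: n * (n - 1) / 2.

def full_mesh_circuits(n):
    """Number of distinct pairs among n endpoints."""
    return n * (n - 1) // 2

for n in (2, 4, 8, 16):
    print(n, full_mesh_circuits(n))
# 2 metros need 1 circuit; 16 metros need 120
```

This is the same quadratic that makes a shared fabric attractive in the first place: one port per participant instead of one circuit per pair.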
I understand that there is a real glut of AP transoceanic capacity, particularly on the Japan-US cable where twice as much capacity is idle as is in use. This has sent the price point down to historic levels, O($28K/mo for STM-1) or less than $200/Mbps for transport! This is approaching an attractive price point for long distance peering so, just for grins,...
Consumed at what quantities? Parceling this out at single-digit-Mb/s quantities increases the cost at the edge of the network that delivers this.
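The parceling objection is easy to quantify; a rough sketch of the arithmetic behind the figures quoted upthread (assumed: an STM-1 carries ~155 Mb/s and the circuit runs $28K/month, per the original post; edge-gear costs would come on top):

```python
# The per-Mb/s price only approaches the headline number if the whole
# pipe is used; parceled out in small quantities, the effective rate
# climbs steeply -- before even counting the edge network that delivers
# the parcels.

STM1_MBPS = 155
MONTHLY_USD = 28_000.0

def usd_per_mbps(utilized_mbps):
    """Effective monthly transport cost per Mb/s actually used."""
    return MONTHLY_USD / utilized_mbps

print(round(usd_per_mbps(STM1_MBPS), 2))  # full pipe: ~180.65
print(round(usd_per_mbps(10), 2))         # 10 Mb/s parcel: 2800.0
```

In other words, "$180/Mbps" is a statement about a full STM-1, not about what a single-digit-Mb/s peer would actually pay.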
Are there transport providers that can provide a price point around $100/Mbps for transport capacity from Tokyo to the U.S. (LAX/SJO) ?
SJO is in Costa Rica, bud. :-) Stephen VP Eng., PAIX
- Transit providers who came to the exchange point for the purpose of picking up transit sales.
- If the exchange point operator is the one carrying the traffic, they lose for competing with their customers in the previous bullet; they will have taken the first steps on the path from being an exchange point operator to being a network-plus-colo provider (where they'll compete with the network-plus-colo providers just coming out of bankruptcy with all their debt scraped off).
i'm still amazed that nobody has brought up the fact that a couple of the larger colo/exchange operators that claimed they wouldn't compete with their IP customers are indeed selling IP transit-- intentionally undercutting the prices of the providers that colo'd there to sell transit partly because the colo/exchange operator kept telling the world that they would never compete with their customers in the IP transit space.

clearly, interconnecting their exchange points to create a richly-connected Internet 'core' is a natural progression if their customers don't complain too loudly.

not that it's a bad long-term plan-- but I do agree with Stephen in that it'll be tough for them to survive against the debt-free big boys if they emerge as clear network-plus-colo competitors and lose the few remaining bits of their 'neutral' facade.

- jsb

--
Jeff Barrows, President
Firefly Networks
http://FireflyNetworks.net
+1 703 287 4221 Voice
+1 703 288 4003 Facsimile

An Advanced Internet Engineering & Professional Services Organization
Both Stephen and Jeff are correct. And I'm not sure it would be in the best interests of the colo company to be a jack of all trades.

Where I do see a benefit is where an exch pt company wants to start a new one in a new city. It's the classic chicken and the egg. Where I have promoted allowing a beta group of peers to jump in for little or no charge till say peer #6, another solution is to connect that new exch pt to a successful one at another location, allowing the new peers at location B to see the old peers at location A. This would allow a critical mass of peers immediately, and would allow customer 1 to see a benefit. Some restrictions might have to be in place:

1) Limiting the traffic levels for distance peering. 100meg or 1 Gig would do it
2) Perhaps a time limit

Also, instead of competing with carriers at this new location B, you would actually prove there is business there. Most companies are in a wait-and-see mode before deploying, or a wait-and-get-an-order-first mode. By jump starting the peering with transport, you then have the data the carrier engineers need to justify a build. This IS one way to get more successful peering points started.

At 10:05 -0500 1/3/03, Jeff Barrows wrote:
--
David Diaz
dave@smoton.net [Email]
pagedave@smoton.net [Pager]
www.smoton.net [Peering Site under development]

Smotons (Smart Photons) trump dumb photons
On Fri, Jan 03, 2003 at 10:11:09AM -0500, David Diaz wrote:
2) Perhaps a time limit
who is still connected to mae-w fddi? i know there are people there.

time limits don't work well.

--
Jared Mauch | pgp key available via finger from jared@puck.nether.net
clue++; | http://puck.nether.net/~jared/ My statements are only mine.
> clearly, interconnecting their exchange points to create a richly-connected Internet 'core' is a natural progression if their customers don't complain too loudly.
> not that it's a bad long-term plan...

Actually, it is. It's failed in every prior instance. It's one of the many, many ways in which exchange points commit suicide.

-Bill
I find it interesting that there were immediate assumptions by all the followup posters that the hypothetical mesh wbn suggested would be run by an exchange point operator. Perhaps no public statements were sent by anyone using similar trans-Atlantic services (that are not run by the affected EP operator[s]). It isn't a new solution, and there isn't only one company offering the service.

I think exploring any technical issues/experiences in the differing existing deploys and how they would relate to a trans-Pacific deploy is quite worthwhile. If anyone using one of the trans-Atlantic services wanted to send comments but didn't have enough desire to get a throwaway account subscribed to nanog-post, I'll happily anonymize and repost for you. Just no guarantees on timeliness.

Cheers,
Joe
I find it interesting that there were immediate assumptions by all the followup posters that the hypothetical mesh wbn suggested would be run by an exchange point operator.
I beg to differ. I said "if the exchange-point operator is the one carrying the traffic," at the point where the subsequent comment depended on that assumption. The preceding comments, particularly the ones regarding administrative boundaries and who benefits, do not assume that the exchange point operator carries the traffic:

- the message regarding the fact that more administrative boundaries make life more difficult at layer 2 is generic; (stated explicitly for you now) a system with three parties and two administrative boundaries (two EP switch fabrics and a carrier between them) has a *lot* of operational complexity and (I would argue) too many hands on the knobs - when what you are doing is collecting switch *fabrics* together.
- the operator of the L3 network that carries the inter-EP fabric does not have to be either of the connected EP operators, which is why I called them by the name "operator of the L3 network" and not "EP operator."
Perhaps no public statements were sent by anyone using similar trans-Atlantic services (that are not run by the affected EP operator[s]). It isn't a new solution, and there isn't only one company offering the service.
That is, as you say, not a new solution.
I think exploring any technical issues/experiences in the differing existing deploys and how they would relate to a trans-Pacific deploy is quite worthwhile. If anyone using one of the trans-Atlantic services wanted to send comments but didn't have enough desire to get a throwaway account subscribed to nanog-post, I'll happily anonymize and repost for you. Just no guarantees on timeliness.
PAIX has a segment of its existing customer base that uses Ethernet extend-o-technology to carry L2 frames between them and the port they have leased on the PAIX switch, with an ocean between them. With respect to the exchange point layer 2 fabric, this is no different than connecting a customer switch to the switch fabric (operationally, we see no difference). Certainly some impressions from the folks at the far end of those links would be of interest.

Stephen
At 10:35 AM 1/3/2003 -0800, Bill Woodcock wrote:
> clearly, interconnecting their exchange points to create a richly- > connected Internet 'core' is a natural progression if their > customers don't complain too loudly. > not that it's a bad long-term plan...
Actually, it is. It's failed in every prior instance.
I'd like to understand your viewpoint, Bill. The LINX consists of a handful of distributed and interconnected switches such that customers are able to choose which site they want for colo. Likewise for the AMS-IX and a handful of other dominant European exchanges. By most accounts these are successful IXes, with a large and growing population of ISPs benefiting from that large and growing population. So I don't see the failure cases.
It's one of the many, many ways in which exchange points commit suicide.
I'd love to see a list of the ways IXes commit suicide. Can you rattle off a few?
-Bill
> The LINX consists of a handful of distributed and interconnected switches such that customers are able to choose which site they want for colo. Likewise for the AMS-IX and a handful of other dominant European exchanges.

Correct. Within the metro area. That is, as has been documented many times over, a necessary condition for long-term stability.

> > It's one of the many, many ways in which exchange points commit suicide.
>
> I'd love to see a list of the ways IXes commit suicide. Can you rattle off a few?

1) Cross the trust threshold in the wrong direction.
2) Cross the cost-of-transit threshold in the wrong direction.
3) Increase shared costs until conditions 1 and/or 2 are met.

Those are sort of meta-cases which encompass most of the specific failure modes. Of course, you can always declare yourself closed or obsolete, à la MAE-East-FDDI, which I guess would be a fourth case, but rare.

-Bill
On Thu, 9 Jan 2003, Bill Woodcock wrote:
> The LINX consists of a handful of distributed and interconnected switches such that customers are able to choose which site they want for colo. Likewise for the AMS-IX and a handful of other dominant European exchanges.
Correct. Within the metro area. That is, as has been documented many times over, a necessary condition for long-term stability.
There's an increasing number of "pseudo-wire" connections tho, you could regard these L2 extensions as an extension of the switch as a whole, making it international.

Where the same pseudo-wire provider connects to say LINX, AMSIX, DECIX, you're only a little way off having an interconnection of multiple IXs; it's possible this will occur by accident..

Steve
Where the same pseudo-wire provider connects to say LINX, AMSIX, DECIX, you're only a little way off having an interconnection of multiple IXs; it's possible this will occur by accident..
and l2 networks scale soooo well, and are so well known for being reliable. is no one worried about storms, spanning tree bugs, ... in a large multi-l2-exchange environment? this is not a prudent direction. randy
At 06:07 PM 1/9/2003 -0800, Randy Bush wrote:
Where the same pseudo-wire provider connects to say LINX, AMSIX, DECIX, you're only a little way off having an interconnection of multiple IXs; it's possible this will occur by accident..
and l2 networks scale soooo well, and are so well known for being reliable. is no one worried about storms, spanning tree bugs, ... in a large multi-l2-exchange environment? this is not a prudent direction.
Well, first I think we need to agree that there are two different cases here: 1) interconnecting IXes operated by the same party, vs. 2) interconnecting IXes operated by different parties.

In the first case an IX operator can shoot himself in the foot, but there is only one gun and one person, so you can easily figure out why the foot hurts. In the latter case, there are more people with more guns. Without perfect information distributed among the operators, this is clearly a more dangerous situation, and diagnosing/repairing is more difficult and time intensive. I believe we are really talking about the first case.

Secondly, some of the issues of scaling L2 infrastructure have been addressed by VLANs, allowing the separation of traffic into groups of VLAN participants. This reduces the scope of an L2 problem to the VLAN in use. Since the role of the IX operator is to provide a safe, stable, scalable, etc. interconnection environment, distributed VLANs are a tool that can help extend the peering population while mitigating the risk of any single ISP wrecking the peering infrastructure.

Bill
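The VLAN-scoping argument can be sketched in a few lines of Python (a toy model, not any real fabric; the port names and VLAN assignments are invented for illustration): a broadcast, or a storm, on one VLAN reaches only that VLAN's member ports, not the whole fabric.

```python
# Per-participant VLANs shrink the blast radius of an L2 event: flooding
# is confined to the member set of the VLAN it occurs on.

vlan_members = {
    100: {"isp_a", "isp_b"},           # isp_a's peering VLAN (made up)
    200: {"isp_a", "isp_c", "isp_d"},  # another participant's VLAN
}

def broadcast_scope(src_port, vlan):
    """Ports that receive a broadcast sent by src_port on a given VLAN."""
    members = vlan_members.get(vlan, set())
    return members - {src_port} if src_port in members else set()

# A storm on VLAN 100 never touches isp_c or isp_d:
print(sorted(broadcast_scope("isp_a", 100)))  # ['isp_b']
print(sorted(broadcast_scope("isp_a", 200)))  # ['isp_c', 'isp_d']
```

The failure domain becomes the VLAN rather than the fabric, which is the mitigation being claimed here; it does not, of course, protect against faults in the shared switches themselves.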
Well, first I think we need to agree that there are two different cases here: 1) interconnecting IXes operated by the same party, vs. 2) interconnecting IXes operated by different parties.
In the first case an IX operator can shoot himself in the foot, but there is only one gun and one person, so you can easily figure out why the foot hurts.
well, now we know you have never had to debug a large L2 disaster

randy
At 08:14 PM 1/9/2003 -0800, Randy Bush wrote:
Well, first I think we need to agree that there are two different cases here: 1) interconnecting IXes operated by the same party, vs. 2) interconnecting IXes operated by different parties.
In the first case an IX operator can shoot himself in the foot, but there is only one gun and one person, so you can easily figure out why the foot hurts.
well, now we know you have never had to debug a large L2 disaster
> In the first case an IX operator can shoot himself in the foot, but there is only one gun and one person, so you can easily figure out why the foot hurts. In the latter case, there are more people with more guns. Without

Randy - You snipped what I said out of context. Below is the complete paragraph (and admittedly I should have said "relatively easily" rather than "easily"). The point is that I don't think we are talking about interconnecting switches operated by different parties, and I think you would agree that if it is difficult diagnosing problems with a single large-scale L2 fabric, it is even more difficult with multiple administrative domains. That was the point.

Original paragraph:

> In the first case an IX operator can shoot himself in the foot, but there is only one gun and one person, so you can easily figure out why the foot hurts. In the latter case, there are more people with more guns. Without perfect information distributed among the operators, this is clearly a more dangerous situation and diagnosing/repairing is more difficult and time intensive. I believe we are really talking about the first case.

Woody - I'd still like to hear about the failures "in every prior instance".
clearly, interconnecting their exchange points to create a richly- connected Internet 'core' is a natural progression if their customers don't complain too loudly. not that it's a bad long-term plan...
Actually, it is. It's failed in every prior instance.
Thanks.
what a morass of confusion.

on the one hand we have a metro peering fabric, which as linx, exchangepoint, paix, and lots of others have shown, is good. on another hand we have a metro peering fabric, which as mfs and ames showed, can be really bad.

because we have a lot of hands we also have exchange-level peering, which as paix and six has shown, can be done safely. there's also a hand containing multiple instances of exchange level peering which was not done safely (and i'm not double counting ames and mfs here.)

finally we have intermetro (wide area) peering, which has been shown to be a complete joke for peering for any number of reasons.

before any of you argue further, please carefully define your terminology so the rest of us will know how to fill out our scorecards.

--
Paul Vixie
Well, first I think we need to agree that there are two different cases here: 1) interconnecting IXes operated by the same party, vs. 2) interconnecting IXes operated by different parties.
PAIX has successful implementations of both of these (I count our metro strategy as an instance of the first sort, and our varied interconnections to other switch fabrics operated by other people as an instance of the second). With both, as I stated earlier, a thorough understanding of the shortcomings of L2 fabrics as a place for different parties to meet is crucial to avoiding service-affecting outages. The lessons learned along the way are very helpful in debugging quickly and thoroughly when a service-affecting outage creeps through. Stephen VP, Eng. PAIX
On Fri, 10 Jan 2003, Stephen J. Wilcox wrote:

> There's an increasing number of "pseudo-wire" connections tho, you could regard these L2 extensions as an extension of the switch as a whole, making it international.
> Where the same pseudo-wire provider connects to say LINX, AMSIX, DECIX, you're only a little way off having an interconnection of multiple IXs; it's possible this will occur by accident..

Yes, that's an unfortunate accident that's occurred before.

-Bill
There's an increasing number of "pseudo-wire" connections tho, you could regard these L2 extensions as an extension of the switch as a whole, making it international.
That's not really applicable in my view; the pseudo-wire is no different to a long fibre extension, and they are only used to connect specific parties, not IXPs.

Regards,
Neil.
On Fri, 10 Jan 2003, Neil J. McRae wrote:
There's an increasing number of "pseudo-wire" connections tho, you could regard these L2 extensions as an extension of the switch as a whole, making it international.
That's not really applicable in my view; the pseudo-wire is no different to a long fibre extension, and they are only used to connect specific parties, not IXPs.
Sure, the purpose is different, but functionally the pseudo-wire is the same as the exchange: a collection of L2 devices, and either has the potential, in the event of an L2 issue (ARP storm, STP), to affect the other in the absence of any L3 boundary. The fact that it only connects a single device is arbitrary.

In response to Randy and Bill(s), this seems to come down to a trade-off of commercial vs technical. A lot of us agree this is technically not the best way and produces instabilities with the potential to take out major chunks of the Internet, but it is cheap, and this means people will adopt this way of doing it. Unfortunately, as this has now happened, it means those opposed to the idea will have to also consider this as an option if they are to compete.

The growing number of things in the Internet business which nowadays need to be more cheap than technically sound is something I find disheartening.

Steve
On Fri, 10 Jan 2003, Stephen J. Wilcox wrote:

> In response to Randy and Bill(s), this seems to come down to a trade-off of commercial vs technical. A lot of us agree this is technically not the best way and produces instabilities with the potential to take out major chunks of the Internet, but it is cheap, and this means people will adopt this way of doing it. Unfortunately, as this has now happened, it means those opposed to the idea will have to also consider this as an option if they are to compete.

I don't think it's fair to characterize it as a trend... I mean, ten years ago, we were all (generalizing here) stupid enough to try these tricks. Fortunately, smarter people have come along since, and learned from our mistakes. There are also _vastly_ more people involved in the industry now than then, so it comes as no surprise that there are still some newbies trying this, despite all the lessons of the past. The good news is that although they're a quantitatively growing group, they're a shrinking _fraction_ of the whole. So that's evidence of some small progress in the state of knowledge. Fight the law of conservation of clue!

-Bill
At 09:33 AM 1/10/2003 -0800, Bill Woodcock wrote:
On Fri, 10 Jan 2003, Stephen J. Wilcox wrote:

> In response to Randy and Bill(s), this seems to come down to a trade-off of commercial vs technical. A lot of us agree this is technically not the best way and produces instabilities with the potential to take out major chunks of the Internet, but it is cheap, and this means people will adopt this way of doing it. Unfortunately, as this has now happened, it means those opposed to the idea will have to also consider this as an option if they are to compete.
I don't think it's fair to characterize it as a trend... I mean, ten years ago, we were all (generalizing here) stupid enough to try these tricks. Fortunately, smarter people have come along since, and learned from our mistakes. There are also _vastly_ more people involved in the industry now than then, so it comes as no surprise that there are still some newbies trying this, despite all the lessons of the past. The good news is that although they're a quantitatively growing group, they're a shrinking _fraction_ of the whole. So that's evidence of some small progress in the state of knowledge. Fight the law of conservation of clue!
-Bill
Bill - the argument seems like Proof by Rigorous Assertion: I know it is a bad idea. I really really believe it is a bad idea. My friends say it's a bad idea. Not one that I know says it is a good idea. Therefore, and I can't emphasize this enough, in conclusion, it is a bad idea.

If what you are saying is true, I'd really like to hear just a couple of insurmountable technical problems with WAN L2.5 infrastructure interconnecting IX switches. For the sake of argument and to clarify the discussion (Paul) let's make a few assumptions:

1) We are talking about an operations model where IX switches are operated by a single company.
2) The IX switches are interconnected by MPLS by a transport provider offering that service.
3) An ISP on one switch creates a VLAN for peering with ISPs on any of the other switches. This ISP VLAN is only for peering with the ISP that created this VLAN. Since he is paying for the VLAN traffic he has this right.
4) The cost of transporting the traffic between the switches is borne by a transport provider who in turn charges the ISP that created the VLAN in question.

I can articulate a half dozen reasons why this is a good idea. Please share with us why this is such a bad idea. If it has been tried before, it would be helpful to point to the specific case and why it failed, the technical failure scenario. I'd like to hear why/how it was made worse by the distance between switches.

Bill
On Fri, 10 Jan 2003, William B. Norton wrote:
Bill - the argument seems like Proof by Rigorous Assertion: I know it is a bad idea. I really really believe it is a bad idea. My friends say it's a bad idea. Not one that I know says it is a good idea. Therefore, and I can't emphasize this enough, in conclusion, it is a bad idea.
Well, my comments were opinions and "in my experience", which is a perfectly valid argument.
If what you are saying is true, I'd really like to hear just a couple of insurmountable technical problems with WAN L2.5 infrastructure
Maybe it works fine technically.. I can think of a number of examples where companies have their IT done cheaply and it "works".. then they call an expert when, at some point down the line, there's a problem they can't explain!
interconnecting IX switches. For the sake of argument and to clarify the discussion (Paul) let's make a few assumptions:
1) We are talking about an operations model where the IX switches are operated by a single company.
2) The IX switches are interconnected by MPLS by a transport provider offering that service.
3) An ISP on one switch creates a VLAN for peering with ISPs on any of the other switches. This VLAN is only for peering with the ISP that created it. Since he is paying for the VLAN traffic, he has this right.
4) The cost of transporting the traffic between the switches is borne by a transport provider, who in turn charges the ISP that created the VLAN in question.
No, this isn't what I'm talking about. I'm talking of IXs with third-party MPLS transport providers delivering VLANs to remote sites. The scenario in your example above, however, is one I don't think is available today, though it is certainly around as a proposal. The problems with it are not simple: they are the possibility of complication introduced by multiple parties, and of unexpected behaviour arising which, in such a large system, has the potential to affect many customers of many suppliers. Steve
I can articulate a half dozen reasons why this is a good idea. Please share with us why it is such a bad idea. If it has been tried before, it would be helpful to point to the specific case and why it failed -- the technical failure scenario. I'd like to hear why/how it was made worse by the distance between the switches.
Bill
If what you are saying is true, I'd really like to hear just a couple of insurmountable technical problems with WAN L2.5 infrastructure interconnecting IX switches.
stephen stuart already answered that.
For the sake of argument and to clarify the discussion (Paul) let's make a few assumptions:
1) We are talking about an operations model where the IX switches are operated by a single company.
2) The IX switches are interconnected by MPLS by a transport provider offering that service.
3) An ISP on one switch creates a VLAN for peering with ISPs on any of the other switches. This VLAN is only for peering with the ISP that created it. Since he is paying for the VLAN traffic, he has this right.
4) The cost of transporting the traffic between the switches is borne by a transport provider, who in turn charges the ISP that created the VLAN in question.
packetexchange already does this between any number of IXP's. the only technical issue is whether to trunk the connection between packetexchange and the IXP (at PAIX we don't -- each such extended vlan gets its own port without vlan tagging and counts as a normal customer connection.) the nice economic angle in all this is that it's an IXP-independent service, so if someone at LINX-Docklands wanted to talk to someone at PAIX-NY, it'd work. the nice marketing/political angle is, it's no different from telseon or yipes except for the distances involved, and even that's a temporary artifact as everybody continues to try to cover everybody else's territories and learn to sing everybody else's songs. i believe that bandx tried to do this also, but i don't know as many details.
I can articulate a half dozen reasons why this is a good idea. Please share with us why it is such a bad idea. If it has been tried before, it would be helpful to point to the specific case and why it failed -- the technical failure scenario. I'd like to hear why/how it was made worse by the distance between the switches.
terminologically speaking, the thing that caused the uproar here when you brought up the topic was only partly technical. important issues, in no particular order of importance, are:

1. neutrality. this is something more than one wide-area-ethernet provider must be allowed to do within an IXP if that IXP wants to claim neutrality. further, the IXP cannot contract with a WAN provider for the purpose of interconnecting its own switches, or it becomes a competitor of this whole class of customer. (as a purist, i would normally go further, and say that to be neutral, an IXP cannot negotiate contracts on behalf of its customers, nor resell any customer's service, but let's do that in a separate thread, or better yet, let's just not do it again, the archives already have it all.)

2. laughability. no one who peers remotely in this way will qualify under any of the peering requirements i've seen on the topic of "must interconnect to us in N or more places M miles apart, and must have an OCxx backbone between those places." after some squabbling, an OCxx backbone that's on someone else's ATM or FR has been seen to qualify. but any claims about backbonitude that are based on WAN-IP or WAN-Ether or WAN-MPLS have not been successful, and a lot of laughing was heard.

3. economics. because a lot of people didn't notice it the first time, here is another copy of woody's most excellent cutoff metrics argument:

There's a threshold, defined by a step-function in the price-per-distance of layer-1 services. If you follow that step-function like a line on a topo map until it reconnects with itself, it forms a convex space. Interconnection of switch fabrics within that space is necessary to their success and long-term survival, whereas interconnection of switch fabrics across the border of that space is detrimental to their success and ultimately to their survival. -Bill

inside that threshold, an IXP has to interconnect to remain relevant. outside that threshold, a neutral IXP cannot interconnect lest they compete with their customers. -- Paul Vixie
On 11 Jan 2003, Paul Vixie wrote:
For the sake of argument and to clarify the discussion (Paul) let's make a few assumptions:
1) We are talking about an operations model where the IX switches are operated by a single company.
2) The IX switches are interconnected by MPLS by a transport provider offering that service.
3) An ISP on one switch creates a VLAN for peering with ISPs on any of the other switches. This VLAN is only for peering with the ISP that created it. Since he is paying for the VLAN traffic, he has this right.
4) The cost of transporting the traffic between the switches is borne by a transport provider, who in turn charges the ISP that created the VLAN in question.
packetexchange already does this between any number of IXP's. the only technical issue is whether to trunk the connection between packetexchange and the IXP (at PAIX we don't -- each such extended vlan gets its own port without vlan tagging and counts as a normal customer connection.) the nice
which is also true for linx and manap; in that way it's not massively different from a normal connection, and it also meets the usual technical requirements of the ixp, requiring no special circumstances from the ixp
economic angle in all this is that it's an IXP-independent service, so if someone at LINX-Docklands wanted to talk to someone at PAIX-NY, it'd work.
yes, but to clarify: as most exchanges enforce a single mac address per port, and you don't want to bridge the two ixps, you will have at least one L3 hop between the IXPs, which protects you against the nasties of large L2 topologies and L2 meltdowns <snip>
2. laughability. no one who peers remotely in this way will qualify under any of the peering requirements i've seen on the topic of "must interconnect to us in N or more places M miles apart, and must have an OCxx backbone between those places." after some squabbling, an OCxx backbone that's on someone else's ATM or FR has been seen to qualify. but any claims about backbonitude that are based on WAN-IP or WAN-Ether or WAN-MPLS have not been successful, and a lot of laughing was heard.
that's odd; surely the main purpose of this requirement is to ensure that the peering is as cost-neutral as possible. e.g. someone peering with Sprint at a single site exchanging global routes (own, customer) will clearly save the ISP money and cost Sprint, who now have to ship traffic to and from that site -- a good case for not peering, or for peering only local routes. whether the mechanism by which the interconnect is enabled is long-reach ethernet or sdh or whatever doesn't seem important to the peering policy
3. economics. because a lot of people didn't notice it the first time, here is another copy of woody's most excellent cutoff metrics argument:
There's a threshold, defined by a step-function in the price-per-distance of layer-1 services. If you follow that step-function like a line on a topo map until it reconnects with itself, it forms a convex space. Interconnection of switch fabrics within that space is necessary to their success and long-term survival, whereas interconnection of switch fabrics across the border of that space is detrimental to their success and ultimately to their survival. -Bill
inside that threshold, an IXP has to interconnect to remain relevant. outside that threshold, a neutral IXP cannot interconnect lest they compete with their customers.
absolutely! which of course explains why all the neutral exchanges have small coverage areas -- a couple of close sites inside a city -- and why some of the commercial exchanges have gone beyond that and offer long-distance peerings: they don't care if they compete with their customers, and may well welcome it. Steve
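Woody's cutoff-metrics argument above can be caricatured numerically. A toy sketch, where the 80 km threshold and all dollar figures are invented purely to illustrate the shape of the step function (none of them come from this thread):

```python
# A toy version of the price-per-distance step function described above.
# The threshold and rates are hypothetical, for illustration only.

METRO_LIMIT_KM = 80    # hypothetical edge of the metro fiber footprint
METRO_RATE = 10        # $/Mbps/month inside the metro
LONG_HAUL_RATE = 120   # $/Mbps/month once you cross the step

def transport_price(distance_km):
    """Roughly flat inside the metro, then a step up at the boundary."""
    return METRO_RATE if distance_km <= METRO_LIMIT_KM else LONG_HAUL_RATE

# Inside the step, interconnecting switch fabrics is cheap relative to
# what customers pay for layer-1; across it, the IXP is suddenly
# reselling long-haul transport in competition with its own customers.
for d in (10, 80, 81, 8000):
    print(f"{d:>5} km -> ${transport_price(d)}/Mbps/month")
```

The point of the sketch is only that the decision flips discontinuously at the threshold, which is why the boundary of the convex space matters more than the absolute distances.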
Bill, On Saturday, Jan 11, 2003, at 01:38 Europe/Stockholm, William B. Norton wrote:
If what you are saying is true, I'd really like to hear just a couple of insurmountable technical problems with WAN L2.5 infrastructure interconnecting IX switches. For the sake of argument and to clarify the discussion (Paul) let's make a few assumptions:
1) We are talking about an operations model where the IX switches are operated by a single company.
2) The IX switches are interconnected by MPLS by a transport provider offering that service.
3) An ISP on one switch creates a VLAN for peering with ISPs on any of the other switches. This VLAN is only for peering with the ISP that created it. Since he is paying for the VLAN traffic, he has this right.
4) The cost of transporting the traffic between the switches is borne by a transport provider, who in turn charges the ISP that created the VLAN in question.
I can articulate a half dozen reasons why this is a good idea. Please share with us why it is such a bad idea. If it has been tried before, it would be helpful to point to the specific case and why it failed -- the technical failure scenario. I'd like to hear why/how it was made worse by the distance between the switches.
How do you see the failed AMS-IX expansion fitting into this? My (very simplified) summary of what happened was that:

a) ISPs were worried about the stability of the exchange (if I remember correctly, Jesper Skriver made a good mail on this)
b) The large operators saw that AMS-IX would be directly competing with them on transit and transport revenues, and therefore in the end were not interested in AMS-IX.

Note that there were still many (mostly small) ISPs that were in favour of the expansion. At the time of the origin of the discussion I was peering co-ordinator at KPNQwest, and would have pulled out of AMS-IX if the plans (and KQ ..:) ) had moved on. Best regards, - kurtis -
kurtis@kurtis.pp.se (Kurt Erik Lindqvist) writes:
Bill,
How do you see the failed AMS-IX expansion fit into this?
My (very simplified) summary of what happened was that : ... At the time of the origin of the discussion I was peering co-ordinator at KPNQwest, and would have pulled out of AMS-IX if the plans (and KQ..:) ) had moved on.
well of course i'm not bill, but (naturally) i will comment anyway. was AMS-IX planning to expand beyond its original metro and bridge all the XP switches together? if so then i understand exactly why KQ and other ISP's would have pulled out of AMS-IX in protest (and in fear). however, if the expansion was intra-metro, then i must be confused, because KQ's major source of bandwidth revenue should have been inter-metro not intra-metro. -- Paul Vixie
> Was AMS-IX planning to expand beyond its original metro and bridge
> all the XP switches together?

Yep.

-Bill
How do you see the failed AMS-IX expansion fit into this?
My (very simplified) summary of what happened was that : ... At the time of the origin of the discussion I was peering co-ordinator at KPNQwest, and would have pulled out of AMS-IX if the plans (and KQ..:) ) had moved on.
well of course i'm not bill, but (naturally) i will comment anyway. was AMS-IX planning to expand beyond its original metro and bridge all the XP switches together? if so then i understand exactly why KQ and other ISP's would have pulled out of AMS-IX in protest (and in fear). however, if the expansion was intra-metro, then i must be confused, because KQ's major source of bandwidth revenue should have been inter-metro not intra-metro.
They planned to interconnect with one other exchange in NL. Best regards, - kurtis -
On Thu Jan 02, 2003 at 03:59:35PM -0800, William B. Norton wrote:
This has sent the price point down to historic levels, O($28K/mo for STM-1) or less than $200/Mbps for transport! This is approaching an attractive price point for long distance peering so, just for grins,...
Are there transport providers that can provide a price point around $100/Mbps for transport capacity from Tokyo to the U.S. (LAX/SJO) ?
But, given that peering costs are more than just the circuit cost (once you include exchange point costs, colo, etc.), why would anyone do this when you can just buy transit for $100/Mbps or less? I'm going through this at work at the moment, and am having problems justifying staying at the West Coast, having only just justified the East Coast, so going to AP (although it's what I'd want to do) is just way out of the question...

Simon

--
Simon Lockhart      | Tel: +44 (0)1628 407720 (BBC ext 37720)
Technology Manager  | Fax: +44 (0)1628 407701 (BBC ext 37701)
BBC Internet Services | Email: Simon.Lockhart@bbc.co.uk
BBC Technology, Maiden House, Vanwall Road, Maidenhead. SL6 4UB. UK
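The price comparison being made here is easy to sanity-check. A minimal sketch, using the dollar figures quoted in the thread and assuming an STM-1 carries roughly 155 Mbps of payload:

```python
# Back-of-the-envelope check of the price points quoted in this thread.
# The 155 Mbps STM-1 payload figure is an approximation.

STM1_MBPS = 155

def price_per_mbps(monthly_circuit_cost, capacity_mbps):
    """Monthly circuit cost divided by usable capacity."""
    return monthly_circuit_cost / capacity_mbps

# The $28K/mo Tokyo-US STM-1 quoted at the top of the thread works out
# to just under the "$200/Mbps" figure cited there:
transport = price_per_mbps(28_000, STM1_MBPS)
print(f"transport: ${transport:.0f}/Mbps/month")
```

Which is the crux of Simon's objection: once transit itself is at $100/Mbps, transport at ~$180/Mbps plus exchange and colo costs is hard to justify for peering alone.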
simonl@rd.bbc.co.uk (Simon Lockhart) writes:
But, given that peering costs are more than just the circuit cost (once you include Exchange Point costs, and colo, etc), why would anyone do this when you can just buy transit for $100/Mbps or less?
Because peering is better. There's no way to become DDoS attack-resistant if you buy transit, since no matter how strong you are, your provider will ultimately be weaker. Whether that's because high splay is required to be strong, or because your provider's security team isn't on a two-minute call back, or because your provider has a larger set of things to invest their capital in than your particular path out, doesn't matter. The fact is, no cost-effective transit will ever be as good as the best high-splay peering.
I'm going through this at work at the moment, and am having problems justifying staying at the West Coast, having only just justified the East Coast, so going to AP (although it's what I'd want to do), is just way out of the question...
OPN (other people's networks) are the second most frequent root cause of connectivity failure. (Network engineers are the most frequent cause, per Vijay's excellent talk in Eugene.) The most reliable access you can get is when you connect to other networks directly rather than using intermediaries. Naturally, with a high number of other networks and of places to meet them, it's only cost effective to peer globally if you have "enough" traffic and if that traffic's reliability bears directly on your top-line revenue. -- Paul Vixie
Thanks all for your responses (both public and private). Several folks wanted to know what I found out, so...

I heard from a couple of companies that are operating wide-area distributed peering architectures today. They claim that the biggest issue has been the perception among prospects that "ethernet isn't supposed to do that (extreme long distance)." I'd love to hear more experiences, both pro and con. (I have to admit I was surprised that *transoceanic ethernet* as a shared peering transport did not have serious issues. I would have expected that the delay between the time a broadcast was transmitted and the time it was heard would have been an issue somehow, or that some such interesting problem would come up.)

Several VLAN configuration issues came up as design considerations for wide-area peering infrastructure. For example: a) a VLAN for each peering session, vs. b) one VLAN per customer, to which others "subscribe" and peer across, vs. c) a global VLAN, which nobody likes. There are policy and design tradeoffs with each of these that touch on the limitation of 4096 VLANs.

As for transport, MPLS framing of ethernet seems to work well. Tunneling transport over existing transit connections has proven effective for trials, but may be more expensive as traffic volume increases. Running dedicated circuits can reduce the risk of running out of capacity on a "shared" transit or MPLS IX interconnect fabric.

As for the operator of the transport between distributed switches, Joe Provo is correct that it need not be the IX operator. IX neutrality generally means that the IX operator is not aligned with any one participant in the IX, but rather is working to the benefit of all of its IX participants. If an IX operator's actions unnecessarily favor or harm one participant over another, then neutrality may be an issue.
Extending the population of an IX by using a distributed architecture doesn't necessarily clash with this neutrality principle, especially if doing so is solely for extending peer-peer interconnection. And no, this is not a new idea; the LINX, AMSIX, etc. have been doing this for a long time and the key seems to be that the IX switches are under one autonomous control. Bill
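The VLAN-layout tradeoff summarized above can be made concrete: 802.1Q's 12-bit tag field gives 4096 VLAN IDs, and options (a) and (b) consume them at very different rates. A sketch, with participant counts made up for illustration:

```python
# How fast the 12-bit 802.1Q VLAN ID space (4096 IDs) is consumed
# under the two VLAN layouts discussed above. The participant counts
# below are hypothetical.

VLAN_ID_SPACE = 4096  # 12-bit VID field in an 802.1Q tag

def vlans_per_session(participants):
    # Option (a): one VLAN per peering session -- grows O(n^2)
    return participants * (participants - 1) // 2

def vlans_per_customer(participants):
    # Option (b): one VLAN per customer, others subscribe -- grows O(n)
    return participants

for n in (50, 91, 200):
    a, b = vlans_per_session(n), vlans_per_customer(n)
    full = "  <-- exceeds ID space" if a > VLAN_ID_SPACE else ""
    print(f"{n:>3} participants: per-session {a:>5}, per-customer {b:>3}{full}")
```

Under these assumptions, a full mesh of per-session VLANs exhausts the ID space at 92 participants (91 participants need exactly 4095 IDs), while the per-customer layout scales linearly, which is one way to read the policy tradeoff the summary mentions.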
participants (14)

- Bill Woodcock
- David Diaz
- Jared Mauch
- Jeff Barrows
- Joe Provo
- Kurt Erik Lindqvist
- neil@DOMINO.ORG
- Paul Vixie
- Randy Bush
- Simon Lockhart
- Stephen J. Wilcox
- Stephen Stuart
- William B. Norton
- William B. Norton