so having dual-stack backbones is very important. But ...
Other global providers have an IPv6 network too, all open for business, but there are very, VERY few customers. And I'm not so sure we even have an "Internet" of IPv6 out there either. It looks cold and empty to me.

Here's a challenge: have an NTP server attached directly to a good clock and an IPv6 network. Is there anyone who can talk to it using IPv6 on the Nanog list? (Time20.Stupi.SE, 2001:0440:1880:1000::0020)

-Peter
On Thu, 2005-10-13 at 11:05 -0700, Peter Lothberg wrote:
Is there anyone who can talk to it using IPv6 on the Nanog list?
(Time20.Stupi.SE, 2001:0440:1880:1000::0020)
As a certain "Tier 1" still uses a mesh of tunnels and uses Viagenie in Canada as their transit provider, latency to the above IP is in the area of 300ms, going transatlantic twice. IPv4 latency is only 66ms though. I do hope that some "Tier 1's" get their act together and start doing native IPv6. I already once suggested upgrading their hardware to them ;)

For other people wanting latency tests etc., I suggest taking a look at the following URLs:

http://www.sixxs.net/tools/grh/ - IPv6 BGP monitor using a large number of ISPs for input, thus basically a distributed looking glass.

http://www.sixxs.net/tools/traceroute/ - IPv4 & IPv6 traceroute, so one can perform the below oneself, except then from the web...

Also handy, of course: RIPE's RIS (http://ris.ripe.net) and a lot of other links listed on the main GRH page, near the bottom.

Greets,
 Jeroen

--
jeroen@noc:~$ traceroute6 2001:0440:1880:1000::0020
traceroute to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:838:1:1:210:dcff:fe20:7c7c, 30 hops max, 16 byte packets
 1  fe0.breda.ipv6.concepts-ict.net (2001:838:1:1::1)  0.386 ms  0.348 ms  0.316 ms
 2  se2.ams-ix.ipv6.concepts-ict.net (2001:838:0:10::1)  2.139 ms  2.057 ms  2.187 ms
 3  ge0-1-0.rtr1.ams-tc2.io.nl (2001:7f8:1::a502:4587:1)  2.306 ms  2.245 ms  2.477 ms
 4  ge-0-1-0-0-v189.ipv6.rtr1.ams-rb.io.nl (2001:1460:2000::1)  3.03 ms  2.714 ms  5.233 ms
 5  if-11-0-1-459.6bb1.AD1-Amsterdam.ipv6.teleglobe.net (2001:5a0:200::15)  3.533 ms  3.38 ms  4.053 ms
 6  gin-mtt-6bb1.ipv6.teleglobe.net (2001:5a0:300::1)  90.535 ms  90.651 ms  89.447 ms
 7  tu-0.viagenie.mlpsca01.us.b6.verio.net (2001:418:0:4000::26)  125.304 ms  117.245 ms  125.672 ms
 8  3ffe:b00:c18::f (3ffe:b00:c18::f)  188.142 ms  195.208 ms  186.002 ms
 9  sl-s1v6-nyc-t-1000.sprintv6.net (2001:440:1239:1001::2)  338.253 ms  190.913 ms  192.706 ms
10  sl-bb1v6-sto-t-101.sprintv6.net (2001:440:1239:1012::1)  289.743 ms  289.226 ms  sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2)  286.042 ms
11  2001:7f8:d:fb::34 (2001:7f8:d:fb::34)  307.681 ms  308.409 ms  306.03 ms
12  2001:440:1880:1::2 (2001:440:1880:1::2)  309.458 ms  305.965 ms  308.103 ms
13  2001:440:1880:1::12 (2001:440:1880:1::12)  310.033 ms  308.292 ms  308.387 ms
14  2001:440:1880:1000::20 (2001:440:1880:1000::20)  291.071 ms  294.227 ms  288.971 ms

jeroen@noc:~$ traceroute time20.stupi.se
traceroute to time20.stupi.se (192.36.143.234), 64 hops max, 40 byte packets
 1  ge-1-3-0.colo.breda.concepts-ict.net (213.197.29.1)  0 ms  0 ms  0 ms
 2  at-0-3-1.nikhef.concepts-ict.net (213.197.27.126)  2 ms  10 ms  2 ms
 3  ams1-core.gigabiteth0-2.swip.net (195.69.144.88)  2 ms  3 ms  2 ms
 4  cor1-core.pos5-0.swip.net (130.244.193.105) [MPLS: Label 755 Exp 0]  118 ms  *  66 ms
 5  par1-core.pos12-0.swip.net (130.244.218.1) [MPLS: Label 390 Exp 0]  66 ms  66 ms  66 ms
 6  lon1-core.pos1-0.swip.net (130.244.194.218) [MPLS: Label 128 Exp 0]  28 ms  26 ms  26 ms
 7  kst1-core.pos5-0.swip.net (130.244.192.61)  65 ms  *  65 ms
 8  cty3-core.srp4-0.swip.net (130.244.194.244)  75 ms  71 ms  66 ms
 9  R29-YB-SRP-1-0.Stupi.NET (194.71.10.40)  66 ms  66 ms  66 ms
10  * * *
11  Time20.Stupi.SE (192.36.143.234)  66 ms  68 ms  66 ms
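Peter's challenge can also be scripted rather than traceroute'd: a minimal SNTP (RFC 4330) query over an IPv6 UDP socket. This is only a sketch; the helper names are mine, the server address is the one quoted above, and actually reaching it of course assumes a working IPv6 path.

```python
# Minimal SNTP-over-IPv6 client sketch (RFC 4330). Not a full NTP
# implementation: it sends one client-mode query and reads back the
# server's transmit timestamp.
import socket
import struct

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_sntp_request() -> bytes:
    # First byte: LI=0, VN=4, Mode=3 (client) -> 0x23; the remaining
    # 47 bytes of the 48-byte header may be zero for a simple query.
    return b"\x23" + 47 * b"\x00"


def parse_transmit_time(packet: bytes) -> float:
    # Transmit timestamp sits at offset 40: 32-bit seconds + 32-bit fraction.
    secs, frac = struct.unpack("!II", packet[40:48])
    return secs - NTP_EPOCH_OFFSET + frac / 2**32


def query_v6(server: str, timeout: float = 2.0) -> float:
    """Return the server's clock as a Unix timestamp, via IPv6."""
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_sntp_request(), (server, 123))
        data, _ = s.recvfrom(48)
        return parse_transmit_time(data)

# Example (requires IPv6 connectivity, so not run here):
#   query_v6("2001:440:1880:1000::20")
```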
On Thu, 13 Oct 2005, Jeroen Massar wrote:
On Thu, 2005-10-13 at 11:05 -0700, Peter Lothberg wrote:
Is there anyone who can talk to it using IPv6 on the Nanog list?
(Time20.Stupi.SE, 2001:0440:1880:1000::0020)
As a certain "Tier 1" still uses a mesh of tunnels and uses Viagenie in Canada as their transit provider, latency to the above IP is in the area of 300ms, going transatlantic twice. IPv4 latency is only 66ms though. I do hope that some "Tier 1's" get their act together and start doing native IPv6. I already once suggested upgrading their hardware to them ;)
I also presume you sent them a check and showed them the business case for the upgrade? No large provider is going to upgrade anything without a business reason. Oh, and some parts, critical parts even, of v6 are still 'broken'...
Of course, that's a business decision, but maybe instead of getting a new check for the IPv6 service, by not providing it you will lose some checks from existing customers who demand dual stack ;-) Business is also about being competitive, and other carriers already offer the service as a value-add to their existing IPv4 customers.

Regards,
Jordi
De: "Christopher L. Morrow" <christopher.morrow@mci.com> Responder a: <owner-nanog@merit.edu> Fecha: Thu, 13 Oct 2005 22:55:09 +0000 (GMT) Para: Jeroen Massar <jeroen@unfix.org> CC: Peter Lothberg <roll@Stupi.SE>, <nanog@nanog.org> Asunto: Re: IPv6 news
On Thu, 13 Oct 2005, Jeroen Massar wrote:
On Thu, 2005-10-13 at 11:05 -0700, Peter Lothberg wrote:
Is there anyone who can talk to it using IPv6 on the Nanog list?
(Time20.Stupi.SE, 2001:0440:1880:1000::0020)
As a certain "Tier 1" still uses a mesh of tunnels and uses Viagenie in Canada as their transit provider, latency to the above IP is in the area of 300ms, going transatlantic twice. IPv4 latency is only 66ms though. I do hope that some "Tier 1's" get their act together and start doing native IPv6. I already once suggested upgrading their hardware to them ;)
I also presume you sent them a check and showed them the business case for the upgrade? No large provider is going to upgrade anything without a business reason. Oh, and some parts, critical parts even, of v6 are still 'broken'...
************************************ The IPv6 Portal: http://www.ipv6tf.org Barcelona 2005 Global IPv6 Summit Information available at: http://www.ipv6-es.com This electronic message contains information which may be privileged or confidential. The information is intended to be for the use of the individual(s) named above. If you are not the intended recipient be aware that any disclosure, copying, distribution or use of the contents of this information, including attached files, is prohibited.
On Fri, 14 Oct 2005, JORDI PALET MARTINEZ wrote:
Of course, that's a business decision, but maybe instead of getting a new check for the IPv6 service, by not providing it you will lose some checks from existing customers who demand dual stack ;-)
As Ted and others have already said: "Show me the customers who are asking"... So far the numbers are startlingly low; too low to justify full builds by anyone large.
Business is also about being competitive, and other carriers already offer the service as a value-add to their existing IPv4 customers.
Sure, and I'd suspect the decision to use their network hardly ever comes down to 'v6'. My point was, really, that the screaming crazy man saying "I told them dudes to forklift their network" is hardly productive. Showing, if folks can't find it themselves, that there is a business case that would justify a few-million-dollar upgrade is... A few folks that have a deployment going are ahead of the curve; hopefully they can keep the parts they have running and upgrade away from the 7507 that is their current solution :) Hopefully other folks can make their beancounters understand that v6 is going to happen regardless of their wishes for it NOT to happen due to upgrade costs. Also, hopefully as old hardware is finally cycled out, new and v6-capable hardware will take its place :)

-Chris
On Fri, Oct 14, 2005 at 12:32:29AM +0000, Christopher L. Morrow wrote:
A few folks that have a deployment going are ahead of the curve, hopefully they can keep the parts they have running and upgrade away from the 7507 that is their current solution :)
The larger EU/US ISPs that have real deployments all use Junipers (for their IPv6), not Ciscos (with a few exceptions - Verio?). I don't know whether that's true for ASPAC folks too - can someone comment? One might conclude a thing or two from that - or not.

Best regards,
Daniel

--
CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
On Fri, Oct 14, 2005 at 10:21:47AM +0200, Daniel Roesen wrote:
On Fri, Oct 14, 2005 at 12:32:29AM +0000, Christopher L. Morrow wrote:
A few folks that have a deployment going are ahead of the curve, hopefully they can keep the parts they have running and upgrade away from the 7507 that is their current solution :)
The larger EU/US ISPs that have real deployments all use Junipers (for their IPv6), not Ciscos (with a few exceptions - Verio?). I don't know whether that's true for ASPAC folks too - can someone comment?
We're using both Cisco and Juniper in Verio for our IPv6 services and have been since its launch (I think it was Oct/Nov 03). Running the dual-stack native service has been fairly straightforward. There are a few networks that are doing the dual-stack native thing. Others (eg: sprintv6, as seen in people's traces to Peter's v6 clock) are doing tunneled infrastructure/overlay networks to support their IPv6 customers. This obviously has drawbacks: if you aggregate packets in a few locations (eg: Asia, east US, west US, Europe) for your tunneled stuff and then have a full mesh for the IGP, the metrics don't always make sense. Combined with tunnels going long distances that don't reflect RTT, and people still doing full transit to anyone they can tunnel with, this causes some interesting paths.

One thing I find promising/good: lots of people here sent their v6 traces to the list, so it's not just a few random geeks messing with v6 as much anymore; it's there.

- jared

--
Jared Mauch | pgp key available via finger from jared@puck.nether.net
clue++; | http://puck.nether.net/~jared/ My statements are only mine.
"I told them dudes to forklift their network" is hardly productive.
IPv6 is not a forklift upgrade.
Showing, if folks can't find it themselves, that there is a business case that would justify a few million dollar upgrade is...
Again, it is cheaper to ease into IPv6 than to wait until it hurts so badly that you have a business case for a million-dollar spend. Start by making sure that your engineers get some IPv6 training, and enable it in your test labs. Add IPv6 testing to any purchase of new equipment. Figure out how to enable IPv6 on your network, i.e. make your plan so you know whether 6PE or dual-stack or ??? will work for you. Light up IPv6 on a few PoPs so that you can get some experience with monitoring etc.
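The "get some experience with monitoring" step can start very small. A Python sketch (the function names are illustrative, not from any particular toolset) that classifies resolver answers, so a lab script can flag services that publish only an A record and are therefore invisible to v6-only eyeballs:

```python
# Sketch: classify getaddrinfo() results so monitoring can flag v4-only
# services. classify() is separated from the lookup so it can be exercised
# without a resolver.
import socket


def classify(addrinfos):
    """Return the set of address families ('IPv4'/'IPv6') present."""
    families = set()
    for family, _type, _proto, _canonname, _sockaddr in addrinfos:
        if family == socket.AF_INET:
            families.add("IPv4")
        elif family == socket.AF_INET6:
            families.add("IPv6")
    return families


def dual_stack_status(host, port=80):
    """Report whether a host resolves v4-only, v6-only, or dual-stack."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    families = classify(infos)
    if families == {"IPv4", "IPv6"}:
        return "dual-stack"
    return " + ".join(sorted(families)) or "unresolvable"

# Example (needs DNS, so not run here):
#   dual_stack_status("www.example.com")
```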
Hopefully other folks can make their beancounters understand that v6 is going to happen regardless of their wishes for it NOT to happen due to upgrade costs.
This is the key thing. The cost of NOT being prepared for IPv6 is vastly higher than the cost of easing into it. When the IPv6 era comes crashing in, there will be bankruptcies as a result, because some networks will simply not be prepared to handle the transition. At that point in time there will be a shortage of IPv6 expertise, and even large amounts of money will not necessarily be able to buy the skills that you need. The time to prepare is now.

When I read Geoff Huston's response to Tony Hain's analysis here: http://www.cisco.com/en/US/about/ac123/ac147/archived_issues/ipj_8-3/ipv4.ht... I don't see him saying that we should do nothing. There seems to be general consensus among these guys that the time for action is now. If you want the IPv6 transition to be painless for your company, then you need to get planning and get IPv6 in your test labs today.

--Michael Dillon
On Fri, 14 Oct 2005 Michael.Dillon@btradianz.com wrote:
"I told them dudes to forklift their network" is hardly productive.
IPv6 is not a forklift upgrade.
agreed, it's a measured engineered decision hopefully, backed by financial and prudent engineering decisions. That wasn't the tone of the original comment though, which was: "Yea, I told them to just do it", which is tantamount to 'forklift your network, you dummies'.
Showing, if folks can't find it themselves, that there is a business case that would justify a few million dollar upgrade is...
Again, it is cheaper to ease into IPv6 than to wait until it hurts so badly that you have a business case for a million-dollar spend. Start by making sure that your engineers get some IPv6 training
sure, most folks are doing this... (or at least quite a few are) and at a certain time it's time to go from 'research' to 'production'. At that time it'd be prudent to address the financial reasons to 'go production', which is: "Who's going to pay for this? What customer demand is there for this? Uhm... why should I jeopardize my network for this now?"
Hopefully other folks can make their beancounters understand that v6 is going to happen regardless of their wishes for it NOT to happen due to upgrade costs.
The time to prepare is now.
agreed, some of the reason I asked a month ago about content providers :) also some of the reason I then said: "perhaps shim6 and multihoming is important to operators, better pay attention!" (I'm certainly not the only one saying these, I just don't want to drag other folks down with me :) )
When I read the Geoff Huston's response to Tony Hain's analysis here: http://www.cisco.com/en/US/about/ac123/ac147/archived_issues/ipj_8-3/ipv4.ht... I don't see him saying that we should do nothing. There seems to be general consensus among these guys that the time for action is now. If you want the IPv6 transition to be painless for your company, then you need to get planning and get IPv6 in your test labs today.
(and finding a business case that flies with management to actually do something higher in priority than all the other daily work)
Once upon a time, Christopher L. Morrow <christopher.morrow@mci.com> said:
agreed, it's a measured engineered decision hopefully, backed by financial and prudent engineering decisions. That wasn't the tone of the original comment though, which was: "Yea, I told them to just do it", which is tantamount to 'forklift your network, you dummies'.
For some equipment, it still works out to "forklift your network". For example, our current dialup gear doesn't support IPv6 (and AFAIK no upgrades are available or planned to add it). There's no reason for us to replace our dialup gear; the only thing that fails on it is fans (and we can replace those easily enough with an hour's work of chassis dis/re-assembly). Dialup isn't going to go away in the near future either. -- Chris Adams <cmadams@hiwaay.net> Systems and Network Administrator - HiWAAY Internet Services I don't speak for anybody but myself - that's enough trouble.
On Fri, 14 Oct 2005, Chris Adams wrote:
Once upon a time, Christopher L. Morrow <christopher.morrow@mci.com> said:
agreed, it's a measured engineered decision hopefully, backed by financial and prudent engineering decisions. That wasn't the tone of the original comment though, which was: "Yea, I told them to just do it", which is tantamount to 'forklift your network, you dummies'.
For some equipment, it still works out to "forklift your network". For example, our current dialup gear doesn't support IPv6 (and AFAIK no upgrades are available or planned to add it). There's no reason for us to replace our dialup gear; the only thing that fails on it is fans (and we can replace those easily enough with an hour's work of chassis dis/re-assembly). Dialup isn't going to go away in the near future either.
i suspect there is quite a large amount of gear (type not weight) that will never see v6 through the vendors but still support customers... speedstream anyone? cable-modem anyone? :( there are LOTS of things out there that don't know from v6 :( Thanks for another example though :)
When I suggest to my customers to move to IPv6, I explicitly tell them that planning is very important:

1) Initially (in some cases), your equipment may not have native support in the core/access networks. Not a problem: when you upgrade your network for other reasons (line cards, new IPv4 features, etc.), IPv6 will usually come as a value-add. For the time being, a transition box (even just a PC) can handle it, as the traffic levels are low. This also gives you time to experiment, see how the traffic grows, and helps your "commercial" decision on whether to move ahead faster or not.

2) Same with the CPEs. Most of the time they don't support native IPv6 today, but a PC in your network, probably with 6to4 and Teredo as non-managed transition mechanisms, will do it. It's not the optimal way, but it helps you move on; better than nothing. Doing this, you offer better service to your customers who are also playing with IPv6, instead of asking them to use third-party tunnel brokers or 6to4 relays. Of course, a better service would be to set up a tunnel broker in your network, but this could mean some extra O&M cost.

If your network is big, you may obviously need to set up several of those PCs, in different POPs, regions, etc., but you will see the need when the traffic comes. The alternative is to use existing or old routers, which most of the time also support 6to4.

Regards,
Jordi
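The 6to4 mechanism Jordi mentions is purely algorithmic: a host's public IPv4 address is embedded directly after the 2002::/16 prefix (RFC 3056), so each public IPv4 address implies one /48. A small Python sketch of that mapping (the helper name is mine), handy for sanity-checking what a 6to4 PC or relay should be announcing:

```python
# Derive the 6to4 /48 prefix for a public IPv4 address (RFC 3056):
# 2002:VVWW:XXYY::/48, where VV.WW.XX.YY is the IPv4 address in hex.
import ipaddress


def to_6to4_prefix(ipv4: str) -> ipaddress.IPv6Network:
    addr = ipaddress.IPv4Address(ipv4)
    # is_private is used as a rough stand-in for "not globally routable";
    # RFC 3056 requires a globally unique IPv4 address.
    if addr.is_private:
        raise ValueError("6to4 requires a public IPv4 address")
    # 2002::/16, then the 32 IPv4 bits, then 80 zero bits.
    v6int = (0x2002 << 112) | (int(addr) << 80)
    return ipaddress.IPv6Network((v6int, 48))

# Example: to_6to4_prefix("8.8.8.8") -> 2002:808:808::/48
```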
De: "Christopher L. Morrow" <christopher.morrow@mci.com> Responder a: <owner-nanog@merit.edu> Fecha: Sat, 15 Oct 2005 03:34:21 +0000 (GMT) Para: Chris Adams <cmadams@hiwaay.net> CC: <nanog@nanog.org> Asunto: Re: IPv6 news
On Fri, 14 Oct 2005, Chris Adams wrote:
Once upon a time, Christopher L. Morrow <christopher.morrow@mci.com> said:
agreed, it's a measured engineered decision hopefully, backed by financial and prudent engineering decisions. That wasn't the tone of the original comment though, which was: "Yea, I told them to just do it", which is tantamount to 'forklift your network, you dummies'.
For some equipment, it still works out to "forklift your network". For example, our current dialup gear doesn't support IPv6 (and AFAIK no upgrades are available or planned to add it). There's no reason for us to replace our dialup gear; the only thing that fails on it is fans (and we can replace those easily enough with an hour's work of chassis dis/re-assembly). Dialup isn't going to go away in the near future either.
i suspect there is quite a large amount of gear (type not weight) that will never see v6 through the vendors but still support customers... speedstream anyone? cable-modem anyone? :( there are LOTS of things out there that don't know from v6 :(
Thanks for another example though :)
On Fri, Oct 14, 2005 at 10:21:58PM -0500, Chris Adams wrote:
For some equipment, it still works out to "forklift your network". For example, our current dialup gear doesn't support IPv6 (and AFAIK no upgrades are available or planned to add it).
How does that hinder your backbone, leased line access and hosting deployment? Best regards, Daniel -- CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
Once upon a time, Daniel Roesen <dr@cluenet.de> said:
On Fri, Oct 14, 2005 at 10:21:58PM -0500, Chris Adams wrote:
For some equipment, it still works out to "forklift your network". For example, our current dialup gear doesn't support IPv6 (and AFAIK no upgrades are available or planned to add it).
How does that hinder your backbone, leased line access and hosting deployment?
Let's say we've passed the day when we can no longer get IPv4 address space (that's what started this thread) and we bring up a new hosting customer who is also local and has a dialup account. The hosting site gets an IPv6 address; how will the customer access it? -- Chris Adams <cmadams@hiwaay.net> Systems and Network Administrator - HiWAAY Internet Services I don't speak for anybody but myself - that's enough trouble.
On Fri, Oct 14, 2005 at 12:32:29AM +0000, Christopher L. Morrow wrote:
As ted and others have already said: "Show me the customers who are asking"... so far the numbers are startlingly low, too low to justify full builds by anyone large.
Just wait for a popular adult-content-provider offering website-access for free via IPv6.. -- Sabri please do not throw salami pizza away
On Fri, 14 Oct 2005, Sabri Berisha wrote:
On Fri, Oct 14, 2005 at 12:32:29AM +0000, Christopher L. Morrow wrote:
As ted and others have already said: "Show me the customers who are asking"... so far the numbers are startlingly low, too low to justify full builds by anyone large.
Just wait for a popular adult-content-provider offering website-access for free via IPv6..
that'd fall into my questions from a month ago: "Why won't a large content provider (or three) light up v6 versions of their services?" :)
On Fri, 14 Oct 2005 15:13:44 +0200 Sabri Berisha <sabri@cluecentral.net> wrote:
On Fri, Oct 14, 2005 at 12:32:29AM +0000, Christopher L. Morrow wrote:
As ted and others have already said: "Show me the customers who are asking"... so far the numbers are startlingly low, too low to justify full builds by anyone large.
Just wait for a popular adult-content-provider offering website-access for free via IPv6..
Why? Are you implying that there is unlimited free IPv6 bandwidth? If not, why would they do that? If so, I don't do porn, but I would be highly interested in free bandwidth. I suspect others would be too...

Regards
Marshall Eubanks
-- Sabri
please do not throw salami pizza away
On Fri, Oct 14, 2005 at 10:17:51AM -0400, Marshall Eubanks wrote: Dear Marshall,
Just wait for a popular adult-content-provider offering website-access for free via IPv6..
Why ? Are you implying that there is unlimited free IPv6 bandwidth ?
Nope.
If not, why would they do that ?
Imagine the following scenario: "It's 2009, and the world has reached the end of its IPv4 supply. As large global networks are still struggling to implement IPv6 on their equipment, their customers face more and more problems getting additional IP space from their RIRs and are forced to use IPv6. But due to the lack of planning, only a number of access ISPs have successfully deployed IPv6 on their networks, and so we have fragmented native IPv6 connectivity throughout the internet. To encourage the access and carrier industry, (adult) content providers on all continents decide to boost the demand for IPv6 connectivity and offer their services for free to IPv6 users, for a limited period of time."

Why did the internet grow so fast in the 90's? The public was able to access the network and created the demand for more content. This content attracted more and more eyeballs, and thus more commercial activities were deployed, resulting in exponential growth of the network. Without eyeballs, content providers are not encouraged to deploy IPv6. Without content, eyeball providers are not encouraged to deploy IPv6. It's a matter of time before one of them is forced to break this circle, and there is only one way to attract a large audience: giving away your service for (nearly) free.

--
Sabri

please do not throw salami pizza away
Dear Sabri; On Fri, 14 Oct 2005 16:34:19 +0200 Sabri Berisha <sabri@cluecentral.net> wrote:
On Fri, Oct 14, 2005 at 10:17:51AM -0400, Marshall Eubanks wrote:
Dear Marshall,
Just wait for a popular adult-content-provider offering website-access for free via IPv6..
Why ? Are you implying that there is unlimited free IPv6 bandwidth ?
Nope.
If not, why would they do that ?
Imagine the following scenario:
"It's 2009, the world reaches the end of it's ipv4 supply. As large global networks are still struggling to implement ipv6 on their equipment, their customers are facing more and more problems to get additional IP-space from their RIR's and are forced to use ipv6. But due to the lack of planning, only a number of access-isp's have successfully deployed ipv6 on their networks and so we have shattered native ipv6 connectivity throughout the internet. To encourage the access- and carrierindustry, (adult)contentproviders in all continents decide to boost the demand for ipv6 connectivity and offer their services for free to ipv6 users, for a limited period of time."
Try as I may, I cannot imagine depending on the charity of porn content providers :) What I can imagine is the porn providers paying whatever is necessary for IPv4 address blocks. I would not want to get in a bidding war with them.
Why did the internet grow so fast in the 90's? The public was able to access the network and created the demand for more content. This content attracted more and more eyeballs, and thus more commercial activities were deployed, resulting in a exponential growth of the network.
Without eyeballs, content providers are not encouraged to deploy IPv6. Without content, eyeball providers are not encouraged to deploy IPv6. It's a matter of time before one of them is forced to break this circle, and there is only one way to attract a large audience: giving away your service for (nearly) free.
This sounds just like arguments heard in multicast discussions. I do give my content away for free, and AmericaFree.tv currently realizes about 10% of its (streaming) revenue from multicast (and presumably also realizes a similar percentage reduction in bandwidth costs from the same). If I thought that AFTV could increase revenues by another 10% by putting out IPv6 reflections of existing content, it would. If I thought that it could make money by putting out dedicated IPv6 content, it would do that too. I don't see any reason to expect either at present. I would be glad to be convinced otherwise.
-- Sabri
Regards Marshall
please do not throw salami pizza away
On Thu, 2005-10-13 at 22:55 +0000, Christopher L. Morrow wrote: <SNIP>
I also presume you sent them a check and showed them the business case for the upgrade? No large provider is going to upgrade anything without a business reason.
Current clients are already paying them at the moment, are they not? As they apparently didn't reserve any funds for upgrades of their network, nor take IPv6 along in the last 10 years of hardware cycles, thus clearly having played dumb for the last 10 years, why should their customers suddenly have to cough up for the stupidity of not being able to run a business and plan ahead into the future? As they apparently didn't upgrade their network for 10 years, somebody must have a fat bank account by now :) Even then, they could easily do some 'good' tunnels over their own IPv4 infrastructure, enabling IPv6 at the edges where they connect their customers, and maybe do some sensible peering, thus providing sensible IPv6 transit to their paying customers...
Oh, and some parts, critical parts even, of v6 are still 'broken'...
Yep, there is no multihoming, but effectively, except for the BGP tricks currently being played, there is nothing in IPv4 either. But one won't need to upgrade a Tier 1's hardware to support shim6, as that will all be done at the end site and not at the "Tier 1" level, so that is just another bad excuse.

Greets,
 Jeroen
On Fri, 14 Oct 2005, Jeroen Massar wrote:
On Thu, 2005-10-13 at 22:55 +0000, Christopher L. Morrow wrote: <SNIP>
I also presume you sent them a check and showed them the business case for the upgrade? No large provider is going to upgrade anything without a business reason.
Current clients are already paying them at the moment, are they not? As they apparently didn't reserve any funds for upgrades of their network, nor take IPv6 along in the last 10 years of hardware cycles, thus clearly having played dumb for the last 10 years, why should their
silly me... I forgot that stable ipv6 code has been available for 10 years, forget my protest then.
Oh, and some parts, critical parts even, of v6 are still 'broken'...
Yep, there is no multihoming, but effectively, except for the BGP tricks currently being played, there is nothing in IPv4 either. But one won't need to upgrade a Tier 1's hardware to support shim6, as
shim6 is: 1) not baked, 2) not helpful for transit ASes, 3) not a reality
that will all be done at the end site and not at the "Tier 1" level, so that is just another bad excuse.
or bad assumptions on your part, it's perhaps a matter of perspective.
On 14-Oct-2005, at 10:13, Christopher L. Morrow wrote:
Yep, there is no multihoming, but effectively, except for the BGP tricks currently being played, there is nothing in IPv4 either. But one won't need to upgrade a Tier 1's hardware to support shim6, as
shim6 is: 1) not baked, 2) not helpful for transit ASes, 3) not a reality
Not baked is absolutely correct, and not a reality follows readily from that, as viewed by an operator. I'm interested in (2), though. Shim6 is not intended to be a solution for transit ASes. If you're an ISP, then you can get PI address space and multi-home in the normal way with BGP. The big gap in the multi-homing story for v6 is for end sites, since those are specifically excluded by all the RIRs' policies on PI addressing right now. Shim6 is intended to be a solution for end sites. Are you suggesting that something else is required for ISPs above and beyond announcing PI space with BGP, or that shim6 (once baked and real) would present a threat to ISPs? Joe
On Fri, 2005-10-14 at 10:57 -0400, Joe Abley wrote: <SNIP>
Are you suggesting that something else is required for ISPs above and beyond announcing PI space with BGP, or that shim6 (once baked and real) would present a threat to ISPs?
There is one situation which is not really covered here. One can of course announce multiple de-aggregates, but these will be filtered. As such, announcing them will only hurt one a lot, as the 'transits' that do carry them are mostly of bad quality.

eg, take the following situation: a big ISP, or a large corporate network, spread around the world, with >200 customers, so they can easily get an IPv6 prefix from their favourite RIR. Thus they get, say, a /32. Now this ISP has a large webfarm in the US. They have a very small one in, say, Taiwan. In IPv4, this would mean: chunk up your PA space and simply announce it in /20's or whatever is comfortable for you. In IPv6, though, one is not supposed to announce chunks out of the /32; also, when you do, as mentioned above, one gets bad routing. Anyway, you don't want your farm in Taiwan to attract all the local (complete Asia?) traffic, which you then have to ship over that same small link or your internal network back to the US, while in IPv4 others would be doing that.

In this case, which is basically "traffic engineering for end sites with a global prefix", one runs into the shim6 thing again...

For instance, UUNET 'solved' this in a different way: they simply requested 10 or so separate /32's. See GRH for the list. These chunks are still /32's, thus only <n> of these /32's can exist in the global routing table. It would still be 'nicer' if they only had to use one prefix...

Greets,
 Jeroen
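The filtering problem described above can be modelled in a few lines. A Python sketch, using the 2001:db8::/32 documentation prefix rather than anyone's real allocation, of a /32 chopped into regional more-specifics and what a strict "RIR allocation boundaries only" filter keeps:

```python
# Model: an ISP holds one /32. Announcing regional more-specifics (/35s
# here) runs into peers that filter anything longer than the /32
# allocation boundary, so only the aggregate survives.
import ipaddress

ALLOCATION = ipaddress.IPv6Network("2001:db8::/32")  # documentation prefix


def regional_deaggregates(alloc, new_prefix=35):
    """Chop an allocation into equal-sized regional announcements."""
    return list(alloc.subnets(new_prefix=new_prefix))


def strict_filter(announcements, max_len=32):
    """Keep only announcements at max_len or shorter, as strict peers do."""
    return [p for p in announcements if p.prefixlen <= max_len]


regions = regional_deaggregates(ALLOCATION)        # 8 regional /35s
accepted = strict_filter([ALLOCATION] + regions)   # only the /32 survives
```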
On Fri, 14 Oct 2005, Jeroen Massar wrote:
On Fri, 2005-10-14 at 10:57 -0400, Joe Abley wrote: <SNIP>
Are you suggesting that something else is required for ISPs above and beyond announcing PI space with BGP, or that shim6 (once baked and real) would present a threat to ISPs?
There is one situation which is not really covered here, one can of course announce multiple de-aggregates, but, these will be filtered. As such announcing them will only hurt one a lot, as the 'transits' that do carry them are mostly of bad quality.
eg take the following situation:
snip following one example of multihoming problems... there are others.
In this case, which is basically "traffic engineering for end sites with a global prefix", one runs into the shim6 thing again....
For instance UUNET 'solved' this in a different way: they simply requested 10 or so separate /32's. See GRH for the list. These chunks are still /32's, thus only <n> of these /32's can exist in the global routing table. It would still be 'nicer' if they only had to use one prefix...
One may want to have more options :) One might perhaps want to peer in-region, or by national boundary... you must have options; a single prefix, as your example above showed, is not an option. -Chris
On Fri, Oct 14, 2005 at 10:57:59AM -0400, Joe Abley wrote:
The big gap in the multi-homing story for v6 is for end sites, since those are specifically excluded by all the RIRs' policies on PI addressing right now. Shim6 is intended to be a solution for end sites.
But isn't a solution for many (most?) of them, EVEN if it would be universally implemented everywhere[tm]. This is a known problem, and it was decided to aim for a 80%[1] solution (80% of the problem space, not 80% of folks in need). There is still not half of an idea for a solution for the full problem in sight. Best regards, Daniel [1] number made up -- CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
On 14-Oct-2005, at 11:27, Daniel Roesen wrote:
On Fri, Oct 14, 2005 at 10:57:59AM -0400, Joe Abley wrote:
The big gap in the multi-homing story for v6 is for end sites, since those are specifically excluded by all the RIRs' policies on PI addressing right now. Shim6 is intended to be a solution for end sites.
But isn't a solution for many (most?) of them, EVEN if it would be universally implemented everywhere[tm].
Agreed, the solution space of current IPv4 multi-homing practice is larger than that of shim6. I think it is far too early to judge how many end sites might find shim6 an acceptable solution, however -- I'd wait for some measurement and modelling before I made declarations about that, and the measurement and modelling is arguably of limited use until the protocol elements come out of the oven. I don't think it's a foregone conclusion that edge-adaptive traffic engineering (along the lines of that carried out by many peer-to-peer applications) is necessarily inferior to traffic engineering carried out by upstream ISPs, however, which is something that I often hear. The balance of goodness depends on far too many variables to prejudge, and there are philosophical arguments in favour of both approaches. Joe
On Fri, Oct 14, 2005 at 11:50:33AM -0400, Joe Abley wrote:
I think it is far too early to judge how many end sites might find shim6 an acceptable solution, however -- I'd wait for some measurement and modelling before I made declarations about that,
You mean in some 5-10 years? When finally the many folks who even struggle to implement TCP properly manage to implement one or even two (the newest idea of the shim6 folks) shim layers in the stack and get that deployed widely? I don't see that.

But I think the discussion is moot. IETF decided on their goal, and it's superfluous trying to change that. While watching shim6 we carry on hoping that we'll get IPv6 multihoming going in the conventional, proven, working, feature-complete way we're used to... until IETF perhaps at one point in time realize that they are designing a solution which misses the stated requirements of many folks actually operating networks - and start working on a solution which actually solves the perceived problem of scalability in a way operators look forward to deploying.

And looking at the IPv6 allocation lists, I see that some of the folks' employers involved in shim6 development actually have got their own allocations (and even leak more-specifics in geographically distinct locations for traffic engineering). Looks like they couldn't convince even their own IT folks that shim6 or anything else will fix their problem (feature-wise and/or timeline-wise).

Sorry for being so politically incorrect as to spell out in open words what a lot of folks out there think. I'm wearing my asbestos anyway. :-)

Best regards, Daniel

-- CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
On Fri, 14 Oct 2005, Daniel Roesen wrote:
On Fri, Oct 14, 2005 at 11:50:33AM -0400, Joe Abley wrote:
I think it is far too early to judge how many end sites might find shim6 an acceptable solution, however -- I'd wait for some measurement and modelling before I made declarations about that,
But I think the discussion is moot. IETF decided on their goal, and it's superfluous trying to change that. While watching shim6 we carry on hoping that we'll get IPv6 multihoming going in the conventional, proven, working, feature-complete way we're used to... until IETF
there is no hope in having operators explain to ietf that the current path is fruitless? certainly they can be made to see the light, yes?
And looking at the IPv6 allocation lists, I see that some of the folks' employers involved in shim6 development actually have got their own allocations (and even leak more-specifics in geographically distinct locations for traffic engineering). Looks like they couldn't convince even their own IT folks that shim6 or anything else will fix their problem (feature-wise and/or timeline-wise).
that is troubling, yes... 'hypocrisy' ?
Sorry for being so politically incorrect to spell out in open words what a lot of folks out there think. I'm wearing my asbestos anyway. :-)
i for one appreciate it. Thanks!
On Sat, Oct 15, 2005 at 03:15:45AM +0000, Christopher L. Morrow wrote:
But I think the discussion is moot. IETF decided on their goal, and it's superfluous trying to change that. While watching shim6 we carry on hoping that we'll get IPv6 multihoming going in the conventional, proven, working, feature-complete way we're used to... until IETF
there is no hope in having operators explain to ietf that the current path is fruitless? certainly they can be made to see the light, yes?
Well, all this discussion and the set of requirements are nothing new. Quite the contrary. Lots and lots of talking was done, but still multi6 resulted in shim6. Where should one gain hope? They were constantly beaten with 6D Maglites; what does it take to see the light? Most folks have given up arguing, I guess. I myself certainly did, at least in open fora.

But I also have to admit that I'm shocked how few folks have the balls (or is it laziness?) to express their opinion on IPv6 multihoming in public, on the established fora for that stuff. See the recent threads about IPv6 PI / multihoming on ARIN PPML and other policy-making mailing lists. Almost zero feedback from enterprise / SME folks. That of course makes it much easier... "see, no one really complains! we must be going down the right road!".
And looking at the IPv6 allocation lists, I see that some of the folks' employers involved in shim6 development actually have got their own allocations (and even leak more-specifics in geographically distinct locations for traffic engineering). Looks like they couldn't convince even their own IT folks that shim6 or anything else will fix their problem (feature-wise and/or timeline-wise).
that is troubling, yes... 'hypocrisy' ?
Hm, perhaps more like OPP syndrome, no idea. You can very comfortably talk about ignoring requirements when you have your own allocs in place, or know that you can easily pretend to be an ~ISP by sheer size of the company (you know, the "our IT department is the ISP for all other departments and spoke sites, and we have lots of them" standard trick). Frankly I don't have a clue what really led the IETF to make the multi6 => shim6 move. Don't have any insight into the politics behind the curtains. Best regards, Daniel -- CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
But I think the discussion is moot. IETF decided on their goal, and it's superfluous trying to change that. While watching shim6 we carry on hoping that we'll get IPv6 multihoming going in the conventional, proven, working, feature-complete way we're used to... until IETF
there is no hope in having operators explain to ietf that the current path is fruitless? certainly they can be made to see the light, yes?
Doubtful. The IETF was operating under the impression that having a scalable routing subsystem was paramount. Do you think operators can be made to see that light? Implementing IPv6 multihoming the "conventional" way guarantees that we end up with one prefix per site, and as the need for multihoming reaches deeper into the population, the growth rate of the routing table would surpass even the growth rate of the Internet itself. The alternative is a multihoming scheme that does not require a prefix per site. But that doesn't match the stated requirement of 'conventional', 'proven', 'working' [sic], 'feature-complete'. The operational community needs to reach consensus on what its priorities are. We fought the CIDR wars to keep the routing subsystem working and the operational community were the primary backers of that. To not support scalable multihoming is to reverse that position entirely. Tony
On Fri, Oct 14, 2005 at 09:52:19PM -0700, Tony Li wrote:
The alternative is a multihoming scheme that does not require a prefix per site. But that doesn't match the stated requirement of 'conventional', 'proven', 'working' [sic], 'feature-complete'.
Those weren't the "stated requirements" on an alternative multihoming scheme, but only the attributes of conventional BGP multihoming. Please don't put words in my mouth that I didn't say.
The operational community needs to reach consensus on what its priorities are. We fought the CIDR wars to keep the routing subsystem working and the operational community were the primary backers of that. To not support scalable multihoming is to reverse that position entirely.
CIDR didn't have big disadvantages for operators (at least none that I can identify, not having personally lived through the CIDR migration). Operators DO support scalable multihoming, but it has to deliver what they want/need. HOW this can be achieved is the task of the IETF and the REAL challenge. shim6 is only "the easy way out". Best regards, Daniel -- CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
On Sat, Oct 15, 2005 at 10:44:39AM +0200, Daniel Roesen wrote:
Operators DO support scalable multihoming, but it has to deliver what they want/need. HOW this can be achieved is the task of the IETF and the REAL challenge. shim6 is only "the easy way out".
Daniel
Easy... perhaps, perhaps not. Shim6 is an attempt to split the endpoint identifier from the routing locator. Unfortunately, it is not a clean split, still co-mingling the two, thus retaining the confusion that we have today about what an address is used for... :) Not persuaded that this is going to be easy. --bill
Daniel,
The alternative is a multihoming scheme that does not require a prefix per site. But that doesn't match the stated requirement of 'conventional', 'proven', 'working' [sic], 'feature-complete'.
Those weren't the "stated requirements" on an alternative multihoming scheme, but only the attributes of conventional BGP multihoming. Please don't put words in my mouth that I didn't say.
Those are exactly the words that you used in your message. I quote:
While watching shim6 we carry on hoping that we'll get IPv6 multihoming going in the conventional, proven, working, feature-complete way we're used to... until IETF perhaps at one point in time realize that they are designing a solution which misses the stated requirements of many folks actually operating networks -
The operational community needs to reach consensus on what its priorities are. We fought the CIDR wars to keep the routing subsystem working and the operational community were the primary backers of that. To not support scalable multihoming is to reverse that position entirely.
CIDR didn't have big disadvantages for operators (at least none that I can identify, not having personally lived through the CIDR migration).
No. It had big disadvantages to the end users. We asked them to suck it up in the name of having a scalable Internet. Now that we are proposing a technology to continue to help the providers scale, but that has disadvantages to the providers, we're seeing that the providers are not willing to sacrifice. Extremely disappointing.
Operators DO support scalable multihoming, but it has to deliver what they want/need. HOW this can be achieved is the task of the IETF and the REAL challenge. shim6 is only "the easy way out".
The IETF is responsible for providing real world engineering solutions to continue the growth of the Internet. When presented with the fundamentally conflicting requirements of supporting the Internet and fulfilling one faction's requests, they have chosen in favor of the Internet. If you'd like to suggest that they discover fundamentally new technology so that one can have their cake and eat it too, that's a fine thing, but that is the province of the IRTF. Engineering is the art of making tradeoffs, and that's what the IETF has done. I think that the provider community should examine the tradeoffs that have been made in much greater detail before condemning the result. The provider community has been well served by the IETF over the years and shim6 deserves at least a full and reasoned hearing before you throw the baby out with the bath-water. Tony
On Sat, 15 Oct 2005, Tony Li wrote:
The operational community needs to reach consensus on what its priorities are. We fought the CIDR wars to keep the routing subsystem working and the operational community were the primary backers of that. To not support scalable multihoming is to reverse that position entirely.
CIDR didn't have big disadvantages for operators (at least none that I can identify, not having personally lived through the CIDR migration).
No. It had big disadvantages to the end users. We asked them to suck it up in the name of having a scalable Internet. Now that we are proposing a technology to continue to help the providers scale, but that has disadvantages to the providers, we're seeing that the providers are not willing to sacrifice. Extremely disappointing.
I don't want to speak for Daniel, nor other operators really, but a solution that doesn't allow an operator to traffic engineer internally or externally is just not workable. For the same reasons quoted in your other messages to me: "Increased reliance on the Internet" If the network isn't reliable due to suboptimal routing issues it can't survive :(
condemning the result. The provider community has been well served by the IETF over the years and shim6 deserves at least a full and reasoned hearing before you throw the baby out with the bath-water.
agreed, but it doesn't seem that, until recently at least, there was much operator participation. Hopefully that's changing for the better :)
I don't want to speak for Daniel, nor other operators really, but a solution that doesn't allow an operator to traffic engineer internally or externally is just not workable. For the same reasons quoted in your other messages to me: "Increased reliance on the Internet"
There's nothing in any multi-prefix multihoming solution that prevents an operator from internal or external traffic engineering. There just isn't a single explicit prefix to manipulate. If, within any given routing domain, you choose to carry a longer prefix and manipulate it to whatever extent your vendor's BGP permits, you and your consenting adult peers are free to do so. Do not, however, expect the rest of us to carry your traffic engineering prefixes. We are not interested.
agreed, but it doesn't seem that, until recently at least, there was much operator participation. Hopefully that's changing for the better :)
Hopefully, that will reach a point where the operators show up and participate at IETF, rather than the IETF coming to NANOG. Tony
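Tony's distinction between the internal and external views can be illustrated with a small sketch (documentation prefixes, purely hypothetical): the traffic-engineering more-specifics carried inside a domain collapse back into the single aggregate the rest of the world sees:

```python
import ipaddress

# Four /34 more-specifics used for internal traffic engineering...
internal_te = [
    ipaddress.ip_network("2001:db8::/34"),
    ipaddress.ip_network("2001:db8:4000::/34"),
    ipaddress.ip_network("2001:db8:8000::/34"),
    ipaddress.ip_network("2001:db8:c000::/34"),
]

# ...collapse into the one /32 announced externally.
external = list(ipaddress.collapse_addresses(internal_te))
print(external)  # [IPv6Network('2001:db8::/32')]
```

The longer prefixes never need to leave the domain (or the set of consenting peers); aggregation at the boundary is what keeps the global table from carrying them.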
On Sat, 15 Oct 2005, Tony Li wrote:
I don't want to speak for Daniel, nor other operators really, but a solution that doesn't allow an operator to traffic engineer internally or externally is just not workable. For the same reasons quoted in your other messages to me: "Increased reliance on the Internet"
There's nothing in any multi-prefix multihoming solution that prevents an operator from internal or external traffic engineering. There just isn't a single explicit prefix to manipulate. If, within any given routing domain, you choose to carry a longer prefix and manipulate it to whatever extent your vendor's BGP permits, you and your consenting adult peers are free to do so. Do not, however, expect the rest of us to carry your traffic engineering prefixes. We are not interested.
I'm aware that routing table bloat is a problem; I'm also aware that just doing tomorrow what we do today will probably cause lots of expense, failure, or pain at some point in the future. I can't see that source/dest pairs being basically meaningless, and large sinks or sources of traffic being anonymous, is going to help either. (Shim6 provides the possibility for end nodes to 'renumber' and change their source/dest at will, potentially playing havoc with traffic patterns, potentially even inducing 'failure' in the network interconnects along the way.)
agreed, but it doesn't seem that, until recently at least, there was much operator participation. Hopefully that's changing for the better :)
Hopefully, that will reach a point where the operators show up and participate at IETF, rather than the IETF coming to NANOG.
agreed.
On Sun, 16 Oct 2005, Christopher L. Morrow wrote:
I don't want to speak for Daniel, nor other operators really, but a solution that doesn't allow an operator to traffic engineer internally or externally is just not workable. For the same reasons quoted in your other messages to me: "Increased reliance on the Internet"
Well, people haven't been at all keen on solutions which would need fairly significant changes to how the operators do inter-AS routing (even if those would avoid shifting some aspects of routing to end-nodes). Given this high resistance (rightly, wrongly, doesn't matter) to big changes in the transit parts of the internet, the only place left to do it is at the edges: have leaf-sites^Wnodes be far more active in how their packets are routed (by making deliberate use of the current provider-aligned allocation<->topology transit internet). What kind of operator are you thinking of, btw? End-node multihoming shouldn't bother operators of ISPs really (they'll only get the traffic that the end-node decided it wanted to send via them, which is exactly what you have today ;) ). It could bother operators of other kinds of sites though - but I'm hopeful that the shim6 mechanisms will be malleable to site-multihoming, even if initially shim6 only concerns itself with end-hosts.
If the network isn't reliable due to suboptimal routing issues it can't survive :(
Just because one network is unreliable does not mean that all the networks the end-node is connected to are unreliable. The end-node can try to figure out which work and which don't and route accordingly. That's the whole point of shim6 ;). regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: I do not fear computers. I fear the lack of them. -- Isaac Asimov
On Fri, 14 Oct 2005, Tony Li wrote:
The alternative is a multihoming scheme that does not require a prefix per site.
Another alternative is to force-align allocation and topology in some way /other/ than by "Providers" (geographical allocation in whatever hierarchy, IX allocation, whatever), such that networks were easily aggregatable. Lots of objections though (the "providers and geography don't align" one is ultimately slightly bogus, because with non-provider-aligned allocation policies in place it would be in providers' interests to align their peering to match the allocation policy). FWIW, my current IPv6 assignment is PI to a degree (where P == my first-hop IPv4 provider): I can change this "first hop IPv4" provider to any other provider within my country and still retain my IPv6 assignment. That kind of "PI" at least meets a lot of my needs. But it will disappear as soon as my "first hop" provider offers native IPv6 - I'll have to give up my more mobile assignment then. I.e. my IPv6 experience is /better off/ if ISPs in my country do /not/ deploy IPv6.. ;)
But that doesn't match the stated requirement of 'conventional', 'proven', 'working' [sic], 'feature-complete'.
ACK. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: I didn't know he was dead; I thought he was British.
FWIW, my current IPv6 assignment is PI to a degree (where P == my first hop IPv4 provider), I can change this "first hop IPv4" provider to any other provider within my country and still retain my IPv6 assignment.
it sounds as if you have the mythical separation of locator and identifier :-)/2. the problem is that there is likely to be a shortage of those locators. one problem with 6to4 is that having all traffic go through gateways will not scale well. to support v6-only folk, either the number of 6to4 gateways will need to approach the number of dfz routers, the dfz routers will run 6to4, or some combination thereof. randy
On Sat, 15 Oct 2005, Randy Bush wrote:
it sounds as if you have the mythical separation of locator and identifier :-)/2. the problem is that there is likely to be a shortage of those locators.
Yes ;^) Note that it's not 6to4, it's a proper 2001:: /48 delegation via the SixXS tunnel broker, which is mine to use as long as ISPs here do not deploy IPv6 ;).
one problem with 6to4 is that having all traffic go through gateways will not scale well. to support v6-only folk, either the number of 6to4 gateways will need to approach the number of dfz routers, the dfz routers will run 6to4, or some combination thereof.
ACK. And 6to4 obviously won't fly for long after the 4 tank runs dry. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: In Hollywood, if you don't have happiness, you send out for it. -- Rex Reed
Paul Jakma wrote:
And 6to4 obviously won't fly for long after the 4 tank runs dry.
Hopefully it won't need to at that point, as it is only intended as a transitional step. I like the simplicity of 6to4 and the way it preserves end-to-end addresses. If only there were a way to adapt its stateless automatic tunneling to solve the IPv6 multihoming/PI problem. I took a quick hack at it and the result is interesting, though far from perfect: http://kl.net/ipv6/pi-in-6.txt - Kevin
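The stateless mapping 6to4 relies on (RFC 3056) derives a /48 directly from a public IPv4 address, which is the property Kevin is trying to reuse; a minimal sketch, using a documentation IPv4 address:

```python
import ipaddress

def sixto4_prefix(ipv4: str) -> ipaddress.IPv6Network:
    """Derive the 6to4 prefix 2002:V4ADDR::/48 from an IPv4 address."""
    a, b, c, d = ipaddress.IPv4Address(ipv4).packed
    return ipaddress.IPv6Network(f"2002:{a:02x}{b:02x}:{c:02x}{d:02x}::/48")

# Purely illustrative address from the IPv4 documentation range:
print(sixto4_prefix("192.0.2.1"))  # 2002:c000:201::/48
```

Because the prefix is computable from the address itself, no per-site state or registry is needed at the tunnel endpoints - and, as noted above, the whole scheme stops making sense once there are no IPv4 addresses to embed.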
Just my 5 cents on the topic:

Don't you all think that IPv6 would not be necessary for a very long time yet, if the IPv4 allocation scheme had been done right from the very, very beginning? If the allocation policies had been something like the ones for ASNs, i.e. when you ask for an IP space allocation you must need to set your own routing policies. In any other case you should use private network space with only one IP shown outside the network. Yes, this would be a headache for some applications like IP telephony, but I don't see any problem in making the _correct_ protocols so they could work through NAT.

What I see now is that very large address blocks are allocated to large companies. What do the companies do with them? Correct, they are installing them as IPs of workstations, when, if IPs had been treated as a very valuable resource from the beginning, they would have had to use at most a /24 (well, maybe two or three /24s) for access routers.

When they propose a /48 allocation scheme for IPv6 they must be out of their minds, because if such allocation is in effect, IPv6 address space will end shortly too.

Again, IPv6 creates more problems than it solves. A better solution would be to freeze IPv4 allocation, then force big IPv4 users to return the addresses to the "public pool", and start allocation from a blank sheet of paper, doing things right.

-- With best regards, GRED-RIPE
We do not think that _it will be IPv6_. IPv6 is a good example of a _second system_, and does not look _successful_ for now. And it is definitely not the _LAST PROTOCOL_. It _can be_ IPv6, true. But it can be another protocol (or just a workaround for IPv4, as we had with CIDR and classless addressing) instead. ----- Original Message ----- From: "Gregory Edigarov" <greg@velcom.com> To: <nanog@nanog.org> Sent: Monday, October 17, 2005 3:42 AM Subject: Re: IPv6 news
Just my 5 cents on the topic:

Don't you all think that IPv6 would not be necessary for a very long time yet, if the IPv4 allocation scheme had been done right from the very, very beginning? If the allocation policies had been something like the ones for ASNs, i.e. when you ask for an IP space allocation you must need to set your own routing policies. In any other case you should use private network space with only one IP shown outside the network. Yes, this would be a headache for some applications like IP telephony, but I don't see any problem in making the _correct_ protocols so they could work through NAT.

What I see now is that very large address blocks are allocated to large companies. What do the companies do with them? Correct, they are installing them as IPs of workstations, when, if IPs had been treated as a very valuable resource from the beginning, they would have had to use at most a /24 (well, maybe two or three /24s) for access routers.

When they propose a /48 allocation scheme for IPv6 they must be out of their minds, because if such allocation is in effect, IPv6 address space will end shortly too.

Again, IPv6 creates more problems than it solves. A better solution would be to freeze IPv4 allocation, then force big IPv4 users to return the addresses to the "public pool", and start allocation from a blank sheet of paper, doing things right.
-- With best regards, GRED-RIPE
On 24/10/05, Alexei Roudnev <alex@relcom.net> wrote:
We do not think that _it will be IPv6_. IPv6 is a good example of a _second system_, and does not look _successful_ for now. And it is definitely not the _LAST PROTOCOL_.
enter jim fleming (or those chinese guys, more recently) with ipv9 srs
We do not think that _it will be IPv6_. IPv6 is a good example of a _second system_, and does not look _successful_ for now. And it is definitely not the _LAST PROTOCOL_.
enter jim fleming (or those chinese guys, more recently) with ipv9
No, enter the National Science Foundation... http://www.nsf.gov/cise/geni/ Jim Fleming's idea wasn't all that crazy, and some people are looking at similar partitioning schemes to make IPv6 multihoming practical. The IPv9 idea from China had nothing to do with IP; it was just a catchy marketing name for yet another domain naming scheme like RealNames. There really is serious, non-crazy research work going on to make a better replacement for the Internet Protocol. And Dave Clark, author of this interesting document on Internet routing: http://www.networksorcery.com/enp/ien/ien46.txt has recently been going around giving talks on fundamentally re-architecting the Internet. At MIT he is a director of the CFP, which is getting NSF GENI grant money to explore this: http://cfp.mit.edu/groups/internet/internet.html This is not the 1990's any more. ISO/CLNP has gone away. ATM has been embraced in MPLS. PSTN is being embraced in VoIP such as the British Telecom 21CN initiative. What was crazy yesterday is thinkable today. In the end, IPv6 may be able to incorporate enough of these new ideas to continue as the last protocol. But we don't know that yet. --Michael Dillon
Another alternative is to force-align allocation and topology in some way /other/ than by "Providers" (geographical allocation in whatever hierarchy, IX allocation, whatever), such that networks were easily aggregatable. Lots of objections though (the "providers and geography don't align" one though is ultimately slightly bogus, because with non-provider-aligned allocation policies in place it would be in providers interests to align their peering to match the allocation policy).
I think we need a researcher to sit down and figure out exactly what this would look like in a sample city and a sample national provider.

This is one of those inversion situations where we are turning the existing model inside out. Some people may be familiar with the inversion of control in user interfaces that came about when Windows/Macintosh became the standard UI.

Here, the suggestion is that netblocks should be allocated to cities, not to providers. Within a city, providers would get a subset of the city address block to meet their local infrastructure needs. They would interconnect with each other at local exchange points to exchange local traffic, as Paul Vixie is suggesting here: http://news.com.com/5208-1028-0.html?forumID=1&threadID=10554&messageID=77189&start=-1

Addresses from other cities would be viewed as a single aggregate for that city, and these could be even further aggregated at some regional level such as Northwest, Southwest, Midwest, Southeast and Northeast.

It's different than what we have now, but not extremely different. It is doable with IPv6 without any protocol changes because there is sufficient reserve address space available. It meets the concept of Internet as utility, or mission-critical Internet, because it mandates local interconnect. The customer point of view is that low latency and consistent latency is best, and that mandates local interconnect.

--Michael Dillon
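As a toy illustration of the hierarchy Michael describes - region, then city, then provider - using documentation space (the block sizes and city names are made-up assumptions, not a proposal):

```python
import ipaddress

region = ipaddress.ip_network("2001:db8::/32")  # stand-in regional block

# Carve the region into per-city /40s, and one city into per-provider /48s.
cities = dict(zip(["seattle", "portland", "boise", "spokane"],
                  region.subnets(new_prefix=40)))
seattle_providers = list(cities["seattle"].subnets(new_prefix=48))[:3]

# Distant networks need only the city (or regional) aggregate;
# provider-level specifics stay inside the region.
print(cities["portland"])     # 2001:db8:100::/40 - one route for the city
print(seattle_providers[0])   # 2001:db8::/48 - visible only locally
```

The point of the sketch is only that aggregation follows geography here: a remote network carries one route per city (or per region), regardless of how many providers operate inside it.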
Michael.Dillon@btradianz.com wrote:
Another alternative is to force-align allocation and topology in some way /other/ than by "Providers" (geographical allocation in whatever hierarchy, IX allocation, whatever), such that networks were easily aggregatable. Lots of objections though (the "providers and geography don't align" one though is ultimately slightly bogus, because with non-provider-aligned allocation policies in place it would be in providers interests to align their peering to match the allocation policy).
I think we need a researcher to sit down and figure out exactly what this would look like in a sample city and a sample national provider.
This is one of those inversion situations where we are turning the existing model inside out. Some people may be familiar with the inversion of control in user interfaces that came about when Windows/Macintosh became the standard UI.
Here, the suggestion is that netblocks should be allocated to cities, not to providers. Within a city, providers would get a subset of the city address block to meet their local infrastructure needs. They would interconnect with each other at local exchange points to exchange local traffic, as Paul Vixie is suggesting here: http://news.com.com/5208-1028-0.html?forumID=1&threadID=10554&messageID=77189&start=-1
Addresses from other cities would be viewed as a single aggregate for that city and these could be even further aggregated at some regional level such as Northwest, Southwest, Midwest, Southeast and Northeast.
Err... Sounds awfully like E.164. Why don't we use phone numbers instead of IP numbers? We all know how well carrier phone number routing and number portability works, don't we?
It's different than what we have now, but not extremely different. It is doable with IPv6 without any protocol changes because there is sufficient reserve address space available. It meets the concept of Internet as utility or mission-critical Internet because it mandates local interconnect. The customer point of view is that low latency and consistent latency is best and that mandates local interconnect.
I'm sorry, but your geographical approach is broken by design. -- Andre
On Mon, 17 Oct 2005, Andre Oppermann wrote:
We all know how well carrier phone number routing and number portability works, don't we?
EWORKSFORME (and everyone else here). Took a good bit of very firm pressure from ComReg, the telecoms watchdog/regulator here, to overcome negative reaction from the operators though. (There's no such pressure which could be applied to IP operators, but essentially the same processes could be applied, at least for IP connectivity, at the national regulatory level - trade /32's at INEX, the IX here, and figure out billing. If only ComReg had the authority.. ;) ). regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: "When people are least sure, they are often most dogmatic." -- John Kenneth Galbraith
Paul Jakma wrote:
On Mon, 17 Oct 2005, Andre Oppermann wrote:
We all know how well carrier phone number routing and number portability works, don't we?
EWORKSFORME (and everyone else here). Took a good bit of very firm pressure from ComReg, the telecoms watchdog/regulator here, to overcome negative reaction from the operators though.
We don't want them involved in Internet routing, do we?
(There's no such pressure that could be applied to IP operators, but essentially the same processes could be applied, at least for IP connectivity, at national regulatory levels - trade /32s at INEX, the IX here, and figure out billing. If only ComReg had the authority.. ;) ).
Do you have any idea how this works internally? Apparently not. Phone numbers are an interesting species. On a global level they are used for call routing. On a local level, however, it's not more than a DNS name mapping to some real on-net identifier.

Unfortunately, anyone calling your ported mobile number from outside the mobile networks ends up with the number range holder (your former number range holder), who in turn has to forward the call to your current mobile operator. On a TDM network this works pretty OK, as the quality parameters are standardized and fixed (64kbit transparent voice channel, call capacity, etc). Outgoing calls are not affected, because the TDM network always sets up parallel in/out paths. The return channel for your outgoing call doesn't come back through your former mobile operator.

Now compare this to the Internet and IP routing. See some little differences and difficulties here and there? Yeah, I thought so. Conclusion: Applying phone number portability to the Internet is broken by design. -- Andre
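The "onward routing" described above can be sketched in a few lines. The prefixes, numbers, and operator names here are entirely hypothetical; the point is only the extra hop through the range holder when a number has been ported out:

```python
# Hypothetical data: which operator holds which number-range prefix,
# and which individual numbers have been ported away from it.
range_holder = {"087": "OperatorA"}          # prefix -> range holder
ported_out = {"0871234567": "OperatorB"}     # number -> current operator

def route_inbound_call(number):
    """Return the list of operators the call transits, in order.

    Callers outside the mobile networks can only route on the prefix,
    so the call always lands at the range holder first; the range
    holder then forwards ported numbers to the current operator.
    """
    holder = range_holder[number[:3]]
    path = [holder]
    if number in ported_out:
        path.append(ported_out[number])
    return path

print(route_inbound_call("0871234567"))  # ['OperatorA', 'OperatorB']
print(route_inbound_call("0879999999"))  # ['OperatorA']
```

On a TDM network the extra hop costs one fixed 64kbit channel; translated to IP, that same indirection would drag arbitrary bandwidth through the former provider, which is Andre's objection.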
On Tue, 18 Oct 2005, Andre Oppermann wrote:
We don't want them involved in Internet routing, do we?
Which, POTS telco's or ComReg? :) The latter does a good job afaict.
Do you have any idea how this works internally? Apparently not.
I have had telco people vaguely explain to me some of the issues, and abstracts of the SS7 signalling involved in calls they were handling for "me".
Phone numbers are an interesting species. On a global level they are used for call routing.
Yep.
On a local level however it's not more than a DNS name mapping to some real on-net identifier.
Within a telco? There's a myriad of ways to do it afaict. The case I know best though involved calls inbound to an operator-specific prefix (there are a set of 4 or so major telco peering exchanges in Ireland, where the domestic and transit telcos /must/ be available for peering). The operator used custom software to map specific numbers to X.25 "addresses" (I forget the exact X.25 jargon) on their own network to deliver the calls to various locations on our network.
Unfortunatly anyone calling your ported mobile number from outside the mobile networks ends up with the number range holder (you former number range holder)
The operator's prefix, yes.
who in turn has to forward the call to your current mobile operator.
Yep.
Outgoing calls are not affected because the TDM network always sets up parallel in/out paths. The return channel for your outgoing call doesn't come back through your former mobile operator.
I didn't know that, but sounds exactly what you want.
Now compare this to the Internet and IP routing. See some little differences and difficulties here and there? Yeah, I thought so.
There are huge differences in the details, obviously. The basic concepts, though, are at least interesting to consider, if not directly applicable to IP (technically at least - operationally/politically is another question):
1. Providers servicing these prefixes must peer and exchange the prefix information
2. Providers must be prepared to carry other providers' traffic into the area
2a. The providers within the area have to figure out how to bill for the difference of this traffic.
Conclusion: Applying the phone number portability to the Internet is broken by design.
Right, because phone number portability is up and running for several sets of prefixes in various regions across the world[1], so there's definitely nothing we can learn from them. ;) 1. Does the US have number portability anywhere? If so, that would be a /huge/ region, and very interesting to examine to see how they manage it. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: "Plan to throw one away. You will anyway." - Fred Brooks, "The Mythical Man Month"
Paul Jakma wrote:
On Tue, 18 Oct 2005, Andre Oppermann wrote:
On a local level however it's not more than a DNS name mapping to some real on-net identifier.
Within a telco?
No. Within a region. Normally area codes are a region; sometimes entire country codes are a region in this sense. Depends on the size of the region/country though. In some cases there is even more than one area code for the same region.

For every area code there is a 'default' carrier. Usually this is the incumbent. They've got the obligation to forward inbound calls to the true carrier of that particular number holder. However, this only works if the target carrier has some kind of interconnect with said default carrier. On top of this, forwarding doesn't come for free. Depending on my call volume with that area, I have to look into direct termination of ported numbers.

Usually the regulator, or some party designated by the regulator, runs a number portability registry with the true carrier information for every number. If I want to optimize my call routing, I have to periodically synchronize the call routing tables on my switches with that registry. In a very competitive area this has led to 30-50% of all numbers being ported and thus showing up in my routing table. As we know from the Internet DFZ, the routing table becomes very large. Fortunately for the TDM network, I have to do the routing table lookup only once, when setting up the call, not for every voice sample. Thus I pay the lookup cost only once for x minutes of voice traffic. Very DNS-like. And because of the area code hierarchy, even more so.

That's why number portability is normally only offered within the same area code or region. So you can't take your NY fixed-line phone number to LA - unless of course you have someone picking up the call in NY and transporting it to you in LA. For non-US mobile networks, number portability works the same; the mobile network(s) have got their own area codes.
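The registry-driven routing just described can be sketched as a nightly sync into a per-number exception table over per-area-code defaults. All numbers and carrier names below are hypothetical:

```python
# Hypothetical default carrier per area code (the incumbent).
default_carrier = {"212": "Incumbent-NY", "213": "Incumbent-LA"}

def sync_from_registry(registry_dump):
    """Build the per-number exception table from a periodic registry
    dump - the static, batch-updated analogue of a routing table."""
    return {number: carrier for number, carrier in registry_dump}

ported = sync_from_registry([
    ("2125550100", "CompetitorX"),
    ("2125550101", "CompetitorY"),
])

def route_call(number):
    # One lookup at call setup; the cost is paid once per call,
    # not once per voice sample - this is the "very DNS-like" part.
    return ported.get(number, default_carrier[number[:3]])

print(route_call("2125550100"))  # CompetitorX  (ported exception)
print(route_call("2125559999"))  # Incumbent-NY (area-code default)
```

With 30-50% of numbers ported, the exception table dominates, which is exactly the disaggregation worry for geographic IP addressing, except that the PSTN only pays the lookup once per call.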
There's a myriad of ways to do it afaict. The case I know best though involved calls inbound to an operator-specific prefix (there are a set of 4 or so major telco peering exchanges in Ireland, where the domestic and transit telcos /must/ be available for peering). The operator used custom software to map specific numbers to X.25 "addresses" (I forget the exact X.25 jargon) on their own network to deliver the calls to various locations on our network.
You can forget that X.25 stuff. It's only used for SS7 message routing and doesn't have anything to do with call routing as such. The telco peering points are just a technicality, there only for optimization. Most regulators have set up an "easy interconnection" policy to prevent your favorite incumbent from offering 'peering' only at land's end.
Outgoing are not affected because the TDM network always sets up parallel in/out path's. The return channel for your outgoing call doesn't come back through your former mobile operator.
I didn't know that, but sounds exactly what you want.
Sure. However this is the main difference between the TDM network and the Internet. Due to this fact many things work on the phone network like carrier pre-selection, phone number portability, etc., that do not work on an IP network.
Now compare this to the Internet and IP routing. See some little differences and diffculties here and there? Yea, I thought so.
There are huge differences in the details, obviously. The basic concepts though are at least interesting to consider, if not directly applicable to IP (technically at least - operationally/politically is another question):
1. Providers servicing these prefixes must peer and exchange the prefix information
On the phone network the prefix information is not dynamically exchanged. There are number portability registries whose data you can download every night or so and then dump it into your own switch or IN platform.
2. Providers must be prepared to carry other providers traffic into the area
Only one of them. The 'default' carrier. There are many phone networks and carriers who do not have 100% coverage.
2a. The providers within the area have to figure out how to bill for the difference of this traffic.
No. Usually the tariff is set by the regulator based on some fixed interconnection charge and network element usage.
Conclusion: Applying the phone number portability to the Internet is broken by design.
Right, cause phone number portability is up and running for several sets of prefixes in various regions across the world[1], so there's definitely nothing we can learn from them. ;)
Well, we can learn from them that circuit switched networks are different from packet switched networks. Beyond that, not much.
1. Does the US have number portability anywhere? If so, that would be a /huge/ region, and very interesting to examine to see how they manage it.
See above for an explanation. To summarize the differences between PSTN and Internet routing:
o PSTN ports numbers only within regions/area codes
o PSTN routes the return path along the forward path (symmetric)
o PSTN calls have pre-determined characteristics and performance (64kbit)
o PSTN has static routing with periodic sync from the porting database
o PSTN pays the routing table lookup only once, at call setup
o PSTN call forwarding and peering is not free or zero settlement
-- Andre
On Tue, 18 Oct 2005, Andre Oppermann wrote:
No. Within a region. Normally area codes are a region. Sometimes entire country codes are a region in this sense. Depends on the size of the region/country though. In some cases there is even more than one area code for the same region.
Ah, yes, that I know. I thought maybe you were referring to number -> GSM SIM IMSI mapping within a telco, or whatever is the equivalent for fixed-line. (How roaming is done is really interesting btw). <snip interesting details>
said default carrier. On top of this forwarding doesn't come for free.
Of course not.
the call routing tables on my switches with that registry. In a very competitive area this has led to 30-50% of all numbers being ported and thus showing up in my routing table.
Yep. Any geographic solution must consider that disaggregation will always tend towards 100%.
As we know from the Internet DFZ the routing table becomes very large.
However, it can be confined to that arbitrary area.
That's why number portability is normally only offered within the same area code or region. So you can't take your NY fixed line phone number to LA. Unless of course you have someone picking up the call in NY and transporting it to you in LA.
Yep, obviously ;).
You can forget that X.25 stuff. It's only used for SS7 message routing and doesn't have anything to do with call routing as such.
Ah, it was used for everything in that network actually - but that was a very very specialised telco network. (And they had started moving to IP when I last worked with them.)
Outgoing are not affected because the TDM network always sets up parallel in/out path's. The return channel for your outgoing call doesn't come back through your former mobile operator.
Sure. However this is the main difference between the TDM network and the Internet. Due to this fact many things work on the phone network like carrier pre-selection, phone number portability, etc., that do not work on an IP network.
I'm not sure how asymmetric paths affect portability etc. Also, IP is well capable of that, and it makes life easier.
On the phone network the prefix information is not dynamically exchanged.
Uhm, sure it is.
There are number portability registries whose data you can download every night or so and then dump it into your own switch or IN platform.
The number portability registries can be updated infrequently, yes. The telco prefix routing information, however, most definitely *is* routed dynamically. Maybe you don't have to participate in this routing (you're not a telco?), but between the telcos - most definitely ;). (If not, we were scammed for a fortune for dynamically routed redundancy of calls across a set of exchanges ;) ).
2. Providers must be prepared to carry other providers traffic into the area
Only one of them. The 'default' carrier. There are many phone networks and carriers who do not have 100% coverage.
Let me restate that: 2. One or more providers must be prepared to carry any provider's traffic into the area. Same thing. The incentive for providers to announce such an area-prefix to as many other providers outside the area as possible would be to reduce settlement fees within the area for the smaller providers, and for the big ones -> make money.
2a. The providers within the area have to figure out how to bill for the difference of this traffic.
No. Usually the tariff is set by the regulator based on some fixed interconnection charge and network element usage.
How they figure it out (with or without a regulator) doesn't matter. It just has to be figured out. We don't have IP regulators, so for IP, providers would have to figure it out all by themselves, obviously. ;) That'd be the stumbling block, I suspect.
Well, we can learn from them that circuit switched networks are different than packet switched networks. Beyond that not much.
If you want to focus on the differences between IP and POTS/GSM, sure, they're completely different. However, the point is to examine the abstract model for how telcos manage to achieve number portability without global-scope exchange of subscriber information and see what, if any, techniques could apply to IP.
To summarize the differences between PSTN and Internet routing:
o PSTN ports numbers only within regions/area codes
We're discussing what would be possible with area (rather than provider) assigned IP addresses. Ie, this is as possible for IP as PSTN, if $RIR decides to make some allocations in this way.
o PSTN routes the return path along the forward path (symetric)
I thought you said it didn't? No matter, IP is asymmetric.
o PSTN calls have pre-determined characteristics and performance (64kbit)
No bearing on routing.
o PSTN has static routing with periodic sync from porting database
The important point is that information to describe number->provider is exchanged between providers in the area only. Whether it's done by dynamic protocols, email or post is an irrelevant detail; all that matters is that we have a way to do the same in IP (we do: BGP).
o PSTN pays the routing table lookup only once when doing call setup
Well, that's simply a fundamental difference between packet and circuit switching ;).
o PSTN call forwarding and peering is not free or zero settlement
Indeed, I thought I had emphasised that working out the billing would be a major component of area-allocated IP. ;) regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: QOTD: "There may be no excuse for laziness, but I'm sure looking."
On Tue, 18 Oct 2005, Paul Jakma wrote:
If you want to focus on the differences between IP and POTS/GSM, sure, they're completely different. However, the point is to examine the abstract model for how telcos manage to achieve number portability without global-scope exchange of subscriber information and see what, if any, techniques could apply to IP.
Eg, given some arbitrary area:
- RIR assigns a prefix to that area
- For that area, for the set of ISPs providing service in that area (the area-ISP set), which are all peered with each other (eg at some IX in or near the area concerned), each ISP:
  - announces the area prefix as far and wide as they can (doing so will be an advantage for settlement with the other area-ISP set ISPs)
  - exchanges very, very specific routes of: area-site -> AS with the other area-ISP set ISPs (if they peer locally, they can keep these very specific routes local too)
  - keeps track of how much traffic to the area-prefix is handed off to other area-ISP set ISPs (and to which, obviously), and how much is received
  - periodically, for every other area-ISP, reconciles traffic handed off / received and either sends its invoice or waits for theirs, as appropriate.
Fraught with some difficulties obviously. (Politics of settlement, particularly when there is no benevolent entity to arbitrate and/or impose - before you ever get to the question of how to define an "area".) If it seems too difficult and the status quo is preferred - no worries, the hosts will figure out some kind of indirection. A bit less efficient than if ISPs would route natively/locally, but hey, it won't require any difficult decisions and co-ordination in the ISP community. And maybe that'd be for the best. ;) regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: Nowlan's Theory: He who hesitates is not only lost, but several miles from the next freeway exit.
Paul Jakma wrote:
On Tue, 18 Oct 2005, Paul Jakma wrote:
If you want to focus on the differences between IP and POTS/GSM, sure, they're completely different. However, the point is to examine the abstract model for how telcos manage to achieve number portability without global-scope exchange of subscriber information and see what, if any, techniques could apply to IP.
Eg, given some arbitrary area:
- RIR assigns a prefix to that area
- For that area, for the set of ISPs providing service in that area (the area-ISP set), which are all peered with each other (eg at some IX in or near the area concerned), each ISP:
  - announces the area prefix as far and wide as they can (doing so will be an advantage for settlement with the other area-ISP set ISPs)
  - exchanges very, very specific routes of:
area-site -> AS
with the other area-ISP set ISPs (if they peer locally, they can keep these very specific routes local too)
- keep track of how much traffic to the area-prefix is handed off to other area-ISP set ISPs (and to which, obviously), and how much is received.
- periodically, for every other area-ISP, reconcile traffic handed off / received and either send your invoice or wait for theirs, as appropriate.
Fraught with some difficulties obviously. (Politics of settlement, particularly when there is no benevolent entity to arbitrate and/or impose - before you ever get to the question of how to define an "area".)
If it seems too difficult and the status quo is preferred - no worries, the hosts will figure out some kind of indirection. Bit less efficient than if ISPs would route natively/locally, but hey it won't require any difficult decisions and co-ordination in the ISP community.
And maybe that'd be for the best. ;)
Again, this fails with the asymmetric nature of IP routing. On top of that, it fails on bandwidth issues. What if super-cheap pron hoster X is in that area streaming full-res HDTV to its suckers? I bet some participants in your service area face serious link saturation issues. None of the participants has any control over, or estimate of, the traffic that is and will be passing through them. Traffic flows will just happen there. Forget capacity planning. You'd have a hard time finding ISPs interested in that. -- Andre
On Tue, 18 Oct 2005, Andre Oppermann wrote:
Again, this fails with the asymmetric nature of IP routing.
The asymmetric nature is a plus point. It means the traffic out of the area goes out via the "correct" provider (ie the one whose customer it is).
On top of that, it fails on bandwidth issues. What if super-cheap pron hoster X is in that area streaming full-res HDTV to its suckers?
It goes via the ISP(s) which "super cheap hoster X" pays for transit.
I bet some participants in your service area face some serious link saturation issues. None of the participants have any control or estimates over the traffic that is and will be passing through them.
Yep.
Traffic flows will just happen there. Forget capacity planning. You'd have a hard time finding ISP's interested in that.
Maybe. Look at it the other way though, it's a business opportunity - you can make money by attracting as much area-destined external traffic as possible and handing it off to the correct intra-area ISP for that subscriber. The more the better, it's a potential revenue source. It's in your interest to be able to carry all the external traffic into the area that you can get. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: Profanity is the one language all programmers know best.
Paul Jakma wrote:
On Tue, 18 Oct 2005, Andre Oppermann wrote:
Traffic flows will just happen there. Forget capacity planning. You'd have a hard time finding ISP's interested in that.
Maybe.
Look at it the other way though, it's a business opportunity - you can make money by attracting as much area-destined external traffic as possible and handing it off to correct intra-area ISP for that subscriber. The more the better, it's a potential revenue source. It's in your interest to be able to carry all the external traffic into the area that you can get.
Do you care which upstream you get your connectivity from? Do you care whether it is Sprint, Level(3) or Cogent? Apparently you don't. With your proposal you don't have much/any influence on the path your packets take. -- Andre
On Tue, 18 Oct 2005, Andre Oppermann wrote:
you care whether it is Sprint, Level(3) or Cogent? Apparently you don't. With your proposal you don't have much/any influence on the path your packets take.
They might take much better routes actually than is possible today. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: The early worm gets the bird.
Paul Jakma wrote:
On Tue, 18 Oct 2005, Andre Oppermann wrote:
you care whether it is Sprint, Level(3) or Cogent? Apparently you don't. With your proposal you don't have much/any influence on the path your packets take.
They might take much better routes actually than is possible today.
Yeah, but only by chance, not by design. ;-) -- Andre
On Tue, 18 Oct 2005, Andre Oppermann wrote:
Yea, but only by chance, not by design. ;-)
Nope, by design. Routing would generally be better. The entire area would effectively be multihomed to the set of area-ISPs. There'd be some downsides too, eg where a provider attracting traffic for the prefix has some failure internally and for some reason doesn't withdraw the area-aggregate to ASes wholly external to the area. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: To give of yourself, you must first know yourself.
On Tue, 18 Oct 2005, Andre Oppermann wrote:
don't. With your proposed you don't have much/any influence on the way your packets take.
Oh, NB: It's not my proposal at all. I'm merely exploring it. ;) Further, most of the thinking on this was done by the likes of Marcelo Bagnulo, Iljitsch van Beijnum and others. The fact that both of them are working on SHIM6 now probably is telling. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: Yeah. Maybe I do have the right ... What's that stuff? -- Homer Simpson Deep Space Homer
Paul Jakma wrote:
On Tue, 18 Oct 2005, Andre Oppermann wrote:
As we know from the Internet DFZ the routing table becomes very large.
However, it can be confined to that arbitrary area.
Yes, but it's a very cumbersome process. You have to track this stuff for all regions and countries, and they all vary in how they do it. For example, your ComReg publishes a couple of tables now and then with new/changed information. (Look for ComReg 04/35, 03/143R, etc.)
You can forget that X.25 stuff. It's only used for SS7 message routing and doesn't have anything to do with call routing as such.
Ah, it was used for everything in that network actually - but that was a very very specialised telco network. (And they had started moving to IP when I last worked with them.)
SS7 over IP is quite popular these days. However call routing != SS7 message routing.
Sure. However this is the main difference between the TDM network and the Internet. Due to this fact many things work on the phone network like carrier pre-selection, phone number portability, etc., that do not work on an IP network.
I'm not sure how asymmetric paths affect portability etc. Also, IP is well capable of that, and it makes life easier.
IP routing is not symmetric, whereas circuit switching is. In the case of individual IP address portability, the return traffic always goes back to the ISP who holds that particular prefix, no matter who 'opened' the connection. If I port my static dial-up IP to my new super FTTH ISP, then suddenly up to 100Mbit of return traffic has to pass from the dial-up ISP to the FTTH ISP. I'd say this screws the dial-up ISP pretty royally. And you too, because he most likely doesn't have that much capacity.
On the phone network the prefix information is not dynamically exchanged.
Uhm, sure it is.
Nope, it's not. Can you name a phone prefix routing protocol?
There are number portability registries whose data you can download every night or so and then dump it into your own switch or IN platform.
The number portability registries can be updated infrequently, yes.
The telco prefix routing information, however, most definitely *is* routed dynamically. Maybe you don't have to participate in this routing (you're not a telco?), but between the telcos - most definitely ;).
(If not, we were scammed for a fortune for dynamically routed redundancy of calls across a set of exchanges ;) ).
That works differently. In the PSTN you always have multiple routes to a destination. If you have a direct trunk between two COs, it will fill that first. When the direct trunk is full, the local switch has got an overflow route towards a neighboring or higher switch. It can have multiple overflow routes with different priorities. You can replace 'full trunk' with 'dead trunk' to get your redundancy. However, there is no dynamic call routing as we know it from BGP or OSPF - at least not directly. Some switch vendors have developed call optimization software which runs in some sort of central intelligence center in the network and tries to optimize trunk usage and priorities based on statistical and historical data.
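The static overflow routing just described can be sketched as priority-ordered trunk selection. The trunk names, priorities, and circuit counts are hypothetical:

```python
# Hypothetical trunk table for one switch: direct trunk first, then
# overflow routes in priority order. This table is computed offline
# and loaded onto the switch - no dynamic protocol is involved.
trunks = [
    {"priority": 1, "name": "direct-to-CO2", "capacity": 30, "in_use": 30},
    {"priority": 2, "name": "via-tandem-1", "capacity": 120, "in_use": 75},
    {"priority": 3, "name": "via-tandem-2", "capacity": 120, "in_use": 120},
]

def select_trunk(trunks):
    """Pick the highest-priority trunk with a free circuit, or None.

    'Full trunk' and 'dead trunk' are handled identically: either way
    the switch simply falls through to the next overflow route.
    """
    for trunk in sorted(trunks, key=lambda t: t["priority"]):
        if trunk["in_use"] < trunk["capacity"]:
            return trunk["name"]
    return None  # all routes exhausted: the call is rejected

print(select_trunk(trunks))  # via-tandem-1: the direct trunk is full
```

This is load sharing by static fallback, not route computation, which is the distinction Andre is drawing against BGP/OSPF.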
2a. The providers within the area have to figure out how to bill for the difference of this traffic.
No. Usually the tariff is set by the regulator based on some fixed interconnection charge and network element usage.
How they figure it out (with or without a regulator) doesn't matter. It just has to be figured out. We don't have IP regulators, so for IP providers would have to figure it out all by themselves obviously. ;)
That'd be the stumbling block I suspect.
The stumbling block is that all IP packets return to the prefix holder (the old ISP) and the end-user bandwidth is not fixed.
To summarize the differences between PSTN and Internet routing:
o PSTN ports numbers only within regions/area codes
We're discussing what would be possible with area (rather than provider) assigned IP addresses. Ie, this is as possible for IP as PSTN, if $RIR decides to make some allocations in this way.
$RIR making allocations that way is not sufficient. It would need regulatory backing to enforce IP address portability. No established carrier is very interested in porting IP addresses to competitors.
o PSTN routes the return path along the forward path (symetric)
I thought you said it didn't? No matter, IP is asymmetric.
IP is asymmetric and the PSTN is symmetric. There you have the first major problem with IP in this scenario.
o PSTN calls have pre-determined characteristics and performance (64kbit)
No bearing on routing.
Very much so. See my Dial-Up vs. FTTH ISP example.
o PSTN has static routing with periodic sync from porting database
The important point is that information to describe number->provider is exchanged between providers in the area only. Whether it's done by dynamic protocols, email or post is an irrelevant detail; all that matters is that we have a way to do the same in IP (we do: BGP).
The differences are far greater. See my description of call routing above.
o PSTN call forwarding and peering is not free or zero settlement
Indeed, I thought I had emphasised that working out the billing would be a major component of area-allocated IP. ;)
As you can see in the dial-up vs. FTTH ISP case, as the new ISP you don't have a chance to differentiate yourself from anyone else through better routing or performance or QOS or whatever. If the performance at the old ISP was lousy before, it is lousy after porting the IP address, because it's still the shitty bandwidth of that old ISP. -- Andre
On Tue, 18 Oct 2005, Andre Oppermann wrote:
Yes, but it's a very cumbersome process. You have to track this stuff for all regions and countries, and they all vary in how they do it. For example, your ComReg publishes a couple of tables now and then with new/changed information. (Look for ComReg 04/35, 03/143R, etc.)
Presumably, for IP, we'd use databases and processes more typical to normal IP ops, eg RIR databases and such, to record which ISPs can service which geo-prefixes. The subscriber_prefix->provider information itself can just be dynamically routed locally.
SS7 over IP is quite popular these days. However call routing != SS7 message routing.
By call-routing you mean the actual circuit switching of each call? I don't mean that, I mean the number routing, which SS7 /does/ do - you referred to it as being more analogous to DNS iirc in operation.
IP routing is not symmetric whereas circuit switching is. In a case of individual IP address portability the return traffic always goes back to the ISP who has that particular prefix. No matter who 'opened' the connection. If I port my static dial-up IP to my new super FTTH ISP then suddenly up to 100Mbit of return traffic have to pass from Dial-Up ISP to FTTH ISP. I'd say this screws the Dial-Up ISP pretty royally. And you too because he most likely doesn't have that much capacity.
Ah, multi-homed. We haven't considered this case yet, but in the above, you're a customer of both these ISPs. I'd say the dial-up ISP would ask you sharpish to relist your "home" ISP as the FTTH ISP, while charging you for that 100Mb/s of traffic (as the contract would provide for). The same thing can happen today with multihomed PI customers - what would happen today?
Nope, it's not. Can you name a phone prefix routing protocol?
Ehm, SS7 ;). You might call it DNS-like because it's request based, but it still provides routing information.
However there is no dynamic call routing as we know it from BGP or OSPF. At least not directly. Some switch vendors have developed call optimization software which runs in some sort of central intelligence center in the network and tries to optimize the trunk usage and priorities based on statistical and historical data.
I meant only the routing information, not the switching (which is clearly completely different in packet switched IP).
The stumbling block is that all IP packets return to the prefix holder (the old ISP) and the end-user bandwidth is not fixed.
See above, what happens today? Multihomed sites can already try to screw upstreams in this way, so no difference.
$RIR making allocations that way is not sufficient. It would need regulatory backing to enforce IP address portability.
Probably.
Every established carrier is not very interested in porting IP addresses to competitors.
Why not, if you can make money off it. It's a two-edged sword too: if you must easily port addresses to competitors, you can also get their customers more easily. Whether this is the right solution depends on whether ISPs would prefer such a mechanism to end-host based solutions. I can't answer that question. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: If you stand on your head, you will get footprints in your hair.
Paul Jakma wrote:
On Tue, 18 Oct 2005, Andre Oppermann wrote:
SS7 over IP is quite popular these days. However call routing != SS7 message routing.
By call-routing you mean the actual circuit switching of each call? I don't mean that, I mean the number routing, which SS7 /does/ do - you referred to it, iirc, as being more analogous to DNS in operation.
Nope, it's not. Can you name a phone prefix routing protocol?
Ehm, SS7 ;).
You might call it DNS-like because it's request based, but it still provides routing information.
SS7 is not a routing protocol. SS7 is a transport stack, like what we commonly refer to as TCP/IP. There are a number of protocols that run atop the basic SS7 transport network. Some protocols are datagram oriented and some are session oriented. For call handling, ISUP or derivatives are used. The main difference to the IP world is the limited scope in the PSTN. A PSTN switch consults its (static) number routing table for the trunk to forward the call on. Then it contacts the switch at the other end of the trunk and hands over further forwarding to it. If that doesn't work out, the circuit gets shut down backward, switch to switch.

For special numbers like 800 and 900 there is another protocol called IN (Intelligent Network), which is nothing else than DNS. Each special number has a 'real' number in its shadow. The originating switch requests the real number through IN and then the normal call forwarding happens. IN may deliver different numbers based on the location of the originating switch, or the time of day, or any other criteria you may think of. A little bit like Akamai, if you wish.

However, there is nothing akin to BGP or OSPF in the SS7 suite of protocols. All the forwarding/trunk tables are computed offline for each switch and then stored on the switches by some bulk data transfer. Variations are emerging with extended IN platforms, where you have one or more central databases of forwarding information, but that is just a large geographically distributed switch then. -- Andre
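The IN lookup described above can be sketched in a few lines. This is a toy illustration only: the numbers, region names, and function name are all invented here, not taken from any real IN implementation.

```python
# Toy sketch of an IN (Intelligent Network) style lookup: a special
# number (e.g. an 800 number) shadows a 'real' routable number, and
# the answer may differ per originating switch. All numbers and
# region names below are made up for illustration.
IN_DATABASE = {
    # special number -> {originating region -> real routable number}
    "18005551234": {"east": "12125550100", "west": "14155550100"},
}

def in_lookup(special_number, originating_region):
    """Return the 'real' number behind a special number, DNS-style.

    If the number is not special, it is routed as-is by the normal
    static trunk tables.
    """
    entry = IN_DATABASE.get(special_number)
    if entry is None:
        return special_number  # not a special number; route as dialed
    return entry[originating_region]

print(in_lookup("18005551234", "east"))  # -> 12125550100
print(in_lookup("18005551234", "west"))  # -> 14155550100
```

As in the text, the lookup only translates the number; the actual call forwarding then proceeds over the ordinary, statically computed trunk tables.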
Right, cause phone number portability is up and running for several sets of prefixes in various regions across the world[1], so there's definitely nothing we can learn from them. ;)
Well, we can learn from them that circuit switched networks are different than packet switched networks. Beyond that not much.
I disagree. There are many parallels and in many ways the telephony operators are struggling with the same kinds of problems that we are. NANPA has forecast that the North American number plan will be exhausted within 20 years. Just like the IPv4 address space. Their plan is to extend the number plan by two digits using 4-digit area codes and 4 digit central office codes. Rather like IPv6's extended address length. The new digits will be introduced at the same time so that everyone will dial an extra digit at the end of their existing area code, and another extra digit at the beginning of their central office code. Today you would dial (703)227-0660 to reach ARIN's help desk. After the change you would dial (7030)0227-0660. Full details here: http://www.atis.org/inc/docs/finaldocs/020107029.doc NANPA's website points to more information. http://www.nanpa.com/index.html There is also a North American Numbering Council that meets regularly and has several working groups. http://www.nanc-chair.org/docs/documents.html It is foolish to regard people outside the IP networking industry as inferior. Good ideas can come from anywhere and we can often understand our own area of interest much better by comparing and contrasting with other similar areas of interest. --Michael Dillon
Michael.Dillon@btradianz.com wrote:
Right, cause phone number portability is up and running for several sets of prefixes in various regions across the world[1], so there's definitely nothing we can learn from them. ;)
Well, we can learn from them that circuit switched networks are different than packet switched networks. Beyond that not much.
I disagree. There are many parallels and in many ways the telephony operators are struggling with the same kinds of problems that we are. NANPA has forecast that the North American number plan will be exhausted within 20 years. Just like the IPv4 address space.
Their plan is to extend the number plan by two digits using 4-digit area codes and 4 digit central office codes. Rather like IPv6's extended address length. The new digits will be introduced at the same time so that everyone will dial an extra digit at the end of their existing area code, and another extra digit at the beginning of their central office code. Today you would dial (703)227-0660 to reach ARIN's help desk. After the change you would dial (7030)0227-0660. Full details here: http://www.atis.org/inc/docs/finaldocs/020107029.doc
NANPA's website points to more information. http://www.nanpa.com/index.html
There is also a North American Numbering Council that meets regularly and has several working groups. http://www.nanc-chair.org/docs/documents.html
It is foolish to regard people outside the IP networking industry as inferior. Good ideas can come from anywhere and we can often understand our own area of interest much better by comparing and contrasting with other similar areas of interest.
There is a major difference between phone numbers and IP addresses which makes direct comparisons harder. Phone numbers are more like domain names (and the email addresses behind them) than IP addresses. People use phone numbers the same way they use domain names: they remember them and use them to reach other people or companies. I haven't seen many billboards with IP addresses on them lately. Nobody cares about the actual IP address. Only the computer does, at the time of the DNS lookup. So an IP address is only used as the underlying transport vehicle for data. For the end user it doesn't have any direct significance. A phone number has significance to the end user and also has a hybrid function as an underlying routing element, to varying degrees.

The entire problem set with IP address portability comes from two issues: ease of ISP changes and redundant connectivity. The former could theoretically be solved with better procedures and methods for host address assignment. However, it still requires some labor-intensive transition period, and the IP addresses are tangled up with other things like DNS and so on. The second issue is specific to the IP architecture. The PSTN, due to its symmetric nature, doesn't have the redundancy problem to the same extent as the Internet. For the IP prefix, however, you have to participate in the global routing system to survive link losses. Without any shim6 or SCTP stuff, that is.

Again, phone numbers and their portability can and should not be compared with the IP address portability issues. They're very different animals. -- Andre
Again, phone numbers and their portability can and should not be compared with the IP address portability issues. They're very different animals.
That's your elephant. My elephant looks different. Phone numbers and IP addresses are exactly the same: they are numbers used to identify the location to which I want to connect. They are allocated from a numbering plan with a fixed number of digits/bits. The numbering plans are running out of space. One solution being used in both telephony and IP networks is to carve out smaller allocations (CIDR/pooling) to make the number plan last longer.

Sometimes the fact that phone networks are like pink elephants, not grey, is important. Other times the grey elephant keepers watch the behavior of pink elephants to learn about elephants in general. I never suggested that one could mindlessly apply techniques from the telephony world in the IP network world. But we can understand our problems better if we compare them to the telephony world, the ATM world, and maybe even the world of Advanced Data Path Routing Techniques for Three Tier KVM Networks? http://www.tron.com/i0000032.html

There are also lessons to be learned outside the world of telecoms networks by studying the distribution of market towns in ancient Mesopotamia or the crystalline lattice of the skeletons of diatoms and dinoflagellates. --Michael Dillon
On Wed, 19 Oct 2005, Michael.Dillon@btradianz.com wrote:
Again, phone numbers and their portability can and should not be compared with the IP address portability issues. They're very different animals.
That's your elephant. My elephant looks different.
Survey says... BZZZZZT. Read about SS7 LNP implementation before speaking, please. They are very different creatures. Something that resembles telephony LNP will not scale to the quantity of micro-streams currently used by WWW applications. The reason it works (FSVO "works") for telephony is because, unlike TCP streams, telephone circuits are comparatively large streams with much longer keepalive times. -- -- Todd Vierling <tv@duh.org> <tv@pobox.com> <todd@vierling.name>
Survey says... BZZZZZT.
Your argument is fallacious.
Read about SS7 LNP implementation before speaking, please.
I never said anything about SS7 implementation of LNP.
They are very different creatures. Something that resembles telephony LNP will not scale to the quantity of micro-streams currently used by WWW applications. The reason it works (FSVO "works") for telephony is because, unlike TCP streams, telephone circuits are comparatively large streams with much longer keepalive times.
This is a strawman argument. I certainly agree with what you have said about TCP streams versus calls on the PSTN, but that has nothing whatsoever to do with what I was talking about.

Why is it that whenever people suggest that the IP networking world can learn from the experience of the telephony world, some people assume that the proposal is to imitate the telephony world in every detail? The fact is that both worlds are completely different in the details. But these different details lead the telephony world to make different technology choices and then gain real world operational experience with those choices. As the IP world evolves and changes (remember this started with a discussion of IPv6), it is possible that some of the hard-won experience from the telephony world can be applied in the IP world. No doubt it will be necessary to implement things differently in the IP world because of the details. But it is crazy to reject the hard-won experience of the telephony world wholesale just because you worship at the temple of IP.

In any case, the telephony world owns and runs the Internet today. Bellhead and nethead arguments belong in the past. Today's bellheads are running IP networks and VoIP along with all the PDH, SDH, X.25, SS7, ATM, Frame Relay, etc. --Michael Dillon
On Oct 19, 2005, at 9:39 AM, Michael.Dillon@btradianz.com wrote:
Why is it that whenever people suggest that the IP networking world can learn from the experience of the telephony world, some people assume that the proposal is to imitate the telephony world in every detail?
Seems to me to be a species of the broader attitude among some that any analogy or reference to anything outside of the IP world is inherently flawed. On this view, there is no such thing as a good analogy or comparison. All are equally bad and misleading. Full stop. Perhaps an ungenerous observation, but this is exactly the sort of attitude that one would expect of narrow experts in the field who have no knowledge whether and how it might resemble or relate to anything else... Private editorial ends. TV
At 07:36 AM 10/18/2005, Andre Oppermann wrote: [... items deleted ...] To summarize the differences between PSTN and Internet routing:
o PSTN ports numbers only within regions/area codes
o PSTN routes the return path along the forward path (symmetric)
o PSTN calls have pre-determined characteristics and performance (64kbit)
o PSTN has static routing with periodic sync from porting database
o PSTN pays the routing table lookup only once, when doing call setup
o PSTN call forwarding and peering is not free or zero settlement
-- Andre
Largely true; influenced by history and the difference between circuit-switched networks and packet-switched networks. LNP is more like DNS than multihoming. Sort of. Imagine TCP using domain names rather than IP addresses.

I should note, however, that in the U.S., Number Portability (LNP) rarely uses call forwarding anymore. Except in legacy rural areas, the LNP dip occurs before reaching the host office, and the call is thus shunted to the correct carrier earlier upstream. At minimum it is done by the N+1 switch. However, it is common for the IXCs (LD carriers) and CLECs to do it even earlier, to avoid paying the local ILEC database lookup fees. In that scenario, the call routes perfectly to the correct carrier.

BTW, telephone networks generally do not multihome and are very fragile. Node (switch) failure brings down large sections of the network. They instead concentrate on 99.99%+ uptime for the switches themselves. In other words, they concentrate on internal component redundancy and same-destination route redundancy rather than handling an outage of the entire switch. The SS7 network has removed some of this fragility, but not all. Not by a long shot.

Describing this in a picture:

Internet way: "route around problems"

    A - B - C
     \     /
      \-D-/

The Telco way: "try to make problems never happen"

    A--B--C
    A--B--C

Where the two A's in the Telco model are essentially the same equipment in the same room with redundant components.

Anyway, ... TCP using DNS rather than IP?... Interesting thought. John
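The "route around problems" picture can be sketched as a tiny failover search. This is only an illustration of the idea, not of any real routing protocol; the topology matches the A/B/C/D diagram and everything else is made up.

```python
import collections

# Topology from the "Internet way" diagram: A-B-C with a backup path
# through D. A breadth-first search stands in for "the network finds
# another path" when a node fails.
links = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}

def path(src, dst, down=frozenset()):
    """Shortest path from src to dst, avoiding failed nodes in `down`."""
    queue = collections.deque([[src]])
    seen = {src}
    while queue:
        p = queue.popleft()
        if p[-1] == dst:
            return p
        for nxt in links[p[-1]] - set(down) - seen:
            seen.add(nxt)
            queue.append(p + [nxt])
    return None  # destination unreachable

print(path("A", "C"))              # a 3-hop path, via B or D
print(path("A", "C", down={"B"}))  # B failed: ['A', 'D', 'C']
```

In the Telco model, by contrast, there is no second path to search: the redundancy lives inside the one box, so a whole-node failure takes the destination down.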
1. Does the US have number portability anywhere? If so, that would be a /huge/ region, and very interesting to examine to see how they manage it.
In the USA this is called LNP (Local Number Portability). This article has a couple of pages of history and then a technical overview of how LNP works: http://scholar.lib.vt.edu/ejournals/JOTS/Winter-Spring-2001/pdf/obermier.pdf

This document explains the architecture of LNP in today's phone network: http://www.verisign.com/stellent/groups/public/documents/white_paper/001950....

However, LNP is not as simple as most laypeople think. It has other applications than simply consumer convenience. For instance, disaster recovery: http://www.neustar.com/pressroom/datasheets/DisasterRecPress.pdf

Read this description of number pooling http://www.verisign.com/stellent/groups/public/documents/white_paper/001949.... and reflect on how similar this seems to injecting longer prefixes into BGP (hole punching) to support moving a customer from another network.

LNP and routing are the same problem. The details of the solutions differ because the technology environment and constraints differ. But you will never understand IP routing unless you understand how non-IP networks solve these same problems. That's why some people use RIP to teach routing even though it is considered bad practice to run RIP on any network in this day and age. People need to learn routing theory separately from "How to configure BGP on your brand-X boxes". --Michael Dillon
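The "hole punching" analogy rests on longest-prefix matching: a more-specific route injected for a moved customer wins over the old provider's covering aggregate. A minimal sketch (the prefixes and next-hop labels are invented for illustration):

```python
import ipaddress

# A covering aggregate held by the old provider, plus a "punched hole":
# a longer, more-specific prefix injected for a customer who moved.
routes = {
    ipaddress.ip_network("10.0.0.0/16"): "old-ISP",   # aggregate
    ipaddress.ip_network("10.0.42.0/24"): "new-ISP",  # punched hole
}

def lookup(addr):
    """Longest-prefix match: among covering routes, the longest wins."""
    addr = ipaddress.ip_address(addr)
    matches = [net for net in routes if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[best]

print(lookup("10.0.42.7"))  # moved customer -> new-ISP
print(lookup("10.0.1.1"))   # rest of the block -> old-ISP
```

Number pooling works analogously: blocks smaller than the traditional allocation unit are carved out and pointed at a different carrier, while the covering allocation stays where it was.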
On Mon, 2005-10-17 at 11:39 +0100, Michael.Dillon@btradianz.com wrote:
Another alternative is to force-align allocation and topology in some way /other/ than by "Providers" (geographical allocation in whatever hierarchy, IX allocation, whatever), such that networks were easily aggregatable. Lots of objections though (the "providers and geography don't align" one though is ultimately slightly bogus, because with non-provider-aligned allocation policies in place it would be in providers interests to align their peering to match the allocation policy).
I think we need a researcher to sit down and figure out exactly what this would look like in a sample city and a sample national provider.
There has been quite some research on it, there were ideas, there was even talk of a vendor going to implement it, but it never happened. It won't work because of cash reasons (read: telcos/transit providers don't want it).

For your 'city data' check: http://unstats.un.org/unsd/demographic/default.htm or, for pre-processed files, http://arneill-py.sacramento.ca.us/ipv6mh/ under "Geographical data". Especially http://arneill-py.sacramento.ca.us/ipv6mh/geov6.txt will be quite to your liking.

8<---------------------
Allocation: IANA block = 2346::/16
Ratio in use: one /48 site for 4 persons
Allocation: 6bone block = 3FFE:FB00::/24
Ratio in use: one /48 site for 1024 persons
-------------------->8

Which indeed seems quite reasonable. The problem with this is: 'who is paying for which traffic, and to whom?' One solution is an overlay network....

Note well: though this solves multihoming, it doesn't solve relocation. If your company moves, it has to renumber, and renumbering is no fun. Then again, you can most likely start from mostly scratch in the new location, and you might be able to tunnel (parts of) the old allocation to the new site, depending on which subnets/hosts one has moved already. Greets, Jeroen
I think we need a researcher to sit down and figure out exactly what this would look like in a sample city and a sample national provider.
especially: http://arneill-py.sacramento.ca.us/ipv6mh/geov6.txt
will be quite of your liking.
Not at all. This proposal is all about allocating addresses based on country boundaries and I reject this model. The Internet is a network of cities, not countries. The national boundaries are completely random in technical terms, but the cities are not random. The cities are where the people are, where the railways and roads are, where the channels of trade and communication begin and end.
Which indeed seems quite reasonable. The problem with this is: 'who is paying for which traffic and to whom'
Customers pay for all the traffic on their link and they pay their money to their Internet access provider. But that is beside the point.
Notez bien, though this solves multihoming, it doesn't solve relocation, thus if your company moves it has to renumber, and renumbering is no fun,
Not true. If a company moves across the city and connects to a different access provider, they don't have to renumber. After all, they are still in the same city. It just means that inside that city, the providers will carry an additional longer prefix in their routing tables. But the global view will be unchanged. In fact, the smaller global table allows for much more detail to be carried locally, given the same constraints of RAM and processing power in routers. --Michael Dillon
On Mon, 17 Oct 2005 Michael.Dillon@btradianz.com wrote:
http://arneill-py.sacramento.ca.us/ipv6mh/geov6.txt will be quite of your liking.
Not at all. This proposal is all about allocating addresses based on country boundaries and I reject this model. The Internet is a network of cities, not countries. The national boundaries are completely random in technical terms, but the cities are not random. The cities are where the people are, where the railways and roads are, where the channels of trade and communication begin and end.
Uhh, I'd say the internet is a network of networks, not a network of cities. :) But you bring a good point about railways. But are there enough privately-owned railways to make a good analogue? (This certainly doesn't apply to roads) I.e., when a dozen different railway companies want to provide transport, do each and every one of them build (parallel) tracks, stations, and trains on each city? I do not think so, but I do not know if any sort of "roaming" agreements exist. Or are you arguing that the basic infra (like the fibers) should be city/government/etc. controlled so it could be used in more cost-effective ways by all providers? -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings
On 17-okt-2005, at 14:18, Jeroen Massar wrote:
Another alternative is to force-align allocation and topology in some way /other/ than by "Providers" (geographical allocation in whatever hierarchy, IX allocation, whatever), such that networks were easily aggregatable. Lots of objections though (the "providers and geography don't align" one though is ultimately slightly bogus, because with non-provider-aligned allocation policies in place it would be in providers interests to align their peering to match the allocation policy).
The current assumption is that all aggregation happens on ISP. Replacing that with the assumption that all aggregation will happen on geography isn't all that useful. The important thing here is that you can aggregate on pretty much anything: hair color, router vendor, market capitalization, you name it. In the end, you always aggregate on the way the addresses are given out, which may or may not be meaningful. Aggregating on provider is the most powerful because the aggregate leads you fairly directly to the place where you need to go, as long as the destination is single homed.

But suppose at some point we end up with a routing table consisting of 10 million PI blocks from multihomers and some unimportant stuff that disappears in the error margin (i.e., those 5000 IPv6 /20s for huge ISPs). Also suppose that it's possible to build a reasonably cost effective router that handles 1M routes, but this router technology doesn't scale to the next order of magnitude. The simple solution is to build a big router that actually consists of 11 small ones: 10 sub-routers that each hold one tenth of the global routing table, and an 11th sub-router that distributes packets to the sub-router that holds the right part of the global routing table. So sub-router 1 has the part of the global IPv6 routing table that falls within 2000::/6, sub-router 2 has 2400::/6, sub-router 3 2800::/6, and so on. So we're aggregating here, but not really "on" something. This has the unpleasant side effect that we now have to spend 11 times more money to keep a 10 times larger routing table.

Alternatively, we can trade hardware costs for bandwidth, by having 10 routers that are present in the network anyway each handle part of the global routing table. So a router in Boston would handle everything under 2000::/6, a router in Chicago 2400::/6, one in Seattle 2800::/6, and so on. Obviously this isn't great if you're in Boston and your address is 2800::1, but it doesn't require additional hardware.
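The /6 slicing of the distribution sub-router can be sketched in a couple of lines; this is just an illustration of the arithmetic implied by the 2000::/6, 2400::/6, 2800::/6 example, with a function name of my own invention:

```python
import ipaddress

# The 11th ("distribution") sub-router only looks at the top 6 bits
# of the destination and hands the packet to the sub-router holding
# that /6 slice of the global table: 2000::/6 -> 1, 2400::/6 -> 2, ...
def sub_router_for(dest):
    top6 = int(ipaddress.ip_address(dest)) >> 122  # top 6 bits
    return top6 - (0x2000 >> 10) + 1               # 2000::/6 is slice 1

print(sub_router_for("2000::1"))  # -> 1
print(sub_router_for("2400::1"))  # -> 2
print(sub_router_for("2800::1"))  # -> 3
```

The point of the sketch is that this distribution step needs no routing table at all, only a fixed slice of address bits, which is exactly why it is "aggregating, but not really on something".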
This scheme can be optimized by aligning addressing and geography to a reasonable degree. So if you're in Boston, you'd get 2000::1 rather than 2800::1. But that doesn't magically shrink the routing table to one route per city. In the case of Boston, it's likely that the source and destination ISPs for a certain packet don't interconnect within the city itself. So someone sitting in New York probably won't see much difference: he or she still has to carry all the routes for multihomers in Boston. Some of these will point to her own customers in Boston, some to peers in New York, others to peers in DC, and so on.

However, as distance increases, the difference between "this packet needs to go to a customer in Boston", "this packet needs to go to a peer in New York" and "this packet needs to go to a peer in DC" becomes meaningless, so it's possible to replace a large number of individual routes with a single city or region aggregate. So even without magic interconnection dust, aggregation based on geographical addressing can have benefits.

However, it has several limitations. An important one is that early exit routing is replaced by late exit routing. Also, when someone multihomes by connecting to ISPs in Miami and Tokyo, you don't get to aggregate. But in the worst case you simply don't get to aggregate, whether because people multihome in weird ways, for traffic engineering reasons, or because of lack of interconnection (though as interconnects become really sparse, the savings go up again), so you're no worse off than today. And if and when the routing tables explode and routers can't keep up, having geographical addressing in place for multihoming allows for a plan B that we don't have today.
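The distance-dependent aggregation argument can be made concrete with a small sketch. All the prefixes below are invented for illustration; the only real constraint is that the city aggregate must cover every multihomer route it replaces.

```python
import ipaddress

# Hypothetical city block and the multihomer prefixes inside it.
boston_aggregate = ipaddress.ip_network("2000:1000::/32")
boston_multihomers = [
    ipaddress.ip_network("2000:1000:1::/48"),
    ipaddress.ip_network("2000:1000:2::/48"),
    ipaddress.ip_network("2000:1000:3::/48"),
]

# The aggregate is only valid if it covers every specific route.
assert all(n.subnet_of(boston_aggregate) for n in boston_multihomers)

def routes_seen(far_away):
    """Nearby routers keep the per-multihomer detail; distant ones
    can replace it all with the single city aggregate."""
    return [boston_aggregate] if far_away else list(boston_multihomers)

print(len(routes_seen(far_away=False)))  # nearby viewer: 3 routes
print(len(routes_seen(far_away=True)))   # distant viewer: 1 route
```

The savings scale with the number of multihomers per city, which is why the scheme only pays off if addressing and geography are actually aligned.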
I think we need a researcher to sit down and figure out exactly what this would look like in a sample city and a sample national provider.
There has been quite some research on it, there where ideas, there was even talk of a vendor going to implement it, but it never happened. It won't work because of cash reasons (read: telco/transit don't want it)
I'm not familiar with that... Do you have a reference?
For your 'city data' check: http://unstats.un.org/unsd/demographic/default.htm
or for pre-processed files: http://arneill-py.sacramento.ca.us/ipv6mh/ under "Geographical data".
Note that this page hasn't been updated in more than two years. When Michel started this initiative the IETF multihoming in IPv6 (multi6) working group was pretty much dead and it certainly wasn't considering any input. However, our efforts resulted in the wg coming back to life again, considering input, rejecting most of it, and start work on a solution in a new wg: shim6. (Paul Jakma wrote something to the effect that I am involved with shim6 so that says something about other options. It doesn't, as far as I'm concerned. But shim6 is a worthy pursuit in its own right.) For anyone who wants to read the latest version of all of this (still two years old, though): http://www.muada.com/drafts/draft-van-beijnum-multi6-isp-int-aggr-01.txt
especially: http://arneill-py.sacramento.ca.us/ipv6mh/geov6.txt
Which indeed seems quite reasonable. The problem with this is: 'who is paying for which traffic and to whom'
Just because I press the "up" button for the elevator doesn't mean I'm going to the top floor. Still, having one button for "down" and one for "up" rather than having a different one for each floor seems to work well for this initial part. Once you get inside the elevator you still have to pick a floor, of course. Iljitsch
Hi, On Wed, 19 Oct 2005, Iljitsch van Beijnum wrote:
On 17-okt-2005, at 14:18, Jeroen Massar wrote:
Another alternative is to force-align allocation and topology in some way /other/ than by "Providers" (geographical allocation in whatever hierarchy, IX allocation, whatever), such that networks were easily aggregatable. Lots of objections though (the "providers and geography don't align" one though is ultimately slightly bogus, because with non-provider-aligned allocation policies in place it would be in providers interests to align their peering to match the allocation policy).
Iljitsch, fix your attributions, that's my text. (Jeroen might not appreciate you attributing my incoherent mumblings to him).
The current assumption is that all aggregation happens on ISP. Replacing that with the assumption that all aggregation will happen on geography isn't all that useful.
That's a bold assertion. You'll have to show why, because the fact is that that is how other networks achieve portability (after which multi-homing is easy). Fact is, I can change my fixed phone provider and my mobile phone provider, but I can't change my ISP without some pain (and I'm a /tiny/ site ;) ).
The important thing here is that you can aggregate on pretty much anything: hair color, router vendor, market capitalization, you name it.
Hmm, no ;).
In the end, you always aggregate on the way the addresses are given out, which may or may not be meaningful.
No, you have to aggregate on topology.
Aggregating on provider is the most powerful because the aggregate leads you fairly directly to the place where you need to go as long as the destination is single homed.
Sure. But it means you're tied to the provider (for that address at least).
interconnect within the city itself. So someone sitting in New York probably won't see much difference: he or she still has to carry all the routes for multihomers in Boston. Some of these will point to her own customers in Boston, some to peers in New York, others to peers in DC, and so on.
But at least, to the rest of the world, all the multihomers in Boston and New York have reduced down to just 2 routes. That's a significant step forward. (And eventually those ISPs back-hauling lots of very specific Boston customer prefixes to New York will figure out they should just peer in Boston and confine the very specific Boston routes there).
limitations. An important one is that early exit routing is replaced by late exit routing.
Can you expand on this?
Also, when someone multihomes by connecting to ISPs in Miami and Tokyo you don't get to aggregate.
Or, that entity just gets two prefixes, one for its Miami site allocated from the Miami area prefix and one for its Tokyo site allocated from the Tokyo area prefix. Really large networks with their own internal-transit across multiple areas for whom this would not work can just get a global prefix. But those kinds of networks are rare, a fraction of multi-homers. So it's still a step forward.
really sparse the savings go up again) so you're no worse off than today.
You're better off, because small/medium sites can be aggregated with all the other small/medium sites in their $AREA. The really large trans-$AREA networks are rare. Let's be honest, the reasons that make $AREA-allocated addresses and aggregation difficult are /not/ technical. ;)
(Paul Jakma wrote something to the effect that I am involved with shim6 so that says something about other options. It doesn't, as far as I'm concerned. But shim6 is a worthy pursuit in its own right.)
I said "possibly is telling" ;). But apologies for any presumption ;). regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: fortune: not found
Yo Michael! On Mon, 17 Oct 2005, Michael.Dillon@btradianz.com wrote:
Here, the suggestion is that netblocks should be allocated to cities, not to providers. Within a city, providers would get a subset of the city address block to meet their local infrastructure needs. They would interconnect with each other a local exchange points to exchange local traffic
And who is going to force the ISPs to interconnect at the city level? For competitive reasons there is no peering in my city. The nearest peering points are several hundred miles away, in different directions, and even those are not shared in common by the local ISPs. RGDS GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 20340 Empire Blvd, Suite E-3, Bend, OR 97701 gem@rellim.com Tel:+1(541)382-8588 Fax: +1(541)382-8676
And who is going to force the ISPs to interconnect at the city level?
Customers, of course! Who else ever forces ISPs to do anything?
For competitive reasons there is no peering in my city. The nearest peering points are several hundred miles away, in different directions, and even those are not shared in common by the local ISPs.
Doesn't make sense, does it? As the Internet becomes more and more mission critical for more people, you will see non-technical customers questioning the chaos. Once the insurance companies realize that ISPs in some areas are not providing proper resiliency, they will raise the premiums for business insurance in those areas and this will provide the economic incentive for customers to force ISPs to build a sensible 21st century utility architecture. --Michael Dillon
On Mon, Oct 17, 2005 at 11:39:37AM +0100, Michael.Dillon@btradianz.com wrote:
Here, the suggestion is that netblocks should be allocated to cities, not to providers. Within
I am a multihomed customer and my ISPs are in two different cities. What are my IP addresses going to be? This situation happens all the time, by the way. In fact, the closer you get to smaller, consumer grade connectivity, the more you will see backhauling below the network layer in providers' networks that will make this happen. -Phil
Here, the suggestion is that netblocks should be allocated to cities, not to providers. Within
I am a multihomed customer and my ISPs are in two different cities. What are my IP addresses going to be?
Your assumptions are flawed. I never suggested that there would be a flag day. I never suggested that geotopological addressing would work everywhere or solve all problems. I never suggested that we should turn off the existing provider aggregatable IP address allocations.

I just suggested an alternative way of issuing addresses so that they are geotopologically aggregatable, not provider aggregatable. There are sufficient reserved addresses in the IPv6 address space to do this. We could start issuing geotopological netblocks and try it for 5 years or so to see whether it works better or not.

In any case, you are located in Montreal which is such a major city that I expect any ISP selling service (geotopologically) in Montreal would use Montreal address space even if they backhauled at layer two to some other city.

However, there will likely be lots of situations where people in small towns roughly equidistant from two cities will choose to multihome with links to separate cities. This will either have to be done using provider-aggregated addresses or by using addresses from one of the cities with a longer prefix inside that city's routing table to direct the traffic to the neighboring city. If this is suboptimal, it won't be by much considering that these are neighboring cities.

I'm not suggesting any change to IPv6 stacks or to routing protocols. I'm just suggesting that we could allocate the same IPv6 addresses to operators in a way that allows geotopological aggregation rather than the existing provider aggregation. Combine this with local traffic exchange in every city and you have a more robust Internet with lower overall latency that will run with a smaller global routing table.

I know that some individual operators, such as the one I work for, have very robust IP networks with low overall latency.
But when we talk about the Internet, then we include all the private interconnects and public exchange points and tromboning of traffic due to peering "issues", etc. --Michael Dillon
I reread this and still don't see how geographical IP address allocation is going to work if typical customer connections are network-centric and any large area has a number of competitive access providers (unless you're fine with multiple providers announcing an aggregate summary in anycast fashion). The only way I see that geographical addressing might have some advantage is if the area is covered by a large monopoly that connects everyone else there (and it's not to say that such a situation does not exist in some countries, but I don't think this situation should be encouraged and forced to continue forever with network allocation policies). On Tue, 18 Oct 2005 Michael.Dillon@btradianz.com wrote:
Here, the suggestion is that netblocks should be allocated to cities, not to providers. Within
I am a multihomed customer and my ISPs are in two different cities. What are my IP addresses going to be?
Your assumptions are flawed. I never suggested that there would be a flag day. I never suggested that geotopological addressing would work everywhere or solve all problems. I never suggested that we should turn off the existing provider aggregatable IP address allocations.
I just suggested an alternative way of issuing addresses so that they are geotopologically aggregatable, not provider aggregatable. There are sufficient reserved addresses in the IPv6 address space to do this. We could start issuing geotopological netblocks and try it for 5 years or so to see whether it works better or not.
In any case, you are located in Montreal which is such a major city that I expect any ISP selling service (geotopologically) in Montreal would use Montreal address space even if they backhauled at layer two to some other city.
However, there will likely be lots of situations where people in small towns roughly equidistant from two cities will choose to multihome with links to separate cities. This will either have to be done using provider-aggregated addresses or by using addresses from one of the cities with a longer prefix inside that city's routing table to direct the traffic to the neighboring city. If this is suboptimal, it won't be by much considering that these are neighboring cities.
I'm not suggesting any change to IPv6 stacks or to routing protocols. I'm just suggesting that we could allocate the same IPv6 addresses to operators in a way that allows geotopological aggregation rather than the existing provider aggregation. Combine this with local traffic exchange in every city and you have a more robust Internet with lower overall latency that will run with a smaller global routing table.
I know that some individual operators, such as the one I work for, have very robust IP networks with low overall latency. But when we talk about the Internet, then we include all the private interconnects and public exchange points and tromboning of traffic due to peering "issues", etc.
--Michael Dillon
-- William Leibzon Elan Networks william@elan.net
On Tue, 18 Oct 2005, william(at)elan.net wrote:
I reread this and still don't see how geographical IP address allocation is going to work if typical customer connections are network-centric
That's a "today's operator" view of customers though. Many customers view their network as being situated in one or more fixed geographic locations (not in terms of which provider gives them transit), which rarely change. ("Road warriors" just connect to HQ or their home site via VPN or whatever).
and any large area has a number of competitive access providers (unless you're fine with multiple providers announcing an aggregate summary in anycast fashion).
Yep, they'd have to. They'd also have to figure out the billing side of it for any traffic differentials. Essentially, when seen globally, the providers would service the geographic /area/, not the customers. When seen within this arbitrary area, you'd see routes for each customer and to which exact provider they'd have to go. It would also encourage peering generally to occur as close to the arbitrary areas as possible, one suspects (so the providers' own routing tables wouldn't have to carry the "detail" further than needed). regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: Between grand theft and a legal fee, there only stands a law degree.
I reread this and still don't see how geographical IP address allocation is going to work if typical customer connections are network-centric and any large area has a number of competitive access providers
Inside the city, you see lots of longer prefixes from that city's netblock. Outside the city you see only the single aggregate prefix.
The only way I see that geographical addressing might have some advantage is if the area is covered by a large monopoly that connects everyone else there
Monopoly? Not necessary. Yes, you need to have universal exchange of local traffic in the city but that can happen through private interconnects and multiple exchange points. No need for a monopoly. The major change is that providers which participate in geotopological addressing would have to interconnect with *ALL* other such providers in that city. This would mean more use of public exchange points. Also, I think it makes sense to have a second regional layer of aggregation where you group neighboring cities that have a lot of traffic with each other. I think this would result in no more than 20-30 regions per continent. --Michael Dillon
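The aggregation model being described can be sketched with Python's `ipaddress` module. The prefixes below are documentation addresses invented purely for illustration, not real city allocations:

```python
# Toy sketch of geotopological aggregation: per-customer prefixes are
# visible inside the city, but the outside world sees one aggregate.
import ipaddress

# Hypothetical netblock assigned to one city.
city_block = ipaddress.ip_network("2001:db8:1000::/36")

# Inside the city: longer, per-customer prefixes carved from that block.
customer_routes = [
    ipaddress.ip_network("2001:db8:1000::/48"),  # customer of ISP A
    ipaddress.ip_network("2001:db8:1001::/48"),  # customer of ISP B
    ipaddress.ip_network("2001:db8:1010::/48"),  # multihomed to A and B
]

# Every customer prefix nests under the city route, so routers outside
# the city need only the one entry regardless of the ISP involved.
assert all(r.subnet_of(city_block) for r in customer_routes)

# Outside the city the whole table collapses to the single aggregate.
outside_view = list(ipaddress.collapse_addresses([city_block] + customer_routes))
print([str(n) for n in outside_view])  # ['2001:db8:1000::/36']
```

The catch, as the thread points out, is that this only works if every provider holding a slice of the city block exchanges traffic locally with all the others.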
On Tue, 18 Oct 2005 10:49:36 +0100 Michael.Dillon@btradianz.com wrote:
I reread this and still don't see how geographical IP address allocation is going to work if typical customer connections are network-centric and any large area has a number of competitive access providers
Inside the city, you see lots of longer prefixes from that city's netblock. Outside the city you see only the single aggregate prefix.
The only way I see that geographical addressing might have some advantage is if the area is covered by a large monopoly that connects everyone else there
Monopoly? Not necessary. Yes, you need to have universal exchange of local traffic in the city but that can happen through private interconnects and multiple exchange points. No need for a monopoly. The major change is that providers which participate in geotopological addressing would have to interconnect with *ALL* other such providers in that city. This would mean more use of public exchange points.
Also, I think it makes sense to have a second regional layer of aggregation where you group neighboring cities that have a lot of traffic with each other. I think this would result in no more than 20-30 regions per continent.
--Michael Dillon
I think that levels of multi-homing are likely to develop for small entities:

Multi-homing-0: You have two or more connections, but no real sharing of information between them. (I have this now at home, with a cable modem and dial-up for backup. They always appear to the outside as two disjoint networks, and in fact never overlap in time.) I would argue that the vast majority of residences and small offices are likely to fall into this category; the goal is really internal failover from the preferred provider to the secondary, with automatic renumbering courtesy of DHCP.

Multi-homing-1: You have two or more connections, but can do no traffic engineering, and have to assume an equal preference for each connection. Say there is some sort of geographical or topological prefix. From the outside, they could all be viewed as "belonging" to some preferred carrier, or to a local exchange point, or a protocol could be created to do some sort of topological or geographical provider discovery. It seems to me that this means accepting some sort of hot-potato routing, and also some interaction between providers. (The routing would go something like: this is a packet for Clifton, VA; Cox and Verizon cover Clifton, Virginia; pick one of these, give it to them, and let them worry about the details.) Of course, it would be highly likely in such a scheme that outbound and inbound traffic for the same flow could use different providers.

Multi-homing-2: What we would now consider as multi-homing, with full control and full BGP.

Why would you want Multi-homing-1? Well, it should be cheaper than MH-2, with no user administration, but you should still get some load balancing and also fast failover if a circuit goes down. That would more than meet the needs of most home offices. If BGP table growth is an issue, I think that some sort of MH-1 is inevitable.
I think that inevitably means some sort of geographical or topological based prefix, less-than-optimal routing for at least some packets, and much less user control compared to MH-2. Regional exchanges might be nice, but are not necessary. Regards Marshall Eubanks
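The Multi-homing-1 routing step described above ("Cox and Verizon cover Clifton; pick one and let them worry about the details") can be sketched in a few lines. The coverage table, area names, and provider names here are all invented for illustration:

```python
# Toy sketch of MH-1 hot-potato provider selection: the network maps a
# geographic area to the set of providers serving it and hands the
# packet to any one of them, with no per-flow traffic engineering.
import random

# Hypothetical coverage table: geographic area -> providers serving it.
coverage = {
    "clifton-va": ["cox", "verizon"],
    "bend-or": ["bendbroadband"],
}

def pick_provider(area, rng=random):
    """Pick any provider covering the destination area.

    Outbound and inbound traffic for the same flow may well pick
    different providers, exactly as the message above anticipates.
    """
    providers = coverage.get(area)
    if not providers:
        raise KeyError("no provider covers " + area)
    return rng.choice(providers)

print(pick_provider("clifton-va"))  # either "cox" or "verizon"
```

Failover falls out for free: remove a provider from the coverage entry and subsequent picks simply never select it.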
On Fri, 14 Oct 2005, Tony Li wrote:
But I think the discussion is moot. The IETF decided on its goal, and it's superfluous trying to change that. While watching shim6 we carry on hoping that we'll get IPv6 multihoming going in the conventional, proven, working, feature-complete way we're used to... until IETF
there is no hope in having operators explain to ietf that the current path is fruitless? certainly they can be made to see the light, yes?
Doubtful. The IETF was operating under the impression that having a scalable routing subsystem was paramount. Do you think operators can be made to see that light?
They've been asking for that as well I think. I certainly don't want to have 1M+ routes for JUST the Internet to worry about anytime soon, I'd hate to see over 300k for real Internet routes anytime soon :( Much of today's hardware doesn't seem so happy around that number :( Operators and IETF need to hit a middle ground. Perhaps that middle ground is a mix of these 2 things? Routing hardware will certainly scale to larger than today's table, other factors are driving that, will it scale to 1B prefixes or 1T prefixes? I'm not able to speculate on that... I can see an immediate need to get over 500k though, and not from ipv6 nor lack of shim, and I'm probably shooting low.
Implementing IPv6 multihoming the "conventional" way guarantees that we end up with one prefix per site, and as the need for multihoming reaches deeper into the population, the growth rate of the routing table would surpass even the growth rate of the Internet itself.
I'm not sure I agree that the end state is 100% multihoming. I can certainly agree that more multihoming is coming. Many more people are pushing for multihoming today than in previous years, apparently telco instability (financial not technical) is/has driven this :) (among other things I'm sure) Again, I don't know what the end state size is, I agree it's bigger than today, I think it's probably smaller than infinity.
The alternative is a multihoming scheme that does not require a prefix per site. But that doesn't match the stated requirement of 'conventional', 'proven', 'working' [sic], 'feature-complete'.
Whichever solution is the end solution, there needs to be the capability to solve today's problems in tomorrow's world. The current proposal I think doesn't capture all of today's problems. If the set of missing solutions were going to magically go away then all's good... I can't see link failure (not complete failure of the path, partial path failure) or congestion or suboptimal paths going away though. (just to name a few) -Chris
Perhaps that middle ground is a mix of these 2 things?
Perhaps. But what we currently seem to believe is that current routing table growth is dominated by traffic engineering and multihoming. If future routing is to scale better than today, then we need some strong forces that push for conservation.
I'm not sure I agree that the end state is 100% multihoming. I can certainly agree that more multihoming is coming. Many more people are pushing for multihoming today than in previous years, apparently telco instability (financial not technical) is/has driven this :) (among other things I'm sure)
I wasn't suggesting that the end state is 100% multihoming, but I do think that it will grow to well over the 10% factor that we used in years past. I agree that there is much more push for multihoming thanks to the connectivity issues that we've seen, both due to financially driven and backhoe driven causes. The increased reliance of society on the Internet in general also helps us to go there and the increase in wireless access makes multihoming of small sites in a metro virtually trivial. Tony
They've been asking for that as well I think. I certainly don't want to have 1M+ routes for JUST the Internet to worry about anytime soon, I'd hate to see over 300k for real Internet routes anytime soon :( Much of today's hardware doesn't seem so happy around that number :( Operators and IETF need to hit a middle ground.
There are 437 cities of 1 million or more population. There are roughly 5,000 cities of over 100,000 population. And there are 3,047,000 named communities in the world. Seems to me that the number of routes in the global routing table should logically be closer to 5,000 than to 3,000,000.
I'm not sure I agree that the end state is 100% multihoming. I can certainly agree that more multihoming is coming. Many more people are pushing for multihoming today than in previous years, apparently telco instability (financial not technical) is/has driven this :) (among other things I'm sure)
I agree that the end state is *NOT* 100% multihoming. It is too complex for most people and there is no business justification for it. But an awful lot of business customers will be able to justify multihoming. That is part and parcel of the "mission critical" Internet. --Michael Dillon
On Mon, 17 Oct 2005, Michael.Dillon@btradianz.com wrote:
I agree that the end state is *NOT* 100% multihoming. It is too complex for most people and there is no business justification for it. But an awful lot of business customers will be able to justify multihoming. That is part and parcel of the "mission critical" Internet.
Portability is another aspect. You mightn't need multihoming for failover (don't know about you, but my ISP is plenty reliable), but you might want the ability to be "multihomed over time". Course, IPv6 makes renumbering really easy, so maybe that argument is moot. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: The secret source of humor is not joy but sorrow; there is no humor in Heaven. -- Mark Twain
On Mon, Oct 17, 2005 at 12:08:38PM +0100, Michael.Dillon@btradianz.com <Michael.Dillon@btradianz.com> wrote a message of 28 lines which said:
There are 437 cities of 1 million or more population. There are roughly 5,000 cities of over 100,000 population. And there are 3,047,000 named communities in the world.
Seems to me that the number of routes in the global routing table should logically be closer to 5,000 than to 3,000,000.
If there is an exchange point per city over 100,000 (the route goes to the IXP and then to the actual provider)... Otherwise, there is a flaw in your calculation.
There are 437 cities of 1 million or more population. There are roughly 5,000 cities of over 100,000 population. And there are 3,047,000 named communities in the world.
Seems to me that the number of routes in the global routing table should logically be closer to 5,000 than to 3,000,000.
If there is an exchange point per city over 100,000 (the route goes to the IXP and then to the actual provider)... Otherwise, there is a flaw in your calculation.
I didn't calculate those numbers. They come from various demographic sources. And I would expect that many of these cities will have more than one exchange point. In fact, one could argue that a city should have no less than 3 central switching points for resiliency, and that major intercity providers should have no less than three paths into each city. However, if the addresses for everything in the city come from a single netblock, then sites in a neighboring city will only need one aggregate route in the majority of cases. Even if there are enough special cases for an average of 5 routes per city, you still have only 25,000 global routes. It is still far from the projection of one million routes that some people have made and it is still less than today's routing table size. --Michael Dillon
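The arithmetic behind that estimate is worth making explicit. Even with a generous allowance for special cases, city-level aggregation stays far below the million-route projections (and below the roughly 170k-route IPv4 table of the time):

```python
# The route-count estimate from the message above, spelled out.
cities_over_100k = 5_000        # cities with population over 100,000
avg_routes_per_city = 5         # generous allowance for special cases

global_routes = cities_over_100k * avg_routes_per_city
print(global_routes)            # 25000

# Compare against the projections being debated in the thread.
assert global_routes < 1_000_000      # the feared projection
assert global_routes < 170_000        # ~2005 IPv4 DFZ table size
```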
On Mon, 17 Oct 2005 Michael.Dillon@btradianz.com wrote:
I agree that the end state is *NOT* 100% multihoming. It is too complex for most people and there is no business justification for it. But an awful lot of business customers will be able to justify multihoming. That is part and parcel of the "mission critical" Internet.
" The year is 2011. Last week my DSL provider died for 6 hours and my family was unable to get to the Internet. A friend suggests I try Suparoute11 (tm), I download the program and install on my Xbox 5. A few seconds later it uses my Xbox's 802.66 wireless port to contact several other people running the program within a few blocks of my house. Options are displayed on my screen, one guy's hub is offering a "backup 10Mb/s home link, tunneled and advertised to TIX" across a large cable provider. My son tells me that is what I want so I setup a payment of $5 per month to him. In 10 minutes from start to finish my house's /54 is "multi-homed", whatever that means. " -- Simon J. Lyall | Very Busy | Web: http://www.darkmere.gen.nz/ "To stay awake all night adds a day to your life" - Stilgar | eMT.
On Tue, 2005-10-18 at 01:07 +1300, Simon Lyall wrote:
My son tells me that is what I want so I setup a payment of $5 per month to him. In 10 minutes from start to finish my house's /54 is "multi-homed", whatever that means.
You cover the payment part, partially. But now the fun parts:
- which prefix
- where did you get the prefix
- who says you are you and thus that you are allowed to use that prefix (someone has a working PKI available? :)
- routing table size?
- is the other end of the possible communication going to be able to reach this prefix?
- will everybody accept you as being the endpoint?
- what about a broken prefix 'upstream' (you just said your primary link goes down, thus it might be a misconfig 'upstream')
- insert various other "2011" stories...
Greets, Jeroen
man, 17,.10.2005 kl. 14.22 +0200, skrev Jeroen Massar:
On Tue, 2005-10-18 at 01:07 +1300, Simon Lyall wrote:
My son tells me that is what I want so I setup a payment of $5 per month to him. In 10 minutes from start to finish my house's /54 is "multi-homed", whatever that means.
You cover the payment part, partially. But now the fun parts:
This story may be a lot closer to reality in 2011 than you seem able to accept. It's a mere illusion given limitations in current (2005) routing-protocols, but those protocols may also be nothing more than a distant memory by 2011.
- which prefix
Why shouldn't it be his own?
- where did you get the prefix
Maybe it's assigned to him?
- who says you are you and thus that you are allowed to use that prefix (someone has a working PKI available? :)
The allocating entity should be able to authorise that
- routing table size?
Who says everybody has to know about everybody else all the time. Maybe it's enough to know where to go when you need to get there. (more in other thread).
- is the other end of the possible communication going to be able to reach this prefix?
ditto
- will everybody accept you as being the endpoint?
Why not. As long as it is unique.
- what about a broken prefix 'upstream' (you just said your primary link goes down, thus it might be a misconfig 'upstream')
Yeah. There will probably be many failure modes in future networks too.
- insert various other "2011" stories...
... and get some vision about the future. Don't assume the world will stop. //Per
On Mon, 17 Oct 2005 Michael.Dillon@btradianz.com wrote:
I'm not sure I agree that the end state is 100% multihoming. I can certainly agree that more multihoming is coming. Many more people are pushing for multihoming today than in previous years, apparently telco instability (financial not technical) is/has driven this :) (among other things I'm sure)
I agree that the end state is *NOT* 100% multihoming. It is too complex for most people and there is no business justification for it. But an awful lot of business customers will be able to justify multihoming. That is part and parcel of the "mission critical" Internet.
It'd be interesting to see how many 'providers' can't qualify for a /32 and will have multihomed in v6 and will thus have more than 1 /48 assigned and thus more than 1 /64 per customer... Say someone like Covad or Rythyms or perhaps even a cable-isp? In these instances each consumer will actually be multihomed, yes? The complexity just landed on your grandmama's doorstep. -Chris
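The prefix arithmetic behind Chris's point can be shown directly: an access provider numbered out of two upstream /48s (rather than holding its own /32) ends up handing each customer one /64 from each block, so every customer is multihomed by construction. The prefixes below are documentation addresses, invented for illustration:

```python
# Toy sketch: a provider with /48s from two upstreams gives every
# customer two /64s, one per upstream block.
import ipaddress
from itertools import islice

upstream_a = ipaddress.ip_network("2001:db8:aaaa::/48")  # /48 from upstream A
upstream_b = ipaddress.ip_network("2001:db8:bbbb::/48")  # /48 from upstream B

# Each /48 holds 2**(64-48) = 65,536 customer /64s.
per_48 = 2 ** (64 - 48)
print(per_48)  # 65536

# Customer number 7 gets the 8th /64 out of each upstream's block.
cust_a = next(islice(upstream_a.subnets(new_prefix=64), 7, None))
cust_b = next(islice(upstream_b.subnets(new_prefix=64), 7, None))
print(cust_a, cust_b)  # 2001:db8:aaaa:7::/64 2001:db8:bbbb:7::/64
```

Which is exactly the point: the multihoming complexity lands on the end host, i.e. grandmama's doorstep.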
there is no hope in having operators explain to ietf that the current path is fruitless? certainly they can be made to see the light, yes?
you have not spent much time with the ivtf, have you?
Actually Chris has been extremely active in the IETF - his draft on current/desired router filtering capabilities is something that's been needed for a while (draft-morrow-filter-caps-01.)
On Sun, 16 Oct 2005, Susan Harris wrote:
there is no hope in having operators explain to ietf that the current path is fruitless? certainly they can be made to see the light, yes?
you have not spent much time with the ivtf, have you?
Actually Chris has been extremely active in the IETF - his draft on current/desired router filtering capabilities is something that's been needed for a while (draft-morrow-filter-caps-01.)
doh, supposedly it's a WG document now, but honestly I just have a big mouth, George Jones provided most of the content and guidance... I just tried to hang myself with the xml authoring 'tools' :( I do hope to gain more ietf experience though... (I may bring a hard hat this time around)
there is no hope in having operators explain to ietf that the current
path
is fruitless? certainly they can be made to see the light, yes?
you have not spent much time with the ivtf, have you?
In case you have never heard of the IVTF, Randy eloquently summarizes it here: https://rip.psg.com/~randy/050721.ccr-ivtf.html --Michael Dillon
On Oct 14, 2005, at 12:10 PM, Daniel Roesen wrote:
designing a solution which misses the stated requirements of many folks actually operating networks
So far it's missing some of the stated requirements (reasons for multihoming) listed in the charter... well I was going to cut-n-paste like I did to my email to shim6 dated Oct 4 2005, but it seems to have been removed in an update... so I'll cut-n-paste from that email: For the purposes of redundancy, load sharing, operational policy or cost, a site may be multi-homed, with the site's network having connections to multiple IP service providers. So the IETF identified 4 reasons to multihome. Of those 4, shim6 ignores at least 2 of them (operational policy and cost), and so far as I can see glosses over load sharing. I'd actually redefine load sharing to load (im)balancing, but that may just be pedantry. I don't recall seeing any followup to my question, and it's not showing up in the list archive so maybe there's something wrong with my list subscription (although I have had responses to other postings... which are also not in the archive)
So the IETF identified 4 reasons to multihome. Of those 4, shim6 ignores at least 2 of them (operational policy and cost), and so far as I can see glosses over load sharing.
If you have a solution that satisfies all requirements, you should contribute it. Shim6 is indeed a partial solution to the stated requirements. There was no tractable solution found to all requirements, and to not solve any of the issues was seen as basically fatal. Tony
On 15-Oct-2005, at 15:29, Tony Li wrote:
So the IETF identified 4 reasons to multihome. Of those 4, shim6 ignores at least 2 of them (operational policy and cost), and so far as I can see glosses over load sharing.
If you have a solution that satisfies all requirements, you should contribute it. Shim6 is indeed a partial solution to the stated requirements. There was no tractable solution found to all requirements, and to not solve any of the issues was seen as basically fatal.
Yes. It may be worth noting that the "requirements" you're talking about were very deliberately published in a document which professes to contain "goals" and was intended to avoid any mention of the "r" word (although I see we missed one in the title of section 3.2 :-) The draft that led to RFC 3582 was originally a requirements document. As you say, there was no confidence that there would be any proposals which would meet all the items in the document if they had been wrapped in MUSTs and SHOULDs. As the abstract says: This document outlines a set of goals for proposed new IPv6 site- multihoming architectures. It is recognised that this set of goals is ambitious and that some goals may conflict with others. The solution or solutions adopted may only be able to satisfy some of the goals presented here. Joe
On Oct 15, 2005, at 3:29 PM, Tony Li wrote:
So the IETF identified 4 reasons to multihome. Of those 4, shim6 ignores at least 2 of them (operational policy and cost), and so far as I can see glosses over load sharing.
If you have a solution that satisfies all requirements, you should contribute it. Shim6 is indeed a partial solution to the stated requirements. There was no tractable solution found to all requirements, and to not solve any of the issues was seen as basically fatal.
I don't have an acceptable solution... however, I am getting tired of shim6 being pushed as *the* solution to site rehoming, when at best it's an end node rehoming solution.
I don't have an acceptable solution... however, I am getting tired of shim6 being pushed as *the* solution to site rehoming, when at best it's an end node rehoming solution.
Well, sorry. When we explored site multihoming (not rehoming) in the ways that you seem to suggest, it was effectively a set of coordinated NAT boxes around the periphery of the site. That was rejected quite quickly. Tony
Tony, On Oct 15, 2005, at 3:27 PM, Tony Li wrote:
When we explored site multihoming (not rehoming) in the ways that you seem to suggest, it was effectively a set of coordinated NAT boxes around the periphery of the site. That was rejected quite quickly.
What were the reasons for rejection? Thanks, -drc
drc@virtualized.org (David Conrad) writes:
On Oct 15, 2005, at 3:27 PM, Tony Li wrote:
When we explored site multihoming (not rehoming) in the ways that you seem to suggest, it was effectively a set of coordinated NAT boxes around the periphery of the site. That was rejected quite quickly.
What were the reasons for rejection?
i wasn't there for that meeting. but when similar things were proposed at other meetings, somebody always said "no! we have to have end-to-end, and if we'd wanted nat-around-every-net we'd've stuck with IPv4." -- Paul Vixie
On Oct 15, 2005, at 9:08 PM, Paul Vixie wrote:
but when similar things were proposed at other meetings, somebody always said "no! we have to have end-to- end, and if we'd wanted nat-around-every-net we'd've stuck with IPv4."
Hmm. Is VJ compression considered a violation of the "end-to-end" principle? Or perhaps I misunderstand (yet again). Rgds, -drc
but when similar things were proposed at other meetings, somebody always said "no! we have to have end- to-end, and if we'd wanted nat-around-every-net we'd've stuck with IPv4."
Is VJ compression considered a violation of the "end-to-end" principle?
Or perhaps I misunderstand (yet again).
Paul is correct. Things that looked like NAT were rejected because "NAT is evil". Shifting the NAT to end system removed the objection to NAT, tho it's not entirely clear why. Shifting NAT to the end system also happened to simplify the entire solution as well. VJ compression should not be considered a violation of the "end-to- end" principle, as it is a per-link hack and performs a function that CANNOT be performed in the end systems. However, I'm not entirely sure that this is relevant. NAT is not, strictly speaking, a violation of the end-to-end principle. It certainly is rather ugly and awkward from an architectural perspective, but it is a function that is not otherwise required in the end host, so placing it into the network does not violate the letter of the principle. Perhaps this is yet another case where people misunderstand the principle itself and are invoking it to give a name to their (well placed) architectural distaste. Tony
Hi Tony, On Sat, 15 Oct 2005 23:26:20 -0700 Tony Li <tony.li@tony.li> wrote: <snip> Perhaps
this is yet another case where people misunderstand the principle itself and are invoking it to give a name to their (well placed) architectural distaste.
Doesn't NAT, or more specifically the most commonly used, NAPT, create hard state within the network, which then makes it violate the end-to-end argument ? Also, because it has to understand transport and application layer protocols, to be able to translate embedded addresses, doesn't this also make it violate end-to-end ? I've understood the fundamental benefit of following the end-to-end argument is that you end up with a application agnostic network, which therefore doesn't create future constraints on which applications can then be used over that network. In an end-to-end "compliant" network, any new transport layer protocols, such as SCTP or DCCP, and new user applications, only require an upgrade of the end or edge node software, which can be performed in an incremental, per edge node as needed basis. In other words, there isn't any whole of network upgrade cost or functionality deployment delay to support new applications, which was the drawback of application specific networks, such as the traditional POTS network. Have I somehow misunderstood the intent or benefits of the end-to-end argument ? Thanks, Mark. -- The Internet's nature is peer to peer.
Doesn't NAT, or more specifically its most commonly used variant, NAPT, create hard state within the network, which then makes it violate the end-to-end argument? Also, because it has to understand transport- and application-layer protocols to be able to translate embedded addresses, doesn't this also make it violate end-to-end? I've understood the fundamental benefit of following the end-to-end argument to be that you end up with an application-agnostic network, which therefore doesn't create future constraints on which applications can then be used over that network. In an end-to-end "compliant" network, any new transport-layer protocols, such as SCTP or DCCP, and new user applications only require an upgrade of the end or edge node software, which can be performed incrementally, per edge node, as needed. In other words, there isn't any whole-of-network upgrade cost or functionality deployment delay to support new applications, which was the drawback of application-specific networks, such as the traditional POTS network.
Have I somehow misunderstood the intent or benefits of the end-to-end argument?
Mark,

This is probably the most common misunderstanding of the end-to-end principle out there. Someone else can dig up the quote, but basically, the principle says that the network should not replicate functionality that the hosts already have to perform. You have to look at X.25's hop-by-hop data windows to truly grok this point.

Many people pick this up and twist it into ~the network has to be application agnostic~ and then use this against NATs or firewalls, which is simply a misuse of the principle. Really, that is a separate principle in its own right. It's not one that I subscribe to, but that's a different conversation...

Regards, Tony
Many people pick this up and twist it into ~the network has to be application agnostic~ and then use this against NATs or firewalls, which is simply a misuse of the principle.
Personally, I think that NAT's interference with the communication between hosts is similar to the way in which error-detection and retransmission interfere with realtime voice communication, as described in Saltzer's end-to-end paper: http://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.txt It seems that the end-to-end principle is more of a metaphor for how to look at the design problem rather than a hard and fast rule. http://www.postel.org/pipermail/end2end-interest/2002-March/001848.html --Michael Dillon
On Sun, Oct 16, 2005 at 01:45:40AM -0700, Tony Li wrote:
Doesn't NAT, or more specifically its most commonly used variant, NAPT, create hard state within the network, which then makes it violate the end-to-end argument? Also, because it has to understand transport- and application-layer protocols to be able to translate embedded addresses, doesn't this also make it violate end-to-end? I've understood the fundamental benefit of following the end-to-end argument to be that you end up with an application-agnostic network, which therefore doesn't create future constraints on which applications can then be used over that network. In an end-to-end "compliant" network, any new transport-layer protocols, such as SCTP or DCCP, and new user applications only require an upgrade of the end or edge node software, which can be performed incrementally, per edge node, as needed. In other words, there isn't any whole-of-network upgrade cost or functionality deployment delay to support new applications, which was the drawback of application-specific networks, such as the traditional POTS network.
Have I somehow misunderstood the intent or benefits of the end-to-end argument?
Mark,
This is probably the most common misunderstanding of the end-to-end principle out there. Someone else can dig up the quote, but basically, the principle says that the network should not replicate functionality that the hosts already have to perform. You have to look at X.25's hop-by-hop data windows to truly grok this point.
Many people pick this up and twist it into ~the network has to be application agnostic~ and then use this against NATs or firewalls, which is simply a misuse of the principle. Really, that is a separate principle in its own right. It's not one that I subscribe to, but that's a different conversation...
Maybe it's time to pull out some of Noel's work on both topics. Reasonable introductions to both the e2e principle and locator/id split topics can be found at http://users.exis.net/~jnc/tech/end_end.html and http://users.exis.net/~jnc/tech/endpoints.txt respectively. Dave
On Mon, 17 Oct 2005 07:57:52 -0700 David Meyer <dmm@1-4-5.net> wrote:
On Sun, Oct 16, 2005 at 01:45:40AM -0700, Tony Li wrote:
<snip>
This is probably the most common misunderstanding of the end-to-end principle out there. Someone else can dig up the quote, but basically, the principle says that the network should not replicate functionality that the hosts already have to perform. You have to look at X.25's hop-by-hop data windows to truly grok this point.
Many people pick this up and twist it into ~the network has to be application agnostic~ and then use this against NATs or firewalls, which is simply a misuse of the principle. Really, that is a separate principle in its own right. It's not one that I subscribe to, but that's a different conversation...
Maybe it's time to pull out some of Noel's work on both topics. Reasonable introductions to both the e2e principle and locator/id split topics can be found at
http://users.exis.net/~jnc/tech/end_end.html and http://users.exis.net/~jnc/tech/endpoints.txt
Tony is right; thinking about it a bit more, I've mixed the two together. I first came across the end-to-end argument (the "X.25" example) in "Routing in the Internet". The other stuff (as well as e2e) was in RFC 1958, "Architectural Principles of the Internet", and a few other places.

I see value in getting NAT and firewalls (protecting host-based functions) out of the network, because I've been burned by NAT on a few occasions (due to its stateful nature, its lack of application protocol support, and its complexity when public address space would have been a simpler and cheaper solution). With hosts starting to have multiple interfaces, i.e. wired and wireless, it makes sense to me that firewalling on the host itself is a better way to protect them, rather than relying on a firewall at a fixed point in the network topology that only protects against attacks coming from upstream of it. We've already pretty much evolved to the host-based firewalling model anyway, with all major desktop/server OSes coming out of the box with one. I think the major component missing is scalable policy deployment, although I've been told that is being developed as well.

I'm pragmatic about NATs and network-located firewalls though, and although I don't necessarily like doing it much, I will suggest the "conventional" NAT/firewall models/solutions when necessary.

Regards, Mark. -- The Internet's nature is peer to peer.
Tony, On Oct 15, 2005, at 11:26 PM, Tony Li wrote:
Paul is correct. Things that looked like NAT were rejected because "NAT is evil".
Religion is so much fun.
Shifting the NAT to end system removed the objection to NAT, tho it's not entirely clear why. Shifting NAT to the end system also happened to simplify the entire solution as well.
Except for the part about having to rewrite all existing implementations to take full advantage of the technology.
VJ compression should not be considered a violation of the "end-to-end" principle, as it is a per-link hack and performs a function that CANNOT be performed in the end systems. However, I'm not entirely sure that this is relevant.
Well, if you NAT the destination identifier into a routing locator when a packet traverses the source edge/core boundary and NAT the locator back into the original destination identifier when you get to the core/destination edge boundary, it might be relevant. The advantages I see of such an approach would be:

- no need to modify existing IPv6 stacks in any way
- identifiers do not need to be assigned according to network topology (they could, in fact, be allocated according to national political boundaries, geographic boundaries, or randomly for that matter). They wouldn't even necessarily have to be IPv6 addresses, just so long as they could be mapped and unmapped into the appropriate locators (e.g., they could even be, oh say, IPv4 addresses)
- locators could change arbitrarily without affecting end-to-end sessions in any way
- the core/destination edge NAT could have arbitrarily many locators associated with it
- the source edge/core NAT could determine which of the locators associated with a destination it wanted to use

Of course, the locator/identifier mapping is where things might get a bit complicated. What would be needed would be a globally distributed lookup technology that could take in an identifier and return one or more locators. It would have to be very fast, since the mapping would be occurring for every packet, implying a need for caching and some mechanism to ensure cache coherence, perhaps something as simple as a cache entry time-to-live, if you make the assumption that the mappings either don't change very frequently and/or stale mappings could be dealt with. You'd also probably want some way to verify that the mappings weren't mucked with by miscreants. This sounds strangely familiar...

Obviously, some of the disadvantages of such an approach would be that it would require both ends to play and end users wouldn't be able to traceroute. I'm sure there are many other disadvantages as well.
However, if an approach like this would be technically feasible (and I'm not entirely sure it would be), I suspect it would get deployed _much_ faster than an approach that requires every network stack to be modified. Again. Particularly given that the number of folks who care about multi-homing is so small relative to the number of folks on the Internet. Can two evils make a good? :-) Rgds, -drc (speaking only for myself, of course)
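[The TTL-cached identifier-to-locator lookup David describes can be sketched in a few lines. Everything here is hypothetical - the identifier and locator strings, the `AUTHORITATIVE` stub standing in for the "globally distributed lookup technology", and the class name - but it shows the cache-with-TTL behaviour he proposes:]

```python
import time

# Toy sketch of an identifier -> locator mapping with per-entry TTL
# caching. AUTHORITATIVE is a stub for the global mapping service;
# real deployment would also need the integrity checks David mentions.

AUTHORITATIVE = {
    "id::1234": ["loc-a::1", "loc-b::1"],  # hypothetical data
}

class MappingCache:
    def __init__(self, ttl=30.0, lookup=AUTHORITATIVE.get):
        self.ttl = ttl
        self.lookup = lookup
        self.cache = {}  # identifier -> (expires_at, locators)

    def locators(self, identifier, now=None):
        now = time.monotonic() if now is None else now
        hit = self.cache.get(identifier)
        if hit and hit[0] > now:
            return hit[1]                      # fresh cache entry
        locs = self.lookup(identifier) or []   # miss or stale: re-resolve
        self.cache[identifier] = (now + self.ttl, locs)
        return locs

cache = MappingCache(ttl=30.0)
print(cache.locators("id::1234", now=0.0))   # ['loc-a::1', 'loc-b::1']
print(cache.locators("id::1234", now=10.0))  # same, served from cache
```

[The per-packet speed requirement is what makes the cache mandatory; the TTL is the "simple" coherence mechanism from the message above.]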
Hi David, <snip>
Well, if you NAT the destination identifier into a routing locator when a packet traverses the source edge/core boundary and NAT the locator back into the original destination identifier when you get to the core/destination edge boundary, it might be relevant. The advantages I see of such an approach would be:
- no need to modify existing IPv6 stacks in any way
- identifiers do not need to be assigned according to network topology (they could, in fact, be allocated according to national political boundaries, geographic boundaries, or randomly for that matter). They wouldn't even necessarily have to be IPv6 addresses, just so long as they could be mapped and unmapped into the appropriate locators (e.g., they could even be, oh say, IPv4 addresses)
- locators could change arbitrarily without affecting end-to-end sessions in any way
- the core/destination edge NAT could have arbitrarily many locators associated with it
- the source edge/core NAT could determine which of the locators associated with a destination it wanted to use
Of course, the locator/identifier mapping is where things might get a bit complicated. What would be needed would be a globally distributed lookup technology that could take in an identifier and return one or more locators. It would have to be very fast, since the mapping would be occurring for every packet, implying a need for caching and some mechanism to ensure cache coherence, perhaps something as simple as a cache entry time-to-live, if you make the assumption that the mappings either don't change very frequently and/or stale mappings could be dealt with. You'd also probably want some way to verify that the mappings weren't mucked with by miscreants. This sounds strangely familiar...
Certainly does. Apparently this, or a similar idea, was suggested back in 1997, and is the root origin of the 64 bits for host address space, according to Christian Huitema in his IPv6 book - http://www.huitema.net/ipv6.asp. A Google search found the draft: "GSE - An Alternate Addressing Architecture for IPv6", M. O'Dell, INTERNET DRAFT, 1997, http://www.caida.org/outreach/bib/networking/entries/odell97GSE.xml
Can two evils make a good? :-)
Not sure, however, two wrongs don't make a right, but three lefts do. Regards, Mark. -- The Internet's nature is peer to peer.
Certainly does. Apparently this or a similar idea was suggested back in 1997, and is the root origin of the 64 bits for host address space, according to Christian Huitema, in his IPv6 book - http://www.huitema.net/ipv6.asp.
A Google search found the draft:
"GSE - An Alternate Addressing Architecture for IPv6" M. O'Dell, INTERNET DRAFT, 1997
http://www.caida.org/outreach/bib/networking/entries/odell97GSE.xml
Note that GSE is in no way a NAT, so it is very different from David's proposal. GSE also has a direct impact on all implementations (e.g., using only the identifier bits in the TCP pseudo-header), so that is also an all-implementations change. Further, that is a flag day, worldwide, even for non-multi-homed sites. Tony
GSE also has a direct impact on all implementations (e.g., using only the identifier bits in the TCP pseudo-header), so that is also an all-implementations change. Further, that is a flag day, worldwide, even for non-multi-homed sites.
a flag day only for the very small number of ipv6 sites. pay me now or pay me later. randy
Shifting the NAT to end system removed the objection to NAT, tho it's not entirely clear why. Shifting NAT to the end system also happened to simplify the entire solution as well.
Except for the part about having to rewrite all existing implementations to take full advantage of the technology.
That was inevitable from the start. A real locator/identifier separation requires a rewrite. Any system that provided site-wide source address control was going to require a rewrite. The bigger issue is that, given that a rewrite was inevitable, why didn't we have more design freedom? Religion *is* so much fun.
Obviously, some of the disadvantages of such an approach would be that it would require both ends to play and end users wouldn't be able to traceroute. I'm sure there are many other disadvantages as well. However, if an approach like this would be technically feasible (and I'm not entirely sure it would be), I suspect it would get deployed _much_ faster than an approach that requires every network stack to be modified. Again. Particularly given that the number of folks who care about multi-homing is so small relative to the number of folks on the Internet.
David,

I should point out that if only a small number of folks care about multihoming, then only a small number of folks need to change their stacks. And even in your solution, there would need to be some changes to the end host if you want to support exit point selection, or to carry alternate locators in the transport.

It's just a mess. I think that we can all agree that a real locator/identifier split is the correct architectural direction, but that's simply not politically tractable. If the real message that the provider community is trying to send is that they want this, and not IPv6 as it stands today, then that's the message that should be sent, without reference to shim6.

Tony
Tony Li wrote:
It's just a mess. I think that we can all agree that a real locator/identifier split is the correct architectural direction, but that's simply not politically tractable. If the real message that the provider community is trying to send is that they want this, and not IPv6 as it stands today, then that's the message that should be sent, without reference to shim6.
Tony
How is a split between locator/identifier any different logically from the existing IPv4 source routing? I thought that got dead-ended? Or is a table lookup going to be needed? Won't all those tables need to be in the exact (or close to) same place as the current routing tables? Appreciate any enlightenment. Joe
How is a split between locator/identifier any different logically from the existing IPv4 source routing?
IPv4 source routing, as it exists today, is an extremely limited mechanism for specifying waypoints along the path to the destination. This is completely orthogonal to a real identifier/locator split, which would divide what we know of as the 'address' into two separate spaces, one which says "where" the node is, topologically, and one which says "who" the node is. One might use the identifier in the TCP pseudo-header, but not the locator, for one example, immediately allowing both mobility and multi-homing. Tony
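[Tony's pseudo-header point can be sketched concretely: if the transport demuxes connections on the identifier ("who") and treats the locator ("where") as forwarding-only, the locator can change mid-session without disturbing the connection. The dictionary-based "stack" below is a toy model with invented field names, not any real protocol:]

```python
# Sketch: connection demux keyed on identifiers only. The locator is
# used purely for forwarding, so it may change (mobility, multihoming)
# without breaking the session. All names here are hypothetical.

connections = {}  # (local_id, remote_id, local_port, remote_port) -> state

def demux(packet):
    # Only the "who" part participates in connection lookup;
    # packet["locator"] (the "where") never enters the key.
    key = (packet["dst_id"], packet["src_id"],
           packet["dst_port"], packet["src_port"])
    return connections.get(key)

connections[("id-B", "id-A", 80, 5000)] = "ESTABLISHED"

pkt_via_locator1 = {"src_id": "id-A", "dst_id": "id-B",
                    "src_port": 5000, "dst_port": 80, "locator": "isp1::a"}
pkt_via_locator2 = dict(pkt_via_locator1, locator="isp2::a")

# The same session is found regardless of which locator carried the packet.
print(demux(pkt_via_locator1))  # ESTABLISHED
print(demux(pkt_via_locator2))  # ESTABLISHED
```

[Contrast with today's TCP, where the locator-cum-identifier address sits in the pseudo-header checksum and the demux tuple, so a renumbering event kills the connection.]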
Tony Li wrote:
How is a split between locator/identifier any different logically from the existing IPv4 source routing?
IPv4 source routing, as it exists today, is an extremely limited mechanism for specifying waypoints along the path to the destination.
IOW, the end stations were supposed to be able to tell each other how to route to each other. Obviously that does not work in today's Internet. But that was a separation between the endpoint's ID and the routing of the packet.
This is completely orthogonal to a real identifier/locator split, which would divide what we know of as the 'address' into two separate spaces, one which says "where" the node is, topologically, and one which says "who" the node is. One might use the identifier in the TCP pseudo-header, but not the locator, for one example, immediately allowing both mobility and multi-homing.
Do you mean adding a second address space to be used by all L3 protocols? Or adding a second address space for every L3 protocol? Or adding a layer-2.5 address space? That appears to be what shim6 is. Also my original question -- how do I send my packet to the other node? I can't just address my packet to the ID; I have to use either information supplied by that node or by a third party: source routing or routing tables. If this decoupling depends on in-band negotiated information, then this allows survivability, but it is not multihoming, where multihoming is described as what we do now.
Tony
On Sun, 16 Oct 2005, Tony Li wrote:
This is completely orthogonal to a real identifier/locator split, which would divide what we know of as the 'address' into two separate spaces, one which says "where" the node is, topologically, and one which says "who" the node is.
Hmm, no idea whether it's a good idea or not, but from POV of scaling while it might make 'where' scaleable, you still have to find a way to tie "who" to "where". Some might say we already have this split though, DNS.
Tony
regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: Robotic tape changer mistook operator's tie for a backup tape.
Paul,
This is completely orthogonal to a real identifier/locator split, which would divide what we know of as the 'address' into two separate spaces, one which says "where" the node is, topologically, and one which says "who" the node is.
Hmm, no idea whether it's a good idea or not, but from POV of scaling while it might make 'where' scaleable, you still have to find a way to tie "who" to "where".
True. Even better, you get to change this binding (mobility) or have multiple bindings (multihoming).
Some might say we already have this split though, DNS.
True enough, but unfortunately, it's not done in a way that we can make use of the identifier in the routing subsystem or in the transport protocols. Tony
On Mon, 17 Oct 2005, Tony Li wrote:
True. Even better, you get to change this binding (mobility) or have multiple bindings (multihoming).
Indeed.
True enough, but unfortunately, it's not done in a way that we can make use of the identifier in the routing subsystem or in the transport protocols.
Well, if the idea is that the global routing subsystem should not have to be burdened with the overwhelming details of "who"->"where", then this mightn't matter much - all routing needs to know is "where". The transport protocols, well, they generally act on behalf of something which can do the lookup and supply transport with the right address, as long as the DNS server does not require "who"->"where" indirection ;).

The other way, of course, is to carry "who"->"where" as routes in our current routing system. Then we just have to figure out how to confine things topologically. It would require changes in how providers peer, but it is possible. And it has been done in other networks, e.g. GSM in Ireland. We have provider-assigned, but provider-mobile, prefixes. Just as with IP multihoming, there was much protest that it couldn't be done, would be too problematic, would be too burdensome. However, the regulator told the operators "I don't care, you have till X to figure it out and implement". The operators did figure it out, presumably including how to do billing for the differentials of any traffic carried for customers who had moved to other providers... The rest of the world has no clue that a large set of Irish GSM telephone numbers are essentially "/32 routed" between Irish providers. ;)

A possibility anyway (though whether it's the least worst way, I don't know ;) ). It does, though, keep operators fully involved in all aspects of routing. Otherwise end hosts will just work around the 'dumb' providers themselves, if there's no solution operators like. (Not a bad thing either, really.)
Tony
regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: Money isn't everything -- but it's a long way ahead of what comes next. -- Sir Edmond Stockdale
True enough, but unfortunately, it's not done in a way that we can make use of the identifier in the routing subsystem or in the transport protocols.
The transport protocols, well, they generally act on behalf of something which can do the lookup and supply transport with the right address, as long as the DNS server does not require "who"->"where" indirection ;).
The transport protocols unfortunately need the identifier in the packet to demux connections. Tony
# >> True enough, but unfortunately, it's not done in a way that we can make # >> use of the identifier in the routing subsystem or in the transport # >> protocols. # > # > The transport protocols, well they generally act on behalf of something # > which can do the lookup and supply transport with right address, as long # > the DNS server does not require "who"->"where" indirection ;). # # The transport protocols unfortunately need the identifier in the packet to # demux connections. the idea of a "transport protocol" comes from the OSI Reference Model which might not be the best conceptual fabric for re-thinking Internet routing. we know it's a "distributed system" and we know that various waypoints will or will not have "state", but i don't think we know that there will always be a "layer" that does what the "transport protocol" does in the OSIRM. i mention this because padlipsky's mantra about maps and territories came into my head just now as i was listening to folks talk about what the "transport protocol" had to have or had to provide. there's only a "transport protocol" if we decide to keep thinking in ISORM terms. and with that, i do indeed wonder if this has stopped being operational and if so, whether nanog wants to overlap THIS much with the irtf? refs: http://www.amazon.com/exec/obidos/tg/detail/-/0132681110/103-3252601-1266225
this because padlipsky's mantra about maps and territories came into my head
S.I. Hayakawa - Language in Thought and Action: "The symbol is not the thing symbolized; the map is not the territory; the word is not the thing." Nevertheless, Padlipsky is a good thing to read. Here is the book review from the Cisco IP Journal with a taste of the book. http://www.cisco.com/en/US/about/ac123/ac147/ac174/ac179/about_cisco_ipj_arc... --Michael Dillon
On Sun, 16 Oct 2005, Joe Maimon wrote:
Tony Li wrote:
It's just a mess. I think that we all can agree that a real locator/ identifier split is the correct architectural direction, but that's simply not politically tractable. If the real message that the provider community is trying to send is that they want this, and not IPv6 as it stands today, then that's the message that should be sent, without reference to shim6.
Tony
How is a split between locator / identifier any different logicaly from the existing ipv4 source routing?
I thought that got dead ended?
Or is a table lookup going to be needed?
Wont all those tables need to be in the exact (or close to) same place as the current routing tables?
Appreciate any enlightenment.
For example, if your goal was to have TCP-like sessions between identifiers survive network events without globally propagating full network topology information about your site (the gripe against classic IPv4 BGP), you could have multiple locators associated with any single identifier, in much the same way you can have multiple A records for a domain name. If the location-layer session times out, it would try the other locators listed (pick a method of selection) and, if it succeeded, would resume the session transparently to the identifier layer. Design the timeout and retransmit algorithm and parameters to achieve the convergence times of your choice.

You would need a new protocol stack on the hosts at both ends of connections. By common convention, classic TCP hosts could be told to use one of the locators (a transition hack, or just run the protocols in parallel). No change would be required to the network, and existing TCP could continue to be supported (no flag day).

Of course, support of this new protocol would be limited to the clients and servers that chose to implement it; however, this is no less than the change required for IPv6, which some hoped would solve the multihoming problem (possibly defined as scalably supporting network topology change without sessions being interrupted).

Mike. +----------------- H U R R I C A N E - E L E C T R I C -----------------+ | Mike Leber Direct Internet Connections Voice 510 580 4100 | | Hurricane Electric Web Hosting Colocation Fax 510 580 4151 | | mleber@he.net http://www.he.net | +-----------------------------------------------------------------------+
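[The failover Mike describes - try the listed locators in order, resume the session via whichever answers - reduces to a small loop. This is a toy model with invented names (`reach`, `send_via`, the locator strings), not a protocol design; the interesting engineering, as the message says, is choosing the timeouts and selection policy:]

```python
# Toy model of location-layer failover: an identifier has several
# locators; on timeout the location layer tries the next one, and the
# identifier-layer session survives. send_via() stands in for the wire.

def reach(identifier, locators, send_via, retries_per_locator=3):
    """Return the first locator that answers for `identifier`, or None."""
    for loc in locators:
        for _ in range(retries_per_locator):
            if send_via(loc):
                return loc       # session resumes via this locator
    return None                  # all paths down

# Simulated outage: the first (preferred) locator is unreachable.
up = {"locA": False, "locB": True}
chosen = reach("id::1", ["locA", "locB"], lambda loc: up[loc])
print(chosen)  # locB
```

[The retry count and any per-attempt timeout are exactly the "convergence time of your choice" knobs: N locators times R retries times T seconds bounds the worst-case repair time.]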
Mike Leber wrote:
On Sun, 16 Oct 2005, Joe Maimon wrote:
For example, if your goal was to have TCP-like sessions between identifiers survive network events without globally propagating full network topology information about your site (the gripe against classic IPv4 BGP) you could have multiple locators associated with any single identifier sort of like the same way you can have multiple A records for a domain name.
Real-world experience shows that that doesn't work very well. Multiple A records are not usable, practically speaking, for anything other than load balancing today.
If the location-layer session times out, it would try the other locators listed (pick a method of selection) and, if it succeeded, would resume the session transparently to the identifier layer. Design the timeout and retransmit algorithm and parameters to achieve the convergence times of your choice.
DNS is a good example of something that was designed that way, but few people rely on common implementations actually performing it properly.
You would need a new protocol stack on the hosts at both ends of connections. By common convention classic TCP hosts could be told to use one of the locators (a transition hack, or just run the protocols in parallel). No change would be required to the network, and existing TCP could continue to be supported (no flag day).
Appears to me that's what shim6 is (cursory reading + nanog discussions).
Of course support of this new protocol would be limited to the clients and servers that chose to implement it, however this is no less than the change required for IPv6 which some hoped would solve the multihoming problem (possibly defined as scalably supporting network topology change without sessions being interrupted).
Long story short, separating endpoint/locator does nothing to allow multiple paths to a single IPv6 address/prefix to scale.
On Sun, 16 Oct 2005, Joe Maimon wrote:
Mike Leber wrote:
On Sun, 16 Oct 2005, Joe Maimon wrote: For example, if your goal was to have TCP-like sessions between identifiers survive network events without globally propagating full network topology information about your site (the gripe against classic IPv4 BGP) you could have multiple locators associated with any single identifier sort of like the same way you can have multiple A records for a domain name.
Real-world experience shows that that doesn't work very well. Multiple A records are not usable, practically speaking, for anything other than load balancing today.
You are missing the point. Currently, multihomed sites have multiple path entries in the routing table for a specific multihomed prefix. Instead of having multiple paths, you would have multiple location records in DNS. (These are like A records, except that any possible reordering by round robin is not part of the actual path selection algorithm, and they would be associated with identifiers through a standard to be designed as part of the new protocol.) The process of how you select which one to use (and what knobs you have for tuning) is a design decision, the same way it was with BGP and OSPF.
If the location layer session times out then it would try the other locators listed (pick a method of selection) and if it suceeded would resume the session transparent to the identifier layer. Design the timeout and retransmit algorithm and parameters to achieve the convergence times of your choice.
DNS is a good example of something that was designed that way, but few people rely on common implementations actually performing it properly.
BGP and OSPF have timeouts and select other paths. They give you the convergence you have now in the event of router failure. For example, a BGP session with a peer across a public exchange has to time out; your interface is still up. Yes, you have to engineer the protocol for the convergence time you want. Define the goal and figure out the algorithm to achieve it.
You would need a new protocol stack on the hosts at both ends of connections. By common convention classic TCP hosts could be told to use one of the locators (a transition hack, or just run the protocols in parallel). No change would be required to the network, and existing TCP could continue to be supported (no flag day).
Appears to me thats what shim6 is (cursory reading + nanog discussions)
Perhaps a shim6 advocate will explain the differences. Does shim6 provide separate identifiers from locators? Does shim6 require new protocol stacks on the hosts at both ends of a session? (If not then the source is not making its own path selection decisions.)
Of course support of this new protocol would be limited to the clients and servers that chose to implement it, however this is no less than the change required for IPv6 which some hoped would solve the multihoming problem (possibly defined as scalably supporting network topology change without sessions being interrupted).
Long story short, separating endpoint/locator does nothing to allow multiple paths to a single IPv6 address/prefix to scale.
Um, this is equivalent to saying "it doesn't work because I say so". How doesn't it work? For example, you could claim (and then try to defend your claim):

* It can't possibly converge quickly enough because the genius that went into BGP and OSPF was lost and can never be found again.
* (OK, seriously) It can't converge quickly enough because the timeout would have to be X, and based on a guesstimate of network topology entropy, that would result in Y percent more traffic as each host tries to re-establish locator sessions. (Well, then define what percentage of sessions you think get interrupted and support your claim.)
* You throw away real topology information and rely on latency (or whatever), and using latency doesn't work because it doesn't allow traffic engineering according to policy. (Who said you have to give everybody the same set of locators? Paul might say ewwww. FWIW, if you want the ability to tell different peers different answers, like with BGP, you will need the ability to give different answers with the new protocol.)

Mike. +----------------- H U R R I C A N E - E L E C T R I C -----------------+ | Mike Leber Direct Internet Connections Voice 510 580 4100 | | Hurricane Electric Web Hosting Colocation Fax 510 580 4151 | | mleber@he.net http://www.he.net | +-----------------------------------------------------------------------+
On Sun, 16 Oct 2005, Mike Leber wrote:
Does shim6 require new protocol stacks on the hosts at both ends of a session? (If not then the source is not making its own path selection decisions.)
As I understand it, shim6 is a way for two hosts to tell each other that they have multiple IPv6 addresses. So if a timeout occurs on the last-used address, you can try another and attempt to resume the communication. So if the web server has two different IPs (from two different providers), both would (preferably) be in DNS, and the TCP session would be established with one of them. If shim6 detects that the original path is broken, it will try to use another, and if it succeeds the application won't notice anything, as shim6 hides this below the TCP layer.

I think this is a really good idea; having the network know about all multihomed companies just doesn't scale. With fewer prefixes and fewer AS numbers, network convergence would be much better. Think of the future: do we really want routers that'll handle millions of prefixes and hundreds of thousands of AS numbers, just because people want resilience? If this can be solved at the end-user layer instead, it's more scalable. I can also see a load-balancing scheme coming out on top of shim6 that'll be usable to end users as well. -- Mikael Abrahamsson email: swmike@swm.pp.se
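The failover behaviour described above can be sketched in a few lines. This is a toy model of the idea, not the shim6 protocol itself: the class name, the locator-pair rotation, and the `reachable` callback standing in for real timeout detection are all illustrative assumptions.

```python
# Hypothetical sketch of shim6-style failover: a session knows several
# (local, remote) address pairs ("locators"); when the current pair stops
# working, it switches to the next working one without the application
# noticing.  Names and mechanics are illustrative, not from any shim6 draft.

class Shim6Session:
    def __init__(self, local_locators, remote_locators):
        # All addresses each host advertised for itself, e.g. one per upstream.
        self.pairs = [(l, r) for l in local_locators for r in remote_locators]
        self.current = 0  # index of the pair currently carrying traffic

    def send(self, payload, reachable):
        """Try locator pairs in order until one works; `reachable` stands in
        for the real reachability test (timeouts / probes)."""
        for _ in range(len(self.pairs)):
            pair = self.pairs[self.current]
            if reachable(pair):
                return pair  # TCP above us never sees the switch
            # current path broken: rotate to the next candidate pair
            self.current = (self.current + 1) % len(self.pairs)
        raise ConnectionError("no working locator pair")

# Two providers on each side -> 2x2 = 4 candidate paths (documentation prefixes)
s = Shim6Session(["2001:db8:a::1", "2001:db8:b::1"],
                 ["2001:db8:c::80", "2001:db8:d::80"])
broken = {("2001:db8:a::1", "2001:db8:c::80")}
pair = s.send(b"GET /", lambda p: p not in broken)
# pair is ("2001:db8:a::1", "2001:db8:d::80"): same session, different path
```

The point of the sketch is that the path choice lives entirely in the hosts, which is exactly why the network core no longer needs a prefix per multihomed site.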
On Sun, 16 Oct 2005, Mikael Abrahamsson wrote:
Think in the future, do we really want routers that'll handle millions of prefixes and hundreds of thousands of AS numbers, just because people want resilience? If this can be solved on the end-user layer instead, it's more
you are getting these anyway, thank network convergence for that... or curse it, your call. Things like 2547 'vpn' and the like are driving prefix numbers up regardless of what the Internet is doing. Hardware will be required to handle million(s) of prefixes sooner than large-scale v6 deployment, IMHO. Note, just because it's there (or will be) doesn't mean I'm advocating that solution for v6; I just had questions (and thought others might as well) about shim6 or the direction of 'site multihoming' in v6... (not even specifically shim6)
On Mon, 17 Oct 2005, Christopher L. Morrow wrote:
you are getting these anyway, thank network convergence for that... or curse it, your call. things like 2547 'vpn' and the like are driving prefix numbers up regardless of what the Internet is doing. Hardware will be required to handle million(s) of prefixes sooner than large scale v6 deployment IMHO.
Both MPLS and any tunneled VPN over IP mean the core won't have to know about all those prefixes (think aggregation of addresses regionally in the IP case, and the outer label in the MPLS case). So if you're building a 100G-capable platform that'll do IPv6 and MPLS, how much difference would it make if you only had to support 16000 labels and 16000 IPv6 prefixes, rather than 2 million? Then of course I guess the argument can be made to put everything on MPLS to avoid the core knowing about anything but outer labels. -- Mikael Abrahamsson email: swmike@swm.pp.se
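The 16000-vs-2-million comparison above can be made concrete with back-of-the-envelope arithmetic. The per-entry sizes here are rough assumptions (not vendor figures), just to show the scale of the difference between a label-switched core and a core holding full routes:

```python
# Rough FIB-state comparison: a core that carries every prefix versus a
# label-switched core that only needs label cross-connects plus routes to
# the edge routers.  Entry sizes are assumed round numbers, not measured.

IPV6_FIB_ENTRY = 64   # bytes per IPv6 prefix + next-hop info (assumed)
LABEL_ENTRY    = 16   # bytes per MPLS label cross-connect (assumed)

full_table   = 2_000_000 * IPV6_FIB_ENTRY                # every prefix in the core
labeled_core = 16_000 * (LABEL_ENTRY + IPV6_FIB_ENTRY)   # labels + edge loopbacks

print(full_table // 2**20, "MB vs", labeled_core // 2**10, "KB")
```

Under these assumptions the labeled core carries about 1/100th of the state, which is the argument being made for keeping per-customer prefixes out of the core.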
On Mon, 17 Oct 2005, Mikael Abrahamsson wrote:
On Mon, 17 Oct 2005, Christopher L. Morrow wrote:
you are getting these anyway, thank network convergence for that... or curse it, your call. things like 2547 'vpn' and the like are driving prefix numbers up regardless of what the Internet is doing. Hardware will be required to handle million(s) of prefixes sooner than large scale v6 deployment IMHO.
Both MPLS and any tunneled VPN over IP means the core won't have to know about all those prefixes (think aggregation of addresses regionally in the IP case and outer label in the MPLS case).
The 'core' doesn't matter so much; somewhere there has to be this knowledge... Perhaps you'll get lucky with some 'edge' devices not having to know about every destination, but I think that might be more rare than you'd like.
So if you're building a 100G capable platform that'll do IPv6 and MPLS, how much difference would it be if you only had to support 16000 labels and 16000 IPv6 prefixes, rather than 2 million?
not sure, I'm a chemical engineer :) Seriously though, is the break at 1-2M or is it 10-20M? (a doubling, or orders of magnitude?)
Then of course I guess the argument can be made to put everything on MPLS to avoid the core knowing about anything but outer labels.
oh yes, mpls everywhere! wait... did I say that? yuck.
On Mon, 17 Oct 2005 at 07:17 +0200, Mikael Abrahamsson wrote:
Both MPLS and any tunneled VPN over IP means the core won't have to know about all those prefixes (think aggregation of addresses regionally in the IP case and outer label in the MPLS case).
Hope you don't imply NAT and private addresses, as is usually associated with VPN in the IPv4 world ;)
So if you're building a 100G capable platform that'll do IPv6 and MPLS, how much difference would it be if you only had to support 16000 labels and 16000 IPv6 prefixes, rather than 2 million?
Then of course I guess the argument can be made to put everything on MPLS to avoid the core knowing about anything but outer labels.
<flame>MPLS on its own won't solve anything. Although MPLS has its uses, it smells too much like another desperate attempt by the telco-heads in the ITU crowd to make a packet-switched network look and behave like a circuit-switched network.</flame>

What this discussion boils down to is that a long-term solution has to remove the size of the routing table as a limiting factor in internet routing. Something must eliminate the need for every node in the default-free transit network to know how to reach every allocated address block at all times. Allocation policies, operational agreements on filtering, BCPs etc. can only slow the growth of the routing table. Growth can't be eliminated.

In the future network you'll have routers that may know a lot about their "local region" of the network but have to rely on nodes that are several hops (even AS-hops) away to pass the packets to more remote destinations. These trust-relationships have to be built and maintained automatically (and may involve packet tagging, tunnelling etc.), similar to current route-cache mechanisms, but will require a whole new set of routing protocols. Despite lots of research, no such solution exists today, nor is one likely anytime soon. Just think of the added complexity: how do you build trust with remote nodes, given the problems you see in trusting your direct peers in the BGP world today? How can routing loops be prevented in such a network? All we know is that if there is no such solution, at some point the network will fragment due to its size and complexity.

In the meantime we have to manage with what we've got, and treat v6 just like we've done with v4 - multihoming and all. We know we'll run out of v4 addresses at some point, and that v6 is the only realistic alternative. Without improved routing protocols, all we can do is pray that the development of routing hardware in terms of memory and processing capability outpaces the growth of the routing table.
Initiatives like shim6 that change the behaviour of leaf-nodes are only a supplement and won't replace the need for true multi-homing for end-sites. Here we have to adapt to business needs, and businesses have made it pretty clear that it is unacceptable to them to be tied to any single provider. Besides, shim6 doesn't eliminate the need for a mechanism to locate any globally unique address. What if there's suddenly 10M LIRs, or otherwise a trend towards a market with very small providers each handling only a small number of customers? Who gets to decide who may peer with whom, or decide which providers will be denied the ability to build redundant connectivity with multiple upstreams? //Per
On Mon, 17 Oct 2005, Per Heldal wrote:
man, 17,.10.2005 kl. 07.17 +0200, skrev Mikael Abrahamsson:
Both MPLS and any tunneled VPN over IP means the core won't have to know about all those prefixes (think aggregation of addresses regionally in the IP case and outer label in the MPLS case).
Hope you don't imply NAT and private addresses like it is usually associated with VPN in the IPv4 world ;)
No, no NAT or RFC1918 implied, even though that might be part of it.
Then of course I guess the argument can be made to put everything on MPLS to avoid the core knowing about anything but outer labels.
<flame>MPLS on its own won't solve anything. Although MPLS has its uses, it smells too much like another desperate attempt from the telco-heads in the ITU crowd to make a packet-switched network look and behave like a circuit-switched network.</flame>
Why? The initial argument for MPLS was that it would solve the core problem and put intelligence at the edge. You would have a core that only needed to know about hundreds of nodes instead of hundreds of thousands of nodes.
Growth can't be eliminated. In the future network you'll have routers that may know a lot about their "local region" of the network but have to rely on nodes that are several hops (even AS-hops) away to pass the packets to more remote destinations. These trust-relationships have to
Yes, that is what's being proposed. Know your internal nodes, announce a single big prefix externally. With ISPs only having a single prefix and no "single customer" prefixes, the routing table can be kept small. Redundancy can be solved with, for instance, shim6.
alternative. Without improved routing protocols, all we can do is to pray that the development of routing hardware in terms of memory and processing capability outpaces the growth of the routing table.
We have done this for 15 years or so; what good has it brought us? Yes, TCAM size etc. has been fairly good at keeping up with routing table size, but at quite a high cost.
Initiatives like shim6 that change the behaviour of leaf-nodes are only a supplement and won't replace the need for true multi-homing for end-sites. Here we have to adapt to business needs, and businesses have
Why? What problem does multihoming with a single prefix solve that a fully working shim6 doesn't? What is the argument that the "internet" needs to know about a lot of end-users, instead of the end-user knowing that each end user might have n IP addresses and that there are n^2 combinations to send packets? Convergence time in the real world today is in the minutes; with shim6 it would, for the end user, be much quicker to "route around" the problem. It shouldn't be any problem to have failover in the subsecond timeframe, even though that might need some kind of hello mechanism, which is suboptimal because it sends traffic not carrying any data.
single provider. Besides, shim6 doesn't eliminate the need for a mechanism to locate any globally unique address. What if there's
I thought DNS solved that?
suddenly 10M LIR's, or otherwise a trend towards a market with very small providers each handling only a small number of customers? Who gets to decide who may peer with whom, or decide which providers will be denied the ability to build redundant connectivity with multiple upstreams?
It costs money to maintain a LIR, which limits the number of economically viable LIRs in the world. -- Mikael Abrahamsson email: swmike@swm.pp.se
On Mon, 17 Oct 2005 at 12:55 +0000, Mikael Abrahamsson wrote: [snip]
<flame>MPLS on its own won't solve anything. Although MPLS has its uses, it smells too much like another desperate attempt from the telco-heads in the ITU crowd to make a packet-switched network look and behave like a circuit-switched network.</flame>
Why? The initial argument for MPLS was that it would solve the core problem and put intelligence at the edge. You would have a core that only needed to know about hundreds of nodes instead of hundreds of thousands of nodes.
My comment about MPLS wasn't directed specifically at this problem. Re-encapsulation may or may not be part of a future solution; if so, which mechanism is TBD. A truly scalable solution will require that the problem is distributed. Isolating the problem in one place (core or edge) is no solution.
Growth can't be eliminated. In the future network you'll have routers that may know a lot about their "local region" of the network but have to rely on nodes that are several hops (even AS-hops) away to pass the packets to more remote destinations. These trust-relationships have to
Yes, that is what's being proposed. Know your internal nodes, announce a single big prefix externally. With ISPs only having a single prefix and no "single customer" prefixes, the routing table can be kept small. Redundancy can be solved with, for instance, shim6.
What I suggested above is not what is being proposed. Current proposals are limited to quirks that make your network appear less complex to the world. OK in the short term, but it doesn't scale. Well, let's try to turn the problem on its head and see if that's clearer: imagine an internet where only your closest neighbors know you exist. The rest of the internet knows nothing about you, except that there are mechanisms that let them "track you down" when necessary. That is very different from today's full routing table.
alternative. Without improved routing protocols, all we can do is to pray that the development of routing hardware in terms of memory and processing capability outpaces the growth of the routing table.
We have done this for 15 years or so; what good has it brought us? Yes, TCAM size etc. has been fairly good at keeping up with routing table size, but at quite a high cost.
True, but there's no law saying that current routing protocols and path-selection algorithms have to stay unchanged forever.
Initiatives like shim6 that change the behaviour of leaf-nodes are only a supplement and won't replace the need for true multi-homing for end-sites. Here we have to adapt to business needs, and businesses have
Why? What problem does multihoming with single prefix solve that a fully working shim6 doesn't?
It does not provide 100% provider-independence, to begin with. Depending on who you ask, that alone is a show-stopper.
What is the argument that the "internet" needs to know about a lot of end-users, instead of the end-user knowing that each end user might have n IP addresses and that there are n^2 combinations to send packets?
The internet shouldn't need to know anything about individual users to begin with, provided there are mechanisms available to track them down. By that I mean that algorithms to locate end-nodes may include mechanisms to "interrogate" a large number of nodes to find the desired location, as opposed to looking it up in a locally stored database (routing table). Note that I'm all for shim6 as a principle, just not in this context. It's perfect for future communications devices that may need to switch between GSM-UMTS, GSM-EDGE, WLAN, WIMAX, ethernet with charging power for mobile units, and whatever else is available by then.
Convergence time in the real world today is in the minutes; with shim6 it would, for the end user, be much quicker to "route around" the problem. It shouldn't be any problem to have failover in the subsecond timeframe, even though that might need some kind of hello mechanism, which is suboptimal because it sends traffic not carrying any data.
single provider. Besides, shim6 doesn't eliminate the need for a mechanism to locate any globally unique address. What if there's
I thought DNS solved that?
I thought DNS only provided a name for an address ;) How does DNS tell us that e.g. 193.10.6.6 is part of a subnet belonging to AS2838 and how to get there?
suddenly 10M LIR's, or otherwise a trend towards a market with very small providers each handling only a small number of customers? Who gets to decide who may peer with whom, or decide which providers will be denied the ability to build redundant connectivity with multiple upstreams?
It costs money to maintain a LIR, which limits the number of economically viable LIRs in the world.
True, but it's just another artificial limit. It doesn't address the real problem of limitations in current core networking protocols. //Per
On Mon, 17 Oct 2005, Per Heldal wrote:
Well, let's try to turn the problem on its head and see if that's clearer: imagine an internet where only your closest neighbors know you exist. The rest of the internet knows nothing about you, except that there are mechanisms that let them "track you down" when necessary. That is very different from today's full routing table.
Yes, it's true that it's different, but is it better?
It does not provide 100% provider-independence, to begin with. Depending on who you ask, that alone is a show-stopper.
Well, the reasons people want to stick to their "own" IP addresses are administrative and technical. If we solve those, then hopefully it won't be such a big hassle to renumber when moving to another provider. Also, if everybody got an equal-sized subnet delegation from each ISP, it shouldn't be that much of a problem to run two "networks" side by side, keeping the subnet part of the delegation identical in both networks while the prefix part differs. If you switch providers, you change the prefix part. This means we need new mechanisms to handle this, but I feel that's better than making the routing mistake again.
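The prefix-swap renumbering described above is mechanical enough to sketch: keep every bit below the provider prefix fixed and substitute only the prefix. The /48 site size and the documentation prefixes are illustrative assumptions.

```python
# Sketch of renumbering by swapping only the provider prefix: the site's
# internal numbering (subnet + host bits) is preserved unchanged.
import ipaddress

def renumber(addr, old_prefix, new_prefix):
    """Move `addr` from old_prefix to new_prefix, preserving all bits
    below the prefix length (the site's internal numbering)."""
    old = ipaddress.ip_network(old_prefix)
    new = ipaddress.ip_network(new_prefix)
    assert old.prefixlen == new.prefixlen
    host_bits = int(ipaddress.ip_address(addr)) - int(old.network_address)
    return ipaddress.ip_address(int(new.network_address) + host_bits)

# Site moves from provider A's /48 to provider B's /48;
# subnet 0x10 and host ::5 stay the same.
print(renumber("2001:db8:aaaa:10::5",
               "2001:db8:aaaa::/48", "2001:db8:bbbb::/48"))
# -> 2001:db8:bbbb:10::5
```

The "new mechanisms" the post asks for are precisely whatever pushes such a substitution into every config, DNS zone, and ACL automatically.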
The internet shouldn't need to know anything about individual users to begin with, provided there are mechanisms available to track them down. By that I mean that algorithms to locate end-nodes may include mechanisms to "interrogate" a large number of nodes to find the desired location, as opposed to looking it up in a locally stored database (routing table).
So what is it you're proposing? I understand what shim6 tries to do (since it basically keeps most of todays mechanisms) but I do not understand your proposal. Could you please elaborate?
I thought DNS only provided a name for an address ;) How does DNS tell us that e.g. 193.10.6.6 is part of a subnet belonging to AS2838 and how to get there?
Should end users really care about that level of routing information? Also, your proposal seems to indicate that we need something like a proxy server that actually does know more about the internet and needs to keep state; that doesn't sound scalable. -- Mikael Abrahamsson email: swmike@swm.pp.se
is that anything like using, in Cisco terms, a "fast-switching cache" vs a "FIB"? On Oct 17, 2005, at 6:47 AM, Mikael Abrahamsson wrote:
Well, let's try to turn the problem on its head and see if that's clearer: imagine an internet where only your closest neighbors know you exist. The rest of the internet knows nothing about you, except that there are mechanisms that let them "track you down" when necessary. That is very different from today's full routing table.
On Mon, 17 Oct 2005 at 07:25 -0700, Fred Baker wrote:
is that anything like using, in Cisco terms, a "fast-switching cache" vs a "FIB"?
I'll bite, as I wrote the paragraph you're quoting. Actually, hanging on to the old concepts may be more confusing than trying to look at it in completely new ways.

Imagine a situation with no access to any means of direct communication (phone etc.). You've got a message to deliver to some person, and have no idea where to find that person. Chances are there's a group of people nearby you can ask. They may know how to find the one you're looking for. If not, they may know others they can ask on your behalf. Several iterations later, the person is located and you've established a path through which you can pass the information you wanted.

Translated into cisco terms, this means that the FIB is just a partial routing database: enough to start the search and otherwise handle communications in the neighborhood (no more than X router-hops, maybe AS-hops, away). When the destination is located you keep that information for a while in case there are more packets going to the same place, similar to what you do with a traditional route-cache.
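The ask-around-then-remember idea in this post can be written down as a toy on-demand route discovery with a route cache. It is a sketch of the concept only (closest in spirit to on-demand schemes like AODV, which this thread does not propose); the topology and names are made up.

```python
# Toy on-demand route discovery: routers hold no route until someone asks,
# then a breadth-first "route request" spreads toward neighbours, and the
# answer is cached at every hop on the way back (like a route-cache).

def discover(graph, src, dst, cache):
    """Return a path from src to dst, caching sub-paths at each hop."""
    if (src, dst) in cache:
        return cache[(src, dst)]
    frontier, seen = [[src]], {src}
    while frontier:
        path = frontier.pop(0)
        if path[-1] == dst:                        # destination replies...
            for i in range(len(path) - 1):         # ...and the reply installs
                cache[(path[i], dst)] = path[i:]   # the route at each hop
            return path
        for nbr in graph.get(path[-1], ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append(path + [nbr])
    return None  # destination could not be tracked down

# Edge net A reaches edge net B through transit routers C1, C2
graph = {"A": ["C1"], "C1": ["A", "C2"], "C2": ["C1", "B"], "B": ["C2"]}
cache = {}
print(discover(graph, "A", "B", cache))   # ['A', 'C1', 'C2', 'B']
print(("C2", "B") in cache)               # True: C2 can now answer directly
```

The FIB-as-partial-database translation above corresponds to `cache` here: state appears only where traffic has actually asked for it, and the hard open problems (trust, loops, policy) are exactly what this toy ignores.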
On Oct 17, 2005, at 6:47 AM, Mikael Abrahamsson wrote:
Well, let's try to turn the problem on its head and see if thats clearer; Imagine an internet where only your closest neighbors know you exist. The rest of the internet knows nothing about you, except there are mechanisms that let them "track you down" when necessary. That is very different from today's full-routing-table.
OK. What you just described is akin to an enterprise network with a default route. It's also akin to the way DNS works. The big question becomes not only "who knows what I need to know", but "how do I know that they actually know it?".

For example, let's postulate that the concept is that each ISP advertises some sort of routing service that will install routes on demand, but requires that someone initiate a request for the route, and requires that either the target system or the edge router in that domain closest to the target respond with a route. Simplistically, perhaps I am trying to route from my edge network ("A") towards your edge network ("B"), and we are both customers of some ISP ("C"). The host A' that is trying to get to your host B' initiates a request. Let's presume that this goes to some name in domain A that lists all border routers, or some multicast group that they are members of. Presumably every border router does this, but for the present discussion the border router in A connecting to router C' in C asks all of his peers (POPs?) for the route, and some other router C" asks B's border router. B's border router has the route, and so replies; C" replies to C', C' to A's border router, and that router to A'. A' can now send a message. Presumably, if someone else now asks C about the route, either C' or C" - or, if the route was multicast to all of C's edge routers, any router in C - would be able to respond directly.

This becomes more interesting if C is in fact a succession of peer ISPs, or ISPs that purchase transit from other ISPs. It also becomes very interesting if some router D' is misconfigured to advertise itself as B.

It's not dissimilar to ant routing. For that, there is a variety of literature; Google is your friend.
In manet and sensor networks it works pretty well, especially in the sense that once it finds a route it keeps using it while it continues working, even if other routes are changing around it, and it can use local repair to deal with local changes. At least as the researchers have described it, it doesn't do "policy" very well, and in networks that tend to be stable (such as wired networks) its load and convergence properties can be improved upon. I'll let you do the reading. On Oct 17, 2005, at 9:20 AM, Per Heldal wrote:
On Mon, 17 Oct 2005 at 07:25 -0700, Fred Baker wrote:
is that anything like using, in Cisco terms, a "fast-switching cache" vs a "FIB"?
I'll bite as I wrote the paragraph you're quoting;
Actually, hanging on to the old concepts may be more confusing than trying to look at it in completely new ways.
Imagine a situation with no access to any means of direct communication (phone etc). You've got a message to deliver to some person, and have no idea where to find that person. Chances are there's a group of people nearby you can ask. They may know how to find the one you're looking for. If not they may know others they can ask on your behalf. Several iterations later the person is located and you've established a path through which you can pass the information you wanted.
Translated into cisco terms, this means that the FIB is just a partial routing database: enough to start the search and otherwise handle communications in the neighborhood (no more than X router-hops, maybe AS-hops, away). When the destination is located you keep that information for a while in case there are more packets going to the same place, similar to what you do with a traditional route-cache.
On Oct 17, 2005, at 6:47 AM, Mikael Abrahamsson wrote:
Well, let's try to turn the problem on its head and see if that's clearer: imagine an internet where only your closest neighbors know you exist. The rest of the internet knows nothing about you, except that there are mechanisms that let them "track you down" when necessary. That is very different from today's full routing table.
On Mon, 17 Oct 2005 at 11:29 -0700, Fred Baker wrote:
OK. What you just described is akin to an enterprise network with a default route. It's also akin to the way DNS works.
No default, just one or more *potential* routes. Your input is appreciated, and yes, I'm very much aware that many people who maintain solutions that assume full/total control of the entire routing table will be screaming bloody murder if that is going to change. Further details about future inter-domain routing concepts belong in other fora (e.g. the IETF's inter-domain routing WG). The long-term operational impact is that the current inter-domain routing concepts (BGP etc.) don't scale indefinitely and will have to be changed some time in the future. Thus expect the size of the routing table to be eliminated from the list of limiting factors, or the bar to be considerably raised.

---

Note that I'm not saying that nothing should be done to preserve and optimise the use of the resources that are available today just because there will be something better available in a distant future. I'm in favor of the most restrictive allocation policies in place today. The development of the internet has for a long time been based on finding better ways to use available resources (CIDR, anyone?). To me a natural next step in that process is for RIRs to start reclaiming unused v4 address blocks, or at least start collecting data to document that space is not being used (if they're not already doing so). E.g. previously announced address blocks that have disappeared from the global routing table for more than X months should go back to the RIR pool (X<=6). //Per
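The reclamation check proposed above (blocks absent from the global table for more than X months go back to the pool) is a simple comparison against last-seen data. The snapshot format here, prefix to last-seen date, is an assumption; in practice the data would come from archived table dumps such as RIS.

```python
# Sketch of stale-allocation detection: flag allocated blocks that have
# been unseen in the routing table for more than `months`.  Prefixes are
# documentation blocks; the 30-day month is a deliberate approximation.
from datetime import date

def reclaim_candidates(allocations, last_seen, today, months=6):
    """Return allocated blocks unseen in the routing table for > months."""
    stale = []
    for prefix in allocations:
        seen = last_seen.get(prefix)
        if seen is None or (today - seen).days > months * 30:
            stale.append(prefix)
    return stale

allocs = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]
last_seen = {"192.0.2.0/24": date(2005, 9, 1),       # still announced
             "198.51.100.0/24": date(2005, 1, 1)}    # gone for ~9 months
print(reclaim_candidates(allocs, last_seen, date(2005, 10, 17)))
# -> ['198.51.100.0/24', '203.0.113.0/24']
```

Never-announced space (the third block) falls out of the same check, which matches the "at least start collecting data" half of the proposal.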
That is an assumption that I haven't found it necessary to make. I have concluded that there is no real debate about whether the Internet will have to change to something that gives us the ability to directly address a whole lot more things than it does today (i.e., not behind a NAT, which imposes some "interesting" requirements at the application layer, and gateways of the sort which the Internet came about precisely to avoid). The debate is about how and when. "When" seems pretty solidly in the 3-10 year timeframe, exactly where in that timeframe being a point of some discussion, and "how" comes down to a choice of "IPv6" or "something else".

Fleming's IPv8 was a non-stupid idea that has a fundamental flaw in that it re-interprets parts of the IPv4 header as domain identifiers - it effectively extends the IP address by 16 bits, which is good, but does so in a way that is not backward compatible. If we could make those 16 bits be AS numbers (ignoring for the moment the fact that we seem to need larger AS numbers), the matter follows pretty quickly. If one is going to change the header, though, giving up fragmentation as a feature seems a little tough; one may as well change the header and manage to keep the capability. One also needs to change some other protocols, such as routing on AS numbers and including them in DNS records as part of the address.

From my perspective, we are having enough good experience with IPv6 that we should simply choose that approach; there isn't a real good reason to find a different one. Yes, that means there is a long coexistence period, yada yada yada. This is also true of any other fundamental network-layer protocol change. The RIRs have been trying pretty hard to make IPv6 allocations be one prefix per ISP, with truly large edge networks being treated as functionally equivalent to an ISP (PI addressing without admitting it is being done).
Make the bald assertion that this is equal to one prefix per AS (they're not the same statement at all, but the number of currently assigned AS numbers exceeds the number of prefixes under discussion, so in my mind it makes a reasonable thumb-in-the-wind guesstimate); that is a reduction of the routing table size by an order of magnitude. If we are able to reduce the routing table size by an order of magnitude, I don't see that we have a requirement to fundamentally change the routing technology to support it. We may *want* to (and yes, I would like to, for various reasons), but that is a different assertion. On Oct 17, 2005, at 12:42 PM, Per Heldal wrote:
On Mon, 17 Oct 2005 at 11:29 -0700, Fred Baker wrote:
OK. What you just described is akin to an enterprise network with a default route. It's also akin to the way DNS works.
No default, just one or more *potential* routes.
Your input is appreciated, and yes I'm very much aware that many people who maintain solutions that assume full/total control of the entire routing-table will be screaming bloody murder if that is going to change. Further details about future inter-domain-routing concepts belong in other fora (e.g. ietf's inter-domain-routing wg).
The long-term operational impact is that the current inter-domain-routing concepts (BGP etc) don't scale indefinitely and will have to be changed some time in the future. Thus expect the size of the routing-table to be eliminated from the list of limiting factors, or that the bar is considerably raised.
---
Note that I'm not saying that nothing should be done to preserve and optimise the use of the resources that are available today just because there will be something better available in a distant future. I'm in favor of the most restrictive allocation policies in place today. The development of the internet has for a long time been based on finding better ways to use available resources (CIDR anyone). To me a natural next-step in that process is for RIRs to start reclaiming unused v4 address-blocks, or at least start collecting data to document that space is not being used (if they're not already doing so). E.g. previously announced address-blocks that have disappeared from the global routing-table for more than X months should go back to the RIR-pool (X<=6).
//Per
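Fred's one-prefix-per-AS guesstimate above is easy to check as arithmetic. The 2005-era figures used here (roughly 170k IPv4 prefixes in a full table, roughly 20k assigned ASes) are assumed round numbers, not measured values:

```python
# Thumb-in-the-wind check of the "order of magnitude" claim: full-table
# prefix count versus one v6 prefix per AS.  Both inputs are assumed
# 2005-era round numbers.
v4_prefixes = 170_000   # assumed full-table size, ca. 2005
active_ases = 20_000    # assumed; ~ one v6 prefix per ISP / large site

reduction = v4_prefixes / active_ases
print(f"~{reduction:.1f}x fewer routes")
```

Under these inputs the reduction is roughly 8.5x, which is close enough to an order of magnitude for the argument being made, while Tony's reply below is that any one-time factor only buys linear time.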
Fred,
If we are able to reduce the routing table size by an order of magnitude, I don't see that we have a requirement to fundamentally change the routing technology to support it. We may *want* to (and yes, I would like to, for various reasons), but that is a different assertion.
There is a fundamental difference between a one-time reduction in the table and a fundamental dissipation of the forces that cause it to bloat in the first place. Simply reducing the table as a one-off only buys you linearly more time. Eliminating the drivers for bloat buys you technology generations. If we're going to put the world thru the pain of change, it seems that we should do our best to ensure that it never, ever has to happen again. Regards, Tony
On 10/17/05 4:51 PM, "Tony Li" <tony.li@tony.li> wrote:
Fred,
If we are able to reduce the routing table size by an order of magnitude, I don't see that we have a requirement to fundamentally change the routing technology to support it. We may *want* to (and yes, I would like to, for various reasons), but that is a different assertion.
There is a fundamental difference between a one-time reduction in the table and a fundamental dissipation of the forces that cause it to bloat in the first place. Simply reducing the table as a one-off only buys you linearly more time. Eliminating the drivers for bloat buys you technology generations.
If we're going to put the world thru the pain of change, it seems that we should do our best to ensure that it never, ever has to happen again.
That's the goal here? To ensure we'll never have another protocol transition? I hope you realize what a flawed statement that is. We can't see into the future. However, assuming that IPv6 is the Last Protocol seems a bit short-sighted. If we get 20 years out of IPv6, that will be just peachy. Of course, if we can't get PI address space for enterprises and real multihoming, there won't be any real IPv6 deployment. Lots of (possibly illegitimate) IPv4 trading and NAT, but not IPv6. These aren't nice-to-haves. Even if it shortens the life of IPv6, that is an acceptable trade-off. IPv6 is not the Last Protocol.
Regards, Tony
Dan
There is a fundamental difference between a one-time reduction in the table and a fundamental dissipation of the forces that cause it to bloat in the first place. Simply reducing the table as a one-off only buys you linearly more time. Eliminating the drivers for bloat buys you technology generations.
If we're going to put the world thru the pain of change, it seems that we should do our best to ensure that it never, ever has to happen again.
That's the goal here? To ensure we'll never have another protocol transition? I hope you realize what a flawed statement that is.
tony probably did not think about it because that's not what he said at all. he was speaking of routing table bloat, not transitions. and he was spot on. randy
Randy; we are living on an Earth of small size (only about 6,400 km in radius), so we will never see unlimited growth in multihomed networks. It is not a problem. We are not building an Internet for the whole universe. Good old Moore can deal with our planet very well. I have repeated many times: the IPv6 idea of changing the multihoming approach is VERY BAD and will not survive for more than 1-2 years (if IPv6 survives at all, which I have many doubts about). ----- Original Message ----- From: "Randy Bush" <randy@psg.com> To: "Daniel Golding" <dgolding@burtongroup.com> Cc: "Tony Li" <tony.li@tony.li>; "Fred Baker" <fred@cisco.com>; "Per Heldal" <heldal@eml.cc>; <nanog@merit.edu> Sent: Monday, October 17, 2005 2:16 PM Subject: Re: And Now for Something Completely Different (was Re: IPv6 news)
There is a fundamental difference between a one-time reduction in the table and a fundamental dissipation of the forces that cause it to bloat in the first place. Simply reducing the table as a one-off only buys you linearly more time. Eliminating the drivers for bloat buys you technology generations.
If we're going to put the world thru the pain of change, it seems that we should do our best to ensure that it never, ever has to happen again.
That's the goal here? To ensure we'll never have another protocol transition? I hope you realize what a flawed statement that is.
tony probably did not think about it because that's not what he said at all. he was speaking of routing table bloat, not transitions.
and he was spot on.
randy
Daniel,
If we're going to put the world thru the pain of change, it seems that we should do our best to ensure that it never, ever has to happen again.
That's the goal here? To ensure we'll never have another protocol transition? I hope you realize what a flawed statement that is. We can't see into the future. However, assuming that IPv6 is the Last Protocol seems a bit short-sighted. If we get 20 years out of IPv6, that will be just peachy.
I see that as a worthy goal and no, I don't see that as flawed. While we certainly cannot guarantee that v6 will be the last protocol, we should certainly be designing for it to be the best that we can possibly make it. Just how many times do you think that we will replace all implementations? This change is simply fundamental to the way the Internet works. There is almost as much pain associated with this change as if we were to change the electric outlet voltage in every single country to a mutually incompatible standard. Can you imagine power companies making that change and then telling consumers to expect another such change in 20 years? To not even *attempt* to avoid future all-systems changes is nothing short of negligent, IMHO.
Of course, if we can't get PI address space for enterprises and real multihoming, there won't be any real IPv6 deployment. Lots of (possibly illegitimate) IPv4 trading and NAT, but not IPv6.
These aren't nice-to-haves. Even if it shortens the life of IPv6, that is an acceptable trade-off.
IPv6 is not the Last Protocol.
If you do get PI space for multihoming, then by definition, it cannot be the last protocol. In fact, it will have cemented v6's lifetime as just 10 more years. Tony
On Oct 17, 2005, at 2:24 PM, Tony Li wrote:
To not even *attempt* to avoid future all-systems changes is nothing short of negligent, IMHO.
On Oct 17, 2005, at 2:17 PM, Randy Bush wrote:
and that is what the other v6 ivory tower crew said a decade ago. which is why we have the disaster we have now.
and there I would agree, on both points. Now, the proposal put forward lo these many moons ago to avoid any possibility of a routing change was, as I recall, Nimrod, and the Nimrod architecture called for variable-length addresses in the network layer protocol and the use of a flow label (as in "IPv6 flow label") as a short-form address in some senses akin to a virtual circuit ID. There has been a lot of work on that in the RRG among other places, but the word from those who would deploy it has been uniformly "think in terms of an incremental upgrade to BGP" and "maybe MPLS will work as a virtual circuit ID if we really need one". As you no doubt recall all too well, the variable-length address was in fact agreed on at one point, but that failed for political reasons. Something about OSI. The 16-byte length of an IPv6 address derived from that as well - it didn't allow one to represent an NSAP in IPv6, which was an objective. So the routing problem was looked at, and making a fundamental routing change was rejected by both the operational community and the routing folks. No, IPv6 doesn't fix (or even change) the routing of the system, and that problem will fester until it becomes important enough to change. But let's not blame that on the "ivory tower folks", at least not wholly. We were all involved.
Wasn't Noel Chiappa Nimrod's "father"? He explained his philosophy to me in an interview a decade ago, as well as why he believed that BGP was not sustainable. Yet here we are, still chugging along. Meanwhile, back to your operational flows ;-) ============================================================= The COOK Report on Internet Protocol, 431 Greenway Ave, Ewing, NJ 08618 USA 609 882-2572 (PSTN) 415 651-4147 (Lingo) cook@cookreport.com Subscription info: http://cookreport.com/subscriptions.shtml IMS and an Internet Economic & Business Model at: http://cookreport.com/14.09.shtml ============================================================= On Oct 17, 2005, at 5:58 PM, Fred Baker wrote:
now, the proposal put forward lo these many moons ago to avoid any possibility of a routing change was, as I recall, Nimrod, and the Nimrod architecture called for variable length addresses in the network layer protocol and the use of a flow label (as in "IPv6 flow label") as a short-form address in some senses akin to a virtual circuit ID.
Fred,
So the routing problem was looked at, and making a fundamental routing change was rejected by both the operational community and the routing folks.
No, IPv6 doesn't fix (or even change) the routing of the system, and that problem will fester until it becomes important enough to change.
From this end of the elephant, we looked at Nimrod and saw potential there, but did not buy off on it. We also looked at GSE and the routing folks at the very least seemed bought into that, but it died, under what I would characterize as a purely political hailstorm. Yes, the lack of a scalable routing architecture will continue to fester until it has sufficient political visibility that it exceeds our pain threshold and we decide to make the change. The problem with this model is that the pain of change grows daily. Each and every v6 system that is deployed is yet another bit of installed base that will need to be updated someday. The Internet community needs the IETF leadership to understand this very clearly and to take action to resolve this issue sooner, not later. As others have said, this is a pay now or pay later situation, and the pay later portion is MUCH more expensive. Specifically, the IAB should call for a halt to IPv6 deployment until consensus is reached on a scalable routing architecture. I realize that this is painful, but continuing to deploy is simply creating a v6 mortgage that we cannot afford to pay off. Regards, Tony
tony.li@tony.li (Tony Li) writes:
Specifically, the IAB should call for a halt to IPv6 deployment until consensus is reached on a scalable routing architecture. I realize that this is painful, but continuing to deploy is simply creating a v6 mortgage that we cannot afford to pay off.
well, maybe so. but IAB could never make deployment start, and IAB cannot make deployment stop. deployment happens on its own terms. leadership has a built in time delay and a couple layers of indirection between intent and action and results. if you want IAB to lead somehow, give them some runway. -- Paul Vixie
On Mon, 17 Oct 2005 14:24:08 -0700 Tony Li <tony.li@tony.li> wrote: Dear Tony et al.; This is beginning to sound like an IETF or IRTF mail list, and, lo!, I get an email today from Leslie Daigle : A new mailing list has been created to provide a forum for general discussion of Internet architectural issues: architecture-discuss@ietf.org https://www1.ietf.org/mailman/listinfo/architecture-discuss The architecture-discuss list serves as a technical discussion forum for all members of the IETF community that are interested in larger architectural issues. It is meant to be an open discussion forum for all long and/or wide range architectural concerns related to the Internet Architecture. In particular, it may be used to discuss and bring forth different points of view on controversial architectural questions. Discussions that drill down and dwell on specifics of individual protocols or technologies are to be discouraged, redirected to appropriate other fora, or re-cast to discussions of broader implications for Internet architecture. Maybe this conversation should move there. Maybe a lot of operators should join that list. Probably couldn't hurt. Regards Marshall Eubanks
Daniel,
If we're going to put the world thru the pain of change, it seems that we should do our best to ensure that it never, ever has to happen again.
That's the goal here? To ensure we'll never have another protocol transition? I hope you realize what a flawed statement that is. We can't see into the future. However, assuming that IPv6 is the Last Protocol seems a bit short-sighted. If we get 20 years out of IPv6, that will be just peachy.
I see that as a worthy goal and no, I don't see that as flawed. While we certainly cannot guarantee that v6 will be the last protocol, we should certainly be designing for it to be the best that we can possibly make it. Just how many times do you think that we will replace all implementations?
This change is simply fundamental to the way the Internet works. There is almost as much pain associated with this change as if we were to change the electric outlet voltage in every single country to a mutually incompatible standard. Can you imagine power companies making that change and then telling consumers to expect another such change in 20 years?
To not even *attempt* to avoid future all-systems changes is nothing short of negligent, IMHO.
Of course, if we can't get PI address space for enterprises and real multihoming, there won't be any real IPv6 deployment. Lots of (possibly illegitimate) IPv4 trading and NAT, but not IPv6.
These aren't nice-to-haves. Even if it shortens the life of IPv6, that is an acceptable trade-off.
IPv6 is not the Last Protocol.
If you do get PI space for multihoming, then by definition, it cannot be the last protocol. In fact, it will have cemented v6's lifetime as just 10 more years.
Tony
works for me - I did say I'd like to change the routing protocol - but I think the routing protocol can be changed asynchronously, and will have to. On Oct 17, 2005, at 1:51 PM, Tony Li wrote:
Fred,
If we are able to reduce the routing table size by an order of magnitude, I don't see that we have a requirement to fundamentally change the routing technology to support it. We may *want* to (and yes, I would like to, for various reasons), but that is a different assertion.
There is a fundamental difference between a one-time reduction in the table and a fundamental dissipation of the forces that cause it to bloat in the first place. Simply reducing the table as a one-off only buys you linearly more time. Eliminating the drivers for bloat buys you technology generations.
If we're going to put the world thru the pain of change, it seems that we should do our best to ensure that it never, ever has to happen again.
Regards, Tony
works for me - I did say I'd like to change the routing protocol - but I think the routing protocol can be changed asynchronously, and will have to.
and that is what the other v6 ivory tower crew said a decade ago. which is why we have the disaster we have now. randy
At 04:51 PM 10/17/2005, Tony Li wrote:
Fred,
If we are able to reduce the routing table size by an order of magnitude, I don't see that we have a requirement to fundamentally change the routing technology to support it. We may *want* to (and yes, I would like to, for various reasons), but that is a different assertion.
There is a fundamental difference between a one-time reduction in the table and a fundamental dissipation of the forces that cause it to bloat in the first place. Simply reducing the table as a one-off only buys you linearly more time. Eliminating the drivers for bloat buys you technology generations.
If we're going to put the world thru the pain of change, it seems that we should do our best to ensure that it never, ever has to happen again.
But wasn't that the rationale for originally putting the kitchen sink into IPv6, rather than fixing the address length issue? I think we missed a lot of opportunities. Extended addressing may well have been possible to integrate in the mid-1990s, ahead of much of the massive Internet expansion. Too late. We're 10 years on, and talking about whether there will need to be more than one massive pain of migration, because the kitchen sink didn't take into account multihoming. Now we're talking about a solution that appears to be an even worse Rube Goldberg contraption than token-ring source-route bridging. Moore will likely have to continue to produce the solution.
Daniel,
But wasn't that the rationale for originally putting the kitchen sink into IPv6, rather than fixing the address length issue?
The stated rationale was to fix the address length issue.
I think we missed a lot of opportunities.
Amen.
We're 10 years on, and talking about whether there will need to be more than one massive pain of migration, because the kitchen sink didn't take into account multihoming.
More generally because we were unwilling to make changes to the routing architecture.
Now we're talking about a solution that appear to be an even worse Rube Goldberg than token ring source-route bridging.
No one has proposed anything that is as bad as the exponential traffic explosion caused by explorers.
Moore will likely have to continue to produce the solution.
What happens if he can't? Silicon technology *is* topping out. What happens to v6 if every single household and business on the planet decides to multihome? Tony
Moore will likely have to continue to produce the solution.
What happens if he can't? Silicon technology *is* topping out. What happens to v6 if every single household and business on the planet decides to multihome?
I often wonder what would happen if IETF and NANOG were to collectively swear off reductio ad absurdum scenarios. For extra credit, the reader is invited to compare and contrast the relative probabilities of "pervasive end-site multihoming" and "IETF and NANOG learning what a 'false dichotomy' is". Traditional multihoming techniques work today and buy us a decade or more _until the v6 routing table is the same size as today's v4 table_. That's a decade to make Pad's seminal work (cited by Vixie earlier; it's nice to have it back in print!) required reading for IETF participation and smack some sense into the acolytes of Rube Goldberg whose influence on shim6 is all too apparent. ---Rob
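Rob's "decade or more" figure can be sanity-checked with back-of-envelope arithmetic. The table sizes and growth rate below are illustrative assumptions, not measurements:

```python
# Back-of-envelope: years until the v6 DFZ table, growing at a compound
# rate, reaches the size of today's v4 table. All inputs are assumed
# figures for illustration only.
import math

def years_to_reach(start: float, target: float, annual_growth: float) -> float:
    """Years for `start` routes to reach `target` at `annual_growth` (0.5 = 50%/yr)."""
    return math.log(target / start) / math.log(1.0 + annual_growth)

V4_TABLE = 170_000   # assumed 2005 v4 DFZ size
V6_TABLE = 700       # assumed 2005 v6 table size
GROWTH = 0.5         # assumed 50% annual growth in v6 prefixes

print(f"{years_to_reach(V6_TABLE, V4_TABLE, GROWTH):.1f} years")
```

Even with an aggressive 50% annual growth assumption, the gap takes well over a decade to close, consistent with the claim above.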
It doesn't look like we're talking about the same thing. A. Address conservation and aggregation (IPv4 and IPv6) is very important to get the most out of what we've got. Read: limit the combined routing-table to a manageable size, whatever that may be. B. There seems to be widespread fear that the global routing-table will grow to a non-manageable size sooner or later regardless of the efforts under A, so the limit has to be removed. What you address below mostly belongs under A (and I mostly agree), whereas I have so far focused on B. On Mon, 2005-10-17 at 13:12 -0700, Fred Baker wrote:
That is an assumption that I haven't found it necessary to make. I have concluded that there is no real debate about whether the Internet will have to change to something that gives us the ability to directly address (e.g. not behind a NAT, which imposes some "interesting" requirements at the application layer and gateways of the sort which were what the Internet came about to not need) a whole lot more things than it does today. The debate is about how and when. "when" seems pretty solidly in the 3-10 year timeframe, exactly where in that timeframe being a point of some discussion, and "how" comes down to a choice of "IPv6" or "something else".
Sure, something has to replace IPv4 sooner or later, and IPv6 is almost certainly the thing. Personally I believe the most trustworthy projections point towards 2010 as the time we'll run out of addresses in v4, with an additional 2-3 years if we can manage to reclaim up to 90% of those previously allocated address blocks which are not used or not announced to the public Internet.
Fleming's IPv8 was a non-stupid idea that has a fundamental flaw in that it re-interprets parts of the IPv4 header as domain identifiers - it effectively extends the IP address by 16 bits, which is good, but does so in a way that is not backward compatible. If we could make those 16 bits be AS numbers (and ignoring for the moment the fact that we seem to need larger AS numbers), the matter follows pretty quickly. If one is going to change the header, though, giving up fragmentation as a feature seems a little tough; one may as well change the header and manage to keep the capability. One also needs to change some other protocols, such as routing AS numbers and including them in DNS records as part of the address.
For the record: you brought up IPv8. Nothing of what I've mentioned requires any change to transport protocols, whether implemented on top of IPv4 or IPv6.
From my perspective, we are having enough good experience with IPv6 that we should simply choose that approach; there isn't a real good reason to find a different one. Yes, that means there is a long coexistence period yada yada yada. This is also true of any other fundamental network layer protocol change.
The RIRs have been trying pretty hard to make IPv6 allocations be one prefix per ISP, with truly large edge networks being treated as functionally equivalent to an ISP (PI addressing without admitting it is being done). Make the bald assertion that this is equal to one prefix per AS (they're not the same statement at all, but the number of currently assigned AS numbers exceeds the number of prefixes under discussion, so in my mind it makes a reasonable thumb-in-the-wind guesstimate), that is a reduction of the routing table size by an order of magnitude.
I wouldn't even characterise that as being bald. Initial allocations of more than one prefix per AS should not be allowed. Further, initial allocations should differentiate between networks of various sizes into separate address blocks to simplify and promote strict prefix-filtering policies. Large networks may make arrangements with their neighbors to honor more-specifics, but that shouldn't mean that the rest of the world should accept those.
If we are able to reduce the routing table size by an order of magnitude, I don't see that we have a requirement to fundamentally change the routing technology to support it. We may *want* to (and yes, I would like to, for various reasons), but that is a different assertion.
Predictions disagree. //Per
we agree that at least initially every prefix allocated should belong to a different AS (eg, no AS gets more than one); the fly in that is whether there is an ISP somewhere that is so truly large that it needs two super-sized blocks. I don't know if such exists, but one hopes it is very much the exception. The question is "does every AS get a prefix". Under current rules, most AS's assigned to edge networks to support multihoming will not get a prefix. I personally think that's probably the wrong answer (eg, you and I seem to agree on PI space for networks that would warrant an AS number due to size, connectivity, and use of BGP to manage their borders), but it is the current answer. On Oct 17, 2005, at 2:06 PM, Per Heldal wrote:
The RIRs have been trying pretty hard to make IPv6 allocations be one prefix per ISP, with truly large edge networks being treated as functionally equivalent to an ISP (PI addressing without admitting it is being done). Make the bald assertion that this is equal to one prefix per AS (they're not the same statement at all, but the number of currently assigned AS numbers exceeds the number of prefixes under discussion, so in my mind it makes a reasonable thumb-in-the-wind guesstimate), that is a reduction of the routing table size by an order of magnitude.
I wouldn't even characterise that as being bald. Initial allocations of more than one prefix per AS should not be allowed. Further; initial allocations should differentiate between network of various sizes into separate address-blocks to simplify and promote strict prefix-filtering policies. Large networks may make arrangements with their neighbors to honor more specifics, but that shouldn't mean that the rest of the world should accept those.
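Fred's "thumb-in-the-wind" guesstimate quoted above can be made concrete. The prefix and AS counts here are assumed 2005-era round numbers, not authoritative figures:

```python
# If every AS announced exactly one aggregate, the table would shrink from
# one entry per prefix to one entry per AS. Round-number assumptions below.
V4_PREFIXES = 170_000     # assumed DFZ prefix count
ASSIGNED_ASNS = 20_000    # assumed count of assigned AS numbers

reduction = V4_PREFIXES / ASSIGNED_ASNS
print(f"~{reduction:.1f}x fewer routes")  # close to an order of magnitude
```

Under these assumed counts the ratio lands near 10x, which is why "one prefix per AS" is loosely equated with an order-of-magnitude reduction.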
----- Original Message ----- From: "Fred Baker" <fred@cisco.com> To: "Per Heldal" <heldal@eml.cc> Cc: <nanog@merit.edu> Sent: Monday, October 17, 2005 15:12 Subject: Re: And Now for Something Completely Different (was Re: IPv6 news)
That is an assumption that I haven't found it necessary to make. I have concluded that there is no real debate about whether the Internet will have to change to something that gives us the ability to directly address (e.g. not behind a NAT, which imposes some "interesting" requirements at the application layer and gateways of the sort which were what the Internet came about to not need) a whole lot more things than it does today. The debate is about how and when. "when" seems pretty solidly in the 3-10 year timeframe, exactly where in that timeframe being a point of some discussion, and "how" comes down to a choice of "IPv6" or "something else".
Fleming's IPv8 was a non-stupid idea that has a fundamental flaw in that it re-interprets parts of the IPv4 header as domain identifiers - it effectively extends the IP address by 16 bits, which is good, but does so in a way that is not backward compatible. If we could make those 16 bits be AS numbers (and ignoring for the moment the fact that we seem to need larger AS numbers), the matter follows pretty quickly. If one is going to change the header, though, giving up fragmentation as a feature seems a little tough; one may as well change the header and manage to keep the capability. One also needs to change some other protocols, such as routing AS numbers and including them in DNS records as part of the address.
From my perspective, we are having enough good experience with IPv6 that we should simply choose that approach; there isn't a real good reason to find a different one. Yes, that means there is a long coexistence period yada yada yada. This is also true of any other fundamental network layer protocol change.
The RIRs have been trying pretty hard to make IPv6 allocations be one prefix per ISP, with truly large edge networks being treated as functionally equivalent to an ISP (PI addressing without admitting it is being done). Make the bald assertion that this is equal to one prefix per AS (they're not the same statement at all, but the number of currently assigned AS numbers exceeds the number of prefixes under discussion, so in my mind it makes a reasonable thumb-in-the-wind guesstimate), that is a reduction of the routing table size by an order of magnitude.
If we are able to reduce the routing table size by an order of magnitude, I don't see that we have a requirement to fundamentally change the routing technology to support it. We may *want* to (and yes, I would like to, for various reasons), but that is a different assertion.
On Oct 17, 2005, at 12:42 PM, Per Heldal wrote:
mon, 17,.10.2005 kl. 11.29 -0700, Fred Baker:
OK. What you just described is akin to an enterprise network with a default route. It's also akin to the way DNS works.
No default, just one or more *potential* routes.
Your input is appreciated, and yes I'm very much aware that many people who maintain solutions that assume full/total control of the entire routing-table will be screaming bloody murder if that is going to change. Further details about future inter-domain-routing concepts belong in other fora (e.g. ietf's inter-domain-routing wg).
The long-term operational impact is that the current inter-domain-routing concepts (BGP, etc.) don't scale indefinitely and will have to be changed some time in the future. Thus, expect routing-table size either to be eliminated from the list of limiting factors, or at least for the bar to be raised considerably.
---
Note that I'm not saying that nothing should be done to preserve and optimise the use of the resources that are available today just because there will be something better available in a distant future. I'm in favor of the most restrictive allocation policies in place today. The development of the Internet has for a long time been based on finding better ways to use available resources (CIDR, anyone?). To me a natural next step in that process is for RIRs to start reclaiming unused v4 address blocks, or at least start collecting data to document that space is not being used (if they're not already doing so). E.g. previously announced address blocks that have disappeared from the global routing-table for more than X months should go back to the RIR pool (X<=6).
//Per
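Per's reclamation rule can be sketched in a few lines. The prefixes, timestamps, and 30-day-month approximation below are hypothetical; a real implementation would consume announcement history from a route collector rather than a hand-written table:

```python
# Sketch of the rule: a previously announced block that has been absent
# from the global table for more than X months (X <= 6) returns to the
# RIR pool. All data below is hypothetical.
from datetime import date, timedelta

def reclaimable(last_seen: date, today: date, months: int = 6) -> bool:
    """True if the block has been unannounced for longer than `months`."""
    return (today - last_seen) > timedelta(days=30 * months)

last_announced = {                         # hypothetical per-block history
    "192.0.2.0/24":    date(2005, 9, 1),   # seen recently: keep
    "198.51.100.0/24": date(2004, 12, 1),  # dark for ~10 months: reclaim
}
today = date(2005, 10, 17)
to_reclaim = sorted(p for p, seen in last_announced.items()
                    if reclaimable(seen, today))
print(to_reclaim)  # → ['198.51.100.0/24']
```

Note the rule as stated only measures absence from the table; it says nothing about space that is in use but deliberately unannounced, which is the objection Michael raises further down the thread.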
Thus spake "Fred Baker" <fred@cisco.com>
The RIRs have been trying pretty hard to make IPv6 allocations be one prefix per ISP, with truly large edge networks being treated as functionally equivalent to an ISP (PI addressing without admitting it is being done). Make the bald assertion that this is equal to one prefix per AS (they're not the same statement at all, but the number of currently assigned AS numbers exceeds the number of prefixes under discussion, so in my mind it makes a reasonable thumb-in-the-wind guesstimate), that is a reduction of the routing table size by an order of magnitude.
If we are able to reduce the routing table size by an order of magnitude, I don't see that we have a requirement to fundamentally change the routing technology to support it. We may *want* to (and yes, I would like to, for various reasons), but that is a different assertion.
If we reduce the average number of prefixes per AS by an order of magnitude, IMHO the result will be an order-of-magnitude growth in the number of ASes. We're just going to trade one problem for another. What we need is an interdomain routing system that can either (a) drastically reduce the incremental cost of additional prefixes in the DFZ, or (b) move the existing cost out of the DFZ to the people who want to multihome. Both probably mean ditching BGP4 and moving to some sort of inter-AS MPLS scheme, but it will never see the light of day unless it leaves hosts and intra-site routing intact (i.e. hop-by-hop routing and a single prefix per site). This last is why shim6 is DOA. S Stephen Sprunk "Stupid people surround themselves with smart CCIE #3723 people. Smart people surround themselves with K5SSS smart people who disagree with them." --Aaron Sorkin
What we need is an interdomain routing system that can either (a) drastically reduce the incremental cost of additional prefixes in the DFZ, or (b) move the exist cost out of the DFZ to the people who want to multihome. Both probably mean ditching BGP4 and moving to some sort of inter-AS MPLS scheme, but it will never see the light of day unless it allows leaving hosts and intra-site routing intact (i.e. hop-by-hop routing and a single prefix per site). This last is why shim6 is DOA.
or... drop the idea of "A"/"THE" default free zone and recognize that the concept is based on a flawed assumption. --bill
S
Stephen Sprunk "Stupid people surround themselves with smart CCIE #3723 people. Smart people surround themselves with K5SSS smart people who disagree with them." --Aaron Sorkin
E.g. previously announced address blocks that have disappeared from the global routing-table for more than X months should go back to the RIR pool (X<=6).
In RFC 2050, section 3 a): the organization has no intention of connecting to the Internet - either now or in the future - but it still requires a globally unique IP address. The organization should consider using reserved addresses from RFC 1918. If it is determined this is not possible, they can be issued unique (if not Internet-routable) IP addresses. Seems to me that the Internet routing table contents, past and present, are irrelevant. Note also that the so-called Internet routing table contents vary depending on where you look at it. In any case, at the ARIN meeting in LA there will be an Open Policy Hour for people to suggest things that the RIRs should do. More info here: http://www.arin.net/ARIN-XVI/agenda/tuesday.html#policy --Michael Dillon
Thus spake <Michael.Dillon@btradianz.com>
E.g. previously announced address blocks that have disappeared from the global routing-table for more than X months should go back to the RIR pool (X<=6).
In RFC 2050, section 3 a): the organization has no intention of connecting to the Internet - either now or in the future - but it still requires a globally unique IP address. The organization should consider using reserved addresses from RFC 1918. If it is determined this is not possible, they can be issued unique (if not Internet-routable) IP addresses.
Seems to me that the Internet routing table contents past and present are irrelevant. Note also that the so-called Internet routing table contents vary depending on where you look at it.
Obviously if the RIRs contacted the folks responsible for a given block and were provided justification for its continued allocation, then it should not be reclaimed. On the other hand, folks sitting on several class Bs and not using them could have their blocks reclaimed trivially; ditto for companies that no longer exist. The last is certainly doable without much risk of controversy. However, one of the articles referred to recently in this thread (I forget which) showed that even if we reclaimed all of the address space that is currently unannounced (in use or not), we'd buy ourselves a trivially short extension of the IPv4 address space exhaustion date. Considering the cost of performing the task, doing so seems rather pointless; our time would be better spent getting IPv6 deployed and either reengineering the routing system or switching to geo addresses. S Stephen Sprunk "Stupid people surround themselves with smart CCIE #3723 people. Smart people surround themselves with K5SSS smart people who disagree with them." --Aaron Sorkin
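The "trivially short extension" point reduces to simple arithmetic. Both figures below are rough assumptions for illustration, not measured values:

```python
# If reclamation recovered the equivalent of RECLAIMED_SLASH8S /8s, and
# the Internet consumes BURN_PER_YEAR /8s per year, exhaustion moves out
# by their ratio in years. Both numbers are assumptions.
RECLAIMED_SLASH8S = 20.0   # assumed recoverable dark space, in /8 units
BURN_PER_YEAR = 10.0       # assumed consumption rate, /8s per year

extension_years = RECLAIMED_SLASH8S / BURN_PER_YEAR
print(f"{extension_years:.1f} extra years")
```

Under these assumptions reclamation buys only a couple of years, which is why the effort-versus-benefit comparison above comes out against it.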
Obviously if the RIRs contacted the folks responsible for a given block and were provided justification for its continued allocation, then it should not be reclaimed. On the other hand, folks sitting on several class Bs and not using them could have their blocks reclaimed trivially; ditto for companies that no longer exist. The last is certainly doable without much risk of
controversy.
This is exactly what the Internic did many years ago. I, like many other people, had registered a .com domain name at no cost. Then suddenly one day, the Internic said that I had to pay an annual subscription fee for this domain name. Like many others, I paid my fees. There were a few complaints about this but by and large people accepted the idea that you had to MAINTAIN a business relationship with the domain name registry in order to continue using a domain name. For some reason, this concept of MAINTAINING a business relationship with the registry has not caught on in the Regional IP Registry arena. Of course, a large number of IP address users do pay annual membership subscription fees to ARIN (or other RIRs) but not all. And the RIRs seem unwilling to withdraw services (in-addr.arpa) from those who do not MAINTAIN a business relationship.
However, one of the articles referred to recently in this thread (I forget which) showed that even if we reclaimed all of the address space that is
currently unannounced (in use or not), we'd buy ourselves a trivially short extension of the IPv4 address space exhaustion date. Considering the cost of performing the task, doing so seems rather pointless; our time would be better spent getting IPv6 deployed and either reengineering the routing system or switching to geo addresses.
From the viewpoint of avoiding an addressing crisis and avoiding a v6 transition crisis, your advice is sound. (The article referred to is probably this one from the Cisco IP Journal, which has been mentioned a few times in the past week: http://www.cisco.com/en/US/about/ac123/ac147/archived_issues/ipj_8-3/ipv4.ht...) However, from the viewpoint of having a sensibly functioning RIR system, I think we still need to deal with two issues. One is that the holders of IP address allocations should be required to maintain a business relationship with the RIR or lose the right to use those IP addresses. The other is that the RIRs need to fix the whole debacle that is "whois" and "routing registries". There should be no need for 3rd party bogon lists. The RIRs should publish an authoritative registry, rooted in IANA, that covers the entire IPv4 and IPv6 address spaces. An operator faced with receiving a new BGP announcement should be able to query such a registry and find out whether or not the address block is allocated, who it is allocated to, and whether that party intends to announce that exact block size from that exact AS number. --Michael Dillon
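A minimal sketch of the registry check described above: given a new BGP announcement, query an authoritative registry to learn whether the exact prefix is allocated and whether the registered holder intends to originate it from that AS. The registry contents, return values, and schema below are made-up illustrations, not any real RIR interface.

```python
# Hypothetical authoritative-registry lookup for validating a new BGP
# announcement, per the idea above. All data here is illustrative.

REGISTRY = {
    # prefix         -> (holder, intended origin AS)
    "192.0.2.0/24": ("Example Corp", 64500),
}

def validate_announcement(prefix, origin_as):
    entry = REGISTRY.get(prefix)
    if entry is None:
        # Not in the registry at all: a bogon candidate for filtering.
        return "unregistered"
    holder, intended_as = entry
    if intended_as != origin_as:
        # Allocated, but not meant to be announced from this AS.
        return "origin-mismatch"
    return "valid"
```

With such a registry rooted in IANA, third-party bogon lists would be redundant: any prefix that returns "unregistered" could be filtered at the operator's discretion.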
Imagine a situation with no access to any means of direct communication (phone etc). You've got a message to deliver to some person, and have no idea where to find that person. Chances are there's a group of people nearby you can ask. They may know how to find the one you're looking for. If not they may know others they can ask on your behalf. Several iterations later the person is located and you've established a path through which you can pass the information you wanted.
Translated into Cisco terms, this means that the FIB is just a partial routing database, enough to start the search and otherwise handle communications in the neighborhood (no more than X router-hops, maybe AS-hops away). When the destination is located you keep that information for a while in case there are more packets going to the same place, similar to what you do with a traditional route-cache.
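A toy model of the "partial FIB plus on-demand search" idea above: a router that knows only its own neighborhood, interrogates peers about unknown destinations, and caches answers like a traditional route-cache. The classes and query mechanism are illustrative assumptions, not a real protocol.

```python
# Demand-driven forwarder sketch: local FIB first, then cache, then
# iterative search via neighbors. Everything here is illustrative.
import time

class StubPeer:
    """Stands in for a neighbor that can be asked about a destination."""
    def __init__(self, routes):
        self.routes = routes
    def query(self, dest):
        return self.routes.get(dest)

class DemandRouter:
    def __init__(self, local_fib, neighbors, cache_ttl=60.0):
        self.fib = dict(local_fib)   # routes for the local neighborhood only
        self.neighbors = neighbors   # peers we can interrogate
        self.ttl = cache_ttl
        self.cache = {}              # dest -> (next_hop, expiry)

    def lookup(self, dest):
        if dest in self.fib:                      # known locally
            return self.fib[dest]
        hit = self.cache.get(dest)
        if hit and hit[1] > time.time():          # found by an earlier search
            return hit[0]
        for peer in self.neighbors:               # iterative outward search
            nh = peer.query(dest)
            if nh is not None:
                self.cache[dest] = (nh, time.time() + self.ttl)
                return nh
        return None

r = DemandRouter({"192.0.2.0/24": "eth0"},
                 [StubPeer({"198.51.100.0/24": "peer-b"})])
```

The first packet toward an unknown destination pays the search latency; subsequent packets hit the cache, exactly as with a route-cache miss followed by fast-switched traffic.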
check out "The Landmark Hierarchy: A New Hierarchy for Routing in Very Large Networks"; Paul Tsuchiya; 1989. randy
On Mon, Oct 17, 2005 at 09:03:45AM -1000, Randy Bush wrote:
Imagine a situation with no access to any means of direct communication (phone etc). You've got a message to deliver to some person, and have no idea where to find that person. Chances are there's a group of people nearby you can ask. They may know how to find the one you're looking for. If not they may know others they can ask on your behalf. Several iterations later the person is located and you've established a path through which you can pass the information you wanted.
Translated into Cisco terms, this means that the FIB is just a partial routing database, enough to start the search and otherwise handle communications in the neighborhood (no more than X router-hops, maybe AS-hops away). When the destination is located you keep that information for a while in case there are more packets going to the same place, similar to what you do with a traditional route-cache.
check out "The Landmark Hierarchy: A New Hierarchy for Routing in Very Large Networks"; Paul Tsuchiya; 1989.
great stuff... i have a hardcopy. is it online yet? --bill (checking citesear...)
randy
check out "The Landmark Hierarchy: A New Hierarchy for Routing in Very Large Networks"; Paul Tsuchiya; 1989. great stuff... i have a hardcopy. is it online yet?
dunno if i would say great. but certainly good. <http://portal.acm.org/citation.cfm?id=52329> randy
check out "The Landmark Hierarchy: A New Hierarchy for Routing in Very Large Networks"; Paul Tsuchiya; 1989.
great stuff... i have a hardcopy. is it online yet?
Just google for "landmark routing" and you will find lots of papers and presentations that deal with the topic. If OSPF area boundaries were more fluid, rather like the period covered by a moving average... Of course, this might not be so nice if it was done across AS boundaries. --Michael Dillon
That reminds me of anycasting or routing issues. Hackers did use this technique to make use of ip addresses not really allocated. There would be no need for IPv6 if this was more widespread. How about claiming to be f.root-servers.net and setting up our own root :) Regards, Peter and Karin Dambier
is that anything like using, in Cisco terms, a "fast-switching cache" vs a "FIB"?
On Oct 17, 2005, at 6:47 AM, Mikael Abrahamsson wrote:
Well, let's try to turn the problem on its head and see if that's clearer: Imagine an internet where only your closest neighbors know you exist. The rest of the internet knows nothing about you, except there are mechanisms that let them "track you down" when necessary. That is very different from today's full-routing-table.
man, 17,.10.2005 kl. 19.16 +0200, skrev Peter Dambier:
That reminds me of anycasting or routing issues.
Hackers did use this technique to make use of ip addresses not really allocated. There would be no need for IPv6 if this was more widespread.
How about claiming to be f.root-servers.net and setting up our own root :)
Yeah, you'd love that, wouldn't you? ;) Trust that security considerations would be an important part of the design of any replacement for current routing protocols. //Per
man, 17,.10.2005 kl. 15.47 +0000, skrev Mikael Abrahamsson:
On Mon, 17 Oct 2005, Per Heldal wrote:
Well, let's try to turn the problem on its head and see if that's clearer: Imagine an internet where only your closest neighbors know you exist. The rest of the internet knows nothing about you, except there are mechanisms that let them "track you down" when necessary. That is very different from today's full-routing-table.
Yes, it's true that it's different, but is it better?
This thread, as well as most messages on this mailinglist in the last 2 days, says so. Everyone uses all their energy trying to work within the limits of the current scheme. Common sense says it would be better to eliminate the problem. What happens to policies if there's no limit to the size of the routing-table?
It does not provide 100% provider-independence to begin with. Depending on who you ask, that alone is a show-stopper.
Well, the reasons people want to stick to their "own" IP addresses are administrative and technical. If we solve that, then hopefully it won't be such a big hassle to renumber when going to another provider.
I'm not so sure it will be that easy to get the flexibility you want. How do you, for example, enforce rules of flexibility on *all* dns-providers?
Also, if everybody got their equal-size subnet delegation from each ISP then it shouldn't be that much of a problem to run two "networks" side-by-side by using the subnet part of the delegation equal to both networks, but keep the prefix separate. If you switch providers you change the prefix part. This means we need new mechanisms to handle this, but I feel that's better than doing the routing mistake again.
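The "keep the subnet, swap the provider prefix" idea above can be sketched in a few lines. The textual address format and field widths here are illustrative assumptions, not a real addressing plan.

```python
# Sketch: a site's internal numbering (subnet + host) is stable, while the
# provider prefix in front of it is swappable. Prefixes are made up.

def site_address(provider_prefix, subnet, host):
    """Compose an address from a provider part plus a stable subnet/host
    part that survives a change of provider."""
    return f"{provider_prefix}:{subnet}::{host}"

# The same internal numbering reachable via two providers at once:
via_isp_a = site_address("2001:db8:a", "10", "1")
via_isp_b = site_address("2001:db8:b", "10", "1")

# Switching providers only replaces the prefix part; the subnet and host
# parts, and hence the internal network design, stay put.
```

Renumbering then reduces to changing one variable, which is roughly the promise the paragraph above makes, provided DNS and referenced configuration can follow the prefix change.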
True, but it creates unnecessary complexity for end-systems. It still doesn't help with scalability on the next level up.
The internet shouldn't need to know anything about individual users to begin with, provided there are mechanisms available to track them down. By that I mean that algorithms to locate end-nodes may include mechanisms to "interrogate" a large number of nodes to find the desired location, as opposed to looking it up in a locally stored database (routing-table).
So what is it you're proposing? I understand what shim6 tries to do (since it basically keeps most of todays mechanisms) but I do not understand your proposal. Could you please elaborate?
What I've got can't be called a proposal. There's no solution to propose. I just think that network complexity should be handled in the network and not by exporting the problem to connected clients. BGP and its related path-selection algorithms have served us well for many years, but there's a need for alternatives and somebody has to get involved.
I thought DNS only provided a name for an address ;) How does DNS tell us that e.g. 193.10.6.6 is part of a subnet belonging to AS2838 and how to get there?
Should end users really care for that level of routing information?
I never said so. Their equipment, their upstream, or the upstream's upstream may need to know to get there though.
Also, your proposal seems to indicate that we need something that sounds like a proxy server that actually does know more about the internet and needs to keep state; this doesn't sound scalable?
There's no proxy server involved unless you count forwarding of route location requests between inter-domain routers as proxy. If so, all intra-domain routers would be proxies. Data transport along an established forwarding path would not change. This mailinglist isn't really the place to discuss future concepts and further discussion should move to the IETF Inter-Domain-Routing working-group or other suitable forum. //Per
On Mon, 17 Oct 2005, Mikael Abrahamsson wrote:
Also, if everybody got their equal-size subnet delegation from each ISP then it shouldn't be that much of a problem to run two "networks" side-by-side by using the subnet part of the delegation equal to both networks, but keep the prefix separate. If you switch providers you change the prefix part. This means we need new mechanisms to handle this,
Hmm, one thing that would be nice would be to separate the prefix from the host identifier in DNS, oh... wait..
but I feel that's better than doing the routing mistake again.
It's all routing, just shifting different portions of it to different places. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: panic ("Splunge!"); linux-2.2.16/drivers/scsi/psi240i.c
# ... # # You are missing the point. # # Currently multihomed sites have multiple path entries in the routing table # for a specific multihomed prefix. # # Instead of having multiple paths, you would have multiple location records # in DNS. (Which are A records and any possible reordering by round robin is # not part of the actual path selection algorithm and which are associated # with identifiers through a standard to be designed as part of the new # protocol.) # # The process of how you select which one to use (and what knobs you have for # tuning) is a design decision, the same way it was with BGP and OSPF. yes. we called this the A6/DNAME/SRV approach. # * You throw away real topology information and rely on latency (or # whatever), and using latency doesn't work because it doesn't allow traffic # engineering according to policy. (Who said you have to give everybody the # same set of locators? Paul might say ewwww. ... some applications need best-latency. others, best-isochrony. still others, highest-bandwidth. i don't think the network should have a single policy, but i don't think every application should have to test for latency, isochrony, and bandwidth toward each potential endpoint before it selects one. can't we collect this information as "hint state" and make it available to applications who will then make their decision privately? or failing that, just use SRV? # FWIW, if you want the ability to tell different peers different answers like # with BGP you will need the ability to give different answers with the new # protocol.) that's how i envision that the NSID option would be used from the client side, though i recognize that this will make ed lewis's head explode, that's OK too.
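The "just use SRV" suggestion above amounts to selecting one of several location records by priority and weight. Here is a small sketch of that SRV-style selection (lowest priority wins; weight gives a weighted random choice within the winning priority class). The record tuples are illustrative, and this is only the selection step, not a DNS lookup.

```python
# SRV-style locator selection sketch: records are (priority, weight,
# locator) tuples, as in an SRV RRset. Data is made up for illustration.
import random

def pick_locator(records):
    """Pick a locator: lowest priority class wins; within that class,
    do a weighted random choice by weight."""
    best = min(priority for priority, _, _ in records)
    pool = [(w, loc) for p, w, loc in records if p == best]
    r = random.uniform(0, sum(w for w, _ in pool))
    for w, loc in pool:
        r -= w
        if r <= 0:
            return loc
    return pool[-1][1]   # guard against floating-point residue
```

Traffic engineering then becomes record management: a site that wants to prefer one locator hands out a lower priority for it, much as BGP policy prefers one path.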
Tony, On Oct 16, 2005, at 1:15 AM, Tony Li wrote:
A real locator/identifier separation requires a rewrite.
Not necessarily. If you transition at the edge, what happens within the site matters only to the site and what matters to the core only matters to the core. No stacks, either core or edge, need to be rewritten.
Any system that provided site-wide source address control was going to require a rewrite.
Not necessarily (depending on what you mean by the ambiguous term "address"). A lot depends on the actual requirements for source locator or identifier control.
David, I should point out that if only a small number of folks care about multihoming, then only a small number of folks need to change their stacks.
I thought all clients would have to be modified if they wanted to take full advantage of a shim6 enabled multi-homed server?
And even in your solution, there would need to be some changes to the end host if you want to support exit point selection, or carry alternate locators in the transport.
One of the problems that I have seen in the IETF is calling desires "requirements". How important is exit point selection? Are there ways of implementing exit point selection without changing the IP stack? How critical is it that alternate locators be carried in the transport? Does the lack of that functionality make the protocol unusable? What _are_ the actual requirements (not the "Goals")? From my perspective, the really, really critical flaw in both IPv4 and IPv6 is the lack of _transparent_ renumberability. Multi-homing is also a flaw, but less critical and I believe it can be addressed with the right solution to renumberability. A "few" folks worry about multi-homing. Everybody worries about end site renumbering.
It's just a mess. I think that we all can agree that a real locator/identifier split is the correct architectural direction, but that's simply not politically tractable.
Right. And since it couldn't be done the right way in the protocol, we make the protocol much more complicated and force a reset to address functionality that relatively few folks need. I'm suggesting not mucking with the packet format anymore. It might be ugly, but it can be made to work until somebody comes up with IPv7. Instead, since the locator/identifier split wasn't done in the protocol, do the split in _operation_. Make the edge/core boundary real. Both edge and core could be addressed without hierarchy, but the mapping between the edge and core would change such that the edge would never be seen in the DFZ. Within the core, nothing new or different need be done (well, except for deploying IPv6 and running the core/edge translators). Within the edge, nothing new need be done either. Yes, it is a hack. But I suspect it would address the real requirements (or, at least my pet requirement :-)). Rgds, -drc
drc@virtualized.org (David Conrad) wrote:
I'm suggesting not mucking with the packet format anymore. It might be ugly, but it can be made to work until somebody comes up with IPv7. Instead, since the locator/identifier split wasn't done in the protocol, do the split in _operation_.
It has been done a long time ago, IMHO. I wonder whether I am the only one seeing this, but we already have a (albeit routing-) locator (ASN) and an identifier (IP address), that are pretty much distinct and where the routing locator is not used inside the "local" network, but only outside. There's your edge/core boundary. Every multi-homer will be needing their own ASN, so that's what clutters up your routing tables. It's economy there. Btw, a lot of ASNs advertise one network only. People surely think multihoming is important to them (and I cannot blame them for that). Hierarchical routing is one possible solution, with a lot of drawbacks and problems. Forget about geographic hierarchies; there's always people who do not peer. Visibility radius limitation is another (I cannot believe the idea is new, I only don't know what it's called). Cheers, Elmi. -- "Just don't make the mistake of substituting expertise for opinion." (PLemken, <bu6o7e$e6v0p$2@ID-31.news.uni-berlin.de>) --------------------------------------------------------------[ ELMI-RIPE ]---
David,
A real locator/identifier separation requires a rewrite.
Not necessarily. If you transition at the edge, what happens within the site matters only to the site and what matters to the core only matters to the core. No stacks, either core or edge, need to be rewritten.
Transitioning at the edge implies to me that the hosts need to know about different semantics for the IPv6 header. That, in turn, implies that there is different code for the hosts. Alternately, if there is no new code anywhere, it's clear that you must necessarily have the same semantics and must not have made a change.
Any system that provided site-wide source address control was going to require a rewrite.
Not necessarily (depending on what you mean by the ambiguous term "address"). A lot depends on the actual requirements for source locator or identifier control.
Again, source address selection is going to require something different than what we have today. The host might have to interact with some centralized policy server, execute a selection algorithm, or consult an oracle. Whatever that might be, there is new code involved.
David, I should point out that if only a small number of folks care about multihoming, then only a small number of folks need to change their stacks.
I thought all clients would have to be modified if they wanted to take full advantage of a shim6 enabled multi-homed server?
The keyword there is "full". Unmodified clients can still interact with a multi-homed server in a legacy manner.
And even in your solution, there would need to be some changes to the end host if you want to support exit point selection, or carry alternate locators in the transport.
One of the problems that I have seen in the IETF is calling desires "requirements". How important is exit point selection? Are there ways of implementing exit point selection without changing the IP stack? How critical is it that alternate locators be carried in the transport? Does the lack of that functionality make the protocol unusable?
What _are_ the actual requirements (not the "Goals")? From my perspective, the really, really critical flaw in both IPv4 and IPv6 is the lack of _transparent_ renumberability. Multi-homing is also a flaw, but less critical and I believe it can be addressed with the right solution to renumberability. A "few" folks worry about multi-homing. Everybody worries about end site renumbering.
As with any political process, the stated requirements are a function of perspective. The stated requirements may or may not have anything to do with reality, realizability, practicality, or architectural elegance.
It's just a mess. I think that we all can agree that a real locator/identifier split is the correct architectural direction, but that's simply not politically tractable.
Right. And since it couldn't be done the right way in the protocol, we make the protocol much more complicated and force a reset to address functionality that relatively few folks need.
It could have been done the right way in the protocol, but it wasn't. Yes, the result is that the subsequent 'work around' solution is much more complicated than it could have been. Again, between multihoming and mobility, the ubiquity and necessity of Internet access, and the reliability of the last mile, this is not going to remain a rare or even minority issue. Regards, Tony
Tony, On Oct 18, 2005, at 1:56 PM, Tony Li wrote:
Not necessarily. If you transition at the edge, what happens within the site matters only to the site and what matters to the core only matters to the core. No stacks, either core or edge, need to be rewritten.
Transitioning at the edge implies to me that the hosts need to know about different semantics for the IPv6 header. That, in turn, implies that there is different code for the hosts. Alternately, if there is no new code anywhere, it's clear that you must necessarily have the same semantics and must not have made a change.
No. The only code change that must occur is at the core/edge transition device _at both ends_. Let me try explaining this by example.

Assume you have:
- ISP A providing service to site X.
- ISP B providing service to site Y.
- ISP A has locator prefix A000::0
- ISP B has locator prefix B000::0
- Site X has identifier prefix 1000::0
- Site Y has identifier prefix 2000::0
- Host 1000::1 wants to send a packet to host 2000::2

Then:
1. Packet leaves host 1000::1 destined for 2000::2 and ends up at the site edge router for ISP A.
2. The site edge router for ISP A sees destination 2000::2 and looks up the locator in some globally distributed database using the identifier prefix 2000::0, getting back locator prefix B000::0.
3. The site edge router for ISP A rewrites the destination address with the locator prefix, i.e., B000::2.
4. The site edge router for ISP A forwards the packet to the next (core) hop for destination B000::2.
5. The site edge router for ISP B receives the packet destined for B000::2.
6. The site edge router for ISP B rewrites the destination prefix with the identifier prefix, i.e., 2000::2.
7. The site edge router for ISP B forwards the packet to the destination.

You want multi-homing? Site Y has two ISPs, each having their own locator prefix, e.g., ISP B (B000::0) and ISP C (C000::0). The lookup at step 2 returns two locators and the site edge router for ISP A can choose which path to take (perhaps with advice from the administrator of Site Y encoded in the response from the lookup, e.g., a preference or priority value). Transparent renumbering is obvious. Mobility might be possible with a little work and the old site edge router forwarding to the new site edge router for the duration of the cached response from the lookup. No code changes within the site or within the core would be necessary.
Of course, the tricky bit is in looking up the locator in the globally distributed database and caching the response (which presumably would be necessary because the lookup will take a long time, relatively speaking). When a new 'conversation' between two hosts starts, the initial packet will obviously have increased latency, but subsequent packets could rely on cached information. Again, I realize this is a hack, but I suspect it is a hack that impacts fewer points than something like shim6.
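The prefix-rewriting steps in David's example can be sketched directly. The prefixes, the "globally distributed database", and the textual address handling below are all illustrative assumptions lifted from the example, not a real protocol.

```python
# Edge/core translator sketch for the example above: identifier prefix
# <-> locator prefix rewrites at the site edge routers. All values are
# the made-up prefixes from the example.

LOCATOR_DB = {
    "1000": ["A000"],          # site X: single-homed via ISP A
    "2000": ["B000", "C000"],  # site Y: multihomed via ISPs B and C
}

def split(addr):
    prefix, _, host = addr.partition("::")
    return prefix, host

def egress_rewrite(dst):
    """Source-side site edge router: identifier -> locator (steps 2-4)."""
    ident_prefix, host = split(dst)
    locators = LOCATOR_DB[ident_prefix]   # the distributed-database lookup
    chosen = locators[0]                  # policy/preference choice goes here
    return chosen + "::" + host

def ingress_rewrite(dst, ident_prefix):
    """Destination-side site edge router: locator -> identifier (step 6)."""
    _, host = split(dst)
    return ident_prefix + "::" + host
```

Multi-homing falls out of the list-valued lookup (pick B000 or C000 by preference), and renumbering is a database update rather than a host change, which is the point of doing the split in operation rather than in the protocol.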
Again, source address selection is going to require something different than what we have today. The host might have to interact with some centralized policy server, execute a selection algorithm, or consult an oracle. Whatever that might be, there is new code involved.
Well, yes, if source address selection is important. My point was that there didn't have to be new code in the IP stack.
As with any political process, the stated requirements are a function of perspective. The stated requirements may or may not have anything to do with reality, realizability, practicality, or architectural elegance.
Hmm. Are the aliens who took the _real_ IETF and replaced it with what's there now going to give it back? :-)
It could have been done the right way in the protocol, but it wasn't. Yes, the result is that the subsequent 'work around' solution is much more complicated than it could have been.
I grant additional complexity is necessary. However, additional complexity in every system seems like a bad idea to me.
Again, between multihoming and mobility, the ubiquity and necessity of Internet access, and the reliability of the last mile, this is not going to remain a rare or even minority issue.
I very much agree. Rgds, -drc
On Tue, 2005-10-18 at 15:52 -0700, David Conrad wrote:
Hmm. Are the aliens who took the _real_ IETF and replaced it with what's there now going to give it back? :-)
Sure they'll hand it back ... when there is no more money to be made from IETF-related technology and politicians no longer feel it's interesting ;) Otoh, the IETF is a function of its participants. Businesses today have such fear of competitors and intellectual-property issues that they can hardly cooperate on anything. Thus, the number of technology researchers available to participate in public forums is just a fraction of what it was 20 or 30 years ago. If enough people find the IETF unworkable, wouldn't they just form an alternative forum that would be to the IETF what the IETF has been to the ITU? //Per
On 16-Oct-2005, at 03:37, David Conrad wrote:
Shifting the NAT to end system removed the objection to NAT, tho it's not entirely clear why. Shifting NAT to the end system also happened to simplify the entire solution as well.
Except for the part about having to rewrite all existing implementations to take full advantage of the technology.
Thought experiment: how many different software vendors need to change their shipping IPv6 code in order for some new feature like shim6 to be 80% deployed in the server and client communities of hosts? I'm thinking it's probably less than 5, but I'd be interested to hear opinions to the contrary. Joe
On Sun, 16 Oct 2005 10:55:38 EDT, Joe Abley said:
Thought experiment: how many different software vendors need to change their shipping IPv6 code in order for some new feature like shim6 to be 80% deployed in the server and client communities of hosts?
I'm thinking it's probably less than 5, but I'd be interested to hear opinions to the contrary.
Client end, if Microsoft, MacOS X, and the various Linuxoids shipped, you'd have pretty good coverage. Maybe Solaris 11 if they're still relevant by then. A few vendor Unixoids (AIX, Irix, etc), and proprietary systems (z/OS), but those vendors will either read the writing on the wall or fade away... Router end, probably same number - Cisco, Juniper, Linksys and a few other SOHO-class vendors, plus a few I've overlooked. The number is certainly more than 5, very likely close to 1 dozen, unlikely to be more than 2 dozen. Of course, even if everybody shipped a *working* *interoperable* product today (quit giggling - we're being hypothetical here), you'd still have a 3-5 year timeframe before all the stuff sold yesterday and years previous got upgraded or replaced.
On 16-Oct-2005, at 16:20, Valdis.Kletnieks@vt.edu wrote:
On Sun, 16 Oct 2005 10:55:38 EDT, Joe Abley said:
Thought experiment: how many different software vendors need to change their shipping IPv6 code in order for some new feature like shim6 to be 80% deployed in the server and client communities of hosts?
I'm thinking it's probably less than 5, but I'd be interested to hear opinions to the contrary.
Client end, if Microsoft, MacOS X, and the various Linuxoids shipped, you'd have pretty good coverage. Maybe Solaris 11 if they're still relevant by then. A few vendor Unixoids (AIX, Irix, etc), and proprietary systems (z/OS), but those vendors will either read the writing on the wall or fade away...
To get 80%, I think on the client side you just need support in Microsoft v6-capable operating systems.
Router end, probably same number - Cisco, Juniper, Linksys and a few other SOHO-class vendors, plus a few I've overlooked.
I don't think you need any support at all on routers for shim6 to be functional for services that users and content providers care about. On the server side, Windows plus Solaris plus Linux seems like it might give 80%. So, that makes Windows, Solaris and Linux. Whether the answer is three, five or twelve, the point I was attempting to make was that it's not necessarily a huge deployment obstacle to roll out shim6 across a good proportion of the network's hosts from the coding point of view. Since no flag day is required, this does not seem necessarily unmanageable.
Of course, even if everybody shipped a *working* *interoperable* product today (quit giggling - we're being hypothetical here), you'd still have a 3-5 year timeframe before all the stuff sold yesterday and years previous got upgraded or replaced.
Sure. In some ways it's fortunate that Microsoft has yet to ship an operating system where v6 is turned on by default (right? I heard that Vista will be the first?) On the server side there's the additional upgrade carrot that upgrades will facilitate multi-homing, which may make for an easier sale. Anyway. Thought experiment. Joe
# ... # # Obviously, some of the disadvantages of such an approach would be that it # would require both ends to play and end users wouldn't be able to # traceroute. I'm sure there are many other disadvantages as well. ... ok, so here's the problem. we don't have what the iab thinks of as end-to-end and we have not had it for a long time and it's not coming back under any circumstances. but the people willing to serve on the iab, as filtered down to the set of people willing to be put on the iab by any particular nomcom, do not believe this, or they believe it but they behave like a supreme court nominee who gets an inevitable question about roe-v-wade and their knee jerks and they say "i support the constitution". so even though NAT is here to stay and firewalls are here to stay and proxies are here to stay and most ipv6 deployment by the end of its useful lifetime will have used RFC1918-like private addressing, or be behind firewalls that limit flows to what a security administrator can predict and protect and understand... officially the IAB can never, ever recognize this or act on it or make decisions or interpretations or recommendations based on it. that's how politics "just is" and our proper course is to build and deploy technology that works even if it goes against what the IAB has writ and seems a little bit subversive at the time it comes out (as with firewalls, and NAT), and let the political world play catch-up to the real world that we actually live in. # However, if an approach like this would be technically feasible (and I'm not # entirely sure it would be), I suspect it would get deployed _much_ faster # than an approach that requires every network stack to be modified. Again. # Particularly given the number of folks who care about multi-homing are so # small relative to the number of folks on the Internet. right. # Can two evils make a good? :-) definitely.
# > but when similar things were proposed at other meetings, somebody always # > said "no! we have to have end-to-end, and if we'd wanted # > nat-around-every-net we'd've stuck with IPv4." # # Is VJ compression considered a violation of the "end-to-end" principle? # # Or perhaps I misunderstand (yet again). vj is a framing protocol. IP goes in, IP comes out. universality is retained.
Is VJ compression considered a violation of the "end-to-end" principle?
VJ compression happens in the middle of the network, between two routers/gateways. End-to-end refers to the hosts, i.e. the computers which "host" the end users' applications. Of course, in the old days, many of these "hosts" also carried out the function of a gateway using a dialup modem, but that is still not violating the end-to-end model because the end user application never knows about the VJ compression.

NAT is different because it causes some end-user applications to fail entirely. For instance, an application which sends its IP address to another host with the instructions "call me back when something interesting happens". The NAT box in the middle causes the callback to fail in most cases.

An end-to-end multihoming solution that is consistent with the end-to-end model will allow any application to communicate with another host even when one of the hosts moves to a different network location. BGP multihoming achieves this by announcing the small number of possible locations where a particular netblock can be found. The telephone system solves this by providing a central directory service where the network looks up an 800 number (or any portable number) to find the current location of the destination. Some people have used DNS techniques to do a similar sort of IPv4 multihoming, notably Paul Vixie and an Israeli box vendor whose name escapes me at the moment.

Theoretically, in a network, a router/gateway could have some intelligence/state so that it does not simply forward packets based on destination addresses in the routing table. Instead it does some kind of query/lookup to identify the real destination location. If you stick this functionality directly in the end hosts themselves, then you have SHIM6. If you stick the functionality in the provider edge router then you have MPLS.

--Michael Dillon
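The "call me back" failure described above comes down to one asymmetry: the NAT box rewrites the source address in the IP header, but not the copy of the address the application embedded in its payload. A toy illustration (addresses are documentation-range examples, not from the original message):

```python
# Why NAT breaks address-in-payload callbacks: header and payload
# disagree about who the sender is. Addresses are illustrative.

PRIVATE_ADDR = "10.0.0.5"     # what the host believes its address is
PUBLIC_ADDR = "203.0.113.7"   # what the NAT presents to the outside world

def application_payload():
    # The application embeds its own (private) address in the data...
    return {"msg": "call me back", "addr": PRIVATE_ADDR}

def nat_rewrite_header(src_addr):
    # ...while the NAT rewrites only the header's source address.
    return PUBLIC_ADDR if src_addr == PRIVATE_ADDR else src_addr

header_src = nat_rewrite_header(PRIVATE_ADDR)
callback_target = application_payload()["addr"]
# The remote host calls back the payload address, which is an
# unroutable private address from its point of view, so the call fails.
```

This is why VJ compression is harmless (the packet that comes out equals the packet that went in) while NAT is not: NAT changes one copy of the address but cannot, in general, find and fix the other.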
Michael, On Oct 17, 2005, at 6:17 AM, Michael.Dillon@btradianz.com wrote:
Is VJ compression considered a violation of the "end-to-end" principle? VJ compression happens in the middle of the network, between two routers/gateways. End-to-end refers to the hosts, i.e. the computers which "host" the end users' applications.
It was a rhetorical question. My point was that what happens between the ends is irrelevant if what gets sent from one end is what is received by the other end. Yes, it is obvious, however I have seen people freak out when you suggest touching the address fields, regardless of the fact that you say it'll be put back before it hits the destination end host.
Theoretically, in a network, a router/gateway could have some intelligence/state so that it does not simply forward packets based on destination addresses in the routing table. Instead it does some kind of query/lookup to identify the real destination location.
Yes. All of this is simply hackery to get around a fundamental flaw in the Internet Protocol (either v4 or v6) - the lack of locator/identifier separation. My concerns with shim6 aren't that the protocol is broken, but rather it is so complex that I fear (a) it will take a very long time to implement, (b) it will take much, much longer to implement correctly, (c) it will never get fully deployed. Since I see multi-homing/renumbering/mobility (all facets of the same thing) as the underlying problem with IPv4, I'm hoping that by addressing that problem, IPv6 could actually justify its existence in a business sense. Since non-shim6 enabled stacks are already being deployed, I suspect an edge box approach will be the most pragmatic way of actually getting something people can use. Unfortunately, delays in deploying some sort of multi-homing/renumbering/mobility solution will, I suspect, entrench (single sided) NAT even more than it is entrenched today, even on IPv6 sites. So it goes. Rgds, -drc
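David's locator/identifier point can be illustrated with a toy sketch. This is not shim6 itself; the class, names, and addresses are invented purely to show the separation: upper layers bind to a stable identifier, and a shim maps that identifier to whichever locator (topology-dependent address) currently works, so a failed upstream need not kill the session.

```python
# Toy locator/identifier separation (NOT the shim6 protocol): sessions
# reference a stable identifier; the shim resolves it to a live locator.
class Shim:
    def __init__(self, mapping: dict[str, list[str]]):
        self.mapping = mapping          # identifier -> ordered locator list
        self.dead: set[str] = set()     # locators currently unreachable

    def current_locator(self, ident: str) -> str:
        """Return the first locator for this identifier still believed live."""
        for loc in self.mapping[ident]:
            if loc not in self.dead:
                return loc
        raise ConnectionError(f"no live locator for {ident}")

shim = Shim({"host-A": ["2001:db8:aaaa::1", "2001:db8:bbbb::1"]})
print(shim.current_locator("host-A"))   # first upstream's address
shim.dead.add("2001:db8:aaaa::1")       # first upstream fails...
print(shim.current_locator("host-A"))   # ...the session survives on the second
```

The hard parts David alludes to (detecting failure, securing the identifier-to-locator binding, deployment) are exactly what this toy omits.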
The problem with that (and many premises) is that we need to remember these arguments and foreseen "problems" were all dreamed up 10 or so years ago. The status of everyone's network, everyone's business needs and everyone's network design (and capabilities) were drastically different that long ago. It's a solution that made sense for far different reasons when it was created than it makes sense for now. *shrug*

Scott

-----Original Message-----
From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Paul Vixie
Sent: Sunday, October 16, 2005 12:08 AM
To: nanog@merit.edu
Subject: Re: IPv6 news

drc@virtualized.org (David Conrad) writes:
On Oct 15, 2005, at 3:27 PM, Tony Li wrote:
When we explored site multihoming (not rehoming) in the ways that you seem to suggest, it was effectively a set of coordinated NAT boxes around the periphery of the site. That was rejected quite quickly.
What were the reasons for rejection?
i wasn't there for that meeting. but when similar things were proposed at other meetings, somebody always said "no! we have to have end-to-end, and if we'd wanted nat-around-every-net we'd've stuck with IPv4." -- Paul Vixie
# The problem with that (and many premises) is that we need to remember these
# arguments and foreseen "problems" were all dreamed up 10 or so years ago.
# The status of everyone's network, everyone's business needs and everyone's
# network design (and capabilities) were drastically different that long ago.
#
# It's a solution that made sense for far different reasons when it was
# created than it makes sense for now.

nope. the problems we're discussing on this thread were all identified ten years ago but ipv6 got standardized in spite of the warnings. ipv6 as it is now specified never made sense for any network, either existing or possible, and all it's really done is to make a bad situation worse and a hard problem even harder to solve. but by handing /32 sugar pills to early adopters, some momentum will be created, and the folks who will not be able to multihome later on will not by that time have any choice except to "accept status quo". it's a neat solution, i wish i'd thought of it. (i was blinded by idealism.)
On Sat, 15 Oct 2005, John Payne wrote:
On Oct 15, 2005, at 3:29 PM, Tony Li wrote:
So the IETF identified 4 reasons to multihome. Of those 4, shim6 ignores at least 2 of them (operational policy and cost), and so far as I can see glosses over load sharing.
If you have a solution that satisfies all requirements, you should contribute it. Shim6 is indeed a partial solution to the stated requirements. There was no tractable solution found to all requirements, and to not solve any of the issues was seen as basically fatal.
I don't have an acceptable solution... however, I am getting tired of shim6 being pushed as *the* solution to site rehoming, when at best it's an end node rehoming solution.
The essential problem comes down to routing topology information... all the "scalable" solutions obscure or reduce this information in order to work. Not reducing the information means either having the same number of routing table entries as the number of multihoming sites, or enforcing some kind of theoretical hierarchical structure (many of the original IPv6 papers had this pipe dream spelled out for how to handle multihoming), either based on geography or on what amounted to a hegemony of top-level providers.

The hierarchical solutions haven't been implemented (with IPv4 or IPv6) because it costs a lot to build networks, many networks have spent too much, and market forces have "optimized" (if that is what you call the current state of affairs) to a set of providers that have been able to stay in business. Also, if your business is providing connectivity, you are loath to enshrine via protocol somebody else in a strategically dominant position... it would eliminate any chance of being able to manage your profit margin.

So... in any proposal you hear from somebody who says the current IPv4 BGP solution for multihoming is unworkable because it won't scale (an interesting claim depending on where on the curve you think market penetration of the Internet is, coupled with the question of how many more multihoming sites you want to support (this appears to be the result of complex economic interactions, not anybody's planning)), you can count on routing topology information being reduced. Don't be surprised, it's necessary to achieve their design goal. You can't have both full topology information (all the path information) and less information (replacing the paths with something simpler). For example, shim6 sounds like it substitutes latency for all kinds of path selection information.
The amusing thought came to mind that the equivalent of current AS prepending for shim6 sites would be to get a box to introduce additional latency on the path they want to reduce traffic on. (There are companies that sell such optical devices for testing.) (Perhaps I'm totally wrong about how shim6 works; my apologies to the people working on it.)

Anyway, it's useful to think about what you want in terms of topology information and where you want it. When is it necessary to have real topology information and when would something else suffice (depending on what you are doing (for example, live on-demand video streams)), etc. Parameters are things like session survivability, time to convergence, etc. You have a lot more room if you say that it's OK for existing sessions to be interrupted and reconnect. Unfortunately, a lot of the solutions look like something other than TCP, meaning you aren't helping the existing applications.

I don't envy anybody trying to magic away topology information. I can see solutions if we replace TCP. Whatever you implement on the servers, I don't see how session survivability can be included without cooperation from the clients or the clients' routers.

Mike.

+----------------- H U R R I C A N E - E L E C T R I C -----------------+
| Mike Leber        Direct Internet Connections      Voice 510 580 4100 |
| Hurricane Electric Web Hosting Colocation          Fax   510 580 4151 |
| mleber@he.net                                   http://www.he.net     |
+-----------------------------------------------------------------------+
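Mike's tradeoff (full path information vs. a smaller table) can be sketched numerically. This is purely an illustration with made-up prefixes, not real routing data: aggregating several multihomed-site /48s into their covering provider /32 collapses the table, but the distinct per-site paths are discarded along the way.

```python
# Sketch of topology-information reduction: one routing entry (and one
# path) per multihomed /48, versus a single aggregate /32 that keeps
# only one path. Prefixes and path names are invented.
import ipaddress

# Hypothetical: 4 multihomed customer /48s inside one provider /32,
# each learned with its own distinct best path.
full_table = {
    ipaddress.ip_network(f"2001:db8:{i}::/48"): ("pathA" if i % 2 else "pathB")
    for i in range(4)
}
aggregate = ipaddress.ip_network("2001:db8::/32")

# Aggregation is only valid because every /48 is inside the /32...
assert all(p.subnet_of(aggregate) for p in full_table)
aggregated_table = {aggregate: "pathA"}   # one path survives, arbitrarily

print(len(full_table), len(aggregated_table))   # 4 entries become 1
print(len(set(full_table.values())))            # 2 distinct paths before
print(len(set(aggregated_table.values())))      # 1 after: per-site info is gone
```

Scaling the routing table and preserving per-site path selection pull in opposite directions, which is exactly the point Mike makes above.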
Joe (or anyone else), On Oct 14, 2005, at 7:57 AM, Joe Abley wrote:
The big gap in the multi-homing story for v6 is for end sites, since those are specifically excluded by all the RIRs' policies on PI addressing right now. Shim6 is intended to be a solution for end sites.
Since shim6 requires changes in protocol stacks on nodes, my impression has been that it isn't a _site_ multihoming solution, but rather a _node_ multihoming solution. Is my impression incorrect?
Are you suggesting that something else is required for ISPs above and beyond announcing PI space with BGP, or that shim6 (once baked and real) would present a threat to ISPs?
If my impression is correct, then my feeling is that something else is required. I am somewhat skeptical that shim6 will be implemented in any near term timeframe and it will take a very long time for existing v6 stacks to be upgraded to support shim6. What I suspect will be required is real _site_ multihoming. Something that will take existing v6 customer sites and allow them to be multi-homed without modification to each and every v6 stack within the site. Rgds, -drc
BTW, as I read it, SHIM6 requires not only modification to ALL nodes at the site, but modification to ALL nodes to which the node needs reliable connectivity. In other words, SHIM6 is not fully useful until it is fully ubiquitous in virtually all IPv6 stacks.

Owen

--On October 14, 2005 11:48:28 AM -0700 David Conrad <david.conrad@nominum.com> wrote:
Joe (or anyone else),
On Oct 14, 2005, at 7:57 AM, Joe Abley wrote:
The big gap in the multi-homing story for v6 is for end sites, since those are specifically excluded by all the RIRs' policies on PI addressing right now. Shim6 is intended to be a solution for end sites.
Since shim6 requires changes in protocol stacks on nodes, my impression has been that it isn't a _site_ multihoming solution, but rather a _node_ multihoming solution. Is my impression incorrect?
Are you suggesting that something else is required for ISPs above and beyond announcing PI space with BGP, or that shim6 (once baked and real) would present a threat to ISPs?
If my impression is correct, then my feeling is that something else is required. I am somewhat skeptical that shim6 will be implemented in any near term timeframe and it will take a very long time for existing v6 stacks to be upgraded to support shim6. What I suspect will be required is real _site_ multihoming. Something that will take existing v6 customer sites and allow them to be multi-homed without modification to each and every v6 stack within the site.
Rgds, -drc
-- If it wasn't crypto-signed, it probably didn't come from me.
On 14-Oct-2005, at 15:16, Owen DeLong wrote:
BTW, as I read it, SHIM6 requires not only modification to ALL nodes at the site, but, modification to ALL nodes to which the node needs reliable connectivity.
For one host with multiple, globally-unique addresses to talk optimally to another similarly-equipped host, such that failures in reachability of individual addresses should not cause sessions to fail, both hosts would need to be shim6-capable.
In other words, SHIM6 is not fully useful until it is fully ubiquitous in virtually all IPv6 stacks.
Which is not to say that there is no value in a non-ubiquitous deployment -- rather, the value will grow as deployment proceeds.

Joe
On 14-Oct-2005, at 14:48, David Conrad wrote:
On Oct 14, 2005, at 7:57 AM, Joe Abley wrote:
The big gap in the multi-homing story for v6 is for end sites, since those are specifically excluded by all the RIRs' policies on PI addressing right now. Shim6 is intended to be a solution for end sites.
Since shim6 requires changes in protocol stacks on nodes, my impression has been that it isn't a _site_ multihoming solution, but rather a _node_ multihoming solution. Is my impression incorrect?
There is no shortage of rough corners to file down, and I am behind on my shim6 mail, but the general idea is to let end sites multi-home in the "bag-o-PA-prefixes" style and let the nodes within that site use their multiple globally-unique addresses (one per upstream, say) to allow sessions to survive rehoming events.
Are you suggesting that something else is required for ISPs above and beyond announcing PI space with BGP, or that shim6 (once baked and real) would present a threat to ISPs?
If my impression is correct, then my feeling is that something else is required. I am somewhat skeptical that shim6 will be implemented in any near term timeframe and it will take a very long time for existing v6 stacks to be upgraded to support shim6. What I suspect will be required is real _site_ multihoming. Something that will take existing v6 customer sites and allow them to be multi-homed without modification to each and every v6 stack within the site.
For end sites, that's a wildly-held opinion. For ISPs, it's not required, since ISPs can already multi-home in the manner you describe, using PI addresses and BGP. Joe
On Fri, Oct 14, 2005 at 03:19:27PM -0400, Joe Abley wrote:
On Oct 14, 2005, at 7:57 AM, Joe Abley wrote:
Since shim6 requires changes in protocol stacks on nodes, my impression has been that it isn't a _site_ multihoming solution, but rather a _node_ multihoming solution. Is my impression incorrect?
There is no shortage of rough corners to file down, and I am behind on my shim6 mail, but the general idea is to let end sites multi-home in the "bag-o-PA-prefixes" style and let the nodes within that site use their multiple globally-unique addresses (one per upstream, say) to allow sessions to survive rehoming events.
the kicker here is that the applications then need some serious smarts to do proper source address selection.
I suspect will be required is real _site_ multihoming. Something that will take existing v6 customer sites and allow them to be multi-homed without modification to each and every v6 stack within the site.
For end sites, that's a wildly-held opinion.
wildly or widely? :)
Joe
--bill
On Fri, 14 Oct 2005 bmanning@vacation.karoshi.com wrote:
Since shim6 requires changes in protocol stacks on nodes, my impression has been that it isn't a _site_ multihoming solution, but rather a _node_ multihoming solution. Is my impression incorrect?
There is no shortage of rough corners to file down, and I am behind on my shim6 mail, but the general idea is to let end sites multi-home in the "bag-o-PA-prefixes" style and let the nodes within that site use their multiple globally-unique addresses (one per upstream, say) to allow sessions to survive rehoming events.
the kicker here is that the applications then need some serious smarts to do proper source address selection.
No. The kicker is that the applications need no such smarts and shim6 will take care of this for all applications on the system at the network level.

--
William Leibzon
Elan Networks
william@elan.net
On Fri, Oct 14, 2005 at 12:33:51PM -0700, william(at)elan.net wrote:
On Fri, 14 Oct 2005 bmanning@vacation.karoshi.com wrote:
Since shim6 requires changes in protocol stacks on nodes, my impression has been that it isn't a _site_ multihoming solution, but rather a _node_ multihoming solution. Is my impression incorrect?
There is no shortage of rough corners to file down, and I am behind on my shim6 mail, but the general idea is to let end sites multi-home in the "bag-o-PA-prefixes" style and let the nodes within that site use their multiple globally-unique addresses (one per upstream, say) to allow sessions to survive rehoming events.
the kicker here is that the applications then need some serious smarts to do proper source address selection.
No. The kicker is that the applications need no such smarts and shim6 will take care of this for all applications on the system at the network level.
call me skeptical... I last followed this line of thinking when Rich Draves hammered out the issues of source address selection in his I-Ds (and they might have become RFCs), and they are not trivial. Vendors getting this right, let alone interoperable, will be a significant feat. Handwaving and tapdancing notwithstanding.

Then there is the minor detail of a service provider figuring out what, if anything, she needs to do to migrate from the existing, time-tested, working methods of multihoming to this new'n'improved method... reminds me of the old New Yorker cartoon: "...and then a miracle occurs..."

--bill
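For reference, the Draves I-Ds Bill mentions did become RFC 3484 (default address selection). Its source selection is eight interacting rules, which is part of why Bill calls it non-trivial; as a taste, here is a rough sketch of just one tie-break, rule 8 (prefer the candidate source sharing the longest prefix with the destination), using invented prefixes.

```python
# Sketch of RFC 3484 source selection rule 8 ONLY (longest matching
# prefix); the real algorithm applies seven other rules first.
import ipaddress

def common_prefix_len(a: str, b: str) -> int:
    """Length of the common leading bit prefix of two IPv6 addresses."""
    x = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    return 128 - x.bit_length()

def pick_source(candidates: list[str], dest: str) -> str:
    """Among otherwise-equal candidates, prefer the longest match."""
    return max(candidates, key=lambda c: common_prefix_len(c, dest))

# A multihomed host with one address per upstream (hypothetical prefixes):
candidates = ["2001:db8:aaaa::1", "2001:db8:bbbb::1"]
print(pick_source(candidates, "2001:db8:aaaa:42::7"))  # 2001:db8:aaaa::1
```

Even this one rule shows the operational wrinkle: the host's choice is driven by bit patterns, not by any knowledge of which upstream the site would prefer to use.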
-- William Leibzon Elan Networks william@elan.net
On Oct 14, 2005, at 3:33 PM, william(at)elan.net wrote:
No. The kicker is that the applications need no such smarts and shim6 will take care of this for all applications on the system at the network level.
Not directly aimed at William: As others have said before (and I finally listened) PLEASE take this discussion to the shim6 list. However, I'm still at a loss as to how the shim on the end node is going to know about the operational policy, cost implications and site level load balancing requirements to do this _correctly_. I'm also undecided about how I feel about the extra packets caused by the (I forget the official term) discovery packets for shim6 for an end site with say a hundred machines each with thousands of short lived TCP sessions.
Not directly aimed at William: As others have said before (and I finally listened) PLEASE take this discussion to the shim6 list.
there are some reasons why i'd rather have the discussion here...

) IETF IPR and legal restraints on postings
) shim6 is a protocol design working group, not an operational wg
  * the IETF has always done operations poorly, it seems worse these days.
) there are enough folks on both that they can "represent" to shim6 list as needed/desired

--bill
On Oct 15, 2005, at 12:32 AM, bmanning@vacation.karoshi.com wrote:
Not directly aimed at William: As others have said before (and I finally listened) PLEASE take this discussion to the shim6 list.
there are some reasons why i'd rather have the discussion here...

) IETF IPR and legal restraints on postings
) shim6 is a protocol design working group, not an operational wg
  * the IETF has always done operations poorly, it seems worse these days.
) there are enough folks on both that they can "represent" to shim6 list as needed/desired
The problem is... from what I've seen so far the operational requirements voice is a lonely one. If there are any network operators on this list who wouldn't qualify for PI space in an IPv6 world... do you really want your traffic engineering decisions made by individual end systems based on delay?
On Fri, 14 Oct 2005, John Payne wrote:
I'm also undecided about how I feel about the extra packets caused by the (I forget the official term) discovery packets for shim6 for an end site with say a hundred machines each with thousands of short lived TCP sessions.
The shim6 capability detection can (and I expect it will) be postponed until arbitrary policy criteria have been met (e.g., the session has lasted more than 30 seconds). It doesn't need to be done before establishing the session. IMHO, this is a MAJOR benefit of the current shim6 approach, compared to other mechanisms or approaches (e.g., HIP).

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings
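The deferral policy Pekka describes can be sketched in a few lines. The 30-second threshold, the session object, and the probe hook are all illustrative (the real trigger and probe exchange are defined by the shim6 drafts); the point is simply that short-lived sessions never pay the probe cost.

```python
# Minimal sketch of deferred shim6 capability detection: probe only
# sessions that outlive a policy threshold, and probe at most once.
PROBE_AFTER_SECONDS = 30.0  # arbitrary policy criterion, per the text

class Session:
    def __init__(self, started_at: float):
        self.started_at = started_at
        self.probed = False

    def maybe_probe(self, now: float) -> bool:
        """Launch capability detection only once the session is long-lived."""
        if not self.probed and now - self.started_at >= PROBE_AFTER_SECONDS:
            self.probed = True   # a real stack would send its probe here
            return True
        return False

s = Session(started_at=0.0)
print(s.maybe_probe(now=5.0))    # False: still short-lived, no probe traffic
print(s.maybe_probe(now=45.0))   # True: threshold crossed, probe once
print(s.maybe_probe(now=60.0))   # False: already probed
```

This addresses John's earlier worry directly: a site full of hosts with thousands of short-lived TCP sessions generates no extra discovery packets at all under such a policy.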
On Fri, Oct 14, 2005 at 07:27:37PM +0000, bmanning@vacation.karoshi.com wrote:
the kicker here is that the applications then need some serious smarts to do proper source address selection.
Nope. The ULID is supposed to be static, globally unique. Just not globally routed. Separating topology from identification.

Something I didn't see discussed yet is that shim6 sites would need to get a globally unique, provider independent /48 or larger... which folks could start to announce. But I guess that address space would come from blocks earmarked as "non-routable, it's a bogon, bad IP space, filter in BGP at first sight!". :-)

Regards, Daniel

--
CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
On Fri, 14 Oct 2005 21:39:58 +0200, Daniel Roesen said:
Nope. The ULID is supposed to be static, globally unique. Just not globally routed. Separating topology from identification.
Something I didn't see discussed yet is that shim6 sites would need to get a globally unique, provider independent /48 or larger... which folks could start to announce. But I guess that address space would come from blocks earmarked as "non-routable, it's a bogon, bad IP space, filter in BGP at first sight!". :-)
You know, if you describe it that way too many times, people who are only paying half-attention are going to say "IPv6 has something almost like NAT, only different".
On Fri, 14 Oct 2005 Valdis.Kletnieks@vt.edu wrote:
On Fri, 14 Oct 2005 21:39:58 +0200, Daniel Roesen said:
Nope. The ULID is supposed to be static, globally unique. Just not globally routed. Separating topology from identification.
Something I didn't see discussed yet is that shim6 sites would need to get a globally unique, provider independent /48 or larger... which folks could start to announce. But I guess that address space would come from blocks earmarked as "non-routable, it's a bogon, bad IP space, filter in BGP at first sight!". :-)
You know, if you describe it that way too many times, people who are only paying half-attention are going to say "IPv6 has something almost like NAT, only different".
you know... shim6 could make 'source address' pointless, you COULD just do NAT instead :) or do shim6 which looks like NAT ... if you don't get the host auth parts correct/done-well you might even be able to send traffic off to the 'wrong' place :) it'll be neat!
Christopher, On Oct 14, 2005, at 9:32 PM, Christopher L. Morrow wrote:
You know, if you describe it that way too many times, people who are only paying half-attention are going to say "IPv6 has something almost like NAT, only different". you know... shim6 could make 'source address' pointless, you COULD just do NAT instead :) or do shim6 which looks like NAT ... if you don't get the host auth parts correct/done-well you might even be able to send traffic off to the 'wrong' place :) it'll be neat!
I believe relying on the address as any sort of authentication is a mistake. Given IPv6 was, at least in theory, supposed to require IPSEC, I would have thought the use of the source address for anything other than connection demultiplexing would have been a waste of time. Of course, that assumes that people actually implement "required" parts of protocol specifications. As has been seen countless times, what happens in practice doesn't seem to conform to what is required in theory. Do all IPv6 stacks implement IPSEC? Rgds, -drc
On Fri, 14 Oct 2005, David Conrad wrote:
Christopher,
(chris is fine, silly corp email doesn't let us have sane addresses :( )
On Oct 14, 2005, at 9:32 PM, Christopher L. Morrow wrote:
You know, if you describe it that way too many times, people who are only paying half-attention are going to say "IPv6 has something almost like NAT, only different". you know... shim6 could make 'source address' pointless, you COULD just do NAT instead :) or do shim6 which looks like NAT ... if you don't get the host auth parts correct/done-well you might even be able to send traffic off to the 'wrong' place :) it'll be neat!
I believe relying on the address as any sort of authentication is a mistake. Given IPv6 was, at least in theory, supposed to require
in v4 it's not used that way, in v6 I'd hope the trend continues. If there isn't some very good form of authentication built into the shim solution it may be possible for an attacker in the packet path (or who can guess well enough) to tell either endpoint that there was a path failure and it's time to use a new ip address for the current conversation. anyway, shim6 isn't here yet, and I'm sure someone will have thought of this problem before :)
IPSEC, I would have thought the use of the source address for anything other than connection demultiplexing would have been a waste of time.
Of course, that assumes that people actually implement "required" parts of protocol specifications. As has been seen countless times, what happens in practice doesn't seem to conform to what is required in theory. Do all IPv6 stacks implement IPSEC?
Merike has some interesting information about this... from what I understand not everyone implemented all the 'required' parts :( I wonder how quickly SAs can be re-done for conversations in a shim world? Will they have to be, or could the SA be tied to the ULID? Probably also something fun for the shim folks to figure out. I'd hate to have to re-negotiate ipsec associations every time I thought there was a path failure :(
Daniel Roesen wrote:
On Fri, Oct 14, 2005 at 07:27:37PM +0000, bmanning@vacation.karoshi.com wrote:
the kicker here is that the applications then need some serious smarts to do proper source address selection.
Nope. The ULID is supposed to be static, globally unique. Just not globally routed. Separating topology from identification.
Something I didn't see discussed yet is that shim6 sites would need to get a globally unique, provider independent /48 or larger... which folks could start to announce. But I guess that address space would come from blocks earmarked as "non-routable, it's a bogon, bad IP space, filter in BGP at first sight!". :-)
Actually, doing multihoming and getting PI space are orthogonal in shim6 last I knew. That is, you could get address space from your N providers and have one of the providers, say Provider X, be the ULID for the end points. Should Provider X's link(s) go down, shim6 will ensure it all still works (which is, after all, the whole point). Getting PI space is really an administrative and economic issue. It is not a technical requirement of shim6.

From draft-ietf-shim6-arch-00.txt,

   3. Endpoint Identity

   There are a number of options in the choice of an endpoint identity
   realm, including the use of existing addresses as identity tokens,
   the use of distinguished (possibly non-routeable) addresses as
   tokens, or the use of tokens drawn from a different realm (such as
   use of a fully qualified domain name). Shim6 uses the first of these
   options, and the endpoint identity for a host is one of the locator
   addresses that are normally associated with the host. The particular
   locator address selected to be the endpoint identity (or ULID) is
   specified in [RFC3484]. Shim6 does not mandate the use of
   distinguished addresses as identities, although the use of
   non-routeable distinguished addresses in this context is described
   as an option in this approach.

--
Crist J. Clark                               crist.clark@globalstar.com
Globalstar Communications                                (408) 933-4387
On Fri, Oct 14, 2005 at 01:11:18PM -0700, Crist Clark wrote:
Actually, doing multihoming and getting PI space are orthogonal in shim6 last I knew. That is, you could get address space from your N providers and have one of the providers, say Provider X, to be the ULID for the end points. Should Provider X's link(s) go down, shim6 will ensure it all still works (which is, after all, the whole point).
But when the contract with this supplier is terminated, I have to renumber the whole network. Excellent.
Getting PI space is really an administrative and economic issue. It is not a technical requirement of shim6.
That's the problem. Ignorance regarding economic problems. We don't have technical problems, we have economic ones (if at all).

Best regards, Daniel

--
CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
Seems like it might be a good time to update everyone on the IAB IPv6 Multi-homing BOF we're holding Monday afternoon at NANOG. My very draft introduction slides are on http://www.1-4-5.net/~dmm/talks/NANOG35/multihoming. Dave
Daniel Roesen wrote:
On Fri, Oct 14, 2005 at 01:11:18PM -0700, Crist Clark wrote:
Actually, doing multihoming and getting PI space are orthogonal in shim6 last I knew. That is, you could get address space from your N providers and have one of the providers, say Provider X, to be the ULID for the end points. Should Provider X's link(s) go down, shim6 will ensure it all still works (which is, after all, the whole point).
But when the contract with this supplier is terminated, I have to renumber the whole network. Excellent.
Well, you can get by on shim6 until the supplier reissues the address space elsewhere. Not too reassuring? But with IPv6, renumbering is easy! Uh, yeah, right.
Getting PI space is really an administrative and economic issue. It is not a technical requirement of shim6.
That's the problem. Ignorance regarding economical problems. We don't have technical problems, we have economical ones (if at all).
Multihoming in IPv6 is a technical problem if you consider unrestricted growth of the routing tables to be a technical problem (there are those who think it is economic: "throw more memory and CPU at it"). As the quoted section of the draft shows, using unrouted "RFC1918-like" address space, which would presumably be PI, is also covered by shim6, but it is not the only way in which it can work.

There is plenty of finger pointing about the economic problems, or ignorance thereof, in IPv6, including the occasional conspiracy theory. We had similar problems with IPv4. We still feel pain from the switch from classful networking to CIDR occasionally. That was a technical change driven by an economic reality. Then there is that evil spawn of IPv4, NAT. The way these things get worked out ain't always pretty.

--
Crist J. Clark                               crist.clark@globalstar.com
Globalstar Communications                                (408) 933-4387
david.conrad@nominum.com (David Conrad) writes: (shouldn't that be drc@iana.org now?)
If my impression is correct, then my feeling is that something else is required. I am somewhat skeptical that shim6 will be implemented in any near term timeframe and it will take a very long time for existing v6 stacks to be upgraded to support shim6. What I suspect will be required is real _site_ multihoming. Something that will take existing v6 customer sites and allow them to be multi-homed without modification to each and every v6 stack within the site.
if all you've got is a hammer, every problem looks like a nail. so, the above problem statement looked like a dns issue to me, and to some other folks, and thus was born A6. had ietf killed AAAA back when there were effectively zero ipv6 hosts on the 'net, and paid the apparently-high A6 complexity penalty, we'd be talking about something else by now. as it is, the shim6 complexity penalty is even higher, and i don't think we'll ever get to stop talking about this problem. -- Paul Vixie
On Fri, 14 Oct 2005, Paul Vixie wrote:
david.conrad@nominum.com (David Conrad) writes:
(shouldn't that be drc@iana.org now?)
If my impression is correct, then my feeling is that something else is required. I am somewhat skeptical that shim6 will be implemented in any near term timeframe and it will take a very long time for existing v6 stacks to be upgraded to support shim6. What I suspect will be required is real _site_ multihoming. Something that will take existing v6 customer sites and allow them to be multi-homed without modification to each and every v6 stack within the site.
if all you've got is a hammer, every problem looks like a nail. so, the above problem statement looked like a dns issue to me, and to some other folks, and thus was born A6. had ietf killed AAAA back when there were effectively zero ipv6 hosts on the 'net, and paid the apparently-high A6 complexity penalty, we'd be talking about something else by now. as it is, the shim6 complexity penalty is even higher, and i don't think we'll ever get to stop talking about this problem.
Paul, I was a big fan of the A6 design and very sorry to see it discarded by the IETF. But I do not believe it would have solved quite the same problem as what shim6 is trying to do (I'm not saying that it could not solve it with additional IPv6 extensions...). A6 was particularly good at solving renumbering and at allowing a site to keep the same local IP address configuration across multiple ISP connections, which allows easy load balancing through DNS. But once a connection is established it would be a normal end-node<->end-node IPv6 connection with full 128-bit IPv6 addresses used on each side. That means that if the connection goes down it would not be able to fail over automatically (nor would DNS load balancing similar to the current practice of listing multiple A records fail over to use only an active, working IPv6 connection). What I do think is that A6 and a shim6-like design would have been a good complement to each other. In that case the end-site network address (i.e. the part above /48) would be the same everywhere; it is what gets configured on switches and servers and listed as the host address. The domain reference (i.e. common to multiple hosts) would point to the ISPs' A6 network addresses (i.e. the part below /48) and is the only thing one changes when moving from one ISP to another (no renumbering!) or when connecting to a 2nd ISP. If multi6 is used, an additional special reference is made to a multi6 network address (a special reserved, non-routable network block); multi6-aware clients would use that as the common IPv6 address reference for the upper layers (i.e. in TCP and UDP) and let a very simple multi6 layer keep track of the local host's resolution of this address to real A6 networks. -- William Leibzon Elan Networks william@elan.net
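William's failover point — multiple addresses in DNS only help at connection setup; once a session is bound to one 128-bit address, a dead path kills it — amounts to a loop that every application would have to carry itself. A toy sketch of that loop (Python; the addresses in the test below are documentation-prefix placeholders, and the injectable `connector` exists purely for illustration):

```python
import socket

def connect_first(candidates, timeout=2.0, connector=None):
    """Try each (host, port) candidate in order; return the first
    successful connection.  With plain AAAA/A records, this
    per-application loop is the only failover there is -- and it
    does nothing for sessions that are already established."""
    if connector is None:
        connector = lambda addr: socket.create_connection(addr, timeout=timeout)
    last_err = None
    for addr in candidates:
        try:
            return connector(addr)
        except OSError as err:
            last_err = err  # dead path: fall through to the next address
    raise last_err or OSError("no candidates supplied")
```

This is exactly the gap shim6 (or a thin multi6 layer under TCP/UDP, as William describes) is meant to close: re-binding a live session to a working locator without the application noticing.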
On Oct 14, 2005, at 1:48 PM, Paul Vixie wrote:
david.conrad@nominum.com (David Conrad) writes: (shouldn't that be drc@iana.org now?)
Not quite yet... :-)
if all you've got is a hammer, every problem looks like a nail.
I guess the question was what is the problem IPng was supposed to solve?
had ietf killed AAAA back when there were effectively zero ipv6 hosts on the 'net, and paid the apparently-high A6 complexity penalty, we'd be talking about something else by now.
Yeah, like why didn't the Internet work anymore. A6 was simply a broken idea. It might have limped along in a vastly simplified form, but it would have changed how the DNS works in some really fundamental ways and I doubt those ways would ever have been acceptable.
as it is, the shim6 complexity penalty is even higher, and i don't think we'll ever get to stop talking about this problem.
It is unfortunate that simplicity, both in terms of operations as well as protocol definition, appears to be secondary to meeting everybody's pet requirements. But perhaps that is only appearance. Rgds, -drc
if all you've got is a hammer, every problem looks like a nail.
I guess the question was what is the problem IPng was supposed to solve?
that depends on who you ask. the pet problem i was dealing with at the time was the necessary evil called CIDR. necessary because infinite routing table growth wasn't possible, evil because it locked customers into their providers, entrenched the existing large providers against future providers, and made it hard or impossible for the average endusing company to multihome. so, when IPv6 was chosen and was "more evil" in that way, i was mystified. it turns out that most other folks' pet problem was "not enough address space". (oops.)
On Sat, 15 Oct 2005, Paul Vixie wrote:
...the necessary evil called CIDR. evil because it locked customers into their providers, entrenched the existing large providers against future providers, and made it hard or impossible for the average endusing company to multihome.
Uh, perhaps I'm being dense, but how does moving masking off byte boundaries have any of the above effects? -Bill
...the necessary evil called CIDR. evil because it locked customers into their providers, entrenched the existing large providers against future providers, and made it hard or impossible for the average endusing company to multihome.
Uh, perhaps I'm being dense, but how does moving masking off byte boundaries have any of the above effects?
Bill, CIDR also changed allocation policies and created the notions of PA and PI space. Tony
On Mon, 17 Oct 2005, Tony Li wrote:
CIDR also changed allocation policies and created the notions of PA and PI space.
Hm. I guess I never thought of them as being causally related. And I remember the whole "portable/non-portable" issue as predating CIDR... I (perhaps mistakenly) remember having had that argument with customers when we were still knocking Cs for them out of our service-provision Bs... -Bill
On Fri, 14 Oct 2005, David Conrad wrote:
Joe (or anyone else), On Oct 14, 2005, at 7:57 AM, Joe Abley wrote:
The big gap in the multi-homing story for v6 is for end sites, since those are specifically excluded by all the RIRs' policies on PI addressing right now. Shim6 is intended to be a solution for end sites.
Since shim6 requires changes in protocol stacks on nodes, my impression has been that it isn't a _site_ multihoming solution, but rather a _node_ multihoming solution. Is my impression incorrect?
that is my read as well... I'd bet it'll be fun with uRPF strict on sites that are /multihomed/ though still statically routed :)
Are you suggesting that something else is required for ISPs above and beyond announcing PI space with BGP, or that shim6 (once baked and real) would present a threat to ISPs?
If my impression is correct, then my feeling is that something else is required. I am somewhat skeptical that shim6 will be implemented in any near term timeframe and it will take a very long time for existing v6 stacks to be upgraded to support shim6. What I suspect will be required is real _site_ multihoming. Something that will take existing v6 customer sites and allow them to be multi-homed without modification to each and every v6 stack within the site.
you've hit the nail on its head.
On Oct 14, 2005, at 10:57 AM, Joe Abley wrote:
On 14-Oct-2005, at 10:13, Christopher L. Morrow wrote:
Yep, there is no multihoming, but effectively, except for the BGP tricks that are currently being played in IPv4 there is nothing in IPv4 either. But one won't need to upgrade a Tier 1's hardware to support shim6, as
shim6 is: 1) not baked 2) not helpful for transit as's 3) not a reality
Not baked is absolutely correct, and not a reality follows readily from that, as viewed by an operator.
I'm interested in (2), though. Shim6 is not intended to be a solution for transit ASes. If you're an ISP, then you can get PI address space and multi-home in the normal way with BGP.
*IF* you're a big enough ISP. There are (a few) ISPs with few enough customers that they'd have to "exaggerate" plans to get the same level of multihoming that they do with their legacy IPv4 allocations... Also, are people going to consider accepting longer than /32s from their direct peers? (not for global transit, just peering)... in this case I'm thinking about those networks who do inconsistent announcements at various NAPs for "in-country" and other reasons.
On Fri, 14 Oct 2005, John Payne wrote:
On Oct 14, 2005, at 10:57 AM, Joe Abley wrote:
On 14-Oct-2005, at 10:13, Christopher L. Morrow wrote:
Yep, there is no multihoming, but effectively, except for the BGP tricks that are currently being played in IPv4 there is nothing in IPv4 either. But one won't need to upgrade a Tier 1's hardware to support shim6, as
shim6 is: 1) not baked 2) not helpful for transit as's 3) not a reality
Not baked is absolutely correct, and not a reality follows readily from that, as viewed by an operator.
I'm interested in (2), though. Shim6 is not intended to be a solution for transit ASes. If you're an ISP, then you can get PI address space and multi-home in the normal way with BGP.
*IF* you're a big enough ISP. There are (a few) ISPs with few enough customers that they'd have to "exaggerate" plans to get the same level of multihoming that they do with their legacy IPv4 allocations...
even if you are big enough, you may have a particularly large sink somewhere inside your /32 that you want to pull through particular links but not others... that's not possible in the current scenario. With shim6 it's even worse: the large sink gets to do the engineering for you :) HURRAY! Cause I'm sure they understand the internals of your network, right? :(
On Fri, Oct 14, 2005 at 08:41:27AM +0200, Jeroen Massar wrote:
Even then, they could easily do some 'good' tunnels over their own IPv4 infrastructure, enabling IPv6 at the edges where they connect their customers and maybe do some sensible peering and thus providing sensible IPv6 transit to their paying customers...
Most people that charge their customers for v6 provide sensible transit. You can get tunnels from lots of folks for "free", where it's the cost of your existing bits/packets/pipe. - jared -- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.
On Thu, 13 Oct 2005 11:05:58 PDT, Peter Lothberg said:
Is there anyone who can talk to it using IPv6 on the Nanog list?
(Time20.Stupi.SE, 2001:0440:1880:1000::0020)
% ntpdate -q 2001:0440:1880:1000::0020
server 2001:440:1880:1000::20, stratum 1, offset 0.012038, delay 0.45547
13 Oct 14:37:22 ntpdate[30374]: adjust time server 2001:440:1880:1000::20 offset 0.012038 sec

% traceroute6 2001:0440:1880:1000::0020
traceroute to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:468:c80:2103:206:5bff:feea:8e4e, 30 hops max, 16 byte packets
 1  isb-7301-1.gi0-1.103.cns.ip6.vt.edu (2001:468:c80:2103::1)  0.671 ms  0.844 ms  0.782 ms
 2  isb-6509-2.ge2-4.cns.ip6.vt.edu (2001:468:c80:f220::1)  5.584 ms  1.438 ms  2.644 ms
 3  isb-7606-2.ge1-1.cns.ip6.vt.edu (2001:468:c80:f222::2)  1.762 ms  1.59 ms  1.564 ms
 4  atm10-0.10.wtn2.ip6.networkvirginia.net (2001:468:cfe:2001::1)  7.757 ms  7.987 ms  7.598 ms
 5  2001:468:cff:c::1 (2001:468:cff:c::1)  8.988 ms  9.094 ms  7.944 ms
 6  dcne-clpk.maxgigapop.net (2001:468:cff:3::2)  8.624 ms  8.37 ms  8.453 ms
 7  washng-max.abilene.ucaid.edu (2001:468:ff:184c::1)  9.371 ms  8.711 ms  8.287 ms
 8  atlang-washng.abilene.ucaid.edu (2001:468:ff:118::1)  24.465 ms  25.249 ms  27.17 ms
 9  hstnng-atlang.abilene.ucaid.edu (2001:468:ff:e11::2)  59.79 ms  43.836 ms  43.905 ms
10  losang-hstnng.abilene.ucaid.edu (2001:468:ff:1114::2)  75.915 ms  80.024 ms  76.147 ms
11  transpac-1-lo-jmb-702.lsanca.pacificwave.net (2001:504:b:20::136)  75.506 ms  75.583 ms  75.574 ms
12  3ffe:8140:101:1::2 (3ffe:8140:101:1::2)  190.994 ms  188.997 ms  188.857 ms
13  3ffe:8140:101:6::3 (3ffe:8140:101:6::3)  192.198 ms  188.83 ms  190.311 ms
14  hitachi1.otemachi.wide.ad.jp (2001:200:0:1800::9c4:2)  203.086 ms  201.553 ms  201.516 ms
15  pc6.otemachi.wide.ad.jp (2001:200:0:1800::9c4:0)  202.271 ms  200.954 ms  201.457 ms
16  otm6-gate1.iij.net (2001:200:0:1800::2497:1)  255.474 ms  275.889 ms  263.122 ms
17  otm6-bb0.IIJ.Net (2001:240:100:2::1)  264.74 ms  263.295 ms  259.571 ms
18  plt001ix06.IIJ.Net (2001:240:bb20:f000::4001)  255.896 ms  257.555 ms  256.579 ms
19  plt6-gate1.IIJ.Net (2001:240:bb62:8000::4003)  256.799 ms  256.91 ms  257.353 ms
20  sl-bb1v6-rly-t-22.sprintv6.net (3ffe:2900:b:e::1)  327.795 ms  327.826 ms  327.676 ms
21  sl-bb1v6-nyc-t-1000.sprintv6.net (2001:440:1239:1001::2)  335.371 ms  334.318 ms  333.559 ms
22  sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2)  430.171 ms sl-bb1v6-sto-t-101.sprintv6.net (2001:440:1239:1012::1)  443.515 ms sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2)  428.871 ms
23  2001:7f8:d:fb::34 (2001:7f8:d:fb::34)  443.356 ms  449.329 ms  453.155 ms
24  2001:440:1880:1::2 (2001:440:1880:1::2)  447.132 ms  449.606 ms  454.631 ms
25  2001:440:1880:1::12 (2001:440:1880:1::12)  436.22 ms  449.6 ms  461.937 ms
26  2001:440:1880:1000::20 (2001:440:1880:1000::20)  431.528 ms  431.559 ms  434.464 ms

Blech. :)

(For comparison, here's the IPv4 traceroute:

% traceroute 192.36.143.234
traceroute to 192.36.143.234 (192.36.143.234), 30 hops max, 38 byte packets
 1  isb-6509-1.vl103.cns.vt.edu (128.173.12.1)  0.863 ms  1.197 ms  0.637 ms
 2  isb-6509-2.vl710.cns.vt.edu (128.173.0.82)  0.686 ms  0.811 ms  0.478 ms
 3  isb-7606-1.ge1-1.cns.vt.edu (192.70.187.198)  5.380 ms  1.767 ms  0.625 ms
 4  atm1-0.11.roa.networkvirginia.net (192.70.187.194)  6.626 ms  4.997 ms  1.970 ms
 5  sl-gw20-rly-2-2.sprintlink.net (160.81.255.1)  7.282 ms  7.340 ms  8.584 ms
 6  sl-bb20-rly-3-2.sprintlink.net (144.232.14.29)  17.183 ms  7.721 ms  7.182 ms
 7  sl-bb20-tuk-11-0.sprintlink.net (144.232.20.137)  29.827 ms  28.928 ms  29.724 ms
 8  sl-bb21-tuk-15-0.sprintlink.net (144.232.20.133)  29.537 ms  28.896 ms  34.679 ms
 9  sl-bb21-lon-14-0.sprintlink.net (144.232.19.70)  100.620 ms  100.013 ms  98.216 ms
10  sl-bb22-lon-3-0.sprintlink.net (213.206.129.153)  99.232 ms  159.099 ms  160.970 ms
11  sl-bb20-bru-14-0.sprintlink.net (213.206.129.42)  239.766 ms  119.984 ms  203.193 ms
12  sl-bb21-bru-15-0.sprintlink.net (80.66.128.42)  106.066 ms  103.692 ms  105.917 ms
13  sl-bb20-ams-14-0.sprintlink.net (213.206.129.45)  106.905 ms  106.920 ms  106.243 ms
14  sl-bb21-ham-6-0.sprintlink.net (213.206.129.145)  202.024 ms  124.268 ms  208.230 ms
15  sl-bb21-cop-13-0.sprintlink.net (213.206.129.57)  118.080 ms  119.096 ms  118.624 ms
16  sl-bb21-sto-14-0.sprintlink.net (213.206.129.34)  124.976 ms  125.618 ms  126.115 ms
17  sl-bb20-sto-15-0.sprintlink.net (80.77.96.33)  126.867 ms  126.760 ms  125.839 ms
18  sl-tst1-sto-0-0.sprintlink.net (213.206.131.10)  126.482 ms  126.175 ms  124.867 ms
19  BFR3-POS-2-0.Stupi.NET (192.108.195.121)  125.468 ms  125.193 ms  124.997 ms
20  * * *
21  Time20.Stupi.SE (192.36.143.234)  126.873 ms  126.055 ms  126.962 ms)
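For readers unfamiliar with what `ntpdate -q` is doing on the wire: it is a single 48-byte SNTP round trip. The sketch below (Python, offered as an illustration of the RFC 2030-era packet layout, not of ntpdate's actual implementation) shows only the request construction and the transmit-timestamp field that the offset math starts from; the IPv6 socket I/O is left out as the uninteresting part.

```python
import struct

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01

def build_sntp_request() -> bytes:
    """48-byte SNTP request: LI=0, VN=3, Mode=3 (client),
    all other fields zero."""
    first_byte = (0 << 6) | (3 << 3) | 3  # 0x1B
    return struct.pack("!B", first_byte) + b"\x00" * 47

def transmit_timestamp(packet: bytes) -> float:
    """Extract the server's Transmit Timestamp (bytes 40-47) as a
    Unix time: 32-bit seconds plus a 32-bit binary fraction."""
    secs, frac = struct.unpack("!II", packet[40:48])
    return secs - NTP_EPOCH_OFFSET + frac / 2**32
```

ntpdate then combines this timestamp with its own send/receive times to compute the offset and delay figures shown above; whether the 48 bytes rode in an IPv4 or an IPv6 UDP datagram is invisible at this layer.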
On Thu, 2005-10-13 at 14:42 -0400, Valdis.Kletnieks@vt.edu wrote:
On Thu, 13 Oct 2005 11:05:58 PDT, Peter Lothberg said:
Is there anyone who can talk to it using IPv6 on the Nanog list?
(Time20.Stupi.SE, 2001:0440:1880:1000::0020)
% ntpdate -q 2001:0440:1880:1000::0020 server 2001:440:1880:1000::20, stratum 1, offset 0.012038, delay 0.45547 13 Oct 14:37:22 ntpdate[30374]: adjust time server 2001:440:1880:1000::20 offset 0.012038 sec
% traceroute6 2001:0440:1880:1000::0020 traceroute to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:468:c80:2103:206:5bff:feea:8e4e, 30 hops max, 16 byte packets
Well Valdis, that bad route also has to do with your side of the equation; you might want to check who you are actually using as transits and whether the routes they are providing to you are sane enough.

Text version: 2001:468::/32 is in the routing table, getting accepted by most ISP's. This one has a reasonable route, going over GBLX (3549) in most places. Though some get this over BT (1752), who have 'nice' (ahem) tunneled connectivity and transit with everybody on the planet. ESNET (293) seems to be the third 'transit', with OpenTransit (5011) being the fourth one and ISC being the fifth. Many routes seem to go over VIAGENIE (10566), who seem to have some connectivity problems too most of the time. Path wise most of it looks pretty sane.

Then there is a chunk of /40's, which are visible inside Abilene; GRH does see them, but other ISP's don't. Most people will thus take the /32 towards your IP, which might go over some laggy tunneled networks. Fortunately not criss-cross around the world yet, but... then the fun part:

2001:468:e00::/40 though seems to be visible globally, getting announced by University of California, directly going to: Korea! :) And then coming back to the rest of the world over Viagenie (10566). One of those nice paths:

2001:468:e00::/40 16150 (SE) 6667 (FI) 3549 (US) 6939 (US) 6939 (US) 10566 (CA) 3786 (KR) 17832 (KR) 1237 (KR) 17579 (KR) 2153 (US)

Neatly around the world; you might want to hint this University to not do 'transit' uplinks themselves with Korean networks :) Then there is also a 2001:468:e9c::/48 which also goes over Korea.

The colored version: http://www.sixxs.net/tools/grh/lg/?find=2001:468::/32

The above simply happens because most ISP's sanely filter on /32 boundaries, as per: http://www.space.net/~gert/RIPE/ipv6-filters.html

For further reading see Gert Doering's excellent presentations at: http://www.space.net/~gert/RIPE/

Greets, Jeroen
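Jeroen's point about /32-boundary filtering can be sketched in a few lines. This is an illustrative toy (Python, stdlib `ipaddress` only), not a copy of Gert Doering's actual filter set: it simply accepts allocation-sized announcements and drops longer more-specifics, which is why the /40s above stay invisible outside Abilene while the covering /32 is seen everywhere.

```python
import ipaddress

def accept_v6_route(prefix: str) -> bool:
    """Toy import filter in the spirit of /32-boundary filtering:
    accept allocation-sized IPv6 announcements, drop more-specifics.
    (Real policies differ per registry range; this is a simplification.)"""
    net = ipaddress.ip_network(prefix, strict=True)
    if net.version != 6:
        return False
    return net.prefixlen <= 32  # /33 and longer are filtered

# The /32 is accepted globally; the leaked /40 is not.
print(accept_v6_route("2001:468::/32"))      # True
print(accept_v6_route("2001:468:e00::/40"))  # False
```

The operational consequence is exactly what the traceroutes in this thread show: traffic follows whoever originates the /32, regardless of where a more-specific might have steered it inside networks that do accept it.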
On Thu, 13 Oct 2005 21:06:30 +0200, Jeroen Massar said:
Well Valdis, that bad route also has to do with your side of the equation, you might want to check who you are actually using as transits and if the routes they are providing to you are sane enough.
Well, if somebody at stupi.se wants to do a traceroute6 back at us, I'll be glad to see what the reverse path looks like... but last I heard traceroute and traceroute6 showed the *forward* path of packets..
2001:468::/32 is in the routing table, getting accepted by most ISP's. This one has a reasonable route
The real problem (at least for the forward direction from here) is that the outbound packets get into the Abilene network, and the best path from there to 2001:440:1880 is a 3ffe: tunnel to japan and then another 3ffe: tunnel back to New York.
On Thu, 2005-10-13 at 15:38 -0400, Valdis.Kletnieks@vt.edu wrote:
On Thu, 13 Oct 2005 21:06:30 +0200, Jeroen Massar said:
Well Valdis, that bad route also has to do with your side of the equation, you might want to check who you are actually using as transits and if the routes they are providing to you are sane enough.
Well, if somebody at stupi.se wants to do a traceroute6 back at us, I'll be glad to see what the reverse path looks like... but last I heard traceroute and traceroute6 showed the *forward* path of packets..
That is correct; try tracepath, which at least shows the asymmetry. You can also peek at GRH to see a probable AS path back; ASNs still tell a lot in IPv6. Next month I'll finalize the 'symmetry' tool, which allows one to do the AS path checkup between two places automatically.
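The 'symmetry' check Jeroen describes boils down to comparing the forward AS path from one vantage point with the reverse of the return AS path from the other. A toy sketch (Python; the AS numbers in the examples are drawn from the path quoted earlier purely for illustration):

```python
def path_symmetric(forward_path, reverse_path):
    """True when the reverse AS path is the forward AS path walked
    backwards -- the comparison a symmetry tool between two vantage
    points would automate."""
    return list(forward_path) == list(reversed(reverse_path))

# Symmetric: traffic returns through the same ASes in reverse order.
print(path_symmetric([16150, 6667, 3549], [3549, 6667, 16150]))  # True
# Asymmetric: the return path detours through another transit.
print(path_symmetric([16150, 6667, 3549], [3549, 10566, 16150]))  # False
```

This is also why a one-way traceroute can look fine while latency is terrible: the detour may be entirely on the path you cannot see.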
2001:468::/32 is in the routing table, getting accepted by most ISP's. This one has a reasonable route
The real problem (at least for the forward direction from here) is that the outbound packets get into the Abilene network, and the best path from there to 2001:440:1880 is a 3ffe: tunnel to japan and then another 3ffe: tunnel back to New York.
Kick Abilene to not be so silly and get some real transits. Then again Abilene is educational and those networks seem to have very nice (read: overcomplex) routing policies... Greets, Jeroen
On Thu, 13 Oct 2005 21:44:23 +0200, Jeroen Massar said:
Kick Abilene to not be so silly and get some real transits. Then again Abilene is educational and those networks seem to have very nice (read: overcomplex) routing policies...
Somehow, I don't think anything that Abilene does is going to fix Jordi's routing. From where *you* are, do *you* have a path to 2001:0440:1880:1000::0020 that *doesn't* go through Japan? If so, what does your path look like?
My box that gets IPv6 connectivity from Kewlio (set up via the SixXS tunnel broker) has a fairly short route which doesn't seem to go via Japan:

traceroute6 to time20.stupi.se (2001:440:1880:1000::20) from 2001:4bd0:202a::1, 64 hops max, 12 byte packets
 1  gw-121.lon-01.gb.sixxs.net  3.484 ms  3.527 ms  3.978 ms
 2  po6.712-IPv6-necromancer.sov.kewlio.net.uk  16.976 ms  4.536 ms  3.979 ms
 3  sl-bb1v6-bru-t-4.sprintv6.net  55.976 ms  55.614 ms  54.972 ms
 4  sl-bb1v6-sto-t-100.sprintv6.net  84.971 ms  82.604 ms  82.961 ms
 5  * * *
 6  2001:440:1880:1::2  97.992 ms  101.565 ms  109.964 ms
 7  2001:440:1880:1::12  104.966 ms  105.651 ms  102.960 ms
 8  2001:440:1880:1000::20  83.971 ms  84.650 ms  85.963 ms
-bash-2.05b$

Though my other box (with connectivity via the BT Exact tunnel broker) goes via Japan...

-bash-2.05b$ traceroute6 time20.stupi.se
traceroute6 to time20.stupi.se (2001:440:1880:1000::20) from 2001:618:400::511d:554, 64 hops max, 12 byte packets
 1  tb-exit.ipv6.btexact.com  7.983 ms  8.759 ms  7.939 ms
 2  uk6x-core-hopper-g0-2.ipv6.btexact.com  9.966 ms  7.892 ms  9.945 ms
 3  ft-euro6ix-uk6x.ipv6.btexact.com  9.972 ms  9.899 ms  9.944 ms
 4  Po3-2.LONBB3.London.opentransit.net  9.976 ms  9.910 ms  9.952 ms
 5  So7-2-0.LONCR1.London.opentransit.net  39.963 ms  10.800 ms  8.944 ms
 6  Po12-0.LONCR3.London.opentransit.net  9.975 ms  9.912 ms  9.944 ms
 7  Po12-0.OAKCR2.Oakhill.opentransit.net  81.971 ms  81.858 ms  82.929 ms
 8  Po5-0.PASCR3.Pastourelle.opentransit.net  141.972 ms  141.986 ms  167.906 ms
 9  Po2-0.KITBB1.Kitaibaraki.opentransit.net  269.852 ms  269.712 ms  270.920 ms
10  Ge0-0-0.TKYBB4.Tokyo.opentransit.net  267.901 ms  267.842 ms  Po1-3.TKYBB2.Tokyo.opentransit.net  271.916 ms
11  Ge0-0-0.TKYBB4.Tokyo.opentransit.net  272.865 ms  2001:688:0:2:8::23  270.868 ms  269.056 ms
12  hitachi1.otemachi.wide.ad.jp  406.900 ms  404.830 ms  2001:688:0:2:8::23  272.890 ms
13  hitachi1.otemachi.wide.ad.jp  408.073 ms  409.827 ms  410.849 ms
14  otm6-gate1.iij.net  257.918 ms  390.834 ms  286.880 ms
15  otm6-bb1.IIJ.Net  284.922 ms  otm6-gate1.iij.net  259.766 ms  259.903 ms
16  plt001ix06.IIJ.Net  260.792 ms  263.903 ms  otm6-bb0.IIJ.Net  259.808 ms
17  plt001ix06.IIJ.Net  266.909 ms  plt001ix06.IIJ.Net  266.716 ms  266.728 ms
18  sl-bb1v6-rly-t-22.sprintv6.net  333.883 ms  332.888 ms  plt6-gate1.IIJ.Net  266.886 ms
19  sl-bb1v6-rly-t-22.sprintv6.net  339.748 ms  sl-s1v6-nyc-t-1000.sprintv6.net  339.852 ms  338.706 ms
20  sl-bb1v6-sto-t-102.sprintv6.net  433.779 ms  sl-bb1v6-sto-t-101.sprintv6.net  435.691 ms  sl-bb1v6-nyc-t-1000.sprintv6.net  342.824 ms
21  sl-bb1v6-sto-t-101.sprintv6.net  439.739 ms  2001:7f8:d:fb::34  526.720 ms  454.105 ms
22  2001:7f8:d:fb::34  461.876 ms  459.004 ms  459.913 ms
23  2001:440:1880:1::2  456.849 ms  2001:440:1880:1::12  454.025 ms  454.121 ms
24  2001:440:1880:1000::20  436.766 ms  434.023 ms  2001:440:1880:1::12  462.884 ms
-bash-2.05b$
Valdis.Kletnieks@vt.edu wrote:
Somehow, I don't think anything that Abilene does is going to fix Jordi's routing. From where *you* are, do *you* have a path to 2001:0440:1880:1000::0020 that *doesn't* go through Japan? If so, what does your path look like?
At least some parts of the US seem to. This wouldn't be so bad if I didn't have to go across the US to get to the first hop:

FT@vash:~$ /usr/sbin/ntpdate -q 2001:0440:1880:1000::0020 192.36.143.234
server 2001:440:1880:1000::20, stratum 1, offset -0.005350, delay 0.27211
server 192.36.143.234, stratum 1, offset 0.002036, delay 0.15575
13 Oct 16:13:39 ntpdate[1526]: adjust time server 192.36.143.234 offset 0.002036 sec

FT@vash:~$ /usr/sbin/traceroute6 2001:0440:1880:1000::0020
traceroute to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:470:1f00:ffff::649, 30 hops max, 16 byte packets
 1  tommydool.tunnel.tserv1.fmt.ipv6.he.net (2001:470:1f00:ffff::648)  67.52 ms  70.567 ms  68.182 ms
 2  2001:470:1fff:2::26 (2001:470:1fff:2::26)  67.607 ms  68.592 ms  70.168 ms
 3  sl-bb1v6-rly-t-76.sprintv6.net (3ffe:2900:a:1::1)  143.214 ms  144.479 ms  145.113 ms
 4  sl-s1v6-nyc-t-1000.sprintv6.net (2001:440:1239:1001::2)  150.654 ms  147.692 ms  151.378 ms
 5  sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2)  244.013 ms sl-bb1v6-sto-t-101.sprintv6.net (2001:440:1239:1012::1)  246.934 ms sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2)  240.373 ms
 6  2001:7f8:d:fb::34 (2001:7f8:d:fb::34)  262.842 ms  261.69 ms  263.779 ms
 7  2001:440:1880:1::2 (2001:440:1880:1::2)  258.422 ms  266.312 ms  264.051 ms
 8  2001:440:1880:1::12 (2001:440:1880:1::12)  250.289 ms  265.565 ms  267.93 ms
 9  2001:440:1880:1000::20 (2001:440:1880:1000::20)  246.908 ms  249.03 ms  247.193 ms
On Thu, 13 Oct 2005, Nicholas Suan wrote:
Valdis.Kletnieks@vt.edu wrote:
Somehow, I don't think anything that Abilene does is going to fix Jordi's routing. From where *you* are, do *you* have a path to 2001:0440:1880:1000::0020 that *doesn't* go through Japan? If so, what does your path look like?
At least some parts of the US seem to. This wouldn't be so bad if I didn't have to go across the US to get to the first hop:
FT@vash:~$ /usr/sbin/ntpdate -q 2001:0440:1880:1000::0020 192.36.143.234 server 2001:440:1880:1000::20, stratum 1, offset -0.005350, delay 0.27211 server 192.36.143.234, stratum 1, offset 0.002036, delay 0.15575 13 Oct 16:13:39 ntpdate[1526]: adjust time server 192.36.143.234 offset 0.002036 sec
FT@vash:~$ /usr/sbin/traceroute6 2001:0440:1880:1000::0020 traceroute to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:470:1f00:ffff::649, 30 hops max, 16 byte packets 1 tommydool.tunnel.tserv1.fmt.ipv6.he.net (2001:470:1f00:ffff::648) 67.52 ms 70.567 ms 68.182 ms 2 2001:470:1fff:2::26 (2001:470:1fff:2::26) 67.607 ms 68.592 ms 144.479 ms 145.113 ms 4 sl-s1v6-nyc-t-1000.sprintv6.net (2001:440:1239:1001::2) 150.654 ms 147.692 ms 151.378 ms 5 sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2) 244.013 ms 9 2001:440:1880:1000::20 (2001:440:1880:1000::20) 246.908 ms 249.03 ms 247.193 ms
macosx tiger no likey the ntpdate v6 :( but:

~> traceroute6 2001:0440:1880:1000::0020
traceroute6 to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:408:1009:2:203:93ff:feec:f318, 30 hops max, 12 byte packets
 1  350.0-0fastethernet.rtr.ops-netman.net  2.551 ms  1.894 ms  23.325 ms
 2  2001.gr-0-1-0.hr6.tco4.alter.net  26.575 ms  55.095 ms  *
 3  2001:408:11::8  98.683 ms  95.157 ms  98.663 ms
 4  sl-bb1v6-bru.sprintlink.net  174.97 ms  172.633 ms  172.264 ms
 5  sl-bb1v6-sto-t-100.sprintv6.net  207.452 ms  205.804 ms  211.638 ms
 6  2001:7f8:d:fb::34  213.451 ms  226.246 ms  225.651 ms
 7  2001:440:1880:1::2  209.371 ms  225.198 ms  225.462 ms
 8  2001:440:1880:1::12  218.561 ms  227.934 ms  226.571 ms
 9  2001:440:1880:1000::20  207.678 ms  206.464 ms  207.334 ms

(debian sarge seems to like ntpdate v6 though :) bug open to apple)
On Thu, 13 Oct 2005 Valdis.Kletnieks@vt.edu wrote:
Somehow, I don't think anything that Abilene does is going to fix Jordi's routing. From where *you* are, do *you* have a path to 2001:0440:1880:1000::0020 that *doesn't* go through Japan? If so, what does your path look like?
# traceroute6 2001:0440:1880:1000::0020
....
 2  2001:7f8:d:ff::26 (2001:7f8:d:ff::26)  2.888 ms  19.51 ms  18.8 ms
 3  2001:440:1880:1::2002 (2001:440:1880:1::2002)  0.945 ms  1.058 ms  0.741 ms
 4  2001:440:1880:1::2011 (2001:440:1880:1::2011)  0.986 ms  1.009 ms  0.759 ms
 5  2001:440:1880:1::2031 (2001:440:1880:1::2031)  6.317 ms  5.561 ms  20.877 ms
 6  2001:440:1880:1000::20 (2001:440:1880:1000::20)  0.867 ms  0.979 ms  0.894 ms

-- Mikael Abrahamsson email: swmike@swm.pp.se
On Thu, 2005-10-13 at 15:50 -0400, Valdis.Kletnieks@vt.edu wrote:
On Thu, 13 Oct 2005 21:44:23 +0200, Jeroen Massar said:
Kick Abilene to not be so silly and get some real transits. Then again Abilene is educational and those networks seem to have very nice (read: overcomplex) routing policies...
Somehow, I don't think anything that Abilene does is going to fix Jordi's routing. From where *you* are, do *you* have a path to 2001:0440:1880:1000::0020 that *doesn't* go through Japan? If so, what does your path look like?
As mentioned in my first response, see: http://www.sixxs.net/tools/traceroute/ which is a Distributed Traceroute, allowing one to traceroute over IPv4 and IPv6 from most of the SixXS POPs, which are present in most parts of Europe and also one in the US @ OCCAID. or check: http://www.sixxs.net/tools/grh/ for a nice BGP looking glass (grh.sixxs.net for telnet ;) To solve Jordi's problem I've brought him, offlist, into contact with an ISP that is able to give him native IPv6 in Spain with a native route to most places, so that problem should not exist anymore in a few weeks. On Thu, 2005-10-13 at 16:11 -0400, Bill Owens wrote:
On Thu, Oct 13, 2005 at 09:44:23PM +0200, Jeroen Massar wrote:
Kick Abilene to not be so silly and get some real transits. Then again Abilene is educational and those networks seem to have very nice (read: overcomplex) routing policies...
I don't speak for Abilene or Internet2, but here's what I know. Abilene doesn't buy transit from anyone.
Yes, I am aware of this; as I mentioned, they are educational and have some overcomplex routing policies. This seems to be the case for the majority of NRENs, unfortunately, which causes bad connectivity towards the 'commercial' prefixes. One day those will include a Google service (who, btw, are already using their own IPv6 prefix), and then it will most likely hit home at these places that people need good connectivity to those sites too... The good part is that they do have a staff that is aware of and understands this issue; it is just good to note, whenever doing a traceroute or noticing bad connectivity, that NRENs seem to have these issues... Having them in GRH also allows one to see where it most likely goes wrong. Greets, Jeroen
On Thu, Oct 13, 2005 at 03:50:17PM -0400, Valdis.Kletnieks@vt.edu wrote:
On Thu, 13 Oct 2005 21:44:23 +0200, Jeroen Massar said:
Kick Abilene to not be so silly and get some real transits. Then again Abilene is educational and those networks seem to have very nice (read: overcomplex) routing policies...
Somehow, I don't think anything that Abilene does is going to fix Jordi's routing. From where *you* are, do *you* have a path to 2001:0440:1880:1000::0020 that *doesn't* go through Japan? If so, what does your path look like?
Mine does not:

punk:~/Desktop> traceroute6 2001:0440:1880:1000::0020
traceroute to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:418:3f4:0:20e:a6ff:febf:a5ca, 30 hops max, 16 byte packets
 1  2001:418:3f4::1 (2001:418:3f4::1)  3.95 ms  1.342 ms  1.266 ms
 2  2001:418:2400:5000::1 (2001:418:2400:5000::1)  15.109 ms  16.2 ms  15.136 ms
 3  ae0-4.r01.chcgil06.us.bb.verio.net (2001:418:0:7056::1)  15.652 ms  15.417 ms  15.683 ms
 4  p16-1-1-1.r21.snjsca04.us.bb.verio.net (2001:418:0:2000::2a9)  62.209 ms  61.899 ms  62.196 ms
 5  p64-0-0-0.r20.mlpsca01.us.bb.verio.net (2001:418:0:2000::235)  63.7 ms  63.328 ms  64.479 ms
 6  xe-1-1.r02.mlpsca01.us.bb.verio.net (2001:418:0:2000::35)  62.185 ms  63.449 ms  62.941 ms
 7  fa-0-0-0.r00.mlpsca01.us.b6.verio.net (2001:418:0:700f::b600)  62.942 ms  63.405 ms  64.565 ms
 8  tu-0.sprint.mlpsca01.us.b6.verio.net (2001:418:0:4000::4a)  64.486 ms  65.586 ms  65.215 ms
 9  sl-s1v6-nyc-t-1001.sprintv6.net (2001:440:1239:1005::2)  137.216 ms sl-bb1v6-nyc-t-1001.sprintv6.net (2001:440:1239:100b::1)  135.68 ms sl-s1v6-nyc-t-1001.sprintv6.net (2001:440:1239:1005::2)  137.073 ms
10  sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2)  231.689 ms sl-bb1v6-sto-t-101.sprintv6.net (2001:440:1239:1012::1)  232.464 ms sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2)  231.665 ms
11  2001:7f8:d:fb::34 (2001:7f8:d:fb::34)  235.363 ms  250.232 ms  252.684 ms
12  2001:440:1880:1::2 (2001:440:1880:1::2)  246.693 ms  252.567 ms  251.964 ms
13  2001:440:1880:1::12 (2001:440:1880:1::12)  564.686 ms  310.919 ms  254.205 ms
14  2001:440:1880:1000::20 (2001:440:1880:1000::20)  232.569 ms  234.418 ms  232.461 ms

-- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.
On Fri, 14 Oct 2005 10:21:20 EDT, Jared Mauch said:
Mine does not:
punk:~/Desktop> traceroute6 2001:0440:1880:1000::0020 traceroute to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:418:3f4:0:20e:a6ff:febf:a5ca, 30 hops max, 16 byte packets 1 2001:418:3f4::1 (2001:418:3f4::1) 3.95 ms 1.342 ms 1.266 ms 2 2001:418:2400:5000::1 (2001:418:2400:5000::1) 15.109 ms 16.2 ms 15.136 ms 3 ae0-4.r01.chcgil06.us.bb.verio.net (2001:418:0:7056::1) 15.652 ms 15.417 ms 15.683 ms 4 p16-1-1-1.r21.snjsca04.us.bb.verio.net (2001:418:0:2000::2a9) 62.209 ms 61.899 ms 62.196 ms 5 p64-0-0-0.r20.mlpsca01.us.bb.verio.net (2001:418:0:2000::235) 63.7 ms 63.328 ms 64.479 ms 6 xe-1-1.r02.mlpsca01.us.bb.verio.net (2001:418:0:2000::35) 62.185 ms 63.449 ms 62.941 ms 7 fa-0-0-0.r00.mlpsca01.us.b6.verio.net (2001:418:0:700f::b600) 62.942 ms 63.405 ms 64.565 ms 8 tu-0.sprint.mlpsca01.us.b6.verio.net (2001:418:0:4000::4a) 64.486 ms 65.586 ms 65.215 ms 9 sl-s1v6-nyc-t-1001.sprintv6.net (2001:440:1239:1005::2) 137.216 ms sl-bb1v6-nyc-t-1001.sprintv6.net (2001:440:1239:100b::1) 135.68 ms sl-s1v6-nyc-t-1001.sprintv6.net (2001:440:1239:1005::2) 137.073 ms
Interesting. :) That's the first one I've seen where a sprintv6.net address isn't at hop number 3 or so (indicating that the person is basically directly connected to sprintv6.net) and that also doesn't take a loop through Japan....
On Fri, Oct 14, 2005 at 02:15:20PM -0400, Valdis.Kletnieks@vt.edu wrote:
Interesting. :) That's the first one I've seen that a sprintv6.net address isn't at hop number 3 or so (indicating that the person is basically directly connected to sprintv6.net) and also doesn't take a loop through Japan....
We (AS2914) stopped adding tunnel peers a while ago, and only add tunnels for customers at this point (I think). As a tech guy, I prefer people to connect to us natively instead of by any other method; it removes all the silliness I've seen (including v6 p-mtu issues). It keeps packets on the best path and avoids cloaking the underlying topology, so people can find the trouble. Same reason MPLS networks don't always turn on/off ttl-decrement when following a (TE) tunnel. It also makes it easier to troubleshoot and speculate where troubles are ;-).

- jared

--
Jared Mauch | pgp key available via finger from jared@puck.nether.net
clue++; | http://puck.nether.net/~jared/ My statements are only mine.
On Thu, Oct 13, 2005 at 09:44:23PM +0200, Jeroen Massar wrote:
Kick Abilene to not be so silly and get some real transits. Then again, Abilene is educational, and those networks seem to have very nice (read: overcomplex) routing policies...
I don't speak for Abilene or Internet2, but here's what I know. Abilene doesn't buy transit from anyone. Their internal routing policy is quite straightforward, given that it is governed by the Abilene Conditions of Use (their AUP). The CoU would normally prohibit connections to commercial ISPs, but it is set aside for IPv6 (and IPv4 multicast). As a result, Abilene peers with a number of commercial IPv6 networks at locations where it is convenient to do so (PAIX, primarily). Abilene has excellent IPv6 connectivity to the R&E world, typically along the same paths and with similar performance to IPv4. They'll never have that good connectivity with the commercial world, and in any case once the commercial uptake of IPv6 is high enough, they'll disconnect those peerings and let their members get IPv6 connectivity through their individual commercial ISPs. I have no idea when that will happen nor how the decision will be made, but that's the stated plan. Bill. PS - I have no interest whatsoever in debating R&E versus commercial, the role of Abilene, etc. I'm just passing along this info. . .
Blech. :) (For comparison, here's the IPv4 traceroute:
Very interesting. From the east coast your IPv4 traffic goes to Virginia and then to the UK. But your IPv6 traffic goes to Atlanta, Houston, LA and across the Pacific. Is this due to someone's misconfiguration of weights? --Michael Dillon
Yes ;-) JPMac:~ jordi$ ntpdate -q 2001:0440:1880:1000::0020 13 Oct 21:23:11 ntpdate[347]: can't find host 2001:0440:1880:1000::0020 server 0.0.0.0, stratum 0, offset 0.000000, delay 32639.00000 server 17.72.133.42, stratum 2, offset -9.996023, delay 0.13766 13 Oct 21:23:12 ntpdate[347]: step time server 17.72.133.42 offset -9.996023 sec Regards, Jordi
De: Peter Lothberg <roll@Stupi.SE> Responder a: <owner-nanog@merit.edu> Fecha: Thu, 13 Oct 2005 11:05:58 PDT Para: Randy Bush <randy@psg.com> CC: "Steven M. Bellovin" <smb@cs.columbia.edu>, <nanog@nanog.org> Asunto: Re: IPv6 news
so having dual stack backbones is very important. but ...
Other global providers have an IPv6 network too, all open for business, but there are very, VERY few customers.
And I'm not so sure we even have an "Internet" of IPv6 out there either. It looks cold and empty to me.
Here's a challenge: have an NTP server attached directly to a good clock and an IPv6 network.
Is there anyone who can talk to it using IPv6 on the Nanog list?
(Time20.Stupi.SE, 2001:0440:1880:1000::0020)
-Peter
************************************ The IPv6 Portal: http://www.ipv6tf.org Barcelona 2005 Global IPv6 Summit Information available at: http://www.ipv6-es.com This electronic message contains information which may be privileged or confidential. The information is intended to be for the use of the individual(s) named above. If you are not the intended recipient be aware that any disclosure, copying, distribution or use of the contents of this information, including attached files, is prohibited.
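[Editor's aside: Jordi's ntpdate above apparently choked on the bare IPv6 literal. Speaking SNTP directly over an AF_INET6 socket sidesteps the resolver entirely. A minimal sketch, not the tool used in the thread: the 48-byte packet layout follows RFC 4330, `query_ntp_v6` is a made-up helper, and actually querying Time20.Stupi.SE requires working IPv6 reachability.]

```python
import socket
import struct

# Offset between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01)
NTP_EPOCH_OFFSET = 2208988800

def build_sntp_request():
    """48-byte SNTP client packet: LI=0, VN=4, Mode=3 -> first byte 0x23."""
    packet = bytearray(48)
    packet[0] = 0x23
    return bytes(packet)

def parse_transmit_time(response):
    """Pull the Transmit Timestamp (bytes 40-47: 32-bit seconds plus
    32-bit fraction) out of a server reply and convert to Unix time."""
    secs, frac = struct.unpack('!II', response[40:48])
    return secs - NTP_EPOCH_OFFSET + frac / 2**32

def query_ntp_v6(host, timeout=2.0):
    """Query an NTP server over IPv6; `host` may be a literal such as
    2001:440:1880:1000::20. Needs live IPv6 connectivity."""
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_sntp_request(), (host, 123))
        data, _ = sock.recvfrom(48)
    return parse_transmit_time(data)
```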
ntpdate -q 2001:0440:1880:1000::0020
server 2001:440:1880:1000::20, stratum 1, offset 0.048519, delay 0.56551
13 Oct 15:41:08 ntpdate[7397]: adjust time server 2001:440:1880:1000::20 offset 0.048519 sec

Tim Rainier
And the reason why it fails (clicked send too fast!):

traceroute6 to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:7f9:2000:100:20d:93ff:feeb:73, 30 hops max, 12 byte packets
 1  2001:7f9:2000:100:200:1cff:feb5:c535  1.832 ms *  1.138 ms
 2  2001:7f9:2000:1:1::1  97.941 ms  101.684 ms  93.166 ms
 3  v6-tunnel40-uk6x.ipv6.btexact.com  142.381 ms  167.692 ms  180.064 ms
 4  ft-euro6ix-uk6x.ipv6.btexact.com  154.574 ms  328.447 ms  352.331 ms
 5  po3-2.lonbb3.london.opentransit.net  362.112 ms  232.994 ms  231.141 ms
 6  so7-2-0.loncr1.london.opentransit.net  248.975 ms  248.86 ms  249.376 ms
 7  po12-0.loncr3.london.opentransit.net  159.662 ms  214.85 ms  395.218 ms
 8  po12-0.oakcr2.oakhill.opentransit.net  379.212 ms  257.403 ms  366.123 ms
 9  so4-0-0.loacr2.los-angeles.opentransit.net  402.118 ms  281.826 ms  450.289 ms
10  po2-0.kitbb1.kitaibaraki.opentransit.net  522.638 ms  452.638 ms po1-0.tkybb2.tokyo.opentransit.net  481.732 ms
11  ge0-0-0.tkybb4.tokyo.opentransit.net  421.303 ms po1-3.tkybb2.tokyo.opentransit.net  479.118 ms  595.444 ms
12  ge0-0-0.tkybb4.tokyo.opentransit.net  514.295 ms  472.411 ms 2001:688:0:2:8::23  467.252 ms
13  hitachi1.otemachi.wide.ad.jp  588.472 ms  439.962 ms  525.157 ms
14  hitachi1.otemachi.wide.ad.jp  422.59 ms  423.892 ms  421.864 ms
15  pc6.otemachi.wide.ad.jp  404.03 ms otm6-gate1.iij.net  473.603 ms  449.513 ms
16  otm6-gate1.iij.net  418.808 ms  517.862 ms otm6-bb0.iij.net  416 ms
17  plt001ix06.iij.net  433.301 ms plt001ix06.iij.net  426.364 ms  454.844 ms
18  plt001ix06.iij.net  473.956 ms  456.72 ms plt6-gate1.iij.net  717.541 ms
19  sl-bb1v6-rly-t-22.sprintv6.net  422.121 ms plt6-gate1.iij.net  433.742 ms  504.335 ms
20  sl-bb1v6-rly-t-22.sprintv6.net  429.331 ms  445.961 ms sl-s1v6-nyc-t-1000.sprintv6.net  439.238 ms
21  sl-bb1v6-sto-t-102.sprintv6.net  591.344 ms sl-s1v6-nyc-t-1000.sprintv6.net  423.258 ms  600.439 ms
22  * * *

Regards,
Jordi
On Thu, 13 Oct 2005 21:26:12 +0200, JORDI PALET MARTINEZ said:
16  otm6-gate1.iij.net  418.808 ms  517.862 ms otm6-bb0.iij.net  416 ms
17  plt001ix06.iij.net  433.301 ms plt001ix06.iij.net  426.364 ms  454.844 ms
18  plt001ix06.iij.net  473.956 ms  456.72 ms plt6-gate1.iij.net  717.541 ms
19  sl-bb1v6-rly-t-22.sprintv6.net  422.121 ms plt6-gate1.iij.net  433.742 ms  504.335 ms
20  sl-bb1v6-rly-t-22.sprintv6.net  429.331 ms  445.961 ms sl-s1v6-nyc-t-1000.sprintv6.net  439.238 ms
21  sl-bb1v6-sto-t-102.sprintv6.net  591.344 ms sl-s1v6-nyc-t-1000.sprintv6.net  423.258 ms  600.439 ms
22  * * *
That looks a lot like my traceroute6 (except that when I tried it, the next hop was working). Is it just me, or is this saying that 2001:440::/32 isn't peered in enough places (since for both my eastern US location and Jordi's London location, the route points off to iij.net)?
roll@Stupi.SE (Peter Lothberg) writes:
Is there anyone who can talk to it using IPv6 on the Nanog list?
(Time20.Stupi.SE, 2001:0440:1880:1000::0020)
[sa:amd64] ntpq -p 2001:0440:1880:1000::0020
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
oCHRONOLOG(0)    .PPS.            0 l   46   64  377    0.000    0.000   0.004
+ntp1.sp.se      .PPS.            1 u   47   64  377    6.064   -0.001   0.035
+ntp2.sp.se      .PPS.            1 u   20   64  377    6.061   -0.025   0.048
+Time1.Stupi.SE  .PPS.            1 u   16   64  377    0.479    0.031   0.053
+Time2.Stupi.SE  .PPS.            1 u   32   64  377    0.500    0.026   0.046
 Time3.Stupi.SE  .INIT.          16 u    - 1024    0    0.000    0.000  4000.00
+Time4.Stupi.SE  .PPS.            1 u   33   64  377    0.543   -0.017   0.043
+Time10.Stupi.SE .GPS.            1 u   12   64  377    5.333   -0.471   0.039

--
Paul Vixie
On 14/10/2005, at 3:35 AM, Peter Lothberg wrote:
Here's a challenge: have an NTP server attached directly to a good clock and an IPv6 network.
Is there anyone who can talk to it using IPv6 on the Nanog list?
(Time20.Stupi.SE, 2001:0440:1880:1000::0020)
yoyo$ ntpdate -q 2001:0440:1880:1000::0020
server 2001:440:1880:1000::20, stratum 1, offset 0.001079, delay 0.37489
15 Oct 20:30:06 ntpdate[11313]: adjust time server 2001:440:1880:1000::20 offset 0.001079 sec

yoyo$ traceroute6 2001:0440:1880:1000::0020
traceroute to 2001:0440:1880:1000::0020 (2001:440:1880:1000::20) from 2001:388:4000:4002:200:e2ff:fea5:80fe, 30 hops max, 16 byte packets
 1  gigabitethernet0-2.er2.aarnet.cpe.aarnet.net.au (2001:388:4000:4002:20f:23ff:fea3:ec00)  0.288 ms  0.223 ms  0.14 ms
 2  ge-1-0-3.bb1.a.adl.aarnet.net.au (2001:388:1:2003:212:1eff:fe92:d201)  0.428 ms  0.428 ms  0.377 ms
 3  so-0-1-0.bb1.a.mel.aarnet.net.au (2001:388:1:6::2)  9.225 ms  9.256 ms  9.205 ms
 4  so-0-1-0.bb1.b.syd.aarnet.net.au (2001:388:1:a::2)  50.35 ms  40.85 ms  57.739 ms
 5  so-0-0-0.bb1.a.lax.aarnet.net.au (2001:388:1:15::2)  168.519 ms  168.575 ms  168.497 ms
 6  ge-2-7.a00.lsanca17.us.ra.verio.net (2001:418:4000:5000::9)  173.056 ms  168.829 ms  168.807 ms
 7  * * *
 8  p16-1-1-3.r20.mlpsca01.us.bb.verio.net (2001:418:0:2000::1a1)  179.239 ms  179.162 ms  179.176 ms
 9  xe-1-1.r02.mlpsca01.us.bb.verio.net (2001:418:0:2000::35)  179.159 ms  179.229 ms  179.211 ms
10  fa-0-0-0.r00.mlpsca01.us.b6.verio.net (2001:418:0:700f::b600)  179.289 ms  179.25 ms  179.216 ms
11  tu-0.sprint.mlpsca01.us.b6.verio.net (2001:418:0:4000::4a)  180.563 ms  180.6 ms  180.513 ms
12  sl-s1v6-nyc-t-1001.sprintv6.net (2001:440:1239:1005::2)  252.605 ms sl-bb1v6-nyc-t-1001.sprintv6.net (2001:440:1239:100b::1)  251.836 ms sl-s1v6-nyc-t-1001.sprintv6.net (2001:440:1239:1005::2)  252.652 ms
13  sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2)  401.242 ms sl-bb1v6-sto-t-101.sprintv6.net (2001:440:1239:1012::1)  347.852 ms sl-bb1v6-sto-t-102.sprintv6.net (2001:440:1239:100d::2)  401.301 ms
14  2001:7f8:d:fb::34 (2001:7f8:d:fb::34)  364.019 ms  418.652 ms  367.302 ms
15  2001:440:1880:1::2 (2001:440:1880:1::2)  401.974 ms  367.857 ms  419.025 ms
16  2001:440:1880:1::12 (2001:440:1880:1::12)  353.331 ms  422.397 ms  367.951 ms
17  2001:440:1880:1000::20 (2001:440:1880:1000::20)  401.24 ms  349.339 ms  401.704 ms
yoyo$

It might be "closer" if we turned up IPv6 with Sprint but are they native yet?

Mark.
On Sat, Oct 15, 2005 at 08:36:29PM +0930, Mark Prior wrote:
It might be "closer" if we turned up IPv6 with Sprint but are they native yet?
Nope. See http://www.nanog.org/mtg-0405/augmentation.html and http://www.nanog.org/mtg-0405/pdf/rockell.pdf. Although it's dated, I don't believe it's changed (much) at all. The added complexity is:

1) software testing/validation
   - need to test v6 transport to/from box
   - need to test v6 igp (ospfv3, isis, rip, w00t)
   - need to test v6 ibgp
   - need to test v6 ebgp
2) added link configuration
   - need to assign a /31 to the link and a /126
3) added network config complexity
   - need to add v6 ibgp and ebgp peers
   - need to validate nobody forgot step #2 in the path
4) some devices do IPv6 in software; this means using limited router CPU resources for something that is done in hw for ipv4
5) internal policies to make 1-4 happen, including increased budgets for IT tools, etc.; this requires serious commitment at large corporations

- jared

--
Jared Mauch | pgp key available via finger from jared@puck.nether.net
clue++; | http://puck.nether.net/~jared/ My statements are only mine.
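[Editor's aside: the per-link /126 assignment Jared mentions in step 2 (and the step-3 check that nobody forgot one) is easy to script. A sketch using Python's ipaddress module; `carve_links` is a made-up helper, and 2001:db8::/64 is the documentation prefix, not any real allocation.]

```python
import ipaddress

def carve_links(infra_prefix, count, link_len=126):
    """Carve point-to-point link subnets (default /126, as in the
    list above) out of an infrastructure block, returning the link
    network and the two router-end addresses for each link."""
    block = ipaddress.ip_network(infra_prefix)
    links = []
    for net in block.subnets(new_prefix=link_len):
        if len(links) == count:
            break
        # addresses [1] and [2] of each /126 are the two ends of the link
        links.append((net, net[1], net[2]))
    return links
```

Auditing a router config dump against the list this produces is one way to catch the forgotten-link case before it bites in the forwarding path.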
participants (60)
- Alexei Roudnev
- Andre Oppermann
- Bill Owens
- Bill Woodcock
- bmanning@vacation.karoshi.com
- Chris Adams
- Christopher L. Morrow
- Crist Clark
- Daniel Golding
- Daniel Roesen
- Daniel Senie
- David Conrad
- David Conrad
- David Meyer
- Elmar K. Bins
- Fred Baker
- Gary E. Miller
- Gordon Cook
- Gregory Edigarov
- Iljitsch van Beijnum
- Jared Mauch
- Jeroen Massar
- Joe Abley
- Joe Maimon
- John Dupuy
- John Payne
- JORDI PALET MARTINEZ
- Kevin Loch
- Mark Prior
- Mark Smith
- Marshall Eubanks
- Michael.Dillon@btradianz.com
- Mikael Abrahamsson
- Mike Leber
- Nicholas Suan
- Owen DeLong
- Paul Jakma
- Paul Vixie
- Paul Vixie
- Pekka Savola
- Per Heldal
- Peter Dambier
- Peter Lothberg
- Phillip Vandry
- Randy Bush
- Robert E.Seastrom
- Roy Badami
- Sabri Berisha
- Scott Morris
- Simon Lyall
- Stephane Bortzmeyer
- Stephen Sprunk
- Suresh Ramasubramanian
- Susan Harris
- Todd Vierling
- Tom Vest
- Tony Li
- trainier@kalsec.com
- Valdis.Kletnieks@vt.edu
- william(at)elan.net