<deBunk> Where did you get all this from? There is not even one single reference to a URL; not to be rude, but how long did it take you to write this theory? As for "It's broken, first and foremost...": they may be a Tier 1 provider of other services who also happen to offer IPv6, at which they are only a Tier 2 or 3, while using the marketing gimmicks of their original Tier 1 status to get acknowledgement. I stopped reading shortly after (I think) the second paragraph and scanned the rest for URLs that might have made this clear and to the point, but did not find any. Hearsay. </deBunk>
On Sat, Jul 09, 2011 at 03:25:27PM -0600, Bob Network wrote:
Why is IPv6 broken?
It's broken, first and foremost, because not all network providers who claim to be tier 1 are tier 1.
Even worse, some of these providers run 6to4 relays or provide them to home users. A user has no choice about which provider is running their 6to4 relay...so they might end up using a relay that is run by a provider who doesn't peer with their intended destination. I don't think the IETF saw that one coming. But the result is to make 6to4 even more broken. Now, I know some people want 6to4 to die, but while it still exists in some form, user experience is worse than it could be. The temporary fix is for any provider to run their own 6to4 relay for their own customers (assuming that they themselves have full connectivity).
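To make the 6to4 mechanics concrete, here's a rough sketch (Python, standard ipaddress module only; the address is just an example) of how a host's 2002::/48 prefix is derived from its public IPv4 address per RFC 3056, and the RFC 3068 anycast relay address a provider can announce locally for its own customers:

    import ipaddress

    # RFC 3068 well-known 6to4 anycast relay address. A provider that runs its
    # own relay announces this locally, so its customers don't depend on
    # whichever third-party relay happens to be closest on the IPv4 side.
    ANYCAST_RELAY_V4 = "192.88.99.1"

    def sixto4_prefix(public_v4):
        """Derive the 2002::/48 prefix for a public IPv4 address (RFC 3056):
        the 32 address bits sit directly after the 2002: prefix."""
        v4 = int(ipaddress.IPv4Address(public_v4))
        return ipaddress.IPv6Network((0x2002 << 112 | v4 << 80, 48))

    print(sixto4_prefix("203.0.113.9"))   # 2002:cb00:7109::/48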
Right now, unless you buy transit from multiple tier 1s, and do so with carefully chosen tier 1s, you have only part of the IPv6 internet. Many tier 1s are unsuitable even as backup connections, since you still want your backup connection to have access to the whole internet! Good tier 2 providers might be an excellent choice, since good providers have already done this legwork and can monitor their providers for compliance.
A few myths...
Routing table size has nothing to do with completeness of routes. Google may be one route, through aggregation. And SmallCo may advertise a large (less specific) route through one provider and, due to traffic engineering, a smaller (more specific) route through a second one - in many cases, anyone that had the large route would be able to contact SmallCo, even without the smaller route being present. So routing table size doesn't work as a measure. In addition, some providers aggregate their routing tables to reduce routing load and such. Others intentionally don't, or even deaggregate, so that they can brag about having bigger routing tables. What you need to ask is: "How many /64s can you get to from your network, and how many of these /64s are reachable from at least one other major provider (you don't care about internal-only networks, after all)?" They can give you that information, but many won't want to.
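If you want a back-of-the-envelope way to compare answers to that question, here's a rough sketch (Python; the prefixes shown are made-up examples) that collapses a list of announced prefixes and counts how many distinct /64s they cover, so aggregation and deaggregation games don't skew the number:

    import ipaddress

    def reachable_64s(prefixes):
        """Count distinct /64s covered by a set of announced IPv6 prefixes.
        Overlapping or adjacent announcements are collapsed first, so a
        deaggregated table doesn't look "bigger" than an aggregated one.
        Prefixes longer than /64 are ignored in this tally."""
        nets = [ipaddress.ip_network(p) for p in prefixes]
        collapsed = ipaddress.collapse_addresses(nets)
        return sum(2 ** (64 - n.prefixlen) for n in collapsed if n.prefixlen <= 64)

    # The /48 below is already inside the /32, so it adds nothing to the count.
    print(reachable_64s(["2001:db8::/32", "2001:db8:1::/48", "2400::/12"]))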
It's also not about technical people not getting along. It's about business players trying to make money, but not just that either. It's also about ensuring that providers don't end up assuming more than their share of costs for a link. Just because you have a common peering point doesn't mean that turning peering on would reduce your costs. In some cases it may increase costs tremendously, particularly on your long haul backbone links, because the other party would like to take advantage of an attitude of trust on the internet. That's why we end up with peering policies and contracts.
What is the issue?
Let's take Hurricane. This is no different from other providers...basically, they want to say, "We shouldn't need to pay for IPv6 transit from anyone." This is what Cogent said about IPv4 a few years ago. Google used to say this too for IPv6; I'm not sure if they are still saying it. Basically, "We know we're big enough that you won't want to screw your users by not peering with us."
A small network couldn't pull off this tactic - say, a 100-node network that said to the IPv4 tier 1s: "Hey, I'm in the Podunk Internet Exchange, so are you, so I'm going to peer with you so I don't have to buy any bandwidth for my web server (placed in the Podunk exchange)." Sure, they would like to do that - it would save a ton of money if their site got lots of hits. I mean, who wouldn't want free connectivity?
In IPv6, we're going through what we settled years ago in IPv4 - who has to pay whom to connect. After all, even free peering connections have a cost in manpower, debugging, traffic engineering, documentation, etc.
Some players who aren't getting free interconnection to tier 1s in IPv4 want to get it in IPv6. So they've worked to attract lots of users, and done so under the guise of "We like IPv6 and want to promote it." Others have not bothered with trying to attract the users, but have said, "We're too big for you to not want to give us connectivity for free, since it would piss off your users if you don't" (Google did this at one point in the past, may still be doing it). The Google example is basically trying to use a monopoly position to force business decisions.
Now, HE, Google, and others would want you to think, "Hey, IPv6 is all new, and these $#@! other providers just want to make a buck on something they have no right to." Well, perhaps. But what they aren't saying is, "We can turn on BGP for IPv6 on our existing connections to other providers, with no cost to us, and actually have full connectivity." The issue isn't about cost today - nobody is charging extra for IPv6 in addition to IPv4 on a pipe where you already buy IPv4 bandwidth. And Google and HE already buy IPv4 bandwidth. What they are thinking of is the future, 15 years from now, when there is no IPv4 - in that future, IPv6 isn't insignificant bandwidth, it's everything. Wouldn't it be nice to be a tier 1 and not pay for that? Of course! And certainly one can argue for or against the current tier 1 club's exclusivity. But it's the way the internet works right now, for better or worse. In the meantime, in pursuit of this future, today's customers are screwed by these providers trying to position themselves to make more profit margin down the road.
Which is better for the customer? A system where they are screwed today so that their provider can have a better negotiating position in business discussions, OR one where they do whatever it takes to provide the customer with full connectivity? (To HE's credit, they are giving away transit today on IPv6, so it's not like you are losing anything of value by not having the full internet routing tables, but it's a huge reason not to pay HE for other services, such as data center colocation - go with a provider that you pay and which gives you what you pay for: full transit.)
A bit about peering...
Lots of people who aren't running big networks don't understand peering. They think, "Doesn't this benefit everyone if everyone exchanges traffic?" Maybe, on a pure level, but the business doesn't work that way.
I'll give you an example. Let's say you are a little ISP located in Virginia, near a major peering point. You say, "All the tier 1s are there, I can pull fiber to that peering point, which is only a block away, and have free internet, other than the cost of the line." So, let's say you run the line, and let's say that all the tier 1s agree to let you peer for free, since they want your traffic too. Now, let's say your user downloads 1,000 TB from a server in California, on Qwest's network.
You paid, let's say, $15,000 for your piece of fiber going a block. You needed to hire contractors and buy permits and such, after all. So you shared in the costs of letting the user get to the server. What did Qwest pay? Well, they dug trenches, pulled fiber, negotiated with cities, counties, and states, paid taxes on their work, lit this fiber, etc. It cost a lot because they went a lot further than your one block. And a lot more than $15,000.
You say, "So what! Their customer benefits too!" That's true, but let's go a bit further. Let's say you have a network that extends to California - you by DS3s from Sprint to do it. There's some cost in that, but your user in Virginia would need more bandwidth than your DS3s. So you decide NOT to peer in California, just in Virginia. That way you don't have to upgrade your lines for your Virginia user. Maybe you even legally break your company into two entities, so that you can peer in California and Virginia both, but you can say with a straight face, "We only have Virginia offices for this user - the other company is a separate entity, and not the entity that owns either the server or the end user."
In other words, you found a way to shift most of the traffic burden and infrastructure costs to Qwest, away from your user.
This is why Qwest has some sort of peering policy. Among other things, it will require multiple exchange points, and Qwest will probably say they will send traffic to the closest peering point, to minimize their costs. You get to do the same (more on that later).
Let's say that you currently buy bandwidth from NTT - you're not big enough to get free peering from everyone, but Qwest agrees to peer with you. Of course Qwest and NTT also have a business relationship, to give each other free peering. If Qwest gives you and many other customers free peering, however, you'll send less traffic across NTT's network. That might be good from a technical standpoint, but NTT is now selling you a smaller pipe - and making less money. In effect, Qwest undercut NTT's business and lowered NTT's profits on the connection. How will NTT respond to that, when they were also giving free peering to and from Qwest? Well, they might decide that Qwest isn't a very nice partner and tell Qwest, "Pay us for transit or get lost." That could be ugly - both NTT and Qwest could lose, but Qwest, if they actually care about stable service, won't want to risk it. So generally you don't give free peering to anyone who is a customer of one of your free peers. You don't hurt their business. In fact, it's often a legal requirement in the peering agreement. (That said, you could argue whether or not there is an abuse of monopoly here...that's a different issue.)
Going one further, let's say you have the server, and Qwest has the end-user. That doesn't change anything - the economics are still such that Qwest has the cost, you don't. That said, it's convention that the person receiving the traffic pays for most of the backhaul.
Asymmetry in the Internet:
What's the path between your host and a remote server? How do you find it? If you said "traceroute", you might be right, but are probably wrong. You need to traceroute from both sides.
Every provider on the internet is trying to minimize costs. This means that you want traffic to leave your network and go to the destination network with as little distance traveled as possible, because costs go up with distance. It's cheaper to increase the size of pipes within a city to get to a peering point than to increase your backbone pipe size. So, peering contracts typically specify that you dump traffic to the peer as soon as possible. That means the person receiving the traffic generally pays more. It also means that any traffic that crosses an AS boundary almost certainly travels a completely different path each way. In many cases, one third party provider may be used in one direction, another in the other direction. So seeing packet loss on your traceroute at some random tinet router doesn't mean that this router is the cause of any problem, since the return path for that packet from that provider's router might actually cross yet another network that is never transited in either direction for your network connection. (I'm ignoring that most large providers also don't always send ICMP reliably BECAUSE they limit this intentionally to spare the router CPU from overload - it takes router CPU to generate an ICMP TTL exceeded, but it doesn't take router CPU to forward a packet - so traceroute or ping indicating loss at a router doesn't mean anything in itself - the path itself likely has zero percent loss).
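A rough rule of thumb for reading a traceroute or mtr with that caveat in mind (a sketch, with made-up hop names and numbers): only worry about loss that persists all the way to the final hop, because loss that appears at one intermediate router and then disappears is almost always rate-limited ICMP generation, not loss on the forwarding path.

    def significant_loss(hops):
        """hops is an ordered list of (hop_name, loss_percent) from a
        traceroute/mtr run toward the destination (the last entry).
        Report only hops whose loss is at least as bad as what the
        destination itself sees; anything else is likely just that router
        rate-limiting its own ICMP responses."""
        final_loss = hops[-1][1]
        if final_loss == 0:
            return []          # the path end-to-end is fine
        return [(name, loss) for name, loss in hops if loss >= final_loss]

    print(significant_loss([("edge-gw", 0.0), ("tinet-core", 40.0),
                            ("peer-exchange", 0.0), ("server", 0.0)]))
    # -> []  (the 40% "loss" at the middle hop never reaches the destination)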
So, here's the scenario.
Let's say a user and a server are on two separate networks, U (user) and S (server).
Let's say they both utilize transit provider T. So the path could be: U -- T -- S. S buys an OC12 from T, while U buys a T1.
But let's say that the user has a second transit provider, BIG, who is a free peer of T. He bought an OC3 from BIG. So there's another path between U and S: U -- BIG -- T -- S. Likely this path is much faster than U -- T -- S.
So, the path for the traffic to S goes U -- BIG -- T -- S.
Now, what path does the traffic from T's router take when T's router generates an ICMP TTL exceeded in response to a traceroute from the user? Does it go straight over the T1 line, or does it go over the peering connection to BIG and then to the customer? The answer, it turns out, depends on network configuration and policy. Let's say it goes out over the T1, but the T1 is congested. It will look like the congestion is at the connection between BIG and T, because this is the first hop that will show packet loss. BUT...the congestion is actually at U's connection to T, which is irrelevant to the actual traffic path between U and S. So the user, at this point, calls up BIG and T and bitches about "Your peering connection is congested" when the real problem is that traffic completely unrelated to the user's problem is going via a congested path that is never used for connectivity between U and S.
If you add several providers into this loop, you can end up with a situation where traffic uses Sprint in one direction, but never hits a Sprint router in the other. This is actually very common. A user with slow downloads might be experiencing packet loss on the path from server to user, but not the other way around. In other words, the problem is a provider that never shows up on the user's traceroute!
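A trivial sketch of what you'd actually want to compare once you have a traceroute from each end (the AS numbers here are illustrative - 3257 and 1239 are Tinet and Sprint, the 64xxx ones are documentation-range placeholders):

    def asymmetric_ases(forward_asns, reverse_asns):
        """Given the AS numbers seen on the forward path (user -> server) and
        the reverse path (server -> user), list the networks that appear in
        only one direction - the ones you would never suspect from a
        one-sided traceroute."""
        fwd, rev = set(forward_asns), set(reverse_asns)
        return {"only_forward": sorted(fwd - rev), "only_reverse": sorted(rev - fwd)}

    print(asymmetric_ases([64501, 3257, 64510], [64510, 1239, 64501]))
    # -> {'only_forward': [3257], 'only_reverse': [1239]}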
Remember that the providers hand off the traffic as soon as possible to their peer. So, whoever receives the larger amount of traffic needs the bigger cross-country (or trans-oceanic) links. If one side transmits a T1's worth of data and the other side transmits an OC48's worth, only one of them needs OC48s across the country - the one receiving that traffic. That's why you hear about "traffic ratios". If the traffic is even both ways, both sides have to pay for the same amount of cross-country infrastructure to carry that traffic. So most providers won't peer for free with someone that sends, say, 10 times the amount of traffic that they will receive. It would end up costing a lot of money.
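As a hypothetical sketch of the kind of check a ratio clause boils down to (the 10:1 threshold and the link sizes are just examples):

    def ratio_ok(bits_sent, bits_received, max_ratio=10.0):
        """Is the imbalance between traffic sent and received within the
        allowed N:1 ratio of a settlement-free peering policy?"""
        if min(bits_sent, bits_received) == 0:
            return False
        return max(bits_sent, bits_received) / min(bits_sent, bits_received) <= max_ratio

    # An OC48 of traffic one way against a T1 the other (~2.4 Gbps vs ~1.5 Mbps)
    # fails any reasonable ratio policy.
    print(ratio_ok(2_488_000_000, 1_544_000))   # False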
Back to IPv6...that's interesting, but what does it have to do with IPv6?
Some providers want to do away with traffic ratio policies, multiple-location peering requirements, the rule against providing free service to the other's customers, and so on.
THAT is why you can't ping some sites from your HE tunnel. It's not just that providers won't peer. It's also that providers have rules to keep themselves from getting screwed.
Certainly, there are ways around some of this (for example, traffic ratios - if I make sure my own network is used to carry the cross-country traffic I send, rather than yours, then I've addressed that concern at a bit of increased expense for myself). But it's generally not worth doing until the providers involved are sufficiently large. Other things don't have a good technical fix, like not peering with your peer's customer - that's a business rule.
On Sat, Jul 9, 2011 at 5:25 PM, Bob Network <networkjoe@hotmail.com> wrote:
Why is IPv6 broken?
You should have titled your thread, "my own personal rant about Hurricane Electric's IPv6 strategy." You may also have left out the dodgy explanation of peering policies and technicalities, since these issues have been remarkably static since about 1996. The names of the networks change, but the song remains the same. This is not a novel subject on this mailing list. In fact, there have been a number of threads discussing HE's practices lately. If you are so interested in them, I suggest you review the list archive.
There are quite a few serious, unresolved technical problems with IPv6 adoption besides a few networks playing chicken with their collective customer-bases. The lack of will on the part of vendors and operators to participate in the IETF process, and make necessary and/or beneficial changes to the IPv6 standards, has left us in a situation where IPv6 implementation produces networks which are vulnerable to trivial DoS attacks and network intrusions.
The lack of will on the part of access providers to insist on functioning IPv6 support on CPE and BRAS platforms has even mid-sized ISPs facing nine-figure (as in, hundred-million-dollar) expenses to forklift-upgrade their access networks and end-user equipment, at a time when IPv6 seems to be the only way to continue growing the Internet.
The lack of will on the part of major transit networks, including Savvis, to deploy IPv6 capabilities to their customers means that customers caught in multi-year contracts may have no option for native connectivity. Cogent's policy of requiring a new contract, and from what I am still being told by some European customers, new money, from customers in exchange for provisioning IPv6 on existing circuits, means a simple technical project gets caught up in the complexities of budgeting and contract execution.
If you believe that the most serious problem facing IPv6 adoption is that HE / Level3 / Cogent don't carry a full table, you are living in a fantasy world.
--
Jeff S Wheeler <jsw@inconcepts.biz>
Sr Network Operator / Innovative Network Concepts
On 7/10/2011 10:14 AM, Jeff Wheeler wrote:
On Sat, Jul 9, 2011 at 5:25 PM, Bob Network <networkjoe@hotmail.com> wrote:
Why is IPv6 broken?
You should have titled your thread, "my own personal rant about Hurricane Electric's IPv6 strategy." You may also have left out the dodgy explanation of peering policies and technicalities, since these issues have been remarkably static since about 1996. The names of the networks change, but the song remains the same. This is not a novel subject on this mailing list. In fact, there have been a number of threads discussing HE's practices lately. If you are so interested in them, I suggest you review the list archive.
There are quite a few serious, unresolved technical problems with IPv6 adoption besides a few networks playing chicken with their collective customer-bases. The lack of will on the part of vendors and operators to participate in the IETF process, and make necessary and/or beneficial changes to the IPv6 standards, has left us in a situation where IPv6 implementation produces networks which are vulnerable to trivial DoS attacks and network intrusions.
The lack of will on the part of access providers to insist on functioning IPv6 support on CPE and BRAS platforms has even mid-sized ISPs facing nine-figure (as in, hundred-million-dollars) expenses to forklift-upgrade their access networks and end-user equipment, at a time when IPv6 seems to be the only way to continue growing the Internet.
The lack of will on the part of major transit networks, including Savvis, to deploy IPv6 capabilities to their customers, means that customers caught in multi-year contracts may have no option for native connectivity. Cogent's policy of requiring a new contract, and from what I am still being told by some European customers, new money, from customers in exchange for provisioning IPv6 on existing circuits, means a simple technical project gets caught up in the complexities of budgeting and contract execution.
+1
The lack of will on the part of the IETF to attract input from and involve operators in their processes (which I would posit is a critical element in the process). And the lack of will/foresight on the part of the IETF to respond to input from operators that they have received.
If fingers can be pointed at both sides, i.e. operators and IETF, then both sides are to blame. The IETF only has value if they are publishing "standards" that work properly in the real world. If the implementers of these "standards" say that they are broken, then the IETF has failed to provide value.
If you believe that the most serious problem facing IPv6 adoption is that HE / Level3 / Cogent don't carry a full table, you are living in a fantasy world.
+1 -DMM
On 2011-07-10 17:56 , David Miller wrote: [..]
+1
The lack of will on the part of the IETF to attract input from and involve operators in their processes (which I would posit is a critical element in the process).
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
You are on NANOG out of your own free will; the same applies to the IETF. If you don't participate here your voice is not heard either, just like at the IETF.
Peeking at the ipv6@ietf.org member list, I don't see your name there. You can sign up here: https://www.ietf.org/mailman/listinfo/ipv6
Greets,
 Jeroen
On 7/10/2011 12:16 PM, Jeroen Massar wrote:
On 2011-07-10 17:56 , David Miller wrote: [..]
+1
The lack of will on the part of the IETF to attract input from and involve operators in their processes (which I would posit is a critical element in the process).
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
You are on NANOG out of your own free will, the same applies to the IETF. If you don't participate here your voice is not heard either, just like at the IETF.
True, anyone can participate in the IETF processes. However, if key players do not participate, then something is broken. I will take my lumps for not participating.
My point was - "If fingers can be pointed at both sides, i.e. operators and IETF, then both sides are to blame."
In the corporate world, if I were contemplating changing the framework of a system, then I would need to get buy-in / agreement from the stakeholders of that system. If I were going to change the framework behind an HR system, then the HR managers and HR systems experts would all have to agree to the change. If I changed the framework and broke all of the HR systems, and then told my boss that I scheduled a meeting, nobody from HR showed up, and therefore I used that as agreement in their absence, then I would get fired. Yes, I understand that corporate environments are very different from the IETF environment, but there are perhaps some lessons to learn from the corporate environment.
Most RFCs operate within a meritocracy. A standard can be proposed for "Example Protocol v10" and if nobody likes it outside of the IETF, then it is not implemented by anyone and it eventually dies on the vine. IPv6 is "different" in that it is the underpinning of every other protocol/standard that will exist on or operate over the internet for the next 20-30 years (probably). We had 10+ years of IPv6 not being implemented by anyone (seriously), yet it didn't die on the vine. Perhaps the process for "Example Protocol v10" and the process for IPv6 should be different - given the fundamental difference in their scope.
No, we can't change the past. "Those who do not learn from history are doomed to repeat it." - Santayana. I would say that the many variables that got us to where we are today - out of IPv4 addresses and faced with IPv6, which many believe is fundamentally flawed, as our only way forward - hold some lessons to be learned... but perhaps this is just me - and if so, I apologize for the noise.
Peeking at the ipv6@ietf.org member list, I don't see your name there. You can signup here: https://www.ietf.org/mailman/listinfo/ipv6
Absolutely true, fixed.
Greets, Jeroen
-DMM
On Sun, Jul 10, 2011 at 1:41 PM, David Miller <dmiller@tiggee.com> wrote:
On 7/10/2011 12:16 PM, Jeroen Massar wrote:
You are on NANOG out of your own free will, the same applies to the IETF. If you don't participate here your voice is not heard either, just like at the IETF.
True, anyone can participate in the IETF processes. However, if key players do not participate, then something is broken. I will take my lumps for not participating.
My point was - "If fingers can be pointed at both sides, i.e. operators and IETF, then both sides are to blame."
Hi David,
This is a process problem, not an individual problem.
The IETF is run by volunteers. They volunteer because they find designing protocols to be fun. For the most part, operators are not entertained by designing network protocols. So, for the most part, they don't participate.
This is not going to change. And it also isn't the problem -- people who enjoy the work tend to do better work.
The problem is that the IETF routinely exceeds the scope of designing network protocols. Participants in the working groups take what are fundamentally operations issues unto themselves. They do so knowing they lack adequate participation by network operators. And the process that leads to RFCs offers inadequate checks and balances to mitigate that behavior.
Consider, for example, RFC 3484. That's the one that determines how an IPv6 capable host selects which of a group of candidate IPv4 and IPv6 addresses for a particular host name gets priority. How is a server's address priority NOT an issue that should be managed at an operations level by individual server administrators? Yet the working group which produced it came up with a static prioritization that is the root cause of a significant portion of the IPv6 deployment headaches we face.
I don't know the whole solution to this problem, but I'm pretty sure I know the first step.
Today's RFC candidates are required to call out IANA considerations and security considerations in special sections. They do so because each of these areas has landmines that the majority of working groups are ill equipped to consider on their own.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
Food for thought.
Regards,
Bill Herrin
--
William D. Herrin ................ herrin@dirtside.com bill@herrin.us
3005 Crane Dr. ...................... Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004
On Jul 10, 2011, at 12:23 PM, William Herrin wrote:
On Sun, Jul 10, 2011 at 1:41 PM, David Miller <dmiller@tiggee.com> wrote:
On 7/10/2011 12:16 PM, Jeroen Massar wrote:
You are on NANOG out of your own free will, the same applies to the IETF. If you don't participate here your voice is not heard either, just like at the IETF.
True, anyone can participate in the IETF processes. However, if key players do not participate, then something is broken. I will take my lumps for not participating.
My point was - "If fingers can be pointed at both sides, i.e. operators and IETF, then both sides are to blame."
Hi David,
This is a process problem, not an individual problem.
The IETF is run by volunteers. They volunteer because they find designing protocols to be fun. For the most part, operators are not entertained by designing network protocols. So, for the most part they don't partiticpate.
This is not going to change. And it also isn't the problem -- people who enjoy the work tend to do better work.
The problem is that the IETF routinely exceeds the scope of designing network protocols. Participants in the working groups take what are fundamentally operations issues unto themselves. They do so knowing they lack adequate participation by network operators. And the process that leads to RFCs offers inadequate checks and balances to mitigate that behavior.
Consider, for example, RFC 3484. That's the one that determines how an IPv6 capable host selects which of a group of candidate IPv4 and IPv6 addresses for a particular host name gets priority. How is a server's address priority NOT an issue that should be managed at an operations level by individual server administrators? Yet the working group which produced it came up with a static prioritization that is the root cause of a significant portion of the IPv6 deployment headaches we face.
3484 specifies a static default. By definition, defaults in absence of operator configuration kind of have to be static. Having a reasonable and expected set of defaults documented in an RFC provides a known quantity for what operators can/should expect from hosts they have not configured. I see nothing wrong with RFC 3484 other than I would agree that the choices made were suboptimal. Mostly that was based on optimism and a lack of experience available at the time of writing. There is another RFC and there are APIs and most operating systems have configuration mechanisms where an operator CAN set that to something other than the 3484 defaults.
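For reference, here is a rough sketch (Python; simplified to precedence only, ignoring labels, scope, and the rest of the rule set) of the RFC 3484 default policy table and how it orders candidate destination addresses on an unconfigured host - native IPv6 beats 6to4, which beats plain IPv4:

    import ipaddress

    # RFC 3484 section 2.1 default policy table (precedence values only).
    DEFAULT_POLICY = [
        (ipaddress.ip_network("::1/128"), 50),
        (ipaddress.ip_network("::/0"), 40),
        (ipaddress.ip_network("2002::/16"), 30),       # 6to4
        (ipaddress.ip_network("::/96"), 20),           # IPv4-compatible (deprecated)
        (ipaddress.ip_network("::ffff:0:0/96"), 10),   # IPv4-mapped, i.e. plain IPv4
    ]

    def precedence(addr):
        """Precedence of a candidate destination: longest matching prefix wins."""
        a = ipaddress.ip_address(addr)
        if a.version == 4:                             # treat IPv4 as IPv4-mapped
            a = ipaddress.ip_address("::ffff:" + addr)
        return max((net.prefixlen, prec) for net, prec in DEFAULT_POLICY if a in net)[1]

    candidates = ["192.0.2.10", "2002:cb00:7109::1", "2001:db8::1"]
    print(sorted(candidates, key=precedence, reverse=True))
    # -> ['2001:db8::1', '2002:cb00:7109::1', '192.0.2.10']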
I don't know the whole solution to this problem, but I'm pretty sure I know the first step.
I don't know what you had in mind, but, reading RFC 5014 would be my suggestion as a good starting point.
Today's RFC candidates are required to call out IANA considerations and security considerations in special sections. They do so because each of these areas has landmines that the majority of working groups are ill equipped to consider on their own.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
I think this would be a good idea, actually. It would probably be more effective to propose it to IETF than to NANOG, however. Owen
On Sun, Jul 10, 2011 at 4:22 PM, Owen DeLong <owen@delong.com> wrote:
On Jul 10, 2011, at 12:23 PM, William Herrin wrote:
Consider, for example, RFC 3484. That's the one that determines how an IPv6 capable host selects which of a group of candidate IPv4 and IPv6 addresses for a particular host name gets priority. How is a server's address priority NOT an issue that should be managed at an operations level by individual server administrators? Yet the working group which produced it came up with a static prioritization that is the root cause of a significant portion of the IPv6 deployment headaches we face.
3484 specifies a static default. By definition, defaults in absence of operator configuration kind of have to be static. Having a reasonable and expected set of defaults documented in an RFC provides a known quantity for what operators can/should expect from hosts they have not configured. I see nothing wrong with RFC 3484 other than I would agree that the choices made were suboptimal. Mostly that was based on optimism and a lack of experience available at the time of writing.
Hi Owen, A more optimal answer would have been to make AAAA records more like MX or SRV records -- with explicit priorities the clients are encouraged to follow. I wasn't there but I'd be willing to bet there was a lonely voice in the room saying, hey, this should be controlled by the sysadmin. A lonely voice that got shouted down.
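For what it's worth, the MX/SRV-style model being referred to looks roughly like this (a simplified sketch of RFC 2782 selection; the record values are hypothetical):

    import random

    def srv_order(records):
        """Order (priority, weight, target) records the way RFC 2782 tells SRV
        clients to: lowest-priority group first, weighted-random within a group.
        The point is that the server operator, not a static host table,
        decides which target gets tried first."""
        ordered = []
        for prio in sorted({r[0] for r in records}):
            group = [r for r in records if r[0] == prio]
            while group:
                pick = random.choices(group, weights=[r[1] or 1 for r in group])[0]
                ordered.append(pick[2])
                group.remove(pick)
        return ordered

    # Always try the IPv6 front end first, fall back to the IPv4 one.
    print(srv_order([(10, 100, "v6.example.net"), (20, 100, "v4.example.net")]))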
Today's RFC candidates are required to call out IANA considerations and security considerations in special sections. They do so because each of these areas has landmines that the majority of working groups are ill equipped to consider on their own.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
I think this would be a good idea, actually. It would probably be more effective to propose it to IETF than to NANOG, however.
If the complaint is that the IETF doesn't adequately listen to the operations folk, then I think it makes sense to consult the operations folks early and often on potential fixes. If folks here think it would help, -that- is when I'll take it to the IETF.
Regards,
Bill Herrin
--
William D. Herrin ................ herrin@dirtside.com bill@herrin.us
3005 Crane Dr. ...................... Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004
On Jul 10, 2011, at 11:57 PM, William Herrin wrote:
On Sun, Jul 10, 2011 at 4:22 PM, Owen DeLong <owen@delong.com> wrote:
On Jul 10, 2011, at 12:23 PM, William Herrin wrote:
Consider, for example, RFC 3484. That's the one that determines how an IPv6 capable host selects which of a group of candidate IPv4 and IPv6 addresses for a particular host name gets priority. How is a server's address priority NOT an issue that should be managed at an operations level by individual server administrators? Yet the working group which produced it came up with a static prioritization that is the root cause of a significant portion of the IPv6 deployment headaches we face.
3484 specifies a static default. By definition, defaults in absence of operator configuration kind of have to be static. Having a reasonable and expected set of defaults documented in an RFC provides a known quantity for what operators can/should expect from hosts they have not configured. I see nothing wrong with RFC 3484 other than I would agree that the choices made were suboptimal. Mostly that was based on optimism and a lack of experience available at the time of writing.
Hi Owen,
A more optimal answer would have been to make AAAA records more like MX or SRV records -- with explicit priorities the clients are encouraged to follow. I wasn't there but I'd be willing to bet there was a lonely voice in the room saying, hey, this should be controlled by the sysadmin. A lonely voice that got shouted down.
Give me a break... multiple implementations have chosen to tweak the algorithm independently and at various times.
It's just an rfc, not the gospel according to richard draves.
"Acknowledgments
The author would like to acknowledge the contributions of the IPng Working Group, particularly Marc Blanchet, Brian Carpenter, Matt Crawford, Alain Durand, Steve Deering, Robert Elz, Jun-ichiro itojun Hagino, Tony Hain, M.T. Hollinger, JINMEI Tatuya, Thomas Narten, Erik Nordmark, Ken Powell, Markku Savela, Pekka Savola, Hesham Soliman, Dave Thaler, Mauro Tortonesi, Ole Troan, and Stig Venaas. In addition, the anonymous IESG reviewers had many great comments and suggestions for clarification."
Today's RFC candidates are required to call out IANA considerations and security considerations in special sections. They do so because each of these areas has landmines that the majority of working groups are ill equipped to consider on their own.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
I think this would be a good idea, actually. It would probably be more effective to propose it to IETF than to NANOG, however.
If the complaint is that the IETF doesn't adequately listen to the operations folk, then I think it makes sense to consult the operations folks early and often on potential fixes. If folks here think it would help, -that- is when I'll it to the IETF.
Regards, Bill Herrin
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
On Mon, Jul 11, 2011 at 3:08 AM, Joel Jaeggli <joelja@bogus.com> wrote:
On Jul 10, 2011, at 11:57 PM, William Herrin wrote:
A more optimal answer would have been to make AAAA records more like MX or SRV records -- with explicit priorities the clients are encouraged to follow. I wasn't there but I'd be willing to bet there was a lonely voice in the room saying, hey, this should be controlled by the sysadmin. A lonely voice that got shouted down.
Give me a break... multiple implementations have chosen to tweak the algorithm independently and at various times.
It's just an rfc, not the gospel according to richard draves.
" Acknowledgments
The author would like to acknowledge the contributions of the IPng Working Group, particularly Marc Blanchet, Brian Carpenter, Matt Crawford, Alain Durand, Steve Deering, Robert Elz, Jun-ichiro itojun Hagino, Tony Hain, M.T. Hollinger, JINMEI Tatuya, Thomas Narten, Erik Nordmark, Ken Powell, Markku Savela, Pekka Savola, Hesham Soliman, Dave Thaler, Mauro Tortonesi, Ole Troan, and Stig Venaas. In addition, the anonymous IESG reviewers had many great comments and suggestions for clarification. "
Joel, I am giving you a break. Instead of calling this list of folks to the carpet over a failure of imagination that by the time we've ubiquitously deployed IPv6 will have been the root cause of billions if not tens of billions of dollars in needless industry expense, I'm trying to move the discussion past the errors and focus on ways to help the next team of smart, selfless and dedicated individuals avoid sullying their results with a similar mistake. Denial keeps the discussion focused on the errors. You don't want that and neither do I.
Today's RFC candidates are required to call out IANA considerations and security considerations in special sections. They do so because each of these areas has landmines that the majority of working groups are ill equipped to consider on their own.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
Do you find this adjustment objectionable? Do you have other fresh ideas to float? Something better than the tired refrain about operators not showing up? 'Cause I have to tell you: Several years ago I picked a working group and I showed up. And I faced and lost the argument against the persistent certainty on the workability of ridiculous deployment scenarios by folks who never managed any system larger than a software development lab. And I stopped participating in the group about a year ago as the core of participants who hadn't given up wandered off into la la land. Regards, Bill Herrin -- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
On Jul 11, 2011, at 8:13 AM, William Herrin wrote:
Today's RFC candidates are required to call out IANA considerations and security considerations in special sections. They do so because each of these areas has landmines that the majority of working groups are ill equipped to consider on their own.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
Do you find this adjustment objectionable? Do you have other fresh ideas to float? Something better than the tired refrain about operators not showing up?
The operations area has a directorate. It reviews basically every draft in front of the IESG. I'm on it. Am I not an operator? Do I think that adding yet another required section to an internet draft is going to increase its quality? No I do not.
'Cause I have to tell you: Several years ago I picked a working group and I showed up. And I faced and lost the argument against the persistent certainty on the workability of ridiculous deployment scenarios by folks who never managed any system larger than a software development lab. And I stopped participating in the group about a year ago as the core of participants who hadn't given up wandered off into la la land.
Regards, Bill Herrin
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
On Mon, Jul 11, 2011 at 11:20 AM, Joel Jaeggli <joelja@bogus.com> wrote:
On Jul 11, 2011, at 8:13 AM, William Herrin wrote:
Today's RFC candidates are required to call out IANA considerations and security considerations in special sections. They do so because each of these areas has landmines that the majority of working groups are ill equipped to consider on their own.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
Do you find this adjustment objectionable? Do you have other fresh ideas to float? Something better than the tired refrain about operators not showing up?
The operations area has a directorate. It reviews basically every draft in front of the IESG. I'm on it. Am I not an operator?
Well, you work at Zynga, a company which makes facebook games. Before that you worked at Nokia, a company which makes phones but doesn't run phone networks. Before that it was Check Point Software, a company which makes firewalls but doesn't run networks. And before that it was the University of Oregon.
Do you believe any of those roles offered you the perspective you'd gain working for an ISP, telco or MSO? Are you not an operator? Sure, why not. It's a big tent. Are you well qualified to represent operator interests before the IETF? You really haven't been speaking to the issues I had to deal with when I led an ISP, and you've expressed little respect for the validity of issues I face now. But you do show up.
Regards,
Bill Herrin
--
William D. Herrin ................ herrin@dirtside.com bill@herrin.us
3005 Crane Dr. ...................... Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004
Well, you work at Zynga, a company which makes facebook games. Before that you worked at Nokia, company which makes phones but doesn't run phone networks. Before that it was Check Point Software, a company which makes firewalls but doesn't run networks. And before that it was the University of Oregon.
down to the ad homina pretty quickly. congrats. fyi, what joel does/did at those companies is/was build and run networks and data centers. next ad hominem attack? randy
On Mon, Jul 11, 2011 at 11:20 AM, Joel Jaeggli <joelja@bogus.com> wrote:
On Jul 11, 2011, at 8:13 AM, William Herrin wrote:
Today's RFC candidates are required to call out IANA considerations and security considerations in special sections. They do so because each of these areas has landmines that the majority of working groups are ill equipped to consider on their own.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
Do you find this adjustment objectionable?
Do I think that adding yet another required section to an internet draft is going to increase its quality? No I do not.
Joel,
You may be right. Calling out IANA considerations doesn't seem to have made the IETF any smarter on the shared ISP IPv4 space. And I have no idea if calling out security implications has helped reduce security-related design flaws.
On the other hand, calling out ops issues in RFCs is a modest reform that at worst shouldn't hurt anything. That beats my next best idea: asking the ops area to schedule its meetings with the various NOG meetings instead of with the rest of the IETF so that the attendance is ops who dabble in development instead of developers who dabble in ops.
You disagree? What are your thoughts on fixing the problem?
Regards, Bill Herrin
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
On Mon, Jul 11, 2011 at 3:18 PM, William Herrin <bill@herrin.us> wrote:
On the other hand, calling out ops issues in RFCs is a modest reform that at worst shouldn't hurt anything. That beats my next best idea:
I think if this were done, some guy like me would spend endless hours arguing with others about what should and should not be documented in this proposed section, without it actually benefiting the process or improving the underlying protocol function / specification. Let me give you an example:
BGP messages, which are up to 4KB, need to be expanded to support future features like as-path signing. Randy Bush proposes to extend them to 65,535 octets, the maximum size without significantly changing the message header. This raises a few concerns which I label as operational; for example, off-by-one bugs in code can fail to be detected by a neighboring BGP speaker in some circumstances, because an age-old (since BGP 1) idiot check in the protocol is being silently removed. If you ask me, that is operational and belongs in such a section. I'm sure others will disagree. So we would have a bunch of arguing over whether or not to call this out specifically.
Another person believes that expanding the message will affect some vendors' custom TCP stacks, due to window size considerations. I might think that is a developer problem and the affected vendors should fix their crappy TCP implementations, but it might produce unusual stalling problems, etc. which operators have to troubleshoot. Is that an operational issue? Should it be documented?
There can be many "operational concerns" when creating or modifying a protocol specification, and not every person will agree on what belongs and what doesn't. However, I do not think the requirement to document them will improve the process or the protocols. It will only add work. Besides, you want "IETF people" who are claimed not to understand operational problems to figure them out and document them in the RFCs? I do not think this will be helpful. More hands-on operators participating in their process is what is needed.
-- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
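(A minimal sketch, not part of Jeff's message, of the kind of "idiot check" he's referring to: under RFC 4271 a receiver can reject any BGP message whose Length field exceeds 4096 octets, which incidentally catches a lot of off-by-one and corruption bugs; raise the ceiling to 65,535 and that particular sanity check effectively vanishes. The constants and function name below are illustrative assumptions, not anyone's implementation.)

    import struct

    BGP_HEADER_LEN = 19          # 16-byte marker + 2-byte length + 1-byte type (RFC 4271)
    CLASSIC_MAX_MSG = 4096       # historical maximum BGP message size
    EXTENDED_MAX_MSG = 65535     # proposed extended maximum

    def check_bgp_header(raw: bytes, max_msg: int = CLASSIC_MAX_MSG) -> int:
        """Validate a BGP message header and return the declared length.

        With max_msg=4096 a corrupted or oversized Length field is rejected here;
        with max_msg=65535 almost any 16-bit value passes, so this particular
        sanity check effectively disappears.
        """
        if len(raw) < BGP_HEADER_LEN:
            raise ValueError("short read: not a full BGP header")
        length, msg_type = struct.unpack("!HB", raw[16:19])
        if not BGP_HEADER_LEN <= length <= max_msg:
            raise ValueError(f"bad Length field {length} (limit {max_msg})")
        return length

    # A Length of 5000 is caught under the classic limit but accepted under the
    # extended one -- the point Jeff raises about silently losing the old check.
    hdr = b"\xff" * 16 + struct.pack("!HB", 5000, 2)   # type 2 = UPDATE
    try:
        check_bgp_header(hdr)                          # raises under the 4096-octet rule
    except ValueError as e:
        print("classic limit rejects:", e)
    print("extended limit accepts:", check_bgp_header(hdr, EXTENDED_MAX_MSG))

Running it shows the same 19-byte header being rejected under the classic limit and accepted under the extended one.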
On Mon, Jul 11, 2011 at 3:41 PM, Jeff Wheeler <jsw@inconcepts.biz> wrote:
On Mon, Jul 11, 2011 at 3:18 PM, William Herrin <bill@herrin.us> wrote:
On the other hand, calling out ops issues in RFCs is a modest reform that at worst shouldn't hurt anything. That beats my next best idea:
I think if this were done, some guy like me would spend endless hours arguing with others about what should and should not be documented in this proposed section, without it actually benefiting the process or improving the underlying protocol function / specification. Let me give you an example:
BGP Messages, which are up to 4KB, need to be expanded to support future features like as-path signing. Randy Bush proposes to extend them to 65,535 octets, the maximum size without significantly changing the message header. This raises a few concerns which I label as operational, for example, off-by-one bugs in code can fail to be detected by a neighboring BGP speaker in some circumstances, because an age-old (since BGP 1) idiot check in the protocol is being silently removed.
If you ask me, that is operational and belongs in such a section.
Hi Jeff, Thanks for your thoughtful response. Question: It seems to me like figuring out what is or isn't a security issue to be called out has exactly the same pitfalls. How do you deal with it?
Besides, you want "IETF people" who are claimed not to understand operational problems to figure them out and document them in the RFCs? I do not think this will be helpful. More hands-on operators participating in their process is what is needed.
You're an "IETF person" trying to figure out what is or isn't an operations issue so that you can call it out. How might you go about figuring that out?
Personally, I might ask a few ops: "Lend me your ear for three minutes to tell you about what I'm working on. Now that I've given you the pitch, is this something you'd like to control in a configuration or is it something you want to -just work-?" "Control" = operations issue. "Just work" = not an operations issue.
Regards, Bill
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
On Jul 11, 2011, at 12:18 PM, William Herrin wrote:
On Mon, Jul 11, 2011 at 11:20 AM, Joel Jaeggli <joelja@bogus.com> wrote:
On Jul 11, 2011, at 8:13 AM, William Herrin wrote:
> Today's RFC candidates are required to call out IANA considerations
> and security considerations in special sections. They do so because
> each of these areas has landmines that the majority of working groups
> are ill equipped to consider on their own.
>
> There should be an operations callout as well -- a section where
> proposed operations defaults (as well as statics for which a solid
> case can be made for an operations tunable) are extracted from the
> thick of it and offered for operator scrutiny prior to publication of
> the RFC.
Do you find this adjustment objectionable?
Do I think that adding yet another required section to an internet draft is going to increase its quality? No I do not.
Joel,
You may be right. Calling out IANA considerations doesn't seem to have made the IETF any smarter on the shared ISP IPv4 space. And I have no idea if calling out security implications has helped reduce security-related design flaws.
On the other hand, calling out ops issues in RFCs is a modest reform that at worst shouldn't hurt anything.
To my mind, one of the really good criteria for an operational considerations document is some actual experience running it.
That beats my next best idea: asking the ops area to schedule its meetings with the various NOG meetings instead of with the rest of the IETF so that the attendance is ops who dabble in development instead of developers who dabble in ops.
The OPS area works on OPS and Management. Protocol development of the sort you're describing generally occurs in working groups chartered in the Internet or Routing areas... At least one of the ops chairs is on this list and attends NANOG regularly, etc. Participants, especially those that actually do the work, are the important part as far as I'm concerned. Rough consensus is an ugly and imperfect business; it should be recognized that not everyone is going to come away from every exchange with what they want.
You disagree? What are your thoughts on fixing the problem?
I'm not sure that we agree on the dimensions of the problem. On the question of "IPv6 is broken":
* You're going to have to cope with what you have and can squeeze out of vendors in the near term. Implementors don't change that fast.
* People have to show up with the problem statement and stick around to do the work.
* The outcomes are not always pretty.
I hope that my time is productively employed. http://tools.ietf.org/html/draft-gashinsky-v6nd-enhance-00
Regards, Bill Herrin
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
You disagree? What are your thoughts on fixing the problem?
I'm not sure that we agree on the dimensions of the problem.
On the question of "IPv6 is broken":
* You're going to have to cope with what you have and can squeeze out of vendors in the near term. Implementors don't change that fast.
* People have to show up with the problem statement and stick around to do the work.
* The outcomes are not always pretty.
I don't think that has anything to do with the problem Bill is trying to address. While it is the topic that started this thread, the problem that I think Bill is trying to address, and which I agree needs to be addressed, is that IETF standards are developed with what has become increasingly obvious as insufficient operator input.
Yes, operators are partially to blame in that decisions are made by those that show up, and operators have a hard time showing up to the IETF process for a variety of reasons that are mostly related to the realities of running day-to-day operations and not really something the IETF can easily address. However, part of the problem also relates to the ways in which the IETF is a particularly difficult place for operators to credibly participate (the amount of ego and religion in some of the working groups, the need for a thick skin if you want to make a statement that goes counter to the current dogma, the time-consuming nature of meaningful participation, etc.).
I don't pretend to have answers to all of these problems, but I think there first needs to be recognition and consensus that the lack of operator input into the IETF is becoming increasingly problematic and is impacting the ability to deploy what is developed by the IETF.
Owen
Once upon a time, there was only the IETF, then NOGs came and standards became sloppy....
On Jul 11, 2011, at 2:54 PM, Franck Martin wrote:
Once upon a time, there was only the IETF, then NOGs came and standards became sloppy....
Uh, no... Really not. Read some of the earliest standards documents and you'll find that they are pretty sloppy, but the community back then (the predecessor to NOGs) was small enough that people could develop and deploy workarounds and feed those resolutions into superseding RFCs. Today, there are many more operators, and the IETF has become a much larger and more complex set of bodies as well. Owen
On Mon, Jul 11, 2011 at 5:10 PM, Joel Jaeggli <joelja@bogus.com> wrote:
On Jul 11, 2011, at 12:18 PM, William Herrin wrote:
On Mon, Jul 11, 2011 at 11:20 AM, Joel Jaeggli <joelja@bogus.com> wrote:
On Jul 11, 2011, at 8:13 AM, William Herrin wrote:
>> Today's RFC candidates are required to call out IANA considerations
>> and security considerations in special sections. They do so because
>> each of these areas has landmines that the majority of working groups
>> are ill equipped to consider on their own.
>>
>> There should be an operations callout as well -- a section where
>> proposed operations defaults (as well as statics for which a solid
>> case can be made for an operations tunable) are extracted from the
>> thick of it and offered for operator scrutiny prior to publication of
>> the RFC.
Do you find this adjustment objectionable?
Do I think that adding yet another required section to an internet draft is going to increase its quality? No I do not.
Joel,
You may be right. Calling out IANA considerations doesn't seem to have made the IETF any smarter on the shared ISP IPv4 space. And I have no idea if calling out security implications has helped reduce security-related design flaws.
On the other hand, calling out ops issues in RFCs is a modest reform that at worst shouldn't hurt anything.
To my mind, one of the really good criteria for an operational considerations document is some actual experience running it.
Hi Joel,
I'm not looking for anything that sophisticated. I just want a list of "These are the things that can be tuned at an operations level (plus the defaults) and these are the things that can't be tuned but someone in the discussion thought a reasonable person might want them to be."
The idea is that an ops guy should be able to read the three-paragraph intro, jump to the list of tunables and then be able to offer feedback along the lines of, "Whoa! Of course X should be tunable, are you kidding? Here's a rough description of where I want to configure it." and "I'm never going to alter Y. You can make it configurable, but I'd just as soon deal with everybody having the same value of Y."
Heck, make it multiple choice. 1 is "no chance I'll ever want to change this value" and 5 is "I'll want to set this value case by case."
That beats my next best idea: asking the ops area to schedule its meetings with the various NOG meetings instead of with the rest of the IETF so that the attendance is ops who dabble in development instead of developers who dabble in ops.
The OPS area works on OPS and Management. Protocol development of the sort you're describing generally occurs in working-groups chartered in the Internet or Routing areas...
A moment ago you said, the ops area "reviews basically every draft in front of the IESG."
Participants, especially those that actually do the work are the important part as far as I'm concerned.
I don't disagree. But producing flawed standards does nobody any good, least of all the folks who poured their hearts and souls into making them.
Rough consensus is an ugly and imperfect business; it should be recognized that not everyone is going to come away from every exchange with what they want.
Which, if you were talking about a rough consensus of operations folk addressing operations issues, would be just fine. This is basically what happens at the address registries like ARIN, and it more or less works. That's broken in the IETF. The ops folks aren't there for the consensus checks. As a consequence, ops issues are being decided not with -rough- consensus but with -false- consensus. False consensus falls apart when you try to bring the excluded folks back to the party, as you must in the operators' case with any standard the IETF produces.
You disagree? What are your thoughts on fixing the problem?
I'm not sure that we agree on the dimensions of the problem.
on the question of ipv6 is broken:
* You're going to have to cope with what you have and can squeeze out of vendors in the near term. Implementors don't change that fast.
* People have to show up with the problem statement and stick around to do the work.
* The outcomes are not always pretty.
V6 poses some difficult challenges and you're right that in the short term we're basically stuck with what is and have to make the best of it. But V6 isn't my focus in this thread. The ops are sufficiently irate at this point that they'll keep pounding on the WGs and the vendors until fixes happen.
My focus in this thread is this: how do we help the next teams avoid the discourtesy and the smackdown that the v6 teams are getting for not adequately recognizing the ops' issues? These guys should have been heroes, but instead they screwed the pooch and everybody's paying for it. How do we fix the systemic problems so that next time they are heroes?
Regards, Bill Herrin
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
On Jul 11, 2011, at 3:37 PM, William Herrin wrote:
On Mon, Jul 11, 2011 at 5:10 PM, Joel Jaeggli <joelja@bogus.com> wrote:
On Jul 11, 2011, at 12:18 PM, William Herrin wrote:
On Mon, Jul 11, 2011 at 11:20 AM, Joel Jaeggli <joelja@bogus.com> wrote:
On Jul 11, 2011, at 8:13 AM, William Herrin wrote:
>>> Today's RFC candidates are required to call out IANA considerations
>>> and security considerations in special sections. They do so because
>>> each of these areas has landmines that the majority of working groups
>>> are ill equipped to consider on their own.
>>>
>>> There should be an operations callout as well -- a section where
>>> proposed operations defaults (as well as statics for which a solid
>>> case can be made for an operations tunable) are extracted from the
>>> thick of it and offered for operator scrutiny prior to publication of
>>> the RFC.
Do you find this adjustment objectionable?
Do I think that adding yet another required section to an internet draft is going to increase its quality? No I do not.
Joel,
You may be right. Calling out IANA considerations doesn't seem to have made the IETF any smarter on the shared ISP IPv4 space. And I have no idea if calling out security implications has helped reduce security-related design flaws.
On the other hand, calling out ops issues in RFCs is a modest reform that at worst shouldn't hurt anything.
To my mind, one of the really good criteria for an operational considerations document is some actual experience running it.
Hi Joel,
I'm not looking for anything that sophisticated. I just want a list of "These are the things that can be tuned at an operations level (plus the defaults) and these are the things that can't be tuned but someone in the discussion thought a reasonable person might want them to be." The idea is that an ops guy should be able to read the three-paragraph intro, jump to the list of tunables and then be able to offer feedback along the lines of, "Whoa! Of course X should be tunable, are you kidding? Here's a rough description of where I want to configure it." and "I'm never going to alter Y. You can make it configurable, but I'd just as soon deal with everybody having the same value of Y."
Heck, make it multiple choice. 1 is "no chance I'll ever want to change this value" and 5 is "I'll want to set this value case by case."
That beats my next best idea: asking the ops area to schedule its meetings with the various NOG meetings instead of with the rest of the IETF so that the attendance is ops who dabble in development instead of developers who dabble in ops.
The OPS area works on OPS and Management. Protocol development of the sort you're describing generally occurs in working-groups chartered in the Internet or Routing areas...
A moment ago you said, the ops area "reviews basically every draft in front of the IESG."
I said there is an ops directorate that reviews basically every draft in front of the IESG... much like there are gen-art reviews, security reviews, IANA reviews, etc. The principal work on a draft is already done by the time that occurs, unless the IESG returns the draft to the working group. <SNIP>
Participants, especially those that actually do the work are the important part as far as I'm concerned.
My focus in this thread is this: how do we help the next teams avoid the discourtesy and the smackdown that the v6 teams are getting for not adequately recognizing the ops' issues. These guys should have been heroes but instead they screwed the pooch and everybody's paying for it. How do we fix the systemic problems so that next time they are heroes?
IPNG was a long time ago. Things have changed and will continue to change, because standards are written by humans and cemented with consensus, which is impermanent and changeable. Rationales change, requirements change, and things should evolve; that's not failure, it's healthy.
Regards, Bill Herrin
On Mon, 11 Jul 2011, William Herrin wrote:
If the complaint is that the IETF doesn't adequately listen to the operations folk, then I think it makes sense to consult the operations folks early and often on potential fixes. If folks here think it would help, -that- is when I'll take it to the IETF.
I started participating in the IETF 1-2 years ago. Coming from a Fidonet background, the threshold of entry felt very low; as long as you make any kind of sense, people will discuss with you there and it doesn't matter who you are. You don't even have to go to the meetings (I've only been to a single one).
I encourage everybody to participate, at least to subscribe to the WG mailing lists, keep a look out for the draft announcements and give feedback on those. If we in the ISP business don't do this, the show will be run by the vendors and academics (as is the case right now). They're saying "come to us", you're saying "come to us", and as long as both do this the rate of communication is going to be limited. What is needed is more people with operational backgrounds.
For instance, I pitched the idea that ended up as a draft, dunno what will come of it: <http://www.ietf.org/mail-archive/web/isis-wg/current/msg02556.html> This has a purely operational background and the puritans didn't like it (they didn't even understand why one would want to do it like that), but after a while I feel I received some traction and it might actually end up as a protocol enhancement that will help some ISPs in their daily work. Even something like your IGP isn't "done", and can be enhanced even if it takes time.
-- Mikael Abrahamsson email: swmike@swm.pp.se
Sent from my iPad On Jul 11, 2011, at 2:57, William Herrin <bill@herrin.us> wrote:
On Sun, Jul 10, 2011 at 4:22 PM, Owen DeLong <owen@delong.com> wrote:
On Jul 10, 2011, at 12:23 PM, William Herrin wrote:
Consider, for example, RFC 3484. That's the one that determines how an IPv6 capable host selects which of a group of candidate IPv4 and IPv6 addresses for a particular host name gets priority. How is a server's address priority NOT an issue that should be managed at an operations level by individual server administrators? Yet the working group which produced it came up with a static prioritization that is the root cause of a significant portion of the IPv6 deployment headaches we face.
3484 specifies a static default. By definition, defaults in absence of operator configuration kind of have to be static. Having a reasonable and expected set of defaults documented in an RFC provides a known quantity for what operators can/should expect from hosts they have not configured. I see nothing wrong with RFC 3484 other than I would agree that the choices made were suboptimal. Mostly that was based on optimism and a lack of experience available at the time of writing.
Hi Owen,
A more optimal answer would have been to make AAAA records more like MX or SRV records -- with explicit priorities the clients are encouraged to follow. I wasn't there but I'd be willing to bet there was a lonely voice in the room saying, hey, this should be controlled by the sysadmin. A lonely voice that got shouted down.
Uh, right, because the average system administrator wants the remote host telling his systems which address to prefer? Besides, that would have been DESTINATION address selection, not source address selection, which isn't what we're talking about. I wasn't there either, but it _IS_ controlled by the sysadmin. There are defaults in case the sysadmin is asleep at the switch (RFC 3484) and there are handles and knobs for the sysadmin to tune if he wants (the other RFC that I referred you to).
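(To make the default-vs-knob point concrete -- this is not from Owen's message -- a small Python sketch: getaddrinfo() on a dual-stack host returns candidate destination addresses already sorted per the local stack's RFC 3484 rules, and that ordering is exactly what an operator can override through the system policy table, e.g. /etc/gai.conf on glibc, without touching the application. The hostname is just an example.)

    import socket

    def candidate_addresses(host: str, port: int = 443):
        """Return (family, address) pairs in the order the OS resolver prefers them.

        On a dual-stack host the ordering reflects the RFC 3484 destination
        address selection rules as applied by the local stack -- the default an
        operator can override with the system policy table (e.g. /etc/gai.conf
        on glibc) without any change to the application.
        """
        results = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
        return [(family.name, sockaddr[0]) for family, _, _, _, sockaddr in results]

    if __name__ == "__main__":
        # Example hostname only; any dual-stack name shows the effect.
        for family, addr in candidate_addresses("www.example.com"):
            print(family, addr)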
Today's RFC candidates are required to call out IANA considerations and security considerations in special sections. They do so because each of these areas has landmines that the majority of working groups are ill equipped to consider on their own.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
I think this would be a good idea, actually. It would probably be more effective to propose it to IETF than to NANOG, however.
If the complaint is that the IETF doesn't adequately listen to the operations folk, then I think it makes sense to consult the operations folks early and often on potential fixes. If folks here think it would help, -that- is when I'll take it to the IETF.
I think it would help. Hopefully others will express similar sentiment. Owen
On 10 Jul 2011, at 21:22, Owen DeLong wrote:
On Jul 10, 2011, at 12:23 PM, William Herrin wrote:
Consider, for example, RFC 3484. That's the one that determines how an IPv6 capable host selects which of a group of candidate IPv4 and IPv6 addresses for a particular host name gets priority. How is a server's address priority NOT an issue that should be managed at an operations level by individual server administrators? Yet the working group which produced it came up with a static prioritization that is the root cause of a significant portion of the IPv6 deployment headaches we face.
3484 specifies a static default. By definition, defaults in absence of operator configuration kind of have to be static. Having a reasonable and expected set of defaults documented in an RFC provides a known quantity for what operators can/should expect from hosts they have not configured. I see nothing wrong with RFC 3484 other than I would agree that the choices made were suboptimal. Mostly that was based on optimism and a lack of experience available at the time of writing.
There is another RFC and there are APIs and most operating systems have configuration mechanisms where an operator CAN set that to something other than the 3484 defaults.
There is a DHCPv6 option to configure host policy described in http://tools.ietf.org/html/draft-ietf-6man-addr-select-opt-01, which is hopefully approaching a WGLC at IETF81.
RFC3484 was originally published in 2003, which is a long time ago. And of course it turned out that, with wider operational experience and feedback from the operator community, there were some issues uncovered and some omissions. Perhaps some of those might have been foreseen, but I highly doubt all of them would have. Many of the issues were captured in RFC5220, which led to the work on RFC3484-bis, which is also close to publication.
Now, perhaps the DHCPv6 option and the 3484-bis drafts could be posted to the NANOG list at an appropriate time, for example when the docs hit WGLC. But there are a lot of WGLCs out there and the question is then whether the NANOG list, or some special NANOG list for IETF WGLCs, would want all those notifications.
As a co-author of the DHCPv6 and 3484-bis drafts, I am quite happy to post about these to NANOG as they approach last call. There are three open issues on 3484-bis that some feedback would be welcomed on.
There should be an operations callout as well -- a section where proposed operations defaults (as well as statics for which a solid case can be made for an operations tunable) are extracted from the thick of it and offered for operator scrutiny prior to publication of the RFC.
I think this would be a good idea, actually. It would probably be more effective to propose it to IETF than to NANOG, however.
Whether it's a separate section in the draft, or a recommendation to post to operator communities (which is more than just NANOG of course), the question as mentioned above is how that's done in a way that gets the attention of the appropriate operators without drowning them in notifications. Tim
The IETF is run by volunteers. They volunteer because they find designing protocols to be fun. For the most part, operators are not entertained by designing network protocols. So, for the most part they don't participate.
Randy Bush, "Editorial zone: Into the Future with the Internet Vendor Task Force: a very curmudgeonly view, or testing spaghetti," ACM SIGCOMM Computer Communication Review Volume 35 Issue 5, October 2005. http://archive.psg.com/051000.ccr-ivtf.html
On 7/10/11 6:29 PM, Randy Bush wrote:
The IETF is run by volunteers. They volunteer because they find designing protocols to be fun. For the most part, operators are not entertained by designing network protocols. So, for the most part they don't participate.
Randy Bush, "Editorial zone: Into the Future with the Internet Vendor Task Force: a very curmudgeonly view, or testing spaghetti," ACM SIGCOMM Computer Communication Review Volume 35 Issue 5, October 2005. http://archive.psg.com/051000.ccr-ivtf.html
I agree with Randy. Well, that's no surprise, I usually agree with Randy. But I didn't know/remember that he'd managed to get his rant published! Good work....
But the problem has been pretty apparent since circa 1991. I remember calls for an Internet Operator's Task Force (IOTF) to parallel the IETF sometime in '92 or '93.
Folks have asked me from time to time why I stopped participating in the IETF a decade or so ago. My usual tongue-in-cheek reply is, "it's more important to use the protocols we already have before we build more." (Cf. nukes.)
IPv6 was certainly a part of it (as was security). As I remind folks from time to time, I'm the guy that originally registered v6 with IANA. But PIPE->SIP->SIPP was a much simpler, shorter, cleaner extension using 64-bit addresses. My proposal used the upper 32 bits to extend the then 16-bit BGP ASN, making addresses match topology and shrinking the routing tables....
Although I *do* find designing protocols to be fun, these days I only post Experimental drafts. There are committees (dysfunctional "working groups") where the chair cannot get his own drafts through the process in under 4 years. It took about 7 years to publish the group negotiation extension to SSH, many years after it was shipping. It's no wonder that operators don't want to participate.
On Jul 10, 2011, at 9:16 AM, Jeroen Massar wrote:
On 2011-07-10 17:56 , David Miller wrote: [..]
+1
The lack of will on the part of the IETF to attract input from and involve operators in their processes (which I would posit is a critical element in the process).
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
You are on NANOG out of your own free will, the same applies to the IETF. If you don't participate here your voice is not heard either, just like at the IETF.
Peeking at the ipv6@ietf.org member list, I don't see your name there. You can sign up here: https://www.ietf.org/mailman/listinfo/ipv6
Greets, Jeroen
While this is true, there are a couple of factors that make it more difficult than it would appear on the surface.
Number one: Participating effectively in IETF is a rather time-consuming process. While a lot of engineers and developers may have IETF effort as a primary part of their job function and/or get their employer to let them spend time on it, operators are often too busy keeping what they already have running and it can be _VERY_ difficult to get management to support the idea of investing time in things like IETF which are not seen by management as having direct operational impact. NANOG is about the limit of their vision on such things and even that is not well supported in a lot of organizations.
Number two: While anyone can participate, approaching IETF as an operator requires a rather thick skin, or, at least it did the last couple of times I attempted to participate. I've watched a few times where operators were shouted down by purists and religion over basic real-world operational concerns. It seems to be a relatively routine practice and does not lead to operators wanting to come back to an environment where they feel unwelcome.
Owen
On 07/10/2011 12:45 PM, Owen DeLong wrote:
While this is true, there are a couple of factors that make it more difficult than it would appear on the surface.
Number one: Participating effectively in IETF is a rather time-consuming process. While a lot of engineers and developers may have IETF effort as a primary part of their job function and/or get their employer to let them spend time on it, operators are often too busy keeping what they already have running and it can be _VERY_ difficult to get management to support the idea of investing time in things like IETF which are not seen by management as having direct operational impact. NANOG is about the limit of their vision on such things and even that is not well supported in a lot of organizations.
Vendors make up the vast bulk of attendance at ietf. And vendors are there for one reason: to make stuff that you'll be paying for. So you pay for it at ietf time, or you pay for it at deployment time. Either way, you'll be paying.
Number two: While anyone can participate, approaching IETF as an operator requires a rather thick skin, or, at least it did the last couple of times I attempted to participate. I've watched a few times where operators were shouted down by purists and religion over basic real-world operational concerns. It seems to be a relatively routine practice and does not lead to operators wanting to come back to an environment where they feel unwelcome.
If you're trying to imply that operators get singled out, that's not been my experience. You definitely need to have a thick skin given the egos, and there's definitely a large pool of professional IETF finger-waggers, but their holier-than-thou attitude is spread to all in their path, from what I've seen. I won't speak for every working group, but the ones I've been involved with have been pretty receptive to operator input. Mike
On Sun, Jul 10, 2011 at 3:45 PM, Owen DeLong <owen@delong.com> wrote:
Number two: While anyone can participate, approaching IETF as an operator requires a rather thick skin, or, at least it did the last couple of times I attempted to participate. I've watched a few times where
I am subscribed to the IDR (BGP, etc.) and LISP lists. These are populated with different people and cover entirely different topics. My opinion is the following:
* The IDR list is welcoming of operators, but whether or not your opinion is listened to or included in the process, I do not know. Randy Bush, alone, posts more on this list than the sum of all operators who post in the time I've been reading. I think Randy's influence is 100% negative, and it concerns me deeply that one individual has the potential to do so much damage to essential protocols like BGP. Also, the priorities of this list are pretty fucked. Inaction within this working group is the reason we still don't have expanded BGP communities for 32 bit ASNs.
The reason for this is operators aren't participating. The people on the list or the current participants of the WG should not be blamed. My gripe about Randy Bush having the potential to do huge damage would not exist if there were enough people on the list who understand what they're doing to offer counter-arguments.
operators were shouted down by purists and religion over basic real-world operational concerns. It seems to be a relatively routine practice and does not lead to operators wanting to come back to an environment where they feel unwelcome.
I have found my input on the LISP list completely ignored because, as you suggest, my concerns are real-world and don't have any impact on someone's pet project. LISP as it stands today can never work on the Internet, and regardless of the fine reputations of the people at Cisco and other organizations who are working on it, they are either furthering it only because they would rather work on a pet project than something useful to customers, or because they truly cannot understand its deep, insurmountable design flaws at Internet-scale.
You would generally hope that someone saying, "LISP can't work at Internet-scale because anyone will be able to trivially DoS any LISP ITR ('router' for simplicity), but here is a way you can improve it," well, that remark, input, and person should be taken quite seriously, their input examined, and other assumptions about the way LISP is supposed to work ought to be questioned. None of this has happened.
LISP is a pet project to get some people their Ph.D.s and keep some old guard vendor folks from jumping ship to another company. It is a shame that the IETF is manipulated to legitimize that kind of thing. Then again, I could be wrong. Randy Bush could be a genius and LISP could revolutionize mobility.
-- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
I have found my input on the LISP list completely ignored because, as you suggest, my concerns are real-world and don't have any impact on someone's pet project. LISP as it stands today can never work on the Internet, and regardless of the fine reputations of the people at Cisco and other organizations who are working on it, they are either furthering it only because they would rather work on a pet project than something useful to customers, or because they truly cannot understand its deep, insurmountable design flaws at Internet-scale. You would generally hope that someone saying, "LISP can't work at Internet-scale because anyone will be able to trivially DoS any LISP ITR ('router' for simplicity), but here is a way you can improve it," well, that remark, input, and person should be taken quite seriously, their input examined, and other assumptions about the way LISP is supposed to work ought to be questioned. None of this has happened.
Jeff, I've spent many hours working through the issues you brought up (indeed, cache management, population, and security are three of my focus areas in LISP, and something we considered when we started this), have been socializing them with the LISP team, and can personally say that I take your comments very seriously. Our testing group, in house as well as on the LISP beta network, has been working through these issues. Also, we've had an email thread going on about this for, by my count, 3-4 replies back and forth. While I appreciate your opinions above, I have to say that I disagree with them, and also with the conclusions you draw.
-Darrel
P.S. Oh, and Randy Bush is pretty damn smart. :-)
On 07/10/2011 12:45, Owen DeLong wrote:
On Jul 10, 2011, at 9:16 AM, Jeroen Massar wrote:
On 2011-07-10 17:56 , David Miller wrote: [..]
+1
The lack of will on the part of the IETF to attract input from and involve operators in their processes (which I would posit is a critical element in the process).
Discussing how the IETF should fix itself is both fruitless and off topic for this list. However ...
While this is true, there are a couple of factors that make it more difficult than it would appear on the surface.
Number one: Participating effectively in IETF is a rather time-consuming process. While a lot of engineers and developers may have IETF effort as a primary part of their job function and/or get their employer to let them spend time on it, operators are often too busy keeping what they already have running and it can be _VERY_ difficult to get management to support the idea of investing time in things like IETF which are not seen by management as having direct operational impact. NANOG is about the limit of their vision on such things and even that is not well supported in a lot of organizations.
Number two: While anyone can participate, approaching IETF as an operator requires a rather thick skin, or, at least it did the last couple of times I attempted to participate. I've watched a few times where operators were shouted down by purists and religion over basic real-world operational concerns. It seems to be a relatively routine practice and does not lead to operators wanting to come back to an environment where they feel unwelcome.
What you're saying is absolutely right (unfortunately), but the answer is that operators need to suck it up and get involved. The problem will not fix itself if we don't. The good news is that in many areas (at least, the areas that I participate in) there is starting to be a lot more sympathy toward operational concerns/realities, and real progress is being made. Yes, it's slow, arduous, and often frustrating. (How's that for a sales pitch?) But there is literally no other solution to improving the situation than for the people that care to get involved in helping to fix it.
For those interested in IPv6 I highly recommend subscribing to the 6man and v6ops lists, listening in on the conversations for a while, and then chiming in when you feel comfortable. Treat those on the list with the same courtesy and respect that you'd like to be treated with, and way more often than not it will bear fruit.
hth, Doug
-- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/
In a message written on Sun, Jul 10, 2011 at 06:16:09PM +0200, Jeroen Massar wrote:
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
The way the IETF and the operator community interact is badly broken. The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development, for example, you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also does not help that many operational types are not hardcore programmers, and can't play in the sandbox during the major development cycles. But this shuts out operators and discourages them from participating when they are needed, which is at the idea phase and towards the end of development.
If the IETF really wanted to get useful operator input, they would slightly modify their process. On the front end there would be a clearer way for operational types to add to the to-do list "stuff we really need to make the Internet work better". Then, some sausage would be made largely without operator involvement (but hey, if you want to participate, no exclusions), and then when development is about 80-90% done there would be an "operational testing and comment period". Operators would be actively brought back into the process to test some small-scale deployments and provide feedback on operational concerns that might lead to some tweaks, and then boom, out the door it goes.
I suspect this would both increase operator participation by a few orders of magnitude, and also keep the operators from annoying the developers so much when they are in "trying things out" mode.
-- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
On Mon, Jul 11, 2011 at 3:35 PM, Leo Bicknell <bicknell@ufp.org> wrote:
The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development for example you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also
I really don't understand why that is right / good. People get personally invested in their project / spec, and not only that, vendor people get their company's time and money invested in proof-of-concept. The longer something goes on with what may be serious design flaws, the harder it is to get them fixed, simply because of momentum. Wouldn't it be nice if we could change the way that next-header works in IPv6 now? Or get rid of SLAAC and erase the RFCs recommending /80 and /64 from history? -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
On Jul 11, 2011, at 12:57 PM, Jeff Wheeler wrote:
On Mon, Jul 11, 2011 at 3:35 PM, Leo Bicknell <bicknell@ufp.org> wrote:
The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development for example you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also
I really don't understand why that is right / good. People get personally invested in their project / spec, and not only that, vendor people get their company's time and money invested in proof-of-concept. The longer something goes on with what may be serious design flaws, the harder it is to get them fixed, simply because of momentum.
Wouldn't it be nice if we could change the way that next-header works in IPv6 now? Or get rid of SLAAC and erase the RFCs recommending /80 and /64 from history?
No... I like SLAAC and find it useful in a number of places. What's wrong with /64? Yes, we need better DOS protection in switches and routers to accommodate some of the realities of those decisions, but, that's not to say that SLAAC or /64s are bad. They're fine ideas with proper protections. I'm not sure about the /80 reference as I haven't encountered that recommendation outside of some perverse ideas about point-to-point links. Owen
On Mon, Jul 11, 2011 at 5:12 PM, Owen DeLong <owen@delong.com> wrote:
No... I like SLAAC and find it useful in a number of places. What's wrong with /64? Yes, we need better DOS protection in switches and routers
See my slides, http://inconcepts.biz/~jsw/IPv6_NDP_Exhaustion.pdf, for why no vendor's implementation is effective "DOS protection" today and how much complexity is involved in doing it correctly, which requires not only knobs on routers, but also on layer-2 access switches, which is not easy to implement. It's a whole lot smarter to just configure a smaller network when that is practical. In fact, that advice should be "the standard."
I really don't understand why we need SLAAC. I believe it is a relic of a mindset when a DHCP client might have been hard to implement cost-effectively in a really light-weight client device (coffee pot? wrist-watch?), or when running a DHCP server was some big undertaking that couldn't be made not only obvious, but transparent, to SOHO users buying any $99 CPE.
I do understand why SLAAC needs /64. Okay, so configure /64 on those networks where SLAAC is utilized. Otherwise, do something else. Pretty simple! Again, please see my slides.
to accommodate some of the realities of those decisions, but, that's not to say that SLAAC or /64s are bad. They're fine ideas with proper protections.
The proper protections are kinda hard to do if you have relatively dumb layer-2 access switches. It is a lot harder than RA Guard, and we aren't ever likely to see that feature on a large base of installed "legacy" switches, like the Cisco 2950. Replacing those will be expensive. We can't replace them yet anyway, because similar switches (price-wise) today still do not have RA Guard, let alone any knobs to defend against neighbor table churn, etc. I'm not sure if they ever will have the latter.
I'm not sure about the /80 reference as I haven't encountered that recommendation outside of some perverse ideas about point-to-point links.
This is because you didn't follow IPv6 progress until somewhat recently, and you are not aware that the original suggestion for prefix length was 80 bits, leaving just 48 bits for the host portion of the address. This was later revised. It helps to know a bit of the history that got us to where we are now.
It was originally hoped, by some, that we might not even need NDP because the layer-2 adjacency would always be encoded in the end of the layer-3 address. Some people still think vendors may get us to that point with configuration knobs.
-- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
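(Some back-of-the-envelope arithmetic on why the /64 matters here -- mine, not Jeff's slides: the scan space is so much larger than any neighbor cache that a remote scan essentially never repeats an address, so every probe is a fresh INCOMPLETE entry, while the cache itself fills in seconds. The cache size and packet rate below are assumed orders of magnitude, not measurements.)

    # Back-of-the-envelope numbers behind the /64 NDP-exhaustion concern.
    hosts_in_64 = 2 ** 64                  # possible interface IDs in one /64
    typical_nd_cache = 16_000              # assumed order of magnitude for a router's ND table
    scan_rate_pps = 10_000                 # assumed modest remote packet rate

    print(f"addresses in a /64:            {hosts_in_64:,}")
    print(f"cache fills in:                {typical_nd_cache / scan_rate_pps:.1f} s at {scan_rate_pps} pps")
    print(f"years to probe the whole /64:  {hosts_in_64 / scan_rate_pps / 31_557_600:.2e}")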
On Mon, Jul 11, 2011 at 5:03 PM, Jeff Wheeler <jsw@inconcepts.biz> wrote:
On Mon, Jul 11, 2011 at 5:12 PM, Owen DeLong <owen@delong.com> wrote:
No... I like SLAAC and find it useful in a number of places. What's wrong with /64? Yes, we need better DOS protection in switches and routers
See my slides http://inconcepts.biz/~jsw/IPv6_NDP_Exhaustion.pdf for why no vendor's implementation is effective "DOS protection" today and how much complexity is involved in doing it correctly, which requires [snip]
If every vendor's implementation is vulnerable to an NDP exhaustion attack, how come the behavior of specific routers has not been documented specifically? If zero devices are invulnerable, did you come to this conclusion by testing every single implementation against an IPv6 NDP DoS? How come there are no security advisories? What's the CWE or CVE number for this vulnerability?
I'm not denying that NDP overflow might be a DoS issue for all IPv6 routers, but I haven't seen any specific documentation from vendors or security researchers about specific DoS conditions that can be caused by NDP overflow on particular devices.... It would be useful to at least have the risk properly described, in terms of what kind of DoS condition could arise on specific implementations.
Regards, -- -JH
On Mon, 2011-07-11 at 18:48 -0500, Jimmy Hess wrote:
It would be useful to at least have the risk properly described, in terms of what kind of DoS condition could arise on specific implementations.
RFC3756, IPv6 Neighbor Discovery (ND) Trust Models and Threats, Section 4.3.2:
In this attack, the attacking node begins fabricating addresses with the subnet prefix and continuously sending packets to them. The last hop router is obligated to resolve these addresses by sending neighbor solicitation packets. A legitimate host attempting to enter the network may not be able to obtain Neighbor Discovery service from the last hop router as it will be already busy with sending other solicitations. This DoS attack is different from the others in that the attacker may be off-link. The resource being attacked in this case is the conceptual neighbor cache, which will be filled with attempts to resolve IPv6 addresses having a valid prefix but invalid suffix. This is a DoS attack.
The above RFC and RFC3971 (SEND) both have good descriptions of a BUNCH of possible attacks. RFC3971 is a bit dismissive IMHO of this particular attack.
I realise this is not "specific implementations" as you requested, but it seems to me that the problem is generic enough not to require that.
The attack is made possible by the design of the protocol, not any failing of specific implementations. Specific implementations need to describe what they've done about it (mitigation or prevention).
Regards, K.
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Karl Auer (kauer@biplane.com.au) +61-2-64957160 (h) http://www.biplane.com.au/kauer/ +61-428-957160 (mob) GPG fingerprint: DA41 51B1 1481 16E1 F7E2 B2E9 3007 14ED 5736 F687 Old fingerprint: B386 7819 B227 2961 8301 C5A9 2EBC 754B CD97 0156
On 07/11/2011 09:17 PM, Karl Auer wrote:
I realise this is not "specific implementations" as you requested, but it seems to me that the problem is generic enough not to require that.
The attack is made possible by the design of the protocol, not any failing of specific implementations. Specific implementations need to describe what they've done about it (mitigation or prevention).
Vulnerability to this specific issue has a great deal to do with the implementation. After all, whenever there's a data structure that can potentially grow out of bounds (or hit a limit), it becomes a resource management issue.
In this particular case, if the implementation enforces a limit on the number of entries in the "INCOMPLETE" state, then only nodes that have never communicated with the outside world could be affected by this attack. And if those entries that are in the "INCOMPLETE" state are pruned periodically (e.g. in a round-robin fashion), chances are that even those "new hosts" would be able to get into the neighbor cache and hence remain unaffected by this attack.
Thanks,
-- Fernando Gont e-mail: fernando@gont.com.ar || fgont@acm.org PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
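(A toy model of the mitigation Fernando describes -- not any vendor's code; the class name and limit below are invented for illustration: cap the number of INCOMPLETE entries and evict the oldest when the cap is hit, so resolved neighbors are never displaced by unresolved junk from an off-link scan.)

    from collections import OrderedDict

    class NeighborCache:
        """Toy ND cache: resolved entries are never evicted by the attack;
        INCOMPLETE entries are capped and pruned oldest-first."""

        def __init__(self, max_incomplete: int = 512):
            self.reachable = {}                      # addr -> link-layer address
            self.incomplete = OrderedDict()          # addr -> retransmit count
            self.max_incomplete = max_incomplete

        def packet_to(self, addr: str) -> None:
            """A data packet needs resolution for an unknown on-link address."""
            if addr in self.reachable or addr in self.incomplete:
                return
            if len(self.incomplete) >= self.max_incomplete:
                self.incomplete.popitem(last=False)  # prune the oldest INCOMPLETE entry
            self.incomplete[addr] = 0                # would also emit a Neighbor Solicitation

        def neighbor_advertisement(self, addr: str, lladdr: str) -> None:
            """A real host answered: promote it to REACHABLE, whether or not its
            INCOMPLETE entry survived the pruning."""
            self.incomplete.pop(addr, None)
            self.reachable[addr] = lladdr

    cache = NeighborCache(max_incomplete=4)
    cache.packet_to("2001:db8::beef")                # a legitimate resolution in flight
    for i in range(1000):                            # off-link scan of bogus suffixes
        cache.packet_to(f"2001:db8::{i:x}")
    cache.neighbor_advertisement("2001:db8::beef", "00:11:22:33:44:55")
    print(len(cache.incomplete), "incomplete,", len(cache.reachable), "reachable")

The scan only ever occupies the capped INCOMPLETE slots, and the legitimate host still ends up REACHABLE once it answers.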
On Thu, Jul 14, 2011 at 7:29 PM, Fernando Gont <fernando@gont.com.ar> wrote:
On 07/11/2011 09:17 PM, Karl Auer wrote:
Vulnerability to this specific issue has a great deal to do with the implementation. After all, whenever there's a data structure that can [...]
Yes
In this particular case, if the implementation enforces a limit on the number of entries in the "INCOMPLETE" state, then only nodes that have never communicated with the outside world could be affected by this attack. And if those entries that are in the "INCOMPLETE" state are pruned periodically (e.g. in a round-robin fashion), chances are that
Not only that, but it's possible to differentiate _how_ an entry is added to the table. When the table reaches a "high water mark", it's possible to drop the packet that was attempting to cause an NDP discover, fail to add the INCOMPLETE entry to the table, but _still_ send the outgoing NDP neighbor solicitation, and complete the entry or "whitelist" the destination if the neighbor advertises itself.
That is: if the destination is good, the neighbor will respond to the NDP solicit, even though the neighbor doesn't have an entry in the table.
So a small number of packets are lost at initial setup, due to the attack, but further packets are unaffected, so long as the attack does not overwhelm router CPU, and so long as the INCOMPLETE entry high water mark is at a low enough level that there is still ample space in the table.
Even more sophisticated strategies may be available.
It should be possible to mitigate this, so long as the attack does not actually originate from a neighbor on the same subnet as a router IP interface on an IPv6 subnet with a sufficient number of IPs.
even those "new hosts" would be able to get into the neighbor cache and hence remain unaffected by this attack.
Thanks,
-- -JH
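(And a toy model of the variant Jimmy outlines above -- again invented for illustration rather than taken from a real stack: past a high-water mark the router creates no INCOMPLETE entry and drops the triggering packet, but still emits the Neighbor Solicitation, so a genuine neighbor that answers gets "whitelisted" into the table while the attack traffic leaves it alone.)

    class HighWaterNDCache:
        """Toy model of the drop-but-still-solicit mitigation."""

        def __init__(self, table_size: int = 1024, high_water: int = 256):
            self.entries = {}                 # addr -> "INCOMPLETE" or resolved link-layer addr
            self.table_size = table_size
            self.high_water = high_water      # max INCOMPLETE entries tolerated

        def _incomplete(self) -> int:
            return sum(1 for s in self.entries.values() if s == "INCOMPLETE")

        def packet_to(self, addr: str) -> bool:
            """Return True if the triggering packet is queued, False if dropped."""
            if addr in self.entries:
                return True                   # already resolved, or resolution in progress
            if self._incomplete() < self.high_water and len(self.entries) < self.table_size:
                self.entries[addr] = "INCOMPLETE"   # normal path: create entry, send NS, queue packet
                return True
            # Above the high-water mark: drop the packet and create no state,
            # but still emit the Neighbor Solicitation on the wire.
            return False

        def neighbor_advertisement(self, addr: str, lladdr: str) -> None:
            """A neighbor answered a solicitation: admit it even with no pending
            INCOMPLETE entry (the "whitelist" step; Fernando's caveat below about
            unsolicited advertisements applies to a real implementation)."""
            if len(self.entries) < self.table_size:
                self.entries[addr] = lladdr

    cache = HighWaterNDCache(high_water=2)
    for i in range(10_000):
        cache.packet_to(f"2001:db8::dead:{i:x}")     # scan: only 2 INCOMPLETE entries ever exist
    first_attempt = cache.packet_to("2001:db8::1")   # dropped (False) while under attack...
    cache.neighbor_advertisement("2001:db8::1", "00:11:22:33:44:55")
    print(first_attempt, cache.packet_to("2001:db8::1"))   # False True: later packets flow normally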
On Jul 14, 2011, at 6:24 PM, Jimmy Hess wrote:
On Thu, Jul 14, 2011 at 7:29 PM, Fernando Gont <fernando@gont.com.ar> wrote:
On 07/11/2011 09:17 PM, Karl Auer wrote:
Vulnerability to this specific issue has a great deal to do with the implementation. After all, whenever there's a data structure that can [...]
Yes
In this particular case, if the implementation enforces a limit on the number of entries in the "INCOMPLETE" state, then only nodes that have never communicated with the outside world could be affected by this attack. And if those entries that are in the "INCOMPLETE" state are pruned periodically (e.g. in a round-robin fashion), chances are that
Not only that, but it's possible to differentiate _how_ an entry is added to the table. When the table reaches a "high water mark", it's possible to drop the packet that was attempting to cause an NDP discover, fail to add the INCOMPLETE entry to the table, but _still_ send the outgoing NDP neighbor solicitation, and complete the entry or "whitelist" the destination if the neighbor advertises itself.
That is: if the destination is good, the neighbor will respond to the NDP solicit, even though the neighbor doesn't have an entry in the table.
So a small number of packets are lost at initial setup, due to the attack, but further packets are unaffected,
so long as the attack does not overwhelm router CPU, and so long as the INCOMPLETE entry high water mark is at a low enough level that there is still ample space in the table.
Even more sophisticated strategies may be available.
It should be possible to mitigate this, so long as the attack does not actually originate from a neighbor on the same subnet as a router IP interface on an IPv6 subnet with sufficient number of IPs.
even those "new hosts" would be able to get into the neighbor cache and hence remain unaffected by this attack.
Thanks,
-- -JH
Very true. This is where Mr. Wheeler's arguments depart from reality. He's right in that the problem can't be truly fixed without some very complicated code added to lots of devices, but it can be mitigated relatively easily, and mitigation really is good enough for most real-world purposes. Owen
On Thu, Jul 14, 2011 at 9:47 PM, Owen DeLong <owen@delong.com> wrote:
Very true. This is where Mr. Wheeler's arguments depart from reality. He's right in that the problem can't be truly fixed without some very complicated code added to lots of devices, but, it can be mitigated relatively easily and mitigation really is good enough for most real world purposes.
OK, I'll bite: what's the solution?
On 07/14/2011 10:24 PM, Jimmy Hess wrote:
In this particular case, if the implementation enforces a limit on the number of entries in the "INCOMPLETE" state, then only nodes that have never communicated with the outside world could be affected by this attack. And if those entries that are in the "INCOMPLETE" state are pruned periodically (e.g. in a round-robin fashion), chances are that
Not only that, but it's possible to differentiate _how_ an entry is added to the table. When the table reaches a "high water mark", it's possible to drop the packet that was attempting to cause an NDP discovery and fail to add the INCOMPLETE entry to the table, but _still_ send the outgoing NDP neighbor solicitation, and then complete the entry or "whitelist" the destination if the neighbor advertises itself.
Agreed. I should double-check whether there's room in the current specifications to do this -- however, whether the specs allow this or not is irrelevant. At the point you're being hit with a DoS, you better do something about it (particularly when it's possible, as in this case!)
That is: if the destination is good, the neighbor will respond to the NDP solicit, even though the neighbor doesn't have an entry in the table.
Modulo that when the high water mark has not been hit, the router should probably *not* create ND cache entries in response to these "gratuitous" ND advertisements, since otherwise it would open the door to a DoS from local attackers.
It should be possible to mitigate this, so long as the attack does not actually originate from a neighbor on the same subnet as a router IP interface on an IPv6 subnet with sufficient number of IPs.
Well, unless there's some layer-2 anti-spoofing mitigation in place, with /64 subnets the "local attacker" typically *will* have enough addresses. Thanks, -- Fernando Gont e-mail: fernando@gont.com.ar || fgont@acm.org PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
On Jul 14, 2011, at 10:06 PM, Fernando Gont <fernando@gont.com.ar> wrote:
It should be possible to mitigate this, so long as the attack does not actually originate from a neighbor on the same subnet as a router IP interface on an IPv6 subnet with sufficient number of IPs.
Well, unless there's some layer-2 anti-spoofing mitigation in place, with /64 subnets the "local attacker" typically *will* have enough addresses.
Solving a local attack is something I consider different in scope than the current draft being discussed in 6man, v6ops, ipv6@ etc... Anyone on a layer-2 network can do something interesting like flood all f's and kill the lan. Trying to keep the majority of thoughts here for layer-3 originated attacks, even if the target is a layer2 item. - Jared
On 07/14/2011 11:35 PM, Jared Mauch wrote:
Well, unless there's some layer-2 anti-spoofing mitigation in place, with /64 subnets the "local attacker" typically *will* have enough addresses.
Solving a local attack
Well, I was talking about not *introducing* ;-) one.
is something I consider different in scope than the current draft being discussed in 6man, v6ops, ipv6@ etc...
Which I-D are you referring to? Thanks, -- Fernando Gont e-mail: fernando@gont.com.ar || fgont@acm.org PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
http://tools.ietf.org/html/draft-gashinsky-v6nd-enhance-00
Sent from my iThing
On Jul 14, 2011, at 10:57 PM, Fernando Gont <fernando@gont.com.ar> wrote:
On 07/14/2011 11:35 PM, Jared Mauch wrote:
Well, unless there's some layer-2 anti-spoofing mitigation in place, with /64 subnets the "local attacker" typically *will* have enough addresses.
Solving a local attack
Well, I was talking about not *introducing* ;-) one.
is something I consider different in scope than the current draft being discussed in 6man, v6ops, ipv6@ etc...
Which I-D are you referring to?
Thanks, -- Fernando Gont e-mail: fernando@gont.com.ar || fgont@acm.org PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
On Thu, Jul 14, 2011 at 9:35 PM, Jared Mauch <jared@puck.nether.net> wrote:
On Jul 14, 2011, at 10:06 PM, Fernando Gont <fernando@gont.com.ar> wrote:
Anyone on a layer-2 network can do something interesting like flood all f's and kill the lan. Trying to keep the majority of thoughts here for layer-3 originated attacks, even if the target is a layer2 item. - Jared
In most cases, if you have a DoS attack coming from the same Layer-2 network that a router is attached to, it would mean there was already a serious security incident that occurred to give the attacker that special point to attack from.

A similarly hazardous situation exists with IPv4, and it is basically unheard of for IPv4's Layer 2/ARP security weaknesses to be exploited to create a DoS condition, even though they can be (very easily). IPv4 Layer 2 DoS conditions are more often due to a malfunction or error than an intended attack; more likely, IPv6 Layer 2 security weaknesses will be used to intercept traffic for snooping, or to quietly subvert network policy. LAN DoS conditions are noticed quickly, and usually result in physical unplugging of the attacking (or malfunctioning) node.

Methods can be designed to protect against spoofed NDP flooding on the LAN that do not require the router's involvement. For IPv4 switched networks there is a technology referred to as 'Dynamic ARP Inspection'. Untrusted IPv6 LAN environments will need to implement SEND or some form of 'Dynamic ND Inspection' plus RA-Guard.

If it comes down to solving a remote DoS issue at the cost of creating a LAN DoS issue that comes down to 'hosts on the LAN having to spoof', I would say that's easily well worth it. -- -JH
On 07/15/2011 12:24 AM, Jimmy Hess wrote:
A similarly hazardous situation exists with IPv4, and it is basically unheard of for IPv4's Layer 2/ARP security weaknesses to be exploited to create a DoS condition, even though they can be (very easily),
IMO, the situation is different, in that the typical IPv4 subnet size eliminates some of the attack vectors. For example, it would be virtually impossible for an ARP cache to grow without bounds, and consume all kernel memory, because the typical IPv4 subnet size imposes a limit on the number of entries. That is *not* the case with v6.
IPv4 Layer 2 DoS conditions are more often due to a malfunction or error than an intended attack; more likely, IPv6 Layer 2 security weaknesses will be used to intercept traffic for snooping, or to quietly subvert network policy. LAN DoS conditions are noticed quickly, and usually result in physical unplugging of the attacking (or malfunctioning) node.
Assuming the admin of the possibly-ipv6-enabled-by-default router is IPv6 aware, etc.
Methods can be designed to protect against spoofed NDP flooding on the LAN that do not require the router's involvement.
Which ones?
For IPv4 switched networks there is a technology referred to as 'Dynamic ARP Inspection'.
Untrusted IPv6 LAN environments will need to implement SEND or some form of 'Dynamic ND inspection' plus RA-guard.
Good luck with deploying SEND. OTOH, forget about current implementations of RA-Guard:
* http://tools.ietf.org/id/draft-gont-v6ops-ra-guard-evasion-01.txt
* http://tools.ietf.org/id/draft-gont-6man-nd-extension-headers-01.txt
If it comes down to solving a remote DoS issue at the cost of creating a LAN DoS issue that comes down to 'hosts on the LAN having to spoof'
I would say that's easily well worth it.
You *can* fix the remote DoS issue, *without* introducing the locally-exploitable one. That's the point. Thanks, -- Fernando Gont e-mail: fernando@gont.com.ar || fgont@acm.org PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
On Jul 14, 2011, at 8:24 PM, Jimmy Hess wrote:
On Thu, Jul 14, 2011 at 9:35 PM, Jared Mauch <jared@puck.nether.net> wrote:
On Jul 14, 2011, at 10:06 PM, Fernando Gont <fernando@gont.com.ar> wrote:
Anyone on a layer-2 network can do something interesting like flood all f's and kill the lan. Trying to keep the majority of thoughts here for layer-3 originated attacks, even if the target is a layer2 item. - Jared
In most cases, if you have a DoS attack coming from the same Layer-2 network that a router is attached to, it would mean there was already a serious security incident that occurred to give the attacker that special point to attack from.
That's one possibility. The other likely possibility is that you are a University. Owen
On Thu, 14 Jul 2011 23:13:03 PDT, Owen DeLong said:
On Jul 14, 2011, at 8:24 PM, Jimmy Hess wrote:
In most cases, if you have a DoS attack coming from the same Layer-2 network that a router is attached to, it would mean there was already a serious security incident that occurred to give the attacker that special point to attack from.
That's one possibility.
The other likely possibility is that you are a University.
Nope. Unless you want to add "or you are a cable provider, or you are a DSL provider, or you are a...." to that. (Hint - what percent of students launch DoS attacks that cut themselves off from the net? Compare to what percent of non-student machines out on cable and DSL are botted or pwned.)

Even if you're a university with resident students, if said students are on the same Layer-2 as anything you actually care about, you have a serious security incident. "Student manages to DoS the router out of the dorm and strands 3 floors of dorm without internet" is just as interesting as "Joe Sixpack manages to DoS the router at the cable head end and strands 3 blocks of Comcast customers without internet", for the *exact same reasons*. If the student is able to play more layer-2 games than Joe Sixpack can, you misdesigned your network.
On Jul 15, 2011, at 10:24 AM, Jimmy Hess wrote:
In most cases, if you have a DoS attack coming from the same Layer-2 network that a router is attached to, it would mean there was already a serious security incident that occurred to give the attacker that special point to attack from.
This scenario is quite common in physical/virtual co-lo IDCs, FYI - customer A attacking customer B, both within the same subnet, in many cases. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> The basis of optimism is sheer terror. -- Oscar Wilde
* Jared Mauch:
Solving a local attack is something I consider different in scope than the current draft being discussed in 6man, v6ops, ipv6@ etc...
That's not going to happen because it's a layering violation between the IETF and IEEE. It has not been solved during thirty years of IPv4 over Ethernet. Why would IPv6 be different? In practice, the IPv4 vs IPv6 difference is that some vendors provide DHCP snooping, private VLANs and unicast flood protection in IPv4 land, which seems to provide a scalable way to build Ethernet networks with address validation---but there is nothing comparable for IPv6 right now, from any vendor.
On Jul 17, 2011, at 4:15 PM, Florian Weimer wrote:
In practice, the IPv4 vs IPv6 difference is that some vendors provide DHCP snooping, private VLANs and unicast flood protection in IPv4 land, which seems to provide a scalable way to build Ethernet networks with address validation---but there is nothing comparable for IPv6 right now, from any vendor.
It seems to me that IPv4 feature parity in terms of layer-2 security features should be prominently featured in upcoming RFPs. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> The basis of optimism is sheer terror. -- Oscar Wilde
On Sun, 17 Jul 2011, Florian Weimer wrote:
In practice, the IPv4 vs IPv6 difference is that some vendors provide DHCP snooping, private VLANs and unicast flood protection in IPv4 land, which seems to provide a scalable way to build Ethernet networks with address validation---but there is nothing comparable for IPv6 right now, from any vendor.
That is not true. Check out work and reports from the IETF SAVI WG. There are actually quite a few implementations out there, being used in production. -- Mikael Abrahamsson email: swmike@swm.pp.se
* Mikael Abrahamsson:
On Sun, 17 Jul 2011, Florian Weimer wrote:
In practice, the IPv4 vs IPv6 difference is that some vendors provide DHCP snooping, private VLANs and unicast flood protection in IPv4 land, which seems to provide a scalable way to build Ethernet networks with address validation---but there is nothing comparable for IPv6 right now, from any vendor.
That is not true. Check out work and reports from the IETF SAVI WG. There are actually quite a few implementations out there, being used in production.
Others use tunnels, PPPoE or lots of scripting, so certainly something can be done about it. To my knowledge, SAVI SEND is still at a similar stage. Pointers to vendor documentation would be appreciated if this is not the case. And SAVI SEND is not the full story---it's still missing unicast flood protection.
On Sun, 17 Jul 2011, Florian Weimer wrote:
Others use tunnels, PPPoE or lots of scripting, so certainly something can be done about it. To my knowledge, SAVI SEND is still at a similar stage. Pointers to vendor documentation would be appreciated if this is not the case.
<www.ietf.org/proceedings/79/slides/savi-6.pdf> -- Mikael Abrahamsson email: swmike@swm.pp.se
* Mikael Abrahamsson:
On Sun, 17 Jul 2011, Florian Weimer wrote:
Others use tunnels, PPPoE or lots of scripting, so certainly something can be done about it. To my knowledge, SAVI SEND is still at a similar stage. Pointers to vendor documentation would be appreciated if this is not the case.
<www.ietf.org/proceedings/79/slides/savi-6.pdf>
Interesting, thanks. It's not the vendors I would expect, and it's not based on SEND (which is not surprising at all and actually a good thing). Is this actually secure in the sense that it ties addresses to specific ports for both sending and receiving? I'm asking because folks have built similar systems for IPv4 which weren't. The CLI screenshots look good, better than what most folks achieve with IPv4.
On Sun, 17 Jul 2011, Florian Weimer wrote:
Interesting, thanks. It's not the vendors I would expect, and it's not based on SEND (which is not surprising at all and actually a good thing).
Personally I think SEND is never going to get any traction.
Is this actually secure in the sense that it ties addresses to specific ports for both sending and receiving? I'm asking because folks have built similar systems for IPv4 which weren't. The CLI screenshots look good, better than what most folks achieve with IPv4.
As far as I know, it's designed to work securely in an ETTH scenario, which implies both sending and receiving (if I understood you correctly). -- Mikael Abrahamsson email: swmike@swm.pp.se
* Mikael Abrahamsson:
On Sun, 17 Jul 2011, Florian Weimer wrote:
Interesting, thanks. It's not the vendors I would expect, and it's not based on SEND (which is not surprising at all and actually a good thing).
Personally I think SEND is never going to get any traction.
Last time, I was told that SEND was the way to go, despite not actually fixing anything. This mess is even worse than SCTP.
Is this actually secure in the sense that it ties addresses to specific ports for both sending and receiving? I'm asking because folks have built similar systems for IPv4 which weren't. The CLI screenshots look good, better than what most folks achieve with IPv4.
As far as I know, it's designed to work securely in an ETTH scenario, which implies both sending and receiving (if I understood you correctly).
And it would also plug the NDP DOS vector because you've got a small set of addresses you need to process. Let's hope this gets buy-in from more vendors (and across the whole switch product lines, please), with full interoperability.
On Mon, Jul 11, 2011 at 8:17 PM, Karl Auer <kauer@biplane.com.au> wrote:
RFC3756 IPv6 Neighbor Discovery (ND) Trust Models and Threats
In this attack, the attacking node begins fabricating addresses with the subnet prefix and continuously sending packets to them. The last hop router is obligated to resolve these addresses by sending neighbor solicitation packets. A legitimate host attempting to enter the network may not be able to obtain Neighbor Discovery service from the last hop router as it will be already busy with sending other solicitations.
Hi Karl,

My off-the-cuff naive solution to this problem would be to discard the oldest incomplete solicitation to fit the new one and, upon receiving an apparently unsolicited response to a discarded solicitation, restart the process, flagging that particular query non-discardable. That would be an implementation change, not a protocol change.

I would expect to occasionally lose a packet due to the discard while the router was under attack, with accordingly minimal impact. I would also expect to see a multicast flood on the LAN of about the same data rate as the inbound attack packets.

Where does this naive approach break down?

On Fri, Jul 15, 2011 at 12:13 AM, Fernando Gont <fernando@gont.com.ar> wrote:
On 07/15/2011 12:24 AM, Jimmy Hess wrote:
A similarly hazardous situation exists with IPv4, and it is basically unheard of for IPv4's Layer 2/ARP security weaknesses to be exploited to create a DoS condition, even though they can be (very easily),
IMO, the situation is different, in that the typical IPv4 subnet size eliminate some of the attack vectors.
Hi Fernando,

Not at a practical level. The reason it's unheard of for IPv4 is that if you're a hacker with an ability to generate arbitrary packets on a LAN, DOSing the adjacent router by overwhelming its ARP cache is one of the least interesting things you can do... and one of the easiest to get busted at.

It isn't necessary (or possible) to solve every conceivable *local* DOS attack. And frankly remote saturation-bomb attacks are out of bounds too. The concern Karl presented was that it was possible to remotely disable an IPv6 LAN with tailored traffic much less than that network's inbound capacity. Solve that issue with IPv6 ND and we're done.

Regards,
Bill Herrin
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
On Sun, Jul 17, 2011 at 11:42 AM, William Herrin <bill@herrin.us> wrote:
My off-the-cuff naive solution to this problem would be to discard the oldest incomplete solicitation to fit the new one and, upon receiving an apparently unsolicited response to a discarded solicitation, restart the process flagging that particular query non-discardable.
Do you mean to write, "flagging that ND entry non-discardable?" Once the ND entry is in place, it should not be purged for quite some time (configurable is a plus), on the order of minutes or hours. Making them "permanent" would, however, cause the ND table to eventually become full when foolish things like frequent source address changes for "privacy" are in use, many clients are churning in and out of the LAN, etc.
Where does this naive approach break down?
It breaks down because the control-plane can't handle the relatively small number of punts which must be generated in order to send ND solicits, and without the ability to install "incomplete" entries into the data-plane, those punts cannot be policed without, by design, discarding some "good" punts along with the "bad" punts resulting from DoS traffic. -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
On Jul 17, 2011, at 10:35 AM, Jeff Wheeler wrote:
On Sun, Jul 17, 2011 at 11:42 AM, William Herrin <bill@herrin.us> wrote:
My off-the-cuff naive solution to this problem would be to discard the oldest incomplete solicitation to fit the new one and, upon receiving an apparently unsolicited response to a discarded solicitation, restart the process flagging that particular query non-discardable.
Do you mean to write, "flagging that ND entry non-discardable?" Once the ND entry is in place, it should not be purged for quite some time (configurable is a plus), on the order of minutes or hours. Making them "permanent" would, however, cause the ND table to eventually become full when foolish things like frequent source address changes for "privacy" are in use, many clients are churning in and out of the LAN, etc.
I believe it was obvious in the context that he means flagging the incomplete ND entry as one that is not to be discarded early when pruning potentially DOS-related ND entries.

Basically, an ND entry would have the following states and timers:

Flag bits:
  I  Incomplete (1 = ND entry is not complete)
  D  Discardable (1 = Incomplete entry is the result of an incoming packet for an unverified neighbor)

Timers:
  A  Age -- counts up from the time the ND entry was created (most likely synthetic, calculated by storing T in the ND entry and using Tnow - Tentry).

The system would have two timer policies: one for Incomplete Timeout and one for Complete Timeout (TI and TC).

* At A=TI, an incomplete entry would be discarded regardless of the D flag.
* At A=TC, a complete entry would be discarded regardless of the D flag.
* When a packet arrives for a host which does not exist in the ND table, a new entry with flags I and D would be created. An ND request would be generated as normal.
* When a new ND table entry is required and the table is full, the oldest entry with both I and D flags (max(A)) would be replaced with the new entry.
* When an ND response is received matching an entry with the I flag set, the I and D flags would be cleared and the entry would be filled in with the appropriate data.
* When an ND response is received not matching an entry with the I flag set, a new entry with the I flag and no D flag would be created.

In this way, you cannot overflow the neighbor table in a way that creates significant disruption unless there are actually too many neighbors, in which case it's bad network design and not DOS.
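As a rough illustration of that entry life-cycle, here is a sketch in Python; the timer values, table size, and names are assumptions made for the example, not part of any vendor implementation:

import time

TI = 3          # Incomplete Timeout, seconds (illustrative)
TC = 1800       # Complete Timeout, seconds (illustrative)
TABLE_MAX = 4096

class Entry:
    def __init__(self, incomplete=True, discardable=True):
        self.incomplete = incomplete     # I flag
        self.discardable = discardable   # D flag
        self.created = time.monotonic()  # basis for the synthetic age A

    def age(self):
        return time.monotonic() - self.created

class NdTable:
    def __init__(self):
        self.entries = {}                # address -> Entry

    def expire(self):
        # At A=TI (incomplete) or A=TC (complete), discard regardless of D.
        for addr, e in list(self.entries.items()):
            if e.age() >= (TI if e.incomplete else TC):
                del self.entries[addr]

    def on_packet_for_unknown(self, dst, send_ns):
        if len(self.entries) >= TABLE_MAX:
            # Replace the oldest entry carrying both the I and D flags.
            candidates = [(e.age(), a) for a, e in self.entries.items()
                          if e.incomplete and e.discardable]
            if not candidates:
                return                   # table genuinely full of verified neighbors
            del self.entries[max(candidates)[1]]
        self.entries[dst] = Entry(incomplete=True, discardable=True)
        send_ns(dst)

    def on_nd_response(self, src):
        e = self.entries.get(src)
        if e is not None and e.incomplete:
            e.incomplete = e.discardable = False   # clear I and D, fill in entry
        elif e is None:
            # Response to a solicit whose entry was already replaced:
            # re-create it as Incomplete but not Discardable.
            self.entries[src] = Entry(incomplete=True, discardable=False)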
Where does this naive approach break down?
It breaks down because the control-plane can't handle the relatively small number of punts which must be generated in order to send ND solicits, and without the ability to install "incomplete" entries into the data-plane, those punts cannot be policed without, by design, discarding some "good" punts along with the "bad" punts resulting from DoS traffic.
I think most of this punting could be handled at the line card level. Is there any reason that the ND process can't be moved into line-card level silicon as described above? Sure, that doesn't solve the problem on current hardware, but, it moves it from design problem to implementation issue, which IMHO is a step in the right direction. Owen
On Sun, Jul 17, 2011 at 3:40 PM, Owen DeLong <owen@delong.com> wrote:
Basically an ND entry would have the following states and timers:
I've discussed what you have described with some colleagues in the past. The idea has merit and I would certainly not complain if vendors included it (as a knob) on their boxes. The downfalls of this approach are that they still don't ensure the discovery of new neighbors (rather than "ever seen" neighbors) during DoS, and you make the local DoS a bit more complex by needing to establish more rules for purging these semi-permanent entries.
I think most of this punting could be handled at the line card level. Is there any reason that the ND process can't be moved into line-card level silicon as described above?
You could implement ND solicit in the data-plane (and remove punts entirely) in even some current chips, to say nothing of future ones. Whether or not that is a good idea, well, keep in mind that the ND solicits would then be mcasted to the LAN at a potentially unlimited rate.

That is not necessarily a problem unless the L2 implementation is not too good with respect to multicast. For example, in some "switches" (mostly those that are routers that can switch) the L2 mcast has surprising caveats, such as using up a lot of fabric capacity for whatever replication scheme has been chosen.

Of course, you also hope NDP on all the connected hosts works right. I believe some Juniper customers noticed a pretty big problem with JUNOS NDP implementation when deploying boxes using the DE-CIX addressing scheme, and in a situation like that, the ingress router for the attack could be crippled by spurious responses from the other mis-behaving hosts on the LAN, essentially like smurf except without sending any garbage back out to the Internet.

What you definitely don't want to do is assume this fixes the local DoS, because it doesn't. I would like for you to keep in mind that a host on the LAN, misconfigured to do something like "local proxy-arp," or otherwise responding to all ND solicits, would accidentally DoS the LAN's gateway. I do not think we should assume that the local DoS won't happen, or is "fixable" with a whack-a-mole method.
Sure, that doesn't solve the problem on current hardware, but, it moves it from design problem to implementation issue, which IMHO is a step in the right direction.
Well, it already is a design problem that implementations can largely work-around. Vendors just aren't doing it. :-/ -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
On Jul 17, 2011, at 1:17 PM, Jeff Wheeler wrote:
On Sun, Jul 17, 2011 at 3:40 PM, Owen DeLong <owen@delong.com> wrote:
Basically an ND entry would have the following states and timers:
I've discussed what you have described with some colleagues in the past. The idea has merit and I would certainly not complain if vendors included it (as a knob) on their boxes. The downfalls of this approach are that they still don't ensure the discovery of new neighbors (rather than "ever seen" neighbors) during DoS, and you make the local DoS a bit more complex by needing to establish more rules for purging these semi-permanent entries.
Sure they do... Just not necessarily on the first attempt. There are no semi-permanent entries. In fact, it doesn't make any entry more permanent than today's state. The D flag just makes entries more readily discardable than today's entries. So you have some misconceptions about how it would work in practice, I think.

Under DOS, the first packet that arrives for a known host generates the standard ND request sent to the host, but, the Incomplete ND table entry is created with the D flag set. If the host responds before the ND table entry is discarded, all functions as normal. If the entry is discarded before the host responds, then the response from the host creates a new incomplete entry without the D flag set. This entry will live for the normal time that an incomplete ND entry would be kept (not eligible for early discard) and the retry packet from the originating host would then generate a new ND request and the response should arrive before the normal incomplete ND timer expires. At that point a normal complete entry is created and things continue to function.

So, what happens under this scenario is that you have a small chance that you need to wait for an initial connection retry on an unseen host, but, you can easily discard incomplete ND entries for which no response has yet been received. Further, since you're only discarding the oldest one entry each time you need to create a new entry in a full table, this would only start discarding things when an actual table overflow is occurring, whether from DOS or other cause. If it's another cause, I don't think this makes life any worse. If it's DOS, then, it should be relatively rare that a responsive host is the oldest ND table entry that would get discarded, no?
I think most of this punting could be handled at the line card level. Is there any reason that the ND process can't be moved into line-card level silicon as described above?
You could implement ND solicit in the data-plane (and remove punts entirely) in even some current chips, to say nothing of future ones. Whether or not that is a good idea, well, keep in mind that the ND solicits would then be mcasted to the LAN at a potentially unlimited rate.
There's no reason it would have to be an unlimited rate, but, I think that would probably be acceptable in most cases anyway.
That is not necessarily a problem unless the L2 implementation is not too good with respect to multicast. For example, in some "switches" (mostly those that are routers that can switch) the L2 mcast has surprising caveats, such as using up a lot of fabric capacity for whatever replication scheme has been chosen.
If your L2 implementation sucks on Mcast in IPv6, you're kind of in a bad way anyway.
Of course, you also hope NDP on all the connected hosts works right. I believe some Juniper customers noticed a pretty big problem with JUNOS NDP implementation when deploying boxes using the DE-CIX addressing scheme, and in a situation like that, the ingress router for the attack could be crippled by spurious responses from the other mis-behaving hosts on the LAN, essentially like smurf except without sending any garbage back out to the Internet.
I think the bad NDP implementations on the hosts will get sorted fairly quickly anyway. Since all a spurious hosts would do is create a new incomplete entry without the D flag set the FIRST time it sends an unsolicited ND response, I'm not sure how that would really cripple the ingress router. Care to explain that?
What you definitely don't want to do is assume this fixes the local DoS, because it doesn't. I would like for you to keep in mind that a host on the LAN, misconfigured to do something like "local proxy-arp," or otherwise responding to all ND solicits, would accidentally DoS the LAN's gateway. I do not think we should assume that the local DoS won't happen, or is "fixable" with a whack-a-mole method.
I consider local DOS to be a corner case unique to universities and very poorly run colos. We've already had that discussion and IIRC agreed to disagree.
Sure, that doesn't solve the problem on current hardware, but, it moves it from design problem to implementation issue, which IMHO is a step in the right direction.
Well, it already is a design problem that implementations can largely work-around. Vendors just aren't doing it. :-/
Well, I think provided a simple solution as outlined above it might be easier to get them to do so if they think there is demand. I know I'll be discussing this with the guy that deals with our vendors to see if we can convince them to roll it into an upcoming release. Owen
On Sun, Jul 17, 2011 at 1:35 PM, Jeff Wheeler <jsw@inconcepts.biz> wrote:
On Sun, Jul 17, 2011 at 11:42 AM, William Herrin <bill@herrin.us> wrote:
My off-the-cuff naive solution to this problem would be to discard the oldest incomplete solicitation to fit the new one and, upon receiving an apparently unsolicited response to a discarded solicitation, restart the process flagging that particular query non-discardable.
Do you mean to write, "flagging that ND entry non-discardable?"
Hi Jeff,

I meant flagging the new incomplete solicitation ineligible for the previous sentence's early load-based discard. When you get a response to a solicitation you no longer remember making, solicit again and don't forget about it until the normal protocol timeouts this time.
Where does this naive approach break down?
It breaks down because the control-plane can't handle the relatively small number of punts which must be generated in order to send ND solicits, and without the ability to install "incomplete" entries into the data-plane, those punts cannot be policed without, by design, discarding some "good" punts along with the "bad" punts resulting from DoS traffic.
Let me try to restate what you've said to make sure I understand. When the data plane knows what ARP or ND is underway, it can guard against overwhelming the control plane by discarding excessive traffic for the same yet-unresolved destination while allowing packets for new destinations on the lan through to the control plane. Without that knowledge, it can only have one queue, causing the data plane to discard packets which would initiate neighbor discovery prior to the control plane seeing them, preventing any solicitation or implementing the logic I described.

This doesn't sound particularly hard to me.

Most CPE has the control and data planes on the same silicon. A guard at the data plane is pointless in the first place. Just punt the packet up and move on.

On the bigger iron where the planes are running on different chips, you can move the initial ND solicitation packet into the data plane. After failing to find an incomplete ND, generate and send the ND solicitation and THEN make the decision whether to punt to the control plane or discard. If you discard, the control plane will find out about the "good" ones when the response comes back. This means you could multiply a unicast flood into a multicast flood, but you'll have to pump out several orders of magnitude more packets than with the original problem before it causes me grief.

Still, you've sold me on part of the claim: A /64 is inherently vulnerable to a remote DOS attack that a /120 is not. Now sell me on the other part: How does this require effort on the attacker's part that's enough smaller than the general form "flood his link" attack that I should care about it beyond poking my vendors to see if they've reasonably covered the high-load corner cases? I see how the original attack could kill a lan with a relatively small number of packets. With the naive solution, it seems to require something a lot closer to a steady flood.

Regards,
Bill Herrin
-- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
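For concreteness, a small sketch of the data-plane handling described above (solicit in the forwarding path, then punt or discard under a policer); the token-bucket policer, rate, and names are assumptions for illustration only:

import time

class TokenBucket:
    """Simple punt policer: allows at most `rate` punts per second."""
    def __init__(self, rate):
        self.rate = float(rate)
        self.tokens = float(rate)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

def data_plane_rx(dst, nd_table, policer, send_ns, forward, punt, drop):
    entry = nd_table.get(dst)
    if entry is not None:
        forward(dst, entry)    # already resolved: forward without punting
        return
    send_ns(dst)               # the data plane multicasts the solicit itself
    if policer.allow():
        punt(dst)              # the control plane may hold the packet
    else:
        drop(dst)              # a genuine neighbor's advertisement will still
                               # install the entry in time for the retransmit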
On Jul 17, 2011, at 1:32 PM, William Herrin wrote:
On Sun, Jul 17, 2011 at 1:35 PM, Jeff Wheeler <jsw@inconcepts.biz> wrote:
On Sun, Jul 17, 2011 at 11:42 AM, William Herrin <bill@herrin.us> wrote:
My off-the-cuff naive solution to this problem would be to discard the oldest incomplete solicitation to fit the new one and, upon receiving an apparently unsolicited response to a discarded solicitation, restart the process flagging that particular query non-discardable.
Do you mean to write, "flagging that ND entry non-discardable?"
Hi Jeff,
I meant flagging the new incomplete solicitation ineligible for previous sentence's early load-based discard. When you get a response to a solicitation you no longer remember making, solicit again and don't forget about it until the normal protocol timeouts this time.
If you're going to solicit again rather than wait for a new packet, what's the point of not just installing a complete entry? After all, if someone sends you a spurious response, they'll likely also be able to respond to the solicit, so, you don't really protect anything by sending the solicit. I figured you stick the ineligible incomplete entry in the table and wait for the retransmit of the original packet.
Where does this naive approach break down?
It breaks down because the control-plane can't handle the relatively small number of punts which must be generated in order to send ND solicits, and without the ability to install "incomplete" entries into the data-plane, those punts cannot be policed without, by design, discarding some "good" punts along with the "bad" punts resulting from DoS traffic.
Let me try to restate what you've said to make sure I understand. When the data plane knows what ARP or ND is underway, it can guard against overwhelming the control plane by discarding excessive traffic for the same yet-unresolved destination while allowing packets for new destinations on the lan through to the control plane. Without that knowledge, it can only have one queue causing the data plane to discard packets which would initiate neighbor discovery prior to the control plane seeing them, preventing any solicitation or implementing the logic I described.
This doesn't sound particularly hard to me.
Most CPE has the control and data planes on the same silicon. A guard at the data plane is pointless in the first place. Just punt the packet up and move on.
I think Jeff's focus here is on the kinds of core and TOR switches that are mostly used in datacenters, not so much the CPE end of the world.
Still, you've sold me on part of the claim: A /64 is inherently vulnerable to a remote DOS attack that a /120 is not.
More accurately, the larger your single subnet address space, the more vulnerable you are to table overflow attacks. A /120 is exactly as vulnerable as an IPv4 /24. A /96 is exactly as vulnerable as an IPv4 /0. With bigger address spaces come new challenges.

In the real world, I think this is less of an issue because:
a. While the attack surface is large, the benefits of carrying out such an attack are relatively small.
b. It's a relatively easy attack to spot, identify, quench, and likely trace.

Owen
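A quick back-of-the-envelope check of those equivalences, counting host addresses per subnet:

# Host addresses per subnet, to check the comparisons above.
v6_slash_120 = 2 ** (128 - 120)   # 256 -- same as an IPv4 /24
v4_slash_24  = 2 ** (32 - 24)     # 256
v6_slash_96  = 2 ** (128 - 96)    # 4,294,967,296 -- same as all of IPv4 (a /0)
v4_slash_0   = 2 ** 32            # 4,294,967,296
print(v6_slash_120, v4_slash_24, v6_slash_96, v4_slash_0)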
On Mon, Jul 11, 2011 at 7:48 PM, Jimmy Hess <mysidia@gmail.com> wrote:
If every vendor's implementation is vulnerable to an NDP exhaustion vulnerability, how come the behavior of specific routers has not been documented specifically?
Well, I am in the business of knowing the behavior of kit being considered by my clients for their applications. Every box breaks when tested, period. I imagine you have tested zero, thus you have no data of your own to go on. No vendors are rushing to spend money on "independent" testing laboratories to produce reports about this, because they pretty much all know their boxes will break (or are not even aware of the potential problem, in the case of a few scary vendors.)
If "zero" devices are not vulnerable, you came to this conclusion because you tested every single implementation against IPv6 NDP DoS, or?
Although I have tested many routers to verify my thinking, if you actually read the slides and understand how routers work, you too will know that every router is vulnerable. If you don't know, you don't understand how routers work. It's that simple.
How come there are no security advisories. What's the CWE or CVE number for this vulnerability?
Again, no one is interested in this problem yet because vendors really don't want their customers to demand more knobs. Cisco is the only vendor who has done anything at all. If you read about their knob, you immediately realize that it is a knob to control the failure mode of the box, not to "fix" anything. Why? It can't be "fixed" without not using /64 (or similar) or going to the extreme lengths I outline in those slides.
It would be useful to at least have the risk properly described, in terms of what kind of DoS condition could arise on specific implementations.
Let's take 6500/SUP720 for example. On this platform, a policer is shared between the need to resolve ARP entries and ND table entries. If you attack a dual-stack SUP720 box it will break not only IPv6 neighbor resolution, but also IPv4 neighbor resolution. This is pretty much the "worst-case scenario" because not only will your IPv6 break, which may annoy customers but not be a disaster; it will also break mission-critical IPv4. That's bad. Routing-protocol adjacencies can be affected, disabling not just some hosts downstream of the box, but also its upstream connectivity. It doesn't get any worse than that.

You are right to question my statements. I'm not an independent lab doing professional tests and showing the environment and conditions of how you can reproduce the results. I'm just a guy helping my clients decide what kit to buy, and how they should configure their networks. The only reason I have bothered to produce slides is because we are at a point where we have end-customers questioning our reluctance to provision /64 networks for mixed-use data-center LANs, and until vendors actually do something to address this, or "the standard" changes, I need to increase awareness of this problem so I am not forced to deploy a broken design on my own networks the way a lot of other clueless people are.

Again, this is only hard to understand (or accept) if you don't know how your routers work.
* why do you think there is an ARP and ND table?
* why do you think there are policers to protect the CPU from excessive ARP/ND punts or traffic?
* do you even know the limit of your boxes' ARP / ND tables? Do you realize that limit is a tiny fraction of one /64?
* do you understand what happens when your ARP/ND policers are reached?
* did you think about the impact on neighboring routers and protocol next-hops, not just servers?
* did you ever try to deploy a /16 on a flat LAN with a lot of hosts and see what happens? Doesn't work too well. A v6 /64 is 281 trillion times bigger than a v4 /16.

There's no big leap of logic here as to why one rogue machine could break your LAN. There is no router which is not vulnerable to this. If you don't believe me, read the Cisco documentation on their knob limiting ND entries per interface, after which there may be service impact on that interface. That's the best anyone is doing right now.

Of course, vendors understand that we, as customers, can configure a subnet smaller than /64. They are leaving us open to link-local issues right now even with a smaller global subnet size, but at least that cannot be exploited from "the Internet." And as it happens, exactly the same features / knobs are needed to "fix" both problems with /64, and with link-local neighbor learning. -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
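To put numbers on the figures above (the ND table limit used here is an illustrative assumption; check your platform's documentation for the real value):

ratio = 2 ** 64 // 2 ** 16                   # 281,474,976,710,656: a /64 vs. a v4 /16
nd_table_limit = 128 * 1024                  # hypothetical hardware ND table size
fraction = nd_table_limit / float(2 ** 64)   # roughly 7e-15 of one /64
print(ratio, nd_table_limit, fraction)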
Again, no one is interested in this problem yet because vendors really don't want their customers to demand more knobs. Cisco is the only vendor who has done anything at all. If you read about their knob, you immediately realize that it is a knob to control the failure mode of the box, not to "fix" anything. Why? It can't be "fixed" without not using /64 (or similar) or going to the extreme lengths I outline in those slides.
While it can't be "fixed", controlling the failure mode is adequate in the vast majority of cases. Beyond that, it becomes increasingly academic and purist-oriented rather than operational. Owen
On Jul 11, 2011, at 7:19 PM, Jeff Wheeler wrote:
Again, this is only hard to understand (or accept) if you don't know how your routers work.
* why do you think there is an ARP and ND table?
* why do you think there are policers to protect the CPU from excessive ARP/ND punts or traffic?
* do you even know the limit of your boxes' ARP / ND tables? Do you realize that limit is a tiny fraction of one /64?
* do you understand what happens when your ARP/ND policers are reached?
* did you think about the impact on neighboring routers and protocol next-hops, not just servers?
* did you ever try to deploy a /16 on a flat LAN with a lot of hosts and see what happens? Doesn't work too well. A v6 /64 is 281 trillion times bigger than a v4 /16.
There's no big leap of logic here as to why one rogue machine could break your LAN.
FYI, in case you're interested in these topics, the IETF working group ARMD was chartered to explore address resolution scale. I'm one of the co-chairs. It's in the Operations Area, and we'd love to have more operators involved - if you're willing to contribute, your input will help set the direction. (If operators don't contribute, it will be just another vendor-led circle... well, you know the score.) For details please see http://tools.ietf.org/wg/armd/charters. Cheers, -Benson
On Mon, Jul 11, 2011 at 3:35 PM, Leo Bicknell <bicknell@ufp.org> wrote:
If the IETF really wanted to get useful operator impact, they would slightly modify their process. On the front end there would be a more clear way for operational types to add to the To-Do list "stuff we really need to make the Internet work better".
Hi Leo, That's an interesting idea, but how does it work? As it stands, I can join a working group mailing list and submit an I-D any time I feel like it. There is almost zero barrier to entry. And I can take it to any step short of the final publication track through the simple expedient of working on it myself. How does the to-do list differ from this? Does it provide some kind of push counter to the folks who hum against publication? How's it work?
Then, some sausage would be made largely without operator involvement (but hey, if you want to participate no exclusions), and then when developmen is about 80-90% done there would be an "operational testing and comment period".
That's another interesting idea. Would you mind gaming it out for me? Use RFC 3484. You have I-D-v6-address-selection 90% ready for publication as RFC 3484. Now what? In their prescience, the operator feedback is going to be "with multiple addresses on a server representing various Internet paths with various reliabilities, we need a way to communicate to the client which addresses to prefer based on our expert knowledge of the reliability of our local network." What elicits that feedback? What do the authors of I-D-v6-address-selection do with the feedback prior to publication? Thanks, Bill Herrin -- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
Leo,

Maybe we can fix this by:
a) bringing together larger groups of clueful operators in the IETF
b) deciding which issues interest them
c) showing up and being vocal as a group in protocol developing working groups

To some degree, we already do this in the IETF OPS area, but judging by your comments, we don't do it nearly enough.

Comments?
Ron

-----Original Message-----
From: Leo Bicknell [mailto:bicknell@ufp.org]
Sent: Monday, July 11, 2011 3:35 PM
To: nanog@nanog.org
Subject: Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)

In a message written on Sun, Jul 10, 2011 at 06:16:09PM +0200, Jeroen Massar wrote:
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
The way the IETF and the operator community interact is badly broken. The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development for example you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also does not help that many operational types are not hardcore programmers, and can't play in the sandbox during the major development cycles.
In a message written on Tue, Jul 12, 2011 at 11:28:58AM -0400, Ronald Bonica wrote:
Maybe we can fix this by:
a) bringing together larger groups of clueful operators in the IETF b) deciding which issues interest them c) showing up and being vocal as a group in protocol developing working groups
To some degree, we already do this in the IETF OPS area, but judging by your comments, we don't do it nearly enough.
I don't think it's that simple, sadly. I'll no doubt get flamed by the 5 people on NANOG who also participate in the IETF on a regular basis, but the reality is most operators don't want to sit through multi-year protocol development work, or have much of anything to do with "pie in the sky" ideas. The IETF can, should, and does do both of those things today. Where the friction occurs is that there is no good place to loop the operators back in, so they are often kicked out, discouraged, or just uninterested on the front end (we're going to go play with new ideas, kids!) and then not brought back in (it's ready for deployment; wait, why are no operators interested?). So it's not that individual issues are of interest to operators (outside of the IETF OPS area, which is a special case); it's that the process needs work.

I'll pick on LISP as an example, since many operators are at least aware of it. Some operators have said we need a locator and identifier split. Interesting feedback. The IETF has gone off and started playing in the sandbox, trying to figure out how to make that go. Several years of coding have occurred, and a bunch of proof-of-concept testing is going on. Even many of the operators who wanted such a split are not really interested in following the details of the work right now. Of course, if you are, you can; I'm not advocating any exclusions.

But there is no roadmap in the IETF process now for LISP that says "We've got this 90% baked, we need to circulate a draft to the NANOG mailing list, request operator comments, and actively solicit operators to participate in the expanded test network". We need that mechanism to tell folks "hey, it's real enough that your operational feedback is now useful" and "come test our new idea".

Today the IETF just finishes their work, "tosses it over the wall" and hopes for the best. Generally it's not 100%, and vendors make proprietary changes to the standards slowly over time to meet the needs of operators. It would be far better if there was at least one round of "ask the operators" and incorporate feedback before it went over the wall, and in particular before working groups disbanded.

In short, make it easy for the operators to participate at the right time in the process. It will be better for everyone! -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
On Tue, Jul 12, 2011 at 8:42 AM, Leo Bicknell <bicknell@ufp.org> wrote:
In a message written on Tue, Jul 12, 2011 at 11:28:58AM -0400, Ronald Bonica wrote:
Maybe we can fix this by:
a) bringing together larger groups of clueful operators in the IETF b) deciding which issues interest them c) showing up and being vocal as a group in protocol developing working groups
To some degree, we already do this in the IETF OPS area, but judging by your comments, we don't do it nearly enough.
I don't think it's that simple, sadly. I'll no doubt get flamed by the 5 people on NANOG who also participate in the IETF on a regular basis, but the reality is most operators don't want to sit through multi-year protocol development work, or have much of anything to do with "pie in the sky" ideas.
The IETF can, should, and does do both of those things today. Where the friction occurs is there is no good place to loop the operators back in, so they are often kicked out, discouraged, or just uninterested on the front end (we're going to go play with new ideas kids!) and then not brought back in (it's ready for deployment, wait, why are no operators interested).
So it's not that individual issues are of interest to operators (outside of the IETF OPS area, which is a special case), it's that the process needs work.
I'll pick on LISP as an example, since many operators are at least aware of it. Some operators have said we need a locator and identifier split. Interesting feedback. The IETF has gone off and started playing in the sandbox, trying to figure out how to make that go. Several years of coding have occurred, and a bunch of proof-of-concept testing is going on. Even many of the operators who wanted such a split are not really interested in following the details of the work right now. Of course, if you are, you can; I'm not advocating any exclusions.
W.R.T. to LISP, in defense of the IETF or the IRTF, i do not believe "the IETF" has told the world that LISP is the best fit for the Internet or solves any specific problem well. The IETF has never said the "Internet Architecture" is going to LISP, and it likely will not / cannot. My expectation is that LISP will go away as quickly as it came. Cameron
But there is no roadmap in the IETF process now for LISP that says "We've got this 90% baked, we need to circulate a draft to the NANOG mailing list, request operator comments, and actively solicit operators to participate in the expanded test network". We need that mechanism to tell folks "hey, it's real enough your operational feedback is now useful" and "come test our new idea".
Today the IETF just finishes their work, "tosses it over the wall" and hopes for the best. Generally it's not 100%, and vendors make proprietary changes to the standards slowly over time to meet the needs of operators. It would be far better if there was at least one round of "ask the operators" and incorporate feedback before it went over the wall, and in particular before working groups disbanded.
In short, make it easy for the operators to participate at the right time in the process. It will be better for everyone!
-- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
W.R.T. to LISP, in defense of the IETF or the IRTF, i do not believe "the IETF" has told the world that LISP is the best fit for the Internet or solves any specific problem well.
The IETF has never said the "Internet Architecture" is going to LISP, and it likely will not / cannot. My expectation is that LISP will go away as quickly as it came.
i will not dispute this, not my point. but i have to respect dino and the lisp fanboys (and, yes, they are all boys) for actually *doing* something after 30 years of loc/id blah blah blah (as did hip). putting their, well dino's, code where their mouths were and going way out on a limb. i am *not* saying i would run it in an operational network. but maz-san and i were happy to help the experiment by dropping the first asian node in a test rack on the public net. randy
On Jul 12, 2011 5:21 PM, "Randy Bush" <randy@psg.com> wrote:
W.R.T. to LISP, in defense of the IETF or the IRTF, i do not believe "the IETF" has told the world that LISP is the best fit for the Internet or solves any specific problem well.
The IETF has never said the "Internet Architecture" is going to LISP, and it likely will not / cannot. My expectation is that LISP will go away as quickly as it came.
i will not dispute this, not my point. but i have to respect dino and the lisp fanboys (and, yes, they are all boys) for actually *doing* something after 30 years of loc/id blah blah blah (as did hip). putting their, well dino's, code where their mouths were and going way out on a limb.
Understood. But watch for similarities between 6to4 and LISP. Both are clever, both have great intentions, both are extremely dangerous once people start thinking this is anything beyond a toy.

And when lowly plebeians like myself hear that research folks at iij and Facebook are doing "something" with LISP, we think that is a blessing of this technology. But, after the "fan boy" chatter dies down, you hear that this is not actually support, it's just engineers doing "Dino" a favor.

I admittedly dismissed LISP early on and do not understand its merits; the idea of IP-in-IP tunnels as the new internet architecture gives me indigestion. I am also concerned about the questionable business case of why edge networks would make investments to bail out DFZ providers (the main point of LISP). If ipv6 was a hard sell, I can't even imagine making LISP get traction. If the economics are not right, it will never fly, and the economics of LISP are all wrong. Please, spare me the line about how LISP is just a knob I turn and has no cost.

I fear that at its worst and most successful, LISP ensures ipv4 is the backbone transport media to the detriment of ipv6, and at its best, it is a distraction for folks that need to be making ipv6 work, for real.

Cb

PS. I think the research guys should give more time to ILNP and creating a graceful unwind of ipv4 and NAT. The dividends from ipv6 only start to really pay when ipv4 becomes optional. My 2 cents, and no more.
i am *not* saying i would run it in an operational network. but maz-san and i were happy to help the experiment by dropping the first asian node in a test rack on the public net.
randy
i will not dispute this, not my point. but i have to respect dino and the lisp fanboys (and, yes, they are all boys) for actually *doing* something after 30 years of loc/id blah blah blah (as did hip). putting their, well dino's, code where their mouths were and going way out on a limb.
[ i have been correctly reminded that dino is far from the only lisp hacker these days, e.g. http://www.openlisp.org/ being notable. ]
Understood. But watch for similarities between 6to4 and LISP. Both are clever, both have great intentions, both are extremely dangerous once people start thinking this is anything beyond a toy.
again, i will not dispute this. it is not my point.
And when lowly plebeians like myself hear that research folks at iij and Facebook are doing "something" with LISP, we think that is a blessing of this technology.
when you hear that research folk are doing something, the best guess would be that it's research. :)
But, after the "fan boy" chatter dies down, you hear that this is not actually support, it's just engineers doing "Dino" a favor.
not exactly. someone i respect is doing some r&d. we do r&d. we all help each other. vendors are kind enough to loan kit to researchers. this does not mean they endorse all of our r&d projects, just that they endorse and help r&d. our job is to make the internet a better place. on the ops side, when things break, isps all help each other, loan line cards, do remote hands, etc., whether our marketing departments compete or not. our job is to keep the internet running well.
I fear that at its worst and most successful, LISP ensures ipv4 is the backbone transport media to the detriment of ipv6 and at its best, it is a distraction for folks that need to be making ipv6 work, for real.
i suspect that a number of lisp proponents are of that mind. i do not think it does a service to the internet.
PS. I think the research guys should give more time to ILNP
looks interesting but i am unaware of a public code base or research testbed. whack me with a clue bat. randy
On Wed, Jul 13, 2011 at 2:27 AM, Randy Bush <randy@psg.com> wrote:
I fear that at its worst and most successful, LISP ensures ipv4 is the backbone transport media to the detriment of ipv6 and at its best, it is a distraction for folks that need to be making ipv6 work, for real.
i suspect that a number of lisp proponents are of that mind. i do not think it does a service to the internet.
My understanding is that transport over v6 is indeed on everyone's mind and absolutely is a goal for all the LISP people. So on this particular point, your concern is being addressed. What LISP has not done is actually improve the root problem of scaling up the number of multi-homed networks or locators. The cache scheme works if you imagine an ideal Internet where there is no DoS, but otherwise, it does not work. All the same problems of flow-cache routing still exist and LISP actually makes them worse in some cases, not better. It also adds huge complexity and risk but what value it adds (outside of VPN-over-Internet) is questionable at best. -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
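To make the cache concern above concrete, here is a toy sketch in Python of an ITR-style map-cache. The class name, cache size, prefix granularity, and eviction policy are all invented for illustration and are not taken from any LISP draft or implementation; the point is only to show why traffic spread across many previously unseen destinations turns directly into cache churn and control-plane queries, which is the flow-cache/DoS worry being raised here.

import ipaddress
import random

CACHE_LIMIT = 10_000           # pretend map-cache capacity of the ITR (made up)

class ToyITR:
    def __init__(self):
        self.map_cache = {}    # cached EID-prefix -> locator entries (grossly simplified)
        self.map_requests = 0  # control-plane queries sent on cache misses

    def forward(self, dst):
        # pretend mappings are handed out per /24; real granularity varies
        prefix = ipaddress.ip_network(dst).supernet(new_prefix=24)
        if prefix not in self.map_cache:
            self.map_requests += 1                  # miss: ask the mapping system
            if len(self.map_cache) >= CACHE_LIMIT:  # crude FIFO eviction when full
                self.map_cache.pop(next(iter(self.map_cache)))
            self.map_cache[prefix] = "rloc-for-" + str(prefix)
        return self.map_cache[prefix]

itr = ToyITR()
# Traffic (or an attacker) touching random destinations defeats locality:
# nearly every packet is a miss, the cache churns, and the mapping system
# sees roughly one query per packet.
for _ in range(50_000):
    itr.forward(f"{random.randrange(1, 224)}.{random.randrange(256)}."
                f"{random.randrange(256)}.{random.randrange(256)}")
print(len(itr.map_cache), itr.map_requests)

Run against destinations with good locality, the cache stays small and quiet; run against randomized destinations, as above, nearly every packet is a miss.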
btw, a little birdie told me to take another look at
6296 IPv6-to-IPv6 Network Prefix Translation. M. Wasserman, F. Baker. June 2011. (Format: TXT=73700 bytes) (Status: EXPERIMENTAL)
which also could be considered to be in the loc/id space
randy
On Wed, Jul 13, 2011 at 10:09, Randy Bush <randy@psg.com> wrote:
btw, a litte birdie told me to take another look at
6296 IPv6-to-IPv6 Network Prefix Translation. M. Wasserman, F. Baker. June 2011. (Format: TXT=73700 bytes) (Status: EXPERIMENTAL)
which also could be considered to be in the loc/id space
randy
No, that's a misuse of "loc/id" since no identification is involved, even at the network layer -- but it is in the "reduce issues in global routing and local renumbering" space (that's part of what LISP does). Cameron: As for ILNP, it's going to be difficult to get from where things are now to a world where ILNP is not just useless overhead. When you finally do, considering what it gives you, will the journey have been worth it? LISP apparently has more benefits, and NPT6 is so much easier -- particularly if you have rapid adaptation to apparent address changes, which many apps have and all mobile devices need already -- sorry but I don't think ILNP is going to make it. You can't just say "the IETF should pay more attention". I've invited people to promote it and nobody stepped up. Scott
On Jul 13, 2011 7:39 AM, "Scott Brim" <scott.brim@gmail.com> wrote:
On Wed, Jul 13, 2011 at 10:09, Randy Bush <randy@psg.com> wrote:
btw, a litte birdie told me to take another look at
6296 IPv6-to-IPv6 Network Prefix Translation. M. Wasserman, F. Baker. June 2011. (Format: TXT=73700 bytes) (Status: EXPERIMENTAL)
which also could be considered to be in the loc/id space
randy
No, that's a misuse of "loc/id" since no identification is involved, even at the network layer -- but it is in the "reduce issues in global routing and local renumbering" space (that's part of what LISP does).
Cameron: As for ILNP, it's going to be difficult to get from where things are now to a world where ILNP is not just useless overhead. When you finally do, considering what it gives you, will the journey have been worth it? LISP apparently has more benefits, and NPT6 is so much easier -- particularly if you have rapid adaptation to apparent address changes, which many apps have and all mobile devices need already -- sorry but I don't think ILNP is going to make it. You can't just say "the IETF should pay more attention". I've invited people to promote it and nobody stepped up.
"Difficult" depends on your time horizon. Ipv6 is/was difficult. Sctp is difficult, but I remain bullish on its value. ILNP may be more difficult, but i believe it is strategically correct. We can disagree on merits of competing RESEARCH topics. I am just providing "ops feedback ", to bring this thread full circle. Lastly, we must make sure that LISP does not become the next 6to4 where good intentions for RESEARCH become a quantifiable network nightmare. Cb
On Wed, Jul 13, 2011 at 10:07 AM, Cameron Byrne <cb.list6@gmail.com> wrote:
On Jul 13, 2011 7:39 AM, "Scott Brim" <scott.brim@gmail.com> wrote:
On Wed, Jul 13, 2011 at 10:09, Randy Bush <randy@psg.com> wrote:
btw, a litte birdie told me to take another look at
6296 IPv6-to-IPv6 Network Prefix Translation. M. Wasserman, F. Baker. June 2011. (Format: TXT=73700 bytes) (Status: EXPERIMENTAL)
which also could be considered to be in the loc/id space
randy
No, that's a misuse of "loc/id" since no identification is involved, even at the network layer -- but it is in the "reduce issues in global routing and local renumbering" space (that's part of what LISP does).
Cameron: As for ILNP, it's going to be difficult to get from where things are now to a world where ILNP is not just useless overhead. When you finally do, considering what it gives you, will the journey have been worth it? LISP apparently has more benefits, and NPT6 is so much easier -- particularly if you have rapid adaptation to apparent address changes, which many apps have and all mobile devices need already -- sorry but I don't think ILNP is going to make it. You can't just say "the IETF should pay more attention". I've invited people to promote it and nobody stepped up.
"Difficult" depends on your time horizon. Ipv6 is/was difficult. Sctp is difficult, but I remain bullish on its value. ILNP may be more difficult, but i believe it is strategically correct.
We can disagree on merits of competing RESEARCH topics. I am just providing "ops feedback ", to bring this thread full circle.
Lastly, we must make sure that LISP does not become the next 6to4 where good intentions for RESEARCH become a quantifiable network nightmare.
i would agree that LISP hasn't necessarily improved the root problem posed. however, on this front it hasn't done any harm either. the intriguing elements with LISP, for me personally, are in all of the adjunct capabilities that an L/I split enables. there are some very valid and interesting applications that this enables and some novel technology capabilities that are exercised (useful endpoint mobility, novel load balancing, encap data plane liveness, etc.). researching and getting our hands dirty as an industry with these technologies has considerable value. without actually poking at running code and pushing bits over these interfaces we run the risk of letting the perfect be the enemy of the good. i like the fact that this research lets us gauge how far from perfect the current state of the art is.

fwiw - while there are folks that see LISP as an impending ops nightmare (if you don't like it, don't use it), there are a number of folks for whom it provides compelling solutions to real problems that they have, and they're keen on using it to solve those problems or explore the solution space. to that end i don't know that we need to make sure that LISP doesn't become anything. we need to find solutions to problems, rationally explore those solutions, and incrementally enhance them.

yes, i participate in the LISP research test bed in my (very) small way.

-- steve ulrich (sulrich@botwerks.*)
On Jul 13, 2011, at 10:39 AM, Scott Brim wrote:
Cameron: As for ILNP, it's going to be difficult to get from where things are now to a world where ILNP is not just useless overhead. When you finally do, considering what it gives you, will the journey have been worth it? LISP apparently has more benefits, and NPT6 is so much easier -- particularly if you have rapid adaptation to apparent address changes, which many apps have and all mobile devices need already -- sorry but I don't think ILNP is going to make it. You can't just say "the IETF should pay more attention". I've invited people to promote it and nobody stepped up.
I think ILNP is a great solution. My concern with it is that the needed changes to TCP and UDP are not likely to happen.
On Wed, Jul 13, 2011 at 11:09, Fred Baker <fred@cisco.com> wrote:
I think ILNP is a great solution. My concern with it is that the needed changes to TCP and UDP are not likely to happen.
I guess I should clarify: I think ILNP is elegant. But the real Internet evolves incrementally, and only as needed. Other trajectories are much more likely.
On Jul 13, 2011, at 10:39 AM, Scott Brim wrote:
On Wed, Jul 13, 2011 at 10:09, Randy Bush <randy@psg.com> wrote:
btw, a litte birdie told me to take another look at
6296 IPv6-to-IPv6 Network Prefix Translation. M. Wasserman, F. Baker. June 2011. (Format: TXT=73700 bytes) (Status: EXPERIMENTAL)
which also could be considered to be in the loc/id space
randy
No, that's a misuse of "loc/id" since no identification is involved, even at the network layer -- but it is in the "reduce issues in global routing and local renumbering" space (that's part of what LISP does).
interesting, because that is exactly what Mike O'Dell suggested it as - a prefix/identification (loc/id) split. If you follow your line of reasoning, ILNP doesn't provide an identifier (as the term is defined in RFC 1992), and neither does LISP except as it redefines the terms to make it do so. You're looking for something along the lines of HIP - which has other problems.

I would describe NPTv6 as a location/identifier split in the sense that it makes the endpoint identifier in the IPv6 address independent of the ISP's prefix - the PA (and therefore aggregatable) prefixes used outside the edge network are translated to the prefix used within the shop, and the host doesn't have to mess with them. As you point out, PA prefixes help with the route table - we aren't carrying infinite numbers of PI prefixes.

To my way of thinking, shim6 was DOA if anything because it transferred the complexity of managing the route table from the transit networks to the edge networks, and the edge networks lacked both the expertise and the desire to deal with it. Folks are trampling the RIRs to get PI prefixes to avoid the multi-prefix model. But making the route table aggregate requires PA prefixes. Deploying ILNP (which is in many ways superior) requires a change to the TCP/UDP pseudoheader. Deploying NPTv6 makes the edge network look PA to the transit network, PI to the edge network, and doesn't change TCP. There is a headache with http/sip/etc referrals, which are better served if they use domain names anyway. But to my mind referrals have a solution if people choose to use it, so it's a solvable problem. So to me, NPTv6 fits pretty nicely.
Scott,

I am not so sure that Randy's suggestion can be dismissed out of hand. When we started down the path of locator/identifier separation, we did so because the separation of locators and identifiers might solve some real operational problems. We were not so interested in architectural purity.

At this point, it might be interesting to do the following:
- enumerate the operational problems solved by LISP
- enumerate the subset of those problems also solved by RFC 6296
- execute a cost/benefit analysis on both solutions

Ron
-----Original Message----- From: Scott Brim [mailto:scott.brim@gmail.com] Sent: Wednesday, July 13, 2011 10:39 AM To: Randy Bush Cc: North American Network Operators' Group Subject: Re: in defense of lisp (was: Anybody can participate in the IETF)
On Wed, Jul 13, 2011 at 10:09, Randy Bush <randy@psg.com> wrote:
btw, a litte birdie told me to take another look at
6296 IPv6-to-IPv6 Network Prefix Translation. M. Wasserman, F. Baker. June 2011. (Format: TXT=73700 bytes) (Status: EXPERIMENTAL)
which also could be considered to be in the loc/id space
randy
No, that's a misuse of "loc/id" since no identification is involved, even at the network layer -- but it is in the "reduce issues in global routing and local renumbering" space (that's part of what LISP does).
Cameron: As for ILNP, it's going to be difficult to get from where things are now to a world where ILNP is not just useless overhead. When you finally do, considering what it gives you, will the journey have been worth it? LISP apparently has more benefits, and NPT6 is so much easier -- particularly if you have rapid adaptation to apparent address changes, which many apps have and all mobile devices need already -- sorry but I don't think ILNP is going to make it. You can't just say "the IETF should pay more attention". I've invited people to promote it and nobody stepped up.
Scott
On Jul 13, 2011, at 12:02 PM, Ronald Bonica wrote:
At this point, it might be interesting to do the following:
- enumerate the operational problems solved by LISP - enumerate the subset of those problems also solved by RFC 6296 - execute a cost/benefit analysis on both solutions
I'll let a LISP advocate state the values of LISP. My perception: it's a lot of overhead for what you actually get, comparable to building what Cisco once called "fast switching" into the network.

In looking at 6296, I was trying to find a way to make edge networks willing to use PA addresses instead of PI. If you have one ISP and never want to change ISPs, PA is wonderful; if you have multiple ISPs, the prevailing multihoming model in the IETF calls for you to have a subnet from each of your upstream prefixes on each LAN and to have your host divine which address pair implies the most acceptable route to your destination. If you have any ISP's prefix on your LAN and you want to remove the ISP (change to a different one, stop using one, whatever), you are somehow buried in renumbering (see RFC 4192). Edge networks are not crazy about renumbering, and they're not crazy about having a prefix per ISP on each LAN - hence PI.

So, to get edge networks to use PA addresses, I reason that the edge network needs an address that is not derived from its upstream, and it has to be translated to the prefix of the upstream. The other factor is avoiding any change to TCP/UDP, which is where the checksum-neutral update comes in. So to my way of thinking, NPTv6 provides a way to statelessly (i.e., scalably) enable any host to talk with any host, make the edge network look PA to the upstream, keep the manageability characteristics of PI inside the edge network, and not have to change TCP/UDP. LISP, to my knowledge, provides no way to push back on route table growth (it moves it from the transit network to the edge network, but the edge network still has to deal with it). To my mind, if you liked stateful NAT in IPv4, you'll like stateless NPTv6 in IPv6 better.

With that, I'll return you to your more operational musings. I'm with the IETF. Please feel free to inform the world on how clueless I am operationally. I'm already convinced of the fact; that's why I talk with and listen to operators.
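To make the stateless translation idea above concrete, here is a rough Python sketch of the kind of checksum-neutral prefix rewrite RFC 6296 describes. It handles only the simple /48 case, the prefixes and address are made up, and it deliberately glosses over the RFC's corner cases (the 0xFFFF handling and prefixes longer than /48), so treat it as an illustration rather than the RFC's exact algorithm.

import ipaddress

def words(addr):
    b = ipaddress.IPv6Address(addr).packed
    return [int.from_bytes(b[i:i + 2], "big") for i in range(0, 16, 2)]

def ones_sum(ws):
    # 16-bit one's-complement sum (fold carries back in)
    s = sum(ws)
    while s > 0xFFFF:
        s = (s & 0xFFFF) + (s >> 16)
    return s

def translate(addr, internal_prefix, external_prefix):
    # /48-only toy: swap the routing prefix, then adjust word 3 so the
    # one's-complement sum of the whole address is unchanged.
    w = words(addr)
    iw, ew = words(internal_prefix), words(external_prefix)
    w[0:3] = ew[0:3]
    delta = ones_sum(iw[0:3]) - ones_sum(ew[0:3])
    adjusted = w[3] + delta
    while adjusted > 0xFFFF:
        adjusted = (adjusted & 0xFFFF) + (adjusted >> 16)
    while adjusted < 0:
        adjusted += 0xFFFF          # adding 0xFFFF is adding zero, mod 2^16 - 1
    w[3] = adjusted
    return ipaddress.IPv6Address(b"".join(x.to_bytes(2, "big") for x in w))

inside = "fd01:203:405:1::1234"     # hypothetical ULA host address
outside = translate(inside, "fd01:203:405::", "2001:db8:1::")
print(outside)                      # 2001:db8:1:d550::1234
print(ones_sum(words(inside)) == ones_sum(words(str(outside))))   # True

The point is simply that because the 16-bit one's-complement sum of the address is preserved, TCP and UDP checksums stay valid without the translator keeping any per-connection state.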
On Jul 13, 2011, at 11:02 PM, Ronald Bonica wrote:
- enumerate the operational problems solved by LISP
Separation of locator/ID is a fundamental architectural principle which transcends transport-specific (i.e., IPv4/IPv6) considerations. It allows for node/application/services agility, and in the case of the IPv4/IPv6 Internet, besides providing a way to solve mobility and to do on-demand dynamic provisioning/on-the-fly-reprovisioning of communications relationships, finally starts us down the long-overdue evolution towards an eventual fully out-of-band control plane. Controlling routing-table excursion in the IPv4/IPv6 Internet was/is the tactical problem that LISP was/is intended to address (pardon the pun), but the above long-term strategic benefits are its real value, IMHO.
- enumerate the subset of those problems also solved by RFC 6296
In light of the above, I view LISP and RFC6296 as orthogonal to one another. I also view RFC6296 as a perpetuation of the clear violation of the end-to-end principle (i.e., ' . . . functions placed at low levels of a system may be redundant or of little value when compared with the cost of providing them at that low level . . .') embodied in the abomination of NAT/PAT into IPv6, and the consequent instantiation of yet more unnecessary and harmful state into networks which are already deep in the throes of autogenic thromboembolism. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> The basis of optimism is sheer terror. -- Oscar Wilde
I also view RFC6296 as a perpetuation of the clear violation of the end-to-end principle (i.e., ' . . . functions placed at low levels of a system may be redundant or of little value when compared with the cost of providing them at that low level . . .') embodied in the abomination of NAT/PAT into IPv6, and the consequent instantiation of yet more unnecessary and harmful state into networks which are already deep in the throes of autogenic thromboembolism.
great rant. not to quibble but i thought 6296 was stateless. randy
On Jul 14, 2011, at 10:49 AM, Randy Bush wrote:
not to quibble but i thought 6296 was stateless.
AFAICT, the translators themselves are just rewriting addresses and not paying attention to 'connections', which is all to the good. But then we get to this: ----- 5.2. Recommendations for Application Writers Several mechanisms (e.g., STUN [RFC5389], Traversal Using Relays around NAT (TURN) [RFC5766], and Interactive Connectivity Establishment (ICE) [RFC5245]) have been used with traditional IPv4 NAT to circumvent some of the limitations of such devices. Similar mechanisms could also be applied to circumvent some of the issues with an NPTv6 Translator. However, all of these require the assistance of an external server or a function co-located with the translator that can tell an "internal" host what its "external" addresses are. ----- ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> The basis of optimism is sheer terror. -- Oscar Wilde
Op 13-7-2011 16:09, Randy Bush schreef:
btw, a litte birdie told me to take another look at
The free, open source, FreeBSD-based pfSense firewall supports this. Not everyone can get BGP - specifically calling out residential connections here. As a 1:1 NAT mechanism it works pretty well: I can reach the outside, and the outside can reach me, which I think is what was intended in the specification. And pretty much the internet. It took me 4 months to write the IPv6 support in pfSense to what it is today, which is not feature complete. But the NPT part was just a few hours in the grand scheme of things. I've also let the nice people behind the draft know that we support it. Since then we've got v4 and v6 with BGP at work, so it's moot. But I digress.

Kind regards,

Seth Mos
pfSense developer.
6296 IPv6-to-IPv6 Network Prefix Translation. M. Wasserman, F. Baker. June 2011. (Format: TXT=73700 bytes) (Status: EXPERIMENTAL)
which also could be considered to be in the loc/id space
randy
On Jul 13, 2011 7:50 AM, "Seth Mos" <seth.mos@dds.nl> wrote:
Op 13-7-2011 16:09, Randy Bush schreef:
btw, a litte birdie told me to take another look at
The free Open Source FreeBSD based pfSense firewall supports this. Not everyone can get BGP, specifically calling out residential connections
here.
As a 1:1 NAT mechanism it works pretty well, I can reach the outside, and the outside can reach me. Which I think is what was intended in the specifications. And pretty much the internet.
It took me 4 months to write the IPv6 support in pfSense to what it is today. Which is not feature complete. But the NPT part was just a few hours in the grand scheme.
I've also contacted the nice people from the draft that we support it.
Since then we've got v4 and v6 with BGP at work so it's moot. But I
digress.
Kind regards,
Seth Mos pfSense developer.
Thank you for your work. CB
6296 IPv6-to-IPv6 Network Prefix Translation. M. Wasserman, F. Baker. June 2011. (Format: TXT=73700 bytes) (Status: EXPERIMENTAL)
which also could be considered to be in the loc/id space
randy
you want to give ops feedback to the ietf, well ... i suggest a loc/id session at the next nanog, 20-30 mins each for LISP, ILNP, and 6296, where each is explained at an architectural level in some detail, along with a predetermined list of questions such as "how does this address loc/id separation, routing table scaling, incremental deployment, state of implementation/testing, ..." and then a half hour where someone sums up the similarities and differences. and someone writes it up. randy
On Tue, Jul 12, 2011 at 11:42 AM, Leo Bicknell <bicknell@ufp.org> wrote:
I'll pick on LISP as an example, since many operators are at least aware of it. Some operators have said we need a locator and identifier split. Interesting feedback. The IETF has gone off and started playing in the sandbox, trying to figure out how to make that go.
As an operator (who understands how most things work in very great detail), I found the LISP folks very much uninterested in my concerns about whether LISP can ever be made to scale up to "Internet-scale," with respect to a specific DDoS vector. I also think that an explosion of small, multi-homed SOHO networks would be a disaster, because we might have 3 million FIB entries instead of 360k after a few years. These things are directly related to each other, too.

So I emailed some LISP gurus off-list and discussed my concern. I was encouraged to post to the LISP IETF list, which I did. To my great surprise, not one single person was interested in my problem. If you think it is a small problem, well, you should try going back to late-1990s flow-cache routing in your data-center networks and see what happens when you get DDoS. I am sure most of us remember some of those painful experiences.

Now there is a LISP "threats" draft which the working group mandates they produce, discussing various security problems. The current paper is a laundry list of "what if" scenarios, like, what if a malicious person could fill the LISP control-plane with garbage. BGP has the same issue: if some bad guy had enable on a big enough network that their peers/transits don't filter their routes, they could do a lot of damage before they were stopped. This sometimes happens even by accident, for example, some poor guy accidentally announcing 12/9 and giving AT&T a really bad day.

What it doesn't contain is anything relevant to the special-case DDoS that all LISP sites would be vulnerable to, due to the IMO bad flow-cache management system that is specified. I am having a very great deal of trouble getting the authors of the "threats" document to even understand what the problem is, because as one of them put it, he is "just a researcher." I am sure he and his colleagues are very smart guys, but they clearly do not remember our 1990s pains. That is the "not an operator" problem. It is understandable.

Others who have been around long enough simply dismiss this problem, because they believe the unparalleled benefits of LISP for mobility and multi-homing SOHO sites must greatly outweigh the fact that, well, if you are a content provider and you receive a DDoS, your site will be down and there isn't a damn thing you can do about it, other than spec routers that have way, way more FIB than the number of possible routes, again due to the bad caching scheme.

The above is what I think is the "ego-invested" problem, where certain pretty smart, well-intentioned people have a lot of time, and professional credibility, invested in making LISP work. I'm sure it isn't pleasing for these guys to defend their project against my argument that it may never be able to reach Internet-scale, and that they have missed what I claim is a show-stopping problem, with an easy way to improve it, through several years of development. Especially since I am a guy who did not ever participate in the IETF before, someone they don't know from a random guy on the street.

I am glad that this NANOG discussion has got some of these LISP folks to pay more attention to my argument, and my suggested improvement (I am not only bashing their project; I have positive input, too). Simply posting to their mailing list once and emailing a few draft authors did not cause any movement at all. Evidently it does get attention, though, to jump up and down on a different list. Go figure!
If operators don't provide input and *perspective* to things like LISP, we will end up with bad results. How many of us are amazed that we still do not have 32:32-bit BGP communities to go along with 32-bit ASNs, for signalling requests to transit providers without collision with other networks' community schemes? It is a pretty stupid situation, and yet here we are, with 32-bit ASNs for years, and if you want to do advertisement control with 32-bit ASNs in use, you are either mapping your 32-bit neighbors to special numbers, or your community scheme can overlap with others.

That BGP community problem is pretty tiny compared to what happens if people really start rolling out something new and clever like LISP, but in a half-baked, broken way that takes us back to the 1990s era of a small DDoS taking out a whole data-center aggregation router. A lot of us think IPv6 is over-baked and broken, and probably this is why it has taken such a very long time to get anywhere with it. But ultimately, it is our fault for not participating.

I am reversing my own behavior and providing input to some WGs I care about, in what time I have to do so. More operators should do the same. Otherwise, we have no right to blame the people who do participate in the IETF, because we aren't part of the solution.

-- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
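For readers who have not run into the community problem described above: an RFC 1997 standard community is a single 32-bit value, conventionally read as a 16-bit ASN followed by a 16-bit local value, so a 4-byte ASN simply cannot occupy the ASN half. A tiny illustration in Python, with a made-up function name and hypothetical values:

def encode_standard_community(asn, value):
    # an RFC 1997 community is one 32-bit value, by convention ASN:value
    # with 16 bits each; there is no room for a 4-byte ASN in the top half
    if asn > 0xFFFF or value > 0xFFFF:
        raise ValueError("standard communities only hold 16-bit halves")
    return (asn << 16) | value

print(hex(encode_standard_community(65000, 120)))    # fits: 0xfde80078
try:
    encode_standard_community(4200000000, 120)        # a 4-byte ASN does not fit
except ValueError as exc:
    print(exc)

Hence the workarounds mentioned above: mapping 32-bit neighbors to reserved 16-bit numbers, or accepting that your community scheme may collide with someone else's.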
Jeff, On Jul 12, 2011, at 20:13 , Jeff Wheeler wrote:
On Tue, Jul 12, 2011 at 11:42 AM, Leo Bicknell <bicknell@ufp.org> wrote:
I'll pick on LISP as an example, since many operators are at least aware of it. Some operators have said we need a locator and identifier split. Interesting feedback. The IETF has gone off and started playing in the sandbox, trying to figure out how to make that go.
As an operator (who understands how most things work in very great detail),
Granted. You are the real world expert. Now can you stop repeating this in each email and move on?
I found the LISP folks very much uninterested in my concerns about if LISP can ever be made to scale up to "Internet-scale," with respect to a specific DDoS vector.
This is completely false. Several people acknowledged the threat you pointed out and gave you credit for it.
I also think that an explosion of small, multi-homed SOHO networks would be a disaster, because we might have 3 million FIB instead of 360k FIB after a few years. These things are directly related to each-other, too.
So I emailed some LISP gurus off-list and discussed my concern. I was encouraged to post to the LISP IETF list, which I did. To my great surprise, not one single person was interested in my problem.
This is again false. We had email exchanges both privately and on the mailing list. We proposed text to you to be added to the threats draft, but you did not like it. We have asked you to propose text, but have had no answer from you on this point.
If you think it is a small problem, well, you should try going back to late-1990s flow-cache routing in your data-center networks and see what happens when you get DDoS. I am sure most of us remember some of those painful experiences.
Now there is a LISP "threats" draft which the working group mandates they produce, discussing various security problems. The current paper is a laundry list of "what if" scenarios, like, what if a malicious person could fill the LISP control-plane with garbage. BGP has the same issue, if some bad guy had enable on a big enough network that their peers/transits don't filter their routes, they could do a lot of damage before they were stopped.
So you are saying that BGP can be the victim of similar attacks/problems.... still... if you are reading this email it means that the Internet is still running...
This sometimes happens even by accident, for example, some poor guy accidentally announcing 12/9 and giving AT&T a really bad day.
What it doesn't contain is anything relevant to the special-case DDoS that all LISP sites would be vulnerable to, due to the IMO bad flow-cache management system that is specified.
If you still think that LISP is using a flow-cache, you should give the set of drafts a second read.
I am having a very great deal of trouble getting the authors of the "threats" document to even understand what the problem is,
For the third time: this is false. We got the problem; we were asking for more specific information in order to quantify the risk. We asked for your help in stating the problem and explained to you where the solution should be addressed. But you seem to be stuck on the operator vs. researcher discussion, which IMHO is just pointless.
because as one of them put it, he is "just a researcher." I am sure he and his colleagues are very smart guys, but they clearly do not remember our 1990s pains.
That is the "not an operator" problem. It is understandable.
Others who have been around long enough simply dismiss this problem, because they believe the unparalleled benefits of LISP for mobility and multi-homing SOHO sites must greatly out-weigh the fact that, well, if you are a content provider and you receive a DDoS, your site will be down and there isn't a damn thing you can do about it, other than spec routers that have way, way more FIB than the number of possible routes, again due to the bad caching scheme.
The above is what I think is the "ego-invested" problem, where certain pretty smart, well-intentioned people have a lot of time, and professional credibility, invested in making LISP work. I'm sure it isn't pleasing for these guys to defend their project against my argument that it may never be able to reach Internet-scale, and that they have missed what I claim is a show-stopping problem with an easy way to improve it through several years of development. Especially since I am a guy who did not ever participate in the IETF before, someone they don't know from a random guy on the street.
I am glad that this NANOG discussion has got some of these LISP folks to pay more attention to my argument, and my suggested improvement (I am not only bashing their project; I have positive input, too.) Simply posting to their mailing list once and emailing a few draft authors did not cause any movement at all. Evidently it does get attention, though, to jump up and down on a different list. Go figure!
If operators don't provide input and *perspective* to things like LISP, we will end up with bad results.
True. Technical feedback is most welcome. Let me now ask a simple question: why are you so strongly against LISP? You do not like it? Fine, other people do. You do not believe in it and do not see any value? Fine, other people do. You think that there are issues that cannot be solved? Fine, other people believe those issues can be solved and are scratching their heads to find deployable solutions. As I said before, your technical experience and feedback are most welcome, but let's try to focus only on the technical level.

thanks

Luigi Iannone
Hello Jeff, On 13 Jul 2011, at 10:08, Luigi Iannone wrote:
Jeff,
On Jul 12, 2011, at 20:13 , Jeff Wheeler wrote:
On Tue, Jul 12, 2011 at 11:42 AM, Leo Bicknell <bicknell@ufp.org> wrote:
I'll pick on LISP as an example, since many operators are at least aware of it. Some operators have said we need a locator and identifier split. Interesting feedback. The IETF has gone off and started playing in the sandbox, trying to figure out how to make that go.
As an operator (who understands how most things work in very great detail), Granted. You are the real world expert. Now can you stop repeating this in each email and move on?
I found the LISP folks very much uninterested in my concerns about if LISP can ever be made to scale up to "Internet-scale," with respect to a specific DDoS vector.
This is completely false. Several people gave credit to you about the existence of the threat you pointed out.
Definitely; we have been working on LISP since 2007 and always keep scalability in mind. All our research is constrained by scalability. What you pointed out is very interesting and we are working on it. Nevertheless, and I am just repeating myself now, the problem you expose is an operational problem that can also have security implications. This is why we proposed to you several times to present the problem at the next IETF so that we can all discuss it and propose solutions to be added to the main specs. The cache management problem is really interesting, and we could even imagine a draft like "implementation guidelines" that would explain how to implement the caches in a smart way. The solution to the problem you are pointing out could also be proposed in the deployment draft, as it could be alleviated with the help of filters.
I also think that an explosion of small, multi-homed SOHO networks would be a disaster, because we might have 3 million FIB instead of 360k FIB after a few years. These things are directly related to each-other, too.
So I emailed some LISP gurus off-list and discussed my concern. I was encouraged to post to the LISP IETF list, which I did. To my great surprise, not one single person was interested in my problem.
This is again false. We had mail exchange both privately and on the mailinglist. We proposed to you text to be added to the threats draft but you did not like it. We are asking to propose text but we have no answer from you on this point.
Among the people interested in the problem are the threats authors. You received personal replies from me on Jul 6th @2:08pm and Jul 7th @1:13am. In our SVN, the draft is already updated with your comments and you are in the acks. The version will be submitted just after the IETF, with the comments that come out of the presentation we will make during IETF81. I understand that you would like to have your comments addressed immediately when you think of them, but sometimes it is nice to take a few days to think about the problem in order to produce a more robust description and thus build more sustainable ways to a solution.
If you think it is a small problem, well, you should try going back to late-1990s flow-cache routing in your data-center networks and see what happens when you get DDoS. I am sure most of us remember some of those painful experiences.
Now there is a LISP "threats" draft which the working group mandates they produce, discussing various security problems. The current paper is a laundry list of "what if" scenarios, like, what if a malicious person could fill the LISP control-plane with garbage. BGP has the same issue, if some bad guy had enable on a big enough network that their peers/transits don't filter their routes, they could do a lot of damage before they were stopped.
So you are saying that BGP can be victim of similar attacks/problem.... still... if you are reading this email it means that the Internet is still running...
Some people, like Randy Bush, try to find solutions so that BGP is not destroyed in the presence of malicious guys. For BGP as for LISP, things take time simply because the problem is definitely more complex, as we always have to make tradeoffs between various contradictory elements.
This sometimes happens even by accident, for example, some poor guy accidentally announcing 12/9 and giving AT&T a really bad day.
What it doesn't contain is anything relevant to the special-case DDoS that all LISP sites would be vulnerable to, due to the IMO bad flow-cache management system that is specified.
If you still think that LISP is using a flow-cache you should have a second read to the set of drafts.
Depends on your definition of flow. If a flow is defined by the destination prefix, then yes, LISP has a flow-cache ;-)
I am having a very great deal of trouble getting the authors of the "threats" document to even understand what the problem is,
For the third time: this is false. We got the problem, we were asking for more specific information in order to quantify the risk. We asked you help to state the problem and explained to you where the solution should be addressed. But you seem to be stuck on the operator vs. researcher discussion, which IMHO is just pointless.
Fully agree; our draft is already updated, and I know exactly the sentence that has been added in Sec 6.2.2 of the draft. We have not submitted the draft yet as we are following the "rules" of the WG. We will present the draft and the modifications we want to make, and ask the WG to accept them. Then we will add them to the draft.
because as one of them put it, he is "just a researcher." I am sure he and his colleagues are very smart guys, but they clearly do not remember our 1990s pains.
That is the "not an operator" problem. It is understandable.
Others who have been around long enough simply dismiss this problem, because they believe the unparalleled benefits of LISP for mobility and multi-homing SOHO sites must greatly out-weigh the fact that, well, if you are a content provider and you receive a DDoS, your site will be down and there isn't a damn thing you can do about it, other than spec routers that have way, way more FIB than the number of possible routes, again due to the bad caching scheme.
The above is what I think is the "ego-invested" problem, where certain pretty smart, well-intentioned people have a lot of time, and professional credibility, invested in making LISP work. I'm sure it isn't pleasing for these guys to defend their project against my argument that it may never be able to reach Internet-scale, and that they have missed what I claim is a show-stopping problem with an easy way to improve it through several years of development. Especially since I am a guy who did not ever participate in the IETF before, someone they don't know from a random guy on the street.
I am glad that this NANOG discussion has got some of these LISP folks to pay more attention to my argument, and my suggested improvement (I am not only bashing their project; I have positive input, too.) Simply posting to their mailing list once and emailing a few draft authors did not cause any movement at all. Evidently it does get attention, though, to jump up and down on a different list. Go figure!
If operators don't provide input and *perspective* to things like LISP, we will end up with bad results.
True. That technical feedback is the most welcome.
Sure, please provide your feedback. What about deploying LISP in your test network and providing us all the information you get from that deployment? For your information, we are currently working with operators that are testing LISP, so be assured that operators are listened to!
Let me now ask a simple question: why are you so strongly against LISP?
You do not like it? Fine, other people do. You do not believe in it and do not see any value? Fine, other people do. You think that there are issues that cannot be solved? Fine, other people believe those issues can be solved and are scratching their head to find deployable solutions.
As I said before, your technical experience and feedback is the most welcome, but let's try to focus only on the technical level.
Maybe you could write a draft with all the points you don't like in LISP and why. This document could be a starting point to improve LISP from an operational viewpoint. Thank you for providing us all the technical details.

Damien Saucez
thanks
Luigi Iannone
Luigi, you have misunderstood quite a bit of the content of my message. I'm not sure if this is of any further interest to NANOG readers, but as it is basically what seems to go on a lot, from my observations of IETF list activity, I'll copy my reply to the list as you have done.

On Wed, Jul 13, 2011 at 4:08 AM, Luigi Iannone <luigi@net.t-labs.tu-berlin.de> wrote:
Granted. You are the real world expert. Now can you stop repeating this in each email and move on?
No. This is a point that needs to be not only made, but driven home. You do not understand how routers work, which is why you are having such difficulty understanding the severity of this problem. The lisp-threats work you have done is basically all control-plane / signalling issues, and no data-plane issues. This is not a coincidence; it is because your knowledge of the control-plane side is good and of the data-plane is weak.
This is completely false. Several people gave credit to you about the existence of the threat you pointed out.
Really? In April, when I posted a serious problem, and received no replies? Now, the original folks who I discussed this with, before ever posting to the IETF LISP list, are finally seeking clarification, because apparently there may have been some confusion in April, possibly leading to their total dismissal of this as a practical concern.
This is again false. We had mail exchange both privately and on the mailinglist. We proposed to you text to be added to the threats draft but you did not like it. We are asking to propose text but we have no answer from you on this point.
Actually, you classified this as an implementation concern, which is false. You have said yourself that this is why you believe it deserves just one sentence, if that, in the lisp-threats draft. This is not an implementation-specific concern, it is a design flaw in the MS negative response scheme, which emerges to produce a trivial DoS threat if LISP ever scales up.
Now there is a LISP "threats" draft which the working group mandates they produce, discussing various security problems. The current paper is a laundry list of "what if" scenarios, like, what if a malicious person could fill the LISP control-plane with garbage. BGP has the
So you are saying that BGP can be victim of similar attacks/problem.... still... if you are reading this email it means that the Internet is still running...
This is where I believe you are mis-reading my message. Your threats draft covers legitimate concerns which also exist in the current system that is widely deployed, which is largely, BGP plus big FIB. What you don't cover, at all, is an IMO critical new threat that emerges in the data-plane from the design of the MS protocol.
If you still think that LISP is using a flow-cache you should have a second read to the set of drafts.
This language may appear unclear if you haven't read it in the context of my other postings. LISP routing most certainly is a flow-cache; however, the definition of "flow" is different. Some platforms and routing schemes see a flow as a layer-3 destination /32 or similar (some 90s routers), others more granular (firewalls, where flows are usually layer-4 and often stateful), and with LISP, the "flow" is the address space routed from your ITR to a remote ETR, which may cover a large amount of address space and many smaller flows. The LISP drafts also refer to these flows as "tunnels," but that language could easily be confused to mean much more permanent, static tunnels, or MPLS-like tunnels which are signaled throughout the network of P routers. So there are clear semantic issues of importance when talking about LISP, and all these terms must be read in the correct context.
For the third time: this is false. We got the problem, we were asking for more specific information in order to quantify the risk. We asked you help
You haven't "got it," or you would already understand the risk very well. It is not my intention to fault you and your colleagues for failing to understand this; but to demonstrate clearly that the right kind of expertise is absolutely not being applied to LISP, and there is a huge and possibly intractable threat that was completely overlooked when producing what is meant to be an authoritative document on currently-known "threats" to LISP.
to state the problem and explained to you where the solution should be addressed. But you seem to be stuck on the operator vs. researcher discussion, which IMHO is just pointless.
Substantially all operators are "stuck" there. They should participate more.
Let me now ask a simple question: why are you so strongly against LISP?
No new work has been done to address the problem of scaling up the number of locators or multi-homed end-sites. However, the *claims* being made by LISP advocates are that the caching scheme you have, which is not novel, does solve this problem. It does not. It cannot, as there has been no novel work on this.

It is very unfortunate that LISP folks point to an academic paper that studied the effect of 20k nominal flows. This is not Internet-scale, but a lot of you who are working hard on LISP don't seem to understand that. DoS attacks are a real-world concern that we all have to live with when deploying things for Internet use (as opposed to enterprise VPN, etc.). If you don't even consider their impact, how would you expect content to be available over a LISP infrastructure? How could a large subscriber-access ITR platform work, if a trivial DoS against it would impact all connected subscribers?

The root problem remains that as you scale up the number of locators and destination prefixes, you need to scale up the hardware. This is made 10x worse, as I have demonstrated, by the inflexible and foolish negative mapping reply scheme that is specified for LISP.
You do not believe in it and do not see any value? Fine, other people do.
As I have said, I believe the value of LISP is limited to VPN-over-Internet. It can never scale up for large-scale Internet use. This is an opinion shared by virtually all operators I've spoken to who have followed LISP. Why? Again, pet project, ego, and academia vs. operational reality. Get some other opinions. I'm not the only guy who thinks this way, I'm just the only one bothering to jump up and down, because I think LISP is a really good example of what is being discussed in this NANOG thread (IETF brokenness due to lack of operator participation), and a waste of vendor resources.
You think that there are issues that cannot be solved? Fine, other people believe those issues can be solved and are scratching their head to find deployable solutions.
I've seen the "LISP Youtube Video." It looks clever, but it'll never, ever work at large scale. Would you like to know what actually does work, has existing code, and just needs some killer app? SCTP. It does the mobility that LISP promises, and removes the need to even have loc/ID separation, because applications perceive a socket which the OS (SCTP stack) at each end can multi-home, and port across changing IP addresses, and so on. SCTP isn't going to sell any routers, but it solves all those problems that LISP would like to solve (but can't at scale.) -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
Jeff,

on one point we agree, there is value in continuing this thread. I've tried to bring the discussion back to the technical issues, but I failed. Personally, I find your emails aggressive and close to offensive in some sentences. Unlike you, in my replies (all of them public) I never judged your competence.

For me this thread is closed.

Have a nice day

Luigi

On Jul 13, 2011, at 11:21 , Jeff Wheeler wrote:
Luigi, you have mis-understood quite a bit of the content of my message. I'm not sure if this is of any further interest to NANOG readers, but as it is basically what seems to go on a lot, from my observations of IETF list activity, I'll copy my reply to the list as you have done.
On Wed, Jul 13, 2011 at 4:08 AM, Luigi Iannone <luigi@net.t-labs.tu-berlin.de> wrote:
Granted. You are the real world expert. Now can you stop repeating this in each email and move on?
No. This is a point that needs to be not only made, but driven home. You do not understand how routers work, which is why you are having such difficulty understanding the severity of this problem. The lisp-threats work you have done is basically all control-plane / signalling issues, and no data-plane issues. This is not a coincidence; it is because your knowledge of the control-plane side is good and of the data-plane is weak.
This is completely false. Several people gave credit to you about the existence of the threat you pointed out.
Really? In April, when I posted a serious problem, and received no replies? Now, the original folks who I discussed this with, before ever posting to the IETF LISP list, are finally seeking clarification, because apparently there may have been some confusion in April, possibly leading to their total dismissal of this as a practical concern.
This is again false. We had mail exchange both privately and on the mailinglist. We proposed to you text to be added to the threats draft but you did not like it. We are asking to propose text but we have no answer from you on this point.
Actually, you classified this as an implementation concern, which is false. You have said yourself that this is why you believe it deserves just one sentence, if that, in the lisp-threats draft. This is not an implementation-specific concern, it is a design flaw in the MS negative response scheme, which emerges to produce a trivial DoS threat if LISP ever scales up.
Now there is a LISP "threats" draft which the working group mandates they produce, discussing various security problems. The current paper is a laundry list of "what if" scenarios, like, what if a malicious person could fill the LISP control-plane with garbage. BGP has the
So you are saying that BGP can be victim of similar attacks/problem.... still... if you are reading this email it means that the Internet is still running...
This is where I believe you are mis-reading my message. Your threats draft covers legitimate concerns which also exist in the current system that is widely deployed, which is largely, BGP plus big FIB. What you don't cover, at all, is an IMO critical new threat that emerges in the data-plane from the design of the MS protocol.
If you still think that LISP is using a flow-cache you should have a second read to the set of drafts.
This language may appear unclear if you haven't read it in the context of my other postings. LISP routing most certainly is a flow-cache, however, the definition of "flow" is different. Some platforms and routing schemes see a flow as a layer-3 destination /32 or similar (some 90s routers), others more granular (firewalls, where flows are usually layer-4 and often stateful), and with LISP, the "flow" the address space routed from your ITR to a remote ETR, which may cover a large amount of address space and many smaller flows.
The LISP drafts also refer to these flows as "tunnels," but that language could easily be confused to mean much more permanent, static tunnels, or MPLS-like tunnels which are signaled throughout the network of P routers. So there are clear semantic issues of importance when talking about LISP, and all these terms must be read in the correct context.
For the third time: this is false. We got the problem, we were asking for more specific information in order to quantify the risk. We asked you help
You haven't "got it," or you would already understand the risk very well. It is not my intention to fault you and your colleagues for failing to understand this; but to demonstrate clearly that the right kind of expertise is absolutely not being applied to LISP, and there is a huge and possibly intractable threat that was completely overlooked when producing what is meant to be an authoritative document on currently-known "threats" to LISP.
to state the problem and explained to you where the solution should be addressed. But you seem to be stuck on the operator vs. researcher discussion, which IMHO is just pointless.
Substantially all operators are "stuck" there. They should participate more.
Let me now ask a simple question: why are you so strongly against LISP?
No new work has been done to address the problem of scaling up the number of locators or multi-homed end-sites. However, the *claims* being made by LISP advocates is that the caching scheme you have, which is not novel, does solve this problem. It does not. It cannot as there has been no novel work on this.
It is very unfortunate that LISP folks point to an academic paper that studied the affect of 20k nominal flows. This is not Internet-scale, but a lot of you who are working hard on LISP don't seem to understand that. DoS attacks are a real world concern that we all have to live with when deploying things for Internet use (as opposed to enterprise VPN, etc.) If you don't even consider their impact, how would you expect content to be available over a LISP infrastructure? How could a large subscriber-access ITR platform work, if a trivial DoS against it would impact all connected subscribers?
The root problem remains that as you scale up the number of locators and destination prefixes, you need to scale up the hardware. This is made 10x worse, as I have demonstrated, by the inflexible and foolish negative mapping reply scheme that is specified for LISP.
You do not believe in it and do not see any value? Fine, other people do.
As I have said, I believe the value of LISP is limited to VPN-over-Internet. It can never scale up for large-scale, Internet use. This is an opinion shared by virtually all operators I've spoken to who have followed LISP. Why? Again, pet project, ego, and academia vs operational reality.
Get some other opinions. I'm not the only guy who thinks this way, I'm just the only one bothering to jump up and down, because I think LISP is a really good example of what is being discussed in this NANOG thread (IETF brokenness due to lack of operator participation), and a waste of vendor resources.
You think that there are issues that cannot be solved? Fine, other people believe those issues can be solved and are scratching their head to find deployable solutions.
I've seen the "LISP Youtube Video." It looks clever, but it'll never, ever work at large scale. Would you like to know what actually does work, has existing code, and just needs some killer app? SCTP. It does the mobility that LISP promises, and removes the need to even have loc/ID separation, because applications perceive a socket which the OS (SCTP stack) at each end can multi-home, and port across changing IP addresses, and so on.
SCTP isn't going to sell any routers, but it solves all those problems that LISP would like to solve (but can't at scale.)
-- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
On Jul 13, 2011, at 13:03 , Luigi Iannone wrote:
Jeff,
on one point we agree, there is value in continuing this thread.
There is _no_ value..... my mistake... Luigi
I've tried to bring the discussion back to the technical issues, but I failed.
Personally, I find your emails aggressive and close to offensive in some sentences. Differently from you, in my replies (all of them public) I never judged your competences.
For me this thread is closed.
Have a nice day
Luigi
On Jul 13, 2011, at 11:21 , Jeff Wheeler wrote:
Luigi, you have mis-understood quite a bit of the content of my message. I'm not sure if this is of any further interest to NANOG readers, but as it is basically what seems to go on a lot, from my observations of IETF list activity, I'll copy my reply to the list as you have done.
On Wed, Jul 13, 2011 at 4:08 AM, Luigi Iannone <luigi@net.t-labs.tu-berlin.de> wrote:
Granted. You are the real world expert. Now can you stop repeating this in each email and move on?
No. This is a point that needs to be not only made, but driven home. You do not understand how routers work, which is why you are having such difficulty understanding the severity of this problem. The lisp-threats work you have done is basically all control-plane / signalling issues, and no data-plane issues. This is not a coincidence; it is because your knowledge of the control-plane side is good and of the data-plane is weak.
This is completely false. Several people gave credit to you about the existence of the threat you pointed out.
Really? In April, when I posted a serious problem, and received no replies? Now, the original folks who I discussed this with, before ever posting to the IETF LISP list, are finally seeking clarification, because apparently there may have been some confusion in April, possibly leading to their total dismissal of this as a practical concern.
This is again false. We had mail exchange both privately and on the mailinglist. We proposed to you text to be added to the threats draft but you did not like it. We are asking to propose text but we have no answer from you on this point.
Actually, you classified this as an implementation concern, which is false. You have said yourself that this is why you believe it deserves just one sentence, if that, in the lisp-threats draft. This is not an implementation-specific concern, it is a design flaw in the MS negative response scheme, which emerges to produce a trivial DoS threat if LISP ever scales up.
Now there is a LISP "threats" draft which the working group mandates they produce, discussing various security problems. The current paper is a laundry list of "what if" scenarios, like, what if a malicious person could fill the LISP control-plane with garbage. BGP has the
So you are saying that BGP can be a victim of similar attacks/problems.... still... if you are reading this email it means that the Internet is still running...
This is where I believe you are mis-reading my message. Your threats draft covers legitimate concerns which also exist in the current system that is widely deployed, which is largely, BGP plus big FIB. What you don't cover, at all, is an IMO critical new threat that emerges in the data-plane from the design of the MS protocol.
If you still think that LISP is using a flow-cache you should have a second read to the set of drafts.
This language may appear unclear if you haven't read it in the context of my other postings. LISP routing most certainly is a flow-cache; however, the definition of "flow" is different. Some platforms and routing schemes see a flow as a layer-3 destination /32 or similar (some 90s routers), others more granular (firewalls, where flows are usually layer-4 and often stateful), and with LISP, the "flow" is the address space routed from your ITR to a remote ETR, which may cover a large amount of address space and many smaller flows.
The LISP drafts also refer to these flows as "tunnels," but that language could easily be confused to mean much more permanent, static tunnels, or MPLS-like tunnels which are signaled throughout the network of P routers. So there are clear semantic issues of importance when talking about LISP, and all these terms must be read in the correct context.
For the third time: this is false. We got the problem; we were asking for more specific information in order to quantify the risk. We asked for your help
You haven't "got it," or you would already understand the risk very well. It is not my intention to fault you and your colleagues for failing to understand this; but to demonstrate clearly that the right kind of expertise is absolutely not being applied to LISP, and there is a huge and possibly intractable threat that was completely overlooked when producing what is meant to be an authoritative document on currently-known "threats" to LISP.
to state the problem and explained to you where the solution should be addressed. But you seem to be stuck on the operator vs. researcher discussion, which IMHO is just pointless.
Substantially all operators are "stuck" there. They should participate more.
Let me now ask a simple question: why are you so strongly against LISP?
No new work has been done to address the problem of scaling up the number of locators or multi-homed end-sites. However, the *claim* being made by LISP advocates is that the caching scheme you have, which is not novel, does solve this problem. It does not. It cannot, as there has been no novel work on this.
It is very unfortunate that LISP folks point to an academic paper that studied the effect of 20k nominal flows. This is not Internet-scale, but a lot of you who are working hard on LISP don't seem to understand that. DoS attacks are a real-world concern that we all have to live with when deploying things for Internet use (as opposed to enterprise VPN, etc.). If you don't even consider their impact, how would you expect content to be available over a LISP infrastructure? How could a large subscriber-access ITR platform work, if a trivial DoS against it would impact all connected subscribers?
The root problem remains that as you scale up the number of locators and destination prefixes, you need to scale up the hardware. This is made 10x worse, as I have demonstrated, by the inflexible and foolish negative mapping reply scheme that is specified for LISP.
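To make the churn point concrete, here is a toy illustration in plain C. It is not LISP code, and the direct-mapped cache and its size are invented purely for the example; it only shows why a demand-filled map-cache is exposed this way: every miss costs control-plane work and can evict an entry that legitimate traffic depends on.

    /* Toy demand-filled "map cache" keyed by destination /24.  A spray of
     * packets toward many different prefixes churns the cache and evicts
     * the entries that real traffic relies on. */
    #include <stdio.h>
    #include <stdint.h>

    #define CACHE_SLOTS 1024          /* invented size, for illustration only */

    struct map_entry { uint32_t prefix; int valid; };
    static struct map_entry cache[CACHE_SLOTS];
    static unsigned long misses;

    /* Direct-mapped lookup: a miss stands in for "send a Map-Request" and
     * overwrites whatever previously occupied the slot. */
    static void lookup(uint32_t dst)
    {
        uint32_t prefix = dst & 0xFFFFFF00u;
        struct map_entry *e = &cache[(prefix >> 8) % CACHE_SLOTS];
        if (!e->valid || e->prefix != prefix) {
            misses++;
            e->prefix = prefix;
            e->valid = 1;
        }
    }

    int main(void)
    {
        /* Legitimate traffic to a handful of destinations... */
        for (int i = 0; i < 100000; i++)
            lookup(0x0A000000u + (uint32_t)(i % 8) * 256);
        unsigned long baseline = misses;

        /* ...then an attacker sprays packets across many different prefixes. */
        for (uint32_t i = 0; i < 100000; i++)
            lookup(0xC0000000u + i * 256);

        printf("misses before spray: %lu, after: %lu\n", baseline, misses);
        return 0;
    }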
You do not believe in it and do not see any value? Fine, other people do.
As I have said, I believe the value of LISP is limited to VPN-over-Internet. It can never scale up for large-scale, Internet use. This is an opinion shared by virtually all operators I've spoken to who have followed LISP. Why? Again, pet project, ego, and academia vs operational reality.
Get some other opinions. I'm not the only guy who thinks this way, I'm just the only one bothering to jump up and down, because I think LISP is a really good example of what is being discussed in this NANOG thread (IETF brokenness due to lack of operator participation), and a waste of vendor resources.
You think that there are issues that cannot be solved? Fine, other people believe those issues can be solved and are scratching their head to find deployable solutions.
I've seen the "LISP Youtube Video." It looks clever, but it'll never, ever work at large scale. Would you like to know what actually does work, has existing code, and just needs some killer app? SCTP. It does the mobility that LISP promises, and removes the need to even have loc/ID separation, because applications perceive a socket which the OS (SCTP stack) at each end can multi-home, and port across changing IP addresses, and so on.
SCTP isn't going to sell any routers, but it solves all those problems that LISP would like to solve (but can't at scale.)
-- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
We all make mistakes in not questioning our own positions, from time to time. You, Jeff, seem to be making that very same mistake. Please keep these points in mind:
* Rome wasn't built in a day. The current system didn't come ready-made pre-built with all the bells and whistles you are used to. It grew slowly over time, as we learned what works, what doesn't, and what was missing. Any system that attempts to deal with locator/id separation will assuredly not be built in a day, either.
* While you have stated a problem relating to a security consideration – specifically that there is a potential reflection attack that could cause cache thrashing, the solution may not be what you expect.
* Yes, you were asked. Even so... Novelty isn't something worth arguing over, except in patent battles. Usefulness is only worth arguing over marginally more. Deployment (or lack thereof) speaks for itself. LISP or ILNP or what-have-you either will or won't be deployed over the long run.
* Never is a very long time. Many uses of "never" have been used relating to the Internet. It is the corollary to "Imminent Death of the 'Net: film @ 11." I still have the NANOG tee-shirt with Robert Metcalfe, someone with considerably more notoriety, eating his hat.
Eliot
On Sun, Jul 17, 2011 at 11:07 AM, Eliot Lear <lear@cisco.com> wrote:
We all make mistakes in not questioning our own positions, from time to time. You, Jeff, seem to be making that very same mistake.
Rome wasn't built in a day. The current system didn't come ready-made pre-built with all the bells and whistles you are used to. It grew slowly over time, as we learned what works, what doesn't, and what was missing. Any system that attempts to deal with locator/id separation will assuredly not be built in a day, either.
LISP work has been going on for a long time to still not have any useful discussion on a designed-in, trivial DoS which will affect any ITR and make the work being done to allow ETRs to validate source addresses (or even do loose uRPF) into a DoS vector for ETRs as well.
While you have stated a problem relating to a security consideration – specifically that there is a potential reflection attack that could cause cache thrashing, the solution may not be what you expect.
I agree, a solution might be available. One has not been presented yet. In my earliest postings to the IETF LISP list, the ones which received zero replies, I suggested a way to significantly improve the cache churn DoS problem. It is not novel, as Darrel Lewis informed me, which means that even already-available research has not been applied to LISP in this area, and the Mapping Service protocol ties the hands of implementors so they *cannot* apply such techniques while still conforming to the specifications.
Yes, you were asked. Even so... Novelty isn't something worth arguing over, except in patent battles.
Really? Novelty, by definition, advances the state of the art. You may not think it's very important to inform people that LISP is based on essentially the same flow-caching scheme used in the 1990s, but I do.
Never is a very long time. Many uses of "never" have been used relating to the Internet. It is the corollary to "Imminent Death of the 'Net: film @ 11." I still have the NANOG tee-shirt with Robert Metcalfe, someone with considerably more notoriety, eating his hat.
And yet, I am quite comfortable with the statement that LISP can never scale up to meet the demands of the Internet. Perhaps with fundamental changes to its design, and its advocates giving up some of their current assumptions, some progress could be made. In its current form, though, LISP will never be a useful tool to scale the Internet, and in fact, it cannot meet the demands of today's Internet. Unless, of course, you pretend that the ability to DoS any router with a trivial amount of traffic is not worthy of concern. -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
-----Original Message----- From: Leo Bicknell [mailto:bicknell@ufp.org] Sent: Tuesday, July 12, 2011 11:42 AM To: Ronald Bonica Cc: nanog@nanog.org Subject: Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)
[snip]
But there is no roadmap in the IETF process now for LISP that says "We've got this 90% baked, we need to circulate a draft to the NANOG mailing list, request operator comments, and actively solicit operators to participate in the expanded test network". We need that mechanism to tell folks "hey, it's real enough your operational feedback is now useful" and "come test our new idea".
Leo, We need to fix this problem. Without the feedback loop that you describe, the IETF will never know whether they are producing useful stuff or nonsense. How does the following sound as a solution: Let's say we set up a new IETF mailing list, primarily for the use of operators. When an operator sees a draft that might be of interest to the operational community, he creates a new thread on the list, copying the draft authors and WG chairs. (The authors and chairs can decide whether to add the WG to the thread). The OPS AD will consider thread contents when evaluating the draft. Ron
Leo Bicknell wrote: In short, make it easy for the operators to participate at the right
time in the process. It will be better for everyone!
Unfortunately, where you want to be inserted into the process is when everybody has said their piece 80-dozen times and are tired and just want to get on with life. So it doesn't matter whether you're an operator or the IESG -- you're not going to make many friends at that point telling them they got it wrong. On the other hand, is it really too much to ask operators -- especially big ones with a vested interest in not having the IETF throw crap over the wall for them to debug -- to *hire* a liaison whose job is to monitor a swath of working groups, bofs, etc, and participate the entire way through? I imagine they'd be pretty popular amongst clueful vendors, and would give you a leg up knowing what's good and what's just sales-drek. Mike
On Jul 12, 2011, at 12:40 PM, Michael Thomas wrote:
Leo Bicknell wrote: In short, make it easy for the operators to participate at the right
time in the process. It will be better for everyone!
Unfortunately, where you want to be inserted into the process is when everybody has said their piece 80-dozen times and are tired and just want to get on with life. So it doesn't matter whether you're an operator or the IESG -- you're not going to make many friends at that point telling them they got it wrong.
On the other hand, is it really too much to ask operators -- especially big ones with a vested interest in not having the IETF throw crap over the wall for them to debug -- to *hire* a liaison whose job is to monitor a swath of working groups, bofs, etc, and participate the entire way through? I imagine they'd be pretty popular amongst clueful vendors, and would give you a leg up knowing what's good and what's just sales-drek.
By definition, if crap has been thrown over the wall and you're trying to deploy it, that means:
* you have a commercial or other compelling reason to run it.
* someone has implemented it.
The bar to make something relevant on those two points is much higher than the one that involves submitting an internet draft; getting something through draft to publication via a working group is itself a rather involved process. Plenty of crap is thrown over the wall which you will never use, because the marketplace doesn't care, nobody built it, nobody has that problem it turns out, it turned out to be too hard, or it was actually a dumb idea. In the marketplace for ideas this seems normal and healthy.
Mike
On Tue, Jul 12, 2011 at 8:28 AM, Ronald Bonica <rbonica@juniper.net> wrote:
Leo,
Maybe we can fix this by:
a) bringing together larger groups of clueful operators in the IETF b) deciding which issues interest them c) showing up and being vocal as a group in protocol developing working groups
To some degree, we already do this in the IETF OPS area, but judging by your comments, we don't do it nearly enough.
Comments?
There may be an OPS area, but it is not listened to. Witness the latest debacle with the attempt at trying to make 6to4 historic. Various "non-practicing entities" were able to derail what network operators largely supported. Since the IETF failed to make progress, operators will do other things to stop 6to4 (I have heard no AAAA over IPv4 transport, blackhole 6to4 anycast, decom relay routers...). Real network operators have a relatively low BS threshold: they have customers to support and businesses to run, and they don't have time to thumb wrestle these people who don't actually have any skin in the game. Cameron
Ron
-----Original Message----- From: Leo Bicknell [mailto:bicknell@ufp.org] Sent: Monday, July 11, 2011 3:35 PM To: nanog@nanog.org Subject: Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)
In a message written on Sun, Jul 10, 2011 at 06:16:09PM +0200, Jeroen Massar wrote:
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
The way the IETF and the operator community interact is badly broken.
The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development for example you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also does not help that many operational types are not hardcore programmers, and can't play in the sandbox during the major development cycles.
On 07/12/2011 08:43, Cameron Byrne wrote:
Witness the latest debacle with the attempt at trying to make 6to4 historic.
Various "non-practicing entities" were able to derail what network operators largely supported. Since the IETF failed to make progress operators will do other things to stop 6to4 ( i have heard no AAAA over IPv4 transport, blackhole 6to4 anycast, decom relay routers...)
FYI, my understanding (and I'm sure Ron will correct me if I'm wrong) is that what's actually happening is that the IESG is pushing that draft forward knowing that it's going to be appealed. The appeal process will then sort itself out, and we'll have a result. The fact that one person can stall the end result through the appeals process is both a strength and occasionally a burden of the way the IETF does its work. Doug -- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/
Cameron, Please stay tuned. While 6-to-4-historic is on hold, it is far from being dead. Expect more discussion in Quebec and on the mailing list. I doubt if there will be any final decision before Quebec. Ron
-----Original Message----- From: Cameron Byrne [mailto:cb.list6@gmail.com] Sent: Tuesday, July 12, 2011 11:44 AM To: Ronald Bonica Cc: Leo Bicknell; nanog@nanog.org Subject: Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)
On Tue, Jul 12, 2011 at 8:28 AM, Ronald Bonica <rbonica@juniper.net> wrote:
Leo,
Maybe we can fix this by:
a) bringing together larger groups of clueful operators in the IETF b) deciding which issues interest them c) showing up and being vocal as a group in protocol developing working groups
To some degree, we already do this in the IETF OPS area, but judging by your comments, we don't do it nearly enough.
Comments?
There may be an OPS area, but it is not listened to.
Witness the latest debacle with the attempt at trying to make 6to4 historic.
Various "non-practicing entities" were able to derail what network operators largely supported. Since the IETF failed to make progress operators will do other things to stop 6to4 ( i have heard no AAAA over IPv4 transport, blackhole 6to4 anycast, decom relay routers...)
Real network operators have a relatively low BS threshold, they have customers to support and businesses to run, and they don't have thumb wrestle these people who don't actually have any skin in the game.
Cameron
Ron
-----Original Message----- From: Leo Bicknell [mailto:bicknell@ufp.org] Sent: Monday, July 11, 2011 3:35 PM To: nanog@nanog.org Subject: Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)
In a message written on Sun, Jul 10, 2011 at 06:16:09PM +0200, Jeroen Massar wrote:
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
The way the IETF and the operator community interact is badly broken.
The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development for example you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also does not help that many operational types are not hardcore programmers, and can't play in the sandbox during the major development cycles.
On Jul 12, 2011, at 8:43 AM, Cameron Byrne wrote:
On Tue, Jul 12, 2011 at 8:28 AM, Ronald Bonica <rbonica@juniper.net> wrote:
Leo,
Maybe we can fix this by:
a) bringing together larger groups of clueful operators in the IETF b) deciding which issues interest them c) showing up and being vocal as a group in protocol developing working groups
To some degree, we already do this in the IETF OPS area, but judging by your comments, we don't do it nearly enough.
Comments?
There may be an OPS area, but it is not listened to.
Witness the latest debacle with the attempt at trying to make 6to4 historic.
Various "non-practicing entities" were able to derail what network operators largely supported. Since the IETF failed to make progress operators will do other things to stop 6to4 ( i have heard no AAAA over IPv4 transport, blackhole 6to4 anycast, decom relay routers...)
Those are all REALLY bad ideas. Speaking as an operator, the best thing you can do to alleviate the problems with 6to4 is operate more, not fewer, 6to4 relays. Blocking AAAA over IPv4 transport is just silly. It's just as likely that your AAAA record is destined for an end-host that has native IPv6 connectivity with an intermediate resolver that doesn't have IPv6 as it is that you're sending that to a 6to4 host. Further, there's no reason to believe the 6to4 host won't attempt to resolve via IPv6, so it doesn't really help anyway.
Real network operators have a relatively low BS threshold, they have customers to support and businesses to run, and they don't have thumb wrestle these people who don't actually have any skin in the game.
I agree, but, it's not hard to run 6to4 relays and running them does much more to alleviate the problems with 6to4 than anything you proposed above. Indeed, what you proposed above will likely create more customer issues rather than reduce them. Owen
Cameron
Ron
-----Original Message----- From: Leo Bicknell [mailto:bicknell@ufp.org] Sent: Monday, July 11, 2011 3:35 PM To: nanog@nanog.org Subject: Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)
In a message written on Sun, Jul 10, 2011 at 06:16:09PM +0200, Jeroen Massar wrote:
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
The way the IETF and the operator community interact is badly broken.
The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development for example you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also does not help that many operational types are not hardcore programmers, and can't play in the sandbox during the major development cycles.
On Jul 12, 2011, at 12:53 PM, Owen DeLong wrote:
On Jul 12, 2011, at 8:43 AM, Cameron Byrne wrote:
On Tue, Jul 12, 2011 at 8:28 AM, Ronald Bonica <rbonica@juniper.net> wrote:
Leo,
Maybe we can fix this by:
a) bringing together larger groups of clueful operators in the IETF b) deciding which issues interest them c) showing up and being vocal as a group in protocol developing working groups
To some degree, we already do this in the IETF OPS area, but judging by your comments, we don't do it nearly enough.
Comments?
There may be an OPS area, but it is not listened to.
Witness the latest debacle with the attempt at trying to make 6to4 historic.
Various "non-practicing entities" were able to derail what network operators largely supported. Since the IETF failed to make progress operators will do other things to stop 6to4 ( i have heard no AAAA over IPv4 transport, blackhole 6to4 anycast, decom relay routers...)
Those are all REALLY bad ideas. Speaking as an operator, the best thing you can do to alleviate the problems with 6to4 is operate more, not less 6to4 relays.
Unless of course the large providers get their shared transition space, in which case all 6to4 behind it will break in a really ugly way, pretty much exactly like in the mobile operator in question. The goal of moving 6to4 to historic was not to encourage the outcome described; it was to take having 6to4 as a default method of any kind off the table going into the future. If mature adults want to use it, great, but conformance tests shouldn't require it, CPE shouldn't turn it on just because they think they have an unfiltered public IP, and hosts shouldn't use it unless told to do so.
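For anyone who hasn't looked at the mechanics, that breakage follows directly from how 6to4 works (RFC 3056): the site's /48 is derived from its WAN IPv4 address, so if that address is really shared/CGN space, the resulting 2002::/48 has no globally routable return path. A minimal sketch, with a placeholder address:

    /* Derive the 6to4 prefix 2002:V4ADDR::/48 from an IPv4 address. */
    #include <stdio.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void)
    {
        const char *wan_v4 = "192.0.2.33";   /* placeholder WAN address */
        struct in_addr a;
        if (inet_pton(AF_INET, wan_v4, &a) != 1) return 1;

        /* s_addr is in network byte order, so the octets come out in order. */
        unsigned char *b = (unsigned char *)&a.s_addr;
        printf("6to4 prefix for %s: 2002:%02x%02x:%02x%02x::/48\n",
               wan_v4, b[0], b[1], b[2], b[3]);
        return 0;
    }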
Blocking AAAA over IPv4 transport is just silly. It's just as likely that your AAAA record is destined for an end-host that has native IPv6 connectivity with an intermediate resolver that desn't have IPv6 as it is that you're sending that to a 6to4 host. Further, there's no reason to believe the 6to4 host won't attempt to resolve via IPv6, so, it doesn't really help anyway.
Real network operators have a relatively low BS threshold, they have customers to support and businesses to run, and they don't have thumb wrestle these people who don't actually have any skin in the game.
I agree, but, it's not hard to run 6to4 relays and running them does much more to alleviate the problems with 6to4 than anything you proposed above. Indeed, what you proposed above will likely create more customer issues rather than reduce them.
Owen
Cameron
Ron
-----Original Message----- From: Leo Bicknell [mailto:bicknell@ufp.org] Sent: Monday, July 11, 2011 3:35 PM To: nanog@nanog.org Subject: Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)
In a message written on Sun, Jul 10, 2011 at 06:16:09PM +0200, Jeroen Massar wrote:
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
The way the IETF and the operator community interact is badly broken.
The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development for example you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also does not help that many operational types are not hardcore programmers, and can't play in the sandbox during the major development cycles.
In message <56E0FB8F-BB53-4DB0-829B-39DFBAB483E8@bogus.com>, Joel Jaeggli writes:
Unless of course the large providers get their shared transition space in which case all 6to4 behind it will break in a really ugly way, pretty much exactly like in the mobile operator in question.
And would deploying draft-andrews-v6ops-6to4-router-option-02.txt and/or adding router reachability tests have addressed this issue?
The goal of 6to4 to historic was not to encourage the outcome described, it was to take having 6to4 as a default method of any kind off the table going into the future. If mature adults want to use it great, but conformance tests shouldn't require it, CPE shouldn't it on just because what they think they have a is a public IP with not filtering and hosts shouldn't use it unless told to do so..
But that is *not* what the draft did. Making the protocol historic did LOTS more than that. I think there was universal consensus that 6to4 should be off by default.
There was this nuke-6to4-from-orbit attitude which did nothing to help with already deployed/shipped boxes. 6to4-historic is actually harmful for dealing with the existing problems, as it tells vendors not to include 6to4 support in future products, which means operators won't have boxes with fixes to other problems to alleviate the problems caused by the currently deployed customer boxes.
What would have been much better would have been to encourage CPE vendors to release images which address some of the known issues: just adding a check box saying "enable 6to4", and for ISPs to send out email saying "check your router vendor web site for fixed images". The better fix would be to get them to also add support for draft-andrews-v6ops-6to4-router-option-02.txt, which greys out the checkbox when 0.0.0.0 is sent as a response to the option.
Remember, operators are in the position to alleviate lots of the 6to4 issues themselves.
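As a rough sketch of the CPE-side behaviour that option implies (the DHCP plumbing and the option code itself are omitted and assumed here; this only illustrates the decision the draft describes, not an implementation of it):

    /* Hypothetical CPE logic: the ISP returns a 6to4 relay address via a
     * DHCP option; 0.0.0.0 means "do not do 6to4 on this network at all". */
    #include <stdio.h>
    #include <string.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    struct cpe_6to4 { int enabled; struct in_addr relay; };

    static void apply_relay_option(struct cpe_6to4 *c, struct in_addr opt)
    {
        if (opt.s_addr == htonl(INADDR_ANY)) {
            c->enabled = 0;                       /* grey out the checkbox */
            memset(&c->relay, 0, sizeof(c->relay));
        } else {
            c->enabled = 1;
            c->relay = opt;                       /* ISP's relay, not the anycast one */
        }
    }

    int main(void)
    {
        struct cpe_6to4 cpe = { .enabled = 1 };
        struct in_addr opt;
        inet_pton(AF_INET, "0.0.0.0", &opt);      /* what the ISP handed back */
        apply_relay_option(&cpe, opt);
        printf("6to4 %s\n", cpe.enabled ? "enabled" : "disabled by the operator");
        return 0;
    }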
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
On Jul 12, 2011 6:42 PM, "Mark Andrews" <marka@isc.org> wrote:
And would deploying draft-andrews-v6ops-6to4-router-option-02.txt and/or adding router reachability tests have addressed this issue?
The goal of 6to4 to historic was not to encourage the outcome described, it was to take having 6to4 as a default method of any kind off the table going into the future. If mature adults want to use it great, but conformance tests shouldn't require it, CPE shouldn't it on just because what they think they have a is a public IP with not filtering and hosts shouldn't use it unless told to do so..
But that is *not* what the draft did. Making the protocol historic did LOTS more than that. I think there was universal consensus that 6to4 should be off by default.
There was this nuke 6to4 from orbit attitude which did nothing to help with already deployed/shipped boxes. 6to4 historic is actually harmful for dealing with the existing problems as it tells vendors not to include 6to4 support in future products which means operators won't have boxes with fixes to other problems to alleviate the problems cause but the currently deployed customer boxes.
What would have been much better would have been to encourage CPE vendors to release images which address some of the known issues. Just adding a check box saying "enable 6to4" and for ISP to send out email to say "check your router vendor web site for fixed images". The better fix would be to get them to also add support for draft-andrews-v6ops-6to4-router-option-02.txt which greys out the checkbox when 0.0.0.0 is sent as a response to the option.
Remember operators are in the position to alleviate lots of the 6to4 issues themselves.
But they will not. If there is not a revenue forecast, there is no project. That said, CGN is moving forward as a "keep the lights on" initiative.... as is real native v6. I don't care to rehash this yet again with no progress. Cb.
On Jul 12, 2011, at 6:41 PM, Mark Andrews wrote:
In message <56E0FB8F-BB53-4DB0-829B-39DFBAB483E8@bogus.com>, Joel Jaeggli writes:
Unless of course the large providers get their shared transition space in which case all 6to4 behind it will break in a really ugly way, pretty much exactly like in the mobile operator in question.
And would deploying draft-andrews-v6ops-6to4-router-option-02.txt and/or adding router reachability tests have addressed this issue?
Neither of these approaches addresses existing CPE, and shared transition space is justified on the basis of existing CPE... We go into this with the internet we have, not the one that we would like to have; the latter takes time.
The goal of 6to4 to historic was not to encourage the outcome described, it was to take having 6to4 as a default method of any kind off the table going into the future. If mature adults want to use it great, but conformance tests shouldn't require it, CPE shouldn't it on just because what they think they have a is a public IP with not filtering and hosts shouldn't use it unless told to do so..
But that is *not* what the draft did. Making the protocol historic did LOTS more than that. I think there was universal consensus that 6to4 should be off by default.
And that'll take some time, particularly for the CPE to age out.
There was this nuke 6to4 from orbit attitude which did nothing to help with already deployed/shipped boxes. 6to4 historic is actually harmful for dealing with the existing problems as it tells vendors not to include 6to4 support in future products which means operators won't have boxes with fixes to other problems to alleviate the problems cause but the currently deployed customer boxes.
The interpretation of attitude is a matter of taste. When the authors of 3056 and 3068 come down in support of and opposed to the same draft, there is clearly some debate. If we focus on what really would be in the best interests of the end user, it is a decline to zero in the unintentional use of 6to4 in CPE and operating systems, it is the removal of 6to4 from requirements where it presently exists, and it is the continued support of relays to support legacy devices. It is really hard to justify the expansion and deployment of new relays when in fact tunneled traffic can be observed to be on the decline (possibly because devices, particularly hosts that do receive regular updates, receive tweaks to their address selection algorithm). http://asert.arbornetworks.com/2011/04/six-months-six-providers-and-ipv6/
What would have been much better would have been to encourage CPE vendors to release images which address some of the known issues. Just adding a check box saying "enable 6to4" and for ISP to send out email to say "check your router vendor web site for fixed images". The better fix would be to get them to also add support for draft-andrews-v6ops-6to4-router-option-02.txt which greys out the checkbox when 0.0.0.0 is sent as a response to the option.
Remember operators are in the position to alleviate lots of the 6to4 issues themselves.
In message <9C391C3A-3535-4C47-A743-57287685942E@bogus.com>, Joel Jaeggli writes:
Neither of these approaches addresses existing CPE, and shared transition space is justified on the basis of existing CPE...
I didn't claim it would work with existing CPE equipment. Declaring 6to4 historic won't work with existing CPE equipment either. As for requesting shared transition space, there are lots of benefits to it other than helping existing CPE equipment. draft-andrews-v6ops-6to4-router-option-02.txt helps when you are just filtering the protocol 41 traffic.
If we focus on what really would be in the best interests of the end user, it is a decline to zero in the unintentional use of 6to4 in cpe and operating systems. it is the removal of 6to4 from requirements where it presently exists, and it is the continued support of relays to support legacy devices.
And to support those that can't get IPv6 from their ISPs.
It is really hard to justify the expansion and deployment of new relays when in fact tunneled traffic can be observed to be on the decline (possibly because devices particularly hosts that do receive regular updates receive tweaks to their address selection algorithm). http://asert.arbornetworks.com/2011/04/six-months-six-providers-and-ipv6/
Which may or may not be a short term dip. We are yet to see much in the way of IPv6 only content. When that appears, which it will, the tunneled traffic will go up unless ISPs have deployed native IPv6 to all customers. Are you willing to bet on which will happen first? This whole area is in a state of flux.
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
On Jul 12, 2011, at 10:59 PM, Mark Andrews wrote:
I didn't claim it would work with existing CPE equipment. Declaring 6to4 historic won't work with existing CPE equipment either.
If the hosts behind it stop using 2002::/16 addresses as a product of a software update, which seems rather more likely (also, there is some evidence for that), it will. That said, yes, one assumption is that you have to continue to support it. <snip>
It is really hard to justify the expansion and deployment of new relays when in fact tunneled traffic can be observed to be on the decline (possibly because devices particularly hosts that do receive regular updates receive tweaks to their address selection algorithm). http://asert.arbornetworks.com/2011/04/six-months-six-providers-and-ipv6/
Which may or may not be a short term dip.
correlation is not causation but... http://arstechnica.com/apple/news/2010/11/apple-fixes-broken-ipv6-by-breakin...
We are yet to see much in the way of IPv6 only content. When that appears, which it will, the tunneled traffic will go up unless ISPs have deployed native IPv6 to all customers. Are you willing to bet on which will happen first?
I'm willing to bet that subpar experience due to auto-tunneling is considered a liability for content providers.
This whole area is in a state of flux.
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
In message <430FFF20-43ED-45BB-846D-FEE8769FC399@bogus.com>, Joel Jaeggli writes:
On Jul 12, 2011, at 10:59 PM, Mark Andrews wrote:
I didn't claim it would work with existing CPE equipment. Declaring 6to4 historic won't work with existing CPE equipment either.
If the hosts behind it stop using 2002::/16 addresses as a product of a software update which seems rather more likely (also there some evidence for that), it will. that said yes one assumption is that you have to continue to support it.
When you switch the source address preference from 2002::/16 to IPv4, you lose insight into which machines still have 2002::/16 addresses without explicit testing.
<snip>
It is really hard to justify the expansion and deployment of new relays when in fact tunneled traffic can be observed to be on the decline (possibly because devices particularly hosts that do receive regular updates receive tweaks to their address selection algorithm). http://asert.arbornetworks.com/2011/04/six-months-six-providers-and-ipv6/
Which may or may not be a short term dip.
correlation is not causation but...
http://arstechnica.com/apple/news/2010/11/apple-fixes-broken-ipv6-by-breaking-it-some-more.ars
We are yet to see much in the way of IPv6 only content. When that appears, which it will, the tunneled traffic will go up unless ISPs have deployed native IPv6 to all customers. Are you willing to bet on which will happen first?
I'm willing to bet that subpar experience due to auto-tunneling is considered a liability for content providers.
This whole area is in a state of flux. -- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
On Jul 12, 2011, at 2:21 PM, Joel Jaeggli wrote:
On Jul 12, 2011, at 12:53 PM, Owen DeLong wrote:
On Jul 12, 2011, at 8:43 AM, Cameron Byrne wrote:
On Tue, Jul 12, 2011 at 8:28 AM, Ronald Bonica <rbonica@juniper.net> wrote:
Leo,
Maybe we can fix this by:
a) bringing together larger groups of clueful operators in the IETF b) deciding which issues interest them c) showing up and being vocal as a group in protocol developing working groups
To some degree, we already do this in the IETF OPS area, but judging by your comments, we don't do it nearly enough.
Comments?
There may be an OPS area, but it is not listened to.
Witness the latest debacle with the attempt at trying to make 6to4 historic.
Various "non-practicing entities" were able to derail what network operators largely supported. Since the IETF failed to make progress operators will do other things to stop 6to4 ( i have heard no AAAA over IPv4 transport, blackhole 6to4 anycast, decom relay routers...)
Those are all REALLY bad ideas. Speaking as an operator, the best thing you can do to alleviate the problems with 6to4 is operate more, not less 6to4 relays.
Unless of course the large providers get their shared transition space in which case all 6to4 behind it will break in a really ugly way, pretty much exactly like in the mobile operator in question.
Actually, if those same providers run 6to4 gateways/routers on their networks in that shared transition space with public IPv6 addresses on the exterior, it would not break at all. As I said, the resolution to the 6to4 problems described is to run MORE, not less 6to4 gateways.
The goal of moving 6to4 to historic was not to encourage the outcome described; it was to take having 6to4 as a default method of any kind off the table going into the future. If mature adults want to use it, great, but conformance tests shouldn't require it, CPE shouldn't turn it on just because they think they have a public IP with no filtering, and hosts shouldn't use it unless told to do so.
I have no problem with saying 6to4 should not be enabled by default. However, that doesn't change the fact that the best way to resolve things given current shipping software and hardware is to deploy 6to4 gateways in the appropriate places. Owen
Blocking AAAA over IPv4 transport is just silly. It's just as likely that your AAAA record is destined for an end-host that has native IPv6 connectivity with an intermediate resolver that doesn't have IPv6 as it is that you're sending that to a 6to4 host. Further, there's no reason to believe the 6to4 host won't attempt to resolve via IPv6, so it doesn't really help anyway.
Real network operators have a relatively low BS threshold; they have customers to support and businesses to run, and they don't have time to thumb-wrestle these people who don't actually have any skin in the game.
I agree, but, it's not hard to run 6to4 relays and running them does much more to alleviate the problems with 6to4 than anything you proposed above. Indeed, what you proposed above will likely create more customer issues rather than reduce them.
Owen
Cameron
Ron
-----Original Message----- From: Leo Bicknell [mailto:bicknell@ufp.org] Sent: Monday, July 11, 2011 3:35 PM To: nanog@nanog.org Subject: Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)
In a message written on Sun, Jul 10, 2011 at 06:16:09PM +0200, Jeroen Massar wrote:
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
The way the IETF and the operator community interact is badly broken.
The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development for example you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also does not help that many operational types are not hardcore programmers, and can't play in the sandbox during the major development cycles.
On Jul 12, 2011, at 7:20 PM, Owen DeLong wrote:
On Jul 12, 2011, at 2:21 PM, Joel Jaeggli wrote:
On Jul 12, 2011, at 12:53 PM, Owen DeLong wrote:
On Jul 12, 2011, at 8:43 AM, Cameron Byrne wrote:
On Tue, Jul 12, 2011 at 8:28 AM, Ronald Bonica <rbonica@juniper.net> wrote:
Leo,
Maybe we can fix this by:
a) bringing together larger groups of clueful operators in the IETF
b) deciding which issues interest them
c) showing up and being vocal as a group in protocol-development working groups
To some degree, we already do this in the IETF OPS area, but judging by your comments, we don't do it nearly enough.
Comments?
There may be an OPS area, but it is not listened to.
Witness the latest debacle with the attempt to make 6to4 historic.
Various "non-practicing entities" were able to derail what network operators largely supported. Since the IETF failed to make progress operators will do other things to stop 6to4 ( i have heard no AAAA over IPv4 transport, blackhole 6to4 anycast, decom relay routers...)
Those are all REALLY bad ideas. Speaking as an operator, the best thing you can do to alleviate the problems with 6to4 is operate more, not fewer, 6to4 relays.
Unless of course the large providers get their shared transition space, in which case all 6to4 behind it will break in a really ugly way, pretty much exactly as with the mobile operator in question.
Actually, if those same providers run 6to4 gateways/routers on their networks in that shared transition space with public IPv6 addresses on the exterior, it would not break at all.
ARIN 2011-5 specifically cites numbering CPE in that space as the justification for deployment. The CPE therefore have to be NATted, and you are implying that you'll be NATting the 6to4; overall I'd put that in the less desirable category as far as violating expectations goes...
As I said, the resolution to the 6to4 problems described is to run MORE, not fewer, 6to4 gateways.
Are you advocating draft-kuarsingh-v6ops-6to4-provider-managed-tunnel? http://tools.ietf.org/html/draft-kuarsingh-v6ops-6to4-provider-managed-tunnel
The goal of moving 6to4 to historic was not to encourage the outcome described; it was to take having 6to4 as a default method of any kind off the table going into the future. If mature adults want to use it, great, but conformance tests shouldn't require it, CPE shouldn't turn it on just because they think they have a public IP with no filtering, and hosts shouldn't use it unless told to do so.
I have no problem with saying 6to4 should not be enabled by default. However, that doesn't change the fact that the best way to resolve things given current shipping software and hardware is to deploy 6to4 gateways in the appropriate places.
And we have http://tools.ietf.org/html/draft-ietf-v6ops-6to4-advisory-02. The fact of the matter is that more 6to4 relays are only an anodyne, as is rejiggering the address selection priority; the pain may go down, but it won't go away.
Owen
Blocking AAAA over IPv4 transport is just silly. It's just as likely that your AAAA record is destined for an end-host that has native IPv6 connectivity with an intermediate resolver that doesn't have IPv6 as it is that you're sending that to a 6to4 host. Further, there's no reason to believe the 6to4 host won't attempt to resolve via IPv6, so it doesn't really help anyway.
Real network operators have a relatively low BS threshold; they have customers to support and businesses to run, and they don't have time to thumb-wrestle these people who don't actually have any skin in the game.
I agree, but, it's not hard to run 6to4 relays and running them does much more to alleviate the problems with 6to4 than anything you proposed above. Indeed, what you proposed above will likely create more customer issues rather than reduce them.
Owen
Cameron
Ron
-----Original Message----- From: Leo Bicknell [mailto:bicknell@ufp.org] Sent: Monday, July 11, 2011 3:35 PM To: nanog@nanog.org Subject: Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)
In a message written on Sun, Jul 10, 2011 at 06:16:09PM +0200, Jeroen Massar wrote:
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
The way the IETF and the operator community interact is badly broken.
The IETF does not want operators in many steps of the process. If you try to bring up operational concerns in early protocol development for example you'll often get a "we'll look at that later" response, which in many cases is right. Sometimes you just have to play with something before you worry about the operational details. It also does not help that many operational types are not hardcore programmers, and can't play in the sandbox during the major development cycles.
On Jul 10, 2011, at 12:16 PM, Jeroen Massar wrote:
On 2011-07-10 17:56, David Miller wrote: [..]
+1
There is a lack of will on the part of the IETF to attract input from operators and to involve them in its processes (which I would posit is a critical element of the process).
Ehmmmm ANYBODY, including you, can sign up to the IETF mailing lists and participate there, just like a couple of folks from NANOG are already doing.
You are on NANOG out of your own free will, the same applies to the IETF. If you don't participate here your voice is not heard either, just like at the IETF.
Peeking at the ipv6@ietf.org member list, I don't see your name there. You can signup here: https://www.ietf.org/mailman/listinfo/ipv6
Thanks, Jeroen. For IPv6 functionality, I'd suggest ipv6@ietf.org (https://www.ietf.org/mailman/listinfo/ipv6). For IPv6 operational issues, I'd suggest v6ops@ietf.org (https://www.ietf.org/mailman/listinfo/v6ops). For security-related issues, you might also look into opsec@ietf.org (https://www.ietf.org/mailman/listinfo/opsec). On Jul 10, 2011, at 3:45 PM, Owen DeLong wrote:
Number two: While anyone can participate, approaching IETF as an operator requires a rather thick skin, or, at least it did the last couple of times I attempted to participate. I've watched a few times where operators were shouted down by purists and religion over basic real-world operational concerns.
That goes both ways. I periodically see dismissive statements about the IETF on operational lists, and dismissive statements about operators on IETF lists. I would classify David's comment as "dismissive", the kind of comment that causes IETF folks to not participate in operational meetings or lists, and the kind of comment cited by operational folks such as you as reasons to leave IETF meetings and lists. Such comments tend to come from a small set of individuals on each side. If such comments bother you, feel free to block the in-duh-viduals that send them. Personally, I try to listen to them; they are often telling me something I need to hear but don't want to.
On Sun, 2011-07-10 at 10:14 -0400, Jeff Wheeler wrote:
Cogent's policy of requiring a new contract, and from what I am still being told by some European customers, new money, from customers in exchange for provisioning IPv6 on existing circuits, means a simple technical project gets caught up in the complexities of budgeting and contract execution.
"Can we have IPv6 transit?" "Yes, please turn up a session to.." That was asking Cogent for IPv6 dual-stack on our existing IPv4 transit. I'm not saying it's any good, but it certainly didn't cost extra. Tom
On 11/07/2011 08:25, Tom Hill wrote:
I'm not saying it's any good, but it certainly didn't cost extra.
Several people mentioned this to Jeff on IRC a short time ago, so it's not clear why he chose to suggest that ipv6 users in Europe were being fleeced by Cogent for a set-up fee. Perhaps it has happened, but it appears not to be their policy. Of course, if you actually want a full ipv6 table, you will need to go elsewhere. Nick
On Mon, Jul 11, 2011 at 3:25 AM, Tom Hill <tom@ninjabadger.net> wrote:
On Sun, 2011-07-10 at 10:14 -0400, Jeff Wheeler wrote:
Cogent's policy of requiring a new contract, and from what I am still being told by some European customers, new money, from customers in exchange for provisioning IPv6 on existing circuits, means a simple technical project gets caught up in the complexities of budgeting and contract execution.
"Can we have IPv6 transit?" "Yes, please turn up a session to.."
That was asking Cogent for IPv6 dual-stack on our existing IPv4 transit.
I continue to hear different. In my first-hand experience just about three weeks ago, I was told by Cogent that I need to execute a new contract to get IPv6 added to an existing IPv4 circuit (U.S. customer.) This turned a simple pilot project with only a few I.T. folks involved into, well, I'm still waiting on this new contract to be executed. I'm not surprised. -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
On Mon, 2011-07-11 at 04:50 -0400, Jeff Wheeler wrote:
"Can we have IPv6 transit?" "Yes, please turn up a session to.."
That was asking Cogent for IPv6 dual-stack on our existing IPv4 transit.
I continue to hear different. In my first-hand experience just about three weeks ago, I was told by Cogent that I need to execute a new contract to get IPv6 added to an existing IPv4 circuit (U.S. customer.) This turned a simple pilot project with only a few I.T. folks involved into, well, I'm still waiting on this new contract to be executed. I'm not surprised.
In fairness, we have a small commit. If you're talking multi-gigabit+, then perhaps they could be a little more concerned about the amount of IPv6 traffic that you might start pushing, leading to delay tactics and/or a required contract change to protect themselves. (Not that it's likely much to be concerned about. But then, I don't know who your customer is. ;)) Or the more likely reality that one hand doesn't talk to the other and everyone's getting varying answers/actions from Cogent, depending on whom they speak with. Tom
so... how much of the heavy lifting are you personally willing to do and how much are you depending/expecting others to do on your behalf? public whining that the v6 network does not mirror the v4 network is not productive and is not news. of course ymmv. /bill
participants (39)
- Benson Schliesser
- bmanning@vacation.karoshi.com
- Bob Network
- Cameron Byrne
- Christopher Morrow
- Damien Saucez
- Darrel Lewis
- David Miller
- Dobbins, Roland
- Doug Barton
- Eliot Lear
- Fernando Gont
- Florian Weimer
- Franck Martin
- Fred Baker
- Jared Mauch
- Jason Hellenthal
- Jeff Wheeler
- Jeroen Massar
- Jimmy Hess
- Joel Jaeggli
- Karl Auer
- Leo Bicknell
- Luigi Iannone
- Mark Andrews
- Michael Thomas
- Mikael Abrahamsson
- Nick Hilliard
- Owen DeLong
- Randy Bush
- Ronald Bonica
- Scott Brim
- Seth Mos
- steve ulrich
- Tim Chown
- Tom Hill
- Valdis.Kletnieks@vt.edu
- William Allen Simpson
- William Herrin