Re Owen,

Just a short (ok, now I read it again, it's grown...) answer to the list, but you're right, we might continue this in private. (Reply-To set)

Thanks for being so patient explaining everything, and for discussing with a (still somewhat) hairy-head like myself :-)

owen@delong.com (Owen DeLong) wrote:
You're only talking v6? Why? Anyway, let's follow this through...
Because we don't really need to solve this in V4. V4 multihoming is well understood and is unlikely to hit a scaling limit on router capabilities before we hit an end of life on address space. ... Again... Multihoming already works in V4 and there is no real need to solve this in the V4 world.
I expect strongly rising demand from end-customers to multihome right now, and we still have a bunch of /24s to go on. But then, it may only add another 300K prefixes to the BGP table, which is not really an order of magnitude. As to the "it works" - surely it does, but up to now I believed it wouldn't scale far enough. Maybe I'm wrong (see Moore).
You only need one RLI in the packet header. More would actually be bad. Let me 'splain. If you are routing on RLI, then, you need to choose the best path and stick to it. If the packet doesn't make it through that way, that's OK... That's what retransmits are for. If you start rerouting it on the fly, it's likely to loop a lot before dying, but, little else is achieved. Worse, it's likely to loop even if it might have gotten there given one path and only one path chosen as best by the RLI inserting router.
Actually, I don't understand the last part; why should it loop in this case? It's a matter of destination(s) look-up on the "core" routers, just like in your model. Only there is potentially more than one destination. It would of course loop anyway if it entered (the same part of) the same transit AS again, but that is independent of whether you see the ESI or not (aka RLI insertion vs. encapsulation).

I'm still not comfortable with the box in São Paulo determining whether the packet should go to ISP A in Hamburg or ISP B in Munich or ISP C in Frankfurt (from where the respective ISP would forward it to the customer in Cologne). This decision can easily be made later on and result in a "better" path.
No, it is not. Since the RLI inserting router has up to date dynamic information about which RLIs are reachable and at what cost (BGP
The inserting router is less likely to have up-to-date RLI topology information than routers closer to the packet's destination, due to the way the topology information gets distributed.
No. You have nearly the same advantage you have today. If the path goes away, then, hopefully by the time of retransmit, the RLI inserting router will have learned that that RLI destination is no longer reachable, and, he will insert a different one in the retransmitted packet. Same as what happens today with the retransmitted packet being sent a different way.
I don't like "hopefully" here, but maybe that's our trade-off anyway. You are, nonetheless, giving the "RLI inserting router" somewhat "hotter" information, if it has to make the topological choice (choose destination RLI and, implicitly, select a group of possible paths over all others). If it were only to know the translation information which does not change as often, I'd be much happier.

What I also do not like is the wrong analogy to today's routing mechanism. You claim implicitly that the RLI inserting router's new decision was the same as what happened in the Internet routing system today: rerouting packets. This means, in other words, you're making a global choice locally.

But of course, the current system does not reroute at the packet source (only), it can do this on any hop between source and destination and thus makes only local choices locally. This is a significant difference, because it makes adaptation to changes easier, faster, and it works with only partial convergence along the path.
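To illustrate what I understand the insertion-time choice to be, here is a minimal sketch in Python. All names (`choose_rli`, the `esi_to_rlis` mapping, the `rli_cost` table) are made up for illustration and come from no spec; the cost table merely stands in for whatever reachability information the inserting router has learned. The point is that the retransmit only takes another RLI once the inserting router itself has learned about the withdrawal.

```python
def choose_rli(esi, esi_to_rlis, rli_cost):
    """Pick the cheapest RLI the inserting router currently believes reachable.

    esi_to_rlis: hypothetical translation table, ESI -> list of candidate RLIs
    rli_cost:    hypothetical view of reachable RLIs and their cost
    Returns None if no candidate RLI is known to be reachable.
    """
    candidates = [rli for rli in esi_to_rlis.get(esi, []) if rli in rli_cost]
    if not candidates:
        return None
    return min(candidates, key=lambda rli: rli_cost[rli])


# Customer Z is reachable through two providers (assumed example data).
esi_to_rlis = {"Z": ["RLI-A", "RLI-B"]}
rli_cost = {"RLI-A": 10, "RLI-B": 20}

first_try = choose_rli("Z", esi_to_rlis, rli_cost)

# The path via RLI-A goes away. Only after the inserting router has
# learned this does a retransmitted packet get a different RLI.
del rli_cost["RLI-A"]
retransmit = choose_rli("Z", esi_to_rlis, rli_cost)

print(first_try, retransmit)
```

Until the `del` happens in the inserting router's own view, every packet keeps getting RLI-A, no matter what routers further along the path already know.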
Who exactly chooses? IMHO it's AS B that does the selection. And: B is closer to the target, aka the source of the routing information. Its BGP table is more likely to be up-to-date.
Right... B is the first DFZ router. A is not likely DFZ since A is not multihomed in your scenario. No need for A to be DFZ if A only talks to B.
Yesyesyes, consider

    A B C D E F T
    A B C D G H T

What now? Is "D" necessarily the first DFZ router? I think not. So you are still using B for the RLI insertion; B has to make the choice, and that choice may be wrong or sub-optimal.
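For the two paths above, a small sketch (Python, the function name is mine and purely illustrative) that finds the last hop both paths have in common. That hop is where the choice between the two paths could still be made, as opposed to where the RLI gets inserted.

```python
def divergence_point(path_a, path_b):
    """Return the last hop shared by both paths before they split, or None."""
    last_common = None
    for hop_a, hop_b in zip(path_a, path_b):
        if hop_a != hop_b:
            break
        last_common = hop_a
    return last_common


path_1 = "A B C D E F T".split()
path_2 = "A B C D G H T".split()

# The paths split after D, three hops beyond B.
print(divergence_point(path_1, path_2))  # D
```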
Z's ESI is visible in the core, but, not carried in the routing table. Z does not have an RLI, but, instead uses the RLIs of their provider(s).
Yup, in your "add something to the header" scenario, the ESI is still visible. In mine it is not (it is, but encapsulated). Actually, it does not matter, as long as the destination can revive this information ("destination" as in "the re-translating router").
In the long run (once this is ubiquitous on core routers), the global prefix-based table can be abandoned freeing router memory. Hopefully that would occur before the global table and this table grew to require significant hardware upgrades, and, would make significant room for caching ESI->RLI lookups.
Moving the intelligence out of the core. Well, yes, that's an advantage for the migration phase (which could take decades).
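On the caching of ESI->RLI lookups mentioned above: a rough sketch of what such a cache might look like, assuming a simple least-recently-used policy. The class name, the `resolve` callback and the tiny capacity are assumptions for illustration only; nothing in this thread specifies an eviction policy or a lookup interface.

```python
from collections import OrderedDict


class EsiRliCache:
    """LRU cache in front of a (hypothetical) ESI -> RLI lookup service."""

    def __init__(self, resolve, capacity=2):
        self._resolve = resolve        # callback doing the real lookup
        self._capacity = capacity      # max number of cached ESIs
        self._entries = OrderedDict()  # oldest entry first

    def lookup(self, esi):
        if esi in self._entries:
            # Cache hit: mark as most recently used.
            self._entries.move_to_end(esi)
            return self._entries[esi]
        # Cache miss: ask the lookup service and remember the answer.
        rlis = self._resolve(esi)
        self._entries[esi] = rlis
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return rlis


# Assumed example data; "calls" records every real lookup that was needed.
table = {"X": ["RLI-A"], "Y": ["RLI-B"], "Z": ["RLI-A", "RLI-B"]}
calls = []


def resolve(esi):
    calls.append(esi)
    return table[esi]


cache = EsiRliCache(resolve, capacity=2)
for esi in ["Z", "Z", "Y", "X", "Z"]:
    cache.lookup(esi)

print(calls)  # ['Z', 'Y', 'X', 'Z']
```

The second lookup of Z is served from the cache; the last one is not, because Z was evicted when X came in.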
No, you don't have to distribute it. You _CAN_ provide it for lookup instead.
How do I get there? Bootstrapping? 2.1.20? That's not moot at all.
Then I do not understand why you want the DFZ routers to be able to translate.
I don't know what you mean by translate.
Translate ESI to RLI and insert that into the packet header.
Well... Since we already have RIRs, I don't see a reason that the top level of the hierarchy for this information couldn't be managed as ANYCAST servers at well known addresses run by the RIRs and/or IANA. All space originates from there anyway, so, it is a natural point of hierarchy. In essence, the router will learn the path to the Root and Top Level RLIs which will be fixed ASNs assigned as part of this protocol deployment. Only the root is truly necessary.
Special ASNs/RLIs, reserved for this? What about extensibility there? And actually, that's not bootstrapping the system, because if you're in the DFZ, you need a specific path to go there, and you have to get it from somewhere. So either there's a hole in the idea or I'm too dumb to understand. Or do you mean, every RLI hosts such an anycasted server, in order for their routers to be able to reach it?

And let's not forget that the anycasted RIR routing topology servers also need to get updated somehow...

Btw, yes, I like the idea of RIRs and RAs taking control over Internet routing (it's only logical, and it's necessary albeit currently impossible); I would propose the same. Not everybody may like that, though.
The source of the packet does not determine it. The first DFZ router (often many routers removed from the source) determines its best path. Just like today when the first DFZ router makes a choice, e.g. between forwarding to 701 or 3561 to get to 10565. Once the packet is handed off to 701, it's not going to come back and go via 3561 in most cases. If 701 loses its connection downstream towards 10565, it will likely drop the packet.
BGP has always been based on the idea of contiguous ASs. That's why 3561 will refuse to accept the packet. That avoids loops, but it also makes life harder in other respects.

I think your "first DFZ router" is quite close to the packet source in most cases. It usually is the customer's own router (if they participate in DFZ routing, like we do), or it's one of the upstreams' edge routers. At least, it is, as soon as prefix-based routing has disappeared from large parts of the DFZ.

Cheers,
Elmar.

--
"Begehe nur nicht den Fehler, Meinung durch Sachverstand zu substituieren."
(PLemken, <bu6o7e$e6v0p$2@ID-31.news.uni-berlin.de>)

--------------------------------------------------------------[ ELMI-RIPE ]---