Here is a good message on the subject from DNSOP. It explains that PPLB load balancing on BGP links is not the only way the problem can arise; PPLB on interior links can also be a problem.

The following statement from RFC 1546 is probably most significant:

   It is important to remember that anycasting is a stateless service.
   An internetwork has no obligation to deliver two successive packets
   sent to the same anycast address to the same host.

Particularly the second sentence.

		--Dean

--
Av8 Internet   Prepared to pay a premium for better service?
www.av8.net         faster, more reliable, better service
617 344 9000

---------- Forwarded message ----------
Date: Mon, 4 Oct 2004 19:51:45 -0400 (EDT)
From: Dean Anderson <dean@av8.com>
To: Iljitsch van Beijnum <iljitsch@muada.com>
Subject: Re: [dnsop] Re: Root Anycast (fwd)

On Sat, 2 Oct 2004, Iljitsch van Beijnum wrote:
On 2-okt-04, at 2:48, Dean Anderson wrote:
I agree entirely. Though I'd point out that paths over OSPF redundant links exiting an AS could lead to different border routers.
Well, I don't see how information in OSPF or another internal protocol could lead to different links being used. (Unless you redistribute BGP in OSPF, which isn't a very good idea.)
It doesn't have to be exported:

        host
          |
          A          OSPF interior: sends traffic to B1, B2 (PPLB & default route)
         / \
       B1   B2       BGP used to talk to AS D and E
       |     |
       D1    E1      BGP used to talk AS B and AS F
      /  \  /  \
       F1    F2      BGP ...
      /  \  /  \
     f1        f3    f1, f3 are anycasted roots

Any packet sent to B1 will likely always arrive at f1, while any packet sent to B2 will probably always arrive at f3...

BTW, good call on the UDP fragmentation. I didn't think of that.

		--Dean
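The split described above can be sketched in a few lines of Python. This is only a toy model, not a router implementation: the names `NEXT_HOPS`, `ANYCAST_INSTANCE`, `per_packet` and `per_flow` are made up for illustration, the flow identifier uses documentation addresses, and the assumption that B1 always reaches f1 and B2 always reaches f3 is taken from the diagram.

```python
import zlib
from itertools import cycle

# Assumption from the diagram: traffic leaving via B1 ends at f1, via B2 at f3.
NEXT_HOPS = ["B1", "B2"]
ANYCAST_INSTANCE = {"B1": "f1", "B2": "f3"}


def per_packet(packets):
    """Round-robin each packet over the next hops (PPLB)."""
    hops = cycle(NEXT_HOPS)
    return [ANYCAST_INSTANCE[next(hops)] for _ in packets]


def per_flow(packets, flow_id):
    """Hash the flow identifier once so every packet of a flow takes the same hop."""
    hop = NEXT_HOPS[zlib.crc32(flow_id.encode()) % len(NEXT_HOPS)]
    return [ANYCAST_INSTANCE[hop] for _ in packets]


# Four segments of a single hypothetical TCP exchange with an anycast root.
tcp_segments = ["SYN", "ACK", "query", "FIN"]
pplb_result = per_packet(tcp_segments)
flow_result = per_flow(tcp_segments, "192.0.2.1:53000->198.51.100.53:53/tcp")

print(pplb_result)  # segments alternate between f1 and f3
print(flow_result)  # every segment reaches the same instance
```

With per-packet balancing the segments of one connection alternate between f1 and f3, and neither server sees a complete TCP session. With per-flow hashing the whole exchange stays on one instance.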
I'm not sure what "significant number of coincidences" is,
Like this:
    A
   / \
  B1  B2
  |   |
  C   D
  |   |
  E1  E2
AS A connects to two different routers in AS B, and each of these routers prefers a different external path towards different anycast instances of AS E. In order for this to happen the paths from B to both anycast instances must be completely identical, except that for one router in B one path is preferred and for another router the other. This will only happen if these routers connect to ASes C and D themselves, or if one sees a better IGP metric towards the router connecting to C and another sees a better IGP metric towards the router connecting to D.
If PPLB is turned on, and if there is a path from A to E through both B1 and B2, packets will be evenly distributed across those links. I'm told this works on IOS. I've also been told many companies are working on this.

Next, we need to look in more detail at how anycast works. Essentially, anycast is giving two or more computers the same IP address, and then distributing those computers so that packets to that IP address will hit only one computer. Thus, you get load distribution and not just redundancy.

Using your example, AS C and AS D both connect to AS E through different routers. Using Vixie's document, we suppose that the destination IP is on a switched LAN directly attached to both routers. Note that this does not appear to be the case in practice. In practice, it seems that anycast servers are physically distributed and that routes to the particular address just go to different computers that don't share a LAN. But I think these particulars are not significant. So, I'll just consider the case of a switched LAN:

On getting the packet from A to the anycast host (on E), router E1 will ARP for the MAC address of the anycast host. It will get one MAC. After that, and until the expiration of the ARP entry, it will use that MAC for that IP. Packets from that router will go through the switch only to that physical address. Again, a load distribution is achieved.

At first, I thought a possible solution might be to use anycast on a MAC address. However, this approach only achieves replication, not load distribution. All packets have to be seen and processed by all anycast hosts, and there is no distribution of load. This would still not work with TCP, and in fact would be worse.
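The ARP behaviour in the switched-LAN case can be sketched as a toy cache. This is an illustration only: `ArpCache`, its `ttl` parameter, the 60-second timeout, the MAC strings and the address 192.0.2.53 are all made up, and "first responder wins" is an assumption about which reply the router keeps.

```python
class ArpCache:
    """Toy ARP cache: one MAC per IP address until the entry expires."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}  # ip -> (mac, expires_at)

    def resolve(self, ip, now, responders):
        """Return the cached MAC, or cache the first responder's MAC after expiry."""
        cached = self.entries.get(ip)
        if cached is not None and now < cached[1]:
            return cached[0]
        mac = responders[0]  # assumption: the first reply received is kept
        self.entries[ip] = (mac, now + self.ttl)
        return mac


cache = ArpCache(ttl=60)
first = cache.resolve("192.0.2.53", now=0, responders=["mac-a", "mac-b"])
second = cache.resolve("192.0.2.53", now=30, responders=["mac-b", "mac-a"])
third = cache.resolve("192.0.2.53", now=61, responders=["mac-b", "mac-a"])
print(first, second, third)
```

Until the entry expires, every packet from that router goes to the same physical host regardless of which host would answer a fresh ARP request. After expiry the router may settle on a different host.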
However, this is a more treacherous problem because a site many ASes away from the roots may configure PPLB and find no problems. Sometime later, a change at one of the intermediate ASes causes packets from that site to go over multiple paths instead of one path. Suddenly, the site is not working, but they have made no change, and perhaps their immediate upstreams have also made no change. The only way out of that mess once a deep hole is dug would be to have very strict global regulation of peering and even transit so that this situation is always avoided.
Here's a picture of the hard-to-find problem (borrowing your pictures).

A deploys PPLB and finds no problems:

    A
   / \
  B1  B2
  |   |
  X   Y
  |   |
  C   D
   \  |
  F   G
  |   |
  E1  E2

Everything is hitting E2 and just one anycast server.

Sometime later, C decides to peer with F and drop peering with G:

    A
   / \
  B1  B2
  |   |
  X   Y
  |   |
  C   D
  |   |
  F   G
  |   |
  E1  E2

Now A has problems. A calls X and Y and asks them if anything has changed. They report no. A calls E1 and asks them if anything has changed. Likewise, they report no change.
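The before and after pictures can be checked mechanically. The sketch below is a simplification: it treats each AS as a single node with one next hop, and the names `trace`, `before`, `after` and `instances_reached` are made up for this example.

```python
def trace(start, next_hop):
    """Follow next-hop links from start until a node with no next hop is reached."""
    node = start
    while node in next_hop:
        node = next_hop[node]
    return node


# Before: C peers with G, so both branches converge on E2.
before = {
    "B1": "X", "X": "C", "C": "G",
    "B2": "Y", "Y": "D", "D": "G",
    "F": "E1", "G": "E2",
}

# After: C peers with F instead. Nothing in A, B, X or Y has changed.
after = dict(before, C="F")


def instances_reached(next_hop):
    """Set of anycast instances that PPLB traffic from A (via B1 and B2) can hit."""
    return {trace(b, next_hop) for b in ("B1", "B2")}


print(instances_reached(before))
print(instances_reached(after))
```

In the first table both branches end at E2. After the single change at C, the B1 branch ends at E1 while the B2 branch still ends at E2, so A's interleaved packets now reach two different servers.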
Apart from the small chance of this actually occurring (and that most DNS stuff is UDP that fits in one packet),
I think this is only true of the network today:

   Few people are using PPLB at present.
   Few are using TCP as a DNS transport.
   Few are using large UDP packets (which, as you pointed out, could be fragmented).

However, in the future, we anticipate a different usage pattern:

   Many will be using PPLB to improve performance and reliability.
   Many will be using TCP for DNS.
   Many will be using large UDP packets for DNS.
and the fact that it's extremely unlikely to hit multiple anycasted roots at the same time, I don't believe this problem will be too harmful. In most cases a TCP session will still work even though there is a lot of "packet loss". And as long as there are some non-anycasted roots we'll be fine anyway.
The likelihood of hitting multiple anycasted roots depends on:

   The number of anycasted roots
   The number and configuration of paths to the roots
   The uniqueness of multiple PPLB paths between a user and the roots
   The use of PPLB

The more anycasted roots there are, the more paths there are, and the more PPLB is used, the more likely it becomes that there are distinct paths to different anycast roots through which PPLB-interleaved packets are sent.

		--Dean
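To make the direction of that dependence concrete, here is a deliberately crude model. It assumes each PPLB path ends at one of the anycast instances independently and with equal probability, which real routing certainly does not do, so the numbers mean nothing beyond showing the trend. The function name `split_probability` is made up.

```python
def split_probability(instances, paths):
    """Chance that `paths` independent paths do not all end at the same instance.

    Toy assumption: each path picks one of `instances` uniformly at random,
    so P(all paths agree) = instances ** (1 - paths).
    """
    if instances < 1 or paths < 1:
        raise ValueError("instances and paths must be positive")
    return 1 - instances ** (1 - paths)


for n, k in [(1, 2), (2, 2), (4, 2), (4, 3)]:
    print(n, "instances,", k, "paths:", split_probability(n, k))
```

Under this assumption the chance of a split is zero with a single instance or a single path, and it grows as either the number of instances or the number of PPLB paths increases.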