Break ends at 11:40, PGP signing will take place, and don't forget to fill out surveys. Anycast fun for the final sessions.

Lorenzo Colitti, RIPE NCC
[slides are at: http://www.nanog.org/mtg-0606/pdf/lorenzo-colitti.pdf]

Agenda: introduction; latency, client-side and server-side; benefit of individual nodes; stability; routing issues.

Why anycast?
Root-server anycast is widely deployed: C, F, I, J, K, M. Several reasons for anycasting: provide resiliency, e.g. contain DoS attacks; spread server and network load; increase performance. But is it effective?

Measure latency.
Ideally, for every given client, BGP should choose the node with the lowest RTT. Does it? From every client, measure RTTs to the anycast IP address and to the service interfaces of the global nodes (which are not anycasted). For every client, compare the RTT to k against the RTT of the closest global node: a = RTT_k / min(RTT_i). If a = 1, BGP is picking the right node; if a > 1, BGP picks the wrong node; if a < 1, the client is seeing a local node.

Latency with TTM: methodology.
DNS queries from ~100 TTM test boxes: dig hostname.bind, see which host answers, extract the RTT, take the min of 5 queries (a probe sketch appears after the knockout discussion below). Check the paths to the service interfaces: are they the same as to the production IP? According to RIS, mostly 'yes'. TTM probe locations are mostly in Europe.

Latency with TTM: results (5 nodes).
Most values are close to one; generally BGP is doing a pretty good job. From 2 nodes to 5 nodes (2 nodes, April 2005; 5 nodes, April 2006): mostly the same results, clustered around one, whether 2 or 5 nodes. Consistency of 'a' over time: took the average over time; TT103 is an outlier, and it was thrown out of the calculation. Results are pretty consistent: the average is a little higher than one, and mostly consistent over time.

Measuring from servers.
The TTM latency measurements are not optimal: locations biased towards Europe, a limited number of probes (~100), and they don't reflect k's client distribution. How to fix? Ping the clients from the servers, giving a much larger dataset. Methodology: process packet traces on k's global nodes, extract the list of client IP addresses, ping all addresses from all global nodes, and plot the distribution of 'a'. 6 hours of data: 246,769,005 queries, 845,328 unique IP addresses. CDF of 'a' seen from the servers: results are not as good as seen by TTM; only 50% of clients have a = 1, and about 10% are 4x slower/farther. Probably due to TTM's clustering in Europe.

Latency conclusions: the 5-node result is comparable to the 2-node result, at least in TTM; the non-TTM results are not so rosy.

How many nodes are needed -- is 5 enough?
Evaluate the existing instances. How to measure the benefit of an instance? Assume optimal instance selection, that is, every client sees its closest instance; this gives an upper bound on the benefit, and lets us see whether we've reached diminishing returns. For every client, see how much its performance would suffer if the chosen node didn't exist. B is the loss factor, how much a client would suffer if an instance were knocked out: B = RTT(instance knocked out) / RTT(instance present) (see the sketch below).

Graph for LINX: 90% of clients wouldn't see an impact if it went away; 10% would see a worsening; the geographic distribution is pretty wide. AMS-IX: about 20% would suffer performance degradation. These are the busiest two nodes and see a lot of clients, so they are important to the k deployment. Plotted with both LINX and AMS-IX knocked out together, about 65% wouldn't be affected, most of the others would see 4x, and 10% would be 7x worse. So taken together, the *two* nodes are important. Tokyo: the best node for few clients, but those it serves would be BADLY served by the others; about 10% would get more than 7x worse if it went away, mostly clients in Asia. The Miami node at NOTA: moderate benefit for some clients; US and South American clients would be badly served by Europe or Tokyo. The Delhi node is mostly ineffective; most of its clients would be served better by other nodes.
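[The TTM methodology above boils down to a CHAOS-class TXT query. A minimal probe sketch, assuming the dnspython library; 193.0.14.129 is k.root-servers.net's service address, and the min-of-5 logic just restates the methodology -- this is an illustration, not the speaker's tooling:]

    import dns.message, dns.query, dns.rdataclass, dns.rdatatype

    K_ROOT = "193.0.14.129"  # anycast service address of k.root-servers.net

    def probe(server=K_ROOT, tries=5):
        """Ask CHAOS TXT hostname.bind; return (answering host, min RTT in ms)."""
        best_rtt, host = None, None
        for _ in range(tries):
            q = dns.message.make_query("hostname.bind", dns.rdatatype.TXT,
                                       rdclass=dns.rdataclass.CH)
            r = dns.query.udp(q, server, timeout=2)
            rtt = r.time * 1000.0  # dnspython records elapsed time in seconds
            if best_rtt is None or rtt < best_rtt:
                best_rtt = rtt
                host = str(r.answer[0][0]) if r.answer else None
        return host, best_rtt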
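[Both metrics reduce to simple arithmetic. A sketch with a hypothetical per-client RTT table; node names and values are illustrative only:]

    # Hypothetical RTTs (ms) from one client to each global node's service interface.
    rtts = {"linx": 12.0, "amsix": 15.0, "tokyo": 240.0,
            "miami": 110.0, "delhi": 180.0}

    def a_ratio(anycast_rtt, node_rtts):
        """a = RTT to the anycast address / min RTT over the global nodes.
        a == 1 means BGP picked the lowest-latency node; a > 1 means it didn't."""
        return anycast_rtt / min(node_rtts.values())

    def loss_factor(node_rtts, knocked_out):
        """B = best RTT with `knocked_out` removed / best RTT overall,
        assuming optimal selection (every client sees its closest node)."""
        best = min(node_rtts.values())
        remaining = min(v for node, v in node_rtts.items() if node != knocked_out)
        return remaining / best

    print(a_ratio(12.0, rtts))        # 1.0: BGP chose the right node
    print(loss_factor(rtts, "linx"))  # 1.25: this client gets 1.25x worse without LINX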
Condense the graph into one number to get a value for the effectiveness of each node: a weighted average of B over all clients. If the benefit value is 1, the node doesn't provide any benefit at all; larger numbers show higher benefit. The European nodes, taken together, show high benefit, as does Tokyo; the Miami node is not so effective, and Delhi is nearly ineffective.

Does anycast provide any value then? Knock out all nodes except LINX, the dark red curve (the pre-1997 situation): 10% wouldn't notice, 85% would get worse, and the benefit value is 18.8. So anycast does bring value.

Stability.
With more nodes, more routes compete in BGP. Switching between instances doesn't matter for single-packet UDP exchanges, but it does matter for TCP. Look at the node switches that occur: collect packet dumps on each node, extract all 53/UDP traffic, k's global nodes only, NTP-synchronized; if an IP shows up on two nodes, log a switch (see the sketch after the Q&A below). 5 nodes, April 2006: 0.06% saw switches; 2,830 switchers out of 845,328 clients, 0.33% switchers. No big issue with instance switching.

Routing issues.
k-root structure: 5 global nodes (prepended): LINX, AMS-IX, Tokyo, Miami, Delhi. Two issues: different prepending values, and no-export causing reachability problems.
TT103 has an 'a' value over 200 (the graph's axis is cut off). TT103 is in Yokohama, and Tokyo is 2ms away, but its queries go to Delhi, passing through Tokyo and LA: 416ms vs. 2ms, so the value is 208. Thanks to Matsuzaki and Randy Bush, got BGP paths from AS2497: a bad interaction of different prepending lengths; Delhi had shorter prepending, so the prepending on the Tokyo node needs to be fixed.

no-export and leaks.
Local nodes can be worse than global nodes: TT89 is seeing the DENIC local node at 30ms instead of going to London. If no-export is ignored and a local node's route is announced to a customer, that more-specific route leaks to customers.
no-export can also lead to loss of reachability:
http://www.merit.edu/mail.archives/nanog/2005-10/msg01226.html
This is a problematic interaction of no-export with anycast: no-export is used to prevent local nodes from leaking, but if you have an AS whose providers all peer with a local node and honor no-export, the customer never sees a route for the k IP address at all. Solution: send out a less specific, covering prefix.

Q: Mark Kosters, VeriSign -- saw much higher switching rates; can he define switching better?
A: if an IP is seen at one location and then shifts to a different site, that's one switch; going back to the first node would be a second switch.
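[Per that definition, counting switches from the trace data is straightforward. A minimal sketch; the input format (per-node lists of timestamped client IPs pulled from the 53/UDP dumps, clocks NTP-synchronized) is an assumption, not the speaker's actual code:]

    from collections import defaultdict

    def count_switches(traces):
        """traces: {node_name: [(unix_time, client_ip), ...], ...}
        Returns (number of clients that switched, total switch count)."""
        events = sorted((ts, ip, node)
                        for node, records in traces.items()
                        for ts, ip in records)
        last_seen = {}               # client IP -> last node that saw it
        switches = defaultdict(int)
        for ts, ip, node in events:
            if ip in last_seen and last_seen[ip] != node:
                switches[ip] += 1    # A->B is one switch; B->A later is another
            last_seen[ip] = node
        return len(switches), sum(switches.values())

    # e.g. 2,830 switchers out of 845,328 clients gives 2830 / 845328 = 0.33%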