official ISC statement concerning nonimpact on f-root from sql worm
someone wrote me privately and asked:
Hi, Paul - Some of the articles on the 1434 worm talk about 5 or more root nameservers being down.
a lot of people are pretty confused about the data they have available; sometimes it's because they have the wrong data, and sometimes it's because they don't know how to interpret it properly.
Do you know if this was just sheer volume, ...
all of the root name servers i monitor (which is, well, i guess, all of them) looked like they were "up" the whole time of the ms-sql flash worm. latency and jitter were a bit wider than normal, but there was no packet loss at all. (note: icmp is a poor test; some roots won't answer "ping".) looking at rob thomas's graphs tends to confirm that this was generally true for most people who monitor the roots, no matter what their point of "view". in the f-root case, our traffic volume went DOWN by a little bit during the ms-sql flashworm, which corresponded to a number of bgp session resets from our peers. my assumption so far is that a lot of the normal root service traffic bound for "f-root" didn't get here due to OPN (which is "other people's networks" -- the bane of all reliability goals and programmes.)
... or were some of the routers providing connectivity to them actually allowing in UDP 1434?
I'd expect that, given the variety of different attacks on them, at least most of them would block anything except DNS requests except from approved locations, and preferably do so at the ISP routers instead of whatever access lines they're on....
at f-root we don't generate "icmp port unreachable" when non-53 udp is heard -- it's a difference between "deny" and "reject" in freebsd's ipfw and ip6fw subsystems. (most of the f-root hosts are now better able to drop this traffic than their upstream routers are, due to hardware ages.) i'd like to know who it is that couldn't reach 5 of the 13 servers, so if anyone on nanog@ has heard that, please tell me what you heard and who you heard it from so i can get to the bottom of it.
On Mon, 27 Jan 2003, Paul Vixie wrote:
i'd like to know who it is that couldn't reach 5 of the 13 servers, so if anyone on nanog@ has heard that, please tell me what you heard and who you heard it from so i can get to the bottom of it.
This appears to be the earliest "source" of the quote.

http://forums.military.com/1/OpenTopic?a=tpc&s=78919038&f=409192893&m=4551982416

F-Secure has also been widely quoted with the same statistic. Since several root servers disabled ICMP after the last attack, I think they mis-read the ping loss as servers being down. 5 root servers never respond to ICMP.

I did DNS queries of all of the root servers several times throughout the event. Other than G, I was eventually able to get responses from all of them. Sometimes it took several seconds to get a response.

Folks have reported they were able to reach servers through particular networks, but not through other networks. For example, a person sent me data showing they could reach G through MFN, although I couldn't reach it through several other networks. I don't have a network connection on MFN.
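For anyone who wants to repeat that kind of check with DNS queries instead of ping, here is a minimal sketch in Python. It is not the tool used for the observations above; the function names, the fixed query id, and the 5 second timeout are all assumptions made for illustration. It sends one root-zone NS query over UDP to each of a through m.root-servers.net and reports how long the answer took.

```python
import socket
import struct
import time

# The 13 root server names follow the <letter>.root-servers.net pattern.
ROOT_LETTERS = "abcdefghijklm"


def build_query(qid, name=".", qtype=2):
    """Build a minimal DNS query packet (default: NS records for the root)."""
    # Header: id, flags (0 = standard query, recursion not requested),
    # 1 question, 0 answer / authority / additional records.
    header = struct.pack("!HHHHHH", qid, 0, 1, 0, 0, 0)
    labels = [label for label in name.split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # qtype, class IN


def is_reply_to(query, reply):
    """True if reply carries the query's id and has the QR (response) bit set."""
    if len(reply) < 12:
        return False
    reply_id, flags = struct.unpack("!HH", reply[:4])
    query_id = struct.unpack("!H", query[:2])[0]
    return reply_id == query_id and bool(flags & 0x8000)


def probe(host, timeout=5.0, qid=0x1434):
    """Return the round-trip time in seconds, or None if no valid answer came back.

    The fixed qid is an illustrative assumption; a real prober would randomize it.
    """
    query = build_query(qid)
    try:
        addr = socket.gethostbyname(host)  # resolve first so it is not timed
    except OSError:
        return None
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (addr, 53))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # covers timeouts and unreachable errors
            return None
        elapsed = time.monotonic() - start
    return elapsed if is_reply_to(query, reply) else None


if __name__ == "__main__":
    for letter in ROOT_LETTERS:
        host = f"{letter}.root-servers.net"
        rtt = probe(host)
        print(host, "no answer" if rtt is None else f"{rtt * 1000:.0f} ms")
```

A single run only shows what the path from one vantage point looks like at one moment, so "no answer" here says nothing by itself about whether the server is down.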
The RIPE NCC monitors response times to DNS queries of all root server addresses from ~60 points worldwide. Most of these points are within the RIPE region.

2 of the 13 server addresses (B&G) showed noticeable degradation of response times to DNS queries across all measuring points. The other 11 server addresses were *not* measurably affected.

For the two servers that were affected, the data suggests that the servers themselves remained healthy during the event but that network infrastructure relatively close to them was affected. For B this lasted roughly from 0530(UTC) to 0630 and again shortly around 0730. For G it lasted from 0530 to roughly 1030.

My conclusion is that any degradation in DNS root name service that people have seen was due to Internet infrastructure problems between themselves and the servers. For only two of the thirteen server addresses these problems have been close enough to them to cause service degradation across all of our measurement points.

A further tentative conclusion is that this data (again) supports the case for distributing root service across the Internet using IP anycasting.

We expect to present a more detailed analysis at the RIPE meeting this week.

Daniel
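The reasoning above (a problem seen from every measurement point is probably near the server, a problem seen from only some points is probably somewhere in between) can be sketched roughly as follows. This is not the RIPE NCC's method; the function name, the 2x slowdown factor, the 90% quorum, and all of the sample numbers are assumptions invented for illustration.

```python
def degraded_servers(baseline, event, factor=2.0, quorum=0.9):
    """Return servers that look degraded from (nearly) all measurement points.

    baseline and event map server -> {measurement point -> RTT in ms};
    in event, an RTT of None means the query timed out.
    A point counts as "bad" if it timed out or was slower than
    factor * its own baseline. A server is flagged only when at least
    `quorum` of its points are bad, which suggests trouble close to the
    server rather than on one observer's path.
    """
    flagged = []
    for server, points in event.items():
        bad = 0
        for point, rtt in points.items():
            if rtt is None or rtt > factor * baseline[server][point]:
                bad += 1
        if points and bad / len(points) >= quorum:
            flagged.append(server)
    return sorted(flagged)


# Invented sample data: three hypothetical servers, two hypothetical points.
EXAMPLE_BASELINE = {
    "srv1": {"p1": 40.0, "p2": 60.0},
    "srv2": {"p1": 30.0, "p2": 90.0},
    "srv3": {"p1": 50.0, "p2": 70.0},
}
EXAMPLE_EVENT = {
    "srv1": {"p1": 45.0, "p2": 58.0},   # normal from everywhere
    "srv2": {"p1": None, "p2": 400.0},  # bad from everywhere
    "srv3": {"p1": None, "p2": 72.0},   # bad from one point only
}

if __name__ == "__main__":
    print(degraded_servers(EXAMPLE_BASELINE, EXAMPLE_EVENT))
```

With the invented data, only srv2 is flagged: srv3 is unreachable from one point but normal from the other, which points at that observer's path and not at the server.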
participants (3): Daniel Karrenberg, Paul Vixie, Sean Donelan