On Wed, Aug 11, 2004 at 04:49:18PM +0000, Paul Vixie scribed:
> what i meant by "act globally, think locally" in connection with That
> MIT Paper is that the caching effects seen at mit are at best
> representative of that part of mit's campus for that week, and that ...
Totally agreed. The paper was based on two traces, one from MIT LCS and one from KAIST in Korea. I think the authors understood that they were only looking at two sites, but their numbers have a very interesting story to tell, and I think they're actually fairly generalizable. For instance, the poorly behaved example from your f-root snapshot is quite consistent with one of the paper's findings:

  [Regarding root and gTLD server lookups] "...It is likely that many
  of these are automatically generated by incorrectly implemented or
  configured resolvers; for example, the most common error 'loopback'
  is unlikely to be entered by a user"
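As a tangent, that finding is easy to appreciate mechanically. Here's a toy Python sketch of the kind of heuristic one might use to flag such queries in a root-server trace; the rules (and the set of "leaked local names") are my own invention for illustration, not the paper's actual methodology:

    # Toy classifier for obviously-bogus query names of the sort the
    # paper observed at the roots. Heuristics are illustrative guesses.

    LEAKED_LOCAL_LABELS = {"loopback", "localhost", "localdomain", "workgroup"}

    def is_private_ptr(labels):
        # PTR names are reversed: 4.3.2.1.in-addr.arpa asks about 1.2.3.4.
        if labels[-2:] != ["in-addr", "arpa"]:
            return False
        octets = labels[:-2][::-1]  # back into address order
        return (octets[:1] == ["10"] or
                octets[:2] == ["192", "168"] or
                (octets[:1] == ["172"] and len(octets) > 1 and
                 octets[1].isdigit() and 16 <= int(octets[1]) <= 31))

    def classify(qname):
        labels = qname.rstrip(".").lower().split(".")
        if len(labels) == 1:
            # A bare single label ("loopback") is never a query a user
            # typed; it's a resolver leaking an unqualified local name.
            return "bogus: unqualified single label"
        if labels[0] in LEAKED_LOCAL_LABELS or labels[-1] in LEAKED_LOCAL_LABELS:
            return "bogus: leaked local name"
        if is_private_ptr(labels):
            # Reverse lookups for RFC 1918 space should never reach a root.
            return "bogus: private-address PTR"
        return "plausible"

    for q in ("loopback", "localhost.localdomain.",
              "1.0.0.10.in-addr.arpa.", "www.mit.edu."):
        print("%-28s -> %s" % (q, classify(q)))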
> even a variance of 1% in caching effectiveness at MIT that's due to
> generally high or low TTL's (on A, or MX, or any other kind of data)
> becomes a huge factor in f-root's load, since MIT's load is only one
> ...
But remember: the only TTLs the paper suggested could be reduced were those on non-nameserver A records. You could drop those all to zero and not affect f-root's load one bit. In fairness, the paper does jumble this together with NS-record caching, since most responses from the root/gTLD servers carry both NS records and glue A records in the additional section.

Global impact is greatest when the resulting load changes are concentrated in one place; the clearest example is a change that hits the root servers. When a 1% increase in total traffic is instead spread among hundreds of thousands of different, relatively unloaded DNS servers, the impact on any one of them is minimal. And since we're talking about a protocol that, by various estimates, accounts for less than 3% of all Internet traffic, the packet-count/byte-count impact is negligible (unless it's concentrated, as it is at the root and gTLD servers).
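To make that concrete, here's a toy back-of-the-envelope simulation in Python. Everything about it (the two-day NS TTL, the one-minute mean query gap, the single-cache model) is invented for illustration; it just shows where the repeat lookups land once you hold delegation caching fixed and zero out the A-record TTL:

    # Back-of-the-envelope simulation: hold delegation NS/glue caching
    # fixed (the paper did not propose shortening those TTLs) and drop
    # the TTL on an ordinary A record to zero. All numbers are invented.

    import random

    NS_TTL = 48 * 3600   # typical two-day TTL on a delegation's NS records
    A_TTL = 0            # the paper's "what if" for non-nameserver A records

    def simulate(num_lookups=100000, mean_gap=60.0, seed=1):
        rng = random.Random(seed)
        now = 0.0
        ns_expires = a_expires = -1.0
        root_queries = auth_queries = 0
        for _ in range(num_lookups):
            now += rng.expovariate(1.0 / mean_gap)
            if now >= a_expires:            # A record not in cache
                if now >= ns_expires:       # delegation not in cache either
                    root_queries += 1       # only now do we bother a root
                    ns_expires = now + NS_TTL
                auth_queries += 1           # re-ask the zone's own server
                a_expires = now + A_TTL
        return root_queries, auth_queries

    roots, auths = simulate()
    print("root/gTLD queries: %d   authoritative queries: %d" % (roots, auths))

With these made-up numbers, the zone's own servers absorb all hundred thousand lookups while the roots see only a few dozen, one per NS-cache expiry. The other questions you raise, such as: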
> how much of the measured traffic was due to bad logic in
> caching/forwarding servers, or in clients? how will high and low
> ttl's affect bad logic that's known to be in wide deployment?
are equally important questions to ask, but there are only so many questions a single paper can answer. This one provides valuable insight into client behavior and into when and why DNS caching is effective. Other papers in the past (for instance, Danzig's 1992 study) examined questions closer to the ones you pose, and their results were useful in an entirely different way (namely, showing that almost all root server traffic was bogus because of client errors). From the perspective of a root name server operator, the latter questions are probably more important. But from the perspective of, say, an Akamai or a Yahoo (or joe-random dot com), the former insights are equally valuable.

  -Dave

-- 
work: dga@lcs.mit.edu                 me: dga@pobox.com
MIT Laboratory for Computer Science   http://www.angio.net/