i wrote:
wrt the mit paper on why small ttl's are harmless, i recommend that y'all actually read it, the whole thing, plus some of the references, rather than assuming that the abstract is well supported by the body.
here's what i've learned by watching nanog's reaction to this paper, and by re-reading the paper itself.

1. almost nobody has time to invest in reading this kind of paper.
2. almost everybody is willing to form a strong opinion regardless of that.
3. people from #2 use the paper they didn't read in #1 to justify an opinion.
4. folks who need academic credit will write strong self-consistent papers.
5. those papers do not have to be inclusive or objective to get published.
6. on the internet, many folks by nature "think locally and act globally".
7. #6 includes manufacturers, operators, endusers, spammers, and researchers.
8. the confluence of bad science and disinterested operators is disheartening.
9. good "actual policy" must often fly in the face of "accepted mantra".

we now return control of your television set to you.
Regarding both Paul's message below and Simon Walters' earlier message on this topic...

Simon Walters scribed:
I'm slightly concerned that the authors think web traffic is the big source of DNS; they may well be right (especially given one of the authors is talking about his own network), but my quick glance at the
Two things. First, the paper does break down the DNS traffic by the protocol that generated it -- see section III C, which notes that "a small percentage of these lookups are related to reverse black-lists such as rbl.maps.vix.com". But remember that the study was published in 2001, based upon measurements made in January and December of 2000, and RBL traffic wasn't nearly the proportion of DNS queries then that it is today. As the person responsible for our group's spam filtering (one mailserver among the many measured as part of the study), I can say we didn't start using SpamAssassin until late 2001, and I believe we were one of the more aggressive spam-filtering groups in our lab. Also note that they found about 20% of the TCP connections were FTP connections, mostly to/from mirror sites hosted in our lab, and the sendmail of five years ago wasn't as aggressive about performing reverse verification of sender addresses.

Second, I asked Jaeyeon about this (we share an office), and she noted: "In our follow-up measurement study, [we found] that DNSBL related DNS lookups at CSAIL in February 2004 account for 14% of all DNS lookups. In comparison, DNSBL related traffic accounted for merely 0.4% of all DNS lookups at CSAIL in December 2000." So your question was right on the money for contemporary DNS data.
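For context on why DNSBL checks generate so many DNS lookups: an MTA tests each connecting client's IP address against a blacklist zone by reversing the octets and issuing an ordinary A-record query under that zone, at least one per inbound SMTP connection. A minimal sketch of the mechanism (my illustration, not code from the paper; the zone name is the one quoted above, and 192.0.2.1 is just the documentation address standing in for a client IP):

    # Toy sketch of an MTA-style DNSBL check (illustration only, not from
    # the paper).  The zone name is the one cited above; 192.0.2.1 is the
    # documentation address, used purely as an example client IP.
    import socket

    def dnsbl_listed(client_ip, zone="rbl.maps.vix.com"):
        # Reverse the octets and append the blacklist zone, e.g.
        # 192.0.2.1 -> 1.2.0.192.rbl.maps.vix.com, then do a normal A lookup.
        name = ".".join(reversed(client_ip.split("."))) + "." + zone
        try:
            socket.gethostbyname(name)   # any A record back means "listed"
            return True
        except socket.gaierror:
            return False                 # NXDOMAIN / no data means "not listed"

    print(dnsbl_listed("192.0.2.1"))

Every one of those lookups lands on the local resolver (and, on a miss, on the blacklist's nameservers), which is how DNSBL traffic grew from a rounding error to a double-digit share of all DNS queries.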
The abstract doesn't mention that the TTL on NS records is found to be important for the scalability of the DNS. That is probably the main point Paul wants us to note. Just because the DNS is insensitive to slight changes in A-record TTLs doesn't mean TTL doesn't matter on other records.
This is a key observation, and it seems like it's definitely missing from the abstract (alas, space constraints...). The low-TTL result is about ordinary A records: they're not talking about NS records, and they're not talking about the associated A records for _nameservers_.

On Sat, Aug 07, 2004 at 04:55:00PM +0000, Paul Vixie scribed:
here's what i've learned by watching nanog's reaction to this paper, and by re-reading the paper itself.
1. almost nobody has time to invest in reading this kind of paper.
2. almost everybody is willing to form a strong opinion regardless of that.
3. people from #2 use the paper they didn't read in #1 to justify an opinion.
:) human nature.
4. folks who need academic credit will write strong self-consistent papers.
5. those papers do not have to be inclusive or objective to get published.
6. on the internet, many folks by nature "think locally and act globally".
7. #6 includes manufacturers, operators, endusers, spammers, and researchers.
8. the confluence of bad science and disinterested operators is disheartening.
9. good "actual policy" must often fly in the face of "accepted mantra".
I'm not quite sure how to respond to this part (because I'm not quite sure what you meant...). It's possible that the data analyzed in the paper may not be representative of, say, commercial Internet traffic, but how is its objectivity in question? The conclusions of the paper are actually pretty consistent with what informed intuition might suggest.

First: "If NS records had lower TTL values, essentially all of the DNS lookup traffic observed in our trace would have gone to a root or gTLD server, which would have increased the load on them by a factor of about five. Good NS-record caching is therefore critical to DNS scalability."

And second: "Most of the benefit of caching [of A records] is achieved with TTL values of only a small number of minutes. This is because most cache hits are produced by single clients looking up the same server multiple times in quick succession [...]"

As most operational experience can confirm, operating a nameserver for joe-random-domain is utterly trivial -- we used to (primary) a couple thousand domains on a p90 with bind 4. As your own experience can confirm, running a root nameserver is considerably less trivial. The paper confirms the need for good TTL and caching management to reduce the load on the root nameservers, but once you're outside that sphere of ~100 critical servers, the hugely distributed and heavy-tailed nature of DNS lookups renders caching a bit less effective, except in those cases where client access patterns cause intense temporal correlations.

  -Dave

--
work: dga@lcs.mit.edu    me: dga@pobox.com
MIT Laboratory for Computer Science    http://www.angio.net/
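To make the second quoted result concrete, here is a toy cache simulation (mine, not the paper's; all timing and name-space parameters below are invented) showing that once the TTL covers a single client's burst of repeat lookups, a TTL of a small number of minutes already captures most of the achievable hit rate, and much longer TTLs add little:

    # Toy simulation (not from the paper): cache hit rate vs. A-record TTL
    # when most repeat lookups come from one client re-resolving the same
    # name within a few seconds.  All parameters are made up.
    import random

    def hit_rate(ttl_seconds, bursts=5000, names=20000):
        random.seed(1)
        cache = {}                     # name -> expiry time of cached answer
        hits = misses = 0
        t = 0.0
        for _ in range(bursts):
            t += random.expovariate(1 / 30.0)          # a new burst every ~30 s
            name = "host%d.example" % random.randint(1, names)
            for _ in range(5):                         # same client, quick succession
                t += random.uniform(0.0, 2.0)
                if cache.get(name, 0.0) > t:
                    hits += 1
                else:
                    misses += 1
                    cache[name] = t + ttl_seconds
        return hits / (hits + misses)

    for ttl in (15, 60, 300, 3600, 86400):
        print("TTL %6d s -> hit rate %.2f" % (ttl, hit_rate(ttl)))

With these made-up parameters the hit rate is already near its ceiling at a TTL of a minute or two; stretching the TTL out to a day only buys the rare case where a different burst happens to reuse the same name.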