 
            On Nov 8, 2009, at 7:46 PM, bmanning@vacation.karoshi.com wrote:
"The paper also presents the results of trace-driven simulations that explore the effect of varying TTLs and varying degrees of cache sharing on DNS cache hit rates. "
I'm not debating the traces - I wonder about the simulation model. (and yes, I've read the paper)
I'm happy to chat about this offline if it bores people, but I'm curious what you're wondering about. The method was pretty simple: - Record the TCP SYN/FIN packets and the DNS packets - For every SYN, figure out what name the computer had resolved to open a connection to this IP address - From the TTL of the DNS, figure out whether finding that binding would have required a DNS lookup There are some obvious potential sources of error - most particularly, name-based HTTP virtual hosting may break some of the assumptions in this - but I'd guess that with a somewhat smaller trace, not too much error is introduced by clients going to different name-based vhosts on the same IP address within a small amount of time. There are certainly some, but I'd be surprised if it was more than a %age of the accesses. Are there other methodological concerns? I'd also point out for this discussion two studies that looked at how accurately one can geo-map clients based on the IP address of their chosen DNS resolver. There are obviously potential pitfalls here (e.g., someone who travels and still uses their "home" resolver). In 2002: Z. M. Mao, C. D. Cranor, F. Douglis, and M. Rabinovich. A Precise and Efficient Evaluation of the Proximity between Web Clients and their Local DNS Servers. In Proc. USENIX Annual Technical Conference, Berkeley, CA, June 2002. Bottom line: It's ok but not great. "We con- clude that DNS is good for very coarse-grained server selection, since 64% of the associations belong to the same Autonomous System. DNS is less useful for finer- grained server selection, since only 16% of the client and local DNS associations are in the same network-aware cluster [13] (based on BGP routing information from a wide set of routers)" We did a wardriving study in Pittsburgh recently where we found that, of the access points we could connect to, 99% of them used their ISP's provided DNS server. Pretty good if your target is residential users: http://www.cs.cmu.edu/~dga/papers/han-imc2008-abstract.html (it's a small part of the paper in section 4.3). -Dave