DNS queries for . IN A return rcode 2 SERVFAIL from windows DNS recursing resolvers
Hey all,

This must be old news for everyone else. While looking at a dns monitor on a load balancer that defaulted to . A queries to check liveliness on DNS resolvers, it became quite clear that windows 2000/2003 DNS server appears to return rcode=2 for queries looking for an A record for the root. The resolvers appear to work properly in all other regards.

So the monitors were switched to localhost. A

(Is this a bad idea?)

A little testing later and the results for . A are:

Windows NT 4, ancount=0, authority=1, rcode=0
Windows 2000, rcode=2
Windows 2003, rcode=2
bind, ancount=0, authority=1, rcode=0

To my (inexpert) eyes that doesnt seem quite right.

I cant seem to find any online information regarding this difference of behavior.

Enlightenment appreciated.

Joe

Here is the output.

fpdns -c -s 64.95.32.34 && dig @64.95.32.34 . a
64.95.32.34 Microsoft Windows DNS NT4

; <<>> DiG 9.6.1-P2 <<>> @64.95.32.34 . a
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35180
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;. IN A

;; AUTHORITY SECTION:
. 86400 IN SOA A.ROOT-SERVERS.NET. NSTLD.VERISIGN-GRS.COM. 2010010500 1800 900 604800 86400

;; Query time: 114 msec
;; SERVER: 64.95.32.34#53(64.95.32.34)
;; WHEN: Tue Jan 5 07:40:33 2010
;; MSG SIZE rcvd: 92

fpdns -c -s 216.222.144.16 && dig @216.222.144.16 . a
216.222.144.16 ISC BIND 9.2.3rc1 -- 9.6.1-P1 [recursion enabled] id: "9.5.1-P2"

; <<>> DiG 9.6.1-P2 <<>> @216.222.144.16 . a
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49220
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;. IN A

;; AUTHORITY SECTION:
. 2314 IN SOA A.ROOT-SERVERS.NET. NSTLD.VERISIGN-GRS.COM. 2010010500 1800 900 604800 86400

;; Query time: 38 msec
;; SERVER: 216.222.144.16#53(216.222.144.16)
;; WHEN: Tue Jan 5 07:42:08 2010
;; MSG SIZE rcvd: 92

fpdns -c -s joe.jmaimon.com && dig @joe.jmaimon.com . a
216.222.150.100 ISC BIND 9.2.3rc1 -- 9.6.1-P1 [recursion enabled] id: "9.5.0-P2-W2"

; <<>> DiG 9.6.1-P2 <<>> @joe.jmaimon.com . a
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39125
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;. IN A

;; AUTHORITY SECTION:
. 10800 IN SOA A.ROOT-SERVERS.NET. NSTLD.VERISIGN-GRS.COM. 2010010500 1800 900 604800 86400

;; Query time: 40 msec
;; SERVER: 216.222.150.100#53(216.222.150.100)
;; WHEN: Tue Jan 5 07:57:52 2010
;; MSG SIZE rcvd: 92

fpdns -c -s 64.95.32.130 && dig @64.95.32.130 . a
64.95.32.130 Microsoft Windows DNS 2000

; <<>> DiG 9.6.1-P2 <<>> @64.95.32.130 . a
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 30535
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;. IN A

;; Query time: 35 msec
;; SERVER: 64.95.32.130#53(64.95.32.130)
;; WHEN: Tue Jan 5 07:43:51 2010
;; MSG SIZE rcvd: 17

fpdns -c -s 72.26.241.205 && dig @72.26.241.205 . a
72.26.241.205 No match found

; <<>> DiG 9.6.1-P1 <<>> @72.26.241.205 . a
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 12807
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;. IN A

;; Query time: 0 msec
;; SERVER: 72.26.241.205#53(72.26.241.205)
;; WHEN: Tue Jan 5 08:13:06 2010
;; MSG SIZE rcvd: 17
Joe Maimon <jmaimon@ttec.com> writes:
Hey all,
This must be old news for everyone else. While looking at a dns monitor on a load balancer that defaulted to . A queries to check liveliness on DNS resolvers, it became quite clear that windows 2000/2003 DNS server appears to return rcode=2 for queries looking for an A record for the root. The resolvers appear to work properly in all other regards.
well, there is no A RR for the root domain. RCODE=2 is still an error, you should receive RCODE=0 ANCOUNT=0 for an unused RR type. but many resolvers get confused when the root domain is the QNAME, so let's assume that you're using one of those.
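To make the RCODE=0 ANCOUNT=0 versus RCODE=2 distinction concrete, here is a minimal Python sketch that builds a ". IN A" query in RFC 1035 wire format and classifies a reply by its header alone. The function names (build_query, parse_header, classify, probe) and the fixed query id are illustrative assumptions, not part of any tool mentioned in this thread.

```python
import socket
import struct

# Standard RCODE values from RFC 1035.
RCODE_NAMES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
               3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(qname: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a one-question DNS query with the RD bit set (qtype 1 = A)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    encoded = b""
    for label in qname.strip(".").split("."):
        if label:  # the root name "." has no labels at all
            raw = label.encode("ascii")
            encoded += bytes([len(raw)]) + raw
    encoded += b"\x00"  # terminating root label
    return header + encoded + struct.pack("!HH", qtype, 1)  # class IN


def parse_header(message: bytes) -> dict:
    """Decode the fixed 12-byte DNS header."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than its 12-byte header")
    query_id, flags, _qd, an, ns, _ar = struct.unpack("!HHHHHH", message[:12])
    return {"id": query_id, "qr": bool(flags & 0x8000),
            "rcode": flags & 0x000F, "ancount": an, "nscount": ns}


def classify(message: bytes) -> str:
    """Label a reply: NODATA means RCODE=0 with an empty answer section."""
    header = parse_header(message)
    if header["rcode"] == 0 and header["ancount"] == 0:
        return "NODATA"
    return RCODE_NAMES.get(header["rcode"], f"RCODE{header['rcode']}")


def probe(server: str, timeout: float = 2.0) -> str:
    """Send '. IN A' over UDP to a resolver and classify what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query("."), (server, 53))
        reply, _ = sock.recvfrom(512)
    return classify(reply)
```

The query for the root name is 17 bytes (12-byte header, one zero byte for the name, four bytes of type and class), which matches the "MSG SIZE rcvd: 17" of the SERVFAIL replies in the dig output: those responses carry only the header and the echoed question.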
So the monitors were switched to localhost. A
(Is this a bad idea?)
probably. there is no "localhost" in the root zone. this name is a TCP/IP stack convention, not a standard. for health monitoring purposes you should probably choose one of your own local names, since there's almost certainly no local intelligence in your resolver about them. that means to look up one of your own names the resolver probably has to iterate downward from the root zone to the top level and all the way down to your authority nameservers. (the problem here is, you may be testing more than you intend, and a failure in your own authority server or in the delegation path to it would look the same as an IP path failure or a resolver problem.)
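One way to keep that ambiguity visible in a monitor is to separate "the resolver did not respond" from "the resolver responded with an error". The Python sketch below is only an illustration of that idea; the ProbeResult type, the verdict function and its labels are hypothetical and not taken from any load balancer or monitoring product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Outcome of one health probe (hypothetical structure)."""
    answered: bool               # False means the probe timed out
    rcode: Optional[int] = None  # DNS RCODE when a reply arrived
    ancount: int = 0             # number of answer records in the reply


def verdict(result: ProbeResult) -> str:
    """Map a probe outcome to a coarse health label.

    Assumed policy: a timeout points at the IP path or the resolver itself,
    while SERVFAIL for one of your own names could equally be a problem in
    the authority server or the delegation path, so it is kept separate.
    """
    if not result.answered:
        return "down-or-unreachable"
    if result.rcode == 0 and result.ancount > 0:
        return "healthy"
    if result.rcode == 2:
        return "ambiguous"
    return "responding"
```

With this split, a monitor could pull a resolver out of rotation on "down-or-unreachable" but only raise an alert on "ambiguous", since the fault may lie upstream of the resolver being tested.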
A little testing later and the results for . A are:
Windows NT 4, ancount=0, authority=1, rcode=0
Windows 2000, rcode=2
Windows 2003, rcode=2
bind, ancount=0, authority=1, rcode=0
To my (inexpert) eyes that doesnt seem quite right.
probably resolver bugs, either in those TCP/IP stacks or in the "recursive nameserver" they are using. (is the same recursive nameserver used in all four tests?)
I cant seem to find any online information regarding this difference of behavior.
Enlightenment appreciated.
i suggest re-asking this over on dns-operations@lists.dns-oarc.net, since it's a bit deep in the DNS bits for a general purpose list like NANOG.
--
Paul Vixie
KI6YSY