Problems with NS*.worldnic.com
I saw some mention of this in a previous thread. Is anyone else still experiencing problems? We're seeing general slowness and the use of the truncate bit in responses, forcing to TCP mode.
We have a few servers with Interland / Miami. Today, for around 1 hour 15 minutes, DNS / TCP traffic was timing out. Httpd was very slow for domains with backup DNS servers in Europe, but other domains with DNS only within Interland were not resolving at all. I only noticed that traffic was not going through from here, Saudi Arabia; it appeared to be resolving okay from the United States.

I do not know if this is related to the worldnic DNS problem, but I think Interland is outsourcing DNS to Verisign.

aljuhani@riyadmail.com

----- Original Message -----
From: "Greg Schwimer" <gschwimer@godaddy.com>
To: <nanog@nanog.org>
Sent: Monday, April 25, 2005 21:34
Subject: Problems with NS*.worldnic.com
I saw some mention of this in a previous thread. Is anyone else still experiencing problems? We're seeing general slowness and the use of the truncate bit in responses, forcing to TCP mode.
something *very* strange is going on. the worldnic servers have been giving delayed or no results for days now. and nsi is hoping we and the wsj/nyt won't notice.

i don't think this

    roam.psg.com:/usr/home/randy> doc -p -w worldnic.net
    Doc-2.1.4: doc -p -w worldnic.net
    Doc-2.1.4: Starting test of worldnic.net. parent is net.
    Doc-2.1.4: Test date - Mon Apr 25 14:20:45 HST 2005
    ;; res_nsend: Protocol not supported
    DIGERR (UNKNOWN): dig @a.gtld-servers.net. for NS of worldnic.net. failed
    ;; res_nsend: Protocol not supported
    DIGERR (UNKNOWN): dig @b.gtld-servers.net. for NS of worldnic.net. failed

is the worldnic problem, but could be. but it is a problem. (i generally ignore b root issues).

but it's probably time for us all to dump symptoms here and figure it out as a community, as the dog with the bone ain't 'fessing up.

randy
On Mon, 25 Apr 2005, Randy Bush wrote:
i don't think this
    roam.psg.com:/usr/home/randy> doc -p -w worldnic.net
    Doc-2.1.4: doc -p -w worldnic.net
    Doc-2.1.4: Starting test of worldnic.net. parent is net.
    Doc-2.1.4: Test date - Mon Apr 25 14:20:45 HST 2005
    ;; res_nsend: Protocol not supported
    DIGERR (UNKNOWN): dig @a.gtld-servers.net. for NS of worldnic.net. failed
    ;; res_nsend: Protocol not supported
    DIGERR (UNKNOWN): dig @b.gtld-servers.net. for NS of worldnic.net. failed
is the worldnic problem, but could be. but it is a problem.
a.gtld-servers.net and b.gtld-servers.net have AAAA records. Some applications and stacks try the v6 address first if it's available and will appear to hang if you don't have v6 connectivity. That may very well be what's happening here.

Matt
--
Matt Larson <mlarson@verisign.com>
VeriSign Naming and Directory Services
Matt Larson wrote:
a.gtld-servers.net and b.gtld-servers.net have AAAA records. Some applications and stacks try the v6 address first if it's available and will appear to hang if you don't have v6 connectivity. That may very well be what's happening here.
Are the AAAA records for a & b.gtld-servers.net new?
Randy, and others with this issue...

On 4/25/05 5:24 PM, "Randy Bush" <randy@psg.com> wrote:
something *very* strange is going on. the worldnic servers have been giving delayed or no results for days now. and nsi is hoping we and the wsj/nyt won't notice.
i don't think this
    roam.psg.com:/usr/home/randy> doc -p -w worldnic.net
    Doc-2.1.4: doc -p -w worldnic.net
    Doc-2.1.4: Starting test of worldnic.net. parent is net.
    Doc-2.1.4: Test date - Mon Apr 25 14:20:45 HST 2005
    ;; res_nsend: Protocol not supported
    DIGERR (UNKNOWN): dig @a.gtld-servers.net. for NS of worldnic.net. failed
    ;; res_nsend: Protocol not supported
    DIGERR (UNKNOWN): dig @b.gtld-servers.net. for NS of worldnic.net. failed
is the worldnic problem, but could be. but it is a problem. (i generally ignore b root issues).
but it's probably time for us all to dump symptoms here and figure it out as a community, as the dog with the bone ain't 'fessing up.
I spent some time two months ago chasing this down with the same two gtld-servers.net records, on my mac.

The culprit is dig.

On any system that is both ipv4 and ipv6 enabled, *even if there is no ipv6 connectivity*, dig - which has no awareness of ipv6 v. ipv4 - attempts to connect using the ipv6 address if both an ipv6 and an ipv4 address are provided. When it fails to connect, it fails. It does not do the "better" thing, which is to try another address for the same hostname, in this case the ipv4 address.

If you attempt a dig for the ipv4 address of a.gtld-servers.net or b.gtld-servers.net, you will get your answer.

I am not sure whether the correct solution is to "fix" dig so that it tries ipv4, or to get the os "fixed" on a dual stack capable system so that if there is no ipv6 connectivity it disables that part of the system. I suspect the first is appropriate, because there are obviously internal processes that may validly want to use ipv6 even though there is no ipv6 connection.

I also suspect that the same thing would occur for an A record that had multiple ipv4 addresses in a round robin configuration where the "first" ip address was unreachable: the behavior would be the same as if it was an ipv6 address being tried on a system that had no ipv6 connectivity. But I have not taken the time to test.

I did ask the maintainers of dig to look at this, and perhaps provide a solution.

Rodney Joffe
CenterGate Research Group, LLC
http://www.centergate.com
"Technology so advanced, even WE don't understand it"(R)
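The "better" behavior described above, trying the next address for the same hostname when one fails, can be sketched roughly as follows. This is only an illustration of the idea, not how dig or any resolver library is actually implemented; the names `first_reachable` and `tcp_connect` are hypothetical.

```python
import socket


def first_reachable(candidates, try_connect):
    """Return (candidate, connection) for the first candidate that connects.

    candidates  -- addresses in the order the resolver returned them
    try_connect -- callable returning a connection, or raising OSError
    """
    errors = []
    for candidate in candidates:
        try:
            return candidate, try_connect(candidate)
        except OSError as exc:
            # Remember the failure and move on to the next address
            # instead of giving up on the whole hostname.
            errors.append((candidate, exc))
    raise OSError(f"all {len(errors)} addresses failed: {errors}")


def tcp_connect(host, port, timeout=3.0):
    """Resolve host and connect to the first address that answers."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)

    def try_connect(info):
        family, socktype, proto, _canonname, sockaddr = info
        # Creating the socket can itself fail with OSError when the
        # address family is not supported on this machine.
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
        except OSError:
            sock.close()
            raise
        return sock

    return first_reachable(infos, try_connect)
```

With this shape, an unreachable AAAA address costs one failed attempt (or one timeout) before the A record is tried. The same loop also covers the round robin case of several ipv4 addresses where the first is unreachable.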
On 4/26/05, Rodney Joffe <rjoffe@centergate.com> wrote:
The culprit is dig.
I am not sure whether the correct solution is to "fix" dig so that it tries ipv4, or to get the os "fixed" on a dual stack capable system so that if there is no ipv6 connectivity it disables that part of the system.
I'd say fix the resolver so that it does not try to resolve over v6 where no v6 connectivity exists.

-srs
On Mon, 25 Apr 2005 21:34:54 PDT, Rodney Joffe said:
I am not sure whether the correct solution is to "fix" dig so that it tries ipv4, or to get the os "fixed" on a dual stack capable system so that if there is no ipv6 connectivity it disables that part of the system. I suspect the first is appropriate, because there are obviously internal processes that may validly want to use ipv6 even though there is no ipv6 connection.
The problem is that you *could* have local/campuswide ipv6 connectivity, but not have an IPv6 connection to the outside world. So my system comes up, it sees a Router Advertisement, it can get to other IPv6 systems that are 3-4 hops away. So how is it supposed to "know" that it doesn't have an ipv6 connection? Presumably the same way it "knows" it doesn't have an ipv4 connection when your OC-moby to the outside world falls over, but it's still perfectly able to talk to the entire rest of your corporate network....
So how is it supposed to "know" that it doesn't have an ipv6 connection?
in my case, because
  o no interfaces have v6 addresses
  o v6 stack is not present
  o ...

it should also not use smoke signals, analog voice phone, ...

the chances of a box having a v6 connection to *anything* today is low, and should not be a reason to *break* v4 services.

randy
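For the cases listed above (no v6 stack, no interface with a v6 address), a program can make a cheap local check before preferring v6. The sketch below is an assumption-laden illustration: the function name `ipv6_looks_usable` is hypothetical, and the probe address is taken from the 2001:db8::/32 documentation prefix. It only asks the kernel whether it has a route and source address for a global v6 destination. It does not detect the harder case raised earlier in the thread, where v6 works locally but not to the outside world.

```python
import socket


def ipv6_looks_usable(probe=("2001:db8::1", 53)):
    """Best-effort local check: does the kernel have any route for global IPv6?

    Returns False when Python was built without IPv6, when the stack is
    absent, or when no route/source address exists for the probe address.
    """
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
            # connect() on a UDP socket sends no packets; it only makes
            # the kernel select a route and a source address.
            sock.connect(probe)
            return True
    except OSError:
        # Typical errors here: address family not supported,
        # or network unreachable.
        return False
```

A caller could use the result to decide whether AAAA addresses are worth trying at all, and still fall back to v4 if a v6 attempt fails.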
On Tue, 26 Apr 2005 Valdis.Kletnieks@vt.edu wrote:
On Mon, 25 Apr 2005 21:34:54 PDT, Rodney Joffe said:
I am not sure whether the correct solution is to "fix" dig so that it tries ipv4, or to get the os "fixed" on a dual stack capable system so that if there is no ipv6 connectivity it disables that part of the system. I suspect the first is appropriate, because there are obviously internal processes that may validly want to use ipv6 even though there is no ipv6 connection.
The problem is that you *could* have local/campuswide ipv6 connectivity, but not have an IPv6 connection to the outside world. So my system comes up, it sees a Router Advertisement, it can get to other IPv6 systems that are 3-4 hops away.
So how is it supposed to "know" that it doesn't have an ipv6 connection?
Presumably the same way it "knows" it doesn't have an ipv4 connection when your OC-moby to the outside world falls over, but it's still perfectly able to talk to the entire rest of your corporate network....
Perhaps a solution is to specifically enable ipv6 dns resolution as preferable to ipv4 or the other way around. This could perhaps be a switch in resolv.conf or nsswitch.conf. Something like:

/etc/nsswitch.conf
    ...
    hosts: files dns
    dns-resolver: AAAA [NOTFOUND=return] A6 A

Note: in this meaning NOTFOUND is only true when NXDOMAIN but not for NODATA

OR

/etc/resolv.conf
    search example.com
    protocol ipv6 ipv4

--
William Leibzon
Elan Networks
william@elan.net
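The effect of the proposed "protocol" preference can be approximated in application code by reordering resolver results before connecting. This is a sketch under stated assumptions: `protocol ipv6 ipv4` is the hypothetical directive proposed above, not an existing resolv.conf option, and `order_by_protocol` is an illustrative name.

```python
import socket

# Names as they would appear in the hypothetical "protocol" directive.
FAMILY_BY_NAME = {"ipv4": socket.AF_INET, "ipv6": socket.AF_INET6}


def order_by_protocol(addrinfos, preference=("ipv4", "ipv6")):
    """Reorder getaddrinfo()-style tuples by address family preference.

    Families not named in `preference` are kept, but sorted last.
    sorted() is stable, so resolver order is preserved within a family.
    """
    rank = {FAMILY_BY_NAME[name]: i for i, name in enumerate(preference)}
    return sorted(addrinfos, key=lambda info: rank.get(info[0], len(rank)))
```

For example, `order_by_protocol(socket.getaddrinfo(host, 53), ("ipv4", "ipv6"))` would make a client try A records before AAAA records.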
On Mon, 25 Apr 2005 22:19:51 PDT, "william(at)elan.net" said:
Perhaps a solution is to specifically enable ipv6 dns resolution as preferable to ipv4 or the other way around. This could perhaps be switch in resolv.conf or nsswitch.conf. Something like:
/etc/resolv.conf
    search example.com
    protocol ipv6 ipv4
At least on my system, there's an 'options inet6' line that makes it look for AAAA records, and maps ipv4 into ipv6 addresses if only an A record is found.

Also note that it doesn't fix the problem that's being seen - I might be able to contact the nameservers listed in resolv.conf via both IPv4 and IPv6 - the fun starts when my nameserver gets an NS entry that contains an AAAA record, and the nameserver has enough IPv6 connectivity to think it's worth a try, but you can't get there from here...
At 21:34 -0700 4/25/05, Rodney Joffe wrote:
The culprit is dig.
Ahh, dig. What version? You have to be running the latest at all times these days...so many changes...

In my experience with v6, the problems I have come down to are:

1) Broken testing tools. (See change 1610 in the BIND CHANGES file for one.)
2) Broken route policy. (Dastardly ISP's!)
3) Broken OS API's. (Have we learned nothing since or from Berkeley Sockets?)

#1 - I've had to reevaluate everything I know about debugging since I met IPv6. Now there's an entirely alternate universe of failure to consider. One day I was sitting in RIPE NCC's offices and couldn't 'dig @ns.ripe.net'. So I walked to the ops room and asked, "umm, is your big machine down." After a good laugh, we figured that my Mac was trying v6 where v6 wasn't *really* live.

#2 - When I first got real live IPv6 service from a provider, I tried tracerouting to all the machines I knew about - the roots as listed on root-servers.org, the RIPE machines. I'd get about halfway there and fail. I asked for reverse traces from the other side and saw failures at about the same place. We had to work with ISPs to loosen route policies.

#3 - I have seen all sorts of mistakes involving OS's, OS API's, and app software API's. Mapped addresses are mishandled, having more than one address to try is something apps don't deal with. (Like they've been force fed one kind of food their entire life, and now have to choose from a menu.)

At NANOG last year I related my problems with ssh (choosing v6 over v4 - and me assigning the same domain name to two machines, one on a v4 net and one on a v6 net). Stupid me... The biggest problem was that one type of machine kept dropping its statically configured default v6 route. Packets would get in, but they didn't know where to go next. The machine logged all activity as good though...it didn't know.

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Edward Lewis +1-571-434-5468
NeuStar

If you knew what I was thinking, you'd understand what I was saying.
something *very* strange is going on. the worldnic servers have been giving delayed or no results for days now. and nsi is hoping we and the wsj/nyt won't notice.
I agree 100%.
but it's probably time for us all to dump symptoms here and figure it out as a community, as the dog with the bone ain't 'fessing up.
randy
I'll bite. I couldn't resolve ns*.worldnic.com domains until I finally bit the bullet, and unblocked port 53 TCP from my DNS server. Then it worked fine. (after a few tries) I'm using BIND 9.2.4 without the eye pee vee six stuff compiled in.

Because I don't want to start something; No discussion about me blocking port 53, ok? I got tired of gobs of log files of script kiddies trying to download my domains 5 years ago... I actually READ my logs.... besides, I had to keep the linux boxes safe from the tyranny of bind 8 until they got upgraded. :-)

-Jerry
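The symptom reported at the top of the thread, responses with the truncate bit set that force a retry over TCP, explains why blocking 53/TCP breaks resolution. A resolver decides to retry by looking at the TC flag in the 12-byte DNS header (RFC 1035). The following is a minimal sketch of that check; the function name `is_truncated` is illustrative and not taken from BIND.

```python
import struct

# TC (truncation) is bit 9 of the 16-bit flags word in the DNS header.
TC_FLAG = 0x0200


def is_truncated(message: bytes) -> bool:
    """Return True if a raw DNS message has the TC (truncated) bit set."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    # The header starts with a 16-bit ID followed by the 16-bit flags word,
    # both in network byte order.
    _message_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_FLAG)
```

When this returns True for a UDP reply, the resolver has to repeat the query over TCP. If TCP port 53 is filtered anywhere on the path, that retry fails and the name appears not to resolve.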
participants (12)
- aljuhani
- Edward Lewis
- Greg Schwimer
- Janet Sullivan
- Jerry Pasker
- Kevin Loch
- Matt Larson
- Randy Bush
- Rodney Joffe
- Suresh Ramasubramanian
- Valdis.Kletnieks@vt.edu
- william(at)elan.net