dns cache beyond ttl - viasat / exede
Hello,

I am moving a number of web sites from one colo to another, re-numbering them in the process, and I have run into an interesting issue I'd like to solicit feedback on.

My DNS TTLs are all 300 seconds, and I have noticed that once I update the A records with the new addresses, most (but not all) web clients begin using the new address within 5 minutes or so. However, there is a persistent set of stragglers who continue accessing the site(s) on their old addresses for far in excess of this, up to a week in fact. And, as I have noted, all of these clients have something in common: they all appear to be satellite users of viasat/exede. This is based on whois lookups of the IP addresses of the clients. Note, I am NOT expecting 'turn on a dime', just looking for clients to refresh within a reasonable time.

I am wondering if perhaps this is due to some kind of (known?) bug in the embedded DNS cache/client in the client satellite modem, or if there is another plausible explanation I am not seeing. It compounds my problem slightly, since I have to keep running the web sites at both the old and new addresses while these things time out, I guess, and it's just inconvenient.

Thanks.

Mike-
I've seen similar issues (years ago) where some ISPs didn't honour DNS TTLs, and would instead cache the results a LOT longer.
On 10/7/2019 10:08 AM, Mike wrote:
I am wondering if perhaps this is due to some kind of (known?) bug in the embedded dns cache/client in the client satellite modem, or if there is another plausible explanation I am not seeing. It compounds my problem slightly since I have to continue running the web sites at both the old and new addresses while these things time out I guess and it's just inconvenient.
From experience with Wildblue and a few other sat internet providers when I did wilderness ranch installs, I can tell you that those modems do lots of weird fuckery with packets.

* Intercepting DNS packets and doing caching like what you are describing
* Responding to the three-way handshake before the other end actually does (nmap -sT against a remote host ends up with every port being 'open' but the connection closing right away)
* Hijacking HTTP and HTTPS connections and sending them through a tunneling proxy or caching proxy
* Multiple layers of NAT

Due to the RTT being so high, the providers do everything in their power to make it seem like you aren't on as agonizingly slow a connection as you are.

--
Brielle Bruns
The Summit Open Source Development Group
http://www.sosdg.org / http://www.ahbl.org
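The connect-scan symptom in the list above can be reproduced with a few lines of Python. This is only a sketch: `nmap -sT` is a TCP connect scan, so the helper below simply attempts a full `connect()` per port. The function names and the default timeout are illustrative assumptions, not anything taken from the thread.

```python
import socket

def tcp_connect_ok(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a full TCP connect() to host:port completes."""
    try:
        # create_connection performs the three-way handshake; the
        # "with" block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False

def connect_scan(host: str, ports, timeout: float = 3.0) -> dict:
    """Map each port to whether the handshake appeared to succeed."""
    return {port: tcp_connect_ok(host, port, timeout) for port in ports}

# Hypothetical usage (host name is a placeholder):
#   connect_scan("host.example.net", [22, 80, 443, 8080])
```

On an ordinary path, closed ports come back `False`. Behind a modem that answers the handshake locally, every port would be expected to come back `True`, because the SYN-ACK comes from the modem rather than from the remote host.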
On Mon, Oct 7, 2019 at 10:13 AM Brielle <bruns@2mbit.com> wrote:
* Responding to three way handshake before the other end actually does (nmap -sT remote host ends up with every port being 'open' but closing connection right away)
This is the action of the TCP accelerator. TCP has a "long fat pipe" problem where high-delay links (like a trip to geostationary orbit and back) demolish throughput. To combat this, satellite protocols translate TCP flows to a non-TCP or modified TCP protocol for transmission through the satellite and then back to TCP in the modem.

Regards,
Bill Herrin

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
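The "long fat pipe" problem comes down to simple arithmetic: a TCP sender can have at most one window of unacknowledged data in flight per round trip, so throughput is bounded by window size divided by RTT. The sketch below works that out. The 600 ms round-trip time and the 25 Mbit/s link rate are assumed, illustrative figures, not numbers from the thread.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds

def window_needed_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the link."""
    return link_bps * rtt_seconds / 8

GEO_RTT = 0.6        # assumed geostationary round-trip time, in seconds
LINK_BPS = 25e6      # hypothetical downlink rate, in bits per second

# Classic 64 KiB window (no window scaling) over the satellite hop:
limit = max_tcp_throughput_bps(65535, GEO_RTT)
print(f"window-limited throughput: {limit / 1e6:.2f} Mbit/s")   # about 0.87

# Window required to keep the assumed link full:
need = window_needed_bytes(LINK_BPS, GEO_RTT)
print(f"window needed: {need / 1e6:.1f} MB")                     # about 1.9
```

Under these assumptions an unscaled window caps a single flow at under 1 Mbit/s regardless of link speed, which is the kind of gap that terminating TCP locally at each end of the satellite hop is meant to close.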
On 10/7/2019 12:15 PM, William Herrin wrote:
On Mon, Oct 7, 2019 at 10:13 AM Brielle <bruns@2mbit.com> wrote:
* Responding to three way handshake before the other end actually does (nmap -sT remote host ends up with every port being 'open' but closing connection right away)
This is the action of the TCP accelerator. TCP has a "long fat pipe" problem where high-delay links (like a trip to geostationary orbit and back) demolish throughput. To combat this, satellite protocols translate TCP flows to a non-TCP or modified TCP protocol for transmission through the satellite and then back to TCP in the modem.
Yeah, it's just one of those things that really messes with you when you are trying to diagnose obscure error messages and application behavior.

Usually I'd be in a place 50+ miles from the nearest town with cell service, only accessible via jetboat... stuck on an Iridium sat phone at $5/min with the sat company, and their utterly useless first-level support that refuses to actually get a network engineer involved...

I'm normally a patient, tolerant woman with tech support, but the sat internet providers push you into a red zone so quickly with their support...

--
Brielle Bruns
The Summit Open Source Development Group
http://www.sosdg.org / http://www.ahbl.org
On Mon, Oct 7, 2019 at 11:56 AM Brielle <bruns@2mbit.com> wrote:
On 10/7/2019 12:15 PM, William Herrin wrote:
This is the action of the TCP accelerator. TCP has a "long fat pipe" problem where high-delay links (like a trip to geostationary orbit and back) demolish throughput. To combat this, satellite protocols translate TCP flows to a non-TCP or modified TCP protocol for transmission through the satellite and then back to TCP in the modem.
Yeah, it's just one of those things that really messes with you when you are trying to diagnose obscure error messages and application behavior.
Usually I'd be in a place 50+ miles from nearest town with cell service, only accessible via jetboat... stuck on Iridium sat phone at $5/min with the Sat company, their utterly useless first level support that refuses to actually get a network engineer involved...
I'm a patient, tolerant woman normally with tech support, but the sat internet providers push you into a red zone so quickly with their support...
You don't happen to have some documented examples of this, do you? I could use examples of stuff that broke and was hard to diagnose and fix due to unexpected proxying behavior, for an argument I'm having elsewhere.

Regards,
Bill Herrin

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
On 10/7/2019 3:23 PM, William Herrin wrote:
You don't happen to have some documented examples of this do you? I could use examples of stuff that broke and was hard to diagnose and fix due to unexpected proxying behavior for an argument I'm having elsewhere.
I'll see what I can dig up from my notes. It's been around 7-8 years since I've been directly involved.

--
Brielle Bruns
The Summit Open Source Development Group
http://www.sosdg.org / http://www.ahbl.org
Hi Mike,

The UT (user terminal) uses a combination of caching, prefetching, and spoofing to accelerate web traffic for users. On the terrestrial side, there is a cluster of accelerators that also takes part in that process.

What is the "lag" time that you have observed? Also, do you know if your clients are on the Viasat-1 or Viasat-2 satellite? The infrastructure behind the two satellites differs significantly.

I used to work for Viasat and have forwarded your mail to a few of my former colleagues.

Thanks,

Sabri
On Mon, Oct 7, 2019 at 9:08 AM Mike <mike-nanog@tiedyenetworks.com> wrote:
My DNS TTLs are all 300 seconds, and I have noticed that once I update the A records with the new addresses, most (but not all) web clients begin using the new address within 5 minutes or so. However, there is a persistent set of stragglers who continue accessing the site(s) on their old addresses for far in excess of this, up to a week in fact. And, as I have noted, all of these clients have something in common: they all appear to be satellite users of viasat/exede. This is based on whois lookups of the IP addresses of the clients. Note, I am NOT expecting 'turn on a dime', just looking for clients to refresh within a reasonable time.
Hi Mike,

You may be looking at a web browser "feature" called "DNS pinning." This is used to defeat the "DNS rebinding" attack on JavaScript, which would allow a web site to instruct a browser to scan the interior behind its user's firewall by having an attacker rotate the IP addresses used for JavaScript's allowed server name.

Depending on the implementation, DNS-pinned browsers may not recognize a change to your IP address until the browser is stopped and restarted.

Regards,
Bill Herrin

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
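To make the pinning behavior concrete, here is a minimal sketch of a client that pins the first address it resolves for a host and ignores later DNS changes until it is restarted. It is not the code of any actual browser; the class name, the `restart` method, and the example addresses (documentation ranges) are all hypothetical.

```python
from typing import Callable, Dict

class PinnedResolver:
    """Pins the first address seen for each host until restart()."""

    def __init__(self, lookup: Callable[[str], str]) -> None:
        self._lookup = lookup            # stands in for a real DNS query
        self._pins: Dict[str, str] = {}

    def resolve(self, host: str) -> str:
        # The TTL is never consulted: once pinned, the answer sticks.
        if host not in self._pins:
            self._pins[host] = self._lookup(host)
        return self._pins[host]

    def restart(self) -> None:
        """Model stopping and restarting the browser."""
        self._pins.clear()

# Stand-in for the authoritative zone data.
records = {"www.example.com": "192.0.2.10"}
browser = PinnedResolver(lambda host: records[host])

print(browser.resolve("www.example.com"))    # old address
records["www.example.com"] = "198.51.100.10" # site is renumbered
print(browser.resolve("www.example.com"))    # still the old address
browser.restart()
print(browser.resolve("www.example.com"))    # new address
```

An implementation that pins only for the lifetime of a page would call the equivalent of `restart()` per tab rather than per process, which is the distinction raised in the replies that follow.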
William Herrin <bill@herrin.us> wrote:
You may be looking at a web browser "feature" called "DNS pinning." This is used to defeat the "DNS rebinding" attack on javascript that would allow a web site to instruct a browser to scan the interior behind its user's firewall by having an attacker rotate the IP addresses used for Javascript's allowed server name.
Depending on the implementation, DNS pinned browsers may not recognize a change to your IP address until the browser is stopped and restarted.
I thought DNS pinning was only for the lifetime of the web page, so closing the tab (or all tabs open on the site...) should be enough, if a reload isn't.

Tony.
--
f.anthony.n.finch <dot@dotat.at> http://dotat.at/
democracy, participation, and the co-operative principle
On Tue, Oct 8, 2019 at 4:22 AM Tony Finch <dot@dotat.at> wrote:
William Herrin <bill@herrin.us> wrote:
Depending on the implementation, DNS pinned browsers may not recognize a change to your IP address until the browser is stopped and restarted.
I thought DNS pinning was only for the lifetime of the web page, so closing the tab (or all tabs open on the site...) should be enough, if a reload isn't.
It depends on the implementation. There are a bunch of things the browser can do to be smart about it. Which leaves behind a few stragglers that weren't smart about it.

-Bill

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
On 10/7/19 9:08 AM, Mike wrote:
I am wondering if perhaps this is due to some kind of (known?) bug in the embedded dns cache/client in the client satellite modem, or if there is another plausible explanation I am not seeing. It compounds my problem slightly since I have to continue running the web sites at both the old and new addresses while these things time out I guess and it's just inconvenient.
Back when I was the mail/DNS/network admin at a hosting company, and we would have to renumber, I saw the same thing. This was back in the days when the cable companies had small pipes to the Internet. Their DNS servers would impose a minimum 1 week TTL on all DNS requests, so that the vast majority would be resolved "locally" without having to resort to the root servers.

Other answers point to satellite companies doing something similar, to combat the long round-trip delay (RTD) that DNS resolution would require without aggressive caching.

Almost all of the Web servers I managed used Linux, so I was able to play games in the firewall to let both numbers get to the Web servers without having a convoluted configuration in Apache. The three Windows/IIS hosts were not that difficult to do, but it was tiresome.

Those games stopped when the hosting company got its own /21 allocation.
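The minimum-TTL behavior described above can be sketched as a caching resolver that clamps every record's TTL to a floor before storing it. This is an illustration only: the class name, the one-week floor as a default, and the injected `lookup` callable are assumptions, and the clock is passed in explicitly so the example is deterministic.

```python
from typing import Callable, Dict, Tuple

WEEK = 7 * 24 * 3600  # seconds

class ClampingCache:
    """Caching resolver that never honours a TTL below min_ttl."""

    def __init__(self, lookup: Callable[[str], Tuple[str, int]],
                 min_ttl: int = WEEK) -> None:
        self._lookup = lookup    # returns (address, ttl_seconds)
        self._min_ttl = min_ttl
        self._cache: Dict[str, Tuple[str, float]] = {}  # name -> (addr, expiry)

    def resolve(self, name: str, now: float) -> str:
        entry = self._cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]                      # served from cache
        address, ttl = self._lookup(name)
        # The published TTL is overridden whenever it is below the floor.
        self._cache[name] = (address, now + max(ttl, self._min_ttl))
        return address

# Stand-in zone data: an A record published with a 300 second TTL.
zone = {"www.example.com": ("192.0.2.10", 300)}
isp = ClampingCache(lambda name: zone[name])

print(isp.resolve("www.example.com", now=0))        # old address, cached
zone["www.example.com"] = ("198.51.100.10", 300)    # site is renumbered
print(isp.resolve("www.example.com", now=301))      # still the old address
print(isp.resolve("www.example.com", now=WEEK))     # new address at last
```

With a floor like this in the path, a 300 second TTL on the authoritative side has no effect on how long such clients keep the old address, which matches the week-long stragglers reported at the start of the thread.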
participants (7)
- Andrew Kerr
- Brielle
- Mike
- Sabri Berisha
- Stephen Satchell
- Tony Finch
- William Herrin