On Feb 28, 2012, at 10:22 AM, William Herrin wrote:
On Tue, Feb 28, 2012 at 9:02 AM, Jared Mauch <jared@puck.nether.net> wrote:
On Feb 27, 2012, at 2:53 PM, Valdis.Kletnieks@vt.edu wrote:
On Mon, 27 Feb 2012 14:02:04 EST, William Herrin said:
The net result is that when you switch the IP address of your server, a percentage of your users (declining over time) will be unable to access it for hours, days, weeks or even years regardless of the DNS TTL setting.
Amen brother.
So just for grins, after seeing William's post I set up a listener on an address that had an NTP server on it many moons ago. As in, the machine was shut down around 2002/06/30 22:49 and we haven't re-assigned the IP address since *because* it kept getting hit with NTP packets. Yes, a decade ago.
In the first 15 minutes, 234 different IPs have tried to NTP to that address.
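For anyone wanting to repeat that kind of measurement, here is a minimal sketch, assuming Python and a UDP socket already bound to the retired address. The function name `count_sources` is hypothetical, and this is not the listener actually used above. Binding to the NTP port (123) normally requires elevated privileges.

```python
import socket
import time

def count_sources(sock, duration):
    """Return the set of distinct source IPs seen on sock within duration seconds."""
    seen = set()
    deadline = time.monotonic() + duration
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # Never block past the deadline.
        sock.settimeout(remaining)
        try:
            _data, peer = sock.recvfrom(512)
        except socket.timeout:
            continue
        # peer is (address, port, ...); only the address matters here.
        seen.add(peer[0])
    return seen

if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(("0.0.0.0", 123))  # assumption: run on the host that owns the old address
    print(len(count_sources(listener, 15 * 60)), "distinct sources in 15 minutes")
```

The sketch only counts senders; it never answers them, so stale clients keep retrying exactly as they would against a dead host.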
I hereby reject the principle that one cannot renumber a host/name and move it. I reject the idea that you can't move a service, or have one MX, DNS, etc. host be down and have it be fatal without something else being SERIOUSLY broken. If you are right, nobody could ever renumber anything ever, nor take a service down ever in the most absolute terms.
Something else IS seriously broken. Several something elses actually:
1. DNS TTL at the application boundary, due in part to...
DNS TTL shouldn't make it to the application boundary...
2. Pushing the name to layer 3 address mapping process up from layer 4 to layer 7 where each application has to (incorrectly) reinvent the process, and...
But they don't have to... They can simply use getaddrinfo()/getnameinfo() and let the OS libraries do it. The fact that some applications choose to use their own resolvers instead of system libraries is what is broken.
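A minimal sketch of what "let the OS libraries do it" can look like, assuming Python, whose `socket.getaddrinfo()` wraps the system call of the same name. The helper names `resolve_addresses` and `connect_by_name` are hypothetical. The point is that the application resolves at connect time and never stores an address of its own.

```python
import socket

def resolve_addresses(host, port):
    """Ask the OS resolver for every address of host, in the order it prefers."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]

def connect_by_name(host, port, timeout=5.0):
    """Resolve on every call and try each returned address in turn."""
    last_error = None
    for family, sockaddr in resolve_addresses(host, port):
        sock = socket.socket(family, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # This address failed; remember why and try the next one.
            last_error = exc
            sock.close()
    raise last_error or OSError("no addresses returned for %r" % host)
```

Python's standard `socket.create_connection()` follows roughly the same pattern. A long-running program still has to call something like this again on each reconnect; holding on to the first `sockaddr` for the life of the process recreates the stale-address problem described above.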
3. A layer 4 protocol which overloads the layer 3 address as an inseverable component of its transport identifier.
Even stuff like SMTP which took care to respect the DNS TTL in its own standards gets busted at the back end: too many antispam process components rely on the source IP address, crushing large scale servers that suddenly appear, transmitting large amounts of email from a fresh IP address.
I think this is orthogonal to DNS TTL issues.
Shockingly enough we have a strongly functional network despite this brokenness. But, it's broken all the same and renumbering is majorly impaired as a consequence.
In my experience, the biggest hurdle to renumbering has nothing to do with DNS, DNS TTLs, respect or failure to respect them, etc. In my experience the biggest renumbering challenges come from the number of configuration files which contain your IP addresses yet are not under your control.

	VPNs (the configuration at the far side of the VPN)
	Firewalls (vendors, clients, etc. that have put your IP addresses into exceptions)
	Router configurations (vendors, clients, etc. that have special routing policy to reach you)
	etc.

These are the things that make renumbering hard. The DNS stuff is usually fairly trivial to work around with a little time and planning.
Renumbering in light of these issues isn't impossible. An overlap period is required in which both old and new addresses are operable.
That's desirable even if you had a 5-second TTL and everyone honored it.
The duration of that overlap period is not defined by the protocol itself. Rather, it varies with the tolerable level of residual brokenness, literally how many nines of users should be operating on the new address before the old address can go away.
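One way to put a number on "how many nines" is sketched below, assuming Python and assuming you can count distinct clients hitting the old and new addresses over the same window (for example from server logs). The function names and the threshold are illustrative assumptions, not anything defined by a protocol.

```python
import math

def nines_on_new_address(new_clients, old_clients):
    """Nines of clients using the new address: -log10 of the residual fraction."""
    total = new_clients + old_clients
    if total == 0:
        raise ValueError("no clients observed")
    if old_clients == 0:
        return math.inf  # nobody left on the old address
    residual = old_clients / total
    return -math.log10(residual)

def ready_to_retire(new_clients, old_clients, target_nines):
    """True once the share of clients still on the old address is small enough."""
    return nines_on_new_address(new_clients, old_clients) >= target_nines
```

With 999 clients on the new address and 1 still on the old one, the residual fraction is 1 in 1000, which is three nines. Whether that is enough to turn the old address off is the judgment call described above.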
There is some truth to that. The combination of applications having their own (broken) resolver libraries and operating systems that provide even more broken resolvers (thanks, Redmond) has made this a bigger challenge than it should be. The ideal solution is to go back to using the OS resolver libraries and fix them. Best of luck actually achieving that.

Owen