Steve Gibbard wrote:
So from my uninformed vantage point, it looks like they started doing this more or less right -- two servers or clusters of servers in two different facilities, a few thousand miles apart on different power grids and not subject to the same natural disasters. In other words, they did the hard part. What they didn't do is put them in different BGP routes, which for a network with as much IP space as Qwest has would seem fairly easy.
I didn't get to play detective at the time of the outage, but a configuration error (which is automatically replicated) may also have been enough to take out both nameservers.
It also makes good management sense to run your nameservers with the same software and versions, but perhaps it doesn't make good continuity sense?
That may not necessarily be true. Vendor diversity is not a bad idea. It's expensive support-wise, but you could run different hardware and BIND at the two locations. This is perfectly acceptable operationally, AFAIK. Security is another story; that depends largely on people these days, so YMMV. Anyhow, does anyone know what really happened? -M
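For anyone wanting to sanity-check their own setup against the point Steve raises, here is a minimal Python sketch that tests whether a list of nameserver addresses all fall inside one announced aggregate. The hostnames, addresses, and prefix are made-up placeholders, and prefix containment is only a rough proxy for "same BGP route"; a real check would go against a looking glass or a routing table dump.

    #!/usr/bin/env python3
    """Rough check of route diversity for a set of authoritative nameservers.

    Illustrative sketch only: the addresses and prefix below are placeholders,
    and containment in one configured aggregate is a crude stand-in for
    "announced in the same BGP route".
    """
    import ipaddress

    # Hypothetical nameserver addresses (placeholders, not any provider's real servers).
    NAMESERVERS = {
        "ns1.example.net": "192.0.2.53",
        "ns2.example.net": "192.0.2.117",
    }

    # Hypothetical aggregate prefix announced from a single routing point.
    ANNOUNCED_PREFIX = ipaddress.ip_network("192.0.2.0/24")

    def servers_in_prefix(servers, prefix):
        """Return the nameservers whose addresses fall inside the given announcement."""
        return [name for name, addr in servers.items()
                if ipaddress.ip_address(addr) in prefix]

    if __name__ == "__main__":
        covered = servers_in_prefix(NAMESERVERS, ANNOUNCED_PREFIX)
        if len(covered) == len(NAMESERVERS):
            # Geographic diversity doesn't help if one withdrawn or leaked route
            # takes every listed nameserver off the net at once.
            print("All nameservers sit inside", ANNOUNCED_PREFIX, "- no route diversity")
        else:
            print("Nameservers span more than one announcement")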