RE: From Microsoft's site
At 6:30 p.m. Tuesday (PST), a Microsoft technician made a configuration change to the routers on the edge of Microsoft's Domain Name Server network. The DNS servers are used to connect domain names with numeric IP addresses (e.g. 207.46.230.219) of the various servers and networks that make up Microsoft's Web presence.
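(Purely as an illustration of the name-to-address mapping Microsoft describes above, not of their actual setup: a lookup like the one their DNS servers perform can be reproduced with a couple of lines of Python. The hostname below is just an assumed example.)

    # Illustrative sketch only: ask the resolver which IP address a name maps to.
    import socket

    name = "www.microsoft.com"           # assumed example hostname
    addr = socket.gethostbyname(name)    # returns something like 207.46.230.219
    print(name, "->", addr)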
Connect 22.5 hours worth of dots between these two events.
At approximately 5 p.m. Wednesday (PST), Microsoft removed the changes to the router configuration and immediately saw a massive improvement in the DNS network.
Their management should be real embarrassed to take so long to back out the last change.
On Wed, 24 Jan 2001, Roeland Meyer wrote:
> Connect 22.5 hours worth of dots between these two events.
> At approximately 5 p.m. Wednesday (PST), Microsoft removed the changes to the router configuration and immediately saw a massive improvement in the DNS network.
> Their management should be real embarrassed to take so long to back out the last change.
Somebody bitched a router config, and it took 22.5 hours to figure it out? That's the sort of goof you might expect from a mom-and-pop ISP with a hundred customers and virtually no IP clue. I'll be shocked if multiple people (at multiple levels) aren't fired over this. Screwing up happens. Taking this long to figure out what you screwed up (or even for others to figure out what someone else screwed up) is just absolutely unbelievable. Is the brain cell in their networking division on vacation this week?

----------------------------------------------------------------------
 Jon Lewis *jlewis@lewis.org*|  I route
 System Administrator        |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
On Thu, 25 Jan 2001 00:37:58 EST, jlewis@lewis.org said:
>> Their management should be real embarrassed to take so long to back out the last change.
> Somebody bitched a router config, and it took 22.5 hours to figure it out?
Umm.. let's think more carefully here.

A major *MAJOR* player is changing a config *during prime time*? Hell, we're not that big, and we get 3AM-7AM local. Anything else is emergency-only.

So we'll assume that the *real* timeline was:

5PM - something *else* melts
6:30PM - change a config to stop THAT emergency
6:45PM - notice you've scrogged it up
<next 19 hours> - try to decide which is worse: the DNS being screwed but your *local* operations back online using local private secondaries, or the DNS being OK but whatever was loose trashing the corporate backbone? Meanwhile, your OTHER set of network monkeys is busy fighting whatever fire melted stuff to start with...

<META MODE="so totally hypothetical we won't even GO there...">
They'd not be the first organization this week that had to make an emergency router config change because Ramen multicasting was melting their routers, or the first to not get it right on the first try. They'd merely be the ones thinking hardest about how to put the right spin on it...
</META>

I have *NO* evidence that Ramen was the actual cause, other than that it's this week's problem. However, I'm pretty sure that *whatever* happened, the poor router tech was *already* having a Very Bad Day before he ever GOT to the part where he changed the config.....

Valdis Kletnieks
Operating Systems Analyst
Virginia Tech
Valdis, et al;

Of course, we know (well :) that there have been multicast problems from the MSDP storms caused by the Ramen worm since the Saturday before last. If there have been wider network problems caused by these MSDP storms, I would like to hear of them, either on or off list. I would like to give a report on this in Atlanta.

For those MSDPers out there, we have had good luck with rate limits to limit the damage (a generic sketch of the idea follows below this message). I will be glad to share (off list?) the configs used.

Also, FWIW, there does not seem to have been an MSDP storm at 5:00 PM or so on Tuesday.

Regards
Marshall Eubanks
--
Multicast Technologies, Inc.
10301 Democracy Lane, Suite 201
Fairfax, Virginia 22030
Phone : 703-293-9624   Fax : 703-293-9609
e-mail : tme@on-the-i.com
http://www.on-the-i.com
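(The actual router configs referenced above stay off-list; what follows is only a generic, hypothetical Python sketch of the rate-limiting idea: cap how many MSDP SA announcements any one peer may inject per second and drop the excess so a storm cannot swamp the box. The 100-per-second threshold, names, and peer address are assumptions for illustration, not anything taken from real equipment.)

    # Hypothetical sketch only: a fixed-window, per-peer cap on MSDP SA messages.
    import time
    from collections import defaultdict

    SA_LIMIT_PER_SEC = 100   # assumed threshold, purely illustrative

    window_start = defaultdict(float)   # start of the current one-second window, per peer
    sa_count = defaultdict(int)         # SA messages seen from that peer in the window

    def accept_sa(peer, now=None):
        """Return True if this SA message from `peer` is under its per-second cap."""
        now = time.monotonic() if now is None else now
        if now - window_start[peer] >= 1.0:
            window_start[peer] = now     # open a new one-second window
            sa_count[peer] = 0
        sa_count[peer] += 1
        return sa_count[peer] <= SA_LIMIT_PER_SEC

    # Example: within one second, announcements past the 100th from a peer get dropped.
    dropped = sum(0 if accept_sa("192.0.2.1") else 1 for _ in range(150))
    print("dropped", dropped, "of 150")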
participants (4): jlewis@lewis.org, Marshall Eubanks, Roeland Meyer, Valdis.Kletnieks@vt.edu