On Sat, 24 Nov 2001, Neil J. McRae wrote:
I'd be surprised if it was the GSR, and in any case that doesn't absolve anyone. If it was a software issue, why wasn't the software properly tested? Why was such a critical upgrade rolled out across the entire network at the same time? It doesn't add up.
It appears to be yet another CEF bug. If you want to use a GSR, you are stuck using some version of IOS with a CEF bug. The question is which bug you want. Each version of IOS has a slightly different set. Several US network providers have also been bitten by CEF bugs.

While trying to fix one set of bugs, BT upgraded their network. I'm not sure if they were upgrading at 9am, or had upgraded earlier and the bug finally came out under load at 9am. When the BT network melted down, Cisco suggested installing a different version of IOS, which had previously been tested. At noon, BT found the new version had an even worse bug, sending packets out the wrong interface. It was not until 2200 (13 hours later) that BT and Cisco found a version of IOS which stabilized the network. "Stabilized," not fixed. The running version of IOS still has a bug, but it isn't as severe.