"Perry E. Metzger" writes:
> I admit that the problem at NSI is larger by three orders of magnitude, but essentially the same sort of scripts could be run. If such scripts were in place at NSI, such failures, which have occurred multiple times, would never have happened.
> Humans CANNOT be trusted with this sort of thing. Humans are fallible. You can't have humans involved in this sort of release process.
And who wrote the QA scripts you describe? Complex systems have complex failure modes. Yes, there are clearly steps that can be taken to minimize the problems, but anybody who claims that building "robustness" into complex systems is anything other than "hard" should spend some time reading the RISKS archives (http://www.CSL.sri.com/risksinfo.html).