In message <3.0b36.32.19961014140811.006de8d0@mail.cts.com>, "Kent W. England" writes:
At 09:06 PM 12-10-96 -0700, maillists wrote:
In hindsight, solutions to tolerate a given failure are very easy to come by but shouldn't be talked down. I did not blame BBN--shit happens--
One constructive thing I would note that I haven't seen yet is that while one can imagine many awful scenarios where your lovely NOC is absolutely leveled (*) by a horrendous accident, it is much more likely that your NOC will be forced to evacuate for some "trivial" reason. Perhaps asbestos. Perhaps a gas main leak in the street. Perhaps the fire alarms just keep going off. Perhaps the power went off and the fire marshal won't let anyone stay in the building without elevator power. It might very well have been the case that even if BBN had had a backup generator, they would still have been forced to evacuate the building, if the fire dept was paying attention.
You have to remember that it's harder to keep people in a given building than it is to safeguard the equipment there. You always have to be ready to send your NOC operators packing with a cell phone and a laptop for up to 48 hours, in case the building has to be evacuated for one of a dozen reasons. You should have a backup call center with an ACD set up in some other city or a contract with someone for a backup service.
--Kent
Kent,
We have a backup NOC 1,200 miles away. All the tools are set up and kept running in both places, and at least once a year Ann Arbor intentionally hands off NOC duties to Elmsford. The people issue is the main problem, with Elmsford not being as familiar with day-to-day NOC operations. Both NOCs are connected to two POPs in different states with allegedly diverse circuits. Neither NOC has generators, but the last time the UPS ran out we had NOC service cut over in time. This is also why we have DNS servers in different states (and Kerberos slave servers and other critical services). I don't think a cell phone and a laptop would cut it.
Curtis