1. When I had a power supply fail in a fileserver about a year ago, I limped it along until my next maintenance window (which happened to be in 24 hours, thank goodness) and replaced it then. It was only a 10 minute downtime for my users, who were very happy because there was no downtime during business hours. Usually this is what I will do. The less downtime I can have outside my maintenance windows, the better.

2. Depends. If there is a chance I'll break something if I don't shut it all down, I will. If it's unlikely that I'll break anything, then great, I'll keep working. If I have to shut down my database server, I'll switch over to the backup and keep working and then do the repairs and bring my backup online.

We've had issues here with power outages, and usually the UPSes will hold. The one time they didn't, we went and brought all the machines down gracefully by hand, as we didn't have the auto-shutdown installed on the systems.

While I do realize this is describing the "perfect" problem, there will be times when a NIC will fail or someone will cut the fiber, and then you just have to handle it the best way you know how to get the issue resolved, then take a blunt object (like the clue phone) to the person who cut the fiber. ;-)

-Eric

--
Eric Whitehill ericw@xtratyme.com
Network Engineer
XtraTyme Technologies
320.864.8513
http://www.xtratyme.com

1. Do you attempt to preserve service as long as possible, including running equipment to the point of destruction?
2. Do you attempt to minimize recovery time by shutting down equipment to a "safe" condition before failure?
If you are running a database/transaction oriented system, I would expect you want to put the database into a stable condition. On the other hand, if you are operating mostly communication equipment, you would want to leave it operating as long as possible.
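The distinction above (stabilize the database, keep communication gear running) can be sketched as a small policy function. This is only an illustration under assumptions: the names `Role`, `PowerStatus`, `decide`, and the 300-second margin are hypothetical and do not come from any UPS vendor's software or from any standard.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    DATABASE = "database"            # transaction-oriented host
    COMMUNICATION = "communication"  # routers, switches, and similar gear


class Action(Enum):
    KEEP_RUNNING = "keep running"
    GRACEFUL_SHUTDOWN = "graceful shutdown"


@dataclass
class PowerStatus:
    on_battery: bool
    runtime_left_s: int  # estimated seconds of battery runtime remaining


# Assumed margin: time a database host needs to stop cleanly, plus slack.
DB_SHUTDOWN_MARGIN_S = 300


def decide(role: Role, status: PowerStatus) -> Action:
    """Pick an action for one machine given the current power status."""
    if not status.on_battery:
        return Action.KEEP_RUNNING
    # Database hosts stop while there is still enough runtime to do it cleanly.
    if role is Role.DATABASE and status.runtime_left_s <= DB_SHUTDOWN_MARGIN_S:
        return Action.GRACEFUL_SHUTDOWN
    # Communication equipment stays up as long as the battery lasts.
    return Action.KEEP_RUNNING
```

For example, `decide(Role.DATABASE, PowerStatus(on_battery=True, runtime_left_s=120))` asks for a graceful shutdown, while the same status for `Role.COMMUNICATION` keeps the equipment running.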
I'm aware of a variety of proprietary shutdown programs from UPS vendors, but I'm wondering whether any "open standards" exist for initiating soft shutdowns.