Seth wrote:
Jonathan Lassoff wrote:
Just a heads up to anyone on list that PG&E has just sustained a large outage in San Francisco that has caused a few hiccups (network, electrical, infrastructural, etc.) around the city.
I've confirmed that customers in both 365 Main and parts of telecom 1 have sustained brief blackouts. No word yet from 200 Paul.
Anyone in the area that could use a hand with anything, I'll probably be wrapping up fixes for my stuff soon, and would be glad to help however I can.
I have a question: does anyone seriously accept "oh, power trouble" as a reason your servers went offline? Where's the generators? UPS? Testing said combination of UPS and generators? What if it was important? I honestly find it hard to believe anyone runs a facility like that and people actually *pay* for it.
If you do accept this as a good reason for failure, why?
Unfortunate real-world lesson: there is a functional difference between pushing the UPS test cutover button and some of the stuff that can happen out on the power lines (including rapid voltage swings, harmonics, etc.).

I know 365 Main has the equipment and tests it; I've been standing outside when the generators spool up.

I've had generator firmware upgrades generate reporting info on the serial uplink that flipped the UPSes into a permanent error state until the Liebert guys got off the plane with the replacement mainboard.

I've had grid voltage fluctuations that toasted VSDs in chillers.

I watched a building's electrical service go "pop" when a transformer blew and ran 10 kV into the 220 V mains for a fraction of a second as it arced.

I was at home but called in after a 5 MW generator popped under a sufficiently badly harmonic UPS and PDU load of only about 2.4 MW.

I had a client who forgot to wire the A/C into the UPS and nearly melted a whole server room.

And the stories that the power guy I'm working with tells about foreign facilities, particularly in Middle East war zones, are really scary...

We fundamentally do not have the facilities problem nailed down to the point that things will never drop. Level 4 datacenters can, and will, fail. Nothing you can do, including just doing 48V DC for everything, is a truly foolproof solution.

-george william herbert
gherbert@retro.com