Never push the Big Red Button (New York City subway failure)
NEW YORK CITY TRANSIT RAIL CONTROL CENTER POWER OUTAGE ISSUE ON AUGUST 29, 2021
Key Findings
September 8, 2021
https://www.governor.ny.gov/sites/default/files/2021-09/WSP_Key_Findings_Sum...
Key Findings
[...]
3. Based on the electrical equipment log readings and the manufacturer’s official assessment, it was determined that the most likely cause of RCC shutdown was the “Emergency Power Off” button being manually activated.
Secondary Findings
1. The “Emergency Power Off” button did not have a protective cover at the time of the shutdown or the following WSP investigation.
[...]
Mitigation Steps
1. Set up the electrical equipment Control and Communication systems properly to stay active so that personnel can monitor RCC electrical system operations.
[...]
Reminds me of something that happened about 25 years ago, when an elementary school visited the data center of the insurance company where I worked. One of our operators strategically positioned himself between the kids and the mainframe, leaned back and hit its EPO button.
Matthew Huff | Director of Technical Operations | OTA Management LLC
Office: 914-460-4039
mhuff@ox.com | www.ox.com
-----Original Message-----
From: NANOG <nanog-bounces+mhuff=ox.com@nanog.org> On Behalf Of Sean Donelan
Sent: Friday, September 10, 2021 12:38 PM
To: nanog@nanog.org
Subject: Never push the Big Red Button (New York City subway failure)
NEW YORK CITY TRANSIT RAIL CONTROL CENTER POWER OUTAGE ISSUE ON AUGUST 29, 2021
Key Findings
September 8, 2021
https://www.governor.ny.gov/sites/default/files/2021-09/WSP_Key_Findings_Sum...
Key Findings
[...]
3. Based on the electrical equipment log readings and the manufacturer’s official assessment, it was determined that the most likely cause of RCC shutdown was the “Emergency Power Off” button being manually activated.
Secondary Findings
1. The “Emergency Power Off” button did not have a protective cover at the time of the shutdown or the following WSP investigation.
[...]
Mitigation Steps
1. Set up the electrical equipment Control and Communication systems properly to stay active so that personnel can monitor RCC electrical system operations.
[...]
On Fri, Sep 10, 2021 at 1:49 PM Matthew Huff <mhuff@ox.com> wrote:
Reminds me of something that happened about 25 years ago, when an elementary school visited the data center of the insurance company where I worked. One of our operators strategically positioned himself between the kids and the mainframe, leaned back and hit its EPO button.
Or when your building engineering team cuts themselves a new key for the 'main breaker' for the facility... and tests it at 2pm on a Tuesday.
Or when that same team cuts a second key (gotta have 2 keys!) and tests that key on the same 'main breaker' ... at 2pm on the following Tuesday.
<quadruple face palm>
Not fake news: a real story from a large building full of gov't employees and computers and all manner of 'critical infrastructure' for the agency occupying said building.
True EPO story: a maintenance crew carrying new drywall into the data center backed into the EPO, which didn't have a cover on it. One of the most eerie sounds in networking...a completely silent data center.
-chris
On Fri, Sep 10, 2021 at 2:48 PM Christopher Morrow <morrowc.lists@gmail.com> wrote:
Or when your building engineering team cuts themselves a new key for the 'main breaker' for the facility... and tests it at 2pm on a Tuesday.
Or when that same team cuts a second key (gotta have 2 keys!) and tests that key on the same 'main breaker' ... at 2pm on the following Tuesday.
<quadruple face palm>
Not fake news: a real story from a large building full of gov't employees and computers and all manner of 'critical infrastructure' for the agency occupying said building.
-- Chris Kane
Since we are telling power horror stories…
How about the call from the night operator that arrived at 10:00pm asking “Is there any reason there is no power in the data center?”
Turns out someone had plugged a new high-end workgroup laser printer into the outside wall of the datacenter. The power receptacle was wired into the data center’s UPS, and the printer completely smoked the UPS. Luckily the static transfer switch worked, but the three mainframes weren’t happy…
Or
Our building had a major ground fault issue that took years to find and resolve. We got hit with lightning that caused the mainframe to fault and recycle…and two minutes in, we got hit by lightning again. When the system failed to start, we called IBM support. When we explained what happened there was a very long pause…then some mumbling off phone, then the manager got on the line and said someone would be flying out and be onsite within 12 hours. We were down for 3 days, and got fined $250,000 by the insurance regulators since we couldn’t pay claims.
Matthew Huff | Director of Technical Operations | OTA Management LLC
Office: 914-460-4039
mhuff@ox.com | www.ox.com
From: Chris Kane <ccie14430@gmail.com>
Sent: Friday, September 10, 2021 3:16 PM
To: Christopher Morrow <morrowc.lists@gmail.com>
Cc: Matthew Huff <mhuff@ox.com>; nanog@nanog.org
Subject: Re: Never push the Big Red Button (New York City subway failure)
True EPO story: a maintenance crew carrying new drywall into the data center backed into the EPO, which didn't have a cover on it. One of the most eerie sounds in networking...a completely silent data center.
-chris
A nearby datacenter once had a delayed power loss because someone hit the switch to transfer from city power to generator power and then failed to notice. The power went out the next day, when there was no fuel left.
On Fri, Sep 10, 2021 at 9:24 PM Matthew Huff <mhuff@ox.com> wrote:
Since we are telling power horror stories…
How about the call from the night operator that arrived at 10:00pm asking “Is there any reason there is no power in the data center?”
Turns out someone had plugged a new high-end workgroup laser printer into the outside wall of the datacenter. The power receptacle was wired into the data center’s UPS, and the printer completely smoked the UPS. Luckily the static transfer switch worked, but the three mainframes weren’t happy…
Or
Our building had a major ground fault issue that took years to find and resolve. We got hit with lightning that caused the mainframe to fault and recycle…and two minutes in, we got hit by lightning again. When the system failed to start, we called IBM support. When we explained what happened there was a very long pause…then some mumbling off phone, then the manager got on the line and said someone would be flying out and be onsite within 12 hours. We were down for 3 days, and got fined $250,000 by the insurance regulators since we couldn’t pay claims.
On Fri, Sep 10, 2021 at 4:21 PM Baldur Norddahl <baldur.norddahl@gmail.com> wrote:
A nearby datacenter once had a delayed power loss because someone hit the switch to transfer from city power to generator power and then failed to notice. The power went out the next day, when there was no fuel left.
:-)
A story, told to me by a friend...
The utility let them know that they were going to be doing some maintenance work in the area. No impact expected, but out of an abundance of caution, they transfer over to generators. After the utility lets them know that the maintenance work is all finished, they want to switch back. If the generators are "emergency power", and you need to switch back to "utility power", obviously the way to do this must be the big red button, clearly marked as "EMERGENCY POWER OFF", no?!
I suspect it is apocryphal, but it's still entertaining,
W
-- The computing scientist’s main challenge is not to get confused by the complexities of his own making. -- E. W. Dijkstra
If the generators are "emergency power", and you need to switch back to "utility power", obviously the way to do this must be the big red button, clearly marked as "EMERGENCY POWER OFF", no?!
The owner of my previous company did the same thing to us many years ago because there was a small smudge on the placard between POWER and OFF that he interpreted as a dash. He was never happy with the custom sign I hung after that: REVENUE REDUCTION SWITCH. But he never tried to be helpful after that, so mission accomplished.
On Sep 10, 2021, at 1:33 PM, Warren Kumari <warren@kumari.net> wrote:
The utility let them know that they were going to be doing some maintenance work in the area. No impact expected, but out of an abundance of caution, they transfer over to generators. After the utility lets them know that the maintenance work is all finished, they want to switch back. If the generators are "emergency power", and you need to switch back to "utility power", obviously the way to do this must be the big red button, clearly marked as "EMERGENCY POWER OFF", no?!
One of the many stories that came out of 9/11 was a switching center in NY City that had a diesel generator as a power backup - which of course acted as primary when the city power was off. After a few days of operation, it needed to be refueled, so a truck was sent in carrying gasoline. The generator was refueled and restarted, and - oops - diesel != gasoline. So then they needed to bring in a new generator.
Yup, it happens, and it happened.
On Wed, Sep 15, 2021 at 3:20 PM Fred Baker <fredbaker.ietf@gmail.com> wrote:
One of the many stories that came out of 9/11 was a switching center in NY City that had a diesel generator as a power backup - which of course acted as primary when the city power was off. After a few days of operation, it needed to be refueled, so a truck was sent in carrying gasoline. The generator was refueled and restarted, and - oops - diesel != gasoline. So then they needed to bring in a new generator.
Oooof. I've seen someone at a gas station do something similar -- I cannot remember if it was putting diesel in their gasoline car, or gas in their diesel pickup, but I *do* remember the sudden yelp and look of dismay when they suddenly realized what they were doing. It must be really easy to get wrong in a car (operating on autopilot), but that's a much less bad failure than a generator...
Anyway, refueling generators reminds me of:
https://www.mail-archive.com/nanog@nanog.org/msg111947.html
W
-- The computing scientist’s main challenge is not to get confused by the complexities of his own making. -- E. W. Dijkstra
On Wed, Sep 15, 2021 at 12:32 PM Warren Kumari <warren@kumari.net> wrote:
Oooof. I've seen someone at a gas station do something similar -- I cannot remember if it was putting diesel in their gasoline car, or gas in their diesel pickup, but I *do* remember the sudden yelp and look of dismay when they suddenly realized what they were doing. It must be really easy to get wrong in a car (operating on autopilot), but that's a much less bad failure than a generator...
The diesel nozzle has a larger diameter than the gasoline one. It doesn't fit in the filler neck of a normal gasoline-powered car.
Regards,
Bill Herrin
--
William Herrin
bill@herrin.us
https://bill.herrin.us/
In the data centers I've worked in over the decades, those Big Red Buttons would activate a normally-closed contactor in a breaker panel. When pushed, the contactor would open and turn off all the circuits in said breaker panel. Not affected are lights, convenience outlets, door locks, and other non-data loads. Resetting the contactor to the working position was done after throwing all the breakers to the off position; each breaker was then turned back on, one at a time.
The only noise that I have ever heard when the Big Red Button was pushed was the loud BANG as the contactor operated. You hear a similar bang in movies in scenes where lights in a large area are turned on and off.
Nothing like the BANG of a 600-amp 3-phase breaker tripping -- I experienced that at the University of Illinois Center for Advanced Computation. You immediately look for the person holding a gun.
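For anyone who hasn't stood in front of one of these panels, the contactor behaviour described above can be sketched as a toy model. This is only an illustration under stated assumptions: the class name BreakerPanel, its methods, and the rack names are hypothetical, and it does not represent any vendor's control logic or any real facility's procedure. Lights, convenience outlets and door locks sit outside the contactor and are deliberately not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class BreakerPanel:
    """Hypothetical toy model of a breaker panel fed through an EPO contactor."""

    # breaker name -> True if the breaker is switched on
    breakers: dict = field(default_factory=dict)
    # Normally closed: power flows until the EPO button opens the contactor.
    contactor_closed: bool = True

    def energized(self):
        """A circuit has power only if the contactor is closed AND its breaker is on."""
        if not self.contactor_closed:
            return []
        return [name for name, on in self.breakers.items() if on]

    def push_epo(self):
        """The Big Red Button: open the contactor, dropping every circuit at once."""
        self.contactor_closed = False

    def reset(self):
        """Recovery in the order described: all breakers off, contactor reset,
        then breakers back on one at a time. Returns what is live after each step."""
        for name in self.breakers:
            self.breakers[name] = False
        self.contactor_closed = True
        steps = []
        for name in self.breakers:
            self.breakers[name] = True
            steps.append((name, self.energized()))
        return steps
```

Bringing the breakers back one at a time means the load returns in stages rather than all at once when the contactor closes, which is presumably the point of the sequence described.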
The bigger thing to notice is the *lack* of noise as every server, switch and storage system spins down.
---
Keith Stokes
On Sep 15, 2021, at 3:50 PM, Stephen Satchell <list@satchell.net> wrote:
In the data centers I've worked in over the decades, those Big Red Buttons would activate a normally-closed contactor in a breaker panel. When pushed, the contactor would open and turn off all the circuits in said breaker panel. Not affected are lights, convenience outlets, door locks, and other non-data loads. Resetting the contactor to the working position was done after throwing all the breakers to the off position; each breaker was then turned back on, one at a time.
The only noise that I have ever heard when the Big Red Button was pushed was the loud BANG as the contactor operated. You hear a similar bang in movies in scenes where lights in a large area are turned on and off.
Nothing like the BANG of a 600-amp 3-phase breaker tripping -- I experienced that at the University of Illinois Center for Advanced Computation. You immediately look for the person holding a gun.
Indeed. Few sounds in the data center haunt me quite as much as a sensation that the decibel level has just decreased significantly.
On Wed, Sep 15, 2021, 4:14 PM Keith Stokes <keiths@salonbiz.com> wrote:
The bigger thing to notice is the *lack* of noise as every server, switch and storage system spins down.
---
Keith Stokes
Totally agree @billy, couldn't have put it better myself. Though it's much more entertaining (and terrifying) to wince at the audible difference between just the hardware whirring down to silence, and all of the temp and humidity controls whirring to a halt with the hardware. The fear in the air is literally palpable, because it's immediately 90 degrees and your lungs cave.
We had a new fire suppression guy testing the system for our computer room, and unbeknownst to him, one of the tests that he did triggered the EPO. One of my co-workers was in the room, which had about 20 racks of equipment, and he told me afterwards that for a quick second he thought he had died. He said that going from all that noise to eerie quiet was very disconcerting. rgt
On Sep 15, 2021, at 2:20 PM, Fred Baker <fredbaker.ietf@gmail.com> wrote:
One of the many stories that came out of 9/11 was a switching center in NY City that had a diesel generator as a power backup - which of course acted as primary when the city power was off. After a few days of operation, it needed to be refueled, so a truck was sent in carrying gasoline. The generator was refueled and restarted, and - oops - diesel != gasoline. So then they needed to bring in a new generator.
Yup, it happens, and it happened.
I distinctly remember something like this - Someone built a datacenter with large fuel storage tanks in the basement and the actual generators up on the roof, or some higher floor. It was tested several times, everything seemed to be working as expected, and life went on. Then one day the power went out, the generators came on, but after about 10 minutes the generators started to crap out. It was then discovered that they had forgotten to include the transfer pumps for getting the fuel up from the basement to the generators in the list of things powered by said generators…
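That failure mode (something the generators depend on is not itself on generator power) is the kind of thing a simple dependency walk can catch on paper before the lights go out. A rough sketch follows, assuming you can write down what each piece of the backup path needs; the function name, the load names, and the data are all hypothetical.

```python
def missing_from_backup(root, requires, generator_fed):
    """Return everything `root` transitively needs that is NOT on generator power.

    root:          the thing that must keep running (e.g. "generator")
    requires:      dict mapping an item to the items it needs in order to run
    generator_fed: set of loads wired to the generator-backed panel
    """
    missing, seen, stack = set(), set(), [root]
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        for dep in requires.get(item, ()):
            if dep not in generator_fed:
                missing.add(dep)
            stack.append(dep)   # keep walking: the dependency may have needs of its own
    return sorted(missing)


# Hypothetical inventory loosely modeled on the story above.
requires = {"generator": ["fuel_transfer_pump", "generator_controls"]}
generator_fed = {"servers", "cooling", "generator_controls"}

print(missing_from_backup("generator", requires, generator_fed))
```

A check like this is only as good as the inventory fed into it; it does not replace a load test that runs longer than whatever fuel sits next to the generators.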
On Fri, Sep 10, 2021 at 2:52 PM Christopher Morrow <morrowc.lists@gmail.com> wrote:
On Fri, Sep 10, 2021 at 1:49 PM Matthew Huff <mhuff@ox.com> wrote:
Reminds me of something that happened about 25 years ago when an elementary school visited the data center of the insurance company where I worked. One of our operators strategically positioned himself between the kids and the mainframe, leaned back and hit its EPO button.
Or when your building engineering team cuts themselves a new key for the 'main breaker' for the facility... and tests it at 2pm on a tuesday. Or when that same team cuts a second key (gotta have 2 keys!) and tests that key on the same 'main breaker' ... at 2pm on the following tuesday.
<quadruple face palm>
not fakenews, a real story from a large building full of gov't employees and computers and all manner of 'critical infrastructure' for the agency occupying said building.
In the early 2000s a friend of mine worked for a company in NYC that provided stock feeds to large banks and brokerages and similar. They'd ship a (locked) cabinet full of stuff to their customers, complete with an Ethernet cable stickin' out the back. Customer would plug this into their network and, um, do whatever it is stock people do. There was some horrendously expensive SLA attached, and so they outsourced support to one of the managed services companies so that they could provide 24x7x2hour response all over the country.
One day, one of their largest customers, a large bank, also in NYC, is down. This means that the brokerage arm is unable to do any trades, and so is, um, annoyed. Support rushes over to the customer and "fixes it". My friend doesn't really get a good explanation of how it got fixed, but, meh, it's working, so all good. A few weeks later, same thing - customer devices disappear from monitoring, smart-hands/support rush over and fix it, no useful RFO. This happens a few more times, and everyone is getting increasingly annoyed.
Eventually my friend arranges it so that he gets paged at the same time as the support provider. Pager goes off, friend jumps in a cab to the customer. He arrives at the same time as the smart-hands person, who, oddly, is clutching 1: an Ethernet face-plate and 2: a punch-down tool. Somewhat mystified, my friend follows the support person to where the cabinet is located.
Because it's important and special, but not actually bank owned, it cannot live in their data-center... and so it is located in the corridor, just outside the server room. Because of where the Ethernet cable comes out the back of the cabinet, and where the wall jack is, there is basically no slack. When someone goes in or out, especially if they are wheeling a cart or carrying a box of equipment, they bang into the cabinet, which slowly rolls away -- ripping the wall jack off the wall, and the cable out the back of the jack.
Support's "solution" to this has been to punch down the cable onto a new wall jack, screw it back onto the wall, wheel the cabinet back into place, and call it fixed. My friend screwed down the cabinet feet, so it wasn't resting on the wheels any more, replaced the 6ft Ethernet with a 15ft, and the issue never recurred :-P
W
-- The computing scientist’s main challenge is not to get confused by the complexities of his own making. -- E. W. Dijkstra
On Fri, 10 Sep 2021, Sean Donelan wrote:
1. The “Emergency Power Off” button did not have a protective cover at the time of the shutdown or the following WSP investigation.
Aka "molly-guard". https://en.wiktionary.org/wiki/molly-guard -- Mikael Abrahamsson email: swmike@swm.pp.se
----- Original Message -----
From: "Sean Donelan" <sean@donelan.com>
NEW YORK CITY TRANSIT RAIL CONTROL CENTER POWER OUTAGE ISSUE ON AUGUST 29, 2021 Key Findings September 8, 2021
https://www.governor.ny.gov/sites/default/files/2021-09/WSP_Key_Findings_Summary-for_release.pdf
Key Findings [...]
3. Based on the electrical equipment log readings and the manufacturer’s official assessment, it was determined that the most likely cause of RCC shutdown was the “Emergency Power Off” button being manually activated.
I don't even *do* datacenter for a living, and I know that when you hit the Molly button,
1) A Klaxon goes off in the Data Center -- one that sounds *different* from the Halon Klaxon, in both cadence and tone (just for a couple bursts), and
2) Yellow rotating beacons turn on, and stay on while you're on Emergency Power.
Yes, real honest-to-ghod *rotating mechanical beacons*, none of this flashing LED crap.
Clearly, it's important that the use of Emergency Power be annoyingly noticeable.
Cheers, -- jra
--
Jay R. Ashworth          Baylink                       jra@baylink.com
Designer                 The Things I Think            RFC 2100
Ashworth & Associates    http://www.bcp38.info         2000 Land Rover DII
St Petersburg FL USA     BCP38: Ask For It By Name!    +1 727 647 1274
Now I'm curious... in all of the DCs and COs I've worked in - to the best of my knowledge, I haven't personally tested this! - the EPO button does not switch to emergency power. It turns off ALL equipment power in the space - no lights, no klaxons, nothing. In simpler setups, the EPO is connected to the UPS so anything plugged in to the UPS goes dark instantly. In one DC I'm familiar with, the EPO switch kills all the UPS output and uses several relays to kill commercial power at the same time. In some, the room lights were not covered by the EPO switch, in some they were. Emergency exit lamps will continue to be lit, as they have internal batteries, and are required by building/fire code.
Is it (somewhat) common for an EPO switch to only disconnect commercial power and leave local redundant power live? What sort of facilities would have this?
-Adam
Adam Thompson Consultant, Infrastructure Services
100 - 135 Innovation Drive Winnipeg, MB, R3T 6A8 (204) 977-6824 or 1-800-430-6404 (MB only) athompson@merlin.mb.ca www.merlin.mb.ca
It was always my understanding EPO was to be used for “We have an electrical fire and need to remove the source RFN”, not “we need to be on the redundant power instead of city power and don’t want to wait for the automatic transfer”.
Hi Daniel, That's correct. I'm not sure what Jay was on about, but the EPO button kills power to everything that would otherwise be protected from a building power failure. There's generally no warning; you know it happened from the rapid silence. I've also never seen warning lights that the facility is on emergency power. It's probably a good idea but I've never seen it. Regards, Bill Herrin -- William Herrin bill@herrin.us https://bill.herrin.us/
On Wed, Sep 15, 2021 at 12:23 PM Daniel Seagraves <dseagrav@humancapitaldev.com> wrote:
It was always my understanding EPO was to be used for “We have an electrical fire and need to remove the source RFN”, not “we need to be on the redundant power instead of city power and don’t want to wait for the automatic transfer”.
Well, there is the EPO button, which generally does that, and the (variously labeled) HALON/FM-200/GAS FIRE SUPPRESSION/GAS DISCHARGE button, which does the flashy lights and klangly bell and similar. This is fairly much always required by code, to give people time to evacuate before the gas dumps and they suffocate. People often refer to both of these as EPOs (or "the buttons that must not be pressed unless you have a REALLY good reason").
When I grew up (in South Africa), Halon/BCF was still in active use. When there was a fire (or you pressed and held the big red HALON button) a siren would sound and lights would flash for a few seconds to allow everyone time to evacuate the machine room.
I'm assuming that things are now less stupid, but at the local University, the BCF was stored in large gas bottles, with a pyrotechnic valve to release it. The pyrotechnic charge was initiated with LA/LS (Lead Azide/Lead Styphnate) hot-wire initiators, which were supposed to be replaced every 2 years as part of some maintenance schedule - when LA/LS ages, especially in the presence of humidity, it apparently can form a much more sensitive crystal structure, which is very shock sensitive.
The system was installed in the 1960s, and the initiators were replaced once or twice. Eventually, however, with sanctions, especially on things that can be made to go boom, it became hard to get replacements, and so they stopped replacing them... and eventually forgot about them ...... right up until sometime in the early 1990s, when someone accidentally knocked into the bottles with a loaded equipment cart. By this time the initiators had become sufficiently old and ornery that they decided that they'd had enough, and set off the pyro charges, which dumped all of the Halon into the room.
Luckily everyone survived, but IIRC, two people passed out before making it to the door, and someone had to rush in and pull them to fresh air. The added gas pressure also cracked the big glass window (what's the point in having a big mainframe with flashy lights and spinning tapes if you cannot show it off?), and also caused a few head-crashes.
W
-- The computing scientist’s main challenge is not to get confused by the complexities of his own making. -- E. W. Dijkstra
On September 15, 2021 at 13:31 warren@kumari.net (Warren Kumari) wrote:
Well, there is the EPO button, which generally does that, and the (variously labeled) HALON/FM-200/GAS FIRE SUPPRESSION/GAS DISCHARGE button, which does the flashy lights and klangly bell and similar. This is fairly much always required by code, to give people time to evacuate before the gas dumps and they suffocate.
People don't suffocate from Halon dumps, I've been thru a couple (not me personally but staff, I was in my office but arrived quickly.) What is somewhat dangerous about Halon (or likely more modern) fire suppression dumps is they create like 90mph winds so you're in some danger from something like a pencil nearby. Hence, cover your face with your arms or a coat or similar if one is imminent. It makes a truly impressive mess of a machine room, piles of paper for example basically tossed out to the walls etc. But better than a fire. Same for people, better than being roasted alive or having to breathe burning plastics etc. People who complain about these systems always seem to pose it like if it wasn't for the fire suppression system we'd all be much better off, like if not for these damn lifeboats we could just stay on the ship. Granted in those two cases nothing very dangerous was going on. Once it was caused by a nearby security person's walkie-talkie -- yeah, we proved that by reproducing it tho w/ the hold button pressed so it wasn't a theory. -- -Barry Shein Software Tool & Die | bzs@TheWorld.com | http://www.TheWorld.com Purveyors to the Trade | Voice: +1 617-STD-WRLD | 800-THE-WRLD The World: Since 1989 | A Public Information Utility | *oo*
My story: in the late 1970s I was working in a large computer facility with both mainframes and mil-spec 400 Hz computers. Management decided that the EPO should be tested, so we powered down the disks and tapes. The electrician pressed the EPO button and NOTHING. Everything kept running. It turned out a wire had come loose and the fuse in the EPO circuit had blown. Roy
I once worked for a provider who had a company next door that ran a small datacenter of about a dozen or so racks. They had been sold and all of their infrastructure had been virtualized and moved to the new owner’s network. The last task of the local admin was to get rid of everything. They didn’t care how, just get it gone. So he came over and asked us to come take a look, and said we could have anything we wanted. I picked up a few servers for the lab, a bunch of racks and stuff. While we were working I asked the guy “So there is absolutely nothing that’s in production in here anymore?” He said “Yep”, so I asked “Then if the power went off in here it wouldn’t be a big deal?” and he said “Not at all”. Then I asked “Can I hit the red button?” He said “Sure, I always wondered what happened”. I hit the button and with a loud booming sound the room went dead silent and then the UPS started beeping. It was at that moment everyone realized that you don’t just pull the button out to restart the room. It took us 20 minutes to figure out how to turn it all back on. When I got back to our office I made sure someone knew how to restart everything if we ever had to hit our red button. -richey
----- On Sep 15, 2021, at 9:08 PM, bzs bzs@theworld.com wrote: Hi,
People don't suffocate from Halon dumps, I've been thru a couple (not me personally but staff, I was in my office but arrived quickly.)
What is somewhat dangerous about Halon (or likely more modern) fire suppression dumps is they create like 90mph winds so you're in some danger from something like a pencil nearby. Hence, cover your face with your arms or a coat or similar if one is imminent.
I can speak from experience. Back in the early 2000s I was working for a small regional ISP that provided colocation services in the same building as the office. We had an Inergen system and I had the honor of being in the room when it suddenly went off without warning. The noise and air movement were similar to the one time I rode a motorcycle on the autobahn and hit 200mph. Not fun. Afterwards I felt slightly lightheaded, but was otherwise ok. Not that my boss cared: he lit a piece of paper outside of the room, walked in, and noted, after the flames died out, "hey, it works". Thanks, Sabri
On 9/15/21 08:58, Adam Thompson wrote:
Now I'm curious... in all of the DCs and COs I've worked in - to the best of my knowledge, I haven't personally tested this! - the EPO button does *not* switch to emergency power. It turns off ALL equipment power in the space - no lights, no klaxons, nothing. In simpler setups, the EPO is connected to the UPS so anything plugged in to the UPS goes dark instantly. In one DC I'm familiar with, the EPO switch kills all the UPS output *and* uses several relays to kill commercial power at the same time.
That's my understanding as well. Not necessarily the room lights depending on the facility, but all equipment power. To be used if the space is on fire or someone is in the process of being electrocuted. I've never seen a klaxon or audible alarm connected with EPO. Things just get very quiet. -- Jay Hennigan - jay@west.net Network Engineering - CCIE #7880 503 897-8550 - WB6RDV
----- Original Message -----
From: "Adam Thompson" <athompson@merlin.mb.ca>
Now I'm curious... in all of the DCs and COs I've worked in - to the best of my knowledge, I haven't personally tested this! - the EPO button does not switch to emergency power. It turns off ALL equipment power in the space - no lights, no klaxons, nothing. In simpler setups, the EPO is connected to the UPS so anything plugged in to the UPS goes dark instantly. In one DC I'm familiar with, the EPO switch kills all the UPS output and uses several relays to kill commercial power at the same time. In some, the room lights were not covered by the EPO switch, in some they were. Emergency exit lamps will continue to be lit, as they have internal batteries, and are required by building/fire code.
Is it (somewhat) common for an EPO switch to only disconnect commercial power and leave local redundant power live? What sort of facilities would have this?
No... I just hadn't had my coffee yet that morning and I crossed the streams. That should be the response to the *ATS cutover*, not the Molly switch. If someone hits the Molly button, you don't *need* an alarm. :-} Cheers, -- jra -- Jay R. Ashworth Baylink jra@baylink.com Designer The Things I Think RFC 2100 Ashworth & Associates http://www.bcp38.info 2000 Land Rover DII St Petersburg FL USA BCP38: Ask For It By Name! +1 727 647 1274
Code requires this here. The intent of the EPO button is to immediately disconnect all power to the entire facility/building in the event of a critical fault like an electrical fire or electrocution. Only locally battery-powered low-voltage emergency lighting should still be operating. Often the next step after EPO is to flood the room... -- L.B. Ms. Lady Benjamin PD Cannon of Glencoe, ASCE 6x7 Networks & 6x7 Telecom, LLC CEO lb@6by7.net "The only fully end-to-end encrypted global telecommunications company in the world." FCC License KJ6FJJ
Sigh, people often mis-hear this when I say it, so I will try to say it carefully.
If you have an Emergency Power Off (EPO), the electrical code (and life-safety code) allows use of several alternative wiring methods. Some people mistakenly believe the allowed alternatives are the rule, but they are actually exceptions.
On the other hand.... If you do NOT have an Emergency Power Off (EPO), you are NOT allowed to use the associated alternatives in the electrical code. Among the alternatives NOT allowed without the EPO, is almost everything in Article 645 - Information Technology Equipment rooms (or the equivalent in international electrical codes). Most people, including licensed electricians, believe it is the other way around.
Why does Article 645 exist? (and equivalents in other international electrical codes)
IBM in the 1950s published construction specifications for Automatic Data Processing rooms for its mainframe computers, which everyone else copied. IBM wanted to use alternative wiring methods in ADP rooms for its mainframe computers, so the Big Red Button was born. Also, mainframe computers used to cost more than the building, so people (insurance companies) didn't want the mainframe damaged during a fire.
It is possible to design a data center WITHOUT using those electrical code exceptions, and WITHOUT a "Big Red Button."
You can check, because my data center ideas were copied by several tech companies world-wide (you know who you are), and don't have Big Red Buttons. All of those data centers also have water-based automatic fire sprinklers. Both were very radical ideas at the time, which are now commonly accepted.
In most cases, you'll need a fully licensed, Professional Engineer specializing in Electrical Engineering to sign off on the final design. A licensed electrician isn't enough. Nevertheless, it is possible to build a safe, code-compliant data center WITHOUT a Big Red Button. The design also seemed to be more reliable.
Let the misinterpretation begin ...
On Sep 17, 2021, at 8:59 AM, Sean Donelan <sean@donelan.com> wrote:
It is possible to design a data center WITHOUT using those electrical code exceptions, and WITHOUT a "Big Red Button."
You can check, because my data center ideas were copied by several tech companies world-wide (you know who you are), and don't have Big Red Buttons. All of those data centers also have water-based automatic fire sprinklers. Both were very radical ideas at the time, which are now commonly accepted.
In most cases, you'll need a fully licensed, Professional Engineer specializing in Electrical Engineering to sign off on the final design. A licensed electrician isn't enough. Nevertheless, it is possible to build a safe, code-compliant data center WITHOUT a Big Red Button. The design also seemed to be more reliable.
What’s the gain in _not_ having one that makes it worth the sign-off and hassle? Just avoiding the possibility of accidental activation or something I’m not thinking of?
participants (24)
- Adam Thompson
- Baldur Norddahl
- Billy Croan
- bzs@theworld.com
- Callan Banner
- Chris Kane
- Christopher Morrow
- Daniel Seagraves
- Fred Baker
- Jay Hennigan
- Jay R. Ashworth
- Keith Stokes
- Lady Benjamin Cannon of Glencoe, ASCE
- Matthew Huff
- Mikael Abrahamsson
- richey goldberg
- Robert Taylor
- Roy
- Sabri Berisha
- Sean Donelan
- Stephen Satchell
- Tom Beecher
- Warren Kumari
- William Herrin