Re: Revisiting the Aviation Safety vs. Networking discussion
At 02:08 AM 12/25/2009, Scott Howard wrote:
On Thu, Dec 24, 2009 at 6:27 PM, George Bonser <gbonser@seven.com> wrote:
So you can put a lot of process around changes in advance but there isn't quite as much to manage incidents that strike out of the clear blue. Too much process at that point could impede progress in clearing the issue. Capt. Sullenberger did not need to fill out an incident report, bring up a conference bridge, and give a detailed description of what was happening with his plane, the status of all subsystems, and his proposed plan of action (subject to consensus of those on the conference bridge) and get approval for deviation from his initial flight plan before he took the required actions to land the plane as best as he could under the circumstances.
"*mayday mayday mayday. Cactus fifteen thirty nine hit birds, we've lost thrust (in/on) both engines we're turning back towards LaGuardia*" - Capt. Sullenberger
Not exactly "detailed", but he definitely initiated an "incident report" (the mayday), gave a "description of what was happening with his plane", the "status of [the relevant] subsystems", and his proposed plan of action - even in the order you've asked for!
His actions were then "subject to the consensus of those on the conference bridge" (i.e., ATC), who could have denied his request if they believed it would have made the situation worse (e.g., if what he was proposing would have put him on a collision course with another plane). In this case, the conference bridge gave approval for his course of action ("*ok uh, you need to return to LaGuardia? turn left heading of uh two two zero.*" - ATC).
Once he declared an emergency, he had the right of way over all other traffic. ATC would move anyone in his way out of the way. Under U.S. FAA FAR 91.3, "Responsibility and authority of the pilot in command", the FAA declares:

* (a) The pilot in command of an aircraft is directly responsible for, and is the final authority as to, the operation of that aircraft.
* (b) In an in-flight emergency requiring immediate action, the pilot in command may deviate from any rule of this part to the extent required to meet that emergency.
* (c) Each pilot in command who deviates from a rule under paragraph (b) of this section shall, upon the request of the Administrator, send a written report of that deviation to the Administrator.

Just because we have checklists doesn't mean we can't think on our feet and handle situations not contemplated in checklists, but checklists and procedures exist to ensure we don't forget something we need to remember. They aren't a substitute for creativity and logical thought. They are an aid to it to ensure a minimum of creative thinking is needed to solve problems which shouldn't exist in the first place.

-Robert SEL&MEL+I
"Well done is better than well said." - Benjamin Franklin
In general, it seems that a field has to be aware that it can kill (or has killed) an embarrassing number of people before its members accept the need for controls such as processes and checklists. Here are a couple of incidents in which gruesome, public loss of life was necessary for thought to triumph over ego:

Doctors took forever to get over their bad selves and adopt the process of handwashing:
http://en.wikipedia.org/wiki/Ignaz_Semmelweis

Pilots discover humility and the value of checklists in managing complexity:
http://www.atchistory.org/History/checklst.htm

Reactor-rats, wing-wipers, barber-surgeons, and rocket-jockeys now recognize that the best and brightest among us, polished with state-of-the-art education and training, ruthlessly drilled in the fundamentals, and armed with the best processes and checklists, are just barely good enough to have even-money odds when dealing with everything the world can throw at them.

I suppose that once we packet-pushers kill enough people, the economics of lost market share, falling stock prices, and embarrassed CxOs on CNN will push us in that direction. Until then, however, Anarchy and Heroics (http://www.cert.org/archive/pdf/csi0711.pdf) sing their siren song.

David

On Sat, Dec 26, 2009 at 4:24 PM, Robert Boyle <robert@tellurian.com> wrote:
At 02:08 AM 12/25/2009, Scott Howard wrote:
On Thu, Dec 24, 2009 at 6:27 PM, George Bonser <gbonser@seven.com> wrote:
So you can put a lot of process around changes in advance but there isn't quite as much to manage incidents that strike out of the clear blue. Too much process at that point could impede progress in clearing the issue. Capt. Sullenberger did not need to fill out an incident report, bring up a conference bridge, and give a detailed description of what was happening with his plane, the status of all subsystems, and his proposed plan of action (subject to consensus of those on the conference bridge) and get approval for deviation from his initial flight plan before he took the required actions to land the plane as best as he could under the circumstances.
"*mayday mayday mayday. Cactus fifteen thirty nine hit birds, we've lost thrust (in/on) both engines we're turning back towards LaGuardia*" - Capt. Sullenberger
Not exactly "detailed", but he definitely initiated an "incident report" (the mayday), gave a "description of what was happening with his plane", the "status of [the relevant] subsystems", and his proposed plan of action - even in the order you've asked for!
His actions were then "subject to the consensus of those on the conference bridge" (i.e., ATC), who could have denied his request if they believed it would have made the situation worse (e.g., if what he was proposing would have put him on a collision course with another plane). In this case, the conference bridge gave approval for his course of action ("*ok uh, you need to return to LaGuardia? turn left heading of uh two two zero.*" - ATC).
Once he declared an emergency, he had the right of way over all other traffic. ATC would move anyone in his way out of the way. Under U.S. FAA FAR 91.3, "Responsibility and authority of the pilot in command", the FAA declares:

* (a) The pilot in command of an aircraft is directly responsible for, and is the final authority as to, the operation of that aircraft.
* (b) In an in-flight emergency requiring immediate action, the pilot in command may deviate from any rule of this part to the extent required to meet that emergency.
* (c) Each pilot in command who deviates from a rule under paragraph (b) of this section shall, upon the request of the Administrator, send a written report of that deviation to the Administrator.

Just because we have checklists doesn't mean we can't think on our feet and handle situations not contemplated in checklists, but checklists and procedures exist to ensure we don't forget something we need to remember. They aren't a substitute for creativity and logical thought. They are an aid to it to ensure a minimum of creative thinking is needed to solve problems which shouldn't exist in the first place.
-Robert SEL&MEL+I
"Well done is better than well said." - Benjamin Franklin
The connection may not be immediately apparent, but I think Philip Greenspun's article critiquing Malcolm Gladwell's musings on cranial metrics etc. has some bearing:

http://philip.greenspun.com/flying/foreign-airline-safety

...or is at least an interesting read.

In observing network operations screw-ups, I've seen a lot that were either caused by, or prolonged by, a culture-of-emergency. Young guys drinking way too much coffee, working a service window at two in the morning, believing they've seen something that needs to be fixed, and winging it.

In building networks, I've tried very hard to engineer things such that the operating procedure for dealing with an "emergency" is to note its existence and place it in a work queue to be dealt with by people who are on a day shift, have just come in from a full night's sleep, and are working in a team with senior people who can assist with anything tricky and make sure that junior folks are following procedures that have been worked out in advance by people who had plenty of time in a lab, and plenty of time to choose the best of many alternative procedures.

In my experience, reducing the frequency of emergencies is most beneficial in reducing the frequency of outages. :-)

-Bill
participants (3)
- Bill Woodcock
- David Hiers
- Robert Boyle