System accident

A system accident (or normal accident) is an "unanticipated interaction of multiple failures" in a complex system. This complexity can either be of technology or of human organizations, and is frequently both. A system accident can be easy to see in hindsight, but extremely difficult in foresight because there are simply too many action pathways to seriously consider all of them.
Charles Perrow first developed these ideas in the mid-1980s. William Langewiesche wrote in the late 1990s, "the control and operation of some of the riskiest technologies require organizations so complex that serious failures are virtually guaranteed to occur." Safety systems themselves are sometimes the added complexity which leads to this type of accident. Maintenance problems are common with redundant systems: maintenance crews can fail to restore a redundant system to active status, they are often overworked, or maintenance is deferred due to budget cuts, because managers know that the system will continue to operate without fixing the backup system.


General characterization

In 2012 Charles Perrow wrote, "A normal accident [system accident] is where everyone tries very hard to play safe, but unexpected interaction of two or more failures (because of interactive complexity), causes a cascade of failures (because of tight coupling)." Perrow uses the term ''normal accident'' to emphasize that, given the current level of technology, such accidents are highly likely over a number of years or decades. James Reason extended this approach with human reliability and the Swiss cheese model, now widely accepted in aviation safety and healthcare.

There is an aspect of an animal devouring its own tail, in that more formality and effort to get it exactly right can actually make the situation worse. (Langewiesche, William (March 1998). "The Lessons of ValuJet 592," ''The Atlantic''. See especially the last three paragraphs of this long article: " . . . Understanding why might keep us from making the system even more complex, and therefore perhaps more dangerous, too.")
For example, the more organizational rigmarole involved in adjusting to changing conditions, the more employees will likely delay reporting such changes, "problems," and unexpected conditions. These accidents often resemble Rube Goldberg devices in the way that small errors of judgment, flaws in technology, and insignificant damages combine to form an emergent disaster. William Langewiesche writes about "an entire pretend reality that includes unworkable chains of command, unlearnable training programs, unreadable manuals, and the fiction of regulations, checks, and controls." An opposing idea is that of the high reliability organization.


Scott Sagan

Scott Sagan has multiple publications discussing the reliability of complex systems, especially regarding nuclear weapons. ''The Limits of Safety'' (1993) provided an extensive review of close calls during the Cold War that could have resulted in a nuclear war by accident.


Possible system accidents


Apollo 13 space flight, 1970

Apollo 13 Review Board:


Three Mile Island, 1979

Charles Perrow: "It resembled other accidents in nuclear plants and in other high risk, complex and highly interdependent operator-machine systems; none of the accidents were caused by management or operator ineptness or by poor government regulation, though these characteristics existed and should have been expected. I maintained that the accident was normal, because in complex systems there are bound to be multiple faults that cannot be avoided by planning and that operators cannot immediately comprehend."


ValuJet (AirTran) 592, Everglades, 1996

On May 11, 1996, ValuJet Airlines Flight 592, a regularly scheduled flight from Miami International to Hartsfield–Jackson Atlanta, crashed about 10 minutes after taking off as a result of a fire in the cargo compartment caused by improperly stored and labeled hazardous cargo. All 110 people on board died. The airline had a poor safety record before the crash. The accident brought widespread attention to the airline's management problems, including inadequate training of employees in proper handling of hazardous materials. The maintenance manual for the MD-80 aircraft documented the necessary procedures and was "correct" in a sense. However, it was so huge that it was neither helpful nor informative.


2008 financial institution near-meltdown

In a 2014 monograph, economist Alan Blinder stated that complicated financial instruments made it hard for potential investors to judge whether the price was reasonable. In a section entitled "Lesson # 6: Excessive complexity is not just anti-competitive, it's dangerous," he further stated, "But the greater hazard may come from opacity. When investors don't understand the risks that inhere in the securities they buy (examples: the mezzanine tranche of a CDO-Squared; a CDS on a synthetic CDO, ...), big mistakes can be made--especially if rating agencies tell you they are triple-A, to wit, safe enough for grandma. When the crash comes, losses may therefore be much larger than investors dreamed imaginable. Markets may dry up as no one knows what these securities are really worth. Panic may set in. Thus complexity ''per se'' is a source of risk." (''What Did We Learn from the Financial Crisis, the Great Recession, and the Pathetic Recovery?'' (PDF file), Alan S. Blinder, Princeton University, Griswold Center for Economic Policy Studies, Working Paper No. 243, November 2014.)


Possible future applications of concept


Five-fold increase in airplane safety since 1980s, but flight systems sometimes switch to unexpected "modes" on their own

In an article entitled "The Human Factor", William Langewiesche discusses the 2009 crash of Air France Flight 447 over the mid-Atlantic. He points out that, since the 1980s when the transition to automated cockpit systems began, safety has improved fivefold. Langewiesche writes, "In the privacy of the cockpit and beyond public view, pilots have been relegated to mundane roles as system managers." He quotes engineer Earl Wiener, who takes the humorous statement attributed to the Duchess of Windsor that one can never be too rich or too thin and adds "or too careful about what you put into a digital flight-guidance system." Wiener says that the effect of automation is typically to reduce the workload when it is light, but to increase it when it is heavy. Boeing engineer Delmar Fadden said that once capacities are added to flight management systems, they become impossibly expensive to remove because of certification requirements; but if unused, they may in a sense lurk in the depths unseen. ("The Human Factor," ''Vanity Fair'', William Langewiesche, September 17, 2014: " . . . pilots have been relegated to mundane roles as system managers, . . . Since the 1980s, when the shift began, the safety record has improved fivefold, to the current one fatal accident for every five million departures. No one can rationally advocate a return to the glamour of the past.")
Langewiesche cites industrial engineer Nadine Sarter who writes about "automation surprises," often related to system modes the pilot does not fully understand or that the system switches to on its own. In fact, one of the more common questions asked in cockpits today is, "What's it doing now?" In response to this, Langewiesche again points to the fivefold increase in safety and writes, "No one can rationally advocate a return to the glamour of the past."


Healthier interplay between theory and practice in which safety rules are sometimes changed?

From the article "A New Accident Model for Engineering Safer Systems," by Nancy Leveson, in ''Safety Science'', April 2004:
"However, instructions and written procedures are almost never followed exactly as operators strive to become more efficient and productive and to deal with time pressures. . . . . even in such highly constrained and high-risk environments as nuclear power plants, modification of instructions is repeatedly found and the violation of rules appears to be quite rational, given the actual workload and timing constraints under which the operators must do their job. In these situations, a basic conflict exists between error as seen as a deviation from the ''normative procedure'' and error as seen as a deviation from the rational and normally used ''effective procedure'' (Rasmussen and Pejtersen, 1994)."A New Accident Model for Engineering Safer Systems
Nancy Leveson, ''Safety Science'', Vol. 42, No. 4, April 2004. Paper based on research partially supported by National Science Foundation and NASA. " . . In fact, a common way for workers to apply pressure to management without actually going out on strike is to 'work to rule,' which can lead to a breakdown in productivity and even chaos. . "


See also

* Unintended consequences




Further reading

* Gross, Michael Joseph (May 29, 2015). "Life and Death at Cirque du Soleil," ''Vanity Fair''. This article states: " . . . A system accident is one that requires many things to go wrong in a cascade. Change any element of the cascade and the accident may well not occur, but every element shares the blame. . . "
* ''Beyond Engineering: A New Way of Thinking About Technology'', Todd La Porte, Karlene Roberts, and Gene Rochlin, Oxford University Press, 1997. This book provides counter-examples of complex systems which have good safety records.
* Pidgeon, Nick (September 22, 2011). "In retrospect: Normal accidents," ''Nature''.
* Roush, Wade Edmund. ''Catastrophe and Control: How Technological Disasters Enhance Democracy'', Ph.D. dissertation, Massachusetts Institute of Technology, 1994, page 15: " . . ''Normal Accidents'' is essential reading today for industrial managers, organizational sociologists, historians of technology, and interested lay people alike, because it shows that a major strategy engineers have used in this century to keep hazardous technologies under control—multiple layers of 'fail-safe' backup devices—often adds a dangerous level of unpredictability to the system as a whole. . "
* Wallace, Brendan (2009). ''Beyond Human Error''. Florida: CRC Press. ISBN 978-0-8493-2718-6.