A system accident (or normal accident) is an "unanticipated interaction of multiple failures" in a complex system. This complexity can lie in the technology or in the human organization, and is frequently in both. A system accident can be easy to see in hindsight but extremely difficult to foresee, because there are simply too many action pathways to seriously consider all of them.
Charles Perrow first developed these ideas in the mid-1980s. Safety systems themselves are sometimes the added complexity which leads to this type of accident.
Pilot and author
William Langewiesche used Perrow's concept in his analysis of the factors at play in a 1996 aviation disaster. He wrote in ''The Atlantic'' in 1998: "the control and operation of some of the riskiest technologies require organizations so complex that serious failures are virtually guaranteed to occur."
Characteristics and overview
In 2012 Charles Perrow wrote, "A normal accident [system accident] is where everyone tries very hard to play safe, but unexpected interaction of two or more failures (because of interactive complexity), causes a cascade of failures (because of tight coupling)." Perrow uses the term ''normal accident'' to emphasize that, given the current level of technology, such accidents are highly likely over a number of years or decades.
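Perrow's claim is, at bottom, probabilistic, and a rough calculation makes it concrete. The sketch below is only an illustration: the component count, per-component failure rate, and the rule that any two same-day failures cascade are assumptions chosen for the example, not figures from Perrow. It shows how a system whose individual parts are each quite reliable can still make an eventual cascade likely over decades of operation.

```python
from math import comb

# Toy illustration of the "normal accident" argument. All numbers are
# illustrative assumptions: many components, each with a rare, independent
# chance of failing on any given day, and "tight coupling" modeled crudely as
# "two or more failures on the same day interact and cascade".

n_components = 50        # an interactively complex system
p_daily = 1e-4           # each component fails roughly once every 27 years
days = 30 * 365          # a few decades of operation

# Probability that a given day sees zero or exactly one component failure.
p0 = (1 - p_daily) ** n_components
p1 = comb(n_components, 1) * p_daily * (1 - p_daily) ** (n_components - 1)
p_cascade_day = 1 - p0 - p1                    # two or more simultaneous failures

# Probability of at least one such day over the whole operating horizon.
p_accident = 1 - (1 - p_cascade_day) ** days
print(f"P(cascade on any given day)      = {p_cascade_day:.2e}")
print(f"P(system accident within 30 yrs) = {p_accident:.1%}")   # roughly 10-15%
```

Under these assumptions each part fails only about once every 27 years and any single failure is handled safely, yet the sheer number of parts and days makes a coincidence of two failures, and hence a cascade, far from negligible.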
James Reason extended this approach with human reliability and the Swiss cheese model, now widely accepted in aviation safety and healthcare.
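The probabilistic intuition behind the Swiss cheese model can be shown with a short sketch; the number of defensive layers and the per-layer failure probabilities below are made-up assumptions. An accident occurs only when the "holes" in every layer happen to line up for the same hazard.

```python
import random

# Sketch of the Swiss cheese model's intuition (layer count and per-layer
# failure probabilities are made-up assumptions): a hazard becomes an accident
# only if it slips through every independent defensive layer.

LAYER_HOLE_P = [0.20, 0.10, 0.15, 0.10]   # chance each barrier fails for a given hazard
TRIALS = 500_000

def hazard_penetrates_all_layers() -> bool:
    # Each layer independently either stops the hazard or lets it through.
    return all(random.random() < p for p in LAYER_HOLE_P)

accidents = sum(hazard_penetrates_all_layers() for _ in range(TRIALS))

analytic = 1.0
for p in LAYER_HOLE_P:
    analytic *= p                          # 0.20 * 0.10 * 0.15 * 0.10 = 3e-4

print(f"Simulated accident rate per hazard: {accidents / TRIALS:.2e}")
print(f"Analytic product of layer failures: {analytic:.2e}")
```

The key idealization is independence between layers. Perrow's interactive complexity and tight coupling describe precisely the situations in which a single event can open holes in several layers at once, so the real risk can be far higher than the simple product suggests.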
These accidents often resemble Rube Goldberg devices in the way that small errors of judgment, flaws in technology, and insignificant damages combine to form an emergent disaster. Langewiesche writes of "an entire pretend reality that includes unworkable chains of command, unlearnable training programs, unreadable manuals, and the fiction of regulations, checks, and controls."
More formality and effort to get things exactly right can, at times, actually make failure more likely. For example, employees are more likely to delay reporting changes, problems, and unexpected conditions when the organizational procedures for adjusting to changing conditions are complex, difficult, or laborious.
A contrasting idea is that of the high reliability organization. In his assessment of the vulnerabilities of complex systems, Scott Sagan, for example, discusses in multiple publications their robust reliability, especially regarding nuclear weapons. ''The Limits of Safety'' (1993) provided an extensive review of close calls during the Cold War that could have resulted in a nuclear war by accident.
System accident examples
Apollo 13
The Apollo 13 Review Board, in the introduction to chapter five of its report, concluded that the accident was not the result of a chance malfunction but rather of an unusual combination of mistakes coupled with a somewhat deficient and unforgiving design.
Three Mile Island accident
Perrow considered the Three Mile Island accident ''normal'': in a complex, tightly coupled system, multiple faults that cannot be avoided by planning, and that operators cannot immediately comprehend, are bound to occur, so the accident was not simply the product of operator error or poor regulation.
ValuJet Flight 592
On May 11, 1996, ValuJet Flight 592, a regularly scheduled ValuJet Airlines flight from Miami International to Hartsfield–Jackson Atlanta, crashed about 10 minutes after takeoff as a result of a fire in the cargo compartment caused by improperly stored and labeled hazardous cargo. All 110 people on board died. The airline had a poor safety record before the crash. The accident brought widespread attention to the airline's management problems, including inadequate training of employees in the proper handling of hazardous materials. The maintenance manual for the MD-80 aircraft documented the necessary procedures and was "correct" in a sense, but it was so huge that it was neither helpful nor informative.
Financial crises and investment losses
In a 2014 monograph, economist Alan Blinder stated that complicated financial instruments made it hard for potential investors to judge whether the price was reasonable. In a section entitled "Lesson #6: Excessive complexity is not just anti-competitive, it's dangerous", he further stated, "But the greater hazard may come from opacity. When investors don't understand the risks that inhere in the securities they buy (examples: the mezzanine tranche of a CDO-Squared; a CDS on a synthetic CDO ...), big mistakes can be made–especially if rating agencies tell you they are triple-A, to wit, safe enough for grandma. When the crash comes, losses may therefore be much larger than investors dreamed imaginable. Markets may dry up as no one knows what these securities are really worth. Panic may set in. Thus complexity ''per se'' is a source of risk."
Continuing challenges
Air transport safety
Despite a significant increase in airplane safety since the 1980s, there is concern that automated flight systems have become so complex that they both add to the risks that arise from overcomplication and are incomprehensible to the crews who must work with them. As an example, professionals in the aviation industry note that such systems sometimes switch modes or engage on their own; the crew in the cockpit is not necessarily privy to the rationale for the auto-engagement, causing perplexity. Langewiesche cites industrial engineer Nadine Sarter, who writes about "automation surprises", often related to system modes the pilot does not fully understand or that the system switches to on its own. In fact, one of the more common questions asked in cockpits today is, "What's it doing now?" In response to this, Langewiesche points to the fivefold increase in aviation safety and writes, "No one can rationally advocate a return to the glamour of the past."
In an article entitled "The Human Factor", Langewiesche discusses the 2009 crash of Air France Flight 447 over the mid-Atlantic. He points out that, since the 1980s, when the transition to automated cockpit systems began, safety has improved fivefold. Langewiesche writes, "In the privacy of the cockpit and beyond public view, pilots have been relegated to mundane roles as system managers." He quotes engineer Earl Wiener, who takes the humorous statement attributed to the Duchess of Windsor that one can never be too rich or too thin and adds "or too careful about what you put into a digital flight-guidance system." Wiener says that the effect of automation is typically to reduce the workload when it is light but to increase it when it is heavy.
Boeing engineer Delmar Fadden said that once capabilities are added to flight management systems, they become impossibly expensive to remove because of certification requirements; if unused, they may in a sense lurk in the depths unseen.
Theory and practice interplay
Human factors in the implementation of safety procedures play a role in the overall effectiveness of safety systems. Maintenance problems are common with redundant systems: maintenance crews can fail to restore a redundant system to active status, they may be overworked, or maintenance may be deferred because of budget cuts, since managers know that the system will continue to operate without the backup being fixed. In practice, procedural steps may be changed and adapted away from the formal safety rules, often in ways that seem appropriate and rational and that may even be essential for meeting time constraints and work demands. In a 2004 ''Safety Science'' article reporting on research partially supported by the National Science Foundation and NASA, Nancy Leveson makes the same point: instructions and written procedures are almost never followed exactly, as operators strive to become more efficient and productive and to deal with time pressures.
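The cost of a neglected backup is easy to quantify in a toy model; the failure and maintenance figures below are illustrative assumptions, not data from Leveson's study or any particular system. Redundancy delivers its promised protection only if the standby unit has actually been restored to active status.

```python
# Toy comparison of nominal versus actual protection from a redundant backup.
# All rates are illustrative assumptions.

p_primary_fails_per_year = 0.10   # how often the backup is actually needed
p_backup_fails_on_demand = 0.01   # a healthy, maintained backup still fails sometimes
p_backup_latent_failure  = 0.30   # backup silently out of service (never restored)

# Nominal view: redundancy should make a total outage rare.
p_outage_nominal = p_primary_fails_per_year * p_backup_fails_on_demand

# Actual view: the backup may already be down and nobody has noticed.
p_backup_unavailable = (p_backup_latent_failure
                        + (1 - p_backup_latent_failure) * p_backup_fails_on_demand)
p_outage_actual = p_primary_fails_per_year * p_backup_unavailable

print(f"Yearly outage probability, maintained backup: {p_outage_nominal:.4f}")
print(f"Yearly outage probability, neglected backup : {p_outage_actual:.4f}")
```

With these made-up numbers the nominal design promises a total outage about once in a thousand years, while the unnoticed, unrestored backup makes it roughly thirty times more likely.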
See also
* Unintended consequences
Further reading
* Gross, Michael Joseph (May 29, 2015). "Life and Death at Cirque du Soleil". ''Vanity Fair''. The article states: "... A system accident is one that requires many things to go wrong in a cascade. Change any element of the cascade and the accident may well not occur, but every element shares the blame..."
* ''Beyond Engineering: A New Way of Thinking About Technology'', Todd La Porte, Karlene Roberts, and Gene Rochlin, Oxford University Press, 1997. This book provides counter-examples of complex systems which have good safety records.
* Pidgeon, Nick (September 22, 2011). "In retrospect: Normal accidents". ''Nature''.