A high reliability organization (HRO) is an organization that has succeeded in avoiding catastrophes in an environment where normal accidents can be expected due to risk factors and complexity.
Important case studies in HRO research include both studies of disasters (e.g., the Three Mile Island nuclear incident, the Challenger and Columbia disasters, the Bhopal chemical leak, the Tenerife air crash, the Mann Gulch forest fire, and the Black Hawk friendly fire incident in Iraq) and studies of HROs such as the air traffic control system, naval aircraft carriers, and nuclear power operations.
History
HRO theory flowed out of Normal Accident Theory, which led a group of researchers at the University of California, Berkeley (Todd LaPorte, Gene Rochlin, and Karlene Roberts) to study how organizations working with complex and hazardous systems operated error free. They researched three organizations: United States nuclear aircraft carriers (in partnership with Rear Admiral (ret.) Tom Mercer on the USS Carl Vinson), the Federal Aviation Administration's Air Traffic Control system (and commercial aviation more generally), and nuclear power operations (Pacific Gas and Electric's Diablo Canyon reactor).
The result of this initial work was a set of defining characteristics that HROs hold in common:
# "Hypercomplexity" – extreme variety of components, systems, and levels.
# Tight coupling – reciprocal interdependence across many units and levels.
# Extreme hierarchical differentiation – multiple levels, each with its own elaborate control and regulating mechanisms.
# Large numbers of decision makers in complex communication networks – characterized by redundancy in control and information systems.
# Degree of accountability that does not exist in most organizations – substandard performance or deviations from standard procedures meet with severe adverse consequences.
# High frequency of immediate feedback about decisions.
# Compressed time factors – cycles of major activities are measured in seconds.
# More than one critical outcome that must happen simultaneously – simultaneity signifies both the complexity of operations as well as the inability to withdraw or modify operations decisions.
While many organizations display some of these characteristics, HROs display them all simultaneously.
Normal Accident and HRO theorists agree that interactive complexity and tight coupling can, theoretically, lead to a system accident. However, they hold different opinions on whether those system accidents are inevitable or manageable. HRO theorists hold that serious accidents in high-risk, hazardous operations can be prevented through a combination of organizational design, culture, management, and human choice. Theorists of both schools place a lot of emphasis on human interaction with the system as either the cause (Normal Accident Theory, NAT) or the prevention (HRO) of a system accident.
High reliability organization theory and HROs are often contrasted against Charles Perrow's Normal Accident Theory (see Sagan for a comparison of HRO and NAT). NAT represents Perrow's attempt to translate his understanding of the disaster at the Three Mile Island nuclear facility into a more general formulation of accidents and disasters. Perrow's 1984 book also included chapters on petrochemical plants, aviation accidents, naval accidents, "earth-based system" accidents (dam breaks, earthquakes), and "exotic" accidents (genetic engineering, military operations, and space flight). At Three Mile Island the technology was tightly coupled due to time-dependent processes, invariant sequences, and limited slack. The events that spread through this technology were invisible concatenations that were impossible to anticipate and cascaded in an interactively complex manner. Perrow hypothesized that regardless of the effectiveness of management and operations, accidents in systems that are characterized by tight coupling and interactive complexity will be normal or inevitable as they often cannot be foreseen or prevented. This pessimistic view, described by some theorists as unashamedly technologically deterministic, contrasts with the more optimistic view of HRO proponents, who argued that high-risk, high-hazard organizations can function safely despite the hazards of complex systems. Despite their differences, NAT and HRO theory share a focus on the social and organizational underpinnings of system safety and accident causation/prevention.
As research continued, a body of knowledge emerged based on the study of a variety of organizations. For example, a fire incident command system, Loma Linda Hospital's Pediatric Intensive Care Unit, and the California Independent System Operator were all studied as examples of HROs.
Although they may seem diverse, these organizations have a number of similarities. First, they operate in unforgiving social and political environments. Second, their technologies are risky and present the potential for error. Third, the severity and scale of possible consequences from errors or mistakes preclude learning through experimentation. Finally, these organizations all use complex processes to manage complex technologies and complex work to avoid failure. HROs share many properties with other high-performing organizations, including highly trained personnel, continuous training, effective reward systems, frequent process audits, and continuous improvement efforts. Other properties are more distinctive: an organization-wide sense of vulnerability; a widely distributed sense of responsibility and accountability for reliability; concern about misperception, misconception, and misunderstanding that is generalized across a wide set of tasks, operations, and assumptions; pessimism about possible failures; and redundancy and a variety of checks and counter checks as a precaution against potential mistakes.
Defining high reliability and specifying what constitutes an HRO has presented some challenges. Roberts [Roberts, K. H. (1990). Some Characteristics of High-Reliability Organizations. Organization Science, 1, 160-177.] initially proposed that high reliability organizations are a subset of hazardous organizations that have enjoyed a record of high safety over long periods of time. Specifically, she argued that: “One can identify this subset by answering the question, ‘how many times could this organization have failed resulting in catastrophic consequences that it did not?’ If the answer is on the order of tens of thousands of times the organization is ‘high reliability’” (p. 160). More recent definitions have built on this starting point but emphasized the dynamic nature of producing reliability (i.e., constantly seeking to improve reliability and intervening both to prevent errors and failures and to cope and recover quickly should errors become manifest). Some researchers view HROs as reliability-seeking rather than reliability-achieving. Reliability-seeking organizations are not distinguished by their absolute errors or accident rate, but rather by their “effective management of innately risky technologies through organizational control of both hazard and probability” (p. 14). Consequently, the phrase "high reliability" has come to mean that high risk and high effectiveness can co-exist for organizations that must perform well under trying conditions, and that it takes intensive effort to do so.
While the early research focused on high-risk industries, others expressed interest in HROs and sought to emulate their success. A key turning point was Karl Weick, Kathleen M. Sutcliffe, and David Obstfeld's reconceptualization of the literature on high reliability. These researchers systematically reviewed the case study literature on HROs and illustrated how the infrastructure of high reliability was grounded in processes of collective mindfulness, which are indicated by a preoccupation with failure, reluctance to simplify interpretations, sensitivity to operations, commitment to resilience, and deference to expertise. In other words, HROs are distinctive because of their efforts to organize in ways that increase the quality of attention across the organization, thereby enhancing people's alertness to and awareness of details so that they can detect subtle ways in which contexts vary and call for contingent responding (i.e., collective mindfulness). This construct was elaborated and refined as mindful organizing in Weick and Sutcliffe's 2001 and 2007 editions of their book Managing the Unexpected. Mindful organizing forms a basis for individuals to interact continuously as they develop, refine, and update a shared understanding of the situation they face and their capabilities to act on that understanding. Mindful organizing proactively triggers actions that forestall and contain errors and crises, and requires leaders and employees to pay close attention to shaping the social and relational infrastructure of the organization. They establish a set of interrelated organizing processes and practices, which jointly contribute to the system's (e.g., team, unit, organization) overall safety culture.
Characteristics
Successful organizations in high-risk industries continually "reinvent" themselves. For example, when an incident command team realizes what they thought was a garage fire has now changed into a hazardous material incident, they completely restructure their response organization.
Five characteristics of HROs have been identified as responsible for the "mindfulness" that keeps them working well when facing unexpected situations.
;Preoccupation with failure: ''HROs treat anomalies as symptoms of a problem with the system''. The latent organizational weaknesses that contribute to small errors can also contribute to larger problems, so ''errors are reported promptly'' so that problems can be found and fixed.
;Reluctance to simplify interpretations: HROs take deliberate steps to comprehensively understand the work environment as well as a specific situation. They are cognizant that the operating environment is very complex, so they look across system boundaries to determine the path of problems (where they started, where they may end up) and value a diversity of experience and opinions.
;Sensitivity to operations: HROs are continuously sensitive to unexpected changed conditions. They monitor the systems’ safety and security barriers and controls to ensure they remain in place and operate as intended. ''Situational awareness'' is extremely important to HROs.
;Commitment to resilience: HROs develop the capability to detect, contain, and recover from errors. Errors will happen, but HROs are not paralyzed by them.
;Deference to expertise: HROs follow a typical communication hierarchy during routine operations, but defer to the person with the expertise to solve the problem during upset conditions. ''During a crisis, decisions are made at the front line'' and authority migrates to the person who can solve the problem, regardless of their hierarchical rank.
Although the original research and early application of HRO theory into practice occurred in high-risk industries, research now covers a wide variety of applications and settings. Health care has been the largest practitioner area for the past several years. The application of Crew Resource Management is another area of focus for leaders in HROs, requiring competent behavior systems measurement and intervention.
Wildfires create complex and very dynamic mega-crisis situations across the globe every year. U.S. wildland firefighters, often organized using the Incident Command System into flexible inter-agency incident management teams, are not only called upon to "bring order to chaos" in today's mega-fires, they also are requested on "all-hazard events" like hurricanes, floods and earthquakes. The U.S. Wildland Fire Lessons Learned Center has been providing education and training to the wildland fire community on high reliability since 2002.
HRO behaviors can be developed into high-functioning skills of anticipation and resilience. Learning organizations that strive for high performance in the things they can plan for can become HROs that are better able to manage unexpected events, which by definition cannot be planned for.