active redundancy
   HOME

TheInfoList



OR:

Active redundancy is a design concept that increases
operational availability Operational availability in systems engineering is a measurement of how long a system has been available to use when compared with how long it should have been available to be used. Definition Operational availability is a management concept that ...
and that reduces
operating cost Operating costs or operational costs, are the expenses which are related to the operation of a business, or to the operation of a device, component, piece of equipment or facility. They are the cost of resources used by an organization just to main ...
by automating most critical maintenance actions. This concept is related to
condition-based maintenance The technical meaning of maintenance involves functional checks, servicing, repairing or replacing of necessary devices, equipment, machinery, building infrastructure, and supporting utilities in industrial, business, and residential installa ...
and
fault reporting Fault reporting is a maintenance concept that increases operational availability and that reduces operating cost through three mechanisms. * Reduce labor-intensive diagnostic evaluation * Eliminate diagnostic testing down-time * Provide notificati ...
.


History

The initial requirement began with military combat systems during World War I. The approach used for survivability was to install thick armor plate to resist gun fire and install multiple guns. This became unaffordable and impractical during the Cold War when aircraft and missile systems became common. The new approach was to build distributed systems that continue to work when components are damaged. This depends upon very crude forms of artificial intelligence that perform reconfiguration by obeying specific rules. An example of this approach is the AN/UYK-43 computer. Formal design philosophies involving active redundancy are required for critical systems where corrective labor is undesirable or impractical to correct failure during normal operation. Commercial aircraft are required to have multiple redundant computing systems, hydraulic systems, and propulsion systems so that a single in-flight equipment failure will not cause loss of life. A more recent outcome of this work is the Internet, which relies on a backbone of routers that provide the ability to automatically re-route communication without human intervention when failures occur. Satellites placed into orbit around the earth must include massive active redundancy to ensure operation will continue for a decade or longer despite failures induced by normal failure, radiation-induced failure, and thermal shock. This strategy now dominates space systems, aircraft, and missile systems.


Principle

Maintenance requires three actions, which usually involve down time and high priority labor costs: * Automatic fault detection * Automatic fault isolation * Automatic reconfiguration Active redundancy eliminates down time and reduces manpower requirements by automating all three actions. This requires some amount of automated
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech re ...
. ''N'' stands for needed equipment. The amount of excess capacity affects overall system reliability by limiting the effects of failure. For example, if it takes two generators to power a city, then "N+1" would be three generators to allow a single failure. Similarly, "N+2" would be four generators, which would allow one generator to fail while a second generator has already failed. Active redundancy improves
operational availability Operational availability in systems engineering is a measurement of how long a system has been available to use when compared with how long it should have been available to be used. Definition Operational availability is a management concept that ...
as follows. :A_^ = 0.99 \ up \ time :::\approx failed \ 90 \ hours/year :A_^ = 1 - \left( (1 - A_^ ) \times (1 - A_^ ) \right) = 0.9999 \ up \ time ::::\approx failed \ 50 \ minutes/year :A_^ = 1 - \left( (1 - A_^ ) \times (1 - A_^ ) \times (1 - A_^ ) \right) = 0.999999 \ up \ time ::::\approx failed \ 30 \ seconds/year


Passive components

Active redundancy in passive components requires redundant components that share the burden when failure occurs, like in cabling and piping. This allows forces to be redistributed across a bridge to prevent failure if a vehicle ruptures a cable. This allows water flow to be redistributed through pipes when a limited number of valves are shut or pumps shut down.


Active components

Active redundancy in active components requires reconfiguration when failure occurs. Computer programming must recognize the failure and automatically reconfigure to restore operation. All modern computers provide the following when an existing feature is enabled via
fault reporting Fault reporting is a maintenance concept that increases operational availability and that reduces operating cost through three mechanisms. * Reduce labor-intensive diagnostic evaluation * Eliminate diagnostic testing down-time * Provide notificati ...
. * Automatic fault detection * Automatic fault isolation Mechanical devices must reconfigure, such as transmission settings on hybrid vehicles that have redundant propulsion systems. The petroleum engine will start up when battery power fails. Electrical power systems must perform two actions to prevent total system failure when smaller failures occur, such as when a tree falls across a power line. Power systems incorporate communication, switching, and automatic scheduling that allows these actions to be automated. * Shut down the damaged power line to isolate the failure * Adjust generator settings to prevent voltage and frequency excursions


Benefits

This is the only known strategy that can achieve high availability.


Detriments

This maintenance philosophy requires custom development with extra components.


See also

* Active redundancy *
Operational availability Operational availability in systems engineering is a measurement of how long a system has been available to use when compared with how long it should have been available to be used. Definition Operational availability is a management concept that ...
*
Fault reporting Fault reporting is a maintenance concept that increases operational availability and that reduces operating cost through three mechanisms. * Reduce labor-intensive diagnostic evaluation * Eliminate diagnostic testing down-time * Provide notificati ...


References

{{Reflist Maintenance Engineering concepts Reliability engineering Safety Fault-tolerant computer systems