HOME

TheInfoList



OR:

Failover is switching to a redundant or standby
computer A computer is a machine that can be programmed to Execution (computing), carry out sequences of arithmetic or logical operations (computation) automatically. Modern digital electronic computers can perform generic sets of operations known as C ...
server Server may refer to: Computing *Server (computing), a computer program or a device that provides functionality for other programs or devices, called clients Role * Waiting staff, those who work at a restaurant or a bar attending customers and su ...
,
system A system is a group of Interaction, interacting or interrelated elements that act according to a set of rules to form a unified whole. A system, surrounded and influenced by its environment (systems), environment, is described by its boundaries, ...
, hardware component or network upon the failure or
abnormal termination An abnormal end or abend is an abnormal termination of software, or a program crash. This usage derives from an error message from the IBM OS/360, IBM zOS operating systems. Usually capitalized, but may appear as "abend". Some common ABEND codes ...
of the previously active application, server, system, hardware component, or network in a
computer network A computer network is a set of computers sharing resources located on or provided by network nodes. The computers use common communication protocols over digital interconnections to communicate with each other. These interconnections are ...
. Failover and switchover are essentially the same operation, except that failover is automatic and usually operates without warning, while switchover requires human intervention.
Systems design Systems design interfaces, and data for an electronic control system to satisfy specified requirements. System design could be seen as the application of system theory to product development. There is some overlap with the disciplines of system a ...
ers usually provide failover capability in servers, systems or networks requiring near-continuous availability and a high degree of
reliability Reliability, reliable, or unreliable may refer to: Science, technology, and mathematics Computing * Data reliability (disambiguation), a property of some disk arrays in computer storage * High availability * Reliability (computer networking), a ...
. At the server level, failover automation usually uses a " heartbeat" system that connects two servers, either through using a separate cable (for example,
RS-232 In telecommunications, RS-232 or Recommended Standard 232 is a standard originally introduced in 1960 for serial communication transmission of data. It formally defines signals connecting between a ''DTE'' (''data terminal equipment'') such a ...
serial ports/cable) or a network connection. As long as a regular "pulse" or "heartbeat" continues between the main server and the second server, the second server will not bring its systems online. There may also be a third "spare parts" server that has running spare components for "hot" switching to prevent downtime. The second server takes over the work of the first as soon as it detects an alteration in the "heartbeat" of the first machine. Some systems have the ability to send a notification of failover. Certain systems, intentionally, do not failover entirely automatically, but require human intervention. This "automated with manual approval" configuration runs automatically once a human has approved the failover. Failback is the process of restoring a system, component, or service previously in a state of failure back to its original, working state, and having the standby system go from functioning back to standby. The use of
virtualization In computing, virtualization or virtualisation (sometimes abbreviated v12n, a numeronym) is the act of creating a virtual (rather than actual) version of something at the same abstraction level, including virtual computer hardware platforms, stor ...
software has allowed failover practices to become less reliant on physical hardware through the process referred to as
migration Migration, migratory, or migrate may refer to: Human migration * Human migration, physical movement by humans from one region to another ** International migration, when peoples cross state boundaries and stay in the host state for some minimum le ...
in which a running virtual machine is moved from one physical host to another, with little or no disruption in service.


History

The term "failover", although probably in use by engineers much earlier, can be found in a 1962 declassified
NASA The National Aeronautics and Space Administration (NASA ) is an independent agency of the US federal government responsible for the civil space program, aeronautics research, and space research. NASA was established in 1958, succeeding t ...
report. The term "switchover" can be found in the 1950s when describing '"Hot" and "Cold" Standby Systems', with the current meaning of immediate switchover to a running system (hot) and delayed switchover to a system that needs starting (cold). A conference proceedings from 1957 describes computer systems with both Emergency Switchover (i.e. failover) and Scheduled Failover (for maintenance).Proceedings of the Western Joint Computer Conference
Macmillan 1957


See also

*
Data integrity Data integrity is the maintenance of, and the assurance of, data accuracy and consistency over its entire Information Lifecycle Management, life-cycle and is a critical aspect to the design, implementation, and usage of any system that stores, proc ...
* Disaster recovery *
Fault-tolerance Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of one or more faults within some of its components. If its operating quality decreases at all, the decrease is proportional to the ...
*
Fencing (computing) Fencing is the process of isolating a node of a computer cluster or protecting shared resources when a node appears to be malfunctioning.''Sun Cluster environment: Sun Cluster 2.2'' by Enrique Vargas, Joseph Bianco, David Deeths 2001 ISBN page 58 ...
*
High-availability cluster High-availability clusters (also known as HA clusters, fail-over clusters) are groups of computers that support server applications that can be reliably utilized with a minimum amount of down-time. They operate by using high availability softwa ...
* Load balancing *
Log shipping Log shipping is the process of automating the backup of transaction log files on a primary (production) database server, and then restoring them onto a standby server. This technique is supported by Microsoft SQL Server,Safety engineering Safety engineering is an engineering discipline which assures that engineered systems provide acceptable levels of safety. It is strongly related to industrial engineering/systems engineering, and the subset system safety engineering. Safety en ...
*
teleportation (virtualization) In the context of virtualization, where a ''guest'' simulation of an entire computer is actually merely a software virtual machine (VM) running on a ''host'' computer under a hypervisor, migration (also known as teleportation) is the process by wh ...


References

{{compu-network-stub Computer networking Fault-tolerant computer systems