In reliability engineering, the term availability has the following meanings:
* The degree to which a system, subsystem or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at an unknown, ''i.e.'' a random, time.
* The probability that an item will operate satisfactorily at a given point in time when used under stated conditions in an ideal support environment.
Normally high availability systems might be specified as 99.98%, 99.999% or 99.9996%. The converse, unavailability, is 1 minus the availability.
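To make such targets concrete, the following minimal sketch converts an availability figure into the downtime it allows per year. The 8760-hour year and the function names are assumptions made for illustration only.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year of 8760 hours


def unavailability(availability: float) -> float:
    """Unavailability is 1 minus the availability."""
    return 1.0 - availability


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year, in minutes, implied by an availability target."""
    return unavailability(availability) * HOURS_PER_YEAR * 60


# The three example targets quoted in the text.
for target in (0.9998, 0.99999, 0.999996):
    minutes = downtime_minutes_per_year(target)
    print(f"{target:.4%} available -> {minutes:.2f} minutes of downtime per year")
```

Under these assumptions a 99.999% target leaves roughly five minutes of downtime per year.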
Representation
The simplest representation of availability (''A'') is the ratio of the expected value of the uptime of a system to the aggregate of the expected values of up and down time (which together make up the "total amount of time" ''C'' of the observation window):
: A = \frac{E[\text{uptime}]}{E[\text{uptime}] + E[\text{downtime}]} = \frac{E[\text{uptime}]}{C}
Another equation for availability (''A'') is the ratio of the Mean Time To Failure (MTTF) to the Mean Time Between Failures (MTBF), or
: A = \frac{MTTF}{MTBF} = \frac{MTTF}{MTTF + MTTR}
where MTTR is the mean time to repair.
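As an illustration of the two ratios, the sketch below estimates availability from a hypothetical observation log. The durations are invented for the example, and sample means stand in for the expected values, so the results are estimates rather than exact figures.

```python
from statistics import mean

# Hypothetical observation log, in hours: each up period is followed by one down period.
up_periods = [700.0, 650.0, 720.0, 690.0]
down_periods = [2.0, 1.5, 3.0, 1.5]


def availability_from_totals(up, down):
    """Uptime divided by total observed time (uptime plus downtime)."""
    total_up = sum(up)
    total_down = sum(down)
    return total_up / (total_up + total_down)


def availability_from_means(up, down):
    """MTTF / (MTTF + MTTR), with both means estimated from the log."""
    mttf = mean(up)    # estimated mean time to failure
    mttr = mean(down)  # estimated mean time to repair
    return mttf / (mttf + mttr)


print(availability_from_totals(up_periods, down_periods))
print(availability_from_means(up_periods, down_periods))
```

When the log contains the same number of up and down periods, the two estimates coincide, because dividing both totals by the number of cycles does not change the ratio.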
If we define the status function ''X''(''t'') as
: X(t) = \begin{cases} 1, & \text{if the system functions at time } t \\ 0, & \text{otherwise} \end{cases}
then the availability ''A''(''t'') at time ''t'' > 0 is represented by
: A(t) = \Pr[X(t) = 1] = E[X(t)].
Average availability must be defined on an interval of the real line. If we consider an arbitrary constant ''c'' > 0, then average availability is represented as
: A_c = \frac{1}{c} \int_0^c A(t)\,dt.
Limiting (or steady-state) availability is represented by
: A = \lim_{c \to \infty} A_c.
Limiting average availability is also defined on an interval [0, ''c''] as
: A_\infty = \lim_{c \to \infty} A_c = \lim_{c \to \infty} \frac{1}{c} \int_0^c A(t)\,dt, \quad c > 0.
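The time average of the status function can be illustrated with a small simulation. The sketch below is not taken from the literature cited in this article: it assumes that up and down durations are exponentially distributed, and it estimates the average availability over [0, c] from a single simulated sample path. The function name and parameters are hypothetical.

```python
import random


def simulate_average_availability(mttf, mttr, horizon, seed=0):
    """Time average of X(t) over [0, horizon] for one simulated sample path.

    Assumption: up and down durations are exponential with means mttf and mttr.
    """
    rng = random.Random(seed)
    t = 0.0
    uptime = 0.0
    up = True  # the system starts in the operating state, X(0) = 1
    while t < horizon:
        mean_duration = mttf if up else mttr
        duration = rng.expovariate(1.0 / mean_duration)
        end = min(t + duration, horizon)  # truncate the last period at the horizon
        if up:
            uptime += end - t
        t = end
        up = not up  # alternate between operating and under repair
    return uptime / horizon


estimate = simulate_average_availability(mttf=100.0, mttr=5.0, horizon=1_000_000.0, seed=1)
steady_state = 100.0 / (100.0 + 5.0)
print(f"simulated average: {estimate:.4f}, MTTF / (MTTF + MTTR): {steady_state:.4f}")
```

For a long observation window the simulated average should approach MTTF / (MTTF + MTTR), which is the steady-state value under these assumptions.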
Availability is the probability that an item will be in an operable and committable state at the start of a mission when the mission is called for at a random time, and is generally defined as uptime divided by total time (uptime plus downtime).
Series vs Parallel components
Suppose a system is composed of components A, B and C connected in series. Then the following formula applies:
Availability of series system = (availability of component A) × (availability of component B) × (availability of component C)
Therefore, the combined availability of multiple components in series is never higher than the availability of the least available component.
On the other hand, the following formula applies to parallel components, provided that they fail independently:
Availability of parallel components = 1 - (1 - availability of component A) × (1 - availability of component B) × (1 - availability of component C)
As a corollary, if you have N parallel components, each with availability X, then:
Availability of parallel components = 1 - (1 - X)^N
Using parallel components can greatly increase the availability of the overall system, because the combined unavailability (1 - X)^N shrinks exponentially with N. For example, if each of your hosts has only 50% availability, then by using 10 hosts in parallel you can achieve 99.9023% availability.
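The two formulas can be sketched in a few lines of Python. The function names are illustrative, and the parallel case assumes that components fail independently.

```python
from math import prod


def series_availability(availabilities):
    """All components must be up: multiply the individual availabilities."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """At least one component must be up: 1 minus the product of the unavailabilities.

    Assumption: the components fail independently of each other.
    """
    return 1.0 - prod(1.0 - a for a in availabilities)


# Ten hosts in parallel, each with 50% availability.
print(f"{parallel_availability([0.5] * 10):.4%}")

# Three components in series, each with 99% availability.
print(f"{series_availability([0.99, 0.99, 0.99]):.4%}")
```

The series result is lower than any single component's availability, while the parallel result is higher.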
Note that redundancy doesn't always lead to higher availability. Redundancy increases complexity, which in turn can reduce availability. According to Marc Brooker, to take advantage of redundancy, ensure that:
# You achieve a net-positive improvement in the overall availability of your system
# Your redundant components fail independently
# Your system can reliably detect healthy redundant components
# Your system can reliably scale out and scale in redundant components.
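The first condition can be checked with a rough model. The sketch below is an illustrative assumption and not Brooker's own formulation: it treats the failover or health-checking mechanism as one extra component in series with N independent parallel hosts, and compares the result with a single host.

```python
def redundant_system_availability(host_availability, n_hosts, failover_availability):
    """Availability of n independent hosts in parallel behind a failover mechanism.

    Assumption: the failover mechanism is modelled as a component in series.
    """
    parallel = 1.0 - (1.0 - host_availability) ** n_hosts
    return failover_availability * parallel


def is_net_positive(host_availability, n_hosts, failover_availability):
    """True if the redundant design beats a single host on its own."""
    combined = redundant_system_availability(host_availability, n_hosts, failover_availability)
    return combined > host_availability


# A reliable failover mechanism in front of modest hosts helps.
print(is_net_positive(host_availability=0.99, n_hosts=2, failover_availability=0.999))

# A failover mechanism that is less available than the hosts makes things worse.
print(is_net_positive(host_availability=0.999, n_hosts=2, failover_availability=0.998))
```

In this model the added mechanism caps the achievable availability, so redundancy only pays off when that mechanism is more available than the hosts it protects.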
Methods and techniques to model availability
Reliability block diagrams or fault tree analysis are developed to calculate the availability of a system, or of a functional failure condition within a system, taking into account many factors such as:
* Reliability models
* Maintainability models
* Maintenance concepts
* Redundancy
* Common cause failure
* Diagnostics
* Level of repair
* Repair status
* Dormant failures
* Test coverage
* Active operational times / missions / sub system states
* Logistical aspects such as spare part (stocking) levels at different depots, transport times, repair times at different repair lines, manpower availability and more
* Uncertainty in parameters
Furthermore, these methods are capable of identifying the most critical items and the failure modes or events that impact availability.
Definitions within systems engineering
Availability, inherent (Ai)
The probability that an item will operate satisfactorily at a given point in time when used under stated conditions in an ideal support environment. It excludes logistics time, waiting or administrative downtime, and preventive maintenance downtime. It includes corrective maintenance downtime.
Inherent availability is generally derived from analysis of an engineering design:
# The impact of a repairable element (refurbishing/remanufacture isn't repair, but rather replacement) on the availability of the system in which it operates equals MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair.
# The impact of a one-off/non-repairable element (which could be refurbished/remanufactured) on the availability of the system in which it operates equals MTTF / (MTTF + MTTR), where MTTF is the mean time to failure and MTTR is the mean time to repair.
It is based on quantities under control of the designer.
Availability, achieved (Aa)
The probability that an item will operate satisfactorily at a given point in time when used under stated conditions in an ideal support environment (i.e., that personnel, tools, spares, etc. are instantaneously available). It excludes logistics time and waiting or administrative downtime. It includes active preventive and corrective maintenance downtime.
Availability, operational (Ao)
The probability that an item will operate satisfactorily at a given point in time when used in an actual or realistic operating and support environment. It includes logistics time, ready time, and waiting or administrative downtime, and both preventive and corrective maintenance downtime. This value is equal to the mean time between failures (MTBF) divided by the sum of the MTBF and the mean downtime (MDT). This measure extends the definition of availability to elements controlled by the logisticians and mission planners, such as the quantity and proximity of spares, tools and manpower to the hardware item.
Refer to Systems engineering for more details.
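The difference between inherent and operational availability can be sketched numerically using the two ratios given above, MTBF / (MTBF + MTTR) and MTBF / (MTBF + MDT). The input figures are hypothetical and chosen only to show the effect of logistics and administrative delays.

```python
def inherent_availability(mtbf, mttr):
    """Ai = MTBF / (MTBF + MTTR): only corrective repair time counts as downtime."""
    return mtbf / (mtbf + mttr)


def operational_availability(mtbf, mdt):
    """Ao = MTBF / (MTBF + MDT): all downtime counts, including logistics and waiting."""
    return mtbf / (mtbf + mdt)


# Hypothetical figures, in hours.
mtbf = 1000.0  # mean time between failures
mttr = 2.0     # mean time to repair (hands-on repair only)
mdt = 10.0     # mean downtime (repair plus logistics and administrative delay)

print(f"Ai = {inherent_availability(mtbf, mttr):.4%}")
print(f"Ao = {operational_availability(mtbf, mdt):.4%}")
```

Because the mean downtime includes the repair time plus additional delays, operational availability computed this way cannot exceed inherent availability for the same MTBF.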
Basic example
If we are using equipment which has a mean time to failure (MTTF) of 81.5 years and mean time to repair (MTTR) of 1 hour:
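: MTTF in hours = 81.5 × 365 × 24 = 713,940 (This is a reliability parameter and often has a high level of uncertainty!)
: Inherent availability (Ai) = 713,940 / (713,940 + 1) = 713,940 / 713,941 ≈ 99.999860%
: Inherent unavailability = 1 / 713,941 ≈ 0.000140%
Outage due to equipment in hours per year = failure rate × MTTR = (1 / MTTF in years) × 1 hour = 1 / 81.5 ≈ 0.01227 hours per year.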
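The arithmetic of this example can be reproduced with a short script. This is a minimal sketch that assumes a 365-day year and uses the MTTF / (MTTF + MTTR) ratio for inherent availability; the variable names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year of 8760 hours

mttf_years = 81.5  # mean time to failure, from the example
mttr_hours = 1.0   # mean time to repair, from the example

mttf_hours = mttf_years * HOURS_PER_YEAR
inherent_availability = mttf_hours / (mttf_hours + mttr_hours)
inherent_unavailability = 1.0 - inherent_availability

# Expected outage per year is the unavailable fraction of the hours in a year.
outage_hours_per_year = inherent_unavailability * HOURS_PER_YEAR

print(f"MTTF in hours:           {mttf_hours:.0f}")
print(f"Inherent availability:   {inherent_availability:.6%}")
print(f"Inherent unavailability: {inherent_unavailability:.6%}")
print(f"Outage hours per year:   {outage_hours_per_year:.5f}")
```

Since the MTTF estimate is often highly uncertain, it can be instructive to rerun the calculation with a range of MTTF values.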
Literature
Availability is well established in the literature of stochastic modeling and optimal maintenance. Barlow and Proschan [1975] define availability of a repairable system as "the probability that the system is operating at a specified time t." Blanchard [1998] gives a qualitative definition of availability as "a measure of the degree of a system which is in the operable and committable state at the start of mission when the mission is called for at an unknown random point in time." This definition comes from the MIL-STD-721. Lie, Hwang, and Tillman [1977] developed a complete survey along with a systematic classification of availability.
Availability measures are classified by either the time interval of interest or the mechanisms for the system downtime. If the time interval of interest is the primary concern, we consider instantaneous, limiting, average, and limiting average availability. The aforementioned definitions are developed in Barlow and Proschan [1975], Lie, Hwang, and Tillman [1977], and Nachlas [1998]. The second primary classification for availability is contingent on the various mechanisms for downtime, such as the inherent availability, achieved availability, and operational availability (Blanchard [1998], Lie, Hwang, and Tillman [1977]). Mi [1998] gives some comparison results of availability considering inherent availability.
Availability considered in maintenance modeling can be found in Barlow and Proschan [1975] for replacement models, Fawzi and Hawkes [1991] for an R-out-of-N system with spares and repairs, Fawzi and Hawkes [1990] for a series system with replacement and repair, Iyer [1992] for imperfect repair models, Murdock [1995] for age replacement preventive maintenance models, Nachlas [1998, 1989] for preventive maintenance models, and Wang and Pham [1996] for imperfect maintenance models. A very comprehensive recent book is by Trivedi and Bobbio [2017].
Applications
Availability factor is used extensively in power plant engineering. For example, the North American Electric Reliability Corporation implemented the Generating Availability Data System in 1982.
See also
* Dependability
* Reliability engineering
* Safety engineering
* List of system quality attributes
* Spurious trip level
* Condition-based maintenance
* Fault reporting
* High availability