
Reliability engineering is a sub-discipline of systems engineering that emphasizes the ability of equipment to function without failure. Reliability describes the ability of a system or component to function under stated conditions for a specified period of time. Reliability is closely related to availability, which is typically described as the ability of a component or system to function at a specified moment or interval of time. The reliability function is theoretically defined as the
probability
of success at time t, which is denoted R(t). This probability is estimated from detailed (physics of failure) analysis, previous data sets or through reliability testing and reliability modelling. Availability, testability, maintainability and maintenance are often defined as a part of "reliability engineering" in reliability programs. Reliability often plays the key role in the cost-effectiveness of systems. Reliability engineering deals with the prediction, prevention and management of high levels of "lifetime" engineering uncertainty and risks of failure. Although stochastic
parameters define and affect reliability, reliability is not only achieved by mathematics and statistics. "Nearly all teaching and literature on the subject emphasize these aspects, and ignore the reality that the ranges of uncertainty involved largely invalidate quantitative methods for prediction and measurement." (O'Connor, Patrick D. T. (2002), Practical Reliability Engineering (Fourth Ed.), John Wiley & Sons, New York.) For example, it is easy to represent "probability of failure" as a symbol or value in an equation, but it is almost impossible to predict its true magnitude in practice, because it depends on a very large number of variables. Having an equation for reliability is therefore far from having an accurate predictive measurement of reliability. Reliability engineering relates closely to quality engineering,
safety engineering
and system safety, in that they use common methods for their analysis and may require input from each other. It can be said that a system must be reliably safe. Reliability engineering focuses on costs of failure caused by system downtime, cost of spares, repair equipment, personnel, and cost of warranty claims.
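
As a rough illustration of the reliability function R(t) and of availability described above, the following sketch assumes a constant failure rate, which gives the exponential model R(t) = exp(-λt), and uses the common steady-state approximation availability = MTBF / (MTBF + MTTR). Neither formula is stated in the text above; they are simplifying assumptions chosen for the example, and the function names and numeric values are hypothetical. In line with the caution above, the result is only as good as the assumed failure rate.

```python
import math


def reliability(t_hours: float, failure_rate_per_hour: float) -> float:
    """Probability of operating without failure up to time t.

    Assumes a constant failure rate (exponential model):
    R(t) = exp(-lambda * t). Real components often violate this.
    """
    if t_hours < 0 or failure_rate_per_hour < 0:
        raise ValueError("time and failure rate must be non-negative")
    return math.exp(-failure_rate_per_hour * t_hours)


def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time the system is operational.

    Assumes the approximation A = MTBF / (MTBF + MTTR), where MTBF is
    the mean time between failures and MTTR the mean time to repair.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    assumed_failure_rate = 1e-4  # failures per hour, illustrative value only
    for t in (100, 1_000, 10_000):
        print(f"R({t} h) = {reliability(t, assumed_failure_rate):.4f}")
    print(f"Availability = {steady_state_availability(990, 10):.4f}")
```

The two quantities answer different questions: R(t) is about surviving an interval without any failure, while availability accounts for repair and describes how often the system is up at a given moment.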


History

The word reliability can be traced back to 1816, and is first attested in the work of the poet Samuel Taylor Coleridge. Before World War II the term was linked mostly to repeatability; a test (in any type of science) was considered "reliable" if the same results would be obtained repeatedly. In the 1920s, product improvement through the use of statistical process control was promoted by Dr. Walter A. Shewhart at Bell Labs, around the time that Waloddi Weibull was working on statistical models for fatigue. The development of reliability engineering was here on a parallel path with quality. The modern use of the word reliability was defined by the U.S. military in the 1940s, characterizing a product that would operate when expected and for a specified period of time.

In World War II, many reliability issues were due to the inherent unreliability of electronic equipment available at the time, and to fatigue issues. In 1945, M.A. Miner published the seminal paper titled "Cumulative Damage in Fatigue" in an ASME journal. A main application for reliability engineering in the military was for the vacuum tube as used in radar systems and other electronics, for which reliability proved to be very problematic and costly. The IEEE formed the Reliability Society in 1948. In 1950, the United States Department of Defense formed a group called the "Advisory Group on the Reliability of Electronic Equipment" (AGREE) to investigate reliability methods for military equipment. This group recommended three main ways of working:

* Improve component reliability.
* Establish quality and reliability requirements for suppliers.
* Collect field data and find root causes of failures.

In the 1960s, more emphasis was given to reliability testing at the component and system level. The famous military standard MIL-STD-781 was created at that time. Around this period, the much-used predecessor to Military Handbook 217 was also published by RCA and was used to predict failure rates of electronic components. The emphasis on component reliability and empirical research (e.g. MIL-HDBK-217) alone slowly decreased, and more pragmatic approaches, as used in the consumer industries, came into use.

In the 1980s, televisions were increasingly made up of solid-state semiconductors. Automobiles rapidly increased their use of semiconductors with a variety of microcomputers under the hood and in the dash. Large air conditioning systems developed electronic controllers, as did microwave ovens and a variety of other appliances. Communications systems began to adopt electronics to replace older mechanical switching systems.
Bellcore issued the first consumer prediction methodology for telecommunications, and SAE developed a similar document, SAE870050, for automotive applications. The nature of predictions evolved during the decade, and it became apparent that die complexity was not the only factor that determined failure rates for integrated circuits (ICs). Kam Wong published a paper questioning the bathtub curve (see also reliability-centered maintenance). During this decade, the failure rate of many components dropped by a factor of 10. Software became important to the reliability of systems.

By the 1990s, the pace of IC development was picking up. Wider use of stand-alone microcomputers was common, and the PC market helped keep IC densities following Moore's law and doubling about every 18 months. Reliability engineering was now changing as it moved towards understanding the physics of failure. Failure rates for components kept dropping, but system-level issues became more prominent. Systems thinking became more and more important. For software, the Capability Maturity Model (CMM) was developed, which gave a more qualitative approach to reliability. ISO 9000 added reliability measures as part of the design and development portion of certification.

The expansion of the World Wide Web created new challenges of security and trust. The older problem of too little reliability information available had now been replaced by too much information of questionable value. Consumer reliability problems could now be discussed online in real time using data. New technologies such as micro-electromechanical systems (MEMS), handheld GPS, and hand-held devices that combined cell phones and computers all posed challenges to maintaining reliability. Product development time continued to shorten through this decade, and what had been done in three years was being done in 18 months. This meant that reliability tools and tasks had to be more closely tied to the development process itself. In many ways, reliability became part of everyday life and consumer expectations.


Overview


Objective

The objectives of reliability engineering, in decreasing order of priority, are:

# To apply engineering knowledge and specialist techniques to prevent or to reduce the likelihood or frequency of failures.
# To identify and correct the causes of failures that do occur despite the efforts to prevent them.
# To determine ways of coping with failures that do occur, if their causes have not been corrected.
# To apply methods for estimating the likely reliability of new designs, and for analysing reliability data.

The reason for the priority emphasis is that it is by far the most effective way of working, in terms of minimizing costs and generating reliable products. The primary skills that are required, therefore, are the ability to understand and anticipate the possible causes of failures, and knowledge of how to prevent them. It is also necessary to have knowledge of the methods that can be used for analysing designs and data.


Scope and techniques

Reliability engineering for "complex systems" requires a different, more elaborate systems approach than for non-complex systems. Reliability engineering may in that case involve:

* System availability and mission readiness analysis and related reliability and maintenance requirement allocation
* Functional system failure analysis and derived requirements specification
* Inherent (system) design reliability analysis and derived requirements specification for both hardware and software design
* System diagnostics design
* Fault tolerant systems (e.g. by redundancy)
* Predictive and preventive maintenance (e.g. reliability-centered maintenance)
* Human factors / human interaction / human errors
* Manufacturing- and assembly-induced failures (effect on the detected "0-hour quality" and reliability)
* Maintenance-induced failures
* Transport-induced failures
* Storage-induced failures
* Use (load) studies, component stress analysis, and derived requirements specification
* Software (systematic) failures
* Failure / reliability testing (and derived requirements)
* Field failure monitoring and corrective actions
* Spare parts stocking (availability control)
* Technical documentation, caution and warning analysis
* Data and information acquisition/organisation (creation of a general reliability development hazard log and FRACAS system)
* Chaos engineering

Effective reliability engineering requires understanding of the basics of failure mechanisms, for which experience, broad engineering skills and good knowledge from many different special fields of engineering are required, for example:

* Tribology
* Stress (mechanics)
* Fracture mechanics
/