
Uncertainty quantification (UQ) is the science of quantitative characterization and reduction of uncertainties in both computational and real-world applications. It tries to determine how likely certain outcomes are if some aspects of the system are not exactly known. An example would be to predict the acceleration of a human body in a head-on crash with another car: even if the speed was exactly known, small differences in the manufacturing of individual cars, how tightly every bolt has been tightened, etc., will lead to different results that can only be predicted in a statistical sense. Many problems in the natural sciences and engineering are also rife with sources of uncertainty.
Computer experiments on computer simulations are the most common approach to study problems in uncertainty quantification.

# Sources of uncertainty

Uncertainty can enter mathematical models and experimental measurements in various contexts. One way to categorize the sources of uncertainty is to consider:

* **Parameter uncertainty:** This comes from the model parameters that are inputs to the computer model (mathematical model) but whose exact values are unknown to experimentalists and cannot be controlled in physical experiments, or whose values cannot be exactly inferred by statistical methods. Some examples of this are the local free-fall acceleration in a falling object experiment, various material properties in a finite element analysis for engineering, and multiplier uncertainty in the context of macroeconomic policy optimization.
* **Parametric variability:** This comes from the variability of input variables of the model. For example, the dimensions of a work piece in a manufacturing process may not be exactly as designed and instructed, which would cause variability in its performance.
* **Structural uncertainty:** Also known as model inadequacy, model bias, or model discrepancy, this comes from the lack of knowledge of the underlying physics in the problem. It depends on how accurately a mathematical model describes the true system for a real-life situation, considering the fact that models are almost always only approximations to reality. One example is when modeling the process of a falling object using the free-fall model; the model itself is inaccurate since there always exists air friction. In this case, even if there is no unknown parameter in the model, a discrepancy is still expected between the model and true physics.
* **Algorithmic uncertainty:** Also known as numerical uncertainty or discrete uncertainty, this comes from numerical errors and numerical approximations in the implementation of the computer model. Most models are too complicated to solve exactly. For example, the finite element method or finite difference method may be used to approximate the solution of a partial differential equation (which introduces numerical errors). Other examples are numerical integration and infinite sum truncation, which are necessary approximations in numerical implementation.
* **Experimental uncertainty:** Also known as observation error, this comes from the variability of experimental measurements. The experimental uncertainty is inevitable and can be noticed by repeating a measurement many times using exactly the same settings for all inputs/variables.
* **Interpolation uncertainty:** This comes from a lack of available data collected from computer model simulations and/or experimental measurements. For other input settings that don't have simulation data or experimental measurements, one must interpolate or extrapolate in order to predict the corresponding responses.

## Aleatoric and epistemic uncertainty

Uncertainty is sometimes classified into two categories, prominently seen in medical applications.

* **Aleatoric uncertainty:** Aleatoric uncertainty is also known as statistical uncertainty, and is representative of unknowns that differ each time we run the same experiment. For example, arrows shot with a mechanical bow that exactly duplicates each launch (the same acceleration, altitude, direction and final velocity) will not all impact the same point on the target due to random and complicated vibrations of the arrow shaft, the knowledge of which cannot be determined sufficiently to eliminate the resulting scatter of impact points. The argument here hinges on the definition of "cannot": the fact that we cannot measure sufficiently with currently available measurement devices does not necessarily preclude the existence of such information, which would move this uncertainty into the category below. ''Aleatoric'' is derived from the Latin alea or dice, referring to a game of chance.
* **Epistemic uncertainty:** Epistemic uncertainty is also known as systematic uncertainty, and is due to things one could in principle know but does not in practice. This may be because a measurement is not accurate, because the model neglects certain effects, or because particular data have been deliberately hidden. An example of a source of this uncertainty would be the drag in an experiment designed to measure the acceleration of gravity near the earth's surface. The commonly used gravitational acceleration of 9.8 m/s² ignores the effects of air resistance, but the air resistance for the object could be measured and incorporated into the experiment to reduce the resulting uncertainty in the calculation of the gravitational acceleration.

In real-life applications, both kinds of uncertainties are present. Uncertainty quantification intends to explicitly express both types of uncertainty separately. The quantification for the aleatoric uncertainties can be relatively straightforward, where traditional (frequentist) probability is the most basic form. Techniques such as the Monte Carlo method are frequently used. A probability distribution can be represented by its moments (in the Gaussian case, the mean and covariance suffice, although, in general, even knowledge of all moments to arbitrarily high order still does not specify the distribution function uniquely), or more recently, by techniques such as Karhunen–Loève and polynomial chaos expansions. To evaluate epistemic uncertainties, efforts are made to understand the (lack of) knowledge of the system, process or mechanism. Epistemic uncertainty is generally understood through the lens of Bayesian probability, where probabilities are interpreted as indicating how certain a rational person could be regarding a specific claim.

### Mathematical perspective of aleatoric and epistemic uncertainty

In mathematics, uncertainty is often characterized in terms of a probability distribution. From that perspective, epistemic uncertainty means not being certain what the relevant probability distribution is, and aleatoric uncertainty means not being certain what a random sample drawn from a probability distribution will be.
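As an illustration of this distinction, the following minimal sketch assumes a coin whose bias p is not known exactly and is described by a Beta(a, b) distribution. The coin, the Beta form and the function name are hypothetical choices made for this example only. By the law of total variance, the variance of a single flip splits into an aleatoric part E[p(1 − p)] and an epistemic part Var(p).

```python
def decompose_flip_variance(a, b):
    """Split the variance of one coin flip when the bias p ~ Beta(a, b).

    Returns (aleatoric, epistemic, total), where
      aleatoric = E[p * (1 - p)]  (randomness left even if p were known),
      epistemic = Var(p)          (uncertainty about the distribution itself),
      total     = aleatoric + epistemic (law of total variance).
    """
    n = a + b
    mean_p = a / n
    var_p = a * b / (n * n * (n + 1.0))   # variance of a Beta(a, b) variable
    total = mean_p * (1.0 - mean_p)       # variance of the marginal Bernoulli flip
    aleatoric = total - var_p             # E[p] - E[p^2] = E[p(1 - p)]
    return aleatoric, var_p, total
```

With Beta(1, 1) the epistemic part is 1/12; with Beta(100, 100) it shrinks to roughly 0.0012 while the total stays at 0.25, so more knowledge about p reduces only the epistemic share.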

## Uncertainty versus Variability

Technical professionals are often asked to estimate "ranges" for uncertain quantities. It is important that they distinguish whether they are being asked for variability ranges or uncertainty ranges. Likewise, it is important for modelers to know if they are building models of variability or uncertainty, and their relationship, if any.

# Two types of uncertainty quantification problems

There are two major types of problems in uncertainty quantification: one is the forward propagation of uncertainty (where the various sources of uncertainty are propagated through the model to predict the overall uncertainty in the system response) and the other is the inverse assessment of model uncertainty and parameter uncertainty (where the model parameters are calibrated simultaneously using test data). There has been a proliferation of research on the former problem and a majority of uncertainty analysis techniques were developed for it. On the other hand, the latter problem is drawing increasing attention in the engineering design community, since uncertainty quantification of a model and the subsequent predictions of the true system response(s) are of great interest in designing robust systems.

## Forward uncertainty propagation

Uncertainty propagation is the quantification of uncertainties in system output(s) propagated from uncertain inputs. It focuses on the influence on the outputs from the ''parametric variability'' listed in the sources of uncertainty. The targets of uncertainty propagation analysis can be:

* To evaluate low-order moments of the outputs, i.e. mean and variance.
* To evaluate the reliability of the outputs. This is especially useful in reliability engineering where outputs of a system are usually closely related to the performance of the system.
* To assess the complete probability distribution of the outputs. This is useful in the scenario of utility optimization where the complete distribution is used to calculate the utility.
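The three targets above can be sketched with a plain Monte Carlo simulation. The model below (the cross-section area of a work piece), the input distributions, the threshold of 49.0 and all function names are assumptions made for illustration; they do not come from a specific application.

```python
import random
import statistics


def model(width, height):
    """Hypothetical computer model: cross-section area of a work piece."""
    return width * height


def propagate(n_samples=20000, seed=0):
    """Propagate assumed input variability through the model by sampling."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        w = rng.gauss(10.0, 0.1)    # assumed width variability
        h = rng.gauss(5.0, 0.05)    # assumed height variability
        outputs.append(model(w, h))
    mean = statistics.fmean(outputs)                       # low-order moments
    variance = statistics.variance(outputs)
    p_below = sum(y < 49.0 for y in outputs) / n_samples   # reliability-type target
    cuts = statistics.quantiles(outputs, n=20)             # 5%, 10%, ..., 95% cut points
    return mean, variance, p_below, (cuts[0], cuts[-1])
```

The sampled outputs approximate the full output distribution, so the moments, the probability of falling below the threshold and the 5% and 95% quantiles are all read off the same sample.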

## Inverse uncertainty quantification

Given some experimental measurements of a system and some computer simulation results from its mathematical model, inverse uncertainty quantification estimates the discrepancy between the experiment and the mathematical model (which is called bias correction), and estimates the values of unknown parameters in the model if there are any (which is called parameter calibration or simply calibration). Generally this is a much more difficult problem than forward uncertainty propagation; however it is of great importance since it is typically implemented in a model updating process. There are several scenarios in inverse uncertainty quantification:

### Bias correction only

Bias correction quantifies the ''model inadequacy'', i.e. the discrepancy between the experiment and the mathematical model. The general model updating formula for bias correction is:

$$y^e\left(\mathbf{x}\right)=y^m\left(\mathbf{x}\right)+\delta\left(\mathbf{x}\right)+\varepsilon$$

where $y^e\left(\mathbf{x}\right)$ denotes the experimental measurements as a function of several input variables $\mathbf{x}$, $y^m\left(\mathbf{x}\right)$ denotes the computer model (mathematical model) response, $\delta\left(\mathbf{x}\right)$ denotes the additive discrepancy function (aka bias function), and $\varepsilon$ denotes the experimental uncertainty. The objective is to estimate the discrepancy function $\delta\left(\mathbf{x}\right)$, and as a by-product, the resulting updated model is $y^m\left(\mathbf{x}\right)+\delta\left(\mathbf{x}\right)$. A prediction confidence interval is provided with the updated model as the quantification of the uncertainty.
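A minimal sketch of bias correction follows, assuming a scalar input and a discrepancy function that is linear in the input. The linear form and the function names are simplifications chosen for brevity; they are not a prescribed method, and a more flexible representation of the discrepancy may be needed in practice.

```python
def fit_linear_discrepancy(xs, y_exp, model):
    """Least-squares fit of delta(x) ~ a + b * x to the residuals
    y_exp - model(x). The linear form of delta is an assumption."""
    residuals = [y - model(x) for x, y in zip(xs, y_exp)]
    n = len(xs)
    x_bar = sum(xs) / n
    r_bar = sum(residuals) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxr = sum((x - x_bar) * (r - r_bar) for x, r in zip(xs, residuals))
    b = sxr / sxx
    a = r_bar - b * x_bar
    return a, b


def updated_model(model, a, b):
    """Return the bias-corrected predictor y^m(x) + delta_hat(x)."""
    return lambda x: model(x) + a + b * x
```

The residual scatter around the fitted discrepancy plays the role of the experimental uncertainty term and could be used to attach an interval to the updated prediction.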

### Parameter calibration only

Parameter calibration estimates the values of one or more unknown parameters in a mathematical model. The general model updating formulation for calibration is:

$$y^e\left(\mathbf{x}\right)=y^m\left(\mathbf{x},\boldsymbol{\theta}^*\right)+\varepsilon$$

where $y^m\left(\mathbf{x},\boldsymbol{\theta}\right)$ denotes the computer model response that depends on several unknown model parameters $\boldsymbol{\theta}$, and $\boldsymbol{\theta}^*$ denotes the true values of the unknown parameters in the course of experiments. The objective is to either estimate $\boldsymbol{\theta}^*$, or to come up with a probability distribution of $\boldsymbol{\theta}^*$ that encompasses the best knowledge of the true parameter values.

### Bias correction and parameter calibration

This scenario considers an inaccurate model with one or more unknown parameters, and its model updating formulation combines the two together:

$$y^e\left(\mathbf{x}\right)=y^m\left(\mathbf{x},\boldsymbol{\theta}^*\right)+\delta\left(\mathbf{x}\right)+\varepsilon$$

It is the most comprehensive model updating formulation that includes all possible sources of uncertainty, and it requires the most effort to solve.

# Selective methodologies for uncertainty quantification

Much research has been done to solve uncertainty quantification problems, though most of it deals with uncertainty propagation. During the past one to two decades, a number of approaches for inverse uncertainty quantification problems have also been developed and have proved to be useful for most small- to medium-scale problems.

## Methodologies for forward uncertainty propagation

Existing uncertainty propagation approaches include probabilistic approaches and non-probabilistic approaches. There are basically five categories of probabilistic approaches for uncertainty propagation:

* Simulation-based methods: Monte Carlo simulations, importance sampling, adaptive sampling, etc.
* Local expansion-based methods: Taylor series, perturbation method, etc. These methods have advantages when dealing with relatively small input variability and outputs that don't express high nonlinearity. These linear or linearized methods are detailed in the article Uncertainty propagation.
* Functional expansion-based methods: Neumann expansion, orthogonal or Karhunen–Loève expansions (KLE), with polynomial chaos expansion (PCE) and wavelet expansions as special cases.
* Most probable point (MPP)-based methods: first-order reliability method (FORM) and second-order reliability method (SORM).
* Numerical integration-based methods: full factorial numerical integration (FFNI) and dimension reduction (DR).

For non-probabilistic approaches, interval analysis, fuzzy theory, possibility theory and evidence theory are among the most widely used.

The probabilistic approach is considered the most rigorous approach to uncertainty analysis in engineering design due to its consistency with the theory of decision analysis. Its cornerstone is the calculation of probability density functions for sampling statistics. This can be performed rigorously for random variables that are obtainable as transformations of Gaussian variables, leading to exact confidence intervals.
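To make the local expansion-based category concrete, here is a minimal sketch of first-order Taylor propagation for independent inputs, where the output variance is approximated by the sum of squared sensitivities times input variances. The function name, the central-difference gradient and the step size are illustrative assumptions.

```python
def first_order_propagation(f, means, stds, rel_step=1e-6):
    """Approximate the mean and variance of f(X) for independent inputs
    using a first-order Taylor expansion about the input means."""
    y0 = f(*means)                      # first-order estimate of the output mean
    variance = 0.0
    for i, (m, s) in enumerate(zip(means, stds)):
        h = rel_step * max(abs(m), 1.0)
        up = list(means)
        dn = list(means)
        up[i] = m + h
        dn[i] = m - h
        dfdx = (f(*up) - f(*dn)) / (2.0 * h)   # central-difference sensitivity
        variance += (dfdx * s) ** 2
    return y0, variance
```

For `f = lambda w, h: w * h` with means (10, 5) and standard deviations (0.1, 0.05), the sketch gives a variance of 0.5, whereas the exact variance for independent inputs is 0.500025. The small gap is the second-order term that the linearization drops, which is why these methods suit small input variability and mildly nonlinear outputs.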

## Methodologies for inverse uncertainty quantification

### Frequentist

In regression analysis and least squares problems, the standard error of parameter estimates is readily available, which can be expanded into a confidence interval.
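As a hedged illustration, the sketch below calibrates a single parameter in the hypothetical model y = θx by least squares and turns the standard error into an approximate confidence interval. The model, the function name and the use of a normal quantile (rather than a Student t quantile, which would be more appropriate for small samples) are assumptions of this example.

```python
import math
from statistics import NormalDist


def calibrate_with_interval(xs, ys, level=0.95):
    """Least-squares estimate of theta in y = theta * x, with its standard
    error and an approximate (normal-based) confidence interval."""
    sxx = sum(x * x for x in xs)
    theta_hat = sum(x * y for x, y in zip(xs, ys)) / sxx
    n = len(xs)
    rss = sum((y - theta_hat * x) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = rss / (n - 1)               # one fitted parameter
    std_err = math.sqrt(sigma2_hat / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return theta_hat, std_err, (theta_hat - z * std_err, theta_hat + z * std_err)
```

The interval half-width is the standard error scaled by the quantile, so collecting more data (a larger sum of squared inputs) or reducing measurement noise narrows it.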

### Bayesian

Several methodologies for inverse uncertainty quantification exist under the Bayesian framework. The most complicated direction is to aim at solving problems with both bias correction and parameter calibration. The challenges of such problems include not only the influences from model inadequacy and parameter uncertainty, but also the lack of data from both computer simulations and experiments. A common situation is that the input settings are not the same over experiments and simulations.

#### Modular Bayesian approach

An approach to inverse uncertainty quantification is the modular Bayesian approach. The modular Bayesian approach derives its name from its four-module procedure. Apart from the currently available data, a prior distribution of unknown parameters should be assigned.

**Module 1: Gaussian process modeling for the computer model**

To address the issue from lack of simulation results, the computer model is replaced with a Gaussian process (GP) model

$$y^m\left(\mathbf{x},\boldsymbol{\theta}\right)\sim\mathcal{GP}\big(\mathbf{h}^m\left(\cdot\right)^T\boldsymbol{\beta}^m,\sigma_m^2R^m\left(\cdot,\cdot\right)\big)$$

where

$$R^m\big(\left(\mathbf{x},\boldsymbol{\theta}\right),\left(\mathbf{x}',\boldsymbol{\theta}'\right)\big)=\exp\left\{-\sum_{k=1}^d\omega_k^m\left(x_k-x_k'\right)^2\right\}\exp\left\{-\sum_{k=1}^r\omega_{d+k}^m\left(\theta_k-\theta_k'\right)^2\right\}.$$

$d$ is the dimension of input variables, and $r$ is the dimension of unknown parameters. While $\mathbf{h}^m\left(\cdot\right)$ is pre-defined, $\left\{\boldsymbol{\beta}^m,\sigma_m,\omega_k^m,k=1,\ldots,d+r\right\}$, known as ''hyperparameters'' of the GP model, need to be estimated via maximum likelihood estimation (MLE). This module can be considered as a generalized kriging method.

**Module 2: Gaussian process modeling for the discrepancy function**

Similarly to the first module, the discrepancy function is replaced with a GP model

$$\delta\left(\mathbf{x}\right)\sim\mathcal{GP}\big(\mathbf{h}^\delta\left(\cdot\right)^T\boldsymbol{\beta}^\delta,\sigma_\delta^2R^\delta\left(\cdot,\cdot\right)\big)$$

where

$$R^\delta\left(\mathbf{x},\mathbf{x}'\right)=\exp\left\{-\sum_{k=1}^d\omega_k^\delta\left(x_k-x_k'\right)^2\right\}.$$

Together with the prior distribution of unknown parameters, and data from both computer models and experiments, one can derive the maximum likelihood estimates for $\left\{\boldsymbol{\beta}^\delta,\sigma_\delta,\omega_k^\delta,k=1,\ldots,d\right\}$. At the same time, $\boldsymbol{\beta}^m$ from Module 1 gets updated as well.

**Module 3: Posterior distribution of unknown parameters**

Bayes' theorem is applied to calculate the posterior distribution of the unknown parameters:

$$p\left(\boldsymbol{\theta}\mid\text{data},\boldsymbol{\varphi}\right)\propto p\left(\text{data}\mid\boldsymbol{\theta},\boldsymbol{\varphi}\right)p\left(\boldsymbol{\theta}\right)$$

where $\boldsymbol{\varphi}$ includes all the fixed hyperparameters in previous modules.

**Module 4: Prediction of the experimental response and discrepancy function**
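The Bayes update in Module 3 can be illustrated on a deliberately simplified problem. The sketch below assumes a single scalar parameter, a cheap hypothetical model that is evaluated directly (so no Gaussian process surrogate and no discrepancy term), independent Gaussian measurement noise with a fixed standard deviation standing in for the fixed hyperparameters, and a strictly positive prior on a grid. None of these simplifications is part of the modular approach itself.

```python
import math


def grid_posterior(thetas, prior, xs, ys, noise_std, model):
    """Normalized posterior p(theta | data) on a grid via Bayes' theorem,
    assuming independent Gaussian noise with known standard deviation."""
    log_post = []
    for theta, p in zip(thetas, prior):
        log_lik = sum(-0.5 * ((y - model(x, theta)) / noise_std) ** 2
                      for x, y in zip(xs, ys))
        log_post.append(log_lik + math.log(p))   # prior values must be > 0
    shift = max(log_post)                        # subtract the max to avoid underflow
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]
```

Working with log-probabilities and normalizing at the end keeps the computation stable even when the likelihood of most grid points is extremely small.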

#### Fully Bayesian approach

The fully Bayesian approach requires that not only the priors for unknown parameters $\boldsymbol{\theta}$ but also the priors for the other hyperparameters $\boldsymbol{\varphi}$ be assigned. It proceeds in the following steps:

1. Derive the posterior distribution $p\left(\boldsymbol{\theta},\boldsymbol{\varphi}\mid\text{data}\right)$;
2. Integrate $\boldsymbol{\varphi}$ out and obtain $p\left(\boldsymbol{\theta}\mid\text{data}\right)$. This single step accomplishes the calibration;
3. Predict the experimental response and discrepancy function.

However, the approach has significant drawbacks:

* For most cases, $p\left(\boldsymbol{\theta},\boldsymbol{\varphi}\mid\text{data}\right)$ is a highly intractable function of $\boldsymbol{\varphi}$. Hence the integration becomes very troublesome. Moreover, if priors for the other hyperparameters $\boldsymbol{\varphi}$ are not carefully chosen, the complexity in numerical integration increases even more.
* In the prediction stage, the prediction (which should at least include the expected value of system responses) also requires numerical integration. Markov chain Monte Carlo (MCMC) is often used for integration; however it is computationally expensive.

The fully Bayesian approach requires a huge amount of calculations and may not yet be practical for dealing with the most complicated modelling situations.
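For orientation, a random-walk Metropolis sampler, one of the simplest MCMC algorithms, can be sketched as follows. It is shown for a one-dimensional unnormalized log-posterior supplied by the caller; the function name, the Gaussian proposal and the step size are illustrative assumptions, and practical problems typically need many dimensions, tuning and convergence checks.

```python
import math
import random


def metropolis(log_post, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalized log-density."""
    rng = random.Random(seed)
    current = start
    current_lp = log_post(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, exp(proposal_lp - current_lp)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples
```

Each sample requires one evaluation of the log-posterior, which is where the computational expense comes from when that evaluation involves a costly simulation or a large Gaussian process likelihood.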

# Known issues

The theories and methodologies for uncertainty propagation are much better established, compared with inverse uncertainty quantification. For the latter, several difficulties remain unsolved:

1. Dimensionality issue: The computational cost increases dramatically with the dimensionality of the problem, i.e. the number of input variables and/or the number of unknown parameters.
2. Identifiability issue: Multiple combinations of unknown parameters and discrepancy function can yield the same experimental prediction. Hence different values of parameters cannot be distinguished/identified.
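The identifiability issue can be seen in a toy case. The sketch below assumes a hypothetical computer model y = θx and two different pairs of parameter value and discrepancy function; all names and values are made up for illustration.

```python
def prediction(x, theta, delta):
    """Hypothetical updated model: y^m(x, theta) + delta(x) with y^m = theta * x."""
    return theta * x + delta(x)


def max_prediction_gap(pair_a, pair_b, xs):
    """Largest absolute difference between the predictions of two
    (theta, delta) pairs over the inputs xs."""
    (theta_a, delta_a), (theta_b, delta_b) = pair_a, pair_b
    return max(abs(prediction(x, theta_a, delta_a) - prediction(x, theta_b, delta_b))
               for x in xs)


PAIR_A = (2.0, lambda x: 1.0 * x)   # theta = 2, delta(x) = x
PAIR_B = (3.0, lambda x: 0.0)       # theta = 3, delta(x) = 0
```

Both pairs predict 3x for every input, so `max_prediction_gap(PAIR_A, PAIR_B, xs)` is zero for any `xs`, and experimental data alone cannot tell θ = 2 from θ = 3 without further assumptions on the discrepancy function or the prior.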

# Random Events to Quantifiable Uncertainty

When rolling one six-sided die, each outcome from one to six is equally likely. An interval with 90% coverage probability therefore has to span the entire output range. When rolling five dice and observing the sum of the outcomes, an interval with about 88.2% coverage spans only 46.15% of the range (12 of the 26 possible sums). The interval becomes narrower relative to the range as the number of dice increases. Real-life events are influenced by numerous probabilistic events, and in most situations their combined effect can be predicted by a narrow interval with high coverage probability.
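The dice figures can be checked with exact arithmetic. The sketch below builds the distribution of the sum of fair dice by repeated convolution. Reading the five-dice interval as the central sums 12 to 23 is an assumption, chosen because it covers 12 of the 26 possible sums; the function names are illustrative.

```python
from fractions import Fraction


def sum_distribution(n_dice, sides=6):
    """Exact probability of each possible sum of n fair dice."""
    dist = {0: Fraction(1)}
    for _ in range(n_dice):
        new = {}
        for total, p in dist.items():
            for face in range(1, sides + 1):
                new[total + face] = new.get(total + face, Fraction(0)) + p / sides
        dist = new
    return dist


def coverage(n_dice, lo, hi):
    """Probability that the sum of n fair dice lies in [lo, hi]."""
    return sum(p for total, p in sum_distribution(n_dice).items() if lo <= total <= hi)
```

For one die, `coverage(1, 1, 5)` is 5/6, which is below 90%, so only the full range reaches 90% coverage. For five dice, `coverage(5, 12, 23)` is 6862/7776, or about 88.2%.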