HOME

TheInfoList



OR:

A computer experiment or simulation experiment is an experiment used to study a computer simulation, also referred to as an
in silico In biology and other experimental sciences, an ''in silico'' experiment is one performed on computer or via computer simulation. The phrase is pseudo-Latin for 'in silicon' (correct la, in silicio), referring to silicon in computer chips. It ...
system. This area includes
computational physics Computational physics is the study and implementation of numerical analysis to solve problems in physics for which a quantitative theory already exists. Historically, computational physics was the first application of modern computers in science, ...
,
computational chemistry Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into computer programs, to calculate the structures and properties of mo ...
,
computational biology Computational biology refers to the use of data analysis, mathematical modeling and Computer simulation, computational simulations to understand biological systems and relationships. An intersection of computer science, biology, and big data, the ...
and other similar disciplines.


Background

Computer simulation Computer simulation is the process of mathematical modelling, performed on a computer, which is designed to predict the behaviour of, or the outcome of, a real-world or physical system. The reliability of some mathematical models can be deter ...
s are constructed to emulate a physical system. Because these are meant to replicate some aspect of a system in detail, they often do not yield an analytic solution. Therefore, methods such as
discrete event simulation A discrete-event simulation (DES) models the operation of a system as a ( discrete) sequence of events in time. Each event occurs at a particular instant in time and marks a change of state in the system. Between consecutive events, no change in t ...
or
finite element The finite element method (FEM) is a popular method for numerically solving differential equations arising in engineering and mathematical modeling. Typical problem areas of interest include the traditional fields of structural analysis, heat t ...
solvers are used. A
computer model Computer simulation is the process of mathematical modelling, performed on a computer, which is designed to predict the behaviour of, or the outcome of, a real-world or physical system. The reliability of some mathematical models can be deter ...
is used to make inferences about the system it replicates. For example, climate models are often used because experimentation on an earth sized object is impossible.


Objectives

Computer experiments have been employed with many purposes in mind. Some of those include: * Uncertainty quantification: Characterize the uncertainty present in a computer simulation arising from unknowns during the computer simulation's construction. *
Inverse problem An inverse problem in science is the process of calculating from a set of observations the causal factors that produced them: for example, calculating an image in X-ray computed tomography, source reconstruction in acoustics, or calculating the ...
s: Discover the underlying properties of the system from the physical data. * Bias correction: Use physical data to correct for bias in the simulation. *
Data assimilation Data assimilation is a mathematical discipline that seeks to optimally combine theory (usually in the form of a numerical model) with observations. There may be a number of different goals sought – for example, to determine the optimal state es ...
: Combine multiple simulations and physical data sources into a complete predictive model. *
Systems design Systems design interfaces, and data for an electronic control system to satisfy specified requirements. System design could be seen as the application of system theory to product development. There is some overlap with the disciplines of system ...
: Find inputs that result in optimal system performance measures.


Computer simulation modeling

Modeling of computer experiments typically uses a Bayesian framework.
Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about the event, ...
is an interpretation of the field of
statistics Statistics (from German: '' Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, indust ...
where all evidence about the true state of the world is explicitly expressed in the form of
probabilities Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, ...
. In the realm of computer experiments, the Bayesian interpretation would imply we must form a
prior distribution In Bayesian statistical inference, a prior probability distribution, often simply called the prior, of an uncertain quantity is the probability distribution that would express one's beliefs about this quantity before some evidence is taken into ...
that represents our prior belief on the structure of the computer model. The use of this philosophy for computer experiments started in the 1980s and is nicely summarized by Sacks et al. (1989

While the Bayesian approach is widely used,
frequentist Frequentist inference is a type of statistical inference based in frequentist probability, which treats “probability” in equivalent terms to “frequency” and draws conclusions from sample-data by means of emphasizing the frequency or pro ...
approaches have been recently discusse

The basic idea of this framework is to model the computer simulation as an unknown function of a set of inputs. The computer simulation is implemented as a piece of computer code that can be evaluated to produce a collection of outputs. Examples of inputs to these simulations are coefficients in the underlying model, initial conditions and forcing functions. It is natural to see the simulation as a deterministic function that maps these ''inputs'' into a collection of ''outputs''. On the basis of seeing our simulator this way, it is common to refer to the collection of inputs as x, the computer simulation itself as f, and the resulting output as f(x). Both x and f(x) are vector quantities, and they can be very large collections of values, often indexed by space, or by time, or by both space and time. Although f(\cdot) is known in principle, in practice this is not the case. Many simulators comprise tens of thousands of lines of high-level computer code, which is not accessible to intuition. For some simulations, such as climate models, evaluation of the output for a single set of inputs can require millions of computer hour


Gaussian process prior

The typical model for a computer code output is a Gaussian process. For notational simplicity, assume f(x) is a scalar. Owing to the Bayesian framework, we fix our belief that the function f follows a
Gaussian process In probability theory and statistics, a Gaussian process is a stochastic process (a collection of random variables indexed by time or space), such that every finite collection of those random variables has a multivariate normal distribution, i.e. ...
, f \sim \operatorname(m(\cdot),C(\cdot,\cdot)), where m is the mean function and C is the covariance function. Popular mean functions are low order polynomials and a popular
covariance function In probability theory and statistics, the covariance function describes how much two random variables change together (their ''covariance'') with varying spatial or temporal separation. For a random field or stochastic process ''Z''(''x'') on a doma ...
is Matern covariance, which includes both the exponential ( \nu = 1/2 ) and Gaussian covariances (as \nu \rightarrow \infty ).


Design of computer experiments

The design of computer experiments has considerable differences from
design of experiments The design of experiments (DOE, DOX, or experimental design) is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associ ...
for parametric models. Since a Gaussian process prior has an infinite dimensional representation, the concepts of A and D criteria (see
Optimal design In the design of experiments, optimal designs (or optimum designs) are a class of experimental designs that are optimal with respect to some statistical criterion. The creation of this field of statistics has been credited to Danish statistic ...
), which focus on reducing the error in the parameters, cannot be used. Replications would also be wasteful in cases when the computer simulation has no error. Criteria that are used to determine a good experimental design include integrated mean squared prediction erro

and distance based criteri

Popular strategies for design include
latin hypercube sampling Latin hypercube sampling (LHS) is a statistical method for generating a near-random sample of parameter values from a multidimensional distribution. The sampling method is often used to construct computer experiments or for Monte Carlo integratio ...
and low discrepancy sequences.


Problems with massive sample sizes

Unlike physical experiments, it is common for computer experiments to have thousands of different input combinations. Because the standard inference requires matrix inversion of a square matrix of the size of the number of samples (n), the cost grows on the \mathcal (n^3) . Matrix inversion of large, dense matrices can also cause numerical inaccuracies. Currently, this problem is solved by greedy decision tree techniques, allowing effective computations for unlimited dimensionality and sample siz
patent WO2013055257A1
or avoided by using approximation methods, e.g


See also

*
Simulation A simulation is the imitation of the operation of a real-world process or system over time. Simulations require the use of Conceptual model, models; the model represents the key characteristics or behaviors of the selected system or proc ...
* Uncertainty quantification *
Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about the event, ...
*
Gaussian process emulator In statistics, Gaussian process emulator is one name for a general type of statistical model that has been used in contexts where the problem is to make maximum use of the outputs of a complicated (often non-random) computer-based simulation model. ...
*
Design of experiments The design of experiments (DOE, DOX, or experimental design) is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associ ...
*
Molecular dynamics Molecular dynamics (MD) is a computer simulation method for analyzing the physical movements of atoms and molecules. The atoms and molecules are allowed to interact for a fixed period of time, giving a view of the dynamic "evolution" of th ...
*
Monte Carlo method Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deter ...
*
Surrogate model A surrogate model is an engineering method used when an outcome of interest cannot be easily measured or computed, so a model of the outcome is used instead. Most engineering design problems require experiments and/or simulations to evaluate design ...
* Grey box completion and validation


Further reading

* * {{cite journal , last1 = Fehr , first1 = Jörg , last2 = Heiland , first2 = Jan , last3 = Himpe , first3 = Christian , last4 = Saak , first4 = Jens , title = Best practices for replicability, reproducibility and reusability of computer-based experiments exemplified by model reduction software , journal = AIMS Mathematics , volume = 1 , issue = 3 , pages = 261–281 , date = 2016 , doi = 10.3934/Math.2016.3.261 , arxiv = 1607.01191 Computational science Design of experiments Simulation