Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle. They are often used in physical and mathematical problems and are most useful when it is difficult or impossible to use other approaches. Monte Carlo methods are mainly used in three problem classes: optimization, numerical integration, and generating draws from a probability distribution.

In physics-related problems, Monte Carlo methods are useful for simulating systems with many coupled degrees of freedom, such as fluids, disordered materials, strongly coupled solids, and cellular structures (see cellular Potts model, interacting particle systems, McKean–Vlasov processes, kinetic models of gases). Other examples include modeling phenomena with significant uncertainty in inputs, such as the calculation of risk in business and, in mathematics, evaluation of multidimensional definite integrals with complicated boundary conditions. In application to systems engineering problems (space, oil exploration, aircraft design, etc.), Monte Carlo–based predictions of failure, cost overruns and schedule overruns are routinely better than human intuition or alternative "soft" methods.

In principle, Monte Carlo methods can be used to solve any problem having a probabilistic interpretation. By the law of large numbers, integrals described by the expected value of some random variable can be approximated by taking the empirical mean (the 'sample mean') of independent samples of the variable. When the probability distribution of the variable is parameterized, mathematicians often use a Markov chain Monte Carlo (MCMC) sampler. The central idea is to design a judicious Markov chain model with a prescribed stationary probability distribution. That is, in the limit, the samples being generated by the MCMC method will be samples from the desired (target) distribution. By the ergodic theorem, the stationary distribution is approximated by the empirical measures of the random states of the MCMC sampler.

In other problems, the objective is generating draws from a sequence of probability distributions satisfying a nonlinear evolution equation. These flows of probability distributions can always be interpreted as the distributions of the random states of a Markov process whose transition probabilities depend on the distributions of the current random states (see McKean–Vlasov processes, nonlinear filtering equation). In other instances we are given a flow of probability distributions with an increasing level of sampling complexity (path space models with an increasing time horizon, Boltzmann–Gibbs measures associated with decreasing temperature parameters, and many others). These models can also be seen as the evolution of the law of the random states of a nonlinear Markov chain. A natural way to simulate these sophisticated nonlinear Markov processes is to sample multiple copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled empirical measures. In contrast with traditional Monte Carlo and MCMC methodologies, these mean-field particle techniques rely on sequential interacting samples. The terminology ''mean field'' reflects the fact that each of the ''samples'' (particles, individuals, walkers, agents, creatures, or phenotypes) interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes.

Despite its conceptual and algorithmic simplicity, the computational cost associated with a Monte Carlo simulation can be staggeringly high. In general the method requires many samples to get a good approximation, which may incur an arbitrarily large total runtime if the processing time of a single sample is high. Although this is a severe limitation in very complex problems, the embarrassingly parallel nature of the algorithm allows this large cost to be reduced (perhaps to a feasible level) through parallel computing strategies in local processors, clusters, cloud computing, GPU, FPGA, etc.
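To make the sample-mean idea concrete, here is a minimal Python sketch (the integrand and the sample size are illustrative assumptions, not taken from the text): it approximates an expected value, and hence an integral, by the empirical mean of independent draws.

```python
import math
import random

def mc_expectation(f, sample, n=100_000):
    """Approximate E[f(X)] by the empirical mean of n independent samples.
    `sample` is a callable returning one draw of the random variable X."""
    return sum(f(sample()) for _ in range(n)) / n

# Example: the integral of exp(-x^2) over [0, 1] equals E[exp(-U^2)]
# for U uniform on [0, 1]; by the law of large numbers the empirical
# mean converges to this value (about 0.7468) as n grows.
estimate = mc_expectation(lambda x: math.exp(-x * x), random.random)
print(estimate)
```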


Overview

Monte Carlo methods vary, but tend to follow a particular pattern:
1. Define a domain of possible inputs
2. Generate inputs randomly from a probability distribution over the domain
3. Perform a deterministic computation on the inputs
4. Aggregate the results

For example, consider a quadrant (circular sector) inscribed in a unit square. Given that the ratio of their areas is π/4, the value of π can be approximated using a Monte Carlo method:
1. Draw a square, then inscribe a quadrant within it
2. Uniformly scatter a given number of points over the square
3. Count the number of points inside the quadrant, i.e. having a distance from the origin of less than 1
4. The ratio of the inside-count and the total-sample-count is an estimate of the ratio of the two areas, π/4. Multiply the result by 4 to estimate π (see the sketch below).

In this procedure the domain of inputs is the square that circumscribes the quadrant. We generate random inputs by scattering grains over the square, then perform a computation on each input (test whether it falls within the quadrant). Aggregating the results yields our final result, the approximation of π.

There are two important considerations:
1. If the points are not uniformly distributed, then the approximation will be poor.
2. Many points are needed. The approximation is generally poor if only a few points are randomly placed in the whole square. On average, the approximation improves as more points are placed.

Uses of Monte Carlo methods require large amounts of random numbers, and their use benefitted greatly from pseudorandom number generators, which were far quicker to use than the tables of random numbers that had been previously used for statistical sampling.
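A minimal sketch of the quadrant procedure above, assuming plain Python and its standard pseudorandom number generator (names and sample size are illustrative):

```python
import random

def estimate_pi(n_points=1_000_000):
    """Estimate pi by uniformly scattering points over the unit square
    and counting those that fall inside the inscribed quadrant."""
    inside = 0
    for _ in range(n_points):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:  # within distance 1 of the origin
            inside += 1
    # inside / n_points estimates the area ratio pi/4
    return 4.0 * inside / n_points

print(estimate_pi())  # roughly 3.14; the estimate improves with more points
```

With a million points the estimate is typically correct to two or three decimal places, reflecting the slow O(1/√n) convergence of plain Monte Carlo sampling.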


History

Before the Monte Carlo method was developed, simulations tested a previously understood deterministic problem, and statistical sampling was used to estimate uncertainties in the simulations. Monte Carlo simulations invert this approach, solving deterministic problems using probabilistic metaheuristics (see simulated annealing).

An early variant of the Monte Carlo method was devised to solve the Buffon's needle problem, in which π can be estimated by dropping needles on a floor made of parallel equidistant strips. In the 1930s, Enrico Fermi first experimented with the Monte Carlo method while studying neutron diffusion, but he did not publish this work.

In the late 1940s, Stanislaw Ulam invented the modern version of the Markov Chain Monte Carlo method while he was working on nuclear weapons projects at the Los Alamos National Laboratory. In 1946, nuclear weapons physicists at Los Alamos were investigating neutron diffusion in the core of a nuclear weapon. Despite having most of the necessary data, such as the average distance a neutron would travel in a substance before it collided with an atomic nucleus and how much energy the neutron was likely to give off following a collision, the Los Alamos physicists were unable to solve the problem using conventional, deterministic mathematical methods. Ulam proposed using random experiments. He recounts his inspiration as follows:

Being secret, the work of von Neumann and Ulam required a code name. A colleague of von Neumann and Ulam, Nicholas Metropolis, suggested using the name ''Monte Carlo'', which refers to the Monte Carlo Casino in Monaco where Ulam's uncle would borrow money from relatives to gamble. Monte Carlo methods were central to the simulations required for the Manhattan Project, though severely limited by the computational tools at the time. Von Neumann, Nicholas Metropolis and others programmed the ENIAC computer to perform the first fully automated Monte Carlo calculations, of a fission weapon core, in the spring of 1948. In the 1950s Monte Carlo methods were used at Los Alamos for the development of the hydrogen bomb, and became popularized in the fields of physics, physical chemistry, and operations research. The Rand Corporation and the U.S. Air Force were two of the major organizations responsible for funding and disseminating information on Monte Carlo methods during this time, and the methods began to find wide application in many different fields.

The theory of more sophisticated mean-field type particle Monte Carlo methods had certainly started by the mid-1960s, with the work of Henry P. McKean Jr. on Markov interpretations of a class of nonlinear parabolic partial differential equations arising in fluid mechanics. We also quote an earlier pioneering article by Theodore E. Harris and Herman Kahn, published in 1951, using mean-field genetic-type Monte Carlo methods for estimating particle transmission energies. Mean-field genetic type Monte Carlo methodologies are also used as heuristic natural search algorithms (a.k.a. metaheuristics) in evolutionary computing. The origins of these mean-field computational techniques can be traced to 1950 and 1954 with the work of Alan Turing on genetic type mutation-selection learning machines and the articles by Nils Aall Barricelli at the Institute for Advanced Study in Princeton, New Jersey.

Quantum Monte Carlo, and more specifically diffusion Monte Carlo methods, can also be interpreted as a mean-field particle Monte Carlo approximation of Feynman–Kac path integrals. The origins of Quantum Monte Carlo methods are often attributed to Enrico Fermi and Robert Richtmyer, who developed in 1948 a mean-field particle interpretation of neutron-chain reactions, but the first heuristic-like and genetic type particle algorithm (a.k.a. Resampled or Reconfiguration Monte Carlo methods) for estimating ground state energies of quantum systems (in reduced matrix models) is due to Jack H. Hetherington in 1984. In molecular chemistry, the use of genetic heuristic-like particle methodologies (a.k.a. pruning and enrichment strategies) can be traced back to 1955 with the seminal work of Marshall N. Rosenbluth and Arianna W. Rosenbluth.

The use of Sequential Monte Carlo in advanced signal processing and Bayesian inference is more recent. It was in 1993 that Gordon et al. published in their seminal work the first application of a Monte Carlo resampling algorithm in Bayesian statistical inference. The authors named their algorithm 'the bootstrap filter', and demonstrated that compared to other filtering methods, their bootstrap algorithm does not require any assumption about the state space or the noise of the system. We also quote another pioneering article in this field by Genshiro Kitagawa on a related "Monte Carlo filter", and the ones by Pierre Del Moral and by Himilcon Carvalho, Pierre Del Moral, André Monin and Gérard Salut on particle filters published in the mid-1990s. Particle filters were also developed in signal processing in 1989–1992 by P. Del Moral, J. C. Noyer, G. Rigal, and G. Salut in the LAAS-CNRS in a series of restricted and classified research reports with STCAN (Service Technique des Constructions et Armes Navales), the IT company DIGILOG, and the LAAS-CNRS (the Laboratory for Analysis and Architecture of Systems) on radar/sonar and GPS signal processing problems. These Sequential Monte Carlo methodologies can be interpreted as an acceptance-rejection sampler equipped with an interacting recycling mechanism.

From 1950 to 1996, all the publications on Sequential Monte Carlo methodologies, including the pruning and resample Monte Carlo methods introduced in computational physics and molecular chemistry, present natural and heuristic-like algorithms applied to different situations without a single proof of their consistency, nor a discussion on the bias of the estimates and on genealogical and ancestral tree based algorithms. The mathematical foundations and the first rigorous analysis of these particle algorithms were written by Pierre Del Moral in 1996. Branching type particle methodologies with varying population sizes were also developed at the end of the 1990s by Dan Crisan, Jessica Gaines and Terry Lyons, and by Dan Crisan, Pierre Del Moral and Terry Lyons. Further developments in this field were made in 2000 by P. Del Moral, A. Guionnet and L. Miclo.


Definitions

There is no consensus on how ''Monte Carlo'' should be defined. For example, Ripley defines most probabilistic modeling as ''stochastic simulation'', with ''Monte Carlo'' being reserved for Monte Carlo integration and Monte Carlo statistical tests. Sawilowsky distinguishes between a simulation, a Monte Carlo method, and a Monte Carlo simulation: a simulation is a fictitious representation of reality, a Monte Carlo method is a technique that can be used to solve a mathematical or statistical problem, and a Monte Carlo simulation uses repeated sampling to obtain the statistical properties of some phenomenon (or behavior). Examples:
*Simulation: Drawing ''one'' pseudo-random uniform variable from the interval [0,1] can be used to simulate the tossing of a coin: If the value is less than or equal to 0.50 designate the outcome as heads, but if the value is greater than 0.50 designate the outcome as tails. This is a simulation, but not a Monte Carlo simulation.
*Monte Carlo method: Pouring out a box of coins on a table, and then computing the ratio of coins that land heads versus tails is a Monte Carlo method of determining the behavior of repeated coin tosses, but it is not a simulation.
*Monte Carlo simulation: Drawing ''a large number'' of pseudo-random uniform variables from the interval [0,1] at one time, or once at many different times, and assigning values less than or equal to 0.50 as heads and greater than 0.50 as tails, is a ''Monte Carlo simulation'' of the behavior of repeatedly tossing a coin (see the sketch below).

Kalos and Whitlock point out that such distinctions are not always easy to maintain. For example, the emission of radiation from atoms is a natural stochastic process. It can be simulated directly, or its average behavior can be described by stochastic equations that can themselves be solved using Monte Carlo methods. "Indeed, the same computer code can be viewed simultaneously as a 'natural simulation' or as a solution of the equations by natural sampling."
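As an illustration of the third case, a minimal Python sketch of a Monte Carlo simulation of repeated coin tossing (the sample size is an arbitrary choice):

```python
import random

def coin_toss_simulation(n_tosses=100_000):
    """Draw many pseudo-random uniforms on [0, 1]; values <= 0.50 count
    as heads, values > 0.50 as tails. Return the proportion of heads."""
    heads = sum(1 for _ in range(n_tosses) if random.random() <= 0.50)
    return heads / n_tosses

print(coin_toss_simulation())  # close to 0.5 for a large number of tosses
```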


Monte Carlo and random numbers

The main idea behind this method is that the results are computed based on repeated random sampling and statistical analysis. Monte Carlo simulation is, in essence, random experimentation, used when the results of such experiments are not well known. Monte Carlo simulations are typically characterized by many unknown parameters, many of which are difficult to obtain experimentally. Monte Carlo simulation methods do not always require truly random numbers to be useful (although, for some applications such as primality testing, unpredictability is vital). Many of the most useful techniques use deterministic, pseudorandom sequences, making it easy to test and re-run simulations. The only quality usually necessary to make good simulations is for the pseudo-random sequence to appear "random enough" in a certain sense. What this means depends on the application, but typically the sequence should pass a series of statistical tests. Testing that the numbers are uniformly distributed or follow another desired distribution when a large enough number of elements of the sequence are considered is one of the simplest and most common such tests. Weak correlations between successive samples are also often desirable/necessary.

Sawilowsky lists the characteristics of a high-quality Monte Carlo simulation:
*the (pseudo-random) number generator has certain characteristics (e.g. a long "period" before the sequence repeats)
*the (pseudo-random) number generator produces values that pass tests for randomness
*there are enough samples to ensure accurate results
*the proper sampling technique is used
*the algorithm used is valid for what is being modeled
*it simulates the phenomenon in question.

Pseudo-random number sampling algorithms are used to transform uniformly distributed pseudo-random numbers into numbers that are distributed according to a given probability distribution. Low-discrepancy sequences are often used instead of random sampling from a space as they ensure even coverage and normally have a faster order of convergence than Monte Carlo simulations using random or pseudorandom sequences. Methods based on their use are called quasi-Monte Carlo methods.

In an effort to assess the impact of random number quality on Monte Carlo simulation outcomes, astrophysical researchers tested cryptographically secure pseudorandom numbers generated via Intel's RDRAND instruction set, as compared to those derived from algorithms such as the Mersenne Twister, in Monte Carlo simulations of radio flares from brown dwarfs. RDRAND is the closest pseudorandom number generator to a true random number generator. No statistically significant difference was found between models generated with typical pseudorandom number generators and RDRAND for trials consisting of the generation of 10^7 random numbers.
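One standard pseudo-random number sampling technique is inverse transform sampling; the sketch below (the exponential distribution is chosen purely as an example) maps uniform draws through the inverse cumulative distribution function of the target distribution.

```python
import math
import random

def sample_exponential(rate):
    """Inverse transform sampling: if U is uniform on (0, 1), then
    -ln(1 - U) / rate follows an exponential distribution with the
    given rate, because that expression inverts the exponential CDF."""
    u = random.random()
    return -math.log(1.0 - u) / rate

samples = [sample_exponential(2.0) for _ in range(100_000)]
print(sum(samples) / len(samples))  # near 0.5, the mean of Exp(rate=2)
```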


Monte Carlo simulation versus "what if" scenarios

There are ways of using probabilities that are definitely not Monte Carlo simulations – for example, deterministic modeling using single-point estimates. Each uncertain variable within a model is assigned a "best guess" estimate. Scenarios (such as best, worst, or most likely case) for each input variable are chosen and the results recorded. By contrast, Monte Carlo simulations sample from a probability distribution for each variable to produce hundreds or thousands of possible outcomes. The results are analyzed to get probabilities of different outcomes occurring. For example, a comparison of a spreadsheet cost construction model run using traditional "what if" scenarios, and then running the comparison again with Monte Carlo simulation and triangular probability distributions, shows that the Monte Carlo analysis has a narrower range than the "what if" analysis. This is because the "what if" analysis gives equal weight to all scenarios (see quantifying uncertainty in corporate finance), while the Monte Carlo method hardly samples in the very low probability regions. The samples in such regions are called "rare events".
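A minimal sketch of this contrast, assuming a hypothetical two-input cost model with (low, most likely, high) estimates; the figures and variable names are invented for illustration:

```python
import random

# Hypothetical cost model: two uncertain inputs, each given as
# (low, most likely, high) estimates in arbitrary currency units.
labor = (80.0, 100.0, 140.0)
materials = (50.0, 60.0, 90.0)

# "What if" scenarios: combine single-point estimates directly.
best_case = labor[0] + materials[0]
most_likely = labor[1] + materials[1]
worst_case = labor[2] + materials[2]

def draw(low, mode, high):
    # random.triangular takes its arguments in the order (low, high, mode)
    return random.triangular(low, high, mode)

# Monte Carlo simulation: sample each input from a triangular
# distribution and build an empirical distribution of total cost.
totals = sorted(draw(*labor) + draw(*materials) for _ in range(100_000))

print("what-if scenarios:", best_case, most_likely, worst_case)
print("monte carlo median:", totals[len(totals) // 2])
print("monte carlo 95th percentile:", totals[int(0.95 * len(totals))])
```

Because the sampled inputs rarely sit at their extremes simultaneously, the simulated totals cluster more tightly than the equal-weighted best/worst scenarios, consistent with the narrower range noted above.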


Applications

Monte Carlo methods are especially useful for simulating phenomena with significant uncertainty in inputs and systems with many coupled degrees of freedom. Areas of application include:


Physical sciences

Monte Carlo methods are very important in computational physics, physical chemistry, and related applied fields, and have diverse applications from complicated quantum chromodynamics calculations to designing heat shields and aerodynamic forms, as well as in modeling radiation transport for radiation dosimetry calculations. In statistical physics, Monte Carlo molecular modeling is an alternative to computational molecular dynamics, and Monte Carlo methods are used to compute statistical field theories.