HOME

TheInfoList



OR:

The nested sampling algorithm is a
computation Computation is any type of arithmetic or non-arithmetic calculation that follows a well-defined model (e.g., an algorithm). Mechanical or electronic devices (or, historically, people) that perform computations are known as ''computers''. An es ...
al approach to the
Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about the event, ...
problems of comparing models and generating samples from posterior distributions. It was developed in 2004 by
physicist A physicist is a scientist who specializes in the field of physics, which encompasses the interactions of matter and energy at all length and time scales in the physical universe. Physicists generally are interested in the root or ultimate caus ...
John Skilling.


Background

Bayes' theorem In probability theory and statistics, Bayes' theorem (alternatively Bayes' law or Bayes' rule), named after Thomas Bayes, describes the probability of an event, based on prior knowledge of conditions that might be related to the event. For examp ...
can be applied to a pair of competing models M_1 and M_2 for data D, one of which may be true (though which one is unknown) but which both cannot be true simultaneously. The posterior probability for M_1 may be calculated as: : \begin P(M_1\mid D) & = \frac \\ & = \frac \\ & = \frac \end The prior probabilities M_1 and M_2 are already known, as they are chosen by the researcher ahead of time. However, the remaining
Bayes factor The Bayes factor is a ratio of two competing statistical models represented by their marginal likelihood, and is used to quantify the support for one model over the other. The models in questions can have a common set of parameters, such as a nu ...
P(D\mid M_2)/P(D\mid M_1) is not so easy to evaluate, since in general it requires marginalizing nuisance parameters. Generally, M_1 has a set of parameters that can be grouped together and called \theta, and M_2 has its own vector of parameters that may be of different dimensionality, but is still termed \theta. The marginalization for M_1 is : P(D\mid M_1) = \int d \theta \, P(D\mid \theta,M_1) P(\theta\mid M_1) and likewise for M_2. This integral is often analytically intractable, and in these cases it is necessary to employ a numerical algorithm to find an approximation. The nested sampling algorithm was developed by John Skilling specifically to approximate these marginalization integrals, and it has the added benefit of generating samples from the posterior distribution P(\theta\mid D,M_1). It is an alternative to methods from the Bayesian literature such as bridge sampling and defensive importance sampling. Here is a simple version of the nested sampling algorithm, followed by a description of how it computes the marginal probability density Z=P(D\mid M) where M is M_1 or M_2: Start with N points \theta_1,\ldots,\theta_N sampled from prior. for i=1 to j do % The number of iterations j is chosen by guesswork. L_i := \min(current likelihood values of the points); X_i := \exp(-i/N); w_i := X_ - X_i Z := Z + L_i\cdot w_i; Save the point with least likelihood as a sample point with weight w_i. Update the point with least likelihood with some
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
steps according to the prior, accepting only steps that keep the likelihood above L_i. end return Z; At each iteration, X_i is an estimate of the amount of prior mass covered by the hypervolume in parameter space of all points with likelihood greater than \theta_i. The weight factor w_i is an estimate of the amount of prior mass that lies between two nested hypersurfaces \ and \. The update step Z := Z+L_i w_i computes the sum over i of L_i w_i to numerically approximate the integral : \begin P(D\mid M) &= \int P(D\mid \theta,M) P(\theta\mid M) \,d \theta \\ &= \int P(D\mid \theta,M) \,dP(\theta\mid M) \end In the limit j \to \infty, this estimator has a positive bias of order 1 / N which can be removed by using (1 - 1/N) instead of the \exp (-1/N) in the above algorithm. The idea is to subdivide the range of f(\theta) = P(D\mid\theta,M) and estimate, for each interval (\theta_), f(\theta_i)/math>, how likely it is a priori that a randomly chosen \theta would map to this interval. This can be thought of as a Bayesian's way to numerically implement
Lebesgue integration In mathematics, the integral of a non-negative function of a single variable can be regarded, in the simplest case, as the area between the graph of that function and the -axis. The Lebesgue integral, named after French mathematician Henri Lebe ...
.


Implementations

Example implementations demonstrating the nested sampling algorithm are publicly available for download, written in several
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s. * Simple examples in C, R, or
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
are on John Skilling's website. * A
Haskell Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming lang ...
port of the above simple codes is on Hackage. * An example in R originally designed for fitting spectra is described at and is on GitHub. * An example in
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
, named Diamonds, is on GitHub. * A highly modular
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
parallel example for
statistical physics Statistical physics is a branch of physics that evolved from a foundation of statistical mechanics, which uses methods of probability theory and statistics, and particularly the Mathematics, mathematical tools for dealing with large populations ...
and
condensed matter physics Condensed matter physics is the field of physics that deals with the macroscopic and microscopic physical properties of matter, especially the solid and liquid phases which arise from electromagnetic forces between atoms. More generally, the sub ...
uses is on GitHub. * pymatnest is a
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
package designed for exploring the
energy landscape An energy landscape is a mapping of possible states of a system. The concept is frequently used in physics, chemistry, and biochemistry, e.g. to describe all possible conformations of a molecular entity, or the spatial positions of interacting mole ...
of different materials, calculating thermodynamic variables at arbitrary temperatures and locating
phase transitions In chemistry, thermodynamics, and other related fields, a phase transition (or phase change) is the physical process of transition between one state of a medium and another. Commonly the term is used to refer to changes among the basic states of ...
is on GitHub. * The MultiNest software package is capable of performing nested sampling on multi-modal posterior distributions. It has interfaces for C++, Fortran and Python inputs, and is available on GitHub. * PolyChord is another nested sampling software package available on GitHub. PolyChord's computational efficiency scales better with an increase in the number of parameters than MultiNest, meaning PolyChord can be more efficient for high dimensional problems. * NestedSamplers.jl a
Julia Julia is usually a feminine given name. It is a Latinate feminine form of the name Julio and Julius. (For further details on etymology, see the Wiktionary entry "Julius".) The given name ''Julia'' had been in use throughout Late Antiquity (e.g. ...
package for implementing single- and multi-ellipsoidal nested sampling algorithms is on GitHub.
Korali
is a high-performance framework for uncertainty quantification, optimization, and deep reinforcement learning, which also implements nested sampling.


Applications

Since nested sampling was proposed in 2004, it has been used in many aspects of the field of
astronomy Astronomy () is a natural science that studies astronomical object, celestial objects and phenomena. It uses mathematics, physics, and chemistry in order to explain their origin and chronology of the Universe, evolution. Objects of interest ...
. One paper suggested using nested sampling for
cosmological Cosmology () is a branch of physics and metaphysics dealing with the nature of the universe. The term ''cosmology'' was first used in English in 1656 in Thomas Blount's ''Glossographia'', and in 1731 taken up in Latin by German philosopher ...
model selection Model selection is the task of selecting a statistical model from a set of candidate models, given data. In the simplest cases, a pre-existing set of data is considered. However, the task can also involve the design of experiments such that the ...
and object detection, as it "uniquely combines accuracy, general applicability and computational feasibility." A refinement of the algorithm to handle multimodal posteriors has been suggested as a means to detect astronomical objects in extant datasets. Other applications of nested sampling are in the field of
finite element updating Finite element model updating is the process of ensuring that finite element analysis results in models that better reflect the measured data than the initial models. It is part of verification and validation of numerical models. The process The ...
where the algorithm is used to choose an optimal
finite element The finite element method (FEM) is a popular method for numerically solving differential equations arising in engineering and mathematical modeling. Typical problem areas of interest include the traditional fields of structural analysis, heat t ...
model, and this was applied to
structural dynamics Structural dynamics is a type of structural analysis which covers the behavior of a structure subjected to dynamic (actions having high acceleration) loading. Dynamic loads include people, wind, waves, traffic, earthquakes, and blasts. Any structur ...
. This sampling method has also been used in the field of materials modeling. It can be used to learn the partition function from
statistical mechanics In physics, statistical mechanics is a mathematical framework that applies statistical methods and probability theory to large assemblies of microscopic entities. It does not assume or postulate any natural laws, but explains the macroscopic be ...
and derive
thermodynamic Thermodynamics is a branch of physics that deals with heat, work, and temperature, and their relation to energy, entropy, and the physical properties of matter and radiation. The behavior of these quantities is governed by the four laws of ther ...
properties.


Dynamic nested sampling

Dynamic nested sampling is a generalisation of the nested sampling algorithm in which the number of samples taken in different regions of the parameter space is dynamically adjusted to maximise calculation accuracy. This can lead to large improvements in accuracy and computational efficiency when compared to the original nested sampling algorithm, in which the allocation of samples cannot be changed and often many samples are taken in regions which have little effect on calculation accuracy. Publicly available dynamic nested sampling software packages include: * - a Python implementation of dynamic nested sampling which can be downloaded from GitHub. * dyPolyChord: a software package which can be used with Python, C++ and Fortran likelihood and prior distributions. dyPolyChord is available on GitHub. Dynamic nested sampling has been applied to a variety of scientific problems, including analysis of gravitational waves, mapping distances in space and exoplanet detection.


See also

*
Bayesian model comparison The Bayes factor is a ratio of two competing statistical models represented by their marginal likelihood, and is used to quantify the support for one model over the other. The models in questions can have a common set of parameters, such as a nul ...
*
List of algorithms The following is a list of well-known algorithms along with one-line descriptions for each. Automated planning Combinatorial algorithms General combinatorial algorithms * Brent's algorithm: finds a cycle in function value iterations using on ...


References

{{Reflist Bayesian statistics Model selection Randomized algorithms