Nested Sampling
   HOME

TheInfoList



OR:

The nested sampling algorithm is a
computation Computation is any type of arithmetic or non-arithmetic calculation that follows a well-defined model (e.g., an algorithm). Mechanical or electronic devices (or, historically, people) that perform computations are known as ''computers''. An es ...
al approach to the Bayesian statistics problems of comparing models and generating samples from posterior distributions. It was developed in 2004 by
physicist A physicist is a scientist who specializes in the field of physics, which encompasses the interactions of matter and energy at all length and time scales in the physical universe. Physicists generally are interested in the root or ultimate caus ...
John Skilling.


Background

Bayes' theorem In probability theory and statistics, Bayes' theorem (alternatively Bayes' law or Bayes' rule), named after Thomas Bayes, describes the probability of an event, based on prior knowledge of conditions that might be related to the event. For examp ...
can be applied to a pair of competing models M_1 and M_2 for data D, one of which may be true (though which one is unknown) but which both cannot be true simultaneously. The posterior probability for M_1 may be calculated as: : \begin P(M_1\mid D) & = \frac \\ & = \frac \\ & = \frac \end The prior probabilities M_1 and M_2 are already known, as they are chosen by the researcher ahead of time. However, the remaining Bayes factor P(D\mid M_2)/P(D\mid M_1) is not so easy to evaluate, since in general it requires marginalizing nuisance parameters. Generally, M_1 has a set of parameters that can be grouped together and called \theta, and M_2 has its own vector of parameters that may be of different dimensionality, but is still termed \theta. The marginalization for M_1 is : P(D\mid M_1) = \int d \theta \, P(D\mid \theta,M_1) P(\theta\mid M_1) and likewise for M_2. This integral is often analytically intractable, and in these cases it is necessary to employ a numerical algorithm to find an approximation. The nested sampling algorithm was developed by John Skilling specifically to approximate these marginalization integrals, and it has the added benefit of generating samples from the posterior distribution P(\theta\mid D,M_1). It is an alternative to methods from the Bayesian literature such as bridge sampling and defensive importance sampling. Here is a simple version of the nested sampling algorithm, followed by a description of how it computes the marginal probability density Z=P(D\mid M) where M is M_1 or M_2: Start with N points \theta_1,\ldots,\theta_N sampled from prior. for i=1 to j do % The number of iterations j is chosen by guesswork. L_i := \min(current likelihood values of the points); X_i := \exp(-i/N); w_i := X_ - X_i Z := Z + L_i\cdot w_i; Save the point with least likelihood as a sample point with weight w_i. Update the point with least likelihood with some
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
steps according to the prior, accepting only steps that keep the likelihood above L_i. end return Z; At each iteration, X_i is an estimate of the amount of prior mass covered by the hypervolume in parameter space of all points with likelihood greater than \theta_i. The weight factor w_i is an estimate of the amount of prior mass that lies between two nested hypersurfaces \ and \. The update step Z := Z+L_i w_i computes the sum over i of L_i w_i to numerically approximate the integral : \begin P(D\mid M) &= \int P(D\mid \theta,M) P(\theta\mid M) \,d \theta \\ &= \int P(D\mid \theta,M) \,dP(\theta\mid M) \end In the limit j \to \infty, this estimator has a positive bias of order 1 / N which can be removed by using (1 - 1/N) instead of the \exp (-1/N) in the above algorithm. The idea is to subdivide the range of f(\theta) = P(D\mid\theta,M) and estimate, for each interval (\theta_), f(\theta_i)/math>, how likely it is a priori that a randomly chosen \theta would map to this interval. This can be thought of as a Bayesian's way to numerically implement
Lebesgue integration In mathematics, the integral of a non-negative function of a single variable can be regarded, in the simplest case, as the area between the graph of that function and the -axis. The Lebesgue integral, named after French mathematician Henri Lebe ...
.


Implementations

Example implementations demonstrating the nested sampling algorithm are publicly available for download, written in several
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s. * Simple examples in C, R, or
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
are on John Skilling's website. * A Haskell port of the above simple codes is on Hackage. * An example in R originally designed for fitting spectra is described at and is on GitHub. * An example in C++, named Diamonds, is on GitHub. * A highly modular
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
parallel example for statistical physics and
condensed matter physics Condensed matter physics is the field of physics that deals with the macroscopic and microscopic physical properties of matter, especially the solid and liquid phases which arise from electromagnetic forces between atoms. More generally, the sub ...
uses is on GitHub. * pymatnest is a
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
package designed for exploring the
energy landscape An energy landscape is a mapping of possible states of a system. The concept is frequently used in physics, chemistry, and biochemistry, e.g. to describe all possible conformations of a molecular entity, or the spatial positions of interacting mole ...
of different materials, calculating thermodynamic variables at arbitrary temperatures and locating phase transitions is on GitHub. * The MultiNest software package is capable of performing nested sampling on multi-modal posterior distributions. It has interfaces for C++, Fortran and Python inputs, and is available on GitHub. * PolyChord is another nested sampling software package available on GitHub. PolyChord's computational efficiency scales better with an increase in the number of parameters than MultiNest, meaning PolyChord can be more efficient for high dimensional problems. * NestedSamplers.jl a Julia package for implementing single- and multi-ellipsoidal nested sampling algorithms is on GitHub.
Korali
is a high-performance framework for uncertainty quantification, optimization, and deep reinforcement learning, which also implements nested sampling.


Applications

Since nested sampling was proposed in 2004, it has been used in many aspects of the field of
astronomy Astronomy () is a natural science that studies astronomical object, celestial objects and phenomena. It uses mathematics, physics, and chemistry in order to explain their origin and chronology of the Universe, evolution. Objects of interest ...
. One paper suggested using nested sampling for cosmological model selection and object detection, as it "uniquely combines accuracy, general applicability and computational feasibility." A refinement of the algorithm to handle multimodal posteriors has been suggested as a means to detect astronomical objects in extant datasets. Other applications of nested sampling are in the field of
finite element updating Finite element model updating is the process of ensuring that finite element analysis results in models that better reflect the measured data than the initial models. It is part of verification and validation of numerical models. The process The ...
where the algorithm is used to choose an optimal finite element model, and this was applied to structural dynamics. This sampling method has also been used in the field of materials modeling. It can be used to learn the partition function from
statistical mechanics In physics, statistical mechanics is a mathematical framework that applies statistical methods and probability theory to large assemblies of microscopic entities. It does not assume or postulate any natural laws, but explains the macroscopic be ...
and derive
thermodynamic Thermodynamics is a branch of physics that deals with heat, work, and temperature, and their relation to energy, entropy, and the physical properties of matter and radiation. The behavior of these quantities is governed by the four laws of ther ...
properties.


Dynamic nested sampling

Dynamic nested sampling is a generalisation of the nested sampling algorithm in which the number of samples taken in different regions of the parameter space is dynamically adjusted to maximise calculation accuracy. This can lead to large improvements in accuracy and computational efficiency when compared to the original nested sampling algorithm, in which the allocation of samples cannot be changed and often many samples are taken in regions which have little effect on calculation accuracy. Publicly available dynamic nested sampling software packages include: * - a Python implementation of dynamic nested sampling which can be downloaded from GitHub. * dyPolyChord: a software package which can be used with Python, C++ and Fortran likelihood and prior distributions. dyPolyChord is available on GitHub. Dynamic nested sampling has been applied to a variety of scientific problems, including analysis of gravitational waves, mapping distances in space and exoplanet detection.


See also

*
Bayesian model comparison The Bayes factor is a ratio of two competing statistical models represented by their marginal likelihood, and is used to quantify the support for one model over the other. The models in questions can have a common set of parameters, such as a nul ...
* List of algorithms


References

{{Reflist Bayesian statistics Model selection Randomized algorithms