In statistics and physics, the multicanonical ensemble (also called multicanonical sampling or flat histogram) is a Markov chain Monte Carlo sampling technique that uses the Metropolis–Hastings algorithm to compute integrals where the integrand has a rough landscape with multiple local minima. It samples states according to the inverse of the density of states, which has to be known a priori or be computed using other techniques like the Wang and Landau algorithm. Multicanonical sampling is an important technique for spin systems like the Ising model or spin glasses.


Motivation

In systems with a large number of degrees of freedom, like spin systems, Monte Carlo integration is required. In this integration, importance sampling, and in particular the Metropolis algorithm, is a very important technique. However, the Metropolis algorithm samples states according to \exp(-\beta E), where \beta is the inverse temperature. This means that an energy barrier of \Delta E on the energy spectrum is exponentially difficult to overcome. Systems with multiple local energy minima, like the Potts model, become hard to sample as the algorithm gets stuck in the system's local minima. This motivates other approaches, namely, other sampling distributions.
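As an illustration, the following minimal Python sketch implements a single-spin Metropolis update for a one-dimensional Ising chain; the chain length, the coupling J = 1, and the helper names are assumptions made only for the example. A move that raises the energy by \Delta E is accepted with probability \exp(-\beta \Delta E), which is why barriers become exponentially hard to cross at low temperature.

```python
import numpy as np

def metropolis_sweep(spins, beta, rng):
    """One Metropolis sweep over a 1D Ising chain with periodic boundaries.

    A proposed single-spin flip that raises the energy by dE is accepted
    with probability exp(-beta * dE), so a barrier of height dE becomes
    exponentially hard to cross as beta (the inverse temperature) grows.
    """
    n = len(spins)
    for _ in range(n):
        i = rng.integers(n)
        # Energy change of flipping spin i (coupling J = 1, nearest neighbours).
        dE = 2.0 * spins[i] * (spins[(i - 1) % n] + spins[(i + 1) % n])
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i] *= -1
    return spins

rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=64)
for _ in range(100):
    metropolis_sweep(spins, beta=1.0, rng=rng)
```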


Overview

The multicanonical ensemble uses the Metropolis–Hastings algorithm with a sampling distribution given by the inverse of the density of states of the system, contrary to the sampling distribution \exp(-\beta E) of the Metropolis algorithm. With this choice, the number of states sampled at each energy is, on average, constant, i.e. it is a simulation with a "flat histogram" on energy. This leads to an algorithm for which the energy barriers are no longer difficult to overcome. Another advantage over the Metropolis algorithm is that the sampling is independent of the temperature of the system, which means that one simulation allows the estimation of thermodynamic variables for all temperatures (hence the name "multicanonical": several temperatures). This is a great improvement in the study of first-order phase transitions.

The biggest problem in performing a multicanonical ensemble is that the density of states has to be known ''a priori''. One important contribution to multicanonical sampling was the Wang and Landau algorithm, which asymptotically converges to a multicanonical ensemble while calculating the density of states during the convergence.

The multicanonical ensemble is not restricted to physical systems. It can be employed on abstract systems which have a cost function ''F''. By using the density of states with respect to ''F'', the method becomes general for computing higher-dimensional integrals or finding local minima.
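As a concrete illustration of why a single run covers all temperatures: once the density of states g(E) is available (for example from a Wang and Landau run), canonical averages at any inverse temperature \beta follow by reweighting. The sketch below assumes a discrete spectrum stored in arrays named energies and log_g; these names and the toy two-level spectrum are illustrative only.

```python
import numpy as np

def canonical_average(energies, log_g, observable, beta):
    """Canonical average of observable(E) at inverse temperature beta,
    computed from the density of states g(E) with a log-sum-exp trick."""
    log_w = log_g - beta * energies      # log of g(E) * exp(-beta * E)
    log_w -= log_w.max()                 # subtract the maximum to avoid overflow
    w = np.exp(log_w)
    return np.sum(observable(energies) * w) / np.sum(w)

# Toy two-level spectrum: energy 0 with degeneracy 1, energy 1 with degeneracy 2.
energies = np.array([0.0, 1.0])
log_g = np.log(np.array([1.0, 2.0]))
for beta in (0.5, 1.0, 2.0):
    # The same (assumed) density of states serves every temperature.
    print(beta, canonical_average(energies, log_g, lambda e: e, beta))
```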


Motivation

Consider a system and its phase space \Omega characterized by a configuration \boldsymbol{r} \in \Omega and a "cost" function ''F'' from the system's phase space to a one-dimensional space \Gamma: F(\Omega) = \Gamma = [\Gamma_\min, \Gamma_\max], the spectrum of ''F''. The computation of an average quantity \langle Q\rangle over the phase space requires the evaluation of an integral:

: \langle Q\rangle = \int_\Omega Q(\boldsymbol{r})\, P_r(\boldsymbol{r}) \,d\boldsymbol{r}

where P_r(\boldsymbol{r}) is the weight of each state (e.g. P_r(\boldsymbol{r}) = 1/V corresponds to uniformly distributed states).

When ''Q'' does not depend on the particular state but only on the value F(\boldsymbol{r}) = F_{\boldsymbol{r}} of the state, the formula for \langle Q\rangle can be integrated over ''f'' by adding a Dirac delta function and written as

: \begin{align}
\langle Q\rangle & = \int_{\Gamma_\min}^{\Gamma_\max} \int_\Omega Q(F_{\boldsymbol{r}})\, P_r(\boldsymbol{r})\, \delta(f - F_{\boldsymbol{r}}) \,d\boldsymbol{r} \,df \\
& = \int_{\Gamma_\min}^{\Gamma_\max} Q(f) \int_\Omega \delta(f - F_{\boldsymbol{r}})\, P_r(\boldsymbol{r}) \,d\boldsymbol{r} \, df \\
& = \int_{\Gamma_\min}^{\Gamma_\max} Q(f)\, P(f) \, df
\end{align}

where

: P(f) = \int_\Omega P_r(\boldsymbol{r})\,\delta(f - F(\boldsymbol{r})) \,d\boldsymbol{r}

is the marginal distribution of ''F''.

When the system has a large number of degrees of freedom, an analytical expression for \langle Q\rangle is often hard to obtain, and Monte Carlo integration is typically employed in the computation of \langle Q\rangle. In the simplest formulation, the method chooses ''N'' uniformly distributed states \boldsymbol{r}_i \in \Omega and uses the estimator

: \overline{Q}_N = \frac{1}{N}\sum_{i=1}^N Q(\boldsymbol{r}_i)\, V P_r(\boldsymbol{r}_i)

for computing \langle Q\rangle, because \overline{Q}_N converges almost surely to \langle Q\rangle by the strong law of large numbers:

: \lim_{N\to\infty} \overline{Q}_N = \langle Q\rangle.

One typical problem of this convergence is that the variance of ''Q'' can be very high, which leads to a high computational effort to achieve reasonable results.
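A direct transcription of this uniform-sampling estimator for a toy one-dimensional phase space \Omega = [0, V]; the names and the choice of ''Q'' are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)

V = 2.0                            # "volume" of the toy phase space [0, V]
Q = lambda r: r ** 2               # quantity whose average is sought
P_r = lambda r: 1.0 / V            # uniform weight of each state

N = 100_000
r = rng.uniform(0.0, V, size=N)            # N uniformly distributed states
Q_bar = np.mean(Q(r) * V * P_r(r))         # (1/N) * sum of Q(r_i) V P_r(r_i)
print(Q_bar)                               # tends to <Q> = V**2 / 3 as N grows
```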
To improve this convergence, the Metropolis–Hastings algorithm was proposed. Generally, the idea of Monte Carlo methods is to use importance sampling to improve the convergence of the estimator \overline{Q}_N by sampling states according to an ''arbitrary'' distribution \pi(\boldsymbol{r}) and using the appropriate estimator

: \overline{Q}_N = \frac{1}{N}\sum_{i=1}^N Q(\boldsymbol{r}_i)\, \pi^{-1}(\boldsymbol{r}_i)\, P_r(\boldsymbol{r}_i).

This estimator generalizes the estimator of the mean for samples drawn from an arbitrary distribution: when \pi(\boldsymbol{r}) is the uniform distribution, it reduces to the uniform-sampling estimator above.
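For comparison, the same toy integral evaluated with the importance-sampling estimator: states are drawn from an arbitrary distribution \pi and each sample is reweighted by \pi^{-1} P_r. The particular choice of \pi below (a linear density on [0, V], sampled by inverting its CDF) is only an illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

V = 2.0
Q = lambda r: r ** 2
P_r = lambda r: 1.0 / V                   # weight of each state (uniform here)

# Sampling distribution pi(r) = 2 r / V**2 on [0, V], drawn by inverting its CDF.
pi = lambda r: 2.0 * r / V ** 2
sample_pi = lambda n: V * np.sqrt(rng.uniform(size=n))

N = 100_000
r = sample_pi(N)
Q_bar = np.mean(Q(r) * P_r(r) / pi(r))    # (1/N) * sum of Q(r_i) pi^{-1}(r_i) P_r(r_i)
print(Q_bar)                              # again tends to <Q> = V**2 / 3
```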
When the system is a physical system in contact with a heat bath, each state \boldsymbol{r} is weighted according to the Boltzmann factor, P_r(\boldsymbol{r}) \propto \exp(-\beta F_{\boldsymbol{r}}). In Monte Carlo, the canonical ensemble is defined by choosing \pi(\boldsymbol{r}) to be proportional to P_r(\boldsymbol{r}). In this situation, the estimator corresponds to a simple arithmetic average:

: \overline{Q}_N = \frac{1}{N} \sum_{i=1}^N Q(\boldsymbol{r}_i).

Historically, this occurred because the original idea was to use the Metropolis–Hastings algorithm to compute averages on a system in contact with a heat bath, where the weight is given by the Boltzmann factor, P(\boldsymbol{r}) \propto \exp(-\beta E(\boldsymbol{r})).

While the sampling distribution \pi is often chosen to be the weight distribution P_r, this does not need to be the case. One situation where the canonical ensemble is not an efficient choice is when it takes an arbitrarily long time to converge, for example when the function ''F'' has multiple local minima. The computational cost for the algorithm to leave a specific region with a local minimum grows exponentially with the depth of the minimum: the deeper the minimum, the more time the algorithm spends there and the harder it is to leave. One way to avoid becoming stuck in local minima of the cost function is to make the sampling technique "invisible" to them. This is the basis of the multicanonical ensemble.


Multicanonical ensemble

The multicanonical ensemble is defined by choosing the sampling distribution to be

: \pi(\boldsymbol{r}) \propto \frac{1}{P(F_{\boldsymbol{r}})}

where P(f) is the marginal distribution of ''F'' defined above. The consequence of this choice is that the average number of samples with a given value of ''f'', m(f), is given by

: m(f) = \int_\Omega \delta(f - F_{\boldsymbol{r}})\, \pi(\boldsymbol{r})\, P_r(\boldsymbol{r})\,d\boldsymbol{r} \propto \int_\Omega \delta(f - F_{\boldsymbol{r}})\, P_r(\boldsymbol{r})\, \frac{1}{P(f)}\, d\boldsymbol{r} = \frac{1}{P(f)} \int_\Omega \delta(f - F_{\boldsymbol{r}})\, P_r(\boldsymbol{r})\, d\boldsymbol{r} = 1

that is, the average number of samples ''does not'' depend on ''f'': all costs ''f'' are equally sampled regardless of whether they are more or less probable. This motivates the name "flat histogram". For systems in contact with a heat bath, the sampling is independent of the temperature, and one simulation allows the study of all temperatures.
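A minimal sketch of a multicanonical Metropolis–Hastings step, assuming a toy system whose states are bit strings with cost ''F'' equal to the number of ones, so that the marginal P(f) is binomial and known exactly; the function and variable names are illustrative.

```python
import numpy as np
from math import comb

def multicanonical_step(state, log_P, rng):
    """One Metropolis-Hastings step with sampling weights pi(r) proportional
    to 1 / P(F(r)).  The proposal (flip one random bit) is symmetric and the
    prior weight P_r is uniform, so the acceptance ratio is P(F_old)/P(F_new)."""
    i = rng.integers(len(state))
    f_old = state.sum()
    f_new = f_old + (1 - 2 * state[i])           # cost after flipping bit i
    if np.log(rng.random()) < log_P[f_old] - log_P[f_new]:
        state[i] ^= 1                            # accept the flip
    return state

# Toy system: states are bit strings of length n and the cost F(r) is the
# number of ones, so the marginal P(f) is binomial and known exactly.
n = 12
log_P = np.log([comb(n, f) for f in range(n + 1)])
rng = np.random.default_rng(3)
state = np.zeros(n, dtype=int)
hist = np.zeros(n + 1)
for _ in range(200_000):
    state = multicanonical_step(state, log_P, rng)
    hist[state.sum()] += 1
print(hist / hist.sum())    # approximately flat over f = 0 .. n
```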


Tunneling time and critical slowing down

As in any other Monte Carlo method, there are correlations between the samples drawn from P(\boldsymbol{r}). A typical measure of the correlation is the ''tunneling time''. The tunneling time is defined as the number of Markov steps (of the Markov chain) the simulation needs to perform a round trip between the minimum and the maximum of the spectrum of ''F''. One motivation to use the tunneling time is that when the walk crosses the spectrum, it passes through the region of the maximum of the density of states, thus de-correlating the process. On the other hand, using round trips ensures that the system visits the whole spectrum.

Because the histogram is flat in the variable ''F'', a multicanonical ensemble can be seen as a diffusion process (i.e. a random walk) on the one-dimensional line of ''F'' values. Detailed balance of the process dictates that there is no drift in the process. This implies that the tunneling time, in local dynamics, should scale as in a diffusion process, and thus the tunneling time should scale quadratically with the size of the spectrum, ''N'':

: \tau_{tt} \propto N^2.

However, in some systems (the Ising model being the most paradigmatic), the scaling suffers from critical slowing down: it is N^{2+z}, where z > 0 depends on the particular system. Non-local dynamics were developed to improve the scaling to a quadratic scaling (see the Wolff algorithm), beating the critical slowing down. However, it is still an open question whether there is a local dynamics that does not suffer from critical slowing down in spin systems like the Ising model.
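A small sketch of the bookkeeping behind the tunneling time: an unbiased random walk on the ''F'' axis (a stand-in for ideal flat-histogram dynamics) is run between reflecting boundaries, and the number of steps per round trip grows roughly quadratically with the number of bins. All names and parameters are illustrative assumptions.

```python
import numpy as np

def mean_round_trip(n_bins, n_steps, rng):
    """Average number of steps per round trip (bin 0 -> bin n_bins-1 -> bin 0)
    of an unbiased random walk on the F axis with reflecting boundaries,
    mimicking ideal flat-histogram (diffusive) dynamics."""
    f, heading_up, last, times = 0, True, 0, []
    steps = rng.choice([-1, 1], size=n_steps)
    for t, step in enumerate(steps):
        f = min(max(f + step, 0), n_bins - 1)
        if heading_up and f == n_bins - 1:
            heading_up = False
        elif not heading_up and f == 0:
            heading_up = True
            times.append(t - last)               # one full round trip completed
            last = t
    return np.mean(times)

rng = np.random.default_rng(4)
for n_bins in (8, 16, 32):
    # The round-trip (tunneling) time grows roughly as n_bins**2.
    print(n_bins, mean_round_trip(n_bins, 200_000, rng))
```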


References

Berg, B.; Neuhaus, T. (1992). "Multicanonical ensemble: A new approach to simulate first-order phase transitions". Physical Review Letters 68 (1): 9–12. doi:10.1103/PhysRevLett.68.9. arXiv:hep-lat/9202004.
Wang, F.; Landau, D. (2001). "Efficient, Multiple-Range Random Walk Algorithm to Calculate the Density of States". Physical Review Letters 86 (10): 2050–2053. doi:10.1103/PhysRevLett.86.2050. arXiv:cond-mat/0011174.
Newman, M. E. J.; Barkema, G. T. (2002). Monte Carlo Methods in Statistical Physics. Oxford University Press. ISBN 0198517971.
Lee, J.; Choi, M. (1994). "Optimization by multicanonical annealing and the traveling salesman problem". Physical Review E 50 (2): R651–R654. doi:10.1103/PhysRevE.50.R651.
Metropolis, N.; Rosenbluth, A. W.; Rosenbluth, M. N.; Teller, A. H.; Teller, E. (1953). "Equation of State Calculations by Fast Computing Machines". The Journal of Chemical Physics 21 (6): 1087. doi:10.1063/1.1699114.
Robert, C.; Casella, G. (2004). Monte Carlo Statistical Methods. Springer. ISBN 978-0-387-21239-5.
Dayal, P.; Trebst, S.; Wessel, S.; Würtz, D.; Troyer, M.; Sabhapandit, S.; Coppersmith, S. (2004). "Performance Limitations of Flat-Histogram Methods". Physical Review Letters 92 (9): 097201. doi:10.1103/PhysRevLett.92.097201. arXiv:cond-mat/0306108.
Wolff, U. (1989). "Collective Monte Carlo Updating for Spin Systems". Physical Review Letters 62 (4): 361–364. doi:10.1103/PhysRevLett.62.361.