CMA-ES
Covariance matrix adaptation evolution strategy (CMA-ES) is a particular kind of strategy for numerical optimization. Evolution strategies (ES) are stochastic, derivative-free methods for numerical optimization of non-linear or non-convex continuous optimization problems. They belong to the class of evolutionary algorithms and evolutionary computation. An evolutionary algorithm is broadly based on the principle of biological evolution, namely the repeated interplay of variation (via recombination and mutation) and selection: in each generation (iteration), new individuals (candidate solutions, denoted as x) are generated by variation, usually in a stochastic way, of the current parental individuals. Then, some individuals are selected to become the parents of the next generation based on their fitness or objective function value f(x). In this way, over the sequence of generations, individuals with better and better f-values are generated. In an evolution strategy, new candidate solutions ...
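A minimal sketch of this generate-and-select loop in Python. It illustrates only the sampling and selection structure, not the full CMA-ES: the covariance matrix and step size are held fixed here, and the objective, population sizes and other parameter values are arbitrary choices of this sketch, not taken from the text above.

    import numpy as np

    def sphere(x):
        # illustrative objective f(x); smaller is better
        return float(np.dot(x, x))

    def simple_es(f, mean, sigma=0.5, lam=20, mu=5, generations=100, seed=0):
        """Toy (mu, lambda) evolution strategy with an isotropic Gaussian;
        CMA-ES additionally adapts the full covariance matrix and sigma."""
        rng = np.random.default_rng(seed)
        n = len(mean)
        mean = np.asarray(mean, dtype=float)
        for _ in range(generations):
            # variation: sample lambda candidates around the current mean
            candidates = mean + sigma * rng.standard_normal((lam, n))
            # selection: keep the mu candidates with the best f-values
            fitness = np.array([f(c) for c in candidates])
            parents = candidates[np.argsort(fitness)[:mu]]
            # recombination: the new mean is the average of the selected parents
            mean = parents.mean(axis=0)
        return mean, f(mean)

    best, best_f = simple_es(sphere, mean=np.ones(5))
    print(best, best_f)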


Evolution Strategy
In computer science, an evolution strategy (ES) is an optimization technique based on ideas of evolution. It belongs to the general class of evolutionary computation or artificial evolution methodologies.
History
The 'evolution strategy' optimization technique was created in the early 1960s and developed further in the 1970s and later by Ingo Rechenberg, Hans-Paul Schwefel and their co-workers.
Methods
Evolution strategies use natural problem-dependent representations, and primarily mutation and selection, as search operators. In common with evolutionary algorithms, the operators are applied in a loop. An iteration of the loop is called a generation. The sequence of generations is continued until a termination criterion is met. For real-valued search spaces, mutation is performed by adding a normally distributed random vector. The step size or mutation strength (i.e. the standard deviation of the normal distribution) is often governed by self-adaptation (see evolution window). ...
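A toy (1, lambda)-ES in Python illustrating Gaussian mutation with mutative self-adaptation of the step size. The learning rate tau = 1/sqrt(n), the Rastrigin test function and all other settings are illustrative assumptions, not prescribed by the text above.

    import numpy as np

    def rastrigin(x):
        # illustrative multimodal objective; smaller is better
        return float(10 * len(x) + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x)))

    def self_adaptive_es(f, x0, sigma0=1.0, lam=10, generations=200, seed=1):
        """Toy (1, lambda)-ES with mutative self-adaptation: each offspring first
        perturbs its own step size, then mutates x with it."""
        rng = np.random.default_rng(seed)
        n = len(x0)
        tau = 1.0 / np.sqrt(n)                # common learning-rate heuristic
        x, sigma = np.asarray(x0, dtype=float), sigma0
        for _ in range(generations):
            offspring = []
            for _ in range(lam):
                s = sigma * np.exp(tau * rng.standard_normal())   # self-adapt step size
                y = x + s * rng.standard_normal(n)                # Gaussian mutation
                offspring.append((f(y), y, s))
            _, x, sigma = min(offspring, key=lambda t: t[0])      # comma-selection
        return x, f(x)

    print(self_adaptive_es(rastrigin, x0=np.full(5, 3.0)))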


Derivative-free Optimization
Derivative-free optimization is a discipline in mathematical optimization that does not use derivative information in the classical sense to find optimal solutions: sometimes information about the derivative of the objective function ''f'' is unavailable, unreliable or impractical to obtain. For example, ''f'' might be non-smooth, time-consuming to evaluate, or noisy, so that methods that rely on derivatives or approximate them via finite differences are of little use. The problem of finding optimal points in such situations is referred to as derivative-free optimization, and algorithms that do not use derivatives or finite differences are called derivative-free algorithms.
Introduction
The problem to be solved is to numerically optimize an objective function f\colon A\to\mathbb{R} for some set A (usually A\subset\mathbb{R}^n), i.e. find x_0\in A such that, without loss of generality, f(x_0)\leq f(x) for all x\in A. When applicable, a common approach is to iteratively improve a pa ...
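A minimal derivative-free method in this spirit, pure random search over a box A, sketched in Python. The box bounds, evaluation budget and the non-smooth test function are illustrative assumptions.

    import numpy as np

    def random_search(f, lower, upper, evaluations=10_000, seed=0):
        """Toy derivative-free optimizer: sample A = [lower, upper]^n uniformly
        and keep the best point seen; only f-values are used, no derivatives."""
        rng = np.random.default_rng(seed)
        lower, upper = np.asarray(lower, float), np.asarray(upper, float)
        best_x, best_f = None, np.inf
        for _ in range(evaluations):
            x = rng.uniform(lower, upper)
            fx = f(x)
            if fx < best_f:
                best_x, best_f = x, fx
        return best_x, best_f

    # illustrative non-smooth objective where gradient-based methods struggle
    f = lambda x: float(np.sum(np.abs(x)) + 0.1 * np.sum(np.floor(x) ** 2))
    print(random_search(f, lower=[-5, -5, -5], upper=[5, 5, 5]))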


Estimation Of Distribution Algorithms
''Estimation of distribution algorithms'' (EDAs), sometimes called ''probabilistic model-building genetic algorithms'' (PMBGAs), are stochastic optimization methods that guide the search for the optimum by building and sampling explicit probabilistic models of promising candidate solutions. Optimization is viewed as a series of incremental updates of a probabilistic model, starting with the model encoding an uninformative prior over admissible solutions and ending with the model that generates only the global optima. EDAs belong to the class of evolutionary algorithms. The main difference between EDAs and most conventional evolutionary algorithms is that evolutionary algorithms generate new candidate solutions using an ''implicit'' distribution defined by one or more variation operators, whereas EDAs use an ''explicit'' probability distribution encoded by a Bayesian network, a multivariate normal distribution, or another model class. Like other evolutionary algorithms, EDAs ...
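A sketch of a simple univariate EDA (in the style of UMDA) on bit-strings. The OneMax objective is assumed purely for illustration, and the explicit model here is a vector of independent per-bit probabilities rather than a Bayesian network or multivariate normal.

    import numpy as np

    def umda_onemax(n_bits=30, pop_size=100, elite_frac=0.3, generations=60, seed=0):
        """Toy univariate EDA: the explicit model is a vector of independent
        per-bit probabilities, refit to the best candidates each generation
        and then resampled.  Objective: OneMax (count of ones)."""
        rng = np.random.default_rng(seed)
        p = np.full(n_bits, 0.5)                      # uninformative initial model
        n_elite = max(2, int(elite_frac * pop_size))
        for _ in range(generations):
            pop = (rng.random((pop_size, n_bits)) < p).astype(int)  # sample the model
            scores = pop.sum(axis=1)                  # OneMax fitness, higher is better
            elite = pop[np.argsort(-scores)[:n_elite]]
            p = np.clip(elite.mean(axis=0), 0.05, 0.95)             # refit the model
        best = (p > 0.5).astype(int)
        return best, int(best.sum())

    print(umda_onemax())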


Numerical Optimization
Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems of sorts arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries. In the more general approach, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the value of the function. The generalization of optimization theory and techniques to other formulations constitutes a large area of applied mathematics. More generally, optimization includes finding "best available" values of some objective function given a define ...
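A short illustration of posing and solving such a problem numerically, minimizing a real function by systematically proposing input values and computing f. SciPy's general-purpose scipy.optimize.minimize routine and the standard Rosenbrock test function are assumptions of this sketch, not something prescribed by the text above.

    import numpy as np
    from scipy.optimize import minimize

    def rosenbrock(x):
        # classic test function; its minimum is at x = (1, ..., 1)
        return float(np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1 - x[:-1]) ** 2))

    result = minimize(rosenbrock, x0=np.zeros(4), method="Nelder-Mead")
    print(result.x, result.fun)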




Square Root Of A Matrix
In mathematics, the square root of a matrix extends the notion of square root from numbers to matrices. A matrix B is said to be a square root of A if the matrix product BB is equal to A. Some authors use the name ''square root'' or the notation A^{1/2} only for the specific case when A is positive semidefinite, to denote the unique matrix B that is positive semidefinite and such that BB = B^\textsf{T}B = A (for real-valued matrices, where B^\textsf{T} is the transpose of B). Less frequently, the name ''square root'' may be used for any factorization of a positive semidefinite matrix A as B^\textsf{T}B = A, as in the Cholesky factorization, even if BB \neq A. This distinct meaning is discussed separately.
Examples
In general, a matrix can have several square roots. In particular, if A = B^2 then A=(-B)^2 as well. The 2×2 identity matrix \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} has infinitely many square roots. They are given by \begin{pmatrix}\pm 1 & 0\\ 0 & \pm 1\end{pmatrix} and \begin{pmatrix}a & b\\ c & -a\end{pmatrix} where (a, b, c) are any numbers (real or complex) such that a^2+bc=1 ...
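A quick numerical check of the examples above, assuming NumPy and SciPy are available; the particular values of a, b, c and the matrix A are arbitrary choices of this sketch.

    import numpy as np
    from scipy.linalg import sqrtm

    # Any matrix [[a, b], [c, -a]] with a^2 + bc = 1 squares to the 2x2 identity.
    a, b = 3.0, 2.0
    c = (1 - a ** 2) / b           # choose c so that a^2 + bc = 1
    B = np.array([[a, b], [c, -a]])
    print(B @ B)                   # numerically the 2x2 identity matrix

    # For a positive semidefinite A, the positive semidefinite square root is
    # unique; scipy.linalg.sqrtm computes a matrix square root numerically.
    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    R = sqrtm(A)
    print(np.allclose(R @ R, A))   # True: R is a square root of A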


Exponential Decay
A quantity is subject to exponential decay if it decreases at a rate proportional to its current value. Symbolically, this process can be expressed by the following differential equation, where N is the quantity and \lambda (lambda) is a positive rate called the exponential decay constant, disintegration constant, rate constant, or transformation constant: \frac{dN}{dt} = -\lambda N. The solution to this equation (see derivation below) is: N(t) = N_0 e^{-\lambda t}, where N(t) is the quantity at time t, and N_0 is the initial quantity, that is, the quantity at time t = 0.
Measuring rates of decay
Mean lifetime
If the decaying quantity, ''N''(''t''), is the number of discrete elements in a certain set, it is possible to compute the average length of time that an element remains in the set. This is called the mean lifetime (or simply the lifetime), where the exponential time constant, \tau, relates to the decay rate constant, \lambda, in the following way: \tau = \frac{1}{\lambda}. The mean lifetime can be looked at as a ...
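One standard separation-of-variables derivation of the solution, and of the mean lifetime as the expectation of the exponential lifetime distribution, written out in LaTeX (a sketch, not necessarily the derivation elided above):

    \frac{dN}{dt} = -\lambda N
    \;\Longrightarrow\; \int_{N_0}^{N(t)} \frac{dN}{N} = -\int_0^{t} \lambda \, dt'
    \;\Longrightarrow\; \ln\frac{N(t)}{N_0} = -\lambda t
    \;\Longrightarrow\; N(t) = N_0 e^{-\lambda t}.

    \tau \;=\; \int_0^{\infty} t \, \lambda e^{-\lambda t} \, dt \;=\; \frac{1}{\lambda}.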


Positive-definite Matrix
In mathematics, a symmetric matrix M with real entries is positive-definite if the real number z^\textsf{T}Mz is positive for every nonzero real column vector z, where z^\textsf{T} is the transpose of z. More generally, a Hermitian matrix (that is, a complex matrix equal to its conjugate transpose) is positive-definite if the real number z^* Mz is positive for every nonzero complex column vector z, where z^* denotes the conjugate transpose of z. Positive semi-definite matrices are defined similarly, except that the scalars z^\textsf{T}Mz and z^* Mz are required to be positive ''or zero'' (that is, nonnegative). Negative-definite and negative semi-definite matrices are defined analogously. A matrix that is not positive semi-definite and not negative semi-definite is sometimes called indefinite. A matrix is thus positive-definite if and only if it is the matrix of a positive-definite quadratic form or Hermitian form. In other words, a matrix is positive-definite if and only if it defines a ...
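A small Python sketch of testing positive-definiteness numerically for a real symmetric matrix via its eigenvalues (all strictly positive, which is equivalent to z^T M z > 0 for every nonzero real z). The tolerance and the example matrices are arbitrary choices.

    import numpy as np

    def is_positive_definite(M, tol=1e-12):
        """Check positive-definiteness of a real symmetric matrix through its
        eigenvalues; eigvalsh is used because it assumes a symmetric input."""
        M = np.asarray(M, dtype=float)
        if not np.allclose(M, M.T):
            return False
        return bool(np.all(np.linalg.eigvalsh(M) > tol))

    M = np.array([[2.0, -1.0], [-1.0, 2.0]])
    print(is_positive_definite(M))                 # True
    z = np.array([3.0, 1.0])
    print(float(z @ M @ z))                        # positive for this nonzero z
    print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))  # False: eigenvalues 3 and -1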


Pseudocode
In computer science, pseudocode is a plain language description of the steps in an algorithm or another system. Pseudocode often uses structural conventions of a normal programming language, but is intended for human reading rather than machine reading. It typically omits details that are essential for machine understanding of the algorithm, such as variable declarations and language-specific code. The programming language is augmented with natural language description details, where convenient, or with compact mathematical notation. The purpose of using pseudocode is that it is easier for people to understand than conventional programming language code, and that it is an efficient and environment-independent description of the key principles of an algorithm. It is commonly used in textbooks and scientific publications to document algorithms and in planning of software and other algorithms. No broad standard for pseudocode syntax exists, as a program in pseudocode is not an executa ...
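A brief illustration of the contrast: the same (arbitrarily chosen) maximum-finding procedure, first as plain-language pseudocode written as comments, then as executable Python with the declarations and language-specific details the pseudocode omits.

    # Pseudocode (plain language, no declarations or language-specific syntax):
    #   best <- first element of the list
    #   for each remaining element x in the list:
    #       if x is greater than best, set best to x
    #   return best

    def maximum(values):
        best = values[0]
        for x in values[1:]:
            if x > best:
                best = x
        return best

    print(maximum([3, 7, 1, 9, 4]))   # 9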




Premature Convergence
In evolutionary algorithms (EA), the term premature convergence means that a population for an optimization problem converged too early, resulting in a suboptimal outcome. In this context, the parental solutions, through the aid of genetic operators, are not able to generate offspring that are superior to, or outperform, their parents. Premature convergence is a common problem found in evolutionary algorithms in general and genetic algorithms in particular, as it leads to a loss, or convergence of, a large number of alleles, subsequently making it very difficult to search for a specific gene in which the alleles were present (Baker, J.E. & Grefenstette, J. (2014). ''Proceedings of the First International Conference on Genetic Algorithms and their Applications''. Hoboken: Taylor and Francis, pp. 101–105). An allele is considered lost if, in a population, all individuals share the same value for that particular gene. An allele is, as defined by De Jong, ...
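A small Python sketch of the "lost allele" condition described above for a bit-string population; the representation and the example population are illustrative assumptions.

    import numpy as np

    def lost_alleles(population):
        """For a population of bit-strings (one row per individual), a gene
        position has a lost allele when every individual carries the same value
        there; many such positions are one symptom of premature convergence."""
        population = np.asarray(population)
        return [i for i in range(population.shape[1])
                if len(np.unique(population[:, i])) == 1]

    pop = np.array([[1, 0, 1, 1],
                    [1, 1, 1, 0],
                    [1, 0, 1, 1]])
    print(lost_alleles(pop))   # [0, 2]: all individuals share one value there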


Cross-Entropy Method
The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective. The method approximates the optimal importance sampling estimator by repeating two phases (Rubinstein, R.Y. and Kroese, D.P. (2004), The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning, Springer-Verlag, New York):
1. Draw a sample from a probability distribution.
2. Minimize the ''cross-entropy'' between this distribution and a target distribution to produce a better sample in the next iteration.
Reuven Rubinstein developed the method in the context of ''rare event simulation'', where tiny probabilities must be estimated, for example in network reliability analysis, queueing models, or performance analysis of telecommunication systems. The method has also been applied to the traveling salesman, quadratic assignment, ...
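A toy CE method for continuous minimization, sketched in Python with a diagonal Gaussian sampling distribution: for a Gaussian, the cross-entropy minimization step reduces to refitting the mean and standard deviation to the elite samples. The objective, sample sizes and elite fraction are illustrative choices.

    import numpy as np

    def cross_entropy_method(f, mean, std, pop_size=100, elite_frac=0.1,
                             iterations=50, seed=0):
        """Toy CE method: draw a sample from the current Gaussian, then refit
        the Gaussian to the best (elite) samples and repeat."""
        rng = np.random.default_rng(seed)
        mean = np.asarray(mean, dtype=float)
        std = np.asarray(std, dtype=float)
        n_elite = max(2, int(elite_frac * pop_size))
        for _ in range(iterations):
            samples = rng.normal(mean, std, size=(pop_size, len(mean)))
            elite = samples[np.argsort([f(x) for x in samples])[:n_elite]]
            mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-12
        return mean, f(mean)

    print(cross_entropy_method(lambda x: float(np.sum(x ** 2)),
                               mean=np.full(3, 5.0), std=np.full(3, 3.0)))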


Principal Components Analysis
Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and enabling the visualization of multidimensional data. Formally, PCA is a statistical technique for reducing the dimensionality of a dataset. This is accomplished by linearly transforming the data into a new coordinate system where (most of) the variation in the data can be described with fewer dimensions than the initial data. Many studies use the first two principal components in order to plot the data in two dimensions and to visually identify clusters of closely related data points. Principal component analysis has applications in many fields such as population genetics, microbiome studies, and atmospheric science. The principal components of a collection of points in a real coordinate space are a sequence of p unit vectors, where the i ...
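A minimal PCA via the singular value decomposition in Python (NumPy assumed); the synthetic correlated data and the choice of two components are only for illustration.

    import numpy as np

    def pca(data, n_components=2):
        """Minimal PCA: center the data, then project it onto the unit vectors
        (principal components) that capture the most variance, obtained from
        the SVD of the centered data matrix."""
        X = np.asarray(data, dtype=float)
        X = X - X.mean(axis=0)                     # center each feature
        U, S, Vt = np.linalg.svd(X, full_matrices=False)
        components = Vt[:n_components]             # unit vectors, ordered by variance
        explained_variance = S[:n_components] ** 2 / (len(X) - 1)
        return X @ components.T, components, explained_variance

    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))   # correlated features
    scores, components, var = pca(data, n_components=2)
    print(scores.shape, var)        # (200, 2) projection of the kind used for 2-D plots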


Information Geometry
Information geometry is an interdisciplinary field that applies the techniques of differential geometry to study probability theory and statistics. It studies statistical manifolds, which are Riemannian manifolds whose points correspond to probability distributions.
Introduction
Historically, information geometry can be traced back to the work of C. R. Rao, who was the first to treat the Fisher matrix as a Riemannian metric. The modern theory is largely due to Shun'ichi Amari, whose work has been greatly influential on the development of the field. Classically, information geometry considered a parametrized statistical model as a Riemannian manifold. For such models, there is a natural choice of Riemannian metric, known as the Fisher information metric. In the special case that the statistical model is an exponential family, it is possible to induce the statistical manifold with a Hessian metric (i.e. a Riemannian metric given by the potential of a convex function). In thi ...
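The Fisher information metric mentioned above can be written explicitly; under the usual regularity conditions it takes the standard form below (a LaTeX sketch, where p(x | \theta) is the parametrized model):

    g_{ij}(\theta) \;=\; \mathbb{E}_{x \sim p(\cdot \mid \theta)}\!\left[
        \frac{\partial \log p(x \mid \theta)}{\partial \theta_i}\,
        \frac{\partial \log p(x \mid \theta)}{\partial \theta_j}
    \right]
    \;=\; -\,\mathbb{E}\!\left[
        \frac{\partial^2 \log p(x \mid \theta)}{\partial \theta_i \, \partial \theta_j}
    \right].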