Chernoff Bound

picture info	Chernoff Bound In probability theory, a Chernoff bound is an exponentially decreasing upper bound on the tail of a random variable based on its moment generating function. The minimum of all such exponential bounds forms ''the'' Chernoff or Chernoff-Cramér bound, which may decay faster than exponential (e.g. sub-Gaussian). It is especially useful for sums of independent random variables, such as sums of Bernoulli random variables. The bound is commonly named after Herman Chernoff who described the method in a 1952 paper, though Chernoff himself attributed it to Herman Rubin. In 1938 Harald Cramér had published an almost identical concept now known as Cramér's theorem. It is a sharper bound than the first- or second-moment-based tail bounds such as Markov's inequality or Chebyshev's inequality, which only yield power-law bounds on tail decay. However, when applied to sums the Chernoff bound requires the random variables to be independent, a condition that is not required by either Markov's i ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Probability Theory Probability theory or probability calculus is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set of axioms of probability, axioms. Typically these axioms formalise probability in terms of a probability space, which assigns a measure (mathematics), measure taking values between 0 and 1, termed the probability measure, to a set of outcomes called the sample space. Any specified subset of the sample space is called an event (probability theory), event. Central subjects in probability theory include discrete and continuous random variables, probability distributions, and stochastic processes (which provide mathematical abstractions of determinism, non-deterministic or uncertain processes or measured Quantity, quantities that may either be single occurrences or evolve over time in a random fashion). Although it is no ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Cumulative Distribution Function In probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable X, or just distribution function of X, evaluated at x, is the probability that X will take a value less than or equal to x. Every probability distribution Support (measure theory), supported on the real numbers, discrete or "mixed" as well as Continuous variable, continuous, is uniquely identified by a right-continuous Monotonic function, monotone increasing function (a càdlàg function) F \colon \mathbb R \rightarrow [0,1] satisfying \lim_F(x)=0 and \lim_F(x)=1. In the case of a scalar continuous distribution, it gives the area under the probability density function from negative infinity to x. Cumulative distribution functions are also used to specify the distribution of multivariate random variables. Definition The cumulative distribution function of a real-valued random variable X is the function given by where the right-hand side represents the probability ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Binary Entropy Function Binary may refer to: Science and technology Mathematics * Binary number, a representation of numbers using only two values (0 and 1) for each digit * Binary function, a function that takes two arguments * Binary operation, a mathematical operation that takes two arguments * Binary relation, a relation involving two elements * Finger binary, a system for counting in binary numbers on the fingers of human hands Computing * Binary code, the representation of text and data using only the digits 1 and 0 * Bit, or binary digit, the basic unit of information in computers * Binary file, composed of something other than human-readable text ** Executable, a type of binary file that contains machine code for the computer to execute * Binary tree, a computer tree data structure in which each node has at most two children * Binary-coded decimal, a method for encoding for decimal digits in binary sequences Astronomy * Binary star, a star system with two stars in it * Binary planet, t ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Bernoulli Distribution In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli, is the discrete probability distribution of a random variable which takes the value 1 with probability p and the value 0 with probability q = 1-p. Less formally, it can be thought of as a model for the set of possible outcomes of any single experiment that asks a yes–no question. Such questions lead to outcome (probability), outcomes that are Boolean-valued function, Boolean-valued: a single bit whose value is success/yes and no, yes/Truth value, true/Binary code, one with probability ''p'' and failure/no/false (logic), false/Binary code, zero with probability ''q''. It can be used to represent a (possibly biased) coin toss where 1 and 0 would represent "heads" and "tails", respectively, and ''p'' would be the probability of the coin landing on heads (or vice versa where 1 would represent tails and ''p'' would be the probability of tails). In particular, unfair co ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Normal Distribution In probability theory and statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is f(x) = \frac e^\,. The parameter is the mean or expectation of the distribution (and also its median and mode), while the parameter \sigma^2 is the variance. The standard deviation of the distribution is (sigma). A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known. Their importance is partly due to the central limit theorem. It states that, under some conditions, the average of many samples (observations) of a random variable with finite mean and variance is itself a random variable—whose distribution c ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Degenerate Distribution In probability theory, a degenerate distribution on a measure space (E, \mathcal, \mu) is a probability distribution whose support is a null set with respect to \mu. For instance, in the -dimensional space endowed with the Lebesgue measure, any distribution concentrated on a -dimensional subspace with is a degenerate distribution on . This is essentially the same notion as a singular probability measure, but the term ''degenerate'' is typically used when the distribution arises as a limit of (non-degenerate) distributions. When the support of a degenerate distribution consists of a single point , this distribution is a Dirac measure in : it is the distribution of a deterministic random variable equal to with probability 1. This is a special case of a discrete distribution; its probability mass function equals 1 in and 0 everywhere else. In the case of a real-valued random variable, the cumulative distribution function of the degenerate distribution localized in is ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Logarithmically Concave Function In convex analysis, a non-negative function is logarithmically concave (or log-concave for short) if its domain is a convex set, and if it satisfies the inequality : f(\theta x + (1 - \theta) y) \geq f(x)^ f(y)^ for all and . If is strictly positive, this is equivalent to saying that the logarithm of the function, , is concave; that is, : \log f(\theta x + (1 - \theta) y) \geq \theta \log f(x) + (1-\theta) \log f(y) for all and . Examples of log-concave functions are the 0-1 indicator functions of convex sets (which requires the more flexible definition), and the Gaussian function. Similarly, a function is '' log-convex'' if it satisfies the reverse inequality : f(\theta x + (1 - \theta) y) \leq f(x)^ f(y)^ for all and . Properties * A log-concave function is also quasi-concave. This follows from the fact that the logarithm is monotone implying that the superlevel sets of this function are convex. * Every concave function that is nonnegative on it ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Logarithmically Convex Function In mathematics, a function ''f'' is logarithmically convex or superconvex if \circ f, the composition of the logarithm with ''f'', is itself a convex function. Definition Let be a convex subset of a real vector space, and let be a function taking non-negative values. Then is: * Logarithmically convex if \circ f is convex, and * Strictly logarithmically convex if \circ f is strictly convex. Here we interpret \log 0 as -\infty. Explicitly, is logarithmically convex if and only if, for all and all , the two following equivalent conditions hold: :\begin \log f(tx_1 + (1 - t)x_2) &\le t\log f(x_1) + (1 - t)\log f(x_2), \\ f(tx_1 + (1 - t)x_2) &\le f(x_1)^tf(x_2)^. \end Similarly, is strictly logarithmically convex if and only if, in the above two expressions, strict inequality holds for all . The above definition permits to be zero, but if is logarithmically convex and vanishes anywhere in , then it vanishes everywhere in the interior of . Equivalent conditions If is a di ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Moment-generating Function In probability theory and statistics, the moment-generating function of a real-valued random variable is an alternative specification of its probability distribution. Thus, it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or cumulative distribution functions. There are particularly simple results for the moment-generating functions of distributions defined by the weighted sums of random variables. However, not all random variables have moment-generating functions. As its name implies, the moment-generating function can be used to compute a distribution’s moments: the -th moment about 0 is the -th derivative of the moment-generating function, evaluated at 0. In addition to univariate real-valued distributions, moment-generating functions can also be defined for vector- or matrix-valued random variables, and can even be extended to more general cases. The moment-generating function of a real-valu ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Cumulant Generating Function In probability theory and statistics, the cumulants of a probability distribution are a set of quantities that provide an alternative to the '' moments'' of the distribution. Any two probability distributions whose moments are identical will have identical cumulants as well, and vice versa. The first cumulant is the mean, the second cumulant is the variance, and the third cumulant is the same as the third central moment. But fourth and higher-order cumulants are not equal to central moments. In some cases theoretical treatments of problems in terms of cumulants are simpler than those using moments. In particular, when two or more random variables are statistically independent, the th-order cumulant of their sum is equal to the sum of their th-order cumulants. As well, the third and higher-order cumulants of a normal distribution are zero, and it is the only distribution with this property. Just as for moments, where ''joint moments'' are used for collections of random variables ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Convex Conjugate In mathematics and mathematical optimization, the convex conjugate of a function is a generalization of the Legendre transformation which applies to non-convex functions. It is also known as Legendre–Fenchel transformation, Fenchel transformation, or Fenchel conjugate (after Adrien-Marie Legendre and Werner Fenchel). The convex conjugate is widely used for constructing the dual problem in optimization theory, thus generalizing Lagrangian duality. Definition Let X be a real topological vector space and let X^ be the dual space to X. Denote by :\langle \cdot , \cdot \rangle : X^ \times X \to \mathbb the canonical dual pairing, which is defined by \left\langle x^, x \right\rangle \mapsto x^ (x). For a function f : X \to \mathbb \cup \ taking values on the extended real number line, its is the function :f^ : X^ \to \mathbb \cup \ whose value at x^* \in X^ is defined to be the supremum: :f^ \left( x^ \right) := \sup \left\, or, equivalently, in terms of the in ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Legendre–Fenchel Transformation In mathematics and mathematical optimization, the convex conjugate of a function is a generalization of the Legendre transformation which applies to non-convex functions. It is also known as Legendre–Fenchel transformation, Fenchel transformation, or Fenchel conjugate (after Adrien-Marie Legendre and Werner Fenchel). The convex conjugate is widely used for constructing the dual problem in optimization theory, thus generalizing Lagrangian duality. Definition Let X be a real topological vector space and let X^ be the dual space to X. Denote by :\langle \cdot , \cdot \rangle : X^ \times X \to \mathbb the canonical dual pairing, which is defined by \left\langle x^, x \right\rangle \mapsto x^ (x). For a function f : X \to \mathbb \cup \ taking values on the extended real number line, its is the function :f^ : X^ \to \mathbb \cup \ whose value at x^* \in X^ is defined to be the supremum: :f^ \left( x^ \right) := \sup \left\, or, equivalently, in terms of the infimum: ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]