Stein's Lemma
Stein's lemma, named in honor of Charles Stein, is a theorem of probability theory that is of interest primarily because of its applications to statistical inference, in particular to James–Stein estimation and empirical Bayes methods, and its applications to portfolio choice theory. The theorem gives a formula for the covariance of one random variable with the value of a function of another, when the two random variables are jointly normally distributed.

Statement of the lemma

Suppose X is a normally distributed random variable with expectation μ and variance σ². Further suppose g is a function for which the two expectations E(g(X)(X − μ)) and E(g′(X)) both exist. (The existence of the expectation of any random variable is equivalent to the finiteness of the expectation of its absolute value.) Then

:E\bigl(g(X)(X-\mu)\bigr) = \sigma^2 E\bigl(g'(X)\bigr).

In general, suppose X and Y are jointly normally distributed ...
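The identity lends itself to a quick numerical check. The following sketch is not part of the source article; the choice g(x) = x³ (so g′(x) = 3x²) is an illustrative assumption:

```python
# Monte Carlo check of Stein's lemma: E[g(X)(X - mu)] = sigma^2 * E[g'(X)]
# for X ~ N(mu, sigma^2), with the illustrative choice g(x) = x^3, g'(x) = 3x^2.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0
x = rng.normal(mu, sigma, size=1_000_000)

lhs = np.mean(x**3 * (x - mu))       # E[g(X)(X - mu)]
rhs = sigma**2 * np.mean(3 * x**2)   # sigma^2 * E[g'(X)]
print(lhs, rhs)                      # the two estimates agree to Monte Carlo error
```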


Charles Stein (statistician)
Charles Max Stein (March 22, 1920 – November 24, 2016) was an American mathematical statistician and professor of statistics at Stanford University. He received his Ph.D. in 1947 at Columbia University under advisor Abraham Wald. He held faculty positions at Berkeley and the University of Chicago before moving permanently to Stanford in 1953. He is known for Stein's paradox in decision theory, which shows that ordinary least squares estimates can be uniformly improved when many parameters are estimated; for Stein's lemma, giving a formula for the covariance of one random variable with the value of a function of another when the two random variables are jointly normally distributed; and for Stein's method, a way of proving theorems such as the central limit theorem that does not require the variables to be independent and identically distributed. He was a member of the National Academy of Sciences. He died in November 2016 at the age of 96.

Works

*Approximate Computatio ...


Absolute Value
In mathematics, the absolute value or modulus of a real number x, denoted |x|, is the non-negative value of x without regard to its sign. Namely, |x| = x if x is a positive number, |x| = −x if x is negative (in which case negating x makes −x positive), and |0| = 0. For example, the absolute value of 3 is 3, and the absolute value of −3 is also 3. The absolute value of a number may be thought of as its distance from zero. Generalisations of the absolute value for real numbers occur in a wide variety of mathematical settings. For example, an absolute value is also defined for the complex numbers, the quaternions, ordered rings, fields and vector spaces. The absolute value is closely related to the notions of magnitude, distance, and norm in various mathematical and physical contexts.

Terminology and notation

In 1806, Jean-Robert Argand introduced the term "module", meaning "unit of measure" in French, specifically for the complex absolute value (Oxford English Dictionary, Draft Revision, June 2008) ...
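A minimal sketch of the piecewise definition in code (the helper name is hypothetical; Python's built-in abs already provides this):

```python
# Direct transcription of the definition: |x| = x if x >= 0, else -x.
def absolute_value(x: float) -> float:
    return x if x >= 0 else -x

assert absolute_value(3) == 3 and absolute_value(-3) == 3 and absolute_value(0) == 0
# The complex modulus generalizes this, matching Python's built-in abs:
assert abs(3 + 4j) == 5.0
```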


Stein Discrepancy
A Stein discrepancy is a statistical divergence between two probability measures that is rooted in Stein's method. It was first formulated as a tool to assess the quality of Markov chain Monte Carlo samplers (J. Gorham and L. Mackey, "Measuring Sample Quality with Stein's Method", Advances in Neural Information Processing Systems, 2015), but has since been used in diverse settings in statistics, machine learning and computer science.

Definition

Let \mathcal{X} be a measurable space and let \mathcal{M} be a set of measurable functions of the form m : \mathcal{X} \rightarrow \mathbb{R}. A natural notion of distance between two probability distributions P, Q, defined on \mathcal{X}, is provided by an integral probability metric

:(1.1) \quad d_{\mathcal{M}}(P, Q) := \sup_{m \in \mathcal{M}} \left| \mathbb{E}_{X \sim P}[m(X)] - \mathbb{E}_{Y \sim Q}[m(Y)] \right|,

where for the purposes of exposition we assume that the expectations exist, and that the set \mathcal{M} is sufficiently rich that (1.1) is indeed a metric on the set of probability distributions on \mathcal{X}, ...
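As a toy illustration (not from the source; the two Gaussians and the three test functions below are assumptions for the example), the supremum in (1.1) can be lower-bounded by maximizing over a small, explicit function class:

```python
# Crude Monte Carlo estimate of the IPM (1.1) over a small hand-picked class M.
import numpy as np

rng = np.random.default_rng(1)
xs = rng.normal(0.0, 1.0, size=100_000)  # samples from P = N(0, 1)
ys = rng.normal(0.5, 1.0, size=100_000)  # samples from Q = N(0.5, 1)

test_functions = [lambda t: t, lambda t: t**2, lambda t: np.tanh(t)]
ipm = max(abs(np.mean(m(xs)) - np.mean(m(ys))) for m in test_functions)
print(ipm)  # a lower bound on the IPM for any richer class containing these m
```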




Taylor Expansions For The Moments Of Functions Of Random Variables
In probability theory, it is possible to approximate the moments of a function f of a random variable X using Taylor expansions, provided that f is sufficiently differentiable and that the moments of X are finite.

First moment

Given \mu_X and \sigma^2_X, the mean and the variance of X, respectively (Haym Benaroya, Seon Mi Han, and Mark Nagurka, Probability Models in Engineering and Science, CRC Press, 2005, p. 166), a Taylor expansion of the expected value of f(X) can be found via

:\begin{align}
\operatorname{E}\left[f(X)\right] &= \operatorname{E}\left[f\left(\mu_X + \left(X - \mu_X\right)\right)\right] \\
&\approx \operatorname{E}\left[f(\mu_X) + f'(\mu_X)\left(X - \mu_X\right) + \frac{1}{2}f''(\mu_X)\left(X - \mu_X\right)^2\right] \\
&= f(\mu_X) + f'(\mu_X)\operatorname{E}\left[X - \mu_X\right] + \frac{1}{2}f''(\mu_X)\operatorname{E}\left[\left(X - \mu_X\right)^2\right].
\end{align}

Since \operatorname{E}[X - \mu_X] = 0, the second term vanishes. Also, \operatorname{E}[(X - \mu_X)^2] is \sigma_X^2. Therefore,

:\operatorname{E}\left[f(X)\right] \approx f(\mu_X) + \frac{f''(\mu_X)}{2}\sigma_X^2 ...
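A numerical sanity check of the second-order approximation (an illustration, not from the source; f(x) = eˣ with X normal is an assumed example, chosen because E[f(X)] is then known exactly):

```python
# Compare E[f(X)] (Monte Carlo and exact) with f(mu) + f''(mu) * sigma^2 / 2
# for f(x) = exp(x) and X ~ N(mu, sigma^2), where E[exp(X)] = exp(mu + sigma^2/2).
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 0.1, 0.3
x = rng.normal(mu, sigma, size=1_000_000)

mc = np.mean(np.exp(x))                          # Monte Carlo E[f(X)]
taylor = np.exp(mu) + np.exp(mu) * sigma**2 / 2  # f(mu) + f''(mu) sigma^2 / 2
exact = np.exp(mu + sigma**2 / 2)
print(mc, taylor, exact)  # all three are close for small sigma
```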


Stein's Method
Stein's method is a general method in probability theory to obtain bounds on the distance between two probability distributions with respect to a probability metric. It was introduced by Charles Stein, who first published it in 1972, to obtain a bound between the distribution of a sum of an m-dependent sequence of random variables and a standard normal distribution in the Kolmogorov (uniform) metric, and hence to prove not only a central limit theorem, but also bounds on the rates of convergence for the given metric.

History

At the end of the 1960s, unsatisfied with the by-then known proofs of a specific central limit theorem, Charles Stein developed a new way of proving the theorem for his statistics lecture (Charles Stein: The Invariant, the Direct and the "Pretentious", interview given ...)
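At the heart of the method for the normal distribution is the characterizing identity E[f′(Z) − Z f(Z)] = 0 for Z ~ N(0, 1) and suitably smooth f; how far this expectation is from zero for another distribution is what the method converts into a distance bound. A Monte Carlo sketch (the choice f(x) = x and the exponential comparison are illustrative assumptions):

```python
# Stein identity for the standard normal: E[f'(Z) - Z f(Z)] = 0.
# With f(x) = x this reads E[1 - Z^2] = 0.
import numpy as np

rng = np.random.default_rng(3)
z = rng.standard_normal(1_000_000)
print(np.mean(1.0 - z * z))  # near 0 for N(0, 1) samples

y = rng.exponential(1.0, size=1_000_000)
print(np.mean(1.0 - y * y))  # near -1 for Exp(1) samples: far from normal
```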


Exponential Family
In probability and statistics, an exponential family is a parametric set of probability distributions of a certain form, specified below. This special form is chosen for mathematical convenience, including enabling the user to calculate expectations and covariances by differentiation based on some useful algebraic properties, as well as for generality, as exponential families are in a sense very natural sets of distributions to consider. The term exponential class is sometimes used in place of "exponential family", or the older term Koopman–Darmois family. The terms "distribution" and "family" are often used loosely: specifically, an exponential family is a set of distributions, where the specific distribution varies with the parameter; however, a parametric family of distributions is often referred to as "a distribution" (like "the normal distribution", meaning "the family of normal distributions"), and the set of all exponential families is sometimes l ...
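Although the excerpt is truncated before the form itself, the standard exponential-family density can be sketched as p(x | θ) = h(x) exp(η(θ)·T(x) − A(η)). A minimal sketch instantiating this for the Bernoulli distribution (an assumed example, not from the excerpt):

```python
# Bernoulli pmf written in exponential-family form:
# p(x | p) = exp(eta * x - A(eta)), with eta = log(p / (1 - p)),
# A(eta) = log(1 + e^eta), T(x) = x, and h(x) = 1.
import math

def bernoulli_pmf_expfam(x: int, p: float) -> float:
    eta = math.log(p / (1 - p))                 # natural parameter
    log_partition = math.log1p(math.exp(eta))   # A(eta)
    return math.exp(eta * x - log_partition)

assert abs(bernoulli_pmf_expfam(1, 0.3) - 0.3) < 1e-12
assert abs(bernoulli_pmf_expfam(0, 0.3) - 0.7) < 1e-12
```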


Multivariate Normal
In probability theory and statistics, the multivariate normal distribution, multivariate Gaussian distribution, or joint normal distribution is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions. One definition is that a random vector is said to be k-variate normally distributed if every linear combination of its k components has a univariate normal distribution. Its importance derives mainly from the multivariate central limit theorem. The multivariate normal distribution is often used to describe, at least approximately, any set of (possibly) correlated real-valued random variables, each of which clusters around a mean value.

Definitions

Notation and parameterization

The multivariate normal distribution of a k-dimensional random vector \mathbf{X} = (X_1,\ldots,X_k)^{\mathsf{T}} can be written in the following notation:

:\mathbf{X}\ \sim\ \mathcal{N}(\boldsymbol\mu,\ \boldsymbol\Sigma) ...
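The linear-combination definition can be probed empirically; a sketch (the mean, covariance, and weights below are arbitrary assumptions for the example):

```python
# For multivariate normal X, any linear combination a . X is univariate normal
# with mean a . mu and variance a' Sigma a; check the moments empirically.
import numpy as np

rng = np.random.default_rng(4)
mu = np.array([1.0, -2.0, 0.5])
cov = np.array([[2.0, 0.3, 0.0],
                [0.3, 1.0, 0.4],
                [0.0, 0.4, 1.5]])
x = rng.multivariate_normal(mu, cov, size=200_000)

a = np.array([0.5, -1.0, 2.0])
proj = x @ a
print(proj.mean(), a @ mu)      # sample vs theoretical mean
print(proj.var(), a @ cov @ a)  # sample vs theoretical variance
```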


Isserlis' Theorem
In probability theory, Isserlis' theorem or Wick's probability theorem is a formula that allows one to compute higher-order moments of the multivariate normal distribution in terms of its covariance matrix. It is named after Leon Isserlis. This theorem is also particularly important in particle physics, where it is known as Wick's theorem after the work of Gian-Carlo Wick. Other applications include the analysis of portfolio returns, quantum field theory and generation of colored noise.

Statement

If (X_1,\dots,X_n) is a zero-mean multivariate normal random vector, then

:\operatorname{E}[X_1 X_2 \cdots X_n] = \sum_{p \in P_n^2} \prod_{\{i,j\} \in p} \operatorname{E}[X_i X_j] = \sum_{p \in P_n^2} \prod_{\{i,j\} \in p} \operatorname{Cov}(X_i, X_j),

where the sum is over all the pairings p of \{1,\ldots,n\}, i.e. all distinct ways of partitioning \{1,\ldots,n\} into pairs \{i,j\}, and the product is over the pairs contained in p. In his original paper, Leon Isserlis proves this theorem by mathematical induction, generalizing the formula for the 4th-order moments, which takes the appearance

:\operatorname{E}[X_1 X_2 X_3 X_4] = \operatorname{E}[X_1 X_2]\operatorname{E}[X_3 X_4] + \operatorname{E}[X_1 X_3]\operatorname{E}[X_2 X_4] + \operatorname{E}[X_1 X_4]\operatorname{E}[X_2 X_3] ...
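The fourth-order identity invites a direct Monte Carlo check (the covariance matrix below is an arbitrary positive-definite assumption):

```python
# Check E[X1 X2 X3 X4] = s12*s34 + s13*s24 + s14*s23 for zero-mean Gaussians.
import numpy as np

rng = np.random.default_rng(5)
cov = np.array([[1.0, 0.4, 0.2, 0.2],
                [0.4, 1.0, 0.2, 0.2],
                [0.2, 0.2, 1.0, 0.2],
                [0.2, 0.2, 0.2, 1.0]])
x = rng.multivariate_normal(np.zeros(4), cov, size=2_000_000)

lhs = np.mean(x[:, 0] * x[:, 1] * x[:, 2] * x[:, 3])
rhs = cov[0, 1] * cov[2, 3] + cov[0, 2] * cov[1, 3] + cov[0, 3] * cov[1, 2]
print(lhs, rhs)  # equal up to Monte Carlo error
```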




Integration By Substitution
In calculus, integration by substitution, also known as u-substitution, reverse chain rule or change of variables, is a method for evaluating integrals and antiderivatives. It is the counterpart to the chain rule for differentiation, and can loosely be thought of as using the chain rule "backwards".

Substitution for a single variable

Introduction

Before stating the result rigorously, consider a simple case using indefinite integrals. Compute \textstyle\int(2x^3+1)^7(x^2)\,dx. Set u = 2x^3+1. This means \textstyle\frac{du}{dx} = 6x^2, or in differential form, du = 6x^2\,dx. Now

:\int(2x^3+1)^7(x^2)\,dx = \frac{1}{6}\int\underbrace{(2x^3+1)^7}_{u^7}\underbrace{6x^2\,dx}_{du} = \frac{1}{6}\int u^7\,du = \frac{1}{6}\left(\frac{1}{8}u^8\right)+C = \frac{1}{48}(2x^3+1)^8+C,

where C is an arbitrary constant of integration. This procedure is frequently used, but not all integrals are of a form that permits its use. In any event, the result should be verified by differentiating and comparing to the original integrand:

:\frac{d}{dx}\left[\frac{1}{48}(2x^3+1)^8+C\right] = \frac{1}{48}\cdot 8(2x^3+1)^7\cdot 6x^2 = (2x^3+1)^7 x^2 ...
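The worked example can also be confirmed numerically (an illustration, not from the source; the interval [0, 1] is an arbitrary choice):

```python
# The antiderivative from the substitution should reproduce the definite integral.
from scipy.integrate import quad

integrand = lambda x: (2 * x**3 + 1) ** 7 * x**2
antiderivative = lambda x: (2 * x**3 + 1) ** 8 / 48

numeric, _ = quad(integrand, 0.0, 1.0)
exact = antiderivative(1.0) - antiderivative(0.0)
print(numeric, exact)  # agree to quadrature tolerance
```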


Integration By Parts
In calculus, and more generally in mathematical analysis, integration by parts or partial integration is a process that finds the integral of a product of functions in terms of the integral of the product of their derivative and antiderivative. It is frequently used to transform the antiderivative of a product of functions into an antiderivative for which a solution can be more easily found. The rule can be thought of as an integral version of the product rule of differentiation. The integration by parts formula states:

:\begin{align} \int_a^b u(x) v'(x)\,dx &= \Big[u(x) v(x)\Big]_a^b - \int_a^b u'(x) v(x)\,dx \\ &= u(b) v(b) - u(a) v(a) - \int_a^b u'(x) v(x)\,dx. \end{align}

Or, letting u = u(x) and du = u'(x)\,dx while v = v(x) and dv = v'(x)\,dx, the formula can be written more compactly:

:\int u\,dv \ =\ uv - \int v\,du.

Mathematician Brook Taylor discovered integration by parts, first publishing the idea in 1715. More general formulations of integration by parts ex ...
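A quick numerical check of the formula (the pair u(x) = x, v(x) = sin x on [0, π/2] is an assumed example):

```python
# Verify: integral of u v' over [a, b] equals [u v]_a^b minus integral of u' v.
import numpy as np
from scipy.integrate import quad

a, b = 0.0, np.pi / 2
lhs, _ = quad(lambda x: x * np.cos(x), a, b)   # u = x, v' = cos(x)
boundary = b * np.sin(b) - a * np.sin(a)       # [u v] from a to b
rhs_int, _ = quad(lambda x: np.sin(x), a, b)   # u' = 1, v = sin(x)
print(lhs, boundary - rhs_int)                 # both equal pi/2 - 1
```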


Probability Density Function
In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) can be interpreted as providing a relative likelihood that the value of the random variable would be close to that sample. In other words, probability density is the probability per unit length: while the absolute likelihood for a continuous random variable to take on any particular value is 0 (since there is an infinite set of possible values to begin with), the value of the PDF at two different samples can be used to infer, in any particular draw of the random variable, how much more likely it is that the random variable would be close to one sample compared to the other sample. In a more precise sense, the PDF is used to specify the probability of the random variable falling within a particular range of values, as opposed to ...
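The range interpretation can be demonstrated directly: integrating the density over [a, b] matches the corresponding CDF difference. A sketch using the standard normal (an assumed example):

```python
# P(a <= X <= b) as the integral of the PDF, cross-checked against the CDF.
from scipy.integrate import quad
from scipy.stats import norm

a, b = -1.0, 1.0
by_integration, _ = quad(norm.pdf, a, b)
by_cdf = norm.cdf(b) - norm.cdf(a)
print(by_integration, by_cdf)  # both approximately 0.6827
```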


Variance
In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its population mean or sample mean. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value. Variance has a central role in statistics, where some ideas that use it include descriptive statistics, statistical inference, hypothesis testing, goodness of fit, and Monte Carlo sampling. Variance is an important tool in the sciences, where statistical analysis of data is common. The variance is the square of the standard deviation, the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by \sigma^2, s^2, \operatorname{Var}(X), V(X), or \mathbb{V}(X). An advantage of variance as a measure of dispersion is that it is more amenable to algebraic manipulation than other measures of dispersion such as the expected absolute deviation; for e ...
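The equivalent characterizations mentioned above (expected squared deviation from the mean; covariance of the variable with itself) can be checked on a sample; the parameters below are illustrative assumptions:

```python
# Variance three ways: by definition, by the moment identity, and as Cov(X, X).
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(3.0, 2.0, size=1_000_000)

mu = x.mean()
print(np.mean((x - mu) ** 2))         # E[(X - mu)^2]
print(np.mean(x**2) - mu**2)          # E[X^2] - E[X]^2
print(np.cov(x, x, bias=True)[0, 1])  # Cov(X, X), population normalization
```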