Stein's Paradox

	Stein's Paradox In decision theory and estimation theory, Stein's example (also known as Stein's phenomenon or Stein's paradox) is the observation that when three or more parameters are estimated simultaneously, there exist combined estimators more accurate on average (that is, having lower expected mean squared error) than any method that handles the parameters separately. It is named after Charles Stein of Stanford University, who discovered the phenomenon in 1955. An intuitive explanation is that optimizing for the mean-squared error of a combined estimator is not the same as optimizing for the errors of separate estimators of the individual parameters. In practical terms, if the combined error is in fact of interest, then a combined estimator should be used, even if the underlying parameters are independent. If one is instead interested in estimating an individual parameter, then using a combined estimator does not help and is in fact worse. Formal statement The following is the simpl ...
	Decision Theory Decision theory (or the theory of choice; not to be confused with choice theory) is a branch of applied probability theory concerned with the theory of making decisions based on assigning probabilities to various factors and assigning numerical consequences to the outcome. There are three branches of decision theory: (1) Normative decision theory, concerned with the identification of optimal decisions, where optimality is often determined by considering an ideal decision-maker who is able to calculate with perfect accuracy and is in some sense fully rational. (2) Prescriptive decision theory, concerned with describing observed behaviors through the use of conceptual models, under the assumption that those making the decisions are behaving under some consistent rules. (3) Descriptive decision theory, which analyzes how individuals actually make the decisions that they do. Decision theory is closely related to the field of game theory and is an interdisciplinary topic, studied by econom ...
	James–Stein Estimator The James–Stein estimator is a biased estimator of the mean, $\boldsymbol\theta$, of (possibly) correlated Gaussian distributed random vectors with unknown means. It arose sequentially in two main published papers. The earlier version of the estimator was developed by Charles Stein in 1956, who reached the relatively shocking conclusion that the then-usual estimate of the mean, the sample mean, is admissible when $m \leq 2$ but inadmissible when $m \geq 3$. Stein proposed a possible improvement to the estimator that shrinks the sample means towards a more central mean vector $\boldsymbol\nu$ (which can be chosen a priori or, commonly, as the "average of averages" of the sample means, given that all samples share the same size). This result is commonly referred to as Stein's example or paradox. It was improved later by Willard James and Charles Stein in 1961 through simplifying the original process. It can be shown that t ...
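The shrinkage idea in the James–Stein entry can be illustrated with a small simulation. This is a minimal sketch, not the construction from the original papers: it assumes a single observation x drawn from a normal distribution with identity covariance, shrinks toward the origin rather than toward a vector ν, and uses the textbook form (1 − (p − 2)/‖x‖²)·x. The function names, the choice p = 10, and the true mean vector are all hypothetical.

```python
import random


def james_stein(x):
    """Shrink x toward the origin (assumes unit variance, len(x) >= 3)."""
    p = len(x)
    norm_sq = sum(v * v for v in x)
    shrink = 1.0 - (p - 2) / norm_sq
    return [shrink * v for v in x]


def squared_error(estimate, theta):
    """Total squared error summed over all coordinates."""
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))


def simulate(theta, trials=5000, seed=0):
    """Average total squared error of the raw observation and of James-Stein."""
    rng = random.Random(seed)
    raw_total = 0.0
    js_total = 0.0
    for _ in range(trials):
        # One noisy observation per coordinate, each with variance 1.
        x = [rng.gauss(t, 1.0) for t in theta]
        raw_total += squared_error(x, theta)
        js_total += squared_error(james_stein(x), theta)
    return raw_total / trials, js_total / trials


# Hypothetical true means: ten unrelated parameters, all equal to 1.
raw_risk, js_risk = simulate([1.0] * 10)
print(f"raw observation: {raw_risk:.2f}  James-Stein: {js_risk:.2f}")
```

Under these assumptions the combined squared error of the shrunken estimate should come out lower than that of the raw observations, even though the ten coordinates are generated independently. The gain concerns the total error only; any single coordinate may be estimated worse.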
	Estimation Theory Estimation theory is a branch of statistics that deals with estimating the values of parameters based on measured empirical data that has a random component. The parameters describe an underlying physical setting in such a way that their value affects the distribution of the measured data. An estimator attempts to approximate the unknown parameters using the measurements. In estimation theory, two approaches are generally considered: the probabilistic approach (described in this article), which assumes that the measured data is random with a probability distribution dependent on the parameters of interest; and the set-membership approach, which assumes that the measured data vector belongs to a set which depends on the parameter vector. Examples For example, it is desired to estimate the proportion of a population of voters who will vote for a particular candidate. That proportion is the parameter sought; the estimate is based on a small random sample of voters. Alternatively, it ...
	Scientific American Scientific American, informally abbreviated SciAm or sometimes SA, is an American popular science magazine. Many famous scientists, including Albert Einstein and Nikola Tesla, have contributed articles to it. In print since 1845, it is the oldest continuously published magazine in the United States. Scientific American is owned by Springer Nature, which in turn is a subsidiary of Holtzbrinck Publishing Group. History Scientific American was founded by inventor and publisher Rufus Porter in 1845 as a four-page weekly newspaper. The first issue of the large format newspaper was released August 28, 1845. Throughout its early years, much emphasis was placed on reports of what was going on at the U.S. Patent Office. It also reported on a broad range of inventions including perpetual motion machines, an 1860 device for buoying vessels by Abraham Lincoln, and the universal joint which now can be found ...
	Stein's Lemma Stein's lemma, named in honor of Charles Stein, is a theorem of probability theory that is of interest primarily because of its applications to statistical inference (in particular, to James–Stein estimation and empirical Bayes methods) and its applications to portfolio choice theory. The theorem gives a formula for the covariance of one random variable with the value of a function of another, when the two random variables are jointly normally distributed. Statement of the lemma Suppose X is a normally distributed random variable with expectation μ and variance σ². Further suppose g is a function for which the two expectations E(g(X)(X − μ)) and E(g′(X)) both exist. (The existence of the expectation of any random variable is equivalent to the finiteness of the expectation of its absolute value.) Then $E\bigl(g(X)(X-\mu)\bigr)=\sigma^2 E\bigl(g'(X)\bigr)$. In general, suppose X and Y are jointly normally d ...
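The identity in the Stein's lemma entry can be checked numerically. The following is a rough Monte Carlo sketch, not a proof: the test function g(x) = sin(x), the values μ = 1 and σ = 2, the sample size, and the function name are all arbitrary choices made for illustration.

```python
import math
import random


def stein_lemma_check(mu=1.0, sigma=2.0, n=200_000, seed=0):
    """Estimate both sides of E[g(X)(X - mu)] = sigma^2 E[g'(X)] for g = sin."""
    rng = random.Random(seed)
    lhs_total = 0.0
    rhs_total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, sigma)
        lhs_total += math.sin(x) * (x - mu)  # g(X) * (X - mu)
        rhs_total += math.cos(x)             # g'(X), since (sin)' = cos
    return lhs_total / n, sigma ** 2 * rhs_total / n


lhs, rhs = stein_lemma_check()
print(f"E[g(X)(X - mu)] ~ {lhs:.3f}   sigma^2 E[g'(X)] ~ {rhs:.3f}")
```

The two printed estimates should agree up to Monte Carlo noise. For this particular g the right-hand side also has a closed form, σ²·cos(μ)·exp(−σ²/2), which gives a second check on the simulation.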
	Integration By Parts In calculus, and more generally in mathematical analysis, integration by parts or partial integration is a process that finds the integral of a product of functions in terms of the integral of the product of their derivative and antiderivative. It is frequently used to transform the antiderivative of a product of functions into an antiderivative for which a solution can be more easily found. The rule can be thought of as an integral version of the product rule of differentiation. The integration by parts formula states: $\int_a^b u(x) v'(x) \, dx = \Bigl[u(x) v(x)\Bigr]_a^b - \int_a^b u'(x) v(x) \, dx = u(b) v(b) - u(a) v(a) - \int_a^b u'(x) v(x) \, dx$. Or, letting $u = u(x)$ and $du = u'(x) \, dx$ while $v = v(x)$ and $dv = v'(x) \, dx$, the formula can be written more compactly: $\int u \, dv = uv - \int v \, du$. Mathematician Brook Taylor discovered integration by parts, first publishing the idea in 1715. More general formulations of integration by parts ex ...
	Equivariant Estimation In statistics, the concept of being an invariant estimator is a criterion that can be used to compare the properties of different estimators for the same quantity. It is a way of formalising the idea that an estimator should have certain intuitively appealing qualities. Strictly speaking, "invariant" would mean that the estimates themselves are unchanged when both the measurements and the parameters are transformed in a compatible way, but the meaning has been extended to allow the estimates to change in appropriate ways with such transformations. The term equivariant estimator is used in formal mathematical contexts that include a precise description of the relation of the way the estimator changes in response to changes to the dataset and parameterisation: this corresponds to the use of "equivariance" in more general mathematics. General setting Background In statistical inference, there are several approaches to estimation theory that can be used to decide immediately what es ...
	Least Squares The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems (sets of equations in which there are more equations than unknowns) by minimizing the sum of the squares of the residuals (a residual being the difference between an observed value and the fitted value provided by a model) made in the results of each individual equation. The most important application is in data fitting. When the problem has substantial uncertainties in the independent variable (the x variable), then simple regression and least-squares methods have problems; in such cases, the methodology required for fitting errors-in-variables models may be considered instead of that for least squares. Least squares problems fall into two categories: linear or ordinary least squares and nonlinear least squares, depending on whether or not the residuals are linear in all unknowns. The linear least-squares problem occurs in statistical regressio ...
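To make "minimizing the sum of the squares of the residuals" concrete, here is a minimal sketch of ordinary least squares for a straight line y ≈ a + b·x, using the closed-form solution for a single predictor. The data points and function names are invented for illustration and do not come from the entry above.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared differences between observed and fitted values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical overdetermined system: five equations, two unknowns.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

intercept, slope = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, intercept, slope)
print(f"intercept={intercept:.3f} slope={slope:.3f} SSR={best:.4f}")
```

Moving the slope or intercept away from the fitted values can only increase the sum of squared residuals, which is what distinguishes the least-squares line from any other candidate line through the same data.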
	Gauss–Markov Theorem In statistics, the Gauss–Markov theorem (or simply Gauss theorem for some authors) states that the ordinary least squares (OLS) estimator has the lowest sampling variance within the class of linear unbiased estimators, if the errors in the linear regression model are uncorrelated, have equal variances and expectation value of zero. The errors do not need to be normal, nor do they need to be independent and identically distributed (only uncorrelated with mean zero and homoscedastic with finite variance). The requirement that the estimator be unbiased cannot be dropped, since biased estimators exist with lower variance. See, for example, the James–Stein estimator (which also drops linearity), ridge regression, or simply any degenerate estimator. The theorem was named after Carl Friedrich Gauss and Andrey Markov, although Gauss' work significantly predates Markov's. But while Gauss derived the result under the assumption of independence and normality, Markov reduced the assu ...
	Maximum Likelihood Estimation In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statistical model, the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. The logic of maximum likelihood is both intuitive and flexible, and as such the method has become a dominant means of statistical inference. If the likelihood function is differentiable, the derivative test for finding maxima can be applied. In some cases, the first-order conditions of the likelihood function can be solved analytically; for instance, the ordinary least squares estimator for a linear regression model maximizes the likelihood when all observed outcomes are assumed to have normal distributions with the same variance. From the perspective of Bayesian inference, M ...
	Channel Estimation In wireless communications, channel state information (CSI) is the known channel properties of a communication link. This information describes how a signal propagates from the transmitter to the receiver and represents the combined effect of, for example, scattering, fading, and power decay with distance. The process of obtaining it is called channel estimation. The CSI makes it possible to adapt transmissions to current channel conditions, which is crucial for achieving reliable communication with high data rates in multiantenna systems. CSI needs to be estimated at the receiver and is usually quantized and fed back to the transmitter (although reverse-link estimation is possible in TDD systems). Therefore, the transmitter and receiver can have different CSI. The CSI at the transmitter and the CSI at the receiver are sometimes referred to as CSIT and CSIR, respectively. Different kinds of channel state information There are basically two levels of CSI, namely instantaneous CSI and statisti ...
	Brownian Motion Brownian motion, or pedesis (from Ancient Greek πήδησις, "leaping"), is the random motion of particles suspended in a medium (a liquid or a gas). This pattern of motion typically consists of random fluctuations in a particle's position inside a fluid sub-domain, followed by a relocation to another sub-domain. Each relocation is followed by more fluctuations within the new closed volume. This pattern describes a fluid at thermal equilibrium, defined by a given temperature. Within such a fluid, there exists no preferential direction of flow (as in transport phenomena). More specifically, the fluid's overall linear and angular momenta remain null over time. The kinetic energies of the molecular Brownian motions, together with those of molecular rotations and vibrations, sum up to the caloric component of a fluid's internal energy (the equipartition theorem). This motion is named after the botanist Robert Brown, who first described the phenomenon in 1827, while looking throu ...