Theil–Sen Estimator
In non-parametric statistics, the Theil–Sen estimator is a method for robustly fitting a line to sample points in the plane (simple linear regression) by choosing the median of the slopes of all lines through pairs of points. It has also been called Sen's slope estimator, slope selection, the single median method, the Kendall robust line-fit method, and the Kendall–Theil robust line. It is named after Henri Theil and Pranab K. Sen, who published papers on this method in 1950 and 1968 respectively, and after Maurice Kendall because of its relation to the Kendall tau rank correlation coefficient. This estimator can be computed efficiently, and is insensitive to outliers. It can be significantly more accurate than non-robust simple linear regression (least squares) for skewed and heteroskedastic data, and competes well against least squares even for normally distributed data in terms of statistical power. It has been called "the most popular nonparametric technique for es ...
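As a rough illustration of the pairwise-slope idea described above, the following Python sketch (the function name and the intercept rule are illustrative choices, not taken from the text) computes the slope as the median of all pairwise slopes and then an intercept as the median offset:

    # Illustrative Theil-Sen fit: slope = median of all pairwise slopes.
    from statistics import median

    def theil_sen(xs, ys):
        # slopes of all lines through pairs of points with distinct x-coordinates
        slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
                  for i in range(len(xs))
                  for j in range(i + 1, len(xs))
                  if xs[j] != xs[i]]
        slope = median(slopes)
        # intercept taken as the median of y_i - slope * x_i (one common choice)
        intercept = median(y - slope * x for x, y in zip(xs, ys))
        return slope, intercept

    # toy data following y = 2x + 1 with one gross outlier
    xs = [0, 1, 2, 3, 4, 5]
    ys = [1, 3, 5, 7, 9, 100]
    print(theil_sen(xs, ys))   # slope and intercept stay close to 2 and 1

On this toy data the single outlier barely affects the fit, which is exactly the insensitivity to outliers mentioned above.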


Y-intercept
In analytic geometry, using the common convention that the horizontal axis represents a variable ''x'' and the vertical axis represents a variable ''y'', a ''y''-intercept or vertical intercept is a point where the graph of a function or relation intersects the ''y''-axis of the coordinate system. As such, these points satisfy ''x'' = 0.

Using equations
If the curve in question is given as y = f(x), the ''y''-coordinate of the ''y''-intercept is found by calculating f(0). Functions which are undefined at ''x'' = 0 have no ''y''-intercept. If the function is linear and is expressed in slope-intercept form as f(x) = a + bx, the constant term a is the ''y''-coordinate of the ''y''-intercept.

Multiple y-intercepts
Some 2-dimensional mathematical relationships such as circles, ellipses, and hyperbolas can have more than one ''y''-intercept. Because functions associate ''x'' values to no more than one ''y'' value as part of their definition, they can have at most one ''y''-intercept. x ...
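As a small worked example (the particular function is hypothetical): for f(x) = 3 + 2x, written in slope-intercept form with a = 3 and b = 2, the ''y''-intercept is the point (0, f(0)) = (0, 3), whose ''y''-coordinate is exactly the constant term a.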


Linear Transformation
In mathematics, and more specifically in linear algebra, a linear map (also called a linear mapping, linear transformation, vector space homomorphism, or in some contexts linear function) is a mapping V \to W between two vector spaces that preserves the operations of vector addition and scalar multiplication. The same names and the same definition are also used for the more general case of modules over a ring; see Module homomorphism. If a linear map is a bijection then it is called a ''linear isomorphism''. In the case where V = W, a linear map is called a (linear) ''endomorphism''. Sometimes the term ''linear operator'' refers to this case, but the term "linear operator" can have different meanings for different conventions: for example, it can be used to emphasize that V and W are real vector spaces (not necessarily with V = W), or it can be used to emphasize that V is a function space, which is a common convention in functional analysis. Sometimes the term ''linear function'' has the same meaning as ''linear map ...
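Spelled out, preservation of vector addition and scalar multiplication means that a map f : V \to W is linear exactly when, for all vectors u, v \in V and every scalar c,
:f(u + v) = f(u) + f(v) and f(c u) = c f(u).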



Equivariant
In mathematics, equivariance is a form of symmetry for functions from one space with symmetry to another (such as symmetric spaces). A function is said to be an equivariant map when its domain and codomain are acted on by the same symmetry group, and when the function commutes with the action of the group. That is, applying a symmetry transformation and then computing the function produces the same result as computing the function and then applying the transformation. Equivariant maps generalize the concept of invariants, functions whose value is unchanged by a symmetry transformation of their argument. The value of an equivariant map is often (imprecisely) called an invariant. In statistical inference, equivariance under statistical transformations of data is an important property of various estimation methods; see invariant estimator for details. In pure mathematics, equivariance is a central object of study in equivariant topology and its subtopics equivariant cohomology and ...
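Written symbolically, if a group G acts on both the domain and the codomain, a map f is equivariant when
:f(g \cdot x) = g \cdot f(x) for every g \in G and every point x;
an invariant is the special case where the action on the codomain is trivial, i.e. f(g \cdot x) = f(x).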






Robust Statistics
Robust statistics are statistics with good performance for data drawn from a wide range of probability distributions, especially for distributions that are not normal. Robust statistical methods have been developed for many common problems, such as estimating location, scale, and regression parameters. One motivation is to produce statistical methods that are not unduly affected by outliers. Another motivation is to provide methods with good performance when there are small departures from a parametric distribution. For example, robust methods work well for mixtures of two normal distributions with different standard deviations; under this model, non-robust methods like a t-test work poorly.

Introduction
Robust statistics seek to provide methods that emulate popular statistical methods, but which are not unduly affected by outliers or other small departures from model assumptions. In statistics, classical estimation methods rely heavily on assumpti ...
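As a minimal toy illustration of the outlier point (not tied to any specific method named in the text), compare a non-robust location estimate, the mean, with a robust one, the median, when a single gross outlier is added:

    # Toy comparison: mean vs. median as location estimates under one gross outlier.
    from statistics import mean, median

    clean = [9.8, 10.1, 10.0, 9.9, 10.2]
    contaminated = clean + [1000.0]                  # one wild observation

    print(mean(clean), median(clean))                # both near 10
    print(mean(contaminated), median(contaminated))  # mean jumps to 175, median stays near 10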



Least Squares
The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems (sets of equations in which there are more equations than unknowns) by minimizing the sum of the squares of the residuals (a residual being the difference between an observed value and the fitted value provided by a model) made in the results of each individual equation. The most important application is in data fitting. When the problem has substantial uncertainties in the independent variable (the ''x'' variable), then simple regression and least-squares methods have problems; in such cases, the methodology required for fitting errors-in-variables models may be considered instead of that for least squares. Least squares problems fall into two categories: linear or ordinary least squares and nonlinear least squares, depending on whether or not the residuals are linear in all unknowns. The linear least-squares problem occurs in statistical regressio ...
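To make "minimizing the sum of the squares of the residuals" concrete for the simple straight-line case, here is a short sketch (illustrative only; the function name is my own) using the closed-form solution that comes from the normal equations:

    # Ordinary least squares for y = a + b*x: minimizes sum((y_i - a - b*x_i)^2).
    def ols_line(xs, ys):
        n = len(xs)
        mx = sum(xs) / n
        my = sum(ys) / n
        # closed-form slope and intercept from the normal equations
        b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
        a = my - b * mx
        return a, b

    xs = [0, 1, 2, 3, 4]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1]
    print(ols_line(xs, ys))   # roughly (1.0, 2.0) for data near y = 1 + 2x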


Efficiency (statistics)
In statistics, efficiency is a measure of quality of an estimator, of an experimental design, or of a hypothesis testing procedure. Essentially, a more efficient estimator needs fewer input data or observations than a less efficient one to achieve the Cramér–Rao bound. An ''efficient estimator'' is characterized by having the smallest possible variance, indicating that there is a small deviation between the estimated value and the "true" value in the L2 norm sense. The relative efficiency of two procedures is the ratio of their efficiencies, although often this concept is used where the comparison is made between a given procedure and a notional "best possible" procedure. The efficiencies and the relative efficiency of two procedures theoretically depend on the sample size available for the given procedure, but it is often possible to use the asymptotic relative efficiency (defined as the limit of the relative efficiencies as the sample size grows) as the principal compariso ...
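One common way to formalize this for an unbiased estimator T of a parameter \theta is to measure efficiency against the Cramér–Rao bound,
:e(T) = (1 / I(\theta)) / \operatorname{Var}(T),
where I(\theta) is the Fisher information, so that e(T) = 1 means the bound is attained; the relative efficiency of two unbiased estimators T_1, T_2 is then the variance ratio \operatorname{Var}(T_2) / \operatorname{Var}(T_1).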


Unbiased Estimator
In statistics, the bias of an estimator (or bias function) is the difference between this estimator's expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called ''unbiased''. In statistics, "bias" is an objective property of an estimator. Bias is a distinct concept from consistency: consistent estimators converge in probability to the true value of the parameter, but may be biased or unbiased; see bias versus consistency for more. All else being equal, an unbiased estimator is preferable to a biased estimator, although in practice, biased estimators (with generally small bias) are frequently used. When a biased estimator is used, bounds of the bias are calculated. A biased estimator may be used for various reasons: because an unbiased estimator does not exist without further assumptions about a population; because an estimator is difficult to compute (as in unbiased estimation of standard deviation); because a biased estimato ...
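In symbols, the bias of an estimator \hat\theta of a parameter \theta is
:\operatorname{Bias}(\hat\theta) = \operatorname{E}[\hat\theta] - \theta,
and \hat\theta is unbiased exactly when \operatorname{E}[\hat\theta] = \theta for every value of \theta.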



Weighted Median
In statistics, a weighted median of a sample is the 50% weighted percentile. It was first proposed by F. Y. Edgeworth in 1888. Like the median, it is useful as an estimator of central tendency, robust against outliers. It allows for non-uniform statistical weights related to, e.g., varying precision measurements in the sample.

Definition

General case
For n distinct ordered elements x_1, x_2, ..., x_n with positive weights w_1, w_2, ..., w_n such that \sum_{i=1}^{n} w_i = 1, the weighted median is the element x_k satisfying
:\sum_{i=1}^{k-1} w_i \le 1/2 and \sum_{i=k+1}^{n} w_i \le 1/2

Special case
Consider a set of elements in which two of the elements satisfy the general case. This occurs when the two elements' weights border the midpoint of the cumulative weight without containing it; rather, each element defines a partition of the weights equal to 1/2. These elements are referred to as the lower weighted median and the upper weighted median. Their conditions are satisfied as follows:

Lower Weighted Median
:\sum ...
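The general-case definition translates directly into a short routine; the sketch below (the function name and the normalization by the total weight are my own choices) sorts the elements by value and returns the first element for which neither the weight strictly below nor the weight strictly above exceeds one half:

    # Weighted median per the general case: return x_k with
    # sum of weights below <= 1/2 and sum of weights above <= 1/2 (after normalizing).
    def weighted_median(values, weights):
        total = sum(weights)
        pairs = sorted(zip(values, weights))        # order the elements by value
        below = 0.0
        for value, weight in pairs:
            above = total - below - weight          # weight strictly above this element
            if below <= total / 2 and above <= total / 2:
                return value                        # ties return the lower weighted median
            below += weight
        raise ValueError("weights must be positive")

    print(weighted_median([1, 2, 3, 4], [0.1, 0.2, 0.3, 0.4]))   # 3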


Repeated Median Regression
In robust statistics, repeated median regression, also known as the repeated median estimator, is a robust linear regression algorithm. The estimator has a breakdown point of 50%. Although it is equivariant under scaling, or under linear transformations of either its explanatory variable or its response variable, it is not equivariant under affine transformations that combine both variables (Peter J. Rousseeuw, Nathan S. Netanyahu, and David M. Mount, "New Statistical and Computational Results on the Repeated Median Regression Estimator", in ''New Directions in Statistical Data Analysis and Robustness'', edited by Stephan Morgenthaler, Elvezio Ronchetti, and Werner A. Stahel, Birkhäuser Verlag, Basel, 1993, pp. 177–194). It can be calculated in O(n^2) time by brute force, in O(n \log^2 n) time using more sophisticated techniques, or in O(n \log n) randomized expected time. It may also be calculated using an on-line algorithm with O(n) update time.

Method
The repeated median method estimates the s ...
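The "Method" sentence above is truncated; the repeated median slope is obtained by taking, for each data point, the median of its slopes to all the other points, and then the median of those per-point medians. A brute-force O(n^2) sketch follows (illustrative only; the intercept here is the median residual, which is one common convention rather than necessarily the original one):

    # Repeated median slope: for each point i, take the median slope to every
    # other point j; the estimate is the median of these per-point medians.
    from statistics import median

    def repeated_median_line(xs, ys):
        n = len(xs)
        per_point = []
        for i in range(n):
            slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
                      for j in range(n) if j != i and xs[j] != xs[i]]
            per_point.append(median(slopes))
        slope = median(per_point)
        intercept = median(y - slope * x for x, y in zip(xs, ys))
        return slope, intercept

    xs = [0, 1, 2, 3, 4, 5]
    ys = [1, 3, 5, 7, 9, 100]            # one outlier
    print(repeated_median_line(xs, ys))  # slope stays close to 2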



With Replacement
In statistics, quality assurance, and survey methodology, sampling is the selection of a subset (a statistical sample) of individuals from within a statistical population to estimate characteristics of the whole population. Statisticians attempt to collect samples that are representative of the population in question. Sampling has lower costs and faster data collection than measuring the entire population and can provide insights in cases where it is infeasible to measure an entire population. Each observation measures one or more properties (such as weight, location, colour or mass) of independent objects or individuals. In survey sampling, weights can be applied to the data to adjust for the sample design, particularly in stratified sampling. Results from probability theory and statistical theory are employed to guide the practice. In business and medical research, sampling is widely used for gathering information about a population. Acceptance sampling is used to determine if ...
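For the specific notion in this entry's title: a draw "with replacement" returns each selected unit to the population before the next draw, so the same unit can appear more than once, whereas a draw "without replacement" cannot repeat units. The toy standard-library calls below illustrate the difference:

    # Sampling with replacement (duplicates possible) vs. without replacement.
    import random

    population = ["a", "b", "c", "d", "e"]
    random.seed(0)                                         # reproducible toy example

    with_replacement = random.choices(population, k=5)     # may repeat elements
    without_replacement = random.sample(population, k=5)   # each element at most once

    print(with_replacement)
    print(without_replacement)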