In statistics, the standard deviation line (or SD line) marks points on a scatter plot that are an equal number of standard deviations away from the average in each dimension. For example, in a 2-dimensional scatter diagram with variables x and y, a point that is 1 standard deviation above the mean of x and also 1 standard deviation above the mean of y is on the SD line when the correlation is positive (for negative correlation, the matching point is 1 standard deviation below the mean of y). The SD line is a useful visual tool since points in a scatter diagram tend to cluster around it, more or less tightly depending on their correlation.
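As an illustration of the definition, the following minimal sketch computes points that lie k standard deviations from the mean in both dimensions. The function name `sd_line_point`, its `positive` flag, and the sample data are hypothetical, and the choice of the population standard deviation (`pstdev`) is an assumption rather than something the definition fixes.

```python
from statistics import fmean, pstdev


def sd_line_point(xs, ys, k, positive=True):
    """Return the point k standard deviations from the mean in x and in y.

    `positive` states whether x and y are positively correlated; for
    negative correlation the offset in y is flipped.
    Assumption: population standard deviation (divide by n).
    """
    sign = 1.0 if positive else -1.0
    x = fmean(xs) + k * pstdev(xs)
    y = fmean(ys) + sign * k * pstdev(ys)
    return x, y


# Made-up sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

for k in (-1, 0, 1):
    print(k, sd_line_point(xs, ys, k))
```

With k = 0 the function returns the point of averages, which the SD line always passes through.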


Properties


Relation to regression line

The SD line goes through the point of averages and has a slope of \frac{\sigma_y}{\sigma_x} when the correlation between x and y is positive, and -\frac{\sigma_y}{\sigma_x} when the correlation is negative. Unlike the regression line, the SD line does not take into account the strength of the relationship between x and y: only the sign of the correlation affects its slope. The slope of the SD line is related to that of the regression line by a = r \frac{\sigma_y}{\sigma_x}, where a is the slope of the regression line, r is the correlation coefficient, and \frac{\sigma_y}{\sigma_x} is the magnitude of the slope of the SD line.
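The relation between the two slopes can be checked numerically. The sketch below is only an illustration: the helper names and sample data are hypothetical, population standard deviations are assumed, and the regression slope is computed independently as the covariance divided by the variance of x.

```python
import math
from statistics import fmean, pstdev


def covariance(xs, ys):
    """Population covariance of paired data (divide by n)."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def correlation(xs, ys):
    """Correlation coefficient r."""
    return covariance(xs, ys) / (pstdev(xs) * pstdev(ys))


def sd_line_slope(xs, ys):
    """Slope of the SD line: sigma_y / sigma_x with the sign of r."""
    magnitude = pstdev(ys) / pstdev(xs)
    # Assumption: r == 0 is treated like positive correlation here.
    return -magnitude if correlation(xs, ys) < 0 else magnitude


def regression_slope(xs, ys):
    """Least-squares slope of y on x: cov(x, y) / var(x)."""
    return covariance(xs, ys) / covariance(xs, xs)


# Made-up sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = correlation(xs, ys)
a = regression_slope(xs, ys)
sd_slope = sd_line_slope(xs, ys)
print(r, a, sd_slope)

# The regression slope equals r times the magnitude of the SD line slope.
assert math.isclose(a, r * abs(sd_slope))
```

Because the magnitude of r never exceeds 1, the regression line is never steeper than the SD line, and the two coincide only when the correlation is perfect.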


Typical distance of points to SD line

The root mean square vertical distance of points from the SD line is \sqrt{2(1 - |r|)} \times \sigma_y. This gives an idea of the spread of points around the SD line.
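The formula can be compared against a direct computation. The following is a sketch under stated assumptions: the function name and sample data are hypothetical, and both the standard deviations and the root mean square divide by n (population convention), which is what makes the two values agree exactly.

```python
import math
from statistics import fmean, pstdev


def rms_vertical_distance(xs, ys):
    """Return (rms, r, sigma_y) for vertical distances to the SD line.

    Assumption: population statistics (divide by n) throughout.
    """
    mx, my = fmean(xs), fmean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    cov = fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    r = cov / (sx * sy)
    # The SD line passes through (mx, my) with slope +/- sy / sx.
    slope = -sy / sx if r < 0 else sy / sx
    residuals = [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]
    rms = math.sqrt(fmean([e * e for e in residuals]))
    return rms, r, sy


# Made-up sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

rms, r, sy = rms_vertical_distance(xs, ys)
predicted = math.sqrt(2 * (1 - abs(r))) * sy
print(rms, predicted)
```

The agreement follows from expanding the mean squared residual: it equals 2\sigma_y^2 minus 2\sigma_y^2 |r|, which is 2(1 - |r|)\sigma_y^2.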