In statistics, standardized (regression) coefficients, also called beta coefficients or beta weights, are the estimates resulting from a

regression analysis In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one ...

where the underlying data have been

standardized Standardization or standardisation is the process of implementing and developing technical standards based on the consensus of different parties that include firms, users, interest groups, standards organizations and governments. Standardization ...

so that the

variance In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its population mean or sample mean. Variance is a measure of dispersion, meaning it is a measure of how far a set of number ...

s of dependent and independent variables are equal to 1. Therefore, standardized coefficients are

unitless A dimensionless quantity (also known as a bare quantity, pure quantity, or scalar quantity as well as quantity of dimension one) is a quantity to which no physical dimension is assigned, with a corresponding SI unit of measurement of one (or 1), ...

and refer to how many standard deviations a dependent variable will change, per standard deviation increase in the predictor variable.

Usage

Standardization of the coefficient is usually done to answer the question of which of the independent variables have a greater effect on the

dependent variable Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or dema ...

in a

multiple regression In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one o ...

analysis where the variables are measured in different

units of measurement A unit of measurement is a definite magnitude of a quantity, defined and adopted by convention or by law, that is used as a standard for measurement of the same kind of quantity. Any other quantity of that kind can be expressed as a mul ...

(for example, income measured in dollars and family size measured in number of individuals). It may also be considered a general measure of

effect size In statistics, an effect size is a value measuring the strength of the relationship between two variables in a population, or a sample-based estimate of that quantity. It can refer to the value of a statistic calculated from a sample of data, the ...

, quantifying the "magnitude" of the effect of one variable on another. For simple linear regression with orthogonal predictors, the standardized regression coefficient equals the

correlation In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data. Although in the broadest sense, "correlation" may indicate any type of association, in statisti ...

between the independent and dependent variables.

Implementation

A regression carried out on original (unstandardized) variables produces unstandardized coefficients. A regression carried out on standardized variables produces standardized coefficients. Values for standardized and unstandardized coefficients can also be re-scaled to one another subsequent to either type of analysis. Suppose that

\beta

is the regression coefficient resulting from a

linear regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is ...

(predicting

y

x

). The standardized coefficient simply results as

\beta^\ast = \frac \beta

, where

s_x

and

s_y

are the (estimated) standard deviations of

x

and

y

, respectively. Sometimes, standardization is done only without respect to the standard deviation of the

regressor Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or dema ...

(the independent variable

x

Advantages and disadvantages

Standardized coefficients' advocates note that the coefficients are independent of the involved variables'

(i.e., standardized coefficients are ''

''), which makes comparisons easy. Critics voice concerns that such a standardization can be very misleading. Due to the re-scaling based on sample standard deviations, any effect apparent in the standardized coefficient may be due to

confounding In statistics, a confounder (also confounding variable, confounding factor, extraneous determinant or lurking variable) is a variable that influences both the dependent variable and independent variable, causing a spurious association. Con ...

with the particularities (especially: variability) of the involved data sample(s). Also, the interpretation or meaning of a "''one standard deviation change''" in the regressor

x

may vary markedly between non-

normal distribution In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is : f(x) = \frac e^ The parameter \mu i ...

s (e.g., when

skewed In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive, zero, negative, or undefined. For a unimoda ...

asymmetric Asymmetric may refer to: *Asymmetry in geometry, chemistry, and physics Computing * Asymmetric cryptography, in public-key cryptography *Asymmetric digital subscriber line, Internet connectivity * Asymmetric multiprocessing, in computer architect ...

or multimodal).

Terminology

Some

statistical software Statistical software are specialized computer programs for analysis in statistics and econometrics. Open-source * ADaMSoft – a generalized statistical software with data mining algorithms and methods for data management * ADMB – a softwa ...

packages like

PSPP PSPP is a free software application for analysis of sampled data, intended as a free alternative for IBM SPSS Statistics. It has a graphical user interface and conventional command-line interface. It is written in C and uses GNU Scientific ...

SPSS SPSS Statistics is a statistical software suite developed by IBM for data management, advanced analytics, multivariate analysis, business intelligence, and criminal investigation. Long produced by SPSS Inc., it was acquired by IBM in 2009. Cur ...

and SYSTAT label the standardized regression coefficients as "Beta" while the unstandardized coefficients are labeled "B". Others, like

DAP DAP or Dap may refer to: Science * DAP (gene), human gene that encodes death-associated proteins, which mediate programmed cell death * Diamidophosphate, phosphorylating compound * Diaminopimelic acid, amino acid derivative of lysine * Diamino ...

/ SAS label them "Standardized Coefficient". Sometimes the unstandardized variables are also labeled as "b".

References

External links

Which Predictors Are More Important?
- why standardized coefficients are used {{DEFAULTSORT:Standardized Coefficient Regression analysis