In statistics, the reduced chi-square statistic is used extensively in goodness of fit testing. It is also known as the mean squared weighted deviation (MSWD) in isotopic dating and as the variance of unit weight in the context of weighted least squares. Its square root is called the regression standard error, standard error of the regression, or standard error of the equation (see Ordinary least squares#Reduced chi-squared).


Definition

It is defined as chi-square per degree of freedom:

:\chi^2_\nu = \frac{\chi^2}{\nu},

where the chi-squared is a weighted sum of squared deviations:

:\chi^2 = \sum_i \frac{(O_i - C_i)^2}{\sigma_i^2}

with inputs: variance \sigma_i^2, observations ''O'', and calculated data ''C''. The degree of freedom, \nu = n - m, equals the number of observations ''n'' minus the number of fitted parameters ''m''.

In weighted least squares, the definition is often written in matrix notation as

:\chi^2_\nu = \frac{r^\mathrm{T} W r}{\nu},

where ''r'' is the vector of residuals, and ''W'' is the weight matrix, the inverse of the input (diagonal) covariance matrix of observations. If ''W'' is non-diagonal, then generalized least squares applies.

In ordinary least squares, the definition simplifies to:

:\chi^2_\nu = \frac{\mathrm{RSS}}{\nu},

:\mathrm{RSS} = \sum r^2,

where the numerator is the residual sum of squares (RSS).
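
As a minimal sketch of the definition above, the following Python function computes chi-squared per degree of freedom from observations, model values, and per-point standard deviations. The function name and the sample numbers are hypothetical and chosen only for illustration.

```python
def reduced_chi_squared(observed, calculated, sigma, n_params):
    """Return chi-squared per degree of freedom, chi^2 / (n - m)."""
    n = len(observed)
    nu = n - n_params  # degrees of freedom: observations minus fitted parameters
    if nu <= 0:
        raise ValueError("need more observations than fitted parameters")
    # Weighted sum of squared deviations: sum of (O_i - C_i)^2 / sigma_i^2
    chi2 = sum(((o - c) / s) ** 2 for o, c, s in zip(observed, calculated, sigma))
    return chi2 / nu


# Hypothetical data: five observations, a two-parameter model, sigma = 0.1 each
observed = [1.1, 1.9, 3.2, 3.9, 5.1]
calculated = [1.0, 2.0, 3.0, 4.0, 5.0]
sigma = [0.1] * 5
value = reduced_chi_squared(observed, calculated, sigma, n_params=2)
print(round(value, 3))
```

With these made-up numbers the normalized residuals are 1, -1, 2, -1, 1, so chi-squared is 8 and, with three degrees of freedom, the reduced value is about 2.67.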


Discussion

As a general rule, when the variance of the measurement error is known ''a priori'', a \chi_\nu^2 \gg 1 indicates a poor model fit. A \chi_\nu^2 > 1 indicates that the fit has not fully captured the data (or that the error variance has been underestimated). In principle, a value of \chi_\nu^2 around 1 indicates that the extent of the match between observations and estimates is in accord with the error variance. A \chi_\nu^2 < 1 indicates that the model is "over-fitting" the data: either the model is improperly fitting noise, or the error variance has been overestimated. When the variance of the measurement error is only partially known, the reduced chi-squared may serve as a correction estimated ''a posteriori''.
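
One common reading of the ''a posteriori'' correction, stated here as an assumption rather than something the text above spells out, is to multiply the input standard deviations by \sqrt{\chi_\nu^2} so that the rescaled fit has \chi_\nu^2 = 1. The sketch below illustrates that idea; the function name and the data are hypothetical.

```python
import math


def rescale_sigmas(residuals, sigma, n_params):
    """Return (chi2_nu, rescaled sigmas) using the factor sqrt(chi2_nu).

    Assumption: the error variances are known only up to a common factor,
    which is estimated a posteriori from the fit residuals.
    """
    nu = len(residuals) - n_params
    if nu <= 0:
        raise ValueError("need more observations than fitted parameters")
    chi2_nu = sum((r / s) ** 2 for r, s in zip(residuals, sigma)) / nu
    factor = math.sqrt(chi2_nu)
    return chi2_nu, [s * factor for s in sigma]


# Hypothetical residuals from a two-parameter fit with stated sigma = 0.1
residuals = [0.2, -0.2, 0.4, -0.2, 0.2]
sigma = [0.1] * 5
chi2_nu, new_sigma = rescale_sigmas(residuals, sigma, n_params=2)

# Recomputing with the rescaled sigmas gives a reduced chi-squared of about 1
chi2_nu_after, _ = rescale_sigmas(residuals, new_sigma, n_params=2)
```

Rescaling in this way is only meaningful when the model itself is trusted; a large \chi_\nu^2 caused by a poor model is not repaired by inflating the errors.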


Applications


Geochronology

In geochronology, the MSWD is a measure of goodness of fit that takes into account the relative importance of both the internal and external reproducibility, with most common usage in isotopic dating (Wendt, I., and Carl, C., 1991, The statistical distribution of the mean squared weighted deviation, Chemical Geology, 275–285). In general:

MSWD = 1 if the age data fit a univariate normal distribution in ''t'' (for the arithmetic mean age) or log(''t'') (for the geometric mean age) space, or if the compositional data fit a bivariate normal distribution in [log(U/He), log(Th/He)] space (for the central age).

MSWD < 1 if the observed scatter is less than that predicted by the analytical uncertainties. In this case, the data are said to be "underdispersed", indicating that the analytical uncertainties were overestimated.

MSWD > 1 if the observed scatter exceeds that predicted by the analytical uncertainties. In this case, the data are said to be "overdispersed". This situation is the rule rather than the exception in (U-Th)/He geochronology, indicating an incomplete understanding of the isotope system. Several reasons have been proposed to explain the overdispersion of (U-Th)/He data, including uneven U-Th distributions and radiation damage.

Often the geochronologist will determine a series of age measurements on a single sample, with the measured value x_i having a weighting w_i and an associated error \sigma_{x_i} for each age determination. As regards weighting, one can either weight all of the measured ages equally, or weight them by the proportion of the sample that they represent. For example, if two thirds of the sample was used for the first measurement and one third for the second and final measurement, then one might weight the first measurement twice that of the second.

The arithmetic mean of the age determinations is

:\overline{x} = \frac{\sum_{i=1}^N x_i}{N},

but this value can be misleading, unless each determination of the age is of equal significance.

When each measured value can be assumed to have the same weighting, or significance, the biased and unbiased (or "population" and "sample", respectively) estimators of the variance are computed as follows:

:\sigma^2 = \frac{\sum_{i=1}^N (x_i - \overline{x})^2}{N} \text{ and } s^2 = \frac{N}{N-1} \cdot \sigma^2 = \frac{1}{N-1} \cdot \sum_{i=1}^N (x_i - \overline{x})^2.

The standard deviation is the square root of the variance.

When individual determinations of an age are not of equal significance, it is better to use a weighted mean to obtain an "average" age, as follows:

:\overline{x}^* = \frac{\sum_{i=1}^N w_i x_i}{\sum_{i=1}^N w_i}.

The biased weighted estimator of variance can be shown to be

:\sigma^2 = \frac{\sum_{i=1}^N w_i (x_i - \overline{x}^*)^2}{\sum_{i=1}^N w_i},

which can be computed as

:\sigma^2 = \frac{\sum_{i=1}^N w_i x_i^2 \cdot \sum_{i=1}^N w_i - \big(\sum_{i=1}^N w_i x_i\big)^2}{\big(\sum_{i=1}^N w_i\big)^2}.

The unbiased weighted estimator of the sample variance can be computed as follows:

:s^2 = \frac{\sum_{i=1}^N w_i}{\big(\sum_{i=1}^N w_i\big)^2 - \sum_{i=1}^N w_i^2} \cdot \sum_{i=1}^N w_i (x_i - \overline{x}^*)^2.

Again, the corresponding standard deviation is the square root of the variance.

The unbiased weighted estimator of the sample variance can also be computed on the fly as follows:

:s^2 = \frac{\sum_{i=1}^N w_i x_i^2 \cdot \sum_{i=1}^N w_i - \big(\sum_{i=1}^N w_i x_i\big)^2}{\big(\sum_{i=1}^N w_i\big)^2 - \sum_{i=1}^N w_i^2}.

The unweighted mean square of the weighted deviations (unweighted MSWD) can then be computed as follows:

:\text{MSWD}_u = \frac{1}{N-1} \cdot \sum_{i=1}^N \frac{(x_i - \overline{x})^2}{\sigma_{x_i}^2}.

By analogy, the weighted mean square of the weighted deviations (weighted MSWD) can be computed as follows:

:\text{MSWD}_w = \frac{\sum_{i=1}^N w_i}{\big(\sum_{i=1}^N w_i\big)^2 - \sum_{i=1}^N w_i^2} \cdot \sum_{i=1}^N \frac{w_i (x_i - \overline{x}^*)^2}{\sigma_{x_i}^2}.

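The weighted mean and the two MSWD variants described in this section can be sketched in a few lines of Python. The function names and the age values below are hypothetical, and the code assumes the standard forms of the formulas: the unweighted MSWD divides the sum of squared, error-normalized deviations by N - 1, and the weighted MSWD uses the prefactor \sum w_i / ((\sum w_i)^2 - \sum w_i^2).

```python
def weighted_mean(x, w):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


def mswd_unweighted(x, sigma):
    """Unweighted MSWD: deviations from the arithmetic mean, scaled by sigma."""
    n = len(x)
    mean = sum(x) / n
    return sum(((xi - mean) / si) ** 2 for xi, si in zip(x, sigma)) / (n - 1)


def mswd_weighted(x, w, sigma):
    """Weighted MSWD: deviations from the weighted mean, weighted by w_i."""
    v1 = sum(w)                    # sum of weights
    v2 = sum(wi * wi for wi in w)  # sum of squared weights
    mean = weighted_mean(x, w)
    total = sum(wi * ((xi - mean) / si) ** 2 for xi, wi, si in zip(x, w, sigma))
    return v1 / (v1 * v1 - v2) * total


# Hypothetical ages (arbitrary units) with equal analytical errors
ages = [100.0, 102.0, 98.0]
errors = [2.0, 2.0, 2.0]
mswd_u = mswd_unweighted(ages, errors)
mswd_w = mswd_weighted(ages, [1.0, 1.0, 1.0], errors)

# Two thirds of the sample for the first measurement, one third for the second
mean_2_to_1 = weighted_mean([100.0, 103.0], [2.0, 1.0])
```

With equal weights the prefactor reduces to 1/(N - 1), so the weighted and unweighted MSWD agree, which is a convenient sanity check on any implementation.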

Rasch Analysis

In data analysis based on the Rasch Model, the reduced chi-squared statistic is called the outfit mean-square statistic, and the information-weighted reduced chi-squared statistic is called the infit mean-square statistic.

