HOME

TheInfoList



OR:

In
descriptive statistics A descriptive statistic (in the count noun sense) is a summary statistic that quantitatively describes or summarizes features from a collection of information, while descriptive statistics (in the mass noun sense) is the process of using and an ...
, summary statistics are used to summarize a set of
observation Observation is the active acquisition of information from a primary source. In living beings, observation employs the senses. In science, observation can also involve the perception and recording of data via the use of scientific instruments. The ...
s, in order to communicate the largest amount of information as simply as possible. Statisticians commonly try to describe the observations in * a measure of location, or
central tendency In statistics, a central tendency (or measure of central tendency) is a central or typical value for a probability distribution.Weisberg H.F (1992) ''Central Tendency and Variability'', Sage University Paper Series on Quantitative Applications ...
, such as the
arithmetic mean In mathematics and statistics, the arithmetic mean ( ) or arithmetic average, or just the ''mean'' or the ''average'' (when the context is clear), is the sum of a collection of numbers divided by the count of numbers in the collection. The colle ...
* a measure of
statistical dispersion In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a Probability distribution, distribution is stretched or squeezed. Common examples of measures of statistical dispersion are the variance, standard de ...
like the standard mean absolute deviation * a measure of the shape of the distribution like
skewness In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive, zero, negative, or undefined. For a unimodal d ...
or
kurtosis In probability theory and statistics, kurtosis (from el, κυρτός, ''kyrtos'' or ''kurtos'', meaning "curved, arching") is a measure of the "tailedness" of the probability distribution of a real-valued random variable. Like skewness, kurtosi ...
* if more than one variable is measured, a measure of
statistical dependence Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent if, informally speaking, the occurrence of o ...
such as a
correlation coefficient A correlation coefficient is a numerical measure of some type of correlation, meaning a statistical relationship between two variables. The variables may be two columns of a given data set of observations, often called a sample, or two components ...
A common collection of
order statistics In statistics, the ''k''th order statistic of a statistical sample is equal to its ''k''th-smallest value. Together with rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and inference. Importa ...
used as summary statistics are the
five-number summary The five-number summary is a set of descriptive statistics that provides information about a dataset. It consists of the five most important sample percentiles: # the sample minimum ''(smallest observation)'' # the lower quartile or ''first quart ...
, sometimes extended to a
seven-number summary In descriptive statistics, the seven-number summary is a collection of seven summary statistics, and is an extension of the five-number summary. There are three similar, common forms. As with the five-number summary, it can be represented by a modi ...
, and the associated
box plot In descriptive statistics, a box plot or boxplot is a method for graphically demonstrating the locality, spread and skewness groups of numerical data through their quartiles. In addition to the box on a box plot, there can be lines (which are ca ...
. Entries in an
analysis of variance Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures (such as the "variation" among and between groups) used to analyze the differences among means. ANOVA was developed by the statisticia ...
table can also be regarded as summary statistics.


Examples


Location

Common measures of location, or
central tendency In statistics, a central tendency (or measure of central tendency) is a central or typical value for a probability distribution.Weisberg H.F (1992) ''Central Tendency and Variability'', Sage University Paper Series on Quantitative Applications ...
, are the
arithmetic mean In mathematics and statistics, the arithmetic mean ( ) or arithmetic average, or just the ''mean'' or the ''average'' (when the context is clear), is the sum of a collection of numbers divided by the count of numbers in the collection. The colle ...
,
median In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value. The basic fe ...
,
mode Mode ( la, modus meaning "manner, tune, measure, due measure, rhythm, melody") may refer to: Arts and entertainment * '' MO''D''E (magazine)'', a defunct U.S. women's fashion magazine * ''Mode'' magazine, a fictional fashion magazine which is ...
, and
interquartile mean The interquartile mean (IQM) (or midmean) is a statistical measure of central tendency based on the truncated mean of the interquartile range. The IQM is very similar to the scoring method used in sports that are evaluated by a panel of judges: ''d ...
.


Spread

Common measures of
statistical dispersion In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a Probability distribution, distribution is stretched or squeezed. Common examples of measures of statistical dispersion are the variance, standard de ...
are the
standard deviation In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while ...
,
variance In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its population mean or sample mean. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers ...
,
range Range may refer to: Geography * Range (geographic), a chain of hills or mountains; a somewhat linear, complex mountainous or hilly area (cordillera, sierra) ** Mountain range, a group of mountains bordered by lowlands * Range, a term used to i ...
,
interquartile range In descriptive statistics, the interquartile range (IQR) is a measure of statistical dispersion, which is the spread of the data. The IQR may also be called the midspread, middle 50%, fourth spread, or H‑spread. It is defined as the difference ...
,
absolute deviation In mathematics and statistics, deviation is a measure of difference between the observed value of a variable and some other value, often that variable's mean. The sign of the deviation reports the direction of that difference (the deviation is posit ...
, mean absolute difference and the
distance standard deviation In statistics and in probability theory, distance correlation or distance covariance is a measure of dependence between two paired random vectors of arbitrary, not necessarily equal, dimension. The population distance correlation coefficient is zer ...
. Measures that assess spread in comparison to the typical size of data values include the
coefficient of variation In probability theory and statistics, the coefficient of variation (CV), also known as relative standard deviation (RSD), is a standardized measure of dispersion of a probability distribution or frequency distribution. It is often expressed as ...
. The Gini coefficient was originally developed to measure income inequality and is equivalent to one of the
L-moment In statistics, L-moments are a sequence of statistics used to summarize the shape of a probability distribution. They are linear combinations of order statistics ( L-statistics) analogous to conventional moments, and can be used to calculate qu ...
s. A simple summary of a dataset is sometimes given by quoting particular
order statistics In statistics, the ''k''th order statistic of a statistical sample is equal to its ''k''th-smallest value. Together with rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and inference. Importa ...
as approximations to selected
percentiles In statistics, a ''k''-th percentile (percentile score or centile) is a score ''below which'' a given percentage ''k'' of scores in its frequency distribution falls (exclusive definition) or a score ''at or below which'' a given percentage falls ...
of a distribution.


Shape

Common measures of the shape of a distribution are
skewness In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive, zero, negative, or undefined. For a unimodal d ...
or
kurtosis In probability theory and statistics, kurtosis (from el, κυρτός, ''kyrtos'' or ''kurtos'', meaning "curved, arching") is a measure of the "tailedness" of the probability distribution of a real-valued random variable. Like skewness, kurtosi ...
, while alternatives can be based on
L-moment In statistics, L-moments are a sequence of statistics used to summarize the shape of a probability distribution. They are linear combinations of order statistics ( L-statistics) analogous to conventional moments, and can be used to calculate qu ...
s. A different measure is the distance skewness, for which a value of zero implies central symmetry.


Dependence

The common measure of dependence between paired random variables is the
Pearson product-moment correlation coefficient In statistics, the Pearson correlation coefficient (PCC, pronounced ) ― also known as Pearson's ''r'', the Pearson product-moment correlation coefficient (PPMCC), the bivariate correlation, or colloquially simply as the correlation coefficient ...
, while a common alternative summary statistic is
Spearman's rank correlation coefficient In statistics, Spearman's rank correlation coefficient or Spearman's ''ρ'', named after Charles Spearman and often denoted by the Greek letter \rho (rho) or as r_s, is a nonparametric measure of rank correlation ( statistical dependence between ...
. A value of zero for the
distance correlation In statistics and in probability theory, distance correlation or distance covariance is a measure of dependence between two paired random vectors of arbitrary, not necessarily equal, dimension. The population distance correlation coefficient is zer ...
implies independence.


Human perception of summary statistics

Humans efficiently use summary statistics to quickly perceive the gist of auditory and visual information.


See also

* Common test statistics *
Descriptive statistics A descriptive statistic (in the count noun sense) is a summary statistic that quantitatively describes or summarizes features from a collection of information, while descriptive statistics (in the mass noun sense) is the process of using and an ...
*
Sample statistics A statistic (singular) or sample statistic is any quantity computed from values in a sample which is considered for a statistical purpose. Statistical purposes include estimating a population parameter, describing a sample, or evaluating a hypot ...
*
Sufficient statistic In statistics, a statistic is ''sufficient'' with respect to a statistical model and its associated unknown parameter if "no other statistic that can be calculated from the same sample provides any additional information as to the value of the pa ...


References


External links

* {{DEFAULTSORT:Summary Statistics ja:要約統計量