Mathematical statistics is the application of probability theory and other mathematical concepts to statistics, as opposed to techniques for collecting statistical data. Specific mathematical techniques that are commonly used in statistics include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure theory.
Introduction
Statistical data collection is concerned with the planning of studies, especially with the design of randomized experiments and with the planning of surveys using random sampling. The initial analysis of the data often follows the study protocol specified prior to the study being conducted. The data from a study can also be analyzed to consider secondary hypotheses inspired by the initial results, or to suggest new studies. A secondary analysis of the data from a planned study uses tools from data analysis, and the process of doing this is mathematical statistics.
Data analysis is divided into:
* descriptive statistics – the part of statistics that describes data, i.e. summarises the data and their typical properties.
* inferential statistics – the part of statistics that draws conclusions from data (using some model for the data). For example, inferential statistics involves selecting a model for the data, checking whether the data fulfill the conditions of a particular model, and quantifying the involved uncertainty (e.g. using confidence intervals).
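As a rough illustration of the two branches, the following Python sketch first summarises a small sample and then attaches an approximate 95% confidence interval to its mean. The data values are invented for illustration, and the interval assumes a normal model and uses a normal quantile; with so few observations a Student's t quantile would usually be preferred.

```python
import math
import statistics

# Hypothetical measurements, invented for illustration only.
sample = [4.1, 5.0, 4.7, 5.3, 4.9, 5.1, 4.6, 5.2]

# Descriptive statistics: summarise the data at hand.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: quantify uncertainty about the population mean.
# Assumption: a normal model, with the normal quantile used as an approximation.
z = statistics.NormalDist().inv_cdf(0.975)
half_width = z * stdev / math.sqrt(len(sample))
confidence_interval = (mean - half_width, mean + half_width)

print(f"mean={mean:.4f}, median={median:.4f}, stdev={stdev:.4f}")
print(f"approximate 95% CI for the mean: {confidence_interval}")
```

The first three quantities describe only the observed sample; the interval is a statement about the unobserved population and is only as trustworthy as the assumed model.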
While the tools of data analysis work best on data from randomized studies, they are also applied to other kinds of data, for example data from natural experiments and observational studies. In those cases the inference depends on the model chosen by the statistician, and so is subjective.
Topics
The following are some of the important topics in mathematical statistics:
Probability distributions
A probability distribution is a function that assigns a probability to each measurable subset of the possible outcomes of a random experiment, survey, or procedure of statistical inference. Examples are found in experiments whose sample space is non-numerical, where the distribution would be a categorical distribution; experiments whose sample space is encoded by discrete random variables, where the distribution can be specified by a probability mass function; and experiments with sample spaces encoded by continuous random variables, where the distribution can be specified by a probability density function. More complex experiments, such as those involving stochastic processes defined in continuous time, may demand the use of more general probability measures.
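The difference between a probability mass function and a probability density function can be made concrete with a short Python sketch. The helper name `binomial_pmf` and the parameter values are chosen here purely for illustration; the normal distribution comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Discrete case: the mass function gives probabilities directly,
# and the probabilities of all possible outcomes sum to 1.
n, p = 10, 0.3
total_mass = sum(binomial_pmf(k, n, p) for k in range(n + 1))

# Continuous case: probabilities come from integrating the density,
# done here through the cumulative distribution function.
standard_normal = NormalDist(mu=0.0, sigma=1.0)
prob_within_one_sd = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

# A density value is not itself a probability.
density_at_zero = standard_normal.pdf(0.0)

print(total_mass, prob_within_one_sd, density_at_zero)
```

For a continuous random variable the probability of any single point is zero, which is why the sketch reports the probability of an interval rather than of a value.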
A probability distribution can either be univariate or multivariate. A univariate distribution gives the probabilities of a single random variable taking on various alternative values; a multivariate distribution (a joint probability distribution) gives the probabilities of a random vector (a set of two or more random variables) taking on various combinations of values. Important and commonly encountered univariate probability distributions include the binomial distribution, the hypergeometric distribution, and the normal distribution. The multivariate normal distribution is a commonly encountered multivariate distribution.
Special distributions
* Normal distribution, the most common continuous distribution
* Bernoulli distribution, for the outcome of a single Bernoulli trial (e.g. success/failure, yes/no)
* Binomial distribution, for the number of "positive occurrences" (e.g. successes, yes votes, etc.) given a fixed total number of independent occurrences
* Negative binomial distribution, for binomial-type observations but where the quantity of interest is the number of failures before a given number of successes occurs
* Geometric distribution, for binomial-type observations but where the quantity of interest is the number of failures before the first success; a special case of the negative binomial distribution, where the number of successes is one
* Discrete uniform distribution, for a finite set of values (e.g. the outcome of a fair die)
* Continuous uniform distribution, for continuously distributed values
* Poisson distribution, for the number of occurrences of a Poisson-type event in a given period of time
* Exponential distribution, for the time before the next Poisson-type event occurs
* Gamma distribution, for the time before the next k Poisson-type events occur
* Chi-squared distribution, the distribution of a sum of squared standard normal variables; useful e.g. for inference regarding the sample variance of normally distributed samples (see chi-squared test)
* Student's t distribution, the distribution of the ratio of a standard normal variable and the square root of a scaled chi-squared variable; useful for inference regarding the mean of normally distributed samples with unknown variance (see Student's t-test)
* Beta distribution, for a single probability (real number between 0 and 1); conjugate to the Bernoulli distribution and binomial distribution
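The conjugacy noted in the last item can be sketched in a few lines of Python: if the prior on a success probability is Beta(a, b) and the data are Bernoulli outcomes, the posterior is Beta(a + successes, b + failures). The function name `beta_update`, the uniform Beta(1, 1) prior and the observations are all assumptions made for this illustration.

```python
from fractions import Fraction


def beta_update(a: int, b: int, observations: list[int]) -> tuple[int, int]:
    """Update a Beta(a, b) prior with 0/1 Bernoulli observations."""
    successes = sum(observations)
    failures = len(observations) - successes
    return a + successes, b + failures


# Hypothetical data: 4 successes and 2 failures, with a uniform Beta(1, 1) prior.
observations = [1, 0, 1, 1, 0, 1]
a_post, b_post = beta_update(1, 1, observations)

# The mean of a Beta(a, b) distribution is a / (a + b).
posterior_mean = Fraction(a_post, a_post + b_post)
print(a_post, b_post, posterior_mean)
```

Because the posterior stays in the Beta family, updating reduces to adding counts to the two parameters, which is what makes a conjugate prior convenient.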
Statistical inference
Statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation. [Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP.] Initial requirements of such a system of procedures for inference and induction are that the system should produce reasonable answers when applied to well-defined situations and that it should be general enough to be applied across a range of situations. Inferential statistics are used to test hypotheses and make estimations using sample data. Whereas descriptive statistics describe a sample, inferential statistics infer predictions about a larger population that the sample represents.
The outcome of statistical inference may be an answer to the question "what should be done next?", where this might be a decision about making further experiments or surveys, or about drawing a conclusion before implementing some organizational or governmental policy.
For the most part, statistical inference makes propositions about populations, using data drawn from the population of interest via some form of random sampling. More generally, data about a random process is obtained from its observed behavior during a finite period of time. Given a parameter or hypothesis about which one wishes to make inference, statistical inference most often uses:
* a statistical model of the random process that is supposed to generate the data, which is known when randomization has been used, and
* a particular realization of the random process; i.e., a set of data.
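The two ingredients listed above can be sketched in Python. Here the statistical model is assumed to be a sequence of independent Bernoulli trials with an unknown success probability, the realization is simulated with a fixed seed, and the parameter is estimated by the sample proportion. The function name, the value of `true_p` and the seed are illustrative assumptions, not part of any particular study.

```python
import random


def simulate_bernoulli(p: float, n: int, seed: int) -> list[int]:
    """Draw n independent Bernoulli(p) outcomes (the assumed statistical model)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


# In practice the parameter is unknown; it is fixed here only to simulate data.
true_p = 0.3

# A particular realization of the random process, i.e. a set of data.
data = simulate_bernoulli(true_p, n=10_000, seed=0)

# Inference: estimate the parameter and quantify the uncertainty of the estimate.
p_hat = sum(data) / len(data)
standard_error = (p_hat * (1 - p_hat) / len(data)) ** 0.5
print(f"estimate={p_hat:.4f}, standard error={standard_error:.4f}")
```

The estimate varies from one realization to the next; the standard error is a model-based description of how much.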
Regression
In statistics, regression analysis is a statistical process for estimating the relationships among variables. It includes many ways for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps one understand how the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied, while the other independent variables are held fixed. Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables – that is, the average value of the dependent variable when the independent variables are fixed. Less commonly, the focus is on a quantile, or other location parameter of the conditional distribution of the dependent variable given the independent variables. In all cases, the estimation target is a function of the independent variables called the regression function. In regression analysis, it is also of interest to characterize the variation of the dependent variable around the regression function, which can be described by a probability distribution.
Many techniques for carrying out regression analysis have been developed. Familiar methods, such as linear regression, are parametric, in that the regression function is defined in terms of a finite number of unknown parameters that are estimated from the data (e.g. using ordinary least squares). Nonparametric regression refers to techniques that allow the regression function to lie in a specified set of functions, which may be infinite-dimensional.
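As a minimal sketch of the parametric case, the Python function below fits a straight line with one independent variable by ordinary least squares, using the closed-form solution for the slope and intercept. The function name `ols_fit` and the data are illustrative assumptions; the data lie exactly on the line y = 1 + 2x so that the result is easy to check.

```python
def ols_fit(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope = sample covariance of x and y divided by sample variance of x.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
intercept, slope = ols_fit(xs, ys)
print(intercept, slope)
```

The regression function here has two unknown parameters, the intercept and the slope, which is what makes the method parametric; with noisy data the same formulas return the line that minimises the sum of squared residuals.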
Nonparametric statistics
Nonparametric statistics are values calculated from data in a way that is not based on parameterized families of probability distributions. They include both descriptive and inferential statistics. The typical parameters are the expectation, variance, etc. Unlike parametric statistics, nonparametric statistics make no assumptions about the probability distributions of the variables being assessed.
Non-parametric methods are widely used for studying populations that take on a ranked order (such as movie reviews receiving one to four stars). The use of non-parametric methods may be necessary when data have a ranking but no clear numerical interpretation, such as when assessing preferences. In terms of levels of measurement, non-parametric methods result in "ordinal" data.
As non-parametric methods make fewer assumptions, their applicability is much wider than that of the corresponding parametric methods. In particular, they may be applied in situations where less is known about the application in question. Also, due to the reliance on fewer assumptions, non-parametric methods are more robust.
One drawback of non-parametric methods is that, since they rely on fewer assumptions, they are generally less powerful than their parametric counterparts. This lower power is a particular concern because non-parametric methods are commonly used when the sample size is small. Many parametric methods are proven to be the most powerful tests through methods such as the Neyman–Pearson lemma and the likelihood-ratio test.
Another justification for the use of non-parametric methods is simplicity. In certain cases, even when the use of parametric methods is justified, non-parametric methods may be easier to use. Due both to this simplicity and to their greater robustness, non-parametric methods are seen by some statisticians as leaving less room for improper use and misunderstanding.
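The sign test is one example of how simple a non-parametric method can be: it uses only the signs of paired differences, not their magnitudes, and tests whether the median difference is zero. The Python sketch below computes an exact two-sided p-value from the Binomial(n, 1/2) distribution; the function name and the differences are invented for illustration.

```python
import math


def sign_test_p_value(differences: list[float]) -> float:
    """Exact two-sided sign test of the hypothesis that the median difference is 0."""
    nonzero = [d for d in differences if d != 0]  # ties carry no sign information
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)
    # Under the null hypothesis the number of positive signs is Binomial(n, 1/2).
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical paired differences: 8 positive and 2 negative.
differences = [1.2, 0.4, -0.3, 2.1, 0.8, 1.5, 0.9, -0.1, 1.1, 0.6]
p_value = sign_test_p_value(differences)
print(p_value)  # 0.109375
```

Because the magnitudes are discarded, the test needs almost no distributional assumptions, but it also ignores information that a parametric test such as Student's t-test would use, which illustrates the trade-off between robustness and power.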
Statistics, mathematics, and mathematical statistics
Mathematical statistics is a key subset of the discipline of statistics. Statistical theorists study and improve statistical procedures with mathematics, and statistical research often raises mathematical questions.
Mathematicians and statisticians like Gauss, Laplace, and C. S. Peirce used decision theory with probability distributions and loss functions (or utility functions). The decision-theoretic approach to statistical inference was reinvigorated by Abraham Wald and his successors and makes extensive use of scientific computing, analysis, and optimization; for the design of experiments, statisticians use algebra and combinatorics. But while statistical practice often relies on probability and decision theory, their application can be controversial.
See also
* Asymptotic theory (statistics)
Further reading
* Borovkov, A. A. (1999). ''Mathematical Statistics''. CRC Press.
* Virtual Laboratories in Probability and Statistics (Univ. of Ala.-Huntsville)
* StatiBot, an interactive online expert system on statistical tests