The empirical probability, relative frequency, or experimental probability of an event is the ratio of the number of outcomes in which a specified event occurs to the total number of trials, not in a theoretical sample space but in an actual experiment. More generally, empirical probability estimates probabilities from experience and observation.
Given an event ''A'' in a sample space, the relative frequency of ''A'' is the ratio ''m/n'', ''m'' being the number of outcomes in which the event ''A'' occurs, and ''n'' being the total number of outcomes of the experiment.
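The ratio ''m/n'' can be computed directly from a record of outcomes. The following is a minimal sketch; the function name `relative_frequency` and the die-roll data are hypothetical and chosen only for illustration.

```python
def relative_frequency(outcomes, event):
    """Return m/n: the fraction of observed outcomes for which `event` holds."""
    n = len(outcomes)  # total number of outcomes of the experiment
    if n == 0:
        raise ValueError("need at least one outcome")
    m = sum(1 for outcome in outcomes if event(outcome))  # outcomes in which A occurs
    return m / n


# Hypothetical experiment: 12 recorded rolls of a six-sided die.
rolls = [3, 6, 2, 6, 1, 4, 6, 5, 2, 6, 3, 1]


def is_six(roll):
    """Event A: the roll shows a six."""
    return roll == 6


# Four of the twelve rolls are sixes, so m = 4 and n = 12.
print(relative_frequency(rolls, is_six))
```

Passing the event as a predicate keeps the counting logic independent of what the event actually is.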
In statistical terms, the empirical probability is an ''estimate'' or estimator of a probability. In simple cases, where the result of a trial only determines whether or not the specified event has occurred, modelling using a binomial distribution might be appropriate and then the empirical estimate is the maximum likelihood estimate. It is the Bayesian estimate for the same case if certain assumptions are made for the prior distribution of the probability. If a trial yields more information, the empirical probability can be improved on by adopting further assumptions in the form of a statistical model: if such a model is fitted, it can be used to derive an estimate of the probability of the specified event.
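The link between the empirical estimate and the binomial maximum likelihood estimate can be checked numerically. The sketch below assumes hypothetical counts (''m'' = 7 successes in ''n'' = 20 trials) and uses a simple grid search in place of an analytic derivation; the helper name `binomial_log_likelihood` is illustrative.

```python
import math


def binomial_log_likelihood(p, m, n):
    """Log-likelihood of success probability p given m successes in n trials.

    The binomial coefficient is omitted because it does not depend on p.
    """
    return m * math.log(p) + (n - m) * math.log(1 - p)


m, n = 7, 20  # hypothetical counts, for illustration only

# Candidate values of p strictly inside (0, 1), spaced 0.001 apart.
grid = [k / 1000 for k in range(1, 1000)]

# The grid point with the highest log-likelihood approximates the MLE.
p_hat = max(grid, key=lambda p: binomial_log_likelihood(p, m, n))

print(p_hat, m / n)  # the maximiser coincides with the relative frequency m/n
```

As one example of the prior assumptions mentioned above: under a uniform prior on the probability, the posterior is a Beta(''m'' + 1, ''n'' − ''m'' + 1) distribution, whose mode is again ''m/n''.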
Advantages and disadvantages
Advantages
An advantage of estimating probabilities using empirical probabilities is that this procedure is relatively free of assumptions.
For example, consider estimating the probability among a population of men that they satisfy two conditions:
# that they are over 6 feet in height.
# that they prefer strawberry jam to raspberry jam.
A direct estimate could be found by counting the number of men who satisfy both conditions to give the empirical probability of the combined condition. An alternative estimate could be found by multiplying the proportion of men who are over 6 feet in height by the proportion of men who prefer strawberry jam to raspberry jam, but this estimate relies on the assumption that the two conditions are statistically independent.
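The two estimates can be compared on a small data set. The survey records below are invented for illustration and are not taken from any real study.

```python
# Hypothetical survey records: (over_six_feet, prefers_strawberry) for ten men.
men = [
    (True, True), (True, False), (True, True), (False, True),
    (False, False), (False, True), (False, False), (True, True),
    (False, False), (False, True),
]
n = len(men)

# Marginal proportions for each condition taken separately.
p_tall = sum(tall for tall, _ in men) / n
p_strawberry = sum(jam for _, jam in men) / n

# Direct empirical estimate: count the men who satisfy both conditions.
p_both_direct = sum(tall and jam for tall, jam in men) / n

# Alternative estimate: valid only if the two conditions are independent.
p_both_product = p_tall * p_strawberry

print(p_both_direct, p_both_product)
```

In this invented sample the direct estimate (3 of 10 men) differs from the product of the marginals (0.4 × 0.6), which is the kind of discrepancy that appears when the independence assumption does not hold or when the sample is small.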
Disadvantages
A disadvantage in using empirical probabilities arises in estimating probabilities which are either very close to zero, or very close to one. In these cases very large sample sizes would be needed in order to estimate such probabilities to a good standard of relative accuracy. Here statistical models can help, depending on the context, and in general one can hope that such models would provide improvements in accuracy compared to empirical probabilities, provided that the assumptions involved actually do hold.
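A rough sense of the sample sizes involved can be obtained from the standard error of a relative frequency. The sketch below assumes independent trials with a common success probability ''p'' (a binomial model) and asks how many trials are needed for the standard error of ''m/n'' to equal a chosen fraction of ''p''; the function name and the 10% target are illustrative choices.

```python
import math


def trials_for_relative_error(p, rel_err):
    """Approximate number of trials n for which the standard error of m/n,
    sqrt(p * (1 - p) / n), equals rel_err * p.

    Assumes independent trials with the same success probability p.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.ceil((1 - p) / (p * rel_err ** 2))


# Required sample size grows roughly like 1/p as p approaches zero.
for p in (0.5, 0.01, 0.0001):
    print(p, trials_for_relative_error(p, 0.1))
```

For probabilities very close to one, the same calculation applies to the complementary event, whose probability 1 − ''p'' is close to zero.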
For example, consider estimating the probability that the lowest of the daily-maximum temperatures at a site in February in any one year is less than zero degrees Celsius. A record of such temperatures in past years could be used to estimate this probability. A model-based alternative would be to select a family of probability distributions and fit it to the dataset containing past years' values. The fitted distribution would provide an alternative estimate of the desired probability. This alternative method can provide an estimate of the probability even if all values in the record are greater than zero.
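The contrast between the two estimates can be sketched as follows. The temperature record is invented, and the normal family is an assumption made here purely for simplicity; the text does not specify a family, and in practice a distribution suited to minima might be preferred.

```python
from statistics import NormalDist

# Hypothetical record: lowest February daily-maximum temperature (°C), one value per year.
february_minima = [1.2, 3.5, 0.8, 2.9, 4.1, 1.7, 2.2, 0.4, 3.0, 2.6]

# Empirical estimate: fraction of recorded years with a value below zero.
empirical = sum(t < 0 for t in february_minima) / len(february_minima)

# Model-based estimate: fit a normal distribution (an assumed family)
# and read off the fitted probability of a value below zero.
fitted = NormalDist.from_samples(february_minima)
model_based = fitted.cdf(0.0)

print(empirical, model_based)
```

Because no value in this record is below zero, the empirical estimate is exactly zero, whereas the fitted distribution assigns a small positive probability to the event. How trustworthy that positive value is depends entirely on whether the assumed family is appropriate.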
Mixed nomenclature
The phrase ''a-posteriori probability'' is also used as an alternative to empirical probability or relative frequency.
The use of the phrase "a-posteriori" is reminiscent of terms in Bayesian statistics, but is not directly related to Bayesian inference, where ''a-posteriori probability'' is occasionally used to refer to posterior probability, which is different even though it has a confusingly similar name.
The term ''a-posteriori probability'', in its meaning as equivalent to empirical probability, may be used in conjunction with ''a priori probability'', which represents an estimate of a probability not based on any observations, but based on deductive reasoning.
See also
* Empirical distribution function
* Empirical measure
* Estimating quantiles from a sample
* Frequency probability