HOME

TheInfoList



OR:

The empirical probability,
relative frequency In statistics, the frequency (or absolute frequency) of an event i is the number n_i of times the observation has occurred/recorded in an experiment or study. These frequencies are often depicted graphically or in tabular form. Types The cumula ...
, or experimental probability of an event is the ratio of the number of outcomes in which a specified event occurs to the total number of trials, not in a theoretical sample space but in an actual experiment. More generally, empirical probability estimates probabilities from
experience Experience refers to conscious events in general, more specifically to perceptions, or to the practical knowledge and familiarity that is produced by these conscious processes. Understood as a conscious event in the widest sense, experience involv ...
and
observation Observation is the active acquisition of information from a primary source. In living beings, observation employs the senses. In science, observation can also involve the perception and recording of data via the use of scientific instruments. The ...
. Given an event ''A'' in a sample space, the relative frequency of ''A'' is the ratio ''m/n'', ''m'' being the number of outcomes in which the event ''A'' occurs, and ''n'' being the total number of outcomes of the experiment. In statistical terms, the
empirical Empirical evidence for a proposition is evidence, i.e. what supports or counters this proposition, that is constituted by or accessible to sense experience or experimental procedure. Empirical evidence is of central importance to the sciences and ...
probability is an ''estimate'' or
estimator In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule (the estimator), the quantity of interest (the estimand) and its result (the estimate) are distinguished. For example, the ...
of a probability. In simple cases, where the result of a trial only determines whether or not the specified event has occurred, modelling using a
binomial distribution In probability theory and statistics, the binomial distribution with parameters ''n'' and ''p'' is the discrete probability distribution of the number of successes in a sequence of ''n'' independent experiments, each asking a yes–no ques ...
might be appropriate and then the empirical estimate is the
maximum likelihood estimate In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statist ...
. It is the
Bayesian estimate In estimation theory and decision theory, a Bayes estimator or a Bayes action is an estimator or decision rule that minimizes the posterior expected value of a loss function (i.e., the posterior expected loss). Equivalently, it maximizes the po ...
for the same case if certain assumptions are made for the
prior distribution In Bayesian statistical inference, a prior probability distribution, often simply called the prior, of an uncertain quantity is the probability distribution that would express one's beliefs about this quantity before some evidence is taken into ...
of the probability. If a trial yields more information, the empirical probability can be improved on by adopting further assumptions in the form of a
statistical model A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of sample data (and similar data from a larger population). A statistical model represents, often in considerably idealized form ...
: if such a model is fitted, it can be used to derive an estimate of the probability of the specified event


Advantages and disadvantages


Advantages

An advantage of estimating probabilities using empirical probabilities is that this procedure is relatively free of assumptions. For example, consider estimating the probability among a population of men that they satisfy two conditions: # that they are over 6 feet in height. # that they prefer strawberry jam to raspberry jam. A direct estimate could be found by counting the number of men who satisfy both conditions to give the empirical probability of the combined condition. An alternative estimate could be found by multiplying the proportion of men who are over 6 feet in height with the proportion of men who prefer strawberry jam to raspberry jam, but this estimate relies on the assumption that the two conditions are
statistically independent Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent if, informally speaking, the occurrence of o ...
.


Disadvantages

A disadvantage in using empirical probabilities arises in estimating probabilities which are either very close to zero, or very close to one. In these cases very large sample sizes would be needed in order to estimate such probabilities to a good standard of relative accuracy. Here
statistical model A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of sample data (and similar data from a larger population). A statistical model represents, often in considerably idealized form ...
s can help, depending on the context, and in general one can hope that such models would provide improvements in accuracy compared to empirical probabilities, provided that the assumptions involved actually do hold. For example, consider estimating the probability that the lowest of the daily-maximum temperatures at a site in February in any one year is less than zero degrees Celsius. A record of such temperatures in past years could be used to estimate this probability. A model-based alternative would be to select a family of
probability distributions In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon ...
and fit it to the dataset containing past years′ values. The fitted distribution would provide an alternative estimate of the desired probability. This alternative method can provide an estimate of the probability even if all values in the record are greater than zero.


Mixed nomenclature

The phrase ''a-posteriori probability'' is also used as an alternative to empirical probability or relative frequency. The use of the phrase "a-posteriori" is reminiscent of terms in
Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about the event, ...
, but is not directly related to
Bayesian inference Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Bayesian inference is an important technique in statistics, and ...
, where ''a-posteriori probability'' is occasionally used to refer to
posterior probability The posterior probability is a type of conditional probability that results from updating the prior probability with information summarized by the likelihood via an application of Bayes' rule. From an epistemological perspective, the posterior p ...
, which is different even though it has a confusingly similar name. The term ''a-posteriori probability'', in its meaning as equivalent to empirical probability, may be used in conjunction with ''
a priori probability An ''a priori'' probability is a probability that is derived purely by deductive reasoning. One way of deriving ''a priori'' probabilities is the principle of indifference, which has the character of saying that, if there are ''N'' mutually exc ...
'' which represents an estimate of a probability not based on any observations, but based on
deductive reasoning Deductive reasoning is the mental process of drawing deductive inferences. An inference is deductively valid if its conclusion follows logically from its premises, i.e. if it is impossible for the premises to be true and the conclusion to be fal ...
.
available online
)


See also

*
Empirical distribution function In statistics, an empirical distribution function (commonly also called an empirical Cumulative Distribution Function, eCDF) is the distribution function associated with the empirical measure of a sample. This cumulative distribution function ...
*
Empirical measure In probability theory, an empirical measure is a random measure arising from a particular realization of a (usually finite) sequence of random variables. The precise definition is found below. Empirical measures are relevant to mathematical sta ...
* Estimating quantiles from a sample *
Frequency probability Frequentist probability or frequentism is an interpretation of probability; it defines an event's probability as the limit of its relative frequency in many trials (the long-run probability). Probabilities can be found (in principle) by a repe ...


References

{{Reflist Applied probability Observational study Estimation theory