In
statistical classification
In statistics, classification is the problem of identifying which of a set of categories (sub-populations) an observation (or observations) belongs to. Examples are assigning a given email to the "spam" or "non-spam" class, and assigning a diagno ...
, Bayes error rate is the lowest possible error rate for any classifier of a random outcome (into, for example, one of two categories) and is analogous to the irreducible error.
[K. Tumer, K. (1996) "Estimating the Bayes error rate through classifier combining" in ''Proceedings of the 13th International Conference on Pattern Recognition'', Volume 2, 695–699]
A number of approaches to the estimation of the Bayes error rate exist. One method seeks to obtain analytical bounds which are inherently dependent on distribution parameters, and hence difficult to estimate. Another approach focuses on class densities, while yet another method combines and compares various classifiers.
The Bayes error rate finds important use in the study of patterns and
machine learning
Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence.
Machine ...
techniques.
Error determination
In terms of machine learning and pattern classification, the labels of a set of random observations can be divided into 2 or more classes. Each observation is called an ''instance'' and the class it belongs to is the ''label''.
The Bayes error rate of the data distribution is the probability an instance is misclassified by a classifier that knows the true class probabilities given the predictors.
For a
multiclass classifier, the expected prediction error may be calculated as follows:
: