Bayes Classifier

	Bayes Classifier In statistical classification, the Bayes classifier minimizes the probability of misclassification. Definition: Suppose a pair (X,Y) takes values in \mathbb{R}^d \times \{1,2,\dots,K\}, where Y is the class label of X. Assume that the conditional distribution of X, given that the label Y takes the value r, is given by (X \mid Y=r) \sim P_r for r=1,2,\dots,K, where "\sim" means "is distributed as" and P_r denotes a probability distribution. A classifier is a rule that assigns to an observation X=x a guess or estimate of what the unobserved label Y=r actually was. In theoretical terms, a classifier is a measurable function C: \mathbb{R}^d \to \{1,2,\dots,K\}, with the interpretation that C classifies the point x to the class C(x). The probability of misclassification, or risk, of a classifier C is defined as \mathcal{R}(C) = \operatorname{P}\{C(X) \neq Y\}. The Bayes classifier is C^\text{Bayes}(x) = \underset{r \in \{1,2,\dots,K\}}{\operatorname{argmax}} \operatorname{P}(Y=r \mid X=x). In practice, as in most of statistics, the d ...
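The rule in the Bayes Classifier entry above, picking the label r with the largest posterior probability P(Y=r | X=x), can be sketched in a few lines when the class distributions are known. The example below is only an illustration under stated assumptions: the two classes, their equal priors, and the one-dimensional Gaussian class distributions are all hypothetical and do not come from the entry.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical setup (an assumption for illustration): two classes with equal
# priors, where X given Y=r is Gaussian with the (mean, std) listed below.
PRIORS = {1: 0.5, 2: 0.5}
CLASS_PARAMS = {1: (0.0, 1.0), 2: (2.0, 1.0)}


def posterior(x):
    """Return P(Y=r | X=x) for every class r, using Bayes' theorem."""
    joint = {r: PRIORS[r] * gaussian_pdf(x, *CLASS_PARAMS[r]) for r in PRIORS}
    total = sum(joint.values())
    return {r: p / total for r, p in joint.items()}


def bayes_classifier(x):
    """Assign x to the class with the largest posterior probability."""
    post = posterior(x)
    return max(post, key=post.get)
```

With these assumed parameters the decision boundary sits halfway between the two means, so points below 1.0 go to class 1 and points above it go to class 2. In practice the class distributions are unknown and would have to be estimated from data.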
	Statistical Classification In statistics, classification is the problem of identifying which of a set of categories (sub-populations) an observation (or observations) belongs to. Examples are assigning a given email to the "spam" or "non-spam" class, and assigning a diagnosis to a given patient based on observed characteristics of the patient (sex, blood pressure, presence or absence of certain symptoms, etc.). Often, the individual observations are analyzed into a set of quantifiable properties, known variously as explanatory variables or features. These properties may variously be categorical (e.g. "A", "B", "AB" or "O", for blood type), ordinal (e.g. "large", "medium" or "small"), integer-valued (e.g. the number of occurrences of a particular word in an email) or real-valued (e.g. a measurement of blood pressure). Other classifiers work by comparing observations to previous observations by means of a similarity or distance function. An algorithm that implements classification, especially in a ...
	Probability Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, 0 indicates impossibility of the event and 1 indicates certainty (Alan Stuart and Keith Ord, Kendall's Advanced Theory of Statistics, Volume 1: Distribution Theory, 6th ed., 2009; William Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed., Wiley, 1968). The higher the probability of an event, the more likely it is that the event will occur. A simple example is the tossing of a fair (unbiased) coin. Since the coin is fair, the two outcomes ("heads" and "tails") are both equally probable; the probability of "heads" equals the probability of "tails"; and since no other outcomes are possible, the probability of either "heads" or "tails" is 1/2 (which could also be written ...
	Conditional Distribution In probability theory and statistics, given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X is the probability distribution of Y when X is known to be a particular value; in some cases the conditional probabilities may be expressed as functions containing the unspecified value x of X as a parameter. When both X and Y are categorical variables, a conditional probability table is typically used to represent the conditional probability. The conditional distribution contrasts with the marginal distribution of a random variable, which is its distribution without reference to the value of the other variable. If the conditional distribution of Y given X is a continuous distribution, then its probability density function is known as the conditional density function. The properties of a conditional distribution, such as the moments, are often referred to by corresponding names such as the conditional mean and conditional variance. ...
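For two categorical variables, the conditional probability table mentioned in the Conditional Distribution entry can be derived from a joint table by dividing each joint probability by the marginal probability of the conditioning value. The sketch below illustrates that step; the joint table and its labels ("rain", "dry", "late", "on_time") are made-up values chosen only for the example.

```python
def marginal_x(joint):
    """Marginal distribution of X: sum the joint probabilities over all values of Y."""
    totals = {}
    for (x, _y), p in joint.items():
        totals[x] = totals.get(x, 0.0) + p
    return totals


def conditional_y_given_x(joint, x_value):
    """Conditional distribution of Y given X = x_value, as a dict {y: P(Y=y | X=x_value)}."""
    p_x = marginal_x(joint)[x_value]
    return {y: p / p_x for (x, y), p in joint.items() if x == x_value}


# Hypothetical joint distribution of (X, Y); the four probabilities sum to 1.
JOINT = {
    ("rain", "late"): 0.15,
    ("rain", "on_time"): 0.15,
    ("dry", "late"): 0.07,
    ("dry", "on_time"): 0.63,
}

# One row of the conditional probability table per value of X.
TABLE = {x: conditional_y_given_x(JOINT, x) for x in marginal_x(JOINT)}
```

Each row of `TABLE` sums to 1, whereas the marginal distribution of X ignores Y entirely, which is the contrast the entry draws.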
	Classification Rule Given a population whose members each belong to one of a number of different sets or classes, a classification rule or classifier is a procedure by which the elements of the population set are each predicted to belong to one of the classes. A perfect classification is one for which every element in the population is assigned to the class it really belongs to. An imperfect classification is one in which some errors appear, and then statistical analysis must be applied to analyse the classification. A special kind of classification rule is binary classification, for problems in which there are only two classes. Testing classification rules: Given a data set consisting of pairs x and y, where x denotes an element of the population and y the class it belongs to, a classification rule h(x) is a function that assigns each element x to a predicted class \hat{y}=h(x). A binary classification is such that the label y can take only one of two values. The true ...
	Risk (statistics) Statistical risk is a quantification of a situation's risk using statistical methods. These methods can be used to estimate a probability distribution for the outcome of a specific variable, or at least one or more key parameters of that distribution, and from that estimated distribution a risk function can be used to obtain a single non-negative number representing a particular conception of the risk of the situation. Statistical risk is taken into account in a variety of contexts, including finance and economics, and many risk functions can be used depending on the context. One measure of the statistical risk of a continuous variable, such as the return on an investment, is simply the estimated variance of the variable, or equivalently the square root of the variance, called the standard deviation. Another measure in finance, one which views upside risk as unimportant compared to downside risk, is the downside beta. In the context of a binary variable, a ...
	Consistency (statistics) In statistics, consistency of procedures, such as computing confidence intervals or conducting hypothesis tests, is a desired property of their behaviour as the number of items in the data set to which they are applied increases indefinitely. In particular, consistency requires that the outcome of the procedure with unlimited data should identify the underlying truth (Dodge, Y. (2003), The Oxford Dictionary of Statistical Terms, OUP; entries for consistency, consistent estimator, consistent test). Use of the term in statistics derives from Sir Ronald Fisher in 1922. Use of the terms "consistency" and "consistent" in statistics is restricted to cases where essentially the same procedure can be applied to any number of data items. In complicated applications of statistics, there may be several ways in which the number of data items may grow. For example, records for rainfall within an area might increase in three ways: records for additional time periods; records for additiona ...
	Naive Bayes Classifier In statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Bayesian network models, but coupled with kernel density estimation, they can achieve high accuracy levels. Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem. Maximum-likelihood training can be done by evaluating a closed-form expression, which takes linear time, rather than by expensive iterative approximation as used for many other types of classifiers. In the statistics literature, naive Bayes models are known under a variety of names, including simple Bayes and independence Bayes. All these names reference the use of Bayes' theorem in the classifier's decision rule, but naive Bayes is not (necessarily) a Bayesian method. Introduct ...
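The closed-form, linear-time training described in the Naive Bayes Classifier entry can be illustrated with a Gaussian variant, in which each feature is modelled independently per class by a sample mean and variance. This is a minimal sketch under assumptions: the Gaussian feature model, the small variance floor, the function names, and the toy data are choices made for the example and are not taken from the entry.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior plus per-feature mean and variance in one pass per class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        columns = list(zip(*members))
        means = [sum(col) / len(col) for col in columns]
        # Maximum-likelihood variance, with a small floor (an assumption) to avoid zeros.
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (len(members) / len(rows), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Pick the class maximizing log prior plus the sum of per-feature log densities."""
    def log_score(prior, means, variances):
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2.0 * math.pi * var) - (v - m) ** 2 / (2.0 * var)
        return score

    return max(model, key=lambda label: log_score(*model[label]))


# Toy data (hypothetical): two well-separated clusters with two features each.
ROWS = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
LABELS = ["a", "a", "a", "b", "b", "b"]
MODEL = fit_gaussian_nb(ROWS, LABELS)
```

Summing per-feature log densities is where the independence assumption enters: the joint class-conditional density is treated as a product of one-dimensional densities, so the number of parameters grows linearly with the number of features.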
	Bayes Error Rate In statistical classification, Bayes error rate is the lowest possible error rate for any classifier of a random outcome (into, for example, one of two categories) and is analogous to the irreducible error (Tumer, K. (1996), "Estimating the Bayes error rate through classifier combining", in Proceedings of the 13th International Conference on Pattern Recognition, Volume 2, 695–699). A number of approaches to the estimation of the Bayes error rate exist. One method seeks to obtain analytical bounds which are inherently dependent on distribution parameters, and hence difficult to estimate. Another approach focuses on class densities, while yet another method combines and compares various classifiers. The Bayes error rate finds important use in the study of patterns and machine learning techniques. Error determination: In terms of machine learning and pattern classification, the labels of a set of random observations can be divided into 2 or more classes. Each observation is cal ...
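When the class distributions are fully known, the Bayes error rate for two classes can be computed directly as the integral of the smaller of the two prior-weighted class densities. The sketch below assumes a hypothetical problem with two one-dimensional Gaussian classes of equal variance and equal priors, for which the result also has a closed form in terms of the standard normal CDF; all parameters and function names are illustrative assumptions.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_error_numeric(prior1, params1, prior2, params2,
                        lo=-10.0, hi=12.0, steps=20000):
    """Midpoint-rule integral of min(prior1 * p1(x), prior2 * p2(x)) over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += min(prior1 * gaussian_pdf(x, *params1),
                     prior2 * gaussian_pdf(x, *params2))
    return total * width


def bayes_error_closed_form(mean1, mean2, std):
    """Closed form for equal priors and a shared standard deviation."""
    return normal_cdf(-abs(mean1 - mean2) / (2.0 * std))


# Hypothetical classes: N(0, 1) and N(2, 1), each with prior 0.5.
NUMERIC = bayes_error_numeric(0.5, (0.0, 1.0), 0.5, (2.0, 1.0))
CLOSED = bayes_error_closed_form(0.0, 2.0, 1.0)
```

Under these assumptions both values are close to 0.1587, meaning no classifier can do better than roughly 16% error on this problem. With unknown distributions the rate must be estimated, which is what the approaches listed in the entry address.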
	Fubini's Theorem In mathematical analysis, Fubini's theorem is a result that gives conditions under which it is possible to compute a double integral by using an iterated integral, introduced by Guido Fubini in 1907. One may switch the order of integration if the double integral yields a finite answer when the integrand is replaced by its absolute value: \iint\limits_{X\times Y} f(x,y)\,\text{d}(x,y) = \int_X\left(\int_Y f(x,y)\,\text{d}y\right)\text{d}x = \int_Y\left(\int_X f(x,y)\,\text{d}x\right)\text{d}y \qquad \text{if} \qquad \iint\limits_{X\times Y} |f(x,y)|\,\text{d}(x,y) < +\infty. Fubini's theorem implies that two iterated integrals are equal to the corresponding double integral across its integrands. Tonelli's theorem, introduced by Leonida Tonelli in 1909, is similar, but applies to a non-negative measurable function rather than one integrable over their domains. A related theorem is ...
	Bayesian Statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation that views probability as the limit of the relative frequency of an event after many trials. Bayesian statistical methods use Bayes' theorem to compute and update probabilities after obtaining new data. Bayes' theorem describes the conditional probability of an event based on data as well as prior information or beliefs about the event or conditions related to the event. For example, in Bayesian inference, Bayes' theorem can be used to estimate the parameters of a probability distribution or statistical model. Since Bayesian statistics treats pr ...