Categorization
   HOME

TheInfoList



OR:

Classification is the activity of assigning objects to some pre-existing classes or categories. This is distinct from the task of establishing the classes themselves (for example through cluster analysis). Examples include diagnostic tests, identifying spam emails and deciding whether to give someone a driving license. As well as 'category', synonyms or near-synonyms for 'class' include 'type', 'species', 'order', 'concept', 'taxon', 'group', 'identification' and 'division'. The meaning of the word 'classification' (and its synonyms) may take on one of several related meanings. It may encompass both classification and the creation of classes, as for example in 'the task of categorizing pages in Wikipedia'; this overall activity is listed under
taxonomy image:Hierarchical clustering diagram.png, 280px, Generalized scheme of taxonomy Taxonomy is a practice and science concerned with classification or categorization. Typically, there are two parts to it: the development of an underlying scheme o ...
. It may refer exclusively to the underlying scheme of classes (which otherwise may be called a taxonomy). Or it may refer to the label given to an object by the classifier. Classification is a part of many different kinds of activities and is studied from many different points of view including
medicine Medicine is the science and Praxis (process), practice of caring for patients, managing the Medical diagnosis, diagnosis, prognosis, Preventive medicine, prevention, therapy, treatment, Palliative care, palliation of their injury or disease, ...
,
philosophy Philosophy ('love of wisdom' in Ancient Greek) is a systematic study of general and fundamental questions concerning topics like existence, reason, knowledge, Value (ethics and social sciences), value, mind, and language. It is a rational an ...
, law,
anthropology Anthropology is the scientific study of humanity, concerned with human behavior, human biology, cultures, society, societies, and linguistics, in both the present and past, including archaic humans. Social anthropology studies patterns of behav ...
,
biology Biology is the scientific study of life and living organisms. It is a broad natural science that encompasses a wide range of fields and unifying principles that explain the structure, function, growth, History of life, origin, evolution, and ...
,
taxonomy image:Hierarchical clustering diagram.png, 280px, Generalized scheme of taxonomy Taxonomy is a practice and science concerned with classification or categorization. Typically, there are two parts to it: the development of an underlying scheme o ...
,
cognition Cognition is the "mental action or process of acquiring knowledge and understanding through thought, experience, and the senses". It encompasses all aspects of intellectual functions and processes such as: perception, attention, thought, ...
, communications, knowledge organization,
psychology Psychology is the scientific study of mind and behavior. Its subject matter includes the behavior of humans and nonhumans, both consciousness, conscious and Unconscious mind, unconscious phenomena, and mental processes such as thoughts, feel ...
,
statistics Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a s ...
,
machine learning Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of Computational statistics, statistical algorithms that can learn from data and generalise to unseen data, and thus perform Task ( ...
,
economics Economics () is a behavioral science that studies the Production (economics), production, distribution (economics), distribution, and Consumption (economics), consumption of goods and services. Economics focuses on the behaviour and interac ...
and
mathematics Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...
.


Binary vs multi-class classification

Methodological work aimed at improving the accuracy of a classifier is commonly divided between cases where there are exactly two classes ( binary classification) and cases where there are three or more classes ( multiclass classification).


Evaluation of accuracy

Unlike in
decision theory Decision theory or the theory of rational choice is a branch of probability theory, probability, economics, and analytic philosophy that uses expected utility and probabilities, probability to model how individuals would behave Rationality, ratio ...
, it is assumed that a classifier repeats the classification task over and over. And unlike a lottery, it is assumed that each classification can be either right or wrong; in the theory of measurement, classification is understood as measurement against a nominal scale. Thus it is possible to try to measure the accuracy of a classifier. Measuring the accuracy of a classifier allows a choice to be made between two alternative classifiers. This is important both when developing a classifier and in choosing which classifier to deploy. There are however many different methods for evaluating the accuracy of a classifier and no general method for determining which method should be used in which circumstances. Different fields have taken different approaches, even in binary classification. In
pattern recognition Pattern recognition is the task of assigning a class to an observation based on patterns extracted from data. While similar, pattern recognition (PR) is not to be confused with pattern machines (PM) which may possess PR capabilities but their p ...
, error rate is popular. The Gini coefficient and KS statistic are widely used in the credit scoring industry. Sensitivity and specificity are widely used in epidemiology and medicine. Precision and recall are widely used in information retrieval. Classifier accuracy depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given problems (a phenomenon that may be explained by the no-free-lunch theorem).


See also

* Help:Category, for information about Wikipedia's categories * Class (disambiguation) * Classified (disambiguation) * Classifier (disambiguation) * Cognitive categorization * Data classification (disambiguation) * Classification theorem *
Folk taxonomy A folk taxonomy is a vernacular name, naming system, as distinct from Taxonomy (general), scientific taxonomy. Folk biological classification is the way people traditionally describe and organize the world around them, typically making generous us ...
* Fuzzy classification


References


External links

*{{Commonscat-inline Theories Former disambiguation pages converted to set index articles