Minimum redundancy feature selection is an algorithm frequently used to accurately identify characteristics of genes and phenotypes and to narrow down their relevance. It is usually described in its pairing with relevant feature selection as ''Minimum Redundancy Maximum Relevance'' (mRMR). The method was first proposed in 2003 by Hanchuan Peng and Chris Ding, followed by a theoretical formulation based on mutual information, along with the first definition of multivariate mutual information, published in IEEE Transactions on Pattern Analysis and Machine Intelligence in 2005.
''Feature selection'', one of the basic problems in pattern recognition and machine learning, identifies subsets of data that are relevant to the parameters used and is normally called ''Maximum Relevance''. These subsets often contain material that is relevant but redundant, and mRMR attempts to address this problem by removing those redundant subsets. mRMR has applications in many areas, such as cancer diagnosis and speech recognition.
Features can be selected in many different ways. One scheme is to select the features that correlate most strongly with the classification variable. This has been called maximum-relevance selection. Many heuristic algorithms can be used, such as sequential forward, backward, or floating selection.
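As an illustration of maximum-relevance selection, the following minimal sketch ranks features by the absolute Pearson correlation with the classification variable and keeps the top k. The function names (`pearson`, `max_relevance`) and the toy data are hypothetical, chosen only for this example; they are not taken from the cited papers.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant column counts as uncorrelated
    return cov / math.sqrt(vx * vy)

def max_relevance(features, target, k):
    """Return the names of the k features most correlated with target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy data: f2 is a scaled copy of f1, f3 is weakly relevant.
features = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 12],
    "f3": [0, 1, 0, 1, 0, 1],
}
target = [0, 0, 0, 1, 1, 1]

print(max_relevance(features, target, 2))
```

On this toy data the two selected features are `f1` and its scaled copy `f2`, which carry the same information. Ranking by relevance alone does nothing to avoid that kind of duplication.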
On the other hand, features can be selected to be mutually far away from each other while still having "high" correlation to the classification variable. This scheme, termed ''Minimum Redundancy Maximum Relevance'' (mRMR) selection, has been found to be more powerful than maximum-relevance selection.
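The idea can be sketched as a greedy forward search: at each step, pick the candidate whose relevance to the classification variable, minus its average redundancy with the features already chosen, is largest. The sketch below assumes absolute Pearson correlation for both relevance and redundancy and combines them by subtraction; these choices, the function names, and the toy data are illustrative assumptions rather than the published procedure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant column counts as uncorrelated
    return cov / math.sqrt(vx * vy)

def mrmr(features, target, k):
    """Greedy selection: maximize relevance minus mean redundancy."""
    relevance = {n: abs(pearson(c, target)) for n, c in features.items()}
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]  # first pick: relevance only
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy data: f2 is a scaled copy of f1, f3 is weakly relevant.
features = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 12],
    "f3": [0, 1, 0, 1, 0, 1],
}
target = [0, 0, 0, 1, 1, 1]

print(mrmr(features, target, 2))
```

Here the second pick is `f3` rather than the duplicate of the first pick: the duplicate is highly relevant but fully redundant, so its score falls below that of the weaker but less correlated feature.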
As a special case, the "correlation" can be replaced by the statistical dependency between variables. Mutual information can be used to quantify the dependency. In this case, it is shown that mRMR is an approximation to maximizing the dependency between the joint distribution of the selected features and the classification variable.
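For reference, the mutual-information form of these criteria can be written as follows, in the notation of the 2005 paper by Peng, Long, and Ding listed under External links. Here S is the selected feature set, c the classification variable, and I(·;·) the mutual information; the difference form of the combined criterion shown last is one common choice.

```latex
% Mutual information between two discrete variables x and y
I(x; y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}

% Max-dependency: dependency of c on the joint distribution of the selected features
\max_{S} \; I(\{x_i,\; i = 1, \dots, m\};\, c)

% Max-relevance: mean mutual information between each selected feature and c
\max_{S} \; D(S, c), \qquad D = \frac{1}{|S|} \sum_{x_i \in S} I(x_i; c)

% Min-redundancy: mean pairwise mutual information among the selected features
\min_{S} \; R(S), \qquad R = \frac{1}{|S|^2} \sum_{x_i, x_j \in S} I(x_i; x_j)

% mRMR: combine the two criteria
\max_{S} \; \Phi(D, R), \qquad \Phi = D - R
```

The joint quantity in the max-dependency criterion is hard to estimate from limited samples as the number of features grows, which is why the pairwise relevance and redundancy terms are used as an approximation.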
Studies have tried different measures of redundancy and relevance. A 2010 study compared several measures in the context of biomedical images.
[Auffarth, B., Lopez, M., Cerquides, J. (2010). Comparison of redundancy and relevance measures for feature selection in tissue classification of CT images. Advances in Data Mining. Applications and Theoretical Aspects. p. 248--262. Springer. http://www.csc.kth.se/~auffarth/publications/redrel.pdf]
External links
* Peng, H.C., Long, F., and Ding, C., "Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 8, pp. 1226–1238, 2005.
* Chris Ding and Hanchuan Peng, "Minimum Redundancy Feature Selection from Microarray Gene Expression Data", 2nd IEEE Computer Society Bioinformatics Conference (CSB 2003), 11–14 August 2003, Stanford, CA, USA, pp. 523–529.
* Penglab mRMR