Machine learning algorithms

The following outline is provided as an overview of and topical guide to machine learning.
Machine learning is a subfield of soft computing within computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence (http://www.britannica.com/EBchecked/topic/1116194/machine-learning). In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Machine learning explores the construction and study of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from an example training set of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.
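
As a minimal sketch of the idea in the paragraph above, the following Python snippet builds a model from a small training set and then uses it to predict an output for an unseen input. The data values, the choice of a straight-line model fitted by ordinary least squares, and the names `fit_line` and `predict` are all assumptions made for illustration; they are not taken from the outline.

```python
# Hypothetical training set: inputs and observed outputs (roughly y = 2x + 1).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.1, 2.9, 5.2, 7.1, 8.8]


def fit_line(xs, ys):
    """Learn slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned parameters to a new input."""
    slope, intercept = model
    return slope * x + intercept


# The model is derived from the data, not written by hand as fixed rules.
model = fit_line(train_x, train_y)
prediction = predict(model, 5.0)
print(model, prediction)
```

Changing the training set changes the fitted parameters and therefore the predictions, without any change to the program itself.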


What type of thing is machine learning?

* An academic discipline
* A branch of science
** An applied science
*** A subfield of computer science
**** A branch of artificial intelligence
**** A subfield of soft computing
*** Application of statistics


Branches of machine learning


Subfields of machine learning

* Computational learning theory – studying the design and analysis of machine learning algorithms.
* Grammar induction
* Meta-learning


Cross-disciplinary fields involving machine learning

* Adversarial machine learning
* Predictive analytics
* Quantum machine learning
* Robot learning
** Developmental robotics


Applications of machine learning

* Applications of machine learning
* Bioinformatics
* Biomedical informatics
* Computer vision
* Customer relationship management
* Data mining
* Earth sciences
* Email filtering
* Inverted pendulum – balance and equilibrium system.
* Natural language processing (NLP)
** Named Entity Recognition
** Automatic summarization
** Automatic taxonomy construction
** Dialog system
** Grammar checker
** Language recognition
*** Handwriting recognition
*** Optical character recognition
*** Speech recognition
**** Text to Speech Synthesis (TTS)
**** Speech Emotion Recognition (SER)
** Machine translation
** Question answering
** Speech synthesis
** Text mining
*** Term frequency–inverse document frequency (tf–idf)
** Text simplification
* Pattern recognition
** Facial recognition system
** Handwriting recognition
** Image recognition
** Optical character recognition
** Speech recognition
* Recommendation system
** Collaborative filtering
** Content-based filtering
** Hybrid recommender systems (Collaborative and content-based filtering)
* Search engine
** Search engine optimization
* Social Engineering


Machine learning hardware

* Graphics processing unit
* Tensor processing unit
* Vision processing unit


Machine learning tools

* Comparison of deep learning software


Machine learning frameworks


Proprietary machine learning frameworks

* Amazon Machine Learning
* Microsoft Azure Machine Learning Studio
* DistBelief – replaced by TensorFlow


Open source machine learning frameworks

* Apache SINGA
* Apache MXNet
* Caffe
* PyTorch
* mlpack
* TensorFlow
* Torch
* CNTK
* Accord.NET
* JAX


Machine learning libraries

* Deeplearning4j
* Theano
* scikit-learn
* Keras


Machine learning algorithms

* Almeida–Pineda recurrent backpropagation
* ALOPEX
* Backpropagation
* Bootstrap aggregating
* CN2 algorithm
* Constructing skill trees
* Dehaene–Changeux model
* Diffusion map
* Dominance-based rough set approach
* Dynamic time warping
* Error-driven learning
* Evolutionary multimodal optimization
* Expectation–maximization algorithm
* FastICA
* Forward–backward algorithm
* GeneRec
* Genetic Algorithm for Rule Set Production
* Growing self-organizing map
* Hyper basis function network
* IDistance
* K-nearest neighbors algorithm
* Kernel methods for vector output
* Kernel principal component analysis
* Leabra
* Linde–Buzo–Gray algorithm
* Local outlier factor
* Logic learning machine
* LogitBoost
* Manifold alignment
* Markov chain Monte Carlo (MCMC)
* Minimum redundancy feature selection
* Mixture of experts
* Multiple kernel learning
* Non-negative matrix factorization
* Online machine learning
* Out-of-bag error
* Prefrontal cortex basal ganglia working memory
* PVLV
* Q-learning
* Quadratic unconstrained binary optimization
* Query-level feature
* Quickprop
* Radial basis function network
* Randomized weighted majority algorithm
* Reinforcement learning
* Repeated incremental pruning to produce error reduction (RIPPER)
* Rprop
* Rule-based machine learning
* Skill chaining
* Sparse PCA
* State–action–reward–state–action
* Stochastic gradient descent
* Structured kNN
* T-distributed stochastic neighbor embedding
* Temporal difference learning
* Wake-sleep algorithm
* Weighted majority algorithm (machine learning)


Machine learning methods


Instance-based algorithm

* K-nearest neighbors algorithm (KNN)
* Learning vector quantization (LVQ)
* Self-organizing map (SOM)


Regression analysis

* Logistic regression
* Ordinary least squares regression (OLSR)
* Linear regression
* Stepwise regression
* Multivariate adaptive regression splines (MARS)
* Regularization algorithm
** Ridge regression
** Least Absolute Shrinkage and Selection Operator (LASSO)
** Elastic net
** Least-angle regression (LARS)
* Classifiers
** Probabilistic classifier
*** Naive Bayes classifier
** Binary classifier
** Linear classifier
** Hierarchical classifier


Dimensionality reduction

Dimensionality reduction
* Canonical correlation analysis (CCA)
* Factor analysis
* Feature extraction
* Feature selection
* Independent component analysis (ICA)
* Linear discriminant analysis (LDA)
* Multidimensional scaling (MDS)
* Non-negative matrix factorization (NMF)
* Partial least squares regression (PLSR)
* Principal component analysis (PCA)
* Principal component regression (PCR)
* Projection pursuit
* Sammon mapping
* t-distributed stochastic neighbor embedding (t-SNE)


Ensemble learning

Ensemble learning
* AdaBoost
* Boosting
* Bootstrap aggregating (Bagging)
* Ensemble averaging – process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model. Frequently an ensemble of models performs better than any individual model, because the various errors of the models "average out."
* Gradient boosted decision tree (GBDT)
* Gradient boosting machine (GBM)
* Random Forest
* Stacked Generalization (blending)
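
The "average out" effect described for ensemble averaging in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real training pipeline: each "model" is simulated as the true function plus independent Gaussian noise, and the true function, noise level, number of models, and all names (`truth`, `make_noisy_predictions`, `mse`) are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def truth(x):
    """Hypothetical target function the models are trying to approximate."""
    return 2.0 * x + 1.0


xs = [i / 10 for i in range(200)]


def make_noisy_predictions(inputs, noise_std=1.0):
    """Simulate one imperfect model: the truth plus independent Gaussian error."""
    return [truth(x) + random.gauss(0.0, noise_std) for x in inputs]


# Five simulated models, each with its own independent errors.
model_predictions = [make_noisy_predictions(xs) for _ in range(5)]

# Ensemble averaging: combine the models by taking the mean prediction per input.
ensemble_predictions = [sum(col) / len(col) for col in zip(*model_predictions)]


def mse(preds):
    """Mean squared error of a list of predictions against the true function."""
    return sum((p - truth(x)) ** 2 for p, x in zip(preds, xs)) / len(xs)


individual_mse = [mse(p) for p in model_predictions]
ensemble_mse = mse(ensemble_predictions)

print("individual:", [round(m, 3) for m in individual_mse])
print("ensemble:  ", round(ensemble_mse, 3))
```

Because the simulated errors are independent, averaging reduces their variance, so the ensemble error comes out well below that of each individual model. With real models whose errors are correlated, the improvement is typically smaller.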


Meta-learning

Meta-learning
* Inductive bias
* Metadata


Reinforcement learning

Reinforcement learning
* Q-learning
* State–action–reward–state–action (SARSA)
* Temporal difference learning (TD)
* Learning Automata


Supervised learning

Supervised learning
* Averaged one-dependence estimators (AODE)
* Artificial neural network
* Case-based reasoning
* Gaussian process regression
* Gene expression programming
* Group method of data handling (GMDH)
* Inductive logic programming
* Instance-based learning
* Lazy learning
* Learning Automata
* Learning Vector Quantization
* Logistic Model Tree
* Minimum message length (decision trees, decision graphs, etc.)
** Nearest Neighbor Algorithm
** Analogical modeling
* Probably approximately correct (PAC) learning
* Ripple down rules, a knowledge acquisition methodology
* Symbolic machine learning algorithms
* Support vector machines
* Random Forests
* Ensembles of classifiers
** Bootstrap aggregating (bagging)
** Boosting (meta-algorithm)
* Ordinal classification
* Information fuzzy networks (IFN)
* Conditional Random Field
* ANOVA
* Quadratic classifiers
* k-nearest neighbor
* Boosting
** SPRINT
* Bayesian networks
** Naive Bayes
* Hidden Markov models
** Hierarchical hidden Markov model


Bayesian

Bayesian statistics
* Bayesian knowledge base
* Naive Bayes
* Gaussian Naive Bayes
* Multinomial Naive Bayes
* Averaged One-Dependence Estimators (AODE)
* Bayesian Belief Network (BBN)
* Bayesian Network (BN)


Decision tree algorithms

Decision tree algorithm
* Decision tree
* Classification and regression tree (CART)
* Iterative Dichotomiser 3 (ID3)
* C4.5 algorithm
* C5.0 algorithm
* Chi-squared Automatic Interaction Detection (CHAID)
* Decision stump
* Conditional decision tree
* ID3 algorithm
* Random forest
* SLIQ


Linear classifier

Linear classifier
* Fisher's linear discriminant
* Linear regression
* Logistic regression
* Multinomial logistic regression
* Naive Bayes classifier
* Perceptron
* Support vector machine


Unsupervised learning

Unsupervised learning
* Expectation-maximization algorithm
* Vector Quantization
* Generative topographic map
* Information bottleneck method
* Association rule learning algorithms
** Apriori algorithm
** Eclat algorithm


Artificial neural networks

Artificial neural network Artificial neural networks (ANNs), usually simply called neural networks (NNs) or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected unit ...
* Feedforward neural network **
Extreme learning machine Extreme learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature learning with a single layer or multiple layers of hidden nodes, where the parameters of hidden n ...
**
Convolutional neural network In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Netwo ...
* Recurrent neural network ** Long short-term memory (LSTM) * Logic learning machine * Self-organizing map


Association rule learning

Association rule learning *
Apriori algorithm Apriori (Rakesh Agrawal and Ramakrishnan Srikant, "Fast algorithms for mining association rules", Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994) is an algorithm for frequent ...
*
Eclat algorithm Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.Pi ...
* FP-growth algorithm
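
As a rough illustration of the Apriori entry in this list, the sketch below mines frequent itemsets level by level: it keeps the itemsets that meet a minimum support count, joins them into larger candidates, and prunes any candidate that has an infrequent subset. The function name `apriori`, the absolute `min_support` threshold, and the shopping-basket data are assumptions made for this example, not part of the outline.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets.

    min_support is an absolute count of transactions (an assumption of
    this sketch; implementations often use a fraction instead).
    """
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {c for c in candidates
                   if all(frozenset(sub) in frequent
                          for sub in combinations(c, k - 1))
                   and support(c) >= min_support}
    return frequent


# Made-up transactions, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent_itemsets = apriori(baskets, min_support=2)
```

With these baskets, `{"milk", "butter"}` appears only once and is dropped, so the three-item candidate is pruned without ever being counted.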


Hierarchical clustering

Hierarchical clustering * Single-linkage clustering *
Conceptual clustering Conceptual clustering is a machine learning paradigm for unsupervised classification that has been defined by Ryszard S. Michalski in 1980 (Fisher 1987, Michalski 1980) and developed mainly during the 1980s. It is distinguished from ordinary dat ...


Cluster analysis

Cluster analysis *
BIRCH BIRCH (balanced iterative reducing and clustering using hierarchies) is an unsupervised data mining algorithm used to perform hierarchical clustering over particularly large data sets.
* DBSCAN * Expectation-maximization (EM) *
Fuzzy clustering Fuzzy clustering (also referred to as soft clustering or soft ''k''-means) is a form of clustering in which each data point can belong to more than one cluster. Clustering or cluster analysis involves assigning data points to clusters such that i ...
* Hierarchical Clustering * K-means clustering * K-medians * Mean-shift * OPTICS algorithm
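
The fuzzy clustering entry above notes that each data point can belong to more than one cluster. One common way to express this is the membership formula used by fuzzy c-means, sketched below for one-dimensional points. The function name `fuzzy_memberships`, the fixed cluster centers, and the fuzzifier value `m = 2` are assumptions for this example; the outline itself does not single out a specific algorithm.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Return the degree of membership of `point` in each cluster.

    Uses the fuzzy c-means membership formula with fuzzifier m > 1.
    Memberships are non-negative and sum to 1.
    """
    dists = [abs(point - c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with one or more centers: share full
        # membership among those centers and give none to the rest.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists]


# Hypothetical cluster centers, for illustration only.
centers = [0.0, 10.0]
near_first = fuzzy_memberships(1.0, centers)
in_between = fuzzy_memberships(5.0, centers)
```

A point close to the first center receives most, but not all, of its membership there, while a point midway between the two centers is split evenly.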


Anomaly detection

Anomaly detection * ''k''-nearest neighbors algorithm (''k''-NN) *
Local outlier factor In anomaly detection, the local outlier factor (LOF) is an algorithm proposed by Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng and Jörg Sander in 2000 for finding anomalous data points by measuring the local deviation of a given data point ...


Semi-supervised learning

Semi-supervised learning * Active learning – special case of semi-supervised learning in which a learning algorithm is able to interactively query the user (or some other information source) to obtain the desired outputs at new data points. * Generative models * Low-density separation * Graph-based methods *
Co-training Co-training is a machine learning algorithm used when there are only small amounts of labeled data and large amounts of unlabeled data. One of its uses is in text mining for search engines. It was introduced by Avrim Blum and Tom Mitchell in 1998. ...
* Transduction


Deep learning

Deep learning Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised. ...
* Deep belief networks * Deep Boltzmann machines * Deep convolutional neural networks * Deep recurrent neural networks *
Hierarchical temporal memory Hierarchical temporal memory (HTM) is a biologically constrained machine intelligence technology developed by Numenta. Originally described in the 2004 book ''On Intelligence'' by Jeff Hawkins with Sandra Blakeslee, HTM is primarily used today for ...
* Generative Adversarial Network ** Style transfer *
Transformer A transformer is a deep learning architecture built around the attention mechanism, introduced in 2017 and widely used for sequence tasks such as natural language processing.
* Stacked Auto-Encoders


Other machine learning methods and problems

* Anomaly detection * Association rules * Bias-variance dilemma * Classification **
Multi-label classification In machine learning, multi-label classification or multi-output classification is a variant of the classification problem where multiple nonexclusive labels may be assigned to each instance. Multi-label classification is a generalization of mult ...
* Clustering *
Data Pre-processing Data preprocessing can refer to manipulation or dropping of data before it is used in order to ensure or enhance performance, and is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to ...
* Empirical risk minimization *
Feature engineering Feature engineering or feature extraction or feature discovery is the process of using domain knowledge to extract features (characteristics, properties, attributes) from raw data. The motivation is to use these extra features to improve the qua ...
* Feature learning * Learning to rank *
Occam learning In computational learning theory, Occam learning is a model of algorithmic learning where the objective of the learner is to output a succinct representation of received training data. This is closely related to probably approximately correct (P ...
*
Online machine learning In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques whi ...
* PAC learning * Regression * Reinforcement Learning * Semi-supervised learning *
Statistical learning
* Structured prediction ** Graphical models ***
Bayesian network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Ba ...
*** Conditional random field (CRF) ***
Hidden Markov model A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ...
(HMM) * Unsupervised learning * VC theory


Machine learning research

* List of artificial intelligence projects *
List of datasets for machine learning research These datasets are applied for machine learning research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learning a ...


History of machine learning

History of machine learning * Timeline of machine learning


Machine learning projects

Machine learning projects *
DeepMind DeepMind Technologies is a British artificial intelligence subsidiary of Alphabet Inc. and research laboratory founded in 2010. DeepMind was acquired by Google in 2014 and became a wholly owned subsidiary of Alphabet Inc, after Google's restru ...
* Google Brain * OpenAI * Meta AI


Machine learning organizations

Machine learning organizations *
Knowledge Engineering and Machine Learning Group The Knowledge Engineering and Machine Learning group (KEMLg) is a research group belonging to the Technical University of Catalonia (UPC) – BarcelonaTech. It was founded by Prof. Ulises Cortés. The group has been active in the Artificial I ...


Machine learning conferences and workshops

* Artificial Intelligence and Security (AISec) (co-located workshop with CCS) *
Conference on Neural Information Processing Systems The Conference and Workshop on Neural Information Processing Systems (abbreviated as NeurIPS and formerly NIPS) is a machine learning and computational neuroscience conference held every December. The conference is currently a double-track meet ...
(NeurIPS, formerly NIPS) * ECML PKDD *
International Conference on Machine Learning The International Conference on Machine Learning (ICML) is the leading international academic conference in machine learning. Along with NeurIPS and ICLR, it is one of the three primary conferences of high impact in machine learning and artific ...
(ICML) * ML4ALL (Machine Learning For All)


Machine learning publications


Books on machine learning

Books about machine learning


Machine learning journals

* ''
Machine Learning
'' * ''
Journal of Machine Learning Research The ''Journal of Machine Learning Research'' is a peer-reviewed open access scientific journal covering machine learning. It was established in 2000 and the first editor-in-chief was Leslie Kaelbling. The current editors-in-chief are Francis Bac ...
'' (JMLR) * ''
Neural Computation
''


Persons influential in machine learning

* Alberto Broggi * Andrei Knyazev *
Andrew McCallum Andrew McCallum is a professor in the computer science department at University of Massachusetts Amherst. His primary specialties are in machine learning, natural language processing, information extraction, information integration, and socia ...
*
Andrew Ng Andrew Yan-Tak Ng (born 1976) is a British-born American computer scientist and technology entrepreneur focusing on machine learning and AI. Ng was a co-founder and head of Google Brain and was the former Chief Scientist at Baidu, buildin ...
* Anuraag Jain * Armin B. Cremers * Ayanna Howard * Barney Pell * Ben Goertzel *
Ben Taskar Ben Taskar (March 3, 1977 – November 18, 2013) was a professor and researcher in the area of machine learning and applications to computational linguistics and computer vision. He was a Magerman Term Associate Professor for Computer and Infor ...
*
Bernhard Schölkopf Bernhard Schölkopf is a German computer scientist (born 20 February 1968) known for his work in machine learning, especially on kernel methods and causality. He is a director at the Max Planck Institute for Intelligent Systems in Tübingen, ...
* Brian D. Ripley * Christopher G. Atkeson * Corinna Cortes * Demis Hassabis * Douglas Lenat *
Eric Xing Eric Poe Xing is an American computer scientist, academic administrator, and entrepreneur. Prior to his appointment as President of MBZUAI, Xing was a professor in the School of Computer Science at Carnegie Mellon University and researcher in mac ...
* Ernst Dickmanns * Geoffrey Hinton – co-inventor of the backpropagation and contrastive divergence training algorithms * Hans-Peter Kriegel * Hartmut Neven * Heikki Mannila *
Ian Goodfellow Ian J. Goodfellow is a computer scientist, engineer, and executive, most noted for his work on artificial neural networks and deep learning. He was previously employed as a research scientist at Google Brain and director of machine learn ...
– inventor of generative adversarial networks * Jacek M. Zurada *
Jaime Carbonell Jaime Guillermo Carbonell (July 29, 1953 – February 28, 2020) was a computer scientist who made seminal contributions to the development of natural language processing tools and technologies. His extensive research in machine translation result ...
* Jeremy Slovak *
Jerome H. Friedman Jerome Harold Friedman (born December 29, 1939) is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.
*
John D. Lafferty John D. Lafferty is an American scientist, Professor at Yale University and leading researcher in machine learning. He is best known for proposing the Conditional Random Fields with Andrew McCallum and Fernando C.N. Pereira. Biography In 2017, ...
* John Platt – invented SMO and Platt scaling * Julie Beth Lovins *
Jürgen Schmidhuber Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist most noted for his work in the field of artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artific ...
* Karl Steinbuch *
Katia Sycara Ekaterini Panagiotou Sycara (Greek: Κάτια Συκαρά) is a Greek computer scientist. She is an Edward Fredkin Research Professor of Robotics in the Robotics Institute, School of Computer Science at Carnegie Mellon University internationally ...
*
Leo Breiman Leo Breiman (January 27, 1928 – July 5, 2005) was a distinguished statistician at the University of California, Berkeley. He was the recipient of numerous honors and awards, and was a member of the United States National Academy of Sciences ...
– invented bagging and random forests *
Lise Getoor Lise Getoor is a professor in the computer science department, at the University of California, Santa Cruz, and an adjunct professor in the Computer Science Department at the University of Maryland, College Park. Her primary research interests ...
*
Luca Maria Gambardella Luca Maria Gambardella (born 4 January 1962) is an Italian computer scientist and author. He is the former director of the Dalle Molle Institute for Artificial Intelligence Research in Manno, in the Ticino canton of Switzerland. With Marco Do ...
* Léon Bottou * Marcus Hutter * Mehryar Mohri *
Michael Collins (computational linguist)
*
Michael I. Jordan Michael Irwin Jordan (born February 25, 1956) is an American scientist, professor at the University of California, Berkeley and researcher in machine learning, statistics, and artificial intelligence. Jordan was elected a member of the Nat ...
* Michael L. Littman *
Nando de Freitas Nando de Freitas is a researcher in the field of machine learning, and in particular in the subfields of neural networks, Bayesian inference and Bayesian optimization, and deep learning. Biography De Freitas was born in Zimbabwe. He did his un ...
* Ofer Dekel *
Oren Etzioni Oren Etzioni (born 1964) is an American entrepreneur, Professor Emeritus of computer science, and founding CEO of the Allen Institute for Artificial Intelligence (AI2). On June 15, 2022, he announced that he will step down as CEO of AI2 effective ...
*
Pedro Domingos Pedro Domingos is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for Markov logic network enabling uncertain inference. Education Domingos received an un ...
* Peter Flach * Pierre Baldi * Pushmeet Kohli * Ray Kurzweil * Rayid Ghani * Ross Quinlan * Salvatore J. Stolfo * Sebastian Thrun *
Selmer Bringsjord Selmer Bringsjord (born November 24, 1958) is the chair of the Department of Cognitive Science at Rensselaer Polytechnic Institute and a professor of Computer Science and Cognitive Science. He also holds an appointment in the Lally School of ...
*
Sepp Hochreiter Josef "Sepp" Hochreiter (born 14 February 1967) is a German computer scientist. Since 2018 he has led the Institute for Machine Learning at the Johannes Kepler University of Linz after having led the Institute of Bioinformatics from 2006 to 2018 ...
* Shane Legg *
Stephen Muggleton Stephen H. Muggleton FBCS, FIET, FAAAI, FECCAI, FSB, FREng (born 6 December 1959, son of Louis Muggleton) is Professor of Machine Learning and Head of the Computational Bioinformatics Laboratory at Imperial College London. * Steve Omohundro * Tom M. Mitchell * Trevor Hastie *
Vasant Honavar Vasant G. Honavar is an Indian born American computer scientist, and artificial intelligence, machine learning, big data, data science, causal inference, knowledge representation, bioinformatics and health informatics researcher and professor. ...
*
Vladimir Vapnik Vladimir Naumovich Vapnik (Russian: Владимир Наумович Вапник; born 6 December 1936) is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning, and the co-inventor of the support-vector machin ...
– co-inventor of the SVM and VC theory * Yann LeCun – invented convolutional neural networks * Yasuo Matsuyama *
Yoshua Bengio Yoshua Bengio (born March 5, 1964) is a Canadian computer scientist, most noted for his work on artificial neural networks and deep learning. He is a professor at the Department of Computer Science and Operations Research at the Université de ...
* Zoubin Ghahramani


See also

* Outline of artificial intelligence ** Outline of computer vision * Outline of robotics * Accuracy paradox *
Action model learning Action model learning (sometimes abbreviated action learning) is an area of machine learning concerned with creation and modification of software agent's knowledge about ''effects'' and ''preconditions'' of the ''actions'' that can be executed wi ...
* Activation function * Activity recognition * ADALINE *
Adaptive neuro fuzzy inference system An adaptive neuro-fuzzy inference system or adaptive network-based fuzzy inference system (ANFIS) is a kind of artificial neural network that is based on Takagi–Sugeno fuzzy inference system. The technique was developed in the early 1990s. Since ...
* Adaptive resonance theory * Additive smoothing * Adjusted mutual information *
AIVA AIVA (Artificial Intelligence Virtual Artist) is an algorithmic composition, electronic composer recognized by the SACEM. Description Created in February 2016, AIVA specializes in Classical music, classical and symphonic music composition. It b ...
* AIXI * AlchemyAPI * AlexNet * Algorithm selection *
Algorithmic inference Algorithmic inference gathers new developments in the statistical inference methods made feasible by the powerful computing devices widely available to any data analyst. Cornerstones in this field are computational learning theory, granular computin ...
* Algorithmic learning theory * AlphaGo * AlphaGo Zero * Alternating decision tree * Apprenticeship learning *
Causal Markov condition The Markov condition, sometimes called the Markov assumption, is an assumption made in Bayesian probability theory, that every node in a Bayesian network is conditionally independent of its nondescendants, given its parents. Stated loosely, it is as ...
*
Competitive learning Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. A variant of Hebbian learning, competitive learning works by increasing the special ...
* Concept learning *
Decision tree learning Decision tree learning is a supervised learning approach used in statistics, data mining and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of ...
* Differentiable programming * Distribution learning theory *
Eager learning In artificial intelligence, eager learning is a learning method in which the system tries to construct a general, input-independent target function during training of the system, as opposed to lazy learning, where generalization beyond the training ...
* End-to-end reinforcement learning * Error tolerance (PAC learning) * Explanation-based learning * Feature * GloVe * Hyperparameter *
Inferential theory of learning Inferential Theory of Learning (ITL) is an area of machine learning which describes inferential processes performed by learning agents. ITL has been continuously developed by Ryszard S. Michalski, starting in the 1980s. The first known publication ...
*
Learning automata A learning automaton is one type of machine learning algorithm studied since 1970s. Learning automata select their current action based on past experiences from the environment. It will fall into the range of reinforcement learning if the environme ...
* Learning classifier system * Learning rule * Learning with errors * M-Theory (learning framework) *
Machine learning control Machine learning control (MLC) is a subfield of machine learning, intelligent control and control theory which solves optimal control problems with methods of machine learning. Key applications are complex nonlinear systems for which linear cont ...
* Machine learning in bioinformatics *
Margin (machine learning)
* Markov chain geostatistics *
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
(MCMC) * Markov information source * Markov logic network * Markov model * Markov random field *
Markovian discrimination Within the probability theory Markov model, Markovian discrimination in spam filtering is a method used in CRM114 and other spam filters to model the statistical behaviors of spam and nonspam more accurately than in simple Bayesian methods. A s ...
* Maximum-entropy Markov model * Multi-armed bandit *
Multi-task learning Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time, while exploiting commonalities and differences across tasks. This can result in improved learning efficiency and prediction ac ...
* Multilinear subspace learning * Multimodal learning * Multiple-instance learning *
Never-Ending Language Learning Never-Ending Language Learning system (NELL) is a semantic machine learning system developed by a research team at Carnegie Mellon University, and supported by grants from DARPA, Google, NSF, and CNPq with portions of the system running on a superc ...
* Offline learning *
Parity learning Parity learning is a problem in machine learning. An algorithm that solves this problem must find a function ''ƒ'', given some samples (''x'', ''ƒ''(''x'')) and the assurance that ''ƒ'' computes the parity of bits at some fixed locations. T ...
*
Population-based incremental learning In computer science and machine learning, population-based incremental learning (PBIL) is an optimization algorithm, and an estimation of distribution algorithm. This is a type of genetic algorithm where the genotype of an entire population (probab ...
* Predictive learning * Preference learning * Proactive learning *
Proximal gradient methods for learning Proximal gradient (forward backward splitting) methods for learning is an area of research in optimization and statistical learning theory which studies algorithms for a general class of convex regularization problems where the regularization pena ...
* Semantic analysis *
Similarity learning Similarity learning is an area of supervised machine learning in artificial intelligence. It is closely related to regression and classification, but the goal is to learn a similarity function that measures how similar or related two objects are. ...
* Sparse dictionary learning * Stability (learning theory) * Statistical learning theory *
Statistical relational learning Statistical relational learning (SRL) is a subdiscipline of artificial intelligence and machine learning that is concerned with domain models that exhibit both uncertainty (which can be dealt with using statistical methods) and complex, relational ...
*
Tanagra (machine learning software)
*
Transfer learning Transfer learning (TL) is a research problem in machine learning (ML) that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize ...
* Variable-order Markov model *
Version space learning Version space learning is a logical approach to machine learning, specifically binary classification. Version space learning algorithms search a predefined space of hypotheses, viewed as a set of logical sentences. Formally, the hypothesis space i ...
* Waffles * Weka *
Loss function In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cos ...
**
Loss functions for classification In machine learning and mathematical optimization, loss functions for classification are computationally feasible loss functions representing the price paid for inaccuracy of predictions in classification problems (problems of identifying whi ...
** Mean squared error (MSE) ** Mean squared prediction error (MSPE) **
Taguchi loss function The Taguchi loss function is graphical depiction of loss developed by the Japanese business statistician Genichi Taguchi to describe a phenomenon affecting the value of products produced by a company. Praised by Dr. W. Edwards Deming (the business ...
* Low-energy adaptive clustering hierarchy
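
The loss function entries above describe a function that maps predictions and observed values onto a real number representing a cost. Mean squared error (MSE), one of the listed losses, is a simple instance. The sketch below uses a hypothetical helper `mean_squared_error` and made-up numbers.

```python
def mean_squared_error(y_true, y_pred):
    """Return the average of the squared differences between
    observed values and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up observations and predictions, for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
loss = mean_squared_error(observed, predicted)
```

Only the last prediction is wrong, by 2, so the loss is the squared error 4 averaged over three points.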


Other

* Anne O'Tate * Ant colony optimization algorithms * Anthony Levandowski * Anti-unification (computer science) * Apache Flume * Apache Giraph *
Apache Mahout Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily on linear algebra. In the past, many of the implementations use the ...
* Apache SINGA * Apache Spark * Apache SystemML * Aphelion (software) * Arabic Speech Corpus * Archetypal analysis * Arthur Zimek * Artificial ants *
Artificial bee colony algorithm In computer science and operations research, the artificial bee colony algorithm (ABC) is an optimization algorithm based on the intelligent foraging behaviour of honey bee swarm, proposed by Derviş Karaboğa (Erciyes University) in 2005. Alg ...
*
Artificial development Artificial development, also known as artificial embryogeny or machine intelligence or computational development, is an area of computer science and engineering concerned with computational models motivated by genotype–phenotype mappings in biol ...
*
Artificial immune system In artificial intelligence, artificial immune systems (AIS) are a class of computationally intelligent, rule-based machine learning systems inspired by the principles and processes of the vertebrate immune system. The algorithms are typically mode ...
*
Astrostatistics Astrostatistics is a discipline which spans astrophysics, statistical analysis and data mining. It is used to process the vast amount of data produced by automated scanning of the cosmos, to characterize complex datasets, and to link astronomica ...
* Averaged one-dependence estimators * Bag-of-words model * Balanced clustering *
Ball tree In computer science, a ball tree, balltree or metric tree, is a space partitioning data structure for organizing points in a multi-dimensional space. A ball tree partitions data points into a nested set of balls. The resulting data structure ...
* Base rate * Bat algorithm *
Baum–Welch algorithm In electrical engineering, statistical computing and bioinformatics, the Baum–Welch algorithm is a special case of the expectation–maximization algorithm used to find the unknown parameters of a hidden Markov model (HMM). It makes use of the ...
* Bayesian hierarchical modeling * Bayesian interpretation of kernel regularization * Bayesian optimization * Bayesian structural time series * Bees algorithm * Behavioral clustering *
Bernoulli scheme In mathematics, the Bernoulli scheme or Bernoulli shift is a generalization of the Bernoulli process to more than two possible outcomes. Bernoulli schemes appear naturally in symbolic dynamics, and are thus important in the study of dynamical sy ...
* Bias–variance tradeoff * Biclustering * BigML * Binary classification * Bing Predicts *
Bio-inspired computing Bio-inspired computing, short for biologically inspired computing, is a field of study which seeks to solve computer science problems using models of biology. It relates to connectionism, social behavior, and emergence. Within computer science, b ...
* Biogeography-based optimization * Biplot * Bondy's theorem * Bongard problem * Bradley–Terry model * BrownBoost * Brown clustering * Burst error *
CBCL (MIT) The Center for Biological & Computational Learning is a research lab at the Massachusetts Institute of Technology. CBCL was established in 1992 with support from the National Science Foundation. It is based in the Department of Brain & Cognitive Sc ...
* CIML community portal *
CMA-ES Covariance matrix adaptation evolution strategy (CMA-ES) is a particular kind of strategy for numerical optimization. Evolution strategies (ES) are stochastic, derivative-free methods for numerical optimization of non-linear or non-convex continu ...
* CURE data clustering algorithm * Cache language model *
Calibration (statistics) There are two main uses of the term calibration in statistics that denote special types of statistical inference problems. "Calibration" can mean :*a reverse process to regression, where instead of a future dependent variable being predicted from ...
* Canonical correspondence analysis * Canopy clustering algorithm *
Cascading classifiers Cascading is a particular case of ensemble learning based on the concatenation of several classifiers, using all information collected from the output from a given classifier as additional information for the next classifier in the cascade. Unl ...
* Category utility * CellCognition * Cellular evolutionary algorithm * Chi-square automatic interaction detection *
Chromosome (genetic algorithm) In genetic algorithms, a chromosome (also sometimes called a genotype) is a set of parameters which define a proposed solution to the problem that the genetic algorithm is trying to solve. The set of all solutions is known as the ''population''. T ...
* Classifier chains *
Cleverbot Cleverbot is a chatterbot web application that uses machine learning techniques to have conversations with humans. It was created by British AI scientist Rollo Carpenter. It was preceded by Jabberwacky, a chatbot project that began in 1988 a ...
*
Clonal selection algorithm In artificial immune systems, clonal selection algorithms are a class of algorithms inspired by the clonal selection theory of acquired immunity that explains how B and T lymphocytes improve their response to antigens over time called affinity mat ...
* Cluster-weighted modeling *
Clustering high-dimensional data Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional spaces of data are often encountered in areas such as medicine, where DNA microarray technology ca ...
*
Clustering illusion The clustering illusion is the tendency to erroneously consider the inevitable "streaks" or "clusters" arising in small samples from random distributions to be non-random. The illusion is caused by a human tendency to underpredict the amount of v ...
* CoBoosting * Cobweb (clustering) *
Cognitive computer A cognitive computer is a computer that hardwires artificial intelligence and machine-learning algorithms into an integrated circuit (printed circuit board) that closely reproduces the behavior of the human brain. It generally adopts a neuromorphic ...
* Cognitive robotics *
Collostructional analysis Collostructional analysis is a family of methods developed by (in alphabetical order) Stefan Th. Gries (University of California, Santa Barbara) and Anatol Stefanowitsch (Free University of Berlin). Collostructional analysis aims at measuring the d ...
* Common-method variance *
Complete-linkage clustering Complete-linkage clustering is one of several methods of agglomerative hierarchical clustering. At the beginning of the process, each element is in a cluster of its own. The clusters are then sequentially combined into larger clusters until all ...
* Computer-automated design *
Concept class In computational learning theory in mathematics, a concept over a domain ''X'' is a total Boolean function over ''X''. A concept class is a class of concepts. Concept classes are a subject of computational learning theory. Concept class terminolog ...
*
Concept drift In predictive analytics and machine learning, concept drift means that the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because the predictions become ...
*
Conference on Artificial General Intelligence The Conference on Artificial General Intelligence is a meeting of researchers in the field of Artificial General Intelligence organized by the AGI Society and held annually since 2008. The conference was initiated by the 2006 Bethesda Artificial Gen ...
* Conference on Knowledge Discovery and Data Mining * Confirmatory factor analysis * Confusion matrix *
Congruence coefficient In multivariate statistics, the congruence coefficient is an index of the similarity between factors that have been derived in a factor analysis. It was introduced in 1948 by Cyril Burt who referred to it as ''unadjusted correlation''. It is also c ...
* Connect (computer system) * Consensus clustering * Constrained clustering * Constrained conditional model * Constructive cooperative coevolution * Correlation clustering * Correspondence analysis *
Cortica Headquartered in Tel Aviv, Cortica utilizes unsupervised learning methods to recognize and analyze digital images and video. The technology developed by the Cortica team is based on research of the function of the human brain. Company Founding Co ...
* Coupled pattern learner * Cross-entropy method * Cross-validation (statistics) * Crossover (genetic algorithm) *
Cuckoo search In operations research, cuckoo search is an optimization algorithm developed by Xin-She Yang and Suash Deb in 2009. It was inspired by the obligate brood parasitism of some cuckoo species by laying their eggs in the nests of host birds of other ...
* Cultural algorithm * Cultural consensus theory * Curse of dimensionality *
DADiSP DADiSP (Data Analysis and Display, pronounced day-disp) is a numerical computing environment developed by DSP Development Corporation which allows one to display and manipulate data series, matrices and images with an interface similar to a s ...
* DARPA LAGR Program * Darkforest *
Dartmouth workshop The Dartmouth Summer Research Project on Artificial Intelligence was a 1956 summer workshop widely considered ... [Kline, Ronald R., Cybernetics, Automata Studies and the Dartmouth Conference on Artificial Intelligence, IEEE Annals of the History of ...]
*
DarwinTunes DarwinTunes was a research project into the use of natural selection to create music led by Bob MacCallum and Armand Leroi, scientists at Imperial College London. The project asks volunteers on the Internet to listen to automatically generated so ...
* Data Mining Extensions *
Data exploration Data exploration is an approach similar to initial data analysis, whereby a data analyst uses visual exploration to understand what is in a dataset and the characteristics of the data, rather than through traditional data management systems.
*
Data pre-processing Data preprocessing can refer to manipulation or dropping of data before it is used in order to ensure or enhance performance, and is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to ...
*
Data stream clustering In computer science, data stream clustering is defined as the clustering of data that arrive continuously such as telephone records, multimedia data, financial transactions etc. Data stream clustering is usually studied as a streaming algorithm an ...
* Dataiku *
Davies–Bouldin index The Davies–Bouldin index (DBI), introduced by David L. Davies and Donald W. Bouldin in 1979, is a metric for evaluating clustering algorithms. This is an internal evaluation scheme, where the validation of how well the clustering has been ...
*
Decision boundary In a statistical-classification problem with two classes, a decision boundary or decision surface is a hypersurface that partitions the underlying vector space into two sets, one for each class. The classifier will classify all the point ...
* Decision list *
Decision tree model In computational complexity the decision tree model is the model of computation in which an algorithm is considered to be basically a decision tree, i.e., a sequence of ''queries'' or ''tests'' that are done adaptively, so the outcome of the prev ...
*
Deductive classifier A deductive classifier is a type of artificial intelligence inference engine. It takes as input a set of declarations in a frame language about a domain such as medical research or molecular biology. For example, the names of classes, sub-classes ...
* DeepArt *
DeepDream DeepDream is a computer vision program created by Google engineer Alexander Mordvintsev that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia, thus creating a dream-like appearance reminiscent ...
* Deep Web Technologies * Defining length * Dendrogram * Dependability state model * Detailed balance *
Determining the number of clusters in a data set Determining the number of clusters in a data set, a quantity often labelled ''k'' as in the ''k''-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem. For a ...
* Detrended correspondence analysis * Developmental robotics *
Diffbot Diffbot is a developer of machine learning and computer vision algorithms and public APIs for extracting data from web pages / web scraping to create a knowledge base. The company has gained interest from its application of computer vision t ...
*
Differential evolution In evolutionary computation, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as ...
*
Discrete phase-type distribution The discrete phase-type distribution is a probability distribution that results from a system of one or more inter-related geometric distributions occurring in sequence, or phases. The sequence in which each of the phases occur may itself be a stoch ...
*
Discriminative model Discriminative models, also referred to as conditional models, are a class of logistical models used for classification or regression. They distinguish decision boundaries through observed data, such as pass/fail, win/lose, alive/dead or healthy/sic ...
*
Dissociated press Dissociated press is a parody generator (a computer program that generates nonsensical text). The generated text is based on another text using the Markov chain technique. The name is a play on "Associated Press" and the psychological term diss ...
* Distributed R * Dlib * Document classification *
Documenting Hate Documenting Hate is a project of ProPublica, in collaboration with a number of journalistic, academic, and computing organizations, for systematic tracking of hate crimes and bias incidents. It uses an online form to facilitate reporting of inciden ...
*
Domain adaptation Domain adaptation is a field associated with machine learning and transfer learning. This scenario arises when we aim at learning from a source data distribution a well performing model on a different (but related) target data distribution. Fo ...
* Doubly stochastic model * Dual-phase evolution * Dunn index * Dynamic Bayesian network * Dynamic Markov compression * Dynamic topic model * Dynamic unobserved effects model * EDLUT *
ELKI ELKI (for ''Environment for DeveLoping KDD-Applications Supported by Index-Structures'') is a data mining (KDD, knowledge discovery in databases) software framework developed for use in research and teaching. It was originally at the database s ...
*
Edge recombination operator The edge recombination operator (ERO) is an operator that creates a path that is similar to a set of existing paths (parents) by looking at the edges rather than the vertices. The main application of this is for crossover in genetic algorithms wh ...
* Effective fitness * Elastic map * Elastic matching * Elbow method (clustering) * Emergent (software) * Encog * Entropy rate * Erkki Oja *
Eurisko Eurisko (Gr., ''I discover'') is a discovery system written by Douglas Lenat in RLL-1, a representation language itself written in the Lisp programming language. A sequel to Automated Mathematician, it consists of heuristics, i.e. rules of t ...
* European Conference on Artificial Intelligence * Evaluation of binary classifiers * Evolution strategy * Evolution window * Evolutionary Algorithm for Landmark Detection * Evolutionary algorithm * Evolutionary art * Evolutionary music * Evolutionary programming * Evolvability (computer science) * Evolved antenna * Evolver (software) * Evolving classification function * Expectation propagation * Exploratory factor analysis * F1 score * FLAME clustering *
Factor analysis of mixed data In statistics, factor analysis of mixed data or factorial analysis of mixed data (FAMD, in the French original: ''AFDM'' or ''Analyse Factorielle de Données Mixtes''), is the factorial method devoted to data tables in which a group of individuals ...
*
Factor graph A factor graph is a bipartite graph representing the factorization of a function. In probability theory and its applications, factor graphs are used to represent factorization of a probability distribution function, enabling efficient computatio ...
* Factor regression model * Factored language model * Farthest-first traversal *
Fast-and-frugal trees In the study of decision-making, a fast-and-frugal tree is a simple graphical structure that categorizes objects by asking one question at a time. These decision trees are used in a range of fields: psychology, artificial intelligence, and managemen ...
* Feature Selection Toolbox *
Feature hashing In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix. It works by applyi ...
* Feature scaling * Feature vector *
Firefly algorithm In mathematical optimization, the firefly algorithm is a metaheuristic proposed by Xin-She Yang and inspired by the flashing behavior of fireflies. Algorithm In pseudocode the algorithm can be stated as: Begin 1) Objective functio ...
*
First-difference estimator In statistics and econometrics, the first-difference (FD) estimator is an estimator used to address the problem of omitted variables with panel data. It is consistent under the assumptions of the fixed effects model. In certain situations it can b ...
* First-order inductive learner *
Fish School Search Fish School Search (FSS), proposed by Bastos Filho and Lima Neto in 2008, is, in its basic version, a unimodal optimization algorithm inspired by the collective behavior of fish schools. The mechanisms of feeding and coordinated movement were used a ...
* Fisher kernel * Fitness approximation * Fitness function * Fitness proportionate selection * Fluentd * Folding@home * Formal concept analysis *
Forward algorithm The forward algorithm, in the context of a hidden Markov model (HMM), is used to calculate a 'belief state': the probability of a state at a certain time, given the history of evidence. The process is also known as ''filtering''. The forward alg ...
* Fowlkes–Mallows index *
Frederick Jelinek Frederick Jelinek (18 November 1932 – 14 September 2010) was a Czech-American researcher in information theory, automatic speech recognition, and natural language processing. He is well known for his oft-quoted statement, "Every time I fire a ...
* Frrole *
Functional principal component analysis Functional principal component analysis (FPCA) is a statistical method for investigating the dominant modes of variation of functional data. Using this method, a random function is represented in the eigenbasis, which is an orthonormal basis of t ...
* GATTO *
GLIMMER In bioinformatics, GLIMMER (Gene Locator and Interpolated Markov ModelER) is used to find genes in prokaryotic DNA. "It is effective at finding genes in bacteria, archea, viruses, typically finding 98-99% of all relatively long protein coding g ...
* Gary Bryce Fogel *
Gaussian adaptation Gaussian adaptation (GA), also called normal or natural adaptation (NA) is an evolutionary algorithm designed for the maximization of manufacturing yield due to statistical deviation of component values of signal processing systems. In short, GA ...
*
Gaussian process In probability theory and statistics, a Gaussian process is a stochastic process (a collection of random variables indexed by time or space), such that every finite collection of those random variables has a multivariate normal distribution, i.e. ...
* Gaussian process emulator *
Gene prediction In computational biology, gene prediction or gene finding refers to the process of identifying the regions of genomic DNA that encode genes. This includes protein-coding genes as well as RNA genes, but may also include prediction of other functio ...
* General Architecture for Text Engineering *
Generalization error For supervised learning applications in machine learning and statistical learning theory, generalization error (also known as the out-of-sample error or the risk) is a measure of how accurately an algorithm is able to predict outcome values for p ...
*
Generalized canonical correlation In statistics, generalized canonical correlation analysis (gCCA) is a way of making sense of cross-correlation matrices between the sets of random variables when there are more than two sets. While a conventional CCA generalizes principal com ...
* Generalized filtering * Generalized iterative scaling * Generalized multidimensional scaling * Generative adversarial network * Generative model * Genetic algorithm *
Genetic algorithm scheduling The genetic algorithm is an operational research method that may be used to solve scheduling problems in production planning. Importance of production scheduling To be competitive, corporations must minimize inefficiencies and maximize productivit ...
*
Genetic algorithms in economics Genetic algorithms have increasingly been applied to economics since the pioneering work by John H. Miller in 1986. They have been used to characterize a variety of models including the cobweb model, the overlapping generations model, game theory, s ...
* Genetic fuzzy systems * Genetic memory (computer science) * Genetic operator * Genetic programming * Genetic representation *
Geographical cluster A geographical cluster is a localized anomaly, usually an excess of something given the distribution or variation of something else. Often it is considered as an incidence rate ...
* Gesture Description Language * Geworkbench *
Glossary of artificial intelligence This glossary of artificial intelligence is a list of definitions of terms and concepts relevant to the study of artificial intelligence, its sub-disciplines, and related fields. Related glossaries include Glossary of computer science, Glossary o ...
*
Glottochronology Glottochronology (from Attic Greek γλῶττα ''tongue, language'' and χρόνος ''time'') is the part of lexicostatistics which involves comparative linguistics and deals with the chronological relationship between languages. [Sheila Embleton ...]
* Golem (ILP) * Google matrix * Grafting (decision trees) * Gramian matrix * Grammatical evolution * Granular computing *
GraphLab Turi is a graph-based, high performance, distributed computation framework written in C++. The GraphLab project was started by Prof. Carlos Guestrin of Carnegie Mellon University in 2009. It is an open source project using an Apache License. Wh ...
* Graph kernel * Gremlin (programming language) * Growth function * HUMANT (HUManoid ANT) algorithm *
Hammersley–Clifford theorem The Hammersley–Clifford theorem is a result in probability theory, mathematical statistics and statistical mechanics that gives necessary and sufficient conditions under which a strictly positive probability distribution (of events in a probabili ...
*
Harmony search
* Hebbian theory * Hidden Markov random field *
Hidden semi-Markov model A hidden semi-Markov model (HSMM) is a statistical model with the same structure as a hidden Markov model except that the unobservable process is semi-Markov rather than Markov. This means that the probability of there being a change in the hidden ...
*
Hierarchical hidden Markov model The hierarchical hidden Markov model (HHMM) is a statistical model derived from the hidden Markov model (HMM). In an HHMM, each state is considered to be a self-contained probabilistic model. More precisely, each state of the HHMM is itself an HHMM ...
* Higher-order factor analysis * Highway network * Hinge loss *
Holland's schema theorem Holland's schema theorem, also called the fundamental theorem of genetic algorithms, is an inequality that results from coarse-graining an equation for evolutionary dynamics. The Schema Theorem says that short, low-order schemata with above-averag ...
* Hopkins statistic *
Hoshen–Kopelman algorithm The Hoshen–Kopelman algorithm is a simple and efficient algorithm for labeling clusters on a grid, where the grid is a regular network of cells, with the cells being either occupied or unoccupied. This algorithm is based on a well-known union- ...
* Huber loss * IRCF360 *
Ian Goodfellow Ian J. Goodfellow is a computer scientist, engineer, and executive, most noted for his work on artificial neural networks and deep learning. He was previously employed as a research scientist at Google Brain and director of machine learn ...
* Ilastik * Ilya Sutskever * Immunocomputing * Imperialist competitive algorithm * Inauthentic text * Incremental decision tree *
Induction of regular languages In computational learning theory, induction of regular languages refers to the task of learning a formal description (e.g. grammar) of a regular language from a given set of example strings. Although E. Mark Gold has shown that not every regular la ...
* Inductive bias *
Inductive probability Inductive probability attempts to give the probability of future events based on past events. It is the basis for inductive reasoning, and gives the mathematical basis for learning and the perception of patterns. It is a source of knowledge about ...
*
Inductive programming Inductive programming (IP) is a special area of automatic programming, covering research from artificial intelligence and programming, which addresses learning of typically declarative (logic or functional) and often recursive programs from in ...
*
Influence diagram
* Information Harvesting * Information fuzzy networks * Information gain in decision trees * Information gain ratio * Inheritance (genetic algorithm) * Instance selection *
Intel RealSense Intel RealSense Technology is a product range of depth and tracking technologies designed to give machines and devices depth perception capabilities. The technologies, owned by Intel, are used in autonomous drones, robots, AR/VR, smart home device ...
* Interacting particle system *
Interactive machine translation Interactive machine translation (IMT) is a specific sub-field of computer-aided translation. Under this translation paradigm, the computer software that assists the human translator attempts to predict the text the user is going to input by taking ...
*
International Joint Conference on Artificial Intelligence The International Joint Conference on Artificial Intelligence (IJCAI) is the leading conference in the field of Artificial Intelligence. The conference series has been organized by the nonprofit IJCAI Organization since 1969, making it the oldest pr ...
* International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics *
International Semantic Web Conference The International Semantic Web Conference (ISWC) is a series of academic conferences and the premier international forum for the Semantic Web, Linked Data and Knowledge Graph Community. Here, scientists, industry specialists, and practitioners ...
* Iris flower data set * Island algorithm * Isotropic position * Item response theory *
Iterative Viterbi decoding Iterative Viterbi decoding is an algorithm that spots the subsequence ''S'' of an observation ''O'' having the highest average probability (i.e., probability scaled by the length of ''S'') of being generated by a given hidden Markov model ''M'' w ...
* JOONE * Jabberwacky * Jaccard index * Jackknife variance estimates for random forest * Java Grammatical Evolution * Joseph Nechvatal * Jubatus *
Julia (programming language) Julia is a high-level, dynamic programming language. Its features are well suited for numerical analysis and computational science. Distinctive aspects of Julia's design include a type system with parametric polymorphism in a dynamic progra ...
* Junction tree algorithm * K-SVD * K-means++ * K-medians clustering * K-medoids *
KNIME KNIME, the Konstanz Information Miner, is a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining "Building Blocks ...
* KXEN Inc. * K q-flats * Kaggle * Kalman filter * Katz's back-off model * Kernel adaptive filter * Kernel density estimation * Kernel eigenvoice * Kernel embedding of distributions * Kernel method * Kernel perceptron * Kernel random forest * Kinect * Klaus-Robert Müller * Kneser–Ney smoothing * Knowledge Vault * Knowledge integration * LIBSVM * LPBoost * Labeled data * LanguageWare * Language identification in the limit * Language model * Large margin nearest neighbor * Latent Dirichlet allocation * Latent class model * Latent semantic analysis * Latent variable * Latent variable model * Lattice Miner * Layered hidden Markov model * Learnable function class * Least squares support vector machine * Leave-one-out error * Leslie P. Kaelbling * Linear genetic programming * Linear predictor function * Linear separability * Lingyun Gu * Linkurious * Lior Ron (business executive) * List of genetic algorithm applications * List of metaphor-based metaheuristics * List of text mining software * Local case-control sampling * Local independence * Local tangent space alignment * Locality-sensitive hashing * Log-linear model * Logistic model tree * Low-rank approximation * Low-rank matrix approximations * MATLAB * MIMIC (immunology) * MXNet * Mallet (software project) * Manifold regularization * Margin-infused relaxed algorithm * Margin classifier * Mark V. Shaney * Massive Online Analysis * Matrix regularization * Matthews correlation coefficient * Mean shift * Mean squared error * Mean squared prediction error * Measurement invariance * Medoid * MeeMix * Melomics * Memetic algorithm * Meta-optimization * Mexican International Conference on Artificial Intelligence * Michael Kearns (computer scientist) * MinHash * Mixture model * Mlpy * Models of DNA evolution * Moral graph * Mountain car problem * Movidius * Multi-armed bandit *
Multi-label classification In machine learning, multi-label classification or multi-output classification is a variant of the classification problem where multiple nonexclusive labels may be assigned to each instance. Multi-label classification is a generalization of mult ...
* Multi expression programming * Multiclass classification * Multidimensional analysis * Multifactor dimensionality reduction * Multilinear principal component analysis * Multiple correspondence analysis * Multiple discriminant analysis * Multiple factor analysis * Multiple sequence alignment * Multiplicative weight update method * Multispectral pattern recognition * Mutation (genetic algorithm) * MysteryVibe * N-gram * NOMINATE (scaling method) * Native-language identification * Natural Language Toolkit * Natural evolution strategy * Nearest-neighbor chain algorithm * Nearest centroid classifier * Nearest neighbor search * Neighbor joining * Nest Labs * NetMiner * NetOwl * Neural Designer * Neural Engineering Object * Neural Lab * Neural modeling fields * Neural network software * NeuroSolutions * Neuro Laboratory * Neuroevolution * Neuroph * Niki.ai * Noisy channel model * Noisy text analytics * Nonlinear dimensionality reduction * Novelty detection * Nuisance variable * One-class classification * ONNX * OpenNLP * Optimal discriminant analysis * Oracle Data Mining * Orange (software) * Ordination (statistics) * Overfitting * PROGOL * PSIPRED * Pachinko allocation * PageRank * Parallel metaheuristic * Parity benchmark * Part-of-speech tagging * Particle swarm optimization * Path dependence * Pattern language (formal languages) * Peltarion Synapse * Perplexity * Persian Speech Corpus * Picas (app) * Pietro Perona * Pipeline Pilot * Piranha (software) * Pitman–Yor process * Plate notation * Polynomial kernel * Pop music automation * Population process * Portable Format for Analytics * Predictive Model Markup Language * Predictive state representation * Preference regression * Premature convergence * Principal geodesic analysis * Prior knowledge for pattern recognition * Prisma (app) * Probabilistic Action Cores * Probabilistic context-free grammar * Probabilistic latent semantic analysis * Probabilistic soft logic * Probability matching * Probit model * Product of experts * Programming with Big Data in R * Proper generalized decomposition * Pruning (decision trees) * Pushpak Bhattacharyya * Q methodology * Qloo * Quality control and genetic algorithms * Quantum Artificial Intelligence Lab * Queueing theory * Quick, Draw! * R (programming language) * Rada Mihalcea * Rademacher complexity * Radial basis function kernel * Rand index * Random indexing * Random projection * Random subspace method * Ranking SVM * RapidMiner * Rattle GUI * Raymond Cattell * Reasoning system * Regularization perspectives on support vector machines * Relational data mining * Relationship square * Relevance vector machine * Relief (feature selection) * Renjin * Repertory grid * Representer theorem * Reward-based selection * Richard Zemel * Right to explanation * RoboEarth * Robust principal component analysis * RuleML Symposium * Rule induction * Rules extraction system family * SAS (software) * SNNS * SPSS Modeler * SUBCLU * Sample complexity * Sample exclusion dimension * Santa Fe Trail problem * Savi Technology * Schema (genetic algorithms) * Search-based software engineering * Selection (genetic algorithm) * Self-Service Semantic Suite * Semantic folding * Semantic mapping (statistics) * Semidefinite embedding * Sense Networks * Sensorium Project * Sequence labeling * Sequential minimal optimization * Shattered set * Shogun (toolbox) * Silhouette (clustering) * SimHash * SimRank * Similarity measure * Simple matching coefficient * Simultaneous localization and mapping * Sinkov statistic * Sliced inverse regression * Snakes and Ladders * Soft independent modelling of class analogies * Soft output Viterbi algorithm * Solomonoff's theory of inductive inference * SolveIT Software * Spectral clustering * Spike-and-slab variable selection * Statistical machine translation * Statistical parsing * Statistical semantics * Stefano Soatto * Stephen Wolfram * Stochastic block model * Stochastic cellular automaton * Stochastic diffusion search * Stochastic grammar * Stochastic matrix * Stochastic universal sampling * Stress majorization * String kernel * Structural equation modeling * Structural risk minimization * Structured sparsity regularization * Structured support vector machine * Subclass reachability * Sufficient dimension reduction * Sukhotin's algorithm * Sum of absolute differences * Sum of absolute transformed differences * Swarm intelligence * Switching Kalman filter * Symbolic regression * Synchronous context-free grammar * Syntactic pattern recognition * TD-Gammon * TIMIT * Teaching dimension * Teuvo Kohonen * Textual case-based reasoning * Theory of conjoint measurement * Thomas G. Dietterich * Thurstonian model * Topic model * Tournament selection * Training, test, and validation sets * Transiogram * Trax Image Recognition * Trigram tagger * Truncation selection * Tucker decomposition * UIMA * UPGMA * Ugly duckling theorem * Uncertain data * Uniform convergence in probability * Unique negative dimension * Universal portfolio algorithm * User behavior analytics * VC dimension * VIGRA * Validation set * Vapnik–Chervonenkis theory * Variable-order Bayesian network * Variable kernel density estimation * Variable rules analysis * Variational message passing * Varimax rotation * Vector quantization * Vicarious (company) * Viterbi algorithm * Vowpal Wabbit * WACA clustering algorithm * WPGMA * Ward's method * Weasel program * Whitening transformation * Winnow (algorithm) * Win–stay, lose–switch * Witness set * Wolfram Language * Wolfram Mathematica * Writer invariant * XGBoost * Yooreeka * Zeroth (software)
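
As an illustration of one entry in the list above, feature hashing is described as turning arbitrary features into indices in a vector. The following is a minimal sketch of that idea, not a reference implementation: the function name `hash_features`, the default bucket count, and the choice of MD5 as the hash function are all assumptions made for this example.

```python
import hashlib


def hash_features(tokens, n_buckets=16):
    """Map string features to a fixed-length count vector (hashing trick).

    Illustrative sketch only: MD5 and n_buckets=16 are arbitrary choices.
    """
    vector = [0] * n_buckets
    for token in tokens:
        # A stable hash keeps indices identical across runs
        # (Python's built-in hash() is salted per process for strings).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % n_buckets
        vector[index] += 1  # distinct features may collide in one bucket
    return vector


if __name__ == "__main__":
    print(hash_features(["red", "round", "red", "small"]))
```

The vector length is fixed by `n_buckets` regardless of how many distinct features appear, which is where the space efficiency comes from; the trade-off is that unrelated features can share a bucket.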


Further reading

* Trevor Hastie, Robert Tibshirani and Jerome H. Friedman (2001). ''The Elements of Statistical Learning'', Springer.
* Pedro Domingos (September 2015). ''The Master Algorithm'', Basic Books.
* Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar (2012). ''Foundations of Machine Learning'', The MIT Press.
* Ian H. Witten and Eibe Frank (2011). ''Data Mining: Practical machine learning tools and techniques'', Morgan Kaufmann, 664 pp.
* David J. C. MacKay. ''Information Theory, Inference, and Learning Algorithms''. Cambridge: Cambridge University Press, 2003.
* Richard O. Duda, Peter E. Hart, David G. Stork (2001). ''Pattern classification'' (2nd edition), Wiley, New York.
* Christopher Bishop (1995). ''Neural Networks for Pattern Recognition'', Oxford University Press.
* Vladimir Vapnik (1998). ''Statistical Learning Theory''. Wiley-Interscience.
* Ray Solomonoff, ''An Inductive Inference Machine'', IRE Convention Record, Section on Information Theory, Part 2, pp. 56–62, 1957.
* Ray Solomonoff, ''An Inductive Inference Machine''. A privately circulated report from the 1956 Dartmouth Summer Research Conference on AI.


External links


* Data Science: Data to Insights from MIT (machine learning)
* Popular online course by Andrew Ng, at Coursera. It uses GNU Octave. The course is a free version of Stanford University's actual course taught by Ng (see.stanford.edu/Course/CS229, available for free).
* mloss is an academic database of open-source machine learning software.