The following outline is provided as an overview of and topical guide to machine learning.

Machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...

is a subfield of

soft computing Soft computing is a set of algorithms, including neural networks, fuzzy logic, and evolutionary algorithms. These algorithms are tolerant of imprecision, uncertainty, partial truth and approximation. It is contrasted with hard computing: al ...

within

computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...

that evolved from the study of pattern recognition and computational learning theory in

artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech ...

.http://www.britannica.com/EBchecked/topic/1116194/machine-learning In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Machine learning explores the study and construction of

algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...

s that can

learn Learning is the process of acquiring new understanding, knowledge, behaviors, skills, values, attitudes, and preferences. The ability to learn is possessed by humans, animals, and some machines; there is also evidence for some kind of l ...

from and make predictions on

data In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpret ...

. Such algorithms operate by building a model from an example

training set In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from ...

of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.

What ''type'' of thing is machine learning?

* An

academic discipline An academy (Attic Greek: Ἀκαδήμεια; Koine Greek Ἀκαδημία) is an institution of secondary education, secondary or tertiary education, tertiary higher education, higher learning (and generally also research or honorary membershi ...

* A branch of

science Science is a systematic endeavor that builds and organizes knowledge in the form of testable explanations and predictions about the universe. Science may be as old as the human species, and some of the earliest archeological evidence ...

** An

applied science Applied science is the use of the scientific method and knowledge obtained via conclusions from the method to attain practical goals. It includes a broad range of disciplines such as engineering and medicine. Applied science is often contrasted ...

*** A subfield of

**** A branch of

**** A subfield of soft computing *** Application of

statistics Statistics (from German: '' Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, indust ...

Branches of machine learning

Subfields of machine learning

* Computational learning theory – studying the design and analysis of

machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...

algorithms. * Grammar induction * Meta-learning

Cross-disciplinary fields involving machine learning

* Adversarial machine learning *

Predictive analytics Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modeling, and machine learning that analyze current and historical facts to make predictions about future or otherwise unknown events. In busine ...

Quantum machine learning Quantum machine learning is the integration of quantum algorithms within machine learning programs. The most common use of the term refers to machine learning algorithms for the analysis of classical data executed on a quantum computer, i.e. quan ...

* Robot learning ** Developmental robotics

Applications of machine learning

* Applications of machine learning *

Bioinformatics Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...

Biomedical informatics Health informatics is the field of science and engineering that aims at developing methods and technologies for the acquisition, processing, and study of patient data, which can come from different sources and modalities, such as electronic hea ...

Computer vision Computer vision is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to understand and automate tasks that the human ...

Customer relationship management Customer relationship management (CRM) is a process in which a business or other organization administers its interactions with customers, typically using data analysis to study large amounts of information. CRM systems compile data from a r ...

– * Data mining * Earth sciences * Email filtering * Inverted pendulum – balance and equilibrium system. *

Natural language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to proc ...

(NLP) ** Named Entity Recognition ** Automatic summarization ** Automatic taxonomy construction ** Dialog system ** Grammar checker ** Language recognition *** Handwriting recognition ***

Optical character recognition Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a sc ...

***

Speech recognition Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers with the ...

**** Text to Speech Synthesis (TTS) **** Speech Emotion Recognition (SER) **

Machine translation Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation or interactive translation), is a sub-field of computational linguistics that investigates ...

Question answering Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural ...

Speech synthesis Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal langua ...

** Text mining ***

Term frequency–inverse document frequency Term may refer to: *Terminology, or term, a noun or compound word used in a specific context, in particular: **Technical term, part of the specialized vocabulary of a particular field, specifically: ***Scientific terminology, terms used by scienti ...

(tf–idf) ** Text simplification * Pattern recognition ** Facial recognition system ** Handwriting recognition ** Image recognition **

* Recommendation system ** Collaborative filtering ** Content-based filtering ** Hybrid recommender systems (Collaborative and content-based filtering) *

Search engine A search engine is a software system designed to carry out web searches. They search the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in a ...

Search engine optimization Search engine optimization (SEO) is the process of improving the quality and quantity of website traffic to a website or a web page from search engines. SEO targets unpaid traffic (known as "natural" or "organic" results) rather than dire ...

* Social Engineering

Machine learning hardware

Graphics processing unit A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, m ...

Tensor processing unit Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google for neural network machine learning, using Google's own TensorFlow software. Google began using TPUs internally in 2015, and ...

Vision processing unit A vision processing unit (VPU) is (as of 2018) an emerging class of microprocessor; it is a specific type of AI accelerator, designed to accelerate machine vision tasks. Overview Vision processing units are distinct from video processing uni ...

Machine learning tools

Comparison of deep learning software The following table compares notable software frameworks, libraries and computer programs for deep learning. Deep-learning software by name Comparison of compatibility of machine learning models See also *Comparison of numerical-analy ...

Machine learning frameworks

Proprietary machine learning frameworks

* Amazon Machine Learning * Microsoft Azure Machine Learning Studio *

DistBelief TensorFlow is a Free and open-source software, free and open-source Library (computing), software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on Types of artificial ...

– replaced by TensorFlow

Open source machine learning frameworks

* Apache Singa * Apache MXNet * Caffe * PyTorch * mlpack * TensorFlow * Torch * CNTK * Accord.Net * Jax

Machine learning libraries

Deeplearning4j Eclipse Deeplearning4j is a programming library written in Java for the Java virtual machine (JVM). It is a framework with wide support for deep learning algorithms. Deeplearning4j includes implementations of the restricted Boltzmann machine, ...

Theano In Greek mythology, Theano (; Ancient Greek: Θεανώ) may refer to the following personages: *Theano, wife of Metapontus, king of Icaria. Metapontus demanded that she bear him children, or leave the kingdom. She presented the children of Mel ...

* scikit-learn * Keras

Machine learning algorithms

* Almeida–Pineda recurrent backpropagation * ALOPEX * Backpropagation * Bootstrap aggregating *

CN2 algorithm The CN2 induction algorithm is a learning algorithm for rule induction.Clark, P. and Niblett, T (1989) The CN2 induction algorithm. Machine Learning 3(4):261-283. It is designed to work even when the training data is imperfect. It is based on ide ...

* Constructing skill trees * Dehaene–Changeux model *

Diffusion map Diffusion maps is a dimensionality reduction or feature extraction algorithm introduced by Ronald Coifman, Coifman and Lafon which computes a family of embeddings of a data set into Euclidean space (often low-dimensional) whose coordinates can be ...

Dominance-based rough set approach The dominance-based rough set approach (DRSA) is an extension of rough set theory for multi-criteria decision analysis (MCDA), introduced by Greco, Matarazzo and Słowiński. Greco, S., Matarazzo, B., Słowiński, R.: Rough sets theory for multi- ...

* Dynamic time warping * Error-driven learning * Evolutionary multimodal optimization * Expectation–maximization algorithm *

FastICA FastICA is an efficient and popular algorithm for independent component analysis invented by Aapo Hyvärinen at Helsinki University of Technology. Like most ICA algorithms, FastICA seeks an orthogonal rotation of prewhitened data, through a fixed- ...

Forward–backward algorithm The forward–backward algorithm is an inference algorithm for hidden Markov models which computes the posterior marginals of all hidden state variables given a sequence of observations/emissions o_:= o_1,\dots,o_T, i.e. it computes, for all hi ...

* GeneRec * Genetic Algorithm for Rule Set Production * Growing self-organizing map * Hyper basis function network * IDistance * K-nearest neighbors algorithm * Kernel methods for vector output * Kernel principal component analysis * Leabra *

Linde–Buzo–Gray algorithm The Linde–Buzo–Gray algorithm (introduced by Yoseph Linde, Andrés Buzo and Robert M. Gray in 1980) is a vector quantization algorithm to derive a good codebook A codebook is a type of document used for gathering and storing cryptography ...

Local outlier factor In anomaly detection, the local outlier factor (LOF) is an algorithm proposed by Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng and Jörg Sander in 2000 for finding anomalous data points by measuring the local deviation of a given data point ...

* Logic learning machine * LogitBoost *

Manifold alignment Manifold alignment is a class of machine learning algorithms that produce projections between sets of data, given that the original data sets lie on a common manifold. The concept was first introduced as such by Ham, Lee, and Saul in 2003, adding ...

* Markov chain Monte Carlo (MCMC) * Minimum redundancy feature selection *

Mixture of experts Mixture of experts (MoE) refers to a machine learning technique where multiple expert networks (learners) are used to divide a problem space into homogeneous regions. It differs from ensemble techniques in that typically only a few, or 1, expert mo ...

Multiple kernel learning Multiple kernel learning refers to a set of machine learning methods that use a predefined set of kernels and learn an optimal linear or non-linear combination of kernels as part of the algorithm. Reasons to use multiple kernel learning include ...

* Non-negative matrix factorization *

Online machine learning In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques whi ...

* Out-of-bag error * Prefrontal cortex basal ganglia working memory * PVLV * Q-learning * Quadratic unconstrained binary optimization * Query-level feature *

Quickprop Quickprop is an iterative method for determining the minimum of the loss function of an artificial neural network, following an algorithm inspired by the Newton's method. Sometimes, the algorithm is classified to the group of the second order lear ...

Radial basis function network In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inp ...

* Randomized weighted majority algorithm * Reinforcement learning * Repeated incremental pruning to produce error reduction (RIPPER) * Rprop *

Rule-based machine learning Rule-based machine learning (RBML) is a term in computer science intended to encompass any machine learning method that identifies, learns, or evolves 'rules' to store, manipulate or apply. The defining characteristic of a rule-based machine lear ...

* Skill chaining * Sparse PCA * State–action–reward–state–action *

Stochastic gradient descent Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of ...

* Structured kNN *

T-distributed stochastic neighbor embedding t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map. It is based on Stochastic Neighbor Embedding originally de ...

* Temporal difference learning * Wake-sleep algorithm *

Weighted majority algorithm (machine learning) In machine learning, weighted majority algorithm (WMA) is a meta learning algorithm used to construct a compound algorithm from a pool of prediction algorithms, which could be any type of learning algorithms, classifiers, or even real human exper ...

Machine learning methods

Instance-based algorithm

* K-nearest neighbors algorithm (KNN) * Learning vector quantization (LVQ) * Self-organizing map (SOM)

Regression analysis In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one ...

Logistic regression In statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables. In regression an ...

* Ordinary least squares regression (OLSR) *

Linear regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is cal ...

Stepwise regression In statistics, stepwise regression is a method of fitting regression models in which the choice of predictive variables is carried out by an automatic procedure. In each step, a variable is considered for addition to or subtraction from the set of ...

* Multivariate adaptive regression splines (MARS) * Regularization algorithm **

Ridge regression Ridge regression is a method of estimating the coefficients of multiple-regression models in scenarios where the independent variables are highly correlated. It has been used in many fields including econometrics, chemistry, and engineering. Also ...

** Least Absolute Shrinkage and Selection Operator (LASSO) ** Elastic net **

Least-angle regression In statistics, least-angle regression (LARS) is an algorithm for fitting linear regression models to high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani. Suppose we expect a response variab ...

(LARS) * Classifiers ** Probabilistic classifier *** Naive Bayes classifier **

Binary classifier Binary classification is the task of classifying the elements of a set into two groups (each called ''class'') on the basis of a classification rule. Typical binary classification problems include: * Medical testing to determine if a patient has ...

Linear classifier In the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by making a classification decision based on the v ...

** Hierarchical classifier

Dimensionality reduction

Dimensionality reduction *

Canonical correlation analysis In statistics, canonical-correlation analysis (CCA), also called canonical variates analysis, is a way of inferring information from cross-covariance matrices. If we have two vectors ''X'' = (''X''1, ..., ''X'n'') and ''Y' ...

(CCA) * Factor analysis * Feature extraction * Feature selection * Independent component analysis (ICA) *

Linear discriminant analysis Linear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics and other fields, to find a linear combination of features ...

(LDA) * Multidimensional scaling (MDS) * Non-negative matrix factorization (NMF) *

Partial least squares regression Partial least squares regression (PLS regression) is a statistical method that bears some relation to principal components regression; instead of finding hyperplanes of maximum variance between the response and independent variables, it finds a ...

(PLSR) * Principal component analysis (PCA) * Principal component regression (PCR) *

Projection pursuit Projection pursuit (PP) is a type of statistical technique which involves finding the most "interesting" possible projections in multidimensional data. Often, projections which deviate more from a normal distribution are considered to be more inter ...

* Sammon mapping *

t-distributed stochastic neighbor embedding t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map. It is based on Stochastic Neighbor Embedding originally de ...

(t-SNE)

Ensemble learning

Ensemble learning * AdaBoost * Boosting * Bootstrap aggregating (Bagging) *

Ensemble averaging In machine learning, particularly in the creation of artificial neural networks, ensemble averaging is the process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model. Frequently an ens ...

– process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model. Frequently an ensemble of models performs better than any individual model, because the various errors of the models "average out." * Gradient boosted decision tree (GBDT) * Gradient boosting machine (GBM) *

Random Forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of ...

* Stacked Generalization (blending)

Meta-learning

Meta-learning * Inductive bias * Metadata

Reinforcement learning

Reinforcement learning * Q-learning * State–action–reward–state–action (SARSA) * Temporal difference learning (TD) *

Learning Automata A learning automaton is one type of machine learning algorithm studied since 1970s. Learning automata select their current action based on past experiences from the environment. It will fall into the range of reinforcement learning if the environme ...

Supervised learning

Supervised learning * Averaged one-dependence estimators (AODE) *

Artificial neural network Artificial neural networks (ANNs), usually simply called neural networks (NNs) or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected unit ...

* Case-based reasoning *

Gaussian process regression In statistics, originally in geostatistics, kriging or Kriging, also known as Gaussian process regression, is a method of interpolation based on Gaussian process governed by prior covariances. Under suitable assumptions of the prior, kriging giv ...

* Gene expression programming * Group method of data handling (GMDH) * Inductive logic programming *

Instance-based learning In machine learning, instance-based learning (sometimes called memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have ...

Lazy learning In machine learning, lazy learning is a learning method in which generalization of the training data is, in theory, delayed until a query is made to the system, as opposed to eager learning, where the system tries to generalize the training data be ...

* Learning Vector Quantization * Logistic Model Tree * Minimum message length (decision trees, decision graphs, etc.) ** Nearest Neighbor Algorithm **

Analogical modeling Analogical modeling (AM) is a formal theory of exemplar based analogical reasoning, proposed by Royal Skousen, professor of Linguistics and English language at Brigham Young University in Provo, Utah. It is applicable to language modeling and othe ...

* Probably approximately correct learning (PAC) learning *

Ripple down rules Ripple-down rules (RDR) are a way of approaching knowledge acquisition. Knowledge acquisition refers to the transfer of knowledge from human experts to knowledge-based systems. Introductory material Ripple-down rules are an incremental approac ...

, a knowledge acquisition methodology * Symbolic machine learning algorithms *

Support vector machine In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laborat ...

s * Random Forests * Ensembles of classifiers ** Bootstrap aggregating (bagging) ** Boosting (meta-algorithm) * Ordinal classification * Information fuzzy networks (IFN) * Conditional Random Field * ANOVA * Quadratic classifiers *

k-nearest neighbor In statistics, the ''k''-nearest neighbors algorithm (''k''-NN) is a non-parametric supervised learning method first developed by Evelyn Fix and Joseph Hodges in 1951, and later expanded by Thomas Cover. It is used for classification and regressi ...

* Boosting ** SPRINT *

Bayesian network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Ba ...

s ** Naive Bayes *

Hidden Markov model A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ...

s **

Hierarchical hidden Markov model The hierarchical hidden Markov model (HHMM) is a statistical model derived from the hidden Markov model (HMM). In an HHMM, each state is considered to be a self-contained probabilistic model. More precisely, each state of the HHMM is itself an HHMM ...

Bayesian

Bayesian statistics * Bayesian knowledge base * Naive Bayes *

Gaussian Naive Bayes In statistics, naive Bayes classifiers are a family of simple " probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Baye ...

* Multinomial Naive Bayes * Averaged One-Dependence Estimators (AODE) * Bayesian Belief Network (BBN) *

Bayesian Network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Ba ...

(BN)

Decision tree algorithms

Decision tree algorithm * Decision tree * Classification and regression tree (CART) * Iterative Dichotomiser 3 (ID3) * C4.5 algorithm * C5.0 algorithm * Chi-squared Automatic Interaction Detection (CHAID) * Decision stump * Conditional decision tree * ID3 algorithm *

Random forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of ...

* SLIQ

Linear classifier

* Fisher's linear discriminant *

Multinomial logistic regression In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the prob ...

* Naive Bayes classifier * Perceptron *

Unsupervised learning

Unsupervised learning * Expectation-maximization algorithm * Vector Quantization * Generative topographic map *

Information bottleneck method The information bottleneck method is a technique in information theory introduced by Naftali Tishby, Fernando C. Pereira, and William Bialek. It is designed for finding the best tradeoff between accuracy and complexity ( compression) when summarizi ...

* Association rule learning algorithms **

Apriori algorithm AprioriRakesh Agrawal and Ramakrishnan SrikanFast algorithms for mining association rules Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994. is an algorithm for frequent ...

Eclat algorithm Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.Pi ...

Artificial neural networks

* Feedforward neural network **

Extreme learning machine Extreme learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature learning with a single layer or multiple layers of hidden nodes, where the parameters of hidden n ...

Convolutional neural network In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Netwo ...

* Recurrent neural network ** Long short-term memory (LSTM) * Logic learning machine * Self-organizing map

Association rule learning

Association rule learning *

* FP-growth algorithm

Hierarchical clustering

Hierarchical clustering * Single-linkage clustering *

Conceptual clustering Conceptual clustering is a machine learning paradigm for unsupervised classification that has been defined by Ryszard S. Michalski in 1980 (Fisher 1987, Michalski 1980) and developed mainly during the 1980s. It is distinguished from ordinary dat ...

Cluster analysis

Cluster analysis *

BIRCH A birch is a thin-leaved deciduous hardwood tree of the genus ''Betula'' (), in the family Betulaceae, which also includes alders, hazels, and hornbeams. It is closely related to the beech- oak family Fagaceae. The genus ''Betula'' cont ...

* DBSCAN * Expectation-maximization (EM) *

Fuzzy clustering Fuzzy clustering (also referred to as soft clustering or soft ''k''-means) is a form of clustering in which each data point can belong to more than one cluster. Clustering or cluster analysis involves assigning data points to clusters such that i ...

* Hierarchical Clustering * K-means clustering * K-medians * Mean-shift * OPTICS algorithm

Anomaly detection

Anomaly detection * ''k''-nearest neighbors algorithm (''k''-NN) *

Semi-supervised learning

Semi-supervised learning * Active learning – special case of semi-supervised learning in which a learning algorithm is able to interactively query the user (or some other information source) to obtain the desired outputs at new data points. * Generative models * Low-density separation * Graph-based methods *

Co-training Co-training is a machine learning algorithm used when there are only small amounts of labeled data and large amounts of unlabeled data. One of its uses is in text mining for search engines. It was introduced by Avrim Blum and Tom Mitchell in 1998. ...

* Transduction

Deep learning

Deep learning Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised. ...

* Deep belief networks * Deep Boltzmann machines * Deep

s * Deep Recurrent neural networks *

Hierarchical temporal memory Hierarchical temporal memory (HTM) is a biologically constrained machine intelligence technology developed by Numenta. Originally described in the 2004 book ''On Intelligence'' by Jeff Hawkins with Sandra Blakeslee, HTM is primarily used today for ...

* Generative Adversarial Network ** Style transfer *

Transformer A transformer is a passive component that transfers electrical energy from one electrical circuit to another circuit, or multiple circuits. A varying current in any coil of the transformer produces a varying magnetic flux in the transformer' ...

* Stacked Auto-Encoders

Machine learning research

* List of artificial intelligence projects *

List of datasets for machine learning research These datasets are applied for machine learning research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learning a ...

History of machine learning

History of machine learning * Timeline of machine learning

Machine learning projects

Machine learning projects *

DeepMind DeepMind Technologies is a British artificial intelligence subsidiary of Alphabet Inc. and research laboratory founded in 2010. DeepMind was acquired by Google in 2014 and became a wholly owned subsidiary of Alphabet Inc, after Google's restru ...

* Google Brain * OpenAI * Meta AI

Machine learning organizations

Machine learning organizations *

Knowledge Engineering and Machine Learning Group The Knowledge Engineering and Machine Learning group (KEMLg) is a research group belonging to the Technical University of Catalonia (UPC) – BarcelonaTech. It was founded by Prof. Ulises Cortés. The group has been active in the Artificial I ...

Machine learning conferences and workshops

* Artificial Intelligence and Security (AISec) (co-located workshop with CCS) *

Conference on Neural Information Processing Systems The Conference and Workshop on Neural Information Processing Systems (abbreviated as NeurIPS and formerly NIPS) is a machine learning and computational neuroscience conference held every December. The conference is currently a double-track meet ...

(NIPS) * ECML PKDD *

International Conference on Machine Learning The International Conference on Machine Learning (ICML) is the leading international academic conference in machine learning. Along with NeurIPS and ICLR, it is one of the three primary conferences of high impact in machine learning and artific ...

(ICML)
ML4ALL
(Machine Learning For All)

Machine learning publications

Books on machine learning

Books about machine learning

Machine learning journals

* ''

Machine Learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...

'' * ''

Journal of Machine Learning Research The ''Journal of Machine Learning Research'' is a peer-reviewed open access scientific journal covering machine learning. It was established in 2000 and the first editor-in-chief was Leslie Kaelbling. The current editors-in-chief are Francis Bac ...

'' (JMLR) * ''

Neural Computation Neural computation is the information processing performed by networks of neurons. Neural computation is affiliated with the philosophical tradition known as Computational theory of mind, also referred to as computationalism, which advances the t ...

Persons influential in machine learning

* Alberto Broggi * Andrei Knyazev *

Andrew McCallum Andrew McCallum is a professor in the computer science department at University of Massachusetts Amherst. His primary specialties are in machine learning, natural language processing, information extraction, information integration, and socia ...

Andrew Ng Andrew Yan-Tak Ng (; born 1976) is a British-born American computer scientist and technology entrepreneur focusing on machine learning and AI. Ng was a co-founder and head of Google Brain and was the former Chief Scientist at Baidu, buildin ...

* Anuraag Jain * Armin B. Cremers * Ayanna Howard * Barney Pell * Ben Goertzel *

Ben Taskar Ben Taskar (March 3, 1977 – November 18, 2013) was a professor and researcher in the area of machine learning and applications to computational linguistics and computer vision. He was a Magerman Term Associate Professor for Computer and Infor ...

Bernhard Schölkopf Bernhard Schölkopf is a German computer scientist (born 20 February 1968) known for his work in machine learning, especially on kernel methods and causality. He is a director at the Max Planck Institute for Intelligent Systems in Tübingen, ...

* Brian D. Ripley * Christopher G. Atkeson * Corinna Cortes * Demis Hassabis * Douglas Lenat *

Eric Xing Eric Poe Xing is an American computer scientist, academic administrator, and entrepreneur. Prior to his appointment as President of MBZUAI, Xing was a professor in the School of Computer Science at Carnegie Mellon University and researcher in mac ...

* Ernst Dickmanns * Geoffrey Hinton – co-inventor of the backpropagation and contrastive divergence training algorithms * Hans-Peter Kriegel * Hartmut Neven * Heikki Mannila *

Ian Goodfellow Ian J. Goodfellow (born ) is a computer scientist, engineer, and executive, most noted for his work on artificial neural networks and deep learning. He was previously employed as a research scientist at Google Brain and director of machine learn ...

– Father of Generative & adversarial networks * Jacek M. Zurada *

Jaime Carbonell Jaime Guillermo Carbonell (July 29, 1953 – February 28, 2020) was a computer scientist who made seminal contributions to the development of natural language processing tools and technologies. His extensive research in machine translation result ...

* Jeremy Slovak *

Jerome H. Friedman Jerome Harold Friedman (born December 29, 1939) is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.

John D. Lafferty John D. Lafferty is an American scientist, Professor at Yale University and leading researcher in machine learning. He is best known for proposing the Conditional Random Fields with Andrew McCallum and Fernando C.N. Pereira. Biography In 2017, ...

* John Platt – invented SMO and Platt scaling * Julie Beth Lovins *

Jürgen Schmidhuber Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist most noted for his work in the field of artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artific ...

* Karl Steinbuch *

Katia Sycara Ekaterini Panagiotou Sycara ( el, Κάτια Συκαρά) is a Greek computer scientist. She is an Edward Fredkin Research Professor of Robotics in the Robotics Institute, School of Computer Science at Carnegie Mellon University internationally ...

Leo Breiman Leo Breiman (January 27, 1928 – July 5, 2005) was a distinguished statistician at the University of California, Berkeley. He was the recipient of numerous honors and awards, and was a member of the United States National Academy of Sciences ...

– invented bagging and random forests *

Lise Getoor Lise Getoor is a professor in the computer science department, at the University of California, Santa Cruz, and an adjunct professor in the Computer Science Department at the University of Maryland, College Park. Her primary research interests ...

Luca Maria Gambardella Luca Maria Gambardella (born 4 January 1962) is an Italian computer scientist and author. He is the former director of the Dalle Molle Institute for Artificial Intelligence Research in Manno, in the Ticino canton of Switzerland. With Marco Do ...

* Léon Bottou * Marcus Hutter * Mehryar Mohri *

Michael Collins Michael Collins or Mike Collins most commonly refers to: * Michael Collins (Irish leader) (1890–1922), Irish revolutionary leader, soldier, and politician * Michael Collins (astronaut) (1930–2021), American astronaut, member of Apollo 11 and ...

Michael I. Jordan Michael Irwin Jordan (born February 25, 1956) is an American scientist, professor at the University of California, Berkeley and researcher in machine learning, statistics, and artificial intelligence. Jordan was elected a member of the Nat ...

* Michael L. Littman *

Nando de Freitas Nando de Freitas is a researcher in the field of machine learning, and in particular in the subfields of neural networks, Bayesian inference and Bayesian optimization, and deep learning. Biography De Freitas was born in Zimbabwe. He did his un ...

* Ofer Dekel *

Oren Etzioni Oren Etzioni (born 1964) is an American entrepreneur, Professor Emeritus of computer science, and founding CEO of the Allen Institute for Artificial Intelligence (AI2). On June 15, 2022, he announced that he will step down as CEO of AI2 effective ...

Pedro Domingos Pedro Domingos is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for Markov logic network enabling uncertain inference. Education Domingos received an un ...

* Peter Flach * Pierre Baldi * Pushmeet Kohli * Ray Kurzweil * Rayid Ghani * Ross Quinlan * Salvatore J. Stolfo * Sebastian Thrun *

Selmer Bringsjord Selmer Bringsjord (born November 24, 1958) is the chair of the Department of Cognitive Science at Rensselaer Polytechnic Institute and a professor of Computer Science and Cognitive Science. He also holds an appointment in the Lally School of ...

Sepp Hochreiter Josef "Sepp" Hochreiter (born 14 February 1967) is a German computer scientist. Since 2018 he has led the Institute for Machine Learning at the Johannes Kepler University of Linz after having led the Institute of Bioinformatics from 2006 to 2018 ...

* Shane Legg *

Stephen Muggleton Stephen H. Muggleton FBCS, FIET, FAAAI, FECCAI, FSB, FREng (born 6 December 1959, son of Louis Muggleton) is Professor of Machine Learning and Head of the Computational Bioinformatics Laboratory at Imperial College London.Steve Omohundro * Tom M. Mitchell * Trevor Hastie *

Vasant Honavar Vasant G. Honavar is an Indian born American computer scientist, and artificial intelligence, machine learning, big data, data science, causal inference, knowledge representation, bioinformatics and health informatics researcher and professor. ...

Vladimir Vapnik Vladimir Naumovich Vapnik (russian: Владимир Наумович Вапник; born 6 December 1936) is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning, and the co-inventor of the support-vector machin ...

– co-inventor of the SVM and VC theory * Yann LeCun – invented convolutional neural networks * Yasuo Matsuyama *

Yoshua Bengio Yoshua Bengio (born March 5, 1964) is a Canadian computer scientist, most noted for his work on artificial neural networks and deep learning. He is a professor at the Department of Computer Science and Operations Research at the Université de ...

* Zoubin Ghahramani

Determining the number of clusters in a data set Determining the number of clusters in a data set, a quantity often labelled ''k'' as in the ''k''-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem. For a ...

* Detrended correspondence analysis * Developmental robotics *

Diffbot Diffbot is a developer of machine learning and computer vision algorithms and public APIs for extracting data from web pages / web scraping to create a knowledge base. The company has gained interest from its application of computer vision t ...

Differential evolution In evolutionary computation, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as ...

Discrete phase-type distribution The discrete phase-type distribution is a probability distribution that results from a system of one or more inter-related geometric distributions occurring in sequence, or phases. The sequence in which each of the phases occur may itself be a stoch ...

Discriminative model Discriminative models, also referred to as conditional models, are a class of logistical models used for classification or regression. They distinguish decision boundaries through observed data, such as pass/fail, win/lose, alive/dead or healthy/sic ...

Dissociated press Dissociated press is a parody generator (a computer program that generates nonsensical text). The generated text is based on another text using the Markov chain technique. The name is a play on " Associated Press" and the psychological term diss ...

* Distributed R * Dlib * Document classification *

Documenting Hate Documenting Hate is a project of ProPublica, in collaboration with a number of journalistic, academic, and computing organizations, for systematic tracking of hate crimes and bias incidents. It uses an online form to facilitate reporting of inciden ...

Domain adaptation Domain adaptation is a field associated with machine learning and transfer learning. This scenario arises when we aim at learning from a source data distribution a well performing model on a different (but related) target data distribution. Fo ...

* Doubly stochastic model * Dual-phase evolution * Dunn index * Dynamic Bayesian network * Dynamic Markov compression * Dynamic topic model * Dynamic unobserved effects model * EDLUT *

ELKI ELKI (for ''Environment for DeveLoping KDD-Applications Supported by Index-Structures'') is a data mining (KDD, knowledge discovery in databases) software framework developed for use in research and teaching. It was originally at the database s ...

Edge recombination operator The edge recombination operator (ERO) is an operator that creates a path that is similar to a set of existing paths (parents) by looking at the edges rather than the vertices. The main application of this is for crossover in genetic algorithms wh ...

* Effective fitness * Elastic map * Elastic matching * Elbow method (clustering) * Emergent (software) * Encog * Entropy rate * Erkki Oja *

Eurisko Eurisko ( Gr., ''I discover'') is a discovery system written by Douglas Lenat in RLL-1, a representation language itself written in the Lisp programming language. A sequel to Automated Mathematician, it consists of heuristics, i.e. rules of t ...

* European Conference on Artificial Intelligence * Evaluation of binary classifiers * Evolution strategy * Evolution window * Evolutionary Algorithm for Landmark Detection * Evolutionary algorithm * Evolutionary art * Evolutionary music * Evolutionary programming * Evolvability (computer science) * Evolved antenna * Evolver (software) * Evolving classification function * Expectation propagation * Exploratory factor analysis * F1 score * FLAME clustering *

Factor analysis of mixed data In statistics, factor analysis of mixed data or factorial analysis of mixed data (FAMD, in the French original: ''AFDM'' or ''Analyse Factorielle de Données Mixtes''), is the factorial method devoted to data tables in which a group of individuals ...

Factor graph A factor graph is a bipartite graph representing the factorization of a function. In probability theory and its applications, factor graphs are used to represent factorization of a probability distribution function, enabling efficient computatio ...

* Factor regression model * Factored language model * Farthest-first traversal *

Fast-and-frugal trees In the study of decision-making, a fast-and-frugal tree is a simple graphical structure that categorizes objects by asking one question at a time. These decision trees are used in a range of fields: psychology, artificial intelligence, and managemen ...

* Feature Selection Toolbox *

Feature hashing In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix. It works by applyi ...

* Feature scaling * Feature vector *

Firefly algorithm In mathematical optimization, the firefly algorithm is a metaheuristic proposed by Xin-She Yang and inspired by the flashing behavior of firefly, fireflies. Algorithm In pseudocode the algorithm can be stated as: Begin 1) Objective functio ...

First-difference estimator In statistics and econometrics, the first-difference (FD) estimator is an estimator used to address the problem of omitted variables with panel data. It is consistent under the assumptions of the fixed effects model. In certain situations it can b ...

* First-order inductive learner *

Fish School Search Fish School Search (FSS), proposed by Bastos Filho and Lima Neto in 2008 is, in its basic version, an unimodal optimization algorithm inspired on the collective behavior of fish schools. The mechanisms of feeding and coordinated movement were used a ...

* Fisher kernel * Fitness approximation * Fitness function * Fitness proportionate selection * Fluentd * Folding@home * Formal concept analysis *

Forward algorithm The forward algorithm, in the context of a hidden Markov model (HMM), is used to calculate a 'belief state': the probability of a state at a certain time, given the history of evidence. The process is also known as ''filtering''. The forward alg ...

* Fowlkes–Mallows index *

Frederick Jelinek Frederick Jelinek (18 November 1932 – 14 September 2010) was a Czech-American researcher in information theory, automatic speech recognition, and natural language processing. He is well known for his oft-quoted statement, "Every time I fire a ...

* Frrole *

Functional principal component analysis Functional principal component analysis (FPCA) is a statistical method for investigating the dominant modes of variation of functional data. Using this method, a random function is represented in the eigenbasis, which is an orthonormal basis of t ...

* GATTO *

GLIMMER In bioinformatics, GLIMMER (Gene Locator and Interpolated Markov ModelER) is used to find genes in prokaryotic DNA. "It is effective at finding genes in bacteria, archea, viruses, typically finding 98-99% of all relatively long protein coding g ...

* Gary Bryce Fogel *

Gaussian adaptation Gaussian adaptation (GA), also called normal or natural adaptation (NA) is an evolutionary algorithm designed for the maximization of manufacturing yield due to statistical deviation of component values of signal processing systems. In short, GA ...

Gaussian process In probability theory and statistics, a Gaussian process is a stochastic process (a collection of random variables indexed by time or space), such that every finite collection of those random variables has a multivariate normal distribution, i.e. ...

* Gaussian process emulator *

Gene prediction In computational biology, gene prediction or gene finding refers to the process of identifying the regions of genomic DNA that encode genes. This includes protein-coding genes as well as RNA genes, but may also include prediction of other functio ...

* General Architecture for Text Engineering *

Generalization error For supervised learning applications in machine learning and statistical learning theory, generalization error (also known as the out-of-sample error or the risk) is a measure of how accurately an algorithm is able to predict outcome values for p ...

Generalized canonical correlation In statistics, the generalized canonical correlation analysis (gCCA), is a way of making sense of cross-correlation matrices between the sets of random variables when there are more than two sets. While a conventional CCA generalizes principal com ...

* Generalized filtering * Generalized iterative scaling * Generalized multidimensional scaling * Generative adversarial network * Generative model * Genetic algorithm *

Genetic algorithm scheduling The genetic algorithm is an operational research method that may be used to solve scheduling problems in production planning. Importance of production scheduling To be competitive, corporations must minimize inefficiencies and maximize productivit ...

Genetic algorithms in economics Genetic algorithms have increasingly been applied to economics since the pioneering work by John H. Miller in 1986. It has been used to characterize a variety of models including the cobweb model, the overlapping generations model, game theory, s ...

* Genetic fuzzy systems * Genetic memory (computer science) * Genetic operator * Genetic programming * Genetic representation *

Geographical cluster A geographical cluster is a localized anomaly, usually an excess of something given the distribution or variation of something else. Often it is considered as an incidence rate In epidemiology, incidence is a measure of the probability of occ ...

* Gesture Description Language * Geworkbench *

Glossary of artificial intelligence This glossary of artificial intelligence is a list of definitions of terms and concepts relevant to the study of artificial intelligence, its sub-disciplines, and related fields. Related glossaries include Glossary of computer science, Glossary o ...

Glottochronology Glottochronology (from Attic Greek γλῶττα ''tongue, language'' and χρόνος ''time'') is the part of lexicostatistics which involves comparative linguistics and deals with the chronological relationship between languages.Sheila Embleton ...

* Golem (ILP) * Google matrix * Grafting (decision trees) * Gramian matrix * Grammatical evolution * Granular computing *

GraphLab Turi is a graph-based, high performance, distributed computation framework written in C++. The GraphLab project was started by Prof. Carlos Guestrin of Carnegie Mellon University in 2009. It is an open source project using an Apache License. Wh ...

* Graph kernel * Gremlin (programming language) * Growth function * HUMANT (HUManoid ANT) algorithm *

Hammersley–Clifford theorem The Hammersley–Clifford theorem is a result in probability theory, mathematical statistics and statistical mechanics that gives necessary and sufficient conditions under which a strictly positive probability distribution (of events in a probabili ...

Harmony search This is a chronologically ordered list of metaphor-based metaheuristics and swarm intelligence algorithms, sorted by decade of proposal. Algorithms 1980s-1990s Simulated annealing (Kirkpatrick et al., 1983) Simulated annealing is a pro ...

* Hebbian theory * Hidden Markov random field *

Hidden semi-Markov model A hidden semi-Markov model (HSMM) is a statistical model with the same structure as a hidden Markov model except that the unobservable process is semi-Markov rather than Markov. This means that the probability of there being a change in the hidden ...

* Higher-order factor analysis * Highway network * Hinge loss *

Holland's schema theorem Holland's schema theorem, also called the fundamental theorem of genetic algorithms, is an inequality that results from coarse-graining an equation for evolutionary dynamics. The Schema Theorem says that short, low-order schemata with above-averag ...

* Hopkins statistic *

Hoshen–Kopelman algorithm The Hoshen–Kopelman algorithm is a simple and efficient algorithm for labeling clusters on a grid, where the grid is a regular network of cells, with the cells being either occupied or unoccupied. This algorithm is based on a well-known union- ...

* Huber loss * IRCF360 *

* Ilastik * Ilya Sutskever * Immunocomputing * Imperialist competitive algorithm * Inauthentic text * Incremental decision tree *

Induction of regular languages In computational learning theory, induction of regular languages refers to the task of learning a formal description (e.g. grammar) of a regular language from a given set of example strings. Although E. Mark Gold has shown that not every regular la ...

* Inductive bias *

Inductive probability Inductive probability attempts to give the probability of future events based on past events. It is the basis for inductive reasoning, and gives the mathematical basis for learning and the perception of patterns. It is a source of knowledge about ...

Inductive programming Inductive programming (IP) is a special area of automatic programming, covering research from artificial intelligence and programming, which addresses learning of typically declarative ( logic or functional) and often recursive programs from in ...

Influence diagram Influence or influencer may refer to: * Social influence, in social psychology, influence in interpersonal relationships **Minority influence, when the minority affect the behavior or beliefs of the majority * Influencer marketing, through indivi ...

* Information Harvesting * Information fuzzy networks * Information gain in decision trees * Information gain ratio * Inheritance (genetic algorithm) * Instance selection *

Intel RealSense Intel RealSense Technology is a product range of depth and tracking technologies designed to give machines and devices depth perception capabilities. The technologies, owned by Intel are used in autonomous drones, robots, AR/VR, smart home device ...

* Interacting particle system *

Interactive machine translation Interactive machine translation (IMT), is a specific sub-field of computer-aided translation. Under this translation paradigm, the computer software that assists the human translator attempts to predict the text the user is going to input by taking ...

International Joint Conference on Artificial Intelligence The International Joint Conference on Artificial Intelligence (IJCAI) is the leading conference in the field of Artificial Intelligence. The conference series has been organized by the nonprofit IJCAI Organization since 1969, making it the oldest pr ...

* International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics *

International Semantic Web Conference The International Semantic Web Conference (ISWC) is a series of academic conferences and the premier international forum for the Semantic Web, Linked Data and Knowledge Graph Community. Here, scientists, industry specialists, and practitioners ...

* Iris flower data set * Island algorithm * Isotropic position * Item response theory *

Iterative Viterbi decoding Iterative Viterbi decoding is an algorithm that spots the subsequence ''S'' of an observation ''O'' = having the highest average probability (i.e., probability scaled by the length of ''S'') of being generated by a given hidden Markov model ''M'' w ...

* JOONE * Jabberwacky * Jaccard index * Jackknife variance estimates for random forest * Java Grammatical Evolution * Joseph Nechvatal * Jubatus *

Julia (programming language) Julia is a high-level, dynamic programming language. Its features are well suited for numerical analysis and computational science. Distinctive aspects of Julia's design include a type system with parametric polymorphism in a dynamic progra ...

* Junction tree algorithm * K-SVD * K-means++ * K-medians clustering * K-medoids *

KNIME KNIME (), the Konstanz Information Miner, is a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining "Building Blocks ...

* KXEN Inc. * K q-flats * Kaggle * Kalman filter * Katz's back-off model * Kernel adaptive filter * Kernel density estimation * Kernel eigenvoice * Kernel embedding of distributions * Kernel method * Kernel perceptron * Kernel random forest * Kinect * Klaus-Robert Müller * Kneser–Ney smoothing * Knowledge Vault * Knowledge integration * LIBSVM * LPBoost * Labeled data * LanguageWare * Language identification in the limit * Language model * Large margin nearest neighbor * Latent Dirichlet allocation * Latent class model * Latent semantic analysis * Latent variable * Latent variable model * Lattice Miner * Layered hidden Markov model * Learnable function class * Least squares support vector machine * Leave-one-out error * Leslie P. Kaelbling * Linear genetic programming * Linear predictor function * Linear separability * Lingyun Gu * Linkurious * Lior Ron (business executive) * List of genetic algorithm applications * List of metaphor-based metaheuristics * List of text mining software * Local case-control sampling * Local independence * Local tangent space alignment * Locality-sensitive hashing * Log-linear model * Logistic model tree * Low-rank approximation * Low-rank matrix approximations * MATLAB * MIMIC (immunology) * MXNet * Mallet (software project) * Manifold regularization * Margin-infused relaxed algorithm * Margin classifier * Mark V. Shaney * Massive Online Analysis * Matrix regularization * Matthews correlation coefficient * Mean shift * Mean squared error * Mean squared prediction error * Measurement invariance * Medoid * MeeMix * Melomics * Memetic algorithm * Meta-optimization * Mexican International Conference on Artificial Intelligence * Michael Kearns (computer scientist) * MinHash * Mixture model * Mlpy * Models of DNA evolution * Moral graph * Mountain car problem * Movidius * Multi-armed bandit *

* Multi expression programming * Multiclass classification * Multidimensional analysis * Multifactor dimensionality reduction * Multilinear principal component analysis * Multiple correspondence analysis * Multiple discriminant analysis * Multiple factor analysis * Multiple sequence alignment * Multiplicative weight update method * Multispectral pattern recognition * Mutation (genetic algorithm) * MysteryVibe * N-gram * NOMINATE (scaling method) * Native-language identification * Natural Language Toolkit * Natural evolution strategy * Nearest-neighbor chain algorithm * Nearest centroid classifier * Nearest neighbor search * Neighbor joining * Nest Labs * NetMiner * NetOwl * Neural Designer * Neural Engineering Object * Neural Lab * Neural modeling fields * Neural network software * NeuroSolutions * Neuro Laboratory * Neuroevolution * Neuroph * Niki.ai * Noisy channel model * Noisy text analytics * Nonlinear dimensionality reduction * Novelty detection * Nuisance variable * One-class classification * Onnx * OpenNLP * Optimal discriminant analysis * Oracle Data Mining * Orange (software) * Ordination (statistics) * Overfitting * PROGOL * PSIPRED * Pachinko allocation * PageRank * Parallel metaheuristic * Parity benchmark * Part-of-speech tagging * Particle swarm optimization * Path dependence * Pattern language (formal languages) * Peltarion Synapse * Perplexity * Persian Speech Corpus * Picas (app) * Pietro Perona * Pipeline Pilot * Piranha (software) * Pitman–Yor process * Plate notation * Polynomial kernel * Pop music automation * Population process * Portable Format for Analytics * Predictive Model Markup Language * Predictive state representation * Preference regression * Premature convergence * Principal geodesic analysis * Prior knowledge for pattern recognition * Prisma (app) * Probabilistic Action Cores * Probabilistic context-free grammar * Probabilistic latent semantic analysis * Probabilistic soft logic * Probability matching * Probit model * Product of experts * Programming with Big Data in R * Proper generalized decomposition * Pruning (decision trees) * Pushpak Bhattacharyya * Q methodology * Qloo * Quality control and genetic algorithms * Quantum Artificial Intelligence Lab * Queueing theory * Quick, Draw! * R (programming language) * Rada Mihalcea * Rademacher complexity * Radial basis function kernel * Rand index * Random indexing * Random projection * Random subspace method * Ranking SVM * RapidMiner * Rattle GUI * Raymond Cattell * Reasoning system * Regularization perspectives on support vector machines * Relational data mining * Relationship square * Relevance vector machine * Relief (feature selection) * Renjin * Repertory grid * Representer theorem * Reward-based selection * Richard Zemel * Right to explanation * RoboEarth * Robust principal component analysis * RuleML Symposium * Rule induction * Rules extraction system family * SAS (software) * SNNS * SPSS Modeler * SUBCLU * Sample complexity * Sample exclusion dimension * Santa Fe Trail problem * Savi Technology * Schema (genetic algorithms) * Search-based software engineering * Selection (genetic algorithm) * Self-Service Semantic Suite * Semantic folding * Semantic mapping (statistics) * Semidefinite embedding * Sense Networks * Sensorium Project * Sequence labeling * Sequential minimal optimization * Shattered set * Shogun (toolbox) * Silhouette (clustering) * SimHash * SimRank * Similarity measure * Simple matching coefficient * Simultaneous localization and mapping * Sinkov statistic * Sliced inverse regression * Snakes and Ladders * Soft independent modelling of class analogies * Soft output Viterbi algorithm * Solomonoff's theory of inductive inference * SolveIT Software * Spectral clustering * Spike-and-slab variable selection * Statistical machine translation * Statistical parsing * Statistical semantics * Stefano Soatto * Stephen Wolfram * Stochastic block model * Stochastic cellular automaton * Stochastic diffusion search * Stochastic grammar * Stochastic matrix * Stochastic universal sampling * Stress majorization * String kernel * Structural equation modeling * Structural risk minimization * Structured sparsity regularization * Structured support vector machine * Subclass reachability * Sufficient dimension reduction * Sukhotin's algorithm * Sum of absolute differences * Sum of absolute transformed differences * Swarm intelligence * Switching Kalman filter * Symbolic regression * Synchronous context-free grammar * Syntactic pattern recognition * TD-Gammon * TIMIT * Teaching dimension * Teuvo Kohonen * Textual case-based reasoning * Theory of conjoint measurement * Thomas G. Dietterich * Thurstonian model * Topic model * Tournament selection * Training, test, and validation sets * Transiogram * Trax Image Recognition * Trigram tagger * Truncation selection * Tucker decomposition * UIMA * UPGMA * Ugly duckling theorem * Uncertain data * Uniform convergence in probability * Unique negative dimension * Universal portfolio algorithm * User behavior analytics * VC dimension * VIGRA * Validation set * Vapnik–Chervonenkis theory * Variable-order Bayesian network * Variable kernel density estimation * Variable rules analysis * Variational message passing * Varimax rotation * Vector quantization * Vicarious (company) * Viterbi algorithm * Vowpal Wabbit * WACA clustering algorithm * WPGMA * Ward's method * Weasel program * Whitening transformation * Winnow (algorithm) * Win–stay, lose–switch * Witness set * Wolfram Language * Wolfram Mathematica * Writer invariant * Xgboost * Yooreeka * Zeroth (software)

References

External links

Data Science: Data to Insights from MIT (machine learning)
* Popular online course by

, a
Coursera
It uses GNU Octave. The course is a free version of Stanford University's actual course taught by Ng, see.stanford.edu/Course/CS229 available for free].
mloss
is an academic database of open-source machine learning software. {{Outline footer Outlines of applied sciences, Machine learning Wikipedia outlines, Machine learning Computing-related lists Machine learning, * Data mining, Machine learning