The following outline is provided as an overview of, and topical guide to, machine learning:

Machine learning (ML) is a subfield of artificial intelligence within computer science that evolved from the study of pattern recognition and computational learning theory (http://www.britannica.com/EBchecked/topic/1116194/machine-learning). In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". ML involves the study and construction of algorithms that can learn from and make predictions on data. These algorithms operate by building a model from a training set of example observations to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.
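
The idea of building a model from a training set and then using it on new observations can be sketched as follows. This is only an illustrative sketch: the nearest-centroid rule, the function names `fit_centroids` and `predict`, and the toy measurements are assumptions chosen for brevity, not something the outline prescribes.

```python
from collections import defaultdict
from math import dist


def fit_centroids(training_set):
    """Build a model (one mean point per label) from (features, label) pairs."""
    sums = {}
    counts = defaultdict(int)
    for features, label in training_set:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: tuple(s / counts[label] for s in sums[label]) for label in sums}


def predict(model, features):
    """Return the label whose centroid is closest to the new observation."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical training set: (height_cm, weight_kg) -> label
training_set = [
    ((20.0, 4.0), "cat"),
    ((25.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((70.0, 35.0), "dog"),
]
model = fit_centroids(training_set)       # learned from data, not hand-written rules
prediction = predict(model, (22.0, 4.5))  # data-driven decision for an unseen example
```

The behaviour of `predict` is determined entirely by the examples passed to `fit_centroids`; changing the training set changes the decisions without touching the program text.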

How can machine learning be categorized?

* An academic discipline
* A branch of science
** An applied science
*** A subfield of computer science
**** A branch of artificial intelligence
**** A subfield of soft computing
**** Application of statistics

Paradigms of machine learning

* Supervised learning, where the model is trained on labeled data
* Unsupervised learning, where the model tries to identify patterns in unlabeled data
* Reinforcement learning, where the model learns to make decisions by receiving rewards or penalties
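
To make the unsupervised case concrete, the sketch below groups unlabeled numbers into two clusters. It uses k-means (Lloyd's algorithm) purely as an example technique; the function name `k_means_1d`, the starting centers, and the data are illustrative assumptions.

```python
def k_means_1d(points, centers, iterations=10):
    """Lloyd's algorithm on 1-D data: assign each point to its nearest center,
    then move every center to the mean of the points assigned to it."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers


# No labels are supplied: the structure is found from the inputs alone.
unlabeled = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers = k_means_1d(unlabeled, centers=[0.0, 5.0])
```

A supervised method would instead receive each number together with its desired output, and a reinforcement learner would receive only a reward signal after acting.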

Applications of machine learning

* Applications of machine learning
* Bioinformatics
* Biomedical informatics
* Computer vision
* Customer relationship management
* Data mining
* Earth sciences
* Email filtering
* Inverted pendulum (balance and equilibrium system)
* Natural language processing
** Named Entity Recognition
** Automatic summarization
** Automatic taxonomy construction
** Dialog system
** Grammar checker
** Language recognition
*** Handwriting recognition
*** Optical character recognition
*** Speech recognition
**** Text to Speech Synthesis
**** Speech Emotion Recognition
** Machine translation
** Question answering
** Speech synthesis
** Text mining
*** Term frequency–inverse document frequency
** Text simplification
* Pattern recognition
** Facial recognition system
** Image recognition
* Recommendation system
** Collaborative filtering
** Content-based filtering
** Hybrid recommender systems
* Search engine
** Search engine optimization
* Social engineering

Machine learning hardware

* Graphics processing unit
* Tensor processing unit
* Vision processing unit

Machine learning tools

* Comparison of deep learning software

Machine learning frameworks

Proprietary machine learning frameworks

* Amazon Machine Learning
* Microsoft Azure Machine Learning Studio
* DistBelief (replaced by TensorFlow)

Open source machine learning frameworks

* Apache Singa
* Apache MXNet
* Caffe
* PyTorch
* mlpack
* Torch
* CNTK
* Accord.Net
* Jax
* MLJ.jl: a machine learning framework for Julia

Machine learning libraries

* Deeplearning4j
* Theano
* scikit-learn
* Keras

Machine learning algorithms

* Almeida–Pineda recurrent backpropagation
* ALOPEX
* Backpropagation
* Bootstrap aggregating
* CN2 algorithm
* Constructing skill trees
* Dehaene–Changeux model
* Diffusion map
* Dominance-based rough set approach
* Dynamic time warping
* Error-driven learning
* Evolutionary multimodal optimization
* Expectation–maximization algorithm
* FastICA
* Forward–backward algorithm
* GeneRec
* Genetic Algorithm for Rule Set Production
* Growing self-organizing map
* Hyper basis function network
* IDistance
* k-nearest neighbors algorithm
* Kernel methods for vector output
* Kernel principal component analysis
* Leabra
* Linde–Buzo–Gray algorithm
* Local outlier factor
* Logic learning machine
* LogitBoost
* Manifold alignment
* Markov chain Monte Carlo (MCMC)
* Minimum redundancy feature selection
* Mixture of experts
* Multiple kernel learning
* Non-negative matrix factorization
* Online machine learning
* Out-of-bag error
* Prefrontal cortex basal ganglia working memory
* PVLV
* Q-learning
* Quadratic unconstrained binary optimization
* Query-level feature
* Quickprop
* Radial basis function network
* Randomized weighted majority algorithm
* Repeated incremental pruning to produce error reduction (RIPPER)
* Rprop
* Rule-based machine learning
* Skill chaining
* Sparse PCA
* State–action–reward–state–action
* Stochastic gradient descent
* Structured kNN
* T-distributed stochastic neighbor embedding
* Temporal difference learning
* Wake-sleep algorithm
* Weighted majority algorithm (machine learning)
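
As one worked example from the list above, dynamic time warping measures the similarity of two sequences that may vary in speed. The sketch below is a minimal dynamic-programming version; the absolute-difference cost, the function name `dtw_distance`, and the sample sequences are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two numeric sequences,
    using the absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: repeat b's element, repeat a's element, advance both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# The same shape traced at two different speeds.
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
fast = [0, 1, 2, 1, 0]
same_shape_cost = dtw_distance(slow, fast)
```

Because elements may be matched one-to-many, the slow and fast traces align with zero cost even though their lengths differ.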

Machine learning methods

Instance-based algorithm

* K-nearest neighbors algorithm (KNN)
* Learning vector quantization (LVQ)
* Self-organizing map (SOM)
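
A k-nearest neighbors classifier keeps the training instances and labels a new point by majority vote among the closest ones. The following is a minimal sketch assuming Euclidean distance and a small hypothetical data set; `knn_predict` is an illustrative name, not a library function.

```python
from collections import Counter
from math import dist


def knn_predict(training_set, query, k=3):
    """Label `query` by majority vote among its k nearest training instances."""
    neighbors = sorted(training_set, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled instances: (features, label)
training_set = [
    ((1.0, 1.0), "a"), ((1.0, 2.0), "a"), ((2.0, 1.0), "a"),
    ((8.0, 8.0), "b"), ((8.0, 9.0), "b"), ((9.0, 8.0), "b"),
]
label_near_a = knn_predict(training_set, (1.5, 1.5))
label_near_b = knn_predict(training_set, (8.5, 8.5))
```

No explicit model is fitted here; all the work happens at prediction time, which is what makes the method instance-based.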

Regression analysis

* Logistic regression
* Ordinary least squares regression (OLSR)
* Linear regression
* Stepwise regression
* Multivariate adaptive regression splines (MARS)
* Regularization algorithm
** Ridge regression
** Least Absolute Shrinkage and Selection Operator (LASSO)
** Elastic net
** Least-angle regression (LARS)
* Classifiers
** Probabilistic classifier
*** Naive Bayes classifier
** Binary classifier
** Linear classifier
** Hierarchical classifier
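
For a single explanatory variable, ordinary least squares and ridge regression both have a short closed form, which the sketch below illustrates. The function name `fit_line`, the `penalty` parameter (the ridge strength, applied to the slope only), and the data are assumptions for illustration.

```python
def fit_line(xs, ys, penalty=0.0):
    """Fit y = intercept + slope * x by least squares.

    penalty=0.0 gives ordinary least squares; penalty>0 gives ridge
    regression with an unpenalized intercept.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + penalty)  # the penalty shrinks the slope toward zero
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
ols_slope, ols_intercept = fit_line(xs, ys)
ridge_slope, ridge_intercept = fit_line(xs, ys, penalty=5.0)
```

With no penalty the fit recovers the generating line; with a positive penalty the slope is smaller in magnitude, which is the regularizing effect.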

Dimensionality reduction

* Dimensionality reduction
* Canonical correlation analysis (CCA)
* Factor analysis
* Feature extraction
* Feature selection
* Independent component analysis (ICA)
* Linear discriminant analysis (LDA)
* Multidimensional scaling (MDS)
* Non-negative matrix factorization (NMF)
* Partial least squares regression (PLSR)
* Principal component analysis (PCA)
* Principal component regression (PCR)
* Projection pursuit
* Sammon mapping
* t-distributed stochastic neighbor embedding (t-SNE)
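
Principal component analysis finds the direction along which the data vary most. A minimal two-dimensional sketch follows, assuming power iteration on the sample covariance matrix; the function name, the starting vector, and the data are illustrative choices, and the code assumes the points are not all identical.

```python
from math import sqrt


def first_principal_component(points, iterations=100):
    """Return a unit vector along the direction of greatest variance of 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix.
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    vx, vy = 1.0, 0.5  # arbitrary starting vector (an assumption of this sketch)
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalize.
        wx = cxx * vx + cxy * vy
        wy = cxy * vx + cyy * vy
        norm = sqrt(wx * wx + wy * wy)
        vx, vy = wx / norm, wy / norm
    return vx, vy


points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
component = first_principal_component(points)
```

Projecting each centered point onto `component` would give its one-dimensional representation; here the points lie on the line y = x, so the component points along the diagonal.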

Ensemble learning

* Ensemble learning
* AdaBoost
* Boosting
* Bootstrap aggregating (also "bagging" or "bootstrapping")
* Ensemble averaging
* Gradient boosted decision tree (GBDT)
* Gradient boosting
* Random Forest
* Stacked Generalization
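
Bootstrap aggregating trains each base learner on a resample of the training set drawn with replacement and combines the learners by voting. The sketch below assumes a one-dimensional threshold "stump" as the base learner; all names, the seed, and the data are hypothetical.

```python
import random


def fit_stump(samples):
    """Pick the threshold t with the fewest training errors for the rule
    'predict 1 if x > t, else 0'."""
    xs = sorted({x for x, _ in samples})
    candidates = [xs[0] - 1.0, xs[-1]] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def errors(t):
        return sum((1 if x > t else 0) != label for x, label in samples)

    return min(candidates, key=errors)


def bagging_fit(samples, n_learners=25, seed=0):
    """Train one stump per bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_learners):
        bootstrap = [rng.choice(samples) for _ in samples]
        stumps.append(fit_stump(bootstrap))
    return stumps


def bagging_predict(stumps, x):
    """Majority vote over the individual stump predictions."""
    votes = sum(1 if x > t else 0 for t in stumps)
    return 1 if 2 * votes > len(stumps) else 0


samples = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 0),
           (7.0, 1), (8.0, 1), (9.0, 1), (10.0, 1)]
stumps = bagging_fit(samples)
low_prediction = bagging_predict(stumps, 0.0)
high_prediction = bagging_predict(stumps, 10.0)
```

Each stump sees a slightly different data set, so the thresholds differ; the vote averages out that variation.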

Meta-learning

* Meta-learning
* Inductive bias
* Metadata

Reinforcement learning

* State–action–reward–state–action (SARSA) * Temporal difference learning (TD) * Learning Automata

Supervised learning

* Averaged one-dependence estimators (AODE) *

Artificial neural network In machine learning, a neural network (also artificial neural network or neural net, abbreviated ANN or NN) is a computational model inspired by the structure and functions of biological neural networks. A neural network consists of connected ...

Case-based reasoning Case-based reasoning (CBR), broadly construed, is the process of solving new problems based on the solutions of similar past problems. In everyday life, an auto mechanic who fixes an engine by recalling another car that exhibited similar sympto ...

Gaussian process regression In statistics, originally in geostatistics, kriging or Kriging, also known as Gaussian process regression, is a method of interpolation based on a Gaussian process governed by prior covariances. Under suitable assumptions of the prior, kriging g ...

Gene expression programming Gene expression programming (GEP) in computer programming is an evolutionary algorithm that creates computer programs or models. These computer programs are complex tree structures that learn and adapt by changing their sizes, shapes, and compos ...

Group method of data handling (GMDH) *

Inductive logic programming Inductive logic programming (ILP) is a subfield of symbolic artificial intelligence which uses logic programming as a uniform representation for examples, background knowledge and hypotheses. The term "''inductive''" here refers to philosophical ...

Instance-based learning In machine learning, instance-based learning (sometimes called memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have b ...

* Lazy learning * Learning Automata * Learning Vector Quantization * Logistic Model Tree * Minimum message length (decision trees, decision graphs, etc.) ** Nearest Neighbor Algorithm **

Analogical modeling Analogical modeling (AM) is a formal theory of exemplar-based analogical reasoning, proposed by Royal Skousen, professor of Linguistics and English language at Brigham Young University in Provo, Utah. It is applicable to language ...

Probably approximately correct learning In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. It was proposed in 1984 by Leslie Valiant (L. Valiant, "A theory of the learnable", Communications of the ...

(PAC) learning *

Ripple down rules Ripple-down rules (RDR) are a way of approaching knowledge acquisition. Knowledge acquisition refers to the transfer of knowledge from human experts to knowledge-based systems. Introductory material Ripple-down rules are an incremental approach ...

, a knowledge acquisition methodology * Symbolic machine learning algorithms *

Support vector machine In machine learning, support vector machines (SVMs, also support vector networks) are supervised max-margin models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laborato ...

s * Random Forests * Ensembles of classifiers ** Bootstrap aggregating (bagging) **

Boosting (meta-algorithm) In machine learning (ML), boosting is an ensemble metaheuristic for primarily reducing bias (as opposed to variance). It can also improve the stability and accuracy of ML classification and regression algorithms. Hence, it is prevalent in supe ...

* Ordinal classification *

Conditional Random Field Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without consi ...

ANOVA Analysis of variance (ANOVA) is a family of statistical methods used to compare the means of two or more groups by analyzing variance. Specifically, ANOVA compares the amount of variation ''between'' the group means to the amount of variation ''w ...

Quadratic classifier In statistics, a quadratic classifier is a statistical classifier that uses a quadratic decision surface to separate measurements of two or more classes of objects or events. It is a more general version of the linear classifier. The classific ...

s *

k-nearest neighbor In statistics, the ''k''-nearest neighbors algorithm (''k''-NN) is a non-parametric supervised learning method. It was first developed by Evelyn Fix and Joseph Hodges in 1951, and later expanded by Thomas Cover. Most often, it is used for cl ...

* Boosting ** SPRINT *

Bayesian network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Whi ...

s **

Naive Bayes In statistics, naive (sometimes simple or idiot's) Bayes classifiers are a family of "probabilistic classifiers" which assume that the features are conditionally independent, given the target class. In other words, a naive Bayes model assumes th ...

Hidden Markov model A hidden Markov model (HMM) is a Markov model in which the observations are dependent on a latent (or ''hidden'') Markov process (referred to as X). An HMM requires that there be an observable process Y whose outcomes depend on the outcomes of X ...

s ** Hierarchical hidden Markov model

Bayesian

Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability, where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about ...

* Bayesian knowledge base *

* Gaussian Naive Bayes * Multinomial Naive Bayes * Averaged One-Dependence Estimators (AODE) * Bayesian Belief Network (BBN) *

Bayesian Network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Whi ...

(BN)

Decision tree algorithms

Decision tree algorithm *

Decision tree A decision tree is a decision support recursive partitioning structure that uses a tree-like model of decisions and their possible consequences, including chance event ou ...

* Classification and regression tree (CART) * Iterative Dichotomiser 3 (ID3) * C4.5 algorithm * C5.0 algorithm * Chi-squared Automatic Interaction Detection (CHAID) * Decision stump * Conditional decision tree *

ID3 algorithm In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan (Quinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81–106) used to generate a decision tree from a dataset. ID3 is th ...

Random forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that works by creating a multitude of decision trees ...

* SLIQ

Linear classifier

* Fisher's linear discriminant *

Multinomial logistic regression In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the prob ...

Perceptron In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function that can decide whether or not an input, represented by a vect ...
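
The perceptron entry above describes a binary classifier that decides the class of an input vector. A minimal sketch of the classic perceptron learning rule follows, assuming a linearly separable toy problem (logical AND), labels in {0, 1}, and a learning rate of 1; the names `train_perceptron` and `predict` are hypothetical.

```python
def predict(weights, bias, features):
    """Return 1 if the weighted sum plus bias is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Train on (features, label) pairs with labels in {0, 1}.

    On each mistake the weights move toward the input (label 1 missed)
    or away from it (label 0 missed). Training stops early once a full
    pass over the data produces no mistakes.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in samples:
            error = label - predict(weights, bias, features)
            if error != 0:
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


# Hypothetical toy data: the logical AND of two binary inputs.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
predictions = [predict(weights, bias, x) for x, _ in and_samples]
```

The rule is only guaranteed to converge when the classes are linearly separable; on data such as XOR it would keep making mistakes until the epoch limit is reached.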

Unsupervised learning

* Expectation-maximization algorithm *

Vector Quantization Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. Developed in the early 1980s by Robert M. Gray, it was ori ...

* Generative topographic map *

Information bottleneck method The information bottleneck method is a technique in information theory introduced by Naftali Tishby, Fernando C. Pereira, and William Bialek. It is designed for finding the best tradeoff between accuracy and complexity (compressio ...

Association rule learning Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness. ...

algorithms **

Apriori algorithm Apriori (Rakesh Agrawal and Ramakrishnan Srikant, "Fast algorithms for mining association rules", Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994) is an algorithm for frequent ...

** Eclat algorithm

Artificial neural networks

Feedforward neural network Feedforward refers to recognition-inference architecture of neural networks. Artificial neural network architectures are based on inputs multiplied by weights to obtain outputs (inputs-to-output): feedforward. Recurrent neural networks, or neur ...

** Extreme learning machine **

Convolutional neural network A convolutional neural network (CNN) is a type of feedforward neural network that learns features via filter (or kernel) optimization. This type of deep learning network has been applied to process and make predictions from many different ty ...

Recurrent neural network Recurrent neural networks (RNNs) are a class of artificial neural networks designed for processing sequential data, such as text, speech, and time series, where the order of elements is important. Unlike feedforward neural networks, which proces ...

** Long short-term memory (LSTM) * Logic learning machine *

Association rule learning

* Eclat algorithm * FP-growth algorithm

Hierarchical clustering

Hierarchical clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two ...

Single-linkage clustering In statistics, single-linkage clustering is one of several methods of hierarchical clustering. It is based on grouping clusters in bottom-up fashion (agglomerative clustering), at each step combining two clusters that contain the closest pair of el ...

Conceptual clustering Conceptual clustering is a machine learning paradigm for unsupervised classification that has been defined by Ryszard S. Michalski in 1980 (Fisher 1987, Michalski 1980) and developed mainly during the 1980s. It is distinguished from ordinary cluste ...

Cluster analysis

Cluster analysis Cluster analysis or clustering is a data analysis technique that groups a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some specific sense defined by the ...

BIRCH (balanced iterative reducing and clustering using hierarchies)

* DBSCAN * Expectation–maximization (EM) * Fuzzy clustering *

* ''k''-means clustering * ''k''-medians * Mean-shift * OPTICS algorithm
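
To make the grouping idea behind the clustering entries above concrete, here is a minimal sketch of ''k''-means using Lloyd's iteration. It assumes two-dimensional points, squared Euclidean distance, and a deliberately naive initialization (the first ''k'' points serve as starting centroids); the function name `kmeans` and the data are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Alternate assignment and centroid-update steps until stable.

    Assumption for this sketch: the first k points are the initial
    centroids. Returns the centroids and the clusters from the final
    assignment step.
    """
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(col) / len(cluster) for col in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0),
          (2.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, k=2)
```

Results of ''k''-means depend on the initialization, which is why practical implementations typically use randomized restarts or a seeding scheme rather than the fixed choice assumed here.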

Anomaly detection

Anomaly detection In data analysis, anomaly detection (also referred to as outlier detection and sometimes as novelty detection) is generally understood to be the identification of rare items, events or observations which deviate significantly from the majority of ...

* ''k''-nearest neighbors algorithm (''k''-NN) * Local outlier factor

Semi-supervised learning

Semi-supervised learning Weak supervision (also known as semi-supervised learning) is a paradigm in machine learning, the relevance and notability of which increased with the advent of large language models due to large amount of data required to train them. It is charact ...

Active learning (machine learning)

* Generative models * Low-density separation * Graph-based methods *

Co-training Co-training is a machine learning algorithm used when there are only small amounts of labeled data and large amounts of unlabeled data. One of its uses is in text mining for search engines. It was introduced by Avrim Blum and Tom Mitchell in 1998 ...

* Transduction

Deep learning

Deep learning Deep learning is a subset of machine learning that focuses on utilizing multilayered neural networks to perform tasks such as classification, regression, and representation learning. The field takes inspiration from biological neuroscience a ...

* Deep belief networks * Deep

Boltzmann machine A Boltzmann machine (also called Sherrington–Kirkpatrick model with external field or stochastic Ising model), named after Ludwig Boltzmann, is a spin-glass model with an external field, i.e., a Sherrington–Kirkpatrick m ...

s * Deep

s *

Hierarchical temporal memory Hierarchical temporal memory (HTM) is a biologically constrained machine intelligence technology developed by Numenta. Originally described in the 2004 book '' On Intelligence'' by Jeff Hawkins with Sandra Blakeslee, HTM is primarily used toda ...

Generative Adversarial Network A generative adversarial network (GAN) is a class of machine learning frameworks and a prominent framework for approaching generative artificial intelligence. The concept was initially developed by Ian Goodfellow and his colleagues in June ...

** Style transfer *

Transformer (deep learning architecture)

* Stacked Auto-Encoders

Machine learning research

* List of artificial intelligence projects *

List of datasets for machine learning research These datasets are used in machine learning (ML) research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learni ...

History of machine learning

History of machine learning * Timeline of machine learning

Machine learning projects

Machine learning projects: *

DeepMind DeepMind Technologies Limited, trading as Google DeepMind or simply DeepMind, is a British–American artificial intelligence research laboratory which serves as a subsidiary of Alphabet Inc. Founded in the UK in 2010, it was acquired by Go ...

Google Brain Google Brain was a deep learning artificial intelligence research team that served as the sole AI branch of Google before being incorporated under the newer umbrella of Google AI, a research division at Google dedicated to artificial intelligence ...

OpenAI OpenAI, Inc. is an American artificial intelligence (AI) organization founded in December 2015 and headquartered in San Francisco, California. It aims to develop "safe and beneficial" artificial general intelligence (AGI), which it defines ...

Meta AI Meta AI is a research division of Meta (formerly Facebook) that develops artificial intelligence and augmented reality technologies. History The foundation of laboratory was announced in 2013, under the name Facebook Artificial Intelligence ...

Hugging Face Hugging Face, Inc. is a French-American company based in New York City that develops computation tools for building applications using machine learning. It is most notable for its Transf ...

Machine learning organizations

Machine learning conferences and workshops

* Artificial Intelligence and Security (AISec) (co-located workshop with CCS) * Conference on Neural Information Processing Systems (NeurIPS, formerly NIPS) * ECML PKDD *

International Conference on Machine Learning The International Conference on Machine Learning (ICML) is a leading international academic conference in machine learning. Along with NeurIPS and ICLR, it is one of the three primary conferences of high impact in machine learning and artificial ...

(ICML) * ML4ALL (Machine Learning For All)

Machine learning publications

Books on machine learning

* Mathematics for Machine Learning * Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow * The Hundred-Page Machine Learning Book

Machine learning journals

* ''Machine Learning'' * ''Journal of Machine Learning Research'' (JMLR), a peer-reviewed open access scientific journal covering machine learning, established in 2000 with Leslie Kaelbling as its first editor-in-chief * ''Neural Computation''

Persons influential in machine learning

* Alberto Broggi * Andrei Knyazev *

Andrew McCallum Andrew McCallum is a professor in the computer science department at University of Massachusetts Amherst. His primary specialties are in machine learning, natural language processing, information extraction, information integration, and social ...

Andrew Ng Andrew Yan-Tak Ng (born April 18, 1976) is a British-American computer scientist and technology entrepreneur focusing on machine learning and artificial intelligence (AI). Ng was a cofounder and head of Google Brain and ...

* Anuraag Jain * Armin B. Cremers * Ayanna Howard * Barney Pell *

Ben Goertzel Ben Goertzel is a computer scientist, artificial intelligence researcher, and businessman. He helped popularize the term artificial general intelligence. Early life and education Three of Goertzel's Jewish great-grandparents immigrated to New Yo ...

* Ben Taskar *

Bernhard Schölkopf Bernhard Schölkopf (born 20 February 1968) is a German computer scientist known for his work in machine learning, especially on kernel methods and causality. He is a director at the Max Planck Institute for Intelligent Systems in Tübingen, ...

* Brian D. Ripley * Christopher G. Atkeson * Corinna Cortes *

Demis Hassabis Sir Demis Hassabis (born 27 July 1976) is a British artificial intelligence (AI) researcher, and entrepreneur. He is the chief executive officer and co-founder of Google DeepMind, and Isomorphic Labs, and a UK Government AI Adviser. In 2024, Ha ...

Douglas Lenat Douglas Bruce Lenat (September 13, 1950 – August 31, 2023) was an American computer scientist and researcher in artificial intelligence who was the founder and CEO of Cycorp, Inc. in Austin, Texas. Lenat was awarded the biannual IJCAI Comp ...

* Eric Xing * Ernst Dickmanns *

Geoffrey Hinton Geoffrey Everest Hinton (born 1947) is a British-Canadian computer scientist, cognitive scientist, and cognitive psychologist known for his work on artificial neural networks, which earned him the title "the Godfather of AI". Hinton is Univer ...

* Hans-Peter Kriegel *

Hartmut Neven Hartmut Neven (born 1964) is a German American scientist working in quantum computing, computer vision, robotics and computational neuroscience. He is best known for his work in face and object recognition and his contributions to quantum machin ...

* Heikki Mannila * Ian Goodfellow * Jacek M. Zurada *

Jaime Carbonell Jaime Guillermo Carbonell (July 29, 1953 – February 28, 2020) was a computer scientist who made seminal contributions to the development of natural language processing tools and technologies. His extensive research in machine translation resul ...

* Jeremy Slovak *

Jerome H. Friedman Jerome Harold Friedman (born December 29, 1939) is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.

* John D. Lafferty * John Platt * Julie Beth Lovins *

Jürgen Schmidhuber Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist noted for his work in the field of artificial intelligence, specifically artificial neural networks. He is a scientific director of the Dalle Molle Institute for Artifici ...

* Karl Steinbuch * Katia Sycara *

Leo Breiman Leo Breiman (January 27, 1928 – July 5, 2005) was an American statistician at the University of California, Berkeley and a member of the United States National Academy of Sciences. Breiman's work helped to bridge the gap between statistics an ...

* Lise Getoor * Luca Maria Gambardella * Léon Bottou *

Marcus Hutter Marcus Hutter (born 14 April 1967 in Munich) is a computer scientist, professor and artificial intelligence researcher. As a senior researcher at DeepMind, he studies the mathematical foundations of artificial general intelligence. Hutter stu ...

* Mehryar Mohri *

Michael Collins (computational linguist)

* Michael I. Jordan * Michael L. Littman *

Nando de Freitas Nando de Freitas is a researcher in the field of machine learning, and in particular in the subfields of neural networks, Bayesian inference and Bayesian optimization, and deep learning. Biography De Freitas was born in Zimbabwe. He did his und ...

* Ofer Dekel * Oren Etzioni *

Pedro Domingos Pedro Domingos (born 1965) is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for Markov logic network enabling uncertain inference. Education Domingos rece ...

* Peter Flach * Pierre Baldi * Pushmeet Kohli *

Ray Kurzweil Raymond Kurzweil (born February 12, 1948) is an American computer scientist, author, entrepreneur, futurist, and inventor. He is involved in fields such as optical character recognition (OCR), text-to-speech synthesis, spee ...

* Rayid Ghani * Ross Quinlan * Salvatore J. Stolfo *

Sebastian Thrun Sebastian Thrun (born May 14, 1967) is a German-American entrepreneur, educator, and computer scientist. He is chief executive officer of Kitty Hawk Corporation, and chairman and co-founder of Udacity. Before that, he was a Google vice preside ...

* Selmer Bringsjord *

Sepp Hochreiter Josef "Sepp" Hochreiter (born 14 February 1967) is a German computer scientist. Since 2018 he has led the Institute for Machine Learning at the Johannes Kepler University of Linz after having led the Institute of Bioinformatics from 2006 to 201 ...

* Shane Legg * Stephen Muggleton * Steve Omohundro *

Tom M. Mitchell Tom Michael Mitchell (born August 9, 1951) is an American computer scientist and the Founders University Professor at Carnegie Mellon University (CMU). He is a founder and former chair of the Machine Learning Department at CMU. Mitchell is known ...

* Trevor Hastie * Vasant Honavar * Vladimir Vapnik *

Yann LeCun Yann André Le Cun (usually spelled LeCun; born 8 July 1960) is a French-American computer scientist working primarily in the fields of machine learning, computer vision, mobile robotics and computational neuroscience. He is the Silver Pr ...

* Yasuo Matsuyama *

Yoshua Bengio Yoshua Bengio (born March 5, 1964) is a Canadian-French computer scientist, and a pioneer of artificial neural networks and deep learning. He is a professor at the Université de Montréal and scientific director of the AI institute Montreal In ...

Zoubin Ghahramani Zoubin Ghahramani FRS (born 8 February 1970) is a British-Iranian researcher and Professor of Information Engineering at the University of Cambridge. He holds joint appointments at University College London and the Alan Turing Institute. and ...

* Gaussian process * Gaussian process emulator * Gene prediction * General Architecture for Text Engineering * Generalization error * Generalized canonical correlation * Generalized filtering * Generalized iterative scaling * Generalized multidimensional scaling * Generative adversarial network * Generative model * Genetic algorithm * Genetic algorithm scheduling * Genetic algorithms in economics * Genetic fuzzy systems * Genetic memory (computer science) * Genetic operator * Genetic programming * Genetic representation * Geographical cluster * Gesture Description Language * Geworkbench * Glossary of artificial intelligence * Glottochronology * Golem (ILP) * Google matrix * Grafting (decision trees) * Gramian matrix * Grammatical evolution * Granular computing * GraphLab * Graph kernel * Gremlin (programming language) * Growth function * HUMANT (HUManoid ANT) algorithm * Hammersley–Clifford theorem * Harmony search * Hebbian theory * Hidden Markov random field * Hidden semi-Markov model * Hierarchical hidden Markov model * Higher-order factor analysis * Highway network * Hinge loss * Holland's schema theorem * Hopkins statistic * Hoshen–Kopelman algorithm * Huber loss * IRCF360 * Ian Goodfellow * Ilastik * Ilya Sutskever * Immunocomputing * Imperialist competitive algorithm * Inauthentic text * Incremental decision tree * Induction of regular languages * Inductive bias * Inductive probability * Inductive programming * Influence diagram * Information Harvesting * Information gain in decision trees * Information gain ratio * Inheritance (genetic algorithm) * Instance selection * Intel RealSense * Interacting particle system * Interactive machine translation * International Joint Conference on Artificial Intelligence * International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics * International Semantic Web Conference * Iris flower data set * Island algorithm * Isotropic position * Item response theory * Iterative Viterbi decoding * 
JOONE * Jabberwacky * Jaccard index * Jackknife variance estimates for random forest * Java Grammatical Evolution * Joseph Nechvatal * Jubatus * Julia (programming language) * Junction tree algorithm * ''k''-SVD * ''k''-means++ * ''k''-medians clustering * ''k''-medoids * KNIME * KXEN Inc. * ''k q''-flats * Kaggle * Kalman filter * Katz's back-off model * Kernel adaptive filter * Kernel density estimation * Kernel eigenvoice * Kernel embedding of distributions * Kernel method * Kernel perceptron * Kernel random forest * Kinect * Klaus-Robert Müller * Kneser–Ney smoothing * Knowledge Vault * Knowledge integration * LIBSVM * LPBoost * Labeled data * LanguageWare * Language identification in the limit * Language model * Large margin nearest neighbor * Latent Dirichlet allocation * Latent class model * Latent semantic analysis * Latent variable * Latent variable model * Lattice Miner * Layered hidden Markov model * Learnable function class * Least squares support vector machine * Leslie P. Kaelbling * Linear genetic programming * Linear predictor function * Linear separability * Lingyun Gu * Linkurious * Lior Ron (business executive) * List of genetic algorithm applications * List of metaphor-based metaheuristics * List of text mining software * Local case-control sampling * Local independence * Local tangent space alignment * Locality-sensitive hashing * Log-linear model * Logistic model tree * Low-rank approximation * Low-rank matrix approximations * MATLAB * MIMIC (immunology) * MXNet * Mallet (software project) * Manifold regularization * Margin-infused relaxed algorithm * Margin classifier * Mark V. Shaney * Massive Online Analysis * Matrix regularization * Matthews correlation coefficient * Mean shift *

* Mean squared prediction error * Measurement invariance * Medoid * MeeMix * Melomics * Memetic algorithm * Meta-optimization * Mexican International Conference on Artificial Intelligence * Michael Kearns (computer scientist) * MinHash * Mixture model * Mlpy * Models of DNA evolution * Moral graph * Mountain car problem * Movidius * Multi-armed bandit *

* Multi expression programming * Multiclass classification * Multidimensional analysis * Multifactor dimensionality reduction * Multilinear principal component analysis * Multiple correspondence analysis * Multiple discriminant analysis * Multiple factor analysis * Multiple sequence alignment * Multiplicative weight update method * Multispectral pattern recognition * Mutation (genetic algorithm) * MysteryVibe * N-gram * NOMINATE (scaling method) * Native-language identification * Natural Language Toolkit * Natural evolution strategy * Nearest-neighbor chain algorithm * Nearest centroid classifier * Nearest neighbor search * Neighbor joining * Nest Labs * NetMiner * NetOwl * Neural Designer * Neural Engineering Object * Neural modeling fields * Neural network software * NeuroSolutions * Neuroevolution * Neuroph * Niki.ai * Noisy channel model * Noisy text analytics * Nonlinear dimensionality reduction * Novelty detection * Nuisance variable * One-class classification * Onnx * OpenNLP * Optimal discriminant analysis * Oracle Data Mining * Orange (software) * Ordination (statistics) * Overfitting * PROGOL * PSIPRED * Pachinko allocation * PageRank * Parallel metaheuristic * Parity benchmark * Part-of-speech tagging * Particle swarm optimization * Path dependence * Pattern language (formal languages) * Peltarion Synapse * Perplexity * Persian Speech Corpus * Pietro Perona * Pipeline Pilot * Piranha (software) * Pitman–Yor process * Plate notation * Polynomial kernel * Pop music automation * Population process * Portable Format for Analytics * Predictive Model Markup Language * Predictive state representation * Preference regression * Premature convergence * Principal geodesic analysis * Prior knowledge for pattern recognition * Prisma (app) * Probabilistic Action Cores * Probabilistic context-free grammar * Probabilistic latent semantic analysis * Probabilistic soft logic * Probability matching * Probit model * Product of experts * Programming with Big Data in R * 
Proper generalized decomposition * Pruning (decision trees) * Pushpak Bhattacharyya * Q methodology * Qloo * Quality control and genetic algorithms * Quantum Artificial Intelligence Lab * Queueing theory * Quick, Draw! * R (programming language) * Rada Mihalcea * Rademacher complexity * Radial basis function kernel * Rand index * Random indexing * Random projection * Random subspace method * Ranking SVM * RapidMiner * Rattle GUI * Raymond Cattell * Reasoning system * Regularization perspectives on support vector machines * Relational data mining * Relationship square * Relevance vector machine * Relief (feature selection) * Renjin * Repertory grid * Representer theorem * Reward-based selection * Richard Zemel * Right to explanation * RoboEarth * Robust principal component analysis * RuleML Symposium * Rule induction * Rules extraction system family * SAS (software) * SNNS * SPSS Modeler * SUBCLU * Sample complexity * Sample exclusion dimension * Santa Fe Trail problem * Savi Technology * Schema (genetic algorithms) * Search-based software engineering * Selection (genetic algorithm) * Self-Service Semantic Suite * Semantic folding * Semantic mapping (statistics) * Semidefinite embedding * Sense Networks * Sensorium Project * Sequence labeling * Sequential minimal optimization * Shattered set * Shogun (toolbox) * Silhouette (clustering) * SimHash * SimRank * Similarity measure * Simple matching coefficient * Simultaneous localization and mapping * Sinkov statistic * Sliced inverse regression * Snakes and Ladders * Soft independent modelling of class analogies * Soft output Viterbi algorithm * Solomonoff's theory of inductive inference * SolveIT Software * Spectral clustering * Spike-and-slab variable selection * Statistical machine translation * Statistical parsing * Statistical semantics * Stefano Soatto * Stephen Wolfram * Stochastic block model * Stochastic cellular automaton * Stochastic diffusion search * Stochastic grammar * Stochastic matrix * Stochastic 
universal sampling * Stress majorization * String kernel * Structural equation modeling * Structural risk minimization * Structured sparsity regularization * Structured support vector machine * Subclass reachability * Sufficient dimension reduction * Sukhotin's algorithm * Sum of absolute differences * Sum of absolute transformed differences * Swarm intelligence * Switching Kalman filter * Symbolic regression * Synchronous context-free grammar * Syntactic pattern recognition * TD-Gammon * TIMIT * Teaching dimension * Teuvo Kohonen * Textual case-based reasoning * Theory of conjoint measurement * Thomas G. Dietterich * Thurstonian model * Topic model * Tournament selection * Training, test, and validation sets * Transiogram * Trax Image Recognition * Trigram tagger * Truncation selection * Tucker decomposition * UIMA * UPGMA * Ugly duckling theorem * Uncertain data * Uniform convergence in probability * Unique negative dimension * Universal portfolio algorithm * User behavior analytics * VC dimension * VIGRA * Validation set * Vapnik–Chervonenkis theory * Variable-order Bayesian network * Variable kernel density estimation * Variable rules analysis * Variational message passing * Varimax rotation * Vector quantization * Vicarious (company) * Viterbi algorithm * Vowpal Wabbit * WACA clustering algorithm * WPGMA * Ward's method * Weasel program * Whitening transformation * Winnow (algorithm) * Win–stay, lose–switch * Witness set * Wolfram Language * Wolfram Mathematica * Writer invariant * Xgboost * Yooreeka * Zeroth (software)

External links

* Data Science: Data to Insights from MIT (machine learning)
* Popular online course on Coursera. It uses GNU Octave. The course is a free version of Stanford University's actual course taught by Ng, which is also available for free at see.stanford.edu/Course/CS229.
* mloss is an academic database of open-source machine learning software.