The following outline is provided as an overview of, and topical guide to, machine learning:
Machine learning (ML) is a subfield of artificial intelligence within computer science that evolved from the study of pattern recognition and computational learning theory. [http://www.britannica.com/EBchecked/topic/1116194/machine-learning ] In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". ML involves the study and construction of algorithms that can learn from and make predictions on data. These algorithms operate by building a model from a training set of example observations to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.
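As a rough illustration of building a model from a training set and then using it to predict, the sketch below fits a straight line by least squares. The data, the choice of a linear model, and the names `fit_line` and `predict` are assumptions made for this example only.

```python
# Minimal sketch: learn a model from example observations, then predict.
# The training data and the straight-line model are illustrative assumptions.

def fit_line(xs, ys):
    """Return (slope, intercept) minimising squared error on the training set."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned parameters to an unseen input."""
    slope, intercept = model
    return slope * x + intercept


# Training set of example observations (generated from y = 2x + 1).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)   # parameters come from data, not hand-coded rules
prediction = predict(model, 5.0)     # data-driven output for an unseen input
print(model, prediction)
```

The program contains no rule stating "multiply by two and add one"; the relationship is recovered from the observations.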
How can machine learning be categorized?
* An academic discipline
* A branch of science
** An applied science
*** A subfield of computer science
**** A branch of artificial intelligence
**** A subfield of soft computing
**** Application of statistics
Paradigms of machine learning
* Supervised learning, where the model is trained on labeled data
* Unsupervised learning, where the model tries to identify patterns in unlabeled data
* Reinforcement learning, where the model learns to make decisions by receiving rewards or penalties
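The three paradigms above differ mainly in the feedback available to the learner. The sketch below contrasts them on toy one-dimensional data; the data, the function names, and the specific techniques (nearest-centroid classification, two-cluster k-means, and a deterministic two-action bandit standing in for reinforcement learning) are assumptions chosen for brevity, not canonical definitions.

```python
# Toy contrast of the three paradigms. All data and names are hypothetical.

# Supervised: labels are given, so the model learns one centroid per label.
def train_supervised(points, labels):
    sums, counts = {}, {}
    for p, lab in zip(points, labels):
        sums[lab] = sums.get(lab, 0.0) + p
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: sums[lab] / counts[lab] for lab in sums}


def classify(centroids, p):
    """Predict the label whose centroid is closest to p."""
    return min(centroids, key=lambda lab: abs(p - centroids[lab]))


# Unsupervised: no labels, so the model must find the grouping itself.
def two_means(points, iterations=10):
    centers = [min(points), max(points)]          # simple initialisation
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers


# Reinforcement: the only feedback is a reward or penalty after each action.
def learn_from_rewards(reward_of, actions, rounds=20):
    value = {a: 0.0 for a in actions}
    tries = {a: 0 for a in actions}
    for step in range(rounds):
        a = actions[step % len(actions)]          # round-robin exploration
        r = reward_of(a)
        tries[a] += 1
        value[a] += (r - value[a]) / tries[a]     # running average of reward
    return max(value, key=value.get)              # act greedily afterwards


points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
labels = ["low", "low", "low", "high", "high", "high"]

centroids = train_supervised(points, labels)
predicted = classify(centroids, 7.0)
clusters = two_means(points)
rewards = {"left": -1.0, "right": 1.0}
best_action = learn_from_rewards(lambda a: rewards[a], ["left", "right"])
print(predicted, clusters, best_action)
```

A full reinforcement learning problem would also involve states and delayed rewards; the bandit here keeps only the reward-driven update.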
Applications of machine learning
* Applications of machine learning
* Bioinformatics
* Biomedical informatics
* Computer vision
* Customer relationship management
* Data mining
* Earth sciences
* Email filtering
* Inverted pendulum (balance and equilibrium system)
* Natural language processing
** Named Entity Recognition
** Automatic summarization
** Automatic taxonomy construction
** Dialog system
** Grammar checker
** Language recognition
*** Handwriting recognition
*** Optical character recognition
*** Speech recognition
**** Text to Speech Synthesis
**** Speech Emotion Recognition
** Machine translation
** Question answering
** Speech synthesis
** Text mining
*** Term frequency–inverse document frequency
** Text simplification
* Pattern recognition
** Facial recognition system
** Handwriting recognition
** Image recognition
** Optical character recognition
** Speech recognition
* Recommendation system
** Collaborative filtering
** Content-based filtering
** Hybrid recommender systems
* Search engine
** Search engine optimization
* Social engineering
Machine learning hardware
* Graphics processing unit
* Tensor processing unit
* Vision processing unit
Machine learning tools
* Comparison of deep learning software
Machine learning frameworks
Proprietary machine learning frameworks
* Amazon Machine Learning
* Microsoft Azure Machine Learning Studio
* DistBelief (replaced by TensorFlow)
Open source machine learning frameworks
* Apache Singa
* Apache MXNet
* Caffe
* PyTorch
* mlpack
* TensorFlow
* Torch
* CNTK
* Accord.Net
* Jax
* MLJ.jl – A machine learning framework for Julia
Machine learning libraries
* Deeplearning4j
* Theano
* scikit-learn
* Keras
Machine learning algorithms
* Almeida–Pineda recurrent backpropagation
* ALOPEX
* Backpropagation
* Bootstrap aggregating
* CN2 algorithm
* Constructing skill trees
* Dehaene–Changeux model
* Diffusion map
* Dominance-based rough set approach
* Dynamic time warping
* Error-driven learning
* Evolutionary multimodal optimization
* Expectation–maximization algorithm
* FastICA
* Forward–backward algorithm
* GeneRec
* Genetic Algorithm for Rule Set Production
* Growing self-organizing map
* Hyper basis function network
* IDistance
* k-nearest neighbors algorithm
* Kernel methods for vector output
* Kernel principal component analysis
* Leabra
* Linde–Buzo–Gray algorithm
* Local outlier factor
* Logic learning machine
* LogitBoost
* Manifold alignment
* Markov chain Monte Carlo (MCMC)
* Minimum redundancy feature selection
* Mixture of experts
* Multiple kernel learning
* Non-negative matrix factorization
* Online machine learning
* Out-of-bag error
* Prefrontal cortex basal ganglia working memory
* PVLV
* Q-learning
* Quadratic unconstrained binary optimization
* Query-level feature
* Quickprop
* Radial basis function network
* Randomized weighted majority algorithm
* Reinforcement learning
* Repeated incremental pruning to produce error reduction (RIPPER)
* Rprop
* Rule-based machine learning
* Skill chaining
* Sparse PCA
* State–action–reward–state–action
* Stochastic gradient descent
* Structured kNN
* T-distributed stochastic neighbor embedding
* Temporal difference learning
* Wake-sleep algorithm
* Weighted majority algorithm (machine learning)
Machine learning methods
Instance-based algorithm
* K-nearest neighbors algorithm (KNN)
* Learning vector quantization (LVQ)
* Self-organizing map (SOM)
Regression analysis
* Logistic regression
* Ordinary least squares regression (OLSR)
* Linear regression
* Stepwise regression
* Multivariate adaptive regression splines (MARS)
* Regularization algorithm
** Ridge regression
** Least Absolute Shrinkage and Selection Operator (LASSO)
** Elastic net
** Least-angle regression (LARS)
* Classifiers
** Probabilistic classifier
*** Naive Bayes classifier
** Binary classifier
** Linear classifier
** Hierarchical classifier
Dimensionality reduction
Dimensionality reduction
* Canonical correlation analysis (CCA)
* Factor analysis
* Feature extraction
* Feature selection
* Independent component analysis (ICA)
* Linear discriminant analysis (LDA)
* Multidimensional scaling (MDS)
* Non-negative matrix factorization (NMF)
* Partial least squares regression (PLSR)
* Principal component analysis (PCA)
* Principal component regression (PCR)
* Projection pursuit
* Sammon mapping
* t-distributed stochastic neighbor embedding (t-SNE)
Ensemble learning
In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
Unlike a statistical ensemble in statist ...
*
AdaBoost
AdaBoost (short for Adaptive Boosting) is a statistical classification meta-algorithm formulated by Yoav Freund and Robert Schapire in 1995, who won the 2003 Gödel Prize for their work. It can be used in conjunction with many types of learnin ...
*
Boosting
*
Bootstrap aggregating
Bootstrap aggregating, also called bagging (from bootstrap aggregating) or bootstrapping, is a machine learning (ML) ensemble meta-algorithm designed to improve the stability and accuracy of ML classification and regression algorithms. It also ...
(also "bagging" or "bootstrapping")
*
Ensemble averaging
*
Gradient boosted decision tree (GBDT)
*
Gradient boosting
Gradient boosting is a machine learning technique based on boosting in a functional space, where the target is ''pseudo-residuals'' instead of residuals as in traditional boosting. It gives a prediction model in the form of an ensemble of weak ...
*
Random Forest
Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that works by creating a multitude of decision trees ...
*
Stacked Generalization
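Bootstrap aggregating, as listed above, can be sketched roughly as follows: several weak models are trained on bootstrap resamples of the data and their predictions are combined by majority vote. The toy data, the helper names `fit_stump`, `bagged_fit` and `bagged_predict`, and the use of one-threshold stumps as base learners are assumptions made for illustration only.

```python
import random
from collections import Counter

def fit_stump(points):
    """Pick the threshold on x that misclassifies the fewest (x, label) pairs.

    The stump predicts 1 when x > threshold and 0 otherwise.
    """
    xs = sorted({x for x, _ in points})
    best_threshold, best_errors = xs[0], len(points) + 1
    for t in xs:
        errors = sum((x > t) != bool(y) for x, y in points)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

def bagged_fit(points, n_models=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(points) for _ in points])
            for _ in range(n_models)]

def bagged_predict(thresholds, x):
    """Majority vote over the individual stump predictions."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical 1-D toy data: label 1 for large x, label 0 for small x.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
models = bagged_fit(data)
print(bagged_predict(models, 0.7), bagged_predict(models, 4.2))
```

Each stump sees a slightly different resample, so individual thresholds vary; the vote smooths out that variation, which is the stabilizing effect bagging aims for.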
Meta-learning
*
Inductive bias
*
Metadata
Metadata (or metainformation) is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including:
* Descriptive ...
Reinforcement learning
Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent should take actions in a dynamic environment in order to maximize a reward signal. Reinforcement learnin ...
*
Q-learning
''Q''-learning is a reinforcement learning algorithm that trains an agent to assign values to its possible actions based on its current state, without requiring a model of the environment (model-free). It can handle problems with stochastic tra ...
*
State–action–reward–state–action
State–action–reward–state–action (SARSA) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning. It was proposed by Rummery and Niranjan in a technical note with the na ...
(SARSA)
*
Temporal difference learning
Temporal difference (TD) learning refers to a class of model-free reinforcement learning methods which learn by bootstrapping from the current estimate of the value function. These methods sample from the environment, like Monte Carlo methods, a ...
(TD)
*
Learning Automata
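The difference between the Q-learning and SARSA entries above comes down to the update target, which the following minimal sketch tries to show. It assumes the standard tabular update rules; the dictionary-based table, the state and action names, and the default `alpha` and `gamma` values are illustrative assumptions.

```python
def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning step (off-policy).

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9):
    """One tabular SARSA step (on-policy).

    The target uses the action actually taken in the next state
    instead of the maximum over all actions.
    """
    old = q.get((state, action), 0.0)
    target = reward + gamma * q.get((next_state, next_action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]

# Hypothetical two-state example with unseen entries defaulting to 0.
actions = ["left", "right"]
q = {}
first = q_update(q, "s0", "right", 1.0, "s1", actions)
second = q_update(q, "s0", "right", 1.0, "s1", actions)
print(first, second)
```

Both functions are temporal difference updates: they bootstrap from the current value estimate of the next state rather than waiting for the end of an episode.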
Supervised learning
In machine learning, supervised learning (SL) is a paradigm where a model is trained using input objects (e.g. a vector of predictor variables) and desired output values (also known as a ''supervisory signal''), which are often ...
*
Averaged one-dependence estimators (AODE)
*
Artificial neural network
In machine learning, a neural network (also artificial neural network or neural net, abbreviated ANN or NN) is a computational model inspired by the structure and functions of biological neural networks.
A neural network consists of connected ...
*
Case-based reasoning
Case-based reasoning (CBR), broadly construed, is the process of solving new problems based on the solutions of similar past problems.
In everyday life, an auto mechanic who fixes an engine by recalling another car that exhibited similar sympto ...
*
Gaussian process regression
In statistics, originally in geostatistics, kriging or Kriging, also known as Gaussian process regression, is a method of interpolation based on a Gaussian process governed by prior covariances. Under suitable assumptions of the prior, kriging g ...
*
Gene expression programming
Gene expression programming (GEP) in computer programming is an evolutionary algorithm that creates computer programs or models. These computer programs are complex tree structures that learn and adapt by changing their sizes, shapes, and compos ...
*
Group method of data handling
(GMDH)
*
Inductive logic programming
Inductive logic programming (ILP) is a subfield of symbolic artificial intelligence which uses logic programming as a uniform representation for examples, background knowledge and hypotheses. The term "''inductive''" here refers to philosophical ...
*
Instance-based learning
In machine learning, instance-based learning (sometimes called memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have b ...
*
Lazy learning
*
Learning Automata
*
Learning Vector Quantization
*
Logistic Model Tree
*
Minimum message length (decision trees, decision graphs, etc.)
**
Nearest Neighbor Algorithm
**
Analogical modeling
Analogical modeling (AM) is a formal theory of exemplar-based analogical reasoning, proposed by Royal Skousen, professor of Linguistics and English language at Brigham Young University in Provo, Utah. It is applicable to language ...
*
Probably approximately correct learning
In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. It was proposed in 1984 by Leslie Valiant. Reference: L. Valiant, "A theory of the learnable", Communications of the ...
(PAC) learning
*
Ripple down rules
Ripple-down rules (RDR) are a way of approaching knowledge acquisition. Knowledge acquisition refers to the transfer of knowledge from human experts to knowledge-based systems.
Introductory material
Ripple-down rules are an incremental approach ...
, a knowledge acquisition methodology
* Symbolic machine learning algorithms
*
Support vector machines
In machine learning, support vector machines (SVMs, also support vector networks) are supervised max-margin models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laborato ...
*
Random Forests
*
Ensembles of classifiers
**
Bootstrap aggregating
Bootstrap aggregating, also called bagging (from bootstrap aggregating) or bootstrapping, is a machine learning (ML) ensemble meta-algorithm designed to improve the stability and accuracy of ML classification and regression algorithms. It also ...
(bagging)
**
Boosting (meta-algorithm)
In machine learning (ML), boosting is an ensemble metaheuristic for primarily reducing bias (as opposed to variance). It can also improve the stability and accuracy of ML classification and regression algorithms. Hence, it is prevalent in supe ...
*
Ordinal classification
*
Conditional Random Field
Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without consi ...
*
ANOVA
Analysis of variance (ANOVA) is a family of statistical methods used to compare the means of two or more groups by analyzing variance. Specifically, ANOVA compares the amount of variation ''between'' the group means to the amount of variation ''w ...
*
Quadratic classifiers
In statistics, a quadratic classifier is a statistical classifier that uses a quadratic decision surface to separate measurements of two or more classes of objects or events. It is a more general version of the linear classifier.
The classific ...
*
k-nearest neighbor
In statistics, the ''k''-nearest neighbors algorithm (''k''-NN) is a non-parametric supervised learning method. It was first developed by Evelyn Fix and Joseph Hodges in 1951, and later expanded by Thomas Cover.
Most often, it is used for cl ...
*
Boosting
** SPRINT
*
Bayesian networks
A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Whi ...
**
Naive Bayes
In statistics, naive (sometimes simple or idiot's) Bayes classifiers are a family of "probabilistic classifiers" which assumes that the features are conditionally independent, given the target class. In other words, a naive Bayes model assumes th ...
*
Hidden Markov models
A hidden Markov model (HMM) is a Markov model in which the observations are dependent on a latent (or ''hidden'') Markov process (referred to as X). An HMM requires that there be an observable process Y whose outcomes depend on the outcomes of X ...
**
Hierarchical hidden Markov model
Bayesian
Bayesian statistics
Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability, where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about ...
* Bayesian knowledge base
*
Naive Bayes
In statistics, naive (sometimes simple or idiot's) Bayes classifiers are a family of "probabilistic classifiers" which assumes that the features are conditionally independent, given the target class. In other words, a naive Bayes model assumes th ...
*
Gaussian Naive Bayes
*
Multinomial Naive Bayes
*
Averaged One-Dependence Estimators (AODE)
*
Bayesian Belief Network (BBN)
*
Bayesian Network
A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Whi ...
(BN)
Decision tree algorithms
*
Decision tree
A decision tree is a decision support recursive partitioning structure that uses a tree-like model of decisions and their possible consequences, including chance event ou ...
*
Classification and regression tree (CART)
*
Iterative Dichotomiser 3 (ID3)
*
C4.5 algorithm
*
C5.0 algorithm
*
Chi-squared Automatic Interaction Detection (CHAID)
*
Decision stump
* Conditional decision tree
*
ID3 algorithm
In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan (Quinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81–106) used to generate a decision tree from a dataset. ID3 is th ...
*
Random forest
Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that works by creating a multitude of decision trees ...
* SLIQ
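Decision tree learners such as ID3, listed above, typically choose the attribute to split on by an impurity measure. The sketch below assumes the information-gain criterion based on Shannon entropy; the weather-style toy rows and the function names are hypothetical and only meant to show the calculation.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

best = max(["outlook", "windy"],
           key=lambda f: information_gain(rows, labels, f))
print(best)
```

A tree builder would split on the best-scoring feature and then repeat the same calculation inside each resulting subset.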
Linear classifier
In machine learning, a linear classifier makes a classification decision for each object based on a linear combination of its features. Such classifiers work well for practical problems such as document classification, and more generally for prob ...
*
Fisher's linear discriminant
*
Linear regression
In statistics, linear regression is a model that estimates the relationship between a scalar response (dependent variable) and one or more explanatory variables (regressor or independent variable). A mode ...
*
Logistic regression
In statistics, a logistic model (or logit model) is a statistical model that models the log-odds of an event as a linear combination of one or more independent variables. In regression analysis, logistic regres ...
*
Multinomial logistic regression
In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the prob ...
*
Naive Bayes classifier
In statistics, naive (sometimes simple or idiot's) Bayes classifiers are a family of "probabilistic classifiers" which assumes that the features are conditionally independent, given the target class. In other words, a naive Bayes model assumes th ...
*
Perceptron
In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function that can decide whether or not an input, represented by a vect ...
*
Support vector machine
In machine learning, support vector machines (SVMs, also support vector networks) are supervised max-margin models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laborato ...
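A linear classifier decides on the sign of a linear combination of the input features. As a small illustration, the sketch below trains a perceptron, one of the methods listed above, on the logical AND function. The learning rate, epoch limit, and toy data are assumptions chosen for the example.

```python
def predict(weights, bias, x):
    """Return 1 if the linear combination w.x + b is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Classic perceptron rule: adjust weights only on misclassified samples."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            error = y - predict(weights, bias, x)
            if error != 0:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly
            break
    return weights, bias

# Logical AND is linearly separable, so the perceptron rule converges.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
print([predict(weights, bias, x) for x, _ in and_samples])
```

The same training loop would not converge on data that is not linearly separable (XOR, for instance), which is why the epoch limit is kept as a safeguard.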
Unsupervised learning
Unsupervised learning is a framework in machine learning where, in contrast to supervised learning, algorithms learn patterns exclusively from unlabeled data. Other frameworks in the spectrum of supervisions include weak- or semi-supervision, wh ...
*
Expectation-maximization algorithm
*
Vector Quantization
Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. Developed in the early 1980s by Robert M. Gray, it was ori ...
*
Generative topographic map
*
Information bottleneck method
The information bottleneck method is a technique in information theory introduced by Naftali Tishby, Fernando C. Pereira, and William Bialek. It is designed for finding the best tradeoff between accuracy and complexity (compressio ...
*
Association rule learning
Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.P ...
algorithms
**
Apriori algorithm
Apriori (Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules. Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994) is an algorithm for frequent ...
**
Eclat algorithm
Artificial neural networks
In machine learning, a neural network (also artificial neural network or neural net, abbreviated ANN or NN) is a computational model inspired by the structure and functions of biological neural networks.
A neural network consists of connected ...
*
Feedforward neural network
Feedforward refers to recognition-inference architecture of neural networks. Artificial neural network architectures are based on inputs multiplied by weights to obtain outputs (inputs-to-output): feedforward. Recurrent neural networks, or neur ...
**
Extreme learning machine
**
Convolutional neural network
A convolutional neural network (CNN) is a type of feedforward neural network that learns features via filter (or kernel) optimization. This type of deep learning network has been applied to process and make predictions from many different ty ...
*
Recurrent neural network
Recurrent neural networks (RNNs) are a class of artificial neural networks designed for processing sequential data, such as text, speech, and time series, where the order of elements is important. Unlike feedforward neural networks, which proces ...
**
Long short-term memory (LSTM)
*
Logic learning machine
*
Self-organizing map
A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher-dimensional data set while preserving the t ...
Association rule learning
Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.P ...
*
Apriori algorithm
Apriori (Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules. Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994) is an algorithm for frequent ...
*
Eclat algorithm
*
FP-growth algorithm
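Association rule learning relies on measures of interestingness to decide which rules are strong. The sketch below assumes two commonly used measures, support and confidence, plus a simplified candidate step in the spirit of Apriori (only pairs built from frequent single items are checked). The basket data and function names are hypothetical.

```python
from itertools import combinations

def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def frequent_pairs(transactions, min_support):
    """Pairs of frequent items whose joint support also meets the threshold."""
    items = sorted({i for t in transactions for i in t})
    frequent_items = [i for i in items
                      if support(transactions, {i}) >= min_support]
    return [pair for pair in combinations(frequent_items, 2)
            if support(transactions, pair) >= min_support]

# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(frequent_pairs(baskets, 0.5))
print(confidence(baskets, {"bread"}, {"milk"}))
```

Restricting candidate pairs to frequent single items works because an itemset cannot be more frequent than any of its subsets.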
Hierarchical clustering
In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two ...
*
Single-linkage clustering
In statistics, single-linkage clustering is one of several methods of hierarchical clustering. It is based on grouping clusters in bottom-up fashion (agglomerative clustering), at each step combining two clusters that contain the closest pair of el ...
*
Conceptual clustering
Conceptual clustering is a machine learning paradigm for unsupervised classification that has been defined by Ryszard S. Michalski in 1980 (Fisher 1987, Michalski 1980) and developed mainly during the 1980s. It is distinguished from ordinary cluste ...
Cluster analysis
Cluster analysis or clustering is the data analysis task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some specific sense defined by the ...
*
BIRCH
*
DBSCAN
*
Expectation–maximization (EM)
*
Fuzzy clustering
*
Hierarchical clustering
In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two ...
*
''k''-means clustering
*
''k''-medians
*
Mean-shift
*
OPTICS algorithm
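To make the idea of grouping similar objects concrete, here is a minimal sketch of ''k''-means clustering from the list above, restricted to one-dimensional data. It assumes the usual alternating scheme (assign each point to its nearest centroid, then recompute centroids as means); the initial centroids and data values are arbitrary illustrative choices.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Alternate assignment and centroid-update steps on 1-D data."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:  # assignments have stabilized
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data with two visibly separated groups.
values = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
centroids, clusters = kmeans_1d(values, centroids=[1.0, 2.0])
print(centroids, clusters)
```

The result depends on the starting centroids in general; practical implementations usually run several random initializations and keep the best one.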
Anomaly detection
In data analysis, anomaly detection (also referred to as outlier detection and sometimes as novelty detection) is generally understood to be the identification of rare items, events or observations which deviate significantly from the majority of ...
*
''k''-nearest neighbors algorithm (''k''-NN)
*
Local outlier factor
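One simple way to use nearest neighbors for anomaly detection is to score each observation by its distance to its ''k''-th nearest neighbor, so that isolated points receive high scores. The sketch below assumes that scoring rule on one-dimensional data; the sensor-style readings and the choice of `k=2` are illustrative assumptions, and this is not the local outlier factor method itself.

```python
def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest other point."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(abs(p - q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores

# Hypothetical readings with one value far from the rest.
readings = [10.0, 10.5, 9.5, 10.2, 25.0]
scores = knn_outlier_scores(readings)
most_anomalous = readings[scores.index(max(scores))]
print(most_anomalous)
```

Local outlier factor refines this idea by comparing each point's local density with that of its neighbors rather than using a raw distance.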
Semi-supervised learning
Weak supervision (also known as semi-supervised learning) is a paradigm in machine learning, the relevance and notability of which increased with the advent of large language models due to the large amount of data required to train them. It is charact ...
*
Active learning
*
Generative models
*
Low-density separation
*
Graph-based methods
*
Co-training
Co-training is a machine learning algorithm used when there are only small amounts of labeled data and large amounts of unlabeled data. One of its uses is in text mining for search engines. It was introduced by Avrim Blum and Tom Mitchell in 1998 ...
*
Transduction
Deep learning
Deep learning is a subset of machine learning that focuses on utilizing multilayered neural networks to perform tasks such as classification, regression, and representation learning. The field takes inspiration from biological neuroscience a ...
*
Deep belief networks
* Deep Boltzmann machines
A Boltzmann machine (also called Sherrington–Kirkpatrick model with external field or stochastic Ising model), named after Ludwig Boltzmann, is a spin-glass model with an external field, i.e., a Sherrington–Kirkpatrick m ...
* Deep Convolutional neural networks
A convolutional neural network (CNN) is a type of feedforward neural network that learns features via filter (or kernel) optimization. This type of deep learning network has been applied to process and make predictions from many different ty ...
* Deep Recurrent neural networks
Recurrent neural networks (RNNs) are a class of artificial neural networks designed for processing sequential data, such as text, speech, and time series, where the order of elements is important. Unlike feedforward neural networks, which proces ...
*
Hierarchical temporal memory
Hierarchical temporal memory (HTM) is a biologically constrained machine intelligence technology developed by Numenta. Originally described in the 2004 book ''On Intelligence'' by Jeff Hawkins with Sandra Blakeslee, HTM is primarily used toda ...
*
Generative Adversarial Network
A generative adversarial network (GAN) is a class of machine learning frameworks and a prominent framework for approaching generative artificial intelligence. The concept was initially developed by Ian Goodfellow and his colleagues in June ...
**
Style transfer
*
Transformer
*
Stacked Auto-Encoders
Other machine learning methods and problems
*
Anomaly detection
In data analysis, anomaly detection (also referred to as outlier detection and sometimes as novelty detection) is generally understood to be the identification of rare items, events or observations which deviate significantly from the majority of ...
*
Association rules
*
Bias-variance dilemma
*
Classification
Classification is the activity of assigning objects to some pre-existing classes or categories. This is distinct from the task of establishing the classes themselves (for example through cluster analysis). Examples include diagnostic tests, identif ...
**
Multi-label classification
In machine learning, multi-label classification or multi-output classification is a variant of the classification problem where multiple nonexclusive labels may be assigned to each instance. Multi-label classification ...
*
Clustering
*
Data Pre-processing
*
Empirical risk minimization
In statistical learning theory, the principle of empirical risk minimization defines a family of learning algorithms based on evaluating performance over a known and fixed dataset. The core idea is based on an application of the law of large num ...
*
Feature engineering
*
Feature learning
In machine learning (ML), feature learning or representation learning is a set of techniques that allow a system to automatically discover the representations needed for feature detection or classification from raw data. This replaces manual fea ...
*
Learning to rank
Learning to rank (slides from Tie-Yan Liu's talk at the WWW 2009 conference are available online) or machine-learned ranking (MLR) is the application of machine learning, typically supervised, Semi-supervi ...
*
Occam learning
*
Online machine learning
In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques whic ...
*
PAC learning
*
Regression
*
Reinforcement Learning
Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent should take actions in a dynamic environment in order to maximize a reward signal. Reinforcement learnin ...
*
Semi-supervised learning
Weak supervision (also known as semi-supervised learning) is a paradigm in machine learning, the relevance and notability of which increased with the advent of large language models due to the large amount of data required to train them. It is charact ...
*
Statistical learning
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalise to unseen data, and thus perform Task ( ...
*
Structured prediction
Structured prediction or structured output learning is an umbrella term for supervised machine learning techniques that involves predicting structured objects, rather than discrete or real values.
Similar to commonly used supervised learning t ...
**
Graphical models
A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. Graphical models are commonly used in ...
***
Bayesian network
A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Whi ...
***
Conditional random field
Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without consi ...
(CRF)
***
Hidden Markov model
A hidden Markov model (HMM) is a Markov model in which the observations are dependent on a latent (or ''hidden'') Markov process (referred to as X). An HMM requires that there be an observable process Y whose outcomes depend on the outcomes of X ...
(HMM)
*
Unsupervised learning
Unsupervised learning is a framework in machine learning where, in contrast to supervised learning, algorithms learn patterns exclusively from unlabeled data. Other frameworks in the spectrum of supervisions include weak- or semi-supervision, wh ...
*
VC theory
Machine learning research
*
List of artificial intelligence projects
*
List of datasets for machine learning research
These datasets are used in machine learning (ML) research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learni ...
History of machine learning
*
Timeline of machine learning
Machine learning projects
*
DeepMind
DeepMind Technologies Limited, trading as Google DeepMind or simply DeepMind, is a British–American artificial intelligence research laboratory which serves as a subsidiary of Alphabet Inc. Founded in the UK in 2010, it was acquired by Go ...
*
Google Brain
Google Brain was a deep learning artificial intelligence research team that served as the sole AI branch of Google before being incorporated under the newer umbrella of Google AI, a research division at Google dedicated to artificial intelligence ...
*
OpenAI
OpenAI, Inc. is an American artificial intelligence (AI) organization founded in December 2015 and headquartered in San Francisco, California. It aims to develop "safe and beneficial" artificial general intelligence (AGI), which it defines ...
*
Meta AI
Meta AI is a research division of Meta (formerly Facebook) that develops artificial intelligence and augmented reality technologies.
History
The foundation of laboratory was announced in 2013, under the name Facebook Artificial Intelligence ...
*
Hugging Face
Hugging Face, Inc. is a French-American company based in New York City that develops computation tools for building applications using machine learning. It is most notable for its Transf ...
Machine learning organizations
Machine learning conferences and workshops
* Artificial Intelligence and Security (AISec) (co-located workshop with CCS)
*
Conference on Neural Information Processing Systems (NeurIPS, formerly NIPS)
*
ECML PKDD
*
International Conference on Machine Learning
The International Conference on Machine Learning (ICML) is a leading international academic conference in machine learning. Along with NeurIPS and ICLR, it is one of the three primary conferences of high impact in machine learning and artificial ...
(ICML)
* ML4ALL (Machine Learning For All)
Machine learning publications
Books on machine learning
*
Mathematics for Machine Learning
*
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
*
The Hundred-Page Machine Learning Book
Machine learning journals
* ''Machine Learning''
* ''Journal of Machine Learning Research'' (JMLR)
The ''Journal of Machine Learning Research'' is a peer-reviewed open access scientific journal covering machine learning. It was established in 2000 and the first editor-in-chief was Leslie Kaelbling. The current editors-in-chief are Francis Bac ...
* ''Neural Computation''
Persons influential in machine learning
*
Alberto Broggi
*
Andrei Knyazev
*
Andrew McCallum
Andrew McCallum is a professor in the computer science department at University of Massachusetts Amherst. His primary specialties are in machine learning, natural language processing, information extraction, information integration, and social ...
*
Andrew Ng
Andrew Yan-Tak Ng (born April 18, 1976) is a British-American computer scientist and technology entrepreneur focusing on machine learning and artificial intelligence (AI). Ng was a cofounder and head of Google Brain and ...
*
Anuraag Jain
*
Armin B. Cremers
*
Ayanna Howard
*
Barney Pell
*
Ben Goertzel
Ben Goertzel is a computer scientist, artificial intelligence researcher, and businessman. He helped popularize the term artificial general intelligence.
Early life and education
Three of Goertzel's Jewish great-grandparents immigrated to New Yo ...
*
Ben Taskar
*
Bernhard Schölkopf
Bernhard Schölkopf (born 20 February 1968) is a German computer scientist known for his work in machine learning, especially on kernel methods and causality. He is a director at the Max Planck Institute for Intelligent Systems in Tübingen, ...
*
Brian D. Ripley
*
Christopher G. Atkeson
*
Corinna Cortes
*
Demis Hassabis
Sir Demis Hassabis (born 27 July 1976) is a British artificial intelligence (AI) researcher, and entrepreneur. He is the chief executive officer and co-founder of Google DeepMind, and Isomorphic Labs, and a UK Government AI Adviser. In 2024, Ha ...
*
Douglas Lenat
Douglas Bruce Lenat (September 13, 1950 – August 31, 2023) was an American computer scientist and researcher in artificial intelligence who was the founder and CEO of Cycorp, Inc. in Austin, Texas.
Lenat was awarded the biannual IJCAI Comp ...
*
Eric Xing
*
Ernst Dickmanns
*
Geoffrey Hinton
Geoffrey Everest Hinton (born 1947) is a British-Canadian computer scientist, cognitive scientist, and cognitive psychologist known for his work on artificial neural networks, which earned him the title "the Godfather of AI".
Hinton is Univer ...
*
Hans-Peter Kriegel
*
Hartmut Neven
Hartmut Neven (born 1964) is a German American scientist working in quantum computing, computer vision, robotics and computational neuroscience. He is best known for his work in face and object recognition and his contributions to quantum machin ...
*
Heikki Mannila
*
Ian Goodfellow
*
Jacek M. Zurada
*
Jaime Carbonell
Jaime Guillermo Carbonell (July 29, 1953 – February 28, 2020) was a computer scientist who made seminal contributions to the development of natural language processing tools and technologies. His extensive research in machine translation resul ...
*
Jeremy Slovak
*
Jerome H. Friedman
Jerome Harold Friedman (born December 29, 1939) is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.
*
John D. Lafferty
*
John Platt
*
Julie Beth Lovins
*
Jürgen Schmidhuber
Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist noted for his work in the field of artificial intelligence, specifically artificial neural networks. He is a scientific director of the Dalle Molle Institute for Artifici ...
*
Karl Steinbuch
*
Katia Sycara
*
Leo Breiman
Leo Breiman (January 27, 1928 – July 5, 2005) was an American statistician at the University of California, Berkeley and a member of the United States National Academy of Sciences.
Breiman's work helped to bridge the gap between statistics an ...
*
Lise Getoor
*
Luca Maria Gambardella
*
Léon Bottou
*
Marcus Hutter
Marcus Hutter (born 14 April 1967 in Munich) is a computer scientist, professor and artificial intelligence researcher. As a senior researcher at DeepMind, he studies the mathematical foundations of artificial general intelligence.
Hutter stu ...
*
Mehryar Mohri
*
Michael Collins
*
Michael I. Jordan
*
Michael L. Littman
*
Nando de Freitas
Nando de Freitas is a researcher in the field of machine learning, and in particular in the subfields of neural networks, Bayesian inference and Bayesian optimization, and deep learning.
Biography
De Freitas was born in Zimbabwe. He did his und ...
*
Ofer Dekel
*
Oren Etzioni
*
Pedro Domingos
Pedro Domingos (born 1965) is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for the Markov logic network, which enables uncertain inference.
Education
Domingos rece ...
*
Peter Flach
*
Pierre Baldi
*
Pushmeet Kohli
*
Ray Kurzweil
Raymond Kurzweil (born February 12, 1948) is an American computer scientist, author, entrepreneur, futurist, and inventor. He is involved in fields such as optical character recognition (OCR), text-to-speech synthesis, spee ...
*
Rayid Ghani
*
Ross Quinlan
*
Salvatore J. Stolfo
*
Sebastian Thrun
Sebastian Thrun (born May 14, 1967) is a German-American entrepreneur, educator, and computer scientist. He is chief executive officer of Kitty Hawk Corporation, and chairman and co-founder of Udacity. Before that, he was a Google vice preside ...
*
Selmer Bringsjord
*
Sepp Hochreiter
Josef "Sepp" Hochreiter (born 14 February 1967) is a German computer scientist. Since 2018 he has led the Institute for Machine Learning at the Johannes Kepler University of Linz after having led the Institute of Bioinformatics from 2006 to 201 ...
*
Shane Legg
*
Stephen Muggleton
*
Steve Omohundro
*
Tom M. Mitchell
Tom Michael Mitchell (born August 9, 1951) is an American computer scientist and the Founders University Professor at Carnegie Mellon University (CMU). He is a founder and former chair of the Machine Learning Department at CMU. Mitchell is known ...
*
Trevor Hastie
*
Vasant Honavar
*
Vladimir Vapnik
*
Yann LeCun
Yann André Le Cun (usually spelled LeCun; born 8 July 1960) is a French-American computer scientist working primarily in the fields of machine learning, computer vision, mobile robotics and computational neuroscience. He is the Silver Pr ...
*
Yasuo Matsuyama
*
Yoshua Bengio
Yoshua Bengio (born March 5, 1964) is a Canadian-French computer scientist, and a pioneer of artificial neural networks and deep learning. He is a professor at the Université de Montréal and scientific director of the AI institute Montreal In ...
*
Zoubin Ghahramani
Zoubin Ghahramani FRS (born 8 February 1970) is a British-Iranian researcher and Professor of Information Engineering at the University of Cambridge. He holds joint appointments at University College London and the Alan Turing Institute and ...
See also
*
Outline of artificial intelligence
The following outline is provided as an overview of and topical guide to artificial intelligence:
Artificial intelligence (AI) is intelligence exhibited by machines or software. It is also the name of the scientific field which studies how to ...
**
Outline of computer vision
*
Outline of robotics
The following outline is provided as an overview of and topical guide to robotics:
Robotics is a branch of mechanical engineering, electrical engineering and computer science that deals with the design, construction, operation, and application o ...
*
Accuracy paradox
*
Action model learning
*
Activation function
The activation function of a node in an artificial neural network is a function that calculates the output of the node based on its individual inputs and their weights. Nontrivial problems can be solved using only a few nodes if the activation f ...
*
Activity recognition
Activity recognition aims to recognize the actions and goals of one or more agents from a series of observations on the agents' actions and the environmental conditions. Since the 1980s, this research field has captured the attention of several co ...
*
ADALINE
*
Adaptive neuro fuzzy inference system
An adaptive neuro-fuzzy inference system or adaptive network-based fuzzy inference system (ANFIS) is a kind of artificial neural network that is based on the Takagi–Sugeno fuzzy inference system. The techniqu ...
*
Adaptive resonance theory
*
Additive smoothing
In statistics, additive smoothing, also called Laplace smoothing or Lidstone smoothing, is a technique used to smooth count data, eliminating issues caused by certain values having 0 occurrences. Given a set of observation counts \mathbf{x} = \lang ...
*
Adjusted mutual information
*
AIVA
*
AIXI
AIXI is a theoretical mathematical formalism for artificial general intelligence.
It combines Solomonoff induction with sequential decision theory.
AIXI was first proposed by Marcus Hutter in 2000 and several results regarding AIXI are proved ...
*
AlchemyAPI
*
AlexNet
AlexNet is a convolutional neural network architecture developed for image classification tasks, notably achieving prominence through its performance in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). It classifies images into 1, ...
*
Algorithm selection
*
Algorithmic inference
Algorithmic inference gathers new developments in the statistical inference methods made feasible by the powerful computing devices widely available to any data analyst. Cornerstones in this field are computational learning theory, granular computin ...
*
Algorithmic learning theory
Algorithmic learning theory is a mathematical framework for analyzing
machine learning problems and algorithms. Synonyms include formal learning theory and algorithmic inductive inference. Algorithmic learning theory is different from statistica ...
*
AlphaGo
AlphaGo is a computer program that plays the board game Go. It was developed by the London-based DeepMind Technologies, an acquired subsidiary of Google. Subsequent versions of AlphaGo became increasingly powerful, including a version that c ...
*
AlphaGo Zero
*
Alternating decision tree
An alternating decision tree (ADTree) is a machine learning method for classification. It generalizes decision trees and has connections to boosting.
An ADTree consists of an alternation of decision nodes, which specify a predicate condition, and ...
*
Apprenticeship learning
In artificial intelligence, apprenticeship learning (or learning from demonstration or imitation learning) is the process of learning by observing an expert.
*
Causal Markov condition
*
Competitive learning
Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. A variant of Hebbian learning, competitive learning works by increasing the special ...
*
Concept learning
Concept learning, also known as category learning, concept attainment, and concept formation, is defined by Bruner, Goodnow, & Austin (1956) as "the search for and testing of attributes that can be used to distinguish exemplars fro ...
*
Decision tree learning
Decision tree learning is a supervised learning approach used in statistics, data mining and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of obser ...
*
Differentiable programming
Differentiable programming is a programming paradigm in which a numeric computer program can be differentiated throughout via automatic differentiation. This allows for gradient-based optimization of parameters in the program, often via gradient ...
*
Distribution learning theory
*
Eager learning
*
End-to-end reinforcement learning
*
Error tolerance (PAC learning)
*
Explanation-based learning
Explanation-based learning (EBL) is a form of machine learning that exploits a very strong, or even perfect, domain theory (i.e. a formal theory of an application domain akin to a domain model in ontology engineering, not to be confused with Scott ...
*
Feature
*
GloVe
*
Hyperparameter
*
Inferential theory of learning
*
Learning automata
*
Learning classifier system
Learning classifier systems, or LCS, are a paradigm of rule-based machine learning methods that combine a discovery component (e.g. typically a genetic algorithm in evolutionary computation) with a learning component (performing either supervised ...
*
Learning rule
*
Learning with errors
In cryptography, learning with errors (LWE) is a mathematical problem that is widely used to create secure encryption algorithms. It is based on the idea of representing secret information as a set of equations with errors. In other words, LWE is ...
*
M-Theory (learning framework)
*
Machine learning control
*
Machine learning in bioinformatics
*
Margin
*
Markov chain geostatistics
*
Markov chain Monte Carlo
In statistics, Markov chain Monte Carlo (MCMC) is a class of algorithms used to draw samples from a probability distribution. Given a probability distribution, one can construct a Markov chain whose elements' distribution approximates it – that ...
*
Markov information source
*
Markov logic network
A Markov logic network (MLN) is a probabilistic logic which applies the ideas of a Markov network to first-order logic, defining probability distributions on possible worlds on any given domain.
History
In 2002, Ben Taskar, Pieter Abbeel and ...
*
Markov model
In probability theory, a Markov model is a stochastic model used to model pseudo-randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, ...
*
Markov random field
In the domain of physics and probability, a Markov random field (MRF), Markov network or undirected graphical model is a set of random variables having a Markov property described by an undirected graph.
*
Markovian discrimination
Markovian discrimination is a class of spam filtering methods used in CRM114 and other spam filters to filter based on statistical patterns of transition probabilities between words or other lexical tokens in spam messages that would not be cap ...
*
Maximum-entropy Markov model
*
Multi-armed bandit
*
Multi-task learning
Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time, while exploiting commonalities and differences across tasks. This can result in improved learning efficiency and prediction ac ...
*
Multilinear subspace learning
Multilinear subspace learning is an approach for disentangling the causal factor of data formation and performing dimensionality reduction. M. A. O. Vasilescu, D. Terzopoulos (2003). "Multilinear Subspace Analysis of Image Ensembles". Proceedings of ...
*
Multimodal learning
Multimodal learning is a type of deep learning that integrates and processes multiple types of data, referred to as modalities, such as text, audio, images, or video. This integration allows for a more holistic understanding of complex data, imp ...
*
Multiple instance learning
In machine learning, multiple-instance learning (MIL) is a type of supervised learning. Instead of receiving a set of instances which are individually labeled, the learner receives a set of labeled ''bags'', each containing many ins ...
*
Multiple-instance learning
*
Never-Ending Language Learning
*
Offline learning
*
Parity learning
*
Population-based incremental learning
In computer science and machine learning, population-based incremental learning (PBIL) is an optimization algorithm, and an estimation of distribution algorithm. This is a type of genetic algorithm where the genotype of a ...
*
Predictive learning
*
Preference learning
Preference learning is a subfield of machine learning that focuses on modeling and predicting preferences based on observed preference information. Preference learning typically involves supervised learning using datasets of pairwise preference com ...
*
Proactive learning
*
Proximal gradient methods for learning
Proximal gradient (forward backward splitting) methods for learning is an area of research in optimization and statistical learning theory which studies algorithms for a general class of convex regularization ...
*
Semantic analysis
*
Similarity learning
*
Sparse dictionary learning
*
Stability (learning theory)
*
Statistical learning theory
Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis. Statistical learning theory deals with the statistical inference problem of finding a predictive function based on da ...
*
Statistical relational learning
Statistical relational learning (SRL) is a subdiscipline of artificial intelligence and machine learning that is concerned with domain models that exhibit both uncertainty (which can be dealt with using statistical methods) and complex, relational ...
*
Tanagra
*
Transfer learning
Transfer learning (TL) is a technique in machine learning (ML) in which knowledge learned from a task is re-used in order to boost performance on a related task. For example, for image classification, knowledge gained while learning to recogniz ...
*
Variable-order Markov model
*
Version space learning
Version space learning is a logical approach to machine learning, specifically binary classification. Version space learning algorithms search a predefined space of hypotheses, viewed as a set of logical sentences. Formally, the hypothesis space i ...
*
Waffles
*
Weka
*
Loss function
In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cost ...
**
Loss functions for classification
In machine learning and mathematical optimization, loss functions for classification are computationally feasible loss functions representing the price paid for inaccuracy of predictions in classification problems (problems of identifying which ...
**
Mean squared error
In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference betwee ...
**
Mean squared prediction error (MSPE)
**
Taguchi loss function
The Taguchi loss function is a graphical depiction of loss developed by the Japanese business statistician Genichi Taguchi to describe a phenomenon affecting the value of products produced by a company. Praised by Dr. W. Edwards Deming (the busin ...
*
Low-energy adaptive clustering hierarchy
Other
*
Anne O'Tate
*
Ant colony optimization algorithms
In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems that can be reduced to finding good paths through graphs. Artificial ants represent mul ...
*
Anthony Levandowski
*
Anti-unification (computer science)
*
Apache Flume
*
Apache Giraph
*
Apache Mahout
Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily on linear algebra. In the past, many of the implementations use th ...
*
Apache SINGA
*
Apache Spark
Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance. Originally developed at the University of Californ ...
*
Apache SystemML
Apache SystemDS (Previously, Apache SystemML) is an open source ML system for the end-to-end data science lifecycle.
SystemDS's distinguishing characteristics are:
# Algorithm customizability via R-like and Python-like languages.
# Multiple ex ...
*
Aphelion (software)
The ''Aphelion Imaging Software Suite'' is a software suite that includes three base products - Aphelion Lab, Aphelion Dev, and Aphelion for addressing image processing
*
Arabic Speech Corpus
*
Archetypal analysis
*
Arthur Zimek
*
Artificial ants
*
Artificial bee colony algorithm
In computer science and operations research, the artificial bee colony algorithm (ABC) is an optimization algorithm based on the intelligent foraging behaviour of honey bee swarm, proposed by Derviş Karaboğa (Erciyes University) in 2005.
Alg ...
*
Artificial development
*
Artificial immune system
Artificial immune systems (AIS) are a class of rule-based machine learning systems inspired by the principles and processes of the vertebrate immune system. The algorithms are typically modeled after the immune system's characteristics of learning ...
*
Astrostatistics
*
Averaged one-dependence estimators
*
Bag-of-words model
The bag-of-words (BoW) model is a model of text which uses an unordered collection (a "bag") of words. It is used in natural language processing and information retrieval (IR). It disregards word order (and thus most of syntax or gramm ...
*
Balanced clustering
*
Ball tree
*
Base rate
In probability and statistics, the base rate (also known as prior probabilities) is the class of probabilities unconditional on "featural evidence" (likelihoods).
It is the proportion of individuals in a population who have a certain characte ...
*
Bat algorithm
*
Baum–Welch algorithm
In electrical engineering, statistical computing and bioinformatics, the Baum–Welch algorithm is a special case of the expectation–maximization algorithm used to find the unknown parameters of a hidden Markov model (HMM). It makes use of the ...
*
Bayesian hierarchical modeling
Bayesian hierarchical modelling is a statistical model written in multiple levels (hierarchical form) that estimates the parameters of the posterior distribution using the Bayesian metho ...
*
Bayesian interpretation of kernel regularization
*
Bayesian optimization
Bayesian optimization is a sequential design strategy for global optimization of black-box functions, that does not assume any functional forms. It is usually employed to optimize expensive-to-evaluate functions. With the rise of artificial intell ...
*
Bayesian structural time series
*
Bees algorithm
In computer science and operations research, the bees algorithm is a population-based search algorithm which was developed by Pham, Ghanbarzadeh et al. in 2005. Pham DT, Ghanbarzadeh A, Koc E, Otri S, Rahim S and Zaidi M. The Bees Algorithm. Techn ...
*
Behavioral clustering
*
Bernoulli scheme
In mathematics, the Bernoulli scheme or Bernoulli shift is a generalization of the Bernoulli process to more than two possible outcomes. Bernoulli schemes appear naturally in symbolic dynamics, and are thus important in the study of dynamical syst ...
*
Bias–variance tradeoff
In statistics and machine learning, the bias–variance tradeoff describes the relationship between a model's complexity, the accuracy of its predictions, and how well it can make predictions on previously unseen data that were not used to train ...
*
Biclustering
*
BigML
*
Binary classification
Binary classification is the task of classifying the elements of a set into one of two groups (each called ''class''). Typical binary classification problems include:
* Medical testing to determine if a patient has a certain disease or not;
* Qual ...
*
Bing Predicts
*
Bio-inspired computing
Bio-inspired computing, short for biologically inspired computing, is a field of study which seeks to solve computer science problems using models of biology. It relates to connectionism, social behavior, and emergence. Within computer science, b ...
*
Biogeography-based optimization
Biogeography-based optimization (BBO) is an evolutionary algorithm (EA) that optimizes a function by stochastically and iteratively improving candidate solutions with regard to a given measure ...
*
Biplot
Biplots are a type of exploratory graph used in statistics, a generalization of the simple two-variable scatterplot.
A biplot overlays a ''score plot'' with a ''loading plot''.
A biplot allows information on both samples and variables of a d ...
*
Bondy's theorem
*
Bongard problem
*
Bradley–Terry model
The Bradley–Terry model is a probability model for the outcome of pairwise comparisons between items, teams, or objects. Given a pair of items drawn from some population, it estimates the probability that the pairwise comparison turns out ...
*
BrownBoost
*
Brown clustering
*
Burst error
In telecommunications, a burst error or error burst is a contiguous sequence of symbols, received over a communication channel, such that the first and last symbols are in error and there exists no contiguous subsequence of ''m'' correctly receiv ...
*
CBCL (MIT)
*
CIML community portal
*
CMA-ES
Covariance matrix adaptation evolution strategy (CMA-ES) is a particular kind of strategy for numerical optimization. Evolution strategies (ES) are stochastic, derivative-free methods for numerical ...
*
CURE data clustering algorithm
*
Cache language model
A cache language model is a type of statistical language model. These occur in the natural language processing subfield of computer science and assign probabilities to given sequences of words by means of a probability distribution. Statistical lang ...
*
Calibration (statistics)
*
Canonical correspondence analysis
In multivariate analysis, canonical correspondence analysis (CCA) is an ordination technique that determines axes from the response data as a unimodal combination of measured predictors. CCA is commonly used in ecology in order to extract gradients ...
*
Canopy clustering algorithm
*
Cascading classifiers
Cascading is a particular case of ensemble learning based on the concatenation of several classifiers, using all information collected from the output from a given classifier as additional information for the next classifier in the cascade. Unli ...
*
Category utility
*
CellCognition
*
Cellular evolutionary algorithm
*
Chi-square automatic interaction detection
*
Chromosome (genetic algorithm)
A chromosome or genotype in evolutionary algorithms (EA) is a set of parameters which define a proposed solution of the problem that the evolutionary algorithm is trying to solve. The set of all solutions, also called ''individuals'' according to ...
*
Classifier chains
*
Cleverbot
*
Clonal selection algorithm
*
Cluster-weighted modeling
In data mining, cluster-weighted modeling (CWM) is an algorithm-based approach to non-linear prediction of outputs (dependent variables) from inputs (independent variables) based on density estimation using a set of models (clusters) that are each ...
*
Clustering high-dimensional data
*
Clustering illusion
The clustering illusion is the tendency to erroneously consider the inevitable "streaks" or "clusters" arising in small samples from random distributions to be non-random. The illusion is caused by a human tendency to underpredict the amount of St ...
*
CoBoosting
*
Cobweb (clustering)
COBWEB is an incremental system for hierarchical conceptual clustering. COBWEB was invented by Professor Douglas H. Fisher, currently at Vanderbilt University.
COBWEB incrementally organizes observations into a ...
*
Cognitive computer
*
Cognitive robotics
*
Collostructional analysis
*
Common-method variance
*
Complete-linkage clustering
Complete-linkage clustering is one of several methods of agglomerative hierarchical clustering. At the beginning of the process, each element is in a cluster of its own. The clusters are then sequentially combined into larger clusters until all ...
*
Computer-automated design
*
Concept class
*
Concept drift
In predictive analytics, data science, machine learning and related fields, concept drift or drift is an evolution of data that invalidates the data model. It happens when the statistical properties of the target variable, which the model is trying ...
*
Conference on Artificial General Intelligence
*
Conference on Knowledge Discovery and Data Mining
*
Confirmatory factor analysis
In statistics, confirmatory factor analysis (CFA) is a special form of factor analysis, most commonly used in social science research. Kline, R. B. (2010). ''Principles and practice of structural equation modeling (3rd ed.).'' New York, New York: Gu ...
*
Confusion matrix
In the field of machine learning and specifically the problem of statistical classification, a confusion matrix, also known as error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a super ...
*
Congruence coefficient
In multivariate statistics, the congruence coefficient is an index of the similarity between factors that have been derived in a factor analysis. It was introduced in 1948 by Cyril Burt who referred to it as ''unadjusted correlation''. It is also ...
*
Connect (computer system)
Connect is a social network analysis and data mining computer system developed by HMRC (UK) that cross-references businesses' and people's tax records with other databases to establish fraudulent or undisclosed (misdire ...
*
Consensus clustering
Consensus clustering is a method of aggregating (potentially conflicting) results from multiple clustering algorithms. Also called cluster ensembles or aggregation of clustering (or partitions), it refers to the situation in which a number of diff ...
*
Constrained clustering
*
Constrained conditional model
*
Constructive cooperative coevolution
*
Correlation clustering
*
Correspondence analysis
Correspondence analysis (CA) is a multivariate statistical technique proposed by Herman Otto Hartley (Hirschfeld) and later developed by Jean-Paul Benzécri. It is conceptually similar to principal component analysis, but applies to categorical ...
*
Cortica
*
Coupled pattern learner
*
Cross-entropy method
*
Cross-validation (statistics)
Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to ...
*
Crossover (genetic algorithm)
Crossover in evolutionary algorithms and evolutionary computation, also called recombination, is a genetic operator used to combine the genetic information of two parents to generate new offspring. It is one way to ...
*
Cuckoo search
*
Cultural algorithm
*
Cultural consensus theory
Cultural consensus theory is an approach to information pooling (aggregation, data fusion) which supports a framework for the measurement and evaluation of beliefs as cultural; shared to some extent by a group of individuals. Cultural consensus m ...
*
Curse of dimensionality
The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. T ...
*
DADiSP
*
DARPA LAGR Program
*
Darkforest
*
Dartmouth workshop
The Dartmouth Summer Research Project on Artificial Intelligence was a 1956 summer workshop widely considered Kline, Ronald R., "Cybernetics, Automata Studies and the Dartmouth Conference on Artificial Intelligence", ''IEEE Annals of the History ...
*
DarwinTunes
*
Data Mining Extensions
Data Mining Extensions (DMX) is a query language for data mining models supported by Microsoft's SQL Server Analysis Services product.
Like SQL, it supports a data definition language (DDL), data manipulation language (DML) and a data query lan ...
*
Data exploration
*
Data pre-processing
*
Data stream clustering
*
Dataiku
*
Davies–Bouldin index
The Davies–Bouldin index (DBI), introduced by David L. Davies and Donald W. Bouldin in 1979, is a metric for evaluating clustering algorithms. This is an internal evaluation scheme, where the validation of how well the clustering has been d ...
*
Decision boundary
In a statistical-classification problem with two classes, a decision boundary or decision surface is a hypersurface that partitions the underlying vector space into two sets, one for each class. The classifier will classify all the poin ...
*
Decision list
Decision lists are a representation for Boolean functions which can be easily learned from examples. Single term decision lists are more expressive than disjunctions and conjunctions; however, 1-term decision lists are less expressive than the ...
*
Decision tree model
*
Deductive classifier
A deductive classifier is a type of artificial intelligence inference engine. It takes as input a set of declarations in a frame language about a domain such as medical research or molecular biology. For example, the names of Class hierarchy, classe ...
*
DeepArt
*
DeepDream
*
Deep Web Technologies
*
Defining length
*
Dendrogram
A dendrogram is a diagram representing a tree graph. This diagrammatic representation is frequently used in different contexts:
* in hierarchical clustering, it illustrates the arrangement of the clusters produced by ...
*
Dependability state model
*
Detailed balance
The principle of detailed balance can be used in kinetic systems which are decomposed into elementary processes (collisions, or steps, or elementary reactions). It states that at equilibrium, each elem ...
*
Determining the number of clusters in a data set
*
Detrended correspondence analysis
Detrended correspondence analysis (DCA) is a multivariate statistical technique widely used by ecologists to find the main factors or gradients in large, species-rich but usually sparse data matrices that typify Community (ecol ...
*
Developmental robotics
*
Diffbot
*
Differential evolution
Differential evolution (DE) is an evolutionary algorithm to optimize a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as they make few ...
*
Discrete phase-type distribution
*
Discriminative model
Discriminative models, also referred to as conditional models, are a class of models frequently used for classification. They are typically used to solve binary classification problems, i.e. assign labels, such as pass/fail, win/lose, alive/dead or ...
*
Dissociated press
*
Distributed R
*
Dlib
*
Document classification
Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be do ...
*
Documenting Hate
*
Domain adaptation
Domain adaptation is a field associated with machine learning and transfer learning. It addresses the challenge of training a model on one data distribution (the source domain) and applying it to a related but different data ...
*
Doubly stochastic model
*
Dual-phase evolution
Dual phase evolution (DPE) is a process that drives self-organization within complex adaptive systems.
It arises in response to phase changes within the network of connections formed by a system's components. DPE occurs in a wide range of physica ...
*
Dunn index
The Dunn index, introduced by Joseph C. Dunn in 1974, is a metric for evaluating clustering algorithms. This is part of a group of validity indices including the Davies–Bouldin index or Silhouette index, in that it is an internal evaluation sch ...
*
Dynamic Bayesian network
*
Dynamic Markov compression
*
Dynamic topic model
*
Dynamic unobserved effects model
*
EDLUT
*
ELKI
ELKI (''Environment for Developing KDD-Applications Supported by Index-Structures'') is a data mining (KDD, knowledge discovery in databases) software framework developed for use in research and teaching. It was originally created by the databa ...
*
Edge recombination operator
*
Effective fitness
*
Elastic map
*
Elastic matching
*
Elbow method (clustering)
*
Emergent (software)
Emergent (formerly PDP++) is a biologically-based neural simulation software that is primarily intended for creating models of the brain and cognitive processes. Development initially began in 1995 at Carnegie Mellon University, and continues ...
*
Encog
*
Entropy rate
In the mathematical theory of probability, the entropy rate or source information rate is a function assigning an entropy to a stochastic process.
For a strongly stationary process, the conditional entropy for latest random variable eventually ...
*
Erkki Oja
*
Eurisko
Eurisko (Gr., ''I discover'') is a discovery system written by Douglas Lenat in RLL-1, a representation language itself written in the Lisp programming language. A sequel to Automated Mathematician, it consists of heuristics, i.e. rules of thu ...
*
European Conference on Artificial Intelligence
*
Evaluation of binary classifiers
*
Evolution strategy
Evolution strategy (ES) from computer science is a subclass of evolutionary algorithms, which serves as an optimization technique. It uses the major genetic operators mutation, recomb ...
*
Evolution window
*
Evolutionary Algorithm for Landmark Detection
*
Evolutionary algorithm
Evolutionary algorithms (EA) reproduce essential elements of biological evolution in a computer algorithm in order to solve "difficult" problems, at least approximately, for which no exact or satisfactory solution methods are k ...
*
Evolutionary art
Evolutionary art is a branch of generative art, in which the artist does not do the work of constructing the artwork, but rather lets a system do the construction. In evolutionary art, initially generated art is put through an iterated process o ...
*
Evolutionary music
*
Evolutionary programming
Evolutionary programming is an evolutionary algorithm, where a share of the new population is created by mutation of the previous population without crossover. Evolutionary programming differs from the evolution strategy ES(μ+λ) in one detail. All in ...
*
Evolvability (computer science)
The term evolvability is used for a recent framework of computational learning introduced by Leslie Valiant
Leslie Gabriel Valiant (born 28 March 1949) is a British American computer scientist and computational theorist. He was born to a che ...
*
Evolved antenna
*
Evolver (software)
*
Evolving classification function
*
Expectation propagation
*
Exploratory factor analysis
In multivariate statistics, exploratory factor analysis (EFA) is a statistical method used to uncover the underlying structure of a relatively large set of variables. EFA is a technique within factor analysis whose overarching ...
*
F1 score
In statistical analysis of binary classification and information retrieval systems, the F-score or F-measure is a measure of predictive performance. It is calculated from the precision and recall of the test, where the precision is the number o ...
*
FLAME clustering
*
Factor analysis of mixed data
*
Factor graph
A factor graph is a bipartite graph representing the factorization of a function. In probability theory and its applications, factor graphs are used to represent factorization of a Probability distribution function (disam ...
*
Factor regression model
*
Factored language model
The factored language model (FLM) is an extension of a conventional language model introduced by Jeff Bilmes and Katrin Kirchhoff in 2003. In an FLM, each word is viewed as a vector of ''k'' factors: w_i = \{f_i^1, \ldots, f_i^k\}. An FLM provides the probabilistic model ...
*
Farthest-first traversal
*
Fast-and-frugal trees
Fast-and-frugal tree ''or'' matching heuristic (in the study of decision-making) is a simple graphical structure that categorizes objects by asking one question at a time. These decision trees are used in a range of fields: psychology, artificial ...
*
Feature Selection Toolbox
Feature Selection Toolbox (FST) is software primarily for feature selection in the machine learning domain, written in C++, developed at the Institute of Information Theory and Automation (UTIA), of the Czech Academy of Sciences.
Version 1
Th ...
*
Feature hashing
*
Feature scaling
*
Feature vector
In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a data set. Choosing informative, discriminating, and independent features is crucial to produce effective algorithms for pattern re ...
*
Firefly algorithm
*
First-difference estimator
*
First-order inductive learner
In machine learning, first-order inductive learner (FOIL) is a rule-based learning algorithm.
Background
Developed in 1990 by Ross Quinlan (J.R. Quinlan. Learning Logical Definitions from Relations. Machine Learning, Volume 5, Number 3, 1990), ...
*
Fish School Search
*
Fisher kernel
*
Fitness approximation
Fitness approximation (Y. Jin, "A comprehensive survey of fitness approximation in evolutionary computation", ''Soft Computing'', 9:3–12, 2005) aims to approximate the objective or fitness functions in evolutionary optimization by building up machine l ...
*
Fitness function
A fitness function is a particular type of objective or cost function that is used to summarize, as a single figure of merit, how close a given candidate solution is to achieving the set aims. It is an important component of evolutionary algorit ...
*
Fitness proportionate selection
Fitness proportionate selection, also known as roulette wheel selection or spinning wheel selection, is a selection technique used in evolutionary algorithms for selecting potentially useful solutions for recombination.
Method
In fitness prop ...
*
Fluentd
Fluentd is a cross-platform open-source data collection software project originally developed at Treasure Data. It is written primarily in the C programming language with a thin-Ruby ...
*
Folding@home
Folding@home (FAH or F@h) is a distributed computing project aimed to help scientists develop new therapeutics for a variety of diseases by the means of simulating protein dynamics. This includes the process of protein folding and the movements ...
*
Formal concept analysis
In information science, formal concept analysis (FCA) is a principled way of deriving a ''concept hierarchy'' or formal ontology from a collection of objects and their properties. Each concept in the hierarchy represents the objects sharing som ...
*
Forward algorithm
The forward algorithm, in the context of a hidden Markov model (HMM), is used to calculate a 'belief state': the probability of a state at a certain time, given the history of evidence. The process is also known as ''filtering''. The forward alg ...
*
Fowlkes–Mallows index
The Fowlkes–Mallows index is an external evaluation method that is used to determine the similarity between two clusterings (clusters obtained after a clustering algorithm), and also a metric to measure confusion matrices. This measure of simi ...
*
Frederick Jelinek
Frederick Jelinek (18 November 1932 – 14 September 2010) was a Czech-American researcher in information theory, automatic speech recognition, and natural language processing. He is well known for his oft-quoted statement, "Every time I fire ...
*
Frrole
*
Functional principal component analysis
*
GATTO
*
GLIMMER
In bioinformatics, GLIMMER (Gene Locator and Interpolated Markov ModelER) is used to find genes in prokaryotic DNA. "It is effective at finding genes in bacteria, archea, viruses, typically finding 98-99% of all relatively long ge ...
*
Gary Bryce Fogel
*
Gaussian adaptation
Gaussian adaptation (GA), also called normal or natural adaptation (NA) is an evolutionary algorithm designed for the maximization of manufacturing yield due to statistical deviation of component values of signal processing systems. In short, GA ...
* Gaussian process
* Gaussian process emulator
* Gene prediction
* General Architecture for Text Engineering
* Generalization error
* Generalized canonical correlation
* Generalized filtering
* Generalized iterative scaling
* Generalized multidimensional scaling
* Generative adversarial network
* Generative model
* Genetic algorithm
* Genetic algorithm scheduling
* Genetic algorithms in economics
* Genetic fuzzy systems
* Genetic memory (computer science)
* Genetic operator
* Genetic programming
* Genetic representation
* Geographical cluster
* Gesture Description Language
* Geworkbench
* Glossary of artificial intelligence
* Glottochronology
* Golem (ILP)
* Google matrix
* Grafting (decision trees)
* Gramian matrix
* Grammatical evolution
* Granular computing
* GraphLab
* Graph kernel
* Gremlin (programming language)
* Growth function
* HUMANT (HUManoid ANT) algorithm
* Hammersley–Clifford theorem
* Harmony search
* Hebbian theory
* Hidden Markov random field
* Hidden semi-Markov model
* Hierarchical hidden Markov model
* Higher-order factor analysis
* Highway network
* Hinge loss
* Holland's schema theorem
* Hopkins statistic
* Hoshen–Kopelman algorithm
* Huber loss
* IRCF360
* Ian Goodfellow
* Ilastik
* Ilya Sutskever
* Immunocomputing
* Imperialist competitive algorithm
* Inauthentic text
* Incremental decision tree
* Induction of regular languages
* Inductive bias
* Inductive probability
* Inductive programming
* Influence diagram
* Information Harvesting
* Information gain in decision trees
* Information gain ratio
* Inheritance (genetic algorithm)
* Instance selection
* Intel RealSense
* Interacting particle system
* Interactive machine translation
* International Joint Conference on Artificial Intelligence
* International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics
* International Semantic Web Conference
* Iris flower data set
* Island algorithm
* Isotropic position
* Item response theory
* Iterative Viterbi decoding
* JOONE
* Jabberwacky
* Jaccard index
* Jackknife variance estimates for random forest
* Java Grammatical Evolution
* Joseph Nechvatal
* Jubatus
* Julia (programming language)
* Junction tree algorithm
* ''k''-SVD
* ''k''-means++
* ''k''-medians clustering
* ''k''-medoids
* KNIME
* KXEN Inc.
* ''k q''-flats
* Kaggle
* Kalman filter
* Katz's back-off model
* Kernel adaptive filter
* Kernel density estimation
* Kernel eigenvoice
* Kernel embedding of distributions
* Kernel method
* Kernel perceptron
* Kernel random forest
* Kinect
* Klaus-Robert Müller
* Kneser–Ney smoothing
* Knowledge Vault
* Knowledge integration
* LIBSVM
* LPBoost
* Labeled data
* LanguageWare
* Language identification in the limit
* Language model
* Large margin nearest neighbor
* Latent Dirichlet allocation
* Latent class model
* Latent semantic analysis
* Latent variable
* Latent variable model
* Lattice Miner
* Layered hidden Markov model
* Learnable function class
* Least squares support vector machine
* Leslie P. Kaelbling
* Linear genetic programming
* Linear predictor function
* Linear separability
* Lingyun Gu
* Linkurious
* Lior Ron (business executive)
* List of genetic algorithm applications
* List of metaphor-based metaheuristics
* List of text mining software
* Local case-control sampling
* Local independence
* Local tangent space alignment
* Locality-sensitive hashing
* Log-linear model
* Logistic model tree
* Low-rank approximation
* Low-rank matrix approximations
* MATLAB
* MIMIC (immunology)
* MXNet
* Mallet (software project)
* Manifold regularization
* Margin-infused relaxed algorithm
* Margin classifier
* Mark V. Shaney
* Massive Online Analysis
* Matrix regularization
* Matthews correlation coefficient
* Mean shift
*
Mean squared error
In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference betwee ...
* Mean squared prediction error
* Measurement invariance
* Medoid
* MeeMix
* Melomics
* Memetic algorithm
* Meta-optimization
* Mexican International Conference on Artificial Intelligence
* Michael Kearns (computer scientist)
* MinHash
* Mixture model
* Mlpy
* Models of DNA evolution
* Moral graph
* Mountain car problem
* Movidius
* Multi-armed bandit
*
Multi-label classification
In machine learning, multi-label classification or multi-output classification is a variant of the classification problem where multiple nonexclusive labels may be assigned to each instance. Multi-label classification ...
* Multi expression programming
* Multiclass classification
* Multidimensional analysis
* Multifactor dimensionality reduction
* Multilinear principal component analysis
* Multiple correspondence analysis
* Multiple discriminant analysis
* Multiple factor analysis
* Multiple sequence alignment
* Multiplicative weight update method
* Multispectral pattern recognition
* Mutation (genetic algorithm)
* MysteryVibe
* N-gram
* NOMINATE (scaling method)
* Native-language identification
* Natural Language Toolkit
* Natural evolution strategy
* Nearest-neighbor chain algorithm
* Nearest centroid classifier
* Nearest neighbor search
* Neighbor joining
* Nest Labs
* NetMiner
* NetOwl
* Neural Designer
* Neural Engineering Object
* Neural modeling fields
* Neural network software
* NeuroSolutions
* Neuroevolution
* Neuroph
* Niki.ai
* Noisy channel model
* Noisy text analytics
* Nonlinear dimensionality reduction
* Novelty detection
* Nuisance variable
* One-class classification
* ONNX
* OpenNLP
* Optimal discriminant analysis
* Oracle Data Mining
* Orange (software)
* Ordination (statistics)
* Overfitting
* PROGOL
* PSIPRED
* Pachinko allocation
* PageRank
* Parallel metaheuristic
* Parity benchmark
* Part-of-speech tagging
* Particle swarm optimization
* Path dependence
* Pattern language (formal languages)
* Peltarion Synapse
* Perplexity
* Persian Speech Corpus
* Pietro Perona
* Pipeline Pilot
* Piranha (software)
* Pitman–Yor process
* Plate notation
* Polynomial kernel
* Pop music automation
* Population process
* Portable Format for Analytics
* Predictive Model Markup Language
* Predictive state representation
* Preference regression
* Premature convergence
* Principal geodesic analysis
* Prior knowledge for pattern recognition
* Prisma (app)
* Probabilistic Action Cores
* Probabilistic context-free grammar
* Probabilistic latent semantic analysis
* Probabilistic soft logic
* Probability matching
* Probit model
* Product of experts
* Programming with Big Data in R
* Proper generalized decomposition
* Pruning (decision trees)
* Pushpak Bhattacharyya
* Q methodology
* Qloo
* Quality control and genetic algorithms
* Quantum Artificial Intelligence Lab
* Queueing theory
* Quick, Draw!
* R (programming language)
* Rada Mihalcea
* Rademacher complexity
* Radial basis function kernel
* Rand index
* Random indexing
* Random projection
* Random subspace method
* Ranking SVM
* RapidMiner
* Rattle GUI
* Raymond Cattell
* Reasoning system
* Regularization perspectives on support vector machines
* Relational data mining
* Relationship square
* Relevance vector machine
* Relief (feature selection)
* Renjin
* Repertory grid
* Representer theorem
* Reward-based selection
* Richard Zemel
* Right to explanation
* RoboEarth
* Robust principal component analysis
* RuleML Symposium
* Rule induction
* Rules extraction system family
* SAS (software)
* SNNS
* SPSS Modeler
* SUBCLU
* Sample complexity
* Sample exclusion dimension
* Santa Fe Trail problem
* Savi Technology
* Schema (genetic algorithms)
* Search-based software engineering
* Selection (genetic algorithm)
* Self-Service Semantic Suite
* Semantic folding
* Semantic mapping (statistics)
* Semidefinite embedding
* Sense Networks
* Sensorium Project
* Sequence labeling
* Sequential minimal optimization
* Shattered set
* Shogun (toolbox)
* Silhouette (clustering)
* SimHash
* SimRank
* Similarity measure
* Simple matching coefficient
* Simultaneous localization and mapping
* Sinkov statistic
* Sliced inverse regression
* Snakes and Ladders
* Soft independent modelling of class analogies
* Soft output Viterbi algorithm
* Solomonoff's theory of inductive inference
* SolveIT Software
* Spectral clustering
* Spike-and-slab variable selection
* Statistical machine translation
* Statistical parsing
* Statistical semantics
* Stefano Soatto
* Stephen Wolfram
* Stochastic block model
* Stochastic cellular automaton
* Stochastic diffusion search
* Stochastic grammar
* Stochastic matrix
* Stochastic universal sampling
* Stress majorization
* String kernel
* Structural equation modeling
* Structural risk minimization
* Structured sparsity regularization
* Structured support vector machine
* Subclass reachability
* Sufficient dimension reduction
* Sukhotin's algorithm
* Sum of absolute differences
* Sum of absolute transformed differences
* Swarm intelligence
* Switching Kalman filter
* Symbolic regression
* Synchronous context-free grammar
* Syntactic pattern recognition
* TD-Gammon
* TIMIT
* Teaching dimension
* Teuvo Kohonen
* Textual case-based reasoning
* Theory of conjoint measurement
* Thomas G. Dietterich
* Thurstonian model
* Topic model
* Tournament selection
* Training, test, and validation sets
* Transiogram
* Trax Image Recognition
* Trigram tagger
* Truncation selection
* Tucker decomposition
* UIMA
* UPGMA
* Ugly duckling theorem
* Uncertain data
* Uniform convergence in probability
* Unique negative dimension
* Universal portfolio algorithm
* User behavior analytics
* VC dimension
* VIGRA
* Validation set
* Vapnik–Chervonenkis theory
* Variable-order Bayesian network
* Variable kernel density estimation
* Variable rules analysis
* Variational message passing
* Varimax rotation
* Vector quantization
* Vicarious (company)
* Viterbi algorithm
* Vowpal Wabbit
* WACA clustering algorithm
* WPGMA
* Ward's method
* Weasel program
* Whitening transformation
* Winnow (algorithm)
* Win–stay, lose–switch
* Witness set
* Wolfram Language
* Wolfram Mathematica
* Writer invariant
* XGBoost
* Yooreeka
* Zeroth (software)
Further reading
* Trevor Hastie, Robert Tibshirani and Jerome H. Friedman (2001). ''The Elements of Statistical Learning'', Springer.
* Pedro Domingos (September 2015), ''The Master Algorithm'', Basic Books.
* Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar (2012). ''Foundations of Machine Learning'', The MIT Press.
* Ian H. Witten and Eibe Frank (2011). ''Data Mining: Practical machine learning tools and techniques'', Morgan Kaufmann, 664 pp.
* David J. C. MacKay. ''Information Theory, Inference, and Learning Algorithms''. Cambridge: Cambridge University Press, 2003.
* Richard O. Duda, Peter E. Hart, David G. Stork (2001). ''Pattern classification'' (2nd edition), Wiley, New York.
* Christopher Bishop (1995). ''Neural Networks for Pattern Recognition'', Oxford University Press.
* Vladimir Vapnik (1998). ''Statistical Learning Theory''. Wiley-Interscience.
* Ray Solomonoff, ''An Inductive Inference Machine'', IRE Convention Record, Section on Information Theory, Part 2, pp. 56–62, 1957.
* Ray Solomonoff, ''An Inductive Inference Machine''. A privately circulated report from the 1956 Dartmouth Summer Research Conference on AI.
External links
* Data Science: Data to Insights from MIT (machine learning)
* Popular online course by Andrew Ng, at Coursera. It uses GNU Octave. The course is a free version of Stanford University's actual course taught by Ng (see.stanford.edu/Course/CS229), available for free.
* mloss is an academic database of open-source machine learning software.