HOME

TheInfoList



OR:

mlpack is a
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
software library for C++, built on top of the Armadillo library and th
ensmallen
numerical optimization library. mlpack has an emphasis on scalability, speed, and ease-of-use. Its aim is to make machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and maximum flexibility for expert users. Its intended target users are scientists and engineers. It is
open-source software Open-source software (OSS) is computer software that is released under a license in which the copyright holder grants users the rights to use, study, change, and distribute the software and its source code to anyone and for any purpose. ...
distributed under the
BSD license BSD licenses are a family of permissive free software licenses, imposing minimal restrictions on the use and distribution of covered software. This is in contrast to copyleft licenses, which have share-alike requirements. The original BSD lice ...
, making it useful for developing both open source and proprietary software. Releases 1.0.11 and before were released under the
LGPL The GNU Lesser General Public License (LGPL) is a free-software license published by the Free Software Foundation (FSF). The license allows developers and companies to use and integrate a software component released under the LGPL into their own ...
license. The project is supported by the
Georgia Institute of Technology The Georgia Institute of Technology, commonly referred to as Georgia Tech or, in the state of Georgia, as Tech or The Institute, is a public research university and institute of technology in Atlanta, Georgia. Established in 1885, it is part ...
and contributions from around the world.


Miscellaneous features

Class templates for GRU, LSTM structures are available, thus the library also supports Recurrent Neural Networks. There are bindings to R, Go, Julia, and Python. Its binding system is extensible to other languages.


Supported algorithms

Currently ''mlpack'' supports the following
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
s and models: *
Collaborative Filtering Collaborative filtering (CF) is a technique used by recommender systems.Francesco Ricci and Lior Rokach and Bracha ShapiraIntroduction to Recommender Systems Handbook Recommender Systems Handbook, Springer, 2011, pp. 1-35 Collaborative filtering ...
* Decision stumps (one-level decision trees) * Density Estimation Trees * Euclidean minimum spanning trees * Gaussian Mixture Models (GMMs) *
Hidden Markov Models A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ...
(HMMs) *
Kernel density estimation In statistics, kernel density estimation (KDE) is the application of kernel smoothing for probability density estimation, i.e., a non-parametric method to estimate the probability density function of a random variable based on '' kernels'' as ...
(KDE) * Kernel Principal Component Analysis (KPCA) *
K-Means Clustering ''k''-means clustering is a method of vector quantization, originally from signal processing, that aims to partition ''n'' observations into ''k'' clusters in which each observation belongs to the cluster with the nearest mean (cluster centers ...
*
Least-Angle Regression In statistics, least-angle regression (LARS) is an algorithm for fitting linear regression models to high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani. Suppose we expect a response variab ...
(LARS/LASSO) *
Linear Regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is cal ...
* Bayesian Linear Regression * Local Coordinate Coding * Locality-Sensitive Hashing (LSH) *
Logistic regression In statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables. In regression an ...
* Max-Kernel Search *
Naive Bayes Classifier In statistics, naive Bayes classifiers are a family of simple " probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Bay ...
* Nearest neighbor search with dual-tree algorithms * Neighbourhood Components Analysis (NCA) * Non-negative Matrix Factorization (NMF) *
Principal Components Analysis Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and ...
(PCA) *
Independent component analysis In signal processing, independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents. This is done by assuming that at most one subcomponent is Gaussian and that the subcomponents ar ...
(ICA) * Rank-Approximate Nearest Neighbor (RANN) * Simple Least-Squares
Linear Regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is cal ...
(and
Ridge Regression Ridge regression is a method of estimating the coefficients of multiple-regression models in scenarios where the independent variables are highly correlated. It has been used in many fields including econometrics, chemistry, and engineering. Also ...
) * Sparse Coding, Sparse dictionary learning * Tree-based Neighbor Search (all-k-nearest-neighbors, all-k-furthest-neighbors), using either kd-trees or cover trees * Tree-based Range Search


See also

*
Armadillo (C++ library) Armadillo is a linear algebra software library for the C++ programming language. It aims to provide efficient and streamlined base calculations, while at the same time having a straightforward and easy-to-use interface. Its intended target user ...
*
List of numerical analysis software Listed here are notable end-user computer applications intended for use with numerical or data analysis: Numerical-software packages General-purpose computer algebra systems Interface-oriented Language-oriented Historically signific ...
*
List of numerical libraries This is a list of numerical libraries, which are libraries used in software development for performing numerical calculations. It is not a complete listing but is instead a list of numerical libraries with articles on Wikipedia, with few except ...
*
Numerical linear algebra Numerical linear algebra, sometimes called applied linear algebra, is the study of how matrix operations can be used to create computer algorithms which efficiently and accurately provide approximate answers to questions in continuous mathematic ...
*
Scientific computing Computational science, also known as scientific computing or scientific computation (SC), is a field in mathematics that uses advanced computing capabilities to understand and solve complex problems. It is an area of science that spans many disc ...


References


External links

* * {{GitHub, mlpack/mlpack C++ libraries Data mining and machine learning software Free computer libraries Free mathematics software Free science software Free software programmed in C++ Free statistical software