Shogun (toolbox)
   HOME

TheInfoList



OR:

Shogun is a free, open-source
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
software library written in
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
. It offers numerous algorithms and data structures for
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
problems. It offers interfaces for Octave,
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
, R,
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
,
Lua Lua or LUA may refer to: Science and technology * Lua (programming language) * Latvia University of Agriculture * Last universal ancestor, in evolution Ethnicity and language * Lua people, of Laos * Lawa people, of Thailand sometimes referred t ...
,
Ruby A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called ...
and C# using SWIG. It is licensed under the terms of the
GNU General Public License The GNU General Public License (GNU GPL or simply GPL) is a series of widely used free software licenses that guarantee end users the four freedoms to run, study, share, and modify the software. The license was the first copyleft for general ...
version 3 or later.


Description

The focus of ''Shogun'' is on kernel machines such as support vector machines for regression and classification problems. ''Shogun'' also offers a full implementation of
Hidden Markov model A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an o ...
s. The core of ''Shogun'' is written in C++ and offers interfaces for
MATLAB MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementa ...
, Octave,
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
, R,
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
,
Lua Lua or LUA may refer to: Science and technology * Lua (programming language) * Latvia University of Agriculture * Last universal ancestor, in evolution Ethnicity and language * Lua people, of Laos * Lawa people, of Thailand sometimes referred t ...
,
Ruby A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called ...
and C#. ''Shogun'' has been under active development since 1999. Today there is a vibrant user community all over the world using ''Shogun'' as a base for research and education, and contributing to the core package.


Supported algorithms

Currently ''Shogun'' supports the following algorithms: * Support vector machines * Dimensionality reduction algorithms, such as PCA, Kernel PCA, Locally Linear Embedding, Hessian Locally Linear Embedding, Local Tangent Space Alignment, Linear Local Tangent Space Alignment, Kernel Locally Linear Embedding, Kernel Local Tangent Space Alignment, Multidimensional Scaling, Isomap, Diffusion Maps, Laplacian Eigenmaps * Online learning algorithms such as SGD-QN,
Vowpal Wabbit Vowpal Wabbit (VW) is an open-source fast online interactive machine learning system library and program developed originally at Yahoo! Research, and currently at Microsoft Research. It was started and is led by John Langford. Vowpal Wabbit's i ...
* Clustering algorithms: k-means and GMM * Kernel Ridge Regression, Support Vector Regression *
Hidden Markov Models A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ob ...
* K-Nearest Neighbors * Linear discriminant analysis * Kernel Perceptrons. Many different kernels are implemented, ranging from kernels for numerical data (such as gaussian or linear kernels) to kernels on special data (such as strings over certain alphabets). The currently implemented kernels for numeric data include: * linear * gaussian * polynomial * sigmoid kernels The supported kernels for special data include: * Spectrum * Weighted Degree * Weighted Degree with Shifts The latter group of kernels allows processing of arbitrary sequences over fixed alphabets such as DNA sequences as well as whole e-mail texts.


Special features

As ''Shogun'' was developed with bioinformatics applications in mind it is capable of processing huge datasets consisting of up to 10 million samples. ''Shogun'' supports the use of pre-calculated kernels. It is also possible to use a combined kernel i.e. a kernel consisting of a linear combination of arbitrary kernels over different domains. The coefficients or weights of the linear combination can be learned as well. For this purpose ''Shogun'' offers a ''multiple kernel learning'' functionality.


References

* S. Sonnenburg, G. Rätsch, S. Henschel, C. Widmer, J. Behr, A. Zien, F. De Bona, A. Binder, C. Gehl and V. Franc: ''The SHOGUN Machine Learning Toolbox'',
Journal of Machine Learning Research The ''Journal of Machine Learning Research'' is a peer-reviewed open access scientific journal covering machine learning. It was established in 2000 and the first editor-in-chief was Leslie Kaelbling. The current editors-in-chief are Francis Ba ...
, 11:1799−1802, June 11, 2010. * M. Gashler. Waffles: A Machine Learning Toolkit. Journal of Machine Learning Research, 12 (July):2383–2387, 2011. * P. Vincent, Y. Bengio, N. Chapados, and O. Delalleau. Plearn high-performance machine learning library. URL http://plearn.berlios.de/.


External links


Shogun toolbox homepage
* * {{Freshmeat, shogun, SHOGUN C++ libraries Free software programmed in C++ Data mining and machine learning software Free statistical software Free computer libraries Free mathematics software Free science software