Waffles is a collection of command-line tools for performing
machine learning
Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence.
Machine ...
operations developed at
Brigham Young University
Brigham Young University (BYU, sometimes referred to colloquially as The Y) is a private research university in Provo, Utah. It was founded in 1875 by religious leader Brigham Young and is sponsored by the Church of Jesus Christ of Latter-day ...
. These tools are written in
C++, and are available under the
GNU Lesser General Public License
The GNU Lesser General Public License (LGPL) is a free-software license published by the Free Software Foundation (FSF). The license allows developers and companies to use and integrate a software component released under the LGPL into their own ...
.
Description
The Waffles machine learning toolkit
[
] contains command-line tools for performing various operations related to
machine learning
Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence.
Machine ...
,
data mining, and
predictive modeling. The primary focus of Waffles is to provide tools that are simple to use in scripted experiments or processes. For example, the supervised learning algorithms included in Waffles are all designed to support multi-dimensional labels,
classification Classification is a process related to categorization, the process in which ideas and objects are recognized, differentiated and understood.
Classification is the grouping of related facts into classes.
It may also refer to:
Business, organizat ...
and
regression
Regression or regressions may refer to:
Science
* Marine regression, coastal advance due to falling sea level, the opposite of marine transgression
* Regression (medicine), a characteristic of diseases to express lighter symptoms or less extent ( ...
, automatically impute missing values, and automatically apply necessary filters to transform the data to a type that the algorithm can support, such that arbitrary learning algorithms can be used with arbitrary data sets. Many other machine learning toolkits provide similar functionality, but require the user to explicitly configure data filters and transformations to make it compatible with a particular learning algorithm. The algorithms provided in Waffles also have the ability to automatically tune their own parameters (with the cost of additional computational overhead).
Because Waffles is designed for script-ability, it deliberately avoids presenting its tools in a graphical environment. It does, however, include a graphical "wizard" tool that guides the user to generate a command that will perform a desired task. This wizard does not actually perform the operation, but requires the user to paste the command that it generates into a command terminal or a script. The idea motivating this design is to prevent the user from becoming "locked in" to a graphical interface.
All of the Waffles tools are implemented as thin wrappers around functionality in a C++ class library. This makes it possible to convert scripted processes into native applications with minimal effort.
Waffles was first released as an open source project in 2005. Since that time, it has been developed at
Brigham Young University
Brigham Young University (BYU, sometimes referred to colloquially as The Y) is a private research university in Provo, Utah. It was founded in 1875 by religious leader Brigham Young and is sponsored by the Church of Jesus Christ of Latter-day ...
, with a new version having been released approximately every 6–9 months. Waffles is not an acronym—the toolkit was named after the food for historical reasons.
Advantages
Some of the advantages of Waffles in contrast with other popular open source machine learning toolkits include:
* Waffles automatically takes care of many issues related to data format in order to simplify its tools.
* Because it is implemented in C++, many of its algorithms are particularly fast. Also, the lack of dependency on any virtual machine makes it easier to deploy in conjunction with other applications.
* The functionality included in Waffles is very broad, including algorithms for
dimensionality reduction
Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally ...
,
collaborative filtering, visualization, clustering, supervised learning, optimization, linear algebra, data transformation, image and signal processing, policy learning, and sparse matrix operations.
Disadvantages
* Although Waffles provides significant breadth, it lacks the depth of many toolkits that focus on a particular area of machine learning. The
Weka (machine learning) toolkit, for example, provides many more classification algorithms than Waffles provides.
* Waffles only has a limited graphical interface.
See also
*
Weka (machine learning)
*
RapidMiner
RapidMiner is a data science platform designed for enterprises that analyses the collective impact of organizations’ employees, expertise and data. Rapid Miner's data science platform is intended to support many analytics users across a broad AI ...
(formerly ''YALE (Yet Another Learning Environment)''), a commercial machine learning framework implemented in Java
*
List of numerical analysis software
References
{{Reflist
Data mining and machine learning software
Free science software
Software optimization
Free data analysis software