MALLET is a Java "Machine Learning for Language Toolkit".
Description
MALLET is an integrated collection of Java code useful for statistical natural language processing, document classification, cluster analysis, information extraction, topic modeling and other machine learning applications to text.
History
MALLET was developed primarily by Andrew McCallum, of the University of Massachusetts Amherst, with assistance from graduate students and faculty from both UMass and the University of Pennsylvania.
External links
* Official website of the project at the University of Massachusetts Amherst
* The Topic Modeling Tool is an independently developed GUI that outputs MALLET results in CSV and HTML files