HOME

TheInfoList



OR:

MALLET is a
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
"Machine Learning for Language Toolkit".


Description

MALLET is an integrated collection of Java code useful for statistical natural language processing,
document classification Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") ...
, cluster analysis,
information extraction Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. In most of the cases this activity concer ...
,
topic model In statistics and natural language processing, a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for discovery of hidden ...
ing and other
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
applications to text.


History

MALLET was developed primarily by
Andrew McCallum Andrew McCallum is a professor in the computer science department at University of Massachusetts Amherst. His primary specialties are in machine learning, natural language processing, information extraction, information integration, and social ...
, of the
University of Massachusetts Amherst The University of Massachusetts Amherst (UMass Amherst, UMass) is a public research university in Amherst, Massachusetts and the sole public land-grant university in Commonwealth of Massachusetts. Founded in 1863 as an agricultural college, ...
, with assistance from graduate students and faculty from both UMASS and the
University of Pennsylvania The University of Pennsylvania (also known as Penn or UPenn) is a private research university in Philadelphia. It is the fourth-oldest institution of higher education in the United States and is ranked among the highest-regarded universitie ...
.


See also


External links


Official website of the project
at the University of Massachusetts Amherst * Th
Topic Modeling Tool
is an independently developed GUI that outputs MALLET results in CSV and HTML files Free artificial intelligence applications Natural language processing toolkits Free software programmed in Java (programming language) Java (programming language) libraries Data mining and machine learning software {{prog-lang-stub