Corpus manager

A corpus manager (also called a corpus browser or corpus query system) is a tool for multilingual corpus analysis that allows effective searching in corpora. A corpus manager is usually a complex tool that lets the user search for language forms or sequences. It may provide information about the context or allow the user to search by positional attributes such as lemma or tag. The results of such searches, shown with their context, are called concordances. Other features include the ability to search for collocations and to obtain frequency statistics and metadata about the processed text. In a narrower sense, the term corpus manager refers only to the server side, or corpus query engine, whereas the client side is simply called the user interface. A corpus manager can be software installed on a personal computer, or it may be provided as a web service.
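As a rough illustration of searching by positional attributes, the following minimal Python sketch builds a concordance and a frequency list over a tiny hand-annotated corpus. It is not how any particular corpus manager is implemented; the `Token` type, the `concordance` and `frequency` functions, the toy sentence and its tags are all assumptions made for this example.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    word: str   # surface form as it appears in the text
    lemma: str  # dictionary (citation) form
    tag: str    # part-of-speech tag (hypothetical tagset)


# Toy corpus; the annotations are hand-written for illustration only.
CORPUS = [
    Token("The", "the", "DET"),
    Token("cats", "cat", "NOUN"),
    Token("sat", "sit", "VERB"),
    Token("while", "while", "CONJ"),
    Token("the", "the", "DET"),
    Token("cat", "cat", "NOUN"),
    Token("sits", "sit", "VERB"),
]


def concordance(corpus, attribute, value, window=2):
    """Return (left context, hit, right context) for every matching token."""
    lines = []
    for i, token in enumerate(corpus):
        if getattr(token, attribute) == value:
            left = " ".join(t.word for t in corpus[max(0, i - window):i])
            right = " ".join(t.word for t in corpus[i + 1:i + 1 + window])
            lines.append((left, token.word, right))
    return lines


def frequency(corpus, attribute):
    """Count how often each value of a positional attribute occurs."""
    return Counter(getattr(t, attribute) for t in corpus)


if __name__ == "__main__":
    # Searching by lemma finds both "sat" and "sits".
    for left, hit, right in concordance(CORPUS, "lemma", "sit"):
        print(f"{left:>12} [{hit}] {right}")
    print(frequency(CORPUS, "tag").most_common())
```

Searching on the `lemma` attribute rather than the word form is what lets a single query match all inflected forms. Real corpus managers typically rely on indexes rather than a linear scan, so this sketch only shows the idea, not the performance characteristics.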


List of corpus managers

* BNCweb – a web-based interface for the British National Corpus
* CQPweb – a web-based interface for the study of a large variety of corpora, including the Spoken BNC2014
* BYU-BNC – a website that allows searches of the British National Corpus and others created at Brigham Young University
* Coma – a tool extension of the EXMARaLDA system for working with oral corpora on a computer
* NoSketch Engine – a free open-source corpus management system combining Manatee (back end) and Bonito (web interface)
* KonText – an extended and modified web interface to NoSketch Engine (a Bonito replacement)
* Sketch Engine – text corpus management and analysis software with more than 500 corpora in 90+ languages
* WordSmith Tools – a software package primarily for linguists

