
A machine-readable dictionary (MRD) is a dictionary stored as machine (computer) data instead of being printed on paper. It is both an electronic dictionary and a lexical database. An MRD is in an electronic form that can be loaded into a database and queried via application software. It may be a monolingual explanatory dictionary, a multilingual dictionary that supports translation between two or more languages, or a combination of both. Translation software that works between multiple languages usually applies bidirectional dictionaries. An MRD may have a proprietary structure that is queried by dedicated software (for example, online via the Internet), or it may have an open structure that is available for loading into computer databases and can thus be used by various software applications. Conventional dictionaries contain a lemma with various descriptions. A machine-readable dictionary may have additional capabilities and is therefore sometimes called a smart dictionary. An example of a smart dictionary is the open-source Gellish English dictionary.
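
As a rough illustration of loading a dictionary into a database and querying it from application software, the following minimal sketch uses Python's built-in sqlite3 module. The table layout, the sample entries, and the function names (build_mrd, define, translate) are assumptions made for this example and do not reflect the structure of any particular MRD.

```python
import sqlite3

# Hypothetical sample entries: (lemma, part of speech, definition).
ENTRIES = [
    ("dictionary", "noun", "a listing of the lexemes of a language"),
    ("lexicon", "noun", "the vocabulary of a language or field"),
    ("lemma", "noun", "the citation form of a word"),
]

# Hypothetical English-Dutch pairs for the bidirectional lookup.
TRANSLATIONS = [
    ("dictionary", "woordenboek"),
    ("lexicon", "lexicon"),
]


def build_mrd():
    """Load the sample data into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entry (lemma TEXT, pos TEXT, definition TEXT)")
    conn.execute("CREATE TABLE translation (en TEXT, nl TEXT)")
    conn.executemany("INSERT INTO entry VALUES (?, ?, ?)", ENTRIES)
    conn.executemany("INSERT INTO translation VALUES (?, ?)", TRANSLATIONS)
    return conn


def define(conn, lemma):
    """Monolingual lookup: return all definitions stored for a lemma."""
    rows = conn.execute(
        "SELECT definition FROM entry WHERE lemma = ?", (lemma,)
    ).fetchall()
    return [row[0] for row in rows]


def translate(conn, word, source):
    """Bidirectional lookup: the same table is read in either direction."""
    if source == "en":
        query = "SELECT nl FROM translation WHERE en = ?"
    elif source == "nl":
        query = "SELECT en FROM translation WHERE nl = ?"
    else:
        raise ValueError("source must be 'en' or 'nl'")
    return [row[0] for row in conn.execute(query, (word,))]


if __name__ == "__main__":
    mrd = build_mrd()
    print(define(mrd, "lemma"))
    print(translate(mrd, "dictionary", "en"))
    print(translate(mrd, "woordenboek", "nl"))
```

Because the translation pairs are stored once and read in both directions, the same data serves either language as the source, which is the sense in which the dictionary is bidirectional in this sketch.
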
The term dictionary is also used to refer to an electronic vocabulary or lexicon, as used for example in spelling checkers. If a dictionary is arranged in a subtype-supertype hierarchy of concepts (or terms), it is called a taxonomy. If it also contains other relations between the concepts, it is called an ontology. Search engines may use a vocabulary, a taxonomy, or an ontology to optimise search results. Specialised electronic dictionaries include morphological dictionaries and syntactic dictionaries. The term MRD is often contrasted with NLP dictionary: an MRD is the electronic form of a dictionary that was previously printed on paper, whereas the term NLP dictionary is preferred when the dictionary was built from scratch with NLP in mind. Both kinds are used by programs. The ISO standard Lexical Markup Framework (LMF; ISO 24613:2008) can represent both structures (Gil Francopoulo, ed., LMF Lexical Markup Framework, ISTE / Wiley, 2013).
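
The distinction between a taxonomy and an ontology can be sketched with two small data structures. In this minimal Python example, a mapping from each concept to its direct supertype forms the taxonomy, and a list of additional relation triples extends it toward an ontology. The concepts, the has_part relation, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical taxonomy: each concept maps to its direct supertype
# (None marks the root of the hierarchy).
SUPERTYPE = {
    "thing": None,
    "vehicle": "thing",
    "car": "vehicle",
    "bicycle": "vehicle",
    "wheel": "thing",
}

# Hypothetical non-hierarchical relations; adding these to the taxonomy
# gives a very small ontology.
RELATIONS = [
    ("car", "has_part", "wheel"),
    ("bicycle", "has_part", "wheel"),
]


def supertypes(concept):
    """Return the chain of supertypes from the direct parent up to the root."""
    chain = []
    parent = SUPERTYPE[concept]
    while parent is not None:
        chain.append(parent)
        parent = SUPERTYPE[parent]
    return chain


def is_a(concept, ancestor):
    """True if concept equals ancestor or is a (transitive) subtype of it."""
    return concept == ancestor or ancestor in supertypes(concept)


def related(subject, relation):
    """Return every object linked to subject by the given relation."""
    return [obj for subj, rel, obj in RELATIONS if subj == subject and rel == relation]


if __name__ == "__main__":
    print(supertypes("car"))
    print(is_a("car", "thing"))
    print(related("car", "has_part"))
```

A search engine that consults only SUPERTYPE can broaden a query from "car" to "vehicle"; one that also consults RELATIONS can follow links that are not subtype relations, such as parts.
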


History

The first widely distributed MRDs were the Merriam-Webster Seventh Collegiate (W7) and the Merriam-Webster New Pocket Dictionary (MPD). Both were produced by a government-funded project at System Development Corporation under the direction of John Olney. They were keyboarded manually because no typesetting tapes of either book were available. Originally each was distributed on multiple reels of magnetic tape as card images, with each word of each definition on a separate punch card, along with numerous special codes indicating the details of its usage in the printed dictionary. Olney outlined a grand plan for the analysis of the definitions in the dictionary, but his project expired before the analysis could be carried out. Robert Amsler at the University of Texas at Austin resumed the analysis and completed a taxonomic description of the Pocket Dictionary under National Science Foundation (NSF) funding; however, his project expired before the taxonomic data could be distributed. Roy Byrd et al. at IBM Yorktown Heights resumed analysis of the Webster's Seventh Collegiate following Amsler's work. Finally, starting in the 1980s with initial support from Bellcore and later funding from various U.S. federal agencies, including NSF, ARDA, DARPA, DTO, and REFLEX, George Armitage Miller and Christiane Fellbaum at Princeton University completed the creation and wide distribution of a dictionary and its taxonomy in the WordNet project, which today stands as the most widely distributed computational lexicology resource.

