New General Service List
The New General Service List (NGSL) is a list of 2,818 words (lemmas) claimed to be the core vocabulary of the English language, published by Dr. Charles Browne, Dr. Brent Culligan and Joseph Phillips in March 2013. The words in the NGSL represent the most important high-frequency words of English for second language learners, and the list is a major update of Michael West's 1953 GSL. Although there are more than 600,000 word families in the English language, the roughly 2,800 words in the NGSL give more than 90% coverage for learners when trying to read most general texts of English. The main goals of the NGSL project were (1) to modernize and greatly increase the size of the corpus used by the original GSL, and (2) to create a list of words that provides a higher degree of coverage with fewer words than the original GSL. The 273-million-word subsection of the more than two-billion-word Cambridge English Corpus is about 100 times larger than the 2.5-million-word corpus developed in the 1 ...
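
Coverage here means the share of running words in a text that appear on the list. The following is a minimal sketch of that calculation, not the method used by the NGSL authors; the function names, the six-word list and the sample sentence are all hypothetical stand-ins.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def coverage(tokens, word_list):
    """Return the fraction of running tokens that appear in word_list."""
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in word_list)
    return known / len(tokens)

# Hypothetical toy list; a real run would load the published word list instead.
toy_list = {"the", "cat", "sit", "on", "a", "mat"}
sample = "The cat sat on the mat"

tokens = tokenize(sample)
print(f"{coverage(tokens, toy_list):.0%}")  # 5 of 6 tokens are on the list
```

The inflected form "sat" is counted as unknown because the toy list stores only the lemma "sit". A list of lemmas therefore needs a lemmatisation step before coverage is measured.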


Lemma (morphology)
In morphology and lexicography, a lemma (plural "lemmas" or "lemmata") is the canonical form, dictionary form, or citation form of a set of word forms. In English, for example, "break", "breaks", "broke", "broken" and "breaking" are forms of the same lexeme, with "break" as the lemma by which they are indexed. "Lexeme", in this context, refers to the set of all the inflected or alternating forms in the paradigm of a single word, and "lemma" refers to the particular form that is chosen by convention to represent the lexeme. Lemmas have special significance in highly inflected languages such as Arabic, Turkish and Russian. The process of determining the lemma for a given lexeme is called lemmatisation. The lemma can be viewed as the chief of the principal parts, although lemmatisation is at least partly arbitrary.

Morphology

The form of a word that is chosen to serve as the lemma is usually the least marked form, but there are several exceptions such as ...
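
The relationship between a lexeme and its lemma can be pictured as a small lookup structure. This is an illustrative sketch only: the names LEXEMES, FORM_TO_LEMMA and lemma_of are hypothetical, and the single entry reuses the "break" example from the text.

```python
# A lexeme modelled as a lemma (the key) plus the set of its word forms.
LEXEMES = {
    "break": {"break", "breaks", "broke", "broken", "breaking"},
}

# Invert the mapping so that any word form points back to its lemma.
FORM_TO_LEMMA = {
    form: lemma
    for lemma, forms in LEXEMES.items()
    for form in forms
}

def lemma_of(form):
    """Return the lemma that indexes the given form, or None if it is unknown."""
    return FORM_TO_LEMMA.get(form.lower())

print(lemma_of("broken"))  # break
```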


English Language
English is a West Germanic language of the Indo-European language family, with its earliest forms spoken by the inhabitants of early medieval England. It is named after the Angles, one of the ancient Germanic peoples that migrated to the island of Great Britain. Existing on a dialect continuum with Scots, and most closely related to the Low Saxon and Frisian languages, English is genealogically West Germanic. However, its vocabulary is also distinctively influenced by dialects of France (about 29% of Modern English words) and Latin (also about 29%), plus some grammar and a small amount of core vocabulary influenced by Old Norse (a North Germanic language). Speakers of English are called Anglophones. The earliest forms of English, collectively known as Old English, evolved from a group of West Germanic (Ingvaeonic) dialects brought to Great Britain by Anglo-Saxon settlers in the 5th century and further mutated by Norse-speaking Viking settlers starting in the 8th and 9th ...


General Service List
The General Service List (GSL) is a list of roughly 2,000 words published by Michael West in 1953. The words were selected to represent the most frequent words of English and were taken from a corpus of written English. The target audience was English language learners and ESL teachers. To maximize the utility of the list, some frequent words that overlapped broadly in meaning with words already on the list were omitted. In the original publication the relative frequencies of various senses of the words were also included.

Details

The list is important because a person who knows all the words on the list and their related families would understand approximately 90–95 percent of colloquial speech and 80–85 percent of common written texts. The list consists only of headwords, which means that the word "be" is high on the list, but assumes that the person is fluent in all forms of the word, e.g. am, is, are, was, were, being, and been. Researchers have expressed doubts about th ...


Cambridge English Corpus
The Cambridge English Corpus (CEC), formerly the Cambridge International Corpus (CIC), is a multi-billion-word corpus of the English language, containing both text corpus and spoken corpus data. The Cambridge English Corpus contains data from a number of sources, including written and spoken, British and American English. The CEC also contains the Cambridge Learner Corpus, a 40-million-word corpus made up of English exam responses written by English language learners. The Cambridge English Corpus is used to inform Cambridge University Press English Language Teaching publications as well as for research in corpus linguistics. Access is currently restricted to authors and researchers working on projects and publications for Cambridge University Press, and researchers at Cambridge English Language Assessment. It contains instances of modern written English, taken from newspapers, magazines, novels, letters, emails, textbooks, websites, and many other sources. Its spoken data is taken from ma ...




Lemmatization
Lemmatisation (or lemmatization) in linguistics is the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word's lemma, or dictionary form. In computational linguistics, lemmatisation is the algorithmic process of determining the lemma of a word based on its intended meaning. Unlike stemming, lemmatisation depends on correctly identifying the intended part of speech and meaning of a word in a sentence, as well as within the larger context surrounding that sentence, such as neighboring sentences or even an entire document. As a result, developing efficient lemmatisation algorithms is an open area of research.

Description

In many languages, words appear in several inflected forms. For example, in English, the verb 'to walk' may appear as 'walk', 'walked', 'walks' or 'walking'. The base form, 'walk', that one might look up in a dictionary, is called the lemma for the word. The association of the base form ...
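
The contrast between stemming and lemmatisation can be sketched with a toy example. Everything below is an assumption made for illustration: crude_stem is a deliberately naive suffix stripper, LEMMA_TABLE is a hypothetical hand-built lookup, and the part-of-speech tags are supplied by hand rather than inferred from context as a real lemmatiser would have to do.

```python
def crude_stem(word):
    """Strip a common suffix with no knowledge of part of speech or context."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical lookup keyed by (word form, part of speech).
LEMMA_TABLE = {
    ("walked", "VERB"): "walk",
    ("walks", "VERB"): "walk",
    ("walking", "VERB"): "walk",
    ("meeting", "VERB"): "meet",
    ("meeting", "NOUN"): "meeting",
}

def lemmatize(word, pos):
    """Return the lemma for a form given its part of speech; fall back to the form."""
    key = (word.lower(), pos)
    return LEMMA_TABLE.get(key, word.lower())

print(crude_stem("meeting"))          # meet, whatever the word's role
print(lemmatize("meeting", "NOUN"))   # meeting
print(lemmatize("meeting", "VERB"))   # meet
```

The stemmer gives one answer for "meeting" regardless of its role in the sentence, while the lookup returns a different lemma for the noun and the verb. That dependence on part of speech is what makes lemmatisation harder than stemming.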


Michael Philip West
Michael Philip West (1888–1973) was an English language teacher and researcher who worked extensively in India in the mid-1900s. He produced the reading scheme "The New Method Readers" (Longmans, Green and Co) and A General Service List of English Words (Longman, Harlow, Essex, 1953).


Swadesh List
The Swadesh list is a classic compilation of tentatively universal concepts for the purposes of lexicostatistics. Translations of the Swadesh list into a set of languages allow researchers to quantify the interrelatedness of those languages. The Swadesh list is named after linguist Morris Swadesh. It is used in lexicostatistics (the quantitative assessment of the genealogical relatedness of languages) and glottochronology (the dating of language divergence). Because there are several different lists, some authors also refer to "Swadesh lists".

Versions and authors

Morris Swadesh himself created several versions of his list. He started with a list of 215 meanings (mistakenly introduced as a list of 225 meanings in the paper due to a spelling error), which he reduced to 165 words for the Salish-Spokane-Kalispel language. In 1952, he published a list of 215 meanings (Swadesh 1952: 456–), of which he suggested the removal of 16 for being unclear or not ...
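
One simple way to quantify interrelatedness from translated lists is to count the share of meanings whose words are judged cognate. The sketch below assumes that an analyst has already assigned a cognate-class label to each word; the function name, the two language names and the labels are entirely hypothetical and do not describe any real language pair.

```python
def shared_cognate_ratio(lang_a, lang_b):
    """Fraction of meanings present in both lists whose words share a cognate class."""
    common = [meaning for meaning in lang_a if meaning in lang_b]
    if not common:
        return 0.0
    matches = sum(1 for meaning in common if lang_a[meaning] == lang_b[meaning])
    return matches / len(common)

# Hypothetical data: meaning -> cognate-class label assigned by an analyst.
lang_x = {"water": 1, "fire": 1, "stone": 1, "hand": 1}
lang_y = {"water": 1, "fire": 2, "stone": 1, "hand": 1}

print(shared_cognate_ratio(lang_x, lang_y))  # 0.75
```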


Français Fondamental
Français fondamental (French for "Fundamental French") is a list of words and grammatical concepts, devised at the beginning of the 1950s for teaching foreigners and residents of the French Union, France's colonial empire. A series of investigations in the 1950s and 1960s showed that a small number of words are used the same way orally and in writing in all circumstances; thus only a limited number of grammatical rules were necessary for a functional language.

Origins

Français fondamental was developed by the Centre d'Etude du Français Élémentaire, which was renamed the Centre de Recherche et d'Etude pour la Diffusion du Français (CREDIF) in 1959. It was headed by linguist Georges Gougenheim (Stern, H. H., Fundamental Concepts of Language Teaching: Historical and Interdisciplinary Perspectives on Applied Linguistic Research, Oxford University Press, 24 March 1983, p. 55, ISBN 9780194370653). The Ministry of Educatio ...