LOB Corpus
The Lancaster-Oslo/Bergen (LOB) Corpus is a million-word collection of British English texts compiled in the 1970s through a collaboration between the University of Lancaster, the University of Oslo, and the Norwegian Computing Centre for the Humanities, Bergen. It was intended as a British counterpart to the Brown Corpus, which Henry Kučera and W. Nelson Francis compiled for American English in the 1960s. Its composition was designed to match the original Brown Corpus as closely as possible in size and genres, using documents published in the UK by British authors. Both corpora consist of 500 samples, each comprising about 2,000 words, drawn from a common set of genres. The corpus has also been tagged, i.e. part-of-speech categories have ...


University of Lancaster
Lancaster University (legally The University of Lancaster) is a public research university in Lancaster, Lancashire, England. The university was established in 1964 by royal charter, a ...
Key facts: motto (English) "Truth lies open to all"; type: public; campus: Bailrigg, City of Lancaster, England; endowment: £13.9 million; budget: £317.9 million; academic staff: 1,872 (full-time equivalent); administrative staff: 3,223 (full-time equivalent); Chancellor: Alan Milburn; Pro-Chancellor: Alistair Burt; Vice-Chancellor: Andy Schofield; students: 15,979 (11,419 undergraduate, 4,560 postgraduate), per Lancaster University, "Student numbers FOI Request 2019", 6 November 2019, retrieved 4 December 2019; colours: 'Quaker Grey' and red; affiliations: N8 Group, ACU, AACSB, AMBA, NWUA, EUA, EQUIS, Universities UK; website: www.lancaster.ac.uk


University of Oslo
The University of Oslo (Norwegian: Universitetet i Oslo; Latin: Universitas Osloensis) is a public research university located in Oslo, Norway. It is the highest-ranked and oldest university in Norway. It is consistently ranked among the top universities in the world and as one of the leading universities of Northern Europe; the Academic Ranking of World Universities ranked it the 58th best university in the world and the third best in the Nordic countries. In 2016, the Times Higher Education World University Rankings listed the university at 63rd, making it the highest-ranked Norwegian university. Originally named the Royal Frederick University, the university was established in 1811 as the de facto Norwegian continuation of Denmark-Norway's common university, the University of Copenhagen, with which it shares many traditions. It was named for King Frederick VI of Denmark and Norway, and received its current name in 1939. The university was commonly nicknamed "The Royal Frederick ...


Norwegian Computing Centre for the Humanities
Norwegian, Norwayan, or Norsk may refer to:
* Something of, from, or related to Norway, a country in northwestern Europe
* Norwegians, both a nation and an ethnic group native to Norway
* Demographics of Norway
* The Norwegian language, including the two official written forms:
** Bokmål, literally "book language", used by 85–90% of the population of Norway
** Nynorsk, literally "New Norwegian", used by 10–15% of the population of Norway
* The Norwegian Sea
Norwegian may also refer to:
* Norwegian Air Shuttle, an airline, trading as Norwegian
** Norwegian Long Haul, a defunct subsidiary of Norwegian Air Shuttle, flying long-haul flights
* Norwegian Air Lines, a former airline, merged with Scandinavian Airlines in 1951
* Norwegian coupling, used for narrow-gauge railways
* Norwegian Cruise Line, a cruise line
* Norwegian Elkhound, a canine breed
* Norwegian Forest cat, a domestic feline breed
* Norwegian Red, a breed of dairy cattle
* Norwegian Township, Schuylkill C ...


Bergen
Bergen, historically Bjørgvin, is a city and municipality in Vestland county on the west coast of Norway. Its population is roughly 285,900. Bergen is the second-largest city in Norway. The municipality is on the peninsula of Bergenshalvøyen. The city centre and northern neighbourhoods are on Byfjorden, 'the city fjord', and the city is surrounded by mountains; Bergen is known as the "city of seven mountains". Many of the extra-municipal suburbs are on islands. Bergen is the administrative centre of Vestland county. The city consists of eight boroughs: Arna, Bergenhus, Fana, Fyllingsdalen, Laksevåg, Ytrebygda, Årstad, and Åsane. Trading in Bergen may have started as early as the 1020s. According to tradition, the city was founded in 1070 by King Olav Kyrre and was named Bjørgvin, 'the green meadow among the mountains'. It served as Norway's capital in the 13th century, and from the end of the 13th century became a bureau city of the Hanseatic Leag ...


Brown Corpus
The Brown University Standard Corpus of Present-Day American English (or just Brown Corpus) is an electronic collection of text samples of American English, the first major structured corpus of varied genres. This corpus first set the bar for the scientific study of the frequency and distribution of word categories in everyday language use. Compiled by Henry Kučera and W. Nelson Francis at Brown University, in Rhode Island, it is a general language corpus containing 500 samples of English, totaling roughly one million words, compiled from works published in the United States in 1961. History: In 1967, Kučera and Francis published their classic work Computational Analysis of Present-Day American English, which provided basic statistics on what is known today simply as the Brown Corpus. The Brown Corpus was a carefully compiled selection of current American English, totalling about a million words drawn from a wide variety of sources. Kučera and Francis subjected it to a ...


Henry Kučera
Henry Kučera (15 February 1925 – 20 February 2010), born Jindřich Kučera, was a Czech-American linguist who pioneered corpus linguistics and linguistic software, was a major contributor to the American Heritage Dictionary, and was a pioneer in the development of spell-checking software. He is remembered in particular as one of the initiators of the Brown Corpus. Early life and education: Kučera was born in Třebařov (between Pardubice and Olomouc) in Czechoslovakia and later moved with his family to Hodonín, where he studied. When the Communists came to power in February 1948, his studies in philosophy and linguistics at Charles University in the Czech capital of Prague were interrupted. He was forced to leave Czechoslovakia in April 1948 when it became clear that his political writings had placed him at risk of detention by the Communist authorities. Kučera then moved to Allied-occupied Germany where he worked under the supervision of the U.S. CIC (Counterinte ...


Part-of-speech Tagging
In corpus linguistics, part-of-speech tagging (POS tagging, PoS tagging, or POST), also called grammatical tagging, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context. A simplified form of this is commonly taught to school-age children, in the identification of words as nouns, verbs, adjectives, adverbs, etc. Once performed by hand, POS tagging is now done in the context of computational linguistics, using algorithms which associate discrete terms, as well as hidden parts of speech, with a set of descriptive tags. POS-tagging algorithms fall into two distinct groups: rule-based and stochastic. E. Brill's tagger, one of the first and most widely used English POS-taggers, employs rule-based algorithms. Principle: Part-of-speech tagging is harder than just having a list of words and their parts of speech, because some words can represent more than one part of speech at different times, ...
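To make the point about context concrete, here is a minimal rule-based sketch in Python. The lexicon, the tag names (DET, NOUN, VERB, AUX) and the two context rules are illustrative assumptions only; they are not the LOB or Brown tagsets, and this is not Brill's tagger.

```python
# Hypothetical toy lexicon: each word lists the tags it may take,
# with the first entry acting as the default choice.
LEXICON = {
    "the": ["DET"],
    "a": ["DET"],
    "dogs": ["NOUN"],
    "can": ["AUX", "NOUN", "VERB"],   # ambiguous: "can bark" vs "the can"
    "bark": ["VERB", "NOUN"],
}


def tag(tokens):
    """Assign one tag per token using the lexicon plus two context rules."""
    tags = []
    for word in tokens:
        # Unknown words fall back to NOUN (an assumption of this sketch).
        candidates = LEXICON.get(word.lower(), ["NOUN"])
        previous = tags[-1] if tags else None
        choice = candidates[0]
        # Rule 1: after a determiner, prefer the noun reading.
        if previous == "DET" and "NOUN" in candidates:
            choice = "NOUN"
        # Rule 2: after an auxiliary, prefer the verb reading.
        elif previous == "AUX" and "VERB" in candidates:
            choice = "VERB"
        tags.append(choice)
    return list(zip(tokens, tags))


print(tag(["The", "dogs", "can", "bark"]))
print(tag(["the", "can"]))
```

In this sketch the word "can" receives a different tag in each sentence, which a plain word-to-tag list could not do. A stochastic tagger would replace the hand-written rules with probabilities estimated from a tagged corpus.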


Part-of-speech
In grammar, a part of speech or part-of-speech (abbreviated as POS or PoS, also known as word class or grammatical category) is a category of words (or, more generally, of lexical items) that have similar grammatical properties. Words that are assigned to the same part of speech generally display similar syntactic behavior (they play similar roles within the grammatical structure of sentences), sometimes similar morphological behavior in that they undergo inflection for similar properties, and even similar semantic behavior. Commonly listed English parts of speech are noun, verb, adjective, adverb, pronoun, preposition, conjunction, interjection, numeral, article, and determiner. Terms other than "part of speech" include word class, lexical class, and lexical category; these are used particularly in modern linguistic classifications, which often make more precise distinctions than the traditional scheme does. Some authors restrict the term "lexical category" to refer only to a particu ...




English Corpora
English usually refers to:
* English language
* English people
English may also refer to:
Peoples, culture, and language
* English, an adjective for something of, from, or related to England
** English national identity, an identity and common culture
** English language in England, a variant of the English language spoken in England
* English languages (other)
* English studies, the study of English language and literature
* English, an Amish term for non-Amish, regardless of ethnicity
Individuals
* English (surname), a list of notable people with the surname English
* People with the given name
** English McConnell (1882–1928), Irish footballer
** English Fisher (1928–2011), American boxing coach
** English Gardner (b. 1992), American track and field sprinter
Places
United States
* English, Indiana, a town
* English, Kentucky, an unincorporated community
* English, Brazoria County, Texas, an unincorporated community
* Engli ...


Linguistic Research
Linguistics is the scientific study of human language. It is called a scientific study because it entails a comprehensive, systematic, objective, and precise analysis of all aspects of language, particularly its nature and structure. Linguistics is concerned with both the cognitive and social aspects of language. It is considered a scientific field as well as an academic discipline; it has been classified as a social science, natural science, cognitive science (Thagard, Paul, "Cognitive Science", The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.)), or part of the humanities. Traditional areas of linguistic analysis correspond to phenomena found in human linguistic systems, such as syntax (rules governing the structure of sentences); semantics (meaning); morphology (structure of words); phonetics (speech sounds and equivalent gestures in sign languages); phonology (the abstract sound system of a particular language); and pragmatics (how social contex ...


Applied Linguistics
Applied linguistics is an interdisciplinary field which identifies, investigates, and offers solutions to language-related real-life problems. Some of the academic fields related to applied linguistics are education, psychology, communication research, information science, natural language processing, anthropology, and sociology. Domain: Applied linguistics is an interdisciplinary field. Major branches of applied linguistics include bilingualism and multilingualism, conversation analysis, contrastive linguistics, language assessment, literacies, discourse analysis, language pedagogy, second language acquisition, language planning and policy, interlinguistics, stylistics, language teacher education, forensic linguistics, and translation. Journals: Major journals of the field include Research Methods in Applied Linguistics, Annual Review of Applied Linguistics, Applied Linguistics, Studies in Second Language Acquisition, Applied Psycholinguistics, Internat ...