The Lancaster-Oslo/Bergen (LOB) Corpus is a million-word collection of British English texts compiled in the 1970s through a collaboration between the University of Lancaster, the University of Oslo, and the Norwegian Computing Centre for the Humanities, Bergen, to provide a British counterpart to the Brown Corpus compiled by Henry Kučera and W. Nelson Francis for American English in the 1960s.
Its composition was designed to match the original Brown Corpus as closely as possible in size and genres, using documents published in the UK by British authors. Both corpora consist of 500 samples of about 2,000 words each, distributed across a shared set of genres.
The corpus has also been tagged, i.e. part-of-speech categories have been assigned to every word.
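To illustrate what it means for every word to carry a part-of-speech category, the following is a minimal Python sketch. It assumes a hypothetical `word_TAG` notation and made-up tag names (`DET`, `NOUN`, `VERB`, `ADJ`); neither is taken from the LOB Corpus itself, whose actual file format and tagset are described in its manual. The function name `parse_tagged` is likewise only illustrative.

```python
from collections import Counter


def parse_tagged(line, sep="_"):
    """Split a line of 'word_TAG' tokens into (word, tag) pairs.

    The 'word_TAG' layout is an assumption made for this sketch,
    not the documented LOB format.
    """
    pairs = []
    for token in line.split():
        word, found, tag = token.rpartition(sep)
        if not found:
            # No separator present: keep the token and leave it untagged.
            word, tag = token, None
        pairs.append((word, tag))
    return pairs


# Hypothetical sample sentence with illustrative (non-LOB) tags.
sample = "the_DET corpus_NOUN contains_VERB tagged_ADJ words_NOUN"
pairs = parse_tagged(sample)

# Count how often each part-of-speech category occurs.
tag_counts = Counter(tag for _, tag in pairs)
```

Once each word is paired with its category in this way, questions such as how frequent a given category is reduce to simple counting over the pairs.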
External links
LOB Corpus Manual
LOB Corpus from the Oxford Text Archive