The Bank of English is a representative subset of the 4.5-billion-word COBUILD corpus, a collection of English texts. These are mainly British in origin, but content from North America, Australia, New Zealand, South Africa and other Commonwealth countries is also included. The majority of the texts are from written English, collected from websites, newspapers, magazines and books. There is also a large component of spoken data drawn from radio, TV and informal conversations. The Bank of English totals 650 million running words.<ref>The Collins Corpus</ref> Copies of the corpus are held both at HarperCollins Publishers and the University of Birmingham. The version at Birmingham can be accessed for academic research. The Bank of English forms part of the ''Collins Word Web'' together with the French, German and Spanish corpora.


See also

* Corpus of Contemporary American English (COCA)
* British National Corpus (BNC)


References


External links


COBUILD Reference