Brown Corpus
The Brown University Standard Corpus of Present-Day American English (or just Brown Corpus) is an electronic collection of text samples of American English, the first major structured corpus of varied genres. It set the bar for the scientific study of the frequency and distribution of word categories in everyday language use. Compiled by Henry Kučera and W. Nelson Francis at Brown University in Rhode Island, it is a general language corpus containing 500 samples of English, totaling roughly one million words, drawn from works published in the United States in 1961.

History
In 1967, Kučera and Francis published their classic work Computational Analysis of Present-Day American English, which provided basic statistics on what is known today simply as the Brown Corpus. The Brown Corpus was a carefully compiled selection of current American English, totaling about a million words drawn from a wide variety of sources. Kučera and Francis subjected it to a ...




Metcalf Research Laboratory (Brown) 05
Metcalf may refer to:

People and fictional characters
* Metcalf (surname)

Places in the United States
* Metcalf, Georgia, a village
* Metcalf, Illinois, a village
* Metcalfe County, Kentucky
* Metcalf, Holliston, Massachusetts, a district of Holliston
* Metcalf Hill, New York, a mountain

Other uses
* USS Metcalf (DD-595), a US Navy destroyer
* Metcalf Center for Science and Engineering, a building at Boston University in Massachusetts
* Metcalf (dinghy), an American sailboat design
* Metcalf transmission substation, site of the 2013 Metcalf sniper attack which damaged electrical transformers, near San Jose, California
* Metcalf, a fictional town in Alfred Hitchcock's film Strangers on a Train

See also
* Metcalf Chateau, a group of Asian-American artists with ties to Honolulu
* Metcalfe (disambiguation)


George Kingsley Zipf
George Kingsley Zipf (January 7, 1902 – September 25, 1950) was an American linguist and philologist who studied statistical occurrences in different languages. Zipf earned his bachelor's, master's, and doctoral degrees from Harvard University, although he also studied at the University of Bonn and the University of Berlin. He was Chairman of the German Department and University Lecturer (meaning he could teach any subject he chose) at Harvard University. He worked with Chinese and demographics, and much of his work can explain properties of the Internet, the distribution of income within nations, and many other collections of data.

Zipf's law
He is the eponym of Zipf's law, which states that while only a few words are used very often, many or most are used rarely:

$$P_n \sim 1/n^a$$

where $P_n$ is the frequency of the word ranked $n$th and the exponent $a$ is almost 1. This means that the second item occurs approximately 1/2 as often as the first, and the third item 1/3 a ...
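The rank-frequency relation above can be illustrated with a minimal Python sketch. It assumes an idealized distribution in which frequencies follow 1/n^a exactly; the function name `zipf_frequencies` and the choice of 1,000 ranks are hypothetical and used only for illustration. Real corpora follow the law only approximately.

```python
def zipf_frequencies(n_ranks, a=1.0):
    """Return idealized relative frequencies P_n proportional to 1 / n**a.

    The values are normalized so that they sum to 1 over the given ranks.
    This is an illustrative model, not a fit to any real corpus.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    weights = [1.0 / (n ** a) for n in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Idealized frequencies for the 1,000 most frequent items (assumed size).
freqs = zipf_frequencies(1000, a=1.0)

# With a = 1, the second item is half as frequent as the first,
# and the third item is a third as frequent as the first.
ratio_second = freqs[1] / freqs[0]
ratio_third = freqs[2] / freqs[0]
print(ratio_second, ratio_third)
```

With an exponent other than 1, the same ratios become 1/2^a and 1/3^a, which is one simple way to see how the exponent controls how quickly frequency falls off with rank.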


Linguistic Research
Linguistics is the scientific study of human language. It is called a scientific study because it entails a comprehensive, systematic, objective, and precise analysis of all aspects of language, particularly its nature and structure. Linguistics is concerned with both the cognitive and social aspects of language. It is considered a scientific field as well as an academic discipline; it has been classified as a social science, natural science, cognitive science (Thagard, Paul, "Cognitive Science", The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.)), or part of the humanities. Traditional areas of linguistic analysis correspond to phenomena found in human linguistic systems, such as syntax (rules governing the structure of sentences); semantics (meaning); morphology (structure of words); phonetics (speech sounds and equivalent gestures in sign languages); phonology (the abstract sound system of a particular language); and pragmatics (how social contex ...


Applied Linguistics
Applied linguistics is an interdisciplinary field which identifies, investigates, and offers solutions to language-related real-life problems. Some of the academic fields related to applied linguistics are education, psychology, communication research, information science, natural language processing, anthropology, and sociology.

Domain
Applied linguistics is an interdisciplinary field. Major branches of applied linguistics include bilingualism and multilingualism, conversation analysis, contrastive linguistics, language assessment, literacies, discourse analysis, language pedagogy, second language acquisition, language planning and policy, interlinguistics, stylistics, language teacher education, forensic linguistics, and translation.

Journals
Major journals of the field include Research Methods in Applied Linguistics, Annual Review of Applied Linguistics, Applied Linguistics, Studies in Second Language Acquisition, Applied Psycholinguistics, Internat ...




English Corpora
English usually refers to:
* English language
* English people

English may also refer to:

Peoples, culture, and language
* English, an adjective for something of, from, or related to England
** English national identity, an identity and common culture
** English language in England, a variant of the English language spoken in England
* English languages (disambiguation)
* English studies, the study of English language and literature
* English, an Amish term for non-Amish, regardless of ethnicity

Individuals
* English (surname), a list of notable people with the surname English
* People with the given name
** English McConnell (1882–1928), Irish footballer
** English Fisher (1928–2011), American boxing coach
** English Gardner (b. 1992), American track and field sprinter

Places

United States
* English, Indiana, a town
* English, Kentucky, an unincorporated community
* English, Brazoria County, Texas, an unincorporated community
* Engli ...


LOB Corpus
The Lancaster-Oslo/Bergen (LOB) Corpus is a million-word collection of British English texts compiled in the 1970s as a collaboration between the University of Lancaster, the University of Oslo, and the Norwegian Computing Centre for the Humanities, Bergen. It was intended as a British counterpart to the Brown Corpus, which Henry Kučera and W. Nelson Francis compiled for American English in the 1960s. Its composition was designed to match the original Brown Corpus in size and genres as closely as possible, using documents published in the UK by British authors. Both corpora consist of 500 samples, each comprising about 2,000 words, spread across a matching set of genres. The corpus has also been tagged, i.e. part-of-speech categories have ...



Keypunch
A keypunch is a device for precisely punching holes into stiff paper cards at specific locations as determined by keys struck by a human operator. Other devices included here for that same function include the gang punch, the pantograph punch, and the stamp. The term was also used for similar machines used by humans to transcribe data onto punched tape media. For Jacquard looms, the resulting punched cards were joined together to form a paper tape, called a "chain", containing a program that, when read by a loom, directed its operation (Bell, T.F. (1895), Jacquard Weaving and Designing, Longmans, Green and Co.). For Hollerith machines and other unit record machines, the resulting punched cards contained data to be processed by those machines. For computers equipped with a punched card input/output device, the resulting punched cards were either data or programs directing the computer's operation. Early Hollerith keypunches were manual devices. Later keypunches were electrom ...


International Corpus Of English
The International Corpus of English (ICE) is a set of corpora representing varieties of English from around the world. It includes over twenty countries or groups of countries where English is the first language or an official second language.

History
Sidney Greenbaum's goal of compiling corpora for comparing the syntax of world English became the ICE project, which was realized by Professor Charles F. Meyer. Greenbaum envisaged international teams of researchers collecting comparable samples of national varieties of English, both written and spoken. Comparable varieties such as British English, American English, and Indian English would each be represented by a computer corpus. Researchers use the corpora to compare the syntax of the varieties of English. Once completed, the ICE corpora are intended to support comprehensive linguistic analysis of the varieties of English that have emerged. Ongoing research for ICE is carried out by international teams in different regions. The project ...



British National Corpus
The British National Corpus (BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources. The corpus covers British English of the late 20th century from a wide variety of genres, with the intention that it be a representative sample of spoken and written British English of that time. It is used in corpus linguistics for the analysis of corpora.

History
The project to create the BNC involved the collaboration of three publishers (Oxford University Press as the lead collaborator, together with Longman and W. & R. Chambers), two universities (the University of Oxford and Lancaster University), and the British Library. The creation of the BNC started in 1991 under the management of the BNC consortium, and the project was finished by 1994. No new samples have been added since 1994, but the BNC underwent slight revisions before the release of the second edition, BNC World (2001), and the third edition, BNC XML Edition (2007).


Corpus Of Contemporary American English
The Corpus of Contemporary American English (COCA) is a one-billion-word corpus of contemporary American English. It was created by Mark Davies, retired professor of corpus linguistics at Brigham Young University (BYU).

Content
The Corpus of Contemporary American English (COCA) is composed of one billion words as of November 2021. The corpus is constantly growing: in 2009 it contained more than 385 million words; in 2010 it grew to 400 million words; by March 2019, it had grown to 560 million words. As of November 2021, the corpus is composed of 485,202 texts. According to the corpus website, the current corpus (November 2021) is composed of texts that include 24 to 25 million words for each year from 1990 to 2019. For each year contained in the corpus (1990 to 2019), the corpus is evenly divided between six registers/genres: TV/movies, spoken, fiction, magazine, newspaper, and academic (see Texts and Registers page of the COCA websi ...



Hapax Legomena
In corpus linguistics, a hapax legomenon (plural hapax legomena; sometimes abbreviated to hapax, plural hapaxes) is a word or an expression that occurs only once within a context: either in the written record of an entire language, in the works of an author, or in a single text. The term is sometimes incorrectly used to describe a word that occurs in just one of an author's works but more than once in that particular work. Hapax legomenon is a transliteration from Greek, meaning "being said once". The related terms dis legomenon, tris legomenon, and tetrakis legomenon refer to double, triple, or quadruple occurrences respectively, but are far less commonly used. Hapax legomena are quite common, as predicted by Zipf's law, which states that the frequency of any word in a corpus is inversely proportional to its rank in the frequency table. For large corpora, about 40% to 60% of the words are hapax legomena, and another 10% to 1 ...
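As a rough illustration of the definition above, the following Python sketch finds the words that occur exactly once in a single text. The tokenization rule (lowercased runs of letters and apostrophes), the function name `hapax_legomena`, and the sample sentence are all assumptions made for this example; a real corpus study would use a proper tokenizer and a far larger text, so the share computed here says nothing about the proportions quoted for large corpora.

```python
import re
from collections import Counter


def hapax_legomena(text):
    """Return (sorted list of words occurring exactly once, word counts).

    Tokenization is deliberately naive: lowercase the text and keep
    runs of ASCII letters and apostrophes.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    hapaxes = sorted(word for word, count in counts.items() if count == 1)
    return hapaxes, counts


# Toy sample text (an assumption for illustration only).
sample = "the cat sat on the mat and the dog sat by the door"
hapaxes, counts = hapax_legomena(sample)

# Share of distinct words (types) that are hapax legomena in this text.
hapax_share = len(hapaxes) / len(counts)
print(hapaxes)
print(hapax_share)
```

Here the share is computed over distinct words (types) rather than over running tokens, which is one common reading of "percentage of the words"; counting over tokens would give a smaller figure.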