In linguistics, co-occurrence or cooccurrence is an above-chance frequency of ordered occurrence of two adjacent terms in a text corpus. Co-occurrence in this linguistic sense can be interpreted as an indicator of semantic proximity or of an idiomatic expression. Corpus linguistics and its statistical analyses reveal patterns of co-occurrence within a language and make it possible to identify the typical collocations of its lexical items. A ''co-occurrence restriction'' is identified when linguistic elements never occur together. Analysis of these restrictions can lead to discoveries about the structure and development of a language. Co-occurrence can be seen as an extension of word counting to higher dimensions: instead of counting single words, one counts pairs of words. It can be described quantitatively using measures such as correlation or mutual information.
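As an illustration of the counting and the quantitative measures described above, the following minimal Python sketch counts ordered pairs of adjacent tokens and scores a pair with pointwise mutual information (PMI), a per-pair relative of mutual information that is commonly used in corpus work. The function names, the toy sentence, and the whitespace tokenization are assumptions made for this example only; they are not taken from any particular library or corpus.

```python
import math
from collections import Counter


def cooccurrence_counts(tokens):
    """Count ordered pairs of adjacent tokens (bigrams)."""
    return Counter(zip(tokens, tokens[1:]))


def pmi(tokens, first, second):
    """Pointwise mutual information (base 2) of the ordered adjacent pair.

    Returns -inf when the pair never occurs in that order, which is how a
    co-occurrence restriction would show up in this toy setting.
    """
    unigrams = Counter(tokens)
    bigrams = cooccurrence_counts(tokens)
    pair_count = bigrams[(first, second)]
    if pair_count == 0:
        return float("-inf")
    p_pair = pair_count / (len(tokens) - 1)   # relative frequency of the pair
    p_first = unigrams[first] / len(tokens)   # relative frequency of each word
    p_second = unigrams[second] / len(tokens)
    return math.log2(p_pair / (p_first * p_second))


# Hypothetical toy corpus, tokenized naively on whitespace.
tokens = "strong tea and strong coffee but powerful computers and strong tea".split()
counts = cooccurrence_counts(tokens)
score = pmi(tokens, "strong", "tea")
```

A positive score means the pair occurs adjacently more often than the individual word frequencies would predict, which is the above-chance behaviour the definition refers to. On a corpus this small the numbers are not meaningful; real analyses rely on large corpora and usually apply frequency thresholds or smoothing.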

