Writer invariant

A writer invariant, also called an authorial invariant or author's invariant, is a property of a text that is characteristic of its author: it is similar across all texts by a given author and differs between texts by different authors. It can be used to detect plagiarism or to identify the real author of an anonymously published text. In handwritten text recognition, the term also refers to an author's characteristic pattern of writing a letter.

While it is generally recognised that writer invariants exist, there is no agreement on which properties of a text should be used. Among the first to be used was the distribution of word lengths; other proposed invariants include average sentence length, average word length, the usage frequency of nouns, verbs, or adjectives, vocabulary richness, and the frequency of function words in general or of specific function words. Of these, average sentence length can be very similar in works by different authors, or can vary significantly even within a single work; average word length likewise turns out to be very similar across authors. Analysis of function words shows promise because authors use them unconsciously.
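As an illustration only, several of the candidate invariants listed above can be computed with a few lines of Python. This is a minimal sketch, not a method taken from the literature: the function names, the regular-expression tokenizer, the naive sentence splitter, and the short `FUNCTION_WORDS` list are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative list, not a standard function-word inventory.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "it")


def tokenize(text):
    """Lower-case the text and return its words (letters, optional inner apostrophe)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def word_length_distribution(text):
    """Return the relative frequency of each word length, keyed by length."""
    words = tokenize(text)
    if not words:
        return {}
    counts = Counter(len(w) for w in words)
    return {length: n / len(words) for length, n in sorted(counts.items())}


def average_sentence_length(text):
    """Return the mean number of words per sentence (naive split on . ! ?)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


def function_word_frequencies(text):
    """Return the relative frequency of each word in FUNCTION_WORDS."""
    words = tokenize(text)
    counts = Counter(words)
    total = len(words)
    return {w: (counts[w] / total if total else 0.0) for w in FUNCTION_WORDS}


if __name__ == "__main__":
    sample = "The cat sat on the mat. It was the end of the day."
    print(word_length_distribution(sample))
    print(average_sentence_length(sample))
    print(function_word_frequencies(sample))
```

Comparing such feature values between a disputed text and texts of known authorship is one way these properties might be put to use. A sample this short is far too small to characterise an author; it only shows the mechanics.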


See also

* Stylometry
* Writeprint

