Menzerath's Law
Menzerath's law, or the Menzerath–Altmann law (named after Paul Menzerath and Gabriel Altmann), is a linguistic law according to which an increase in the size of a linguistic construct results in a decrease in the size of its constituents, and vice versa. For example, the longer a sentence (measured in number of clauses), the shorter its clauses (measured in number of words); likewise, the longer a word (in syllables or morphs), the shorter its syllables or morphs (in sounds). According to Altmann (1980), it can be stated mathematically as:

y = a \cdot x^{b} \cdot e^{-c x}

where:
* y is the constituent size (e.g. syllable length)
* x is the size of the linguistic construct being inspected (e.g. number of syllables per word)
* a, b, c are parameters

The law can be explained by the assumption that linguistic segments contain information about their structure (besides the information that needs to be communicated). The assumption that the length of the structure information is in ...
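The Menzerath–Altmann formula can be evaluated directly. The following is a minimal sketch; the function name `menzerath_altmann` and the parameter values are hypothetical, chosen only to show the curve falling as the construct grows, and are not fitted to any corpus.

```python
import math


def menzerath_altmann(x, a, b, c):
    """Predicted constituent size y = a * x**b * exp(-c * x)."""
    return a * x ** b * math.exp(-c * x)


# Hypothetical parameters (not estimated from data): a negative b and a
# small positive c make the predicted constituent size shrink as x grows.
a, b, c = 3.0, -0.3, 0.02

for syllables_per_word in range(1, 6):
    predicted = menzerath_altmann(syllables_per_word, a, b, c)
    print(syllables_per_word, round(predicted, 3))
```

In practice a, b and c would be estimated from measured data, for example by nonlinear least squares.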


Linguistics
Linguistics is the scientific study of human language. It is called a scientific study because it entails a comprehensive, systematic, objective, and precise analysis of all aspects of language, particularly its nature and structure. Linguistics is concerned with both the cognitive and social aspects of language. It is considered a scientific field as well as an academic discipline; it has been classified as a social science, natural science, cognitive science (Thagard, Paul, "Cognitive Science", The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.)), or part of the humanities. Traditional areas of linguistic analysis correspond to phenomena found in human linguistic systems, such as syntax (rules governing the structure of sentences); semantics (meaning); morphology (structure of words); phonetics (speech sounds and equivalent gestures in sign languages); phonology (the abstract sound system of a particular language); and pragmatics (how social con ...

Zipf's Law
Zipf's law is an empirical law that often holds, approximately, when a list of measured values is sorted in decreasing order. It states that the value of the ''n''th entry is inversely proportional to ''n''. The best known instance of Zipf's law applies to the frequency table of words in a text or corpus of natural language:

\text{word frequency} \propto \frac{1}{\text{word rank}}

It is usually found that the most common word occurs approximately twice as often as the next most common one, three times as often as the third most common, and so on. For example, in the Brown Corpus of American English text, the word "''the''" is the most frequently occurring word, and by itself accounts for nearly 7% of all word occurrences (69,971 out of slightly over 1 million). True to Zipf's law, the second-place word "''of''" accounts for slightly over 3.5% of words (36,411 occurrences), followed by "''and''" (28,852). It is often used in the following form, called the Zipf–Mandelbrot law:

\text{frequency} \propto \frac{1}{(\text{rank} + b)^{a}}

where a, b are fi ...
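The Brown Corpus counts quoted above can be checked against the simple inverse-rank prediction. This is a rough sketch: the helper name `zipf_prediction` is hypothetical, and the prediction is anchored on the count of the top-ranked word rather than fitted to the whole frequency table.

```python
def zipf_prediction(top_count, rank):
    """Count expected at a given rank if frequency is proportional to 1/rank."""
    return top_count / rank


# Counts for the three most frequent Brown Corpus words, as quoted in the text.
observed = [("the", 69971), ("of", 36411), ("and", 28852)]
top_count = observed[0][1]

for rank, (word, count) in enumerate(observed, start=1):
    predicted = zipf_prediction(top_count, rank)
    ratio = count / predicted
    print(f"{word}: observed {count}, predicted {predicted:.0f}, ratio {ratio:.2f}")
```

A ratio near 1 means the observed count is close to the inverse-rank prediction. The law holds only approximately, so some deviation is expected.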


Principle Of Least Effort
The principle of least effort is a broad theory that covers diverse fields from evolutionary biology to webpage design. It postulates that animals, people, and even well-designed machines will naturally choose the path of least resistance or "effort". It is closely related to many other similar principles: see Principle of least action or other articles listed below. This is perhaps best known or at least documented among researchers in the field of library and information science. Their principle states that an information-seeking client will tend to use the most convenient search method in the least exacting mode available. Information-seeking behavior stops as soon as minimally acceptable results are found. This theory holds true regardless of the user's proficiency as a searcher, or their level of subject expertise. Also, this theory takes into account the user’s previous information-seeking experience. The user will use the tools that are most familiar and easy to use that ...

Pareto Distribution
The Pareto distribution, named after the Italian civil engineer, economist, and sociologist Vilfredo Pareto, is a power-law probability distribution used to describe social, quality control, scientific, geophysical, actuarial, and many other types of observable phenomena. The principle was originally applied to the distribution of wealth in a society, fitting the trend that a large portion of wealth is held by a small fraction of the population. The Pareto principle or "80-20 rule", which states that 80% of outcomes are due to 20% of causes, was named in honour of Pareto, but the concepts are distinct, and only Pareto distributions with a shape value (α) of log_4 5 ≈ 1.16 precisely reflect it. Empirical observation has shown that this 80-20 distribution fits a wide range of cases, including natural phenomena and human activities.
Definitions
If ''X'' is a random variable with a Pareto (Type I) distribution, then the probability that ''X'' is ...
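The link between the shape value log_4 5 and the 80-20 rule can be checked numerically. The sketch below relies on a standard result that is not stated in the excerpt: for a Pareto Type I distribution with shape α > 1, the top fraction p of the population holds a share p^(1 - 1/α) of the total. The function name `top_share` is hypothetical.

```python
import math


def top_share(p, alpha):
    """Share of the total held by the top fraction p of the population.

    Uses the Lorenz-curve result for a Pareto Type I distribution,
    which is only defined when the mean is finite (alpha > 1).
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1 for a finite mean")
    return p ** (1 - 1 / alpha)


alpha_80_20 = math.log(5, 4)  # log base 4 of 5, about 1.16
print(round(alpha_80_20, 3), round(top_share(0.2, alpha_80_20), 3))
```

Other shape values give other splits, which is why the Pareto principle and the Pareto distribution are distinct concepts.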

Benford's Law
Benford's law, also known as the Newcomb–Benford law, the law of anomalous numbers, or the first-digit law, is an observation that in many real-life sets of numerical data, the leading digit is likely to be small (Arno Berger and Theodore P. Hill, "Benford's Law Strikes Back: No Simple Explanation in Sight for Mathematical Gem", 2011). In sets that obey the law, the number 1 appears as the leading significant digit about 30% of the time, while 9 appears as the leading significant digit less than 5% of the time. If the digits were distributed uniformly, they would each occur about 11.1% of the time. Benford's law also makes predictions about the distribution of second digits, third digits, digit combinations, and so on. The base-10 form of Benford's law is one of infinitely many cases of a generalized law regarding numbers expressed in arbitrary (integer) bases, which rules out the possibility that the phenomenon might be an artifact of the base-10 number syste ...
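The percentages quoted above follow from the usual base-10 statement of the law, P(d) = log10(1 + 1/d) for a leading digit d from 1 to 9. That formula is not given in the excerpt and is assumed here. A minimal sketch, with the hypothetical function name `benford_probability`:

```python
import math


def benford_probability(digit):
    """Probability that the leading significant digit equals `digit` (base 10)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


for digit in range(1, 10):
    print(digit, f"{benford_probability(digit):.1%}")

# The nine probabilities sum to 1; a uniform distribution would give 1/9 each.
print(f"total {sum(benford_probability(d) for d in range(1, 10)):.3f}")
```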


Bradford's Law
Bradford's law is a pattern first described by Samuel C. Bradford in 1934 that estimates the exponentially diminishing returns of searching for references in science journals. One formulation is that if journals in a field are sorted by number of articles into three groups, each with about one-third of all articles, then the number of journals in each group will be proportional to 1:n:n². There are a number of related formulations of the principle. In many disciplines this pattern is called a Pareto distribution. As a practical example, suppose that a researcher has five core scientific journals for his or her subject. Suppose that in a month there are 12 articles of interest in those journals. Suppose further that in order to find another dozen articles of interest, the researcher would have to go to an additional 10 journals. Then that researcher's Bradford multiplier b_m is 2 (i.e. 10/5). For each new dozen articles, that researcher will need to look in b_m time ...
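The practical example can be worked through in a few lines. This sketch assumes the 1:n:n² formulation, with the Bradford multiplier playing the role of n; the function name `journals_per_zone` is hypothetical.

```python
def journals_per_zone(core_journals, multiplier, zones):
    """Journals needed in each successive zone that yields the same number of articles."""
    return [core_journals * multiplier ** zone for zone in range(zones)]


# Values from the example: 5 core journals and a Bradford multiplier of 10 / 5 = 2.
core_journals = 5
multiplier = 10 // 5

per_zone = journals_per_zone(core_journals, multiplier, zones=3)
print(per_zone)       # journals to scan for the 1st, 2nd and 3rd dozen articles
print(sum(per_zone))  # total journals scanned to collect three dozen articles
```

Each further dozen articles costs a multiple of the journals needed for the previous dozen, which is the diminishing return the law describes.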


Heaps' Law
In linguistics, Heaps' law (also called Herdan's law) is an empirical law which describes the number of distinct words in a document (or set of documents) as a function of the document length (the so-called type-token relation). It can be formulated as

V_R(n) = K n^\beta

where V_R is the number of distinct words in an instance text of size ''n''. ''K'' and β are free parameters determined empirically. With English text corpora, typically ''K'' is between 10 and 100, and β is between 0.4 and 0.6. The law is frequently attributed to Harold Stanley Heaps, but was originally discovered by Gustav Herdan. Under mild assumptions, the Herdan–Heaps law is asymptotically equivalent to Zipf's law concerning the frequencies of individual words within a text. This is a consequence of the fact that the type-token relation (in general) of a homogeneous text can be derived from the distribution of its types. Heaps' law means that as more instance text is gathered, there will be diminishing returns in t ...
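The diminishing returns implied by Heaps' law are easy to see numerically. In this minimal sketch the function name `heaps_vocabulary` is hypothetical, and the parameter values K = 40 and β = 0.5 are assumptions picked from inside the typical English ranges quoted above, not estimates from a real corpus.

```python
def heaps_vocabulary(n, K, beta):
    """Predicted number of distinct words in a text of n tokens: K * n**beta."""
    return K * n ** beta


# Assumed parameters within the typical ranges for English text.
K, beta = 40, 0.5

for tokens in (10_000, 100_000, 1_000_000):
    distinct = heaps_vocabulary(tokens, K, beta)
    print(tokens, round(distinct), f"{distinct / tokens:.3f} distinct words per token")
```

With β = 0.5, a hundredfold increase in text length gives only a tenfold increase in predicted vocabulary.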


Brevity Law
In linguistics, the brevity law (also called Zipf's law of abbreviation) is a linguistic law that qualitatively states that the more frequently a word is used, the shorter that word tends to be, and vice versa; the less frequently a word is used, the longer it tends to be. This is a statistical regularity that can be found in natural languages and other natural systems and that claims to be a general rule. The brevity law was originally formulated by the linguist George Kingsley Zipf in 1945 as a negative correlation between the frequency of a word and its size. He analyzed a written corpus in American English and showed that the average lengths in terms of the average number of phonemes fell as the frequency of occurrence increased. Similarly, in a Latin corpus, he found a negative correlation between the number of syllables in a word and the frequency of its appearance. This observation says that the most frequent words in a language are the shortest, e.g. the most common words ...

Proteome
The proteome is the entire set of proteins that is, or can be, expressed by a genome, cell, tissue, or organism at a certain time. It is the set of expressed proteins in a given type of cell or organism, at a given time, under defined conditions. Proteomics is the study of the proteome.
Types of proteomes
While proteome generally refers to the proteome of an organism, multicellular organisms may have very different proteomes in different cells, hence it is important to distinguish proteomes in cells and organisms. A cellular proteome is the collection of proteins found in a particular cell type under a particular set of environmental conditions such as exposure to hormone stimulation. It can also be useful to consider an organism's complete proteome, which can be conceptualized as the complete set of proteins from all of the various cellular proteomes. This is very roughly the protein equivalent of the genome. The term ''proteome'' has also been used to refer to the collectio ...

Quantitative Linguistics
Quantitative linguistics (QL) is a sub-discipline of general linguistics and, more specifically, of mathematical linguistics. Quantitative linguistics deals with language learning, language change, and application as well as structure of natural languages. QL investigates languages using statistical methods; its most demanding objective is the formulation of language laws and, ultimately, of a general theory of language in the sense of a set of interrelated language laws. Synergetic linguistics was from its very beginning specifically designed for this purpose. QL is empirically based on the results of language statistics, a field which can be interpreted as statistics of languages or as statistics of any linguistic object. This field is not necessarily connected to substantial theoretical ambitions. Corpus linguistics and computational linguistics are other fields which contribute important empirical evidence.
History
The earliest QL approaches date back to the ancient Greek and ...

Genome
In the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA (or RNA in RNA viruses). The nuclear genome includes protein-coding genes and non-coding genes, other functional regions of the genome such as regulatory sequences (see non-coding DNA), and often a substantial fraction of 'junk' DNA with no evident function. Almost all eukaryotes have mitochondria and a small mitochondrial genome. Algae and plants also contain chloroplasts with a chloroplast genome. The study of the genome is called genomics. The genomes of many organisms have been sequenced and various regions have been annotated. The International Human Genome Project reported the sequence of the genome for ''Homo sapiens'' in 200... (The Human Genome Project), although the initial "finished" sequence was missing 8% of the genome consisting mostly of repetitive sequences. With advancements in technology that could handle sequenci ...

Chromosome
A chromosome is a long DNA molecule with part or all of the genetic material of an organism. In most chromosomes the very long thin DNA fibers are coated with packaging proteins; in eukaryotic cells the most important of these proteins are the histones. These proteins, aided by chaperone proteins, bind to and condense the DNA molecule to maintain its integrity. These chromosomes display a complex three-dimensional structure, which plays a significant role in transcriptional regulation. Chromosomes are normally visible under a light microscope only during the metaphase of cell division (where all chromosomes are aligned in the center of the cell in their condensed form). Before this happens, each chromosome is duplicated (S phase), and both copies are joined by a centromere, resulting either in an X-shaped structure, if the centromere is located equatorially, or a two-arm structure, if the centromere is located distally. The joined copies are now called si ...