Redundancy (Information Theory)

	Redundancy (Information Theory) In information theory, redundancy measures the fractional difference between the entropy $H(X)$ of an ensemble $X$ and its maximum possible value $\log(|\mathcal{A}_X|)$. Informally, it is the amount of wasted "space" used to transmit certain data. Data compression is a way to reduce or eliminate unwanted redundancy, while forward error correction is a way of adding desired redundancy for purposes of error detection and correction when communicating over a noisy channel of limited capacity. Quantitative definition: In describing the redundancy of raw data, the rate of a source of information is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the most general case of a stochastic process, it is $r = \lim_{n \to \infty} \frac{1}{n} H(M_1, M_2, \dots, M_n)$, the limit, as $n$ goes to infinity, of the joint entropy of the first $n$ symbols divided by $n$. It is common in information theory to speak of the "rate" or "entropy" of a language. Th ...
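As a rough illustration of the definition above, the sketch below computes the redundancy of a memoryless source as $1 - H(X)/\log_2 |\mathcal{A}_X|$. The function names `entropy_bits` and `redundancy` are hypothetical, the logarithm base is assumed to be 2, and the source is assumed to have at least two symbols.

```python
import math


def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete distribution given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def redundancy(probs):
    """Fractional gap between the entropy and its maximum, log2 of the alphabet size.

    Assumes a memoryless source whose alphabet has at least two symbols.
    """
    if len(probs) < 2:
        raise ValueError("need at least two symbols for a non-zero maximum entropy")
    max_entropy = math.log2(len(probs))
    return 1.0 - entropy_bits(probs) / max_entropy


# A uniform source wastes nothing; a skewed one over the same alphabet does.
print(redundancy([0.25, 0.25, 0.25, 0.25]))    # 0.0
print(redundancy([0.5, 0.25, 0.125, 0.125]))   # 0.125
```

For a source with memory, the per-symbol entropy would have to be replaced by the limiting rate $r$, which this sketch does not attempt to estimate.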
	Information Theory Information theory is the scientific study of the quantification, storage, and communication of information. The field was originally established by the works of Harry Nyquist and Ralph Hartley, in the 1920s, and Claude Shannon in the 1940s. The field is at the intersection of probability theory, statistics, computer science, statistical mechanics, information engineering, and electrical engineering. A key measure in information theory is entropy. Entropy quantifies the amount of uncertainty involved in the value of a random variable or the outcome of a random process. For example, identifying the outcome of a fair coin flip (with two equally likely outcomes) provides less information (lower entropy) than specifying the outcome from a roll of a die (with six equally likely outcomes). Some other important measures in information theory are mutual informat ...
	Hartley Function The Hartley function is a measure of uncertainty, introduced by Ralph Hartley in 1928. If a sample is picked uniformly at random from a finite set $A$, the information revealed after the outcome is known is given by the Hartley function $H_0(A) := \log_b |A|$, where $|A|$ denotes the cardinality of $A$. If the base of the logarithm is 2, then the unit of uncertainty is the shannon (more commonly known as the bit). If it is the natural logarithm, then the unit is the nat. Hartley used a base-ten logarithm, and with this base, the unit of information is called the hartley (also known as the ban or dit) in his honor. It is also known as the Hartley entropy. Hartley function, Shannon entropy, and Rényi entropy: The Hartley function coincides with the Shannon entropy (as well as with the Rényi entropies of all orders) in the case of a uniform probability distribution. It is a special case of the Rényi entropy since $H_0(X) = \frac{1}{1-0} \log \sum_{i=1}^{|X|} p_i^0 = \log |X|$. But it can als ...
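A minimal sketch of the Hartley function in the three units mentioned above, together with a check that it matches the Shannon entropy of a uniform distribution. The names `hartley` and `shannon_entropy` are illustrative only.

```python
import math


def hartley(n_outcomes, base=2):
    """Hartley function log_b |A| for a finite set with n_outcomes elements."""
    if n_outcomes < 1:
        raise ValueError("the set must contain at least one element")
    return math.log(n_outcomes, base)


def shannon_entropy(probs, base=2):
    """Shannon entropy of a discrete distribution, in units set by the log base."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)


die_faces = 6
print(hartley(die_faces, 2))        # uncertainty of a fair die in shannons (bits)
print(hartley(die_faces, math.e))   # the same uncertainty in nats
print(hartley(die_faces, 10))       # the same uncertainty in hartleys

# For a uniform distribution the two measures agree, up to rounding.
uniform = [1 / die_faces] * die_faces
print(math.isclose(hartley(die_faces), shannon_entropy(uniform)))
```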
	Source Coding Theorem In information theory, Shannon's source coding theorem (or noiseless coding theorem) establishes the limits to possible data compression, and the operational meaning of the Shannon entropy. Named after Claude Shannon, the source coding theorem shows that (in the limit, as the length of a stream of independent and identically distributed (i.i.d.) random variable data tends to infinity) it is impossible to compress the data such that the code rate (average number of bits per symbol) is less than the Shannon entropy of the source, without it being virtually certain that information will be lost. However, it is possible to get the code rate arbitrarily close to the Shannon entropy, with negligible probability of loss. The source coding theorem for symbol codes places an upper and a lower bound on the minimal possible expected length of codewords as a function of the entropy of the input word (which i ...
	Negentropy In information theory and statistics, negentropy is used as a measure of distance to normality. The concept and phrase "negative entropy" were introduced by Erwin Schrödinger in his 1944 popular-science book ''What is Life?'' Later, Léon Brillouin shortened the phrase to ''negentropy''. In 1974, Albert Szent-Györgyi proposed replacing the term ''negentropy'' with ''syntropy''. That term may have originated in the 1940s with the Italian mathematician Luigi Fantappiè, who tried to construct a unified theory of biology and physics. Buckminster Fuller tried to popularize this usage, but ''negentropy'' remains common. In a note to ''What is Life?'' Schrödinger explained his use of this phrase. Information theory: In information theory and statistics, negentropy is used as a measure of distance to normality. Out of all distributions with a given mean and variance, the normal or Gaussian distribution is the one with the highest entropy. Negentropy measures the difference in entrop ...
	Huffman Encoding In computer science and information theory, a Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code proceeds by means of Huffman coding, an algorithm developed by David A. Huffman while he was a Sc.D. student at MIT, and published in the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes". The output from Huffman's algorithm can be viewed as a variable-length code table for encoding a source symbol (such as a character in a file). The algorithm derives this table from the estimated probability or frequency of occurrence (''weight'') for each possible value of the source symbol. As in other entropy encoding methods, more common symbols are generally represented using fewer bits than less common symbols. Huffman's method can be efficiently implemented, finding a code in time linear in the number of input weights if these weights are sorted. However, although opt ...
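The idea that more common symbols receive shorter codewords can be sketched as follows. This is a heap-based variant that runs in $O(n \log n)$ time, not the linear-time method for pre-sorted weights mentioned above, and it returns only codeword lengths, not the codewords themselves. The function name `huffman_code_lengths` and the example weights are assumptions made for illustration.

```python
import heapq
import math
from itertools import count


def huffman_code_lengths(weights):
    """Return {symbol: codeword length in bits} for a dict of symbol -> positive weight."""
    if not weights:
        raise ValueError("need at least one symbol")
    if len(weights) == 1:
        # A lone symbol still needs one bit to be written down.
        return {next(iter(weights)): 1}
    tiebreak = count()  # unique counter so the heap never has to compare dicts
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol inside them sinks one level.
        w1, _, depths1 = heapq.heappop(heap)
        w2, _, depths2 = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_code_lengths(probs)
mean_length = sum(probs[s] * lengths[s] for s in probs)
entropy = -sum(p * math.log2(p) for p in probs.values())
print(lengths)               # {'a': 1, 'b': 2, 'c': 3, 'd': 3}
print(mean_length, entropy)  # 1.75 1.75
```

Because every probability in this example is a power of 1/2, the mean codeword length meets the entropy exactly; for other distributions it lies at or above the entropy.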
	Minimum Redundancy Coding In information theory, an entropy coding (or entropy encoding) is any lossless data compression method that attempts to approach the lower bound declared by Shannon's source coding theorem, which states that any lossless data compression method must have expected code length greater than or equal to the entropy of the source. More precisely, the source coding theorem states that for any source distribution, the expected code length satisfies $\mathbb{E}_{x \sim P}[\ell(d(x))] \geq \mathbb{E}_{x \sim P}[-\log_b(P(x))]$, where $\ell$ is the number of symbols in a code word, $d$ is the coding function, $b$ is the number of symbols used to make output codes and $P$ is the probability of the source symbol. An entropy coding attempts to approach this lower bound. Two of the most common entropy coding techniques are Huffman coding and arithmetic coding. If the approximate entropy characteristics of a data stream are known in advance (especially for signal compression), a simpler static code may be useful. These static codes incl ...
	Stationary Process In mathematics and statistics, a stationary process (or a strict/strictly stationary process or strong/strongly stationary process) is a stochastic process whose unconditional joint probability distribution does not change when shifted in time. Consequently, parameters such as mean and variance also do not change over time. If you draw a line through the middle of a stationary process then it should be flat; it may have 'seasonal' cycles, but overall it does not trend up or down. Since stationarity is an assumption underlying many statistical procedures used in time series analysis, non-stationary data are often transformed to become stationary. The most common cause of violation of stationarity is a trend in the mean, which can be due either to the presence of a unit root or of a deterministic trend. In the former case of a unit root, stochastic shocks have permanent effects, and the process is not mean-reverting. In the latter case of a deterministic trend, the process is called ...
	Ergodicity In mathematics, ergodicity expresses the idea that a point of a moving system, either a dynamical system or a stochastic process, will eventually visit all parts of the space that the system moves in, in a uniform and random sense. This implies that the average behavior of the system can be deduced from the trajectory of a "typical" point. Equivalently, a sufficiently large collection of random samples from a process can represent the average statistical properties of the entire process. Ergodicity is a property of the system; it is a statement that the system cannot be reduced or factored into smaller components. Ergodic theory is the study of systems possessing ergodicity. Ergodic systems occur in a broad range of systems in physics and in geometry. This can be roughly understood to be due to a common phenomenon: the motion of particles, that is, geodesics on a hyperbolic manifold are divergent; when that manifold is compact, that is, of finite size, those orbits return to the ...
	Expected Value In probability theory, the expected value (also called expectation, expectancy, mathematical expectation, mean, average, or first moment) is a generalization of the weighted average. Informally, the expected value is the arithmetic mean of a large number of independently selected outcomes of a random variable. The expected value of a random variable with a finite number of outcomes is a weighted average of all possible outcomes. In the case of a continuum of possible outcomes, the expectation is defined by integration. In the axiomatic foundation for probability provided by measure theory, the expectation is given by Lebesgue integration. The expected value of a random variable $X$ is often denoted by $\operatorname{E}(X)$, $\operatorname{E}[X]$, or $\operatorname{E}X$, with $\operatorname{E}$ also often stylized as $E$ or $\mathbb{E}$. History: The idea of the expected value originated in the middle of the 17th century from the study of the so-called problem of points, which seeks to divide the stakes ''in a fair way'' between two players, who have to end th ...
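For the finite case described above, the weighted average can be written out directly. The sketch below uses exact fractions and a fair six-sided die as an assumed example; `expected_value` is a hypothetical helper name.

```python
import math
from fractions import Fraction


def expected_value(distribution):
    """Weighted average of outcomes for a dict of outcome -> probability."""
    total = sum(distribution.values())
    if not math.isclose(total, 1):
        raise ValueError("probabilities must sum to 1")
    return sum(outcome * prob for outcome, prob in distribution.items())


# Fair die: each face 1..6 has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(fair_die))  # 7/2
```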
	Mutual Information In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. More specifically, it quantifies the "amount of information" (in units such as shannons (bits), nats or hartleys) obtained about one random variable by observing the other random variable. The concept of mutual information is intimately linked to that of entropy of a random variable, a fundamental notion in information theory that quantifies the expected "amount of information" held in a random variable. Not limited to real-valued random variables and linear dependence like the correlation coefficient, MI is more general and determines how different the joint distribution of the pair $(X,Y)$ is from the product of the marginal distributions of $X$ and $Y$. MI is the expected value of the pointwise mutual information (PMI). The quantity was defined and analyzed by Claude Shannon in his landmark paper "A Mathemati ...
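The comparison between the joint distribution and the product of the marginals can be made concrete for discrete variables using the standard formula $I(X;Y) = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)}$. The sketch below assumes the joint distribution is supplied as a dictionary keyed by $(x, y)$ pairs; the name `mutual_information` and the two toy distributions are illustrative.

```python
import math


def mutual_information(joint):
    """Mutual information in bits for a dict mapping (x, y) -> joint probability."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        # Accumulate the marginal distributions of X and Y.
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    # Expected value of the pointwise mutual information; zero-probability cells contribute nothing.
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
copied = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(independent))  # 0.0
print(mutual_information(copied))       # 1.0
```

Two independent fair bits share no information, while a bit and its exact copy share one full bit.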
	Data Compression Ratio Data compression ratio, also known as compression power, is a measurement of the relative reduction in size of data representation produced by a data compression algorithm. It is typically expressed as the division of uncompressed size by compressed size. Definition: Data compression ratio is defined as the ratio between the ''uncompressed size'' and ''compressed size'': $\text{Compression Ratio} = \frac{\text{Uncompressed Size}}{\text{Compressed Size}}$. Thus, a representation that compresses a file's storage size from 10 MB to 2 MB has a compression ratio of 10/2 = 5, often notated as an explicit ratio, 5:1 (read "five to one"), or as an implicit ratio, 5/1. This formulation applies equally for compression, where the uncompressed size is that of the original; and for decompression, where the uncompressed size is that of the reproduction. Sometimes the ''space saving'' is given instead, which is defined as the reduction in size relative to the uncompressed size: $\text{Space Saving} = 1 - \frac{\text{Compressed Size}}{\text{Uncompressed Size}}$. Thus, a representation that compresses the storage size of a file from 1 ...
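Both quantities are simple to compute. The sketch below applies them to the 10 MB to 2 MB figures from the compression ratio example above; the function names are hypothetical, and sizes may be in any unit as long as both arguments use the same one.

```python
def compression_ratio(uncompressed_size, compressed_size):
    """Uncompressed size divided by compressed size."""
    if compressed_size <= 0:
        raise ValueError("compressed size must be positive")
    return uncompressed_size / compressed_size


def space_saving(uncompressed_size, compressed_size):
    """Reduction in size relative to the uncompressed size."""
    if uncompressed_size <= 0:
        raise ValueError("uncompressed size must be positive")
    return 1 - compressed_size / uncompressed_size


print(compression_ratio(10, 2))  # 5.0, i.e. 5:1
print(space_saving(10, 2))       # 0.8, i.e. an 80% saving
```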