Explaining Away
The interaction information is a generalization of the mutual information for more than two variables. There are many names for interaction information, including ''amount of information'', ''information correlation'', ''co-information'', and simply ''mutual information''. Interaction information expresses the amount of information (redundancy or synergy) bound up in a set of variables, ''beyond'' that which is present in any subset of those variables. Unlike the mutual information, the interaction information can be either positive or negative. These functions, their negativity and minima have a direct interpretation in algebraic topology.

Definition

The conditional mutual information can be used to inductively define the interaction information for any finite number of variables as follows:
:I(X_1;\ldots;X_{n+1}) = I(X_1;\ldots;X_n) - I(X_1;\ldots;X_n\mid X_{n+1}),
where
:I(X_1;\ldots;X_n \mid X_{n+1}) = \mathbb{E}_{X_{n+1}}\big(I(X_1;\ldots;X_n) \mid X_{n+1}\big).
Some authors define the interaction inf ...
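As a concrete illustration of the inductive definition above, the following Python sketch computes the three-variable interaction information I(X;Y;Z) = I(X;Y) - I(X;Y|Z) from a joint probability table. The XOR-style joint distribution and all function names are hypothetical choices for the example, not taken from the article; the distribution is picked so that the interaction information comes out negative, which is the "explaining away" situation.

# Sketch: interaction information I(X;Y;Z) = I(X;Y) - I(X;Y|Z)
# for three binary variables given as a joint probability table.
# The XOR-style distribution below is a hypothetical example chosen
# to produce negative interaction information ("explaining away").
from collections import defaultdict
from math import log2

# p[(x, y, z)] = joint probability; here Z = X XOR Y with fair, independent X, Y.
p = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}

def marginal(p, keep):
    """Marginalize the joint table onto the coordinate indices in `keep`."""
    m = defaultdict(float)
    for xs, pr in p.items():
        m[tuple(xs[i] for i in keep)] += pr
    return dict(m)

def mutual_information(p, a, b):
    """I(A;B) from the joint table, where a and b are coordinate indices."""
    pab, pa, pb = marginal(p, [a, b]), marginal(p, [a]), marginal(p, [b])
    return sum(pr * log2(pr / (pa[(xa,)] * pb[(xb,)]))
               for (xa, xb), pr in pab.items() if pr > 0)

def conditional_mi(p, a, b, c):
    """I(A;B|C) = sum over c of p(c) * I(A;B | C=c)."""
    pc = marginal(p, [c])
    total = 0.0
    for (zc,), pz in pc.items():
        cond = {k: v / pz for k, v in p.items() if k[c] == zc and v > 0}
        total += pz * mutual_information(cond, a, b)
    return total

# Interaction information by the inductive definition (n = 2 case).
ii = mutual_information(p, 0, 1) - conditional_mi(p, 0, 1, 2)
print(ii)  # -1.0 bit: observing Z makes X and Y dependent (explaining away)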



Mutual Information
In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. More specifically, it quantifies the "amount of information" (in units such as shannons (bits), nats or hartleys) obtained about one random variable by observing the other random variable. The concept of mutual information is intimately linked to that of entropy of a random variable, a fundamental notion in information theory that quantifies the expected "amount of information" held in a random variable. Not limited to real-valued random variables and linear dependence like the correlation coefficient, MI is more general and determines how different the joint distribution of the pair (X,Y) is from the product of the marginal distributions of X and Y. MI is the expected value of the pointwise mutual information (PMI). The quantity was defined and analyzed by Claude Shannon in his landmark paper "A Mathemati ...
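The excerpt notes that MI is the expected value of the pointwise mutual information (PMI); the short Python sketch below computes MI exactly that way. The joint table for two binary variables is a hypothetical example, not from the article.

# Sketch: mutual information as the expected pointwise mutual information (PMI).
from math import log2

# p[(x, y)] = joint probability of the pair (X, Y); a hypothetical example.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

px = {x: sum(v for (a, _), v in p.items() if a == x) for x in (0, 1)}
py = {y: sum(v for (_, b), v in p.items() if b == y) for y in (0, 1)}

def pmi(x, y):
    """Pointwise mutual information log2 [ p(x,y) / (p(x) p(y)) ]."""
    return log2(p[(x, y)] / (px[x] * py[y]))

# MI is the expectation of PMI under the joint distribution.
mi = sum(pr * pmi(x, y) for (x, y), pr in p.items() if pr > 0)
print(mi)  # ≈ 0.278 bits for this example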



Algebraic Topology
Algebraic topology is a branch of mathematics that uses tools from abstract algebra to study topological spaces. The basic goal is to find algebraic invariants that classify topological spaces up to homeomorphism, though usually most classify up to homotopy equivalence. Although algebraic topology primarily uses algebra to study topological problems, using topology to solve algebraic problems is sometimes also possible. Algebraic topology, for example, allows for a convenient proof that any subgroup of a free group is again a free group.

Main branches of algebraic topology

Below are some of the main areas studied in algebraic topology:

Homotopy groups

In mathematics, homotopy groups are used in algebraic topology to classify topological spaces. The first and simplest homotopy group is the fundamental group, which records information about loops in a space. Intuitively, homotopy gro ...


Conditional Mutual Information
In probability theory, particularly information theory, the conditional mutual information is, in its most basic form, the expected value of the mutual information of two random variables given the value of a third.

Definition

For random variables X, Y, and Z with support sets \mathcal{X}, \mathcal{Y} and \mathcal{Z}, we define the conditional mutual information as
:I(X;Y\mid Z) = \int_{\mathcal{Z}} D_{\mathrm{KL}}\big( P_{(X,Y)\mid Z} \,\|\, P_{X\mid Z} \otimes P_{Y\mid Z} \big)\, dP_Z.
This may be written in terms of the expectation operator:
:I(X;Y\mid Z) = \mathbb{E}_Z\big[ D_{\mathrm{KL}}\big( P_{(X,Y)\mid Z} \,\|\, P_{X\mid Z} \otimes P_{Y\mid Z} \big) \big].
Thus I(X;Y\mid Z) is the expected (with respect to Z) Kullback–Leibler divergence from the conditional joint distribution P_{(X,Y)\mid Z} to the product of the conditional marginals P_{X\mid Z} and P_{Y\mid Z}. Compare with the definition of mutual information.

In terms of PMFs for discrete distributions

For discrete random variables X, Y, and Z with support sets \mathcal{X}, \mathcal{Y} and \mathcal{Z}, the conditional mutual information I(X;Y\mid Z) is as follows:
:I(X;Y\mid Z) = \sum_{z\in\mathcal{Z}} p_Z(z) \sum_{y\in\mathcal{Y}} \sum_{x\in\mathcal{X}} p_{X,Y\mid Z}(x,y\mid z) \log \frac{p_{X,Y\mid Z}(x,y\mid z)}{p_{X\mid Z}(x\mid z)\, p_{Y\mid Z}(y\mid z)}
where the marginal, joint, ...
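As a worked example of the discrete formula, the Python sketch below evaluates the triple sum directly from a joint table p(x, y, z). The table (Z equal to the logical OR of two fair, independent bits) and the function name are illustrative assumptions, not part of the article.

# Sketch: conditional mutual information I(X;Y|Z) computed directly from the
# triple-sum PMF formula above. The joint table p[(x, y, z)] is a hypothetical
# example (Z = X OR Y with fair, independent X and Y).
from collections import defaultdict
from math import log2

p = {(x, y, x | y): 0.25 for x in (0, 1) for y in (0, 1)}

def conditional_mutual_information(p):
    # Marginals needed by the formula: p_Z, p_{X,Z}, p_{Y,Z}.
    pz, pxz, pyz = defaultdict(float), defaultdict(float), defaultdict(float)
    for (x, y, z), pr in p.items():
        pz[z] += pr
        pxz[(x, z)] += pr
        pyz[(y, z)] += pr
    total = 0.0
    for (x, y, z), pr in p.items():
        if pr > 0:
            # p(z) * p(x,y|z) * log [ p(x,y|z) / (p(x|z) p(y|z)) ]
            pxy_z = pr / pz[z]
            px_z = pxz[(x, z)] / pz[z]
            py_z = pyz[(y, z)] / pz[z]
            total += pz[z] * pxy_z * log2(pxy_z / (px_z * py_z))
    return total

print(conditional_mutual_information(p))  # ≈ 0.189 bits for this example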


Symmetric Function
In mathematics, a function of n variables is symmetric if its value is the same no matter the order of its arguments. For example, a function f\left(x_1,x_2\right) of two arguments is a symmetric function if and only if f\left(x_1,x_2\right) = f\left(x_2,x_1\right) for all x_1 and x_2 such that \left(x_1,x_2\right) and \left(x_2,x_1\right) are in the domain of f. The most commonly encountered symmetric functions are polynomial functions, which are given by the symmetric polynomials. A related notion is alternating polynomials, which change sign under an interchange of variables. Aside from polynomial functions, tensors that act as functions of several vectors can be symmetric, and in fact the space of symmetric k-tensors on a vector space V is isomorphic to the space of homogeneous polynomials of degree k on V. Symmetric functions should not be confused with even and odd functions, which have a different sort of symmetry. Symmetrization Given any function f in n variables wi ...
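A minimal Python sketch of the defining property, together with a symmetrization-by-summing-over-permutations construction of the kind the truncated last sentence alludes to; the sample functions are hypothetical.

# Sketch: checking f(x1, x2) == f(x2, x1) and building a symmetric function by
# summing a given function over all orderings of its arguments (a standard
# symmetrization construction; the sample functions are made up for the example).
from itertools import permutations

def f_sym(x1, x2):
    return x1 * x2 + x1 + x2      # symmetric: swapping arguments changes nothing

def f_asym(x1, x2):
    return x1 - 2 * x2            # not symmetric

def is_symmetric(f, *args):
    """True if f takes the same value for every ordering of the given arguments."""
    return len({f(*perm) for perm in permutations(args)}) == 1

def symmetrize(f):
    """Return the function obtained by summing f over all argument orderings."""
    return lambda *args: sum(f(*perm) for perm in permutations(args))

print(is_symmetric(f_sym, 2, 5))     # True
print(is_symmetric(f_asym, 2, 5))    # False
print(symmetrize(f_asym)(2, 5))      # (2 - 10) + (5 - 4) = -7, same for any ordering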




Kirkwood Approximation
The Kirkwood superposition approximation was introduced in 1935 by John G. Kirkwood as a means of representing a discrete probability distribution. The Kirkwood approximation for a discrete probability density function P(x_1,x_2,\ldots,x_n) is given by
:P^\prime(x_1,x_2,\ldots,x_n) = \prod_{i=1}^{n-1}\left[\prod_{\mathcal{T}_i\subseteq\mathcal{V}} p(\mathcal{T}_i)\right]^{(-1)^{n-1-i}}
where
:\prod_{\mathcal{T}_i\subseteq\mathcal{V}} p(\mathcal{T}_i)
is the product of probabilities over all subsets of variables of size ''i'' in variable set \mathcal{V}. This kind of formula has been considered by Watanabe (1960) and, according to Watanabe, also by Robert Fano. For the three-variable case, it reduces to simply
:P^\prime(x_1,x_2,x_3)=\frac{p(x_1,x_2)\,p(x_2,x_3)\,p(x_1,x_3)}{p(x_1)\,p(x_2)\,p(x_3)}
The Kirkwood approximation does not generally produce a valid probability distribution (the normalization condition is violated). Watanabe claims that for this reason informational expressions of this type are not meaningful, and indeed there has been very little written about the properties of this measure. The Kirkwood approxi ...
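The three-variable reduction is easy to check numerically. The Python sketch below builds the Kirkwood approximation from the pairwise and single-variable marginals of a hypothetical joint table and sums it over all states, illustrating that the result need not be normalized.

# Sketch: the three-variable Kirkwood approximation
#   P'(x1,x2,x3) = p(x1,x2) p(x2,x3) p(x1,x3) / (p(x1) p(x2) p(x3)),
# evaluated for a hypothetical joint over three binary variables, to show that
# the result need not sum to 1 (normalization is violated).
from collections import defaultdict
from itertools import product

# Hypothetical joint distribution p[(x1, x2, x3)].
p = {(0, 0, 0): 0.30, (0, 0, 1): 0.05, (0, 1, 0): 0.10, (0, 1, 1): 0.05,
     (1, 0, 0): 0.05, (1, 0, 1): 0.10, (1, 1, 0): 0.05, (1, 1, 1): 0.30}

def marginal(p, keep):
    """Marginalize the joint table onto the coordinate indices in `keep`."""
    m = defaultdict(float)
    for xs, pr in p.items():
        m[tuple(xs[i] for i in keep)] += pr
    return m

p12, p23, p13 = marginal(p, [0, 1]), marginal(p, [1, 2]), marginal(p, [0, 2])
p1, p2, p3 = marginal(p, [0]), marginal(p, [1]), marginal(p, [2])

def kirkwood(x1, x2, x3):
    return (p12[(x1, x2)] * p23[(x2, x3)] * p13[(x1, x3)] /
            (p1[(x1,)] * p2[(x2,)] * p3[(x3,)]))

total = sum(kirkwood(*xs) for xs in product((0, 1), repeat=3))
print(total)  # ≈ 1.096 for this table, not 1: the normalization condition fails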



Binary Data
Binary data is data whose unit can take on only two possible states. These are often labelled as 0 and 1 in accordance with the binary numeral system and Boolean algebra. Binary data occurs in many different technical and scientific fields, where it can be called by different names including ''bit'' (binary digit) in computer science, ''truth value'' in mathematical logic and related domains and ''binary variable'' in statistics.

Mathematical and combinatoric foundations

A discrete variable that can take only one state contains zero information, and 2 is the next natural number after 1. That is why the bit, a variable with only two possible values, is a standard primary unit of information. A collection of n bits may have 2^n states: see binary number for details. The number of states of a collection of discrete variables depends exponentially on the number of variables, and only as a power law on the number of states of each variable. Ten bits have more (1024) states than three decimal digits ...
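A two-line Python check of the closing comparison, assuming the standard 2^n count of states for n bits:

# Sketch: number of states of n two-valued variables versus d ten-valued digits.
n_bits, n_digits = 10, 3
print(2 ** n_bits)                   # 1024 states for ten bits
print(10 ** n_digits)                # 1000 states for three decimal digits
print(2 ** n_bits > 10 ** n_digits)  # True: ten bits cover more states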




Dual Total Correlation
In information theory, dual total correlation (Han 1978), information rate (Dubnov 2006), excess entropy (Olbrich 2008), or binding information (Abdallah and Plumbley 2010) is one of several known non-negative generalizations of mutual information. While total correlation is bounded by the sum of the entropies of the ''n'' elements, the dual total correlation is bounded by the joint entropy of the ''n'' elements. Although well behaved, dual total correlation has received much less attention than the total correlation. A measure known as "TSE-complexity" defines a continuum between the total correlation and dual total correlation (Ay 2001).

Definition

For a set of ''n'' random variables \{X_1,\ldots,X_n\}, the dual total correlation D(X_1,\ldots,X_n) is given by
:D(X_1,\ldots,X_n) = H\left( X_1, \ldots, X_n \right) - \sum_{i=1}^n H\left( X_i \mid X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n \right),
where H(X_1,\ldots,X_n) is the joint entropy of the variable set \{X_1,\ldots,X_n\} and H(X_i \mid \cdots) is the conditional entropy ...
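A small Python sketch of the definition, using the identity H(X_i | rest) = H(X_1,...,X_n) - H(rest) to obtain the conditional entropies from marginal tables; the three-variable joint table is a hypothetical example.

# Sketch: dual total correlation D(X1,...,Xn) = H(X1,...,Xn) - sum_i H(Xi | rest),
# computed from a joint probability table over n discrete variables.
from collections import defaultdict
from math import log2

# Hypothetical joint distribution over three binary variables.
p = {(0, 0, 0): 0.35, (0, 1, 1): 0.15, (1, 0, 1): 0.15, (1, 1, 0): 0.35}

def entropy(table):
    """Shannon entropy (in bits) of a probability table."""
    return -sum(pr * log2(pr) for pr in table.values() if pr > 0)

def marginal(p, keep):
    """Marginalize the joint table onto the coordinate indices in `keep`."""
    m = defaultdict(float)
    for xs, pr in p.items():
        m[tuple(xs[i] for i in keep)] += pr
    return m

def dual_total_correlation(p):
    n = len(next(iter(p)))
    joint = entropy(p)
    # H(Xi | rest) = H(X1,...,Xn) - H(rest), summed over i.
    cond_sum = sum(joint - entropy(marginal(p, [j for j in range(n) if j != i]))
                   for i in range(n))
    return joint - cond_sum

print(dual_total_correlation(p))  # ≈ 1.88 bits here; non-negative in general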