Backpropagation Through Structure
Backpropagation through structure (BPTS) is a gradient-based technique for training recursive neural networks, proposed in a 1996 paper by Christoph Goller and Andreas Küchler.
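Modern automatic-differentiation libraries make the idea easy to illustrate. The sketch below is a minimal, hypothetical example, not the original 1996 implementation: it uses PyTorch to apply one shared weight matrix W at every node of a small tree, then backpropagates a scalar loss from the root. The gradient of W accumulates contributions from every node at which W was applied, which is what backpropagation through structure computes.

```python
import torch

# Shared composition weights: one n x 2n matrix reused at every node.
n = 4
W = torch.randn(n, 2 * n, requires_grad=True)

def compose(left, right):
    # Parent representation p = tanh(W [c1; c2]), with shared W.
    return torch.tanh(W @ torch.cat([left, right]))

# Toy tree ((a b) c) with random leaf vectors (placeholders for
# learned embeddings).
a, b, c = (torch.randn(n, requires_grad=True) for _ in range(3))
root = compose(compose(a, b), c)

# A scalar loss at the root; backward() walks the tree in reverse
# topological order and sums W's gradient over both of its uses.
loss = root.sum() ** 2
loss.backward()
print(W.grad.shape)  # torch.Size([4, 8])
```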
Gradient Method
In optimization, a gradient method is an algorithm to solve problems of the form

\min_{x \in \mathbb{R}^n}\; f(x)

with the search directions defined by the gradient of the function at the current point. Examples of gradient methods are gradient descent and the conjugate gradient method; a minimal gradient-descent sketch appears after the list below.

See also
* Gradient descent
* Stochastic gradient descent
* Coordinate descent
* Frank–Wolfe algorithm
* Landweber iteration
* Random coordinate descent
* Conjugate gradient method
* Derivation of the conjugate gradient method
* Nonlinear conjugate gradient method
* Biconjugate gradient method
* Biconjugate gradient stabilized method
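As promised above, here is a minimal sketch of the simplest gradient method, gradient descent with a fixed step size, applied to a hypothetical quadratic objective f(x) = \tfrac{1}{2} x^\top A x - b^\top x, whose gradient is Ax - b. The matrix, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

# Hypothetical objective: f(x) = 0.5 * x.T @ A @ x - b @ x,
# with gradient  grad_f(x) = A @ x - b.  A is symmetric positive
# definite, so the unique minimizer solves A x = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])

def grad_f(x):
    return A @ x - b

x = np.zeros(2)   # starting point
step = 0.1        # fixed step size (a line search is also common)
for _ in range(200):
    x = x - step * grad_f(x)   # move along the negative gradient

print(x)                      # iterate after 200 steps
print(np.linalg.solve(A, b))  # exact minimizer, for comparison
```

Other gradient methods differ mainly in how they choose the search direction and step size; conjugate gradient, for instance, combines the current gradient with previous search directions rather than following the raw gradient at each step.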
Recursive Neural Network
A recursive neural network is a kind of deep neural network created by applying the same set of weights recursively over a structured input, to produce a structured prediction over variable-size input structures, or a scalar prediction on it, by traversing a given structure in topological order. These networks were first introduced to learn distributed representations of structure (such as logical terms), but have been successful in multiple applications, for instance in learning sequence and tree structures in natural language processing (mainly continuous representations of phrases and sentences based on word embeddings).

Architectures

Basic

In the simplest architecture, nodes are combined into parents using a weight matrix (which is shared across the whole network) and a non-linearity such as the \tanh hyperbolic function. If c_1 and c_2 are n-dimensional vector representations of nodes, their parent will also be an n-dimensional vector, computed as

p_{1,2} = \tanh\left(W [c_1; c_2]\right)

where W is a learned n \times 2n weight matrix and [c_1; c_2] is the concatenation of the two child vectors.
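To make the shared-weight composition concrete, here is a minimal NumPy sketch of the basic architecture (the dimensions and tree shape are illustrative assumptions): the same matrix W maps any pair of child vectors to a parent vector of the same dimensionality, so the rule can be applied bottom-up to a tree of any size.

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
W = rng.standard_normal((n, 2 * n))  # shared n x 2n weight matrix

def parent(c1, c2):
    # p = tanh(W [c1; c2]): concatenate the children, map back to n dims.
    return np.tanh(W @ np.concatenate([c1, c2]))

# Compose the tree ((c1 c2) c3) in topological (bottom-up) order.
c1, c2, c3 = rng.standard_normal((3, n))
root = parent(parent(c1, c2), c3)
print(root.shape)  # (3,) -- every node lives in the same n-dim space
```

Because the output of `parent` has the same shape as its inputs, the same weights can be reused at every internal node, which is what makes the network recursive.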