A recursive neural network is a kind of deep neural network created by applying the same set of weights recursively over a structured input, to produce a structured prediction over variable-size input structures, or a scalar prediction on it, by traversing a given structure in topological order. Recursive neural networks, sometimes abbreviated as RvNNs, have been successful, for instance, in learning sequence and tree structures in

natural language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to pro ...

, mainly phrase and sentence continuous representations based on word embedding. RvNNs have first been introduced to learn

distributed representation Artificial neural networks (ANNs), usually simply called neural networks (NNs) or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected unit ...

s of structure, such as logical terms. Models and general frameworks have been developed in further works since the 1990s.

Architectures

Basic

In the most simple architecture, nodes are combined into parents using a weight matrix that is shared across the whole network, and a non-linearity such as '' tanh''. If ''c''₁ and ''c''₂ are ''n''-dimensional vector representation of nodes, their parent will also be an ''n''-dimensional vector, calculated as

p_ = \tanh\left(W_1 ; c_2 right)

Where ''W'' is a learned

n\times 2n

weight matrix. This architecture, with a few improvements, has been used for successfully parsing natural scenes, syntactic parsing of natural language sentences, and recursive autoencoding and generative modeling of 3D shape structures in the form of cuboid abstractions.

Recursive cascade correlation (RecCC)

RecCC is a constructive neural network approach to deal with tree domains with pioneering applications to chemistry and extension to directed acyclic graphs.

Unsupervised RNN

A framework for unsupervised RNN has been introduced in 2004.

Tensor

Recursive neural tensor networks use one, tensor-based composition function for all nodes in the tree.

Training

Stochastic gradient descent

Typically,

stochastic gradient descent Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of ...

(SGD) is used to train the network. The gradient is computed using

backpropagation through structure Backpropagation through structure (BPTS) is a gradient-based technique for training recursive neural nets (a superset of recurrent neural nets) and is extensively described in a 1996 paper written by Christoph Goller and Andreas Küchler. Refere ...

(BPTS), a variant of backpropagation through time used for recurrent neural networks.

Properties

Universal approximation capability of RNN over trees has been proved in literature.

Related models

Recurrent neural networks

Recurrent neural networks are recursive artificial neural networks with a certain structure: that of a linear chain. Whereas recursive neural networks operate on any hierarchical structure, combining child representations into parent representations, recurrent neural networks operate on the linear progression of time, combining the previous time step and a hidden representation into the representation for the current time step.

Tree Echo State Networks

An efficient approach to implement recursive neural networks is given by the Tree Echo State Network within the reservoir computing paradigm.

Extension to graphs

Extensions to graphs include graph neural network (GNN), Neural Network for Graphs (NN4G), and more recently convolutional neural networks for graphs.

References

Artificial neural networks {{compu-AI-stub