Bidirectional Recurrent Neural Networks
   HOME

TheInfoList



OR:

Bidirectional
recurrent neural networks A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes can create a cycle, allowing output from some nodes to affect subsequent input to the same nodes. This allows it to exhibit temporal dynamic ...
(BRNN) connect two hidden layers of opposite directions to the same output. With this form of generative deep learning, the output layer can get information from past (backwards) and future (forward) states simultaneously. Invented in 1997 by Schuster and Paliwal,Schuster, Mike, and Kuldip K. Paliwal.
Bidirectional recurrent neural networks
" Signal Processing, IEEE Transactions on 45.11 (1997): 2673-2681.2. Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan
BRNNs were introduced to increase the amount of input information available to the network. For example,
multilayer perceptron A multilayer perceptron (MLP) is a fully connected class of feedforward artificial neural network (ANN). The term MLP is used ambiguously, sometimes loosely to mean ''any'' feedforward ANN, sometimes strictly to refer to networks composed of mu ...
(MLPs) and
time delay neural network Time delay neural network (TDNN) Alexander Waibel, Tashiyuki Hanazawa, Geoffrey Hinton, Kiyohito Shikano, Kevin J. Lang, Phoneme Recognition Using Time-Delay Neural Networks', IEEE Transactions on Acoustics, Speech, and Signal Processing, Volume 3 ...
(TDNNs) have limitations on the input data flexibility, as they require their input data to be fixed. Standard
recurrent neural network A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes can create a cycle, allowing output from some nodes to affect subsequent input to the same nodes. This allows it to exhibit temporal dynamic ...
(RNNs) also have restrictions as the future input information cannot be reached from the current state. On the contrary, BRNNs do not require their input data to be fixed. Moreover, their future input information is reachable from the current state. BRNN are especially useful when the context of the input is needed. For example, in
handwriting recognition Handwriting recognition (HWR), also known as handwritten text recognition (HTR), is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other dev ...
, the performance can be enhanced by knowledge of the letters located before and after the current letter.


Architecture

The principle of BRNN is to split the neurons of a regular RNN into two directions, one for positive time direction (forward states), and another for negative time direction (backward states). Those two states' output are not connected to inputs of the opposite direction states. The general structure of RNN and BRNN can be depicted in the right diagram. By using two time directions, input information from the past and future of the current time frame can be used unlike standard RNN which requires the delays for including future information.


Training

BRNNs can be trained using similar algorithms to RNNs, because the two directional neurons do not have any interactions. However, when back-propagation through time is applied, additional processes are needed because updating input and output layers cannot be done at once. General procedures for training are as follows: For forward pass, forward states and backward states are passed first, then output neurons are passed. For backward pass, output neurons are passed first, then forward states and backward states are passed next. After forward and backward passes are done, the weights are updated.


Applications

Applications of BRNN include : *Speech Recognition (Combined with
Long short-term memory Long short-term memory (LSTM) is an artificial neural network used in the fields of artificial intelligence and deep learning. Unlike standard feedforward neural networks, LSTM has feedback connections. Such a recurrent neural network (RNN) ca ...
) *Translation *Handwritten Recognition *Protein Structure Prediction *Part-of-speech tagging *Dependency Parsing *Entity Extraction


References

{{reflist


External links



Implementation of BRNN/LSTM in Python with Theano Neural network architectures