A Neural Turing machine (NTM) is a recurrent neural network model of a Turing machine. The approach was published by Alex Graves et al. in 2014.
NTMs combine the fuzzy pattern matching capabilities of neural networks with the algorithmic power of programmable computers.
An NTM has a neural network controller coupled to external memory resources, which it interacts with through attentional mechanisms. The memory interactions are differentiable end-to-end, making it possible to optimize them using gradient descent.
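The following is a minimal sketch of how such differentiable memory access can be realised, assuming a PyTorch-style tensor API; the names and shapes are illustrative, and the location-based addressing steps of the original paper (interpolation, shifting, sharpening) are omitted.

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def content_addressing(memory, key, beta):
    # memory: (N, M) matrix of N slots of width M; key: (M,) query vector
    # emitted by the controller; beta: scalar key strength that sharpens focus.
    similarity = F.cosine_similarity(memory, key.unsqueeze(0), dim=1)  # (N,)
    return torch.softmax(beta * similarity, dim=0)                     # soft attention over slots

def read(memory, weights):
    # Differentiable read: a weighted sum of memory rows, so gradients flow
    # back into both the attention weights and the memory contents.
    return weights @ memory                                            # (M,)

def write(memory, weights, erase, add):
    # Differentiable write: each slot is partially erased and then augmented,
    # in proportion to its attention weight.
    memory = memory * (1 - torch.outer(weights, erase))
    return memory + torch.outer(weights, add)
</syntaxhighlight>

In the original formulation these operations are carried out by separate read and write heads at every time step, with the key, key strength, erase and add vectors all emitted by the controller network.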
An NTM with a long short-term memory (LSTM) network controller can infer simple algorithms such as copying, sorting, and associative recall from examples alone.
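As an illustration of what learning "from examples alone" means, training data for the copy task can be generated roughly as follows; this is a sketch assuming PyTorch, and the sequence length, vector width and delimiter encoding are illustrative choices rather than the exact setup of the paper.

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def copy_task_batch(batch_size=32, seq_len=10, width=8):
    # Random binary vectors that the network must reproduce after a delimiter.
    seq = torch.randint(0, 2, (batch_size, seq_len, width)).float()
    delimiter = torch.zeros(batch_size, 1, width + 1)
    delimiter[:, :, -1] = 1.0                    # extra channel flags the end of the input
    inputs = torch.cat([F.pad(seq, (0, 1)), delimiter], dim=1)
    targets = seq                                # desired output: the input sequence itself
    return inputs, targets
</syntaxhighlight>

The network is then trained with an ordinary sequence loss (for example, per-bit cross-entropy) on such input/target pairs, with no supervision about the copying procedure itself.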
The authors of the original NTM paper did not publish their source code.
The first stable open-source implementation was published in 2018 at the 27th International Conference on Artificial Neural Networks, receiving a best-paper award. Other open-source implementations of NTMs exist, but as of 2018 they were not sufficiently stable for production use. The developers of these implementations either report that the gradients sometimes become NaN during training for unknown reasons, causing training to fail; report slow convergence; or do not report the learning speed of their implementation.
Differentiable neural computers are an outgrowth of Neural Turing machines, with attention mechanisms that control where the memory is active, improving performance.