A neural network is a network or circuit of biological neurons, or, in a modern sense, an artificial neural network, composed of artificial neurons or nodes. Thus, a neural network is either a biological neural network, made up of biological neurons, or an artificial neural network, used for solving artificial intelligence (AI) problems. The connections of the biological neuron are modeled in artificial neural networks as weights between nodes. A positive weight reflects an excitatory connection, while negative values mean inhibitory connections. All inputs are modified by a weight and summed. This activity is referred to as a linear combination. Finally, an activation function controls the amplitude of the output. For example, an acceptable range of output is usually between 0 and 1, or it could be −1 and 1.
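As a minimal sketch of the computation described above, the following Python code forms the weighted sum of the inputs and passes it through an activation function. The function names, the example weights, and the choice of the logistic sigmoid and hyperbolic tangent as activations are illustrative assumptions rather than part of the description; they are simply two common functions whose outputs lie between 0 and 1 and between −1 and 1 respectively.

```python
import math


def linear_combination(inputs, weights):
    """Scale each input by its weight and sum the results."""
    return sum(w * x for w, x in zip(weights, inputs))


def neuron_output(inputs, weights, activation="sigmoid"):
    """Weighted sum followed by an activation that bounds the output."""
    z = linear_combination(inputs, weights)
    if activation == "sigmoid":
        return 1.0 / (1.0 + math.exp(-z))  # output lies strictly between 0 and 1
    if activation == "tanh":
        return math.tanh(z)  # output lies strictly between -1 and 1
    raise ValueError(f"unknown activation: {activation}")


# Hypothetical example: one excitatory (positive) and one inhibitory (negative) weight.
example_inputs = [1.0, 0.5]
example_weights = [2.0, -1.0]
print(linear_combination(example_inputs, example_weights))
print(neuron_output(example_inputs, example_weights, "sigmoid"))
print(neuron_output(example_inputs, example_weights, "tanh"))
```

Swapping the activation changes only the output range; the linear combination that precedes it is the same in both cases.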
These artificial networks may be used for predictive modeling, adaptive control and applications where they can be trained via a dataset. Self-learning resulting from experience can occur within networks, which can derive conclusions from a complex and seemingly unrelated set of information.
Overview
A biological neural network is composed of a group of chemically connected or functionally associated neurons. A single neuron may be connected to many other neurons and the total number of neurons and connections in a network may be extensive. Connections, called synapses, are usually formed from axons to dendrites, though dendrodendritic synapses and other connections are possible. Apart from electrical signalling, there are other forms of signalling that arise from neurotransmitter diffusion.
Artificial intelligence, cognitive modelling, and neural networks are information processing paradigms inspired by how biological neural systems process data. Artificial intelligence and cognitive modelling try to simulate some properties of biological neural networks. In the artificial intelligence field, artificial neural networks have been applied successfully to speech recognition, image analysis and adaptive control, in order to construct software agents (in computer and video games) or autonomous robots.
Historically, digital computers evolved from the von Neumann model, and operate by executing explicit instructions, with a number of processors accessing memory. On the other hand, the origins of neural networks are based on efforts to model information processing in biological systems. Unlike the von Neumann model, neural network computing does not separate memory and processing.
Neural network theory has served to identify better how the neurons in the brain function and provide the basis for efforts to create artificial intelligence.
History
The preliminary theoretical base for contemporary neural networks was independently proposed by Alexander Bain (1873) and William James (1890). In their work, both thoughts and body activity resulted from interactions among neurons within the brain.
For Bain, every activity led to the firing of a certain set of neurons. When activities were repeated, the connections between those neurons strengthened. According to his theory, this repetition was what led to the formation of memory. The general scientific community at the time was skeptical of Bain's theory because it required what appeared to be an inordinate number of neural connections within the brain. It is now apparent that the brain is exceedingly complex and that the same brain “wiring” can handle multiple problems and inputs.
James's theory was similar to Bain's; however, he suggested that memories and actions resulted from electrical currents flowing among the neurons in the brain. His model, by focusing on the flow of electrical currents, did not require individual neural connections for each memory or action.
C. S. Sherrington (1898) conducted experiments to test James's theory. He ran electrical currents down the spinal cords of rats. However, instead of demonstrating an increase in electrical current as projected by James, Sherrington found that the electrical current strength decreased as the testing continued over time. Importantly, this work led to the discovery of the concept of habituation.
McCulloch and Pitts (1943) created a computational model for neural networks based on mathematics and algorithms. They called this model threshold logic. The model paved the way for neural network research to split into two distinct approaches. One approach focused on biological processes in the brain and the other focused on the application of neural networks to artificial intelligence.
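The idea behind threshold logic can be illustrated with a short Python sketch: a unit outputs 1 when the weighted sum of its binary inputs reaches a threshold, and 0 otherwise. This is a simplified modern rendering, not a reproduction of the 1943 formulation; the function names and the particular weights and thresholds chosen for the AND, OR and NOT gates are assumptions made for illustration.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def logical_and(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return threshold_unit([a, b], [1, 1], threshold=2)


def logical_or(a, b):
    # A single active input is enough to reach a threshold of 1.
    return threshold_unit([a, b], [1, 1], threshold=1)


def logical_not(a):
    # A negative (inhibitory) weight suppresses firing when the input is active.
    return threshold_unit([a], [-1], threshold=0)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, logical_and(a, b), logical_or(a, b))
```

Because such units can express basic logic gates, networks of them can in principle be combined to compute more complex Boolean functions.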
In the late 1940s psychologist Donald Hebb created a hypothesis of learning based on the mechanism of neural plasticity that is now known as Hebbian learning. Hebbian learning is considered to be a 'typical' unsupervised learning rule and its later variants were early models for long term potentiation. These ideas started being applied to computational models in 1948 with Turing's B-type machines.
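One common textbook rendering of a Hebbian rule strengthens a connection in proportion to the product of the activity on its input side and the activity of the receiving unit. The Python sketch below assumes a single linear unit and a fixed learning rate; the function name, the linear response, and the parameter values are illustrative assumptions rather than details taken from Hebb's work.

```python
def hebbian_update(weights, inputs, learning_rate=0.1):
    """Return new weights after one Hebbian step for a single linear unit."""
    # Postsynaptic activity, modeled here as a plain weighted sum (an assumption).
    output = sum(w * x for w, x in zip(weights, inputs))
    # Each weight grows in proportion to (input activity) * (output activity).
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]


weights = [0.5, 0.5]
pattern = [1.0, 0.0]  # only the first input is active
for _ in range(3):
    weights = hebbian_update(weights, pattern)
    print(weights)
```

No target output is supplied anywhere in the update, which is why the rule is described as unsupervised. In this plain form the weights grow without bound under repeated stimulation; later variants add normalization or decay terms.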
Farley and Clark (1954) first used computational machines, then called calculators, to simulate a Hebbian network at MIT. Other neural network computational machines were created by Rochester, Holland, Haibt, and Duda (1956).
Rosenblatt (1958) created the perceptron, an algorithm for pattern recognition based on a two-layer learning computer network using simple addition and subtraction. With mathematical notation, Rosenblatt also described circuitry not in the basic perceptron, such as the exclusive-or circuit, a circuit whose mathematical computation could not be processed until after the backpropagation algorithm was created by Werbos (1975).
Neural network research stagnated after the publication of machine learning research by Marvin Minsky and Seymour Papert (1969). They discovered two key issues with the computational machines that processed neural networks. The first issue was that single-layer neural networks were incapable of processing the exclusive-or circuit. The second significant issue was that computers were not sophisticated enough to effectively handle the long run time required by large neural networks. Neural network research slowed until computers achieved greater processing power. Also key in later advances was the backpropagation algorithm which effectively solved the exclusive-or problem (Werbos 1975).
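The exclusive-or limitation can be made concrete with a small Python sketch. It searches a grid of weights for a single threshold unit that reproduces exclusive-or and finds none, then shows a two-layer arrangement that does. The grid range, the step activation, and the hand-set hidden weights are assumptions chosen for illustration; in particular, the two-layer weights are written by hand here rather than learned by backpropagation.

```python
from itertools import product

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(z):
    """Threshold activation: 1 when the input is non-negative, else 0."""
    return 1 if z >= 0 else 0


def single_layer(x1, x2, w1, w2, b):
    """One threshold unit acting directly on the two inputs."""
    return step(w1 * x1 + w2 * x2 + b)


def two_layer_xor(x1, x2):
    """Two hidden threshold units feeding one output unit."""
    h_or = step(x1 + x2 - 0.5)    # active when at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # active only when both inputs are 1
    return step(h_or - h_and - 0.5)  # "OR but not AND"


# Try every weight/bias combination on a coarse grid from -4 to 4.
grid = [i / 2 for i in range(-8, 9)]
single_layer_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if all(single_layer(x1, x2, w1, w2, b) == target
           for (x1, x2), target in XOR.items())
]

print(len(single_layer_solutions))
print([two_layer_xor(x1, x2) for (x1, x2) in XOR])
```

The empty search result reflects the fact that exclusive-or is not linearly separable, so no choice of weights for a single unit can succeed; the grid search merely illustrates this and does not prove it.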
In the late 1970s to early 1980s, interest briefly emerged in theoretically investigating the Ising model in relation to Cayley tree topologies and large neural networks. In 1981, the Ising model was solved exactly for the general case of closed Cayley trees (with loops) with an arbitrary branching ratio and found to exhibit unusual phase transition behavior in its local-apex and long-range site-site correlations.
The parallel distributed processing of the mid-1980s became popular under the name connectionism. The text by Rumelhart and McClelland (1986) provided a full exposition on the use of connectionism in computers to simulate neural processes.
Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain, even though the relation between this model and the brain's biological architecture is debated, as it is not clear to what degree artificial neural networks mirror brain function.
Artificial intelligence
A ''neural network'' (NN), in the case of artificial neurons called ''artificial neural network'' (ANN) or ''simulated neural network'' (SNN), is an interconnected group of natural or artificial neurons that uses a mathematical or computational model for information processing based on a connectionistic approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network.
In more practical terms neural networks are non-linear statistical data modeling or decision making tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.
An artificial neural network involves a network of simple processing elements (artificial neurons) which can exhibit complex global behavior, determined by the connections between the processing elements and element parameters. Artificial neurons were first proposed in 1943 by Warren McCulloch, a neurophysiologist, and Walter Pitts, a logician, who first collaborated at the University of Chicago.
One classical type of artificial neural network is the recurrent Hopfield network.
The concept of a neural network appears to have first been proposed by Alan Turing in his 1948 paper ''Intelligent Machinery'' in which he called them "B-type unorganised machines".
The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations and also to use it. Unsupervised neural networks can also be used to learn representations of the input that capture the salient characteristics of the input distribution, e.g., see the Boltzmann machine (1983), and more recently, deep learning algorithms, which can implicitly learn the distribution function of the observed data. Learning in neural networks is particularly useful in applications where the complexity of the data or task makes the design of such functions by hand impractical.
Applications
Neural networks can be used in different fields. The tasks to which artificial neural networks are applied tend to fall within the following broad categories:
* Function approximation, or regression analysis, including time series prediction and modeling.
* Classification, including pattern and sequence recognition, novelty detection and sequential decision making.
* Data processing, including filtering, clustering, blind signal separation and compression.
Application areas of ANNs include nonlinear system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications, data mining (or knowledge discovery in databases, "KDD"), visualization and e-mail spam filtering. For example, it is possible to create a semantic profile of a user's interests emerging from pictures trained for object recognition.
Neuroscience
Theoretical and computational neuroscience is the field concerned with the analysis and computational modeling of biological neural systems. Since neural systems are intimately related to cognitive processes and behaviour, the field is closely related to cognitive and behavioural modeling.
The aim of the field is to create models of biological neural systems in order to understand how biological systems work. To gain this understanding, neuroscientists strive to make a link between observed biological processes (data), biologically plausible mechanisms for neural processing and learning (biological neural network models) and theory (statistical learning theory and information theory).
Types of models
Many models are used, defined at different levels of abstraction and modeling different aspects of neural systems. They range from models of the short-term behaviour of individual neurons, through models of the dynamics of neural circuitry arising from interactions between individual neurons, to models of behaviour arising from abstract neural modules that represent complete subsystems. These include models of the long-term and short-term plasticity of neural systems and its relation to learning and memory, from the individual neuron to the system level.
Connectivity
In August 2020 scientists reported that bi-directional connections, or added appropriate feedback connections, can accelerate and improve communication between and in modular neural networks of the brain's cerebral cortex and lower the threshold for their successful communication. They showed that adding feedback connections between a resonance pair can support successful propagation of a single pulse packet throughout the entire network.
Criticism
Historically, a common criticism of neural networks, particularly in robotics, was that they require a large diversity of training samples for real-world operation. This is not surprising, since any learning machine needs sufficient representative examples in order to capture the underlying structure that allows it to generalize to new cases. Dean Pomerleau, in his research presented in the paper "Knowledge-based Training of Artificial Neural Networks for Autonomous Robot Driving," uses a neural network to train a robotic vehicle to drive on multiple types of roads (single lane, multi-lane, dirt, etc.). A large amount of his research is devoted to (1) extrapolating multiple training scenarios from a single training experience, and (2) preserving past training diversity so that the system does not become overtrained (if, for example, it is presented with a series of right turns—it should not learn to always turn right). These issues are common in neural networks that must decide from amongst a wide variety of responses, but can be dealt with in several ways, for example by randomly shuffling the training examples, by using a numerical optimization algorithm that does not take too large steps when changing the network connections following an example, or by grouping examples in so-called mini-batches.
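Two of the remedies mentioned above, shuffling the training examples and grouping them into mini-batches, can be sketched in a few lines of Python. The function name, the batch size, and the toy sequence of turn labels are hypothetical and chosen only for illustration; the sketch shows the data handling alone and does not include a learning algorithm.

```python
import random


def minibatches(examples, batch_size, rng):
    """Yield the examples in random order, grouped into batches."""
    shuffled = list(examples)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)      # break up long runs of similar examples
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]


# Toy recording: a long run of right turns followed by a run of left turns.
recorded_turns = ["right"] * 6 + ["left"] * 6

rng = random.Random(0)  # fixed seed so the order is repeatable
for batch in minibatches(recorded_turns, batch_size=4, rng=rng):
    print(batch)
```

Presenting the examples in shuffled batches means each update is likely to reflect a mixture of responses rather than only the most recent run, which is the intent behind the remedies described in the paragraph above.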
A. K. Dewdney, a former ''Scientific American'' columnist, wrote in 1997, "Although neural nets do solve a few toy problems, their powers of computation are so limited that I am surprised anyone takes them seriously as a general problem-solving tool."
Arguments for Dewdney's position are that implementing large and effective software neural networks requires committing substantial processing and storage resources. While the brain has hardware tailored to the task of processing signals through a graph of neurons, simulating even a most simplified form on Von Neumann technology may compel a neural network designer to fill many millions of database rows for its connections, which can consume vast amounts of computer memory and data storage capacity. Furthermore, the designer of neural network systems will often need to simulate the transmission of signals through many of these connections and their associated neurons, which must often be matched with incredible amounts of CPU processing power and time. While neural networks often yield ''effective'' programs, they too often do so at the cost of ''efficiency'' (they tend to consume considerable amounts of time and money).
Arguments against Dewdney's position are that neural nets have been successfully used to solve many complex and diverse tasks, such as autonomously flying aircraft.
Technology writer Roger Bridgman commented on Dewdney's statements about neural nets:
Neural networks, for instance, are in the dock not only because they have been hyped to high heaven, (what hasn't?) but also because you could create a successful net without understanding how it worked: the bunch of numbers that captures its behaviour would in all probability be "an opaque, unreadable table...valueless as a scientific resource".
In spite of his emphatic declaration that science is not technology, Dewdney seems here to pillory neural nets as bad science when most of those devising them are just trying to be good engineers. An unreadable table that a useful machine could read would still be well worth having.
Although it is true that analyzing what has been learned by an artificial neural network is difficult, it is much easier to do so than to analyze what has been learned by a biological neural network. Moreover, recent emphasis on the explainability of AI has contributed towards the development of methods, notably those based on attention mechanisms, for visualizing and explaining learned neural networks. Furthermore, researchers involved in exploring learning algorithms for neural networks are gradually uncovering generic principles that allow a learning machine to be successful. For example, Bengio and LeCun (2007) wrote an article regarding local vs non-local learning, as well as shallow vs deep architecture.
Some other criticisms came from advocates of hybrid models (combining neural networks and symbolic approaches). They advocate intermixing these two approaches and believe that hybrid models can better capture the mechanisms of the human mind (Sun and Bookman, 1990).
Recent improvements
While initially research had been concerned mostly with the electrical characteristics of neurons, a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine, acetylcholine, and serotonin on behaviour and learning.
Biophysical models, such as BCM theory, have been important in understanding mechanisms for synaptic plasticity, and have had applications in both computer science and neuroscience. Research is ongoing in understanding the computational algorithms used in the brain, with some recent biological evidence for radial basis networks and neural backpropagation as mechanisms for processing data.
Computational devices have been created in CMOS for both biophysical simulation and neuromorphic computing. More recent efforts show promise for creating nanodevices for very large scale principal components analyses and convolution. If successful, these efforts could usher in a new era of neural computing that is a step beyond digital computing, because it depends on learning rather than programming and because it is fundamentally analog rather than digital, even though the first instantiations may in fact be with CMOS digital devices.
Between 2009 and 2012, the recurrent neural networks and deep feedforward neural networks developed in the research group of Jürgen Schmidhuber at the Swiss AI Lab IDSIA won eight international competitions in pattern recognition and machine learning. For example, multi-dimensional long short-term memory (LSTM) won three competitions in connected handwriting recognition at the 2009 International Conference on Document Analysis and Recognition (ICDAR), without any prior knowledge about the three different languages to be learned.
Variants of the back-propagation algorithm as well as unsupervised methods by Geoff Hinton and colleagues at the University of Toronto can be used to train deep, highly nonlinear neural architectures, similar to the 1980 Neocognitron by Kunihiko Fukushima, and the "standard architecture of vision", inspired by the simple and complex cells identified by David H. Hubel and Torsten Wiesel in the primary visual cortex.
Radial basis function and wavelet networks have also been introduced. These can be shown to offer best approximation properties and have been applied in nonlinear system identification and classification applications.
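As a rough illustration of the radial basis function idea, the following minimal sketch computes the output of a small network with Gaussian basis functions. The function names, the centers, the shared width, and the weights are all hypothetical values chosen for the example; they are not taken from any particular published model, and in practice the centers and weights would be fitted to data.

```python
import math


def gaussian_rbf(x, center, width):
    """Gaussian basis function: 1 at the center, decaying with squared distance."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * width ** 2))


def rbf_network(x, centers, width, weights, bias=0.0):
    """Output = bias + weighted sum of the basis-function activations."""
    activations = [gaussian_rbf(x, c, width) for c in centers]
    return bias + sum(w * a for w, a in zip(weights, activations))


# Hypothetical two-unit network: one excitatory and one inhibitory basis unit.
centers = [(0.0, 0.0), (1.0, 1.0)]
weights = [1.0, -1.0]

y_at_first = rbf_network((0.0, 0.0), centers, 0.5, weights)
y_midpoint = rbf_network((0.5, 0.5), centers, 0.5, weights)
```

Each unit responds most strongly to inputs near its own center, so the output is close to 1 at the first center and exactly balanced halfway between the two centers, where the opposite-signed contributions cancel.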
Deep learning feedforward networks alternate convolutional layers and max-pooling layers, topped by several pure classification layers. Fast GPU-based implementations of this approach have won several pattern recognition contests, including the IJCNN 2011 Traffic Sign Recognition Competition and the ISBI 2012 Segmentation of Neuronal Structures in Electron Microscopy Stacks challenge. Such neural networks also were the first artificial pattern recognizers to achieve human-competitive or even superhuman performance [D. C. Ciresan, U. Meier, J. Schmidhuber. Multi-column Deep Neural Networks for Image Classification. IEEE Conf. on Computer Vision and Pattern Recognition CVPR 2012.] on benchmarks such as traffic sign recognition (IJCNN 2012), or the MNIST handwritten digits problem of Yann LeCun and colleagues at NYU.
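The alternation of convolutional and max-pooling layers can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the GPU implementation used in the cited competitions: the helper names, the 5x5 input, the hand-picked edge-detecting kernel, and the rectifier nonlinearity between the two stages are all hypothetical choices, and a real network would learn its kernels by training.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Rectifier nonlinearity: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy input: a bright vertical bar on a dark background.
image = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1], [-1, 1]]

# One convolution stage followed by one pooling stage.
features = max_pool(relu(conv2d_valid(image, kernel)))
```

The convolution produces a 4x4 map that is large where the bar's left edge sits, and pooling shrinks it to 2x2 while keeping the strongest response in each window. Stacking several such pairs and feeding the result to ordinary classification layers gives the overall architecture described above.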
External links
A Brief Introduction to Neural Networks (D. Kriesel): Illustrated, bilingual manuscript about artificial neural networks. Topics so far: Perceptrons, Backpropagation, Radial Basis Functions, Recurrent Neural Networks, Self Organizing Maps, Hopfield Networks.
Another introduction to ANN: https://web.archive.org/web/20091216110504/http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html
Next Generation of Neural Networks: Google Tech Talks
Neural Networks and Information