Jürgen Schmidhuber

Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist most noted for his work in the field of artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artificial Intelligence Research (IDSIA) in Lugano, in Ticino in southern Switzerland. According to Google Scholar, he received more than 100,000 scientific citations from 2016 to 2021. He has been referred to as "father of modern AI," "father of AI," "dad of mature AI," "Papa" of famous AI products, "Godfather," and "father of deep learning." (Schmidhuber himself, however, has called Alexey Grigorevich Ivakhnenko the "father of deep learning.")

Schmidhuber completed his undergraduate (1987) and PhD (1991) studies at the Technical University of Munich in Munich, Germany. His PhD advisors were Wilfried Brauer and Klaus Schulten. He taught there from 2004 until 2009, when he became a professor of artificial intelligence at the Università della Svizzera Italiana in Lugano, Switzerland.


Work

With his students Sepp Hochreiter, Felix Gers, Fred Cummins, Alex Graves, and others, Schmidhuber published increasingly sophisticated versions of a type of recurrent neural network called the long short-term memory (LSTM). First results were already reported in Hochreiter's diploma thesis (1991), which analyzed and overcame the famous vanishing gradient problem. The name LSTM was introduced in a tech report (1995) leading to the most cited LSTM publication (1997). The standard LSTM architecture, which is used in almost all current applications, was introduced in 2000. Today's "vanilla LSTM" using backpropagation through time was published in 2005, and its connectionist temporal classification (CTC) training algorithm in 2006. CTC enabled end-to-end speech recognition with LSTM. In 2015, LSTM trained by CTC was used in a new implementation of speech recognition in Google's software for smartphones. Google also used LSTM for the smart assistant Allo and for Google Translate. Apple used LSTM for the "Quicktype" function on the iPhone and for Siri. Amazon used LSTM for Amazon Alexa. In 2017, Facebook performed some 4.5 billion automatic translations every day using LSTM networks. Bloomberg Businessweek wrote: "These powers make LSTM arguably the most commercial AI achievement, used for everything from predicting diseases to composing music."

In 2011, Schmidhuber's team at IDSIA with his postdoc Dan Ciresan also achieved dramatic speedups of convolutional neural networks (CNNs) on fast parallel computers called GPUs. An earlier CNN on GPU by Chellapilla et al. (2006) was 4 times faster than an equivalent implementation on CPU. The deep CNN of Dan Ciresan et al. (2011) at IDSIA was already 60 times faster and achieved the first superhuman performance in a computer vision contest in August 2011. Between 15 May 2011 and 10 September 2012, their fast and deep CNNs won no fewer than four image competitions. They also significantly improved on the best performance in the literature for multiple image databases. The approach has become central to the field of computer vision. It is based on CNN designs introduced much earlier by Yann LeCun et al. (1989; Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, "Backpropagation Applied to Handwritten Zip Code Recognition", AT&T Bell Laboratories), who applied the backpropagation algorithm to a variant of Kunihiko Fukushima's original CNN architecture called neocognitron, later modified by J. Weng's method called max-pooling.

In 2014, Schmidhuber formed a company, Nnaisense, to work on commercial applications of artificial intelligence in fields such as finance, heavy industry and self-driving cars. Sepp Hochreiter, Jaan Tallinn, and Marcus Hutter are advisers to the company. Sales were under US$11 million in 2016; however, Schmidhuber states that the current emphasis is on research and not revenue. Nnaisense raised its first round of capital funding in January 2017. Schmidhuber's overall goal is to create an all-purpose AI by training a single AI in sequence on a variety of narrow tasks.


Views

According to The Guardian, Schmidhuber complained in a "scathing 2015 article" that fellow deep learning researchers Geoffrey Hinton, Yann LeCun and Yoshua Bengio "heavily cite each other," but "fail to credit the pioneers of the field", allegedly understating the contributions of Schmidhuber and other early machine learning pioneers, including Alexey Grigorevich Ivakhnenko, who had published the first deep learning networks as early as 1965. LeCun denied the charge, stating instead that Schmidhuber "keeps claiming credit he doesn't deserve". Schmidhuber replied that LeCun had not provided a single example to support his statement, and listed several priority disputes.


Recognition

Schmidhuber received the Helmholtz Award of the International Neural Network Society in 2013, and the Neural Networks Pioneer Award of the IEEE Computational Intelligence Society in 2016 for "pioneering contributions to deep learning and neural networks." He is a member of the European Academy of Sciences and Arts.

