HOME

TheInfoList



OR:

Natural-language understanding (NLU) or natural-language interpretation (NLI) is a subtopic of
natural-language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to proc ...
in
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech r ...
that deals with machine reading comprehension. Natural-language understanding is considered an
AI-hard In the field of artificial intelligence, the most difficult problems are informally known as AI-complete or AI-hard, implying that the difficulty of these computational problems, assuming intelligence is computational, is equivalent to that of solv ...
problem. There is considerable commercial interest in the field because of its application to
automated reasoning In computer science, in particular in knowledge representation and reasoning and metalogic, the area of automated reasoning is dedicated to understanding different aspects of reasoning. The study of automated reasoning helps produce computer progr ...
,
machine translation Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation or interactive translation), is a sub-field of computational linguistics that investigates t ...
,
question answering Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural l ...
, news-gathering,
text categorization Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") ...
, voice-activation, archiving, and large-scale
content analysis Content analysis is the study of documents and communication artifacts, which might be texts of various formats, pictures, audio or video. Social scientists use content analysis to examine patterns in communication in a replicable and systematic ...
.


History

The program
STUDENT A student is a person enrolled in a school or other educational institution. In the United Kingdom and most commonwealth countries, a "student" attends a secondary school or higher (e.g., college or university); those in primary or elementa ...
, written in 1964 by Daniel Bobrow for his PhD dissertation at
MIT The Massachusetts Institute of Technology (MIT) is a private land-grant research university in Cambridge, Massachusetts. Established in 1861, MIT has played a key role in the development of modern technology and science, and is one of the ...
, is one of the earliest known attempts at natural-language understanding by a computer. Eight years after John McCarthy coined the term
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech r ...
, Bobrow's dissertation (titled ''Natural Language Input for a Computer Problem Solving System'') showed how a computer could understand simple natural language input to solve algebra word problems. A year later, in 1965,
Joseph Weizenbaum Joseph Weizenbaum (8 January 1923 – 5 March 2008) was a German American computer scientist and a professor at MIT. The Weizenbaum Award is named after him. He is considered one of the fathers of modern artificial intelligence. Life and care ...
at MIT wrote
ELIZA ELIZA is an early natural language processing computer program created from 1964 to 1966 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum. Created to demonstrate the superficiality of communication between humans and machines ...
, an interactive program that carried on a dialogue in English on any topic, the most popular being psychotherapy. ELIZA worked by simple parsing and substitution of key words into canned phrases and Weizenbaum sidestepped the problem of giving the program a
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spa ...
of real-world knowledge or a rich lexicon. Yet ELIZA gained surprising popularity as a toy project and can be seen as a very early precursor to current commercial systems such as those used by Ask.com. In 1969,
Roger Schank Roger Carl Schank (born 1946) is an American artificial intelligence theorist, cognitive psychologist, learning scientist, educational reformer, and entrepreneur. Beginning in the late 1960s, he pioneered conceptual dependency theory (within the ...
at Stanford University introduced the
conceptual dependency theory Conceptual dependency theory is a model of natural language understanding used in artificial intelligence systems. Roger Schank at Stanford University introduced the model in 1969, in the early days of artificial intelligence. This model was exte ...
for natural-language understanding. This model, partially influenced by the work of
Sydney Lamb Sydney MacDonald Lamb (born May 4, 1929 in Denver, Colorado) is an American linguist and professor at Rice University, whose stratificational grammar is a significant alternative theory to Chomsky's transformational grammar. He has specializ ...
, was extensively used by Schank's students at
Yale University Yale University is a Private university, private research university in New Haven, Connecticut. Established in 1701 as the Collegiate School, it is the List of Colonial Colleges, third-oldest institution of higher education in the United Sta ...
, such as
Robert Wilensky Robert Wilensky (26 March 1951 – 15 March 2013) was an American computer scientist and emeritus professor at the UC Berkeley School of Information, with his main focus of research in artificial intelligence. Academic career In 1971, Wilensk ...
,
Wendy Lehnert Wendy Grace Lehnert is an American computer scientist specializing in natural language processing and known for her pioneering use of machine learning in natural language processing. She is a professor emerita at the University of Massachusetts Amh ...
, and
Janet Kolodner Janet Lynne Kolodner is an American cognitive scientist and learning scientist and Regents' Professor Emerita in the School of Interactive Computing, College of Computing at the Georgia Institute of Technology. She was Founding Editor in Chie ...
. In 1970, William A. Woods introduced the
augmented transition network An augmented transition network or ATN is a type of graph theoretic structure used in the operational definition of formal languages, used especially in parsing relatively complex natural languages, and having wide application in artificial intelli ...
(ATN) to represent natural language input. Instead of ''
phrase structure rules Phrase structure rules are a type of rewrite rule used to describe a given language's syntax and are closely associated with the early stages of transformational grammar, proposed by Noam Chomsky in 1957. They are used to break down a natural lang ...
'' ATNs used an equivalent set of
finite state automata A finite-state machine (FSM) or finite-state automaton (FSA, plural: ''automata''), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number o ...
that were called recursively. ATNs and their more general format called "generalized ATNs" continued to be used for a number of years. In 1971,
Terry Winograd Terry Allen Winograd (born February 24, 1946) is an American professor of computer science at Stanford University, and co-director of the Stanford Human–Computer Interaction Group. He is known within the philosophy of mind and artificial intel ...
finished writing SHRDLU for his PhD thesis at MIT. SHRDLU could understand simple English sentences in a restricted world of children's blocks to direct a robotic arm to move items. The successful demonstration of SHRDLU provided significant momentum for continued research in the field. Winograd continued to be a major influence in the field with the publication of his book ''Language as a Cognitive Process''. At Stanford, Winograd would later advise
Larry Page Lawrence Edward Page (born March 26, 1973) is an American business magnate, computer scientist and internet entrepreneur. He is best known for co-founding Google with Sergey Brin. Page was the chief executive officer of Google from 1997 u ...
, who co-founded
Google Google LLC () is an American Multinational corporation, multinational technology company focusing on Search Engine, search engine technology, online advertising, cloud computing, software, computer software, quantum computing, e-commerce, ar ...
. In the 1970s and 1980s, the natural language processing group at
SRI International SRI International (SRI) is an American nonprofit organization, nonprofit scientific research, scientific research institute and organization headquartered in Menlo Park, California. The trustees of Stanford University established SRI in 1946 as ...
continued research and development in the field. A number of commercial efforts based on the research were undertaken, ''e.g.'', in 1982 Gary Hendrix formed Symantec Corporation originally as a company for developing a natural language interface for database queries on personal computers. However, with the advent of mouse-driven
graphical user interface The GUI ( "UI" by itself is still usually pronounced . or ), graphical user interface, is a form of user interface that allows User (computing), users to Human–computer interaction, interact with electronic devices through graphical icon (comp ...
s, Symantec changed direction. A number of other commercial efforts were started around the same time, ''e.g.'', Larry R. Harris at the Artificial Intelligence Corporation and Roger Schank and his students at Cognitive Systems Corp. In 1983, Michael Dyer developed the BORIS system at Yale which bore similarities to the work of Roger Schank and W. G. Lehnert. The third millennium saw the introduction of systems using machine learning for text classification, such as the IBM Watson. However, experts debate how much "understanding" such systems demonstrate: ''e.g.'', according to
John Searle John Rogers Searle (; born July 31, 1932) is an American philosopher widely noted for contributions to the philosophy of language, philosophy of mind, and social philosophy. He began teaching at UC Berkeley in 1959, and was Willis S. and Mari ...
, Watson did not even understand the questions. John Ball, cognitive scientist and inventor o
Patom Theory
supports this assessment. Natural language processing has made inroads for applications to support human productivity in service and ecommerce, but this has largely been made possible by narrowing the scope of the application. There are thousands of ways to request something in a human language that still defies conventional natural language processing. "To have a meaningful conversation with machines is only possible when we match every word to the correct meaning based on the meanings of the other words in the sentence – just like a 3-year-old does without guesswork."


Scope and context

The umbrella term "natural-language understanding" can be applied to a diverse set of computer applications, ranging from small, relatively simple tasks such as short commands issued to
robot A robot is a machine—especially one programmable by a computer—capable of carrying out a complex series of actions automatically. A robot can be guided by an external control device, or the control may be embedded within. Robots may be ...
s, to highly complex endeavors such as the full comprehension of newspaper articles or poetry passages. Many real-world applications fall between the two extremes, for instance
text classification Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectual ...
for the automatic analysis of ema