Discovery System (AI Research)

A discovery system is an artificial intelligence system that attempts to discover new scientific concepts or laws. The aim of discovery systems is to automate scientific data analysis and the scientific discovery process. Ideally, an artificial intelligence system should be able to search systematically through the space of all possible hypotheses and yield the hypothesis, or set of equally likely hypotheses, that best describes the complex patterns in data.

During the era known as the second AI summer (approximately 1978-1987), various systems akin to the era's dominant expert systems were developed to tackle the problem of extracting scientific hypotheses from data, with or without interacting with a human scientist. These systems included Autoclass, Automated Mathematician, and Eurisko, which aimed at general-purpose hypothesis discovery, and more specific systems such as Dalton, which uncovers molecular properties from data.

The dream of building systems that discover scientific hypotheses was pushed to the background with the second AI winter and the subsequent resurgence of subsymbolic methods such as neural networks. Subsymbolic methods emphasize prediction over explanation and yield models that work well but are difficult or impossible to explain, which has earned them the name black box AI. A black-box model cannot be considered a scientific hypothesis, and this development has even led some researchers to suggest that the traditional aim of science, to uncover hypotheses and theories about the structure of reality, is obsolete. Other researchers disagree and argue that subsymbolic methods are useful in many cases, just not for generating scientific theories.


Discovery systems from the 1970s and 1980s

* Autoclass was a Bayesian classification system written in 1986.
* Automated Mathematician was one of the earliest successful discovery systems. It was written in 1977 and worked by generating and modifying small Lisp programs.
* Eurisko was a sequel to Automated Mathematician, written in 1984.
* Dalton is a still-maintained program capable of calculating various molecular properties. It was initially launched in 1983 and has been available as open source since 2017.
* Glauber is a scientific discovery method written in the context of computational philosophy of science and launched in 1983.


Modern discovery systems (2009–present)

After a couple of decades in which discovery systems received little attention, interest in using AI to uncover natural laws and scientific explanations was renewed by the work of Michael Schmidt, then a PhD student in Computational Biology at Cornell University. Schmidt and his advisor, Hod Lipson, invented Eureqa, which they described as a symbolic regression approach to "distilling free-form natural laws from experimental data". This work effectively demonstrated that symbolic regression was a promising way forward for AI-driven scientific discovery. Since 2009, symbolic regression has matured further, and today various commercial and open source systems are actively used in scientific research. Notable examples include Eureqa (now part of the DataRobot AI Cloud Platform), AI Feynman, and QLattice.
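
To make the idea of symbolic regression concrete, the following is a minimal Python sketch, not the method used by Eureqa, AI Feynman, or QLattice. It assumes a tiny hypothesis space (the leaves `x`, `1`, `2` and the operators `+`, `-`, `*`) and searches it by exhaustive enumeration rather than by the evolutionary or other guided search that practical systems rely on. The data set, the function names, and the complexity penalty are all hypothetical choices made for illustration.

```python
import itertools
import math

# Hypothetical toy data generated from y = x**2 + x (an assumption for illustration).
XS = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
YS = [x * x + x for x in XS]

# Building blocks of the hypothesis space: leaves and binary operators.
LEAVES = {"x": lambda x: x, "1": lambda x: 1.0, "2": lambda x: 2.0}
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def enumerate_expressions(depth):
    """Yield (text, function, size) for every expression tree up to `depth`."""
    if depth == 0:
        for name, fn in LEAVES.items():
            yield name, fn, 1
        return
    smaller = list(enumerate_expressions(depth - 1))
    yield from smaller
    # Combine every pair of smaller expressions with every operator.
    for (ta, fa, sa), (tb, fb, sb) in itertools.product(smaller, repeat=2):
        for symbol, op in BINARY.items():
            yield (
                f"({ta} {symbol} {tb})",
                lambda x, fa=fa, fb=fb, op=op: op(fa(x), fb(x)),
                sa + sb + 1,
            )


def mean_squared_error(fn, xs, ys):
    """Average squared difference between the expression and the data."""
    return sum((fn(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def discover(xs, ys, depth=2, complexity_penalty=0.01):
    """Return (text, function, size, error) of the best-scoring expression.

    The score trades accuracy against simplicity: error plus a penalty
    proportional to the number of nodes in the expression tree.
    """
    best, best_score = None, math.inf
    for text, fn, size in enumerate_expressions(depth):
        err = mean_squared_error(fn, xs, ys)
        score = err + complexity_penalty * size
        if score < best_score:
            best, best_score = (text, fn, size, err), score
    return best


if __name__ == "__main__":
    text, fn, size, err = discover(XS, YS)
    print(f"best expression: {text}  (size {size}, error {err})")
```

Because several algebraically equivalent expressions fit this toy data exactly, which one is reported depends on the enumeration order. Exhaustive search also becomes infeasible very quickly as the depth or the number of building blocks grows, which is one reason practical systems use guided search strategies instead.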


External links


The AI revolution in scientific research