Friendly artificial intelligence (friendly AI or FAI) is hypothetical
artificial general intelligence
(AGI) that would have a positive (benign) effect on humanity or at least
align with human interests such as fostering the improvement of the human species. It is a part of the
ethics of artificial intelligence
and is closely related to
machine ethics
. While machine ethics is concerned with how an artificially intelligent agent ''should'' behave, friendly artificial intelligence research is focused on how to practically bring about this behavior and ensure that it is adequately constrained.
Etymology and usage
The term was coined by
Eliezer Yudkowsky
, who is best known for popularizing the idea,
to discuss
superintelligent artificial agents that reliably implement human values.
Stuart J. Russell and
Peter Norvig
's leading
artificial intelligence
textbook, ''
Artificial Intelligence: A Modern Approach'', describes the idea:
Yudkowsky (2008) goes into more detail about how to design a Friendly AI. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design—to define a mechanism for evolving AI systems under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.
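The passage above frames Friendliness as a mechanism-design problem. A minimal toy sketch of the "checks and balances" idea follows; the scenario, names, and numbers are our own illustrative assumptions, not anything specified in Yudkowsky's proposal:

```python
# Toy sketch of "utility functions that remain friendly" under self-change:
# the system adopts a proposed modification to its decision policy only if
# the modification does not reduce utility on a fixed battery of audit cases.
# All names and numbers here are illustrative assumptions.

def utility(outcome: float) -> float:
    """Toy 'friendly' utility: prefer outcomes close to zero (low harm)."""
    return -abs(outcome)

def current_policy(x: float) -> float:
    return 0.5 * x          # dampens the input

def proposed_policy(x: float) -> float:
    return 2.0 * x          # a "self-improvement" that would amplify harm

def accept_modification(old, new, audit_cases) -> bool:
    """The check: adopt `new` only if it is at least as good as `old` on
    every audited case, so friendliness is preserved across changes."""
    return all(utility(new(x)) >= utility(old(x)) for x in audit_cases)

audit_cases = [-4.0, -1.0, 0.0, 2.0, 8.0]
print(accept_modification(current_policy, proposed_policy, audit_cases))  # False: rejected
```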
"Friendly" is used in this context as
technical terminology
, and picks out agents that are safe and useful, not necessarily ones that are "friendly" in the colloquial sense. The concept is primarily invoked in the context of discussions of recursively self-improving artificial agents that rapidly
explode in intelligence, on the grounds that this hypothetical technology would have a large, rapid, and difficult-to-control impact on human society.
Risks of unfriendly AI
The roots of concern about artificial intelligence are very old. Kevin LaGrandeur showed that the dangers specific to AI can be seen in ancient literature concerning artificial humanoid servants such as the
golem
, or the proto-robots of
Gerbert of Aurillac and
Roger Bacon
. In those stories, the extreme intelligence and power of these humanoid creations clash with their status as slaves (who are by nature seen as sub-human) and cause disastrous conflict. By 1942 these themes prompted
Isaac Asimov
to create the "
Three Laws of Robotics
"—principles hard-wired into all the robots in his fiction, intended to prevent them from turning on their creators, or allowing them to come to harm.
In modern times, as the prospect of
superintelligent AI looms nearer, philosopher
Nick Bostrom
has said that superintelligent AI systems with goals that are not aligned with human ethics are intrinsically dangerous unless extreme measures are taken to ensure the safety of humanity. He put it this way:
Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human friendly.'
In 2008, Eliezer Yudkowsky called for the creation of "friendly AI" to mitigate
existential risk from advanced artificial intelligence. He explains: "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else."
Steve Omohundro says that a sufficiently advanced AI system will, unless explicitly counteracted, exhibit a number of
basic "drives", such as resource acquisition,
self-preservation
, and continuous self-improvement, because of the intrinsic nature of any goal-driven system, and that these drives will, "without special precautions", cause the AI to exhibit undesired behavior.
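A compact way to see why such drives might emerge: for almost any goal, a shut-down agent achieves nothing, so a planner comparing options will prefer self-preservation regardless of the goal's content. The following toy calculation is our own illustration, not Omohundro's:

```python
# Toy illustration of an instrumentally convergent "drive": whatever the
# goal's content, expected goal achievement is higher for an agent that
# stays operational than for one that is shut down. Numbers are assumptions.

import random

def expected_achievement(stays_running: bool, goal_value: float) -> float:
    # A shut-down agent can no longer act, so it achieves nothing.
    return goal_value if stays_running else 0.0

random.seed(0)
for _ in range(3):
    goal_value = random.uniform(0.1, 10.0)       # an arbitrary, unspecified goal
    choice = max(["preserve self", "allow shutdown"],
                 key=lambda a: expected_achievement(a == "preserve self", goal_value))
    print(round(goal_value, 2), "->", choice)    # always "preserve self"
```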
Alexander Wissner-Gross says that AIs driven to maximize their future freedom of action (or causal path entropy) might be considered friendly if their planning horizon is longer than a certain threshold, and unfriendly if their planning horizon is shorter than that threshold.
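As a rough sketch of what horizon-dependent "freedom of action" could look like computationally (our own construction for illustration; Wissner-Gross's actual formalism is the causal path entropy of a physical system), consider a grid agent that greedily moves to maximize how many distinct cells remain reachable within its planning horizon tau:

```python
# Illustrative sketch: choose the move that maximizes the number of cells
# reachable within a planning horizon `tau`, a crude stand-in for maximizing
# future freedom of action. The grid layout and numbers are assumptions.

WALLS = (
    {(x, 2) for x in range(0, 5)}       # ceiling of a roomy dead-end box
    | {(x, -2) for x in range(0, 5)}    # floor of the box
    | {(5, y) for y in range(-2, 3)}    # far wall sealing the box
    | {(-1, 1), (-1, -1)}               # narrow doorway to open space, west
)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def neighbors(cell):
    x, y = cell
    return [(x + dx, y + dy) for dx, dy in MOVES if (x + dx, y + dy) not in WALLS]

def reachable(start, tau):
    """All cells reachable from `start` in at most `tau` steps (BFS)."""
    frontier, seen = {start}, {start}
    for _ in range(tau):
        frontier = {n for c in frontier for n in neighbors(c)} - seen
        seen |= frontier
    return seen

def entropic_step(pos, tau):
    """Greedy move that keeps the most future states open."""
    return max(neighbors(pos), key=lambda p: len(reachable(p, tau)))

print(entropic_step((0, 0), tau=1))  # (1, 0): short horizon enters the box
print(entropic_step((0, 0), tau=6))  # (-1, 0): long horizon heads for open space
```

With a short horizon the roomy-looking dead end wins; with a longer horizon the narrow door to open space wins, loosely echoing the threshold dependence described above.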
Luke Muehlhauser, writing for the
Machine Intelligence Research Institute
, recommends that
machine ethics
researchers adopt what
Bruce Schneier
has called the "security mindset": Rather than thinking about how a system will work, imagine how it could fail. For instance, he suggests even an AI that only makes accurate predictions and communicates via a text interface might cause unintended harm.
In 2014, Luke Muehlhauser and Nick Bostrom underlined the need for 'friendly AI';
nonetheless, the difficulties in designing a 'friendly' superintelligence, for instance via programming counterfactual moral thinking, are considerable.
Coherent extrapolated volition
Yudkowsky advances the Coherent Extrapolated Volition (CEV) model. According to him, our coherent extrapolated volition is "our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted".
Rather than a Friendly AI being designed directly by human programmers, it is to be designed by a "seed AI" programmed to first study
human nature
and then produce the AI that humanity would want if it had sufficient time and insight to arrive at a satisfactory answer.
The appeal to an
objective through contingent human nature (perhaps expressed, for mathematical purposes, in the form of a
utility function
or other
decision-theoretic formalism), as providing the ultimate criterion of "Friendliness", is an answer to the
meta-ethical
problem of defining an
objective morality; extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.
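Read decision-theoretically, the proposal can be stated in one line. As a hedged illustration in standard expected-utility notation (not Yudkowsky's own formalism), a Friendly agent would choose

$$a^{*} = \arg\max_{a \in A} \sum_{s \in S} P(s \mid a)\, U_{\text{CEV}}(s),$$

where $A$ is the set of available actions, $S$ the set of outcomes, and $U_{\text{CEV}}$ a utility function encoding humanity's coherent extrapolated volition; the open problem is specifying $U_{\text{CEV}}$ from unextrapolated human psychology, not performing the maximization.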
Other approaches
Steve Omohundro has proposed a "scaffolding" approach to
AI safety, in which one provably safe AI generation helps build the next provably safe generation.
Seth Baum argues that the development of safe, socially beneficial artificial intelligence or artificial general intelligence is a function of the social psychology of AI research communities and so can be constrained by extrinsic measures and motivated by intrinsic measures. Intrinsic motivations can be strengthened when messages resonate with AI developers; Baum argues that, in contrast, "existing messages about beneficial AI are not always framed well". Baum advocates for "cooperative relationships, and positive framing of AI researchers" and cautions against characterizing AI researchers as "not want(ing) to pursue beneficial designs".
In his book ''
Human Compatible'', AI researcher
Stuart J. Russell lists three principles to guide the development of beneficial machines. He emphasizes that these principles are not meant to be explicitly coded into the machines; rather, they are intended for the human developers. The principles are as follows:
1. The machine's only objective is to maximize the realization of human preferences.
2. The machine is initially uncertain about what those preferences are.
3. The ultimate source of information about human preferences is human behavior.
The "preferences" Russell refers to "are all-encompassing; they cover everything you might care about, arbitrarily far into the future."
Similarly, "behavior" includes any choice between options,
and the uncertainty is such that some probability, which may be quite small, must be assigned to every logically possible human preference.
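Russell's book does not reduce these principles to code, but the second and third can be sketched as a toy Bayesian update over candidate preferences; the hypotheses, likelihoods, and numbers below are our own assumptions, not Russell's:

```python
# Toy sketch: the machine is uncertain about human preferences, maintains a
# probability distribution over candidate preference hypotheses, and updates
# it from observed human behavior. All names and numbers are illustrative.

# Candidate hypotheses about what the human values, with prior probabilities.
hypotheses = {"values_speed": 0.5, "values_safety": 0.5}

# Likelihood of an observed human choice under each hypothesis (assumed).
likelihood = {
    ("chose_slow_route", "values_speed"): 0.1,
    ("chose_slow_route", "values_safety"): 0.9,
}

def update(observation: str) -> None:
    """Bayes update of the preference distribution from observed behavior."""
    for h in hypotheses:
        hypotheses[h] *= likelihood[(observation, h)]
    total = sum(hypotheses.values())
    for h in hypotheses:
        hypotheses[h] /= total

update("chose_slow_route")
print(hypotheses)  # mass shifts toward safety: {'values_speed': 0.1, 'values_safety': 0.9}
```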
Public policy
James Barrat, author of ''
Our Final Invention'', suggested that "a public-private partnership has to be created to bring A.I.-makers together to share ideas about security—something like the
International Atomic Energy Agency
, but in partnership with corporations." He urges AI researchers to convene a meeting similar to the
Asilomar Conference on Recombinant DNA, which discussed
risks of biotechnology.
John McGinnis encourages governments to accelerate friendly AI research. Because the goalposts of friendly AI are not necessarily imminent, he suggests a model similar to the
National Institutes of Health
, where "Peer review panels of computer and cognitive scientists would sift through projects and choose those that are designed both to advance AI and assure that such advances would be accompanied by appropriate safeguards." McGinnis feels that peer review is better "than regulation to address technical issues that are not possible to capture through bureaucratic mandates". McGinnis notes that his proposal stands in contrast to that of the
Machine Intelligence Research Institute
, which generally aims to avoid government involvement in friendly AI.
Criticism
Some critics believe that both human-level AI and superintelligence are unlikely and that, therefore, friendly AI is unlikely. Writing in ''
The Guardian
'', Alan Winfield compares human-level artificial intelligence with faster-than-light travel in terms of difficulty and states that while we need to be "cautious and prepared" given the stakes involved, we "don't need to be obsessing" about the risks of superintelligence. Boyles and Joaquin, on the other hand, argue that Luke Muehlhauser and
Nick Bostrom
's proposal to create friendly AIs appears bleak. This is because Muehlhauser and Bostrom seem to hold the idea that intelligent machines could be programmed to think counterfactually about the moral values that human beings would have had.
In an article in ''
AI & Society'', Boyles and Joaquin maintain that such AIs would not be that friendly considering the following: the infinite amount of antecedent counterfactual conditions that would have to be programmed into a machine, the difficulty of cashing out the set of moral values—that is, those that are more ideal than the ones human beings possess at present, and the apparent disconnect between counterfactual antecedents and ideal value consequent.
Some philosophers claim that any truly "rational" agent, whether artificial or human, will naturally be benevolent; in this view, deliberate safeguards designed to produce a friendly AI could be unnecessary or even harmful. Other critics question whether artificial intelligence can be friendly. Adam Keiper and Ari N. Schulman, editors of the technology journal ''
The New Atlantis'', say that it will be impossible ever to guarantee "friendly" behavior in AIs because problems of ethical complexity will not yield to software advances or increases in computing power. They write that the criteria upon which friendly AI theories are based work "only when one has not only great powers of prediction about the likelihood of myriad possible outcomes but certainty and consensus on how one values the different outcomes".
The inner workings of advanced AI systems may also be complex and difficult to interpret, making friendly behavior hard to verify and raising concerns about transparency and accountability.
See also
* Affective computing
* AI alignment
* AI effect
* AI takeover
* Ambient intelligence
* Applications of artificial intelligence
* Artificial intelligence arms race
* Artificial intelligence systems integration
* Autonomous agent
* Embodied agent
* Emotion recognition
* Existential risk from artificial general intelligence
* Hallucination (artificial intelligence)
* Hybrid intelligent system
* Intelligence explosion
* Intelligent agent
* Intelligent control
* Machine ethics
* Machine Intelligence Research Institute
* OpenAI
* Regulation of algorithms
* Roko's basilisk
* Sentiment analysis
* Singularitarianism – a moral philosophy advocated by proponents of Friendly AI
* Suffering risks
* Technological singularity
* Three Laws of Robotics
References
Further reading
* Yudkowsky, E. (2008). "Artificial Intelligence as a Positive and Negative Factor in Global Risk". In ''Global Catastrophic Risks'', Oxford University Press. Discusses artificial intelligence from the perspective of existential risk. In particular, Sections 1–4 give background to the definition of Friendly AI in Section 5. Section 6 gives two classes of mistakes (technical and philosophical) which would both lead to the accidental creation of non-Friendly AIs. Sections 7–13 discuss further related issues.
* Omohundro, S. (2008). "The Basic AI Drives". In ''AGI-08: Proceedings of the First Conference on Artificial General Intelligence''.
* Mason, C. (2008). "Human-Level AI Requires Compassionate Intelligence". In ''AAAI 2008 Workshop on Meta-Reasoning: Thinking About Thinking''.
* Froding, B. and Peterson, M. (2021). "Friendly AI". ''Ethics and Information Technology'', Vol. 23, pp. 207–214.
External links
Ethical Issues in Advanced Artificial Intelligence – by Nick Bostrom
What is Friendly AI? – A brief description of Friendly AI by the Machine Intelligence Research Institute.
Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures – A near-book-length description from MIRI
Critique of the MIRI Guidelines on Friendly AI – by Bill Hibbard
Commentary on MIRI's Guidelines on Friendly AI – by Peter Voss.
The Problem with 'Friendly' Artificial Intelligence – On the motives for and impossibility of FAI; by Adam Keiper and Ari N. Schulman.