Friendly AI

Friendly artificial intelligence (also friendly AI or FAI) refers to hypothetical artificial general intelligence (AGI) that would have a positive (benign) effect on humanity, or at least align with human interests or contribute to fostering the improvement of the human species. It is a part of the ethics of artificial intelligence and is closely related to machine ethics. While machine ethics is concerned with how an artificially intelligent agent ''should'' behave, friendly artificial intelligence research is focused on how to practically bring about this behaviour and ensure it is adequately constrained.


Etymology and usage

The term was coined by Eliezer Yudkowsky, who is best known for popularizing the idea, to discuss superintelligent artificial agents that reliably implement human values. Stuart J. Russell and Peter Norvig's leading artificial intelligence textbook, ''Artificial Intelligence: A Modern Approach'', describes the idea:
Yudkowsky (2008) goes into more detail about how to design a Friendly AI. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design—to define a mechanism for evolving AI systems under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.
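
Yudkowsky gives no concrete algorithm, but the mechanism-design framing can be illustrated with a toy sketch (all details below, such as the utility function, the sampling, and the acceptance test, are invented for illustration, not drawn from Yudkowsky): an agent judges any candidate self-modification by its ''current'' utility function and rejects changes expected to degrade it.

```python
# Toy sketch only: an agent evaluates a proposed self-modification with its
# CURRENT utility function and rejects modifications expected to make it
# worse at pursuing its current goals -- one "checks and balances" reading
# of keeping a utility function friendly across change.

import random
from dataclasses import dataclass
from typing import Callable, List

State = List[float]  # hypothetical world states for the toy example

@dataclass
class Agent:
    utility: Callable[[State], float]   # the agent's current goals
    policy: Callable[[State], State]    # how it currently acts on the world

    def expected_utility(self, policy, samples: List[State]) -> float:
        # Monte Carlo estimate of how well a policy serves the CURRENT utility.
        return sum(self.utility(policy(s)) for s in samples) / len(samples)

    def consider_self_modification(self, new_policy, samples) -> bool:
        # Accept the new policy only if it does not reduce expected utility
        # as judged by the unmodified utility function (goal preservation).
        return (self.expected_utility(new_policy, samples)
                >= self.expected_utility(self.policy, samples))

# Usage: an agent whose goal is to keep state values near zero.
agent = Agent(utility=lambda s: -sum(x * x for x in s),
              policy=lambda s: [x * 0.9 for x in s])
samples = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
candidate = lambda s: [x * 0.5 for x in s]   # candidate self-modification
print(agent.consider_self_modification(candidate, samples))  # True: accepted
```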
'Friendly' is used in this context as technical terminology, and picks out agents that are safe and useful, not necessarily ones that are "friendly" in the colloquial sense. The concept is primarily invoked in the context of discussions of recursively self-improving artificial agents that rapidly explode in intelligence, on the grounds that this hypothetical technology would have a large, rapid, and difficult-to-control impact on human society.


Risks of unfriendly AI

The roots of concern about artificial intelligence are very old. Kevin LaGrandeur showed that the dangers specific to AI can be seen in ancient literature concerning artificial humanoid servants such as the golem, or the proto-robots of Gerbert of Aurillac and Roger Bacon. In those stories, the extreme intelligence and power of these humanoid creations clash with their status as slaves (which by nature are seen as sub-human), and cause disastrous conflict. By 1942 these themes prompted Isaac Asimov to create the "Three Laws of Robotics"—principles hard-wired into all the robots in his fiction, intended to prevent them from turning on their creators, or allowing them to come to harm. In modern times as the prospect of superintelligent AI looms nearer, philosopher Nick Bostrom has said that superintelligent AI systems with goals that are not aligned with human ethics are intrinsically dangerous unless extreme measures are taken to ensure the safety of humanity. He put it this way:
Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human friendly.'
In 2008 Eliezer Yudkowsky called for the creation of "friendly AI" to mitigate existential risk from advanced artificial intelligence. He explains: "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else." Steve Omohundro says that a sufficiently advanced AI system will, unless explicitly counteracted, exhibit a number of basic "drives", such as resource acquisition, self-preservation, and continuous self-improvement, because of the intrinsic nature of any goal-driven system, and that these drives will, "without special precautions", cause the AI to exhibit undesired behavior. Alexander Wissner-Gross says that AIs driven to maximize their future freedom of action (or causal path entropy) might be considered friendly if their planning horizon is longer than a certain threshold, and unfriendly if their planning horizon is shorter than that threshold. Luke Muehlhauser, writing for the Machine Intelligence Research Institute, recommends that machine ethics researchers adopt what Bruce Schneier has called the "security mindset": rather than thinking about how a system will work, imagine how it could fail. For instance, he suggests that even an AI that only makes accurate predictions and communicates via a text interface might cause unintended harm. In 2014, Luke Muehlhauser and Nick Bostrom underlined the need for 'friendly AI'; nonetheless, the difficulties in designing a 'friendly' superintelligence, for instance via programming counterfactual moral thinking, are considerable.
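
Wissner-Gross's threshold claim comes from his "causal entropic forces" formalism, developed with Cameron Freer (Physical Review Letters, 2013). The restatement below is a sketch reconstructed from that work, not something given in this article: the agent is modeled as pushed by a force proportional to the gradient of the entropy of its possible future paths over a planning horizon $\tau$.

```latex
% Sketch of the causal entropic force (after Wissner-Gross & Freer, 2013);
% notation reconstructed from that paper, not quoted from this article.
\[
  \mathbf{F}(\mathbf{X}_0, \tau)
    = T_c \, \nabla_{\mathbf{X}} S_c(\mathbf{X}, \tau) \big|_{\mathbf{X}_0},
  \qquad
  S_c(\mathbf{X}, \tau)
    = -k_B \int P\bigl(x(t) \mid x(0)\bigr)
             \ln P\bigl(x(t) \mid x(0)\bigr)\, \mathcal{D}x(t),
\]
```

where $S_c$ is the entropy over all paths $x(t)$ of duration $\tau$ available to the system and $T_c$ is a constant. On Wissner-Gross's account, it is the size of the horizon $\tau$ that separates behaviour we would read as friendly from behaviour we would not.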


Coherent extrapolated volition

Yudkowsky advances the Coherent Extrapolated Volition (CEV) model. According to him, coherent extrapolated volition is people's choices and the actions people would collectively take if "we knew more, thought faster, were more the people we wished we were, and had grown up closer together." Rather than a Friendly AI being designed directly by human programmers, it is to be designed by a "seed AI" programmed to first study human nature and then produce the AI which humanity would want, given sufficient time and insight to arrive at a satisfactory answer. The appeal to an objective through contingent human nature (perhaps expressed, for mathematical purposes, in the form of a utility function or other decision-theoretic formalism), as providing the ultimate criterion of "Friendliness", is an answer to the meta-ethical problem of defining an objective morality; extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.
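
Nothing in CEV is formal, so the following is only a toy sketch of its shape, under invented assumptions: extrapolation is modeled as repeatedly applying an idealization step to a present-day preference profile until it stabilizes, with the fixed point playing the role of the utility function handed to the AI. The `idealize` step here is an arbitrary stand-in for "knew more, thought faster, were more the people we wished we were".

```python
# Toy sketch only: CEV is an informal proposal and defines no algorithm.
# This illustrates the SHAPE of the idea: present preferences in,
# stabilized "extrapolated" preferences out.

from typing import Dict

Preferences = Dict[str, float]  # option -> strength of endorsement

def idealize(prefs: Preferences) -> Preferences:
    # Hypothetical one-step idealization: normalize total endorsement
    # toward 1, standing in for reflection under better conditions.
    total = sum(abs(v) for v in prefs.values()) or 1.0
    return {k: round(0.5 * v + 0.5 * v / total, 6) for k, v in prefs.items()}

def extrapolate_volition(prefs: Preferences, max_steps: int = 1000,
                         tol: float = 1e-9) -> Preferences:
    # Iterate idealization to (approximate) convergence: the extrapolated
    # volition is the stable point of repeated reflection.
    for _ in range(max_steps):
        new = idealize(prefs)
        if all(abs(new[k] - prefs[k]) < tol for k in prefs):
            return new
        prefs = new
    return prefs

print(extrapolate_volition({"cooperate": 0.9, "defect": 0.2}))
```

The point of the sketch is the structure, not the particular `idealize` step, which is where all the philosophical difficulty of CEV actually lives.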


Other approaches

Steve Omohundro has proposed a "scaffolding" approach to AI safety, in which one provably safe AI generation helps build the next provably safe generation. Seth Baum argues that the development of safe, socially beneficial artificial intelligence or artificial general intelligence is a function of the social psychology of AI research communities, and so can be constrained by extrinsic measures and motivated by intrinsic measures. Intrinsic motivations can be strengthened when messages resonate with AI developers; Baum argues that, in contrast, "existing messages about beneficial AI are not always framed well". Baum advocates for "cooperative relationships, and positive framing of AI researchers" and cautions against characterizing AI researchers as "not want(ing) to pursue beneficial designs".

In his book ''Human Compatible'', AI researcher Stuart J. Russell lists three principles to guide the development of beneficial machines. He emphasizes that these principles are not meant to be explicitly coded into the machines; rather, they are intended for the human developers. The principles are as follows:

1. The machine's only objective is to maximize the realization of human preferences.
2. The machine is initially uncertain about what those preferences are.
3. The ultimate source of information about human preferences is human behavior.

The "preferences" Russell refers to "are all-encompassing; they cover everything you might care about, arbitrarily far into the future." Similarly, "behavior" includes any choice between options, and the uncertainty is such that some probability, which may be quite small, must be assigned to every logically possible human preference.
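
Russell develops this formally elsewhere as "assistance games"; the sketch below is a minimal illustration of the second and third principles, with all specifics (the hypothesis space, the prior, and the noisily rational choice model) assumed for illustration rather than taken from Russell.

```python
# Toy sketch of principles 2 and 3: the machine is uncertain about human
# preferences and updates from observed human behavior. Every specific here
# is an illustrative assumption, not Russell's actual formalism.

import math

ACTIONS = ["make_tea", "make_coffee"]

# Hypothesis space of what the human might want. No hypothesis gets zero
# prior, echoing "some probability ... must be assigned to every logically
# possible human preference".
HYPOTHESES = {
    "likes_tea":    {"make_tea": 1.0, "make_coffee": 0.1},
    "likes_coffee": {"make_tea": 0.1, "make_coffee": 1.0},
}
posterior = {h: 0.5 for h in HYPOTHESES}

def observe_human_choice(choice: str) -> None:
    # Bayesian update: assume the human picks actions with probability
    # proportional to exp(utility) (a standard "noisily rational" model).
    global posterior
    likelihood = {}
    for h, utils in HYPOTHESES.items():
        z = sum(math.exp(u) for u in utils.values())
        likelihood[h] = math.exp(utils[choice]) / z
    norm = sum(posterior[h] * likelihood[h] for h in HYPOTHESES)
    posterior = {h: posterior[h] * likelihood[h] / norm for h in HYPOTHESES}

def best_action() -> str:
    # Principle 1: act to maximize EXPECTED human utility under uncertainty.
    def expected_utility(a):
        return sum(posterior[h] * HYPOTHESES[h][a] for h in HYPOTHESES)
    return max(ACTIONS, key=expected_utility)

observe_human_choice("make_tea")   # the human reaches for the teapot
observe_human_choice("make_tea")
print(posterior)      # belief has shifted toward "likes_tea"
print(best_action())  # -> "make_tea"
```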


Public policy

James Barrat, author of ''Our Final Invention'', suggested that "a public-private partnership has to be created to bring A.I.-makers together to share ideas about security—something like the International Atomic Energy Agency, but in partnership with corporations." He urges AI researchers to convene a meeting similar to the Asilomar Conference on Recombinant DNA, which discussed risks of biotechnology. John McGinnis encourages governments to accelerate friendly AI research. Because the goalposts of friendly AI are not necessarily clear, he suggests a model similar to the National Institutes of Health, where "Peer review panels of computer and cognitive scientists would sift through projects and choose those that are designed both to advance AI and assure that such advances would be accompanied by appropriate safeguards." McGinnis feels that peer review is better "than regulation to address technical issues that are not possible to capture through bureaucratic mandates". McGinnis notes that his proposal stands in contrast to that of the Machine Intelligence Research Institute, which generally aims to avoid government involvement in friendly AI. According to Gary Marcus, the annual amount of money being spent on developing machine morality is tiny.


Criticism

Some critics believe that both human-level AI and superintelligence are unlikely, and that therefore friendly AI is unlikely. Writing in ''The Guardian'', Alan Winfield compares human-level artificial intelligence with faster-than-light travel in terms of difficulty, and states that while we need to be "cautious and prepared" given the stakes involved, we "don't need to be obsessing" about the risks of superintelligence. Boyles and Joaquin, on the other hand, argue that Luke Muehlhauser and Nick Bostrom's proposal to create friendly AIs appears bleak. This is because Muehlhauser and Bostrom seem to hold the idea that intelligent machines could be programmed to think counterfactually about the moral values that human beings would have had. In an article in ''AI & Society'', Boyles and Joaquin maintain that such AIs would not be that friendly considering the following: the infinite number of antecedent counterfactual conditions that would have to be programmed into a machine, the difficulty of cashing out the set of moral values—that is, those that are more ideal than the ones human beings possess at present, and the apparent disconnect between counterfactual antecedents and the ideal value consequent.

Some philosophers claim that any truly "rational" agent, whether artificial or human, will naturally be benevolent; in this view, deliberate safeguards designed to produce a friendly AI could be unnecessary or even harmful. Other critics question whether it is possible for an artificial intelligence to be friendly. Adam Keiper and Ari N. Schulman, editors of the technology journal ''The New Atlantis'', say that it will be impossible to ever guarantee "friendly" behavior in AIs because problems of ethical complexity will not yield to software advances or increases in computing power. They write that the criteria upon which friendly AI theories are based work "only when one has not only great powers of prediction about the likelihood of myriad possible outcomes, but certainty and consensus on how one values the different outcomes."


See also

* Affective computing
* AI alignment
* AI effect
* AI takeover
* Ambient intelligence
* Applications of artificial intelligence
* Artificial intelligence arms race
* Artificial intelligence systems integration
* Autonomous agent
* Embodied agent
* Emotion recognition
* Existential risk from artificial general intelligence
* Hybrid intelligent system
* Intelligence explosion
* Intelligent agent
* Intelligent control
* Machine ethics
* Machine Intelligence Research Institute
* OpenAI
* Regulation of algorithms
* Roko's basilisk
* Sentiment analysis
* Singularitarianism – a moral philosophy advocated by proponents of Friendly AI
* Technological singularity
* Three Laws of Robotics



Further reading

* Yudkowsky, E. "Artificial Intelligence as a Positive and Negative Factor in Global Risk". In ''Global Catastrophic Risks'', Oxford University Press, 2008. Discusses artificial intelligence from the perspective of existential risk. In particular, Sections 1–4 give background to the definition of Friendly AI in Section 5. Section 6 gives two classes of mistakes (technical and philosophical) which would both lead to the accidental creation of non-Friendly AIs. Sections 7–13 discuss further related issues.
* Omohundro, S. 2008. "The Basic AI Drives". In ''Proceedings of the First Conference on Artificial General Intelligence'' (AGI-08).
* Mason, C. 2008. "Human-Level AI Requires Compassionate Intelligence". In ''AAAI 2008 Workshop on Meta-Reasoning: Thinking About Thinking''.
* Froding, B. and Peterson, M. 2021. "Friendly AI". ''Ethics and Information Technology'', volume 23, pp. 207–214.


External links


Ethical Issues in Advanced Artificial Intelligence — by Nick Bostrom
What is Friendly AI? — a brief description of Friendly AI by the Machine Intelligence Research Institute
Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures — a near book-length description from MIRI
Critique of the SIAI Guidelines on Friendly AI — by Bill Hibbard
Commentary on MIRI's Guidelines on Friendly AI — by Peter Voss
The Problem with ‘Friendly’ Artificial Intelligence — on the motives for and impossibility of FAI; by Adam Keiper and Ari N. Schulman