The Alignment Problem

''The Alignment Problem: Machine Learning and Human Values'' is a 2020 non-fiction book by the American writer Brian Christian. It is based on numerous interviews with experts trying to build artificial intelligence systems, particularly machine learning systems, that are aligned with human values.


Summary

The book is divided into three sections: Prophecy, Agency, and Normativity. Each section covers researchers and engineers working on different challenges in the alignment of artificial intelligence with human values.


Prophecy

In the first section, Christian interweaves discussions of the history of artificial intelligence research, particularly the machine learning approach of artificial neural networks such as the Perceptron and AlexNet, with examples of how AI systems can have unintended behavior. He tells the story of Julia Angwin, a journalist whose ProPublica investigation of the COMPAS algorithm, a tool for predicting recidivism among criminal defendants, led to widespread criticism of its accuracy and bias towards certain demographics. One of AI's main alignment challenges is its black box nature: inputs and outputs are identifiable, but the transformation process in between is opaque. This lack of transparency makes it difficult to know where the system is going right and where it is going wrong.


Agency

In the second section, Christian similarly interweaves the history of the psychological study of reward, including behaviorism and research on dopamine, with the computer science of reinforcement learning, in which AI systems need to develop a policy ("what to do") in the face of a value function ("what rewards or punishment to expect"). He calls the DeepMind AlphaGo and AlphaZero systems "perhaps the single most impressive achievement in automated curriculum design." He also highlights the importance of curiosity, in which reinforcement learners are intrinsically motivated to explore their environment, rather than exclusively seeking the external reward.
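The distinction between a value function and a policy can be illustrated with a minimal sketch. The example below is not taken from the book: it assumes a hypothetical four-state corridor in which the agent earns a reward of 1.0 for reaching the last state, and it uses value iteration, a standard textbook method chosen here purely for illustration. All names (`step`, `value_iteration`, `greedy_policy`, `GAMMA`) are invented for the example.

```python
# Hypothetical toy environment: a corridor of 4 states (0..3).
# Reaching state 3 yields reward 1.0; every other move yields 0.0.
GAMMA = 0.9          # discount factor (assumed value)
N_STATES = 4
ACTIONS = (-1, +1)   # step left / step right
GOAL = N_STATES - 1


def step(state, action):
    """Deterministic transition; returns (next_state, reward)."""
    if state == GOAL:
        return state, 0.0  # the goal is absorbing and pays nothing further
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def value_iteration(sweeps=50):
    """Value function: what discounted reward to expect from each state."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        values = [
            max(r + GAMMA * values[n]
                for n, r in (step(s, a) for a in ACTIONS))
            for s in range(N_STATES)
        ]
    return values


def greedy_policy(values):
    """Policy: what to do in each non-goal state, given the value function."""
    policy = {}
    for s in range(GOAL):
        policy[s] = max(
            ACTIONS,
            key=lambda a: step(s, a)[1] + GAMMA * values[step(s, a)[0]],
        )
    return policy


values = value_iteration()
policy = greedy_policy(values)
print(values)
print(policy)
```

In this sketch the value function assigns higher expected reward to states closer to the goal, and the policy simply picks, in each state, the action that leads toward the higher-valued neighbor. Real reinforcement learning systems must estimate both from experience rather than from a known transition model.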


Normativity

The third section covers training AI through the imitation of human or machine behavior, as well as philosophical debates, such as that between possibilism and actualism, that imply different ideal behavior for AI systems. Of particular importance is inverse reinforcement learning, a broad approach for machines to learn the objective function of a human or another agent. Christian discusses the normative challenges associated with effective altruism and existential risk, including the work of philosophers Toby Ord and William MacAskill, who are trying to devise human and machine strategies for navigating the alignment problem as effectively as possible.


Reception

The book received positive reviews from critics. ''The Wall Street Journal'''s David A. Shaywitz emphasized the frequent problems when applying algorithms to real-world problems, describing the book as "a nuanced and captivating exploration of this white-hot topic." ''Publishers Weekly'' praised the book for its writing and extensive research. ''Kirkus Reviews'' gave the book a positive review, calling it "technically rich but accessible", and "an intriguing exploration of AI." Writing for ''Nature'', Virginia Dignum gave the book a positive review, favorably comparing it to Kate Crawford's ''Atlas of AI''. In 2021, journalist Ezra Klein had Christian on his podcast, ''The Ezra Klein Show,'' writing in ''The New York Times'', "''The Alignment Problem'' is the best book on the key technical and moral questions of A.I. that I’ve read." Later that year, the book was listed in a ''Fast Company'' feature, "5 books that inspired Microsoft CEO Satya Nadella this year". In 2022, the book won the Eric and Wendy Schmidt Award for Excellence in Science Communication, given by The National Academies of Sciences, Engineering, and Medicine in partnership with Schmidt Futures. In 2024, ''The New York Times'' named ''The Alignment Problem'' one of the "5 Best Books About Artificial Intelligence," saying: "If you're going to read one book on artificial intelligence, this is the one."


See also

* Effective altruism
* Global catastrophic risk
* ''Human Compatible: Artificial Intelligence and the Problem of Control''
* ''Superintelligence: Paths, Dangers, Strategies''

