Written for an interdisciplinary audience, this book provides strikingly clear explanations of the many difficult technical and moral concepts central to discussions of ethics and AI. In particular, it serves as an introduction to the value alignment problem: that of ensuring that AI systems are aligned with the values of humanity. LaCroix redefines the problem as a structural one, showing the reader how various topics in AI ethics, from bias and fairness to transparency and opacity, can be understood as instances of the key problem of value alignment. Numerous case studies are presented throughout the book to highlight the significance of the issues at stake and to clarify the central role of the value alignment problem in the many ethical challenges facing the development and implementation of AI.
Comments
“Travis LaCroix’s book on value alignment is, without a doubt, the best I have read on AI ethics. I highly recommend it to anyone interested in the ethics of artificial intelligence. The text is intellectually rigorous, and many of its ideas are genuinely novel. I found his discussion of measuring value alignment particularly insightful, along with the appendix on superintelligence and the control problem, which provides valuable depth to the topic.” — Martin Peterson, Texas A&M University
“LaCroix’s Artificial Intelligence and the Value Alignment Problem offers an insightful overview and evaluation of the predicament we find ourselves in with respect to machine learning. The book doesn’t shy away from engaging with the mathematical background of these challenges, but it does so in a way that’s intelligible to readers with limited mathematical experience. The structural characterization of the alignment problem(s) provides a great conceptual tool for exploring the ways that values are (or fail to be) incorporated in machine learning systems. The discussions of values are also inclusive, incorporating views from Western, Eastern, and Indigenous philosophy. This book offers an up-to-date introduction to the topic at a level suitable for undergraduates while also providing a novel analytic tool for anyone already working in the area of AI ethics.” — Gillman Payette, University of Calgary
List of Figures
List of Tables
List of Cases
Preface
Acknowledgements
I Basic Concepts
- 1 A Brief History of Artificial Intelligence
- 1.1 The Idea of AI
- 1.2 The Invention of AI
- 1.3 First-Wave AI: False Promises
- 1.4 Second-Wave AI: Empty Threats
- 1.5 Third-Wave AI: Deep Hype
- 1.6 Summary
- 2 Artificial Intelligence Today
- 2.1 Neural Network Architectures
- 2.2 Data and Datasets
- 2.3 Machine Learning Methods
- 2.4 Objectives, Goals, and Values
- 2.5 Learning Algorithms
- 2.6 Evaluation
- 2.7 Scaling Laws
- 2.8 Summary
- 3 The Value Alignment Problem
- 3.1 The Standard Definition of Value Alignment
- 3.2 Adding Sophistication to the Standard Definition
- 3.3 The Principal-Agent Framework
- 3.4 The Value Alignment Problem for Artificial Intelligence
- 3.5 Benefits of the Structural Definition
II Axes of Value Alignment
Introduction to Part II
- 4 Objectives
- 4.1 Proxies and Abstractions
- 4.2 Insights from the Structural Definition
- 4.3 Bias and Fairness
- 4.4 Algorithmic Bias
- 4.5 The Social Character of Objectives
- 4.6 Summary
- 5 Information
- 5.1 Informational Asymmetries, Economic and Artificial
- 5.2 Transparency and Opacity
- 5.3 Explainability, Interpretability, and Understanding
- 5.4 Data and Datasets
- 5.5 Interaction Effects
- 5.6 Summary
- 6 Principals
- 6.1 Principals and Their Goals
- 6.2 The Values of Humanity
- 6.3 The Values Encoded in AI Research
- 6.4 The Human Costs of Artificial Intelligence
- 6.5 Interaction Effects
- 6.6 Summary
III Approaches to Value Alignment
Introduction to Part III
- 7 AI Safety
- 7.1 Adversarial Examples
- 7.2 Concrete Problems in AI Safety
- 7.3 Mitigating Risk
- 7.4 AI Safety and the Value Alignment Problem
- 7.5 Summary
- 8 Machine Ethics
- 8.1 Artificial Moral Agency
- 8.2 Our Best Normative Theories
- 8.3 Technical Approaches to Artificial Moral Agency
- 8.4 Critiques of Artificial Moral Agency
- 8.5 Related Concepts
- 8.6 Summary
- 9 Measuring Degrees of Alignment
- 9.1 Benchmarking
- 9.2 Benchmarking Ethics
- 9.3 Aligning Values
- 9.4 Degrees of Alignment
- 9.5 The Scaling Hypothesis for Value-Aligned AI
- 9.6 Summary
- 10 Normativity and Language
- 10.1 Linguistic Communication
- 10.2 Language in Human Value Alignment
- 10.3 Language, Value Alignment, and Information Transfer
- 10.4 Objective Functions and Value Proxies
- 10.5 Implications
- 10.6 Summary
- 11 Values and Value-Ladenness
- 11.1 The Value-Free Ideal of Science
- 11.2 Against the Value-Free Ideal
- 11.3 Values and Value Alignment
- 11.4 Optimism
- 11.5 Regulation
- 11.6 Summary
- 12 Conclusion
IV Appendix
- A Superintelligence and Control
- A.1 Superintelligence
- A.2 Paths to Superintelligence
- A.3 Forms of Superintelligence
- A.4 Intelligence Explosion and the Singularity
- A.5 Existential Risk
- A.6 Intelligence, Motivation, and Goals
- A.7 The Control Problem
- A.8 Criticism
- A.9 Summary
- References
Index
Travis LaCroix is Assistant Professor of Philosophy at Durham University and a faculty affiliate at the Schwartz Reisman Institute for Technology and Society (University of Toronto).