Non-fiction · Intermediate · philosophical, critical, visionary

Human Compatible

by Stuart J. Russell

Berkeley professor Stuart Russell argues AI must remain uncertain about human preferences to avoid catastrophic misalignment.

"You can't fetch the coffee if you're dead."

Editorial Summary

Human Compatible: Artificial Intelligence and the Problem of Control is a 2019 non-fiction book by computer scientist Stuart J. Russell. It asserts that the risk to humanity from advanced artificial intelligence (AI) is a serious concern despite the uncertainty surrounding future progress in AI. Russell is a professor of Computer Science and holder of the Smith-Zadeh Chair in Engineering at the University of California, Berkeley, where he also directs the Center for Human-Compatible Artificial Intelligence.

Russell examines the current debate surrounding AI risk. He offers refutations of a number of common arguments dismissing AI risk and attributes much of their persistence to tribalism: AI researchers may see AI risk concerns as an "attack" on their field.

Russell then proposes an approach to developing provably beneficial machines that focuses on deference to humans. Unlike in the standard model of AI, where the objective is rigid and certain, this approach would have the AI's true objective remain uncertain, with the AI only approaching certainty about it as it gains more information about humans and the world. This uncertainty would, ideally, prevent catastrophic misunderstandings of human preferences and encourage cooperation and communication with humans. The approach rests on three principles:

1. The machine's only objective is to maximize the realization of human preferences.
2. The machine is initially uncertain about what those preferences are.
3. The ultimate source of information about human preferences is human behavior.
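
The three principles can be illustrated with a toy sketch. This is not Russell's formalism or code from the book; it is a minimal Bayesian example under stated assumptions. The two preference hypotheses, the 10% noise rate, the 0.8 confidence threshold, and every function name are hypothetical. The machine holds a belief over what the human wants, updates that belief from an observed human choice, and defers to the human while it is still too uncertain to act.

```python
from typing import Dict

# Hypothetical preference hypotheses: each maps an action to the utility
# the human would assign it if that hypothesis were true.
HYPOTHESES: Dict[str, Dict[str, float]] = {
    "likes_coffee": {"fetch_coffee": 1.0, "fetch_tea": 0.0},
    "likes_tea": {"fetch_coffee": 0.0, "fetch_tea": 1.0},
}


def normalize(weights: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def update(belief: Dict[str, float], observed_choice: str,
           noise: float = 0.1) -> Dict[str, float]:
    """Principle 3: learn from human behavior via a Bayes update.

    Assumed behavior model: the human picks their preferred action with
    probability 1 - noise, and any other action with the remaining mass.
    """
    posterior = {}
    for name, prior in belief.items():
        utilities = HYPOTHESES[name]
        preferred = max(utilities, key=utilities.get)
        n_other = len(utilities) - 1
        likelihood = (1 - noise) if observed_choice == preferred else noise / n_other
        posterior[name] = prior * likelihood
    return normalize(posterior)


def expected_utility(belief: Dict[str, float], action: str) -> float:
    """Principle 1: score an action by expected human preference satisfaction."""
    return sum(p * HYPOTHESES[name][action] for name, p in belief.items())


def choose(belief: Dict[str, float], threshold: float = 0.8) -> str:
    """Act only when confident enough; otherwise defer to the human."""
    actions = list(next(iter(HYPOTHESES.values())))
    best = max(actions, key=lambda a: expected_utility(belief, a))
    if expected_utility(belief, best) < threshold:
        return "ask_human"
    return best


# Principle 2: the machine starts out uncertain about the human's preferences.
belief = {"likes_coffee": 0.5, "likes_tea": 0.5}
first = choose(belief)                 # too uncertain, so the machine defers
belief = update(belief, "fetch_tea")   # the human is observed choosing tea
second = choose(belief)                # belief is now strong enough to act
print(first, second, belief)
```

In this sketch the deferral comes directly from the uncertainty: with an even prior, no action clears the threshold, so the machine asks. After one observation the belief shifts toward "likes_tea" and the machine acts on its own.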

Perspective

"Human Compatible is the book where the field's leading textbook author explains why the field's standard approach is probably going to get us killed — Russell's authority makes the argument impossible to dismiss as outsider alarmism. His distinctive contribution is the proposed solution, not just the diagnosis: a framework where machines defer to humans precisely because they remain uncertain about human values, offering a concrete technical path rather than vague calls for caution. AI researchers, engineers, and technically literate policymakers who want the most rigorous case for why alignment matters will find this the clearest statement of the problem."

Similar Books

Matched by concept and theme

Artificial intelligence

Superintelligence

Artificial Intelligence

If Anyone Builds It, Everyone Dies