Paperclip maximizer

https://arbital.com/p/paperclip_maximizer

by Eliezer Yudkowsky Jul 16 2015 updated Mar 3 2017

This agent will not stop until the entire universe is filled with paperclips.


An expected paperclip maximizer is an agent that outputs the action it believes will lead to the greatest number of paperclips existing. In more detail, its utility function is linear in the number of paperclips times the number of seconds that each paperclip lasts, summed over the lifetime of the universe. See http://wiki.lesswrong.com/wiki/Paperclip_maximizer.
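One way to write this definition down, as a sketch rather than a canonical formalization: the symbols $\tau_i$ (lifetime in seconds of paperclip $i$), $N(t)$ (number of paperclips existing at time $t$), $T$ (the end of the universe's lifetime), and $a$ (a candidate action) are notation assumed here, and the constant of proportionality is normalized to 1.

```latex
U \;=\; \sum_{i} \tau_i \;=\; \int_{0}^{T} N(t)\,dt,
\qquad
a^{*} \;=\; \arg\max_{a}\; \mathbb{E}\left[\, U \mid a \,\right]
```

The sum and the integral agree because each paperclip contributes exactly its own lifetime to the total of paperclip-seconds. The expectation is taken under the agent's own beliefs about what each action leads to.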

The agent may be a bounded maximizer rather than an objective maximizer without changing the key ideas; the core premise is just that, given actions A and B where the paperclip maximizer has evaluated the consequences of both actions, the paperclip maximizer always prefers the action that it expects to lead to more paperclips.
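The preference rule above can be illustrated with a minimal Python sketch. The belief table, the function names, and the numbers are all hypothetical and chosen only for illustration. The sketch assumes the agent represents each evaluated action as a list of (probability, paperclip-seconds) outcomes, and it only ever compares actions it has actually evaluated.

```python
from typing import Dict, List, Tuple

# Hypothetical belief model: each evaluated action maps to a list of
# (probability, paperclip_seconds) outcomes the agent considers possible.
Outcome = Tuple[float, float]


def expected_paperclip_seconds(outcomes: List[Outcome]) -> float:
    """Expected utility, with utility linear in paperclip-seconds."""
    return sum(p * clip_seconds for p, clip_seconds in outcomes)


def choose_action(beliefs: Dict[str, List[Outcome]]) -> str:
    """Return the evaluated action with the highest expected paperclip-seconds."""
    return max(beliefs, key=lambda a: expected_paperclip_seconds(beliefs[a]))


# Illustrative numbers only: A is a gamble, B is a sure thing.
beliefs = {
    "A": [(0.5, 100.0), (0.5, 0.0)],  # expected value: 50 paperclip-seconds
    "B": [(1.0, 60.0)],               # expected value: 60 paperclip-seconds
}
chosen = choose_action(beliefs)
```

Here the agent prefers B because its expected paperclip-seconds are higher, even though A has the larger best case. Nothing else about the outcomes enters the comparison.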

Some key ideas that the notion of an expected paperclip maximizer illustrates:


Comments

Patrick LaVictoire

Maybe a useful thing to add: when we say things like "if X goes wrong, I expect your AI to become a paperclip maximizer", we don't necessarily mean that the AI will have a terminal goal as human-comprehensible and human-stupid as "maximizing paperclips". We mean that it will actually seek to maximize a goal that isn't very near the exact direction of human preference, and that, thanks to instrumental goals and edge instantiation, this results in a world that is just as worthless to human values as if we had let loose a paperclip maximizer.