Expected utility agent

https://arbital.com/p/expected_utility_agent

by Eliezer Yudkowsky Dec 2 2015 updated Jan 20 2017

If you're not some kind of expected utility agent, you're going in circles.


An Expected utility agent has some way of consistently scoring all the possible outcomes of its actions, like assigning 20 points to saving a burning orphanage. The agent weighs its actions by estimating the [probabilistic_expectation probability-weighted average] score of each action's consequences. For example, an action with a 50% chance of leading to an outcome with utility 20, a 25% chance of leading to an outcome with utility 35, and a 25% chance of leading to an outcome with utility 45 would have an expected utility of 30. These utilities can potentially reflect any sort of morality or values - selfishness, altruism, or paperclips. Several [ famous mathematical theorems] suggest that if you can't be viewed as some type of expected utility agent, you must be [circular_preferences going in circles], [Dutch_book_argument making bad bets], or exhibiting other detrimental behaviors. Several [ famous experiments] show that human beings do exhibit those behaviors, and [ can't be viewed as expected utility agents].
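
To make the arithmetic above concrete, here is a minimal Python sketch (illustrative only; representing an action as a list of (probability, utility) pairs is an assumption of the sketch, not anything this page defines):

```python
def expected_utility(outcome_distribution):
    """Probability-weighted average utility of an action's possible outcomes.

    outcome_distribution: list of (probability, utility) pairs whose
    probabilities sum to 1.
    """
    return sum(p * u for p, u in outcome_distribution)

# The action from the example: 50% chance of utility 20,
# 25% chance of utility 35, 25% chance of utility 45.
action = [(0.50, 20), (0.25, 35), (0.25, 45)]
print(expected_utility(action))  # 30.0
```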

[summary(Brief): An Expected utility agent has some way of scoring the consequences of its actions (e.g., rescuing a burning orphanage is worth 20 points), and it weighs actions according to their expected scores. This simple-sounding assumption has a lot of consequences.]

[summary(Technical): An [ agent] with a [ coherent] [ utility function] over [ outcomes] and a coherent [action_counterfactuals counterfactual] [ probability function] that relates its accessible [ actions] to their probable outcomes. Combining the utility function over outcomes with the probability function from actions to outcomes yields each action's expected utility. Most such agents treated in the literature are [expected_utility_maximizer maximizers], but other forms of optimization could also qualify, so long as the [ decision rule] treats actions with equal expected utility equivalently. Several [ famous coherence theorems] suggest that any agent not exhibiting stupid behavior must be viewable as an expected utility agent.]
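
A companion sketch of the maximizing decision rule described in the technical summary, under the assumption that the utility function U and the counterfactual probability function P are given as plain dictionaries (the action and outcome names here are hypothetical):

```python
def expected_utility(action, P, U):
    """Sum over outcomes of P(outcome | action) * U(outcome)."""
    return sum(P[action][outcome] * U[outcome] for outcome in P[action])

def choose(actions, P, U):
    """A maximizer's decision rule: pick an action with maximal expected utility.

    Any decision rule that depends on actions only through their expected
    utility (so that equal-EU actions are treated equivalently) would also
    qualify as an expected utility agent.
    """
    return max(actions, key=lambda a: expected_utility(a, P, U))

# Hypothetical toy problem: two actions, three outcomes.
U = {"orphanage_saved": 20, "orphanage_lost": 0, "nothing_happens": 5}
P = {
    "run_in":   {"orphanage_saved": 0.7, "orphanage_lost": 0.3},
    "stay_out": {"nothing_happens": 1.0},
}
print(choose(P.keys(), P, U))  # run_in  (EU = 14, versus 5 for stay_out)
```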

[todo: (Alexei: Is this line necessary if we have the summary paragraph visible?) An expected utility agent is an agent whose decision rule treats two actions equivalently whenever they have the same [expected_utility expected utility].]

[todo: write longer explanation of expected utility, the consequences of the assumption, and an introduction.]


Comments

Paul Christiano

It's easy to equivocate between "can be viewed as" and "is." Indeed, any rational agent "can be viewed as" an expected utility maximizer, but it need not have any internal architecture resembling such a maximizer. And in particular, the utility-function-being-maximized need not be represented explicitly.

Most of the actual oomph from decreeing something an expected utility maximizer seems to come from these additional assumptions, which aren't delivered by the relevant theorems. All the theorems give you is a characterization of the agent's attitude towards uncertainty (and so e.g. they have no content when there is no uncertainty).

(I expect the author doesn't often make this mistake, but it is pretty common in the broader LessWrong crowd.)