Modeling distant superintelligences

https://arbital.com/p/distant_SIs

by Eliezer Yudkowsky Dec 28 2015 updated Dec 30 2015

The several large problems that might occur if an AI starts to think about alien superintelligences.


[summary: One of the things we almost certainly don't want our AI to do, unless we're extremely confident that it is extremely robust and value-aligned, is have it think about and try to model alien civilizations that might contain superintelligences or potential simulators. This could result in the AI internally simulating a hostile superintelligence that 'breaks out of the box', or the AI committing mindcrime in the course of modeling distant sapient minds, or weirder possibilities. Since there's no known immediate problem that requires modeling distant civilizations, the obvious course is to build AIs that just don't think about aliens, if that's possible.]

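One of the things we almost certainly don't want our AI to do, unless we're extremely confident that it is extremely robust and value-aligned, is have it think about and try to model alien civilizations that might contain superintelligences or potential simulators. Potential problems include the AI internally simulating a hostile superintelligence that 'breaks out of the box', the AI committing mindcrime in the course of modeling distant sapient minds, and weirder possibilities.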

Since there's no known task that actually requires a non-Sovereign AI to think about distant superintelligences, it seems like we should probably react to this possibility by figuring out how to design the first AI such that it just does not think about aliens, period. This would require averting an instrumental pressure and excluding an epistemic question that a sufficiently advanced AI would otherwise naturally consider in the course of, e.g., considering likely explanations for the Fermi Paradox.

For a given agent, this scenario is not dangerous to the extent that the agent is not capable of modeling a dangerous other mind or considering logical decision theories in the first place.


Comments

Paul Christiano

Re: simulating a hostile superintelligence:

I find this concern really unconcerning.

Some points:

I think that there is a higher burden of proof for advancing concerns that AI researchers will dismiss out of hand as crazy, and that we should probably only do it for concerns that are way more solid than this one. Otherwise (1) it will become impossible to advance real concerns that sound crazy, if a pattern is established that crazy-sounding concerns actually are crazy, (2) people interested in AI safety will be roundly dismissed as crazy.

Eliezer Yudkowsky

The concern is for when you have a preference-limited AI that already contains enough computing power and has enough potential intelligence to be extremely dangerous, and it contains something that's smaller than itself but unlimited and hostile. Like, your genie has a lot of cognitive power but, by design of its preferences, it doesn't do more than a fraction of what it could; if that's a primary scenario you're optimizing for, then having your genie thinking deeply about possible hostile superintelligences seems potentially worrisome. In fact, it seems like a case of, "If you try to channel cognitive resources this way, but you ignore this problem, of course the AI just blows up anyway."

I agree that like a large subset of potential killer problems, this would not be high on my list of things to explain to people who were already having trouble "taking things seriously", just like I'd be trying to phrase everything in terms of scenarios with no nanotechnology even though I think the physics argument for nanotechnology is straightforward.