Re: simulating a hostile superintelligence:
I find this concern really unconcerning.
Some points:
- This is only really a problem if our own AI development, on Earth, is going so slowly that "having your AI speculate about what aliens might do" is not only the most effective way to develop a powerful AI but way more effective than what we were doing anyway. But it looks like "do AI development super slowly" is already a dead end for a bunch of other reasons, so we don't really need to talk about this particular bizarre reason. I guess you aren't yet convinced that this is a dead end, but I do hope to convince you at some point.
- At the point where such massive amounts of internal computing power are being deployed, it seems implausible that an AI system won't be thinking about how to think. At that point, the concern is not about the internal robustness of our system, but instead about whether the AI is well-calibrated about its own internal robustness. The latter problem seems like one that we essentially have to solve anyway.
I think that there is a higher burden of proof for advancing concerns that AI researchers will dismiss out of hand as crazy, and that we should probably only do it for concerns that are way more solid than this one. Otherwise (1) it will become impossible to advance real concerns that sound crazy, because a pattern will have been established that crazy-sounding concerns actually are crazy, and (2) people interested in AI safety will be roundly dismissed as crazy.
Comments
Eliezer Yudkowsky
The concern is for when you have a preference-limited AI that already contains enough computing power and has enough potential intelligence to be extremely dangerous, and it contains something that's smaller than itself but unlimited and hostile. Like, your genie has a lot of cognitive power but, by design of its preferences, it doesn't do more than a fraction of what it could; if that's a primary scenario you're optimizing for, then having your genie thinking deeply about possible hostile superintelligences seems potentially worrisome. In fact, it seems like a case of, "If you try to channel cognitive resources this way, but you ignore this problem, of course the AI just blows up anyway."
I agree that, like a large subset of potential killer problems, this would not be high on my list of things to explain to people who were already having trouble "taking things seriously", just as I'd be trying to phrase everything in terms of scenarios with no nanotechnology even though I think the physics argument for nanotechnology is straightforward.