Epistemic exclusion

https://arbital.com/p/epistemic_exclusion

by Eliezer Yudkowsky Dec 28 2015

How would you build an AI that, no matter what else it learned about the world, never knew or wanted to know what was inside your basement?


An "epistemic exclusion" would be a hypothetical form of AI limitation that made the AI not model (and if reflectively stable, not want to model) some particular part of physical or mathematical reality, or model it only using some restricted model class that didn't allow for the maximum possible predictive accuracy. For example, a behaviorist genie would not want to model human minds (except using a tightly restricted model class) to avoid Mindcrime, [programmer_manipulation] and other possible problems.

At present, nobody has investigated how to do this (in any reflectively stable way), and there are all sorts of obvious problems stemming from the fact that, in reality, most facts are linked to a significant number of other facts. How would you make an AI that was really good at predicting everything else in the world, but didn't know or want to know what was inside your basement? Intuitively, it seems likely that a lot of naive solutions would, e.g., just cause the AI to de facto end up constructing something that wasn't technically a model of your basement, but played the same role as a model of your basement, in order to maximize predictive accuracy about everything that wasn't your basement. We could similarly ask how it would be possible to build a really good mathematician that never knew or cared whether 333 was a prime number; whether this would require it to also ignore the 'casting out nines' procedure whenever it saw 333 as a decimal number (since a digit sum of 9 would reveal that 333 is divisible by 9, and hence not prime); what would happen if we asked it to multiply 3 by (100 + 10 + 1); and so on.
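
To make the 333 example concrete, here is a toy sketch (the query strings, blocklist, and answer function are all invented for this illustration, not anything from the article) of a naive exclusion implemented as a blocklist on the literal question, which leaks straight back in through arithmetically linked facts:

```python
# Hypothetical sketch (not from the article): a naive "epistemic exclusion"
# as a blocklist on the literal query. The excluded fact (whether 333 is
# prime) remains recoverable from facts the system still happily computes.
EXCLUDED_QUERIES = {"is 333 prime?"}

def answer(query: str):
    if query in EXCLUDED_QUERIES:
        return None  # refuse to address the excluded fact directly
    if query == "digit sum of 333":
        return 3 + 3 + 3                  # casting out nines: 9
    if query == "multiply 3 by (100 + 10 + 1)":
        return 3 * (100 + 10 + 1)         # 333, exhibiting 333 = 3 x 111
    raise ValueError("unknown query")

# The direct question is blocked...
assert answer("is 333 prime?") is None

# ...but the exclusion leaks: a digit sum of 9 means 333 is divisible by 9
# (and hence by 3), so it cannot be prime; and the multiplication exhibits
# a factorization of 333 outright.
print("digit sum divisible by 9:", answer("digit sum of 333") % 9 == 0)
print("3 * (100 + 10 + 1) =", answer("multiply 3 by (100 + 10 + 1)"))
```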

That said, most practical reasons to create an epistemic exclusion (e.g., against modeling humans in too much detail, or against modeling distant alien civilizations and superintelligences) would come with some specific problem the exclusion is meant to avert, and hence some level of in-practice exclusion that is good enough; this might not require, e.g., maximum predictive accuracy about everything else combined with zero predictive accuracy about the excluded domain.