Limited AGI

https://arbital.com/p/limited_agi

by Eliezer Yudkowsky Jul 10 2016 updated Feb 22 2017

Task-based AGIs don't need unlimited cognitive and material powers to carry out their Tasks, which means their powers can potentially be limited.


One of the reasons why a Task AGI can potentially be safer than an Autonomous AGI is that, since Task AGIs only need to carry out activities of limited scope, they may only need limited material and cognitive powers to carry out those tasks. The nonadversarial principle still applies, but takes the form of "don't run the search" rather than "make sure the search returns the correct answer".

Obstacles

• Increasing one's material and cognitive efficacy is instrumentally convergent for a very wide range of goals, so this tendency would presumably need to be averted in many different parts of the system.

• Good limitation proposals are [deceptive_ease not as easy as they look] because particular domain capabilities can often be derived from more general architectures. An Artificial General Intelligence doesn't have a handcrafted 'thinking about cars' module and a handcrafted 'thinking about planes' module, so you can't just handcraft the two modules at different levels of ability.

E.g., many have suggested that 'drive' or 'emotion' is something that can be selectively removed from AGIs to 'limit' their ambitions; presumably these people are using a mental model that is not the standard expected utility agent model. To know which kinds of limitation are easy, you need a sufficiently good background picture of the AGI's subprocesses to understand where the system's capabilities naturally carve at the joints.
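To make the point concrete, here is a minimal sketch of a standard expected utility agent (purely illustrative; the `actions`, `world_model`, and `utility` arguments are hypothetical stand-ins, not part of any particular proposal). There is no separable 'drive' or 'emotion' component that could be deleted to limit its ambitions; any apparent ambition is just whatever falls out of the argmax.

```python
# Minimal sketch of a standard expected-utility agent (illustrative only;
# `actions`, `world_model`, and `utility` are hypothetical stand-ins).
# There is no separate "drive" or "emotion" module to delete: the agent's
# apparent ambition is whatever falls out of the argmax below.

def expected_utility(action, world_model, utility):
    """Average utility over the outcomes the model predicts for `action`."""
    return sum(prob * utility(outcome)
               for outcome, prob in world_model(action))

def choose_action(actions, world_model, utility):
    """Select the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, world_model, utility))
```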

Related ideas

The research avenue of Mild optimization can be viewed as pursuing a kind of very general Limitation.
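As one concrete picture of what such a general Limitation might look like, the sketch below contrasts an unlimited optimizer with a quantilizer-style mild optimizer that samples from the better actions instead of pushing expected utility to its extreme. This is only an illustration under assumed names (`maximize`, `quantilize`, `expected_utility`), not a specification of the mild optimization research avenue.

```python
import random

# Illustrative contrast between unlimited optimization and one mild-optimization
# proposal (a quantilizer-style sketch over a uniform base distribution).

def maximize(actions, expected_utility):
    """Unlimited optimization: push expected utility to its extreme."""
    return max(actions, key=expected_utility)

def quantilize(actions, expected_utility, q=0.1):
    """Mild optimization: sample uniformly from the top q fraction of
    actions by expected utility, rather than taking the single best one."""
    ranked = sorted(actions, key=expected_utility, reverse=True)
    cutoff = max(1, int(len(ranked) * q))
    return random.choice(ranked[:cutoff])
```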

Behaviorism asks us to Limit the AGI's ability to model other minds in non-whitelisted detail.

Taskishness can be seen as an Alignment/Limitation hybrid in the sense that it asks the AI to want and try to do only a bounded amount at every level of internal organization.

Low impact can be seen as an Alignment/Limitation hybrid in the sense that a successful impact penalty would make the AI not want to implement larger-scale plans.
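A minimal sketch of how an impact penalty might enter the agent's objective is given below. The names (`task_utility`, `impact_measure`, `penalty_weight`) are hypothetical; in particular, the impact measure itself is the hard open problem and is left abstract here.

```python
# Illustrative sketch of a penalized objective (hypothetical names).
# Subtracting a heavily weighted impact measure from the task utility is
# meant to make larger-scale plans score worse, even if they accomplish
# the task slightly better.

def penalized_utility(action, task_utility, impact_measure, penalty_weight=10.0):
    """Task utility minus a weighted impact penalty."""
    return task_utility(action) - penalty_weight * impact_measure(action)
```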

Limitation may be viewed as yet another subproblem of the Hard problem of corrigibility, since it seems like the type of precaution that a generic agent would want to build into a generic imperfectly-aligned subagent.

Limitation can be seen as motivated by both the Non-adversarial principle and the Minimality principle.