Central examples

value_alignment_central_examples.json

https://arbital.com/p/value_alignment_central_examples

by Eliezer Yudkowsky Jul 14 2015 updated Dec 28 2015

List of central examples in the Value Alignment Theory domain.

[summary: Comprehensive list of central examples used in Value Alignment Theory.]

Central examples used in Value Alignment Theory:

Paperclip maximizer
instrumental convergence
Gandhian stability & Orthogonality
value not instrumentally convergent
agents
nanotech catastrophe?
programmer manipulation?
astronomical failure
Smile maximizer
Value identification problem
Unforeseen maximum
Edge Instantiation
Patch resistance
Treacherous Turn
- Programmer deception
Context Change problem
Complexity of value
AIXI and AIXI-tl
Cartesian boundaries
- problem of 'wireheading' the reward signal
methodology of unbounded analysis
what we do and don't know about AI
agents
Little box in a cellular automaton?
Naturalized induction
Nuclear Prisoner's Dilemma
timeless decision theory & Newcomblike problems
problem of the blackmail-free equilibrium
division-of-gains problem (would need a further-expanded matrix)
Delta-sigma agents?
tiling agents
ZF provability Oracle
power/safety tradeoff
Notion of a pivotal achievement
Boxing problem
Behaviorist genie
power/safety tradeoff
defeater for some agency and recursion assumptions
That Alien Message
Boxing problem
Cognitive uncontainability
Diamond maximizer
Ontology identification problem