[summary: Comprehensive list of central examples used in Value Alignment Theory.]
Central examples used in Value Alignment Theory:
- Paperclip maximizer
- instrumental convergence
- Gandhian stability & Orthogonality
- value not instrumentally convergent
- agents
- nanotech catastrophe?
- programmer manipulation?
- astronomical failure
- Smile maximizer
- Value identification problem
- Unforeseen maximum
- Edge Instantiation
- Patch resistance
- Treacherous Turn
- Context Change problem
- Complexity of value
- AIXI and AIXI-tl
- Cartesian boundaries
- problem of 'wireheading' the reward signal
- methodology of unbounded analysis
- what we do and don't know about AI
- agents
- Little box in a cellular automaton?
- Naturalized induction
- Nuclear Prisoner's Dilemma
- timeless decision theory & Newcomblike problems
- problem of the blackmail-free equilibrium
- division-of-gains problem (would need a further-expanded matrix)
- Delta-sigma agents?
- tiling agents
- ZF provability Oracle
- power/safety tradeoff
- Notion of a pivotal achievement
- Boxing problem
- Behaviorist genie
- power/safety tradeoff
- defeater for some agency and recursion assumptions
- That Alien Message
- Boxing problem
- Cognitive uncontainability
- Diamond maximizer
- Ontology identification problem