{
localUrl: '../page/research_directions_ai_control.html',
arbitalUrl: 'https://arbital.com/p/research_directions_ai_control',
rawJsonUrl: '../raw/1tz.json',
likeableId: '777',
likeableType: 'page',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
pageId: 'research_directions_ai_control',
edit: '7',
editSummary: '',
prevEdit: '6',
currentEdit: '7',
wasPublished: 'true',
type: 'wiki',
title: 'Research directions in AI control',
clickbait: '',
textLength: '3001',
alias: 'research_directions_ai_control',
externalUrl: '',
sortChildrenBy: 'likes',
hasVote: 'false',
voteType: '',
votesAnonymous: 'false',
editCreatorId: 'PaulChristiano',
editCreatedAt: '2016-03-04 00:50:02',
pageCreatorId: 'PaulChristiano',
pageCreatedAt: '2016-02-03 00:07:31',
seeDomainId: '0',
editDomainId: '705',
submitToDomainId: '0',
isAutosave: 'false',
isSnapshot: 'false',
isLiveEdit: 'true',
isMinorEdit: 'false',
indirectTeacher: 'false',
todoCount: '0',
isEditorComment: 'false',
isApprovedComment: 'true',
isResolved: 'false',
snapshotText: '',
anchorContext: '',
anchorText: '',
anchorOffset: '0',
mergedInto: '',
isDeleted: 'false',
viewCount: '14',
text: '\nWhat research would best advance our understanding of AI control?\n\nI’ve been thinking about this question a lot over the last few weeks. This post lays out my best guesses.\n\n### My current take on AI control\n\nI want to [focus on existing AI techniques](https://arbital.com/p/1w2), minimizing speculation about future developments. As a special case, I would like to [use minimal assumptions about unsupervised learning](https://arbital.com/p/1w3), instead relying on supervised and [reinforcement learning](https://arbital.com/p/1v2?title=reinforcement-learning-and-linguistic-convention). My goal is to find [scalable](https://arbital.com/p/1v1?title=scalable-ai-control) approaches to AI control that can be applied to existing AI systems.\n\nFor now, I think that [act-based approaches](https://arbital.com/p/1w4) look significantly more promising than goal-directed approaches. (Note that both categories are [consistent with using value learning](https://arbital.com/p/1vj?title=learn-policies-or-goals).) I think that many apparent problems are distinctive to goal-directed approaches [and can be temporarily set aside](https://arbital.com/p/1w5). But a more direct motivation is that the goal-directed approach seems to require speculative future developments in AI, whereas we can [take a stab](https://arbital.com/p/1vw) at the act-based approach now (though obviously much more work is needed).\n\nIn light of those views, I find the following research directions most attractive:\n\n### Four promising directions\n\n- [Elaborating on apprenticeship learning](https://arbital.com/p/1vx/elaborations_apprenticeship_learning). \nImitating human behavior seems especially promising as a scalable approach to AI control, but there are many outstanding problems.\n- [Efficiently using human feedback](https://arbital.com/p/1w1). \nThe limited availability of human feedback may be a serious bottleneck for realistic approaches to AI control.\n- [Explaining human judgments and disagreements](https://arbital.com/p/1vy/human_arguments_ai_control). \nMy preferred approach to AI control requires humans to understand AIs’ plans and beliefs. We don’t know how to solve the analogous problem for humans.\n- [Designing feedback mechanisms for reinforcement learning](https://arbital.com/p/1vd?title=reward-engineering). \nA grab bag of problems, united by a need for proxies of hard-to-optimize, implicit objectives.\n\nI will probably be doing work in one or more of these directions soon. I am also interested in talking with anyone who is considering looking into these or similar questions.\n\nI’d love to find considerations that would change my view — whether arguments against these projects, or more promising alternatives. But these are my current best guesses, and I consider them good enough that the right next step is to work on them.\n\n(This research was supported as part of the [_Future of Life Institute_](http://futureoflife.org/) FLI-RFP-AI1 program, grant #2015–143898.)',
metaText: '',
isTextLoaded: 'true',
isSubscribedToDiscussion: 'false',
isSubscribedToUser: 'false',
isSubscribedAsMaintainer: 'false',
discussionSubscriberCount: '1',
maintainerCount: '1',
userSubscriberCount: '0',
lastVisit: '',
hasDraft: 'false',
votes: [],
voteSummary: 'null',
muVoteSummary: '0',
voteScaling: '0',
currentUserVote: '-2',
voteCount: '0',
lockedVoteType: '',
maxEditEver: '0',
redLinkCount: '0',
lockedBy: '',
lockedUntil: '',
nextPageId: '',
prevPageId: '',
usedAsMastery: 'false',
proposalEditNum: '0',
permissions: {
edit: {
has: 'false',
reason: 'You don\'t have domain permission to edit this page'
},
proposeEdit: {
has: 'true',
reason: ''
},
delete: {
has: 'false',
reason: 'You don\'t have domain permission to delete this page'
},
comment: {
has: 'false',
reason: 'You can\'t comment in this domain because you are not a member'
},
proposeComment: {
has: 'true',
reason: ''
}
},
summaries: {},
creatorIds: [
'PaulChristiano'
],
childIds: [],
parentIds: [
'active_learning_powerful_predictors'
],
commentIds: [],
questionIds: [],
tagIds: [],
relatedIds: [],
markIds: [],
explanations: [],
learnMore: [],
requirements: [],
subjects: [],
lenses: [],
lensParentId: '',
pathPages: [],
learnMoreTaughtMap: {},
learnMoreCoveredMap: {},
learnMoreRequiredMap: {},
editHistory: {},
domainSubmissions: {},
answers: [],
answerCount: '0',
commentCount: '0',
newCommentCount: '0',
linkedMarkCount: '0',
changeLogs: [
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '8266',
pageId: 'research_directions_ai_control',
userId: 'JessicaChuan',
edit: '7',
type: 'newEdit',
createdAt: '2016-03-04 00:50:02',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '7783',
pageId: 'research_directions_ai_control',
userId: 'JessicaChuan',
edit: '6',
type: 'newEdit',
createdAt: '2016-02-25 01:58:12',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '7758',
pageId: 'research_directions_ai_control',
userId: 'JessicaChuan',
edit: '5',
type: 'newEdit',
createdAt: '2016-02-24 23:13:28',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '6890',
pageId: 'research_directions_ai_control',
userId: 'JessicaChuan',
edit: '3',
type: 'newEdit',
createdAt: '2016-02-11 22:43:16',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '6184',
pageId: 'research_directions_ai_control',
userId: 'JessicaChuan',
edit: '0',
type: 'deleteChild',
createdAt: '2016-02-03 00:15:30',
auxPageId: 'implicit_consequentialism',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '6182',
pageId: 'research_directions_ai_control',
userId: 'JessicaChuan',
edit: '2',
type: 'newChild',
createdAt: '2016-02-03 00:13:33',
auxPageId: 'implicit_consequentialism',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '6181',
pageId: 'research_directions_ai_control',
userId: 'JessicaChuan',
edit: '2',
type: 'newEdit',
createdAt: '2016-02-03 00:08:06',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '6180',
pageId: 'research_directions_ai_control',
userId: 'JessicaChuan',
edit: '1',
type: 'newEdit',
createdAt: '2016-02-03 00:07:31',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '6179',
pageId: 'research_directions_ai_control',
userId: 'JessicaChuan',
edit: '0',
type: 'newParent',
createdAt: '2016-02-03 00:04:37',
auxPageId: 'active_learning_powerful_predictors',
oldSettingsValue: '',
newSettingsValue: ''
}
],
feedSubmissions: [],
searchStrings: {},
hasChildren: 'false',
hasParents: 'true',
redAliases: {},
improvementTagIds: [],
nonMetaTagIds: [],
todos: [],
slowDownMap: 'null',
speedUpMap: 'null',
arcPageIds: 'null',
contentRequests: {}
}