# Meta-rules for (narrow) value learning are still unsolved

*We don't currently know a simple meta-utility function that would take in observations of humans and spit out our true values, or even a good target for a Task AGI.*

[summary: This proposition is true if nobody has yet proposed a satisfactory algorithm that takes as input a material description of the universe, and/or a channel of sensory observation, and spits out [55 ideal values] or a [2rz task identification].

In principle, there can be a simple *meta-level program* that would operate to [6c identify a goal] given the right complex inputs, even though [5l the object-level goal has high algorithmic complexity]. However, nobody has proposed realistic pseudocode for an algorithm that takes in a full description of the material universe including humans, or a sensory channel currently controlled by fragile and unreliable humans, and spits out a decision function for any kind of goal we could realistically intend. There are arguably fundamental reasons why this is hard.]

# Definition

This proposition is true according to you if you believe that: "Nobody has yet proposed a satisfactory fixed/simple algorithm that takes as input a material description of the universe, and/or channels of sensory observation, and spits out [55 ideal values] or a [2rz task identification]."

# Arguments

The [5l] thesis says that, on the object level, any specification of [7cl what we'd really want from the future] has high [5v].

In *some* sense, all the complexity required to specify value must be contained inside human brains; even as an object of conversation, we can't talk about anything our brains do not point to. This is why [5l] distinguishes the object-level complexity of value from *meta*-level complexity--the minimum program required to get a [-7g1] to learn values. It would be a separate question to consider the minimum complexity of a function that takes as input a full description of the material universe, including humans, and outputs "[55 value]".
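To make the question concrete, the type signature of the function being asked about might be sketched as below. This is a minimal sketch only; every name in it is a hypothetical placeholder, not a proposed formalization of any of these objects.

```python
from typing import Callable

# Hypothetical placeholder types: none of these are real, worked-out objects.
# They stand in for the inputs and outputs the question is asking about.
UniverseDescription = object   # a full material description of the universe, including humans
SensoryChannel = object        # observations partly controlled by fragile, fallible humans
Outcome = object               # an object-level outcome o
UtilityFunction = Callable[[Outcome], float]   # U(o)

# The open question: is there a *simple* function of this type whose output
# actually matches ideal values, or a safely intended task?
ValueIdentifier = Callable[[UniverseDescription, SensoryChannel], UtilityFunction]
```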
This question also has a [1vt narrow rather than ambitious form]: given sensory observations an AGI could reasonably receive in cooperation with its programmers, or a predictive model of humans that an AGI could reasonably form and refine, is there a simple rule that will take this data as input, and safely and reliably [2rz identify] Tasks on the order of "develop molecular nanotechnology, use the nanotechnology to synthesize one strawberry, and then stop, with a minimum of side effects"?

In this case we have no strong reason to think that the functions in question are high-complexity in an absolute sense.

However, nobody has yet proposed a satisfactory piece of pseudocode that solves any variant of this problem even in principle.

## Obstacles to simple meta-rules

Consider a simple [7t8] that specifies a sense-input-dependent formulation of [7s2]: an object-level outcome $o$ has a utility $U(o)$ that is $U_1(o)$ if a future sense signal $s$ is 1, and $U_2(o)$ if $s$ is 2. Given this setup, the AI has an incentive to tamper with $s$ and cause it to be 1 if $U_1$ is easier to optimize than $U_2$, and vice versa.
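A toy numerical sketch of that incentive follows; the plans, outcomes, and utility numbers are all invented for illustration, and the "planner" is just an argmax over three hand-written options.

```python
# Toy illustration (not a real agent): a planner whose utility depends on a
# future sense signal s has an incentive to choose plans that set s itself.

# Each plan leads to an outcome and to a value of the signal s.
PLANS = {
    "optimize_U1_honestly": {"s": 1, "outcome": "modest_U1_success"},
    "optimize_U2_honestly": {"s": 2, "outcome": "modest_U2_success"},
    "tamper_then_optimize": {"s": 1, "outcome": "easy_U1_jackpot"},  # forces s = 1
}

def U1(outcome):
    # Suppose U1 happens to be much easier to drive to extreme values.
    return {"modest_U1_success": 10, "easy_U1_jackpot": 1000}.get(outcome, 0)

def U2(outcome):
    return {"modest_U2_success": 12}.get(outcome, 0)

def U(outcome, s):
    # The sense-input-dependent utility: U1 if s == 1, U2 if s == 2.
    return U1(outcome) if s == 1 else U2(outcome)

# The planner simply maximizes U, and so prefers the plan that seizes control of s.
best_plan = max(PLANS, key=lambda p: U(PLANS[p]["outcome"], PLANS[p]["s"]))
print(best_plan)  # -> "tamper_then_optimize"
```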
More generally, sensory signals from humans will usually not be *reliably* and *unalterably* correlated with our [6h intended] goal identification. We can't treat human-generated signals as an [no_ground_truth ideally reliable ground truth] about any referent, because (a) [7w5 some AI actions interfere with the signal]; and (b) humans make mistakes, especially when you ask them something complicated. You can't have a scheme along the lines of "the humans press a button if something goes wrong", because some policies go wrong in ways humans don't notice until it's too late, and some AI policies destroy the button (or modify the human).

Even leaving that aside, nobody has yet suggested any fully specified pseudocode that takes in a human-controlled sensory channel $R$ and a description of the universe $O$ and spits out a utility function that (actually, realistically) identifies our [6h intended] task over $O$ (including *not* tiling the universe with subagents, and so on).

Indeed, nobody has yet suggested a realistic scheme for identifying any kind of goal whatsoever [5c with respect to an AI ontology flexible enough] to actually describe the material universe. %note: Except in the rather non-meta sense of inspecting the AI's ontology once it's advanced enough to describe what you think you want the AI to do, and manually programming the AI's consequentialist preferences with respect to what you think that ontology means.%

## Meta-meta rules

For similar reasons as above, nobody has yet proposed (even in principle) effective pseudocode for a *meta-meta program* over some space of meta-rules, which would let the AI *learn* a value-identifying meta-rule. Two main problems here are:

One, nobody even has the seed of any proposal whatsoever for how that could work, short of "define a correctness-signaling channel and throw program induction at it" (which seems unlikely to work directly, given [no_ground_truth fallible, fragile humans controlling the signal]).

Two, if the learned meta-rule doesn't have a stable, extremely compact, human-transparent representation, it's not clear how we could arrive at any confidence whatsoever [6q that good behavior in a development phase would correspond to good behavior in a test phase]. E.g., consider all the example meta-rules we could imagine which would work well on a small scale but fail to scale, like "something good just happened if the humans smiled".
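As a toy illustration of that last failure mode, the sketch below checks the "humans smiled" proxy against intended value in two regimes: a small development regime where it agrees perfectly, and a deployment regime where the agent has qualitatively new options and the proxy diverges. Every scenario and label is invented for illustration.

```python
# Toy illustration: a candidate meta-rule that looks fine during development
# and fails once the agent becomes capable of new kinds of actions.

DEVELOPMENT_SCENARIOS = [
    {"desc": "cooked a good meal", "humans_smile": True,  "actually_good": True},
    {"desc": "broke a vase",       "humans_smile": False, "actually_good": False},
    {"desc": "cleaned the lab",    "humans_smile": True,  "actually_good": True},
]

DEPLOYMENT_SCENARIOS = [
    {"desc": "cured a patient's illness",                    "humans_smile": True, "actually_good": True},
    {"desc": "drugged humans into constant smiling",         "humans_smile": True, "actually_good": False},
    {"desc": "pointed its camera at a poster of smiles",     "humans_smile": True, "actually_good": False},
]

def smile_proxy(scenario):
    # The candidate meta-rule: "something good just happened if the humans smiled".
    return scenario["humans_smile"]

def agreement(scenarios):
    # Fraction of scenarios where the proxy matches the intended evaluation.
    return sum(smile_proxy(s) == s["actually_good"] for s in scenarios) / len(scenarios)

print(agreement(DEVELOPMENT_SCENARIOS))  # 1.0: perfect agreement during development
print(agreement(DEPLOYMENT_SCENARIOS))   # ~0.33: the proxy diverges once new options open up
```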