{
localUrl: '../page/reflective_degree_of_freedom.html',
arbitalUrl: 'https://arbital.com/p/reflective_degree_of_freedom',
rawJsonUrl: '../raw/2fr.json',
likeableId: '1370',
likeableType: 'page',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
pageId: 'reflective_degree_of_freedom',
edit: '4',
editSummary: '',
prevEdit: '3',
currentEdit: '4',
wasPublished: 'true',
type: 'wiki',
title: 'Reflectively consistent degree of freedom',
clickbait: 'When an instrumentally efficient, self-modifying AI can be like X or like X' in such a way that X wants to be X and X' wants to be X', that's a reflectively consistent degree of freedom.',
textLength: '3798',
alias: 'reflective_degree_of_freedom',
externalUrl: '',
sortChildrenBy: 'likes',
hasVote: 'false',
voteType: '',
votesAnonymous: 'false',
editCreatorId: 'EliezerYudkowsky',
editCreatedAt: '2016-03-09 03:19:20',
pageCreatorId: 'EliezerYudkowsky',
pageCreatedAt: '2016-03-09 03:08:49',
seeDomainId: '0',
editDomainId: 'EliezerYudkowsky',
submitToDomainId: '0',
isAutosave: 'false',
isSnapshot: 'false',
isLiveEdit: 'true',
isMinorEdit: 'false',
indirectTeacher: 'false',
todoCount: '4',
isEditorComment: 'false',
isApprovedComment: 'true',
isResolved: 'false',
snapshotText: '',
anchorContext: '',
anchorText: '',
anchorOffset: '0',
mergedInto: '',
isDeleted: 'false',
viewCount: '259',
text: 'A "reflectively consistent degree of freedom" is when a self-modifying AI can have multiple possible properties $X_i \\in X$ such that an AI with property $X_1$ wants to go on being an AI with property $X_1,$ and an AI with $X_2$ will ceteris paribus only choose to self-modify into designs that are also $X_2,$ etcetera.\n\nThe archetypal reflectively consistent degree of freedom is a [humean_freedom Humean degree of freedom], the refective consistency of many different possible [1fw utility functions]. If Gandhi doesn't want to kill you, and you offer Gandhi a pill that makes him want to kill people, then [gandhi_stability_argument Gandhi will refuse the pill], because he knows that if he takes the pill then pill-taking-future-Gandhi will kill people, and the current Gandhi rates this outcome low in his preference function. Similarly, a [10h paperclip maximizer] wants to remain a paperclip maximizer. Since these two possible preference frameworks are both [71 consistent under reflection], they constitute a "reflectively consistent degree of freedom" or "reflective degree of freedom".\n\nFrom a design perspective, or the standpoint of an [1cv], the key fact about a reflectively consistent degree of freedom is that it doesn't automatically self-correct as a result of the AI trying to improve itself. The problem "Has trouble understanding General Relativity" or "Cannot beat a human at poker" or "Crashes on seeing a picture of a dolphin" is something that you might expect to correct automatically and without specifically directed effort, assuming you otherwise improved the AI's general ability to understand the world and that it was self-improving. "Wants paperclips instead of eudaimonia" is *not* self-correcting.\n\nAnother way of looking at it is that reflective degrees of freedom describe information that is not automatically extracted or learned given a sufficiently smart AI, the way it would automatically learn General Relativity. If you have a concept whose borders (membership condition) relies on knowing about General Relativity, then when the AI is sufficiently smart it will see a simple definition of that concept. If the concept's borders instead rely on [ value-laden] judgments, there may be no algorithmically simple description of that concept, even given lots of knowledge of the environment, because the [humean_freedom Humean degrees of freedom] need to be independently specified.\n\nOther properties besides the preference function look like they should be reflectively consistent in similar ways. For example, [ son of CDT] and [ UDT] both seem to be reflectively consistent in different ways. So an AI that has, from our perspective, a 'bad' decision theory (one that leads to behaviors we don't want), isn't 'bugged' in a way we can rely on to self-correct. (This is one reason why MIRI studies decision theory and not computer vision. There's a sense in which mistakes in computer vision automatically fix themselves, given a sufficiently advanced AI, and mistakes in decision theory don't fix themselves.)\n\nSimilarly, [27p Bayesian priors] are by default consistent under reflection - if you're a Bayesian with a prior, you want to create copies of yourself that have the same prior or [1ly Bayes-updated] versions of the prior. 
So 'bugs' (from a human standpoint) like being [Pascal's Muggable](https://wiki.lesswrong.com/wiki/Pascal's_mugging) might not automatically fix themselves as other knowledge and general capability grow, the way we might expect a specific mistaken belief about gravity to correct itself given sufficient general growth in capability. (This is why MIRI thinks about [ naturalistic induction] and similar questions about prior probabilities.)',
metaText: '',
isTextLoaded: 'true',
isSubscribedToDiscussion: 'false',
isSubscribedToUser: 'false',
isSubscribedAsMaintainer: 'false',
discussionSubscriberCount: '1',
maintainerCount: '1',
userSubscriberCount: '0',
lastVisit: '',
hasDraft: 'false',
votes: [],
voteSummary: 'null',
muVoteSummary: '0',
voteScaling: '0',
currentUserVote: '-2',
voteCount: '0',
lockedVoteType: '',
maxEditEver: '0',
redLinkCount: '0',
lockedBy: '',
lockedUntil: '',
nextPageId: '',
prevPageId: '',
usedAsMastery: 'false',
proposalEditNum: '0',
permissions: {
edit: {
has: 'false',
reason: 'You don't have domain permission to edit this page'
},
proposeEdit: {
has: 'true',
reason: ''
},
delete: {
has: 'false',
reason: 'You don't have domain permission to delete this page'
},
comment: {
has: 'false',
reason: 'You can't comment in this domain because you are not a member'
},
proposeComment: {
has: 'true',
reason: ''
}
},
summaries: {},
creatorIds: [
'EliezerYudkowsky'
],
childIds: [
'humean_free_boundary',
'value_laden'
],
parentIds: [
'reflective_stability'
],
commentIds: [
'2gh'
],
questionIds: [],
tagIds: [],
relatedIds: [],
markIds: [],
explanations: [],
learnMore: [],
requirements: [],
subjects: [],
lenses: [],
lensParentId: '',
pathPages: [],
learnMoreTaughtMap: {},
learnMoreCoveredMap: {},
learnMoreRequiredMap: {},
editHistory: {},
domainSubmissions: {},
answers: [],
answerCount: '0',
commentCount: '0',
newCommentCount: '0',
linkedMarkCount: '0',
changeLogs: [
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '8423',
pageId: 'reflective_degree_of_freedom',
userId: 'EliezerYudkowsky',
edit: '4',
type: 'newChild',
createdAt: '2016-03-09 03:22:08',
auxPageId: 'humean_free_boundary',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '8422',
pageId: 'reflective_degree_of_freedom',
userId: 'EliezerYudkowsky',
edit: '4',
type: 'newEdit',
createdAt: '2016-03-09 03:19:20',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '8419',
pageId: 'reflective_degree_of_freedom',
userId: 'EliezerYudkowsky',
edit: '3',
type: 'newEdit',
createdAt: '2016-03-09 03:11:53',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '8418',
pageId: 'reflective_degree_of_freedom',
userId: 'EliezerYudkowsky',
edit: '2',
type: 'newEdit',
createdAt: '2016-03-09 03:11:14',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '8417',
pageId: 'reflective_degree_of_freedom',
userId: 'EliezerYudkowsky',
edit: '1',
type: 'newEdit',
createdAt: '2016-03-09 03:08:49',
auxPageId: '',
oldSettingsValue: '',
newSettingsValue: ''
},
{
likeableId: '0',
likeableType: 'changeLog',
myLikeValue: '0',
likeCount: '0',
dislikeCount: '0',
likeScore: '0',
individualLikes: [],
id: '8416',
pageId: 'reflective_degree_of_freedom',
userId: 'EliezerYudkowsky',
edit: '0',
type: 'newParent',
createdAt: '2016-03-09 02:39:44',
auxPageId: 'reflective_stability',
oldSettingsValue: '',
newSettingsValue: ''
}
],
feedSubmissions: [],
searchStrings: {},
hasChildren: 'true',
hasParents: 'true',
redAliases: {},
improvementTagIds: [],
nonMetaTagIds: [],
todos: [],
slowDownMap: 'null',
speedUpMap: 'null',
arcPageIds: 'null',
contentRequests: {}
}