Evolution Strategies and Reinforcement Learning

{
  localUrl: '../page/evolution_strategies.html',
  arbitalUrl: 'https://arbital.com/p/evolution_strategies',
  rawJsonUrl: '../raw/88d.json',
  likeableId: '0',
  likeableType: 'page',
  myLikeValue: '0',
  likeCount: '0',
  dislikeCount: '0',
  likeScore: '0',
  individualLikes: [],
  pageId: 'evolution_strategies',
  edit: '1',
  editSummary: 'An overview of evolution strategies black box optimization.',
  prevEdit: '0',
  currentEdit: '1',
  wasPublished: 'true',
  type: 'wiki',
  title: 'Evolution Strategies and Reinforcement Learning',
  clickbait: 'Evolution strategies as a simplified implementation of reinforcement learning',
  textLength: '1229',
  alias: 'evolution_strategies',
  externalUrl: '',
  sortChildrenBy: 'likes',
  hasVote: 'false',
  voteType: '',
  votesAnonymous: 'false',
  editCreatorId: 'AshtonHellwig',
  editCreatedAt: '2017-04-19 06:43:13',
  pageCreatorId: 'AshtonHellwig',
  pageCreatedAt: '2017-04-19 06:43:13',
  seeDomainId: '0',
  editDomainId: '2638',
  submitToDomainId: '0',
  isAutosave: 'false',
  isSnapshot: 'false',
  isLiveEdit: 'true',
  isMinorEdit: 'false',
  indirectTeacher: 'false',
  todoCount: '0',
  isEditorComment: 'false',
  isApprovedComment: 'false',
  isResolved: 'false',
  snapshotText: '',
  anchorContext: '',
  anchorText: '',
  anchorOffset: '0',
  mergedInto: '',
  isDeleted: 'false',
  viewCount: '28',
  text: '[1qm Reinforcement learning] (RL) is a machine learning task in which an agent interacts with a dynamic environment in order to find the parameters of a policy function (typically on the order of 1,000,000 of them when the policy is a neural network) that best maps inputs to outputs. The key difference from supervised learning is that the agent is not shown the correct outputs; instead it receives feedback as rewards (which increase with efficacy) or punishments (for less-than-optimal behavior). This feedback suggests a way to improve the agent, rather than supplying a set of correct [6lq vector variables]; the improved agent is then used to collect new episodes of interaction, and the process repeats [(Salimans et al.)](https://blog.openai.com/evolution-strategies/). According to [7gz OpenAI], evolution strategies (ES), a class of black-box optimization methods that has been known for decades, can rival the performance of standard RL techniques on some benchmarks. ES ignores the distinction between agent and environment and treats the whole process as a black box that takes the policy parameters as input and returns a total reward, which also makes it easier to scale in a distributed system [(Salimans et al.)](https://blog.openai.com/evolution-strategies/). The ES [5v algorithm] is best described as a "guess and check" process: it tries random perturbations of the parameters and moves toward the ones that scored better. Reinforcement learning is being applied in areas such as self-driving cars and video game playing.\n',
  metaText: '',
  isTextLoaded: 'true',
  isSubscribedToDiscussion: 'false',
  isSubscribedToUser: 'false',
  isSubscribedAsMaintainer: 'false',
  discussionSubscriberCount: '1',
  maintainerCount: '1',
  userSubscriberCount: '0',
  lastVisit: '',
  hasDraft: 'false',
  votes: [],
  voteSummary: 'null',
  muVoteSummary: '0',
  voteScaling: '0',
  currentUserVote: '-2',
  voteCount: '0',
  lockedVoteType: '',
  maxEditEver: '0',
  redLinkCount: '0',
  lockedBy: '',
  lockedUntil: '',
  nextPageId: '',
  prevPageId: '',
  usedAsMastery: 'false',
  proposalEditNum: '0',
  permissions: {
    edit: {
      has: 'false',
      reason: 'You don't have domain permission to edit this page'
    },
    proposeEdit: {
      has: 'true',
      reason: ''
    },
    delete: {
      has: 'false',
      reason: 'You don't have domain permission to delete this page'
    },
    comment: {
      has: 'false',
      reason: 'You can't comment in this domain because you are not a member'
    },
    proposeComment: {
      has: 'true',
      reason: ''
    }
  },
  summaries: {},
  creatorIds: [
    'AshtonHellwig'
  ],
  childIds: [],
  parentIds: [],
  commentIds: [],
  questionIds: [],
  tagIds: [],
  relatedIds: [],
  markIds: [],
  explanations: [],
  learnMore: [],
  requirements: [],
  subjects: [],
  lenses: [],
  lensParentId: '',
  pathPages: [],
  learnMoreTaughtMap: {},
  learnMoreCoveredMap: {},
  learnMoreRequiredMap: {},
  editHistory: {},
  domainSubmissions: {},
  answers: [],
  answerCount: '0',
  commentCount: '0',
  newCommentCount: '0',
  linkedMarkCount: '0',
  changeLogs: [
    {
      likeableId: '0',
      likeableType: 'changeLog',
      myLikeValue: '0',
      likeCount: '0',
      dislikeCount: '0',
      likeScore: '0',
      individualLikes: [],
      id: '22461',
      pageId: 'evolution_strategies',
      userId: 'AshtonHellwig',
      edit: '1',
      type: 'newEdit',
      createdAt: '2017-04-19 06:43:13',
      auxPageId: '',
      oldSettingsValue: '',
      newSettingsValue: 'An overview of evolution strategies black box optimization.'
    }
  ],
  feedSubmissions: [],
  searchStrings: {},
  hasChildren: 'false',
  hasParents: 'false',
  redAliases: {},
  improvementTagIds: [],
  nonMetaTagIds: [],
  todos: [],
  slowDownMap: 'null',
  speedUpMap: 'null',
  arcPageIds: 'null',
  contentRequests: {}
}
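
The "guess and check" process described in the page text above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation from the cited post: the function name `evolution_strategy`, the toy objective `toy_reward`, the `TARGET` vector, and all hyperparameter values are hypothetical and chosen only to keep the example small. The reward function stands in for a full episode of agent-environment interaction, which ES treats as a black box.

```python
import random


def evolution_strategy(reward_fn, theta, iterations=300, population=50,
                       sigma=0.1, learning_rate=0.001, seed=0):
    """Guess and check: perturb the parameters, score each guess, move toward better ones."""
    rng = random.Random(seed)
    theta = list(theta)
    dim = len(theta)
    for _ in range(iterations):
        # Guess: draw Gaussian perturbations around the current parameters.
        noise = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                 for _ in range(population)]
        # Check: score every perturbed parameter vector with the black-box reward.
        rewards = [reward_fn([t + sigma * e for t, e in zip(theta, eps)])
                   for eps in noise]
        # Standardize rewards so the update does not depend on the reward scale.
        mean = sum(rewards) / population
        std = (sum((r - mean) ** 2 for r in rewards) / population) ** 0.5
        if std == 0.0:
            continue  # every guess scored the same, so there is no direction to follow
        weights = [(r - mean) / std for r in rewards]
        # Update: step along the reward-weighted sum of the perturbations.
        for j in range(dim):
            direction = sum(w * eps[j] for w, eps in zip(weights, noise))
            theta[j] += learning_rate / (population * sigma) * direction
    return theta


# Hypothetical toy objective: the reward is highest when the parameters equal TARGET.
TARGET = [0.5, 0.1, -0.3]


def toy_reward(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


solution = evolution_strategy(toy_reward, [0.0, 0.0, 0.0])
print(solution)
```

Note that the loop never inspects how the reward is computed; it only needs a number per guess. Each of the `population` evaluations is independent of the others, which is one reason this kind of search is comparatively easy to spread across many workers.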