[summary:
- Strictly infrahuman: The AI can't do better than a human in any regard (in that domain).
- Infrahuman: The AI almost always loses to the human (in that domain).
- Par-human: The AI sometimes wins and sometimes loses; it's weaker in some places and stronger in others (in that domain).
- High-human: The AI performs around as well as exceptionally competent humans.
- Superhuman: The AI almost always wins.
- Efficient: Human advice contributes no marginal improvement to the AI's competence.
- Strongly superhuman: The AI is much better than human; the domain is rich enough for humans to be surprised at the AI's tactics.
- Optimal: Perfect performance for the domain.
These thresholds aren't always ordered as above. For example, logical Tic-Tac-Toe is simple enough that humans and AIs can both play optimally; so, in the Tic-Tac-Toe domain, optimal play isn't superhuman.]
Some thresholds for 'sufficiently advanced' machine intelligence are not absolute ability levels within a domain, but abilities relative to the AI's human programmers or operators. In such cases it's useful to think in terms of relative ability levels, and one generic set of distinguished thresholds in relative ability is:
- Strictly infrahuman: The AI cannot do anything its human operators / programmers cannot do. Computer chess in 1966 relative to a human master.
- Infrahuman: The AI is definitely weaker than its operators but can deploy some surprising moves. Computer chess in 1986 relative to a human master.
- Par-human (or, more confusingly, "human-level"): If competing in that domain, the AI would sometimes win and sometimes lose; it's better than human at some things and worse at others, and wins or loses by narrow margins. Computer chess in 1991 on a home computer, relative to a strong amateur human player.
- High-human: The AI performs as well as exceptionally competent humans. Computer chess just before 1996.
- Superhuman: The AI almost always wins. Computer chess in 2006.
- Efficient: Human advice contributes no marginal improvement to the AI's competence. Computer chess was somewhere around this level in 2016, with "advanced" / "freestyle" / "hybrid" / "centaur" chess starting to lose out against purely machine players. %note: Citation solicited. Googling gives the impression that nothing has been heard from 'advanced chess' in the last few years.%
- Strongly superhuman:
- The ceiling of possible performance in the domain is far above the human level, so the AI can perform orders of magnitude better. E.g., consider a human and a computer competing at how fast they can do arithmetic. The domain is simple in principle, but competing on speed leaves room for the computer to do literally billions of times better (see the sketch after this list).
- The domain is rich enough that humans don't understand key generalizations, leaving them shocked at how the AI wins. Computer Go relative to human masters in 2017 was just starting to exhibit the first signs of this ("We thought we were one or two stones below God, but after playing AlphaGo, we think it is more like three or four"). Similarly, consider a human grandmaster playing Go against a human novice.
- Optimal: The AI's performance is perfect for the domain; God could do no better. Computer play in checkers as of 2007.
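To make the speed-overhead point concrete, here is a minimal sketch (not part of the original article; the human rate is an assumed illustrative figure, not a measurement) comparing a computer's arithmetic throughput against a generous estimate of a human's:

```python
import time

# Assumed, illustrative figure: a fast human might manage roughly one
# multi-digit addition per second over a sustained period.
HUMAN_ADDITIONS_PER_SECOND = 1

n = 10_000_000
start = time.perf_counter()
total = 0
for i in range(n):
    total += i  # one addition per iteration (plus interpreter overhead)
elapsed = time.perf_counter() - start

machine_rate = n / elapsed
print(f"Machine: ~{machine_rate:,.0f} additions/second (interpreted Python)")
print(f"Advantage: ~{machine_rate / HUMAN_ADDITIONS_PER_SECOND:,.0f}x")
```

Even interpreted Python, paying heavy per-operation overhead, comes out millions of times faster; optimized native code performs billions of additions per second, an advantage on the order of 10^9 or more.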
The ordering of these thresholds isn't always as above. For example, in the extremely simple domain of logical Tic-Tac-Toe, humans can play optimally after a small amount of training. Optimal play in Tic-Tac-Toe is therefore not superhuman. Similarly, if an AI is playing in a rich domain but still has strange weak spots, the AI might be strongly superhuman (its play is much better and shocks human masters) but not efficient (the AI still sometimes plays wrong moves that human masters can see are wrong).
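The Tic-Tac-Toe claim is easy to verify mechanically. Below is a minimal sketch (not from the original article) that exhaustively minimaxes the full game tree and confirms that optimal play by both sides ends in a draw; since a trained human also plays optimally, the optimal player can never beat them, and so is not superhuman:

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board, indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Value of the position for X under optimal play: +1, 0, or -1."""
    w = winner(board)
    if w is not None:
        return 1 if w == 'X' else -1
    if ' ' not in board:
        return 0  # board full: draw
    nxt = 'O' if player == 'X' else 'X'
    results = [value(board[:i] + player + board[i + 1:], nxt)
               for i, cell in enumerate(board) if cell == ' ']
    return max(results) if player == 'X' else min(results)

print(value(' ' * 9, 'X'))  # prints 0: the game is a draw under optimal play
```

The full game tree has only a few thousand distinct positions, so exhaustive search finishes instantly; it is the richness of a domain, not raw search difficulty, that creates room for superhuman play.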
The term "human-equivalent" is deprecated because it confusingly implies a roughly human-style balance of capabilities, e.g., an AI that is roughly as good at conversation as a human and also roughly as good at arithmetic as a human. This seems pragmatically unlikely.
The other Wiki lists the categories "optimal, super-human, high-human, par-human, sub-human".
Relevant thresholds for AI alignment problems
Considering these categories as thresholds of advancement relevant to the point at which AI alignment problems first materialize:
- "Strictly infrahuman" means we don't expect to be surprised by any tactic the AI uses to achieve its goals (within a domain).
- "Infrahuman" means we might be surprised by a tactic, but not surprised by overall performance levels.
- "Par-human" means we need to start worrying that humans will lose in any event determined by a competition (although this seems to imply the non-adversarial principle has already been violated); we can't rely on humans winning some event determined by a contest of relevant ability. Or this may suppose that the AI gains access to resources or capabilities that we have strong reason to believe are protected by a lock of roughly human ability levels, even if that lock is approached in a different way than usual.
- "High-human" means the AI will probably see strategies that a human sees in a domain; it might be possible for an AI of par-human competence to miss them, but this is much less likely for a high-human AI. It thus behaves like a slightly weaker version of postulating efficiency for purposes of expecting the AI to see some particular strategy or point.
- "Superhuman" implies at least weak cognitive uncontainability by Vinge's Law. Also, if something is known to be difficult or impossible for humans, but seems possibly doable in principle, we may need to consider it becoming possible given some superhuman capability level.
- "Efficiency" is a fully sufficient condition for the AI seeing any opportunity that a human sees; e.g., it is a fully sufficient condition for many instrumentally convergent strategies. Similarly, it can be postulated as a fully sufficient condition to refute a claim that an AI will take a path such that some other path would get more of its utility function.
- "Strongly superhuman" means we need to expect that an AI's strategies may deploy faster than human reaction times, or overcome great starting disadvantages. Even if the AI starts off in a much worse position it may still win.
- "Optimality" doesn't obviously correspond to any particular threshold of results, but is still an important concept in the hierarchy, because only by knowing the absolute limits on optimal performance can we rule out strongly superhuman performance as being possible. See also the claim Almost all real-world domains are rich.
'Human-level AI' confused with 'general intelligence'
The term "human-level AI" is sometimes used in the literature to denote Artificial General Intelligence. This should probably be avoided, because:
- Narrow AIs have achieved par-human or superhuman ability in many specific domains without general intelligence.
- If we consider general intelligence as a capability, a kind of superdomain, it seems possible to imagine infrahuman (or superhuman) levels of general intelligence. The apparently large jump from chimpanzees to humans means that we mainly observe human levels of general intelligence, with no biological organisms exhibiting the same ability at a distinctly lower level; but, at least so far as we currently know, AI could take a different developmental path. So alignment thresholds that plausibly follow from general intelligence, like big-picture awareness, aren't necessarily locked to par-human performance overall.
Arguably, the term 'human-level' should just be avoided entirely, because it has been pragmatically observed to function as a gotcha button that derails the conversation some fraction of the time, with the interrupt being "Gotcha! AIs won't have a humanlike balance of abilities!"
Comments
Ryan Carey
Subhuman should contrast with superhuman, or infrahuman with suprahuman.