Thanks for the reply. I agree that strong Inevitability is unreasonable, and I understand the function of #1 and #2 in disrupting a prior frame of mind that assumes strong Inevitability, but strong Inevitability isn't the only alternative to Orthogonality. I'm surprised the arguments are considered successively stronger cases for Orthogonality, since #6 basically says "under reasonable hypotheses, Orthogonality may well be false." (I admit that's a skewed reading, but I don't know what the referenced ongoing work looks like, so I'm skipping that bit for now. [Edit: is this "tiling agents"? I'm not familiar with that work, but I can go learn about it.])
The other arguments are interesting commentary, but they don't establish that Orthogonality is true for the agents we ought to care about.
- Gandhian stability argues that self-modifying agents will try to preserve their preference systems, but not that they can become arbitrarily powerful while doing so. As it happens, circular preference systems illustrate how Gandhian stability could limit how powerful a cognitive agent can become: an agent that faithfully preserves circular preferences can be money-pumped out of its resources indefinitely (see the sketch after this list).
- The unbounded agents argument establishes Orthogonality only over a "mind space" broader than the class of agents we actually care about.
- The search tractability argument looks like a statement about the relative difficulty of accomplishing different goals, not the relative difficulty of holding those goals. I don't mean to dismiss the argument, but I don't understand it: I'm not even clear on what exactly it claims about the tractability of searching for strategies toward different goals. That it's the same for all possible goals?
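To make the circular-preferences point concrete, here's a minimal money-pump sketch (the three-item cycle, the fee, and the starting funds are all invented for illustration, not taken from the original arguments): each trade is a strict improvement by the agent's own lights, yet the cycle drains its resources.

```python
# Hypothetical toy agent with circular preferences A > B > C > A.
# Under Gandhian stability it preserves these preferences, so it accepts
# every trade in the cycle: each swap is locally preference-improving.

# prefers_over[x] is the item this agent strictly prefers to x.
prefers_over = {"B": "A", "C": "B", "A": "C"}

holding, funds = "A", 9.0  # arbitrary starting endowment
fee = 1.0                  # cost the trader charges per swap
trades = 0

while funds >= fee:
    # Rational under the agent's own (circular) preferences:
    # pay a small fee to move to a strictly preferred item.
    holding = prefers_over[holding]
    funds -= fee
    trades += 1

print(f"{trades} trades later: holding {holding!r}, funds {funds}")
# -> 9 trades later: holding 'A', funds 0.0
# Every trade improved the agent's position by its own lights, yet it
# ends holding exactly what it started with, minus all its resources.
```

The point of the sketch isn't that real agents have three-item cycles; it's that preference preservation alone doesn't guarantee growing power, since that depends on the preserved preferences already being coherent.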