Maybe a useful thing to add: when we say things like "if X goes wrong, I expect your AI to become a paperclip maximizer", we don't necessarily mean that the AI will end up with a terminal goal as human-comprehensible (and human-stupid) as "maximize paperclips". We mean that it will seek to maximize some goal that points in a direction subtly off from human preference, and that, thanks to instrumental goals and edge instantiation, the result is a world just as worthless by human values as if we had let loose a literal paperclip maximizer.