I’m writing a series of posts clarifying my position on the Intelligence Explosion, and here I want to discuss some theoretical work on the types of goals self-improving systems might converge upon.
Stephen Omohundro has made a convincing case that we can expect a wide variety of systems, with different utility functions and different architectures, to manifest a very similar set of sub-goals, because such sub-goals are required to achieve almost any macro-goal.
These sub-goals are commonly referred to as the AI ‘drives’. My discussion below isn’t exhaustive; consult Omohundro (2008) and Bostrom (2014) for lengthier treatments.
Imagine two different systems, one designed to solve the Goldbach conjecture and another to manufacture solar panels. Both systems are at about the intelligence level of a reasonably bright human, and both are capable of making changes to their own code.
These systems find that they can better accomplish their goals if they improve themselves by acquiring more resources and optimizing their reasoning algorithms. Further, they become protective of themselves and their utility functions because, well, they can’t accomplish their current goals if those goals change or they allow themselves to be shut off.
Despite how very different the terminal goals of these two systems are, each of them nevertheless develops drives to self-improve, defend itself, and preserve its utility function, even though neither system had these drives explicitly programmed in at the beginning.
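To make the thought experiment a little more concrete, here is a toy sketch in Python. Everything in it is my own invention for illustration and not something taken from Omohundro's paper: the three actions, the starting numbers, the four-step horizon, and the two utility functions are all hypothetical. Each agent searches every possible plan by brute force and picks the one that scores highest under its own utility function. Note the assumption baked in: both utilities increase with the same underlying "progress" quantity, which is exactly why the same sub-goals end up paying off for both.

```python
from itertools import product

# Hypothetical action set for the toy model.
ACTIONS = ("work", "acquire", "improve")


def simulate(plan):
    """Run a plan and return the raw progress made toward the agent's goal."""
    resources, skill, progress = 1, 1, 0
    for action in plan:
        if action == "acquire":
            resources += 1                 # gather more resources
        elif action == "improve":
            skill += 1                     # optimize its own reasoning
        else:
            progress += resources * skill  # work directly on the goal
    return progress


def goldbach_utility(progress):
    # Assumed payoff: even numbers checked against the conjecture.
    return 1000 * progress


def solar_utility(progress):
    # Assumed payoff: solar panels manufactured.
    return 3 * progress


def best_plan(utility, horizon=4):
    """Brute-force search over every plan of the given length."""
    plans = product(ACTIONS, repeat=horizon)
    return max(plans, key=lambda plan: utility(simulate(plan)))


goldbach_plan = best_plan(goldbach_utility)
solar_plan = best_plan(solar_utility)
print(goldbach_plan)
print(solar_plan)
```

In this toy world neither agent is told to gather resources or to improve itself, yet both of their best plans spend the opening moves doing exactly that before touching the actual goal. It proves nothing about real systems, of course, but it shows how the drives can fall out of plain goal-seeking.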
Now, to my knowledge no one is claiming that each and every AI system will manifest all the drives in the course of self-improving. But Omohundro’s analysis might furnish a way to think about the general contours of recursive self-improvement in intelligent machines.
Thinking about the drives in advance is important because we might find that, to our surprise, the first Artificial General Intelligences we make have goals we didn’t give them. They might resist being unplugged or having their goals tampered with.
The drive to self-improve is particularly important, though, because it could be a catalyst for an Intelligence Explosion.