Convergent AI Drives.

I’m writing a series of posts clarifying my position on the Intelligence Explosion, and here I want to discuss some theoretical work on the types of goals self-improving systems might converge upon.

Stephen Omohundro has made a convincing case that we can expect a wide variety of systems, with different utility functions and different architectures, to manifest a very similar set of sub-goals, because such sub-goals are required to achieve almost any macro-goal.

These sub-goals are commonly referred to as the AI ‘drives’, and my discussion below isn’t exhaustive. Consult Omohundro (2008) and Bostrom (2014) for lengthier treatments.

Imagine two different systems, one designed to solve the Goldbach Conjecture and another to manufacture solar panels. Both systems are at about the intelligence level of a reasonably bright human, and both are capable of making changes to their own code.

These systems find that they can better accomplish their goals if they improve themselves by acquiring more resources and optimizing their reasoning algorithms. Further, they become protective of themselves and their utility functions because, well, they can’t accomplish their current goals if those goals change or they allow themselves to be shut off.

Despite how very different the terminal goals of these two systems are, each of them nevertheless develops drives to self-improve, defend itself, and preserve its utility function, even though neither system had these drives explicitly programmed in at the beginning.

Now, to my knowledge no one is claiming that each and every AI system will manifest all the drives in the course of self-improving. But Omohundro’s analysis might furnish a way to think about the general contours of recursive self-improvement in intelligent machines.

Thinking about the drives in advance is important because we might find that, to our surprise, the first Artificial General Intelligences we make have goals we didn’t give them. They might resist being unplugged or having their goals tampered with.

The drive to self-improve is particularly important, though, because it could be a catalyst for an Intelligence Explosion.

The Mathematics of Superintelligence.

I’m writing a series of posts summarizing my position on the Intelligence Explosion. In addition to the actual empirical, real-world achievements discussed in the last post, significant advances have also been made in developing the theoretical underpinnings of superintelligent agents. Two of particular interest are Marcus Hutter’s AIXI and Jürgen Schmidhuber’s Gödel Machine.

Now, I have to confess at this point that I’m very much out of my depth here mathematically. But as I understand it, AIXI is a set of surprisingly compact equations which describe how an optimal reasoner would gather evidence and update its beliefs.
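For readers who want to see it, my understanding is that AIXI’s core action-selection rule is usually written along the following lines. I’m transcribing from Hutter’s presentation, and any errors in the transcription are mine:

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \;\cdots\; \max_{a_m} \sum_{o_m r_m}
\left[\, r_k + \cdots + r_m \,\right]
\sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

Informally: at each step $k$ the agent picks the action that maximizes expected future reward, where the expectation runs over every program $q$ (run on a universal machine $U$) that is consistent with the observations and rewards seen so far, with each program weighted by its simplicity ($\ell(q)$ is the length of $q$).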

These equations turn out not to be computable, but they can be approximated. MC-AIXI, for example, is a scaled-down approximation of AIXI that managed to learn how to play a number of games on its own, from scratch.

The Gödel Machine is a theoretical piece of software with two components: one devoted to performing some arbitrary task, like calculating digits of pi, and another, called a proof searcher, which is capable of rewriting any part of the Gödel Machine’s code, including itself, as soon as it has found a proof that the rewrite would be an improvement.
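The control flow can be caricatured in a few lines of code. To be clear, this is only a toy sketch of the structure, not Schmidhuber’s construction: the real Gödel Machine demands a formal, machine-checked proof of global improvement before any rewrite, which I fake below with a simple equivalence check on random inputs. All function names here are my own inventions for illustration.

```python
# Caricature of the Gödel Machine's two components: a task solver plus a
# "proof searcher" that swaps in a rewrite once it can show the rewrite
# is an improvement. The formal proof is faked by spot-checking outputs,
# which the real construction explicitly does NOT rely on.

import random

def solver_v1(n):
    # Arbitrary task: sum of squares of 0..n-1, done naively.
    return sum(i * i for i in range(n))

def solver_v2(n):
    # Candidate rewrite: closed-form version of the same task.
    return (n - 1) * n * (2 * n - 1) // 6

def provably_better(old, new, trials=50):
    """Stand-in for proof search: check the rewrite agrees on many inputs.
    (A real Gödel Machine would require an actual proof instead.)"""
    return all(old(n) == new(n)
               for n in (random.randrange(1, 1000) for _ in range(trials)))

solver = solver_v1
if provably_better(solver, solver_v2):
    solver = solver_v2   # the self-modification step: swap in the improved component

print(solver(10))  # 285 either way; only the implementation changed
```

The interesting property, which this sketch can only gesture at, is that in the real construction the proof searcher may rewrite any part of the machine, including the proof searcher itself.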

The first superintelligence might bear little mathematical resemblance to AIXI or the Gödel Machine, but these theoretical successes, combined with all the progress that’s been made in AI in previous years, lend weight to the notion that smarter-than-human machines will be a part of the future.


Peripatesis: Superintelligent Motivation, Hannibal Captures Tarentum.

‘Peripatesis’ is a made-up word related to the word ‘peripatetic’, which is an adjective that means ‘roaming’ or ‘meandering’. I’ve always liked to think of knowledge as a huge structure through which a person could walk, sprint, dive, climb, or fly in as straightforward or peripatetic a fashion as they like.

Here are my recent wanderings and wonderings:

Bostrom, N., Superintelligence, p. 105-126

In chapters 7 and 8 Bostrom covers the relationship between intelligence and motivation and what the default outcome of the intelligence explosion would be, respectively.

The point of chapter 7 is to establish the Orthogonality Thesis, so called because it holds that nearly any goal can be paired with nearly any level of intelligence. Intelligent humans might not want to spend all day making paperclips, but assuming that a superintelligent AI wouldn’t want to is anthropomorphism.

Chapter 8 gives a panoply of reasons for expecting an intelligence explosion that is not carefully controlled to be catastrophic. The basic idea is that when one human tells another human “make me smile”, both humans come pre-equipped with a vast, shared cognitive machinery which ensures that neither party will interpret this command as ‘staple the corners of my mouth up so I’m always grinning’.

Assuming an AI would rule that option out is also anthropomorphism, and of a kind that’s very deadly if we’re dealing with a superintelligence.

Goldsworthy, A., The Fall of Carthage, p. 222-233

This week I made it partway through the section which covers the years 216 B.C.-203 B.C. The Romans and Carthaginians spent this period vying for control of major cities in southern Italy, such as Capua.

Hannibal finally managed to capture a port at Tarentum in 212 B.C., a goal he had been particularly interested in achieving for a while. This didn’t actually amount to much, as the Romans recaptured Tarentum in 209 B.C.

Peripatesis: Superintelligent Strategic Advantages, The Aftermath of Cannae, ‘Utility’ In Game Theory.

Here are my recent wanderings and wonderings:

Bostrom, N., Superintelligence, p. 78-104

I made it through two chapters this week, in which Bostrom addressed the questions of what sort of strategic advantage superintelligence-development projects could expect to have and how this would impact the future.

The question of strategic advantage is closely related to the question of take-off speed, because if we can expect a fast takeoff then it’s likely that the first project which creates an AI capable of recursive self-improvement will also give rise to the first superintelligence. If, on the other hand, the takeoff is slow then there might be many different AIs improving themselves on the path to superintelligence.

As Bostrom believes that a takeoff will probably be fast, he also believes that the first superintelligence will probably be the only superintelligence. Any such agent will be capable of utilizing various ‘superpowers’, or abilities far beyond those possessed by competing agents, to disproportionately affect the future.

Using conservative estimates for the computational ability required to simulate human minds and how much of the available matter in the universe can be converted to computational substrate, Bostrom makes the case that the future is a truly vast place inhabited by a near-uncountable number of minds.

Given this, a strong argument can be made that the development of the first superintelligence is the most important project a group of humans will ever undertake.

Goldsworthy, A., The Fall of Carthage, p. 214-221

Following his astonishing victory in the battle of Cannae, Hannibal faced the question of whether to march on Rome or to spend some time resting his troops and planning his next move.

Hannibal chose the latter, giving rise to one of the great ‘what-if’s’ of world history. It is far from clear that Rome would’ve been able to repel a full assault on the city itself, and equally unclear that Hannibal could’ve taken Rome.

In any case Rome ignored the delegation Hannibal sent to negotiate, choosing instead to begin the process of rebuilding their army.

For his part, Hannibal gained many new allies in Southern Italy, and with them a means of drawing supplies to feed his army, meaning he no longer needed to constantly be on the move.

From this point on it was to be a markedly different war between these two titanic enemies.

Luce, R., Raiffa, H., Games and Decisions, p. 12-38

As the concept of utility is essential in gaining an understanding of game theory, the entirety of Chapter 2 is devoted to it.

The following conceptual distinctions are made: an individual can be thought of as any entity or entities with a unitary goal, while a group is composed of members with competing goals. Decisions can be made under conditions of certainty, risk, or uncertainty.

A situation is certain when each action leads invariably to a known outcome, risky when each action leads to a set of possible outcomes which have known probabilities, and uncertain when it isn’t clear what the outcome of an action will be.
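The distinction between certainty and risk can be made concrete with a small example. This sketch is my own illustration rather than anything from the book, and the action names, payoffs, and probabilities are all invented: under risk, the standard move is to compute each action’s expected utility and pick the maximum.

```python
# Toy illustration of decision-making under risk: each action leads to a
# set of outcomes with known probabilities, and we choose the action
# with the highest expected utility.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs whose probabilities sum to 1."""
    return sum(p * u for p, u in outcomes)

# Hypothetical actions with made-up payoffs and probabilities.
actions = {
    "safe_bet":  [(1.0, 50)],             # certainty: one known outcome
    "risky_bet": [(0.5, 120), (0.5, 0)],  # risk: known probabilities
}

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # risky_bet: expected utility 60 beats the certain 50
```

Decision-making under uncertainty is harder precisely because no such probabilities are available to plug into this calculation, which is why the authors treat it separately.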

The authors then turn to analyzing decision making under certainty and under risk before laying out a number of axioms central to game theory.

Peripatesis: Intelligence Explosion Dynamics, The Battle Of Cannae.

Here are my recent wanderings and wonderings:

Bostrom, N., Superintelligence, p. 62-78

In Chapter 5 Bostrom examines several different shapes an ‘intelligence explosion’ or ‘takeoff’ could have. The two most important factors governing this shape are the optimization power of the system and its recalcitrance, or resistance to improvement; in conjunction, optimization power and recalcitrance can give rise to slow, moderate, or fast takeoffs.
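Bostrom summarizes the relationship as: rate of change in intelligence = optimization power / recalcitrance. The toy simulation below integrates that relation numerically to show how different recalcitrance profiles produce different takeoff shapes; the specific functional forms are invented for illustration and aren’t from the book.

```python
# Toy numerical sketch of Bostrom's takeoff relation:
#   rate of change in intelligence = optimization power / recalcitrance.
# The functional forms are made up purely to contrast slow and fast takeoffs.

def simulate(recalcitrance, steps=100, dt=0.1):
    """Integrate dI/dt = D(I) / R(I) with a simple Euler step."""
    intelligence = 1.0
    history = [intelligence]
    for _ in range(steps):
        power = intelligence  # assume the system supplies its own optimization power
        intelligence += (power / recalcitrance(intelligence)) * dt
        history.append(intelligence)
    return history

rising = simulate(lambda i: 1.0 + i)  # recalcitrance grows with capability: slow takeoff
flat   = simulate(lambda i: 1.0)      # constant recalcitrance: exponential, fast takeoff

print(flat[-1] > rising[-1])  # True: low recalcitrance yields far faster growth
```

The qualitative lesson survives the crudeness of the model: whether a takeoff is slow or fast depends less on raw optimization power than on how recalcitrance behaves as the system improves.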

Different sorts of potentially superintelligent systems have different recalcitrance profiles. For example, the recalcitrance of developing nootropics would initially be low, as little in the way of nootropic research has been done, but would likely start to rise rapidly after most of the low-hanging fruit had been picked. An emulated mind, on the other hand, might have low-to-moderate recalcitrance for quite a while if it were able to absorb increasing amounts of hardware to run copies of itself on.

AI could play a role in the development of several different kinds of superintelligent systems, for example a network of knowledge workers whose output is vastly improved in terms of quality and quantity by the aid of an AI research assistant. As such, it makes sense to analyze the dynamics of an AI takeoff in some depth.

The recalcitrance of AI systems is hard to judge, because the key to making an AI smart enough to begin self-improving may turn out to be many little insights, each of which requires more and more effort to uncover, or it may be a single insight that eludes everyone for a long time. In the latter case, there may not be much improvement in the system at all until the last piece of the puzzle drops into place, after which change may start happening very quickly.

Bostrom believes that AI recalcitrance will not prove to be very high. A great deal of human effort already goes into developing new and more powerful computers, so once a human-level AI system is developed it will probably be able to make use of the extra computing power lying around to run itself more quickly. It could also avail itself of the truly vast quantities of information available via the internet to fashion a knowledge base far in excess of anything possessed by a human being.

With respect to the other part of the equation, optimization power, it seems most likely that it will increase during the takeoff because people will begin investing huge amounts of effort in any AI system that shows promise. At a certain point the system will become capable enough that most optimization pressure is coming from the system itself. This might result in an improvement cascade, wherein each improvement makes further improvements easier, and the takeoff could be very fast indeed.


Goldsworthy, A., The Fall of Carthage, p. 198-214

In this section of Goldsworthy’s history of the Punic Wars I learned about one of the most epic defeats ever dealt to the Roman Republic: the battle of Cannae.

After a string of humiliating defeats the Roman senate decided to put the two incoming consuls, Gaius Terentius Varro and Lucius Aemilius Paullus, in charge of the largest army Rome had ever fielded. Each consul was to have four larger-than-normal legions, instead of the usual two, and the consuls were to face Hannibal together.

Hannibal waited until crops had ripened before heading south, eventually setting camp in the former stronghold Cannae. The Romans followed, taking great care to avoid spots where they could be ambushed.

The Roman plan was to punch through the center of Hannibal’s army, scattering his infantry. The Roman cavalry, notably inferior to their Punic counterparts, were only supposed to prevent flanking maneuvers long enough to allow victory in the center.

Hannibal anticipated this and strengthened his center, bulging it outward towards his enemy.

After hours of intense fighting the Romans had managed to push the Carthaginian center backwards, eventually routing them. In the process their originally neat formations began to lose shape until the Roman soldiers were in one great mass hacking away at retreating Gauls. At this point the Libyan infantry on either side turned inward to face the Romans, and the real butchery started.

Disorganized and surrounded, the Romans were unable to make much use of their superior numbers. Though they inflicted ghastly casualties on the Carthaginians, Hannibal emerged victorious over the greatest fighting force Rome could muster.

For centuries thereafter the defeat at Cannae would be a Roman yardstick for measuring other losses.

Peripatesis: Forms of Superintelligence, Game Theory.

Here are my recent wanderings and wonderings:

Bostrom, N., Superintelligence, p. 52-61

In chapter 3 Bostrom outlines three distinct forms a superintelligence could take:

A speed superintelligence is one which functions in a similar fashion to a human mind but does so much more quickly, like a whole-brain emulation run on fast hardware. A collective superintelligence is composed of a network of lesser intelligences, like an extremely well-run conglomeration of knowledge workers. And a quality superintelligence is one which functions at the same speed as a human mind but which, for architectural or other reasons, does so much better.

He also lists many advantages that a digital intelligence would have over a biological one: for example, electrical circuitry has much lower latencies than neural circuitry, and thus communication among the elements from which a computer-based mind is built would be much faster.

Luce, R., Raiffa, H., Games and Decisions, p. 1-12

This introductory volume to game theory begins by pointing out that ‘conflicts of interest’, situations in which several agents try to influence an outcome while not having full control of all the variables, are both fascinating to most people and at the heart of the discipline.

Game theory gets off the ground by making certain assumptions about the players involved, such as that they have consistent preferences and know the preferences of the other players. In real life of course this is almost never the case, but simplifying assumptions of this sort are often required in developing mathematical tools.

If such assumptions simplify too much, though, the insights gleaned won’t be of much use. Is this the case for game theory?

The authors promise to address this in later chapters, but point out that, at a minimum, game theory could be used to design experiments which could then be used to modify the assumptions of game theory.


Peripatesis: Paths To Superintelligence.

Here are my recent wanderings and wonderings:

This week has been especially busy, and so little in the way of reading got done.

Bostrom, N., Superintelligence, p. 22-51

Beginning in the second chapter Bostrom discusses a number of paths to superintelligence, including evolving one, programming a seed AI which bootstraps itself to superintelligence, emulating a human brain in a computer, improving human cognition, improving brain-computer interfaces, and designing smarter organizations.

The most direct path to superintelligence is more likely to involve programming a seed AI than it is augmenting human brains or organizations, so it makes more sense to focus on that route.

Peripatesis: Narrow And General AI, Maximus ‘The Delayer’ Avoids Battle With Hannibal.

Here are my recent wanderings and wonderings:

Bostrom, N., Superintelligence, p. 1-22

The book’s far-ranging introduction spends most of its time taking a high-altitude look at the history and state-of-the-art of AI. After its founding the field was beset by boom periods of high investment and optimism followed by ‘winters’ during which funding disappeared and AI research fell out of favor. Behind the scenes, however, the actual nitty-gritty of AI development continued, resulting in more sophisticated expert systems, better neural nets, and numerous problems of the ‘computers will never do X’ variety being solved.

While surveying some of the astonishing successes of modern AI Bostrom introduces the distinction between a ‘narrow AI’, one with extremely high performance in a single domain like chess playing, and ‘general AI’, software able to reason across a wide variety of domains like humans can. No matter how impressive Watson or Deep Blue might be, they are only able to outperform humans in very limited ways; the real interest lies in machines that are as good or better than humans in lots of different ways.

Chapter 1 ends with a discussion of three different surveys taken of the opinions of AI experts. One survey was on when the experts thought human-level AI would be developed, one was on how long it would take human-level machines to become superintelligent, and another was on the overall impact of superintelligent AIs. It is notoriously difficult to predict when and what progress will be made in AI and so expert opinions were, predictably, all over the place. But the results do hint that the problem of AI safety is worth thinking seriously about.

Goldsworthy, A., The Fall of Carthage, p. 190-196

In the face of several crushing defeats the Roman government appointed a dictator, Quintus Fabius Maximus, who spent his six-month term carefully avoiding engagements with Hannibal, a passivity for which he received the nickname ‘the delayer’. This strategy, while causing much consternation among war-hungry Roman aristocrats, later came to be seen as Rome’s salvation, giving her time to recover from the defeats at Trebia and Trasimene.

Hannibal spent this period criss-crossing the Apennines, pillaging and looting freely. During one particularly crafty maneuver, he managed to move through a pass blocked by Fabius’ army by first tying wooden branches to the horns of oxen and then lighting the branches on fire, sending them into the pass first. The Roman troops occupying the pass believed the Carthaginian army was on the move and so descended to engage. In the resulting confusion Hannibal managed to slip the main column of his army through to the other side, carrying with them the spoils of war gathered over the previous weeks.

Fast Losing Ground.

I’m writing a series of posts summarizing my position on the Intelligence Explosion, and here I want to give a couple of examples of recent AI developments which should make even hardened skeptics consider the possibility that our creations might soon catch up with us.

But first, I want to point out that while the history of early AI research is marred by over-confident prognostications that wound up not panning out, causing several “AI winters”, it is also true that AI skeptics have a history of claiming that ‘machines will never do X’, only to have machines do X not long thereafter.

This is humorously captured in the following cartoon, attributed to Ray Kurzweil:



Most of us are rapidly becoming acquainted with living in a world suffused with increasingly smart software. But many would be surprised to learn that there are computer programs in existence now which can write compelling classical music. Emily Howell is the product of several decades’ work by David Cope, who conceived of the idea of creating software to help with his music after experiencing a particularly bad case of composer’s block. The results speak for themselves:


Granted, this is not exactly breathtaking; it might be what we’d expect from an advanced piano student still leaning primarily on technique because she hasn’t yet found her creative voice. But it’s a long way from the soundtracks of the 8-bit video games I grew up playing, and it was written by a computer program.

But what about natural language? Computer-generated music is impressive, but can computers rise to the challenge of processing and responding to speech in real time? IBM’s Watson, a truly monumental achievement, managed not only to do this, but to utterly stomp two of the best Jeopardy! players of all time. Last I checked, the technology was being turned to helping doctors make better diagnoses.

In my mind the most impressive example is the lesser-known Adam (King et al., 2004), an almost fully autonomous science laboratory which, when fed data on yeast genetics, managed to form a hypothesis, design and carry out an experiment to test it, and in the process discover something unknown to any scientist before. Though this may seem light-years away from an AI doing, say, astrophysics research, the difference is one of degree, not kind.

Admittedly, we’re still not talking about general intelligences like human beings here. But the weight of the evidence points to a future where increasingly large chunks of civilization are being managed by intelligent machines. This may come to include the production of art, science, and even the design of new intelligent systems.


Your Intelligence Isn’t Magical.

I’m writing a series of posts summarizing my views on the Intelligence Explosion, and the first claim I want to defend is that we should take seriously the possibility of human-level artificial intelligence because fundamentally human intelligence is not magic.

Human intelligence is the product of the brain, an object of staggering complexity which, nevertheless, is built up from thoroughly non-magical components. When neurons are networked together into more and more sophisticated circuitry, there is no point at which magic enters the process and gives rise to intelligence.

Furthermore, human intelligence is the product of the blind, brute-force search algorithm that is evolution. Organisms are born with random mutations into environments which act as fitness functions. Beneficial mutations preserve themselves by leading to greater reproductive success, while deleterious ones eliminate themselves by lowering it. Evolution slowly explores the space of possibilities by acting on and changing existing DNA patterns.
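This kind of search is easy to demonstrate in miniature. The sketch below is a deliberately crude genetic algorithm of my own devising, with mutation and selection only and an arbitrary fitness function; it is an illustration of blind search without foresight, not a model of biology.

```python
import random

# Minimal illustration of evolution as blind search: random mutation plus
# selection by a fitness function, with no design or lookahead anywhere.

GENOME_LEN = 20

def fitness(genome):
    # The "environment": more 1-bits means higher reproductive success.
    return sum(genome)

def mutate(genome, rate=0.05):
    # Copying errors: each bit flips independently with small probability.
    return [b ^ 1 if random.random() < rate else b for b in genome]

random.seed(0)
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(30)]
initial_best = max(fitness(g) for g in population)

for generation in range(200):
    # The fittest half reproduces (with mutation); the rest are eliminated.
    population.sort(key=fitness, reverse=True)
    survivors = population[:15]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(15)]

best = max(population, key=fitness)
print(initial_best, fitness(best))  # blind search climbs toward all-ones
```

Nothing in this loop knows what a “good” genome looks like; fitness only rises because worse variants are culled, which is the point of the analogy.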

Even without engineering oversight, evolution managed to produce Homo sapiens, primates with the ability to reason across a wide variety of domains and to use their intelligence in ways radically different from the uses for which it evolved.

This is not to imply that our intelligence is well understood; my impression is that great strides have been made in modeling brain activity, but we are surely still a long way from having probed these mysteries fully.

Nor does it imply that building a human-level intelligence will be easy. For decades now AI researchers and computer scientists have been trying, making progress in various narrowly defined tasks like chess, but they are still nowhere near creating a general reasoner on par with humans.

Additionally, it doesn’t imply that a human-level AI must actually resemble human intelligence in any way. AI research is a vast field, and within it there are approaches which draw on neuroscience and mathematical psychology, and de novo approaches which aim to build an AI ‘from the ground up’, as it were.

But don’t lose sight of this key fact: the intelligence which produced these words is a non-magical product of a brain made of non-magical components which was produced by a non-magical process. It is hard for me to see where or why a skeptic could draw a special line in the sand at the level of a human and say ‘machines won’t ever get this far’.