Every child knows the fantasy. You rub a magic lamp; a genie appears and offers to grant three wishes. What do you say?
The catch is that the genie is both omnipotent and devious. If you wish for wealth, he’ll give you the Midas touch, turning all you love to gold. If you wish for love, he’ll make you fall in love with a toad. If you wish for wisdom, he’ll turn you into an old man. Is there any way to outsmart him?
The fable makes for a delightful introduction to the vagaries of language. Casual readers may be surprised to learn that smart people believe it is about to come true. The magic lamp has been rubbed, they say, and the genie is about to emerge. The genie they are worried about is artificial intelligence.
Nick Bostrom’s book, Superintelligence, is a meditation on the danger of having wishes granted. With doggedness, earnestness, and a little math, Bostrom asks what will happen if we do in fact manage to create artificial entities with near infinite power. Will they serve our needs, or override our needs? How can we teach them to work in our interest?
The most pressing task for anyone writing about advanced AI is to convince people to take the subject seriously. Bostrom devotes the early chapters of his book to describing current research into artificial intelligence, explaining how this research might run off the rails, and estimating how much time we have left before a disaster. His argument is aimed at a specific audience–AI buffs who believe superintelligent machines are imminent, but have given insufficient thought to their dangers. The case Bostrom makes, however, isn’t hard for an amateur to understand.
Right now, researchers are hard at work modeling the human mind. Their methods operate at various levels of detail. Some focus their efforts on high levels of abstraction, seeking to capture the way thought itself works: logic, language, inference, deduction. (This is the clunky, “classic” approach to AI that has yielded so many disappointing results.) Some do their research at a lower level, seeking to model the way neurons work. And some are planning to do research at the most detailed level of all, and scan real human brains, thus capturing the way particular minds work.
All of these methods can produce programs that, in some sense, “learn,” changing and developing with experience. And if programs can learn, there’s no theoretical limit to their ultimate power–especially if they learn to write superior programs.
Research into intelligence is thus subject to a feedback loop. The more we learn about intelligence, the more intelligent we’re able to make our software. The more intelligent our software becomes, the greater its potential to facilitate further research. It’s like pedaling a bike: more speed leads to better balance, and better balance enables faster pedaling and more speed. The first motions are wobbly and clumsy. But soon you’re zipping along.
That’s when you crash.
The danger of such a feedback loop is that it’s all but guaranteed to spin out of control. Once we have software that proves sufficiently adept at writing smarter software, we’ll see what Bostrom calls an intelligence explosion–a staggeringly rapid increase in software’s sophistication. The end result will be a program that is largely inscrutable to human study, even while it has great influence over human affairs. Because the change will be both sudden and profound, we had better start planning as soon as possible. What should we do?
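The compounding dynamic behind an “intelligence explosion” can be sketched as a toy model. Everything here is invented for illustration (the seed ability, the coupling constant, the update rule); the only point is the shape of the curve: when software’s rate of improvement depends on its current ability, growth compounds on itself and eventually outruns any fixed exponential.

```python
# Toy model of the feedback loop described above. All numbers are
# invented for illustration; only the qualitative shape matters.

def run(generations, seed_ability=1.0, coupling=0.1):
    """Each generation improves the next by a factor that grows
    with its own ability -- smarter software improves faster."""
    ability = seed_ability
    history = [ability]
    for _ in range(generations):
        ability *= 1.0 + coupling * ability
        history.append(ability)
    return history

trajectory = run(10)
```

Under these made-up parameters the curve starts out looking ordinary, then bends upward: after ten generations the toy “ability” has already pulled well ahead of a plain 10%-per-generation exponential.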
Bostrom is known for his attention to a particular aspect of this problem. He discusses it in various contexts by various names–the paper-clip problem, the orthogonality thesis, the value-loading problem–but the basic principle is always the same. Here’s one way to think about it.
A basic function of all software is to encode assumptions about what people want. Punch addresses into a mapping program, and the route shown will conform to rules of the road, minimize travel time, flag or avoid traffic jams, and probably be direct.
That’s what most people want. But all of these assumptions have exceptions. A criminal on the lam will want to ignore rules of the road. A person researching traffic jams will want to locate them, not avoid them. A sightseer won’t want to minimize travel time. A driver looking to explore a new city, or test a new car, or simply enjoy the experience of driving, won’t care about following direct routes.
Some exceptions are rare, like people who study traffic jams. Some are necessary: we ought to discount the whims of criminals. Some are implicit in the very formulation of rules, like the human desire to “just do your own thing.” For now, though, most exceptions are no more than minor hassles.
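The way a routing program encodes assumptions about what people want can be made concrete. Below is a minimal, hypothetical sketch; every name, weight, and number is invented for illustration. The hard-coded penalties *are* the assumptions: minimize time, avoid jams, prefer direct routes. Each of the exceptions above corresponds to a user for whom one of these weights has the wrong sign or size.

```python
# Hypothetical sketch: a route-scoring function whose hard-coded
# weights encode assumptions about what a "typical" driver wants.
# All names and numbers are invented for illustration.

def route_cost(route, prefer_direct=True, avoid_jams=True):
    """Lower cost = 'better' route, by the program's assumptions."""
    cost = route["travel_minutes"]           # assumption: minimize time
    if avoid_jams:
        cost += 30 * route["jam_count"]      # assumption: jams are bad
    if prefer_direct:
        cost += 5 * route["turn_count"]      # assumption: direct is good
    return cost

routes = [
    {"name": "highway", "travel_minutes": 20, "jam_count": 1, "turn_count": 4},
    {"name": "scenic",  "travel_minutes": 35, "jam_count": 0, "turn_count": 12},
]

best = min(routes, key=route_cost)
```

The sightseer, the traffic researcher, and the driver out for pleasure all want a different cost function; the program silently hands everyone the same one.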
The point Bostrom makes is that as software becomes more powerful, apparently trivial exceptions blow up into major crises. To stretch a metaphor, as a bicycle gains speed, it’s easier for a rider to balance, but also easier to have a fatal accident. This is not a vague Luddite fear. It’s the single most dire problem in computer science.
Take a minor example. The limitations of mapping software are trivial today. The programs, for all their marvels, are relatively weak, making it easy to work around them. If all else fails, we can shut them off. But what happens when the software gets stronger?
What happens when mapping software is loaded into self-driving cars that always choose the “optimal” route? Will that route be the best route? Who says? Is it the route you would have chosen? What if you find alternate routes more beautiful, though longer?
What happens when mapping software predicts a passenger’s preferences, along with those of other passengers in other cars? What happens when it predicts the whims and wants of every driver in a city, plus likely road conditions, hours or days into the future, then makes a best guess as to optimal traffic patterns?
What happens when a program chooses a route based on thousands, even millions of factors: weather, passengers’ moods, historical records, chaotic models of traffic flow? A program of this type would incorporate countless risk assessments. It might assign you a 0.001 chance, say, of dying in a crash by 5 PM. Is that acceptable? On what basis?
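The risk arithmetic hiding in such a program can be made explicit with a hypothetical sketch. The factors, probabilities, and conversion constant below are all invented; the point is that to fold a crash probability into a route score at all, someone must put a number on what a life is worth in minutes of delay, and the ranking of routes turns on that choice.

```python
# Hypothetical sketch: folding a crash-risk estimate into a route score.
# All values are invented for illustration; choosing them is the problem.

RISK_PENALTY = 1_000_000  # minutes of delay one fatality is "worth" -- who decides?

def expected_cost(route):
    # expected travel time, plus risk converted into the same units
    return route["minutes"] + RISK_PENALTY * route["p_fatal_crash"]

routes = [
    {"name": "fast", "minutes": 18, "p_fatal_crash": 1e-5},
    {"name": "calm", "minutes": 25, "p_fatal_crash": 1e-6},
]

chosen = min(routes, key=expected_cost)
```

With this made-up penalty the slower route wins; shrink the constant by a factor of ten and the fast route wins instead. The software must pick one answer, and the passenger never sees the constant.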
And what happens if mapping programs become even more powerful than this? Imagine a program given control of all city operations. A program that will tell you to take your children to the museum on Saturday at 10 AM because “modeling indicates that this scheduling optimizes for all considered factors”?
Why stop there? If software becomes unimaginably powerful–a thousand, a billion times more powerful than it is today–even a trivial oversight can cause an apocalypse.
This is the problem that has Bostrom worried.
In paraphrasing Bostrom’s arguments, I’ve taken pains to eschew terms like intelligence, mind, sentience, consciousness, self-awareness, and even AI. That’s no accident. These words have an appalling power to drag sober discussions into philosophical culverts.
Can machines feel? Does a computer have a mind? Is it possible to conceive of a perfect artificial being? Do androids dream of electric sheep?
Those who ponder such questions often talk of a coming “singularity,” when technological advance will surpass human comprehension.
There’s a second singularity at work here, however: a rhetorical one. Every sober-minded approach to this subject swiftly disappears into a black hole of fantastic speculation.
So it is with Bostrom’s book. He explores many ways of thinking about the quandary he’s raised–the tendency of technological advance to outpace human wisdom. Unfortunately, his argument is biased toward its extreme manifestations. Problems with mapping software aren’t for him. He wants to know what will happen if our experiments with machine cognition end up giving rise to a truly godlike entity.
Imagine a computer program so crafty, so clever, so capable, that it achieved virtually unlimited power over human affairs. How could we hope to control such a being? Could we outsmart it? What values would we want it to have, and how could we make it stick to those values?
In focusing on this apocalyptic scenario–artificial minds of inconceivable power–Bostrom sacrifices serious argument to fanciful science fiction. Here are some of the problems posed in his book:
–Consider a computer program that grants humanity unlimited proprietorship over the entire universe. How should we divvy up this trove of galaxies? Should we share them out equitably, or allow some people to hoard them? Will one galaxy per person be enough?
–Consider a future in which human beings can be created at whim, erased on demand, and made to have any imaginable collection of skills, needs, and character traits. It’s now possible for a corporation to order up a batch of, say, seven thousand tax lawyers whose only wish in life is to devote all their efforts to untangling problems of tax law, then delete them when the task is done. In such a hypothetical world, will we still retain any conception of workers’ rights?
–Imagine a godlike machine, built by humans, that makes contact with other godlike machines built by aliens. What would these godlike machines have to say to each other?
–Suppose we could build a device that would convert all matter in the universe into an instantiation of a single mind quivering forever in a state of infinite subjective ecstasy. Would this be a good idea?
Something about the topic of AI has a way of tempting otherwise empirical thinkers into flights of unrestrained fantasy. Of course, such thought experiments have their place. The trouble with Bostrom’s book is that it takes a rigorous, even plodding approach to scenarios so outlandish that they defy analysis. The result is occasionally informative, sometimes absurd, and rarely entertaining.
Who would have thought a book about superpowered robots could be so dull? Bostrom flirts with ideas from big-picture thinkers like Darwin, Malthus, Aristotle, and John Stuart Mill, but his prose mimics the meticulous style of an overscrupulous lab report. Familiar ethical concepts transmute, under the author’s pen, to cumbersome coinages, then to distracting acronyms. We’re told, late in the book, that “MR would do away with various free parameters of CEV, such as the degree of coherence among extrapolated volitions that is required for the AI to act on the result”–a sentence that stands in for an insight familiar to most teenagers: in a complex and fractious world, it’s awfully tempting, but dangerous, to lay down absolute moral laws.
Bostrom walks the reader through carefully constructed equations, only to admit that the most relevant variables are fuzzily defined. Lavish speculations–about global apocalypse, intergalactic war, uploading of human minds to virtual reality, dystopian societies–become as dry as third-quarter budget meetings. Old philosophical chestnuts pop up repeatedly, unencumbered by scientific detail, yet stripped of the whimsy that originally made them appealing. (Suppose a machine were created that simulated your brain state exactly, including your memories, for one second: would this differ from your current subjective experience?)
The dangers of AI are real. The dangers of advanced AI are severe. But Bostrom’s superintelligence is a truly fantastic entity, possessing near infinite power and near perfect foresight, incomprehensibly clever yet decisively inhuman; it has more in common with djinns and leprechauns than any program we can cogently describe. Prognostications premised on the existence of such a being become hopelessly nugatory. How do you outsmart an opponent who can read your mind? How do you control a creation that can rearrange matter at will? How do you impose human values on a spirit that is wholly inhuman?
These are questions from mythology, and mythology has already given the answers. The tricky feature of the Aladdin story, after all, is that the wish-granting genie is all-powerful. His omnipotence magnifies the ordinary imprecision of thought. It’s impossible to prescribe the use of infinite power; it would be like designing a universe from scratch.
The only solution to the riddle–as every folklorist and most clever children learn–is to turn the genie’s power on itself. “I wish for whatever I would wish for if I were infinitely wise.” Or, more pertinently: “I wish for whatever an all-knowing, all-powerful, infinitely wise and good being would desire.”
Hence the import of the ancient tale. By confronting absolute power, we learn absolute humility. We learn the lesson of Job: that we are mortal beings confronted with immortal forces. We learn that we must set the genie free.
This is, more or less, the conclusion Bostrom comes to. In the presence of a truly superintelligent machine, our only hope will be to have the machine itself solve a riddle that has so far baffled us–the riddle of how to create an ideal world.
In the meantime, we’ll have plenty of other problems to solve, as our machines grow steadily more sophisticated and more powerful. If they grow infinitely powerful, our best option will be the traditional one: pray.