Ripples and Puddles
Hans Moravec
April 2000
Computers were invented recently to
mechanize tedious manual informational procedures. Such procedures
were themselves invented only during the last ten millennia, as
agricultural civilizations outgrew village-scale social instincts. The
instincts arose in our hominid ancestors during several million years
of life in the wild, and were themselves built on perceptual and motor
mechanisms that had evolved in a vertebrate lineage spanning hundreds
of millions of years.
Bookkeeping and its elaborations exploit ancestral faculties for
manipulating objects and following instructions. We recognize written
symbols in the way our ancestors identified berries and mushrooms,
operate pencils as they wielded hunting sticks, and learn to
multiply and integrate by parts as they acquired village procedures
for cooking and tentmaking.
Paperwork uses evolved skills, but in an unnaturally narrow and
unforgiving way. Where our ancestors worked in complex visual, tactile
and social settings, alert to subtle opportunities or threats, a clerk
manipulates a handful of simple symbols on a featureless field. And
while a dropped berry is of little consequence to a gatherer, a missed
digit can invalidate a whole calculation.
The peripheral alertness by which our ancestors survived is a
distraction to a clerk. Attention to the texture of the paper, the
smell of the ink, the shape of the symbols, the feel of the chair, the
noise down the hall, digestive rumblings, family worries and so on can
derail a procedure. Clerking is hard work more because of the
preponderance of human mentation it must suppress than because of the
tiny bit it uses effectively.
Ripples
Like little ripples on the surface of a deep, turbulent pool,
calculation and other kinds of procedural thought are possible only
when the turbulence is quelled. Humans achieve quiescence imperfectly
by intense concentration. Much easier to discard the pesky abyss
altogether: ripples are safer in a shallow pan. Numbers are better
manipulated as calculus stones or abacus beads than in human memory. A
few cogwheels in Blaise Pascal's seventeenth century calculator
perform the entire procedure of addition better and faster than a
human mind. Charles Babbage's nineteenth century Analytical Engine
would have outcalculated dozens of human computers and eliminated
their errors. Such devices are effective because they encode the bits
of surface information used in calculation, and not the millions of
distracting processes churning the depths of the human brain.
The deep processes sometimes help. We guess quotient digits in long
divisions with a sense of proportion our ancestors perhaps used to
divide food among mouths. Mechanical calculators, unable to guess,
plod through repeated subtractions. More momentously, geometric proofs
are guided (and motivated!) by our deep ability to see points, lines,
shapes and their symmetries, similarities and congruences. And true
creative work is shaped more by upwellings from the deep than by overt
procedure.
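To make the contrast concrete, here is a minimal sketch (in Python,
with invented names) of the mechanical calculator's method: each
quotient digit is earned by counting subtractions of the shifted
divisor, never guessed.

    def divide_by_repeated_subtraction(dividend, divisor):
        """Divide a non-negative dividend by a positive divisor the way
        a mechanical calculator does: each quotient digit is obtained by
        counting subtractions of the shifted divisor, with no guessing."""
        quotient, shift = 0, 1
        while divisor * shift * 10 <= dividend:   # align divisor leftward
            shift *= 10
        while shift >= 1:
            while dividend >= divisor * shift:    # subtract until it no longer fits
                dividend -= divisor * shift
                quotient += shift
            shift //= 10
        return quotient, dividend                 # quotient, remainder

    print(divide_by_repeated_subtraction(1234, 7))   # (176, 2)

A human computer guesses each digit at once; the machine instead plods
through as many as nine subtractions per decimal position.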
Calculators gave way to Alan Turing's universal computers, which grew
to thousands, then millions, and now approach billions of storage
locations and procedure steps per second. In doing so they transcended
their paperwork origins and acquired their own murky depths. For
instance, without great care, one computer process can spoil another,
like a clerk derailed by stray thoughts. On the plus side,
superhumanly huge searches, table lookups and the like can sometimes
function like human deep processes. In 1956 the Logic Theorist of
Allen Newell, Herbert Simon and John Shaw used massive searches to
find proofs like a novice human logician. Herbert Gelernter's 1963 Geometry
Theorem Prover used large searches and Cartesian coordinate arithmetic
to equal a fair human geometer's visual intuitions. Expert systems'
large compilations of inference rules and combinatorial searches match
human experience in narrow fields. Deep Blue's giga-scale search,
opening and endgame books and carefully-tuned board evaluations
defeated the top human chess player in 1997.
Despite such isolated soundings, computers remain shallow bowls. No
reasoning program even approaches the sensory and mental depths
habitually manifest at the surface of human thought. Doug Lenat's
common-sense encoding project Cyc, begun in the 1980s and probably the
most ambitious such effort, would capture broad verbal knowledge yet still lack visual,
auditory, tactile or abstract understanding.
Many critics contrast computers' superiority in rote work with their
deficits of comprehension to conclude that, though computers are
prodigiously powerful, universal computation lacks some human mental
principle (of physical, situational or supernatural kind, per taste). Some
Artificial Intelligence practitioners profess a related view: computer
hardware is sufficient, but difficult unsolved conceptual problems
keep us from programming true intelligence.
The latter premise can seem plausible for reasoning, but it is
preposterous for sensing. The sounds and images processed by human
ears and eyes represent megabytes per second of raw data, itself
enough to overwhelm computers past and present. Text, speech and
vision programs derive meaning from snippets of such data by
weighing and reweighing thousands or millions of hypotheses in its
light. At least some of the human brain works similarly. Roughly ten
times per second at each of the retina's million effective pixels,
dozens of neurons weigh the hypothesis that a static or moving
boundary is visible then and there. The visual cortex's ten billion
neurons elaborate those results, each moment appraising possible
orientations and colors at all the image locations. Efficient computer
vision programs require over 100 calculations each to make similar
assessments. Most of the brain remains mysterious, but all its neurons
seem to work about as diligently as those in the visual system. Elsewhere
I've detailed the retinal calculation to conclude that it would take
on the order of 100 trillion calculations per second of computing --
about a million present-day PCs -- to match the brain's functionality.
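The arithmetic behind that figure is short enough to spell out. Here
is a back-of-envelope sketch using the numbers in the passage above;
the roughly 100,000-fold scale-up from retina to whole brain is my
reading of how the retinal figure is extrapolated, not a number stated
in this essay.

    # Back-of-envelope version of the text's estimate. The per-retina
    # figures are from the passage above; the ~100,000-fold scale-up
    # from retina to whole brain is an assumed extrapolation factor.
    pixels = 1_000_000           # effective retinal pixels
    rate = 10                    # boundary assessments per pixel per second
    calcs = 100                  # computer calculations per assessment

    retina = pixels * rate * calcs            # 1e9 calculations/second
    brain = retina * 100_000                  # ~1e14: 100 trillion calc/s
    print(f"retina ~{retina:.0e} calc/s, whole brain ~{brain:.0e} calc/s")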
That number presumes an emulation of the brain at the scale of image
edge detectors: a few hundred thousand calculations per second doing
the job of a few hundred neurons. The computational requirements would
increase (maybe a lot) if we demanded emulation at a finer grain, say
explicit representation of each neuron. By insisting on a fine grain
we constrain the solution space and outlaw global optimizations. On
the plus side, by constraining the space we simplify the search! No
need to find efficient algorithms for edge detection and other
hundred-neuron-scale nervous system functions. If we had good models
for neurons and a wiring diagram of a brain, we could emulate it as a
straightforward network simulation. The problems of Artificial
Intelligence would be reduced to merely instrumentally- and
computationally-daunting work.
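As a cartoon of what such a "straightforward network simulation" means
computationally, here is a toy sketch: a leaky integrate-and-fire
network whose random weight matrix stands in for the hypothetical
wiring diagram. A real emulation would be essentially this loop,
scaled up by many orders of magnitude and using measured neuron models
rather than invented ones.

    import numpy as np

    # Toy neuron-level network simulation of the kind described above.
    # Everything here is hypothetical stand-in data: a real emulation
    # would need measured neuron models and an actual wiring diagram.
    rng = np.random.default_rng(0)
    n = 1000                                 # neurons (a human brain: ~1e11)
    weights = rng.normal(0.0, 0.1, (n, n))   # stand-in wiring diagram
    potential = np.zeros(n)                  # membrane potentials
    threshold, leak = 1.0, 0.9

    spikes = rng.random(n) < 0.05            # seed with sparse initial activity
    for step in range(100):
        # Each step: decay, integrate input from neurons that just fired,
        # fire whoever crosses threshold, and reset the firers.
        potential = leak * potential + weights @ spikes
        spikes = potential > threshold
        potential[spikes] = 0.0
    print(f"{int(spikes.sum())} of {n} neurons firing at the last step")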
Alternatively we could try to implement the brain's function at much
larger than edge-detector grain. The solution space expands and with
it the difficulty of finding globally efficient algorithms, but their
computational requirements decrease. Perhaps programs implementing
humanlike intelligence in a highly abstract way are possible on
existing computers, as AI traditionalists imagine. Perhaps, as they
also imagine, devising such programs requires lifetimes of work by
world-class geniuses.
But it may not be so easy. The most efficient programs exhibiting
human intelligence might exceed the power and memory of present PCs
manyfold, and devising them might be superhumanly difficult. We don't
know: the pool is extremely murky below the ripples, and has not been
fathomed.
(Very powerful optimizing compilers could conceivably blur grain sizes
by transforming neuron-level brain simulation programs into
super-efficient code that preserves input-output behavior but
resembles traditional AI programs. Such compilers would surely need
superhuman mental power (they would be singlehandedly solving the AI
problem, after all), but perhaps of a relatively simple, idiot-savant,
kind.)
Puddles
Each approach to matching human performance is interesting
intellectually and has immediate pragmatic benefits. Reasoning
programs outperform humans at important tasks, and many already earn
their keep. Neural modeling is of great biological interest, and may
have medical uses. Efficient perception programs are somewhat
interesting to biologists, and useful in automating factory processes
and data entry.
But which will succeed first? The answer is surely a combination of
all those techniques and others, but I believe the perception route,
currently an underdog, will play the largest role.
Reasoning-type programs are superb for consciously explicable tasks,
but become unwieldy when applied to deeper processes. In part this is
simply because the tasks deep in the subconscious murk elude
observation. But also, the deeper processes are quantitatively
different. A few bits of problem data ripple across the conscious
surface, but billions of noisy neural signals seethe below. Reasoning
programs will become more powerful and useful in coming decades, but I
think comprehensive verbal common sense, let alone sensory
understanding, will continue to elude them.
Entire animal nervous systems, hormonal signals and interconnection
plasticity included, may become simulable in coming decades, as
imaging instrumentation and computational resources rapidly
improve. Such simulations will greatly accelerate neurobiological
understanding, but I think not rapidly enough to win the
race. Valentino Braitenberg, who analyzes small nervous systems and
has designed artificial ones, notes the rule of "downhill synthesis
and uphill analysis" -- it is usually easier to compose a circuit with
certain behaviors than to describe how an existing circuit manages to
achieve them. Meager understanding (and thus meager means to modify
designs), the cost of simulating at a very fine grain, and ethical
hurdles as simulations approach human scale will all slow the
application of neural simulations. But robot toys following in Aibo's
pawprints should be
interesting!
No human-scale intelligence (as far as we know) ever developed from
conscious reasoning down, nor from simulations of neural processes,
and we really don't know how hard doing either may be. But the third
approach is familiar ground.
Multicellular animals with cells specialized for signaling emerged in
the Cambrian explosion a half-billion years ago. In a game of
evolutionary one-upmanship (there's always room at the top!) maximum
nervous system masses doubled about every 15 million years, from
fractional micrograms then to several kilograms now (with several
abrupt retreats, often followed by accelerated redevelopment, when
catastrophic events eliminated the largest animals).
Our gadgets, too, are growing exponentially more complex, but 10
million times as fast: human foresight and culture enable bigger,
quicker steps than blind Darwinian evolution. The power of new
personal computers has doubled annually since the mid 1990s. The "edge
operator" estimate makes today's PCs comparable only to milligram
nervous systems, like those of insects or the smallest vertebrates
(e.g. the 1 cm dwarf goby fish), but humanlike power is just thirty
years away. A
sufficiently vigorous development with well-chosen selection criteria
should be able to incrementally mold that growing power in stages
analogous to those of vertebrate mental evolution. I believe a certain
kind of robot industry will do this very naturally. No great
intellectual leaps should be required: when insight fails, Darwinian
trial and error will suffice -- each ancestor along the lineage from
tiny first vertebrates to ourselves became such by being a survivor in
its time, and similarly ongoing commercial viability will select
intermediate robot minds.
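The thirty-year horizon is simple exponential arithmetic. A sketch,
taking a 2000-era PC at about a billion calculations per second (the
guppy-scale figure mentioned below) and treating the long-run doubling
period as the key assumption; the essay's figure corresponds to a
doubling period a little under two years, slower than the recent
annual pace.

    import math

    # Years until PCs reach the 100-trillion-calculations/second estimate,
    # as a function of the assumed long-run doubling period. Start and
    # target come from the essay; the doubling periods are assumptions.
    start, target = 1e9, 1e14                # 2000-era PC vs. brain estimate
    doublings = math.log2(target / start)    # about 16.6 doublings needed

    for months in (12, 18, 24):
        years = doublings * months / 12
        print(f"{months}-month doubling: ~{years:.0f} years")
    # 12 -> ~17, 18 -> ~25, 24 -> ~33 years: a thirty-year horizon
    # implies a long-run doubling period a little under two years.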
Building intelligent machines by this route is like slowly flooding
puddles to make pools. Existing robot control and perception programs
seem muddy puddles because they compete in areas of deepest human and
animal expertise. Reasoning programs, though equally shallow,
comparatively shine by efficiently performing tasks humans do
awkwardly and animals not at all. But if we keep pouring, the puddles
will surely become deeper. That may not be true for reasoning
programs: can pools be filled surface down?
Many of our sensory, spatial and intellectual abilities evolved to
deal with a mobile lifestyle: an animal on the move confronts a
relentless stream of novel opportunities and dangers. Other skills
arose to meet the challenges of cooperation and competition in social
groups. Elsewhere I've outlined a plan for commercial robot
development that provides similar challenges. It will require a large,
vigorous industry to search for analogous solutions. Today the
industry is tiny. Advanced robots have insectlike mentalities, besting
human labor only rarely, in exceptionally repetitive or dangerous
work. But I expect a mass market to emerge this decade. The first
widely usable products will be guidance systems for industrial
transport and cleaning machines that three-dimensionally map and
competently navigate unfamiliar spaces, and can be quickly taught new
routes by ordinary workers. I have been developing programs that do
this. They need about a billion calculations per second, like the
brainpower of a guppy! Industrial machines will be followed by
mass-marketed utility robots for homes. The first may be a small, very
autonomous robot vacuum cleaner that maps a residence, plans its own
routes and schedules, keeps itself charged and empties its dustbag
when necessary into a larger container. Larger machines with
manipulator arms and the ability to perform several different tasks
may follow, culminating eventually in human-scale "universal" robots
that can run application programs for most simple chores. Their
10-billion-calculation-per-second lizard-scale minds would execute
application programs with reptilian inflexibility.
This path to machine intelligence, incremental, reactive,
opportunistic and market-driven, does not require a long-range map,
but has one in our own evolution. In the decades following the first
universal robots, I expect a second generation with mammallike
brainpower and cognitive ability. They will have a conditioned
learning mechanism, and steer among alternative paths in their
application programs on the basis of past experience, gradually
adapting to their special circumstances. A third generation will think
like small primates and maintain physical, cultural and psychological
models of their world to mentally rehearse and optimize tasks before
physically performing them. A fourth, humanlike, generation will
abstract and reason from the world model. I expect the reasoning
systems will be adopted from the traditional AI approach maligned
earlier in this essay. The puddles will have reached the ripples.
Robotics should become the largest industry on the planet early in
this evolution, eclipsing the information industry. The latter
achieved its exalted status by automating marginal tasks we used to
call paperwork. Robotics will automate everything else!