Robots Among Us

Hans Moravec
Carnegie Mellon University
Robotics Institute

June 1999

Bedazzled by the explosion of computers into everyday life, pundits predict a world saturated by communicating chips, in our gadgets, dwellings, clothes, even bodies. But if pervasive computing handles most of our information needs, it will still not clean the floors, take out the garbage, assemble kit furniture or do any of a thousand other essential physical tasks. The old dream of mechanical servants will remain unmet.

Robot inventors in home, university and industrial laboratories have tinkered with the problem for most of the century. While mechanical bodies adequate for manual work can be built, artificial minds for autonomous servants have been frustratingly out of reach. The problem's deceptive difficulty fooled generations of workers who attempted to solve it using computers.

The first electronic computers in the 1950s did the work of thousands of clerks, seeming to transcend humans, let alone other machines. Yet the first reasoning and game-playing programs on those computers were a match merely for single human beginners, and each only in a single narrow task. And, in the 1960s, computer-linked cameras and mechanical arms took hours to unreliably find and move a few white blocks on a black tabletop, much worse than a toddler. The situation did not improve substantially for decades, disheartening successive waves of robotics devotees.

But things are changing. Robot tasks wildly impossible in the 1970s and 1980s are nearing commercial viability in the 1990s. Experimental mobile robots map and navigate unfamiliar office suites, and robot vehicles drive themselves, mostly unaided, across entire countries. Computer vision systems locate textured objects and track and analyze faces in real time. Personal computers recognize text and speech. Why suddenly now?

Mental Illusions

The short answer is that, after decades at about 1 MIPS (Million Instructions Per Second, each instruction representing work like adding two ten-digit numbers), computer power available to research robots shot through 10, 100 and now 1,000 MIPS in the 1990s. This is odd because the cost-effectiveness of computing rose steadily all those decades. In 1960 computers were a new and mysterious factor in the cold war, and even outlandish possibilities like artificial intelligence (AI) warranted significant investment. In the early 1960s AI programs ran on the era's supercomputers, similar to those used for physical simulations by weapons physicists and meteorologists. By the 1970s the promise of AI had faded, and the effort limped for a decade on old hardware. In contrast, weapons labs upgraded repeatedly to new supercomputers. In the 1980s, departmental computers gave way to smaller project computers, then to individual workstations and personal computers. Machine costs fell and their numbers rose, but power stayed at 1 MIPS. By 1990 the research environment was saturated with computers, and only then did further gains manifest in increased power rather than numbers.

Mobile robot research might have blossomed sooner had the work been done on supercomputers, but pointlessly. At best, a mobile robot's computer could substitute for a human driver, a function worth perhaps $10 an hour. Supercomputer time cost at least $500 per hour. Besides, dominant opinion in the AI labs, dating from when computers did the work of thousands, was that, with the right program, 1 MIPS could encompass any human skill. The opinion remained defensible in the 1970s, as reasoning and game-playing programs performed at modest human levels.

For the few researchers in the newborn fields of computer vision and robotics, however, 1 MIPS was obviously far from sufficient. With the best programs, single images crammed memory, simply scanning them consumed seconds, and serious image analysis took hours. Human vision performed much more elaborate functions many times a second.

Hindsight enlightens. Computers calculate using as few gates and switching operations as possible. Human calculation, by contrast, is a laboriously learned, ponderous, awkward, unnatural behavior. Tens of billions of neurons in our vision and motor systems strain to analogize and process a digit a second. If our brain were rewired into 10 billion arithmetic circuits, each doing 100 calculations a second, by a mad computer designer with a future surgical tool, we'd outcompute 1 MIPS computers a millionfold, and the illusion of computer power would be exposed. Robotics, in fact, gave us an even better exposé.

Though spectacular underachievers at the wacky new stunt of longhand calculation, we are veteran overachievers at perception and navigation. Our ancestors, across hundreds of millions of years, prevailed by being frontrunners in the competition to find food, escape danger and protect offspring. Existing robot-controlling computers are far too feeble to match this ultra-optimized perceptual inheritance. But by how much?

The vertebrate retina is understood well enough to be a kind of Rosetta stone roughly relating nervous tissue to computation. Besides light detectors, the retina contains edge- and motion-detecting circuitry, packed into a little tenth-millimeter-thick, two-centimeter-across patch that reports on a million image regions in parallel about ten times a second via the optic nerve. In robot vision, similar detections, well coded, each require the execution of a few hundred computer instructions, making the retina's 10 million detections per second worth over 1,000 MIPS. In a risky extrapolation that must serve until something better emerges, this implies it would take about 50,000 MIPS to functionally imitate a gram of neural tissue, and almost 100 million MIPS (or 100 trillion instructions per second) to emulate the 1,500 gram human brain. PCs in 1999 beat insects, but lose to the human retina and to the 0.1 gram brain of a goldfish. They are a daunting million times too weak to perform like a human brain.
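The extrapolation above is a chain of multiplications, and a short Python sketch makes the chain easy to check. The retina and brain figures come from the text. Reading "a few hundred" instructions as 100 per detection is an assumption made here to reproduce the 1,000 MIPS floor, and the 0.02 gram tissue mass is only what the stated 1,000 MIPS and 50,000 MIPS-per-gram figures jointly imply, not a number given in the article. The function names are illustrative.

```python
# Figures taken from the text.
REGIONS = 1_000_000          # image regions reported in parallel
UPDATES_PER_SECOND = 10      # reports per second over the optic nerve
MIPS_PER_GRAM = 50_000       # the article's per-gram estimate for neural tissue
BRAIN_GRAMS = 1_500          # human brain mass used in the article


def retina_mips(instructions_per_detection):
    """MIPS needed to match the retina's detections on a computer."""
    detections_per_second = REGIONS * UPDATES_PER_SECOND
    return detections_per_second * instructions_per_detection / 1_000_000


def implied_tissue_grams(mips, mips_per_gram=MIPS_PER_GRAM):
    """Tissue mass implied by a MIPS figure at the per-gram rate (an inference)."""
    return mips / mips_per_gram


def brain_mips(grams=BRAIN_GRAMS, mips_per_gram=MIPS_PER_GRAM):
    """Whole-brain estimate obtained by scaling the per-gram rate."""
    return grams * mips_per_gram


# Assumption: 100 instructions per detection, the low end of "a few hundred".
print(retina_mips(100))               # retina estimate in MIPS
print(implied_tissue_grams(1_000))    # grams of tissue implied by that estimate
print(brain_mips())                   # whole-brain estimate in MIPS
```

With 100 instructions per detection the retina comes to 1,000 MIPS, and 1,500 grams at 50,000 MIPS per gram gives 75 million MIPS, which the text rounds to "almost 100 million."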

While dispiriting to artificial intelligence pioneers, the deficit does not warrant abandoning their goals. Computer power for a given price roughly doubled each year in the 1990s, after doubling every 18 months in the 1980s, and every two years prior. Twenty or thirty more years at the present pace would close the millionfold gap. Better yet, sufficiently useful robots don't need full human-scale brainpower.
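The "twenty or thirty more years" figure follows from the doubling times quoted above, as a minimal Python sketch shows. The millionfold gap and the three doubling periods are from the text; the function name and the idea of comparing all three paces side by side are illustrative assumptions.

```python
import math


def years_to_close(gap_factor, doubling_time_years):
    """Years of steady doubling needed to gain a given factor in power."""
    return math.log2(gap_factor) * doubling_time_years


GAP = 1_000_000  # the millionfold shortfall relative to a human brain

# Doubling periods quoted in the text, in years.
paces = [("1990s pace", 1.0), ("1980s pace", 1.5), ("earlier pace", 2.0)]
for label, doubling_time in paces:
    print(f"{label}: {years_to_close(GAP, doubling_time):.1f} years")
```

A millionfold gain is just under 20 doublings, so yearly doubling closes the gap in about 20 years, while a return to the 18-month pace of the 1980s would stretch it to about 30. Reading the range as "twenty at the present pace, thirty if growth slows" is one plausible interpretation of the sentence, not something the text states.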

Commercial and research experiences convince me that mental power like a small guppy, about 1,000 MIPS, will suffice to guide mobile utility robots reliably through unfamiliar surroundings, suiting them for jobs in hundreds of thousands of industrial locations and eventually hundreds of millions of homes. Such machines are less than a decade away, but have been elusive so long that only a few dozen small research groups pursue them.

One Track Minds

Commercial mobile robots, the smartest to date barely insectlike at 10 MIPS, have found few jobs. A paltry ten thousand work worldwide, and companies that made them are struggling or defunct (robot manipulators have a similar story). The largest class, Automatically Guided Vehicles (AGVs), transport materials in factories and warehouses. Most follow buried signal-emitting wires and detect endpoints and collisions with switches, a technique developed in the 1960s. It costs hundreds of thousands of dollars to install guide wires under concrete floors, and the routes are then fixed, making the robots economical only for large, exceptionally stable factories. Some robots made possible by the advent of microprocessors in the 1980s track softer cues, like patterns in tiled floors, and use ultrasonics and infrared proximity sensors to detect and negotiate their way around obstacles.

The most advanced industrial mobile robots to date, developed since the late 1980s, are guided by occasional navigational markers, for instance laser-sensed bar codes, and by preexisting features like walls, corners and doorways. The hard-hat labor of laying guide wires is replaced by programming carefully tuned for each route segment. The small companies who developed the robots discovered many industrial customers eager to automate transport, floor cleaning, security patrol and other routine jobs. Alas, most buyers lost interest as they realized that installation and route changing required time-consuming and expensive work by experienced route programmers of precarious availability. Technically successful, the robots fizzled commercially. But in failure they revealed the essentials for success.

First one needs reasonably-priced physical vehicles to do various jobs. Fortunately existing AGVs, fork lift trucks, floor scrubbers and other industrial machines designed for human riders or to follow wires can be adapted for autonomy. Second, the customer should be able, unassisted, to rapidly put a robot to work where needed. Floor cleaning and most other mundane tasks cannot bear the cost, time and uncertainty of expert installation. Third, the robots must work for at least six months between missteps. Customers routinely rejected robots that, after a month of flawless operation, wedged themselves in corners, wandered away lost, rolled over employees' feet or fell down stairs. Six months, however, earned the machines a sick day.

Robots exist that work faultlessly for years, perfected by a repeated process that fixes the most frequent failures, revealing successively rarer problems that are corrected in turn. Alas, the reliability has been achieved only for prearranged routes. Insectlike 10 MIPS is just enough to track a few hand-picked landmarks on each path segment. Such robots are easily confused by minor surprises like shifted bar codes or blocked corridors, not unlike ants on scent trails or moths guided by the moon, who can be trapped by circularized trails or streetlights. (Unlike plodding robots, though, insects routinely take lethal risks, and thus have more interesting, if short, lives.)

A Sense of Space

Robots that chart their own routes emerged from laboratories worldwide in the mid 1990s, as microprocessors reached 100 MIPS. Most build two-dimensional maps from sonar or laser rangefinder scans to locate and route themselves, and the best seem able to navigate office hallways for days between confusions. To date they fall far short of the six-month commercial criterion. Too often different locations in coarse 2D maps resemble one another, or the same location, scanned at different heights, looks different, or small obstacles or awkward protrusions are overlooked. But sensors, computers and techniques are improving, and success is in sight.

My small laboratory is in the race. In the 1980s we devised a way to distill large amounts of noisy sensor data into reliable maps by accumulating statistical evidence of emptiness or occupancy in each cell of a grid representing the surroundings. The approach worked well in 2D, and guides many of the robots mentioned above. Three-dimensional maps, a thousand times richer, promised to be even better, but for years seemed computationally out of reach. In 1992 we found economies of scale and other tricks that reduced 3D grid costs a hundredfold, and now have a test program that accumulates thousands of measurements from stereoscopic camera glimpses to map a room's volume down to centimeter-scale. With 1,000 MIPS the program digests over a glimpse per second, adequate for slow indoor travel. A thousand MIPS is just appearing in high-end personal computers. In a few years it will be found in smaller, cheaper computers fit for robots, and we've begun an intensive three-year project to develop a prototype commercial product. Highlights in the development will be automatic learning processes to optimize hundreds of evidence-weighing parameters, programs to find clear paths, locations, floors, walls, doors and other objects in the 3D maps, and sample application programs orchestrating the basic skills into tasks like delivery, floor cleaning and patrol. The initial testbed is a small camera-studded mobile robot. Tiny mass-produced digital camera chips promise to be the cheapest way to get the millions of measurements needed for dense maps.
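The idea of accumulating evidence of emptiness or occupancy in each grid cell can be illustrated with a toy Python sketch. It uses a log-odds update, a common formulation for grids of this kind that is assumed here for illustration and is not claimed to be the laboratory's own method. The class and function names, the one-row grid, and the 0.3 and 0.7 sensor probabilities are all hypothetical.

```python
import math


class EvidenceGrid:
    """2D grid whose cells hold log-odds of occupancy (0.0 means unknown)."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.log_odds = [[0.0] * width for _ in range(height)]

    def add_evidence(self, x, y, p_occupied):
        """Fold one reading's occupancy estimate into a cell."""
        if not (0 <= x < self.width and 0 <= y < self.height):
            return  # readings outside the map are ignored
        p = min(max(p_occupied, 1e-6), 1.0 - 1e-6)  # keep the log finite
        self.log_odds[y][x] += math.log(p / (1.0 - p))

    def probability(self, x, y):
        """Convert accumulated log-odds back to a probability."""
        return 1.0 / (1.0 + math.exp(-self.log_odds[y][x]))


def integrate_range_reading(grid, x0, y0, dx, dy, hit_distance,
                            p_empty=0.3, p_hit=0.7):
    """Cells before the hit gain evidence of emptiness; the hit cell gains
    evidence of occupancy. The probabilities are assumed values."""
    for step in range(hit_distance):
        grid.add_evidence(x0 + dx * step, y0 + dy * step, p_empty)
    grid.add_evidence(x0 + dx * hit_distance, y0 + dy * hit_distance, p_hit)


# Five noisy readings along one row: four see a wall at cell 5, one
# spurious reading reports an obstacle at cell 3.
grid = EvidenceGrid(width=10, height=1)
for distance in [5, 5, 3, 5, 5]:
    integrate_range_reading(grid, 0, 0, 1, 0, distance)
print([round(grid.probability(x, 0), 2) for x in range(10)])
```

In this run the spurious hit at cell 3 is outvoted by the four readings that saw past it, so cell 3 ends up probably empty, cell 5 probably occupied, and the cells no reading reached stay at 0.5. A 3D version would add a third index and far more readings per glimpse.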

As a first commercial product, we plan a basketball-sized "navigation head" for retrofit onto existing industrial vehicles. It would have multiple stereoscopic cameras, 1,000 MIPS, generic mapping, recognition and control software, an application-specific program, and a hardware connection to vehicle power, controls and sensors. Head-equipped vehicles with transport or patrol programs could be taught new routes simply by leading them through once. Floor-cleaning programs would be shown the boundaries of their work area. Introduced to a job location, the vehicles would understand their changing surroundings competently enough to work at least six months without debilitating mistakes. Ten thousand AGVs, a hundred thousand cleaning machines and, possibly, a million fork-lift trucks are candidates for retrofit, and robotization may greatly expand those markets.

Income and experience from spatially-aware industrial robots would set the stage for smarter yet cheaper ($1,000 rather than $10,000) consumer products, starting probably with small, patient robot vacuum cleaners that automatically learn their way around a home, explore unoccupied rooms and clean whenever needed. I imagine a machine low enough to fit under some furniture, with an even lower extendible brush, that returns to a docking station to recharge and disgorge its dust load. Such machines could open a true mass market for robots, with a hundred million potential customers.

Fast Replay

Commercial success will provoke competition and accelerate investment in manufacturing, engineering and research. Vacuuming robots should beget smarter cleaning robots with dusting, scrubbing and picking-up arms, followed by larger multifunction utility robots with stronger, more dexterous arms and better sensors. Programs will be written to make such machines pick up clutter, store, retrieve and deliver things, take inventory, guard homes, open doors, mow lawns, play games and so on. New applications will expand the market and spur further advancements, when robots fall short in acuity, precision, strength, reach, dexterity, skill or processing power. Capability, numbers sold, engineering and manufacturing quality, and cost effectiveness will increase in a mutually reinforcing spiral. Perhaps by 2010 the process will have produced the first broadly competent "universal robots," as big as people but with lizardlike 5,000 MIPS minds that can be programmed for almost any simple chore.

Like competent but instinct-ruled reptiles, first-generation universal robots will handle only contingencies explicitly covered in their current application programs. Unable to adapt to changing circumstances, they will often perform inefficiently or not at all. Still, so much physical work awaits them in businesses, streets, fields and homes that robotics could begin to overtake pure information technology commercially.

A second generation of universal robot with a mouselike 100,000 MIPS will adapt as the first generation does not, and even be trainable. Besides application programs, the robots would host a suite of software "conditioning modules" that generate positive and negative reinforcement signals in predefined circumstances. Application programs would have alternatives for every step small and large (grip under/over hand, work in/out doors). As jobs are repeated, alternatives that had resulted in positive reinforcement will be favored, those with negative outcomes shunned. With a well-designed conditioning suite (e.g., positive for doing a job fast or keeping the batteries charged; negative for breaking or hitting something) a second-generation robot will slowly learn to work increasingly well.
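The conditioning scheme described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a design from the article: the class and function names, the signal weights (+1 for speed, -5 for breakage), the exploration parameter and the simulated outcomes are all hypothetical.

```python
import random


class ConditionedStep:
    """One step of an application program with several alternatives."""

    def __init__(self, alternatives, rng=None):
        self.scores = {name: 0.0 for name in alternatives}
        self.rng = rng or random.Random()

    def choose(self, explore=0.1):
        """Usually pick the best-scoring alternative, sometimes try another."""
        if self.rng.random() < explore:
            return self.rng.choice(list(self.scores))
        return max(self.scores, key=self.scores.get)

    def reinforce(self, alternative, signal):
        """Accumulate a positive or negative reinforcement signal."""
        self.scores[alternative] += signal


def conditioning_suite(outcome):
    """Hypothetical conditioning modules: reward speed, punish breakage."""
    signal = 0.0
    if outcome["seconds"] < outcome["target_seconds"]:
        signal += 1.0   # assumed weight for doing the job fast
    if outcome["broke_something"]:
        signal -= 5.0   # assumed weight for breaking something
    return signal


def simulated_grip(alternative):
    """Made-up world: the overhand grip is fast but always drops the item."""
    return {"seconds": 4, "target_seconds": 5,
            "broke_something": alternative == "overhand"}


grip = ConditionedStep(["overhand", "underhand"])
for _ in range(5):
    choice = grip.choose(explore=0.0)   # no exploration, for a repeatable demo
    grip.reinforce(choice, conditioning_suite(simulated_grip(choice)))
print(grip.choose(explore=0.0), grip.scores)
```

In this contrived run the robot tries the overhand grip once, is punished for the breakage, and settles on the underhand grip from then on. A real suite would need some exploration and many more signals, which is why learning of this kind would be slow.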

A monkeylike 5 million MIPS will permit a third generation of robots to learn very quickly from mental rehearsals in simulations that model physical, cultural and psychological factors. Physical properties include shape, weight, strength, texture and appearance of things and how to handle them. Cultural aspects include a thing's name, value, proper location and purpose. Psychological factors, applied to humans and other robots, include goals, beliefs, feelings and preferences. Developing the simulators will be a huge undertaking involving thousands of programmers and experience-gathering robots. The simulation would track external events, and tune its models to keep them faithful to reality. It should let a robot learn a skill by imitation, and afford a kind of consciousness. Asked why there are candles on the table, a third generation robot might consult its simulation of house, owner and self to honestly reply that it put them there because its owner likes candlelit dinners and it likes to please its owner. Further queries would elicit more details about a simple inner mental life concerned only with concrete situations and people in its work area.

Fourth-generation universal robots with a humanlike 100 million MIPS will be able to abstract and generalize. The first ever AI programs did narrow abstract reasoning almost as well as people, and many existing expert systems outperform us. But the symbols they manipulate are meaningless unless interpreted by humans. For instance, a medical diagnosis program needs a human practitioner to enter a patient's symptoms, and to implement a recommended therapy. Not so a third-generation robot, whose simulator provides a two-way conduit between symbolic descriptions and physical reality. Fourth-generation machines result from melding powerful reasoning programs to third-generation machines. Properly educated, the resulting robots could become intellectually formidable.

The path I've outlined roughly recapitulates the evolution of human intelligence - at ten million times the speed. It suggests robot intelligence will surpass our own well before 2050. In that case, mass-produced, fully-educated robot scientists working diligently, cheaply, rapidly and increasingly effectively will ensure that most of what science knows in 2050 will have been discovered by our artificial progeny!

Raw material for figures for the article

Illustrations that illuminate some of the article's claims.

Robots in our lab:
The Uranus mobile robot, left, with trinocular cameras and sonar ring.
The Neptune robot is seen at the right.

2D grid map of hallway made by a sonar-sensing robot

1979 sparse map and 1997 dense 3D grid map of a room from 20 stereoscopic images

MIPS and megabytes

Faster than exponential growth in computing power

The long freeze in AI computer power

Navigation head (commercial product)

AGV (factory vehicle), possible user of navigation head

Vacuum-cleaning robot concept

Honda P2, an embryonic universal robot

A conceptual universal robot

Robot delivering package

Twin robot assembling furniture

Robot in kitchen