Robots, After All


Hans Moravec


Carnegie Mellon University
Robotics Institute


July 1999


Computers have invaded everyday life, and networked machines are worming their way into our gadgets, dwellings, clothes, even bodies. But if pervasive computing soon handles most of our information needs, it will still not clean the floors, take out the garbage, assemble kit furniture or do any of a thousand other essential physical tasks. The old dream of mechanical servants will remain mostly unmet.

Robot inventors in home, university and industrial laboratories have tinkered with the problem for most of the century. While mechanical bodies adequate for manual work can be built, artificial minds for autonomous servants have been frustratingly out of reach. The problem's deceptive difficulty fooled generations of workers who attempted to solve it using computers.

The first electronic computers in the 1950s did the work of thousands of clerks, seeming to transcend humans, let alone other machines. Yet the first reasoning and game-playing programs on those computers were a match merely for single human beginners, and each only in a single narrow task. And, in the 1960s, computer-linked cameras and mechanical arms took hours to unreliably find and move a few white blocks on a black tabletop, much worse than a toddler. The situation did not improve substantially for decades, and disheartened waves of robotics devotees.

But things are changing. Robot tasks wildly impossible in the 1970s and 1980s are nearing commercial viability in the 1990s. Experimental mobile robots map and navigate unfamiliar office suites [1], and robot vehicles drive themselves, mostly unaided, across entire countries [2]. Computer vision systems locate textured objects and track and analyze faces in real time. Personal computers recognize text and speech. Why suddenly now?

Mental Illusions

The short answer is that, after decades at about 1 MIPS (Million Instructions Per Second, each instruction representing work like adding two ten-digit numbers), computer power available to research robots shot through 10, 100 and now 1,000 MIPS in the 1990s (Figure 1). This is odd because the cost-effectiveness of computing rose steadily all those decades (Figure 2). In 1960 computers were a new and mysterious factor in the cold war, and even outlandish possibilities like artificial intelligence (AI) warranted significant investment. In the early 1960s AI programs ran on the era's supercomputers, similar to those used for physical simulations by weapons physicists and meteorologists. By the 1970s the promise of AI had faded, and the effort limped for a decade on old hardware. In contrast, weapons labs upgraded repeatedly to new supercomputers. In the 1980s, departmental computers gave way to smaller project computers then to individual workstations and personal computers. Machine costs fell and their numbers rose, but power stayed at 1 MIPS. By 1990 the research environment was saturated with computers, and only then did further gains manifest in increased power rather than numbers.

Mobile robot research might have blossomed sooner had the work been done on supercomputers, but pointlessly. At best, a mobile robot's computer could substitute for a human driver, a function worth perhaps $10 an hour. Supercomputer time cost at least $500 per hour. Besides, dominant opinion in the AI labs, dating from when computers did the work of thousands, was that, with the right program, 1 MIPS could encompass any human skill. The opinion remained defensible in the 1970s, as reasoning and game-playing programs performed at modest human levels.

For the few researchers in the newborn fields of computer vision and robotics, however, 1 MIPS was obviously far from sufficient. With the best programs, single images crammed memory, simply scanning them consumed seconds, and serious image analysis took hours. Human vision performed much more elaborate functions many times a second.

Hindsight enlightens. Computers calculate using as few gates and switching operations as possible. Human calculation, by contrast, is a laboriously learned, ponderous, awkward, unnatural behavior. Tens of billions of neurons in our vision and motor systems strain to analogize and process a digit a second. If our brain were rewired into 10 billion arithmetic circuits, each doing 100 calculations a second, by a mad computer designer with a future surgical tool, we'd outcompute 1 MIPS computers a millionfold, and the illusion of computer power would be exposed. Robotics, in fact, gave us an even better exposé.

Though spectacular underachievers at the wacky new stunt of longhand calculation, we are veteran overachievers at perception and navigation. Our ancestors, across hundreds of millions of years, prevailed by being frontrunners in the competition to find food, escape danger and protect offspring. Existing robot-controlling computers are far too feeble to match this ultra-optimized perceptual inheritance. But by how much?

The vertebrate retina is understood well enough to be a kind of Rosetta stone roughly relating nervous tissue to computation. Besides light detectors, the retina contains edge- and motion-detecting circuitry, packed into a little tenth-millimeter-thick, two-centimeter-across patch that reports on a million image regions in parallel about ten times a second via the optic nerve. In robot vision, similar detections, well coded, each require the execution of a few hundred computer instructions, making the retina's 10 million detections per second worth over 1,000 MIPS. In a risky extrapolation that must serve until something better emerges, this implies it would take about 50,000 MIPS to functionally imitate a gram of neural tissue, and almost 100 million MIPS (or 100 trillion instructions per second) to emulate the 1,500 gram human brain. PCs in 1999 beat insects, but lose to the human retina and to the 0.1 gram brain of a goldfish. They are a daunting million times too weak to perform like a human brain. [3]
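The extrapolation above is simple enough to restate as arithmetic. The sketch below is only a back-of-envelope check, not a model of the retina. Two of its inputs are assumptions rather than figures from the text: the per-detection cost is set to 100 instructions (the low end of "a few hundred," which yields the 1,000 MIPS floor), and the retina's mass is taken as 0.02 gram, the value implied by dividing 1,000 MIPS by 50,000 MIPS per gram.

```python
# Back-of-envelope restatement of the retina-to-brain extrapolation.
# INSTRUCTIONS_PER_DETECTION and RETINA_GRAMS are assumptions (see lead-in);
# the other constants are taken from the text.

REGIONS = 1_000_000               # image regions reported in parallel
UPDATES_PER_SECOND = 10           # reports per region per second
INSTRUCTIONS_PER_DETECTION = 100  # assumed lower bound
RETINA_GRAMS = 0.02               # assumed, implied by the text's ratios
BRAIN_GRAMS = 1_500               # human brain mass used in the text


def retina_mips():
    """MIPS needed to match the retina's detections per second."""
    detections_per_second = REGIONS * UPDATES_PER_SECOND
    return detections_per_second * INSTRUCTIONS_PER_DETECTION / 1_000_000


def mips_per_gram():
    """MIPS needed per gram of neural tissue, scaled from the retina."""
    return retina_mips() / RETINA_GRAMS


def brain_mips():
    """MIPS needed for a whole brain at the same density."""
    return mips_per_gram() * BRAIN_GRAMS


if __name__ == "__main__":
    print(f"retina: {retina_mips():,.0f} MIPS")
    print(f"per gram: {mips_per_gram():,.0f} MIPS")
    print(f"brain: {brain_mips():,.0f} MIPS")
```

With these inputs the brain figure comes out at 75 million MIPS, which the text rounds up to "almost 100 million." A per-detection cost nearer the middle of "a few hundred" would push the estimate past that round number, so the result is best read as an order of magnitude.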

While dispiriting to artificial intelligence pioneers, the deficit does not warrant abandoning their goals. Computer power for a given price roughly doubled each year in the 1990s, after doubling every 18 months in the 1980s, and every two years prior. Twenty or thirty more years at the present pace would close the millionfold gap. Better yet, sufficiently useful robots don't need full human-scale brainpower.
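The "twenty or thirty more years" figure follows from counting doublings. A minimal sketch, assuming cost-effectiveness keeps doubling at a fixed interval (the function name is illustrative, not from any library):

```python
import math


def years_to_close(gap_factor, doubling_time_years):
    """Years needed to gain `gap_factor` in power at a fixed doubling time."""
    doublings_needed = math.log2(gap_factor)
    return doublings_needed * doubling_time_years


if __name__ == "__main__":
    gap = 1_000_000  # the millionfold deficit described in the text
    for label, doubling in [("yearly doubling", 1.0), ("18-month doubling", 1.5)]:
        print(f"{label}: about {years_to_close(gap, doubling):.0f} years")
```

A millionfold gap is just under twenty doublings, so the 1990s pace of one doubling a year gives roughly twenty years. One reading of the upper end of the range is that it allows for a slip back to the 18-month pace of the 1980s, which gives roughly thirty.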

Commercial and research experiences convince me that mental power like a small guppy, about 1,000 MIPS, will suffice to guide mobile utility robots reliably through unfamiliar surroundings, suiting them for jobs in hundreds of thousands of industrial locations and eventually hundreds of millions of homes. Such machines are less than a decade away, but have been elusive so long that only a few dozen small research groups pursue them.

One Track Minds

Commercial mobile robots, the smartest to date barely insectlike at 10 MIPS, have found few jobs. A paltry ten thousand work worldwide, and companies that made them are struggling or defunct (robot manipulators have a similar story). The largest class, Automatic Guided Vehicles (AGVs) (Figure 3), transport materials in factories and warehouses. Most follow buried signal-emitting wires and detect endpoints and collisions with switches, a technique developed in the 1960s. It costs hundreds of thousands of dollars to install guide wires under concrete floors, and the routes are then fixed, making the robots economical only for large, exceptionally stable factories. Some robots made possible by the advent of microprocessors in the 1980s track softer cues, like patterns in tiled floors, and use ultrasonics and infrared proximity sensors to detect and negotiate their way around obstacles.

The most advanced industrial mobile robots to date, developed since the late 1980s, are guided by occasional navigational markers, for instance laser-sensed bar codes, and by preexisting features like walls, corners and doorways. The hard-hat labor of laying guide wires is replaced by programming carefully tuned for each route segment. The small companies who developed the robots discovered many industrial customers eager to automate transport, floor cleaning, security patrol and other routine jobs. Alas, most buyers lost interest as they realized that installation and route changing required time-consuming and expensive work by experienced route programmers of precarious availability. Technically successful, the robots fizzled commercially. But in failure they revealed the essentials for success.

First one needs reasonably-priced physical vehicles to do various jobs. Fortunately existing AGVs, fork lift trucks, floor scrubbers and other industrial machines designed for human riders or to follow wires can be adapted for autonomy. Second, the customer should be able, unassisted, to rapidly put a robot to work where needed. Floor cleaning and most other mundane tasks cannot bear the cost, time and uncertainty of expert installation. Third, the robots must work for at least six months between missteps. Customers routinely rejected robots that, after a month of flawless operation, wedged themselves in corners, wandered away lost, rolled over employees' feet or fell down stairs. Six months, however, earned the machines a sick day.

Robots exist that work faultlessly for years, perfected by a repeated process that fixes the most frequent failures, revealing successively rarer problems that are corrected in turn. Alas, the reliability has been achieved only for prearranged routes. Insectlike 10 MIPS is just enough to track a few hand-picked landmarks on each path segment. Such robots are easily confused by minor surprises like shifted bar codes or blocked corridors, not unlike ants on scent trails or moths guided by the moon, who can be trapped by circularized trails or streetlights. (Unlike plodding robots, though, insects routinely take lethal risks, and thus have more interesting, if short, lives.)

A Sense of Space

Robots that chart their own routes emerged from laboratories worldwide in the mid 1990s, as microprocessors reached 100 MIPS. Most build two-dimensional maps from sonar or laser rangefinder scans to locate and route themselves, and the best seem able to navigate office hallways for days between confusions. To date they fall far short of the six-month commercial criterion. Too often different locations in coarse 2D maps resemble one another, or the same location, scanned at different heights, looks different, or small obstacles or awkward protrusions are overlooked. But sensors, computers and techniques are improving, and success is in sight.

My small laboratory is in the race. In the 1980s we devised a way to distill large amounts of noisy sensor data into reliable maps by accumulating statistical evidence of emptiness or occupancy in each cell of a grid representing the surroundings. The approach worked well in 2D (Figure 4), and guides many of the robots mentioned above. Three-dimensional maps, a thousand times richer, promised to be even better, but for years seemed computationally out of reach. In 1992 we found economies of scale and other tricks that reduced 3D grid costs a hundredfold, and now have a test program that accumulates thousands of measurements from stereoscopic camera glimpses to map a room's volume down to centimeter-scale ([4] and Figure 5). With 1,000 MIPS the program digests over a glimpse per second, adequate for slow indoor travel. A thousand MIPS is just appearing in high-end personal computers. In a few years it will be found in smaller, cheaper computers fit for robots, and we've begun an intensive three-year project to develop a prototype commercial product [5]. Highlights in the development will be automatic learning processes to optimize hundreds of evidence-weighing parameters, programs to find clear paths, locations, floors, walls, doors and other objects in the 3D maps, and sample application programs orchestrating the basic skills into tasks like delivery, floor cleaning and patrol. The initial testbed is a small camera-studded mobile robot. Tiny mass-produced digital camera chips promise to be the cheapest way to get the millions of measurements needed for dense maps.
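The idea of accumulating statistical evidence of emptiness or occupancy in each grid cell can be illustrated with a toy example. The sketch below uses a common log-odds formulation of occupancy grids. It is an assumption made for illustration, not the laboratory's actual program, and the class name, method names and probability values are all hypothetical.

```python
import math


class EvidenceGrid:
    """2D grid whose cells accumulate log-odds evidence of occupancy."""

    def __init__(self, width, height):
        # Log-odds of 0.0 means "unknown" (probability 0.5).
        self.cells = [[0.0] * width for _ in range(height)]

    def add_evidence(self, x, y, p_occupied):
        """Fold in one reading's estimate that cell (x, y) is occupied."""
        self.cells[y][x] += math.log(p_occupied / (1.0 - p_occupied))

    def probability(self, x, y):
        """Convert the accumulated log-odds back to a probability."""
        return 1.0 / (1.0 + math.exp(-self.cells[y][x]))

    def add_range_reading(self, x0, y, hit_x, p_hit=0.7, p_free=0.4):
        """Toy ray along one row: cells before the hit look empty,
        the cell at the measured range looks occupied."""
        for x in range(x0, hit_x):
            self.add_evidence(x, y, p_free)
        self.add_evidence(hit_x, y, p_hit)


if __name__ == "__main__":
    grid = EvidenceGrid(width=8, height=1)
    # Three noisy range readings of the same wall; the last one overshoots.
    for measured in (5, 5, 6):
        grid.add_range_reading(x0=0, y=0, hit_x=measured)
    for x in range(8):
        print(x, round(grid.probability(x, 0), 2))
```

Because each reading only nudges a cell's evidence, a single bad measurement is outvoted by the others, which is the sense in which noisy data can be distilled into a reliable map. A real system would replace the toy ray with a sensor model tuned to the sonar or stereo geometry.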

As a first commercial product, we plan a basketball-sized "navigation head" (Figure 6) for retrofit onto existing industrial vehicles. It would have multiple stereoscopic cameras, 1,000 MIPS, generic mapping, recognition and control software, an application-specific program, and a hardware connection to vehicle power, controls and sensors. Head-equipped vehicles with transport or patrol programs could be taught new routes simply by leading them through once. Floor-cleaning programs would be shown the boundaries of their work area. Introduced to a job location, the vehicles would understand their changing surroundings competently enough to work at least six months without debilitating mistakes. Ten thousand AGVs, a hundred thousand cleaning machines (Figure 7) and, possibly, a million fork-lift trucks are candidates for retrofit, and robotization may greatly expand those markets.

Income and experience from spatially-aware industrial robots would set the stage for smarter yet cheaper ($1,000 rather than $10,000) consumer products, starting probably with small, patient robot vacuum cleaners that automatically learn their way around a home, explore unoccupied rooms and clean whenever needed (Figure 8). I imagine a machine low enough to fit under some furniture, with an even lower extendible brush, that returns to a docking station to recharge and disgorge its dust load. Such machines could open a true mass market for robots, with a hundred million potential customers.

Fast Replay

Commercial success will provoke competition and accelerate investment in manufacturing, engineering and research. Vacuuming robots should beget smarter cleaning robots with dusting, scrubbing and picking-up arms, followed by larger multifunction utility robots with stronger, more dexterous arms and better sensors. Programs will be written to make such machines pick up clutter, store, retrieve and deliver things, take inventory, guard homes, open doors, mow lawns, play games and so on. New applications will expand the market and spur further advancements, when robots fall short in acuity, precision, strength, reach, dexterity, skill or processing power. Capability, numbers sold, engineering and manufacturing quality, and cost effectiveness will increase in a mutually reinforcing spiral. Perhaps by 2010 the process will have produced the first broadly competent "universal robots," as big as people but with lizardlike 5,000 MIPS minds that can be programmed for almost any simple chore (Figure 9).

Like competent but instinct-ruled reptiles, first-generation universal robots will handle only contingencies explicitly covered in their current application programs. Unable to adapt to changing circumstances, they will often perform inefficiently or not at all. Still, so much physical work awaits them in businesses, streets, fields and homes that robotics could begin to overtake pure information technology commercially.

A second generation of universal robot with a mouselike 100,000 MIPS will adapt as the first generation does not, and even be trainable. Besides application programs, the robots would host a suite of software "conditioning modules" that generate positive and negative reinforcement signals in predefined circumstances. Application programs would have alternatives for every step small and large (grip under/over hand, work in/out doors). As jobs are repeated, alternatives that had resulted in positive reinforcement will be favored, those with negative outcomes shunned. With a well-designed conditioning suite (e.g., positive for doing a job fast, keeping the batteries charged, negative for breaking or hitting something) a second-generation robot will slowly learn to work increasingly well.
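The conditioning scheme can be sketched in a few lines. What follows is a hypothetical illustration, not a design from the text: the class and function names, the scoring rule, the exploration rate and the example outcomes are all assumptions chosen to show alternatives being favored or shunned as reinforcement accumulates.

```python
import random


class Step:
    """One step of an application program with interchangeable alternatives."""

    def __init__(self, alternatives):
        self.scores = {name: 0.0 for name in alternatives}

    def choose(self, rng, explore=0.1):
        # Occasionally try a random alternative; otherwise use the best so far.
        if rng.random() < explore:
            return rng.choice(list(self.scores))
        return max(self.scores, key=self.scores.get)

    def reinforce(self, name, signal, rate=0.2):
        # Move the alternative's score toward the latest reinforcement signal.
        self.scores[name] += rate * (signal - self.scores[name])


def conditioning_signal(seconds_taken, broke_something):
    """Hypothetical conditioning module: positive for speed, negative for breakage."""
    signal = 1.0 if seconds_taken < 10 else 0.0
    if broke_something:
        signal -= 2.0
    return signal


def run_trials(step, outcomes, trials, rng):
    """Repeat the job, reinforcing whichever alternative was used each time."""
    for _ in range(trials):
        name = step.choose(rng)
        seconds, broke = outcomes[name]
        step.reinforce(name, conditioning_signal(seconds, broke))


if __name__ == "__main__":
    grip = Step(["underhand", "overhand"])
    # Assumed outcomes: both grips are fast, but the overhand grip breaks things.
    outcomes = {"underhand": (8, False), "overhand": (8, True)}
    run_trials(grip, outcomes, trials=50, rng=random.Random(0))
    print(grip.scores)
```

The occasional random choice is what lets the robot discover that a neglected alternative has become better, at the price of sometimes repeating a known mistake. The gentle update rate is one way to make the learning slow and steady, as the text anticipates.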

A monkeylike 5 million MIPS will permit a third generation of robots to learn very quickly from mental rehearsals in simulations that model physical, cultural and psychological factors. Physical properties include shape, weight, strength, texture and appearance of things and how to handle them. Cultural aspects include a thing's name, value, proper location and purpose. Psychological factors, applied to humans and other robots, include goals, beliefs, feelings and preferences. Developing the simulators will be a huge undertaking involving thousands of programmers and experience-gathering robots. The simulation would track external events, and tune its models to keep them faithful to reality. It should let a robot learn a skill by imitation, and afford a kind of consciousness. Asked why there are candles on the table, a third generation robot might consult its simulation of house, owner and self to honestly reply that it put them there because its owner likes candlelit dinners and it likes to please its owner. Further queries would elicit more details about a simple inner mental life concerned only with concrete situations and people in its work area.

Fourth-generation universal robots with a humanlike 100 million MIPS will be able to abstract and generalize. The first ever AI programs reasoned abstractly almost as well as people, albeit in very narrow domains [6], and many existing expert systems outperform us. But the symbols these programs manipulate are meaningless unless interpreted by humans. For instance, a medical diagnosis program needs a human practitioner to enter a patient's symptoms, and to implement a recommended therapy. Not so a third-generation robot, whose simulator provides a two-way conduit between symbolic descriptions and physical reality. Fourth-generation machines result from melding powerful reasoning programs to third-generation machines. They may reason about everyday actions by referring to their simulators, much as Herbert Gelernter's 1959 geometry theorem prover [6] examined analytic-geometry "diagrams" to check special-case examples before trying to prove general geometric statements. Properly educated, the resulting robots are likely to become intellectually formidable.


References

1. Kortenkamp, D., Bonasso, R. P., and Murphy, R. (eds.) Artificial intelligence and mobile robots : case studies of successful robot systems. MIT Press, Cambridge, Mass., 1998.

2. Pomerleau, D. RALPH: rapidly adapting lateral position handler. In Proceedings of the Intelligent Vehicles '95 Symposium. IEEE Press, Piscataway, N.J., 1996.
< http://www.cs.cmu.edu/~pomerlea/nhaa.html >

3. Moravec, H., Robot: Mere Machine to Transcendent Mind. Oxford University Press, NY, 1998.
< http://www.frc.ri.cmu.edu/users/hpm/book98 >

4. Moravec, H., Robot Spatial Perception by Stereoscopic Vision and 3D Evidence Grids, CMU Robotics Institute Technical Report CMU-RI-TR-96-34, September 1996.
< http://www.frc.ri.cmu.edu/users/hpm/project.archive/robot.papers/1996/9609.stereo.paper/SGabstract.html >

5. Moravec, H., Robust Navigation by Probabilistic Volumetric Sensing: research proposal, January 1999.
< http://www.frc.ri.cmu.edu/users/hpm/project.archive/robot.papers/1999/ARPA.proposal.99/ARPA.990108.html >

6. Feigenbaum, E., and Feldman, J. (eds.) Computers and Thought. McGraw-Hill, NY, 1963.




Figures

< http://www.frc.ri.cmu.edu/users/hpm/book97/ch3/AI.power.300.jpg >
(preview in < http://www.frc.ri.cmu.edu/users/hpm/book98/fig.ch3/p068.html > )
Figure 1: From 1960 to 1990 the cost of computers used in AI and robotics research declined while their numbers increased as funding decreased. The dilution absorbed computer-efficiency gains during the period, and the power available to individual AI programs remained almost unchanged at 1 MIPS--less than insect power. AI computer cost bottomed in 1990, and since then power has doubled yearly, to near 1,000 MIPS in 1999. The major visible exception to this pattern is computer chess, shown by a progression of knights, whose prestige lured the resources of major computer companies and the talents of programmers and machine designers. Exceptions also exist in less public competitions, like petroleum exploration and intelligence gathering, whose high return on investment warrants access to the largest computers. The indicated power of each special-purpose chess machine is that of a general-purpose computer that would be needed to perform its function. The same rule applies to the placement of the animals at the right: each marks the minimum power of a general-purpose computer that could perform the task of its nervous system.

< http://www.frc.ri.cmu.edu/users/hpm/book97/ch3/power.300.jpg >
(preview in < http://www.frc.ri.cmu.edu/users/hpm/book98/fig.ch3/p060.html > )
Figure 2: The number of MIPS in $1,000 of computer from 1900 to the present. Steady improvements in mechanical and electromechanical calculators before World War II had increased the speed of calculation a thousandfold over manual methods from 1900 to 1940. The pace quickened with the appearance of electronic computers during the war, and 1940 to 1980 saw a millionfold increase. Since then the pace has been even quicker. The vertical scale is logarithmic; the major divisions represent thousandfold increases in computer performance. Exponential growth would show as a straight line; the upward curve indicates faster than exponential growth, an accelerating rate of innovation. The reduced spread of the data in the 1990s is probably the result of intensified competition: underperforming machines are more rapidly squeezed out.

< http://www.frc.ri.cmu.edu/users/hpm/CACM/AGV.jpg >
Figure 3: Automatic Guided Vehicles, or AGVs, transport materials in factories and warehouses. Today's models need frequent navigational markers and route-specific programming to guide them, which are provided in an expensive installation process that makes them economical only in stable, high-value locations. Three-dimensional perception, as discussed in this article, will allow such machines to work just as reliably after being installed simply by leading them once through a new route, greatly expanding their potential usefulness.

< http://www.frc.ri.cmu.edu/users/hpm/book98/fig.ch2/Grid.150.jpg >
(preview in < http://www.frc.ri.cmu.edu/users/hpm/book98/fig.ch2/p035.html > )
Figure 4: A two-dimensional grid map of a difficult hallway. A robot with a belt of 24 Polaroid sonar units obtained 624 range measurements along the 8 meter hallway. More than half the ranges were too long or missing because of deflections by the mirrorlike walls. A good reconstruction, in a 64 by 32 cell grid, was nevertheless obtained because the evidence patterns representing sonar ranges had been well adjusted for this kind of environment by an automatic learning process. The traversed hallway runs left-right. The start of an adjoining hallway, running upwards, can be seen on the right.

< http://www.frc.ri.cmu.edu/users/hpm/CACM/Grid3D.jpg >
(preview in < http://www.frc.ri.cmu.edu/users/hpm/book98/fig.ch2/p039.html > )
Figure 5: View of a three-dimensional grid map of an office. Twenty stereoscopic pairs of images, each providing about 2,500 depth measurements, were digested into a 256 by 256 by 64 cell map. An original camera image is shown on the left; the right shows a perspective view of the "probably occupied" cells of the grid map representing a 6 meter square by 2 meter high volume. To aid visualization, the cells in about a dozen box-shaped volumes selected by hand were "spotlighted" with distinctive tints.

< http://www.frc.ri.cmu.edu/users/hpm/CACM/Head.jpg >
Figure 6: Conceptual image of a 3D-perceiving "navigation head" about the size of a basketball for retrofit onto existing vehicles to allow them to be used without specialist installation. It would have multiple trinocular stereoscopic cameras, 1,000 MIPS, generic mapping, recognition and control software, an application-specific program, and a hardware connection to vehicle power, controls and sensors.

< http://www.frc.ri.cmu.edu/users/hpm/CACM/Scrubber.jpg >
Figure 7: A human-guided, self-propelled floor-scrubbing machine that could be made autonomous by integrating a navigation head with floor-cleaning application software. A janitorial supervisor might shepherd several machines down institutional corridors, dropping them off one by one in rooms and other areas to be cleaned.

< http://www.frc.ri.cmu.edu/users/hpm/book97/ch4/V-table.150.jpg >
(preview in < http://www.frc.ri.cmu.edu/users/hpm/book98/fig.ch4/p094.html > )
Figure 8: This conceptual design for an automatic home vacuum cleaner of the near future is intended to function with very little instruction from its owner. It has omnidirectional wheels, stereoscopic eyes on all faces, and 1,000 MIPS of processing programmed to give it a 3D sense of space. The robot is shown vacuuming under furniture, at a docking station regurgitating accumulated dust and recharging with retracted nozzle, moving at an angle to clean room edges and corners, and topping up its batteries at a handy socket, using an optional "field recharging arm."

< http://www.frc.ri.cmu.edu/users/hpm/book98/fig.ch4/UR-E.150.jpg >
(preview in < http://www.frc.ri.cmu.edu/users/hpm/book98/fig.ch4/p105.html > )
Figure 9: An omnidirectional wheelbase allows flexible movement on flat floors. Elevators, ramps, hoists, or special carrier carts would be used to change floors, climb stairs, or traverse very rough ground. The central post, on a swivel mount, is a "bus" that provides mechanical support, power, and control for a changeable suite of manipulators, sensors, and other accessories that rotate and ride up and down its length. One such accessory, an array of miniature cameras, gives the robot 360 degrees of stereoscopic vision, primarily for navigation. Manipulator-mounted cameras provide precise views of work objects. To reach greater heights, the robot itself can extend its post by attaching new segments. Batteries and computers in the base supply power, control, and stability. Major structural members, like the arms, are made of strong, light, composite materials. Lightweight, high-torque electric motors drive the large motions, like the wheels and arms. Even lighter, though less efficient, actuators like shape-memory metals drive the many motions of the fingers. These innovations combine to give the robot roughly the size, weight, strength, and endurance of a human in a spindly structure that resembles the cartoon broomsticks in Disney's "The Sorcerer's Apprentice."


Credits: Figures 1, 2, 4, 5, 8, 9 from reference [3], figure 6 from reference [5], figure 3, Frog Systems, Inc., figure 7, Karcher, Inc.