
 Sensing versus Inferring in Robot Control

 Hans P. Moravec
 Carnegie-Mellon University
 Robotics Institute

 January 1987

\section{Introduction}

	For the last fifteen years I have worked with and around
mobile robots controlled by large programs that take in a certain
amount of data and mull it over for seconds, minutes or even hours
[\cite{Moravec83}, \cite{Moravec85}].  Although their accomplishments
are sometimes impressive, they are brittle - if any of many key steps
do not work as planned, the entire process is likely to fail beyond
recovery.  This is a strange contrast to much more modest machines
with which I worked and played, and that I contemplated,
previously. Their sensors (touch switches and photocells, usually)
were wired to motors through only simple logic, but they nevertheless
managed to extricate themselves from many very difficult and
confusing situations.  One of the most elaborate examples was the
Hopkins beast, built around 1964 at Johns Hopkins University, that
wandered the halls, centering itself by sonar. When its batteries ran
low, the machine used a special photocell array sensor to find
standard wall outlets (black on white walls), and to dock with them,
to "feed" until the batteries were sufficiently recharged.  Also
notable were the artificial turtles of British psychologist W. Grey
Walter in the 1950s, the most elaborate of which could learn to
associate two stimuli in a Pavlovian way, using an array of
capacitors for memory. Artificial animals of this kind were the
clearest hint of a connection between mind and machine until the
advent of computer-based artificial intelligence, with its emphasis on
the rational aspects of human thought, in the late 1950s.  My own
thinking has returned to these examples after being prodded by the
technical success of the Denning Sentry, which manages to navigate
night after night using techniques somewhere between those of the
Hopkins beast and our recent "intelligent" robots, and also by the
work of my friend Rod Brooks at MIT, who is getting very interesting
behavior from small robots controlled by computer simulations of small
nervous systems.  Industrial vision modules, able to identify parts
using methods chosen for speed and simplicity as well as
effectiveness, are also relevant.

	The approaches can be ordered on a spectrum.  At one extreme
are hardwired responses such as limit switches that turn off motors
when they are toggled; at the other are "AI" programs that subject
sensor inputs to millions of computations before they decide on an
effector action.  A given amount of switching logic, or computational
power, may be configured anywhere along this spectrum, from broad and
shallow to narrow and deep.  There are costs and benefits for any
choice.  At the "shallow" end the data bandwidth to the world can be
very high since the logic handles each sensor input only briefly
before going on to the next.  This can be a great strength.
Conversely, at the "deep" end the response can be very flexible, but
as most of the logic is tied up making inferences, the system can
notice the world only sparsely in space and time. The bandwidth is
low, and the latency is high.

	Strategies near the shallow end can be much more responsive to
changing events - with properly chosen connections the action in the
world around them in effect becomes part of the robot's reasoning.  It
is this property that makes many such approaches work so well.  Errors
made at one moment are sensed and corrected the next.  On the other
hand, long term goals require the long memory and large scale models
possible only in a deep controller.


\section{Optimum Controller Depth}

	It is possible and desirable to organize a robot controller so
that independent wide and deep strategies can co-exist.  For some
occurrences such as a collision, an immediate reflex action (Stop!) is
appropriate.  At other times a long stretch of combinatorial thinking,
resulting in a plan, is a good preface to action.  My experiences
suggest, however, that the most effective use of a given amount of
computational power during normal motion is between the extremes of
complexity.  There is a way of {\it impedance matching} the available
processing power to the time constants and complexity of the robot's
environment.  A larger, faster computer will permit better performance
through more complex interpretation of more data, but overly complex
algorithms on a smaller machine reduce performance by restricting the
rate at which raw data can be ingested.  For the time constants that
apply to a robot crawling in a sedate indoor environment, controlled
by a million-instruction-per-second computer, processing about 100
independent numbers from the sensors every few seconds of travel, thus
spending about 5,000 instructions per reading, seems to give the most
solid results.  Figure 1 is a graphical metaphor for this idea.
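
	As a rough check on these figures, write $C$ for the processor
speed in instructions per second, $N$ for the number of readings in a
batch, and $d$ for the instructions spent on each reading (these
symbols are introduced here only for illustration).  Assuming the
machine sustains its nominal rate, the computation for one batch takes
\[
  t \;=\; \frac{N d}{C}
    \;=\; \frac{100 \times 5000}{10^{6}} \;\mbox{s}
    \;=\; 0.5 \;\mbox{s} .
\]
Under the same assumption the highest rate at which readings can be
ingested is $C/d$, so doubling the depth $d$ on a fixed machine halves
the bandwidth, while doubling $C$ allows either twice the depth or
twice the bandwidth.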

	The Denning machines have hosted a number of control programs,
obtaining their inputs from a ring of 24 sonar transducers which are
fired in banks, and can give readings about three times per second,
and also from an optical sensor which reports the azimuth and
elevation of wall-mounted navigational infrared beacons.  The simplest
programs are able to poll the sensors at their maximum rate, and
process all the readings in a uniform manner. The massively redundant
data helps eliminate false readings, but the computer is so busy
massaging raw data that there is no time to model the surroundings;
decisions to slow down or speed up, veer left or right, must be made
by simple tests and arithmetical combinations of sonar distance
averages and beacon sensor angles. In this mode, halving the rate at
which data is collected and processed affects performance only
slightly.  The complexity of the controller is too low for maximum
effectiveness.  We are now getting much richer behavior from a program
nearer the optimum complexity, a mapper that processes about two
hundred sonar readings collected during each five meters of travel
into an occupancy map of its surroundings, and matches this map to
ones built in an initial training run to accurately locate itself.
The map building and the matching each consume about three seconds of
onboard computer time, and the robot is able to navigate indefinitely
solely by extended landmarks. The maps can also be used to locate
corridors, and have other future potential.  Any increase in program
complexity would either require slower robot motion, or else would
result in frequent navigational errors.  Simplifying the processing
would have the same effect, since the maps would be less accurate and
reliable.
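
	The link between program complexity and robot speed can be
made explicit under a simplifying assumption that is not stated above:
that map building and matching run one after the other, and must
finish in the time the robot takes to cover the stretch that supplied
the readings.  With $L$ the distance travelled per map and $t_b$ and
$t_m$ the building and matching times (notation introduced here for
illustration only), the speed $v$ is bounded by
\[
  v \;\le\; \frac{L}{t_b + t_m}
    \;\approx\; \frac{5 \;\mbox{m}}{3 \;\mbox{s} + 3 \;\mbox{s}}
    \;\approx\; 0.8 \;\mbox{m/s} .
\]
On this reading, a program that took twice as long per map would halve
the bound, which is the sense in which added complexity forces slower
motion.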

	Another program I feel is close to the optimum for a 1 MIPS
machine controlling a slow indoor robot is the scanline stereo program
of Serey and Matthies [\cite{Serey86}].  By restricting the visual
field of two cameras to a single scanline, this program successfully
navigates by subjecting about 500 numbers to a dynamic programming
stereo correspondence method that gives a distance profile of objects
penetrating a camera height horizontal plane around the robot.
Properly optimized, this program could produce its results with a few
seconds of computation per image pair.  The road following methods
reported by Wallace et al.  [\cite{Wallace84}, \cite{Wallace85}]
similarly qualify.

	Programs too high in complexity for efficient robot operation
on a 1 MIPS machine include the full three dimensional stereo
navigation program of my "Stanford Cart" thesis, and to a lesser
extent its somewhat simplified descendants [\cite{Thorpe84}] at CMU,
and especially the programs controlling the SRI "Shakey" robot in the
early 1970s [\cite{Nillson84}].  The symptom of overcomplexity is
brittleness - the program is unable to take in sufficient data to
verify the correctness of its observations of the world, and often
makes and executes elaborate plans on the basis of mistaken
assumptions, leading to spectacular failures.  The Cart programs fail
to correctly cross a room about half the time, and I believe Shakey
never completed a typical three or four step plan without human
intervention.  Compare this to the Denning robot, which manages to
patrol several thousand feet of office corridors for 30 nights at a
time fully automatically, returning to its recharging hutch every
morning.  The impedance match of an overly complex
program can be improved a little by effectively increasing the time
constant of the world - that is, by slowing everything down. The Cart took
five hours to cross a room, and Shakey took longer to complete a
single task - each vision step alone involved an hour of processing.

	It could be argued that stretching in the direction of
complexity is necessary so that the techniques will be ready for the
more powerful computers to come.  I think this approach is probably
not the best one.  Working with an overly complex controller in less
than real time can dramatically limit the number of experiments that
can be done.  Yet experiments are the primary means by which effective
methods can be distinguished from ineffective ones.  Fundamentally,
Shakey and the Cart can be considered spectacular, isolated stunts
whose success and usefulness cannot be judged until their methods are
experimentally and objectively pitted against a variety of other ways
of accomplishing the same ends.  Such experiments are still too
expensive to do today.

	There is an alternative approach to developing the robot
controllers of the future.  The size of the tetrahedron in Figure 1
represents the amount of computer power available in a robot.  The
tetrahedron is growing in size as the cost of computation decreases
over the years, taking the "best performance" plateau up with it.
Instead of struggling heroically to sit uncomfortably and
ineffectively on the "Intelligent Systems" ledge at the top of the
tetrahedron, from now on I propose to ride the "best performance"
plateau as effectively as possible. I believe that by doing so my
experimental discoveries, modest though they may be individually, will
accumulate in a steadily improving system over the years, with both
complexity and bandwidth increasing gradually as the processors
available improve.

\end{document}
