Three Dimensional Images from Cheap Sonar
Hans P. Moravec
The Robotics Institute
Carnegie-Mellon University
Pittsburgh, PA 15213
December 21, 1985
@Section(Introduction)
We propose to build and use moderate resolution three
dimensional space occupancy maps built from multiple measurements from
cheap sonar sensors. By cheap sonar we mean range readings obtained
from unmodified Polaroid sonar transducers driven by the original
Polaroid circuit board, or by an improved board (allowing closer
minimum ranges) from Texas Instruments. This is a simple, but highly
developed and reliable, not to mention inexpensive, system that
returns the distance to the nearest reflector in a certain wide cone
of sensitivity. Though much more information can be obtained, in
principle, from single sound bursts by modifying the aperture, phase
relationships, frequencies and processing, such an approach ignores
the present very good solution.
@Section(Past Work)
In earlier work [Moravec&Elfes 1985] we described the use of
multiple wide-angle sonar range measurements to map the surroundings
of an autonomous mobile robot. A sonar range reading provides
information concerning @r(empty) and @r(occupied) volumes in a cone
(subtending 30 degrees in our case) in front of the sensor. The
reading is modelled as probability profiles projected onto a
rasterized map, where somewhere occupied and everywhere empty areas
are represented. Range measurements from multiple points of view
(taken from multiple sensors on the robot, and from the same sensors
after robot moves) are systematically integrated in the map.
Overlapping empty volumes reinforce each other, and serve to condense
the range of occupied volumes. The map definition improves as more
readings are added. The final map shows regions probably occupied,
probably unoccupied, and unknown areas. The method deals effectively
with clutter, and can be used for motion planning and for extended
landmark recognition. This system was tested on our @i(Neptune)
mobile robot, and recently outdoors on the @i(Terregator) robot.
@Section(Experimental Approach)
Processing a single reading from a standard unit is
computationally cheap; only one number is generated, limiting the
computations necessary or possible. The range accuracy of a typical
reading is better than a centimeter, but because of the wide angle of
the pulse, the lateral position of the reflection is uncertain to on
the order of a meter. By exercising multiple units repeatedly,
readings from multiple viewpoints may be combined to deduce the
location of the reflecting surfaces more precisely. The combining
process is a kind of deconvolution - each point in the final high
resolution map is a consequence of many of the individual readings
combined in a particular, unique way and each reading participates in
many map points.
Our existing approach uses the idea that the interior of each
sonar reading cone (bounded by the sensitivity profile laterally, and
by the range surface lengthwise) is known to be empty, and that the
reflecting point is somewhere on the range surface in this cone. The
empty interiors of other readings overlapping this range surface
reduce the region of uncertainty of the location of the echoing point
in a probabilistic way, while intersecting range surfaces reinforce
each other at the intersections. The deconvolution is essentially
non-linear.
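A minimal 2D sketch of this update rule, assuming a square raster, a 30 degree cone and a fixed evidence increment (the probability profiles, cell sizes and sensor names here are our illustrative choices, not the actual program's):

```python
import math

HALF_ANGLE = math.radians(15)   # half of the 30-degree sensitivity cone
CELL = 0.3                      # assumed cell size in meters

def apply_reading(grid, size, sx, sy, heading, rng, tol=0.3):
    """Add the evidence from one sonar reading to a 2D evidence grid.

    Cells inside the cone and well short of the range reading become
    more 'empty' (negative); cells inside the cone near the range
    surface become more 'occupied' (positive).  A simplified sketch of
    the probability-profile projection, not the original code.
    """
    for i in range(size):
        for j in range(size):
            cx, cy = (i + 0.5) * CELL, (j + 0.5) * CELL
            dx, dy = cx - sx, cy - sy
            d = math.hypot(dx, dy)
            if d == 0.0:
                continue
            # angular offset of the cell from the beam axis
            off = abs(math.atan2(dy, dx) - heading)
            off = min(off, 2 * math.pi - off)
            if off > HALF_ANGLE:
                continue
            if d < rng - tol:
                grid[i][j] -= 1.0      # cone interior: probably empty
            elif abs(d - rng) <= tol:
                grid[i][j] += 1.0      # range surface: somewhere occupied
    return grid

# usage: one sensor at the origin looking along +x, reading 3.0 meters
N = 32
g = [[0.0] * N for _ in range(N)]
apply_reading(g, N, 0.0, 0.0, 0.0, 3.0)
```

Summing such updates from many viewpoints is what sharpens the map: the empty interiors of later cones eat away at the range surfaces of earlier ones.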
The old programs work in two dimensions, collapsing the
measurement cones vertically into flat pie wedges that are combined in
a two dimensional map array that ultimately holds numbers giving the
confidence that a given cell is empty or occupied. We have
experimentally noted that maps with a range of 10 meters and a
resolution of 15 to 30 cm can be reliably constructed with data from a
ring of 24 robot-mounted transducers looking out horizontally at 15
degree intervals and pulsed at six locations a few meters apart in the
robot's travels (144 independent measurements). The sharpness of the
map can be seen to improve as more readings are added. Many readings
are combined to form one map probability point, and this process makes
our method quite tolerant to the occasional range errors encountered
in the sonar data.
A highly optimized version of the program, using fixed point
arithmetic, can process 144 points in roughly 1 second on a big Vax, 2
seconds on a Sun2 and 4 seconds on a Macintosh, building a 32x32 map
of eight bit probabilities. A companion program correlates two such
maps, using a coarse to fine hierarchy of reductions and a dual
representation (raster and list of occupied cells) to search over X, Y
shift and angle, in similar times. Another program devises good robot
paths through the probability maps.
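The shift search in the companion program can be sketched as follows, simplified to X and Y only (no angle search, a single reduction level, raster representation only; the function names and details are our assumptions, not the actual code):

```python
def reduce_map(m):
    """Halve resolution, taking the max over 2x2 blocks to keep occupied peaks."""
    n = len(m) // 2
    return [[max(m[2*i][2*j], m[2*i][2*j+1], m[2*i+1][2*j], m[2*i+1][2*j+1])
             for j in range(n)] for i in range(n)]

def score(a, b, dx, dy):
    """Sum of products of overlapping cells of map b, shifted (dx, dy), with map a."""
    n = len(a)
    s = 0.0
    for i in range(n):
        for j in range(n):
            ii, jj = i + dx, j + dy
            if 0 <= ii < n and 0 <= jj < n:
                s += a[i][j] * b[ii][jj]
    return s

def best_shift(a, b, search=4):
    """Exhaustive search at the coarse level, refined at the fine level."""
    ca, cb = reduce_map(a), reduce_map(b)
    _, dx, dy = max((score(ca, cb, dx, dy), dx, dy)
                    for dx in range(-search, search + 1)
                    for dy in range(-search, search + 1))
    # refine the doubled coarse shift by one cell at full resolution
    fine = max((score(a, b, 2*dx + ex, 2*dy + ey), 2*dx + ex, 2*dy + ey)
               for ex in (-1, 0, 1) for ey in (-1, 0, 1))
    return fine[1], fine[2]

# usage: the same single occupied feature, displaced by (2, 1)
a = [[0.0] * 8 for _ in range(8)]
b = [[0.0] * 8 for _ in range(8)]
a[4][4] = 1.0
b[6][5] = 1.0
shift = best_shift(a, b)
```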
@subsection(3D mapping)
Our approach generalizes very naturally to three dimensions -
in fact the collapse of cones to wedges in the 2D program is its
greatest single approximation, and its greatest waster of information.
The sensors must be configured differently, however. The only
height information in the present planar ring comes from the vertical
divergence of the cones of sensitivity, whose symmetry makes it
impossible in principle to distinguish reflections from above the ring
plane from those an equal distance below the plane. Even without this
ambiguity, the present arrangement could provide very little vertical
resolution.
An arrangement of sensors on the surface of a partial sphere
would be much better. The 15 degree spacing of the 24 sensors on the
planar ring was chosen to give some overlap of fields of view. It was
discovered that this spacing allowed multiple sensors to be fired
simultaneously without serious interference, in three, or even two,
interleaved banks, greatly speeding data gathering. Using the same
idea and spacing to fill a sphere instead of a circle leads to the
following calculation.
A sphere represents 4@g(p) of solid angle. Spacing the
sensors 15 degrees from each other assigns a cone with 15 degree apex
to each sensor. A cone with apex angle T subtends 2@g(p)(1-cos(T/2))
solid angle, and we can (glossing over packing problems) arrange about
4@g(p)/(2@g(p)(1-cos(T/2))) = 2/(1-cos(T/2)) of them into a sphere.
With T=15 degrees 233 transducers fill a sphere. If we content
ourselves with a 90 degree wedge (almost a fisheye if you note that
the beams fan out an additional 15 degrees on all edges, for a net
coverage of 120 degrees) then this gets reduced to a more manageable
34 transducers.
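The calculation can be checked directly; the truncated counts below match the 233 and 34 in the text (the function names are ours):

```python
import math

def cone_solid_angle(apex_deg):
    """Solid angle, in steradians, of a cone with the given apex angle."""
    return 2 * math.pi * (1 - math.cos(math.radians(apex_deg) / 2))

def sensors_in_sphere(spacing_deg):
    # full sphere (4 pi steradians) divided by one sensor's allotted cone
    return (4 * math.pi) / cone_solid_angle(spacing_deg)

def sensors_in_cap(spacing_deg, cap_deg):
    # a cap subtending cap_deg, divided the same way
    return cone_solid_angle(cap_deg) / cone_solid_angle(spacing_deg)

full = sensors_in_sphere(15)      # about 233 transducers
wedge = sensors_in_cap(15, 90)    # about 34 transducers
```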
If actually packed onto a spherical cap, the sensor group
would greatly resemble a compound insect eye, each facet being a
Polaroid transducer. The insect would be a monster. The transducers
are somewhat less than 5cm in diameter, which would demand a sphere
radius of about 40cm. A 90 degree cap from this sphere would be a
shallow bowl 56cm in diameter and 12cm deep.
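The bowl dimensions follow from the chord and sagitta of a 90 degree cap on a 40cm sphere; a quick check, with the radius taken from the text:

```python
import math

R = 0.40                         # sphere radius in meters
half = math.radians(45)          # a 90-degree cap spans 45 degrees each side

bowl_diameter = 2 * R * math.sin(half)   # chord: about 0.57 m
bowl_depth = R * (1 - math.cos(half))    # sagitta: about 0.12 m
```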
One such sensor array on the nose of a vehicle, tilted down
somewhat, should be adequate for many tasks, but imagine getting
better side coverage, say for obstacle avoidance, by placing two, one
on each side of the head, enhancing the giant insect effect.
@subsection(How Many Readings, How Much Computation?)
The 3D map we hope to derive from this array has more cells
than the 2D maps we have worked with, and will require more data. How
much?
Suppose we build our maps to a range of about 10 meters in the
vehicle forward direction, 5 meters laterally and 3 meters in the
vertical direction, and to a resolution of 30cm in each
direction. There will be 33x17x10 cells, each holding a number, in the
final map. This is 5,610 numbers. A naive degrees of freedom analysis
suggests that a similar number of readings each returning one number
are necessary to determine this many variables. Fortunately our 2D
experience suggests that far fewer will suffice.
We have noted experimentally that 144 readings nicely spaced
around our cluttered laboratory is just enough to give us good 32 cell
by 32 cell maps covering a square area 10 meters on a side. There are
1024 points in such maps, so we seem to be accomplishing the
impossible, extracting 1024 unknowns from 144 equations. Actually, the
1024 numbers are not very informative as their magnitude represents
our certainty (or uncertainty) about particular cells being occupied,
not something intrinsic about the scenery. Most of the cells in the
final map are labelled an unsurprising "unknown" (represented by 0)
or "probably empty" (represented by a negative number). The real
information is concentrated in the locations of the reflecting
boundary seen by the robot, i.e. the minority of cells labelled
"probably occupied". To first approximation this boundary is a one
dimensional contour embedded in the 2D map. Its length in cells is on
the order of the boundary length of the map, 4x32. The information is
not in the contents of these cells (positive probability numbers), but
in their location. Each cell represents about one number - think of
the boundary expressed in polar co-ordinates - the information is in
the radius at each angle, the angle itself is just the independent
variable. So we have 144 equations to determine about 4x32 = 128
variables - just about right! Mathematics is great.
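The bookkeeping in this argument, restated under the counts given above:

```python
map_cells = 32 * 32     # raster cells: the naive unknown count
boundary = 4 * 32       # cells on the occupied contour, roughly the border length
readings = 144          # 24 transducers fired at 6 robot positions

# the effective unknowns are the boundary cells, one number each
# (the radius at each angle), so the readings slightly overdetermine them
surplus = readings - boundary
```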
In 3D the contour becomes a surface. In our example of two
paragraphs ago the map size was 33x17x10 cells. The surface of this
volume has about 2,100 cells, and thus requires about 2,100 readings
by the above analysis, or 62 full scans of the 34 transducers in the
90 degree eye. The sensors can be pulsed about twice per second. With
two-way interleaving, a full eye poll takes a second. The 62 scans
would thus take about a minute. Computation times on a big Vax,
extrapolating from the fast 2D program, would also be at about 30
seconds to a minute. It is assumed that the robot travels about ten
meters during this minute (a speed of 0.6 km/hr) to give each reading
set a fresh vantage point, and that adequate dead reckoning is
provided to correctly relate the 62 sets geometrically. Of course,
lower resolution maps, or simple obstacle detection, can be
accomplished faster, in as little as one (half second) pulse gathering
period.
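The reading and time budget above can be restated in code (map extents, sensor count and pulse rate as given; the surface count uses the exact box dimensions, hence 2,122 rather than the rounded 2,100):

```python
# assumed map: 10 m x 5 m x 3 m at 30 cm resolution
nx, ny, nz = 33, 17, 10
cells = nx * ny * nz                          # 5,610 unknowns, naively

# informative cells lie roughly on a 2D reflecting surface,
# approximated here by the surface of the map volume itself
surface = 2 * (nx * ny + nx * nz + ny * nz)   # about 2,100 cells

sensors = 34                                  # the 90-degree eye
scans = round(surface / sensors)              # about 62 full scans
seconds_per_scan = 1.0                        # 2 Hz pulses, two-way interleaving
total_seconds = scans * seconds_per_scan      # about a minute
```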
These numbers suggest that high speed travel is best left to
longer range sensors, and perhaps simpler techniques. The sonar
mapping could be very useful for slow, close-in, tight maneuvering in
complicated environments and on very rough ground. The very general
path passability grid route planners demonstrated by the group extend
in a natural way to the dense 3D data this approach will provide.
@Section(Research Plan)
All our sonar experiments so far have been conducted with
early prototype sonar rings provided by our sometime collaborator,
Denning Mobile Robotics, Inc. of Woburn, Massachusetts. Because of a
rather old fashioned (small buffer) serial interface on our Vax
computers, the processors on these rings can't reliably communicate
with the Vaxes in the present configuration, and this has been a
serious hindrance to sonar experimentation. We will begin the work
by building new interfaces for the transducers using Texas Instruments
driver boards funneling into an MC68000 microprocessor. Denning has
agreed to help in this effort - they have been using a TI board based
design successfully for six months.
A second stage is design and construction of the physical
array. This will require a mathematical optimization and an
evaluation by simulations of the individual sensor placements.
The bulk, and point, of the work will be an extended series of
experiments with 3D map building and navigation programs. One small
but interesting subproblem in the early stages is 3D raster fill of
conically bounded sphere surfaces and volumes. A more significant
problem is the handling of position uncertainty in the measurements
made during an extended run. Our probability raster permits a very
direct representation for uncertainty - it can simply be added to the
probability distribution, increasing the spread of each reading in the
appropriate directions.
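One way to realize this spreading on the raster itself, shown in one dimension for brevity (the real maps are 3D, and the blur would be directed and scaled by the motion error; the kernel here is an arbitrary choice):

```python
def spread(grid, passes=1):
    """Diffuse a 1D strip of occupancy evidence to model growing
    position uncertainty: each pass mixes every interior cell with its
    neighbours, widening the distribution of each old reading while
    conserving the total evidence.  A sketch only.
    """
    g = list(grid)
    for _ in range(passes):
        g = [0.25 * g[i - 1] + 0.5 * g[i] + 0.25 * g[i + 1]
             if 0 < i < len(g) - 1 else g[i]
             for i in range(len(g))]
    return g

# usage: a sharp old reading smears into its neighbours after one move
peak = [0.0, 0.0, 1.0, 0.0, 0.0]
once = spread(peak, 1)
```

Repeated passes applied to ever-older readings produce exactly the desired behavior: a map sharp near the vehicle that fades to fuzziness with accumulated error in the distance.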
We'd like to try an approach that projects the incremental
uncertainty of each move onto old measurements rather than new ones.
The result would be a map that is always detailed for the local area
around the vehicle, and fades to fuzziness under the cumulative effect
of errors in the distance. Very old readings that provide almost no
information because of uncertainty in their location could eventually
be eliminated from the mapmaking.
The three dimensional nature of the images will permit some
work in identification of large objects. Recognition of small objects
is ruled out by the coarseness (about 10cm) of the anticipated maps.