## Learning Sensor Models for Evidence Grids

### Hans Moravec

Mike Blackwell

Carnegie Mellon University

Robotics Institute

September 1991

**Introduction**

Evidence grids (which we have previously called occupancy grids,
probability grids and certainty grids: the newer term *evidence
grids* better captures the intent and the implementation of the
method) are a probabilistic, finite-element representation of robot
spatial knowledge. The grids allow the efficient accumulation of
small amounts of information from individual sensor readings into
increasingly accurate and confident maps of a robot's surroundings.
Our first experiments using the method to interpret measurements from
a ring of 24 Polaroid sonar transducers carried on board an
autonomously navigating mobile robot were surprisingly successful,
compared with our previous experiences with stereo-vision based
programs that mapped points on objects as error distributions in
space. These older programs enabled a robot to map and traverse
cluttered 30 meter obstacle courses, succeeding about three times in
four attempts [Moravec80]. By contrast the grid method accomplished a
similar task with a vanishingly small failure rate [Moravec85,Elfes89].
We then applied the grid approach to stereo-vision-derived range data
from a TV equipped robot, also with good success [Serey86]. A
subsequent experiment integrated sonar and vision data, generating
maps with correct features not found in those from either sensor alone
[Matthies87].

These encouraging early results were obtained using *ad-hoc*
statistical models and methods. We then developed a Bayesian
statistical foundation for grid updates. A key result of this
derivation was a combining formula (replacing several formulas in the
old formulation) for integrating two independently derived maps of the
same area, or for adding a new reading to a developing map. When used
to add a new reading to an evidence grid, the combining formula
requires that the reading itself be represented as such a grid,
containing only the independent information provided by that
particular reading. A sensor (for instance a sonar sensor) may return
one of a set of possible values (one for each range) when a reading is
taken. Each different value requires its own grid. The set of grids
for all possible values returned during a reading we call a *sensor
model*. In the early formulation we constructed a sensor
model from an *ad-hoc* interpretation of the technical
specifications of our sonar sensors. This worked well in our first
experiments, where the data, mostly sonar in a cluttered lab, was very
benign, meaning that the ranges reported by the sensors usually
corresponded to the location of objects in the field of view. The
only uncertainties were in the lateral position of the object and in
its range, both of which we were able to model well enough by an
*ad-hoc* function. In fact, the maps built during robot runs
were very insensitive to small adjustments in our sensor
model.

The situation is very different if the sources of uncertainty are
more complicated. In data from a smooth-walled narrow corridor, the
majority of the sonar rangings suffer specular reflection, returning
ranges that are much farther than the nearest objects because they
come via glancing bounces from walls that the sonar sees as perfect
mirrors. In principle the evidence grid method should be able to
slowly accumulate partial information (evidence) from such data as
long as there exists any correlation between the readings we get and
what's out there--but only if the sensor model accurately represents
the information contained in a reading. Our *ad-hoc* models of
sonar sensors turned out to be totally ineffective in highly specular
environments. Our initial cluttered laboratory environment, with its
many reflective surface patches in different orientations, had been
unusually benign, from the sonar sensing point of view.

We were not up to the task of constructing a precision sensor
model by hand, so we looked for a way to determine (learn) it
automatically from accumulated robot sense data. We experimented with
two learning approaches. In an earlier approach the evidence maps for
individual ranges are estimated directly by observing the frequencies
of occupancy from a large sample of readings made in known positions
in known surroundings. This has problems--it needs a huge amount of
data, is sensitive to quirks in the training sample, gives a
statistically noisy and unnecessarily blurry result, and does not in
any way compensate for the fact that individual readings don't give
entirely independent information (as we dangerously assume in the
basic formulation). The new, more successful, approach shapes the
individual evidence maps with a parameterized closed-form formula.
The parameters of this formula are adjusted, in a hill-climbing
process, so that the map built by the function with data from a robot
test run is as good as possible. Using this approach with a
9-parameter function a program using several weeks of Sparc1+
workstation search time (and several days more on a MASPAR parallel
computer) was able to produce a crisp, correct map of the difficult
4 foot specular hallway outside my office, from data that produces an
unrecognizable splatter when used with a naive sensor model that works
fine in benign circumstances.

**Learning Sequence**

The following illustrations show maps produced from a data set of
648 sonar measurements collected with a Denning sensor ring mounted on
our Uranus robot. A set of 24 readings was taken at each foot of
travel down the center of a narrow (4 foot wide) 28 foot long leg of
an L shaped corridor. The readings were processed into maps by means
of a sensor model generated by a formula (revealed at the end of this
paper) controlled by nine parameters. The correctness of the map for
a particular setting of the parameters was given by a *Score*
expression also defined at the end of the paper that allows the result
to be interpreted as the number of correct bits in the reconstruction.
A cell with a probability of one that corresponds to a one in the
ideal map contributes 1 bit to the sum, as does a zero cell with a
zero ideal. A cell with a probability value of one
half, indicating total ignorance, contributes 0 to the sum, regardless
of the ideal. (A cell whose value is very near zero or one but whose
ideal is the opposite can contribute an arbitrarily large negative
number to the sum. This is the log probability way of saying there is
then no chance that this map could be a representation of the ideal.
In our program sensor model probabilities are represented by an eight
bit quantity which is the integer part of log_b(p/(1-p)), with b = 2^(1/4).
This scaling allows a probability range of approximately 10^-9 to 1-10^-9
and a consequent worst case per cell goodness figure of about
-32.)
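The eight-bit representation just described can be sketched as follows (a minimal illustration, not the original program's code; `encode` and `decode` are hypothetical names):

```python
import math

B = 2 ** 0.25  # the log base b = 2^(1/4) used for the 8-bit encoding

def encode(p):
    """Integer part of log_b(p/(1-p)): the stored weight of evidence."""
    return int(math.log(p / (1.0 - p), B))

def decode(w):
    """Probability represented by the integer weight w."""
    odds = B ** w
    return odds / (1.0 + odds)

# A signed 8-bit weight of +/-127 corresponds to odds of 2^(+/-31.75),
# so representable probabilities reach within about 10^-9 of 0 and 1.
```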

The first map in the set is the ideal. White represents an
occupancy probability of 0, black represents 1 and the 50%
checkerboard pattern means 0.5, i.e. don't care. A perfect
reconstruction would have a goodness figure of 578 correct
bits--found by matching the ideal with itself. The robot traversed
the horizontal part of the L shape. The five indentations in that
hallway are slightly recessed closed metal doors. The rest of the
hallway is smooth painted wallboard, a nearly perfect sonar mirror.

The remaining maps are reconstructions with increasingly good
scores, excerpted from thousands encountered in our program's search.
The annotation in curly braces at the bottom of each map contains the
nine parameters of the sensor model that produced the indicated score.
The last two numbers, in the starred parentheses, are the times, in
seconds, required to construct the sensor model (stored as a large
numerical table) and to construct the map given the model. The first
(naive) maps have very negative scores, implying that the
reconstruction contains many high values that should be zero, or low
values that should be one. These models are overconfident and wrong,
and the resulting maps are in a sense worse than nothing, since a
totally noncommittal map, with probability one half everywhere, would
have a score of zero. Some of the intermediate maps are
revealing--the smooth walls are mostly unseen, but reflective images
of the door frames appear beyond these walls. Two final
(hallway-trained) maps, each found by a different search, one with a
score of about 419 bits, the other with 424 bits, are easily good
enough for most navigational purposes.

Following the maps are Mathematica presentations of the
sensor-model function, for sonar range measurements of 2, 3, 5 and 10
feet, for the initial naive map parameters and for two final,
hallway-trained parameter sets. The naive model gives a terrible
result in the hallway, where about seven sonar readings out of eight
are at glancing angles that result in specular mirages, but is quite
adequate for data as benign as that obtained in our original cluttered
laboratory runs.

Following the Mathematica displays are examples of reconstructions
from a 328 reading simulated robot run very similar to our original
(physical) lab runs--the ideal map, a map reconstructed from a model
produced by the old learning method (trained from 20,000 readings in
the same simulated environment), the map produced with the naive
nine-parameter model, and the map produced by the hallway-trained
model. The naive model produces a quite good map, as expected for the
model-insensitive data gathered in non-specular surroundings. The
hallway-trained model, on the other hand, produces much more
conservative maps of the simulated laboratory. Locations far from the
robot path remain shrouded in uncertainty--i.e. they remain close to
one half in probability value. This has some benefits, as there are
fewer artifacts (areas that are incorrectly labelled as probably
occupied or empty) than in the more self-assured naive
map.

# <<PICTURES GO HERE>>

### Mathematical Tidbits

An evidence grid is a regular finite element model of space. Each
cell of the grid contains a likelihood estimate (our evidence) of a
given property of the corresponding region of space. Here we are
concerned simply with the occupancy of the region, represented by
**o**x. *p*(**o**x) is the likelihood (or probability) that region x
is occupied. When a discussion concerns only one particular x, we
will drop the subscript and refer simply to *p*(**o**).

Let *p*(**A**|**B**) represent our best estimate of the
likelihood of situation **A** if we have received information
**B**. By definition *p*(**A**|**B**) =
*p*(**A** ∩ **B**)/*p*(**B**). Plain
*p*(**A**) represents our estimate of **A** given no
special information. The alternative to situation **A** will be
referred to as ~**A** (not **A**). For any **B**,
*p*(**A**|**B**) + *p*(~**A**|**B**) = 1.

For the two occupancy cases of a cell **o** (the cell is
occupied) and ~**o** (the cell is empty), with prior likelihoods
*p*(**o**) and *p*(~**o**) and new information
**M**, Bayes theorem can be expressed:

*p*(**o**|**M**)/*p*(~**o**|**M**) =
*p*(**M**|**o**)/*p*(**M**|~**o**) *
*p*(**o**)/*p*(~**o**)

This formula combines independent sources of information about
**o** and **M** into an estimate of a single quantity
*p*(**o**|**M**). The new information **M** enters in
terms of the probability of **M** in the situations where a
particular cell is or is not occupied, *p*(**M**|**o**) and
*p*(**M**|~**o**). This inversion of **o** and
**M** is a key feature of Bayes' theorem.
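As a quick numeric check of this odds form (the probability values below are illustrative, not from the paper):

```python
# Illustrative numbers: prior p(o) = 0.2, and a measurement M that is
# seven times likelier when the cell is occupied than when it is empty.
p_o = 0.2
p_M_given_o = 0.7
p_M_given_not_o = 0.1

# Odds form of Bayes' theorem: posterior odds = likelihood ratio * prior odds
posterior_odds = (p_M_given_o / p_M_given_not_o) * (p_o / (1 - p_o))

# The same posterior computed directly from the definition of conditioning
p_M = p_M_given_o * p_o + p_M_given_not_o * (1 - p_o)
p_o_given_M = p_M_given_o * p_o / p_M
```

Both routes give the same posterior odds of 1.75, i.e. p(o|M) ≈ 0.64.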

But consider the problem of generating a map from information
**M**1 and **M**2 when
each has already been individually processed into a map, i.e. find
*p*(**o**|**M**1 ∩ **M**2) given *p*(**o**|**M**1) and
*p*(**o**|**M**2). This requires finding *p*(**A** ∩ **B**)
knowing *p*(**A**) and *p*(**B**), which is not possible in
general. But if we assume that **A** and **B** represent
independent information, *p*(**A** ∩ **B**) = *p*(**A**) *
*p*(**B**). Algebra then gives us:

*p*(**o**|**M**1 ∩ **M**2)/*p*(~**o**|**M**1 ∩ **M**2) =

*p*(**o**|**M**1)/*p*(~**o**|**M**1) *
*p*(**o**|**M**2)/*p*(~**o**|**M**2)

This *combining formula* can be used to build maps from
sensor readings if *p*(**o**|**M**1)
is interpreted as the developing map, and
*p*(**o**|**M**2) represents a
reading. The approach is especially efficient if we use the logarithm
of the probability ratio (itself known as the *odds*). In *log
odds*, the combining formula is a simple addition:

*log* [*p*(**o**|**M**1 ∩ **M**2)/*p*(~**o**|**M**1 ∩ **M**2)] =

*log* [*p*(**o**|**M**1)/*p*(~**o**|**M**1)]
+ *log* [*p*(**o**|**M**2)/*p*(~**o**|**M**2)]

and the log odds form itself can be considered *weight of
evidence*. Properly scaled, eight bit integers seem adequate for
storing the sensor model's weight-of-evidence values.
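In code, the combining step is then just elementwise addition of weight-of-evidence grids (a sketch with hypothetical names, using base-2 log odds rather than the scaled 8-bit integers):

```python
import numpy as np

def combine(map_weights, reading_weights):
    """The combining formula in log-odds (weight of evidence) form:
    integrating an independent reading is elementwise addition."""
    return map_weights + reading_weights

def to_probability(weights):
    """Convert base-2 log odds back to occupancy probabilities."""
    odds = 2.0 ** weights
    return odds / (1.0 + odds)

# Start noncommittal (odds 1:1 everywhere, p = 0.5), then integrate two
# independent readings that each give 2:1 odds of occupancy per cell.
grid = np.zeros((4, 4))
reading = np.full((4, 4), np.log2(2.0))
grid = combine(combine(grid, reading), reading)
# Each cell now holds odds 4:1, i.e. p = 0.8
```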

The values of *p*(**o**|**M**2)
can be learned by accumulating actual occupancy statistics in known
environments for each spatial location around a sensor for each
possible reading. The resulting averages will contain information
about the occupancy density and other more complicated peculiarities
of the training spaces as well as of the sensor response. The
background density effect can be removed by identifying the training
space density with *p*(**o**|**M**c)
and dividing it out in the formula, but the other artifacts remain. This
learning approach also requires very large numbers of readings, so as
to give enough instances of each possible return from the sensor. Even
so, the result is statistically noisy and blurry, and does not in any
way compensate for the fact that individual readings don't give
entirely independent information (as we dangerously assume in the
basic formulation).

The new, more successful, approach described in this paper shapes
the individual evidence maps with a parameterized closed-form formula.
The parameters of this formula are adjusted, in a hill-climbing
process, so that the map built by the function with data from a robot
test run is as good as possible.
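The adjustment loop can be sketched as a generic hill climber over the parameter vector (a toy illustration, not the program used in the experiments; `toy_score` stands in for the map-quality measure and all names are hypothetical):

```python
import random

def hill_climb(score, params, step=0.1, iterations=5000, seed=0):
    """Greedy hill climbing: nudge one parameter at a time and keep
    the change whenever it improves the score."""
    rng = random.Random(seed)
    params = list(params)
    best = score(params)
    for _ in range(iterations):
        candidate = list(params)
        candidate[rng.randrange(len(params))] += rng.choice([-step, step])
        trial = score(candidate)
        if trial > best:                 # keep only strict improvements
            params, best = candidate, trial
    return params, best

# Toy stand-in for the map Score: a smooth function peaking at (1, 2, 3)
target = [1.0, 2.0, 3.0]
toy_score = lambda p: -sum((a - b) ** 2 for a, b in zip(p, target))
found, best = hill_climb(toy_score, [0.0, 0.0, 0.0])
```

The real search is far more expensive only because each evaluation of the score rebuilds a sensor model and a map from the full data set.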

Using learned sensor models to process sensor readings from a
roving robot is potentially very fast because the learned
*p*(**o**|**M**2) values are independent of the current
map, and can be stored in tables. An update can thus be done with as
little as one arithmetic operation per cell (an addition in the weight
of evidence form) in a sensor's sensitive volume. In practice there
are three additions and a bounds test in our inner loop.

**Match, Score and Entropy**

To provide a merit value for the parameter adjustment learning, we
compare a reconstructed map with a (usually hand made) ideal one in
the following way. The probability that two maps represent the same
situation is equal to the product of the probabilities that each
corresponding pair of cells represents the same value. Now, each
number in our maps represents the probability that the cell is
occupied. So the probability that two cells **A**i and **B**i are both occupied
is just **A**i * **B**i. The probability that they are both empty is
~**A**i * ~**B**i. The
probability that they are the same is the sum of these cases:
**A**i * **B**i +
~**A**i * ~**B**i. The
probability that two maps represent the same thing is then
Product i=1..n (**A**i *
**B**i + ~**A**i *
~**B**i) over all the cells of the
maps. Generally this will be a ridiculously tiny number. To get a
number of reasonable magnitude, and incidentally an information
measure in bits, we take *log*2 of this
probability. By adding 1 to each term, we see that two cells that are
maximally confident (either exactly 1 or 0) and identical score as 1
(correct bit), while a cell in an unknown (0.5) state, scores 0 when
compared to any value. We call this result the **Match**
between the maps.

**Match** = *n* + *log*2 Product i=1..n (**A**i * **B**i + ~**A**i * ~**B**i)

= Sum i=1..n (1 + *log*2(**A**i * **B**i + ~**A**i * ~**B**i))
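A direct transcription of **Match** (a sketch, with maps as flat lists of cell probabilities):

```python
import math

def match(a, b):
    """Match in bits between two occupancy-probability maps a and b:
    sum over cells of 1 + log2(A*B + (1-A)*(1-B))."""
    return sum(1 + math.log2(ai * bi + (1 - ai) * (1 - bi))
               for ai, bi in zip(a, b))

ideal = [1.0, 0.0, 1.0, 0.0]
# Matching the ideal against itself yields one correct bit per cell,
# while an all-unknown (0.5) map scores zero against anything.
```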

(**Note added in 1997:** Cross Entropy is
probably a superior measure of match:

**Cross Entropy** = Sum i=1..n (1 + **A**i * *log*2(**B**i) + ~**A**i * *log*2(~**B**i))

**A** would be the map with more accumulated evidence, **B** the
one with less.)

**Match** is generally a small number that climbs slowly to the
number of cells in the map as the maps being compared both become more
certain (the probabilities approach 0 or 1) and become identical to
each other.

When a **Match** is calculated between a reconstructed map and
its perfect ideal (which contains either 0 or 1 in every cell), we
call the result the **Score**. The perfect knowledge of the world
this requires is easily obtained in simulation, and can be available
in physical runs if the environment is carefully measured by hand, or
by a very reliable and high resolution calibration sensor. Even if
the ideal is not known, one can reason that if a reconstructed map has
a probability of *p* in a cell, it should be the case that a
fraction *p* of the time the ideal map should have a 1 in that
place, and a 0 the other ~*p* of the time. This reasoning
allows us to calculate an average expected **Score** for a map,
even when the ideal map is unknown. We call this the **Entropy**.
A bit of algebra reveals that:

**Entropy** = Sum i=1..n (1 + *p*i * *log*2(*p*i) + ~*p*i * *log*2(~*p*i))
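In code (a sketch; the convention 0 log 0 = 0 is handled explicitly so fully confident cells do not blow up):

```python
import math

def entropy_score(p_map):
    """Expected Score of a map when the ideal is unknown, assuming each
    cell's probability p is the true chance its ideal value is 1."""
    total = 0.0
    for p in p_map:
        term = 1.0
        if p > 0:
            term += p * math.log2(p)          # occupied-case contribution
        if p < 1:
            term += (1 - p) * math.log2(1 - p)  # empty-case contribution
        total += term
    return total
```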

If the probabilities in a reconstructed map are a completely
accurate representation of the robot's true state of incomplete
knowledge, then the map's Score would be equal to its Entropy. If the
magnitude of Score is greater than Entropy, then the map can be said
to be overconfident.

A totally undifferentiated map, where all the cells have the same
probability value *p*, has its maximum **Score** when
*p* is equal to the number of cells in the ideal map that are 1
divided by the total number of cells, i.e. when the probability is
equal to the occupancy density of the ideal. **Entropy**,
by contrast, is zero when *p* = 0.5 and climbs toward the number of
cells as *p* goes to either 0 or 1.

**Nine-Parameter Sensor Model**

The sensor model, whose nine parameters are adjusted by our
learning procedure to maximize the **Score** of the map
built with the hallway data, is described by the following Mathematica
procedure.

```mathematica
ModelPt[range_, x_, y_, m_] :=
  Block[
    {a, r, rd, l, ru, po, pe, pc, pt,
     em0, oc0, an0, ru0, ruinf, emscale, ocscale, anscale, ruscale},

    {em0, oc0, an0, ru0, ruinf, emscale, ocscale, anscale, ruscale} = m;

    a  = Abs[ArcTan[x, y]];
    r  = Sqrt[x^2 + y^2];
    ru = (ru0*ruscale + ruinf*range)/(range + ruscale);
    rd = Exp[-((range - r)/ru)^2];
    l  = an0*Exp[-x/anscale]/2;
    po = rd*oc0*0.5*(1 + Exp[-r/ocscale]);
    pe = If[N[r < range], 0.5*(1 - (1 - em0)*Exp[-r/emscale]), 0.5];
    pc = If[N[po > pe], pe + rd*(po - pe), pe];
    If[N[a > l], 0.5, 0.5 + (pc - 0.5)*(2/(1 + (a/l)^2) - 1)]];
```

It is controlled by nine parameters:

- **em0**: the initial depth of the "empty region" of each probability density function
- **oc0**: the height of the "range ridge" for short range readings
- **an0**: the angle of the sonar beam at short ranges
- **ru0**: the range uncertainty, or width of the range ridge, for short readings
- **ruinf**: the range uncertainty for distant readings
- **emscale**: a constant giving the rate of rise of the empty region with range
- **ocscale**: a constant giving the rate of fall of the occupied ridge with reading range
- **anscale**: a constant giving the rate of decrease of effective beam angle with range
- **ruscale**: a constant giving the rate at which range uncertainty varies from **ru0** to **ruinf**

**Acknowledgements**

We would like to thank Bill Ross for porting the learning code to
the MASPAR parallel computer and so finding the highest scoring (best
educated) sensor models in our experiments. Jason Almeter brought the
Uranus robot to an operational state and used it to collect the data
set employed in the learning experiment. Jesse Easudes spent many
days collecting a prior data set for the same purpose. Bert Enderton
worked with a simulation of sonar data to show the basic feasibility
of the learning approach. Dong Woo Cho wrote the first version of the
simulator. Alberto Elfes did most of the foundational work for the
grid approach, as will be found in his forthcoming PhD thesis:
*Occupancy Grids: A Stochastic Approach to Mobile Robot Perception
and Navigation*. The Uranus robot was largely designed and
constructed by Gregg Podnar. We would also like to thank Kevin
Dowling, a high energy contributor at many stages in the effort, and
all the other people who have helped in the Mobile Robot Lab since
1980.

**References**

[Elfes89] Elfes, A., **Using Occupancy Grids for Mobile Robot
Perception and Navigation**, IEEE Computer Magazine, special issue on
Autonomous Intelligent Machines, June 1989, pp. 46-58.

[Matthies87] Matthies, L. H. and A. Elfes, **Integration of Sonar and
Stereo Range Data Using a Grid-Based Representation**, in Proceedings
of the 1988 IEEE International Conference on Robotics and Automation,
Philadelphia, PA, April 1988.

[Serey86] Serey, B. and L. H. Matthies, **Obstacle Avoidance using 1-D
Stereo Vision**, CMU Robotics Institute Mobile Robot Lab Report,
November 1986.

[Moravec85] Moravec, H. P. and A. E. Elfes, **High Resolution Maps
from Wide Angle Sonar**, Proceedings of the 1985 IEEE International
Conference on Robotics and Automation, St. Louis, March 1985, pp.
116-121.

[Moravec80] Moravec, H. P., **Robot Rover Visual Navigation**, UMI
Research Press, Ann Arbor, Michigan, 1981.