Learning Sensor Models for Evidence Grids

Hans Moravec
Mike Blackwell

Carnegie Mellon University
Robotics Institute

September 1991


Evidence grids (which we have previously called occupancy grids, probability grids and certainty grids: the newer term evidence grids better captures the intent and the implementation of the method) are a probabilistic, finite-element representation of robot spatial knowledge. The grids allow the efficient accumulation of small amounts of information from individual sensor readings into increasingly accurate and confident maps of a robot's surroundings. Our first experiments using the method to interpret measurements from a ring of 24 Polaroid sonar transducers carried on board an autonomously navigating mobile robot were surprisingly successful, compared with our previous experiences with stereo-vision based programs that mapped points on objects as error distributions in space. These older programs enabled a robot to map and traverse cluttered 30 meter obstacle courses, succeeding about three times in four attempts [Moravec80]. By contrast the grid method accomplished a similar task with a vanishingly small failure rate [Moravec85,Elfes89]. We then applied the grid approach to stereo-vision-derived range data from a TV equipped robot, also with good success [Serey86]. A subsequent experiment integrated sonar and vision data, generating maps with correct features not found in those from either sensor alone [Matthies87].

These encouraging early results were obtained using ad-hoc statistical models and methods. We then developed a Bayesian statistical foundation for grid updates. A key result of this derivation was a combining formula (replacing several formulas in the old formulation) for integrating two independently derived maps of the same area, or for adding a new reading to a developing map. When used to add a new reading to an evidence grid, the combining formula requires that the reading itself be represented as such a grid, containing only the independent information provided by that particular reading. A sensor (for instance a sonar sensor) may return one of a set of possible values (one for each range) when a reading is taken. Each different value requires its own grid. The set of grids for all possible values returned during a reading we call a sensor model. Each In the early formulation we constructed a sensor model from an ad-hoc interpretation of the technical specifications of our sonar sensors. This worked well in our first experiments, where the data, mostly sonar in a cluttered lab, was very benign, meaning that the ranges reported by the sensors usually corresponded to the location of objects in the field of view. The only uncertainties were in lateral position of the object, and range uncertainty, both of which we were able to model well enough by an ad-hoc function. In fact, the maps built during robot runs were very insensitive to small adjustments in out sensor model.

The situation is very different if the sources of uncertainty are more complicated. In data from a smooth-walled narrow corridor, the majority of the sonar rangings suffer specular reflection, returning ranges that are much farther than the nearest objects because they come via glancing bounces from walls that the sonar sees as perfect mirrors. In principle the evidence grid method should be able to slowly accumulate partial information (evidence) from such data as long as there exists any correlation between the readings we get and what's out there--but only if the sensor model accurately represents the information contained in a reading. Our ad-hoc models of sonar sensors turned out to be totally ineffective in highly specular environments. Our initial cluttered laboratory environment, with its many reflective surface patches in different orientations, had been unusually benign, from the sonar sensing point of view.

We were not up to the task of constructing a precision sensor model by hand, so we looked for a way to determine (learn) it automatically from accumulated robot sense data. We experimented with two learning approaches. In an earlier approach the evidence maps for individual ranges are estimated directly by observing the frequencies of occupancy from a large sample of readings made in known positions in known surroundings. This has problems--it needs a huge amount of data, is sensitive to quirks in the training sample, gives a statistically noisy and unnecessarily blurry result, and does not in any way compensate for the fact that individual readings don't give entirely independent information, (as we dangerously assume in the basic formulation). The new, more successful, approach shapes the individual evidence maps with a parameterized closed-form formula. The parameters of this formula are adjusted, in a hill-climbing process, so that the map built by the functions with data from a robot test run is as good as possible. Using this approach with a 9-parameter function a program using several weeks of Sparc1+ workstation search time (and several days more on a MASPAR parallel computer) was able to produce a crisp, correct map of the difficult 4ft specular hallway outside my office, from data that produces an unrecognizable splatter when used with a naive sensor model that works fine in benign circumstances.

Learning Sequence

The following illustrations show maps produced from a data set of 648 sonar measurements collected with a Denning sensor ring mounted on our Uranus robot. A set of 24 readings was taken at each foot of travel down the center of a narrow (4 foot wide) 28 foot long leg of an L shaped corridor. The readings were processed into maps by means of a sensor model generated by a formula (revealed at the end of this paper) controlled by nine parameters. The correctness of the map for a particular setting of the parameters was given by a Score expression also defined at the end of the paper that allows the result to be interpreted as the number of correct bits in the reconstruction. A cell with a correct probability of one that corresponds to an ideal map with a corresponding one contributes 1 bit to the sum, as does a zero cell with a zero ideal. A cell with a probability value of one half, indicating total ignorance, contributes 0 to the sum, regardless of the ideal. (A cell whose value is very near zero or one but whose ideal is the opposite can contribute an arbitrarily large negative number to the sum. This is the log probability way of saying there is then no chance that this map could be a representation of the ideal. In our program sensor model probabilities are represented by an eight bit quantity which is the integer part of logb p/(1-p) with b = 2^(1/4). This scaling allows a probability range of approximately 10^-9 to 1-10^-9 and a consequent worst case per cell goodness figure of about -32.)

The first map in the set is the ideal. White represents an occupancy probability of 0, black represents 1 and the 50% checkerboard pattern means 0.5, i.e. don't care. A perfect reconstruction would have a goodness figure of 578 correct bits--found by matching the ideal with itself. The robot traversed the horizontal part of the L shape. The five indentations in that hallway are slightly recessed closed metal doors. The rest of the hallway is smooth painted wallboard, a nearly perfect sonar mirror.

The remaining maps are reconstructions with increasingly good scores, excerpted from thousands encountered in our program's search. The annotation in curly braces at the bottom of each map contains the nine parameters of the sensor model that produced the indicated score. The last two numbers, in the starred parentheses, are the time, in seconds, required to construct the sensor model -- stored as a large numerical table, and to construct the map given model. The first (naive) maps have very negative scores, implying that the reconstruction contains many high values that should be zero, or low values that should be one. These models are overconfident and wrong, and the resulting maps are in a sense worse than nothing, since a totally noncommittal map, with probability one half everywhere, would have a score of zero. Some of the intermediate maps are revealing--the smooth walls are mostly unseen, but reflective images of the door frames appear beyond these walls. Two final (hallway-trained) maps, each found by a different search, one with a score of about 419 bits, the other with 424 bits, are easily good enough for most navigational purposes.

Following the maps is a Mathematica presentations of the sensor-model function, for sonar range measurements of 2, 3, 5 and 10 feet, for the initial naive map parameters and for two final, hallway-trained parameter sets. The naive model gives a terrible result in the hallway, where about seven sonar readings out of eight are at glancing angles that result in specular mirages, but is quite adequate for data as benign as that obtained in our original cluttered laboratory runs.

Following the Mathematica displays are examples of reconstructions from a 328 reading simulated robot run very similar to our original (physical) lab runs--the ideal map, a map reconstructed from a model produced by the old learning method (trained from 20,000 readings in the same simulated environment), the map produced with the naive nine-parameter model, and the map produced by the hallway-trained model. The naive model produces a quite good map, as expected for the model-insensitive data gathered in non-specular surroundings. The hallway-trained model, on the other hand, produces much more conservative maps of the simulated laboratory. Locations far from the robot path remain shrouded in uncertainty--i.e. they remain close to one half in probability value. This has some benefits, as there are fewer artifacts (areas that are incorrectly labelled as probably occupied or empty) than in the more self-assured naive map.


Mathematical Tidbits

An evidence grid is a regular finite element model of space. Each cell of the grid contains a likelihood estimate (our evidence) of a given property of the corresponding region of space. Here we are concerned simply occupancy of the region, represented by ox. p(ox) is the likelihood (or probability) that region x is occupied. When a discussion contains only one particular x, we will drop the subscript, and refer simply to p(o).

Let p(A|B) represent our best estimate of the likelihood of situation A if we have received information B. By definition p(A|B) = p(A intersection B)/p(B). Plain p(A) represents our estimate of A given no special information. The alternative to situation A will be referred to as ~A (not A). For any B, p(A|B) + p(~A|B) = 1

For the two occupancy cases of a cell o (the cell is occupied) and ~o (the cell is empty), with prior likelihoods p(o) and p(~o) and new information M, Bayes theorem can be expressed:

p(o|M)/p(~o|M) = p(M|o)/p(M|~o) * p(o)/p(~o)

This formula combines independent sources of information about o and M into an estimate of a single quantity p(o|M). The new information M, occurs in terms of the probability of M in the situation that a particular cell is or is not occupied, p(M|o) and p(M|~o). This inversion of o and M is a key feature of Bayes' theorem.

But consider the problem of generating a map from information M1 and M2 when each has already been individually processed into a map, i.e. find p(o|M1 intersection M2) given p(o|M1) and p(o|M2). This requires finding p(A intersection B) knowing p(A) and p(B), which is not possible in general. But if we assume that A and B represent independent information p(A intersection B) = p(A) * p(B). Algebra then gives us:

p(o|M1 intersection M2)/p(~o|M1 intersection M2)
= p(o|M1)/p(~o|M1) * p(o|M2)/p(~o|M2)

This combining formula can be used to build maps from sensor readings if p(o|M1) is interpreted as the developing map, and p(o|M2) represents a reading. The approach is especially efficient if we use the logarithm of the probability ratio (itself known as the odds). In log odds, the combining formula is a simple addition:

log [p(o|M1 intersection M2)/p(~o|M1 intersection M2)] =
log [p(o|M1)/p(~o|M1)] + log [p(o|M2)/p(~o|M2)]

and the log odds form itself can be considered weight of evidence. Properly scaled, eight bit integers seem adequate for storing the sensor model's weight-of-evidence values.

The values of p(o|M2) can be learned by accumulating actual occupancy statistics in known environments for each spatial location around a sensor for each possible reading. The resulting averages will contain information about the occupancy density and other more complicated peculiarities of the training spaces as well as of the sensor response. The background density effect can be removed by identifying the training space density with p(o|Mc) and dividing it out in formula, but the other artifacts remain. This learning approach also requires very large numbers of readings, so as to give enough instances of each possible return from the sensor. Even so, the result is statistically noisy and blurry, and does not in any way compensate for the fact that individual readings don't give entirely independent information, (as we dangerously assume in the basic formulation).

The new, more successful, approach described in this paper shapes the individual evidence maps with a parameterized closed-form formula. The parameters of this formula are adjusted, in a hill-climbing process, so that the map built by the functions with data from a robot test run is as good as possible.

Using learned sensor models to process sensor readings from a roving robot is potentially very fast because the learned M2 values are independent of the current map, and can be stored in tables. An update can thus be done with as little as one arithmetic operation per cell (an addition in the weight of evidence form) in a sensor's sensitive volume. In practice there are three additions and a bounds test in our inner loop.

Match, Score and Entropy

To provide a merit value for the parameter adjustment learning, we compare a reconstructed map with a (usually hand made) ideal one in the following way. The probability that two maps represent the same situation is equal to the product of the probabilities that each corresponding pair of cells represents the same value. Now, each number in our maps represents the probability that the cell is a occupied. So the probability that two cells Ai and Bi are both occupied is just Ai * Bi. The probability that they are both empty is ~Ai * ~Bi. The probability that they are the same is the sum of these cases: Ai * Bi + ~Ai * ~Bi. The probability that two maps represent the same thing is then Product i=1..n (Ai * Bi + ~Ai * ~Bi) over all the cells of the maps. Generally this will be a ridiculously tiny number. To get a number of reasonable magnitude, and incidentally an information measure in bits, we take log2 of this probability. By adding 1 to each term, we see that two cells that are maximally confident (either exactly 1 or 0) and identical score as 1 (correct bit), while a cell in an unknown (0.5) state, scores 0 when compared to any value. We call this result the Match between the maps.

Match = n + log2 Product i=1..n (Ai Bi + ~Ai * ~Bi)
= Sum i=1..n ( 1+log2(Ai * Bi + ~Ai * ~Bi))

(Note added in 1997: Cross Entropy is probably a superior measure of match:

Cross Entropy = Sum i=1..n ( 1+Ai log2(Bi) + ~Ai log2(~Bi))

A would be the map with more accumulated evidence, B the one with less.)

Match is generally a small number that climbs slowly to the number of cells in the map as the maps being compared both become more certain (the probabilities approach 0 or 1) and become identical to each other.

When a Match is calculated between a reconstructed map and its perfect ideal (which contains either 0 or 1 in every cell), we call the result the Score. The perfect knowledge of the world this requires is easily obtained in simulation, and can be available in physical runs if the environment is carefully measured by hand, or by a very reliable and high resolution calibration sensor. Even if the ideal is not known, one can reason that if a reconstructed map has a probability of p in a cell, it should be the case that a fraction p of the time the ideal map should have a 1 in that place, and a 0 the other ~p of the time. This reasoning allows us to calculate an average expected Score for a map, even when the ideal map is unknown. We call this the Entropy. A bit of algebra reveals that:

Entropy = Sum i=1..n ( 1 + pi log2(pi) + ~pi log2(~pi))

If the probabilities in a reconstructed map are a completely accurate representation of the robot's true state of incomplete knowledge, then the map's Score would be equal to its Entropy. If the magnitude of Score is greater than Entropy, then the map can be said to be overconfident.

A totally undifferentiated map, where all the cells have the same probability value p, has minimum Score when the p is equal to the number of cells in the ideal map that are 1 divided by the total number of cells, i.e. when the probability is equal to the occupancy density of the ideal. But Entropy is maximum when p = 0.5 and drops towards 0 as p goes to either 0 or 1.

Nine-Parameter Sensor Model

The sensor model, whose nine parameters are adjusted by our learning procedure to maximize the Score of the map build with the hallway data, is described by the following Mathematica procedure.

ModelPt [range_,x_,y_,m_] :=
Block [
{em0,oc0,an0,ru0,ruinf,emscale,ocscale,anscale,ruscale} = m;
a = Abs [ArcTan [x,y]];
r = Sqrt [x^2+y^2];
ru = (ru0*ruscale + ruinf*range)/(range+ruscale);
rd = Exp [-((range-r)/ru)^2];
l = an0*Exp [-x/anscale]/2;
po = rd*oc0*0.5*(1+Exp [-r/ocscale]);
pe = If [N [r<range], 0.5*(1-(1-em0)*Exp [-r/emscale]), 0.5];
pc = If [N [po>pe], pe+rd*(po-pe), pe];
If [N [a>l], 0.5, 0.5+(pc-0.5)*(2/(1+(a/l)^2)-1) ]];

It is controlled by nine parameters:

the initial depth of the "empty region" of each probability density function
the height of the "range ridge" for short range readings
the angle of the sonar beam at short ranges
the range uncertainty, or width of the range ridge, for short readings
the range uncertainty for distant readings
a constant giving the rate of rise of the empty region with range
a constant giving the rate of fall of the occupied ridge with reading range
a constant giving the rate of decrease of effective beam angle with range
a constant giving the rate at which range uncertainty varies from ru0 to ruinf


We would like to thank Bill Ross for porting the learning code to the MASPAR parallel computer and so finding the highest scoring (best educated) sensor models in our experiments. Jason Almeter brought the Uranus robot to an operational state and used it to collect the data set employed in the learning experiment. Jesse Easudes spent many days collecting a prior data set for the same purpose. Bert Enderton worked with a simulation of sonar data to show the basic feasibility of the learning approach. Dong Woo Cho wrote the first version of the simulator. Alberto Elfes did most of the foundational work for the grid approach, as will be found in his forthcoming PhD thesis: Occupancy Grids: A Stochastic Approach to Mobile Robot Perception and Navigation. The Uranus robot was largely designed and constructed by Gregg Podnar. We would also like to thank Kevin Dowling, a high energy contributor at many stages in the effort, and all the other people who have helped in the Mobile Robot Lab since 1980.


Elfes, A., Using Occupancy Grids for Mobile Robot Perception and Navigation, IEEE Computer Magazine, special issue on Autonomous Intelligent Machines, June 1989, pp. 46-58.

Matthies, L. H. and A. Elfes, Integration of Sonar and Stereo Range Data Using a Grid-Based Representation, in Proceedings of the 1988 IEEE International Conference on Robotics and Automation, Philadelphia, PA, April, 1988.

Serey, B. and L. H. Matthies, Obstacle Avoidance using 1-D Stereo Vision, CMU Robotics Institute Mobile Robot Lab Report, November, 1986.

Moravec, H. P. and A. E. Elfes, High Resolution Maps from Wide Angle Sonar, proceeding of the 1985 IEEE International Conference on Robotics and Automation, St. Louis, March, 1985, pp 116-121.

Moravec, H. P., Robot Rover Visual Navigation, UMI Research Press, Ann Arbor, Michigan, 1981.