DARPA MARS program research progress

Project Title: Robust Navigation by Probabilistic Volumetric Sensing

Organization: Carnegie Mellon University

Principal Investigator: Hans Moravec

Date: September 15, 2002

The most recent update of this report can be found at http://www.frc.ri.cmu.edu/~hpm/talks

Technical Report


Since 1999 we've been working full-time to develop laboratory prototype sensor-based software for utility mobile robots for industrial transport, floor maintenance, security, etc., that matches the months-between-error reliability of existing industrial robots without requiring their expensive worksite preparation or site-specific programming. Our machines will navigate employing a dense 3D awareness of their surroundings, be tolerant of route surprises, and be easily placed by ordinary workers on entirely new routes or work areas. The long-elusive combination of easy installation and reliability should greatly expand cost-effective niches for mobile robots, and make possible a growing market that can itself sustain further development.


Our system is built around 3D grids of spatial occupancy evidence, a technique we have been developing since 1984, following a prior decade of robot navigation work using a different method. 2D versions of the grid approach found favor in many successful research mobile robots, but seem short of commercial reliability. 3D grids, with at least 1,000 times as much world data, were computationally infeasible until 1992, when we combined increased computer power with a 100x speedup from representational, organizational and coding innovations. In 1996 we wrote a preliminary stereoscopic front end for our fast 3D grid code, and the gratifying results convinced us of the feasibility of the approach, given at least 1,000 MIPS of computer power. This contract enables us to extend that start towards a universally convincing demonstration, just as the requisite computing power arrives.
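The core of the evidence-grid idea can be illustrated with a minimal sketch (the cell layout, evidence values, and class below are illustrative assumptions, not the project's actual code): each cell accumulates occupancy evidence from many sensor readings in log-odds form, so independent readings simply add.

```python
import numpy as np

# Minimal 3D occupancy-evidence grid: each cell holds accumulated
# log-odds evidence of occupancy. Positive = likely occupied,
# negative = likely empty, zero = unknown. (Toy dimensions; the
# project's grids are 512x512x128 with learned sensor models.)
class EvidenceGrid:
    def __init__(self, shape=(16, 16, 8)):
        self.logodds = np.zeros(shape)

    def add_evidence(self, cell, log_odds_ratio):
        # Evidence from independent readings adds in log-odds form.
        self.logodds[cell] += log_odds_ratio

    def occupancy(self, cell):
        # Convert accumulated log odds back to a probability.
        return 1.0 / (1.0 + np.exp(-self.logodds[cell]))

grid = EvidenceGrid()
# Two readings suggesting cell (3, 4, 2) is occupied, one suggesting empty.
grid.add_evidence((3, 4, 2), +1.0)
grid.add_evidence((3, 4, 2), +1.0)
grid.add_evidence((3, 4, 2), -0.5)
print(round(grid.occupancy((3, 4, 2)), 3))  # → 0.818
```

The additive update is what makes the representation robust: no single noisy reading can flip a cell that many other readings agree on.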

In year 2000 reports we described a greatly improved camera calibration program and a sensor model learning approach guided by color consistency that produced near-photorealistic 3D maps. In July 2001 we reported on a new meticulously-prepared set of 606 images from a hallway traverse by a surrogate robot, designed to overcome the limitations of two older data sets, and mentioned that we were reconstructing the system in modular C++ to incorporate improved techniques and to rescue us from a plain C program that was growing into an impenetrable tangle.

In our February 2002 report we noted that a first draft of the new program was complete. We showed first results with the new data that gratifyingly exceeded the best realism we had achieved previously. Many internal images of our new grids can truly be mistaken for photographs of a real location, and are clearly superior for navigation planning. Defects remain at the outer edges of the grid, especially those at the upper extreme of the robot's fields of view. Contemplating these has produced a number of new ideas, two of which were particularly exciting and very likely to result in further improvements, especially for the most difficult grid regions. We nicknamed them "Two Thresholds" and "Prior Probe" in the February report.

Besides pursuing new research opportunities as they present themselves, we are following a long-term strategy towards practical free-ranging robot applications. The new position-calibrated data allows us temporarily to avoid the problem of registering uncertain viewpoints. The next phase of the project, however, will try to derive equally good maps from image sequences collected by imprecisely traveling robots. Scott Crosby, who joined the project from May 2001 to May 2002, developed a program that can geometrically register a partial map built from a single robot location with a more complete working map. Our most recent maps have cell dimensions 512x512x128, and thus contain about 34 million cells. Scott's program, in less than a second, selectively samples about 30 thousand of those to participate in the registration process, and, in versions of the new map data randomly displaced by a few centimeters and degrees, manages to recover the correct position and orientation to within a few millimeters displacement for the worst-case cells at several meters' range. Further improvements are expected, but the present performance would already be sufficient to navigate robots.
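The essence of that registration can be sketched as follows (a deliberately simplified, translation-only toy with synthetic data; the actual program also recovers rotation and samples ~30 thousand cells of a 512x512x128 grid): score each candidate displacement by how much occupancy the sampled cells of the partial map land on in the working map, and keep the best.

```python
import numpy as np

# Sketch of sampled-grid registration, simplified to integer translation.
rng = np.random.default_rng(0)

# Working map: a small grid with 200 randomly occupied cells.
working = np.zeros((32, 32, 16))
occupied = rng.integers(0, [32, 32, 16], size=(200, 3))
working[tuple(occupied.T)] = 1.0

# Partial map: a sample of those cells, seen from a displaced viewpoint.
true_shift = np.array([2, -1, 1])
sample = occupied[rng.choice(len(occupied), size=50, replace=False)]
partial_cells = sample + true_shift

def score(offset):
    # How much occupancy do the sampled cells hit under this offset?
    cells = partial_cells + offset
    ok = np.all((cells >= 0) & (cells < working.shape), axis=1)
    return working[tuple(cells[ok].T)].sum()

# Coarse search over a small window of candidate offsets.
best = max(
    ((dx, dy, dz) for dx in range(-3, 4)
                  for dy in range(-3, 4)
                  for dz in range(-3, 4)),
    key=lambda o: score(np.array(o)),
)
print(best)  # recovers the offset that undoes the displacement
```

Sampling only a small subset of cells is what makes sub-second registration of a multi-million-cell grid plausible: the score is dominated by whether the sampled cells align, not by the grid's total size.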

New Work

In May 2002, to acquire new test data more automatically, we borrowed a Pioneer II robot from another CMU laboratory (courtesy of graduate student Mike Montemerlo). Chris Schroeder, who joined the project as an undergraduate in March 2002, interfaced the robot to a Powerbook through a USB to serial converter. We purchased and connected four relatively inexpensive firewire cameras (Fire i400), using third party driver software. Three of the cameras are mounted equilaterally on the robot for trinocular stereo vision; the fourth, also on the robot, sits high on a post looking down on the scene. It functions like bicycle training wheels, substituting for the fixed overhead cameras we had used to aid sensor-model learning in our previous data set. Registering fixed overhead cameras with uncalibrated robot paths would be quite burdensome. Advantageously, this approach provides a much greater amount of training imagery. We also equipped the robot with a new textured light source, a bundle of 16 laser line generators, each producing a 60 degree fan of light. This bundle, about the size of one camera, is much more compact than the "mirror ball" device we had used previously, consumes one tenth the power and projects a sharper, better defined pattern (primarily to help range blank surfaces). The equipped robot is shown here:

As of September 2002 the robot has been used to collect one new data set containing over 400 images. It was programmed to rotate and collect six views, separated by 60 degrees, at a sequence of positions every 30 cm along a prearranged L-shaped path. Two-dimensional matching of overlapping floor patches seen from the overhead camera allows us to register the views in the six directions to each other. Other new code includes a supplementary local least-squares image dewarping correction, necessary because the new cameras show additional distortions, perhaps because the imaging chips are mounted slightly off perpendicular. Results from the new data and subsequent runs in varied environments will be forthcoming.
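The floor-patch matching step can be sketched in miniature (synthetic image, translation only; the real registration must also cope with the 60-degree rotations between views): slide a patch from one view over the other and take the offset minimizing the sum of squared differences, i.e. a least-squares match.

```python
import numpy as np

# Register two overlapping floor patches by exhaustive 2D translation
# search, minimizing sum-of-squared-differences (SSD).
rng = np.random.default_rng(1)
floor = rng.random((40, 40))        # stand-in for an overhead floor image
template = floor[10:20, 14:24]      # the same patch as seen in a second view

def match(image, template):
    th, tw = template.shape
    best, best_pos = None, None
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            ssd = np.sum((image[r:r+th, c:c+tw] - template) ** 2)
            if best is None or ssd < best:
                best, best_pos = ssd, (r, c)
    return best_pos

print(match(floor, template))  # → (10, 14)
```

Here the template was cut directly from the image, so the SSD reaches exactly zero at the true offset; with real imagery the minimum is merely small, and lighting changes argue for a normalized correlation score instead.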

Other work nearing completion includes implementation of the "prior probe" and "two thresholds" ideas mentioned in the last report. Prior probe examines developing grids to obtain statistical occupancy priors to improve stereoscopic estimation. We noticed as we implemented it that a similar technique could be used to image the grid: we propagate rays from a viewpoint, one in each pixel direction, stepping cell to cell until an occupied cell is encountered. Previously we had used a conventional graphics algorithm that projected each face of every occupied grid cube. The new approach examines only the volume around a viewpoint, and is several times faster for our learning process. Its cost actually declines as the grid becomes more densely occupied, while the conventional algorithm's cost increases, and it is insensitive to overall grid size. The straightforward implementation slows when the grid is very empty and the rays are long: we are implementing a grid resolution pyramid to improve that case.
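The ray-propagation imaging idea reduces to the following sketch (fixed small steps for clarity; a production version would step exactly one cell at a time with a voxel DDA and, as noted above, use a resolution pyramid when the grid is very empty):

```python
import numpy as np

# Image a grid by ray propagation: march each ray outward from the
# viewpoint until it enters an occupied cell or leaves the grid.
# Rays stop sooner as the grid fills, so cost falls with occupancy,
# and work depends only on the volume near the viewpoint, not on
# total grid size.
grid = np.zeros((32, 32, 32), dtype=bool)
grid[20, 16, 16] = True  # one occupied cell

def cast_ray(origin, direction, max_steps=200, step=0.25):
    pos = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    for _ in range(max_steps):
        pos = pos + step * d
        cell = np.floor(pos).astype(int)
        if np.any(cell < 0) or np.any(cell >= np.array(grid.shape)):
            return None            # ray left the grid: empty pixel
        if grid[tuple(cell)]:
            return tuple(cell)     # first occupied cell along the ray
    return None

print(cast_ray((4.5, 16.5, 16.5), (1, 0, 0)))  # → (20, 16, 16)
```

Casting one such ray per pixel direction yields the rendered view; by contrast, the face-projection algorithm must touch every occupied cube regardless of whether it is visible.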

A company is being formed to commercialize these techniques, currently incorporated as Botfactory in Atlanta, GA.