Organization: Carnegie Mellon University
Principal Investigator: Hans Moravec
Date: January 20, 2000
The work has three natural stages: completion and improvement of the basic perception code; creation of an identification layer for navigation and recognition of architectural features; and finally, sample application programs that orchestrate the other abilities into practical behaviors like patrol, delivery and cleaning. We need both capability and efficiency. The existing code allows one-second time resolution with 1,000 MIPS, but our 3D maps have millions of cells, and straightforward implementations of path planning, localization, object matching and the like would be much slower. We will combine techniques like coarse-to-fine matching, selective sampling and alternate representations to get the desired results at sufficient speed.
Ideas awaiting evaluation include variable-resolution trinocular stereoscopic matching; optimization of sensor evidence profiles and other system parameters using robot images of the scene, first to naturally colorize the occupied cells in a grid, then to check the grid's faithfulness, closing the learning loop; randomly textured light to improve stereo of uniform surfaces; coarse-to-fine path finding; occupied-cell-list matching of grids for fast localization and feature identification; and others. Ongoing work will lead us to more.
In the last months of 1999 we greatly improved flatfish, a camera calibration program originally written in 1996. Many changes and additions extended the range of calibration images it can handle correctly and automatically. The largest of these, iterative incremental rectification, now allows the program to correctly rectify true fisheye lens images, with fields of view in excess of 120 degrees. Previously the usable field of view had been limited to about 100 degrees.
Flatfish is run under supervision once for each robot camera. The functions it produces are then used repeatedly onboard the robot to rectify each camera image. The program assumes that image imperfections arise from small camera and target mounting misalignments and from a lens distortion that is radially symmetric about some, initially unknown, center. These assumptions are mostly correct for solid-state cameras, but would not be true for electron-gun cameras, whose scanning rasters often contain non-radial distortions.
Flatfish works with images of a test pattern. The camera fixture is leveled and positioned, with reasonable accuracy, perpendicular to and a known distance from a screen patterned with a precise square array of about 400 black spots on a white background; one additional spot at a half-grid position at the center serves as an absolute position reference.
Figure 1: Robot with three 90 degree field of view cameras, aligned on rectification test pattern
The program inputs an image of the calibration spot pattern, an aim point on the pattern, the measured distance between the lens center and the pattern, and the desired angular field of view of the rectified image. It produces a file of calibration parameters which contains the necessary information to rectify images from the same camera to these specifications. The lens center (where the pinhole would be, if it were a pinhole camera) is conventionally taken to be at the distance of the lens iris ring. To determine the position without guessing, a helper program scalefish takes two calibration files produced from the same camera at different distances from a test pattern, with other settings including guessed lens center unchanged. By relating the image scale change reflected in the calibration files to the distance change (which must be precisely given), scalefish deduces the actual lens center position, which can then be used in later runs of flatfish with that type of lens.
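The geometry scalefish exploits is simple similar triangles: image scale varies inversely with the true distance from the lens center to the pattern, so two scale measurements at two known distances pin down the lens center. As a minimal illustration (the function and variable names here are ours, not scalefish's):

    #include <stdio.h>

    /* If the two measured distances m1 and m2 were taken from a guessed
       lens center position, and c is the unknown correction to that
       guess along the optical axis, then image scale s is inversely
       proportional to the true distance:  s1/s2 = (m2 + c)/(m1 + c).
       Solving for c:                                                  */
    double lens_center_offset(double s1, double m1,  /* scale, distance, run 1 */
                              double s2, double m2)  /* scale, distance, run 2 */
    {
        return (s2 * m2 - s1 * m1) / (s1 - s2);
    }

    int main(void)
    {
        /* hypothetical numbers: pattern at 1000 mm, then 1500 mm, with
           image scales read from the two flatfish calibration files   */
        printf("lens center offset: %.1f mm\n",
               lens_center_offset(1.000, 1000.0, 0.669, 1500.0));
        return 0;
    }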
To calibrate stereoscopic camera sets, all the cameras are aimed at the same calibration pattern, as in figure 1. The flatfish program is run for each resulting image, with only the aim point on the calibration pattern specified differently for each calibration. The aim point for each camera defines its rectified view direction, and the separation of the aim points on the pattern, vertically and laterally, must equal the physical separation of the cameras.
Flatfish begins by estimating the density of spots expected from the setup description, then applies a "spot operator" to identify round dark spots in the image. At each pixel the spot operator applies templates at a range of sizes, each consisting of a round region and its annular surround. The "spot value" is the difference in mean lightness between the exterior and the interior regions, minus terms proportional to the nonuniformity of brightness within the interior and exterior regions. An adjacent spot-linking process connects the spots into a grid to give each a "spot coordinate", for instance three spots up and two to the right from a starting spot. The program then guesses an optical center (initially the center pixel of the image raster), and calculates a radius from this center to each spot, in both pixel and spot coordinates. We show the result as a scatter diagram: a point for each spot, positioned horizontally by its spot radius and vertically by its pixel radius. The shape of the plot gives the radial lens distortion; the scatter indicates the error made in the choice of center. The program then adjusts the assumed optical center (and any additional parameters, currently pixel aspect ratio) to minimize the scatter. A function with several parameters, fit to the scatter plot, defines the radial lens distortion, which is then eliminated. The program then determines first-order corrections to remove the effects of camera aiming errors in azimuth, elevation and roll, as well as scaling and recentering the result to the desired field of view.
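As a rough sketch of such a spot operator (the region sizes, weighting constants and exact form of the nonuniformity penalty below are our guesses, not flatfish's actual code):

    #include <math.h>

    /* Score one candidate spot: a round interior of radius r centered
       at (cx,cy), compared against an annular surround out to 2*r.
       Dark round spots on a light ground score high; brightness
       variation inside either region is penalized.                 */
    double spot_value(const unsigned char *img, int w, int h,
                      int cx, int cy, int r)
    {
        double isum = 0, isq = 0, osum = 0, osq = 0;
        int icount = 0, ocount = 0;
        int R = 2 * r;                       /* outer edge of annulus */

        for (int y = cy - R; y <= cy + R; y++) {
            for (int x = cx - R; x <= cx + R; x++) {
                if (x < 0 || x >= w || y < 0 || y >= h) continue;
                double d2 = (x-cx)*(double)(x-cx) + (y-cy)*(double)(y-cy);
                double v  = img[y*w + x];
                if (d2 <= (double)r*r)      { isum += v; isq += v*v; icount++; }
                else if (d2 <= (double)R*R) { osum += v; osq += v*v; ocount++; }
            }
        }
        if (icount == 0 || ocount == 0) return 0;

        double imean = isum / icount, omean = osum / ocount;
        double ivar = isq/icount - imean*imean;  /* nonuniformity terms */
        double ovar = osq/ocount - omean*omean;

        return (omean - imean) - 0.5*sqrt(ivar > 0 ? ivar : 0)
                               - 0.5*sqrt(ovar > 0 ? ovar : 0);
    }

At each pixel such an operator would be evaluated over the expected range of radii, keeping the best response as that pixel's spot value.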
Examples of the program's operation on images from cameras with fields of view between 30 and 90 degrees follow. This is an undemanding regime within the scope of the old program, though the new program works much more automatically and reliably.
90 degree field of view example, from left camera of a stereo pair
The original program managed to rectify correctly only about 100 degrees of the field of view of very wide angle cameras, because the spot finder could not recognize spots distorted into extreme ellipses at the edges of the field of view. The new program addresses this by iterating the rectification steps. The first iteration locates spots only in approximately the innermost 100 degrees of the field of view, but extrapolates the rectification roughly to wider angles. Usually all the spots in the resulting rectified image are then recognizable, and a second iteration is accurate to the image edges. This process works only if the scatter fit function extrapolates reasonably. We found that polynomials and other generic functions do not, tending instead to rise or fall precipitously just beyond the range of the fitted points. We found a very good solution by considering the lens designer's problem of providing equal illumination across the field of view, which can be achieved by assigning every pixel an equal solid angle of view. With radial symmetry, this demands that the pixel radius be proportional to a specific arctangent of the spot (or projective) radius. We installed this arctangent as the first-order basis function for the least-squares fit. A linear function, modeling a projective relation, was the second term, and subsequent functions were successive powers of the arctangent and linear functions. With this basis the extrapolations are often nearly correct quantitatively as well as qualitatively. The program iterates repeatedly, increasing the degree of the fit function, until the number of found spots and the scatter both fail to improve, or the function coefficients grow too large, indicating an incipient singularity.
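Our reading of that basis, in sketch form (the focal-length scale f and the interleaving of the higher power terms are assumptions of ours):

    #include <math.h>

    /* Radial distortion model: pixel radius as a least-squares
       combination of an arctangent "uniform illumination" term, a
       linear (projective) term, and successive powers of the two. */
    double pixel_radius(double rp,        /* projective (spot) radius   */
                        double f,         /* assumed focal-length scale */
                        const double *a,  /* fitted coefficients        */
                        int degree)       /* highest power in the fit   */
    {
        double at  = atan(rp / f);            /* first-order basis      */
        double sum = a[0] * at + a[1] * rp;   /* arctangent + linear    */
        int k = 2;
        for (int p = 2; p <= degree; p++) {   /* successive powers      */
            sum += a[k++] * pow(at, p);
            sum += a[k++] * pow(rp, p);
        }
        return sum;
    }

Because the arctangent term saturates smoothly as rp grows, extrapolation beyond the fitted points stays tame, unlike a pure polynomial basis.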
Here is an example from a 120 degree image, with vignetting because the camera was fitted with a lens intended for a different camera with a smaller sensor.
120 degree field of view, multiple iterations.
Second iteration, starting with previous rectified image
Third iteration, starting with 2nd iteration rectified image
We found the arctangent "uniform illumination" function to be a good representation of distortion in all the lenses we encountered. This led us to try applying it pre-emptively, before any spot-finding. Since we do not know the exact image scaling, positioning or rotation in advance, such pre-rectification is bound to produce images with slight scale, position and rotation errors. Still, spots are likely to be rounder and more uniform in size in the pre-rectified images than in the raw images, making flatfish's first iteration easier. We found this was in fact the case, and flatfish now pre-rectifies unless otherwise instructed.
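Pre-rectification amounts to remapping the raw image through the nominal arctangent law before any spots are found; in sketch form (the scale constants, centered optics and nearest-neighbor sampling are assumptions of ours, not flatfish's actual code):

    #include <math.h>

    /* For each pixel of the (approximately perspective) output image,
       find the raw fisheye pixel it came from and copy it.  Under the
       nominal law the raw radius is k * theta, where theta is the view
       angle recovered from the output's projective radius.            */
    void prerectify(const unsigned char *raw, unsigned char *out,
                    int w, int h,       /* same size for both images */
                    double f,           /* output perspective scale  */
                    double k)           /* raw pixels per radian     */
    {
        double cx = w / 2.0, cy = h / 2.0;  /* assume centered optics */

        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double dx = x - cx, dy = y - cy;
                double rr = sqrt(dx*dx + dy*dy);        /* rectified radius */
                double theta = atan(rr / f);            /* view angle       */
                double s = (rr > 0) ? k * theta / rr : 0;  /* raw/rectified */
                int xs = (int)(cx + s * dx + 0.5);
                int ys = (int)(cy + s * dy + 0.5);
                out[y*w + x] = (xs >= 0 && xs < w && ys >= 0 && ys < h)
                             ? raw[ys*w + xs] : 0;
            }
        }
    }

The residual scale, position and rotation errors this leaves are exactly what flatfish's first iteration then measures and removes.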
Here are some examples of pre-rectified calibrations:
120 degree field of view, with pre-rectification
First iteration, starting with pre-rectified image
(subsequent iterations, too boring to include)
30 degree field of view, with pre-rectification
(subsequent iterations, too boring to include)
60 degree field of view, with pre-rectification
(subsequent iterations, too boring to include)
90 degree field of view, overcorrected by pre-rectification
(subsequent iterations, too boring to include)
The current version of the program can be found here: flatfish.c
In other developments, the G4 and iBook computers, linked by wireless network, that will control our first robot testbed have arrived, as have three Quickcam Pro cameras to be mounted on it. We had delayed ordering a robot chassis, and have now decided to use the Probotics Cye platform, with a special laptop trailer for the iBook, which can do the job of collecting stereoscopic imagery for less than $1,000 per platform.
Graduate student Martin Martin, working toward his thesis, about one year away, has integrated monocular vision, using an optical-flow approach, with sonar range sensing and omnidirectional robot control, toward a program that evolves high-level control behaviors. The work is being conducted on our existing Uranus mobile robot, controlled via a long cable by a desktop dual-Pentium machine running the Be operating system. Though the details are different, the approach foreshadows the development system we will use for the main work.
At that time, or perhaps at the following meeting, we will also be able to report on genetic-algorithm learning of strategies for combining sensor and control techniques for more effective robot mobility, the subject of Martin Martin's thesis.