Towards Autonomous Vehicles

The Mobile Robot Lab
Robotics Institute
Carnegie-Mellon University
Pittsburgh, PA 15213

June 1985

@section(Introduction) The CMU Mobile Robot Lab was started in 1981 to pursue research in perception, planning and control for autonomously roving robots. The obvious short and long range practical applications of robot mobility aside, we think our work directly addresses the problem of building a generally intelligent machine. Among living things, only mobile organisms exhibit the sensory and behavioral characteristics that are the basis of our own intelligence. A roving entity encounters a wide variety of circumstances, and must perceive and respond with great generality to function effectively. We feel our research makes discoveries that parallel the evolution of intelligence in mobile animals. The selection function in both cases is the same - the effective functioning of a physical mobile entity in a varied and uncertain world. We think this experimentally guided bottom up approach can find some solutions, such as the secret of effective common sense reasoning, more effectively than the seemingly direct traditional top down approach to artificial intelligence.

Our first funding came from an Office of Naval Research contract to develop land-based technology for eventual application to autonomous underwater robots. The subprojects were design and construction of highly maneuverable vehicles, development of stereo and sonar vision algorithms, and algorithms for path planning and higher level control. New developments were to be demonstrated in working systems that performed various tasks. We chose two tasks, one simple and one complex. In the first the vehicle was to travel to a goal location specified relative to its starting point, avoiding obstacles en route. This would encourage efforts in stereo, sonar, path planning, and vision-based vehicle motion estimation. The second task - finding, opening, and passing through doorways - was to serve as a longer term focus for work on maneuverable vehicles, object recognition, and high level control.

Our first generation of obstacle avoidance systems now work, and we have taken first steps toward door-opening. We've built a simple vehicle to support obstacle avoidance work and another more complex one to serve our longer term plans. Two obstacle avoidance systems have been tested, one relying solely on stereo and the other on sonar. An initial design for high level software has been tested in simulation. We are preparing to start a second phase of our work which will extend the stereo capability towards shape extraction, merge stereo and sonar into a single system, and begin work on object modelling and recognition.

@section(Overview) Our long term plans called for a small, accurate, very maneuverable, self-powered vehicle integrated with a small but strong manipulator. @i[Pluto] (generically the @i[CMU Rover]) was designed to meet these requirements. Among its several innovations was an @i(omnidirectional) drive system for accurate control of robot motion in three independent degrees of freedom (sideways, front/back and spinning like a skater). Our design used three complex assemblies, each independently able to drive and steer its own wheel. Co-ordinated control of the six motors was a more difficult problem than we had anticipated, and is now being attacked as a research problem in its own right. In the meantime Pluto is unusable for its intended purposes.
For the sake of the rest of the research we constructed a much simpler second vehicle, @i[Neptune]. Power and control signals come via a tether; two synchronous AC motors, switched by a single onboard processor, steer and drive it. Equipped with two vidicon cameras and a ring of sonar range finders, Neptune is very robust and has been used in visual and sonar mapping, navigation and obstacle avoidance experiments.

The control work for Pluto began with the design of successful controllers for the six individual motors. When run simultaneously on the robot, these controllers experienced severe oscillations because of couplings between the drive forces in our overconstrained wheelbase, and fluctuating steering misalignments. We expect to solve the problem with an integrated control strategy for the entire wheelbase. Our early designs for this require several times more total bandwidth and processing power than the individual controllers. As part of the effort we are developing a general calculus for kinematic description of wheeled mobile robots and a simulator for their dynamics.

There are several hardware efforts in progress. We are building a third vehicle, @i[Uranus], with a new, more easily controlled omnidirectional drive system to carry on the longer range work stalled in Pluto, along with a special-purpose manipulator for grasping doorknobs. Just becoming operational is a video digitizer/display that shares memory with a Vax. In addition, we are exploring processor and digitizer configurations for use onboard the vehicles.

On the software side we have concentrated on obstacle avoidance and high level control. We have two obstacle avoidance systems, one using stereo and the other using sonar. Both use a new path planner first developed for the stereo system. We have also designed and simulated the operation of a control system for distributed processors.

The stereo work improves on the system built for the Stanford Cart @cite(Moravec80), which used correlation to track isolated feature points through a sequence of stereo pairs. We reduced the number of images digitized per step, added constraints that improve its tracking ability, and are now modifying its motion estimation algorithm. In the process we have reduced its runtime by an order of magnitude. The robot can now visually navigate across a large room in under an hour on the power of one Vax.

The sonar system uses data from a ring of twenty-four wide angle Polaroid range finders to map the surroundings of an autonomous mobile robot. A sonar range reading provides information concerning space occupancy in a cone (subtending 30 degrees in our case) in front of the sensor. The reading is modelled as probability profiles projected onto a rasterized map for occupied and empty areas. Range measurements from multiple points of view (taken from multiple sensors on the robot, and from the same sensors after the robot moves) are systematically integrated in the map. Overlapping empty volumes reinforce each other, and empty volumes serve to condense the profiles of occupied volumes. The map resolution improves as more readings are added. The final map shows regions probably occupied, probably unoccupied, and unknown areas. The method deals effectively with clutter, and can be used for motion planning and for extended landmark recognition.

Both systems plan robot paths with a new algorithm called path relaxation. It was first developed for the stereo vision navigator, but coincidentally happens to have a structure perfectly suited to our sonar mapper.
Space is represented as a raster of weights, as in the sonar maps, and costs are assigned to paths as a function of their length and the weights through which they pass. A combinatorial search on the raster grid coarsely finds a least cost path, then a relaxation procedure perturbs the co-ordinates of the vertices of this path to smooth it and reduce its cost.

Our work on high level control began with a design for a distributed planning and control system for the several processors of Pluto. A system was designed around a network of message-passing kernels, a central blackboard module to represent state, and a notion of master/slave processes wherein masters monitored the blackboard while slaves dealt with external events. A small version of this system was tested in simulation. Development is proceeding slowly, with a planned acceleration when our other robot work warrants.

We've begun a new project in co-operation with a Civil Engineering group led by William Whitaker. The Terragator is a large mobile robot built by Whitaker's group. Powered by an onboard gasoline fueled generator, it is designed for long outdoor journeys. We equipped it with a television camera and microwave TV link, and wrote a simple but fast program that drives it along roads in real time, visually tracking the left and right edges. We plan to extend this work to longer journeys and faster speeds, obstacle detection, landmark recognition and long range navigation.

@section(Vehicles) We wanted a flexible vehicle to support research on vision, vision-guided manipulation, and the planning issues that come with mobility. An early innovation was the insight that a mobile wheelbase could be considered part of an attached arm. The weight and power needed in the arm can be reduced by using the mobility of the Rover as a substitute, with enormous reach, for the shoulder joint of the arm. Such a strategy works best if the Rover body is given a full three degrees of freedom (front/back, left/right and compass heading) in the plane of the floor. Conventional steering arrangements as in cars give only two degrees at any instant. This manipulation mode is also much helped if the wheels can servo very accurately and rapidly. Other desiderata for the robot were the ability to run untethered, to work with multi-sensory systems, and the presence of some onboard processing power to reduce the communications bandwidth and perform some local decision-making.

@begin(comment) The design of omnidirectional vehicles was a research problem at the time this project was started and remains so today. Conventional drives did not satisfy our requirements and omnidirectional drives had not been demonstrated. Given this state of affairs, we built a vehicle of our own design to meet the specifications listed above. This vehicle (Pluto) largely works, but the wheel and motor design encountered unforeseen control problems that prevent it from being useful in vision experiments. While we expect to lick this problem, it has taken and will continue to take a considerable amount of time. Therefore we have built a second, much simpler vehicle (Neptune) to support the obstacle avoidance work and we are currently designing another omnidirectional vehicle (Uranus) around a new wheel concept that will be easier to control. These vehicles are described below. @end(comment)

Pluto, our first vehicle, was built to the above specification. A second, simpler vehicle called Neptune was subsequently built to support obstacle avoidance work.
A third vehicle is currently being designed to test a new concept in omnidirectionality. These are described below.

@subsection(Pluto) Physically, Pluto is cylindrical, about a meter tall, 55 cm in diameter, and weighs about 200 lbs (fig. 1). Its three individually steerable wheel assemblies give it a full three degrees of mobility in the plane (fig. 2). The control algorithm for this arrangement steers the wheels so that at every instant lines through their axles meet at a common point. Properly orchestrated, this design permits unconstrained motion in any (2D) direction and simultaneous independent control of the robot's rotation about its own vertical axis.

To permit low-friction steering while the robot is stationary, each assembly has two parallel wheels connected by a differential gear. The drive shaft of the differential goes straight up into the body of the robot. A concentric hollow shaft around this one connects to the housing of the differential. Turning the inner shaft causes the wheels to roll forwards or backwards; turning the outer one steers the assembly, with the two wheels rolling in a circle. Each shaft is driven by a brushless DC motor with samarium-cobalt permanent-magnet rotors and three-phase windings. The motor sequencing signals come directly from onboard microprocessors, one for each motor, which read shaft encoders to servo the motors to the desired motion. A seventh processor, the conductor, coordinates the action of the six motor sequencing processors. Another processor reads the shaft encoder outputs and monitors the motor torques to provide an onboard dead-reckoning estimate of the vehicle's position. Power for this ensemble is provided by a set of sealed lead-acid batteries.

Pluto was to be equipped with a collection of sensors including cameras, sonar, and bump detectors and used in a series of advanced experiments in vision, navigation and planning. The bulk of the computation would be performed on a remote Vax-11/780, with communication taking place over a microwave video link and a radio data link. Extra processors were included in the design to service the sensors and manage the communication link. This plan was waylaid by a difficult and unexpected problem in controlling the six motors of the omnidirectional wheelbase. We are able to drive the robot successfully when one wheel at a time is energized, but experience large oscillations when all are running at once. The problem is caused by interactions between the servo loops of the individual actuators through the redundant degrees of freedom in the wheels. A similar situation arises in a milder form in other locomotion systems with redundant degrees of freedom, especially legged vehicles. We are now investigating control algorithms and processor architectures for this interesting problem, but in the meantime Pluto is unavailable for experimental work with our vision systems. Neptune was built to fill the gap.
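The per-wheel commands implied by a desired body motion follow directly from rigid-body kinematics. The sketch below is purely illustrative (Python; the 120 degree wheel spacing, the 0.2 m mounting radius, and all names are our assumptions here, not Pluto's actual parameters or software):

@begin(verbatim)
import math

# Wheel assembly positions relative to the robot center (meters);
# the spacing and radius are assumptions for illustration only.
WHEELS = [(0.2 * math.cos(a), 0.2 * math.sin(a))
          for a in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]

def wheel_commands(vx, vy, omega):
    """Steer angle (rad) and drive speed (m/s) for each assembly,
    given body velocity (vx, vy) and spin rate omega.  Each wheel's
    velocity is v + omega x p; lines through the axles, perpendicular
    to these velocities, all meet at the instantaneous center of
    rotation, as required by the steering rule described above."""
    cmds = []
    for (px, py) in WHEELS:
        wx = vx - omega * py        # rigid-body velocity at the wheel
        wy = vy + omega * px
        cmds.append((math.atan2(wy, wx), math.hypot(wx, wy)))
    return cmds
@end(verbatim)

It is exactly the coupling among these simultaneously servoed commands, through contact with the floor, that produces the oscillations described above.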
@subsection(Neptune) We decided to build, quickly but carefully, a simple and robust platform for obstacle avoidance experiments. Neptune was designed to eliminate many potential problems. It is a tethered, remotely powered tricycle with a lone onboard processor (fig. 3). To simplify servoing and remove the need for shaft encoders, synchronous AC motors drive and steer the front wheel while the rear wheels trail. The vehicle itself is about two feet tall, four feet long, and two feet wide. It weighs about 250 pounds. It is currently configured with two black-and-white vidicon cameras on fixed mounts and a ring of twenty-four Polaroid sonar range-finders. The range-finders have a usable range of about twenty-five feet and a thirty degree beam width, so that the beams of adjacent sensors overlap by about fifty percent. The vehicle moves at a constant velocity, with angles and distances controlled by timing the motors with the onboard MC68000. Neptune is an unqualified success. It has been our workhorse for obstacle avoidance and indoor road following experiments and will be used in future to test extended vision algorithms and to merge stereo and sonar into one system.

@subsection(Uranus) Omnidirectionality appears to be an idea whose time has come. While Pluto was in gestation several new methods for achieving it were published and patented. One, developed at Stanford, is based on novel wheels that have passive rollers instead of tires, oriented at right angles to the wheel. The rollers permit the wheel to be pushed passively in the broadside direction. Three such wheels, each with its own motor, mounted around a round wheelbase allow smooth motion in three degrees of freedom. Regardless of the direction of travel, one wheel or another is always travelling nearly broadside, and this is a weakness of the system. It requires an expensive and potentially troublesome bearing system for the rollers, and suffers from a low ground clearance, limited by the roller diameter, and inability to travel on soft ground. Despite these limitations, it would have been a far more fortunate design choice than the individually steerable wheels of Pluto.

Another new design for omnidirectionality was invented recently in Sweden. It too uses wheels surrounded by passive rollers, but the rollers are angled at 45 degrees to the wheel plane. One of these wheels can travel broadside on its rollers, but the whole wheel must simultaneously turn, resulting in a screw-like motion. Like screws, these wheels are not mirror symmetric, and come in right handed and left handed varieties. An omnidirectional vehicle is built with four of these wheels, mounted like wagon wheels, but with careful attention to handedness. The right front wheel is right handed and the left front is left handed, but the right rear is left handed and the left rear is right handed. Each wheel is turned by its own motor. To move the vehicle forward, all four wheels turn in the same direction, as in a conventional wagon. However, if the wheels on opposite sides of the vehicle are driven in opposite directions, the vehicle moves sideways, like a crab. By running the front and back pairs sideways in opposite directions, the vehicle can be made to turn in place. Because the rollers are not required to turn when the vehicle moves in the forward direction, the Swedish design has good bump and soft ground handling ability in that direction.

In our experience-scarred judgement the Swedish design is the most practical omnidirectional system known. It alone is being used outside of an experimental context, in commercially available wheelchairs and factory transport vehicles. Uranus, the Mobile Robot Lab's third construction, is being designed around this proven drive system to carry on the long range work stalled in Pluto. We obtained the wheels themselves from Mecanum, Inc. of Sweden, which holds the license. Pluto's many lessons guide us in this project. In just about every way Uranus is simpler than Pluto.
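In particular, the wheel speeds for a desired body motion follow a simple linear mixing law. The following sketch is a hypothetical illustration (Python), not Uranus' controller; the wheel radius, the half-length and half-width, and the sign conventions (which depend on the handedness arrangement) are all assumptions:

@begin(verbatim)
# Illustrative dimensions (meters); not Uranus' actual geometry.
R = 0.1   # wheel radius
L = 0.25  # half the wheelbase length
W = 0.25  # half the wheelbase width

def wheel_speeds(vx, vy, wz):
    """Angular speeds (rad/s) of the four 45-degree-roller wheels for
    a body velocity (vx forward, vy sideways, m/s) and spin rate wz
    (rad/s).  Signs assume one plausible handedness layout."""
    k = (L + W) * wz
    front_left  = (vx - vy - k) / R
    front_right = (vx + vy + k) / R
    rear_left   = (vx + vy - k) / R
    rear_right  = (vx - vy + k) / R
    return front_left, front_right, rear_left, rear_right
@end(verbatim)

With this layout, pure forward motion turns all four wheels alike, while the crabbing and turning-in-place modes described above emerge from the vy and wz terms.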
Uranus has four motors rather than six, no concentric shafts, and only a single, benign, redundant interaction mode among the wheels.

@section(A Manipulator for Door-opening) We've decided that visually locating, opening and passing through a door is an excellent task to guide development of advanced vision, planning and control work. To this end, we've designed and are building a special arm to be mounted on Uranus (fig 6). The arm design is simultaneously strong, light and low-power because it exploits the mobility of the robot. The arm has four joints: a vertical translational joint, rotational shoulder and elbow joints with vertical axes, and a rotating wrist. The redundancies between the shoulder and elbow joints and the rotation of the vehicle permit the robot to hold the door in a stable, open position while the body of the robot passes through the doorway.

The arm design makes use of the robot's strength to handle doors. The manipulator's joints are only lightly actuated, since the motors in the joints are used only for positioning the arm under no load. Once the gripper has secured a doorknob, the elbow joint becomes a totally passive pivot and the base joint is alternately locked into position and released. Neither joint's motor is actuated again until the arm has released the door. The gripper itself is constructed from a janitorial lightbulb extractor. This is a spring-loaded, cylindrical device with a sliding collar. With the collar retracted, the gripper is pushed over the lightbulb (or doorknob); when the collar is tightened the gripper holds fast. Our manipulator uses this gripper with a motorized collar.

@section(Mobility Control for Wheeled Mobile Robots) It has become clear to us that the complex mechanical designs of highly maneuverable wheeled mobile robots (WMRs), such as @i[Pluto] and @i[Uranus], require sophisticated co-ordinated controllers for effective motion control. Over-constrained multiple-wheeled robots, in particular, are a major challenge. We initially approached the problem by neglecting the motor interactions and designing independent control algorithms for each of the motors on @i[Pluto]. We found that only minimal mobility control was possible in this framework @cite(Muir84a). The severe motor interactions we observed provided a motivation to develop better control algorithms.

@subsection(Pulse-Width Modulation Control of Brushless DC Motors) During the design of @i[Pluto]'s controller we developed a @b[p]ulse-@b[w]idth @b[m]odulation control mode for brushless DC motors @cite(Muir84). Digital pwm servo controllers for these motors are interesting because their high torque-to-weight ratio, ease of computer control, efficiency, and simple drive circuitry make them very suitable for robots. We switch the motor windings with transistors controlled directly by microprocessor logic signals, and achieve linear control by pulse-width modulation of the stator coil voltages. We found that the motors can be modelled by linear discrete-time transfer functions, with the pulse-width playing the role of the control signal, if the pulse period is chosen much smaller than the time-constants of the motors. These models allow us to use classical control engineering methods in the design of our motor control systems.

@subsection(Wheeled Mobile Robot Simulations for Controller Design Studies)

@begin(comment) Continued research in the area of controller design for WMRs is aimed at developing techniques which are generally applicable and which provide better control.
The present approach to the problem dictates that accurate kinematic and dynamic models of the robots be developed. These models form the basis of computer simulations of the robots on which proposed control strategies can be tested. Using computer simulations, we will have the ability to evaluate the performance of a robot-controller combination before spending much time, money, and effort in the actual mechanical and electrical construction. Adaptive control algorithms show promise for providing better robot control by having the ability to adapt both to coupling forces and torques from other motors and a changing environment (i.e. floor surface or robot load). The controllers which demonstrate the best performance in simulations will be implemented on actual robots to verify both the accuracy of the simulations and the performance of the controllers. @end(comment)

Our experience with Pluto prompted a systematic study of the problem of controlling wheeled mobile robots, both for Pluto's sake and for future designs. Our first result is the kinematics half of a calculus for general wheeled drives similar to one developed in the last decade for describing the kinematics and kinetics of manipulators. The problem is different because closed-link chains are present in wheel systems and because of the peculiar nature of the interaction between a wheel and the floor. The calculus will aid development of accurate kinematic and dynamic models of the robots, allowing simulations for testing proposed control strategies. Our tentative solutions for control of Pluto all require high bandwidth communication between the individual motor controllers. Adaptive control algorithms seem promising because they have the ability to adapt both to coupling forces and torques from other motors and to a changing floor surface or robot load.

@section(Perception and Planning for Obstacle Avoidance) The obstacle avoidance task prompted our first major work on robot perception. At the broadest level, the perception problem has two main components: understanding how to use individual sensors and understanding how to combine multiple sensors in a single system. We have addressed the first problem by developing rudimentary navigation systems that use vision and sonar separately. This section describes these systems as they exist and outlines our plans for extending their separate capabilities. We expect to integrate the two systems in the near future.

@subsection(Stereo Vision) Our stereo system continues the work done by Moravec with the Stanford Cart. The basic task requires the robot to navigate from its initial position to a specified goal location, avoiding any obstacles in between. Stereo is used to detect obstacles and estimate the motion of the vehicle (actually avoiding the obstacles is discussed later under path planning). The Cart approach is to detect local, high variance @i[features] in one image, to use stereo correspondence to determine the three-dimensional positions of the features, and to track the features over time to determine the motion of the vehicle. Our work with these algorithms has focussed on the following issues:

@begin(itemize)
the number of stereo images used at each point in time

the @i[interest operator] used to pick features

the algorithm used for tracking
@end(itemize)

After reviewing the algorithms used by the Stanford Cart, we will discuss each of these issues in turn.
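As geometric background for what follows, the three-dimensional position of a matched feature comes from triangulation. The sketch below is our own illustration for an idealized, rectified two-camera pair (Python; the camera model and all names are assumptions, not the system's code; the Cart itself triangulated by histogramming match locations over nine images):

@begin(verbatim)
def triangulate(xl, yl, xr, f, b, cx, cy):
    """3-D position of a feature seen at (xl, yl) in the left image
    and xr in the right, assuming parallel cameras with focal length
    f (pixels), baseline b (meters), and principal point (cx, cy)."""
    d = xl - xr                 # disparity, in pixels
    if d <= 0:
        raise ValueError("feature at infinity or mismatched")
    z = f * b / d               # depth grows as disparity shrinks
    x = (xl - cx) * z / f
    y = (yl - cy) * z / f
    return x, y, z
@end(verbatim)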
@paragraph(Vision in the Stanford Cart) The Stanford Cart used nine-way stereo at each robot position to detect and track obstacles. An @i[interest operator] was applied to the center image to pick features, then a coarse to fine correlation process was applied to locate the features in the other eight images. Histogram-based triangulation from the set of match locations provided the initial three-dimensional obstacle positions. Obstacles were tracked as the robot moved by applying the correlator to the new center image to reacquire the old features. Following this, the features were matched in the other new images to obtain distances to the obstacles from the new robot location. Lastly, least squares was used to find a best fit transformation mapping the old feature locations into the new, thereby obtaining the vehicle motion. The whole system moved the Cart in one-meter steps, taking about 15 minutes per step on a DEC KL-10.

@paragraph(Number of Images) The redundant information provided by nine-way stereo in the Cart was used to improve the accuracy and reliability of triangulation. Increases in accuracy stemmed from the effects of averaging, the ability to detect gross errors in matches, and the effects of baseline. The baseline issue is an interesting one. The wider the baseline, the better the accuracy; however, wide baselines increase the difficulty of matching and reduce the amount of overlap in the fields of view, which is important when imaging very near objects. Using many images provides the best of both worlds by permitting a short baseline between adjacent images but a long baseline overall.

The great expense of redundant stereo prompted the use of only two-way stereo in our current system. When all available matching constraints are used, we find that the gross system performance is as good with only two-way stereo as it was with nine. Here we are using the epipolar constraint, minimum and maximum disparity limits based on distances of 1.5 meters to infinity, and constraints on feature reacquisition based on dead-reckoned estimates of vehicle motion. Gross performance, in terms of the accuracy of visual motion estimation, of course breaks down when the constraints are dropped. The system now runs in about 35 CPU seconds per step (three to four minutes of elapsed time) on a Vax-11/780. These results establish the viability of two-way stereo empirically, but the use of redundant images remains an interesting question. Two particular areas to be explored are the use of three cameras, which offers the ability to detect mismatches, and the use of the redundancy provided by motion. We expect to examine these areas in the future, both theoretically and empirically.

@paragraph(Interest Operators) The interest operator is designed to pick small patches or features in one image that can be reliably matched in another. In general, this requires that the patch exhibit high intensity variations in more than one direction to improve its localizability in another image. For example, edges show high variation in the direction of their gradient, but little variation in the direction of their length, making them poorly localizable in that direction. Ostensibly, a better interest operator will lead to a higher likelihood of correct matches. This has prompted a lot of effort to develop better operators @cite(Nagel70), @cite(Kitchen70), @cite(Bandyopadyay), @cite(Kass), but no convincing evidence exists that any one operator is superior.
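To make the idea concrete, the following is our own reconstruction of a Moravec-style operator (Python; the window size and names are illustrative assumptions, not the lab's implementation). A patch scores high only if shifting it in every direction changes it substantially:

@begin(verbatim)
import numpy as np

def moravec_interest(image, window=4):
    """Interest measure: for each pixel, the minimum over four
    directions of the sum of squared differences between a window
    and the same window shifted one pixel in that direction.  Flat
    regions and edges score low; corner-like patches score high."""
    img = image.astype(float)
    h, w = img.shape
    interest = np.zeros((h, w))
    shifts = [(0, 1), (1, 0), (1, 1), (1, -1)]  # horiz, vert, diagonals
    for y in range(window, h - window - 1):
        for x in range(window + 1, w - window - 1):
            patch = img[y - window:y + window + 1,
                        x - window:x + window + 1]
            ssds = []
            for dy, dx in shifts:
                shifted = img[y - window + dy:y + window + 1 + dy,
                              x - window + dx:x + window + 1 + dx]
                ssds.append(np.sum((patch - shifted) ** 2))
            interest[y, x] = min(ssds)
    return interest
@end(verbatim)

Features are then taken at local maxima of the interest measure, so that chosen patches are distinctive under a shift in any direction.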
We therefore evaluated the relative performance of a number of operators in the context of our system @cite(Thorpe84a). The operators used were those of Moravec, Kitchen and Rosenfeld, and several new operators we developed here. As a control, a set of features was also picked by hand. The criterion used in assessing the performance of an operator was the number of features, from an initial set of forty picked by the operator, that could be correctly matched in another image. Here correct means that the match location was within a pixel or two of the subjectively best match as judged by the experimenter. Results were averaged over a number of trials with different images. Experiments were also run with and without the constraint offered by epipolar lines and disparity limits.

We found that rates of matching success showed very little variation between the better operators, which included the Moravec and Kitchen and Rosenfeld operators and two of our new ones. The rates varied from about 60% correct in difficult images with no matching constraint up to over 90% when all constraints were used in less difficult images. On the whole, the Moravec operator performed slightly better than other operators and only a little worse than manual feature selection. More importantly, we found that the improvement bought by the use of constraint was much more pronounced than that obtained by using different operators. We conclude that our research emphasis should no longer be placed on operators, since the Moravec operator is cheaper and at least as effective as other candidates, but should instead be placed on getting the most out of the available constraints and image redundancy (as results from motion, for example).

@paragraph(Tracking and Motion Estimation) The Stanford Cart tracked features and estimated the motion of the vehicle as separate operations. Tracking was performed by reacquiring the features one at a time with the coarse to fine correlator. Bad matches were pruned at this stage with a heuristic that required inter-feature distances among the reacquired features to be similar to the distances among the old features. Motion estimation was then done by finding the transformation that minimized the least squared error between new and old feature positions.

This approach is unsatisfactory for two reasons. First, it makes poor use of the assumption that objects in the environment do not move. This assumption is the justification for fitting a single transformation to the feature data to estimate vehicle motion. By reacquiring features one at a time, with no propagation of constraint between features, this assumption is initially ignored. It is only later, in the pruning heuristic and motion solving, that it comes into play. The second objection to the Cart approach is that it throws intensity information away too early. The basic goal of the motion solver is to find a transformation that optimizes the correlation between features picked at a previous step and their projected position in the current images. Despite the best efforts of the interest operator, correlation peaks for individual features may be fairly broad, so that it makes little difference locally which pixel in a small region is chosen as the match. However, picking a single location can skew the global transformation.
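For concreteness, the single transformation at issue can be recovered from matched sets of three-dimensional feature positions by a standard least-squares rigid alignment. The sketch below is our illustration using the SVD construction, not necessarily the Cart's exact numerical procedure; it assumes two N-by-3 numpy arrays of corresponding positions:

@begin(verbatim)
import numpy as np

def fit_rigid_transform(old_pts, new_pts):
    """Least-squares rotation R and translation t taking old_pts onto
    new_pts (both N x 3 arrays of matched feature positions), so that
    new ~ R @ old + t."""
    old_c = old_pts.mean(axis=0)
    new_c = new_pts.mean(axis=0)
    H = (old_pts - old_c).T @ (new_pts - new_c)   # 3x3 covariance
    U, S, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = new_c - R @ old_c
    return R, t
@end(verbatim)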
The first objection can be addressed by using dead-reckoned estimates of vehicle motion to constrain feature reacquisition. This requires some tolerance to allow for errors in the dead-reckoned estimate, however, and in Neptune the tolerance must be fairly large. A better approach that addresses both objections has been developed by Lucas @cite(Lucas84). His method sets the error function for a transformation to be the squared difference of intensity between a pixel in the previous image and its projected location in the current image, summed over all features. Starting from the dead-reckoned motion estimate, Newton iteration is employed to find a transformation that minimizes this error function. Greater tolerance for errors in the initial estimate is obtained by applying the algorithm first to blurred versions of the image, then to successively sharper images. Lucas has shown the algorithm to work well, with synthetic and real images, for a single step of motion when the feature distances are given @i[a priori]. We are currently adapting the algorithm for use in our system.

@subsection(Sonar) We began our work with sonar mapping because direct sonar range measurements promised to provide basic navigation and denser maps with considerably less computation than vision. To this end a ring of 24 of the readily available Polaroid ultrasonic range transducers was mounted on Neptune. The Polaroid units have a useful measuring range of 0.9 to 35.0 ft., and are sensitive in a conical 30@+(o) region in front of the sensor. We are using the analog control circuitry provided with the unit. The system is optimized for giving the range of the nearest sound reflector in its field of view, and works well for this purpose.

@paragraph(Mapping) The sonar depth measurements have a certain imprecision. Specular reflections at a low angle of incidence produce wrong readings. On smooth surfaces the sonar beam may not return directly to the sensor after hitting an object, but can instead suffer multiple reflections, giving completely erroneous range measurements. More importantly, because of the wide angle of the sonar beam, an isolated sonar reading imposes only a very loose constraint on the position of the detected object. These conditions led us to consider a probabilistic approach to the interpretation of range data and the building of sonar-based maps.

@paragraph(Sonar Mapping) We begin our map building by eliminating some incoming readings with a number of consistency heuristics to remove the most obvious problems. Because of the wide beam angle the filtered data provides only indirect information about the location of the detected objects. We combine the constraints from individual readings to reduce the uncertainty. Our inferences are represented as probabilities in a discrete grid. A range reading is interpreted as providing information about space volumes that are probably @c(EMPTY) and somewhere @c(OCCUPIED). We model the sonar beam by two probability distribution functions. These functions model our confidence that the various points inside the cone of the beam are empty and our uncertainty about the location of the point that caused the echo. The functions are based on the range reading and on the spatial sensitivity pattern of the sonar. The @i(empty) function fills the volume of the cone of the reading while the @i(occupied) one covers the range surface (front) of the cone. Both functions have values in the range zero to one, representing certainty that a point in space is empty or occupied. These probability density functions are now projected on a horizontal plane to generate map information.
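A minimal sketch of this style of beam model and map update follows (Python; the profile shapes, cell size, and range-surface thickness are our assumptions for illustration, not the published formulas; the narrowing and renormalization of occupied profiles against empty evidence, described below, is omitted for brevity):

@begin(verbatim)
import math

CELL = 0.5                       # grid cell size in feet (assumed)
HALF_WIDTH = math.radians(15.0)  # half of the 30 degree beam cone

def update_map(empty_map, occ_map, sx, sy, heading, rng, eps=0.5):
    """Project one sonar reading (sensor at (sx, sy), pointing along
    'heading', measured range 'rng') onto rasterized empty/occupied
    evidence grids.  Cells inside the cone and short of the range
    reading gather empty evidence; cells near the range surface
    gather occupied evidence.  Evidence from successive readings
    combines by probabilistic addition: p <- p + q*(1 - p)."""
    rows, cols = len(empty_map), len(empty_map[0])
    for iy in range(rows):
        for ix in range(cols):
            dx = (ix + 0.5) * CELL - sx
            dy = (iy + 0.5) * CELL - sy
            d = math.hypot(dx, dy)
            if d < 1e-6 or d > rng + eps:
                continue
            off = abs((math.atan2(dy, dx) - heading + math.pi)
                      % (2.0 * math.pi) - math.pi)
            if off > HALF_WIDTH:
                continue               # outside the beam cone
            angular = 1.0 - (off / HALF_WIDTH) ** 2  # fades at cone edge
            if d < rng - eps:          # cone interior: probably empty
                q = angular * (1.0 - (d / rng) ** 2)
                empty_map[iy][ix] += q * (1.0 - empty_map[iy][ix])
            else:                      # near range surface: occupied
                q = angular * (1.0 - ((d - rng) / eps) ** 2)
                occ_map[iy][ix] += q * (1.0 - occ_map[iy][ix])
@end(verbatim)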
Sonar maps are two-dimensional arrays of cells corresponding to a horizontal grid imposed on the area to be mapped. In each cell we store information that describes its status (@c(UNKNOWN), @c(EMPTY) or @c(OCCUPIED)) and the associated certainty factors. Occupancy probabilities from individual readings are projected onto the sonar map and combined with information already stored there. The position and the orientation of the sonar sensor are used to register the beam with the map during this projection process.

Each sonar reading provides partial evidence about a map cell being @c(OCCUPIED) or @c(EMPTY). This evidence is combined with existing data to refine the status of each cell. The @i(evidence combination rules) that control this process allow the new evidence to enhance or weaken existing hypotheses. Different readings asserting that a cell is @c(EMPTY) will enhance each other, as will readings implying that the cell is @c(OCCUPIED); evidence that a cell is @c(EMPTY) will weaken the certainty of its being @c(OCCUPIED), and vice-versa.

The operations performed on the empty and occupied areas are not symmetrical. The probability distribution for @i(empty) areas represents a solid volume whose totality is probably empty, but the @i(occupied) probability distribution for a single reading represents our lack of knowledge about the location of a single reflecting point somewhere in the range of the distribution. Empty regions are simply added using a probabilistic addition formula. The @i(occupied) probabilities for a single reading, on the other hand, are first reduced in the areas that the other data suggest are empty, then normalized to make their sum unity. Only after this narrowing process are the @i(occupied) probabilities from each reading combined using the addition formula.

One range measurement contains only a little information. By combining the evidence from many readings as the robot moves in its environment, the area known to be empty is expanded. The number of regions known to contain an occupied cell increases, while the range of uncertainty in each such region decreases. The overall effect as more readings are added is a gradually increasing coverage along with an increasing precision in the object locations. Typically after a few hundred readings (and less than a second of computer time) our process is able to "condense out" a comprehensive map covering a thousand square feet with better than one foot position accuracy of the objects detected. Note that such a result does not violate information theoretic or degree of freedom constraints, since the detected boundaries of objects are linear, not quadratic, in the dimensions of the map. A thousand square foot map may contain only a hundred linear feet of boundary.

@paragraph(Map Matching) To navigate by sonar maps we need a procedure that can match two maps and report the displacement and rotation that best takes one into the other. Our most successful procedures first simplify the maps by subtracting the empty probability from the occupied probability in each map cell, producing a single number in the range -1 to +1 which is negative if the cell is empty, positive if occupied and zero if unknown. A measure of the goodness of the match between two maps at a trial displacement and rotation is found by computing the sum of products of corresponding cells in the two maps.
Note that an occupied cell falling on an occupied cell contributes a positive increment to the sum, as does an empty cell falling on an empty cell (the product of two negatives). An empty cell falling on an occupied one reduces the sum, and any comparison involving an unknown value causes neither an increase nor a decrease.

This naive approach is very slow. Applied to maps with a linear dimension of @i(n), each trial position requires @p(O)(@i(n)@+(2)) multiplications. Each search dimension (two axes of displacement and one of rotation) requires @p(O)(@i(n)) trial positions. The total cost of the approach thus grows as @p(O)(@i(n)@+(5)). With a typical @i(n) of 50 this approach can burn up a good fraction of an hour of Vax time.

Considerable savings can be extracted from the observation that most of the information in the maps is contained in the occupied cells alone. Typically only @p(O)(@i(n)) cells in the map, corresponding to wall and object boundaries, are labelled occupied. A revised matching procedure comparing maps A and B through trial transformation T (represented by a 2x2 rotation matrix and a 2 element displacement vector) enumerates the occupied cells of A, transforming the co-ordinates of each such cell in A through T to find a corresponding cell in B. The [A, B] pairs obtained this way are multiplied and summed, as in the previous paragraph's procedure. In addition the occupied cells in B are enumerated and multiplied with corresponding cells in A, found by transforming the B co-ordinates through T@+(-1) (the inverse of T), and these products are also added to the sum. The result is normalized by dividing by the number of terms in the sum. This procedure is implemented efficiently by preprocessing each sonar map to give both a raster representation and a linear list of the co-ordinates of occupied cells. The cost grows as @p(O)(@i(n)@+(4)), and the typical Vax running time is down to a few minutes.

A further speedup is made possible by generating a hierarchy of reduced resolution versions of each map. A coarser map is produced from a finer one by converting two by two subarrays of cells in the original into single cells of the reduction. Our existing programs assign the maximum value found in the subarray as the value of the result cell, thus preserving occupied cells. If the original array has dimension @i(n), the first reduction is of size @i(n)/2, the second of @i(n)/4 and so on. A list of occupied cell locations is produced for each reduction level so that the matching method of the previous paragraph can be applied. The maximum number of reduction levels is @b(log)@-(2)@i(n). A match found at one level can be refined at the next finer level by trying only about three values of each of the two translational and one rotational parameters, in the vicinity of the values found at the coarser level, for a total of 27 trials. With a moderate @i[a priori] constraint on the transformation this amount of search is adequate even at the first (coarsest) level. Since the cost of a trial evaluation is proportional to the dimension of the map, the coarse matches are inexpensive in any case. Applied to its fullest, this method brings the matching cost down to a growth rate only slightly larger than @p(O)(@i(n)), and practical Vax matching times to under a second.
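A minimal sketch of the occupied-cell-list comparison at the heart of this scheme follows (Python; the data layout and names are our assumptions for illustration, not the actual programs):

@begin(verbatim)
import numpy as np

def match_score(map_a, map_b, occupied_a, occupied_b, R, t):
    """Goodness of match between combined maps A and B (cell values
    in [-1, +1]: negative = empty, positive = occupied, zero =
    unknown) under the trial transformation T(x) = R x + t taking A
    into B.  Only the occupied cell lists are enumerated, as in the
    O(n^4) procedure; products of corresponding cells are summed and
    the result normalized by the number of terms."""
    Rinv, tinv = R.T, -(R.T @ t)   # inverse of the rigid transform
    total, terms = 0.0, 0
    # Occupied cells of A through T, then of B through T inverse.
    for pts, src, dst, rot, disp in (
            (occupied_a, map_a, map_b, R, t),
            (occupied_b, map_b, map_a, Rinv, tinv)):
        rows, cols = len(dst), len(dst[0])
        for (x, y) in pts:
            u, v = np.rint(rot @ np.array((x, y), float)
                           + disp).astype(int)
            if 0 <= u < cols and 0 <= v < rows:
                total += src[y][x] * dst[v][u]
                terms += 1
    return total / max(terms, 1)
@end(verbatim)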
We found that one further preprocessing step is required to make the matching process work in practice. Raw maps at standard resolutions (6 inch cells) produced from moderate numbers (about 100) of sonar measurements have narrow bands of cells labelled @i(occupied). In separately generated maps of the same area the relative positions of these narrow bands shift by as much as several pixels, making good registration of the occupied areas of the two maps impossible. This can be explained by saying that the high spatial frequency components of the position of the bands are noise; only the lower frequencies carry information. The problem can be fixed by filtering (blurring) the @i(occupied) maps to remove the high frequency noise. Experiment suggests that a map made from 100 readings should be blurred with a spread of about two feet, while for a map made from 200 readings a one foot smear is adequate. Blurring increases the number of cells labelled @i(occupied). So as not to increase the computational cost from this effect, only the final raster version of the map is blurred. The occupied cell list used in the matching process is still made from the unfiltered raster. With the full process outlined here, maps with about 3000 six inch cells made from 200 well spaced readings of a cluttered 20 foot room can be matched with an accuracy of about six inches displacement and three degrees rotation in one second of Vax time.

@subsection(Path Planning) @tag(intro) Path Relaxation is a two-step path-planning process for mobile robots. It finds a safe path for a robot to traverse a field of obstacles and arrive at its destination. The first step of path relaxation finds a preliminary path on an eight-connected grid of points. The second step adjusts, or "relaxes", the position of each preliminary path point to improve the path.

One advantage of path relaxation is that it allows many different factors to be considered in choosing a path. Typical path planning algorithms evaluate the cost of alternative paths solely on the basis of path length. The cost function used by Path Relaxation, in contrast, also includes how close the path comes to objects (the further away, the lower the cost) and penalties for traveling through areas out of the field of view. The effect is to produce paths that neither clip the corners of obstacles nor make wide deviations around isolated objects, and that prefer to stay in mapped terrain unless a path through unmapped regions is substantially shorter. Other factors, such as sharpness of corners or visibility of landmarks, could also be added for a particular robot or mission.

A cost function describes how desirable it is to have a path go through a given point. This function can be thought of as a terrain map, and the vehicle as a marble rolling towards the goal. The terrain (cost function) consists of a gradual slope towards the goal, hills with sloping sides for obstacles, and plateaus for unexplored regions. The height of the hills reflects the confidence that there really is an object there. Hill diameter depends on robot precision: a more precise robot can drive closer to an object, so the hills will be tall and narrow, while a less accurate vehicle will need more clearance, requiring wide, gradually tapering hillsides. Using this analogy, the first step is a global grid search that finds a good valley for the path to follow. The second step is a local relaxation step that moves the nodes in the path to the bottom of the valley in which they lie.
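The terrain analogy translates directly into a cost grid. A minimal sketch follows (Python; the hill radius, hill height, and unmapped-region penalty are illustrative assumptions, not the system's tuned values):

@begin(verbatim)
import math

def build_cost_grid(n, obstacles, unmapped, hill_radius=4,
                    hill_height=10.0, unmapped_cost=3.0):
    """Terrain-style cost function on an n x n grid: one unit
    everywhere for distance, a sloping 'hill' around each obstacle
    cell, and a 'plateau' over unexplored cells."""
    cost = [[1.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            if (x, y) in unmapped:
                cost[y][x] += unmapped_cost    # plateau
            for (ox, oy) in obstacles:
                d = math.hypot(x - ox, y - oy)
                if d < hill_radius:            # sloping hillside
                    cost[y][x] += hill_height * (1.0 - d / hill_radius)
    return cost
@end(verbatim)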
@paragraph(Grid Search) @tag(gridsearch) Once the grid size has been fixed, the next step is to assign costs to paths on the grid and then to search for the best path along the grid from the start to the goal. "Best", in this case, has three conflicting requirements: shorter path length, greater margin away from obstacles, and less distance in uncharted areas. These three are explicitly balanced by the way path costs are calculated. A path's cost is the sum of the costs of the nodes through which it passes, each multiplied by the distance to the adjacent nodes. (In a 4-connected graph all lengths are the same, but in an 8-connected graph we have to distinguish between orthogonal and diagonal links.) The node costs consist of three parts to explicitly represent the three conflicting criteria.

@begin(enumerate)
Cost for distance. Each node starts out with a cost of one unit, for length traveled.

Cost for near objects. Each object near a node adds to that node's cost. The nearer the obstacle, the more cost it adds. The exact slope of the cost function will depend on the accuracy of the vehicle (a more accurate vehicle can afford to come closer to objects), and the vehicle's speed (a faster vehicle can afford to go farther out of its way), among other factors.

Cost for being within or near an unmapped region. The cost for traveling in an unmapped region will depend on the vehicle's mission. If this is primarily an exploration trip, for example, the cost might be relatively low. There is also a cost added for being near an unmapped region, using the same sort of function of distance as is used for obstacles. This provides a buffer to keep paths from coming too close to potentially unmapped hazards.
@end(enumerate)

The first step of Path Relaxation is to set up the grid and construct the list of obstacles and the vehicle's current position and field of view. @foot(In this implementation, there are two types of obstacles: polygonal and circular. Currently, the circular obstacles are used for points found by @c(FIDO)'s vision system, each bounded by a circular error limit, and the polygons are used for the field of view. The vision system will eventually give polygonal obstacles, at which point both the obstacles and the field of view will be represented as polygons and the circular obstacles will no longer be needed.) The system can then calculate the cost at each node, based on the distances to nearby obstacles and whether that node is within the field of view or not. The next step is to create links from each node to its 8 neighbors. The start and goal locations do not necessarily lie on grid points, so special nodes need to be created for them and linked into the graph. The system then searches this graph for the minimum-cost path from the start to the goal, as sketched below. The search itself is a standard A* @cite(nilsson71) search. The estimated total cost of a path, used by A* to pick which node to expand next, is the sum of the cost so far plus the straight-line distance from the current location to the goal. This has the effect, in regions of equal cost, of finding the path that most closely approximates the straight-line path to the goal.
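A compact sketch of such a search follows (Python; illustrative, not the system's implementation; the link cost here uses the mean of the endpoint node costs times the link length, one reading of the cost rule above, and the straight-line heuristic never overestimates because every node costs at least one unit per unit length):

@begin(verbatim)
import heapq, math
from itertools import count

def grid_search(cost, start, goal):
    """A*-style search on an 8-connected grid.  cost[y][x] is the
    node cost; start and goal are (x, y) tuples on the grid."""
    n = len(cost)
    def h(p):                          # straight-line distance to goal
        return math.hypot(goal[0] - p[0], goal[1] - p[1])
    tie = count()                      # tiebreaker for the heap
    frontier = [(h(start), next(tie), 0.0, start, None)]
    parent, done = {}, set()
    while frontier:
        _, _, g, node, prev = heapq.heappop(frontier)
        if node in done:
            continue
        done.add(node)
        parent[node] = prev
        if node == goal:               # walk back to recover the path
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (dx or dy) and 0 <= nx < n and 0 <= ny < n \
                        and (nx, ny) not in done:
                    step = math.hypot(dx, dy)    # 1 or sqrt(2)
                    g2 = g + step * 0.5 * (cost[y][x] + cost[ny][nx])
                    heapq.heappush(frontier, (g2 + h((nx, ny)),
                                              next(tie), g2,
                                              (nx, ny), node))
    return None                        # goal unreachable
@end(verbatim)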
@paragraph(Relaxation) @tag(iterative) Grid search finds an approximate path; the next step is an optimization step that fine-tunes the location of each node on the path to minimize the total cost. One way to do this would be to precisely define the cost of the path by a set of non-linear equations and solve them simultaneously to analytically determine the optimal position of each node. This approach is not, in general, computationally feasible. The approach used here is a relaxation method. Each node's position is adjusted in turn, using only local information to minimize the cost of the path sections on either side of that node. Since moving one node may affect the cost of its neighbors, the entire procedure is repeated until no node moves farther than some small amount.

Node motion has to be restricted. If nodes were allowed to move in any direction, they would all end up at low cost points, with many nodes bunched together and a few long links between them. This would not give a very good picture of the actual cost along the path. So in order to keep the nodes spread out, a node's motion is restricted to be perpendicular to a line between the preceding and following nodes. Furthermore, at any one step a node is allowed to move no more than one unit. As a node moves, all three factors of cost are affected: distance traveled (from the preceding node, via this node, to the next node), proximity to objects, and relationship to unmapped regions. The combination of these factors makes it difficult to directly solve for the minimum cost node position. Instead, a binary search is used to find that position to whatever accuracy is desired.

The relaxation step has the effect of turning jagged lines into straight ones where possible, of finding the "saddle" in the cost function between two objects, and of curving around isolated objects. It also does the "right thing" at region boundaries. The least cost path crossing a border between different cost regions will follow the same path as a ray of light refracting at a boundary between media with different transmission velocities. The relaxed path will approach that path.

@paragraph(Example Run) The figure shows a run using real data. Objects are represented as little circles, where the size of the circle is the positional uncertainty of the stereo system. The numbers are not all consecutive, because some of the points being tracked are on the floor or are high off the ground, and therefore aren't obstacles. The dotted lines surround the area not in the field of view; this should extend to negative infinity. The start position of the robot is approximately (0, -.2) and the goal is (0, 14.5). The grid path is marked with 0's. After one iteration of relaxation, the path is marked by 1's. After the second relaxation, the path is marked by 2's. The greatest change from 1 to 2 was less than .3 meters, the threshold, so the process stopped. The size of the "hills" in the cost function is 1 meter, which means that the robot will try to stay 1 meter away from obstacles unless that causes it to go too far out of its way.

@begin(figure) @presspicture (file = "sumpic.press", height = 6.0 in) @caption() @end(figure)

@section(Global Planning) We are building navigating robot systems that use these sonar and vision derived maps in a more general context.
General maps can be classified along several dimensions:

@begin(itemize)
@c(The Geographical Dimension): We identify @i(Views), which are maps generated from scans taken from one position; @i(Local Maps), which correspond to physically delimited spaces such as corridors, labs or offices, and which typically define a connected region of visibility; and @i(Global Maps), which store information about the whole known environment of operation of the robot.

@c(The Abstraction Dimension): This begins with the @i(Sensor) level, which holds the low-level, data-intensive representations. Above that, we have the @i(Symbolic) level, where the Sensor data is abstracted, interpreted and compacted, and where general information about rooms, passageways, corridors, and the position of major obstacles is stored. A graph-like representation will be used which bears only a topological equivalence to the real world. @i(Nodes) correspond to "interesting" areas, where there is enough sonar information to allow recognition, and enough complexity to make more sophisticated path-planning and navigation necessary. @i(Links) represent "uninteresting" areas which lack detail, such as corridors, and where navigation is mostly done by simple dead-reckoning and wall-following schemes. Global path-planning will occur on this level.

@c(The Resolution Dimension): Our finest sonar maps are coarser (about half-foot resolution) than those we usually get from vision. Nevertheless several operations on the maps are expensive and are done more quickly and surely at lower levels of resolution. For those needs we reduce our raw maps by a kind of averaging that results in a coarser grid.
@end(itemize)

@section(New Work) We have begun work in a major new area, road following systems for the DARPA Autonomous Land Vehicles program. The goals of the DARPA program begin with following well defined roads with no intersections or obstacles, then progress to navigation and obstacle avoidance in road networks and eventually to navigation in open terrain. We are working on this in co-operation with a Civil Engineering group led by William Whitaker. The Terragator is a large mobile robot built by Whitaker's group. Powered by an onboard gasoline fueled generator, it is designed for long outdoor journeys. We equipped it with a television camera and microwave TV link, and wrote a simple but fast program that drives it along roads in real time, visually tracking the left and right edges. One interesting early discovery is that the most effective strategy for getting and keeping the vehicle on the road is to keep the road image centered in a forward-looking camera. We are extending this work to longer journeys and faster speeds, and plan to incorporate obstacle detection, landmark recognition and long range navigation. The effort complements our other projects and is a natural application of a number of the techniques we have been developing.

@section(Conclusion) The most consistently interesting stories are those about journeys, and the most fascinating organisms are those that move from place to place. These observations are more than idiosyncrasies of human psychology; they illustrate a fundamental principle. The world at large has great diversity, and a traveller constantly encounters novel circumstances and is consequently challenged to respond in new ways. Organisms and mechanisms do not exist in isolation, but are systems with their environments, and those on the prowl in general have a richer environment than those rooted to one place.
Mobility supplies danger along with excitement. Inappropriate actions, or the lack of well-timed appropriate ones, can result in the demise of a free roamer, say over the edge of a cliff, far more easily than of a stationary entity for whom particular actions are more likely to have fixed effects. Challenge combines with opportunity in a strong selection pressure that drives an evolving species that happens to find itself in a mobile way of life in certain directions, directions quite different from those of stationary organisms. The last billion years on the surface of the earth has seen a grand experiment exploring these pressures. Besides the fortunate consequence of our own existence, some universals are apparent from the results to date and from the record. In particular, intelligence seems to follow from mobility.

The same pressures seem to be at work in the technological evolution of robots, and it may be that mobile robots are the best route to solutions to some of the most vexing unsolved problems on the way to true artificial intelligence - problems such as how to program common sense reasoning and learning from sensory experience. This opportunity carries a price - programs to control mobile robots are more difficult to get right than most - the robot is free to search the diverse world looking for just the combination that will foil the plans of its designers.

There's still a long way to go. We believe that our real-world experimental approach is teaching us the important lessons. It is our experience that many perceptual and control problems succumb to particular simple techniques, but that only a very small fraction of the plausible simple methods work in practice. Determining @i(which) method works often cannot be done theoretically, but succumbs easily to realistic experiments.