Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover, Hans Moravec, 1980
<-- Previous  Next -->

## Appendix 6: Correlation

The problem is, given two n by n arrays of pixels, referred to as $A_{i,j}$ and $B_{i,j}$, determine how alike they are.

One of the simplest comparisons is $$\sum_{i,j = 0,0}^{n,n}(A_{i,j}-B_{i,j})^2$$

If the camera did exact photometric measurements, and if lighting conditions remained unchanged between frames, this measure would be ideal. Unfortunately the camera response varies from place to place on the camera field due to the nature of the optics and complicated vidicon effects, from scene to scene because of target voltage response to illumination, and from time to time due to changes in battery voltage. In spite of these drawbacks the measure has been used successfully.

### Normalized Correlation

Another alternative is “normalized correlation,” the root mean square average displacement of co-ordinate pairs $(A_{i,j},B_{i,j})$ from a least squares regression line relating $A$ to $B$. Let $a_{i,j} = A_{i,j}-\overline{A}$ and $b_{i,j} = B_{i,j}-\overline{B}$, $\overline{A}$ and $\overline{B}$ being the means over the entire window. The normalized correlation coefficient is defined as: $$\sigma = {\sum{a_{i,j} b_{i,j}} \over \sqrt{\sum{a_{i,j}^ 2}\sum{b_{i,j}^ 2} }}$$

Normalized correlation is invariant under all linear transformations of the values of $A$ and $B$. The complete contrast insensitivity this implies is a needless waste of information, because contrast changes in actual pictures are relatively small from frame to frame. In addition, the measure has a degeneracy when all the values of either $A$ or $B$ are the same. This will happen often if the search window contains uniformly shaded areas, and must be handled by a special test. A different measure is called for.

### Pseudo-normalized Correlation

Although the correlation coefficient itself remains unchanged, the “line of best fit” is different when calculated from $A$ to $B$ than from $B$ to $A$. For the best fit of $A$ to $B$, $b = ka$, we wish to find $k$ which minimizes $\sum(ka–b)^2$. This is $k = \sum{ab}/\sum{a^2}$. The line can be expressed $$a = {{\sigma\sqrt{\sum{a^2}}}\over{\sqrt{\sum{b^2}}}}\;b$$ for the best fit of $B$ to $A$, $a = kb$, we wish to find $k$ which minimizes $\sum(kb–a)^2$. This is $k = \sum{ab}/\sum{b^2}$. The line can be expressed $$b = {{\sigma\sqrt{\sum{b^2}}}\over{\sqrt{\sum{a^2}}}}\;a$$ these are equivalent if $\sigma$ is unity, i.e. if the correlation is perfect. Since for our application, the comparison of two picture patches, there is no particular reason for choosing one over the other, we compromise and choose a line equidistant from the two, removing the asymmetry between $A$ and $B$: $$b = {\sqrt{\sum{b^2}}\over\sqrt{\sum{a^2}}}\;a$$ or equivalently $$a = {\sqrt{\sum{a^2}}\over\sqrt{\sum{b^2}}}\;b$$

We will use this to obtain a correlation measure more suited to our needs.

Small contrast changes should be tolerated, large ones should not. Contrast is expressed by the slope of the line of best fit, a slope of unity denoting a perfect match, slopes of zero and infinity indicating that one or the other of the patches has no variations in intensity. Consider the normalized dot product of the line of best fit with a line of unity slope. Since this is the cosine of the angle between them, it will be unity for a perfect match, and very near unity for small contrast differences, since the cosine is flat around zero angle, and zero for a negative correlation. If we were to multiply this dot product by the normalized correlation coefficient, we would obtain a measure which was unity only for perfect correlation with no contrast differences, dropping rapidly with lessening correlation, and slowly at first and then rapidly with increasing contrast differences.

This measure still has a few flaws. If one of the patches is without intensity variations, the normalized correlation coefficient becomes $0/0$ and the dot product $1/\sqrt{2}$. In fact we want our measure to be zero in this case. This can be arranged if we multiply the normalized coefficient by the cosine of twice the angle between the correlation line and a line of slope 1. This cosine will be zero for lines of slope zero or infinity, and the limit of the product with the normalized correlation coefficient will also be zero in those cases. In addition, perfect negative correlation will again appear as minus unity, which might be an advantage in some circumstances.

Deriving an expression for this measure, we note that the cosine of our “compromise” correlation line with one of unity slope is $${\sqrt{\sum{a^2}} + \sqrt{\sum{b^2}}}\over \sqrt{2 \left( \sum{a^2} + \sum{b^2} \right)}$$ To convert this to the cosine of twice the angle with $a=b$, we use the identity $$\cos(2t) \; = \; 2 \cos^2(t) - 1$$ giving us $${{2 \left(\sqrt{\sum{a^2}} + \sqrt{\sum{b^2}}\right) ^2}\over {2 \left(\sum{a^2} + \sum{b^2}\right)}} - 1 \quad = \quad {2 \sqrt{\sum{a^2}\sum{b^2}} \over \sum{a^2} + \sum{b^2}}$$ multiplying by the normalized correlation coefficient, we get

$${{2 \sqrt{\sum{a^2}\sum{b^2}}}\over{\sum{a^2} + \sum{b^2}}} \;\; {\sum{ab} \over \sqrt{\sum{a^2}\sum{b^2}} } \quad = \quad {2 \sum{ab} \over \sum{a^2} + \sum{b^2}}$$

Note that whereas normalized correlation consists of the sum of the pairwise products of $A$ and $B$ divided by the geometric mean of the sum of their squares, this new measure, referred to as pseudo-normalized correlation, is the sum of the products divided by the arithmetic mean of the sums of the squares. Since it involves an addition and a halving rather than a multiplication and a square root, it is actually easier to calculate. The prettiness of the result leads me to suspect that a more elegant derivation exists.

This pseudo-normalized measure works as well as any of the alternatives, and is easy to compute. It is the standard correlation measure in cart software.

<-- Previous  Next -->