Next: Dealing with Differences in Up: Computer Vision Previous: Internal Representation

Method 1: Template Matching

Our problem is to recognize that the Underground sign is visible given an array of grey-levels from the MTG's camera. One way of doing this is to store away an idealized image of an Underground sign, like that shown in figure 8.2, and to compare this image with the input image, looking for a match. This is somewhat analogous to a botanist trying to identify a particular specimen in a field by alternating gaze between the field and a textbook picture of the desired plant. We call the idealized image a template and represent it inside the computer as an array of grey-levels, in just the same way as we did for input images. Figure 8.4 shows the array of grey-levels representing the idealized image or template shown in figure 8.2.

[IMAGE ]
Figure 8.4: The array of grey-levels of the stored template.

Since they are in the same form, input images can be compared with the stored template by looking for the template's pattern of grey-levels within the array of grey-levels of the input image. An easy way of doing this is to slide the template over the input image from left to right and from top to bottom, moving a single pixel at a time, and at each position checking whether the numbers lying above one another in the two arrays are the same.
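The sliding comparison just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name find_exact_matches, the use of nested lists to hold grey-levels, and the tiny example arrays are all invented here and are not the arrays of figures 8.1, 8.2 or 8.4.

```python
def find_exact_matches(image, template):
    """Return the (row, column) of every position at which the template
    lies exactly over an identical patch of the image.

    Both arguments are assumed to be rectangular lists of lists of
    grey-levels, with the template no larger than the image.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    matches = []
    # Slide from top to bottom (rows) and left to right (columns),
    # one pixel at a time.
    for top in range(img_h - tpl_h + 1):
        for left in range(img_w - tpl_w + 1):
            # Compare the numbers lying above one another.
            if all(
                image[top + r][left + c] == template[r][c]
                for r in range(tpl_h)
                for c in range(tpl_w)
            ):
                matches.append((top, left))
    return matches


# Hypothetical grey-levels, chosen only to make the example small.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 5, 9, 0],
    [0, 5, 5, 5, 0],
    [0, 9, 5, 9, 0],
]
template = [
    [9, 5, 9],
    [5, 5, 5],
    [9, 5, 9],
]
print(find_exact_matches(image, template))  # prints [(1, 1)]
```

Note that the test at each position is for strict equality of every pair of grey-levels, so a single differing pixel is enough to rule a position out.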

Trace over the edges of the Underground sign in figure 8.1 and then compare your tracing with the template sign in figure 8.2. How would the template need to be altered so that its edges fit exactly under your tracing?

A moment's inspection of figures 8.1 and 8.2 should convince you that nowhere will the template and the input image match up exactly. Indeed we should be surprised if they did since in general it is very unlikely that two images of Underground signs will exhibit precisely the same array of grey-levels. There are at least three reasons for this:

The appearance of the Underground sign, and therefore the pattern of grey-levels, is crucially dependent on viewing angle and size. As we move away from the sign or as the sign gets smaller, its image becomes smaller, and as we move around the sign its image becomes elliptical. Try rotating a round saucer in front of your eyes and observe how its projected shape changes. It is quite difficult to convince oneself of this, since we are so used to perceiving things as they are in three dimensions. (For the sake of this discussion we have ignored perspective effects within images of Underground signs.)
The grey-levels are affected by the pattern of light falling on the surface of the sign. The same sign will look different in the morning, when the sun is rising in the East, from how it looks in the evening, when the sun is setting in the West.
No two Underground signs are exactly alike. There will always be imperfections and surface markings which distinguish one from the other (aside from the different types of sign which can be seen around London).
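To see how fragile an exact comparison is, consider the second reason in isolation. The sketch below is an illustration only: the helper names and the grey-levels are hypothetical, and a uniform brightening by one grey-level is an assumed, highly simplified stand-in for a real change in lighting.

```python
def patches_equal(a, b):
    """True if two equally sized arrays of grey-levels are identical."""
    return all(x == y
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def count_differences(a, b):
    """Number of pixel positions at which the two arrays disagree."""
    return sum(x != y
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


# Hypothetical template, and the same pattern one grey-level brighter.
template = [
    [9, 5, 9],
    [5, 5, 5],
    [9, 5, 9],
]
brighter = [[grey + 1 for grey in row] for row in template]

print(patches_equal(template, template))      # prints True
print(patches_equal(template, brighter))      # prints False
print(count_differences(template, brighter))  # prints 9
```

The pattern of the sign is unchanged, yet every one of the nine pixels now disagrees with the template, so a test for exact equality reports no match at all.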

In the case of the appearances of the signs in the stored template (figure 8.2) and input image (figure 8.1), the former has a higher contrast and extends over a greater number of pixels than the latter, and furthermore has a circular shape whereas the latter has an elliptical shape. There are also many details on the real sign which are not represented in our stored template; in particular, the word 'Underground' appears across the middle of the real sign. What we need to do is to transform one or both of the arrays of numbers (i.e., the input image and the template) to make them comparable.


Cogsweb Project: luisgh@cogs.susx.ac.uk