Before looking at the recognition method in more detail, we need a well-defined procedure for locating salient features without first recognizing what is depicted. Two kinds of feature have been particularly important in computer vision: edges and regions.
We have already met the notion of an edge earlier in this chapter, as an image contour along which there is a step change in intensity. For most everyday scenes, there is a close correspondence between such edges and the boundaries between objects and parts of objects in the scene. For example, in figure 8.7 there are edges corresponding to the markings and border of the Underground sign and other objects in the background. As we have discussed, edges are detected by first looking for edge elements and then grouping these elements into longer edges. For the purposes of matching image features with models, we must identify edges which `stand out' as single entities from the network of adjacent edge elements; otherwise there is no reason to expect that discovered edges will match with the edges specified in the model. A promising approach is to break the chains of adjacent edge elements at sharp corners and places where three or more chains are joined together (i.e., at junctions). For example, the half-rings of the Underground sign give rise to four smoothly curved chains of edge elements terminated abruptly where they join with edge elements derived from the central bar (see figure 8.7). Rather than pursue this further, we turn our attention to another kind of salient feature.
In some ways complementary to edges, regions are connected groups of pixels over which intensity or texture is nearly homogeneous. For example, although the grey-levels on the ring of the Underground sign differ from point to point, they are nevertheless confined to a narrow band of values ranging from around 20 to 45 (ignoring those `mixed' pixels on the border of the ring), whereas the background of the sign ranges from around 3 to 8.
This observation suggests a simple algorithm for detecting regions. First draw up a list of non-overlapping grey-level ranges corresponding to the desired regions, and then label each pixel according to the range in which its grey-level lies. The ranges must of course be chosen carefully, so that all pixels within an individual target region receive the same label. Unfortunately, they must also be chosen before a segmentation into meaningful regions has been derived; nevertheless, this can be done automatically. The idea is to examine the distribution of grey-levels in the input image and find ranges in which large numbers of values are clustered. The expectation is that each such cluster of values will be derived from one or more non-adjacent image regions.
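One way to find such clusters is sketched below: build a histogram of grey-levels and merge well-populated levels that lie close together into ranges. This is only an illustrative sketch, not necessarily the method the authors have in mind; the function name and the parameters min_count and max_gap are assumptions introduced here.

```python
from collections import Counter


def find_grey_level_ranges(image, min_count=2, max_gap=1):
    """Return a list of (low, high) grey-level ranges where values cluster.

    image     -- non-empty list of rows of integer grey-levels
    min_count -- a grey-level counts as 'populated' only if at least this
                 many pixels take that value (assumed threshold)
    max_gap   -- populated levels separated by at most this many
                 unpopulated levels are merged into one range (assumed)
    """
    # Histogram: how many pixels take each grey-level.
    histogram = Counter(value for row in image for value in row)

    # Keep only the grey-levels that occur often enough, in ascending order.
    populated = sorted(
        level for level, count in histogram.items() if count >= min_count
    )

    ranges = []
    for level in populated:
        if ranges and level - ranges[-1][1] <= max_gap + 1:
            # Close enough to the previous range: extend it.
            ranges[-1][1] = level
        else:
            # Too far from the previous range: start a new one.
            ranges.append([level, level])
    return [tuple(r) for r in ranges]


# A tiny made-up image, not the array of figure 8.3.
example = [
    [3, 4, 4, 3],
    [5, 5, 30, 31],
    [30, 31, 32, 32],
    [90, 3, 4, 60],
]
print(find_grey_level_ranges(example))
```

On a real image the two thresholds would need tuning, and smoothing the histogram before looking for clusters may give more stable ranges.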
Figure 8.9 shows the regions derived from the array shown in figure 8.3 using the above algorithm with grey-level ranges determined manually. Pixels with grey-levels in the range 3-8 are labelled 1, those in the range 20-45 are labelled 2, those in the range 45-80 are labelled 3 and those in the range 85-99 are labelled 4. Pixels with grey-levels falling outside all of the ranges are labelled 0. An array of labels such as that in figure 8.9, representing the segmentation of an image into regions, is known as a region map.
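The labelling step can be sketched in a few lines of Python, using the four ranges quoted above. The function names are illustrative. Note that the quoted ranges share the value 45; the sketch assumes that the first matching range wins, so a grey-level of 45 receives the label 2.

```python
# Grey-level ranges quoted in the text, listed in label order (1 to 4).
RANGES = [(3, 8), (20, 45), (45, 80), (85, 99)]


def label_pixel(grey, ranges=RANGES):
    """Return the label of the first range containing grey, or 0 if none does."""
    for label, (low, high) in enumerate(ranges, start=1):
        if low <= grey <= high:
            return label
    return 0  # grey-level falls outside every range


def region_map(image, ranges=RANGES):
    """Label every pixel of image (a list of rows of grey-levels)."""
    return [[label_pixel(grey, ranges) for grey in row] for row in image]


# A tiny made-up image, not the array of figure 8.3.
print(region_map([[5, 30, 45], [60, 90, 10]]))
```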
[IMAGE ]
Figure: A region map produced from the Underground sign of figure 8.1.
As required, most pixels on the ring have been assigned the label 2, those on the two half-discs within the ring the label 4, and those on the background the label 1. Notice, however, that many pixels in other parts of the image have also been assigned these labels. A large proportion of the pixels on the bar have the label 3, although some have been labelled 2 or 0 because this part of the image is not completely homogeneous.
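Because pixels in quite different parts of the image can share a label, a region map is usually split into connected groups of like-labelled pixels before the regions are used. The following sketch does this with a breadth-first flood fill. It assumes a non-empty rectangular map and 4-connectivity (pixels are neighbours only if they share a side); the text does not specify either, and the function name is illustrative.

```python
from collections import deque


def connected_regions(labels):
    """Split a region map into connected regions of equal label.

    labels -- non-empty list of equal-length rows of labels
    Returns (region_ids, count): an array of the same shape giving each
    pixel a region number, and the number of regions found.
    """
    rows, cols = len(labels), len(labels[0])
    region_ids = [[None] * cols for _ in range(rows)]
    count = 0

    for r in range(rows):
        for c in range(cols):
            if region_ids[r][c] is not None:
                continue  # already absorbed into an earlier region

            # Start a new region and flood-fill outwards from this pixel.
            region_ids[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and region_ids[ny][nx] is None
                        and labels[ny][nx] == labels[y][x]
                    ):
                        region_ids[ny][nx] = count
                        queue.append((ny, nx))
            count += 1

    return region_ids, count


# A tiny made-up region map: label 3 and label 2 each occur in two places.
small_map = [
    [1, 1, 2],
    [3, 1, 2],
    [2, 3, 3],
]
ids, n = connected_regions(small_map)
print(n)
print(ids)
```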
Figure 8.10 shows an alternative depiction of the region map, in which lines are drawn between pixels with different labels. This shows clearly that the half-rings and half-discs have been extracted by the segmentation algorithm as single regions. Unfortunately, the central bar of the sign appears as several regions, of which the largest `leaks' into the narrow band of mixed pixels on the inner boundary of the ring.
[IMAGE ]
Figure: An alternative depiction of the region map in figure 8.9.
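A depiction of this kind can be derived directly from a region map by recording a boundary segment wherever two side-by-side pixels carry different labels. The sketch below returns those segments rather than drawing them; the function name and the representation of a segment as the pair of pixels it separates are assumptions made for illustration.

```python
def label_boundaries(labels):
    """Return the boundary segments of a region map.

    labels -- non-empty list of equal-length rows of labels
    Each segment is a pair ((r1, c1), (r2, c2)) naming the two adjacent
    pixels, with different labels, that the line would be drawn between.
    """
    rows, cols = len(labels), len(labels[0])
    segments = []
    for r in range(rows):
        for c in range(cols):
            # Compare with the pixel to the right.
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                segments.append(((r, c), (r, c + 1)))
            # Compare with the pixel below.
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                segments.append(((r, c), (r + 1, c)))
    return segments


# A tiny made-up region map with one odd pixel in the corner.
print(label_boundaries([[1, 1], [1, 2]]))
```

Checking only the right-hand and lower neighbours of each pixel is enough to visit every adjacent pair exactly once.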
If you would like to read more about edge and region finding algorithms, there are useful sections on the subject in the book Computer Vision by Ballard and Brown (1982).