Christopher Tyler & Richard Miller
ARVO 1998
a. Symmetry Detection in Visual Images. We have embarked on a computational project to develop biologically motivated detectors for line segments and for local symmetry in noisy images. These detectors are inspired by biological detection strategies, including low accuracy for individual calculations, simulated local computation, massive redundancy, simultaneous use of multiple scales, and accumulation of disparate cues. Unlike other current approaches to line extension (e.g., Carpenter, Grossberg & Mehanian, 1989), our algorithm remains effective at signal-to-noise (S:N) ratios as low as 1:1, where the pattern symmetry is barely visible to human observers.
The symmetry algorithm makes use of a symmetry-axis extension procedure that finds local symmetry points by looking for lateral matches across candidate axes and then builds a symmetry axis by joining the symmetry points with a line extension procedure. The symmetry axis detection is enhanced by filling in small gaps when a continuity constraint is met. Our symmetry algorithm typically performs on a par with human observers in finding regions of local reflection symmetry relative to any of a number of symmetry axes. It reports the axes of the found regions and also grades their salience, usually giving answers with which a human observer can agree.
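The three steps just described (lateral matching, joining symmetry points, gap filling) can be illustrated with a minimal sketch. It is not the authors' implementation: it handles only a vertical axis at a single scale, and the function names, the mean-absolute-difference match score, the threshold, and the linear interpolation used for gap filling are all assumptions chosen for illustration.

```python
def mirror_score(row, c, half_width):
    """Lateral match about column c: 0 is a perfect mirror match, more negative is worse."""
    diffs = [abs(row[c - d] - row[c + d]) for d in range(1, half_width + 1)]
    return -sum(diffs) / len(diffs)


def symmetry_points(image, half_width, threshold):
    """For each row, return the best-matching axis column, or None if the match is poor."""
    points = []
    for row in image:
        candidates = range(half_width, len(row) - half_width)
        best = max(candidates, key=lambda c: mirror_score(row, c, half_width))
        ok = mirror_score(row, best, half_width) >= threshold
        points.append(best if ok else None)
    return points


def fill_gaps(points, max_gap, max_shift):
    """Bridge short runs of None when the axis columns on both sides agree (continuity)."""
    filled = list(points)
    i = 0
    while i < len(filled):
        if filled[i] is not None:
            i += 1
            continue
        j = i
        while j < len(filled) and filled[j] is None:
            j += 1
        # The gap is filled[i:j]; it needs a symmetry point on each side to be bridged.
        if (i > 0 and j < len(filled) and (j - i) <= max_gap
                and abs(filled[i - 1] - filled[j]) <= max_shift):
            for k in range(i, j):
                t = (k - (i - 1)) / (j - (i - 1))
                filled[k] = round(filled[i - 1] + t * (filled[j] - filled[i - 1]))
        i = j
    return filled
```

Note that a uniform row matches perfectly about every column, so a practical detector would also need to require some contrast in the matched region; that check is omitted here for brevity.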
b. Initial Detection of Face Location by Symmetry. Under the assumptions invoked for this approach, one property that generally distinguishes a face from its surroundings is its bilateral symmetry. If the head is turned too much, the symmetry of its image is degraded. If the illumination is asymmetric, the image symmetry is again degraded. Finally, if the head is tilted, symmetry is degraded relative to a particular axis of analysis (although it can readily be found if multiple axis orientations are analyzed). The question, therefore, is whether an algorithm based on facial symmetry is sufficiently robust to be of value in identifying faces against a background of other objects.
Fig. 1. Efficacy of the symmetry axis algorithm in finding faces
To implement the symmetry detection for the face recognition application, we included some prefiltering that enhanced edges relative to uniform areas, which has the effect of reducing the contamination by asymmetric illumination on the face. The illumination problem is considered in detail below, but the prefiltering is included at this stage to improve the robustness of the initial face detection process. To give an idea of the efficacy of the algorithm, we applied it to a group photograph (Fig. 1). The individuals vary in ethnicity, facial hair, glasses, expressions and poses, with several heads tilted substantially from the vertical. Their clothes and inter-face spaces constitute a fair sampling of non-face objects, although not the most crowded surroundings that one could imagine. The lighting is somewhat from the left, introducing substantially asymmetric shadowing in the images. (This asymmetry is not particularly evident in the reproduction; the nose shadowing shows clear asymmetries, but it is mostly masked by the symmetry axis markers.) Some of the faces are also quite asymmetrically illuminated due to shadowing behind a neighbor, although this effect is not so noticeable to our visual systems because we are adept at compensating for such illumination gradients.
In summary, the group photograph of Fig. 1 has a variety of real-life distortions that might be expected to cause problems for many current face-recognition algorithms. Nevertheless, the initial phase of face detection by means of the symmetry properties is remarkably successful. The algorithm at its current level of development picks up 19 of the 19 faces in this photograph with only about the same number of intrusions from non-face features. Obviously, the algorithm would also detect any other symmetric objects in a scene, such as vases or banisters, but it seems to be an effective initialization stage to pick out regions for further processing. Specific eye detectors, for example, would then be brought into play to weed out symmetric objects without facial features. This subsequent processing should eliminate all the intrusions in the image shown, since none of them has symmetric facial features on either side.
c. Compensating for Asymmetrical Illumination. The next demonstration illustrates an approach to decomposing an image into its surface albedo and illumination components. Fig. 2A shows a face with asymmetric illumination. Fig. 2B shows a weakly high-pass filtered version of this image, a standard manipulation that provides local normalization and reduces the low-frequency part of the illumination component but does not eliminate the shading. The axis of symmetry was found by the symmetry axis algorithm described in the previous section: its output is shown as the white `blaze' through the nose of the image in Fig. 2B. In Fig. 2C, the face has been reflected about its axis of facial symmetry after high-pass filtering, to provide the estimated outlines of the predominant features, independent of shading from the oblique illumination. This estimate of the segregated features is not perfect, but in general it seems to do an excellent job of isolating the outlines of the feature aspects of the original face. (Note that, without the symmetrizing, the high-pass filtered version in Fig. 2B still has substantial contrast energy on the non-feature regions of the face.) The illumination component of the same image is estimated in Fig. 2D, which consists of a low-pass version of Fig. 2A from which was subtracted the image of Fig. 2C. This fourth image provides a convincing estimate of how the face would look after removal of the reflective coloration of the features to leave the white marble appearance of a "classical sculpture", the image of a uniformly colored 3D mask lit from the side.
Fig. 2. Use of symmetry to segregate the feature information from the illumination shading. The `blaze' in B is the result of the symmetry-finding algorithm. Note its effectiveness despite detailed asymmetries in the filtered image.
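The decomposition of Fig. 2 can be sketched on a single image row. This is only an illustration under stated assumptions: the box filter stands in for whatever low-pass filter was actually used, "reflected about its axis" is interpreted as averaging the high-pass signal with its mirror image, and the names `box_blur`, `symmetrize` and `decompose` are hypothetical.

```python
def box_blur(row, radius):
    """Low-pass filter: mean over a sliding window, truncated at the row ends."""
    n = len(row)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(row[lo:hi]) / (hi - lo))
    return out


def symmetrize(row, axis):
    """Average each sample with its mirror about index `axis` (assumed reading of Fig. 2C)."""
    n = len(row)
    out = []
    for i in range(n):
        j = 2 * axis - i  # index of the mirrored sample
        out.append((row[i] + row[j]) / 2 if 0 <= j < n else row[i])
    return out


def decompose(row, axis, radius):
    """Return (features, illumination) estimates for one image row."""
    low = box_blur(row, radius)                     # low-pass version (cf. Fig. 2A blurred)
    high = [p - l for p, l in zip(row, low)]        # high-pass version (cf. Fig. 2B)
    features = symmetrize(high, axis)               # symmetrized features (cf. Fig. 2C)
    illumination = [l - f for l, f in zip(low, features)]  # low-pass minus features (cf. Fig. 2D)
    return features, illumination
```

A two-dimensional version would apply the same steps to every row after rotating the image so that the detected symmetry axis is vertical.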
d. Pilot Study of Eigentemplate Procedure. The second aspect of the project is the feature recognition by Eigentemplates. With the constraint of straight-ahead view, we chose four faces that would pose difficulties for current face-recognition algorithms. One (Fig. 3A) has the features lit by approximately homogeneous illumination but is a cluttered image with objects, such as earrings, that are easily confused with facial features. The second (Fig. 4A) is the simpler image of Fig. 2A that has pronounced asymmetry of illumination. The third (Fig. 5A) has the eyes obscured by spectacles with highlights, and the fourth (Fig. 6A) has a tousled appearance with markedly different feature structure. (These latter two images would pose problems for contour-based algorithms because the boundaries of the faces are nebulous or ill-defined.)
Fig. 3. Features found by preliminary Eigentemplate analysis in a cluttered facial image. Note the reconstruction failure in D in the absence of offsets and inhibition between the wavelets.
Fig. 4. Features found by preliminary Eigentemplate analysis in a side-lit facial image. Again, removing the offsets and inhibition destroyed the ability to find the correct features.
Fig. 5. Features found by preliminary Eigentemplate analysis in a facial image with spectacles and a smile. Removing the offsets and inhibition only affected the result for one feature.
Fig. 6. Features found by preliminary Eigentemplate analysis in a cluttered, side-lit facial image. Removing the offsets and inhibition brought all three Eigentemplates to the same location.
A pilot version of the Eigentemplate approach that we are developing is shown in the remaining panels. The proposed wavelet Eigentemplates were approximated by two or three Gabor filters chosen by hand to match either the eye or the mouth. Because we have not yet implemented the automatic size scaling, all filters were scaled with the wider mouth filter, which was set to the measured mouth width in the raw image. In order to obtain a veridical result, we had to implement two enhancements to the Eigentemplate search procedure. The eye detector had to have a vertical offset included between the vertical and horizontal Gabors, to lower the relative position of the `pupil'. The mouth detector had to have an inhibitory component included with left and right offsets, to ensure that any mouth detected would have blank cheeks on either side. Note that such features are not arbitrary manipulations, but exemplify processes to be included in the full Eigentemplate scheme.
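The role of the flanking inhibition in the mouth detector can be illustrated with a small sketch. It is a toy under stated assumptions, not the filters used in the pilot study: the Gabor parameters, the subtractive form of the inhibition, the flank offset, and the names `gabor`, `response` and `mouth_score` are all hypothetical. The test image contains a short horizontal bar (a stand-in for a mouth) and a long horizontal line (a stand-in for a necklace).

```python
import math


def gabor(size, wavelength, theta, sigma, phase=0.0):
    """Real Gabor patch of odd side length `size`, returned as nested lists."""
    half = size // 2
    patch = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * math.cos(theta) + y * math.sin(theta)  # coordinate along the carrier
            envelope = math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * u / wavelength + phase))
        patch.append(row)
    return patch


def response(image, patch, cx, cy):
    """Inner product of the patch with the image centred on (cx, cy); outside pixels count as 0."""
    half = len(patch) // 2
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            y, x = cy + dy, cx + dx
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                total += patch[dy + half][dx + half] * image[y][x]
    return total


def mouth_score(image, patch, cx, cy, flank_offset, inhibition_weight):
    """Centre response minus weighted responses at left and right offsets ("blank cheeks")."""
    centre = abs(response(image, patch, cx, cy))
    left = abs(response(image, patch, cx - flank_offset, cy))
    right = abs(response(image, patch, cx + flank_offset, cy))
    return centre - inhibition_weight * (left + right)


# Toy image: a short bar on row 5 and a full-width line on row 15.
image = [[0.0] * 31 for _ in range(21)]
image[5][13:18] = [1.0] * 5
image[15] = [1.0] * 31

# theta = pi/2 makes the carrier vary vertically, so the patch prefers horizontal bars.
patch = gabor(7, 6.0, math.pi / 2, 2.0)
bar_plain = mouth_score(image, patch, 15, 5, 8, 0.0)
line_plain = mouth_score(image, patch, 15, 15, 8, 0.0)
bar_inhibited = mouth_score(image, patch, 15, 5, 8, 1.0)
line_inhibited = mouth_score(image, patch, 15, 15, 8, 1.0)
```

In this toy case the long line gives the larger response without inhibition, while with inhibition the short bar scores higher because the line also excites both flanks.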
These relatively crude Eigentemplates were successful in identifying their respective target features from the preprocessed images in column B of Figs. 3-6, as shown by the filter combination in the position of the maximum detection signal in column C of Figs. 3-6. (For the eyes, separate maxima were specified to the left and right of the symmetry axis.) Note how the eye filter combination avoids detection of the eye-like earring in Fig. 3A and how the mouth filter combination avoids the horizontal curved parts of the necklace and neckline in the same figure. Such selectivity would be impossible without the inhibitory and offset components of the filter combination, which are a unique feature of our cortically-inspired approach. The power of the offset structure in the Eigentemplates is shown in Fig. 5C, where the eye location is found at the true eye position, uncontaminated by the bright highlights reflecting from the subject's spectacles. The effect of removal of this inhibitory component (and the symmetry axis constraint) is shown in column D of Figs. 3-6. Now the `mouth' detector is attracted by arbitrary horizontal lines and the `eye' detector picks up eye-like features such as earrings.
Having identified the features, the Gabor wavelet vector now provides a description of the properties of the feature within the amplitude and phase variations allowed for that wavelet description. Of particular interest is the case of the mouth, which was open with the teeth showing white in Fig. 5A and closed in the other three cases. The complex property of the wavelets allowed both types of mouth to be found by the same template, which adapts its description to the light or dark phase of the relevant component as appropriate. At this crude pilot level, the mouth in Fig. 6A also has a light patch at the center, but in this case it is from the upper lip between the mouth crease and the mustache rather than from the teeth.
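The point about light and dark mouths can be illustrated in one dimension with a complex Gabor wavelet. This is a minimal sketch under assumed parameters (the wavelength, envelope width, and the name `complex_gabor_response` are hypothetical): reversing the contrast polarity of a feature leaves the response amplitude unchanged and shifts its phase by pi, so a detector based on amplitude finds both cases while the phase records which one was found.

```python
import cmath
import math


def complex_gabor_response(signal, centre, wavelength, sigma):
    """Response of a complex Gabor wavelet centred on index `centre` of a 1D signal."""
    total = 0j
    for i, value in enumerate(signal):
        x = i - centre
        envelope = math.exp(-x * x / (2 * sigma * sigma))
        total += value * envelope * cmath.exp(1j * 2 * math.pi * x / wavelength)
    return total


# A dark bar (closed mouth) and a light bar (teeth showing) on a zero background.
dark = [0, 0, 0, -1, -1, -1, 0, 0, 0]
light = [-v for v in dark]

r_dark = complex_gabor_response(dark, centre=4, wavelength=8.0, sigma=2.0)
r_light = complex_gabor_response(light, centre=4, wavelength=8.0, sigma=2.0)

same_amplitude = math.isclose(abs(r_dark), abs(r_light))
phase_shift = abs(cmath.phase(r_light / r_dark))  # close to pi for a polarity reversal
```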
Note that this wavelet description is linear, and is not affected by the nonlinearity of the inhibitory processes in the identification stage. The inhibition helps in targeting the particular point in the image where the feature resides. Once there, the wavelet vector provides a linear description of the feature at the point that was selected nonlinearly. The reconstruction failures of column D help to illustrate how the feature descriptions are adaptive, adjusting the relative amplitudes and phases of the Gabor components to the properties of the `features' found.