Diffuse illumination as a default assumption for shape-from-shading in the absence of shadows

 
Christopher W. Tyler
ABSTRACT
Radial sinusoids (blurry spoke patterns) appear dramatically saturated toward the brighter regions. The saturation is not perceptually logarithmic but exhibits a hyperbolic (Naka-Rushton) compression behavior at normal indoor luminance levels. The object interpretation of the spoke patterns was not consistent with the default assumption of any unidirectional light source, but implied a diffuse illumination (as if the object were looming out of a fog). The depth interpretation was consistent with the hypothesis that the compressed brightness profile provided the neural signal for perceived shape, as an approximation to computing the diffuse Lambertian illumination function for this surface. The surface material of the images was perceived as non-Lambertian to varying degrees, ranging from a chalky matte to a lustrous metallic.
 

INTRODUCTION

It is common to assume that the perception of the shape of an object from its shading image follows a few simple principles based on default assumptions about the light source and surface properties. For example, much of the computer vision literature makes the assumption of a spatially limited (or approximately point) source of light and surfaces of Lambertian (or uniform matte) reflectance properties. Such assumptions are commonly supposed to provide reasonable approximations to the typical interpretations of the human perceptual system (at least in the absence of explicit highlight features). In fact, however, the present analysis will show that there is wide variation in the interpreted surface quality depending on minor variations in the luminance profile of the shading image. Human observers do not seem to make a default assumption about reflectance properties, but to impute them for the particular shading image. Moreover, their interpretation of simple shading images is not consistent with the point-source assumption. The theoretical expectations for a variety of illuminant assumptions are examined in an attempt to determine what default assumption is made by human observers.
 
The focus of the present analysis will be on shading images based on sinusoidal and related luminance functions. As an initial demonstration of the shapes perceived from sinusoidal shading images, Fig. 1 depicts three spoke patterns in which there is repetitive modulation as a function of radial angle. The first pattern has a linear sinusoidal profile, the second is predistorted so as to have an approximately sinusoidal appearance to most observers and the third is further distorted so as to appear as an accelerating function with wider dark bars than light bars. Note that, in this radial format, there is a strong tendency to perceive these luminance profiles as deriving from three-dimensional surfaces.
 
What are the properties of the perceived surfaces? Although the generator function is one-dimensional, we are able to estimate simultaneously the surface shape, the reflectance properties and something about the illuminant distribution. We thus parse the one-dimensional luminance function at a particular radius in the image into three distinct functions. Such parsing can occur only if the visual system makes default assumptions about two of the functions. The question to be addressed is: what default assumptions are made?
 
Since the patterns are radially symmetric, the illuminant distribution must itself be symmetric (or the shading on different spokes would vary with the orientation of the spokes relative to the direction of the illuminant). Thus the only possible variation of illuminant properties is the degree of diffusion of the illuminant from a point source (positioned above the center of the surface). To most observers, the surface appears to be of matte (or Lambertian) material in A and to become progressively more lustrous in B and C. Somehow, the human visual system partitions the single function in each image into separate shape, reflectance and illumination functions. This study is an initial attempt to explore the rules by which such partitioning takes place.
Fig. 1. Depictions of sinusoidal spoke patterns with various levels of brightness distortion. Left: linear sinusoid; center: accelerating hyperbolic distortion to provide sinusoidal appearance; right: extreme hyperbolic distortion to appear as an accelerating distortion. Best approximation to intended appearance will be obtained if viewed from a distance so that pixellation is not visible.
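As a rough illustration of how spoke patterns of this kind could be generated, the following sketch computes the luminance at an image point for a radial sinusoid, with an optional predistortion through the inverse of a hyperbolic compression. The function names, the number of spokes, the mean level and the constant sigma are all illustrative assumptions, not the values used to produce Fig. 1, and the predistorted luminance is left unnormalized.

```python
import math

def compress(lum, sigma=0.5):
    """Hyperbolic (Naka-Rushton style) response to luminance; sigma is an assumed constant."""
    return lum / (lum + sigma)

def predistort(resp, sigma=0.5):
    """Inverse of compress: luminance needed to evoke a given response (0 <= resp < 1)."""
    return sigma * resp / (1.0 - resp)

def spoke_luminance(x, y, spokes=8, contrast=0.95, mean=0.5, sigma=None):
    """Luminance at image point (x, y) for a radial sinusoid centred on the origin.

    With sigma=None the luminance itself is sinusoidal in radial angle (linear pattern).
    With a numeric sigma the compressed response is sinusoidal instead, so the
    luminance is predistorted through the inverse hyperbolic function.
    """
    theta = math.atan2(y, x)                      # radial angle of the pixel
    s = mean * (1.0 + contrast * math.cos(spokes * theta))
    return s if sigma is None else predistort(s, sigma)
```

Because the value depends only on the radial angle, it is constant along any radius, which is what gives the pattern its spoke structure.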
 
Compressive Brightness Distortion
Before proceeding with the analysis of surface properties, first consider the simple compressive distortion of the brightness image. If the surface properties are ignored for the moment, the direct brightness profile of Fig. 1 does not appear to be sinusoidal: the dark bars look much narrower than the bright bars (based on the perceived transition through mid-gray). This narrowing effect is far more pronounced in high contrast images on a linearized CRT screen than in this printed example, which has a contrast of about 95%. For several reasons, it is probable that the perceived distortion arises at the first layer of visual processing, the output of the retinal cone receptors (Macleod et al., 1992; Hamer & Tyler, 1995). However, the focus here is on its perceptual characteristics, not its neural origin.
 
It is reasonable to be skeptical of the linearity of the reproduction of Fig. 1. A simple test of the accuracy of its linearity is to view the figure in (very) low illumination, after dark-adapting the eyes for a few minutes. In such conditions, the visual system defaults to an approximately linear range, and it can be seen that Fig. 1A now appears to have roughly equal widths of the bright and dark bars.
 
 
Fig. 2. Does the visual system reconstruct the full sequence or use the simplifying assumption that the output approximates the input?
 
 
In terms of shape-from-shading issues, the question arises whether the depth interpretation mechanism of the visual system 'knows' that it is being fed a distorted input. The most adaptive strategy, for either genetic specification or developmental interaction with the environment, would be for the brightness distortion to be compensated in the depth interpretation process, so that the perceived brightness distortion does not distort the depth interpretation.
 
However, the observed depth interpretation from these patterns seems to follow closely the waveform of the perceived brightness profile; when the brightness is perceived as sinusoidal (Fig. 1B), the surface is perceived as a roughly sinusoidal 'rosette'. When the brightness pattern is perceived as having narrow dark bars (Fig. 1A), the surface is perceived more like a ring of cones with narrow valleys between them. Fig. 1C continues this trend, although a second principle of change in surface properties now appears. The question to be addressed is: what principles is the visual system using in deriving its surface interpretation from the luminance profile?
 
The direct relationship between perceived brightness and surface depth that is the typical perception of the patterns of Fig. 1 is surprising in relation to the luminance profiles that should be expected from geometric reflectance considerations. For example, in Fig. 1B the surface appears approximately sinusoidal and peaks in phase with the peaks of the luminance image. As the following illumination analysis will show, this interpretation is completely incompatible with point source illumination in any position. This incompatibility is surprising in view of the widespread use of the point source assumption in the field of computer vision. Development of a diffuse illumination analysis then provides an explanatory basis for the observed perceptual interpretations. An additional benefit of the diffuse illumination analysis is that it shows how the direct relationship between perceived brightness and surface depth perception is compatible with the operation of a compensation for early brightness compression in the perceived brightness function.
 
Illumination Analysis
The general principles of luminance profiles based on Lambertian objects are well known, but it is instructive to consider the variety of luminance patterns that may arise from a simple object such as a sinusoidal surface, for comparison with human perceptual performance. For point sources at infinity, the angle of incidence is the critical variable. For the alternative assumption of diffuse illumination, the principal factor is the acceptance angle outside which the diffuse illumination is blocked from reaching a particular point on the surface. The assumptions for the following analysis are:
i) that the surface has constant albedo (inherent reflectance),
ii) that the surface has Lambertian reflectance properties and
iii) that secondary reflections from one part of the surface to another are negligible.
 
Fig. 3. Lambertian reflectance profiles for a sinusoidal surface (A) under three illumination conditions: B - point-source illumination from infinity at a grazing angle to the left-hand slopes; C - point-source illumination from infinity directly above the surface; D - diffuse illumination from all directions.
 
The Lambertian reflectance assumption is that the surface illumination is proportional to the sine of the angle of incidence measured from the surface itself (so that grazing light gives zero illumination) and that the reflectance is uniform at all angles. Equivalently, the reflected light follows the cosine rule: it is proportional to the cosine of the angle of incidence measured from the surface normal.
 
Fig. 3 shows (top) the profile of a sinusoidal surface, below which are three luminance profiles for selected illumination conditions designed to illustrate the variety of outputs. The surface is assumed to be Lambertian, which implies that the reflected luminance is proportional to the incident illumination and hence proportional to the cosine of the angle of the surface to the viewer.
 
The first luminance profile is derived from a point source at infinity whose angle grazes (is tangential to) the left-hand descending slopes of the sinusoidal surface. Hence, the reflected luminance is lowest at the position of the grazing slope and highest along the opposite slope, as shown by Fig. 3B. Note that, in this position, the luminance profile has the same number of cycles as the original surface (though distorted rather than being a strict derivative).
 
The second luminance profile is derived from a point source at infinity directly above (normal to) the surface (Fig. 3C). Because the peaks and troughs of the surface waveform are at the same angle, they have the same Lambertian reflectance, and hence produce a frequency-doubled luminance profile. For 100% luminance modulation, this profile is close to sinusoidal as described in the following section. Here the point is that a quantitative shift in the angle of incidence of the point source produces a qualitative change in the resulting luminance profile of the same object.
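The contrast between the two point-source cases can be checked numerically. The sketch below applies the cosine rule to the surface z = amplitude*cos(x) for a point source at infinity in a chosen direction. It is a minimal illustration under the stated assumptions (constant albedo, no cast shadows, no secondary reflections); the function names and sampling density are hypothetical choices.

```python
import math

def lambertian_profile(source_angle, amplitude=1.0, n=360):
    """Luminance over one period of z = amplitude*cos(x) lit by a point source at infinity.

    source_angle is the direction towards the source, in radians from vertical,
    positive to the right. Cast shadows and secondary reflections are ignored.
    """
    sx, sz = math.sin(source_angle), math.cos(source_angle)
    profile = []
    for i in range(n):
        x = 2 * math.pi * i / n
        slope = -amplitude * math.sin(x)              # dz/dx
        norm = math.hypot(slope, 1.0)
        nx, nz = -slope / norm, 1.0 / norm            # unit surface normal
        profile.append(max(0.0, nx * sx + nz * sz))   # cosine rule, clipped at zero
    return profile

def count_maxima(profile):
    """Number of strict local maxima, treating the profile as periodic."""
    n = len(profile)
    return sum(profile[i] > profile[i - 1] and profile[i] > profile[(i + 1) % n]
               for i in range(n))

overhead = lambertian_profile(0.0)            # source directly above the surface
grazing = lambertian_profile(-math.pi / 4)    # tangent to the descending slope when amplitude = 1
```

With these settings the overhead profile has two maxima per surface cycle (one at each peak and one at each trough), whereas the grazing profile has one, falling to zero on the grazed slope.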
The third luminance profile (Fig. 3D) is derived from the assumption of a diffuse illumination source rather than a point source. The resulting luminance profile is again very different from the other two based on point sources. These examples are chosen to illustrate the complexity of the interpretation of shape from shading, since a given shape can give rise to qualitatively different shading profiles depending on the assumed source of illumination. When confronted with a luminance profile that is actually sinusoidal, does the human observer assume that it is the frequency-doubled reflection of an underlying surface of half that frequency, the diffusely illuminated profile of a non-sinusoidal surface, or a non-Lambertian surface (and so on)?
 
Geometric Derivation
Developing the theoretical reflectance functions of Fig. 3 required two stages: computation of the angle-of-incidence functions for the selected illumination conditions, and conversion to reflectance functions through the Lambertian reflectance assumption. The sinusoidal surface profile is shown again for reference in Fig. 4, below which are plots of the angle of incidence for three different illumination conditions.
The first angle-of-incidence function (Fig. 4B) is derived from a point source at infinity whose angle grazes (is tangential to) the left-hand descending slopes of the sinusoidal surface. Hence, the angle of incidence is zero at the position of the grazing slope and highest along the opposite slope, as shown by Fig. 4B. This curve will itself be sinusoidal if (and only if) the amplitude of the surface sinusoid (top curve) is such that opposite flanks are at a 90° angle to each other. Note that, in this position, the angle-of-incidence function has the same number of cycles as the original surface function (though shifted in phase in the direction of the angle of the incident light).
 
Fig. 4. Net angle-of-incidence profiles for a sinusoidal surface (A) under three illumination conditions: B - point-source illumination at a grazing angle to the left-hand slopes; C - point-source illumination directly above the surface; D - diffuse illumination from all directions.
The second angle-of-incidence function (Fig. 4C) is derived from a point source directly above (normal to the mean orientation of) the surface. Because the peaks and troughs of the surface waveform are at the same angle, they produce a frequency-doubled luminance profile that is asymmetric with respect to its peaks and troughs.
 
The third angle-of-incidence function (Fig. 4D) is derived from the assumption of a diffuse illumination source rather than a point source. The light is assumed to be coming equally from all directions, but to be occluded if any part of the surface lies in its path. No secondary reflections are considered. The resulting luminance profile is again very different from the other two based on point sources. The surface is assumed to be Lambertian, which implies that the reflected luminance is proportional to the incident illumination, and hence proportional to the cosine of the angle of the surface to the viewer.
Before attempting to answer these questions, first consider the derivation of the diffuse illumination profile depicted in Fig. 4D. As depicted in Fig. 5 for a particular point on the upper trace of the surface being viewed, the acceptance angle for any point p on the surface is the angle between the line passing through p that is tangent to the surface on the left (Fig. 5B) and the one that is tangent to the surface on the right (Fig. 5C). The sum of these two angles defines the acceptance angle for each point on the surface. Within this acceptance angle, the light from all directions has to be integrated according to the Lambertian cosine rule for each direction of the diffuse illumination relative to the orientation of the surface.
The net result of the diffuse illumination analysis, which omits secondary reflections, is shown for the sinusoidal surface by the lowest curve of Figs. 3 & 4. Note that this curve reaches its maximum at each peak of the waveform but drops to some lower (non-zero) value depending on the absolute depth of the sinusoidal modulation of the surface. Interestingly, the acceptance angle is not a well-known function such as a catenary, but has marked shoulders between relatively straight regions. Note that the flatness of the lower portion implies that the trough of the sinusoid approximates the shape of a circle, which has a constant acceptance angle relative to a gap in its surface (as was demonstrated by Euclid).
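The acceptance-angle construction can be sketched numerically as follows. For each point on the cross-section z = amplitude*cos(x), the code finds the left and right horizon directions by brute-force search, then integrates the Lambertian cosine rule over the unoccluded arc. This is only one possible reading of the procedure described above: the two-dimensional cross-section, the uniform sky, the normalization to 1 for an unoccluded horizontal patch, and all names and default values are assumptions of the sketch.

```python
import math

def diffuse_profile(amplitude=1.0, n=200):
    """Normalized diffuse irradiance over one period of z = amplitude*cos(x).

    Light arrives equally from all directions above the surface but is blocked
    where the surface itself intervenes; secondary reflections are ignored.
    """
    period = 2 * math.pi
    z = lambda x: amplitude * math.cos(x)
    dx = period / n
    values = []
    for i in range(n):                      # index 0 is a peak, n // 2 a trough
        x = i * dx
        slope = -amplitude * math.sin(x)
        # Horizon elevation on each side, never below the local tangent.
        elev_r = math.atan(slope)
        elev_l = math.atan(-slope)
        for j in range(1, n + 1):           # search one full period to each side
            d = j * dx
            elev_r = max(elev_r, math.atan2(z(x + d) - z(x), d))
            elev_l = max(elev_l, math.atan2(z(x - d) - z(x), d))
        # Directions measured from vertical, positive to the right.
        phi_r = math.pi / 2 - elev_r        # right edge of the acceptance angle
        phi_l = -(math.pi / 2 - elev_l)     # left edge of the acceptance angle
        phi_n = -math.atan(slope)           # surface normal
        # Integral of cos(phi - phi_n) over the acceptance angle, scaled to 1 at a peak.
        values.append((math.sin(phi_r - phi_n) - math.sin(phi_l - phi_n)) / 2.0)
    return values

profile = diffuse_profile()
```

In this sketch the profile is maximal at the surface peaks and stays above zero everywhere, so its maxima are in phase with the surface waveform and there is one per cycle.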
  
 
 
Fig. 5. Derivation of diffuse illumination profile for the sinusoidal surface (A). B: surface tangent to the left of each point along surface; C: surface tangent to the right of each point along surface; D: net acceptance angle at each point.

DISCUSSION
The conclusion from the analysis of the three paradigm cases in Fig. 3 is that, contrary to the appearance of images in Fig. 1, there is no simple illumination assumption of a sinusoidal Lambertian surface form that would give rise to a periodic luminance profile matching the frequency and phase of the surface waveform. The only luminance function that has the correct frequency and phase relative to the peaks of the surface is the diffuse one, and even it is much more cuspy than a sinusoid.
 
Perception of Sinusoidal Patterns
With the analysis in hand, we may now analyze the perception of the patterns of Fig. 1. The most important result is that these patterns do give pronounced depth perceptions, even though they are qualitatively incompatible with any position of point source illumination. The percepts correspond most closely to the diffuse reflectance profile of Fig. 3 (bottom curve), in that the surface appears to have peaks at the positions of the luminance peaks. However, the case where the brightness profile (Fig. 1B) looks most sinusoidal corresponds to the case where the perceived surface has the most sinusoidal shape. This seems odd, since a sinusoidal surface is predicted to have a much more peaked luminance distribution according to the diffuse illumination assumption (Figs. 3D & 4D).
 
Note that typical deviations from the Lambertian and the diffuse assumptions will both enhance the discrepancy. If the surface had a reflectance function that is more focused than the Lambertian, it would tend to increase the luminance in the direction of the observer, and hence make the peaks of the assumed surface brighter relative to the rest. Similarly, if the illumination source were more focused than a pure diffuse source, it would introduce a second-harmonic component into the reflectance function similar to Fig. 3C, which would again enhance the peaks and also introduce a bright band in the center of the dark strips. Hence, the diffuse illumination function at the bottom of Fig. 3 is the least peaked function to be expected from any single illumination source.
 
Fig. 6. Role of response compression in the interpretation of depth from shading. A. Sinusoidal surface shape. B. Net reflectance profile assuming diffuse illumination and Lambertian reflectance function. C. Perceived brightness signal after hyperbolic saturation. Note similarity to original surface waveform. D. Same degree of hyperbolic saturation applied to a sinusoidal signal, to illustrate how much brightness distortion is perceived in Fig. 1A under high illumination.
 
Role of Perceptual Response Compression
Human vision is, of course, not linear as a function of image luminance L but shows a saturating compression of the internal response R that seems to be most closely approximated by a hyperbolic function (like the Naka-Rushton equations for receptor response saturation), as described in Chan et al. (1991) and Tyler & Liu (1996). The optimal equation was of the form
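For reference, the standard hyperbolic (Naka-Rushton) form in the symbols used here is sketched below. The semi-saturation constant \sigma and the maximum response R_{\max} are assumed names, and the exact parameterization fitted in the cited studies may differ.

```latex
R = R_{\max}\,\frac{L}{L + \sigma}
```

In this form the response grows almost linearly for L much smaller than \sigma and saturates toward R_{\max} for L much larger than \sigma, which is the compressive behavior described in the text.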
 
Fig. 6 illustrates how such a brightness compression behavior can result in an output that approximates the original surface shape. For a sinusoidal surface (Fig. 6A) the diffuse reflectance function under Lambertian assumptions is the peaky function of Fig. 6B. The effect of a hyperbolic compression on this waveform is shown in Fig. 6C to result in an approximately sinusoidal output waveform. For comparison, the effect of the same hyperbolic compression on a straightforward sinusoidal waveform is shown in Fig. 6D, appearing strongly asymmetric in terms of the peak versus trough shapes. It is thus plausible that the shape-processing system could use the compressed brightness signal as a simple means of deriving the original surface shape from the diffuse reflectance profile.
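The asymmetry of a compressed sinusoid (the case of Fig. 6D) can be illustrated with a short calculation. The sketch below passes a sinusoidal luminance through a hyperbolic compression and measures what fraction of the cycle lies above the midpoint of the response range. The semi-saturation constant, mean luminance and contrast are assumed values chosen only for illustration, and the function names are hypothetical.

```python
import math

def naka_rushton(lum, sigma=0.5, r_max=1.0):
    """Hyperbolic compression of luminance; sigma is an assumed semi-saturation constant."""
    return r_max * lum / (lum + sigma)

def bright_fraction(contrast=0.95, mean=1.0, sigma=0.5, n=3600):
    """Fraction of a cycle in which the compressed response to a sinusoidal
    luminance lies above the midpoint of its own response range."""
    resp = [naka_rushton(mean * (1 + contrast * math.cos(2 * math.pi * i / n)), sigma)
            for i in range(n)]
    mid = 0.5 * (max(resp) + min(resp))     # response level taken as 'mid-gray'
    return sum(r > mid for r in resp) / n
```

With these assumed values the bright portion occupies roughly 0.7 of the cycle, so the dark bars appear narrower than the bright bars; when sigma is made very large the compression becomes nearly linear and the fraction returns to about one half.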
Fig. 7. A second example of response compression in the interpretation of depth from shading. A. Cyclic surface shape. B. Net reflectance profile assuming diffuse illumination and Lambertian reflectance function. C. Perceived brightness signal after hyperbolic saturation. Note similarity to original surface waveform. D. Same degree of hyperbolic saturation applied to a sinusoidal signal, to illustrate similarity of result to A & C.
If the visual system does indeed use its inbuilt brightness compression as a surrogate for a more elaborate reconstitution algorithm of the shape from shading under diffuse illumination assumptions, the approximation should work for other typical surface waveforms. One example to test this hypothesis is a cylindrical waveform, corresponding to a one-dimensional version of the sphere that is used widely in computational vision (and which corresponds to the most-simplified form of an isolated object in the world). A cylindrical waveform is depicted in one-dimensional cross-section in Fig. 7A, although the vertical axis is extended relative to a purely circular cross-section. The subsequent panels, in the same format as Fig. 6, show the diffuse reflectance profile, the effect of brightness saturation on this profile, and a simple sinusoid with the same degree of compression. Notice that the saturated diffuse profile again looks similar to the surface waveform, supporting the idea that the brightness-compressed signal can generally act as a surrogate for the back-computation of the surface waveform. In this case, the compressed sinusoid looks somewhat similar also, which may explain why the linear sinusoid of Fig. 1A resembles a ring of conical 'dunce caps' (since a cone is a version of a cylinder with a converging diameter). If the visual system treats the brightness-compressed signal as an approximation to the depth profile of the object under diffuse illumination, any object that generates a similar signal after brightness compression should appear to have a similar shape.
 
Finally, some brief thoughts on the different qualities of surface material perceived in Fig. 1. Given that the image that appears Lambertian is the one that resembles the ring of dunce caps with circular cross-section, it may be that the visual system has a Bayesian constraint to prefer a solution that corresponds to such discrete 'objects' rather than a continuously deformed surface. If so, shape reconstructions that deviate from such a circular cross-section (in the absence of explicit contour cues) may tend to be interpreted as deviations from the Lambertian assumption rather than deviations from the assumption of circular cross-section. The present paper is not intended to provide an empirical analysis of this question but merely to frame the hypothesis.
 

CONCLUSION

The object interpretation of the spoke patterns of Fig. 1 is not consistent with the default assumption of any unidirectional light source, but implies a diffuse illumination (as if the object were looming out of a fog). The existence of such a default for human vision of shape from shading has not been previously described, to our knowledge. It should be noted that similar percepts are obtained for linear sinusoids of high contrast (as a 'stack of cigarettes'), although the sense of shape-from-shading is weaker initially. No-one ever seems to see a linear sinusoid in a rectangular aperture according to the predictions of Fig. 3B for a local illumination source, even though there is now no orientational symmetry to force a symmetric source illumination. Thus, the default to diffuse illumination appears to be general unless there are specific cues to imply an oriented source (e.g., Ramachandran, 1988).
 
Given default diffusion, the depth interpretation is consistent with the hypothesis that the visual system uses the compressed brightness profile directly as the neural signal for perceived shape. It is shown that this equivalence is a reasonable approximation to computing the diffuse Lambertian illumination function for this surface. This match provides the visual system with a rough-and-ready algorithm for shape reconstruction without requiring elaborate back-calculation of the brightness compression and integral angle-of-acceptance functions through which the diffuse illumination image was built up.
 
Acknowledgments
Supported by NEI grant # 7890.
 

REFERENCES

Chan H & Tyler CW (1991) Increment and decrement asymmetries: Implications for pattern detection and appearance. Society for Information Display, Technical Digest 23, 251-254.
 
Chan H, Tyler CW, Wenderoth P & Liu L (1991) Appearance of bright and dark areas: An investigation into the nature of brightness saturation. Investigative Ophthalmology & Visual Science, Suppl. B, 1273.
 
Tyler CW, Chan H & Liu L (1992) Different spatial tunings for ON and OFF pathway stimulation. Optometry & Physiological Optics 12, 233-240.
 
Hamer RD & Tyler CW (1995) Phototransduction: Modeling the primate cone flash response. Visual Neuroscience 12, 1063-1082.
 
Macleod DIA, Williams DR & Makous W (1992) A visual nonlinearity fed by single cones. Vision Research 32, 347-363.
 
Ramachandran VS (1988) The perception of depth from shading. Scientific American 269, 76-83.
 
Tyler CW & Liu L (1996) Saturation revealed by clamping the gain of the retinal light response. Vision Research 36, 2553-2562.