Leonid L. Kontsevich & Christopher W. Tyler
Smith-Kettlewell Eye Research Institute
2318 Fillmore Street, San Francisco, California, 94115.
Abstract
To study the ability of humans to read subtle changes in facial expression, we developed a novel application of the spatial reverse correlation technique, which consists of adding samples of spatial noise to the image and categorizing the results according to their effect on human perception of emotion. This added-noise method differs from the Bubbles technique of Gosselin & Schyns (2001) in permitting an evolutionary development to novel forms. The added noise had a profound effect on the facial expression, and in almost every instance the new expression was meaningful. To quantify the effect, we asked naïve observers to rank the face of Mona Lisa superimposed with noise, based on their perception of her emotional state along the sad/happy dimension, and cumulated the noise instances that were similarly ranked. The cumulated noise revealed that the smile is carried entirely by the shape of the mouth region. The perception of smiling in the eyes is solely attributable to a configurational effect projecting from the mouth region.
Introduction
Reading facial expressions is of high importance for humans as social beings. Expressions are often the result of subtle changes in facial features, and humans develop a remarkable sensitivity to these changes (Ahumada & Lovell, 1971). Over the centuries, artists have excelled in depicting facial expressions, although it is often hard to state explicitly which changes produce a particular expression (Hess, 1975; Ekman, Friesen & O'Sullivan, 1988; Ekman & Friesen, 1984). To address this issue we employed a spatial reverse correlation technique (Sutter, 1975; Neri, Parker & Blakemore, 1999; Ringach, Hawken & Shapley, 1997; DeAngelis, Ohzawa & Freeman, 1995), consisting of adding samples of spatial noise to the image and categorizing the results according to their effect on human perception. The added noise had a profound effect on the facial expression, which seemed to have a meaningful interpretation in almost every instance (Fig. 1).

Fig. 1. The base stimulus was a gray-scale JPEG detail of the Mona Lisa painting by Leonardo da Vinci, superimposed with a random sample of binary, one-pixel noise to alter the facial configuration of the original face. The added noise was randomly different on each trial. Twelve individual examples of superimposed noise are shown, to exemplify the range of emotional expressions induced by the addition of a single sample of noise. Emotional labels are suggested for these twelve expressions, although these are the authors' impressions rather than the outcome of a systematic study.
To quantify the effect, we asked naïve observers to rank the face of Mona Lisa superimposed with noise, based on their perception of her emotional state along the sad/happy dimension, and cumulated the noise instances that were similarly ranked. The reverse correlation technique operates in the same manner as the process of Darwinian natural selection. Nature [the noise generation routine] provides purely random variations around the current configuration of the organism [Mona Lisa's face]. Natural selection [the observer] operates to select adaptive components and eliminate maladaptive ones. The result is a modified organism [face] that is better adapted to evoke a particular state in the natural environment [the observer]. Just as natural selection generates apparently purposive organismic forms, the reverse correlation method can generate meaningful modifications of the base image.
Methods
In the reverse correlation experiments, the observers were asked to rank the emotional expression of the modified face into 4 categories: SAD, SLIGHTLY SAD, SLIGHTLY HAPPY, HAPPY. Spatial noise in the form of one-pixel binary random luminance at 50% contrast was added to the face image. Different random noise was added on each trial, with no pattern or relationship to the underlying face image. The reverse correlation procedure operates by cumulating noise according to the observers' selection criterion. The noise samples from each trial were accumulated separately for each category. Each of 12 normally-sighted observers conducted 100 trials, with the results averaged across the observers. To evaluate the specific effects in the eye region, we removed the noise for images b (HAPPY) and c (SAD) of Fig. 2 from the lower part of the face. Twelve observers were asked to rank the difference in degree of happiness between the eyes on a 10-point scale from neutral to very happy.
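The accumulation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' software: the 8 x 8 image size, the noise amplitude of 0.5, the "mouth corner" template pixels, and the simulated observer (which stands in for a human rating) are all hypothetical. The trial count of 1200 corresponds to 12 observers x 100 trials.

```python
import random

CATEGORIES = ("SAD", "SLIGHTLY SAD", "SLIGHTLY HAPPY", "HAPPY")

def binary_noise(width, height, amplitude, rng):
    """One-pixel binary noise: each pixel is independently +amplitude or -amplitude."""
    return [[rng.choice((-amplitude, amplitude)) for _ in range(width)]
            for _ in range(height)]

def simulated_observer(noise, template, rng):
    """Hypothetical stand-in for a human rating (an assumption for this sketch).

    The response depends on the noise that falls on the template pixels,
    plus some internal decision noise.
    """
    evidence = sum(noise[y][x] for (y, x) in template) + rng.gauss(0.0, 0.5)
    if evidence < -1.0:
        return "SAD"
    if evidence < 0.0:
        return "SLIGHTLY SAD"
    if evidence < 1.0:
        return "SLIGHTLY HAPPY"
    return "HAPPY"

def accumulate_by_category(trials, width, height):
    """Average the noise samples separately for each response category."""
    sums = {c: [[0.0] * width for _ in range(height)] for c in CATEGORIES}
    counts = dict.fromkeys(CATEGORIES, 0)
    for noise, category in trials:
        counts[category] += 1
        for y in range(height):
            for x in range(width):
                sums[category][y][x] += noise[y][x]
    means = {}
    for c in CATEGORIES:
        n = max(counts[c], 1)  # an unused category keeps an all-zero average
        means[c] = [[v / n for v in row] for row in sums[c]]
    return means, counts

def run_experiment(n_trials=1200, width=8, height=8, amplitude=0.5, seed=0):
    """Simulate the trials and return per-category average noise."""
    rng = random.Random(seed)
    # Hypothetical pixels that drive the simulated observer's rating.
    template = [(6, 2), (6, 5), (5, 1), (5, 6)]
    trials = []
    for _ in range(n_trials):
        noise = binary_noise(width, height, amplitude, rng)
        trials.append((noise, simulated_observer(noise, template, rng)))
    means, counts = accumulate_by_category(trials, width, height)
    return means, counts, template
```

In this simulation the averaged HAPPY noise is clearly positive at the template pixels and the averaged SAD noise is clearly negative there, while the averages elsewhere stay close to zero, which is the behavior the procedure relies on.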
Results
The summation of the noise instances coherently accumulates luminance information in the locations relevant to a particular facial expression and tends to average it out in the irrelevant locations. When the result of the reverse correlation for a particular category is added to the original face image, it changes the facial expression toward an average expression from that category. Starting from the original Mona Lisa face reproduced in Fig. 2a, the effects of the reverse-correlation noise for the two extreme categories, i.e., the sad and happy facial expressions averaged for all observers, are shown in Figs. 2b and c, respectively.
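Adding the averaged noise for one category back onto the base image can be sketched as follows. The function name `overlay`, the `gain` parameter, and the convention that luminance lies in the range 0 to 1 are assumptions made for this illustration; the source does not state how the overlay was scaled.

```python
def overlay(base, mean_noise, gain=1.0):
    """Add the averaged noise for one category to the base image.

    `base` and `mean_noise` are equally sized lists of rows.
    `gain` is a hypothetical scaling factor; results are clipped to [0, 1].
    """
    return [[min(1.0, max(0.0, b + gain * n)) for b, n in zip(base_row, noise_row)]
            for base_row, noise_row in zip(base, mean_noise)]

# Tiny worked example with values that are exact in binary floating point.
base = [[0.5, 0.875],
        [0.125, 0.5]]
mean_noise = [[0.25, 0.25],
              [-0.25, 0.0]]
shifted = overlay(base, mean_noise)
```

Pixels where the averaged noise is near zero are left essentially unchanged, so only the locations that were consistently associated with the category alter the face.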

Fig. 2. a. The original detail of Mona Lisa's face that was used throughout the study. The averaged noise samples for the two extreme categories (sad and happy) in the reverse correlation experiments are shown superimposed on the full-color original in b and c. They represent the average facial expression over all observers for these two categories. The superimposed noise is shown alone in d and e, with the face outlines, to emphasize that the only identifiable feature in either case is at the corners of the mouth.
The most prominent difference between the "altered" images and the original is in the mouth shape: in the sad face the mouth is flat and in the happy face it is curved upward at the corners. This feature can hardly be considered surprising, but it is interesting that the reverse correlation technique allows us to project the spatial representation of the observers' concept of sad and happy onto a particular face. The cumulated noise revealed that the smile is carried entirely by the shape of the mouth.
The question to the observers was generic, at the level of the emotional interpretation of the overall expression. We can, however, use the spatial projection property of the reverse correlation to determine which local parts of the face carry the expression. The eyes, for example, are often characterized as the "window of the soul". Is there a change in the spatial configuration of the eyes equivalent to the marked curvature change in the mouth?
At first glance, the eyes show an obvious change. They seem to be smiling in Fig. 2b and serious in Fig. 2c. On closer examination, however, this effect was found to be illusory. The difference in the eye expression was entirely due to a configurational projection from the shape of the mouth. We first asked the observers to rate the eye expressions in Figs. 2b and c and confirmed the observation: in the happy face, the rating for the happy expression in the eyes was significantly higher than for the sad one. To determine whether the eye expression is a result of the added noise or a configurational illusion, we removed the noise from the lower half of the sad and happy faces, as shown in Fig. 3a and b, so that both faces have the identical mouth and lower half of the face. On a scale from zero (neutral) to 10 (very happy), the observers gave an average estimate of 4.8 ± 2.0 for the full noise overlay in b and c, compared with a rating of 0.0 ± 2.1 for the noise overlaid on the upper half of the face alone in d and e. (Some observers saw some expressions as sad and used negative values on the scale.)

Fig. 3. Superimposing the reverse correlation noise on the upper face alone provides no detectable change in the emotional expression.
To evaluate the specific effects in the eye region, we removed the noise for b ('happy') and c ('sad') from the lower part of the face. The resulting images, Fig. 3a and b, demonstrate that the eyes convey few or no cues to the happiness of the overall expression, implying that the predominant information comes from the shape of the mouth and the lower part of the face. Comparison of these pictures shows no apparent difference between the expressions in the eyes once the mouth difference is removed. This identity was also confirmed psychophysically in the observer ratings, which came out at zero for the mean difference between the two expressions. Evidently, the reverse correlation procedure, which generated such a vivid modification of the mouth structure, had no effect on the emotional expression carried by the eyes. (There is a perceptible difference on the alertness dimension, but this is not ranked as more or less happy.)

Fig. 4. The reverse correlation noise from the lower face alone generates a change in expression in the whole face.
As a predictive test, we implemented the reverse manipulation, with the 'happy' and 'sad' noise from the lower face superimposed on the original (Fig. 4). We know that the expression of the mouth is changed by the noise. The question is whether this change also affects the perceived expression of the eyes. We leave it for the reader to judge the result. All those who have observed this figure report a marked change in the eye expression, despite the fact that the upper halves of the images are identical in the two panels of Fig. 4. We conclude, therefore, that the perceived difference between eye expressions with added noise is evoked entirely by long-range interaction from the change in the mouth expression.
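The two manipulations, keeping the cumulated noise on the upper face only (Fig. 3) or on the lower face only (Fig. 4), amount to zeroing the noise outside one region before it is overlaid. A minimal sketch follows, assuming the image is stored as a list of rows and that the face is divided at a single row; the function name `restrict_noise` and the default split at the vertical midpoint are hypothetical, since the source does not give the exact boundary.

```python
def restrict_noise(noise, region, split_row=None):
    """Return a copy of `noise` that is zero outside the chosen region.

    region    -- "upper" keeps rows above split_row, "lower" keeps the rest
    split_row -- first row of the lower region (assumed: the vertical midpoint)
    """
    if region not in ("upper", "lower"):
        raise ValueError("region must be 'upper' or 'lower'")
    if split_row is None:
        split_row = len(noise) // 2
    restricted = []
    for y, row in enumerate(noise):
        in_upper = y < split_row
        keep = in_upper if region == "upper" else not in_upper
        restricted.append(list(row) if keep else [0.0] * len(row))
    return restricted

# Example: a 4-row noise pattern split into its upper and lower halves.
noise = [[1, 2], [3, 4], [5, 6], [7, 8]]
upper_only = restrict_noise(noise, "upper")
lower_only = restrict_noise(noise, "lower")
```

Because the two restricted patterns sum to the full pattern, any difference in perceived expression between the full and the restricted overlays can be attributed to the region that was zeroed.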
Discussion
This surprising result implies that the emotional dimension of eye expression is entirely derived from long range induction from information in the mouth region. Obviously, there are visible differences in eye shapes: lid widening, brow height, lines around the eyes and so on. One might have expected that one or more of these cues would have been projected onto the face differently by the reverse correlation technique. However, the negative result for the eye region alone implies that such cues are not weighted on the sad/happy dimension. (Perhaps the eyes code for the dimension of emotional intensity rather than affect, for example.) Happiness seems to be encoded by changes in the shape of the mouth area, but not by any component in the upper half of the face.
We note that the reverse correlation technique with added noise differs in principle from the 'Bubbles' technique for identifying key features of categorization, recently introduced by Gosselin & Schyns (2001). The Bubbles technique is a method for randomly selecting regions of a preexisting image and rating their relevance to categorization, but it does not permit the emergence of any novel configurations (in the form described). The reverse correlation overlay, on the other hand, is not merely a categorization technique. It permits indefinite modification of the preexisting form in an evolutionary fashion. To evolve a long featural distance from the original, it should be implemented in progressive steps. Instead of cumulating the noise in one overall session, it should be cumulated in blocks for each response category. Within each block, the base image would be the image from the previous block plus the cumulated noise for that block. For example, the image for block 2 would be Fig. 2b rather than Fig. 2a. Thus, the novel features would progressively develop from an evolving baseline to an arbitrary deviation from the original. Preliminary studies revealed that this incremental approach was successful in enhancing the sadness of the expression in Fig. 2c.
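The block-wise procedure outlined above can be sketched as a simple loop in which each block's averaged noise is folded into the base image for the next block. This is a sketch under stated assumptions rather than the authors' implementation: the function names, the noise amplitude, and the `is_target` callback are hypothetical. In a real experiment a human would judge each stimulus; here the callback also receives the noise so that a simulated observer can be plugged in.

```python
import random

def clip01(value):
    """Keep a luminance value inside the assumed range [0, 1]."""
    return min(1.0, max(0.0, value))

def run_block(base, n_trials, amplitude, is_target, rng):
    """Average the noise from the trials placed in the target category."""
    height, width = len(base), len(base[0])
    total = [[0.0] * width for _ in range(height)]
    selected = 0
    for _ in range(n_trials):
        noise = [[rng.choice((-amplitude, amplitude)) for _ in range(width)]
                 for _ in range(height)]
        stimulus = [[clip01(b + n) for b, n in zip(base_row, noise_row)]
                    for base_row, noise_row in zip(base, noise)]
        if is_target(stimulus, noise):
            selected += 1
            for y in range(height):
                for x in range(width):
                    total[y][x] += noise[y][x]
    n = max(selected, 1)  # no selected trials leaves the base unchanged
    return [[v / n for v in row] for row in total]

def evolve(base, n_blocks, n_trials, amplitude, is_target, seed=0):
    """Repeat blocks, using each block's result as the next base image."""
    rng = random.Random(seed)
    current = [list(row) for row in base]
    for _ in range(n_blocks):
        mean_noise = run_block(current, n_trials, amplitude, is_target, rng)
        current = [[clip01(b + n) for b, n in zip(base_row, noise_row)]
                   for base_row, noise_row in zip(current, mean_noise)]
    return current

# Simulated observer (hypothetical): selects a trial whenever the noise
# brightens the top-left pixel.
def brightens_corner(stimulus, noise):
    return noise[0][0] > 0

base = [[0.5] * 4 for _ in range(4)]
evolved = evolve(base, n_blocks=3, n_trials=200, amplitude=0.1,
                 is_target=brightens_corner)
```

With this simulated observer the selected pixel drifts by roughly one noise amplitude per block, whereas a single session could shift it by at most one amplitude in total, which illustrates why progressive steps are needed to move far from the original.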
In conclusion, we demonstrate that the reverse correlation technique provides a convenient means to reveal the cues employed in subtle changes of emotional content of facial expression. The same method could be applied to evaluation of the spatial cues for a full range of emotional expressions. Beyond its scientific value, the spatial reverse correlation method allows one to convert internal imagery into real images.
References
Ahumada, A. J. Jr. & Lovell, J. (1971). Stimulus features in signal detection. Journal of the Acoustical Society of America 49, 1751-1756.
Ekman, P., Friesen, W. V. & O'Sullivan, M. (1988). Smiles when lying. Journal of Personality and Social Psychology, 54, 414-420.
Ekman, P., & Friesen, W. (1984). Unmasking the Face. A Guide to Recognizing Emotions from Facial Cues. Englewood Cliffs, NJ: Prentice Hall.
Gosselin, F. & Schyns, P. G. (2001). Bubbles: a technique to reveal the use of information in recognition tasks. Vision Research, 41, 2261-2271.
Hess, E. (1975). The Tell-Tale Eye. New York: Van Nostrand Reinhold.
Sutter, E.E. (1975). A revised conception of visual receptive fields based on pseudorandom spatiotemporal pattern stimuli. Proceedings of the First Symposium on Testing and Identification of Nonlinear Systems, March 17-20, 1975, Pasadena, California.
Neri, P., Parker, A. J. & Blakemore, C. (1999). Probing the human stereoscopic system with reverse correlation. Nature, 401, 695-698.
Ringach, D. L., Hawken, M. J. & Shapley, R. (1997). Dynamics of orientation tuning in macaque primary visual cortex. Nature, 387, 281-284.
DeAngelis, G. C., Ohzawa, I. & Freeman, R. D. (1995). Neuronal mechanisms underlying stereopsis: how do simple cells in the visual cortex encode binocular disparity? Perception, 24, 3-31.