The analysis of the depth coordinates of objects in a visual scene is of vital importance for animals and in technological applications. In 2D camera projections of a 3D visual scene, depth information is initially lost but can be recovered using two cameras in a stereoscopic setup. The projection of an object located at a finite distance from the cameras is laterally displaced in the left camera image compared to the right camera image. This displacement, called disparity, can be used to retrieve the object's depth coordinate. Sanger (1988) proposed that the phase relation between two spatial band-pass filter responses (Gabor filters) could be used to measure local disparity. Simple cells in the visual cortex have receptive fields which can be described as Gabor filters. Most of them are driven binocularly and are tuned to respond most strongly to stimuli at a certain preferred distance from the fixation plane. Cortical complex cells receive input from simple cells; as a consequence, complex cells also implicitly encode visual depth. Here we formalize the computational procedure which could underlie the extraction of depth information from complex cell responses and solve it analytically for two different stimulus situations. The theory predicts a strong discrepancy between the actual and the perceived depth of sine-wave luminance-modulated ("grating") stimuli: if the spatial frequency of the grating is increased, it should appear to move closer to the observer.
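The phase-based disparity scheme attributed to Sanger (1988) above can be illustrated with a minimal 1-D sketch. All names, filter parameters, and the probe location below are illustrative assumptions, not values from the paper: a complex Gabor filter is applied at the same location in the left and right image, and the wrapped phase difference of the two responses, divided by the filter's preferred frequency, yields a disparity estimate. Probing a grating at the filter's preferred frequency recovers the true disparity; probing a grating of higher spatial frequency overestimates it, in line with the frequency-dependent depth misjudgment predicted here.

```python
import numpy as np

def gabor_phase(signal, x0, omega, sigma):
    """Phase of a complex Gabor filter response centered at sample x0."""
    x = np.arange(len(signal))
    kernel = (np.exp(-((x - x0) ** 2) / (2 * sigma ** 2))
              * np.exp(1j * omega * (x - x0)))
    return np.angle(np.sum(signal * kernel))

def phase_disparity(left, right, x0, omega, sigma):
    """Disparity estimate: wrapped phase difference between the right and
    left responses, divided by the filter's preferred frequency omega."""
    dphi = (gabor_phase(right, x0, omega, sigma)
            - gabor_phase(left, x0, omega, sigma))
    return np.angle(np.exp(1j * dphi)) / omega  # wrap dphi to (-pi, pi]

omega, sigma, x0 = 0.3, 15.0, 256  # assumed filter tuning and probe location
d = 4                              # true disparity in samples
x = np.arange(512)

# Grating at the filter's preferred frequency: disparity is recovered.
left = np.cos(omega * x)
right = np.cos(omega * (x - d))    # right image = left image shifted by d
print(phase_disparity(left, right, x0, omega, sigma))  # ~4.0

# Grating at 1.5x the preferred frequency: the phase difference grows with
# the stimulus frequency, so the disparity estimate is inflated by 1.5x.
left = np.cos(1.5 * omega * x)
right = np.cos(1.5 * omega * (x - d))
print(phase_disparity(left, right, x0, omega, sigma))  # ~6.0
```

The bias in the second case arises because the response phase tracks the stimulus frequency while the decoder divides by the filter's preferred frequency; a higher-frequency grating therefore carries a larger apparent disparity at the same physical depth.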