StereoRigDesigner
Stereoscopic viewing of 3D scenes is achieved via a modern version of Sir Charles Wheatstone's (1802-1875) mirror stereoscope, the original version of which is depicted in Figure 1. The Wheatstone stereoscope used a pair of mirrors, angled at 45 degrees to the user's eyes, to reflect pictures located off to the left and right sides.

Figure 1. Drawing of the Wheatstone mirror stereoscope.
The device provided the first demonstration of the importance of binocular depth perception by showing that when the two side pictures (which simulated left-eye and right-eye views of the same object) were presented so that each eye saw only the image designed for it, the two views could be fused together, recreating the 3D appearance of the object, a process known as stereopsis.
In our modern version of Wheatstone's stereoscope, stereoscopic viewing of a 3D scene is achieved by first capturing left and right views of the 3D scene and then projecting these views onto the eyes' vergence plane. The system is designed so that the following symmetries between real viewing, capture, and projection hold:
- camera distance to real scene = subject's distance to real scene = subject's distance to projection screen
- camera interaxial distance (separation) = subject's interocular separation
- camera field of view = projection field of view
- captured image dimensions = projection screen dimensions (magnification = 1)
The above symmetries result in reproduced 3D scenes whose dimensions are identical to those in the real scene. Below we describe the two components of the stereoscopic viewing rig: the capturing subsystem and the projection/reproduction subsystem.
The stereo scene capturing subsystem consists of three parts:
- First, a scene is composed in Blender, often programmatically using our custom Python library. In this phase, we specify the 3D geometry of the various scene objects, the positioning of the illumination lights, and the configuration of the capturing camera.
- The generated scene composition, exported as a Collada file, is imported into RenderToolbox-3 (RT3). RT3 is used to horizontally position the viewing camera so as to capture the left and right stereo views, and to assign reflectance properties to the different surfaces. RT3 passes this information to a physically-based renderer (Mitsuba), which ultimately generates the renderings of the scene that would be captured by cameras located at the subject's left and right eye positions.
- Finally, a StereoViewController object is used to present the generated stereo renderings on the left and right display panels. The StereoViewController uses OpenGL to create the viewing frustum (following Paul Bourke's method), followed by a correcting projection transformation which eliminates any projection artifacts that may be introduced by errors in the exact position and rotation of the mirrors and the displays.
Scene geometry is specified in Blender using an XYZ coordinate system, in which the vergence plane is located at Y = 0.0. The capturing camera is positioned at the location of the cyclopean eye:
cameraPosition = <0, -vd, cameraElevation>
where vd is the viewing distance (the distance from the subject's eye nodal position to the vergence plane), and cameraElevation is usually chosen to be near the center of the scene along the vertical axis. The camera is configured to look straight ahead, i.e.:
cameraLookAt = <0, 0, cameraElevation>
The camera's horizontal field of view is usually set to 31 deg, and the camera's width-to-height aspect ratio is usually set to 1.33. At a viewing distance of 76.4 cm this corresponds to a scene of 42.37 (W) x 31.78 (H) cm, i.e., close to the dimensions of the Stereo LCD display panels, which are 51.79 (W) x 32.36 (H) cm.
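These scene dimensions follow directly from the field of view and the viewing distance by simple trigonometry. The quick check below (a minimal sketch, not part of the toolbox) reproduces the quoted numbers:

```matlab
% Scene dimensions subtended at the vergence plane (simple trigonometry)
viewingDistance = 76.4;   % cm
horizontalFOV   = 31;     % deg
aspectRatio     = 4/3;    % width-to-height, i.e., 1.33

sceneWidth  = 2 * viewingDistance * tand(horizontalFOV/2);  % 42.37 cm
sceneHeight = sceneWidth / aspectRatio;                     % 31.78 cm
fprintf('Scene: %.2f (W) x %.2f (H) cm\n', sceneWidth, sceneHeight);
```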
To eliminate binocularly rivalrous viewing conditions, the camera is not allowed to capture the scene in its entirety. Instead, a viewing aperture is used, realized as a rectangular window in the front wall of the room enclosing the scene. Care must be taken to specify the dimensions of that window so that it eliminates the peripheral regions of the scene that are not captured from both the left and right camera positions. The dimensions of this window have an upper bound that depends on the enclosing room's width, height, and depth. This upper bound is computed by the StereoRigDesigner app (see below).
For a specified scene size of 42 (W) x 30 (H) cm, the StereoRigDesigner-computed maximal aperture dimensions, which are also the dimensions of the binocular field, are as follows:
| Scene depth (cm) | Aperture size (cm) | Aperture size (deg) |
| --- | --- | --- |
| 1 | 35.1 x 29.6 | 26.2 x 22.2 |
| 5 | 32.9 x 28.0 | 25.9 x 22.2 |
| 10 | 30.1 x 26.1 | 25.5 x 22.2 |
| 20 | 24.6 x 22.1 | 24.6 x 22.2 |
| 40 | 13.6 x 14.3 | 21.2 x 22.2 |
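The table values are consistent with a simple similar-triangles construction, in which the aperture lies in the front wall, sceneDepth cm in front of the vergence plane, and its width is trimmed by half the interocular separation on each side so that everything visible through it is captured by both cameras. The sketch below is our reconstruction of that geometry (not the app's own code); it reproduces the table:

```matlab
% Reconstruction of the maximal aperture sizes in the table above.
% Assumed geometry: the aperture sits in the front wall, sceneDepth cm
% in front of the vergence plane. Illustrative sketch, not the app's code.
viewingDistance = 76.4;                  % cm, eyes to vergence plane
interocular     = 6.4;                   % cm
sceneWidth      = 42; sceneHeight = 30;  % cm, specified scene size

for sceneDepth = [1 5 10 20 40]
    apertureDistance = viewingDistance - sceneDepth;  % eyes to front wall
    % Scale the scene outline down to the front-wall plane ...
    apertureHeight = sceneHeight * apertureDistance / viewingDistance;
    % ... and trim the width by iod/2 per side so both eyes' views overlap fully
    apertureWidth = sceneWidth * apertureDistance / viewingDistance - interocular;
    fprintf('depth %2d cm: aperture %.1f x %.1f cm (%.1f x %.1f deg)\n', ...
        sceneDepth, apertureWidth, apertureHeight, ...
        2*atand(apertureWidth/2/apertureDistance), ...
        2*atand(apertureHeight/2/apertureDistance));
end
```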
It must be emphasized, however, that stereo pairs create a "virtual" three-dimensional image in which binocular disparity and convergence cues are correct but accommodation cues are inconsistent, because each eye is looking at a flat image. The visual system tolerates this accommodation conflict only to a certain extent: it is thought that a feature's maximum separation on the display, i.e., the feature's display disparity, must not exceed 1/30 of the viewing distance.
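Combined with the rig's default geometry, this rule bounds the usable depth range around the vergence plane. The sketch below applies standard midline stereo geometry (illustrative only; the exact limits vary across the visual field) to solve for the nearest and farthest depths whose screen disparity stays within the 1/30 limit:

```matlab
% Depth range implied by the 1/30-of-viewing-distance disparity rule.
% For a midline point at distance z from the eyes, its screen disparity
% is iod * |z - vd| / z (uncrossed for z > vd, crossed for z < vd).
viewingDistance = 76.4;                  % cm
interocular     = 6.4;                   % cm
maxDisparity    = viewingDistance / 30;  % ~2.5 cm on the display

zFar  = interocular * viewingDistance / (interocular - maxDisparity);  % ~127 cm
zNear = interocular * viewingDistance / (interocular + maxDisparity);  % ~55 cm
fprintf('Comfortable depth range: %.1f to %.1f cm from the eyes\n', zNear, zFar);
```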
Stereo pair examples and disparity analysis here
Although RT3 is mainly used to modify the spectral properties of scene surfaces, here we only present its use in specifying the geometry of the left and right eye views. The following RT3 script specifies that each scene view will be captured in an image of 1280x1024 pixels, with the left and right camera positions separated by 6.4 cm, the typical observer inter-ocular distance. The remaining parameters are set to match those defined in Blender: viewing distance = 76.4 cm, camera field of view = 31 deg, camera y-position equal to minus the viewing distance, and camera elevation (z-coordinate) = 19.5 cm.
```matlab
% Choose batch renderer options.
hints = struct(...
    'imageWidth', 1280, ...
    'imageHeight', 1024, ...
    'renderer', 'Mitsuba');   % physically-based renderer (see text)

% Scene, mappings, and conditions file names (assumed here for illustration)
parentSceneFile = 'FancyBallroom.dae';          % Collada export from Blender
mappingsFile    = 'FancyBallroomMappings.txt';  % mappings file (excerpt below)
conditionsFile  = 'StereoConditions.txt';

% Tone mapping settings for the output montage (typical RT3 values)
toneMapFactor = 100;
isScaleGamma  = true;

% Left and right camera positions, separated by 6.4 cm, the typical
% observer inter-ocular distance
eyeLabels = {'Left', 'Right'};
eyeXpos   = [-3.2 3.2];

% Match Blender settings
viewingDistance = 76.4;
cameraFOV       = 31;
cameraDepthPos  = -viewingDistance;
cameraElevation = 19.5;

stereoView = containers.Map(eyeLabels, eyeXpos);
for eyeLabel = keys(stereoView)
    fprintf('Generating %s eye view\n', char(eyeLabel));
    % Write the per-eye camera parameters to the conditions file
    conditionKeys   = {'eyeXpos', 'eyeYpos', 'distance', 'fov'};
    conditionValues = {stereoView(char(eyeLabel)), cameraElevation, cameraDepthPos, cameraFOV};
    conditionsFile  = WriteConditionsFile(conditionsFile, conditionKeys, conditionValues);
    % Generate renderer-native scene files and render them
    nativeSceneFiles  = MakeSceneFiles(parentSceneFile, conditionsFile, mappingsFile, hints);
    radianceDataFiles = BatchRender(nativeSceneFiles, hints);
    % Assemble the rendered radiance data into a displayable montage
    montageName = sprintf('FancyBallroom%s', char(eyeLabel));
    montageFile = [montageName '.tiff'];
    [SRGBMontage, XYZMontage] = MakeMontage(radianceDataFiles, montageFile, toneMapFactor, isScaleGamma, hints);
end
```
This script is accompanied by a mappings file; the portion of the mappings file that relates to scene geometry is shown below. The parenthesized names, e.g. (eyeXpos), are placeholders that RT3 substitutes with the per-condition values written by WriteConditionsFile:
```
Collada {
    % swap camera handedness from Blender's Collada output
    Camera:scale|sid=scale = -1 1 1

    % move the camera
    Camera:translate|sid=location = (eyeXpos) (distance) (eyeYpos)

    % choose the camera field of view
    Camera-camera:optics:technique_common:perspective:xfov = (fov)
}
```
A StereoViewController is a Matlab object used to facilitate the presentation of stereo pair stimuli. The script below illustrates its use:
```matlab
horizontalFOV       = 31;
viewingDistance     = 76.4;
interocularDistance = 6.4;

% These dimensions are for the Stereo LCD display panels
displayPanelDims = [51.7988 32.3618];

% Get the sequence of stereo pairs and their expected dimensions in cm
[stereoPairSequence, imageWidthCM, imageHeightCM] = GetStereoPairSequence(viewingDistance, horizontalFOV, displayPanelDims);

% Directory containing the rendered stereo pair images (name assumed for illustration)
stereoPairDir = 'StereoPairs';

stereoCalibrationInfo = struct(...
    'displayPosition', {'left', 'right'}, ...
    'spectralFileNames', {'StereoLCDLeft.mat', 'StereoLCDRight.mat'}, ...
    'warpFileNames', {'StereoWarp-Radiance-left.mat', 'StereoWarp-Radiance-right.mat'}, ...
    'interOcularDistanceInCm', interocularDistance, ...
    'sceneDimensionsInCm', [displayPanelDims(1) displayPanelDims(2) viewingDistance]...
);

% Instantiate a StereoViewController object
performStringentCalibrationChecks = true;
stereoView = StereoViewController('stereodemo', ...
    stereoCalibrationInfo, ...
    performStringentCalibrationChecks, ...
    'beVerbose', true);

% Configure the stereo pair struct (double braces keep the empty cell
% array as the field value, rather than creating an empty struct array)
stereoPair = struct(...
    'stimulusSource', 'file', ...
    'imageNames', {{}}, ...
    'imagePosition', [0 0], ...
    'imageSize', [imageWidthCM imageHeightCM] ... % desired image size in cm
);

% Start listening for key presses, while suppressing any
% output of keypresses on the command window
ListenChar(2); FlushEvents;
Speak('Hit Enter to show the stimuli');

exitLoop  = false;
fileIndex = 1;
while (~exitLoop)
    % Show the next stereoPair in the rendering pipeline
    stereoPair.imageNames = {...
        fullfile(stereoPairDir, stereoPairSequence{fileIndex}{1}), ...
        fullfile(stereoPairDir, stereoPairSequence{fileIndex}{2})...
    };
    stereoView.setStereoPair(stereoPair);
    stereoView.showStimulus();
    % Wait for a key press before advancing; 'q' exits (assumed interaction scheme)
    key = [];
    while (isempty(key))
        key = mglGetKeyEvent;
    end
    if (key.charCode == 'q')
        exitLoop = true;
    end
    fileIndex = fileIndex + 1;
    if (fileIndex > numel(stereoPairSequence))
        fileIndex = 1;
    end
end % while loop

% Restore normal keyboard output and shut down the controller
ListenChar(0);
stereoView.shutdown;
```
The hardware for projecting the left and right eye views consists of two display panels and two first-surface mirrors. The two display panels present the captured left and right eye views of the scene (the stereo image pair). Each display panel is paired with a corresponding mirror, which reflects the displayed image into the corresponding eye of the subject. Figure 1A illustrates this operation for the left eye.

Figure 1A. Stereoscopic rig apparatus with the left frustum path outlined. The orange square on the left monitor represents the left component of the stereo image pair for a hypothetical scene. Its projection onto the mirror surface is reflected onto the subject's left retina. The corresponding view frustum is depicted in orange.
Tracing the real rays that emerge from the mirror backwards to a perceived point of origin on the vergence plane, yields the perceived (virtual) image of the corresponding stereo component (Figure 1B).

Figure 1B. Formation of the virtual image of the left stereo component on the vergence plane. The virtual image is represented by the red rectangle on the gray vergence plane in the back.
Figure 1C shows the view frusta for the left and right virtual images, represented by red and blue rectangles, respectively. These images are sampled by stereo-disparity-tuned neurons in the visual cortex, which initiate perception of the original 3D scene.

Figure 1C. View frusta for left and right virtual images.
In practice, the geometric design and alignment of the mirrors and display panels in a stereo rig can be quite daunting. To ease this process, we designed a MATLAB app (StereoRigDesigner) which allows the user to specify various desired properties (or restrictions) of the experimental setup and of the 3D scene to be viewed. Based on these data, the app computes the 3D positions and rotations of the monitors and mirrors needed to achieve perfectly positioned virtual images. The computed positions and rotations are such that the spatial extent of the generated virtual images matches exactly the spatial extent of the captured views, thereby yielding a stereo system with a 1:1 magnification factor.
The properties that can be specified by the user are:
- eye separation (in the Brainard lab setup, this is set to 6.4 cm)
- viewing distance (in the Brainard lab setup, this is set to 76.4 cm)
- mirror rotation
- mirror offset
- mirror width
- mirror height
- distance from the mirror assembly to the eye nodal point
- max desired virtual stimulus width
- max desired virtual stimulus height
- max desired scene depth

All the input properties can be reset to their default values, which correspond to the default configuration of the Stereo LCD rig in the Brainard lab.
The information computed by the app is:
- left and right display 3D position
- left and right mirror 3D position
- position and size of the binocular-rivalry-minimizing aperture
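At the heart of these computations lies the fact that a mirror forms the virtual image of a display by reflecting it across the mirror plane. The sketch below illustrates that single reflection step (our illustration of the underlying geometry, with hypothetical example values; the app's actual implementation handles the full assembly of displays and mirrors):

```matlab
% Reflect a display point across a mirror plane to locate its virtual image.
% mirrorPoint is any point on the mirror plane; mirrorNormal is its normal.
function virtualPoint = reflectAcrossMirror(displayPoint, mirrorPoint, mirrorNormal)
    n = mirrorNormal(:) / norm(mirrorNormal);      % unit plane normal
    d = dot(displayPoint(:) - mirrorPoint(:), n);  % signed distance to the plane
    virtualPoint = displayPoint(:) - 2 * d * n;    % mirror-image position
end
```

Reflecting the four corners of a display in this way traces out the virtual image outline; the app chooses monitor and mirror positions and rotations that place the reflected corners exactly on the vergence plane at 1:1 scale.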
Various menu-driven visualization options can be selected and the app can save graphic outputs of the computed rig configurations in PDF and PNG formats. Figure 2 depicts an instance of the app's GUI.

Figure 2. GUI of the StereoRigDesigner Matlab app. Several input parameters can be entered in the input fields on the left side. The computed 3D positions and rotations of the key components are displayed in the output fields at the top. Various viewing conditions can be specified via menu-driven options.
One application of StereoRigDesigner is to easily generate equivalent rig configurations with different geometries, i.e., rig configurations that produce identical virtual stimuli even though the mirrors and monitors have different rotations/positions. This can be useful for designing rigs with different footprints.
The default configuration of the LCD stereo rig in the Brainard lab is depicted in Figure 2.1.1.

Figure 2.1.1 The default stereo rig configuration used in the LCD Stereo rig of the Brainard lab. Viewing distance is 76.4 cm and mirror rotation is 87.6 deg.
Two equivalent rigs (that would produce identical virtual stimuli) with very different component arrangements can be seen in Figs 2.1.2 and 2.1.3.

Figure 2.1.2 Stereo rig configuration with a 50 cm viewing distance and a 40 degree mirror rotation.

Figure 2.1.3 Stereo rig configuration with a 100 cm viewing distance and a 60 degree mirror rotation.
A second application of StereoRigDesigner is to determine the minimal dimensions of the reflecting mirrors given the desired max scene dimensions and a particular rig configuration.
Figure 2.2.1 depicts the rig configuration required to project 42 (W) x 35 (H) cm stimuli when the mirror assembly is located 10 cm from the eye nodal point. The minimal mirror dimensions are 13 (W) x 9.6 (H) cm.

Figure 2.2.1 Determining minimal mirror dimensions for projection of 42 (W) x 35 (H) cm stimuli: nodal-mirror distance = 10 cm.
Figure 2.2.2 depicts the rig configuration required to project 42 (W) x 35 (H) cm stimuli when the mirror assembly is located 5 cm from the eye nodal point. In this configuration, the required mirrors can be quite a bit smaller, 8 (W) x 6.3 (H) cm. Also note that the mirrors no longer abut.

Figure 2.2.2 Determining minimal mirror dimensions for projection of 42 (W) x 35 (H) cm stimuli: nodal-mirror distance = 5 cm.
2.3 StereoRigDesigner: Computing the viewing aperture dimensions to avoid binocularly rivalrous conditions
A third application of StereoRigDesigner is determining the size of the viewing aperture that is necessary to avoid binocularly rivalrous stereo viewing. Binocular rivalry occurs in a stereo rig because the leftmost part of the frustum of the left image has no corresponding component in the right image, and similarly the rightmost part of the frustum of the right image has no corresponding component in the left image.
It must be noted that this aperture is not a real component. Its optimal size depends jointly on the width, height, and depth of the 3D scene (as well as on several other properties of the stereo rig), and it must be specified during the design of the stereo image pairs. Before the StereoRigDesigner app became available, it was computed by trial and error, often requiring several time-consuming stimulus rendering passes.
For a 30 (W) x 20 (H) cm scene with a 10 cm depth, the maximal aperture is 19.7 (W) x 17.4 (H) cm, as seen in Figure 2.3.1. If the scene's depth increases to 20 cm, the maximal aperture is 15.7 (W) x 14.8 (H) cm (Figure 2.3.2), and when the scene depth increases to 40 cm, the maximal aperture is 7.9 (W) x 9.5 (H) cm, as seen in Figure 2.3.3. We see, therefore, that as scene depth changes, the maximal viewing aperture changes not only in its size but also in its aspect ratio, something that might not have been immediately evident.

Figure 2.3.1 Determining the maximal viewing aperture for a 30 (W) x 20 (H) cm scene with a 10 cm depth range. The gray semitransparent box represents the enclosing room that could be specified during the scene geometry specification phase. The white dotted rectangle represents the viewing aperture.

Figure 2.3.2 Determining the maximal viewing aperture for a 30 (W) x 20 (H) cm scene with a 20 cm depth range. Note that the aperture is smaller.

Figure 2.3.3 Determining the maximal viewing aperture for a 30 (W) x 20 (H) cm scene with a 40 cm depth range. Note that the aperture is even smaller.