On generating stereo images with focus stacking software
Last Updated: 2nd Nov 2015
By Tony Peterson
Prior to the appearance of fast enough computers with sufficient memory, the only way to generate photographic stereo image pairs was to take pictures of the same subject with two cameras set some distance apart and aimed at the same point in space. Alternatively, one could move the camera or rotate the subject and take two photographs. This method simulates the different views our own eyes have of anything we look at. Delivering a different image to each eye can be accomplished using separate image projectors for each eye, polarizing/colorizing the images and using special glasses, or, as all field geologists quickly learn to do, by crossing the eyes and holding the air photos just-so. All of the stereo pairs in this article are meant to be viewed with the cross-eyed technique.
Obviously, any stereo pair works because it contains information on the 3rd dimension, toward/away from the subject (the Z axis). A true stereo image pair of a scene where some objects overlap along the Z axis can only be obtained by taking photographs from two different locations, because parts of the background objects are covered by the foreground objects in different ways. For examples of true stereo pairs here on Mindat, have a look at Modris Baum’s portfolio. Most of his stereo pairs were taken by rotating the specimen and using one camera, for example:
If background overlap is absent or negligibly small (or, if the background is to be ignored by leaving it out of focus), there is another way, which makes use of the image stacking techniques now so popular in microphotography. As we all know, these stacking techniques and computer algorithms permit the photographer to take a series of images as the camera is racked towards or away from the subject, so that the focal plane moves through it and in-focus samples of all or part of the subject are recorded. The algorithm then selects the most in-focus portions of each image, and tacks them together to create a final image that has a depth of focus which cannot be obtained with one image.
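The selection step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not what any particular stacking package does: the frames are taken to be already aligned and scaled, grayscale, and stored as plain nested lists, and "in focus" is approximated by a simple local-contrast measure. The names `local_contrast` and `merge_stack` are hypothetical.

```python
def local_contrast(img, y, x):
    """Sum of absolute brightness differences between a pixel and its 4 neighbours."""
    h, w = len(img), len(img[0])
    total = 0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny = min(max(y + dy, 0), h - 1)  # clamp at the image border
        nx = min(max(x + dx, 0), w - 1)
        total += abs(img[y][x] - img[ny][nx])
    return total


def merge_stack(frames):
    """For every pixel, copy the value from the frame with the highest local contrast.

    Returns (merged_image, index_map); index_map records which frame won each pixel.
    """
    h, w = len(frames[0]), len(frames[0][0])
    merged = [[0] * w for _ in range(h)]
    index_map = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = max(range(len(frames)),
                       key=lambda k: local_contrast(frames[k], y, x))
            merged[y][x] = frames[best][y][x]
            index_map[y][x] = best
    return merged, index_map


# Toy stack: frame 0 is "sharp" (checkerboard) on the left half and flat grey on
# the right; frame 1 is the other way round.
def checker(y, x):
    return 100 if (x + y) % 2 == 0 else 0

frame0 = [[checker(y, x) if x < 2 else 50 for x in range(4)] for y in range(4)]
frame1 = [[50 if x < 2 else checker(y, x) for x in range(4)] for y in range(4)]
merged, index_map = merge_stack([frame0, frame1])
```

In this toy case the merged image recovers the full checkerboard, and `index_map` shows which frame supplied each pixel. Real packages use far more robust focus measures and blending, but the per-pixel "pick the sharpest frame" idea is the same.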
Of course there’s a little more to it than that: the images will need some x-y adjustments and perhaps rotations to accommodate mechanical imperfections, and the subject’s apparent size changes as the camera is moved, so each image must also be scaled. In principle this technique is independent of the subject’s size, but practically speaking, focussing rails only permit a total travel on the order of a few cm, although they are capable of precise movements of 100 microns or less. It is also possible to create the image stack by changing the focal plane of the lens (i.e., turning the focussing ring), but it is difficult to be precise with this. I sometimes employ that technique when I want to stack through a depth beyond about 1 cm, and I have painted the ridges on my focussing ring so I can move it at reasonably precise intervals. However, I find that stereo images of wide (i.e., macro) views are usually less effective than micro views, and I don't attempt them very often.
The algorithm to construct the image must detect whatever it is that distinguishes in focus from out of focus, and that means edge detection, which in turn relies on finding areas of rapid change in brightness, i.e., lines of high contrast. Here I’m out of my depth, but as I understand it this is commonly done by measuring the amount of fine detail (high spatial frequencies) in the image, for example with a Fourier transform (for more information you can start at http://en.wikipedia.org/wiki/Focus_stacking).
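To illustrate why a frequency-based measure can separate in-focus from out-of-focus detail, here is a small sketch that works on a one-dimensional brightness profile (a single row of pixels). It is an assumption-laden toy, not a description of any real stacking package: the function names, the 0.25 cycles-per-pixel cutoff and the box blur standing in for defocus are all chosen for illustration.

```python
import cmath


def high_frequency_fraction(profile, cutoff=0.25):
    """Fraction of the (non-DC) spectral energy above `cutoff` cycles per pixel."""
    n = len(profile)
    mean = sum(profile) / n
    centred = [v - mean for v in profile]  # remove the DC (average brightness) term
    high = total = 0.0
    for k in range(1, n // 2 + 1):
        # Plain discrete Fourier transform coefficient for frequency k / n
        coeff = sum(c * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, c in enumerate(centred))
        energy = abs(coeff) ** 2
        total += energy
        if k / n > cutoff:
            high += energy
    return high / total if total else 0.0


def box_blur(profile, radius=2):
    """Crude stand-in for defocus: moving average with wrap-around at the ends."""
    n = len(profile)
    width = 2 * radius + 1
    return [sum(profile[(i + d) % n] for d in range(-radius, radius + 1)) / width
            for i in range(n)]


# Sharp edges (a square wave) versus the same edges after blurring
sharp = [0, 0, 0, 0, 255, 255, 255, 255] * 4
blurred = box_blur(sharp)
sharp_score = high_frequency_fraction(sharp)
blurred_score = high_frequency_fraction(blurred)
```

The sharp profile keeps a noticeably larger share of its energy at high frequencies than the blurred one, which is the kind of signal a stacking algorithm can use to decide which frame is in focus at a given location.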
Things can go wrong. Broad areas of low contrast, like uniform crystal faces, contain very little information and the algorithm can retain domains around minor image imperfections such as dust shadows on the detector, or even minor variations in pixel sensitivity. The worst offenders are bright highlights: when out of focus, these will generate halos that have higher contrast than surrounding parts of the image and are then, unfortunately, selected by the algorithm. Fortunately in many situations it is straightforward to touch these up with manual selection from the correct part of the image stack, provided your software package has that capability built in (below). However, if a highlight – say, one generated by a small reflective crystal face – has a significantly nearer Z position than the obscured area, satisfactory touchup will not be possible, and a “veil” covering the wanted detail will always be present. Dark (negative) highlights can also be problematical.
Stereo From Focus Stacks
The concept behind making stereo pairs from image stacks is quite simple. The parallax effect is simulated by shifting each image slightly to the right or the left (+ or – X), relative to the one below it, prior to sampling of the in-focus areas. For a physical analogy, think of a deck of cards, with each card being the focal plane of an image. If you look directly overtop, all you see is the top card; skew the deck by pushing it one way or the other, and a portion of each card is now seen, as though you had moved your head to one side or the other.
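The deck-of-cards idea can be written down as a short sketch. It assumes the frames are aligned grayscale images stored as nested lists, ordered from one end of the stack to the other, and that the shift per frame is a whole number of pixels; how a given package converts a percentage setting into pixels is not covered here. The names `shift_frame` and `skew_stack` are hypothetical.

```python
def shift_frame(img, dx, fill=0):
    """Shift every row of `img` by `dx` pixels along X (positive = right), padding with `fill`."""
    w = len(img[0])
    out = []
    for row in img:
        if dx >= 0:
            out.append([fill] * min(dx, w) + row[: max(w - dx, 0)])
        else:
            out.append(row[min(-dx, w):] + [fill] * min(-dx, w))
    return out


def skew_stack(frames, step_px):
    """Shift frame k by k * step_px pixels, like pushing a deck of cards sideways."""
    return [shift_frame(frame, k * step_px) for k, frame in enumerate(frames)]


# Three one-row frames, each with a single bright pixel at x = 3
frames = [[[0, 0, 0, 255, 0, 0, 0]] for _ in range(3)]
plus_view = skew_stack(frames, +1)   # skewed one way
minus_view = skew_stack(frames, -1)  # skewed the other way
```

Each skewed stack is then run through the ordinary focus-stacking step to give one image of the pair. A feature that is in focus in frame k ends up displaced by 2 * k * step_px pixels between the two views, which is the simulated parallax.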
One can immediately see an advantage to this technique over two-position photography: it is possible to obtain a stereo visualization of objects at the bottom of deep fractures or depressions, where two-position photography is impossible because the sides of the fracture block viewing the objects from any position except directly overhead. An example:
In addition, the image stack can be rotated at any angle prior to skewing and stacking, to mimic turning the plane of vision. Practically speaking, only 90 degree rotations would commonly be used, but this can be a way to manage the aspect ratio of the final image. Because the images are presented side-by-side, so that the combined image can only be half the width of the screen, long objects might be better presented in a vertical orientation. The example below was captured with the mineralized pocket oriented horizontally, but turned vertically for the stereo rendering:
Another advantage of stacking stereo is that lighting remains constant: in two-camera stereo photography, crystal faces can be illuminated differently, which produces a discordance between the two images.
Finally, one can vary the magnitude of the stereo effect. Strong stereo effects with a clean image can only be created with stacks that generate few artifacts. The examples below were created with Zerene Stack at 1% shift (first example) and 3% shift (second example, same view):
I have only recently attempted anything other than 1% shifts, and all of my other stereo images at the time of writing (including those in this article) are at 1%.
The disadvantages of stacking stereo are: as noted previously, objects overlapping along the Z axis cannot be properly represented (the regions that contain no information become filled by out-of-focus areas); and artifacts generated from highlights become both more intense and more frequent. Any out-of-focus halo becomes stretched along the X axis, which extends its unwanted influence further across the image.
Any regions in the background or foreground that were left out of focus will have the appearance of a flat, painted stage background. This is more tolerable in the background than the foreground. The types of artifacts that are produced in low-contrast areas, which might easily be overlooked in a 2-D stacking, become random features that don’t have right-left correspondence and are very distracting during stereo viewing. Fortunately, out-of-focus backgrounds and foregrounds are easily touched up (below). Highlight smears that aren’t in a separated foreground can also be touched up, but the process can be arduous; it is better to avoid making them in the first place!
Some Practical Considerations
An overriding principle governs all of the practical aspects of stereo pair creation. Because the stacking algorithm extracts information from the images, they need to contain as much information as possible to achieve the best results.
Lighting
There are no special rules for stereo image production, but the importance of the general rules is magnified. Try not to have extremes of light and dark over large areas, and use exposure settings that take advantage of the dynamic range of your camera. These can be monitored by viewing the brightness histogram of your images. A large area of uniform brightness and color (color is also information!) results in high, narrow peaks in the histogram of (# of pixels) vs brightness, in each color channel. Try to find natural details in such areas, e.g., by partial illumination of crystal faces, or positioning light sources to pick out growth features, lamellar twinning, etc. Carefully manage over-bright reflections. Minor “blowouts” (oversaturation of any or all of the color channels) are permissible and can even be desirable, but will be easier to manage if the surrounding areas are close in Z value.
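The histogram checks mentioned above can also be done numerically. The following is a minimal sketch for one 8-bit channel given as a flat list of pixel values; the function name `brightness_report` and the particular statistics it returns are assumptions made for illustration, not a standard tool.

```python
def brightness_report(pixels, levels=256):
    """Summarise one channel: clipping at both ends, the tallest peak, and levels in use."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    n = len(pixels)
    return {
        "clipped_low": hist[0] / n,            # fraction of pixels at pure black
        "clipped_high": hist[levels - 1] / n,  # fraction of pixels blown out
        "tallest_bin": max(hist) / n,          # a large value means a high, narrow peak
        "used_levels": sum(1 for count in hist if count),
    }


# A flat, partly blown-out channel versus one with plenty of tonal variation
flat = [128] * 90 + [255] * 10
varied = list(range(20, 220, 2))
flat_report = brightness_report(flat)
varied_report = brightness_report(varied)
```

Here the flat channel puts 90% of its pixels in a single bin and 10% in the blown-out top bin, while the varied channel spreads its pixels over 100 different levels with no clipping. The second kind of histogram gives the stacking algorithm much more to work with.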
Specimen Positioning
The best results (which for me mean a strong, meaningful stereo effect with a minimum of artifacts) are obtained when a sloping surface tilting toward/away from the observer (like looking from altitude down on a valley or a ridge) contains features that project above the surface, but not so much as to eclipse anything behind or beneath them. This description can include quite “rugged” specimen terrain, for example:
Transparent Objects
This technique is well suited to transparent crystals and can create stunning effects. The visual cues for such objects in a 2-D photograph are edges which reflect narrow bands of light, and relatively distorted views of background objects and inclusions through differently-oriented faces. These are also useful in stereo pairs:
(This image, an early effort made with a shallow stack and without touching up, contains some artifacts but shows both double refraction, and simple refractive doubling of an inclusion feature viewed through different faces).
The stacking technique usually allows for clear stereo imaging of objects viewed through the crystal – including edges on the far side. An example with a cubic mineral, uncomplicated by double refraction:
Mirror Effects
It not infrequently happens that a view of a well-lit object appears as a reflection in a crystal face. Of course, the reflective image will be in focus when the face the reflection appears on is not. In favourable circumstances, this can be a pleasing effect, e.g.:
But it can also happen that the reflection overlaps surface features, producing a confused mingling of images. If this happens, the only recourse is to shorten the stack to avoid the reflection, or use touchup to select either the reflection, or the surface features. An example where some of this mingling is present:
Initial Image Processing
Everyone who picks up a camera to photograph a mineral specimen values spatial resolution, which is provided by a good quality lens and a large sensor with small pixels. Sometimes, the importance of dynamic resolution (the number of brightness levels in each channel) is underappreciated. For single-shot photography it may not be a great issue, but because stacking strongly depends on the detection of variations in brightness levels, high dynamic resolution is important. A single-shot image may be recorded as an 8-bit jpeg by the camera software, using compression algorithms that are designed to retain dynamic resolution where it is needed. However, such images cannot be edited very much (e.g., changing the brightness levels, or altering the color balance) as this inevitably destroys information, adversely affecting the quality of the image.
Most SLR cameras generate 14-bit raw images. I strongly recommend that photographers take raw images from the camera and convert them to 16-bit TIFF using Photoshop, or software provided with the camera (Canon has rather good software for this). Any image can be edited much more heavily at a higher bit depth without creating dynamic artifacts, such as brightness "steps". Stack the 16-bit TIFFs, and complete ALL editing before converting to jpeg. If you archive your work, be sure to retain the final 16-bit stacked images (archiving the full stacks is very difficult, as it requires a great deal of storage space) so that you can go back and change your edits at a later date. I use Photoshop to convert from raw to TIFF, and I optimize the dynamics as much as possible (both brightness and color balance) during this step.
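The brightness "steps" are easy to reproduce with a little arithmetic. The sketch below takes the same underexposed ramp (covering only the bottom quarter of the brightness range), records it once at 8 bits and once at 16 bits, brightens both by a factor of 4, and counts how many distinct levels survive in the final 8-bit export. The ramp, the gain of 4 and the function name `stretch_and_export` are assumptions chosen for illustration.

```python
def stretch_and_export(values, max_in, gain):
    """Brighten by `gain` at the working bit depth, clip, then quantise to 8-bit for export."""
    out = []
    for v in values:
        stretched = min(v * gain, max_in)
        out.append(round(stretched * 255 / max_in))
    return out


n = 1024
ramp = [i / (n - 1) * 0.25 for i in range(n)]  # scene brightness, 0 to 25% of full scale

as_8bit = [round(r * 255) for r in ramp]       # what an 8-bit file would hold
as_16bit = [round(r * 65535) for r in ramp]    # what a 16-bit file would hold

levels_from_8bit = len(set(stretch_and_export(as_8bit, 255, 4)))
levels_from_16bit = len(set(stretch_and_export(as_16bit, 65535, 4)))
```

After the edit, the 8-bit version has only 65 distinct output levels, spaced four apart, which shows up as visible banding in smooth gradients. The 16-bit version still fills all 256 levels of the exported image.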
Touching Up
I can’t speak for other programs, but Zerene Stack provides a way to touch up stereo pairs. The original images can’t be used because they haven’t been shifted along X. However, a pair of touchup source images can be made by applying the stereo generation algorithm to as few as two adjacent images. I find this is usually necessary to fix out-of-focus background or foreground areas (relatively easy), and less often highlight artifacts (potentially arduous). Here is an example of a simple background/foreground touchup situation:
And here is one, with criss-crossing acicular metallic crystals with many highlights, that was extremely arduous:
How Many Images in the Stack?
Well, you probably can’t have too many, although it is possible to have more than you need. Theoretically, only an infinitesimally thin plane of the subject can be in focus for any given position of the camera, but depending on the aperture setting, distance to the subject, lens type, etc., there is a finite thickness along Z which is so close to perfect focus that the error is undetectable. Ideally, a stack will be designed to have these slices overlap slightly, but a practical consideration is the time required to generate a stack, and to run the stacking algorithm on it. For microphotography with a field of view of 4-6 mm and a Z depth on the order of 1-3 mm, I find approximately 100 images give very good results, but as few as 50 can work well. Unsurprisingly, highlight artifacts are reduced by using a finer stack, although stack depth has no effect on background/foreground artifacts.
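The relationship between Z depth, slice thickness and frame count is simple arithmetic, sketched below. The in-focus slice thickness depends on your lens, aperture and magnification; the 25 micron figure and the 20% overlap used here are assumptions for illustration only, and `plan_stack` is a hypothetical helper.

```python
import math


def plan_stack(z_depth_um, slice_thickness_um, overlap=0.2):
    """Return (step_um, n_frames) so that adjacent in-focus slices overlap by `overlap` (0 to <1)."""
    step = slice_thickness_um * (1.0 - overlap)
    # Small tolerance so floating-point noise does not add a spurious extra frame
    n_frames = math.ceil(z_depth_um / step - 1e-9) + 1
    return step, n_frames


# 2 mm of subject depth, with an assumed 25 micron in-focus slice and 20% overlap
step_um, n_frames = plan_stack(2000, 25, overlap=0.2)
```

With these assumed values the rail moves about 20 microns per frame and the stack needs 101 frames, which is in line with the roughly 100 images quoted above for a Z depth of 1-3 mm.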
Currently, I plan my stacks for both 2D and 3D images. I try to make the stack deep enough to have ALL of the scene in focus, and use that stack for stereo pairs. I then use a subset of that stack which has the principal subject (usually a single crystal or crystal group) in focus but leaves part of the background and foreground out of focus. This retains important visual cues on the depth of field which are needed for a successful 2D rendering. An example (2D first, then stereo):
January 9, 2015