A High-Resolution Monocular Panoramic Camera
Using a Double Mirror Pyramid
High-resolution
panoramic capture is highly desirable in many applications such as immersive virtual
environments, tele-conferencing, surveillance, and robot navigation. In
addition, a single viewpoint for all viewing directions, a large depth-of-field
(omni-focus), and real-time acquisition are desired in some imaging
applications (e.g. 3D reconstruction and rendering). The FOV of a conventional camera is limited
by the size of its sensor and the focal length of its lens. For example, a
typical 16mm lens with a 2/3" CCD sensor has a
FOV of roughly 31 by 23 degrees.
The number of pixels on the sensor (640 x 480
for an NTSC camera) determines the resolution. The depth-of-field is limited
and is determined by various imaging parameters such as aperture, focal length,
and the scene location of the object.
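The dependence of the FOV on sensor size and focal length can be sketched with the ideal pinhole relation FOV = 2 * atan(d / 2f). This is a minimal illustration, not part of the original analysis: the 8.8mm x 6.6mm active area is the commonly quoted nominal size of a 2/3" sensor and is an assumption here, as is the function name.

```python
import math

def field_of_view_deg(sensor_dim_mm, focal_length_mm):
    """Full field of view (degrees) of an ideal pinhole camera along one sensor dimension."""
    return math.degrees(2.0 * math.atan(sensor_dim_mm / (2.0 * focal_length_mm)))

# Assumed nominal active area of a 2/3" sensor (not stated in the text).
SENSOR_W_MM, SENSOR_H_MM = 8.8, 6.6
FOCAL_LENGTH_MM = 16.0  # the lens used as the example in the text

horizontal_fov = field_of_view_deg(SENSOR_W_MM, FOCAL_LENGTH_MM)
vertical_fov = field_of_view_deg(SENSOR_H_MM, FOCAL_LENGTH_MM)
print(f"FOV: {horizontal_fov:.1f} x {vertical_fov:.1f} degrees")
```

Under these assumptions the 16mm lens covers only a narrow field, which is why a single conventional camera cannot deliver a panoramic view on its own.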
Many
approaches have been presented to achieve various subsets of these properties:
wide FOV, high resolution, large depth-of-field, a single viewpoint, and
real-time acquisition. Among these, mirror-pyramid (MP)-based camera systems
offer a promising approach to capturing high-resolution, wide-FOV
panoramas as
they provide single-viewpoint images at video rate. Such systems use planar
mirrors assembled
in pyramid or prism shapes, and as many cameras as the number of mirror faces, each located and
oriented to capture the part of the scene reflected off one of the flat mirror
faces. Images from the individual cameras are concatenated to yield a
360-degree wide panoramic image. Compared to designs using parabolic or
hyperbolic mirrors, flat mirrors are easier to design and produce, and they
introduce minimal optical aberrations.
We
have developed a double-mirror-pyramid design that doubles the size of the
visual field of single-pyramid-based systems. With this prototype, we have
developed methods for optimally choosing the
parameters of MP-based camera systems (e.g., camera placement and pyramid geometry) to maximize sensor
usage and the uniformity of image resolution, and
for evaluating the resultant image quality.
2. Overview of panoramic imaging
The existing methods of capturing panoramas
fall into one of two categories: dioptric methods, where only refractive
elements (lenses) are employed, and catadioptric methods, where a combination
of reflective and refractive components is used. Typical dioptric systems
include: the camera cluster method where multiple cameras
point in different directions to cover a wide FOV; the fisheye method where a single camera acquires
a wide FOV image through a fisheye lens; and the rotating camera method where a
conventional camera pans to generate mosaics, or a camera with a non-frontal,
tilted sensor pans around its viewpoint to acquire panoramic omni-focused
images. The catadioptric methods include sensors in which a single camera
captures the scene as reflected off a single curved mirror, and sensors
in which multiple cameras image the scene as reflected off planar mirror surfaces.
Dioptric camera clusters are capable of capturing high-resolution panoramas at video rate. However, the cameras in these clusters typically do not share a unique viewpoint due to physical
constraints, which makes it difficult or even impossible to mosaic individual
images into a true panoramic view, although apparent continuity across images
may be achieved by ad hoc image blending. Sensors with a fisheye lens are able to deliver
large-FOV images at video rate, but suffer from low resolution, irreversible
distortion for close-by objects, and non-unique viewpoints for different
portions of the FOV. Rotating
cameras deliver a high-resolution wide FOV via panning, as well as omni-focus
when used in conjunction with non-frontal imaging, but they have a limited
vertical FOV. Furthermore, because they sequentially capture different parts of
the FOV, moving objects may be imaged incorrectly.
Catadioptric sensors that use a parabolic
or hyperbolic mirror to map an omni-directional view onto a single sensor are able to achieve a single
viewpoint at video rate, but the resolution of the acquired image is limited to
that of the sensor used and varies
significantly with the viewing direction across the visual field.
As in the dioptric case, this resolution problem can be partially alleviated by replacing the
simultaneous imaging of the entire FOV with panning and sequential imaging of
its parts, followed by mosaicing the images, at the expense of video rate. Another category of catadioptric
sensors employs a number of planar
mirrors assembled in the shape of a right mirror-pyramid, together with as many cameras as there are pyramid faces. Each of these cameras, capturing the part of the scene reflected
off one of the faces, is located
and oriented such that the mirror images of the cameras'
viewpoints are co-located at a single point inside the pyramid. Effectively,
this creates a virtual camera that captures wide-FOV, high-resolution panoramas
at video rate.
Proposed Double-Mirror-Pyramid Camera
The
main challenge in constructing a panoramic camera from multiple sensors is to
co-locate the entrance pupils of the multiple cameras so that adjacent cameras
cover contiguous FOV without obstructing the view of other cameras or their
own. Nalwa first used a right mirror pyramid (MP) formed from planar mirrors
for this purpose. He reported an implementation using a 4-sided right pyramid
and 4 cameras. The pyramid stands on its horizontal base. Each triangular face forms a 45-degree angle
with the base. The cameras are positioned in the horizontal plane that contains
the pyramid’s vertex such that the entrance pupil of each camera is equidistant
from the vertex and the mirror images of the entrance pupils coincide at a
common point, C, on the axis of the pyramid. The cameras are pointed vertically
downward at the pyramid faces such that the virtual optical axes of the cameras
are all contained in a plane parallel to the pyramid base, effectively viewing
the world horizontally outward from the common virtual viewpoint.
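The geometry described above can be checked numerically by reflecting each entrance pupil, and each downward optical axis, across its mirror face. The sketch below is illustrative only: the coordinate frame (vertex at the origin, pyramid axis along z), the pupil distance of 1.0, and all function names are assumptions, not values from the original design.

```python
import math

def face_normal(slope_deg, azimuth_deg):
    """Unit outward normal of a pyramid face that makes slope_deg with the base."""
    t, a = math.radians(slope_deg), math.radians(azimuth_deg)
    return (math.sin(t) * math.cos(a), math.sin(t) * math.sin(a), math.cos(t))

def reflect(v, n):
    """Mirror a point or direction v across the plane through the origin with unit normal n."""
    s = sum(vi * ni for vi, ni in zip(v, n))
    return tuple(vi - 2.0 * s * ni for vi, ni in zip(v, n))

SLOPE_DEG = 45.0      # face-to-base angle given in the text
NUM_FACES = 4         # 4-sided pyramid given in the text
PUPIL_DISTANCE = 1.0  # hypothetical distance of each entrance pupil from the vertex

virtual_viewpoints, virtual_axes = [], []
for k in range(NUM_FACES):
    azimuth_deg = 360.0 * k / NUM_FACES
    a = math.radians(azimuth_deg)
    n = face_normal(SLOPE_DEG, azimuth_deg)
    # Entrance pupil in the horizontal plane (z = 0) that contains the vertex.
    pupil = (PUPIL_DISTANCE * math.cos(a), PUPIL_DISTANCE * math.sin(a), 0.0)
    virtual_viewpoints.append(reflect(pupil, n))
    # Camera looks straight down; its mirror image is the virtual optical axis.
    virtual_axes.append(reflect((0.0, 0.0, -1.0), n))

for vp, ax in zip(virtual_viewpoints, virtual_axes):
    print([round(c, 6) for c in vp], [round(c, 6) for c in ax])
```

In this frame every reflected pupil lands at the same point on the pyramid axis, one pupil distance below the vertex, and every reflected optical axis has zero vertical component, which matches the horizontal outward view from the common point C.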
The vertical dimension of the panoramic FOV in each of the
aforementioned cases is the same as that of each of the cameras used – only
their horizontal FOVs are concatenated to obtain a wider, panoramic view. We
have developed a panoramic design that uses a dual mirror-pyramid
(DMP), formed by joining two mirror-pyramids such that their bases coincide
(Fig. 2), together with two layers of camera clusters. Such a DMP-based design thus doubles the
vertical FOV while preserving the ability to acquire panoramic high-resolution
images from an apparent single viewpoint at video rate.
In an MP-based panoramic system, it is critical to
optimize the geometry of the pyramid or prism, the placement of the common
viewpoint along with the mirror surfaces, and the selection of camera
parameters to maximize the overall FOV, sensor usage, and image uniformity. We
have analyzed both geometrical and optical constraints in a DMP-based system
and established relationships that relate the design parameters to the
resultant image quality. This
analysis can be generalized and applied to other MP-based designs.

As
illustrated in Fig. 2-a, a DMP is formed by stacking two N-sided
mirror-pyramids back to back such that their bases coincide. Without loss of
generality, we assume the use of right pyramids, in which the surfaces are
symmetric about the pyramid axis. A right
DMP is characterized by the number of
mirror faces in a single pyramid, the slope angle, and its base and cap radii. The
slope angle of a pyramid is the angle formed by a mirror surface with the pyramid base. We
assume the two pyramids have the same slope angle.
The base and cap radii of a pyramid are
the radii of the inscribed circles of the base and cap polygons. We consider a unit DMP,
which has a unit base radius, so that the cap radii
of the pyramids A and D
can also be interpreted as the ratios of the actual cap radii to the base radius.
Using
the constraints and relationships derived in Sections 4 and 5, we designed a
DMP panoramic camera with two right hexagonal (N = 6)
truncated pyramids (Fig. 3). The pyramid has a base radius of
and a slope angle of
40 degrees. The camera-to-pyramid size ratio is about
0.2. The optimal cap radius of the pyramid is 56.3mm, and the shape factor of
the pyramid is 0.43. This DMP design yields a total
non-occluded FOV. Each of the mirror surfaces
effectively covers 60 degrees FOV horizontally and 41.2 degrees
vertically.
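As a rough consistency check on the per-face coverage quoted above, the coverage of the whole DMP can be tallied by summing the faces around the axis and the two layers along it. This sketch assumes that adjacent faces tile the field contiguously with no overlap and no gap, which the text implies but does not state; the variable names are illustrative.

```python
NUM_FACES = 6             # hexagonal pyramids, from the text
NUM_LAYERS = 2            # upper and lower pyramid of the DMP
FACE_H_FOV_DEG = 60.0     # horizontal coverage per mirror face, from the text
FACE_V_FOV_DEG = 41.2     # vertical coverage per mirror face, from the text

# Assumption: faces and layers abut exactly, so coverage simply adds up.
total_h_fov_deg = NUM_FACES * FACE_H_FOV_DEG
total_v_fov_deg = NUM_LAYERS * FACE_V_FOV_DEG

print(f"horizontal: {total_h_fov_deg:.1f} degrees, vertical: {total_v_fov_deg:.1f} degrees")
```

The horizontal figure closes the full circle, as expected for six faces of 60 degrees each, and the vertical figure is twice the single-layer coverage.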
The cameras are tilted at
20.58 degrees relative to the pyramid base, yielding optimal sensor
utilization. The
field angles corresponding to corners
are
,
, and
, respectively. The aspect ratio of the reflective visual field is 1.64.
Assuming a sensor with an aspect ratio of 4:3, the minimal FOV requirements for
the cameras are
and
for the horizontal and
vertical directions, respectively.

The
cameras selected are Pulnix cameras with 2/3"
black-and-white CCD sensors.
Thus the maximum focal length is 7.13mm, which yields a sensor efficiency of
72% and an image non-uniformity of 27%. A 6.5mm lens is selected. This sensor-lens
combination effectively provides a FOV of
and
for the horizontal and
vertical directions, respectively. The pyramid-camera combination yields a
panoramic system with a total of 2.176 million pixels, a sensor efficiency of 60%,
and an image non-uniformity of 27%.
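One way to see why the sensor efficiency drops from 72% to 60% when the focal length is reduced from 7.13mm to 6.5mm is the following sketch. It rests on assumptions that are not from the original analysis: an ideal pinhole model, a nominal 8.8mm x 6.6mm active area for the 2/3" sensor, and the premise that the image of a mirror face shrinks linearly with focal length, so the used sensor area scales with its square.

```python
import math

SENSOR_W_MM, SENSOR_H_MM = 8.8, 6.6  # assumed nominal 2/3" active area
F_MAX_MM = 7.13                      # maximum focal length, from the text
F_SELECTED_MM = 6.5                  # selected lens, from the text
EFFICIENCY_AT_F_MAX = 0.72           # sensor efficiency at F_MAX_MM, from the text

def field_of_view_deg(sensor_dim_mm, focal_length_mm):
    """Full pinhole field of view (degrees) along one sensor dimension."""
    return math.degrees(2.0 * math.atan(sensor_dim_mm / (2.0 * focal_length_mm)))

def scaled_efficiency(efficiency_at_f_max, f_max_mm, f_mm):
    """Used-area fraction after shortening the focal length (assumed quadratic scaling)."""
    return efficiency_at_f_max * (f_mm / f_max_mm) ** 2

camera_h_fov = field_of_view_deg(SENSOR_W_MM, F_SELECTED_MM)
camera_v_fov = field_of_view_deg(SENSOR_H_MM, F_SELECTED_MM)
efficiency = scaled_efficiency(EFFICIENCY_AT_F_MAX, F_MAX_MM, F_SELECTED_MM)

print(f"camera FOV: {camera_h_fov:.1f} x {camera_v_fov:.1f} degrees")
print(f"estimated sensor efficiency: {efficiency:.1%}")
```

Under these assumptions the estimate agrees with the quoted 60% to within rounding, which suggests the efficiency loss comes from the margin left by choosing a lens shorter than the maximum.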

Figure
4 shows results for a prototype containing only 4 physical cameras, instead of
the full capacity of 12. Figures 4-a through 4-d show four images acquired by four
adjacent cameras, two horizontally adjacent in the upper layer and the other
two horizontally adjacent in the lower layer, which also form two vertically
adjacent pairs. After post-processing to correct
keystone and radial distortions, images from vertically adjacent cameras are concatenated to form a vertical mosaic corresponding
to double the FOV of the individual cameras. The resulting vertical mosaics are shown in Figures 4-e and 4-f. The two
vertical mosaics (six in a full implementation) are
projected onto a cylinder centered at the common viewpoint to form a seamless
cylindrical mosaic. The final 4-camera mosaic of the two images 4-e and 4-f is
shown in Figure 4-g.
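The projection of the vertical mosaics onto a cylinder centered at the common viewpoint can be sketched as follows. This is a generic planar-to-cylindrical mapping, not necessarily the procedure used for Figure 4: it assumes each vertical mosaic has already been rectified into an ideal pinhole image with a horizontal optical axis, and the focal length in pixels and the function name are hypothetical.

```python
import math

def plane_to_cylinder(x_px, y_px, focal_px, yaw_rad=0.0):
    """Map an image point to (azimuth, height) on a unit-radius cylinder.

    x_px and y_px are measured from the principal point, focal_px is the
    focal length in pixels, and yaw_rad is the azimuth of the optical axis.
    """
    azimuth = yaw_rad + math.atan2(x_px, focal_px)
    height = y_px / math.hypot(x_px, focal_px)
    return azimuth, height

FOCAL_PX = 800.0               # hypothetical focal length in pixels
YAW_STEP = math.radians(60.0)  # adjacent faces of a hexagonal pyramid are 60 degrees apart

# Half-width (in pixels) of the image region that covers one 60-degree face.
half_width_px = FOCAL_PX * math.tan(YAW_STEP / 2.0)

# The right edge of one mosaic and the left edge of its neighbor
# should land on the same azimuth of the cylinder.
right_edge_first = plane_to_cylinder(half_width_px, 0.0, FOCAL_PX, yaw_rad=0.0)
left_edge_second = plane_to_cylinder(-half_width_px, 0.0, FOCAL_PX, yaw_rad=YAW_STEP)

print(math.degrees(right_edge_first[0]), math.degrees(left_edge_second[0]))
```

Because all mosaics share one viewpoint, the mapping depends only on viewing direction, so neighboring mosaics meet along a common azimuth without parallax.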