Camera arrangement

This page discusses how many cameras to use with ScannedReality Studio and how to arrange these cameras around the subject.

Number of cameras

While it is possible to use ScannedReality Studio with a single depth camera, the quality of the results improves as more cameras are used, since the program has more data to work with.

In addition to this basic rule, keep in mind that for a complete 3D reconstruction of the subject, any surfaces that no camera films directly must be interpolated or otherwise filled in by the program. If these areas are too large, the fill-in will most likely not look plausible.

Because of this, ScannedReality Studio has a reconstruction setting that fades out surfaces that face away from all cameras, so that they become transparent. This setting is intended for use with one or a few cameras.


Example views of a person reconstructed in “front-facing” mode, using two Azure Kinect cameras placed in front of the person. The sides and the back of the person are transparent.


Example views of a person reconstructed in “surround” mode, using cameras placed all around the person. All sides are opaque; there is no transparency.

For full 3D reconstructions of a person in “surround” mode, four depth cameras are the bare minimum. With this number of cameras, occlusions are still very likely to occur. In addition, the cameras need to be placed relatively far away from the subject to observe it completely, which reduces the reconstruction quality and may cause areas that are difficult for depth cameras to observe, such as hair, to be missing.

Furthermore, keep in mind that for robust coverage of the volume that you aim to record, any given point within the volume should ideally be observed not just from one direction, but from a variety of directions. Otherwise, the point is easily occluded, and any detail at this point will then be missed.
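To get a rough feeling for how many cameras observe a given point, a planned layout can be checked numerically. The following is a minimal two-dimensional sketch, not part of ScannedReality Studio: the function name `observing_cameras`, the camera representation as `((x, y), (dx, dy))` pairs with unit view directions, and the half field of view of 30 degrees are all assumptions made for illustration. Occlusion by the subject itself is ignored, so the result is an upper bound on the real coverage.

```python
import math


def observing_cameras(point, cameras, half_fov_deg):
    """Return the indices of all cameras whose 2D view cone contains `point`.

    `cameras` is a list of ((x, y), (dx, dy)) pairs, where (dx, dy) is a unit
    view direction. Occlusion is NOT modeled (simplifying assumption).
    """
    cos_limit = math.cos(math.radians(half_fov_deg))
    visible = []
    for index, ((cx, cy), (dx, dy)) in enumerate(cameras):
        vx, vy = point[0] - cx, point[1] - cy
        dist = math.hypot(vx, vy)
        if dist == 0.0:
            continue  # The point coincides with the camera center.
        # Cosine of the angle between the view direction and the ray to the point.
        if (vx * dx + vy * dy) / dist >= cos_limit:
            visible.append(index)
    return visible


# Hypothetical layout: four cameras, 2 m from the center, all looking inwards.
cameras = [
    ((2.0, 0.0), (-1.0, 0.0)),
    ((-2.0, 0.0), (1.0, 0.0)),
    ((0.0, 2.0), (0.0, -1.0)),
    ((0.0, -2.0), (0.0, 1.0)),
]
print(observing_cameras((0.0, 0.0), cameras, half_fov_deg=30.0))
print(observing_cameras((0.0, 1.5), cameras, half_fov_deg=30.0))
```

In this example, the center is inside the view cones of all four cameras, while a point near the edge of the volume is seen by only two of them. This is the kind of weakly covered region that additional cameras are meant to address.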

Given these considerations, we recommend using at least 8 to 10 depth cameras for filming a person in “surround” mode in good quality.

Camera placement

For the “front-facing” reconstruction mode, camera placement is mostly an artistic choice, determining the main view that will be available in the 3D reconstruction.

For the “surround” reconstruction mode, however, the goal is generally to observe the subject as completely as possible, ideally leaving no unobserved surfaces that need to be filled in artificially.

Another consideration is to achieve a good observation distance between the subject and the cameras. Most importantly, depth cameras exhibit higher noise at larger observation distances, so placing them closer to the subject will generally improve the surface quality (while reducing the area that they can see). They usually also have a minimum observation distance.

While less critical, similar rules apply to color cameras: Moving them closer to the subject will increase the effective resolution with which the subject is observed, while reducing the observed area. Moving cameras that have a fixed focus too close to the subject may result in a blurry image.
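The trade-off between observed area and effective resolution can be estimated with simple pinhole-camera geometry. The sketch below assumes an ideal pinhole model without lens distortion and a surface perpendicular to the optical axis. The function names, the 90 degree horizontal field of view, and the 3840 pixel image width are hypothetical example values, not specifications of any particular camera.

```python
import math


def footprint_width(distance_m, hfov_deg):
    """Width (in meters) of the area seen at `distance_m` by a pinhole camera
    with the given horizontal field of view (in degrees)."""
    return 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)


def pixels_per_meter(image_width_px, distance_m, hfov_deg):
    """Approximate effective resolution on a surface at `distance_m`."""
    return image_width_px / footprint_width(distance_m, hfov_deg)


# Hypothetical example values (assumptions, not camera specifications).
for distance in (0.5, 1.0, 2.0):
    width = footprint_width(distance, hfov_deg=90.0)
    ppm = pixels_per_meter(3840, distance, hfov_deg=90.0)
    print(f"{distance:.1f} m: sees {width:.2f} m, about {ppm:.0f} px per meter")
```

Under this model, halving the distance doubles the pixels per meter and halves the width of the observed area.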

A strategy to define the camera placement is thus to determine a desired range for the observation distance, and then to distribute the cameras evenly around the intended recording volume at that distance from it. For example, for a single standing human, this may result in a capsule-like shape on which the cameras are arranged around the subject.


Two-dimensional sketch for a strategy of placing cameras for a “surround” recording configuration: Based on the planned recording volume (bright red), go outwards by a fixed distance (bright blue), and on the boundary of that area, distribute the inwards-looking cameras (black triangles) roughly evenly.
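This placement strategy can be sketched in code for the two-dimensional case. The example below is an illustration under stated assumptions, not a ScannedReality Studio feature: the recording area is modeled as a "stadium" (a line segment of length `core_length` thickened by `volume_radius`), the cameras are spaced evenly by arc length along the boundary that lies `distance` further out, and each camera looks towards the nearest point on the core segment. The function name `place_cameras` and all parameter values are hypothetical.

```python
import math


def place_cameras(num_cameras, core_length, volume_radius, distance):
    """Return a list of ((x, y), (dx, dy)) camera positions and unit view
    directions, evenly spaced along the offset boundary of a 2D stadium."""
    if num_cameras < 1:
        raise ValueError("num_cameras must be at least 1")
    half = core_length / 2.0
    r = volume_radius + distance      # Offset from the core segment.
    arc = math.pi * r                 # Length of each semicircular end.
    perimeter = 2.0 * core_length + 2.0 * arc
    cameras = []
    for i in range(num_cameras):
        s = perimeter * i / num_cameras  # Arc-length position of camera i.
        if s < arc:                      # Right semicircle.
            theta = -math.pi / 2.0 + s / r
            x, y = half + r * math.cos(theta), r * math.sin(theta)
        elif s < arc + core_length:      # Top straight part.
            x, y = half - (s - arc), r
        elif s < 2.0 * arc + core_length:  # Left semicircle.
            theta = math.pi / 2.0 + (s - arc - core_length) / r
            x, y = -half + r * math.cos(theta), r * math.sin(theta)
        else:                            # Bottom straight part.
            x, y = -half + (s - 2.0 * arc - core_length), -r
        # Look at the nearest point on the core segment (inwards).
        nearest_x = min(max(x, -half), half)
        dx, dy = nearest_x - x, -y
        norm = math.hypot(dx, dy)
        cameras.append(((x, y), (dx / norm, dy / norm)))
    return cameras


# Hypothetical example: 8 cameras, 1 m away from a 1.5 m x 1 m recording area.
for position, direction in place_cameras(8, 0.5, 0.5, 1.0):
    print(f"position ({position[0]:+.2f}, {position[1]:+.2f}), "
          f"looking ({direction[0]:+.2f}, {direction[1]:+.2f})")
```

A real setup would extend this to three dimensions and vary the camera heights. It would also usually deviate from a perfectly even spacing to prioritize important areas.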

In practice, camera systems with Azure Kinect or Femto Bolt cameras are usually limited to at most 10 cameras (with more of these cameras, tight synchronization without mutual interference is no longer possible). This means that important areas, such as the front side in general, or the face of a person, might have to be prioritized rather than distributing the cameras completely evenly.

The following screen recording of ScannedReality Studio shows an example of a real-world camera setup that we used with 10 Femto Bolt cameras, with the camera poses shown as yellow pyramids:

To physically place the cameras around the subject, tripods or telescopic clamping rods may be used, for example. Make sure that the cameras are firmly fixed in place and do not move: reconstructing volumetric videos requires an accurate calibration of the camera positions and orientations, which even tiny motions would disturb.

There is no need to position the cameras upright to create “nice” images. What matters most is that the coverage of the subject being filmed is as good as possible.

If possible, it is also helpful to avoid bright lights in the camera images, which would cause overexposure and blooming, since the blooming could extend over the subject and possibly end up in the resulting videos. Thus, indirect lighting of the subject is usually highly preferable to direct lighting.