Distributed localization of camera networks
The localization problem in a camera sensor network refers to uniquely
determining the pose of every camera with respect to a fixed reference
frame. More precisely, assume we have a network of n cameras deployed
in 3-D. The pose of each camera with respect to a reference frame is given by its rotation R and its translation T relative to that frame. We say the network is localized if there is a
set of relative transformations (R_ij, T_ij) between cameras i and j such that, when the reference frame of node 1 is fixed at (R_1, T_1), the other poses (R_i, T_i) are uniquely determined.
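The following minimal sketch (illustrative names; it assumes the convention that (R_ij, T_ij) is the pose of camera j expressed in camera i's frame, so that R_j = R_i R_ij and T_j = R_i T_ij + T_i) shows how fixing node 1 and chaining the relative transformations along a spanning tree determines every other pose:

```python
import numpy as np

def propagate_poses(tree_edges, rel_poses, R1=np.eye(3), T1=np.zeros(3)):
    """Fix node 1 at (R1, T1) and propagate poses along spanning-tree edges.

    tree_edges: list of directed edges (i, j) forming a spanning tree rooted at 1.
    rel_poses:  dict (i, j) -> (R_ij, T_ij), the pose of camera j expressed
                in camera i's frame (assumed convention).
    Returns a dict node -> (R, T) in the common reference frame of node 1.
    """
    poses = {1: (R1, T1)}
    frontier = [1]
    while frontier:
        i = frontier.pop()
        R_i, T_i = poses[i]
        for (a, b) in tree_edges:
            if a == i and b not in poses:
                R_ij, T_ij = rel_poses[(a, b)]
                poses[b] = (R_i @ R_ij, R_i @ T_ij + T_i)
                frontier.append(b)
    return poses
```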
When the fields of view of two cameras intersect, it is possible to estimate their relative transformation (R_ij, T_ij) using two-view epipolar geometry. However, there are two problems with directly using these pairwise estimates.
- The estimates may not be consistent when the whole network is taken into consideration. For instance, if we go through a loop in the network, the composition of all relative transformations along the loop should give the identity (see the sketch after this list).
- The translations are obtained only up to scale. It is necessary to exploit the constraints imposed by the network topology in order to recover these unknown scales.
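The sketch below illustrates the first issue; the helper names are hypothetical and the same pose-composition convention as above is assumed. With perfectly consistent measurements both errors would be zero, but independently estimated (and, for the translations, independently scaled) pairwise poses generally violate this.

```python
import numpy as np

def compose(pose_a, pose_b):
    """Compose two rigid transformations (R, T): first pose_a, then pose_b
    expressed in pose_a's frame (same convention as above)."""
    R_a, T_a = pose_a
    R_b, T_b = pose_b
    return R_a @ R_b, R_a @ T_b + T_a

def loop_consistency_error(loop_poses):
    """Compose pairwise estimates around a closed loop and measure how far
    the result is from the identity transformation."""
    R, T = np.eye(3), np.zeros(3)
    for pose in loop_poses:
        R, T = compose((R, T), pose)
    # Geodesic distance of the composed rotation from the identity.
    angle_err = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    return angle_err, np.linalg.norm(T)
```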
Previous work is either not applicable in this setting (typically, only range measurements have been considered) or does not correctly exploit the manifold structure of SE(3).
In our work we developed a distributed algorithm that, given the pairwise measurements, finds a complete, consistent localization of the network. Moreover, our algorithm is optimal with respect to an appropriate metric on SE(3).
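The code below is only a minimal sketch, not the algorithm developed in our work: it shows one way the rotation part of such a distributed update can be written so that each node uses only its own estimate, its neighbors' estimates, and the pairwise measurements R_ij (the convention R_j = R_i R_ij, the names, and the step size are all assumptions for illustration).

```python
import numpy as np
from scipy.linalg import expm, logm

def rotation_consensus_step(R_est, rel_R, neighbors, step=0.5):
    """One synchronous, purely local update of the rotation estimates.

    R_est:     dict node -> current estimate of R_i.
    rel_R:     dict (i, j) -> measured R_ij, with the convention R_j = R_i R_ij.
    neighbors: dict node -> list of neighboring nodes.
    Each node moves its estimate toward the predictions R_j R_ij^T implied by
    its neighbors' current estimates and the pairwise measurements.
    """
    R_new = {}
    for i, R_i in R_est.items():
        xi = np.zeros((3, 3))
        for j in neighbors[i]:
            pred = R_est[j] @ rel_R[(i, j)].T   # neighbor j's prediction of R_i
            xi += logm(R_i.T @ pred).real       # tangent direction toward it
        R_new[i] = R_i @ expm(step * xi / max(len(neighbors[i]), 1))
    return R_new
```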
Distributed calibration of camera networks
The problem of camera calibration for a single camera has been widely
studied in the computer vision literature. Such methods typically require the user to present a calibration rig to the camera. Clearly, this approach does not scale to large camera sensor networks, because it would require manually calibrating each camera. On the other hand,
self-calibration methods automatically calibrate the cameras by solving
nonlinear equations such as Kruppa's equations. While these methods
are very elegant, they suffer from the fact that the problem of solving
Kruppa's equations is numerically ill-conditioned. Moreover, these methods
assume that all the cameras have the same intrinsic parameters, so they are not readily applicable to camera sensor networks, where each camera can have different intrinsics.
We show that the problem of automatically calibrating a large number of cameras in a sensor network can be solved in a distributed way by solving a set of linear equations. This is possible under the mild assumption that only one camera in the network needs to be calibrated beforehand.
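The actual linear system is not reproduced here; the sketch below only illustrates the general principle that calibration can be transferred linearly from a calibrated camera to an uncalibrated neighbor, assuming the infinite homography H_inf between the two views is available. Since H_inf = K2 R K1^{-1} for some rotation R, we have K2 K2^T = H_inf K1 K1^T H_inf^T, so K2 can be computed from K1 and H_inf by a triangular factorization (function name hypothetical).

```python
import numpy as np

def transfer_calibration(K1, H_inf):
    """Recover the intrinsics K2 of an uncalibrated camera from a calibrated one.

    H_inf is the infinite homography from view 1 to view 2, so that
    H_inf = K2 @ R @ inv(K1) for some rotation R, and therefore
    K2 @ K2.T = H_inf @ K1 @ K1.T @ H_inf.T (up to scale).
    """
    omega2 = H_inf @ K1 @ K1.T @ H_inf.T       # equals K2 K2^T up to scale
    P = np.fliplr(np.eye(3))                   # anti-diagonal exchange matrix
    L = np.linalg.cholesky(P @ omega2 @ P)     # lower-triangular factor
    K2 = P @ L @ P                             # upper triangular, K2 K2^T = omega2
    return K2 / K2[2, 2]                       # fix the overall scale
```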
Once the cameras' intrinsic and extrinsic parameters are known, the next
problem is to recover the structure of the 3D scene (the triangulation problem). However, solving the multiple-view triangulation problem becomes difficult in a sensor network setup. This is because, while each pair of
nodes could easily compute the 3D structure of the scene via linear
triangulation, the estimates from different pairs of nodes may not be the
same, especially in real situations where image data are noisy.
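For concreteness, the per-pair computation referred to above is the standard linear (DLT) triangulation; a minimal two-view sketch (function name illustrative) follows.

```python
import numpy as np

def linear_triangulation(P1, P2, x1, x2):
    """Standard two-view linear (DLT) triangulation of a single point.

    P1, P2: 3x4 projection matrices of the two cameras.
    x1, x2: corresponding image points (u, v) in each view.
    Returns the 3D point in inhomogeneous coordinates.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                   # null-space direction of A (least squares)
    return X[:3] / X[3]
```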
We propose a method based on
distributed consensus algorithms for
estimating the 3D structure of a scene in a distributed way. We show that
all the nodes compute the same 3D structure, even though they communicate
with only a few neighbors in the network. This method requires that all
the cameras observe the same scene and that the network graph over which
the nodes communicate be connected.
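A minimal sketch of the kind of average-consensus iteration this builds on (not our full estimator; the step size and names are illustrative): each node repeatedly moves its local estimate toward its neighbors' estimates, and on a connected graph all nodes converge to the network-wide average.

```python
import numpy as np

def average_consensus(estimates, neighbors, step=0.2, iters=200):
    """Plain average consensus on the nodes' local 3D-structure estimates.

    estimates: dict node -> (N, 3) array of triangulated points; all nodes are
               assumed to estimate the same N points in the same order.
    neighbors: dict node -> list of neighboring nodes (connected graph).
    step should be smaller than 1 / (maximum node degree) for convergence.
    """
    x = {i: est.astype(float).copy() for i, est in estimates.items()}
    for _ in range(iters):
        # Synchronous update: every node averages with its neighbors.
        x = {i: x[i] + step * sum(x[j] - x[i] for j in neighbors[i])
             for i in x}
    return x
```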