Posted on 2005-09-01, 00:00; authored by Carlo Tomasi, Takeo Kanade
Abstract: "Inferring the depth and shape of remote objects and the complete camera motion from a sequence of images is possible in principle, but is an ill-conditioned problem, because translation and rotation are hard to distinguish, and the size of the object is small with respect to its distance from the camera. We show how to overcome these problems by inferring shape and rotation without computing depth and camera translation as intermediate steps. On a single epipolar plane, image measurements can be represented by an FxP matrix, obtained by tracking P points through F frames. We show that under orthographic projection this matrix is of rank 2. Using this observation, we develop an algorithm to recover shape and camera rotation, based on singular value decomposition. The algorithm gives accurate results, and does not introduce smoothing in either shape or camera rotation."
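The rank-2 claim and the SVD-based recovery can be illustrated with a small sketch. It is not the authors' implementation. It assumes a planar model in which frame f sees point p at image coordinate u = x_p cos(theta_f) + z_p sin(theta_f) plus a per-frame offset, and it removes that offset by subtracting each row's mean. The function name `factor_planar`, the least-squares normalization step, and the synthetic data are all choices made for this illustration.

```python
import numpy as np

def factor_planar(W):
    """Sketch: factor an F x P measurement matrix into motion (F x 2) and shape (2 x P).

    Assumes orthographic projection within a single plane (illustrative model).
    Returns motion M, shape S, and the singular values of the registered matrix.
    """
    # Remove the per-frame offset (translation along the image line).
    W_reg = W - W.mean(axis=1, keepdims=True)

    # Under the assumed model, W_reg has rank 2: keep the two largest singular values.
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    M_hat = U[:, :2] * np.sqrt(s[:2])
    S_hat = np.sqrt(s[:2])[:, None] * Vt[:2]

    # The factorization is only defined up to an invertible 2 x 2 matrix A.
    # Constrain it by requiring each motion row to be a unit vector (cos, sin):
    # m_hat^T Q m_hat = 1 with Q = A A^T symmetric, solved by least squares.
    a, b = M_hat[:, 0], M_hat[:, 1]
    G = np.column_stack([a * a, 2 * a * b, b * b])
    q, *_ = np.linalg.lstsq(G, np.ones(len(a)), rcond=None)
    Q = np.array([[q[0], q[1]], [q[1], q[2]]])

    # Cholesky requires Q to be positive definite; with noisy data this can fail.
    A = np.linalg.cholesky(Q)
    M = M_hat @ A
    S = np.linalg.solve(A, S_hat)
    return M, S, s

# Synthetic example (hypothetical data): 6 frames, 10 points in the plane.
rng = np.random.default_rng(0)
angles = np.linspace(0.0, 0.8, 6)                # camera rotation per frame
shape = rng.normal(size=(2, 10))                 # rows are x and z coordinates
offsets = rng.normal(size=(6, 1))                # per-frame translation
W = np.column_stack([np.cos(angles), np.sin(angles)]) @ shape + offsets

M, S, s = factor_planar(W)
print("singular values:", s)                     # the third and later should be near zero
print("row norms of M:", np.linalg.norm(M, axis=1))
```

The recovered motion and shape are determined only up to a common rotation or reflection of the plane, so only relative quantities, such as the angle between consecutive frames, are comparable with the ground truth. Depth and translation are never computed explicitly, which mirrors the approach described in the abstract.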