Posted on 1990-11-01, 00:00; authored by Carlo Tomasi, Takeo Kanade
Abstract: "Inferring the depth and shape of remote objects and the complete camera motion from a stream of images is possible, but is an ill-conditioned problem when the objects are distant with respect to their size. To overcome this difficulty, we have developed a factorization method to decompose an image stream directly into object shape and camera motion, without computing depth as an intermediate step. The factorization method is explored in a series of technical reports, going from basic principles through implementation. This is the first report in the series, and presents basic concepts in the case of planar motion, in which images are single scanlines. In this situation, an image stream can be represented by the F × P matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this measurement matrix is of rank 3. Using this observation, we develop an algorithm to recover shape and camera motion, based on the singular value decomposition of the measurement matrix. Noise is defeated by applying a well-conditioned computation to the highly redundant input represented by an image stream. No assumptions are made about smoothness or regularity of the camera motion, and even sudden jumps in the camera velocity are faithfully reproduced in the computed output."
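The rank-3 claim in the abstract can be checked numerically with a minimal sketch. It is not the report's implementation. It assumes the usual planar orthographic model, in which a point at (x, z) seen by a camera with orientation angle a and scanline offset c projects to cos(a)·x + sin(a)·z + c. The function names `measurement_matrix` and `numerical_rank` are hypothetical. To stay within the Python standard library, the rank is estimated by Gaussian elimination rather than by the singular value decomposition the report uses.

```python
import math
import random


def measurement_matrix(points, angles, offsets):
    """Build the F x P matrix of scanline coordinates.

    Assumed model (not taken from the report's text): entry (f, p) is
    cos(a_f) * x_p + sin(a_f) * z_p + c_f, i.e. orthographic projection
    of a planar point onto a one-dimensional image.
    """
    return [[math.cos(a) * x + math.sin(a) * z + c for (x, z) in points]
            for a, c in zip(angles, offsets)]


def numerical_rank(matrix, tol=1e-9):
    """Estimate rank by Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in matrix]  # work on a copy
    n_rows, n_cols = len(rows), len(rows[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < tol:
            continue  # column is numerically dependent on earlier ones
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n_rows):
            factor = rows[r][col] / rows[rank][col]
            for c in range(col, n_cols):
                rows[r][c] -= factor * rows[rank][c]
        rank += 1
    return rank


random.seed(0)
P, F = 8, 12  # P tracked points, F frames
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(P)]
# Deliberately irregular motion: each frame's angle and offset is independent.
angles = [random.uniform(-math.pi, math.pi) for _ in range(F)]
offsets = [random.uniform(-5, 5) for _ in range(F)]

W = measurement_matrix(points, angles, offsets)
rank_W = numerical_rank(W)
print("rank of the", F, "x", P, "measurement matrix:", rank_W)
```

Under this model each row of the matrix is a combination of three vectors (the x coordinates, the z coordinates, and a row of ones), so the rank cannot exceed 3 however erratic the motion is. An SVD, as in the report, would also expose the motion and shape factors; this sketch only checks the rank.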