Posted on 2000-01-01, 00:00. Authored by Carlo Tomasi, Takeo Kanade
Abstract: "In principle, three orthographic images of four points are sufficient to recover the positions of the points relative to each other (shape), and the viewpoints from which the images were taken (motion). In practice, however, the solution to this structure-from-motion problem is reliable only when the viewing direction changes considerably between images. This conflicts with the difficulty of establishing correspondence between images over long-range camera motions. Image streams, long sequences of images covering a wide motion in small steps, allow solving this conflict by using tracking for correspondence and redundancy for increased reliability in the structure-from-motion computation. This report is the second of a series on a new factorization method for the computation of shape and camera motion from a stream of images. While the first report considered a camera moving on a plane, we now extend theory, analysis and experiments to general, three-dimensional motion. In our method, we represent feature points in an image stream by a 2F × P measurement matrix, which gathers the horizontal and vertical coordinates of the P points tracked through F frames. If coordinates are measured with respect to their centroid, we show that under orthography the measurement matrix is of rank 3. Using this fact, we cast structure-from-motion as a matrix factorization problem, which we solve with an algorithm based on Singular Value Decomposition. Our algorithm gives accurate results, without relying on any smoothness assumption for either shape or motion."
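The rank-3 property described in the abstract can be illustrated with a minimal sketch in Python with NumPy. This is not the authors' implementation. The function names `factorize` and `synthetic_stream`, the row layout (F rows of horizontal coordinates followed by F rows of vertical coordinates), and the even split of singular values between the two factors are assumptions made for illustration only. The sketch builds a noise-free synthetic stream under orthography, subtracts the centroid from each row, and keeps the three largest singular values.

```python
import numpy as np


def factorize(W):
    """Split a 2F x P measurement matrix into rank-3 motion and shape factors.

    Assumption: each row of W holds one image coordinate (horizontal or
    vertical) of all P points in one frame.
    """
    W = np.asarray(W, dtype=float)
    # Measure coordinates relative to the centroid of the points in each row.
    W_reg = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    # Keep the three dominant components; the square-root split is a choice
    # made for this sketch, not something the abstract prescribes.
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root          # 2F x 3
    shape = root[:, None] * Vt[:3]    # 3 x P
    return motion, shape, s


def synthetic_stream(F=10, P=20, seed=0):
    """Hypothetical test data: P random 3D points seen in F orthographic views."""
    rng = np.random.default_rng(seed)
    points = rng.normal(size=(3, P))
    rows_x, rows_y = [], []
    for _ in range(F):
        # Random orthogonal matrix; its first two rows act as the image axes.
        Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
        t = rng.normal(size=2)  # arbitrary image-plane translation
        rows_x.append(Q[0] @ points + t[0])
        rows_y.append(Q[1] @ points + t[1])
    # Assumed layout: horizontal coordinates first, then vertical ones.
    return np.vstack(rows_x + rows_y)


if __name__ == "__main__":
    W = synthetic_stream()
    motion, shape, s = factorize(W)
    print("leading singular values:", s[:5])
    W_reg = W - W.mean(axis=1, keepdims=True)
    print("max reconstruction error:", np.abs(motion @ shape - W_reg).max())
```

With noise-free input the fourth and later singular values are zero up to rounding, which is the rank-3 statement in the abstract. The two factors are determined only up to an invertible 3 × 3 linear transformation, since inserting any such matrix and its inverse between them leaves the product unchanged. The sketch makes no attempt to resolve that ambiguity.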