Posted on 2006-01-01. Authored by Derek Hoiem, Alexei A. Efros, Martial Hebert.
Image understanding requires not only individually estimating elements of the visual world but also capturing the
interplay among them. In this paper, we provide a framework for placing local object detection in the context of the
overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint.
Most object detection methods consider all scales and
locations in the image as equally likely. We show that with
probabilistic estimates of 3D geometry, both in terms of
surfaces and world coordinates, we can put objects into
perspective and model the scale and location variance in
the image. Our approach reflects the cyclical nature of the
problem by allowing probabilistic object hypotheses to refine geometry and vice-versa. Our framework allows painless substitution of almost any object detector and is easily
extended to include other aspects of image understanding.
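To make the scale-location constraint concrete, here is a minimal sketch, assuming a pinhole camera with negligible tilt, objects resting on the ground plane, and image v-coordinates measured upward from the image bottom. The function names, coordinate convention, and Gaussian height-prior parameters are illustrative assumptions, not values from the paper.

```python
import math

def world_height(v_bottom, v_top, v_horizon, camera_height):
    """Recover an object's world height from its image extent.

    Assumes a pinhole camera with negligible tilt and an object resting
    on the ground plane. Similar triangles then give
        H = h * y_c / (v0 - v_b),
    where h = v_top - v_bottom is the image height, y_c the camera
    height above the ground, and v0 the horizon position.
    """
    if v_bottom >= v_horizon:
        raise ValueError("a grounded object must touch below the horizon")
    return (v_top - v_bottom) * camera_height / (v_horizon - v_bottom)

def height_likelihood(height, mean=1.7, std=0.1):
    """Gaussian prior on world height (roughly person-sized here;
    the numbers are illustrative, not taken from the paper)."""
    z = (height - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# A 160-pixel-tall detection whose bottom sits 160 pixels below the
# horizon implies a 1.7 m object: plausible for a person, so the
# detector's confidence survives; a box implying a 4 m person at the
# same location would be heavily down-weighted.
H = world_height(v_bottom=100, v_top=260, v_horizon=260, camera_height=1.7)
print(round(H, 2), round(height_likelihood(H), 2))  # 1.7 3.99
```

In this view, viewpoint (horizon plus camera height) turns a 2D detection into a 3D size hypothesis, which is exactly what lets geometry veto detections at implausible scales and locations.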
Our results confirm the benefits of our integrated approach.
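The cyclical refinement can likewise be sketched as a simple alternation, reusing world_height and height_likelihood from the block above: the current geometry reweights object hypotheses, and the surviving hypotheses re-estimate the horizon. This is a hypothetical simplification; the paper performs joint probabilistic inference over objects, surfaces, and viewpoint rather than alternating point estimates.

```python
def rescore(detections, v_horizon, camera_height=1.7):
    """Reweight detector confidences by geometric plausibility."""
    out = []
    for d in detections:
        if d["v_bottom"] >= v_horizon:
            geo = 0.0  # floating above the horizon: geometrically impossible
        else:
            geo = height_likelihood(
                world_height(d["v_bottom"], d["v_top"], v_horizon, camera_height))
        out.append({**d, "score": d["score"] * geo})
    return out

def refit_horizon(detections, camera_height=1.7, target_height=1.7):
    """Re-estimate the horizon as the confidence-weighted average of the
    positions each detection would imply if it were target_height tall."""
    num = den = 0.0
    for d in detections:
        h = d["v_top"] - d["v_bottom"]
        num += d["score"] * (d["v_bottom"] + h * camera_height / target_height)
        den += d["score"]
    return num / den if den > 0.0 else None

detections = [
    {"v_bottom": 100, "v_top": 260, "score": 0.9},  # consistent with geometry
    {"v_bottom": 300, "v_top": 380, "score": 0.6},  # implausibly high placement
]
v0 = 250.0
for _ in range(3):  # a few alternations settle quickly on this toy input
    v0 = refit_horizon(rescore(detections, v0))
print(round(v0, 1))  # the horizon moves toward the consistent detection
```

Even this toy loop shows the interdependence the abstract describes: confident, geometrically consistent objects pull the viewpoint estimate toward them, and the refined viewpoint in turn suppresses detections that contradict it.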