Posted on 2007-01-01, 00:00. Authored by Andrew Stein, Derek Hoiem, Martial Hebert.
While great strides have been made in detecting and localizing
specific objects in natural images, the bottom-up
segmentation of unknown, generic objects remains a difficult
challenge. We believe that occlusion can provide a
strong cue for object segmentation and “pop-out”, but detecting
an object’s occlusion boundaries using appearance
alone is a difficult problem in itself. If the camera or the
scene is moving, however, that motion provides an additional
powerful indicator of occlusion. Thus, we use standard
appearance cues (e.g. brightness/color gradient) in
addition to motion cues that capture subtle differences in
the relative surface motion (i.e. parallax) on either side of
an occlusion boundary. We describe a learned local classifier
and a global inference approach that together provide a framework
for combining and reasoning about these appearance
and motion cues to estimate which region boundaries of
an initial over-segmentation correspond to object/occlusion
boundaries in the scene. Results on a dataset of short videos
with labeled boundaries demonstrate the effectiveness of motion
cues for this task.
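
As a rough illustration of the local classification step only, the following minimal sketch combines one appearance cue and one motion cue per candidate boundary and fits a simple logistic model to labeled examples. It is not the authors' implementation: the function names (`boundary_cues`, `fit_logistic`, `predict_proba`), the choice of logistic regression, the specific cue definitions, and the toy training values are all assumptions made for illustration, and the global inference stage is not covered.

```python
import numpy as np

def boundary_cues(gray_a, gray_b, flow_a, flow_b):
    """Hypothetical cues for one candidate boundary between regions A and B.

    gray_a, gray_b: 1-D arrays of brightness samples from each side.
    flow_a, flow_b: (N, 2) arrays of motion vectors from each side.
    Returns [appearance_cue, motion_cue].
    """
    # Appearance cue: difference in mean brightness across the boundary.
    appearance = abs(gray_a.mean() - gray_b.mean())
    # Motion cue: difference in mean motion across the boundary
    # (a crude stand-in for relative surface motion / parallax).
    motion = np.linalg.norm(flow_a.mean(axis=0) - flow_b.mean(axis=0))
    return np.array([appearance, motion])

def _with_bias(X):
    # Append a constant column so the model learns an intercept.
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Fit logistic-regression weights by plain gradient descent."""
    X1 = _with_bias(X)
    w = np.zeros(X1.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X1 @ w))
        w -= lr * X1.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    """Probability that each boundary is an occlusion boundary."""
    return 1.0 / (1.0 + np.exp(-_with_bias(X) @ w))

# Toy training set (made-up values): columns are [appearance, motion].
# Label 1 = occlusion boundary, 0 = not an occlusion boundary.
X_train = np.array([[0.5, 0.0], [0.4, 0.1], [0.5, 1.0], [0.4, 0.9]])
y_train = np.array([0.0, 0.0, 1.0, 1.0])
w = fit_logistic(X_train, y_train)

# Two unseen boundaries with the same appearance contrast but different
# relative motion across the boundary.
X_new = np.array([[0.45, 0.05], [0.45, 0.95]])
probs = predict_proba(w, X_new)
```

In this toy data the appearance cue is deliberately uninformative, so the two unseen boundaries are separated by their motion cue alone, which mirrors the abstract's point that motion can disambiguate boundaries that appearance cannot.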