Posted on 2005-01-01, 00:00. Authored by Sanjiv Kumar, Martial Hebert
We present a two-layer hierarchical formulation to exploit
different levels of contextual information in images for
robust classification. Each layer is modeled as a conditional
field that allows one to capture arbitrary observation-dependent
label interactions. The proposed framework has
two main advantages. First, it encodes both the short-range
interactions (e.g., pixelwise label smoothing) and the
long-range interactions (e.g., relative configurations of objects
or regions) in a tractable manner. Second, the formulation
is general enough to be applied to different domains
ranging from pixelwise image labeling to contextual object
detection. The parameters of the model are learned using
a sequential maximum-likelihood approximation. The benefits
of the proposed framework are demonstrated on four
different datasets, and comparison results are presented.