Posted on 2007-11-01, authored by Noah A. Smith, Douglas Vail, John D. Lafferty
We describe a new loss function, due to Jeon and Lin (2006), for estimating structured log-linear models on arbitrary features. The loss function can be seen as a (generative) alternative to maximum likelihood estimation with an interesting information-theoretic interpretation, and it is statistically consistent. It is substantially faster, by an order of magnitude or more, than maximum (conditional) likelihood estimation of conditional random fields (Lafferty et al., 2001). We compare its performance and training time to those of an HMM, a CRF, an MEMM, and pseudolikelihood on a shallow parsing task. These experiments help tease apart the contributions of rich features and discriminative training, which are shown to be more than additive.
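As context for the speed comparison above, here is a minimal sketch, in notation of our own choosing (weights \theta, feature vector f, partition function Z, none of which appear in the abstract), of the structured log-linear model and the CRF conditional-likelihood objective it is contrasted with; the Jeon and Lin (2006) loss itself is not stated in the abstract and is not reproduced here.

  % Structured log-linear model over input x and structure y
  p_\theta(x, y) = \frac{\exp\{\theta^\top f(x, y)\}}{Z(\theta)},
  \qquad
  Z(\theta) = \sum_{(x', y')} \exp\{\theta^\top f(x', y')\}

  % CRF training: maximize the conditional log-likelihood of the data
  \max_\theta \; \sum_{i=1}^{n} \log p_\theta(y_i \mid x_i)
  \;=\;
  \max_\theta \; \sum_{i=1}^{n} \Big( \theta^\top f(x_i, y_i)
      \;-\; \log \sum_{y'} \exp\{\theta^\top f(x_i, y')\} \Big)

Evaluating the gradient of the second objective requires dynamic-programming inference (e.g., forward-backward) over every training example at every optimizer iteration; presumably this per-iteration inference cost is the baseline against which the reported order-of-magnitude training speedup is measured.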