We present a novel system combination of
machine translation and text summarization
which provides high-quality summary
translations superior to the baseline translation
of the entire document. We first use
supervised learning to build a classifier
that predicts whether the translation of a sentence
has high or low translation quality. This
is a reference-free estimate of MT quality
that helps us identify the subset
of sentences with better translation
quality. We pair this classifier with a state-of-the-art
summarization system to build
an MT-aware summarization system. To
evaluate summarization quality, we build a
test set by summarizing a bilingual corpus.
We evaluate the performance of our system
with respect to both MT and summarization
quality and demonstrate that we
can balance improving MT quality
against maintaining decent summarization
quality.
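
As a rough illustration of the idea of an MT-aware summarizer, the sketch below ranks sentences by a weighted mix of a summarization salience score and a predicted MT quality score. It is only a minimal sketch under stated assumptions, not the system described above: the `Sentence` fields, the `select_summary` function, the linear interpolation, and the `weight` parameter are all hypothetical, and the scores are placeholders for whatever a real summarizer and quality classifier would output.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    salience: float    # hypothetical summarizer score in [0, 1]
    mt_quality: float  # hypothetical classifier probability of "high quality"


def select_summary(sentences, budget, weight=0.5):
    """Pick up to `budget` sentences by an interpolated score.

    weight=0 ranks by salience only; weight=1 ranks by predicted MT
    quality only. The linear mix is an assumption made for illustration.
    """
    ranked = sorted(
        enumerate(sentences),
        key=lambda pair: (1 - weight) * pair[1].salience
        + weight * pair[1].mt_quality,
        reverse=True,
    )
    # Restore document order so the summary reads naturally.
    chosen = sorted(index for index, _ in ranked[:budget])
    return [sentences[i] for i in chosen]


doc = [
    Sentence("a", salience=0.9, mt_quality=0.1),
    Sentence("b", salience=0.2, mt_quality=0.95),
    Sentence("c", salience=0.6, mt_quality=0.7),
    Sentence("d", salience=0.1, mt_quality=0.2),
]
summary = select_summary(doc, budget=2, weight=0.5)
```

Varying `weight` in this sketch exposes the trade-off the abstract refers to: a low value favors summarization quality, while a high value favors sentences whose translations are predicted to be good.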
Publisher Statement
Published in International Joint Conference on Natural Language Processing, pages 270–278,
Nagoya, Japan, 14-18 October 2013.