Posted on 2002-07-01. Authored by Jae Dong Kim, Ralf D. Brown, Peter J. Jansen, Jaime G. Carbonell.
Subsentential alignment is critical to the translation quality of an Example-Based
Machine Translation (EBMT) system that operates by finding and combining phrase-level
matches against its training examples. We therefore developed a new alignment
algorithm to improve the EBMT system's performance.
Unlike most algorithms in the literature, this new Symmetric Probabilistic Alignment
(SPA) algorithm treats the source and target languages in a symmetric fashion. In this
paper, we describe our basic algorithm and some extensions for using context and positional
information, compare its alignment accuracy with IBM Model 4, and report on experiments
in which either IBM Model 4 or SPA alignments are substituted for the aligner
currently built into the EBMT system. Both Model 4 and SPA are significantly better
than the internal aligner, and SPA slightly outperforms Model 4 despite being
handicapped by incomplete integration with the EBMT system.
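To make the notion of "symmetric" alignment concrete, the sketch below scores a candidate source/target phrase pair using lexical translation probabilities in both directions and combines them, so that neither language is privileged. This is a generic illustration under assumed toy probability tables, not the paper's actual SPA algorithm; the function names and probability values are hypothetical.

```python
import math

# Hypothetical directional lexical probability tables (toy values, not from the paper).
p_t_given_s = {("casa", "house"): 0.8, ("blanca", "white"): 0.7}
p_s_given_t = {("house", "casa"): 0.75, ("white", "blanca"): 0.65}

def directional_score(src_phrase, tgt_phrase, table):
    """Average log of the best lexical probability for each source word
    (a crude IBM-Model-1-style score; unseen pairs get a small floor)."""
    total = 0.0
    for s in src_phrase:
        best = max((table.get((s, t), 1e-6) for t in tgt_phrase), default=1e-6)
        total += math.log(best)
    return total / max(len(src_phrase), 1)

def symmetric_score(src_phrase, tgt_phrase):
    """Combine the two directional scores (arithmetic mean in log space,
    i.e. a geometric mean of probabilities) so both directions count equally."""
    fwd = directional_score(src_phrase, tgt_phrase, p_t_given_s)
    bwd = directional_score(tgt_phrase, src_phrase, p_s_given_t)
    return 0.5 * (fwd + bwd)

# A matching phrase pair should outscore an unrelated one.
good = symmetric_score(["casa", "blanca"], ["white", "house"])
bad = symmetric_score(["casa", "blanca"], ["dog", "ran"])
```

In contrast, an asymmetric model such as IBM Model 4 would use only one of the two directional tables, which is the distinction the abstract draws.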