Posted on 1996-01-01, 00:00. Authored by Ralf D Brown, Rebecca Hutchinson, Paul N Bennett, Jaime G. Carbonell, Peter J Jansen
Many corpus-based Machine Translation (MT) systems generate a number of partial translations
which are then pieced together rather than immediately producing one overall translation. While
this makes them more robust to ill-formed input, they are subject to disfluencies at phrasal translation
boundaries even for well-formed input. We address this “boundary friction” problem by
introducing a method that exploits overlapping phrasal translations and the increased confidence
in translation accuracy they imply. We specify an efficient algorithm for producing translations
using overlap. Finally, our empirical analysis indicates that this approach produces higher-quality
translations than the standard method of combining non-overlapping fragments generated by our
Example-Based MT (EBMT) system in a peak-to-peak comparison.