A Trainable Transfer-based Machine Translation Approach for Languages with Limited Resources

journal contribution
posted on 01.01.2004 by Alon Lavie, Katharina Probst, Erik Peterson, Stephan Vogel, Lori Levin, Ariadna Font Llitjós, Jaime G. Carbonell

We describe a Machine Translation (MT) approach that is specifically designed to enable rapid development of MT for languages with limited amounts of online resources. Our approach assumes the availability of a small number of bilingual speakers of the two languages, but these need not be linguistic experts. Using a specially designed elicitation tool, the bilingual speakers create a comparatively small corpus of word-aligned phrases and sentences (on the order of a few thousand sentence pairs). From this data, the learning module of our system automatically infers hierarchical syntactic transfer rules, which encode how syntactic constituent structures in the source language transfer to the target language. The collection of transfer rules is then used in our run-time system to translate previously unseen source-language text into the target language. We describe the general principles underlying our approach and present results from an experiment in which we developed a basic Hindi-to-English MT system over the course of two months, using extremely limited resources.

Publisher Statement

All Rights Reserved

Date

01/01/2004
