Posted on 2009-01-01, 00:00. Authored by Christian Monson, Ariadna Font Llitjós, Vamshi Ambati, Lorraine Levin, Alon Lavie, Alison Alvarez, Roberto Aranovitch, Jaime G. Carbonell, Robert Frederking, Erik Peterson, Katharina Probst
Producing machine translation (MT) for the many minority languages
in the world is a serious challenge. Minority languages
typically have few resources for building MT systems. For many
minority languages there is little machine-readable text, few
knowledgeable linguists, and little money available for MT development.
For these reasons, our research programs on minority
language MT have focused on making the fullest possible use of
two resources that are available for minority languages:
linguistic structure and bilingual informants. All natural languages
contain linguistic structure. And although the details of
that linguistic structure vary from language to language, language
universals such as context-free syntactic structure and the
paradigmatic structure of inflectional morphology allow us to
learn the specific details of a minority language. Similarly, most
minority languages have speakers who are bilingual in the
major language of the area. This paper discusses our efforts to
utilize linguistic structure and the translation information that
bilingual informants can provide in three sub-areas of our
rapid-development MT program: morphology induction, syntactic
transfer rule learning, and refinement of imperfect learned rules.