Design and Implementation of Controlled Elicitation for Machine Translation of Low-density Languages
Journal contribution posted on 2003-06-01, 00:00, authored by Katharina Probst, Ralf D Brown, Jaime G. Carbonell, Alon Lavie, Lorraine Levin, Erik Peterson
NICE is a machine translation project for low-density languages. We are building a tool that will elicit a controlled corpus from a bilingual speaker who is not an expert in linguistics. The corpus is intended to cover major typological phenomena, as it is designed to work for any language. Using implicational universals, we strive to minimize the number of sentences that each informant has to translate. From the elicited sentences, we learn transfer rules with a version space algorithm. Our vision for MT in the future is one in which systems can be quickly trained for new languages by native speakers, so that speakers of minor languages can participate in education, health care, government, and the internet without having to give up their languages.
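
The way implicational universals can reduce the elicitation load may be illustrated with a minimal Python sketch. It is not the NICE implementation: the names `Probe`, `IMPLIES`, `PROBES`, `ruled_out`, and `elicit` are hypothetical, the English sentences are invented for illustration, and the `has_feature` callback is an assumed stand-in for the informant's translation plus whatever analysis decides that a feature is marked. The only linguistic fact relied on is the well-known universal that a language has a trial only if it has a dual, and a dual only if it has a plural.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    feature: str   # typological feature the sentence tests (hypothetical label)
    sentence: str  # sentence the informant would be asked to translate


# Assumed encoding of implicational universals: feature -> prerequisite feature.
IMPLIES = {"dual": "plural", "trial": "dual"}

# Illustrative probes; prerequisites must be listed before dependent features.
PROBES = [
    Probe("plural", "The dogs are sleeping."),
    Probe("dual", "The two dogs are sleeping."),
    Probe("trial", "The three dogs are sleeping."),
]


def ruled_out(feature, findings):
    """Return True if any prerequisite in the chain is already known absent."""
    seen = set()
    while feature in IMPLIES and feature not in seen:
        seen.add(feature)
        feature = IMPLIES[feature]
        if findings.get(feature) is False:
            return True
    return False


def elicit(probes, has_feature):
    """Ask only the probes that the universals have not already ruled out."""
    findings = {}
    asked = []
    for probe in probes:
        if ruled_out(probe.feature, findings):
            findings[probe.feature] = False  # inferred, not elicited
            continue
        asked.append(probe.sentence)
        findings[probe.feature] = has_feature(probe)
    return asked, findings


# A language with no plural marking: only the first sentence is elicited.
asked, findings = elicit(PROBES, lambda probe: False)
```

Under these assumptions, a language without plural marking costs the informant one sentence instead of three, because the dual and trial probes are skipped once their prerequisite is known to be absent.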