One System, Many Domains: Open-Domain Statistical Machine Translation via Feature Augmentation

journal contribution
posted on 01.10.2012, 00:00 by Jonathan H. Clark, Alon Lavie, Chris Dyer

In this paper, we introduce a simple technique for incorporating domain information into a statistical machine translation system that significantly improves translation quality when test data comes from multiple domains. Our approach augments (conjoins) standard translation model and language model features with domain indicator features and requires only minimal modifications to the optimization and decoding procedures. We evaluate our method on two language pairs with varying numbers of domains, and observe significant improvements of up to 1.0 BLEU.
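
To make the idea of conjoining features with domain indicators concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear model over named features, and it assumes that each base feature is kept as a shared copy alongside a domain-specific copy; the abstract does not state this detail. The function names (`augment_features`, `score`), the `name@domain` naming scheme, and all feature values and weights are hypothetical and chosen only for illustration.

```python
from typing import Dict


def augment_features(features: Dict[str, float], domain: str) -> Dict[str, float]:
    """Return the base features plus one domain-conjoined copy of each.

    Assumption: the shared (domain-independent) copy is retained so that
    a weight can still be learned across all domains.
    """
    augmented = dict(features)  # shared copies
    for name, value in features.items():
        # Conjoin the feature with a domain indicator: the copy fires
        # only for input from this domain.
        augmented[f"{name}@{domain}"] = value
    return augmented


def score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear model score; features without a weight contribute zero."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


# Hypothetical log-domain feature values for one translation hypothesis.
base = {"lm": -2.0, "tm": -1.5}

# Hypothetical weights: shared weights plus domain-specific adjustments.
weights = {"lm": 1.0, "tm": 0.5, "lm@news": 0.2, "tm@weblog": 0.3}

news = augment_features(base, "news")
weblog = augment_features(base, "weblog")

news_score = score(news, weights)
weblog_score = score(weblog, weights)
```

Under these assumptions the same hypothesis receives a different score depending on the domain of the input, while decoding still only computes a dot product over a larger feature set. This is consistent with the abstract's statement that only minimal changes to optimization and decoding are needed.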


