On-Line Algorithms for Combining Language Models
Multiple language models are combined for many tasks in language modeling, such as domain and topic adaptation. In this work, we compare on-line algorithms from machine learning to existing algorithms for combining language models. On-line algorithms developed for this problem have parameters that are updated dynamically to adapt to a data set during evaluation. On-line analysis provides guarantees that these algorithms will perform nearly as well as the best model chosen in hindsight from a large class of models, e.g., the set of all static mixtures. We describe several on-line algorithms and present results comparing these techniques with existing language model combination methods on the task of domain adaptation. We demonstrate that in some situations, on-line techniques can significantly outperform static mixtures (by over 10% in terms of perplexity), and are especially effective when the nature of the test data is unknown or changes over time.
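To make the idea of dynamically updated combination parameters concrete, the following is a minimal sketch of one possible on-line scheme, not necessarily one of the algorithms evaluated in this work. It assumes a multiplicative (Bayesian-style) weight update under log loss; the function names, the learning-rate parameter `eta`, and the input layout are all hypothetical choices made for illustration. Each input row holds the probability that every component model assigned to the token actually observed.

```python
import math

def online_mixture_log_loss(component_probs, eta=1.0):
    """Run an on-line mixture over a token stream (illustrative sketch).

    component_probs: list of rows, one per token; row[i] is the probability
    that component model i assigned to the observed token (assumed > 0).
    eta: hypothetical learning rate; eta=1.0 gives the Bayesian mixture.
    Returns (total log loss in nats, final mixture weights).
    """
    k = len(component_probs[0])
    weights = [1.0 / k] * k          # start from the uniform mixture
    total_loss = 0.0
    for probs in component_probs:
        # Predict with the current weights, then pay the log loss.
        mix = sum(w * p for w, p in zip(weights, probs))
        total_loss += -math.log(mix)
        # Multiplicative update: models that predicted the token well gain weight.
        unnorm = [w * (p ** eta) for w, p in zip(weights, probs)]
        z = sum(unnorm)
        weights = [u / z for u in unnorm]
    return total_loss, weights

def perplexity(total_loss, n_tokens):
    """Convert a total log loss in nats into per-token perplexity."""
    return math.exp(total_loss / n_tokens)

# Toy stream: model 0 fits the first six tokens, model 1 the last two.
stream = [[0.5, 0.1]] * 6 + [[0.1, 0.5]] * 2
loss, final_weights = online_mixture_log_loss(stream)
print(perplexity(loss, len(stream)), final_weights)
```

With `eta=1.0`, the cumulative log loss of this particular update exceeds that of the best single component by at most ln k for k components. This is a weaker comparison class than the set of all static mixtures mentioned above, and competing with that larger class requires a different update rule.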