
Efficient Sampling and Feature Selection in Whole Sentence Maximum Entropy Language Models

Journal contribution posted on 2004-01-01, authored by Stanley F. Chen and Roni Rosenfeld

Conditional Maximum Entropy models have been successfully applied to estimating language model probabilities of the form p(w|h), but are often too demanding computationally. Furthermore, the conditional framework does not lend itself to expressing global sentential phenomena. We have recently introduced a non-conditional Maximum Entropy language model which directly models the probability of an entire sentence or utterance. The model treats each utterance as a "bag of features," where features are arbitrary computable properties of the sentence. Using the model is computationally straightforward, since it does not require normalization. Training the model, however, requires efficient sampling of sentences from an exponential distribution. In this paper, we further develop the model and demonstrate its feasibility and power. We compare the efficiency of several sampling techniques, implement smoothing to accommodate rare features, and suggest an efficient algorithm for improving the convergence rate.
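As a rough illustration of the ideas in the abstract, the sketch below scores sentences with the unnormalized whole-sentence form p(s) ∝ p0(s) exp(Σ_i λ_i f_i(s)) and draws samples with an independence Metropolis-Hastings sampler that proposes whole sentences from the baseline model, so the baseline density cancels in the acceptance ratio. The vocabulary, feature functions, weights, and uniform baseline here are toy assumptions for illustration only; they are not the features, baseline n-gram model, or training procedure used in the paper.

```python
import math
import random

# Toy stand-in for the baseline model p0: samples words uniformly.
# (In the paper the baseline would be something like an n-gram model.)
VOCAB = ["the", "cat", "sat", "on", "mat", "dogs", "run"]

def propose_sentence():
    """Draw a whole sentence from the baseline distribution p0."""
    length = random.randint(1, 8)
    return " ".join(random.choice(VOCAB) for _ in range(length))

def features(sentence):
    """Arbitrary computable properties of the sentence ('bag of features').
    These two features are made-up examples."""
    words = sentence.split()
    return {
        "len_gt_5": 1.0 if len(words) > 5 else 0.0,
        "contains_the": 1.0 if "the" in words else 0.0,
    }

LAMBDAS = {"len_gt_5": 0.7, "contains_the": -0.3}  # illustrative weights

def feature_score(sentence):
    """sum_i lambda_i * f_i(s): the exponent of the MaxEnt correction.
    Ranking sentences by p0(s) * exp(feature_score(s)) needs no
    normalization constant."""
    return sum(LAMBDAS.get(k, 0.0) * v
               for k, v in features(sentence).items())

def sample_sentences(n_samples, burn_in=200):
    """Independence Metropolis-Hastings targeting
    p(s) ∝ p0(s) * exp(feature_score(s)). Because the proposal is p0
    itself, the baseline density cancels, and the acceptance ratio
    reduces to exp(feature_score(s') - feature_score(s))."""
    current = propose_sentence()
    current_fs = feature_score(current)
    samples = []
    for step in range(burn_in + n_samples):
        cand = propose_sentence()
        cand_fs = feature_score(cand)
        delta = cand_fs - current_fs
        # Accept with probability min(1, exp(delta)).
        if random.random() < math.exp(min(0.0, delta)):
            current, current_fs = cand, cand_fs
        if step >= burn_in:
            samples.append(current)
    return samples

if __name__ == "__main__":
    for s in sample_sentences(5):
        print(s)
```

Note that only feature sums enter the acceptance test, so each sampling step costs one proposal from the baseline plus one feature extraction; this is what makes sentence-level sampling tractable in this sketch.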


Publisher Statement

All Rights Reserved

Date

2004-01-01
