Statistical Language Modeling using the CMU-Cambridge Toolkit

Rosenfeld, Roni; Clarkson, Philip

doi:10.1184/R1/6609881.v1

journal contribution

posted on 2004-03-01, authored by Roni Rosenfeld and Philip Clarkson

The CMU Statistical Language Modeling toolkit was released in 1994 in order to facilitate the construction and testing of bigram and trigram language models. It is currently in use in over 40 academic, government and industrial laboratories in over 12 countries. This paper presents a new version of the toolkit. We outline the conventional language modeling technology, as implemented in the toolkit, and describe the extra efficiency and functionality that the new toolkit provides as compared to previous software for this task. Finally, we give an example of the use of the toolkit in constructing and testing a simple language model.

History

Date

2004-03-01

Keywords

computer sciences; Information and Computing Sciences not elsewhere classified

Licence

In Copyright
