
Improving Speech-Recognition Performance via Phone-Dependent VQ Codebooks and Adaptive Language Models in SPHINX-II

conference contribution
posted on 2023-03-01, authored by M. Hwang, Ronald Rosenfeld, E. Thayer, R. Mosur, L. Chase, R. Weide, X. Huang, and F. Alleva

This paper presents improvements in acoustic and language modeling for automatic speech recognition. Specifically, semi-continuous HMMs (SCHMMs) with phone-dependent VQ codebooks are presented and incorporated into the SPHINX-II speech recognition system. The phone-dependent VQ codebooks relax the density-tying constraint in SCHMMs in order to obtain more detailed models. A 6% error rate reduction was achieved on the speaker-independent 20,000-word Wall Street Journal (WSJ) task. Dynamic adaptation of the language model in the context of long documents is also explored. A maximum entropy framework is used to exploit long-distance trigrams and trigger effects. A 10%-15% word error rate reduction is reported on the same WSJ task using the adaptive language modeling technique.
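
As a rough illustration of the two techniques named in the abstract, the following sketch uses generic notation that is not taken from the paper itself. In a conventional semi-continuous HMM, every state j shares one Gaussian codebook and emits

b_j(x) = \sum_{k=1}^{K} c_{jk} \, \mathcal{N}(x;\, \mu_k, \Sigma_k),

whereas a phone-dependent VQ codebook relaxes this tying so that the codebook Gaussians depend on the phone p(j) to which state j belongs:

b_j(x) = \sum_{k=1}^{K} c_{jk} \, \mathcal{N}(x;\, \mu_{p(j),k}, \Sigma_{p(j),k}).

The adaptive language model can likewise be viewed in the standard maximum entropy (exponential) form,

P(w \mid h) = \frac{1}{Z(h)} \exp\!\Big( \sum_i \lambda_i f_i(h, w) \Big),

where the features f_i encode conventional and long-distance trigrams as well as trigger pairs, and Z(h) normalizes over the vocabulary. The exact feature set, symbols, and training procedure here are illustrative assumptions; the paper itself should be consulted for the precise formulation.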

History

Publisher Statement

© 1994 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Date

1994-04-19
