Carnegie Mellon University

OOV Detection and Recovery using Hybrid Models with Different Fragments

journal contribution
posted on 2011-08-01, authored by Long Qin, Ming Sun, Alexander Rudnicky

In this paper, we address the out-of-vocabulary (OOV) detection and recovery problem by developing three different fragment-word hybrid systems. A fragment language model (LM) and a word LM were trained separately and then combined into a single hybrid LM. Using this hybrid model, the recognizer can recognize any OOV word as a sequence of fragments. Three types of fragments (phones, subwords, and graphones) were tested and compared on the WSJ 5k and 20k evaluation sets. The experimental results show that the subword and graphone hybrid systems perform better than the phone hybrid system on both the 5k and 20k tasks. Furthermore, given less training data, the subword hybrid system is preferable to the graphone hybrid system.
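
As a rough illustration of how a word LM and a fragment LM can be combined, the sketch below scores an OOV word by charging the word LM's unknown-word probability and then scoring its fragments with the fragment LM. This is only one common way to build such a hybrid and is not necessarily the construction used in the paper. The unigram models, their probabilities, the `<unk>` and `</f>` tokens, and the function name `hybrid_log_prob` are all assumptions made for this example.

```python
import math

# Hypothetical toy unigram word LM; "<unk>" holds the mass reserved for OOV words.
WORD_LM = {"the": 0.5, "cat": 0.3, "<unk>": 0.2}

# Hypothetical toy unigram fragment LM over phone-like units.
# "</f>" marks the end of a fragment sequence.
FRAGMENT_LM = {"k": 0.25, "ae": 0.25, "t": 0.25, "</f>": 0.25}


def hybrid_log_prob(tokens):
    """Return the log probability of a token sequence under the toy hybrid LM.

    In-vocabulary words are plain strings; an OOV word is a tuple of fragments.
    """
    total = 0.0
    for token in tokens:
        if isinstance(token, tuple):
            # OOV word: pay the <unk> cost in the word LM, then score each
            # fragment and the end-of-fragment marker in the fragment LM.
            total += math.log(WORD_LM["<unk>"])
            for fragment in token + ("</f>",):
                total += math.log(FRAGMENT_LM[fragment])
        else:
            total += math.log(WORD_LM[token])
    return total


# "the" followed by an OOV word decoded as the fragment sequence k-ae-t.
score = hybrid_log_prob(["the", ("k", "ae", "t")])
```

In a real recognizer the two models would be n-gram LMs and the fragment inventory (phones, subwords, or graphones) would be derived from data; the toy unigrams only show where the two models meet.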

History

Publisher Statement

Copyright © 2011 ISCA

Date

2011-08-01
