Carnegie Mellon University

OOV Detection and Recovery using Hybrid Models with Different Fragments

journal contribution
posted on 2011-08-01, 00:00, authored by Long Qin, Ming Sun, Alexander Rudnicky

In this paper, we address the out-of-vocabulary (OOV) detection and recovery problem by developing three different fragment-word hybrid systems. A fragment language model (LM) and a word LM were trained separately and then combined into a single hybrid LM. Using this hybrid model, the recognizer can recognize any OOV word as a sequence of fragments. Different types of fragments, namely phones, subwords, and graphones, were tested and compared on the WSJ 5k and 20k evaluation sets. The experimental results show that the subword and graphone hybrid systems perform better than the phone hybrid system on both the 5k and 20k tasks. Furthermore, given less training data, the subword hybrid system is preferable to the graphone hybrid system.


Publisher Statement

Copyright © 2011 ISCA


