Carnegie Mellon University

Building a Vocabulary Self-Learning Speech Recognition System

journal contribution
posted on 2014-08-01, 00:00 authored by Long Qin, Alexander Rudnicky

This paper presents initial studies on building a vocabulary self-learning speech recognition system that can automatically learn unknown words and expand its recognition vocabulary. Our recognizer detects and recovers out-of-vocabulary (OOV) words in speech, then incorporates them into its lexicon and language model (LM). As a result, these previously unknown words can be correctly recognized when the recognizer encounters them in the future. Specifically, we apply the word-fragment hybrid system framework to detect the presence of OOV words. We propose an improved phoneme-to-grapheme (P2G) model to correctly recover the written form of more OOV words. Furthermore, we estimate LM scores for OOV words using their syntactic and semantic properties. The experimental results show that more than 40% of OOV words are successfully learned from the development data, and about 60% of the learned OOV words are recognized in the testing data.
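The detect, recover, and incorporate loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Token` representation, the function names, and the dictionary-based `toy_p2g` are all hypothetical stand-ins, and the sketch assumes that a run of consecutive fragment tokens in the hybrid decoder output marks one candidate OOV word. The LM score estimation step is not shown.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One unit of hybrid decoder output (hypothetical representation)."""
    text: str            # word spelling; empty for a fragment
    phones: tuple = ()   # phoneme sequence; filled only for fragments

    @property
    def is_fragment(self):
        return bool(self.phones)


def detect_oov_regions(tokens):
    """Merge consecutive fragment tokens into candidate OOV phoneme sequences."""
    regions, current = [], []
    for tok in tokens:
        if tok.is_fragment:
            current.extend(tok.phones)
        elif current:
            regions.append(tuple(current))
            current = []
    if current:  # a fragment run that ends the utterance
        regions.append(tuple(current))
    return regions


def learn_oovs(tokens, p2g, lexicon):
    """Add recovered OOV words to the lexicon; return the new spellings."""
    learned = []
    for phones in detect_oov_regions(tokens):
        spelling = p2g(phones)  # assumed to return None when recovery fails
        if spelling and spelling not in lexicon:
            lexicon[spelling] = phones
            learned.append(spelling)
    return learned


# Toy stand-in for a trained P2G model: a single hard-coded mapping.
toy_p2g = {("P", "IH", "T", "S", "B", "ER", "G"): "pittsburgh"}.get

hypothesis = [
    Token("flights"),
    Token("to"),
    Token("", ("P", "IH", "T", "S")),
    Token("", ("B", "ER", "G")),
]
lexicon = {"flights": ("F", "L", "AY", "T", "S"), "to": ("T", "UW")}

print(learn_oovs(hypothesis, toy_p2g, lexicon))  # prints ['pittsburgh']
```

In this toy run the two fragments are merged into one phoneme sequence, mapped to a spelling, and added to the lexicon, so a second pass over the same hypothesis learns nothing new.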

History

Publisher Statement

Copyright © 2014 ISCA

Date

2014-08-01
