10.1184/R1/6473342.v1
Di Xu
Di
Xu
Yun Wang
Yun
Wang
Florian Metze
Florian
Metze
EM-based Phoneme Confusion Matrix Generation for Low-resource Spoken Term Detection
Carnegie Mellon University
2014
Expectation-maximization algorithm
machine learning
information retrieval
spoken term detection
out-of-vocabulary words
2014-12-01 00:00:00
Journal contribution
https://kilthub.cmu.edu/articles/journal_contribution/EM-based_Phoneme_Confusion_Matrix_Generation_for_Low-resource_Spoken_Term_Detection/6473342
<p>The idea of using a data-driven phoneme confusion matrix (PCM) to enhance speech recognition and retrieval performance is not new to the speech community. Although empirical results show varying degrees of improvement from introducing a PCM, the underlying data-driven processes described in most papers are rather ad hoc and lack rigorous statistical justification. In this paper we focus on the statistical aspects of PCM generation, and propose and justify a novel expectation-maximization-based algorithm for data-driven PCM generation. We evaluate the generated PCMs in the context of low-resource spoken term detection, with a primary focus on out-of-vocabulary keywords.</p>