10.1184/R1/6473342.v1 Di Xu, Yun Wang, Florian Metze EM-based Phoneme Confusion Matrix Generation for Low-resource Spoken Term Detection Carnegie Mellon University 2014 Expectation-maximization algorithm, machine learning, information retrieval, spoken term detection, out-of-vocabulary words 2014-12-01 00:00:00 Journal contribution https://kilthub.cmu.edu/articles/journal_contribution/EM-based_Phoneme_Confusion_Matrix_Generation_for_Low-resource_Spoken_Term_Detection/6473342 <p>The idea of using a data-driven phoneme confusion matrix (PCM) to enhance speech recognition and retrieval performance is not new to the speech community. Although empirical results show varying degrees of improvement from introducing a PCM, the underlying data-driven processes described in most papers are rather ad hoc and lack rigorous statistical justification. In this paper we focus on the statistical aspects of PCM generation, and we propose and justify a novel expectation-maximization-based algorithm for data-driven PCM generation. We evaluate the generated PCMs in the context of low-resource spoken term detection, with a primary focus on out-of-vocabulary keywords.</p>