EM-based Phoneme Confusion Matrix Generation for Low-resource Spoken Term Detection

Xu, Di; Wang, Yun; Metze, Florian

doi:10.1184/R1/6473342.v1

journal contribution

posted on 2014-12-01, authored by Di Xu, Yun Wang, Florian Metze

The idea of using a data-driven phoneme confusion matrix (PCM) to enhance speech recognition and retrieval performance is not new to the speech community. Although empirical results show varying degrees of improvement from introducing a PCM, the underlying data-driven processes described in most papers are rather ad hoc and lack rigorous statistical justification. In this paper we focus on the statistical aspects of PCM generation, and we propose and justify a novel expectation-maximization (EM) based algorithm for data-driven PCM generation. We evaluate the generated PCMs in the context of low-resource spoken term detection, with a primary focus on out-of-vocabulary keywords.

History

Publisher Statement

© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Date

2014-12-01

Keywords

Expectation-maximization algorithm; machine learning; information retrieval; spoken term detection; out-of-vocabulary words

Licence

In Copyright
