Using Conversational Word Bursts in Spoken Term Detection
We describe a language independent word burst feature based on the structure of conversational speech that can be used to improve spoken term detection (STD) performance. Word burst refers to a phenomenon in conversational speech in which particular content words tend to occur in close proximity of each other as a byproduct of the topic under discussion. To take advantage of bursts, we describe a rescoring procedure that can be applied to lattice and confusion network outputs to improve STD performance. This approach is particularly effective when acoustic models are built with limited training data (and ASR performance is relatively poor). We find that word bursts appear in the four languages we examined and that STD performance can be improved for three of them; the remaining language is agglutinative.