Query-by-Example Spoken Term Detection Evaluation on Low-Resource Languages

Anguera, Xavier; Rodriguez-Fuentes, Luis J.; Szoke, Igor; Buzo, Andi; Metze, Florian; Penagarikano, Mikel

doi:10.1184/R1/6473639.v1

journal contribution

posted on 2014-05-01, authored by Xavier Anguera, Luis J. Rodriguez-Fuentes, Igor Szoke, Andi Buzo, Florian Metze, Mikel Penagarikano

As part of the MediaEval 2013 benchmark evaluation campaign, the objective of the Spoken Web Search (SWS) task was to perform Query-by-Example Spoken Term Detection (QbE-STD), using spoken queries to retrieve matching segments in a set of audio files. As in previous editions, the SWS 2013 evaluation focused on the development of technology specifically designed to perform speech search in a low-resource setting. In this paper, we first describe the main features of past SWS evaluations and then focus on the 2013 SWS task, in which a special effort was made to prepare a challenging database, including speech in 9 different languages with diverse environment and channel conditions. The main novelties of the submitted systems are reviewed and performance figures are then presented and discussed, demonstrating the feasibility of the proposed task, even under such challenging conditions. Finally, the fusion of the 10 top-performing systems is analyzed. The best fusion provides a 30% relative improvement over the best single system in the evaluation, which proves that a variety of approaches can be effectively combined to bring complementary information in the search for queries.

Date

2014-05-01

Keywords

benchmark evaluation; low-resource languages; query-by-example; spoken term detection

Licence

In Copyright
