Carnegie Mellon University

Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation

journal contribution
posted on 2014-04-01, 00:00, authored by Yulia Tsvetkov, Florian Metze, Chris Dyer

We propose a novel technique for adapting text-based statistical machine translation to deal with input from automatic speech recognition in spoken language translation tasks. We simulate likely misrecognition errors using only a source language pronunciation dictionary and language model (i.e., without an acoustic model), and use these to augment the phrase table of a standard MT system. The augmented system can thus recover from recognition errors during decoding using synthesized phrases. Using the outputs of five different English ASR systems as input, we find consistent and significant improvements in translation quality. Our proposed technique can also be used in conjunction with lattices as ASR output, leading to further improvements.
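
To make the idea of phrase-table augmentation concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a toy pronunciation dictionary (`PRON`), a toy phrase table (`PHRASES`), and a simple phoneme edit distance as the confusability criterion. The abstract also mentions a language model, which this sketch leaves out. The names `edit_distance`, `confusable_words`, `augment_phrase_table`, `max_dist`, and `penalty` are all hypothetical.

```python
# Illustrative sketch only: every name and all data below are assumptions,
# not taken from the paper. No language model is used here.

def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def confusable_words(word, pron_dict, max_dist=1):
    """Words whose pronunciation is within max_dist phoneme edits of `word`."""
    target = pron_dict[word]
    return sorted(w for w, p in pron_dict.items()
                  if w != word and edit_distance(target, p) <= max_dist)


def augment_phrase_table(phrase_table, pron_dict, max_dist=1, penalty=0.5):
    """Return a copy of phrase_table with synthesized entries added.

    Each acoustically confusable source word inherits the translations of
    the original word, with the score scaled by a hypothetical penalty.
    Translations that already exist for the confusable word are kept as-is.
    """
    augmented = {src: dict(tgts) for src, tgts in phrase_table.items()}
    for src, tgts in phrase_table.items():
        if src not in pron_dict:
            continue
        for conf in confusable_words(src, pron_dict, max_dist):
            entry = augmented.setdefault(conf, {})
            for tgt, score in tgts.items():
                entry.setdefault(tgt, score * penalty)
    return augmented


# Toy data (assumed): ARPAbet-like phoneme tuples and single-word phrases.
PRON = {
    "sea": ("S", "IY"),
    "see": ("S", "IY"),
    "tea": ("T", "IY"),
    "dog": ("D", "AO", "G"),
}
PHRASES = {"sea": {"mer": 0.9}, "dog": {"chien": 0.8}}

if __name__ == "__main__":
    for src, tgts in sorted(augment_phrase_table(PHRASES, PRON).items()):
        print(src, tgts)
```

With an augmented table of this kind, a decoder that receives a misrecognized word such as "see" could still reach the translation of "sea", at a discounted score. A realistic system would operate on multi-word phrases and weight candidates by how likely each confusion is.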

History

Publisher Statement

Copyright 2014 Association for Computational Linguistics

Date

2014-04-01
