Carnegie Mellon University

Transliteration by Sequence Labeling with Lattice Encodings and Reranking

journal contribution
Posted on 2012-07-01, 00:00. Authored by Waleed Ammar, Chris Dyer, and Noah A. Smith.

We consider the task of generating transliterated word forms. To allow for a wide range of interacting features, we use a conditional random field (CRF) sequence labeling model. We then present two innovations: a training objective that optimizes toward any of a set of possible correct labels (since more than one transliteration is often possible for a particular input), and a k-best reranking stage to incorporate nonlocal features. This paper presents results on the Arabic-English transliteration task of the NEWS 2012 workshop.
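
The two innovations named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the candidate outputs can be enumerated explicitly, whereas a real CRF would compute the same sums with dynamic programming over label sequences. All names here (`multi_reference_log_likelihood`, `rerank`, `has_vowel`, and the example scores) are hypothetical and chosen only for illustration.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def multi_reference_log_likelihood(scores, references):
    """Log-probability that the model produces ANY acceptable output.

    scores: dict mapping each candidate output to an unnormalized
            model score (assumption: candidates are enumerable).
    references: set of acceptable outputs, each a key of `scores`.
    """
    log_z = log_sum_exp(list(scores.values()))
    log_ref = log_sum_exp([scores[y] for y in references])
    return log_ref - log_z


def rerank(k_best, nonlocal_score, weight=1.0):
    """Pick the best of k candidates after adding a nonlocal feature score.

    k_best: list of (candidate, base_model_score) pairs.
    nonlocal_score: function from candidate to a float.
    """
    return max(k_best, key=lambda c: c[1] + weight * nonlocal_score(c[0]))[0]


# Toy data: two acceptable transliterations and one poor candidate.
scores = {"mohamed": 2.0, "muhammad": 2.0, "mhmd": 0.0}
multi = multi_reference_log_likelihood(scores, {"mohamed", "muhammad"})
single = multi_reference_log_likelihood(scores, {"mohamed"})


def has_vowel(word):
    """Hypothetical whole-word feature: 1.0 if the word contains a vowel."""
    return 1.0 if any(ch in "aeiou" for ch in word) else 0.0


best = rerank([("mhmd", 1.5), ("mohamed", 1.2)], has_vowel)
```

In this sketch, crediting the model for either acceptable spelling gives a higher objective value than crediting only one of them, and the reranker can overturn the base model's top choice using a feature that looks at the whole word.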

History

Publisher Statement

Copyright 2012 ACL

Date

2012-07-01
