Transforming Standard Arabic to Colloquial Arabic

Mohamed, Emad; Mohit, Behrang; Oflazer, Kemal

doi:10.1184/R1/6368087.v1

journal contribution

posted on 2012-07-08, 00:00, authored by Emad Mohamed, Behrang Mohit, Kemal Oflazer

We present a method for generating Colloquial Egyptian Arabic (CEA) from morphologically disambiguated Modern Standard Arabic (MSA). When used in POS tagging, this process improves the accuracy from 73.24% to 86.84% on unseen CEA text, and reduces the percentage of out-of-vocabulary words from 28.98% to 16.66%. The process holds promise for any NLP task targeting the dialectal varieties of Arabic; e.g., this approach may provide a cheap way to leverage MSA data and morphological resources to create resources for colloquial Arabic to English machine translation. It can also considerably speed up the annotation of Arabic dialects.

History

Publisher Statement

Published in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pages 176–180, Jeju, Republic of Korea, 8-14 July 2012.

Date

2012-07-08

Keywords

Egyptian Arabic Modern Standard Arabic Corpus Transformation

Licence

CC BY-NC-SA 4.0
