Large Scale Arabic Error Annotation: Guidelines and Framework

Zaghouani, Wajdi; Mohit, Behrang; Habash, Nizar; Obeid, Ossama; Tomeh, Nadi; Rozovskaya, Alla; Farra, Noura; Alkuhlani, Sarah; Oflazer, Kemal

doi:10.1184/R1/6373136.v1

journal contribution

Posted on 2015-05-01. Authored by Wajdi Zaghouani, Behrang Mohit, Nizar Habash, Ossama Obeid, Nadi Tomeh, Alla Rozovskaya, Noura Farra, Sarah Alkuhlani, and Kemal Oflazer.

We present annotation guidelines and a web-based annotation framework developed as part of an effort to create a manually annotated Arabic corpus of errors and corrections for various text types. Such a corpus will be invaluable for developing Arabic error correction tools, both for training models and as a gold standard for evaluating error correction algorithms. We summarize the guidelines we created. We also describe issues encountered during the training of the annotators, as well as problems that are specific to the Arabic language that arose during the annotation process. Finally, we present the annotation tool that was developed as part of this project, the annotation pipeline, and the quality of the resulting annotations.

Publisher Statement

Published in Proceedings of LREC, May 2014, Reykjavik, Iceland

Date

2015-05-01

Keywords

Corpus Compilation Arabic Error Annotation

Licence

CC BY-NC-SA 4.0
