Posted on 2015-05-01, 00:00; authored by Wajdi Zaghouani, Behrang Mohit, Nizar Habash, Ossama Obeid, Nadi Tomeh, Alla Rozovskaya, Noura Farra, Sarah Alkuhlani, Kemal Oflazer
We present annotation guidelines and a web-based annotation framework developed as part of an effort to create a manually annotated
Arabic corpus of errors and corrections for various text types. Such a corpus will be invaluable for developing Arabic error correction
tools, both for training models and as a gold standard for evaluating error correction algorithms. We summarize the guidelines we
created. We also describe issues encountered while training the annotators, as well as Arabic-specific problems that arose
during the annotation process. Finally, we present the annotation tool developed as part of this project, the
annotation pipeline, and the quality of the resulting annotations.
Publisher Statement
Published in Proceedings of LREC, May 2014, Reykjavik, Iceland