QCMUQ@QALB-2015 Shared Task: Combining Character level MT and Error-tolerant Finite-State Recognition for Arabic Spelling Correction

Bouamor, Houda; Sajjad, Hassan; Durrani, Nadir; Oflazer, Kemal

doi:10.1184/R1/6373157.v1

W15-3217.pdf (119.29 kB)

journal contribution

posted on 2015-07-26, 00:00 authored by Houda Bouamor, Hassan Sajjad, Nadir Durrani, Kemal Oflazer

We describe the joint efforts of CMU-Q and QCRI in building a spelling correction system for Arabic in the QALB 2015 Shared Task. Our system is based on a hybrid pipeline that combines rule-based linguistic techniques with statistical methods using language modeling and machine translation, as well as an error-tolerant finite-state automata method. We trained and tested our spelling corrector using the dataset provided by the shared task organizers. Our system outperforms the baseline system, with an F-score of 68.12 on the L1-test-2015 test set and 38.90 on the L2-test-2015 test set. This ranks us 2nd in the L2 subtask and 5th in the L1 subtask.

Publisher Statement

Published in Proceedings of the Second Workshop on Arabic Natural Language Processing, pages 144–149, Beijing, China, July 26-31, 2015.

Date

2015-07-26

Keywords

Arabic Spelling Correction, Error-tolerant recognition

Licence

CC BY-NC-SA 4.0
