QCMUQ@QALB-2015 Shared Task: Combining Character level MT and Error-tolerant Finite-State Recognition for Arabic Spelling Correction
journal contribution
posted on 2015-07-26, 00:00 authored by Houda Bouamor, Hassan Sajjad, Nadir Durrani, Kemal Oflazer

We describe the CMU-Q and QCRI’s joint
efforts in building a spelling correction
system for Arabic in the QALB 2015
Shared Task. Our system is based on a
hybrid pipeline that combines rule-based
linguistic techniques with statistical methods
using language modeling and machine
translation, as well as an error-tolerant
finite-state automata method. We trained
and tested our spelling corrector using the
dataset provided by the shared task organizers.
Our system outperforms the baseline
system and yields better correction
quality with an F-score of 68.12 on the L1-test-2015
test set and 38.90 on the L2-test-2015 test set.
This ranks us 2nd in the L2 subtask
and 5th in the L1 subtask.
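
To give a rough feel for the error-tolerant finite-state component mentioned above, here is a minimal sketch, not the authors' implementation. It assumes the lexicon is stored as a trie (an acyclic finite-state recognizer) and that tolerance is measured by plain Levenshtein distance; the function names `build_trie` and `tolerant_lookup`, the toy English lexicon, and the distance threshold are all hypothetical choices made for illustration.

```python
END = ""  # sentinel key marking a final state; never a real character


def build_trie(words):
    """Build a trie (acyclic finite-state recognizer) from a word list."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word  # final state stores the accepted word
    return root


def tolerant_lookup(trie, query, max_dist):
    """Return (word, distance) pairs for lexicon words within max_dist
    Levenshtein edits of query, sorted by distance and then by word."""
    results = []

    def walk(node, prev_row):
        for ch, child in node.items():
            if ch == END:
                continue
            # Extend the edit-distance table by one row for this trie arc.
            row = [prev_row[0] + 1]
            for i in range(1, len(query) + 1):
                cost = 0 if query[i - 1] == ch else 1
                row.append(min(row[i - 1] + 1,        # insertion
                               prev_row[i] + 1,       # deletion
                               prev_row[i - 1] + cost))  # match or substitution
            if END in child and row[-1] <= max_dist:
                results.append((child[END], row[-1]))
            # Prune: no extension of this path can get back under the threshold.
            if min(row) <= max_dist:
                walk(child, row)

    walk(trie, list(range(len(query) + 1)))
    return sorted(results, key=lambda pair: (pair[1], pair[0]))


# Toy lexicon, hypothetical and for illustration only.
lexicon = build_trie(["spelling", "spilling", "selling", "correct"])
print(tolerant_lookup(lexicon, "speling", 1))  # [('spelling', 1)]
```

In a full pipeline, candidates produced this way would typically be reranked by other components such as a language model; that step is outside this sketch.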