Evaluating an Agglutinative Segmentation Model for ParaMor

Monson, Christian; Lavie, Alon; Carbonell, Jaime G.; Levin, Lori

doi:10.1184/R1/6622136.v1

File(s) stored somewhere else

http://www.cs.cmu.edu/~jgc/publications.html

Please note: Linked content is not stored at Carnegie Mellon University, and we cannot guarantee its availability, quality, or security, or accept any liability for it.

journal contribution

posted on 2008-01-01, 00:00 authored by Christian Monson, Alon Lavie, Jaime G. Carbonell, Lori Levin

This paper describes and evaluates a modification to the segmentation model used in ParaMor, an unsupervised morphology induction system. The improved segmentation model permits multiple morpheme boundaries in a single word. To prepare ParaMor to apply the new agglutinative segmentation model effectively, two heuristics are added to improve ParaMor’s precision. These precision-enhancing heuristics are adaptations of those used in other unsupervised morphology induction systems, including work by Hafer and Weiss (1974) and Goldsmith (2006). By reformulating the segmentation model used in ParaMor, we significantly improve ParaMor’s performance in all language tracks, in both the linguistic evaluation and the task-based information retrieval (IR) evaluation of the peer-operated competition Morpho Challenge 2007. ParaMor’s improved morpheme recall in the linguistic evaluations of German, Finnish, and Turkish is higher than that of any system that competed in the Challenge. In the three languages of the IR evaluation, our enhanced ParaMor significantly outperforms a morphologically naïve baseline at average precision over newswire queries, scoring just behind the leading system from Morpho Challenge 2007 in English and ahead of the first-place system in German.

Date

2008-01-01

Keywords

Software Research

Licence

In Copyright
