Carnegie Mellon University
A Probabilistic Model for Canonicalizing Named Entity Mentions

journal contribution
posted on 2012-07-01, authored by Dani Yogatama, Yanchuan Sim, Noah A. Smith

We present a statistical model for canonicalizing named entity mentions into a table whose rows represent entities and whose columns are attributes (or parts of attributes). The model is novel in that it incorporates entity context, surface features, first-order dependencies among attribute-parts, and a notion of noise. Transductive learning from a few seeds and a collection of mention tokens combines Bayesian inference and conditional estimation. We evaluate our model and its components on two datasets collected from political blogs and sports news, finding that it outperforms both a simple agglomerative clustering approach and previous work.
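To make the task concrete, the following is a minimal sketch of the kind of simple agglomerative clustering baseline the abstract mentions. It is not the authors' probabilistic model, and it is not necessarily the baseline they implemented. The token-overlap similarity, the `threshold` value, the function names, and the example mentions are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Token-set overlap between two mention strings (assumed similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cluster_mentions(mentions, threshold=0.3):
    """Single-link agglomerative clustering over mention strings.

    Starts with one cluster per mention and repeatedly merges the first
    pair of clusters that contains two mentions with similarity at or
    above `threshold`, until no such pair remains.
    """
    clusters = [[m] for m in mentions]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if any(jaccard(a, b) >= threshold
                   for a in clusters[i] for b in clusters[j]):
                # j > i, so popping j leaves index i valid.
                clusters[i].extend(clusters.pop(j))
                merged = True
                break
    return clusters


# Invented example mentions, not taken from the paper's datasets.
mentions = [
    "Barack Obama", "Obama", "President Barack Obama",
    "John McCain", "McCain", "Sen. John McCain",
]
for row in cluster_mentions(mentions):
    print(row)
```

Each resulting cluster corresponds to one row (entity) of the table. Splitting a row into attribute columns such as title, first name, and last name is the part that this sketch leaves out and that the paper's model addresses.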

History

Publisher Statement

Copyright 2012 ACL

Date

2012-07-01
