Carnegie Mellon University

Structured Databases of Named Entities from Bayesian Nonparametrics

journal contribution
posted on 2011-07-01, authored by Jacob Eisenstein, Tae Yano, William W. Cohen, Noah A. Smith, Eric P. Xing

We present a nonparametric Bayesian approach to extracting a structured database of entities from text. Neither the number of entities nor the fields that characterize each entity are provided in advance; the only supervision is a set of five prototype examples. Our method jointly accomplishes three tasks: (i) identifying a set of canonical entities, (ii) inferring a schema for the fields that describe each entity, and (iii) matching entities to their references in raw text. Empirical evaluation shows that the approach learns an accurate database of entities and a sensible model of name structure.

History

Publisher Statement

Copyright 2011 ACM

Date

2011-07-01
