Carnegie Mellon University
Browse
file.pdf (512.67 kB)

Knowledge Graph Identification

Download (512.67 kB)
journal contribution
posted on 2013-10-01, 00:00 authored by Jay Pujara, Hui Miao, Lise Getoor, William W. Cohen

Large-scale information processing systems are able to extract massive collections of interrelated facts, but unfortunately transforming these candidate facts into useful knowledge is a formidable challenge. In this paper, we show how uncertain extractions about entities and their relations can be transformed into a knowledge graph. The extractions form anextraction graph and we refer to the task of removing noise, inferring missing information, and determining which candidate facts should be included into a knowledge graph as knowledge graph identification. In order to perform this task, we must reason jointly about candidate facts and their associated extraction confidences, identify co-referent entities, and incorporate ontological constraints. Our proposed approach uses probabilistic soft logic (PSL), a recently introduced probabilistic modeling framework which easily scales to millions of facts. We demonstrate the power of our method on a synthetic Linked Data corpus derived from the MusicBrainz music community and a real-world set of extractions from the NELL project containing over 1M extractions and 70K ontological relations. We show that compared to existing methods, our approach is able to achieve improved AUC and F1 with significantly lower running time.

History

Publisher Statement

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-41335-3_34

Date

2013-10-01

Usage metrics

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC