Carnegie Mellon University

A Network Analysis Model for Disambiguation of Names in Lists

journal contribution
posted on 2005-01-01, 00:00 authored by Bradley Malin, Edoardo Airoldi, Kathleen Carley
In research and application, social networks are increasingly extracted from relationships inferred by name collocations in text-based documents. Although names represent real entities, they are not unique identifiers, and it is often unclear when two name observations correspond to the same underlying entity. One confounder stems from ambiguity, in which the same name correctly references multiple entities. Prior name disambiguation methods measured similarity between two names as a function of their respective documents. In this paper, we propose an alternative similarity metric based on the probability of walking from one ambiguous name to another in a random walk of the social network constructed from all documents. We experimentally validate our model on actor-actor relationships derived from the Internet Movie Database. Using a global similarity threshold, we demonstrate that random walks achieve a significant increase in disambiguation capability in comparison to prior models.
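The random-walk similarity described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the graph is built or how the walk is scored, so the graph construction (one node per occurrence of the ambiguous name, shared nodes for all other names), the uniform transition probabilities, the step limit, and the helper names `build_graph` and `walk_probability` are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(documents, ambiguous_name):
    """Build an undirected co-occurrence graph from lists of names.

    Assumption: each occurrence of the ambiguous name becomes its own node,
    (name, document index); every other name is a single shared node.
    """
    graph = defaultdict(set)
    for doc_id, names in enumerate(documents):
        nodes = [(n, doc_id) if n == ambiguous_name else n for n in names]
        for a in nodes:
            for b in nodes:
                if a != b:
                    graph[a].add(b)
    return graph


def walk_probability(graph, source, target, steps):
    """Probability that a uniform random walk from `source` reaches
    `target` within `steps` steps (the target is treated as absorbing)."""
    dist = {source: 1.0}   # probability mass that has not yet hit the target
    reached = 0.0
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, p in dist.items():
            neighbors = graph.get(node, ())
            if not neighbors:
                continue
            share = p / len(neighbors)
            for nb in neighbors:
                if nb == target:
                    reached += share
                else:
                    nxt[nb] += share
        dist = nxt
    return reached


# Hypothetical toy data: three lists that all contain the same ambiguous name.
docs = [
    ["John Smith", "Alice", "Bob"],
    ["John Smith", "Alice", "Carol"],
    ["John Smith", "Dave", "Erin"],
]
g = build_graph(docs, "John Smith")

p_linked = walk_probability(g, ("John Smith", 0), ("John Smith", 1), steps=2)
p_unlinked = walk_probability(g, ("John Smith", 0), ("John Smith", 2), steps=2)

THRESHOLD = 0.1  # hypothetical global similarity threshold
same_entity = p_linked >= THRESHOLD
```

In this toy data the first two occurrences share a co-occurring name, so the walk can pass between them, while the third occurrence sits in a disconnected component and receives probability zero. Note that the walk probability in this sketch is not symmetric in general, so a real system would need to decide how to combine the two directions before applying a single global threshold.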

History

Date

2005-01-01
