Very Fast Similarity Queries on Semi-Structured Data from the Web

Dalvi, Bhavana; Cohen, William W.

doi:10.1184/R1/6476444.v1

journal contribution

posted on 2013-01-01, 00:00 authored by Bhavana Dalvi, William W. Cohen

In this paper, we propose a single low-dimensional representation for entities found in different datasets on the web. Our proposed PIC-D embeddings can represent large D-partite graphs using a small number of dimensions, enabling fast similarity queries. Our experiments show that this representation can be constructed in a small amount of time (linear in the number of dimensions). We demonstrate how it can be used for a variety of similarity queries, such as set expansion, automatic set instance acquisition, and column classification. Our approach achieves precision comparable to task-specific baselines and up to two orders of magnitude improvement in query response time.
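
The abstract does not say how PIC-D embeddings are built, so the following is not the authors' method. It is only a minimal sketch of the general idea of answering similarity queries from a low-dimensional embedding of a bipartite entity/context graph. The toy data, the function names `embed` and `nearest`, the averaging-based propagation (a few early-stopped smoothing passes from random start vectors), and the use of Euclidean distance are all assumptions made for illustration.

```python
import math
import random

# Hypothetical toy data: each entity maps to the contexts
# (for example table columns or lists) in which it was observed.
ENTITY_CONTEXTS = {
    "pittsburgh": {"col_cities", "list_us_cities"},
    "boston": {"col_cities", "list_us_cities"},
    "seattle": {"col_cities"},
    "python": {"col_languages", "list_prog_langs"},
    "haskell": {"col_languages", "list_prog_langs"},
}


def embed(entity_contexts, dims=3, iterations=2, seed=0):
    """Return {entity: [dims floats]} after a few smoothing passes.

    Assumption: each pass averages entity vectors into their contexts
    and then averages context vectors back into the entities. Stopping
    after a few passes keeps connected entities close without
    collapsing everything to a single point.
    """
    rng = random.Random(seed)
    entities = sorted(entity_contexts)

    # Invert the mapping: context -> list of member entities.
    members = {}
    for entity in entities:
        for context in sorted(entity_contexts[entity]):
            members.setdefault(context, []).append(entity)

    # One random start vector per dimension.
    vectors = {e: [rng.random() for _ in range(dims)] for e in entities}

    for _ in range(iterations):
        # Context vector = mean of the vectors of its member entities.
        ctx = {
            c: [sum(vectors[e][k] for e in ents) / len(ents) for k in range(dims)]
            for c, ents in members.items()
        }
        # Entity vector = mean of the vectors of its contexts.
        vectors = {
            e: [
                sum(ctx[c][k] for c in sorted(entity_contexts[e]))
                / len(entity_contexts[e])
                for k in range(dims)
            ]
            for e in entities
        }
    return vectors


def nearest(vectors, query, k=1):
    """Return the k entities closest to `query` by Euclidean distance."""
    scored = [
        (math.dist(vectors[query], vec), name)
        for name, vec in vectors.items()
        if name != query
    ]
    return [name for _, name in sorted(scored)[:k]]


embeddings = embed(ENTITY_CONTEXTS)
print(nearest(embeddings, "python"))
print(nearest(embeddings, "pittsburgh"))
```

In this sketch a query only compares short fixed-length vectors, so its cost depends on the number of dimensions rather than on the size of the original graph. That is the kind of trade-off the abstract points to, although the paper's actual construction and query procedure may differ.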

History

Date

2013-01-01

Keywords

Machine Learning

Licence

In Copyright
