On Triangular versus Edge Representations — Towards Scalable Modeling of Networks

Ho, Qirong; Yin, Junming; Xing, Eric P

doi:10.1184/R1/6476216.v1

journal contribution

posted on 2012-12-01, 00:00 authored by Qirong Ho, Junming Yin, Eric P Xing

In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. Such approaches require both 1-edges and 0-edges (missing edges) to be provided as input, and as a consequence, approximate inference algorithms for these models usually require Ω(N²) time per iteration, precluding their application to larger real-world networks. In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is Θ(Σ_i D_i²) (where D_i is the degree of vertex i), which is much smaller than N² for low-maximum-degree networks. Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. For networks with high maximum degree, the triangular motifs can be naturally subsampled in a node-centric fashion, allowing for much faster inference at a small cost in accuracy. Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. We conclude with a large-scale demonstration on an N ≈ 280,000-node network, which is infeasible for network models with Ω(N²) inference cost.
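The motif count quoted in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `count_triangular_motifs` and the toy edge list are hypothetical, and the graph is assumed to be simple and undirected. The sketch enumerates every neighbor pair of every vertex, which is exactly Σ_i C(D_i, 2) pairs, and then separates the vertex triples with 2 edges from those with 3 edges.

```python
from itertools import combinations


def count_triangular_motifs(edges):
    """Count vertex triples containing exactly 2 or exactly 3 edges.

    `edges` is an iterable of (u, v) pairs of an undirected simple graph
    (an assumption of this sketch; self-loops are ignored).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # skip self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    wedges = 0  # neighbor pairs (j, k) of a center i: sum_i C(D_i, 2)
    closed = 0  # wedges whose two endpoints are also connected
    for i, nbrs in adj.items():
        for j, k in combinations(nbrs, 2):
            wedges += 1
            if k in adj[j]:
                closed += 1

    three_edge = closed // 3       # each full triangle is seen from all 3 centers
    two_edge = wedges - closed     # open triples are seen from one center only
    return {
        "two_edge": two_edge,
        "three_edge": three_edge,
        "total": two_edge + three_edge,
        "wedges": wedges,
    }


# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
counts = count_triangular_motifs(toy_edges)
```

On this toy graph the loop visits 5 neighbor pairs and finds 3 motifs (one with 3 edges, two with 2 edges), whereas an edge representation would have to account for all 6 vertex pairs, including the 0-edges. The work done is proportional to Σ_i C(D_i, 2), which is why the motif representation stays small when the maximum degree is low.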

History

Date

2012-12-01

Keywords

Machine Learning

Licence

In Copyright
