Carnegie Mellon University
10 files

README.md (5.1 kB)
filtered_attrs.csv (468.85 kB)
bias_labels.csv (48.04 kB)
filtered_backlinks.csv (1.51 MB)
filtered_outlinks.csv (1.3 MB)
filtered_combined_attrs.csv (2.49 MB)
link_scheme_outlinks.csv (1.25 MB)
link_scheme_outlink_attrs.csv (1.47 MB)
discovered_domains.csv (341.18 kB)
discovered_domains_sample_annotated.csv (15.14 kB)

Dataset for "Detection and Discovery of Misinformation Sources using Attributed Webgraphs"

dataset
posted on 2024-02-12, 18:44, authored by Peter Carragher, Kathleen Carley, Evan Williams

We demonstrate that Search Engine Optimization (SEO) attributes provide strong signals for predicting news site reliability. We introduce a novel attributed webgraph dataset with labeled news domains and their connections to outlinking and backlinking domains. Finally, we introduce and evaluate a novel graph-based algorithm for discovering previously unknown misinformation news sources.
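For a first look at the files, a minimal sketch along the following lines may help. It assumes the CSV files have been downloaded into one local directory, that each file starts with a header row, and that the files are UTF-8 encoded; none of this is stated on this page, so check README.md first. The helper names summarize_csv and summarize_dataset are hypothetical, and the sketch deliberately makes no assumption about column names.

```python
import csv
from pathlib import Path

# File names as listed on this page (README.md omitted because it is not a CSV).
DATASET_FILES = [
    "filtered_attrs.csv",
    "bias_labels.csv",
    "filtered_backlinks.csv",
    "filtered_outlinks.csv",
    "filtered_combined_attrs.csv",
    "link_scheme_outlinks.csv",
    "link_scheme_outlink_attrs.csv",
    "discovered_domains.csv",
    "discovered_domains_sample_annotated.csv",
]


def summarize_csv(path):
    """Return (header, data_row_count) for one CSV file.

    Assumption: the first row is a header and the file is UTF-8 encoded.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # empty list if the file is empty
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count


def summarize_dataset(directory, filenames=DATASET_FILES):
    """Summarize every listed file that exists in `directory`; skip the rest."""
    summaries = {}
    for name in filenames:
        path = Path(directory) / name
        if path.exists():
            summaries[name] = summarize_csv(path)
    return summaries


if __name__ == "__main__":
    # Hypothetical usage: run from the directory holding the downloaded files.
    for name, (header, rows) in summarize_dataset(".").items():
        print(f"{name}: {rows} rows, columns = {header}")
```

Printing the headers this way shows which files hold domain attributes and which hold link pairs before any graph is built from them.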


This dataset is provided courtesy of Ahrefs.com. The associated paper is forthcoming at ICWSM 2024.

Funding

Scalable Tools for Social Media Assessment.

United States Department of the Navy

History

Publisher Statement

Carragher, P., Williams, E., & Carley, K. (2024). Detection and Discovery of Misinformation Sources using Attributed Webgraphs. arXiv preprint arXiv:2401.02379.

Date

2024-02-07
