Carnegie Mellon University

DOM XSS Web Vulnerability Dataset

dataset
posted on 2021-03-12, 16:03 authored by Clement Fung, Lujo Bauer, Limin Jia
This dataset relates to a large-scale web crawl performed over the Alexa 10K in 2019. For each website, the JavaScript code is analyzed with taint tracking, a dynamic analysis technique, to determine whether DOM XSS (Document Object Model Cross-Site Scripting) vulnerabilities are present.

From taint tracking, we generate two datasets: a dataset of "unconfirmed" functions, labeled with the result of this taint analysis, and a dataset of "confirmed" functions, labeled with the result of a proof-of-concept DOM XSS exploit.
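
As a rough illustration of what a taint-tracking label means, the sketch below models a value read from an attacker-controllable source and checks whether it reaches a DOM sink. It is a simplified Python model, not the tooling used to build this dataset: the source and sink names, the `Tainted` class, and the helper functions are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical source and sink names, for illustration only.
SOURCES = {"location.hash", "document.referrer"}
SINKS = {"innerHTML", "document.write", "eval"}


@dataclass(frozen=True)
class Tainted:
    """A string value plus the set of sources it was derived from."""
    value: str
    sources: frozenset = frozenset()

    def concat(self, other: "Tainted") -> "Tainted":
        # Taint propagates: the result carries the sources of both operands.
        return Tainted(self.value + other.value, self.sources | other.sources)


def read_source(name: str, value: str) -> Tainted:
    """Model reading a browser API; the value is tainted only if the API is a source."""
    return Tainted(value, frozenset({name}) if name in SOURCES else frozenset())


def literal(value: str) -> Tainted:
    """A constant string written by the page author carries no taint."""
    return Tainted(value)


def reaches_sink(sink: str, data: Tainted) -> bool:
    """Report a flow when data derived from a source is written to a known sink."""
    return sink in SINKS and bool(data.sources)


# A URL fragment flows through string concatenation into an HTML sink.
fragment = read_source("location.hash", "#<img src=x onerror=alert(1)>")
html = literal("<div>").concat(fragment).concat(literal("</div>"))
flow_found = reaches_sink("innerHTML", html)

# A purely static string written to the same sink is not flagged.
safe_found = reaches_sink("innerHTML", literal("<div>static</div>"))
```

In the terms used above, a flagged flow such as `flow_found` corresponds only to a taint-analysis result, the kind of label attached to the "unconfirmed" functions. A "confirmed" label additionally requires a working proof-of-concept exploit, which this sketch does not attempt to model.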

Both datasets were used to train a variety of machine learning models, and the resulting models are also included for reference.

Funding

National Science Foundation Grant CNS1704542

History

Date

2021-02-10
