Tracking User Web Browsing Behavior: Privacy Harms and Security Benefits

Web tracking technology has become nearly ubiquitous: web pages contain a growing number of trackers, representing more and more third parties, that employ an increasingly diverse set of tracking techniques. The current discourse surrounding web tracking has focused on the collection of browsing data and its use in the estimated $566B online advertising industry in the United States. As a result, existing privacy protections, both tools like ad blockers and policy mechanisms like cookie consent banners, have been designed to give users a binary choice: either opt in to or opt out of web tracking. To the user, this frames web tracking, and every use of the data collected through it, as either inherently good or inherently bad. However, this thesis demonstrates that browsing data collected through web tracking can be used both to inflict privacy harm and to provide security benefits to users, providing evidence that the emphasis of our current privacy framework should be placed on data use, not data collection.

Our limited understanding of web tracking and its potential uses derives from the lack of reliable browsing data available to researchers. While information about how users browse the web is abundantly collected in the private sector, it is rarely distributed for proprietary reasons. As such, this thesis starts with an examination of recruitment and retention in several longitudinal measurement panels, a recently employed method that has provided researchers with detailed browsing data collected over long periods of time. We provide a set of best practices and recommendations for the design of future web browsing studies. Using data from one of these panels, we then provide an updated and more detailed snapshot of how users browse the web. Among other findings, we demonstrate that individual browsing patterns are relatively unique but also habitual, and that there are common patterns leading users to less travelled, riskier areas of the internet.

Building upon this foundation, in subsequent chapters we demonstrate that user browsing data collected through web tracking can be used to the detriment of users’ privacy but also to improve security outcomes. In terms of privacy harms, we first show that a user’s unique browsing behavior, their behavioral fingerprint, can be used to greatly reduce online anonymity and, combined with browser fingerprinting, can provide trackers with robust cross-site tracking methods without using browser cookies. Next, we demonstrate that information about visits to sensitive websites can be inferred from the advertising profiles generated by a major data aggregator, indicating that even legitimate use of browsing data presents a privacy risk to users. Finally, we show that web tracking also offers security benefits: a user’s exposure to malicious websites can be detected based on their prior browsing activity.

Based on these findings, we recommend that the current privacy framework surrounding web tracking place greater emphasis on how browsing data is used, rather than simply on whether it is collected. We evaluate a range of use cases for browsing data, provide recommendations to policymakers, and advocate for an expansion of user choice that includes options to consent to data collection for specific purposes. We close with a discussion of the challenges and opportunities that the dual nature of web tracking presents and make recommendations for future research.




