Toward Automated Worldwide Monitoring of Network-level Censorship

2019-01-18T21:45:46Z (GMT) by Zachary Weinberg
Although Internet censorship is a well-studied topic, to date most published studies have focused on a single aspect of the phenomenon, using methods and sources specific to each researcher. Results are difficult to compare, and global, historical perspectives are rare. Because each group maintains its own software, erroneous methods may continue to be used long after the error has been discovered. Because censors continually update their equipment and blacklists, it may be impossible to reproduce historical results even with the same vantage points and testing software. Because “probe lists” of potentially censored material are labor-intensive to compile, requiring an understanding of the politics and culture of each country studied, researchers discover only the most obvious and long-lasting cases of censorship.

In this dissertation I will show that it is possible to make progress toward addressing all of these problems at once. I will present a proof-of-concept monitoring system designed to operate continuously, in as many different countries as possible, using the best known techniques for detection and analysis. I will also demonstrate improved techniques for verifying the geographic location of a monitoring vantage point; for distinguishing innocuous network problems from censorship and other malicious network interference; and for discovering new web pages that are closely related to known-censored pages. These techniques improve the accuracy of a continuous monitoring system and reduce the manual labor required to operate it.

This research has, in addition, already led to new discoveries. For example, I have confirmed reports that a commonly-used heuristic is too sensitive and will mischaracterize a wide variety of unrelated problems as censorship. I have been able to identify a few cases of political censorship within a much longer list of cases of moralizing censorship. I have expanded small seed groups of politically sensitive documents into larger groups of documents to test for censorship. Finally, I can also detect other forms of network interference with a totalitarian motive, such as injection of surveillance scripts.

In summary, this work demonstrates that mostly-automated measurements of Internet censorship on a worldwide scale are feasible, and that the elusive global and historical perspective is within reach.