Toward Robust Machine Learning by Countering Superficial Features
Machine learning, especially deep neural networks, has demonstrated remarkable empirical performance on various benchmarks. A potential next step is to extend these empirical successes beyond the i.i.d. setting to a more practical scenario in which the test data are collected independently of the training data but are still considered part of the same task. In other words, the question is how to train a robust model on data from one distribution so that its test performance does not vary significantly over data from other, different but related, distributions. While many works are devoted to solving this problem of learning robust models from various perspectives, this thesis aims to complement them by offering a set of tools for the situation in which (and under the hypothesis that) one potential cause of a model’s non-robust behavior is its tendency to predict through certain features, which we refer to as superficial features.
We attack the problem of learning robust models on several fronts: we first introduce a line of empirical efforts with numerical successes on different robustness-related benchmarks; we then formally discuss the problem under the assumption that the challenge lies in models’ tendency to learn superficial features, which also leads to a set of principled solutions; finally, we contribute engineering efforts to deliver software that allows humans to interact with image classification models to improve the models’ robustness against superficial features.
In particular, we first hypothesize that the underlying challenge of learning robust models lies in the data, and then validate our hypothesis by investigating how the model responds to different copies of the image data.
Building upon our hypothesis, we introduce several new methods for image classification, each countering specific superficial features in the data. The success of these methods is validated by their empirical performance on standard domain generalization image classification tasks.
Further, building on this empirical success, we formalize the problem of training a model over data with superficial features. Given knowledge of the superficial features, the formalization leads to a proven bound on the generalization error over the distribution free of the superficial features. Our formalization connects well to the methods proposed above. Our bound also inspires a new method that forgoes knowledge of the superficial features and achieves strong empirical results.
Finally, to foster the process of building robust models, we introduce software with a GUI that allows users to inspect an image classification model’s decision process and annotate the superficial features exploited by the model.
History
Date
2021-12-01
Degree Type
- Dissertation
Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)