Assessing and enhancing adversarial robustness in context and applications to speech security
Designing artificial intelligence models that are robust to adversarial perturbations of their inputs is a highly desirable objective: such models would not only be less prone to security breaches but also better aligned with human reasoning and perception. Neural Automatic Speech Recognition (ASR) is vulnerable to these adversarial perturbations but currently lacks defenses with strong evidence of robustness to state-of-the-art attacks. This contrasts with image tasks, for which several defense algorithms have shown increasingly good results even against adaptive adversaries. Another unusual aspect of speech robustness is that most adversarial attacks do not transfer between different models. The specifics of ASR as a task certainly play an important role in this situation: at their core, speech utterances are time series in an infinite-dimensional Hilbert space with an underlying semantic structure, while transcription outputs belong to an infinite metric space.
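To make the notion of an adversarial perturbation concrete, the sketch below applies the standard Fast Gradient Sign Method (FGSM) to a toy linear classifier standing in for a real ASR model. All names and dimensions here are illustrative assumptions, not the thesis's actual attack setup; a real attack would differentiate through a neural network on waveform inputs.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fgsm_perturb(x, W, true_label, eps):
    """FGSM on a toy linear classifier with logits = W @ x.

    Takes one step of size eps in the direction that increases the
    cross-entropy loss, bounded in L-infinity norm by eps.
    """
    probs = softmax(W @ x)
    onehot = np.eye(W.shape[0])[true_label]
    # Gradient of the cross-entropy loss w.r.t. the input x: W^T (p - y)
    grad = W.T @ (probs - onehot)
    return x + eps * np.sign(grad)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))       # 3 classes, 16-dim "audio" features
x = rng.normal(size=16)
label = int(np.argmax(W @ x))      # the model's original prediction
# x_adv differs from x by at most eps per coordinate, yet may flip the prediction
x_adv = fgsm_perturb(x, W, label, eps=0.5)
```

Even this one-step attack illustrates the core threat: a perturbation that is small in every coordinate can still push an input across a decision boundary.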
The objective of this thesis is to explore the dependencies between adversarial perturbations and the context in which they are studied (nature of the task, set of outputs, etc.), in order to assess their threat against speech recognition models and the possibility of robust ASR. Specifically, (1) we show that adversarial robustness is closely related to training methods and model architectures. We propose novel methods to study this relationship without the cost of training robust models, by measuring the randomness in robustness on undefended models and by quantifying the number of adversarial perturbations on a hypersphere. (2) We quantify the threat posed by adversarial attacks on ASR, under both white-box and black-box threat models. We introduce an evaluation framework and apply it, notably, to show that self-supervised speech models are uniquely vulnerable to transferable attacks. We also take into account the specifics of voice assistants trained on user data and investigate applications of adversarial perturbations to privacy attacks. (3) We investigate two approaches to increase ASR robustness. First, we show that the Randomized Smoothing defense paradigm enables us to combine the strengths of general machine learning defense guarantees and domain-specific speech processing tools; using it, we achieve state-of-the-art ASR robustness against the strongest white-box attacks for several model architectures. Then we introduce a novel framework, adversarial masked prediction, which lets us robustly pretrain self-supervised speech models.
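The Randomized Smoothing paradigm named above can be sketched in a few lines: classify many Gaussian-noised copies of the input and return the majority vote. The base classifier below is a hypothetical linear stand-in, not the thesis's ASR models; the noise level, sample count, and function names are illustrative assumptions.

```python
import numpy as np

def predict(x, W):
    """Stand-in base classifier: argmax of a toy linear model."""
    return int(np.argmax(W @ x))

def smoothed_predict(x, W, sigma=0.25, n_samples=200, seed=0):
    """Majority vote of the base classifier over Gaussian input noise.

    In Randomized Smoothing, a large vote margin for the winning class
    translates into a certified L2 robustness radius around x.
    """
    rng = np.random.default_rng(seed)
    votes = np.zeros(W.shape[0], dtype=int)
    for _ in range(n_samples):
        noisy = x + sigma * rng.normal(size=x.shape)
        votes[predict(noisy, W)] += 1
    return int(np.argmax(votes))

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 16))
x = rng.normal(size=16)
y_smooth = smoothed_predict(x, W)
```

The appeal of this paradigm, as the abstract notes, is that the smoothing wrapper is model-agnostic: the base classifier can incorporate domain-specific speech processing (e.g. denoising before decoding) without invalidating the statistical guarantee.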
- Language Technologies Institute
- Doctor of Philosophy (PhD)