Collaborative learning by leveraging siloed data
Regulations often limit stakeholders’ modeling capabilities by preventing data sharing. For example, to protect patient privacy, clinical centers may be unable to share their data and thus lack representative records to learn about a rare condition. To address this challenge, previous work in machine learning has shown that these stakeholders benefit from training models collaboratively, which improves their predictive performance. However, as we start training these collaborative models in real-world settings, they must provide utility along dimensions beyond predictive performance to be truly useful. In this thesis, we propose methods and algorithms that improve collaborative models leveraging siloed data along three dimensions. In the first part, we propose methods to reduce the communication footprint of models learned by mobile devices cooperating over edge networks, allowing higher-capacity models to be trained. In the second part, we introduce an algorithm that explains the predictions of models trained across clinical centers, thus improving their clinical utility. Finally, in the third part, we address the need to encode expert supervision into collaborative models trained on on-device data, expanding the class of problems we can tackle in these scenarios.
History
Date
2023-08-08
Degree Type
- Dissertation
Department
- Machine Learning
Degree Name
- Doctor of Philosophy (PhD)