Reliable and Practical Machine Learning for Dynamic Healthcare Settings
Machine learning (ML) algorithms have shown great promise on a variety of healthcare-related tasks. However, as these algorithms transition from research to deployment, they enter a constantly evolving environment rife with changes in clinical practices, record-keeping policies, patient populations, and diseases themselves. Models that performed well in the past are liable to fail in the future, and in high-stakes settings such as healthcare, complacency can have negative consequences. This thesis explores the application and development of machine learning algorithms for dynamic healthcare settings. First, we present case studies that characterize how common ML techniques fare on several medical datasets over time, and we discuss the types of distribution shift that can occur in healthcare data. Second, we examine how to learn from underreported data and how to adapt to shifting levels of underreporting, motivated by challenges that arose while developing a model for predicting severe COVID-19. Third, moving from prediction over time to decision-making over time, we study two scenarios: one in which decisions are cheap, frequent, and directly tied to forecasts, and another in which the interaction dynamics are modeled on those between a doctor and a patient, where each interaction has some cost. Finally, we reflect more broadly on the development of reliable machine learning algorithms in healthcare over time.
- Machine Learning
- Doctor of Philosophy (PhD)