Carnegie Mellon University

Generalizable Machine Learning Models for Electrocatalyst Discovery

posted on 2023-02-28, 22:09 authored by Muhammed Shuaibi

With the global population on the rise, increasing energy demands, and resulting climate change, the future of our energy infrastructure has become one of society’s most pressing problems. Decreasing prices of renewable energy offer a promising path towards a sustainable future. However, the sun does not always shine, nor does the wind always blow, and addressing how we store intermittent energy sources will play a key role in our transition. One approach is to store energy in chemical forms. Unfortunately, these processes often rely on expensive rare metal catalysts, making them ill-suited to commercial-scale deployment. The discovery of catalysts that can efficiently, selectively, and economically take part in these processes will be critical for society.

This thesis is centered around building generalizable machine learning (ML) models that span chemical and material space for catalyst discovery. A vital component in achieving this is the curation of large-scale catalyst datasets. We first present how active learning methods and physical biases can be leveraged to build ML models in the low-data regime to accelerate density functional theory (DFT) calculations. We then present the largest catalyst dataset of its kind, Open Catalyst 2020 (OC20), accompanied by baseline models and challenges to stimulate research in the catalysis and ML communities. With this dataset we explore the extent to which building a general-purpose machine learning model is feasible. We then develop SpinConv, a graph neural network (GNN) that uniquely captures 3D atomic information to improve predictions on OC20. Next, we expand OC20 to present the Open Catalyst 2022 (OC22) dataset, consisting of oxide materials and more general-purpose tasks. We also explore the extent to which existing datasets complement one another through alternative training strategies. Lastly, we discuss some of the challenges, trends, and general findings that we and the community have encountered in building generalizable machine learning models.




Degree Type

  • Dissertation


  • Chemical Engineering

Degree Name

  • Doctor of Philosophy (PhD)


Zachary (Zack) Ulissi
