Towards Theoretical and Empirical Foundations of Machine Learning for Differential Equations
Recent advances in machine learning have propelled the use of data-driven methods in scientific discovery. In this thesis, we study the application of machine learning techniques to solving Partial Differential Equations (PDEs), which form fundamental building blocks in analyzing and describing scientific phenomena ranging from fluid dynamics and molecular dynamics to climate and weather forecasting. However, the computational cost of simulating PDE solutions grows exponentially with the dimensionality of the system: a grid with N points per dimension requires N^d points in d dimensions. Additionally, every new configuration of a PDE system necessitates rerunning the numerical solver from scratch, adding to the computational challenge.
This thesis aims to theoretically and empirically investigate the conditions under which data-driven machine learning can effectively solve PDEs. It establishes conditions under which, for specific classes of PDEs, data-driven machine learning techniques provide tangible benefits, especially in terms of reducing computational cost. Furthermore, it explores the architectural design space of neural networks that approximate PDE solutions and develops a fundamental understanding of which architectural choices benefit downstream applications.
The thesis is divided into three parts. The first part, comprising Chapters 2, 3, and 4, presents theoretical results that establish the representational capacity of neural networks for approximating solutions to complex PDEs. These chapters show that, for certain families of PDEs, neural networks can provably evade the curse of dimensionality. The second part, comprising Chapters 5, 6, and 7, explores the architectural design choices of neural operators: neural networks that approximate solutions to an entire family of PDEs. We further use these insights to design efficient architectures for multi-physics models that can approximate solutions to multiple families of PDEs at once. The third part presents results on approximating graph-structured data, which includes scientific data such as molecules and PDEs on irregular meshes. In Chapter 8, we show how state-space-model-based architectures such as Mamba generalize to graph-structured data. Finally, in Chapter 9, we present a theoretical result that establishes the representational benefits of maintaining edge embeddings in graph neural networks.
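To make the setting of the first part concrete, the following minimal sketch (illustrative only; the network, test problem, and training loop are assumptions, not code from the thesis) trains a small neural network to approximate a PDE solution by minimizing the equation's residual, here for a 1-D Poisson problem:

```python
# Illustrative sketch only (not the thesis's code): approximate the solution
# u(x) of the 1-D Poisson problem u''(x) = f(x) on [0, 1], u(0) = u(1) = 0,
# by minimizing the PDE residual at random collocation points.
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
# Choose f so the exact solution is u(x) = sin(pi * x).
f = lambda x: -(torch.pi ** 2) * torch.sin(torch.pi * x)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(256, 1, requires_grad=True)  # interior collocation points
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    pde_loss = ((d2u - f(x)) ** 2).mean()       # enforce u'' = f
    bc = torch.tensor([[0.0], [1.0]])
    bc_loss = (net(bc) ** 2).mean()             # enforce u(0) = u(1) = 0
    loss = pde_loss + bc_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The same residual-minimization idea extends to higher dimensions, where the network's input grows from one to d coordinates while grid-based solvers would need exponentially many points; this gap is what the representational results described above quantify.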
History
Date
- 2025-02-21
Degree Type
- Dissertation
Thesis Department
- Machine Learning
Degree Name
- Doctor of Philosophy (PhD)