Carnegie Mellon University
hzhao1_MachineLearning_2020.pdf (5.47 MB)

Towards a Unified Framework for Learning and Reasoning

Download (5.47 MB)
posted on 2021-04-14, 17:05 authored by Han ZhaoHan Zhao
The success of supervised machine learning in recent years crucially hinges on the availability of large-scale and unbiased data, which is often time-consuming and expensive to collect. Recent advances in deep learning focus on learning rich and invariant representations that have found abundant applications in domain adaptation, multitask learning, algorithmic fairness, and machine translations, just to name a few. However, it is not clear what price we have to pay in terms of task utility for such universal representations. On the other hand, learning is
only one of the two most fundamental cognitive abilities of intelligent agents. An intelligent agent needs to have both the ability to learn from the experience, and the ability to reason from what has been learned. However, classic symbolic reasoning cannot model the inherent
uncertainty that ubiquitously exists, and it is not robust to noisy observations. Perhaps more fundamentally, reasoning is computationally intractable in general. As a result, learning, which often takes reasoning as a sub-procedure, is also hard. Building on the fundamental concepts
from information theory and theoretical computer science, this thesis aims to understand the inherent tradeoff between utility and invariance in learning the representations, and to develop efficient algorithms for learning tractable and exact probabilistic inference machines. This thesis contains two parts. The first part is devoted to understanding and learning
invariant representations. In particular, we will focus on understanding the costs of existing invariant representations by characterizing a fundamental tradeoff between invariance and utility. First, we will use domain adaptation as an example to both theoretically and empirically show such tradeoff in achieving small joint generalization error. This result also implies an inherent tradeoff between demographic parity, a statistical notion of group fairness, and utility in both classification and regression settings. Going beyond, we will further show that such general tradeoff exists in learning with structured data. In particular, we shall derive an impossibility theorem for universal machine translation by learning language-invariant
representations. Second, we will focus on designing learning algorithms to escape the existing tradeoff and to utilize the benefits of invariant representations. We will show how the algorithm can be used to guarantee equalized treatment of individuals between groups, and discuss what additional problem structure it requires to permit efficient domain adaptation and machine translation through learning invariant representations. The second part of the thesis is devoted to learning tractable and exact circuits for
probabilistic reasoning. It is well-known that exact marginal and conditional inference in classic probabilistic graphical models (PGMs), including Bayesian Networks (BNs) and
Markov Networks (MNs), is #P-complete. As a result, practitioners usually need to resort to various approximate inference schemes to ensure computational tractability. Probabilistic circuits, which include Sum-Product Networks (SPNs) as a special case, have been proposed as
tractable deep models for exact probabilistic inference. They distinguish themselves from other types of probabilistic graphical models by the fact that inference can be done exactly in linear time with respect to the size of the circuit. This has generated a lot of interest since inference
is not only a powerful tool to reason under uncertainty, but also a core task for parameter estimation and structure learning. In this part, we will concentrate on both theoretical and practical parts of learning tractable probabilistic circuits. In particular, we will investigate the representational power of SPNs, as well as its parameter learning procedures in both online and offline settings.




Degree Type

  • Dissertation


  • Machine Learning

Degree Name

  • Doctor of Philosophy (PhD)


Geoff Gordon