Knowledge-enhanced Representation Learning for Multiview Context Understanding
Computational context understanding refers to an agent's ability to fuse disparate sources of information for decision-making and is therefore generally regarded as a prerequisite for sophisticated machine reasoning, as in artificial intelligence (AI). Data-driven and knowledge-driven methods are two classical approaches in the pursuit of such machine sense-making capability. However, while data-driven methods seek to model the statistical regularities of events by making observations in the real world, they remain difficult to interpret and lack mechanisms for naturally incorporating external knowledge. Conversely, knowledge-driven methods leverage structured knowledge bases, enable symbolic reasoning from axiomatic principles, and yield more interpretable predictions; however, they often lack the ability to estimate the statistical salience of an inference or to robustly accommodate perturbations in the input. To address these limitations, we use hybrid AI methodology as a general framework for combining the strengths of both approaches. Specifically, we adopt the concept of neuro-symbolism as a way of using domain knowledge to guide the learning process of deep neural networks. Domain knowledge appears in many forms, including: (i) graphical models, which characterise relationships between entities such as dependence, independence, causality, correlation, and partial correlation; (ii) commonsense knowledge, which covers spatial knowledge, affordances arising from objects' physical properties, semantic relations, and functional knowledge; (iii) privileged information, in the form of demonstrations or soft labels from an expert agent; (iv) learned behaviour primitives and priors, which agents may compose for generalisable and transferable task execution; and (v) carefully chosen auxiliary tasks, objectives, and constraints for constrained optimisation.
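As a purely illustrative sketch of knowledge form (v), and not the thesis's actual implementation, the PyTorch snippet below augments a standard task loss with an auxiliary penalty derived from domain knowledge; the `Encoder`, the `knowledge_penalty` function, the `allowed` mask, and the 0.5 weighting are all hypothetical names and choices introduced only for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy feature encoder standing in for any task backbone."""
    def __init__(self, in_dim=16, hid=32, out_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hid), nn.ReLU(), nn.Linear(hid, out_dim)
        )

    def forward(self, x):
        return self.net(x)

def knowledge_penalty(logits, allowed_mask):
    """Auxiliary loss: probability mass on classes a domain axiom rules out."""
    probs = torch.softmax(logits, dim=-1)
    return (probs * (1.0 - allowed_mask)).sum(dim=-1).mean()

encoder, head = Encoder(), nn.Linear(8, 4)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)

x = torch.randn(32, 16)          # toy observations
y = torch.randint(0, 4, (32,))   # toy task labels
allowed = torch.ones(32, 4)      # 1 = consistent with the knowledge base
allowed[:, 3] = 0.0              # e.g. an axiom forbids class 3 in this context

logits = head(encoder(x))
loss = F.cross_entropy(logits, y) + 0.5 * knowledge_penalty(logits, allowed)

opt.zero_grad()
loss.backward()
opt.step()
```

The design choice here is the soft, Lagrangian-style weighting of the knowledge term: rather than hard-constraining the output space, the penalty lets gradient descent trade task accuracy against consistency with the knowledge base.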
Regardless of the type of domain knowledge available, the practical objective remains the same: to learn meaningful neural representations for downstream tasks of interest. The underlying goal of neural representation learning is to statistically identify the best explanatory factors of variation in the agent's input data or observations, which often requires intuition about the complementarity between the multiple modalities or views in the input. While much focus has been placed on learning effective neural representations for specific tasks and then transferring or adapting the learned representations to other tasks, comparatively little has been placed on representation learning in the presence of the various types of domain knowledge above. Such knowledge could be used to recover information about the underlying data-generating process, to design effective modelling strategies for learning problems, to ensure model transferability or generalisability, or to understand the complementarity between views.
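To make the notion of exploiting complementarity between views concrete, the following is a minimal sketch of one common strategy (a symmetric InfoNCE alignment between two view-specific encoders), offered under assumed names (`ViewEncoder`, `alignment_loss`) rather than as a method claimed by the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ViewEncoder(nn.Module):
    """Projects one view (e.g. camera, LiDAR, or text features) into a shared space."""
    def __init__(self, in_dim, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, out_dim)
        )

    def forward(self, x):
        # Unit-normalise so the dot products below behave as cosine similarities.
        return F.normalize(self.net(x), dim=-1)

def alignment_loss(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE: matched view pairs should score above mismatched ones."""
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0))
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

enc_a, enc_b = ViewEncoder(in_dim=32), ViewEncoder(in_dim=48)
view_a, view_b = torch.randn(16, 32), torch.randn(16, 48)  # paired toy views of the same scenes

loss = alignment_loss(enc_a(view_a), enc_b(view_b))
loss.backward()
```

Because each encoder sees a different view of the same underlying scene, the contrastive objective pressures both embeddings toward the shared explanatory factors while leaving view-specific nuisance variation behind.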
This thesis studies the avenues by which the aforementioned types of domain knowledge can be combined with neural representations in order to achieve improved model performance and generalisability in three problem domains: neural commonsense reasoning, multimodal robot navigation, and autonomous driving. It contributes a collection of tools, methodologies, tasks, international AI challenges and leaderboards, datasets, and knowledge graphs; this work additionally led to the successful organisation of two international workshops on safe learning for autonomous driving.
History
Date
- 2022-04-10
Degree Type
- Dissertation
Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)