Modeling Event Implications via Multi-faceted Entity Representations
Knowledge representation is a foundational aspect of Natural Language Understanding, with representations ranging from meticulously designed notational vocabularies to high-dimensional, automatically learned numerical distributions. A good representation must have high coverage with respect to the meanings inherent in a task and an unambiguous, interpretable, machine-readable structure. Once the representation is defined, constructing instances of it for given examples (whether manually or automatically) involves two main sub-problems: (1) extracting the information conveyed by a word or sentence, and (2) structuring the extracted information appropriately and filtering out what is unimportant, based on context.
Recent advances focus on representation learning with deep neural networks, where numerical distributions representing words or sentences are learned from huge volumes of data. However, these representations lack an internal schematic structure and are therefore not interpretable by humans. This raises a central open question in AI: are these high-performing models able to reason over events and their implications in the real world, or do they simply memorize the training examples and perform only a small amount of generalization?
In this thesis, we address the problem of representing and reasoning about events and their implications in the physical world. We propose methods to create more explainable representations of knowledge that retain only the parts of the encoded information relevant to the task at hand. Our approach yields models that learn underlying reasoning mechanisms and apply them to unseen situations (i.e., generalize). We study representations at different levels of semantics (lexical/conceptual, sentence, and discourse), namely representations for words, sentences, and event chains. Our methods address the following questions:
- Can we separate different aspects of meaning in our representations and identify the aspects relevant to the task at hand, either via a fixed structure or through learning?
- How do task formulation and representation structure affect performance in limited-data scenarios?
- Can infusing representations with human knowledge replace the need for huge volumes of training data? We study how definitions, ontologies, or task explanations can be combined with a machine learning model.
- Can a model trained only on language learn physical event implications and reasoning mechanisms that generalize across domains?
The answers to these questions enable us to create multi-faceted representations of entities that guide a deep neural network to learn reasoning mechanisms and avoid shortcut learning, which is a major impediment in limited-data or domain-transfer scenarios.
History
Date
- 2022-10-04
Degree Type
- Dissertation
Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)