Carnegie Mellon University

Using Multitask Learning to Understand Language Processing in the Brain

thesis
posted on 2022-12-15, 20:20, authored by Daniel Schwartz

Understanding the cognitive processes involved in human language comprehension has been a longstanding goal of the scientific community. While significant progress has been made towards that goal, the processes involved in integrating a sequence of individual word meanings into the meaning of a clause, sentence, or discourse are poorly understood. Recently, the natural language processing (NLP) community has demonstrated that deep language models are, to an extent, capable of representing word meanings and integrating those meanings into a representation that successfully captures the meaning of a sequence. In this thesis, we therefore leverage deep language models as an analysis tool to improve our understanding of human language processing. In the setting of multitask learning, we can gain insight into the mechanisms that deep language models use to make their predictions by comparing tasks to each other. Furthermore, if some of the task predictions we ask the model to make are relevant to cognitive processing (for example, the prediction of eyetracking data measured as participants read sentences), we can ultimately use those insights to better understand language processing in people.

In this work, we first examine the use of constructive interference in multitask learning as an analysis tool. Constructive interference occurs when two tasks are related and a model is constrained by having to accurately predict both. In those cases, the representation the model learns often generalizes better to unseen data than if the model had been trained on just one of the tasks, because the constraint that the representation must support prediction of both tasks provides a helpful inductive bias. If generalization error improves when tasks are trained together, this can be viewed, with caveats, as an indication that the tasks are related. In our experiments, improved generalization error suggests relationships between event-related potential (ERP) components that are consistent with the existing literature, as well as other relationships that bear on the interpretation of ERPs.

Next, we investigate what a deep language model learns when it is trained to predict brain activity recordings. We find that the information encoded into the parameters of the model helps the model predict brain activity, generalizes to the prediction of unseen participants’ brain activity, and, to some degree, generalizes across different brain activity recording modalities. These findings provide evidence that the information the model encodes into its parameters is relevant to the cognitive processes underlying brain activity, and not just to idiosyncrasies in the data, making fine-tuning and multitask learning valid tools for probing those cognitive processes.

Finally, we develop an analysis method in which a model learns a small number of latent functions that take a sequence of words as input and produce a representation from which multiple task outputs must be predicted. We assess task similarities based on the weights that map from the common latent representation to the output associated with each task. The similarities produced by this method capture expected relationships between NLP tasks and can help us understand how a deep language model makes its predictions.
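As a rough, illustrative sketch of this latent-function idea (not the thesis implementation), the code below builds a model in which a shared encoder produces a small number of latent values per word sequence and each task reads those values through its own linear head; task similarity is then computed from the head weights. The encoder architecture, the layer sizes, and the use of cosine similarity over averaged head weights are assumptions made purely for illustration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LatentFunctionModel(nn.Module):
        """Shared latent values over a word sequence, with one linear head per task."""
        def __init__(self, vocab_size=10000, hidden=256, num_latent=8, task_output_sizes=(2, 5, 1)):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden)
            self.encoder = nn.GRU(hidden, hidden, batch_first=True)
            # A small bottleneck standing in for the latent functions of the sequence.
            self.to_latent = nn.Linear(hidden, num_latent)
            # Each task's head weights map the shared latent values to that task's output.
            self.heads = nn.ModuleList(nn.Linear(num_latent, n) for n in task_output_sizes)

        def forward(self, token_ids):
            embedded = self.embed(token_ids)          # (batch, seq_len, hidden)
            _, final_state = self.encoder(embedded)   # final_state: (1, batch, hidden)
            latent = self.to_latent(final_state[-1])  # (batch, num_latent)
            return [head(latent) for head in self.heads]

    def task_similarity(model):
        # One vector per task: its head weights averaged over output dimensions,
        # then compared pairwise with cosine similarity.
        vectors = [head.weight.mean(dim=0) for head in model.heads]
        return torch.stack([torch.stack([F.cosine_similarity(a, b, dim=0) for b in vectors])
                            for a in vectors])

    model = LatentFunctionModel()
    print(task_similarity(model))  # (num_tasks, num_tasks) matrix; untrained here, so this only shows the mechanics

Any similarity measure over the per-task weight vectors could be substituted; the point of the sketch is that tasks drawing on the same latent values end up with comparable head weights.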
We also examine the similarities between cognition-relevant tasks and NLP tasks, and find that the mechanisms underlying the model’s predictions in cognition-relevant tasks are related to agent-like and patient-like semantic properties and to modifiers in a sentence. The methods developed here can be applied with different sets of tasks to gain different kinds of insight into both deep language models and cognitive processing, and offer a promising direction for understanding language processing in the brain. 
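To make the constructive-interference test described above concrete, here is a minimal sketch, assuming PyTorch and toy random tensors in place of real word features and ERP measurements (this is not the thesis code): a shared encoder is trained either on one target alone or on two targets jointly, and held-out error on the first target is compared between the two settings.

    import torch
    import torch.nn as nn

    def make_model(num_tasks):
        # Shared encoder over (hypothetical) 300-dimensional word features,
        # plus one scalar-output head per task.
        encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU())
        heads = nn.ModuleList(nn.Linear(128, 1) for _ in range(num_tasks))
        return encoder, heads

    def train(encoder, heads, x, targets, steps=200):
        params = list(encoder.parameters()) + [p for h in heads for p in h.parameters()]
        optimizer = torch.optim.Adam(params, lr=1e-3)
        for _ in range(steps):
            optimizer.zero_grad()
            shared = encoder(x)
            # Joint loss: the shared representation must support every task's prediction.
            loss = sum(nn.functional.mse_loss(h(shared).squeeze(-1), t)
                       for h, t in zip(heads, targets))
            loss.backward()
            optimizer.step()

    # Toy stand-ins for word features and two ERP component amplitudes.
    x_train, x_test = torch.randn(512, 300), torch.randn(128, 300)
    erp_a_train, erp_a_test = torch.randn(512), torch.randn(128)
    erp_b_train = torch.randn(512)

    # Single-task baseline: predict ERP component A alone.
    enc_single, heads_single = make_model(1)
    train(enc_single, heads_single, x_train, [erp_a_train])

    # Multitask model: predict components A and B from the same shared representation.
    enc_multi, heads_multi = make_model(2)
    train(enc_multi, heads_multi, x_train, [erp_a_train, erp_b_train])

    with torch.no_grad():
        err_single = nn.functional.mse_loss(heads_single[0](enc_single(x_test)).squeeze(-1), erp_a_test)
        err_multi = nn.functional.mse_loss(heads_multi[0](enc_multi(x_test)).squeeze(-1), erp_a_test)
    print(float(err_single), float(err_multi))

With real data rather than random tensors, a lower multitask error would be read, with the caveats noted in the abstract, as evidence that the two prediction tasks are related.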

History

Date

2020-08-07

Degree Type

  • Dissertation

Department

  • Language Technologies Institute

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Tom Mitchell
