Carnegie Mellon University

Towards Efficient Natural Language Generation

Thesis posted on 2023-12-21, 20:45, authored by Junxian He

Natural language generation (NLG) has seen remarkable success, benefiting from the development of deep learning techniques. As large-scale pretraining becomes the de facto standard in NLP, enormous training data and model parameters consistently lead to state-of-the-art performance on standard NLG tasks. Despite this success, current NLG approaches are inefficient in several respects, which prevents their use in broader, practical settings: (1) they are label-inefficient: conditional neural generation (e.g., machine translation) often requires a large number of annotated samples to train, which limits its application in low-resource regimes; (2) they are parameter-inefficient: it is common practice to fine-tune a pretrained model to adapt it to a downstream task, yet these models can scale to trillions of parameters (Fedus et al., 2021), causing a large memory footprint when serving many tasks; and (3) lastly, we focus on the compute-inefficiency of a trending model class, retrieval-augmented NLG models, which retrieve from an external datastore to assist generation; the added datastore and retrieval process incur non-trivial space and time costs due to the extra computation.
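
To illustrate where the retrieval overhead comes from, the sketch below shows a generic kNN-style retrieval-augmented decoding step: a datastore of context vectors is searched and the resulting neighbor distribution is interpolated with the base model's prediction. This is a toy example with hypothetical sizes and names, not the thesis's specific models.

```python
# Toy illustration of retrieval-augmented generation in the kNN-LM style.
# Every datastore entry is keyed by a context vector and valued by the token
# that followed it; at inference time the base model's distribution is
# interpolated with a distribution induced by the nearest neighbors.
import numpy as np

rng = np.random.default_rng(0)

VOCAB, DIM, STORE = 100, 16, 10_000        # hypothetical sizes
keys = rng.standard_normal((STORE, DIM))   # datastore keys: context vectors
values = rng.integers(0, VOCAB, STORE)     # datastore values: next-token ids

def knn_augmented_probs(query, model_probs, k=8, lam=0.25, temp=1.0):
    """Interpolate the base model's distribution with a kNN distribution."""
    dists = np.linalg.norm(keys - query, axis=1)       # search the datastore
    nn = np.argsort(dists)[:k]                         # k nearest neighbors
    weights = np.exp(-dists[nn] / temp)
    weights /= weights.sum()
    knn_probs = np.zeros(VOCAB)
    np.add.at(knn_probs, values[nn], weights)          # aggregate by token id
    return lam * knn_probs + (1.0 - lam) * model_probs

# The search cost and memory scale with the datastore size, which is the
# space/time overhead described above.
query = rng.standard_normal(DIM)
base = np.full(VOCAB, 1.0 / VOCAB)                     # placeholder LM output
probs = knn_augmented_probs(query, base)
assert np.isclose(probs.sum(), 1.0)
```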

In this thesis, we aim to provide a deeper understanding of research problems in efficient NLG and to utilize the resulting insights to design better approaches. Specifically, (1) for label-efficiency, we study unsupervised and semi-supervised conditional generation that takes advantage of abundant unlabeled text data and thus reduces the need for numerous annotated samples; the proposed methods are validated on a wide variety of NLG tasks. (2) For parameter-efficiency, we propose a unified framework that connects parameter-efficient transfer learning methods, where only a few parameters need to be updated to adapt a large pretrained model to downstream tasks; the framework provides a new understanding of this direction and instantiates state-of-the-art approaches for parameter-efficient NLG. (3) For compute-efficiency in retrieval-augmented NLG, we either design new models or post-adapt the retrieval component to compress the datastore, reduce the retrieval compute, and speed up inference.
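
As a rough illustration of parameter-efficient adaptation (one common instance of the family such a unified framework covers, not the framework itself), the sketch below freezes a pretrained linear layer and trains only a small low-rank update; all names and dimensions are illustrative assumptions.

```python
# Minimal sketch of parameter-efficient adaptation: a frozen pretrained linear
# layer augmented with a trainable low-rank update, so only a small fraction of
# parameters is updated for the downstream task.
import torch
import torch.nn as nn

class LowRankAdaptedLinear(nn.Module):
    def __init__(self, pretrained: nn.Linear, rank: int = 8):
        super().__init__()
        self.pretrained = pretrained
        self.pretrained.weight.requires_grad_(False)    # freeze pretrained weights
        if self.pretrained.bias is not None:
            self.pretrained.bias.requires_grad_(False)
        d_out, d_in = pretrained.weight.shape
        self.down = nn.Linear(d_in, rank, bias=False)   # trainable low-rank factors
        self.up = nn.Linear(rank, d_out, bias=False)
        nn.init.zeros_(self.up.weight)                  # start as a zero update

    def forward(self, x):
        return self.pretrained(x) + self.up(self.down(x))

base = nn.Linear(1024, 1024)                 # stand-in for a pretrained layer
adapted = LowRankAdaptedLinear(base, rank=8)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
total = sum(p.numel() for p in adapted.parameters())
print(f"trainable params: {trainable} / {total}")  # only a small fraction is updated
```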

History

Date

2022-08-18

Degree Type

  • Dissertation

Department

  • Language Technologies Institute

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Graham Neubig, Taylor Berg-Kirkpatrick
