Towards Efficient Natural Language Generation
Natural language generation (NLG) has seen remarkable success, benefiting from the development of deep learning techniques. As large-scale pretraining becomes the de facto standard in NLP, enormous amounts of training data and model parameters consistently lead to state-of-the-art performance on standard NLG tasks. While quite successful, current NLG approaches are inefficient in several respects, which prohibits their use in broader, practical settings: (1) they are label-inefficient – conditional neural generation (e.g., machine translation) often requires a large number of annotated samples to train, which limits its application in low-resource regimes; (2) they are parameter-inefficient – it is common practice to fine-tune a pretrained model to adapt it to a downstream task; however, these models can scale to trillions of parameters (Fedus et al., 2021), which causes a large memory footprint when serving many tasks; and (3) lastly, we focus on the compute-inefficiency of a trending model class, retrieval-augmented NLG models. These models retrieve from an external datastore to assist generation, but the added datastore and retrieval process incur non-trivial space and time costs due to the extra computation.
In this thesis, we aim to provide a deeper understanding of research problems in efficient NLG and to utilize the resulting insights to design better approaches. Specifically, (1) for label-efficiency, we study unsupervised and semi-supervised conditional generation that takes advantage of abundant unlabeled text data, and thus mitigates the need for numerous annotated samples. The proposed methods are validated on a wide variety of NLG tasks; (2) for parameter-efficiency, we propose a unified framework that connects parameter-efficient transfer learning methods, where only a few parameters need to be updated to adapt a large pretrained model to downstream tasks. Our framework provides a new understanding of this direction and instantiates state-of-the-art approaches for parameter-efficient NLG; (3) for compute-efficiency in retrieval-augmented NLG, we either design new models or post-adapt the retrieval component to compress the datastore, reduce the retrieval compute, and speed up inference.
- Language Technologies Institute
- Doctor of Philosophy (PhD)