Carnegie Mellon University

A Suite of Low-Data Generation Settings Necessitating Interventions on End-to-End NLG Models

posted on 2023-01-06, 21:36 authored by Varun Prashant Gangal

Natural Language Generation (NLG) is the field of study that aims to endow agents with the ability to generate language to satisfy any stated communicative goal (CG). Today, NLG systems that converse (e.g., Meena) and co-author (e.g., Gmail SmartCompose) with humans aren’t just deployable, but a familiar part of net-equipped societies. The models underlying today’s systems, e.g., T5 [142], are based on neural architectures like Transformers [180] and are “end-to-end” in terms of their structure and overall learning process.

Notwithstanding these rapid strides, emerging work points to concerns about aspects of NLG model outputs such as commonsense plausibility [102], local coherence [127], and global coherence [90] that arise under their respective generation settings. In this thesis, we identify and characterize six such generation settings that present challenges for learning suitable end-to-end models when applied sans any setting-specific changes. 

In some of these settings, the CG specifies an unusual, esoteric set of constraints for the outputs to satisfy, e.g., being phonetically difficult. Gold output examples, each of which is a commonly accepted, creatively coined artifact (e.g., the tongue twister “She sells seashells on the seashore”), are hard to curate, leading to small datasets (e.g., ≈ 400 examples for tongue twister generation). Learning feasibly from so little data requires setting-driven changes to the learning process.

In other instances of these settings, the CG requires the output to satisfy, in addition to typical requirements like fluency, complex aspects or properties in relation to the input, such as creating commonsense-plausible combinations of input concepts for the Commongen setting [102]. Generating to satisfy these aspects needs a particularly knowledge-rich interpretation of the input. However, the aspects in question are too wide-ranging in scope to specify explicitly; thus, the CG is in some sense “partially specified”. Moreover, the training data, though not low-count, is still at a scale insufficient to acquire the knowledge tabula rasa. It is hence necessary to bridge this knowledge gap by incorporating explicit sources of knowledge into the learning process, e.g., by augmenting the input via grounding in another modality for Commongen [46].

Some instances of these settings may even constitute a challenging blend of the above two classes, with unusual output constraints married with a knowledge-intensive output-input relationship mandated by the CG, e.g., generating sarcastic comments about an input short story.

We show how each setting benefits from a specific, setting-inspired intervention in the end-to-end nature of the NLG model architecture and learning process to design a final, improved NLG system that viably generates outputs satisfying the CG.

The sheer diversity of linguistic form means there will always arise new, data-deficient NLG settings that involve unusual constraints, underspecified CGs, or other challenging configurations of CG, output, and input. This thesis illustrates a general method for addressing such situations systematically. We sketch a general recipe outlining how to design such interventions.




Degree Type

  • Dissertation


Department

  • Language Technologies Institute

Degree Name

  • Doctor of Philosophy (PhD)


Advisor(s)

  • Eduard Hovy
