Carnegie Mellon University

Learning Structured Neural Semantic Parsers

thesis
posted on 2022-12-16, 21:07 authored by Pengcheng Yin
<p>Semantic parsing, the task of translating user-issued natural language (NL) utterances (<em>e.g., Flights from Pittsburgh to New York</em>) into formal meaning representations (MRs, <em>e.g.</em>, an SQL database query or a Python program), has become an important direction in developing natural language interfaces to computational systems. Recent years have witnessed a surge in the application of neural network-based semantic parsers to various tasks and domains. However, meaning representations typically exhibit strong syntactic structure, and are defined following domain-specific structured knowledge schemas (<em>e.g.</em>, a database schema or Python API specification), which are not easily captured by standard neural sequence transduction models. Neural semantic parsers are also data-hungry, requiring non-trivial manual annotation effort by domain experts. These issues limit the scope of applications supported by a neural semantic parser, impeding its application to broader scenarios, especially those whose meaning representations have diverse and complex structure. </p> <p>In this thesis, we explore developing neural semantic parsing models that better capture the <em>structure</em> in various types of logical formalisms and knowledge schemas, while providing approaches to mitigate the cost of labeled data acquisition. The dissertation consists of three parts. The first part introduces a general-purpose parsing model with built-in syntactic knowledge of the grammatical structure of meaning representations. Next, in the second part, we investigate approaches to encode structured information in domain knowledge schemas (<em>e.g.</em>, database tables) that is useful for understanding user-issued utterances. 
Specifically, we focus on grounding elements in the schema (<em>e.g.</em>, columns like departure_city in database tables, or functions like GetFlight(<u>from=GetCityByName(·)</u>) in API specifications) to their corresponding NL constituents (<em>e.g., from Pittsburgh</em>) in utterances. Finally, in the third part, we aim to improve the data efficiency of semantic parsers via semi-supervised learning, while developing machine-assisted approaches to accelerate training data acquisition. </p>

Date

2021-08-13

Degree Type

  • Dissertation

Thesis Department

  • Language Technologies Institute

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Graham Neubig