Example-Driven Question Answering
Open-domain question answering (QA) is an emerging information-seeking paradigm that automatically generates accurate and concise answers to natural-language questions from humans. It has become one of the most natural and efficient ways to interact with the web, and it is especially desirable in hands-free, speech-enabled environments. Building QA systems, however, either requires relying on off-the-shelf natural language processing tools that are not optimized for the QA task or training domain-specific modules (e.g., question type classification) with annotated data. Additionally, optimizing QA systems with hand-crafted procedures or feature engineering is costly and time-consuming, and such systems are laborious to transfer to new domains and languages.
This dissertation studies the idea of example-driven question answering, which focuses on learning to search, select, and generate answers to unseen questions solely by observing existing noisy question-answer examples along with a text corpus or a knowledge base. To achieve this goal, we develop novel neural network architectures throughout the QA pipeline that can be trained directly from question-answer examples. First, we propose candidate retrieval models that utilize noisy signals to produce dense indexes for text corpora and to generate structured queries for knowledge graphs. Second, we develop generative relevance models that do not require annotated negative QA pairs, as well as discriminative relevance models that can utilize pseudo-negative examples. Third, we improve encoder-decoder models for response text generation that can accept external guidance toward a specific language style and topic. The integrated QA pipeline generates answer-like embedding vectors to search, selects the most relevant passages, and composes a natural-sounding response based on the selected passages.
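The integrated pipeline just described (search with answer-like embeddings, select the most relevant passages, then generate a response) can be sketched with a minimal toy stand-in. The sketch below is illustrative only: a bag-of-words vector replaces the learned neural encoder, cosine similarity replaces the trained relevance models, and "generation" is reduced to returning the best passage verbatim; all function names and example data are assumptions, not the dissertation's actual models.

```python
from collections import Counter
from math import sqrt

def embed(text):
    # Toy bag-of-words "embedding"; a stand-in for a learned neural encoder
    # that would map a question to an answer-like dense vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer(question, passages, top_k=2):
    # Search: score every passage against the question embedding.
    # Select: keep the top-k most relevant passages.
    # Generate: here, trivially return the best passage as the response.
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    selected = ranked[:top_k]
    return selected[0], selected

passages = [
    "Aspirin can relieve mild headaches and reduce fever.",
    "The Eiffel Tower is located in Paris, France.",
    "Regular exercise improves cardiovascular health.",
]
response, evidence = answer("What can relieve a headache?", passages)
print(response)  # the passage sharing the most terms with the question
```

In the dissertation's actual setting, each of these three stand-ins is replaced by a trainable neural module optimized end-to-end from noisy question-answer examples rather than hand-crafted features.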
This dissertation demonstrates the feasibility of creating open-domain, example-driven QA pipelines based on neural networks without any feature engineering or dedicated manual annotations for each QA module. Experiments show that our models achieve state-of-the-art or competitive performance on several real-world ranking and generation tasks in the domains of QA and conversation generation. When applied in the TREC LiveQA competitions, our approach received the highest average scores among automatic systems in the main tasks of 2015, 2016, and 2017, and the highest average score in the medical subtask of 2017.
Date
- 2019-09-30
Degree Type
- Dissertation
Department
- Computer Science
Degree Name
- Doctor of Philosophy (PhD)