Improving Reliability in Dialogue Systems
Dialogue systems have advanced significantly by leveraging large public corpora and improvements in neural architectures. Thanks to large pre-trained language models, dialogue generation systems are now capable of producing fluent and engaging responses across a wide range of dialogue contexts. However, the black-box nature and heightened complexity of end-to-end neural dialogue models make them susceptible to unknown failure modes that often emerge only after deployment. To improve the reliability of neural dialogue models for practical applications, several challenges need to be addressed. First, creating robust and bias-free evaluation and ranking models for dialogue is not straightforward, as it requires careful consideration of factors such as context, coherence, relevance, and user satisfaction. Second, controlling the outputs of dialogue response generation models to align with developers' intended goals remains difficult: current approaches often lack the flexibility, intuitiveness, interpretability, and data efficiency needed for fine-grained control over the generated responses. Finally, enhancing safety measures is crucial to ensure that dialogue systems do not generate offensive or factually incorrect responses, thereby avoiding unintended harm to users.
This thesis addresses these challenges by introducing novel techniques for robust evaluation and for finer, more intuitive control over the response generation process. It comprises two main parts. The first part focuses on developing robust dialogue response evaluation and ranking algorithms. These techniques utilize multiple references, automatically generated adversarial responses, and improved benchmarking methods for assessing factuality. By incorporating these approaches, the thesis aims to establish more reliable and comprehensive evaluation metrics for dialogue systems, ensuring a more accurate assessment of their performance. The second part proposes techniques that give developers flexible, intuitive, and interpretable means of controlling the generation process, including the use of templates, examples, instructions, and guidelines to steer the system towards responses that align with specific tasks and developer intent. This part also introduces safety mechanisms designed to prevent misuse and harm to users; these mechanisms rely on natural language instructions and guidelines to ensure responsible and ethical behavior of the dialogue systems.
History
Date
- 2023-07-23
Degree Type
- Dissertation
Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)