Towards Efficient and Scalable Representation Learning
Extracting useful information from rapidly growing volumes of data to support informed decisions is increasingly challenging. Despite recent advances in deep learning, how to make use of such enormous data for a diverse set of tasks in an efficient and scalable manner remains an open question.
To address the two main aspects of representation learning from data, namely efficiency and scalability, this thesis presents techniques for diverse tasks, including sentiment analysis, handwriting recognition, and document intelligence, where data appear in different forms: multimodal data comprising text, audio, and video; noisy scanned handwriting images; and long documents with differing layouts. Because of the availability and potential issues of the data and the distinct objectives of the associated tasks, there is no one-size-fits-all solution; instead, a specific approach is developed for each problem. In addition, for large-scale data, this thesis presents approximation techniques and analyses to estimate essential components, learn effective representations, and speed up the learning process, including matrix trace approximation with a parallel non-adaptive method, spectrum approximation in Gaussian process training, and task-based mixture-of-experts models for large-scale multitask neural machine translation. Throughout these works, the thesis introduces novel approaches for tackling issues present in the data and the tasks, learning efficient representations, and approximating models for practical scalability in the real world.
History
Date
- 2023-05-08
Degree Type
- Dissertation
Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)