Online and Adaptive Methods for Multitask Learning
The power of jointly learning multiple tasks arises from transferring relevant knowledge across those tasks, especially from information-rich tasks to information-poor ones. Lifelong learning, in turn, provides an efficient way to learn new tasks faster by reusing knowledge from previous tasks while avoiding catastrophic forgetting, i.e., without significantly degrading performance on the old tasks. Despite these advantages, learning from related tasks poses considerable challenges: the learner must remain effective, minimizing prediction errors across all tasks, and computationally tractable for real-time use, especially when the number of tasks is large. Human beings, by contrast, seem to accumulate and retain knowledge from the past naturally, and to leverage it to acquire new skills and solve new problems efficiently. We consider two key challenges in multitask and lifelong learning:
- Sequential data: In many real-world applications, such as financial trading, email prioritization, personalized news, and spam filtering, data arrive sequentially. In such cases, we need to make predictions and update the per-task models efficiently, in real time, whenever a new observation or a new task becomes available.
- Scalability: Most existing multitask learning algorithms cannot scale to very large datasets, especially when the number of tasks is large. The computational burden arises from the difficulty of learning the shared knowledge from all the tasks.
Together, these two challenges hinder the practical application of multitask learning to many real-world problems. In this thesis, we propose simple and efficient algorithms for learning from related tasks that address the aforementioned challenges. Our algorithms feature probabilistic interpretations, efficient update rules, and flexible control over whether learners focus on their own task or jointly address all tasks. We develop a novel approach to active learning for sequential problems that first determines whether the learner can acquire a label from its peers; if so, it saves the human query for later use on more difficult cases, and if not, it queries the human. We also define a new machine learning paradigm based on a curriculum determined dynamically by the learner ("self-paced") rather than a fixed curriculum set a priori by a teacher. The primary focus of this thesis is to scale multitask and lifelong learning to practical applications where the tasks and/or their examples arrive in an online fashion.
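The peer-query idea described above can be sketched as follows: before spending a human query, the learner polls its peer tasks and adopts their label only when the aggregate vote is confident enough. This is a minimal illustrative sketch, not the thesis's actual algorithm; the function names, the weighted-vote aggregation, and the confidence threshold are all assumptions made for illustration.

```python
# Illustrative sketch of peer-assisted active learning (hypothetical
# names and threshold; not the dissertation's exact algorithm).

def peer_label(x, peers, threshold=0.8):
    """Weighted vote among peer predictors.

    peers: list of (predict_fn, weight) pairs.
    Returns (label, confident) where confident is True when the
    winning label holds at least `threshold` of the total weight.
    """
    votes = {}
    for predict, weight in peers:
        y = predict(x)
        votes[y] = votes.get(y, 0.0) + weight
    total = sum(votes.values())
    label, score = max(votes.items(), key=lambda kv: kv[1])
    return label, (score / total) >= threshold

def acquire_label(x, peers, human_oracle, budget):
    """Query peers first; fall back to the human only when unconfident.

    budget is a one-element list acting as a mutable query counter,
    so saved queries remain available for harder examples later.
    """
    label, confident = peer_label(x, peers)
    if confident or budget[0] <= 0:
        return label          # peer label suffices; no human query spent
    budget[0] -= 1            # spend one human query on a hard case
    return human_oracle(x)
```

When the peers agree, the human annotation budget is untouched; when they split, one query is spent on the difficult example, mirroring the save-the-query-for-hard-cases behavior described in the abstract.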
History
Date
- 2023-12-18
Degree Type
- Dissertation
Department
- Language Technologies Institute
Degree Name
- Doctor of Philosophy (PhD)