Classical machine learning assumes that data are independently and identically distributed, and arrive in a single format, usually the same as that of the test data. In modern applications, however, additional information in other formats is often available freely or at a lower cost. For example, in data crowdsourcing we can collect preferences over pairs of data points, which is cheaper than directly asking for the label of a single data point. In natural language understanding problems, we might have only a limited amount of data in the target domain, but can use a large amount of general-domain data for free. The main topic of this thesis is how to efficiently incorporate these diverse forms of information into the learning and decision-making process.
We study two representative paradigms in this thesis:

• First, we study learning and decision-making problems with both direct labels and comparisons. In many applications, such as clinical settings and materials science, comparisons are much cheaper to obtain than direct labels. We show that comparisons can greatly reduce problem complexity: using comparisons as input, our algorithms require exponentially fewer labels than traditional label-only algorithms, while keeping a total query complexity similar to that of previous algorithms. We consider various learning problems in this setting, including classification, regression, multi-armed bandits, nonconvex optimization, and reinforcement learning. (A minimal sketch of the label-saving effect appears after this list.)

• Second, we study multi-task learning and transfer learning to learn from data across different domains and tasks. Here, our algorithms use previously collected data from similar tasks or domains, which are essentially free to use. We propose simple yet effective ways to transfer knowledge from other domains and tasks, and achieve state-of-the-art results on several natural language understanding benchmarks. (A sketch of one common transfer setup also follows below.)
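To make the label-saving effect concrete, here is a minimal sketch for a one-dimensional threshold classifier with noiseless oracles; the thesis's actual algorithms handle noisy comparisons and richer settings, and the names `learn_threshold`, `compare`, and `query_label` are illustrative rather than taken from the thesis. Comparisons first rank the points, after which a binary search locates the decision boundary with only logarithmically many label queries.

```python
import functools

def learn_threshold(points, compare, query_label):
    """Learn a threshold classifier over `points`.

    compare(a, b)  -> -1 / 0 / 1, saying whether `a` ranks below `b`
                      (the cheap comparison oracle).
    query_label(x) -> 0 or 1, the true label of `x`
                      (the expensive label oracle).

    Assumes labels are monotone along the comparison ranking:
    all 0s first, then all 1s.
    """
    # O(n log n) cheap comparison queries to rank the points.
    ranked = sorted(points, key=functools.cmp_to_key(compare))

    # O(log n) expensive label queries: binary-search for the
    # first point whose label is 1.
    lo, hi = 0, len(ranked)
    while lo < hi:
        mid = (lo + hi) // 2
        if query_label(ranked[mid]) == 1:
            hi = mid          # boundary is at or before mid
        else:
            lo = mid + 1      # boundary is strictly after mid
    boundary = lo             # points ranked[boundary:] are labeled 1

    def classifier(x):
        # Label 1 iff x ranks at or above the first positive point.
        return int(boundary < len(ranked)
                   and compare(x, ranked[boundary]) >= 0)

    return classifier
```

With n points, the sort uses O(n log n) comparison queries and the search uses O(log n) label queries, so the total number of queries stays comparable to label-only approaches while the number of expensive label queries drops from linear to logarithmic.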
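For the second paradigm, one simple and widely used transfer mechanism is a shared encoder with task-specific heads: data from auxiliary tasks updates the shared parameters, and the target task benefits through them. The sketch below is a hedged illustration of that setup in PyTorch, not the thesis's exact models; `MultiTaskModel` and its arguments are hypothetical names, and the encoder is assumed to map inputs to fixed-size feature vectors.

```python
from torch import nn

class MultiTaskModel(nn.Module):
    """Shared encoder plus one lightweight head per task.

    Hypothetical sketch: training on batches from auxiliary tasks
    updates the shared encoder, transferring knowledge from their
    (essentially free) data to the target task.
    """

    def __init__(self, encoder, hidden_dim, num_labels_per_task):
        super().__init__()
        self.encoder = encoder  # e.g. a pretrained sentence encoder
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden_dim, n)
            for task, n in num_labels_per_task.items()
        })

    def forward(self, inputs, task):
        features = self.encoder(inputs)    # shared representation
        return self.heads[task](features)  # task-specific prediction
```

During training, batches from the auxiliary tasks and the target task are interleaved, and each batch's loss is computed through its own head; only the small heads are task-specific, so most parameters are shared across tasks.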
This thesis offers both theoretical and practical insights. Theoretically, we establish performance guarantees for our algorithms, as well as their statistical minimax optimality through information-theoretic lower bounds. On the practical side, we demonstrate promising experimental results on price estimation and natural language understanding tasks.