Exploring Weakly Labeled Data Across the Noise-Bias Spectrum

Fisher, Robert W. H.

doi:10.1184/R1/6716600.v1

ExploringWeakly Labeled Data Across the Noise-Bias Spectrum.pdf (2.05 MB)

thesis

posted on 2016-04-01, 00:00 authored by Robert W. H. Fisher

As the availability of unstructured data on the web continues to grow, it is becoming increasingly necessary to develop machine learning methods that rely less on human-annotated training data. In this thesis, we present methods for learning from weakly labeled data. We present a unifying framework for understanding weakly labeled data in terms of bias and noise, and we identify methods that are well suited to learning from particular types of weak labels. To cope with the tremendous size of weakly labeled datasets, we leverage computationally efficient and statistically consistent spectral methods. Using these methods, we present results from four diverse, real-world applications coupled with a unifying simulation environment. This allows us to make general observations that would not be apparent when examining any one application on its own. These contributions allow us to significantly improve prediction when labeled data is available, and they also make learning tractable when the cost of acquiring annotated data is prohibitively high.

History

Date

2016-04-01

Degree Type

Dissertation

Department

Machine Learning

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Reid Simmons

Keywords

Weakly labeled data; spectral methods; latent variable models

Licence

In Copyright
