Carnegie Mellon University

An Analysis of Active Learning With Uniform Feature Noise

journal contribution
posted on 2014-04-01, 00:00 authored by Aaditya Ramdas, Barnabas Poczos, Aarti Singh, Larry Wasserman
<p>In active learning, the user sequentially chooses values for a feature X and an oracle returns the corresponding label Y. In this paper, we consider the effect of feature noise in active learning, which could arise either because X itself is being measured, or because it is corrupted in transmission to the oracle, or because the oracle returns the label of a noisy version of the query point. In statistics, feature noise is known as "errors in variables" and has been studied extensively in non-active settings. However, the effect of feature noise in active learning has not been studied before. We consider the well-known Berkson errors-in-variables model with additive uniform noise of width σ. Our simple but revealing setting is one-dimensional binary classification, where the goal is to learn a threshold (the point where the probability of a + label crosses half). We deal with regression functions that are antisymmetric in a region of size σ around the threshold and that also satisfy Tsybakov's margin condition around the threshold. We prove minimax lower and upper bounds which demonstrate that when σ is smaller than the minimax active/passive noiseless error derived in Castro & Nowak (2007), noise has no effect on the rates and one achieves the same noiseless rates. For larger σ, the <em>unflattening</em> of the regression function on convolution with uniform noise, together with its local antisymmetry around the threshold, yields a behaviour where noise <em>appears</em> to be beneficial. Our key result is that active learning can buy significant improvement over a passive strategy even in the presence of feature noise.</p>
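The Berkson setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator; it is a hypothetical bisection-with-majority-vote active strategy queried against an oracle that labels a uniformly perturbed version of the query point. The names (`berkson_oracle`, `active_bisection`), the deterministic step regression function, and the choice of a centered uniform of width σ are illustrative assumptions, not details taken from the paper.

```python
import random

def berkson_oracle(x, sigma, threshold=0.5, rng=random):
    """Label a noisy version of the query point (Berkson model).

    The oracle does not see x itself but z = x + U, where U is uniform
    noise of width sigma (here centered: U ~ Uniform(-sigma/2, sigma/2)).
    For simplicity the noiseless regression function is a hard step at
    the threshold, i.e. P(Y = +1 | z) jumps from 0 to 1 there.
    """
    z = x + rng.uniform(-sigma / 2.0, sigma / 2.0)
    return 1 if z > threshold else -1

def active_bisection(sigma, n_queries, threshold=0.5, rng=random):
    """Naive active strategy: bisect on the majority vote of repeated queries.

    Repeated queries at the same point average out the feature noise;
    the interval [lo, hi] is then halved toward the estimated threshold.
    """
    lo, hi = 0.0, 1.0
    rounds = 20
    votes_per_point = max(1, n_queries // rounds)
    for _ in range(rounds):
        mid = (lo + hi) / 2.0
        vote = sum(berkson_oracle(mid, sigma, threshold, rng)
                   for _ in range(votes_per_point))
        if vote > 0:          # majority says +1: threshold is to the left
            hi = mid
        else:                 # majority says -1: threshold is to the right
            lo = mid
    return (lo + hi) / 2.0
```

With σ = 0 the oracle is noiseless and the bisection recovers the threshold to within 2⁻²⁰; as σ grows, more repeated queries per point are needed before the majority vote is reliable near the threshold, which is the regime the paper's rates quantify.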

History

Publisher Statement

Copyright 2014 by the authors.

Date

2014-04-01
