Posted on 1999-02-01. Authored by Joseph O'Sullivan, John Langford, Richard Caruana, Avrim Blum.
Most machine learning algorithms are lazy: they
extract from the training set the minimum information
needed to predict its labels. Unfortunately,
this often leads to models that are not robust
when features are removed or obscured in
future test data. For example, a backprop net
trained to steer a car typically learns to recognize
the edges of the road, but does not learn
to recognize other features, such as the stripes
painted on the road, which could be useful when
road edges disappear in tunnels or are obscured
by passing trucks. The net learns the minimum
necessary to steer on the training set. In contrast,
human driving is remarkably robust as features
become obscured. Motivated by this, we propose
a framework for robust learning that biases induction
to learn many different models from the
same inputs. We present a meta-algorithm for
robust learning called FeatureBoost, and demonstrate
it on several problems using backprop nets,
k-nearest neighbor, and decision trees.
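
To make the "many different models from the same inputs" idea concrete, the following Python sketch (assuming NumPy and scikit-learn are available) trains decision trees on random feature subsets, combines them by majority vote, and measures accuracy when some test features are zeroed out. The uniform subset sampling, the model and dataset choices, and all parameter values here are illustrative assumptions; this is not the FeatureBoost procedure itself, whose details the abstract does not give.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=1000, n_features=20, n_informative=6,
                               n_redundant=6, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=0)

    # Train each model on its own random feature subset so that no single
    # feature is indispensable to the ensemble as a whole.
    n_models, subset_size = 25, 8
    models = []
    for _ in range(n_models):
        feats = rng.choice(X.shape[1], size=subset_size, replace=False)
        clf = DecisionTreeClassifier(max_depth=5, random_state=0)
        clf.fit(X_tr[:, feats], y_tr)
        models.append((feats, clf))

    def vote(X):
        # Majority vote over the feature-subset models.
        preds = np.stack([clf.predict(X[:, feats]) for feats, clf in models])
        return (preds.mean(axis=0) >= 0.5).astype(int)

    # Simulate features being obscured at test time (e.g. road edges
    # disappearing in a tunnel) by zeroing out a few random columns.
    X_obscured = X_te.copy()
    X_obscured[:, rng.choice(X.shape[1], size=5, replace=False)] = 0.0
    print("clean accuracy:   ", (vote(X_te) == y_te).mean())
    print("obscured accuracy:", (vote(X_obscured) == y_te).mean())

Because each model sees only a fraction of the features, the ensemble degrades gracefully when columns are obscured, whereas a single model trained on all features may have learned to depend on exactly the columns that were removed.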