Posted on 2007-01-01, 00:00. Authored by Nathan Ratliff, David Bradley, J. Andrew Bagnell, Joel Chestnutt.
The Maximum Margin Planning (MMP) (Ratliff et al., 2006) algorithm solves
imitation learning problems by learning linear mappings from features to cost
functions in a planning domain. The learned policy is the result of minimum-cost
planning using these cost functions. These mappings are chosen so that the example policies (or trajectories) given by a teacher appear lower in cost (by a loss-scaled margin) than any other policy for the given planning domain.
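As a rough sketch of the hinge-style objective this margin condition induces (the notation below is our own shorthand, not taken verbatim from the paper):

\[
\min_{w}\;\; \frac{\lambda}{2}\,\lVert w \rVert^{2}
\;+\; \frac{1}{N}\sum_{i=1}^{N}
\left( w^{\top} F_i\,\mu_i \;-\; \min_{\mu \in \mathcal{M}_i}\big( w^{\top} F_i + \ell_i^{\top} \big)\,\mu \right),
\]

where \(F_i\) stacks the feature vectors of planning problem \(i\), \(\mu_i\) encodes the teacher's example path as state-action visitation counts, \(\mathcal{M}_i\) is the set of feasible paths, and \(\ell_i\) is the loss vector that scales the margin. Assuming the example path incurs zero loss against itself, each term is nonnegative and vanishes exactly when that path beats every alternative by the loss-scaled margin.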
We provide a novel approach, MMPBoost, based on the functional gradient descent view of
boosting (Mason et al., 1999; Friedman, 1999a) that extends MMP by “boosting”
in new features. This approach uses simple binary classification or regression to
improve the performance of MMP imitation learning, and naturally extends to the class of structured maximum margin prediction problems (Taskar et al., 2005).
We apply our technique to navigation and planning problems for outdoor mobile robots and to robotic legged locomotion.
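To make the "boosting in new features" step concrete, here is a minimal Python sketch of one boosting iteration under assumed interfaces: the planner plan, the base learner fit_regressor, and the state-indexed feature matrix are placeholders rather than the authors' code, and the loss-augmented planning step is simplified away.

import numpy as np

def mmp_boost_iteration(features, example_path, plan, fit_regressor, weights):
    """One hypothetical boosting pass: grow a new feature column that pushes
    the teacher's example path toward being the minimum-cost path.

    features      -- (n_states, n_features) array of current features per state
    example_path  -- indices of states visited by the teacher's example
    plan          -- assumed callable: per-state costs -> indices of a min-cost path
    fit_regressor -- assumed callable: (X, y) -> model exposing .predict(X)
    weights       -- current linear MMP weights, shape (n_features,)
    """
    # Per-state cost under the current linear cost function.
    costs = features @ weights

    # Plan with the current costs. In the full algorithm this step would be
    # loss-augmented so that the example must win by a loss-scaled margin.
    planned_path = plan(costs)

    # Functional-gradient targets: cost should rise on states the current
    # planner prefers and fall on states the teacher actually visited.
    targets = np.zeros(features.shape[0])
    np.add.at(targets, np.asarray(planned_path), +1.0)
    np.add.at(targets, np.asarray(example_path), -1.0)

    # Fit a simple regressor (or classifier) to those targets; its response
    # over all states becomes a brand-new feature column.
    model = fit_regressor(features, targets)
    new_feature = model.predict(features)

    # Linear MMP is then re-run on the augmented feature set to pick weights.
    return np.column_stack([features, new_feature])

Re-running the linear MMP optimization on the augmented feature matrix and repeating this step builds up a nonlinear cost function from simple base learners, which is the effect the abstract describes.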