Posted on 2005-01-01, 00:00. Authored by Jun Morimoto, Jun Nakanishi, Gen Endo, Gordon Cheng, Christopher G. Atkeson, Garth Zeglin
We propose a model-based reinforcement learning
algorithm for biped walking in which the robot learns
to appropriately modulate an observed walking pattern. Via-points
are detected from the observed walking trajectories
using the minimum jerk criterion. The learning algorithm
modulates the via-points as control actions to improve walking
trajectories. The choice of control actions is based on a learned model of the
Poincaré map of the periodic walking pattern. The model
maps from a state in the single support phase and the control
actions to a state in the next single support phase. We applied
this approach to both a simulated robot model and an actual
biped robot. We show that successful walking policies are
acquired.
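
The idea of a learned Poincaré map can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the model class, so a scalar linear map x_next = a*x + b*u fitted by least squares is assumed here as a stand-in, and the toy dynamics, the function names fit_linear_map and choose_action, and all numeric values are hypothetical. Here x stands for a state sampled once per step in the single support phase, and u for a via-point modulation used as the control action.

```python
# Toy sketch of a learned step-to-step (Poincare) map.
# ASSUMPTIONS: scalar state, scalar action, linear model, noise-free toy data.
# None of these choices come from the source; they only illustrate the idea.

def fit_linear_map(samples):
    """Least-squares fit of x_next = a*x + b*u from (x, u, x_next) triples."""
    sxx = sum(x * x for x, u, y in samples)
    sxu = sum(x * u for x, u, y in samples)
    suu = sum(u * u for x, u, y in samples)
    sxy = sum(x * y for x, u, y in samples)
    suy = sum(u * y for x, u, y in samples)
    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        raise ValueError("samples must vary in both state and action")
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


def choose_action(a, b, x, x_target):
    """Pick the action whose predicted next state equals x_target."""
    if b == 0:
        raise ValueError("the action has no effect in the fitted model")
    return (x_target - a * x) / b


# Hypothetical "true" step-to-step dynamics used only to generate toy data.
TRUE_A, TRUE_B = 0.8, 0.5
samples = [
    (x, u, TRUE_A * x + TRUE_B * u)
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0)
    for u in (-0.2, 0.0, 0.2)
]

a, b = fit_linear_map(samples)

# Use the fitted model to choose an action that returns the state to 0.
x_now = 0.3
u_now = choose_action(a, b, x_now, 0.0)
x_pred = a * x_now + b * u_now
```

In this toy setting the fitted model is used once per step: the current single-support state is measured, and the action is chosen so that the predicted state at the next step reaches a desired value. A real biped would need a multi-dimensional state, a nonlinear function approximator, and a policy learned from reward rather than a one-step inversion.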