Posted on 2007-01-01. Authored by Brenna Argall, Brett Browning, Manuela M. Veloso.
Learning by demonstration can be a powerful and natural tool for developing robot control policies. That is, instead of tedious hand-coding, a robot may learn a control policy by interacting with a teacher. In this work we present an algorithm for learning by demonstration in which the teacher operates in two phases. The teacher first demonstrates the task to the learner. The teacher next critiques the learner's performance of the task. This critique is used by the learner to update its control policy. In our implementation we use a 1-Nearest Neighbor technique that incorporates both the training dataset and the teacher critique. Since the teacher critiques performance only, the teacher does not need to guess at an effective critique for the underlying algorithm. We argue that this method is particularly well-suited to human teachers, who are generally better at assigning credit to performances than to algorithms. We have applied this algorithm to the simulated task of a robot intercepting a ball. Our results demonstrate improved performance with teacher critiquing, where performance is measured by both execution success and efficiency.
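The abstract does not specify how the teacher critique enters the 1-Nearest Neighbor lookup. The sketch below is one plausible reading, in Python: each demonstrated state-action pair carries a teacher-assigned credit, and that credit scales the point's distance during the nearest-neighbor query, so poorly rated demonstrations are less likely to be selected. The class name `CritiquedNNPolicy`, the credit-scaling rule, and the update factors are illustrative assumptions, not the authors' method.

```python
import numpy as np


class CritiquedNNPolicy:
    """1-NN control policy whose demonstration points carry teacher credit.

    Hypothetical sketch: credit values scale the distance used in the
    nearest-neighbor query; the paper's actual mechanism may differ.
    """

    def __init__(self):
        self.states = []   # states observed during teacher demonstration
        self.actions = []  # actions the teacher took in those states
        self.credits = []  # multiplicative credit per point (1.0 = neutral)

    def add_demonstration(self, state, action):
        """Phase 1: record a (state, action) pair shown by the teacher."""
        self.states.append(np.asarray(state, dtype=float))
        self.actions.append(action)
        self.credits.append(1.0)

    def predict(self, state):
        """Return the action of the nearest credit-scaled demonstration,
        along with the index of the point used (so it can be critiqued)."""
        state = np.asarray(state, dtype=float)
        dists = np.array([np.linalg.norm(state - s) for s in self.states])
        scaled = dists / np.array(self.credits)  # higher credit -> effectively closer
        idx = int(np.argmin(scaled))
        return self.actions[idx], idx

    def critique(self, idx, good):
        """Phase 2: teacher rates an executed step; adjust that point's credit.

        The 1.25 / 0.8 factors are arbitrary illustrative choices."""
        self.credits[idx] *= 1.25 if good else 0.8
```

A minimal usage example under the same assumptions:

```python
policy = CritiquedNNPolicy()
policy.add_demonstration([0.0, 1.0], "turn_left")
policy.add_demonstration([2.0, 0.5], "drive_forward")

action, used = policy.predict([1.8, 0.6])  # selects "drive_forward"
policy.critique(used, good=True)           # teacher approves the executed step
```

The design point this is meant to illustrate is the one the abstract emphasizes: the teacher only rates observed performance (good or bad), and the learner translates that rating into an internal update, so the teacher never has to reason about the underlying 1-NN representation.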