The Necessity of Average Rewards in Cooperative Multirobot Learning

Tangamchit, Poj; Dolan, John M.; Khosla, Pradeep K.

doi:10.1184/R1/6561245.v1

journal contribution

Posted on 2002-01-01, authored by Poj Tangamchit, John M. Dolan, Pradeep K. Khosla

Learning can be an effective way for robot systems to deal with dynamic environments and changing task conditions. However, popular single-robot learning algorithms based on discounted rewards, such as Q-learning, do not achieve cooperation (i.e., purposeful division of labor) when applied to task-level multirobot systems. A task-level system is defined as one performing a mission that is decomposed into subtasks shared among robots. In this paper, we demonstrate the superiority of average-reward-based learning such as the Monte Carlo algorithm for task-level multirobot systems, and suggest an explanation for this superiority.
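The distinction between discounted and average rewards can be illustrated with a small sketch. It is not taken from the paper: the two reward streams, the discount factor, and the function names below are hypothetical, chosen only to show one way the two criteria can rank the same behaviors differently when a payoff arrives only after several subtasks are complete.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def average_reward(rewards):
    """Mean reward per step over a finite reward sequence."""
    return sum(rewards) / len(rewards)


# Hypothetical reward streams (illustrative numbers, not from the paper).
greedy = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]       # small payoff right away
cooperative = [0.0, 0.0, 0.0, 0.0, 0.0, 4.0]  # larger payoff after all subtasks finish

gamma = 0.5  # assumed discount factor

print("discounted:", discounted_return(greedy, gamma),
      discounted_return(cooperative, gamma))
print("average:   ", average_reward(greedy), average_reward(cooperative))
```

With these assumed numbers, the discounted criterion ranks the immediate payoff higher, while the per-step average ranks the delayed, larger payoff higher. Whether this matches the explanation offered in the paper should be checked against the full text.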

History

Publisher Statement

"©2002 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE." "This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder."

Date

2002-01-01

Keywords

Robotics; Adaptive Agents and Intelligent Robotics

Licence

In Copyright
