10.1184/R1/6626153.v1
Poj Tangamchit
Poj
Tangamchit
John Dolan
John
Dolan
Pradeep Khosla
Pradeep
Khosla
The Necessity of Average Rewards in Cooperative Multirobot Learning
Carnegie Mellon University
2002
Software Research
2002-01-01 00:00:00
Journal contribution
https://kilthub.cmu.edu/articles/journal_contribution/The_Necessity_of_Average_Rewards_in_Cooperative_Multirobot_Learning/6626153
<p>Learning can be an effective way for robot systems to deal with dynamic environments and changing task conditions. However, popular single-robot learning algorithms based on discounted rewards, such as Q-learning, do not achieve cooperation (i.e., purposeful division of labor) when applied to task-level multirobot systems. A task-level system is one that performs a mission decomposed into subtasks shared among robots. In this paper, we demonstrate the superiority of average-reward-based learning, such as the Monte Carlo algorithm, for task-level multirobot systems, and suggest an explanation for this superiority.</p>