Posted on 2002-01-01, 00:00. Authored by Poj Tangamchit, John M. Dolan, Pradeep K. Khosla.
Learning can be an effective way for robot
systems to deal with dynamic environments and
changing task conditions. However, popular single-robot
learning algorithms based on discounted
rewards, such as Q-learning, do not achieve
cooperation (i.e., purposeful division of labor) when
applied to task-level multirobot systems. A task-level
system is defined as one performing a mission
that is decomposed into subtasks shared among
robots. In this paper, we demonstrate the superiority
of average-reward-based learning such as the Monte
Carlo algorithm for task-level multirobot systems,
and suggest an explanation for this superiority.
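
The contrast between the two reward criteria can be illustrated with a small sketch. It is not taken from the paper and is not necessarily the explanation the authors give. It only shows that a discounted sum and a plain average can rank the same two reward streams differently. The reward streams, the discount factor of 0.8, and the function names are all assumptions chosen for illustration.

```python
def discounted_return(rewards, gamma=0.8):
    """Sum of gamma**t * r_t: the criterion behind discounted-reward methods."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def average_reward(rewards):
    """Mean reward per step: the criterion behind average-reward methods."""
    return sum(rewards) / len(rewards)


# Hypothetical 10-step reward streams (illustrative only, not data from the paper).
# `immediate` pays off at once but rarely; `delayed` pays off later but more often.
immediate = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
delayed = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1]

for name, stream in (("immediate", immediate), ("delayed", delayed)):
    print(name, discounted_return(stream), average_reward(stream))
```

With these assumed values, the discounted criterion prefers the `immediate` stream, while the average criterion prefers the `delayed` stream, which collects more reward overall.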