Rationality of reward sharing in multi-agent reinforcement learning
Authors: Kazuteru Miyazaki (1), Shigenobu Kobayashi (2)
Affiliation: (1) National Institution for Academic Degrees, 3-29-1 Ootsuka Bunkyo-ku, 112-0012 Tokyo, Japan; (2) Tokyo Institute of Technology, 4259 Nagatsuta Midori, 226-8502 Yokohama, Japan
Abstract:
In multi-agent reinforcement learning systems, it is important to share a reward among all agents. We focus on the Rationality Theorem of Profit Sharing 5) and analyze how to share a reward among all profit sharing agents. When an agent gets a direct reward R (R>0), an indirect reward μR (μ≥0) is given to the other agents. We have derived the following necessary and sufficient condition for preserving rationality:

$$\mu < \frac{M - 1}{M^{W} \left(1 - \left(\tfrac{1}{M}\right)^{W_o}\right)(n - 1)L}$$
where M and L are the maximum numbers of conflicting rules of any kind and of conflicting rational rules, respectively, in the same sensory input; W and W_o are the maximum episode lengths of a direct-reward agent and an indirect-reward agent; and n is the number of agents. This theorem is derived by avoiding the least desirable situation, in which the expected reward per action is zero. Therefore, by using this theorem, we can benefit from several efficient aspects of reward sharing. Through numerical examples, we confirm the effectiveness of this theorem.

Kazuteru Miyazaki, Dr. Eng.: He is an associate professor in the Faculty of Assessment and Research for Degrees at the National Institution for Academic Degrees. He obtained his BEng. from Meiji University in 1991, and his Dr. Eng. from Tokyo Institute of Technology in 1996. His research interests are in Machine Learning and Robotics. He has published over 30 research papers and received several awards. He is a member of the Japan Society of Mechanical Engineers (JSME), the Japanese Society for Artificial Intelligence (JSAI), and the Society of Instrument and Control Engineers of Japan (SICE).

Shigenobu Kobayashi, Dr. Eng.: He received his Dr. Eng. from Tokyo Institute of Technology in 1974. He is a professor at the Dept. of Computational Intelligence and Systems Science, Tokyo Institute of Technology. His research interests include artificial intelligence, emergent systems, evolutionary computation and reinforcement learning.
Keywords: Reinforcement Learning; Multi-agent System; Profit Sharing; Rationality Theorem; Direct and Indirect Rewards
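
The bound on μ stated in the abstract can be evaluated directly once M, L, W, W_o and n are fixed. The following is a minimal sketch of that arithmetic only, not the authors' implementation: the function names `indirect_reward_bound` and `preserves_rationality` are hypothetical, the parameter values in the usage lines are illustrative and not taken from the paper, and the input checks (M ≥ 2, n ≥ 2, positive integer L, W, W_o) are assumptions added so that the denominator is non-zero.

```python
from fractions import Fraction


def indirect_reward_bound(M, L, W, W_o, n):
    """Right-hand side of the abstract's inequality:
    (M - 1) / (M**W * (1 - (1/M)**W_o) * (n - 1) * L).

    The range checks below are assumptions of this sketch, chosen so
    that the denominator cannot be zero.
    """
    if M < 2 or n < 2 or L < 1 or W < 1 or W_o < 1:
        raise ValueError("expected M >= 2, n >= 2, and L, W, W_o >= 1")
    M = Fraction(M)  # exact rational arithmetic, no rounding error
    denominator = M**W * (1 - (1 / M) ** W_o) * (n - 1) * L
    return (M - 1) / denominator


def preserves_rationality(mu, M, L, W, W_o, n):
    """True if 0 <= mu and mu is strictly below the bound."""
    return 0 <= mu < indirect_reward_bound(M, L, W, W_o, n)


# Illustrative values only (not from the paper).
bound = indirect_reward_bound(M=2, L=1, W=3, W_o=3, n=2)
print(bound)  # 1/7
print(preserves_rationality(Fraction(1, 10), M=2, L=1, W=3, W_o=3, n=2))  # True
print(preserves_rationality(Fraction(1, 7), M=2, L=1, W=3, W_o=3, n=2))   # False
```

Because the inequality is strict, a value of μ equal to the bound is rejected. Exact fractions are used so that this boundary case is not blurred by floating-point rounding.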
This article is indexed in SpringerLink and other databases.