首页 | 本学科首页   官方微博 | 高级检索  
     


Using temporal-difference learning for multi-agent bargaining
Affiliation:1. Department of Information Management, Ming Chuan University, 333 Taoyuan, Taiwan, ROC;2. Graduate Institute of Technology Management, National Tsing Hua University, 300 Hsinchu, Taiwan, ROC
Abstract:This research treats a bargaining process as a Markov decision process, in which a bargaining agent’s goal is to learn the optimal policy that maximizes the total rewards it receives over the process. Reinforcement learning is an effective method for agents to learn how to determine actions for any time steps in a Markov decision process. Temporal-difference (TD) learning is a fundamental method for solving the reinforcement learning problem, and it can tackle the temporal credit assignment problem. This research designs agents that apply TD-based reinforcement learning to deal with online bilateral bargaining with incomplete information. This research further evaluates the agents’ bargaining performance in terms of the average payoff and settlement rate. The results show that agents using TD-based reinforcement learning are able to achieve good bargaining performance. This learning approach is sufficiently robust and convenient, hence it is suitable for online automated bargaining in electronic commerce.
Keywords:
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号