Affiliation: | (a) Department of Information Management, Kun Shan University of Technology, Tainan Hsien 710, Taiwan; (b) Department of Industrial Engineering, Mississippi State University, P.O. Box 9542, 260 McCain Bldg, Mississippi State, MS 39762, USA |
Abstract: | Reinforcement learning (RL) has received attention in recent years from agent-based researchers because it addresses how an autonomous agent can learn to select proper actions for achieving its goals by interacting with its environment. Each time an agent performs an action, the environment's response, as indicated by its new state, is used by the agent to reward or penalize that action. The agent's goal is to maximize the total reward it receives over the long run. Although several successful examples have demonstrated the usefulness of RL, its application to manufacturing systems has not been fully explored. In this study, a single machine agent employs the Q-learning algorithm to develop a decision-making policy for selecting the appropriate dispatching rule from among three given dispatching rules. The system objective is to minimize mean tardiness. This paper presents a factorial experiment design for studying the settings used to apply Q-learning to the single machine dispatching rule selection problem. The factors considered include two related to the design of the agent's policy table and three related to the development of its reward function. This study not only investigates the main effects of these factors but also provides recommendations for factor settings and useful guidelines for future applications of Q-learning to agent-based production scheduling. |
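
The setup described in the abstract can be illustrated with a minimal tabular Q-learning sketch. It is not the authors' implementation. The abstract does not name the three dispatching rules, the state definition, or the reward function, so the rule names (`EDD`, `SPT`, `FIFO`), the queue-length state, the negative-tardiness reward, and the parameters `alpha`, `gamma`, and `epsilon` below are all assumptions chosen for illustration. All jobs are assumed to be available at time zero.

```python
import random
from collections import defaultdict

# Hypothetical rule names; the abstract does not say which three rules are used.
RULES = ("EDD", "SPT", "FIFO")

def pick_job(rule, queue):
    """Return the index of the next job under the given dispatching rule.
    Each job is a dict with 'seq' (queue entry order), 'proc' and 'due'."""
    key = {"EDD": "due", "SPT": "proc", "FIFO": "seq"}[rule]
    return min(range(len(queue)), key=lambda i: queue[i][key])

def choose_rule(q, state, epsilon, rng):
    """Epsilon-greedy selection among the three rules."""
    if rng.random() < epsilon:
        return rng.choice(RULES)
    return max(RULES, key=lambda r: q[(state, r)])

def q_update(q, state, rule, reward, next_state, alpha=0.1, gamma=0.9):
    """One-step Q-learning update: Q <- Q + alpha * (r + gamma * max Q' - Q)."""
    best_next = max(q[(next_state, r)] for r in RULES)
    q[(state, rule)] += alpha * (reward + gamma * best_next - q[(state, rule)])

def run_episode(q, jobs, epsilon=0.1, seed=0):
    """Dispatch every job on a single machine and return the mean tardiness."""
    rng = random.Random(seed)
    queue = [dict(j) for j in jobs]
    clock, total_tardiness = 0.0, 0.0
    state = min(len(queue), 5)  # assumed state: queue length, capped at 5
    while queue:
        rule = choose_rule(q, state, epsilon, rng)
        job = queue.pop(pick_job(rule, queue))
        clock += job["proc"]
        tardiness = max(0.0, clock - job["due"])
        total_tardiness += tardiness
        next_state = min(len(queue), 5)
        # Assumed reward: penalize the tardiness of the job just completed.
        q_update(q, state, rule, -tardiness, next_state)
        state = next_state
    return total_tardiness / len(jobs)
```

Calling `run_episode` repeatedly with a shared table `q = defaultdict(float)` lets the agent accumulate value estimates for each (state, rule) pair. The choices made here for the policy table and the reward function correspond to the kinds of factors the study varies experimentally, but the specific values are placeholders.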