
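Multi-objective routing optimization of electric power material distribution based on deep reinforcement learning (基于深度强化学习的电力物资配送多目标路径优化)
Cite this article: 徐郁, 朱韵攸, 刘筱, 邓雨婷, 廖勇. 基于深度强化学习的电力物资配送多目标路径优化[J]. 计算机应用 (Journal of Computer Applications), 2022, 42(10): 3252-3258. DOI: 10.11772/j.issn.1001-9081.2021091582
Authors: 徐郁 (Yu XU)  朱韵攸 (Yunyou ZHU)  刘筱 (Xiao LIU)  邓雨婷 (Yuting DENG)  廖勇 (Yong LIAO)
Affiliations: Yongchuan Power Supply Branch, State Grid Chongqing Electric Power Company, Chongqing 402160
Information and Telecommunication Branch, State Grid Chongqing Electric Power Company, Chongqing 401120
Chongqing Jinyuyun Energy Technology Company Limited, Chongqing 400050
School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044
Funding: Science and Technology Project of State Grid Chongqing Electric Power Company (2021渝电科技8#)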
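Abstract: Existing optimization of the Electric power material Vehicle Routing Problem (EVRP) considers too narrow an objective function and an incomplete set of constraints, and traditional solution algorithms are inefficient. To address these issues, a multi-objective routing optimization model and solution algorithm for electric power material distribution based on Deep Reinforcement Learning (DRL) are proposed. First, constraints such as the distribution of gas stations in the delivery area and the fuel consumption of the transport vehicles are taken into account to build a multi-objective distribution model that minimizes total route length, minimizes cost, and maximizes the satisfaction of material demand points. Second, a DRL-based routing optimization algorithm, DRL-EVRP, is designed to solve the proposed model. DRL-EVRP uses a Deep Q-Network (DQN) that combines an improved Pointer Network (Ptr-Net) with the Q-learning algorithm, and takes the sum of the negative cumulative incremental route length and the satisfaction as the reward function. Once trained, the algorithm can be applied directly to route planning for electric power material distribution. Simulation results show that the total route length obtained by DRL-EVRP is shorter than that obtained by the Extended Clarke and Wright (ECW) savings algorithm and the Simulated Annealing (SA) algorithm, with computation time within an acceptable range, so the proposed algorithm optimizes electric power material distribution routes more efficiently and quickly.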

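Keywords: electric power material  multi-objective routing optimization  vehicle routing problem  deep reinforcement learning  pointer network
Received: 2021-09-07
Revised: 2021-11-11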

Multi-objective routing optimization of electric power material distribution based on deep reinforcement learning
Yu XU, Yunyou ZHU, Xiao LIU, Yuting DENG, Yong LIAO. Multi-objective routing optimization of electric power material distribution based on deep reinforcement learning[J]. Journal of Computer Applications, 2022, 42(10): 3252-3258. DOI: 10.11772/j.issn.1001-9081.2021091582
Authors: Yu XU  Yunyou ZHU  Xiao LIU  Yuting DENG  Yong LIAO
Affiliation: Yongchuan Power Supply Branch, State Grid Chongqing Electric Power Company, Chongqing 402160, China
Information and Telecommunication Branch, State Grid Chongqing Electric Power Company, Chongqing 401120, China
Chongqing Jinyuyun Energy Technology Company Limited, Chongqing 400050, China
School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China
Abstract: Existing optimization approaches for the Electric power material Vehicle Routing Problem (EVRP) consider only a narrow objective function, use constraints that are not comprehensive enough, and rely on traditional solution algorithms that are inefficient. Therefore, a multi-objective routing optimization model and a solution algorithm for electric power material distribution based on Deep Reinforcement Learning (DRL) were proposed. Firstly, constraints of the distribution area, such as the locations of gas stations and the fuel consumption of the transport vehicles, were fully considered to establish a multi-objective model whose objectives are the minimum total route length, the lowest cost, and the highest satisfaction of material demand points. Secondly, a DRL-based routing optimization algorithm, DRL-EVRP, was designed to solve the proposed model. In the algorithm, an improved Pointer Network (Ptr-Net) and the Q-learning algorithm were combined to form a Deep Q-Network (DQN), and the sum of the negative cumulative incremental route length and the satisfaction was used as the reward function. After training, DRL-EVRP can be used directly to plan electric power material distribution routes. Simulation results show that the total route length obtained by DRL-EVRP is shorter than those obtained by the Extended Clarke and Wright (ECW) savings algorithm and the Simulated Annealing (SA) algorithm, and that its computation time is within an acceptable range. Therefore, the proposed algorithm can optimize electric power material distribution routes more efficiently and quickly.
Keywords: electric power material  multi-objective routing optimization  Vehicle Routing Problem (VRP)  Deep Reinforcement Learning (DRL)  Pointer Network (Ptr-Net)
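
The reward described in the abstract, the sum of the negative cumulative incremental route length and the satisfaction, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes Euclidean distances, a per-point satisfaction score in [0, 1], equal weighting of the two terms, and a plain tabular Q-learning update standing in for the Ptr-Net-based DQN. All function and parameter names are hypothetical.

```python
import math


def step_reward(prev, nxt, satisfaction):
    """Reward for one move: negative incremental length plus satisfaction.

    Assumption: distance is Euclidean and the two terms are weighted equally.
    """
    return -math.dist(prev, nxt) + satisfaction


def route_reward(route, satisfaction_at):
    """Cumulative reward over a route given as a list of (x, y) points.

    satisfaction_at maps a point to its (hypothetical) satisfaction score;
    points without an entry, such as the depot, contribute 0.
    """
    total = 0.0
    for prev, nxt in zip(route, route[1:]):
        total += step_reward(prev, nxt, satisfaction_at.get(nxt, 0.0))
    return total


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning update (a simplified stand-in for a DQN).

    q is a dict keyed by (state, action); missing entries count as 0.
    """
    best_next = max((q.get((next_state, a), 0.0) for a in next_actions),
                    default=0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

For example, a route from a depot at (0, 0) to a demand point at (3, 4) and back accumulates a length penalty of 10, offset by whatever satisfaction is credited at the demand point. In a DQN, the table `q` would be replaced by a network that scores the candidate next nodes.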