Multi-objective routing optimization of electric power material distribution based on deep reinforcement learning


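Cite this article:	Yu XU, Yunyou ZHU, Xiao LIU, Yuting DENG, Yong LIAO. Multi-objective routing optimization of electric power material distribution based on deep reinforcement learning[J]. Journal of Computer Applications (计算机应用), 2022, 42(10): 3252-3258. DOI: 10.11772/j.issn.1001-9081.2021091582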

Author names:	Yu XU, Yunyou ZHU, Xiao LIU, Yuting DENG, Yong LIAO

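Author affiliations:	Yongchuan Power Supply Branch, State Grid Chongqing Electric Power Company, Chongqing 402160; Information and Telecommunication Branch, State Grid Chongqing Electric Power Company, Chongqing 401120; Chongqing Jinyuyun Energy Technology Company Limited, Chongqing 400050; School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044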

Funding:	Science and Technology Project of State Grid Chongqing Electric Power Company (2021渝电科技8#)

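Abstract:	Existing optimization of the Electric power material Vehicle Routing Problem (EVRP) considers relatively few objectives and an incomplete set of constraints, and traditional solution algorithms are inefficient. To address these problems, a multi-objective routing optimization model and solution algorithm for electric power material distribution based on Deep Reinforcement Learning (DRL) is proposed. First, constraints such as the distribution of gas stations in the delivery area and the fuel consumption of the transport vehicles are taken into account, and a multi-objective distribution model is established with three objectives: the shortest total route length, the lowest cost, and the highest satisfaction of material demand points. Second, a DRL-based routing optimization algorithm, DRL-EVRP, is designed to solve the proposed model. DRL-EVRP uses a Deep Q-Network (DQN) that combines an improved Pointer Network (Ptr-Net) with the Q-learning algorithm, and takes the sum of the negative cumulative incremental route length and the satisfaction as the reward function. Once trained, the algorithm can be applied directly to distribution route planning. Simulation results show that the total route length obtained by DRL-EVRP is shorter than that of the Extended C-W (ECW) savings algorithm and the Simulated Annealing (SA) algorithm, with computation time within an acceptable range, so the proposed algorithm can optimize electric power material distribution routes more efficiently and quickly.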
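Keywords:	electric power material; multi-objective routing optimization; vehicle routing problem; deep reinforcement learning; pointer network
Received:	2021-09-07
Revised:	2021-11-11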
Multi-objective routing optimization of electric power material distribution based on deep reinforcement learning

Yu XU,Yunyou ZHU,Xiao LIU,Yuting DENG,Yong LIAO. Multi-objective routing optimization of electric power material distribution based on deep reinforcement learning[J]. Journal of Computer Applications, 2022, 42(10): 3252-3258. DOI: 10.11772/j.issn.1001-9081.2021091582

Authors:	Yu XU, Yunyou ZHU, Xiao LIU, Yuting DENG, Yong LIAO

Affiliation:	Yongchuan Power Supply Branch, State Grid Chongqing Electric Power Company, Chongqing 402160, China; Information and Telecommunication Branch, State Grid Chongqing Electric Power Company, Chongqing 401120, China; Chongqing Jinyuyun Energy Technology Company Limited, Chongqing 400050, China; School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China

Abstract:	Existing optimization of the Electric power material Vehicle Routing Problem (EVRP) relies on a relatively narrow objective function and an incomplete set of constraints, and the traditional solution algorithms are inefficient. Therefore, a multi-objective routing optimization model and solution algorithm for electric power material distribution based on Deep Reinforcement Learning (DRL) are proposed. Firstly, constraints of the distribution area, such as the distribution of gas stations and the fuel consumption of material transportation vehicles, are fully considered to establish a multi-objective power material distribution model whose objectives are the minimum total route length, the lowest cost, and the highest satisfaction of material demand points. Secondly, a power material distribution routing optimization algorithm, DRL-EVRP, is designed on the basis of DRL to solve the proposed model. In the algorithm, the improved Pointer Network (Ptr-Net) and the Q-learning algorithm are combined to form a Deep Q-Network (DQN), with the sum of the negative cumulative incremental route length and the satisfaction taken as the reward function. After training, DRL-EVRP can be used directly to plan electric power material distribution routes. Simulation results show that the total route length obtained by DRL-EVRP is shorter than those obtained by the Extended Clarke and Wright (ECW) saving algorithm and the Simulated Annealing (SA) algorithm, and that the computation time of the proposed algorithm is within an acceptable range. Therefore, the proposed algorithm can optimize power material distribution routes more efficiently and quickly.

Keywords:	electric power material; multi-objective routing optimization; Vehicle Routing Problem (VRP); Deep Reinforcement Learning (DRL); Pointer Network (Ptr-Net)
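
The abstract describes the reward as the sum of the negative cumulative incremental route length and the satisfaction. The sketch below is one possible reading of that sentence, not the authors' implementation. It assumes Euclidean distances between 2-D points, a numeric satisfaction score per visited node, and equal weighting of the two terms; none of these details are given in the abstract, and the names `step_reward` and `cumulative_reward` are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def step_reward(prev: Point, nxt: Point, satisfaction: float) -> float:
    """Reward for moving from prev to nxt.

    Assumption: the incremental route length is the Euclidean distance
    of the new leg, and the satisfaction earned at nxt is added with
    weight 1. The abstract does not specify either choice.
    """
    return -math.dist(prev, nxt) + satisfaction


def cumulative_reward(route: Sequence[Point],
                      satisfaction: Sequence[float]) -> float:
    """Sum the step rewards along a route.

    satisfaction[i] is the (assumed) score earned on arriving at
    route[i]; the entry for the starting point is ignored.
    """
    return sum(
        step_reward(route[i - 1], route[i], satisfaction[i])
        for i in range(1, len(route))
    )


if __name__ == "__main__":
    # Depot -> one demand point -> depot, with hypothetical coordinates.
    route = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.0)]
    scores = [0.0, 0.8, 0.0]
    print(cumulative_reward(route, scores))
```

With a reward of this shape, shorter legs and better-served demand points both raise the return, so a single scalar signal can trade off route length against satisfaction during Q-learning.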
