Multi-tasks resource joint optimization based on asynchronous reward deep deterministic policy gradient in edge computing
Original title: 基于异步奖励深度确定性策略梯度的边缘计算多任务资源联合优化
Cite as: Zhou Heng. Multi-tasks resource joint optimization based on asynchronous reward deep deterministic policy gradient in edge computing[J]. Application Research of Computers (计算机应用研究), 2023, 40(5).
Author: Zhou Heng (周恒)
Affiliation: Taiyuan University of Science and Technology (太原科技大学)
Funding: Shanxi Province Research Funding Program for Returned Overseas Scholars (2020-126, 2021-134, 2021-135); Shanxi Province Key Research and Development Program (201903D121023); Shanxi Province Basic Research Program, General Project (20210302123206)
Abstract: In mobile edge computing (MEC) systems, terminal devices with insufficient local computing capacity and battery energy can decide whether to offload delay-sensitive tasks to edge nodes for execution. To address the random generation of user tasks and the dynamic variation of system resources during offloading, this paper proposes an asynchronous reward deep deterministic policy gradient (ARDDPG) algorithm. Unlike traditional resource allocation for independent tasks, which executes tasks sequentially and makes each one wait for its predecessor, ARDDPG allocates resources in the time slot in which a task is generated, without waiting for the previous task to finish, and obtains the task computation reward asynchronously. Under delay constraints, the algorithm jointly optimizes the task offloading decision, dynamic bandwidth allocation, and computing resource allocation, and trains its neural network with deep deterministic policy gradient to search for the best performance. Simulation results show that, compared with a random policy, a baseline policy, and the DQN algorithm, ARDDPG effectively reduces the task drop rate, system delay, and energy consumption under different delay constraints and task generation rates.
Keywords: edge computing; task offloading; resource joint optimization; dynamic bandwidth allocation; DDPG
Received: 2022-08-31
Revised: 2022-10-27
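
The asynchronous reward idea in the abstract can be illustrated with a small bookkeeping sketch. This is not the paper's implementation: the abstract gives no algorithmic details, so the class `AsyncRewardBuffer`, its methods, the reward defined as negative delay, and the fixed drop penalty are all assumptions made for illustration. The sketch only shows that an allocation decision is recorded in the slot where the task is generated, while its reward is attached later, when the task finishes or misses its deadline.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PendingTask:
    task_id: int
    state: tuple
    action: tuple
    start_slot: int     # slot in which the task was generated and allocated
    finish_slot: int    # slot in which the task would complete
    deadline_slot: int  # last slot at which completion still counts


class AsyncRewardBuffer:
    """Hypothetical helper: holds decisions until their reward is known."""

    def __init__(self, drop_penalty: float = -10.0):
        self.pending = []                  # decisions still waiting for a reward
        self.replay = deque(maxlen=10_000) # settled (id, state, action, reward)
        self.drop_penalty = drop_penalty   # assumed penalty for a dropped task

    def submit(self, task_id, state, action, slot, duration, max_delay):
        # Allocation happens immediately; no waiting for earlier tasks.
        self.pending.append(
            PendingTask(task_id, state, action, slot,
                        slot + duration, slot + max_delay)
        )

    def settle(self, slot):
        """Attach rewards to every task resolved by `slot` and return them."""
        settled, still_pending = [], []
        for t in self.pending:
            if t.finish_slot > t.deadline_slot and slot >= t.deadline_slot:
                reward = self.drop_penalty  # deadline missed: task dropped
            elif t.finish_slot <= t.deadline_slot and t.finish_slot <= slot:
                reward = -float(t.finish_slot - t.start_slot)  # assumed: -delay
            else:
                still_pending.append(t)
                continue
            record = (t.task_id, t.state, t.action, reward)
            self.replay.append(record)
            settled.append(record)
        self.pending = still_pending
        return settled


buf = AsyncRewardBuffer()
buf.submit(0, state=(0.0,), action=(1.0,), slot=0, duration=3, max_delay=5)
buf.submit(1, state=(0.1,), action=(0.5,), slot=1, duration=1, max_delay=5)
buf.submit(2, state=(0.2,), action=(0.1,), slot=1, duration=9, max_delay=4)
for slot in (2, 3, 5):
    print(slot, buf.settle(slot))
```

In this usage, task 1 is allocated after task 0 but is rewarded first, and task 2 is penalized once its deadline passes. A DDPG learner would then sample settled records from `replay`; the next-state field a full transition needs is omitted here for brevity.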