Deployment Algorithm of Service Function Chain Based on Transfer Actor-Critic Learning

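Cite this article:	Lun TANG, Xiaoyu HE, Xiao WANG, Qianbin CHEN. Deployment Algorithm of Service Function Chain Based on Transfer Actor-Critic Learning[J]. Journal of Electronics & Information Technology, 2020, 42(11): 2671-2679.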

Authors:	Lun TANG, Xiaoyu HE, Xiao WANG, Qianbin CHEN

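Affiliations:	1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; 2. Key Laboratory of Mobile Communication, Chongqing University of Posts and Telecommunications, Chongqing 400065, China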

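Funding:	National Natural Science Foundation of China (61571073); Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M20180601)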

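Abstract:	In 5G network slicing, the randomness and unpredictability of service requests lead to poor resource allocation and, in turn, to high system delay. To address this problem, this paper proposes a Service Function Chain (SFC) deployment algorithm based on transfer Actor-Critic (A-C) learning (TACA). First, the algorithm builds an end-to-end delay minimization model based on the joint allocation of Virtual Network Function (VNF) placement, computing resources, link bandwidth resources, and fronthaul network resources, and converts it into a discrete-time Markov Decision Process (MDP). Then, within this MDP, an A-C learning algorithm interacts continuously with the environment to adjust the SFC deployment policy dynamically and optimize the end-to-end delay. Furthermore, to enable and accelerate convergence of the A-C algorithm in other, similar target tasks (for example, where service request arrival rates are generally higher), a transfer A-C learning algorithm uses the SFC deployment knowledge learned in the source task to quickly find a deployment policy for the target task. Simulation results show that the proposed algorithm reduces and stabilizes the queue backlog of SFC service packets, optimizes the system end-to-end delay, and improves resource utilization.
Keywords:	network slicing; service function chain (SFC) deployment; Markov decision process (MDP); actor-critic learning; transfer learning
Received:	2019-07-18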
Deployment Algorithm of Service Function Chain Based on Transfer Actor-Critic Learning

Lun TANG, Xiaoyu HE, Xiao WANG, Qianbin CHEN. Deployment Algorithm of Service Function Chain Based on Transfer Actor-Critic Learning[J]. Journal of Electronics & Information Technology, 2020, 42(11): 2671-2679.

Authors:	Lun TANG, Xiaoyu HE, Xiao WANG, Qianbin CHEN

Affiliation:	1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; 2. Key Laboratory of Mobile Communication, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Abstract:	To address the high system delay caused by unreasonable resource allocation, which results from the randomness and unpredictability of service requests in 5G network slicing, this paper proposes a Service Function Chain (SFC) deployment scheme based on a Transfer Actor-Critic (A-C) Algorithm (TACA). First, an end-to-end delay minimization model is built on Virtual Network Function (VNF) placement and the joint allocation of computing resources, link resources, and fronthaul bandwidth resources; the model is then transformed into a discrete-time Markov Decision Process (MDP). Next, an A-C learning algorithm is adopted in the MDP to dynamically adjust the SFC deployment scheme by interacting with the environment, so as to optimize the end-to-end delay. Furthermore, to realize and accelerate the convergence of the A-C algorithm in similar target tasks (for example, when the arrival rate of service requests is generally higher), the transfer A-C algorithm uses the SFC deployment knowledge learned from source tasks to quickly find the deployment strategy in target tasks. Simulation results show that the proposed algorithm can reduce and stabilize the queue length of SFC packets, optimize the system end-to-end delay, and improve resource utilization.
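
The learning loop outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's model or its TACA algorithm. It assumes a hypothetical single-queue environment (`ToyQueueEnv`) in place of the joint VNF placement and resource allocation MDP, a tabular one-step actor-critic with a softmax policy, and a plain warm start (`transfer_policy`, copying the source policy preferences) as one simple form of knowledge transfer. All class names, function names, and parameter values are illustrative assumptions.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


class ToyQueueEnv:
    """Hypothetical stand-in for an SFC: state = backlog, action = packets served per slot."""

    def __init__(self, arrival_prob, max_queue=10, rng=None):
        self.arrival_prob = arrival_prob
        self.max_queue = max_queue
        self.rng = rng or random.Random(0)
        self.queue = 0

    def reset(self):
        self.queue = 0
        return self.queue

    def step(self, action):
        served = min(self.queue, action)
        # Up to two packets arrive per slot, each with probability arrival_prob.
        arrivals = sum(self.rng.random() < self.arrival_prob for _ in range(2))
        self.queue = min(self.max_queue, self.queue - served + arrivals)
        # Reward = negative backlog (delay proxy) minus an assumed resource cost.
        reward = -(self.queue + 0.5 * action)
        return self.queue, reward


class ActorCritic:
    """Tabular one-step actor-critic with a softmax (Gibbs) policy."""

    def __init__(self, n_states, n_actions, alpha_critic=0.1,
                 alpha_actor=0.05, gamma=0.9, rng=None):
        self.V = [0.0] * n_states                                  # critic
        self.prefs = [[0.0] * n_actions for _ in range(n_states)]  # actor
        self.alpha_critic = alpha_critic
        self.alpha_actor = alpha_actor
        self.gamma = gamma
        self.rng = rng or random.Random(0)

    def policy(self, s):
        return softmax(self.prefs[s])

    def act(self, s):
        probs = self.policy(s)
        return self.rng.choices(range(len(probs)), weights=probs)[0]

    def update(self, s, a, r, s_next):
        # Temporal-difference error computed by the critic.
        delta = r + self.gamma * self.V[s_next] - self.V[s]
        self.V[s] += self.alpha_critic * delta
        # Policy-gradient step on the softmax preferences.
        probs = self.policy(s)
        for b in range(len(probs)):
            grad = (1.0 if b == a else 0.0) - probs[b]
            self.prefs[s][b] += self.alpha_actor * delta * grad
        return delta


def train(agent, env, steps):
    """Interact with the environment and return the average reward per step."""
    s = env.reset()
    total = 0.0
    for _ in range(steps):
        a = agent.act(s)
        s_next, r = env.step(a)
        agent.update(s, a, r, s_next)
        total += r
        s = s_next
    return total / steps


def transfer_policy(source, target):
    """Warm-start the target actor with a copy of the source preferences."""
    target.prefs = [row[:] for row in source.prefs]


if __name__ == "__main__":
    # Source task: lower arrival rate.
    source = ActorCritic(11, 3, rng=random.Random(1))
    train(source, ToyQueueEnv(0.3, rng=random.Random(2)), 5000)
    # Target task: higher arrival rate, started from the source policy.
    target = ActorCritic(11, 3, rng=random.Random(3))
    transfer_policy(source, target)
    print(train(target, ToyQueueEnv(0.45, rng=random.Random(4)), 2000))
```

Comparing the printed average reward against a target agent trained from zero preferences gives a rough sense of what a warm start contributes in this toy setting; it says nothing about the results reported in the paper.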

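Keywords:	network slicing; service function chain (SFC) deployment; Markov decision process (MDP); actor-critic learning; transfer learning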
