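Intelligent vehicle path planning method based on MCPDDPG and its application
Cite this article: YU Ling-li, WEI Ya-dong, HUO Shu-xin. Intelligent vehicle path planning method based on MCPDDPG and its application[J]. Control and Decision, 2021, 36(4): 835-846.
Authors: YU Ling-li  WEI Ya-dong  HUO Shu-xin
Affiliation: College of Automation, Central South University, Changsha 410083
Funding: National Key Research and Development Program of China (2018YFB1201602); National Natural Science Foundation of China (61976224); Hunan Provincial Science and Technology Major Project (2017GK1010).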
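Abstract: Path planning for intelligent vehicles often suffers from insufficient perception and prediction of the dynamic environment. To address this, an intelligent vehicle path planning method based on Monte Carlo prediction deep deterministic policy gradient (MCPDDPG) is used, and a framework built on environment perception prediction, behavior decision-making, and control sequence generation is designed to perform decision-making and planning in real time and to output continuous vehicle control sequences. First, sequential Monte Carlo is used to estimate the behavioral state variables of other vehicles. Then, a behavior decision method based on reinforcement Q-learning is designed so that the intelligent vehicle can anticipate collision risk in real time and adopt a reasonable avoidance strategy. Finally, a deep policy gradient learning network framework is built to obtain the optimal trajectory sequence for the planned path. Experimental results show that the proposed method alleviates the problem of insufficient perception prediction, speeds up behavior decision-making, ensures the active safety of path planning, and outputs continuous trajectory sequences, providing a prerequisite for intelligent vehicle navigation and control.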

Keywords: path planning  Monte Carlo prediction  intelligent vehicle  deep policy gradient  reinforcement learning  decision-making

The method and application of intelligent vehicle path planning based on MCPDDPG
YU Ling-li, WEI Ya-dong, HUO Shu-xin. The method and application of intelligent vehicle path planning based on MCPDDPG[J]. Control and Decision, 2021, 36(4): 835-846.
Authors: YU Ling-li  WEI Ya-dong  HUO Shu-xin
Affiliation: College of Automation, Central South University, Changsha 410083, China
Abstract: To address insufficient perception and prediction of the dynamic environment during intelligent vehicle path planning, we use a path planning method based on Monte Carlo prediction deep deterministic policy gradient (MCPDDPG) and design a framework comprising environment perception prediction, behavior decision-making, and control sequence generation. The framework performs decision-making and planning in real time and outputs continuous vehicle control sequences. First, we use sequential Monte Carlo to estimate the behavioral states of other vehicles. Then, we design a behavior decision method based on reinforcement Q-learning so that the intelligent vehicle can anticipate collision risk in real time and adopt a reasonable avoidance strategy. Finally, we build a deep deterministic policy gradient learning network to obtain the optimal trajectory sequence for the planned path. Experimental results show that the proposed method alleviates the problem of insufficient perception prediction, speeds up behavior decision-making, ensures the active safety of path planning, and outputs a continuous trajectory sequence, which provides a prerequisite for intelligent vehicle navigation and control.
Keywords: path planning  Monte Carlo prediction  intelligent land vehicle  deep deterministic policy gradient  reinforcement learning  decision-making
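
The abstract names sequential Monte Carlo as the tool for estimating the state of other vehicles but gives no model details. As a rough illustration of that general technique only, the sketch below runs a bootstrap particle filter on one vehicle's longitudinal position and speed. The constant-velocity motion model, the Gaussian measurement model, all noise parameters, and every function name are assumptions made for this example; none of them come from the paper.

```python
import math
import random


def predict(particles, dt, accel_std, rng):
    """Propagate (position, speed) particles with a constant-velocity model
    perturbed by a random acceleration (assumed motion model)."""
    moved = []
    for s, v in particles:
        a = rng.gauss(0.0, accel_std)
        moved.append((s + v * dt + 0.5 * a * dt * dt, v + a * dt))
    return moved


def likelihood_weights(particles, z, meas_std):
    """Normalized Gaussian likelihood of a position measurement z."""
    w = [math.exp(-0.5 * ((z - s) / meas_std) ** 2) for s, _ in particles]
    total = sum(w)
    if total == 0.0:
        # All particles are far from the measurement: fall back to uniform.
        return [1.0 / len(particles)] * len(particles)
    return [x / total for x in w]


def resample(particles, weights, rng):
    """Multinomial resampling proportional to the weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


def estimate(particles):
    """Mean position and speed over the particle set."""
    n = len(particles)
    return (sum(s for s, _ in particles) / n,
            sum(v for _, v in particles) / n)


def track(measurements, dt=0.1, n=500, accel_std=1.0, meas_std=0.5, seed=0):
    """Run the filter over a list of position measurements (meters)."""
    rng = random.Random(seed)
    # Initial speed is unknown: spread it uniformly over 0..30 m/s.
    particles = [(measurements[0] + rng.gauss(0.0, meas_std),
                  rng.uniform(0.0, 30.0)) for _ in range(n)]
    for z in measurements[1:]:
        particles = predict(particles, dt, accel_std, rng)
        weights = likelihood_weights(particles, z, meas_std)
        particles = resample(particles, weights, rng)
    return estimate(particles)


if __name__ == "__main__":
    # Hypothetical other vehicle moving at a steady 10 m/s, sampled at 10 Hz.
    observed = [10.0 * 0.1 * i for i in range(50)]
    position, speed = track(observed)
    print(f"estimated position {position:.2f} m, speed {speed:.2f} m/s")
```

In a pipeline like the one the abstract describes, an estimate of this kind would feed the behavior decision stage; how the paper couples the two stages is not stated in the abstract.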
This article is indexed in VIP, Wanfang Data, and other databases.