Genetic algorithm based on reinforcement learning for a novel drilling path optimization problem

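Cite this article:	ZHU Guang-yu, ZHANG De-song. Genetic algorithm based on reinforcement learning for a novel drilling path optimization problem[J]. Control and Decision, 2024, 39(2): 697-704.

Authors:	ZHU Guang-yu, ZHANG De-song

Affiliation:	School of Mechanical Engineering and Automation, Fuzhou University, Fuzhou 350108, China

Funding:	Intelligent Manufacturing Comprehensive Standardization and New Mode Application Project of the Ministry of Industry and Information Technology (MIIT Lianzhuang (2016) No. 213).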

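Abstract:	Hole-making is one of the basic operations in mechanical manufacturing. To address toolpath optimization on CNC machine tools, a novel hole-making toolpath optimization model is proposed: the multi-tool drilling path optimization problem with decidable holes (MTdDPO). In this model, the holes on a workpiece fall into two classes: fixed holes and decidable holes. The goal of MTdDPO is to minimize the machining path length by deciding which path each decidable hole belongs to and the machining order of the holes within each path. To optimize MTdDPO, a segmented genetic algorithm based on reinforcement learning (RLSGA) is proposed. In RLSGA, the population is treated as the agent, the agent's state is the population's diversity coefficient, three different segmented crossover operators are the agent's actions, and the agent's reward depends on the changes in the population's fitness value and diversity coefficient. Five new benchmark problems are constructed for MTdDPO, and RLSGA is compared with four other algorithms on them. The results show that RLSGA clearly outperforms the other algorithms and can solve MTdDPO effectively.
Keywords:	hole-making; path optimization; genetic algorithm; reinforcement learning; combinatorial optimization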
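
The objective described in the abstract can be illustrated with a minimal sketch. Everything here beyond "fixed holes, decidable holes, minimize total path length" is an assumption made for illustration: Euclidean distance between holes, open paths with no tool-change or home position, each fixed hole tied to one named tool, and the hypothetical names `path_length`, `total_length`, and `is_feasible`. It is not the model formulated in the paper.

```python
import math

def path_length(points):
    """Sum of Euclidean distances between consecutive points of one tool path."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def total_length(coords, paths):
    """Total length over all tool paths; `paths` maps tool name -> ordered hole ids."""
    return sum(path_length([coords[h] for h in seq]) for seq in paths.values())

def is_feasible(paths, fixed_tool, decidable):
    """Assumed feasibility rule: every hole appears exactly once, and each
    fixed hole lies in the path of the tool it is tied to."""
    seen = [h for seq in paths.values() for h in seq]
    if sorted(seen) != sorted(list(fixed_tool) + list(decidable)):
        return False
    return all(h in paths.get(t, []) for h, t in fixed_tool.items())

# Toy instance (made-up coordinates): two fixed holes, two decidable holes.
coords = {"a": (0, 0), "b": (3, 4), "c": (3, 0), "d": (0, 4)}
fixed_tool = {"a": "T1", "b": "T2"}   # fixed holes and their tools
decidable = {"c", "d"}                # holes whose path is a decision variable

split = {"T1": ["a", "c"], "T2": ["b", "d"]}    # one decidable hole per path
lumped = {"T1": ["a", "c", "d"], "T2": ["b"]}   # both decidable holes on T1

print(total_length(coords, split))   # 6.0
print(total_length(coords, lumped))  # 8.0
```

In this toy instance, changing only which path the decidable hole `d` belongs to moves the objective from 6.0 to 8.0, which is the kind of decision the model couples with sequencing.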

ZHU Guang-yu, ZHANG De-song. Genetic algorithm based on reinforcement learning for a novel drilling path optimization problem[J]. Control and Decision, 2024, 39(2): 697-704.

Authors:	ZHU Guang-yu, ZHANG De-song

Affiliation:	School of Mechanical Engineering and Automation, Fuzhou University, Fuzhou 350108, China

Abstract:	Hole-making is one of the basic processes in mechanical manufacturing. To address toolpath optimization for computer numerical control (CNC) machine tools, a novel hole-making toolpath model, the multi-tool drilling path optimization problem with decidable holes (MTdDPO), is proposed. In the MTdDPO, the holes on a workpiece are divided into two categories: fixed holes and decidable holes. The goal of the MTdDPO is to minimize the length of the machining path by deciding which path each decidable hole belongs to and the machining sequence of the holes in each path. To optimize the MTdDPO, a segmented genetic algorithm based on reinforcement learning (RLSGA) is proposed. In the RLSGA, the population is regarded as the agent, the states of the agent are the intervals of the population's diversity coefficient, three different segmented crossover operators are the actions of the agent, and the reward of the agent is related to the changes in the fitness value and the diversity coefficient of the population. Five benchmark test problems are designed for the MTdDPO, and the RLSGA is compared with four other algorithms on them. The results show that the RLSGA performs significantly better than the other algorithms and can solve the MTdDPO effectively.
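
The agent design in the abstract (state = diversity interval, action = one of three crossover operators, reward from changes in fitness and diversity) can be sketched as a small operator selector. The abstract does not say which reinforcement learning method is used, so tabular Q-learning with an epsilon-greedy policy is swapped in here as a stand-in. The class name `OperatorSelector`, the equal-width binning of a diversity value in [0, 1], and the hyperparameter values are all hypothetical. The reward is passed in by the caller because its formula is not given in the abstract.

```python
import random

class OperatorSelector:
    """Tabular Q-learning over (diversity interval, crossover operator) pairs.
    An illustrative stand-in, not the RLSGA update rule from the paper."""

    def __init__(self, n_states=3, n_actions=3, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.n_states, self.n_actions = n_states, n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = [[0.0] * n_actions for _ in range(n_states)]

    def state_of(self, diversity):
        """Map a diversity value assumed to lie in [0, 1] to an interval index."""
        idx = int(diversity * self.n_states)
        return max(0, min(idx, self.n_states - 1))

    def choose(self, state):
        """Epsilon-greedy choice of a crossover operator index."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])

# One generation of a GA loop would call, in order:
selector = OperatorSelector(epsilon=0.0)   # greedy, for a deterministic demo
s = selector.state_of(0.2)                 # current diversity interval
a = selector.choose(s)                     # operator index to apply this generation
selector.update(s, a, reward=1.0, next_state=selector.state_of(0.9))
```

Keeping the reward outside the class means any function of the fitness and diversity changes can be plugged in without touching the selection logic.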
