
Adaptive Adversarial Learning for Solving the Traveling Salesman Problem

Cite this article:	XIONG Wenrui, TAO Jiping. Adaptive Adversarial Learning for Solving TSP[J]. Computer Engineering and Applications, 2022, 58(17): 224-229. DOI: 10.3778/j.issn.1002-8331.2102-0169

Authors:	XIONG Wenrui, TAO Jiping

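Affiliations:	1. School of Aerospace Engineering, Xiamen University, Xiamen, Fujian 361005, China; 2. Key Laboratory of Big Data Intelligent Analysis and Decision-making, Xiamen University, Xiamen, Fujian 361005, China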

Funding:	National Natural Science Foundation of China (61772438); Natural Science Foundation of Fujian Province (2020J01053)

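Abstract:	Deep learning offers a new approach to combinatorial optimization problems. Current research in this direction focuses mostly on improving models and training methods, and many papers introduce new models from natural language processing to improve solution quality, while little attention is paid to model generalization and robustness from the perspective of instance data generation. To address this gap, this paper draws on the idea of adversarial learning and studies the classic combinatorial optimization problem, the traveling salesman problem (TSP), from the data-generation side: a generator network is designed, adversarial samples are produced by supervised learning, and these adversarial samples are mixed with random samples for training, so as to improve the model's generalization on this class of problems. In addition, an adaptive mechanism, based on how the discriminator model is updated during reinforcement learning training, is proposed for training the adversarial model, ultimately yielding a model that performs well on both randomly distributed samples and adversarial samples. Simulations verify the effectiveness of the proposed method.
Keywords:	adversarial training; reinforcement learning; model generalization; traveling salesman problem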
Adaptive Adversarial Learning for Solving TSP

XIONG Wenrui,TAO Jiping. Adaptive Adversarial Learning for Solving TSP[J]. Computer Engineering and Applications, 2022, 58(17): 224-229. DOI: 10.3778/j.issn.1002-8331.2102-0169

Authors:	XIONG Wenrui, TAO Jiping

Affiliation:	1. School of Aerospace Engineering, Xiamen University, Xiamen, Fujian 361005, China; 2. Key Laboratory of Big Data Intelligent Analysis and Decision-making of Xiamen Province, Xiamen University, Xiamen, Fujian 361005, China

Abstract:	Deep learning offers new insight into solving combinatorial optimization problems. Most related work focuses on improving models and training methods, and many studies try to raise solution quality by introducing new models from natural language processing, rather than examining generalization and robustness from the perspective of data generation. Targeting the classic traveling salesman problem (TSP), this paper starts from the instance-generation process and designs a generator network inspired by adversarial learning. Specifically, supervised learning is used to produce adversarial samples, which are mixed with random samples during training to improve the model's generalization. In addition, an adaptive mechanism is derived from the way the discriminator is updated during reinforcement learning training and is used to train the adversarial model. The resulting model achieves good results on both random and adversarial samples. Simulation results demonstrate the effectiveness of the proposed approach.

Keywords:	adversarial training; reinforcement learning; model generalization; traveling salesman problem (TSP)
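
The mixed-training idea summarized in the abstract (adversarial samples combined with random samples) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names, the 50/50 mixing ratio, and especially `clustered_instance`, which is only a hand-written stand-in for the learned generator network. The abstract gives no details of the generator or of the adaptive mechanism, so neither is reproduced here.

```python
import math
import random


def random_instance(n, rng):
    """Uniform random TSP instance: n cities in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def clustered_instance(n, rng, spread=0.05):
    """Hypothetical stand-in for a learned adversarial generator:
    cities concentrated around a few random centers."""
    centers = [(rng.random(), rng.random()) for _ in range(3)]
    cities = []
    for _ in range(n):
        cx, cy = rng.choice(centers)
        # Clamp to the unit square so both instance types share one domain.
        x = min(1.0, max(0.0, rng.gauss(cx, spread)))
        y = min(1.0, max(0.0, rng.gauss(cy, spread)))
        cities.append((x, y))
    return cities


def mixed_batch(batch_size, n, adv_ratio, rng):
    """Build one training batch mixing random and stand-in adversarial instances."""
    n_adv = round(batch_size * adv_ratio)
    batch = [clustered_instance(n, rng) for _ in range(n_adv)]
    batch += [random_instance(n, rng) for _ in range(batch_size - n_adv)]
    rng.shuffle(batch)
    return batch


def tour_length(cities, order):
    """Length of the closed tour that visits the cities in the given order."""
    return sum(
        math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


rng = random.Random(0)  # fixed seed (assumed) for reproducibility
batch = mixed_batch(batch_size=8, n=20, adv_ratio=0.5, rng=rng)
```

In a real training loop each batch would be passed to the TSP solver model, and `adv_ratio` is the kind of quantity an adaptive scheme might adjust over time; how the paper actually does this is not stated in the abstract.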
