Similarity-based Reward Strategy of Reinforcement Learning in CITCP
Cite this article: YANG Yang, PAN Chao-Yue, CAO Tian-Ge, LI Zheng. Similarity-based Reward Strategy of Reinforcement Learning in CITCP[J]. Computer Systems & Applications, 2022, 31(2): 325-334.
Authors: YANG Yang  PAN Chao-Yue  CAO Tian-Ge  LI Zheng
Affiliation: College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
Funding: National Natural Science Foundation of China (61872026)
Abstract: In reinforcement-learning-based continuous integration test case prioritization (CITCP), the agent rewards test cases to adjust the prioritization strategy for subsequent integration testing, meeting the demands of frequent iteration and rapid feedback in continuous integration. The agent usually rewards only failing test cases. However, industrial continuous integration is characterized by high-frequency integration with low test failure rates, which poses a new challenge to the practical application of CITCP. A low failure rate means few failing test cases and hence few reward objects, giving rise to the sparse reward problem in reinforcement learning. This study proposes a reward object selection strategy to address the problem: in addition to rewarding failing test cases, it rewards passing test cases that are similar to the failing ones, thereby increasing the number of reward objects. Specifically, a similarity measure is designed in which each test case is represented by a feature vector built from its historical execution information sequence and execution time, and passing test cases similar to the failing set are selected for reward based on this measure. Experiments on six industrial data sets show that the similarity-based reward object selection strategy alleviates the sparse reward problem by adding effective reward objects and further improves the quality of reinforcement-learning-based CITCP.

Keywords: continuous integration testing  reinforcement learning  test case prioritization  similarity  reward object selection strategy  sparse reward
Received: 2021-04-10
Revised: 2021-05-11
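The abstract describes selecting, as additional reward objects, passing test cases that are similar to failing ones, with each test case represented by a feature vector built from its historical execution verdict sequence and its execution time. The sketch below illustrates one way such a selection could work; the feature encoding, the cosine-similarity measure, the threshold value, and all names (`feature_vector`, `select_reward_objects`, etc.) are illustrative assumptions, not the authors' exact design.

```python
import numpy as np

# Hypothetical sketch of similarity-based reward-object selection, assuming a
# feature vector of the last `window` verdicts (1 = fail, 0 = pass) plus a
# normalized execution time, and cosine similarity with a fixed threshold.

def feature_vector(history, duration, max_duration, window=4):
    """Encode a test case as its last `window` verdicts plus normalized duration."""
    padded = ([0] * window + list(history))[-window:]  # pad short histories with passes
    return np.array(padded + [duration / max_duration], dtype=float)

def cosine_similarity(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v) / denom if denom > 0 else 0.0

def select_reward_objects(failed, passed, threshold=0.8):
    """Reward all failing test cases, plus passing ones similar to any failing one."""
    rewarded = list(failed)
    for p in passed:
        if any(cosine_similarity(p["vec"], f["vec"]) >= threshold for f in failed):
            rewarded.append(p)
    return rewarded

if __name__ == "__main__":
    # Toy CI cycle: t1 fails; t2 has a failure-like history, t3 does not.
    cases = [
        {"name": "t1", "history": [1, 1, 0, 1], "duration": 30.0, "failed": True},
        {"name": "t2", "history": [1, 0, 0, 1], "duration": 28.0, "failed": False},
        {"name": "t3", "history": [0, 0, 0, 0], "duration": 5.0,  "failed": False},
    ]
    max_d = max(c["duration"] for c in cases)
    for c in cases:
        c["vec"] = feature_vector(c["history"], c["duration"], max_d)
    failed = [c for c in cases if c["failed"]]
    passed = [c for c in cases if not c["failed"]]
    for c in select_reward_objects(failed, passed):
        print(c["name"], "rewarded")  # t1 and t2 are rewarded; t3 is not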

Indexed by: Weipu (VIP) and other databases.