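Cost-sensitive active learning through farthest distance sum sampling
Cite this article:REN Jie, MIN Fan, WANG Min. Cost-sensitive active learning through farthest distance sum sampling[J]. Journal of Computer Applications, 2019, 39(9): 2499-2504.
Authors:REN Jie  MIN Fan  WANG Min
Affiliation:1. School of Computer Science, Southwest Petroleum University, Chengdu 610500; 2. School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu 610500
Funding:Sichuan Youth Science and Technology Innovation Team Program (2019JDTD0017); Sichuan Applied Basic Research Project (2019JDTD0017).
Abstract:Active learning aims to reduce expert labeling through human-computer interaction, while cost-sensitive active learning seeks to balance labeling cost against misclassification cost. Based on the three-way decision (3WD) methodology and the label uniform distribution (LUD) model, a cost-sensitive active learning algorithm through farthest distance sum sampling (CAFS) is proposed. First, a farthest total distance sampling strategy is designed to query the labels of representative samples. Second, the LUD model and a cost function are used to compute the expected number of samples to query. Finally, k-Means clustering is used to split blocks in which different labels have been obtained. Following the 3WD idea, CAFS iterates over label query, instance prediction, and block splitting until all instances have been processed. The learning process is governed by the objective of cost minimization. In comparisons on 9 public datasets, CAFS achieves a lower average cost than 11 mainstream algorithms.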
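The abstract names a farthest total distance sampling strategy but does not spell out the selection rule. The following is a minimal sketch of one plausible reading, not the authors' implementation: it greedily picks the instance whose summed Euclidean distance to the already-selected instances is largest. The function name `farthest_distance_sum_sample`, the seeding rule, and the tie-breaking (lowest index wins) are all assumptions made for illustration.

```python
import math


def farthest_distance_sum_sample(points, n_queries):
    """Greedily pick indices with the largest summed distance to the picks so far.

    Hypothetical helper: the seeding and tie-breaking rules are assumptions,
    not details taken from the paper.
    """
    if not points or n_queries <= 0:
        return []
    n = len(points)
    # Assumption: seed with the point whose total distance to all points is largest.
    seed = max(range(n), key=lambda i: sum(math.dist(points[i], q) for q in points))
    selected = [seed]
    # dist_sum[i] accumulates the distance from point i to every selected point.
    dist_sum = [math.dist(p, points[seed]) for p in points]
    while len(selected) < min(n_queries, n):
        candidates = [i for i in range(n) if i not in selected]
        nxt = max(candidates, key=lambda i: dist_sum[i])
        selected.append(nxt)
        for i in range(n):
            dist_sum[i] += math.dist(points[i], points[nxt])
    return selected


points = [(0, 0), (1, 0), (10, 0), (5, 0)]
picked = farthest_distance_sum_sample(points, 2)
```

Under these assumptions, the instances returned first lie far apart from one another, which is one way to obtain samples that spread across a block before their labels are queried.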

Keywords:active learning  k-Means clustering  label uniform distribution  three-way decision
Received:2019-03-22
Revised:2019-05-06

Cost-sensitive active learning through farthest distance sum sampling
REN Jie,MIN Fan,WANG Min.Cost-sensitive active learning through farthest distance sum sampling[J].Journal of Computer Applications,2019,39(9):2499-2504.
Authors:REN Jie  MIN Fan  WANG Min
Affiliation:1. School of Computer Science, Southwest Petroleum University, Chengdu Sichuan 610500, China;
2. School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu Sichuan 610500, China
Abstract:Active learning aims to reduce expert labeling through human-machine interaction, while cost-sensitive active learning focuses on balancing labeling cost against misclassification cost. Based on the Three-Way Decision (3WD) methodology and the Label Uniform Distribution (LUD) model, a Cost-sensitive Active learning through Farthest distance sum Sampling (CAFS) algorithm is proposed. First, a farthest distance sum sampling strategy is designed to query the labels of representative samples. Second, the LUD model and a cost function are used to calculate the expected number of samples to query. Finally, the k-Means algorithm is employed to split blocks in which different labels have been obtained. In CAFS, the 3WD methodology is applied to iterate over label query, instance prediction, and block splitting until all instances have been processed. The learning process is controlled by the cost minimization objective. Results on 9 public datasets show that CAFS achieves a lower average cost than 11 mainstream algorithms.
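The block-splitting step can be illustrated with a small 2-means routine. This is a sketch under stated assumptions rather than the method as published: the abstract says only that k-Means splits blocks in which different labels were obtained, so the choice of k = 2, the seeding of the two centers from two queried instances with different labels, and the name `split_block` are hypothetical.

```python
import math


def split_block(points, seed_a, seed_b, max_iter=100):
    """Split one block into two sub-blocks with 2-means (Lloyd's algorithm).

    Assumption: seed_a and seed_b index two queried instances whose labels
    differ; they serve as the initial centers. max_iter must be at least 1.
    """
    centers = [tuple(points[seed_a]), tuple(points[seed_b])]
    assign = None
    for _ in range(max_iter):
        # Assign each point to the nearer center (ties go to the first center).
        new_assign = [
            0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            for p in points
        ]
        if new_assign == assign:
            break  # assignments are stable, so the centers would not move
        assign = new_assign
        for c in (0, 1):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    left = [i for i, a in enumerate(assign) if a == 0]
    right = [i for i, a in enumerate(assign) if a == 1]
    return left, right


block = [(0, 0), (0, 1), (10, 10), (10, 11)]
left, right = split_block(block, 0, 2)
```

Seeding from the two differently labeled instances makes the split deterministic in this sketch; a library implementation such as a standard k-Means with random initialization would be an equally valid stand-in.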
Keywords:active learning  k-Means clustering  label uniform distribution  Three-Way Decision (3WD)