Advances in Active Learning Algorithms Based on Sampling Strategy
Cite this article: Wu Weining, Liu Yang, Guo Maozu, Liu Xiaoyan. Advances in Active Learning Algorithms Based on Sampling Strategy[J]. Journal of Computer Research and Development, 2012, 49(6): 1162-1173.
Authors: Wu Weining, Liu Yang, Guo Maozu, Liu Xiaoyan
Affiliation: School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
Funding: National Natural Science Foundation of China; China Postdoctoral Science Foundation Special Funded Project
Abstract: Active learning algorithms select the most informative unlabeled instances and hand them to human experts for labeling. Repeating this cycle gradually improves classification accuracy, so that a classifier with strong generalization ability is obtained at minimal total labeling cost. The technique has attracted wide attention from researchers at home and abroad and is an important current research topic. This paper surveys active learning algorithms with a particular emphasis on sampling strategies. It describes in detail the iterative processes of the learning engine and the sampling engine, summarizes the existing theoretical results on active learning, and reviews recent work and development trends. First, active learning algorithms are categorized into three main classes according to how the sampling strategy selects examples. Then the sampling strategies are summarized by analyzing how they relate to one another, and the algorithms built on them are analyzed and compared in depth, including the application areas each one suits and its advantages and shortcomings. Finally, the open problems that remain and directions for future research are pointed out.
Keywords: machine learning; active learning; sampling strategy; labeling cost; instance selection
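The loop outlined in the abstract, in which a learning engine retrains the classifier and a sampling engine picks the next instance for the expert to label, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are one-dimensional, the learner is a nearest-centroid classifier, the sampling strategy is a simple margin-based form of uncertainty sampling, and `oracle` is a hypothetical stand-in for the human expert.

```python
import random


def oracle(x):
    """Hypothetical stand-in for the human expert: returns the true label."""
    return 1 if x >= 0.5 else 0


def train(labeled):
    """Learning engine: fit one centroid per class from (x, y) pairs.

    Assumes the labeled set contains at least one example of each class.
    """
    centroids = {}
    for label in (0, 1):
        xs = [x for x, y in labeled if y == label]
        centroids[label] = sum(xs) / len(xs)
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def margin(centroids, x):
    """Gap between the distances to the two centroids.

    A small gap means x lies near the decision boundary, so the
    current model is least certain about it.
    """
    return abs(abs(x - centroids[0]) - abs(x - centroids[1]))


def active_learn(pool, seed_set, budget):
    """Run `budget` query rounds and return the final model and labeled set."""
    labeled = list(seed_set)
    pool = list(pool)  # copy so the caller's pool is left unchanged
    for _ in range(budget):
        model = train(labeled)
        # Sampling engine: query the instance with the smallest margin.
        query = min(pool, key=lambda x: margin(model, x))
        pool.remove(query)
        labeled.append((query, oracle(query)))
    return train(labeled), labeled


def accuracy(model, xs):
    """Fraction of xs whose predicted label matches the oracle."""
    return sum(predict(model, x) == oracle(x) for x in xs) / len(xs)


# Small demonstration on synthetic data.
rng = random.Random(0)
unlabeled_pool = [rng.random() for _ in range(200)]
initial_labels = [(0.05, 0), (0.6, 1)]
eval_points = [i / 1000 for i in range(1000)]

final_model, final_labeled = active_learn(unlabeled_pool, initial_labels, budget=10)
print("labels used:", len(final_labeled))
print("accuracy with seed labels only:", accuracy(train(initial_labels), eval_points))
print("accuracy after active learning:", accuracy(final_model, eval_points))
```

Each round spends exactly one label, so `budget` directly caps the labeling cost. Swapping `margin` for another scoring function changes the sampling strategy without touching the learning engine.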
This article is indexed in CNKI, Wanfang Data, and other databases.