
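SemiBoost-CR Classification Model Using Confidence-Based Resampling
Cite this article: TANG Huanling, LU Mingyu. SemiBoost-CR Classification Model Using Confidence-Based Resampling[J]. 计算机科学与探索, 2011, 5(11): 1048-1056. DOI: 10.3778/j.issn.1673-9418.2011.11.010
Author names: TANG Huanling  LU Mingyu
Affiliations: 1. School of Computer Science and Technology, Shandong Institute of Business and Technology, Yantai, Shandong 264005
2. School of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026
Funding: National Natural Science Foundation of China, Nos. 61073133 and 61175053; Specialized Research Fund for the Doctoral Program of Higher Education, No. 20070151009
Abstract: Combining semi-supervised learning with ensemble learning, this paper proposes SemiBoost-CR, a classification model based on confidence resampling. A formula is given for computing confidence from an example's labeled and unlabeled neighbors. Resampling by confidence selects not only a certain proportion of unlabeled examples with higher confidence but also a certain proportion of unlabeled examples with lower confidence, and the two groups are added to the labeled training set under different strategies. The high-confidence unlabeled examples are introduced in order to improve the base classifier...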

Keywords: boosting  semi-supervised classification  Naive Bayes  confidence  resampling

Advanced SemiBoost-CR Categorization Model Utilizing Confidence-Based Resampling
TANG Huanling, LU Mingyu. Advanced SemiBoost-CR Categorization Model Utilizing Confidence-Based Resampling[J]. Journal of Frontiers of Computer Science and Technology, 2011, 5(11): 1048-1056. DOI: 10.3778/j.issn.1673-9418.2011.11.010
Authors:TANG Huanling  LU Mingyu
Affiliation:1. School of Computer Science and Technology, Shandong Institute of Business and Technology, Yantai, Shandong 264005, China 2. School of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China
Abstract: This paper proposes SemiBoost-CR, an enhanced categorization model that uses confidence-based resampling and combines semi-supervised learning with ensemble learning. The confidence score of an example is derived from its nearby labeled and unlabeled neighbors. Under confidence-based resampling, a proportion of the unlabeled examples with higher confidence scores and a proportion of those with lower confidence scores are both selected and added to the labeled training set. Introducing the higher-confidence unlabeled data is intended to improve the accuracy of the base classifier, while introducing the lower-confidence unlabeled data further increases the diversity among the base classifiers. Experimental results show that SemiBoost-CR can boost the performance of Naive Bayesian text categorization.
Keywords:boosting  semi-supervised categorization  Naive Bayesian  confidence  resampling
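
The resampling step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not reproduce the paper's confidence formula, so everything below is an assumption made for illustration: confidence is taken to be the fraction of an example's k nearest neighbors (labeled ones with their true labels, unlabeled ones with their current predicted labels) that agree with the example's own predicted label. The names `score_unlabeled`, `resample`, `high_ratio`, and `low_ratio` are hypothetical and do not come from the paper.

```python
import math


def score_unlabeled(labeled, unlabeled, k=3):
    """Assumed confidence: share of the k nearest neighbors that agree.

    labeled:   list of (features, true_label) pairs
    unlabeled: list of (features, predicted_label) pairs
    This agreement ratio is an illustrative stand-in, not the paper's formula.
    """
    scores = []
    for i, (x, guess) in enumerate(unlabeled):
        # Neighbors are drawn from labeled data and the *other* unlabeled data.
        others = unlabeled[:i] + unlabeled[i + 1:]
        pool = labeled + others
        nearest = sorted(pool, key=lambda item: math.dist(x, item[0]))[:k]
        agree = sum(1 for _, label in nearest if label == guess)
        scores.append(agree / len(nearest))
    return scores


def resample(scores, high_ratio=0.2, low_ratio=0.1):
    """Return indices of the highest- and lowest-confidence examples.

    The two ratios are hypothetical parameters; they should sum to at most 1
    so that the two groups do not overlap.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_high = int(len(order) * high_ratio)
    n_low = int(len(order) * low_ratio)
    high = order[:n_high]
    low = order[len(order) - n_low:] if n_low else []
    return high, low


if __name__ == "__main__":
    labeled = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b")]
    unlabeled = [((0, 0.5), "a"), ((5, 5.5), "a")]
    scores = score_unlabeled(labeled, unlabeled, k=2)
    print(scores, resample(scores, high_ratio=0.5, low_ratio=0.5))
```

In a boosting loop, the high-confidence group would be added to the training set to strengthen each base classifier and the low-confidence group to diversify the ensemble. How the two groups are weighted or labeled when added is left open here, because the abstract only says that different strategies are used.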
This article is indexed in CNKI, Wanfang Data, and other databases.