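An Improved Co-Training Scheme Combining an Independence Model with Diversity Evaluation
Cite this article: Tang Huanling, Lin Zhengkui, Lu Mingyu, Wu Jun. An Improved Co-Training Scheme Combining an Independence Model with Diversity Evaluation[J]. Journal of Computer Research and Development, 2008, 45(11).
Authors: Tang Huanling  Lin Zhengkui  Lu Mingyu  Wu Jun
Affiliations: 1. School of Information Science & Technology, Dalian Maritime University, Dalian, Liaoning 116026; Department of Computer & Information Engineering, Yantai Vocational College, Yantai, Shandong 264670
2. School of Information Science & Technology, Dalian Maritime University, Dalian, Liaoning 116026
Funding: National Natural Science Foundation of China; Specialized Research Fund for the Doctoral Program of Higher Education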
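Abstract: The Co-Training algorithm requires two feature views that satisfy compatibility and independence. In many applications, however, no naturally partitioned pair of views satisfies this assumption. To address this, the authors propose using mutual information (MI) or the CHI statistic to evaluate the mutual independence between features and to build a feature mutual independence model (MID-Model). Based on this model, they propose two new feature subset partitioning methods, the PMID-MI and PMID-CHI algorithms, which effectively split a feature set into two subsets that are strongly independent of each other. Several diversity measures are then used to further verify the independence of the two subsets. Diversity between the base classifiers reduces the likelihood that both base classifiers mislabel the same unlabeled text. Finally, SC-PMID, an improved Co-Training algorithm, is proposed. Experimental results show that SC-PMID clearly improves semi-supervised classification accuracy.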
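The idea of scoring pairwise feature dependence and then splitting the features into two weakly dependent views can be illustrated with a small sketch. This is not the authors' PMID-MI or PMID-CHI algorithm, whose details the abstract does not give. It assumes binary features, plain (unconditional) mutual information, and a simple greedy rule; the names `mutual_information` and `greedy_partition` are hypothetical.

```python
import math


def mutual_information(x, y):
    """Mutual information (in nats) between two equal-length binary sequences."""
    n = len(x)
    mi = 0.0
    for a in (0, 1):
        p_a = sum(1 for xi in x if xi == a) / n
        for b in (0, 1):
            p_b = sum(1 for yi in y if yi == b) / n
            p_ab = sum(1 for xi, yi in zip(x, y) if xi == a and yi == b) / n
            if p_ab > 0:  # zero-probability cells contribute nothing
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def greedy_partition(columns):
    """Split feature indices into two subsets so that strongly dependent
    features land in the same subset (hypothetical heuristic, not PMID-MI)."""
    n = len(columns)
    if n < 2:
        raise ValueError("need at least two features")
    # Pairwise dependence table.
    mi = [[mutual_information(columns[i], columns[j]) for j in range(n)]
          for i in range(n)]
    # Seed the two subsets with feature 0 and the feature least dependent on it.
    seed_b = min(range(1, n), key=lambda j: mi[0][j])
    subset_a, subset_b = [0], [seed_b]
    for f in range(1, n):
        if f == seed_b:
            continue
        dep_a = sum(mi[f][g] for g in subset_a)
        dep_b = sum(mi[f][g] for g in subset_b)
        # Put the feature where its dependence is strongest; break ties by size.
        if dep_a > dep_b or (dep_a == dep_b and len(subset_a) <= len(subset_b)):
            subset_a.append(f)
        else:
            subset_b.append(f)
    return subset_a, subset_b


# Toy data: each inner list is one binary feature observed over four documents.
columns = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 0, 1]]
view_a, view_b = greedy_partition(columns)
print(view_a, view_b)
```

In the toy data, features 0 and 1 are duplicates, as are features 2 and 3, while the two pairs are independent of each other, so the sketch keeps each duplicate pair in the same view. Keeping dependent features together is what leaves the two views weakly dependent on each other, which is the property Co-Training needs.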

Keywords: semi-supervised classification  labeled text  unlabeled text  mutual independence model  diversity evaluation

An Advanced Co-Training Algorithm Based on Mutual Independence and Diversity Measures
Tang Huanling, Lin Zhengkui, Lu Mingyu, Wu Jun. An Advanced Co-Training Algorithm Based on Mutual Independence and Diversity Measures[J]. Journal of Computer Research and Development, 2008, 45(11).
Authors: Tang Huanling  Lin Zhengkui  Lu Mingyu  Wu Jun
Affiliation: Tang Huanling 1,2, Lin Zhengkui 1, Lu Mingyu 1, Wu Jun 1
1 (School of Information Science & Technology, Dalian Maritime University, Dalian, Liaoning 116026)
2 (Department of Computer & Information Engineering, Yantai Vocational College, Yantai, Shandong 264670)
Abstract: The Co-Training algorithm is constrained by the assumption that the features can be split into two subsets that are both compatible and independent. However, this assumption is usually violated to some degree in real-world applications. The authors propose two methods to evaluate mutual independence, using conditional mutual information or conditional CHI statistics, and present a method to construct a mutual independence model (MID-Model) for the initial feature set. Based on MID-Model, two novel feature par...
Keywords: Co-Training
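For reference, the standard textbook definition of the conditional mutual information between two features given the class is shown below. The truncated abstract does not state the exact form used in the paper, so the symbols here ($X_i$ and $X_j$ for two features, $C$ for the class variable) are assumptions made for illustration.

```latex
I(X_i; X_j \mid C)
  = \sum_{c} p(c) \sum_{x_i,\, x_j}
    p(x_i, x_j \mid c)\,
    \log \frac{p(x_i, x_j \mid c)}{p(x_i \mid c)\, p(x_j \mid c)}
```

The quantity is zero exactly when $X_i$ and $X_j$ are conditionally independent given $C$, so smaller values indicate feature pairs that can more safely be placed in different views.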
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.