
Rough Set Attribute Reduction Algorithm for Partially Labeled Data

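Cite this article:	ZHANG Wei, MIAO Duo-qian, GAO Can, LI Feng. Rough Set Attribute Reduction Algorithm for Partially Labeled Data[J]. Computer Science, 2017, 44(1): 25-31.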

Authors:	ZHANG Wei, MIAO Duo-qian, GAO Can, LI Feng

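Affiliations:	ZHANG Wei: Department of Computer Science and Technology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804; School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200090; Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804. MIAO Duo-qian: Department of Computer Science and Technology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804; Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804. GAO Can: College of Computer Science and Software Engineering, Shenzhen University, Guangdong 518060; Faculty of Applied Science and Textiles, The Hong Kong Polytechnic University, Hong Kong. LI Feng: Department of Computer Science and Technology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804; Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804.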

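Funding:	This work was supported by the National Natural Science Foundation of China (61273304) and the 2013 Specialized Research Fund for the Doctoral Program of Higher Education (20130072130004).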

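Abstract:	Attribute reduction is one of the central research topics in rough set theory and a key step of knowledge acquisition in data mining. Reduction in Pawlak rough sets usually operates on either a labeled decision table or an unlabeled information table. In many real-world problems, however, labeled data are scarce and most of the data are unlabeled; that is, the data are semi-supervised. To address this, an attribute reduction algorithm for semi-supervised data is proposed that draws on semi-supervised co-training. The algorithm first computes two highly diverse reducts on the labeled data and uses them to build base classifiers. The classifiers then learn from each other on the unlabeled data, enlarging the labeled set so that better reducts can be obtained and better-performing classifiers constructed. This process is iterated, so that the unlabeled data are used to improve the quality of the reducts computed from the labeled data, and a high-quality attribute reduct is finally obtained. Experimental analysis on UCI data sets shows that the algorithm is effective and feasible.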
Keywords:	Rough sets; Incremental attribute reduction; Co-training; Partially labeled data; Semi-supervised learning
Received:	2015/10/9
Revised:	2015/12/13
Rough Set Attribute Reduction Algorithm for Partially Labeled Data

ZHANG Wei, MIAO Duo-qian, GAO Can, LI Feng. Rough Set Attribute Reduction Algorithm for Partially Labeled Data[J]. Computer Science, 2017, 44(1): 25-31.

Authors:	ZHANG Wei, MIAO Duo-qian, GAO Can, LI Feng

Affiliation:	ZHANG Wei: Department of Computer Science and Technology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China; Department of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200090, China; The Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804, China. MIAO Duo-qian: Department of Computer Science and Technology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China; The Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804, China. GAO Can: School of Computer and Software, Shenzhen University, Guangdong 518060, China; Institute of Textiles & Clothing, The Hong Kong Polytechnic University, Hong Kong, China. LI Feng: Department of Computer Science and Technology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China; The Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804, China.

Abstract:	Attribute reduction, an important preprocessing step for knowledge acquisition in data mining, is one of the key issues in rough set theory. Rough set theory is an effective supervised learning model for labeled data. However, attribute reduction for partially labeled data is outside the realm of traditional rough set theory. In this paper, a rough set attribute reduction algorithm for partially labeled data is proposed based on co-training, which capitalizes on the unlabeled data to improve the quality of attribute reducts computed from a small amount of labeled data. The algorithm computes two diverse reducts of the labeled data, uses them to train its base classifiers, and then co-trains the two base classifiers iteratively. In every round, the base classifiers learn from each other on the unlabeled data and enlarge the labeled data, so that better-quality reducts can be computed from the enlarged labeled data and used to construct base classifiers of higher performance. Theoretical analysis and experimental results on UCI data sets show that the proposed algorithm can select a small number of attributes while preserving classification power.

Keywords:	Rough sets; Incremental attribute reduction; Co-training; Partially labeled data; Semi-supervised learning
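
The co-training loop summarized in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation, and the abstract does not specify any of the following, so all of it is assumed for illustration: a greedy, dependency-based reduct search (which may return a superset of a minimal reduct), a second reduct obtained by reversing the attribute order used for tie-breaking, an exact-match rule classifier that abstains on unseen values, and a rule that accepts a pseudo-label when at least one classifier predicts and the other does not contradict it. All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def partition(rows, attrs):
    """Group row indices into indiscernibility classes on the given attributes."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Share of rows in the positive region (blocks whose labels all agree)."""
    if not rows:
        return 0.0
    pos = sum(len(b) for b in partition(rows, attrs)
              if len({labels[i] for i in b}) == 1)
    return pos / len(rows)

def greedy_reduct(rows, labels, order):
    """Add the best attribute until the full-set dependency is reached.
    Ties go to the earliest attribute in `order` (assumed diversity trick)."""
    target = dependency(rows, labels, order)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        best = max((a for a in order if a not in chosen),
                   key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return chosen

def classify(rows, labels, reduct, x):
    """Exact-match rule on the reduct attributes; None means abstain."""
    votes = Counter(labels[i] for i, r in enumerate(rows)
                    if all(r[a] == x[a] for a in reduct))
    return votes.most_common(1)[0][0] if len(votes) == 1 else None

def co_train(rows, labels, unlabeled, n_attrs, max_rounds=10):
    """Iteratively pseudo-label unlabeled rows and recompute the reducts."""
    rows, labels, pool = list(rows), list(labels), list(unlabeled)
    attrs = list(range(n_attrs))
    for _ in range(max_rounds):
        r1 = greedy_reduct(rows, labels, attrs)        # first view
        r2 = greedy_reduct(rows, labels, attrs[::-1])  # second, diverse view
        accepted, rest = [], []
        for x in pool:
            p1 = classify(rows, labels, r1, x)
            p2 = classify(rows, labels, r2, x)
            guess = p1 if p1 is not None else p2
            # Accept when some classifier predicts and the two do not disagree.
            if guess is not None and (p1 is None or p2 is None or p1 == p2):
                accepted.append((x, guess))
            else:
                rest.append(x)
        if not accepted:
            break
        for x, y in accepted:
            rows.append(x)
            labels.append(y)
        pool = rest
    return greedy_reduct(rows, labels, attrs), rows, labels, pool

# Toy data: attributes 0 and 1 are redundant copies, attribute 2 is noise.
labeled = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
y = ["n", "y", "n", "y"]
unlabeled = [(0, 2, 0), (5, 2, 1), (0, 1, 0)]
reduct, all_rows, all_labels, leftover = co_train(labeled, y, unlabeled, 3)
```

In this toy run the first round labels `(0, 2, 0)` through the attribute-0 view only. That new example teaches the attribute-1 view the value `2`, which lets the second round label `(5, 2, 1)`. The row `(0, 1, 0)` stays unlabeled because the two views contradict each other on it.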

