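Relevant random subspace-based clustering ensemble for categorical data (基于相关随机子空间的分类数据聚类集成)
Cite this article: MA Hai-feng, LIU Yu-xi. Relevant random subspace-based clustering ensemble for categorical data [J]. Application Research of Computers (计算机应用研究), 2013, 30(4): 1082-1084.
Authors: MA Hai-feng, LIU Yu-xi
Affiliations: 1. Changzhou Institute of Mechatronic Technology, Changzhou, Jiangsu 213000; 2. School of International Business Administration, Shanghai University of Finance & Economics, Shanghai 200433; 3. School of Management, University of Shanghai for Science & Technology, Shanghai 200093
Funding: National Natural Science Foundation of China (70972062); Shanghai Philosophy and Social Science Planning Project (2011BGL011); Shanghai Key Discipline Project (S30504); Graduate Research and Innovation Fund of Shanghai University of Finance & Economics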
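Abstract (translated from the Chinese): To improve clustering ensembles for categorical data, a new relevant random subspace clustering ensemble model is proposed. The model uses rough set theory to split the categorical attributes into a relevant subset and an irrelevant subset, randomly generates multiple relevant subspaces from the relevant attribute subset, and clusters the categorical data in each of them. It then integrates multiple good and mutually diverse clustering results to obtain the final partition. In addition, the rough set concept of attribute reduction is used to determine the number of attributes in each relevant subspace, which avoids the influence of this parameter on the clustering result. Experiments on UCI data sets show that the new model outperforms other existing models, which demonstrates its effectiveness.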

Keywords (translated from the Chinese): categorical data  rough sets  attribute reduction  relevant subspace  clustering ensemble
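
The abstract relies on two standard rough set notions: the indiscernibility partition induced by a set of attributes, and a reduct, meaning a minimal attribute subset that induces the same partition as the full set. The abstract does not say how the paper computes reducts or how it separates relevant from irrelevant attributes, so the following is only a minimal sketch of the textbook definitions. The function names, the brute-force search, and the toy `rows` table are all illustrative assumptions and are not taken from the paper.

```python
from itertools import combinations

def partition(rows, attrs):
    """Group row indices into indiscernibility classes over the given attribute indices."""
    classes = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, []).append(i)
    return sorted(classes.values())

def smallest_reduct(rows, attrs):
    """Brute-force search for a smallest attribute subset that induces
    the same partition as the full attribute list (assumed criterion)."""
    target = partition(rows, attrs)
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            if partition(rows, subset) == target:
                return list(subset)
    return list(attrs)

# Hypothetical toy table: attribute 2 carries the same information as attribute 0.
rows = [
    ("a", "x", "p"),
    ("a", "y", "p"),
    ("b", "x", "q"),
    ("b", "y", "q"),
]
reduct = smallest_reduct(rows, [0, 1, 2])
subspace_size = len(reduct)  # one possible way to fix the subspace size
```

Under this assumption, the size of a reduct could serve as the number of attributes drawn for each subspace, so that no separate size parameter has to be tuned. The brute-force search is exponential in the number of attributes and is only meant for small illustrations.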

Relevant random subspace-based clustering ensemble for categorical data
MA Hai-feng, LIU Yu-xi. Relevant random subspace-based clustering ensemble for categorical data [J]. Application Research of Computers, 2013, 30(4): 1082-1084.
Authors: MA Hai-feng  LIU Yu-xi
Affiliation: 1. Changzhou Institute of Mechatronic Technology, Changzhou Jiangsu 213000, China; 2. School of International Business Administration, Shanghai University of Finance & Economics, Shanghai 200433, China; 3. School of Management, University of Shanghai for Science & Technology, Shanghai 200093, China
Abstract: To improve the quality of clustering ensembles for categorical data, this paper proposes a relevant random subspace-based clustering ensemble model. Using rough set theory, the model first decomposes the full set of categorical attributes into a relevant attribute set and an irrelevant attribute set. It then generates relevant subspaces at random from the relevant attribute set and obtains the final clustering solution by combining multiple good and diverse partitions produced in those subspaces. Moreover, the model uses the rough set concept of attribute reduction to determine the number of attributes in each relevant subspace, which avoids the effect of this parameter on the ensemble result. Empirical results on selected UCI data sets show that the proposed model achieves better and more robust clustering performance than several representative clustering ensemble models for categorical data, which demonstrates its effectiveness.
Keywords: categorical data  rough sets  attribute reduction  relevant subspace  clustering ensemble
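
The pipeline described in the abstract (draw random subspaces from the relevant attributes, cluster in each, combine the partitions) can be sketched roughly as follows. The abstract names neither the base clusterer nor the consensus function, so this sketch swaps in a tiny k-modes routine and a thresholded co-association matrix as stand-ins. It also omits the paper's selection of good and diverse partitions. All function names, parameters, and the toy data are hypothetical.

```python
import random
from collections import Counter

def k_modes(rows, attrs, k, rng, iters=10):
    """Tiny k-modes restricted to the given attribute indices; one label per row."""
    modes = [tuple(rows[i][a] for a in attrs)
             for i in rng.sample(range(len(rows)), k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign each row to the mode with the fewest mismatching values.
        for i, row in enumerate(rows):
            dists = [sum(row[a] != m[j] for j, a in enumerate(attrs)) for m in modes]
            labels[i] = dists.index(min(dists))
        # Update each mode to the most frequent value per attribute.
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if labels[i] == c]
            if members:
                modes[c] = tuple(Counter(r[a] for r in members).most_common(1)[0][0]
                                 for a in attrs)
    return labels

def co_association(labelings, n):
    """Fraction of base clusterings that place each pair of rows together."""
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]

def consensus(co, threshold=0.5):
    """Link rows whose co-association exceeds the threshold; each connected
    component becomes one final cluster."""
    n = len(co)
    final = [-1] * n
    next_label = 0
    for start in range(n):
        if final[start] != -1:
            continue
        final[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if final[j] == -1 and co[i][j] > threshold:
                    final[j] = next_label
                    stack.append(j)
        next_label += 1
    return final

def subspace_ensemble(rows, relevant_attrs, subspace_size, k, n_members=15, seed=0):
    """Cluster in random subspaces of the relevant attributes, then combine."""
    rng = random.Random(seed)
    labelings = []
    for _ in range(n_members):
        subspace = rng.sample(relevant_attrs, subspace_size)
        labelings.append(k_modes(rows, subspace, k, rng))
    co = co_association(labelings, len(rows))
    return consensus(co), co

# Hypothetical toy data: six objects described by four categorical attributes.
data = [
    ("a", "x", "p", "m"), ("a", "x", "p", "n"), ("a", "y", "p", "m"),
    ("b", "z", "q", "n"), ("b", "z", "q", "m"), ("b", "w", "q", "n"),
]
final_labels, co = subspace_ensemble(data, [0, 1, 2, 3], subspace_size=2, k=2)
```

Here `relevant_attrs` and `subspace_size` stand for the outputs of the rough set steps (the relevant attribute set and the reduct-based subspace size), which this sketch simply takes as inputs. The number of final clusters depends on the threshold and is not forced to equal `k`.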
This article is indexed in CNKI and other databases.