
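A Semi-supervised Clustering Algorithm for High-Dimensional Sparse Data
Cite this article: CUI Peng, ZHANG Ru-bo. A Semi-supervised Clustering Algorithm for High-Dimensional Sparse Data[J]. Computer Science, 2010, 37(7): 205-207.
Authors: CUI Peng, ZHANG Ru-bo
Affiliations: 1. School of Computer Science and Technology, Harbin Engineering University, Harbin 150001; College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080
2. School of Computer Science and Technology, Harbin Engineering University, Harbin 150001
Funding: National 863 Program key fund project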
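Abstract: Semi-supervised clustering has been an active research topic in recent years. Traditional methods add limited background knowledge to an unsupervised algorithm to improve clustering performance. However, most semi-supervised clustering techniques are based on proximity or density and have difficulty handling high-dimensional data, so feature reduction must be incorporated into the semi-supervised clustering process. To address this problem, a new semi-supervised clustering framework is proposed. The algorithm first preprocesses the samples using the transitivity of constraints, then projects the features into a low-dimensional space to reduce dimensionality, and finally clusters the reduced samples with a semi-supervised algorithm. An experimental comparison with the main current dimensionality reduction methods shows that the method handles high-dimensional data effectively and clusters it well.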

Keywords: dimensionality reduction; semi-supervised clustering; feature selection; constraints
Received: 2009/8/28
Revised: 2009/10/18

Novel Semi-supervised Clustering for High Dimensional Data
CUI Peng,ZHANG Ru-bo.Novel Semi-supervised Clustering for High Dimensional Data[J].Computer Science,2010,37(7):205-207.
Authors: CUI Peng, ZHANG Ru-bo
Affiliation: (School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China); (College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China)
Abstract: Semi-supervised clustering has been a popular research topic in recent years; it usually incorporates limited background knowledge to improve clustering performance. However, most existing methods are based on neighbors or density and cannot handle high-dimensional data well, so it is critical to merge feature reduction into the semi-supervised clustering process. To solve this problem, we propose a framework for semi-supervised clustering. The framework first preprocesses the instances using the transitivity of constraints, then reduces dimensionality by projecting the features into a low-dimensional space, and finally clusters the instances with the reduced features. To evaluate the effectiveness of the method, we ran experiments on datasets; the results show that the method achieves good clustering performance on high-dimensional data.
Keywords: Dimensionality reduction; Semi-supervised clustering; Feature selection; Constraints
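
The three stages named in the abstract (constraint transitivity, projection to a low-dimensional space, semi-supervised clustering) can be illustrated with a minimal sketch. The abstract does not say which projection or which clustering algorithm the authors use, so this sketch swaps in plain PCA and a COP-KMeans-style constrained k-means as stand-ins; every function name and parameter below is hypothetical and is not taken from the paper.

```python
import numpy as np

def must_link_components(n, must_link):
    """Stage 1a: transitive closure of must-link pairs via union-find."""
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for a, b in must_link:
        parent[find(a)] = find(b)
    roots = [find(i) for i in range(n)]
    ids = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return np.array([ids[r] for r in roots])  # component id per sample

def lift_cannot_link(comp, cannot_link):
    """Stage 1b: a cannot-link pair separates the whole must-link groups."""
    pairs = set()
    for a, b in cannot_link:
        ca, cb = int(comp[a]), int(comp[b])
        if ca == cb:
            raise ValueError("must-link and cannot-link constraints conflict")
        pairs.add((min(ca, cb), max(ca, cb)))
    return pairs

def pca_project(X, d):
    """Stage 2 (assumed stand-in): project onto the top-d principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

def constrained_kmeans(Z, comp, cl_pairs, k, iters=20):
    """Stage 3 (assumed stand-in): k-means over must-link components that
    never puts two cannot-linked components in the same cluster."""
    n_comp = int(comp.max()) + 1
    means = np.array([Z[comp == c].mean(axis=0) for c in range(n_comp)])
    sizes = np.bincount(comp, minlength=n_comp)
    chosen = [0]  # deterministic farthest-point initialisation
    while len(chosen) < k:
        diff = means[:, None, :] - means[chosen][None, :, :]
        chosen.append(int((diff ** 2).sum(-1).min(axis=1).argmax()))
    centers = means[chosen].copy()
    rivals = {c: set() for c in range(n_comp)}
    for a, b in cl_pairs:
        rivals[a].add(b)
        rivals[b].add(a)
    assign = np.full(n_comp, -1)
    for _ in range(iters):
        new = np.full(n_comp, -1)
        for c in range(n_comp):
            forbidden = {int(new[o]) for o in rivals[c] if new[o] >= 0}
            order = np.argsort(((centers - means[c]) ** 2).sum(axis=1))
            allowed = [int(j) for j in order if int(j) not in forbidden]
            if not allowed:
                raise ValueError("no feasible cluster for component %d" % c)
            new[c] = allowed[0]
        if np.array_equal(new, assign):
            break
        assign = new
        for j in range(k):  # size-weighted mean equals the mean of all member points
            members = assign == j
            if members.any():
                w = sizes[members][:, None]
                centers[j] = (w * means[members]).sum(axis=0) / w.sum()
    return assign[comp]  # cluster label per sample

def cluster(X, must_link, cannot_link, k, d):
    """Run the three stages in the order given in the abstract."""
    comp = must_link_components(len(X), must_link)
    cl_pairs = lift_cannot_link(comp, cannot_link)
    return constrained_kmeans(pca_project(X, d), comp, cl_pairs, k)
```

Calling `cluster(X, must_link=[(0, 1), (1, 2)], cannot_link=[(2, 3)], k=2, d=2)` on a small sparse matrix `X` treats samples 0, 1 and 2 as one group (the pair (0, 2) follows by transitivity) and keeps sample 3 out of that group's cluster. The greedy assignment is order-dependent and can fail when the constraints leave no feasible cluster, which is a known limitation of this style of constrained k-means rather than a property of the paper's method.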
This article is indexed in Wanfang Data and other databases.