Affinity Propagation Clustering for Large Scale Dataset


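Citation:	GU Rui-jun, WANG Jia-cai, CHEN Geng, CHEN Sheng-lei. Affinity Propagation Clustering for Large Scale Dataset[J]. Computer Engineering, 2010, 36(23): 22-24.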

Authors:	GU Rui-jun, WANG Jia-cai, CHEN Geng, CHEN Sheng-lei

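Affiliation:	(1. School of Information Science, Nanjing Audit University, Nanjing 210029, China; 2. School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China)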

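Funding:	National Natural Science Foundation of China; Natural Science Foundation of Jiangsu Higher Education Institutions; university-level pre-research project fund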

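Abstract:	Affinity propagation clustering must build a similarity matrix during computation. The size of this matrix grows sharply with the number of samples (n × n entries for n samples), which prevents the algorithm from being applied directly to large-scale datasets. To address this, an improved affinity propagation clustering algorithm is proposed. It uses the local distribution of the data points and borrows the idea of semi-supervised clustering to construct a sparse similarity matrix, then clusters the exemplars of the resulting clusters again, once or several times, until a suitable cluster partition is obtained. Experimental results show that the algorithm outperforms the original in processing capacity and running speed.
Keywords:	affinity propagation clustering; large scale dataset; data mining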
Affinity Propagation Clustering for Large Scale Dataset

GU Rui-jun, WANG Jia-cai, CHEN Geng, CHEN Sheng-lei. Affinity Propagation Clustering for Large Scale Dataset[J]. Computer Engineering, 2010, 36(23): 22-24.

Authors:	GU Rui-jun, WANG Jia-cai, CHEN Geng, CHEN Sheng-lei

Affiliation:	(1. School of Information Science, Nanjing Audit University, Nanjing 210029, China; 2. School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang 212013, China)

Abstract:	Affinity Propagation (AP) clustering performs message propagation over the full similarity matrix, which limits its application to large-scale datasets. An improved affinity propagation clustering algorithm is proposed for processing large datasets. It uses the local distribution of the data to add constraints, as in semi-supervised clustering, and thereby constructs a sparse similarity matrix. AP runs on the sparse similarity matrix to obtain an initial cluster partition, then runs iteratively on the exemplars until it obtains a reasonable partition. Experimental results demonstrate that the improved affinity propagation performs better in both the scale of data it can process and the processing time.

Keywords:	affinity propagation clustering; large scale dataset; data mining
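
The two steps named in the abstract (AP on a sparse similarity matrix, then repeated AP on the exemplars) can be illustrated with a minimal pure-Python sketch. It is not the authors' algorithm: the abstract does not say how the local-distribution constraints are built or when a partition counts as reasonable, so a k-nearest-neighbour graph stands in for the sparse similarity matrix, and a hypothetical `max_clusters` threshold stands in for the stopping rule. The function names, the median preference, and the final nearest-exemplar assignment are likewise assumptions made for illustration. The message updates are the standard AP responsibility and availability rules, restricted to the kept edges.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def knn_similarities(points, k):
    """Sparse similarities (assumed stand-in): keep only each point's k nearest
    neighbours. Brute-force search, so time is still O(n^2); only storage is sparse."""
    n = len(points)
    s = {}
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: sq_dist(points[i], points[j]))
        for j in others[:k]:
            s[(i, j)] = -sq_dist(points[i], points[j])
    vals = sorted(s.values())
    pref = vals[len(vals) // 2]          # preference: median kept similarity (assumption)
    for i in range(n):
        s[(i, i)] = pref
    return s

def sparse_ap(points, k=5, damping=0.7, iters=100):
    """Standard AP message passing restricted to the sparse edges.
    Returns the indices of the points chosen as exemplars."""
    n = len(points)
    if n <= 1:
        return list(range(n))
    s = knn_similarities(points, max(1, min(k, n - 1)))
    out_edges = {i: [] for i in range(n)}   # candidate exemplars j for point i
    in_edges = {j: [] for j in range(n)}    # points i that consider j
    for (i, j) in s:
        out_edges[i].append(j)
        in_edges[j].append(i)
    r = dict.fromkeys(s, 0.0)
    a = dict.fromkeys(s, 0.0)
    for _ in range(iters):
        for i in range(n):                  # responsibilities
            scores = sorted(((a[(i, j)] + s[(i, j)], j) for j in out_edges[i]),
                            reverse=True)
            best, best_j = scores[0]
            second = scores[1][0]
            for j in out_edges[i]:
                rival = second if j == best_j else best
                r[(i, j)] = damping * r[(i, j)] + (1 - damping) * (s[(i, j)] - rival)
        for j in range(n):                  # availabilities
            pos = {i: max(0.0, r[(i, j)]) for i in in_edges[j] if i != j}
            total = sum(pos.values())
            for i in in_edges[j]:
                new = total if i == j else min(0.0, r[(j, j)] + total - pos[i])
                a[(i, j)] = damping * a[(i, j)] + (1 - damping) * new
    exemplars = [j for j in range(n) if r[(j, j)] + a[(j, j)] > 0]
    if not exemplars:                       # fallback if no point qualifies
        exemplars = [max(range(n), key=lambda j: r[(j, j)] + a[(j, j)])]
    return exemplars

def hierarchical_ap(points, k=5, max_clusters=10):
    """Re-cluster the exemplars until at most max_clusters remain
    (hypothetical stopping rule) or a pass no longer reduces them."""
    centers = list(points)
    while True:
        idx = sparse_ap(centers, k)
        if len(idx) == len(centers):
            break
        centers = [centers[i] for i in idx]
        if len(centers) <= max_clusters:
            break
    # Assign every original point to its nearest surviving exemplar.
    labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
              for p in points]
    return centers, labels

# Toy data: three well-separated 2-D blobs of 20 points each.
rng = random.Random(0)
blobs = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blobs for _ in range(20)]
centers, labels = hierarchical_ap(points, k=5, max_clusters=3)
print(len(centers), "clusters")
```

With k neighbours per point, the message dictionaries hold about n(k + 1) entries rather than n², which is the storage saving the abstract points to. The number of clusters printed depends on the preference and on `max_clusters`, and if a pass fails to reduce the exemplar set the loop stops early, so the result can exceed `max_clusters`.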
This article is indexed in databases including VIP and Wanfang Data.