
Study on Mass Chinese Short Message Text Density Clustering (海量中文短信文本密度聚类研究)

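Cite this article:	ZHOU Hong (周泓), LIU Jin-ling (刘金岭). Study on Mass Chinese Short Message Text Density Clustering (海量中文短信文本密度聚类研究)[J]. Computer Engineering (计算机工程), 2010, 36(22): 81-82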

Authors:	ZHOU Hong (周泓), LIU Jin-ling (刘金岭)

Affiliation:	(Faculty of Computer Engineering, Huaiyin Institute of Technology, Huai'an, Jiangsu 223003)

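Abstract:	Based on the characteristics of short message text, this paper presents a density-based method for clustering Chinese short messages. The method partitions high-density regions of the text data into clusters and builds a seed queue, sorted in ascending order of reachable similarity, that stores the short message texts awaiting expansion. It selects objects that are reachable at a large similarity threshold, that is, it quickly locates text objects in dense regions so that higher-density clusters are completed first. Experimental results show that the method is about 10 times more efficient than K-means.
Keywords:	density; cluster; neighborhood; short message text; clustering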
Study on Mass Chinese Short Message Text Density Clustering

ZHOU Hong, LIU Jin-ling. Study on Mass Chinese Short Message Text Density Clustering[J]. Computer Engineering, 2010, 36(22): 81-82

Authors:	ZHOU Hong, LIU Jin-ling

Affiliation:	(Faculty of Computer Engineering, Huaiyin Institute of Technology, Huaian 223003, China)

Abstract:	Based on the characteristics of short message text, a density-based method for clustering Chinese short messages is given. High-density regions of the text data are divided into clusters, and a seed queue, arranged in ascending order of reachable similarity, is constructed to store the short message texts to be expanded. The messages are processed in a specific order. So that higher-density clusters are completed first, objects are selected according to a larger similarity threshold; that is, text objects in dense regions are located quickly and their clusters are finished first. Experimental results show that this clustering method is about 10 times as efficient as the K-means method.

Keywords:	density; cluster; neighborhood; short message text; clustering
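
The abstract gives only an outline of the method, so the following is a minimal Python sketch of the general idea rather than the authors' implementation. It assumes character-bigram features, Jaccard similarity, a similarity threshold `min_sim`, and a neighbor count `min_pts`; none of these choices or names come from the paper. The seed queue is modeled as a heap that pops the most similar pending text first, which is one possible reading of "higher-density clusters are completed first".

```python
import heapq


def bigrams(text):
    """Character bigrams (assumed feature set; needs no word segmenter)."""
    return {text[i:i + 2] for i in range(len(text) - 1)} or {text}


def jaccard(a, b):
    """Jaccard similarity of two feature sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def density_cluster(texts, min_sim=0.3, min_pts=2):
    """Label each text with a cluster id, or -1 for noise.

    min_sim and min_pts are hypothetical parameters: a text is treated as a
    core object when at least min_pts other texts are at least min_sim similar.
    """
    feats = [bigrams(t) for t in texts]
    n = len(texts)
    labels = [None] * n  # None = unvisited, -1 = noise, >= 0 = cluster id

    def neighbors(i):
        found = []
        for j in range(n):
            if j != i:
                s = jaccard(feats[i], feats[j])
                if s >= min_sim:
                    found.append((s, j))
        return found

    cluster_id = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        nbrs = neighbors(start)
        if len(nbrs) < min_pts:
            labels[start] = -1  # provisional noise; may become a border text
            continue
        labels[start] = cluster_id
        # Seed queue: negated similarity so the most similar text pops first.
        seeds = [(-s, j) for s, j in nbrs]
        heapq.heapify(seeds)
        while seeds:
            _, j = heapq.heappop(seeds)
            if labels[j] == -1:
                labels[j] = cluster_id  # border text: joins, is not expanded
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            nj = neighbors(j)
            if len(nj) >= min_pts:  # core text: expand the cluster through it
                for s, k in nj:
                    if labels[k] is None or labels[k] == -1:
                        heapq.heappush(seeds, (-s, k))
        cluster_id += 1
    return labels


texts = [
    "今晚一起吃饭吗", "今晚一起吃饭吧", "今晚一起去吃饭",
    "明天上午开会", "明天上午开会吗", "明天上午要开会",
    "生日快乐",
]
labels = density_cluster(texts)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy run the three dinner messages and the three meeting messages form two clusters, and the unrelated greeting is left as noise. The sketch compares every pair of texts, so it illustrates only the expansion order; it does not reproduce the efficiency reported in the abstract.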
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).