
A multi-level document clustering algorithm based on association rules

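Cite this article:	SONG Jiang-chun, SHEN Jun-yi, SONG Qing-bao. A multi-level document clustering algorithm based on association rules [J]. Journal of Computer Applications (计算机应用), 2005, 25(7): 1570-1572.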

Authors:	SONG Jiang-chun, SHEN Jun-yi, SONG Qing-bao

Affiliation:	School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049 (all three authors)

Funding:	Supported by the National Natural Science Foundation of China (60173058)

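Abstract:	A new multi-level document clustering algorithm based on association rules is proposed. The algorithm uses a new document feature extraction method to construct topic and keyword feature vectors for each document. It first performs an initial clustering of the documents in the topic feature vector space using a fast frequent itemset algorithm, and then refines the initial clusters in a new feature vector space built from topic keywords, using inter-cluster distance and link degree, to obtain the final clustering. Because a two-level clustering approach is used, both the efficiency and the precision of the algorithm are greatly improved; the new feature extraction method also solves the problem of excessively high feature vector dimensionality caused by too many document keywords.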
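Keywords:	document mining; document clustering; association rules; document topic feature vector; document keyword feature vector
Article ID:	1001-9081(2005)07-1570-03
Received:	2005-02-03
Revised:	2005-04-01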
Multi-level document clustering algorithm based on association rules

SONG Jiang-chun, SHEN Jun-yi, SONG Qing-bao. Multi-level document clustering algorithm based on association rules [J]. Journal of Computer Applications, 2005, 25(7): 1570-1572.

Authors:	SONG Jiang-chun, SHEN Jun-yi, SONG Qing-bao

Affiliation:	School of Electronics and Information Engineering, Xi'an Jiaotong University

Abstract:	A multi-level document clustering algorithm based on association rules was proposed. It constructed document feature vectors of topics and keywords by using a new method of document feature extraction. First, it found the initial document clusters by applying a fast frequent itemset algorithm in the topic vector space; then, in the keyword vector space, it re-clustered the initial clusters according to the cluster distance and the link intensity. Because the initial clustering was performed with a classical fast frequent itemset algorithm, the efficiency and the precision of the algorithm were greatly increased. The new method of document feature extraction also solved the problem that the dimension of the keyword vector space becomes too high as the number of keywords in a document increases.

Keywords:	document mining; document clustering; association rule; document topic feature vector; document keyword feature vector
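
The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not define the feature extraction method, the cluster distance, or the link intensity, so everything below is an assumption made for illustration. Stage 1 uses a plain level-wise (Apriori-style) frequent itemset search over each document's topic terms and assigns a document to the largest frequent itemset it contains. Stage 2 assumes that refinement means merging clusters whose keyword centroids are close in cosine distance; link intensity is left out entirely. The names `frequent_itemsets`, `initial_clusters`, `refine`, `min_support`, and `max_distance`, as well as the sample data, are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def frequent_itemsets(transactions, min_support):
    """Level-wise search for itemsets contained in >= min_support transactions."""
    items = sorted({t for tr in transactions for t in tr})
    frequent = {}
    current = [frozenset([i]) for i in items]
    while current:
        counts = {c: sum(1 for tr in transactions if c <= tr) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: union pairs of frequent k-itemsets into (k+1)-candidates.
        keys = list(level)
        size = len(keys[0]) + 1 if keys else 0
        current = list({a | b for a, b in combinations(keys, 2) if len(a | b) == size})
    return frequent


def initial_clusters(topics, min_support):
    """Stage 1 (assumed rule): group documents by the largest frequent topic itemset they contain."""
    freq = frequent_itemsets(topics, min_support)
    clusters = defaultdict(list)
    for doc_id, tr in enumerate(topics):
        covering = [s for s in freq if s <= tr]
        if covering:
            # Prefer larger itemsets, then higher support, then alphabetical order.
            label = max(covering, key=lambda s: (len(s), freq[s], sorted(s)))
        else:
            label = frozenset([f"_doc{doc_id}"])  # no frequent itemset: singleton cluster
        clusters[label].append(doc_id)
    return list(clusters.values())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def centroid(cluster, keywords):
    """Summed keyword vector of a cluster (cosine ignores the scale)."""
    total = Counter()
    for d in cluster:
        total.update(keywords[d])
    return total


def refine(clusters, keywords, max_distance):
    """Stage 2 (assumed rule): merge clusters whose keyword centroids are within max_distance."""
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            dist = 1.0 - cosine(centroid(clusters[i], keywords), centroid(clusters[j], keywords))
            if dist <= max_distance:
                clusters[i] += clusters.pop(j)
                merged = True
                break
    return [sorted(c) for c in clusters]


# Hypothetical toy data: topic term sets and keyword counts for six documents.
topics = [
    {"cluster", "mining"}, {"cluster", "mining"},
    {"group", "mining"}, {"group", "mining"},
    {"network", "routing"}, {"network", "routing"},
]
keywords = [
    {"document": 3, "cluster": 2}, {"document": 2, "cluster": 3},
    {"document": 3, "group": 1, "cluster": 1}, {"document": 2, "cluster": 2},
    {"packet": 3, "router": 2}, {"packet": 2, "router": 2},
]

initial = initial_clusters(topics, min_support=2)
final = refine(initial, keywords, max_distance=0.2)
print(initial)
print(final)
```

In this toy run the topic stage separates the "cluster" and "group" documents because their topic itemsets differ, and the keyword stage brings them back together because their keyword vectors nearly coincide. Whether the published algorithm merges, splits, or reassigns documents in its second level cannot be determined from the abstract alone.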
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.