
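Study on the Application of Fuzzy Clustering in Chinese Text Categorization (模糊聚类在中文文本分类中的应用研究)
Cite this article: Du Changhai (杜长海), Ji Genlin (吉根林). Study on the application of fuzzy clustering in Chinese text categorization [J]. Computer Engineering and Applications (计算机工程与应用), 2006, 42(8): 170-172, 177.
Authors: Du Changhai (杜长海), Ji Genlin (吉根林)
Affiliations: School of Mathematics and Computer Science, Nanjing Normal University, Nanjing 210097; Jiangsu Province Key Lab of Computer Information Processing, Suzhou University, Suzhou 215006
Abstract: This paper applies fuzzy clustering based on an equivalence relation to Chinese text categorization and proposes ATCFC, a Chinese text categorization algorithm based on fuzzy clustering. The algorithm segments text with a forward maximum matching algorithm built on a two-level character index, constructs a fuzzy feature vector space model, and measures the similarity between texts with the closeness-degree method. Dynamic clustering experiments on a text collection with ATCFC show that the algorithm is feasible and effective for Chinese text categorization.

Keywords: fuzzy clustering; text categorization; closeness degree; fuzzy equivalence matrix
Article ID: 1002-8331-(2006)08-0170-03
Received: 2005-08
Revised: 2005-08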
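
The abstract names forward maximum matching over a two-level character index as the segmentation step but gives no details. The sketch below is one plausible reading, not the authors' implementation: the index layout (first character, then second character, then the set of words), the `max_len` limit, and the toy dictionary are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(words):
    """Index multi-character words by first character, then by second character.

    Assumed layout: index[first_char][second_char] -> set of dictionary words.
    """
    index = defaultdict(lambda: defaultdict(set))
    for word in words:
        if len(word) >= 2:
            index[word[0]][word[1]].add(word)
    return index


def fmm_segment(text, index, max_len=4):
    """Forward maximum matching: at each position, take the longest dictionary word."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        match = text[i]  # fall back to a single character when nothing matches
        # The two-level lookup narrows the candidates before any substring test.
        if i + 1 < n and text[i] in index and text[i + 1] in index[text[i]]:
            candidates = index[text[i]][text[i + 1]]
            for length in range(min(max_len, n - i), 1, -1):
                if text[i:i + length] in candidates:
                    match = text[i:i + length]
                    break
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy dictionary, for illustration only.
toy_dict = ["模糊", "聚类", "模糊聚类", "中文", "文本", "分类", "文本分类"]
index = build_index(toy_dict)
tokens = fmm_segment("模糊聚类在中文文本分类中的应用", index)
print(tokens)
```

Because the longest candidate is tried first, "模糊聚类" and "文本分类" are kept whole rather than split into their two-character parts; characters with no dictionary entry are emitted one at a time.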

Study on Application of Fuzzy Clustering in Chinese Text Categorization
Du Changhai, Ji Genlin. Study on Application of Fuzzy Clustering in Chinese Text Categorization[J]. Computer Engineering and Applications, 2006, 42(8): 170-172, 177.
Authors: Du Changhai, Ji Genlin
Affiliation: School of Mathematics and Computer Science, Nanjing Normal University, Nanjing 210097; Jiangsu Province Key Lab of Information Processing, Suzhou University, Suzhou 215006
Abstract: This paper studies Chinese text categorization using fuzzy clustering based on an equivalence relation and proposes ATCFC, an algorithm for Chinese text categorization based on fuzzy clustering. The algorithm segments Chinese text with a forward maximum matching algorithm based on a two-level character index, builds a fuzzy feature vector space model, and measures the similarity between texts by the closeness-degree method. ATCFC is used to run a dynamic clustering experiment on a text set, and the experimental results show that it is feasible and effective for Chinese text categorization.
Keywords: fuzzy clustering; text categorization; closeness degree; fuzzy equivalence matrix
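
The clustering pipeline summarized in the abstract (closeness degree, fuzzy equivalence matrix, dynamic clustering) can be sketched as follows. This is a minimal illustration under stated assumptions, not the ATCFC algorithm itself: the abstract does not say which closeness formula is used, so the min-max closeness below is an assumed choice, and the membership vectors are made-up toy data. The transitive closure by repeated max-min composition and the λ-cut are the standard textbook construction for clustering with a fuzzy equivalence relation.

```python
def closeness(a, b):
    """Min-max closeness of two fuzzy vectors (assumed formula, not from the paper)."""
    num = sum(min(x, y) for x, y in zip(a, b))
    den = sum(max(x, y) for x, y in zip(a, b))
    return num / den if den else 1.0


def similarity_matrix(vectors):
    """Fuzzy similarity matrix: reflexive and symmetric, not necessarily transitive."""
    n = len(vectors)
    return [[closeness(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(r):
    """Square the relation until it stops changing; the result is a fuzzy equivalence matrix."""
    while True:
        squared = max_min_compose(r, r)
        if squared == r:  # exact comparison is safe: max/min only select existing entries
            return r
        r = squared


def lambda_cut_clusters(eq, lam):
    """Group items whose equivalence degree is at least lam."""
    clusters, assigned = [], set()
    for i in range(len(eq)):
        if i in assigned:
            continue
        group = [j for j in range(len(eq)) if eq[i][j] >= lam]
        assigned.update(group)
        clusters.append(group)
    return clusters


# Hypothetical membership degrees of four documents over three feature terms.
docs = [
    [0.9, 0.8, 0.0],
    [0.8, 0.9, 0.1],
    [0.1, 0.0, 0.9],
    [0.0, 0.1, 0.8],
]
eq = transitive_closure(similarity_matrix(docs))
for lam in (0.05, 0.5, 0.8):
    print(lam, lambda_cut_clusters(eq, lam))
```

Raising the threshold λ splits the collection into progressively finer groups, which is what makes the clustering "dynamic": on the toy data, λ = 0.05 yields one cluster, λ = 0.5 yields two, and λ = 0.8 yields three.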
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.