
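Automatic Chinese Text Categorization Based on Association Rule Mining
Cite this article: WANG Yuan-Zhen, QIAN Tie-yun, FENG Xiao-nian. Automatic Chinese Text Categorization Based on Association Rule Mining [J]. Mini-micro Systems, 2005, 26(8): 1380-1383.
Authors: WANG Yuan-Zhen  QIAN Tie-yun  FENG Xiao-nian
Affiliations: 1. Institute of Database and Multimedia Technology, School of Computer Science, Huazhong University of Science and Technology, Wuhan, Hubei 430074
2. Central China Branch, China Power Finance Co., Ltd., Wuhan, Hubei 430077
Funding: Supported by the Ministry of Science and Technology project "Research on Key Technologies and Application Systems for Science and Technology E-Government Systems" (2001BA110B01) and by the Specialized Research Fund for the Doctoral Program of Higher Education (200304870320)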
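Abstract: With the rapid growth of electronic publications and Internet documents, automatic document classification is becoming increasingly important. This paper proposes a method for automatic Chinese text categorization based on association rules. The algorithm treats each document as a transaction and each keyword as an item, and uses an improved association rule mining algorithm to mine the correlations between items and categories. The mined rules form a classifier that can be used to classify documents whose class labels are unknown. Experiments show that the algorithm obtains understandable rules fairly quickly and achieves good recall and precision.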
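The idea in the abstract (documents as transactions, keywords as items, rules of the form "keyword set => category") can be illustrated with a minimal sketch. This is not the authors' improved mining algorithm, which the abstract does not describe; it is a brute-force enumeration of small keyword sets. The function names, thresholds, and toy documents below are all assumptions made for illustration, and the keywords are assumed to have already been extracted by word segmentation.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(docs, min_support=0.3, min_confidence=0.6, max_len=2):
    """Mine rules (keyword set -> label) from (keywords, label) pairs.

    Brute-force sketch: counts every keyword set of up to max_len items.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(docs)
    itemset_counts = Counter()  # documents containing the keyword set
    rule_counts = Counter()     # documents containing the set AND carrying the label
    for keywords, label in docs:
        items = sorted(set(keywords))
        for size in range(1, max_len + 1):
            for itemset in combinations(items, size):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1

    rules = []
    for (itemset, label), hits in rule_counts.items():
        support = hits / n                          # share of all documents
        confidence = hits / itemset_counts[itemset]  # share of matching documents
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))
    # Most confident rules first; ties broken by support, then shorter sets.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0]))
    return rules


def classify(rules, keywords, default=None):
    """Return the label of the first (highest-ranked) rule the document satisfies."""
    present = set(keywords)
    for itemset, label, _support, _confidence in rules:
        if present.issuperset(itemset):
            return label
    return default


# Hypothetical toy corpus: each document is a transaction of keywords.
training_docs = [
    (["stock", "market", "bank"], "finance"),
    (["stock", "bank", "loan"], "finance"),
    (["market", "loan", "rate"], "finance"),
    (["match", "team", "goal"], "sports"),
    (["team", "coach", "goal"], "sports"),
    (["match", "coach", "market"], "sports"),
]
rules = mine_class_rules(training_docs)
print(classify(rules, ["stock", "bank", "rate"]))
print(classify(rules, ["goal", "coach"]))
```

Because each rule is just a keyword set paired with a category, the resulting classifier can be read directly, which is the sense in which the abstract calls the rules understandable. Taking the first matching rule is only one possible decision strategy; the source does not say how conflicting rules are resolved.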

Keywords: association based classification  Chinese text categorization  association rule mining
Article ID: 1000-1220(2005)08-1380-04
Received: 2004-02-19
Revised: 2004-02-19

Association Rules Based Automatic Chinese Text Categorization
WANG Yuan-Zhen,QIAN Tie-yun,FENG Xiao-nian.Association Rules Based Automatic Chinese Text Categorization[J].Mini-micro Systems,2005,26(8):1380-1383.
Authors:WANG Yuan-Zhen  QIAN Tie-yun  FENG Xiao-nian
Abstract:With the rapid expansion of electronic publications, automatic document classification is becoming increasingly important. This paper introduces a method, association-based document classification, into Chinese text categorization. In our algorithm, each document is viewed as a transaction and each keyword as an item; an association rule mining algorithm is then used to mine the correlations between items and categories. Finally, unlabeled documents are classified using the rules found. Experiments confirm that this method obtains understandable classifier rules quickly and achieves promising recall and precision.
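The recall and precision mentioned in the abstract are standard per-category measures. As a reminder of how they are computed, here is a small sketch; the function name and the label lists are hypothetical and are not results from the paper.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Precision and recall for one category, given parallel label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positives = sum(1 for t, p in pairs if t == target and p == target)
    predicted = sum(1 for _, p in pairs if p == target)  # documents assigned to target
    actual = sum(1 for t, _ in pairs if t == target)     # documents truly in target
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical labels for five test documents (illustrative only).
true_labels = ["finance", "finance", "sports", "sports", "finance"]
predicted_labels = ["finance", "sports", "sports", "sports", "finance"]
print(precision_recall(true_labels, predicted_labels, "finance"))
print(precision_recall(true_labels, predicted_labels, "sports"))
```

In this toy case every document predicted as "finance" is correct, but one finance document is missed, so precision is higher than recall for that category.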
Keywords:association based classification  Chinese text categorization  association rule mining
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.