Study on text categorization system based on rough set theory
Citation: MENG Tan, ZHANG Rong. Study on text categorization system based on rough set theory [J]. 河北机电学院学报, 2010(6): 414-416.
Authors: MENG Tan, ZHANG Rong
Affiliation: Kunming Survey and Design Institute of State Forestry Administration, Kunming, Yunnan 650216, China
Abstract: Text categorization is an important technique for automatically classifying large volumes of text data. This paper describes a categorization system for forestry text information built on rough set theory. Starting from a training set with known categories, the system analyzes the training samples to build a decision table, derives the discernibility matrix from that table, constructs the discernibility function, and simplifies the function to obtain a minimal attribute reduct. Finally, the Apriori algorithm is applied to generate the final table of classification rules, which is then used to categorize forestry text data automatically.

Keywords: rough sets; forestry text categorization; Apriori algorithm
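
The abstract summarizes the pipeline only in prose, so a minimal sketch of its rough-set steps may help. The toy decision table, attribute names, category labels, and function names below are hypothetical and are not taken from the paper. The sketch finds smallest reducts by brute force instead of algebraically simplifying the discernibility function, and it reads rules directly off the reduced table as a simple stand-in for the Apriori step the authors describe.

```python
from itertools import combinations

# Hypothetical toy decision table (not from the paper): each condition
# attribute flags whether a term occurs in a document; the second tuple
# element is the decision attribute (the document's category).
CONDITIONS = ["forest", "timber", "pest", "survey"]
TABLE = [
    ({"forest": 1, "timber": 1, "pest": 0, "survey": 0}, "harvest"),
    ({"forest": 1, "timber": 0, "pest": 1, "survey": 0}, "protection"),
    ({"forest": 1, "timber": 0, "pest": 0, "survey": 1}, "inventory"),
    ({"forest": 0, "timber": 1, "pest": 0, "survey": 0}, "harvest"),
    ({"forest": 0, "timber": 0, "pest": 1, "survey": 1}, "protection"),
]


def discernibility_matrix(table, conditions):
    """For every pair of objects with different decisions, collect the
    condition attributes on which the two objects differ."""
    entries = []
    for (x, dx), (y, dy) in combinations(table, 2):
        if dx != dy:
            diff = frozenset(a for a in conditions if x[a] != y[a])
            if diff:
                entries.append(diff)
    return entries


def smallest_reducts(entries, conditions):
    """Return the smallest attribute subsets that intersect every matrix
    entry. Each entry is one clause of the discernibility function, so a
    subset that hits all clauses satisfies the function. Brute force is
    only practical for a handful of attributes."""
    for size in range(1, len(conditions) + 1):
        found = [
            set(subset)
            for subset in combinations(conditions, size)
            if all(entry & set(subset) for entry in entries)
        ]
        if found:
            return found
    return []


def rules_from_reduct(table, reduct):
    """Project the table onto the reduct and keep each attribute-value
    pattern that maps to exactly one decision. This is a simple
    stand-in, not the Apriori algorithm."""
    attrs = sorted(reduct)
    decisions = {}
    for row, decision in table:
        key = tuple((a, row[a]) for a in attrs)
        decisions.setdefault(key, set()).add(decision)
    return {k: next(iter(v)) for k, v in decisions.items() if len(v) == 1}


matrix = discernibility_matrix(TABLE, CONDITIONS)
reducts = smallest_reducts(matrix, CONDITIONS)
rules = rules_from_reduct(TABLE, reducts[0])
for condition, category in rules.items():
    print(condition, "=>", category)
```

On this toy table, two of the four attributes are enough to separate all categories, which is the effect the reduct step is meant to achieve before rules are generated.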
This article is indexed in 维普 (VIP) and other databases.