A Brief Discussion of a Text Categorization System Based on Rough Set Theory |
| |
Cite this article: | MENG Tan, ZHANG Rong. Study on text categorization system based on rough set theory [J]. 河北机电学院学报, 2010(6): 414-416. |
| |
Authors: | MENG Tan (孟坛), ZHANG Rong (张蓉) |
| |
Affiliation: | Kunming Survey and Design Institute of State Forestry Administration, Kunming, Yunnan 650216, China |
| |
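Abstract: | Text categorization is an important technique for automatically classifying large volumes of text data. The forestry text information categorization system described here is built on rough set theory and starts from a training set whose categories are known. It analyzes the training samples, builds a decision table, derives the discernibility matrix, and constructs the discernibility function, which is then simplified to obtain a minimal attribute reduct. Finally, the Apriori algorithm is applied to generate the final table of classification rules, and this rule table is used to categorize forestry text information automatically.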
|
Keywords: | rough sets; forestry text information classification; Apriori algorithm |
Study on text categorization system based on rough set theory |
| |
Authors: | MENG Tan ZHANG Rong |
| |
Affiliation: | (Kunming Survey and Design Institute of State Forestry Administration, Kunming Yunnan 650216, China) |
| |
Abstract: | Text categorization is an important technique for automatically classifying large amounts of text data. We established a text categorization system based on rough set theory. Starting from a training set with known categories, we analyzed the training samples, built a decision table, derived the discernibility matrix from it, and constructed the discernibility function, which we then simplified to obtain a minimal attribute reduct. Finally, we used the Apriori algorithm to generate the final classification rule table, which can be used to categorize text data automatically. |
| |
Keywords: | rough sets; text categorization; Apriori algorithm |
This article is indexed in VIP (维普) and other databases. |
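
The pipeline summarized in the abstract (decision table, discernibility matrix, attribute reduct, rule table) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy decision table, the attribute names, and all function names are hypothetical. The reduct is found by brute-force search over attribute subsets rather than by symbolically simplifying the discernibility function, and the rule table is produced by simply projecting the training rows onto the reduct, a stand-in for the Apriori step, which is not reproduced here.

```python
from itertools import combinations

# Hypothetical toy decision table: (condition attributes, decision class).
# Attribute names and rows are invented for illustration only.
ATTRS = ["forest", "pest", "timber", "fire"]
TABLE = [
    ({"forest": 1, "pest": 1, "timber": 0, "fire": 0}, "protection"),
    ({"forest": 1, "pest": 0, "timber": 1, "fire": 0}, "industry"),
    ({"forest": 1, "pest": 0, "timber": 0, "fire": 1}, "protection"),
    ({"forest": 0, "pest": 0, "timber": 1, "fire": 0}, "industry"),
]


def discernibility_matrix(table, attrs):
    """For each pair of rows with different decisions, collect the attributes that differ."""
    entries = []
    for (x, dx), (y, dy) in combinations(table, 2):
        if dx != dy:
            diff = frozenset(a for a in attrs if x[a] != y[a])
            if diff:  # an empty set would signal an inconsistent table
                entries.append(diff)
    return entries


def smallest_reducts(entries, attrs):
    """Smallest attribute subsets that intersect every matrix entry (brute force)."""
    for size in range(1, len(attrs) + 1):
        found = [set(c) for c in combinations(attrs, size)
                 if all(set(c) & e for e in entries)]
        if found:
            return found
    return []


def rules_from_reduct(table, reduct):
    """Project each training row onto the reduct to obtain a simple rule table."""
    rules = {}
    for row, decision in table:
        key = tuple((a, row[a]) for a in sorted(reduct))
        rules[key] = decision
    return rules


def classify(rules, reduct, doc):
    """Look up the decision for a new document; None if no rule matches."""
    key = tuple((a, doc[a]) for a in sorted(reduct))
    return rules.get(key)


entries = discernibility_matrix(TABLE, ATTRS)
reducts = smallest_reducts(entries, ATTRS)
rules = rules_from_reduct(TABLE, reducts[0])
print(reducts, rules)
```

In this toy table the single attribute `timber` already separates the two classes, so the search stops at subsets of size one. A real system would need to handle inconsistent tables and much larger attribute sets, for which brute-force search is impractical.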
|