
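Rough Set Rule Matching Algorithm and Its Application in Text Classification
Cite this article: ZHU Min-Ling, WU Hai-Meng, SHI Lei. Rough set rule matching algorithm and its application in text classification[J]. Computer Systems & Applications (计算机系统应用), 2018, 27(4): 131-137.
Authors: ZHU Min-Ling (朱敏玲)  WU Hai-Meng (吴海艋)  SHI Lei (石磊)
Affiliations: Computer School, Beijing Information Science and Technology University, Beijing 100101; Computer School, Beijing Information Science and Technology University, Beijing 100101; Institute of Software, Chinese Academy of Sciences, Beijing 100190
Funding: National Natural Science Foundation of China (11401031); Beijing Information Science and Technology University "实培计划" project, 2016-2017 academic year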
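Abstract: To improve the performance of Chinese text classification, a rule matching method based on rough set theory is proposed. During text feature extraction, the CHI statistical method is modified, and the weights of the feature terms are scaled and discretized. A discernibility matrix is used to carry out rough-set attribute reduction and rule extraction, and a rule pre-check step is used to optimize the decision parameters of rule matching. Experimental results show that the improved rule matching method achieves higher classification accuracy and also gives decent results when little training data is available.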

Keywords: rough set  Chinese text classification  attribute reduction  rule extraction  rule matching
Received: 2017/7/16
Revised: 2017/7/28

Rough Set Rule Matching Method and its Application in Text Categorization
ZHU Min-Ling, WU Hai-Meng and SHI Lei. Rough Set Rule Matching Method and its Application in Text Categorization[J]. Computer Systems & Applications, 2018, 27(4): 131-137.
Authors: ZHU Min-Ling  WU Hai-Meng and SHI Lei
Affiliation: Computer School, Beijing Information Science and Technology University, Beijing 100101, China; Computer School, Beijing Information Science and Technology University, Beijing 100101, China; and Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Abstract: To improve the performance of Chinese text classification, a rule matching method based on rough set theory is proposed in this study. During the extraction of textual features, the CHI statistical method is improved and the feature weights are scaled and discretized. A discernibility matrix is used to perform rough-set attribute reduction and rule extraction, and a rule pre-test method is used to optimize the decision parameters of rule matching. The experimental results show that the improved matching method achieves higher categorization accuracy and that it also gives decent results when little training data is available.
Keywords: rough set  Chinese text classification  attribute reduction  rule extraction  rule matching
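
The attribute reduction and rule extraction steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the toy decision table, the function names (`discernibility_matrix`, `greedy_reduct`, `extract_rules`), and the greedy covering heuristic are all assumptions made for illustration, and the paper's modified CHI weighting and rule pre-test step are not reproduced here.

```python
from itertools import combinations

# Hypothetical toy decision table: each row holds the discretized weights
# of three feature terms for one document; LABELS holds its category.
ROWS = [(1, 0, 2), (1, 1, 2), (0, 0, 2), (0, 1, 2)]
LABELS = ["sports", "finance", "sports", "finance"]


def discernibility_matrix(rows, labels):
    """For every pair of objects with different labels, collect the set of
    attribute indices on which the two objects take different values."""
    entries = []
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] != labels[j]:
            diff = frozenset(
                k for k, (a, b) in enumerate(zip(rows[i], rows[j])) if a != b
            )
            if diff:  # an empty set would mean an inconsistent pair; skip it
                entries.append(diff)
    return entries


def greedy_reduct(entries):
    """Greedy covering heuristic (an assumption, not the paper's method):
    repeatedly pick the attribute that occurs in the most uncovered entries."""
    remaining = list(entries)
    chosen = []
    while remaining:
        counts = {}
        for entry in remaining:
            for attr in entry:
                counts[attr] = counts.get(attr, 0) + 1
        # Highest count wins; ties are broken by the smaller attribute index.
        best = min(counts, key=lambda a: (-counts[a], a))
        chosen.append(best)
        remaining = [e for e in remaining if best not in e]
    return sorted(chosen)


def extract_rules(rows, labels, reduct):
    """Project each object onto the reduct and keep only the value
    combinations that map to a single label (consistent rules)."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in reduct)
        seen.setdefault(key, set()).add(label)
    return {k: next(iter(v)) for k, v in seen.items() if len(v) == 1}


reduct = greedy_reduct(discernibility_matrix(ROWS, LABELS))
rules = extract_rules(ROWS, LABELS, reduct)
print(reduct)  # [1]
print(rules)   # {(0,): 'sports', (1,): 'finance'}
```

In this toy table only the second feature separates the two categories, so the reduct keeps that single attribute and each of its values yields one rule. A new document would then be classified by looking up its reduct values in `rules`.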