
Research on Rough Set Method of Text Rule Mining Based on GMM
Cite this article: HONG Zhuangzhuang, HUANG Zhaohua, WAN Zhongbao, ZHANG Wei, GAO Mengxi. Research on Rough Set Method of Text Rule Mining Based on GMM[J]. Journal of Chinese Information Processing, 2020, 34(2): 56-62.
Authors: HONG Zhuangzhuang, HUANG Zhaohua, WAN Zhongbao, ZHANG Wei, GAO Mengxi
Affiliation: Department of Software Engineering, East China Jiaotong University, Nanchang, Jiangxi 330013, China
Funding: National Key R&D Program of China (2018YFC0831106); Natural Science Foundation of Jiangxi Province (20122BAB201040)
Abstract: Domain texts are characterized by complex structure, high similarity, and dynamic change, and they contain mixed data in which continuous and discrete attributes coexist. These properties limit, to some extent, the efficiency with which knowledge discovery methods can mine text rules. To address this problem, this paper proposes a text rule mining method based on the Gaussian Mixture Model (GMM) and rough sets. The method first constructs an information table according to the attribute types of the target data. It then applies the GMM clustering algorithm to partition the continuous data into clusters, uses that partition to discretize the data and reduce the number of states, and generates a decision table. Finally, it uses rough set theory to perform attribute reduction on the decision table and extracts decision rules from the reduced table. Experimental results show that, compared with traditional methods, the proposed method achieves higher extraction precision and stronger attribute reduction ability, with an average precision and F1 score of 95.0% and 95.7%, respectively.

Keywords: hybrid data; rule mining; Gaussian Mixture Model; rough set; attribute reduction; decision rule
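
The three stages named in the abstract (GMM-based discretization, attribute reduction, rule extraction) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the paper's implementation: the toy table and its attribute names (`temp`, `cough`, `ward`, `flu`) are hypothetical, the 1-D GMM is fitted with plain EM, and the reduct uses a simple backward elimination that keeps the positive region unchanged. The abstract does not specify the authors' actual algorithms or their state reduction step.

```python
import math
from collections import defaultdict

def fit_gmm_1d(xs, k=2, iters=100):
    """Fit a 1-D Gaussian mixture with plain EM; returns (weights, means, variances)."""
    n, ordered = len(xs), sorted(xs)
    # Deterministic init: means at evenly spaced quantiles, shared overall variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    mu = sum(xs) / n
    variances = [sum((x - mu) ** 2 for x in xs) / n or 1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp = []  # E-step: posterior probability of each component per point
        for x in xs:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        for j in range(k):  # M-step: re-estimate weight, mean, variance
            nj = sum(r[j] for r in resp) or 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances

def discretize(xs, k=2):
    """Replace each value by the rank of its most likely component (0 = lowest mean)."""
    weights, means, variances = fit_gmm_1d(xs, k)
    rank = {j: r for r, j in enumerate(sorted(range(k), key=lambda j: means[j]))}
    labels = []
    for x in xs:
        scores = [math.log(w) - 0.5 * math.log(v) - (x - m) ** 2 / (2 * v)
                  for w, m, v in zip(weights, means, variances)]
        labels.append(rank[max(range(k), key=scores.__getitem__)])
    return labels

def positive_region_size(rows, attrs, decision):
    """Count rows whose equivalence class under `attrs` has a single decision value."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[decision])
    return sum(len(ds) for ds in blocks.values() if len(set(ds)) == 1)

def reduct(rows, attrs, decision):
    """Backward elimination: drop an attribute if the positive region is unchanged."""
    full = positive_region_size(rows, attrs, decision)
    kept = list(attrs)
    for a in attrs:
        trial = [b for b in kept if b != a]
        if positive_region_size(rows, trial, decision) == full:
            kept = trial
    return kept

def extract_rules(rows, attrs, decision):
    """One rule per consistent equivalence class of the reduced table."""
    blocks = defaultdict(set)
    for row in rows:
        blocks[tuple((a, row[a]) for a in attrs)].add(row[decision])
    return {cond: next(iter(ds)) for cond, ds in blocks.items() if len(ds) == 1}

# Hypothetical mixed table: one continuous attribute, two discrete ones.
temps = [36.5, 36.8, 37.0, 36.6, 39.1, 39.5, 38.9, 39.3]
cough = ["no", "yes", "no", "yes", "no", "yes", "no", "yes"]
ward = ["a", "a", "b", "b", "a", "a", "b", "b"]
flu = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]

labels = discretize(temps, k=2)  # continuous column -> cluster index
rows = [{"temp": t, "cough": c, "ward": w, "flu": f}
        for t, c, w, f in zip(labels, cough, ward, flu)]
kept = reduct(rows, ["temp", "cough", "ward"], "flu")
rules = extract_rules(rows, kept, "flu")
print(labels, kept, rules)
```

In this toy table the decision depends only on the discretized temperature, so the sketch drops the two discrete attributes and keeps one rule per temperature cluster. A backward-elimination reduct depends on the order in which attributes are tried, so other orders or other reduct algorithms may return a different (equally valid) reduct on real data.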
This article is indexed in databases including VIP (维普).