Attribute Reduction Algorithm Based on Optimized Discernibility Matrix and Improving Discernibility Information Tree
Cite this article: XU Yi, TANG Jing-xin. Attribute Reduction Algorithm Based on Optimized Discernibility Matrix and Improving Discernibility Information Tree[J]. Computer Science, 2020, 47(3): 73-78
Authors: XU Yi  TANG Jing-xin
Affiliation: Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University, Hefei 230039, China; College of Computer Science and Technology, Anhui University, Hefei 230601, China
Funding: National Natural Science Foundation of China; Natural Science Foundation of Anhui Province

Abstract: A discernibility matrix records, in its entries, the information that distinguishes every pair of objects in an information system, which offers a useful basis for studying attribute reduction. The traditional discernibility matrix, however, uses the core attributes to eliminate redundant entries only after construction is finished, ignoring the role the core attributes can play while the matrix is being built. To address this problem, this paper makes the following contributions. First, the construction of the discernibility matrix is optimized: before the distinguishing information of any two objects is computed, their values on the core attributes are compared, and if the values differ, the corresponding entry is recorded directly as Φ and the other condition attributes are not examined. Second, the concept of attribute weighted importance is proposed. It combines the proportion of non-empty entries in the discernibility matrix that contain each condition attribute (called macro importance) with the contribution each attribute makes to distinguishing objects (called micro importance), and an example illustrates that the measure is reasonable. Third, because the optimized matrix still contains many redundant entries and empty sets, a discernibility information tree based on the optimized discernibility matrix and attribute weighted importance is proposed, building on the concept of the discernibility information tree. All non-empty entries of the optimized discernibility matrix are sorted by attribute weighted importance, so that attributes of high importance are shared by more nodes. During construction, each entry that contains no core attribute is mapped to a path in the tree, while entries that contain core attributes are ignored. Finally, HSDI-tree, a reduction algorithm based on the optimized discernibility matrix and the improved discernibility information tree, is proposed. On five UCI data sets, the reduction results and node counts of HSDI-tree are compared with those of the CDI-tree, DI-tree and IDI-tree algorithms. The experimental results show that HSDI-tree effectively finds the minimum attribute reduction and has better space compression ability.

Keywords: Rough set  Attribute importance  Discernibility matrix  Attribute reduction  Discernibility information tree
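
The construction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' HSDI-tree implementation, and it rests on several assumptions: the function names (`optimized_matrix`, `macro_importance`, `build_tree`, `count_nodes`) and the toy data are hypothetical; the core attributes are assumed to be known in advance and passed in; only the macro importance is used as the ordering key, because the abstract gives neither the micro importance formula nor the weighting; and the tree is a plain prefix tree that leaves out any pruning rules the paper's discernibility information tree may apply.

```python
from collections import Counter
from itertools import combinations


def optimized_matrix(rows, attrs, core):
    """Pairwise discernibility entries with the core-attribute shortcut."""
    entries = {}
    for (i, x), (j, y) in combinations(enumerate(rows), 2):
        if any(x[a] != y[a] for a in core):
            # The objects differ on a core attribute: record the empty
            # set and skip the remaining condition attributes.
            entries[(i, j)] = frozenset()
        else:
            entries[(i, j)] = frozenset(a for a in attrs if x[a] != y[a])
    return entries


def macro_importance(entries):
    """Share of non-empty entries that contain each attribute."""
    non_empty = [e for e in entries if e]
    counts = Counter(a for e in non_empty for a in e)
    return {a: n / len(non_empty) for a, n in counts.items()}


def build_tree(entries, score, core=frozenset()):
    """Nested-dict prefix tree with one path per kept entry."""
    root = {}
    for e in entries:
        # Skip empty entries and entries containing a core attribute.
        # After the shortcut above no such entry remains; the check
        # matters for matrices built without it.
        if not e or e & core:
            continue
        node = root
        # Higher-scoring attributes come first so that they sit near
        # the root; ties are broken by attribute name (an assumption).
        for a in sorted(e, key=lambda a: (-score[a], a)):
            node = node.setdefault(a, {})
    return root


def count_nodes(tree):
    """Number of nodes in the tree, excluding the root."""
    return sum(1 + count_nodes(child) for child in tree.values())


# Toy information system (hypothetical data); "a" is assumed to be core.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 0, "b": 1, "c": 1, "d": 0},
    {"a": 1, "b": 0, "c": 0, "d": 0},
]
attrs = ["a", "b", "c", "d"]
core = {"a"}

matrix = optimized_matrix(rows, attrs, core)
score = macro_importance(matrix.values())
tree = build_tree(matrix.values(), score, core)
print(count_nodes(tree))
```

In this sketch, entries that share a leading attribute share the corresponding node, which is the mechanism behind the node-count comparison reported in the abstract; the figures it produces on toy data say nothing about the paper's experimental results.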