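Knowledge reduction in Pawlak rough sets covers both decision tables and information tables. Neighborhood rough sets, an extension of Pawlak rough sets, are widely used for attribute reduction on decision tables but rarely for attribute reduction on information tables. To design an attribute reduction algorithm suited to information tables, the authors first propose an information-table knowledge reduction criterion for neighborhood rough sets, modeled on the corresponding Pawlak criterion. Building on this criterion and a greedy strategy, they then propose an information-table attribute reduction algorithm suited to clustering tasks. In experiments against principal component analysis (PCA), the reduced attribute sets produced by this algorithm contain more attributes, and K-means clustering on those attribute sets achieves higher accuracy. The results indicate that the algorithm can be applied effectively to attribute reduction on information tables.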
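
The abstract does not state the reduction criterion itself, so the following is only a minimal sketch under a loud assumption: an attribute subset counts as acceptable when it leaves every object's δ-neighborhood exactly as it is under the full attribute set (the neighborhood analogue of preserving the Pawlak indiscernibility relation). The Chebyshev distance, the backward-elimination strategy, and the names `neighborhoods` and `backward_reduct` are illustrative choices, not the paper's algorithm.

```python
def neighborhoods(data, attrs, delta):
    """For each object, the set of object indices within Chebyshev distance
    `delta` when only the columns in `attrs` are compared."""
    n = len(data)
    return [
        frozenset(
            j for j in range(n)
            if all(abs(data[i][a] - data[j][a]) <= delta for a in attrs)
        )
        for i in range(n)
    ]


def backward_reduct(data, attrs, delta):
    """Greedy backward elimination (assumed criterion): drop an attribute
    whenever the neighborhoods stay identical to those of the full set."""
    full = neighborhoods(data, attrs, delta)
    kept = list(attrs)
    for a in attrs:
        trial = [b for b in kept if b != a]
        if trial and neighborhoods(data, trial, delta) == full:
            kept = trial
    return kept


# Hypothetical toy information table: four objects, three numeric attributes.
data = [
    (0.00, 0.10, 5.00),
    (0.05, 0.90, 5.10),
    (1.00, 0.15, 5.05),
    (1.02, 0.95, 4.95),
]
reduct = backward_reduct(data, [0, 1, 2], delta=0.2)
print(reduct)  # column 2 separates nothing that columns 0 and 1 do not
```

The retained columns could then be passed to any clustering routine, such as K-means, in place of the full table.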

10.

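Application of Attribute Reduction to the Analysis of University Graduate Employment Decisions

Yang Fei (杨飞), Dai Guangzhen (代广珍). Computer Technology and Development (《计算机技术与发展》), 2007, 17(7): 223-225, 229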

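Rough set theory is a mathematical tool that takes a new approach to studying imprecise and uncertain knowledge, and computing attribute reductions is one of its central problems. The paper describes the concepts behind rough-set-based attribute reduction, including the core, reducts, and classification accuracy. After analyzing several attribute reduction algorithms, it proposes one that combines the discernibility matrix with logical operations. Working with a university management information system, the authors use the algorithm to extract data related to student employment, and present both the dependencies between the condition factors that affect employment and the students' job direction and the reduced data table. Rules derived from the reduced table support the paper's conclusions, and the authors report good results.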
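
To make the discernibility-matrix idea concrete, here is a minimal sketch, not the paper's algorithm: it builds the matrix entries for a decision table, reads the core off the singleton entries, and then uses a greedy cover in place of the logical simplification the abstract mentions. The table, its attribute names, and the function names are hypothetical, and the greedy result hits every entry but is not guaranteed to be a minimal reduct.

```python
from itertools import combinations


def discernibility_entries(rows, cond, dec):
    """Non-empty discernibility-matrix entries: for each pair of objects with
    different decisions, the condition attributes on which they differ."""
    entries = []
    for x, y in combinations(rows, 2):
        if x[dec] != y[dec]:
            diff = frozenset(a for a in cond if x[a] != y[a])
            if diff:  # inconsistent pairs give an empty set and are skipped
                entries.append(diff)
    return entries


def core(entries):
    """Core attributes are exactly those that occur as singleton entries."""
    return {next(iter(e)) for e in entries if len(e) == 1}


def greedy_cover(entries):
    """Start from the core, then keep adding the attribute that occurs in the
    most entries not yet hit (ties broken alphabetically)."""
    chosen = set(core(entries))
    remaining = [e for e in entries if not (e & chosen)]
    while remaining:
        counts = {}
        for e in remaining:
            for a in e:
                counts[a] = counts.get(a, 0) + 1
        best = max(sorted(counts), key=counts.get)
        chosen.add(best)
        remaining = [e for e in remaining if best not in e]
    return chosen


# Hypothetical employment table (values invented for illustration only).
rows = [
    {"major": "CS", "gpa": "high", "intern": "yes", "job": "industry"},
    {"major": "CS", "gpa": "high", "intern": "no", "job": "grad"},
    {"major": "Math", "gpa": "low", "intern": "yes", "job": "industry"},
    {"major": "Math", "gpa": "high", "intern": "no", "job": "grad"},
    {"major": "Math", "gpa": "low", "intern": "no", "job": "industry"},
]
entries = discernibility_entries(rows, ["major", "gpa", "intern"], "job")
print(core(entries), greedy_cover(entries))
```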

11.

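A Polynomial-Time Algorithm for Finding a Minimum-Cardinality Candidate Key of a Relation Scheme (total citations: 1; self-citations: 1; citations by others: 0)

Hao Zhongxiao (郝忠孝), Liu Guohua (刘国华). Journal of Computer Research and Development (《计算机研究与发展》), 1995, 32(2): 27-33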

12.

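Strong Functional Dependencies and Partial Order Structures

Zhou Dingkang (周定康). Journal of Computer Research and Development (《计算机研究与发展》), 1993, 30(4): 10-15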

13.

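Incremental Attribute Reduction for Decision Tables Based on an Improved Discernibility Matrix (total citations: 2; self-citations: 0; citations by others: 2)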

Liu Gaofeng (刘高峰), Mu Lianming (牟廉明), Zhang Tao (张涛). Computer Engineering (《计算机工程》), 2010, 36(20): 46-48

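For decision tables whose attribute set keeps growing, the paper proposes an incremental attribute reduction algorithm that computes reductions quickly and accurately. It takes the positive region as the reduction criterion and follows a greedy strategy: attributes are selected by their discernibility power to build an approximate reduction step by step, unnecessary attributes are then removed, and what remains is the final attribute reduction. Complexity analysis and tests on experimental data show that the algorithm has low complexity and produces accurate reductions.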
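
The two phases described above (greedy construction, then pruning) can be sketched as follows. This is a plain, non-incremental illustration under stated assumptions: it scores an attribute by the size of the positive region it yields rather than by the paper's discernibility measure, and the table and the names `positive_region_size` and `greedy_reduct` are hypothetical.

```python
from collections import defaultdict


def positive_region_size(rows, attrs, dec):
    """Number of objects whose equivalence class under `attrs` carries a
    single decision value."""
    classes = defaultdict(list)
    for r in rows:
        classes[tuple(r[a] for a in attrs)].append(r[dec])
    return sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)


def greedy_reduct(rows, cond, dec):
    """Forward greedy selection until the positive region of the full
    condition set is reached, followed by removal of redundant attributes."""
    target = positive_region_size(rows, cond, dec)
    chosen = []
    while positive_region_size(rows, chosen, dec) < target:
        best = max(
            (a for a in cond if a not in chosen),
            key=lambda a: positive_region_size(rows, chosen + [a], dec),
        )
        chosen.append(best)
    for a in list(chosen):  # pruning phase
        rest = [b for b in chosen if b != a]
        if positive_region_size(rows, rest, dec) == target:
            chosen = rest
    return chosen


# Hypothetical decision table: condition attributes a, b, c; decision d.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
result = greedy_reduct(rows, ["a", "b", "c"], "d")
print(result)
```

In this toy table `c` duplicates `a`, so the sketch stops once `a` and `b` together reproduce the full positive region.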

14.

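A Hypergraph-Based Method for Finding All Candidate Keys of a Relation Scheme (total citations: 1; self-citations: 0; citations by others: 1)

Hao Zhongxiao (郝忠孝), Guo Jingfeng (郭景峰). Chinese Journal of Computers (《计算机学报》), 1992, 15(4): 264-270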

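The paper discusses in detail the hypergraph-based theory of candidate keys of relation schemes and gives the corresponding theorems. It solves the problem of finding all candidate keys of a relation scheme and presents a new recursive algorithm for computing them.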
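
For readers unfamiliar with the problem, the sketch below states it in code. It is not the hypergraph method of the paper: it is the textbook brute-force approach, which enumerates attribute subsets by increasing size and keeps those whose closure under the functional dependencies covers the whole scheme. The scheme, the dependencies, and the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs)
    pairs of frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole scheme.
    Exponential in the number of attributes; for illustration only."""
    schema = frozenset(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = frozenset(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(s, fds) >= schema:
                keys.append(s)
    return keys


# Hypothetical scheme R(A, B, C, D) with A -> B, B -> C, CD -> A.
fds = [
    (frozenset("A"), frozenset("B")),
    (frozenset("B"), frozenset("C")),
    (frozenset("CD"), frozenset("A")),
]
keys = candidate_keys("ABCD", fds)
print([sorted(k) for k in keys])
```

Because subsets are visited in order of size, the first key found also has minimum cardinality.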

15.

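A Clustering Ensemble Algorithm for Mixed Attributes Based on Relative Density and Entropy

Yu Ze (余泽). Computer Systems & Applications (《计算机系统应用》), 2014, 23(12): 125-130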

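Clustering of mixed-attribute data has been a research hotspot in recent years. An algorithm for such data must handle both numeric and categorical attributes well, yet many existing algorithms fail to balance the two and therefore produce unsatisfactory clusterings. For mixed attributes, the paper proposes an intersection-based clustering ensemble algorithm: a relative-density-based algorithm handles the numeric attributes on their own, an information-entropy-based algorithm handles the categorical attributes, and an intersection-based fusion step combines the two clustering members into the final clustering. The algorithm is validated on the UCI Zoo dataset and compared with the existing k-prototypes and EM algorithms; it outperforms both in clustering accuracy. The paper also discusses how the value of the intersection element ratio in the fusion algorithm affects the results.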

16.

Cluster center initialization algorithm for K-modes clustering

Shehroz S. Khan Amir Ahmad 《Expert systems with applications》2013,40(18):7444-7456

Partitional clustering of categorical data is normally performed with the K-modes clustering algorithm, which works well for large datasets. Although the K-modes algorithm is simple and efficient to design and implement, it chooses the initial cluster centers at random on every new execution, which may lead to non-repeatable clustering results. This paper addresses the random center initialization problem of the K-modes algorithm by proposing a cluster center initialization algorithm. The proposed algorithm performs multiple clusterings of the data based on attribute values in different attributes and yields deterministic modes that are used as initial cluster centers. In the paper, we propose a new method for selecting the most relevant attributes, namely Prominent attributes, compare it with an existing method for finding Significant attributes for unsupervised learning, and perform multiple clusterings of the data to find initial cluster centers. The proposed algorithm ensures fixed initial cluster centers and thus repeatable clustering results. The worst-case time complexity of the proposed algorithm is log-linear in the number of data objects. We evaluate the proposed algorithm on several categorical datasets, compare it against random initialization and two other initialization methods, and show that the proposed method performs better in terms of accuracy and time complexity. The initial cluster centers computed by the proposed approach are close to the actual cluster centers of the datasets we tested, which leads to faster convergence of the K-modes clustering algorithm together with better clustering results.