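Two-Stage Attribute Weighting IB Algorithm for Non-Co-occurrence Data
Cite this article: JI Bo, YE Yang-dong. Two-Stage Attribute Weighting IB Algorithm for Non-Co-occurrence Data[J]. 小型微型计算机系统 (Mini-micro Systems), 2012, 0(10): 2278-2282
Authors: JI Bo  YE Yang-dong
Affiliation: Department of Computer Science and Technology, School of Information Engineering, Zhengzhou University
Funding: Supported by the National Natural Science Foundation of China (60773048, 61170223)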
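Abstract: Non-co-occurrence data are data that do not follow a joint probability distribution but instead follow an unknown function. Once such data are converted to co-occurrence form, entropy can be used to measure information quantitatively and to cluster the data. Existing algorithms, however, assume that every attribute contributes equally to clustering and ignore the different effects that representative attributes and irrelevant (redundant) attributes have on the clustering result. This paper therefore proposes a two-stage attribute weighting IB algorithm for non-co-occurrence data (TSAW-sIB). In the two stages of the co-occurrence transformation, the algorithm views the data from three perspectives ("non-co-occurrence", "co-occurrence", and "joint"), emphasizes representative attributes, and suppresses redundant attributes, which yields a data representation that reflects the characteristics of non-co-occurrence data more accurately and is then clustered. Experiments show that TSAW-sIB outperforms the ROCK, COOLCAT, and LIMBO algorithms.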

Keywords: non-co-occurrence data  feature weighting  two-stage  information bottleneck method  clustering

Two-stage Attribute Weighting IB Algorithm for Non Co-occurrence Data
JI Bo, YE Yang-dong. Two-stage Attribute Weighting IB Algorithm for Non Co-occurrence Data[J]. Mini-micro Systems, 2012, 0(10): 2278-2282
Authors: JI Bo  YE Yang-dong
Affiliation: School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China
Abstract: Non co-occurrence data does not appear as co-occurrences of two variables X, Y, but rather as a sample of values of an unknown function Z(X,Y). A co-occurrence transformation of non co-occurrence data is therefore necessary before clustering based on Shannon entropy. However, existing clustering algorithms treat all features alike and assign them equal weights. This paper therefore proposes a two-stage attribute weighting IB algorithm (TPAW-sIB). At the two stages of the co-occurrence transformation, it highlights representative features and suppresses irrelevant features from three viewpoints: non co-occurrence, co-occurrence, and both combined. Experiments show that the TPAW-sIB algorithm is superior to the ROCK, COOLCAT, and LIMBO algorithms.
Keywords: non co-occurrence  feature weighting  two stage  information bottleneck  clustering
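
The abstract describes turning non co-occurrence data into a co-occurrence form so that entropy-based measures apply, with per-attribute weights instead of equal ones. The sketch below illustrates only that general idea and is not the authors' algorithm: it assumes categorical records, treats each (attribute, value) pair as a column, and scales each attribute's entries by a hypothetical weight before normalizing to a joint distribution p(x, y). The function names `to_joint` and `mutual_information`, the toy records, and the weight values are all assumptions made for illustration.

```python
import math


def to_joint(records, weights=None):
    """Build a joint distribution p(x, y) from categorical records.

    x indexes records; y indexes (attribute, value) pairs.
    `weights` holds one hypothetical non-negative weight per attribute;
    equal weights are used when it is omitted.
    """
    n_attrs = len(records[0])
    weights = weights or [1.0] * n_attrs
    # One column per distinct (attribute index, value) pair.
    columns = sorted({(a, rec[a]) for rec in records for a in range(n_attrs)})
    col_index = {col: j for j, col in enumerate(columns)}
    joint = [[0.0] * len(columns) for _ in records]
    for i, rec in enumerate(records):
        for a, value in enumerate(rec):
            joint[i][col_index[(a, value)]] = weights[a]
    total = sum(map(sum, joint))
    return [[v / total for v in row] for row in joint], columns


def mutual_information(joint):
    """Return I(X;Y) in bits for a joint distribution given as a matrix."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi


# Toy data: four records with two categorical attributes (colour, size).
records = [("red", "small"), ("red", "large"),
           ("blue", "small"), ("blue", "large")]

joint, columns = to_joint(records)                       # equal weights
mi_uniform = mutual_information(joint)

weighted_joint, _ = to_joint(records, weights=[2.0, 1.0])  # assumed weights
mi_weighted = mutual_information(weighted_joint)

print(columns)
print(mi_uniform, mi_weighted)
```

A sequential IB clustering step would then merge records so as to preserve as much of I(X;Y) as possible; how the weights are chosen at each of the two stages is specific to the paper and is not reproduced here.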
This article is indexed in CNKI and other databases.