
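Research on the initialization of subspace clustering algorithms for binary data*
Cite this article: XIA Ying, LU Ning, FENG Jiang-fan. Research on the initialization of subspace clustering algorithms for binary data*[J]. Application Research of Computers (计算机应用研究), 2009, 26(1): 47-49.
Authors: XIA Ying  LU Ning  FENG Jiang-fan
Affiliation: Sino-Korean Cooperative Institute of Spatial Information Systems, Chongqing University of Posts and Telecommunications, Chongqing 400065
Funding: Supported by the National "863" Program of China (2007AA12Z238)
Abstract: The finite mixture of Bernoulli distributions model, proposed to address the high dimensionality and sparsity of binary data spaces, provides a model framework for finding projected clusters quickly. The EM algorithm is an important method for iteratively estimating the parameters of such a model, and the quality of its result depends to a large extent on the initial parameters. Clustering with a finite mixture of Bernoulli distributions fitted by EM has been studied extensively, but the choice of parameters in EM directly affects clustering performance. This paper introduces the Binning method and changes both the similarity measure between data points and the way center points are selected during initialization, which greatly reduces the dependence of the clustering result on the initial parameters. Experiments show that the algorithm is efficient and correct.

Keywords: subspace clustering  binary data  finite mixture of Bernoulli distributions model  EM algorithm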
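
For orientation, the textbook form of a finite mixture of Bernoulli distributions and its EM updates can be written as follows. This is the standard formulation, not necessarily the exact notation of the paper; the symbols $\pi_j$, $\mu_{jd}$, and $\gamma_{ij}$ are assumptions introduced here for illustration.

```latex
% Mixture density for a binary vector x in {0,1}^D with K components
p(x \mid \Theta) = \sum_{j=1}^{K} \pi_j \prod_{d=1}^{D} \mu_{jd}^{x_d} (1 - \mu_{jd})^{1 - x_d},
\qquad \sum_{j=1}^{K} \pi_j = 1

% E-step: responsibility of component j for data point x_i
\gamma_{ij} = \frac{\pi_j \prod_{d} \mu_{jd}^{x_{id}} (1 - \mu_{jd})^{1 - x_{id}}}
                   {\sum_{l=1}^{K} \pi_l \prod_{d} \mu_{ld}^{x_{id}} (1 - \mu_{ld})^{1 - x_{id}}}

% M-step: re-estimate mixing weights and Bernoulli parameters from n points
\pi_j = \frac{1}{n} \sum_{i=1}^{n} \gamma_{ij},
\qquad
\mu_{jd} = \frac{\sum_{i=1}^{n} \gamma_{ij} \, x_{id}}{\sum_{i=1}^{n} \gamma_{ij}}
```

Because the iteration starts from some initial $\pi_j$ and $\mu_{jd}$ and only climbs to a local optimum of the likelihood, different starting values can lead to different clusterings, which is the dependence the abstract refers to.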

Research of initialization of subspace clustering algorithm in binary data
XIA Ying, LU Ning, FENG Jiang-fan. Research of initialization of subspace clustering algorithm in binary data[J]. Application Research of Computers, 2009, 26(1): 47-49.
Authors:XIA Ying  LU Ning  FENG Jiang-fan
Affiliation:(SIKO-GIS Research Center, Chongqing University of Posts & Telecommunications, Chongqing 400065,China)
Abstract: The finite mixture of Bernoulli distributions model was proposed to handle the high dimensionality and sparseness of binary data sets and to find projected clusters quickly. The EM algorithm is an important method for iterative parameter estimation, and the quality of its result depends largely on the initial parameters. Clustering with the finite mixture of Bernoulli distributions model has been studied extensively; however, the initial parameters of the EM algorithm directly affect clustering performance. This paper therefore introduces the Binning method and computes the initial parameters by changing the similarity measure between data points and the way center points are selected, in order to reduce the dependence of the clustering result on the initial parameters. Experiments demonstrate that the algorithm is efficient and accurate.
Keywords:subspace clustering  binary data  the finite mixtures of Bernoulli distributions model  EM algorithm
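
The dependence of EM on its starting point can be illustrated with a small sketch. The code below is a generic EM fit of a Bernoulli mixture using NumPy, not the paper's algorithm: the Binning-based initialization is not reproduced because the abstract does not describe it in detail. The function name `em_bernoulli_mixture`, the synthetic data, and both initial parameter sets are assumptions made for illustration.

```python
import numpy as np

def em_bernoulli_mixture(X, mu_init, n_iter=50, eps=1e-9):
    """Generic EM for a finite mixture of Bernoulli distributions (illustrative).

    X: (n, d) binary array; mu_init: (k, d) initial Bernoulli parameters.
    Returns mixing weights (k,), Bernoulli parameters (k, d), and the
    responsibilities (n, k) computed in the final E-step.
    """
    X = np.asarray(X, dtype=float)
    mu = np.clip(np.asarray(mu_init, dtype=float), eps, 1 - eps)
    k = mu.shape[0]
    pi = np.full(k, 1.0 / k)  # start from equal mixing weights
    for _ in range(n_iter):
        # E-step in log space: log pi_j + sum_d [x log mu + (1 - x) log(1 - mu)]
        log_p = X @ np.log(mu).T + (1 - X) @ np.log(1 - mu).T + np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted counts give the new parameters
        nk = resp.sum(axis=0)
        pi = np.clip(nk / nk.sum(), eps, None)
        pi /= pi.sum()
        mu = np.clip((resp.T @ X) / (nk[:, None] + eps), eps, 1 - eps)
    return pi, mu, resp

# Synthetic binary data (hypothetical): two clusters that are "on" in
# different halves of a 6-dimensional space.
rng = np.random.default_rng(0)
true_mu = np.array([[0.9] * 3 + [0.1] * 3, [0.1] * 3 + [0.9] * 3])
labels_true = np.repeat([0, 1], 100)
X = (rng.random((200, 6)) < true_mu[labels_true]).astype(int)

# An informative start: each component leans towards one half of the features.
mu_good = np.array([[0.7] * 3 + [0.3] * 3, [0.3] * 3 + [0.7] * 3])
pi_good, mu_good_fit, resp_good = em_bernoulli_mixture(X, mu_good)
acc = (resp_good.argmax(axis=1) == labels_true).mean()
acc_good = max(acc, 1 - acc)  # accuracy up to swapping the two labels

# A degenerate start: identical components can never separate under EM.
mu_flat = np.full((2, 6), 0.5)
pi_flat, mu_flat_fit, resp_flat = em_bernoulli_mixture(X, mu_flat)

print("accuracy with informative start:", acc_good)
print("components identical after flat start:",
      np.allclose(mu_flat_fit[0], mu_flat_fit[1]))
```

With identical starting components every point receives equal responsibilities, so the M-step returns identical parameters again and the fit never leaves that symmetric point. That is an extreme case of the sensitivity to initial parameters that motivates a data-driven initialization.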
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.