
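UD-OPTICS: An uncertain data clustering algorithm based on interval number
Cite this article: WU Cui-xian, HE Shao-yuan. UD-OPTICS: An uncertain data clustering algorithm based on interval number[J]. Computer Engineering & Science, 2019, 41(7): 1303-1311.
Authors: WU Cui-xian, HE Shao-yuan
Affiliations: School of Telecommunication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; Research Center of New Telecommunication Technology Applications, Chongqing University of Posts and Telecommunications, Chongqing 400065; Chongqing Information Technology Designing Company Limited, Chongqing 401121; School of Telecommunication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; Research Center of New Telecommunication Technology Applications, Chongqing University of Posts and Telecommunications, Chongqing 400065
Abstract: Research on clustering algorithms for uncertain data generally assumes that the data follow some distribution, from which a probability density function or probability distribution function representing the data is obtained. However, it is hard to guarantee that such an assumption matches the distribution of uncertain data in a real application system. Existing density-based algorithms are sensitive to their initial parameters and cannot discover clusters of arbitrary density when clustering uncertain data of non-uniform density. To address these shortcomings, this paper proposes UD-OPTICS, an interval-number-based algorithm that orders uncertain data objects to identify the clustering structure. The algorithm uses interval number theory, combined with relevant statistical information about the uncertain data, to represent the data more reasonably. It proposes the concepts of interval core distance and interval reachability distance, together with low-complexity methods for computing them, and uses them to measure the similarity between uncertain data objects, to expand clusters, and to order objects so as to identify the clustering structure. The algorithm can effectively discover clusters of arbitrary density. Experimental results show that UD-OPTICS achieves relatively high clustering accuracy and relatively low complexity.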

Keywords: uncertain data; interval number; density clustering algorithm; OPTICS
Received: 2018-07-24
Revised: 2019-07-25

UD-OPTICS: An uncertain data clustering algorithm based on interval number
WU Cui-xian, HE Shao-yuan. UD-OPTICS: An uncertain data clustering algorithm based on interval number[J]. Computer Engineering & Science, 2019, 41(7): 1303-1311.
Authors: WU Cui-xian, HE Shao-yuan
Affiliation: (1. School of Telecommunication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; 2. Research Center of New Telecommunication Technology Applications, Chongqing University of Posts and Telecommunications, Chongqing 400065; 3. Chongqing Information Technology Designing Company Limited, Chongqing 401121, China)
Abstract: Research on uncertain data clustering algorithms generally assumes that the uncertain data obey a certain distribution, so that a probability density function or probability distribution function representing the data can be obtained. However, it is difficult to guarantee that the assumed distribution is consistent with the distribution of uncertain data in practical applications. Existing density-based algorithms are sensitive to their initial parameters, and they cannot find clusters of arbitrary density when clustering uncertain data of uneven density. In view of these shortcomings, we propose UD-OPTICS, an interval-number-based algorithm that orders uncertain data objects to identify the clustering structure. It uses interval number theory and the statistical information of the uncertain data to represent the data more reasonably. We propose the concepts of interval core distance and interval reachability distance, with calculation methods of low computational complexity, and use them to measure the similarity between uncertain data objects, to expand clusters, and to order objects so as to identify the clustering structure. The algorithm finds clusters of arbitrary density well. Experimental results show that UD-OPTICS has relatively high clustering accuracy and relatively low complexity.
Keywords: uncertain data; interval number; density clustering algorithm; OPTICS
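
The abstract names interval core distance and interval reachability distance without giving their definitions. The sketch below is only an illustration of how such quantities could be computed, not the authors' formulation. It assumes each uncertain attribute is summarised as the interval [mean - k*std, mean + k*std], it uses a hypothetical endpoint-based distance (`interval_distance`), and it applies the standard OPTICS definitions of core distance and reachability distance on top of that distance. All function names and the parameter `k` are assumptions introduced here.

```python
import math


def to_interval(samples, k=1.0):
    """Summarise repeated measurements of one attribute as
    (mean - k*std, mean + k*std). The choice of k is an assumption."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
    return (mean - k * std, mean + k * std)


def interval_distance(a, b):
    """Hypothetical distance between two interval-valued objects.
    Each object is a list of (low, high) pairs, one per attribute;
    the lower and upper endpoints are compared separately."""
    return math.sqrt(sum(((al - bl) ** 2 + (ah - bh) ** 2) / 2
                         for (al, ah), (bl, bh) in zip(a, b)))


def core_distance(i, objects, eps, min_pts):
    """Standard OPTICS core distance under interval_distance:
    the distance to the min_pts-th nearest object inside the
    eps-neighbourhood (the object itself counts), else None."""
    within = sorted(d for d in (interval_distance(objects[i], o)
                                for o in objects) if d <= eps)
    if len(within) < min_pts:
        return None  # objects[i] is not a core object
    return within[min_pts - 1]


def reachability_distance(j, i, objects, eps, min_pts):
    """Standard OPTICS reachability distance of object j from object i:
    max(core distance of i, distance between i and j), or None when
    i is not a core object."""
    cd = core_distance(i, objects, eps, min_pts)
    if cd is None:
        return None
    return max(cd, interval_distance(objects[i], objects[j]))


# Three one-attribute objects: two close together, one far away.
objects = [[(0.0, 2.0)], [(1.0, 3.0)], [(10.0, 12.0)]]
print(core_distance(0, objects, eps=5.0, min_pts=2))
print(reachability_distance(1, 0, objects, eps=5.0, min_pts=2))
print(core_distance(2, objects, eps=5.0, min_pts=2))
```

Keeping the distance function separate from the core and reachability computations means a different interval distance, such as the one defined in the paper, could be substituted without touching the rest of the sketch.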
This article is indexed in Wanfang Data and other databases.