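Multi-granularity Principal Curve Extraction Algorithm for Complex Data Based on Granular Computing
Cite this article: Wang Peipei, Zhang Hongyun. Multi-granularity principal curve extraction algorithm for complex data based on granular computing[J]. Journal of Data Acquisition & Processing, 2018, 33(1): 122-131.
Authors: Wang Peipei  Zhang Hongyun
Affiliations: 1. Department of Computer Science and Technology, Tongji University, Shanghai 201804; 2. Key Laboratory of Embedded Systems and Service Computing, Ministry of Education, Tongji University, Shanghai 201804
Abstract: Traditional principal curve algorithms have been widely applied in many fields, but they perform poorly when extracting principal curves from complex data; effectively combining granular computing with principal curve learning is one of the most effective ways to address this problem. To this end, this paper proposes a multi-granularity principal curve extraction algorithm for complex data based on granular computing. First, a spectral clustering algorithm based on t-nearest neighbors (T-nearest-neighbors, TNN) granulates the data, and an inflexion point estimation method is proposed to determine the number of granules automatically. Next, the soft K-segments principal curve algorithm extracts a local principal curve for each granule, and the elimination of false edges is proposed to optimize the extraction for each granule. Finally, a local-to-global strategy performs multi-granularity principal curve extraction and optimizes overfitted line segments, ultimately yielding a single principal curve that describes the original distribution of the data well. Experimental results show that the algorithm is an effective multi-granularity principal curve extraction algorithm.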

Keywords: granulation  t-nearest neighbors  spectral clustering  principal curve  multi-granularity

Multi-granularity Principal Curve Extraction Algorithm Based on Granular Computing for Complex Data
Wang Peipei, Zhang Hongyun. Multi-granularity Principal Curve Extraction Algorithm Based on Granular Computing for Complex Data[J]. Journal of Data Acquisition & Processing, 2018, 33(1): 122-131.
Authors:Wang Peipei  Zhang Hongyun
Affiliation: 1. Department of Computer Science and Technology, Tongji University, Shanghai, 201804, China; 2. Key Laboratory of Embedded Systems and Service Computing, Ministry of Education, Tongji University, Shanghai, 201804, China
Abstract: Traditional principal curve algorithms are widely used in many fields, but they are ineffective at extracting principal curves from complex data. One of the most effective ways to solve this kind of problem is to combine granular computing with principal curve learning. Therefore, a new multi-granularity principal curve extraction algorithm for complex data based on granular computing is proposed. First, a spectral clustering algorithm based on t-nearest neighbors (TNN) is used to granulate the data, and an inflexion point estimation method is proposed to determine the number of granules automatically. Then, a local principal curve is extracted for each granule with the soft K-segments principal curve algorithm, and the extraction is optimized by removing false edges. Finally, a local-to-global strategy is adopted to extract the multi-granularity principal curve, and overfitted segments are optimized, so that a principal curve that describes the original data distribution well is obtained. Experimental results demonstrate that the proposed algorithm is an effective multi-granularity principal curve extraction algorithm.
Keywords:data granulation  t-nearest neighbor  spectral clustering  principal curves  multi-granularity
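The granulation step in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify their spectral clustering or inflexion point estimation, so this example only builds a symmetric t-nearest-neighbor graph and treats its connected components as granules. For well-separated groups, these components correspond to the zero eigenvalues of the graph Laplacian that spectral clustering works with. The function names `tnn_graph` and `granules`, the toy data, and the choice `t=2` are all assumptions made for illustration.

```python
import math

def tnn_graph(points, t):
    """Symmetric t-nearest-neighbor graph as adjacency sets (hypothetical helper)."""
    n = len(points)
    adj = [set() for _ in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        for j in order[:t]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrize: keep an edge if either endpoint selects it
    return adj

def granules(adj):
    """Label each point with the index of its connected component (hypothetical helper)."""
    labels = [-1] * len(adj)
    current = 0
    for start in range(len(adj)):
        if labels[start] != -1:
            continue
        # Depth-first traversal from an unlabeled point.
        stack = [start]
        labels[start] = current
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels

# Toy data (assumed): two well-separated groups of four points each.
data = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (0.3, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.2, 5.1), (5.3, 5.1)]
labels = granules(tnn_graph(data, t=2))
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

On data whose groups touch or overlap, connected components would merge into a single granule, which is where a full spectral clustering and a rule for choosing the number of granules, as described in the abstract, would be needed.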