首页 | 本学科首页   官方微博 | 高级检索  
     

基于傅里叶变换和连通图的聚类分析方法
引用本文:巨瑜芳,雷小锋,戴 斌,庄 伟,宋丰泰.基于傅里叶变换和连通图的聚类分析方法[J].计算机应用研究,2012,29(8):2837-2840.
作者姓名:巨瑜芳  雷小锋  戴 斌  庄 伟  宋丰泰
作者单位:中国矿业大学计算机科学与技术学院,江苏徐州,221116
基金项目:江苏省基础研究计划资助项目(BK2009093); 中国矿业大学科技基金资助项目(OD080313)
摘    要:聚类是假设数据在具有某种群聚结构的前提下根据观察到的无标记的样本发现数据的最优划分。针对已有的聚类算法存在的缺点,假设数据样本的结果簇是密集的,且簇与簇之间区别明显,基于该假设提出一种基于傅里叶变换和连通图的聚类分析方法 FGClus。首先针对每个样本点计算k阶距离矩阵并序列化作为离散傅里叶变换的输入信号;然后抽取频域内幅值最小的复数项并构造输入序列进行傅里叶逆变换,得到在时域空间中的最佳阈值;最后利用该阈值结合连通图指导最终的聚类过程。实验表明,FGClus算法克服了K-means算法聚类前需确定聚类个数、聚类结果对初始代表点的选取敏感、只能聚类球状数据等缺点,取得了良好的聚类效果。

关 键 词:聚类分析  离散傅里叶变换  连通图  最短路径K近邻查询  最佳阈值

Cluster analysis methods based on Fourier transform and graph theory
JU Yu-fang,LEI Xiao-feng,DAI Bin,ZHUANG Wei,SONG Feng-tai.Cluster analysis methods based on Fourier transform and graph theory[J].Application Research of Computers,2012,29(8):2837-2840.
Authors:JU Yu-fang  LEI Xiao-feng  DAI Bin  ZHUANG Wei  SONG Feng-tai
Affiliation:School of Computer Science & Technology, China University of Mining & Technology, Xuzhou Jiangsu 221116, China
Abstract:Clustering is to find the best partition of unlabeled observations under a certain group structure hypothesis. For the shortcomings in the existing clustering algorithms, this paper assumed that the results of the data sample was intensive and the differences among every cluster were significant. Based on the assumption it presented a cluster analysis method called FGClus based on discrete Fourier transform and graph theory. First, this method calculatd k-distance matrix of each sample point as a sequence of the input signal of discrete Fourier transform, then extracted the minimum amplitude of the complex frequency domain items and constructed the input sequence of inverse Fourier transform, to get the optimal threshold value of the space in the time domain. Finally, it used threshold and connected graph to guide the final clustering process. Large numbers of experiments show that FGClus algorithm can overcome existed shortcomings of K-means algorithm, such as the number of clusters must be determined before clustering, the results is sensitive on initial selection of representative points and it just can cluster spherical datas, which achieves good clustering results.
Keywords:cluster analysis  discrete Fourier transform  connected graph  shortest path KNN query  optimal threshold
本文献已被 CNKI 万方数据 等数据库收录!
点击此处可从《计算机应用研究》浏览原始摘要信息
点击此处可从《计算机应用研究》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号