
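Stream data clustering algorithm based on differential sampling
Cite this article: QIU Yun-fei, SUN Meng-ran. Stream data clustering algorithm based on differential sampling[J]. Application Research of Computers, 2019, 36(6).
Author names: QIU Yun-fei  SUN Meng-ran
Author affiliations: Software College of Liaoning Technical University, Huludao, Liaoning 125105 (both authors)
Funding: National Natural Science Foundation of China (61404069); Scientific Research Project of the Education Department of Liaoning Province (LJYL048)
Abstract (translated from the Chinese): Traditional clustering algorithms applied to stream data suffer from high time complexity, large storage requirements, and low accuracy. To address these problems, a stream data clustering algorithm based on differential sampling is proposed. First, the differential sampling method samples the stream data, and the sample points are used to construct a kernel matrix. Next, the kernel fuzzy C-means clustering algorithm clusters the points in the kernel matrix, producing a labeled sample kernel matrix. Finally, the labeled sample kernel matrix is used to partition the points in the stream. In parallel, a fading cluster mechanism updates the sample kernel matrix in real time. Experimental results show that, compared with traditional clustering algorithms, the algorithm achieves lower time complexity while clustering in real time and obtains fairly satisfactory clustering results.

Keywords: differential sampling; fading cluster mechanism; kernel fuzzy C-means; stream data; time complexity
Received: 2017/12/18
Revised: 2019/4/24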

Stream data clustering algorithm based on differential sampling
QIU Yun-fei, SUN Meng-ran. Stream data clustering algorithm based on differential sampling[J]. Application Research of Computers, 2019, 36(6).
Authors: QIU Yun-fei and SUN Meng-ran
Affiliation: Software College of Liaoning Technical University, Huludao, Liaoning
Abstract: To address the high time complexity, large storage requirements, and low accuracy that traditional clustering algorithms exhibit when clustering stream data, this paper proposes a stream data clustering algorithm based on differential sampling. First, the differential sampling method samples the stream data, and the sample points are used to construct a kernel matrix. Then, the kernel fuzzy C-means clustering algorithm clusters the points in the kernel matrix, yielding a labeled sample kernel matrix. Finally, the labeled sample kernel matrix is used to partition the points in the stream. Meanwhile, a fading cluster mechanism updates the sample kernel matrix in real time. Experimental results show that, compared with traditional clustering algorithms, the proposed algorithm achieves lower time complexity, clusters in real time, and obtains satisfactory clustering results.
Keywords: differential sampling; fading cluster mechanism; kernel fuzzy C-means; stream data; time complexity
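
The pipeline outlined in the abstract (sample the stream, build a kernel matrix over the samples, run kernel fuzzy C-means, then label arriving points against the labeled samples) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the differential sampling rule or the fading mechanism, so uniform reservoir sampling is swapped in for the former and the latter is left out. The Gaussian kernel, the kernel-matrix-only form of fuzzy C-means, and all function names and parameter values (`fit`, `assign`, `size`, `sigma`, `m`) are assumptions made for illustration.

```python
import math
import random

def reservoir_sample(stream, size, rng):
    """Uniform reservoir sampling: a stand-in, NOT the paper's differential sampling."""
    sample = []
    for t, x in enumerate(stream):
        if t < size:
            sample.append(x)
        else:
            j = rng.randint(0, t)  # bounds are inclusive
            if j < size:
                sample[j] = x
    return sample

def gaussian_kernel(a, b, sigma):
    sq = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))

def cluster_weights(U, m):
    """Row i holds u_ij^m normalized to sum to 1 (the implicit feature-space prototype)."""
    W = []
    for row in U:
        powered = [u ** m for u in row]
        total = sum(powered)
        W.append([p / total for p in powered])
    return W

def self_terms(K, W):
    """w_i^T K w_i, the squared norm of each implicit prototype."""
    n = len(K)
    return [sum(w[j] * w[l] * K[j][l] for j in range(n) for l in range(n)) for w in W]

def memberships(kxx, kx, W, const, m):
    """Fuzzy memberships of one point, given its kernel values against the samples."""
    d = [max(kxx - 2.0 * sum(wj * kj for wj, kj in zip(w, kx)) + c0, 1e-12)
         for w, c0 in zip(W, const)]
    inv = [dist ** (-1.0 / (m - 1.0)) for dist in d]
    total = sum(inv)
    return [v / total for v in inv]

def kernel_fcm(K, c, m, iters, rng):
    """Kernel fuzzy C-means that touches the data only through the kernel matrix K."""
    n = len(K)
    U = [[rng.random() + 1e-3 for _ in range(n)] for _ in range(c)]  # random start
    for _ in range(iters):
        W = cluster_weights(U, m)
        const = self_terms(K, W)
        cols = [memberships(K[k][k], K[k], W, const, m) for k in range(n)]
        U = [[cols[k][i] for k in range(n)] for i in range(c)]
    return U

def fit(stream, c, size=20, sigma=2.0, m=2.0, iters=50, seed=0):
    """Sample the stream, build the sample kernel matrix, and cluster it."""
    rng = random.Random(seed)
    samples = reservoir_sample(stream, size, rng)
    K = [[gaussian_kernel(p, q, sigma) for q in samples] for p in samples]
    W = cluster_weights(kernel_fcm(K, c, m, iters, rng), m)
    return samples, W, self_terms(K, W), sigma, m

def assign(x, samples, W, const, sigma, m):
    """Label a stream point using only kernel values against the stored samples."""
    kx = [gaussian_kernel(x, s, sigma) for s in samples]
    u = memberships(1.0, kx, W, const, m)  # Gaussian kernel: k(x, x) = 1
    return max(range(len(u)), key=u.__getitem__)
```

After `model = fit(stream, c=2)`, each arriving point is labeled with `assign(point, *model)`, which costs one kernel evaluation per stored sample rather than a pass over the whole stream. A fading update in the spirit of the abstract would periodically down-weight or replace old samples and refit, but its exact form is not given here.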
This article is indexed in Wanfang Data and other databases.