Survey of K-means algorithms on big data (面向大数据的K-means算法综述) |
| |
Cite this article: | Ren Yuanhang (任远航). Survey of K-means algorithms on big data [J]. Application Research of Computers (计算机应用研究), 2020, 37(12): 3528-3533 |
| |
Author: | Ren Yuanhang (任远航) |
| |
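Affiliation: | School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054 |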
| |
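Abstract: | Clustering is an important data mining technique, and a key problem is how to obtain an approximate K-means solution with a theoretical guarantee more quickly on massive data. First, this paper defines the K-means problem and introduces the relevant background. Then, it presents advanced research results from China and abroad from two perspectives: theoretical guarantees and acceleration. Finally, it summarizes the existing results and gives an outlook on future research directions for K-means on big data.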
|
Keywords: | clustering; K-means; sampling; sub-linear time algorithms; theoretical guarantee |
Received: | 2019-10-12 |
Revised: | 2019-12-28 |
Survey of K-means algorithms on big data |
| |
Affiliation: | School of Information and Software Engineering, University of Electronic Science and Technology of China |
| |
Abstract: | Among clustering problems, K-means is probably the best known. Efficiently obtaining a K-means solution with a theoretical guarantee on big data is a key problem, and this paper surveys the progress made on it. First, the paper defines the K-means problem and introduces the relevant background. Second, it describes in detail the techniques for theoretical guarantees and for speed-up, treating the two separately. Finally, it summarizes the main results and discusses future directions for K-means algorithms on big data. |
| |
Keywords: | clustering; K-means; sampling; sub-linear time algorithms; theoretical guarantee |
This article is indexed in Wanfang Data (万方数据) and other databases. |