
多密度自适应确定DBSCAN算法参数的算法研究
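Cite this article:万佳, 胡大裟, 蒋玉明. 多密度自适应确定DBSCAN算法参数的算法研究[J]. 计算机工程与应用, 2022, 58(2): 78-85. DOI: 10.3778/j.issn.1002-8331.2012-0476
Authors:万佳 (WAN Jia)  胡大裟 (HU Dasha)  蒋玉明 (JIANG Yuming)
Affiliations:1. 四川大学 计算机学院 (College of Computer Science, Sichuan University), Chengdu 610065; 2. 四川省大数据分析与融合应用技术工程实验室 (Big Data Analysis and Fusion Application Technology Engineering Laboratory of Sichuan Province), Chengdu 610065
Funding:National Key Research and Development Program of China (2020YFB1707900).
Abstract:The Eps and MinPts parameters of the DBSCAN algorithm must be set manually, and poorly chosen values lower the accuracy of the clustering results. On data sets with large differences in density distribution, the parameters are global and are therefore wrongly applied to clusters of different densities, so the clusters cannot be found correctly. To address these problems, a multi-density adaptive parameter determination algorithm is proposed. It uses the distribution characteristics of the data set after denoising attenuation to generate candidate Eps and MinPts parameter lists, and in the cluster number...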

Keywords:DBSCAN algorithm  denoising attenuation  multi-density threshold  self-adaptive

Research on Method of Multi-density Self-Adaptive Determination of DBSCAN Algorithm Parameters
WAN Jia,HU Dasha,JIANG Yuming. Research on Method of Multi-density Self-Adaptive Determination of DBSCAN Algorithm Parameters[J]. Computer Engineering and Applications, 2022, 58(2): 78-85. DOI: 10.3778/j.issn.1002-8331.2012-0476
Authors:WAN Jia  HU Dasha  JIANG Yuming
Affiliation:1. College of Computer Science, Sichuan University, Chengdu 610065, China; 2. Big Data Analysis and Fusion Application Technology Engineering Laboratory of Sichuan Province, Chengdu 610065, China
Abstract:The Eps and MinPts parameters of the DBSCAN algorithm must be set manually, and poorly chosen values lower clustering accuracy. Moreover, because the parameters are global, on data sets with large differences in density distribution they are wrongly applied to clusters of different densities, so those clusters are not found correctly. To address these problems, this paper proposes a multi-density adaptive parameter determination algorithm. The algorithm first generates candidate Eps and MinPts parameter lists from the distribution characteristics of the data set after denoising attenuation and, in the interval where the number of clusters tends to be stable, selects the corresponding Eps and MinPts as the initial density threshold according to the denoising level. It then applies the same method to the noise data produced by clustering under that density threshold, generating a new candidate parameter list and selecting the optimal parameters to obtain a new density threshold. This step is repeated until the amount of noise data or the density threshold falls below a certain level. Finally, the clustering results obtained under the different density thresholds are combined into the final clustering result. The experimental results show that the algorithm can adaptively select appropriate multi-density thresholds and clusters well on data sets with large differences in density distribution.
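The loop described in the abstract (cluster, collect the noise, pick a new density threshold for the noise, repeat, then merge) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how the candidate lists are built or how the stable interval is used, so `candidate_params` (mean k-th nearest-neighbour distance for Eps, mean neighbour count for MinPts), `pick_stable` (first candidate whose cluster count repeats three times in a row) and the `max_noise` stopping rule are hypothetical stand-ins. The denoising attenuation step and the density-threshold stopping condition are omitted.

```python
import math

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 means noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1                      # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        stack = list(neighbors[i])
        while stack:
            j = stack.pop()
            if labels[j] == -1:
                labels[j] = cluster             # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:    # only core points expand
                stack.extend(neighbors[j])
    return labels

def candidate_params(points):
    """ASSUMPTION: Eps_k = mean k-th nearest-neighbour distance,
    MinPts_k = mean number of points within Eps_k (self included)."""
    n = len(points)
    dists = [sorted(math.dist(p, q) for q in points) for p in points]
    params = []
    for k in range(1, n):                       # d[0] is the point itself
        eps = sum(d[k] for d in dists) / n
        mean_count = sum(sum(1 for x in d if x <= eps) for d in dists) / n
        params.append((eps, max(round(mean_count), 2)))
    return params

def pick_stable(points, params, runs=3):
    """ASSUMPTION: take the first candidate whose (non-zero) cluster count
    stays the same for `runs` consecutive candidates."""
    counts = [len(set(dbscan(points, e, m)) - {-1}) for e, m in params]
    for i in range(len(counts) - runs + 1):
        window = counts[i:i + runs]
        if window[0] > 0 and len(set(window)) == 1:
            return params[i]
    return None

def multi_density_dbscan(points, max_noise=3):
    """Re-cluster the remaining noise with a fresh threshold each round,
    then merge the per-round labels into one labelling."""
    labels = [-1] * len(points)
    remaining = list(range(len(points)))        # indices still unclustered
    next_label = 0
    while len(remaining) > max_noise:
        subset = [points[i] for i in remaining]
        chosen = pick_stable(subset, candidate_params(subset))
        if chosen is None:
            break
        sub_labels = dbscan(subset, *chosen)
        for idx, lab in zip(remaining, sub_labels):
            if lab != -1:
                labels[idx] = next_label + lab
        next_label += max(sub_labels) + 1
        remaining = [idx for idx, lab in zip(remaining, sub_labels) if lab == -1]
    return labels

# Toy data: a dense 4x4 grid and a far-away, much sparser 3x3 grid.
dense = [(x, y) for x in range(4) for y in range(4)]
sparse = [(100 + 5 * x, 100 + 5 * y) for x in range(3) for y in range(3)]
labels = multi_density_dbscan(dense + sparse)
```

On this toy input, a single global Eps small enough for the dense grid would leave the sparse grid as noise. The sketch instead gives each round its own threshold, which is the point the abstract makes about data sets with large density differences. The brute-force distance computations keep the code short but only suit small inputs.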
Keywords:DBSCAN algorithm  denoising attenuation  multi-density threshold  self-adaptive
This article is indexed in databases including 维普 (VIP) and 万方数据 (Wanfang Data).