On comparing two sequences of numbers and its applications to clustering analysis
Authors:R.J.G.B. Campello  E.R. Hruschka
Affiliation:Department of Computer Sciences, University of São Paulo at São Carlos, SCC/ICMC/USP, CP 668, São Carlos, SP 13560-970, Brazil
Abstract:A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices, depending on the application scenario at hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall’s indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggest scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index.
Keywords:Clustering analysis   Goodman-Kruskal index   Kendall’s index   Pearson Product-Moment index   Spearman’s index   Sensitivity analysis
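
The distinction the abstract draws between sensitivity to ranks and sensitivity to magnitudes can be illustrated with the standard textbook forms of three of the indexes named above. The sketch below is only an illustration under stated assumptions: it implements the usual Pearson Product-Moment coefficient, the Goodman-Kruskal gamma, and Kendall's tau-a, not the variant or weighted formulations proposed in the paper (which are not given in this record). The function names and the example sequences are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Standard Pearson Product-Moment correlation (sensitive to magnitudes).

    Assumes neither sequence is constant; otherwise the denominator is zero.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def concordance_counts(a, b):
    """Count concordant and discordant pairs; pairs tied in a or b count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(a, b), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return concordant, discordant


def goodman_kruskal_gamma(a, b):
    """Goodman-Kruskal gamma: (C - D) / (C + D), ignoring tied pairs.

    Assumes at least one pair is not tied.
    """
    c, d = concordance_counts(a, b)
    return (c - d) / (c + d)


def kendall_tau_a(a, b):
    """Kendall's tau-a: (C - D) divided by the total number of pairs."""
    c, d = concordance_counts(a, b)
    n = len(a)
    return (c - d) / (n * (n - 1) / 2)


# Hypothetical example: both sequences have the same ordering,
# but the last value of b has a much larger magnitude.
a = [1, 2, 3, 4]
b = [1, 2, 3, 100]

print(goodman_kruskal_gamma(a, b))  # 1.0: the rank orderings agree perfectly
print(kendall_tau_a(a, b))          # 1.0: same reason
print(pearson(a, b))                # below 1: the outlying magnitude matters
```

In this example the two rank-based indexes cannot tell the sequences apart, while the Pearson coefficient drops below 1 because of the single large value. With ties the rank-based indexes also diverge from each other: for `[1, 1, 2]` against `[1, 2, 3]`, gamma is 1.0 while tau-a is 2/3, since tau-a keeps the tied pair in its denominator.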
This article is indexed in ScienceDirect and other databases.