A cluster centers initialization method for clustering categorical data
Authors:Liang Bai  Jiye Liang  Chuangyin Dang  Fuyuan Cao
Affiliation:1. College of Computer Science and Engineering, Taibah University, KSA;2. Math and computer science Department, Faculty of Science, Menoufia University, Menoufia, Egypt;3. Systems and Computer Engineering Department, Faculty of Engineering, Al-Azhar University, Cairo, Egypt
Abstract:k-modes, the leading partitional clustering technique for categorical data, is one of the most computationally efficient clustering methods for such data. However, the k-modes algorithm can converge to any of numerous local minima, so its performance depends strongly on the initial cluster centers. Most existing methods for initializing cluster centers are designed for numerical data. Because categorical data lack geometric structure, these methods are not applicable to categorical data. This paper proposes a novel initialization method for categorical data and applies it to the k-modes algorithm. The method combines distance and density to select initial cluster centers and overcomes shortcomings of existing initialization methods for categorical data. Experimental results show that the proposed method is effective and, because its time complexity is linear in the number of data objects, can be applied to large data sets.
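The abstract does not state the formulas behind the method, so the following is only a minimal sketch of how distance and density might be combined to pick initial centers for categorical data. It assumes the simple matching (Hamming) dissimilarity, defines an object's density as the mean relative frequency of its attribute values, and scores candidates by the product of density and distance to the nearest chosen center. These definitions and the names `hamming`, `density`, and `init_centers` are illustrative assumptions, not the authors' exact procedure.

```python
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def density(data):
    """Assumed density: mean relative frequency of each object's attribute values."""
    n, m = len(data), len(data[0])
    # One frequency table per attribute, built in a single pass over the data.
    counts = [Counter(row[j] for row in data) for j in range(m)]
    return [sum(counts[j][row[j]] for j in range(m)) / (n * m) for row in data]


def init_centers(data, k):
    """Pick k initial centers by combining density and distance (illustrative only)."""
    dens = density(data)
    # First center: the densest object (ties go to the earliest object).
    first = max(range(len(data)), key=lambda i: dens[i])
    chosen = [first]
    # min_dist[i] is the distance from object i to its nearest chosen center.
    min_dist = [hamming(row, data[first]) for row in data]
    while len(chosen) < k:
        # Next center: dense and far from every center chosen so far.
        nxt = max(range(len(data)), key=lambda i: min_dist[i] * dens[i])
        chosen.append(nxt)
        min_dist = [min(d, hamming(row, data[nxt])) for d, row in zip(min_dist, data)]
    return [data[i] for i in chosen]


data = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z"), ("b", "z")]
centers = init_centers(data, 2)
```

Each new center costs one pass over the objects, so for a fixed number of clusters and attributes the sketch runs in time linear in the number of data objects, in line with the complexity the abstract reports. If `k` exceeds the number of distinct objects, this sketch will return duplicate centers.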
This article is indexed in ScienceDirect and other databases.