An improved overlapping k-means clustering method for medical applications

Affiliation:	1. Department of Systems Science and Industrial Engineering, The State University of New York at Binghamton, 4400 Vestal Parkway East, Binghamton, NY-13902, USA; 2. Department of Computer Engineering, Islamic Azad University at Shabestar, Shabestar-Dizajkhalil Road, Shabestar, East Azerbaijan-5381637181, Iran; 1. Computer Science Department, The University of Mato Grosso do Sul (UFMS) at Ponta Porã, Brazil; 2. Computer Science Department, The University of São Paulo (USP) at São Carlos, Brazil; 3. INESC-TEC Department, The University of Porto, Portugal; 1. Department of Computer Engineering and Information Technology, Payame Noor University (PNU), Qeshm, Iran; 2. Department of Electrical Engineering, Sadjad University of Technology, Mashhad, Iran

Abstract:	Data clustering has been proven to be an effective method for discovering structure in medical datasets. The majority of clustering algorithms produce exclusive clusters, meaning that each sample can belong to only one cluster. However, most real-world medical datasets contain inherently overlapping information, which is best explained by overlapping clustering methods that allow one sample to belong to more than one cluster. One of the simplest and most efficient overlapping clustering methods is overlapping k-means (OKM), an extension of the traditional k-means algorithm. As an extension of k-means, the OKM method also suffers from sensitivity to the initial cluster centroids. In this paper, we propose a hybrid method that combines the k-harmonic means and overlapping k-means algorithms (KHM-OKM) to overcome this limitation. The main idea behind the KHM-OKM method is to use the output of the KHM method to initialize the cluster centers of the OKM method. We have tested the proposed method using the FBCubed metric, which has been shown to be the most effective measure for evaluating overlapping clustering algorithms with respect to homogeneity, completeness, rag bag, and the cluster size-quantity tradeoff. According to results from ten publicly available medical datasets, the KHM-OKM algorithm outperforms the original OKM algorithm and can be used as an efficient method for clustering medical datasets.
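
The two-stage idea described in the abstract (run KHM first, then hand its centers to OKM) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `khm_centers`, `okm_assign`, and `okm` are hypothetical, the update rules follow the commonly published forms of k-harmonic means and overlapping k-means as best understood here, the exponent `p=3.5` and the iteration counts are assumed defaults, and the OKM loop is simplified (it does not keep a point's previous assignment when the new one is worse).

```python
import numpy as np

def khm_centers(X, k, p=3.5, iters=50, seed=0, eps=1e-8):
    """K-harmonic means sketch (hypothetical helper): returns k centers."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Point-to-center distances, floored to avoid division by zero.
        d = np.maximum(np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2), eps)
        a = d ** (-p - 2)
        m = a / a.sum(axis=1, keepdims=True)           # soft membership per center
        w = a.sum(axis=1) / (d ** (-p)).sum(axis=1) ** 2  # per-point weight
        mw = m * w[:, None]
        C = (mw.T @ X) / mw.sum(axis=0)[:, None]
    return C

def okm_assign(x, C):
    """Greedy OKM assignment: add nearest centers while their mean moves closer to x."""
    order = [int(j) for j in np.argsort(np.linalg.norm(C - x, axis=1))]
    chosen = [order[0]]
    best = np.linalg.norm(x - C[chosen].mean(axis=0))
    for j in order[1:]:
        trial = np.linalg.norm(x - C[chosen + [j]].mean(axis=0))
        if trial >= best:
            break
        chosen.append(j)
        best = trial
    return chosen

def okm(X, C0, iters=20):
    """Simplified overlapping k-means starting from the given centers C0."""
    C = np.array(C0, dtype=float)
    dim = X.shape[1]
    for _ in range(iters):
        A = [okm_assign(x, C) for x in X]
        for k in range(len(C)):
            num, den = np.zeros(dim), 0.0
            for x, a in zip(X, A):
                if k in a:
                    # Where center k should sit so the mean of this point's centers equals x.
                    others = sum((C[j] for j in a if j != k), np.zeros(dim))
                    alpha = 1.0 / len(a) ** 2
                    num += alpha * (len(a) * x - others)
                    den += alpha
            if den > 0:  # leave empty clusters where they are
                C[k] = num / den
    return C, [okm_assign(x, C) for x in X]

# Toy data (made up for illustration): two Gaussian blobs in 2-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(3, 0.5, (30, 2))])
init = khm_centers(X, k=2)            # stage 1: KHM supplies the initial centers
centers, memberships = okm(X, init)   # stage 2: OKM refines them and allows overlap
print(sum(len(a) > 1 for a in memberships), "points belong to more than one cluster")
```

Because KHM weights every point against all centers through a harmonic mean, its result tends to depend less on the random starting centers than plain k-means does, which is the property the hybrid relies on. Swapping `init` for randomly chosen data points gives a baseline OKM run for comparison.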

This article is indexed in ScienceDirect and other databases.