Development of assessment criteria for clustering algorithms |
| |
Authors: | Sameh A. Salem, Asoke K. Nandi |
| |
Affiliation: | (1) Signal Processing and Communications Group, Department of Electrical Engineering and Electronics, The University of Liverpool, Brownlow Hill, Liverpool, L69 3GJ, UK |
| |
Abstract: | In this paper, new measures—called clustering performance measures (CPMs)—for assessing the reliability of a clustering algorithm
are proposed. These CPMs are defined using a validation measure, which determines how well the algorithm works with a given set of parameter values, and a repeatability measure, which is used for studying the stability of the clustering solutions and has the ability to estimate the correct number
of clusters in a dataset. These proposed CPMs can be used to evaluate clustering algorithms that have a structure bias to
certain types of data distribution as well as those that have no structure biases. Additionally, we propose a novel cluster
validity index, the $V_I$ index, which is able to handle non-spherical clusters. Five clustering algorithms on different types of real-world data and
synthetic data are evaluated. The first dataset type is a communications signal dataset representing one modulation
scheme under a variety of noise conditions, the second comprises two breast cancer datasets, and the third comprises
different synthetic datasets with arbitrarily shaped clusters. Additionally, comparisons with other methods for estimating
the number of clusters indicate the applicability and reliability of the proposed cluster validity $V_I$ index
and repeatability measure for correct estimation of the number of clusters.
Sameh A. Salem graduated with a BSc degree in Communications and Electronics Engineering and an MSc in Communications and Electronics Engineering,
both from Helwan University, Cairo, Egypt, in May 1998 and October 2003, respectively. He is currently pursuing a PhD degree
in the Signal Processing and Communications Group, Department of Electrical Engineering and Electronics, The University of
Liverpool, UK. His research interests include clustering algorithms, machine learning, and parallel computing.
Asoke K. Nandi received a PhD degree from the University of Cambridge (Trinity College), Cambridge, UK, in 1979. He held several research positions
at Rutherford Appleton Laboratory (UK), European Organisation for Nuclear Research (Switzerland), Department of Physics, Queen
Mary College (London, UK) and Department of Nuclear Physics (Oxford, UK). In 1987, he joined Imperial College, London,
UK, as the Solartron Lecturer in the Signal Processing Section of the Electrical Engineering Department. In 1991, he joined
the Signal Processing Division of the Electronic and Electrical Engineering Department in the University of Strathclyde, Glasgow,
UK, as a Senior Lecturer; subsequently, he was appointed as a Reader in 1995 and a Professor in 1998. In March 1999, he moved
to the University of Liverpool, Liverpool, UK to take up his appointment to the David Jardine Chair of Signal Processing in
the Department of Electrical Engineering and Electronics.
In 1983, he was a member of the UA1 team at CERN that discovered the three fundamental particles known as $W^+$, $W^-$ and $Z^0$, providing the evidence for the unification of the electromagnetic and weak forces, which was recognised by the Nobel Committee
for Physics in 1984. Currently, he is the Head of the Signal Processing and Communications Research Group with interests in
the areas of non-Gaussian signal processing, communications, and machine learning research. With his group he has been carrying
out research in machine condition monitoring, signal modelling, system identification, communication signal processing, biomedical
signals, ultrasonics, blind source separation, and blind deconvolution. He has authored or co-authored over 350 technical
publications, including two books “Automatic Modulation Recognition of Communications Signals” (Kluwer Academic, Boston, MA, 1996) and “Blind Estimation Using Higher-Order Statistics” (Kluwer Academic, Boston, MA, 1999) and over 140 journal papers.
Professor Nandi was awarded the Mountbatten Premium, Division Award of the Electronics and Communications Division, of the
Institution of Electrical Engineers of the UK in 1998 and the Water Arbitration Prize of the Institution of Mechanical Engineers
of the UK in 1999. He is a Fellow of the Cambridge Philosophical Society, the Institution of Engineering and Technology, the
Institute of Mathematics and its Applications, the Institute of Physics, the Royal Society for Arts, the Institution of Mechanical
Engineers, and the British Computer Society.
|
| |
Keywords: | Clustering performance measures; Unsupervised classification; Data clustering; Validity indices; Clustering algorithms |
This article has been indexed by SpringerLink and other databases. |
|