Two-phase clustering process for outliers detection |
| |
Affiliation: | 1. Department of Mathematics and Statistics, University of Jyväskylä, P.O. Box 35 (MaD), 40014 University of Jyväskylä, Finland;2. Freshwater Centre, Finnish Environment Institute, SYKE, Jyväskylä Office, Survontie 9A, 40500, Jyväskylä, Finland;3. Department of Signal Processing, Tampere University of Technology, P.O. Box 553, FI-33101, Tampere, Finland;4. Electrical & Electronics Engineering Department, Izmir University of Economics Turkey, Sakarya Street, No: 156, 35330, Balçova - Izmir - Turkey;5. Electrical Engineering, College of Engineering, Qatar University, P.O Box 2713, Qatar |
| |
Abstract: | This paper proposes a two-phase clustering algorithm for outlier detection. In Phase 1, the traditional k-means algorithm is modified with a heuristic: “if a new input pattern is far enough from all cluster centers, assign it as a new cluster center.” As a result, the data points in the same cluster are most likely either all outliers or all non-outliers. In Phase 2, a minimum spanning tree (MST) is constructed and its longest edge is removed. The small clusters, i.e., the subtrees with fewer nodes, are then selected and regarded as outliers. Experimental results show that the process works well. |
| |
Keywords: | |
This article is indexed in ScienceDirect and other databases. |
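
The two phases described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Several details are assumptions that the abstract does not specify: the distance threshold `threshold` that decides when a pattern is "far enough", the fixed number of passes `n_iter`, building the MST over the cluster centers with Euclidean distances, and all function names, which are hypothetical.

```python
import numpy as np


def _nearest(X, C):
    """Index of the nearest center in C for every row of X."""
    return np.argmin(np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2), axis=1)


def phase1_modified_kmeans(X, threshold, n_iter=5):
    """Phase 1 (sketch): k-means in which a point farther than `threshold`
    from every current center becomes a new center (assumed criterion)."""
    centers = [X[0].astype(float)]
    for _ in range(n_iter):
        for x in X:
            d = np.linalg.norm(np.asarray(centers) - x, axis=1)
            if d.min() > threshold:
                centers.append(x.astype(float))  # far from all centers: new cluster
        C = np.asarray(centers)
        labels = _nearest(X, C)
        # Standard k-means update; clusters that lost all points are dropped.
        centers = [X[labels == j].mean(axis=0) for j in range(len(C)) if np.any(labels == j)]
    C = np.asarray(centers)
    used, labels = np.unique(_nearest(X, C), return_inverse=True)
    return C[used], labels


def mst_edges(C):
    """Prim's algorithm on the complete Euclidean graph over the centers.
    Returns a list of (weight, i, j) tuples."""
    D = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=2)
    in_tree, edges = [0], []
    while len(in_tree) < len(C):
        best = None
        for i in in_tree:
            for j in range(len(C)):
                if j not in in_tree and (best is None or D[i, j] < best[0]):
                    best = (D[i, j], i, j)
        edges.append(best)
        in_tree.append(best[2])
    return edges


def phase2_small_subtree(C):
    """Phase 2 (sketch): remove the longest MST edge and return the node set
    of the subtree with fewer nodes (ties are resolved arbitrarily)."""
    edges = mst_edges(C)
    if not edges:
        return set()  # a single cluster: nothing to split
    edges.remove(max(edges))  # tuples compare by weight first
    adj = {i: [] for i in range(len(C))}
    for _, i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen, stack = {0}, [0]
    while stack:  # collect the component that contains node 0
        for nb in adj[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    other = set(range(len(C))) - seen
    return seen if len(seen) < len(other) else other


def detect_outliers(X, threshold):
    """Boolean mask over the rows of X: True where the point's cluster
    falls in the smaller subtree."""
    C, labels = phase1_modified_kmeans(X, threshold)
    return np.isin(labels, list(phase2_small_subtree(C)))
```

Calling `detect_outliers(X, threshold)` on an `(n, d)` array returns one flag per point. The sketch removes only the single longest edge, as the abstract states; how the threshold is chosen, and whether more than one edge should be cut, are left open here.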
|