ITERATE: a conceptual clustering algorithm for data mining |
| |
Authors: | Biswas G.; Weinberg J.B.; Fisher D.H. |
| |
Affiliation: | Dept. of Comput. Sci., Vanderbilt Univ., Nashville, TN |
| |
Abstract: | The data exploration task can be divided into three interrelated subtasks: 1) feature selection, 2) discovery, and 3) interpretation. This paper describes an unsupervised discovery method with biases geared toward partitioning objects into clusters that improve interpretability. The ITERATE algorithm employs 1) a data ordering scheme and 2) an iterative redistribution operator to produce maximally cohesive and distinct clusters. Cohesion, or intraclass similarity, is measured in terms of the match between individual objects and their assigned cluster prototype. Distinctness, or interclass dissimilarity, is measured by an average of the variance of the distribution match between clusters. The authors demonstrate that interpretability, from a problem-solving viewpoint, is addressed by the intraclass and interclass measures. Empirical results demonstrate the properties of the discovery algorithm and its applications to problem solving. |
| |
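The redistribution idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the prototype or the match function, so the sketch assumes nominal features, a prototype made of per-feature value frequencies, and a match score equal to the mean probability of an object's values under that prototype. The names `build_prototype`, `match`, `redistribute`, and `cohesion` are hypothetical, and the initial partition is supplied by hand rather than produced by the data ordering scheme.

```python
from collections import Counter


def build_prototype(members):
    """Assumed prototype: per-feature value counts plus the cluster size."""
    n_features = len(members[0])
    counts = [Counter(obj[i] for obj in members) for i in range(n_features)]
    return counts, len(members)


def match(obj, prototype):
    """Assumed match score: mean P(feature = value | cluster) over features."""
    counts, size = prototype
    return sum(counts[i][v] / size for i, v in enumerate(obj)) / len(obj)


def prototypes_for(objects, labels):
    """Group objects by label and build one prototype per non-empty cluster."""
    clusters = {}
    for obj, lab in zip(objects, labels):
        clusters.setdefault(lab, []).append(obj)
    return {lab: build_prototype(m) for lab, m in clusters.items()}


def redistribute(objects, labels, max_iters=100):
    """Move each object to its best-matching cluster until nothing changes."""
    labels = list(labels)
    for _ in range(max_iters):
        protos = prototypes_for(objects, labels)
        # Ties go to the smallest label because candidates are sorted.
        new_labels = [
            max(sorted(protos), key=lambda lab: match(obj, protos[lab]))
            for obj in objects
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def cohesion(objects, labels):
    """Mean match between each object and its assigned cluster prototype."""
    protos = prototypes_for(objects, labels)
    total = sum(match(o, protos[lab]) for o, lab in zip(objects, labels))
    return total / len(objects)


# Toy data (made up for illustration): the fourth object starts in the
# wrong cluster and should be moved by redistribution.
objects = [
    ("red", "round"), ("red", "round"), ("red", "round"),
    ("blue", "square"), ("blue", "square"), ("blue", "square"),
]
initial = [0, 0, 0, 0, 1, 1]
final = redistribute(objects, initial)
print(final, cohesion(objects, initial), cohesion(objects, final))
```

On this toy data the misplaced object moves to the second cluster and the cohesion score rises, which is the behavior the abstract attributes to the redistribution operator. The distinctness measure is left out because the abstract does not give enough detail to sketch it faithfully.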