首页 | 本学科首页   官方微博 | 高级检索  
     


On constructing an optimal consensus clustering from multiple clusterings
Authors:Piotr Berman  Bhaskar DasGupta  Ming-Yang Kao  Jie Wang
Affiliation:a Department of Computer Science & Engineering, Pennsylvania State University, University Park, PA 16802, USA
b Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607, USA
c Department of Electrical Engineering & Computer Science, Northwestern University, Evanston, IL 60208, USA
d Department of Computer Science, University of Massachusetts Lowell, Lowell, MA 01854, USA
Abstract:Computing a suitable measure of consensus among several clusterings on the same data is an important problem that arises in several areas such as computational biology and data mining. In this paper, we formalize a set-theoretic model for computing such a similarity measure. Roughly speaking, in this model we have k>1 partitions (clusters) of the same data set each containing the same number of sets and the goal is to align the sets in each partition to minimize a similarity measure. For k=2, a polynomial-time solution was proposed by Gusfield (Information Processing Letters 82 (2002) 159-164). In this paper, we show that the problem is MAX-SNP-hard for k=3 even if each partition in each cluster contains no more than 2 elements and provide a View the MathML source-approximation algorithm for the problem for any k.
Keywords:Computational complexity   Approximation algorithms   Consensus clustering
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号