SUBic: A Scalable Unsupervised Framework for Discovering High Quality Biclusters |
| |
Authors: | Jooil Lee, Yanhua Jin, Won Suk Lee |
| |
Affiliation: | Department of Computer Science, Yonsei University, Seoul 120-749, Korea |
| |
Abstract: | Biclustering algorithms extend conventional clustering techniques to extract all meaningful subgroups of genes and conditions from the expression matrix of a microarray dataset. However, such algorithms are very sensitive to their input parameters and scale poorly. This paper proposes a scalable unsupervised biclustering framework, SUBic, to find high-quality constant-row biclusters in an expression matrix effectively. A one-dimensional clustering algorithm is proposed to partition the attributes, that is, the columns of an expression matrix, into disjoint groups based on the similarity of their expression values. These groups form a set of short transactions, which are used to discover a set of frequent itemsets, each of which corresponds to a bicluster. However, a bicluster may include attributes whose expression values are not similar enough to the others, so a bicluster refinement step enhances the quality of a bicluster by removing those attributes based on its distribution of expression values. The performance of the proposed method is analyzed comparatively through a series of experiments on synthetic and real datasets. |
| |
Keywords: | biclustering; clustering; expression matrix; frequent itemset; sub-matrix |
This article is indexed in CNKI, Wanfang Data, SpringerLink, and other databases. |
|
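The pipeline outlined in the abstract (one-dimensional grouping of columns, transactions, frequent itemsets) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the grouping is done per row, uses a simple gap threshold as a stand-in for the paper's one-dimensional clustering algorithm, and replaces the frequent itemset miner with a brute-force count. The function names and the parameters `gap`, `min_cols`, `min_rows`, and `size` are all hypothetical, and the refinement step is omitted.

```python
from itertools import combinations


def one_dim_groups(row, gap):
    """Split the column indices of one row into disjoint groups.

    Assumption: columns are sorted by value and a new group starts
    wherever two consecutive values differ by more than `gap`.
    """
    if not row:
        return []
    order = sorted(range(len(row)), key=lambda j: row[j])
    groups, current = [], [order[0]]
    for prev, cur in zip(order, order[1:]):
        if row[cur] - row[prev] > gap:
            groups.append(current)
            current = [cur]
        else:
            current.append(cur)
    groups.append(current)
    return groups


def build_transactions(matrix, gap, min_cols=2):
    """Return one (row index, column set) transaction per group."""
    return [
        (i, frozenset(group))
        for i, row in enumerate(matrix)
        for group in one_dim_groups(row, gap)
        if len(group) >= min_cols
    ]


def frequent_column_sets(transactions, min_rows, size):
    """Brute-force stand-in for a frequent itemset miner.

    Maps each column set of the given size to the rows supporting it,
    keeping only sets supported by at least `min_rows` rows.
    """
    support = {}
    for i, cols in transactions:
        for subset in combinations(sorted(cols), size):
            support.setdefault(subset, set()).add(i)
    return {
        cols: sorted(rows)
        for cols, rows in support.items()
        if len(rows) >= min_rows
    }


# Toy expression matrix: rows are genes, columns are conditions.
matrix = [
    [1.0, 1.1, 5.0, 0.9],
    [3.0, 3.1, 9.0, 3.05],
    [7.0, 7.2, 2.0, 7.1],
    [1.0, 4.0, 8.0, 12.0],
]
transactions = build_transactions(matrix, gap=0.5)
biclusters = frequent_column_sets(transactions, min_rows=3, size=3)
print(biclusters)  # {(0, 1, 3): [0, 1, 2]}
```

In this toy matrix, rows 0 to 2 each hold nearly constant values across columns 0, 1, and 3, so that column set is reported as a constant-row bicluster, while row 3 produces no transaction. A practical version would swap the brute-force count for a real frequent itemset miner, since enumerating subsets grows combinatorially with group size.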