Improvement of new automatic differential fuzzy clustering using SVM classifier for microarray analysis
Authors: Indrajit Saha, Ujjwal Maulik, Sanghamitra Bandyopadhyay, Dariusz Plewczynski
Affiliation: 1. Interdisciplinary Centre for Mathematical and Computational Modeling (ICM), University of Warsaw, 02-106 Warsaw, Poland; 2. Department of Computer Science and Engineering, Jadavpur University, Kolkata 700 032, West Bengal, India; 3. The Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700 108, West Bengal, India
Abstract: In recent years, the problem of clustering microarray data has gained significant attention. However, most clustering methods attempt to find groups of genes under the assumption that the number of clusters is known a priori. This motivated us to develop a new real-coded, improved differential evolution based automatic fuzzy clustering algorithm, which automatically evolves the number of clusters as well as the proper partitioning of a gene expression data set. To improve the result further, the clustering method is integrated with a support vector machine (SVM), a well-known technique for supervised learning. A fraction of the gene expression data points, selected from different clusters based on their proximity to the respective centers, is used for training the SVM. The cluster assignments of the remaining gene expression data points are thereafter determined using the trained classifier. The performance of the proposed clustering technique is demonstrated on five gene expression data sets by comparing it with differential evolution based automatic fuzzy clustering, variable length genetic algorithm based fuzzy clustering, and the well-known Fuzzy C-Means algorithm. A statistical significance test has been carried out to establish the statistical superiority of the proposed clustering approach. A biological significance test has also been carried out using a web-based gene annotation tool to show that the proposed method is able to produce biologically relevant clusters of genes. The processed data sets and the MATLAB version of the software are available at http://bio.icm.edu.pl/~darman/IDEAFC-SVM/.
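The step in which points near their cluster centers are set aside for SVM training can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function name `split_by_proximity`, the use of Euclidean distance, the default `fraction` of 0.5, and the toy data are all assumptions made for illustration, since the abstract does not specify the distance measure or the fraction used. Only the selection step is shown; the selected subset would then train a classifier such as an SVM, which labels the remaining points.

```python
import math


def split_by_proximity(points, labels, centers, fraction=0.5):
    """Split point indices into (train_idx, rest_idx).

    For each cluster, the `fraction` of its members closest to the
    cluster center (Euclidean distance, an assumption) is selected for
    training; all other points are left for the trained classifier.
    """
    # Group point indices by their current cluster label.
    by_cluster = {}
    for idx, label in enumerate(labels):
        by_cluster.setdefault(label, []).append(idx)

    train_idx = []
    for label, members in by_cluster.items():
        # Order the cluster's members from nearest to farthest from its center.
        ordered = sorted(members, key=lambda i: math.dist(points[i], centers[label]))
        # Keep at least one point per cluster so every class is represented.
        keep = max(1, math.ceil(fraction * len(ordered)))
        train_idx.extend(ordered[:keep])

    chosen = set(train_idx)
    rest_idx = [i for i in range(len(points)) if i not in chosen]
    return sorted(train_idx), rest_idx


# Hypothetical toy data: two clusters of three 2-D points each.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (5.0, 5.0), (5.2, 5.0), (7.0, 7.0)]
labels = [0, 0, 0, 1, 1, 1]
centers = {0: (0.0, 0.0), 1: (5.0, 5.0)}

train_idx, rest_idx = split_by_proximity(points, labels, centers, fraction=0.5)
print(train_idx, rest_idx)
```

In this toy example the two points farthest from their centers are left out of training, which mirrors the idea that points with uncertain cluster membership are reassigned by the classifier rather than used to train it.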
This article is indexed in ScienceDirect and other databases.