Gene selection for designing optimal fuzzy rule base classifier by estimating missing value
Affiliation:1. Computer Science and Engineering of St. Thomas’ College of Engineering and Technology, Kidderpore, Kolkata 700023, India;2. Computer Science and Technology of Indian Institute of Engineering Science and Technology, Shibpur, Howrah 711103, India;3. Centre for Healthcare Science and Technology of Indian Institute of Engineering Science and Technology, Shibpur, Howrah 711103, India
Abstract:DNA microarray technology, a high-throughput technology, evaluates the expression of thousands of genes simultaneously under different experimental conditions. Analysis of gene expression data reveals that only a few important genes, not all of them, are responsible for a disease. However, DNA microarray data sets usually contain multiple missing values, so selecting important genes from an incomplete data set may be erroneous and may result in misclassification in disease prediction. In this paper we propose an integrated framework that first imputes the missing values and then, to achieve maximum accuracy in classifying patients, designs a classifier that selects genes using the completed microarray data set. Here, functionally similar genes are employed to estimate the missing values, unlike existing approaches that rely on distance-based similarity measures over gene expression values. However, functionally similar genes may differ in their protein production capacity, so the degree of similarity varies from gene to gene. This problem is addressed by a novel method that imputes the missing values using the concept of fuzzy similarity. After imputation, the continuous gene expression matrix is discretized using fuzzy sets to distinguish the activation levels of different genes. The proposed fuzzy importance factor (FIf) of each gene represents its activation level, or protein production capacity, in both the disease and normal classes. The importance of each gene is evaluated while optimizing the number of rules in the fuzzy classifier depending on the FIf. The proposed methodology is demonstrated on nine different cancer data sets and compared with state-of-the-art methods. Analysis of the experimental results reveals that the proposed framework is able to classify diseased and normal patients with improved accuracy.
Keywords:DNA microarray  Fuzzy rule base classifier  Impute missing value  Gene selection  Fuzzy similarity
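The abstract names two steps, imputing missing values from similar genes and discretizing expression values with fuzzy sets, without giving formulas. The sketch below illustrates the general idea only and is not the authors' method: it assumes a similarity-weighted average for imputation and three evenly spaced fuzzy sets (low, medium, high) for discretization. The names `impute_missing`, `fuzzy_label`, and the `similarity` matrix are hypothetical, and how the degrees of similarity are obtained is left outside the sketch.

```python
from math import isnan


def impute_missing(expr, similarity):
    """Fill NaN entries of a genes-by-samples matrix (list of lists).

    similarity[g][h] in [0, 1] is an ASSUMED, externally supplied degree of
    similarity between genes g and h; the abstract does not define it.
    Each missing value is replaced by the similarity-weighted average of the
    other genes' observed values in the same sample (an illustrative choice).
    """
    filled = [row[:] for row in expr]  # leave the input untouched
    for g, row in enumerate(expr):
        for s, value in enumerate(row):
            if not isnan(value):
                continue
            num = den = 0.0
            for h, other in enumerate(expr):
                if h == g or isnan(other[s]):
                    continue  # skip the gene itself and unobserved values
                weight = similarity[g][h]
                num += weight * other[s]
                den += weight
            if den > 0:
                filled[g][s] = num / den  # otherwise the entry stays NaN
    return filled


def fuzzy_label(x, lo, hi):
    """Map an expression value to 'low', 'medium', or 'high'.

    Assumes three fuzzy sets spread evenly over [lo, hi]; the label is the
    set with the largest membership degree.
    """
    t = (x - lo) / (hi - lo) if hi > lo else 0.5  # normalize to [0, 1]
    memberships = {
        "low": max(0.0, 1.0 - 2.0 * t),
        "medium": max(0.0, 1.0 - abs(2.0 * t - 1.0)),
        "high": max(0.0, 2.0 * t - 1.0),
    }
    return max(memberships, key=memberships.get)


if __name__ == "__main__":
    nan = float("nan")
    expr = [[1.0, nan], [2.0, 4.0], [3.0, 8.0]]
    sim = [[1.0, 0.75, 0.25], [0.75, 1.0, 0.5], [0.25, 0.5, 1.0]]
    completed = impute_missing(expr, sim)
    print(completed[0][1])
    print([fuzzy_label(v, 0.0, 10.0) for v in (0.0, 5.0, 9.0)])
```

In this toy input the missing entry of the first gene is estimated from the two other genes in proportion to their assumed similarity, so the more similar gene contributes more. A real pipeline would derive the similarity degrees from functional annotation and would tune the fuzzy sets per gene.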
This article is indexed in ScienceDirect and other databases.