
DNA微阵列数据特征提取的分类方法研究
引用本文:彭红毅,叶燕锐,张俊辉,罗泽举,奉国和. DNA微阵列数据特征提取的分类方法研究[J]. 计算机工程与应用, 2010, 46(28): 40-42. DOI: 10.3778/j.issn.1002-8331.2010.28.011
作者姓名:彭红毅  叶燕锐  张俊辉  罗泽举  奉国和
作者单位:1.华南农业大学 理学院 统计系,广州 510642 2.华南理工大学 生物科学与工程学院,广州 510006 3.重庆工商大学 计算机科学与信息工程学院,重庆 400067 4.华南师范大学 经济管理学院,广州 510006
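Funding:National Social Science Foundation of China; Natural Science Foundation of Guangdong Province; President's Fund of South China Agricultural University; Key Research Project of the Chongqing Science and Technology Commission
摘    要:常用的排列方法从DNA微阵列数据中选择的基因集合往往会包含相关性较高的基因,而且使用单个基因评价方法也不能真正反映由此得到的特征集合分类能力的优劣。另外,基因数量远多于样本数量是进行疾病诊断面临的又一挑战。为此,提出一种DNA微阵列数据特征提取方法用于组织分类。该方法运用K-means方法对基因进行聚类分析,获取各子类DNA微阵列数据中心,用排列法去除对分类无关的子类,然后利用ICA方法提取剩余子类集合的特征,用SVMs方法构造分类器对组织进行分类。真实的生物学数据实验表明,该方法通过提取一种复合基因,能综合评价基因分类能力,减少特征数,提高分类器的分类准确性。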

关 键 词:DNA微阵列  特征提取  独立成分分析(ICA)  聚类分析  支持向量机(SVMs)  
Received:2010-03-08
Revised:2010-04-21

Method of extracting features from DNA microarray data for classification
PENG Hong-yi, YE Yan-rui, ZHANG Jun-hui, LUO Ze-ju, FENG Guo-he. Method of extracting features from DNA microarray data for classification[J]. Computer Engineering and Applications, 2010, 46(28): 40-42. DOI: 10.3778/j.issn.1002-8331.2010.28.011
Authors:PENG Hong-yi  YE Yan-rui  ZHANG Jun-hui  LUO Ze-ju  FENG Guo-he
Affiliation:1. College of Science, South China Agricultural University, Guangzhou 510642, China 2. School of Bioscience & Bioengineering, South China University of Technology, Guangzhou 510006, China 3. School of Computer Science and Information Engineering, Chongqing Technology and Business University, Chongqing 400067, China 4. College of Economics and Management, South China Normal University, Guangzhou 510006, China
Abstract:Gene sets selected from DNA microarray data by common ranking methods often contain many highly correlated genes, and evaluating genes one at a time does not truly reflect the classification ability of the resulting feature set. In addition, disease diagnosis based on gene expression microarray data faces a further challenge: the number of genes far exceeds the number of samples. To address these problems, a feature extraction method for DNA microarray data is proposed for tissue classification. The method clusters the genes with K-means to obtain the data center of each subclass, then uses a ranking method to remove the subclasses that are irrelevant to classification. The features of the remaining subclasses are then extracted by ICA, and a classifier is built with SVMs to classify the tissues. Experiments on real biological data show that, by extracting a kind of compound gene, the method can evaluate the classification ability of genes jointly, reduce the number of features, and improve the classification accuracy of the classifier.
Keywords:DNA microarray  feature extraction  Independent Component Analysis (ICA)  clustering analysis  Support Vector Machines (SVMs)
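
The first two stages described in the abstract (clustering the genes with K-means, then ranking the cluster centers and discarding the uninformative ones) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the ranking criterion, so the signal-to-noise score used here is an assumption, as are the farthest-first initialization, the function names, and the toy data. The ICA and SVM stages are left out; in practice the retained centers would be passed to an ICA routine and the resulting components to an SVM classifier.

```python
import statistics


def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(profiles, k):
    """Deterministic initialization (an assumption, not from the paper):
    start from the first gene, then repeatedly add the gene that is
    farthest from all centers chosen so far."""
    centers = [list(profiles[0])]
    while len(centers) < k:
        nxt = max(profiles, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans_genes(profiles, k, iters=100):
    """Cluster genes; each profile holds one gene's values across all samples."""
    centers = farthest_first(profiles, k)
    labels = None
    for _ in range(iters):
        # Assign every gene to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Recompute each center as the mean profile of its member genes.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [statistics.mean(col) for col in zip(*members)]
    return centers, labels


def signal_to_noise(center, classes):
    """Assumed ranking score: |mean0 - mean1| / (std0 + std1) over two classes."""
    a = [v for v, y in zip(center, classes) if y == 0]
    b = [v for v, y in zip(center, classes) if y == 1]
    spread = statistics.pstdev(a) + statistics.pstdev(b)
    return abs(statistics.mean(a) - statistics.mean(b)) / (spread + 1e-12)


def select_centers(centers, classes, keep):
    """Return the indices of the `keep` highest-scoring cluster centers."""
    ranked = sorted(
        range(len(centers)),
        key=lambda c: signal_to_noise(centers[c], classes),
        reverse=True,
    )
    return sorted(ranked[:keep])


# Toy data (hypothetical): 6 genes measured on 6 samples, 3 samples per class.
classes = [0, 0, 0, 1, 1, 1]
profiles = [
    [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # differs strongly between the classes
    [1.1, 1.0, 1.0, 5.2, 4.9, 5.0],
    [0.9, 1.1, 1.2, 4.9, 5.0, 5.1],
    [3.0, 2.0, 3.1, 2.1, 3.0, 2.0],  # fluctuates independently of the class
    [3.1, 2.1, 3.0, 2.0, 3.1, 2.1],
    [2.9, 1.9, 3.0, 2.0, 2.9, 2.0],
]

centers, labels = kmeans_genes(profiles, k=2)
kept = select_centers(centers, classes, keep=1)
print("gene labels:", labels)
print("retained cluster centers:", kept)
```

Working on cluster centers rather than individual genes means that highly correlated genes are collapsed into one representative profile before ranking, which is the redundancy problem the abstract raises for single-gene ranking.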