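Research on Informative Gene Selection and Analysis Methods for Tumor Classification Based on Gene Expression Profiles
Cite this article: LI Ying-Xin, LI Jian-Geng, RUAN Xiao-Gang. Research on Informative Gene Selection and Analysis Methods for Tumor Classification Based on Gene Expression Profiles[J]. Chinese Journal of Computers, 2006, 29(2): 324-330
Authors: LI Ying-Xin  LI Jian-Geng  RUAN Xiao-Gang
Affiliation: School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100022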
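Abstract: Selecting informative genes for tumor classification is an important means of discovering genes that are specifically expressed in tumors and of studying tumor gene expression patterns. Working with a multi-class tumor gene expression profile dataset, this paper starts from the classification of tumor versus normal tissue and analyzes the informative gene selection problem. First, a feature selection strategy based on the Relief algorithm is improved and used to generate candidate feature sets. Next, a support vector machine is used as the classifier to test the classification performance of these sets and select the informative genes. Finally, in combination with the classification model, a sensitivity analysis method is used to refine the search for informative genes and filter out redundant ones. With this method, 52 informative genes with good classification performance were selected as the gene signature of tumors, and their expression behavior is briefly analyzed.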

Keywords: tumor  gene expression  informative genes  tissue classification  feature selection  support vector machine
Received: 2004-08-12
Revised: 2005-11-04

Study of Informative Gene Selection for Tissue Classification Based on Tumor Gene Expression Profiles
LI Ying-Xin,LI Jian-Geng,RUAN Xiao-Gang. Study of Informative Gene Selection for Tissue Classification Based on Tumor Gene Expression Profiles[J]. Chinese Journal of Computers, 2006, 29(2): 324-330
Authors:LI Ying-Xin  LI Jian-Geng  RUAN Xiao-Gang
Affiliation: School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100022
Abstract: Informative gene selection is of great importance in the analysis of microarray expression data because such data have very high dimensionality and relatively few samples. It also provides a systematic and promising way to reveal the gene expression patterns of tumors from large-scale gene expression profiles. In this paper, the authors analyze the multi-class tumor gene expression profile dataset, which contains 218 tumor samples spanning 14 common tumor types as well as 90 normal tissue samples, to find a small subset of genes that distinguishes tumor from normal tissue. First, a Relief-based feature selection algorithm is applied to create candidate feature subsets, and the subset with the best classification performance is selected as the informative gene subset. Then, a sensitivity analysis method based on a support vector machine classifier with an RBF kernel is employed to eliminate redundant genes. As a result, 52 informative genes are selected as markers for distinguishing different tumor tissues from their normal counterparts, and their expression is analyzed to explore tumor gene expression patterns. At the end of the paper, several methods for informative gene selection are analyzed and compared to validate the feasibility and effectiveness of the method employed in this work.
Keywords: tumors   gene expression   informative genes   tissue classification   feature selection   support vector machine
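
The first step of the pipeline described in the abstract ranks genes with a Relief-based score. The sketch below illustrates only the basic two-class Relief weight update and is not the authors' modified algorithm, whose details the abstract does not give. The function name `relief_weights`, the range-normalized Manhattan distance, and the toy data are all assumptions made for illustration; this version also visits every sample once rather than drawing samples at random.

```python
def relief_weights(X, y):
    """Basic two-class Relief feature weights (illustrative sketch only).

    X: list of samples, each a list of numeric feature values.
    y: list of class labels, one per sample.
    Returns one weight per feature; larger means more discriminative.
    """
    n, d = len(X), len(X[0])

    # Scale each feature by its range so per-feature differences lie in [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        span = max(col) - min(col)
        spans.append(span if span > 0 else 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        near_hit = min(hits, key=lambda k: dist(X[i], X[k]))
        near_miss = min(misses, key=lambda k: dist(X[i], X[k]))
        # Reward features that differ from the nearest miss,
        # penalize features that differ from the nearest hit.
        for j in range(d):
            w[j] += (diff(X[i], X[near_miss], j) - diff(X[i], X[near_hit], j)) / n
    return w


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.2], [1.0, 0.8], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
w = relief_weights(X, y)
ranked = sorted(range(len(w)), key=lambda j: w[j], reverse=True)
print(w, ranked)
```

In a pipeline like the one the abstract outlines, the top-ranked features would form candidate subsets that are then scored by a classifier such as a support vector machine. That step is omitted here to keep the example dependency-free.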
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.