Gene Expression Data Feature Selection Based on GA and Clustering
Original title: 基于遗传算法及聚类的基因表达数据特征选择
Cite this article: REN Jiang-Tao, HUANG Huan-Yu, SUN Jing-Hao, YIN Jian. Gene Expression Data Feature Selection Based on GA and Clustering[J]. Computer Science (计算机科学), 2006, 33(9): 155-156.
Authors: REN Jiang-Tao (任江涛)  HUANG Huan-Yu (黄焕宇)  SUN Jing-Hao (孙婧昊)  YIN Jian (印鉴)
Affiliation: Department of Computer Science, Zhongshan University (中山大学), Guangzhou 510275
Funding: National Natural Science Foundation of China; Natural Science Foundation of Guangdong Province
Abstract: Feature selection is an important problem in pattern recognition and data mining. For high-dimensional data such as gene expression data, feature selection can improve the accuracy and efficiency of classification and clustering, and it can also identify informative feature subsets, for example genes closely related to a disease. This paper proposes a new feature selection method for gene expression data: a genetic algorithm performs a randomized search over feature subsets, and each candidate subset is evaluated by running a clustering algorithm and measuring the clustering error rate. Experimental results show that the method effectively finds feature subsets with good separability, which reduces dimensionality and improves clustering and classification accuracy.

Keywords: Feature selection  Genetic algorithm (GA)  Clustering  Gene expression data
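
The abstract describes a wrapper scheme: a genetic algorithm proposes feature subsets and a clustering run scores each one by its error rate. A minimal sketch of that loop follows, under assumptions the abstract does not state: plain k-means as the clustering algorithm, known class labels for computing the error rate, tournament selection, one-point crossover, and bit-flip mutation. All function names, parameters, and the toy data generator are hypothetical and are not taken from the paper.

```python
import random
from itertools import permutations


def kmeans(points, k, rng, iters=10):
    """Plain Lloyd's k-means (an assumed choice); returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def clustering_error(labels, truth, k):
    """Error rate under the best one-to-one mapping of clusters to classes."""
    best_hits = max(
        sum(perm[lab] == t for lab, t in zip(labels, truth))
        for perm in permutations(range(k))
    )
    return 1.0 - best_hits / len(truth)


def fitness(mask, data, truth, k, seed):
    """Fitness of a feature subset = 1 - clustering error on those features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot be clustered
    points = [[row[j] for j in cols] for row in data]
    labels = kmeans(points, k, random.Random(seed))
    return 1.0 - clustering_error(labels, truth, k)


def ga_select(data, truth, k=2, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search bit masks over features; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n = len(data[0])  # assumes at least two features
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:  # each distinct subset is clustered only once
            cache[key] = fitness(mask, data, truth, k, seed)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        nxt = ranked[:2]  # elitism: carry the two best masks forward
        while len(nxt) < pop_size:
            # tournament selection of two parents
            a, b = (max(rng.sample(ranked, 3), key=score) for _ in range(2))
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # mutation
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    return best, score(best)


def make_toy_data(n_per_class=20, n_noise=5, seed=1):
    """Synthetic stand-in for expression data: feature 0 separates the classes."""
    rng = random.Random(seed)
    data, truth = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = rng.gauss(4.0 * label, 0.5)
            noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
            data.append([informative] + noise)
            truth.append(label)
    return data, truth
```

Calling `ga_select(*make_toy_data())` returns a 0/1 mask over the six toy features together with its fitness. Real gene expression data has thousands of features, so a practical run would need a larger population and probably a term that penalizes large subsets; neither detail is given in the abstract.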
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.