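Gene Markers Identification Algorithm for Detecting Colon Cancer Patients*
Cite this article: XIE Juanying, FAN Wen. Gene Markers Identification Algorithm for Detecting Colon Cancer Patients*[J]. Pattern Recognition and Artificial Intelligence, 2017, 30(11): 1019-1029.
Authors: XIE Juanying, FAN Wen
Affiliation: 1. School of Computer Science, Shaanxi Normal University, Xi'an 710119
2. School of Software Engineering, University of Science and Technology of China, Suzhou 215123
Funding: Supported by the National Natural Science Foundation of China (No.61673251), the Key Project of the Fundamental Research Funds for the Central Universities (No.GK201701006), the Science and Technology Key Program of Shaanxi Province (No.2013K12-03-24), and the Graduate Training Innovation Fund of Shaanxi Normal University (No.2016CSY009, 2015CXS028)
Abstract: To obtain a very small set of colon cancer feature genes that carry strong classification information, and thereby identify colon cancer patients accurately, this paper proposes a gene marker identification algorithm for diagnosing colon cancer patients. The notions of gene density and gene distance are introduced first, and a 2D scatter plot of the genes is constructed with gene density on the horizontal axis and gene distance on the vertical axis. The genes located at density peaks are selected to form the preferred gene subset. The samples of the colon dataset, reduced to these genes, are then clustered with the density peak K-medoids (DP_K-medoids) algorithm. Gene distances and sample distances are each measured with the Euclidean, Manhattan, Chebyshev and cosine distances. Experiments show that, under the cosine distance, the proposed algorithm selects a small colon cancer gene subset with high accuracy, sensitivity, specificity and Matthews correlation coefficient.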

Keywords: Clustering; K-medoids Algorithm; Density Peak K-medoids (DP_K-medoids) Algorithm; Gene Selection; Colon Cancer
Received: 2017-05-04
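
The gene selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions of gene density and gene distance, so the code below assumes the usual density-peaks definitions: density is the number of other genes within a cutoff `dc`, and distance is the distance to the nearest gene of higher density. The cutoff `dc`, the ranking by density times distance, and all function names are illustrative assumptions, not the authors' method.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; taken as 1.0 if either vector is all zeros."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 1.0
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    return max(0.0, 1.0 - cos)  # clamp tiny negative rounding errors

def gene_density_and_distance(genes, dc, dist=cosine_distance):
    """genes: list of expression profiles, one per gene (values across samples).

    ASSUMED definitions (density-peaks style, not taken from the paper):
      rho[i]   = number of other genes closer to gene i than the cutoff dc
      delta[i] = distance from gene i to the nearest gene with higher rho,
                 or the largest distance from gene i if no gene has higher rho
    """
    n = len(genes)
    D = [[dist(genes[i], genes[j]) for j in range(n)] for i in range(n)]
    rho = [sum(1 for j in range(n) if j != i and D[i][j] < dc) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [D[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(D[i]))
    return rho, delta

def select_peak_genes(genes, dc, k):
    """Pick k genes with the largest rho * delta (a hypothetical ranking rule).

    On the 2D plot these are the genes toward the upper right corner,
    i.e. high density and far from any denser gene.
    """
    rho, delta = gene_density_and_distance(genes, dc)
    ranked = sorted(range(len(genes)), key=lambda i: rho[i] * delta[i], reverse=True)
    return ranked[:k], rho, delta

# Toy data: two groups of three similar genes, measured on three samples.
toy_genes = [
    [1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.95, 0.05, 0.0],
    [0.0, 1.0, 0.0], [0.0, 0.9, 0.1], [0.0, 0.95, 0.05],
]
selected, rho, delta = select_peak_genes(toy_genes, dc=0.05, k=2)
```

Plotting `rho` against `delta` gives the kind of 2D scatter plot the abstract refers to. Swapping `dist` for a Euclidean, Manhattan or Chebyshev function would cover the other three measures the paper compares.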

Gene Markers Identification Algorithm for Detecting Colon Cancer Patients
XIE Juanying, FAN Wen. Gene Markers Identification Algorithm for Detecting Colon Cancer Patients[J]. Pattern Recognition and Artificial Intelligence, 2017, 30(11): 1019-1029.
Authors: XIE Juanying, FAN Wen
Affiliation: 1. School of Computer Science, Shaanxi Normal University, Xi'an 710119
2. School of Software Engineering, University of Science and Technology of China, Suzhou 215123
Abstract: To detect the few informative genes that carry strong classification information and to identify colon cancer patients as accurately as possible, an algorithm is proposed to identify gene markers for detecting colon cancer patients. Densities and distances are first defined for genes. All genes are plotted in a 2D space with gene density on the X-axis and gene distance on the Y-axis. The genes at density peaks are selected to construct the optimal gene subset. The samples of the colon dataset, restricted to the genes in the optimal gene subset, are then clustered by the DP_K-medoids clustering algorithm. The distances between genes and between samples are each calculated with the Euclidean, Manhattan, Chebyshev and cosine distances. The experimental results demonstrate that, under the cosine distance, the proposed algorithm finds a colon cancer gene subset with high accuracy, sensitivity, specificity and Matthews correlation coefficient (MCC) while using very few genes.
Keywords: Clustering; K-medoids Algorithm; Density Peak Optimized K-medoids Algorithm; Gene Selection; Colon Cancer
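
The four evaluation measures named in the abstract have standard definitions in terms of the confusion matrix counts. The sketch below computes them for a binary labeling; the function name, the choice of `1` as the positive (patient) label, and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration.

```python
import math

def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and MCC from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Toy example: 3 patients (1) and 3 controls (0), one error in each class.
metrics = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

Because cluster labels produced by a clustering algorithm are arbitrary, each cluster has to be mapped to a class (for example by majority vote) before these measures can be computed.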