Application of the K-means Clustering Algorithm in Tumor Gene Variant Identification

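Cite this article:	Ye Xiao. Application of the K-means clustering algorithm in tumor gene variant identification [J]. Computer Applications and Software, 2019, 36(3): 287-290, 333.

Author:	Ye Xiao

Affiliation:	Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433

Abstract:	The rapid development of next-generation sequencing (NGS) data has accelerated the exploration of genes while also posing greater challenges for sequencing data analysis. Identifying cancer-cell-specific variants is an important foundational task in sequencing data analysis. Most current variant identification tools use Bayesian model methods, and their specificity, sensitivity, and speed fall far short of what is needed. K-means is a simple and efficient unsupervised clustering algorithm; building on it, the proposed method maps site information into multidimensional features and then clusters the sites into two classes. The algorithm markedly improves accuracy and recall, and experimental results verify its effectiveness.
Keywords:	K-means; variant identification; next-generation sequencing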
USING K-MEANS CLUSTERING ALGORITHM FOR CANCER GENE VARIANT DETECTING

Ye Xiao. USING K-MEANS CLUSTERING ALGORITHM FOR CANCER GENE VARIANT DETECTING [J]. Computer Applications and Software, 2019, 36(3): 287-290, 333.

Author:	Ye Xiao

Affiliation:	(Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China)

Abstract:	The rapid growth of next-generation sequencing (NGS) data has accelerated the exploration of genes and has also brought greater challenges to sequencing data analysis. Identifying cancer-specific variants is an important basic task in sequencing data analysis. Most current variant-calling tools use Bayesian model methods, but their specificity, sensitivity, and speed fall far short of what is needed. K-means is a simple and efficient unsupervised clustering algorithm. The proposed algorithm maps site information into multidimensional features and then clusters the sites into two classes. It clearly improves accuracy and recall, and experimental results verify its effectiveness.

Keywords:	K-means; Variant calling; Next-generation sequencing
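
Illustration:	The abstract describes mapping each site to a multidimensional feature vector and then clustering the sites into two classes with K-means. The sketch below shows that general idea only and is not the author's implementation. The abstract does not say which features are used, so the two features here (variant allele fraction in the tumor sample and in the matched normal sample), the function name `kmeans_two_clusters`, and the toy values are all assumptions made for illustration.

```python
import numpy as np


def kmeans_two_clusters(features, n_iter=100, seed=0):
    """Plain Lloyd's k-means with k=2 on an (n_sites, n_features) array."""
    rng = np.random.default_rng(seed)
    x = np.asarray(features, dtype=float)
    # Initialize the two centroids from two distinct, randomly chosen sites.
    centroids = x[rng.choice(len(x), size=2, replace=False)]
    for _ in range(n_iter):
        # Assign each site to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new_centroids = np.array([
            x[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(2)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels


# Hypothetical per-site features: [tumor VAF, normal VAF]. Values are made up.
sites = np.array([
    [0.45, 0.00],
    [0.38, 0.01],
    [0.52, 0.00],
    [0.01, 0.00],
    [0.02, 0.01],
    [0.00, 0.00],
])

labels = kmeans_two_clusters(sites)

# Cluster ids are arbitrary, so name the cluster with the higher mean
# tumor VAF as the candidate-variant cluster (an assumed convention).
mean_vaf = [sites[labels == k, 0].mean() for k in range(2)]
variant_cluster = int(np.argmax(mean_vaf))
is_variant = labels == variant_cluster
print(is_variant.tolist())
```

Both assumed features are fractions on the same 0 to 1 scale, so no rescaling is applied here. Features on different scales, such as read depth, would likely need normalization before the distance computation.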
This article is indexed in databases including VIP and Wanfang Data.
