
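Clustering Algorithm for Single-Cell Sequencing Data Based on Heterogeneous Parallel Computing
Cite this article: 谢林娟, 李荔瑄, 张少强. 基于异构并行计算的单细胞测序数据聚类算法[J]. 计算机工程与应用, 2022, 58(24): 83-89.
Authors: 谢林娟  李荔瑄  张少强
Affiliation: College of Computer and Information Engineering, Tianjin Normal University (天津师范大学 计算机与信息工程学院), Tianjin 300387
Abstract: With advances in single-cell RNA sequencing, throughput has grown from thousands of cells per experiment to a mainstream scale of tens of thousands. Cell typing from single-cell RNA sequencing data is one of the important problems in cell research and is mainly addressed with unsupervised clustering. Existing clustering methods for large-scale single-cell sequencing data lower their time complexity by simplifying the cell relationship network, which reduces cell-typing accuracy, while the commonly used higher-accuracy cell-typing methods cannot handle large-scale data. To address this, k-nearest neighbors are combined with a cell similarity threshold to build a new cell relationship network, CPU+GPU heterogeneous parallel computing is used to speed up computation, and cells are clustered with an improved Markov clustering algorithm. Experiments on seven relatively large single-cell datasets show that the algorithm achieves better clustering accuracy than the main existing algorithms, making it suitable for cell typing on data from mainstream single-cell sequencing technologies.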

Keywords: single-cell RNA sequencing  unsupervised clustering  parallel computing  cell typing

Clustering of Single-Cell RNA-Seq Data Based on Heterogeneous Parallel Computing
XIE Linjuan, LI Lixuan, ZHANG Shaoqiang. Clustering of Single-Cell RNA-Seq Data Based on Heterogeneous Parallel Computing[J]. Computer Engineering and Applications, 2022, 58(24): 83-89.
Authors: XIE Linjuan  LI Lixuan  ZHANG Shaoqiang
Affiliation: College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China
Abstract: With the development of single-cell RNA sequencing (scRNA-seq) technology, mainstream scRNA-seq throughput has grown from thousands of cells to tens of thousands of cells. Cell typing based on scRNA-seq data is one of the important problems in cell research and mainly relies on unsupervised clustering methods. Existing clustering methods for large-scale single-cell sequencing data reduce time complexity by simplifying the single-cell network, which lowers the accuracy of cell typing, whereas the common high-accuracy cell typing methods cannot handle large-scale data. This study therefore combines k-nearest neighbors (KNN) with a cell-cell similarity threshold to construct a new single-cell network, uses CPU+GPU heterogeneous parallel computing to improve computing speed, and performs cell clustering with an improved Markov clustering algorithm. Experiments on seven large-scale single-cell datasets show that the algorithm achieves better clustering accuracy than the main existing algorithms and is thus suitable for cell typing of scRNA-seq data produced by mainstream technologies.
Keywords: single-cell RNA-seq  unsupervised clustering  parallel computing  cell typing
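
The pipeline summarised in the abstract (a KNN-plus-threshold cell network followed by Markov clustering) can be sketched as follows. This is a minimal, dependency-free Python illustration and not the authors' implementation: the abstract does not say how the k-nearest-neighbor rule and the similarity threshold are combined, which similarity measure is used, how the Markov clustering is improved, or which steps run on the GPU. The sketch assumes cosine similarity, keeps an edge only when both conditions hold, and uses the textbook MCL loop of expansion and inflation. The names `build_network` and `markov_cluster` and the parameters `k`, `threshold`, and `inflation` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two expression vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_network(cells, k, threshold):
    """Assumed rule: link i and j if j is among i's k most similar cells
    AND their similarity is at least `threshold`. Edges are symmetrised."""
    n = len(cells)
    sim = [[cosine_similarity(cells[i], cells[j]) for j in range(n)] for i in range(n)]
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: sim[i][j], reverse=True)
        for j in others[:k]:
            if sim[i][j] >= threshold:
                adj[i][j] = adj[j][i] = sim[i][j]
    return adj


def normalize_columns(m):
    """Scale every column to sum to 1 (column-stochastic), in place."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                m[i][j] /= total
    return m


def matmul(a, b):
    """Dense matrix product; the kind of step one would typically offload to a GPU."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adj, inflation=2.0, max_iter=100, tol=1e-9, eps=1e-6):
    """Textbook MCL (not the paper's improved variant)."""
    n = len(adj)
    m = [row[:] for row in adj]
    for i in range(n):
        m[i][i] = 1.0  # self-loops, a common MCL convention
    normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)                                       # expansion
        nxt = [[v ** inflation for v in row] for row in expanded]     # inflation
        normalize_columns(nxt)
        delta = max(abs(nxt[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Read clusters as connected components of the remaining flow (entries > eps).
    label = [-1] * n
    clusters = []
    for start in range(n):
        if label[start] != -1:
            continue
        label[start] = len(clusters)
        stack, members = [start], []
        while stack:
            u = stack.pop()
            members.append(u)
            for v in range(n):
                if label[v] == -1 and (m[u][v] > eps or m[v][u] > eps):
                    label[v] = label[start]
                    stack.append(v)
        clusters.append(sorted(members))
    return clusters


# Toy data: six "cells" with three "genes", forming two obvious groups.
cells = [[1, 0, 0.1], [0.9, 0.1, 0], [1, 0.05, 0.05],
         [0, 1, 0.1], [0.1, 0.9, 0], [0.05, 1, 0.05]]
adj = build_network(cells, k=2, threshold=0.8)
print(markov_cluster(adj))  # [[0, 1, 2], [3, 4, 5]]
```

In this toy example the threshold is what keeps the network sparse when `k` is generous: with `k=5` and `threshold=0.0` every pair of cells is linked, while `threshold=0.8` retains only the within-group edges. The cubic-cost `matmul` is written in plain Python for readability; at the scale the abstract targets (tens of thousands of cells) it would need a sparse or GPU-backed matrix product.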