
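Research on a Metagenomic Contig Binning Method Based on Manifold Embedding
Cite this article: He Chong, Wang Meili, Jing Xu. Research on a metagenomic contig binning method based on manifold embedding[J]. Computer Applications and Software, 2022, 39(3): 82-88.
Authors: He Chong  Wang Meili  Jing Xu
Affiliations: College of Information Engineering, Northwest A&F University, Xianyang 712100, Shaanxi; College of Information Engineering, Northwest A&F University, Xianyang 712100, Shaanxi; Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Northwest A&F University, Xianyang 712100, Shaanxi; Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Northwest A&F University, Xianyang 712100, Shaanxi
Funding: Shaanxi Province Key Research and Development Program (2019ZDLNY07-02-01, 2019NY-167)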
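Abstract: Metagenomic assembly often yields only relatively long contigs and cannot recover complete genomes. Some existing binning methods do not fully exploit the internal structure of the sequence composition and sample coverage of contigs. A metagenomic contig binning method based on manifold embedding was developed; it captures the nonlinear structural features within high-dimensional data, thereby reducing the dimensionality of the data and improving computational performance. The initial number of bins is estimated from the manifold embedding result, which, compared with the bin number based on single-copy genes...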

Keywords: Metagenomics  Binning  Manifold embedding

METAGENOMICS CONTIG BINNING BASED ON MANIFOLD EMBEDDING
He Chong, Wang Meili, Jing Xu. METAGENOMICS CONTIG BINNING BASED ON MANIFOLD EMBEDDING[J]. Computer Applications and Software, 2022, 39(3): 82-88.
Authors: He Chong  Wang Meili  Jing Xu
Affiliation: (College of Information Engineering, Northwest A&F University, Xianyang 712100, Shaanxi, China; Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture, Northwest A&F University, Xianyang 712100, Shaanxi, China; Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Northwest A&F University, Xianyang 712100, Shaanxi, China)
Abstract: Metagenomic assembly often yields only relatively long contigs and cannot recover complete genomes. Some existing binning methods do not fully exploit the internal structure of the sequence composition and sample coverage of contigs. This paper develops a metagenomic contig binning method based on manifold embedding, which captures the nonlinear structural features within high-dimensional data, thereby reducing the dimensionality of the data and improving computational performance. The method uses the manifold embedding result to estimate the initial number of bins, which is more efficient than bin-number initialization based on single-copy genes. Built on sequence composition and sample coverage information, the manifold embedding better reveals the internal structure of the high-dimensional data in the embedding space and provides more effective features for binning. Compared with other methods, this method achieves the highest ACC, NMI, and ARI on the SpeciesMock dataset.
Keywords: Metagenomics  Binning  Manifold embedding
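
The abstract reports results in terms of ACC, NMI, and ARI. As a rough illustration of how two of these clustering scores can be computed for a binning result, here is a minimal standard-library sketch. The function names and toy labels are hypothetical, and the arithmetic-mean normalization used for NMI is an assumption, since the abstract does not state which variant the authors used. ACC is omitted because it additionally requires an optimal matching between bins and reference genomes.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index (ARI) between two labelings of the same contigs."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    n = len(true_labels)
    if n < 2:
        return 1.0  # no pairs to compare
    # Count pairs of contigs that fall together in each labeling.
    cells = Counter(zip(true_labels, pred_labels))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_cells - expected) / (max_index - expected)


def _entropy(labels):
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(true_labels, pred_labels):
    """NMI with arithmetic-mean normalization (an assumed variant)."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    n = len(true_labels)
    count_true = Counter(true_labels)
    count_pred = Counter(pred_labels)
    cells = Counter(zip(true_labels, pred_labels))
    mutual_info = sum(
        (c / n) * log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in cells.items()
    )
    denom = (_entropy(true_labels) + _entropy(pred_labels)) / 2
    if denom == 0:
        return 1.0  # both labelings consist of a single cluster
    return mutual_info / denom


if __name__ == "__main__":
    # Toy example: reference genome of each contig vs. the bin assigned to it.
    reference = [0, 0, 1, 1, 2, 2]
    bins = [1, 1, 0, 0, 2, 2]
    print(adjusted_rand_index(reference, bins))
    print(normalized_mutual_info(reference, bins))
```

Both scores are invariant to how the bins are numbered, so a binning that recovers the reference grouping under different bin IDs still scores as a perfect match.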
This article is indexed in VIP, Wanfang Data, and other databases.