首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到19条相似文献,搜索用时 171 毫秒
1.
针对单处理器后序遍历二叉树的时间复杂度为O(n)问题,提出了在EREW PRAM并行计算模型下一种后序遍历二叉树的算法。将后序遍历二叉树的边构造一个单链表,使用指针跳越技术对单链表进行表序问题求解,从而得到后序遍历二叉树结点的顺序。得出了运用该算法将时间复杂度从O(n)减少到O(logn)的结论。  相似文献   

2.
随着新一代测序技术的发展,一些新的全基因组组装算法应运而生,特别是针对第三代高通量测序仪产生的海量短序列的组装软件被不断开发出来,这些组装软件渐渐走向市场。但是,由于这些组装软件的适用性和其性能的差别,选择一款性能优良的组装工具或者开发并行高吞吐的组装工具成为了当前面临的一大难题。本文选取基于 De Bruijn 图算法开发的 4 款 De Novo 组装的软件(Velvet、SOAPdenovo、IDBA、ABySS)对 4 种物种的基因组的模拟数据进行测试,并从软件的算法、组装性能和组装质量 3 个方面分析这 4 个软件的性能,同时根据其算法特点推断影响这些软件性能的关键因素,并给出软件的使用建议以及开发并行序列组装工具来组装超大规模的基因数据应该注意的问题。  相似文献   

3.
基因组测序是生物信息学中最基本的研究方向之一,然而大多数生物的基因组都不可能一次性获得,需要利用序列拼接技术对实验中获得的DNA片段进行拼接操作.目前,测序过程中获得的DNA片段越来越短,基于Euler路径的拼接算法在处理这种短片段拼按时具有优势.在Euler路径算法中,一个关键的步骤是de Bruijn图的构建,一直以来,构建de Bruijn图的方式总是让后一个k-mer与前一个k-mer 之间有k-1个碱基的交叠,相邻的两个k-mer之间相互错开一位.但文中的研究发现,如果有边连接的两个k-mer之间有k-2个或者更少的碱基相交叠,会对de Bruijn图结构复杂性产生重要影响.针对这些影响进行详细分析,并设计实验进行验证,实验结果表明,k-mer之间的错位数变化对de Bruijn图结构复杂性有显著影响.  相似文献   

4.
并行双调排序算法的有效实现及性能分析   总被引:1,自引:0,他引:1  
排序是计算机中最常见的操作之一,双调排序是一个非常著名的排离算法,也是最早的并行排序算法,又调排离对排序算法的研究具有非常深远的影响,基于双调排序算法的基本思想,介绍了双调排序在分布存储的并行计算机环境下的一种有效实现方式,采用局部多对多通信替换全局通信,很好地解决了双调排序中的通信问题,算法的计算复杂度为⊙n/p(logn log^2p),其中n为待排序的关键字个数,p为处理器数,算法在二维网孔结构上通信时间复杂度达到了O(2.12132√p.n/p)其量级达到了理论上的下限,分析结果表明,双调排序算法也具有很好的通信性能和可扩展性。  相似文献   

5.
图结构聚类(SCAN)是一种著名的基于密度的图聚类算法。该算法不仅能够找到图中的聚类结构,而且还能发现图中的Hub节点和离群节点。然而,随着图数据规模越来越大,传统的SCAN算法的复杂度为O(m1.5)(m为图中边的条数),因此很难处理大规模的图数据。为了解决SCAN算法的可扩展性问题,本文提出了一种新颖的基于MapReduce的海量图结构聚类算法MRSCAN。具体地,我们提出了一种计算核心节点,以及两种合并聚类的MapReduce算法。最后,在多个真实的大规模图数据集上进行实验测试,实验结果验证了算法的准确性、有效性,以及可扩展性。  相似文献   

6.
对于无线城市数据中社团发现问题,针对已有的团搜索(CS)算法运行过程生成大量重复团、生成结果冗余、算法时间复杂度较高等问题,从优化边存储、预先进行边处理、搜索建团入手,用特殊的二叉树结构存储、权重[K]选择排序、深度优先遍历构建T-CS算法。针对海量数据溢出问题,结合MapReduce模型,提出了MP-T-CS算法。实验证明,MP-T-CS算法不仅可以解决运行过程大量重复团问题,时间代价大大降低,对海量数据的处理能力大大提升,生成团的代表性大大提高。  相似文献   

7.
介绍了排序的基本概念和常见排序类型,讲解了图排序中的一种非常重要的排序算法拓扑排序的概念,研究了拓扑排序的原理以及2种常见的遍历算法,分析了拓扑排序的空间和时间复杂度,采用这2种遍历算法对有向无环图进行拓扑排序的实现,并对拓扑排序的常见应用场景进行了介绍。  相似文献   

8.
海量空间线矢量自动拼接研究   总被引:1,自引:0,他引:1  
由于地图接边以及地图综合的要求,空间矢量在应用时需要进行图幅内和图幅间的拼接.海量空间矢量数据拼接时,需要自动处理,且其效率十分重要.针对海量空间线矢量的图幅内自动拼接,提出了基于端点排序的拼接方法.此方法能减少拼接判断冗余,从而提高图幅内线矢量拼接效率.针对海量空间线矢量的图幅间自动拼接,提出了"虚拼接"方法.此方法生成图幅间线矢量拼接链,以拼接链实现线矢量ID映射,达到应用所需拼接效果,避免大范围内的数据重组.并设计了综合应用这两种方法实现海量空间线矢量拼接的技术方案.应用证明,这个技术方案行之有效.  相似文献   

9.
基因组序列拼接的主流方法是将整条序列随机打断成小片段,然后根据片段间重叠关系连接成长序列.由于较多噪音存在,算法复杂度高,加之生物数据的海量增长,序列拼接处理导致巨大的时空开销而无法完成.本文提出一种基于最大频繁序列模式的聚类算法,将整个数据集分成若干个子集,分别高效地处理,实现了一个基因拼接网格系统、透明动态的资源管理,大大扩展了基因拼接计算能力.基于最大频繁序列模式聚类算法及挖掘算法,针对生物数据的特性做出了优化.  相似文献   

10.
李智慧  林吓洪  申瑞民 《计算机仿真》2010,27(1):313-315,337
利用海量的生物网络数据发现功能模块越来越受到人们的重视,从蛋白质建模的网络图中挖掘高连通子图是其中一个很重要的问题,然而由于数据规模巨大,现有的算法在时间效率上无法胜任实际的应用需求。通过深入研究高连通图的性质定理,设计了一个高连通子图的贪心挖掘算法(HCSGM)算法,在时间复杂度上比HCS算法提高了一个数量级。实验结果表明,HCSGM算法在仿真数据上的挖掘结果优于HCS算法,并且能够从大规模网络图中快速地进行高连通子图挖掘,从而高效地从蛋白质相互作用数据库中挖掘出功能模块。  相似文献   

11.
大数据研究领域的许多问题可以转换为图的问题。本文将阐述鲲鹏大数据系统计算引擎中有关大规模图处理的研究进展以及应用,具体包括高效子图匹配算法、面向图的稀疏数据存储结构和大规模图异步计算模型及其在基因拼接中的应用。  相似文献   

12.
Graphs that are used to model real-world entities with vertices and relationships among entities with edges, have proven to be a powerful tool for describing real-world problems in applications. In most real-world scenarios, entities and their relationships are subject to constant changes. Graphs that record such changes are called dynamic graphs. In recent years, the widespread application scenarios of dynamic graphs have stimulated extensive research on dynamic graph processing systems that continuously ingest graph updates and produce up-to-date graph analytics results. As the scale of dynamic graphs becomes larger, higher performance requirements are demanded to dynamic graph processing systems. With the massive parallel processing power and high memory bandwidth, GPUs become mainstream vehicles to accelerate dynamic graph processing tasks. GPU-based dynamic graph processing systems mainly address two challenges: maintaining the graph data when updates occur (i.e., graph updating) and producing analytics results in time (i.e., graph computing). In this paper, we survey GPU-based dynamic graph processing systems and review their methods on addressing both graph updating and graph computing. To comprehensively discuss existing dynamic graph processing systems on GPUs, we first introduce the terminologies of dynamic graph processing and then develop a taxonomy to describe the methods employed for graph updating and graph computing. In addition, we discuss the challenges and future research directions of dynamic graph processing on GPUs.  相似文献   

13.
近年来,图计算在诸多领域发挥着越来越重要的作用。连通分量算法是图计算的重要基础算法,可以应用于可达性查询、一致性检测等众多场景。面向大规模图遍历Graph500标准测试,对连通分量算法进行了算法和数据结构优化。主要有以下创新:(1)对并查集提出了捷径向量算法,并测试了算法和数据结构的配合程度;(2)利用多线程迭代轮转对算法实现并行加速;(3)从多个维度比较了不同实现方法的优缺点。基于优化方法,对性能进行了评估分析,当scale=25(包含225个节点)时,捷径向量算法对基于二维向量和链表的按秩合并算法的加速比分别是1.38倍和1.40倍,对BFS和DFS的加速比分别为4.76倍和4.70倍,且空间占用为该2个算法的4.1%~4.6%,此外,并行对串行的加速比为1.57倍。  相似文献   

14.
In order to analyze and process the large graphs with high cost efficiency, researchers have developed a number of out-of-core graph processing systems in recent years based on just one commodity computer. On the other hand, with the rapidly growing need of analyzing graphs in the real-world, graph processing systems have to efficiently handle massive concurrent graph processing (CGP) jobs. Unfortunately, due to the inherent design for single graph processing job, existing out-of-core graph processing systems usually incur unnecessary data accesses and severe competition of I/O bandwidth when handling the CGP jobs. In this paper, we propose GraphCP, a disk I/O optimized out-of-core graph processing system that efficiently supports the processing of CGP jobs. GraphCP proposes a benefit-aware sharing execution model to share the I/O access and processing of graph data among the CGP jobs and adaptively schedule the graph data loading based on the states of vertices, which efficiently overcomes above challenges faced by existing out-of-core graph processing systems. Moreover, GraphCP adopts a dependency-based future-vertex updating model so as to reduce disk I/Os in the future iterations. In addition, GraphCP organizes the graph data with a Source-Sorted Sub-Block graph representation for better processing capacity and I/O access locality. Extensive evaluation results show that GraphCP is 20.5× and 8.9× faster than two out-of-core graph processing systems GridGraph and GraphZ, and 3.5× and 1.7× faster than two state-of-art concurrent graph processing systems Seraph and GraphSO.  相似文献   

15.
N-hop neighborhoods information is very useful in analytic tasks on large-scale graphs, like finding clique in a social network, recommending friends or advertising links according to one’s interests, predicting links among websites and etc. To get the N-hop neighborhoods information on a large graph, such as a web graph, a twitter social graph, the most straightforward method is to conduct a breadth first search (BFS) on a parallel distributed graph processing framework, such as Pregel and GraphLab. However, due to the massive volume of message transfer, the BFS method results in high communication cost and has low efficiency. In this work, we propose a key/value based method, namely KVB, which perfectly fits into the prevailing parallel graph processing framework and computes N-hop neighborhoods on a large scale graph efficiently. Unlike the BFS method, our method need not transfer large amount of neighborhoods information, thus, significantly reduces the overhead on both the communication and intermediate results in the distributed framework.We formalize the N-hop neighborhoods query processing as an optimization problem based on a set of quantitative cost metrics of parallel graph processing. Moreover, we propose a solution to efficiently load only the relevant neighborhoods for computation. Specially, we prove the optimal partial neighborhoods load problem is NP-hard and carefully design a heuristic strategy. We have implemented our algorithm on a distributed graph framework- Spark GraphX and validated our solution with extensive experiments over a number of real world and synthetic large graphs on a modest indoor cluster. Experiments show that our solution generally gains an order of magnitude speedup comparing to the state-of-art BFS implementation.  相似文献   

16.
一种基于De Bruijn网络结构的并行矩阵乘算法   总被引:1,自引:1,他引:0  
在De Bruijn网络中进行并行矩阵乘法运算,算法简单,容易实现。首先介绍了De Bruijn网络结构,然后提出了一种基于De Bruijn网络结构的矩阵乘法的并行算法,分析了它的加速比、效率等性能及可扩展性,通过与Cannon算法的比较,证明它的时间复杂度等效于Cannon算法,最后通过实验验证了这个结论的正确性。  相似文献   

17.
Database query processing can benefit significantly from parallelism. Parallel database algorithms combine substantial CPU and I/O activity, memory requirements, and massive data exchange between processes, all of which must be considered to obtain optimal performance. Since parallel external sorting is a very typical example, we have focused on sorting to tune Volcano, a new query processing system. The purpose of the Volcano project is to provide efficient, extensible tools for query and request processing in novel application domains, particularly in object-oriented and scientific database systems, and for experimental database performance research. It includes all query processing algorithms conventionally used in relational database systems as well as several new ones, and can execute all of them in parallel. In this article, we present Volcano's parallel external sorting algorithm and a sequence of enhancements to improve its performance. We obtained very good absolute performance, 84 seconds for 100 MB of data, as well as near-linear speedup with sixteen CPUs and disks. Furthermore, these results were achieved on a shared-memory machine despite the common belief that parallel query processing is best implemented on distributed-memory systems. We detail our tuning measures and report on their effectiveness.  相似文献   

18.
黎萍  朱军燕  彭芳  杨亮 《计算机工程》2014,(3):193-195,200
结合可视图的骨架构造方法和A~*图搜索方法,采用矩形包络障碍物,在障碍物顶点外延生成路径点。在此基础上,提出一种新的路径规划算法Lambda~*,与A~*算法类似,搜索过程需要2张表,但CLOSED表保存从起始节点开始的路径节点,OPEN表保存CLOSED表中扩展节点的后续节点,可减少在OPEN表中保存的节点数量,减少计算量和耗时,并通过增加SMOOTH过程以提高路径的平滑度。将算法应用于二维空间环境进行机器人路径规划仿真实验,结果表明,与A~*算法相比,Lambda~*算法能够以增加较少路径长度为前提,大幅降低路径规划的耗时。  相似文献   

19.
The undirected de Bruijn graph is often used as the model of communication network for its useful properties,such as short diameter,small maximum vertex degree.In this paper,we consider the alphabet overlap graph G(k,d,s): the vertex set V = {v|v = (v1 ...vk);vi ∈ {1,2,...,d},i = 1,2,...,k};they are distinct and two vertices u = (u1 ...uk) and v = (v1 ...vk) are adjacent if and only if us+i = vi or vs+i = ui (i = 1,2,...,k s).In particular,when s = 1,G(k,d,s) is just an undirected de Bruijn graph.First,we give a formula to calculate the vertex degree of G(k,d,s).Then,we use the corollary of Menger’s theorem to prove that the connectivity of G(k,d,s) is 2ds 2d2s k for s k/2.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号