Similar Literature
 Found 20 similar articles (search time: 156 ms)
1.
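Bioinformatics is the science of storing, retrieving, and analyzing biological information with the computer as its tool. Sequence alignment is a fundamental problem in bioinformatics, and designing fast, effective alignment algorithms is a major research topic: comparing sequences reveals functional, structural, and evolutionary information in biological sequences, and alignment is the basic operation of such comparison. This paper reviews the current state of sequence alignment algorithms, describes the commonly used classes of algorithms, and analyzes their strengths and weaknesses.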

2.
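A local optimization algorithm for multiple sequence alignment   Total citations: 2 (self-citations: 0, citations by others: 2)
Alignment of families of nucleotide and amino acid sequences is a basic tool in molecular biology research. Multiple sequence alignment is an NP-complete problem, so any attempt to find an algorithm that is both fast and exact faces great difficulty. This paper proposes a local optimization algorithm for multiple sequence alignment and tests it on sequences from a gene database.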

3.
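Feng Xiaolong, Gao Jing. 《计算机仿真》 (Computer Simulation), 2020, 37(2): 231-236
To address the long running time of short-read gene sequence alignment in bioinformatics analysis, a distributed computing model is designed on the Spark platform using RDD datasets and the distributed file system HDFS. A divide-and-conquer strategy splits the large computing task into many small, non-overlapping tasks that run in parallel on a distributed cluster. Data is distributed by a partitioning algorithm that splits the input into equal parts by position offset; short reads are processed one by one by wrapping them in an RDD dataset; and the reads are aligned by passing the alignment algorithm to the RDD's Map function. The model makes a serial alignment algorithm scalable on a distributed cluster and markedly reduces computation time, and its output is compatible with downstream bioinformatics analysis. Experiments show that the model is stable and scalable and achieves an excellent speedup on a Spark cluster.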
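The partition-then-map pattern described in this abstract can be sketched without Spark. The block below is only an illustration under stated assumptions: `partition_by_offset`, `align_read`, and `distributed_align` are hypothetical names, a thread pool stands in for the cluster, and exact substring search stands in for the paper's (unspecified) serial alignment algorithm.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_by_offset(n_items, n_parts):
    """Split range(n_items) into n_parts contiguous, non-overlapping
    [start, end) slices whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_parts)
    bounds, start = [], 0
    for i in range(n_parts):
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def align_read(read, reference):
    """Hypothetical stand-in for a serial aligner: returns the first
    exact-match position of the read in the reference, or -1."""
    return reference.find(read)


def align_partition(reads, reference):
    """Process one partition read by read (the role of the Map function)."""
    return [(read, align_read(read, reference)) for read in reads]


def distributed_align(reads, reference, n_parts=4):
    """Partition the reads by offset, align each partition in parallel,
    and concatenate the results in the original read order."""
    bounds = partition_by_offset(len(reads), n_parts)
    chunks = [reads[s:e] for s, e in bounds]
    # Threads only imitate cluster workers here; this is an assumption.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        results = pool.map(lambda c: align_partition(c, reference), chunks)
    return [hit for part in results for hit in part]
```

Because each partition is independent, the serial `align_read` function needs no changes to run in parallel, which is the property the abstract relies on.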

4.
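Application of an adaptive ant colony algorithm to sequence alignment   Total citations: 11 (self-citations: 2, citations by others: 9)
Liang Dong, Huo Hongwei. 《计算机仿真》 (Computer Simulation), 2005, 22(1): 100-102, 106
Sequence alignment is an important research tool in bioinformatics. The ant colony algorithm is a relatively new simulated evolutionary algorithm that has been applied successfully to combinatorial optimization problems such as the traveling salesman problem (TSP). This paper applies the ant colony algorithm to sequence alignment and proposes an improved algorithm based on adaptive pheromone adjustment. Simulation results show that the new alignment algorithm is effective and that the improved version performs better still.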

5.
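Sequence alignment is an important research direction in bioinformatics. It determines the similarity between two or more sequences, from which homology can be judged and the evolutionary relationships among the sequences inferred. The heuristic alignment algorithm BLAST is widely used in practice. One of its parameters, the seed, is the key to controlling alignment speed and sensitivity. The seed length, however, is a fixed value chosen from experience, and this empirical value does not suit alignment problems for sequences of every length. Heuristic alignment of two sequences of different lengths therefore requires a seed of appropriate length to be fast and efficient. This paper uses a probabilistic approach to analyze the seed length for sequences of different lengths and, on that basis, computes the alignment sensitivity of a seed of a given length. Theoretical derivation and experimental analysis show that the computed seed length for a given sensitivity is feasible and effective. This supports the optimization of fast heuristic alignment at high sensitivity (nearly equal to that of dynamic programming).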
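The abstract does not give its derivation, so the following is only a minimal sketch of a standard textbook model of the speed and sensitivity trade-off, not the paper's method. It assumes a contiguous seed of length `k`, independent positions, a uniform 4-letter alphabet for chance hits, and a homologous region in which each position matches with probability `identity`. Both function names are hypothetical.

```python
def expected_random_hits(m, n, k, alphabet=4):
    """Expected number of chance seed hits between two random sequences
    of lengths m and n: one per start-position pair, each with
    probability alphabet**-k (uniform, independent letters assumed)."""
    return (m - k + 1) * (n - k + 1) / alphabet ** k


def seed_sensitivity(region_len, identity, k):
    """Probability that a region of region_len independent positions,
    each matching with probability `identity`, contains at least one
    run of k consecutive matches (i.e. the seed fires)."""
    # state[j]: probability that the current match run has length j
    # and no run of length k has occurred so far.
    state = [1.0] + [0.0] * (k - 1)
    for _ in range(region_len):
        nxt = [0.0] * k
        nxt[0] = sum(state) * (1.0 - identity)   # mismatch resets the run
        for j in range(k - 1):
            nxt[j + 1] = state[j] * identity     # match extends the run
        # A match from state[k - 1] completes a seed hit and leaves the table.
        state = nxt
    return 1.0 - sum(state)
```

Under this model a longer seed lowers `expected_random_hits` (less work) but also lowers `seed_sensitivity`, which is why a single fixed seed length cannot suit every sequence length.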

6.
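Implementation and integration of biological sequence alignment algorithms   Total citations: 1 (self-citations: 1, citations by others: 0)
To make the results of different alignment algorithms easy to compare side by side and so identify the best alignment mode, the most classical and most commonly used algorithms in sequence alignment were studied in depth, including classical dynamic programming, the semi-empirical BLAST and FASTA, the heuristic CLUSTALW, and the similar-group method. Each method was implemented as a program, and an integrated sequence alignment platform was developed that gives researchers an effective tool for sequence alignment and its evaluation.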

7.
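Multiple sequence alignment is an important part of bioinformatics research and the basis for problems such as species evolution and genome sequence analysis. Multiple sequence alignment algorithms are highly specialized, and different algorithms suit different research settings. Commonly used multiple sequence alignment software is assembled from several sub-algorithms under the guidance of bioinformatics theory, whereas existing research mainly optimizes particular steps of particular algorithms and lacks highly abstract, domain-level work on algorithm frameworks; as a result, multiple sequence alignment algorithms are complicated and highly redundant. Following the ideas of generative programming and software reuse, this paper analyzes the features of the multiple sequence alignment algorithm family MSAA, designs corresponding generic algorithm components and characterizes the interactions among them, and formally builds an MSAA component library on the PAR platform. This improves the reliability and flexibility of algorithm assembly and makes maintenance and optimization easier for researchers.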

8.
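Sequential pattern discovery plays an increasingly important role in many scientific and commercial fields, yet little attention has been paid to efficient parallel versions of projection-tree-based algorithms. This paper first introduces the basic concepts of frequent sequence mining, then proposes a parallel sequence mining algorithm for distributed memory based on the projection-tree algorithm, and analyzes the performance of the algorithm in detail.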

9.
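GPU-accelerated biological sequence alignment   Total citations: 1 (self-citations: 1, citations by others: 0)
To align biological sequences accurately and efficiently, a GPU-accelerated Smith-Waterman algorithm is proposed. The algorithm uses a diamond-shaped data layout to exploit the GPU's parallel processing capability more fully, processes the query string in batches to support alignment of sequences on the scale of hundreds of megabytes, and introduces a tree-based algorithm to optimize computation of the maximum match value. The algorithm was implemented on an NVIDIA GeForce GTX285 graphics card, and alignment experiments were run on several sets of biological sequences of different sizes. The results show a performance gain of up to more than 120 times over the serial CPU algorithm.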
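For reference, a serial Smith-Waterman score computation can be sketched as follows. This is not the paper's GPU implementation: the scoring values (`match=2`, `mismatch=-1`, linear `gap=-1`) are assumptions, and the function name is hypothetical. The table is filled one anti-diagonal at a time to make visible the property that GPU versions typically exploit: cells with the same `i + j` depend only on earlier anti-diagonals, so they could be computed in parallel.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b
    under a linear gap penalty (scoring values are illustrative)."""
    m, n = len(a), len(b)
    # H[i][j]: best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    # Visit cells by anti-diagonal d = i + j; all cells on one
    # anti-diagonal are mutually independent.
    for d in range(2, m + n + 1):
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            best = max(best, H[i][j])
    return best
```

The running `best` here is a sequential reduction; the tree-based maximum mentioned in the abstract would replace it with a parallel reduction over the cells.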

10.
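Biological sequence alignment is an important topic in bioinformatics, and the soundness and correctness of alignment results determine the correctness of research built on them. Using parallel computing to exploit the available computing power fully, while preserving correctness, is important for improving alignment efficiency. For global alignment of two sequences, a parallelization scheme for pairwise alignment based on the ant colony algorithm is proposed. Parallelization methods based on the shared-memory model are given for the two most time-consuming steps, searching for alignment paths and updating pheromones. OpenMP experiments on Tianhe-2 show that with 8 threads the speedup reaches 5.03, and that performance improves as the sequences get longer.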

11.
In this paper, we consider a distributed convex optimization problem in multi‐agent systems where the objective function is the average of the individual objective functions. We propose a novel Newton Consensus method as a distributed algorithm to address the problem. This method utilises the efficient finite‐time average consensus method as an information fusion tool to construct the exact global Newton direction. Under suitable assumptions, this strategy can be regarded as a distributed implementation of the classical Newton method and eventually has a quadratic convergence rate. The numerical simulation and comparison experiment show the superiority of the algorithm in convergence speed and performance.
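The idea in this abstract can be illustrated with a scalar toy problem. The sketch below is not the paper's algorithm: the local objectives f_i(x) = exp(x) - c_i x are a hypothetical choice, and a plain arithmetic mean stands in for the finite-time average consensus step, which is assumed to deliver the exact network average to every agent.

```python
import math


def make_agent(c):
    """Hypothetical local objective f_i(x) = exp(x) - c*x, returned as
    a pair of callables (gradient, Hessian)."""
    return (lambda x: math.exp(x) - c, lambda x: math.exp(x))


def average(values):
    """Stand-in for finite-time average consensus: assumed to give
    every agent the exact mean of the agents' local values."""
    return sum(values) / len(values)


def newton_consensus(agents, x0, steps=10):
    """Run Newton iterations on the average objective, using only
    averaged local gradients and Hessians."""
    x = x0
    for _ in range(steps):
        g = average([grad(x) for grad, _ in agents])  # global gradient
        h = average([hess(x) for _, hess in agents])  # global Hessian
        x -= g / h                                    # Newton step
    return x
```

With c_i in (1, 2, 3) the average objective is exp(x) - 2x, whose minimizer is ln 2; because the averaged gradient and Hessian are exact, the iteration coincides with the centralized Newton method on that function.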

12.
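With the rapid growth of the Internet, the volume of data to be processed keeps increasing, and traditional single-machine text clustering algorithms cannot meet the demands of massive data processing in Internet data mining. To address the inability of the traditional LDA algorithm to analyze large corpora on a single machine, a method is proposed for building a parallel LDA topic model on the MapReduce computing framework using Gibbs sampling. The parallel implementation of the LDA topic model on MapReduce is studied and the performance of the parallel program is examined. Experiments comparing Hadoop parallel computation with single-machine computation show that the method considerably increases running speed on large corpora and that the speedup also behaves well as the number of cluster nodes grows. Implementing LDA in parallel on the Hadoop platform is feasible and solves the problem that a single machine cannot analyze the latent topic information in a large corpus.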

13.
In this article we have undertaken a qualitative and quantitative comparison of common approaches used to develop distributed solutions in Java: RMI and Web services for regular unsecured communication, RMI‐SSL and WS‐Security for secure communication and authentication, and HTTP‐to‐port and HTTP‐to‐CGI/servlet tunnelling for RMI communication through firewalls and proxies. We have performed a functional comparison that helps with the selection of the most appropriate approach. We have also carried out a detailed performance analysis with the identification of major bottlenecks, identification of design and implementation guidelines for distributed applications, and specification of optimizations for distributed middleware. This article contributes to the understanding of different approaches for developing Java distributed applications, provides detailed performance analysis, presents design and implementation guidelines, and identifies the major performance overheads. Copyright © 2006 John Wiley & Sons, Ltd.

14.
15.
This paper describes an implementation and performance evaluation of different deadlock prevention algorithms. A deadlock prevention algorithm ensures that deadlock will never happen. The algorithms for deadlock prevention are proposed and implemented in a locally distributed system. A number of experiments were executed in a distributed system for various lengths of file operation and different numbers of files. The performance of the system and of each algorithm is evaluated and discussed. Some general results are derived for a single-host and a distributed system.

16.
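A longest-queue-first distributed iterative algorithm is proposed. Unlike existing algorithms, it is tailored to the structure of scalable network switch scheduling and gives the scheduler at the highest priority two iterations. The first iteration finds the longest virtual output queue (VOQ) and completes in the time slot before the highest-priority slot, which shortens signal processing time. Simulation results show that, compared with existing algorithms, the proposed algorithm markedly improves delay and throughput under heavy load with a uniform traffic pattern. Its hardware cost is also low, so it achieves a good trade-off between performance and complexity.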

17.
Performance study of distributed Apriori-like frequent itemsets mining   Total citations: 2 (self-citations: 1, citations by others: 1)
In this article, we focus on distributed Apriori-based frequent itemsets mining. We present a new distributed approach which takes into account inherent characteristics of this algorithm. We study the distribution aspect of this algorithm and give a comparison of the proposed approach with a classical Apriori-like distributed algorithm, using both analytical and experimental studies. We find that under a wide range of conditions and datasets, the performance of a distributed Apriori-like algorithm is not related to global strategies of pruning, since the performance of the local Apriori generation is usually characterized by relatively high success rates of candidate sets frequency at low levels which switch to very low rates at some stage, and often drop to zero. This means that the intermediate communication steps and remote support counts computation and collection in classical distributed schemes are computationally inefficient locally, and thus constrain the global performance. Our performance evaluation is done on a large cluster of workstations using the Condor system and its workflow manager DAGMan. The results show that the presented approach greatly enhances the performance and achieves good scalability compared to a typical distributed Apriori-based algorithm.
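The classical distributed scheme that this abstract uses as its baseline can be sketched as follows. This is an illustration of the well-known count-distribution pattern, not the paper's proposed approach: every site counts the same candidates on its own partition, and the local counts are summed at each level. All function names are hypothetical, sites are simulated by a list of partitions, and transactions are assumed to be Python sets.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, candidates):
    """Support counts of the candidates on one site's partition."""
    counts = Counter()
    for t in transactions:
        for c in candidates:
            if c <= t:  # candidate itemset is contained in the transaction
                counts[c] += 1
    return counts


def generate_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent
    (brute-force join and prune, adequate for a small sketch)."""
    items = sorted({i for s in frequent for i in s})
    return {
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    }


def count_distribution_apriori(partitions, min_support):
    """Level-wise Apriori; one global count exchange per level."""
    items = {i for part in partitions for t in part for i in t}
    candidates = {frozenset([i]) for i in items}
    result, k = {}, 1
    while candidates:
        total = Counter()
        for part in partitions:  # each partition stands in for one site
            # Summing local counts models the per-level communication step.
            total.update(local_counts(part, candidates))
        frequent = {c for c in candidates if total[c] >= min_support}
        result.update({c: total[c] for c in frequent})
        k += 1
        candidates = generate_candidates(frequent, k)
    return result
```

The `total.update(...)` call inside the level loop is the intermediate communication step whose cost the abstract argues can constrain global performance.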

18.
This paper is concerned with the design, implementation, and evaluation of algorithms for communication partner identification in mobile agent-based distributed job workflow execution. We first describe a framework for distributed job workflow execution over the Grid: the Mobile Code Collaboration Framework (MCCF). Based on the study of agent communications during a job workflow execution on MCCF, we identify the unnecessary agent communications that degrade the system performance. Then, we design a novel subjob grouping algorithm for preprocessing the job workflow's static specification in MCCF. The obtained information is used in both static and dynamic algorithms to identify partners for agent communication. The mobile agent dynamic location and communication based on this approach is expected to reduce the agent communication overhead by removing unnecessary communication partners during the dynamic job workflow execution. The proofs of the dynamic algorithm's correctness and effectiveness are elaborated. Finally, the algorithms are evaluated through a comparison study using simulated job workflows executed on a prototype implementation of the MCCF in a LAN environment and an emulated WAN setup. The results show the scalability and efficiency of the algorithms as well as the advantages of the dynamic algorithm over the static one.

19.
Scheduling is a key component for performance guarantees in the case of distributed applications running in large scale heterogeneous environments. Another function of the scheduler in such a system is the implementation of resilience mechanisms to cope with possible faults. In this case resilience is best approached using dedicated rescheduling mechanisms. The performance of rescheduling is very important in the context of large scale distributed systems and dynamic behavior. The paper proposes a generic rescheduling algorithm. The algorithm can use a wide variety of scheduling heuristics that can be selected by users in advance, depending on the system’s structure. The rescheduling component is designed as a middleware service that aims to increase the dependability of large scale distributed systems. The system was evaluated in a real-world implementation for a Grid system. The proposed approach supports fault tolerance and offers an improved mechanism for resource management. The evaluation of the proposed rescheduling algorithm was performed using modeling and simulation. We present experimental results confirming the performance and capabilities of the proposed rescheduling algorithm.

20.
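Recommender systems analyze a user's interests from the user's basic information and behavior and provide personalized recommendations, which has made them a research hotspot in recent years. This paper studies a collaborative filtering recommendation algorithm based on the ALS model. The algorithm is implemented on a distributed platform, and experiments show that it is much faster than earlier single-node implementations. To reduce the loss of item attribute information in the latent factors, item similarity is incorporated into the loss function; in addition, an interest forgetting function is applied to the predicted ratings produced by the optimal model. Experimental comparison shows that the optimized algorithm effectively improves the accuracy of the recommender system.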
