首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
郭宁  冯萍  康继昌 《计算机工程》2011,37(23):205-207
传统双序列比对算法使用动态规划进行序列比对的速度慢,且准确性不高。为解决该问题,提出一种基于布尔逻辑的双序列搜索比对算法。根据一条序列中定长的碱基片段搜索2条序列的相似区,对相似区进行比对,包括相似区中碱基的比对以及子序列与另一条序列的比对,并通过并行执行机制实现加速比对。仿真实验结果表明,该算法具有较高的准确性和较好的实时性。  相似文献   

2.
一种多搜索策略的多生物序列比对自适应遗传算法   总被引:1,自引:0,他引:1  
多生物序列比对是用来计算生物序列间相似性的重要工具,本文在引入熵来度量种群多样性的基础上,提出了一种多搜索策略的自适应遗传算法,其交叉和变异概率随着熵的变化进行自动调整,并且综合考虑了利用动态规划算法来设计遗传操作算子.实验结果表明,这个算法具有较强的全局搜索能力和局部搜索能力,并且能有效的克服未成熟收敛问题.  相似文献   

3.
The rigorous alignment of multiple protein sequences becomes impractical even with a modest number of sequences, since computer memory and time requirements increase as the product of the lengths of the sequences. We have devised a strategy to approach such an optimal alignment, which modifies the intensive computer storage and time requirements of dynamic programming. Our algorithm randomly divides a group of unaligned sequences into two subgroups, between which an optimal alignment is then obtained by a Needleman-Wunsch style of algorithm. Our algorithm uses a matrix with dimensions corresponding to the lengths of the two aligned sequence subgroups. The pairwise alignment process is repeated using different random divisions of the whole group into two subgroups. Compared with the rigorous approach of solving the n-dimensional lattice by dynamic programming, our iterative algorithm results in alignments that match or are close to the optimal solution, on a limited set of test problems. We have implemented this algorithm in a computer program that runs on the IBM PC class of machines, together with a user-friendly environment for interactively selecting sequences or groups of sequences to be aligned either simultaneously or progressively.  相似文献   

4.
多序列联配(MSA)是一个NP问题,为了取得一个好的联配结果,常用渐进和迭代两种方法,但渐进方法不能调整早期的错误,迭代方法面临怎样跳出局部最优的问题。该文提出了一种新的求精方法,该方法基于极值遗传算法和挖掘策略。极值遗传算法基于极值组合元素,能够减少搜索空间。易于找到全局最优解。算法实现过程中,首先用挖掘算法挖掘出已知联配中的不良序列块,然后所有的不良序列块用极值遗传算法重新联配。当初始的序列是用渐进算法联配时,新的求精方法能调整早期的一些错误,充分结合渐进和迭代算法的优点。最后算法用来自于数据库BAliBASE中数据进行了验证。  相似文献   

5.
生物序列比对是生物信息学中最基础的研究课题之一.基于动态规划的Needleman-Wunsch双序列比对算法主要采用迭代算法及空位罚分规则对基因序列进行逐一比对,计算二者相似性得分,最后通过回溯分析得出序列之间的最佳比对.虽然该算法可以得到最佳比对结果,但是时间复杂度和空间复杂度较高.首先对原算法进行分析,对计算得分和...  相似文献   

6.
吕青  周欣  李凤莲 《计算机应用研究》2023,40(1):136-140+146
传统的匹配系统采用固定分块的方式处理大规模解剖学本体,普遍存在语义信息的丢失,影响了匹配效果。为此,提出一种动态分块调节机制,将按匹配情况确定的实体不断地重新分配到目标块中,动态地调节各个分块,从而尽可能地保留语义完整性。此外,针对该问题计算复杂度高的特点,引入了紧凑进化算法对子匹配任务中的阈值以及进行实体重新分配的块标志位优化,并设计了一种精英解参与的概率向量更新方式对该算法进行改进。实验在OAEI(ontology alignment evaluation initiative)的Anatomy测试集上进行,验证了所提方法对匹配结果质量的提升。此外,和其他匹配系统的对比也展示了所构建匹配系统的先进性。  相似文献   

7.
基于遗传算法的一种生物序列比对方法   总被引:1,自引:0,他引:1  
敖友云  迟洪钦 《计算机工程与设计》2006,27(19):3647-3648,3651
生物序列比对是对DNA(或RNA,蛋白质)序列,寻找和确定它们的相似部分或稳定区域.二重序列比对问题可采用动态规划方法求得其最优解;多重序列比对问题是一个NP完全的组合优化问题,有待进一步探索与研究.通过合理的编码表示,采用相应的遗传算子,设计了一种求生物序列比对的遗传算法.并对几组DNA序列进行了测试.  相似文献   

8.
随着生物序列数据库中序列数据的激增, 开发兼有高度生物敏感性和高效率的算法显得极为迫切. 通过对生物序列比对问题中Needleman-Wunsch算法和Smith-Waterman算法深入分析, 提出了Smith-Waterman算法的改进算法, 并通过实验验证该算法, 对改进前后的运行性能进行比较分析. 实验证明, 改进后的算法实现了双序列局部最优解个数的优化, 有效降低了生物序列比对算法时间与空间的复杂性, 提高序列比对的得分率和准确率.  相似文献   

9.
We have implemented a parallel version of a dynamic programming biological sequence comparison algorithm to study the potential applicability of using parallel computers for genetic sequence comparisons. Our parallel program is built using C-Linda, a machine-independent parallel programming language, and was tested on both a 10 CPU Sequent Symmetry and a 64 CPU Intel Hypercube. C-Linda implements a shared associative memory model, "tuple space," through which multiple processes can communicate and coordinate control. In our master-worker (MW) parallel implementation, a master process creates several worker processes, extracts a test sequence and multiple library sequences from a database and stores them in tuple space. Each worker reads the test sequence and then repeatedly extracts library strings from tuple space, performs pairwise sequence comparison using a local comparison algorithm to generate a similarity score, and returns the similarity scores to tuple space. The master collects the scores from tuple space and identifies the best match over all library sequences. We also implemented a method of global interworker communication to reduce the total search time by stopping those string comparisons that had no chance of improving on the current best match. Comparisons of the total run time, speedup, and efficiency were made for parallel and sequential versions of a basic MW implementation as well as versions with the global abort threshold.  相似文献   

10.
多序列比对问题是生物信息学研究的重要部分,是解决物种进化关系、基因组序列分析等问题的基础。多序列比对算法具有很高的专用性,不同的算法适用于不同的研究环境。目前常用的多序列比对软件是在生物信息学理论指导下利用多个子算法装配形成的,而现有的研究主要针对特定算法的特定步骤进行优化,缺乏领域层次高抽象性的算法框架研究,致使多序列比对算法较为繁杂且冗余过多。根据产生式编程以及软件复用的思想,分析了多序列比对算法族MSAA的特征,设计了相应的泛型算法构件并刻画了构件间的交互关系,进一步借助PAR平台形式化构建了MSAA构件库,提高了装配算法的可靠性和组装灵活性,便于研究人员的维护和优化。  相似文献   

11.
The multiple alignment of the sequences of DNA and proteins is applicable to various important fields in molecular biology. Although the approach based on Dynamic Programming is well-known for this problem, it requires enormous time and space to obtain the optimal alignment. On the other hand, this problem corresponds to the shortest path problem and the A* algorithm, which can efficiently find the shortest path with an estimator, is usable.

First, this paper directly applies the A* algorithm to multiple sequence alignment problem with more powerful estimator in more than two-dimensional case and discusses the extensions of this approach utilizing an upper bound of the shortest path length and of modification of network structure. The algorithm to provide the upper bound is also proposed in this paper. The basic part of these results was originally shown in Ikeda and Imai [11]. This part is similar to the branch-and-bound techniques implemented in MSA program in Gupta et al. [6]. Our framework is based on the edge length transformation to reduce the problem to the shortest path problem, which is more suitable to generalizations to enumerating suboptimal alignments and parametric analysis as done in Shibuya and Imai [15–17]. By this enhanced A* algorithm, optimal multiple alignments of several long sequences can be computed in practice, which is shown by computational results.

Second, this paper proposes a k-group alignment algorithm for multiple alignment as a practical method for much larger-size problem of, say multiple alignments of 50–100 sequences. A basic part of these results were originally presented in Imai and Ikeda [13]. In existing iterative improvement methods for multiple alignment, the so-called group-to-group two-dimensional dynamic programming has been used, and in this respect our proposal is to extend the ordinary two-group dynamic programming to a k-group alignment programming. This extension is conceptually straightforward, and here our contribution is to demonstrate that the k-group alignment can be implemented so as to run in a reasonable time and space under standard computing environments. This is established by generalizing the above A* search approach. The k-group alignment method can be directly incorporated in existing methods such as iterative improvement algorithms [2, 5] and tree-based (iterative) algorithms [9]. This paper performs computational experiments by applying the k-group method to iterative improvement algorithms, and shows that our approach can find better alignments in reasonable time. For example, through larger-scale computational experiments here, 34 protein sequences with very high homology can be optimally 10-group aligned, and 64 sequences with high homology can be optimally 5-group aligned.  相似文献   


12.
生物分子网络比对是生物信息学中一个重要领域,是研究生物现象和生命机理的有效手段,而自适应匈牙利贪心混合算法(AHGA)是其中一个有效的生物分子网络比对算法。但是生物分子网络数据的规模都比较大,而且由于其拥有生物背景,生物分子网络数据具有一些特殊性。为了能够在可以接受的时间范围内获得大规模生物分子网络的比对结果,使用MPI和统一计算架构(CUDA)对自适应混合算法进行了并行化,在比对中充分考虑生物分子网络的生物学意义,对两种方式进行了对比分析,以寻找更合适生物分子网络的比对方法。  相似文献   

13.
用于生物序列比对的经典动态规划算法是用一个固定的替换矩阵来逐点计算生物序列间的代价。这些方法可用来发现具有最大计分值的比对结果,但实际上,则更加倾向于考虑生物序列中所隐含的结构或功能信息.本文用可变长马尔科夫链方法来发现生物序列中所隐含的结构或功能信息子片断并定义其权值,最后提出一个新的基于结构信息的生物序列比对方法.  相似文献   

14.
Multiple sequence alignment, known as NP-complete problem, is among the most important and challenging tasks in computational biology. For multiple sequence alignment, it is difficult to solve this type of problems directly and always results in exponential complexity. In this paper, we present a novel algorithm of genetic algorithm with ant colony optimization for multiple sequence alignment. The proposed GA-ACO algorithm is to enhance the performance of genetic algorithm (GA) by incorporating local search, ant colony optimization (ACO), for multiple sequence alignment. In the proposed GA-ACO algorithm, genetic algorithm is conducted to provide the diversity of alignments. Thereafter, ant colony optimization is performed to move out of local optima. From simulation results, it is shown that the proposed GA-ACO algorithm has superior performance when compared to other existing algorithms.  相似文献   

15.
基于动态规划和遗传算法的混合算法研究   总被引:3,自引:0,他引:3  
动态规划法和遗传算法是目前在水电站厂内经济运行中广泛应用的两种优化算法,文章提出了一种基于动态规划法和遗传算法的混合优化算法来分别解决大规模机组组合问题中空间最优化和时间最优化的计算机求解问题。避免了遗传算法计算速度缓慢的问题,又避免了动态规划法的“维数灾”问题。最后使用清江隔河岩水电站的4台机组的运行数据进行了仿真研究,并和完全使用动态规划法的结果进行了比较,获得了良好的效果,说明该混合优化算法对于厂内经济运行是一种可行的算法。  相似文献   

16.
Multiple sequence alignment remains one of the most powerful tools for assessing sequence relateness and the identification of structurally and functionally important protein regions. In this work, two new techniques are introduced to increase the sensitivity of dynamic programming and to enable checks for alignment consistency: Profile-preprocessed and secondary structure-induced alignments. Both strategies are based upon the hierarchical dynamic programming technique and can be applied separately or used in combination. Alignments resulting from the strategies are shown in comparison with the multiple alignment methods CLUSTALX and MULTAL for distant sequence sets of the flavoxin and cupredoxin protein families.  相似文献   

17.
We present the first space and time optimal parallel algorithm for the pairwise sequence alignment problem, a fundamental problem in computational biology. This problem can be solved sequentially in O(mn) time and O(m+n) space, where m and n are the lengths of the sequences to be aligned. The fastest known parallel space-optimal algorithm for pairwise sequence alignment takes optimal O(m+n/p) space, but suboptimal O((m+n)/sup 2//p) time, where p is the number of processors. On the other hand, the most space economical time-optimal parallel algorithm takes O(mn/p) time, but O(m+n/p) space. We close this gap by presenting an algorithm that achieves both time and space optimality, i.e. requires only O((m+n)/p) space and O(mn/p) time. We also present an experimental evaluation of the proposed algorithm on an IBM xSeries cluster. Although presented in the context of full sequence alignments, our algorithm is applicable to other alignment problems in computational biology including local alignments and syntenic alignments. It is also a useful addition to the range of techniques available for parallel dynamic programming.  相似文献   

18.
Biological sequence (e.g. DNA sequence) can be treated as strings over some fixed alphabet of characters (a, c, t and g)[1]. Sequence alignment is a procedure of comparing two or more sequences by searching for a series of individual characters that are in the same order in the sequences. Two-sequence alignment, pair-wise alignment, is a way of stacking one sequence above the other and matching characters from the two sequences thaat lie in the same column: identical characters are placed in …  相似文献   

19.
As the cost of genome sequencing continues to drop, comparison of large sequences becomes tantamount to our understanding of evolution and gene function. Rapid genome alignment stands to play a fundamental role in furthering biological understanding. In 2002, a fast algorithm based on statistical estimation called super pairwise alignment (SPA) was developed by Shen et al. The method was proved to be much faster than traditional dynamic programming algorithms, while it suffered small drop in accuracy. In this paper, we propose a new method based on SPA that target analysis of large-scale genomes. The new method, named super genome alignment (SGA), applies Yang-Keiffer coding theory to alignment and results in a grammar-based algorithm. SGA has the same computational complexity as its predecessor SPA, and it can process large-scale genomes. SGA is tested by using numerous pairs of microbial and eukaryotic genomes, which serve as the benchmark to compare it with the competing BLASTZ method. When compared with BLASTZ, the result shows that SGA is significantly faster by at least an order of magnitude (for some genome pairs the differences is as large at two orders of magnitude), and suffers on average only about 1% loss of the similarity of alignment.  相似文献   

20.
A software system has been developed for modern maritime container transportation. This system contains six modules: the demand forecasting, the stowage planning, the shipping line optimization, the slot pricing and allocation, the container distribution, and the contribution analysis. The first three modules will be presented in this paper. The system constructs problem models and uses exponential smoothing, regression analysis, neural network, linear programming, genetic algorithm and sequence alignment methods to solve relevant issues. The reliability and practicality of the software and the algorithms included are confirmed by practical applications or analyzing their estimation accuracy. The system has been tested with the actual transportation and proves to be an effective decision-making tool in maritime transport coordination. The proposed system may help companies significantly increase profit.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号