Found 20 similar documents (search time: 15 ms)
1.
Parametric optimization of sequence alignment  Total citations: 1, self-citations: 0, citations by others: 1
The optimal alignment or the weighted minimum edit distance between two DNA or amino acid sequences for a given set of weights is computed by classical dynamic programming techniques, and is widely used in molecular biology. However, in DNA and amino acid sequences there is considerable disagreement about how to weight matches, mismatches, insertions/deletions (indels or spaces), and gaps. Parametric sequence alignment is the problem of computing the optimal-valued alignment between two sequences as a function of variable weights for matches, mismatches, spaces, and gaps. The goal is to partition the parameter space into regions (which are necessarily convex) such that in each region one alignment is optimal throughout and such that the regions are maximal for this property. In this paper we are primarily concerned with the structure of this convex decomposition, and secondarily with the complexity of computing the decomposition. The most striking results are the following: For the special case where only matches, mismatches, and spaces are counted, and where spaces are counted throughout the alignment, we show that the decomposition is surprisingly simple: all regions are infinite; there are at most n^(2/3) regions; the lines that bound the regions are all of the form =c + (c + 0.5); and the entire decomposition can be found in O(knm) time, where k is the actual number of regions, and n ...
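The dependence of the optimal alignment on the weights can be illustrated with a minimal sketch. It assumes the objective is to maximize matches - alpha * mismatches - beta * spaces, which is one common way to parameterize the special case in which only matches, mismatches, and spaces are counted. The function name `align_counts` and the parameter names `alpha` and `beta` are hypothetical and do not come from the paper; the sketch evaluates single points of the parameter space and does not compute the decomposition itself.

```python
def align_counts(s, t, alpha, beta):
    """Global alignment by dynamic programming.

    Assumed objective (illustrative): maximize
        matches - alpha * mismatches - beta * spaces.
    Returns (score, matches, mismatches, spaces) for one optimal alignment.
    """
    n, m = len(s), len(t)
    # best[i][j] holds (score, matches, mismatches, spaces) for s[:i] vs t[:j].
    best = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = (0.0, 0, 0, 0)
    for i in range(1, n + 1):
        best[i][0] = (-beta * i, 0, 0, i)  # s[:i] aligned against spaces only
    for j in range(1, m + 1):
        best[0][j] = (-beta * j, 0, 0, j)  # t[:j] aligned against spaces only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sc, ma, mi, sp = best[i - 1][j - 1]
            if s[i - 1] == t[j - 1]:
                diag = (sc + 1, ma + 1, mi, sp)
            else:
                diag = (sc - alpha, ma, mi + 1, sp)
            sc, ma, mi, sp = best[i - 1][j]
            up = (sc - beta, ma, mi, sp + 1)    # space inserted in t
            sc, ma, mi, sp = best[i][j - 1]
            left = (sc - beta, ma, mi, sp + 1)  # space inserted in s
            best[i][j] = max(diag, up, left, key=lambda cell: cell[0])
    return best[n][m]


# The same pair of strings yields different optimal alignments
# at different points (alpha, beta) of the parameter space.
cheap_mismatch = align_counts("AC", "AG", alpha=0.5, beta=5.0)
cheap_space = align_counts("AC", "AG", alpha=5.0, beta=0.1)
```

With a cheap mismatch the optimal alignment pairs C with G; with a cheap space it uses two spaces instead, so the two parameter points lie in different regions of the decomposition.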
2.
3.
Zou J Xie HZ Yang SY Chen JJ Ren JX Wei YQ 《Journal of molecular graphics & modelling》2008,27(4):430-438
Pharmacophore modeling, including ligand- and structure-based approaches, has become an important tool in drug discovery. However, the ligand-based method often depends strongly on the training set selection, and the structure-based pharmacophore model is usually created from apo structures or a single protein-ligand complex, which might miss some important information. In this study, a multicomplex-based method is proposed to generate a comprehensive pharmacophore map of cyclin-dependent kinase 2 (CDK2) based on a collection of 124 crystal structures of human CDK2-inhibitor complexes. Our multicomplex-based comprehensive pharmacophore map contains almost all the chemical features important for CDK2-inhibitor interactions. A comparison with previously reported ligand-based pharmacophores reveals that the ligand-based models are just a subset of our comprehensive map. Furthermore, a most-frequent-feature pharmacophore model consisting of the most frequent pharmacophore features was constructed from the statistical frequency information provided by the comprehensive map. Validation of the most-frequent-feature model shows that it not only discriminates successfully between known CDK2 inhibitors and the molecules of a focused inactive dataset, but is also capable of correctly predicting the activities of a wide variety of CDK2 inhibitors in an external active dataset. This investigation provides some new ideas about how to develop a multicomplex-based pharmacophore model that can be used in virtual screening to discover novel potential lead compounds.
4.
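Parallel optimization of ultra-large-scale sequence alignment computation  Total citations: 1, self-citations: 0, citations by others: 1
This work studies ultra-large-scale sequence alignment computation in bioinformatics research and resolves the memory, I/O, and compute-time bottlenecks that the existing e-PCR software package exhibits when processing wheat gene primer amplification alignment tasks. Using basic data and task partitioning, the most critical computation, the alignment of primers against templates, is made massively parallel. The implementation uses an MPI communication framework based on a master-slave communication pattern and is optimized with respect to task reduction, load balancing, fault tolerance, and concurrent execution of multiple jobs. Large-scale parallel computation on the order of a thousand cores was ultimately achieved on a 100-teraflops supercomputer, so that the wheat sequence amplification alignment computation, originally expected to take several years, can be completed within tens of days.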
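The data and task partitioning idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: a thread pool from the Python standard library stands in for MPI ranks, exact substring search stands in for the primer-to-template matching performed by e-PCR, and the names `split`, `match_chunk`, and `run_master` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, chunk_size):
    """Partition the primer list into fixed-size chunks (the task units)."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def match_chunk(primers, templates):
    """Worker: report every exact occurrence of each primer in each template.

    Exact matching is an assumption made for brevity; it is not the
    matching rule used by e-PCR.
    """
    hits = []
    for primer in primers:
        for name, seq in templates.items():
            pos = seq.find(primer)
            while pos != -1:
                hits.append((primer, name, pos))
                pos = seq.find(primer, pos + 1)
    return hits


def run_master(primers, templates, workers=4, chunk_size=2):
    """Master: hand chunks to idle workers and merge their partial results."""
    chunks = split(primers, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda chunk: match_chunk(chunk, templates), chunks)
        return sorted(hit for part in partial for hit in part)


templates = {"t1": "ACGTACGT", "t2": "TTACG"}
hits = run_master(["ACG", "TTA", "GGG"], templates)
```

Because chunks are handed out as workers become free, a slow chunk does not block the others, which is a simple form of the load balancing the abstract mentions.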
5.
New applications of fingerprints of multiple potential 4-point three-dimensional (3D) pharmacophores in combinatorial library design and virtual screening are presented. Preliminary results demonstrating the feasibility of a simulated annealing process for combinatorial reagent selection that concurrently optimizes product diversity in BCUT chemistry space and in terms of unique 4-point pharmacophores are discussed, and the advantage of using a customized chemistry-space derived for the library design is demonstrated. In addition, an extension to the multiple pharmacophore method for structure-based design that uses the shape of the target site as an additional constraint is presented. This development enables the docking process to be quantified in terms of the number and identities of the pharmacophoric hypotheses that can be matched by a compound or a library of compounds. The design of an example combinatorial library based on the Ugi condensation reaction and a serine protease active site is described.
6.
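Using cooperative reception and a hybrid optimization strategy, an interference alignment algorithm is proposed that maximizes the desired received signal power while minimizing leakage interference. First, through cooperative reception among receivers, the receiving side can cooperatively estimate the transmitters' precoding vectors, so the precoding vectors need not be known in advance. Next, exploiting the principle that the statistical properties of the interference alignment problem can be described by the sum of the interference covariance matrices alone, the cooperative reception structure of the algorithm is designed. Finally, through a bargaining process based on cooperative game theory, the fully cooperative scheme that is optimal for all receivers is selected. Simulation results show that, compared with the conventional distributed interference alignment scheme, the interference alignment scheme based on cooperative reception and hybrid optimization offers better feasibility under various transmit power conditions.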
7.
Chiu YH Wu CH Su HY Cheng CJ 《IEEE transactions on pattern analysis and machine intelligence》2007,29(1):28-39
This work proposes a novel approach to translate Chinese to Taiwanese sign language and to synthesize sign videos. An aligned bilingual corpus of Chinese and Taiwanese sign language (TSL) with linguistic and signing information is also presented for sign language translation. A two-pass alignment at the syntax level and the phrase level is developed to obtain the optimal alignment between Chinese sentences and Taiwanese sign sequences. For sign video synthesis, a scoring function is presented to develop motion transition-balanced sign videos with rich combinations of intersign transitions. Finally, the maximum a posteriori (MAP) algorithm is employed for sign video synthesis based on joint optimization of two-pass word alignment and intersign epenthesis generation. Several experiments are conducted in an educational environment to evaluate the performance on the comprehension of sign expression. The proposed approach outperforms IBM Model 2 in sign language translation. Moreover, deaf students perceived sign videos generated by the proposed method to be satisfactory.
8.
Yougang Xiao Xuejun Li Xiaoqing Chen 《Structural and Multidisciplinary Optimization》2008,36(3):319-327
Based on the analyses of the mechanical features of rotary kiln with multi-supports, a general mechanical model for indeterminate kiln with variable bending rigidities, arbitrary supports and complex loads is established. From this model, the equations of the rotational angle and the deformation are deduced, the general matrix and procedure are developed. The correlation between the roller forces and axis deflections of no. 2 rotary kiln is derived. To improve kiln performance by kiln axis alignment, taking roller forces equilibrium and relative axis deflection minimum as the optimization goal, considering the fuzzy constraints of axis alignment, the fuzzy optimization model of kiln axis alignment is set up. The optimization model is solved with the max–min approach. The results show that fuzzy optimization alignment of rotary kiln can make kiln axis as straight as possible and can distribute kiln loads equally.
9.
Ananthula RS Ravikumar M Pramod AB Madala KK Mahmood SK 《Journal of molecular graphics & modelling》2008,27(4):546-557
This paper describes the generation of ligand-based as well as structure-based models and virtual screening of less toxic P-selectin receptor inhibitors. The ligand-based model, a 3D pharmacophore, was generated using 27 quinoline salicylic acid compounds and is used to retrieve the actives of P-selectin. This model contains three hydrogen bond acceptors (HBA), two ring aromatics (RA) and one hydrophobic feature (HY). To remove the toxic hits from the screened molecules, a counter pharmacophore model was generated using inhibitors of dihydroorotate dehydrogenase (DHOD), an important enzyme involved in nucleic acid synthesis, whose inhibition leads to toxic effects. Structure-based models were generated by docking and scoring of inhibitors against the P-selectin receptor, to remove the false positives produced by pharmacophore screening. The combination of these ligand-based and structure-based virtual screening models was used to screen a commercial database containing 538,000 compounds.
10.
《Applied Soft Computing》2008,8(1):55-78
Multiple sequence alignment, known to be an NP-complete problem, is among the most important and challenging tasks in computational biology. Solving this type of problem directly is difficult and results in exponential complexity. In this paper, we present a novel algorithm that combines a genetic algorithm with ant colony optimization for multiple sequence alignment. The proposed GA-ACO algorithm enhances the performance of the genetic algorithm (GA) by incorporating a local search, ant colony optimization (ACO), for multiple sequence alignment. In the proposed GA-ACO algorithm, the genetic algorithm is run to provide diversity among the alignments. Thereafter, ant colony optimization is performed to move out of local optima. Simulation results show that the proposed GA-ACO algorithm has superior performance when compared to other existing algorithms.
11.
Good AC Cheney DL Sitkoff DF Tokarski JS Stouch TR Bassolino DA Krystek SR Li Y Mason JS Perkins TD 《Journal of molecular graphics & modelling》2003,22(1):31-40
An important element of any structure-based virtual screening (SVS) technique is the method used to orient the ligands in the target active site. This has been a somewhat overlooked issue in recent SVS validation studies, with the assumption being made that the performance of an algorithm for a given set of orientation sampling settings will be representative for the general behavior of said technique. Here, we analyze five different SVS targets using a variety of sampling paradigms within the DOCK, GOLD and PROMETHEUS programs over a data set of approximately 10,000 noise compounds, combined with data sets containing multiple active compounds. These sets have been broken down by chemotype, with chemotype hit rate used to provide a measure of enrichment with a potentially improved relevance to real world SVS experiments. The variability in enrichment results produced by different sampling paradigms is illustrated, as is the utility of using pharmacophores to constrain sampling to regions that reflect known structural biology. The difference in results when comparing chemotype with compound hit rates is also highlighted.
12.
As the cost of genome sequencing continues to drop, comparison of large sequences becomes paramount to our understanding of evolution and gene function. Rapid genome alignment stands to play a fundamental role in furthering biological understanding. In 2002, a fast algorithm based on statistical estimation called super pairwise alignment (SPA) was developed by Shen et al. The method was shown to be much faster than traditional dynamic programming algorithms, at the cost of a small drop in accuracy. In this paper, we propose a new method based on SPA that targets the analysis of large-scale genomes. The new method, named super genome alignment (SGA), applies Yang-Keiffer coding theory to alignment and results in a grammar-based algorithm. SGA has the same computational complexity as its predecessor SPA, and it can process large-scale genomes. SGA is tested on numerous pairs of microbial and eukaryotic genomes, which serve as the benchmark for comparing it with the competing BLASTZ method. The results show that SGA is faster than BLASTZ by at least an order of magnitude (for some genome pairs the difference is as large as two orders of magnitude), and suffers on average only about 1% loss in the similarity of the alignment.
13.
Jun Sun Xiaojun Wu Wei Fang Yangrui Ding Haixia Long Webo Xu 《Information Sciences》2012,182(1):93-114
Multiple sequence alignment (MSA) is an NP-complete and important problem in bioinformatics. For MSA, Hidden Markov Models (HMMs) are known to be powerful tools. However, the training of HMMs is computationally hard, so metaheuristic methods such as simulated annealing (SA), evolutionary algorithms (EAs) and particle swarm optimization (PSO) have been employed to tackle the training problem. In this paper, quantum-behaved particle swarm optimization (QPSO), a variant of PSO, is first analyzed mathematically, and then an improved version is proposed to train the HMMs for MSA. The proposed method, called diversity-maintained QPSO (DMQPSO), is based on the analysis of QPSO and integrates a diversity control strategy into QPSO to enhance the global search ability of the particle swarm. To evaluate the performance of the proposed method, we use DMQPSO, QPSO and other algorithms to train the HMMs for MSA on three benchmark datasets. The experimental results show that the HMMs trained with DMQPSO and QPSO yield better alignments for the benchmark datasets than other commonly used HMM training methods such as Baum–Welch and PSO.
14.
Andrew Koster Marco Schorlemmer Jordi Sabater-Mir 《International journal of human-computer studies》2012,70(6):450-473
In open multi-agent systems trust models are an important tool for agents to achieve effective interactions. However, in these kinds of open systems, the agents do not necessarily use the same, or even similar, trust models, leading to semantic differences between trust evaluations in the different agents. Hence, to successfully use communicated trust evaluations, the agents need to align their trust models. We explicate that currently proposed solutions, such as common ontologies or ontology alignment methods, lead to additional problems and propose a novel approach. We show how the trust alignment can be formed by considering the interactions that agents share and describe a mathematical framework to formulate precisely how the interactions support trust evaluations for both agents. We show how this framework can be used in the alignment process and explain how an alignment should be learned. Finally, we demonstrate this alignment process in practice, using a first-order regression algorithm, to learn an alignment and test it in an example scenario.
15.
Dental biometrics utilizes dental radiographs for human identification. The dental radiographs provide information about teeth, including tooth contours, relative positions of neighboring teeth, and shapes of the dental work (e.g., crowns, fillings, and bridges). The proposed system has two main stages: feature extraction and matching. The feature extraction stage uses anisotropic diffusion to enhance the images and a mixture of Gaussians model to segment the dental work. The matching stage has three sequential steps: tooth-level matching, computation of image distances, and subject identification. In the tooth-level matching step, tooth contours are matched using a shape registration method and the dental work is matched on overlapping areas. The distance between the tooth contours and the distance between the dental works are then combined using posterior probabilities. In the second step, the tooth correspondences between the given query (postmortem) radiograph and the database (antemortem) radiograph are established. A distance based on the corresponding teeth is then used to measure the similarity between the two radiographs. Finally, all the distances between the given postmortem radiographs and the antemortem radiographs that provide candidate identities are combined to establish the identity of the subject associated with the postmortem radiographs.
16.
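HMG-CoA reductase (HMGR) is an important target in the design of lipid-lowering drugs; inhibiting its activity effectively lowers total plasma cholesterol and thereby reduces the incidence of cardiovascular and cerebrovascular disease. Although several statins have been developed as HMGR inhibitors and are in clinical use, their safety, particularly under long-term use, has remained a concern, so the design of new, safe HMGR inhibitors is still urgently needed. This thesis uses the protein active-site analysis program Grid to analyze the shape and surface properties of the HMGR substrate-binding cavity. After a detailed analysis of the specific hydrogen-bonding and hydrophobic interactions between various drugs and HMGR, and drawing on molecular docking and 3D-QSAR results, it summarizes a pharmacophore model for HMGR inhibitors and proposes a feasible design scheme for HMGR inhibitors. This provides reliable information for the design of entirely new HMGR inhibitors and the optimization of lead compounds, and suggests feasible approaches to the further modification of HMGR inhibitors.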
17.
We study the computational complexity of the Viterbi alignment and relaxed decoding problems for IBM model 3, focusing on the problem of finding a solution which has significant overlap with an optimal one. That is, an approximate solution is considered good if it looks like some optimal solution with a few mistakes, where mistakes can be wrong values (such as a word aligned incorrectly or a wrong word in decoding), as well as insertions and deletions (spurious/missing words in decoding). In this setting, we show that it is computationally hard to find a solution which is correct on more than half (plus an inverse polynomial fraction) of the words. More precisely, if there is a polynomial-time algorithm computing an alignment for IBM model 3 which agrees with some Viterbi alignment on \(l/2 + l^{\epsilon}\) words, where \(l\) is the length of the English sentence, or producing a decoding with \(l/2 + l^{\epsilon}\) correct words, then P = NP. We also present a similar structure inapproximability result for phrase-based alignment. As these strong lower bounds are for the general definitions of the Viterbi alignment and decoding problems, we also consider, from a parameterized complexity perspective, which properties of the input make these problems intractable. As a first step in this direction, we show that Viterbi alignment has a fixed-parameter tractable algorithm with respect to limiting the range of words in the target sentence to which a source word can be aligned. We note that by comparison, limiting maximal fertility, even to three, does not affect NP-hardness of the result.
18.
H-BloX is a web-based JavaScript application that allows the calculation and visualization of Shannon information content or relative entropy (Kullback-Leibler 'distance') within sequence alignment blocks. The application was designed for use in both teaching and research. Amino acid sequences, nucleic acid sequences, or any other type of aligned chemical structures may serve as the input. Various interpretations of the meaning of 'entropy' or 'information content' are possible, including treatment as a chemical diversity measure or the degree of feature conservation. For analysis of numerical data by H-BloX, values must be converted to a user-defined character alphabet before computation of entropy or information content. H-BloX was successfully applied to feature identification in Escherichia coli signal peptides and their cleavage sites. Characteristic known features became visible, e.g., the hydrophobic core region and the well-known '-3,-1' cleavage site pattern. Based on the H-BloX analysis, the hydrophobic core is centered at amino acid residue position 13, counting from the N-terminal end of the protein precursor sequence. This result was obtained by using a built-in feature of H-BloX that enables conversion of amino acid sequences to a different alphabet that is based on hydrophobicity assignments. H-BloX can be accessed online or downloaded as HTML/JavaScript at http://bopwww.biologie.uni-freiburg.de/~bioinfo/HBloX/html/index.html.
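The two per-column quantities named in this abstract, Shannon entropy and relative entropy, can be sketched in a few lines. This is not the H-BloX code, which is JavaScript. The function names, the base-2 logarithm, and the uniform background distribution are assumptions made for illustration, and a gap character would simply be counted as one more symbol.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def column_relative_entropy(column, background):
    """Kullback-Leibler 'distance' (in bits) of a column from a background.

    `background` maps each symbol to its assumed background probability.
    """
    counts = Counter(column)
    total = len(column)
    return sum(
        (c / total) * math.log2((c / total) / background[symbol])
        for symbol, c in counts.items()
    )


def block_entropies(block):
    """Entropy of every column in a block of equal-length aligned sequences."""
    return [column_entropy(column) for column in zip(*block)]


uniform_dna = {base: 0.25 for base in "ACGT"}  # assumed background
block = ["ACA", "ACC", "AGG", "ATT"]
entropies = block_entropies(block)
conserved_kl = column_relative_entropy("AAAA", uniform_dna)
```

A fully conserved column has entropy 0 and the largest relative entropy against a uniform background, while a column in which all four bases are equally frequent has entropy 2 bits. This matches the reading of these values as a measure of feature conservation.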
19.
Cabañas-Molero P. Cortina-Parajón Raquel Combarro E. F. Alonso Pedro Bris-Peñalver F. J. 《The Journal of supercomputing》2019,75(3):1001-1013
This paper presents a real-time audio-to-score alignment system for musical applications. The aim of these systems is to synchronize a live musical performance with...
20.
Three-dimensional models, or pharmacophores, describing Euclidean constraints on the location on small molecules of functional
groups (like hydrophobic groups, hydrogen acceptors and donors, etc.), are often used in drug design to describe the medicinal
activity of potential drugs (or ‘ligands’). This medicinal activity is produced by interaction of the functional groups on
the ligand with a binding site on a target protein. In identifying structure-activity relations of this kind there are three
principal issues: (1) It is often difficult to “align” the ligands in order to identify common structural properties that
may be responsible for activity; (2) Ligands in solution can adopt different shapes (or ‘conformations’) arising from torsional
rotations about bonds. The 3-D molecular substructure is typically sought on one or more low-energy conformers; and (3) Pharmacophore
models must, ideally, predict medicinal activity on some quantitative scale. It has been shown that the logical representation
adopted by Inductive Logic Programming (ILP) naturally resolves many of the difficulties associated with the alignment and
multi-conformation issues. However, the predictions of models constructed by ILP have hitherto only been nominal, predicting
medicinal activity to be present or absent. In this paper, we investigate the construction of two kinds of quantitative pharmacophoric
models with ILP: (a) Models that predict the probability that a ligand is “active”; and (b) Models that predict the actual
medicinal activity of a ligand. Quantitative predictions are obtained by utilising the following statistical procedures
as background knowledge: logistic regression and naive Bayes, for probability prediction; linear and kernel regression, for
activity prediction. The multi-conformation issue and, more generally, the relational representation used by ILP results in
some special difficulties in the use of any statistical procedure. We present the principal issues and some solutions. Specifically,
using data on the inhibition of the protease Thermolysin, we demonstrate that it is possible for an ILP program to construct
good quantitative structure-activity models. We also comment on the relationship of this work to other recent developments
in statistical relational learning.
Editors: Tamás Horváth and Akihiro Yamamoto