首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 406 毫秒
1.
To utilize fully all available information in protein structure prediction, including both backbone and side-chain structures, we present a novel algorithm for solving a generalized threading problem. In this problem we consider simultaneous backbone threading and side-chain packing during the process of a protein structure prediction. For a given query protein sequence and a template structure, our goal is to find a threading alignment between the query sequence and the template structure, along with a rotamer assignment for each side-chain of the query protein, which optimizes an energy function that combines a backbone threading energy and a side-chain packing energy. This highly computationally challenging problem is solved through first formulating this problem as a graph-based optimization problem. Various graph-theoretic techniques are employed to achieve the computational efficiency to make our algorithm practically useful, which takes advantage of a number of special properties of the graph representing this generalized threading problem. The overall framework of our algorithm is a dynamic programming algorithm implemented on an optimal tree decomposition of the graph representation of our problem. By using various additional heuristic techniques such as dead-end elimination, we have demonstrated that our algorithm can solve a generalized threading problem within a practically acceptable amount of time and space, the first of its kind.  相似文献   

2.
To fully utilize all available information in protein structure prediction, including both backbone and side-chain structures, we present a novel algorithm for solving a generalized threading problem. In this problem, we consider simultaneously backbone threading and side-chain packing during the process of a protein structure prediction. For a given query protein sequence and a template structure, our goal is to find a threading alignment between the query sequence and the template structure, along with a rotamer assignment for each side-chain of the query protein, which optimizes an energy function that combines a backbone threading energy and a side-chain packing energy. This highly computationally challenging problem is solved through first formulating this problem as a graph-based optimization problem. Various graph-theoretic techniques are employed to achieve the computational efficiency to make our algorithm practically useful, which takes advantage of a number of special properties of the graph representing this generalized threading problem. The overall framework of our algorithm is a dynamic programming algorithm implemented on an optimal tree decomposition of the graph representation of our problem. By using various additional heuristic techniques such as the dead-end elimination, we have demonstrated that our algorithm can solve a generalized threading problem within practically acceptable amount of time and space, the first of its kind.  相似文献   

3.
J. Yadgari  A. Amir  R. Unger 《Constraints》2001,6(2-3):271-292
The biological function of proteins is dependent, to a large extent, on their native three dimensional conformation. Thus, it is important to know the structure of as many proteins as possible. Since experimental methods for structure determination are very tedious, there is a significant effort to calculate the structure of a protein from its linear sequence. Direct methods of calculating structure from sequence are not available yet. Thus, an indirect approach to predict the conformation of protein, called threading, is discussed. In this approach, known structures are used as constraints, to restrict the search for the native conformation. Threading requires finding good alignments between a sequence and a structure, which is a major computational challenge and a practical bottleneck in applying threading procedures. The Genetic Algorithm paradigm, an efficient search method that is based on evolutionary ideas, is used to perform sequence to structure alignments. A proper representation is discussed in which genetic operators can be effectively implemented. The algorithm performance is tested for a set of six sequence/structure pairs. The effects of changing operators and parameters are explored and analyzed.  相似文献   

4.
5.
The determination of a protein's structure from the knowledge of its linear chain is one of the important problems that remains as a bottleneck in interpreting the rapidly increasing repository of genetic sequence data. One approach to this problem that has shown promise and given a measure of success is threading. In this approach contact energies between different amino acids are first determined by statistical methods applied to known structures. These contact energies are then applied to a sequence whose structure is to be determined by threading it through various known structures and determining the total threading energy for each candidate structure. That structure that yields the lowest total energy is then considered the leading candidate among all the structures tested. Additional information is often needed in order to support the results of threading studies, as it is well known in the field that the contact potentials used are not sufficiently sensitive to allow definitive conclusions. Here, we investigate the hypothesis that the environment of an amino acid residue realized as all those residues not local to it on the chain but sufficiently close spatially can supply information predictive of the type of that residue that is not adequately reflected in the individual contact energies. We present evidence that confirms this hypothesis and suggests a high order cooperativity between the residues that surround a given residue and how they interact with it. We suggest a possible application to threading.  相似文献   

6.
In this paper, we study the protein threading problem, which was proposed for predicting a folded 3D protein structure from an amino acid sequence. Since this problem was already proved to be NP-hard, we study polynomial time approximation algorithms. We show several hardness results for the approximation, which includes a MAX SNP-hardness result. We also show approximation algorithms for a special case and a general case, where a graph representing interactions between amino acid residues is restricted to be planar in a special case. For this special case, we obtain a constant approximation ratio.  相似文献   

7.
《Computers & chemistry》1998,21(5):369-375
Six protein pairs, all with known 3D-structures, were used to evaluate different protein structure prediction tools. Firstly, alignments between a target sequence and a template sequence or structure were obtained by sequence alignment with QUANTA or by threading with THREADER, 123D and PHD Topits. Secondly, protein structure models were generated using MODELLER. The two protein structure assessment tools used were the root mean square deviation (RMSD) compared with the experimental target structure and the total 3D profile score. Also the accuracy of the active sites of models built in the absence and presence of ligands was investigated. Our study confirms that threading methods are able to yield more accurate models than comparative modelling in cases of low sequence identity (<30%). However, a gap of 2 Å(RMSD) exists between the theoretically best model and the models obtained by threading methods. For high sequence identities (>30%) comparative modelling using MODELLER resulted in accurate models. Furthermore, the total 3D profile score was not always able to distinguish correct from incorrect folds when different alignment methods were used. Finally, we found it to be important to include possible ligands in the model-building process in order to prevent unrealistic filling of active site areas.  相似文献   

8.
In this paper, we study a registration problem that is motivated by a practical biology problem – fitting protein structures to low‐re solution density maps. We consider registration between two sets of lines features (e.g., helices in the proteins) that have undergone not a single, but multiple isometric transformations (e.g., hinge‐motions). The problem is further complicated by the presence of symmetry in each set. We formulate the problem as a clique‐finding problem in a product graph, and propose a heuristic solution that includes a fast clique‐finding algorithm unique to the structure of this graph. When tested on a suite of real protein structures, the algorithm achieved high accuracy even for very large inputs containing hundreds of helices.  相似文献   

9.
A new method is presented for extracting statistical potentials dependent on the relative side chain and backbone orientations in proteins. Coarse-grained, anisotropic potentials are constructed for short-, medium-, and long-range interactions using the Boltzmann method and a database of non-homologous protein structures. The new orientation-dependent potentials are analyzed using a spherical harmonics decomposition method with real eigenfunctions. This method permits a more realistic, continuous angular representation of the coarse-grained potentials. Results of tests for discriminating the native protein conformations from large sets of decoy proteins, show that the new continuous distance- and orientation-dependent potentials present significantly improved performance. Novel graphical representations are developed and used to depict the orientational dependence of the interaction potentials. These new continuous anisotropic statistical potentials could be instrumental in developing new computational methods for structure prediction, threading and coarse-grained simulations.  相似文献   

10.
《Information Fusion》2009,10(3):217-232
Protein secondary structure prediction is still a challenging problem at today. Even if a number of prediction methods have been presented in the literature, the various prediction tools that are available on-line produce results whose quality is not always fully satisfactory. Therefore, a user has to know which predictor to use for a given protein to be analyzed. In this paper, we propose a server implementing a method to improve the accuracy in protein secondary structure prediction. The method is based on integrating the prediction results computed by some available on-line prediction tools to obtain a combined prediction of higher quality. Given an input protein p whose secondary structure has to be predicted, and a group of proteins F, whose secondary structures are known, the server currently works according to a two phase approach: (i) it selects a set of predictors good at predicting the secondary structure of proteins in F (and, therefore, supposedly, that of p as well), and (ii) it integrates the prediction results delivered for p by the selected team of prediction tools. Therefore, by exploiting our system, the user is relieved of the burden of selecting the most appropriate predictor for the given input protein being, at the same time, assumed that a prediction result at least as good as the best available one will be delivered. The correctness of the resulting prediction is measured referring to EVA accuracy parameters used in several editions of CASP.  相似文献   

11.
In this paper we study the protein structure comparison problem where each protein is modeled as a sequence of 3D points, and a contact edge is placed between every two of these points that are sufficiently close. Given two proteins represented this way, our problem is to find a subset of points from each protein, and a bijective matching of points between these two subsets, with the objective of maximizing either (A) the size of the subsets (the LCP problem), or (B) the number of edges that exist simultaneously in both subsets (the CMO problem), under the requirement that only points within a specified proximity can be matched. It is known that the general CMO problem (without the proximity requirement) is hard to approximate. However, with the proximity requirement, it is known that if a minimum inter-residue distance is imposed on the input, approximate solutions can be efficiently obtained. In this paper we mainly show that the CMO problem under these conditions: (1) is NP-hard, but (2) allows a PTAS. The rest of this paper shows algorithms for the LCP problem which improve on known results.  相似文献   

12.
《Micro, IEEE》2004,24(6):74-82
Memory latency dominates the performance of many applications on modern processors, despite advances in caches and prefetching techniques. Numerous prefetching techniques, both in hardware and software, try to alleviate the memory bottleneck. One such technique, known as helper threading improves single-thread performance on a simultaneous multithreaded architecture (SMT), which shares processor resources, including caches, among logical threads. It uses otherwise idle hardware thread contexts to execute speculative threads on behalf of the main thread. Helper threading accelerates a program by exploiting a processor's multithreading capability to run assist threads. Based on the helper threading usage model, virtual multithreading (VMT), a form of switch-on-event user-level multithreading, can improve performance for real-world workloads with a wall-clock speedup of 5.0 to 38.5 percent  相似文献   

13.
Given the amino-acid sequence of a protein, the prediction of a protein’s tertiary structure is known as the protein folding problem. The protein folding problem in the hydrophobic–hydrophilic lattice model is to find the lowest energy conformation. In order to enhance the performance of predicting protein structure, in this paper we propose an efficient hybrid Taguchi-genetic algorithm that combines genetic algorithm, Taguchi method, and particle swarm optimization (PSO). The GA has the capability of powerful global exploration, while the Taguchi method can exploit the optimum offspring. In addition, we present the PSO inspired by a mutation mechanism in a genetic algorithm. We demonstrate that our algorithm can be applied successfully to the protein folding problem based on the hydrophobic-hydrophilic lattice model. Simulation results indicate that our approach performs very well against existing evolutionary algorithm.  相似文献   

14.
在线程环境设计中存在三种结构不同的线程模型:多对一、一对一和多对多,一直以来,线程模型的特性分析仍然主要位于感性层面,缺乏完整的测试数据验证。FreeBSD5提供了基于三种线程模型的线程环境,为评测不同线程环境的性能提供了条件。论文以FreeBSD5下的测试结果为基础,结合Linux下一对一模型线程库NPTL的测试结果,分析了三种模型的不同特点,指出一对一模型和多对多模型均具有良好的性能,同时,基于SchedulerActivations的多对多模型也有很大的发展空间。  相似文献   

15.
As an important attribute of proteins, protein subcellular location(s) can provide valuable information about their functions. Determining protein subcellular locations using experimental methods are usually expensive and time-consuming. Over the years, a variety of computational approaches have been developed to predict protein subcellular locations based on knowledge of known protein locations. However, the problem is inherently hard, especially for proteins that can exist at multiple subcellular locations. Further studies are still in great need in this area. In this paper, we propose an ensemble learning framework that utilizes a modified Weighted K-Nearest Neighbors (WKNN) as the basic learning algorithm. Two different types of features are considered and extracted from training data, which are based on protein amino acid compositions (Amphiphilic Pseudo Amino Acid Composition, or AmPseAAC) and protein sequence similarities (Protein Similarity Measure, or PSM), respectively. Two individual classifiers are trained separately based on these two types of features and each assigns a probability distribution over different locations to a query protein. Based on the outputs of the two base classifiers, a novel ensemble strategy named Maximized Probability on Label (MPoL) is proposed. The strategy produces a final set of protein locations for each protein by integrating prediction results of the base classifiers through an optimization procedure. To measure the prediction quality of the proposed approach, two different types of evaluation metrics, example-based metrics and label-based metrics, are used. To evaluate the performance of our approach objectively, we compare its results with those predicted by another popular method named iLoc-Animal on a benchmark dataset through cross-validation. Results show that in terms of absolute true success rate on multi-location prediction, MPoL has achieved much better results than iLoc-Animal. It implies that the proposed method has some potential to solve a diverse set of multi-label learning problems.  相似文献   

16.
Whenever evolutionary algorithms are used to solve certain classes of problems such as those that present a huge search space, the incorporation of problem-specific knowledge is required to achieve adequate levels of performance. In this paper, we propose a multi-objective optimization-based procedure that includes such a domain-specific knowledge to cope with a difficult problem, the protein structure prediction (PSP). This problem is considered to be an open problem as there is no recognized “best” procedure to find solutions. It presents a vast search space and the analysis of each protein conformation requires significant amount of computing time. In our procedure, we provide a reduction of the search space by using the dependent rotamer library and include new heuristics to improve a multi-objective approach to PSP based on the PAES algorithm. As it is shown in the paper, by using benchmark proteins from the CASP8 set, this hybrid PSP procedure provides competitive results when it is compared with some of the better proposals appeared up to now.  相似文献   

17.
The prediction of protein function from structure is becoming of growing importance in the age of structural genomics. We have focused on the problem of identifying sites of potential serine protease inhibitor interactions on the surface of proteins of known structure. Given that there is no sequence conservation within canonical loops from different inhibitor families we first compare representative loops to all fragments of equal length among proteins of known structure by calculating main-chain RMS deviation. Fragments with RMS deviation below a certain threshold (hits) are removed if residues have solvent accessibilities appreciably lower than those observed in the search structure. These remaining hits are further filtered to remove those occurring largely within secondary structure elements. Likely functional significance is restricted further by considering only extracellular protein domains. Also a test is performed to see if the loop can dock into the binding site of the serine protease trypsin without unacceptable steric clashes. By comparing different canonical loop structures to the protein structure database we show that the method was able to detect previously known inhibitors. In addition, we discuss potentially new canonical loop structures found in secreted hydrolases, toxins, viral proteins, cytokines and other proteins. We discuss the possible functional significance of several of the examples found.  相似文献   

18.
《Computers & chemistry》2002,26(1):31-39
The prediction of protein function from structure is becoming of growing importance in the age of structural genomics. We have focused on the problem of identifying sites of potential serine protease inhibitor interactions on the surface of proteins of known structure. Given that there is no sequence conservation within canonical loops from different inhibitor families we first compare representative loops to all fragments of equal length among proteins of known structure by calculating main-chain RMS deviation. Fragments with RMS deviation below a certain threshold (hits) are removed if residues have solvent accessibilities appreciably lower than those observed in the search structure. These remaining hits are further filtered to remove those occurring largely within secondary structure elements. Likely functional significance is restricted further by considering only extracellular protein domains. Also a test is performed to see if the loop can dock into the binding site of the serine protease trypsin without unacceptable steric clashes. By comparing different canonical loop structures to the protein structure database we show that the method was able to detect previously known inhibitors. In addition, we discuss potentially new canonical loop structures found in secreted hydrolases, toxins, viral proteins, cytokines and other proteins. We discuss the possible functional significance of several of the examples found.  相似文献   

19.
Carbohydrates have fundamental roles throughout biology, yet they have not been as well studied as proteins and nucleic acids, in part due to limitations in the experimental tools. Improved methods for studying glycans could spur significant advances in the understanding and application of glycobiology. The use of affinity reagents, such as lectins and glycan-binding antibodies, is a valuable complement to methods involving mass spectrometry and chromatography. This article addresses two limitations that have prevented the broader experimental use of glycan-binding proteins: sensitivity and availability. The sensitivity limitation stems from the poor affinity that many glycan-binding proteins have as isolated analytical reagents. To address this problem, I propose making use of multivalent interactions between lectins and glycans, mimicking those frequently found in the biological setting. Recent experiments show that a practical technique for producing lectin multimers can significantly improve detection sensitivity. The second limitation, availability, is the difficulty of finding and obtaining glycan-binding proteins that recognize less common or arbitrarily defined glycan structures. To address this problem, I propose translating the wealth of existing glycan array data into a quantitative, searchable database of the specificities of glycan-binding proteins. Such a resource would allow us to more easily identify proteins with defined specificities and perform detailed comparisons between reagents. Solutions to these two limitations could lead to the more effective use of, and a broader range of, glycan-binding reagents.  相似文献   

20.
The timetabling problem is concerned with the allocation, subject to constraints, of given resources to objects in space and time in such way as to satisfy as nearly as possible a set of desirable objectives. This problem is known to be NP–complete and as such only combinatorial optimization methods can guarantee an optimal timetable. In this paper we propose a sector–based genetic algorithm for solving a university weekly courses timetabling problem. Preliminary experimental results indicate that the algorithm is promising.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号