Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
2.
The protein threading problem is the problem of determining the three-dimensional structure of a given but arbitrary protein sequence from a set of known structures of other proteins. This problem is known to be NP-hard, and current computational approaches to threading are impractical for long proteins and/or large template data sets. In this paper, we propose an evolution strategy for the solution of the protein threading problem. We also propose three parallel methods for fast threading. Our experiments produced encouraging preliminary results in terms of threading energy as well as a significant reduction in threading time.

3.
《Computers & chemistry》1994,18(3):255-258
The identification and characterization of local residue patterns or conserved segments shared by a set of biopolymers has provided a number of insights in molecular biology. Biopolymer sequences are observations from macromolecules that share common structural or functional features. The approach taken here rests on the notion that information may be most efficiently extracted from these observations through the use of a model that faithfully represents macromolecular characteristics. Accordingly, our efforts are focused on statistical models which attempt to capture central features of protein structure, function, and change. Here the assumptions that underlie two new methods for the analysis of protein sequence data are explicitly delineated. (1) Threading of a sequence through structural motifs seeks to determine if a protein sequence fits a known protein structure. The assumptions delineated here also generally apply to other contact-based threading methods that have been recently described. (2) Multiple sequence alignment via the Gibbs sampling algorithm seeks to identify position-specific empirical free energy models for residue sites in common motifs and simultaneously to align the sequence observations from these motifs.

4.
J. Yadgari  A. Amir  R. Unger 《Constraints》2001,6(2-3):271-292
The biological function of proteins depends, to a large extent, on their native three-dimensional conformation. Thus, it is important to know the structure of as many proteins as possible. Since experimental methods for structure determination are very tedious, there is a significant effort to calculate the structure of a protein from its linear sequence. Direct methods of calculating structure from sequence are not yet available. Thus, an indirect approach to predicting the conformation of a protein, called threading, is discussed. In this approach, known structures are used as constraints to restrict the search for the native conformation. Threading requires finding good alignments between a sequence and a structure, which is a major computational challenge and a practical bottleneck in applying threading procedures. The Genetic Algorithm paradigm, an efficient search method that is based on evolutionary ideas, is used to perform sequence-to-structure alignments. A proper representation is discussed in which genetic operators can be effectively implemented. The algorithm's performance is tested on a set of six sequence/structure pairs. The effects of changing operators and parameters are explored and analyzed.

5.
Experience has shown that protein redesigns (using the backbone from a known protein structure) are far more likely to produce well-ordered, native-like structures than are true de novo designs. Therefore, to design a four-helix bundle made of identical short helices, we here proceed by an extensive redesign of the ROP protein. A fully symmetrical SymROP sequence derived from ROP was chosen by modeling ideal-geometry side chains, including hydrogens, while maintaining the "goodness-of-fit" of side-chain packing by calculating all-atom contact surfaces with the Reduce and Probe programs. To estimate the probable extent of backbone movement and side-chain mobility, restrained molecular dynamics simulations were compared for candidate sequences and controls, including substitution of Abu for all or half the core Ala residues. The resulting 17-residue designed sequence is 41% identical to the relevant regions in ROP. SymROP is intended for construction by the Template Assembled Synthetic Proteins approach, to control the bundle topology, to use short helices, and to allow blocked termini and unnatural amino acids. ROP protein has been a valuable system for studying helical protein structure because of its simplicity and regularity within a structure large enough to have a real hydrophobic core. The SymROP design carries that simplicity and regularity even further.

6.
Proteins that belong to the same class, have similar structures and thus perform the same function are known to have different thermal stabilities depending on the source (thermophile or mesophile). The variation in thermostability has not yet been attributed to any unified factor, and understanding this phenomenon is critically needed in several areas, particularly in protein engineering to design stable variants of proteins. To this end, the present study focuses on the sequence and structural investigation of a dataset of 373 pairs of proteins, each pair consisting of a thermophilic protein and its mesophilic structural analog, from the perspectives of hydrophobic free energy, hydrogen bonds, physico-chemical properties of amino acids and residue–residue contacts. Our results showed that the hydrophobic free energy due to carbon, charged nitrogen and charged oxygen atoms was stronger in 65% of thermophilic proteins. The number of hydrogen bonds that bridge the buried and exposed regions of proteins was also greater in thermophiles. Amino acids of extended shape, volume and molecular weight, along with more medium- and long-range contacts, were observed in many of the thermophilic proteins. These results highlight the preference of thermophiles for charged amino acids with larger side chains, which provide greater free energy, better packing of residues and increased overall compactness.

7.
To utilize fully all available information in protein structure prediction, including both backbone and side-chain structures, we present a novel algorithm for solving a generalized threading problem. In this problem we consider simultaneous backbone threading and side-chain packing during the process of a protein structure prediction. For a given query protein sequence and a template structure, our goal is to find a threading alignment between the query sequence and the template structure, along with a rotamer assignment for each side-chain of the query protein, which optimizes an energy function that combines a backbone threading energy and a side-chain packing energy. This highly computationally challenging problem is solved by first formulating it as a graph-based optimization problem. Various graph-theoretic techniques, which take advantage of a number of special properties of the graph representing this generalized threading problem, are employed to achieve the computational efficiency needed to make our algorithm practically useful. The overall framework of our algorithm is a dynamic programming algorithm implemented on an optimal tree decomposition of the graph representation of our problem. By using various additional heuristic techniques such as dead-end elimination, we have demonstrated that our algorithm can solve a generalized threading problem within a practically acceptable amount of time and space, the first of its kind.

8.
To fully utilize all available information in protein structure prediction, including both backbone and side-chain structures, we present a novel algorithm for solving a generalized threading problem. In this problem, we simultaneously consider backbone threading and side-chain packing during the process of a protein structure prediction. For a given query protein sequence and a template structure, our goal is to find a threading alignment between the query sequence and the template structure, along with a rotamer assignment for each side-chain of the query protein, which optimizes an energy function that combines a backbone threading energy and a side-chain packing energy. This highly computationally challenging problem is solved by first formulating it as a graph-based optimization problem. Various graph-theoretic techniques, which take advantage of a number of special properties of the graph representing this generalized threading problem, are employed to achieve the computational efficiency needed to make our algorithm practically useful. The overall framework of our algorithm is a dynamic programming algorithm implemented on an optimal tree decomposition of the graph representation of our problem. By using various additional heuristic techniques such as dead-end elimination, we have demonstrated that our algorithm can solve a generalized threading problem within a practically acceptable amount of time and space, the first of its kind.

9.
Protein Structure from Contact Maps: A Case-Based Reasoning Approach   Cited by: 1 (self-citations: 0, citations by others: 1)
Determining the three-dimensional structure of a protein is an important step in understanding biological function. Despite advances in experimental methods (crystallography and NMR) and protein structure prediction techniques, the gap between the number of known protein sequences and determined structures continues to grow. Approaches to protein structure prediction vary from those that apply physical principles to those that consider known amino acid sequences and previously determined protein structures. In this paper we consider a two-step approach to structure prediction: (1) predict contacts between amino acids using sequence data; (2) predict protein structure using the predicted contact maps. Our focus is on the second step of this approach. In particular, we apply a case-based reasoning framework to determine the alignment of secondary structures based on previous experiences stored in a case base, along with detailed knowledge of the chemical and physical properties of proteins. Case-based reasoning is founded on the premise that similar problems have similar solutions. Our hypothesis is that we can use previously determined structures and their contact maps to predict the structure for novel proteins from their contact maps. The paper presents an overview of contact maps along with the general principles behind our methodology of case-based reasoning. We discuss details of the implementation of our system and present empirical results using contact maps retrieved from the Protein Data Bank. Funding provided by: The Natural Science and Engineering Research Council (Ottawa); Institute for Robotics and Intelligent Systems (Ottawa); Protein Engineering Network Center of Excellence (Edmonton)
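
As a rough illustration of what a contact map is, the Python sketch below builds one from a list of residue coordinates. It is not the system described in the abstract: the function name `contact_map`, the 8 Å cutoff, the use of one representative atom per residue, and the toy coordinates are all assumptions made for the example.

```python
import math

def contact_map(coords, cutoff=8.0):
    """Return a symmetric boolean matrix of residue contacts.

    coords: one (x, y, z) tuple per residue, in angstroms (assumed here to be
            a single representative atom per residue, e.g. C-alpha).
    cutoff: distance threshold in angstroms (8.0 is an assumed, commonly
            used value, not one taken from the abstract).
    The diagonal is left False: a residue is not counted as its own contact.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = True
    return cmap

# Hypothetical four-residue chain laid out on a straight line, 3.8 A apart.
toy_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_map = contact_map(toy_chain)
```

In this toy chain, residues up to two positions apart fall inside the cutoff, while the two end residues (11.4 Å apart) do not.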

10.
Dihydrofolate reductase (DHFR), a key enzyme in tetrahydrofolate-mediated biosynthetic pathways, has a structural motif known to be highly conserved over a wide range of organisms. Given its critical role in purine and amino acid synthesis, DHFR is a well-established therapeutic target for treating a wide range of prokaryotic and eukaryotic infections as well as certain types of cancer. Here we present a structure-based computer analysis of bacterial (Bacilli) and plasmid DHFR evolution. We generated a structure-based sequence alignment using 7 wild-type DHFR x-ray crystal structures obtained from the RCSB Protein Data Bank and 350 chromosomal and plasmid homology models we generated from sequences obtained from the NCBI Protein Database. We used these alignments to compare active site and non-active site conservation in terms of amino acid residues, secondary structure and amino acid residue class. With respect to amino acid sequences and residue classes, active-site positions in both plasmid and chromosomal DHFR are significantly more conserved than non-active site positions. Secondary structure conservation was similar for active site and non-active site positions. Plasmid-encoded DHFR proteins have a greater degree of sequence and residue class conservation, particularly in sequence positions associated with a network of concerted protein motions, than chromosomally encoded DHFR proteins. These structure-based alignments were used to build DHFR-specific phylogenetic trees from which evidence for horizontal gene transfer was identified.

11.
In this paper, a machine learning approach known as the support vector machine (SVM) is employed to predict the distance between an antibody’s interface residues and the antigen in an antigen–antibody complex. The heavy chains, light chains and corresponding antigens of 37 antibodies are extracted from the antibody–antigen complexes in the Protein Data Bank. A number of computational experiments, varying the distance range, sequence patch size and antigen class, are conducted to describe the distance between an antibody’s interface residues and the antigen using antibody sequence information. The high prediction accuracy in both self-consistency and cross-validation tests indicates that the sequence information extracted from antibody structures contributes substantially to predicting the distance between an antibody’s interface residues and the antigen. Furthermore, the antigen class is predicted by SVM from the residue composition information of different distance ranges, which shows some potential significance.

12.
In protein structures, side-chains of asparagine and aspartic acid (Asx) and glutamine and glutamic acid (Glx) can approach their own backbone nitrogen or carbonyl group. We have systematically analyzed intra-residue contacts in Asx and Glx residues and their secondary structure preferences in two different datasets consisting of 500 and 1506 high-resolution structures. Intra-residue contact in an Asx/Glx residue between the heavy atoms of side-chain and main-chain functional groups of the same residue was investigated irrespective of whether such contacts are due to hydrogen bonding or not. Our search yielded 563 and 1462 cases of self-contacting Asx and Glx residues from the two datasets. Two important observations have been made in this analysis. First, self-contacts involving side-chain oxygen and backbone nitrogen atoms in the majority of Asx residues are not due to hydrogen bonds. Second, and surprisingly, the side-chain and backbone carbonyl oxygens of a significant number of Asx and Glx residues approach each other. Over a wide range of accessible surface areas, self-contacting residues are surrounded by fewer polar groups than all other Asx/Glx residues. In buried and partially buried regions, side-chain and main-chain functional groups of these residues together participate in simultaneous interactions with the available polar groups or water molecules. Asx/Glx residues with self-contacts are rarely observed in the middle of an α-helix or a β-strand. Asx/Glx side-chains in contact with their own backbone nitrogen show different capping preferences from those in contact with their backbone oxygen. Examples of proteins with multiple self-contacting Asx/Glx residues are found. We speculate that mutation of a self-contacting residue in the buried or partially buried region of a protein will destabilize the structure.
The results of this analysis will help in engineering protein structures and in site-directed mutagenesis experiments.

13.
In this paper, we study the protein threading problem, which was proposed for predicting a folded 3D protein structure from an amino acid sequence. Since this problem has already been proved NP-hard, we study polynomial-time approximation algorithms. We show several hardness results for the approximation, including a MAX SNP-hardness result. We also show approximation algorithms for a general case and for a special case in which the graph representing interactions between amino acid residues is restricted to be planar. For this special case, we obtain a constant approximation ratio.

14.
《Computers & chemistry》1998,21(5):369-375
Six protein pairs, all with known 3D structures, were used to evaluate different protein structure prediction tools. Firstly, alignments between a target sequence and a template sequence or structure were obtained by sequence alignment with QUANTA or by threading with THREADER, 123D and PHD Topits. Secondly, protein structure models were generated using MODELLER. The two protein structure assessment tools used were the root mean square deviation (RMSD) compared with the experimental target structure and the total 3D profile score. The accuracy of the active sites of models built in the absence and presence of ligands was also investigated. Our study confirms that threading methods are able to yield more accurate models than comparative modelling in cases of low sequence identity (<30%). However, a gap of 2 Å (RMSD) exists between the theoretically best model and the models obtained by threading methods. For high sequence identities (>30%), comparative modelling using MODELLER resulted in accurate models. Furthermore, the total 3D profile score was not always able to distinguish correct from incorrect folds when different alignment methods were used. Finally, we found it important to include possible ligands in the model-building process in order to prevent unrealistic filling of active site areas.
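
The RMSD criterion named in the abstract can be illustrated with a minimal Python sketch. It assumes the model and the experimental structure are already superimposed and list equivalent atoms in the same order; the abstract does not say how superposition or atom selection was done, and the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math

def rmsd(model, reference):
    """Root mean square deviation between two equally long coordinate lists.

    Each list holds (x, y, z) tuples in angstroms for equivalent atoms.
    No optimal superposition is performed here (assumption: the two
    structures were already aligned in a common frame).
    """
    if len(model) != len(reference):
        raise ValueError("coordinate lists must have the same length")
    if not model:
        raise ValueError("coordinate lists must not be empty")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(model, reference))
    return math.sqrt(squared / len(model))

# Hypothetical data: the model is the reference shifted by 2 A along z.
reference_atoms = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_atoms = [(x, y, z + 2.0) for x, y, z in reference_atoms]
deviation = rmsd(model_atoms, reference_atoms)
```

Because every atom in the toy model is displaced by the same 2 Å, the RMSD equals that displacement; in practice a superposition step (for example the Kabsch algorithm) would normally precede the calculation.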

15.
Few structures of membrane proteins are known and their relationships with the membrane are unclear. In a previous report, 20 X-ray structures of transmembrane proteins were analyzed in silico for their orientation in a 36 Å-thick membrane [J. Mol. Graph. Model. 20 (2001) 235]. In this paper, we use the same approach to analyze how the insertion of the X-ray structures varies with the bilayer thickness. The protein structures are kept constant and, at each membrane thickness, the protein is allowed to tilt and rotate in order to be accommodated as well as possible. The conditions are said to be optimal when the energy of insertion is minimal. The results show that most helix bundles require thicker membranes than porin barrels. Moreover, in a few instances, the ideal membrane thickness is unrealistic with respect to natural membranes, suggesting that the X-ray structure requires adaptation to stabilize in the membrane. For instance, the squalene cyclase could adapt by bending the side chains of its ring of lysine and arginine in order to increase the hydrophobic surface in contact with membranes. We analyzed the distribution of amino acids in the water, interface and acyl chain layers of the membrane and compared it with the literature.

16.
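Construction of a proline peptide bond dataset   Cited by: 1 (self-citations: 0, citations by others: 1)
The positions of all cis and trans prolyl peptide bonds were computed and extracted from 2401 peptide chains with resolution < 0.25 nm and sequence identity < 30%, giving 1221 cis and 26401 trans bonds and thereby establishing a relatively large dataset of prolyl peptide bonds. The basic characteristics of this dataset were analyzed statistically: the distribution of the residue on the N-terminal side of the peptide bond, the dihedral angles of that N-terminal residue, the distribution across secondary structures, and the proportion of cis peptide bonds among prolyl peptide bonds. This dataset is important for further study of the structure of cis and trans X-Pro peptide bonds, their relationship with amino acid sequence, and the kinetics of peptide chain folding.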
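
The proportion of cis bonds mentioned among the dataset statistics follows directly from the two counts given in the abstract (1221 cis, 26401 trans). The short Python sketch below only reproduces that arithmetic; the variable names are hypothetical and no other figure from the paper is assumed.

```python
# Counts taken from the abstract.
cis_count = 1221
trans_count = 26401

total_bonds = cis_count + trans_count      # 27622 prolyl peptide bonds in all
cis_fraction = cis_count / total_bonds     # roughly 0.044, i.e. about 4.4%

print(f"cis fraction: {cis_fraction:.2%}")
```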

17.
Oxidative damage to the plasma membrane Ca(2+)-ATPase (PMCA) appears to contribute to the decreased clearance of intracellular Ca(2+) in the neurons of the aged brain, possibly contributing to its vulnerability to numerous age-related diseases such as Alzheimer's disease. The precise sites of oxidative susceptibility have not been identified. However, it is known that calmodulin (CaM) protects the purified PMCA against oxidative inactivation, perhaps via conformational restructuring of the protein through dissociation of a 20-residue domain (C20W) in the C-terminal region that functions as a CaM-binding site. In order to postulate likely oxidation sites and the mechanism underlying the protection offered by CaM, we have generated a three-dimensional model of PMCA via a combination of homology/comparative modeling, threading, protein-protein docking, and guidance from prior biochemical and analytical studies. The resulting model was validated based on surface polarity/hydrophobicity profiling, standard ProCheck, WhatIF, and PROVE checks, as well as comparison with empirical structure-function observations. This model was then used to identify likely oxidation sites by comparing time-averaged solvent accessibility of potentially oxidizable surface residues as measured from molecular dynamics simulations of intact PMCA and the PMCA sequence from which C20W has been deleted. The resulting model complex has permitted us to identify three amino acids whose solvent accessibility is greatly reduced by the C20W dissociation: Tyr 589, Met 622, and Met 831.

18.
F. Barsi  P. Maestrini 《Calcolo》1974,11(2):219-242
The problems of detecting overflow and single or multiple residue digit errors in Redundant Residue Number Systems are considered through a unified approach. It is shown that a single intermodular procedure allows concurrent detection of additive overflow and single residue digit error, even in the case where the error affects a number in overflow. In addition, it is shown that codes of adequate redundancy may allow detection of additive overflow and single bit error, provided that the residue digits are appropriately encoded. The discussion concerns both separate residue codes (i.e., codes referred to as RRNS, where one or more redundant residues are added) and Product Codes.
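
To make the idea of a redundant residue code concrete, here is a small Python sketch of single residue digit error detection. It is a textbook-style illustration, not the intermodular procedure of the paper: the moduli (3, 5, 7 plus the redundant modulus 11), the decoding by the Chinese remainder theorem, and all function names are assumptions chosen for the example, and overflow detection is not covered.

```python
from math import prod

# Assumed moduli: 3, 5, 7 are the non-redundant moduli; 11 is the single
# redundant modulus, chosen larger than every non-redundant one.
MODULI = (3, 5, 7, 11)
LEGITIMATE_RANGE = 3 * 5 * 7   # valid numbers are 0..104

def encode(x):
    """Represent x by its residue digits with respect to MODULI."""
    return tuple(x % m for m in MODULI)

def decode(residues):
    """Reconstruct the integer in 0..prod(MODULI)-1 via the Chinese remainder theorem."""
    total = prod(MODULI)
    value = 0
    for r, m in zip(residues, MODULI):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse of partial modulo m.
        value += r * partial * pow(partial, -1, m)
    return value % total

def is_legitimate(residues):
    """A residue vector is accepted only if it decodes inside the legitimate range."""
    return decode(residues) < LEGITIMATE_RANGE

word = encode(17)                 # (2, 2, 3, 6)
corrupted = (0,) + word[1:]       # first residue digit altered from 2 to 0
```

With these moduli, changing any one residue digit of a legitimate number moves the decoded value out of the range 0..104, so `is_legitimate(corrupted)` is false while `is_legitimate(word)` is true.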

19.
Hfq is an abundant RNA-binding bacterial protein that was first identified in E. coli as a required host factor for phage Qβ RNA replication. The pleiotropic phenotype resulting from the deletion of Hfq indicates the importance of this protein. Two RNA-binding sites have been characterized: the proximal site, which binds sRNA and mRNA, and the distal site, which binds poly(A) tails. Previous studies mainly focused on the key residues in the proximal site of the protein. A recent mutation study in E. coli Hfq showed that a distal residue, Val43, is important for protein function. Interestingly, when we analyzed the sequence and structure of Staphylococcus aureus Hfq using the CONSEQ server, the results indicated that more functional residues were located far from the nucleotide-binding portion (NBP). From the analysis, seven individual residues (Asp9, Leu12, Glu13, Lys16, Gln31, Gly34 and Asp40) were selected to investigate, using molecular dynamics simulations, the conformational changes in the Hfq–RNA complex caused by point mutations of those residues. Results showed a significant effect on Asn28, which is already known to be a highly conserved, functionally important residue. Mutants D9A, E13A and K16A showed effects on base stacking along with an increase in RNA pore diameter, which is required for the threading of RNA through the pore for the post-translational modification. Further, the protein stability analysis by the CUPSAT server showed a destabilizing effect in most mutants. From this study we characterized a series of important residues located far from the NBP and provide some clues that those residues may affect sRNA binding in Hfq.

20.
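Because interactions between different types of amino acids affect protein structure prediction differently, this paper combines convolutional neural network and long short-term memory network models into a convolutional long short-term memory neural network and applies it to the prediction of 8-class protein secondary structure. First, protein sequences are represented using the class information of the amino acid sequence and the evolutionary information of amino acid structure, and convolution is used to extract local correlation features between amino acid residues. A bidirectional long short-term memory network then extracts the long-range interactions between residues within the protein sequence. Finally, the extracted local correlation features and long-range interactions are used to predict 8-class protein secondary structure. Experiments show that, compared with baseline methods, the proposed model improves the accuracy of 8-class secondary structure prediction and scales well.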
