Similar articles
20 similar articles found.
1.
We describe a method based on neural networks for predicting contact maps of proteins using as input chemico-physical and evolutionary information. Neural networks are trained on a data set comprising the contact maps of 200 non-homologous proteins of well resolved three-dimensional structures. The systems learn the association rules between the covalent structure of each protein and its corresponding contact map by means of a standard back propagation algorithm. Validation of the predictor on the training set and on 408 proteins of known structure which are not homologous to those contained in the training set indicates that this method scores higher than statistical approaches previously described and based on correlated mutations and sequence information.
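As an illustration of the quantity being predicted, a contact map can be computed from known coordinates as a binary matrix marking residue pairs that lie within a distance cutoff. The sketch below is not the authors' predictor; the 8 Å cutoff, the one-point-per-residue representation and the toy coordinates are all assumptions made for the example.

```python
import math

def contact_map(coords, cutoff=8.0):
    """Return a symmetric 0/1 matrix; entry [i][j] is 1 when residues i and j
    lie within `cutoff` angstroms of each other (the diagonal is left at 0)."""
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

# Hypothetical coordinates (one representative point per residue), in angstroms.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
```

A map of this kind, derived from a solved structure, would serve as the training target; the network's task is to reproduce it from sequence-derived input alone.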

2.
Any two β-strands belonging to two different β-sheets in a protein structure are considered to pack interactively if each β-strand has at least one residue that undergoes a loss of one tenth or more of its solvent contact surface area upon packing. A data set of protein 3-D structures (determined at 2.5 Å resolution or better), corresponding to 428 protein chains, contains 1986 non-identical pairs of β-strands involved in interactive packing. The inter-axial distance between these is significantly correlated to the weighted sum of the volumes of the interacting residues at the packing interface. This correlation can be used to predict the changes in the inter-sheet distances in equivalent β-sheets in homologous proteins and, therefore, is of value in comparative modelling of proteins.

3.
A method using protein sequence divergence to predict the three-dimensional structure of the transmembrane domain of seven-helix membrane proteins is described. The key component in the multistep procedure is the calculation of a hydrophilic and lipophilic variability index for each amino acid in an alignment of a family of homologous proteins. The variability profile, a plot of the calculated variability index versus alignment position, can be used to predict a tertiary model of the backbone conformation of the transmembrane domain. This method was applied to bacteriorhodopsin (BR) and the model obtained was compared with the known structure of this protein. Using an alignment of the amino acid sequences of BR and closely related (20% identity) proteins, the boundaries of the transmembrane regions, their secondary structures and orientations inside the membrane bilayer were predicted based on the variability profile. Additional information about the shape of the helix bundle was also obtained from the average variability of each transmembrane helix with the assumption that the helices are packed sequentially and form a closed helix bundle. Correct features of the known structure of BR were found in the model structure, suggesting that a similar strategy can be used to predict transmembrane helices and the packing shape of other membrane proteins with seven transmembrane helices, such as the opsins and other G-protein coupled receptors.

4.
The rough energy landscapes and tight packing of protein interiors are two of the critical factors that have prevented the wide application of physics-based models in protein side-chain assignment and in protein structure prediction in general. Complementing the rotamer-based methods, we propose an ab initio method that utilizes molecular mechanics simulations for protein side-chain assignment and refinement. By reducing the side-chain size, a smooth energy landscape was obtained owing to the increased distances between the side chains. The side chains then gradually grow back during molecular dynamics simulations while adjusting to their surroundings, driven by the interaction energies. The method overcomes the barriers due to tight packing that limit conformational sampling of physics-based models. A key feature of this approach is that the resulting structures are free from steric collisions and allow the application of all-atom models in the subsequent refinement. Tests on a small set of proteins showed nearly 100% accuracy on both chi1 and chi2 of buried residues; 94% of them were within 20 degrees of the native conformation, 79% were within 10 degrees and 42% were within 5 degrees. However, the accuracy decreased when exposed side chains were involved. Further improvement and application of the method and the possible reasons that affect the accuracy on the exposed side chains are discussed.

5.
An assessment of the effect of the helix dipole in protein structures
The locations of the cations bound to the peptide group at the C-termini and the anions attached to the main-chain NH group at the N-termini of helices are analysed. The ions are rarely found along the helical axis, where the effect due to the helix macrodipole is likely to be the maximum. The disposition of the ions appears to be controlled more by the stereoelectronic requirements of the ligand group than by any long distance electric field. This and other related structural observations call for some circumspection in assigning a role for the helix dipole in protein structure and function.

6.
A three-dimensional (3-D) model of the transmembrane domain of human rhodopsin was predicted from the sequence divergence analysis of 42 sequences of rhodopsins and visual pigments without a template. The prediction steps include multiple sequence alignment, calculation of a variability profile of the aligned sequences, use of the variability profile to identify the boundaries of transmembrane regions, their secondary structure and packing shape in a helix bundle, prediction of side-chain conformations and structure refinement. The identification of the retinal binding site was assisted by its known covalent linkage with K296. The structural features of the predicted 3-D model are in good agreement with a low resolution electron density map of bovine rhodopsin and with residues in contact with retinal as determined experimentally.

7.
8.
The instabilities of the native structures of mutant proteins with an amino acid exchange are estimated by using the contact energy and the number of contacts for each type of amino acid pair, which were estimated from 18 192 residue–residue contacts observed in 42 crystals of globular proteins. They were then used to evaluate a transition probability matrix of codon substitutions and a log relatedness odds matrix, which is used as a scoring matrix to measure the similarity between protein sequences. To consider amino acid substitutions in homologous proteins, base mutation rates and the effects of the genetic code are also taken into account. The average fitness of an amino acid exchange is approximated to be proportional to the structural stability of the mutant protein, which is then approximated by the average energy change of the protein native structure expected for the amino acid exchange, neglecting the energy change of the denatured state. In global and local homology searches, this scoring matrix tends to yield significantly higher alignment scores than either the unitary matrix or the genetic code matrix, and also may yield higher alignment scores for distantly related protein pairs than MDM78. One of the advantages of this scoring matrix is that the equilibrium frequencies of codons and also base mutation rates can be adjusted.

9.
A relational database of protein structure has been developed to enable rapid and flexible enquiries about the occurrence of many aspects of protein architecture. The coordinates of 294 proteins from the Brookhaven Data Bank have been processed by standard computer programs to generate many additional terms that quantify aspects of protein structure. These terms include solvent accessibility, main-chain and side-chain dihedral angles, and secondary structure. In a relational database, the information is stored in tables with columns holding the different terms and rows holding the different entries for the terms. The different relational base tables store the information about the protein coordinate set, the different chains in the protein, the amino acid residues and ligands, the atomic coordinates, the salt bridges, the hydrogen bonds, the disulphide bridges and the close tertiary contacts. The database was established under the ORACLE management system. Enquiries are constructed in ORACLE using SQL (structured query language), which is simple to use and alleviates the need for extensive computer programs. A single table can be searched for entries that meet various criteria, e.g. all proteins solved to better than a given resolution. The power of the database emerges when several tables, or the entries in a single table, are cross-correlated. For example, the dihedral angles of proline in the fourth position in an α-helix in high resolution structures can be rapidly obtained. The structural database provides a powerful tool to obtain empirical rules about protein conformation.
This database of protein structures is part of a joint project between Birkbeck College and Leeds University to establish an integrated data resource of protein sequences and structures (ISIS) that encodes the complex patterns of residues and coordinates that define protein conformation. The entire data resource (ISIS) will provide a system to guide all areas of protein modelling including structure prediction, site-directed mutagenesis and de novo protein design. The availability of ISIS is described in the paper.
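The kind of cross-table enquiry described above can be sketched with a small in-memory database. This is only an illustration: it uses Python's built-in sqlite3 module rather than ORACLE, and the table names, column names and rows are hypothetical, not the schema of the database in the abstract.

```python
import sqlite3

# In-memory stand-in for the relational tables (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE protein (protein_id TEXT PRIMARY KEY, resolution REAL);
    CREATE TABLE residue (protein_id TEXT, position INTEGER, name TEXT,
                          secondary_structure TEXT, phi REAL, psi REAL);
    INSERT INTO protein VALUES ('demoA', 1.5), ('demoB', 2.8);
    INSERT INTO residue VALUES
        ('demoA', 10, 'PRO', 'H', -60.0, -45.0),
        ('demoA', 11, 'ALA', 'H', -62.0, -41.0),
        ('demoB', 7,  'PRO', 'H', -70.0, -35.0);
""")

def helical_prolines(max_resolution):
    """Dihedral angles of helical prolines in structures solved at
    `max_resolution` angstroms or better, joining the two tables."""
    query = """
        SELECT r.protein_id, r.position, r.phi, r.psi
        FROM residue AS r JOIN protein AS p ON p.protein_id = r.protein_id
        WHERE p.resolution <= ? AND r.name = 'PRO' AND r.secondary_structure = 'H'
        ORDER BY r.protein_id, r.position
    """
    return conn.execute(query, (max_resolution,)).fetchall()

rows = helical_prolines(2.0)
```

The join is what gives the relational layout its value here: the resolution lives in one table and the per-residue angles in another, and a single query combines them without a purpose-written program.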

10.
One of the most difficult problems in predicting the three dimensional structure of proteins is how to deal with the local minimum problem. In many cases of practical interest this problem has been reduced to how to select an appropriate set of starting conformations for carrying out energy minimizations. How these starting conformations are selected, however, is often based on the physical intuition of the person doing the calculations, and hence it is hard to avoid some degree of arbitrariness. To improve this situation, we introduced the simulated annealing Monte Carlo algorithm to locate the optimal starting conformations for energy minimizations. The method developed here is valid for both single and multiple polypeptide chain systems. The annealing process can be conducted with respect to either the internal dihedral angles of a polypeptide chain or the external rotations and translations of various constituent polypeptide chains, and hence is particularly useful for studying the packing arrangements of secondary structures in proteins, such as helix/helix packing, helix/sheet packing and sheet/sheet packing. It was shown via a number of comparative calculations that the final structures obtained through the annealing process not only had lower energies than the corresponding energy-minimized structures reported previously, but also assumed forms closer to those observed in proteins. All these results indicate that a better result can be obtained in the search for low-energy structures of proteins by incorporating the simulated annealing approach. It has been observed during simulated annealing for each of these cases that there is a critical temperature T*, termed the ‘phase transition point’, around which the energy has a remarkable drop and below which the decrease in energy gradually becomes steady and slow.
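The general idea of simulated annealing Monte Carlo can be shown on a toy problem. The sketch below is a generic Metropolis-style annealer, not the authors' protein implementation: the one-dimensional energy function, the geometric cooling schedule and all parameter values are assumptions chosen only to make the example run.

```python
import math
import random

def energy(x):
    # Toy one-dimensional energy with several local minima (an assumption).
    return x * x + 10.0 * math.sin(3.0 * x)

def anneal(x0, t_start=10.0, t_end=0.01, cooling=0.95, steps_per_t=100, seed=0):
    """Return the lowest-energy point seen while cooling from t_start to t_end."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            x_new = x + rng.uniform(-0.5, 0.5)
            e_new = energy(x_new)
            # Metropolis criterion: always accept downhill moves, accept
            # uphill moves with probability exp(-dE / T).
            if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
                x, e = x_new, e_new
                if e < best_e:
                    best_x, best_e = x, e
        t *= cooling  # geometric cooling schedule
    return best_x, best_e

start = 4.0
best_x, best_e = anneal(start)
```

At high temperature the walker crosses barriers between minima freely; as the temperature falls, uphill moves become rare and the search settles into a low-energy basin, which is then a reasonable starting point for conventional energy minimization.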

11.
A 3-D model of a protein can be constructed from its amino acid sequence and the 3-D structures of one or more homologues by annealing three sets of fragments: the structurally conserved regions, structurally variable regions and the side chains. The method encoded in the computer program COMPOSER was assessed by generating 3-D models of eight proteins whose crystal structures are already known and for which 3-D structures of homologues are available. In the structurally conserved regions, differences between modelled and X-ray structures are smaller than the differences between the X-ray structures of the modelled protein and the homologues used to build the model. When several homologues are used, the contributions of the known structures are weighted, preferably by the square of sequence similarity; this is especially important when the similarities of the homologues to the modelled structure differ greatly. The ‘collar’ extension approach, in which a similar region of different length in a homologue is used to extend the framework, can result in a more accurate model. If known homologues comprise more than one related group of proteins and they are both distantly related to the unknown, then alignment of the sequence to be modelled with each group of homologues facilitates identification of structurally conserved regions of the unknown and leads to an improved model. Models have root mean square differences (r.m.s.d.s) with the structures defined by X-ray analysis of between 0.73 and 1.56 Å for all Cα atoms, for seven of the eight models. For the model of mucor pepsin, where the closest homologue has 33% sequence identity and 20% of the residues are in structurally variable regions, the r.m.s.d. for the framework region is 1.71 Å and the r.m.s.d. for all Cα atoms is 3.47 Å.
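Two quantities mentioned above lend themselves to a short sketch: the r.m.s.d. between equivalent atoms, and weights proportional to the square of sequence similarity. This is not COMPOSER code. The function names are hypothetical, the coordinate sets are assumed to be already superposed, and normalizing the weights to sum to 1 is an assumption.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square difference between two equally long lists of
    (x, y, z) points that are assumed to be already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def similarity_weights(similarities):
    """Weights proportional to the square of each homologue's sequence
    similarity, normalized so that they sum to 1."""
    squares = [s * s for s in similarities]
    norm = sum(squares)
    return [sq / norm for sq in squares]

# Hypothetical example: two points, each displaced by 1 angstrom along z.
example_rmsd = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
example_weights = similarity_weights([2.0, 1.0])
```

Squaring the similarity makes the closest homologue dominate the average, which matters most when the available homologues differ widely in how similar they are to the target.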

12.
In this paper we present for seven subtilisin structures a systematic comparison of densely packed side-group clusters (defined as an ensemble of side chains with extensive internal atomic contacts as compared with those made with the surrounding protein environment and measured relative to the maximum possible for each residue type). Spatially consistent clusters are observed at structurally equivalent positions in the proteins, as revealed by careful multiple superpositioning of the respective backbone atoms. The clusters are positioned at strategic loop-connecting sites near the protein surfaces. The residues within consistent clusters displaying extensive association show varying conservation at structurally equivalent alignment sites. Suggestions for residue substitutions, as observed over the seven tertiary structures, were taken from the cluster positions and were shown to be consistent with a number of point mutations in one of the seven structures (savinase) that result in increased thermal stability.

13.
A data bank merging related protein structures and sequences
A data collection which merges protein structural and sequence information is described. Structural superpositions amongst proteins with similar main-chain fold were performed or collected from the literature. Sequences taken from the protein primary structure databases were associated with the multiple structural alignments providing they were at least 50% homologous in residue identity to one of the structural sequences and at least 50% of the structural sequence residues were alignable. Such restrictions allow reasonable confidence that the primary sequences share the conformation of the tertiary structural templates, except in the less conserved loop regions. Multiple structural superpositions were collected for 38 familial groups containing a total of 209 tertiary structures; 45 structures had no superposable mates and were used individually. Other information is also provided, such as main-chain and side-chain conformational angles, secondary structural assignments and the like. Wedding the primary and tertiary structural data resulted in an 8-fold increase of data bank sequence entries over those associated with the known three-dimensional architectures alone.

14.
An empirical relationship between occupancy and the atomic displacement parameter of water molecules in protein crystal structures has been found by comparing a set of well refined sperm whale myoglobin crystal structures. The relationship agrees with a series of independent structural features whose impact on water occupancy can easily be predicted as well as with other known data and is independent of the protein fold. The estimation of the water occupancy in protein crystal structures may help in understanding the physico-chemical properties of the protein–solvent interface and can allow the monitoring of the accuracy of the protein crystal structure refinement.

15.
We present an efficient technique for the comparison of protein structures. The algorithm uses a vector representation of the secondary structure elements and searches for spatial configurations of secondary structure elements in proteins. In such recurring protein folds, the order of the secondary structure elements in the protein chains is disregarded. The method is based on the geometric hashing paradigm and implements approaches originating in computer vision. It represents and matches the secondary structure element vectors in a 3-D translation and rotation invariant manner. The matching of a pair of proteins takes on average under 3 s on a Silicon Graphics Indigo2 workstation, allowing extensive all-against-all comparisons of the data set of non-redundant protein structures. Here we have carried out such a comparison for a data set of over 500 protein molecules. The detection of recurring topological and non-topological, secondary structure element order-independent protein folds may provide further insight into evolution. Moreover, as these recurring folding units are likely to be conformationally favourable, the availability of a data set of such topological motifs can serve as a rich input for threading routines. Below, we describe this rapid technique and the results it has obtained. While some of the obtained matches conserve the order of the secondary structure elements, others are entirely order independent. As an example, we focus on the results obtained for Che Y, a signal transduction protein, and on the profilin–β-actin complex. The Che Y molecule is composed of a five-stranded, parallel β-sheet flanked by five helices. Here we show its similarity with the Escherichia coli elongation factor, with L-arabinose binding protein, with haloalkane dehalogenase and with adenylate kinase. The profilin–β-actin complex contains an antiparallel β-pleated sheet with α-helical termini. Its similarities to lipase, fructose bisphosphatase and β-lactamase are displayed.

16.
A new similarity score (-score) is proposed which is able to find the correct protein structure among the very close alternatives and to distinguish between correct and deliberately misfolded structures. This score is based on the general principle ‘similar likes similar’, and it favors hydrophobic and hydrophilic contacts, and disfavors hydrophobic-to-hydrophilic contacts in proteins. The values of -scores calculated for the high-resolution protein structures from the representative set are compared with those of alternatives: (i) very close alternatives which are only slightly distorted by conformational energy minimization in vacuo; (ii) alternatives with successively growing distortions, generated by molecular dynamics simulations in vacuo; (iii) structures derived by molecular dynamics simulation in solvent at 300 K; (iv) deliberately misfolded protein models. In nearly all tested cases the similarity score can successfully distinguish between the experimental structure and its alternatives, even if the root mean square displacement of all heavy atoms is less than 1 Å. The confidence interval of the similarity score was estimated using the high-resolution X-ray structures of domain pairs related by non-crystallographic symmetry. The similarity score can be used for the evaluation of the general quality of the protein models, choosing the correct structures among the very close alternatives, characterization of models simulating folding/unfolding, etc.

17.
Evolutionarily conserved hydrophobic residues at the core of protein structures are generally assumed to play a structural role in protein folding and stability. Recent studies have suggested that their importance to protein structures is uneven, with a few of them being crucial and the rest being secondary. In this work, we explored the possibility of employing this feature of native structures for discriminating non-native structures from native ones. First, we developed a network tool to quantitatively measure the structural contributions of individual amino acid residues. We systematically applied this method to diverse fold-type sets of native proteins. It was confirmed that this method could capture the essential structural features of native proteins. Next, we applied it to a number of decoy sets of proteins. The results indicate that such an approach indeed identified non-native structures in most test cases. This finding should be of help in investigating the fundamental problem of protein structure prediction.

18.
Detection of internal cavities in globular proteins
We have undertaken a study of internal cavities in five protein structure groups, each containing different crystallographic structure determinations of the same protein, to understand better the nature of packing defects in protein tertiary architectures. Our results show that cavity detection and consistency of detection are highly dependent on probe and cavity size, cavity position within the globular protein and the local ‘quality’ (r.m.s. deviation) of structural consistency within the group. The consistency of solvent placement within cavities has also been examined. We provide guidelines for estimating the likelihood of a given cavity to be an actual packing defect or to be a result of experimental error.

19.
An analysis of the geometry of metal binding by carboxylic and carboxamide groups in proteins is presented. Most of the ligands are from aspartic and glutamic acid side chains. Water molecules bound to carboxylate anions are known to interact with oxygen lone-pairs. However, metal ions are also found to approach the carboxylate group along the C - O direction. More metal ions are found to be along the syn than the anti lone-pair direction. This seems to be the result of the stability of the five-membered ring that is formed by the carboxylate anion hydrogen bonded to a ligand water molecule and the metal ion in the syn position. Ligand residues are usually from the helix, turn or regions with no regular secondary structure. Because of the steric interactions associated with bringing all the ligands around a metal center, a calcium ion can bind only near the ends of a helix; a metal, like zinc, with a low coordination number, can bind anywhere in the helix. Based on the analysis of the positions of water molecules in the metal coordination sphere, the sequence of the EF hand (a calcium-binding structure) is discussed.

20.
Annexin I homology models were built from the annexin V crystal structure. Three methods for side-chain prediction were tested based on molecular mechanics conformational search, the use of a rotamer database, or a combination of these two methods. We showed that rotamer-based methods were more efficient and that molecular mechanics energy minimizations, prior to rotamer selection, did not afford clearly improved predictions. Models built in vacuo and with an implicit solvation term were compared with the annexin I crystal structure, which became available during the course of this study. The analysis of solvation energies, root mean square deviations, χi angles and hydrogen bonds showed that models built with implicit solvation were of better quality. In annexin V, repeat III displays A-B and D-E loop conformations quite different from other repeats. Since the sequence differences suggest that repeat III in annexin I might present a conformation similar to other repeats, two annexin I models with different repeat III conformations were built and compared to determine whether the correct conformation could have been predicted. We show that using a combination of evaluation criteria, it is possible to discriminate unequivocally between the native and the incorrect fold, stressing that no single criterion should be used alone to evaluate protein structures.

