Similar literature
 20 similar documents found
1.
Compensating changes in protein multiple sequence alignments   (total citations: 2; self-citations: 0; citations by others: 2)
A method was developed to identify compensating changes between residues at positions in a multiple sequence alignment. (For example, one position might always contain a positively charged residue when the other is negatively charged and vice versa.) A correlation-based method was used to measure the compensation found in the four residues at a pair of positions in any two sequences in a multiple alignment. All possible sequence pairings were measured at the pair of positions and the resulting matrix analysed to give a measure of cooperativity among the pairs. The basic method was sufficiently flexible to consider a number of amino acid relatedness models based both on scalar and vectorial properties. Pairs of compensating positions were selected by the method and their mean separation (in a protein of known structure) was compared to both the mean pair-wise separation over all residues and the pairwise separation over an equivalent sample of pairs of residues selected on the basis of their conservation alone. The latter is an important control that has been omitted from previous studies. The results indicated that, at best, there was a slight effect (of marginal significance) leading to the selection of closer pairs by the compensation measure when compared to the mean of all pairs. However, this was never as good as the simpler measure based on conservation alone, which always found a significant majority of proteins with a sample mean less than the overall mean.

2.
We have studied the question of how much extra predictive power the correlated mutational behaviour of pairs of amino acid residues separated along a sequence has concerning the likelihood of those residues being in contact in the folded protein. The mutational behaviour is deduced from multiple sequence alignments. Our findings are that there is, indeed, some valuable information available from this source and that it is sufficient to make a significant improvement in our ability to predict contacts, when compared with earlier methods that do not take into account the correlations between the mutations. This improvement is approximately twice as large as can be obtained by the more economical method of simply averaging pair preferences over the same sequence alignment. Even when using a method based on pair preferences, a further significant improvement can be made by penalizing more variable regions (on the reasonable assumption that invariant residues are relatively more likely to be in contact), though we have found no way of improving the pair preference method to the extent that it matches the method based on correlated behaviour. Our new method is thought to be the best data-based method of contact prediction developed so far, achieving, on average, an improvement over a random (i.e. information-free) prediction of a factor of five when the number of contacts predicted is chosen to match the number that actually occur.

3.
A major problem in predicting protein structure by homology modelling is that the sequence alignment from which the model is built may not be the best one in terms of the correct equivalencing of residues assessed by structural or functional criteria. A useful strategy is to generate and examine a number of suboptimal alignments as better alignments can often be found away from the optimal. A procedure to filter rapidly suboptimal alignments based on measurement of core volumes and packing pair potentials is investigated. The approach is benchmarked on three pairs of sequences which are non-trivial to align correctly, namely two immunoglobulin domains, plastocyanin with azurin and two distant globin sequences. It is shown to be useful to reduce a large ensemble of possible alignments down to a few which correspond more closely to the correct (structure based) alignment.

4.
Secondary structure prediction for modelling by homology   (total citations: 1; self-citations: 0; citations by others: 1)
An improved method of secondary structure prediction has been developed to aid the modelling of proteins by homology. Selected data from four published algorithms are scaled and combined as a weighted mean to produce consensus algorithms. Each consensus algorithm is used to predict the secondary structure of a protein homologous to the target protein and of known structure. By comparison of the predictions to the known structure, accuracy values are calculated and a consensus algorithm chosen as the optimum combination of the composite data for prediction of the homologous protein. This customized algorithm is then used to predict the secondary structure of the unknown protein. In this manner the secondary structure prediction is initially tuned to the required protein family before prediction of the target protein. The method improves statistical secondary structure prediction and can be incorporated into more comprehensive systems such as those involving consensus prediction from multiple sequence alignments. Thirty one proteins from five families were used to compare the new method to that of Garnier, Osguthorpe and Robson (GOR) and sequence alignment. The improvement over GOR is naturally dependent on the similarity of the homologous protein, varying from a mean of 3% to 7% with increasing alignment significance score.
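The tuning loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the per-residue score format, the three-state alphabet `HEC`, the coarse weight grid and the function names `consensus_predict`, `q3` and `tune_weights` are all hypothetical and are not taken from the paper, which does not state how the weights were searched.

```python
from itertools import product

STATES = "HEC"  # helix, strand, coil (assumed three-state alphabet)


def consensus_predict(propensities, weights):
    """Combine several algorithms' per-residue scores as a weighted mean.

    propensities: one list per algorithm; each list holds one dict
    {state: score} per residue. weights: one non-negative weight per algorithm.
    """
    total = sum(weights)
    n_residues = len(propensities[0])
    prediction = []
    for i in range(n_residues):
        mean = {
            s: sum(w * p[i][s] for w, p in zip(weights, propensities)) / total
            for s in STATES
        }
        # Ties are broken in favour of the state listed first in STATES.
        prediction.append(max(STATES, key=lambda s: mean[s]))
    return "".join(prediction)


def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the known state."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


def tune_weights(propensities, known_structure, grid=(0.0, 0.5, 1.0)):
    """Pick the weighting that best predicts a homologue of known structure.

    The exhaustive grid search is an assumption made for this sketch.
    """
    best_weights, best_accuracy = None, -1.0
    for weights in product(grid, repeat=len(propensities)):
        if sum(weights) == 0:
            continue  # an all-zero weighting carries no information
        accuracy = q3(consensus_predict(propensities, weights), known_structure)
        if accuracy > best_accuracy:
            best_weights, best_accuracy = weights, accuracy
    return best_weights, best_accuracy
```

The weights returned for the homologue would then be passed to `consensus_predict` together with the target protein's scores, which is the "tune on the family first, then predict" idea in the abstract.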

5.
Secondary structure prediction: combination of three different methods   (total citations: 1; self-citations: 0; citations by others: 1)
A combination of three complementary secondary structure prediction methods is presented. The methods used are the GOR III method, the Homologue method and a new method, the bit pattern method, which is based on hydrophilic/hydrophobic residue patterns. For this purpose a hydropathy scale was developed and is presented here. The combination algorithm (Combine method) was designed to take the best results of each method and use their differences in order to improve the prediction. The combination yields 65.5% correctly predicted residues in three states: α-helix (H), β-strand (E) and aperiodic structure (C), which is an improvement ranging from 2.5 to 6.5% compared with the individual methods when tested with a 67-polypeptide chain database. Seventy-five per cent of the regular secondary structure (H and E) runs are correctly located and β-sheet runs are much better located by the Combine method in comparison to the other methods.

6.
Twilight zone of protein sequence alignments   (total citations: 8; self-citations: 0; citations by others: 8)
Sequence alignments unambiguously distinguish between protein pairs of similar and non-similar structure when the pairwise sequence identity is high (>40% for long alignments). The signal gets blurred in the twilight zone of 20–35% sequence identity. Here, more than a million sequence alignments were analysed between protein pairs of known structures to re-define a line distinguishing between true and false positives for low levels of similarity. Four results stood out. (i) The transition from the safe zone of sequence alignment into the twilight zone is described by an explosion of false negatives. More than 95% of all pairs detected in the twilight zone had different structures. More precisely, above a cut-off roughly corresponding to 30% sequence identity, 90% of the pairs were homologous; below 25% less than 10% were. (ii) Whether or not sequence homology implied structural identity depended crucially on the alignment length. For example, if 10 residues were similar in an alignment of length 16 (>60%), structural similarity could not be inferred. (iii) The 'more similar than identical' rule (discarding all pairs for which percentage similarity was lower than percentage identity) reduced false positives significantly. (iv) Using intermediate sequences for finding links between more distant families was almost as successful: pairs were predicted to be homologous when the respective sequence families had proteins in common. All findings are applicable to automatic database searches.
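Result (iii) of this abstract is simple enough to express as a filter. The sketch below is illustrative only: the `Hit` record and its field names are hypothetical, the percentages are assumed to have been computed upstream by whatever alignment tool is in use, and `in_twilight_zone` applies the 20–35% band quoted in the abstract as a fixed range, which deliberately ignores the alignment-length dependence stressed in result (ii).

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One pairwise alignment result (hypothetical record layout)."""
    name: str
    pct_identity: float    # percentage of identical aligned residues
    pct_similarity: float  # percentage similarity, as reported by the aligner


def in_twilight_zone(pct_identity, low=20.0, high=35.0):
    """True if the identity falls in the 20-35% band named in the abstract.

    Simplification: a fixed band, with no correction for alignment length.
    """
    return low <= pct_identity <= high


def more_similar_than_identical(hits):
    """Discard every pair whose similarity is lower than its identity."""
    return [h for h in hits if h.pct_similarity >= h.pct_identity]
```

A typical use would be to run `more_similar_than_identical` over the raw hit list first and then inspect the survivors for which `in_twilight_zone` is true with extra care.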

7.
8.
One of the general paradigms for ab initio protein structure prediction involves sampling the conformational space such that a large set of decoy (candidate) structures are generated and then selecting native-like conformations from those decoys using various scoring functions. In this study, based on a physical/geometric approach first suggested by Banavar and colleagues, we formulate a knowledge-based scoring function, which uses the radii of curvature formed among triplets of residues in a protein conformation. By analyzing its performance on various decoy sets, we determine a good set of parameters (the distance cutoff and the number of distance bins) to use for configuring such a function. Furthermore, we investigate the effect of using various approaches for compiling the prior distribution on the performance of the knowledge-based function. Possible extensions to the current form of the residue triplet scoring function are discussed.

9.
Judging the significance of alignments is still a major problem in sequence comparison. We present a method to delineate reliable regions within an alignment. This differs from standard approaches in that it does not attempt to attribute one significance value to the alignment as a whole, but assesses alignment quality locally. An algorithm is provided that predicts which residue pairs in an alignment are likely to be correctly matched. The predictions are evaluated by comparison with alignments taken from tertiary structural superpositions.

10.
An optimized self-organizing map algorithm has been used to obtain protein topological (proteinotopic) maps. A neural network is able to arrange a set of proteins depending on their ultraviolet circular dichroism spectra in a completely unsupervised learning process. Analysis of the proteinotopic map reveals that the network extracts the main secondary structure features even with the small number of examples used. Some methods to use the proteinotopic map for protein secondary structure prediction are tested showing a good performance in the 200–240 nm wavelength range that is likely to increase as new protein structures are known.

11.
Predictions of protein secondary structure using current methods are often unrealistic, i.e. the predicted α-helices or β-strands are too short. To improve the realism, various heuristic ‘filtering’ or ‘smoothing’ methods are used. They are more or less intuitive and are based on ad hoc corrections. We present a regularization method to obtain a realistic secondary structure from predicted propensities. It is based on the known dynamic programming algorithm and is quite objective. It can be used with any prediction method which yields propensities. The regularized predictions conserve well the overall prediction accuracy and improve the ‘protein-likeness’ of the prediction.

12.
Accurate assignments of secondary structures in proteins are crucial for a useful comparison with theoretical predictions. Three major programs which automatically determine the location of helices and strands are used for this purpose, namely DSSP, P-Curve and Define. Their results have been compared for a non-redundant database of 154 proteins. On a residue per residue basis, the percentage match score is only 63% between the three methods. While these methods agree on the overall number of residues in each of the three states (helix, strand or coil), they differ on the number of helices or strands, thus implying a wide discrepancy in the length of assigned structural elements. Moreover, the length distribution of helices and strands points to the existence of artefacts inherent to each assignment algorithm. To overcome these difficulties a consensus assignment is proposed where each residue is assigned to the state determined by at least two of the three methods. With this assignment the artefacts of each algorithm are attenuated. The residues assigned in the same state by the three methods are better predicted than the others. This assignment will thus be useful for analysing the success rate of prediction methods more accurately.
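The two-out-of-three rule in this abstract amounts to a per-residue majority vote. A minimal sketch follows, assuming each program's output has already been reduced to a string over the three states H (helix), E (strand) and C (coil). The function names are hypothetical, and the fallback to coil when all three programs disagree is an assumption of this sketch, since the abstract does not say how that case is handled.

```python
from collections import Counter


def consensus_assignment(dssp, pcurve, define, fallback="C"):
    """Assign each residue the state chosen by at least two of three methods.

    fallback is used when all three methods disagree (an assumption here).
    """
    if not (len(dssp) == len(pcurve) == len(define)):
        raise ValueError("all three assignments must cover the same residues")
    consensus = []
    for states in zip(dssp, pcurve, define):
        state, votes = Counter(states).most_common(1)[0]
        consensus.append(state if votes >= 2 else fallback)
    return "".join(consensus)


def three_way_match(dssp, pcurve, define):
    """Fraction of residues given the same state by all three methods."""
    agree = sum(a == b == c for a, b, c in zip(dssp, pcurve, define))
    return agree / len(dssp)
```

`three_way_match` corresponds to the residue-per-residue match score quoted in the abstract (63% on the authors' database), while `consensus_assignment` produces the reference string against which predictions would be scored.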

13.
Potassium channels: a computer prediction of structure and selectivity   (total citations: 3; self-citations: 0; citations by others: 3)
Model structures for the pore of the potassium channels Shaker and ROMK1 are predicted. The models arise from computer simulations and suggest reasons for the striking selectivity of these channels for K+ and the blocking of ROMK1 by internal Mg2+. The modelled structure of the Shaker pore is supported by mutagenesis data. The mutagenesis experiments indicate the side chains responsible for binding to blocking agents [tetraethylammonium (TEA) and charybdotoxin (CTX)] and the model has these side chains suitably oriented for binding. An aromatic K+ binding site part way down the pore is also predicted by the Shaker pore model.

14.
The method of simulated annealing can be of use in protein structure prediction by homology modelling where side chain conformations must be predicted. In this study an attempt has been made to optimize a molecular dynamics method for this purpose. Heating and cooling protocols to maximize the accuracy of the predictions have been developed. The optimized protocol involves cooling from 3000 to 0 K over 20 ps while simultaneously introducing the non-bonded energy term. The use of a 'soft' non-bonded interaction energy term in place of a standard 6–12 potential is found to be important. The reliability of the predictions has been analysed in terms of the environment of the residues (solvent accessibility) and the degree of uncertainty in the structure (number of unknown torsion angles). Depending on these factors the percentage of unknown side chain torsion angles that are correctly predicted within 30° ranges from ~50 to 75%. Potential problems and limitations of the method are discussed.

15.
The EcoRV DNA methyltransferase (M·EcoRV) is an α-adenine methyltransferase. We have used two different programs to predict the secondary structure of M·EcoRV. The resulting consensus prediction was tested by a mutant profiling analysis. 29 neutral mutations of M·EcoRV were generated by five cycles of random mutagenesis and selection for active variants to increase the reliability of the prediction and to get a secondary structure prediction for some ambiguously predicted regions. The predicted consensus secondary structure elements could be aligned to the common topology of the structures of the catalytic domains of M·HhaI and M·TaqI. In a complementary approach we have isolated nine catalytically inactive single mutants. Five of these mutants contain an amino acid exchange within the catalytic domain of M·EcoRV (Val20-Ala, Lys81Arg, Cys192Arg, Asp193Gly, Trp231Arg). The Trp231Arg mutant binds DNA similarly to wild-type M·EcoRV, but is catalytically inactive. Hence this mutant behaves like a bona fide active site mutant. According to the structure prediction, Trp231 is located in a loop at the putative active site of M·EcoRV. The other inactive mutants were insoluble. They contain amino acid exchanges within the conserved amino acid motifs X, III or IV in M·EcoRV, confirming the importance of these regions.

16.
The integral membrane sialoglycoprotein PrPSc is the only identifiable component of the scrapie prion. Scrapie in animals and Creutzfeldt-Jakob disease in humans are transmissible, degenerative neurological diseases caused by prions. Standard predictive strategies have been used to analyze the secondary structure of the prion protein in conjunction with Fourier analysis of the primary sequence hydrophobicities to detect potential amphipathic regions. Several hydrophobic segments, a proline- and glycine-rich repeat region and putative glycosylation sites are incorporated into a model for the integral membrane topology of PrP. The complete amino acid sequences of the hamster, human and mouse prion proteins are compared and the effects of residue substitutions upon the predicted conformation of the polypeptide chain are discussed. While PrP has a unique primary structure, its predicted secondary structure shares some interesting features with the serum amyloid A proteins. These proteins undergo a post-translational modification to yield amyloid A, molecules that share with PrP the ability to polymerize into birefringent filaments. Our analyses may explain some experimental observations on PrP, and suggest further studies on the properties of the scrapie and cellular PrP isoforms.
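The abstract does not give the form of its Fourier analysis. One common way to look for amphipathic periodicity in a hydrophobicity profile is the hydrophobic moment, μ(θ) = sqrt[(Σ hₙ sin nθ)² + (Σ hₙ cos nθ)²], evaluated at about 100° per residue for an α-helix. The sketch below uses that formulation as a stand-in, not as the authors' method; the 18-residue window and the function names are assumptions.

```python
import math


def hydrophobic_moment(hydrophobicities, angle_deg):
    """Magnitude of the Fourier component of a hydrophobicity profile.

    angle_deg is the rotation per residue being probed, e.g. about 100
    degrees for an alpha-helix.
    """
    theta = math.radians(angle_deg)
    sin_sum = sum(h * math.sin(n * theta) for n, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(n * theta) for n, h in enumerate(hydrophobicities))
    return math.hypot(sin_sum, cos_sum)


def moment_profile(hydrophobicities, window=18, angle_deg=100.0):
    """Hydrophobic moment in each sliding window along the sequence.

    The window length is an illustrative choice, not a value from the abstract.
    """
    last_start = len(hydrophobicities) - window
    return [
        hydrophobic_moment(hydrophobicities[i:i + window], angle_deg)
        for i in range(last_start + 1)
    ]
```

Windows with a high moment at the helical angle are candidates for amphipathic helices; the input values would come from whichever per-residue hydrophobicity scale is chosen.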

17.
Relatively little has been known about the structure of alpha-helical membrane proteins, since until recently few structures had been crystallized. These limited data have restricted structural analyses to the prediction of secondary structure, rather than tertiary folds. In order to address this, this paper describes an analysis of the 23 available membrane protein structures. A number of findings are made that are of particular relevance to transmembrane helix packing: (1) on average lipid-tail-accessible transmembrane residues are significantly more hydrophobic, less conserved and contain different residue types to buried residues; (2) charged residues are not always buried and, when accessible to membrane lipid tails, few are paired with another charge and instead they often interact with phospholipid head-groups or with other residue types; (3) a significant proportion of lipid-tail-accessible charged and polar residues form hydrogen bonds only with residues one turn away in the same helix (intra-helix); (4) pore-lining residues are usually hydrophobic and it is difficult to distinguish them from buried residues in terms of either residue type or conservation; and (5) information was gained about the proportion of helices that tend to contribute to lining a pore and the resulting pore diameter. These findings are discussed with relevance to the prediction of membrane protein 3D structure.

18.
We describe a method based on neural networks for predicting contact maps of proteins using as input chemico-physical and evolutionary information. Neural networks are trained on a data set comprising the contact maps of 200 non-homologous proteins of well resolved three-dimensional structures. The systems learn the association rules between the covalent structure of each protein and its correspondent contact map by means of a standard back propagation algorithm. Validation of the predictor on the training set and on 408 proteins of known structure which are not homologous to those contained in the training set indicates that this method scores higher than statistical approaches previously described and based on correlated mutations and sequence information.
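For readers unfamiliar with the prediction target, a contact map is a symmetric boolean matrix marking which residue pairs lie close together in the three-dimensional structure. The sketch below builds one from per-residue coordinates. The 8 Å cutoff, the use of one representative point per residue and the `min_separation` parameter are illustrative assumptions, not values from the abstract.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=1):
    """Build a symmetric residue-residue contact map.

    coords: one (x, y, z) tuple per residue, e.g. C-alpha positions.
    cutoff: distance threshold in angstroms (assumed value).
    min_separation: smallest sequence separation |i - j| to consider.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = True
    return cmap
```

Raising `min_separation` removes the trivial contacts between sequence neighbours, which is usually what one wants when scoring a contact predictor.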

19.
In the course of molecular modeling or mutant prediction one often wants quick answers to questions such as: ‘Are there any residues in a beta-strand that point into an internal cavity, and are highly mutable?’; ‘Are there large polar residues in a helix that make a contact with a hydrophobic residue in a sheet, and don't make the maximal number of hydrogen bonds?’ or ‘Which hydrophobic residues are in a helix with a large hydrophobic moment, and make a contact with a co-factor, but at the same time still have a large accessible surface?’. I describe here a method to get answers to these kinds of questions in a very quick and easy manner. The method described is partly based on the principles used in the design of relational databases, and its mode of operation is similar to the query methods used in a relational database environment. Although designed for aiding in molecular modeling, its applicability is much more general. The method has been implemented as part of a large molecular modeling package which copes with the numerous problems in systematic handling of protein structures, e.g. residue numbering. This also implies that many normal tools such as graphical analyses, I/O facilities, etc. are available on-line.
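The relational-query idea in this abstract can be illustrated with a toy residue table and composable predicates. Everything here is hypothetical: the `Residue` fields, the `select` helper and the thresholds are invented for this sketch and do not reflect the actual package described by the author.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    """One row of a residue table (hypothetical columns)."""
    number: int
    name: str
    secondary_structure: str  # 'H' helix, 'E' strand, 'C' coil
    accessibility: float      # accessible surface area, arbitrary units
    hydrophobic: bool


def select(residues, *predicates):
    """Return the residues satisfying every predicate, like a SQL WHERE ... AND."""
    return [r for r in residues if all(p(r) for p in predicates)]


# Reusable predicates that can be combined freely in a query.
def in_helix(r):
    return r.secondary_structure == "H"


def is_hydrophobic(r):
    return r.hydrophobic


def exposed(threshold):
    return lambda r: r.accessibility > threshold
```

A question such as "which hydrophobic residues are in a helix but still have a large accessible surface?" then becomes `select(residues, is_hydrophobic, in_helix, exposed(50.0))`.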

20.
An amino acid index is a set of 20 numerical values representing any of the different physicochemical and biochemical properties of amino acids. As a follow-up to the previous study, we have increased the size of the database, which currently contains 402 published indices, and re-performed the single-linkage cluster analysis. The results basically confirmed the previous findings. Another important feature of amino acids that can be represented numerically is the similarity between them. Thus, a similarity matrix, also called a mutation matrix, is a set of 20 × 20 numerical values used for protein sequence alignments and similarity searches. We have collected 42 published matrices, performed hierarchical cluster analyses and identified several clusters corresponding to the nature of the data set and the method used for constructing the mutation matrix. Further, we have tried to reproduce each mutation matrix by the combination of amino acid indices in order to understand which properties of amino acids are reflected most. There was a relationship between the PAM units of Dayhoff's mutation matrix and the volume and hydrophobicity of amino acids. The database of 402 amino acid indices and 42 amino acid mutation matrices is made publicly available on the Internet.
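To make the definition concrete, an amino acid index can be held as a mapping from the 20 one-letter codes to numbers, and two indices can be compared by their correlation before clustering. The sketch below uses the Kyte-Doolittle hydropathy scale purely as a familiar example of such an index. The distance `1 - |r|` is an assumption chosen for illustration, since the abstract does not state which distance was fed to the cluster analysis.

```python
import math

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Kyte-Doolittle hydropathy values, used here only as an example index.
KYTE_DOOLITTLE = dict(zip(AMINO_ACIDS, [
    1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
    3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2,
]))


def pearson(index_a, index_b):
    """Pearson correlation between two amino acid indices."""
    xs = [index_a[aa] for aa in AMINO_ACIDS]
    ys = [index_b[aa] for aa in AMINO_ACIDS]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def index_distance(index_a, index_b):
    """Distance for clustering: 0 when indices are perfectly (anti)correlated.

    Using 1 - |r| is an assumption made for this sketch.
    """
    return 1.0 - abs(pearson(index_a, index_b))
```

With this distance, a hydrophobicity scale and a hydrophilicity scale that is simply its mirror image fall into the same cluster, which is usually the desired behaviour when grouping indices by the property they measure.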
