Similar articles
1.
A method for comparison of protein sequences based on their primary and secondary structure is described. Protein sequences are annotated with predicted secondary structures (using a modified Chou and Fasman method). Two-lettered code sequences are generated (Xx, where X is the amino acid and x is its annotated secondary structure). Sequences are compared with a dynamic programming method (STRALIGN) that includes a similarity matrix for both the amino acids and secondary structures. The similarity value for each paired two-lettered code is a linear combination of similarity values for the paired amino acids and their annotated secondary structures. The method has been applied to eight globin proteins (28 pairs) for which the X-ray structure is known. For protein pairs with high primary sequence similarity (>45%), STRALIGN alignment is identical to that obtained by a dynamic programming method using only primary sequence information. However, alignment of protein pairs with lower primary sequence similarity improves significantly with the addition of secondary structure annotation. Alignment of the pair with the least primary sequence similarity of 16% was improved from 0 to 37% ‘correct’ alignment using this method. In addition, STRALIGN was successfully applied to seven pairs of distantly related cytochrome c proteins, and three pairs of distantly related picornavirus proteins.
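The scoring idea in this abstract can be sketched as follows. This is a minimal illustration, not the STRALIGN implementation: the function names, the identity-based similarity functions, the `weight` parameter and the linear gap penalty are all assumptions made for the example, since the abstract does not give the actual matrices or gap model.

```python
def identity(x, y):
    """Toy similarity: 1.0 for identical symbols, 0.0 otherwise (an assumption)."""
    return 1.0 if x == y else 0.0


def combined_score(code_a, code_b, aa_sim=identity, ss_sim=identity, weight=0.5):
    """Score two two-letter codes such as 'Ah' (amino acid A, state h).

    The result is a linear combination of the amino acid similarity and the
    secondary structure similarity; `weight` is a hypothetical mixing factor.
    """
    aa_a, ss_a = code_a
    aa_b, ss_b = code_b
    return (1.0 - weight) * aa_sim(aa_a, aa_b) + weight * ss_sim(ss_a, ss_b)


def global_alignment_score(seq_a, seq_b, score=combined_score, gap=-1.0):
    """Needleman-Wunsch global alignment score over lists of two-letter codes.

    A linear gap penalty is assumed here purely for simplicity.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    table = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            table[i][j] = max(
                table[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1]),
                table[i - 1][j] + gap,   # gap in seq_b
                table[i][j - 1] + gap,   # gap in seq_a
            )
    return table[-1][-1]
```

For example, `combined_score("Ah", "Ae")` matches on the amino acid but not on the secondary structure, so with the default `weight` it scores half of what an identical pair does.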

2.
The G proteins transduce hormonal and other signals into regulation of enzymes such as adenylyl cyclase and retinal cGMP phosphodiesterase. Each G protein contains an α subunit that binds and hydrolyzes guanine nucleotides and interacts with βγ subunits and specific receptor and effector proteins. Amphipathic and secondary structure analysis of the primary sequences of five different α chains (bovine αs, αt1 and αt2, mouse αi, and rat αo) predicted the secondary structure of a composite α chain (αavg). The α chains contain four short regions of sequence homologous to regions in the GDP binding domain of bacterial elongation factor Tu (EF-Tu). Similarities between the predicted secondary structures of these regions in αavg and the known secondary structure of EF-Tu allowed us to construct a three-dimensional model of the GDP binding domain of αavg. Identification of the GDP binding domain of αavg defined three additional domains in the composite polypeptide. The first includes the amino terminal 41 residues of αavg, with a predicted amphipathic helical structure; this domain may control binding of the α chains to the βγ complex. The second domain, containing predicted β strands and helices, several of which are strongly amphipathic, probably contains sequences responsible for interaction of α chains with effector enzymes. The predicted structure of the third domain, containing the carboxy terminal 100 amino acids, is predominantly β sheet with an amphipathic helix at the carboxy terminus. We propose that this domain is responsible for receptor binding. Our model should help direct further experiments into the structure and function of the G protein α chain.

3.
In free solution, the caseins behave as non-compact and largely flexible molecules with a high proportion of residues accessible to solvent. Historically, they have been described as random coil-type proteins with only a nutritional function. Nevertheless, secondary structure prediction algorithms indicate that many parts of the (unphosphorylated, unglycosylated) polypeptide chains can form regular structures. In particular, a recurrent motif of the Ca2+-sensitive caseins in man, rat, mouse, guinea pig and ruminant species is an α-helix-loop-α-helix conformation in which the loop region typically contains a cluster of sites of phosphorylation. The biological function of the caseins is considered and it is suggested that the potential or actual conformations of the group of Ca2+-sensitive caseins are suited to the function of modulating the precipitation of calcium phosphate from solution. Either they can act as sites for nucleation or they can bind rapidly to calcium phosphate nuclei as they form spontaneously from supersaturated solution.

4.
Secondary structures of histones H1, H2A, H2B, H3, H4 and H5 have been calculated by the computer program ALB based on a molecular theory of protein secondary structure. The predicted secondary structures of all histones are predominantly α-helical. The calculated secondary structure of linker histones H1 and H5 is close to that previously obtained from two-dimensional NMR data. For each of the core histones (H2A, H2B, H3, H4) one long α-helix and several short ones have been predicted. These long helices can be identified with rods in the low-resolution electron density map.

5.
We have studied the question of how much extra predictive power the correlated mutational behaviour of pairs of amino acid residues separated along a sequence has concerning the likelihood of those residues being in contact in the folded protein. The mutational behaviour is deduced from multiple sequence alignments. Our findings are that there is, indeed, some valuable information available from this source and that it is sufficient to make a significant improvement in our ability to predict contacts, when compared with earlier methods that do not take into account the correlations between the mutations. This improvement is approximately twice as large as can be obtained by the more economical method of simply averaging pair preferences over the same sequence alignment. Even when using a method based on pair preferences, a further significant improvement can be made by penalizing more variable regions (on the reasonable assumption that invariant residues are relatively more likely to be in contact), though we have found no way of improving the pair preference method to the extent that it matches the method based on correlated behaviour. Our new method is thought to be the best data-based method of contact prediction developed so far, achieving, on average, an improvement over a random (i.e. information-free) prediction of a factor of five when the number of contacts predicted is chosen to match the number that actually occur.

6.
7.
Secondary structure prediction for modelling by homology (total citations: 1; self-citations: 0; citations by others: 1)
An improved method of secondary structure prediction has been developed to aid the modelling of proteins by homology. Selected data from four published algorithms are scaled and combined as a weighted mean to produce consensus algorithms. Each consensus algorithm is used to predict the secondary structure of a protein homologous to the target protein and of known structure. By comparison of the predictions to the known structure, accuracy values are calculated and a consensus algorithm chosen as the optimum combination of the composite data for prediction of the homologous protein. This customized algorithm is then used to predict the secondary structure of the unknown protein. In this manner the secondary structure prediction is initially tuned to the required protein family before prediction of the target protein. The method improves statistical secondary structure prediction and can be incorporated into more comprehensive systems such as those involving consensus prediction from multiple sequence alignments. Thirty-one proteins from five families were used to compare the new method to that of Garnier, Osguthorpe and Robson (GOR) and sequence alignment. The improvement over GOR is naturally dependent on the similarity of the homologous protein, varying from a mean of 3% to 7% with increasing alignment significance score.
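The tuning loop described in this abstract might look roughly like the sketch below. It is only an illustration under stated assumptions: the three-state alphabet `HEC`, the per-residue score dictionaries (already scaled to a common range), the small grid of candidate weights and all function names are hypothetical, and the abstract does not say how the authors searched for the optimum combination.

```python
from itertools import product

STATES = "HEC"  # helix, strand, coil: an assumed three-state alphabet


def consensus_prediction(predictions, weights):
    """Combine per-method scores as a weighted mean and pick the best state.

    predictions: one list per method, each holding a {state: score} dict per
    residue, with scores assumed to be pre-scaled to a common range.
    """
    total = sum(weights)
    states = []
    for pos in range(len(predictions[0])):
        mean = {
            s: sum(w * p[pos][s] for w, p in zip(weights, predictions)) / total
            for s in STATES
        }
        states.append(max(STATES, key=lambda s: mean[s]))
    return "".join(states)


def accuracy(predicted, known):
    """Fraction of residues whose predicted state matches the known one."""
    return sum(p == k for p, k in zip(predicted, known)) / len(known)


def tune_weights(predictions, known, candidates=(0.0, 0.5, 1.0)):
    """Grid-search the weights that best reproduce the homolog's known structure."""
    best_weights, best_acc = None, -1.0
    for weights in product(candidates, repeat=len(predictions)):
        if sum(weights) == 0:
            continue  # an all-zero weighting carries no information
        acc = accuracy(consensus_prediction(predictions, weights), known)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc
```

The weights returned by `tune_weights` for the homolog of known structure would then be passed to `consensus_prediction` together with the same methods' scores for the target protein.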

8.
The integral membrane sialoglycoprotein PrPSc is the only identifiable component of the scrapie prion. Scrapie in animals and Creutzfeldt-Jakob disease in humans are transmissible, degenerative neurological diseases caused by prions. Standard predictive strategies have been used to analyze the secondary structure of the prion protein in conjunction with Fourier analysis of the primary sequence hydrophobicities to detect potential amphipathic regions. Several hydrophobic segments, a proline- and glycine-rich repeat region and putative glycosylation sites are incorporated into a model for the integral membrane topology of PrP. The complete amino acid sequences of the hamster, human and mouse prion proteins are compared and the effects of residue substitutions upon the predicted conformation of the polypeptide chain are discussed. While PrP has a unique primary structure, its predicted secondary structure shares some interesting features with the serum amyloid A proteins. These proteins undergo a post-translational modification to yield amyloid A, molecules that share with PrP the ability to polymerize into birefringent filaments. Our analyses may explain some experimental observations on PrP, and suggest further studies on the properties of the scrapie and cellular PrP isoforms.
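One common way to apply Fourier analysis to sequence hydrophobicities is the hydrophobic moment, the magnitude of the Fourier component at a chosen angular period (about 100° per residue for an α-helix). The sketch below assumes that formulation and the Kyte-Doolittle hydropathy scale; the abstract names neither, so both are assumptions for illustration, as is the function name.

```python
import math

# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(sequence, angle_degrees=100.0, scale=KYTE_DOOLITTLE):
    """Magnitude of the hydropathy Fourier component at the given angle.

    Each residue n contributes its hydropathy along a direction rotated by
    n * angle_degrees; a large result indicates an amphipathic periodicity.
    """
    delta = math.radians(angle_degrees)
    cos_sum = sum(scale[aa] * math.cos(n * delta) for n, aa in enumerate(sequence))
    sin_sum = sum(scale[aa] * math.sin(n * delta) for n, aa in enumerate(sequence))
    return math.hypot(cos_sum, sin_sum)
```

Scanning a window along the sequence and comparing the moment at helix-like and strand-like angles is one way such an analysis could flag candidate amphipathic segments.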

9.
An optimized self-organizing map algorithm has been used to obtain protein topological (proteinotopic) maps. A neural network is able to arrange a set of proteins depending on their ultraviolet circular dichroism spectra in a completely unsupervised learning process. Analysis of the proteinotopic map reveals that the network extracts the main secondary structure features even with the small number of examples used. Some methods to use the proteinotopic map for protein secondary structure prediction are tested showing a good performance in the 200–240 nm wavelength range that is likely to increase as new protein structures are known.

10.
The use of multiple sequence alignments for secondary structure predictions is analysed. Seven different protein families, containing only sequences of known structure, were considered to provide a range of alignment and prediction conditions. Using alignments obtained by spatial superposition of main chain atoms in known tertiary protein structures allowed a mean improvement of 8% in secondary structure prediction accuracy, when compared to those obtained from the individual sequences. Substitution of these alignments by those determined directly from an automated sequence alignment algorithm showed variations in the prediction accuracy which correlated with the quality of the multiple alignments and distance of the primary sequence. Secondary structure predictions can be reliably improved using alignments from an automatic alignment procedure with a mean increase of 6.8%, giving an overall prediction accuracy of 68.5%, if there is a minimum of 25% sequence identity between all sequences in a family.

11.
12.
Synthetic genes coding for artificial proteins with predefined and nutritionally valuable amino acid compositions have been constructed and cloned in bacterial plasmid vector pKK233-2. The genes were constructed from three easily interchangeable ‘cassettes’ encoding either essential, non-essential or branched-chain amino acid residues. A potential hairpin loop structure in the mRNA around the region of the ribosome binding site was probably the reason for blockage of translation from this vector. Two selected genes, AHB (containing one copy of each cassette) and A6 (consisting of six concatemerized copies of the A cassette), were cloned into pUR300, a β-Gal fusion vector, and expressed as fusion proteins (β-Gal-AHB and β-Gal-A6).

13.
The crystal structure of a hybrid Escherichia coli triosephosphate isomerase (TIM) has been determined at 2.8 Å resolution. The hybrid TIM (ETIM8CHI) was constructed by replacing the eighth βα-unit of E.coli TIM with the equivalent unit of chicken TIM. This replacement involves 10 sequence changes. One of the changes concerns the mutation of a buried alanine (Ala232 in strand 8) into a phenylalanine. The ETIM8CHI structure shows that the A232F sequence change can be incorporated by a side-chain rotation of Phe224 (in helix 7). No cavities or strained dihedrals are observed in ETIM8CHI in the region near position 232, which is in agreement with the observation that ETIM8CHI and E.coli TIM have similar stabilities. The largest CA (C-alpha atom) movements, 3 Å, are seen for the C-terminal end of helix 8 (associated with the outward rotation of Phe224) and for the residues in the loop after helix 1 (associated with sequence changes in helix 8). From the structure it is not clear why the kcat of ETIM8CHI is 10 times lower than in wild type E.coli TIM.

14.
The structures of the interfaces of nine dimeric and nine tetrameric proteins have been analyzed and have been seen to follow general principles. These interfaces are combinations of four structural motifs, which resemble features of monomeric proteins. These are: (i) extended beta sheet; (ii) helix–helix packing; (iii) sheet–sheet packing; and (iv) loop interactions. Other common structural features in the interfaces studied are two-fold symmetry, charged hydrogen bonds and channel formation (found only in tetramers). Monomer–monomer interfaces are intermediate in hydrophobicity and charge between the interfaces between secondary structures of monomeric proteins and the exteriors of monomeric proteins. A typical interface has one of the first three of the structural motifs at its centre and loop interactions around the outside, where most of the charge resides.

15.
16.
A method has been developed to detect pairs of positions with correlated mutations in protein multiple sequence alignments. The method is based on reconstruction of the phylogenetic tree for a set of sequences and statistical analysis of the distribution of mutations in the branches of the tree. The database of homology-derived protein structures (HSSP) is used as the source of multiple sequence alignments for proteins of known three-dimensional structure. We analyse pairs of positions with correlated mutations in 67 protein families and show quantitatively that the presence of such positions is a typical feature of protein families. A significant but weak tendency is observed for correlated residue pairs to be close in the three-dimensional structure. With further improvements, methods of this type may be useful for the prediction of residue-residue contacts and subsequent prediction of protein structure using distance geometry algorithms. In conclusion, we suggest a new experimental approach to protein structure determination in which selection of functional mutants after random mutagenesis and analysis of correlated mutations provide sufficient proximity constraints for calculation of the protein fold.

17.
We have compared a novel sequence–structure matching technique, FORESST, for detecting remote homologs to three existing sequence-based methods: local amino acid sequence similarity by BLASTP, hidden Markov models (HMMs) of sequences of protein families using SAM, and HMMs based on sequence motifs identified using meta-MEME. FORESST compares predicted secondary structures to a library of structural families of proteins, using HMMs. Altogether 45 proteins from nine structural families in the database CATH were used in a cross-validated test of the fold assignment accuracy of each method. Local sequence similarity of a query sequence to a protein family is measured by the highest segment pair (HSP) score. Each of the HMM-based approaches (FORESST, MEME, amino acid sequence-based HMM) yielded a log-odds score for the query sequence. In order to make a fair comparison among these methods, the scores for each method were converted to Z-scores in a uniform way by comparing the raw scores of a query protein with the corresponding scores for a set of unrelated proteins. Z-scores were analyzed as a function of the maximum pairwise sequence identity (MPSID) of the query sequence to sequences used in training the model. For MPSID above 20%, the Z-scores increase linearly with MPSID for the sequence-based methods but remain roughly constant for FORESST. Below 15%, average Z-scores are close to zero for the sequence-based methods, whereas the FORESST method yielded average Z-scores of 1.8 and 1.1, using observed and predicted secondary structures, respectively. This demonstrates the advantage of the sequence–structure method for detecting remote homologs.
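The score normalization mentioned in this abstract can be illustrated with the usual Z-score definition. The abstract does not give the exact formula, so the sketch below assumes the standard form (raw score minus the mean of the unrelated-protein scores, divided by their sample standard deviation); the function name and arguments are hypothetical.

```python
from statistics import mean, stdev


def z_score(raw_score, unrelated_scores):
    """Express a query's raw score in standard deviations above background.

    unrelated_scores: scores the same method gives to proteins assumed to be
    unrelated to the model; at least two values are needed for stdev.
    """
    background_mean = mean(unrelated_scores)
    background_sd = stdev(unrelated_scores)  # sample standard deviation
    if background_sd == 0:
        raise ValueError("background scores have zero variance")
    return (raw_score - background_mean) / background_sd
```

Because every method is rescaled against its own background distribution, an HSP score and a log-odds score end up on the same dimensionless scale, which is what makes the comparison across methods possible.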

18.
The structure of E.coli soluble inorganic pyrophosphatase has been refined at 2.7 Å resolution to an R-factor of 20.9. The overall fold of the molecule is essentially the same as yeast pyrophosphatase, except that yeast pyrophosphatase is longer at both the N- and C-termini. Escherichia coli pyrophosphatase is a mixed α+β protein with a complicated topology. The active site cavity, which is also very similar to the yeast enzyme, is formed by seven β-strands and an α-helix and has a rather asymmetric distribution of charged residues. Our structure-based alignment extends and improves upon earlier sequence alignment studies; it shows that probably no more than 14, not 15–17, charged and polar residues are part of the conserved enzyme mechanism of pyrophosphatases. Six of these conserved residues, at the bottom of the active site cavity, form a tight group centred on Asp70 and probably bind the two essential Mg2+ ions. The others, more spread out and more positively charged, presumably bind substrate. Escherichia coli pyrophosphatase has an extra aspartate residue in the active site cavity, which may explain why the two enzymes bind divalent cation differently. Based on the structure, we have identified a sequence motif that seems to occur only in soluble inorganic pyrophosphatases.

19.
A methodology is proposed to solve a difficult modeling problem related to the recently sequenced P39 protein. This sequence shares no similarity with any known 3D structure, but a fold is proposed by several threading tools. The difficulty in aligning the target sequence on one of the proposed template structures is overcome by combining the results of several available prediction methods and by refining a rational consensus between them. In silico validation of the obtained model and a preliminary cross-check with experimental features allow us to state that this borderline prediction is at least reasonable. This model raises relevant hypotheses on the main structural features of the protein and allows the design of site-directed mutations. Knowing the genetic context of the P39 reading frame, we are now able to suggest a function for the P39 protein: it would act as a periplasmic substrate-binding protein.

20.
We report the complete structure determination of a 34 residue synthetic peptide with the amino acid sequence of the dimerization domain (leucine zipper) of GCN4. A high resolution structure in solution was obtained by 1H-NMR studies and distance geometry calculations followed by restrained energy minimization. A set of 20 final structures was obtained with an average root mean square deviation of 1.3 Å for the backbone atoms (excluding the first and the last two residues). The structure contains an uninterrupted helix. A comparison with a structure previously determined for a larger peptide containing both the DNA-binding region (basic region) and the leucine-zipper motif shows the structural independence of the leucine-zipper domain from the contiguous DNA binding region.
