Similar literature
 20 similar documents found (search time: 15 ms)
1.
Evaluation and improvements in the automatic alignment of protein sequences   Cited by: 1 in total (self-citations: 0; citations by others: 1)
The accuracy of protein sequence alignment obtained by applying a commonly used global sequence comparison algorithm is assessed. Alignments based on the superposition of the three-dimensional structures are used as a standard for testing the automatic, sequence-based methods. Alignments obtained from the global comparison of five pairs of homologous protein sequences studied gave 54% agreement overall for residues in secondary structures. The inclusion of information about the secondary structure of one of the proteins, in order to limit the number of gaps inserted in regions of secondary structure, improved this figure to 68%. A similarity score of greater than six standard deviation units suggests that an alignment which is greater than 75% correct within secondary structural regions can be obtained automatically for the pair of sequences.
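The idea of limiting gaps inside secondary structure can be illustrated with a small sketch. This is not the algorithm evaluated in the abstract: it is a plain Needleman-Wunsch global alignment with linear gap costs, and the scoring values (`match`, `mismatch`, `gap`, `ss_gap`) and the function name `nw_align` are illustrative assumptions. The only structural information used is a string `ss_a` marking each residue of the first sequence as helix (`H`), strand (`E`) or coil (any other character).

```python
def nw_align(a, b, ss_a, match=2, mismatch=-1, gap=-2, ss_gap=-6):
    """Global alignment in which gaps inside secondary structure cost more.

    All scoring values are illustrative assumptions, not published parameters.
    """
    n, m = len(a), len(b)
    in_ss = [c in "HE" for c in ss_a]

    def del_pen(i):
        # Residue a[i-1] is aligned to a gap in b.
        return ss_gap if in_ss[i - 1] else gap

    def ins_pen(i):
        # A gap is opened in a between residues a[i-1] and a[i].
        return ss_gap if 0 < i < n and in_ss[i - 1] and in_ss[i] else gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + del_pen(i)
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + ins_pen(0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + del_pen(i),
                              score[i][j - 1] + ins_pen(i))

    # Trace back one optimal path from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + del_pen(i):
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `nw_align("ACDEFG", "ACFG", "CCHHCC")` penalises deleting the two helical residues more heavily than the same call with an all-coil string `"CCCCCC"`, so the optimal alignment moves its gaps out of the helix.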

2.
Compensating changes in protein multiple sequence alignments   Cited by: 2 in total (self-citations: 0; citations by others: 2)
A method was developed to identify compensating changes between residues at positions in a multiple sequence alignment. (For example, one position might always contain a positively charged residue when the other is negatively charged and vice versa.) A correlation-based method was used to measure the compensation found in the four residues at a pair of positions in any two sequences in a multiple alignment. All possible sequence pairings were measured at the pair of positions and the resulting matrix analysed to give a measure of cooperativity among the pairs. The basic method was sufficiently flexible to consider a number of amino acid relatedness models based both on scalar and vectorial properties. Pairs of compensating positions were selected by the method and their mean separation (in a protein of known structure) was compared to both the mean pair-wise separation over all residues and the pairwise separation over an equivalent sample of pairs of residues selected on the basis of their conservation alone. The latter is an important control that has been omitted from previous studies. The results indicated that, at best, there was a slight effect (of marginal significance) leading to the selection of closer pairs by the compensation measure when compared to the mean of all pairs. However, this was never as good as the simpler measure based on conservation alone, which always found a significant majority of proteins with a sample mean less than the overall mean.

3.
The residue pair preference profile (R3P) method is an inverse folding method that combines environmental profiles and pair preference profiles. The method uses statistical preferences for residue pairs which score the likelihood of finding a profiled residue to be paired with a residue within its local environment. All pairs are characterized by their dihedral angles, secondary structure and number of neighboring residues as a function of residue type. Each residue pair preference is expressed for all 20 amino acids of the profiled residue and is weighted by the compatibility of the environment residue with its own local environment. The R3P method produces an initial profile-sequence alignment which is then refined by converting the initial profile into a profile of a target sequence threaded into the structure of the initial profile. We have tested this method by evaluating alignments of sequences with known 3-D structures using structural superposition alignments as reference. R3P-sequence alignments are 50% correct on average for sequences whose 3-D structure pairs superimpose with an r.m.s. deviation of 1.97 Å. The average improvement in correctness during this iterative refinement is 14%. The R3P-sequence alignments are compared with sequence-sequence and 3-D profile-sequence alignments. When all three methods are combined, on average 50% of the alignments are correct for pairs of 3-D structures that superimpose within 2.12 Å. A 3-D model of HisA is predicted with the combined method.

4.
Making an alignment of the amino acid sequences is an essential step in the prediction of an unknown protein structure by model building from the known structure of a protein of the same family. To improve the accuracy of the alignments, we introduced the concept of hydrophobic core scores, which restrains putting insertions/deletions in the hydrophobic core regions of the protein. Eight pairs of protein sequences were aligned by this method, and the quality of the alignments was assessed by reference to those obtained by the structural superposition. The introduction of the hydrophobic core scores derived from the knowledge of the tertiary structure of one of each pair resulted in an improvement of the accuracy of the alignments. The quality of the alignment was found to depend on the homology of the protein sequences.

5.
A data bank merging related protein structures and sequences   Cited by: 1 in total (self-citations: 0; citations by others: 1)
A data collection which merges protein structural and sequence information is described. Structural superpositions amongst proteins with similar main-chain fold were performed or collected from the literature. Sequences taken from the protein primary structure databases were associated with the multiple structural alignments providing they were at least 50% homologous in residue identity to one of the structural sequences and at least 50% of the structural sequence residues were alignable. Such restrictions allow reasonable confidence that the primary sequences share the conformation of the tertiary structural templates, except in the less conserved loop regions. Multiple structural superpositions were collected for 38 familial groups containing a total of 209 tertiary structures; 45 structures had no superposable mates and were used individually. Other information is also provided, such as main-chain and side-chain conformational angles, secondary structural assignments and the like. Wedding the primary and tertiary structural data resulted in an 8-fold increase of data bank sequence entries over those associated with the known three-dimensional architectures alone.

6.
A multiple sequence alignment algorithm is described that uses a dynamic programming-based pattern construction method to align a set of homologous sequences based on their common pattern of conserved sequence elements. This pattern-induced multi-sequence alignment (PUMA) algorithm can employ secondary-structure dependent gap penalties for use in comparative modelling of new sequences when the three-dimensional structure of one or more members of the same family is known. We show that the use of secondary structure information can significantly improve the accuracy of aligning structure boundaries in a set of homologous sequences even when the structure of only one member of the family is known.

7.
A three-dimensional model of the 507–749 region of neutral endopeptidase-24.11 (NEP; E.C.3.4.24.11) was constructed integrating the results of secondary structure predictions and sequence homologies with the bacterial endopeptidase thermolysin. Additional data were extracted from the structure of two other metalloproteases, astacin and stromelysin. The resulting model accounts for the main biological properties of NEP and has been used to describe the environment close to the zinc atom defining the catalytic site. The analysis of several thiol inhibitors, complexed in the model active site, revealed the presence of a large hydrophobic pocket at the S1' subsite level. This is supported by the nature of the constitutive amino acids. The computed energies of bound inhibitors correspond with the relative affinities of the stereoisomers of benzofused macrocycle derivatives of thiorphan. The model could be used to facilitate the design of new NEP inhibitors, as illustrated in the paper.

8.
A new approach has been developed to reduce multiple protein structures obtained from NMR structure analysis to a smaller number of representative structures which still reflect the structural diversity of the data sets. The method, based on the clustering of similar structures, has been tested in the homology model building of the structure of Sox-5, a sequence-specific DNA-binding protein belonging to the high mobility group (HMG) nuclear proteins family. Sox (SRY box) genes are the autosomal genes related to the sex-determining SRY, Y chromosomal gene. The Sox-5 protein, encoded by one of the SRY-related genes, displays a 29% sequence identity with the HMG1 B-box domain whose structure, determined previously by NMR, has been used in our study to predict the structure of Sox-5. Two independent ensembles of HMG1 structures, each represented by closely related coordinate sets, were used. Nine representative structures for HMG1 were subsequently selected as starting points for the modelling of Sox-5. The model of the protein shows close similarity to the HMG1 fold, with differences at the secondary structure level located mainly in α-helices 1 and 3. A left-handed, three residue per turn polyproline II helix, forming a conserved polyproline II/α-helix supersecondary motif, was identified in the N-terminal region of Sox-5 and other HMG boxes.

9.
An empirical relationship between occupancy and the atomic displacement parameter of water molecules in protein crystal structures has been found by comparing a set of well refined sperm whale myoglobin crystal structures. The relationship agrees with a series of independent structural features whose impact on water occupancy can easily be predicted as well as with other known data and is independent of the protein fold. The estimation of the water occupancy in protein crystal structures may help in understanding the physico-chemical properties of the protein–solvent interface and can allow the monitoring of the accuracy of the protein crystal structure refinement.

10.
A methodology is proposed to solve a difficult modeling problem related to the recently sequenced P39 protein. This sequence shares no similarity with any known 3D structure, but a fold is proposed by several threading tools. The difficulty in aligning the target sequence on one of the proposed template structures is overcome by combining the results of several available prediction methods and by refining a rational consensus between them. In silico validation of the obtained model and a preliminary cross-check with experimental features allow us to state that this borderline prediction is at least reasonable. This model raises relevant hypotheses on the main structural features of the protein and allows the design of site-directed mutations. Knowing the genetic context of the P39 reading frame, we are now able to suggest a function for the P39 protein: it would act as a periplasmic substrate-binding protein.

11.
A new multiple sequence alignment procedure is presented. Several different multiple alignments are made using differing criteria. Having divided the sequences into strongly conserved regions (SCRs) and loosely conserved regions (LCRs), the ‘best’ alignment for each LCR is chosen, independently of the other LCRs, from a selection of possibilities in the multiple alignments. To help make this choice for each LCR, the secondary structure is predicted and shown alongside each different possible alignment. One advantage of this method over automatic, non-interactive methods is that the final alignment is not dependent on the choice of a single set of scoring parameters. Another is that, by allowing interactive choice and by taking account of secondary structural information, the final alignment is based more on biological rather than mathematical factors. This method can produce better alignments than any of the initial automatic multiple alignment methods used.

12.
The catalytic subunit of protein kinase casein kinase 2 (CK2), which has specificity for both ATP and GTP, shows significant amino acid sequence similarity to the cyclin-dependent kinase 2 (CDK2). We constructed site-directed mutants of CK2 and used a three-dimensional model to investigate the basis for the dual specificity. Introduction of Phe and Gly at positions 50 and 51, in order to restore the pattern of the glycine-rich motif, did not seriously affect the specificity for ATP or GTP. We show that the dual specificity probably originates from the loop situated around the position His115 to Asp120 (HVNNTD). The insertion of a residue in this loop in CK2 subunits, compared with CDK2 and other kinases, might orient the backbone to interact with the bases A and G; this insertion is conserved in all known CK2. The mutant N118, the design of which was based on the modelling, showed reduced affinity for GTP as predicted from the model. Other mutants were intended to probe the integrity of the catalytic loop, alter the polarity of a buried residue and explore the importance of the carboxy terminus. Introduction of Arg to replace Asn189, which is mapped on the activation loop, results in a mutant with decreased kcat, possibly as a result of disruption of the interaction between this residue and basic residues in the vicinity. Truncation at position 331 eliminates the last 60 residues of the subunit and this mutant has a reduced catalytic efficiency compared with the wild-type. Catalytic efficiency is restored in the truncation mutant by the replacement of a potentially buried Glu at position 252 by Lys, probably owing to a higher stability resulting from the formation of a salt bridge between Lys252 and Asp208.

13.
The average hydrophobicity of a polypeptide segment is considered to be the most important factor in the formation of transmembrane helices, and the partitioning of the most hydrophobic (MH) segment into the alternative nonpolar environment, a membrane or the hydrophobic core of a globular protein, may determine the type of protein produced. In order to elucidate the importance of the MH segment in determining which of the two types of protein results from a given amino acid sequence, we statistically studied the characteristics of MH helices, longer than 19 residues in length, in 97 membrane proteins whose three-dimensional structure or topology is known, as well as 397 soluble proteins selected from the Protein Data Bank. The average hydrophobicity of MH helices in membrane proteins had a characteristic relationship with the length of the protein. All MH helices in membrane proteins that were longer than 500 residues had a hydrophobicity greater than 1.75 (Kyte and Doolittle scale), while the MH helices in membrane proteins smaller than 100 residues could be as hydrophilic as 0.1. The possibility of developing a method to discriminate membrane proteins from soluble ones, based on the effect of size on the type of protein produced, is discussed.
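As a rough illustration of what "most hydrophobic segment" means, the sketch below slides a fixed window along a sequence and returns the window with the highest mean Kyte and Doolittle hydropathy. The table holds the standard Kyte and Doolittle values; the window length of 19 residues, the function name `most_hydrophobic_segment` and the restriction to sequence windows (rather than the helices defined in the study) are assumptions made for this example.

```python
# Kyte and Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophobic_segment(seq, window=19):
    """Return (start index, mean hydropathy) of the most hydrophobic window.

    The window length is an assumed parameter, not one taken from the study.
    """
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_mean = 0, float("-inf")
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean = sum(KD[aa] for aa in segment) / window
        if mean > best_mean:  # keep the first window reaching the maximum
            best_start, best_mean = start, mean
    return best_start, best_mean
```

A score computed this way could then be compared with the values quoted in the abstract (1.75 for long membrane proteins, 0.1 for short ones), bearing in mind that the study's own segment definition may differ from this sketch.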

14.
Any two β-strands belonging to two different β-sheets in a protein structure are considered to pack interactively if each β-strand has at least one residue that undergoes a loss of one tenth or more of its solvent contact surface area upon packing. A data set of protein 3-D structures (determined at 2.5 Å resolution or better), corresponding to 428 protein chains, contains 1986 non-identical pairs of β-strands involved in interactive packing. The inter-axial distance between these is significantly correlated to the weighted sum of the volumes of the interacting residues at the packing interface. This correlation can be used to predict the changes in the inter-sheet distances in equivalent β-sheets in homologous proteins and, therefore, is of value in comparative modelling of proteins.

15.
A Monte Carlo simulation program (MONTY) has been developed to dock proteins onto DNA. Protein and DNA interact via square-well potentials for hydrogen bond and van der Waals interactions. The effect of the inclusion of DNA flexibility and experimentally derived restraints has been tested on members of the helix-turn-helix family of DNA binding proteins. Unwinding and bending the DNA double helix improves the number of correctly retrieved hydrogen bonds in simulations starting from the 434 cro protein monomer complexed with a standard B-DNA OR1 half-site. Agreement with phosphate ethylation interference and mutagenesis data is rewarded with energy bonuses. This protocol was tested on protein-DNA complexes of 434 cro, lac headpiece and a mutant lac headpiece resembling the gal repressor headpiece with the recognition helices in correct and reversed orientations in the DNA major groove. The inclusion of experimental data gives an improved convergence of the correctly oriented structures and allows for an easier discrimination between correctly and incorrectly docked complexes.

16.
The protein kinase family can be subdivided into two main groups based on their ability to phosphorylate Ser/Thr or Tyr substrates. In order to understand the basis of this functional difference, we have carried out a comparative analysis of sequence conservation within and between the Ser/Thr and Tyr protein kinases. A multiple sequence alignment of 86 protein kinase sequences was generated. For each position in the alignment we have computed the conservation of residue type in the Ser/Thr, in the Tyr and in both of the kinase subfamilies. To understand the structural and/or functional basis for the conservation, we have mapped these conservation properties onto the backbone of the recently determined structure of the cAMP-dependent Ser/Thr kinase. The results show that the kinase structure can be roughly segregated, based upon conservation, into three zones. The inner zone contains residues highly conserved in all the kinase family and describes the hydrophobic core of the enzyme together with residues essential for substrate and ATP binding and catalysis. The outer zone contains residues highly variable in all kinases and represents the solvent-exposed surface of the protein. The third zone is comprised of residues conserved in either the Ser/Thr or Tyr kinases or in both, but which are not conserved between them. These are sandwiched between the hydrophobic core and the solvent-exposed surface. In addition to analyzing overall conservation in the kinase family, we have also looked at conservation of its substrate and ATP binding sites. The ATP site is highly conserved throughout the kinases, whereas the substrate binding site is more variable. The active site contains several positions which differ between the Ser/Thr and Tyr kinases and may be responsible for discriminating between hydroxyl bearing side chains. Using this information we propose a model for Tyr substrate binding to the catalytic domain of the epidermal growth factor receptor (EGFR).
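The per-position comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: conservation is taken as the fraction of sequences sharing the most common residue in a column, the 0.8 cutoff is arbitrary, and the names `column_conservation` and `classify_columns` are hypothetical. The abstract does not state which conservation measure or threshold the authors used.

```python
from collections import Counter


def column_conservation(seqs, col):
    """Fraction of sequences sharing the most common residue in one column."""
    residues = [s[col] for s in seqs if s[col] != "-"]  # ignore gap characters
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(residues)


def classify_columns(ser_thr, tyr, threshold=0.8):
    """Label each alignment column using an assumed conservation cutoff.

    ser_thr and tyr are lists of equal-length aligned sequences.
    """
    labels = []
    for col in range(len(ser_thr[0])):
        in_all = column_conservation(ser_thr + tyr, col)
        in_ser_thr = column_conservation(ser_thr, col)
        in_tyr = column_conservation(tyr, col)
        if in_all >= threshold:
            labels.append("family-wide")          # conserved across both groups
        elif in_ser_thr >= threshold or in_tyr >= threshold:
            labels.append("subfamily-specific")   # conserved within a group only
        else:
            labels.append("variable")
    return labels
```

With toy subfamilies `["GKSA", "GKSC", "GKSD"]` and `["GRYE", "GRYF", "GRYH"]`, the first column is labelled family-wide, the second and third subfamily-specific, and the last variable, loosely mirroring the three zones in the abstract.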

17.
18.
We present here a spectroscopic structural characterization of octarellin, a recently reported de novo protein modelled on α/β-barrel proteins [K. Goraj, A. Renard and J. A. Martial (1990) Protein Engng, 3, 259–266]. Infrared and Raman spectra analyses of octarellin's secondary structure reveal the expected percentage of α-helices (30%) and a higher β-sheet content (40%) than predicted from the design. When the Raman spectra obtained with octarellin and native triosephosphate isomerase (a natural α/β-barrel) are compared, similar percentages of secondary structures are found. Thermal denaturation of octarellin monitored by CD confirms that its secondary structures are quite stable, whereas its native-like tertiary fold is not. Tyrosine residues, predicted to be partially hidden from solvent, are actually exposed as revealed by Raman and UV absorption spectra. We conclude that the attempted α/β-barrel conformation in octarellin may be loosely packed. The criteria used to design octarellin are discussed and improvements suggested.

19.
The solution structure of the 38 amino acid C-terminal region of the precursor for the HPLC-6 antifreeze protein from winter flounder has been investigated with molecular dynamics using the AMBER software. The simulation for the peptide in aqueous solution was carried out at a constant temperature of 0°C and at atmospheric pressure. The simulation covered 120 ps and the results were analyzed based on data sampled upon reaching a stable equilibrium phase. Information has been obtained on the quality of constant temperature and pressure simulations, the solution structure and dynamics, the hydrogen bonding network, the helix-stabilizing role of terminal charges and the interaction with the surrounding water molecules. The Lys18–Glu22 interactions and the terminal charged residues are found to stabilize a helical structure with the side chains of Thr2, Thr13, Thr24 and Thr35 equally spaced on one side of the helix. The spacing between oxygen atoms in the hydroxyl group of the threonine side chains exhibits fluctuations of the order of 2–3 Å during the 120 ps of simulation, but values simultaneously close to the repeat distance of 16.6 Å between oxygen atoms along the [0112] direction in ice are observed. Furthermore, two engineered variants were studied using the same simulation protocol.

20.
The results of a protein design project are used to compare different predictive strategies with respect to protein-protein interactions. We have been able to generate variants of human pancreatic secretory trypsin inhibitor (hPSTI) optimized with respect to the affinity and specificity for human leukocyte elastase relative to trypsin and chymotrypsin, and in particular chymotrypsin. The extremely strong and specific human leukocyte elastase inhibitors were thus developed in three rounds of mutagenesis and two rounds of 3-D modelling; only 24 variants in total were synthesized, although variations at seven different amino acid positions were involved (i.e. from 20^7 possible variants). An excellent elastase inhibitor could be designed with the minimum of two amino acid exchanges. The value of structural modelling and actual structure determination is discussed in the light of the experimental results of the designed protein variants and the results of tertiary structure determinations of the free variant and the inhibitor-protease complex. Particular reference is given to the strategy to be followed in protein design projects in general and to the development of protease inhibitors in particular.
