Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Evolutionary divergence and conservation of trypsin (cited by 7: 0 self-citations, 7 by others)
The trypsin sequences currently available in the data banks have been collected and aligned using first the amino acid sequence homology and, subsequently, the superposed crystal structures of trypsins from the cow, the bacterium Streptomyces griseus and the fungus Fusarium oxysporum. The phylogenetic tree constructed according to this multiple alignment is consistent with a continuous evolutionary divergence of trypsin from a common ancestor of both prokaryotes and eukaryotes. Comparison of crystal structures reveals a strict conservation of secondary structure. Similarly, in the alignment of all the sequences, insertions and deletions occur only in regions corresponding to loops between the secondary structure elements in the known crystal structures. The conserved residues cluster around the active site. Almost all conserved residues can be associated with one of the basic functional features of the protein: zymogen activation, catalysis and substrate specificity. In contrast, the residues of the hydrophobic core of the protein and the calcium ion binding sites are generally not conserved. The conserved features of trypsin and the nature of the conservation are discussed in detail.

2.
Antibodies are powerful tools for studying the in situ localization and physiology of proteins. The prediction of epitopes by molecular modelling has been used successfully for the papilloma virus, and valuable antibodies have been raised [Muller et al. (1990) J. Gen. Virol., 71, 2709–2717]. We have improved the modelling approach to allow us to predict epitopes from the primary sequences of the cystic fibrosis transmembrane conductance regulator. The procedure involves searching for fragments of primary sequences likely to make amphipathic secondary structures, which are hydrophilic enough to be at the surface of the folded protein and thus accessible to antibodies. Amphipathic helices were predicted using the methods of Berzofsky, Eisenberg and Jahnig. Their hydrophobic–hydrophilic interface was calculated and drawn, and used to predict the orientation of the helices at the surface of the native protein. Amino acids involved in turns were selected using the algorithm of Eisenberg. Tertiary structures were calculated using ‘FOLDING’, software developed by R. Brasseur for the prediction of small protein structures [Brasseur (1995) J. Mol. Graphics, in press]. We selected sequences that folded as turns with at least five protruding polar residues. One important property of antibodies is selectivity. To optimize the selectivity of the raised antibodies, each sequence was screened for similarity (FASTA) to the protein sequences from several databanks. Ubiquitous sequences were discarded. This approach led to the identification of 13 potential epitopes in the cystic fibrosis transmembrane conductance regulator: seven helices and six loops.

3.
The use of multiple sequence alignments for secondary structure predictions is analysed. Seven different protein families, containing only sequences of known structure, were considered to provide a range of alignment and prediction conditions. Using alignments obtained by spatial superposition of main chain atoms in known tertiary protein structures allowed a mean improvement of 8% in secondary structure prediction accuracy, when compared to those obtained from the individual sequences. Substitution of these alignments by those determined directly from an automated sequence alignment algorithm showed variations in the prediction accuracy which correlated with the quality of the multiple alignments and distance of the primary sequence. Secondary structure predictions can be reliably improved using alignments from an automatic alignment procedure with a mean increase of 6.8%, giving an overall prediction accuracy of 68.5%, if there is a minimum of 25% sequence identity between all sequences in a family.

4.
Sequence weighting techniques are aimed at balancing redundant observed information from subsets of similar sequences in multiple alignments. Traditional approaches apply the same weight to all positions of a given sequence, hence equal efficiency of phylogenetic changes is assumed along the whole sequence. This restrictive assumption is not required for the new method PSIC (position-specific independent counts) described in this paper. The number of independent observations (counts) of an amino acid type at a given alignment position is calculated from the overall similarity of the sequences that share the amino acid type at this position with the help of statistical concepts. This approach allows the fast computation of position-specific sequence weights even for alignments containing hundreds of sequences. The PSIC approach has been applied to profile extraction and to the fold family assignment of protein sequences with known structures. Our method was shown to be very productive in finding distantly related sequences and more powerful than Hidden Markov Models or the profile methods in WiseTools and PSI-BLAST in many cases. The profile extraction routine is available on the WWW (http://www.bork.embl-heidelberg.de/PSIC or http://www.imb.ac.ru/PSIC).

5.
We present an efficient technique for the comparison of protein structures. The algorithm uses a vector representation of the secondary structure elements and searches for spatial configurations of secondary structure elements in proteins. In such recurring protein folds, the order of the secondary structure elements in the protein chains is disregarded. The method is based on the geometric hashing paradigm and implements approaches originating in computer vision. It represents and matches the secondary structure element vectors in a 3-D translation and rotation invariant manner. The matching of a pair of proteins takes on average under 3 s on a Silicon Graphics Indigo2 workstation, allowing extensive all-against-all comparisons of the data set of non-redundant protein structures. Here we have carried out such a comparison for a data set of over 500 protein molecules. The detection of recurring topological and non-topological, secondary structure element order-independent protein folds may provide further insight into evolution. Moreover, as these recurring folding units are likely to be conformationally favourable, the availability of a data set of such topological motifs can serve as a rich input for threading routines. Below, we describe this rapid technique and the results it has obtained. While some of the obtained matches conserve the order of the secondary structure elements, others are entirely order independent. As an example, we focus on the results obtained for Che Y, a signal transduction protein, and on the profilin–β-actin complex. The Che Y molecule is composed of a five-stranded, parallel β-sheet flanked by five helices. Here we show its similarity with the Escherichia coli elongation factor, with L-arabinose binding protein, with haloalkane dehalogenase and with adenylate kinase. The profilin–β-actin contains an antiparallel β-pleated sheet with α-helical termini. Its similarities to lipase, fructose diphosphatase and β-lactamase are displayed.

6.
Evaluation and improvements in the automatic alignment of protein sequences (cited by 1: 0 self-citations, 1 by others)
The accuracy of protein sequence alignment obtained by applying a commonly used global sequence comparison algorithm is assessed. Alignments based on the superposition of the three-dimensional structures are used as a standard for testing the automatic, sequence-based methods. Alignments obtained from the global comparison of five pairs of homologous protein sequences studied gave 54% agreement overall for residues in secondary structures. The inclusion of information about the secondary structure of one of the proteins, in order to limit the number of gaps inserted in regions of secondary structure, improved this figure to 68%. A similarity score of greater than six standard deviation units suggests that an alignment which is greater than 75% correct within secondary structural regions can be obtained automatically for the pair of sequences.
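The percentage agreement quoted above can be illustrated with a minimal sketch. It assumes each alignment is given as two gapped strings of equal length with `-` as the gap character, and it counts agreement over all aligned residue pairs rather than only those in secondary structures; the function names `aligned_pairs` and `agreement` are hypothetical and not taken from the paper.

```python
def aligned_pairs(seq_a: str, seq_b: str, gap: str = "-") -> set[tuple[int, int]]:
    """Return the set of (index in A, index in B) residue pairs that an alignment matches."""
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for x, y in zip(seq_a, seq_b):
        if x != gap and y != gap:
            pairs.add((i, j))
        if x != gap:
            i += 1
        if y != gap:
            j += 1
    return pairs


def agreement(test: tuple[str, str], reference: tuple[str, str]) -> float:
    """Percentage of reference residue pairs that the test alignment reproduces."""
    ref = aligned_pairs(*reference)
    if not ref:
        return 0.0
    return 100.0 * len(ref & aligned_pairs(*test)) / len(ref)


# Toy example: the reference (e.g. structure-based) alignment places the gap
# one column earlier than the automatic alignment does.
reference = ("ACDEF", "AC-EF")
automatic = ("ACDEF", "ACE-F")
print(agreement(automatic, reference))  # 75.0
```

Restricting the count to secondary-structure regions, as the abstract describes, would only require filtering the reference pairs by a per-residue mask before computing the ratio.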

7.
The residue pair preference profile (R3P) method is an inverse folding method that combines environmental profiles and pair preference profiles. The method uses statistical preferences for residue pairs which score the likelihood of finding a profiled residue to be paired with a residue within its local environment. All pairs are characterized by their dihedral angles, secondary structure and number of neighboring residues as a function of residue type. Each residue pair preference is expressed for all 20 amino acids of the profiled residue and is weighted by the compatibility of the environment residue with its own local environment. The R3P method produces an initial profile-sequence alignment which is then refined by converting the initial profile into a profile of a target sequence threaded into the structure of the initial profile. We have tested this method by evaluating alignments of sequences with known 3-D structures using structural superposition alignments as reference. R3P-sequence alignments are 50% correct on average for sequences whose 3-D structure pairs superimpose with an r.m.s. deviation of 1.97 Å. The average improvement in correctness during this iterative refinement is 14%. The R3P-sequence alignments are compared with sequence-sequence and 3-D profile-sequence alignments. When all three methods are combined, on average 50% of the alignments are correct for pairs of 3-D structures that superimpose within 2.12 Å. A 3-D model of HisA is predicted with the combined method.

8.
Hydrophobic cluster analysis (HCA) is a protein sequence comparison method based on α-helical representations of the sequences where the size, shape and orientation of the clusters of hydrophobic residues are primarily compared. The effectiveness of HCA has been suggested to originate from its potential ability to focus on the residues forming the hydrophobic core of globular proteins. We have addressed the robustness of the bidimensional representation used for HCA in its ability to detect the regular secondary structure elements of proteins. Various parameters have been studied such as those governing cluster size and limits, the hydrophobic residues constituting the clusters as well as the potential shift of the cluster positions with respect to the position of the regular secondary structure elements. The following results have been found to support the α-helical bidimensional representation used in HCA: (i) there is a positive correlation (clearly above background noise) between the hydrophobic clusters and the regular secondary structure elements in proteins; (ii) the hydrophobic clusters are centred on the regular secondary structure elements; (iii) the pitch of the helical representation which gives the best correspondence is that of an α-helix. The correspondence between hydrophobic clusters and regular secondary structure elements suggests a way to implement variable gap penalties during the automatic alignment of protein sequences.

9.
A 16 kDa protein has been isolated in a homogeneous form as the major component of a paracrystalline paired membrane structure closely resembling the gap junction. The primary structure of this protein from arthropod and vertebrate species has been determined by protein and cDNA sequencing. The amino acid sequences are highly conserved and virtually identical to the amino acid sequence of the proteolipid subunit of the vacuolar H+-ATPases. The disposition of the protein in the membrane has been studied using proteases and the N,N'-dicyclohexylcarbodiimide reactive site identified. These data, together with secondary structure predictions, suggest that the 16 kDa protein is for the most part buried in the membrane, arranged in a bundle of four hydrophobic α-helices. Using computer graphics, a model has been constructed based on this arrangement and on the electron microscopic images of the paracrystalline arrays.

10.
The alignment of Escherichia coli citrate synthase to pig heart citrate synthase and the multiple alignment of the known sequences of the citrate synthase family of enzymes have been performed using six different amino acid similarity scoring matrices and a large range of gap penalty ratios for insertions and deletions of amino acids. The alignment studies have been performed as the first step in a project aimed at homology modelling E. coli citrate synthase (a hexamer) from pig heart citrate synthase (a dimer) in a molecular modelling approach to the study of multi-subunit enzymes. The effects of several important variables in producing realistic alignments have been investigated. The difference between multiple alignment of the family of enzymes versus simple pairwise alignment of the pig heart and E. coli proteins was explored. The effects of initial separate multiple alignments of the most highly related or most homologous species of the family of enzymes upon a subsequent pairwise alignment between species were evaluated. The value of ‘fingerprinting’ certain residues to bias the alignment in favour of matching those residues, as well as the worth of the computerized approach compared to an intuitive alignment technique, were assessed.

11.
We have recently reported the first complete amino acid sequence of an iron-containing superoxide dismutase. The iron enzyme is thought to be closely homologous to the manganese-containing superoxide dismutases. The availability of complete amino acid sequence information for four manganese superoxide dismutases and the crystal structures for two iron and two manganese superoxide dismutases prompted us to investigate the degree of homology between the two proteins at various levels. We report that it is not possible to clearly distinguish the two proteins on the basis of their secondary or tertiary structures. It would appear that a small number of single site substitutions are responsible for conferring distinguishing properties between the two proteins. Substitution of glycine 77 and glutamine 154 by a glutamine and an alanine respectively in Photobacterium leiognathi iron superoxide dismutase may distinguish the kinetic and other particular properties of this protein from the manganese protein (and other iron superoxide dismutases). Furthermore, the primary structure of both the iron and manganese proteins does not appear to have any homology with any other known amino acid sequence.

12.
A new approach has been developed to reduce multiple protein structures obtained from NMR structure analysis to a smaller number of representative structures which still reflect the structural diversity of the data sets. The method, based on the clustering of similar structures, has been tested in the homology model building of the structure of Sox-5, a sequence-specific DNA-binding protein belonging to the high mobility group (HMG) nuclear proteins family. Sox (SRY box) genes are the autosomal genes related to the sex-determining SRY, Y chromosomal gene. The Sox-5 protein, encoded by one of the SRY-related genes, displays a 29% sequence identity with the HMG1 B-box domain whose structure, determined previously by NMR, has been used in our study to predict the structure of Sox-5. Two independent ensembles of HMG1 structures, each represented by closely related coordinate sets, were used. Nine representative structures for HMG1 were subsequently selected as starting points for the modelling of Sox-5. The model of the protein shows close similarity to the HMG1 fold, with differences at the secondary structure level located mainly in α-helices 1 and 3. A left-handed, three residue per turn polyproline II helix, forming a conserved polyproline II/α-helix supersecondary motif, was identified in the N-terminal region of Sox-5 and other HMG boxes.

13.
Twilight zone of protein sequence alignments (cited by 8: 0 self-citations, 8 by others)
Sequence alignments unambiguously distinguish between protein pairs of similar and non-similar structure when the pairwise sequence identity is high (>40% for long alignments). The signal gets blurred in the twilight zone of 20–35% sequence identity. Here, more than a million sequence alignments were analysed between protein pairs of known structures to re-define a line distinguishing between true and false positives for low levels of similarity. Four results stood out. (i) The transition from the safe zone of sequence alignment into the twilight zone is described by an explosion of false negatives. More than 95% of all pairs detected in the twilight zone had different structures. More precisely, above a cut-off roughly corresponding to 30% sequence identity, 90% of the pairs were homologous; below 25% less than 10% were. (ii) Whether or not sequence homology implied structural identity depended crucially on the alignment length. For example, if 10 residues were similar in an alignment of length 16 (>60%), structural similarity could not be inferred. (iii) The ‘more similar than identical’ rule (discarding all pairs for which percentage similarity was lower than percentage identity) reduced false positives significantly. (iv) Using intermediate sequences for finding links between more distant families was almost as successful: pairs were predicted to be homologous when the respective sequence families had proteins in common. All findings are applicable to automatic database searches.
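The percentages in this abstract depend on both the number of matching residues and the alignment length. The sketch below shows one common way to compute pairwise identity, assuming two gapped strings of equal length and counting only columns where both sequences have a residue; the abstract does not state which denominator the paper uses, so this definition and the name `percent_identity` are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned (non-gap) columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences contribute a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# A short alignment of length 16 with 10 identical residues: the percentage
# is high, yet the alignment is too short to infer structural similarity.
print(percent_identity("ACDEFGHIKLMNPQRS", "ACDEFGHIKLWWWWWW"))  # 62.5
```

The example mirrors point (ii): a value above 60% on 16 columns carries far less evidence than the same value on a long alignment, which is why a length-dependent threshold is needed.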

14.
In the tobamovirus coat protein family, amino acid residues at some spatially close positions are found to be substituted in a coordinated manner [Altschuh et al. (1987) J. Mol. Biol., 193, 693]. Therefore, these positions show an identical pattern of amino acid substitutions when amino acid sequences of these homologous proteins are aligned. Based on this principle, coordinated substitutions have been searched for in three additional protein families: serine proteases, cysteine proteases and the haemoglobins. Coordinated changes have been found in all three protein families mostly within structurally constrained regions. This method works with a varying degree of success depending on the function of the proteins, the range of sequence similarities and the number of sequences considered. By relaxing the criteria for residue selection, the method was adapted to cover a broader range of protein families and to study regions of the proteins having weaker structural constraints. The information derived by these methods provides a general guide for engineering of a large variety of proteins to analyse structure–function relationships.
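The idea of positions sharing "an identical pattern of amino acid substitutions" can be sketched as follows. This is a simplified illustration, not the published procedure: it assumes an ungapped multiple alignment given as equal-length strings, treats two columns as coordinated when they partition the sequences into the same groups, and ignores spatial proximity entirely. The function names are hypothetical.

```python
from itertools import combinations


def substitution_pattern(column: str) -> tuple[int, ...]:
    """Relabel residues by order of first appearance, e.g. 'AAGG' -> (0, 0, 1, 1)."""
    labels: dict[str, int] = {}
    return tuple(labels.setdefault(residue, len(labels)) for residue in column)


def coordinated_positions(alignment: list[str]) -> list[tuple[int, int]]:
    """Return 0-based column pairs that vary and share the same substitution pattern."""
    length = len(alignment[0])
    columns = ["".join(seq[i] for seq in alignment) for i in range(length)]
    patterns = [substitution_pattern(col) for col in columns]
    pairs = []
    for i, j in combinations(range(length), 2):
        # Fully conserved columns carry no substitution signal, so skip them.
        if len(set(columns[i])) > 1 and patterns[i] == patterns[j]:
            pairs.append((i, j))
    return pairs


# Toy alignment: columns 0, 1 and 3 change together; column 2 changes alone.
toy_alignment = ["AKLD", "AKLD", "GRLE", "GRIE"]
print(coordinated_positions(toy_alignment))  # [(0, 1), (0, 3), (1, 3)]
```

Relaxing the criteria, as the abstract mentions, could correspond to tolerating a small number of mismatches between patterns instead of requiring exact equality.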

15.
A general protein sequence alignment methodology for detecting a priori unknown common structural and functional regions is described. The method proposed in this paper is based on two basic requirements for a meaningful alignment. First, each sequence or segment of a sequence is characterized by a multivariate physicochemical profile. Second, the alignment is performed by considering all the sequences simultaneously, and the algorithm detects those regions that form a set of similar profiles. In order to test the structural meaning of the alignment obtained from the sequences, quantitative comparisons are performed with structurally conserved regions (SCR) determined from the X-ray structures of three serine proteases. Results suggest that the limits of the SCR may be predicted from the similarities between the physicochemical profiles of the sequences. The procedures are not completely automated. The final step requires a visual screening of alternative pathways in order to determine an optimal alignment.

16.
The instabilities of the native structures of mutant proteins with an amino acid exchange are estimated by using the contact energy and the number of contacts for each type of amino acid pair, which were estimated from 18 192 residue–residue contacts observed in 42 crystals of globular proteins. They were then used to evaluate a transition probability matrix of codon substitutions and a log relatedness odds matrix, which is used as a scoring matrix to measure the similarity between protein sequences. To consider amino acid substitutions in homologous proteins, base mutation rates and the effects of the genetic code are also taken into account. The average fitness of an amino acid exchange is approximated to be proportional to the structural stability of the mutant protein, which is then approximated by the average energy change of the protein native structure expected for the amino acid exchange with neglect of the energy change of the denatured state. In global and local homology searches, this scoring matrix tends to yield significantly higher alignment scores than either the unitary matrix or the genetic code matrix, and also may yield higher alignment scores for distantly related protein pairs than MDM78. One of the advantages of this scoring matrix is that the equilibrium frequencies of codons and also base mutation rates can be adjusted.
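The step from a transition probability matrix to a log relatedness odds matrix can be sketched in the Dayhoff style, where the score for a pair is the logarithm of the transition probability divided by the equilibrium frequency of the target residue. The sketch below uses a toy two-letter alphabet with made-up numbers; the 10 × log10 scaling and the name `log_odds_matrix` are assumptions for illustration and are not taken from the paper, which works with codon substitutions.

```python
import math


def log_odds_matrix(
    transition: dict[str, dict[str, float]],
    frequency: dict[str, float],
    scale: float = 10.0,
) -> dict[str, dict[str, float]]:
    """Return scale * log10(P(i -> j) / f(j)) for every residue pair (i, j)."""
    return {
        i: {j: scale * math.log10(p_ij / frequency[j]) for j, p_ij in row.items()}
        for i, row in transition.items()
    }


# Toy two-letter alphabet (illustrative numbers only, not real data).
frequency = {"A": 0.5, "B": 0.5}
transition = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}}

scores = log_odds_matrix(transition, frequency)
print(round(scores["A"]["A"], 2))  # 2.55
print(round(scores["A"]["B"], 2))  # -6.99
```

Positive scores mark pairs that occur more often than expected by chance, negative scores mark pairs that occur less often. Changing `frequency` or `transition` changes the scores directly, which is the kind of adjustability the abstract highlights.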

17.
In order to express uteroglobin in Escherichia coli we have constructed a DNA coding for complete mature rabbit uteroglobin by fusing genomic sequences from the second exon of the gene to an incomplete cDNA. This DNA was inserted into various positions of the polylinker cloning region of pDS expression vectors and the uteroglobin gene was expressed in E. coli by IPTG induction. Four different uteroglobin-derived proteins were produced containing 1, 3, 5 and 7 more N-terminal amino acids than the naturally occurring mature protein. The yield of soluble protein strongly increased with increasing length of the N-terminal additions. Protein and RNA analysis showed that this variation is most likely due to progressively higher translation efficiencies of the larger recombinants. UG7, the most efficiently synthesized recombinant protein, carrying seven additional N-terminal amino acids, was purified and further characterized. Like natural uteroglobin, UG7 forms a dimer and binds progesterone with an affinity indistinguishable from that of the natural protein. This bacterially produced protein can be used for detailed structure–function investigations of uteroglobin.

18.
The thermostability of DNA-binding protein HU from bacilli (cited by 3: 0 self-citations, 3 by others)
The primary and tertiary structures of DNA-binding protein HU from Bacillus stearothermophilus are already known. The primary structure has been previously determined for HU from the closely related B. globigii and the determinations of the sequences from B. caldolyticus and B. subtilis are described here. These bacteria have optimum growth temperatures of >70°C (B. caldolyticus), 65°C (B. stearothermophilus), 37°C (B. subtilis) and 30°C (B. globigii). In vitro measurements from circular dichroic spectra described here give Tm values reflecting these growth temperatures, of 68, 64, 43 and 41°C respectively. We discuss here the relative thermostability of the four proteins in terms of the amino acid differences between the sequences and the three-dimensional model of the B. stearothermophilus HU. The current model for the interaction of the protein with DNA is only discussed in terms of its relevance with regard to thermostability.

19.
A model of the lignin peroxidase LIII of Phlebia radiata was constructed on the basis of the structure of cytochrome c peroxidase (CCP). Because of the low percentage of amino acid identity between the CCP and the lignin peroxidase LIII of Phlebia radiata, alignment of the sequences was based on the generation of a template from a knowledge of the 3-D structure of CCP and consensus sequences of lignin peroxidases. This approach gave an alignment in which all the insertions in the lignin peroxidase were placed at loop regions of CCP, with a 21.1% identity for these two proteins. The model was constructed using this alignment and the computer program COMPOSER, which assembles the model as a series of rigid fragments derived from CCP and other proteins. Manual intervention was required for some of the longer loop regions. The α-helices forming the structural framework, and especially the haem environment of CCP, are conserved in the LIII model and the core is close packed without holes. A possible site of the substrate oxidation at the haem edge of LIII is discussed.

20.
Quantifying the local reliability of a sequence alignment (cited by 4: 0 self-citations, 4 by others)
We present a method for attributing a measure of reliability to a residue pair in an optimal alignment of two protein sequences. Validation based on a database of structurally correct alignments [Pascarella and Argos (1992) Protein Engng, 5, 121–137] shows that correctly aligned parts of a sequence alignment systematically receive high scores in this measure. The higher the sequence similarity between two sequences, the larger is the fraction found of the correct parts of the alignment. We used these observations to design a program that draws a reliability curve along an optimal alignment reflecting the chances for each residue pair to be aligned correctly.
