A data bank merging related protein structures and sequences |
| |
Authors: | Pascarella, Stefano; Argos, Patrick |
| |
Affiliation: | 1 European Molecular Biology Laboratory, Meyerhofstrasse 1, Postfach 10 22 09, D-6900 Heidelberg, Germany; 2 Dipartimento di Scienze Biochimiche and Centro di Biologia Molecolare del Consiglio Nazionale delle Ricerche, Università La Sapienza, I-00185 Roma, Italy |
| |
Abstract: | A data collection which merges protein structural and sequence information is described. Structural superpositions amongst proteins with similar main-chain fold were performed or collected from the literature. Sequences taken from the protein primary structure databases were associated with the multiple structural alignments providing they were at least 50% homologous in residue identity to one of the structural sequences and at least 50% of the structural sequence residues were alignable. Such restrictions allow reasonable confidence that the primary sequences share the conformation of the tertiary structural templates, except in the less conserved loop regions. Multiple structural superpositions were collected for 38 familial groups containing a total of 209 tertiary structures; 45 structures had no superposable mates and were used individually. Other information is also provided as main-chain and side-chain conformational angles, secondary structural assignments and the like. Wedding the primary and tertiary structural data resulted in an 8-fold increase of data bank sequence entries over those associated with the known three-dimensional architectures alone. |
| |
Keywords: | data bank/ protein folding/ protein structure/ sequence alignment/ structure superposition |
This article has been indexed by Oxford and other databases. |
|