Recognition of analogous and homologous protein folds--assessment of prediction success and associated alignment accuracy using empirical substitution matrices |
| |
Authors: | Russell RB; Saqi MA; Bates PA; Sayle RA; Sternberg MJ |
| |
Affiliation: | Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, London, UK. |
| |
Abstract: | Fold recognition methods aim to use the information in the known protein
structures (the targets) to identify that the sequence of a protein of
unknown structure (the probe) will adopt a known fold. This paper
highlights that the structural similarities sought by these methods can be
divided into two types: remote homologues and analogues. Homologues are the
result of divergent evolution and often share a common function. We define
remote homologues as those that are not easily detectable by sequence
comparison methods alone. Analogues do not have a common ancestor and
generally do not have a common function. Several sets of empirical matrices
for residue substitution, secondary structure conservation and residue
accessibility conservation have previously been derived from aligned pairs
of remote homologues and analogues (Russell et al., J. Mol. Biol., 1997,
269, 423-439). Here a method for fold recognition, FOLDFIT, is introduced
that uses these matrices to match the sequences, secondary structures and
residue accessibilities of the probe and target. The approach is evaluated
on distinct datasets of analogous and remotely homologous folds. The
accuracy of FOLDFIT with the different matrices on the two datasets is
contrasted with results from another fold recognition method (THREADER) and
to searches using mutation matrices in the absence of any structural
information. FOLDFIT identifies at top rank 12 out of 18 remotely
homologous folds and five out of nine analogous folds. The average
alignment accuracies for residue and secondary structure equivalencing are
much higher for homologous folds (residue approximately 42%, secondary
structure approximately 78%) than for analogous folds (residue approximately
12%, secondary structure approximately 47%). Sequence searches alone can be successful for several
homologues in the testing sets but nearly always fail for the analogues.
These results suggest that the recognition of analogous and remotely
homologous folds should be assessed separately. This study has implications
for the development and comparative evaluation of fold recognition
algorithms.
|
| |
Keywords: | |
This article has been indexed in Oxford and other databases. |
|