'Multifrequency' location and clustering of sequence patterns from proteins |
| |
Authors: | E Ollivier H Soldano A Viari |
| |
Affiliation: | ABI, Institut Curie Section Physique-Chimie et CTIS, Centre de recherche INRA, Paris, France. |
| |
Abstract: | In previous work, we have shown that a set of characteristics, defined as (code, frequency) pairs, can be derived from a protein family by the use of a signal-processing method. This method enables the location and extraction of sequence patterns by taking into account each (code, frequency) pair individually. In the present paper, we propose to extend this method in order to detect and visualize patterns by taking into account several pairs simultaneously. Two 'multifrequency' methods are described. The first one is based on a rewriting of the sequences with new symbols which summarize the frequency information. The second method is based on a clustering of the patterns associated with each pair. Both methods lead to the definition of significant consensus sequences. Some results obtained with calcium-binding proteins and serine proteases are also discussed. |
| |