Exploring microbial genome sequences to identify protein families on the grid. |
| |
Authors: | Yudong Sun, Anil Wipat, Matthew Pocock, Peter A Lee, Keith Flanagan, James T Worthington |
| |
Affiliation: | Newcastle University, Newcastle upon Tyne, NE1 7RU, UK. yudong.sun@comlab.ox.ac.uk |
| |
Abstract: | The analysis of microbial genome sequences can identify protein families that provide potential drug targets for new antibiotics. With the rapid accumulation of newly sequenced genomes, this analysis has become a computationally intensive and data-intensive problem. This paper describes the development of Microbase, a Web-service-enabled, component-based architecture that supports the large-scale comparative analysis of complete microbial genome sequences and the subsequent identification of orthologues and protein families. The system is coordinated through Web-service-based notifications and integrates distributed computing resources with genomic databases to perform all-against-all comparisons of a large volume of genome sequences and to present the results in a computationally amenable format through a Web service interface. We demonstrate the use of the system in searching for orthologues and candidate protein families, which could ultimately lead to the identification of potential therapeutic targets. |
| |
Keywords: | |
|
|