Sequence clustering strategies improve remote homology recognitions while reducing search times
Authors: Li Weizhong; Jaroszewski Lukasz; Godzik Adam
Affiliation: The Burnham Institute, La Jolla, CA 92037, USA
Abstract: Sequence databases are rapidly growing, thereby increasing the coverage of protein sequence space, but this coverage is uneven because most sequencing efforts have concentrated on a small number of organisms. The resulting granularity of sequence space creates many problems for profile-based sequence comparison programs. In this paper, we suggest several strategies that address these problems, and at the same time speed up the searches for homologous proteins and improve the ability of profile methods to recognize distant homologies. One of our strategies combines database clustering, which removes highly redundant sequence, and a two-step PSI-BLAST (PDB-BLAST), which separates sequence spaces of profile composition and space of homology searching. The combination of these strategies improves distant homology recognitions by more than 100%, while using only 10% of the CPU time of the standard PSI-BLAST search. Another method, intermediate profile searches, allows for the exploration of additional search directions that are normally dominated by large protein sub-families within very diverse families. All methods are evaluated with a large fold-recognition benchmark.
This article is indexed in Oxford and other databases.