A support vector machines approach for virtual screening of active compounds of single and multiple mechanisms from large libraries at an improved hit-rate and enrichment factor
Affiliation: 1. Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543, Singapore; 2. Shanghai Center for Bioinformation Technology, Shanghai 201203, PR China; 3. College of Chemistry, Sichuan University, Chengdu 610064, PR China; 4. Bioinformatics Research Group, School of Life Sciences, Xiamen University, Xiamen 361005, Fujian Province, PR China
Abstract: Support vector machines (SVM) and other machine-learning (ML) methods have been explored as ligand-based virtual screening (VS) tools for facilitating lead discovery. Although they show good hit-selection performance, when screening large compound libraries these methods tend to produce lower hit-rates than the best-performing VS tools, partly because their training sets contain a limited spectrum of inactive compounds. We tested whether the performance of SVM can be improved by using training sets of diverse inactive compounds. In retrospective database screening for active compounds of a single mechanism (HIV protease inhibitors, DHFR inhibitors, dopamine antagonists) and of multiple mechanisms (CNS active agents) in large libraries of 2.986 million compounds, the yields, hit-rates, and enrichment factors of our SVM models are 52.4–78.0%, 4.7–73.8%, and 214–10,543, respectively. By comparison, the corresponding values are 62–95%, 0.65–35%, and 20–1200 for structure-based VS and 55–81%, 0.2–0.7%, and 110–795 for other ligand-based VS tools in screening libraries of ≥1 million compounds. The hit-rates are comparable to, and the enrichment factors substantially better than, the best results of other VS tools. Of the predicted hits, 24.3–87.6% fall outside the known hit families. SVM therefore appears potentially useful for facilitating lead discovery in VS of large compound libraries.
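The abstract reports three screening metrics without defining them. The sketch below shows how they are commonly computed, assuming the usual definitions: yield is the fraction of known actives recovered, hit-rate is the fraction of selected compounds that are active, and enrichment factor is the hit-rate divided by the fraction of actives in the whole library. The class name `ScreeningResult`, its field names, and the example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningResult:
    """Counts from one retrospective virtual screening run (hypothetical names)."""
    library_size: int        # total compounds screened
    actives_in_library: int  # known actives hidden in the library
    selected: int            # compounds the model flagged as hits
    true_hits: int           # flagged compounds that are known actives

    def __post_init__(self) -> None:
        # Reject counts that would make the ratios below meaningless.
        if min(self.library_size, self.actives_in_library, self.selected) <= 0:
            raise ValueError("library_size, actives_in_library and selected must be positive")
        if not 0 <= self.true_hits <= min(self.selected, self.actives_in_library):
            raise ValueError("true_hits must lie between 0 and min(selected, actives_in_library)")
        if self.actives_in_library > self.library_size or self.selected > self.library_size:
            raise ValueError("counts cannot exceed library_size")

    @property
    def yield_fraction(self) -> float:
        """Fraction of known actives that the model recovered."""
        return self.true_hits / self.actives_in_library

    @property
    def hit_rate(self) -> float:
        """Fraction of selected compounds that are known actives."""
        return self.true_hits / self.selected

    @property
    def enrichment_factor(self) -> float:
        """Hit-rate relative to the hit-rate of random selection."""
        random_hit_rate = self.actives_in_library / self.library_size
        return self.hit_rate / random_hit_rate


# Illustrative numbers only; they are not results from the paper.
example = ScreeningResult(
    library_size=1_000_000,
    actives_in_library=100,
    selected=500,
    true_hits=60,
)
print(f"yield={example.yield_fraction:.1%}, "
      f"hit-rate={example.hit_rate:.1%}, "
      f"enrichment factor={example.enrichment_factor:.0f}")
```

Under these definitions the enrichment factor grows as actives become rarer in the library, so a very large enrichment factor can coexist with a modest hit-rate when a multi-million-compound library contains few actives.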
This article is indexed in ScienceDirect and other databases.