Transmembrane segments prediction and understanding using support vector machine and decision tree
Affiliation:1. State Key Laboratory of Fine Chemicals, R&D Center of Membrane Science and Technology, School of Chemical Engineering, Dalian University of Technology, 2# Linggong Road, Dalian, 116012, China;2. School of Textile and Material Engineering, Dalian Polytechnic University, 1# Qinggong Yuan, Dalian, 116034, China;3. Key Laboratory of Biobased Materials, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, 266101, China;1. Mechanical Engineering Department, College of Engineering, Universiti Tenaga Nasional, Jalan IKRAM-UNITEN, 43000 Kajang, Selangor, Malaysia;2. Department of Thermofluids, Faculty of Mechanical Engineering, Universiti Teknologi Malaysia, UTM Skudai, 81310 Johor Bahru, Malaysia;3. Department of Mechanical Engineering, University of Malaya, 50603 Kuala Lumpur, Malaysia;4. Limkokwing University of Creative Technology, Jalan Teknokrat 1/1, 63000 Cyberjaya, Selangor, Malaysia;1. Pharmacokinetics & Drug Metabolism, Amgen Inc., Seattle, Washington, 98119;2. Global Development, Amgen Inc., Thousand Oaks, California, 91320;3. Pharmacokinetics & Drug Metabolism, Amgen Inc., Puzol, Spain;4. Department of Pharmacy and Pharmaceutics, School of Pharmacy, University of Washington, Seattle, Washington, 98195
Abstract: In recent years, many studies have focused on improving the accuracy of transmembrane segment prediction, and significant results have been achieved. Despite these results, existing methods cannot explain how a learning result is reached or why a prediction decision is made. Such explanations are important for the acceptance of machine learning technology in bioinformatics applications such as protein structure prediction. Support vector machines (SVMs) have shown strong generalization ability in a number of application areas, including protein structure prediction, but they are black-box models and hard to understand. Decision trees, on the other hand, provide insightful interpretation but have lower prediction accuracy. In this paper, we present an innovative approach to rule generation for understanding the prediction of transmembrane segments that integrates the merits of both SVMs and decision trees. The approach combines SVMs with decision trees into a new algorithm called SVM_DT. Experiments on transmembrane segment prediction using the 165 low-resolution test data set show that SVM_DT is much more comprehensible than SVMs and that the test accuracy of its rules is also high. Rules with confidence values over 90% have an average prediction accuracy of 93.4%. We also found that the confidence and prediction accuracy values of the rules generated by SVM_DT are quite consistent. We believe that SVM_DT can be used not only for predicting transmembrane segments but also for understanding the predictions. The predictions and their interpretation can be used to guide biological experiments.
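The relationship between rule confidence and prediction accuracy mentioned in the abstract can be illustrated with a small sketch. The abstract does not define "confidence", so this assumes the common definition: the fraction of training samples covered by a rule whose label matches the rule's predicted class. The `Rule` class, the `hydrophobicity` feature, the threshold of 1.5, and all data values are hypothetical and chosen only for illustration; this is not the SVM_DT algorithm itself.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Rule:
    """Hypothetical if-then rule: when `condition` holds, predict `predicted`."""
    name: str
    condition: Callable[[dict], bool]
    predicted: str


def rule_score(rule: Rule, samples: Sequence[dict], labels: Sequence[str]) -> Optional[float]:
    """Fraction of samples covered by the rule whose label equals its prediction.

    On training data this is the (assumed) rule confidence; on held-out data
    it is the rule's prediction accuracy. Returns None if the rule covers nothing.
    """
    covered = [y for x, y in zip(samples, labels) if rule.condition(x)]
    if not covered:
        return None
    return sum(y == rule.predicted for y in covered) / len(covered)


# Toy, made-up data: one feature per sequence window (illustrative only).
train_x = [{"hydrophobicity": v} for v in (2.1, 1.8, 1.6, 0.2, -0.5, 1.7)]
train_y = ["TM", "TM", "TM", "non-TM", "non-TM", "non-TM"]
test_x = [{"hydrophobicity": v} for v in (0.1, 1.0, 1.9, -1.2)]
test_y = ["non-TM", "non-TM", "TM", "non-TM"]

rules = [
    Rule("high_hydrophobicity", lambda x: x["hydrophobicity"] > 1.5, "TM"),
    Rule("low_hydrophobicity", lambda x: x["hydrophobicity"] <= 1.5, "non-TM"),
]

# Confidence is measured on training data, accuracy on held-out data.
confidence = {r.name: rule_score(r, train_x, train_y) for r in rules}
accuracy = {r.name: rule_score(r, test_x, test_y) for r in rules}

# Keep only rules whose confidence exceeds 90%, mirroring the abstract's cutoff.
selected = [r.name for r in rules if confidence[r.name] is not None and confidence[r.name] > 0.9]

for name in selected:
    print(f"{name}: confidence={confidence[name]:.2f}, test accuracy={accuracy[name]:.2f}")
```

Comparing `confidence` and `accuracy` rule by rule is one simple way to check whether the two values are consistent, in the sense the abstract describes.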
This article is indexed in ScienceDirect and other databases.