Rule generation for protein secondary structure prediction with support vector machines and decision tree
Authors: He Jieyue, Hu Hae-Jin, Harrison Robert, Tai Phang C, Pan Yi
Affiliation: Dept. of Comput. Sci. & Eng., Nanjing Univ., China;
Abstract: Support vector machines (SVMs) have shown strong generalization ability in a number of application areas, including protein structure prediction. However, their poor comprehensibility hinders the adoption of SVMs for protein structure prediction. An explanation of how a decision is made is important for the acceptance of machine learning technology, especially in applications such as bioinformatics. A reasonable interpretation is useful for guiding "wet experiments," and the extracted rules also help integrate computational intelligence with symbolic AI systems for advanced deduction. A decision tree, by contrast, has good comprehensibility. This paper presents a novel approach to rule generation for protein secondary structure prediction that integrates the merits of both the SVM and the decision tree. The approach combines the SVM with a decision tree in a new algorithm called SVM_DT, which proceeds in three steps. First, an SVM is trained. Then, a new training set is generated through careful selection from the output of the SVM. Finally, the resulting training set is used to train a decision tree learning system and to extract the corresponding rule sets. Experiments on protein secondary structure prediction with the RS126 data set show that the comprehensibility of SVM_DT is much better than that of the SVM. Moreover, the generalization ability of SVM_DT is better than that of C4.5 decision trees and is similar to that of the SVM. Hence, SVM_DT can be used not only for prediction, but also for guiding biological experiments.
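The three steps named in the abstract can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the SVM is a linear one trained by batch subgradient descent, the "careful selection" is approximated by keeping only the samples whose label the SVM reproduces, the tree is a small Gini-based learner rather than C4.5, and the toy 2-D points stand in for encoded sequence windows rather than RS126 data. All function names are hypothetical.

```python
from collections import Counter

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=100):
    """Batch subgradient descent on the regularized hinge loss (labels +1/-1)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Samples inside the margin (or misclassified) drive the update.
        viol = [i for i in range(n)
                if y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b) < 1]
        for j in range(d):
            grad_j = sum(y[i] * X[i][j] for i in viol) / n
            w[j] = (1 - eta * lam) * w[j] + eta * grad_j
        b += eta * sum(y[i] for i in viol) / n
    return w, b

def svm_predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y):
    """Greedy binary tree on axis-aligned thresholds; a leaf is a class label."""
    if len(set(y)) == 1:
        return y[0]
    best = None
    for j in range(len(X[0])):
        vals = sorted(set(x[j] for x in X))
        for lo, hi in zip(vals, vals[1:]):
            thr = (lo + hi) / 2
            left = [i for i in range(len(X)) if X[i][j] <= thr]
            right = [i for i in range(len(X)) if X[i][j] > thr]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, j, thr, left, right)
    if best is None:  # identical points with mixed labels: use the majority
        return Counter(y).most_common(1)[0][0]
    _, j, thr, left, right = best
    return (j, thr,
            build_tree([X[i] for i in left], [y[i] for i in left]),
            build_tree([X[i] for i in right], [y[i] for i in right]))

def extract_rules(node, conds=()):
    """Turn every root-to-leaf path into one IF ... THEN rule."""
    if not isinstance(node, tuple):
        head = "IF " + " AND ".join(conds) if conds else "ALWAYS"
        return [head + f" THEN class {node:+d}"]
    j, thr, left, right = node
    return (extract_rules(left, conds + (f"x{j} <= {thr}",))
            + extract_rules(right, conds + (f"x{j} > {thr}",)))

def svm_dt(X, y):
    # Step 1: train the SVM.
    w, b = train_linear_svm(X, y)
    # Step 2 (assumed criterion): keep samples the SVM classifies correctly.
    keep = [i for i in range(len(X)) if svm_predict(w, b, X[i]) == y[i]]
    Xs, ys = [X[i] for i in keep], [y[i] for i in keep]
    # Step 3: train the tree on the selected set and read off the rules.
    tree = build_tree(Xs, ys)
    return tree, extract_rules(tree), (Xs, ys)

# Toy data: two clean clusters plus two deliberately mislabelled points.
X = [(2, 2), (3, 2), (2, 3), (3, 3), (-2, -2), (-3, -2), (-2, -3), (-3, -3),
     (1, 1), (-1, -1)]
y = [1, 1, 1, 1, -1, -1, -1, -1, -1, 1]

tree, rules, (Xs, ys) = svm_dt(X, y)
for rule in rules:
    print(rule)
```

In this toy run the two mislabelled points are dropped in the selection step, so the tree is fitted to a cleaner set and yields a short rule list. That is the intuition behind training the tree on SVM-filtered data rather than on the raw training set.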
This article is indexed in PubMed and other databases.