Classification of proteins multiple-labelled and single-labelled with protein functional classes
Authors: Mary Q Yang, Okan K Ersoy
Affiliation: 1. United States Department of Health and Human Services, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20852, USA; 2. School of Electrical and Computer Engineering, Purdue University, W. Lafayette, IN, 47907, USA; 3. School of Electrical and Computer Engineering, Purdue University, W. Lafayette, IN, 47907, USA
Abstract: Advances in high-throughput genome sequencing technology have led to an explosion in the amount of available sequence data. Determining protein function by experimental techniques is time-consuming and expensive; machine-learning techniques that assess protein function rapidly may help streamline this process. The problem of assigning functional classes to proteins is complicated by the fact that a single protein can participate in several different pathways and thus can have multiple functions. We have developed a tree-based classifier that is capable of handling multiple-labelled data and of giving insight into the multi-functional nature of proteins. We call the resulting tree a recursive maximum contrast tree (RMCT) and the resulting classifier a multiple-labelled instance classifier (MLIC). We investigate the synergy of machine-learning-based ensemble methods and physicochemical feature augmentation. We test our algorithm on protein phylogenetic profiles generated from 60 completely sequenced genomes, and we compare our results with those achieved by algorithms such as support vector machines and decision trees.
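To make the data setting concrete, the following is a minimal sketch of how multiple-labelled phylogenetic-profile data might be represented. It is not the RMCT or MLIC algorithm, which the abstract does not specify. It assumes a binary presence/absence profile over the 60 genomes (the actual profiles may be real-valued), and the protein names, functional class names, and helper functions are hypothetical.

```python
from typing import Dict, List, Set

N_GENOMES = 60  # the abstract mentions 60 completely sequenced genomes


def phylogenetic_profile(present_in: Set[int], n_genomes: int = N_GENOMES) -> List[int]:
    """Binary vector: 1 if a homolog is assumed present in genome i, else 0."""
    return [1 if i in present_in else 0 for i in range(n_genomes)]


def encode_labels(labels: Set[str], classes: List[str]) -> List[int]:
    """Multi-label indicator vector with one slot per functional class."""
    return [1 if c in labels else 0 for c in classes]


# Hypothetical toy data: class names and proteins are illustrative only.
classes = ["metabolism", "transport", "transcription"]
proteins = {
    "protA": {"genomes": {0, 3, 59}, "labels": {"metabolism"}},
    "protB": {"genomes": {1, 3}, "labels": {"metabolism", "transport"}},
}

# Feature vectors (profiles) and label vectors, keyed by protein name.
X: Dict[str, List[int]] = {n: phylogenetic_profile(p["genomes"]) for n, p in proteins.items()}
Y: Dict[str, List[int]] = {n: encode_labels(p["labels"], classes) for n, p in proteins.items()}

# Proteins carrying more than one functional class are the multiple-labelled instances.
multi_labelled = [n for n, y in Y.items() if sum(y) > 1]
```

In this representation a single-labelled protein has exactly one nonzero entry in its label vector, while a multiple-labelled protein has several, which is the case a conventional single-output classifier cannot express directly.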
Keywords: Computational intelligence; Machine learning; Classification; Multifunctional proteins; Bioinformatics