Innovative genetic algorithms for chemoinformatics
Authors:B. K. Lavine   C. E. Davidson  A. J. Moores
Affiliation:Department of Chemistry, Clarkson University, Box 5810, Potsdam, NY 13699-5810, USA

Abstract:In this paper, we report on the development of a genetic algorithm (GA) for pattern recognition analysis of multivariate chemical data. The GA identifies feature subsets that optimize the separation of the classes in a plot of the two or three largest principal components of the data. Because principal components maximize variance, the bulk of the information encoded by the selected features is about differences between classes in the data set. The principal component (PC) plot functions as an embedded information filter. Sets of features are selected based on their PC plots, with a good PC plot generated by features whose variance or information is primarily about differences between classes in the data set. This limits the GA to searching for these types of feature subsets, significantly reducing the size of the search space. In addition, the pattern recognition GA focuses on those classes and/or samples that are difficult to classify by boosting their weights over successive generations, using a perceptron to learn the class and sample weights. Samples that consistently classify correctly are not weighted as heavily in the analysis as samples that are difficult to classify. The pattern recognition GA integrates aspects of artificial intelligence and evolutionary computation to yield a “smart” one-pass procedure for feature selection. The efficacy and efficiency of the pattern recognition GA are demonstrated via problems from chemical communication and environmental analysis.
Keywords:Genetic algorithms   Chemoinformatics   Perceptron
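The idea in the abstract of scoring a feature subset by its PC plot and boosting hard samples can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the fitness function or the perceptron update, so `subset_fitness` uses a hypothetical nearest-neighbor rule in PC space, `boost_weights` is a simple additive stand-in for the perceptron, and the synthetic data, function names, and parameters (`k`, `rate`) are invented here.

```python
import numpy as np

def pc_scores(X, n_components=2):
    """Project mean-centered data onto its largest principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def subset_fitness(X, y, mask, sample_weights, k=3):
    """Weighted share of same-class neighbors in the PC plot.

    Hypothetical scoring rule (not taken from the paper): for each sample,
    the fraction of its k nearest neighbors in PC space with the same class.
    """
    scores = pc_scores(X[:, mask])
    d = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a sample is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest samples
    hits = (y[nn] == y[:, None]).mean(axis=1)
    fitness = float(np.sum(sample_weights * hits) / np.sum(sample_weights))
    return fitness, hits

def boost_weights(sample_weights, hits, rate=0.5):
    """Raise the weights of poorly classified samples, then renormalize.

    Simplified additive stand-in for the perceptron mentioned in the abstract.
    """
    w = sample_weights + rate * (1.0 - hits)
    return w / w.sum()

# Synthetic data (assumed): features 0-1 separate the classes, 2-5 are noise.
rng = np.random.default_rng(0)
n = 20
y = np.repeat([0, 1], n)
informative = rng.normal(0.0, 0.5, size=(2 * n, 2)) + 10.0 * y[:, None]
noise = rng.normal(0.0, 5.0, size=(2 * n, 4))
X = np.hstack([informative, noise])
weights = np.full(2 * n, 1.0 / (2 * n))

good_mask = np.array([True, True, False, False, False, False])
noise_mask = ~good_mask
fit_good, hits_good = subset_fitness(X, y, good_mask, weights)
fit_noise, hits_noise = subset_fitness(X, y, noise_mask, weights)
new_weights = boost_weights(weights, hits_noise)
```

In a full GA, `mask` would be the chromosome, `subset_fitness` would rank the population in each generation, and the updated weights would carry over to the next generation so that hard-to-classify samples count for more.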
This article is indexed in ScienceDirect and other databases.