1.
This paper proposes a method called layered genetic programming (LAGEP) to construct a classifier based on multi-population genetic programming (MGP). LAGEP employs a layer architecture to arrange multiple populations: a layer is composed of a number of populations, and the results of its populations are discriminant functions. These functions transform the training set to construct a new training set, and the successive layer uses the new training set to obtain better discriminant functions. Moreover, because the functions generated by each layer are composed into one long discriminant function, which is the result of LAGEP, every layer can evolve with short individuals. For each population, we propose an adaptive mutation rate tuning method that increases the mutation rate based on fitness values and remaining generations. Several experiments are conducted with different settings of LAGEP and several real-world medical problems. The results show that LAGEP achieves accuracy comparable to single-population GP in much less time.
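As a rough illustration of the adaptive mutation-rate idea (the abstract gives no formula), the Python sketch below raises the rate as the run progresses and when the best fitness stagnates; `base_rate`, `max_rate`, the five-generation stagnation window and the 1.5 boost factor are illustrative assumptions, not LAGEP's actual schedule.

```python
# Hedged sketch: a mutation rate that grows as fewer generations remain and
# when fitness stalls. The constants and the stagnation test are assumptions.
def adaptive_mutation_rate(best_fitness_history, generation, max_generations,
                           base_rate=0.05, max_rate=0.5):
    """Return a mutation rate in [base_rate, max_rate]."""
    progress = generation / max_generations          # fraction of the run already spent
    window = best_fitness_history[-5:]               # last few best-fitness values
    stagnating = len(window) == 5 and max(window) <= window[0]
    rate = base_rate + (max_rate - base_rate) * progress
    if stagnating:
        rate = min(max_rate, rate * 1.5)             # extra push when no improvement
    return rate

history = [0.61, 0.63, 0.63, 0.63, 0.63, 0.63]       # best fitness per generation so far
print(adaptive_mutation_rate(history, generation=30, max_generations=100))
```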
2.
Mu-Yen Chen, Kuang-Ku Chen, Heien-Kun Chiang, Hwa-Shan Huang, Mu-Jung Huang 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2007,11(12):1173-1183
As a broad subfield of artificial intelligence, machine learning is concerned with the development of algorithms and techniques that allow computers to learn. Methods such as fuzzy logic, neural networks, support vector machines, decision trees and Bayesian learning have been applied to learn meaningful rules; however, a common drawback of these methods is that they often get trapped in a local optimum. In contrast, a genetic algorithm (GA) promises better results by virtue of its natural evolution and global search. GA has given rise to two fields of research where global optimization is of crucial importance: genetics-based machine learning (GBML) and genetic programming (GP). This article adopts the GBML technique to provide a three-phase knowledge extraction methodology that supports continuous and instant learning while integrating multiple rule sets into a centralized knowledge base. Both the proposed system and GP are applied to theoretical and empirical experiments, and the results of the two approaches are presented and compared. The paper makes two important contributions: (1) it uses three criteria (accuracy, coverage, and fitness) in the knowledge extraction process, which is very effective in selecting an optimal set of rules from a large population; (2) the experiments show that the rule sets derived by the proposed approach are more accurate than those produced by GP.
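A minimal sketch of rule selection driven by the three stated criteria (accuracy, coverage, fitness). The threshold rule representation, the cut-offs `min_acc` and `min_cov`, and the way the criteria are combined are assumptions for illustration; the abstract only states that these criteria drive the extraction process.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hedged sketch: keep rules that clear accuracy and coverage thresholds,
# then rank them by the fitness assigned during evolution. The simple
# single-threshold rule below is an illustrative stand-in.
@dataclass
class Rule:
    feature: int        # attribute the rule tests
    threshold: float    # rule fires when x[feature] <= threshold
    label: int          # class predicted when the rule fires
    fitness: float      # fitness assigned by the evolutionary process

def accuracy(rule: Rule, data: List[Tuple[List[float], int]]) -> float:
    matched = [(x, y) for x, y in data if x[rule.feature] <= rule.threshold]
    return sum(y == rule.label for _, y in matched) / len(matched) if matched else 0.0

def coverage(rule: Rule, data: List[Tuple[List[float], int]]) -> float:
    return sum(x[rule.feature] <= rule.threshold for x, _ in data) / len(data)

def extract(population: List[Rule], data, min_acc=0.8, min_cov=0.05) -> List[Rule]:
    """Keep rules passing both thresholds, ranked by evolved fitness."""
    kept = [r for r in population
            if accuracy(r, data) >= min_acc and coverage(r, data) >= min_cov]
    return sorted(kept, key=lambda r: r.fitness, reverse=True)

data = [([0.2, 1.0], 0), ([0.4, 0.8], 0), ([0.9, 0.1], 1), ([0.7, 0.3], 1)]
population = [Rule(0, 0.5, 0, 0.9), Rule(1, 0.5, 1, 0.7), Rule(0, 0.95, 0, 0.4)]
print(extract(population, data))
```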
3.
Richard J. Preen, Larry Bull 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2014,18(1):153-167
A number of representation schemes have been presented for use within learning classifier systems, ranging from binary encodings to neural networks. This paper presents results from an investigation into using discrete and fuzzy dynamical system representations within the XCSF learning classifier system. In particular, asynchronous random Boolean networks are used to represent the traditional condition-action production system rules in the discrete case, and asynchronous fuzzy logic networks in the continuous-valued case. It is shown to be possible to use self-adaptive, open-ended evolution to design an ensemble of such dynamical systems within XCSF to solve a number of well-known test problems.
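For readers unfamiliar with the representation, the sketch below shows an asynchronous random Boolean network update in isolation; how XCSF maps classifier conditions and actions onto network nodes is not reproduced here, and the network size, connectivity K = 2 and number of passes are assumptions.

```python
import random

# Hedged sketch: nodes of a random Boolean network are updated one at a time
# in random order (asynchronous update) rather than all at once.
def random_rbn(n_nodes=8, k=2, seed=0):
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_nodes), k) for _ in range(n_nodes)]
    # Each node gets a random Boolean function stored as a truth table over 2**k inputs.
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n_nodes)]
    return inputs, tables

def step_async(state, inputs, tables, rng):
    """Update every node once, in a random order."""
    for node in rng.sample(range(len(state)), len(state)):
        idx = sum(state[src] << i for i, src in enumerate(inputs[node]))
        state[node] = tables[node][idx]
    return state

rng = random.Random(1)
inputs, tables = random_rbn()
state = [rng.randint(0, 1) for _ in range(8)]
for _ in range(5):                       # run a few asynchronous passes
    state = step_async(state, inputs, tables, rng)
print(state)
```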
4.
5.
Estimation of classifier performance
Fukunaga, K., Hayes, R.R. 《IEEE transactions on pattern analysis and machine intelligence》1989,11(10):1087-1101
An expression for expected classifier performance previously derived by the authors (ibid., vol.11, no.8, p.873-855, Aug. 1989) is applied to a variety of error estimation methods, and a unified and comprehensive approach to the analysis of classifier performance is presented. After the error expression is introduced, it is applied to three cases: (1) a given classifier and a finite test set; (2) given test distributions and a finite design set; and (3) finite and independent design and test sets. For all cases, the expected values and variances of the classifier errors are presented. Although the study of Case 1 does not produce any new results, it is important to confirm that the proposed approach reproduces the known results, and also to show how these results are modified when the design set becomes finite, as in Cases 2 and 3. The error expression is used to compute the bias between the leave-one-out and resubstitution errors for quadratic classifiers. The effect of outliers in design samples on the classification error is discussed. Finally, a theoretical analysis of the bootstrap method is presented for quadratic classifiers.
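The paper's contribution is an analytic error expression; as a purely numerical companion, the sketch below estimates the resubstitution versus leave-one-out gap for a quadratic classifier on synthetic Gaussian data (sample sizes and class means are arbitrary assumptions).

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Hedged numerical sketch (not the paper's analytic bias expression): compare the
# optimistic resubstitution error with the leave-one-out error for a quadratic
# classifier on synthetic two-class Gaussian data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(1.5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = QuadraticDiscriminantAnalysis().fit(X, y)
resub_error = 1 - clf.score(X, y)                          # resubstitution (optimistic)
loo_error = 1 - cross_val_score(QuadraticDiscriminantAnalysis(), X, y,
                                cv=LeaveOneOut()).mean()   # leave-one-out (nearly unbiased)
print(f"resubstitution={resub_error:.3f}  leave-one-out={loo_error:.3f}  "
      f"bias estimate={loo_error - resub_error:.3f}")
```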
6.
Durga Prasad Muni, Nikhil R. Pal, Jyotirmoy Das 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2006,36(1):106-117
This paper presents an online feature selection algorithm using genetic programming (GP). The proposed GP methodology simultaneously selects a good subset of features and constructs a classifier using the selected features. For a c-class problem, it provides a classifier having c trees. In this context, we introduce two new crossover operations to suit the feature selection process. As a byproduct, our algorithm produces a feature ranking scheme. We tested our method on several data sets having dimensions varying from 4 to 7129. We compared the performance of our method with results available in the literature and found that the proposed method produces consistently good results. To demonstrate the robustness of the scheme, we studied its effectiveness on data sets with known (synthetically added) redundant/bad features.
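One plausible reading of the resulting classifier structure, sketched with toy stand-ins: c trees, one per class, an argmax decision rule, and a feature ranking obtained by counting how often each feature appears in the evolved trees. The decision rule and the occurrence-count ranking are assumptions, not necessarily the paper's exact mechanism.

```python
import numpy as np

# Hedged sketch: toy stand-ins for c = 3 evolved trees and the features they use.
trees = {
    0: lambda x: x[0] - 0.5 * x[3],
    1: lambda x: x[2] * x[3] - 1.0,
    2: lambda x: 0.8 * x[1] - x[0],
}
used_features = {0: {0, 3}, 1: {2, 3}, 2: {0, 1}}   # features appearing in each tree

def predict(x):
    """Assign the class whose tree gives the largest output (assumed decision rule)."""
    return max(trees, key=lambda c: trees[c](x))

selected = sorted(set().union(*used_features.values()))   # selected feature subset
ranking = sorted(selected, key=lambda f: -sum(f in s for s in used_features.values()))
print(predict(np.array([1.0, 0.2, 0.7, 0.3])), selected, ranking)
```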
7.
Ubaldo M. García-Palomares, Orestes Manzanilla-Salazar 《Decision Support Systems》2012,52(3):717-728
This paper describes a novel approach to building a piecewise (non)linear surface that separates individuals from two classes with an a priori classification accuracy. In particular, total classification with a good generalization level can be obtained, provided no individual belongs to both classes. The method is iterative: at each iteration a new piece of the surface is found via the solution of a linear programming model. Theoretically, the larger the number of iterations, the better the classification accuracy on the training set; numerically, we also found that the generalization ability does not deteriorate on the cases tested. Nonetheless, we have included a procedure that computes a lower bound on the number of errors that will be generated in any given validation set, and an early stopping criterion is provided if needed. We also show that each piece of the discriminating surface is equivalent to a neuron of a feed-forward neural network (FFNN); so, as a byproduct, we are providing a novel training scheme for FFNNs that avoids the minimization of non-convex functions which, in general, present many local minima. We compare this algorithm with a new linear SVM that needs no pre-tuning and has an excellent performance on standard and synthetic data. Highly encouraging numerical results are reported on synthetic examples, on the Japanese Bank dataset, and on medium and small datasets from the Irvine repository of machine learning databases.
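The abstract does not spell out the LP model, so the sketch below uses a generic linear program that finds a single hyperplane by minimising the total margin slack; it only illustrates how one "piece" of a piecewise separating surface can come out of an LP solver (scipy's linprog), and it is not the authors' formulation.

```python
import numpy as np
from scipy.optimize import linprog

# Hedged sketch: one LP for a single separating hyperplane, minimising the total
# slack of violated margins (an assumed, generic formulation).
def lp_hyperplane(X, y):
    """X: (n, d) array; y: labels in {-1, +1}. Returns (w, b)."""
    n, d = X.shape
    # Variables: w (d, free), b (free), s (n, >= 0). Minimise sum(s)
    # subject to y_i (w.x_i + b) >= 1 - s_i, i.e. -y_i(w.x_i + b) - s_i <= -1.
    c = np.concatenate([np.zeros(d + 1), np.ones(n)])
    A = np.hstack([-(y[:, None] * X), -y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)
    bounds = [(None, None)] * (d + 1) + [(0, None)] * n
    res = linprog(c, A_ub=A, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:d], res.x[d]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.5, (30, 2)), rng.normal(1, 0.5, (30, 2))])
y = np.array([-1] * 30 + [1] * 30)
w, b = lp_hyperplane(X, y)
print("training errors:", int(np.sum(np.sign(X @ w + b) != y)))
```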
8.
The genetic programming (GP) paradigm, which applies the Darwinian principle of evolution to hierarchical computer programs, has been applied with breakthrough success in various scientific and engineering applications. However, one of the main drawbacks of GP has been the often large amount of computational effort required to solve complex problems. Much disparate research has been conducted over the past 25 years to devise innovative methods to improve the efficiency and performance of GP. This paper attempts to provide a comprehensive overview of this work related to Canonical Genetic Programming based on parse trees and originally championed by Koza (Genetic programming: on the programming of computers by means of natural selection. MIT, Cambridge, 1992). Existing approaches that address various techniques for performance improvement are identified and discussed with the aim to classify them into logical categories that may assist with advancing further research in this area. Finally, possible future trends in this discipline and some of the open areas of research are also addressed.
9.
Prediction of the natural gas consumption in chemical processing facilities with genetic programming
Cinkarna Ltd. is a chemical processing company in Slovenia and the country's largest manufacturer of titanium oxides (TiO2). Chemical processing, and titanium oxide manufacturing in particular, requires high natural gas consumption, and it is difficult to accurately pre-order gas from suppliers. In accordance with the regulations of the Energy Agency of the Republic of Slovenia, each natural gas supplier regulates and determines the charges for the differences between the ordered (predicted) and the actually supplied quantities of natural gas. Yearly charges for these differences total 1.11% of supplied natural gas costs (on average 50,960 EUR per year). This paper presents natural gas consumption prediction and the minimization of the associated costs. Data on daily temperature, steam boilers, sulfuric acid and TiO2 production were collected from January 2012 until November 2014. Based on the collected data, a linear regression model and a genetic programming model were developed. Compared to the specialist's prediction of natural gas consumption, the linear regression and genetic programming models reduce the charges for the differences between the ordered and the actually supplied quantities by factors of 3.00 and 5.30, respectively. From January until November 2014 the same genetic programming model was also used in practice. The results show that in a similar gas consumption regime the differences between the ordered and the actually supplied quantities are statistically significantly lower, namely 3.19 times lower (t test, p < 0.05), than in the period in which the specialist responsible for natural gas consumption made the predictions.
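A minimal sketch of the linear-regression baseline on synthetic data: daily gas consumption regressed on temperature and production figures. The regressors, coefficients and noise level are invented for illustration; the paper fits real 2012-2014 plant data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hedged sketch: predict daily gas consumption from temperature and production.
rng = np.random.default_rng(0)
n_days = 200
temperature = rng.uniform(-5, 30, n_days)        # daily mean temperature, degrees C
tio2_production = rng.uniform(50, 120, n_days)   # TiO2 output per day (assumed scale)
acid_production = rng.uniform(100, 250, n_days)  # sulfuric acid output per day (assumed scale)
# Synthetic target: consumption rises with production and falls with temperature.
gas = (900 - 12 * temperature + 4.5 * tio2_production + 1.2 * acid_production
       + rng.normal(0, 40, n_days))

X = np.column_stack([temperature, tio2_production, acid_production])
model = LinearRegression().fit(X[:150], gas[:150])       # fit on the first 150 days
pred = model.predict(X[150:])                             # "pre-order" for later days
print("mean absolute error:", np.abs(pred - gas[150:]).mean().round(1))
```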
10.
A new method for design of a fuzzy-rule-based classifier using genetic algorithms (GAs) is discussed. The optimal parameters of the fuzzy classifier including fuzzy membership functions and the size and structure of fuzzy rules are extracted from the training data using GAs. This is done by introducing new representation schemes for fuzzy membership functions and fuzzy rules. An effectiveness measure for fuzzy rules is developed that allows for systematic addition or deletion of rules during the GA optimization process. A clustering method is utilized for generating new rules to be added when additions are required. The performance of the classifier is tested on two real-world databases (Iris and Wine) and a simulated Gaussian database. The results indicate that highly accurate classifiers could be designed with relatively few fuzzy rules. The performance is also compared to other fuzzy classifiers tested on the same databases.
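The pieces a GA would tune in such a classifier can be sketched as follows: triangular membership functions parameterised by (a, b, c) and a rule whose firing strength is the minimum of its antecedent memberships. The triangular shape and the min t-norm are common choices assumed here, not necessarily the paper's exact encoding.

```python
import numpy as np

# Hedged sketch of fuzzy-classifier building blocks that a GA could tune.
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with support [a, c] and peak b."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-12),
                                 (c - x) / (c - b + 1e-12)), 0.0)

def rule_strength(x, antecedents):
    """antecedents: list of (feature_index, (a, b, c)); strength = min of memberships."""
    return min(triangular(x[i], *params) for i, params in antecedents)

# Example rule, Iris-like features: "IF petal length is MEDIUM AND petal width is LOW THEN class 1".
rule = [(2, (2.0, 4.0, 5.5)), (3, (0.0, 0.5, 1.2))]
sample = np.array([5.1, 3.5, 4.2, 0.4])
print("firing strength:", rule_strength(sample, rule))
```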
11.
Schema theory is the most well-known model of evolutionary algorithms. Borrowing from genetic algorithms (GA), nearly all schemata defined for genetic programming (GP) refer to a set of points in the search space that share some syntactic characteristics. In GP, however, syntactically similar individuals do not necessarily have similar semantics; the instances of a syntactic schema do not behave similarly, so the corresponding schema theory becomes unreliable. Therefore, these theories have rarely been used to improve the performance of GP. The main objective of this study is to propose a schema theory that is a more realistic model for GP and could potentially be employed to improve GP in practice. To this end, the concept of a semantic schema is introduced. This schema partitions the search space according to the semantics of trees, regardless of their syntactic variety. We interpret the semantics of a tree in terms of the mutual information between its output and the target. The semantic schema is characterized by a set of semantic building blocks and their joint probability distribution. After introducing the semantic building blocks, an algorithm for finding them in a given population is presented, and an extraction method that looks for the most significant schema of the population is provided. Moreover, an exact microscopic schema theorem is suggested that predicts the expected number of schema samples in the next generation. Experimental results demonstrate the capability of the proposed schema definition in representing the semantics of the schema instances. It is also revealed that the semantic schema theorem estimation is more realistic than previously defined schemata.
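The semantics measure named in the abstract, mutual information between a tree's output and the target, can be estimated straightforwardly by discretising both signals; the bin count and the use of sklearn's mutual_info_score below are implementation assumptions, not the paper's exact estimator.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

# Hedged sketch: estimate I(tree_output; target) by equal-width binning.
def tree_semantics(tree_output, target, bins=10):
    """Return the mutual information (in nats) between two binned signals."""
    o = np.digitize(tree_output, np.histogram_bin_edges(tree_output, bins))
    t = np.digitize(target, np.histogram_bin_edges(target, bins))
    return mutual_info_score(o, t)

rng = np.random.default_rng(0)
target = rng.normal(size=500)
good_tree = target + rng.normal(scale=0.3, size=500)    # semantically close to the target
bad_tree = rng.normal(size=500)                          # semantically unrelated
print(tree_semantics(good_tree, target), tree_semantics(bad_tree, target))
```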
12.
Darren M. Chitty 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2016,20(2):661-680
Genetic Programming (GP) (Koza, Genetic programming, MIT Press, Cambridge, 1992) is well-known as a computationally intensive technique. Subsequently, faster parallel versions have been implemented that harness the highly parallel hardware provided by graphics cards enabling significant gains in the performance of GP to be achieved. However, extracting the maximum performance from a graphics card for the purposes of GP is difficult. A key reason for this is that in addition to the processor resources, the fast on-chip memory of graphics cards needs to be fully exploited. Techniques will be presented that will improve the performance of a graphics card implementation of tree-based GP by better exploiting this faster memory. It will be demonstrated that both L1 cache and shared memory need to be considered for extracting the maximum performance. Better GP program representation and use of the register file is also explored to further boost performance. Using an NVidia Kepler 670GTX GPU, a maximum performance of 36 billion Genetic Programming Operations per Second is demonstrated.
13.
《Knowledge》2007,20(2):127-133
This paper proposes a new tree-generation algorithm for grammar-guided genetic programming that includes a parameter to control the maximum size of the trees to be generated. An important feature of this algorithm is that the initial populations generated are adequately distributed in terms of tree size and distribution within the search space. Consequently, genetic programming systems starting from the initial populations generated by the proposed method have a higher convergence speed. Two different problems have been chosen to carry out the experiments: a laboratory test involving searching for arithmetical equalities and the real-world task of breast cancer prognosis. In both problems, comparisons have been made to another five important initialization methods.
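A size-bounded "grow"-style generator conveys the core idea of controlling maximum tree size at initialization; the toy function/terminal sets, the node-budget split and the terminal probability below are simplified assumptions and omit the grammar guidance described in the paper.

```python
import random

# Hedged sketch: generate expression trees (nested lists) with at most max_size nodes.
FUNCTIONS = {"+": 2, "-": 2, "*": 2}      # function symbols and their arities
TERMINALS = ["x", "y", "1"]

def grow(max_size, rng):
    """Generate a tree using at most max_size nodes."""
    if max_size < 3 or rng.random() < 0.3:        # not enough budget, or random terminal
        return rng.choice(TERMINALS)
    op = rng.choice(list(FUNCTIONS))
    budget = max_size - 1                          # nodes left for the two children
    left_budget = rng.randint(1, budget - 1)       # at least one node for each child
    return [op, grow(left_budget, rng), grow(budget - left_budget, rng)]

def size(tree):
    return 1 if isinstance(tree, str) else 1 + size(tree[1]) + size(tree[2])

rng = random.Random(0)
for tree in (grow(max_size=15, rng=rng) for _ in range(5)):
    print(size(tree), tree)
```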
14.
It is difficult but essential for genetic programming (GP) to evolve choice structures. Traditional approaches usually ignore this issue: they define problem-specific "if-structure" functions by combining an "if-else" statement, conditional criteria and elemental functions. These if-structure functions therefore depend on the specific problem and have very low reusability. To address this limitation of GP, in this paper we propose a termination criterion for the GP process named the "Combination Termination Criterion" (CTC). By testing CTC, choice structures composed of basic functions independent of the problem can be evolved successfully. Theoretical analysis and experimental results show that our method can evolve programs with choice structures effectively within an acceptable additional time.
15.
16.
17.
Kishore, J.K., Patnaik, L.M., Mani, V., Agrawal, V.K. 《IEEE Transactions on Evolutionary Computation》2000,4(3):242-258
This paper explores the feasibility of applying genetic programming (GP) to the multicategory pattern classification problem. GP can discover relationships and express them mathematically. GP-based techniques have an advantage over statistical methods because they are distribution-free, i.e., no prior knowledge is needed about the statistical distribution of the data. GP also automatically discovers the discriminant features for a class. GP has previously been applied to two-category classification; here, a methodology for GP-based n-class classification is developed. The problem is modeled as n two-class problems, and a genetic programming classifier expression (GPCE) is evolved as a discriminant function for each class. The GPCE is trained to recognize samples belonging to its own class and to reject others. A strength of association (SA) measure is computed for each GPCE to indicate the degree to which it can recognize samples of its own class. SA is used to uniquely assign a class to an input feature vector. Heuristic rules are used to prevent a GPCE with a higher SA from swamping one with a lower SA. Experimental results are presented to demonstrate the applicability of GP for multicategory classification, and they are found to be satisfactory. We also discuss the various issues that arise in our approach to GP-based classification, such as the creation of training sets, the role of incremental learning, and the choice of function set in the evolution of GPCE, as well as conflict resolution for uniquely assigning a class.
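The prediction step can be sketched as follows: one evolved discriminant expression (GPCE) per class, each with a strength-of-association score, and the accepting GPCE with the highest SA wins. The toy expressions, the positive-output acceptance test and the SA values are placeholders; in the paper both the expressions and the SA measure come from the GP run.

```python
import numpy as np

# Hedged sketch: (discriminant function, SA) per class; toy stand-ins for evolved GPCEs.
gpces = {
    0: (lambda x: x[0] - x[1], 0.91),   # class 0
    1: (lambda x: x[1] - x[0], 0.84),   # class 1
}

def classify(x):
    """A sample is accepted by a GPCE when its expression is positive; multiple
    acceptances are resolved in favour of the GPCE with the higher SA."""
    accepting = [(sa, c) for c, (f, sa) in gpces.items() if f(x) > 0]
    if not accepting:
        return None                      # rejected by every GPCE
    return max(accepting)[1]

print(classify(np.array([2.0, 0.5])), classify(np.array([0.1, 3.0])))
```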
18.
Optimal resampling and classifier prototype selection in classifier ensembles using genetic algorithms
Ensembles of classifiers that are trained on different parts of the input space generally provide good results. As a popular boosting technique, AdaBoost is an iterative, gradient-based deterministic method used for this purpose in which an exponential loss function is minimized. Bagging is a random-search-based ensemble creation technique where the training set of each classifier is arbitrarily selected. In this paper, a genetic algorithm based ensemble creation approach is proposed in which both the resampled training sets and the classifier prototypes evolve so as to maximize the combined accuracy. The objective-function-based random search procedure of the resultant system, guided by both ensemble accuracy and diversity, can be considered to share the basic properties of bagging and boosting. Experimental results show that the proposed approach provides better combined accuracies using a fewer number of classifiers than AdaBoost.
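A sketch of the search space only (no full GA loop): each individual encodes, per ensemble member, a bootstrap resample and a classifier prototype, and its fitness is the majority-vote accuracy on a validation split. The prototype pool, ensemble size and fitness definition are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

# Hedged sketch: an individual is a list of (bootstrap indices, prototype index).
PROTOTYPES = [lambda: DecisionTreeClassifier(max_depth=3),
              lambda: KNeighborsClassifier(n_neighbors=3)]

def random_individual(rng, n_train, n_members=5):
    return [(rng.integers(0, n_train, n_train), rng.integers(0, len(PROTOTYPES)))
            for _ in range(n_members)]

def fitness(individual, X_tr, y_tr, X_val, y_val):
    votes = []
    for idx, proto in individual:
        clf = PROTOTYPES[proto]().fit(X_tr[idx], y_tr[idx])
        votes.append(clf.predict(X_val))
    majority = np.round(np.mean(votes, axis=0))       # majority vote for 0/1 labels
    return np.mean(majority == y_val)

X, y = make_classification(n_samples=300, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
rng = np.random.default_rng(0)
print(fitness(random_individual(rng, len(X_tr)), X_tr, y_tr, X_val, y_val))
```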
19.
Ilker Fatih Kara 《Advances in Engineering Software》2011,42(6):295-304
The use of fibre reinforced polymer (FRP) bars to reinforce concrete structures has received a great deal of attention in recent years due to their excellent corrosion resistance, high tensile strength, and good non-magnetic properties. Due to the relatively low modulus of elasticity of FRP bars, concrete members reinforced longitudinally with FRP bars experience reduced shear strength compared to the shear strength of those reinforced with the same amounts of steel reinforcement. This paper presents a simple yet improved model to calculate the concrete shear strength of FRP-reinforced concrete slender beams (a/d > 2.5) without stirrups based on the gene expression programming (GEP) approach. The model produced by GEP is constructed directly from a set of experimental results available in the literature. The results of the training, testing and validation sets of the model are compared with experimental results. All of the results show that GEP is a strong technique for the prediction of the shear capacity of FRP-reinforced concrete beams without stirrups. The performance of the GEP model is also compared to that of four commonly used shear design provisions for FRP-reinforced concrete beams. The proposed GEP model provides the most accurate results in calculating the concrete shear strength of FRP-reinforced concrete beams among the shear equations provided by current provisions. A parametric study is also carried out to evaluate the ability of the proposed GEP model and current shear design guidelines to quantitatively account for the effects of basic shear design parameters on the shear strength of FRP-reinforced concrete beams.