Similar articles
20 similar articles found; search took 15 ms
1.
This paper presents an online feature selection algorithm using genetic programming (GP). The proposed GP methodology simultaneously selects a good subset of features and constructs a classifier using the selected features. For a c-class problem, it provides a classifier having c trees. In this context, we introduce two new crossover operations to suit the feature selection process. As a byproduct, our algorithm produces a feature ranking scheme. We tested our method on several data sets whose dimensionality ranges from 4 to 7129. We compared the performance of our method with results available in the literature and found that the proposed method produces consistently good results. To demonstrate the robustness of the scheme, we studied its effectiveness on data sets with known (synthetically added) redundant/bad features.

2.
Evolutionary constructive induction   Total citations: 1 (self-citations: 0, citations by others: 1)
Feature construction in classification is a preprocessing step in which one or more new attributes are constructed from the original attribute set, the aim being to construct features that are more predictive than the original feature set. Genetic programming allows the construction of nonlinear combinations of the original features. We present a comprehensive analysis of genetic programming (GP) used for feature construction, in which four different fitness functions are used by the GP and four different classification techniques are subsequently used to build the classifier. Comparisons are made of the error rates and the size and complexity of the resulting trees. We also compare the overall performance of GP in feature construction with that of GP used directly to evolve a decision tree classifier, with the former proving to be a more effective use of the evolutionary paradigm.

3.
We review the main results obtained in the theory of schemata in genetic programming (GP), emphasizing their strengths and weaknesses. Then we propose a new, simpler definition of the concept of schema for GP, which is closer to the original concept of schema in genetic algorithms (GAs). Along with a new form of crossover, one-point crossover, and point mutation, this concept of schema has been used to derive an improved schema theorem for GP that describes the propagation of schemata from one generation to the next. We discuss this result and show that our schema theorem is the natural counterpart for GP of the schema theorem for GAs, to which it asymptotically converges.
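
The one-point crossover mentioned in this abstract can be illustrated with a minimal sketch. It assumes trees are nested tuples of the form (function, child, ...) with strings as terminals, and that the crossover point is drawn uniformly from the node positions of the common region (positions reachable from the root while both parents have matching arity). The tree encoding, the helper names, and the uniform choice over nodes are assumptions made for illustration, not details taken from the paper.

```python
import random


def arity(node):
    """Number of children: tuples are (function, child, ...), strings are terminals."""
    return len(node) - 1 if isinstance(node, tuple) else 0


def common_region(a, b, path=()):
    """Return the positions (paths of child indices) shared by both parents.

    A position belongs to the common region if every ancestor pair on the way
    down from the root has the same arity in both trees.
    """
    paths = [path]
    if arity(a) == arity(b):
        for i in range(1, arity(a) + 1):
            paths.extend(common_region(a[i], b[i], path + (i,)))
    return paths


def subtree(tree, path):
    """Follow a path of child indices and return the subtree found there."""
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]


def one_point_crossover(a, b, rng=random):
    """Swap the subtrees rooted at one position chosen in the common region."""
    point = rng.choice(common_region(a, b))
    child_a = replace_subtree(a, point, subtree(b, point))
    child_b = replace_subtree(b, point, subtree(a, point))
    return child_a, child_b
```

Because the same position is used in both parents, the swapped material keeps its place in the tree, which is what brings this operator close to one-point crossover on fixed-length strings.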

4.
Traditional genetic programming (GP) randomly combines subtrees by applying crossover. There is a growing interest in methods that can control such recombination operations in order to achieve faster convergence. In this paper, a new approach is presented for guiding the recombination process for genetic programming. The method is based on extracting the global information of the promising solutions that appear during the genetic search. The aim is to use this information to control the crossover operation afterwards. A separate control module is used to process the collected information. This module guides the search process by sending feedback to the genetic engine about the consequences of possible recombination alternatives.

5.
This paper describes a genetic programming (GP) approach to medical data classification problems. In this approach, the evolved genetic programs are simplified online during the evolutionary process using algebraic simplification rules, algebraic equivalence and prime techniques. The new simplification GP approach is examined and compared to the standard GP approach on two medical data classification problems. The results suggest that the new simplification GP approach can not only be more efficient with slightly better classification performance than the basic GP system on these problems, but also significantly reduce the sizes of evolved programs. Comparison with other methods including decision trees, naive Bayes, nearest neighbour, nearest centroid, and neural networks suggests that the new GP approach achieved superior results to almost all of these methods on these problems. The evolved genetic programs are also easier to interpret than the “hidden patterns” discovered by the other methods.
Phillip Wong

6.
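Concept acquisition is an important research topic in natural language understanding. This paper proposes a method for describing noun concepts based on Chinese classifiers (measure words), and designs and implements a weight calculation scheme. Clustering experiments are used to explore the role and contribution of classifiers in distinguishing the semantics of nouns. The experimental results show that the classifier-based representation of noun concepts is effective and can distinguish most noun concepts.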

7.
Schema theory is the most well-known model of evolutionary algorithms. Following genetic algorithms (GAs), nearly all schemata defined for genetic programming (GP) refer to a set of points in the search space that share some syntactic characteristics. In GP, syntactically similar individuals do not necessarily have similar semantics. The instances of a syntactic schema do not behave similarly, hence the corresponding schema theory becomes unreliable. Therefore, these theories have rarely been used to improve the performance of GP. The main objective of this study is to propose a schema theory which could be a more realistic model for GP and could potentially be employed for improving GP in practice. To achieve this aim, the concept of semantic schema is introduced. This schema partitions the search space according to the semantics of trees, regardless of their syntactic variety. We interpret the semantics of a tree in terms of the mutual information between its output and the target. The semantic schema is characterized by a set of semantic building blocks and their joint probability distribution. After introducing the semantic building blocks, an algorithm for finding them in a given population is presented. An extraction method that looks for the most significant schema of the population is provided. Moreover, an exact microscopic schema theorem is suggested that predicts the expected number of schema samples in the next generation. Experimental results demonstrate the capability of the proposed schema definition in representing the semantics of the schema instances. It is also revealed that the semantic schema theorem estimation is more realistic than previously defined schemata.
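
The abstract above interprets the semantics of a tree through the mutual information between its output and the target. A minimal sketch of that quantity follows, assuming both sequences are already discretised and using a plain frequency-count (plug-in) estimate. The function name and the choice of estimator are illustrative assumptions, not the authors' procedure.

```python
from collections import Counter
from math import log2


def mutual_information(outputs, targets):
    """Plug-in estimate of I(output; target) in bits for two discrete sequences."""
    if len(outputs) != len(targets) or not outputs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(outputs)
    joint = Counter(zip(outputs, targets))  # counts of (output, target) pairs
    out_counts = Counter(outputs)           # marginal counts of outputs
    tgt_counts = Counter(targets)           # marginal counts of targets
    return sum(
        (c / n) * log2(c * n / (out_counts[x] * tgt_counts[y]))
        for (x, y), c in joint.items()
    )
```

Under this estimate, a tree whose output determines a balanced binary target scores 1 bit, and a tree whose output is statistically independent of the target scores 0, however the two trees differ syntactically.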

8.
Multicast is an important operation in multicomputer communication systems and can be used to support several collective communication operations. A significant performance improvement can be achieved by supporting multicast operations at the hardware level. We propose an asynchronous tree-based multicasting (ATBM) technique for multistage interconnection networks (MINs). The deadlock issues in tree-based multicasting in MINs are analyzed first to examine the main causes of deadlocks. An ATBM framework is developed in which deadlocks are prevented by serializing the initiations of tree operations that have the potential to create deadlocks. These tree operations are identified through a grouping algorithm. The ATBM approach is not only simple to implement but also provides good communication performance using minimal overheads in terms of additional hardware requirements and synchronization delay. Using the ATBM framework, algorithms are developed for both unidirectional and bidirectional multistage interconnection networks. The performance of the proposed algorithms is evaluated through simulation experiments. The results indicate that the proposed hardware-based ATBM scheme reduces the communication latency when compared to the software multicasting approach proposed earlier.

9.
10.
We present an efficient graph-based evolutionary optimization technique, called evolutionary graph generation (EGG), and apply it to the design of combinational and sequential arithmetic circuits based on parallel counter-tree architecture. The fundamental idea of EGG is to employ general circuit graphs as individuals and manipulate the circuit graphs directly using new evolutionary graph operations without encoding the graphs into other indirect representations, such as the bit strings used in genetic algorithms (GAs) proposed by Holland (1992) and the trees used in genetic programming (GP) proposed by Koza et al. (1997). In this paper, the EGG system is applied to the design of constant-coefficient multipliers and the design of bit-serial data-parallel adders. The results demonstrate the potential capability of EGG to solve practical design problems for arithmetic circuits with limited knowledge of computer arithmetic algorithms. The proposed EGG system can help to simplify and speed up the process of designing arithmetic circuits and can produce better solutions to the given problem.

11.
A method for pattern classification using genetic algorithms (GAs) has been recently described in Pal, Bandyopadhyay and Murthy (1998), where the class boundaries of a data set are approximated by a fixed number H of hyperplanes. As a consequence of fixing H a priori, the classifier suffered from the limitation of overfitting (or underfitting) the training data with an associated loss of its generalization capability. In this paper, we propose a scheme for evolving the value of H automatically using the concept of variable length strings/chromosomes. The crossover and mutation operators are newly defined in order to handle variable string lengths. The fitness function ensures primarily the minimization of the number of misclassified samples, and also the reduction of the number of hyperplanes. Based on an analogy between the classification principles of the genetic classifier and multilayer perceptron (with hard limiting neurons), a method for automatically determining the architecture and the connection weights of the latter is described.

12.
The concept of multiobjective optimization (MOO) has been integrated with variable length chromosomes for the development of a nonparametric genetic classifier that can overcome problems such as overfitting/overlearning and ignoring smaller classes, as faced by single-objective classifiers. The classifier can efficiently approximate any kind of linear and/or nonlinear class boundaries of a data set using an appropriate number of hyperplanes. While designing the classifier, the aim is to simultaneously minimize the number of misclassified training points and the number of hyperplanes, and to maximize the product of class-wise recognition scores. The concepts of validation set (in addition to training and test sets) and validation functional are introduced in the multiobjective classifier for selecting a solution from a set of nondominated solutions provided by the MOO algorithm. This genetic classifier incorporates elitism and some domain-specific constraints in the search process, and is called the CEMOGA-Classifier (constrained elitist multiobjective genetic algorithm based classifier). Two new quantitative indices, namely the purity and the minimal spacing, are developed for evaluating the performance of different MOO techniques. These are used, along with classification accuracy, required number of hyperplanes and computation time, to compare the CEMOGA-Classifier with other related ones.
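
The notion of a set of nondominated solutions used in this abstract can be sketched as a simple Pareto filter. The sketch assumes every objective is minimised, so the product of class-wise recognition scores (which the abstract maximises) is negated before comparison. The function names and the candidate values in the usage note are hypothetical, and the filter is a generic illustration rather than the CEMOGA algorithm itself.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere.

    All objectives are assumed to be minimised.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(points):
    """Keep the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def as_minimisation(misclassified, hyperplanes, score_product):
    """Map the three objectives named in the abstract to a minimisation vector."""
    return (misclassified, hyperplanes, -score_product)
```

For example, with hypothetical candidates (12, 3, 0.70), (10, 5, 0.75), (12, 4, 0.65) and (15, 2, 0.60), given as (misclassified points, hyperplanes, score product), only the third is removed: the first candidate matches it on errors while using fewer hyperplanes and achieving a higher score product.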

13.
Genetic Programming (GP) homologous crossovers are a group of operators, including GP one-point crossover and GP uniform crossover, where the offspring are created preserving the position of the genetic material taken from the parents. In this paper we present an exact schema theory for GP and variable-length Genetic Algorithms (GAs) which is applicable to this class of operators. The theory is based on the concepts of GP crossover masks and GP recombination distributions that are generalisations of the corresponding notions used in GA theory and in population genetics, as well as the notions of hyperschema and node reference systems, which are specifically required when dealing with variable size representations. In this paper we also present a Markov chain model for GP and variable-length GAs with homologous crossover. We obtain this result by using the core of Vose's model for GAs in conjunction with the GP schema theory just described. The model is then specialised for the case of GP operating on 0/1 trees: a tree-like generalisation of the concept of binary string. For these, symmetries exist that can be exploited to obtain further simplifications. In the absence of mutation, the Markov chain model presented here generalises Vose's GA model to GP and variable-length GAs. Likewise, our schema theory generalises and refines a variety of previous results in GP and GA theory.

14.
Data mining has attracted considerable research effort during the past decade. However, little work has been reported on the efficiency of supporting a large number of users who issue different data mining queries periodically when there are new needs and when data is updated. Our work is motivated by the fact that the pattern-growth method, which constructs an initial tree and mines frequent patterns on top of the tree, is one of the most efficient methods for frequent pattern mining. In this paper, we present a data mining proxy approach that can reduce the I/O costs of constructing an initial tree by utilizing the trees that are already resident in memory. The tree we construct is the smallest for a given data mining query. In addition, our proxy approach can also reduce the CPU cost of mining patterns, because the cost of mining depends on the sizes of the trees. The focus of the work is to construct an initial tree efficiently. We propose three tree operations to construct a tree. With a unique coding scheme, we can efficiently project subtrees from on-disk trees or in-memory trees. Our performance study indicates that the data mining proxy significantly reduces the I/O cost of constructing trees and the CPU cost of mining patterns over the trees constructed.

15.
A new fuzzy-model-based approach to fault detection and diagnosis is proposed. The scheme uses a set of fuzzy reference models which describe faulty and fault-free operation, and a classifier based on fuzzy matching for fault diagnosis. The reference models are obtained off-line from simulation data. A fuzzy model which describes the actual behavior of the plant is identified online from normal operating data and compared to each of the reference models. A degree of similarity is evaluated every time the online fuzzy model is identified. Dempster's rule of combination is used to combine new evidence with that already collected. The method of diagnosis accounts for any ambiguity (which may result from fault-free and faulty operation, or different faults, having similar symptoms at a given operating point) by comparing the fuzzy reference models with each other. Results are presented which demonstrate the effectiveness of the scheme when it is used to detect and identify faults in the cooling coil subsystem of the air-handling unit of both simulated and experimental air-conditioning plants.
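
Dempster's rule of combination, which this abstract uses to merge new evidence with evidence already collected, can be sketched as follows. Mass functions are assumed to be dictionaries mapping frozensets of hypotheses to masses that sum to 1. The function name and the fault labels in the usage note are hypothetical, and how the paper derives masses from fuzzy similarity degrees is not modelled here.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (frozenset of hypotheses -> mass) by Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two focal sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to disjoint focal sets is counted as conflict.
            conflict += mass_a * mass_b
    if 1.0 - conflict <= 1e-12:
        raise ValueError("total conflict: the evidence cannot be combined")
    # Renormalise by the mass that was not lost to conflict.
    return {hyp: mass / (1.0 - conflict) for hyp, mass in combined.items()}
```

As a usage example with a hypothetical frame {ok, valve, fouling}: combining m1 = {ok}: 0.5, {valve, fouling}: 0.3, whole frame: 0.2 with m2 = {valve}: 0.6, whole frame: 0.4 gives a conflict of 0.30, and after renormalisation the mass on {valve} is 0.30 / 0.70, about 0.43.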

16.
This article introduces variable chromosome lengths (VCL) in the context of a genetic algorithm (GA). This concept is applied to structural topology optimization but is also suitable for a broader class of design problems. In traditional genetic algorithms, the chromosome length is determined a priori when the phenotype is encoded into the corresponding genotype. Subsequently, the chromosome length does not change. This approach does not effectively solve problems with large numbers of design variables in complex design spaces such as those encountered in structural topology optimization. We propose an alternative approach based on a progressive refinement strategy, where a GA starts with a short chromosome and first finds an optimum solution in the simple design space. The optimum solutions are then transferred to the following stages with longer chromosomes, while maintaining diversity in the population. Progressively refined solutions are obtained in subsequent stages. A strain energy filter is used to filter out inefficiently used design cells such as protrusions or isolated islands. The variable chromosome length genetic algorithm (VCL-GA) is applied to two structural topology optimization problems: a short cantilever and a bridge problem. The performance of the method is compared to a brute-force approach GA, which operates ab initio at the highest level of resolution.

17.
Quay crane scheduling is one of the most important operations in seaport terminals. The effectiveness of this operation can directly influence the overall performance as well as the competitive advantages of the terminal. This paper develops a new priority-based schedule construction procedure to generate quay crane schedules. From this procedure, two new hybrid evolutionary computation methods based on genetic algorithm (GA) and genetic programming (GP) are developed. The key difference between the two methods is their representations which decide how priorities of tasks are determined. While GA employs a permutation representation to decide the priorities of tasks, GP represents its individuals as a priority function which is used to calculate the priorities of tasks. A local search heuristic is also proposed to improve the quality of solutions obtained by GA and GP. The proposed hybrid evolutionary computation methods are tested on a large set of benchmark instances and the computational results show that they are competitive and efficient as compared to the existing methods. Many new best known solutions for the benchmark instances are discovered by using these methods. In addition, the proposed methods also show their flexibility when applied to generate robust solutions for quay crane scheduling problems under uncertainty. The results show that the obtained robust solutions are better than those obtained from the deterministic inputs.

18.
One of the major challenges in pattern recognition problems is the feature extraction process, which derives new features from existing features, or directly from raw data, in order to reduce the cost of computation during the classification process while improving classifier efficiency. Most current feature extraction techniques transform the original pattern vector into a new vector with increased discrimination capability but lower dimensionality. This is conducted within a predefined feature space and thus has limited searching power. Genetic programming (GP) can generate new features from the original dataset without prior knowledge of the probabilistic distribution. In this paper, a GP-based approach is developed for feature extraction from raw vibration data recorded from a rotating machine with six different conditions. The created features are then used as the inputs to a neural classifier for the identification of six bearing conditions. Experimental results demonstrate the ability of GP to discover automatically the different bearing conditions using features expressed in the form of nonlinear functions. Furthermore, four sets of results (using GP-extracted features with artificial neural networks (ANN) and support vector machines (SVM), as well as traditional features with ANN and SVM) have been obtained. This GP-based approach is used for bearing fault classification for the first time and exhibits superior searching power over other techniques. Additionally, it significantly reduces the computation time compared with a genetic algorithm (GA), and therefore makes for a more practical realization of the solution.

19.
Hardware supported multicast in fat-tree-based InfiniBand networks   Total citations: 1 (self-citations: 1, citations by others: 0)
The multicast operation is very commonly used in parallel applications. It can also be used to implement many collective communication operations. Therefore, its performance affects both parallel applications and collective communication operations. Using the hardware-supported multicast of the InfiniBand Architecture (IBA), in this paper we propose a cyclic multicast scheme for fat-tree-based (m-port n-tree) InfiniBand networks. The basic concept of the proposed cyclic multicast scheme is to find the union sets of the output ports of switches in the paths between the source processing node and each destination processing node in a multicast group. Based on the union sets and the path selection scheme, the forwarding table for a given multicast group can be constructed. We implement the proposed multicast scheme along with the OpenSM multicast scheme and the unicast scheme on an m-port n-tree InfiniBand network simulator. Several one-to-many, many-to-many, many-to-all, and all-to-many multicast cases are simulated. The simulation results show that the proposed multicast scheme outperforms the unicast scheme for all simulated cases. For the one-to-many case, the performance of the cyclic multicast scheme is the same as that of the OpenSM multicast scheme. For the many-to-many and all-to-many cases, the cyclic multicast scheme outperforms the OpenSM multicast scheme. For the many-to-all case, the performance of the cyclic multicast scheme is slightly better than that of the OpenSM multicast scheme.
Yeh-Ching Chung
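
The union-of-output-ports idea described in this abstract can be sketched in a few lines. The sketch assumes each source-to-destination path has already been chosen and is given as a list of (switch, output port) hops. The switch identifiers, the function name, and the dictionary layout are hypothetical, and the cyclic path selection scheme itself is not modelled.

```python
from collections import defaultdict


def build_forwarding_table(paths):
    """Union the output ports used at each switch over all destination paths.

    paths: iterable of paths, each a list of (switch_id, output_port) hops
    from the source node towards one destination in the multicast group.
    Returns a dict mapping switch_id to the set of ports a packet is copied to.
    """
    table = defaultdict(set)
    for path in paths:
        for switch, port in path:
            table[switch].add(port)
    return dict(table)
```

For instance, three hypothetical paths [("S0", 2), ("S4", 0)], [("S0", 2), ("S4", 1)] and [("S0", 3), ("S5", 0)] yield the entries S0: {2, 3}, S4: {0, 1} and S5: {0}, so a packet is replicated only at switches where the paths diverge.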

20.
We propose a new scheme for implementing gate operations between remote qubits in linear nearest neighbor (LNN) architectures, one that does not require qubits to be adjacent to each other in order to perform a gate operation between them. The key feature of our scheme is a new two-control, one-target controlled-unitary gate operation, which we refer to as the C2(?I) gate. The gate operation can be implemented easily in a single step, requiring only a single control parameter of the system Hamiltonian. Using the C2(?I) gate, we show how to implement CNOT gate operations between remote qubits that do not have any direct coupling between them, along an LNN array. Since this is achieved without requiring swap operations or additional ancilla qubits in the circuit, the quantum cost of our circuit can be more than 50% lower than those using conventional swap methods. All CNOT gate operations between remote qubits can be achieved with fidelity greater than 99.5%.
