20 similar articles found (search time: 0 ms)
1.
Dong-Xia Chang Xian-Da Zhang 《Pattern recognition》2009,42(7):1210-1987
In this paper, a new clustering algorithm based on a genetic algorithm (GA) with gene rearrangement (GAGR) is proposed, which in application may effectively remove degeneracy for the purpose of a more efficient search. A new crossover operator that exploits a measure of similarity between chromosomes in a population is also presented. Adaptive probabilities of crossover and mutation are employed to prevent the convergence of the GAGR to a local optimum. Using real-world data sets, we compare the performance of our GAGR clustering algorithm with the K-means algorithm and other GA methods. An application of the GAGR clustering algorithm in unsupervised classification of multispectral remote sensing images is also provided. Experimental results demonstrate that the GAGR clustering algorithm has high performance, effectiveness and flexibility.
2.
Maurizio Naldi Sancho Salcedo-Sanz Leopoldo Carro-Calvo Luigi Laura Antonio Portilla-Figueras Giuseppe F. Italiano 《Applied Soft Computing》2013,13(11):4303-4319
Network clustering algorithms are typically based only on the topology information of the network. In this paper, we introduce traffic as a quantity representing the intensity of the relationship among nodes in the network, regardless of their connectivity, and propose an evolutionary clustering algorithm, based on the application of genetic operators and capable of exploiting the traffic information. In a comparative evaluation based on synthetic instances and two real-world datasets, we show that our approach outperforms a selection of well-established evolutionary and non-evolutionary clustering algorithms.
3.
Dongxia Chang Yao Zhao Changwen Zheng Xianda Zhang 《Expert systems with applications》2012,39(2):2194-2202
In this paper, a genetic clustering algorithm is described that uses a new similarity measure based on message passing between data points and the candidate centers described by the chromosome. In the new algorithm, a variable-length real-value chromosome representation and a set of problem-specific evolutionary operators are used. Therefore, the proposed GA with message-based similarity (GAMS) clustering algorithm is able to automatically evolve and find the optimal number of clusters as well as proper clusters of the data set. The effectiveness of the GAMS clustering algorithm is demonstrated on both artificial and real-life data sets. Experimental results demonstrate that the GAMS clustering algorithm has high performance, effectiveness and flexibility.
4.
Yongguo Liu Kefei Chen Wei Zhang 《Pattern recognition》2004,37(5):927-942
Traditional intrusion detection methods lack extensibility in the face of changing network configurations as well as adaptability in the face of unknown attack types. Meanwhile, current machine-learning algorithms need labeled data for training first, so they are computationally expensive and sometimes misled by artificial data. In this paper, a new detection algorithm, the Intrusion Detection Based on Genetic Clustering (IDBGC) algorithm, is proposed. It can automatically establish clusters and detect intruders by labeling normal and abnormal groups. Computer simulations show that this algorithm is effective for intrusion detection.
5.
Dong-Xia Chang Xian-Da Zhang Dao-Ming Zhang 《Pattern recognition》2010,43(4):1346-1279
In this paper, a genetic clustering algorithm based on dynamic niching with niche migration (DNNM-clustering) is proposed. It is an effective and robust approach to clustering on the basis of a similarity function relating to the approximate density shape estimation. In the new algorithm, a dynamic identification of the niches with niche migration is performed at each generation to automatically evolve the optimal number of clusters as well as the cluster centers of the data set without invoking cluster validity functions. The niches can move slowly under the migration operator, which makes the dynamic niching method independent of the radius of the niches. Compared to other existing methods, the proposed clustering method exhibits the following robust characteristics: (1) robust to the initialization, (2) robust to cluster volumes (ability to detect clusters of different volumes), and (3) robust to noise. Moreover, it does not depend on the radius of the niches and does not need the number of clusters to be pre-specified. Several data sets with widely varying characteristics are used to demonstrate its superiority. An application of the DNNM-clustering algorithm in unsupervised classification of multispectral remote sensing images is also provided.
6.
Applying graph theory to clustering, we propose a partitional clustering method and a clustering tendency index. The method requires no initial assumptions about the data set. The number of clusters and the partition that best fits the data set are selected according to the optimal clustering tendency index value.
7.
In this paper, we show how one can take advantage of the stability and effectiveness of object data clustering algorithms when the data to be clustered are available in the form of mutual numerical relationships between pairs of objects. More precisely, we propose a new fuzzy relational algorithm, based on the popular fuzzy C-means (FCM) algorithm, which does not require any particular restriction on the relation matrix. We describe the application of the algorithm to four real and four synthetic data sets, and show that our algorithm performs better than well-known fuzzy relational clustering algorithms on all these sets.
8.
APSCAN: A parameter free algorithm for clustering
DBSCAN is a density-based clustering algorithm whose effectiveness for spatial datasets has been demonstrated in the existing literature. However, DBSCAN has two distinct drawbacks: (i) clustering performance depends on two specified parameters. One is the maximum radius of a neighborhood and the other is the minimum number of data points contained in such a neighborhood. Together, these two parameters define a single density. Without enough prior knowledge, however, they are difficult to determine; (ii) with these two parameters for a single density, DBSCAN does not perform well on datasets with varying densities. These two issues cause difficulties in applications. To address them in a systematic way, in this paper we propose a novel parameter-free clustering algorithm named APSCAN. Firstly, we utilize the Affinity Propagation (AP) algorithm to detect local densities for a dataset and generate a normalized density list. Secondly, we combine the first pair of density parameters with any other pair of density parameters in the normalized density list as input parameters for a proposed DDBSCAN (Double-Density-Based SCAN) to produce a set of clustering results. In this way, we can obtain different clustering results with varying density parameters derived from the normalized density list. Thirdly, we develop an update rule for the results obtained by running DDBSCAN with different input parameters and then synthesize these clustering results into a final result. The proposed APSCAN has two advantages: first, it does not need the two parameters required by DBSCAN to be predefined and, second, it can not only cluster datasets with varying densities but also preserve the nonlinear data structure of such datasets.
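The single-density limitation described in the abstract above can be illustrated with a minimal sketch of the DBSCAN core-point test. This is not APSCAN or DDBSCAN; the function names `region_query` and `core_points`, the parameter names `eps` and `min_pts`, and the toy data are assumptions made for illustration only.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def core_points(points, eps, min_pts):
    """Indices of core points: points with at least min_pts neighbours within eps."""
    return [i for i in range(len(points))
            if len(region_query(points, i, eps)) >= min_pts]

# Two hypothetical groups with very different densities.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
points = dense + sparse

# A small radius only recognises the dense group as core points.
tight = core_points(points, eps=0.2, min_pts=4)
# A larger radius is needed before the sparse group qualifies as well.
loose = core_points(points, eps=1.5, min_pts=4)
```

With one (`eps`, `min_pts`) pair tuned to the dense group, every point of the sparse group fails the core-point test, which is the behaviour the abstract attributes to a single density setting.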
9.
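To reduce the influence of the uncertainty in the value of k and the randomness of the initial cluster centers on clustering results, an improved genetic k-means clustering algorithm is proposed. Parallel computation is used to reduce the influence of k and the initial cluster centers on the clustering results; the fitness function is designed from the average intra-class distance and the inter-class distance to ensure the correctness of the clustering results; and the genetic operators of the genetic algorithm are improved to raise the algorithm's efficiency. The correctness and effectiveness of the algorithm are verified on UCI standard data sets, and the algorithm is applied to the breeding of improved maize varieties. Experimental results show that the algorithm can identify superior maize varieties and guide maize breeding work.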
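The abstract above does not give the exact form of its fitness function. The sketch below assumes one plausible form, the mean distance between cluster centroids divided by the mean point-to-centroid distance, so that compact, well-separated partitions score higher. The names `centroid` and `fitness` and the sample partitions are hypothetical.

```python
import math
from itertools import combinations

def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dim))

def fitness(clusters):
    """Assumed fitness: mean inter-centroid distance / mean intra-cluster distance.

    Expects at least two non-empty clusters.
    """
    centers = [centroid(c) for c in clusters]
    intra = [math.dist(p, centers[k]) for k, c in enumerate(clusters) for p in c]
    inter = [math.dist(a, b) for a, b in combinations(centers, 2)]
    mean_intra = sum(intra) / len(intra)
    mean_inter = sum(inter) / len(inter)
    # The small constant guards against division by zero for degenerate clusters.
    return mean_inter / (mean_intra + 1e-12)

# Two hypothetical partitions of the same four points.
good = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
bad = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]
better = fitness(good) > fitness(bad)
```

Other combinations of the two distances (for example a weighted difference) would serve the same purpose; the ratio is used here only because it needs no extra weighting parameter.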
10.
In this paper, the conventional k-modes-type algorithms for clustering categorical data are extended by representing the clusters of categorical data with k-populations instead of the hard-type centroids used in the conventional algorithms. Use of a population-based centroid representation makes it possible to preserve the uncertainty inherent in data sets as long as possible before actual decisions are made. The k-populations algorithm was found to give markedly better clustering results through various experiments.
11.
Cao D. Nguyen 《Information Sciences》2008,178(22):4205-4227
We introduce a novel clustering algorithm named GAKREM (Genetic Algorithm K-means Logarithmic Regression Expectation Maximization) that combines the best characteristics of the K-means and EM algorithms but avoids their weaknesses such as the need to specify a priori the number of clusters, termination in local optima, and lengthy computations. To achieve these goals, genetic algorithms for estimating parameters and initializing starting points for the EM are used first. Second, the log-likelihood of each configuration of parameters and the number of clusters resulting from the EM is used as the fitness value for each chromosome in the population. The novelty of GAKREM is that in each evolving generation it efficiently approximates the log-likelihood for each chromosome using logarithmic regression instead of running the conventional EM algorithm until its convergence. Another novelty is the use of K-means to initially assign data points to clusters. The algorithm is evaluated by comparing its performance with the conventional EM algorithm, the K-means algorithm, and the likelihood cross-validation technique on several datasets.
12.
Mao-Zu Guo Jun Wang Chun-yu Wang Yang Liu 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2009,13(12):1143-1151
TagSNP selection, which aims to select a small subset of informative single nucleotide polymorphisms (SNPs) to represent the
whole large SNP set, has played an important role in current genomic research. Not only can this cut down the cost of genotyping
by filtering a large number of redundant SNPs, but also it can accelerate the study of genome-wide disease association. In
this paper, we propose a new hybrid method called CMDStagger that combines the ideas of the clustering and the graph algorithm,
to find the minimum set of tagSNPs. The proposed algorithm uses the information of the linkage disequilibrium association
and the haplotype diversity to reduce the information loss in tagSNP selection, and has no limit of block partition. The approach
is tested on eight benchmark datasets from Hapmap and chromosome 5q31. Experimental results show that the algorithm in this
paper can reduce the selection time and obtain fewer tagSNPs with high prediction accuracy. This indicates that the method has
better performance than previous ones.
13.
This paper discusses a novel and effective technique for extracting multiple ellipses from an image, using a genetic algorithm with multiple populations (MPGA). MPGA evolves a number of subpopulations in parallel, each of which is clustered around an actual or perceived ellipse in the target image. The technique uses both evolution and clustering to direct the search for ellipses, full or partial. MPGA is explained in detail, and compared with both the widely used randomized Hough transform (RHT) and the sharing genetic algorithm (SGA). In thorough and fair experimental tests, using both synthetic and real-world images, MPGA exhibits solid advantages over RHT and SGA in terms of accuracy of recognition (even in the presence of noise and/or multiple imperfect ellipses in an image) and speed of computation.
14.
Nowadays, many 3PL providers equip their distribution centers with different facilities, enabling them to specialize in handling certain product types and enhancing their ability to reuse and recycle the waste produced from packaging and repackaging. In practice, this type of problem has attracted much attention from researchers and environmentalists. More importantly, differences in product handling specialty induce different processing efficiency, handling reliability, and costs. The objective of this paper is to propose a modified genetic algorithm to deal with the problem. The new chromosome encoding enhances the searching ability of the genetic algorithm in finding location, allocation, and routing solutions with high handling reliability and recycling ability for the distribution centers. To test the optimization reliability of the modified genetic algorithm, a number of numerical experiments have been carried out. The results demonstrate that the modified algorithm is able to obtain the Pareto solutions under multi-criterion decision making. Meanwhile, the handling reliability and recycling of the distribution centers are increased and the overall performance of the distribution network is improved.
15.
A modified genetic algorithm for distributed scheduling problems
H. Z. Jia A. Y. C. Nee J. Y. H. Fuh Y. F. Zhang 《Journal of Intelligent Manufacturing》2003,14(3-4):351-362
Genetic algorithms (GAs) have been widely applied to scheduling and sequencing problems due to their applicability to different domains and their capability of obtaining near-optimal results. Most of the GAs investigated so far concentrate on traditional single-factory or single job-shop scheduling problems. However, with the increasing popularity of distributed, or globalized, production, the previously used GAs need to be explored further in order to deal with the newly emerged distributed scheduling problems. In this paper, a modified GA is presented, which is capable of solving traditional scheduling problems as well as distributed scheduling problems. Various scheduling objectives can be achieved, including minimizing makespan, cost and weighted multiple criteria. The proposed algorithm has been evaluated with satisfactory results on several classical scheduling benchmarks. Furthermore, the capability of the modified GA was also tested on distributed scheduling problems.
16.
A genetic algorithm is a randomized optimization technique that draws its inspiration from the biological sciences. Specifically, it uses the idea that genetics determines the evolution of any species in the natural world. Integer strings are used to encode an optimization problem and these strings are subject to combinatorial operations called reproduction, crossover and mutation, which improve these strings and cause them to ‘evolve’ to an optimal or nearly optimal solution. In this paper, the general machinations of genetic algorithms are described and a performance-enhanced algorithm is proposed for solving the important practical problem of railway scheduling. The problem under consideration involves moving a number of trains carrying mineral deposits across a long haul railway line with both single and double tracks in either direction. Collisions can only be avoided in sections of the line with double tracks. Constraints reflecting practical requirements to reduce environmental impacts from mineral transport, such as avoidance of loaded trains traversing populated areas during certain time slots, have to be satisfied. This is an NP-hard problem, which usually requires enumerative, as opposed to constructive, algorithms. For this reason, an ‘educated’ random search procedure like the genetic algorithm is an alternative and effective technique. The genetic algorithm was given difficult test problems to solve and was able to generate feasible solutions in all cases.
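The three operations named in the abstract above (reproduction, crossover and mutation on integer strings) can be sketched generically as follows. This is not the paper's railway scheduler: the function `evolve`, its parameters, the use of binary tournament selection for reproduction, the elitism step, and the toy objective (maximising the sum of the genes) are all assumptions made for illustration.

```python
import random

def evolve(fitness, length, alphabet, pop_size=30, generations=50,
           p_cross=0.9, p_mut=0.05, seed=0):
    """Minimal generational GA over integer strings (requires length >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.choice(alphabet) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def select(population):
        # Reproduction step, assumed here to be binary tournament selection.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best string over unchanged
        while len(nxt) < pop_size:
            p1, p2 = select(pop), select(pop)
            if rng.random() < p_cross:
                cut = rng.randrange(1, length)   # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: each gene is resampled with probability p_mut.
            child = [rng.choice(alphabet) if rng.random() < p_mut else g
                     for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best

# Toy usage: maximise the sum of a 12-gene string over the alphabet {0, 1, 2, 3}.
best = evolve(fitness=sum, length=12, alphabet=[0, 1, 2, 3])
```

A constrained problem such as the scheduling task described above would additionally need a feasibility check or a penalty term inside `fitness`; that part is problem-specific and is not sketched here.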
17.
Nibaran Das Ram Sarkar Subhadip Basu Mahantapas Kundu Mita Nasipuri Dipak Kumar Basu 《Applied Soft Computing》2012,12(5):1592-1606
Identification of local regions from which optimal discriminating features can be extracted is one of the major tasks in the area of pattern recognition. To locate such regions, different kinds of region sampling techniques are used in the literature. There is no standard methodology to identify such regions exactly. Here we propose a methodology in which local regions of varying heights and widths are created dynamically. A genetic algorithm (GA) is then applied to these local regions to sample the optimal set of local regions from which an optimal feature set with the best discriminating features can be extracted. We have evaluated the proposed methodology on a data set of handwritten Bangla digits. In the present work, we have randomly generated seven sets of local regions, and from every set the GA selects an optimal group of local regions which produces the best recognition performance with a support vector machine (SVM) based classifier. Other popular optimization techniques such as simulated annealing (SA) and hill climbing (HC) have also been evaluated on the same data set, and the maximum recognition accuracies were found to be 97%, 96.7% and 96.7% for GA, SA and HC, respectively. We have also compared the performance of the present technique with those of other zone-based techniques on the same database.
18.
A genetic algorithm (GA) has control parameters that must be determined before execution. We propose a self-organizing genetic
algorithm (SOGA) as a multimodal function optimizer which sets GA parameters such as population size, crossover probability,
and mutation probability adaptively during the execution of a genetic algorithm. In SOGA, GA parameters change according to
the fitnesses of individuals. SOGA and other approaches for adapting operator probabilities in GAs are discussed. The validity
of the proposed algorithm is verified in simulation examples, including system identification.
This work was presented, in part, at the International Symposium on Artificial Life and Robotics, Oita, Japan, February 18–20,
1996
19.
Agostino 《Pattern recognition》2003,36(12):2955-2966
The core of a k-means algorithm is the reallocation phase. A variety of schemes have been suggested for moving entities from one cluster to another and each of them may give a different clustering even though the data set is the same. The present paper describes shortcomings and relative merits of 17 relocation methods in connection with randomly generated data sets.
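As a point of reference for the reallocation phase mentioned above, here is a minimal sketch of one common scheme: a batch pass that assigns every point to its nearest center and then recomputes the centers. It is only one of many possible relocation methods and is not claimed to be among the 17 the paper compares; the function name `reallocate` and the sample data are hypothetical.

```python
import math

def reallocate(points, centers):
    """One batch reallocation pass: nearest-center assignment, then center update."""
    labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
              for p in points]
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new_centers.append(old)  # an empty cluster keeps its previous center
            continue
        dim = len(old)
        new_centers.append(
            tuple(sum(p[d] for p in members) / len(members) for d in range(dim)))
    return labels, new_centers

# Hypothetical data: two obvious groups and two rough starting centers.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels, centers = reallocate(points, [(1, 1), (9, 1)])
```

Schemes that move one entity at a time and update the affected centers immediately can reach a different final partition from the same start, which is the kind of variation the abstract refers to.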
20.
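When the density peaks clustering (CFSFDP) algorithm processes data sets with multiple density peaks, manual selection of the cluster centers easily leads to incorrect partitioning of clusters. To address this problem, an improved density peaks clustering algorithm combined with genetic k-means is proposed. Among the candidate cluster centers obtained by CFSFDP, the global search capability of genetic k-means with variable-length chromosome encoding is used to find the optimal cluster centers automatically, while the crossover probability of genetic k-means is determined adaptively to avoid premature convergence. Experimental results on UCI data sets show that the improved algorithm achieves good clustering quality with fewer iterations, verifying the feasibility and effectiveness of the proposed algorithm.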