1.

A genetic algorithm with gene rearrangement for K-means clustering

Dong-Xia Chang Xian-Da Zhang 《Pattern recognition》2009,42(7):1210-1222

In this paper, a new clustering algorithm based on a genetic algorithm (GA) with gene rearrangement (GAGR) is proposed, which in application may effectively remove the degeneracy for the purpose of a more efficient search. A new crossover operator that exploits a measure of similarity between chromosomes in a population is also presented. Adaptive probabilities of crossover and mutation are employed to prevent the convergence of the GAGR to a local optimum. Using real-world data sets, we compare the performance of our GAGR clustering algorithm with the K-means algorithm and other GA methods. An application of the GAGR clustering algorithm in unsupervised classification of multispectral remote sensing images is also provided. Experimental results demonstrate that the GAGR clustering algorithm has high performance, effectiveness and flexibility.
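
The degeneracy mentioned in this abstract can be illustrated with a small sketch. It assumes that a chromosome is a list of K cluster centers, so the same clustering can be encoded in K! different orders, and that rearrangement means reordering one chromosome to line up with a reference before crossover. The function name `rearrange` and the greedy nearest-center matching are illustrative assumptions, not the procedure from the paper.

```python
import math

def rearrange(chromosome, reference):
    """Reorder the centers in `chromosome` so that each one lines up with its
    closest still-unmatched center in `reference`.
    Greedy matching is an assumption made for this sketch."""
    remaining = list(chromosome)
    aligned = []
    for ref_center in reference:
        closest = min(remaining, key=lambda c: math.dist(c, ref_center))
        aligned.append(closest)
        remaining.remove(closest)
    return aligned

reference = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
# Roughly the same three clusters as `reference`, listed in a different order.
permuted = [(9.8, 0.1), (0.2, -0.1), (5.1, 4.9)]
aligned = rearrange(permuted, reference)
print(aligned)
```

After alignment, position i in both parents refers to the same region of the data, so a positional crossover mixes comparable genes.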

2.

A traffic-based evolutionary algorithm for network clustering

Maurizio Naldi Sancho Salcedo-Sanz Leopoldo Carro-Calvo Luigi Laura Antonio Portilla-Figueras Giuseppe F. Italiano 《Applied Soft Computing》2013,13(11):4303-4319

Network clustering algorithms are typically based only on the topology information of the network. In this paper, we introduce traffic as a quantity representing the intensity of the relationship among nodes in the network, regardless of their connectivity, and propose an evolutionary clustering algorithm, based on the application of genetic operators and capable of exploiting the traffic information. In a comparative evaluation based on synthetic instances and two real-world datasets, we show that our approach outperforms a selection of well-established evolutionary and non-evolutionary clustering algorithms.

3.

A genetic clustering algorithm using a message-based similarity measure

Dongxia Chang Yao Zhao Changwen Zheng Xianda Zhang 《Expert systems with applications》2012,39(2):2194-2202

In this paper, a genetic clustering algorithm is described that uses a new similarity measure based on message passing between data points and the candidate centers described by the chromosome. In the new algorithm, a variable-length real-value chromosome representation and a set of problem-specific evolutionary operators are used. Therefore, the proposed GA with message-based similarity (GAMS) clustering algorithm is able to automatically evolve and find the optimal number of clusters as well as proper clusters of the data set. The effectiveness of the GAMS clustering algorithm is demonstrated for both artificial and real-life data sets. Experimental results demonstrate that the GAMS clustering algorithm has high performance, effectiveness and flexibility.

4.

A genetic clustering method for intrusion detection

Yongguo Liu Kefei Chen Wei Zhang 《Pattern recognition》2004,37(5):927-942

Traditional intrusion detection methods lack extensibility in the face of changing network configurations as well as adaptability in the face of unknown attack types. Meanwhile, current machine-learning algorithms need labeled data for training first, so they are computationally expensive and sometimes misled by artificial data. In this paper, a new detection algorithm, the Intrusion Detection Based on Genetic Clustering (IDBGC) algorithm, is proposed. It can automatically establish clusters and detect intruders by labeling normal and abnormal groups. Computer simulations show that this algorithm is effective for intrusion detection.

5.

A robust dynamic niching genetic algorithm with niche migration for automatic clustering problem

Dong-Xia Chang Xian-Da Zhang Dao-Ming Zhang 《Pattern recognition》2010,43(4):1346-1360

In this paper, a genetic clustering algorithm based on dynamic niching with niche migration (DNNM-clustering) is proposed. It is an effective and robust approach to clustering on the basis of a similarity function relating to the approximate density shape estimation. In the new algorithm, a dynamic identification of the niches with niche migration is performed at each generation to automatically evolve the optimal number of clusters as well as the cluster centers of the data set without invoking cluster validity functions. The niches can move slowly under the migration operator, which makes the dynamic niching method independent of the radius of the niches. Compared to other existing methods, the proposed clustering method exhibits the following robust characteristics: (1) robust to the initialization, (2) robust to cluster volumes (ability to detect clusters of different volumes), and (3) robust to noise. Moreover, it does not depend on the radius of the niches and does not need the number of clusters to be pre-specified. Several data sets with widely varying characteristics are used to demonstrate its superiority. An application of the DNNM-clustering algorithm in unsupervised classification of the multispectral remote sensing image is also provided.

6.

A partitional clustering algorithm validated by a clustering tendency index based on graph theory

Helena Brás Silva Paula Brito 《Pattern recognition》2006,39(5):776-788

Applying graph theory to clustering, we propose a partitional clustering method and a clustering tendency index. The method requires no initial assumptions about the data set. The number of clusters and the partition that best fits the data set are selected according to the optimal clustering tendency index value.

7.

A new fuzzy relational clustering algorithm based on the fuzzy C-means algorithm

P. Corsini B. Lazzerini F. Marcelloni 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2005,9(6):439-447

In this paper, we show how one can take advantage of the stability and effectiveness of object data clustering algorithms when the data to be clustered are available in the form of mutual numerical relationships between pairs of objects. More precisely, we propose a new fuzzy relational algorithm, based on the popular fuzzy C-means (FCM) algorithm, which does not require any particular restriction on the relation matrix. We describe the application of the algorithm to four real and four synthetic data sets, and show that our algorithm performs better than well-known fuzzy relational clustering algorithms on all these sets.
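
For readers unfamiliar with the FCM algorithm that this entry builds on, the standard object-data formulation is sketched below. The symbols (data points $x_k$, cluster prototypes $v_i$, memberships $u_{ik}$, fuzzifier $m > 1$, $C$ clusters, $N$ points) are the usual textbook notation and are assumptions here; the relational variant proposed in the paper is not reproduced.

```latex
% Objective minimized by standard FCM, subject to \sum_{i=1}^{C} u_{ik} = 1 for every k
J_m = \sum_{i=1}^{C} \sum_{k=1}^{N} u_{ik}^{m} \, \lVert x_k - v_i \rVert^{2}

% Alternating update of memberships and prototypes
u_{ik} = \left[ \sum_{j=1}^{C} \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_j \rVert} \right)^{\frac{2}{m-1}} \right]^{-1},
\qquad
v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \, x_k}{\sum_{k=1}^{N} u_{ik}^{m}}
```

Both updates need coordinates for $x_k$, which is why a relational setting, where only pairwise relationships are given, calls for a modified algorithm.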

8.

APSCAN: A parameter free algorithm for clustering (total citations: 1; self-citations: 0; citations by others: 1)

Xiaoming Chen Wanquan Liu Huining Qiu Jianhuang Lai 《Pattern recognition letters》2011,32(7):973-986

DBSCAN is a density-based clustering algorithm, and its effectiveness for spatial datasets has been demonstrated in the existing literature. However, DBSCAN has two distinct drawbacks: (i) the clustering performance depends on two specified parameters, namely the maximum radius of a neighborhood and the minimum number of data points contained in such a neighborhood. In fact, these two parameters define a single density, yet without enough prior knowledge they are difficult to determine; (ii) with these two parameters for a single density, DBSCAN does not perform well on datasets with varying densities. These two issues cause difficulties in applications. To address them in a systematic way, in this paper we propose a novel parameter-free clustering algorithm named APSCAN. Firstly, we utilize the Affinity Propagation (AP) algorithm to detect local densities for a dataset and generate a normalized density list. Secondly, we combine the first pair of density parameters with any other pair of density parameters in the normalized density list as input parameters for a proposed DDBSCAN (Double-Density-Based SCAN) to produce a set of clustering results. In this way, we can obtain different clustering results with varying density parameters derived from the normalized density list. Thirdly, we develop an updated rule for the results obtained by implementing the DDBSCAN with different input parameters and then synthesize these clustering results into a final result. The proposed APSCAN has two advantages: first, it does not need to predefine the two parameters required in DBSCAN, and second, it can not only cluster datasets with varying densities but also preserve the nonlinear data structure of such datasets.
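
The point that the two DBSCAN parameters define a single density can be illustrated with a minimal sketch of the core-point test. The names `core_points`, `eps` and `min_pts` are chosen for illustration, and the sketch covers only this one test, not DBSCAN as a whole and not APSCAN.

```python
import math

def core_points(points, eps, min_pts):
    """Return the indices of points that have at least `min_pts` points
    (the point itself included) within distance `eps`."""
    cores = []
    for i, p in enumerate(points):
        neighbors = sum(1 for q in points if math.dist(p, q) <= eps)
        if neighbors >= min_pts:
            cores.append(i)
    return cores

# Two groups of four points each: one tightly packed, one spread out.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
points = dense + sparse

tight = core_points(points, eps=0.2, min_pts=4)  # threshold tuned to the dense group
loose = core_points(points, eps=1.5, min_pts=4)  # threshold tuned to the sparse group
print(tight, loose)
```

With the small radius only the tightly packed group passes the test, and the spread-out group is found only after the radius is enlarged. A single (`eps`, `min_pts`) pair therefore has to compromise when densities vary, which is the drawback the abstract describes.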

9.

An improved genetic k-means algorithm and its application

Huang Song Qiu Jianlin 《Computer Engineering and Design》2020,41(6):1617-1623

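To reduce the influence of the uncertainty in the value of k and the randomness of the initial cluster centers on the clustering results, an improved genetic k-means clustering algorithm is proposed. Parallel computation is used to reduce the influence of k and of the initial cluster centers on the clustering results, a fitness function designed from the average intra-cluster distance and the inter-cluster distance ensures the correctness of the clustering results, and the genetic operators of the genetic algorithm are improved to raise the efficiency of the algorithm. The correctness and effectiveness of the algorithm are verified on UCI benchmark data sets, and the algorithm is applied to the breeding of improved maize varieties. Experimental results show that the algorithm can identify superior maize varieties and guide maize breeding work.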

10.

A k-populations algorithm for clustering categorical data

Dae-Won Kim KiYoung Lee Kwang H. Lee 《Pattern recognition》2005,38(7):1131-1134

In this paper, the conventional k-modes-type algorithms for clustering categorical data are extended by representing the clusters of categorical data with k-populations instead of the hard-type centroids used in the conventional algorithms. Use of a population-based centroid representation makes it possible to preserve the uncertainty inherent in data sets as long as possible before actual decisions are made. The k-populations algorithm was found to give markedly better clustering results through various experiments.

11.

GAKREM: A novel hybrid clustering algorithm

Cao D. Nguyen 《Information Sciences》2008,178(22):4205-4227

We introduce a novel clustering algorithm named GAKREM (Genetic Algorithm K-means Logarithmic Regression Expectation Maximization) that combines the best characteristics of the K-means and EM algorithms but avoids their weaknesses such as the need to specify a priori the number of clusters, termination in local optima, and lengthy computations. To achieve these goals, genetic algorithms for estimating parameters and initializing starting points for the EM are used first. Second, the log-likelihood of each configuration of parameters and the number of clusters resulting from the EM is used as the fitness value for each chromosome in the population. The novelty of GAKREM is that in each evolving generation it efficiently approximates the log-likelihood for each chromosome using logarithmic regression instead of running the conventional EM algorithm until its convergence. Another novelty is the use of K-means to initially assign data points to clusters. The algorithm is evaluated by comparing its performance with the conventional EM algorithm, the K-means algorithm, and the likelihood cross-validation technique on several datasets.

12.

A hybrid clustering and graph based algorithm for tagSNP selection

Mao-Zu Guo Jun Wang Chun-yu Wang Yang Liu 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2009,13(12):1143-1151

TagSNP selection, which aims to select a small subset of informative single nucleotide polymorphisms (SNPs) to represent the whole large SNP set, has played an important role in current genomic research. Not only can this cut down the cost of genotyping by filtering out a large number of redundant SNPs, but it can also accelerate the study of genome-wide disease association. In this paper, we propose a new hybrid method called CMDStagger that combines the ideas of clustering and a graph algorithm to find the minimum set of tagSNPs. The proposed algorithm uses the information of the linkage disequilibrium association and the haplotype diversity to reduce the information loss in tagSNP selection, and places no limit on block partition. The approach is tested on eight benchmark datasets from Hapmap and chromosome 5q31. Experimental results show that the proposed algorithm can reduce the selection time and obtain fewer tagSNPs with high prediction accuracy, which indicates that this method performs better than previous ones.

13.

A multi-population genetic algorithm for robust and fast ellipse detection (total citations: 2; self-citations: 0; citations by others: 2)

Jie Yao Nawwaf Kharma Peter Grogono 《Pattern Analysis & Applications》2005,8(1-2):149-162

This paper discusses a novel and effective technique for extracting multiple ellipses from an image, using a genetic algorithm with multiple populations (MPGA). MPGA evolves a number of subpopulations in parallel, each of which is clustered around an actual or perceived ellipse in the target image. The technique uses both evolution and clustering to direct the search for ellipses, whether full or partial. MPGA is explained in detail, and compared with both the widely used randomized Hough transform (RHT) and the sharing genetic algorithm (SGA). In thorough and fair experimental tests, using both synthetic and real-world images, MPGA exhibits solid advantages over RHT and SGA in terms of accuracy of recognition (even in the presence of noise and/or multiple imperfect ellipses in an image) and speed of computation.

14.

A modified genetic algorithm for maximizing handling reliability and recyclability of distribution centers

S.H. Chung H.K. Chan F.T.S. Chan 《Expert systems with applications》2013,40(18):7588-7595

Nowadays, many 3PL providers equip their distribution centers with different facilities, enabling them to specialize in handling certain product types and enhancing their ability to reuse and recycle the waste produced from packaging and repackaging. In practice, this type of problem has attracted much attention from researchers and environmentalists. More importantly, differences in product-handling specialty lead to different processing efficiency, handling reliability, and costs. The objective of this paper is therefore to propose a modified genetic algorithm to deal with the problem. The new chromosome encoding enhances the searching ability of the genetic algorithm in finding location, allocation, and routing solutions with high handling reliability and recycling ability for the distribution centers. To test the optimization reliability of the modified genetic algorithm, a number of numerical experiments have been carried out. The results demonstrate that the modified algorithm is able to obtain the Pareto solutions under multi-criterion decision making. Meanwhile, the handling reliability and recycling of the distribution centers are increased and the overall performance of the distribution network is improved.

15.

A modified genetic algorithm for distributed scheduling problems (total citations: 8; self-citations: 1; citations by others: 8)

H. Z. Jia A. Y. C. Nee J. Y. H. Fuh Y. F. Zhang 《Journal of Intelligent Manufacturing》2003,14(3-4):351-362

Genetic algorithms (GAs) have been widely applied to scheduling and sequencing problems because of their applicability to different domains and their capability of obtaining near-optimal results. Most of the GAs investigated concentrate on traditional single-factory or single job-shop scheduling problems. However, with the increasing popularity of distributed, or globalized, production, the previously used GAs need to be explored further in order to deal with the newly emerged distributed scheduling problems. In this paper, a modified GA is presented, which is capable of solving traditional scheduling problems as well as distributed scheduling problems. Various scheduling objectives can be achieved, including minimizing makespan, cost and weighted multiple criteria. The proposed algorithm has been evaluated with satisfactory results through several classical scheduling benchmarks. Furthermore, the capability of the modified GA was also tested for handling the distributed scheduling problems.

16.

A genetic algorithm for railway scheduling with environmental considerations

Vivian Salim Xiaoqiang Cai 《Environmental Modelling & Software》1997,12(4):301-309

A genetic algorithm is a randomized optimization technique that draws its inspiration from the biological sciences. Specifically, it uses the idea that genetics determines the evolution of any species in the natural world. Integer strings are used to encode an optimization problem and these strings are subject to combinatorial operations called reproduction, crossover and mutation, which improve these strings and cause them to ‘evolve’ to an optimal or nearly optimal solution. In this paper, the general machinations of genetic algorithms are described and a performance-enhanced algorithm is proposed for solving the important practical problem of railway scheduling. The problem under consideration involves moving a number of trains carrying mineral deposits across a long haul railway line with both single and double tracks in either direction. Collisions can only be avoided in sections of the line with double tracks. Constraints reflecting practical requirements to reduce environmental impacts from mineral transport, such as avoidance of loaded trains traversing populated areas during certain time slots, have to be satisfied. This is an NP-hard problem, which usually requires enumerative, as opposed to constructive, algorithms. For this reason, an ‘educated’ random search procedure like the genetic algorithm is an alternative and effective technique. The genetic algorithm is given difficult test problems to solve and the algorithm was able to generate feasible solutions in all cases.
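
The three operations named in this abstract (reproduction, crossover and mutation on integer strings) can be sketched as follows. This is a generic toy example, not the performance-enhanced railway scheduler from the paper: the function `evolve`, its default parameter values, binary tournament selection and the toy fitness (the sum of the genes) are all assumptions made for illustration.

```python
import random

def evolve(fitness, length, alphabet, pop_size=30, generations=60,
           p_cross=0.8, p_mut=0.05, seed=0):
    """Evolve integer strings of the given length and return the fittest
    string seen. All defaults are illustrative assumptions."""
    rng = random.Random(seed)
    pop = [[rng.choice(alphabet) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def select(population):
        # Reproduction: binary tournament, the fitter of two random strings wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = select(pop), select(pop)
            child = list(p1)
            if rng.random() < p_cross:
                # One-point crossover: head of one parent, tail of the other.
                cut = rng.randrange(1, length)
                child = p1[:cut] + p2[cut:]
            # Mutation: each gene is redrawn from the alphabet with probability p_mut.
            child = [rng.choice(alphabet) if rng.random() < p_mut else g
                     for g in child]
            offspring.append(child)
        pop = offspring
        # Remember the best string seen so far.
        best = max(pop + [best], key=fitness)
    return best

# Toy problem: maximize the sum of eight digits.
best = evolve(fitness=sum, length=8, alphabet=list(range(10)))
print(best, sum(best))
```

A real scheduling problem would replace the toy fitness with a cost that penalizes constraint violations, such as conflicts on single-track sections.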

17.

A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application

Nibaran Das Ram Sarkar Subhadip Basu Mahantapas Kundu Mita Nasipuri Dipak Kumar Basu 《Applied Soft Computing》2012,12(5):1592-1606

Identification of local regions from which optimal discriminating features can be extracted is one of the major tasks in the area of pattern recognition. To locate such regions, different kinds of region sampling techniques are used in the literature, but there is no standard methodology for identifying such regions exactly. Here we propose a methodology in which local regions of varying heights and widths are created dynamically. A genetic algorithm (GA) is then applied to these local regions to sample the optimal set of local regions, from which an optimal feature set with the best discriminating features can be extracted. We have evaluated the proposed methodology on a data set of handwritten Bangla digits. In the present work, we have randomly generated seven sets of local regions, and from every set the GA selects an optimal group of local regions that produces the best recognition performance with a support vector machine (SVM) based classifier. Other popular optimization techniques such as simulated annealing (SA) and hill climbing (HC) have also been evaluated with the same data set, and the maximum recognition accuracies were found to be 97%, 96.7% and 96.7% for GA, SA and HC, respectively. We have also compared the performance of the present technique with those of other zone-based techniques on the same database.

18.

A self-organizing genetic algorithm for multimodal function optimization (total citations: 1; self-citations: 0; citations by others: 1)

Il-Kwon Jeong Ju-Jang Lee 《Artificial Life and Robotics》1998,2(1):48-52

A genetic algorithm (GA) has control parameters that must be determined before execution. We propose a self-organizing genetic algorithm (SOGA) as a multimodal function optimizer which sets GA parameters such as population size, crossover probability, and mutation probability adaptively during the execution of a genetic algorithm. In SOGA, GA parameters change according to the fitnesses of individuals. SOGA and other approaches for adapting operator probabilities in GAs are discussed. The validity of the proposed algorithm is verified in simulation examples, including system identification. This work was presented, in part, at the International Symposium on Artificial Life and Robotics, Oita, Japan, February 18–20, 1996.

19.

A computational study of several relocation methods for k-means algorithms

Agostino 《Pattern recognition》2003,36(12):2955-2966

The core of a k-means algorithm is the reallocation phase. A variety of schemes have been suggested for moving entities from one cluster to another and each of them may give a different clustering even though the data set is the same. The present paper describes shortcomings and relative merits of 17 relocation methods in connection with randomly generated data sets.
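
As a point of reference for the reallocation phase, the sketch below shows one familiar scheme: batch reallocation, in which every entity moves to its nearest centroid and the centroids are then recomputed. It is offered only as an illustration and is not claimed to match any particular one of the 17 methods compared in the paper; the function name `reallocate` and the rule for empty clusters are assumptions.

```python
import math

def reallocate(points, centroids):
    """One batch reallocation pass: assign every point to its nearest
    centroid, then recompute each centroid as the mean of its members."""
    labels = [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new_centroids.append(
                tuple(sum(coord) / len(members) for coord in zip(*members))
            )
        else:
            # An empty cluster keeps its previous centroid (an assumption).
            new_centroids.append(old)
    return labels, new_centroids

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels, centroids = reallocate(points, [(1.0, 1.0), (9.0, 1.0)])
print(labels, centroids)
```

Other schemes move one entity at a time and update the centroids immediately, which is one reason different relocation methods can end in different clusterings of the same data.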

20.

An improved density peaks clustering algorithm combined with genetic k-means

Bu Qiujin Duan Longzhen Duan Wenying 《Computer Engineering and Design》2020,41(4):1012-1016

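When the density peaks clustering (CFSFDP) algorithm processes data sets with multiple density peaks, manual selection of the cluster centers easily leads to incorrect partitioning of clusters. To address this problem, an improved density peaks clustering algorithm combined with genetic k-means is proposed. Among the candidate cluster centers obtained by CFSFDP, the global search capability of genetic k-means with variable-length chromosome encoding is used to search automatically for the optimal cluster centers, while the crossover probability of genetic k-means is determined adaptively to avoid premature convergence. Experimental results on UCI data sets show that the improved algorithm achieves good clustering quality with fewer iterations, which verifies the feasibility and effectiveness of the proposed algorithm.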