20 similar documents found (search time: 15 ms)
1.
Clustering attempts to partition a dataset into a meaningful set of mutually exclusive clusters. It is known that sequential clustering algorithms can give optimal partitions when applied to an ordered set of objects. In this technical note, we explore how this approach could be generalized to partition datasets in which there is no natural sequential ordering of the objects. As such, it extends the application of sequential clustering algorithms to all sets of objects.
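To make the statement about ordered sets concrete, here is a minimal sketch of one classical way an optimal sequential partition can be computed: dynamic programming over contiguous segments. It is not taken from the note itself. The function name `optimal_sequential_partition`, the restriction to one-dimensional numeric values, and the cost criterion (total within-cluster sum of squared deviations) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def optimal_sequential_partition(
    values: Sequence[float], k: int
) -> Tuple[float, List[List[float]]]:
    """Split an ordered sequence into k contiguous clusters.

    Assumed criterion: minimise the total within-cluster sum of squared
    deviations from each cluster mean.
    """
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums let us compute the cost of any segment in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def segment_cost(i: int, j: int) -> float:
        """Sum of squared deviations from the mean for values[i:j]."""
        total = prefix[j] - prefix[i]
        total_sq = prefix_sq[j] - prefix_sq[i]
        return total_sq - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: minimum cost of splitting values[:j] into c clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # cut[c][j]: start index of the last cluster in that optimal split.
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cost = best[c - 1][i] + segment_cost(i, j)
                if cost < best[c][j]:
                    best[c][j] = cost
                    cut[c][j] = i

    # Walk the stored cut points backwards to recover the clusters.
    clusters: List[List[float]] = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        clusters.append(list(values[i:j]))
        j = i
    clusters.reverse()
    return best[k][n], clusters
```

Because every cluster must be a contiguous run of the ordering, the search space is small enough for an exact solution in O(k·n²) time. That guarantee is lost when no natural ordering exists, which is the case the note sets out to address.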
2.
A. Verikas K. Malmqvist A. Verikas M. Bacauskiene 《Neural computing & applications》2003,11(3-4):203-209
In this paper, an approach to weighting features for classification based on the nearest-neighbour rules is proposed. The weights are adaptive in the sense that the weight values are different in various regions of the feature space. The values of the weights are found by performing a random search in the weight space. A correct classification rate is the criterion maximised during the search. Experimentally, we have shown that the proposed approach is useful for classification. The weight values obtained during the experiments show that the importance of features may be different in different regions of the feature space.
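The idea of searching the weight space at random while maximising the classification rate can be sketched as follows. This is a deliberately simplified illustration, not the authors' procedure: it learns a single global weight vector rather than region-dependent weights, and the leave-one-out 1-NN scoring, the Gaussian perturbation, and the names `loo_accuracy` and `random_search_weights` are assumptions.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)


def loo_accuracy(data: List[Sample], weights: Sequence[float]) -> float:
    """Leave-one-out accuracy of a 1-NN rule under a weighted squared
    Euclidean distance."""
    correct = 0
    for i, (xi, yi) in enumerate(data):
        best_dist, best_label = float("inf"), None
        for j, (xj, yj) in enumerate(data):
            if i == j:
                continue  # never use a sample as its own neighbour
            dist = sum(w * (a - b) ** 2 for w, a, b in zip(weights, xi, xj))
            if dist < best_dist:
                best_dist, best_label = dist, yj
        correct += int(best_label == yi)
    return correct / len(data)


def random_search_weights(
    data: List[Sample], n_iter: int = 200, step: float = 0.3, seed: int = 0
) -> Tuple[List[float], float]:
    """Hill-climbing random search: perturb the weights and keep a
    candidate whenever it does not lower the classification rate."""
    rng = random.Random(seed)
    n_features = len(data[0][0])
    weights = [1.0] * n_features  # start from the unweighted distance
    score = loo_accuracy(data, weights)
    for _ in range(n_iter):
        # Gaussian perturbation, clipped so weights stay non-negative.
        candidate = [max(0.0, w + rng.gauss(0.0, step)) for w in weights]
        cand_score = loo_accuracy(data, candidate)
        if cand_score >= score:
            weights, score = candidate, cand_score
    return weights, score
```

Calling `random_search_weights(data)` on a list of `(features, label)` pairs returns the best weight vector found and its leave-one-out accuracy. The returned accuracy can never be lower than that of the unweighted distance, since a candidate is only accepted when it scores at least as well as the current weights.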
3.
In this work we present a new hybrid algorithm for feedforward neural networks, which combines unsupervised and supervised
learning. In this approach, we use a Kohonen algorithm with a fuzzy neighborhood for training the weights of the hidden layers
and the gradient descent method for training the weights of the output layer. The goal of this method is to assist the existing
variable learning rate algorithms. Simulation results show the effectiveness of the proposed algorithm compared with other
well-known learning methods.
4.
APSCAN: A parameter free algorithm for clustering (cited by: 1; self-citations: 0; citations by others: 1)
DBSCAN is a density-based clustering algorithm whose effectiveness on spatial datasets has been demonstrated in the existing literature. However, DBSCAN has two distinct drawbacks: (i) clustering performance depends on two specified parameters, the maximum radius of a neighborhood and the minimum number of data points contained in such a neighborhood. Together these two parameters define a single density, and without enough prior knowledge they are difficult to determine; (ii) with these two parameters fixing a single density, DBSCAN does not perform well on datasets with varying densities. These two issues cause difficulties in applications. To address them in a systematic way, in this paper we propose a novel parameter-free clustering algorithm named APSCAN. Firstly, we utilize the Affinity Propagation (AP) algorithm to detect local densities for a dataset and generate a normalized density list. Secondly, we combine the first pair of density parameters with any other pair of density parameters in the normalized density list as input parameters for a proposed DDBSCAN (Double-Density-Based SCAN) to produce a set of clustering results. In this way, we can obtain different clustering results with varying density parameters derived from the normalized density list. Thirdly, we develop an update rule for the results obtained by running DDBSCAN with different input parameters and then synthesize these clustering results into a final result. The proposed APSCAN has two advantages: first, it does not need the two parameters required by DBSCAN to be predefined; second, it can not only cluster datasets with varying densities but also preserve the nonlinear data structure of such datasets.
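The two DBSCAN parameters discussed above can be seen in a minimal sketch of plain DBSCAN. This shows only the baseline algorithm, not APSCAN or DDBSCAN. The parameter names `eps` (neighborhood radius) and `min_pts` (minimum number of points in a neighborhood, counting the point itself) are conventional choices assumed here, and the brute-force neighbor search is for brevity only.

```python
import math
from typing import List, Optional, Sequence

NOISE = -1


def dbscan(points: Sequence[Sequence[float]], eps: float, min_pts: int) -> List[int]:
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels: List[Optional[int]] = [None] * len(points)

    def neighbours(i: int) -> List[int]:
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is also a core point: keep expanding
    return [NOISE if label is None else label for label in labels]
```

Since `eps` and `min_pts` are fixed for the whole run, every cluster is judged against the same density threshold. That is why a single parameter pair tends to either merge dense groups or discard sparse ones when densities vary across the dataset.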
5.
We previously developed a clustering and classification algorithm—supervised (CCAS) to learn patterns of normal and intrusive
activities and to classify observed system activities. Here we further enhance the robustness of CCAS to the presentation
order of training data and to noise in the training data. This robust CCAS adds data redistribution, a supervised hierarchical
grouping of clusters and removal of outliers as the postprocessing steps.
6.
A hybrid neural network classifier combining ordered fuzzy ARTMAP and the dynamic decay adjustment algorithm (cited by: 1; self-citations: 1; citations by others: 0)
Shing Chiang Tan M. V. C. Rao Chee Peng Lim 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2008,12(8):765-775
This paper presents a novel conflict-resolving neural network classifier that combines the ordering algorithm, fuzzy ARTMAP
(FAM), and the dynamic decay adjustment (DDA) algorithm, into a unified framework. The hybrid classifier, known as Ordered
FAMDDA, applies the DDA algorithm to overcome the limitations of FAM and ordered FAM in achieving a good generalization/performance.
Prior to network learning, the ordering algorithm is first used to identify a fixed order of training patterns. The main aim
is to reduce and/or avoid the formation of overlapping prototypes of different classes in FAM during learning. However, the
effectiveness of the ordering algorithm in resolving overlapping prototypes of different classes is compromised when dealing
with complex datasets. Ordered FAMDDA not only is able to determine a fixed order of training patterns for yielding good generalization,
but also is able to reduce/resolve overlapping regions of different classes in the feature space for minimizing misclassification
during the network learning phase. To illustrate the effectiveness of Ordered FAMDDA, experiments are carried out on a total of
ten benchmark datasets. The results are analyzed and compared with those from FAM and Ordered FAM. The outcomes demonstrate that Ordered
FAMDDA, in general, outperforms FAM and Ordered FAM in tackling pattern classification problems.
7.
The list of documents returned by Internet search engines in response to a query these days can be quite overwhelming. There is an increasing need for organising this information and presenting it in a more compact and efficient manner. This paper describes a method developed for the automatic clustering of World Wide Web documents, according to their relevance to the user’s information needs, by using a hybrid neural network. The objective is to reduce the time and effort the user has to spend to find the information sought after. Clustering documents by features representative of their contents (in this case, key words and phrases) increases the effectiveness and efficiency of the search process. It is shown that a two-dimensional visual presentation of information on retrieved documents, instead of the traditional linear listing, can create a more user-friendly interface between a search engine and the user.
8.
Maurizio Naldi Sancho Salcedo-Sanz Leopoldo Carro-Calvo Luigi Laura Antonio Portilla-Figueras Giuseppe F. Italiano 《Applied Soft Computing》2013,13(11):4303-4319
Network clustering algorithms are typically based only on the topology information of the network. In this paper, we introduce traffic as a quantity representing the intensity of the relationship among nodes in the network, regardless of their connectivity, and propose an evolutionary clustering algorithm, based on the application of genetic operators and capable of exploiting the traffic information. In a comparative evaluation based on synthetic instances and two real world datasets, we show that our approach outperforms a selection of well established evolutionary and non-evolutionary clustering algorithms.
9.
Applying graph theory to clustering, we propose a partitional clustering method and a clustering tendency index. The method requires no initial assumptions about the data set. The number of clusters and the partition that best fits the data set are selected according to the optimal clustering tendency index value.
10.
Cao D. Nguyen 《Information Sciences》2008,178(22):4205-4227
We introduce a novel clustering algorithm named GAKREM (Genetic Algorithm K-means Logarithmic Regression Expectation Maximization) that combines the best characteristics of the K-means and EM algorithms but avoids their weaknesses such as the need to specify a priori the number of clusters, termination in local optima, and lengthy computations. To achieve these goals, genetic algorithms for estimating parameters and initializing starting points for the EM are used first. Second, the log-likelihood of each configuration of parameters and the number of clusters resulting from the EM is used as the fitness value for each chromosome in the population. The novelty of GAKREM is that in each evolving generation it efficiently approximates the log-likelihood for each chromosome using logarithmic regression instead of running the conventional EM algorithm until its convergence. Another novelty is the use of K-means to initially assign data points to clusters. The algorithm is evaluated by comparing its performance with the conventional EM algorithm, the K-means algorithm, and the likelihood cross-validation technique on several datasets.
11.
Vladimir J. Lumelsky 《Pattern recognition》1982,15(2):53-60
One problem in clustering (classification) analysis relates to whether or not the original variables should be transformed in some way before they are used by the clustering algorithm. More often than not, the original variables do require some transformation. The purpose of the transformation may be a desire to have more compact clusters in the space of the transformed variables, to take into account the different nature and/or units of the variables involved, to allow for the different or equal ‘importance’ of different variables, to minimize the number of variables used, etc. Among the linear transformations of variables we distinguish two groups: those which change only the scales of the variables (they are often called weighting procedures), and those which also rotate the space of variables (a good example would be the method of principal components(1)). This paper addresses the former group of transformations. One strong reason for using the weighted variables (as opposed to their linear combinations) is that when using them one can interpret the results of the classification in terms of the original (physical) variables. Unfortunately, weighting the variables can result in ‘spoiling’ the compactness of the clusters in the space of the weighted variables if the weighting procedure being used ‘does not care’ about the results of clustering (in other words if the weighting is done prior to and independently of the clustering). A method of weighting the variables which is a part of the classification procedure and thus guarantees an improvement of the cluster clarity is suggested in this paper. The weights of variables and the clusters of objects produced by the algorithm correspond to a local minimum of some classification criterion. Because of this, the resultant weights can be interpreted as a measure of ‘importance’ of the variables for the classification purpose. These weights are compared with such popular weighting procedures as equal variance(6) and Mahalanobis distance(7) methods. Two examples of the performance of the algorithm are presented.
12.
Mao-Zu Guo Jun Wang Chun-yu Wang Yang Liu 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2009,13(12):1143-1151
TagSNP selection, which aims to select a small subset of informative single nucleotide polymorphisms (SNPs) to represent the
whole large SNP set, has played an important role in current genomic research. Not only can this cut down the cost of genotyping
by filtering a large number of redundant SNPs, but also it can accelerate the study of genome-wide disease association. In
this paper, we propose a new hybrid method called CMDStagger that combines the ideas of the clustering and the graph algorithm,
to find the minimum set of tagSNPs. The proposed algorithm uses the information of the linkage disequilibrium association
and the haplotype diversity to reduce the information loss in tagSNP selection, and has no limit of block partition. The approach
is tested on eight benchmark datasets from Hapmap and chromosome 5q31. Experimental results show that the algorithm in this
paper can reduce the selection time and obtain fewer tagSNPs with high prediction accuracy. This indicates that the method
performs better than previous ones.
13.
This paper presents a hybrid efficient genetic algorithm (EGA) for the stochastic competitive Hopfield (SCH) neural network, which is named SCH–EGA. This approach aims to tackle the frequency assignment problem (FAP). The objective of the FAP in satellite communication systems is to minimize the co-channel interference between satellite communication systems by rearranging the frequency assignment so that they can accommodate increasing demands. Our hybrid algorithm involves a stochastic competitive Hopfield neural network (SCHNN) which manages the problem constraints, while a genetic algorithm searches for high quality solutions with the minimum possible cost. Our hybrid algorithm, reflecting a special type of algorithm hybridization, has good adaptability: it can not only deal with the FAP, but also cope with other problems including clustering, classification, and the maximum clique problem. In this paper, we first propose five optimal strategies to build an efficient genetic algorithm. Then we explore three hybridizations between SCHNN and EGA to discover the best hybrid algorithm. We believe that the comparison can also be helpful for hybridizations between neural networks and other evolutionary algorithms such as the particle swarm optimization algorithm, the artificial bee colony algorithm, etc. In the experiments, our hybrid algorithm obtains performance better than or comparable to that of other algorithms on 5 benchmark problems and 12 large randomly generated problems. Finally, we show that our hybrid algorithm can obtain good results with a small population size.
14.
Rule learning is one of the most common tasks in knowledge discovery. In this paper, we investigate the induction of fuzzy classification rules for data mining purposes, and propose a hybrid genetic algorithm for learning approximate fuzzy rules. A novel niching method is employed to promote coevolution within the population, which enables the algorithm to discover multiple rules by means of a coevolutionary scheme in a single run. In order to improve the quality of the learned rules, a local search method was devised to perform fine-tuning on the offspring generated by genetic operators in each generation. After the GA terminates, a fuzzy classifier is built by extracting a rule set from the final population. The proposed algorithm was tested on datasets from the UCI repository, and the experimental results verify its validity in learning rule sets and comparative advantage over conventional methods.
15.
This paper introduces a novel neurofuzzy system based on polynomial fuzzy neural network (PFNN) architecture. A PFNN consists
of a set of if-then rules with appropriate membership functions (MFs) whose parameters are optimized via a hybrid genetic
algorithm. A polynomial neural network is employed in the defuzzification scheme to improve output performance and to select
appropriate rules. A performance criterion for model selection is defined to overcome the overfitting problem in the modeling
procedure. For a performance assessment of the PFNN inference system, two well-known problems are employed for a comparison
with other methods. The results of these comparisons show that the PFNN inference system out-performs the other methods and
exhibits robustness characteristics.
This work was presented in part at the Fourth International Symposium on Artificial Life and Robotics, Oita, Japan, January
19–22, 1999.
16.
A neural network model and algorithm for the hybrid flow shop scheduling problem in a dynamic environment (cited by: 8; self-citations: 0; citations by others: 8)
A hybrid flow shop (HFS) is a generalized flow shop with multiple machines in some stages. HFS is fairly common in flexible manufacturing and in process industry. Because manufacturing systems often operate in a stochastic and dynamic environment, dynamic hybrid flow shop scheduling is frequently encountered in practice. This paper proposes a neural network model and algorithm to solve the dynamic hybrid flow shop scheduling problem. In order to obtain training examples for the neural network, we first study, through simulation, the performance of some dispatching rules that have demonstrated effectiveness in the previous related research. The results are then transformed into training examples. The training process is optimized by the delta-bar-delta (DBD) method that can speed up training convergence. The most commonly used dispatching rules are used as benchmarks. Simulation results show that the performance of the neural network approach is much better than that of the traditional dispatching rules. This revised version was published in June 2005 with corrected page numbers.
17.
Javad Haddadnia 《Pattern recognition》2003,36(5):1187-1202
This paper presents a fuzzy hybrid learning algorithm (FHLA) for the radial basis function neural network (RBFNN). The method determines the number of hidden neurons in the RBFNN structure by using cluster validity indices with majority rule while the characteristics of the hidden neurons are initialized based on advanced fuzzy clustering. The FHLA combines the gradient method and the linear least-squares method for adjusting the RBF parameters and the neural network connection weights. The RBFNN with the proposed FHLA is used as a classifier in a face recognition system. The inputs to the RBFNN are the feature vectors obtained by combining shape information and principal component analysis. The designed RBFNN with the proposed FHLA, while providing a faster convergence in the training phase, requires a hidden layer with fewer neurons and is less sensitive to the training and testing patterns. The efficiency of the proposed method is demonstrated on the ORL and Yale face databases, and comparison with other algorithms indicates that the FHLA yields an excellent recognition rate in human face recognition.
18.
We propose a new technique for the identification of discrete-time hybrid systems in the piecewise affine (PWA) form. This problem can be formulated as the reconstruction of a possibly discontinuous PWA map with a multi-dimensional domain. In order to achieve our goal, we provide an algorithm that exploits the combined use of clustering, linear identification, and pattern recognition techniques. This makes it possible to identify both the affine submodels and the polyhedral partition of the domain on which each submodel is valid, avoiding gridding procedures. Moreover, the clustering step (used for classifying the datapoints) is performed in a suitably defined feature space, which also makes it possible to reconstruct different submodels that share the same coefficients but are defined on different regions. Measures of confidence on the samples are introduced and exploited in order to improve the performance of both the clustering and the final linear regression procedure.
19.
The statistical properties of training, validation and test data play an important role in assuring optimal performance in artificial neural networks (ANNs). Researchers have proposed optimized data partitioning (ODP) and stratified data partitioning (SDP) methods to partition input data into training, validation and test datasets. ODP methods based on genetic algorithm (GA) are computationally expensive as the random search space can be in the power of twenty or more for an average sized dataset. For SDP methods, clustering algorithms such as self organizing map (SOM) and fuzzy clustering (FC) are used to form strata. It is assumed that data points in any individual stratum are in close statistical agreement. Reported clustering algorithms are designed to form natural clusters. In the case of large multivariate datasets, some of these natural clusters can be so big that the furthest data vectors are statistically far away from the mean. Further, these algorithms are computationally expensive as well. We propose a custom design clustering algorithm (CDCA) to overcome these shortcomings. Comparisons are made using three benchmark case studies, one each from classification, function approximation and prediction domains. The proposed CDCA data partitioning method is evaluated in comparison with SOM, FC and GA based data partitioning methods. It is found that the CDCA data partitioning method not only performs well but also reduces the average CPU time.
20.
Constrained optimization based on hybrid evolutionary algorithm and adaptive constraint-handling technique (cited by: 3; self-citations: 1; citations by others: 2)
Yong Wang Zixing Cai Yuren Zhou Zhun Fan 《Structural and Multidisciplinary Optimization》2009,37(4):395-413
A novel approach to deal with numerical and engineering constrained optimization problems, which incorporates a hybrid evolutionary
algorithm and an adaptive constraint-handling technique, is presented in this paper. The hybrid evolutionary algorithm simultaneously
uses simplex crossover and two mutation operators to generate the offspring population. Additionally, the adaptive constraint-handling
technique consists of three main situations. In detail, at each situation, one constraint-handling mechanism is designed based
on current population state. Experiments on 13 benchmark test functions and four well-known constrained design problems verify
the effectiveness and efficiency of the proposed method. The experimental results show that integrating the hybrid evolutionary
algorithm with the adaptive constraint-handling technique is beneficial, and the proposed method achieves competitive performance
with respect to some other state-of-the-art approaches in constrained evolutionary optimization.