Similar literature
Found 20 similar documents; search took 15 ms.
1.
A text-independent speaker recognition system based on a wavelet transform derived from fuzzy c-means clustering is proposed. Fuzzy c-means clustering is applied to compress the speaker data in the spectral domain. A set of experiments yields a 95% recognition rate for 100 Mandarin speakers.

2.
Major problems exist in both crisp and fuzzy clustering algorithms. Fuzzy c-means type algorithms use weights determined by a power m of inverse distances that remains fixed over all iterations and over all clusters, even though smaller clusters should have a larger m. Our method uses a different “distance” for each cluster that changes over the early iterations to fit the clusters. Comparisons show improved results. We also address other perplexing problems in clustering: (i) find the optimal number K of clusters; (ii) assess the validity of a given clustering; (iii) prevent the selection of seed vectors as initial prototypes from affecting the clustering; (iv) prevent the order of merging from affecting the clustering; and (v) permit the clusters to form more natural shapes rather than forcing them into normed balls of the distance function. We employ a relatively large number K of uniformly randomly distributed seeds and then thin them to leave fewer uniformly distributed seeds. Next, the main loop iterates by assigning the feature vectors and computing new fuzzy prototypes. Our fuzzy merging then merges any clusters that are too close to each other. We use a modified Xie-Beni validity measure as the goodness-of-clustering measure for multiple values of K in a user-interaction approach where the user selects two parameters (for eliminating clusters and merging clusters after viewing the results thus far). The algorithm is compared with fuzzy c-means on the iris data and on the Wisconsin breast cancer data.
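The "weights determined by a power m of inverse distances" in this abstract can be illustrated with the textbook fuzzy c-means membership rule. The following is a minimal sketch of that standard rule only, not of the modified per-cluster "distance" the abstract proposes; the function name `fcm_memberships` and its parameters are hypothetical.

```python
def fcm_memberships(point, prototypes, m=2.0):
    """Textbook fuzzy c-means memberships of one point to each prototype.

    Illustrative only: this is the fixed-m rule the abstract criticises,
    not the authors' modified method.
    """
    # Squared Euclidean distance from the point to every prototype.
    d2 = [sum((x - c) ** 2 for x, c in zip(point, proto)) for proto in prototypes]
    # A point lying exactly on one or more prototypes belongs fully to them.
    if any(d == 0.0 for d in d2):
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    # Weights are inverse squared distances raised to the power 1/(m - 1).
    power = 1.0 / (m - 1.0)
    weights = [(1.0 / d) ** power for d in d2]
    total = sum(weights)
    # Normalise so the memberships of the point sum to 1.
    return [w / total for w in weights]
```

For a point at the origin and prototypes at distances 1 and 3, m = 2 gives memberships of 0.9 and 0.1, while m = 3 gives 0.75 and 0.25, so a larger m produces a fuzzier assignment.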

3.
In this paper, we propose a context-sensitive technique for unsupervised change detection in multitemporal remote sensing images. The technique is based on a fuzzy clustering approach and accounts for the spatial correlation between neighboring pixels of the difference image produced by comparing two images acquired over the same geographical area at different times. Since the ranges of pixel values of the difference image belonging to the two clusters (changed and unchanged) generally overlap, fuzzy clustering techniques seem to be an appropriate and realistic choice to identify them (as we already know from the pattern recognition literature that fuzzy sets can handle this type of situation very well). Two fuzzy clustering algorithms, namely the fuzzy c-means (FCM) and Gustafson-Kessel clustering (GKC) algorithms, have been used for this task in the proposed work. For clustering, various image features are extracted using the neighborhood information of pixels. FCM and GKC are hybridized with two other optimization techniques, genetic algorithm (GA) and simulated annealing (SA), to further enhance the performance. To show the effectiveness of the proposed technique, experiments are conducted on two multispectral and multitemporal remote sensing images. A fuzzy cluster validity index (Xie-Beni) is used to quantitatively evaluate the performance. Results are compared with those of existing Markov random field (MRF) and neural network based algorithms and found to be superior. The proposed technique is less time consuming and, unlike MRF, does not require any a priori knowledge of the distributions of changed and unchanged pixels.
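For reference, the Xie-Beni validity index named in this abstract is commonly written as below. The notation is an assumption rather than taken from the abstract: N data points x_k, c cluster prototypes v_i, and fuzzy memberships u_ik. Lower values indicate a more compact and better separated partition.

```latex
\[
  \mathrm{XB} \;=\;
  \frac{\displaystyle\sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{2}\,\lVert x_k - v_i \rVert^{2}}
       {\displaystyle N \,\min_{i \neq j}\,\lVert v_i - v_j \rVert^{2}}
\]
```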

4.
In practical cluster analysis tasks, an efficient clustering algorithm should be less sensitive to parameter configurations and tolerate the existence of outliers. Based on the neural gas (NG) network framework, we propose an efficient prototype-based clustering (PBC) algorithm called the enhanced neural gas (ENG) network. Several problems associated with traditional PBC algorithms and the original NG algorithm, such as sensitivity to initialization, sensitivity to input sequence ordering and the adverse influence of outliers, can be effectively tackled in our new scheme. In addition, our new algorithm can establish the topological relationships among the prototypes, and all topologically badly located prototypes can be relocated to represent more meaningful regions. Experimental results on synthetic and UCI datasets show that our algorithm possesses superior performance in comparison to several PBC algorithms and their improved variants, such as hard c-means, fuzzy c-means, NG, fuzzy possibilistic c-means, credibilistic fuzzy c-means, hard/fuzzy robust clustering and alternative hard/fuzzy c-means, in static data clustering tasks with a fixed number of prototypes.

5.
The fuzzy c-means (FCM) algorithm is a widely applied clustering technique, but the implicit assumption that each attribute of the object data has equal importance affects the clustering performance. At present, attribute-weighted fuzzy clustering has become a very active area of research, and numerous approaches that develop numerical weights have been combined into fuzzy clustering. In this paper, interval numbers are introduced for attribute weighting in weighted fuzzy c-means (WFCM) clustering, and it is illustrated that interval weighting can obtain appropriate weights more easily from the viewpoint of geometric probability. Moreover, a genetic heuristic strategy for attribute weight searching is proposed to guide the alternating optimization (AO) of WFCM, so that improved attribute weights in interval-constrained ranges and a reasonable data partition can be obtained simultaneously. The experimental results demonstrate that the proposed algorithm is superior in clustering performance. They reveal that interval-weighted clustering can act as an optimization operator on top of traditional numerically weighted clustering, and that the effects of interval weight perturbation on clustering performance can be decreased.

6.
Since Quandt [The estimation of the parameters of a linear regression system obeying two separate regimes, Journal of the American Statistical Association 53 (1958) 873-880] initiated the research on 2-regressions analysis, switching regression has been widely studied and applied in psychology, economics, social science and music perception. In fuzzy clustering, the fuzzy c-means (FCM) is the most commonly used algorithm. Hathaway and Bezdek [Switching regression models and fuzzy clustering, IEEE Transactions on Fuzzy Systems 1 (1993) 195-204] embedded FCM into switching regression, where it was called fuzzy c-regressions (FCR). However, FCR depends heavily on initial values. In this paper, we propose a mountain c-regressions (MCR) method for solving the initial-value problem. First, we perform a data transformation on the switching regression data set, and then apply the modified mountain clustering to the transformed data to extract c cluster centers. These extracted c cluster centers in the transformed space correspond to c regression models in the original data set. The proposed MCR method can form well-estimated c regression models for switching regression data sets. According to the properties of the transformation, the proposed MCR is also robust to noise and outliers. Several examples show the effectiveness and superiority of our proposed method.

7.
Incomplete data are often encountered in data sets used in clustering problems, and inappropriate treatment of incomplete data can significantly degrade the clustering performance. In view of the uncertainty of missing attributes, we put forward an interval representation of missing attributes based on nearest-neighbor information, named the nearest-neighbor interval, and present a hybrid approach utilizing a genetic algorithm and fuzzy c-means for incomplete data clustering. The overall algorithm works within the genetic algorithm framework: it searches for appropriate imputations of missing attributes in the corresponding nearest-neighbor intervals to recover the incomplete data set, and hybridizes fuzzy c-means to perform clustering analysis and simultaneously provide a fitness metric for the genetic optimization. Experimental results on a set of real-life data sets are presented to demonstrate the better clustering performance of our hybrid approach over the compared methods.

8.
This article presents a multi-objective genetic algorithm for the problem of data clustering. A given dataset is automatically assigned to a number of groups in appropriate fuzzy partitions through the fuzzy c-means method. This work exploits the advantage of fuzzy properties, which provide the capability to handle overlapping clusters. However, most fuzzy methods are based on compactness and/or separation measures which use only centroid information. Calculation from centroid information alone may not be sufficient to differentiate the geometric structures of clusters. The overlap-separation measure, which uses an aggregation operation of fuzzy membership degrees, is better equipped to handle this drawback. As another key consideration, we need a mechanism to identify appropriate fuzzy clusters without prior knowledge of the number of clusters. Given this requirement, optimization with a single criterion may not be feasible for different cluster shapes. A multi-objective genetic algorithm is therefore appropriate to search for fuzzy partitions in this situation. Apart from the overlap-separation measure, the well-known fuzzy Jm index is also optimized through genetic operations. The algorithm simultaneously optimizes the two criteria to search for optimal clustering solutions. A string of real-coded values is encoded to represent cluster centers. Strings of different lengths, varied over a range, correspond to variable numbers of clusters. These real-coded values are optimized, and the Pareto solutions corresponding to a tradeoff between the two objectives are finally produced. As shown in the experiments, the approach provides promising solutions for well-separated, hyperspherical and overlapping clusters from synthetic and real-life data sets. This is demonstrated by comparison with existing single-objective and multi-objective clustering techniques.

9.
Pattern Recognition Letters, 2003, 24(9-10): 1607-1612
To address a defect of the rival checked fuzzy c-means clustering algorithm, a new algorithm, the suppressed fuzzy c-means clustering algorithm, is proposed. The new algorithm overcomes the shortcomings of the original algorithm and establishes a more natural and more reasonable relationship between the hard c-means clustering algorithm and the fuzzy c-means clustering algorithm.

10.
This paper presents adaptive and non-adaptive fuzzy c-means clustering methods for partitioning symbolic interval data. The proposed methods furnish a fuzzy partition and prototype for each cluster by optimizing an adequacy criterion based on suitable squared Euclidean distances between vectors of intervals. Moreover, various cluster interpretation tools are introduced. Experiments with real and synthetic data sets show the usefulness of these fuzzy c-means clustering methods and the merit of the cluster interpretation tools.

11.
Although there has been much research on cluster analysis considering feature (or variable) weights, little effort has been made regarding sample weights in clustering. In practice, not every sample in a data set has the same importance in cluster analysis. Therefore, it is interesting to obtain proper sample weights for clustering a data set. In this paper, we consider a probability distribution over a data set to represent its sample weights. We then apply the maximum entropy principle to automatically compute these sample weights for clustering. Such a method can generate sample-weighted versions of most clustering algorithms, such as k-means, fuzzy c-means (FCM) and expectation-maximization (EM). The proposed sample-weighted clustering algorithms are robust for data sets with noise and outliers. Furthermore, we analyze the convergence properties of the proposed algorithms. This study also uses some numerical data and real data sets for demonstration and comparison. Experimental results and comparisons demonstrate that the proposed sample-weighted clustering algorithms are effective and robust clustering methods.

12.
This paper proposes a novel intuitionistic fuzzy c-least squares support vector regression (IFC-LSSVR) with a Sammon mapping clustering algorithm. Sammon mapping effectively reduces the complexity of raw data, while intuitionistic fuzzy sets (IFSs) can effectively tune the membership of data points, and LSSVR improves the conventional fuzzy c-regression model. The proposed clustering algorithm combines the advantages of IFSs, LSSVR and Sammon mapping for solving actual clustering problems. Moreover, IFC-LSSVR with Sammon mapping adopts particle swarm optimization to obtain optimal parameters. Experiments conducted on a web-based adaptive learning environment and a dataset of wheat varieties demonstrate that the proposed algorithm is more efficient than conventional algorithms, such as the k-means (KM) and fuzzy c-means (FCM) clustering algorithms, on standard measurement indexes. This study thus demonstrates that the proposed model is a credible fuzzy clustering algorithm. The novel method not only contributes to the theoretical aspects of fuzzy clustering, but is also widely applicable in data mining, image systems, rule-based expert systems and prediction problems.

13.
Spectral clustering with fuzzy similarity measure   Cited by: 1 (self-citations: 0, citations by others: 1)
Spectral clustering algorithms have been successfully used in the fields of pattern recognition and computer vision. The most widely used similarity measure for spectral clustering is the Gaussian kernel function, which measures the similarity between data points. However, it is difficult to choose a suitable scaling parameter for the Gaussian kernel similarity measure. In this paper, utilizing the prototypes and partition matrix obtained by the fuzzy c-means clustering algorithm, we develop a fuzzy similarity measure for spectral clustering (FSSC). Furthermore, we introduce the K-nearest neighbor sparse strategy into FSSC and apply the sparse FSSC to texture image segmentation. In our experiments, we first run tests on artificial data to verify the efficiency of the proposed fuzzy similarity measure. Then we analyze the parameter sensitivity of our method. Finally, we take self-tuning spectral clustering and the Nyström method as baselines for comparison, and apply these three methods to synthetic texture and remote sensing image segmentation. The experimental results show that the proposed method is effective and stable.

14.
What are the most relevant factors to be considered by employees when searching for an employer? The answer to this question provides valuable knowledge from the Business Intelligence viewpoint, since it allows companies to retain personnel and attract competent employees. This leads to an increase in sales of their products or services, and therefore helps them remain competitive among similar companies in the market. In this paper we assess the attractiveness of companies in Belgium by using a new two-stage methodology based on Artificial Intelligence techniques. The proposed method allows constructing high-quality prototypes from partial rankings indicating experts’ preferences. More explicitly, in the first step we propose a fuzzy clustering algorithm for partial rankings called fuzzy c-aggregation. This algorithm is based on the well-known fuzzy c-means procedure and uses the Hausdorff distance as the dissimilarity functional and a counting strategy for updating the center of each cluster. However, we cannot ensure the optimality of such prototypes, and therefore more accurate prototypes must be derived. That is why the second step focuses on solving the extended Kemeny ranking problem for each discovered cluster, taking into account the estimated membership matrix. To accomplish that, we adopt an optimization method based on Swarm Intelligence that exploits a colony of artificial ants. Several simulations show the effectiveness of the proposal for the real-world problem under investigation.

15.
In this paper, we define I-fuzzy partitions (also called intuitionistic fuzzy partitions by Atanassov, or interval-valued fuzzy partitions). As our ultimate goal is to compare the results of standard fuzzy clustering algorithms (e.g. fuzzy c-means), we define a method to construct them from a set of fuzzy clusters obtained from several executions of fuzzy c-means. From a practical point of view, the approach presented here tries to solve the difficulty of comparing the results of fuzzy clustering methods and, in particular, the difficulty of finding the global optimum.

16.
The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and are inefficient for solving clustering problems in large datasets. Recently, incremental approaches have been developed to resolve difficulties with the choice of starting points. The global k-means and the modified global k-means algorithms are based on such an approach. They iteratively add one cluster center at a time. Numerical experiments show that these algorithms considerably improve on the k-means algorithm. However, they require storing the whole affinity matrix or computing this matrix at each iteration. This makes both algorithms time consuming and memory demanding for clustering even moderately large datasets. In this paper, a new version of the modified global k-means algorithm is proposed. We introduce an auxiliary cluster function to generate a set of starting points lying in different parts of the dataset. We exploit information gathered in previous iterations of the incremental algorithm to eliminate the need to compute or store the whole affinity matrix, and thereby reduce computational effort and memory usage. Results of numerical experiments on six standard datasets demonstrate that the new algorithm is more efficient than the global and the modified global k-means algorithms.

17.
Approximating clusters in very large (VL=unloadable) data sets has been considered from many angles. The proposed approach has three basic steps: (i) progressive sampling of the VL data, terminated when a sample passes a statistical goodness of fit test; (ii) clustering the sample with a literal (or exact) algorithm; and (iii) non-iterative extension of the literal clusters to the remainder of the data set. Extension accelerates clustering on all (loadable) data sets. More importantly, extension provides feasibility, that is, a way to find (approximate) clusters, for data sets that are too large to be loaded into the primary memory of a single computer. A good generalized sampling and extension scheme should be effective for acceleration and feasibility using any extensible clustering algorithm. A general method for progressive sampling in VL sets of feature vectors is developed, and examples are given that show how to extend the literal fuzzy (c-means) and probabilistic (expectation-maximization) clustering algorithms onto VL data. The fuzzy extension is called the generalized extensible fast fuzzy c-means (geFFCM) algorithm and is illustrated using several experiments with mixtures of five-dimensional normal distributions.

18.
19.
Fuzzy c-means (FCM) algorithms with spatial constraints (FCM_S) have been proven effective for image segmentation. However, they still have the following disadvantages: (1) although introducing local spatial information into the corresponding objective functions makes them less sensitive to noise to some extent, they still lack sufficient robustness to noise and outliers, especially in the absence of prior knowledge of the noise; (2) their objective functions contain a crucial parameter α that balances robustness to noise against preservation of image detail, and it is generally selected through experience; and (3) the segmentation time depends on the image size, so the larger the image, the longer the segmentation takes. In this paper, by incorporating local spatial and gray information together, a novel fast and robust FCM framework for image segmentation, i.e., fast generalized fuzzy c-means (FGFCM) clustering algorithms, is proposed. FGFCM can mitigate the disadvantages of FCM_S and at the same time enhance the clustering performance. Furthermore, FGFCM not only includes many existing algorithms, such as fast FCM and enhanced FCM, as special cases, but can also derive other new algorithms such as FGFCM_S1 and FGFCM_S2 proposed in the rest of this paper. The major characteristics of FGFCM are: (1) it uses a new factor Sij as a local (both spatial and gray) similarity measure, aiming to guarantee both noise immunity and detail preservation for the image while removing the empirically adjusted parameter α; (2) it clusters or segments the image quickly: the segmentation time depends only on the number of gray levels q rather than on the size N (≫ q) of the image, and consequently its computational complexity is reduced from O(NcI1) to O(qcI2), where c is the number of clusters and I1 and I2 are the numbers of iterations in the standard FCM and in the proposed fast segmentation method, respectively. Experiments on synthetic and real-world images show that the FGFCM algorithm is effective and efficient.
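The complexity claim in this abstract, a per-iteration cost that depends on the number of gray levels q rather than the number of pixels N, can be illustrated by running fuzzy c-means on the gray-level histogram. The sketch below shows only that histogram step under simplifying assumptions: it omits the local similarity factor Sij and is not the FGFCM algorithm itself. The name `histogram_fcm` and its parameters are hypothetical.

```python
from collections import Counter

def histogram_fcm(pixels, centers, m=2.0, iterations=50):
    """Fuzzy c-means over distinct gray levels; returns the cluster centers.

    Illustrative only: no spatial or local similarity information is used.
    """
    hist = Counter(pixels)      # gray level -> pixel count, built once in O(N)
    levels = sorted(hist)       # at most q distinct gray levels
    centers = list(centers)
    power = 1.0 / (m - 1.0)
    for _ in range(iterations):
        num = [0.0] * len(centers)
        den = [0.0] * len(centers)
        for g in levels:        # each iteration costs O(q * c), independent of N
            d2 = [(g - v) ** 2 for v in centers]
            if 0.0 in d2:
                # The gray level coincides with a center: full membership there.
                u = [1.0 if d == 0.0 else 0.0 for d in d2]
            else:
                u = [(1.0 / d) ** power for d in d2]
            total = sum(u)
            u = [x / total for x in u]
            for i, ui in enumerate(u):
                # Each gray level is weighted by how many pixels carry it.
                weight = hist[g] * ui ** m
                num[i] += weight * g
                den[i] += weight
        centers = [n / d if d > 0 else v for n, d, v in zip(num, den, centers)]
    return centers
```

Because the histogram is built once, the iteration loop touches each distinct gray level rather than each pixel, which is the source of the O(qcI2) cost quoted in the abstract.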

20.