Similar literature
 20 similar documents found
1.
Clustering analysis is the process of separating data according to some similarity measure. A cluster consists of data that are more similar to each other than to data in other clusters. The similarity of a datum to a certain cluster can be defined as the distance of that datum to the prototype of that cluster. Typically, the prototype of a cluster is a real vector called the center of that cluster. In this paper, the prototype of a cluster is generalized to be a complex vector (complex center). A new distance measure is introduced, together with new formulas for the fuzzy membership and the fuzzy covariance matrix. Cluster validity measures are used to assess the goodness of the partitions obtained with complex centers compared with those obtained with real centers. The validity measures used in this paper are the partition coefficient, classification entropy, partition index, separation index, Xie and Beni’s index, and Dunn’s index. It is shown that clustering with complex prototypes gives better partitions of the data than clustering with real prototypes.
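Two of the validity measures named in this abstract, the partition coefficient and the classification entropy, have simple standard definitions over a fuzzy membership matrix. The following is a minimal sketch of those textbook definitions only; it does not reproduce the complex-prototype method of the paper. The function names and the row-per-datum layout of `u` are assumptions made for illustration.

```python
import math


def partition_coefficient(u):
    """Mean of squared memberships; u[k][i] is the membership of datum k in cluster i.

    Assumes each row of u sums to 1. The value is 1.0 for a crisp partition
    and 1/c for a maximally fuzzy partition into c clusters.
    """
    n = len(u)
    return sum(x * x for row in u for x in row) / n


def classification_entropy(u):
    """Mean entropy of the membership rows; zero memberships contribute nothing."""
    n = len(u)
    return -sum(x * math.log(x) for row in u for x in row if x > 0) / n


# Illustrative membership matrices (hypothetical data, two data points, two clusters).
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]
```

A higher partition coefficient and a lower classification entropy both indicate a crisper partition, which is why the two are usually reported together.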

2.
A new data clustering algorithm, the density-oriented kernelized version of fuzzy c-means with a new distance metric (DKFCM-new), is proposed. It creates noiseless clusters by identifying noise points and assigning them to a separate cluster. In an earlier work, the Density Based Fuzzy C-Means (DOFCM) algorithm with the Euclidean distance metric was proposed, which considered only the distance between the cluster centroid and the data points. In this paper, we try to improve the performance of DOFCM by incorporating a new distance measure that also considers the distance variation within a cluster to regularize the distance between a data point and the cluster centroid. This paper presents the kernel version of the method. Experiments are carried out on two-dimensional synthetic data sets, standard data sets taken from previous papers such as the DUNN and Bensaid data sets, and real-life high-dimensional data sets such as the Wisconsin Breast Cancer and Iris data. The proposed method is compared with other kernel methods, with noise-resistant methods such as PCM, PFCM, CFCM and NC, and with credal-partition-based clustering methods such as ECM, RECM and CECM. The results show that the proposed algorithm significantly outperforms its earlier version and other competing algorithms.

3.
A cluster validity index for fuzzy clustering (cited by: 1; self-citations: 0; citations by others: 1)
A new cluster validity index is proposed for the validation of partitions of object data produced by the fuzzy c-means algorithm. The proposed validity index uses a variation measure and a separation measure between two fuzzy clusters. A good fuzzy partition is expected to have a low degree of variation and a large separation distance. Testing of the proposed index and nine previously formulated indices on well-known data sets shows the superior effectiveness and reliability of the proposed index in comparison to other indices and the robustness of the proposed index in noisy environments.

4.
This paper proposes a fuzzy clustering-based algorithm for fuzzy modeling. The algorithm combines unsupervised learning with an iterative process in a single framework based on the weighted fuzzy c-means. In the first step, the learning vector quantization (LVQ) algorithm is used as a data pre-processor to group the training data into a number of clusters. Since different clusters may contain different numbers of objects, the centers of these clusters are assigned weight factors whose values are calculated from the respective cluster cardinalities. These centers, together with their weights, are treated as a new data set, which is further elaborated by an iterative process. This process consists of applying the weighted fuzzy c-means and the back-propagation algorithm in sequence. The application of the weighted fuzzy c-means ensures that the contribution of each cluster center to the final fuzzy partition is determined by its cardinality, so that the real data structure can be discovered more easily. The algorithm is successfully applied to three test cases, where the resulting fuzzy models prove to be both very accurate and compact.
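The role of the weights can be illustrated with the center update usually associated with a weighted fuzzy c-means, in which each point contributes in proportion to its weight times its membership raised to the fuzzifier m. This is a sketch under that assumption; the abstract does not give the paper's exact formulas, and the function name and argument layout are hypothetical.

```python
def weighted_center(points, weights, memberships, m=2.0):
    """One cluster-center update for a weighted fuzzy c-means (assumed form).

    points:      list of coordinate tuples (here, the LVQ cluster centers)
    weights:     one non-negative weight per point (e.g. cluster cardinality)
    memberships: membership of each point in the cluster being updated
    m:           fuzzifier, m > 1
    """
    coeffs = [w * (u ** m) for w, u in zip(weights, memberships)]
    total = sum(coeffs)
    dim = len(points[0])
    return [sum(c * p[d] for c, p in zip(coeffs, points)) / total
            for d in range(dim)]


# Hypothetical example: a point standing for 3 objects pulls the center
# three times as strongly as a point standing for 1 object.
center = weighted_center([(0.0, 0.0), (4.0, 0.0)], [3, 1], [1.0, 1.0])
```

With equal weights the update reduces to the ordinary fuzzy c-means center formula.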

5.
The problem of optimal non-hierarchical clustering is addressed. A new algorithm combining differential evolution and k-means is proposed and tested on eight well-known real-world data sets. Two criteria (clustering validity indexes), namely TRW and VCR, were used in the optimization of the classification. The classification of objects to be optimized is encoded by the cluster centers in the differential evolution (DE) algorithm. This encoding raises the problem of rearranging the centers in the population to ensure an efficient search via the evolutionary operators, and a new efficient heuristic for this rearrangement is also proposed. The plain DE variants with and without the rearrangement were compared with the corresponding hybrid k-means variants. The experimental results showed that the hybrid variants with the k-means algorithm are considerably more efficient than the non-hybrid ones. Compared to a standard k-means algorithm with restart, the new hybrid algorithm was found to be more reliable and more efficient, especially in difficult tasks. The results for the TRW and VCR criteria were compared. Both criteria provided the same optimal partitions, and no significant differences were found in the efficiency of the algorithms using these criteria.

6.
Fuzzy c-means (FCM) is an important and popular unsupervised partitioning algorithm used in several application domains such as pattern recognition, machine learning and data mining. Although FCM has shown good performance in detecting clusters, the membership values computed for each individual with respect to each cluster cannot indicate how well the individuals are classified. In this paper, a new approach to handling the memberships, based on the inherent information in each feature, is presented. The algorithm produces a membership matrix for each individual; the membership values lie between zero and one and measure the similarity of this individual to the center of each cluster according to each feature. These values can change at each iteration of the algorithm, and they differ from one feature to another and from one cluster to another in order to increase the performance of the fuzzy c-means clustering algorithm. To obtain a fuzzy partition of the input data set by class, a way to compute the class membership values is also proposed in this work. Experiments with synthetic and real data sets show that the proposed approach produces good-quality clusterings.
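For reference, the membership values of the standard FCM algorithm that this abstract takes as its starting point can be computed as sketched below. This is the classical update with a single membership per point and cluster, not the per-feature membership matrix proposed in the paper; the function name and the handling of a point that coincides with a center are assumptions made for illustration.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard FCM memberships of one point with respect to all cluster centers.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), with Euclidean distances d.
    """
    dists = [math.dist(point, c) for c in centers]
    # Assumed convention: a point lying exactly on one or more centers
    # shares full membership among those centers.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** exponent for dj in dists) for di in dists]


# Hypothetical example: the point is twice as far from the second center.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

The memberships of a point always sum to one, which is the constraint that makes a large membership alone a weak indicator of how well the point is classified.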

7.
This paper presents adaptive and non-adaptive fuzzy c-means clustering methods for partitioning symbolic interval data. The proposed methods furnish a fuzzy partition and prototype for each cluster by optimizing an adequacy criterion based on suitable squared Euclidean distances between vectors of intervals. Moreover, various cluster interpretation tools are introduced. Experiments with real and synthetic data sets show the usefulness of these fuzzy c-means clustering methods and the merit of the cluster interpretation tools.

8.
A new cluster validity index is proposed that determines the optimal partition and optimal number of clusters for fuzzy partitions obtained from the fuzzy c-means algorithm. The proposed validity index exploits an overlap measure and a separation measure between clusters. The overlap measure, which indicates the degree of overlap between fuzzy clusters, is obtained by computing an inter-cluster overlap. The separation measure, which indicates the isolation distance between fuzzy clusters, is obtained by computing a distance between fuzzy clusters. A good fuzzy partition is expected to have a low degree of overlap and a large separation distance. Testing of the proposed index and nine previously formulated indexes on well-known data sets showed the superior effectiveness and reliability of the proposed index in comparison to other indexes.

9.
Pattern Recognition Letters, 2003, 24(9-10): 1607-1612
To address the defects of the rival checked fuzzy c-means clustering algorithm, a new algorithm, the suppressed fuzzy c-means clustering algorithm, is proposed. The new algorithm overcomes the shortcomings of the original algorithm and establishes a more natural and more reasonable relationship between the hard c-means clustering algorithm and the fuzzy c-means clustering algorithm.
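The link between hard and fuzzy c-means can be illustrated with the suppression step as it is commonly described for this algorithm: after an ordinary FCM membership update, the largest membership of each point is rewarded and all others are scaled down by a factor alpha in [0, 1]. The abstract itself does not state the formula, so the sketch below is an assumption about the usual formulation, and the function name is hypothetical.

```python
def suppress(memberships, alpha):
    """Apply a suppression step to the FCM memberships of one data point.

    The winning (largest) membership becomes 1 - alpha + alpha * u_winner;
    every other membership becomes alpha * u. The result still sums to 1.
    """
    winner = max(range(len(memberships)), key=memberships.__getitem__)
    out = [alpha * u for u in memberships]
    out[winner] = 1.0 - alpha + alpha * memberships[winner]
    return out


# Hypothetical memberships of one point in three clusters.
u = [0.6, 0.3, 0.1]
half = suppress(u, 0.5)     # partly suppressed
hard = suppress(u, 0.0)     # behaves like hard c-means
plain = suppress(u, 1.0)    # leaves the FCM memberships unchanged
```

Under this formulation alpha = 0 yields a crisp assignment and alpha = 1 leaves FCM untouched, so intermediate values interpolate between the two algorithms.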

10.
The fuzzy c-means (FCM) algorithm is one of the most popular methods for image segmentation. However, the standard FCM algorithm requires expert users to determine the number of clusters. We therefore propose an automatic fuzzy clustering algorithm (AFCM) that automatically groups the pixels of an image into different homogeneous regions when the number of clusters is not known beforehand. To obtain better segmentation quality, this paper presents an algorithm based on AFCM, called the automatic modified fuzzy c-means cluster segmentation algorithm (AMFCM). AMFCM incorporates spatial information into the membership function used for clustering. The spatial function is the weighted sum of the membership function over the neighborhood of each pixel under consideration. Experimental results show that AMFCM not only estimates the appropriate number of clusters automatically but also achieves better segmentation quality.

11.
The implementation of fuzzy clustering in the design process of vector quantizers faces three challenges. The first is the high computational cost. The second arises because a vector quantizer is required to assign each training sample to only one cluster, yet such an aggressive interpretation of fuzzy clustering results in a crisp partition of inferior quality. The third is the dependence on initialization. In this paper we develop a fuzzy clustering-based vector quantization algorithm that deals with these problems. The algorithm uses a specialized objective function that involves the c-means and the fuzzy c-means along with a competitive agglomeration term. The joint effect is a learning process in which the number of codewords (i.e. cluster centers) affected by a specific training sample gradually decreases, and therefore the number of distance calculations also decreases. Thus, the computational cost becomes smaller. In addition, the partition is smoothly transferred from fuzzy to crisp conditions, and there is no need to employ any aggressive interpretation of fuzzy clustering. The competitive agglomeration term separates large clusters from small and spurious ones. Then, contrary to the classical competitive agglomeration method, we do not discard the small clusters but instead migrate them close to large clusters, rendering them more competitive. Thus, the codeword migration process uses the net effect of the competitive agglomeration and acts to further reduce the dependence on initialization in order to obtain a better local minimum. The algorithm is applied to grayscale image compression.
The main simulation findings can be summarized as follows: (a) a comparison between the proposed method and other related approaches shows its statistically significant superiority, (b) the algorithm is fast, (c) the algorithm is insensitive to its design parameters, and (d) the reconstructed images maintain high quality, which is quantified in terms of the distortion measure.

12.
13.
In this paper, a fuzzy clustering method based on evolutionary programming (EPFCM) is proposed. The algorithm benefits from the global search strategy of evolutionary programming to improve the fuzzy c-means algorithm (FCM). Cluster validity is measured by cluster validity indices. To increase the convergence speed of the algorithm, the modified algorithm changes the number of cluster centers dynamically. Experiments demonstrate that EPFCM can find the proper number of clusters and that the clustering result does not depend critically on the choice of the initial cluster centers. The probability of becoming trapped in a local optimum is much lower than with FCM.

14.
A new clustering method for object data, called ECM (evidential c-means), is introduced in the theoretical framework of belief functions. It is based on the concept of a credal partition, which extends those of hard, fuzzy, and possibilistic partitions. To derive such a structure, a suitable objective function is minimized using an FCM-like algorithm. A validity index allowing the determination of the proper number of clusters is also proposed. Experiments with synthetic and real data sets show that the proposed algorithm can be considered a promising tool in the field of exploratory statistics.

15.
Since Quandt [The estimation of the parameters of a linear regression system obeying two separate regimes, Journal of the American Statistical Association 53 (1958) 873-880] initiated the research on 2-regressions analysis, switching regression has been widely studied and applied in psychology, economics, social science and music perception. In fuzzy clustering, the fuzzy c-means (FCM) is the most commonly used algorithm. Hathaway and Bezdek [Switching regression models and fuzzy clustering, IEEE Transactions on Fuzzy Systems 1 (1993) 195-204] embedded FCM into switching regression, calling the result fuzzy c-regressions (FCR). However, FCR depends heavily on the initial values. In this paper, we propose a mountain c-regressions (MCR) method for solving the initial-value problem. First, we perform a data transformation on the switching regression data set, and then apply the modified mountain clustering to the transformed data to extract c cluster centers. These c cluster centers in the transformed space correspond to c regression models in the original data set. The proposed MCR method can form well-estimated c regression models for switching regression data sets. Owing to the properties of the transformation, the proposed MCR is also robust to noise and outliers. Several examples show the effectiveness and superiority of the proposed method.

16.
This paper first describes the relational counterpart of the possibilistic c-means (PCM) algorithm, called relational PCM (or RPCM). RPCM is then improved to better handle arbitrary dissimilarity data. First, a re-scaling of the PCM membership function is proposed in order to obtain zero membership values when the distance to the prototype equals the maximum value allowed in bounded dissimilarity measures. Second, a heuristic method of reference distance initialisation is provided, which diminishes the known tendency of PCM to produce coincident clusters. Finally, RPCM improved with our initialisation strategy is tested on both synthetic and real data sets with satisfactory results.

17.
This paper presents the idea of clustering resolution. On the basis of this idea, resolution-based fuzzy clustering algorithms are derived, which naturally form a family of clustering algorithms. The c-means and fuzzy c-means algorithms are special cases within this family. As an application to codebook design in image compression based on vector quantization, multiresolution-based fuzzy clustering algorithms are developed, which are superior to conventional algorithms in almost all respects.

18.
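Based on an analysis of the shortcomings of the k-means algorithm, an optimized k-means algorithm based on Ward’s method is proposed. The algorithm first performs a preliminary clustering of the sample data with Ward’s method, from which it determines the initial parameters of the k-means algorithm, such as a suitable number of clusters and the initial cluster centers, and detects and removes outliers; the traditional k-means algorithm is then applied to the data processed in this way. The optimized k-means algorithm is applied to the analysis of criminal personality types, and the experimental results show that both its efficiency and its clustering quality are clearly better than those of the traditional k-means algorithm.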

19.
This paper proposes a novel intuitionistic fuzzy c-least squares support vector regression (IFC-LSSVR) with a Sammon mapping clustering algorithm. Sammon mapping effectively reduces the complexity of raw data, while intuitionistic fuzzy sets (IFSs) can effectively tune the membership of data points, and LSSVR improves the conventional fuzzy c-regression model. The proposed clustering algorithm combines the advantages of IFSs, LSSVR and Sammon mapping for solving actual clustering problems. Moreover, IFC-LSSVR with Sammon mapping adopts particle swarm optimization to obtain optimal parameters. Experiments conducted on a web-based adaptive learning environment and a dataset of wheat varieties demonstrate that the proposed algorithm is more efficient than conventional algorithms, such as the k-means (KM) and fuzzy c-means (FCM) clustering algorithms, in standard measurement indexes. This study thus demonstrates that the proposed model is a credible fuzzy clustering algorithm. The novel method contributes not only to the theoretical aspects of fuzzy clustering, but is also widely applicable in data mining, image systems, rule-based expert systems and prediction problems.

20.
In this paper, we define I-fuzzy partitions (also called intuitionistic fuzzy partitions, following Atanassov, or interval-valued fuzzy partitions). As our ultimate goal is to compare the results of standard fuzzy clustering algorithms (e.g. fuzzy c-means), we define a method to construct them from a set of fuzzy clusters obtained from several executions of fuzzy c-means. From a practical point of view, the approach presented here tries to address the difficulty of comparing the results of fuzzy clustering methods and, in particular, the difficulty of finding the global optimum.
