Similar literature
 20 similar documents found (search time: 31 ms)
1.
This paper extends the one-class support vector machine (OC-SVM) to an ensemble of soft OC-SVM classifiers. The idea is to first cluster the input data with a kernel version of deterministically annealed fuzzy c-means. The partitioned data are then used to train a number of soft OC-SVM classifiers, which allow a weight to be assigned to each training sample. The weights are obtained from the cluster membership values computed by the kernel fuzzy c-means. The method was designed and tested mostly on image classification and segmentation tasks, although it can be used for other one-class problems.

2.
Fuzzy c-means (FCM) clustering with spatial constraints is considered a suitable algorithm for data clustering and data analysis. However, FCM still lacks robustness to noisy data, because its objective function uses the Euclidean distance to measure the relationship between objects. It is effective only for 'spherical' clusters and may not give reasonable results for spherical data that are not compactly filled, such as annular data. Recognizing these drawbacks of the general fuzzy c-means algorithm, this paper introduces an extended Gaussian version of fuzzy c-means by replacing the Euclidean distance in the original FCM objective function. The paper first proposes an initial kernel version of fuzzy c-means, which aims to simplify the computation, and then extends it to an extended Gaussian kernel version. From the extended Gaussian version it derives an effective method for constructing the membership matrix of the objects and a robust method for updating the centers. Furthermore, the paper proposes a new prototype learning method that obtains the initial cluster centers through a new mathematical initialization for the new objective function, with the aim of reducing the number of iterations needed to obtain a more accurate result. An initial experiment on artificially generated data shows how effectively the proposed Gaussian version of fuzzy c-means finds clusters; the proposed methods are then applied to cluster the Wisconsin breast cancer database into two clusters for the benign and malignant classes. To show the performance of the proposed fuzzy c-means with the new initialization of cluster centers, the results are compared with those of a recent fuzzy c-means algorithm; in addition, the Silhouette method is used to validate the clusters obtained from the breast cancer data sets.

3.
A Possibilistic Fuzzy c-Means Clustering Algorithm   (total citations: 20; self-citations: 0; cited by others: 20)
In 1997, we proposed the fuzzy-possibilistic c-means (FPCM) model and algorithm that generated both membership and typicality values when clustering unlabeled data. FPCM constrains the typicality values so that the sum over all data points of typicalities to a cluster is one. The row sum constraint produces unrealistic typicality values for large data sets. In this paper, we propose a new model called possibilistic-fuzzy c-means (PFCM) model. PFCM produces memberships and possibilities simultaneously, along with the usual point prototypes or cluster centers for each cluster. PFCM is a hybridization of possibilistic c-means (PCM) and fuzzy c-means (FCM) that often avoids various problems of PCM, FCM and FPCM. PFCM solves the noise sensitivity defect of FCM, overcomes the coincident clusters problem of PCM and eliminates the row sum constraints of FPCM. We derive the first-order necessary conditions for extrema of the PFCM objective function, and use them as the basis for a standard alternating optimization approach to finding local minima of the PFCM objective functional. Several numerical examples are given that compare FCM and PCM to PFCM. Our examples show that PFCM compares favorably to both of the previous models. Since PFCM prototypes are less sensitive to outliers and can avoid coincident clusters, PFCM is a strong candidate for fuzzy rule-based system identification.

4.
This study proposes a method based on the fuzzy c-means clustering algorithm for solving a capacitated multi-facility location problem in which known demand points are served from capacitated supply centres. The method integrates fuzzy c-means with convex programming. In fuzzy c-means, data points are allowed to belong to several clusters with different degrees of membership. This feature is used here to split demands between supply centres. The number of clusters is determined incrementally: it starts at two and is fixed once the capacity of each cluster is sufficient for its demand. Finally, each cluster is treated as a single-facility location problem, which is solved by convex programming that optimizes transportation cost in order to fine-tune the facility location. The proposed method is applied to several facility location problems from the OR library (Osman & Christofides, 1994) and compared with centre-of-gravity and particle swarm optimization based algorithms. Numerical results on real-world data from an asphalt producer in Turkey are also reported. The numerical results show that, in terms of transportation costs, the proposed approach performs better than the original fuzzy c-means and than the integrated use of fuzzy c-means and the centre-of-gravity method.

5.
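A support vector machine classifier based on fuzzy partition and neighboring pairs   (total citations: 1; self-citations: 0; cited by others: 1)
Support vector machines are sensitive to noise and outliers. Fuzzy support vector machines were proposed to address this problem, but their fuzzy membership function has to be set manually. This paper proposes a support vector machine classifier based on fuzzy partition and neighboring pairs. In this algorithm, the fuzzy c-means clustering algorithm is first used, guided by cluster validity, to cluster the positive-class and negative-class data in the training set separately. Next, c binary classification problems are constructed from the clustering results and solved to obtain c binary classifiers. Finally, a neighboring-pair strategy is used to classify sample points. Numerical experiments on four well-known data sets show that the algorithm effectively improves prediction accuracy on data sets containing noise and outliers.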

6.
Reducing the time complexity of the fuzzy c-means algorithm   (total citations: 13; self-citations: 0; cited by others: 13)
In this paper, we present an efficient implementation of the fuzzy c-means clustering algorithm. The original algorithm alternates between estimating the centers of the clusters and the fuzzy memberships of the data points. The size of the membership matrix is on the order of the original data set, a prohibitive size if this technique is to be applied to very large data sets with many clusters. Our implementation eliminates the storage of this data structure by combining the two updates into a single update of the cluster centers. This change significantly affects the asymptotic runtime, as the new algorithm is linear with respect to the number of clusters while the original is quadratic. Elimination of the membership matrix also reduces the overhead associated with repeatedly accessing a large data structure. Empirical evidence is presented to quantify the savings achieved by this new method.
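The idea of folding the membership and center updates into one pass can be sketched as follows. This is a minimal illustration on 1-D data, not the authors' implementation: the function name `update_centers`, the fuzzifier default `m=2.0`, and the small distance floor are assumptions made for the example. Each point's memberships are computed on the fly, accumulated into per-cluster running sums, and then discarded, so no n-by-c matrix is ever stored. Computing the normalizer once per point keeps the work per point proportional to the number of clusters.

```python
def update_centers(points, centers, m=2.0):
    """One fuzzy c-means iteration that never stores the membership matrix."""
    c = len(centers)
    num = [0.0] * c  # running sums of u^m * x for each cluster
    den = [0.0] * c  # running sums of u^m for each cluster
    for x in points:
        # Unnormalized memberships of this single point; the floor avoids
        # division by zero when a point coincides with a center.
        inv = [max((x - v) ** 2, 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(inv)  # normalizer, computed once per point
        for i in range(c):
            w = (inv[i] / total) ** m
            num[i] += w * x
            den[i] += w
    return [n / d for n, d in zip(num, den)]


# Hypothetical toy data with two well-separated groups.
centers = [0.0, 10.0]
for _ in range(50):
    centers = update_centers([1.0, 1.2, 0.8, 9.0, 9.3, 8.7], centers)
```

After the loop, `centers` should sit close to the two group means of the toy data.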

7.
Fuzzy functions with support vector machines   (total citations: 1; self-citations: 0; cited by others: 1)
A new fuzzy system modeling (FSM) approach that identifies the fuzzy functions using support vector machines (SVM) is proposed. This new approach is structurally different from the fuzzy rule base approaches and fuzzy regression methods. It is a new alternative to the earlier FSM approaches with fuzzy functions. SVM is applied to determine the support vectors for each fuzzy cluster obtained by the fuzzy c-means (FCM) clustering algorithm. The original input variables, together with the membership values obtained from FCM and their transformations, form a new augmented set of input variables. The performance of the proposed system modeling approach is compared to previous fuzzy functions approaches, standard SVM, and LSE methods using an artificial sparse dataset and a real-life non-sparse dataset. The results indicate that the proposed fuzzy functions with support vector machines approach is a feasible and stable method for regression problems and achieves higher performance than the classical statistical methods.

8.
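This paper introduces a data clustering technique based on fuzzy logic and discusses the fuzzy C-means clustering method. The fuzzy C-means algorithm uses fuzzy logic theory and the idea of clustering to assign each of n samples to one of c classes, so that the similarity among objects assigned to the same cluster is maximized while the similarity between different clusters is minimized.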
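For readers unfamiliar with the procedure this abstract refers to, the standard fuzzy c-means alternation can be sketched as below. This is a minimal sketch on 1-D data under stated assumptions: the function name `fuzzy_c_means`, the evenly spaced initial centers, the fuzzifier `m=2.0`, and the stopping tolerance are all choices made for illustration and do not come from the abstract.

```python
def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6):
    """Minimal fuzzy c-means on 1-D data (illustrative sketch, requires c >= 2)."""
    lo, hi = min(points), max(points)
    # Assumed initialization: centers evenly spaced over the data range.
    centers = [lo + (hi - lo) * i / (c - 1) for i in range(c)]
    u = []
    for _ in range(iters):
        # Membership update: u[k][i] is proportional to
        # (squared distance from point k to center i) ** (-1 / (m - 1)).
        u = []
        for x in points:
            inv = [max((x - v) ** 2, 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
            total = sum(inv)
            u.append([w / total for w in inv])
        # Center update: mean of the data weighted by u ** m.
        new_centers = []
        for i in range(c):
            weights = [row[i] ** m for row in u]
            new_centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


# Hypothetical toy data with two groups, near 1 and near 9.
data = [1.0, 1.2, 0.8, 1.1, 9.0, 9.3, 8.7, 9.1]
centers, memberships = fuzzy_c_means(data, c=2)
```

Each row of `memberships` sums to one, and a crisp assignment to one of the c classes can be read off by taking the cluster with the largest membership in each row.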

9.
In recent years, the problem of clustering microarray data has been gaining significant attention. However, most clustering methods attempt to find groups of genes under the assumption that the number of clusters is known a priori. This fact motivated us to develop a new real-coded, improved differential evolution based automatic fuzzy clustering algorithm that automatically evolves the number of clusters as well as the proper partitioning of a gene expression data set. To improve the result further, the clustering method is integrated with a support vector machine (SVM), a well-known technique for supervised learning. A fraction of the gene expression data points, selected from different clusters based on their proximity to the respective centers, is used for training the SVM. The cluster assignments of the remaining gene expression data points are thereafter determined using the trained classifier. The performance of the proposed clustering technique is demonstrated on five gene expression data sets by comparing it with differential evolution based automatic fuzzy clustering, variable-length genetic algorithm based fuzzy clustering, and the well-known fuzzy c-means algorithm. A statistical significance test has been carried out to establish the statistical superiority of the proposed clustering approach. A biological significance test has also been carried out using a web-based gene annotation tool to show that the proposed method is able to produce biologically relevant clusters of genes. The processed data sets and the MATLAB version of the software are available at http://bio.icm.edu.pl/~darman/IDEAFC-SVM/.

10.
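A possibilistic fuzzy clustering method with uncertain membership   (total citations: 5; self-citations: 0; cited by others: 5)
Fuzzy clustering is an important branch of cluster analysis. The fuzzy C-means clustering algorithm and its improved variants are all clustering methods based on a probabilistic constraint; the form of membership values they adopt reflects the absolute degree of membership of the data set, and this often leads to unsatisfactory clustering results. To address this, the concept of uncertain membership is proposed. On this basis, two criterion parameters based on the relative degree of membership are introduced, a new possibilistic fuzzy clustering algorithm based on uncertain membership is designed, and a concrete implementation of the algorithm is given. The new algorithm introduces into the objective function the possibility and uncertainty of the data set's membership in the clusters during the iteration, which clearly improves the clustering results. Theoretical analysis and experimental results show that the new algorithm achieves higher clustering accuracy than other clustering algorithms.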

11.
The first stage of organizing objects is to partition them into groups or clusters. Clustering is generally done either on individual object data representing the entities, such as feature vectors, or on object relational data incorporated in a proximity matrix. This paper describes another method for finding a fuzzy membership matrix that provides cluster membership values for all the objects based strictly on the proximity matrix. This is generally referred to as relational data clustering. The fuzzy membership matrix is found by first finding a set of vectors whose inter-vector Euclidean distances are approximately the same as the proximities that are provided. These vectors can be of very low dimension, such as 5 or less. Fuzzy c-means (FCM) is then applied to these vectors to obtain a fuzzy membership matrix. In addition, two-dimensional vectors are created to provide a visual representation of the proximity matrix. This allows the result of automatic clustering to be compared with visual clustering. The proposed method is compared to other relational clustering methods, including NERFCM, Roubens' method and Windham's A-P method. Various clustering quality indices are also calculated for the comparison, using various proximity matrices as input. Simulations show the method to be very effective and no more computationally expensive than other relational data clustering methods. The membership matrices produced by the proposed method are less crisp than those produced by NERFCM and more representative of the proximity matrix used as input to the clustering process.

12.
This paper presents a new model of support vector machines (SVMs) that handles data with tolerance and uncertainty. The constraints of the SVM are converted to fuzzy inequalities. Relaxing the constraints further allows us to consider an importance degree for each training sample in the constraints of the SVM. The new method is called relaxed constraints support vector machines (RSVMs). The fuzzy SVM model is also improved with more relaxed constraints; the new model is called fuzzy RSVM. With this method, we are able to consider the importance degree of training samples in both the cost function and the constraints of the SVM simultaneously. In addition, we extend our method to solve one-class classification problems. The effectiveness of the proposed method is demonstrated on artificial and real-life data sets.

13.
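Application of an improved SVM to intrusion detection   (total citations: 2; self-citations: 0; cited by others: 2)
An intrusion detection method based on fuzzy support vector machines is proposed. A fuzzy membership degree is introduced according to the different degrees of influence that input samples have on the classification result, and the principle of the fuzzy support vector machine (FSVM) is discussed. To further improve the classification performance of the support vector machine, the Bagging algorithm is used to build an ensemble of FSVM classifiers. Experimental results show that the proposed method has good detection performance.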

14.
This paper presents a fuzzy clustering algorithm for the extraction of a smooth curve from unordered noisy data. In this method, the input data are first clustered into different regions using the fuzzy c-means algorithm and each region is represented by its cluster center. Neighboring cluster centers are linked to produce a graph according to the average class membership values. Loops in the graph are removed to form a curve according to spatial relations of the cluster centers. The input samples are then reclustered using the fuzzy c-means (FCM) algorithm, with the constraint that the curve must be smooth. The method has been tested with both open and closed curves with good results.

15.
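Zhang Ruiyao, Zhou Ping. Acta Automatica Sinica (《自动化学报》), 2022, 48(9): 2198-2211
To address the monitoring of wastewater treatment processes, which are strongly nonlinear, offer little prior fault knowledge, and make abnormal operating conditions hard to identify, a process monitoring method based on robust weighted fuzzy c-means (RoW-FCM) clustering and kernel partial least squares (KPLS) is proposed. First, to handle the high-dimensional, nonlinear, coupled nature of the wastewater treatment process, KPLS is used to reduce the dimensionality of the high-dimensional input variables. Second, because the traditional fuzzy c-means algorithm based on nearest-neighbor assignment is sensitive to outliers and has difficulty with imbalanced clusters, a robust weighted fuzzy c-means clustering algorithm that fully considers the relationships among samples is proposed. Introducing the possibilistic partition matrix as a weighting parameter allows different samples to be weighted differently, which improves the robustness of clustering in the presence of outliers, while a cluster-size control parameter is introduced to deal with imbalanced clusters. RoW-FCM is then used to cluster the score matrix obtained after KPLS dimensionality reduction, and the resulting membership matrix is used to detect abnormal operating conditions. Finally, a regression model between the membership matrix and the process variables is built, and the resulting variable contribution matrix, which describes how well each variable explains each cluster, is used to identify abnormal operating conditions. Numerical simulations and experiments on wastewater treatment process data show that the method is more robust and performs well in detecting and identifying abnormal operating conditions.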

16.
Generally, abnormal points (noise and outliers) reduce the accuracy of cluster analysis, especially in fuzzy clustering. Such points not only remain in clusters but also pull the centroids away from their true positions. Traditional fuzzy clustering such as Fuzzy C-Means (FCM) always assigns data to all clusters, which is not reasonable in some circumstances. By reformulating the objective function as an exponential equation, the algorithm assigns data to clusters more aggressively. However, noisy data and outliers cannot be properly handled by the clustering process; they are forced into a cluster because of the general probabilistic constraint that the membership degrees of a point across all clusters sum to one. To address this weakness, the possibilistic approach relaxes this condition to improve membership assignment. Nevertheless, possibilistic clustering algorithms generally suffer from coincident clusters because their membership equations ignore the distance to other clusters. Although some possibilistic clustering approaches do not generate coincident clusters, most of them require the right combination of multiple parameters to work. In this paper, we theoretically study Possibilistic Exponential Fuzzy Clustering (PXFCM), which integrates the possibilistic approach with exponential fuzzy clustering. PXFCM has only one parameter and not only partitions the data but also filters noisy data or detects them as outliers. Comprehensive experiments show that PXFCM produces high accuracy in both clustering results and outlier detection without generating coincident clusters.
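The contrast between the probabilistic constraint and its possibilistic relaxation can be illustrated with a short sketch. This is not PXFCM: it uses the standard FCM membership formula and the classical possibilistic c-means typicality formula on 1-D data, and the function names, the scale parameter `gamma`, the centers, and the outlier value are all assumptions chosen for illustration.

```python
def fcm_memberships(x, centers, m=2.0):
    """Probabilistic FCM memberships of one 1-D point; they always sum to 1."""
    inv = [max((x - v) ** 2, 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
    total = sum(inv)
    return [w / total for w in inv]


def pcm_typicalities(x, centers, gamma=1.0, m=2.0):
    """Possibilistic typicalities; each value depends on one cluster only."""
    return [1.0 / (1.0 + ((x - v) ** 2 / gamma) ** (1.0 / (m - 1.0))) for v in centers]


centers = [0.0, 10.0]   # hypothetical cluster centers
outlier = 505.0         # a point far from both clusters
u = fcm_memberships(outlier, centers)
t = pcm_typicalities(outlier, centers)
```

Here `u` comes out close to an even split between the two clusters, because the sum-to-one constraint forces the outlier to belong somewhere, whereas both entries of `t` are close to zero. The typicality of a point for one cluster does not involve the other centers at all, which is the property the abstract links to coincident clusters.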

17.
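Kernel fuzzy C-means clustering (KFCM) uses a kernel function to map data into a high-dimensional space and clusters the data by computing the membership degrees between data points and cluster centers. It is efficient and fast and is therefore widely used in many fields. However, KFCM has two limitations: it is sensitive to the initial values of the cluster centers, and it cannot determine the number of clusters adaptively. To address these two problems, a local-search adaptive kernel fuzzy clustering method is proposed. The method introduces the kernel approach to improve the separability of the data, constructs a kernel-based evaluation function to determine the optimal number of clusters, and performs a local search on a subset of the samples to find the initial cluster centers. Experimental results on artificial data and UCI data sets verify the effectiveness of the algorithm.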

18.
19.
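The fuzzy c-means clustering algorithm is currently one of the most popular algorithms in cluster analysis, but its clustering quality is often affected by its initial parameters. To address this problem, an initialization method for fuzzy c-means clustering based on grids and density is proposed. Grids and density are used to extract approximate cluster centers from the samples, which are then used to initialize the parameters of the fuzzy c-means algorithm, thereby compensating for the weakness of the original algorithm. Experiments show that the method is feasible and effective.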

20.
The suppressed fuzzy c-means clustering algorithm (S-FCM) is one of the most effective fuzzy clustering algorithms. Although S-FCM has some advantages, it also has problems. First, it is unreasonable to forcibly modify the membership degree values of all the data points in every iteration step of S-FCM. Second, because it uses only the spatial information derived from a pixel's neighborhood window to guide image segmentation, S-FCM cannot obtain satisfactory segmentation results on images heavily corrupted by noise. To overcome these drawbacks, this paper proposes an optimal-selection-based suppressed fuzzy c-means clustering algorithm with self-tuning non-local spatial information for image segmentation. First, an optimal-selection-based suppression strategy is presented for modifying the membership degree values of data points. In detail, during each iteration step, all the data points are ranked by their largest membership degree values; the membership degree values of the top r ranked data points are then modified, while those of the other data points are left unchanged. In this paper, the parameter r is determined by the golden section method. Second, a novel gray-level histogram is constructed using the self-tuning non-local spatial information of each pixel, and the fuzzy c-means clustering algorithm with the optimal-selection-based suppression strategy is executed on this histogram. The self-tuning non-local spatial information of a pixel is derived from the pixels whose neighborhood configuration is similar to that of the given pixel, and it preserves more information about the image than the spatial information derived from the pixel's neighborhood window. The method is applied to Berkeley images and other real images heavily contaminated by noise. The image segmentation experiments demonstrate the superiority of the proposed method over other fuzzy algorithms.
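The selective suppression step described in this abstract can be sketched as follows. Several details are assumptions rather than facts from the abstract: the suppression rule used here is the commonly cited S-FCM form (non-winning memberships are scaled by a factor `alpha` and the winner receives the remainder), "top r ranked" is read as the r points with the largest maximum membership, and `r` is passed in as a fixed hypothetical parameter instead of being chosen by the golden section method.

```python
def suppress_top_r(memberships, alpha=0.5, r=1):
    """Apply S-FCM-style suppression only to the r highest-ranked points."""
    # Rank points by their largest membership value, highest first (assumed order).
    order = sorted(range(len(memberships)),
                   key=lambda k: max(memberships[k]), reverse=True)
    result = [list(row) for row in memberships]  # copy; other rows stay unchanged
    for k in order[:r]:
        row = memberships[k]
        winner = row.index(max(row))
        # Scale every membership by alpha, then overwrite the winner with the
        # remainder so that the row still sums to 1.
        result[k] = [alpha * u for u in row]
        result[k][winner] = 1.0 - alpha * (1.0 - row[winner])
    return result


# Hypothetical membership rows for three points and two clusters.
u = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]]
u_new = suppress_top_r(u, alpha=0.5, r=2)
```

With these inputs the first and third rows are sharpened while the ambiguous middle row is left as it was, which matches the abstract's point that not every data point should be modified in each iteration.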
