Similar literature
 20 similar documents found.
1.
Robust fuzzy clustering of relational data   (cited by 1: self-citations 0, other citations 1)
Popular relational-data clustering algorithms, namely the relational dual of fuzzy c-means (RFCM), non-Euclidean RFCM (NERFCM) (both by Hathaway et al.), and FANNY (by Kaufman and Rousseeuw), are examined. A new algorithm, the fuzzy relational data clustering (FRC) algorithm, is introduced as a generalization of FANNY with the same objective functional as RFCM. However, FRC does not share the restriction of RFCM that the relational data be derived from Euclidean distance as the measure of dissimilarity between objects, nor does it have the limitations of FANNY, including the use of a fixed membership (fuzzifier) exponent m. The FRC algorithm is further improved by incorporating the concept of Dave's object-data noise clustering (NC) algorithm through a proposed notion of noise-dissimilarity. Next, based on constrained minimization, which includes an inequality constraint for the memberships and the corresponding Kuhn-Tucker conditions, a noise-resistant FRC algorithm is derived that works well for all types of non-Euclidean dissimilarity data. It is thus shown that the extra computations for data expansion (the β-spread transformation) required by the NERFCM algorithm are not necessary. This new algorithm is called robust non-Euclidean fuzzy relational data clustering (robust-NE-FRC), and its robustness is demonstrated through several numerical examples. Its advantages are faster convergence, robustness against outliers, and the ability to handle all kinds of relational data, including non-Euclidean data. The paper also presents a new and better interpretation of the noise class.

2.
This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed.

3.
This paper presents new algorithms, fuzzy c-medoids (FCMdd) and robust fuzzy c-medoids (RFCMdd), for fuzzy clustering of relational data. The objective functions are based on selecting c representative objects (medoids) from the data set in such a way that the total fuzzy dissimilarity within each cluster is minimized. A comparison of FCMdd with the well-known relational fuzzy c-means algorithm (RFCM) shows that FCMdd is more efficient. We present several applications of these algorithms to Web mining, including Web document clustering, snippet clustering, and Web access log analysis.
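The medoid-based idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the default fuzzifier m = 2, the stopping rule, and the inverse-dissimilarity membership formula are assumptions chosen for illustration, and the robust variant (RFCMdd) is not covered.

```python
def memberships(R, medoids, m=2.0, eps=1e-12):
    """Fuzzy memberships u[i][j] of object j in the cluster of medoid i.

    R is a square dissimilarity matrix (list of lists); medoids holds
    object indices. Assumed rule: membership is proportional to
    (1 / dissimilarity) ** (1 / (m - 1)), normalised over clusters.
    """
    n = len(R)
    u = [[0.0] * n for _ in medoids]
    for j in range(n):
        # eps guards against division by zero when j is itself a medoid.
        inv = [(1.0 / max(R[v][j], eps)) ** (1.0 / (m - 1.0)) for v in medoids]
        total = sum(inv)
        for i in range(len(medoids)):
            u[i][j] = inv[i] / total
    return u


def fuzzy_c_medoids(R, medoids, m=2.0, max_iter=50):
    """Alternate membership updates and medoid updates until medoids stop changing."""
    medoids = list(medoids)
    n = len(R)
    for _ in range(max_iter):
        u = memberships(R, medoids, m)
        # For each cluster, pick the object with the smallest total fuzzy
        # dissimilarity to all objects, weighted by u ** m.
        new = [
            min(range(n), key=lambda k: sum(u[i][j] ** m * R[k][j] for j in range(n)))
            for i in range(len(medoids))
        ]
        if new == medoids:
            break
        medoids = new
    return medoids, memberships(R, medoids, m)
```

Because only the matrix R is consulted, the sketch works on any pairwise dissimilarities, for example `R = [[abs(a - b) for b in xs] for a in xs]` for a list of numbers `xs`.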

4.
The fuzzy relational classifier (FRC) is a recently proposed two-step nonlinear classifier. First, unsupervised fuzzy c-means (FCM) clustering is performed to explore the underlying groups of the given dataset. Then, a fuzzy relation matrix indicating the relationship between the formed groups and the given classes is constructed for subsequent classification. It has been shown that FRC has two advantages: interpretable classification results and avoidance of overtraining. However, FRC not only lacks robustness, which is very important for a classifier, but also fails on datasets with non-spherical distributions. Moreover, the classification mechanism of FRC is sensitive to improper class labels of the training samples, which leads to a considerable decline in classification performance. The purpose of this paper is to develop a Robust FRC (RFRC) algorithm that overcomes or mitigates all of the above disadvantages of FRC while maintaining its original advantages. In the proposed RFRC algorithm, we replace FCM with our previously proposed robust kernelized FCM (KFCM) to enhance robustness against outliers and suitability for non-spherical data structures. In addition, we incorporate soft class labels into the classification mechanism to improve its performance, especially for datasets containing improper class labels. Experimental results on 2 artificial and 11 real-life benchmark datasets demonstrate that the RFRC algorithm consistently outperforms FRC in classification performance.

5.
This paper deals with the connections between fuzzy set theory and fuzzy relational databases. Our new result on fuzzy relations is how to calculate the greatest lower bound (glb) of two similarity relations. Our main contributions in fuzzy relational databases are establishing from fuzzy set theory what a fuzzy relational database should be (the result is both surprising and elegant), and making fuzzy relational databases even more robust. Our work in fuzzy relations and in fuzzy databases has led us to other interesting problems, two of which we mention in this paper. The first is primarily mathematical, and the second provides yet another connection between fuzzy set theory and artificial intelligence. In understanding similarity relations in terms of other fuzzy relations and in making fuzzy databases more robust, we work with closure and interior operators; we present some important properties of these operators. In establishing the connection between fuzzy set theory and artificial intelligence, we show that an abstraction on a set is in fact a partition on the set; that is, an abstraction defines an equivalence relation on the underlying set.

6.
7.
In this paper, we show how one can take advantage of the stability and effectiveness of object data clustering algorithms when the data to be clustered are available in the form of mutual numerical relationships between pairs of objects. More precisely, we propose a new fuzzy relational algorithm, based on the popular fuzzy C-means (FCM) algorithm, which does not require any particular restriction on the relation matrix. We describe the application of the algorithm to four real and four synthetic data sets, and show that our algorithm performs better than well-known fuzzy relational clustering algorithms on all these sets.

8.
Fuzzy clustering is an important problem which is the subject of active research in several real-world applications. The fuzzy c-means (FCM) algorithm is one of the most popular fuzzy clustering techniques because it is efficient, straightforward, and easy to implement. However, FCM is sensitive to initialization and is easily trapped in local optima. Particle swarm optimization (PSO) is a stochastic global optimization tool which is used in many optimization problems. In this paper, a hybrid fuzzy clustering method based on FCM and fuzzy PSO (FPSO) is proposed which makes use of the merits of both algorithms. Experimental results show that the proposed method is efficient and yields encouraging results.
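For readers unfamiliar with FCM, the following sketch shows one alternating-optimization step of the standard algorithm with Euclidean distance. It covers only the FCM half of the hybrid described above; the PSO component is not sketched, and the function name `fcm_step` and the default fuzzifier m = 2 are assumptions made for illustration.

```python
import math


def fcm_step(points, centers, m=2.0, eps=1e-12):
    """One fuzzy c-means step: update memberships, then update centers.

    points and centers are lists of equal-length numeric tuples.
    Returns (u, new_centers), where u[i][j] is the membership of
    point j in cluster i.
    """
    c, n, dim = len(centers), len(points), len(points[0])
    u = [[0.0] * n for _ in range(c)]
    for j, x in enumerate(points):
        # eps avoids division by zero when a point coincides with a center.
        d = [max(math.dist(x, v), eps) for v in centers]
        for i in range(c):
            u[i][j] = 1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
    new_centers = []
    for i in range(c):
        w = [u[i][j] ** m for j in range(n)]  # fuzzified weights
        total = sum(w)
        new_centers.append(
            tuple(sum(w[j] * points[j][a] for j in range(n)) / total for a in range(dim))
        )
    return u, new_centers
```

Calling `fcm_step` repeatedly from different starting centers can lead to different final partitions, which is the sensitivity to initialization that the abstract cites as the motivation for a global search component.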

9.
10.
In the real world, there exist a lot of fuzzy data which cannot or need not be precisely defined. We distinguish two types of fuzziness: one in an attribute value itself and the other in an association among attribute values. For such fuzzy data, we propose a possibility-distribution-fuzzy-relational model, in which fuzzy data are represented by fuzzy relations whose grades of membership and attribute values are possibility distributions. In this model, the former fuzziness is represented by a possibility distribution and the latter by a grade of membership. Relational algebra for the ordinary relational database as defined by Codd includes the traditional set operations and the special relational operations. These operations are classified into the primitive operations, namely, union, difference, extended Cartesian product, selection and projection, and the additional operations, namely, intersection, join, and division. We define the relational algebra for the possibility-distribution-fuzzy-relational model of fuzzy databases.

11.
This paper aims to propose a fuzzy classifier, which is a one-class-in-one-network structure consisting of multiple novel single-layer perceptrons. Since the output value of each single-layer perceptron can be interpreted as the overall grade of the relationship between the input pattern and one class, the degree of relationship between an attribute of the input pattern and that of this class can be taken into account by establishing a representative pattern for each class. A feature of this paper is that it employs the grey relational analysis to compute the grades of relationship for individual attributes. In particular, instead of using the sigmoid function as the activation function, a non-additive technique, the Choquet integral, is used as an activation function to synthesize the performance values, since an assumption of noninteraction among attributes may not be reasonable. Thus, a single-layer perceptron in the proposed structure performs the synthetic evaluation of the Choquet integral-based grey relational analysis for a pattern. Each connection weight is interpreted as a degree of importance of an attribute and can be determined by a genetic algorithm-based method. The experimental results further demonstrate that the test results of the proposed fuzzy classifier are better than or comparable to those of other fuzzy or non-fuzzy classification methods.

12.
In the present article we review the main research works in fuzzy databases, propose an extension of the relational division operator to fuzzy databases, provide a model for fuzzy information, and resolve the identification problem in fuzzy databases. For this, three notions are relevant: (a) the concept of nuanced information for representing fuzzy values and the associated nuance, (b) the nuanced division operator, and (c) the possibility of weighting attributes in order to express data and query pertinence and trust. We then show how to resolve the problem of fuzzy identification with the nuanced division operator. © 1994 John Wiley & Sons, Inc.

13.
Fuzzy order statistics and their application to fuzzy clustering   (cited by 1: self-citations 0, other citations 1)
The median and the median absolute deviation (MAD) are robust statistics based on order statistics. Order statistics are extended to fuzzy sets to define a fuzzy median and a fuzzy MAD. The fuzzy c-means (FCM) clustering algorithm is defined for any p-norm (pFCM), including the l1-norm (1FCM). The 1FCM clustering algorithm is implemented via the alternating optimization (AO) method, and the clustering centers are shown to be the fuzzy median. The resulting AO-1FCM clustering algorithm is called the fuzzy c-medians (FCMED) clustering algorithm. An example illustrates the robustness of FCMED.
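The link between the l1-norm and a median-type center can be made concrete with a weighted median, which minimizes a weighted sum of absolute deviations. The sketch below is one common realisation and is not taken from the paper: the function name `weighted_median` and the choice of the lower median in case of ties are assumptions, and the paper's own definition of the fuzzy median may differ in detail.

```python
def weighted_median(values, weights):
    """Return an element v of values that minimises sum(w * abs(x - v)).

    The result is the smallest value at which the cumulative weight
    reaches half of the total weight (the lower weighted median).
    In an l1-norm clustering step, the weights would be the fuzzified
    memberships of the points in one cluster, applied per coordinate.
    """
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for x, w in pairs:
        running += w
        if running >= half:
            return x
    raise ValueError("values must be non-empty and weights must be positive")
```

With equal weights, `weighted_median([1, 2, 100], [1, 1, 1])` selects the middle value, whereas the arithmetic mean of the same data is pulled far toward the outlier at 100. This is the kind of robustness the abstract attributes to FCMED.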

14.
A fuzzy clustering problem consists of assigning a set of patterns to a given number of clusters with respect to some criteria such that each pattern may belong to more than one cluster with different degrees of membership. In order to solve it, we first propose a new local search heuristic, called Fuzzy J-Means, where the neighbourhood is defined by all possible centroid-to-pattern relocations. The "integer" solution is then moved to a continuous one by an alternate step, i.e., by finding centroids and membership degrees for all patterns and clusters. To alleviate the difficulty of being stuck in local minima of poor value, this local search is then embedded into the Variable Neighbourhood Search metaheuristic. Results on five standard test problems from the literature are reported and compared with those obtained with the well-known Fuzzy C-Means heuristic. It appears that the proposed methods obtain solutions of substantially better quality than Fuzzy C-Means.

15.
In recent years, the problem of clustering in microarray data has been gaining significant attention. However, most clustering methods attempt to find groups of genes where the number of clusters is known a priori. This fact motivated us to develop a new real-coded improved differential evolution based automatic fuzzy clustering algorithm which automatically evolves the number of clusters as well as the proper partitioning of a gene expression data set. To improve the result further, the clustering method is integrated with a support vector machine (SVM), a well-known technique for supervised learning. A fraction of the gene expression data points, selected from different clusters based on their proximity to the respective centers, is used for training the SVM. The clustering assignments of the remaining gene expression data points are thereafter determined using the trained classifier. The performance of the proposed clustering technique has been demonstrated on five gene expression data sets by comparing it with differential evolution based automatic fuzzy clustering, variable length genetic algorithm based fuzzy clustering, and the well-known Fuzzy C-Means algorithm. A statistical significance test has been carried out to establish the statistical superiority of the proposed clustering approach. A biological significance test has also been carried out using a web based gene annotation tool to show that the proposed method is able to produce biologically relevant clusters of genes. The processed data sets and the MATLAB version of the software are available at http://bio.icm.edu.pl/~darman/IDEAFC-SVM/.

16.
The first stage of organizing objects is to partition them into groups or clusters. The clustering is generally done on individual object data representing the entities, such as feature vectors, or on object relational data incorporated in a proximity matrix. This paper describes another method for finding a fuzzy membership matrix that provides cluster membership values for all the objects based strictly on the proximity matrix. This is generally referred to as relational data clustering. The fuzzy membership matrix is found by first finding a set of vectors that have approximately the same inter-vector Euclidean distances as the proximities that are provided. These vectors can be of very low dimension, such as 5 or less. Fuzzy c-means (FCM) is then applied to these vectors to obtain a fuzzy membership matrix. In addition, two-dimensional vectors are created to provide a visual representation of the proximity matrix. This allows comparison of the result of automatic clustering to visual clustering. The method proposed here is compared to other relational clustering methods, including NERFCM, Roubens' method, and Windham's A-P method. Various clustering quality indices are also calculated for the comparison, using various proximity matrices as input. Simulations show the method to be very effective and no more computationally expensive than other relational data clustering methods. The membership matrices produced by the proposed method are less crisp than those produced by NERFCM and more representative of the proximity matrix used as input to the clustering process.

17.
This paper initially describes the relational counterpart of possibilistic c-means (PCM) algorithm, called relational PCM (or RPCM). RPCM is then improved to better handle arbitrary dissimilarity data. First, a re-scaling of the PCM membership function is proposed in order to obtain zero membership values when the distance to prototype equals the maximum value allowed in bounded dissimilarity measures. Second, a heuristic method of reference distance initialisation is provided which diminishes the known PCM tendency of producing coincident clusters. Finally, RPCM improved with our initialisation strategy is tested on both synthetic and real data sets with satisfactory results.

18.
One of the critical aspects of clustering algorithms is the correct identification of the dissimilarity measure used to drive the partitioning of the data set. The dissimilarity measure induces the cluster shape and therefore determines the success of clustering algorithms. As cluster shapes change from one data set to another, dissimilarity measures should be extracted from data. To this aim, we exploit some pairs of points with known dissimilarity value to teach a dissimilarity relation to a feed-forward neural network. Then, we use the neural dissimilarity measure to guide an unsupervised relational clustering algorithm. Experiments on synthetic data sets and on the Iris data set show that the relational clustering algorithm based on the neural dissimilarity outperforms some popular clustering algorithms (with possible partial supervision) based on spatial dissimilarity.

19.
Fuzzy relational compression   (cited by 3: self-citations 0, other citations 3)
This study concentrates on fuzzy relational calculus regarded as a basis of data compression. In this setting, images are represented as fuzzy relations. We investigate fuzzy relational equations as a basis of image compression. It is shown that both the compression and decompression (reconstruction) phases are closely linked with the way in which fuzzy relational equations are usually set and solved. The theoretical findings encountered in the theory of these equations are easily accommodated as a backbone of the relational compression. The character of the solutions to the equations makes them ideal for reconstruction purposes, as they specify the extremal elements of the solution set and in this way help establish envelopes of the original images under compression. The flexibility of the resulting conceptual and algorithmic framework is also discussed. Numerical examples provide illustrative material emphasizing the main features of the compression mechanisms.

20.
Issam Dagher. Computing, 2011, 92(1): 49-63
A prototype classifier is based on representing every cluster by a prototype. All the input patterns that belong to that cluster will have the same label as the prototype. It should be noted that a prototype does not have to be only one data point. A cluster could be represented by more than one data point. In this paper, the M-dimensional rectangle of the Fuzzy ART is used as a prototype. A new tree clustering structure replaces the training phase of Fuzzy ARTMAP. The obtained clusters are used to form the prototype rectangles. These rectangles are then used in the test phase of the Fuzzy ARTMAP. This algorithm is compared to the Nearest Neighbor classifier, the Fuzzy ARTMAP, C4.5, and the fuzzy ART-Var algorithms for different values of the vigilance parameter. Databases from the UCI repository are used for comparison. Experimental results show the good generalization capability of this new algorithm.
