Similar documents
 Found 20 similar documents (search time: 109 ms)
1.
This paper presents the development of soft clustering and learning vector quantization (LVQ) algorithms that rely on multiple weighted norms to measure the distance between the feature vectors and their prototypes. Clustering and LVQ are formulated in this paper as the minimization of a reformulation function that employs distinct weighted norms to measure the distance between each of the prototypes and the feature vectors under a set of equality constraints imposed on the weight matrices. Fuzzy LVQ and clustering algorithms are obtained as special cases of the proposed formulation. The resulting clustering algorithm is evaluated and benchmarked on three data sets that differ in terms of the data structure and the dimensionality of the feature vectors. This experimental evaluation indicates that the proposed multinorm algorithm outperforms algorithms employing the Euclidean norm as well as existing clustering algorithms employing weighted norms.
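As a rough illustration of what "distinct weighted norms" per prototype can mean, the sketch below computes a squared weighted distance of the form (x - v)^T A (x - v), with a separate matrix A for each prototype. The function name, the example matrices, and the data are assumptions made for illustration; the abstract does not state the paper's exact norm or the equality constraints on the weight matrices.

```python
import numpy as np

def weighted_sq_distance(x, prototype, weight_matrix):
    """Squared weighted norm (x - v)^T A (x - v) for one prototype v."""
    diff = np.asarray(x, dtype=float) - np.asarray(prototype, dtype=float)
    return float(diff @ weight_matrix @ diff)

# Hypothetical data: one feature vector, two prototypes, one weight matrix each.
x = np.array([2.0, 1.0])
prototypes = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
weights = [np.diag([1.0, 4.0]), np.eye(2)]  # the identity recovers the Euclidean norm

distances = [weighted_sq_distance(x, v, A) for v, A in zip(prototypes, weights)]
print(distances)  # [8.0, 5.0]
```

With the identity matrix the distance reduces to the squared Euclidean norm, so the second prototype here acts as a Euclidean baseline.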

2.
Fuzzy algorithms for learning vector quantization   Total citations: 14, self-citations: 0, citations by others: 14
This paper presents the development of fuzzy algorithms for learning vector quantization (FALVQ). These algorithms are derived by minimizing the weighted sum of the squared Euclidean distances between an input vector, which represents a feature vector, and the weight vectors of a competitive learning vector quantization (LVQ) network, which represent the prototypes. This formulation leads to competitive algorithms, which allow each input vector to attract all prototypes. The strength of attraction between each input and the prototypes is determined by a set of membership functions, which can be selected on the basis of specific criteria. A gradient-descent-based learning rule is derived for a general class of admissible membership functions which satisfy certain properties. The FALVQ 1, FALVQ 2, and FALVQ 3 families of algorithms are developed by selecting admissible membership functions with different properties. The proposed algorithms are tested and evaluated using the IRIS data set. The efficiency of the proposed algorithms is also illustrated by their use in codebook design required for image compression based on vector quantization.
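The idea that each input vector attracts all prototypes, with a strength set by a membership function, can be sketched as below. This is not one of the FALVQ 1, 2, or 3 update rules, which the abstract does not spell out. The normalized inverse-squared-distance membership, the function name `soft_lvq_step`, and the learning rate are assumptions chosen only to make the mechanism concrete.

```python
import numpy as np

def soft_lvq_step(x, prototypes, learning_rate=0.1):
    """Move every prototype toward x, scaled by a membership value.

    The membership (normalized inverse squared distance) is a hypothetical
    choice for illustration, not an admissible FALVQ membership function.
    """
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    inv = 1.0 / (d2 + 1e-12)          # small constant guards against division by zero
    memberships = inv / inv.sum()     # memberships sum to 1
    updated = prototypes + learning_rate * memberships[:, None] * (x - prototypes)
    return updated, memberships

# Hypothetical example: the closer prototype receives the larger pull.
protos = np.array([[0.0, 0.0], [4.0, 0.0]])
x = np.array([1.0, 0.0])
updated, memberships = soft_lvq_step(x, protos)
print(memberships)  # approximately [0.9, 0.1]
print(updated)      # approximately [[0.09, 0.0], [3.97, 0.0]]
```

In contrast to a winner-take-all LVQ step, the non-winning prototype also moves, only by a smaller amount.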

3.
An axiomatic approach to soft learning vector quantization and clustering   Total citations: 11, self-citations: 0, citations by others: 11
This paper presents an axiomatic approach to soft learning vector quantization (LVQ) and clustering based on reformulation. The reformulation of the fuzzy c-means (FCM) algorithm provides the basis for reformulating entropy-constrained fuzzy clustering (ECFC) algorithms. According to the proposed approach, the development of specific algorithms reduces to the selection of a generator function. Linear generator functions lead to the FCM and fuzzy learning vector quantization algorithms while exponential generator functions lead to ECFC and entropy-constrained learning vector quantization algorithms. The reformulation of LVQ and clustering algorithms also provides the basis for developing uncertainty measures that can identify feature vectors equidistant from all prototypes. These measures are employed by a procedure developed to make soft LVQ and clustering algorithms capable of identifying outliers in the data set. This procedure is evaluated by testing the algorithms generated by linear and exponential generator functions on speech data.

4.
This paper presents the development and investigates the properties of ordered weighted learning vector quantization (LVQ) and clustering algorithms. These algorithms are developed by using gradient descent to minimize reformulation functions based on aggregation operators. An axiomatic approach provides conditions for selecting aggregation operators that lead to admissible reformulation functions. Minimization of admissible reformulation functions based on ordered weighted aggregation operators produces a family of soft LVQ and clustering algorithms, which includes fuzzy LVQ and clustering algorithms as special cases. The proposed LVQ and clustering algorithms are used to perform segmentation of magnetic resonance (MR) images of the brain. The diagnostic value of the segmented MR images provides the basis for evaluating a variety of ordered weighted LVQ and clustering algorithms.

5.
This paper presents a general methodology for the development of fuzzy algorithms for learning vector quantization (FALVQ). The design of specific FALVQ algorithms according to existing approaches reduces to the selection of the membership function assigned to the weight vectors of an LVQ competitive neural network, which represent the prototypes. The development of a broad variety of FALVQ algorithms can be accomplished by selecting the form of the interference function that determines the effect of the nonwinning prototypes on the attraction between the winning prototype and the input of the network. The proposed methodology provides the basis for extending the existing FALVQ 1, FALVQ 2, and FALVQ 3 families of algorithms. This paper also introduces two quantitative measures which establish a relationship between the formulation that led to FALVQ algorithms and the competition between the prototypes during the learning process. The proposed algorithms and competition measures are tested and evaluated using the IRIS data set. The significance of the proposed competition measure is illustrated using FALVQ algorithms to perform segmentation of magnetic resonance images of the brain.

6.
Generalized clustering networks and Kohonen's self-organizing scheme   Total citations: 7, self-citations: 0, citations by others: 7
The relationship between the sequential hard c-means (SHCM) and learning vector quantization (LVQ) clustering algorithms is discussed. The impact and interaction of these two families of methods with Kohonen's self-organizing feature mapping (SOFM), which is not a clustering method but often lends ideas to clustering algorithms, are considered. A generalization of LVQ that updates all nodes for a given input vector is proposed. The network attempts to find a minimum of a well-defined objective function. The learning rules depend on the degree of distance match to the winner node; the lesser the degree of match with the winner, the greater the impact on nonwinner nodes. Numerical results indicate that the terminal prototypes generated by this modification of LVQ are generally insensitive to initialization and independent of any choice of learning coefficient. The IRIS data collected by E. Anderson (1939) are used to illustrate the proposed method. Results are compared with the standard LVQ approach.

7.
Clustering Incomplete Data Using Kernel-Based Fuzzy C-means Algorithm   Total citations: 3, self-citations: 0, citations by others: 3

8.
In this paper we propose a new metric to replace the Euclidean norm in c-means clustering procedures. On the basis of robust statistics and the influence function, we claim that the proposed new metric is more robust than the Euclidean norm. We then create two new clustering methods called the alternative hard c-means (AHCM) and alternative fuzzy c-means (AFCM) clustering algorithms. These alternative types of c-means clustering are more robust than c-means clustering. Numerical results show that AHCM performs better than HCM and that AFCM performs better than FCM. We recommend AFCM for use in cluster analysis. Recently, the AFCM algorithm has been used successfully to segment ophthalmological magnetic resonance images and differentiate abnormal tissues from normal tissues.

9.
Learning vector quantization with training data selection   Total citations: 2, self-citations: 0, citations by others: 2
In this paper, we propose a method that selects a subset of the training data points to update LVQ prototypes. The main goal is to steer the prototypes toward better locations, reducing misclassification errors. The method selects an update set consisting of points considered to be at risk of being captured by a prototype of another class. We combine the proposed methodology with a weighted norm, instead of the Euclidean norm, in order to establish different levels of relevance for the input attributes. The technique was tested in a controlled experiment and on data sets available on the Web.

10.
In this paper, we introduce new algorithms that perform clustering and feature weighting simultaneously and in an unsupervised manner. The proposed algorithms are computationally and implementationally simple, and learn a different set of feature weights for each identified cluster. The cluster dependent feature weights offer two advantages. First, they guide the clustering process to partition the data set into more meaningful clusters. Second, they can be used in the subsequent steps of a learning system to improve its learning behavior. An extension of the algorithm to deal with an unknown number of clusters is also proposed. The extension is based on competitive agglomeration, whereby the number of clusters is over-specified, and adjacent clusters are allowed to compete for data points in a manner that causes clusters which lose in the competition to gradually become depleted and vanish. We illustrate the performance of the proposed approach by using it to segment color images, and to build a nearest prototype classifier.

11.
This paper introduces concepts and algorithms of feature selection, surveys existing feature selection algorithms for classification and clustering, groups and compares different algorithms with a categorizing framework based on search strategies, evaluation criteria, and data mining tasks, reveals unattempted combinations, and provides guidelines in selecting feature selection algorithms. With the categorizing framework, we continue our efforts toward building an integrated system for intelligent feature selection. A unifying platform is proposed as an intermediate step. An illustrative example is presented to show how existing feature selection algorithms can be integrated into a meta algorithm that can take advantage of individual algorithms. An added advantage of doing so is to help a user employ a suitable algorithm without knowing details of each algorithm. Some real-world applications are included to demonstrate the use of feature selection in data mining. We conclude this work by identifying trends and challenges of feature selection research and development.

12.
Five existing LVQ algorithms are reviewed. The Premature Clustering Phenomenon, which downgrades the performance of LVQ, is explained. By introducing and applying the “equalizing factor” as a remedy for the premature clustering phenomenon, a breakthrough is achieved in improving the performance of the LVQ network, and its performance becomes competitive with that of the best known classifiers. For estimating the equalizing factor, four different formulas are suggested, which result in four different versions of the LVQ4a algorithm. A new weight-updating formula for LVQ is presented, and the LVQ4b algorithm is presented as an implementation of this new weight-updating formula in batch mode training. In addition, four variants of the LVQ4c algorithm are presented as the customized LVQ4b algorithm for pattern mode training. A meticulous analysis of their performances and that of five early training algorithms has been carried out, and they have been compared against each other on 16 databases of the Farsi optical character recognition problem.

13.
Graph-based learning algorithms including label propagation and spectral clustering are known as the effective state-of-the-art algorithms for a variety of tasks in machine learning applications. Given input data, i.e. feature vectors, graph-based methods typically proceed with the following three steps: (1) generating graph edges, (2) estimating edge weights and (3) running a graph based algorithm. The first and second steps are difficult, especially when there are only a few (or no) labeled instances, while they are important because the performance of graph-based methods heavily depends on the quality of the input graph. For the second step of the three-step procedure, we propose a new method, which optimizes edge weights through a local linear reconstruction error minimization under a constraint that edges are parameterized by a similarity function of node pairs. As a result our generated graph can capture the manifold structure of the input data, where each edge represents similarity of each node pair. To further justify this approach, we also provide analytical considerations for our formulation such as an interpretation as a cross-validation of a propagation model in the feature space, and an error analysis based on a low dimensional manifold model. Experimental results demonstrated the effectiveness of our adaptive edge weighting strategy both in synthetic and real datasets.
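The three-step procedure named in this abstract can be sketched as follows. Step (2) here uses a fixed Gaussian kernel as a stand-in; it is not the paper's reconstruction-error-based weight optimization. All function names, the kernel width `sigma`, and the toy data are assumptions for illustration.

```python
import numpy as np

def knn_edges(X, k):
    """Step 1: symmetric k-nearest-neighbor adjacency plus squared distances."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(d2, np.inf)                  # exclude self-loops
    neighbors = np.argsort(d2, axis=1)[:, :k]
    adjacency = np.zeros(d2.shape, dtype=bool)
    rows = np.repeat(np.arange(len(X)), k)
    adjacency[rows, neighbors.ravel()] = True
    return adjacency | adjacency.T, d2

def gaussian_weights(adjacency, d2, sigma=1.0):
    """Step 2 (stand-in): Gaussian kernel weights on existing edges only."""
    safe_d2 = np.where(adjacency, d2, 0.0)        # avoid exp(-inf) on the diagonal
    return np.where(adjacency, np.exp(-safe_d2 / (2.0 * sigma ** 2)), 0.0)

def propagate_labels(W, labels, n_classes, n_iter=50):
    """Step 3: iterative label propagation; labels of -1 mean 'unlabeled'."""
    P = W / W.sum(axis=1, keepdims=True)          # row-normalized transition matrix
    idx = np.flatnonzero(labels >= 0)
    clamp = np.zeros((len(labels), n_classes))
    clamp[idx, labels[idx]] = 1.0
    F = clamp.copy()
    for _ in range(n_iter):
        F = P @ F
        F[idx] = clamp[idx]                       # keep the known labels fixed
    return F.argmax(axis=1)

# Hypothetical toy data: two well-separated groups, one labeled point in each.
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
labels = np.array([0, -1, -1, 1, -1, -1])
adjacency, d2 = knn_edges(X, k=2)
W = gaussian_weights(adjacency, d2)
predicted = propagate_labels(W, labels, n_classes=2)
print(predicted)  # [0 0 0 1 1 1]
```

Because the result depends on the graph built in steps 1 and 2, a poor choice of `k` or `sigma` can connect the two groups and degrade step 3, which is the sensitivity the abstract points to.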

14.
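Building on the PSO clustering algorithm, this paper proposes data clustering based on the quantum-behaved particle swarm optimization (QPSO) algorithm. QPSO has few parameters and strong randomness, and it can cover the entire solution space, which guarantees global convergence of the algorithm. PSO and QPSO differ in how the cluster centers evolve. Comparative results on the four data sets used in the experiments show that QPSO outperforms the PSO clustering method. A new metric is used in place of the Euclidean norm during clustering, and the experiments show that the new metric is more robust than the Euclidean norm and yields more accurate clustering results.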

15.
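To address the fact that conventional learning vector quantization algorithms do not consider differences in the importance of attributes, a weighted learning vector quantization algorithm is proposed. The algorithm introduces a weight coefficient for each attribute dimension to represent the importance of that attribute in the classification process, and updates it together with the weight vectors. The mean of the corrected distances between the input samples and the winning neurons controls the threshold and step size of the weight coefficient updates. The distance mean ensures the stability of the update process and removes the need to normalize the weight coefficients. Experimental results on six data sets from the UCI Machine Learning Repository show that the algorithm effectively identifies the essential attributes of the data, especially with local weight coefficients. Compared with the conventional learning vector quantization algorithm and its improved variants, it offers a higher recognition rate, stable performance, and low computational complexity.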

16.
Privacy preserving data mining has become increasingly popular because it allows sharing of privacy-sensitive data for analysis purposes. However, existing techniques such as random perturbation do not fare well for simple yet widely used and efficient Euclidean distance-based mining algorithms. Although original data distributions can be pretty accurately reconstructed from the perturbed data, distances between individual data points are not preserved, leading to poor accuracy for the distance-based mining methods. Besides, they do not generally focus on data reduction. Other studies on secure multi-party computation often concentrate on techniques useful to very specific mining algorithms and scenarios such that they require modification of the mining algorithms and are often difficult to generalize to other mining algorithms or scenarios. This paper proposes a novel generalized approach using the well-known energy compaction power of Fourier-related transforms to hide sensitive data values and to approximately preserve Euclidean distances in centralized and distributed scenarios to a great degree of accuracy. Three algorithms to select the most important transform coefficients are presented, one for a centralized database case, the second one for a horizontally partitioned, and the third one for a vertically partitioned database case. Experimental results demonstrate the effectiveness of the proposed approach.
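The distance-preservation property this abstract relies on can be checked with a small sketch. An orthonormal discrete Fourier transform preserves Euclidean distances exactly (Parseval's theorem), and keeping only a few coefficients gives an approximation that never exceeds the true distance. The coefficient selection rule below (lowest absolute frequencies first) and the function name are assumptions; the paper's three selection algorithms are not described in the abstract.

```python
import numpy as np

def low_frequency_coeffs(x, n_keep):
    """Keep the n_keep lowest-frequency coefficients of the orthonormal DFT.

    The selection depends only on len(x), so every vector of the same length
    keeps the same coefficient positions (a hypothetical selection rule).
    """
    spectrum = np.fft.fft(x, norm="ortho")
    order = np.argsort(np.abs(np.fft.fftfreq(len(x))), kind="stable")
    return spectrum[order[:n_keep]]

# Hypothetical smooth signals whose energy sits in a few low frequencies.
t = np.arange(64) / 64.0
x = np.sin(2 * np.pi * t)
y = np.cos(2 * np.pi * t) + 0.5

exact = np.linalg.norm(x - y)
full = np.linalg.norm(low_frequency_coeffs(x, 64) - low_frequency_coeffs(y, 64))
reduced = np.linalg.norm(low_frequency_coeffs(x, 3) - low_frequency_coeffs(y, 3))

print(np.isclose(full, exact))     # True: all coefficients preserve the distance
print(np.isclose(reduced, exact))  # True here: 3 coefficients hold all the energy
```

For less smooth data the reduced distance falls below the exact one, and the gap shrinks as more coefficients are kept.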

17.
A clustering algorithm based on the commonly used least-squared Euclidean distance performs poorly on AVHRR images of clouds. This is primarily due to the data inadequately fitting the assumptions on which such an algorithm is based. Algorithms based on two modified clustering criteria are shown to be convergent within the same algorithm shell. These modified criteria have been developed elsewhere to allow generalized Gaussian clusters and also to account for differences in the populations of the different clusters. The new algorithms are tested on satellite data and found to give much improved results.

18.
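A graph analysis method for detecting abnormal crowd states   Total citations: 2, self-citations: 0, citations by others: 2
A graph analysis method is proposed for detecting abnormal states in dynamic crowd scenes. An adaptive mean shift algorithm clusters the velocity field of the scene by nonparametric probability density estimation, and the clustering result forms an undirected graph whose vertices are the cluster centers and whose edge weights are the distances between the cluster centers. Abnormal events in the dynamic scene are detected and localized by analyzing the spatial distribution of the graph vertices and the dispersion between the predicted and observed values of a dynamical system of the edge weight matrix. Comparative experiments on several typical dynamic-scene video databases show that the graph analysis method is highly adaptable and can effectively monitor abnormal states in dynamic crowd scenes.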

19.
Likas A. Neural Computation, 1999, 11(8): 1915-1932
A general technique is proposed for embedding online clustering algorithms based on competitive learning in a reinforcement learning framework. The basic idea is that the clustering system can be viewed as a reinforcement learning system that learns through reinforcements to follow the clustering strategy we wish to implement. In this sense, the reinforcement guided competitive learning (RGCL) algorithm is proposed that constitutes a reinforcement-based adaptation of learning vector quantization (LVQ) with enhanced clustering capabilities. In addition, we suggest extensions of RGCL and LVQ that are characterized by the property of sustained exploration and significantly improve the performance of those algorithms, as indicated by experimental tests on well-known data sets.

20.
Traditionally, prototype-based fuzzy clustering algorithms such as the Fuzzy C Means (FCM) algorithm have been used to find “compact” or “filled” clusters. Recently, there have been attempts to generalize such algorithms to the case of hollow or “shell-like” clusters, i.e., clusters that lie in subspaces of feature space. The shell clustering approach provides a powerful means to solve the hitherto unsolved problem of simultaneously fitting multiple curves/surfaces to unsegmented, scattered and sparse data. In this paper, we present several fuzzy and possibilistic algorithms to detect linear and quadric shell clusters. We also introduce generalizations of these algorithms in which the prototypes represent sets of higher-order polynomial functions. The suggested algorithms provide a good trade-off between computational complexity and performance, since the objective function used in these algorithms is the sum of squared distances, and the clustering is sensitive to noise and outliers. We show that by using a possibilistic approach to clustering, one can make the proposed algorithms robust
