Similar literature
 20 similar documents found (search time: 15 ms)
1.
Spectral clustering with fuzzy similarity measure   Total citations: 1, self-citations: 0, citations by others: 1
Spectral clustering algorithms have been successfully used in pattern recognition and computer vision. The most widely used similarity measure for spectral clustering is the Gaussian kernel function, which measures the similarity between data points. However, it is difficult to choose a suitable scaling parameter for the Gaussian kernel similarity measure. In this paper, utilizing the prototypes and partition matrix obtained by the fuzzy c-means clustering algorithm, we develop a fuzzy similarity measure for spectral clustering (FSSC). Furthermore, we introduce the K-nearest neighbor sparse strategy into FSSC and apply the sparse FSSC to texture image segmentation. We first run experiments on artificial data to verify the efficiency of the proposed fuzzy similarity measure. We then analyze the parameter sensitivity of our method. Finally, we use self-tuning spectral clustering and the Nyström method as baselines and apply all three methods to synthetic texture and remote sensing image segmentation. The experimental results show that the proposed method is effective and stable.
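
For orientation, the Gaussian kernel similarity mentioned in this abstract, followed by a K-nearest-neighbor sparsification step, might look like the sketch below. It is not the FSSC measure from the paper; the function names, the scaling parameter `sigma`, and the neighbor count `k` are illustrative assumptions.

```python
import math

def gaussian_similarity(points, sigma):
    """Dense similarity matrix w[i][j] = exp(-||xi - xj||^2 / (2 * sigma^2))."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w

def knn_sparsify(w, k):
    """Keep w[i][j] only if j is among the k most similar points to i, or i to j."""
    n = len(w)
    keep = [[False] * n for _ in range(n)]
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: w[i][j], reverse=True)
        for j in ranked[:k]:
            keep[i][j] = keep[j][i] = True  # symmetrize the neighbor graph
    return [[w[i][j] if keep[i][j] else 0.0 for j in range(n)] for i in range(n)]

# Two tight pairs of points that lie far apart from each other.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
w = knn_sparsify(gaussian_similarity(points, sigma=1.0), k=1)
```

With `k=1` only the within-pair similarities survive, so the sparse matrix already separates the two groups. Changing `sigma` rescales every entry, which is the sensitivity the abstract describes.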

2.
3.
Clustering requires the user to define a distance metric, select a clustering algorithm, and set the hyperparameters of that algorithm. Getting these right, so that a clustering is obtained that meets the user's subjective criteria, can be difficult and tedious. Semi-supervised clustering methods make this easier by letting the user provide must-link or cannot-link constraints. These are then used to automatically tune the similarity measure and/or the optimization criterion. In this paper, we investigate a complementary way of using the constraints: they are used to select an unsupervised clustering method and tune its hyperparameters. It turns out that this very simple approach outperforms all existing semi-supervised methods. This implies that choosing the right algorithm and hyperparameter values is more important than modifying an individual algorithm to take constraints into account. In addition, the proposed approach allows for active constraint selection in a more effective manner than other methods.
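
The selection idea in this abstract can be sketched as scoring each candidate clustering by the fraction of constraints it satisfies and keeping the best one. The scoring rule, the function names, and the candidate label vectors below are assumptions made for illustration; the abstract does not give the paper's actual criterion.

```python
def constraint_score(labels, must_link, cannot_link):
    """Fraction of pairwise constraints that a clustering satisfies."""
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    total = len(must_link) + len(cannot_link)
    return satisfied / total if total else 0.0

def select_clustering(candidates, must_link, cannot_link):
    """Return the name of the candidate clustering with the highest score."""
    return max(candidates,
               key=lambda name: constraint_score(candidates[name], must_link, cannot_link))

# Hypothetical label vectors from unsupervised runs with different settings.
candidates = {
    "kmeans_k2": [0, 0, 1, 1, 1],
    "kmeans_k3": [0, 0, 1, 1, 2],
    "spectral_k2": [0, 1, 1, 0, 0],
}
must_link = [(0, 1), (2, 3)]
cannot_link = [(0, 2), (3, 4)]
best = select_clustering(candidates, must_link, cannot_link)
```

The constraints never enter the clustering algorithms themselves; they are used only to choose among the finished results.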

4.
Spectral clustering in multi-agent systems   Total citations: 2, self-citations: 2, citations by others: 0
We examine the application of spectral clustering for breaking up the behavior of a multi-agent system in space and time into smaller, independent elements. We propose clustering observations of individual entities in order to identify significant changes in the parameter space (like spatial position) and detect temporal alterations of behavior within the same framework. Available knowledge of important interactions (events) between entities is also considered. We describe a novel algorithm utilizing iterative subdivisions where clusters are pre-processed at each step to counter spatial scaling, rotation, replay speed, and varying sampling frequency. A method is presented to balance spatial and temporal segmentation based on the expected group size, and a validity measure is introduced to determine the optimal number of clusters. We demonstrate our results by analyzing the outcomes of computer games and compare our algorithm to K-means and traditional spectral clustering.

5.
In this article, we address the problem of automatic constraint selection to improve the performance of constraint-based clustering algorithms. To this end, we propose a novel active learning algorithm that relies on a k-nearest neighbors graph and a new constraint utility function to generate queries to the human expert. This mechanism is paired with propagation and refinement processes that limit the number of constraint candidates and introduce a minimal diversity in the proposed constraints. Existing constraint selection heuristics are based on random selection or on a min–max criterion and thus are either inefficient or better suited to spherical clusters. Contrary to these approaches, our method is designed to be beneficial for all constraint-based clustering algorithms. Comparative experiments conducted on real datasets and with two distinct representative constraint-based clustering algorithms show that our approach significantly improves clustering quality while minimizing the number of human expert solicitations.

6.
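Most existing spectral clustering methods use traditional distance metrics to compute the similarity between samples. This considers only the pairwise similarity between samples, ignores the surrounding neighborhood information, and takes no account of the global distribution structure of the data. This paper therefore proposes a new spectral clustering method that fuses the Euclidean distance and the Kendall Tau distance. By fusing the direct distance between pairs of samples with their surrounding neighborhood information, the method exploits the fact that different similarity measures capture structural information in the data from different angles, and so reflects the underlying structure of the data more completely. Comparison with traditional clustering algorithms on UCI benchmark datasets verifies that the proposed method can significantly improve clustering performance.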
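
One way to picture the fusion is sketched below: the Kendall Tau distance compares how two samples rank the remaining points by distance, and a weighted sum combines it with the direct Euclidean distance. The abstract does not state the paper's fusion rule, so the weighted sum, the weight `alpha`, and the use of neighbor-distance rankings are all assumptions.

```python
import math
from itertools import combinations

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def kendall_tau_distance(u, v):
    """Fraction of index pairs that u and v order in opposite directions."""
    pairs = list(combinations(range(len(u)), 2))
    if not pairs:
        return 0.0
    discordant = sum((u[i] - u[j]) * (v[i] - v[j]) < 0 for i, j in pairs)
    return discordant / len(pairs)

def fused_distance(points, i, j, alpha=0.5):
    """Assumed fusion: alpha * direct distance + (1 - alpha) * rank disagreement."""
    others = [k for k in range(len(points)) if k not in (i, j)]
    di = [euclidean(points[i], points[k]) for k in others]  # how i sees the rest
    dj = [euclidean(points[j], points[k]) for k in others]  # how j sees the rest
    return alpha * euclidean(points[i], points[j]) + (1 - alpha) * kendall_tau_distance(di, dj)

points = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0), (3.0, 0.0)]
d01 = fused_distance(points, 0, 1)
```

The two terms live on different scales (the Kendall Tau term is always in [0, 1]), so in practice the Euclidean term would probably need normalizing before the two are combined.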

7.
Spectral clustering techniques are heuristic algorithms aiming to find approximate solutions to difficult graph-cutting problems, usually NP-complete, which are useful for clustering. A fundamental working hypothesis of these techniques is that the optimal partition of K classes can be obtained from the first K eigenvectors of the normalized graph Laplacian matrix LN if the gap between the K-th and the (K+1)-th eigenvalue of LN is sufficiently large. If the gap is small, a perturbation may swap the corresponding eigenvectors and the results can be very different from the optimal ones. In this paper we suggest a weaker working hypothesis: the optimal partition of K classes can be obtained from a K-dimensional subspace of the first M>K eigenvectors, where M is a parameter chosen by the user. We show that the validity of this hypothesis can be confirmed by the gap size between the K-th and the (M+1)-th eigenvalue of LN. Finally we present and analyse a simple probabilistic algorithm that generalizes current spectral techniques in this extended framework. This algorithm gives results on real world graphs that are close to the state of the art by selecting correct K-dimensional subspaces of the linear span of the first M eigenvectors, robust to small changes of the eigenvalues.
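
The two gaps discussed in this abstract can be compared with a small helper, sketched here under the assumption that the eigenvalues of LN are already available and that "first" means smallest. The function name and the example spectrum are hypothetical.

```python
def eigengap(eigenvalues, k, m=None):
    """Gap between the k-th and the (m+1)-th smallest eigenvalue (1-based).

    With m omitted this is the classical gap between eigenvalues k and k+1.
    """
    m = k if m is None else m
    ordered = sorted(eigenvalues)
    return ordered[m] - ordered[k - 1]

# Hypothetical spectrum of a normalized Laplacian.
spectrum = [0.0, 0.02, 0.05, 0.9, 1.1]
classical_gap = eigengap(spectrum, k=2)      # between the 2nd and 3rd eigenvalue
extended_gap = eigengap(spectrum, k=2, m=3)  # between the 2nd and 4th eigenvalue
```

In this made-up spectrum the classical gap for K=2 is small while the gap to the (M+1)-th eigenvalue with M=3 is large, which is the situation the weaker hypothesis is meant to cover.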

8.
This paper presents an application of Fuzzy Clustering of Large Applications based on Randomized Search (FCLARANS) for attribute clustering and dimensionality reduction in gene expression data. Domain knowledge based on gene ontology and differential gene expressions is employed in the process. The use of domain knowledge helps in the automated selection of biologically meaningful partitions. Gene ontology (GO) study helps in detecting biologically enriched and statistically significant clusters. Fold-change is measured to select the differentially expressed genes as the representatives of these clusters. Tools like Eisen plot and cluster profiles of these clusters help establish their coherence. Important representative features (or genes) are extracted from each enriched gene partition to form the reduced gene space. While the reduced gene set forms a biologically meaningful attribute space, it simultaneously leads to a decrease in computational burden. External validation of the reduced subspace, using various well-known classifiers, establishes the effectiveness of the proposed methodology on four sets of publicly available microarray gene expression data.

9.
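A spectral clustering algorithm based on global K-means   Total citations: 3, self-citations: 1, citations by others: 2
谢皝  张平伟  罗晟 《计算机应用》 (Journal of Computer Applications) 2010,30(7):1936-1937
Spectral clustering has been studied extensively in recent years. However, spectral clustering is sensitive to initialization. To address this weakness, a spectral clustering algorithm based on global K-means (GKSC) is proposed, which introduces the initialization-insensitive global K-means algorithm as a remedy. Simulation experiments show that, compared with traditional spectral clustering, GKSC yields more stable clustering results and higher clustering accuracy.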

10.
Spectral clustering: A semi-supervised approach   Total citations: 2, self-citations: 0, citations by others: 2
Recently, graph-based spectral clustering algorithms have developed rapidly. They are formulated as discrete combinatorial optimization problems and solved approximately by relaxing them into tractable eigenvalue decomposition problems. In this paper, we first review the existing spectral clustering algorithms within a unified framework and give a straightforward explanation of spectral clustering. We also present a novel model for generalizing unsupervised spectral clustering to semi-supervised spectral clustering. Under this model, prior information given by some instance-level constraints can be generalized to space-level constraints. We find that the (undirected) graph built on the enlarged prior information is more meaningful, and hence the boundaries of the clusters are more accurate. Experimental results based on toy data, real-world data and image segmentation demonstrate the advantages of the proposed model.

11.
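Traditional spectral clustering considers only the global features of the samples and ignores their local features when building the relation matrix, usually requires the number of clusters to be specified when partitioning, and cannot correctly partition intersection points. To address these problems, an improved spectral clustering algorithm based on local principal component analysis and connected graph decomposition is proposed. The algorithm first automatically learns and selects the center points of the dataset, then uses local principal component analysis to obtain the relation matrix of the dataset, and finally partitions the relation matrix with a connected graph decomposition algorithm. Experimental results show that the improved algorithm outperforms existing classical algorithms.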

12.
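Research progress on selective clustering ensembles   Total citations: 1, self-citations: 0, citations by others: 1
Traditional clustering ensemble methods usually combine all generated clustering members to obtain the final clustering result. In supervised learning, selective classifier ensembles achieve better results. Inspired by this, applying the same approach to clustering ensembles is defined as selective clustering ensemble. This paper surveys the key techniques of selective clustering ensembles and discusses future research directions.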

13.
Bagging-based spectral clustering ensemble selection   Total citations: 2, self-citations: 0, citations by others: 2
Traditional clustering ensemble methods combine all obtained clustering results at hand. However, we can often achieve a better clustering solution if only parts of the available clustering results are combined. In this paper, we generalize the selective clustering ensemble algorithm proposed by Azimi and Fern and propose a novel clustering ensemble method, SELective Spectral Clustering Ensemble (SELSCE). The component clusterings of the ensemble system are generated by spectral clustering (SC), which is capable of engendering diverse committees. A random scaling parameter and the Nyström approximation are used to perturb SC for producing the components of the ensemble system. After the generation of component clusterings, the bagging technique, usually applied in supervised learning, is used to assess the component clusterings. We randomly pick part of the available clusterings to get a consensus result and then compute normalized mutual information (NMI) or adjusted Rand index (ARI) between the consensus result and the component clusterings. Finally, the components are ranked by aggregating multiple NMI or ARI values. The experimental results on UCI datasets and images demonstrate that the proposed algorithm can achieve a better result than the traditional clustering ensemble methods.
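
The scoring step described here, comparing each component clustering with a consensus result by NMI and ranking the components, can be sketched as follows. This covers a single consensus only; the repeated random sampling and the aggregation of multiple NMI values are left out. The geometric-mean normalization of NMI and the function names are assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(a, b):
    """Mutual information normalized by the geometric mean of the two entropies."""
    n = len(a)
    ca, cb, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum((c / n) * math.log(c * n / (ca[x] * cb[y]))
             for (x, y), c in joint.items())
    denom = math.sqrt(entropy(a) * entropy(b))
    return mi / denom if denom > 0 else 0.0

def rank_components(components, consensus):
    """Indices of the component clusterings, most similar to the consensus first."""
    return sorted(range(len(components)),
                  key=lambda i: nmi(components[i], consensus), reverse=True)

consensus = [0, 0, 1, 1]
components = [[0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
order = rank_components(components, consensus)
```

NMI ignores label names, so the second component, which is the consensus with its labels swapped, ranks first.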

14.
Requirements Engineering - Requirements selection is a decision-making process that enables project managers to focus on the deliverables that add most value to the project outcome. This task is...

15.
16.
Cluster ensemble approaches make use of a set of clustering solutions which are derived from different data sources to gain a more comprehensive and significant clustering result over conventional single clustering approaches. Unfortunately, not all the clustering solutions in the ensemble contribute to the final result. In this paper, we focus on the clustering solution selection strategy in the cluster ensemble, and propose to view clustering solutions as features such that suitable feature selection techniques can be used to perform clustering solution selection. Furthermore, a hybrid clustering solution selection strategy (HCSS) is designed based on a proposed weighting function, which combines several feature selection techniques for the refinement of clustering solutions in the ensemble. Finally, a new measure is designed to evaluate the effectiveness of clustering solution selection strategies. The experimental results on both UCI machine learning datasets and cancer gene expression profiles demonstrate that HCSS works well on most of the datasets, obtains more desirable final results, and outperforms most of the state-of-the-art clustering solution selection strategies.

17.
The results of experiments with a novel criterion for absolute non-parametric feature selection are reported. The basic idea of the new technique involves the use of computer graphics and the human pattern recognition ability to interactively choose a number of features, this number not being necessarily determined in advance, from a larger set of measurements. The triangulation method, recently proposed in the cluster analysis literature for mapping points from l-space to 2-space, is used to yield a simple and efficient algorithm for feature selection by interactive clustering. It is shown that a subset of features can thus be chosen which allows a significant reduction in storage and time while still keeping the probability of error in classification within reasonable bounds.

18.
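Text contains many useless features, and such noise features degrade text clustering. A text feature selection algorithm based on particle swarm optimization is therefore proposed. Using term frequency-inverse document frequency (TF-IDF) as the objective function to evaluate the text features of each document, the algorithm finds a new optimal subset of useful features from the initial document dataset. This optimal subset of useful features is then used as the input to K-means clustering to obtain the optimal text clustering result. Clustering tests on document datasets show that...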
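
The TF-IDF weighting that this abstract uses as its objective can be sketched as below. This is the plain textbook form, with term frequency normalized by document length and idf = log(N / df). The paper's exact variant and the particle swarm search itself are not shown, and the toy documents are made up.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Per-document dict mapping term -> tf * idf, with idf = log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return scores

docs = [
    ["spectral", "clustering", "graph"],
    ["text", "clustering", "feature"],
    ["text", "feature", "selection"],
]
scores = tf_idf(docs)
```

Terms that occur in fewer documents, such as "spectral", score higher than terms shared across documents, such as "clustering". A feature selection step could use that signal to keep informative terms.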

19.
Spectral clustering based on matrix perturbation theory   Total citations: 5, self-citations: 1, citations by others: 5
This paper exposes some intrinsic characteristics of the spectral clustering method by using tools from matrix perturbation theory. We construct a weight matrix of a graph and study its eigenvalues and eigenvectors. We show that the number of clusters is equal to the number of eigenvalues that are larger than 1, and that the number of points in each of the clusters can be approximated by the associated eigenvalue. We also show that the eigenvectors of the weight matrix can be used directly to perform clustering; that is, the directional angle between two row vectors of the matrix derived from the eigenvectors is a suitable distance measure for clustering. As a result, an unsupervised spectral clustering algorithm based on weight matrix (USCAWM) is developed. The experimental results on a number of artificial and real-world data sets show the correctness of the theoretical analysis.
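
The angle-based distance mentioned in this abstract can be sketched as the angle between two rows of an eigenvector-derived matrix. The rows below are made-up values rather than the output of USCAWM, and the function name is an assumption.

```python
import math

def angle_between(u, v):
    """Angle in radians between two row vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

# Hypothetical rows: each row holds one point's coordinates in eigenvector space.
rows = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]
near = angle_between(rows[0], rows[1])
far = angle_between(rows[0], rows[2])
```

Points whose rows are nearly parallel get a small angle and would be grouped together, while nearly orthogonal rows indicate different clusters.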

20.
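Research on spectral clustering and its implementation in Weka   Total citations: 1, self-citations: 1, citations by others: 0
This paper introduces the basic principles and algorithmic ideas of spectral clustering. To address the difficulty of solving the optimization problem in spectral clustering, a principled solution strategy is analyzed, from which a concrete description of the algorithm is given and implemented as a plug-in for Weka. The implemented system is tested experimentally, and the key issues in its application are pointed out. The experimental results show that spectral clustering performs better than the K-means method.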
