1.
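The current state of spectral clustering and its applications in social networks (total citations: 1; self-citations: 0; citations by others: 1)
In recent years, owing to its research significance, the use of data clustering to analyze social networks has become one of the most active topics. The most direct applications of this research are preventing terrorist attacks and detecting the spread of disease in communities. In addition, social networks are dynamic, and changes in social relationships can be predicted with data clustering methods, so a clear understanding of social network structure helps promote social development and cooperation among members of society. From a data mining perspective, a social network is incomplete, large, complex, and dynamic, and these properties prevent traditional data clustering methods from being applied to it successfully. In contrast, spectral clustering, one of the most popular modern clustering algorithms, offers a systematic, flexible, and practical solution to social network problems. Theory and experiments show that spectral clustering outperforms traditional clustering algorithms in finding globally optimal solutions and in handling large data sets. This paper reviews the current theory and algorithms of spectral clustering and their advantages over traditional clustering algorithms, and also covers the basics of social networks and two typical applications of spectral clustering to them.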
2.
Multi-view clustering has attracted much attention recently. Among all clustering approaches, spectral ones have gained much popularity thanks to an elaborate and solid theoretical foundation. A major limitation of spectral-clustering-based methods is that they only provide a non-linear projection of the data, which then requires an additional clustering step. This can degrade the quality of the final clustering due to various factors such as the initialization process or outliers. To overcome these challenges, this paper presents a constrained version of a recent method called Multiview Spectral Clustering via integrating Nonnegative Embedding and Spectral Embedding. Besides retaining the advantages of this method, our proposed model integrates two types of constraints: (i) a consistent smoothness of the nonnegative embedding over all views and (ii) an orthogonality constraint over the columns of the nonnegative embedding. Experimental results on several real datasets show the superiority of the proposed approach.
3.
Dynamic data mining has gained increasing attention in the last decade. It addresses changing data structures, which can be observed in many real-life applications, e.g. the buying behavior of customers. As opposed to classical, i.e. static, data mining, where the challenge is to discover patterns inherent in given data sets, in dynamic data mining the challenge is to understand, and in some cases even predict, how such patterns will change over time. Since changes in general lead to uncertainty, appropriate approaches for uncertainty modeling are needed in order to capture, model, and predict the respective phenomena in dynamic environments. As a consequence, the combination of dynamic data mining and soft computing is a very promising research area. The proposed algorithm consists of a dynamic clustering cycle in which the data set is refreshed from time to time. Within this cycle, criteria check whether the newly arrived data have structurally changed in comparison to the data already analyzed. If so, appropriate actions are triggered, in particular an update of the initial settings of the clustering algorithm. As we will show, rough clustering offers strong tools to detect such changing data structures. To evaluate the proposed dynamic rough clustering algorithm, it has been applied to synthetic as well as real-world data sets, where it provides new insights regarding the underlying dynamic phenomena.
4.
A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices, depending on the application scenario at hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall's indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggest scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index.
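As a point of reference for the indexes named in this abstract, the following is a minimal sketch of the standard, unweighted Goodman-Kruskal gamma and Kendall tau-a, both built from counts of concordant and discordant pairs. It does not reproduce the weighted formulation the paper derives, which the abstract does not specify; the function names are illustrative only.

```python
from itertools import combinations


def concordance_counts(x, y):
    """Count concordant, discordant, and tied pairs between two sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    concordant = discordant = tied = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        else:
            tied += 1  # tie in x, in y, or in both
    return concordant, discordant, tied


def goodman_kruskal_gamma(x, y):
    """Gamma = (C - D) / (C + D); tied pairs are ignored."""
    c, d, _ = concordance_counts(x, y)
    if c + d == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (c - d) / (c + d)


def kendall_tau_a(x, y):
    """Tau-a = (C - D) / (number of pairs); ties dilute the score."""
    c, d, t = concordance_counts(x, y)
    return (c - d) / (c + d + t)
```

The two indexes agree when there are no ties and differ only in how tied pairs enter the denominator, which is one instance of the sensitivity to ranks that the abstract discusses.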
5.
Both mixed data types and cluster constraints are frequently encountered in the classification problems of construction management. For example, in a bridge let project, engineers generally group the bridges into several subgroups based on their proximity, structure type, material, etc. Moreover, constraints may be set for each cluster to ensure the project's overall effectiveness. In this study, an effective clustering algorithm, the constrained k-prototypes (CKP) algorithm, is proposed to resolve the abovementioned problems. Several tests and experimental results have shown that CKP can not only handle mixed data types but also satisfy user-specified constraints. In order to demonstrate the applicability of CKP, it is also applied to real-world problems in construction management.
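To make the "mixed data types" part concrete, here is a minimal sketch of the dissimilarity used by the standard (unconstrained) k-prototypes algorithm: squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches. The constraint handling of CKP is not described in the abstract and is not shown; the function names, the weight `gamma`, and the bridge attributes in the usage note are assumptions for illustration.

```python
def mixed_dissimilarity(a_num, a_cat, b_num, b_cat, gamma=1.0):
    """k-prototypes style dissimilarity between two mixed-type records.

    a_num, b_num: numeric attribute vectors of equal length.
    a_cat, b_cat: categorical attribute vectors of equal length.
    gamma: weight balancing the categorical part against the numeric part.
    """
    if len(a_num) != len(b_num) or len(a_cat) != len(b_cat):
        raise ValueError("records must have matching attribute counts")
    numeric = sum((p - q) ** 2 for p, q in zip(a_num, b_num))
    mismatches = sum(1 for p, q in zip(a_cat, b_cat) if p != q)
    return numeric + gamma * mismatches


def nearest_prototype(record_num, record_cat, prototypes, gamma=1.0):
    """Return the index of the closest (numeric, categorical) prototype."""
    costs = [
        mixed_dissimilarity(record_num, record_cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return costs.index(min(costs))
```

For example, a hypothetical bridge described by `([1.0, 2.0], ["steel", "arch"])` would be assigned with `nearest_prototype` to whichever prototype minimizes this combined cost; a constrained variant would additionally reject assignments that violate a cluster's limits.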
6.
7.
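Constructing an informative graph is crucial for many machine learning tasks. Based on sparse representation theory, a directed nonnegative l1 graph is proposed. In constructing this graph, each sample is first represented as a nonnegative linear combination of the other samples, and the neighboring samples and their corresponding similarities are then obtained simultaneously by solving an l1 minimization problem. Finally, a spectral clustering method based on the nonnegative l1 graph is applied to handwritten character clustering. Compared with spectral clustering based on the l1 graph, the proposed method achieves better clustering performance and lower computational complexity.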
8.
Jaehwan Kim, Pattern Recognition, 2006, 39(11): 2025-2035
Multi-way partitioning of an undirected weighted graph, where pairwise similarities are assigned as edge weights, provides an important tool for data clustering but is an NP-hard problem. Spectral relaxation is a popular form of relaxation, leading to spectral clustering, where the clustering is performed by the eigen-decomposition of the (normalized) graph Laplacian. Semidefinite relaxation, on the other hand, is an alternative way of relaxing a combinatorial optimization problem, leading to a convex optimization problem. In this paper we employ a semidefinite programming (SDP) approach to graph equipartitioning for clustering, where sufficient conditions for strong duality hold. The method is referred to as semidefinite spectral clustering, where the clustering is based on the eigen-decomposition of the optimal feasible matrix computed by SDP. Numerical experiments with several data sets demonstrate the useful behavior of our semidefinite spectral clustering compared to existing spectral clustering methods.
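The "(normalized) graph Laplacian" mentioned in this abstract can be illustrated with a minimal sketch. It assumes the symmetric normalization L = I - D^(-1/2) W D^(-1/2), which is one common choice and not necessarily the one used in the paper; the function name is illustrative, and the eigen-decomposition and SDP steps are not shown.

```python
import math


def normalized_laplacian(weights):
    """Symmetric normalized Laplacian of an undirected weighted graph.

    weights: square, symmetric list-of-lists of nonnegative edge weights.
    Returns L with L[i][j] = delta_ij - w_ij / sqrt(d_i * d_j),
    where d_i is the weighted degree of vertex i.
    """
    n = len(weights)
    degrees = [sum(row) for row in weights]
    if any(d <= 0 for d in degrees):
        raise ValueError("every vertex needs a positive degree")
    inv_sqrt = [1.0 / math.sqrt(d) for d in degrees]
    return [
        [
            (1.0 if i == j else 0.0) - inv_sqrt[i] * weights[i][j] * inv_sqrt[j]
            for j in range(n)
        ]
        for i in range(n)
    ]
```

A quick sanity check on any output is that the vector of square-rooted degrees lies in the null space of L, which corresponds to the trivial eigenvalue 0 that spectral clustering discards.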
9.
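A semi-supervised clustering method based on spectral clustering (total citations: 6; self-citations: 1; citations by others: 6)
Semi-supervised clustering uses a small amount of labeled data to assist unsupervised learning on a large amount of unlabeled data, thereby improving clustering performance. A semi-supervised clustering algorithm based on spectral clustering is proposed. It uses the information in the labeled data to adjust the distance matrix formed from the pairwise distances between points, and then performs spectral clustering on the adjusted distance matrix. Experiments show that the algorithm achieves better clustering performance than previously proposed semi-supervised clustering algorithms.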
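The abstract does not say how the labeled data adjust the distance matrix, so the following is only a sketch of one plausible rule, stated here as an assumption: labeled pairs with the same label are pulled to distance zero, and labeled pairs with different labels are pushed to the largest distance in the matrix. The function name and the use of `None` for unlabeled points are illustrative.

```python
def adjust_distances(dist, labels):
    """Return a copy of dist adjusted by partial label information.

    dist: square, symmetric list-of-lists of pairwise distances.
    labels: one entry per point; None marks an unlabeled point.
    Assumed rule (not taken from the paper): same label -> distance 0,
    different labels -> the maximum distance found in dist.
    """
    n = len(dist)
    if len(labels) != n:
        raise ValueError("need exactly one label slot per point")
    far = max(max(row) for row in dist)
    adjusted = [row[:] for row in dist]
    for i in range(n):
        for j in range(n):
            if i == j or labels[i] is None or labels[j] is None:
                continue  # leave pairs without full label information untouched
            adjusted[i][j] = 0.0 if labels[i] == labels[j] else far
    return adjusted
```

The adjusted matrix would then be converted to similarities and passed to an ordinary spectral clustering routine, which is the second stage the abstract describes.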
10.
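An improved evolutionary algorithm for solving constrained optimization problems is proposed. The algorithm initializes individuals with a chaotic method to ensure that they are uniformly distributed in the search space. During evolution, the population is divided into a feasible subpopulation and an infeasible subpopulation, and different crossover and mutation operators are applied to each in order to balance the algorithm's global and local search ability. Experimental results on standard test problems show the effectiveness of the improved algorithm. Finally, the improved algorithm is applied to two engineering design optimization problems, with satisfactory results.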
11.
Spectral clustering aims to partition a data set into several groups by using the Laplacian of the graph such that data points in the same group are similar while data points in different groups are dissimilar to each other. Spectral clustering is very simple to implement and has many advantages over traditional clustering algorithms such as k-means. Non-negative matrix factorization (NMF) factorizes a non-negative data matrix into a product of two non-negative (lower-rank) matrices so as to achieve dimension reduction and part-based data representation. In this work, we prove that spectral clustering under some conditions is equivalent to NMF. Unlike previous work, we formulate spectral clustering as a factorization of the data matrix (or scaled data matrix) rather than as the symmetric factorization of the symmetric pairwise similarity matrix. Under the NMF framework, where regularization can be easily incorporated into spectral clustering, we propose several non-negative and sparse spectral clustering algorithms. Empirical studies on real-world data show much better clustering accuracy for the proposed algorithms than for some state-of-the-art methods such as ratio cut and normalized cut spectral clustering and non-negative Laplacian embedding.
12.
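To address the tendency of constrained optimization systems to become trapped in local optima, a multi-agent constrained optimization algorithm based on decomposition and coordination (DCMACOA) is proposed. For separable systems, unlike traditional decomposition-coordination algorithms, DCMACOA takes the interaction variables between subsystems as the coordination variables and, drawing on the evolutionary ideas of multi-agent systems and biological immunity, applies a multi-agent immune optimization method to both subsystem optimization and system coordination. The main search operators are neighborhood clonal selection, neighborhood competition, and neighborhood cooperation. Simulation examples on an industrial process and on heat exchanger area optimization show that, compared with traditional decomposition-coordination algorithms, DCMACOA improves both overall and local search performance, improves the ability to solve constrained optimization problems for separable systems, and has good global search performance.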
13.
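The application of spectral clustering has begun to extend from image segmentation to text mining, with some success. Building on automatic determination of the number of clusters, and combining fuzzy theory with the spectral clustering algorithm, a fuzzy clustering algorithm for multi-document clustering is proposed. The algorithm mainly describes how to realize a fuzzy spectral clustering method in which a single text can belong to several text classes at once. Simulation results show that the algorithm clusters well.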
14.
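A multilevel spectral clustering algorithm with automatic determination of the number of clusters (total citations: 1; self-citations: 0; citations by others: 1)
Automatically determining the number of clusters and handling massive data are key problems in spectral clustering. Building on a spectral clustering algorithm that determines the number of clusters automatically, a multilevel algorithm that can handle large-scale data sets is proposed. Its core idea is to merge the large data set level by level according to a certain correlation until it becomes a small data set, to cluster the grouped small data set with the automatic spectral clustering algorithm, and finally to split and fine-tune level by level to complete the clustering of all the data. Experiments show that the algorithm clusters well.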
15.
In this paper, we present a locality-constrained nonnegative robust shape interaction (LNRSI) subspace clustering method. LNRSI integrates the local manifold structure of data into the robust shape interaction (RSI) in a unified formulation, which guarantees the locality and the low-rank property of the optimal affinity graph. Compared with the traditional low-rank representation (LRR) learning method, LNRSI can not only pursue the global structure of the data space through low-rank regularization, but also preserve the local manifold structure, which leads to a sparse and low-rank affinity graph. Due to the clear block-diagonal effect of the affinity graph, LNRSI is robust to noise and occlusions, and achieves a higher rate of correct clustering. The theoretical analysis of the clustering effect is also discussed. An efficient solution based on the linearized alternating direction method with adaptive penalty (LADMAP) is built for our method. Finally, we evaluate the performance of LNRSI on both synthetic data and real computer vision tasks, i.e., motion segmentation and handwritten digit clustering. The experimental results show that our LNRSI outperforms several state-of-the-art algorithms.
16.
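Rough set and fuzzy set theories have been used to model many kinds of uncertainty. Dubois and Prade studied the combination of fuzzy sets and rough sets. This paper proposes rough support-intuitionistic fuzzy sets. It introduces the concepts of rough sets, rough intuitionistic fuzzy sets, and support-intuitionistic fuzzy sets; defines the upper and lower approximations of support-intuitionistic fuzzy sets in a Pawlak approximation space; discusses properties of some approximation operators of rough support-intuitionistic fuzzy sets; and gives an expression for their similarity measure. The approach is applied to cluster analysis, and an example is used to verify its soundness.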
17.
In recent years there has been a growing interest in clustering methods stemming from the spectral decomposition of the data affinity matrix, which have been shown to give good results in a wide variety of situations. However, a complete theoretical understanding of these methods in terms of data distributions is still lacking. In this paper, we propose a spectral-clustering-based mode merging method for mean shift as a theoretically well-founded approach that enables a probabilistic interpretation of affinity-based clustering through kernel density estimation. This connection also allows principled kernel optimization and enables the use of anisotropic variable-size kernels to match local data structures. We demonstrate the proposed algorithm's performance on image segmentation applications and compare its clustering results with the well-known Mean Shift and Normalized Cut algorithms.
18.
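In recent years, spectral clustering has been widely studied for classification, and path-based and density-based algorithms are two important research directions. Although both achieve good results on some data sets, they cannot accurately classify certain special data sets. This paper combines the advantages of the two approaches: paths are found under multilevel density constraints, and a new similarity matrix is built from the resulting paths. To strengthen robustness to noise, a robustness coefficient is added based on local information in the data set, yielding a robust spectral clustering algorithm based on paths and density. Experimental results show that the method achieves good classification results on synthetic data sets and handwriting data sets.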
19.
Spectral clustering and path-based clustering are two recently developed clustering approaches that have delivered impressive results in a number of challenging clustering tasks. However, they are not robust enough against noise and outliers in the data. In this paper, based on M-estimation from robust statistics, we develop a robust path-based spectral clustering method by defining a robust path-based similarity measure for spectral clustering under both unsupervised and semi-supervised settings. Our proposed method is significantly more robust than spectral clustering and path-based clustering. We have performed experiments based on both synthetic and real-world data, comparing our method with some other methods. In particular, color images from the Berkeley segmentation data set and benchmark are used in the image segmentation experiments. Experimental results show that our method consistently outperforms other methods due to its higher robustness.
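For orientation, the plain (non-robust) path-based similarity that this line of work starts from is commonly defined as the best bottleneck over all connecting paths: the similarity of two points is the maximum, over paths between them, of the weakest edge on the path. The sketch below computes that quantity with a Floyd-Warshall style update. It is an assumption that this matches the paper's baseline, and the M-estimation weighting that makes the measure robust is not described in the abstract and is not implemented; the function name is illustrative.

```python
def path_based_similarity(sim):
    """Max-min (bottleneck) path similarity between all pairs of points.

    sim: square, symmetric list-of-lists of direct pairwise similarities.
    Returns best, where best[i][j] is the largest value achievable for
    the smallest edge similarity along any path from i to j.
    """
    n = len(sim)
    best = [row[:] for row in sim]
    for k in range(n):  # allow paths through intermediate point k
        for i in range(n):
            for j in range(n):
                via_k = min(best[i][k], best[k][j])
                if via_k > best[i][j]:
                    best[i][j] = via_k
    return best
```

Two points at opposite ends of an elongated cluster end up highly similar because a chain of close neighbors connects them. The same property lets a single noisy point bridge two clusters, which is the weakness a robust variant is meant to address.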