Similar literature
10 similar documents found
1.
Spectral clustering: A semi-supervised approach   Total citations: 2, self-citations: 0, citations by others: 2
Graph-based spectral clustering algorithms have developed rapidly in recent years. They are posed as discrete combinatorial optimization problems and solved approximately by relaxing them into tractable eigenvalue decomposition problems. In this paper, we first review existing spectral clustering algorithms within a unified framework and give a straightforward explanation of spectral clustering. We also present a novel model that generalizes unsupervised spectral clustering to semi-supervised spectral clustering. Under this model, prior information given by instance-level constraints can be generalized to space-level constraints. We find that the (undirected) graph built on the enlarged prior information is more meaningful, and hence the cluster boundaries are more accurate. Experimental results on toy data, real-world data, and image segmentation demonstrate the advantages of the proposed model.
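The relaxation into an eigenvalue problem can be illustrated with a small sketch. It is not the paper's model: the space-level generalization of constraints is not reproduced here. Instead, instance-level must-link and cannot-link constraints are simply written into the affinity matrix, which is an assumption made for illustration, as are all function names, the Gaussian affinity, and the use of power iteration in place of a library eigensolver.

```python
import math

def gaussian_affinity(points, sigma=1.0):
    """Dense Gaussian affinity matrix with a zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return w

def apply_constraints(w, must_link=(), cannot_link=()):
    """Assumed encoding: must-link pairs get affinity 1, cannot-link pairs 0."""
    for i, j in must_link:
        w[i][j] = w[j][i] = 1.0
    for i, j in cannot_link:
        w[i][j] = w[j][i] = 0.0
    return w

def spectral_bipartition(w, iters=500):
    """Split into two clusters by the sign of the second eigenvector of
    S = D^-1/2 W D^-1/2, found by power iteration with deflation.
    Assumes every vertex has positive degree."""
    n = len(w)
    deg = [sum(row) for row in w]
    # M = (S + I) / 2 shifts the spectrum into [0, 1] so power iteration is safe.
    m = [[0.5 * w[i][j] / math.sqrt(deg[i] * deg[j]) + (0.5 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # The top eigenvector of S is proportional to sqrt(degree); deflate it.
    top = [math.sqrt(d) for d in deg]
    norm = math.sqrt(sum(t * t for t in top))
    top = [t / norm for t in top]
    x = [float(i) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(x, top))
        x = [a - dot * b for a, b in zip(x, top)]
        x = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in x))
        x = [a / norm for a in x]
    return [1 if a > 0 else 0 for a in x]

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
affinity = apply_constraints(gaussian_affinity(points),
                             must_link=[(0, 1)], cannot_link=[(2, 3)])
print(spectral_bipartition(affinity))
```

The discrete cut problem is replaced by a continuous eigenvector, and the final thresholding step maps it back to cluster labels. For more than two clusters, several eigenvectors followed by k-means would typically be used instead of a sign test.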

2.
Most knowledge discovery in databases (KDD) research concentrates on supervised inductive learning. Conceptual clustering is an unsupervised inductive learning technique that organizes observations into an abstraction hierarchy without using predefined class values. However, a typical conceptual clustering algorithm is not suitable for a KDD task because of space and time constraints. Furthermore, typical incremental and non-incremental clustering algorithms are not designed for a partitioned data set. In this paper, we present a conceptual clustering algorithm that works on partitioned data. The proposed algorithm improves the clustering process by using less computation time and less space while maintaining clustering quality.

3.
Constraint-based mining has recently attracted considerable research attention as a way to bring domain knowledge or application-dependent parameters into data mining systems. In this paper, the attributes used to model the constraints are called constraint attributes, and the attributes involved in the objective function to be optimized are called optimization attributes. The constrained clustering considered here optimizes the objective function over the optimization attributes subject to the imposed constraint being satisfied. Specifically, we address constrained clustering with numerical constraints, in which the constraint attribute values of any two data items in the same cluster must lie within the corresponding constraint range. This numerical constrained clustering problem cannot be handled by conventional clustering algorithms, so we devise several effective and efficient algorithms to solve it. Because of the intrinsic nature of numerical constrained clustering, the result depends on the order in which the clustering is built, which in many cases degrades the clustering results. We therefore devise a progressive constraint relaxation technique to remedy this drawback and improve overall clustering performance. By using a smaller (tighter) constraint range in earlier merge iterations, we leave more room to relax the constraint and seek better solutions in subsequent iterations. Empirical results show that progressive constraint relaxation improves both execution efficiency and clustering quality.
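The idea of merging under a range constraint that starts tight and is relaxed later can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithms: a single numeric optimization attribute, centroid distance as the merge criterion, and a two-stage relaxation schedule are all assumed, and the function name `constrained_agglomerative` is hypothetical.

```python
def constrained_agglomerative(items, k, max_range, schedule=(0.5, 1.0)):
    """Greedy bottom-up merging under a numerical range constraint.

    items: list of (optimization_value, constraint_value) pairs.
    A merge is allowed only if the constraint values of the merged cluster
    span at most the current limit. The limit starts at a fraction of
    max_range and is relaxed stage by stage (assumed schedule).
    """
    clusters = [[item] for item in items]

    def centroid(cluster):
        return sum(opt for opt, _ in cluster) / len(cluster)

    def span(a, b):
        values = [con for _, con in a + b]
        return max(values) - min(values)

    for fraction in schedule:
        limit = fraction * max_range
        while len(clusters) > k:
            best = None
            for i in range(len(clusters)):
                for j in range(i + 1, len(clusters)):
                    if span(clusters[i], clusters[j]) <= limit:
                        dist = abs(centroid(clusters[i]) - centroid(clusters[j]))
                        if best is None or dist < best[0]:
                            best = (dist, i, j)
            if best is None:
                break  # no feasible merge at this limit; relax and retry
            _, i, j = best
            clusters[i] = clusters[i] + clusters[j]
            del clusters[j]
    return clusters

data = [(1.0, 10), (1.1, 11), (1.2, 30), (5.0, 12), (5.1, 31)]
for cluster in constrained_agglomerative(data, k=2, max_range=5):
    print(cluster)
```

If no feasible merge remains even at the full range, the sketch returns more than `k` clusters rather than violating the constraint.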

4.
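Traditional density-based clustering algorithms cannot handle problems with multiple constraints. Building on existing density-based clustering algorithms, this paper proposes a density-based clustering algorithm subject to multiple constraints. The algorithm introduces multiple constraints into density-based cluster analysis and analyzes their effect on the clustering results. Experiments show that, under multiple constraints, the algorithm clusters data points effectively with good results, providing sound theoretical support for handling multi-constraint clustering in real-world settings.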

5.
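Semi-supervised clustering via two-level random walk   Total citations: 3, self-citations: 0, citations by others: 3
He Ping, Xu Xiaohua, Lu Lin, Chen Ling. Journal of Software (《软件学报》), 2014, 25(5): 997-1013
Semi-supervised clustering aims to partition all data points into clusters according to user-supplied must-link and cannot-link constraints, yielding clustering results that are more accurate and closer to what the user requires. Most current semi-supervised clustering algorithms modify an existing clustering algorithm or incorporate metric learning so that the clustering agrees with the pairwise constraints as far as possible, but they seldom consider the explicit degree of influence that pairwise constraints exert on the surrounding unconstrained data. This paper proposes a two-level random walk semi-supervised clustering algorithm consisting of a low-level random walk on vertices and a high-level random walk on components. The low-level random walk computes the range and degree of influence of each selected constrained vertex on the other vertices; this influence is called a component. The high-level random walk then propagates each pairwise constraint over the components with adaptive strength and combines their influence on every vertex into a single cluster indicator matrix. Experimental results on UCI data sets and large real-world data sets show that the algorithm is more accurate than other semi-supervised clustering algorithms and is also reasonably efficient.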
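The abstract does not give the formulation of the low-level walk, so the following is only a stand-in: a random walk with restart from one constrained vertex, used here to illustrate how a walk can assign every other vertex an influence score that decays with graph distance. The function name, the restart probability, and the toy path graph are all assumptions.

```python
def random_walk_with_restart(weights, seed, restart=0.5, iters=100):
    """Influence of `seed` on every vertex via random walk with restart.

    weights: symmetric non-negative affinity matrix (list of lists).
    Assumes every vertex has at least one neighbour, so each row sum is > 0.
    """
    n = len(weights)
    # Row-normalise the affinities into transition probabilities.
    trans = [[wij / sum(row) for wij in row] for row in weights]
    scores = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(iters):
        # With probability `restart` jump back to the seed, else take one step.
        scores = [
            (restart if i == seed else 0.0)
            + (1.0 - restart) * sum(scores[j] * trans[j][i] for j in range(n))
            for i in range(n)
        ]
    return scores

# A path graph 0 - 1 - 2 - 3 - 4 with the constrained vertex at one end.
path = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
print(random_walk_with_restart(path, seed=0))
```

The scores sum to one and can be read as a soft region of influence around the seed; how the paper turns such regions into components and propagates constraints over them is not specified in the abstract.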

6.
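Genetic K-medoids spatial clustering analysis with obstacle constraints   Total citations: 1, self-citations: 0, citations by others: 1
Spatial clustering analysis is an important research topic in spatial data mining. Traditional clustering algorithms ignore the many constraints that exist in the real world, yet these constraints affect how reasonable the clustering results are. This paper discusses spatial clustering with obstacle constraints, studies a method for clustering spatial data with obstacle constraints that combines genetic and partitioning approaches, and designs a genetic K-medoids spatial clustering algorithm with obstacle constraints. Comparative experiments show that the method balances local and global convergence and accounts for the effect of real-world obstacles on the clustering results, making them more practically meaningful. Its results are better than those of traditional K-medoids clustering and of purely genetic clustering; its drawback is relatively slow computation.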

7.
One way to obtain an accurate clustering result that matches what the user wants in practical applications is to use additional pairwise constraints that indicate the relationship between two samples, that is, whether they belong to the same cluster or not. In this paper, we put forward a discriminative learning approach that incorporates pairwise constraints into the recently proposed two-class maximum margin clustering framework. In particular, we propose a set of pairwise loss functions that robustly detect and penalize violations of the pairwise constraints. Consequently, the proposed method can directly find the partitioning hyperplane, which separates the data into two groups and satisfies the given pairwise constraints as far as possible. In this way, it makes fewer assumptions about the distance metric or similarity matrix for the data, which may be complicated in practice, than existing popular constrained clustering algorithms. Finally, an iterative updating algorithm is proposed for the resulting optimization problem. Experiments on a number of real-world data sets demonstrate that the proposed pairwise constrained two-class clustering algorithm outperforms several representative pairwise constrained clustering counterparts in the literature.

8.
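Fan Hong, Hou Cuncun, Zhu Yanchun, Yao Ruoxia. Journal of Software (《软件学报》), 2017, 28(11): 3080-3093
Existing soft subspace clustering algorithms are easily affected by random noise when segmenting MR images, and because they depend on the choice of initial cluster centers they tend to become trapped in local optima, which leads to unsatisfactory segmentation. To address this problem, a soft subspace clustering algorithm for MR images based on the fireworks algorithm is proposed. The algorithm first designs an objective function that combines bound constraints with noise clustering, remedying the sensitivity of existing algorithms to noisy data, and proposes a membership computation method that quickly and accurately finds the subspace in which each cluster lies. It then introduces an adaptive fireworks algorithm into the clustering process to balance local and global search effectively, remedying the tendency of existing algorithms to become trapped in local optima. Comparative clustering results with the EWKM, FWKM, FSC, and LAC algorithms on UCI data sets, synthetic images, the Berkeley image data set, clinical breast MR images, and brain MR images show that the proposed algorithm not only achieves good results on the UCI data sets but is also robust to noise in image clustering. In particular, it clusters MR images with high accuracy and robustness and segments them fairly effectively.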

9.
In this paper, we present a particle swarm optimizer (PSO) to solve the variable weighting problem in projected clustering of high-dimensional data. Many subspace clustering algorithms fail to yield good cluster quality because they do not employ an efficient search strategy. We focus on soft projected clustering. We design a suitable k-means objective weighting function in which a change of variable weights is exponentially reflected. We also transform the original constrained variable weighting problem into a problem with bound constraints, using a normalized representation of variable weights, and we use a particle swarm optimizer to minimize the objective function in order to search for global optima of the variable weighting problem in clustering. Our experimental results on both synthetic and real data show that the proposed algorithm greatly improves cluster quality. In addition, the results of the new algorithm are much less dependent on the initial cluster centroids. In an application to text clustering, we show that the algorithm can be easily adapted to other similarity measures, such as the extended Jaccard coefficient for text data, and can be very effective.
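The normalized weight representation and the weighted k-means objective might be sketched as below. The exact functional form is not given in the abstract, so this assumes a common choice: raw weights in a bounded range are divided by their sum, and each normalized weight enters the objective raised to a power `beta`. The names `normalize`, `weighted_objective`, and `beta` are hypothetical, and the particle swarm search itself is omitted.

```python
def normalize(raw):
    """Map raw bounded weights to weights that sum to one.

    Optimizing over `raw` needs only bound constraints, while the
    normalized weights always satisfy the sum-to-one constraint.
    """
    total = sum(raw)
    return [r / total for r in raw]

def weighted_objective(points, labels, centroids, raw_weights, beta=2.0):
    """Assumed soft projected k-means objective.

    raw_weights[l][j] is the raw weight of dimension j in cluster l.
    Each squared deviation is scaled by the normalized weight to the power beta.
    """
    weights = [normalize(raw) for raw in raw_weights]
    total = 0.0
    for x, label in zip(points, labels):
        for j, (xj, zj) in enumerate(zip(x, centroids[label])):
            total += weights[label][j] ** beta * (xj - zj) ** 2
    return total

# One cluster that is tight in dimension 0 and spread out in dimension 1.
points = [(0.0, 0.0), (0.0, 10.0), (0.1, -10.0)]
labels = [0, 0, 0]
centroids = [(0.0, 0.0)]
print(weighted_objective(points, labels, centroids, [[0.9, 0.1]]))
print(weighted_objective(points, labels, centroids, [[0.1, 0.9]]))
```

Putting most of the weight on the tight dimension gives the lower objective value, which is the behaviour a search over the raw weights would exploit.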

10.
In the last few years, hypergraph-based methods have gained considerable attention for solving real-world clustering problems, since a hypergraph can represent higher-order relationships between elements that a standard graph cannot. The most popular and promising approach to hypergraph clustering arises from concepts in spectral hypergraph theory [53], and clustering is cast as a hypergraph cut problem in which an appropriate objective function has to be optimized. The spectral relaxation of this optimization problem yields a clustering that is close to the optimum, but the approach generally suffers from high computational demands, especially in real-world problems where the data involved becomes too large. A natural way to overcome this limitation is to reduce the hypergraph and apply spectral clustering to a smaller hypergraph. In this paper, we introduce two novel hypergraph reduction algorithms that preserve the hypergraph structure as accurately as possible. These algorithms allowed us to design a new approach to hypergraph clustering, based on the multilevel paradigm, that operates in three steps: (i) hypergraph reduction; (ii) initial spectral clustering of the reduced hypergraph; and (iii) clustering refinement. The accuracy of our hypergraph clustering framework has been demonstrated by extensive experiments comparing it with other hypergraph clustering algorithms, and the framework has been successfully applied to image segmentation, for which an appropriate hypergraph-based model has been designed. The low running times of our algorithm also demonstrate that, unlike the standard spectral clustering approach, it can handle datasets of considerable size.
