Similar literature
 20 similar documents found (search time: 31 ms)
1.
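Building on an analysis of the basic steps of word-document spectral clustering and of the root cause of its sensitivity to initial values, this paper proposes a word-document spectral clustering method based on fuzzy K-harmonic means. First, the Laplacian matrix used in spectral clustering is processed from the perspective of matrix similarity so that it satisfies the condition for insensitivity to initial values. Then, by introducing the notion of fuzziness, the fuzzy K-harmonic means algorithm replaces the K-means algorithm, which makes the clustering result insensitive to initial values. Experimental results show that the proposed method not only makes the clustering result insensitive to initial values but also improves robustness on the data to some extent.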

2.
刘建军, 胡卫东, 郁文贤. 《计算机仿真》 (Computer Simulation), 2009, 26(7): 192-194, 227
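To give RBF networks an incremental learning capability and to make that incremental learning more robust, an incremental learning algorithm for RBF networks is presented. The algorithm first clusters the initial data set to obtain an initial RBF network structure, and then uses the hidden-node adjustment strategy of the GAP-RBF algorithm to adjust the network structure dynamically, which realizes incremental learning. Initializing the RBF network in this way reduces the influence of the training order of the samples in the initial data set on network performance and makes incremental learning more robust. Simulation experiments on the IRIS data set and on a measured radar data set show that the algorithm has good incremental learning capability.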

3.
范虹, 侯存存, 朱艳春, 姚若侠. 《软件学报》 (Journal of Software), 2017, 28(11): 3080-3093
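Existing soft subspace clustering algorithms are easily affected by random noise when segmenting MR images, and because they depend on the choice of initial cluster centers they tend to fall into local optima, which leads to unsatisfactory segmentation. To address this problem, a soft subspace clustering algorithm for MR images based on the fireworks algorithm is proposed. The algorithm first designs an objective function that combines bound constraints with noise clustering, which compensates for the sensitivity of existing algorithms to noisy data, and proposes a membership computation method that finds the subspace of each cluster quickly and accurately. Then an adaptive fireworks algorithm is introduced into the clustering process to balance local and global search effectively, which compensates for the tendency of existing algorithms to fall into local optima. Clustering results against the EWKM, FWKM, FSC, and LAC algorithms on UCI data sets, synthetic images, the Berkeley image data set, clinical breast MR images, and brain MR images show that the proposed algorithm not only achieves good results on the UCI data sets but also resists noise well in image clustering. In particular, it clusters MR images with high accuracy and robustness and segments MR images fairly effectively.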

4.
Iterative refinement clustering algorithms are widely used in data mining, but they are sensitive to initialization. Over the past decades, many modified initialization methods have been proposed to reduce this sensitivity. Iterative refinement clustering is, in essence, a local search method. The large number of local minima embedded in the search space makes the local search problem hard and sensitive to initialization. The fewer local minima there are, the more robust a local search algorithm is to initialization. In this paper, we propose a Top-Down Clustering algorithm with Smoothing Search Space (TDCS3) to reduce the influence of initialization. The main steps of TDCS3 are: (1) a series of smoothed search spaces is dynamically reconstructed into a hierarchical structure by 'filling' the local minimum points; (2) at the top level of the hierarchical structure, an existing iterative refinement clustering algorithm is run with random initialization to generate a clustering result; (3) from the second level down to the bottom level of the hierarchical structure, the same clustering algorithm is run with the initialization derived from the previous clustering result. Experimental results on 3 synthetic and 10 real-world data sets show that TDCS3 is significantly effective at finding better, more robust clustering results and at reducing the impact of initialization.

5.
A novel and generic multi-objective design paradigm is proposed that uses quantum-behaved PSO (QPSO) to decide the optimal configuration of the LQR controller for a given problem under a set of competing objectives. The paper makes three main contributions. (1) The standard QPSO algorithm is reinforced with an informed initialization scheme based on the simulated annealing algorithm and a Gaussian neighborhood selection mechanism. (2) It is also augmented with a local search strategy that integrates the advantages of memetic algorithms into conventional QPSO. (3) An aggregated dynamic weighting criterion is introduced that dynamically combines the soft and hard constraints with the control objectives to provide the designer with a set of Pareto optimal solutions and lets the designer choose the target solution based on practical preferences. The proposed method is compared against a gradient-based method, seven meta-heuristics, and the trial-and-error method on two control benchmarks using sensitivity analysis and full factorial parameter selection, and the results are validated with a one-tailed t-test. The experimental results suggest that the proposed method outperforms the competing methods in terms of controller effort, measures associated with transient response, and criteria related to steady state.

6.
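Medical ultrasound imaging is widely used in medical diagnosis because it is real-time, non-invasive, and inexpensive, but its inherent speckle noise and tissue-dependent texture have long made the segmentation of medical ultrasound images a difficult problem. The fuzzy C-means clustering algorithm (FCM) is fairly resistant to noise and can handle the segmentation of medical ultrasound images well, but it is limited by its sensitivity to the initial values of the cluster centers: when the initial cluster centers are chosen at random, the segmentation process is quite likely to fall into a local minimum, which degrades the segmentation result. Exploiting the ability of the genetic algorithm (GA) to search for a global optimum, a fuzzy clustering method that uses a genetic algorithm to find the initial cluster centers is proposed. It is applied to medical ultrasound image segmentation and achieves good results.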

7.
In spite of the initialization problem, the Expectation-Maximization (EM) algorithm is widely used for estimating the parameters of finite mixture models. Most popular model-based clustering techniques might yield poor clusters if the parameters are not initialized properly. To reduce the sensitivity to initial points, a novel algorithm for learning mixture models from multivariate data is introduced in this paper. The proposed algorithm takes advantage of TRUST-TECH (TRansformation Under STability-reTaining Equilibria CHaracterization) to compute neighboring local maxima on the likelihood surface using stability regions. Basically, our method combines the advantages of traditional EM with the dynamic and geometric characteristics of the stability regions of the nonlinear dynamical system that corresponds to the log-likelihood function. Two phases, namely the EM phase and the stability region phase, are repeated alternately in the parameter space to improve the maximum likelihood. The EM phase obtains a local maximum of the likelihood function, and the stability region phase helps to escape from that local maximum by moving towards the neighboring stability regions. The algorithm has been tested on both synthetic and real data sets, and the improvements in performance compared with other approaches are demonstrated. The robustness with respect to initialization is also illustrated experimentally.

8.
Spectral clustering algorithm based on global K-means   Cited by: 3 (self-citations: 1, citations by others: 2)
谢皝, 张平伟, 罗晟. 《计算机应用》 (Journal of Computer Applications), 2010, 30(7): 1936-1937
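Spectral clustering has been studied extensively in recent years, but it is sensitive to initialization. To address this weakness, a spectral clustering algorithm based on global K-means (GKSC) is proposed, which brings in the global K-means algorithm, itself insensitive to initial values, as a remedy. Simulation experiments show that, compared with traditional spectral clustering, GKSC yields more stable clustering results and higher clustering accuracy.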
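
The abstract does not give implementation details, so the following is only a minimal sketch of the global K-means idea on its own, not the authors' GKSC code. It assumes the usual incremental formulation: start from the overall centroid, then add one center at a time by trying every data point as the new center and keeping the run with the lowest squared error. In a spectral clustering pipeline the input `points` would presumably be the rows of the spectral embedding; the function names `kmeans` and `global_kmeans` are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(xs) / len(group) for xs in zip(*group))


def kmeans(points, centers, iters=50):
    """Plain Lloyd iterations from the given centers; returns (centers, sse)."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # An empty cluster keeps its previous center.
        new = [mean(g) if g else centers[j] for j, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


def global_kmeans(points, k):
    """Deterministic incremental search: no random initial centers are used."""
    centers = [mean(points)]
    for _ in range(1, k):
        # Try every data point as the candidate position of the new center.
        runs = (kmeans(points, centers + [p]) for p in points)
        centers = min(runs, key=lambda r: r[1])[0]
    return centers
```

Because every candidate start is tried, the result does not depend on a random seed, which is the property the abstract relies on; the cost is roughly one K-means run per data point for each added center.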

9.
贾娟娟, 贾富杰. 《计算机科学》 (Computer Science), 2018, 45(Z11): 247-250, 255
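Color image segmentation with the traditional fuzzy C-means (FCM) clustering algorithm raises several problems: choosing the number of clusters, determining the initial cluster centers, the heavy computation during iteration, and post-processing. Based on a study of these problems, and targeting the blind, random choice of initial values in traditional FCM segmentation, a fuzzy C-means color image segmentation method combined with hill climbing (HFCM) is proposed so that the initial clustering parameters of the image can be obtained automatically and more accurately. The method adaptively obtains the initial cluster centers and the number of clusters for FCM from the three-dimensional color histogram of the image to be segmented. A new post-processing strategy that combines mode (most-frequent-value) filtering with region merging is also proposed, which effectively eliminates small spatial regions. Experiments show that, compared with traditional FCM, the method segments images faster and produces results closer to human segmentation.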

10.
In clustering algorithms, choosing a subset of representative examples from the data set is very important. Such "exemplars" can be found by randomly choosing an initial subset of data objects and then iteratively refining it, but this works well only if that initial choice is close to a good solution. In this paper, the average density of an object is defined based on the frequency of attribute values. Furthermore, a novel initialization method for categorical data is proposed that considers both the distance between objects and the density of each object. We also apply the proposed initialization method to the k-modes algorithm and the fuzzy k-modes algorithm. Experimental results illustrate that the proposed initialization method is superior to random initialization and can be applied to large data sets because of its linear time complexity with respect to the number of data objects.
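
The abstract names the ingredients (a frequency-based average density and an object-to-object distance) but not the exact formulas, so the sketch below fills them in with assumptions: density is taken as the mean relative frequency of an object's attribute values, distance is the simple matching (Hamming) distance, and later centers are picked greedily by density times distance to the nearest chosen center. The names `average_density` and `init_modes` are hypothetical.

```python
from collections import Counter


def average_density(data):
    """Assumed density: mean relative frequency of each object's attribute values."""
    n, m = len(data), len(data[0])
    freq = [Counter(row[a] for row in data) for a in range(m)]
    return [sum(freq[a][row[a]] for a in range(m)) / (n * m) for row in data]


def hamming(x, y):
    """Simple matching distance: number of attributes on which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def init_modes(data, k):
    """Pick k initial modes: densest object first, then dense objects far from chosen ones."""
    dens = average_density(data)
    chosen = [max(range(len(data)), key=lambda i: dens[i])]
    while len(chosen) < k:
        nxt = max(
            (i for i in range(len(data)) if i not in chosen),
            # Score = density * distance to the nearest already-chosen center.
            key=lambda i: dens[i] * min(hamming(data[i], data[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [data[i] for i in chosen]
```

For a fixed number of clusters and attributes, each step is a single pass over the objects, which is consistent with the linear complexity in the number of objects that the abstract reports.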

11.
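To address the sensitivity of the traditional k-means algorithm to its initial cluster centers, a heuristic, initialization-independent k-means algorithm is proposed. The algorithm uses Prim's algorithm to select the k initial cluster centers and sets a threshold parameter θ to prevent several data objects from the same class from being chosen as initial cluster centers at the same time, which would otherwise increase the number of clustering iterations and produce wrong clustering results. Compared with the traditional k-means algorithm and a genetic-algorithm-based k-means clustering algorithm, the experimental results show that the improved algorithm reduces the effect that random selection of the initial cluster centers has on clustering performance, effectively reduces the number of clustering iterations, and lessens the influence of outliers on clustering performance, which confirms the feasibility and effectiveness of the algorithm.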

12.
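Symmetric nonnegative matrix factorization (SNMF), as a graph-based clustering algorithm, captures the cluster structure embedded in a graph representation more naturally and obtains better clustering results on both linear and nonlinear manifolds, but it is sensitive to the initialization of its variables. In addition, the standard SNMF algorithm measures the quality of the factorization with the sum of squared errors, which is sensitive to noise and outliers. To solve these problems, a robust self-adaptive symmetric nonnegative matrix factorization clustering algorithm, RS3NMF (robust self-adaptived symmetric nonnegative matrix factorization), is proposed from an ensemble learning perspective. The RS3NMF model, based on the L2,1 norm, mitigates the influence of noise and outliers, preserves feature rotation invariance, and improves the robustness of the model. At the same time, without relying on any additional information, it exploits the sensitivity of SNMF to initialization to enhance clustering performance progressively. The model is optimized with an alternating iterative method, and convergence of the objective function value is guaranteed. Extensive experimental results show that the proposed RS3NMF algorithm outperforms other state-of-the-art algorithms and is highly robust.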

13.
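To address the sensitivity of K-hubs, a hub-based clustering algorithm, to its initial cluster centers, a hub-based strategy for selecting initial centers is proposed. The strategy makes full use of the hubness phenomenon that is common in high-dimensional data and selects the K hub points that are farthest apart as the initial cluster centers. Experiments show that, compared with the original K-hubs algorithm with random initial centers, K-hubs with this strategy has a better distribution of initial centers and higher clustering accuracy, and its initial centers tend to lie close to the final cluster centers, which helps the algorithm converge faster.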
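
As an illustration only, the selection strategy can be sketched as follows. The abstract does not say how hubs are scored or how "farthest apart" is computed, so this sketch assumes the common hubness score N_k(x) (how often a point appears in other points' k-nearest-neighbor lists) and a greedy farthest-first choice among the top-scoring hubs. The names `hub_scores`, `farthest_hubs`, and the `n_hubs` parameter are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def hub_scores(points, k):
    """N_k score: number of times each point occurs in other points' k-NN lists."""
    n = len(points)
    scores = [0] * n
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sq_dist(points[i], points[j]),
        )
        for j in others[:k]:
            scores[j] += 1
    return scores


def farthest_hubs(points, n_clusters, k=3, n_hubs=None):
    """Greedy farthest-first selection of initial centers among the strongest hubs."""
    scores = hub_scores(points, k)
    # Assumed candidate pool: the top quarter of points by hub score.
    n_hubs = n_hubs or max(n_clusters, len(points) // 4)
    hubs = sorted(range(len(points)), key=lambda i: -scores[i])[:n_hubs]
    chosen = [hubs[0]]
    while len(chosen) < n_clusters:
        nxt = max(
            (h for h in hubs if h not in chosen),
            key=lambda h: min(sq_dist(points[h], points[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [points[i] for i in chosen]
```

The brute-force neighbor search here is quadratic in the number of points and is meant for small examples; the greedy step approximates, rather than exactly solves, the choice of K mutually farthest hubs.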

14.
方新, 赵卫东, 杨晓春. 《计算机应用》 (Journal of Computer Applications), 2008, 28(5): 1240-1243
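Image segmentation can be viewed as the process of clustering pixels that have different features. Taking into account pixel features such as gray level, gradient, and neighborhood, the Ant-Tree clustering algorithm is introduced into image segmentation. To address the redundant information in the clustering results of the Ant-Tree algorithm, an improved tree-structure model is adopted to speed up clustering. In addition, a new initialization method is proposed that uses the K-means algorithm to correct the cluster centers dynamically, which improves clustering accuracy and the robustness of the algorithm. Experimental results show that the improved Ant-Tree algorithm segments targets quickly and accurately and is a very effective image segmentation method.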

15.
Clustering provides a knowledge acquisition method for intelligent systems. This paper proposes a novel data-clustering algorithm that combines a new initialization technique, the K-means algorithm, and a new gradual data transformation approach to provide more accurate clustering results than the K-means algorithm and its variants by increasing the clusters' coherence. The proposed data transformation approach solves the problem of generating empty clusters, which frequently occurs in other clustering algorithms. An efficient method based on the principal component transformation and a modified silhouette algorithm is also proposed to determine the number of clusters. Several different data sets are used to evaluate how well the proposed method deals with the empty-cluster problem, as well as its accuracy and computational performance in comparison with other K-means based initialization techniques and clustering methods. The estimation method developed for determining the number of clusters is also evaluated and compared with other estimation algorithms. The significance of the proposed method lies in addressing the limitations of K-means based clustering and improving the accuracy of clustering as an important method in data mining and expert systems. Applying the proposed method to knowledge acquisition in time series data such as wind, solar, electric load, and stock market data provides a pre-processing tool for selecting the most appropriate data to feed into neural networks or other estimators used to forecast such time series. In addition, using the knowledge discovered by the proposed K-means clustering to develop rule-based expert systems is one of the main impacts of the proposed method.

16.
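To further reduce pseudo-label noise in unsupervised deep hashing retrieval, an unsupervised distillation hashing method for image retrieval with equal-size-constrained clustering is proposed. The method has two stages. In the first stage, unlabeled images are annotated with soft pseudo-labels that supervise hash feature learning in the second stage; the proposed equal-size-constrained clustering algorithm effectively reduces the noise in the pseudo-labels during this annotation. In the second stage, a student hash network is trained to extract image hash features. With the proposed unsupervised distillation hashing method, the soft pseudo-labels of the images guide hash feature learning, which further improves hash retrieval performance and achieves efficient unsupervised hash image retrieval. To evaluate the effectiveness of the proposed method, experiments were carried out on three public data sets, CIFAR-10, FLICKR25K, and EuroSAT, with comparisons against other state-of-the-art methods. On CIFAR-10, the retrieval precision of the proposed method is on average 12.7% higher than that of TBH; on FLICKR25K, it is on average 1.0% higher than that of DistillHash; on EuroSAT, it is on average 16.9% higher than that of ETE-GAN. The experimental results on the three public data sets show that the proposed method achieves high-performance unsupervised hash retrieval and adapts well to all the kinds of data tested.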

17.
Partitional clustering of categorical data is normally performed with the K-modes clustering algorithm, which works well for large data sets. Even though the design and implementation of the K-modes algorithm are simple and efficient, it has the pitfall of choosing the initial cluster centers at random on every new execution, which may lead to non-repeatable clustering results. This paper addresses the randomized center initialization problem of the K-modes algorithm by proposing a cluster center initialization algorithm. The proposed algorithm performs multiple clusterings of the data based on the attribute values in different attributes and yields deterministic modes that are used as initial cluster centers. In the paper, we propose a new method for selecting the most relevant attributes, namely Prominent attributes, compare it with an existing method for finding Significant attributes for unsupervised learning, and perform multiple clusterings of the data to find initial cluster centers. The proposed algorithm ensures fixed initial cluster centers and thus repeatable clustering results. The worst-case time complexity of the proposed algorithm is log-linear in the number of data objects. We evaluate the proposed algorithm on several categorical data sets, compare it against random initialization and two other initialization methods, and show that the proposed method performs better in terms of accuracy and time complexity. The initial cluster centers computed by the proposed approach are close to the actual cluster centers of the different data sets we tested, which leads to faster convergence of the K-modes clustering algorithm together with better clustering results.

18.
This paper presents an extension of one-class support vector machines (OC-SVM) to an ensemble of soft OC-SVM classifiers. The idea is to first cluster the input data with a kernel version of deterministically annealed fuzzy c-means. The data partitioned in this way are then used to train a number of soft OC-SVM classifiers, which allow a weight to be assigned to each training sample. The weights are obtained from the cluster membership values computed by the kernel fuzzy c-means. The method was designed and tested mostly on image classification and segmentation tasks, although it can be used for other one-class problems.

19.
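Research on a fast fuzzy C-means clustering algorithm combining soft and hard clustering   Cited by: 2 (self-citations: 1, citations by others: 1)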
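This paper discusses an improvement to fuzzy C-means clustering. Building on the original fuzzy C-means algorithm, a fast fuzzy C-means clustering algorithm that combines soft and hard clustering is proposed. The fast algorithm adds a layer of hard C-means clustering in front of the fuzzy C-means clustering algorithm. Hard clustering completes much faster than fuzzy clustering, and the hard cluster centers are used as the initial values for iterating the fuzzy cluster centers, which speeds up the convergence of fuzzy C-means. This is valuable when clustering large amounts of data. Data simulations confirm that, compared with the fuzzy C-means algorithm, the fast algorithm has a shorter iterative adjustment process, converges faster, and clusters well.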
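
The two-stage scheme described above can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes the standard hard C-means (Lloyd) update, the standard fuzzy C-means membership and center updates with fuzzifier `m`, and a stopping rule based on center movement. The function names, the `seed` parameter, and the tolerance are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def hard_c_means(points, c, iters=20, seed=0):
    """Stage 1: cheap hard clustering from randomly sampled data points."""
    centers = random.Random(seed).sample(points, c)
    for _ in range(iters):
        groups = [[] for _ in range(c)]
        for p in points:
            j = min(range(c), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def fuzzy_c_means(points, centers, m=2.0, tol=1e-6, max_iter=200):
    """Stage 2: fuzzy C-means started from the given centers; returns (centers, iterations)."""
    dim = len(points[0])
    for it in range(1, max_iter + 1):
        # Membership update: u_ij is proportional to D_ij ** (-1 / (m - 1)), D = squared distance.
        memberships = []
        for p in points:
            w = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
            total = sum(w)
            memberships.append([wj / total for wj in w])
        # Center update: mean of the points weighted by u_ij ** m.
        new_centers = []
        for j in range(len(centers)):
            wts = [u[j] ** m for u in memberships]
            total = sum(wts)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(wts, points)) / total for d in range(dim)
            ))
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, it
```

A typical call would be `fuzzy_c_means(points, hard_c_means(points, c))`, so that the fuzzy iterations start near a reasonable partition instead of from random centers.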

20.
The leading partitional clustering technique, k-modes, is one of the most computationally efficient clustering methods for categorical data. However, the performance of the k-modes clustering algorithm, which can converge to any of numerous local minima, depends strongly on the initial cluster centers. Currently, most methods for initializing cluster centers are designed for numerical data. Because categorical data lack geometry, the initialization methods used for numerical data are not applicable to categorical data. This paper proposes a novel initialization method for categorical data and applies it to the k-modes algorithm. The method integrates distance and density to select initial cluster centers and overcomes the shortcomings of the existing initialization methods for categorical data. Experimental results illustrate that the proposed initialization method is effective and can be applied to large data sets because of its linear time complexity with respect to the number of data objects.
