Similar literature
20 similar documents found.
1.
孙倩  陈昊  李超 《计算机应用研究》2020,37(6):1707-1710,1764
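To address the low computational efficiency and clustering performance of big-data clustering algorithms, this paper proposes a big-data clustering algorithm based on an improved artificial bee colony (ABC) algorithm and MapReduce. The grey wolf optimizer is combined with the ABC algorithm to improve both the exploration and the exploitation ability of ABC, which effectively improves clustering performance. Chaotic mapping and opposition-based learning serve as the initialization strategy for the ABC population, raising the quality of the solutions found. The clustering algorithm is implemented on Hadoop's MapReduce programming model and clusters big data by minimizing the sum of squared intra-cluster distances. Experimental results show that the algorithm effectively improves clustering quality on large data sets while also speeding up clustering.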
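The initialization and objective described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the logistic map stands in for the unspecified chaotic map, the opposite of a value x in [lower, upper] is taken as lower + upper - x, and the function names (`logistic_map_sequence`, `sse`, `init_population`) are hypothetical. The grey wolf hybridization and the MapReduce layer are not shown.

```python
def logistic_map_sequence(length, seed=0.37, r=4.0):
    """Return `length` values in [0, 1] from the logistic map x <- r*x*(1-x).

    Assumption: the logistic map is used as the chaotic map.
    """
    values = []
    x = seed
    for _ in range(length):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def sse(centers, points):
    """Sum over all points of the squared distance to the nearest center."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return total


def init_population(points, k, pop_size, lower, upper, seed=0.37):
    """Build chaotic candidates and their opposites; keep the best by SSE."""
    dim = len(points[0])
    chaos = logistic_map_sequence(pop_size * k * dim, seed)
    candidates = []
    idx = 0
    for _ in range(pop_size):
        solution = []
        for _ in range(k):
            center = []
            for d in range(dim):
                center.append(lower[d] + chaos[idx] * (upper[d] - lower[d]))
                idx += 1
            solution.append(center)
        # Opposition-based learning: mirror each coordinate inside its bounds.
        opposite = [[lower[d] + upper[d] - c[d] for d in range(dim)]
                    for c in solution]
        candidates.append(solution)
        candidates.append(opposite)
    # Keep the pop_size candidates with the smallest clustering objective.
    candidates.sort(key=lambda s: sse(s, points))
    return candidates[:pop_size]
```

Each solution is a list of k candidate cluster centers; evaluating both a chaotic candidate and its opposite doubles the pool from which the starting population is picked.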

2.
One of the simple techniques for data clustering is Fuzzy C-means (FCM) clustering, which describes how strongly each data point belongs to a cluster with a fuzzy membership function instead of a crisp value. However, the results of fuzzy clustering depend highly on the selection of the initial state, and the risk of missing the best results is high when the datasets are large. In this paper, we present a hybrid algorithm based on FCM and a modified stem cells algorithm, which we call the SC-FCM algorithm, for optimum clustering of a dataset into K clusters. The experimental results obtained with the new algorithm on several well-known datasets, compared with those obtained by the K-means algorithm, FCM, Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), and the Artificial Bee Colony (ABC) algorithm, demonstrate the better performance of the new algorithm.
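For readers unfamiliar with fuzzy memberships, the standard FCM membership update can be sketched as below. This is a minimal illustration of plain FCM only, assuming Euclidean distance and the usual fuzzifier m > 1; the function name `fcm_memberships` is hypothetical, and the stem cells hybridization from the abstract is not shown.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return one row per point with its membership degree in each cluster.

    Standard FCM rule: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
    """
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: split the full
            # membership among those centers to avoid dividing by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
            continue
        rows.append([1.0 / sum((di / dk) ** exponent for dk in dists)
                     for di in dists])
    return rows
```

Every row sums to 1, so a point between two centers receives graded memberships rather than a single hard label.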

3.
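Clustering of protein-protein interaction (PPI) networks is one of the main methods for studying and revealing protein function. Because of the characteristics of PPI networks, traditional algorithms cannot cluster them effectively. This paper proposes a clustering algorithm based on the bee colony algorithm and breadth-first traversal. To keep noise points from interfering with the experimental results, a distance-density algorithm is used in the preprocessing stage to determine the number of clusters and remove noise points. The initial cluster centers are then determined from the comprehensive network feature values of the nodes, and clustering is performed with a breadth-first traversal search. An improved bee colony algorithm then automatically finds the optimal merging threshold. Finally, the algorithm is evaluated by precision and recall, and several of its important parameters are analyzed by simulation. The simulation results show that the algorithm effectively improves the clustering of PPI networks.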

4.
梁冰  徐华 《计算机应用》2017,37(9):2600-2604
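To address the sensitivity of the kernel fuzzy C-means (KFCM) algorithm to its initial cluster centers and its tendency to fall into local optima, and exploiting the simple structure and fast global convergence of the artificial bee colony (ABC) algorithm, this paper proposes a clustering algorithm that combines an improved ABC algorithm (IABC) with KFCM iterations. First, the optimal solution found by IABC is used as the initial cluster centers of KFCM. During its iterations, IABC uses the rate of change of the difference from the current per-dimension optimal solution as a weight to modify the search behavior of the employed bees, balancing the global search and local exploitation abilities of ABC. Second, a fitness function suited to KFCM is constructed from the intra-class and inter-class distances, and KFCM is used to optimize the cluster centers. Finally, IABC and KFCM are executed alternately to reach the best clustering result. Simulation experiments on three benchmark test functions and six UCI standard data sets show that, compared with generalized fuzzy clustering based on an improved artificial bee colony (IABC-KGFCM), IABC-KFCM raises the clustering validity index on the data sets by 1 to 4 percentage points and offers strong robustness and high clustering accuracy.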

5.
Extreme learning machine (ELM), a new learning approach, has shown good generalization performance in regression and classification applications. Clustering analysis is an important tool for exploring the structure of data and has been employed in many disciplines and applications. In this paper, we present a method that builds on ELM projection of the input data into a high-dimensional feature space, followed by unsupervised clustering using the artificial bee colony (ABC) algorithm. While ELM projection facilitates the separability of clusters, a metaheuristic technique such as the ABC algorithm overcomes the dependence on cluster-center initialization and the convergence to local minima suffered by conventional algorithms such as K-means. The proposed ELM-ABC algorithm is tested on 12 benchmark data sets. The experimental results show that the ELM-ABC algorithm can effectively improve the quality of clustering.

6.
段谟意 《计算机应用》2013,33(3):727-729
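To address increasingly serious network security problems, this paper proposes a new state detection algorithm, called DASA, based on the artificial bee colony and clustering methods. The algorithm first builds an abnormal-state model of traffic flows using the SKETCH method and hash functions, and then uses the artificial bee colony technique to detect abnormal states. Finally, simulation experiments on real data compare the sample data with the detection results of DASA. They show that DASA adapts well and that factors such as the number of clusters, the discard threshold, and the neighborhood radius strongly affect state detection.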

7.
Over the past few years, swarm intelligence based optimization techniques such as ant colony optimization and particle swarm optimization have received considerable attention from engineering researchers and practitioners. These algorithms have been used in the solution of various engineering problems. Recently, a relatively new swarm based optimization algorithm called the Artificial Bee Colony (ABC) algorithm has begun to attract interest from researchers to solve optimization problems. The aim of this study is to present an optimization algorithm based on the ABC algorithm for the discrete optimum design of truss structures. The ABC algorithm is a meta-heuristic optimization technique that mimics the process of food foraging of honeybees. Originally the ABC algorithm was developed for continuous function optimization problems. This paper describes the modifications made to the ABC algorithm in order to solve discrete optimization problems and to improve the algorithm’s performance. In order to demonstrate the effectiveness of the modified algorithm, four structural problems with up to 582 truss members and 29 design variables were solved and the results were compared with those obtained using other well-known meta-heuristic search techniques. The results demonstrate that the ABC algorithm is very effective and robust for the discrete optimization designs of truss structural problems.
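Since several abstracts on this page build on ABC, a compact sketch of the original continuous form of the algorithm may help. It is not the discrete truss-design variant described above. The function name `abc_minimize`, the parameter defaults, and the fitness weighting 1 / (1 + cost) (which assumes a non-negative objective) are illustrative choices.

```python
import random


def abc_minimize(f, lower, upper, n_sources=10, limit=20, iters=200, seed=1):
    """Minimize f over box bounds with a basic artificial bee colony loop."""
    rng = random.Random(seed)
    dim = len(lower)

    def random_source():
        return [rng.uniform(lower[d], upper[d]) for d in range(dim)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [f(s) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbor(i):
        # v_ij = x_ij + phi * (x_ij - x_kj), with phi in [-1, 1], for one
        # random dimension j and a random partner source k != i.
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        cand = list(sources[i])
        cand[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        cand[j] = min(max(cand[j], lower[j]), upper[j])
        cost = f(cand)
        if cost < costs[i]:              # greedy selection
            sources[i], costs[i], trials[i] = cand, cost, 0
        else:
            trials[i] += 1

    for _ in range(iters):
        for i in range(n_sources):       # employed bee phase
            try_neighbor(i)
        weights = [1.0 / (1.0 + c) for c in costs]   # assumes f >= 0
        for _ in range(n_sources):       # onlooker bee phase
            i = rng.choices(range(n_sources), weights=weights)[0]
            try_neighbor(i)
        for i in range(n_sources):       # scout bee phase
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = f(sources[i])
                trials[i] = 0
        i_best = min(range(n_sources), key=costs.__getitem__)
        if costs[i_best] < best_cost:    # remember the best source seen
            best_cost, best = costs[i_best], list(sources[i_best])
    return best, best_cost
```

For clustering, `f` would typically be a clustering objective evaluated on a flattened vector of cluster centers, which is how many of the hybrid methods listed here use ABC.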

8.
Clustering techniques have received attention in many fields of study such as engineering, medicine, biology and data mining. The aim of clustering is to collect data points into groups. The K-means algorithm is one of the most common techniques used for clustering. However, the results of K-means depend on the initial state and converge to local optima. Many clustering studies have sought to overcome this local-optima obstacle. This paper presents an efficient hybrid evolutionary optimization algorithm that combines the Modified Imperialist Competitive Algorithm (MICA) and K-means (K), called K-MICA, for optimum clustering of N objects into K clusters. The new Hybrid K-ICA algorithm is tested on several data sets and its performance is compared with those of MICA, ACO, PSO, Simulated Annealing (SA), Genetic Algorithm (GA), Tabu Search (TS), Honey Bee Mating Optimization (HBMO) and K-means. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handling data clustering.

9.
The self-organizing map (SOM) has been widely used in many industrial applications. Classical clustering methods based on the SOM often fail to deliver satisfactory results, especially when clusters have arbitrary shapes. In this paper, using preprocessing techniques that filter out noise and outliers, we propose a new two-level SOM-based clustering algorithm that uses a clustering validity index based on inter-cluster and intra-cluster density. Experimental results on synthetic and real data sets demonstrate that the proposed algorithm clusters data better than the classical SOM-based clustering algorithms and finds an optimal number of clusters.

10.
喻金平  郑杰  梅宏标 《计算机应用》2014,34(4):1065-1069
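To address the poor global search ability of the K-means clustering (KMC) algorithm and its sensitivity to the choice of initial cluster centers, as well as the random initialization, premature convergence, and slow late-stage convergence of the original artificial bee colony (ABC) algorithm, this paper proposes an improved artificial bee colony algorithm (IABC). The algorithm initializes the colony with the max-min distance product method, and constructs a fitness function suited to KMC together with a globally guided position update formula to make the iterative search more efficient. The improved ABC algorithm is then combined with KMC to form the IABC-Kmeans algorithm, which improves clustering performance. Simulation experiments on four standard test functions (Sphere, Rastrigin, Rosenbrock, and Griewank) and on UCI standard data sets show that IABC converges quickly and overcomes the original algorithm's tendency to fall into local optima, while IABC-Kmeans achieves better clustering quality and overall performance.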

11.
戚攀  包开阳  马皛源 《计算机应用》2018,38(7):1974-1980
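To improve the energy efficiency of wireless sensor networks (WSNs) and extend their lifetime, this paper proposes a hierarchical WSN routing algorithm based on fuzzy C-means clustering (FCM) and swarm intelligence (FCM-SI). First, the FCM clustering algorithm partitions the network into clusters, optimizing the distance between ordinary nodes and cluster heads (CHs). Next, a three-parameter artificial bee colony (ABC) algorithm selects the optimal cluster head of each cluster. Finally, the ant colony optimization (ACO) algorithm searches for multi-hop paths from the cluster heads to the base station (BS), taking both the energy consumption and the load balance of the network into account. Simulation results show that, compared with the improved low-energy adaptive clustering hierarchy algorithm based on uniform clustering (I-LEACH), the ABC-based algorithm (ABC-LEACH), and the ACO-based algorithm (ANT-LEACH), FCM-SI extends the network lifetime by 65.2%, 49.6%, and 29.0%, respectively, for an initial network of 100 nodes in a 100 m × 100 m area. FCM-SI effectively extends the network lifetime and improves energy efficiency.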

12.
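Bee colony information-flow clustering model and algorithm for protein-protein interaction networks   Total citations: 1, self-citations: 0, citations by others: 1
Research on clustering algorithms for protein-protein interaction networks is an important way to understand the structure and function of molecules and to identify protein functional modules. Many traditional clustering algorithms perform poorly on protein-protein interaction networks. The functional flow simulation algorithm is a newer clustering algorithm, but it ignores the effect of distance and requires the merging threshold to be set manually, which makes it subjective. This paper proposes a novel information-flow clustering model and algorithm based on the bee colony optimization mechanism. In this method, data preprocessing initializes the cluster centers by ranking the comprehensive network feature values of the nodes; the food-source positions of the bee colony algorithm correspond to the cluster centers, and the profitability of a food source corresponds to the similarity between modules; all neighbors of an employed-bee node, sorted in descending order of their comprehensive network feature values, form the search neighborhood of the scout bees. The clustering results are evaluated objectively with precision, recall, and other measures, and several key parameters of the algorithm are simulated, compared, and analyzed. The results show that the new algorithm not only overcomes the shortcomings of the original functional flow simulation algorithm but also achieves the highest geometric mean of precision and recall, and can effectively identify protein functional modules.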

13.
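To address the sensitivity of the K-means clustering algorithm to its initial cluster centers and its tendency to fall into local optima, this paper proposes an artificial bee colony (ABC) clustering algorithm based on K-means. Combining an improved ABC algorithm with K-means iterations reduces the algorithm's dependence on the initial cluster centers and its likelihood of falling into local optima, which improves its stability. An initialization strategy based on opposition-based learning increases the diversity of the initial population. A nonlinear selection strategy mitigates premature convergence and improves search efficiency. Dynamic adjustment of the neighborhood search range speeds up convergence and strengthens local optimization. Experimental results show that the algorithm not only overcomes the poor stability of K-means but also delivers good performance and clustering results.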

14.
Automatically extracting T-S fuzzy models with enhanced performance from data is an interesting and important issue in fuzzy system modeling. In this paper, a novel three-step methodology is proposed for this issue. Firstly, the idea of variable-length genotypes is introduced into the artificial bee colony (ABC) algorithm to derive a Variable string length Artificial Bee Colony (VABC) algorithm. The VABC algorithm can solve optimization problems in which the length of the optimal solution is not known a priori. Secondly, fuzzy clustering without a priori knowledge of the cluster number is treated as this kind of optimization problem. Thus, a novel version of the Fuzzy C-Means clustering technique (VABC-FCM), with powerful global search ability, is proposed based on the VABC algorithm. VABC allows a variable cluster number to be encoded, so VABC-FCM does not require the cluster number to be specified a priori. Finally, the proposed VABC-FCM algorithm is used to extract a T-S fuzzy model from data. This convenient VABC-FCM-based T-S fuzzy model extraction methodology does not require the rule number to be specified a priori. Several artificial data sets are used to validate the performance of the convenient T-S fuzzy model. The experimental results show that the proposed model has low approximation error and high prediction accuracy with an appropriate rule number. Moreover, the convenient T-S fuzzy model is used to model the characteristics of the superheated steam temperature in a power plant, and the results suggest that the proposed method performs strongly.

15.
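To address the sensitivity of the traditional fuzzy C-means clustering algorithm to initial values and noise, this paper proposes a fuzzy C-means clustering algorithm based on a multi-chain quantum bee colony algorithm. First, a multi-chain extended encoding scheme is applied to the quantum bee colony algorithm, yielding the multi-chain quantum bee colony algorithm. Second, the multi-chain quantum bee colony algorithm is used to optimize the initial cluster centers of fuzzy C-means clustering. Finally, a new image segmentation algorithm is designed that uses the multi-chain quantum bee colony algorithm to optimize the fuzzy C-means cluster centers. Experimental results show that the proposed image segmentation algorithm is effective: compared with the traditional fuzzy C-means clustering algorithm and the fuzzy-based artificial bee colony algorithm, it performs better in segmentation accuracy, segmentation speed, and robustness.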

16.
In recent years, heuristic algorithms have been successfully applied to solve clustering and classification problems. In this paper, the gravitational search algorithm (GSA), one of the newest swarm-based heuristic algorithms, is used to build a prototype classifier for classifying instances in multi-class data sets. The proposed method employs GSA as a global searcher to find the best positions of the representatives (prototypes). The proposed GSA-based classifier is applied to several well-known benchmark sets. Its performance is compared with the artificial bee colony (ABC), the particle swarm optimization (PSO), and nine other classifiers from the literature. The experimental results on twelve data sets from the UCI machine learning repository confirm that GSA can successfully be applied as a classifier to classification problems.

17.
Geo-demographic analysis is an essential part of a geographical information system (GIS) for predicting people’s behavior based on statistical models and their residential location. Fuzzy Geographically Weighted Clustering (FGWC) is one of the most efficient algorithms in geo-demographic analysis. Despite being effective, FGWC is sensitive to initialization: the random selection of cluster centers makes the iterative process fall easily into a local optimum. Artificial Bee Colony (ABC), one of the most popular meta-heuristic algorithms, can serve as a tool for reaching globally optimal solutions. This research proposes a novel geo-demographic analysis algorithm that integrates FGWC into the optimization scheme of ABC to improve geo-demographic clustering accuracy. Experimental results on various datasets show that the clustering quality of the proposed algorithm, called FGWC-ABC, is better than that of other relevant methods. The proposed algorithm is also applied to a decision-making application that analyzes crime behavior in the population using the US communities and crime dataset. It provides fuzzy rules that determine the violent crime rate in terms of linguistic labels from socioeconomic variables. These results are significant for predicting future US violent crime rates and for facilitating appropriate decisions on preventing such situations in the future.

18.
In this paper, we present an agglomerative fuzzy $k$-means clustering algorithm for numerical data, an extension of the standard fuzzy $k$-means algorithm that adds a penalty term to the objective function to make the clustering process insensitive to the initial cluster centers. The new algorithm can produce more consistent clustering results from different sets of initial cluster centers. Combined with cluster validation techniques, the new algorithm can determine the number of clusters in a data set, which is a well-known problem in $k$-means clustering. Experimental results on synthetic data sets (2 to 5 dimensions, 500 to 5000 objects and 3 to 7 clusters), the BIRCH two-dimensional data set of 20000 objects and 100 clusters, and the WINE data set of 178 objects, 17 dimensions and 3 clusters from UCI, have demonstrated the effectiveness of the new algorithm in producing consistent clustering results and determining the correct number of clusters in different data sets, some with overlapping inherent clusters.

19.
Differential evolution optimization-based clustering techniques are powerful, robust and more sophisticated than conventional clustering methods because of their stochastic and heuristic characteristics. Unfortunately, these algorithms suffer from several drawbacks, such as a tendency to become trapped or to stagnate in local optima and slow convergence rates. These drawbacks stem from the difficulty of balancing the exploitation and exploration processes, which directly affects the final quality of the clustering solutions. Hence, a variance-based differential evolution algorithm with an optional crossover for data clustering is presented in this paper to further enhance the quality of the clustering solutions along with the convergence speed. The proposed algorithm balances the exploitation and exploration processes by introducing (i) a single-based solution representation, (ii) a switchable mutation scheme, (iii) a vector-based estimation of the mutation factor, and (iv) an optional crossover strategy. The performance of the proposed algorithm is compared with current state-of-the-art differential evolution-based clustering techniques on 15 benchmark datasets from the UCI repository. The experimental results are also thoroughly evaluated and verified via non-parametric statistical analysis. Based on the obtained experimental results, the proposed algorithm achieves an average enhancement of up to 11.98% in classification accuracy and a significant improvement in cluster compactness over the competing algorithms. Moreover, the proposed algorithm outperforms its peers in convergence speed and provides repeatable clustering results over 50 independent runs.

20.
Combining multiple clusterings using evidence accumulation   Total citations: 2, self-citations: 0, citations by others: 2
We explore the idea of evidence accumulation (EAC) for combining the results of multiple clusterings. First, a clustering ensemble, that is, a set of object partitions, is produced. Given a data set (n objects or patterns in d dimensions), different ways of producing data partitions are: 1) applying different clustering algorithms and 2) applying the same clustering algorithm with different values of parameters or initializations. Further, combinations of different data representations (feature spaces) and clustering algorithms can also provide a multitude of significantly different data partitionings. We propose a simple framework for extracting a consistent clustering, given the various partitions in a clustering ensemble. According to the EAC concept, each partition is viewed as an independent piece of evidence of data organization, and the individual data partitions are combined, based on a voting mechanism, to generate a new n × n similarity matrix between the n patterns. The final data partition of the n patterns is obtained by applying a hierarchical agglomerative clustering algorithm to this matrix. We have developed a theoretical framework for the analysis of the proposed clustering combination strategy and its evaluation, based on the concept of mutual information between data partitions. Stability of the results is evaluated using bootstrapping techniques. A detailed discussion of an evidence accumulation-based clustering algorithm, using a split and merge strategy based on the k-means clustering algorithm, is presented. Experimental results of the proposed method on several synthetic and real data sets are compared with other combination strategies, and with individual clustering results produced by well-known clustering algorithms.
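The voting step described in the abstract above can be illustrated with a short sketch. It builds the co-association matrix from a list of label vectors and then merges every pair whose vote fraction exceeds a threshold, which corresponds to cutting a single-link hierarchy at that similarity level. The function names and the fixed threshold of 0.5 are assumptions for illustration; the paper's own choice of linkage and cut is not reproduced here.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of objects shares a cluster."""
    n = len(partitions[0])
    votes = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    votes[i][j] += 1
    return [[v / len(partitions) for v in row] for row in votes]


def single_link_cut(matrix, threshold=0.5):
    """Merge objects linked by similarity above `threshold` (union-find)."""
    n = len(matrix)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] > threshold:
                parent[find(i)] = find(j)
    # Relabel the union-find roots as consecutive cluster ids.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


partitions = [
    [0, 0, 1, 1, 2],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
similarity = co_association(partitions)
consensus = single_link_cut(similarity)
```

Label values need not agree across partitions, because only co-membership within each partition is counted.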
