Similar Literature
1.
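To address the generally high computational complexity of existing clustering algorithms, a positioning-based method is proposed. The algorithm uses spatial positioning to map data objects into a feature space and uses certain special vertices of a spatial cube to locate any data point; clustering is completed by computing the differences in distance between the data points and the group of cube vertices. Experimental results on a telecom data set show that the time complexity of the algorithm is reduced to O(N).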

2.
In this paper, we show how one can take advantage of the stability and effectiveness of object data clustering algorithms when the data to be clustered are available in the form of mutual numerical relationships between pairs of objects. More precisely, we propose a new fuzzy relational algorithm, based on the popular fuzzy C-means (FCM) algorithm, which does not require any particular restriction on the relation matrix. We describe the application of the algorithm to four real and four synthetic data sets, and show that our algorithm performs better than well-known fuzzy relational clustering algorithms on all these sets.

3.
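Hierarchical Clustering Algorithm Based on Tree Edit Distance   Cited by: 1 (self-citations: 0, citations by others: 1)
To identify false identities forged or altered by criminal suspects, tree edit distance is used to compute the similarity of individual attributes, and the relevant mathematical properties of tree edit distance are proved. Applying a hierarchical coding method to the attributes, a new hierarchical clustering algorithm based on tree edit distance, HCTED (Hierarchical Clustering Algorithm Based on Tree Edit Distance), is proposed. The new algorithm computes attribute similarity at minimum cost through tree edit operations, overcomes the shortcomings of traditional clustering algorithms in handling nominal attributes, improves clustering accuracy, and clusters a given sample by applying a preset threshold. Experiments confirm the accuracy and effectiveness of the new method for identity recognition and examine the influence of different parameters on the results; compared with traditional clustering algorithms, HCTED performs markedly better. The new algorithm has been applied to police analysis of the floating population, with good results.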

4.
R. Wilson  M. Spann 《Pattern recognition》1990,23(12):1413-1425
Estimation theory is used to derive a new approach to the clustering problem. The new method is a unification of centroid and mode estimation, achieved by considering the effect of spatial scale on the estimator. The result is a multiresolution method which spans a range of spatial scales, giving enhanced robustness both to noise in the data and to changes of scale in the data, by using comparison between scales as a test of cluster validity. Iterative and non-iterative algorithms based on the new estimator are presented and are shown to be more accurate than simple scale-space filtering in identifying and locating the cluster centres from noisy test data. Results from a wide range of applications are used to illustrate the power and versatility of the new method.

5.
This paper proposes a stock price trend clustering and trend investment decision model that uses a genetic algorithm to search for optimal solutions and the best investment strategies for different stock price trends. The new price trend clustering model identifies three types of stock price movements: uptrends, sideways trends, and downtrends. Unfortunately, trends discovered through stock price movements or technical indicator graphs are typically subjective and unquantifiable. This paper takes daily stock prices and trading volume data from the China Shanghai Stock Exchange Composite Index (SSECI) from January 2, 1997 to August 31, 2012, to examine the performance of the proposed trend clustering model. The proposed model is also compared with other popular stock market investment strategies to verify its validity. The results show that the proposed trend clustering model correctly identifies three different trends in the stock market. Furthermore, the trend investment strategy model developed with the genetic algorithm performs better than other investment strategies, namely Granville’s rule, the KD indicator strategy, the buy-and-hold strategy, and GMA rules, in both bull and bear market periods. The results indicate that the proposed model is a stable and valid investment strategy.

6.
Clustering is widely used in image segmentation because it is simple and easy to apply. However, existing clustering techniques require prior information as input, and their performance depends entirely on that information, which is their main drawback. Many researchers are therefore trying to develop methods free of user-supplied parameters. In this article, we propose a hybrid image segmentation method based on a clustering algorithm and the black hole algorithm. The clustering method is independent of user parameters and uses recursive density estimation of the surrounding pixels; a region-merging step then improves the clustering output. After clustering, small segments may remain and lower segmentation performance, so each such segment is merged with its best-matching segment. The black hole algorithm is used to define the fitness of each segment and to find the best-matching segment. We compare the proposed method with other clustering-based segmentation methods using several evaluation indices, and the results demonstrate the effectiveness of the proposed algorithm.

7.
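Traditional spectral clustering is sensitive to initialization. To address this weakness, the Canopy algorithm is used to perform a "coarse" clustering of the samples and obtain initial cluster centers, which are then fed to the K-Means algorithm. The resulting algorithm, which fuses Canopy with spectral clustering (Canopy-SC), reduces the blindness with which traditional spectral clustering selects initial centers, and it is applied to face image clustering. Compared with traditional spectral clustering, Canopy-SC obtains better cluster centers and clustering results and achieves higher clustering accuracy. Experimental results demonstrate the effectiveness and feasibility of the algorithm.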
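The "coarse" Canopy step named in this abstract can be sketched as follows. This is a minimal illustration of the generic Canopy algorithm, not the authors' Canopy-SC implementation: the function name `canopy_centers`, the thresholds `t1` and `t2`, and the choice of the canopy mean as the initial center are all assumptions made for the example.

```python
import math
import random


def canopy_centers(points, t1, t2, seed=0):
    """Return rough cluster centers via the generic Canopy algorithm.

    t1 is the loose threshold and t2 the tight one (t1 > t2).
    Hypothetical helper for illustration only.
    """
    if t1 <= t2:
        raise ValueError("t1 (loose threshold) must exceed t2 (tight threshold)")
    rng = random.Random(seed)
    remaining = list(points)
    centers = []
    while remaining:
        # Pick a random remaining point as the seed of a new canopy.
        center = remaining.pop(rng.randrange(len(remaining)))
        # Points within the loose threshold t1 belong to this canopy.
        canopy = [center] + [p for p in remaining if math.dist(center, p) < t1]
        # Points within the tight threshold t2 may not seed another canopy.
        remaining = [p for p in remaining if math.dist(center, p) >= t2]
        # Assumption: the canopy mean serves as the initial center for K-means.
        centers.append(tuple(sum(c) / len(canopy) for c in zip(*canopy)))
    return centers


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = canopy_centers(points, t1=3.0, t2=2.0)
```

The number of canopies also gives a rough estimate of the cluster count, so the centers can be passed to a K-means routine in place of random initialization.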

8.
9.
Most well-known clustering methods based on distance measures, distance metrics and similarity functions tend to get stuck in local optima, and their performance depends strongly on the initial values of the cluster centers. This paper presents a new approach that improves clustering with the bio-inspired Cuttlefish Algorithm (CFA) by searching for the cluster centers that minimize the clustering metrics. Various UCI Machine Learning Repository datasets are used to test and evaluate the performance of the proposed method. For comparison, we also analyse several algorithms such as K-means, the Genetic Algorithm and the Particle Swarm Optimization (PSO) Algorithm. The simulations and results demonstrate that the proposed CFA-Clustering method outperforms the other algorithms in most cases. The CFA can therefore be considered an alternative stochastic method for solving clustering problems.

10.
Clustering is an important research area with numerous applications in pattern recognition, machine learning, and data mining. Since the clustering problem on numeric data sets can be formulated as a typical combinatorial optimization problem, many studies have addressed the design of heuristic algorithms for finding sub-optimal solutions in a reasonable period of time. However, most heuristic clustering algorithms are sensitive to initialization and do not guarantee high-quality results. Recently, the Approximate Backbone (AB), i.e., the commonly shared intersection of several sub-optimal solutions, has been proposed to address this sensitivity to initialization. In this paper, we introduce the AB into heuristic clustering to overcome the initialization sensitivity of conventional heuristic clustering algorithms. The main advantage of the proposed method is that defining the AB restricts the initial search space to the neighborhood of the optimal result, which in turn reduces the impact of initialization on clustering and eventually improves the performance of heuristic clustering. Experiments on synthetic and real-world data sets are performed to validate the effectiveness of the proposed approach in comparison with three conventional heuristic clustering algorithms and three other algorithms with improved initialization.
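One way to read "the commonly shared intersection of several sub-optimal solutions" is as the groups of points that land in the same cluster in every run. The sketch below illustrates that reading only; the function name `approximate_backbone` and the representation of a solution as a list of labels are assumptions, and the paper's exact definition of the AB may differ.

```python
from collections import defaultdict


def approximate_backbone(solutions):
    """Group point indices that share a cluster in every solution.

    solutions: list of label lists, one per clustering run, all of
    equal length. Hypothetical helper for illustration only.
    """
    n = len(solutions[0])
    if any(len(labels) != n for labels in solutions):
        raise ValueError("all solutions must label the same points")
    groups = defaultdict(list)
    for i in range(n):
        # Two points stay together in every run exactly when their
        # per-run label tuples are identical; label names need not
        # agree across runs.
        signature = tuple(labels[i] for labels in solutions)
        groups[signature].append(i)
    # Singletons carry no shared structure, so only larger groups are kept.
    return sorted(g for g in groups.values() if len(g) > 1)


runs = [
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
]
backbone = approximate_backbone(runs)
```

Here points 0 and 1, and points 3 and 4, are always clustered together, while point 2 changes sides between runs; the stable groups are what a later search could start from.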

11.
The evaluation of node importance in complex networks has attracted increasingly widespread attention in recent years. Finding and protecting vital nodes is important for ensuring the security and stability of the whole network. Existing clustering algorithms for complex networks all have drawbacks: they cannot achieve both calculation accuracy and low time complexity, and they need external supervision. Designing a fast clustering method for complex networks is therefore an urgent problem. This paper proposes a clustering algorithm for complex networks based on physical data field theory. It identifies key nodes by evaluating node importance with a mutual information algorithm and then uses those nodes to form the clusters. A simulation experiment was conducted to verify the validity of the algorithm. The results indicate that the algorithm identifies clusters accurately and runs quickly, and that it can also set the granularity of a partition according to actual demand.

12.
13.
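Wang Juan (王娟) 《微型机与应用》2011,30(20):71-73,76
The traditional K-means algorithm is very sensitive to the choice of initial cluster centers and to the input order of the samples, and it easily falls into local optima. To address these problems, a genetic-algorithm-based K-means clustering algorithm, GKA, is proposed. It combines the local search ability of K-means with the global search ability of the genetic algorithm and, through repeated selection, crossover, and mutation operations, finally obtains the optimal number of clusters and set of initial centroids, overcoming the locality of traditional K-means and its sensitivity to the initial cluster centers.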

14.
We develop a new algorithm for clustering search results. Unlike many other clustering systems recently proposed as a post-processing step for Web search engines, our system is not based on phrase analysis inside snippets, but instead uses latent semantic indexing on the whole document content. A main contribution of the paper is a novel strategy, called dynamic SVD clustering, to discover the optimal number of singular values to be used for clustering purposes. Moreover, the SVD computation step performs well in practice, which makes it feasible to perform clustering when term vectors are available. We show that the algorithm has very good classification performance, and that it can be used effectively to cluster the results of a search engine so that users can browse them more easily. The algorithm has been integrated into the Noodles search engine, a tool for searching and clustering Web and desktop documents.

15.
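To address the excessive load on cluster-head nodes and the low utilization of cluster-head energy in clustering routing algorithms for wireless sensor networks, a dual-cluster-head multi-hop routing algorithm based on particle swarm optimization is proposed. According to the different tasks of the cluster heads, the algorithm builds separate fitness functions from node energy, distance to the sink node, and the positional relationships among nodes. It selects an optimal primary cluster head to perform data collection and fusion, and a cooperating optimal secondary cluster head to forward data between clusters, thereby minimizing both collection and transmission energy consumption. Simulation results show that, compared with other routing algorithms, the algorithm effectively reduces cluster-head load and energy consumption, balances energy consumption across the network, and extends the network lifetime.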

16.
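For wireless sensor networks with heterogeneous energy, a new energy-distribution-based unequal clustering algorithm (EDUCRA) is proposed. In this algorithm, nodes close to the sink communicate with it directly, and cluster heads use unequal competition ranges to form clusters of different sizes. Simulation results on the OMNet++ platform show that the algorithm effectively balances network energy consumption, improves node energy utilization, and extends the network lifetime.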

17.
This paper presents the colored farthest-neighbor graph (CFNG), a new method for finding clusters of similar objects. The method is useful because it works both for objects with coordinates and for objects without coordinates. The only requirement is that the distance between any two objects be computable. In other words, the objects must belong to a metric space. The CFNG uses graph coloring to improve on an existing technique by Rovetta and Masulli. Just as with their technique, it uses recursive partitioning to build a hierarchy of clusters. In recursive partitioning, clusters are sometimes split prematurely, and one of the contributions of this paper is a way to reduce the occurrence of such premature splits, which also result when other partition methods are used to find clusters.

18.
In this paper, we present a new approach to clustering speech by speaker identity. It groups the homogeneous speech segments obtained at the end of the segmentation process by using the spatial information provided by stereophonic speech signals. The proposed method uses the differential energy of the two stereophonic signals collected by two cardioid microphones to cluster all the speech segments that belong to the same speaker. The total number of clusters obtained should equal the real number of speakers present in the meeting room, and each cluster should contain all the interventions of only one speaker. The proposed system is suitable for debates or multi-conferences in which the speakers are located at fixed positions. Essentially, our approach localizes each speaker relative to the position of the microphones, which serve as a spatial reference. Based on this localization, the method can recognize the speaker of any speech segment during the meeting, so each speaker's interventions are automatically detected and attributed by estimating the speaker's relative position. For comparison, two clustering methods were implemented and tested: the new approach, which we call Energy Differential based Spatial Clustering (EDSC), and a classical statistical approach called “Mono-Gaussian based Sequential Clustering” (MGSC). Speaker clustering experiments are carried out on a stereophonic speech corpus called DB15, composed of 15 stereophonic scenarios of about 3.5 minutes each. Every scenario corresponds to a free discussion between two or three speakers seated at fixed positions in the meeting room. Results show the outstanding performance of the new approach in terms of precision and speed, especially for short speech segments, where most clustering techniques fail badly.
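The idea of clustering segments by the energy difference between two microphone channels can be sketched as below. This is only an illustration under stated assumptions, not the EDSC method itself: the dB feature, the function names, and the gap-based grouping rule with its `gap_db` parameter are all hypothetical, since the abstract does not give the actual decision rule.

```python
import math


def energy_differential_db(left, right, eps=1e-12):
    """Energy difference in dB between the two channels of one segment."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    # eps keeps the ratio finite when a channel is silent.
    return 10.0 * math.log10((e_left + eps) / (e_right + eps))


def cluster_by_differential(segments, gap_db=3.0):
    """Label segments by their channel energy differential.

    segments: list of (left_samples, right_samples) pairs.
    Assumed rule: after sorting the differentials, a new cluster starts
    wherever two neighbours differ by more than gap_db.
    """
    diffs = [energy_differential_db(left, right) for left, right in segments]
    order = sorted(range(len(diffs)), key=diffs.__getitem__)
    labels = [0] * len(diffs)
    cluster = 0
    for prev, cur in zip(order, order[1:]):
        if diffs[cur] - diffs[prev] > gap_db:
            cluster += 1
        labels[cur] = cluster
    return labels


segments = [
    ([1.0, -1.0], [0.1, -0.1]),  # louder on the left microphone
    ([0.1, 0.1], [1.0, 1.0]),    # louder on the right microphone
    ([0.8, 0.8], [0.1, 0.1]),    # louder on the left microphone again
]
labels = cluster_by_differential(segments)
```

Because the speakers sit at fixed positions, segments from the same speaker should give similar differentials, which is why a one-dimensional grouping is enough for this toy case.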

19.
In recent years, the historical data during the search process of evolutionary algorithms has received increasing attention from many researchers, and some hybrid evolutionary algorithms with machine-learning have been proposed. However, the majority of the literature is centered on continuous problems with a single optimization objective. There are still a lot of problems to be handled for multi-objective combinatorial optimization problems. Therefore, this paper proposes a machine-learning based multi-objective memetic algorithm (ML-MOMA) for the discrete permutation flowshop scheduling problem. There are two main features in the proposed ML-MOMA. First, each solution is assigned with an individual archive to store the non-dominated solutions found by it and based on these individual archives a new population update method is presented. Second, an adaptive multi-objective local search is developed, in which the analysis of historical data accumulated during the search process is used to adaptively determine which non-dominated solutions should be selected for local search and how the local search should be applied. Computational results based on benchmark problems show that the cooperation of the above two features can help to achieve a balance between evolutionary global search and local search. In addition, many of the best known Pareto fronts for these benchmark problems in the literature can be improved by the proposed ML-MOMA.

20.
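To address the two main factors that affect k-means clustering quality, the number of clusters and the initial centers, a k-means algorithm based on a double genetic algorithm is proposed. An outer genetic algorithm controls the number of clusters and an inner genetic algorithm controls the initial cluster centers; the between-cluster distance, the within-cluster distance, and the ratio between the two are used to evaluate the quality of a clustering. When the algorithm terminates, it yields both a good number of clusters and good initial centers for that number of clusters. In addition, because the inner and outer genetic algorithms have different requirements, different encoding strategies are used for each, and an elitist strategy is adopted to retain high-quality individuals. Tests on UCI data sets show that the algorithm is practical and of some reference value for data mining.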
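A fitness score built from between-cluster and within-cluster distances, as this abstract describes, might look like the sketch below. The abstract does not give the exact formula, so the specific choice here (mean distance between centroids divided by mean distance from points to their own centroid) and the names `centroid` and `separation_ratio` are assumptions for illustration.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def separation_ratio(clusters):
    """Mean between-centroid distance over mean within-cluster distance.

    clusters: list of non-empty point lists, at least two of them.
    Higher values mean tighter, better separated clusters.
    Hypothetical fitness function, not the paper's exact formula.
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    centers = [centroid(c) for c in clusters]
    within = [math.dist(p, centers[i]) for i, c in enumerate(clusters) for p in c]
    between = [math.dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]]
    mean_within = sum(within) / len(within)
    if mean_within == 0:
        raise ValueError("within-cluster distance is zero; ratio undefined")
    return (sum(between) / len(between)) / mean_within


good = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
bad = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]
good_score = separation_ratio(good)
bad_score = separation_ratio(bad)
```

A genetic algorithm could use such a score as the fitness of a candidate set of initial centers, with the outer loop comparing the best scores obtained for different numbers of clusters.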
