Found 20 similar documents (search time: 31 ms)
1.
2.
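Because traditional clustering algorithms are unsuitable for cognitive Ad Hoc networks, where channels change dynamically, a distributed clustering algorithm based on channel similarity is proposed. The algorithm first computes the channel similarity between nodes, uses an improved EM algorithm to estimate the probability that each node belongs to each cluster, and then applies a graph minimum-cut algorithm to obtain the optimal clustering. It both maximizes intra-cluster similarity and minimizes inter-cluster similarity. Finally, a coordination mechanism is proposed to synchronize the global clustering information. The whole process runs in a fully distributed manner and does not rely on a common control channel. Simulation results show that the algorithm dynamically adjusts the cluster structure as channels change and increases the number of common channels within each cluster. At the same time, it effectively reduces the number of common channels between clusters, lowering inter-cluster communication interference.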
3.
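Fuzzy co-clustering algorithm for high-order heterogeneous data  Total citations: 1, self-citations: 0, citations by others: 1
To analyze more effectively the clustering results of high-order heterogeneous data in regions where clusters overlap, a fuzzy co-clustering algorithm for high-order heterogeneous data (HFCC) is proposed. The algorithm minimizes the weighted distance between objects and cluster centers in each feature space. Iterative update formulas for object memberships and feature weights are derived, an iterative clustering algorithm is designed, and its convergence is proved theoretically. In addition, by generalizing the XB index, an index called GXB is proposed for evaluating the clustering quality of high-order heterogeneous data and for determining the number of clusters. Experiments show that HFCC effectively detects the overlapping cluster structures hidden in the data and clearly outperforms five representative hard-partition algorithms, and that the GXB index effectively determines the number of clusters in high-order heterogeneous data.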
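The abstract above does not give the form of the GXB index. As background only, the classical Xie-Beni (XB) index that it generalizes is commonly written as below. The notation is assumed here rather than taken from the source: $u_{ij}$ is the membership of object $x_j$ in cluster $i$, $v_i$ is the center of cluster $i$, $m$ is the fuzzifier, $c$ is the number of clusters, and $n$ is the number of objects.

```latex
\mathrm{XB} = \frac{\sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, \lVert x_j - v_i \rVert^{2}}
                   {n \cdot \min_{i \neq k} \lVert v_i - v_k \rVert^{2}}
```

The numerator measures compactness and the denominator measures separation, so smaller values indicate a better partition. A typical use is to evaluate the index for several candidate values of $c$ and keep the one with the smallest value.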
4.
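Word clustering algorithm based on similarity  Total citations: 1, self-citations: 1, citations by others: 0
Class-based statistical language models are an important way to address data sparsity in statistical models. Traditional statistical methods follow a greedy principle and usually use the likelihood function of the corpus or the perplexity as the evaluation criterion. The main drawbacks of traditional clustering methods are slow clustering, strong dependence on initial values, and a tendency to get stuck in local optima. This paper proposes definitions of word similarity and word-set similarity, together with a bottom-up hierarchical clustering algorithm. The method not only improves clustering quality but also allows different similarity definitions to be chosen for different models, improving the practical effectiveness of the clustering.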
5.
Spectral clustering is a powerful tool for exploratory data analysis. Many existing spectral clustering algorithms typically measure the similarity by using a Gaussian kernel function or an undirected k‐nearest neighbor (kNN) graph, which cannot reveal the real clusters when the data are not well separated. In this paper, to improve the spectral clustering, we consider a robust similarity measure based on the shared nearest neighbors in a directed kNN graph. We propose two novel algorithms for spectral clustering: one based on the number of shared nearest neighbors, and one based on their closeness. The proposed algorithms are able to explore the underlying similarity relationships between data points, and are robust to datasets that are not well separated. Moreover, the proposed algorithms have only one parameter, k. We evaluated the proposed algorithms using synthetic and real‐world datasets. The experimental results demonstrate that the proposed algorithms not only achieve a good level of performance but also outperform the traditional spectral clustering algorithms.
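The shared-nearest-neighbor idea in the abstract above can be illustrated with a minimal sketch. It assumes the simplest reading, in which the similarity of two points is the number of neighbors their directed kNN lists have in common; the function names `knn_lists` and `snn_similarity` are hypothetical, and the paper's exact definitions (including the closeness-based variant) may differ.

```python
from math import dist


def knn_lists(points, k):
    """For each point, return the set of indices of its k nearest neighbors.

    The result is a directed kNN graph: j in neighbors[i] does not imply
    i in neighbors[j].
    """
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(points, k):
    """Symmetric matrix whose (i, j) entry counts the neighbors shared by i and j."""
    nbrs = knn_lists(points, k)
    n = len(points)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(nbrs[i] & nbrs[j])
            sim[i][j] = sim[j][i] = shared
    return sim


if __name__ == "__main__":
    # Two small groups that are far apart (illustrative data only).
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for row in snn_similarity(pts, k=2):
        print(row)
```

The resulting matrix could serve as the affinity matrix of a standard spectral clustering routine; as in the abstract, k is the only parameter.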
6.
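A major problem of the fuzzy C-means (FCM) clustering algorithm is that the number of clusters must be determined in advance. To address this, an intra-cluster difference measure and an inter-cluster overlap measure are defined to quantify, respectively, the similarity of data within a cluster and the degree of separation between different clusters. Based on these two measures, a new validity function is proposed for determining the optimal number of clusters. Experimental results show that the validity function effectively determines the number of clusters and is fairly robust.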
7.
Shih-Ming Pan Kuo-Sheng Cheng 《IEEE transactions on systems, man and cybernetics. Part C, Applications and reviews》2007,37(5):827-838
Traditional clustering algorithms (e.g., the K-means algorithm and its variants) are used only for a fixed number of clusters. However, in many clustering applications, the actual number of clusters is unknown beforehand. The general solution to this type of clustering problem is that one selects or defines a cluster validity index and performs a traditional clustering algorithm for all possible numbers of clusters in sequence to find the clustering with the best cluster validity. This is tedious and time-consuming work. To easily and effectively determine the optimal number of clusters and, at the same time, construct the clusters with good validity, we propose a framework of automatic clustering algorithms (called ETSAs) that do not require users to give each possible value of required parameters (including the number of clusters). ETSAs treat the number of clusters as a variable, and evolve it to an optimal number. Through experiments conducted on nine test data sets, we compared the ETSA with five traditional clustering algorithms. We demonstrate the superiority of the ETSA in finding the correct number of clusters while constructing clusters with good validity.
8.
9.
《IEEE transactions on systems, man and cybernetics. Part C, Applications and reviews》2009,39(4):420-425
10.
Dynamic estimation of number of clusters in data sets  Total citations: 3, self-citations: 0, citations by others: 3
A new method for estimating the number of clusters in data sets during clustering is proposed. The cluster validity index, Bcrit, takes the homogeneity in each cluster into account and is connected to the geometrical properties of the data set. Bcrit represents the combination of two validity indices. Comparisons between Bcrit and six cluster validity indices, conducted on real data sets, are presented.
11.
《电子学报:英文版》2017,(6):1221-1226
Category-based statistical language models are an important method for addressing data sparsity in statistical language models. However, this approach has two bottlenecks: 1) word clustering, where it is hard to find a clustering method that performs well without a large amount of computation; 2) class-based methods always lose some predictive ability when adapting to text from different domains. To solve these problems, a novel definition of word similarity based on mutual information is presented. Based on word similarity, a definition of word-set similarity is given and a bottom-up hierarchical clustering algorithm is proposed. Experimental results show that the word clustering algorithm based on word similarity outperforms the conventional greedy clustering method in both speed and performance; the perplexity is reduced from 283 to 207.8.
12.
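To address the long training time and low detection accuracy of existing wireless intrusion detection methods, an intrusion detection algorithm for wireless Mesh networks is proposed, based on MBIRCH, a modified version of BIRCH. The algorithm first scans the data set once to obtain the CFs (clustering features), and then computes a cluster validity index bottom-up at different levels. The index mainly considers the geometric structure of the data set: it measures the compactness of the data points within a cluster and the similarity between clusters, and keeps the two in balance. The cluster nodes of the CF tree are determined according to this index until the best clustering result is obtained. The best clustering result is then used as the training sample to specify a discriminant function, which is used to locate network data. Experimental results show that the algorithm not only significantly reduces sample training time but also improves detection accuracy, meeting the intrusion detection needs of wireless Mesh networks.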
13.
In most spectral clustering approaches, the Gaussian kernel‐based similarity measure is used to construct the affinity matrix. However, such a similarity measure does not work well on a dataset with a nonlinear and elongated structure. In this paper, we present a new similarity measure to deal with the nonlinearity issue. The maximum flow between data points is computed as the new similarity, which can satisfy the requirement for similarity in the clustering method. Additionally, the new similarity carries the global and local relations between data. We apply it to spectral clustering and compare the proposed similarity measure with other state‐of‐the‐art methods on both synthetic and real‐world data. The experimental results show the superiority of the new similarity: 1) The max‐flow‐based similarity measure can significantly improve the performance of spectral clustering; 2) It is robust and not sensitive to the parameters.
14.
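To meet the need for data mining over collections of XML documents, this paper proposes computing the structural similarity of XML documents from the semantic and structural information of their document trees, building a structural similarity matrix from these similarities, and then applying the DBSCAN algorithm to cluster the XML document collection. Compared with other clustering algorithms, clustering speed is greatly improved.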
15.
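A clustering algorithm based on distance adjustment  Total citations: 2, self-citations: 1, citations by others: 1
To address the unsuitability of the k-means algorithm for concave sample spaces, a clustering algorithm based on distance adjustment is proposed. The algorithm introduces an adjusted shortest-path distance as its similarity function. This function shortens the distance between two points whose connecting path passes through high-density data regions and lengthens the distance between two points whose path passes through low-density regions, thereby reducing the similarity between samples of different clusters while increasing the similarity within clusters, and yielding better clustering. Experimental results show that the algorithm clusters concave sample spaces well.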
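A minimal sketch of how an adjusted shortest-path distance of this kind can behave is given below. The abstract does not state the adjustment formula, so the edge length `rho ** d - 1` (with `rho > 1` and Euclidean distance `d`) is an assumption borrowed from density-sensitive distance measures in general, and the function name `adjusted_distances` is hypothetical.

```python
from math import dist


def adjusted_distances(points, rho=2.0):
    """All-pairs shortest-path distances under an assumed density-sensitive edge length.

    Assumed edge length between a and b: rho ** euclidean(a, b) - 1.
    Long jumps are penalized exponentially, so a path made of many short hops
    (through a dense region) ends up shorter than one long hop across a gap.
    """
    n = len(points)
    d = [[rho ** dist(points[i], points[j]) - 1.0 for j in range(n)]
         for i in range(n)]
    # Floyd-Warshall: allow each point m in turn as an intermediate stop.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = d[i][m] + d[m][j]
                if via < d[i][j]:
                    d[i][j] = via
    return d


if __name__ == "__main__":
    # Four evenly spaced points on a line (illustrative data only).
    chain = [(0, 0), (1, 0), (2, 0), (3, 0)]
    d = adjusted_distances(chain, rho=2.0)
    print(d[0][3])  # path through the intermediate points
```

In this example the direct edge between the two end points has length 2 ** 3 - 1 = 7, while the path through the two intermediate points has length 3, which is the shortening effect the abstract describes for paths through dense regions. The cubic cost of Floyd-Warshall makes this sketch suitable for small data sets only.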
16.
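Cluster analysis is one of the main techniques for analyzing gene expression data. Its basic idea is to divide objects into different classes according to the similarity between them, and choosing an appropriate similarity measure is the key to obtaining effective clustering results. Using preprocessed gene data sets, different clustering algorithms are run under different similarity measures and the clustering results are evaluated. Both the shortcomings of the algorithms themselves and the limitations of distance-based similarity measures affect the evaluation. To obtain more effective clustering results, the relevant clustering algorithms are improved and a proportional similarity measure is proposed.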
17.
18.
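A clustering algorithm suitable for road obstacle recognition and detection is proposed for processing anisotropically distributed laser point cloud data. The basic idea is as follows. To handle real-time changes in the spatial distribution of the point cloud, a hierarchical clustering algorithm with an online-learned merging threshold is proposed to determine the upper bound of the search range for the number of clusters and the candidate point set for the initial cluster centers. A distance-product maximization method is then proposed to order the candidate set for initialization, which both improves the clustering result by taking the spatial density distribution of the point cloud into account and overcomes the difficulty of choosing initial cluster centers in the traditional K-means algorithm. Finally, the Silhouette index and a distance evaluation function are selected as cluster validity indices to analyze the clustering quality and determine the optimal number of clusters. This adaptive, online-learning algorithm is applied to point cloud data collected by a 2.5D lidar, and a comparison with two other clustering algorithms in real experiments shows that it correctly segments most obstacles that have differing spatial distributions and are connected to one another.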
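The Silhouette index mentioned in the abstract above is a standard validity measure, and the sketch below computes its mean value for a given labeling. It is a generic illustration using Euclidean distance, not the paper's implementation; the function name `silhouette_score` and the convention of scoring singleton clusters as 0 are choices made here.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient of a labeling, using Euclidean distance.

    For each point, a is the mean distance to the other members of its own
    cluster and b is the smallest mean distance to the members of any other
    cluster; the point's score is (b - a) / max(a, b).
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster: score taken as 0
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


if __name__ == "__main__":
    # Two well-separated pairs of points (illustrative data only).
    pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
    print(silhouette_score(pts, [0, 0, 1, 1]))  # labeling that matches the groups
    print(silhouette_score(pts, [0, 1, 0, 1]))  # labeling that mixes the groups
```

To choose the number of clusters, one would run the clustering for each candidate number and keep the labeling with the highest score; values close to 1 indicate compact, well-separated clusters.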
19.
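This article proposes a method for constructing a text classifier based on fuzzy clustering. It introduces a measure of fuzzy similarity between feature words in a text and gives an algorithm that implements fuzzy clustering using the idea of the "netting method" (编网法). By comparing the fuzzy similarity between feature words, the feature words are clustered, finally yielding a set of feature words that can identify the topic category of a text. Test results for the classifier's performance are also given.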
20.
Multiple-Parameter Radar Signal Sorting Using Support Vector Clustering and Similitude Entropy Index
Zhanling Wang Dengfu Zhang Duyan Bi Shiqiang Wang 《Circuits, Systems, and Signal Processing》2014,33(6):1985-1996
The radar signal sorting method based on the traditional support vector clustering (SVC) algorithm has high time complexity, and the traditional validity index cannot efficiently indicate the best sorting result. To solve this problem, we study a new sorting method based on the cone cluster labeling (CCL) method. The CCL method relies on the theory of approximate coverings both in feature space and data space. A new cluster validity index, similitude entropy (SE), is also proposed. It can be used to evaluate the compactness and separation of clusters with information entropy theory. Simulations including the performance comparison between the proposed method and the conventional methods are presented. Results show that, while maintaining the sorting accuracy, the proposed method can effectively reduce the computational complexity of sorting the signals.