首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Energy constraint is an important issue in wireless sensor networks. This paper proposes a parallel energy-efficient coverage optimization mechanism to optimize the positions of mobile sensor nodes based on maximum entropy clustering in large-scale wireless sensor networks. According to the models of coverage and energy, stationary nodes are partitioned into clusters by maximum entropy clustering. After identifying the boundary node of each cluster, the sensing area is divided for parallel optimization. A numerical algorithm is adopted to calculate the coverage metric of each cluster, while the lowest cost paths of the inner cluster are used to define the energy metric in which Dijkstra’s algorithm is utilized. Then cluster heads are assigned to perform parallel particle swarm optimization to maximize the coverage metric and minimize the energy metric where a weight coefficient between the two metrics is employed to achieve a tradeoff between coverage area and energy efficiency. Simulations of the optimization mechanism and a target tracking application verify that coverage performance can be guaranteed by choosing a proper weight coefficient for each cluster and energy efficiency is enhanced by parallel energy-efficient optimization.  相似文献   

2.
In recent years, many information networks have become available for analysis, including social networks, road networks, sensor networks, biological networks, etc. Graph clustering has shown its effectiveness in analyzing and visualizing large networks. The goal of graph clustering is to partition vertices in a large graph into clusters based on various criteria such as vertex connectivity or neighborhood similarity. Many existing graph clustering methods mainly focus on the topological structures, but largely ignore the vertex properties which are often heterogeneous. Recently, a new graph clustering algorithm, SA-cluster, has been proposed which combines structural and attribute similarities through a unified distance measure. SA-Cluster performs matrix multiplication to calculate the random walk distances between graph vertices. As part of the clustering refinement, the graph edge weights are iteratively adjusted to balance the relative importance between structural and attribute similarities. As a consequence, matrix multiplication is repeated in each iteration of the clustering process to recalculate the random walk distances which are affected by the edge weight update. In order to improve the efficiency and scalability of SA-cluster, in this paper, we propose an efficient algorithm In-Cluster to incrementally update the random walk distances given the edge weight increments. Complexity analysis is provided to estimate how much runtime cost Inc-Cluster can save. We further design parallel matrix computation techniques on a multicore architecture. Experimental results demonstrate that Inc-Cluster achieves significant speedup over SA-Cluster on large graphs, while achieving exactly the same clustering quality in terms of intra-cluster structural cohesiveness and attribute value homogeneity.  相似文献   

3.
Peer-to-peer (p2p) networks are used by millions for searching and downloading content. Recently, clustering algorithms were shown to be useful for helping users find content in large networks. Yet, many of these algorithms overlook the fact that p2p networks follow graph models with a power-law node degree distribution. This paper studies the obtained clusters when applying clustering algorithms on power-law graphs and their applicability for finding content. Driven by the observed deficiencies, a simple yet efficient clustering algorithm is proposed, which targets a relaxed optimization of a minimal distance distribution of each cluster with a size balancing scheme. A comparative analysis using a song-similarity graph collected from 1.2 million Gnutella users reveals that commonly used efficiency measures often overlook search and recommendation applicability issues and provide the wrong impression that the resulting clusters are well suited for these tasks. We show that the proposed algorithm performs well on various measures that are well suited for the domain.  相似文献   

4.
针对大数据环境下传统并行密度聚类算法中存在的数据划分不合理,聚类结果准确度不高,结果受参数影响较大以及并行效率低等问题,提出一种MapReduce下使用均值距离与关联性标记的并行OPTICS算法——POMDRM-MR。算法使用一种基于维度稀疏度的减少边界点划分策略(DS-PRBP),划分数据集;针对各个分区,提出标记点排序识别簇算法(MOPTICS),构建数据点与核心点之间的关联性,并标记数据点迭代次数,在距离度量中,使用领域均值距离策略(FMD),计算数据点的领域均值距离,代替可达距离排序,输出关联性标记序列;最后结合重排序序列提取簇算法(REC),对输出序列进行二次排序并提取簇,提高算法局部聚类的准确性和稳定性;在合并全局簇时,算法提出边界密度筛选策略(BD-FLC),计算筛选密度相近局部簇;又基于n叉树的并集型合并与MapReduce模型,提出并行局部簇合并算法(MCNT-MR),加快局部簇收敛,并行合并局部簇,提升全局簇合并效率。对照实验表明,POMDRM-MR算法聚类效果更佳,且在大规模数据集下算法的并行化性能更好。  相似文献   

5.
With the popularization of digital cameras, the use of several cameras by group photographers at the same event is becoming common. Photographers can share their contents and even take pictures of each other. So it is becoming important to manage concurrent photos from multiple cameras in order to classify many accumulated photos into proper clusters. In this paper, we propose a novel photo clustering method based on the max-flow network algorithm, and we visualize a network graph for cluster verification. To apply our algorithm, input concurrent photos are used to create an edge-weighted graph structure. In order to transform the photo clustering problem into a graph partition one, first we need to construct an Augmented Concurrent photo Graph (ACG) and then rewrite our original problem in terms of the graph partition one using the min-cut max-flow network model. The previous methods dealt with photo clustering as a 1-D problem using a linear partition. But we consider clustering for concurrent group photos as a 2-D partition based on other users’ photo contents. Each photo is used to create a node and similarities between photos are used to create the edge weights (capacities) of the network. We partition the network into two subgraphs according to the min-cut, which represents the weakest edge connections between the photos. Using repeated graph partitions for each subgraph (sub-network), we can obtain suitable subgraphs corresponding to photo clusters. The graph construction or partition can be adjusted according to user preferences in order to obtain the intended results.  相似文献   

6.
Large graphs are scale free and ubiquitous having irregular relationships. Clustering is used to find existent similar patterns in graphs and thus help in getting useful insights. In real-world, nodes may belong to more than one cluster thus, it is essential to analyze fuzzy cluster membership of nodes. Traditional centralized fuzzy clustering algorithms incur high communication cost and produce poor quality of clusters when used for large graphs. Thus, scalable solutions are obligatory to handle huge amount of data in less computational time with minimum disk access. In this paper, we proposed a parallel fuzzy clustering algorithm named ‘PGFC’ for handling scalable graph data. It will be advantageous from the viewpoint of expert systems to develop a clustering algorithm that can assure scalability along with better quality of clusters for handling large graphs.The algorithm is parallelized using bulk synchronous parallel (BSP) based Pregel model. The cluster centers are initialized using degree centrality measure, resulting in lesser number of iterations. The performance of PGFC is compared with other state of art clustering algorithms using synthetic graphs and real world networks. The experimental results reveal that the proposed PGFC scales up linearly to handle large graphs and produces better quality of clusters when compared to other graph clustering counterparts.  相似文献   

7.
图数据隐私保护的研究目前主要集中在简单图,适应范围有限。将权重图数据的隐私保护作为研究对象,可以改善权重图发布之后数据的可用性及有效性。针对在利用聚类匿名化方法处理社交网络数据时,需要增删大量的边和节点,造成严重的数据失真的问题进行了研究。提出了(k,l)加权社交网络匿名算法KFCMSA(联合k成员模糊聚类和模拟退火),并利用改进的簇划分算法将权重社交网络聚类成不同的簇,对同一簇中节点的边权重进行泛化使节点满足l多样性。在实现k度匿名的同时有效减少了边的改变量,提高了数据的可用性,实现最优聚类的同时防止了同质性攻击。聚类质量实验和数据可用性分析表明该算法具有较高的性能优势和较高边保留率。  相似文献   

8.
A topological and dynamical characterization of the cluster structures described by the support vector clustering is developed. It is shown that each cluster can be decomposed into its constituent basin level cells and can be naturally extended to an enlarged clustered domain, which serves as a basis for inductive clustering. A simplified weighted graph preserving the topological structure of the clusters is also constructed and is employed to develop a robust and inductive clustering algorithm. Simulation results are given to illustrate the robustness and effectiveness of the proposed method  相似文献   

9.
Agglomerative clustering generates the partition hierarchically by a sequence of merge operations. We propose an alternative to the merge-based approach by removing the clusters iteratively one by one until the desired number of clusters is reached. We apply local optimization strategy by always removing the cluster that increases the distortion the least. Data structures and their update strategies are considered. The proposed algorithm is applied as a crossover method in a genetic algorithm, and compared against the best existing clustering algorithms. The proposed method provides best performance in terms of minimizing intra-cluster variance.  相似文献   

10.
属性图用属性向量描述节点,用边描述节点间的关系。为了把节点划分为具有紧密联系的社团,一种有效的方法是对属性图进行聚类。聚类方法有不同的标准,如节点连接度和属性相似度。虽然社团一般是围绕紧密的连边和相似的属性值的节点形成,但是目前的方法都只关注了这两种数据形式中的一种。通过给每个节点赋予一个自治域,提出一个准确且可延展的多节点系统用于提取属性图中的重叠社团。首先,引入带有可调带宽因子的核函数用于测度每个节点的影响力,具有最高局部影响力的节点可以被看作领导节点。其次,提出一种新颖的局部扩展策略,使每一个领导节点能够吸收属性图中相关性最强的跟随者。接着,设计了多节点社团意识系统,该系统为节点之间的充分沟通提供了必要的条件,从而能够得出最优的重叠社团结构。社团中的节点不仅互相联系紧密,而且也有相似的属性。该算法的计算复杂度在特定带宽条件下近似于连边数目的线性函数。最后,基于标准属性图和真实属性图的实验验证了该系统的有效性和高效性。  相似文献   

11.
Data clustering is a very well studied problem in machine learning, data mining, and related disciplines. Most of the existing clustering methods have focused on optimizing a single clustering objective. Often, several recent disciplines such as robot team deployment, ad hoc networks, multi-agent systems, facility location, etc., need to consider multiple criteria, often conflicting, during clustering. Motivated by this, in this paper, we propose a sequential game theoretic approach for multi-objective clustering, called ClusSMOG-II. It is specially designed to optimize simultaneously intrinsically conflicting objectives, which are inter-cluster/intra-cluster inertia and connectivity. This technique has an advantage of keeping the number of clusters dynamic. The approach consists of three main steps. The first step sets initial clusters with their representatives, whereas the second step calculates the correct number of clusters by resolving a sequence of multi-objective multi-act sequential two-player games for conflict-clusters. Finally, the third step constructs homogenous clusters by resolving sequential two-player game between each cluster representative and the representative of its nearest neighbor. For each game, we define payoff functions that correspond to the model objectives. We use a methodology based on backward induction to calculate a pure Nash equilibrium for each game. Experimental results confirm the effectiveness of the proposed approach over state-of-the-art clustering algorithms.  相似文献   

12.
叶小莺  万梅  唐蓉  谢云  陈桂宏  李强 《计算机应用研究》2020,37(6):1670-1674,1687
针对社交网络中社交关系的有向性与多样性,提出了一种基于图聚类与蚁群算法的社交网络聚类算法。首先,在网络覆盖率的约束下为社交网络建立有向、非全连接的二维图模型;然后,采用K-medoids算法搜索用户分组的中心用户,采用人工蚁群算法在2D图中搜索各个用户与中心用户的相似性,将满足相似性阈值的用户分为同一个用户组。设计了低活跃用户的预测机制解决网络的稀疏性问题与冷启动问题。此外,通过网络覆盖率的约束条件权衡聚类准确率与覆盖率两个指标。仿真实验结果表明,该算法实现了较好的社交网络聚类性能,并且有效地缓解了稀疏性问题与冷启动问题。  相似文献   

13.
Clustering problems are applicable to several areas of science. Approaches and algorithms are as varied as the applications. From a graph theory perspective, clustering can be generated by partitioning an input graph into a vertex-disjoint union of cliques (clusters) through addition and deletion of edges. Finding the minimum number of edges additions and deletions required to cluster data that can be represented as graphs is a well-known problem in combinatorial optimization, often referred to as cluster editing problem. However, many real-world clustering applications are characterized by overlapping clusters, that is, clusters that are non-disjoint. In these situations cluster editing cannot be applied to these problems. Literature concerning a relaxation of the cluster editing, where clusters can overlap, is scarce. In this work, we propose the overlapping cluster editing problem, a variation of the cluster editing where the goal is to partition a graph, also by editing edges, into maximal cliques that are not necessarily disjoint. In addition, we also present three slightly different versions of a hybrid heuristic to solve this problem. Each hybrid heuristic is based on coupling two metaheuristicsthat, together, generate a set of clusters; and one of three mixed-integer linear programming models, also introduced in this paper, that uses these clusters as input. The objective with the metaheuristics is to limit the solution exploration space in the models’ resolution, therefore reducing its computational time.Tests results show that the all proposed hybrid heuristic versions are able to generate good-quality overlapping cluster editing solutions. In particular, one version of the hybrid heuristic achieved, at a low computational cost, the best results in 51 of 112 randomly-generated graphs. Although the other two hybrid heuristic versions have harder to solve models, they obtained reasonable results in medium-sized randomly-generated graphs. In addition, the hybrid heuristic achieved good results identifying labeled overlapping clusters in a supervised data set experiment. Furthermore, we also show that, with our new problem definition, clustering a vertex in more than one cluster can reduce the edges editing cost.  相似文献   

14.
基于密度可达的多密度聚类算法   总被引:1,自引:0,他引:1       下载免费PDF全文
为对多密度数据集聚类,提出一种基于密度可达的多密度聚类算法。使用网格划分技术来提高计算每个点密度值的效率,每次聚类都是从最高密度点开始,根据密度可达的概念和广度优先的策略逐步向外扩展进行聚类。实验表明,该算法能够有效地对任意形状、大小的均匀数据集和多密度数据集进行聚类,并能较好地识别出孤立点和噪声,其精度和效率优于SNN算法。  相似文献   

15.
如何根据负载状况实时优化应用服务器集群的部署,以在能耗与性能之间取得平衡是急需解决的重要问题.对此,提出一种应用服务器集群能耗与性能平衡的在线实时优化策略,优化目标是最小化能耗与请求丢弃速率的加权值,优化内容包括各服务器的开关和CPU频率.该策略包括小规模集群优化(SSCOpt)和大规模集群优化(LSCOpt)两种方案:前者定义大量的变量,将集群优化描述成线性混合整数规划问题,然后采用软件包求解;后者通过分析能耗和负载模型的特性定义很少的变量,将集群优化描述成非线性混合整数规划问题,并提出一种基于花朵授粉算法和变量融合的求解算法.测试结果表明:当集群规模较小时,SSCOpt方案能快速求得全局最优部署;当集群规模较大时,LSCOpt方案能快速求得很好的次优部署.  相似文献   

16.
Several protocols have been proposed to deal with the group key management problem in mobile ad hoc networks (MANETs). Most of these protocols organize the network into clusters to reduce the cost of key refresh or rekeying. Rekeying constitutes a challenging issue in group key management because it must be launched whenever the constitution of the group is altered following a leave or a join operation. However, cluster maintenance may also generate significative communication overhead. So, the clustering algorithm is an important factor in the performance of any key management solution. A clustering algorithm that ensures stable clusters in spite of mobility is very appreciable in mobile ad hoc networks. In fact, all the overhead due to the traffic generated by cluster adjustments and the related rekeying procedures will be saved. As far as we know, no existing clustering algorithm takes into account self-stabilization while relying on the mobility resilience of graph alliances. In this paper, we propose a fully distributed and self-stabilizing clustering algorithm for key management in MANETs where each cluster is an alliance.  相似文献   

17.
Short text clustering by finding core terms   总被引:1,自引:1,他引:0  
A new clustering strategy, TermCut, is presented to cluster short text snippets by finding core terms in the corpus. We model the collection of short text snippets as a graph in which each vertex represents a piece of short text snippet and each weighted edge between two vertices measures the relationship between the two vertices. TermCut is then applied to recursively select a core term and bisect the graph such that the short text snippets in one part of the graph contain the term, whereas those snippets in the other part do not. We apply the proposed method on different types of short text snippets, including questions and search results. Experimental results show that the proposed method outperforms state-of-the-art clustering algorithms for clustering short text snippets.  相似文献   

18.
Clustering entities into dense parts is an important issue in social network analysis. Real social networks usually evolve over time and it remains a problem to efficiently cluster dynamic social networks. In this paper, a dynamic social network is modeled as an initial graph with an infinite change stream, called change stream model, which naturally eliminates the parameter setting problem of snapshot graph model. Based on the change stream model, the incremental version of a well known k-clique clustering problem is studied and incremental k-clique clustering algorithms are proposed based on local DFS (depth first search) forest updating technique. It is theoretically proved that the proposed algorithms outperform corresponding static ones and incremental spectral clustering algorithm in terms of time complexity. The practical performances of our algorithms are extensively evaluated and compared with the baseline algorithms on ENRON and DBLP datasets. Experimental results show that incremental k-clique clustering algorithms are much more efficient than corresponding static ones, and have no accumulating errors that incremental spectral clustering algorithm has and can capture the evolving details of the clusters that snapshot graph model based algorithms miss.  相似文献   

19.
针对互联网中存在的恶意行为,特别是社交网络应用中的在线恶意行为,通常使用基于用户多维特征的聚类分析算法进行检测。提出一种动态特征选择算法(DFSA),使用具有特征加权熵的模糊C均值目标函数,首先为参数构建一个学习模式,自动计算每个特征权重,并剔除权重小于阈值的特征,动态选择重要的特征,迭代地更新隶属函数、簇中心和特征权重直到最优化为止,最后识别出具有高精度的恶意用户行为簇。仿真结果表明,对比SDAFS算法、ELAFC算法和NADMB算法,DFSA算法在Rand指数、Jaccard指数和归一化互信息量3个主要性能指标上均有改善。  相似文献   

20.
针对现有在线社交网络(OSNs)采样方法无法有效地应用于低连通性的社交网络,且采集的样本顶点平均度严重偏离原始社交网络、顶点过度采样等问题,本文基于蒙特卡罗随机游走(MHRW)采样方法,引入双重跳跃策略、并行机制和顶点缓存区,提出一种跳跃无偏并行顶点(JPS)采样方法。将在线社交网络数据集建模为包含顶点和边的社交图进行模拟采样,利用Python/Matplotlib绘图库绘制采集的样本顶点属性图。实验结果表明,该采样方法更有效地应用于不同连通强度的社交图,提高了采样过程中的顶点更新率,降低了样本顶点的平均度偏差且能够更快速地收敛。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号