首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Clustering and embedding using commute times   总被引:1,自引:0,他引:1  
This paper exploits the properties of the commute time between nodes of a graph for the purposes of clustering and embedding, and explores its applications to image segmentation and multi-body motion tracking. Our starting point is the lazy random walk on the graph, which is determined by the heatkernel of the graph and can be computed from the spectrum of the graph Laplacian. We characterize the random walk using the commute time (i.e. the expected time taken for a random walk to travel between two nodes and return) and show how this quantity may be computed from the Laplacian spectrum using the discrete Green's function. Our motivation is that the commute time can be anticipated to be a more robust measure of the proximity of data than the raw proximity matrix. In this paper, we explore two applications of the commute time. The first is to develop a method for image segmentation using the eigenvector corresponding to the smallest eigenvalue of the commute time matrix. We show that our commute time segmentation method has the property of enhancing the intra-group coherence while weakening inter-group coherence and is superior to the normalized cut. The second application is to develop a robust multi-body motion tracking method using an embedding based on the commute time. Our embedding procedure preserves commute time, and is closely akin to kernel PCA, the Laplacian eigenmap and the diffusion map. We illustrate the results both on synthetic image sequences and real world video sequences, and compare our results with several alternative methods.  相似文献   

2.
In this paper, we investigate the use of heat kernels as a means of embedding the individual nodes of a graph in a vector space. The reason for turning to the heat kernel is that it encapsulates information concerning the distribution of path lengths and hence node affinities on the graph. The heat kernel of the graph is found by exponentiating the Laplacian eigensystem over time. In this paper, we explore how graphs can be characterized in a geometric manner using embeddings into a vector space obtained from the heat kernel. We explore two different embedding strategies. The first of these is a direct method in which the matrix of embedding co-ordinates is obtained by performing a Young–Householder decomposition on the heat kernel. The second method is indirect and involves performing a low-distortion embedding by applying multidimensional scaling to the geodesic distances between nodes. We show how the required geodesic distances can be computed using parametrix expansion of the heat kernel. Once the nodes of the graph are embedded using one of the two alternative methods, we can characterize them in a geometric manner using the distribution of the node co-ordinates. We investigate several alternative methods of characterization, including spatial moments for the embedded points, the Laplacian spectrum for the Euclidean distance matrix and scalar curvatures computed from the difference in geodesic and Euclidean distances. We experiment with the resulting algorithms on the COIL database.  相似文献   

3.
Graph structures have been proved important in high level-vision since they can be used to represent structural and relational arrangements of objects in a scene. One of the problems that arises in the analysis of structural abstractions of objects is graph clustering. In this paper, we explore how permutation invariants computed from the trace of the heat kernel can be used to characterize graphs for the purposes of measuring similarity and clustering. The heat kernel is the solution of the heat equation and is a compact representation of the path-length distribution on a graph. The trace of the heat kernel is given by the sum of the Laplacian eigenvalues exponentiated with time. We explore three different approaches to characterizing the heat kernel trace as a function of time. Our first characterization is based on the zeta function, which from the Mellin transform is the moment generating function of the heat kernel trace. Our second characterization is unary and is found by computing the derivative of the zeta function at the origin. The third characterization is derived from the heat content, i.e. the sum of the elements of the heat kernel. We show how the heat content can be expanded as a power series in time, and the coefficients of the series can be computed using the Laplacian spectrum. We explore the use of these characterizations as a means of representing graph structure for the purposes of clustering, and compare them with the use of the Laplacian spectrum. Experiments with the synthetic and real-world databases reveal that each of the three proposed invariants is effective and outperforms the traditional Laplacian spectrum. Moreover, the heat-content invariants appear to consistently give the best results in both synthetic sensitivity studies and on real-world object recognition problems.  相似文献   

4.
The spectrum of a graph has been widely used in graph theory to characterise the properties of a graph and extract information from its structure. It has also been employed as a graph representation for pattern matching since it is invariant to the labelling of the graph. There are, however, a number of potential drawbacks in using the spectrum as a representation of a graph. Firstly, more than one graph may share the same spectrum. It is well known, for example, that very few trees can be uniquely specified by their spectrum. Secondly, the spectrum may change dramatically with a small change structure.There are a wide variety of graph matrix representations from which the spectrum can be extracted. Among these are the adjacency matrix, combinatorial Laplacian, normalised Laplacian and unsigned Laplacian. Spectra can also be derived from the heat kernel matrix and path length distribution matrix. The choice of matrix representation clearly has a large effect on the suitability of spectrum in a number of pattern recognition tasks.In this paper we investigate the performance of the spectra as a graph representation in a variety of situations. Firstly, we investigate the cospectrality of the various matrix representations over large graph and tree sets, extending the work of previous authors. We then show that the Euclidean distance between spectra tracks the edit distance between graphs over a wide range of edit costs, and we analyse the accuracy of this relationship. We then use the spectra to both cluster and classify the graphs and demonstrate the effect of the graph matrix formulation on error rates. These results are produced using both synthetic graphs and trees and graphs derived from shape and image data.  相似文献   

5.
6.
The spectrum of a graph is the set of all eigenvalues of the Laplacian matrix of the graph. There is a closed relationship between the Laplacian spectrum of graphs and some properties of graphs such as connectivity. In the recent years Laplacian spectrum of graphs has been widely applied in many fields. The application of Laplacian spectrum of graphs to circuit partitioning problems is reviewed in this paper. A new criterion of circuit partitioning is proposed and the bounds of the partition ratio for weighted graphs are also presented. Moreover, the deficiency of graph-partitioning algorithms by Laplacian eigenvectors is addressed and an algorithm by means of the minimal spanning tree of a graph is proposed. By virtue of taking the graph structure into consideration this algorithm can fulfill general requirements of circuit partitioning.  相似文献   

7.
The analysis of complex networks is of major interest in various fields of science. In many applications we face the challenge that the exact topology of a network is unknown but we are instead given information about distances within this network. The theoretical approaches to this problem have so far been focusing on the reconstruction of graphs from shortest path distance matrices. Often, however, movements in networks do not follow shortest paths but occur in a random fashion. In these cases an appropriate distance measure can be defined as the mean length of a random walk between two nodes — a quantity known as the mean first hitting time.In this contribution we investigate whether a graph can be reconstructed from its mean first hitting time matrix and put forward an algorithm for solving this problem. A heuristic method to reduce the computational effort is described and analyzed. In the case of trees we can even give an algorithm for reconstructing graphs from incomplete random walk distance matrices.  相似文献   

8.
This work presents a new perspective on characterizing the similarity between elements of a database or, more generally, nodes of a weighted and undirected graph. It is based on a Markov-chain model of random walk through the database. More precisely, we compute quantities (the average commute time, the pseudoinverse of the Laplacian matrix of the graph, etc.) that provide similarities between any pair of nodes, having the nice property of increasing when the number of paths connecting those elements increases and when the "length" of paths decreases. It turns out that the square root of the average commute time is a Euclidean distance and that the pseudoinverse of the Laplacian matrix is a kernel matrix (its elements are inner products closely related to commute times). A principal component analysis (PCA) of the graph is introduced for computing the subspace projection of the node vectors in a manner that preserves as much variance as possible in terms of the Euclidean commute-time distance. This graph PCA provides a nice interpretation to the "Fiedler vector," widely used for graph partitioning. The model is evaluated on a collaborative-recommendation task where suggestions are made about which movies people should watch based upon what they watched in the past. Experimental results on the MovieLens database show that the Laplacian-based similarities perform well in comparison with other methods. The model, which nicely fits into the so-called "statistical relational learning" framework, could also be used to compute document or word similarities, and, more generally, it could be applied to machine-learning and pattern-recognition tasks involving a relational database  相似文献   

9.
The arrangement graphs are a class of generalized star graphs. In this paper we construct a graph that consists of the maximum number of directed edge-disjoint spanning trees in an arrangement graph. The paths that connect the common root node to any given node through different spanning trees are node-disjoint, and the lengths of these paths differ from the shortest possible lengths by a small additive constant. This graph can be used to derive fault-tolerant algorithms for broadcasting and scattering problems without prior knowledge of the faulty elements of the network.  相似文献   

10.
A spanning tree T of a graph G=(V,E) is called a locally connected spanning tree if the set of all neighbors of v in T induces a connected subgraph of G for all vV. The problem of recognizing whether a graph admits a locally connected spanning tree is known to be NP-complete even when the input graphs are restricted to chordal graphs. In this paper, we propose linear time algorithms for finding locally connected spanning trees in cographs, complements of bipartite graphs and doubly chordal graphs, respectively.  相似文献   

11.
A quantum particle evolving by Schrödinger’s equation contains, from the kinetic energy of the particle, a term in its Hamiltonian proportional to Laplace’s operator. In discrete space, this is replaced by the discrete or graph Laplacian, which gives rise to a continuous-time quantum walk. Besides this natural definition, some quantum walk algorithms instead use the adjacency matrix to effect the walk. While this is equivalent to the Laplacian for regular graphs, it is different for non-regular graphs and is thus an inequivalent quantum walk. We algorithmically explore this distinction by analyzing search on the complete bipartite graph with multiple marked vertices, using both the Laplacian and adjacency matrix. The two walks differ qualitatively and quantitatively in their required jumping rate, runtime, sampling of marked vertices, and in what constitutes a natural initial state. Thus the choice of the Laplacian or adjacency matrix to effect the walk has important algorithmic consequences.  相似文献   

12.
Towards a theoretical foundation for Laplacian-based manifold methods   总被引:1,自引:0,他引:1  
In recent years manifold methods have attracted a considerable amount of attention in machine learning. However most algorithms in that class may be termed “manifold-motivated” as they lack any explicit theoretical guarantees. In this paper we take a step towards closing the gap between theory and practice for a class of Laplacian-based manifold methods. These methods utilize the graph Laplacian associated to a data set for a variety of applications in semi-supervised learning, clustering, data representation.We show that under certain conditions the graph Laplacian of a point cloud of data samples converges to the Laplace-Beltrami operator on the underlying manifold. Theorem 3.1 contains the first result showing convergence of a random graph Laplacian to the manifold Laplacian in the context of machine learning.  相似文献   

13.
This paper shows how to construct a generative model for graph structure through the embedding of the nodes of the graph in a vector space. We commence from a sample of graphs where the correspondences between nodes are unknown ab initio. We also work with graphs where there may be structural differences present, i.e. variations in the number of nodes in each graph and their edge structure. We characterise the graphs using the heat-kernel, and this is obtained by exponentiating the Laplacian eigensystem with time. The idea underpinning the method is to embed the nodes of the graphs into a vector space by performing a Young-Householder decomposition of the heat-kernel into an inner product of node co-ordinate matrices. The co-ordinates of the nodes are determined by the eigenvalues and eigenvectors of the Laplacian matrix, together with a time-parameter which can be used to scale the embedding. Node correspondences are located by applying Scott and Longuet-Higgins algorithm to the embedded nodes. We capture variations in graph structure using the covariance matrix for corresponding embedded point positions. We construct a point-distribution model for the embedded node positions using the eigenvalues and eigenvectors of the covariance matrix. We show how to use this model to both project individual graphs into the eigenspace of the point position covariance matrix and how to fit the model to potentially noisy graphs to reconstruct the Laplacian matrix. We illustrate the utility of the resulting method for shape analysis using data from the Caltech–Oxford and COIL databases.  相似文献   

14.
The second-order random walk has recently been shown to effectively improve the accuracy in graph analysis tasks.Existing work mainly focuses on centralized second-order random walk (SOW) algorithms.SOW algorithms rely on edge-to-edge transition probabilities to generate next random steps.However,it is prohibitively costly to store all the probabilities for large-scale graphs,and restricting the number of probabilities to consider can negatively impact the accuracy of graph analysis tasks.In this paper,we propose and study an alternative approach,SOOP (second-order random walks with on-demand probability computation),that avoids the space overhead by computing the edge-to-edge transition probabilities on demand during the random walk.However,the same probabilities may be computed multiple times when the same edge appears multiple times in SOW,incurring extra cost for redundant computation and communication.We propose two optimization techniques that reduce the complexity of computing edge-to-edge transition probabilities to generate next random steps,and reduce the cost of communicating out-neighbors for the probability computation,respectively.Our experiments on real-world and synthetic graphs show that SOOP achieves orders of magnitude better performance than baseline precompute solutions,and it can efficiently computes SOW algorithms on billion-scale graphs.  相似文献   

15.
A complex network can be modeled as a graph representing the “who knows who” relationship. In the context of graph theory for social networks, the notion of centrality is used to assess the relative importance of nodes in a given network topology. For example, in a network composed of large dense clusters connected through only a few links, the nodes involved in those links are particularly critical as far as the network survivability is concerned. This may also impact any application running on top of it. Such information can be exploited for various topological maintenance issues to prevent congestion and disruption. This can also be used offline to identify the most important actors in large social interaction graphs. Several forms of centrality have been proposed so far. Yet, they suffer from imperfections: initially designed for small social graphs, they are either of limited use (degree centrality), either incompatible in a distributed setting (e.g. random walk betweenness centrality).In this paper we introduce a novel form of centrality: the second order centrality which can be computed in a distributed manner. This provides locally each node with a value reflecting its relative criticity and relies on a random walk visiting the network in an unbiased fashion. To this end, each node records the time elapsed between visits of that random walk (called return time in the sequel) and computes the standard deviation (or second order moment) of such return times. The key point is that central nodes see regularly the random walk compared to other topology nodes. Both through theoretical analysis and simulation, we show that the standard deviation can be used to accurately identify critical nodes as well as to globally characterize graphs topology in a distributed way. We finally compare our proposal to well-known centralities to assess its competitivity.  相似文献   

16.
The problem of counting the number of spanning trees is an old topic in graph theory with important applications to reliable network design. Usually, it is desirable to put forward a formula of the number of spanning trees for various graphs, which is not only interesting in its own right but also in practice. Since some large graphs can be composed of some existing smaller graphs by using the product of graphs, the number of spanning trees of such large graph is also closely related to that of the corresponding smaller ones. In this article, we establish a formula for the number of spanning trees in the lexicographic product of two graphs, in which one graph is an arbitrary graph G and the other is a complete multipartite graph. The results extend some of the previous work, which is closely related to the number of vertices and Lapalacian eigenvalues of smaller graphs only.  相似文献   

17.
Pattern vectors from algebraic graph theory   总被引:3,自引:0,他引:3  
Graph structures have proven computationally cumbersome for pattern analysis. The reason for this is that, before graphs can be converted to pattern vectors, correspondences must be established between the nodes of structures which are potentially of different size. To overcome this problem, in this paper, we turn to the spectral decomposition of the Laplacian matrix. We show how the elements of the spectral matrix for the Laplacian can be used to construct symmetric polynomials that are permutation invariants. The coefficients of these polynomials can be used as graph features which can be encoded in a vectorial manner. We extend this representation to graphs in which there are unary attributes on the nodes and binary attributes on the edges by using the spectral decomposition of a Hermitian property matrix that can be viewed as a complex analogue of the Laplacian. To embed the graphs in a pattern space, we explore whether the vectors of invariants can be embedded in a low-dimensional space using a number of alternative strategies, including principal components analysis (PCA), multidimensional scaling (MDS), and locality preserving projection (LPP). Experimentally, we demonstrate that the embeddings result in well-defined graph clusters. Our experiments with the spectral representation involve both synthetic and real-world data. The experiments with synthetic data demonstrate that the distances between spectral feature vectors can be used to discriminate between graphs on the basis of their structure. The real-world experiments show that the method can be used to locate clusters of graphs.  相似文献   

18.
Finding the degree-constrained minimum spanning tree (d-MST) of a graph is a well-studied NP-hard problem of importance in communications network design and other network-related problems. In this paper we describe some previously proposed algorithms for solving the problem, and then introduce a novel tree construction algorithm called the randomized primal method (RPM) which builds degree-constrained trees of low cost from solution vectors taken as input. RPM is applied in three stochastic iterative search methods: simulated annealing, multistart hillclimbing, and a genetic algorithm. While other researchers have mainly concentrated on finding spanning trees in Euclidean graphs, we consider the more general case of random graph problems. We describe two random graph generators which produce particularly challenging d-MST problems. On these and other problems we find that the genetic algorithm employing RPM outperforms simulated annealing and multistart hillclimbing. Our experimental results provide strong evidence that the genetic algorithm employing RPM finds significantly lower-cost solutions to random graph d-MST problems than rival methods  相似文献   

19.
社交网络信息已被广泛的应用到传统的推荐上,一定程度上减轻了数据稀疏和冷启动问题.随着表示学习的兴起,出现了利用表示学习进行推荐的算法研究.然而社交网络过大,表示学习可扩展性差,难以在有限内存中进行计算.聚集图通过空间压缩,保留了关键的结构关系,去除次要或噪音的结构数据,便于表示学习能够有效学习图结构,从而更好地找到相似用户进行推荐.首先,利用图聚集算法同时考虑分组间及分组内的结构得到最终的聚集图;其次,在聚集图上计算随机游走的转移概率,然后选择每个具有偏差概率的后继节点并生成节点序列;最后将节点序列输入到skip-gram学习用户的潜在表示,获得节点的表示向量整合其信息到贝叶斯个性化排序模型(BPR)来解决项目排名问题.实验结果表明,该方法相比于社会化贝叶斯个性化排序(SBPR)、协同用户网络嵌入(CUNE)等基线方法在推荐任务中保持时间效率的同时有效提升了准确率、召回率和平均精度均值.  相似文献   

20.
The problem of counting the number of spanning trees is an old topic in graph theory with important applications to reliable network design. Usually, it is desirable to put forward a formula of the number of spanning trees for various graphs, which is not only interesting in its own right but also in practice. Since some large graphs can be composed of some existing smaller graphs by using the product of graphs, the number of spanning trees of such large graph is also closely related to that of the corresponding smaller ones. In this article, we establish a formula for the number of spanning trees in the lexicographic product of two graphs, in which one graph is an arbitrary graph G and the other is a complete multipartite graph. The results extend some of the previous work, which is closely related to the number of vertices and Lapalacian eigenvalues of smaller graphs only.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号