首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
A survey of graph edit distance   总被引:1,自引:0,他引:1  
Inexact graph matching has been one of the significant research foci in the area of pattern analysis. As an important way to measure the similarity between pairwise graphs error-tolerantly, graph edit distance (GED) is the base of inexact graph matching. The research advance of GED is surveyed in order to provide a review of the existing literatures and offer some insights into the studies of GED. Since graphs may be attributed or non-attributed and the definition of costs for edit operations is various, the existing GED algorithms are categorized according to these two factors and described in detail. After these algorithms are analyzed and their limitations are identified, several promising directions for further research are proposed.  相似文献   

2.
Bayesian graph edit distance   总被引:2,自引:0,他引:2  
This paper describes a novel framework for comparing and matching corrupted relational graphs. The paper develops the idea of edit-distance originally introduced for graph-matching by Sanfeliu and Fu (1983). We show how the Levenshtein distance (1966) can be used to model the probability distribution for structural errors in the graph-matching problem. This probability distribution is used to locate matches using MAP label updates. We compare the resulting graph-matching algorithm with that recently reported by Wilson and Hancock. The use of edit-distance offers an elegant alternative to the exhaustive compilation of label dictionaries. Moreover, the method is polynomial rather than exponential in its worst-case complexity. We support our approach with an experimental study on synthetic data and illustrate its effectiveness on an uncalibrated stereo correspondence problem. This demonstrates experimentally that the gain in efficiency is not at the expense of quality of match  相似文献   

3.
在图相似性搜索问题中,图编辑距离是较为普遍的度量方法,其计算性能很大程度上决定了图相似性搜索算法的性能。针对传统图编辑距离算法中存在的因大量冗余映射和较大搜索空间导致的性能低下问题,提出了一种改进的图编辑距离算法。该算法首先对图中顶点进行等价划分,以此计算映射编码来判断等价映射;然后定义映射完整性更新等价映射优先级,选出主映射参与扩展;其次,设计高效的启发式函数,提出基于映射编码的下界计算方法,快速得到最优映射。最后,将改进的图编辑距离算法扩展应用于图相似性搜索。在不同数据集上的实验结果表明,该算法具有更好的搜索性能,在搜索空间上最大可降低49%,速度提升了约29%。  相似文献   

4.
Graph matching and graph edit distance have become important tools in structural pattern recognition. The graph edit distance concept allows us to measure the structural similarity of attributed graphs in an error-tolerant way. The key idea is to model graph variations by structural distortion operations. As one of its main constraints, however, the edit distance requires the adequate definition of edit cost functions, which eventually determine which graphs are considered similar. In the past, these cost functions were usually defined in a manual fashion, which is highly prone to errors. The present paper proposes a method to automatically learn cost functions from a labeled sample set of graphs. To this end, we formulate the graph edit process in a stochastic context and perform a maximum likelihood parameter estimation of the distribution of edit operations. The underlying distortion model is learned using an Expectation Maximization algorithm. From this model we finally derive the desired cost functions. In a series of experiments we demonstrate the learning effect of the proposed method and provide a performance comparison to other models.  相似文献   

5.
Graph edit distance is a powerful and flexible method for error-tolerant graph matching. Yet it can only be calculated for small graphs in practice due to its exponential time complexity when considering unconstrained graphs. In this paper we propose a quadratic time approximation of graph edit distance based on Hausdorff matching. In a series of experiments we analyze the performance of the proposed Hausdorff edit distance in the context of graph classification and compare it with a cubic time algorithm based on the assignment problem. Investigated applications include nearest neighbor classification of graphs representing letter drawings, fingerprints, and molecular compounds as well as hidden Markov model classification of vector space embedded graphs representing handwriting. In many cases, a substantial speedup is achieved with only a minor loss in accuracy or, in one case, even with a gain in accuracy. Overall, the proposed Hausdorff edit distance shows a promising potential in terms of flexibility, efficiency, and accuracy.  相似文献   

6.
Graphs are widely used to model complicated data semantics in many applications in bioinformatics, chemistry, social networks, pattern recognition, etc. A recent trend is to tolerate noise arising from various sources such as erroneous data entries and find similarity matches. In this paper, we study graph similarity queries with edit distance constraints. Inspired by the $q$ -gram idea for string similarity problems, our solution extracts paths from graphs as features for indexing. We establish a lower bound of common features to generate candidates. Efficient algorithms are proposed to handle three types of graph similarity queries by exploiting both matching and mismatching features as well as degree information to improve the filtering and verification on candidates. We demonstrate the proposed algorithms significantly outperform existing approaches with extensive experiments on real and synthetic datasets.  相似文献   

7.
A linear programming (LP) approach is proposed for the weighted graph matching problem. A linear program is obtained by formulating the graph matching problem in L1 norm and then transforming the resulting quadratic optimization problem to a linear one. The linear program is solved using a simplex-based algorithm. Then, approximate 0-1 integer solutions are obtained by applying the Hungarian method on the real solutions of the linear program. The complexity of the proposed algorithm is polynomial time, and it is O(n 6L) for matching graphs of size n. The developed algorithm is compared to two other algorithms. One is based on an eigendecomposition approach and the other on a symmetric polynomial transform. Experimental results showed that the LP approach is superior in matching graphs than both other methods  相似文献   

8.
We consider a relationship between the unit cost edit distance for two rooted ordered trees and the unit cost edit distance for the corresponding Euler strings. We show that the edit distance between trees is at least half of the edit distance between the Euler strings and is at most 2h+1 times the edit distance between the Euler strings, where h is the minimum height of two trees. The result can be extended for more general cost functions.  相似文献   

9.
Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the triangle inequality, under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. We prove that the new distance, in the range [0, 1], is a genuine metric as a simple function of the sizes of two ordered labeled trees and the tree edit distance between them, which can be directly computed through tree edit distance with the same complexity. Based on an efficient algorithm to represent digits as ordered labeled trees, we show that the normalized tree edit metric can provide slightly better results than other existing methods in handwritten digit recognition experiments using the approximating and eliminating search algorithm (AESA) algorithm.  相似文献   

10.
In this paper, we propose a formal definition of a new class of trees called semi-ordered trees and a polynomial dynamic programming algorithm to compute a constrained edit distance between such trees. The core of the method relies on a similar approach to compare unordered [Kaizhong Zhang, A constrained edit distance between unordered labeled trees, Algorithmica 15 (1996) 205–222] and ordered trees [Kaizhong Zhang, Algorithms for the constrained editing distance between ordered labeled trees and related problems, Pattern Recognition 28 (3) (1995) 463–474]. The method is currently applied to evaluate the similarity between architectures of apple trees [Vincent Segura, Aida Ouangraoua, Pascal Ferraro, Evelyne Costes, Comparison of tree architecture using tree edit distances: Application to two-year-old apple tree, Euphytica 161 (2007) 155–164].  相似文献   

11.
This paper considers the problem of computing a constrained edit distance between unordered labeled trees. The problem of approximate unordered tree matching is also considered. We present dynamic programming algorithms solving these problems in sequential timeO(|T 1|×|T 2|×(deg(T 1)+deg(T 2))× log2(deg(T 1)+deg(T 2))). Our previous result shows that computing the edit distance between unordered labeled trees is NP-complete.This research was supported by the Natural Sciences and Engineering Research Council of Canada under Grant No. OGP0046373.  相似文献   

12.
Graph similarity is an important notion with many applications. Graph edit distance is one of the most flexible graph similarity measures available. The main problem with this measure is that in practice it can only be computed for small graphs due to its exponential time complexity. This paper addresses the high complexity of graph edit distance computations. Specifically, we present CSI_GED, a novel edge-centric approach for computing graph edit distance through common sub-structure isomorphisms enumeration. CSI_GED utilizes backtracking search combined with a number of heuristics to reduce memory requirements and quickly prune away a large portion of the mapping search space. Experiments show that CSI_GED is highly efficient for computing graph edit distance; it outperforms the state-of-the-art methods by over three orders of magnitude. It also shows that CSI_GED scales the computation gracefully to larger and distant graphs on which current methods fail to run. Moreover, we evaluated CSI_GED as a stand-alone graph edit similarity search query method. The experiments show that CSI_GED is effective and scalable, and outperforms the state-of-the-art indexing-based methods by over two orders of magnitude.  相似文献   

13.
Approximate labeling via graph cuts based on linear programming   总被引:1,自引:0,他引:1  
A new framework is presented for both understanding and developing graph-cut-based combinatorial algorithms suitable for the approximate optimization of a very wide class of Markov random fields (MRFs) that are frequently encountered in computer vision. The proposed framework utilizes tools from the duality theory of linear programming in order to provide an alternative and more general view of state-of-the-art techniques like the alpha-expansion algorithm, which is included merely as a special case. Moreover, contrary to alpha-expansion, the derived algorithms generate solutions with guaranteed optimality properties for a much wider class of problems, for example, even for MRFs with nonmetric potentials. In addition, they are capable of providing per-instance suboptimality bounds in all occasions, including discrete MRFs with an arbitrary potential function. These bounds prove to be very tight in practice (that is, very close to 1), which means that the resulting solutions are almost optimal. Our algorithms' effectiveness is demonstrated by presenting experimental results on a variety of low-level vision tasks, such as stereo matching, image restoration, image completion, and optical flow estimation, as well as on synthetic problems.  相似文献   

14.
We define the unrestricted modified edit distance based on the modified edit distance defined by Galil and Giancarlo (1989) where the cost of substring deletions and insertions are contextsensitive and the cost of character substitutions are context-free. The modified edit distance is the minimum cost of converting a string X to a string Y where the sequence of edit operations has the property that all substring deletions precede all character substitutions and all character substitutions precede all substring insertions. Note that the modified edit distance does not satisfy the triangle inequality. We show that the problem of finding the unrestricted modified edit distance which is the minimum cost over all edit sequences (without these constraints) of converting X to Y is undecidable.  相似文献   

15.
16.
基于树编辑距离的层次聚类算法   总被引:1,自引:0,他引:1       下载免费PDF全文
为了识别犯罪嫌疑人伪造和篡改的虚假身份,利用树编辑距离计算个体属性相似性,证明了树编辑距离的相关数学性质,对属性应用层次编码方法,提出了一种新的基于树编辑距离的层次聚类算法HCTED(Hi-erarchical Clustering Algorithm Based on Tree Edit Distance)。新算法通过树编辑操作使用最少的代价计算属性相似性,克服了传统聚类算法标称型计算的缺陷,提高了聚类精度,通过设定阈值对给定样本聚类。实验证明了新方法在身份识别上的准确性和有效性,讨论了不同参数对实验结果的影响,对比传统聚类算法,HCTED算法性能明显提高。新算法已经应用到警用流动人口分析中,取得了良好效果。  相似文献   

17.
G. Tremblay  F. Champagne 《Software》2007,37(2):207-230
Musical dictations for ear training and training in music writing form a key practice of basic musical training. Marking students' dictation exercises for large groups of students can require a lot of work. In this paper, we present a tool, called CADiM, that can help automate the marking of such musical dictations. The edit distance, which computes the similarity between any two strings, has been used in various areas such as string/text analysis, protein/genome matching in bio‐computing and musical applications, for example music retrieval or musicological analysis. CADiM's marking algorithm is based on an earlier edit distance proposed for musical sequences, but adapted to reflect the marking heuristic used by a domain expert's specific approach to musical training. Computing an edit distance on musical scores requires using an appropriate representation. More precisely, given our specific context, a symbolic representation is required. We use MusicXML, an XML application for standard Western music notation. Given a Document Type Definition for MusicXML, existing Java tools can generate a MusicXML parser. Such a parser, given appropriate input files, then generates an intermediate form (DOM object) on which analyses and transformations are performed in order to compute the edit distance. In turn, the edit distance is used to give a mark as well as identify the key errors. CADiM has been applied to a number of test cases and the results compared with those obtained by a domain expert. Overall, the results are promising, namely, only 3% difference between the domain expert's marks and those produced by CADiM. Copyright © 2006 John Wiley & Sons, Ltd.  相似文献   

18.
Marc  Laurent  Amaury  Marc   《Pattern recognition》2008,41(8):2611-2629
Nowadays, there is a growing interest in machine learning and pattern recognition for tree-structured data. Trees actually provide a suitable structural representation to deal with complex tasks such as web information extraction, RNA secondary structure prediction, computer music, or conversion of semi-structured data (e.g. XML documents). Many applications in these domains require the calculation of similarities over pairs of trees. In this context, the tree edit distance (ED) has been subject of investigations for many years in order to improve its computational efficiency. However, used in its classical form, the tree ED needs a priori fixed edit costs which are often difficult to tune, that leaves little room for tackling complex problems. In this paper, to overcome this drawback, we focus on the automatic learning of a non-parametric stochastic tree ED. More precisely, we are interested in two kinds of probabilistic approaches. The first one builds a generative model of the tree ED from a joint distribution over the edit operations, while the second works from a conditional distribution providing then a discriminative model. To tackle these tasks, we present an adaptation of the expectation–maximization algorithm for learning these distributions over the primitive edit costs. Two experiments are conducted. The first is achieved on artificial data and confirms the interest to learn a tree ED rather than a priori imposing edit costs; The second is applied to a pattern recognition task aiming to classify handwritten digits.  相似文献   

19.
目前关于XML文档相似性算法有很多种,其中基于编辑距离的方法是很重要的一类。目前已发表的基于编辑距离的算法中,编辑图算法由于其计算高效率的特点成为研究的出发点。首先介绍了编辑图算法的思想,由于它在计算过程中对同层兄弟节点的顺序有很强的依赖性,因此不能准确有效地比较数据无序的数据中心的XML文档相似性。针对该问题,在编辑图算法思想的基础上,结合路径算法的思想提出拆分编辑图算法。实验结果表明,拆分编辑图算法降低了编辑图算法中对兄弟节点次序的依赖性,更适合于数据中心的XML文档相似性比较,而且所得结果更加准确有效。  相似文献   

20.
Inventory systems with uncertainty go hand in hand with the determination of a safety stock level. The decision on the safety stock level is based on a performance measure, for example the expected shortage per replenishment period or the probability of a stock-out per replenishment period. The performance measure assumes complete knowledge of the probability distribution during lead time, which might not be available. In case of incomplete information regarding the lead-time distribution of demand, no single figure for the safety stock can de determined in order to satisfy a performance measure. However, an optimisation model may be formulated in order to determine a safety stock level which guarantees the performance measure under the worst case of lead-time demand, of which the distribution is known in an incomplete way. It is shown that this optimisation problem can be formulated as a linear programming problem.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号