首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
This paper considers the problem of computing a constrained edit distance between unordered labeled trees. The problem of approximate unordered tree matching is also considered. We present dynamic programming algorithms solving these problems in sequential timeO(|T 1|×|T 2|×(deg(T 1)+deg(T 2))× log2(deg(T 1)+deg(T 2))). Our previous result shows that computing the edit distance between unordered labeled trees is NP-complete.This research was supported by the Natural Sciences and Engineering Research Council of Canada under Grant No. OGP0046373.  相似文献   

2.
The guided tree edit distance problem is to find a minimum cost series of edit operations that transforms two input forests F and G into isomorphic forests F and G such that a third input forest H is included in F (and G). The edit operations are relabeling a vertex and deleting a vertex. We show efficient algorithms for this problem that are faster than the previous algorithm for this problem of Peng and Ting [Z. Peng, H. Ting, Guided forest edit distance: Better structure comparisons by using domain-knowledge, in: Proc. 18th Symposium on Combinatorial Pattern Matching (CPM), 2007, pp. 28-39].  相似文献   

3.
We consider a relationship between the unit cost edit distance for two rooted ordered trees and the unit cost edit distance for the corresponding Euler strings. We show that the edit distance between trees is at least half of the edit distance between the Euler strings and is at most 2h+1 times the edit distance between the Euler strings, where h is the minimum height of two trees. The result can be extended for more general cost functions.  相似文献   

4.
In this paper, we propose a formal definition of a new class of trees called semi-ordered trees and a polynomial dynamic programming algorithm to compute a constrained edit distance between such trees. The core of the method relies on a similar approach to compare unordered [Kaizhong Zhang, A constrained edit distance between unordered labeled trees, Algorithmica 15 (1996) 205–222] and ordered trees [Kaizhong Zhang, Algorithms for the constrained editing distance between ordered labeled trees and related problems, Pattern Recognition 28 (3) (1995) 463–474]. The method is currently applied to evaluate the similarity between architectures of apple trees [Vincent Segura, Aida Ouangraoua, Pascal Ferraro, Evelyne Costes, Comparison of tree architecture using tree edit distances: Application to two-year-old apple tree, Euphytica 161 (2007) 155–164].  相似文献   

5.
贾楠  付晓东  黄袁  刘晓燕  代志华 《计算机应用》2012,32(12):3529-3533
在工作流的发现和聚类等应用中,需要对两个工作流模型的距离进行度量。因此,提出一种计算两个不同结构化工作流的距离定量度量方法。首先介绍了结构化工作流,并将每一个结构化工作流转换为流程结构树;然后基于两个结构树之间的树编辑距离来计算工作流之间的距离及相应相似度。该距离度量方法满足距离度量的3个属性,即同实体不可区分性、对称性和三角不等式性质。这些属性使得该距离度量方法可以在工作流模型管理活动中作为定量分析工具。实验结果表明,基于树编辑距离的工作流度量方法是可行的。同时,与基于邻接矩阵的距离度量方法相比,该方法考虑了不同结构之间的语义距离,有效验证了此方法的合理性。  相似文献   

6.
7.
Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the triangle inequality, under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. We prove that the new distance, in the range [0, 1], is a genuine metric as a simple function of the sizes of two ordered labeled trees and the tree edit distance between them, which can be directly computed through tree edit distance with the same complexity. Based on an efficient algorithm to represent digits as ordered labeled trees, we show that the normalized tree edit metric can provide slightly better results than other existing methods in handwritten digit recognition experiments using the approximating and eliminating search algorithm (AESA) algorithm.  相似文献   

8.
Marc  Laurent  Amaury  Marc   《Pattern recognition》2008,41(8):2611-2629
Nowadays, there is a growing interest in machine learning and pattern recognition for tree-structured data. Trees actually provide a suitable structural representation to deal with complex tasks such as web information extraction, RNA secondary structure prediction, computer music, or conversion of semi-structured data (e.g. XML documents). Many applications in these domains require the calculation of similarities over pairs of trees. In this context, the tree edit distance (ED) has been subject of investigations for many years in order to improve its computational efficiency. However, used in its classical form, the tree ED needs a priori fixed edit costs which are often difficult to tune, that leaves little room for tackling complex problems. In this paper, to overcome this drawback, we focus on the automatic learning of a non-parametric stochastic tree ED. More precisely, we are interested in two kinds of probabilistic approaches. The first one builds a generative model of the tree ED from a joint distribution over the edit operations, while the second works from a conditional distribution providing then a discriminative model. To tackle these tasks, we present an adaptation of the expectation–maximization algorithm for learning these distributions over the primitive edit costs. Two experiments are conducted. The first is achieved on artificial data and confirms the interest to learn a tree ED rather than a priori imposing edit costs; The second is applied to a pattern recognition task aiming to classify handwritten digits.  相似文献   

9.
10.
It is shown in this paper that the minimum distance between two finite planar sets of n points can be computer in O(n log n) worst-case running time and that this is optimal to within a constant factor. Furthermore, when the sets form a convex polygon this complexity can be reduced O(n).  相似文献   

11.
基于树编辑距离的层次聚类算法   总被引:1,自引:0,他引:1       下载免费PDF全文
为了识别犯罪嫌疑人伪造和篡改的虚假身份,利用树编辑距离计算个体属性相似性,证明了树编辑距离的相关数学性质,对属性应用层次编码方法,提出了一种新的基于树编辑距离的层次聚类算法HCTED(Hi-erarchical Clustering Algorithm Based on Tree Edit Distance)。新算法通过树编辑操作使用最少的代价计算属性相似性,克服了传统聚类算法标称型计算的缺陷,提高了聚类精度,通过设定阈值对给定样本聚类。实验证明了新方法在身份识别上的准确性和有效性,讨论了不同参数对实验结果的影响,对比传统聚类算法,HCTED算法性能明显提高。新算法已经应用到警用流动人口分析中,取得了良好效果。  相似文献   

12.
Many enterprise areas, such as marketing, variant design, group technology, and cellular manufacturing, require their wide variety of products to be organized into families, which are clusters of similar products. In this paper, we propose a similarity metric for finding the distance between existing products based on bills of materials (BOMs), a class of unordered trees. We show that existing editing operations for unordered trees are not consistent for BOMs, and present a similarity metric based on the symmetric difference. We also provide an polynomial time algorithm for finding the minimum weighted symmetric difference between a pair of unordered trees. The results of the pairwise comparisons are used as a distance metric for a clustering algorithm that groups the BOM trees into product families.  相似文献   

13.
This paper deals with the multiobjective version of the optimal spanning tree problem. More precisely, we are interested in determining the optimal spanning tree according to an Ordered Weighted Average (OWA) of its objective values. We first show that the problem is weakly NP-hard. We then propose different mixed integer programming formulations, according to different subclasses of OWA functions. Furthermore, we provide various preprocessing procedures, the validity scopes of which depend again on the considered subclass of OWA functions. For designing such procedures, we propose generalized optimality conditions and efficiently computable bounds. These procedures enable to reduce the size of the instances before launching an off-the-shelf software for solving the mixed integer program. Their impact on the resolution time is evaluated on the basis of numerical experiments.  相似文献   

14.
15.
16.
Proposes a distance measure between unrooted and unordered trees based on the strongly structure-preserving mapping (SSPM). SSPM can make correspondences between the vertices of similar substructures of given structures more strictly than previously proposed mappings. The time complexity of computing the distance between trees Ta and T b is O(mb3NaNb), where Na and Nb are the number of vertices in trees Ta and Tb, respectively; ma and m b are the maximum degrees of a vertex in Ta and T b, respectively; and ma⩽mb is assumed. The space complexity of the method is O(NaNb )  相似文献   

17.
The greedy algorithm for edit distance with moves   总被引:1,自引:0,他引:1  
  相似文献   

18.
结合编辑距离和Google距离的语义标注方法*   总被引:1,自引:0,他引:1  
提出了一种在领域本体指导下对网页进行语义标注的方法。该方法利用编辑距离和Google距离从词语的语法和语义两方面综合度量词汇与本体概念之间的语义相关度,从而在网页与本体之间建立映射关系。此外,对网页进行语义标注后,利用标注结果对本体进行有效扩充,使本体更趋于领域化。实验结果表明该方法是行之有效的。  相似文献   

19.
Finding a sequence of edit operations that transforms one string of symbols into another with the minimum cost is a well-known problem. The minimum cost, or edit distance, is a widely used measure of the similarity of two strings. An important parameter of this problem is the cost function, which specifies the cost of each insertion, deletion, and substitution. We show that cost functions having the same ratio of the sum of the insertion and deletion costs divided by the substitution cost yield the same minimum cost sequences of edit operations. This leads to a partitioning of the universe of cost functions into equivalence classes. Also, we show the relationship between a particular set of cost functions and the longest common subsequence of the input strings. This work was supported in part by the U.S. Department of Defense and the U.S. Department of Energy.  相似文献   

20.
We consider a transformation on binary trees, named right-arm rotation, which is a special instance of the well-known rotation transformation. Only rotations at nodes of the right arm of the trees are allowed. Using ordinal tools, we give an efficient algorithm for computing the right-arm rotation distance between two binary trees, i.e., the minimum number of right-arm rotations necessary to transform one tree into the other.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号