Similar literature
1.
This paper considers the problem of computing a constrained edit distance between unordered labeled trees. The problem of approximate unordered tree matching is also considered. We present dynamic programming algorithms solving these problems in sequential time $O(|T_1| \times |T_2| \times (\deg(T_1)+\deg(T_2)) \times \log_2(\deg(T_1)+\deg(T_2)))$. Our previous result shows that computing the edit distance between unordered labeled trees is NP-complete. This research was supported by the Natural Sciences and Engineering Research Council of Canada under Grant No. OGP0046373.

2.
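Information Collection Based on Constrained Tree Edit Distance and Navigation Tree   Total citations: 1, self-citations: 0, citations by others: 1
姜波, 丁岳伟. 《计算机工程》 (Computer Engineering), 2009, 35(14): 75-77
This paper introduces information-collection algorithms based on website and web page structure, and proposes a navigation tree algorithm based on constrained tree edit distance. The algorithm extracts the important HTML tags of a page to build a tag tree of the page structure and analyzes that structure. It then uses the constrained tree edit distance algorithm to judge how relevant a crawled page is to the topic. Finally, based on the URL-based topology of the website, it uses a navigation tree to constrain the crawl path of the collector. This improves the efficiency and accuracy of collecting target pages.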

3.
Let $G=(V,E)$ be an undirected multigraph with a special vertex $\mathit{root} \in V$, where each edge $e \in E$ is endowed with a length $l(e) \geq 0$ and a capacity $c(e) > 0$. For a path $P$ that connects $u$ and $v$, the transmission time of $P$ is defined as $t(P)=\sum_{e \in P} l(e) + \max_{e \in P} (1 / c(e))$. For a spanning tree $T$, let $P_{u,v}^T$ be the unique $u$--$v$ path in $T$. The quickest radius spanning tree problem is to find a spanning tree $T$ of $G$ such that $\max_{v \in V} t(P^T_{\mathit{root},v})$ is minimized. In this paper we present a 2-approximation algorithm for this problem, and show that unless $P=NP$, there is no approximation algorithm with a performance guarantee of $2 - \epsilon$ for any $\epsilon > 0$. The quickest diameter spanning tree problem is to find a spanning tree $T$ of $G$ such that $\max_{u,v \in V} t(P^T_{u,v})$ is minimized. We present a $\frac{3}{2}$-approximation to this problem, and prove that unless $P=NP$ there is no approximation algorithm with a performance guarantee of $\frac{3}{2}-\epsilon$ for any $\epsilon > 0$.
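The transmission time defined in this abstract can be illustrated with a small sketch. The function name `transmission_time`, the representation of a path as a list of `(length, capacity)` pairs, and the convention that an empty path costs 0 are assumptions made here for illustration; none of them come from the paper.

```python
from typing import List, Tuple


def transmission_time(path_edges: List[Tuple[float, float]]) -> float:
    """Return t(P) = sum of edge lengths + max over edges of 1 / capacity.

    path_edges holds one (length, capacity) pair per edge of the path P.
    An empty path is treated as costing 0 (an assumption of this sketch).
    """
    if not path_edges:
        return 0.0
    total_length = sum(length for length, _ in path_edges)
    # The smallest capacity on the path gives the largest 1 / c(e) term.
    bottleneck = max(1.0 / capacity for _, capacity in path_edges)
    return total_length + bottleneck
```

For a path with edges `(1.0, 2.0)` and `(2.0, 4.0)`, the lengths sum to 3 and the bottleneck term is 1/2, so the sketch returns 3.5.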

5.
Experimental comparisons of the running time of approximate string matching algorithms for the k differences problem are presented. Given a pattern string, a text string, and an integer k, the task is to find all approximate occurrences of the pattern in the text with at most k differences (insertions, deletions, changes). We consider seven algorithms based on different approaches including dynamic programming, Boyer–Moore string matching, suffix automata, and the distribution of characters. It turns out that none of the algorithms is the best for all values of the problem parameters, and the speed differences between the methods can be considerable.
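As a rough illustration of the dynamic programming approach to the k differences problem, the sketch below keeps one column of the edit distance table per text character and lets a match start anywhere in the text. It is the textbook column-by-column formulation, not necessarily any of the seven algorithms compared in the paper, and the name `k_differences` is hypothetical.

```python
from typing import List


def k_differences(pattern: str, text: str, k: int) -> List[int]:
    """Return the 1-based end positions in text where some substring ending
    there matches pattern with at most k insertions, deletions or changes."""
    m = len(pattern)
    # col[i] = fewest differences between pattern[:i] and any substring of
    # the text ending at the current position (the empty prefix costs 0).
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]  # table value for (i - 1, j - 1)
        col[0] = 0          # a match may start at any text position
        for i in range(1, m + 1):
            old = col[i]    # table value for (i, j - 1)
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost, old + 1, col[i - 1] + 1)
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends
```

The running time is proportional to the product of the pattern and text lengths, which is the baseline that faster methods of the kind surveyed here try to beat.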

6.
We present a framework for an automated generation of exact search tree algorithms for NP-hard problems. The purpose of our approach is twofold: rapid development and improved upper bounds. Many search tree algorithms for various problems in the literature are based on complicated case distinctions. Our approach may lead to a much simpler process of developing and analyzing these algorithms. Moreover, using the sheer computing power of machines it may also lead to improved upper bounds on search tree sizes (i.e., faster exact solving algorithms) in comparison with previously developed hand-made search trees. Among others, such an example is given with the NP-complete Cluster Editing problem (also known as Correlation Clustering on complete unweighted graphs), which asks for the minimum number of edge additions and deletions to create a graph which is a disjoint union of cliques. The hand-made search tree for Cluster Editing had worst-case size O(2.27^k), which now is improved to O(1.92^k) due to our new method. (Herein, k denotes the number of edge modifications allowed.)
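To make the idea of a search tree for Cluster Editing concrete, here is a minimal sketch of the simplest branching strategy: find three vertices u, v, w where u is adjacent to both v and w but v and w are not adjacent, then try each of the three edits that repair this conflict. This basic scheme has search tree size O(3^k); it is neither the hand-made O(2.27^k) tree nor the automatically generated O(1.92^k) one. The names `find_conflict` and `cluster_editing` and the edge representation are assumptions of the sketch.

```python
from itertools import combinations


def find_conflict(vertices, edges):
    """Return (u, v, w) with edges uv and uw present but vw missing, else None."""
    for u in vertices:
        nbrs = [v for v in vertices if v != u and frozenset((u, v)) in edges]
        for v, w in combinations(nbrs, 2):
            if frozenset((v, w)) not in edges:
                return u, v, w
    return None


def cluster_editing(vertices, edges, k):
    """Decide whether at most k edge additions/deletions turn the graph into
    a disjoint union of cliques. edges is a set of frozenset({a, b}) pairs."""
    conflict = find_conflict(vertices, edges)
    if conflict is None:
        return True   # no conflict triple left: every component is a clique
    if k == 0:
        return False  # a conflict remains but the edit budget is used up
    u, v, w = conflict
    uv, uw, vw = frozenset((u, v)), frozenset((u, w)), frozenset((v, w))
    # Any solution must delete uv, delete uw, or add vw: branch on all three.
    return (cluster_editing(vertices, edges - {uv}, k - 1)
            or cluster_editing(vertices, edges - {uw}, k - 1)
            or cluster_editing(vertices, edges | {vw}, k - 1))
```

For the path 1-2-3, the call with `k = 0` fails because vertex 2 sees two non-adjacent neighbours, while `k = 1` succeeds by deleting one edge or adding the edge between 1 and 3.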

7.
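吴海辉, 吴建国. 《微机发展》 (Microcomputer Development), 2004, 14(4): 18-21, 24
While developing a Chinese character input method, the authors encountered the problem of storing and retrieving strings. To address it, they propose an efficient optimized index tree based on an ordered binary tree, and give algorithms for building and searching it. Tree nodes are stored in a specific variable-length structure, and the index tree is kept in a logical byte array, which greatly reduces the number of child and sibling pointers and leaves no null pointers in the tree. The optimized index tree uses little storage and is very fast to search, making it well suited to storing encoding information.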

8.
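Research on Multi-Objective Optimization Problems Based on an Improved Genetic Algorithm   Total citations: 1, self-citations: 0, citations by others: 1
孔德剑. 《计算机仿真》 (Computer Simulation), 2012, 29(2): 213-215
This paper studies multi-objective optimization algorithms. Traditional multi-objective optimization algorithms have very high computational complexity and struggle to obtain satisfactory solutions. Building on graph theory and genetic algorithms, the paper proposes an improved genetic algorithm for multi-objective optimization. The minimum tree problem is first represented with a binary encoding; a depth-first search is then used to test graph connectivity; and a new fitness function is given to improve execution speed and evolutionary efficiency. Simulation results show that, compared with the classical Prim and Kruskal algorithms, the new algorithm has lower complexity and can obtain a batch of minimum spanning trees in the first round of genetic evolution, making it suitable for different types of multi-objective minimum tree problems.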

9.
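An improved tree matching algorithm is proposed. It refines the tree edit distance method by taking HTML characteristics into account: each HTML tree node is assigned a weight according to the weight of the data it displays in the browser. The algorithm builds node-weighted HTML trees from HTML data objects, and pattern recognition is achieved by obtaining the maximum mapping value between two constructed trees. Experiments on commercial websites confirm the effectiveness of the algorithm.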

10.
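An approximate algorithm for updating the shortest path tree of a moving target is proposed, which avoids regenerating the whole path tree. The algorithm uses the idea of a local graph so that each iteration updates as few nodes as possible, reducing the cost. Experiments show that the algorithm has good efficiency, approximation quality, and scalability. The paper also analyzes how to tune the algorithm to balance approximation quality against efficiency.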

11.
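The closest pair of points problem is an important problem in air traffic control systems, has applications in many fields, and is one of the basic problems studied in computational geometry. Using divide and conquer, the one-dimensional (line) and planar cases can be solved in O(n log n) time. Building on this, the paper implements an algorithm for the closest pair of points in space and analyzes its complexity.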
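The divide-and-conquer idea for the planar case can be sketched as follows. This is a generic textbook version, not the paper's implementation, and it does not cover the spatial (3D) extension. For brevity it re-sorts the middle strip at every level, which gives O(n log^2 n) rather than the O(n log n) bound achievable with a merge step. The names `closest_pair_distance` and `_solve` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def closest_pair_distance(points: List[Point]) -> float:
    """Return the smallest Euclidean distance between two of the points."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    return _solve(sorted(points))  # sort by x once, ties broken by y


def _solve(px: List[Point]) -> float:
    n = len(px)
    if n <= 3:  # small case: compare every pair directly
        return min(math.dist(px[i], px[j])
                   for i in range(n) for j in range(i + 1, n))
    mid = n // 2
    mid_x = px[mid][0]
    # Best distance found within the left half or within the right half.
    d = min(_solve(px[:mid]), _solve(px[mid:]))
    # Only points closer than d to the dividing line can form a better pair.
    strip = sorted((p for p in px if abs(p[0] - mid_x) < d),
                   key=lambda p: p[1])
    for i in range(len(strip)):
        j = i + 1
        # Stop as soon as the vertical gap alone reaches d.
        while j < len(strip) and strip[j][1] - strip[i][1] < d:
            d = min(d, math.dist(strip[i], strip[j]))
            j += 1
    return d
```

`math.dist` requires Python 3.8 or later.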

12.
Approximation Algorithms for Connected Dominating Sets   Total citations: 38, self-citations: 0, citations by others: 38
S. Guha  S. Khuller 《Algorithmica》1998,20(4):374-387
The dominating set problem in graphs asks for a minimum size subset of vertices with the following property: each vertex is required to be either in the dominating set, or adjacent to some vertex in the dominating set. We focus on the related question of finding a connected dominating set of minimum size, where the graph induced by vertices in the dominating set is required to be connected as well. This problem arises in network testing, as well as in wireless communication. Two polynomial time algorithms that achieve approximation factors of $2H(\Delta)+2$ and $H(\Delta)+2$ are presented, where $\Delta$ is the maximum degree and $H$ is the harmonic function. This question also arises in relation to the traveling tourist problem, where one is looking for the shortest tour such that each vertex is either visited or has at least one of its neighbors visited. We also consider a generalization of the problem to the weighted case, and give an algorithm with an approximation factor of $(c_n + 1) \ln n$, where $c_n \ln k$ is the approximation factor for the node weighted Steiner tree problem (currently $c_n = 1.6103$). We also consider the more general problem of finding a connected dominating set of a specified subset of vertices and provide a polynomial time algorithm with a $(c+1) H(\Delta) + c - 1$ approximation factor, where $c$ is the Steiner approximation ratio for graphs (currently $c = 1.644$). Received June 22, 1996; revised February 28, 1997.

13.
The guided tree edit distance problem is to find a minimum cost series of edit operations that transforms two input forests F and G into isomorphic forests F' and G' such that a third input forest H is included in F' (and G'). The edit operations are relabeling a vertex and deleting a vertex. We show efficient algorithms for this problem that are faster than the previous algorithm of Peng and Ting [Z. Peng, H. Ting, Guided forest edit distance: Better structure comparisons by using domain-knowledge, in: Proc. 18th Symposium on Combinatorial Pattern Matching (CPM), 2007, pp. 28-39].

14.
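An Evolutionary Optimization Algorithm with a Non-Parametric Penalty Function   Total citations: 5, self-citations: 0, citations by others: 5
周育人, 周继香, 王勇. 《计算机工程》 (Computer Engineering), 2005, 31(10): 31-33, 41
Constrained optimization problems are usually handled with penalty function methods, whose main difficulty lies in choosing the parameters. This paper proposes a non-parametric penalty function algorithm based on evolutionary algorithms. It penalizes constraint violations dynamically and uses the fitness assignment to balance the proportion of feasible and infeasible solutions in the population, so that the population approaches the optimum well. The new algorithm is implemented with a real-coded multi-parent simplex crossover evolution strategy. Tests on benchmark functions show that the algorithm is robust, efficient, concise, and easy to implement.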

15.
Tree Expressions for Information Systems   Total citations: 1, self-citations: 0, citations by others: 1
The discernibility matrix is one of the most important approaches to computing positive region, reduct, core and value reduct in rough sets. The subject of this paper is to develop a parallel approach to it, called "tree expression". Its computational complexity for positive region and reduct is O(m^2 × n) instead of O(m × n^2) in the discernibility-matrix-based approach, and is not over O(n^2) for other concepts in rough sets, where m and n are the numbers of attributes and objects respectively in a given dataset (also called an "information system" in rough sets). This approach suits information systems with n ≥ m and containing over one million objects.

16.
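张棪, 曹健. 《计算机科学》 (Computer Science), 2016, 43(Z6): 374-379, 383
As a predictive model in machine learning, the decision tree is widely used in many fields because its output is easy to understand and interpret, and it has become a focus of academic research. With data being produced ever faster, conventional decision tree algorithms cannot process large datasets because of limits such as memory capacity and processor speed, so implementations of decision tree algorithms need to be adapted accordingly. The paper first describes the basic decision tree algorithms and optimization methods. On that basis, and in light of the challenges posed by big data, it classifies and compares the strengths and weaknesses of the various adapted algorithms and introduces the platforms that support them. Finally, it discusses future directions for big-data-oriented decision tree algorithms.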

17.
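戴东波, 熊赟, 朱扬勇. 《软件学报》 (Journal of Software), 2010, 21(4): 718-731
Sequence data is ubiquitous in text, Web access logs, and biological databases, and similarity search over it is an important means of acquiring and analyzing knowledge. Reference-set-based indexing is an effective class of methods for sequence similarity search. The main idea is to choose a small number of sequences in the database as a reference set and use it to filter out data unrelated to the query sequence, so that queries are answered efficiently. Building on existing reference-set-based indexing, the paper proposes IRI (improved reference indexing), a sequence similarity query algorithm with stronger filtering power. First, it makes full use of previous query result sets to speed up the current query. Second, it considers upper and lower bounds based on sequence features, which tightens the bounds used for reference-set filtering and further strengthens filtering. Finally, to avoid time-consuming edit distance computations on the candidate set, it computes only the edit distance between prefix sequences, which speeds up the algorithm further. Experiments on real DNA and protein sequence data show that IRI clearly outperforms the existing reference-set-based indexing method RI (reference indexing) in query performance.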

18.
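Text retrieval in a network environment usually serves a large number of users at once. Traditional single-pattern matching algorithms cannot cope with a huge number of keywords, while typical Trie-based multi-pattern matching algorithms suffer from poor space complexity and complicated structures. For applications that search for large numbers of keywords, this paper obtains a simpler multi-pattern matching algorithm by modifying the structure of Trie nodes. The algorithm offers multi-pattern matching performance and efficient space utilization, and is very easy to implement.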
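For orientation, a plain Trie-based multi-pattern search can be sketched as below. The abstract does not describe the modified node structure, so this sketch uses ordinary nested dictionaries as nodes and simply restarts the walk at every text position; it illustrates the baseline idea only, not the paper's algorithm. The names `build_trie`, `find_keywords` and the `END` marker are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

END = None  # marker key under which a node stores the keyword ending there


def build_trie(keywords: Iterable[str]) -> Dict:
    """Build a dictionary-based trie holding all keywords."""
    root: Dict = {}
    for word in keywords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word
    return root


def find_keywords(text: str, trie: Dict) -> List[Tuple[int, str]]:
    """Return (start_index, keyword) for every keyword occurrence in text."""
    hits = []
    for start in range(len(text)):
        node = trie
        for pos in range(start, len(text)):
            node = node.get(text[pos])
            if node is None:  # no keyword continues with this character
                break
            if END in node:   # a complete keyword ends at this position
                hits.append((start, node[END]))
    return hits
```

Searching `"ushers"` for the keywords `he`, `she` and `hers` reports `she` at index 1 and both `he` and `hers` at index 2, all in a single pass over the shared trie.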

19.
We investigate in this paper the performance of parallel algorithms for computing the controllable part of a control linear system, with application to the computation of minimal realizations. Our approach is based on a method that transforms the matrices of the system to block Hessenberg form by using rank-revealing orthogonal factorizations. The experimental analysis on a high performance architecture includes two rank-revealing numerical tools: the SVD and the rank-revealing QR factorizations. Results are also reported, using the rank-revealing QR factorizations, on a parallel distributed architecture.

20.
The convex differences tree (CDT) representation of a simple polygon is useful in computer graphics, computer vision, computer aided design and robotics. The root of the tree contains the convex hull of the polygon and there is a child node recursively representing every connectivity component of the set difference between the convex hull and the polygon. We give an O(n log K + K log^2 n) time algorithm for constructing the CDT, where n is the number of polygon vertices and K is the number of nodes in the CDT. The algorithm is adaptive to a complexity measure defined on its output while still being worst case efficient. For simply shaped polygons, where K is a constant, the algorithm is linear. In the worst case K = O(n) and the complexity is O(n log^2 n). We also give an O(n log n) algorithm which is an application of the recently introduced compact interval tree data structure.
