共查询到20条相似文献,搜索用时 0 毫秒
1.
2.
3.
4.
We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE) and a variant of Lempel-Ziv (LZ77), and present sublinear algorithms for approximating compressibility with respect to both schemes. We also give several lower bounds that show that our algorithms for both schemes cannot be improved significantly. Our investigation of LZ77 yields results whose interest goes beyond the initial questions we set out to study. In particular, we prove combinatorial structural lemmas that relate the compressibility of a string with respect to LZ77 to the number of distinct short substrings contained in it (its ?th subword complexity , for small ?). In addition, we show that approximating the compressibility with respect to LZ77 is related to approximating the support size of a distribution. 相似文献
5.
6.
Abstract. In this paper we propose a dynamic programming algorithm to compare two quotiented trees using a constrained edit distance.
A quotiented tree is a tree defined with an additional equivalent relation on vertices and such that the quotient graph is
also a tree. The core of the method relies on an adaptation of an algorithm recently proposed by Zhang for comparing unordered
rooted trees. This method is currently being used in plant architecture modelling to quantify different types of variability
between plants represented by quotiented trees. 相似文献
7.
In this paper we propose a dynamic programming algorithm to compare two quotiented trees using a constrained edit distance. A quotiented tree is a tree defined with an additional equivalent relation on vertices and such that the quotient graph is also a tree. The core of the method relies on an adaptation of an algorithm recently proposed by Zhang for comparing unordered rooted trees. This method is currently being used in plant architecture modelling to quantify different types of variability between plants represented by quotiented trees. 相似文献
8.
两字符串的编辑距离是从一个串转换到另一个串所需要的最少基本操作数。编辑距离广泛应用于字符串近似匹配、字符串相似连接等领域。动态规划法利用编辑距离矩阵来计算两个串的编辑距离,需要计算矩阵中的所有元素,时间效率低。改进的方法改变了矩阵中元素的计算次序,减少了需要比对的元素,但仍需要比对一半以上的元素,时间效率还有待提高。提出基于基本操作序列的编辑距离顺序验证方法。首先,分析了基本操作序列的可列性,给出了列举基本操作序列的方法。然后依次顺序验证基本操作数从小到大的基本操作序列直到某一序列通过验证,得到其编辑距离。在阈值为2的字符串近似搜索实验中发现,所提方法比动态规划类方法具有更高的效率。 相似文献
9.
支持块编辑距离的索引结构 总被引:1,自引:0,他引:1
在近似字符串匹配中,传统的编辑距离不能很好地衡量诸如人名、地址等数据的相似关系,而块编辑距离可以很好地衡量两个字符串的相似性.如何有效地支持块编辑距离,进行近似字符串查询处理具有重要的意义.计算两个字符串的块编辑距离是一个NP完全问题,因此希望提供有效的方法可以增强过滤能力,并减少假通过率.设计了一种支持移动编辑距离的新颖的索引结构SHV-Trie,通过研究移动编辑距离的操作特性,使用字母出现的频率作为支持移动编辑距离操作的一个下界,并且提出相应的查询过滤算法,同时,针对索引SHV-Trie的空间开销过大的问题,提出一种优化字母排列的索引结构和一种压缩的索引结构及相关查询过滤算法.真实数据集上的实验结果与分析显示了所提出的索引结构具有良好的过滤能力,并通过减少效率假通过率提高查询的效率. 相似文献
10.
秦鸿霞 《数字社区&智能家居》2007,(15)
主要探讨了用Visual FoxPro设计编辑框中文本的查找与替换方法.首先在进行查找与替换表单界面的设计及相关属性设置的基础上,阐明了查找与替换的功能目标;然后分析了相应对象相关事件的基本过程,并提供了相应的Visual FoxPro程序代码. 相似文献
11.
秦鸿霞 《数字社区&智能家居》2007,3(8):767-768,783
主要探讨了用Visual FoxPro设计编辑框中文本的查找与替换方法。首先在进行查找与替换表单界面的设计及相关属性设置的基础上,阐明了查找与替换的功能目标;然后分析了相应对象相关事件的基本过程,并提供了相应的Visual FoxPro程序代码。 相似文献
12.
Given an undirected connected graph G we consider the problem of finding a spanning tree of G which has a maximum number of internal (non-leaf) vertices among all spanning trees of G. This problem, called Maximum Internal Spanning Tree problem, is clearly NP-hard since it is a generalization of the Hamiltonian Path problem. From the optimization point of view the Maximum Internal Spanning Tree problem is equivalent to the Minimum Leaf Spanning Tree problem. However, the two problems have different approximability properties. Lu and Ravi proved that the latter has no constant factor approximation–unless P = NP–, while Salamon and Wiener gave a linear-time 2-approximation algorithm for the Maximum Internal Spanning Tree problem. 相似文献
13.
14.
基于最小编辑距离的维语词语检错与纠错研究 总被引:2,自引:1,他引:2
拼写错误的发现和候选词选取是文本分析中的一个重要的技术问题。本文结合维吾尔语的语音和词语结构特点,列出了文本中常见的拼写错误类型,详细分析了解决方法,利用最小编辑距离(minimum edit distance)算法实现了维吾尔语文本拼写错误分析中的查错和纠错功能,并以此为基础,结合维吾尔语构词规则,进一步提高了建议候选词的准确率和速度。该算法已被成功地应用到了维吾尔语文字自动校对和多文种文本检索等领域中。在以新疆高校学报为语料的测试中,词语查纠率达到 85%以上。 相似文献
15.
16.
To analyze or simulate an operating/computer system, one must construct a model of the programs executing within the system. Many recent analyses, particularly queueing models, have used mathematically convenient program models. For example, in one popular model the times between page faults are assumed to be exponentially distributed and independent. However, such a program model is inaccurate, which may cause the analysis to be unrepresentative of the real world. Simulation models of systems, on the other hand, have relied largely on traces of actual programs. Such traces are undoubtedly more accurate than simple mathematical models, but they have several drawbacks. They are expensive to generate, they may not be truly representative of typical programs, and they may contain more detail than is necessary for accurate system modeling. Moreover, it is difficult to extrapolate their behavior to other similar programs. 相似文献
17.
We consider the problem of finding a minimum diameter spanning tree
with maximum node degree $B$ in a complete undirected edge-weighted
graph. We provide an $O(\sqrt{\log_Bn})$-approximation algorithm for the
problem. Our algorithm is purely combinatorial, and relies on a
combination of filtering and divide and conquer. 相似文献
18.
We consider the problem of finding a minimum diameter spanning tree
with maximum node degree $B$ in a complete undirected edge-weighted
graph. We provide an $O(\sqrt{\log_Bn})$-approximation algorithm for the
problem. Our algorithm is purely combinatorial, and relies on a
combination of filtering and divide and conquer. 相似文献
19.