共查询到20条相似文献,搜索用时 15 毫秒
1.
New Techniques for Regular Expression Searching 总被引:1,自引:0,他引:1
We present two new techniques for regular expression searching and use them to
derive faster practical algorithms.
Based on the specific properties of Glushkovs nondeterministic finite automaton
construction algorithm, we show how to encode a deterministic finite
automaton (DFA) using O(m2m) bits, where m is the number of characters,
excluding operator symbols, in the regular expression. This compares favorably
against the worst case of O(m2m||) bits needed by a classical DFA
representation (where is the alphabet) and O(m22m) bits needed
by the Wu and Manber approach implemented in Agrep.
We also present a new way to search for regular expressions, which is
able to skip text characters. The idea is to determine the minimum
length of a string matching the regular expression,
manipulate the original automaton so that it recognizes all the
reverse prefixes of length up to of the strings originally accepted,
and use it to skip text characters as done for exact string matching
in previous work.
We combine these techniques into two algorithms, one able and one unable to
skip text characters. The algorithms are simple to implement, and our
experiments show that they permit fast searching for regular expressions,
normally faster than any existing algorithm. 相似文献
2.
当前深度包检测算法通常需要将正则表达式转换为NFA或者DFA.但是随着网络带宽的不断增加.NFA和DFA状态占用的存储空间越来越大,极大地考验着系统的存储能力。为了应对这个问题.提出一种基于正则表达式相性的分组算法来对表达式进行分组,实验证明该算法能减少NFA和DFA状态的数量,提高匹配的效率。 相似文献
3.
正则表达式是数据验证技术中功能最为强大的输入控制技术。传统的基于NFA的正则表达式引擎的匹配速度低。通过正则表达式与自动机等价的原理,研究了通过最小化的确定的有限自动机(DFA)来等价实现.NET中正则表达式的数据验证的机制,以期提高正则表达式的匹配速度。 相似文献
4.
MIAO Yu 《数字社区&智能家居》2008,(20)
本文主要介绍基于编译器构造技术中的由正规表达式到最小化DFA的算法设计和实现技术,以及自动机转换正规式的方法。正规式与自动机理论以不同方式表达相同语言,两者相互转换在编译器构造过程中起至关重要的作用,也被广泛应用于计算机科学的各个领域。 相似文献
5.
自动机理论是编译程序中单词识别的基本理论。论文分析了自动机与正规表达式等价性定理,指出了从确定有限自动机到正规表达式重构规则中存在的问题,给出了一个包含多个结点所组成回路的有限自动机到正规表达式的重构定理,并通过实例对于该定理所阐明的方法的运用进行了详细的讨论。 相似文献
6.
A sensing strategy for the reverse engineering of machined parts 总被引:1,自引:0,他引:1
The reverse engineering of machined parts requires sensing an existing part and producing a design (and perhaps a manufacturing process) for it. We have developed a reverse engineering system that has proven effective with a set of machined parts. This paper describes the system, presents some results, and discusses strategy for a new system.This work was supported by ARPA under ARO grant number DAAH04-93-G-0420, DARPA grant N00014-91-J-4123, NSF grant CDA 9024721, and a University of Utah Research Committee grant. All opinions, findings, conclusions or recommendations expressed in this document are those of the authors and do not necessarily reflect the views of the sponsoring agencies. 相似文献
7.
基于幂图的属性约简搜索式算法 总被引:7,自引:0,他引:7
粗糙集理论是一种新的处理不精确、不完全与不一致数据的数学工具.属性约简是粗糙集理论的重要研究内容之一,已有的属性约简算法主要是基于代数表示与信息表示的方法.同一问题在不同的知识表示下,其求解难度是不同的.文中从改变属性约简问题的知识表示人手,提出了该问题的一种新的表示方式--幂图;给出了基于幂图的属性约简搜索式算法,把属性约简计算问题转化为在幂图中的搜索问题.理论分析表明新算法是有效的,为属性约简研究提供了一条新的途径. 相似文献
8.
Predicting the fold, or approximate 3D structure, of a protein from its amino acid sequence is an important problem in biology. The homology modeling approach uses a protein database to identify fold-class relationships by sequence similarity. The main limitation of this method is that some proteins with similar structures appear to have very different sequences, which we call the hidden-homology problem. As in other real-world domains for machine learning, this difficulty may be caused by a low-level representation. Learning in such domains can be improved by using domain knowledge to search for representations that better match the inductive bias of a preferred algorithm. In this domain, knowledge of amino acid properties can be used to construct higher-level representations of protein sequences. In one experiment using a 179-protein data set, the accuracy of fold-class prediction was increased from 77.7% to 81.0%. The search results are analyzed to refine the grouping of small residues suggested by Dayhoff. Finally, an extension to the representation incorporates sequential context directly into the representation, which can express finer relationships among the amino acids. The methods developed in this domain are generalized into a framework that suggests several systematic roles for domain knowledge in machine learning. Knowledge may define both a space of alternative representations, as well as a strategy for searching this space. The search results may be summarized to extract feedback for revising the domain knowledge. 相似文献
9.
A new high spectral accuracy compact difference scheme is proposed here. This has been obtained by constrained optimization of error in spectral space for discretizing first derivative for problems with non-periodic boundary condition. This produces a scheme with the highest spectral accuracy among all known compact schemes, although this is formally only second-order accurate. Solution of Navier-Stokes equation for incompressible flows are reported here using this scheme to solve two fluid flow instability problems that are difficult to solve using explicit schemes. The first problem investigates the effect of wind-shear past bluff-body and the second problem involves predicting a vortex-induced instability. 相似文献
10.
11.
12.
M. J. Jordan 《Software》1978,8(3):285-300
A list processor, SLP, is presented as a means for manipulating data structures represented as compact lists. The list space is both paged and segmented in order to handle large data structures on a conventional minicomputer. The list processing primitives arc provided as procedure calls embedded in a high-level language, and the system includes aids for developing user programs. The implementation for a PDP 11/45 computer is presented along with a discussion of the use and performance of the system in practice. 相似文献
13.
本文提出了一种用于对正则表达式的覆盖能力进行评价的算法。我们将一条正则表达式可覆盖的实例的数目定义为正则表达式的覆盖能力。算法首先将完整的正则表达式分成若干片断,然后分析每个片断可覆盖的字符串实例数目,最后根据乘法原理将各个片断可覆盖的实例数目相乘,即为当前正则表达式可覆盖的实例数目。 相似文献
14.
本文提出了一种用于对正则表达式的覆盖能力进行评价的算法.我们将一条正则表达式可覆盖的实例的数目定义为正则表达式的覆盖能力.算法首先将完整的正则表达式分成若干片断,然后分析每个片断可覆盖的字符串实例数目,最后根据乘法原理将各个片断可覆盖的实例数目相乘,即为当前正则表达式可覆盖的实例数目. 相似文献
15.
Feature-Based Methods for Large Scale Dynamic Programming 总被引:5,自引:0,他引:5
We develop a methodological framework and present a few different ways in which dynamic programming and compact representations can be combined to solve large scale stochastic control problems. In particular, we develop algorithms that employ two types of feature-based compact representations; that is, representations that involve feature extraction and a relatively simple approximation architecture. We prove the convergence of these algorithms and provide bounds on the approximation error. As an example, one of these algorithms is used to generate a strategy for the game of Tetris. Furthermore, we provide a counter-example illustrating the difficulties of integrating compact representations with dynamic programming, which exemplifies the shortcomings of certain simple approaches. 相似文献
16.
一种新的一维无格子超车跟车仿真模型 总被引:3,自引:6,他引:3
近年来,人们对超车和跟车进行了广泛的研究。但以往方法尚有其不足之处。主要体现在,不能统一处理超车和跟车模型以及不能较好地体现司机的行为特性,另外不能通过简洁的计算同时给出任意时刻车辆在道路的纵向和横向上的位置。为此,作者在其首创的固支梁挠度曲线超车跟车模型基础上,提出了一种新的一维无格子超车跟车模型。它根据前方车辆的情况及道路宽度的变化情况直接确定当前车辆在道路宽度方向上的位置。新模型能同时适用于道路宽度变化和前方车辆速度变化的情形。 相似文献
17.
编译基础设施中多目标编译技术探讨 总被引:3,自引:0,他引:3
从编译基础设施的基本概念出发,着重讨论了编译器后端构造所涉及的关键技术;比较全面地总结并评述了具有代表性的公共编译设施及春采用的中间表示技术、后端构造技术和相关工具;并探讨了编译器后端构造研究中存在的一些问题及相应的解决方案。 相似文献
18.
Interval routing (IR) is a space-efficient routing method for computer networks. For longest routing path analysis, researchers have focused on lower bounds for many years. For any n-node graph G of diameter D, there exists an upper bound of 2D for IR using one or more labels, and an upper bound of
for IR using
or more labels. We present two upper bounds in the first part of the paper. We show that for every integer i>0, every n-node graph of diameter D has a k-dominating set of size
for
. This result implies a new upper bound of
for IR using
or more labels, where i is any positive integer constant. We apply the result by Kutten and Peleg [8] to achieve an upper bound of (1+)D for IR using O(n/D) or more labels, where is any constant in (0,1). The second part of the paper offers some lower bounds for planar graphs. For any M-label interval routing scheme (M-IRS), where
, we derive a lower bound of [(2M+1)/(2M)]D−1 on the longest path for
, and a lower bound of [(2(1+δ)M+1)/(2(1+δ)M)]D, where δ(0,1], for
. The latter result implies a lower bound of
on the number of labels needed to achieve optimality. 相似文献
19.
LinkNet:一种用于大规模P2P系统查找的新方法 总被引:2,自引:0,他引:2
提出了一种新的可扩展分布式数据结构LinkNet来支持大规模P2P系统中的数据查找.在LinkNet中,所有的元素存储在一个有序的双向链表中,该链表中的每个结点都可以存储多个元素.LinkNet使用虚拟链接来减少存储开销和加速查找过程.在一个包含N个结点M个元素的网络中,LinkNet占用的存储空间期望值为O(M),并且当M足够大时,查找操作期望只需要传递O(logN)条消息. 相似文献
20.
A general computational method for the accurate calculation of rotationally and vibrationally excited states of tetraatomic molecules is developed. The resulting program is particularly appropriate for molecules executing wide-amplitude motions and isomerizations. The program offers a choice of coordinate systems based on Radau, Jacobi, diatom-diatom and orthogonal satellite vectors. The method includes all six vibrational dimensions plus three rotational dimensions. Vibration-rotation calculations with reduced dimensionality in the radial degrees of freedom are easily tackled via constraints imposed on the radial coordinates via the input file.