Found 20 similar documents (search time: 15 ms)
1.
Eberhard Bertsch 《Acta Informatica》1996,33(7):631-639
We give a simple grammatical characterization of the notion of suffix redundancy presented procedurally in a recent article by Fischer and Mauney. It is shown that certain properties of grammars related to suffix redundancy can be decided at parser generation time. Some suggestions concerning efficiency of repair attempts conclude the article.
2.
3.
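Building on an analysis of typical spatial association rule algorithms, and addressing their rule redundancy and poor mining scalability, this paper proposes a spatial association rule mining algorithm based on meta-rules and the FP-growth tree. The algorithm does not generate candidate sets; instead it uses the least frequent items as suffixes, which reduces the number of database scans and greatly lowers search overhead. In addition, the meta-rule constraints provide good selectivity and reduce rule redundancy. Finally, taking the association relationships among spatial factors of soil erosion as an example, the paper verifies the effectiveness of the algorithm and compares it with typical algorithms; the proposed algorithm outperforms them in both time performance and space scalability.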
4.
q-gram matching is used for approximate substring matching problems in a wide range of application areas, including intrusion detection. In this paper, we present a tree-based model to perform fast linear-time q-gram matching. All q-grams present in the text are stored in a tree structure similar to a trie. We use a tree redundancy pruning algorithm to reduce the size of the tree without losing any information. We also use suffix links for fast q-gram search during query matching. We compare our work with the Rabin-Karp-based hash-table technique commonly used for multiple q-gram search. We present results of experiments on system call sequence data used for intrusion detection.
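The idea of storing every q-gram of a text in a trie-like tree can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: it omits the redundancy pruning and suffix links mentioned in the abstract, and the function names (`build_qgram_trie`, `contains_qgram`, `unmatched_qgrams`) are hypothetical.

```python
def build_qgram_trie(text, q):
    """Insert every length-q window of `text` into a nested-dict trie."""
    root = {}
    for i in range(len(text) - q + 1):
        node = root
        for symbol in text[i:i + q]:
            # Create the child node on first sight, then descend into it.
            node = node.setdefault(symbol, {})
    return root


def contains_qgram(trie, gram):
    """Return True if `gram` is a path from the root of the trie."""
    node = trie
    for symbol in gram:
        if symbol not in node:
            return False
        node = node[symbol]
    return True


def unmatched_qgrams(trie, query, q):
    """List the q-grams of `query` that never occur in the indexed text."""
    grams = (query[i:i + q] for i in range(len(query) - q + 1))
    return [gram for gram in grams if not contains_qgram(trie, gram)]


trie = build_qgram_trie("abcabd", 3)
print(unmatched_qgrams(trie, "abcd", 3))
```

Because the functions only slice and iterate, the same sketch accepts a list of system call names in place of a string, in which case each unmatched q-gram is returned as a list.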
5.
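Expression evaluation is one of the most basic problems in programming language compilation. Compared with the infix notation people are used to, a postfix expression contains no parentheses and has no precedence differences; its operations are performed in the order in which the operators appear. It is therefore well suited to the serial processing of computers. This paper first analyzes and compares the two notations, then argues that postfix notation is superior to infix notation through a concrete analysis of the evaluation algorithms for each. Finally, it briefly discusses the conversion from infix to postfix expressions.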
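Postfix evaluation and infix-to-postfix conversion can be sketched with two small stack-based routines. This is an illustrative sketch, not the paper's code: it assumes pre-split tokens, four left-associative binary operators, and numeric operands, and the names `OPS`, `infix_to_postfix`, and `eval_postfix` are hypothetical. The conversion follows the well-known shunting-yard approach.

```python
import operator

# Assumed operator table: symbol -> (precedence, function). All left-associative.
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}


def infix_to_postfix(tokens):
    """Convert a list of infix tokens to postfix order (shunting-yard)."""
    output, stack = [], []
    for tok in tokens:
        if tok in OPS:
            # Pop operators of higher or equal precedence (left-associativity).
            while stack and stack[-1] in OPS and OPS[stack[-1]][0] >= OPS[tok][0]:
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            # Flush operators back to the matching "(" and discard it.
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()
        else:
            output.append(tok)  # operand
    while stack:
        output.append(stack.pop())
    return output


def eval_postfix(tokens):
    """Evaluate postfix tokens left to right with a single operand stack."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok][1](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()


postfix = infix_to_postfix("3 + 4 * ( 2 - 1 )".split())
print(postfix, eval_postfix(postfix))
```

Note that `eval_postfix` needs neither parentheses nor a precedence table lookup for ordering: each operator applies to the two most recent operands, which is the property the abstract highlights.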
6.
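Data cube computation is a very expensive operation and has been widely studied. Because of space limits, storing a fully materialized data cube is infeasible. Dwarf, a recently proposed semantically compressed data cube, compresses a fully materialized data cube into a very small space by eliminating prefix redundancy and suffix redundancy. However, when the data source changes, its update process is very complex. By studying how aggregate nodes change during Dwarf updates, this paper proposes a new Dwarf-based incremental update algorithm that fully materializes the data cube without recomputation, greatly improving the efficiency of data cube updates. Experiments further demonstrate the efficiency and effectiveness of the algorithm, which is especially suited to high-dimensional data sets in data warehouses.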
7.
Moritz G. Maaß 《Information Processing Letters》2007,101(6):250-254
We present a new and simple algorithm to reconstruct suffix links in suffix trees and suffix arrays. The algorithm is based on observations regarding suffix tree construction algorithms. With our algorithm we bring suffix arrays even closer to the ease of use and implementation of suffix trees.
8.
Suffix trees and suffix arrays are fundamental full-text index data structures to solve problems occurring in string processing. Since suffix trees and suffix arrays have different capabilities, some problems are solved more efficiently using suffix trees and others are solved more efficiently using suffix arrays. We consider efficient index data structures with the capabilities of both suffix trees and suffix arrays without requiring much space. When the size of an alphabet is small, enhanced suffix arrays are such index data structures. However, when the size of an alphabet is large, enhanced suffix arrays lose the power of suffix trees. Pattern searching in an enhanced suffix array takes O(m|Σ|) time while pattern searching in a suffix tree takes O(m log |Σ|) time, where m is the length of a pattern and Σ is an alphabet.
In this paper, we present linearized suffix trees, which are efficient index data structures with the capabilities of both suffix trees and suffix arrays even when the size of an alphabet is large. A linearized suffix tree has all the functionalities of the enhanced suffix array and supports pattern search in O(m log |Σ|) time. From a different point of view, it can be considered a practical implementation of the suffix tree supporting O(m log |Σ|)-time pattern search.
In addition, we also present two efficient algorithms for computing suffix links on the enhanced suffix array and the linearized suffix tree. These are the first algorithms that run in O(n) time without using the range minima query. Our experimental results show that our algorithms are faster than the previous algorithms.
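For readers unfamiliar with suffix arrays, pattern search on a plain suffix array can be sketched as below. This is a deliberately naive baseline under stated assumptions, not the enhanced suffix array or the linearized suffix tree from the abstract: construction sorts the suffixes directly, and lookup uses two binary searches costing O(m log n) character comparisons. The names `build_suffix_array` and `find_occurrences` are hypothetical.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes in lexicographic order.

    Naive construction by direct sorting; fine for illustration only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions where `pattern` occurs in `text`."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with the pattern.
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(sa, find_occurrences(text, sa, "ana"))
```

The binary searches work because truncating lexicographically sorted suffixes to their first m characters keeps them in non-decreasing order, so all matches form one contiguous block of the array.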
9.
Joubert de Castro Lima, Celso Massaki Hirata 《Information Sciences》2011,181(13):2626-2655
We present a new full cube computation technique and a cube storage representation approach, called the multidimensional cyclic graph (MCG) approach. The data cube relational operator has exponential complexity and therefore its materialization involves both a huge amount of memory and a substantial amount of time. Reducing the size of data cubes, without a loss of generality, thus becomes a fundamental problem. Previous approaches, such as Dwarf, Star and MDAG, have substantially reduced the cube size using graph representations. In general, they eliminate prefix redundancy and some suffix redundancy from a data cube. The MCG differs significantly from previous approaches as it completely eliminates prefix and suffix redundancies from a data cube. A data cube can be viewed as a set of sub-graphs. In general, redundant sub-graphs are quite common in a data cube, but eliminating them is a hard problem. Dwarf, Star and MDAG approaches only eliminate some specific common sub-graphs. The MCG approach efficiently eliminates all common sub-graphs from the entire cube, based on an exact sub-graph matching solution. We propose a matching function to guarantee one-to-one mapping between sub-graphs. The function is computed incrementally, in a top-down fashion, and its computation uses a minimal amount of information to generate unique results. In addition, it is computed for any measurement type: distributive, algebraic or holistic. MCG performance analysis demonstrates that MCG is 20-40% faster than Dwarf, Star and MDAG approaches when computing sparse data cubes. Dense data cubes have a small number of aggregations, so there is not enough room for runtime and memory consumption optimization; therefore the MCG approach is not useful in computing such dense cubes. The compact representation of sparse data cubes enables the MCG approach to reduce memory consumption by 70-90% when compared to the original Star approach, proposed in [33]. In the same scenarios, the improved Star approach, proposed in [34], reduces memory consumption by only 10-30%, Dwarf by 30-50% and MDAG by 40-60%, when compared to the original Star approach. The MCG is the first approach that uses an exact sub-graph matching function to reduce cube size, avoiding unnecessary aggregation, i.e. improving cube computation runtime.
10.
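Many important industrial applications require long-term, continuous, safe and reliable control. Redundant control is one solution. In the SIEMENS S7 series of PLCs there are two redundancy solutions: software redundancy for the S7-300 and hardware redundancy for the S7. By studying the concepts, working principles, hardware configuration and overall performance of the two kinds of redundancy, the paper describes the industrial settings to which PLC software-redundant and hardware-redundant control systems are suited.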
11.
12.
Homayoun Seraji 《Journal of Field Robotics》1992,9(3):411-451
This article establishes new goals for redundancy resolution based on manipulator dynamics and end-effector characteristics. These goals can be accomplished by employing the recently developed configuration control approach. Redundancy resolution is achieved by controlling the joint inertia matrix or the end-effector mass matrix that affect the inertial torques or by reducing the joint torques due to gravity loading and payload. The manipulator mechanical advantage and velocity ratio are also used as performance measures to be improved by proper utilization of redundancy. Furthermore, end-effector compliance, sensitivity, and impulsive force at impact are introduced as redundancy-resolution criteria. The new goals for redundancy resolution presented in this article allow a more efficient utilization of the redundant joints based on the desired task requirements. Simple case studies using computer simulations are described for illustration.
13.
Kunihiko Sadakane 《Theory of Computing Systems》2007,41(4):589-607
We introduce new data structures for compressed suffix trees whose sizes are linear in the text size. The size is measured in bits; thus they occupy only O(n log |A|) bits for a text of length n on an alphabet A. This is a remarkable improvement on current suffix trees, which require O(n log n) bits. Though some components of suffix trees have been compressed, there is no linear-size data structure for suffix trees with full functionality such as computing suffix links, string-depths and lowest common ancestors. The data structure proposed in this paper is the first one that has linear size and supports all operations efficiently. Any algorithm running on a suffix tree can also be executed on our compressed suffix trees with a slight slowdown of a factor of polylog(n).
14.
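Parallel Construction Algorithms for Suffix Trees (total citations: 1; self-citations: 0; citations by others: 1)
The suffix tree is a very important data structure with wide applications in many fields related to string processing. Constructing the suffix tree is the prerequisite and key to applying it. Although many existing construction algorithms run in linear time and space, when the indexed string is very long, the time and space consumed by constructing its suffix tree are still enormous, which greatly limits the practical use of suffix trees. Parallel techniques are a good way to address this problem, and parallel suffix tree construction algorithms have therefore been proposed. This paper surveys three parallel construction algorithms for suffix trees, summarizes the open problems through systematic comparison and analysis, and points out directions for future research.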
15.
16.
Agents offer a convenient level of granularity at which to add redundancy, a key factor in developing robust software. Blindly adding code introduces more errors, makes the system more complex, and renders it harder to understand. However, adding more code can make software better, if it is added in the right way. As this article describes, the key concepts appear to be redundancy and the appropriate granularity.
17.
Finding motifs in biological sequences is one of the most intriguing problems for string algorithm designers due to, on the one hand, the numerous applications of this problem in molecular biology and, on the other hand, the challenging aspects of the computational problem. Indeed, when dealing with biological sequences it is necessary to work with approximations (that is, to identify fragments that are not necessarily identical, but just similar, according to a given similarity notion), and this complicates the problem. Existing algorithms run in time linear with respect to the input size. Nevertheless, the output size can be very large due to the approximation (namely exponential in the approximation degree). This often makes the output unreadable, as well as slowing down the inference itself. A high degree of redundancy has been detected in the set of motifs that satisfy traditional requirements, even for exact motifs. Moreover, it has been observed many times that only a subset of these motifs, namely the maximal motifs, could be enough to provide the information of all of them. In this paper, we aim at removing such redundancy. We extend some notions of maximality already defined for exact motifs to the case of approximate motifs with Hamming distance, and we give a characterization of maximal motifs on the suffix tree. Given that this data structure is used by a whole class of motif extraction tools, we show how these tools can be modified to include the maximality requirement without changing the asymptotic complexity.
18.
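Existing sliding-window compression algorithms based on suffix arrays must rebuild the suffix array after every window slide, which hurts efficiency. After analyzing the characteristics of suffix arrays under a sliding window, this paper proposes a new method for building the suffix array so that only part of it needs to be constructed while the compression algorithm runs, making the overall compression algorithm more efficient without degrading compression performance. Experiments verify the effectiveness of the proposed algorithm.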
20.
A suffix tree is a fundamental data structure for string searching algorithms. Unfortunately, when it comes to the use of suffix trees in real-life applications, the current methods for constructing suffix trees do not scale for large inputs. As suffix trees are larger than the input sequences and quickly outgrow the main memory, the first attempts at building large suffix trees focused on algorithms which avoid massive random access to the trees being built. However, all the existing practical algorithms perform random access to the input string, thus requiring in essence that the input be small enough to be kept in main memory. The constantly growing pool of string data, especially biological sequences, requires us to build suffix trees for much larger strings.