共查询到20条相似文献,搜索用时 0 毫秒
1.
提出了一种利用改进的k-contextual树自动机推理算法的信息抽取技术。其核心思想是将结构化(半结构化)文档转换成树,然后利用一种改进的k-contextual树(KLH树)来构造出能够接受样本的无秩树自动机,依据该自动机接收和拒绝状态来确定是否抽取网页信息。该方法充分利用了网页文档的树状结构,依托树自动机将传统的以单一结构途径的信息抽取方法与文法推理原则相结合,得到信息抽取规则。实验证明,该方法与同类抽取方法相比,样本学习时间以及抽取所需时间上均有所缩短。 相似文献
2.
About malicious software in smartphones 总被引:1,自引:0,他引:1
Phones with some of the capabilities of modern computers also have the same kind of drawbacks. These phones are commonly referred to as smartphones. They have both phone and personal digital assistant (PDA) functionality. Typical to these devices is to have a wide selection of different connectivity options from general packet radio service (GPRS) data transfer to multi media messages (MMS) and wireless local area network (WLAN) capabilities. They also have standardized operating systems, which makes smartphones a viable platform for malware writers. Since the design of the operating systems is recent, many common security holes and vulnerabilities have been taken into account during the design. However, these precautions have not fully protected these devices. Even now, when smartphones are not that common, there is a handful of viruses for them. In this paper we will discuss some of the most typical viruses in the mobile environment and propose guidelines and predictions for the future. 相似文献
3.
Inène Guessarian 《Theory of Computing Systems》1983,16(1):237-263
We define topdown pushdown tree automata (PDTA's) which extend the usual string pushdown automata by allowing trees instead of strings in both the input and the stack. We prove that PDTA's recognize the class of context-free tree languages. (Quasi)realtime and deterministic PDTA's accept the classes of Greibach and deterministic tree languages, respectively. Finally, PDTA's are shown to be equivalent to restricted PDTA's, whose stack is linear: this both yields a more operational way of recognizing context-free tree languages and connects them with the class of indexed languages. 相似文献
4.
Most work on pattern mining focuses on simple data structures such as itemsets and sequences of itemsets. However, a lot of recent applications dealing with complex data like chemical compounds, protein structures, XML and Web log databases and social networks, require much more sophisticated data structures such as trees and graphs. In these contexts, interesting patterns involve not only frequent object values (labels) appearing in the graphs (or trees) but also frequent specific topologies found in these structures. Recently, several techniques for tree and graph mining have been proposed in the literature. In this paper, we focus on constraint-based tree pattern mining. We propose to use tree automata as a mechanism to specify user constraints over tree patterns. We present the algorithm CoBMiner which allows user constraints specified by a tree automata to be incorporated in the mining process. An extensive set of experiments executed over synthetic and real data (XML documents and Web usage logs) allows us to conclude that incorporating constraints during the mining process is far more effective than filtering the interesting patterns after the mining process. 相似文献
5.
针对恶意软件通过加密函数规避安全检测和流量分析这一问题,提出了一种识别恶意软件中加密函数的方法。通过识别恶意软件动态执行路径中的循环、循环的输入和输出参数,构建恶意软件的动态循环数据流图,通过循环数据流图提取循环的输入和输出参数集合,设计已知加密函数的参考实现对循环输入集合中的元素进行运算,判断输出是否能够匹配输出集合中的元素从而识别恶意软件中的加密函数。实验证明此分析方法能够分析严重混淆的恶意软件其传输载荷所采用的加密函数。 相似文献
6.
为了解决传统攻击树模型在恶意代码检测中存在行为差异性描述不准确、危害量化不合理的问题,提出一种将攻击树结构进行改造、构建攻击树文本图的改进攻击树检测方法,并设计了危害权值算法,从而可以更好地描述和判断恶意代码的攻击行为,引入云检测技术构建检测系统对算法进行验证.实验结果表明,该算法较传统算法对恶意代码及其变种的检测有明显的提高. 相似文献
7.
We introduce the class of rigid tree automata (RTA), an extension of standard bottom-up automata on ranked trees with distinguished states called rigid. Rigid states define a restriction on the computation of RTA on trees: RTA can test for equality in subtrees reaching the same rigid state. RTA are able to perform local and global tests of equality between subtrees, non-linear tree pattern matching, and some inequality and disequality tests as well. Properties like determinism, pumping lemma, Boolean closure, and several decision problems are studied in detail. In particular, the emptiness problem is shown decidable in linear time for RTA whereas membership of a given tree to the language of a given RTA is NP-complete. Our main result is the decidability of whether a given tree belongs to the rewrite closure of an RTA language under a restricted family of term rewriting systems, whereas this closure is not an RTA language. This result, one of the first on rewrite closure of languages of tree automata with constraints, is enabling the extension of model checking procedures based on finite tree automata techniques, in particular for the verification of communicating processes with several local non-rewritable memories, like security protocols. Finally, a comparison of RTA with several classes of tree automata with local and global equality tests, with dag automata and Horn clause formalisms is also provided. 相似文献
8.
Summary The following three results concerning tree automata are presented in this paper. (1) Rounds has presented the following open problem: For every recognizable setR, can we construct a deterministic finite-state transformation recognizingR? We show that this is not possible, in fact, even for a local set. However, the following is true: For every recognizable setR there is an inverse projectionR effectively obtained such thatR is recognized by a deterministic finite-state transformation. (2) Martin and Vere in their study of tree automata leave open the question of whether Generalized Syntax Directed Transductions (GSDT's) are closed under Arden's transformation or Greibach's transformation, and conjecture that they are not. We prove that this conjecture is true. It is also shown that GSDT's are not closed under transformation to LR(k) grammars. (3) Peters and Ritchie have shown that if, in a grammar where the generative rules are context-free, there are recognition rules which are context-sensitive, the language recognized is still context-free. A tree-automata-oriented proof is given by Rounds. We show that a similar result holds also for right linear grammars, i.e., if the generative rules are right linear, then using context-sensitive rules for recognition, one can still recognize only regular languages. Some other related results concerning context-sensitive extensions of subclasses of context-free languages are also presented.This work was partially supported by NSF Grant GJ27, U.S. Army Research Office, Durham (DA-31-124 ARO(D)-98), and NSF Grant GS-2509.A present on leave at The Institute for Advanced Study, Princeton, N.J. 相似文献
9.
Andreas Maletti 《Information and Computation》2009,207(11):1284-1299
10.
Timed tree automata with an application to temporal logic 总被引:1,自引:0,他引:1
Finite automata on -sequences and -trees were introduced in the sixties by Büchi, McNaughton and Rabin. Finite automata on timed -sequences were introduced by Alur and Dill. In this paper we extend the theory of timed -sequences to -trees. The main motivation is the introduction of a new way to specify real-time systems and to study, using automata-theoretic
techniques, branching-time temporal logics with timing constraints. We study closure properties and decision problems for
the obtained classes of timed -tree languages. In particular, we show the decidability of the emptiness problem. As an application of the introduced theory,
we give a new decidable branching time temporal logic (STCTL) whose semantics is based upon timed -trees.
Received: 8 September 1997 / 27 June 2001 相似文献
11.
Recognizing mathematical expressions using tree transformation 总被引:6,自引:0,他引:6
Zanibbi R. Blostein D. Cordy J.R. 《IEEE transactions on pattern analysis and machine intelligence》2002,24(11):1455-1467
We describe a robust and efficient system for recognizing typeset and handwritten mathematical notation. From a list of symbols with bounding boxes the system analyzes an expression in three successive passes. The Layout Pass constructs a Baseline Structure Tree (BST) describing the two-dimensional arrangement of input symbols. Reading order and operator dominance are used to allow efficient recognition of symbol layout even when symbols deviate greatly from their ideal positions. Next, the Lexical Pass produces a Lexed BST from the initial BST by grouping tokens comprised of multiple input symbols; these include decimal numbers, function names, and symbols comprised of nonoverlapping primitives such as "=". The Lexical Pass also labels vertical structures such as fractions and accents. The Lexed BST is translated into L/sup A/T/sub E/X. Additional processing, necessary for producing output for symbolic algebra systems, is carried out in the Expression Analysis Pass. The Lexed BST is translated into an Operator Tree, which describes the order and scope of operations in the input expression. The tree manipulations used in each pass are represented compactly using tree transformations. The compiler-like architecture of the system allows robust handling of unexpected input, increases the scalability of the system, and provides the groundwork for handling dialects of mathematical notation. 相似文献
12.
We define a weighted monadic second order logic for trees where the weights are taken from a commutative semiring. We prove that a restricted version of this logic characterizes the class of formal tree series which are accepted by weighted bottom-up finite state tree automata. The restriction on the logic can be dropped if additionally the semiring is locally finite. This generalizes corresponding classical results of Thatcher, Wright, and Doner for tree languages and it extends recent results of Droste and Gastin [Weighted automata and weighted logics, in: Automata, Languages and Programming—32nd International Colloquium, ICALP 2005, Lisbon, Portugal, 2005, Proceedings, Lecture Notes in Computer Science, Vol. 3580, Springer, Berlin, 2005, pp. 513–525, full version in Theoretical Computer Science, to appear.] from formal power series on words to formal tree series. 相似文献
13.
14.
基于恶意软件分类的特征码提取方法 总被引:2,自引:0,他引:2
网络已经成为恶意软件最主要的传播途径。由于恶意软件以网络数据包的形式在网络上传播时和正常的数据包并无差异,传统检测方法不能有效检测。对传统的特征码检测方法进行了改进,提出了一种基于恶意软件分类的特征码提取方法,提高了对恶意软件片段的识别能力,并在一定程度上解决了传统的特征检测方法检测效率低、特征库过于庞大的问题。 相似文献
15.
Android中存在很多系统机制供应用程序使用,然而这些机制在不当使用时会对用户安全和利益造成很大的破坏性。提出一种基于程序分析的方法,检测应用程序使用这些机制时可能存在的恶意行为。针对函数本身的特征,构建与之相对应的函数摘要。在构建摘要时使用指令级的模拟执行,在检测恶意行为时使用函数级的模拟执行,通过这两种不同级别的模拟执行分析出应用程序中潜在的恶意行为。基于上述方法,设计和实现了一个原型系统。通过对公开的恶意应用样本进行检测,验证了本方法是有效的。 相似文献
16.
Robert Luh Gregor Schramm Markus Wagner Helge Janicke Sebastian Schrittwieser 《Journal in Computer Virology》2018,14(4):291-311
Targeted attacks on IT systems are a rising threat to the confidentiality of sensitive data and the availability of critical systems. The emergence of Advanced Persistent Threats (APTs) made it paramount to fully understand the particulars of such attacks in order to improve or devise effective defense mechanisms. Grammar inference paired with visual analytics (VA) techniques offers a powerful foundation for the automated extraction of behavioral patterns from sequential event traces. To facilitate the interpretation and analysis of APTs, we present SEQUIN, a grammar inference system based on the Sequitur compression algorithm that constructs a context-free grammar (CFG) from string-based input data. In addition to recursive rule extraction, we expanded the procedure through automated assessment routines capable of dealing with multiple input sources and types. This automated assessment enables the accurate identification of interesting frequent or anomalous patterns in sequential corpora of arbitrary quantity and origin. On the formal side, we extended the CFG with attributes that help describe the extracted (malicious) actions. Discovery-focused pattern visualization of the output is provided by our dedicated KAMAS VA prototype. 相似文献
17.
The term ‘inductive inference’ denotes the process of hypothesizing a general rule from examples. It can be considered as the inverse process of program testing, which is a process of sampling the behaviour of a program and gathering confidence in the quality of the software from the samples. As one of the fundamental and ubiquitous components of intelligent behaviour, much effort has been spent on both the theory and practice of inductive inference as a branch of artificial intelligence. In this paper, software testing and inductive inference are reviewed to illustrate how the rich and solid theory of inductive inference can be used to study the foundations of software testing. 相似文献
18.
Tree automata generalize the notion of a finite automaton working on strings to that of a finite automaton operating on trees. Most results for finite automata have been extended to tree automata. In this paper we introduce tree derivatives which extend the concept of Brzozowski's string derivatives. We use tree derivatives for minimizing and characterizing tree automata. Tree derivatives are used as a basis of inference of tree automata from finite samples of trees. Our method compares tree derivative sets and infers a tree automaton based on the amount of overlap between the derivative sets. Several of the limitations present in the tree inference techniques by Brayer and Fu and Edwards, Gonzalez, and Thomason are not imposed by our algorithm. 相似文献
19.
Nicolas Peltier 《Annals of Mathematics and Artificial Intelligence》2009,56(1):65-85
We propose a method to use finite model builders in order to construct infinite models of first-order formulae. The constructed
models are Herbrand interpretations, in which the interpretation of the predicate symbols is specified by tree tuple automata
(Comon et al. 1997). Our approach is based on formula transformation: a formula ϕ is transformed into a formula Δ(ϕ) s.t. ϕ has a model representable by a term tuple automaton iff Δ(ϕ) has a finite model. This paper is an extended version of Peltier (2008). 相似文献
20.
基于无秩树自动机的信息抽取技术研究 总被引:1,自引:0,他引:1
针对目前基于网页结构的信息抽取方法的缺陷,提出了一种基于无秩树自动机的信息抽取技术,其核心思想是通过将结构化(半结构化)文档转换成无秩树,然后利用(k,l)-contextual树构造样本自动机,依据树自动机接收和拒绝状态来对网页进行数据的抽取.该方法充分利用结构,依托树自动机将传统的以单一结构途径的信息抽取方法与文法推理原则相结合,得到信息抽取规则.实验结果表明,该方法与同类抽取方法相比在准确率、召回率以及抽取所需时间上均有所提高. 相似文献