期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Worst-case optimal algorithm for XPath evaluation over XML streams

Prakash Ramanan 《Journal of Computer and System Sciences》2009,75(8):465-485

We consider the XPath evaluation problem: Evaluate an XPath query Q on a streaming XML document D; i.e., determine the set Q(D) of document elements selected by Q. We mainly consider Conjunctive XPath queries that involve only the child and descendant axes. Previously known in-memory algorithms for this problem use O(|D|) space and O(|Q||D|) time. Several previously known algorithms for the streaming version use Ω(dn) space and Ω(dn|D|) time in the worst case; d denotes the depth of D, and n denotes the number of location steps in Q. Their exponential space requirement could well exceed the O(|D|) space used by the in-memory algorithms. We present an efficient algorithm that uses O(d|Q|+nc) space and O((|Q|+dn)|D|) time in the worst case; c denotes the maximum number of elements of D that can be candidates for output, at any one instant. For some worst case Q and D, the memory space used by our algorithm matches our lower bound proved in a different paper; so, our algorithm uses optimal memory space in the worst case. 相似文献

2.

Memory lower bounds for XPath evaluation over XML streams

Prakash Ramanan 《Journal of Computer and System Sciences》2011,77(6):1120-1140

We consider the XPath evaluation problem: Evaluate an XPath query Q on a streaming XML document D. We consider two versions of the problem: 1) Filtering Problem: Determine if there is a match for Q in D. 2) Node Selection Problem: Determine the set Q(D) of document nodes selected by Q. We consider Conjunctive XPath (CXPath) queries that involve only the child and descendant axes. Let d denote the depth of D, and n denote the number of location steps in Q. Bar-Yossef et al. (2007, 2005) [6] and [7] presented lower bounds on the memory space required by any algorithm to solve these two problems. Their lower bounds apply to each query in a large subset of XPath, and are obtained (mostly) using nonrecursive(Q,D). In this paper, we present larger lower bounds for a different class of queries (namely, CXPath queries with independent predicates), on recursive(Q,D). One of our results is an Ω(n⋅maxcands(Q,D)) lower bound for the node selection problem, for a worst-case Q; maxcands(Q,D) is the maximum number of nodes of D that can be candidates for output, at any one instant. So, there is no algorithm for the node selection problem that uses O(f(d,|Q|)+maxcands(Q,D)) space, for any function f. This shows that some previously published algorithms are incorrect. 相似文献

3.

Schema-aware XPath filtering on XML document streams

Daewook Lee Joonho Kwon Weidong Yang Hyoseop Shin Jae-min Kwak Sukho Lee 《Journal of Intelligent Manufacturing》2009,20(3):273-282

The XML stream filtering is gaining widespread attention from the research community in recent years. There have been many efforts to improve the performance of the XML filtering system by utilizing XML schema information. In this paper, we design and implement an XML stream filtering system, SFilter, which uses DTD or XML schema information for improving the performance. We propose the simplification and two kinds of optimization, one is static and the other is dynamic optimization. The Simplification and static optimization transform the XPath queries to make automata as an index structure for the filtering. The dynamic optimization are done in runtime at the filtering time. We developed five kinds of static optimization and two kinds of dynamic optimization. We present the novel filtering algorithm for the resulting transformed XPath queries and runtime optimizing. The experimental result shows that our system filters the XML streams efficiently. 相似文献

4.

Attribute-based evaluation of multiple continuous queries for filtering incoming tuples of a data stream

Hyun-Ho Lee Eun-Won Yun Won-Suk Lee 《Information Sciences》2008,178(11):2416-2432

The filtering of incoming tuples of a data stream should be completed quickly and continuously, which requires strict time and space constraints. In order to guarantee these constraints, the selection predicates of continuous queries are grouped or indexed in most data stream management systems (DSMS). This paper proposes a new scheme called attribute selection construct (ASC). Given a set of continuous queries, an ASC divides the domain of an attribute of a data stream into a set of disjoint regions based on the selection predicates that are imposed on the attribute. Each region maintains the pre-computed matching results of the selection predicates. Consequently, an ASC can collectively evaluate all of its selection predicates at the same time. Furthermore, it can also monitor the overall evaluation statistics, such as its selectivity and tuple dropping ratio, dynamically. For those attributes that are employed to express the selection predicates of the queries, the processing order of their ASC’s can significantly influence the overall performance of a multiple query evaluation. The evaluation sequence can be optimized by periodically capturing the run-time tuple dropping ratio of its current evaluation sequence. The performance of the proposed method is analyzed by a series of experiments to identify its various characteristics. 相似文献

5.

Vague continuous K-nearest neighbor queries over moving objects with uncertain velocity in road networks 总被引：1，自引：0，他引：1

Ping Fan Guohui LiLing Yuan Yanhong Li 《Information Systems》2012,37(1):13-32

Recent research has focused on Continuous K Nearest Neighbor (CKNN) queries in road networks, where the queries and the data objects are moving. Most existing approaches assume the fixed velocity of moving objects. The release of fixed moving velocity makes the query process slowly due to the significant repetitive query cost. In this paper, we study CKNN queries over moving objects with uncertain velocity in road networks. A Distance Interval Model (DIM) is designed to calculate the minimal and maximal road network distances between moving objects and query point. Furthermore, we propose a novel Possibility-based Vague KNN (PVKNN) algorithm to process the query efficiently, which determines the CKNN query results with possibility within each division time subinterval of given time interval by applying the vague set theory. In the PVKNN algorithm, the query efficiency can be improved significantly with the pruning, distilling and possibility-ranking phases. With these phases, the objects candidates are scaled down and the given time interval is divided into subintervals to reduce the repetitive query cost. In addition, an index structure TPR^uv-Tree is designed to efficiently index moving objects with uncertain velocity in road network by involving edge connection and moving objects information. Experiments with simulation and comparison show that significant improvement in the performance of efficiency can be achieved with our proposed algorithms. 相似文献