1.

Processing approximate aggregate queries in wireless sensor networks

Antonios Deligiannakis Yannis Kotidis Nick Roussopoulos 《Information Systems》2006,31(8):770-792

In-network data aggregation has been recently proposed as an effective means to reduce the number of messages exchanged in wireless sensor networks. Nodes of the network form an aggregation tree, in which parent nodes aggregate the values received from their children and propagate the result to their own parents. However, this schema provides little flexibility for the end-user to control the operation of the nodes in a data sensitive manner. For large sensor networks with severe energy constraints, the reduction (in the number of messages exchanged) obtained through the aggregation tree might not be sufficient. In this paper, we present new algorithms for obtaining approximate aggregate statistics from large sensor networks. The user specifies the maximum error that he is willing to tolerate and, in turn, our algorithms program the nodes in a way that seeks to minimize the number of messages exchanged in the network, while always guaranteeing that the produced estimate lies within the specified error from the exact answer. A key ingredient to our framework is the notion of the residual mode of operation that is used to eliminate messages from sibling nodes when their cumulative change to the computed aggregate is small. We introduce two new algorithms, based on potential gains, which adaptively redistribute the error thresholds to those nodes that benefit the most and try to minimize the total number of transmitted messages in the network. Our techniques significantly reduce the number of messages, often by a factor of 10 for a modest 2% relative error bound, and consistently outperform previous techniques for computing approximate aggregates, which we have adapted for sensor networks.
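
The error-bounded idea in this abstract can be illustrated with a much simpler scheme than the one the paper proposes. The sketch below is not the authors' residual-mode or potential-gains algorithm; it is a generic per-node threshold filter for a SUM query, and `FilteredNode`, `threshold`, and `approximate_sum` are hypothetical names chosen for illustration. Each node transmits only when its reading has drifted more than its threshold from the last value it reported, so the sink's estimate stays within the sum of the thresholds of the exact answer.

```python
from dataclasses import dataclass


@dataclass
class FilteredNode:
    """A sensor that reports only when its reading drifts beyond a threshold."""
    threshold: float            # hypothetical per-node share of the error budget
    last_reported: float = 0.0  # value the sink currently holds for this node
    messages_sent: int = 0

    def observe(self, reading: float) -> None:
        # Transmit only if the value cached at the sink would be too stale.
        if abs(reading - self.last_reported) > self.threshold:
            self.last_reported = reading
            self.messages_sent += 1


def approximate_sum(nodes):
    """SUM as seen by the sink: built from the last value each node reported."""
    return sum(n.last_reported for n in nodes)


nodes = [FilteredNode(threshold=0.5) for _ in range(3)]
readings_per_epoch = [
    [10.0, 20.0, 30.0],  # first epoch: every node reports
    [10.2, 20.4, 29.7],  # small drifts: nothing is sent
    [11.0, 20.4, 29.7],  # only the first node exceeds its threshold
]
for epoch in readings_per_epoch:
    for node, reading in zip(nodes, epoch):
        node.observe(reading)

exact = sum(readings_per_epoch[-1])
estimate = approximate_sum(nodes)
total_budget = sum(n.threshold for n in nodes)
total_messages = sum(n.messages_sent for n in nodes)
```

With uniform thresholds this filter sends 4 messages instead of 9 over the three epochs. The abstract's contribution, by contrast, is to redistribute the thresholds adaptively toward the nodes that benefit most.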

2.

A note on efficient aggregate queries in sensor networks

Boaz Patt-Shamir 《Theoretical computer science》2007,370(1-3):254-264

We consider a scenario where nodes in a sensor network hold numeric items, and the task is to evaluate simple functions of the distributed data. In this note we present distributed protocols for computing the median with sublinear space and communication complexity per node. Specifically, we give a deterministic protocol for computing median with polylog complexity and a randomized protocol that computes an approximate median with polyloglog communication complexity per node. On the negative side, we observe that any deterministic protocol that counts the number of distinct data items must have linear complexity in the worst case.

3.

Approximate distributed top-k queries

Boaz Patt-Shamir Allon Shafrir 《Distributed Computing》2008,21(1):1-22

We consider a distributed system where each node keeps a local count for items (similar to elections where nodes are ballot boxes and items are candidates). A top-k query in such a system asks which are the k items whose global count, across all nodes in the system, is the largest. In this paper, we present a Monte Carlo algorithm that outputs, with high probability, a set of k candidates which approximates the top-k items. The algorithm is motivated by sensor networks in that it focuses on reducing the individual communication complexity. In contrast to previous algorithms, the communication complexity depends only on the global scores and not on the partition of scores among nodes. If the number of nodes is large, our algorithm dramatically reduces the communication complexity when compared with deterministic algorithms. We show that the complexity of our algorithm is close to a lower bound on the cell-probe complexity of any non-interactive top-k approximation algorithm. We show that for some natural global distributions (such as the Geometric or Zipf distributions), our algorithm needs only polylogarithmic number of communication bits per node. An extended abstract of this paper appeared in Proc. 13th Int. Colloquium on Structural Information and Communication Complexity, SIROCCO 2006, Lecture Notes in Computer Science 4056, pp. 319–333.
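
To make the query definition concrete, here is a minimal sketch of the exact answer that the paper's Monte Carlo algorithm approximates. It is the naive baseline in which every node ships all of its counters to one place, not the algorithm from the paper; `exact_top_k` and the ballot-box data are hypothetical.

```python
from collections import Counter


def exact_top_k(local_counts, k):
    """Return the k items with the largest global count, largest first.

    local_counts is one {item: count} mapping per node (one per ballot box).
    """
    total = Counter()
    for counts in local_counts:
        total.update(counts)  # Counter.update adds counts for mappings
    return [item for item, _ in total.most_common(k)]


ballot_boxes = [
    {"a": 5, "b": 1, "c": 3},
    {"a": 1, "b": 6, "c": 1},
    {"a": 2, "b": 2, "c": 0},
]
winners = exact_top_k(ballot_boxes, 2)  # global counts: a=8, b=9, c=4
```

This baseline costs one message per (node, item) pair. That per-node communication cost is what the abstract's algorithm aims to reduce.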

4.

Complexity of answering counting aggregate queries over DL-Lite

《Journal of Web Semantics》2015

5.

Selecting and using views to compute aggregate queries

Foto Afrati Rada Chirkova 《Journal of Computer and System Sciences》2011,77(6):1079-1107

We consider a workload of aggregate queries and investigate the problem of selecting materialized views that (1) provide equivalent rewritings for all the queries, and (2) are optimal, in that the cost of evaluating the query workload is minimized. We consider conjunctive views and rewritings, with or without aggregation; in each rewriting, only one view contributes to computing the aggregated query output. We look at query rewriting using existing views and at view selection. In the query-rewriting problem, we give sufficient and necessary conditions for a rewriting to exist. For view selection, we prove complexity results. Finally, we give algorithms for obtaining rewritings and selecting views.

6.

Containment for queries over trees with attribute value comparisons

《Information Systems》2016

7.

Bandwidth-constrained queries in sensor networks

Antonios Deligiannakis Yannis Kotidis Nick Roussopoulos 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(3):443-467

Sensor networks consist of battery-powered wireless devices that are required to operate unattended for long periods of time. Thus, reducing energy drain is of utmost importance when designing algorithms and applications for such networks. Aggregate queries are often used by monitoring applications to assess the status of the network and detect abnormal behavior. Since radio transmission often constitutes the biggest factor of energy drain in a node, in this paper we propose novel algorithms for the evaluation of bandwidth-constrained queries over sensor networks. The goal of our techniques is, given a target bandwidth utilization factor, to program the sensor nodes in a way that seeks to maximize the accuracy of the produced query results at the monitoring node, while always providing strong error guarantees to the monitoring application. This is a distinct difference of our framework from previous techniques that only provide probabilistic guarantees on the accuracy of the query result. Our algorithms are equally applicable when the nodes have ample power resources, but bandwidth consumption needs to be minimized, for instance in densely distributed networks, to ensure proper operation of the nodes. Our experiments with real sensor data show that bandwidth-constrained queries can substantially reduce the number of messages in the network while providing very tight error bounds on the query result.

8.

Answering why-not questions on KNN queries

Zhefan ZHONG Xin LIN Liang HE Jing YANG 《Frontiers of Computer Science》2019,13(5):1062

After decades of study, the usability of database systems has received more attention in recent years. In particular, the ability to explain missing objects in a query result, known as answering “why-not” questions, has become a focus of concern. This paper studies the problem of answering why-not questions on KNN queries. In real life, many users use KNN queries to investigate their surroundings. Nevertheless, they are often disappointed to find that the result does not include the objects they expected. In this paper, we use the query refinement approach to resolve the problem. Given the original KNN query and a set of missing objects as input, our algorithm offers the user a refined KNN query whose result includes the missing objects. The experimental results demonstrate the efficiency of our proposed optimizations and algorithms.

9.

Enabling soft queries for data retrieval

Hwanjo Yu Seung-won Hwang Kevin Chen-Chuan Chang 《Information Systems》2007

10.

Nearest and reverse nearest neighbor queries for moving objects (total citations: 4, self-citations: 0, citations by others: 4)

Rimantas Benetis Christian S. Jensen Gytis Karčiauskas Simonas Šaltenis 《The VLDB Journal The International Journal on Very Large Data Bases》2006,15(3):229-249

With the continued proliferation of wireless communications and advances in positioning technologies, algorithms for efficiently answering queries about large populations of moving objects are gaining interest. This paper proposes algorithms for k nearest and reverse k nearest neighbor queries on the current and anticipated future positions of points moving continuously in the plane. The former type of query returns k objects nearest to a query object for each time point during a time interval, while the latter returns the objects that have a specified query object as one of their k closest neighbors, again for each time point during a time interval. In addition, algorithms for so-called persistent and continuous variants of these queries are provided. The algorithms are based on the indexing of object positions represented as linear functions of time. The results of empirical performance experiments are reported.
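
The data model in this abstract, positions represented as linear functions of time, can be sketched with a brute-force evaluation at a single time point. This is an illustration of the query semantics only; the paper's algorithms rely on an index and answer the query over a whole time interval. The names `position` and `knn_at` and the sample objects are hypothetical.

```python
import math


def position(obj, t):
    """Position at time t of an object given as ((x0, y0), (vx, vy))."""
    (x0, y0), (vx, vy) = obj
    return (x0 + vx * t, y0 + vy * t)


def knn_at(objects, query, k, t):
    """Names of the k objects nearest to the query object at time t."""
    qx, qy = position(query, t)

    def dist(name):
        x, y = position(objects[name], t)
        return math.hypot(x - qx, y - qy)

    return sorted(objects, key=dist)[:k]


objects = {
    "a": ((0.0, 0.0), (1.0, 0.0)),    # moves right along the x-axis
    "b": ((5.0, 0.0), (0.0, 0.0)),    # stationary
    "c": ((10.0, 0.0), (-1.0, 0.0)),  # moves left along the x-axis
}
query = ((4.0, 0.0), (0.0, 0.0))      # stationary query object

answers = {t: knn_at(objects, query, 2, t) for t in (0, 4, 8)}
```

The answer changes from `["b", "a"]` at t=0 to `["a", "b"]` at t=4 and `["b", "c"]` at t=8, which is why the query is defined per time point over an interval rather than once.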

11.

Detecting proximity events in sensor networks

Antonios Deligiannakis Yannis Kotidis 《Information Systems》2011

Sensor networks are often used to perform monitoring tasks, such as animal and vehicle tracking, or the surveillance of enemy forces in military applications. In this paper we introduce the concept of proximity queries, which allow us to report interesting events, observed by nodes in the network that lie within a certain distance from each other. An event is triggered when a user-programmable predicate is satisfied on a sensor node. We study the problem of computing proximity queries in sensor networks and propose several alternative techniques that differ in the number of messages exchanged by the nodes and the quality of the returned answers. Our solutions utilize a distributed routing index, maintained by the nodes in the network, that is dynamically updated as new observations are obtained by the nodes. This distributed index allows us to efficiently process multiple proximity queries involving several different event types within a fraction of the cost that a straightforward evaluation requires. We present an extensive experimental study to show the benefits of our techniques under different scenarios using both synthetic and real data sets. Our results demonstrate that our algorithms scale better and require significantly fewer messages compared to a straightforward execution of the queries.

12.

Determinacy and query rewriting for conjunctive queries and views

Foto N. Afrati 《Theoretical computer science》2011,412(11):1005-1021

Answering queries using views is the problem which examines how to derive the answers to a query when we only have the answers to a set of views. Constructing rewritings is a widely studied technique to derive those answers. In this paper we consider the problem of the existence of rewritings in the case where the answers to the views uniquely determine the answers to the query. Specifically, we say that a view set V determines a query Q if for any two databases D₁,D₂ it holds: V(D₁)=V(D₂) implies Q(D₁)=Q(D₂). We consider the case where query and views are defined by conjunctive queries and investigate the question: If a view set V determines a query Q, is there an equivalent rewriting of Q using V? We present here interesting cases where there are such rewritings in the language of conjunctive queries. Interestingly, we identify a class of conjunctive queries, CQ_path, for which a view set can produce equivalent rewritings for “almost all” queries which are determined by this view set. We introduce a problem which relates determinacy to query equivalence. We show that there are cases where restricted results can carry over to broader classes of queries.
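
The definition of determinacy above quantifies over all pairs of databases, so a finite search can only refute it, never prove it. As a minimal sketch under that caveat, the code below looks for a pair D₁, D₂ with V(D₁)=V(D₂) but Q(D₁)≠Q(D₂) among a small hand-picked set of databases. Databases are modelled as sets of edges, and `find_counterexample`, `sources`, `edges`, and `path2` are hypothetical illustrations, not constructions from the paper.

```python
from itertools import combinations


def find_counterexample(views, query, databases):
    """Return a pair (d1, d2) witnessing that the views do NOT determine
    the query, or None if no such pair exists among the given databases."""
    for d1, d2 in combinations(databases, 2):
        same_views = all(v(d1) == v(d2) for v in views)
        if same_views and query(d1) != query(d2):
            return d1, d2
    return None


def sources(db):
    """View: projection of the edge relation onto its first column."""
    return frozenset(x for x, _ in db)


def edges(db):
    """The edge relation itself."""
    return frozenset(db)


def path2(db):
    """Query: pairs (x, z) connected by a path of length two."""
    return frozenset((x, z) for x, y in db for y2, z in db if y == y2)


databases = [
    frozenset({(1, 2)}),
    frozenset({(1, 3)}),
    frozenset({(1, 2), (2, 3)}),
]

# The projection loses the edge targets, so it cannot determine the edges.
witness = find_counterexample([sources], edges, databases)
# The full edge relation trivially determines any query over it.
no_witness = find_counterexample([edges], path2, databases)
```

Here `witness` is the pair `{(1, 2)}` and `{(1, 3)}`, which agree on `sources` but differ on `edges`. `no_witness` is `None`, as expected when the view exposes the whole database.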

13.

Multi-query optimization based on correlation degree in wireless sensor networks

Ximing LI Jin ZHENG 《Computer Engineering and Applications》2011,47(21):98-101

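A wireless sensor network is a data-centric network in which users obtain the data they need by issuing query requests to the network through a base station. How to reduce the energy consumption of sensor nodes through multi-query optimization, and thereby extend the network lifetime, is one of the key problems to be solved in wireless sensor networks. This paper proposes a multi-query optimization algorithm based on correlation degree. Its basic idea is that a node selects its parent according to the correlation degree between itself and each candidate parent node, so that nodes covered by the same queries are gathered into a group. The data of the nodes in a group are shared among multiple queries, and query data are fused effectively within the network, which substantially reduces the volume of data transmitted and extends the network lifetime. Theoretical analysis and simulation experiments show that the algorithm substantially reduces the volume of data transmitted and thereby saves energy.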

14.

The complexity of weighted counting for acyclic conjunctive queries

Arnaud Durand Stefan Mengel 《Journal of Computer and System Sciences》2014

This paper is a study of weighted counting of the solutions of acyclic conjunctive queries (ACQ). The unweighted quantifier free version of this problem is known to be tractable (for combined complexity), but it is also known that introducing even a single quantified variable makes it #P-hard. We first show that weighted counting for quantifier free ACQ is still tractable and that even minimalistic extensions of the problem lead to hard cases. We then introduce a new parameter for quantified queries that makes it possible to isolate a large island of tractability. We show that, up to a standard assumption from parameterized complexity, this parameter fully characterizes tractable subclasses for counting weighted solutions for ACQs. Thus we completely determine the tractability frontier for weighted counting for ACQ.

15.

Genetic algorithms for approximate similarity queries (total citations: 1, self-citations: 0, citations by others: 1)

Renato Agma J.M. Caetano 《Data & Knowledge Engineering》2007,62(3):459-482

Algorithms to query large sets of simple data (composed of numbers and small character strings) are constructed to retrieve every relevant element, so the answer is said to be exact. Similarity searching over complex data is much more expensive than searching over simple data. Moreover, comparison operations over complex data usually consider features extracted from each element, instead of the elements themselves. Thus, even if an algorithm retrieves an exact answer, it is ‘exact’ regarding the extracted features, not regarding the original elements themselves. Therefore, trading exactness for a shorter query response time can be worthwhile. In this work we developed two search strategies based on genetic algorithms to allow retrieving approximate data indexed by Metric Access Methods (MAM) within a limited, user-defined amount of time. These strategies allow implementing algorithms to answer both range and k-nearest neighbor queries, and also allow estimating the precision obtained for the approximate answer. Experimental evaluation shows that very good results (corresponding to what the user would expect) can be obtained in a fraction of the time required to obtain the exact answer.

16.

Optimizing in-network aggregate queries in wireless sensor networks for energy saving

Chih-Chieh Hung Wen-Chih Peng 《Data & Knowledge Engineering》2011,70(7):617-641

This study proposes a method of in-network aggregate query processing to reduce the number of messages incurred in a wireless sensor network. When aggregate queries are issued to the resource-constrained wireless sensor network, it is important to efficiently perform these queries. Given a set of multiple aggregate queries, the proposed approach shares intermediate results among queries to reduce the number of messages. When the sink receives multiple queries, it should propagate these queries to the wireless sensor network via existing routing protocols. The sink can then obtain the corresponding topology of the queries and view each query as a query tree. With a set of query trees collected at the sink, it is necessary to determine a set of backbones that share intermediate results with other query trees (called non-backbones). First, it is necessary to formulate the objective cost function for backbones and non-backbones. Using this objective cost function, it is possible to derive a reduction graph that reveals possible cases of sharing intermediate results among query trees. Using the reduction graph, this study first proposes a heuristic algorithm BM (standing for Backbone Mapping). This study also develops algorithm OOB (standing for Obtaining Optimal Backbones) that exploits a branch-and-bound strategy to obtain the optimal solution efficiently. This study tests the performance of these algorithms on both synthetic and real datasets. Experimental results show that by sharing the intermediate results, the BM and OOB algorithms significantly reduce the total number of messages incurred by multiple aggregate queries, thereby extending the lifetime of sensor networks.

17.

Sampling-based estimators for subset-based queries

Shantanu Joshi Christopher Jermaine 《The VLDB Journal The International Journal on Very Large Data Bases》2009,18(1):181-202

We consider the problem of using sampling to estimate the result of an aggregation operation over a subset-based SQL query, where a subquery is correlated to an outer query by a NOT EXISTS, NOT IN, EXISTS or IN clause. We design an unbiased estimator for our query and prove that it is indeed unbiased. We then provide a second, biased estimator that makes use of the superpopulation concept from statistics to minimize the mean squared error of the resulting estimate. The two estimators are tested over an extensive set of experiments. Material in this paper is based upon work supported by the National Science Foundation via grants 0347408 and 0612170.

18.

Adaptive optimization for multiple continuous queries

Hong Kyu Park Won Suk Lee 《Data & Knowledge Engineering》2012,71(1):29-46

Because it operates under a strict time constraint, query processing for data streams should be continuous and rapid. To guarantee this constraint, most previous research optimizes the evaluation order of multiple join operations in a set of continuous queries using a greedy optimization strategy, so that the order is re-optimized dynamically at run time in response to the time-varying characteristics of data streams. However, this method often results in a sub-optimal plan because the greedy strategy traces only the first promising plan. This paper proposes a new multiple query optimization approach, Adaptive Sharing-based Extended Greedy Optimization Approach (A-SEGO), that traces multiple promising partial plans simultaneously. A-SEGO presents a novel method for sharing the results of common sub-expressions in a set of queries cost-effectively. The number of partial plans can be flexibly controlled according to the query processing workload. In addition, to avoid invoking the optimization process too frequently, optimization is performed only when the current execution plan is no longer relatively efficient. A series of experiments are comparatively analyzed to evaluate the performance of the proposed method in various stream environments.

19.

Probabilistic location-dependent queries at different location granularities

《Pervasive and Mobile Computing》2017

Approaches for the processing of location-dependent queries usually assume that the location data are expressed precisely, usually using GPS locations. However, this is unrealistic because positioning methods do not have a perfect accuracy (e.g., the positioning approach used in cellular networks handles only the cell where mobile users are located). Besides, users may need to express queries based on concepts of locations other than traditional GPS locations, which we call location granules. In this paper, we focus on location granule-based query processing (i.e., processing of queries with location granules) in situations where the location data available is imprecise, which we have called probabilistic location-dependent queries. For that purpose, we exploit the concept of uncertainty location granule, which represents the location uncertainty of an object. In particular, we tackle the problem of processing probabilistic inside (range) constraints. We analyze in detail how those constraints can be processed, taking into account both the existence of location uncertainty affecting the relevant objects and the location granularity specified. An extensive experimental evaluation shows the feasibility of the proposed probabilistic query processing approach and analyzes the advantages of using index structures to speed up the query processing.

20.

Scalable processing of multiple continuous k-nearest-neighbor queries in road networks

Wei LIAO Xiaoping WU Zhinong ZHONG 《Computer Engineering and Design》2009,30(24)

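For the processing of multi-user continuous k-nearest-neighbor queries over road networks, a scalable processing of multiple continuous queries (SPMCQ) framework is proposed. The SPMCQ framework adopts a pipelined processing strategy that decomposes the execution of continuous k-nearest-neighbor queries into three stages that can run concurrently (preprocessing, query execution, and result dispatching) and uses multithreading to increase the parallelism of query processing. Based on the SPMCQ framework, a memory-resident hash table and a linear linked list are used to store and manage the positions of moving objects and the directed-graph model of the road network, respectively, and the SCkNN algorithm for processing multiple continuous k-nearest-neighbor queries is proposed. Experimental results show that, when processing multi-user continuous k-nearest-neighbor queries, the algorithm outperforms existing continuous k-nearest-neighbor query processing algorithms for road networks.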