Similar articles
 Found 10 similar articles (search time: 171 ms)
1.
Efficient processing of skyline queries has received considerable attention in recent years. A skyline query identifies the set of non-dominated data records in a multidimensional dataset. Whereas most previous studies address this problem in a centralized environment, this work considers it in a distributed sensor network environment. An algorithm, the Skyline Sensor Algorithm (SkySensor), is presented to efficiently retrieve skyline results from a sensor network. A cluster-based architecture is designed in SkySensor to collect all sensor readings. A pruning method is then proposed to progressively sift the skyline results out of the sensor network. When searching for the skyline results, SkySensor avoids the need to collect data from every sensor in the network, which is an extremely expensive operation. The performance study indicates that SkySensor is highly efficient and significantly outperforms previous methods in processing skyline queries.
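
The notion of a "non-dominated" record can be made concrete with a small centralized sketch. It is not the SkySensor algorithm (the abstract gives no details of its clustering or pruning); it only illustrates the dominance test that any skyline method relies on. The convention that smaller values are better in every dimension, and the names `dominates` and `skyline`, are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumption: smaller is better in every dimension. a dominates b when
    a is no worse than b everywhere and strictly better somewhere.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-dimensional sensor readings.
readings = [(1, 9), (3, 3), (4, 4), (9, 1), (5, 8)]
result = skyline(readings)
print(result)
```

Here (4, 4) and (5, 8) are dropped because (3, 3) is at least as good in both dimensions and strictly better in one. A distributed method would aim to discard such records near their source instead of shipping them to a central node.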

2.
Advances in geographical tracking, multimedia processing, information extraction, and sensor networks have created a deluge of probabilistic data. While similarity search is an important tool for manipulating probabilistic data, it poses new challenges for traditional relational databases. The problem stems from the limited effectiveness of the distance metrics employed by existing database systems. On the other hand, several more complicated distance operators have proven their value through better discriminating ability in specific probabilistic domains. In this paper, we discuss the similarity search problem with respect to Earth Mover's Distance (EMD). EMD is the most successful distance metric for comparing probability distributions, but it is an expensive operator with cubic time complexity. We present a new database indexing approach to answer EMD-based similarity queries, including range queries and k-nearest neighbor queries on probabilistic data. Our solution utilizes primal-dual theory from linear programming and employs a group of B+ trees for effective candidate pruning. We also apply our filtering technique to the processing of continuous similarity queries, especially with applications to frame copy detection in real-time videos. Extensive experiments show that our proposals dramatically improve the usefulness and scalability of probabilistic data management.
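
To give a feel for what EMD measures, here is a minimal sketch of a special case only. For two one-dimensional histograms of equal total mass, with ground distance |i - j| between bins, EMD reduces to the sum of absolute differences between the two cumulative sums. The general multidimensional case is a transportation linear program, which is where the cubic cost mentioned above comes from. This is not the paper's indexing method, and the function name `emd_1d` is hypothetical.

```python
from itertools import accumulate
from typing import Sequence


def emd_1d(p: Sequence[float], q: Sequence[float]) -> float:
    """EMD between two 1-D histograms over the same unit-spaced bins.

    Assumptions: equal length, equal total mass, ground distance |i - j|.
    Under these assumptions EMD equals the L1 distance between the CDFs.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Moving one unit of mass across two bins costs 2.
far = emd_1d([1, 0, 0], [0, 0, 1])
# Shifting a two-bin distribution right by one bin costs 1.
near = emd_1d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
print(far, near)
```

Unlike a bin-by-bin distance, EMD accounts for how far mass has to move, so the first pair is judged twice as far apart as the second.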

3.
Sensor networks build temporary wireless connections in environments where stationary infrastructure is either destroyed or too expensive to construct. Most previous research on sensor networks focuses on routing protocols that adapt to dynamic network topologies, and little work has been done on data access. One important data access application is similarity search, which provides the foundation of content-based retrieval. Many traditional similarity search algorithms are based on centralized or flooding mechanisms, which are not effective in wireless sensor network environments because of limitations such as bandwidth and power. In this paper we tackle the problem of similarity search by using semantic-based caching to reflect the distribution of data content in the network. The basic idea is to analyze the cached results of earlier queries and to try to resolve later queries within a small collection of content-related mobile nodes. Using a Hilbert space-filling curve, the data points in a multi-dimensional semantic space are mapped to a linear representation. These data points are then cached to facilitate query processing. Through extensive simulations, we show that our method improves similarity search in terms of search cost and response time.
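
The linearization step can be sketched for the two-dimensional case. The code below is the widely used iterative mapping from a grid cell to its position along a Hilbert curve. The abstract does not say which dimensionality or curve order the authors use, so the 2-D grid, the power-of-two side length `n`, and the name `hilbert_index` are assumptions.

```python
def hilbert_index(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n-by-n grid to its position on a Hilbert curve.

    Assumptions: n is a power of two and 0 <= x, y < n.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("cell is outside the grid")
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has a standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Order of the four cells of a 2x2 grid along the curve.
order_2x2 = sorted(((x, y) for x in range(2) for y in range(2)),
                   key=lambda c: hilbert_index(2, *c))
print(order_2x2)
```

Cells that are consecutive on the curve are always neighbors in the grid, which is the locality property that makes a one-dimensional key useful for caching results of nearby queries together.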

4.
Efficient Similarity Search over Future Stream Time Series   Total citations: 2, self-citations: 0, citations by others: 2
With advances in hardware and communication technologies, stream time series are gaining ever-increasing attention because of their importance in many applications such as financial data processing, network monitoring, Web click-stream analysis, sensor data mining, and anomaly detection. For all of these applications, efficient and effective similarity search over stream data is essential. Because of the unique characteristics of streams (for example, data are frequently updated and real-time response is required), previous approaches proposed for searching archived data may not work in stream scenarios. In particular, where data arrive periodically for various reasons (for example, communication congestion or batch processing), queries on such incomplete time series, or even on future time series, may be answered inaccurately by traditional approaches. Therefore, in this paper, we propose three approaches, polynomial, Discrete Fourier Transform (DFT), and probabilistic, to predict the unknown values that have not yet arrived at the system and to answer similarity queries based on the predicted data. We also apply efficient indexes, namely a multidimensional hash index and a B+-tree, to facilitate the prediction and the similarity search on future time series, respectively. Extensive experiments demonstrate the efficiency and effectiveness of our methods for prediction and query answering.

5.
Similarity search is a critical issue in many applications. When all attributes of objects are used to determine their similarity, most prior similarity search algorithms are easily influenced by a few attributes with high dissimilarity. The frequent k-n-match query was proposed to overcome this problem. However, the prior algorithm for processing frequent k-n-match queries is designed for static data, whose attributes are fixed, and is not suitable for dynamic data. Thus, in this paper we propose two schemes to process continuous frequent k-n-match queries over dynamic data. First, the concept of a safe region is proposed and four formulae are devised to compute safe regions. Then, scheme CFKNMatchAD-C is developed to speed up the processing of continuous frequent k-n-match queries by using safe regions to avoid unnecessary query re-evaluations. To reduce the amount of data transmitted by networked data sources, scheme CFKNMatchAD-C also uses safe regions to eliminate transmissions of unnecessary data updates that will not affect the query results. Moreover, for large-scale environments, we further propose scheme CFKNMatchAD-D, which extends scheme CFKNMatchAD-C to employ multiple servers to process continuous frequent k-n-match queries. Experimental results show that scheme CFKNMatchAD-C and scheme CFKNMatchAD-D outperform the prior algorithm in terms of average response time and the amount of network traffic produced.
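
The abstract does not define the query itself, so the following static sketch rests on an assumption: it follows the definition commonly given in the k-n-match literature, where the n-match difference between an object and the query is the n-th smallest per-attribute difference, a k-n-match query returns the k objects with the smallest n-match difference, and the frequent variant returns the k objects that appear most often across a range of n values. It does not implement safe regions or the CFKNMatchAD schemes, and all function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def n_match_difference(p: Sequence[float], q: Sequence[float], n: int) -> float:
    """n-th smallest absolute per-attribute difference between p and q (assumed definition)."""
    return sorted(abs(a - b) for a, b in zip(p, q))[n - 1]


def k_n_match(data: List[Sequence[float]], q: Sequence[float], k: int, n: int) -> List[int]:
    """Indices of the k objects with the smallest n-match difference (ties broken by index)."""
    ranked = sorted(range(len(data)),
                    key=lambda i: (n_match_difference(data[i], q, n), i))
    return ranked[:k]


def frequent_k_n_match(data: List[Sequence[float]], q: Sequence[float],
                       k: int, n_values: Iterable[int]) -> List[int]:
    """Indices of the k objects that appear most often in the k-n-match answers over n_values."""
    counts = Counter()
    for n in n_values:
        counts.update(k_n_match(data, q, k, n))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [i for i, _ in ranked[:k]]


# Object 0 matches the query on two attributes but has one wildly different attribute.
data = [(1, 1, 100), (5, 5, 5), (2, 2, 2)]
query = (1, 1, 1)
best = frequent_k_n_match(data, query, k=1, n_values=range(1, 4))
print(best)
```

Under a full-attribute distance, object 0 would rank last because of its single outlying attribute. Under this definition it wins for n = 1 and n = 2, which is the robustness to a few highly dissimilar attributes that the abstract describes.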

6.
This paper presents PRIDE, a novel data abstraction layer for collaborative 2-tier sensor network applications. More specifically, PRIDE targets distributed real-time applications in which multiple collaborating mobile devices have to analyze a global situation by collecting and managing data streams from a massive number of underlying sensors. Running on these devices, PRIDE hides the details of the underlying sensors and provides transparent, timely, and robust access to global sensor data in the highly dynamic and unpredictable environments of emerging sensor network applications. For transparent and efficient sharing of global sensor data, a model-based predictive replication mechanism is proposed and integrated into a conventional data management system that supports diverse types of spatial and temporal queries. In addition, for robust and timely query processing, the predictive replication scheme is extended to guarantee Quality-of-Service (QoS) by introducing feedback control of the accuracy bounds of the models. We show the viability of the proposed solution by implementing and evaluating it on a 2-tier sensor network testbed, emulating collaborative search-and-rescue tasks with realistic workloads. Our evaluation results demonstrate that PRIDE can achieve timely sensor data sharing among a large number of devices in a highly robust and controlled manner.

7.
Similarity search in P2P systems has attracted a lot of attention recently, and several important applications, such as distributed image search, can profit from the proposed distributed algorithms. In this paper, we address the challenging problem of efficiently processing range queries in metric spaces, where data is horizontally distributed across a super-peer network. Our approach relies on SIMPEER (Doulkeridis et al. in Proceedings of VLDB, pp. 986–997, 2007), a framework that dynamically clusters peer data in order to build distributed routing information at the super-peer level. SIMPEER allows exact range and nearest neighbor queries to be evaluated in a distributed manner that reduces communication cost, network latency, bandwidth consumption, and computational overhead at each individual peer. In this paper, we extend SIMPEER by focusing on efficient range query processing and providing recall-based guarantees for the quality of the result retrieved so far. This is especially useful for range queries that produce result sets of high cardinality and incur high processing costs, where the complete result set would overwhelm the user. Our framework employs statistics to estimate an upper limit on the number of possible results for a range query, so each super-peer may decide not to propagate the query further and thereby reduce the scope of the search. We provide an experimental evaluation of our framework and show that our approach performs efficiently, even with a high degree of distribution.
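
Why cluster summaries help with range queries in a metric space can be shown with a generic triangle-inequality test. This is a sketch of the general idea, not SIMPEER's actual protocol or data structures: each cluster is summarized by a hypothetical (center, radius, members) triple, Euclidean distance stands in for the metric, and a cluster is skipped whenever the query ball cannot intersect it.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Cluster = Tuple[Point, float, List[Point]]  # (center, covering radius, members)


def range_query(clusters: Sequence[Cluster], q: Point,
                radius: float) -> Tuple[List[Point], int]:
    """Return the points within `radius` of q and the number of clusters actually scanned.

    A cluster is pruned when d(q, center) > cluster_radius + radius: by the
    triangle inequality, none of its members can then lie within the query range.
    """
    results: List[Point] = []
    scanned = 0
    for center, cluster_radius, members in clusters:
        if math.dist(q, center) > cluster_radius + radius:
            continue  # the query ball and the cluster ball do not overlap
        scanned += 1
        results.extend(p for p in members if math.dist(q, p) <= radius)
    return results, scanned


# Hypothetical summaries of the data held by two peers.
clusters = [
    ((0.0, 0.0), 1.5, [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.0)]),
    ((10.0, 10.0), 1.0, [(10.0, 10.0), (10.5, 10.0)]),
]
hits, scanned = range_query(clusters, (0.5, 0.5), 1.0)
print(hits, scanned)
```

In a distributed setting the pruned cluster corresponds to a peer that never needs to be contacted, which is where savings in communication would come from.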

8.
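Data Query Processing in Sensor Networks   Total citations: 1, self-citations: 0, citations by others: 1
Traditional sensor networks use centralized data management, which cannot effectively substitute cheap local computation for expensive network communication. This work takes a distributed approach: a query proxy layer is added between the application layer and the network layer of the sink node, and user queries are dispatched to the relevant sensor nodes for processing. Reducing the amount of data transmitted over the network in this way lowers the energy consumption of the sensor nodes and extends the lifetime of the network.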
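
The trade of local computation for communication can be sketched as follows. This is an illustration under stated assumptions, not the design from the paper: the `SensorNode` class, the region-based routing, and the average-above-threshold query are all hypothetical. Each relevant node filters its own readings and returns a small partial aggregate, and the proxy at the sink merges those partial results instead of receiving raw data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SensorNode:
    node_id: int
    region: str
    readings: List[float]

    def local_partial(self, threshold: float) -> Tuple[int, float]:
        """Evaluate the predicate locally and return only (count, sum)."""
        matching = [r for r in self.readings if r > threshold]
        return len(matching), sum(matching)


def proxy_average(nodes: List[SensorNode], region: str,
                  threshold: float) -> Tuple[Optional[float], int]:
    """Sink-side proxy: dispatch the query to nodes in `region` and merge their partials.

    Returns the average of matching readings and the number of reply messages.
    """
    total_count, total_sum, messages = 0, 0.0, 0
    for node in nodes:
        if node.region != region:
            continue  # irrelevant nodes are never contacted
        count, partial_sum = node.local_partial(threshold)
        messages += 1
        total_count += count
        total_sum += partial_sum
    average = total_sum / total_count if total_count else None
    return average, messages


nodes = [
    SensorNode(1, "north", [20.0, 30.0, 40.0]),
    SensorNode(2, "north", [10.0, 50.0]),
    SensorNode(3, "south", [99.0]),
]
average, messages = proxy_average(nodes, "north", 25.0)
print(average, messages)
```

In this toy run two fixed-size replies replace the six raw readings a centralized collector would have pulled in, and the node outside the queried region stays silent.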

9.
This paper introduces a new approach that integrates an evolutionary mechanism with a distributed query sensor cover algorithm for optimal query execution in self-organized wireless sensor networks (WSN). An algorithm based on an evolutionary technique is proposed, with problem-specific genetic operators to improve computing efficiency. Redundancy within a sensor network can be exploited to reduce the communication cost incurred in executing spatial queries. Any reduction in communication cost results in more efficient use of battery energy, which is very limited in sensors. Our objective is to self-organize the network, in response to a query, into a topology that involves an optimal subset of sensors sufficient to process the query, subject to constraints on connectivity, coverage, energy consumption, cover size, and communication overhead. Query processing must incorporate energy awareness into the system by reducing the total energy consumption and hence increasing the lifetime of the sensor cover, which is beneficial for large, long-running queries. Experiments have been carried out on networks with different sensor transmission radii, different query sizes, and different network configurations. Through extensive simulations, we show that the proposed technique results in substantial energy savings in a sensor network. Compared with other techniques, the proposed technique yields a significantly more energy-efficient query cover, with lower communication cost and smaller cover size.

10.
A common approach to improving the reliability of query results based on error-prone sensors is to introduce redundant sensors. However, using multiple sensors to generate the value for a data item can be expensive, especially in wireless environments where continuous queries are executed. Moreover, some sensors may not be working properly, and their readings need to be discarded. In this paper, we propose a statistical approach to deciding which sensor nodes to use to answer a query. In particular, we propose to solve the problem with the aid of the continuous probabilistic query (CPQ), which was originally used to manage uncertain data and is associated with a probabilistic guarantee on the query result. Based on the historical data values from the sensor nodes, the query type, and the requirement on the query, we present methods to select an appropriate set of sensors and provide reliable answers for several common aggregate queries. Our statistics-based sensor node selection algorithm is evaluated in a number of simulation experiments, which show that a small number of sensor nodes can provide accurate and robust query results.
