Similar literature
 Found 20 similar documents; search took 625 ms
1.
Wireless sensor networks are used in a large array of applications to capture, collect, and analyze physical environmental data. Many existing sensor systems instruct sensor nodes to report their measurements to central repositories outside the network, which is expensive in energy cost. Recent technological advances in flash memory have given rise to the development of storage-centric sensor networks, where sensor nodes are equipped with high-capacity flash memory storage such that sensor data can be stored and managed inside the network to reduce expensive communication. This novel architecture calls for new data management techniques to fully exploit distributed in-network data storage. This paper describes some of our research on distributed query processing in such flash-based sensor networks. Of particular interest are the issues that arise in the design of storage management and indexing structures combining sensor system workload and read/write/erase characteristics of flash memory.

2.
Wireless sensor networks are used in a large array of applications to capture, collect, and analyze physical environmental data. Many existing sensor systems instruct sensor nodes to report their measurements to central repositories outside the network, which is expensive in energy cost. Recent technological advances in flash memory have given rise to the development of storage-centric sensor networks, where sensor nodes are equipped with high-capacity flash memory storage such that sensor data can be stored and managed inside the network to reduce expensive communication. This novel architecture calls for new data management techniques to fully exploit distributed in-network data storage. This paper describes some of our research on distributed query processing in such flash-based sensor networks. Of particular interest are the issues that arise in the design of storage management and indexing structures combining sensor system workload and read/write/erase characteristics of flash memory.

3.
A number of indexing techniques have been proposed in recent times for optimizing queries on XML and other semi-structured data models. Most semi-structured models use tree-like structures and query languages (XPath, XQuery, etc.) that rely on regular path expressions to optimize query processing. In this paper, we propose two algorithms, the Entry-point algorithm (EPA) and the Two-point Entry algorithm, that exploit different types of indices to process XPath queries efficiently. We discuss and compare two approaches to implementing the EPA, namely Root-first and Bottom-first. We present experimental results for the algorithms using XML benchmark queries and data, and compare them with those of traditional query processing methods, with and without indexes, and with the ToXin indexing approach. Our algorithms perform better than both the traditional methods and the ToXin indexing approach.

4.
5.
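A Data Recovery Mechanism for a Flash File System   Total citations: 1, self-citations: 0, citations by others: 1
张延园, 焦磊. Computer Engineering (《计算机工程》), 2008, 34(14): 283-285
Based on CFFS, an embedded file system for large-capacity NAND flash memory, and building on the way CFFS indexes the data on the chip, this paper introduces data structures such as reference nodes and page bitmaps, together with the corresponding algorithms, into CFFS and proposes a data recovery mechanism. When data is corrupted, the mechanism restores the flash file system to a consistent historical state, ensuring the consistency and availability of the data on the chip.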

6.
Over the last decade, spatio-temporal database research has focused on the design of effective and efficient indexing structures in support of location-based queries such as predictive range queries and nearest neighbor queries. While a variety of indexing techniques have been proposed to accelerate the processing of updates and queries, not much attention has been paid to the updating protocol, which is another important factor affecting system performance. In this paper, we propose a generic and adaptive updating protocol for moving object databases that requires fewer updates between objects and the database server, thereby reducing the overall workload of the system. In contrast to the approach adopted by most conventional moving object database systems, where the exact locations and velocities last disclosed are used to predict object motion, we propose the concept of a spatio-temporal safe region to approximate possible future locations. Spatio-temporal safe regions provide a larger space of tolerance for moving objects, freeing them from location and velocity updates as long as the errors remain predictable in the database. To answer predictive queries accurately, the server is allowed to probe the latest status of objects when their safe regions are inadequate for returning the exact query results. Spatio-temporal safe regions are calculated and optimized by the database server with two contradictory objectives: reducing update workload while guaranteeing query accuracy and efficiency. To achieve this, we propose a cost model that estimates the composition of active and passive updates based on historical motion records and query distribution. Further performance improvements can be obtained by cutting more updates from the clients when the users of the system are comfortable with incomplete but accuracy-bounded query results. We have conducted extensive experiments to evaluate our proposal on a variety of popular indexing structures. The results confirm the viability, robustness, accuracy and efficiency of our proposed protocol.
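As a rough illustration of the safe-region idea in the abstract above, the following minimal sketch shows a client that reports its position only when it leaves its assigned region. The rectangular shape, the `SafeRegion` class, and the function names are hypothetical and chosen for illustration; the abstract does not specify how the paper represents or computes safe regions.

```python
from dataclasses import dataclass


@dataclass
class SafeRegion:
    """Hypothetical safe region: an axis-aligned rectangle valid over a time interval."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    t_start: float
    t_end: float

    def contains(self, x: float, y: float, t: float) -> bool:
        # The object is "safe" only while it is inside the rectangle
        # and the region has not yet expired.
        return (self.t_start <= t <= self.t_end
                and self.x_min <= x <= self.x_max
                and self.y_min <= y <= self.y_max)


def needs_update(region: SafeRegion, x: float, y: float, t: float) -> bool:
    """Client-side check: contact the server only when outside the safe region."""
    return not region.contains(x, y, t)


def count_updates(region: SafeRegion, positions) -> int:
    """Count how many (x, y, t) samples would trigger an update message."""
    return sum(needs_update(region, x, y, t) for x, y, t in positions)
```

In this simplified form, a larger region means fewer update messages but a looser bound on where the object may be, which is the update-workload versus query-accuracy tension the abstract describes.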

7.
Spatiotemporal objects – that is, objects that evolve over time – appear in many applications. Due to the nature of such applications, storing the evolution of objects through time in order to answer historical queries (queries that refer to past states of the evolution) requires a very large specialized database, what is termed in this article a spatiotemporal archive. Efficient processing of historical queries on spatiotemporal archives requires equally sophisticated indexing schemes. Typical spatiotemporal indexing techniques represent the objects using minimum bounding regions (MBR) extended with a temporal dimension, which are then indexed using traditional multidimensional index structures. However, rough MBR approximations introduce excessive overlap between index nodes, which deteriorates query performance. This article introduces a robust indexing scheme for answering spatiotemporal queries more efficiently. A number of algorithms and heuristics are elaborated that can be used to preprocess a spatiotemporal archive in order to produce finer object approximations, which, in combination with a multiversion index structure, will greatly improve query performance in comparison to the straightforward approaches. The proposed techniques introduce a query efficiency vs. space tradeoff that can help tune a structure according to available resources. Empirical observations for estimating the necessary amount of additional storage space required for improving query performance by a given factor are also provided. Moreover, heuristics for applying the proposed ideas in an online setting are discussed. Finally, a thorough experimental evaluation is conducted to show the merits of the proposed techniques. Edited by B. Seeger. A short version of this article appeared as “Efficient indexing of spatiotemporal objects” in the Proceedings of Extending Database Technology 2002 [19]. This work was partially supported by NSF grants IIS-9907477, EIA-9983445, NSF IIS 9984729, NSF ITR 0220148, NSF IIS-0133825, NRDRP, and the U.S. Department of Defense.

8.
Complex queries on trajectory data are increasingly common in applications involving moving objects. MBR or grid-cell approximations of trajectories perform suboptimally since they do not capture the smoothness and lack of internal area of trajectories. We describe a parametric space indexing method for historical trajectory data, approximating a sequence of movement functions with a single continuous polynomial. Our approach works well, yielding much finer approximation quality than MBRs. We present the PA-tree, a parametric index that uses this method, and show through extensive experiments that PA-trees have excellent performance for offline and online spatio-temporal range queries. Compared to MVR-trees, PA-trees are an order of magnitude faster to construct and incur I/O cost for spatio-temporal range queries lower by a factor of 2-4. SETI is faster than our method for index construction and timestamp queries, but incurs twice the I/O cost for time interval queries, which are much more expensive and are the bottleneck in online processing. Therefore, the PA-tree is an excellent choice for both offline and online processing of historical trajectories.

9.
The primary issues that affect the design of indexing methods are examined, and several structures and algorithms for specific cases are proposed. The append-only tree (AP-tree) structure indexes data for append-only databases to help event-join optimization and queries that can exploit the inherent time ordering of such databases. Two-variable indexing for the surrogate and time is discussed. It is shown that a nested index could be a very efficient structure in this context and is preferable to a composite B-tree or an index that involves linear lists of historical tuples. The problems of indexing time intervals, as related to nonsurrogate joint-indexing, are discussed. Several algorithms to partition the time line are introduced. A two-variable AT index based on nested indexing is outlined.

10.
ADS: the adaptive data series index   Total citations: 1, self-citations: 0, citations by others: 1
Numerous applications continuously produce large amounts of data series, and in several time-critical scenarios analysts need to be able to query these data as soon as they become available. This, however, is not currently possible with state-of-the-art indexing methods for very large data series collections. In this paper, we present the first adaptive indexing mechanism specifically tailored to the problem of indexing and querying very large data series collections. We present a detailed design and evaluation of our method using approximate and exact query algorithms with both synthetic and real data sets. Adaptive indexing significantly outperforms previous solutions, gracefully handling large data series collections and reducing the data-to-query delay: by the time state-of-the-art indexing techniques finish indexing 1 billion data series (and before answering even a single query), our method has already answered \(3 \times 10^5\) queries.

11.
Indexing moving objects (MO) has been a hot topic in the field of moving objects databases for many years. An impressive number of access methods have been proposed to optimize the processing of MO-related queries. Several methods have focused on spatio-temporal range queries, which represent the foundation of MO trajectory queries. Surprisingly, only a few of them consider that the objects' movements are constrained. This is an important aspect for several reasons, ranging from better capturing the relationship between the trajectory and the network space to more accurate trajectory representation with lower storage requirements. In this paper, we propose T-PARINET, an access method to efficiently retrieve the trajectories of objects moving in networks. T-PARINET is designed for continuous indexing of trajectory data flows. The cornerstone of T-PARINET is PARINET, an efficient index for historical trajectory data. The structure of PARINET is based on a combination of graph partitioning and a set of composite B+-tree local indexes. Because the network can be modeled using graphs, the partitioning of the trajectory data makes use of graph partitioning theory and can be tuned for a given query load and a given data distribution in the network space. The tuning process is built on a good-quality cost model that is supplied with PARINET. The advantage of having a cost model is twofold: it allows a better integration of the index into the query optimizer of any DBMS, and it permits tuning the index structure for better performance. The tuning process can be performed before index creation in the case of historical data, or online in the case of indexing data flows. In fact, massive online updates can degrade the index quality, which can be measured by the cost model. We propose a specific maintenance process that results in T-PARINET. We study different types of queries and provide an optimized configuration for several scenarios. T-PARINET can easily be integrated into any RDBMS, which is an essential asset, particularly for industrial or commercial applications. The experimental evaluation under an off-the-shelf DBMS shows that our method is robust. It also significantly outperforms the reference R-tree-based access methods for in-network trajectory databases.

12.
This paper defines direction relations (e.g., north, northeast) between two-dimensional objects and shows how they can be efficiently retrieved using B-, KDB- and R-tree-based data structures. Essentially, our work studies optimisation techniques for 2D range queries that arise during the processing of direction relations. We test the efficiency of alternative indexing methods through extensive experimentation and present analytical models that estimate their performance. The analytical estimates are shown to be very close to the actual results and can be used by spatial query optimizers in order to predict the retrieval cost. In addition, we implement modifications of the existing structures that yield better performance for certain queries. We conclude the paper by discussing the most suitable method depending on the type of the range and the properties of the data.

13.
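For rectangular spatial data objects, this work builds on the traditional CIF quadtree indexing technique and uses the Hadoop platform and the MapReduce parallel programming model. Following a divide-and-conquer approach, it partitions the data space and designs parallel algorithms for index creation, intersection queries, and region deletion that are suited to distributed environments. On this basis, experiments vary the number of rectangular objects in the data set and the number of map tasks to analyze the efficiency of parallel index creation and intersection queries. The experimental results show that, for large data sets and for multiple data sets, parallel creation and querying improve processing efficiency.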

14.
Queries on semistructured data are hard to process due to the complex nature of the data and call for specialized techniques. Existing path-based indexes and query processing algorithms are not efficient for searching complex structures beyond simple paths, even when the queries are highly selective. We introduce the definition of minimal infrequent structures (MIS), which are structures that 1) exist in the data, 2) are not frequent with respect to a support threshold, and 3) have only frequent substructures. By indexing the occurrences of MIS, we can efficiently locate the highly selective substructures of a query, improving search performance significantly. An efficient data mining algorithm is proposed, which finds the minimal infrequent structures. Their occurrences in the XML data are then indexed by a lightweight data structure and used as a fast filter step in query evaluation. We validate the efficiency and applicability of our methods through experimentation on both synthetic and real data.
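The three-part MIS definition in the abstract above can be made concrete with a small sketch. As a loudly stated simplification, the code below treats a "structure" as a plain set of labels and a "substructure" as a subset, rather than the XML tree structures the paper works with; the brute-force enumeration and all function names are hypothetical and are not the paper's mining algorithm.

```python
from itertools import combinations


def support(structure, transactions):
    """Number of transactions that contain every label of the structure."""
    return sum(1 for t in transactions if structure <= t)


def minimal_infrequent(transactions, min_support, max_size=3):
    """Brute-force search for minimal infrequent label sets (simplified MIS)."""
    items = sorted(set().union(*transactions))
    result = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            sup = support(candidate, transactions)
            # Conditions 1 and 2: it occurs in the data but is not frequent.
            if sup == 0 or sup >= min_support:
                continue
            # Condition 3: all substructures are frequent. Checking the
            # (size - 1)-subsets suffices, because a subset's support is
            # never lower than that of its superset.
            subsets = [frozenset(c) for c in combinations(combo, size - 1)] if size > 1 else []
            if all(support(s, transactions) >= min_support for s in subsets):
                result.append(candidate)
    return result


# Usage: with a support threshold of 2, {a, b, c} occurs once while every
# pair occurs at least twice, so it is the only minimal infrequent set.
data = [frozenset("ab"), frozenset("ab"), frozenset("ac"),
        frozenset("bc"), frozenset("abc")]
mis = minimal_infrequent(data, min_support=2)
```

Such a rare-but-present combination is exactly the kind of highly selective fragment that, per the abstract, is worth indexing as a filter step.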

15.
In location-based services, a density query returns the regions with high concentrations of moving objects (MOs). The use of density queries can help users identify crowded regions so as to avoid congestion. Most of the existing methods try very hard to improve the accuracy of query results, but ignore query efficiency. However, response time is also an important concern in query processing and may have an impact on user experience. In order to address this issue, we present a new definition of continuous density queries. Our approach for processing continuous density queries is based on the new notion of a safe interval, using which the states of both dense and sparse regions are dynamically maintained. Two indexing structures are also used to index candidate regions for accelerating query processing and improving the quality of results. The efficiency and accuracy of our approach are shown through an experimental comparison with snapshot density queries.

16.
This paper studies the problem of how to conduct external sorting on flash drives while avoiding intermediate writes to the disk. The focus is on sort in portable electronic devices, where relations are only larger than the main memory by a small factor, and on sort as part of distributed processes where relations are frequently partially sorted. In such cases, sort algorithms that refrain from writing intermediate results to the disk have three advantages over algorithms that perform intermediate writes. First, on devices in which read operations are much faster than writes, such methods are efficient and frequently outperform Merge Sort. Secondly, they reduce flash cell degradation caused by writes. Thirdly, they can be used in cases where there is not enough disk space for the intermediate results. Novel sort algorithms that avoid intermediate writes to the disk are presented. An experimental evaluation, on different flash storage devices, shows that in many cases the new algorithms can extend the lifespan of the devices by avoiding unnecessary writes to the disk, while maintaining efficiency, in comparison with Merge Sort.

17.
With the rapidly increasing capacity of flash memory, flash-aware indexing techniques are highly desirable for flash devices. The unique features of flash memory, such as the erase-before-write constraint and the asymmetric read/write cost, severely deteriorate the performance of the traditional B+-tree algorithm. In this paper, we propose an optimized indexing method, called lazy-update B+-tree, to overcome the limitations of flash memory. The basic idea is to defer the committing of update requests to the B...

18.
Access Structures for Angular Similarity Queries   Total citations: 2, self-citations: 0, citations by others: 2
Angular similarity measures have been utilized by several database applications to define semantic similarity between various data types such as text documents, time-series, images, and scientific data. Although similarity searches based on Euclidean distance have been extensively studied in the database community, processing of angular similarity searches has been relatively untouched. Problems due to a mismatch in the underlying geometry, as well as the high dimensionality of the data, make current techniques either inapplicable or poor in performance. This brings up the need for effective indexing methods for angular similarity queries. We first discuss how to efficiently process such queries and propose effective access structures suited to angular similarity measures. In particular, we propose two classes of access structures, namely angular-sweep and cone-shell, which perform different types of quantization based on the angular orientation of the data objects. We also develop query processing algorithms that utilize these structures as dense indices. The proposed techniques are shown to be scalable with respect to both dimensionality and the size of the data. Our experimental results on real data sets from various applications show two to three orders of magnitude of speedup over the current techniques.

19.
Tree index structures are crucial components in data management systems. Existing tree index structures are designed with the implicit assumption that the underlying external memory storage is the conventional magnetic hard disk drive. This assumption will soon be invalid, as flash memory storage is increasingly adopted as the main storage medium in mobile devices, digital cameras, embedded sensors, and notebooks. Though it is direct and simple to port existing tree index structures to flash memory storage, that direct approach does not consider the unique characteristics of flash memory, i.e., slow write operations and the erase-before-update property, and would therefore result in suboptimal performance. In this paper, we introduce FAST (i.e., Flash-Aware Search Trees) as a generic framework for flash-aware tree index structures. FAST distinguishes itself from all previous attempts at flash memory indexing in two aspects: (1) FAST is a generic framework that can be applied to a wide class of data partitioning tree structures, including the R-tree and its variants, and (2) FAST achieves both efficiency and durability of read and write flash operations through memory flushing and crash recovery techniques. Extensive experimental results, based on an actual implementation of FAST inside the GiST index structure in PostgreSQL, show that FAST achieves better performance than its competitors.

20.
The existing NAND flash memory file systems have not taken into account multiple NAND flash memories for large-capacity storage. In addition, since large-capacity NAND flash memory is much more expensive than a hard disk drive of the same capacity, it is economically infeasible to build large-capacity flash drives. To resolve these problems, this paper suggests a new file system called NAFS for large-capacity storage with multiple small-capacity and low-cost NAND flash memories. It adopts a new cache policy, mount scheme, and garbage collection scheme in order to improve read and write performance, to reduce the mount time, and to improve the wear-leveling effectiveness. Our performance results show that NAFS is more suitable for large-capacity storage than conventional NAND file systems such as YAFFS2 and JFFS2 and a disk-based file system for Linux such as HDD-RAID5-EXT3, in terms of the read and write transfer rate using a double cache policy and the mount time using metadata stored on a separate partition. We also demonstrate that the wear-leveling effectiveness of NAFS can be improved by our adaptive garbage collection scheme.
