首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
A. Gambacorta  G. Leoni 《Calcolo》1982,19(4):397-414
In information retrieval systems storage occupancy is in general disregarded to take advantage in running time; in an environment in which retrieval time is not a primary requirement (this is the case of Data Base Machines), transposed files become a very interesting structure; in fact they allow a considerable space saving with respect to fast access structures, such as inverted files. But transposed files, in most cases, are not practicable, as theyare to be searched by a sequential scanning. In this paper a structure is defined to implement transposed files, called 2-B tree, which is a sort of two-dimensionalB-tree, and avoid sequential scanning; search in a trasposed file stored as a 2-B tree in π memory pages is 0 (π1/2), while insertion and deletion are 0 (logπ). Moreover, 2-B tree organization can achieve an improvement in storage requirements up to 42% with respect to inverted file organization.   相似文献   

2.
Hash tables on external memory are commonly used for indexing in database management systems. In this paper we present an algorithm that, in an asymptotic sense, achieves the best possible I/O and space complexities. Let B denote the number of records that fit in a block, and let N denote the total number of records. Our hash table uses I/Os, expected, for looking up a record (no matter if it is present or not). To insert, delete or change a record that has just been looked up requires I/Os, amortized expected, including I/Os for reorganizing the hash table when the size of the database changes. The expected external space usage is times the optimum of N/B blocks, and just O(1) blocks of internal memory are needed.  相似文献   

3.
We consider the problem of indexing a set of objects moving in d-dimensional spaces along linear trajectories. A simple external-memory indexing scheme is proposed to efficiently answer general range queries. The following are examples of the queries that can be answered by the proposed method: report all moving objects that will (i) pass between two given points within a specified time interval; (ii) become within a given distance from some or all of a given set of other moving objects. Our scheme is based on mapping the objects to a dual space, where queries about moving objects are transformed into polyhedral queries concerning their speeds and initial locations. We then present a simple method for answering such polyhedral queries, based on partitioning the space into disjoint regions and using a B+-tree to index the points in each region. By appropriately selecting the boundaries of each region, we guarantee an average search time that matches a known lower bound for the problem. Specifically, for a fixed d, if the coordinates of a given set of N points are statistically independent, the proposed technique answers polyhedral queries, on the average, in O((N/B)1−1/d⋅(log B N)1/d+K/B) I/O's using O(N/B) space, where B is the block size, and K is the number of reported points. Our approach is novel in that, while it provides a theoretical upper bound on the average query time, it avoids the use of complicated data structures, making it an effective candidate for practical applications. The proposed index is also dynamic in the sense that it allows object insertion and deletion in an amortized update cost of log B(N) I/O's. Experimental results are presented to show the superiority of the proposed index over other methods based on R-trees. recommend Ahmed Elmagarmid  相似文献   

4.
In this paper, we present lower and upper bounds on the size of limited width, bounded and unbounded fan-out parallel prefix circuits. The lower bounds on the sizes of such circuits are a function of the depth, width, and number of inputs. The size requirement of an N input bounded fan-out parallel prefix circuit having limited width W and extra depth k (the difference between allowed and minimum possible depth) is shown to be (N log2 W/2 k + N) for k log2 W. This implies that insisting on minimum depth causes the circuit size to be nonlinear, while as little as log2log2 W of extra depth can possibly reduce the size to linear. Also, we show that there is a clear difference between the two cases of bounded and unbounded fan-out by proving the size of a limited width, unbounded fan-out parallel prefix circuit lies between a lower bound of ((2 + 21–k /3)N) and an upper bound of O((2 + 21–k )N).Uniform, systolic constructions of limited width parallel prefix circuits are provided here and shown to be asymptotically optimal. By associating the width of the circuit with the number of processors and the fan-out capabilities of the circuit with the interconnection structure of a multiprocessor, time- and processor-efficient algorithms may be developed.  相似文献   

5.
In this paper we study the external memory planar point enclosure problem: Given N axis-parallel rectangles in the plane, construct a data structure on disk (an index) such that all K rectangles containing a query point can be reported I/O-efficiently. This problem has important applications in e.g. spatial and temporal databases, and is dual to the important and well-studied orthogonal range searching problem. Surprisingly, despite the fact that the problem can be solved optimally in internal memory with linear space and O(log N+K) query time, we show that one cannot construct a linear sized external memory point enclosure data structure that can be used to answer a query in O(log  B N+K/B) I/Os, where B is the disk block size. To obtain this bound, Ω(N/B 1−ε ) disk blocks are needed for some constant ε>0. With linear space, the best obtainable query bound is O(log 2 N+K/B) if a linear output term O(K/B) is desired. To show this we prove a general lower bound on the tradeoff between the size of the data structure and its query cost. We also develop a family of structures with matching space and query bounds. An extended abstract of this paper appeared in Proceedings of the 12th European Symposium on Algorithms (ESA’04), Bergen, Norway, September 2004, pp. 40–52. L. Arge’s research was supported in part by the National Science Foundation through RI grant EIA–9972879, CAREER grant CCR–9984099, ITR grant EIA–0112849, and U.S.-Germany Cooperative Research Program grant INT–0129182, as well as by the US Army Research Office through grant W911NF-04-01-0278, by an Ole Roemer Scholarship from the Danish National Science Research Council, a NABIIT grant from the Danish Strategic Research Council and by the Danish National Research Foundation. V. Samoladas’ research was supported in part by a grant co-funded by the European Social Fund and National Resources-EPEAEK II-PYTHAGORAS. K. Yi’s research was supported in part by the National Science Foundation through ITR grant EIA–0112849, U.S.-Germany Cooperative Research Program grant INT–0129182, and Hong Kong Direct Allocation Grant (DAG07/08).  相似文献   

6.
Grouped sweeping scheduling for DASD-based multimedia storage management   总被引:2,自引:0,他引:2  
This paper presents a new formulation of DASD (direct access storage device) disk arm scheduling schemes for multimedia storage management. The formulation, referred to as grouped sweeping scheduling (GSS), provides a framework for minimizing the buffer space required in the retrieval of multimedia streams. In GSS, the set ofn media streams that are to be served concurrently is divided intog groups. Groups are served in fixed order and streams within each group are served in an elevator-like SCAN scheme. Hence, the fixed order (FIFO) and SCAN schemes are special cases of GSS wheng=n andg=1, respectively. In this paper an optimization problem is formulated in which the buffer requirement is minimized with respect to the two design parameters:g and the size of the service unit, i.e. the number of blocks accessed in each service cycle. This formulation initially assumes that all media streams have the same playout requirements. A procedure of complexityO(1) is developed in computing the optimum solution to this problem. The proof of optimality and comparisons between the optimized GSS scheme and FIFO and SCAN are also presented. The paper also discusses the effect of disk arrays in the GSS formulation and issues related to operating GSS in a dynamic setting where streams arrive and depart in random order. Finally, the GSS scheme is extended to support heterogeneous media streams where each stream may have its own playout requirement.This paper extends earlier results that were presented at NOSDAV'92 [1] and Multimedia'93 [2].  相似文献   

7.
This paper studies aggregate search in transaction time databases. Specifically, each object in such a database can be modeled as a horizontal segment, whose y-projection is its search key, and its x-projection represents the period when the key was valid in history. Given a query timestamp q t and a key range , a count-query retrieves the number of objects that are alive at q t , and their keys fall in . We provide a method that accurately answers such queries, with error less than , where N alive(q t ) is the number of objects alive at time q t , and ɛ is any constant in (0, 1]. Denoting the disk page size as B, and nN / B, our technique requires O(n) space, processes any query in O(log B n) time, and supports each update in O(log B n) amortized I/Os. As demonstrated by extensive experiments, the proposed solutions guarantee query results with extremely high precision (median relative error below 5%), while consuming only a fraction of the space occupied by the existing approaches that promise precise results.  相似文献   

8.
By using anN-loop shift-register structure called a uniform ladder,N records can be sorted by a simplified adaptation of the odd-even transposition-sort algorithm to finish in (N + 1)/2 loop times (periods) using (N – 1) comparators. The sorting can be overlapped with input/output; the percentage of unoverlapped sorting times is less than 20% of the total time with a single ladder, less than 6% using two ladders, and is zero with a sufficient number of ladders.Presented at the Second International Magnetic Bubble Conference, Eindhoven, The Netherlands, August 1976.  相似文献   

9.
This paper gives tight bounds on the cost of cache-oblivious searching. The paper shows that no cache-oblivious search structure can guarantee a search performance of fewer than lg elog  B N memory transfers between any two levels of the memory hierarchy. This lower bound holds even if all of the block sizes are limited to be powers of 2. The paper gives modified versions of the van Emde Boas layout, where the expected number of memory transfers between any two levels of the memory hierarchy is arbitrarily close to [lg e+O(lg lg B/lg B)]log  B N+O(1). This factor approaches lg e≈1.443 as B increases. The expectation is taken over the random placement in memory of the first element of the structure.  相似文献   

10.
This paper gives efficient algorithms for the muiticommodity flow problem for two classes C12 and C01 of planar undirected graphs. Every graph in C12 has two face boundaries B1 and B2 such that each of the source-sink pairs lies on B1 or B2. On the other hand, every graph inC 01 has a face boundaryB 1 such that some of the source-sink pairs lie onB 1 and all the other pairs share a common sink lying onB 1. The algorithms run inO(kn +nT(n)) time if a graph hasn vertices andk source-sink pairs andT(n) is the time required for finding the single-source shortest paths in a planar graph ofn vertices.  相似文献   

11.
In this paper, we present two linear-size external memory data structures for approximate range searching. Our first structure, the BAR-B-tree, stores a set of N points in ℝ d and can report all points inside a query range Q by accessing O(log  B N+ε γ +k ε /B) disk blocks, where B is the disk block size, γ=1−d for convex queries and γ=−d otherwise, and k ε is the number of points lying within a distance of ε⋅diam (Q) to the query range Q. Our second structure, the object-BAR-B-tree, is able to store objects of arbitrary shapes of constant complexity and provides similar query guarantees. In addition, both structures can support other types of range searching queries such as range aggregation and nearest-neighbor. Finally, we present I/O-efficient algorithms to build these structures.  相似文献   

12.
An active area of research in supercomputing is concerned with mapping certain finite sums, such as discrete Fourier transforms, onto arrays of processors. This paper presents systolic mapping techniques that exploit the parallelism inherent in discrete Fourier transforms. It is established that, for anM-dimensional signal, parallel executions of such transforms are closely related to mappings of an (M + 1)-dimensional finite vector space into itself. Three examples of such parallel schemes are then described for the discrete Fourier transform of a two-dimensional finite extent sequence of sizeN 1 ×N 2. The first is a linear array ofN 1 +N 2 – 1 processors and takesO(N 1 N 2) steps. The second is anN 1 ×N 2 rectangular array of processors and takesO(N 1 +N 2) steps, and the third is a hexagonal array which usesN 1 N 2 + (N 2 – 1)(N 1 +N 2 – 1) processors andO(N 1 +N 2) steps. All three mappings are optimal in that they achieve asymptotically the highest speedup possible over the sequential execution of the same transform, and can easily be generalized to higher dimensions.  相似文献   

13.
Mehlhorn  Sanders 《Algorithmica》2003,35(1):75-93
Abstract. We consider the simple problem of scanning multiple sequences. There are k sequences of total length N which are to be scanned concurrently. One pointer into each sequence is maintained and an adversary specifies which pointer is to be advanced. The concept of scanning multiple sequence is ubiquitous in algorithms designed for hierarchical memory. In the external memory model of computation with block size B , a memory consisting of m blocks, and at most m sequences the problem is trivially solved with N/B memory misses by reserving one block of memory for each sequence. However, in a cache memory with associativity a , every access may lead to a cache fault if k > a . For a direct mapped cache (a = 1 ) two sequences suffice. We show that by randomizing the starting addresses of the sequences the number of cache misses can be kept to O(N/B) provided that k = O(m/B 1/a ) , i.e., the number of sequences that can be supported is decreased by a factor B 1/a . We also show a corresponding lower bound. Our result leads to a general method for converting sequence based algorithms designed for the external memory model of computation to cache memory even for caches with small associativity.  相似文献   

14.
In many geometrical applications data needs to be sorted in more than one dimension to speed up running time. For example, the evaluation of set operations on polygons, and hidden-surface algorithms. We describe an efficient algorithm for doing this, which we call Box Sort. Given areN boxes in an-dimensional space. A box is a spatial region bounded by (hyper-)planes orthogonal to the coordinate axes. Box Sort is a multidimensional sorting method for theN boxes, which allows us to perform quick range queries (box searches) with respect to these boxes. When the givenN boxes are sufficiently scattered around then-dimensional space, the box search for an arbitrary query box of the same order of size as the sorted boxes can be carried out inO(logN) time. The necessary memory requirement for Box Sort is invariablyO(N).  相似文献   

15.
The database community has devoted extensive amount of efforts to indexing and querying temporal data in the past decades. However, insufficient amount of attention has been paid to temporal ranking queries. More precisely, given any time instance t, the query asks for the top-k objects at time t with respect to some score attribute. Some generic indexing structures based on R-trees do support ranking queries on temporal data, but as they are not tailored for such queries, the performance is far from satisfactory. We present the Seb-tree, a simple indexing scheme that supports temporal ranking queries much more efficiently. The Seb-tree answers a top-k query for any time instance t in the optimal number of I/Os in expectation, namely, O(logB \fracNB+\frackB){O\left({\rm log}_B\,\frac{N}{B}+\frac{k}{B}\right)} I/Os, where N is the size of the data set and B is the disk block size. The index has near-linear size (for constant and reasonable k max values, where k max is the maximum value for the possible values of the query parameter k), can be constructed in near-linear time, and also supports insertions and deletions without affecting its query performance guarantee. Most of all, the Seb-tree is especially appealing in practice due to its simplicity as it uses the B-tree as the only building block. Extensive experiments on a number of large data sets, show that the Seb-tree is more than an order of magnitude faster than the R-tree based indexes for temporal ranking queries.  相似文献   

16.
Edmonds  Pruhs 《Algorithmica》2008,36(3):315-330
Abstract. We investigate server scheduling policies to minimize average user perceived latency in pull-based client-server systems (systems where multiple clients request data from a server) where the server answers requests on a multicast/ broadcast channel. We first show that there is no O(1) -competitive algorithm for this problem. We then give a method to convert any nonclairvoyant unicast scheduling algorithm A to nonclairvoyant multicast scheduling algorithm B . We show that if A works well, when jobs can have parallel and sequential phases, then B works well if it is given twice the resources. More formally, if A is an s -speed c -approximation unicast algorithm, then its counterpart algorithm B is a 2s -speed c -approximation multicast algorithm. It is already known [5] that Equi-partition, which devotes an equal amount of processing power to each job, is a (2 + ε) -speed O(1 + 1/ε) -approximation algorithm for unicast scheduling of such jobs. Hence, it follows that the algorithm {BEQUI}, which broadcasts all requested files at a rate proportional to the number of outstanding requests for that file, is a (4 + ε) -speed O(1 + 1/ε) -approximation algorithm. We give another algorithm BEQUI-EDF and show that BEQUI-EDF is also a (4 + ε) -speed O(1 + 1/ε) -approximation algorithm. However, BEQUI-EDF has the advantage that the maximum number of preemptions is linear in the number of requests, and the advantage that no preemptions occur if the data items have unit size.  相似文献   

17.
Excessive buffer requirement to handle continuous-media playbacks is an impediment to cost- effective provisioning for on-line video retrieval. Given the skewed distribution of video popularity, it is expected that often there are concurrent playbacks of the same video file within a short time interval. This creates an opportunity to batch multiple requests and to service them with a single stream from the disk without violating the on-demand constraint. However, there is a need to keep data in memory between successive uses to do this. This leads to a buffer space trade-off between servicing a request in memory mode vs. servicing it in disk-mode. In this work, we develop a novel algorithm to minimize the buffer requirement to support a set of concurrent playbacks. One of the beauties of the proposed scheme is that it enables the server to dynamically adapt to the changing workload while minimizing the total buffer space requirement. Our algorithm makes a significant contribution in decreasing the total buffer requirement, especially when the user access pattern is biased in favor of a small set of files. The idea of the proposed scheme is modeled in detail using an analytical formulation, and optimality of the algorithm is proved. An analytical framework is developed so that the proposed scheme can be used in combination with various existing disk-scheduling strategies. Our simulation results confirm that under certain circumstances, it is much more resource efficient to support some of the playbacks in memory mode and subsequently the proposed scheme enables the server to minimize the overall buffer space requirement.  相似文献   

18.
Aperiodic sorteris a sorting network that is a cascade of a number of identicalblocks, where outputiof each block is inputiof the next block. Previously, Dowd, Perl, Rudolph, and Saks introduced thebalancedmerging network, withN=2kinputs/outputs and log Nstages of comparators. Using a very intricate proof, they showed that a cascade of log Nsuch blocks constitutes a sorting network. In this paper, we introduce a large class of merging networks with the same periodic property. This class contains 2N/2−1networks, whereN=2kis the number of inputs. The balanced merger is one network in this class. Other networks use fewer comparators. We provide a very simple and elegant correctness proof based on the recursive structure of the networks.  相似文献   

19.
This paper applies matrix-analytic approach to the examination of the loss behavior of a space priority queue. In addition to the evaluation of the long-term high-priority and low-priority packet loss probabilities, we examine the bursty nature of packet losses by means of conditional statistics with respect to critical and non-critical periods that occur in an alternating manner. The critical period corresponds to having more than a certain number of packets in the buffer; non-critical corresponds to the opposite. Hence there is a threshold buffer level that splits the state space into two. By such a state-space decomposition, two hypothesized Markov chains are devised to describe the alternating renewal process. The distributions of various absorbing times in the two hypothesized Markov chains are derived to compute the average durations of the two periods and the conditional high-priority packet loss probability encountered during a critical period. These performance measures greatly assist the space priority mechanism for determining a proper threshold. The overall complexity of computing these performance measures is of the order O(K2m13m23), where K is the buffer capacity, and m1 and m2 are the numbers of phases of the underlying Markovian structures for the high-priority and low-priority packet arrival processes, respectively. Thus the results obtained are computationally tractable and numerical results show that, by choosing a proper threshold, a space priority queue not only can maintain the quality of service for the high-priority traffic but also can provide the near-optimum utilization of the capacity for the low-priority traffic.  相似文献   

20.
The Interpolation-Based Bintree (IBB) is a storage-saving encodingscheme for representing binary images. In this paper, we presentefficient parallel algorithms for important manipulations on IBBcoded images (also called bincodes). Given a set of bincodes, e.g.,B with size n, the 4-neighborfinding and the diagonal-neighbor finding algorithms onB can be accomplished in O(1) time on an n x n mesh computer with multiple broadcasting(MMB). Given two sets of bincodes, B 1 and B 2, with size n and m n, respectively, the intersection and unionoperations for B 1 and B 2 can be performedin O(1) time on an MMB using processors. With n 2 processors, the complementoperation for B can be performed in O(logn) time.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号