Similar Documents
1.
An optimal prefetching and I/O scheduling algorithm, L-OPT, for parallel I/O systems is presented, using a read-once model of block references. The algorithm uses knowledge of the next $L$ references, $L$-block lookahead, to create a minimal-length I/O schedule. For a system with $D$ disks and a buffer of capacity $m$ blocks, we show that the competitive ratio of L-OPT is $\Theta(\sqrt{mD/L})$ when $L \geq m$, which matches the lower bound of any prefetching algorithm with $L$-block lookahead. Tight bounds for the remaining ranges of lookahead are also presented. In addition, we show that L-OPT is the optimal offline algorithm: when the lookahead consists of the entire reference string, it performs the absolute minimum possible number of I/Os. Finally, we show that L-OPT is comparable with the best online algorithm with the same amount of lookahead; the ratio of the length of its schedule to the length of the optimal schedule is always within a constant factor.
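To make the model concrete, here is a minimal Python sketch (not L-OPT itself) of lookahead-driven prefetching: each of D disks may fetch one block per parallel I/O step, the scheduler sees only the next L references, and the buffer holds m blocks. The greedy fetch choice and the evict-the-latest-needed rule are our own illustrative assumptions.

```python
def greedy_prefetch_schedule(refs, disk_of, D, m, L):
    """Toy simulation of lookahead-driven prefetching on D parallel disks.

    refs    : read-once reference string (every block appears exactly once)
    disk_of : dict mapping block -> disk id in range(D)
    m       : buffer capacity in blocks
    L       : number of future references visible to the scheduler
    Returns the number of parallel I/O steps used.
    """
    future = {blk: i for i, blk in enumerate(refs)}   # read-once: unique keys
    buffered = set()
    pos, steps = 0, 0
    while pos < len(refs):
        # consume in-order references already in the buffer
        while pos < len(refs) and refs[pos] in buffered:
            buffered.discard(refs[pos])               # read-once: drop after use
            pos += 1
        if pos == len(refs):
            break
        # one parallel I/O step: each disk may fetch one block, chosen
        # greedily as its earliest-needed block within the lookahead window
        chosen = {}
        for blk in refs[pos:pos + L]:
            d = disk_of[blk]
            if blk not in buffered and d not in chosen:
                chosen[d] = blk
            if len(chosen) == D:
                break
        for blk in chosen.values():
            if len(buffered) >= m:
                # evict the buffered block needed furthest in the future
                victim = max(buffered, key=future.__getitem__)
                if future[victim] <= future[blk]:
                    continue                          # keep earlier-needed blocks
                buffered.discard(victim)
            buffered.add(blk)
        steps += 1
    return steps

refs = list("abcdefgh")
disk_of = {b: i % 2 for i, b in enumerate(refs)}      # 2 disks, round-robin
print(greedy_prefetch_schedule(refs, disk_of, D=2, m=4, L=4))  # 4 steps for 8 blocks
```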

2.
E. Torng, Algorithmica, 1998, 20(2):175–200
Paging (caching) is the problem of managing a two-level memory hierarchy in order to minimize the time required to process a sequence of memory accesses. In order to measure this quantity, which we refer to as the total memory access time, we define the system parameter miss penalty to represent the extra time required to access slow memory. We also introduce the system parameter page size. In the context of paging, miss penalty is quite large, so most previous studies of on-line paging have implicitly set miss penalty = ∞ in order to simplify the model. We show that this seemingly insignificant simplification substantially alters the precision of derived results. For example, previous studies have essentially ignored page size. Consequently, we reintroduce the miss penalty and page size parameters to the paging problem and present a more accurate analysis of on-line paging (and caching). We validate this more accurate model by deriving intuitively appealing results for the paging problem which cannot be derived using the simplified model. First, we present a natural, quantifiable definition of the amount of locality of reference in any access sequence. We also point out that the amount of locality of reference in an access sequence should depend on page size among other factors. We then show that deterministic and randomized marking algorithms such as the popular least recently used (LRU) algorithm achieve constant competitive ratios when processing typical access sequences which exhibit significant locality of reference; this represents the first competitive analysis result which (partially) explains why LRU performs as well as it is observed to in practice. Next, we show that finite lookahead can be used to obtain algorithms with improved competitive ratios. In particular, we prove that modified marking algorithms with sufficient lookahead achieve competitive ratios of 2. This is in stark contrast to the simplified model where lookahead cannot be used to obtain algorithms with improved competitive ratios. We conclude by using competitive analysis to evaluate the benefits of increasing associativity in caches. We accomplish this by specifying an algorithm and varying the system configuration rather than the usual process of specifying the system configuration and varying the algorithm. Received August 7, 1995; revised May 7, 1996, and August 6, 1996.
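The cost model is easy to state in code. The sketch below charges every access one time unit plus a finite miss penalty on faults, rather than setting the penalty to infinity; LRU is used since the abstract discusses it, and all parameter values are illustrative.

```python
from collections import OrderedDict

def lru_access_time(accesses, k, miss_penalty):
    """Total memory access time for LRU with k cache frames.

    Each access costs 1 time unit; a miss additionally costs
    `miss_penalty` (the extra time to reach slow memory), instead of
    treating a miss as infinitely expensive."""
    cache = OrderedDict()
    total = 0
    for page in accesses:
        total += 1
        if page in cache:
            cache.move_to_end(page)          # mark as most recently used
        else:
            total += miss_penalty
            if len(cache) == k:
                cache.popitem(last=False)    # evict least recently used
            cache[page] = None
    return total

# 5 misses on a sequence with some locality: 10 accesses + 5 * 10 = 60
print(lru_access_time("aabbccaabb", k=2, miss_penalty=10))
```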

3.
We describe a new search algorithm for speech recognition which applies the monotone graph search procedure to the problem of building a word graph. A first backward pass provides a method for estimating the word boundary times and phone segment boundary times needed to build the word graph using either the 1-phone or 2-phone lookahead assumptions. It also provides a heuristic for the search which satisfies the monotonicity condition. A second backward pass applies forward–backward pruning to the word graph. We show how the search can be made to run very quickly if the 1-phone lookahead assumption holds. We present the results of experiments performed on the 5000-word speaker-independent Wall Street Journal task under both the 1-phone and 2-phone lookahead assumptions. These results show that the 1-phone lookahead assumption leads to unacceptably large error rates for speaker-independent recognition using current acoustic phonetic modelling techniques. Finally, we give an account of the methods we have developed to process speech data in successive blocks so as to address the real-time issue and to control the memory requirements of the search.

4.
A new measure for the study of on-line algorithms
An accepted measure for the performance of an on-line algorithm is the competitive ratio introduced by Sleator and Tarjan. This measure is well motivated and has led to the development of a mathematical theory for on-line algorithms. We investigate the behavior of this measure with respect to memory needs and benefits of lookahead and find some counterintuitive features. We present lower bounds on the size of memory devoted to recording the past. It is also observed that the competitive ratio reflects no improvement in the performance of an on-line algorithm due to any (finite) amount of lookahead. We offer an alternative measure that exhibits a different and, in some respects, more intuitive behavior. In particular, we demonstrate the use of our new measure by analyzing the tradeoff between the amortized cost of on-line algorithms for the paging problem and the amount of lookahead available to them. We also derive on-line algorithms for the K-server problem on any bounded metric space, which, relative to the new measure, are optimal among all on-line algorithms (up to a factor of 2) and are within a factor of 2K from the optimal off-line performance.

5.
Buffer management for a D-disk parallel I/O system is considered in the context of randomized placement of data on the disks. A simple prefetching and caching algorithm PHASE-LRU using bounded lookahead is described and analyzed. It is shown that PHASE-LRU performs an expected number of I/Os that is within a factor Θ(log D / log log D) of the number performed by an optimal off-line algorithm. In contrast, any deterministic buffer management algorithm with the same amount of lookahead must do an asymptotically larger number of I/Os than the optimal.

6.
This paper presents a new scheme of I/O scheduling on the storage servers of distributed/parallel file systems, to yield better I/O performance. To this end, we first analyze the read/write requests in the I/O queue of a storage server (we name them block I/Os), using our proposed technique of horizontal partition. All block requests are then divided into multiple groups on the basis of their offsets. That is to say, all requests related to the same chunk file are grouped together and then satisfied within the same time slot between opening and closing the target chunk file on the storage server. As a result, the time required to complete block I/O requests can be significantly decreased, because of fewer file operations on the corresponding chunk files at the low-level file systems of the server machines. Furthermore, we introduce an algorithm to rate a priority for each group of block I/O requests, and the storage server then dispatches groups of I/Os following the priority order. Consequently, applications having higher I/O priorities, e.g. those with fewer I/O operations and a smaller amount of involved data, can finish at an earlier time. We implement a prototype of this server-side scheduling in the PARTE file system to demonstrate the feasibility and applicability of the proposed scheme. Experimental results show that the newly proposed scheme can achieve better I/O bandwidth and less I/O time, compared with the First Come First Served strategy as well as other server-side I/O scheduling approaches.
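A minimal sketch of the grouping idea follows: queued block I/Os are grouped by target chunk file so each file is opened and closed only once, and groups are dispatched in priority order. The priority rule shown (fewer requests and fewer bytes first) is our reading of the abstract, not the exact formula used in PARTE.

```python
from collections import defaultdict

def dispatch_block_ios(queue):
    """Group queued block I/Os by chunk file and order the groups.

    queue: list of requests (chunk_file, offset, nbytes).
    Returns [(chunk_file, requests sorted by offset), ...] so that all
    I/Os on one chunk file are served inside a single open/close window.
    """
    groups = defaultdict(list)
    for req in queue:
        groups[req[0]].append(req)

    def priority(item):
        chunk, reqs = item
        # fewer operations and less data -> dispatched earlier
        return (len(reqs), sum(r[2] for r in reqs))

    return [(chunk, sorted(reqs, key=lambda r: r[1]))
            for chunk, reqs in sorted(groups.items(), key=priority)]

q = [("chunk2", 4096, 512), ("chunk1", 0, 128), ("chunk2", 0, 512)]
print(dispatch_block_ios(q))
# [('chunk1', [('chunk1', 0, 128)]),
#  ('chunk2', [('chunk2', 0, 512), ('chunk2', 4096, 512)])]
```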

7.
Parallel Learning of Belief Networks in Large and Difficult Domains
Learning belief networks from large domains can be expensive even with single-link lookahead search (SLLS). Since a SLLS cannot learn correctly in a class of problem domains, multi-link lookahead search (MLLS) is needed, which further increases the computational complexity. In our experiment, learning in some difficult domains over more than a dozen variables took days. In this paper, we study how to use parallelism to speed up SLLS for learning in large domains and to tackle the increased complexity of MLLS for learning in difficult domains. We propose a natural decomposition of the learning task for parallel processing. We investigate two strategies for job allocation among processors to further improve load balancing and efficiency of the parallel system. For learning from very large datasets, we present a regrouping of the available processors such that slow data access through the file system can be replaced by fast memory access. Experimental results on a distributed-memory MIMD computer demonstrate the effectiveness of the proposed algorithms.

8.
S. Albers, Algorithmica, 1997, 18(3):283–305
We introduce a new model of lookahead for on-line paging algorithms and study several algorithms using this model. A paging algorithm is on-line with strong lookahead l if it sees the present request and a sequence of future requests that contains l pairwise distinct pages. We show that strong lookahead has practical as well as theoretical importance and improves the competitive factors of on-line paging algorithms. This is the first model of lookahead having such properties. In addition to lower bounds we present a number of deterministic and randomized on-line paging algorithms with strong lookahead which are optimal or nearly optimal. Received April 8, 1994; revised May 15, 1995.
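The following toy paging routine illustrates the flavor of strong lookahead (it is not Albers' exact algorithm): on a fault, it prefers to evict a cached page that does not appear among the next l pairwise distinct requested pages.

```python
def paging_with_strong_lookahead(requests, k, l):
    """Toy paging with strong lookahead l on a k-frame cache.

    On a fault, evict the least-recently-used page that does NOT occur
    among the next l pairwise distinct pages (falling back to plain LRU
    if every cached page is in the lookahead).  Returns the fault count.
    """
    cache, faults = [], 0           # list order = LRU order (front = oldest)
    for i, page in enumerate(requests):
        # visible future: longest prefix containing at most l distinct pages
        seen, j = set(), i + 1
        while j < len(requests) and len(seen | {requests[j]}) <= l:
            seen.add(requests[j])
            j += 1
        if page in cache:
            cache.remove(page)
            cache.append(page)      # refresh recency
            continue
        faults += 1
        if len(cache) == k:
            victims = [p for p in cache if p not in seen] or cache
            cache.remove(victims[0])            # oldest qualifying page
        cache.append(page)
    return faults

print(paging_with_strong_lookahead("abcabdacd", k=2, l=2))
```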

9.
External sorting—the process of sorting a file that is too large to fit into the computer's internal memory and must be stored externally on disks—is a fundamental subroutine in database systems [G], [IBM]. Of prime importance are techniques that use multiple disks in parallel in order to speed up the performance of external sorting. The simple randomized merging (SRM) mergesort algorithm proposed by Barve et al. [BGV] is the first parallel disk sorting algorithm that requires a provably optimal number of passes and that is fast in practice. Knuth [K, Section 5.4.9] recently identified SRM (which he calls "randomized striping") as the method of choice for sorting with parallel disks. In this paper we present an efficient implementation of SRM, based upon novel and elegant data structures. We give a new implementation for SRM's lookahead forecasting technique for parallel prefetching and its forecast and flush technique for buffer management. Our techniques amount to a significant improvement in the way SRM carries out the parallel, independent disk accesses necessary to read blocks of input runs efficiently during external merging. Our implementation is based on synchronous parallel I/O primitives provided by the TPIE programming environment [TPI]; whenever our program issues an I/O read (write) operation, one block of data is synchronously read from (written to) each disk in parallel. We compare the performance of SRM over a wide range of input sizes with that of disk-striped mergesort (DSM), which is widely used in practice. DSM consists of a standard mergesort in conjunction with striped I/O for parallel disk access. SRM merges together significantly more runs at a time compared with DSM, and thus it requires fewer merge passes. We demonstrate in practical scenarios that even though the streaming speeds for merging with DSM are a little higher than those for SRM (since DSM merges fewer runs at a time), sorting using SRM is often significantly faster than with DSM (since SRM requires fewer passes). The techniques in this paper can be generalized to meet the load-balancing requirements of other applications using parallel disks, including distribution sort and multiway partitioning of a file into several other files. Since both parallel disk merging and multimedia processing deal with streams that get "consumed" at nonuniform and partially predictable rates, our techniques for lookahead based upon forecasting data may have relevance in video server applications. Received June 28, 2000, and in revised form June 5, 2001. Online publication April 8, 2002.
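The forecasting idea admits a one-line statement: during a multiway merge, the run whose current in-memory block has the smallest last key is the first run that will exhaust its block, so its successor block should be prefetched first. A minimal sketch (disks and buffer pools omitted):

```python
def prefetch_order(current_blocks):
    """Forecasting-style prefetch order for a multiway merge.

    current_blocks: one sorted block (list of keys) per run, currently
    held in memory.  The run whose block has the smallest *last* key
    exhausts first during the merge, so its next block is needed first.
    Returns run indices in the order their successor blocks are needed.
    """
    return sorted(range(len(current_blocks)),
                  key=lambda r: current_blocks[r][-1])

# run 1's block ends at key 7, so run 1 exhausts first:
print(prefetch_order([[2, 5, 9], [1, 3, 7], [4, 6, 8]]))  # [1, 2, 0]
```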

10.
Hash tables on external memory are commonly used for indexing in database management systems. In this paper we present an algorithm that, in an asymptotic sense, achieves the best possible I/O and space complexities. Let B denote the number of records that fit in a block, and let N denote the total number of records. Our hash table looks up a record (whether or not it is present) in an expected number of I/Os that is asymptotically optimal. Inserting, deleting, or changing a record that has just been looked up likewise requires an asymptotically optimal amortized expected number of I/Os, including the I/Os for reorganizing the hash table when the size of the database changes. The expected external space usage is within a constant factor of the optimum of N/B blocks, and just O(1) blocks of internal memory are needed.

11.
We study an online weighted interval scheduling problem on a single machine, where all intervals have unit length and the objective is to maximize the total weight of all completed intervals. We investigate how finite lookahead improves the competitiveness of deterministic online heuristics, under both preemptive and non-preemptive models. The lookahead model studied in this paper is that an online heuristic is said to have a lookahead ability of LD if at any time point it is able to foresee all the intervals to be released within the next LD units of time. We investigate both competitive online heuristics and lower bounds on the competitive ratio, with lookahead 0 ≤ LD ≤ 1 under the preemptive model, and lookahead 0 ≤ LD ≤ 2 under the non-preemptive model. A method to transform a preemptive lookahead online algorithm into a non-preemptive online algorithm with enhanced lookahead ability is also given. Computational tests are performed to compare the practical competitiveness of the online heuristics with different lookahead abilities.
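As a concrete illustration of how lookahead can pay off, here is a toy non-preemptive heuristic under the assumptions that an interval must start at its release time or be lost and that LD time units of future releases are visible; it is illustrative only and not a heuristic from the paper.

```python
def greedy_lookahead(intervals, LD):
    """Non-preemptive toy heuristic for unit-length weighted intervals.

    intervals: list of (release_time, weight), sorted by release_time.
    With lookahead LD, an interval is skipped when a strictly heavier
    one is released within the next min(LD, 1) time units (i.e. while
    the machine would still be busy).  Returns the total weight gained.
    """
    free, gained = 0.0, 0.0
    for i, (r, w) in enumerate(intervals):
        if r < free:
            continue                       # machine busy: interval lost
        horizon = r + min(LD, 1.0)         # releases that starting now would block
        upcoming = [w2 for r2, w2 in intervals[i + 1:] if r2 < horizon]
        if upcoming and max(upcoming) > w:
            continue                       # wait for the heavier interval
        free = r + 1.0                     # run this interval to completion
        gained += w
    return gained

jobs = [(0.0, 1.0), (0.5, 5.0), (2.0, 2.0)]
print(greedy_lookahead(jobs, LD=0))   # 3.0: grabs the first job, misses w=5
print(greedy_lookahead(jobs, LD=1))   # 7.0: foresees and waits for w=5
```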

12.
We consider the NP-hard problem of scheduling parallel jobs with release dates on identical parallel machines to minimize the makespan. A parallel job requires simultaneously a prespecified, job-dependent number of machines when being processed. We prove that the makespan of any nonpreemptive list-schedule is within a factor of 2 of the optimal preemptive makespan. This gives the best-known approximation algorithms for both the preemptive and the nonpreemptive variant of the problem. We also show that no list-scheduling algorithm can achieve a better performance guarantee than 2 for the nonpreemptive problem, no matter which priority list is chosen. List-scheduling also works in the online setting where jobs arrive over time and the length of a job becomes known only when it completes; it therefore yields a deterministic online algorithm with competitive ratio 2 as well. In addition, we consider a different online model in which jobs arrive one by one and need to be scheduled before the next job becomes known. We show that no list-scheduling algorithm has a constant competitive ratio. Still, we present the first online algorithm for scheduling parallel jobs with a constant competitive ratio in this context. We also prove a new information-theoretic lower bound of 2.25 for the competitive ratio of any deterministic online algorithm for this model. Moreover, we show that 6/5 is a lower bound for the competitive ratio of any deterministic online algorithm for the preemptive version of the model with jobs arriving over time.
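List scheduling of rigid parallel jobs is simple to simulate. The sketch below starts, at every event time, any released job from the priority list whose machine requirement fits the currently free machines; names and the demo instance are illustrative.

```python
import heapq

def list_schedule(jobs, m):
    """Nonpreemptive list scheduling of rigid parallel jobs.

    jobs: list of (release, size, length) in priority-list order; a job
    needs `size` of the m machines simultaneously for `length` time
    units.  Returns the makespan of the resulting schedule.
    """
    t, free = 0.0, m
    running = []                        # min-heap of (finish_time, size)
    todo = list(jobs)
    makespan = 0.0
    while todo or running:
        started = True
        while started:                  # start every job that fits now
            started = False
            for j, (r, size, length) in enumerate(todo):
                if r <= t and size <= free:
                    free -= size
                    heapq.heappush(running, (t + length, size))
                    makespan = max(makespan, t + length)
                    todo.pop(j)
                    started = True
                    break
        # advance time to the next completion or the next release
        events = [running[0][0]] if running else []
        events += [r for r, _, _ in todo if r > t]
        if not events:
            break
        t = min(events)
        while running and running[0][0] <= t:
            free += heapq.heappop(running)[1]   # machines become free
    return makespan

print(list_schedule([(0, 2, 3), (0, 1, 2), (1, 3, 1)], m=3))  # 4.0
```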

13.
On-Line Algorithms for the Dynamic Traveling Repair Problem
We consider the dynamic traveling repair problem in which requests with deadlines arrive through time on points in a metric space. Servers move from point to point at constant speed. The goal is to plan the motion of servers so that the maximum number of requests are met by their deadline. We consider a restricted version of the problem in which there is a single server and the length of time between the arrival of a request and its deadline is constant. We give upper bounds for the competitive ratio of two very natural algorithms as well as several lower bounds for any deterministic algorithm. Most of the results in this paper are expressed as a function of β, the diameter of the metric space. In particular, we prove that the upper bound given for one of the two algorithms is within a constant factor of the best possible competitive ratio.

14.
With the exponential growth of WWW traffic, web proxy caching has become a critical technique for Internet web services. Well-organized proxy caching systems with multiple servers can greatly reduce user-perceived latency and decrease network bandwidth consumption. Thus, many research papers have focused on improving web caching performance with efficient coordination algorithms among multiple servers. Hash-based algorithms are the most widely used server coordination mechanism; however, many technical issues still need to be addressed. In this paper, we propose a new hash-based web caching architecture, Tulip. Tulip aggregates web objects that are likely to be accessed together into object clusters and uses object clusters as the primary access units. Tulip extends the locality-based algorithm in UCFS to hash-based web proxy systems and proposes a simple algorithm to reduce the data-grouping overhead. It takes into consideration the access-speed disparity between memory and disk and replaces many expensive small disk I/Os with fewer large ones. When a client request cannot be fulfilled by the server from memory, the system fetches the whole cluster containing the required object into memory, so future requests for other objects in the same cluster can be satisfied directly from memory and slow disk I/Os are avoided. It also introduces a simple and efficient data duplication algorithm, so little maintenance work is needed when servers join, leave, or fail. Along with the local caching strategy, Tulip achieves better fault tolerance and load-balancing capability at minimal cost. Our simulation results show Tulip has better performance than previous approaches.
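A minimal sketch of the cluster-fetching behavior described above, with cluster assignment and eviction reduced to placeholders (the class and its policies are our own illustration, not Tulip's implementation):

```python
class ClusterCache:
    """Toy cluster-granularity cache: a miss pulls the whole cluster
    into memory with one large read instead of many small disk I/Os."""

    def __init__(self, disk_clusters, capacity):
        self.disk = disk_clusters            # cluster_id -> {url: object}
        self.capacity = capacity             # clusters held in memory
        self.mem = {}
        self.cluster_of = {url: cid
                           for cid, objs in disk_clusters.items()
                           for url in objs}

    def get(self, url):
        cid = self.cluster_of[url]
        if cid not in self.mem:              # miss: fetch the whole cluster
            if len(self.mem) >= self.capacity:
                self.mem.pop(next(iter(self.mem)))   # FIFO-style eviction
            self.mem[cid] = self.disk[cid]
        return self.mem[cid][url]

disk = {0: {"/a": "A", "/b": "B"}, 1: {"/c": "C"}}
cache = ClusterCache(disk, capacity=1)
print(cache.get("/a"), cache.get("/b"))      # "/b" hits: its cluster is resident
```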

15.
Research on lookahead values for distributed simulation of timed Petri nets
Lookahead computation is a good way to improve the performance of parallel simulation of timed Petri nets. Four basic structures for computing the lookahead of timed Petri nets are given, and for complex Petri net structures containing cycles a prediction-graph algorithm is presented. With the prediction graph, both static and dynamic lookahead values can be obtained easily. In parallel simulation, lookahead can be used to identify structures with concurrency and blocking, which lays the foundation for partitioning the net into blocks that run on the nodes of a parallel machine.

16.
Exact train pathing
Suppose we are given a schedule of train movements over a rail network into which a new train is to be included. The origin and the destination are specified for the new train; it is required that a schedule (including the path) be determined for it so as to minimize the time taken without affecting the schedules for the old trains. In the standard formulations of this single train pathing problem, the time taken by the train to traverse any block (track segment guarded by a signal) in the network is deemed to be a fixed number, independent of how the train arrived onto that block. In other words, the standard formulations of train pathing do not accurately model the acceleration/deceleration restrictions on trains. In this paper we give an algorithm to solve the single train pathing problem, while taking into account the maximum allowed acceleration and deceleration as well as explicitly modeling signals. For trains having ‘large’ maximum acceleration and deceleration, our algorithm runs in polynomial time. On the other hand, if the train to be pathed is capable of only very small acceleration so that it must take a long time to reach full speed, our algorithm takes exponential time. However, we prove that the pathing problem is NP-complete for small acceleration values, thus justifying the time required by our algorithm. Our algorithm can be used as a subroutine in a heuristic for multiple train pathing. If all trains have large (but possibly different) accelerations this algorithm will run in polynomial time. V. Nagarajan is supported in part by NSF grant CCF-0728841.

17.
Recently, active preventive healthcare is needed for potential patients who may in the future suffer from diseases inherited from their ancestors. We call this active U-healthcare: providing active, periodic, and continuous medical treatment according to inherited conditions recorded in patients' DNA, such as diabetes, heart disease, and gynecological diseases. However, the bottleneck of aggressive active U-healthcare is the memory overhead of analyzing each patient's DNA sequence, since DNA sequences are massive in volume. Thus, efficiently retrieving the many disease patterns originally recorded in the DNA of potential patients is a major problem. This paper focuses on a novel method for efficiently retrieving disease patterns using an in-memory suffix tree. The suffix tree is widely used in similarity search over sequences drawn from a limited alphabet. It is efficient when the occurrence frequency of a common prefix is high. Since in-memory suffix tree construction algorithms do not scale up, a large-scale disk-based suffix tree construction algorithm, TRELLIS, has been proposed recently. However, that algorithm requires a large amount of memory, disk space, and disk I/Os in order to merge subtrees having a common prefix. In this paper, we propose a new non-merging method, called NST. The experimental results show that NST constructs an index using less memory than TRELLIS.

18.
Algorithms for parallel memory, I: Two-level memories
We provide the first optimal algorithms in terms of the number of input/outputs (I/Os) required between internal memory and multiple secondary storage devices for the problems of sorting, FFT, matrix transposition, standard matrix multiplication, and related problems. Our two-level memory model is new and gives a realistic treatment of parallel block transfer, in which during a single I/O each of the P secondary storage devices can simultaneously transfer a contiguous block of B records. The model pertains to a large-scale uniprocessor system or parallel multiprocessor system with P disks. In addition, the sorting, FFT, permutation network, and standard matrix multiplication algorithms are typically optimal in terms of the amount of internal processing time. The difficulty in developing optimal algorithms is to cope with the partitioning of memory into P separate physical devices. Our algorithms' performances can be significantly better than those obtained by the well-known but nonoptimal technique of disk striping. Our optimal sorting algorithm is randomized, but practical; the probability of using more than l times the optimal number of I/Os is exponentially small in l(log l) log(M/B), where M is the internal memory size. A summarized version of this research was presented at the 22nd Annual ACM Symposium on Theory of Computing, Baltimore, MD, May 1990. This work was done while the first author was at Brown University. Support was provided in part by a National Science Foundation Presidential Young Investigator Award with matching funds from IBM, by NSF Research Grants DCR-8403613 and CCR-9007851, by Army Research Office Grant DAAL03-91-G-0035, and by the Office of Naval Research and the Defense Advanced Research Projects Agency under Contract N00014-91-J-4052, ARPA Order 8225. This work was done in part while the second author was at Brown University, supported by a Bellcore graduate fellowship, and at Bellcore.

19.
We study the scheduling problem of minimizing the maximum starting time on-line. The goal is to minimize the last time that a job starts. We show that while the greedy algorithm has a competitive ratio of Θ(log m), we can give a constant-competitive algorithm for this problem. We also show that the greedy algorithm is optimal for resource augmentation in the sense that it requires 2m − 1 machines to achieve a competitive ratio of 1, whereas no algorithm can achieve this with 2m − 2 machines.
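Under one natural reading of "the greedy algorithm" (assign each arriving job to the currently least-loaded machine), the objective is easy to compute; the sketch below is illustrative, not the paper's construction.

```python
import heapq

def greedy_max_start(lengths, m):
    """Greedy for the max-starting-time objective on m machines.

    Each job (given by its length) goes to the currently least-loaded
    machine; it starts when that machine becomes free.  Returns the
    last starting time over all jobs.
    """
    loads = [(0.0, i) for i in range(m)]   # (current load, machine id)
    heapq.heapify(loads)
    last_start = 0.0
    for length in lengths:                 # jobs arrive one by one
        load, i = heapq.heappop(loads)
        last_start = max(last_start, load) # job starts when machine frees up
        heapq.heappush(loads, (load + length, i))
    return last_start

print(greedy_max_start([1, 1, 2, 2], m=2))   # last job starts at 1.0
```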

20.
In this paper, an efficient exact PageRank (PR) algorithm is proposed, which improves computation efficiency without sacrificing result accuracy. Existing exact algorithms are generally based on the original power method (PM). In order to reduce the number of I/Os required and thereby improve efficiency, they partition the big graph into multiple smaller ones that fit entirely in memory. The algorithm proposed in this paper further reduces the required number of I/Os. Instead of partitioning the graph into general subgraphs, our algorithm partitions the graph into a special kind of subgraph: SCCs (strongly connected components), whose nodes are mutually reachable. By exploiting the properties of SCCs, several theoretical results are established, based on which the computation iterations can be constrained to these SCC subgraphs. Our algorithm avoids many I/Os and saves a large amount of computation, while keeping the results exact. In short, our algorithm is the most efficient among the existing exact algorithms. The experiments demonstrate that the algorithms proposed in this paper achieve an obvious efficiency improvement and attain highly accurate results.
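A sketch of the SCC-constrained computation, under our reading of the abstract: compute SCCs, walk the condensation in topological order, and iterate only within each SCC while treating rank from already-finished SCCs as constant. The no-dangling-node assumption and all names are ours, not the paper's.

```python
import sys
from collections import defaultdict

def scc_pagerank(edges, n, d=0.85, tol=1e-12):
    """Exact PageRank with iterations confined to SCCs (illustrative).

    Assumes nodes are 0..n-1 and every node has at least one out-edge
    (no dangling nodes).  Rank only flows forward along edges, so once
    every predecessor SCC has converged, an SCC can be finished by
    iterating over its own nodes alone.
    """
    out, inc = defaultdict(list), defaultdict(list)
    for u, v in edges:
        out[u].append(v)
        inc[v].append(u)

    # Tarjan's algorithm; emits SCCs in reverse topological order.
    sys.setrecursionlimit(max(10000, 2 * n))
    index, low, onstk, stk, sccs, counter = {}, {}, set(), [], [], [0]
    def dfs(u):
        index[u] = low[u] = counter[0]; counter[0] += 1
        stk.append(u); onstk.add(u)
        for v in out[u]:
            if v not in index:
                dfs(v); low[u] = min(low[u], low[v])
            elif v in onstk:
                low[u] = min(low[u], index[v])
        if low[u] == index[u]:               # u is the root of an SCC
            comp = set()
            while True:
                w = stk.pop(); onstk.discard(w); comp.add(w)
                if w == u:
                    break
            sccs.append(comp)
    for u in range(n):
        if u not in index:
            dfs(u)

    pr = {u: 1.0 / n for u in range(n)}
    for comp in reversed(sccs):              # condensation sources first
        while True:                          # power iteration inside one SCC
            delta = 0.0
            for u in comp:
                new = (1 - d) / n + d * sum(pr[w] / len(out[w]) for w in inc[u])
                delta = max(delta, abs(new - pr[u]))
                pr[u] = new
            if delta < tol:
                break
    return pr

edges = [(0, 1), (1, 0), (1, 2), (2, 3), (3, 2)]
print(scc_pagerank(edges, 4))   # SCC {0,1} converges before SCC {2,3}
```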
