期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

毛友发杨明福《计算机工程》2004,30(18):33-34,121

研究了并行存储预取优化算法，根据并行存储的主要访问模式，提出要同时对文件内数据块访问和文件间访问进行建模，并对文件内数据块访问和文件间访问建模分别提出了E_IS_PPM算法和Last_N_Successor算法。最后将两个算法结合起来，提出了文件预取综合算法，算法根据计算和存储的可重叠程度以及文件预取页面的可获得性，自适应地决定预取深度。相似文献

2.

Practical prefetching techniques for multiprocessor file systems

David Kotz Carla Schlatter Ellis 《Distributed and Parallel Databases》1993,1(1):33-51

Improvements in the processing speed of multiprocessors are outpacing improvements in the speed of disk hardware. Parallel disk I/O subsystems have been proposed as one way to close the gap between processor and disk speeds. In a previous paper we showed that prefetching and caching have thepotential to deliver the performance benefits of parallel file systems to parallel applications. In this paper we describe experiments withpractical prefetching policies that base decisions only on on-line reference history, and that can be implemented efficiently. We also test the ability of those policies across a range of architectural parameters. 相似文献

3.

Optimal Read-Once Parallel Disk Scheduling

Mahesh Kallahalla Peter J. Varman 《Algorithmica》2005,43(4):309-343

An optimal prefetching and I/O scheduling algorithm L-OPT, for parallel I/O systems, using a read-once model of block references is presented. The algorithm uses knowledge of the next $L$ references, $L$-block lookahead, to create a minimal-length I/O schedule. For a system with $D$ disks and a buffer of capacity $m$ blocks, we show that the competitive ratio of L-OPT is $\Theta(\sqrt{mD/L})$ when $L \geq m$, which matches the lower bound of any prefetching algorithm with $L$-block lookahead. Tight bounds for the remaining ranges of lookahead are also presented. In addition we show that L-OPT is the optimal offline algorithm: when the lookahead consists of the entire reference string, it performs the absolute minimum possible number of I/Os. Finally, we show that L-OPT is comparable with the best online algorithm with the same amount of lookahead; the ratio of the length of its schedule to the length of the optimal schedule is always within a constant factor. 相似文献

4.

Optimizing sort order query execution in balanced and nested gridfiles

Mueck T.A. Schauer M.J. 《Knowledge and Data Engineering, IEEE Transactions on》1995,7(2):246-260

Disk input/output (I/O) efficient query execution is an important topic with respect to DBMS performance. In this context, we elaborate on the construction of disk access plans for sort order queries in balanced and nested grid files. The key idea is to use the order information contained in the directory of the multiattribute search structure. The presented algorithms are shown to yield a significant decrease in the number of disk I/O operations by appropriate use of the order information. Two algorithms for the construction of appropriate disk access plans are proposed, namely a greedy approach and a heuristic divide-and-conquer approach. Both approaches yield considerable I/O savings compared to straightforward query processing without consideration of any directory order information. The former performs well for small buffer page allocations, i.e., for a small number of buffer pages relative to the number of data buckets processed in the query. The latter is superior to the greedy algorithm with respect to the total number of I/O operations and with respect to the overall maximum of buffer pages needed to achieve the minimal number of disk I/O operations. Both approaches rely on a binary trie as a temporary data structure. This trie is used as an explicit representation of the order information. The storage consumption of the temporary data structure is shown to be negligible in realistic cases, Even for pathological cases with respect to degenerated balanced and nested grid files, reasonable upper bounds can be given 相似文献

5.

一种面向视频播放系统的RAID并行预取技术及实现 总被引：3，自引：0，他引：3

李宇张江陵冯丹《计算机研究与发展》2002,39(11):1526-1530

随着视频数字化技术的广泛应用，视频播放系统中的存在“瓶颈”也日益突出，在分析视频存储系统磁盘阵列的I／O调度算法和在实际应用中视频数据的特点的基础上，提出一种面向视频媒体服务的RAID并行预取实时调度算法，它利用未来数据的可行预测性，预先将其从磁盘取入缓冲区，同时优化任务调度，使主机数据的回送和从磁盘预取数据同步执行，进一步提高了阵列的I／O吞吐率，用I／Ometer测试结果证明，新算法具有很好的性能。相似文献

6.

Balancing disk energy against reliability in video playback

Minseok Song Wanhyung Ryu Jeong Seop Sim Yeongju Lee 《Multimedia Systems》2014,20(1):25-43

Video files contain large amount of data, which can be stored cost-effectively on a hard disk drive, but this consumes a significant energy when it is spinning and ready to read data. The energy used by a disk can be reduced by prefetching video frames into buffer to allow the disk to spin down. But frequent spindowns compromise disk life, so it is desirable to limit the number of times that the disk spins down. We propose a method of data prefetching that fully utilizes the available buffer while providing continuous video playback. We analyze the effect of the amount of data comprising the frames in the buffer on disk power consumption and formulate algorithms that determine when the disk should enter standby mode and the optimal number of disk spindowns. We implemented our scheme in the Linux 2.6 MPlayer and find that a portable 1.8-inch disk uses between 10 and 37 % less energy than it does with the existing MPlayer. 相似文献

7.

面向流媒体业务的存储设备性能改进策略

李明哲陈晓张武《网络新媒体技术》2013,(6):8-11,16

本文分析了RAID、I/O调度和预读等技术在提高流媒体存储系统性能方面的局限性。为突破这些传统策略的限制,本文针对存储设备的特点,挖掘磁盘设备的寻道延迟和旋转延迟的提升空间,提出了流的独立存储、潜伏缓存和排练等方法,从多个角度为进一步提升存储系统服务性能提出了新思路。理论分析和实测评估验证了新方法对系统性能的提升效果。相似文献

8.

Heuristics for scheduling I/O operations 总被引：1，自引：0，他引：1

Jain R. Somalwar K. Werth J. Browne J.C. 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(3):310-320

The I/O bottleneck in parallel computer systems has recently begun receiving increasing interest. Most attention has focused on improving the performance of I/O devices using fairly low level parallelism in techniques such as disk striping and interleaving. Widely applicable solutions, however, will require an integrated approach which addresses the problem at multiple system levels, including applications, systems software, and architecture. We propose that within the context of such an integrated approach, scheduling parallel I/O operations will become increasingly attractive and can potentially provide substantial performance benefits. We describe a simple I/O scheduling problem and present approximate algorithms for its solution. The costs of using these algorithms in terms of execution time, and the benefits in terms of reduced time to complete a batch of I/O operations, are compared with the situations in which no scheduling is used, and in which an optimal scheduling algorithm is used. The comparison is performed both theoretically and experimentally. We have found that, in exchange for a small execution time overhead, the approximate scheduling algorithms can provide substantial improvements in I/O completion times 相似文献

9.

A Simple and Efficient Parallel Disk Mergesort

R. D. Barve J. S. Vitter 《Theory of Computing Systems》2002,35(2):189-215

External sorting—the process of sorting a file that is too large to fit into the computer's internal memory and must be stored externally on disks—is a fundamental subroutine in database systems[G], [IBM]. Of prime importance are techniques that use multiple disks in parallel in order to speed up the performance of external sorting. The simple randomized merging (SRM ) mergesort algorithm proposed by Barve et al. [BGV] is the first parallel disk sorting algorithm that requires a provably optimal number of passes and that is fast in practice. Knuth [K,Section 5.4.9] recently identified SRM (which he calls ``randomized striping') as the method of choice for sorting with parallel disks. In this paper we present an efficient implementation of SRM, based upon novel and elegant data structures. We give a new implementation for SRM's lookahead forecasting technique for parallel prefetching and its forecast and flush technique for buffer management. Our techniques amount to a significant improvement in the way SRM carries out the parallel, independent disk accesses necessary to read blocks of input runs efficiently during external merging. Our implementation is based on synchronous parallel I/O primitives provided by the TPIE programming environment[TPI]; whenever our program issues an I/O read (write) operation, one block of data is synchronously read from (written to) each disk in parallel. We compare the performance of SRM over a wide range of input sizes with that of disk-striped mergesort (DSM ), which is widely used in practice. DSM consists of a standard mergesort in conjunction with striped I/O for parallel disk access. SRM merges together significantly more runs at a time compared with DSM, and thus it requires fewer merge passes. We demonstrate in practical scenarios that even though the streaming speeds for merging with DSM are a little higher than those for SRM (since DSM merges fewer runs at a time), sorting using SRM is often significantly faster than with DSM (since SRM requires fewer passes). The techniques in this paper can be generalized to meet the load-balancing requirements of other applications using parallel disks, including distribution sort and multiway partitioning of a file into several other files. Since both parallel disk merging and multimedia processing deal with streams that get ``consumed' at nonuniform and partially predictable rates, our techniques for lookahead based upon forecasting data may have relevance in video server applications. Received June 28, 2000, and in revised form June 5, 2001. Online publication April 8, 2002. 相似文献

10.

Analysis of simple randomized buffer management for parallel I/O

Mahesh Kallahalla Peter J. Varman 《Information Processing Letters》2004,90(1):47-52

Buffer management for a D-disk parallel I/O system is considered in the context of randomized placement of data on the disks. A simple prefetching and caching algorithm PHASE-LRU using bounded lookahead is described and analyzed. It is shown that PHASE-LRU performs an expected number of I/Os that is within a factor Θ(logD/loglogD) of the number performed by an optimal off-line algorithm. In contrast, any deterministic buffer management algorithm with the same amount of lookahead must do at least times the number of I/Os of the optimal. 相似文献

11.

I/O issues in a multimedia system

Narasimha Reddy A.L. Wyllie J.C. 《Computer》1994,27(3):69-74

In future computer system design, I/O systems will have to support continuous media such as video and audio, whose system demands are different from those of data such as text. Multimedia computing requires us to focus on designing I/O systems that can handle real-time demands. Video- and audio-stream playback and teleconferencing are real-time applications with different I/O demands. We primarily consider playback applications which require guaranteed real-time I/O throughput. In a multimedia server, different service phases of a real-time request are disk, small computer systems interface (SCSI) bus, and processor scheduling. Additional service might be needed if the request must be satisfied across a local area network. We restrict ourselves to the support provided at the server, with special emphasis on two service phases: disk scheduling and SCSI bus contention. When requests have to be satisfied within deadlines, traditional real-time systems use scheduling algorithms such as earliest deadline first (EDF) and least slack time first. However, EDF makes the assumption that disks are preemptable, and the seek-time overheads of its strict real-time scheduling result in poor disk utilization. We can provide the constant data rate necessary for real-time requests in various ways that require trade-offs. We analyze how trade-offs that involve buffer space affect the performance of scheduling policies. We also show that deferred deadlines, which increase buffer requirements, improve system performance significantly 相似文献

12.

Location-Oblivious Distributed Unit Disk Graph Coloring

Michel Barbeau Prosenjit Bose Paz Carmi Mathieu Couture Evangelos Kranakis 《Algorithmica》2011,60(2):236-249

We present the first location-oblivious distributed unit disk graph coloring algorithm having a provable performance ratio of three (i.e. the number of colors used by the algorithm is at most three times the chromatic number of the graph). This is an improvement over the standard sequential coloring algorithm that has a worst case lower bound on its performance ratio of 4−3/k (for any k>2, where k is the chromatic number of the unit disk graph achieving the lower bound) (Tsai et al., in Inf. Process. Lett. 84(4):195–199, 2002). We present a slightly better worst case lower bound on the performance ratio of the sequential coloring algorithm for unit disk graphs with chromatic number 4. Using simulation, we compare our algorithm with other existing unit disk graph coloring algorithms. 相似文献

13.

多线程环境下基于多预取点的文件预取

刘金胡创胡明龚奕利《计算机应用》2012,32(6):1713-1716

为解决当前Linux内核的预取算法在多线程情况下出现预取误判的问题,依据多线程环境下进程对磁盘文件的访问特点,提出一种基于多预取点的预取算法。在Linux内核原有的预取算法的基础上,结合多线程环境下应用程序对数据的访问模式,在Linux内核的页面缓存层进行了实现。实验和分析表明,在IOzone单线程测试中,该算法和Linux内核原预取算法性能相当;在多线程测试中,读取相同大小的文件,耗时比Linux内核原预取算法至少少1/3。新算法对于提高I/O并行度,从而提高整个计算机系统并行化很有帮助。相似文献

14.

Energy-aware disk scheduling for soft real-time I/O requests

Youjip Won Jongmin Kim Wonmin Jung 《Multimedia Systems》2008,13(5-6):409-428

In this work, we develop energy-aware disk scheduling algorithm for soft real-time I/O. Energy consumption is one of the major factors which bar the adoption of hard disk in mobile environment. Heat dissipation of large scale storage system also calls for an energy-aware scheduling technique to further increase the storage density. The basic idea in this work is to properly determine the I/O burst size so that device can be in standby mode between consecutive I/O bursts and that it can satisfy the soft real-time requirement. We develop an elaborate model which incorporates the energy consumption characteristics, overhead of mode transition in determining the appropriate I/O burst size and the respective disk operating schedule. Efficacy of energy-aware disk scheduling algorithm greatly relies on not only disk scheduling algorithm itself but also various operating system and device firmware related concerns. It is crucial that the various operating system level and device level features need to be properly addressed within disk scheduling framework. Our energy-aware disk scheduling algorithm successfully addresses a number of outstanding issues. First, we examine the effect of OS and hard disk firmware level prefetch policy and incorporate its effect in our disk scheduling framework. Second, our energy aware scheduling framework can allocate a certain fraction of disk bandwidth to handle sporadically arriving non real-time I/O’s. Third, we examine the relationship between lock granularity of the buffer management and energy consumption. We develop a prototype software with energy-aware scheduling algorithm. In our experiment, proposed algorithm can reduce the energy consumption to one fourth if we use energy-aware disk scheduling algorithm. However, energy-aware disk scheduling algorithm increases buffer requirement significantly, e.g., from 4 to 140 KByte. We carefully argue that the buffer overhead is still justifiable given the cost of DRAM chip and importance of energy management in modern mobile devices. The result of our work not only provides the energy efficient scheduling algorithm but also provides an important guideline in capacity planning of future energy efficient mobile devices. This paper is funded by KOSEF through Statistical Research Paper for Complex System at Seoul National University. 相似文献

15.

Threaded Prefetching: A New Instruction Memory Hierarchy for Real-Time Systems

Lee Minsuk Min Sang Lyul Shin Heonshik Kim Chong Sang Park Chang Yun 《Real-Time Systems》1997,13(1):47-65

Cache memories have been extensively used to bridge the speed gap between high speed processors and relatively slow main memory. However, they are not widely used in real-time systems due to their unpredictable performance. This paper proposes an instruction prefetching scheme called threaded prefetching as an alternative to instruction caching in real-time systems. In the proposed threaded prefetching, an instruction block pointer called a thread is assigned to each instruction memory block and is made to point to the next block on the worst case execution path that is determined by a compile-time analysis. Also, the thread is not updated throughout the entire program execution to guarantee predictability. This paper also compares the worst case performances of various previous instruction prefetching schemes with that of the proposed threaded prefetching. By analyzing several benchmark programs, we show that the worst case performance of the proposed scheme is significantly better than those of previous instruction prefetching schemes. The results also show that when the block size is large enough the worst case performance of the proposed threaded prefetching scheme is almost as good as that of an instruction cache with 100 % hit ratio. 相似文献

16.

Optimizing I/O server placement for parallel I/O on switch-based irregular networks

Yih-Fang Lin Chien-Min Wang Jan-Jan Wu 《The Journal of supercomputing》2006,36(3):201-217

In this paper, we study I/O server placement for optimizing parallel I/O performance on switch-based clusters, which typically adopt irregular network topologies to allow construction of scalable systems with incremental expansion capability. Finding optimal solution to this problem is computationally intractable. We quantified the number of messages travelling through each network link by a workload function, and developed three heuristic algorithms to find good solutions based on the values of the workload function. The maximum-workload-based heuristic chooses the locations for I/O nodes in order to minimize the maximum value of the workload function. The distance-based heuristic aims to minimize the average distance between the compute nodes and I/O nodes, which is equivalent to minimizing average workload on the network links. The load-balance-based heuristic balances the workload on the links based on a recursive traversal of the routing tree for the network. Our simulation results demonstrate performance advantage of our algorithms over a number of algorithms commonly used in existing parallel systems. In particular, the load-balance-based algorithm is superior to the other algorithms in most cases, with improvement ratio of 10 to 95% in terms of parallel I/O throughput. 相似文献

17.

On using cache conscious clustering for improving OODBMS performance

《Information and Software Technology》2006,48(11):1073-1082

The two main techniques of improving I/O performance of Object Oriented Database Management Systems (OODBMS) are clustering and buffer replacement. Clustering is the placement of objects accessed near to each other in time into the same page. Buffer replacement involves the selection of a page to be evicted, when the buffer is full. The page evicted ideally should be the page needed least in the future. These two techniques both influence the likelihood of a requested object being memory resident. We believe an effective way of reducing disk I/O is to take advantage of the synergy that exists between clustering and buffer replacement. Hence, we design a framework, whereby clustering algorithms incorporating buffer replacement cache behaviour can be conveniently employed for enhancing the I/O performance of OODBMS. We call this new type of clustering algorithm, Cache Conscious Clustering (C3). In this paper, we present the C3 framework, and a C3 algorithm that we have developed, namely C3-GP. We have tested C3-GP against three well known clustering algorithms. The results show that C3-GP out performs them by up to 40% when using popular buffer replacement algorithms such as LRU and CLOCK. C3-GP offers the same performance as the best existing clustering algorithm when the buffer size compared to the database size is very small. 相似文献

18.

Run Placement Policies for Concurrent Mergesorts Using Parallel Prefetching

Kun-Lung Wu Philip S. Yu James Z. Teng 《Knowledge and Information Systems》1999,1(4):435-457

We study the performance of various run placement policies on disks for the merge phase of concurrent mergesorts using parallel prefetching. The initial sorted runs (input) of a merge and its final sorted run (output) are stored on multiple disks but each run resides only on a single disk. In this paper, we examine through detailed simulations three different run placement policies and the impact of buffer thrashing. The results show that, with buffer thrashing avoidance, the best performance can be achieved by a run placement policy that uses a proper subset of the disks dedicated for writing the output runs while the rest of the disks are used for prefetching the input runs in parallel. However, the proper number of write disks is workload dependent, and if not carefully chosen, it can adversely affect the system performance. In practice, a reasonably good performance can be achieved by a run placement policy that does not place the output run of a merge on any of the disks that store its own input runs but allows the output run to share the same disk with some of the input runs of other merges. 相似文献

19.

一种基于磁盘介质的网络存储系统缓存

尹洋刘振军许鲁《软件学报》2009,20(10):2752-2765

随着计算规模越来越大,网络存储系统应用领域越来越广泛,对网络存储系统I/O性能要求也越来越高.在存储系统高负载的情况下,采用低速介质在客户机和网络存储系统的I/O路径上作为数据缓存也变得具有实际的意义.设计并实现了一种基于磁盘介质的存储系统块一级的缓存原型D-Cache.采用两级结构对磁盘缓存进行管理,并提出了相应的基于块一级的两级缓存管理算法.该管理算法有效地解决了因磁盘介质响应速度慢而带来的磁盘缓存管理难题,并通过位图的使用消除了磁盘缓存写Miss时的Copy on Write开销.原型系统的测试结果表明,在存储服务器高负载的情况下,缓存系统能够有效地提高系统的整体性能. 相似文献

20.

非定常Monte Carlo输运问题的并行算法 总被引：1，自引：0，他引：1

刘杰邓力胡庆丰袁国兴李晓梅《计算机学报》2004,27(1):99-106

文中给出了非定常MonteCarlo(下文简写为MC)输运问题的并行算法 ,对并行程序的加载运行模式进行了讨论和优化设计 .针对MC并行计算设计了一种理想情况下无通信的并行随机数发生器算法 .动态MC输运问题有大量的I/O操作 ,特别是读取剩余粒子数据文件需要大量的I/O时间 ,文中针对I/O问题 ,提出了三种并行I/O算法 .最后给出了并行算法的性能测试结果 ,对比串行计算时间 ,使用 6 4台处理机时的并行计算时间缩短了 30倍相似文献