Similar literature
Found 20 similar documents (search time: 15 ms)
1.
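A File Prefetching Algorithm Supporting Concurrent Access Streams   Total citations: 1, self-citations: 0, citations by others: 1
吴峰光, 奚宏生, 徐陈锋. Journal of Software (《软件学报》), 2010, 21(8): 1820-1833
An on-demand prefetching algorithm is designed and implemented. It adopts a more relaxed sequentiality criterion and uses the state of pages and of the page cache as a reliable basis for its decisions. It can detect sequential accesses buried in random reads and read ahead effectively, and it supports the interleaved access patterns produced by concurrent access to a single file instance. Experiments show that, compared with the original Linux readahead algorithm, sequential read performance under random interference improves by 29%; interleaved read performance is 4 to 27 times that of the traditional algorithm; and application-visible latency improves by up to 35 times. The algorithm has been adopted in the Linux 2.6.24 kernel.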

2.
This paper studies workfile disk management for concurrent mergesorts in a multiprocessor database system. Specifically, we examine the impacts of workfile disk allocation and data striping on the average mergesort response time. Concurrent mergesorts in a multiprocessor system can create severe I/O interference in which a large number of sequential write requests are continuously issued to the same workfile disk and block other read requests for a long period of time. We examine through detailed simulations a logical partitioning approach to workfile disk management and evaluate the effectiveness of data striping. The results show that (1) without data striping, the best performance is achieved by using the entire workfile disks as a single partition if there are abundant workfile disks (or system workload is light); (2) however, if there are limited workfile disks (or system workload is heavy), the workfile disks should be partitioned into multiple groups and the optimal partition size is workload dependent; (3) data striping is beneficial only if the striping unit size is properly chosen; and (4) with a proper striping size, the best performance is generally achieved by using the entire disks as a single logical partition.

3.
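毛友发, 杨明福. Computer Engineering (《计算机工程》), 2004, 30(18): 33-34, 121
Prefetching optimization algorithms for parallel storage are studied. Based on the main access patterns of parallel storage, the paper argues that intra-file data block accesses and inter-file accesses should both be modeled, and proposes the E_IS_PPM algorithm for the former and the Last_N_Successor algorithm for the latter. Finally, the two algorithms are combined into an integrated file prefetching algorithm that adaptively determines the prefetch depth according to how far computation and storage access can overlap and the availability of prefetch pages.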
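As a rough illustration of the inter-file modeling idea named in the abstract above, the following minimal sketch keeps, for each file, the last N distinct files that were opened immediately after it and offers them as prefetch candidates. The abstract does not give the details of Last_N_Successor, so the class name `LastNSuccessor`, its methods, and the "most recent first" ordering are assumptions made for this example only.

```python
from collections import defaultdict, deque


class LastNSuccessor:
    """Hypothetical sketch: remember the last N distinct successors of each file."""

    def __init__(self, n=2):
        self.n = n
        # file name -> recently observed successors, newest first
        self.successors = defaultdict(deque)
        self.previous = None

    def record(self, name):
        """Record one file access and update the successors of the previous file."""
        if self.previous is not None:
            recent = self.successors[self.previous]
            if name in recent:
                recent.remove(name)  # move an already known successor to the front
            recent.appendleft(name)
            while len(recent) > self.n:
                recent.pop()  # forget the oldest successor
        self.previous = name

    def predict(self, name):
        """Return prefetch candidates for `name`, most recent successor first."""
        return list(self.successors.get(name, ()))


model = LastNSuccessor(n=2)
for f in ["a", "b", "a", "c", "a", "d"]:
    model.record(f)
candidates = model.predict("a")  # files that most recently followed "a"
```

With N = 1 this reduces to predicting only the file that followed last time; a larger N trades extra prefetch traffic for a better chance of covering alternating access sequences.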

4.
5.
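Computer Engineering (《计算机工程》), 2017, (2): 98-104
Traditional timing-driven placement algorithms for field programmable gate arrays (FPGAs) based on simulated annealing compute the delay cost with some error, and existing timing optimization algorithms improve placement quality at the price of longer runtime. To address these problems, a parallel FPGA timing-driven placement algorithm based on transactional memory (TM), called TM_DCP, is proposed. The annealing process is distributed across multiple threads, the TM mechanism guarantees the validity of shared-memory accesses, and an improved timing optimization algorithm is embedded in the transactions and executed concurrently. Test results show that, compared with a general-purpose placement and routing tool, TM_DCP with 8 threads reduces the critical path delay by 4.2% on average with only a slight increase in total wirelength and achieves a 1.7x speedup, and its execution speed scales well as the number of threads increases.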

6.
7.
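A Moderately Greedy Parallel Prefetching Technique Based on RAID   Total citations: 1, self-citations: 0, citations by others: 1
吴志刚, 冯丹, 张江陵. Computer Engineering (《计算机工程》), 2003, 29(18): 164-165, 176
Prefetching is an important technique commonly used in computer architecture design to improve system performance. In a RAID (redundant array of inexpensive disks) system, effective prefetching can shorten the average response time of host read requests and increase the data throughput of the disk array. Based on an analysis of the data request characteristics of several major application models, a moderately greedy parallel prefetching algorithm is implemented. Experiments show that this prefetching technique is very effective for large sequential read requests from the host.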

8.
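A parallel Web page prefetching model for cluster servers is proposed. The model uses Markov chains to analyze access paths and prefetches pages in parallel on the nodes of a Web cluster server, combining the high performance and high reliability of cluster technology with the fast response of prefetching. Experiments show that when the model is applied to the dispatcher of a cluster server, the server system achieves a higher request hit rate and greater throughput.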
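The Markov chain analysis of access paths mentioned in the abstract above can be sketched as a first-order transition table estimated from past sessions. This is only a minimal illustration under assumed details: the abstract does not specify the order of the chain, the threshold, or how work is split across cluster nodes, and the names `build_transitions` and `predict_next`, the sample sessions, and the 0.5 threshold are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count page-to-page transitions over a list of sessions (lists of page names)."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, page, threshold=0.5):
    """Return pages whose estimated transition probability from `page` is >= threshold."""
    total = sum(counts[page].values()) if page in counts else 0
    if total == 0:
        return []  # page never seen as a predecessor: nothing to prefetch
    return sorted(p for p, c in counts[page].items() if c / total >= threshold)


# Hypothetical access log, grouped into user sessions.
sessions = [
    ["/", "/news", "/news/1"],
    ["/", "/news", "/news/2"],
    ["/", "/about"],
    ["/", "/news"],
]
transitions = build_transitions(sessions)
to_prefetch = predict_next(transitions, "/")
```

In a cluster setting, each node could hold the same transition table and prefetch the candidates for the requests it serves; that division of work is likewise an assumption of this sketch.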

9.
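Different cache prefetching policies suit different access patterns. This paper reviews the state of research on cache prefetching in storage systems, constructs a triple-based model of access patterns from an analysis of those patterns, and tests on a disk array an adaptive cache prefetching policy intended for complex environments. The results show that the adaptive policy obtains the best disk array performance across different environments.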

10.
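A Prefetching Policy Combined with the State of the Memory Access Miss Queue   Total citations: 1, self-citations: 0, citations by others: 1
As the gap between the access speed of the memory system and the computing speed of the processor widens, memory access performance has become the bottleneck in improving computer system performance. Based on an analysis of the miss behavior of the instruction cache and the data cache, a prefetching policy that takes the state of the memory access miss queue into account is proposed. The policy preserves the order of instruction and data accesses, which helps extract prefetch streams, and it separates instruction-stream prefetching from data-stream prefetching so that the two do not evict each other. When choosing the moment to issue a prefetch, it considers not only whether the bus is currently idle but also the state of the miss queue, which reduces the impact on the processor's normal memory requests. A stream filtering mechanism improves prefetch accuracy and lowers the memory bandwidth that prefetching requires. Results show that with this policy the average memory access latency of the processor decreases by 30% and the IPC of the SPEC CPU2000 programs increases by 8.3% on average.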

11.
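王国仁, 汤南, 于亚新, 孙冰, 于戈. Journal of Software (《软件学报》), 2006, 17(4): 770-781
This paper studies parallel data partitioning strategies for XML documents so that XML queries can be processed in parallel. To describe XML data partitioning, the concept of intermediary nodes is introduced. A set of intermediary nodes splits an XML data tree into a root tree and a set of subtrees: the root tree is replicated at all sites, while the subtrees are distributed evenly across the sites according to the workload of user queries. The same XML data tree admits many sets of intermediary nodes, and different sets yield different partitioning results. The quality of a partitioning is measured by whether the user query workload is balanced across the data partitions. Selecting an optimal set of intermediary nodes is an NP-hard problem, so a set of heuristic optimization rules is designed to address it. Based on this idea, an intermediary-node-based XML data partitioning algorithm, WIN (workload-aware intermediary nodes data placement strategy), is proposed and implemented. Extensive experimental results show that WIN outperforms previous parallel XML data partitioning strategies.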

12.
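A Block Placement Method for Streaming Media Files   Total citations: 9, self-citations: 0, citations by others: 9
吴松, 金海, 邹德清. Chinese Journal of Computers (《计算机学报》), 2006, 29(3): 500-507
Based on a study of how accesses are skewed within a file, a new block placement strategy for streaming media files is proposed. It effectively eliminates the negative impact of this access skew on server performance and improves the service capacity of streaming media servers.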

13.
This paper presents a PageRank-based prefetching technique for accesses to Web page clusters. The approach uses the link structure of a requested page to determine the most important linked pages and to identify the page(s) to be prefetched. The underlying premise of our approach is that in the case of cluster accesses, the next pages requested by users of the Web server are typically based on the current and previous pages requested. Furthermore, if the requested pages have a lot of links to some important page, that page has a higher probability of being the next one requested. An experimental evaluation of the prefetching mechanism is presented using real server logs. The results show that the PageRank-based scheme does better than random prefetching for clustered accesses, with hit rates of 90% in some cases.

14.
Integrating Web Prefetching and Caching Using Prediction Models   Total citations: 2, self-citations: 0, citations by others: 2
Yang Qiang, Zhang Henry Hanning. World Wide Web, 2001, 4(4): 299-321
Web caching and prefetching have been studied in the past separately. In this paper, we present an integrated architecture for Web object caching and prefetching. Our goal is to design a prefetching system that can work with an existing Web caching system in a seamless manner. In this integrated architecture, a certain amount of caching space is reserved for prefetching. To empower the prefetching engine, a Web-object prediction model is built by mining the frequent paths from past Web log data. We show that the integrated architecture improves the performance over Web caching alone, and present our analysis on the tradeoff between the reduced latency and the potential increase in network load.

15.
Simulated annealing based standard cell placement for VLSI designs has long been acknowledged as a computation-intensive process, and as a result, several research efforts have been undertaken to parallelize this algorithm. Parallel placement is most needed for very large circuits. Since these circuits do not fit in memory, the traditional approach has been to partition and place individual modules. This causes a degradation in placement quality in terms of area and wirelength. Our algorithm is circuit-partitioned and can handle arbitrarily large circuits on cluster-of-workstations-type parallel machines, such as the Intel Paragon and IBM SP-2. Most previous work in parallel placement has minimized just area and wirelength, but with current deep submicron designs, minimizing wirelength delay is most important. As a result, the algorithm discussed in this paper also supports timing-driven placement for partitioned circuits. The algorithm, called mpiPLACE, has been tested on several large industry benchmarks on a variety of parallel architectures.

16.
C++ was originally designed as a sequential programming language. For development of multithreaded applications, libraries, such as Pthreads, Windows threads, and Boost, are traditionally used. The C++11 standard introduced some basic concepts and means for developing parallel and concurrent programs, but the direct use of these low-level means requires high programming skills and significant efforts. The absence of high-level models of parallelism in C++ is somewhat compensated for by various parallel libraries and directive parallelization tools (such as OpenMP), as well as by language extensions supported by some compilers (Intel CilkPlus). Nevertheless, we still require more advanced means to express parallelism in programs at the level of language standard and language library. In this survey, we consider the means for parallel and concurrent programming that are included into the C++17 standard, as well as some capabilities that are to be expected in the future standards.

17.
In this paper we propose and evaluate a new data-prefetching technique for cache coherent multiprocessors. Prefetches are issued by a functional unit called a prefetch engine which is controlled by the compiler. We let second-level cache misses generate cache miss traps and start the prefetch engine in a trap handler. The trap handler is fast (40–50 cycles) and does not normally delay the program beyond the memory latency of the miss. Once started, the prefetch engine executes on its own and causes no instruction overhead. The only instruction overhead in our approach is when a trap handler completes after data arrives. The advantages of this technique are (1) it exploits static compiler analysis to determine what to prefetch, which is hard to do in hardware, (2) it uses prefetching with very little instruction overhead, which is a limitation for traditional software-controlled prefetching, and (3) it is accurate in the sense that it generates very little useless traffic while maintaining a high prefetching coverage. We also study whether one could emulate the prefetch engine in software, which would not require any additional hardware beyond support for generating cache miss traps and ordinary prefetch instructions. In this paper we present the functionality of the prefetch engine and a compiler algorithm to control it. We evaluate our technique on six parallel scientific and engineering applications using an optimizing compiler with our algorithm and a simulated multiprocessor. We find that the prefetch engine removes up to 67% of the memory access stall time at an instruction overhead less than 0.42%. The emulated prefetch engine removes in general less stall time at a higher instruction overhead.

18.
Sanders, Egner, Korst. Algorithmica, 2003, 35(1): 21-55
High performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is perhaps the main inhibitor for the efficient adaptation of single-disk external memory algorithms to multiple disks. We solve this problem for arbitrary access patterns by randomly mapping blocks of a logical address space to the disks. We show that a shared buffer of O(D) blocks suffices to support efficient writing. The analysis uses the properties of negative association to handle dependencies between the random variables involved. This approach might be of independent interest for probabilistic analysis in general. If two randomly allocated copies of each block exist, N arbitrary blocks can be read within […] I/O steps with high probability. The redundancy can be further reduced from 2 to 1+1/r for any integer r without a big impact on reading efficiency. From the point of view of external memory models, these results rehabilitate Aggarwal and Vitter's "single-disk multi-head" model [1] that allows access to D arbitrary blocks in each I/O step. This powerful model can be emulated on the physically more realistic independent disk model [2] with small constant overhead factors. Parallel disk external memory algorithms can therefore be developed in the multi-head model first. The emulation result can then be applied directly or further refinements can be added.
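To make the duplicate-allocation idea in the abstract above concrete, the sketch below places two copies of every block on two distinct random disks and then reads a batch of blocks, choosing for each block the copy on the currently less loaded disk. The abstract does not say how reads are scheduled, so the greedy rule here is a simple stand-in chosen for illustration, not the paper's method; the function names, the seed, and the values of D and N are also assumptions.

```python
import random


def allocate_copies(num_blocks, num_disks, rng):
    """Place two copies of every block on two distinct, randomly chosen disks."""
    return [tuple(rng.sample(range(num_disks), 2)) for _ in range(num_blocks)]


def schedule_reads(blocks, placement, num_disks):
    """Greedy heuristic (assumed): read each block from the less loaded of its two disks."""
    load = [0] * num_disks
    for block in blocks:
        first, second = placement[block]
        target = first if load[first] <= load[second] else second
        load[target] += 1
    return load


rng = random.Random(0)  # fixed seed so the run is repeatable
D, N = 8, 64            # hypothetical number of disks and blocks
placement = allocate_copies(N, D, rng)
load = schedule_reads(range(N), placement, D)
# Each disk serves one block per I/O step, so the busiest disk
# determines how many parallel I/O steps the batch needs.
steps = max(load)
```

Comparing `steps` with the lower bound of N / D for different seeds gives a feel for how much the second copy evens out the per-disk load.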

19.
In this paper, we discuss and compare several policies to place replicas in tree networks, subject to server capacity and Quality of Service (QoS) constraints. The client requests are known beforehand, while the number and location of the servers are to be determined. The standard approach in the literature is to enforce that all requests of a client be served by the closest server in the tree. We introduce and study two new policies. In the first policy, all requests from a given client are still processed by the same server, but this server can be located anywhere in the path from the client to the root. In the second policy, the requests of a given client can be processed by multiple servers. One major contribution of this paper is to assess the impact of these new policies on the total replication cost. Another important goal is to assess the impact of server heterogeneity. In this paper, we establish several new complexity results, and provide several efficient polynomial heuristics for NP-complete instances of the problem. The absolute performance of these heuristics is assessed by comparison with the optimal solution provided by the formulation of the problem in terms of the solution of an integer linear program.

20.
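Data distribution is the foundation of a parallel database system implementation, and the quality of the distribution method directly affects the running efficiency of the parallel database. By analyzing and comparing several one-dimensional and multi-dimensional data distribution methods, this paper describes data distribution strategies for parallel databases and the directions in which they are developing.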
