期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Strategy independent program restructuring using the critical reference principle

Kwai-Ting Lan Dennis Kafura 《Performance Evaluation》1984,4(2):105-115

This paper defines and analyzes two new strategy independent program restructuring algorithms. The more effective of these two algorithms combines the critical reference principle used in strategy dependent algorithms with the bounded locality interval mechanism. Limited, though encouraging, results are presented which show that this new algorithm can be at least as effective as the Critical LRU algorithm, even when the memory management policy is LRU itself, and can also be at least as effective as Critical Working Set, even when the memory management policy is the working set policy. This new algorithm combines the benefits of strategy independent restructuring with the performance improvements previously possible only with a strategy dependent method. 相似文献

2.

Improving page prefetching with prior knowledge

Edwin J. Lau 《Performance Evaluation》1982,2(3):195-206

Prefetching is a technique applied to memory management policies in which pages are brought into memory before they are actually needed. In this study, prior knowledge of program behavior obtained through trace data is used to parameterize a variation of the working set memory management policy supporting demand page prefetching using one page lookahead. Two new algorithms supporting double page prefetching are proposed. A comparative analysis of them is made.Evaluation of the techniques through trace driven simulation shows that, in general, they are very effective in reducing the page fault rate and in some cases the space time product. As expected, memory occupancy increases, but usually by small amounts. The results are encouraging for their ability to improve performance for a variety of programs which exhibit some form of locality (spatial and/or temporal). 相似文献

3.

Segmentation,paging and optimal page sizes in virtual memory

Timo O Alanko A Inkeri Verkamo 《Performance Evaluation》1983,3(1):13-33

On the basis of empirical data two topics concerning virtual memory systems are discussed: determining an optimal page size and performance of segmentation as compared to paging. Several production programs have been executed (on a simulator) both in a segmented system and in a paged system with various page sizes; the memory management was based on the working set policy. The memory usage and the fault rates were recorded, and the lifetime functions and space-time integrals were evaluated. The observations are explained using a new model of program behaviour which is a refinement of the phase-transition model. The results show that there is no globally optimal page size. Two characteristic types of programs are observed: the first requires a small page size and a large window size, the second requires a large page size and a small window size. Segmentation and paging are compared with respect to their usage of various resources. In the sense of the space-time integral, segmentation usually outperforms paging; if the mean segment size is large, the difference is remarkable. Several commonly used assumptions about the effects of page size on program behaviour are validated; some of them are found inaccurate or even wrong. 相似文献

4.

Token-ordered LRU: an effective page replacement policy and its implementation in Linux systems

Song Xiaodong 《Performance Evaluation》2005,60(1-4):5-29

Most computer systems use a global page replacement policy based on the LRU principle to approximately select a Least Recently Used page for a replacement in the entire user memory space. During execution interactions, a memory page can be marked as LRU even when its program is conducting page faults. We define the LRU pages under such a condition as false LRU pages because these LRU pages are not produced by program memory reference delays, which is inconsistent with the LRU principle. False LRU pages can significantly increase page faults, even cause system thrashing. This poses a more serious risk in a large parallel systems with distributed memories because of the existence of coordination among processes running on individual node. In the case, the process thrashing in a single node or a small number of nodes could severely affect other nodes running coordinating processes, even crash the whole system. In this paper, we focus on how to improve the page replacement algorithm running on one node.

After a careful study on characterizing the memory usage and the thrashing behaviors in the multi-programming system using LRU replacement. we propose an LRU replacement alternative, called token-ordered LRU, to eliminate or reduce the unnecessary page faults by effectively ordering and scheduling memory space allocations. Compared with traditional thrashing protection mechanisms such as load control, our policy allows more processes to keep running to support synchronous distributed process computing. We have implemented the token-ordered LRU algorithm in a Linux kernel to show its effectiveness. 相似文献

5.

Implementing sequentially consistent programs on processor consistent platforms

Lisa Higham Jalal Kawash 《Journal of Parallel and Distributed Computing》2008

Designers of distributed algorithms typically assume strong memory consistency guarantees, but system implementations provide weaker guarantees for better performance and scalability. This motivates the study of how to implement programs designed for sequential consistency on platforms with weaker consistency models. Typically, such implementations are impossible using only read and write operations to shared variables. One variant of processor consistency originally proposed by Goodman and called here PC-G is an exception because it provides just enough consistency to implement mutual exclusion using only reads and writes. This paper investigates the existence of compilers to convert arbitrary programs that use shared read/write variables with sequentially consistent memory semantics, to programs that use read/write variables with PC-G consistency semantics. We first provide a simple program transformation, and prove that it correctly compiles any 2-process program to a PC-G memory system, while preserving wait-freedom. We next prove that even a substantial generalization of this transformation cannot be a compiler for even a very restricted class of 3-process programs. Even though our program transformation is not a general compiler for three or more processes, it does correctly transform some specific n-process programs. In particular, for the special case of the (necessarily randomized) Test&Set algorithm of Tromp and Vitanyi, our transformation extends to any number of processes, thus providing the first algorithm for expected wait-free Test&Set on any weak memory system, using only read/write variables. 相似文献

6.

Program Restructuring in a Multilevel Virtual Memory

《IEEE transactions on pattern analysis and machine intelligence》1983,(1):69-79

Program restructuring techniques have proven successful in two-level automatically managed memory hierarchies. The possibility of extending them to multilevel environments is investigated. The performance of strategy-oriented restructuring algorithms in a three-level linear hierarchy managed by sampled working set policies or by a combination of sampled working set and local LRU policies is studied both analytically (assuming an independent reference model of program behavior) and by trace-driven simulation. The results of the study show that strategy-oriented restructuring may be as beneficial in a virtual memory with three levels as it is in one with two levels. 相似文献

7.

Page replacement algorithm for large-array manipulation

Hiroshi Muramatsu Hiroaki Negishi 《Software》1980,10(7):575-587

The Least Recently Used (LRU) method is the technique that is most often used for a page replacement algorithm. However, LRU may not be an effective method for some large-array manipulation. A method for solving this problem is presented in this paper. In this method, the Operating System (OS) automatically detects loops that are not suitable for the LRU method. There is no necessity to modify application programs nor any additional overhead as long as sufficient real memory is provided. Therefore this method is most suitable for practical use. It functions on the basis of information about the relationship between the working set size and the window size, and indication of whether the detected page fault is caused by an instruction fetch or an operand fetch, etc. The results of experiments and improvements are also shown. 相似文献

8.

Multiprogramming and memory contention

Alan Jay Smith 《Software》1980,10(7):531-552

We study memory contention during multiprogramming when programs are free to compete for page frames. A random walk between the possible partitions of memory over the set of active programs is used to model memory contention and calculate throughput. Our model of contention takes into account program characteristics by using miss ratio curves, and also considers memory size and page fetch latency. With the aid of numerous trace-driven simulations, we are able to verify our model, finding good agreement both in the observed distribution of memory among competing programs and in CPU utilization. We find that for high ratios of secondary to primary memory access time and under conditions of high memory contention, small programs with compact working sets are able to run with far less than expected interference from larger, more diffuse programs. In the case of multiprogramming the same program several times, we find that observed partition distributions are not necessarily even and that higher than expected levels of CPU use are observed. Lower ratios of access time are found to yield different results; programs compete on a more even basis and partition memory relatively more evenly than with higher ratios. 相似文献

9.

Efficient calculation of the space-time performance of the working set

Gary Chastek

John P. Kearns 《Performance Evaluation》1989,10(4):281-294

The performance of working set memory management is often used as a benchmark against which other algorithms are compared. Efficient algorithms which can generate exact performance curves for the mean working set size and fault rate have appeared in the literature. These algorithms exploit the fact that the curve of fault rate or mean resident set size versus the working set parameter value may be computed by gathering various statistics during one pass across the reference string and then by substituting those statistics into relatively simple formulae.

This paper develops an efficient technique for the determination of the exact space time product versus parameter value curve. The usual simulation approach to compute a space-time performance curve exhibits worst case time complexity of O(nn²) where p is the number of pages referenced by the program and n is the total number of memory references generated by the program. The one-pass technique presented here requires O(pn) time. Given that n may be quite large in practice (10⁷ references is not unusual), this reduction in complexity is necessary for the practical determination of space-time performance of the working set. 相似文献

10.

Some Empirical Observations on Program Behavior with Applications to Program Restructuring

《IEEE transactions on pattern analysis and machine intelligence》1985,(2):188-193

The dynamic behavior of executing programs is a significant factor in the performance of virtual memory computer systems. Program restructuring attempts to improve the behavior of programs by reorganizing their object code to account for the characteristics of the virtual memory environment. A significant component of the restructuring process involves a restructuring graph. An analysis of restructuring graphs of typical programs found edge weights to be distributed in a Bradford–Zipf fashion, implying that a large fraction of total edge weight is concentrated in relatively few edges. This empirical observation can be used to improve the clustering phase of program restructuring, by limiting consideration to edges of large weight. We consider the effect of this improved clustering in the restructuring process by examining various means of restructuring some typical programs. In our experiments, 95 percent of the total edge value is typically accounted for by 50–60 percent of the edges. For naive clustering algorithms, clustering time is therefore typically halved; for more sophisticated methods, more substantial savings result. Finally, clustering with 95 percent of total edge value typically results in only a small decay in performance measures such as number of page faults and average working set size. 相似文献

11.

A set of strategy-independent restructuring algorithms

Makoto Kobayashi 《Software》1977,7(5):585-594

This paper proposes a set of new program restructuring algorithms which can be used to reorganize programs so as to increase their performance under two typical memory management strategies. The new algorithms are based on a recently proposed program behaviour model called the bounded locality intervals model, which allows us to give a precise definition of the localities of a program. The paging activities of a program restructured with the new algorithms under working-set and global LRU-like memory management strategies are simulated to evaluate the new algorithms. Some of them are shown to have quite satisfactory performance. 相似文献

12.

对页面访问序列影响LRU页面置换算法的研究

蒋飞虎舒平《微机发展》2006,16(5):42-43

页面置换算法是操作系统中虚拟存储管理的一个重要部分。改进页面置换算法,可以降低页面失败率,从而有效地提高系统性能。现有的应用于虚拟存储管理的页面置换算法主要是Least Reference Used(LRU)页面置换算法。文中利用页面访问间隔数,分析不同的页面访问序列对LRU算法的影响,把页面访问序列分为LRU-友好页面访问序列、LRU-不友好页面访问序列、不友好页面访问序列三类,为改进LRU页面置换算法提供了依据。相似文献

13.

A Comparative Study of Set Associative Memory Mapping Algorithms and Their Use for Cache and Main Memory

《IEEE transactions on pattern analysis and machine intelligence》1978,(2):121-130

Set associative page mapping algorithms have become widespread for the operation of cache memories for reasons of cost and efficiency. We show how to calculate analytically the effectiveness of standard bit-selection set associative page mapping or random mapping relative to fully associative (unconstrained mapping) paging. For two miss ratio models, Saltzer's linear model and a mixed geometric model, we are able to obtain simple, closed-form expressions for the relative LRU fault rates. We also experiment with two (infeasible to implement) dynamic mapping algorithms, in which pages are assigned to sets either in an LRU or FIFO manner at fault times, and find that they often yield significantly lower miss ratios than static algorithms such as bit selection. Trace driven simulations are used to generate experimental results and to verify the accuracy of our calculations. We suggest that as electronically accessed third-level memories composed of electron-beam tubes, magnetic bubbles, or charge-coupled devices become available, algorithms currently used only for cache paging will be applied to main memory, for the same reasons of efficiency, implementation ease, and cost. 相似文献

14.

A low-cost memory remapping scheme for address bus protection

Jun Yang Lan Gao Youtao Zhang Marek Chrobak Hsien-Hsin S. Lee 《Journal of Parallel and Distributed Computing》2010

The address sequence on the processor-memory bus can reveal abundant information about the control flow of a program. This can lead to leakage of proprietary algorithms or critical information such as encryption keys. Addresses can be observed by side-channel attacks mounted on remote servers that run sensitive programs but are not under the physical control of the client. Two previously proposed hardware techniques tackled this problem through randomizing address patterns on the bus. In this paper, we examine these attempts and show that they impose great pressure on both the memory and the disk. We propose a lightweight solution to alleviating the pressure with equal security strength. The results show that our technique can reduce the memory traffic by a factor of 10 compared with the prior scheme, while keeping almost the same page fault rate as a baseline system with no security protection. 相似文献

15.

Exploiting the least recently used page replacement algorithm

Donald B. Innes 《Software》1977,7(2):271-273

Many implementations of paged virtual memory systems employ demand fetching with least recently used (LRU) replacement. The stack characteristic of LRU replacement implies that a reference string which repeatedly accesses a number of pages in sequence will cause a page fault for each successive page referenced when the number of pages is greater than the number of page frames allocated to the program's LRU stack. In certain circumstances when the individual operations being performed on the referenced string are independent, or more precisely are commutative, the order of alternate page reference sequences can be reversed. This paper considers sequences which cannot be reversed and shows how placement of information on pages can achieve a similar effect if at least half the pages can be held in the LRU stack. 相似文献

16.

A Unified Analysis of Paging and Caching

E. Torng 《Algorithmica》1998,20(2):175-200

Paging (caching) is the problem of managing a two-level memory hierarchy in order to minimize the time required to process a sequence of memory accesses. In order to measure this quantity, which we refer to as the total memory access time, we define the system parameter miss penalty to represent the extra time required to access slow memory. We also introduce the system parameter page size. In the context of paging, miss penalty is quite large, so most previous studies of on-line paging have implicitly set miss penalty =∞ in order to simplify the model. We show that this seemingly insignificant simplification substantially alters the precision of derived results. For example, previous studies have essentially ignored page size. Consequently, we reintroduce the miss penalty and page size parameters to the paging problem and present a more accurate analysis of on-line paging (and caching). We validate using this more accurate model by deriving intuitively appealing results for the paging problem which cannot be derived using the simplified model. First, we present a natural, quantifiable definition of the amount of locality of reference in any access sequence. We also point out that the amount of locality of reference in an access sequence should depend on page size among other factors. We then show that deterministic and randomized marking algorithms such as the popular least recently used (LRU) algorithm achieve constant competitive ratios when processing typical access sequences which exhibit significant locality of reference; this represents the first competitive analysis result which (partially) explains why LRU performs as well as it is observed to in practice. Next, we show that finite lookahead can be used to obtain algorithms with improved competitive ratios. In particular, we prove that modified marking algorithms with sufficient lookahead achieve competitive ratios of 2. This is in stark contrast to the simplified model where lookahead cannot be used to obtain algorithms with improved competitive ratios. We conclude by using competitive analysis to evaluate the benefits of increasing associativity in caches. We accomplish this by specifying an algorithm and varying the system configuration rather than the usual process of specifying the system configuration and varying the algorithm. Received August 7, 1995; revised May 7, 1996, and August 6, 1996. 相似文献

17.

Internal Scheduling and Memory Contention

《IEEE transactions on pattern analysis and machine intelligence》1981,(1):135-146

It has been suggested that the algorithm used to schedule those processes active and in main memory can have an effect on memory contention. We create models for memory contention in a system that uses global LRU replacement and either round robin or priority internal scheduling. Parameters to our model include the ratio of secondary storage to primary storage access times, thus allowing consideration of a variety of storage technologies. The round robin quantum size is included and is shown to have some effect. Our model uses LRU miss ratio curves and thus reflects actual program characteristics. Trace driven simulations are used to verify the accuracy of the models. We find that in most cases internal scheduling has only a small effect on page fault rates and CPU utilization. In certain cases, however priority scheduling is found to besignificant in relieving thrashing. 相似文献

18.

Window‐LRFU: a cache replacement policy subsumes the LRU and window‐LFU policies

Sen Bai Xin Bai Xiangjiu Che 《Concurrency and Computation》2016,28(9):2670-2684

Replacement algorithms have been widely used as key technologies for cache management in areas such as file systems or database management. A replacement algorithm determines which page to be evicted when the cache is full and a new page is referenced. Because replacement policies considering only recency or frequency such as LRU (Least Recently Used) and LFU (Least Frequently Used) do not perform well, replacement polices that take both recency and frequency into account have been intensively studied. As a classical replacement policy, LRFU (Least Recently/Frequently Used) policy subsumes the LRU and LFU policy. However, because LFU is not able to adapt to the change of page accessing pattern and it is hard to select a suitable λ for each certain trace, LRFU cannot always guarantee a good performance. In this paper, we propose a Window‐LRFU policy, to subsume the LRU and Window‐LFU policies. Experimental results show that the Window‐LRFU policy outperforms LRFU and has at least competitive performance than other classical algorithms. Copyright © 2015 John Wiley & Sons, Ltd. 相似文献

19.

A Comparison of Dynamic and Static Virtual Memory Allocation Algorithms

《IEEE transactions on pattern analysis and machine intelligence》1981,(1):122-131

In this paper we compare the performance of virtual memory allocation algorithms. The primary measure of performance is the space-time product of primary memory occupancy, or space-time cost, used by a program during its execution. Using DMIN, an optimal dynamic aliocation algorithm, we compute the minimum space-time cost achievable for some benchmark program runs. We compare the DMIN space-time cost with the space-time cost from: MIN, an optimal static allocation algorithm, VMIN, an optimal variable space algorithm, and two heuristic dynamic allocation algorithms. the page fault frequency algorithm and the damped working set algorithm. 相似文献

20.

Time-shift replacement algorithm for main memory performance optimization

Hong-Bin Tsai Chin-Laung Lei 《The Journal of supercomputing》2018,74(6):2729-2746

Page replacement algorithms of main memory in modern operating systems are crucial in system performance. When memory is full, a page replacement algorithm exploits temporal locality and frequency of page references to evict the page that is least likely to be accessed in the near future. Subsequently, loading the majority of data directly from memory improves performance by reducing I/O waits of accessing slow storage. Research of replacement algorithms that maximizes hit ratio while incurring as less overhead as possible has been constantly studied. In this paper, we propose a time-shift least recently used (TSLRU) algorithm that converts frequency information of page references into temporal locality. Frequent accesses of a page are thus recognized and accumulated in terms of time. Moreover, pages being loaded into memory for the first time are not necessarily the most recently used pages. As a result, one-pass pages are evicted sooner in our algorithm than in traditional LRU algorithm. Our performance evaluations show that the TSLRU outperforms conventional page replacement algorithms on both artificial and real application traces. For example, hit ratio of TSLRU advances ARC by \(4.17\%\) and LRU by \(5.91\%\) on normal distributed workloads. Moreover, TSLRU outperforms ARC by over \(2\%\) on half of the application traces tested. 相似文献