1.

Robust performance in hybrid-memory cooperative caches

Luiz Ramos Ricardo Bianchini 《Parallel Computing》2014

Modern servers require large main memories, which so far have been enabled by increasing DRAM’s density. With DRAM’s scalability nearing its limit, Phase-Change Memory (PCM) is being considered as an alternative technology. PCM is denser, more scalable, and consumes lower idle power than DRAM, while exhibiting byte-addressability and access times in the nanosecond range. Still, PCM is slower than DRAM and has limited endurance. These characteristics prompted the study of hybrid memory systems, combining a small amount of DRAM and a large amount of PCM. In this paper, we leverage hybrid memories to improve the performance of cooperative memory caches in server clusters. Our approach entails a novel policy that exploits popularity information in placing objects across servers and memory technologies. Our results show that (1) DRAM-only and PCM-only memory systems do not perform well in all cases; and (2) when managed properly, hybrid memories always exhibit the best or close-to-best performance, with significant gains in many cases, without increasing energy consumption.

2.

Research progress on key techniques for extending the write lifetime of phase change memory (相变存储器写寿命延长关键技术研究进展)

张震 付印金 胡谷雨 《计算机工程与科学》2018,40(9):1546-1555

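As big-data analytics applications demand ever faster response and the "storage wall" problem grows more prominent, the storage system has become the bottleneck of overall computer system performance. New non-volatile memories (NVM), represented by phase change memory (PCM), offer high integration density, low power consumption, fast read/write access, non-volatility, small size, and shock resistance, and have become the most promising next-generation storage devices. However, limited write endurance is an obstacle to the practical use of PCM, and how to extend PCM lifetime by reducing writes and by wear leveling is a current research focus. This paper reviews the state of the art, strengths, and weaknesses of PCM write-lifetime extension techniques from three aspects: reducing PCM writes, evening out the write distribution, and page migration in hybrid memory. It concludes by discussing possible directions for further improving PCM lifetime.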

3.

A heterogeneous memory page migration mechanism based on a DRAM victim cache (基于DRAM牺牲Cache的异构内存页迁移机制)

裴颂文 钱艺幻 叶笑春 刘海坤 孔令和 《计算机研究与发展》2022,59(3):568-581

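When massive numbers of data requests access a heterogeneous memory system, heterogeneous memory pages migrate back and forth frequently between dynamic random access memory (DRAM) and non-volatile memory (NVM). However, migration policies designed for traditional memory pages struggle to adapt to rapid, dynamic changes in how "cold" or "hot" a page is, which causes pages migrated from DRAM to N...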

4.

Exploiting write power asymmetry to improve phase change memory system performance

Qi WANG Donghui WANG Chaohuan HOU 《Frontiers of Computer Science》2015,9(4):566-575

Phase change memory (PCM) is a promising candidate to replace DRAM as main memory, thanks to its better scalability and lower static power than DRAM. However, PCM also presents a few drawbacks, such as long write latency and high write power. Moreover, the write-command parallelism of PCM is restricted by instantaneous power constraints, which degrades write bandwidth and overall performance. The write power of PCM is asymmetric: writing a zero consumes more power than writing a one. In this paper, we propose a new scheduling policy, write power asymmetry scheduling (WPAS), that exploits the asymmetry of write power. WPAS improves the write-command parallelism of PCM memory without violating the power constraint. The evaluation results show that WPAS can improve performance by up to 35.5%, and by 18.5% on average. The effective read latency can be reduced by up to 33.0%, and by 17.1% on average.
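
The power-budget idea in the abstract above can be illustrated with a minimal Python sketch. It is not the WPAS policy itself, whose details the abstract does not give. The per-bit costs `POWER_ZERO` and `POWER_ONE`, the 8-bit word width, and the greedy in-order batching in `schedule` are all assumptions made for illustration; the only fact taken from the abstract is that writing a zero costs more power than writing a one.

```python
from typing import List

# Hypothetical per-bit power costs in arbitrary units. The abstract only
# states that writing a zero consumes more power than writing a one.
POWER_ZERO = 2.0
POWER_ONE = 1.0


def write_power(data: int, width: int = 8) -> float:
    """Estimate the power needed to write `data` as a `width`-bit word."""
    ones = bin(data & ((1 << width) - 1)).count("1")
    zeros = width - ones
    return zeros * POWER_ZERO + ones * POWER_ONE


def schedule(writes: List[int], budget: float, width: int = 8) -> List[List[int]]:
    """Greedily pack writes, in order, into batches that fit the power budget.

    Each batch stands for a set of writes issued in parallel. A new batch is
    started whenever the next write would push the batch over the budget.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0.0
    for word in writes:
        power = write_power(word, width)
        if power > budget:
            raise ValueError("a single write exceeds the power budget")
        if current and used + power > budget:
            batches.append(current)
            current, used = [], 0.0
        current.append(word)
        used += power
    if current:
        batches.append(current)
    return batches
```

With a budget of 16 units, `schedule([0xFF, 0xFF, 0x00, 0x00], budget=16.0)` places the two all-ones bytes in one batch, while each all-zeros byte needs a batch of its own. A scheduler that budgeted every write at the worst-case cost would issue all four writes one at a time.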

5.

A hybrid main memory simulator based on DRAM and PCM (基于DRAM和PCM的混合主存模拟器)

张德志 万寿红 岳丽华 《计算机系统应用》2017,26(9):16-23

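Phase change memory (PCM) has become a hot topic in main memory research because of its non-volatility, high read speed, and low static power. However, usable PCM devices are currently lacking, so PCM-based algorithms cannot be validated effectively. This paper therefore proposes using a main memory simulator to simulate and validate PCM algorithms. It first describes the characteristics of existing main memory simulators and points out that they do not fully meet the practical needs of current main memory research. On that basis, it proposes and builds a hybrid main memory simulator based on DRAM and PCM. Experimental comparison with existing simulators shows that the proposed simulator can effectively simulate a DRAM and PCM hybrid memory architecture, supports different forms of hybrid main memory systems, and is highly configurable. Finally, a usage example illustrates the ease of use of the simulator's programming interface.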

6.

A write optimization method exploiting the asymmetry of phase change memory (利用相变存储器不对称性的写入优化方法)

张格毅 陈小刚 郭继鹏 宋志棠 陈邦明 《计算机工程与应用》2021,57(14):75-82

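Phase change memory offers high integration density, low power consumption, and non-volatility, and is one of the most promising storage media for non-volatile memory. Reducing its write latency and extending its lifetime are problems that urgently need solving when PCM is used as non-volatile memory. To this end, this paper proposes RSIW (Reset and Set Independently Write), a write method that performs erase and write independently by exploiting the asymmetry between PCM erase time and write time. Unlike conventional write schemes, the method separates the write and erase operations and lets the slow write operation run during idle time, which significantly improves PCM write speed. RSIW can also be combined with a wear-leveling policy to balance the write frequency across blocks effectively. The paper describes the method and its implementation details, compares it with similar schemes that exploit PCM erase/write asymmetry, and evaluates it with the gem5 simulator. According to the experimental results, the method improves system running efficiency by 37.1% to 69.1% over comparable techniques.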

7.

A compiler assisted wear leveling for morphable PCM in embedded systems

《Journal of Systems Architecture》2016

Phase change memory (PCM) is considered a promising alternative to DRAM-based main memory in embedded systems. A PCM cell can be dynamically programmed to be in either multiple-level cell (MLC) mode or single-level cell (SLC) mode. With this morphable feature, we can utilize the high density of MLC and the low latency of SLC to satisfy the various memory requirements of specific applications in embedded systems. However, compared to its SLC counterpart, the lifetime of MLC is limited. To address this issue, this paper proposes a simple and effective wear-leveling technique, named Mixer, to enhance the lifetime of morphable PCM by considering program-specific features. We first build an Integer Linear Programming (ILP) formulation to dynamically configure the optimal SLC/MLC partition in morphable PCM, and to produce the best data allocation for each variable so as to achieve a balanced write distribution in morphable PCM with low memory access cost. The basic idea is to allocate low-latency SLC cells for write-intensive variables and high-density MLC cells for other ordinary variables. We then propose a polynomial-time algorithm to achieve near-optimal results. The evaluation results show that the proposed technique can effectively improve the lifetime of morphable PCM in embedded systems compared with previous work.

8.

A space allocation and reuse strategy for PCM-based embedded systems

《Journal of Systems Architecture》2014,60(8):655-667

Phase change memory (PCM) has emerged as a promising candidate to replace DRAM in embedded systems, due to its appealing properties, such as zero leakage power, scalability, shock resistance, and high density. However, it can only sustain a limited number of write operations. Moreover, a program in an embedded system usually distributes write traffic in an extremely unbalanced way, which can further decrease PCM lifetime. In this paper, we propose a space-based wear-leveling technique at the compiler level that exploits program-specific features. The basic idea is to extend frequently written variables into specific-sized arrays and to distribute writes evenly across the allocated array. In this way, we can effectively distribute the write traffic of the program across the whole PCM chip. A space allocation and reuse (SAR) strategy and a polynomial-time algorithm are proposed to produce optimal and near-optimal space allocation, respectively, for achieving a balanced write distribution. The experimental results show that our technique can greatly extend the lifetime of PCM-based embedded systems compared with previous work, and achieves approximately 94% of the theoretical maximum lifetime. Compared with a baseline scheme without a wear-leveling mechanism, our technique introduces no more than 0.8% extra writes and 0.7% running overhead.
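
The basic idea stated in the abstract above, extending a frequently written variable into an array and spreading writes across it, can be sketched in a few lines of Python. This is only an illustration under assumptions, not the paper's SAR strategy: the class name `RotatingVariable`, the round-robin slot choice, and the per-slot counters are hypothetical, and the sketch assumes the slot index itself lives somewhere that does not wear out (for example a register).

```python
from typing import List


class RotatingVariable:
    """A scalar backed by a small array; each write lands in the next slot.

    Hypothetical illustration of space-based wear leveling: one hot variable
    is spread over `slots` cells so that no single cell absorbs every write.
    """

    def __init__(self, slots: int, initial: int = 0) -> None:
        if slots < 1:
            raise ValueError("slots must be at least 1")
        self._cells: List[int] = [initial] * slots
        self._writes: List[int] = [0] * slots  # per-cell write counts, for inspection
        self._cur = 0  # index of the cell holding the current value

    def write(self, value: int) -> None:
        # Advance round-robin, then store the value in the new current cell.
        self._cur = (self._cur + 1) % len(self._cells)
        self._cells[self._cur] = value
        self._writes[self._cur] += 1

    def read(self) -> int:
        return self._cells[self._cur]

    def wear(self) -> List[int]:
        """Return a copy of the per-cell write counts."""
        return list(self._writes)
```

After ten writes to a `RotatingVariable(4)`, no cell has been written more than three times, whereas a plain scalar would have taken all ten writes in one cell.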

9.

Memory organizations for 3D-DRAMs and PCMs in processor memory hierarchy

《Journal of Systems Architecture》2015,61(10):539-552

In this paper, we describe and evaluate three possible architectures for using 3D-DRAMs and PCMs in the processor memory hierarchy. We explore: (i) using 3D-DRAM as main memory with PCM as backing store; (ii) using 3D-DRAM as the Last Level Cache and PCM as the main memory; and (iii) using both 3D-DRAM and PCM as main memory. In each of these configurations, since the proposed memories are significantly faster than today’s off-chip 2D DRAMs for main memories and magnetic hard drives for secondary storage, we introduce hardware assistance to speed up virtual-to-physical address translation. We use Simics, a full system simulator, and benchmarks from both SPEC and OLTP suites to evaluate our designs. We use CACTI to obtain energy and latency values for our configurations. We measure energy consumed and execution performance for the selected benchmarks. Our studies lead to the following conclusions. The best performance is obtained when 3D-DRAMs are used as last level caches (LLC) and PCM as the main memory. However, this organization performs poorly in terms of energy consumed. Using 3D-DRAM together with PCM as main memory is the best choice in terms of energy consumed. In terms of write-backs, 3D-DRAM as LLC causes fewer writes to PCM than the other organization. These experiments can be extended to explore specific memory organizations, the capacities of 3D-DRAM needed as LLC or main memory, and how the hybrid PCM/DRAM memory should be used for specific application contexts.

10.

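Reducing write-backs to PCM main memory using a multi-dimensional hierarchical cache replacement policy (利用多维分级Cache替换策略减少对PCM内存写回量)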

阮深沉 王海霞 汪东升 《计算机工程与科学》2016,38(8):1568-1573

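Finding new memory materials to replace DRAM main memory is a current research focus. Phase change memory (PCM) has attracted wide attention for its low power consumption, high storage density, and non-volatility. However, PCM supports only a limited number of writes, so using it as main memory requires reducing writes to it. One effective solution is to optimize the cache replacement policy so that fewer dirty blocks are evicted from the cache. Existing work gives dirty blocks extra protection mainly by assigning them a higher protection priority on insertion and on access hits, but no longer distinguishes dirty blocks from clean blocks during demotion, so the cache may still evict dirty blocks first even when many clean blocks are present. This paper proposes a new cache replacement policy, MAC, which uses a multi-dimensional hierarchical structure to set an uncrossable boundary between dirty and clean blocks, so that dirty blocks are protected more strongly. Simulation experiments show that, relative to the LRU replacement policy, MAC reduces memory writes by about 25.12% on average at low hardware cost, with almost no impact on program performance.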
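
The principle behind the abstract above, never evicting a dirty block while a clean one is available, can be shown with a small Python sketch. It is a simplified clean-first LRU written for illustration, not the MAC policy, whose multi-dimensional structure the abstract does not specify. The class name `CleanFirstCache`, the fully associative layout, and the `writebacks` counter are assumptions.

```python
from collections import OrderedDict


class CleanFirstCache:
    """Fully associative cache that evicts the LRU clean block first.

    A dirty block is evicted (and counted as a write-back to memory) only
    when the cache holds no clean block at all.
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Maps block address -> dirty flag; iteration order is LRU first.
        self.blocks: "OrderedDict[int, bool]" = OrderedDict()
        self.writebacks = 0

    def access(self, addr: int, is_write: bool = False) -> bool:
        """Access a block; return True on a hit and False on a miss."""
        if addr in self.blocks:
            dirty = self.blocks.pop(addr) or is_write
            self.blocks[addr] = dirty  # re-insert at the MRU position
            return True
        if len(self.blocks) >= self.capacity:
            self._evict()
        self.blocks[addr] = is_write
        return False

    def _evict(self) -> None:
        # Look for the least recently used clean block.
        victim = next((a for a, dirty in self.blocks.items() if not dirty), None)
        if victim is None:
            # Only dirty blocks remain: evict the LRU one and write it back.
            victim = next(iter(self.blocks))
            self.writebacks += 1
        del self.blocks[victim]
```

In a two-block cache, writing block 1, reading block 2, and then reading block 3 evicts the clean block 2 and causes no write-back, whereas plain LRU would evict the dirty block 1 and write it to memory.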

11.

A survey of operating system support for persistent memory

Miao CAI Hao HUANG 《Frontiers of Computer Science》2021,15(4):154207

Emerging persistent memory technologies, like PCM and 3D XPoint, offer numerous advantages over DRAM, such as higher density, larger capacity, and better energy efficiency. However, they also have some drawbacks, e.g., slower access speed, limited write endurance, and unbalanced read/write latency. Persistent memory technologies provide both great opportunities and challenges for operating systems. As a result, a large number of solutions have been proposed. With the increasing number and complexity of problems and approaches, we believe this is the right moment to investigate and analyze these works systematically. To this end, we perform a comprehensive and in-depth study of operating system support for persistent memory in three steps. First, we present an overview of how to build the operating system on persistent memory from three perspectives: system abstraction, crash consistency, and system reliability. Then, we classify the existing research works into three categories: storage stack, memory manager, and OS-bypassing library. For each category, we summarize the major research topics and discuss them in depth. Specifically, we present the challenges and opportunities in each topic, describe the contributions and limitations of proposed approaches, and compare these solutions along different dimensions. Finally, we envision the future operating system based on this study.

12.

A Survey of Non-Volatile Main Memory Technologies: State-of-the-Arts, Practices, and Future Directions

Hai-Kun Liu Di Chen Hai Jin Xiao-Fei Liao Binsheng He Kan Hu Yu Zhang 《计算机科学技术学报》2021,36(1):4-32

Non-Volatile Main Memories (NVMMs) have recently emerged as a promising technology for future memory systems. Generally, NVMMs have many desirable properties such as high density, byte-addressability, non-volatility, low cost, and energy efficiency, at the expense of high write latency, high write power consumption, and limited write endurance. NVMMs have become a competitive alternative to Dynamic Random Access Memory (DRAM), and will fundamentally change the landscape of memory systems. They bring many research opportunities as well as challenges on system architectural designs, memory management in operating systems (OSes), and programming models for hybrid memory systems. In this article, we first revisit the landscape of emerging NVMM technologies, and then survey the state-of-the-art studies of NVMM technologies. We classify those studies with a taxonomy according to different dimensions such as memory architectures, data persistence, performance improvement, energy saving, and wear leveling. Second, to demonstrate the best practices in building NVMM systems, we introduce our recent work on hybrid memory system designs from the dimensions of architectures, systems, and applications. Finally, we present our vision of future research directions of NVMMs and shed some light on design challenges and opportunities.

13.

A study of teaching the storage technology of phase change memory (相变存储器的存储技术教学研究)

李华 《广东电脑与电讯》2017,1(3):69-71

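The advent of phase change memory is a milestone in storage technology. It remedies shortcomings of conventional DRAM storage, further expands computer memory, and brings innovative change to computer architecture. Starting from the concept of phase change memory, this paper gives a preliminary analysis of phase change storage technology and summarizes its principles and characteristics, in the hope of being of some help to future teaching of storage technology.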

14.

Using FORAY Models to Enable MPSoC Memory Optimizations

Ilya Issenin Nikil Dutt 《International journal of parallel programming》2008,36(1):93-113

With advances in technology, it has become feasible to implement a large multiprocessor system on a single chip. In such Systems-on-Chip (SoCs), a significant portion of energy is spent in the memory subsystem. There are several approaches to reducing this energy, including ones at the physical, architectural, and algorithmic levels. Classical approaches, including algorithmic and some architectural approaches, use static analysis and transformation of the application source code. However, it is often not possible to perform static analysis and optimization of a program’s memory access behavior unless the program is written in an easily analyzable form, e.g., free from pointer arithmetic. In this paper, we introduce the FORAY model of a program, which allows aggressive analysis of the application’s memory behavior and enables such optimizations on arbitrary code to which they could not otherwise be applied. We then present FORAY-GEN: an automated profile-based approach for extracting the FORAY model from the original program. We also outline our approach to applying FORAY-GEN to multiprocessor SoCs. We demonstrate how FORAY-GEN enhances the applicability of other memory subsystem optimization approaches, resulting in an average twofold increase in the number of memory references that can be analyzed by existing static approaches.

15.

An FPGA prototype verification system based on emulated memory (基于模拟存储器的FPGA原型验证系统)

张明 周宏伟 张民选 《计算机工程与科学》2007,29(6):87-88

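In current ASIC functional verification flows, the debuggability of FPGA prototype verification systems has long been a major obstacle limiting verification speed. This paper proposes an emulated memory technique that maps memory requests on the FPGA board to a PC, where software emulates the behavior of the memory. With this technique, functional verification engineers can conveniently record and analyze the execution traces of test cases and set breakpoints at the memory-access transaction level, which greatly improves the debuggability of the verification board. In addition, the design complexity and cost of the emulated memory system are lower than those of a large-capacity memory system implemented in hardware, which helps reduce the design complexity of the FPGA prototype verification board.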

16.

Optimizing sort-merge join algorithms for DRAM and NVM heterogeneous hybrid memory architectures (面向DRAM和NVM异构混合内存架构的排序连接算法优化)

杨柳 金培权 《计算机工程与科学》2021,43(2):191-198

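With the rapid development of computer technology, the scale of data applications keeps growing, and every industry demands ever higher data access speed. In-memory databases were proposed to meet this demand, but DRAM, the conventional memory technology, cannot be integrated and scaled to large capacities because of density and energy constraints. Meanwhile, non-volatile memory (NVM) makes up for DRAM's shortcomings with its high performance, high density, and low energy consumption. Combining DRAM and NVM...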

17.

Organizational memory information systems: a transactive memory approach

Dorit Nevo Yair Wand 《Decision Support Systems》2005,39(4):549-562

Effective management of organizational memory (OM) is critical to collaboration and knowledge sharing in organizations. We present a framework for managing organizational memory based on transactive memory, a mechanism of collective memory in small groups. While being effective in small groups, there are difficulties hindering the extension of transactive memory to larger groups. We claim that information technology can be used to help overcome these difficulties. We present a formal architecture for directories of meta-memories required in extended transactive memory systems and propose the use of meta-knowledge to substitute for the lack of tacit group knowledge that exists in small groups.

18.

Improving memory reclamation in the hypervisor (Hypervisor中内存回收技术的改进)

吴岳 《计算机系统应用》2016,25(9):277-280

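Virtualization is a key technology of cloud computing. The hypervisor provides an abstraction layer between virtual machines and host hardware and allows the total memory allocated to running virtual machines to exceed the host's available memory, a technique called memory overcommitment. To reduce the impact of this technique on virtual machine performance, the hypervisor must provide an efficient memory reclamation mechanism. This paper proposes a solution: using non-volatile memory as a cache device for the page data that the hypervisor swaps. The author carved out space from system memory to emulate a non-volatile memory device, modified the algorithm in the KVM module, and defined five test environments. Experimental data show that, compared with the existing ballooning and hypervisor swapping techniques, using non-volatile memory together with a low-priority queue algorithm improves virtual machine performance by about 30% and 50%, respectively.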

19.

ESP: research and experiments on SVM based on OS/2 (ESP：基于OS/2的SVM研究及实验)

陈勇 刘心松 《小型微型计算机系统》1993,14(10):14-19

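Shared virtual memory can organize the mutually independent physical memory spaces of a loosely coupled system into a single unified logical memory space. We studied and experimented with the hardware and software structure, algorithms, and performance of ESP, a shared virtual memory system based on OS/2. The preliminary conclusion is that theoretical analysis and measured results largely agree, and that the technique is useful in multi-machine systems, distributed computers, and parallel processing.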

20.

An implementation of storage management in capability environments

Paolo Corsini Lanfranco Lopriore 《Software》1995,25(5):501-520

The exploitation of the salient features of capability-based addressing environments leads to a high number of small objects existing in memory at the same time. It is thus necessary to enhance the efficiency of the mechanisms for object relocation, and to avoid congestion of input/output devices due to swapping. In this paper, we present an approach to the management of a large virtual memory space aimed at solving these problems. We insert partial information concerning the physical allocation of each object into the virtual identifier of this object. Objects are grouped into large swapping units, called pages. The page size is independent of the average object size. This results in enhanced efficiency in managing the relocation information both with regard to memory requirements and access times. The allocation of objects into pages, and the movement of pages through the memory hierarchy, are controlled by user processes. This means that programs which have knowledge of their own use of virtual memory can increase their locality of reference, diminish the number of swap operations and reduce fragmentation.