
1.

张震付印金胡谷雨《计算机工程与科学》2018,40(9):1546-1555

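As big-data analytics applications become more time-sensitive and the "storage wall" problem grows more prominent, the storage system has become the bottleneck of overall computer system performance. New non-volatile memory (NVM), represented by phase change memory (PCM), offers high integration density, low power consumption, fast read/write access, non-volatility, small size, and shock resistance, making it the most promising candidate for next-generation storage devices. However, limited write endurance is an obstacle to the practical use of PCM, and extending PCM lifetime by reducing writes and applying wear leveling is a current research hotspot. This survey reviews the state of the art, with strengths and weaknesses, of PCM lifetime-extension techniques from three perspectives: reducing PCM writes, evening out the write distribution, and page migration in hybrid memory. It closes by discussing possible research directions for further improving PCM lifetime.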

2.

Research on a hybrid-memory wear-leveling algorithm based on BKDRHash

《微型机与应用》2017,(11)

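Phase change memory (PCM) is a new type of non-volatile memory (NVM) whose strengths and weaknesses complement those of conventional DRAM. Hybrid memory built from DRAM and PCM makes it possible to exploit the advantages of both. However, because PCM has limited write endurance, a management policy for hybrid memory requires not only a hybrid memory architecture but also a wear-leveling algorithm that balances the PCM write load. This paper designs a wear-leveling algorithm that takes the logical address of a write operation as input and maps the address with the BKDRHash function to level PCM wear. Experimental results show that the proposed algorithm greatly extends PCM lifetime at a small cost in latency and power.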
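The address-scattering step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the seed value 131, the 32-bit truncation, the 8-byte address encoding, and the name `map_address` are not taken from the paper, and a plain hash modulo the block count is not a one-to-one mapping, so a real design would still need collision handling or a remapping table.

```python
def bkdr_hash(key: bytes, seed: int = 131) -> int:
    """BKDR hash: multiply-and-add over the bytes, truncated to 32 bits."""
    h = 0
    for byte in key:
        h = (h * seed + byte) & 0xFFFFFFFF
    return h


def map_address(logical_addr: int, num_blocks: int) -> int:
    """Scatter a logical block address over num_blocks physical blocks.

    Hypothetical helper: the address is encoded as 8 little-endian bytes
    before hashing, which is an assumption made for this sketch.
    """
    key = logical_addr.to_bytes(8, "little")
    return bkdr_hash(key) % num_blocks
```

Because the mapping is deterministic, reads can locate data with the same function; the hash only spreads neighbouring logical addresses across the physical address range.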

3.

A new NVM storage system supporting highly concurrent access

蔡涛陈志鹏牛德姣王杰詹毕晟《计算机应用》2019,39(1):51-56

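The I/O system software stack is an important factor affecting the performance of NVM storage systems. To address the unbalanced read/write speeds and limited write endurance of NVM storage systems, a management strategy that fuses synchronous and asynchronous handling of access requests is designed: large writes are managed asynchronously, while reads and small writes are still managed synchronously. To address the high address-translation overhead incurred when different cores of a multi-core processor access the storage system, an address-translation cache strategy for multi-core processors is designed to reduce the time spent on address translation. Finally, a prototype NVM storage system supporting highly concurrent access (CNVMS) is implemented and tested with general-purpose benchmark tools under random read/write, sequential read/write, mixed read/write, and real application workloads. Experimental results show that, compared with PMBD, the proposed strategies improve read/write speed by 1% to 22% and IOPS by 9% to 15%, confirming that CNVMS effectively improves the I/O performance of NVM storage systems and the speed of access-request processing.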

4.

A dual-threshold approximate write method for phase change memory in video applications

方运潭李华伟李晓维《计算机辅助设计与图形学学报》2014,(5):835-840

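Phase change memory (PCM), a new type of non-volatile memory, is expected to replace DRAM. For the use of PCM in video applications, and given that luminance data in an image matters more than chrominance data, a dual-threshold approximate write method is proposed to reduce PCM write energy and extend write lifetime. First, a separate threshold register is set for luminance data and for chrominance data, and the threshold is selected according to the type of data being written to PCM. The threshold is then compared with the absolute difference between the new and old data: when the absolute difference is less than or equal to the threshold, the PCM write is suppressed; otherwise, only the data bits that have changed are updated. Experimental results show that the method effectively reduces PCM writes with low hardware overhead and allows a flexible trade-off between write reduction and video quality.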
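The compare-then-write rule in this abstract can be sketched per 8-bit sample as below. The threshold values, the 8-bit sample width, and the function names are assumptions chosen for illustration; the abstract does not state them.

```python
from typing import Tuple

# Hypothetical threshold register values: luminance gets the tighter one
# because the abstract treats luminance as the more important data.
LUMA_THRESHOLD = 2
CHROMA_THRESHOLD = 6


def approximate_write(old: int, new: int, threshold: int) -> Tuple[int, int]:
    """Return (stored_value, bits_written) for one 8-bit sample."""
    if abs(new - old) <= threshold:
        # Difference is within tolerance: suppress the write entirely.
        return old, 0
    # Otherwise update only the bit positions that actually differ.
    changed = (old ^ new) & 0xFF
    return new, bin(changed).count("1")


def write_sample(old: int, new: int, is_luma: bool) -> Tuple[int, int]:
    """Pick the threshold by data type, then apply the approximate write."""
    threshold = LUMA_THRESHOLD if is_luma else CHROMA_THRESHOLD
    return approximate_write(old, new, threshold)
```

With these assumed thresholds, a change from 100 to 104 is written when the sample is luminance but skipped when it is chrominance, which is the quality-versus-writes trade-off the abstract describes.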

5.

A file-level wear-leveling mechanism for new non-volatile memory

蔡涛张永春牛德姣倪晓蓉梁东莺《计算机研究与发展》2015,(7)

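New memories such as spin transfer torque random access memory (STT-RAM) and magnetoresistive random access memory (MRAM) offer access speeds close to DRAM and are an important means of building high-performance external storage and improving computer system performance, but a limited number of writes is one of their major constraints. A file-system-level wear-leveling mechanism is designed. It uses a hash function to disperse files across external storage, which avoids repeatedly allocating the same blocks when files are created and deleted, and it selects blocks with lower write counts when allocating file space, which avoids concentrating writes. It also uses an active migration strategy that migrates data blocks with higher write counts when the I/O load of the storage system is low, reducing the impact of wear leveling on I/O performance. Finally, a prototype of the file-system-level wear-leveling mechanism for new memories is implemented on the open-source object-based storage device Open-osd and is tested and analyzed with several standard data sets of the general-purpose storage benchmarks filebench and postmark. The results confirm that the mechanism consistently reduces the difference in block write counts to about 1/20 of the original, while costing at most 6% of I/O performance and adding 0.5% extra writes, showing that it is both efficient and stable.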

6.

Design and optimization of a PCM-based GPU memory system

穆帅单书畅邓仰东王志华《计算机科学》2013,40(10):29-31,71

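New non-volatile memories, represented by phase change memory (PCM), offer advantages that conventional dynamic random access memory (DRAM) lacks, such as high storage density and low static power, but their long write latency can severely degrade memory access performance. This paper designs a PCM-based memory system for graphics processing units (GPUs). Simulation results show that memory write requests in GPU programs are distributed very unevenly, with a small number of memory addresses accessed at very high frequency. A dedicated buffer unit designed for this uneven access distribution effectively holds frequently accessed memory data, reducing the number of PCM accesses and removing the negative impact of long write latency on system performance. Results on a GPU simulator show that the buffer-based PCM memory system effectively improves GPU computing performance.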

7.

Research on a greedy-strategy wear-leveling algorithm for NAND flash memory

贾鑫张少平《计算机科学》2017,44(Z11):312-316

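NAND flash memory is the storage device of wireless sensor network nodes. Sensor nodes continuously acquire data in the monitored area and exchange data with one another, so the NAND flash memory is written frequently. This leads to unbalanced erase counts across physical blocks, shortens the lifetime of the memory, and ultimately affects the lifetime of the whole sensor network. To address this problem, a partitioned address-mapping wear-leveling algorithm based on a greedy strategy is proposed. The algorithm makes a greedy selection according to the erase-wear parameters: physical blocks with low erase counts are selected for writes, while the configuration and experimental data held in blocks with high erase counts are migrated and those blocks are queued for erasure. Software testing shows that the proposed algorithm effectively implements and optimizes wear leveling for NAND flash memory.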
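The greedy selection step (always write to the free block with the lowest erase count) can be sketched with a min-heap. This is a minimal illustration, not the paper's algorithm: the class name, the heap-based free list, and the omission of partitioning and data migration are all simplifications assumed here.

```python
import heapq
from typing import List, Tuple


class GreedyAllocator:
    """Hand out the free physical block with the smallest erase count."""

    def __init__(self, num_blocks: int) -> None:
        self.erase_counts: List[int] = [0] * num_blocks
        # Heap of (erase_count, block_index) for every free block.
        self.free: List[Tuple[int, int]] = [(0, b) for b in range(num_blocks)]
        heapq.heapify(self.free)

    def allocate(self) -> int:
        """Greedy choice: pop the least-worn free block (IndexError if none)."""
        _, block = heapq.heappop(self.free)
        return block

    def erase(self, block: int) -> None:
        """Erase a block, record the wear, and return it to the free pool."""
        self.erase_counts[block] += 1
        heapq.heappush(self.free, (self.erase_counts[block], block))
```

A block that has just been erased re-enters the pool with a higher count, so it is chosen again only after less-worn blocks have been used.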

8.

NVM-LH: a linear hash index friendly to non-volatile memory

汤晨黄国锐金培权《计算机应用》2021,41(3):623-629

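Non-volatile memory (NVM) has attracted attention for its large capacity, persistence, bit-level access, and low read latency, but it also has drawbacks such as a limited number of writes and unbalanced read/write speeds. Implementing a traditional linear hash index directly on NVM causes a large number of random writes. To address this problem, a new NVM-friendly linear hash index, NVM-LH, is proposed. NVM-LH achieves cache friendliness by aligning stored data to cache lines and introduces a log-free strategy for guaranteeing data consistency. In addition, NVM-LH reduces NVM writes by optimizing split and delete operations. Experimental results show that NVM-LH achieves 30% higher space utilization than CCEH and about 15% fewer NVM writes than CCEH, demonstrating better NVM friendliness.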

9.

WAPFTL: a workload-adaptive flash translation layer algorithm with prediction support

谢徐超宋振龙李琼魏登萍方健肖立权《计算机工程与科学》2014,36(7):1238-1243

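Solid-state drives based on NAND flash, with their low latency, low power consumption, and high reliability, have begun to be used in enterprise servers and high-performance computing. To address the relatively poor write performance and limited lifetime of solid-state drives, WAPFTL, an adaptive address-mapping algorithm based on page-level mapping in the flash translation layer, is proposed. During address translation, the algorithm predicts the read/write characteristics of the workload and adaptively adjusts its policy for caching address-mapping information. Experimental results show that WAPFTL efficiently exploits both the temporal and the spatial locality of the workload, raises the address-mapping hit rate, and reduces the number of extra writes caused by address mapping. It also effectively reduces the number of garbage collections and improves overall SSD performance.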

10.

A dynamic bypass strategy for non-volatile caches based on reuse information

焦童陈玲玲安鑫李建华《计算机工程》2021,47(4):158-165

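Non-volatile memory offers low energy consumption, good scalability, and high storage density, and can replace conventional static random access memory as on-chip cache, but its write operations have high energy and latency, so write performance must be optimized before large-scale adoption. A dynamic bypass strategy based on cache-block reuse information is proposed to optimize the performance of non-volatile caches. The strategy analyzes the reuse characteristics of benchmark programs when they access the last-level cache (LLC) and uses the reuse information of cache blocks to predict dynamically whether a given write should bypass the non-volatile cache. A prediction table drives the bypass decision when filling on an LLC miss, and dynamic path selection is applied to write-backs from upper-level caches: a monitoring module selects a suitable upper-level cache for each bypassed block and fills blocks with higher reuse counts into it, reducing the number of LLC writes. Experimental results show that, compared with a cache design without bypassing, the strategy reduces the running time of all SPLASH-2 programs on a 4-core processor by 6.6% on average and lowers cache energy by 22.5% on average, effectively improving overall cache performance.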

11.

A space allocation and reuse strategy for PCM-based embedded systems

《Journal of Systems Architecture》2014,60(8):655-667

Phase change memory (PCM) has emerged as a promising candidate to replace DRAM in embedded systems, due to its appealing properties, such as zero leakage power, scalability, shock resistance, and high density. However, it can sustain only a limited number of write operations. Moreover, a program in an embedded system usually distributes its write traffic in an extremely unbalanced way, which could further decrease PCM lifetime. In this paper, we propose a space-based wear-leveling technique at the software compiler level that exploits program-specific features. The basic idea is to extend frequently written variables into specific-sized arrays and to distribute writes evenly across the allocated array. In this way, we can effectively distribute the write traffic of the program across the whole PCM chip. A space allocation and reuse (SAR) strategy and a polynomial-time algorithm are proposed to produce optimal and near-optimal space allocations, respectively, for achieving a balanced write distribution. The experimental results show that our technique can greatly extend the lifetime of PCM-based embedded systems compared with previous work and achieves approximately 94% of the theoretical maximum lifetime. Compared with a baseline scheme without a wear-leveling mechanism, our technique introduces no more than 0.8% extra writes and 0.7% running overhead.
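The basic idea of extending a hot variable into an array and spreading its writes can be sketched as follows. This is an illustrative toy under stated assumptions: the round-robin rotation, the class name `SpreadVariable`, and the assumption that the slot index lives somewhere cheap to update (for example a register) are not taken from the paper, whose SAR strategy also decides how much space each variable receives.

```python
from typing import List


class SpreadVariable:
    """One logical variable backed by several PCM cells, written in rotation."""

    def __init__(self, slots: int) -> None:
        self.cells: List[int] = [0] * slots   # the allocated array
        self.writes: List[int] = [0] * slots  # per-cell write counters
        self.current = 0                      # slot holding the live value

    def write(self, value: int) -> None:
        """Advance to the next slot so consecutive writes hit different cells."""
        self.current = (self.current + 1) % len(self.cells)
        self.cells[self.current] = value
        self.writes[self.current] += 1

    def read(self) -> int:
        """Read the most recently written slot."""
        return self.cells[self.current]
```

After eight writes to a four-slot variable, each cell has absorbed two writes instead of one cell absorbing all eight.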

12.

Hyper switching memory utilization on hybrid main memory for improved task execution and reduced power consumption

《Microprocessors and Microsystems》2020

The problem of maximizing PCM lifetime has been well studied. The arrival of non-volatile memory devices has replaced the traditional DRAM. Still, the DRAM has many limitations in endurance and high-power write operations. Similarly, a number of designs have been discussed earlier to maximize the lifetime of PCM by caching the main memory in the available DRAM. Still, they could not achieve the desired reduction in power consumption or increase in memory utilization. To improve power-consumption reduction and lifetime maximization, a categorical model is presented in this paper. The proposed method categorizes processes according to their memory access activity. Each categorized process is allocated to the respective part of the hybrid memory, which encourages maximum reads and minimum writes in PCM. The proposed method extends the lifetime of PCM more than other methods.

13.

A heterogeneous-memory page migration mechanism based on a DRAM victim cache

裴颂文钱艺幻叶笑春刘海坤孔令和《计算机研究与发展》2022,59(3):568-581

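When massive numbers of data requests access a heterogeneous memory system, heterogeneous memory pages migrate back and forth frequently between dynamic random access memory (DRAM) and non-volatile memory (NVM). However, migration strategies designed for conventional memory pages struggle to adapt to rapid, dynamic changes in how "cold" or "hot" a page is, which causes pages migrated from DRAM to N...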

14.

A compiler assisted wear leveling for morphable PCM in embedded systems

《Journal of Systems Architecture》2016

Phase change memory (PCM) is considered a promising alternative to DRAM-based main memory in embedded systems. A PCM cell can be dynamically programmed to operate in either multiple-level cell (MLC) mode or single-level cell (SLC) mode. With this morphable feature, we can exploit the high density of MLC and the low latency of SLC to satisfy the various memory requirements of specific applications in embedded systems. However, compared with its SLC counterpart, the lifetime of MLC is limited. To address this issue, this paper proposes a simple and effective wear-leveling technique, named Mixer, to enhance the lifetime of morphable PCM by considering program-specific features. We first build an Integer Linear Programming (ILP) formulation to dynamically configure the optimal SLC/MLC partition in morphable PCM and to produce the best data allocation for each variable, achieving a balanced write distribution in morphable PCM with low memory access cost. The basic idea is to allocate low-latency SLC cells to write-intensive variables and high-density MLC cells to other, ordinary variables. We then propose a polynomial-time algorithm to achieve near-optimal results. The evaluation results show that the proposed technique can effectively improve the lifetime of morphable PCM in embedded systems compared with previous work.

15.

Exploiting write power asymmetry to improve phase change memory system performance

Qi WANG Donghui WANG Chaohuan HOU 《Frontiers of Computer Science》2015,9(4):566-575

Phase change memory (PCM) is a promising candidate to replace DRAM as main memory, thanks to its better scalability and lower static power than DRAM. However, PCM also presents a few drawbacks, such as long write latency and high write power. Moreover, the parallelism of PCM write commands is restricted by instantaneous power constraints, which degrades write bandwidth and overall performance. The write power of PCM is asymmetric: writing a zero consumes more power than writing a one. In this paper, we propose a new scheduling policy, write power asymmetry scheduling (WPAS), that exploits the asymmetry of write power. WPAS improves the write command parallelism of PCM memory without violating the power constraint. The evaluation results show that WPAS can improve performance by up to 35.5%, and by 18.5% on average. The effective read latency can be reduced by up to 33.0%, and by 17.1% on average.
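To illustrate why knowing the per-bit power asymmetry helps, the sketch below packs writes into concurrent rounds under an instantaneous power budget, charging each write by its actual mix of zeros and ones. It is not the WPAS policy itself: the power units, the 8-bit write width, and the first-fit packing are assumptions made for this example.

```python
from typing import List

# Hypothetical relative power per bit; the abstract states only that
# writing a zero costs more than writing a one.
POWER_ZERO = 2
POWER_ONE = 1


def write_power(data: int, width: int = 8) -> int:
    """Power needed to write one word, based on its zero/one bit counts."""
    ones = bin(data & ((1 << width) - 1)).count("1")
    zeros = width - ones
    return zeros * POWER_ZERO + ones * POWER_ONE


def schedule(writes: List[int], budget: int) -> List[List[int]]:
    """First-fit packing of writes into rounds whose power stays within budget.

    A write whose own cost exceeds the budget is still issued, alone.
    """
    rounds: List[dict] = []
    for data in writes:
        cost = write_power(data)
        for r in rounds:
            if r["power"] + cost <= budget:
                r["writes"].append(data)
                r["power"] += cost
                break
        else:
            rounds.append({"writes": [data], "power": cost})
    return [r["writes"] for r in rounds]
```

A scheduler that assumed every write costs the worst case (all zeros) could issue only one 8-bit write per round under a budget of 16, whereas accounting for the actual bit values lets two all-ones writes share a round.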

16.

A write optimization method exploiting the asymmetry of phase change memory


张格毅陈小刚郭继鹏宋志棠陈邦明《计算机工程与应用》2021,57(14):75-82

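Phase change memory (PCM) offers high integration density, low power consumption, and non-volatility, and is one of the most promising storage media for non-volatile main memory. Reducing its write latency and extending its lifetime are pressing problems for PCM used as non-volatile memory. To this end, a write method that handles erase and write independently, RSIW (Reset and Set Independently Write), is proposed, exploiting the asymmetry between PCM erase time and write time. Unlike conventional write schemes, the method separates the write and erase operations and defers the slow write operations to idle periods, which significantly increases PCM write speed. RSIW can also be combined with a wear-leveling strategy to balance the write frequency of individual blocks effectively. The paper describes the method and its implementation details, compares it with similar schemes that exploit the erase/write asymmetry of PCM, and evaluates it on the gem5 simulator. According to the experimental results, the method improves system running efficiency by 37.1% to 69.1% compared with similar techniques.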

17.

Write-Optimized B+ Tree Index Technology for Persistent Memory


Rui-Xiang Ma Fei Wu Bu-Rong Dong Meng Zhang Wei-Jun Li Chang-Sheng Xie 《计算机科学技术学报》2021,36(5):1037-1050

Due to its low latency, byte addressability, non-volatility, and high density, persistent memory (PM) is expected to be used to design high-performance storage systems. However, PM also has disadvantages such as limited endurance, which poses challenges to traditional index technologies such as the B+ tree. The B+ tree was originally designed for dynamic random access memory (DRAM)-based or disk-based systems and suffers from large write amplification, which is detrimental to a PM-based system. This paper proposes WO-tree, a write-optimized B+ tree for PM. WO-tree adopts an unordered write mechanism for the leaf nodes, which eliminates a large number of the write operations caused by maintaining entry order in the leaf nodes. When a leaf node is split, WO-tree performs the cache line flushing operation after all write operations are completed, which reduces frequent data flushing. WO-tree adopts a partial logging mechanism and writes the log only for the leaf node. The inner node recognizes data inconsistency through the read operation, and the data can be recovered using the leaf node information, thereby significantly reducing the logging overhead. Furthermore, WO-tree adopts a lock-free search for inner nodes, which reduces the locking overhead of concurrent operations. We evaluate WO-tree using the Yahoo! Cloud Serving Benchmark (YCSB) workloads. Compared with the traditional B+ tree, wB-tree, and Fast-Fair, the number of cache line flushes caused by WO-tree insertion operations is reduced by 84.7%, 22.2%, and 30.8%, respectively, and the execution time is reduced by 84.3%, 27.3%, and 44.7%, respectively.
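The saving from unordered leaf writes can be seen in a small sketch that counts slot writes. It models only the leaf-entry array in ordinary Python lists; the class and function names are hypothetical, and logging, flushing, splitting, and concurrency from the abstract are left out.

```python
import bisect
from typing import List, Optional, Tuple


class UnorderedLeaf:
    """Leaf node that appends entries instead of keeping them sorted."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: List[Tuple[int, str]] = []
        self.slot_writes = 0  # number of entry slots written so far

    def insert(self, key: int, value: str) -> None:
        if len(self.entries) >= self.capacity:
            raise OverflowError("leaf full; a split would be required")
        self.entries.append((key, value))  # exactly one slot written
        self.slot_writes += 1

    def search(self, key: int) -> Optional[str]:
        # Linear scan, newest first, since entries are not ordered.
        for k, v in reversed(self.entries):
            if k == key:
                return v
        return None


def sorted_insert_writes(keys: List[int]) -> int:
    """Slot writes a sorted leaf needs: each insert shifts all larger entries."""
    arr: List[int] = []
    writes = 0
    for k in keys:
        pos = bisect.bisect_left(arr, k)
        writes += len(arr) - pos + 1  # shifted entries plus the new one
        arr.insert(pos, k)
    return writes
```

For five keys inserted in descending order, the sorted layout rewrites 15 slots while the unordered leaf writes 5, at the cost of a linear scan on lookup.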

18.

Write reconstruction for write throughput improvement on MLC PCM based main memory

《Journal of Systems Architecture》2016

The emerging Phase Change Memory (PCM) is considered one of the most promising candidates to replace DRAM as main memory due to its better scalability and non-volatility. With multi-bit storage capability, Multiple-Level-Cell (MLC) PCM outperforms Single-Level-Cell (SLC) PCM in density. However, high write latency has been a performance bottleneck for MLC PCM for two reasons. First, MLC PCM has a much longer programming time. Second, the write latencies of different cell state transitions vary significantly. When cells are written concurrently in burst mode, the write latency of a burst is determined by the worst state transitions. To improve the write throughput of MLC PCM based main memory, this paper proposes a Write Reconstruction (WR) scheme. WR reconstructs multiple burst writes targeting the same memory row so that the worst-case cells are grouped together in some of the writes. With this approach, the write latency of the other writes is reduced. WR incurs low implementation overhead and shows significant efficiency. Experimental results show that WR achieves an 18.1% reduction in write latency on average, with negligible power overhead.

19.

Resource abstraction and data placement for distributed hybrid memory pool

Tingting CHEN Haikun LIU Xiaofei LIAO Hai JIN 《Frontiers of Computer Science》2021,15(3):153103

Emerging byte-addressable non-volatile memory (NVM) technologies offer higher density and lower cost than DRAM, at the expense of lower performance and limited write endurance. There have been many studies on hybrid NVM/DRAM memory management in a single physical server. However, how to manage hybrid memories efficiently in a distributed environment is still an open problem. This paper proposes Alloy, a memory resource abstraction and data placement strategy for an RDMA-enabled distributed hybrid memory pool (DHMP). Alloy provides simple APIs for applications to use DRAM or NVM resources in the DHMP without being aware of its hardware details. We propose a hotness-aware data placement scheme that combines hot data migration, data replication, and write merging to improve application performance and reduce the cost of DRAM. We evaluate Alloy with several micro-benchmark workloads and public benchmark workloads. Experimental results show that Alloy can reduce DRAM usage in the DHMP by up to 95%, while reducing the total memory access time by up to 57% compared with state-of-the-art approaches.