
1.

何炎祥沈凡凡张军江南李清安李建华《计算机研究与发展》2015,52(6)

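With advances in semiconductor technology, processors integrate ever larger on-chip caches, and the leakage power of conventional memory devices has become an increasingly serious problem; designing energy-efficient on-chip memory architectures is now a major challenge. To address these problems, researchers in China and abroad have studied many emerging non-volatile memory technologies, which offer non-volatility, low power consumption, and high storage density. To explore how caches can be built from four emerging non-volatile memories (NVM), namely spin-transfer torque RAM (STT-RAM), phase change memory (PCM), resistive RAM (RRAM), and domain-wall memory (DWM), the paper compares their physical characteristics with those of conventional memory devices, discusses the advantages, drawbacks, and applicability of using them as caches, and classifies and summarizes the optimization methods and strategies for such caches. It analyzes the key techniques that target the high write power, limited write endurance, and long write latency of emerging NVM. Finally, it discusses possible research directions for emerging NVM devices in future cache optimization.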

2.

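Rethinking and Optimizing Index Design for Persistent Memory

韩书楷熊子威蒋德钧熊劲《计算机研究与发展》2021,58(2):356-370

Non-volatile memory (NVM) is a new storage medium that has emerged in recent years. Like conventional volatile memory, it offers low access latency and byte addressability; unlike volatile memory, it retains its data after power loss, and it also provides higher density and lower energy consumption. These properties make NVM a promising candidate for large-scale use in future computer systems, and its arrival offers new ideas for building efficient persistent indexes. Because NVM hardware was still at the research stage, most NVM-oriented index studies were carried out in simulated environments. In April 2019 Intel released Apache Pass (AEP), NVM hardware based on 3D-XPoint technology, which lets researchers work on real hardware. The paper first evaluates a real NVM device; the results show that AEP's write latency is close to that of DRAM, while its read latency is 3 to 4 times that of DRAM. Based on these measurements, the study finds that many earlier works made inaccurate assumptions about NVM performance, so most of them optimized only write performance and not read performance. The paper therefore revisits earlier research and applies read optimizations to previous hybrid indexes. It also proposes an asynchronous caching method based on hybrid memory. Experiments show that the read performance of the hybrid index after asynchronous-cache optimization is 1.8 times that before optimization, and that the optimized persistent index reduces read latency by up to 50%.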

3.

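Performance Optimization of a Storage Engine Based on Non-Volatile Memory

王海涛李战怀张晓赵晓南《集成技术》2022,11(3):56-70

Non-volatile memory offers read and write speeds close to those of main memory and can replace conventional storage devices to improve storage-engine performance. However, conventional storage engines usually read and write data through a generic block interface, which results in a long I/O software stack, increases read and write latency in the software layer, and thus limits the performance advantage of non-volatile memory. To address this problem, the paper builds on the Ceph big data storage system and designs NVMStore, a new storage engine based on non-volatile memory. NVMStore accesses the storage device through memory mapping and, exploiting the byte addressability and data persistence of non-volatile memory, optimizes the data read and write paths to reduce write amplification and software-stack overhead. Experimental results show that, compared with a conventional storage engine running on non-volatile memory, NVMStore significantly improves Ceph's read and write performance for small blocks.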

4.

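Implementation and Optimization of an Apache Spark Cache System Based on Hybrid Memory

魏森周浩然胡创程大钊《计算机科学》2023,(6):10-21

As data volumes have surged in the big data era, in-memory computing frameworks have developed rapidly. Apache Spark, a mainstream in-memory computing framework, caches intermediate results in memory and thereby greatly speeds up data processing. Meanwhile, non-volatile memory (NVM), with its fast reads and writes and large capacity, shows great promise for in-memory computing, and building a hybrid Spark cache from DRAM and NVM has become a feasible option. The paper proposes a Spark cache system based on DRAM-NVM hybrid memory. The system adopts a flat hybrid cache model, designs dedicated data structures for the cache block management system, and presents an overall architecture for a hybrid cache suited to Spark. In addition, to keep frequently accessed cache blocks in the DRAM cache, it proposes a hybrid cache management policy based on the minimum reuse cost of cache blocks. The policy first obtains the future reuse count of each RDD from DAG information; cache blocks with higher future reuse counts are kept in the DRAM cache first, and migration cost is taken into account when blocks are migrated. Experiments show that the DRAM-NVM hybrid cache improves performance by 53.06% on average over the original cache system, and that, for the same hybrid memory, the proposed policy improves on the default cache policy by 35.09% on average. Moreover, with the hybrid system designed in the paper, a cache made up of only 1/4 DRAM and 3/4 NVM achieves about 79% of the performance of an all-DRAM cache...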
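
The reuse-count placement idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `place_blocks`, the block names, and the slot-based DRAM capacity are assumptions made for illustration, and the migration cost that the abstract mentions is deliberately left out.

```python
def place_blocks(reuse_counts, dram_slots):
    """Split cache blocks between DRAM and NVM by predicted future reuse.

    reuse_counts: dict mapping a (hypothetical) block name to its future
                  reuse count, e.g. as derived from a job's DAG.
    dram_slots:   number of blocks that fit in DRAM (an assumed unit).
    """
    # Rank blocks by reuse count, highest first; break ties by name so the
    # result is deterministic.
    ranked = sorted(reuse_counts, key=lambda b: (-reuse_counts[b], b))
    # The most reused blocks go to DRAM, the rest to NVM.
    return ranked[:dram_slots], ranked[dram_slots:]


if __name__ == "__main__":
    reuse = {"rdd_1": 3, "rdd_2": 0, "rdd_3": 5, "rdd_4": 1}
    dram, nvm = place_blocks(reuse, dram_slots=2)
    print("DRAM:", dram)
    print("NVM: ", nvm)
```

A fuller policy would also weigh the cost of moving a block between the two tiers before changing its placement, as the abstract notes.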

5.

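Reinforcement Learning-Based Energy Optimization and Evaluation of Hybrid Caches

范浩徐光平薛彦兵高赞张桦《计算机研究与发展》2020,57(6):1125-1139

The emerging non-volatile memory STT-RAM features low leakage power, high density, fast reads, and high write energy, whereas SRAM features high leakage power, low density, fast reads and writes, and low write energy. A hybrid cache that combines SRAM and STT-RAM exploits the strengths of both: it provides lower leakage power and higher cell density than SRAM, and higher write speed and lower write energy than STT-RAM. A hybrid cache mainly works by placing write-intensive data in SRAM and read-intensive data in STT-RAM, so identifying and allocating read-intensive and write-intensive data is the key design challenge. Using the write intensity and reuse information of cache access requests, the paper proposes a reinforcement learning-based cache management method and designs a cache allocation policy that optimizes energy consumption. The key idea is to use reinforcement learning to learn from the energy consumption of a set of cache lines, obtain a weight for allocating that set to SRAM or to STT-RAM, and allocate the cache lines in the set to the region with the larger weight. Experimental evaluation shows that, compared with previous policies, the proposed policy reduces energy consumption by 16.9% on average in a single-core system and by 9.7% in a quad-core system.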

6.

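Analysis of Factors Affecting the Performance of Non-Volatile Memory Systems

夏飞蒋德钧熊劲《计算机研究与发展》2014,(Z1)

Emerging non-volatile memory (NVM) offers good scalability, low static energy consumption, and non-volatility, and NVM-based memory systems are expected to supplement or even replace DRAM main memory in the future. However, NVM's long write latency, limited write endurance, and high dynamic write energy pose challenges for practical use. How an NVM memory system affects applications, and which factors influence its performance, are questions worth studying. The paper presents a preliminary evaluation of NVM memory system performance. Two kinds of NVM memory are considered: NVM-only memory and hybrid memory composed of DRAM and NVM. The paper also compares the performance of NVM memory with that of DRAM memory and analyzes the factors that affect NVM memory systems. Finally, it discusses future work on NVM memory systems.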

7.

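Performance Optimization of Phase Change Memory as Main Memory for a Stream Processor

郝秀蕊安虹李小强汤旭龙《计算机工程》2011,37(24):251-253

The paper uses phase change memory (PCRAM) as the main memory of the stream processor Imagine and optimizes its performance. It builds a performance analysis model for PCRAM and, to address PCRAM's limited write endurance, applies a technique that avoids redundant bit writes, which extends PCRAM lifetime by a factor of 3.4. It exploits the non-volatility of PCRAM to avoid unnecessary cache line write-backs. It also analyzes how memory access scheduling algorithms affect PCRAM performance; the results show that the row/open scheduling algorithm performs better and is well suited to PCRAM.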
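
The idea of avoiding redundant bit writes can be sketched as follows: before writing, compare the new data with what the cells already hold and program only the bits that differ. This is a minimal illustration under that assumption, not the mechanism from the paper; the function name `bits_needing_write` is hypothetical.

```python
def bits_needing_write(old: bytes, new: bytes) -> int:
    """Count the bits that actually change when `new` overwrites `old`.

    A write scheme that skips redundant bit writes programs only these
    bits; the unchanged cells are left alone and suffer no wear.
    """
    if len(old) != len(new):
        raise ValueError("old and new data must have the same length")
    # XOR marks the differing bits; count the 1s in each byte.
    return sum(bin(o ^ n).count("1") for o, n in zip(old, new))


if __name__ == "__main__":
    old_line = bytes([0xF0, 0x01, 0xAA, 0x00])
    new_line = bytes([0xF1, 0x01, 0xAA, 0xFF])
    total_bits = 8 * len(new_line)
    changed = bits_needing_write(old_line, new_line)
    print(f"{changed} of {total_bits} bits need programming")
```

When consecutive writes to a line differ in only a few bits, most cell writes are skipped, which is the source of the lifetime gain such schemes aim for.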

8.

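Research on a Reuse-Aware Migration Policy for Non-Uniform Caches

汪玲黄炎袁光辉《计算机工程》2014,(2):81-85

As process technology continues to advance, multi-core processors integrate more and more cores and on-chip caches, so non-uniform cache architecture (NUCA) is used to cope with the growing wire delay in the cache systems of chip multiprocessors. An efficient cache block migration policy is critical to the whole cache system. Current block migration policies in dynamic non-uniform cache architecture (D-NUCA) ignore the access history of cache blocks, so blocks ping-pong between banks, which increases their access latency. To address this, the paper proposes a reuse-aware block migration (RABM) policy that uses the migration history of cache blocks to predict future migrations, thereby improving D-NUCA performance and reducing the power consumption of the whole cache system. Full-system simulation with the PARSEC benchmarks shows that, compared with D-NUCA, RABM-based D-NUCA raises instructions per cycle by 9.6% on average and lowers on-chip cache power consumption by 14%.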

9.

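A Survey of Storage Systems and Technologies Based on Phase Change Memory (total citations: 2; self-citations: 0; citations by others: 2)

张鸿斌范捷舒继武胡庆达《计算机研究与发展》2014,51(8):1647-1662

As the performance gap between processors and memory keeps widening, the "memory wall" problem is becoming more prominent. Yet the integration density of conventional DRAM devices is approaching its limit and energy consumption has become a bottleneck, so designing a sound and effective storage architecture that solves the memory wall problem is an unavoidable challenge. In recent years, emerging memory devices represented by phase change memory (PCM) have attracted wide attention from researchers in China and abroad because of their high integration density and low power consumption. In particular, PCM is non-volatile and byte-addressable, so it has the characteristics of both main memory and external storage; under its influence the boundary between the two is blurring, which will bring major changes to future storage architectures. The survey focuses on architectures that use PCM as main memory and analyzes the key issues involved, including write optimization, wear leveling, hardware error correction, bad block reuse, and software optimization. It then discusses research on applying PCM in external storage systems and the impact on external storage architecture and system design. Finally, it gives an outlook on research into the use of PCM in storage systems.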

10.

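Features and Applications of the DS1210 Non-Volatile Controller Chip

马秀丽朱俊英黄增满《电子技术应用》1995,(3)

The article introduces the performance features and applications of the DS1210 non-volatile controller chip. The chip is well suited to single-chip microcomputer systems that use a backup battery to preserve the data in random access memory.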

11.

A hybrid memory architecture supporting fine-grained data migration

Ye CHI Jianhui YUE Xiaofei LIAO Haikun LIU Hai JIN 《Frontiers of Computer Science》2024,18(2):182103

Hybrid memory systems composed of dynamic random access memory (DRAM) and non-volatile memory (NVM) often exploit page migration to take full advantage of the different memory media. Most previous proposals migrate data at a granularity of 4 KB pages, and thus waste memory bandwidth and DRAM capacity. In this paper, we propose Mocha, a non-hierarchical architecture that organizes DRAM and NVM in a flat physical address space but manages them as a cache/memory hierarchy. Since the commercial NVM device, the Intel Optane DC Persistent Memory Module (DCPMM), actually accesses the physical media at a granularity of 256 bytes (an Optane block), we manage the DRAM cache in 256-byte units to match this feature of Optane. This design not only enables fine-grained data migration and management for the DRAM cache, but also avoids write amplification on Intel Optane DCPMM. We also create an Indirect Address Cache (IAC) in the Hybrid Memory Controller (HMC) and propose a reverse address mapping table in the DRAM to speed up address translation and cache replacement. Moreover, we exploit a utility-based caching mechanism to filter cold blocks in the NVM and further improve the efficiency of the DRAM cache. We implement Mocha in an architectural simulator. Experimental results show that, compared with HSCC, a typical hybrid memory architecture, Mocha improves application performance by 8.2% on average (up to 24.6%) and reduces energy consumption by 6.9% and data migration traffic by 25.9% on average.
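
To see why migration granularity matters, the following minimal sketch compares the bytes moved when the same set of touched addresses is migrated in 4 KB pages versus 256-byte blocks. The function name `migrated_bytes` and the example offsets are assumptions for illustration; the sketch models only the traffic arithmetic, not Mocha itself.

```python
PAGE_SIZE = 4096      # conventional page-granularity migration
OPTANE_BLOCK = 256    # media access granularity cited in the abstract


def migrated_bytes(offsets, granularity):
    """Bytes moved if every unit containing a touched offset is migrated."""
    # Each touched byte offset falls into exactly one migration unit.
    units = {off // granularity for off in offsets}
    return len(units) * granularity


if __name__ == "__main__":
    # Hypothetical access pattern: three touched bytes inside one 4 KB page.
    touched = [0, 10, 300]
    print("page-granularity :", migrated_bytes(touched, PAGE_SIZE), "bytes")
    print("block-granularity:", migrated_bytes(touched, OPTANE_BLOCK), "bytes")
```

For sparse access patterns like this one, the coarse unit moves a whole page while the fine unit moves only the blocks that were touched.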

12.

Energy optimization for multi-level cell non-volatile memory using state remapping

《Microprocessors and Microsystems》2017

Non-volatile memory (NVM) is emerging as a promising technology for building future main memory or cache. Multi-level cell (MLC) NVM, which stores multiple bits in a single cell, has been developed in recent years. Each NVM technology has its own write scheme for storing multiple bits, and the write energy varies across states. For MLC phase-change memory (PCM), writing the intermediate states '01' and '10' consumes more energy than writing states '00' and '11'. For MLC spin-transfer torque magnetic RAM (STT-MRAM), flipping the left bit of a 2-bit cell consumes more energy than flipping the right bit. To reduce MLC NVM write energy, we propose one encoding scheme that reduces the number of intermediate-state writes for MLC PCM and another that decreases the number of left-bit flips for MLC STT-MRAM. The main idea of both schemes is state remapping. For MLC PCM, we find the two states with the lowest write frequency and remap them to states '01' and '10' respectively. For MLC STT-MRAM, we seek the remapping decision that minimizes the number of left-bit flips and reduces writes of states '01' and '10'. Experimental results show that the encoding scheme for MLC PCM saves 5.25% energy on average and the encoding scheme for MLC STT-MRAM saves 12.17% energy on average.
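
The state-remapping idea for MLC PCM can be sketched in a few lines: count how often each 2-bit state is written, then map the two least frequent states onto the costly intermediate states '01' and '10'. This is an illustrative sketch only; the function names, the tie-breaking rule, and the simple count of expensive writes as a stand-in for energy are assumptions, not the paper's encoding scheme.

```python
from collections import Counter

ALL_STATES = ("00", "01", "10", "11")
EXPENSIVE = ("01", "10")   # intermediate states, costly to write in MLC PCM
CHEAP = ("00", "11")


def build_remap(writes):
    """Map the two least frequently written states onto '01' and '10'."""
    counts = Counter(writes)
    # Least frequent first; ties broken by state name for determinism.
    ranked = sorted(ALL_STATES, key=lambda s: (counts[s], s))
    # The two rarest states take the expensive encodings, the two most
    # common states take the cheap ones. The result is a bijection.
    return dict(zip(ranked, EXPENSIVE + CHEAP))


def expensive_writes(writes, remap=None):
    """Count writes that land on an expensive state, optionally remapped."""
    stored = (remap[s] if remap else s for s in writes)
    return sum(1 for s in stored if s in EXPENSIVE)


if __name__ == "__main__":
    # Hypothetical write stream dominated by the intermediate states.
    stream = ["01"] * 5 + ["10"] * 4 + ["00"] * 1 + ["11"] * 2
    remap = build_remap(stream)
    print("remap:", remap)
    print("expensive writes before:", expensive_writes(stream))
    print("expensive writes after :", expensive_writes(stream, remap))
```

Because the mapping is a bijection over the four states, the original data can be recovered by applying the inverse mapping on reads.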

13.

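A New NVM Storage System Supporting Highly Concurrent Access

蔡涛陈志鹏牛德姣王杰詹毕晟《计算机应用》2019,39(1):51-56

The I/O system software stack is an important factor in the performance of NVM storage systems. To address the unbalanced read and write speeds and limited write endurance of NVM storage systems, the paper designs an access request management policy that combines synchronous and asynchronous handling: writes with large amounts of data are managed asynchronously, while read requests and writes with small amounts of data are still managed synchronously. To address the high address translation overhead incurred when different cores of a multi-core processor access the storage system, it designs an address translation cache policy for multi-core processors that reduces the time spent on address translation. Finally, it implements a prototype of a concurrent NVM storage system (CNVMS) that supports highly concurrent access, and tests it with general-purpose benchmark tools under random read/write, sequential read/write, mixed read/write, and real application workloads. Experimental results show that, compared with PMBD, the proposed policies improve read and write speed by 1% to 22% and IOPS by 9% to 15%, which confirms that the CNVMS policies effectively improve the I/O performance and request processing speed of NVM storage systems.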
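
The combined synchronous/asynchronous request handling described above can be pictured as a simple dispatch rule. The sketch below is an assumption-laden illustration: the function name `choose_path` and the 4 KB cutoff `ASYNC_WRITE_THRESHOLD` are hypothetical, since the abstract does not state what counts as a large write.

```python
# Hypothetical cutoff in bytes; the abstract gives no specific value.
ASYNC_WRITE_THRESHOLD = 4096


def choose_path(op: str, size: int) -> str:
    """Pick the handling path for a request.

    Large writes go to the asynchronous path; reads and small writes stay
    on the synchronous path, following the rule stated in the abstract.
    """
    if op not in ("read", "write"):
        raise ValueError(f"unknown operation: {op}")
    if op == "write" and size >= ASYNC_WRITE_THRESHOLD:
        return "async"
    return "sync"


if __name__ == "__main__":
    for op, size in [("read", 1 << 20), ("write", 512), ("write", 1 << 20)]:
        print(f"{op:5s} {size:8d} bytes -> {choose_path(op, size)}")
```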

14.

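NVM-LH: A Linear Hash Index Friendly to Non-Volatile Memory

汤晨黄国锐金培权《计算机应用》2021,41(3):623-629

Non-volatile memory (NVM) has attracted attention for its large capacity, persistence, bit-level access, and low read latency, but it also has drawbacks such as limited write endurance and unbalanced read and write speeds. Implementing a conventional linear hash index directly on NVM causes a large number of random writes. To address this problem, the paper proposes NVM-LH, a new NVM-friendly linear hash index. NVM-LH achieves cache friendliness by aligning stored data to cache lines, and it proposes a log-free strategy for guaranteeing data consistency. In addition, NVM-LH reduces NVM writes by optimizing split and delete operations. Experimental results show that NVM-LH achieves 30% higher space utilization than CCEH and about 15% fewer NVM writes than CCEH, demonstrating better NVM friendliness.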

15.

Architecture and data migration methodology for L1 cache design with hybrid SRAM and volatile STT-RAM configuration

《Microprocessors and Microsystems》2016

Spin-Transfer Torque RAM (STT-RAM) has the advantages of high circuit density and negligible leakage power. However, it suffers from long write latency and high write power consumption. It is therefore difficult to replace all of the SRAM in the L1 cache with STT-RAM, but we can relax the retention time of the STT-RAM cell to improve its write performance and replace part of the SRAM capacity to reduce leakage power. In this paper, we propose a locality-aware approach to L1 cache design with a hybrid SRAM and volatile STT-RAM configuration. Based on the principle of cache locality, a data block is first mapped to SRAM to reduce write latency and write energy, and is later moved to volatile STT-RAM to reduce leakage power consumption. Once a data block in the volatile STT-RAM has gone unaccessed for a period of time, we stop its refresh operations to further reduce power consumption. Experimental results show that, compared with an SRAM-only L1 cache, our hybrid cache configuration and data migration methodology reduce energy consumption by about 15–20%, with a latency overhead of only about 5%. Compared with an STT-RAM-only L1 cache, we reduce memory access latency by nearly 20% with similar or even better energy consumption.

16.

MacroTrend: A Write-Efficient Cache Algorithm for NVM-Based Read Cache


鲍宁柴云鹏秦啸王传雯《计算机科学技术学报》2022,37(1):207-230

Future storage systems are expected to contain a wide variety of storage media and layers due to the rapid development of NVM (non-volatile memory) techniques. For NVM-based read caches, many kinds of NVM devices cannot withstand frequent data updates because of limited write endurance or the high energy consumption of writes. However, traditional cache algorithms have to update cached blocks frequently, because it is difficult for them to predict long-term popularity from such limited information about data blocks as a single value or a queue that reflects frequency or recency. In this paper, we propose a new MacroTrend (macroscopic trend) prediction method that discovers long-term hot blocks through the macro trends shown by their access count histograms. A new cache replacement algorithm is then designed based on MacroTrend prediction to greatly reduce the write amount while improving the hit ratio. We conduct extensive experiments driven by a series of real-world traces and find that, compared with LRU, MacroTrend significantly reduces the write amounts of NVM cache devices with similar hit ratios, leading to longer NVM lifetime or lower energy consumption.

17.

WOBTree: a write-optimized B+-tree for non-volatile memory

Haitao WANG Zhanhuai LI Xiao ZHANG Xiaonan ZHAO Song JIANG 《Frontiers of Computer Science》2021,15(5):155106

The emergence of non-volatile memory (NVM) has introduced new opportunities for performance optimization in existing storage systems. To better utilize its byte-addressability and near-DRAM performance, NVM can be attached to the memory bus and accessed via load/store memory instructions rather than the conventional block interface. In this scenario, a cache line (usually 64 bytes) becomes the data transfer unit between volatile and non-volatile devices. However, the failure-atomicity unit of a write on NVM is the memory bit width (usually 8 bytes). This mismatch between the data transfer unit and the atomicity unit may introduce write amplification and compromise the data consistency of node-based data structures such as B+-trees. In this paper, we propose WOBTree, a Write-Optimized B+-Tree for NVM that addresses the mismatch problem without expensive logging. WOBTree reduces the update granularity from a tree node to a much smaller subnode and carefully arranges the write operations within it to ensure crash consistency and reduce write amplification. Experimental results show that, compared with previous persistent B+-tree solutions, WOBTree reduces write amplification by up to 86× and improves write performance by up to 61× while maintaining similar search performance.

18.

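A Timestamp-Based, Highly Scalable Durable Software Transactional Memory

刘超杰王芳邹晓敏冯丹《计算机研究与发展》2022,59(3):499-517

Emerging non-volatile memory (NVM) offers byte addressability, persistence, large capacity, and low power consumption. However, concurrent programming on NVM is often difficult: the user must guarantee both the crash consistency of data and the correctness of concurrent execution. To ease development, researchers have proposed durable transactional memory, but existing designs generally scale poorly. Tests show that the key factors limiting scalability are the global logical clock and redundant NVM writes. To address these two factors, the paper proposes a thread logical clock method, which gives each thread its own independent clock and thereby removes the centralization bottleneck of a global logical clock. It also proposes a cache-line-aware dual-version method, which maintains two versions of the data and updates them alternately to guarantee crash consistency, thereby eliminating redundant NVM writes. Based on these two methods, the paper implements a timestamp-based, highly scalable durable software transactional memory called SDTM (scalable durable transactional memory). Comparative tests show that, under YCSB workloads, SDTM improves performance by up to 2.8 times over DudeTM and up to 29 times over PMDK.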
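
The dual-version idea can be illustrated with a toy model: keep two (timestamp, value) slots, always overwrite the older slot, and publish the new timestamp only after the value is in place. This is a sketch under stated assumptions, not SDTM's design: the class name `DualVersionCell`, the `crash_before_publish` flag used to simulate a crash, and the use of ordinary Python objects in place of persistent memory and flush instructions are all hypothetical.

```python
class DualVersionCell:
    """Toy model of a datum kept as two alternating versions."""

    def __init__(self, value):
        # Each slot is [timestamp, value]; slot 1 starts out empty.
        self.slots = [[0, value], [-1, None]]

    def write(self, ts, value, crash_before_publish=False):
        """Write a new version stamped `ts` (must exceed earlier stamps)."""
        # Overwrite the slot with the older timestamp, so the newest
        # committed version is never touched.
        victim = 0 if self.slots[0][0] < self.slots[1][0] else 1
        self.slots[victim][1] = value        # step 1: write the data
        if crash_before_publish:
            return                           # simulated crash: ts not updated
        self.slots[victim][0] = ts           # step 2: publish the timestamp

    def read(self):
        """Return the value of the version with the newest timestamp."""
        return max(self.slots, key=lambda slot: slot[0])[1]


if __name__ == "__main__":
    cell = DualVersionCell("a")
    cell.write(1, "b")
    cell.write(2, "c", crash_before_publish=True)  # interrupted update
    print(cell.read())  # the last committed version is still readable
    cell.write(3, "d")
    print(cell.read())
```

In this model an interrupted update leaves the last committed version intact without an undo or redo log, which is the property the alternating-version approach relies on.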

19.

A spill data aware memory assignment technique for improving power consumption of multimedia memory systems

Youn Jonghee Cho Doosan 《Multimedia Tools and Applications》2019,78(5):5463-5478

As embedded memory technology evolves, traditional Static Random Access Memory (SRAM) technology has reached the end of its development. As manufacturing processes continue to shrink, next-generation memory technology is urgently needed because of the exponentially increasing leakage current of SRAM. Non-volatile memories such as STT-MRAM (Spin Torque Transfer Magnetic Random Access Memory) and PCM (Phase Change Memory) are good candidates for replacing SRAM in embedded memory systems. They have many favorable characteristics in terms of power consumption, leakage power, size (density), and latency. Nonetheless, nonvolatile memories have two major problems that hinder their use as next-generation memory. First, the lifetime of a nonvolatile memory cell is limited by the number of write operations. Second, a write operation incurs more latency and power than a read operation of the same size. This study describes a compiler optimization technique that overcomes these disadvantages of the nonvolatile memory component in hybrid cache memories. Specifically, to minimize the number of write operations to nonvolatile memory, we present a data replacement technique that considers the locations of register spill data. A large portion of memory accesses is generated by the spill data of the register allocator in an optimizing compiler. Such spill data can be partially removed using a recalculation method. Thus, we implemented an optimization technique that rearranges data placement with recalculation to minimize write instructions on the nonvolatile memory. Our experimental results show that the proposed technique reduces the average number of spill codes by 20% and improves energy consumption by 20.2% on average.
