1.

姜国松《计算机科学》2013,40(8):79-82,108

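Hybrid main memory pairs DRAM, used as a cache, with non-volatile memory, and so provides more capacity than a conventional main memory. A key challenge in making such a hybrid memory both fast and scalable is managing the metadata (e.g., tags) of the data cached in DRAM efficiently at a fine granularity. The work builds on the following observation: by exploiting DRAM row locality, metadata can be stored in the same off-chip DRAM row as the data it describes, while a small on-chip buffer caches metadata only for recently accessed rows, which lowers the overhead of a fine-grained DRAM cache. Exploiting the flexibility and efficiency of this fine-grained DRAM cache, the authors also develop an adaptive policy that chooses the best granularity for migrating data into DRAM. In a hybrid memory system with a 512 MB DRAM cache, the proposed scheme with an 8 KB on-chip buffer comes within 6% of the performance of a conventional 8 MB SRAM metadata store and improves energy efficiency by 18%, even without counting the energy overhead of the large SRAM metadata store.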

2.

A Write-Aware Main Memory Management Algorithm for PCM Hybrid Main Memory Systems

《计算机科学与探索》2016,(6):799-810

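Phase change memory (PCM) is byte-addressable, fast to read (nanosecond scale), dense, and low-power. With main memory scaling based on DRAM (dynamic random access memory) reaching a bottleneck, PCM has become one of the most promising main memory media. However, PCM suffers from high write latency and limited endurance, which has led to DRAM/PCM hybrid main memory architectures. This paper proposes CLOCKW (CLOCK with a write-aware strategy), a hybrid main memory management algorithm that aims to reduce PCM writes while preserving the hit ratio. Prior work predicts page write hotness mainly from the recency of writes (RW); CLOCKW introduces the concept of inter-write distance (IWD) and combines it with write recency to predict page write hotness, so that write-intensive pages are placed in DRAM. In addition, by recording a limited history of write operations, CLOCKW places newly swapped-in pages on the appropriate medium and avoids unnecessary page migration. Finally, because it builds on the CLOCK algorithm, CLOCKW meets the low-overhead requirement of virtual memory management. Experiments show that CLOCKW effectively reduces the number of PCM writes while maintaining the hit ratio.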
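
One way to picture how write recency and inter-write distance could be combined is the minimal sketch below. It is not the CLOCKW algorithm: the abstract does not give the prediction rule, so the class name `WriteHotnessTracker`, the thresholds `rw_window` and `iwd_limit`, and the rule "hot if the last write is recent and the last two writes were close together" are all assumptions made for illustration.

```python
class WriteHotnessTracker:
    """Hypothetical write-hotness predictor; not the published CLOCKW rule."""

    def __init__(self, rw_window=8, iwd_limit=4):
        self.rw_window = rw_window  # assumed: max age of the last write to count as recent
        self.iwd_limit = iwd_limit  # assumed: max gap between two consecutive writes
        self.clock = 0              # logical time, advanced on every recorded write
        self.last_write = {}        # page -> logical time of its most recent write
        self.iwd = {}               # page -> distance between its last two writes

    def record_write(self, page):
        self.clock += 1
        if page in self.last_write:
            self.iwd[page] = self.clock - self.last_write[page]
        self.last_write[page] = self.clock

    def is_write_hot(self, page):
        if page not in self.iwd:    # fewer than two writes seen, so no IWD yet
            return False
        recent = self.clock - self.last_write[page] <= self.rw_window
        return recent and self.iwd[page] <= self.iwd_limit

    def place(self, page):
        """Write-hot pages go to DRAM, everything else to PCM."""
        return "DRAM" if self.is_write_hot(page) else "PCM"
```

A page written repeatedly at short intervals is steered to DRAM, while a page written once, or written twice far apart, stays in PCM.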

3.

Implementation and Optimization of an Apache Spark Cache System Based on Hybrid Memory

魏森周浩然胡创程大钊《计算机科学》2023,(6):10-21

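With data volumes surging in the big data era, in-memory computing frameworks have advanced considerably. Apache Spark, the mainstream in-memory computing framework, caches intermediate results in memory and greatly speeds up data processing. Meanwhile, non-volatile memory (NVM), with its fast reads and writes and large capacity, shows great promise for in-memory computing, which makes a Spark hybrid cache built from DRAM and NVM a feasible option. This paper proposes a Spark cache system based on DRAM-NVM hybrid memory. The system adopts a flat hybrid cache model, designs dedicated data structures for the cache block management system, and presents an overall architecture for a hybrid cache system suited to Spark. To keep frequently accessed cache blocks in the DRAM cache, the paper further proposes a hybrid cache management policy based on the minimum reuse cost of cache blocks. The policy first obtains each RDD's future reuse count from the DAG information; cache blocks with higher future reuse counts are kept in the DRAM cache preferentially, and migration cost is taken into account when blocks are migrated. Experiments show that the DRAM-NVM hybrid cache improves performance by 53.06% on average over the original cache system, and that, on the same hybrid memory, the proposed policy improves on the default cache policy by 35.09% on average. Moreover, with only 1/4 DRAM and 3/4 NVM as cache, the hybrid system reaches about 79% of the performance of an all-DRAM cache...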
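
The idea of preferring blocks with higher future reuse counts for DRAM can be sketched as a simple greedy placement. This is only an illustration under stated assumptions: the function `place_blocks`, the tuple layout, and the tie-breaking rule are hypothetical, and the sketch ignores the migration cost and the "minimum reuse cost" formula, neither of which the abstract spells out.

```python
def place_blocks(blocks, dram_capacity):
    """Greedy placement sketch.

    blocks: list of (block_id, size, future_reuse_count) tuples; the reuse
    count is assumed to have been derived from the DAG beforehand.
    Returns a dict mapping block_id to "DRAM" or "NVM".
    """
    placement = {}
    free = dram_capacity
    # Highest future reuse first; ties broken by smaller size (an assumption).
    for block_id, size, reuse in sorted(blocks, key=lambda b: (-b[2], b[1])):
        if reuse > 0 and size <= free:
            placement[block_id] = "DRAM"
            free -= size
        else:
            placement[block_id] = "NVM"
    return placement
```

With room for one block in DRAM, the block that will be reused most often gets it and the rest fall back to NVM.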

4.

Focus on Next-Generation Universal Memory: Who Will Be the "Successor" to Main Memory and Flash?

陈可《微型计算机》2006,(28):83-85

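Within the large family of random-access memory technologies, SRAM, DRAM, and Flash RAM are the main types. Each has its strengths and serves different uses. SRAM is the fastest but has very low density, so it is used mainly for on-chip caches; the L1 and L2 caches of a CPU, for example, are SRAM circuits. DRAM has higher density but somewhat slower reads and writes, and suits computer main memory, video memory, and the memory systems of other embedded devices. Flash RAM has the highest density and is non-volatile, retaining data after power is cut, so it is well suited to data storage and is widely used in digital products and USB flash drives.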

5.

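A Survey of Storage Systems and Technologies Based on Phase Change Memory (total citations: 2; self-citations: 0; citations by others: 2)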

张鸿斌范捷舒继武胡庆达《计算机研究与发展》2014,51(8)

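As the performance gap between processors and memory keeps growing, the "memory wall" problem is increasingly prominent. Yet the integration density of conventional DRAM is approaching its limit and energy consumption has become a bottleneck, so designing a sound and effective storage architecture to address the memory wall is an unavoidable challenge. In recent years, new memory devices represented by phase change memory (PCM) have attracted wide attention from researchers at home and abroad for their high density and low power. In particular, because PCM is non-volatile and byte-addressable, it has characteristics of both main memory and secondary storage; under its influence the boundary between the two is blurring, which will bring major changes to future storage architectures. The survey focuses on architectures that build main memory from PCM and analyzes key issues including write optimization, wear leveling, hardware error correction, bad block reuse, and software optimization. It then discusses research on applying PCM in secondary storage systems and its impact on secondary storage architecture and system design. Finally, it offers an outlook on research into PCM in storage systems.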

6.

Design of a Solid-State Drive Translation Layer Based on On-Chip SRAM

谢长生李博陆晨王芬《计算机科学》2010,37(7):296-300

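SSDs have become a research hotspot in the storage community. This paper proposes SBAST, a flash translation layer design based on on-chip SRAM, which buffers updated pages in SRAM to improve the efficiency of SSD random writes and reduce unnecessary erase operations. Simulation experiments with SSDsim demonstrate the effectiveness of the design, and plans for follow-up work are given.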

7.

A Hybrid Main Memory Simulator Based on DRAM and PCM

张德志万寿红岳丽华《计算机系统应用》2017,26(9):16-23

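Phase change memory (PCM) has become a hot topic in main memory research thanks to its non-volatility, high read speed, and low static power. However, PCM devices are not yet readily available, so algorithms designed for PCM cannot be validated effectively. This paper therefore proposes using a main memory simulator to simulate and validate PCM algorithms. It first reviews the characteristics of existing main memory simulators and points out that they do not fully meet the practical needs of current main memory research, then proposes and builds a hybrid main memory simulator based on DRAM and PCM. Experimental comparison with existing simulators shows that the proposed simulator effectively models DRAM/PCM hybrid memory architectures, supports different forms of hybrid main memory systems, and is highly configurable. Finally, a usage example illustrates the ease of use of the simulator's programming interface.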

8.

Interface Technology Between DSP and Mass Memory (total citations: 2; self-citations: 0; citations by others: 2)

刘国福张连超《单片机与嵌入式系统应用》2001,(9):21-24

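Based on an analysis of the memory structure of the TMS320F206 DSP chip, this paper addresses the interface design between the TMS320F206 and large-capacity SRAM, FLASH, and DRAM, compares the advantages and disadvantages of the three kinds of mass memory in DSP systems, and gives an application example.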

9.

Interface Technology Between DSP and Mass Memory

刘国福张玘张连超《单片机与嵌入式系统应用》2001,(7):36-39

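Based on an analysis of the memory structure of the TMS320F206 DSP chip, this paper addresses the interface design between the TMS320F206 and large-capacity SRAM, FLASH, and DRAM, compares the advantages and disadvantages of the three kinds of mass memory in DSP systems, and gives an application example.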

10.

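Implementing a Hybrid Write-Back/Write-Through Policy with Write Masks in Many-Core Processors (total citations: 4; self-citations: 0; citations by others: 4)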

林伟叶笑春宋风龙张浩《计算机学报》2008,31(11)

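A write-back cache policy greatly reduces the consumption of on-chip network and memory bandwidth, which matters especially for on-chip many-core architectures (more than 16 cores). Unlike the directory-based or bus-based write-invalidate and write-update protocols common in multi-core systems, this paper presents an on-chip implementation of a scope consistency memory model and a hardware-lock-based cache coherence protocol. It proposes keeping write masks in the L1 cache to record the byte positions that have been updated locally in a cache block, which resolves the coherence problems caused by false sharing under a write-back policy. The paper further proposes two new methods for reducing the storage overhead of the masks. The first makes the 1- to 3-byte write instructions, which are rare in programs, write-through and keeps one mask bit per 4 bytes in L1, which reduces the chip area overhead of the write masks to 27.9% of that of byte granularity. The second designs a multi-way write-mask cache whose entry count is 12.5% of the number of L1 cache blocks, which reduces the area overhead to 17.7% of that of byte granularity with no loss of performance. The many-core platform Godson-T adopts the scope consistency memory model and uses write masks to implement a hybrid write-back/write-through cache policy (write-through inside critical sections, write-back outside). Three SPLASH-2 programs and two bioinformatics programs are used for evaluation. The results show that, relative to full write-through, the hybrid write-back policy generally achieves a performance gain of more than 24% with 32 and 64 threads, performs slightly better than full write-back, and loses no performance when the two space-saving methods are applied.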
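
To see why a per-byte write mask sidesteps false sharing under write-back, consider the minimal sketch below. It models a cache line as a `bytes` object and the mask as a list of 0/1 flags; the function name `merge_write_back` and this software model are assumptions for illustration and say nothing about how the hardware in the paper is organized.

```python
def merge_write_back(memory_line, cached_line, write_mask):
    """Write back only the bytes whose mask flag is set.

    memory_line: current contents of the line in memory (bytes)
    cached_line: the core's local copy of the line (bytes or bytearray)
    write_mask:  one 0/1 flag per byte, 1 meaning "this core wrote this byte"
    """
    if not (len(memory_line) == len(cached_line) == len(write_mask)):
        raise ValueError("line and mask lengths must match")
    return bytes(
        new if dirty else old
        for old, new, dirty in zip(memory_line, cached_line, write_mask)
    )
```

If two cores each update a different byte of the same line, writing both copies back through their masks preserves both updates, whereas writing back whole lines would let the second write-back overwrite the first core's byte with stale data.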

11.

Architecture and data migration methodology for L1 cache design with hybrid SRAM and volatile STT-RAM configuration

《Microprocessors and Microsystems》2016

Spin-Transfer Torque RAM (STT-RAM) offers high circuit density and negligible leakage power. However, it suffers from long write latency and high write power consumption. It is therefore difficult to replace the entire SRAM with STT-RAM in the L1 cache, but we can relax the retention time of the STT-RAM cell to improve its write performance and replace part of the SRAM capacity to reduce leakage power. In this paper, we propose a locality-aware approach for L1 cache design with a hybrid SRAM and volatile STT-RAM configuration. Based on the principle of cache locality, a data block is first mapped to SRAM to reduce write latency and write energy, and is later moved to volatile STT-RAM to reduce leakage power consumption. After a period in which a data block in the volatile STT-RAM has not been accessed, we stop its refresh operations to further reduce power consumption. Experimental results show that, in comparison with the SRAM-only L1 cache configuration, our hybrid cache configuration and data migration methodology reduce energy consumption by about 15–20%, with a latency overhead of only about 5%. Compared with the STT-RAM-only L1 cache configuration, we reduce memory access latency by nearly 20% with similar or even better energy consumption.

12.

Energy efficient task allocation for hybrid main memory architecture

《Journal of Systems Architecture》2016

Compared with conventional dynamic random access memory (DRAM), emerging non-volatile memory (NVM) technologies provide better density and energy efficiency. However, current NVM devices typically suffer from high write power, long write latency, and low write endurance. In this paper, we study the task allocation problem for a hybrid main memory architecture with both DRAM and PRAM, in order to balance system performance against the energy consumption of the memory subsystem by assigning a memory device to each individual task. For an embedded system with a static set of periodic tasks, we design an integer linear programming (ILP) based offline adaptive space allocation (offline-ASA) algorithm to obtain the optimal task allocation. Furthermore, we propose an online adaptive space allocation (online-ASA) algorithm for dynamic task sets where the arrivals of tasks are not known in advance. Experimental results show that our proposed schemes achieve 27.01% energy saving on average, with an additional performance cost of 13.6%.

13.

Prober: exploiting sequential characteristics in buffer for improving SSDs write performance

Wen ZHOU Dan FENG Yu HUA Jingning LIU Fangting HUANG Yu CHEN Shuangwu ZHANG 《Frontiers of Computer Science》2016,10(5):951-964

Solid state disks (SSDs) are becoming one of the mainstream storage devices due to their salient features, such as high read performance and low power consumption. In order to obtain high write performance and extend flash lifespan, SSDs use an internal DRAM to buffer frequently rewritten data and so reduce the number of program operations on the flash. However, existing buffer management algorithms fall short in using data access features to predict data attributes. In various real-world workloads, most large sequential write requests are rarely rewritten in the near future. Once these write requests occur, much hot data is evicted from DRAM into flash memory, jeopardizing overall system performance. To address this problem, we propose a novel large write data identification scheme called Prober. This scheme probes large sequential write sequences among the write streams at an early stage to prevent them from residing in the buffer. In the meantime, to further release space and reduce the waiting time for handling incoming requests, we temporarily buffer the large data in DRAM when the buffer has free space, and use an active write-back scheme for large sequential write data when the flash array becomes idle. Experimental results demonstrate that our schemes improve the hit ratio of write requests by up to 10%, decrease the average response time by up to 42%, and reduce the number of erase operations by up to 11%, compared with state-of-the-art buffer replacement algorithms.
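
The notion of spotting large sequential write sequences in a write stream can be illustrated with a small trace scan. This is a sketch under assumptions, not the Prober scheme: the function `find_large_sequential_runs`, the `(start_lba, length)` request format, and the single `threshold` parameter are hypothetical, and the scan works offline over a recorded trace whereas the abstract describes probing at an early stage.

```python
def find_large_sequential_runs(writes, threshold):
    """Return indices of write requests that belong to a sequential run
    whose total length is at least `threshold`.

    writes: list of (start_lba, length) tuples in arrival order.
    A request continues the current run when it starts exactly where the
    previous request ended.
    """
    flagged = []
    run = []        # indices of the requests in the current run
    run_len = 0     # total length of the current run
    next_lba = None
    for i, (start, length) in enumerate(writes):
        if start == next_lba:
            run.append(i)
            run_len += length
        else:
            if run_len >= threshold:
                flagged.extend(run)
            run = [i]
            run_len = length
        next_lba = start + length
    if run_len >= threshold:
        flagged.extend(run)
    return flagged
```

A buffer manager could keep the flagged requests out of the hot-data buffer, which is the effect the abstract attributes to identifying large sequential writes.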

14.

Design of a Cache Scheduling Policy Based on Multi-Level Cell Spin-Transfer Torque RAM

朱艳娜王党辉《计算机科学》2018,45(Z6):513-517

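Multi-level cell spin-transfer torque RAM (MLC STT-RAM) stores multiple bits in a single memory cell and is expected to replace SRAM for building large-capacity, low-power last-level caches (LLC). In theory the static power of MLC STT-RAM is zero, and it offers high density and excellent read characteristics, but its weakness is inefficient writes. To address this, building on the hard/soft logical partition structure of MLC STT-RAM caches, this work implements a write-intensity prediction technique for MLC STT-RAM LLCs together with the corresponding cache structure. By dynamically predicting cache blocks with high write intensity, it helps the MLC STT-RAM LLC reduce the cost of performing writes. The basic idea of the prediction is to exploit the correlation between the address of a memory access instruction and the behavior of the corresponding cache block, and to decide where data is placed in the LLC according to the prediction. Experimental results show that applying write-intensity prediction in an MLC STT-RAM LLC reduces dynamic write power by 6.3% while also improving system performance.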
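
Predicting write intensity from the address of the memory access instruction can be pictured as a table of saturating counters indexed by that address. The sketch below assumes this common predictor structure; the abstract does not describe the actual design, so the class `WriteIntensityPredictor`, its table size, counter width, and threshold are all hypothetical, and the mapping from a prediction to the hard or soft partition is deliberately left out.

```python
class WriteIntensityPredictor:
    """Hypothetical predictor: saturating counters indexed by instruction address."""

    def __init__(self, entries=256, threshold=2, max_count=3):
        self.table = [0] * entries   # one small saturating counter per entry
        self.threshold = threshold   # assumed: counter value that means "write-intensive"
        self.max_count = max_count   # assumed: 2-bit counter, saturates at 3

    def _index(self, pc):
        return pc % len(self.table)

    def train(self, pc, was_written):
        """Update the counter when a block brought in by `pc` leaves the cache."""
        i = self._index(pc)
        if was_written:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = max(self.table[i] - 1, 0)

    def is_write_intensive(self, pc):
        return self.table[self._index(pc)] >= self.threshold
```

A cache controller would consult `is_write_intensive` when a block is filled and use the answer to choose where the block is placed.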

15.

A spill data aware memory assignment technique for improving power consumption of multimedia memory systems

Youn Jonghee Cho Doosan 《Multimedia Tools and Applications》2019,78(5):5463-5478

As embedded memory technology evolves, traditional Static Random Access Memory (SRAM) technology has reached the end of its development. As manufacturing process technology scales further, a next-generation memory technology is badly needed because of the exponentially increasing leakage current of SRAM. Non-volatile memories such as STT-MRAM (Spin-Transfer Torque Magnetic Random Access Memory) and PCM (Phase Change Memory) are good candidates for replacing SRAM technology in embedded memory systems. They have many advantages in terms of power consumption, leakage power, size (density), and latency. Nonetheless, non-volatile memories have two major problems that hinder their use as next-generation memory. First, the lifetime of a non-volatile memory cell is limited by the number of write operations. Second, a write operation incurs more latency and power than a read operation of the same size. This study proposes a hybrid cache and describes a compiler optimization technique to overcome these disadvantages of the non-volatile memory component in hybrid cache memories. Specifically, to minimize the number of write operations to non-volatile memory, we present a data replacement technique that considers the locations of the register spill data. A large share of memory accesses comes from the spill data produced by the register allocator of an optimizing compiler. Such spill data can be partially removed using a recalculation method. Thus, we implemented an optimization technique that rearranges the data placement with recalculation to minimize the write instructions on the non-volatile memory. Our experimental results show that the proposed technique reduces the average number of spill codes by 20% and improves energy consumption by 20.2% on average.

16.

Exploiting write power asymmetry to improve phase change memory system performance

Qi WANG Donghui WANG Chaohuan HOU 《Frontiers of Computer Science》2015,9(4):566-575

Phase change memory (PCM) is a promising candidate to replace DRAM as main memory, thanks to its better scalability and lower static power than DRAM. However, PCM also has a few drawbacks, such as long write latency and high write power. Moreover, the write command parallelism of PCM is restricted by instantaneous power constraints, which degrades write bandwidth and overall performance. The write power of PCM is asymmetric: writing a zero consumes more power than writing a one. In this paper, we propose a new scheduling policy, write power asymmetry scheduling (WPAS), that exploits the asymmetry of write power. WPAS improves the write command parallelism of PCM memory without violating the power constraint. The evaluation results show that WPAS can improve performance by up to 35.5%, and by 18.5% on average. The effective read latency can be reduced by up to 33.0%, and by 17.1% on average.
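
The effect of accounting for write power asymmetry under a fixed power budget can be illustrated with a toy scheduler. This is a sketch under stated assumptions, not the WPAS policy: the functions `write_power` and `schedule_batch`, the per-bit power values `p_zero` and `p_one`, and the in-order greedy selection are all hypothetical. The only property taken from the abstract is that writing a zero costs more power than writing a one.

```python
def write_power(data, bits, p_zero=2.0, p_one=1.0):
    """Power for writing `data` (a non-negative int below 2**bits),
    charging p_zero per 0 bit and p_one per 1 bit."""
    ones = bin(data).count("1")
    zeros = bits - ones
    return zeros * p_zero + ones * p_one


def schedule_batch(pending, bits, budget, asymmetric=True, p_zero=2.0, p_one=1.0):
    """Take writes from the front of the queue until the budget would be exceeded.

    With asymmetric=False every write is charged the worst case (all zeros),
    which is how a scheduler unaware of the asymmetry would have to budget.
    """
    batch, used = [], 0.0
    for data in pending:
        if asymmetric:
            cost = write_power(data, bits, p_zero, p_one)
        else:
            cost = bits * p_zero
        if used + cost > budget:
            break
        batch.append(data)
        used += cost
    return batch
```

With three pending 8-bit writes and a budget of 32 units, charging actual per-bit power lets all three proceed in parallel, while worst-case budgeting admits only two.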

17.

A Study of Data Durability and Availability Based on HBase

唐长城杨峰代栋孙明明周学海《计算机系统应用》2013,22(10):175-180

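HBase (Hadoop Database) is a non-relational database under the Apache Hadoop project. It is an open-source, column-family-based data storage system, and research on and use of HBase are drawing growing attention. Because HBase buffers data in memory before writing it to the file system, the size of that buffer becomes an important factor in system performance. This paper proposes RemoteLogProcess (RLP), a durability and availability scheme based on backup logs, which lets HBase achieve better write performance across different buffer sizes. Experiments show that, while guaranteeing data durability and availability, RLP achieves stable performance across buffer sizes and markedly improves write time when the buffer does not exceed the default setting.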

18.

NVMRA: utilizing NVM to improve the random write operations for NAND‐flash‐based mobile devices

Renhai Chen Zhaoyan Shen Chenlin Ma Zili Shao Yong Guan 《Software》2016,46(9):1263-1284

NAND flash memory has become the major storage medium in mobile devices such as smartphones. However, the random write operations of NAND flash memory heavily affect I/O performance and seriously degrade application performance on mobile devices. The main reason for slow random writes is the out-of-place update feature of NAND flash memory. Newly emerged non-volatile memories, such as phase-change memory and spin-transfer torque memory, support in-place updates and offer much better I/O performance than flash memory. These features make non-volatile memory (NVM) a promising solution for improving the random write performance of NAND flash memory. In this paper, we propose a non-volatile memory for random access (NVMRA) scheme that uses NVM to improve I/O performance in mobile devices. NVMRA exploits the I/O behaviors of applications to improve the random write performance of each application. Based on different I/O behaviors, such as random write-dominant I/O behavior, NVMRA adopts different storing decisions. The scheme is evaluated on a real Android 4.2 platform. The experimental results show that the proposed scheme can effectively improve I/O performance and reduce I/O energy consumption for mobile devices. Copyright © 2015 John Wiley & Sons, Ltd.

19.

Analysis and Research on Consumer-Grade Hybrid Solid-State Storage

罗龙飞李蓍城石亮《集成技术》2022,11(3):71-84

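Hybrid solid-state storage has become the mainstream storage device in consumer terminals. In academia, however, discussion and analysis of its design and problems remain insufficient. Focusing on existing hybrid storage devices and drawing on frontier research in related fields, this paper discusses and analyzes three aspects: an introduction to hybrid flash architectures, the pressing pain points, and related research progress. It introduces and analyzes the mainstream hybrid flash architectures and their characteristics, presents experimental data measured on real device platforms, and exposes the problems in hybrid flash that urgently need solving, with emphasis on issues related to read characteristics, write characteristics, read-write conflicts, and capacity characteristics. It also reviews the latest research progress on these problems and analyzes the strengths and weaknesses of each technique and future directions.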

20.

Design and Implementation of a Hybrid-Mapping Flash Translation Layer

郁志平刘伟彭虎《计算机工程》2014,(2):300-302,307

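Storage devices that use NAND flash as the storage medium usually need a flash translation layer (FTL) to manage the NAND. Page mapping is a common mapping scheme, but it needs a large amount of memory to hold the page mapping table, a requirement that embedded environments often cannot meet. To address this problem, the paper proposes a superblock-based hybrid-mapping FTL covering bad block management, address translation, garbage collection, and power-on recovery. It uses less than 128 KB of SRAM, far less than page mapping, and does not need to store the mapping table. The program runs successfully on an SSD development board and implements the basic read and write functions of a solid-state drive. Test results show that the hybrid-mapping FTL achieves good sequential read and write performance.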