
1.

Architecture and data migration methodology for L1 cache design with hybrid SRAM and volatile STT-RAM configuration

《Microprocessors and Microsystems》2016

Spin-Transfer Torque RAM (STT-RAM) offers high circuit density and negligible leakage power. However, it suffers from long write latency and high write energy. It is therefore difficult to replace all of the SRAM in an L1 cache with STT-RAM, but the retention time of the STT-RAM cell can be relaxed to improve its write performance, and part of the SRAM capacity can be replaced to reduce leakage power. In this paper, we propose a locality-aware approach for L1 cache design with a hybrid SRAM and volatile STT-RAM configuration. Based on the principle of cache locality, a data block is first mapped to SRAM to reduce write latency and write energy, and is later moved to volatile STT-RAM to reduce leakage power consumption. After a period in which a data block in the volatile STT-RAM is not accessed, we stop its refresh operations to further reduce power consumption. Experimental results show that, compared with an SRAM-only L1 cache configuration, our hybrid cache configuration and data migration methodology reduce energy consumption by about 15–20%, with a latency overhead of only about 5%. Compared with an STT-RAM-only L1 cache configuration, we reduce memory access latency by nearly 20% with similar or even better energy consumption.
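The placement policy described in this abstract (fill into SRAM, migrate to volatile STT-RAM, stop refreshing idle blocks) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the authors' implementation: the class names, the tick-based notion of time, and both thresholds are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration (not from the paper).
MIGRATE_AFTER = 4    # idle ticks in SRAM before a block moves to STT-RAM
REFRESH_TIMEOUT = 8  # idle ticks in STT-RAM before refresh is stopped


@dataclass
class Block:
    tag: int
    region: str = "SRAM"  # "SRAM", "STT-RAM", or "EXPIRED"
    idle: int = 0         # ticks since the last access or migration


class HybridCache:
    """Toy model of a hybrid SRAM / volatile STT-RAM block placement policy."""

    def __init__(self):
        self.blocks = {}

    def access(self, tag):
        """Access a block; a missing or expired block is (re)filled into SRAM."""
        block = self.blocks.get(tag)
        if block is None or block.region == "EXPIRED":
            block = Block(tag)
            self.blocks[tag] = block
        block.idle = 0
        return block.region

    def tick(self):
        """Advance time by one step and apply the migration and refresh rules."""
        for block in self.blocks.values():
            if block.region == "EXPIRED":
                continue
            block.idle += 1
            if block.region == "SRAM" and block.idle >= MIGRATE_AFTER:
                # Block has cooled down: move it to the low-leakage region.
                block.region = "STT-RAM"
                block.idle = 0
            elif block.region == "STT-RAM" and block.idle >= REFRESH_TIMEOUT:
                # No recent access: stop refreshing, so the data is lost.
                block.region = "EXPIRED"
```

In this model a block that is touched again while in STT-RAM simply stays there with its idle counter reset; whether the real design promotes such blocks back to SRAM is not stated in the abstract.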

2.

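Low-power uplink study based on imperfect power-domain non-orthogonal multiple access networks

Ren Guishan, Wu Mianze, Chen Xuemei, Su Feng, Li Hongyan 《Computer Applications and Software》2021,38(2):114-118

Power-domain non-orthogonal multiple access (PD-NOMA) can effectively improve the spectrum utilization of wireless networks and meet requirements such as large-scale node access and low latency, but it has the drawback of high power consumption and faces great challenges in industrial sensor networks. To address this, for a PD-NOMA uplink network in which the receiver decodes iteratively using successive interference cancellation (SIC), network power consumption is minimized under a given real-time requirement through joint optimization of user scheduling and power allocation. Through analysis of the...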

3.

Power-aware data retrieval protocols for indexed broadcast parallel channels

Ali R. Hurson, Angela Maria Muñoz-Avila, Neil Orchowski, Behrooz Shirazi, Yu Jiao 《Pervasive and Mobile Computing》2006,2(1):85-107

In pervasive and mobile computing environments, “timely and reliable” access to public data requires methods that allow quick, efficient, and low-power access to information to overcome technological limitations of wireless communication and access devices. The literature suggests broadcasting (one-way communication) as an effective way to disseminate the public data to mobile devices. Within the scope of broadcasting, the response time and energy consumption of retrieval methods have been used as the performance metrics for measuring the effectiveness of different access methods. The hardware and architecture of the mobile units offer different operational modes that consume different energy levels. Along with these architectural and hardware enhancements, techniques such as indexing, broadcasting along parallel channels, and efficient allocation and retrieval protocols can be used to minimize power consumption and access latency. In general, the retrieval methods attempt to determine the optimal access pattern for retrieving the requested data objects on parallel broadcast channels. The employment of heuristics provides a methodology for such ideal path planning solutions. Using informative heuristics and intelligent searches of an access forest can provide a prioritized cost evaluation of access patterns for requested data objects and, hence, an optimal path for the access of requested data on broadcast air channels. This paper examines two scheduling methods that, along with a set of heuristics, generate and facilitate the access patterns for retrieving data objects in the presence of conflicts in an indexed parallel broadcast channel environment. A simulation of the proposed schemes is presented for analyzing the relationship between response time and power consumption.

4.

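Research on task offloading in the power Internet of Things based on deep Q-learning

Ding Zhonglin, Li Yang, Cao Wei, Tan Yuhao, Xu Bo 《Computer and Modernization》2022,(11):75-80

With the continuously growing demand for electricity in modern cities and industrial production, the Power Internet of Things (PIoT) has attracted wide attention as a solution that can significantly improve power system efficiency. To address the access problem effectively, existing power equipment is often already fitted with 5G modules with built-in lightweight artificial intelligence. However, constrained by the modules' limited computing and communication capabilities, the massive data generated by the equipment is difficult to process and analyze in real time. To address this problem, this paper studies task offloading in PIoT systems, jointly optimizing the offloading decision and the allocation of computing resources at the edge server so as to reduce the weighted sum of delay and energy consumption. The paper also proposes a task offloading algorithm based on deep reinforcement learning: first, the processing of tasks at the edge server is modeled as a queue; second, local computing resource allocation is optimized using convex optimization theory; finally, a deep Q-learning algorithm is used to optimize the task offloading decision. Experimental results show that the proposed method effectively reduces the weighted sum of system delay and energy consumption.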
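The objective named in this abstract, a weighted sum of delay and energy consumption, can be made concrete with a small example. The sketch below is not the paper's deep Q-learning algorithm; it only compares the weighted cost of local execution against offloading for a single task. The function names, the weight of 0.5, and the sample figures are hypothetical.

```python
def weighted_cost(delay_s, energy_j, weight=0.5):
    """Weighted sum of delay (seconds) and energy (joules).

    `weight` is a hypothetical trade-off parameter in [0, 1]:
    1.0 considers delay only, 0.0 considers energy only.
    """
    return weight * delay_s + (1.0 - weight) * energy_j


def choose_offload(local, edge, weight=0.5):
    """Pick the cheaper option for one task.

    `local` and `edge` are (delay_s, energy_j) tuples describing the
    estimated cost of running the task on the device or on the edge server.
    """
    local_cost = weighted_cost(local[0], local[1], weight)
    edge_cost = weighted_cost(edge[0], edge[1], weight)
    return "edge" if edge_cost < local_cost else "local"


# Heavy task: slow locally, faster on the edge server despite upload energy.
print(choose_offload(local=(2.0, 1.0), edge=(0.5, 1.5)))
# Light task: offloading overhead outweighs the benefit.
print(choose_offload(local=(0.2, 0.1), edge=(0.5, 0.3)))
```

A learning-based method such as the one the abstract describes would replace this one-shot comparison with a policy that also accounts for queueing at the edge server and for decisions made on other tasks.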

5.

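An effective energy-efficient caching mechanism for heterogeneous disks

Dou Shaobin, Yang Lianghuai, Gong Weihua 《Computer Systems & Applications》2011,20(11):99-102,106

Solid-state disks offer low power consumption, high performance, and shock resistance, while hard disks offer high capacity and low price. By modifying the structure of the file system, a solid-state disk and a hard disk are combined, with the solid-state disk serving as a large-capacity cache for the hard disk, forming a heterogeneous system that we call a heterogeneous disk; its performance approaches that of a solid-state disk while its price approaches that of a hard disk. In addition, when the hard disk has been idle long enough, it is shut down to reduce energy consumption. For the large-capacity cache, we adopt a suitable tree-shaped search structure and propose...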

6.

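A replica creation strategy for data grids based on hybrid topology* (Cited by: 1; self-citations: 1; citations by others: 0)

Lu Yansheng, Hu Hui 《Application Research of Computers》2007,24(11):286-288

Data replication is widely used in data grids to shorten data access and transfer times and to reduce network bandwidth consumption. For a hybrid grid topology comprising tree and ring topologies, a replica creation strategy is proposed that takes into account factors such as network bandwidth, network transmission delay, user request frequency, and the available storage space at each site. An evaluation function is introduced to measure the influence of each factor; the strategy offers good reliability, scalability, and adaptivity. Simulation results show that this replica creation strategy effectively reduces the average data access time.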

7.

Understanding optimal data gathering in the energy and latency domains of a wireless sensor network (Cited by: 3; self-citations: 0; citations by others: 3)

U. F. T. F. M. 《Computer Networks》2006,50(18):3564-3584

The problem of optimal data gathering in wireless sensor networks (WSNs) is addressed by means of optimization techniques. The goal of this work is to lay the foundations to develop algorithms and techniques that minimize the data gathering latency and at the same time balance the energy consumption among the nodes, so as to maximize the network lifetime. Following an incremental-complexity approach, several mathematical programming problems are proposed with focus on different network performance metrics. First, the static routing problem is formulated for large and dense WSNs. Optimal data-gathering trees are analyzed and the effects of several sensor capabilities and constraints are discussed, e.g., radio power constraints, energy consumption model, and data aggregation functionalities. Then, dynamic re-routing and scheduling are considered. An accurate network model is proposed that captures the tradeoff between the data gathering latency and the energy consumption, by modeling the interactions among the routing, medium access control and physical layers. For each problem, extensive simulation results are provided. The proposed models provide a deeper insight into the problem of timely and energy efficient data gathering. Useful guidelines for the design of efficient WSNs are derived and discussed.

8.

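Research on a reuse-aware migration strategy for non-uniform caches

Wang Ling, Huang Yan, Yuan Guanghui 《Computer Engineering》2014,(2):81-85

As process technology continues to advance, multi-core processors integrate more and more cores and on-chip cache, so the non-uniform cache architecture (NUCA) is used to cope with the growing wire delay in the cache systems of chip multiprocessors. An efficient cache block migration strategy is critical to the whole cache system. The block migration strategy in current dynamic non-uniform cache architecture (D-NUCA) does not consider the historical access information of cache blocks, which causes blocks to thrash between banks and increases block access latency. To address this, a reuse-aware block migration (RABM) strategy is proposed that uses the historical migration information of cache blocks to predict future block migrations, improving D-NUCA performance and reducing the power consumption of the whole cache system. Full-system simulation results based on the PARSEC benchmarks show that, compared with D-NUCA, RABM-based D-NUCA improves instructions per cycle by 9.6% on average and reduces on-chip cache system power consumption by 14%.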

9.

Architecting high-performance energy-efficient soft error resilient cache under 3D integration technology

Hongbin Sun, Pengju Ren, Nanning Zheng, Tong Zhang, Tao Li 《Microprocessors and Microsystems》2011,35(4):371-381

Radiation-induced soft error has become an emerging reliability threat to high performance microprocessor design. As the size of on chip cache memory steadily increased for the past decades, resilient techniques against soft errors in cache are becoming increasingly important for processor reliability. However, conventional soft error resilient techniques have significantly increased the access latency and energy consumption in cache memory, thereby resulting in undesirable performance and energy efficiency degradation. The emerging 3D integration technology provides an attractive advantage, as the 3D microarchitecture exhibits heterogeneous soft error resilient characteristics due to the shielding effect of die stacking. Moreover, the 3D shielding effect can offer several inner dies that are inherently invulnerable to soft error, as they are implicitly protected by the outer dies. To exploit the invulnerability benefit, we propose a soft error resilient 3D cache architecture, in which data blocks on the soft error invulnerable dies have no protection against soft error, therefore, access to the data block on the soft error invulnerable die incurs a considerably reduced access latency and energy. Furthermore, we propose to maximize the access on the soft error invulnerable dies by dynamically moving data blocks among different dies, thereby achieving further performance and energy efficiency improvement. Simulation results show that the proposed 3D cache architecture can reduce the power consumption by up to 65% for the L1 instruction cache, 60% for the L1 data cache and 20% for the L2 cache, respectively. In general, the overall IPC performance can be improved by 5% on average.

10.

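Research on applications of the new non-volatile phase change memory PCM (Cited by: 1; self-citations: 0; citations by others: 1)

Liu Jinlei, Li Qiong 《Journal of Computer Research and Development》2012,(Z1):90-93

Parallel I/O techniques effectively optimize I/O performance, but make access latency hard to control. Phase change memory (PCM), a form of storage class memory (SCM), is non-volatile, randomly readable and writable, low-latency, high-throughput, compact, and low-power, offering the most direct and effective route to I/O performance optimization. This paper studies the characteristics and open problems of PCM, summarizes current progress in PCM application research, and, targeting the parallel I/O problem in high-performance computing, proposes a PCM-based hierarchical parallel hybrid storage model that can effectively improve the metadata service efficiency and the parallel I/O throughput of parallel file systems.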

11.

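Research on a power-control-based MAC protocol for wireless sensor networks

Yu Kai, Xie Zhijun, Jin Guang, Tang Jianhua 《Chinese Journal of Sensors and Actuators》2013,26(9)

This paper proposes APC-SMAC, an adaptive power control MAC protocol for wireless sensor networks. APC-SMAC first builds a power scheduling table based on the optimal number of neighbor nodes, which improves channel utilization and reduces communication energy consumption; in addition, it sends feedback frames to interfering nodes to make them go to sleep, which improves network data throughput and reduces network delay. Experimental results show that the protocol achieves considerable improvements in network throughput, network delay, and average energy consumption.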

12.

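A survey of methods for improving the performance of sparse directories in multi-core caches

Wu Jianguo, Chen Haiyan, Liu Sheng, Deng Rangyu, Chen Junjie 《Computer Engineering & Science》2019,41(3):385-392

Constrained by power consumption, general-purpose microprocessors stopped pursuing higher clock frequencies more than a decade ago and turned instead to integrating more processor cores. Meanwhile, as transistor density keeps rising in line with Moore's law, the number of cores that can be integrated on a single chip has multiplied, and on-chip multi-core and many-core processors have become the mainstream of high-performance microprocessor development. Support for the shared-memory programming model in future thousand-core general-purpose many-core processors is an inevitable trend, but traditional cache coherence directory structures face problems such as high lookup latency, frequent directory entry replacement, and limited scalability of hardware cost and power consumption. Sparse directories strike a compromise between the hardware overhead of traditional directory structures and the efficiency of coherence maintenance, and are regarded as an energy-efficient, scalable structure for maintaining cache coherence in many-core processors. This paper surveys recent research and methods for improving sparse directory performance, analyzes them in terms of area, access latency, power consumption, and implementation complexity, and summarizes the strengths and shortcomings of each, providing a useful reference for the innovative design of shared-memory architectures for future high-performance many-core processors.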

13.

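Rethinking and optimizing index design based on persistent memory

Han Shukai, Xiong Ziwei, Jiang Dejun, Xiong Jin 《Journal of Computer Research and Development》2021,58(2):356-370

Non-volatile memory (NVM) is a new type of storage medium that has emerged in recent years. On the one hand, like traditional volatile memory, it offers low access latency and byte addressability; on the other hand, unlike volatile memory, the data it stores is not lost on power failure, and it also offers higher density and lower energy overhead. These characteristics make non-volatile memory a promising candidate for large-scale use in future computer...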

14.

Performance comparison of some shared memory organizations for 2D mesh-like NOCs

Martti Forsell 《Microprocessors and Microsystems》2011,35(2):274-284

While the research community has already studied a considerable number of techniques for achieving high bandwidth, good reliability, low power consumption, and a certain quality of service in communication on networks on chip (NOC), especially with artificial communication patterns, little attention has been paid to the effects of memory organization on the performance of computing engines employing NOCs with real parallel workloads. In this paper we compare the performance of some shared memory organizations for chip multiprocessors (CMP) employing advanced homogeneous 2D-mesh-like NOCs and making use of emulated shared memory and non-uniform memory access models. The evaluated techniques range from applying different hashing functions to methods for eliminating the speed difference between processing resources and memories, and from access methods to latency hiding and concurrent memory access support techniques. Tests are performed on our CMP/NOC framework with simple but real parallel programs that can be directly used as building blocks of larger explicitly parallel applications.

15.

Linked instruction caches for enhancing power efficiency of embedded systems

Chang-Jung Ku, Ching-Wen Chen, An Hsia, Chun-Lin Chen 《Microprocessors and Microsystems》2014

The power consumed by memory systems accounts for 45% of the total power consumed by an embedded system, and the power consumed during a memory access is 10 times higher than during a cache access. Thus, increasing the cache hit rate can effectively reduce the power consumption of the memory system and improve system performance. In this study, we increased the cache hit rate and reduced the cache-access power consumption by developing a new cache architecture known as a single linked cache (SLC) that stores frequently executed instructions. By adding a new link field, SLC combines the low power consumption and low access delay of a direct-mapped cache with the high hit rate of a two-way set-associative cache. In addition, we developed another design known as multiple linked caches (MLC) to further reduce the power consumption during each cache access and avoid unnecessary cache accesses when the requested data is absent from the cache. In MLC, the linked cache is split into several small linked caches that store frequently executed instructions to reduce the power consumption during each access. To avoid unnecessary cache accesses when a requested instruction is not in the linked caches, the addresses of the frequently executed blocks are recorded in the branch target buffer (BTB). By consulting the BTB, a processor can access the memory to obtain the requested instruction directly if the instruction is not in the cache. In the simulation results, our method performed better than selective compression, traditional cache, and filter cache in terms of the cache hit rate, power consumption, and execution time.

16.

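Design of an energy-saving MAC protocol for multiple wireless sensor networks

Zhu Liang 《Information Security and Technology》2012,(12):78-80,90

In wireless sensor networks, the media access control (MAC) layer protocol affects the performance of the whole network. Based on the node energy consumption and delay requirements of wireless sensor networks, this paper proposes an energy-saving MAC protocol based on cross-layer design. Using information exchange among the physical, MAC, and routing layers, and while ensuring reliable communication, it achieves multi-hop data transmission within a single listen/sleep cycle, shortens data transmission delay, effectively controls the redundancy of network data transmission, and reduces the energy consumption of redundant nodes. Performance analysis and simulation results show that the energy-saving MAC protocol effectively reduces network delay and node energy consumption.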

17.

Evaluation of low power consumption network on chip routing architecture

《Microprocessors and Microsystems》2021

Network on Chip (NoC) is a growing technology in which multiprocessor state interconnect patterns are formed. NoC technology is adapted to support a variety of multiprocessor requirements. Existing designs do not support the growing requirements of user applications. Because of the complex routing connections, several problems arise with traffic congestion and power consumption, contributing to low network efficiency. Traffic congestion, power consumption, and latency are significant concerns in NoC architectures because of the various dynamic routing connections. Existing models do not consider all of these factors and struggle to achieve higher performance. Previous methods do not trigger the circuits according to the traffic condition and maximum power consumption. To address this, the proposed High-Speed Virtual Logic Network on Chip router architecture is used to control traffic congestion and deadlock issues and to reduce latency by selecting the minimal-interval paths. In this research work, an architecture containing a virtual router is introduced that yields low power consumption and improves network performance by routing in a diagonal direction in addition to the other directions. The method also selects an optimal path according to various conditions, avoiding unnecessary triggering of chips and thereby reducing power consumption. The proposed model considers dynamic congestion and the available routes to perform routing with the least power consumption. In a comparison of the two architectures, the VC Router achieved 15% lower power consumption for the 8-bit system, 10% lower for the 16-bit system, and 22% lower for the 32-bit system.

18.

Dynamic replica placement and selection strategies in data grids— A comprehensive survey

R. Kingsy Grace, R. Manimegalai 《Journal of Parallel and Distributed Computing》2014

Data replication techniques are used in data grid to reduce makespan, storage consumption, access latency and network bandwidth. Data replication enhances data availability and thereby increases the system reliability. There are two steps involved in data replication, namely, replica placement and replica selection. Replica placement involves identifying the best possible node to duplicate data based on network latency and user request. Replica selection involves selecting the best replica location to access the data for job execution in the data grid. Various replica placement and selection algorithms are available in the literature. These algorithms measure and analyze different parameters such as bandwidth consumption, access cost, scalability, execution time, storage consumption and makespan. In this paper, various replica placement and selection strategies along with their merits and demerits are discussed. This paper also analyses the performance of various strategies with respect to the parameters mentioned above. In particular, this paper focuses on the dynamic replica placement and selection strategies in the data grid environment.

19.

An adaptive polling interval and short preamble media access control protocol for wireless sensor networks

Defu Chen, Zhengsu Tao 《Frontiers of Computer Science in China》2011,5(3):300-307

Media access control (MAC) protocols control how nodes access a shared wireless channel and are critical to the performance of wireless sensor networks (WSN). An adaptive polling interval and short preamble MAC protocol (AX-MAC) is proposed in this paper. AX-MAC is an asynchronous protocol composed of two basic features. First, rendezvous between the sender and the receiver is reached by a series of short preambles. Second, nodes dynamically adjust their polling intervals according to network traffic conditions. Threshold parameters used to determine traffic conditions and adjust polling intervals are analyzed based on a Markov chain. Energy consumption and network latency are also discussed in detail. Simulation results indicate that AX-MAC is suited to dynamic network traffic conditions and is superior to both X-MAC and Boost-MAC in energy consumption and latency.
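The second feature, threshold-driven adjustment of the polling interval, can be sketched as follows. This is an illustrative guess at the general shape of such a rule, not the protocol's actual update rule: the halving and doubling steps, the threshold values, and the interval bounds are all hypothetical, whereas the paper derives its thresholds from a Markov chain analysis.

```python
def adjust_polling_interval(interval_ms, packets_in_window,
                            low_threshold=2, high_threshold=8,
                            min_ms=50, max_ms=800):
    """Return the next polling interval given recently observed traffic.

    All parameter values are hypothetical. Heavy traffic shortens the
    interval (lower latency); light traffic lengthens it (less energy
    spent on channel polling).
    """
    if packets_in_window >= high_threshold:
        return max(min_ms, interval_ms // 2)
    if packets_in_window <= low_threshold:
        return min(max_ms, interval_ms * 2)
    return interval_ms


# Simulate a node seeing a traffic burst followed by a quiet period.
interval = 200
for packets in [10, 12, 9, 5, 1, 0, 0]:
    interval = adjust_polling_interval(interval, packets)
    print(packets, interval)
```

Clamping to a minimum and maximum keeps the node from polling continuously under a burst or sleeping indefinitely when the channel is quiet.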

20.

A workload independent energy reduction strategy for D-NUCA caches

Pierfrancesco Foglia, Manuel Comparetti 《The Journal of supercomputing》2014,68(1):157-182

Wire delays and leakage energy consumption are both growing problems in the design of large on-chip caches built in deep submicron technologies. D-NUCA caches (Dynamic Non-Uniform Cache Architecture) exploit aggressive sub-banking of the cache and a migration mechanism to reduce the access latency of frequently accessed data and thus limit the effect of wire delays on performance. Way Adaptable D-NUCA is a leakage power reduction technique specifically suited for D-NUCA caches. It dynamically varies the portion of the powered-on cache area based on the caching needs of the running workload, but it relies on application-dependent parameters that must be evaluated off-line. This limits the effectiveness of Way Adaptable D-NUCA in general-purpose, multiprogrammed environments. In this paper, we propose a new power reduction technique for D-NUCA caches, which still adapts the powered-on cache area to the needs of the running workload but does not rely on application-dependent parameters. Results show that our proposal saves around 49% of total cache energy consumption in a single-core environment and 44% in a CMP environment. By adding a timer, it performs similarly to previously proposed techniques for reducing leakage power consumption, and outperforms them when they are applied in a workload-independent manner.