
1.

Exploring the Design Space of Self-Regulating Power-Aware On/Off Interconnection Networks

Soteriou V. Li-Shiuan Peh 《Parallel and Distributed Systems, IEEE Transactions on》2007,18(3):393-408

With power consumption becoming increasingly critical in interconnected systems, power-aware networks have become part and parcel of many single-chip and multichip systems. As communication links consume significant power regardless of utilization, one mechanism for realizing such power-aware networks is on/off links: network links that can be turned on/off as a function of traffic. In this paper, we investigate and propose self-regulating power-aware interconnection networks that turn their links on/off in response to bursts and dips in traffic in a distributed fashion. We explore the design space of such on/off networks, outlining a 5-step design methodology along with various building block solutions at each step that can be effectively assembled to develop various on/off network designs. We applied our methodology to the design of two classes of on/off networks with links that possess substantially different on/off delays, an on-chip network as well as a chip-to-chip network, and show that our designs are able to adapt dynamically to variations in network traffic. Three specific network designs are then constructed, presented, and evaluated. Our simulations show that link power consumption can be reduced by up to 54.4 percent, with a modest increase in network latency.

2.

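A Network-Level Green Energy-Saving Mechanism for Backbone Networks

Zhang Jinhong (张金宏), Wang Xingwei (王兴伟), Yi Bo (易波), Huang Min (黄敏) 《Journal of Software (软件学报)》2020,31(9):2926-2943

In recent years, the high energy consumption of the Internet worldwide has drawn sustained attention, and energy saving has become one of the popular topics in future Internet research. For backbone networks, a network-level green energy-saving mechanism is proposed. On the one hand, in the global view, a minimum-residual-capacity-first green routing algorithm plans the global routing paths so that the number of bundled links switched on in the network is minimized, which achieves the first step of energy saving. On the other hand, in the local view, a green best-fit-decreasing algorithm aggregates the traffic load onto the smallest set of physical links within each bundled link, so that as many physical links as possible can be switched off, which achieves further energy saving. The proposed mechanism takes users' QoS requirements into account while saving energy, maximizing the energy-saving gain under the premise of guaranteed QoS. To evaluate the mechanism comprehensively, three typical backbone topologies are selected (CERNET2, GéANT, and INTERNET2) and, under high, medium, and low load, the mechanism is compared in detail with three other energy-saving mechanisms in terms of network power consumption and network performance (average routing hop count, number of physical links switched off, routing success rate, and running time). Simulation results show that the mechanism saves energy significantly and performs satisfactorily.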
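The local-view step described in the abstract above (packing traffic onto the fewest physical links of a bundle) resembles classic best-fit-decreasing bin packing. The following is a minimal sketch of that generic heuristic only, not the paper's "green" variant: the function name, the equal per-link capacity, and the treatment of flows as unsplittable demands are all assumptions made for illustration.

```python
def best_fit_decreasing(flows, link_capacity, num_links):
    """Pack flow demands onto as few physical links as possible.

    Hypothetical helper: every physical link in the bundle is assumed to
    have the same capacity, and each flow must fit on a single link.
    Returns one list of demands per switched-on link, or None if the
    bundle cannot carry all flows.
    """
    residual = []    # remaining capacity of each switched-on link
    assignment = []  # demands placed on each switched-on link
    for demand in sorted(flows, reverse=True):  # largest demand first
        if demand > link_capacity:
            return None
        # Best fit: among switched-on links that can still carry the demand,
        # pick the one with the least residual capacity.
        candidates = [i for i, r in enumerate(residual) if r >= demand]
        if candidates:
            best = min(candidates, key=lambda i: residual[i])
        elif len(residual) < num_links:
            # No switched-on link fits, so switch on one more link.
            residual.append(link_capacity)
            assignment.append([])
            best = len(residual) - 1
        else:
            return None
        residual[best] -= demand
        assignment[best].append(demand)
    return assignment


if __name__ == "__main__":
    used = best_fit_decreasing([4, 8, 1, 4, 2, 1], link_capacity=10, num_links=4)
    print(len(used), "of 4 links stay on:", used)
```

Links that never appear in the returned list carry no traffic and are the candidates for being switched off.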

3.

Silicon-aware distributed switch architecture for on-chip networks

《Journal of Systems Architecture》2013,59(7):505-515

It is well known that current Chip MultiProcessor (CMP) and high-end MultiProcessor System-on-Chip (MPSoC) designs are growing in their number of components. Networks-on-Chip (NoC) provide the required connectivity for such CMP and MPSoC designs at reasonable costs. As technology advances, links become the critical component in the NoC due to their long delay and power consumption, which become unacceptable for long global interconnects. In this paper we present a new switch architecture that reduces the negative impact of links on the NoC. We call our proposal the distributed switch. The distributed switch spreads the circuitry of the switch onto the links. Thus, packets are buffered, routed, and forwarded at the same time they are crossing the link. Distributing a modular switch onto the link improves the trade-off between the power consumption and the operating frequency of the entire network. On the other hand, area requirements increase. Additionally, the distributed switch presents better fault tolerance and process variation behavior than a non-distributed switch.

4.

Power-efficient Interconnection Networks: Dynamic Voltage Scaling with Links

《Computer Architecture Letters》2002,1(1):6-6

Power consumption is a key issue in high-performance interconnection network design. Communication links, already a significant consumer of power now, will take up an ever larger portion of the power budget as demand for network bandwidth increases. In this paper, we motivate the use of dynamic voltage scaling (DVS) for links, where the frequency and voltage of links are dynamically adjusted to minimize power consumption. We propose a history-based DVS algorithm that judiciously adjusts DVS policies based on past link utilization. Despite very conservative assumptions about DVS link characteristics, our approach realizes up to 4.5X power savings (3.2X average), with just an average 27.4% latency increase and 2.5% throughput reduction. To the best of our knowledge, this is the first study that targets dynamic power optimization of interconnection networks.
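The abstract above says only that the DVS policy is driven by past link utilization. One way such a history-based controller could look is sketched below. The class name, the exponentially weighted average, the frequency levels, and both thresholds are hypothetical choices made for illustration, not values taken from the paper.

```python
class HistoryDvsLink:
    """Toy history-based DVS controller for a single link (illustrative only)."""

    def __init__(self, levels=(0.25, 0.5, 0.75, 1.0), weight=0.5, low=0.3, high=0.7):
        self.levels = levels            # assumed relative frequency levels
        self.index = len(levels) - 1    # start at full speed
        self.weight = weight            # weight given to the newest sample
        self.low = low                  # step down below this predicted utilization
        self.high = high                # step up above this predicted utilization
        self.predicted = 0.0            # running utilization estimate

    def update(self, utilization):
        """Feed the utilization (0..1) of the last window; return the new level."""
        # Blend the newest measurement with the history kept so far.
        self.predicted = (self.weight * utilization
                          + (1 - self.weight) * self.predicted)
        if self.predicted > self.high and self.index < len(self.levels) - 1:
            self.index += 1             # traffic is building up: speed the link up
        elif self.predicted < self.low and self.index > 0:
            self.index -= 1             # link is mostly idle: slow it down
        return self.levels[self.index]


if __name__ == "__main__":
    link = HistoryDvsLink()
    for u in (0.0, 0.0, 1.0, 1.0):
        print(u, "->", link.update(u))
```

Changing the level by at most one step per window is a deliberate simplification: it keeps a single burst or dip from swinging the link across its whole range.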

5.

FILESPPA: Fast Instruction Level Embedded System Power and Performance Analyzer

Nikolaos Kroupis Dimitrios Soudris 《Microprocessors and Microsystems》2011,35(3):329-342

In low-power embedded systems design, it is important to analyze and optimize both the hardware and the software components of the system. Evaluating the power consumption of embedded systems is a very slow procedure when instruction-level power models are used in a simulator. Moreover, a huge number of simulations are needed to explore the power consumption of the instruction memory hierarchy and find the best cache parameters for each level of the hierarchy. In this paper we present a methodology that aims to estimate the system power consumption in a short time, without simulation. The proposed methodology is based on fast instruction analysis using instruction-level power models together with cache memory and memory power models. Based on the proposed methodology, a software tool named FILESPPA was developed to automate the methodology's steps for MIPS processor architectures. The experimental results show the efficiency of the proposed methodology and tool in terms of estimation accuracy, while reducing the system power estimation time relative to the simulation technique.

6.

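A Hierarchical Power Management Strategy for IP-Core-Based Design

Chen Jinghua (陈静华), Chen Diping (陈迪平), Xu Yongjun (徐勇军), Zhang Zhimin (张志敏), Li Xiaowei (李晓维) 《Journal of Computer-Aided Design & Computer Graphics (计算机辅助设计与图形学学报)》2005,17(5):1079-1084

Based on a study of existing low-power design techniques and dynamic power management strategies, a hierarchical power management strategy suited to IP-core-based circuit design is proposed, together with its implementation method. Power is reduced by having the system rapidly enter a very-low-power mode whenever it is idle. The approach not only handles the power management of complex systems well but also scales well. Experimental results from applying the method to a multi-million-gate system-on-chip design show that power management is achieved to a large extent and system power is greatly reduced, with little impact on chip area and performance.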

7.

Global analysis of piecewise linear systems using impact maps and surface Lyapunov functions

Goncalves J.M. Megretski A. Dahleh M.A. 《Automatic Control, IEEE Transactions on》2003,48(12):2089-2106

This paper presents an entirely new constructive global analysis methodology for a class of hybrid systems known as piecewise linear systems (PLS). This methodology infers global properties of PLS solely by studying the behavior at switching surfaces associated with PLS. The main idea is to analyze impact maps, i.e., maps from one switching surface to the next switching surface. Such maps are known to be "unfriendly" maps in the sense that they are highly nonlinear, multivalued, and not continuous. We found, however, that an impact map induced by a linear time-invariant flow between two switching surfaces can be represented as a linear transformation analytically parametrized by a scalar function of the state. This representation of impact maps allows the search for surface Lyapunov functions (SuLF) to be done by simply solving a semidefinite program, allowing global asymptotic stability, robustness, and performance of limit cycles and equilibrium points of PLS to be efficiently checked. This new analysis methodology has been applied to relay feedback, on/off and saturation systems, where it has been shown to be very successful in globally analyzing a large number of examples. In fact, it is still an open problem whether there exists an example with a globally stable limit cycle or equilibrium point that cannot be successfully analyzed with this new methodology. Examples analyzed include systems of relative degree larger than one and of high dimension, for which no other analysis methodology could be applied. This success in globally analyzing certain classes of PLS has shown the power of this new methodology, and suggests its potential toward the analysis of larger and more complex PLS.

8.

Accelerating embedded image processing for real time: a case study

Sol Pedre Tomáš Krajník Elías Todorovich Patricia Borensztejn 《Journal of Real-Time Image Processing》2016,11(2):349-374

Many image processing applications need real-time performance, while having restrictions of size, weight and power consumption. Common solutions, including hardware/software co-designs, are based on Field Programmable Gate Arrays (FPGAs). Their main drawback is long development time. In this work, a co-design methodology for processor-centric embedded systems with hardware acceleration using FPGAs is proposed. The goal of this methodology is to achieve real-time embedded solutions, using hardware acceleration, but achieving development time similar to that of software projects. Well-established methodologies, techniques and languages from the software domain (such as Object-Oriented Paradigm design, Unified Modelling Language, and multithreading programming) are applied, and semiautomatic C-to-HDL translation tools and methods are used and compared. The methodology is applied to achieve an embedded implementation of a global vision algorithm for the localization of multiple robots in an e-learning robotic laboratory. The algorithm is specifically developed to work reliably 24/7 and to detect the robots' positions and headings even in the presence of the partial occlusions and varying lighting conditions to be expected in a normal classroom. The co-designed implementation of this algorithm processes 1,600 × 1,200 pixel images at a rate of 32 fps with an estimated energy consumption of 17 mJ per frame. It achieves a 16× acceleration and 92% energy saving, which compares favorably with the most optimized embedded software solutions. This case study shows the usefulness of the proposed methodology for embedded real-time image processing applications.

9.

Race-to-halt energy saving strategies

《Journal of Systems Architecture》2014,60(10):796-815

Energy consumption is one of the major issues for modern embedded systems. Early power-saving approaches mainly focused on dynamic power dissipation, while neglecting static (leakage) energy consumption. However, technology improvements have led to a situation in which static power dissipation increasingly dominates. Addressing this issue, hardware vendors have equipped modern processors with several sleep states. We propose a set of leakage-aware energy management approaches that reduce the energy consumption of embedded real-time systems while respecting the real-time constraints. Our algorithms are based on the race-to-halt strategy, which tends to run the system at top speed with the aim of creating long idle intervals that are used to deploy a sleep state. The effectiveness of our algorithms is illustrated with an extensive set of simulations that show an improvement of up to 8% reduction in energy consumption over existing work at high utilization. The complexity of our algorithms is lower than that of state-of-the-art algorithms. We also eliminate assumptions made in the related work that restrict the practical application of the respective algorithms. Moreover, a novel study of the relation between the use of sleep intervals and the number of pre-emptions is also presented, utilizing a large set of simulation results, where our algorithms reduce the experienced number of pre-emptions in all cases. Our results show that sleep states in general can save up to 30% of the overall number of pre-emptions when compared to the sleep-agnostic earliest-deadline-first algorithm.
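Why race-to-halt favors long idle intervals can be seen from the usual break-even argument for sleep states: entering and leaving a sleep state costs a fixed amount of energy, so sleeping only pays off when the idle interval is long enough. The sketch below works through that arithmetic under a deliberately simple model (constant idle and sleep power, a single lumped transition energy). The function names and the model are assumptions for illustration and are not the algorithms proposed in the paper.

```python
def break_even_time(p_idle, p_sleep, e_transition):
    """Shortest idle interval (seconds) for which sleeping saves energy.

    Staying idle for t seconds costs p_idle * t.
    Sleeping for t seconds costs e_transition + p_sleep * t.
    The two costs are equal at t = e_transition / (p_idle - p_sleep).
    """
    if p_idle <= p_sleep:
        raise ValueError("sleep state must draw less power than the idle state")
    return e_transition / (p_idle - p_sleep)


def should_sleep(idle_time, p_idle, p_sleep, e_transition):
    """True when the idle interval is longer than the break-even time."""
    return idle_time > break_even_time(p_idle, p_sleep, e_transition)


if __name__ == "__main__":
    # Hypothetical figures: 2.0 W idle, 0.5 W asleep, 3.0 J per sleep/wake cycle.
    print(break_even_time(2.0, 0.5, 3.0))    # 2.0 seconds
    print(should_sleep(3.0, 2.0, 0.5, 3.0))  # True
    print(should_sleep(1.0, 2.0, 0.5, 3.0))  # False
```

Merging several short idle gaps into one long interval, which is what running at top speed achieves, moves more intervals past this threshold.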

10.

Performance‐steered design of software architectures for embedded multicore systems

Alessio Bechini Cosimo Antonio Prete 《Software》2002,32(12):1155-1173


11.

Compiler-directed power optimization of high-performance interconnection networks for load-balancing MPI applications

Yang Xuejun Yi Huizhan Qu Xiangli Zhou Haifang 《Frontiers of Computer Science in China》2007,1(1):94-105

Energy consumption of parallel computers has become an obstacle to higher-performance systems. In this paper, we focus on power optimization of high-performance interconnection networks for MPI applications in high-performance parallel computers. In contrast to past history-based work, we propose the idea of compiler-directed power-aware on/off network links. There are idle intervals for network links during the execution of parallel applications, during which the links still consume large amounts of energy. Using on/off network links, the compiler first divides load-balancing MPI applications into communication intervals and computation intervals, and then inserts on/off instructions into the applications to switch the link state. To avoid the time overhead of state switching, we use a time estimation technique to analyze the computation time, and insert the on instruction before the communication intervals are reached. Results from simulations and experiments show that the proposed compiler-directed method can reduce the energy consumption of interconnection networks by 20% to 70%, at a cost of less than 1% in network latency and performance degradation.

12.

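Energy-Saving Real-Time Dynamic Scheduling Algorithms for Multiprocessor Computing Environments

Han Jianjun (韩建军), Li Qinghua (李庆华), Miao Tianpeng (缪天鹏) 《小型微型计算机系统》2006,27(5):866-872

The high energy consumption of current processors raises heat dissipation and lowers system reliability, and has become a matter of concern in the computing field. However, most existing techniques that effectively reduce energy consumption target uniprocessor systems and seldom consider multiprocessor systems. The proposed scheduling algorithms target multiprocessor computing environments. They are based on scheduling the task with the shortest execution time first, combined with other effective techniques (shared slack reclamation), so that real-time tasks complete within their deadlines while the energy consumption of the whole system is effectively reduced. For independent task sets and for task sets with dependencies, two algorithms for homogeneous computing environments are proposed, STFBA1 (Shortest-Task-First-Based Algorithm) and STFBA2, as well as two algorithms for multiple task sets, HSA1 (Hybrid Scheduling Algorithm) and HSA2. In single-task-set environments, the algorithms perform better (in schedule length and energy consumption) than the effective algorithms known to date. In multi-task-set environments, the algorithms based on the hybrid scheduling strategy markedly improve scheduling performance.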
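The core ordering rule named in the abstract above, shortest task first on a homogeneous multiprocessor, can be sketched as simple list scheduling. The code below illustrates only that ordering for independent tasks. It leaves out deadlines, task dependencies, slack reclamation, and any energy model, and the function name and data layout are assumptions rather than the paper's STFBA1/STFBA2 definitions.

```python
import heapq


def shortest_task_first(exec_times, num_processors):
    """Greedy shortest-task-first list scheduling of independent tasks.

    Returns (assignment, makespan), where assignment[i] is the tuple
    (processor, start, finish) chosen for task i.
    """
    # Min-heap of (time at which the processor becomes free, processor id).
    ready = [(0, p) for p in range(num_processors)]
    heapq.heapify(ready)
    assignment = [None] * len(exec_times)
    # Visit tasks in order of increasing execution time.
    order = sorted(range(len(exec_times)), key=lambda i: exec_times[i])
    for i in order:
        free_at, proc = heapq.heappop(ready)  # earliest-available processor
        finish = free_at + exec_times[i]
        assignment[i] = (proc, free_at, finish)
        heapq.heappush(ready, (finish, proc))
    makespan = max((finish for _, _, finish in assignment), default=0)
    return assignment, makespan


if __name__ == "__main__":
    plan, length = shortest_task_first([5, 2, 3, 1], num_processors=2)
    print(plan, length)
```

In an energy-aware scheduler, the gap between a task's finish time and its deadline is the slack that could be reclaimed to run later tasks at a lower speed; that step is not modeled here.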

13.

Comparing system level power management policies

Yung-Hsiang Lu De Micheli G. 《Design & Test of Computers, IEEE》2001,18(2):10-19

Reducing power consumption is a challenge to system designers. Portable systems, such as laptop computers and personal digital assistants (PDAs), draw power from batteries, so reducing power consumption extends their operating times. For desktop computers or servers, high power consumption raises temperature and deteriorates performance and reliability. Soaring energy prices and rising concern about the environmental impact of electronics systems further highlight the importance of low power consumption. Power reduction techniques can be classified as static and dynamic. Static techniques, such as synthesis and compilation for low power, are applied at design time. In contrast, dynamic techniques use runtime behavior to reduce power when systems are serving light workloads or are idle. These techniques are known as dynamic power management (DPM). DPM can be achieved in different ways; for example, dynamic voltage scaling (DVS) changes supply voltage at runtime as a method of power management. Here, we use DPM specifically for shutting down unused I/O devices. We built an experimental environment on a laptop computer running Microsoft Windows. We implemented existing power management policies and quantitatively compared their effects on power saving and performance degradation.

14.

Adding spatial flexibility to source-receptor relationships for air quality modeling

《Environmental Modelling & Software》2017

To cope with computing power limitations, air quality models that are used in integrated assessment applications are generally approximated by simpler expressions referred to as “source-receptor relationships (SRR)”. In addition to speed, it is desirable for the SRR also to be spatially flexible (applicable over a wide range of situations) and to require a “light setup” (based on a limited number of full Air Quality Model, AQM, simulations). But “speed”, “flexibility” and “light setup” do not naturally come together, and a good compromise must be found that preserves “accuracy”, i.e. a good comparability between SRR results and AQM. In this work we further develop an SRR methodology to better capture spatial flexibility. The updated methodology is based on a cell-to-cell relationship, in which a bell-shaped function links emissions to concentrations. Maintaining a cell-to-cell relationship is shown to be the key element needed to ensure spatial flexibility, while at the same time the proposed approach to linking emissions and concentrations guarantees a “light setup” phase. Validation has been repeated on different areas and domain sizes (countries, regions, provinces throughout Europe) for precursors reduced independently or simultaneously. All runs showed a bias around 10% between the full AQM and the SRR. This methodology allows assessing the impact on air quality of emission scenarios applied over any given area in Europe (regions, sets of regions, countries), provided that a limited number of AQM simulations are performed for training.

15.

Design, implementation, and experimental validation of optimal power split control for hybrid electric trucks

Thijs van Keulen Dominique van Mullem Bram de Jager Maarten Steinbuch 《Control Engineering Practice》2012,20(5):547-558

Hybrid electric vehicles require an algorithm that controls the power split between the internal combustion engine and electric machine(s), and the opening and closing of the clutch. Optimal control theory is applied to derive a methodology for a real-time optimal-control-based power split algorithm. The presented strategy is adaptive for vehicle mass and road elevation, and is implemented on a standard Electronic Control Unit of a parallel hybrid electric truck. The implemented strategy is experimentally validated on a chassis dynamometer. The fuel consumption is measured on 12 different trajectories and compared with a heuristic and a non-hybrid strategy. The optimal control strategy has lower fuel consumption (by up to 3%) than the heuristic strategy on all trajectories that are evaluated, except one. Compared to the non-hybrid strategy, the fuel consumption reduction ranged from 7% to 16%.

16.

Modeling power consumption of 3D MPDATA and the CG method on ARM and Intel multicore architectures

Krzysztof Rojek Enrique S. Quintana-Ortí Roman Wyrzykowski 《The Journal of supercomputing》2017,73(10):4373-4389

We propose an approach to estimate the power consumption of algorithms, as a function of the frequency and number of cores, using only a very reduced set of real power measurements. In addition, we also provide the formulation of a method to select the voltage–frequency scaling–concurrency throttling configurations that should be tested in order to obtain accurate estimations of the power dissipation. The power models and selection methodology are verified using two real scientific applications: the stencil-based 3D MPDATA algorithm and the conjugate gradient (CG) method for sparse linear systems. MPDATA is a crucial component of the EULAG model, which is widely used in weather forecast simulations. The CG algorithm is the keystone for iterative solution of sparse symmetric positive definite linear systems via Krylov subspace methods. The reliability of the method is confirmed for a variety of ARM and Intel architectures, where the estimated results correspond to the real measured values with the average error being slightly below 5% in all cases.

17.

Energy efficient online routing of flows with additive constraints

Stefano Avallone Giorgio Ventre 《Computer Networks》2012,56(10):2368-2382

A number of studies report that ICT sectors are responsible for up to 10% of the worldwide power consumption and that a substantial share of such amount is due to the Internet infrastructure. To accommodate the traffic in the peak hours, Internet Service Providers (ISP) have overprovisioned their networks, with the result that most of the links and devices are under-utilized most of the time. Thus, under-utilized links and devices may be put in a sleep state in order to save power and that might be achieved by properly routing traffic flows. In this paper, we address the design of a joint admission control and routing scheme aiming at maximizing the number of admitted flow requests while minimizing the number of nodes and links that need to stay active. We assume an online routing paradigm, where flow requests are processed one-by-one, with no knowledge of future flow requests. Each flow request has requirements in terms of bandwidth and m additive measures (e.g., delay, jitter). We develop a new routing algorithm, E²-MCRA, which searches for a feasible path for a given flow request that requires the least number of nodes and links to be turned on. The basic concepts of E²-MCRA are look-ahead, the depth-first search approach and a path length definition as a function of the available bandwidth, the additive QoS constraints and the current status (on/off) of the nodes and links along the path. Finally, we present the results of the simulation studies we conducted to evaluate the performance of the proposed algorithm.

18.

Memory power optimization of Java-based embedded systems exploiting garbage collection information

Jose Manuel Velasco David Atienza Katzalin Olcoz 《Journal of Systems Architecture》2012,58(2):61-72

Nowadays, Java is used in all types of embedded devices. For these memory-constrained systems, the automatic dynamic memory manager (Garbage Collector or GC) has always been a key factor in terms of the Java Virtual Machine (JVM) performance. Moreover, in current embedded platforms, power consumption is becoming as important as performance. Thus, in this paper we present an exploration, from an energy viewpoint, of the different possibilities of memory hierarchies for high-performance embedded systems when used by state-of-the-art GCs. This is a starting point for a better understanding of the interactions between the Java applications, the memory hierarchy and the GC. Hence, we subsequently present two techniques to reduce energy consumption on Java-based embedded systems, based on exploiting GC information. The first technique uses GC execution behavior to reduce leakage energy consumption, taking advantage of the low-power mode of actual multi-banked SDRAM memories, and it is intended for generational collectors. This technique can achieve a reduction of up to 50% of SDRAM memory leakage. The second technique involves the inclusion of a software-controlled (scratch-pad) memory that stores GC instructions under the JVM control to reduce the active energy consumption and also improve the performance of the target embedded system, and it is aimed at all kinds of garbage collectors. For this last technique we have experimented with two different approaches for selecting the GC code to be stored in the scratchpad memory: one static and one dynamic.
Our experimental results show that the proposed dynamic scratchpad management approach for GCs enables up to 63% energy consumption reduction and 25% performance improvement during the collector phase, which means, in terms of JVM execution, a global reduction of 29% and 17% for energy and cycles, respectively. Overall, this work outlines that the key for an efficient low-power implementation of Java Virtual Machines for high-performance embedded systems is the synergy between the GC choice, the memory architecture tuning, and the inclusion of power management schemes controlled by the JVM, exploiting knowledge of the GC behavior.

19.

From commercial documents to system requirements: an approach for the engineering of novel CBTC solutions

Alessio Ferrari Giorgio O. Spagnolo Giacomo Martelli Simone Menabeni 《International Journal on Software Tools for Technology Transfer (STTT)》2014,16(6):647-667

Communications-based train control (CBTC) systems are the new frontier of automated train control and operation. Currently developed CBTC platforms are actually very complex systems including several functionalities, and every installed system, developed by a different company, varies in extent, scope, number, and even names of the implemented functionalities. International standards have emerged, but they remain at a quite abstract level, mostly setting terminology. This paper presents the results of an experience in defining a global model of CBTC, by mixing semi-formal modelling and product line engineering. The effort has been based on an in-depth market analysis, not limited to particular aspects but considering as far as possible the whole picture. The paper also describes a methodology to derive novel CBTC products from the global model, and to define system requirements for the individual CBTC components. To this end, the proposed methodology employs scenario-based requirements elicitation aided with rapid prototyping. To enhance the quality of the requirements, these are written in a constrained natural language (CNL), and evaluated with natural language processing (NLP) techniques. The final goal is to go toward a formal representation of the requirements for CBTC systems. The overall approach is discussed, and the current experience with the implementation of the method is presented. In particular, we show how the presented methodology has been used in practice to derive a novel CBTC architecture.

20.

SEProf: A high-level software energy profiling tool for an embedded processor enabling power management functions

Shiao-Li Tsao Jian Jhen Chen 《Journal of Systems and Software》2012,85(8):1757-1769

Energy efficiency has become one of the most important design issues for embedded systems. To examine the power consumption of an embedded system, an energy profiling tool is in high demand. Although a number of energy profiling tools have been proposed, they are not directly applicable to the embedded processors with power management functions that are widely utilized in battery-operated embedded systems to reduce power consumption. Hence, this study presents a high-level energy profiling tool, called SEProf, that estimates the energy consumption of an embedded system running multithread software and a multitasking operating system (OS) that supports power management functions. This study implements the proposed SEProf in Linux 2.6.19 and evaluates its performance on an ARM11 MPCore processor. Experimental results demonstrate that the proposed tool can provide accurate energy profiling results with a low profiling overhead.