
1.

Energy and time efficient task offloading and resource allocation on the generic IoT-fog-cloud architecture

Sun Huaiying Yu Huiqun Fan Guisheng Chen Liqiong 《Peer-to-Peer Networking and Applications》2020,13(2):548-563

Peer-to-Peer Networking and Applications - With the exponential increase in the number of IoT devices and the amount of emitted data from these devices, it is expensive and inefficient to offload...

2.

Towards efficient allocation of graph convolutional networks on hybrid computation-in-memory architecture

Jiaxian CHEN Guanquan LIN Jiexin CHEN Yi WANG 《中国科学:信息科学(英文版)》 (Science China Information Sciences) 2021,(6):108-121

Graph convolutional networks (GCNs) have been applied successfully in social networks and recommendation systems to analyze graph data. Unlike conventional neura...

3.

Hyper switching memory utilization on hybrid main memory for improved task execution and reduced power consumption

《Microprocessors and Microsystems》2020

The problem of maximizing the lifetime of PCM has been well studied. The arrival of non-volatile memory devices has replaced the traditional DRAM. Still, the DRAM has many limitations regarding endurance and high-power write operations. Similarly, a number of designs have been discussed earlier to maximize the lifetime of PCM by caching the main memory in the available DRAM. Still, they could not achieve the desired reduction in power consumption or increase in memory utilization. To improve power consumption and maximize lifetime, a categorical model is presented in this paper. The proposed method categorizes processes according to their memory access activity. Each categorized process is allocated to the respective part of the hybrid memory, which encourages maximum reads and minimum writes in PCM. The proposed method extends the lifetime of PCM more than other methods.
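The categorization idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the paper's actual categories or thresholds, so the `ProcessStats` record, the `place` function, and the `write_ratio_threshold` value below are all hypothetical; the sketch only shows the general idea of steering write-heavy processes to DRAM and read-dominated ones to PCM.

```python
from dataclasses import dataclass


@dataclass
class ProcessStats:
    """Observed memory access counts for one process (hypothetical record)."""
    name: str
    reads: int
    writes: int


def place(proc: ProcessStats, write_ratio_threshold: float = 0.3) -> str:
    """Return 'DRAM' for write-heavy processes and 'PCM' for read-dominated ones.

    The threshold is an assumed tuning knob, not a value from the paper.
    """
    total = proc.reads + proc.writes
    if total == 0:
        # No observed activity: assumed harmless to place in PCM.
        return "PCM"
    write_ratio = proc.writes / total
    return "DRAM" if write_ratio > write_ratio_threshold else "PCM"


def place_all(procs):
    """Map every process name to the memory region chosen by place()."""
    return {p.name: place(p) for p in procs}
```

For instance, `place_all([ProcessStats("logger", 10, 90), ProcessStats("reader", 90, 10)])` sends the write-heavy process to DRAM and the read-heavy one to PCM.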

4.

An efficient task allocation scheme for 2D mesh architectures (cited by: 1; self-citations: 0; citations by others: 1)

Seong-Moo Yoo Hee Yong Youn Shirazi B. 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(9):934-942

Efficient allocation of processors to incoming tasks in parallel computer systems is very important for achieving the desired high performance. It requires recognizing the free available processors with minimum overhead. In this paper, we present an efficient task allocation scheme for 2D mesh architectures. By employing a new approach for searching the mesh, our scheme can find the available submesh without scanning the entire mesh, unlike earlier designs. Comprehensive computer simulation reveals that the average allocation time and waiting delay are much smaller than those of earlier schemes of comparable performance, irrespective of the size of the meshes and the distribution of the shapes of the incoming tasks.

5.

A wear-leveling strategy for a novel hybrid memory architecture based on Bloom filters

Zhang Zhen, Fu Yinjin, Hu Guyu 《计算机应用》 (Journal of Computer Applications) 2018,38(8):2230-2235

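Phase-change memory (PCM), with its low power consumption, is a promising candidate for next-generation main memory, but its limited endurance is a major obstacle to wide adoption. Existing random-access memory (DRAM) caching and wear-leveling techniques extend PCM lifetime from two angles, reducing the number of PCM writes and evening out the distribution of write operations, respectively. However, the former ignores the read/write tendency of data when writing it back, and the latter suffers from problems with data-exchange granularity, space overhead, and randomness in workloads with strong spatial locality. This paper therefore designs a new hybrid memory architecture. It proposes a caching policy that distinguishes the read/write tendency of data by combining the least recently used (LRU) algorithm with the least frequently used with aging (LFU-Aging) algorithm, and designs a Bloom filter (BF) based dynamic wear-leveling algorithm for working sets with strong spatial locality, which effectively reduces redundant writes while achieving inter-group wear leveling with low space overhead. Experimental results show that the strategy reduces writes to PCM by 13.4% to 38.6% and effectively evens out the write distribution of more than 90% of the groups.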
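As background for the Bloom filter mentioned in the abstract above, here is a minimal sketch of the generic data structure only. It is not the paper's wear-leveling algorithm; the class name, the bit-array size, the number of hash functions, and the use of SHA-256 to derive bit positions are all assumptions made for illustration. A structure like this could, for example, record which memory groups were written recently using far less space than an exact set.

```python
import hashlib


class BloomFilter:
    """Space-efficient set membership with possible false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key) -> None:
        """Set every bit position derived from the key."""
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))
```

A key that was added is always reported as present; a key that was never added is usually, but not always, reported as absent, which is the trade-off that buys the low space overhead.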

6.

Dynamic recognition prefetch engine for DRAM-PCM hybrid main memory

Zhang Mengzhao Kim Jeong-Geun Yoon Su-Kyung Kim Shin-Dug 《The Journal of supercomputing》2022,78(2):1885-1902

The Journal of Supercomputing - This research is to design an effective prefetching method required for hybrid main memory systems consisting of dynamic random-access memory (DRAM) and phase-change...

7.

A hybrid memory architecture supporting fine-grained data migration

Ye CHI Jianhui YUE Xiaofei LIAO Haikun LIU Hai JIN 《Frontiers of Computer Science》2024,18(2):182103

Hybrid memory systems composed of dynamic random access memory (DRAM) and non-volatile memory (NVM) often exploit page migration to take full advantage of the different memory media. Most previous proposals migrate data at a granularity of 4 KB pages, and thus waste memory bandwidth and DRAM resources. In this paper, we propose Mocha, a non-hierarchical architecture that organizes DRAM and NVM in a flat physical address space but manages them as a cache/memory hierarchy. Since the commercial NVM device, Intel Optane DC Persistent Memory Modules (DCPMM), actually accesses the physical media at a granularity of 256 bytes (an Optane block), we manage the DRAM cache at the 256-byte granularity to match this feature of Optane. This design not only enables fine-grained data migration and management for the DRAM cache, but also avoids write amplification for Intel Optane DCPMM. We also create an Indirect Address Cache (IAC) in the Hybrid Memory Controller (HMC) and propose a reverse address mapping table in the DRAM to speed up address translation and cache replacement. Moreover, we exploit a utility-based caching mechanism to filter cold blocks in the NVM and further improve the efficiency of the DRAM cache. We implement Mocha in an architectural simulator. Experimental results show that Mocha improves application performance by 8.2% on average (up to 24.6%) and reduces energy consumption by 6.9% and data migration traffic by 25.9% on average, compared with HSCC, a typical hybrid memory architecture.

8.

General memory efficient packet matching FPGA architecture for future high-speed networks

《Microprocessors and Microsystems》2020

Packet classification (matching) is one of the critical operations in networking, widely used in many different devices and tasks ranging from switching or routing to a variety of monitoring and security applications such as firewalls or IDS. To satisfy the ever-growing performance demands of current and future high-speed networks, specially designed hardware-accelerated architectures implementing packet classification are necessary. These demands are now growing to such an extent that, in order to keep up with the rising throughputs of network links, FPGA-accelerated architectures are required to match multiple packets in every single clock cycle. To meet this requirement, a simple replication approach can be used: instantiate multiple copies of a processing pipeline that match incoming packets in parallel. However, simple replication of pipelines inevitably brings a significant increase in the utilization of FPGA resources of all types, which is especially costly for the rather scarce on-chip memories used in matching tables. We propose and examine a unique parallel hardware architecture for hash-based exact match classification of multiple packets in each clock cycle that reduces memory replication requirements. The core idea of the proposed architecture is to exploit the basic memory organization present in all modern FPGAs, where hundreds of individual block or distributed memory tiles are available and can be accessed (addressed) independently. This way, we are able to maintain a rather high throughput of matching multiple packets per clock cycle even without fully replicated memory resources in matching tables. Our results show that the designed approach can use on-chip memory resources very efficiently and even scales exceptionally well with increased capacities of match tables.
For example, the proposed architecture is able to achieve a throughput of more than 2 Tbps (over 3 000 Mpps) with an effective capacity of more than 40 000 IPv4 flow records at the cost of only a few hundred block memory tiles (366 BlockRAM for Xilinx or 672 M20K for Intel FPGAs), utilizing only a small fraction of available logic resources (around 68 000 LUTs for Xilinx or 95 000 ALMs for Intel).

9.

High reliable and efficient task allocation in networked multi-agent systems

Faezeh Rahimzadeh Leyli Mohammad Khanli Farnaz Mahan 《Autonomous Agents and Multi-Agent Systems》2015,29(6):1023-1040


10.

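A parallel task scheduling method with low energy consumption and low response time (cited by: 1; self-citations: 0; citations by others: 1)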

Song Man, Li Chunlin 《计算机应用研究》 (Application Research of Computers) 2009,26(6):2251-2253

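Networked embedded systems such as digital signal processing, vehicle tracking, and infrastructure monitoring generally require both low energy consumption and low response time. Traditional methods usually focus only on saving energy without considering the optimal makespan, so the schedule length can sometimes become intolerably long. This paper minimizes energy consumption under a constraint on schedule length and proposes EATA, an algorithm that balances energy consumption and response time. Experimental results confirm the effectiveness of the algorithm.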

11.

Register allocation for write activity minimization on non-volatile main memory for embedded systems

Yazhi Huang, Tiantian Liu, Chun Jason Xue 《Journal of Systems Architecture》2012,58(1):13-23

Non-volatile memories are good candidates for DRAM replacement as main memory in embedded systems and they have many desirable characteristics. Nevertheless, the disadvantages of non-volatile memory co-exist with its advantages. First, the lifetime of some non-volatile memories is limited by the number of erase operations. Second, read and write operations have asymmetric speed or power consumption in non-volatile memory. This paper focuses on embedded systems using non-volatile memory as main memory. We propose a register allocation technique with re-computation to reduce the number of store instructions. When non-volatile memory is used as the main memory, reducing store instructions reduces write activity on non-volatile memory. To re-compute the spills effectively during register allocation, a novel potential spill selection strategy is proposed. During this process, live range splitting is used to split certain long live ranges so that they are more likely to be assigned to registers. In addition, techniques for reducing re-computation overhead are proposed for systems with multiple functional units. With the proposed approach, the lifetime of non-volatile memory is extended accordingly. The experimental results demonstrate that the proposed technique reduces the number of store instructions on systems with non-volatile memory by 33% on average.

12.

Design of a service oriented architecture for efficient resource allocation in media environments

Stein Desmet, Bruno Volckaert, Filip De Turck 《Future Generation Computer Systems》2012,28(3):527-532

In many business domains, Grids and Service Oriented Architectures are considered to improve application design, integration and execution. In the audiovisual industry, applications are very data-intensive, time-constrained and computationally demanding, and the design of a Service Oriented Architecture in this domain is no straightforward task. Efficient resource allocation, especially in terms of network usage, is paramount to meet users’ requirements in terms of deadlines and responsiveness, and to offer high scalability at the same time. We present a resource- and network-aware management architecture addressing the issues in media environments, incorporating a number of scheduling algorithms and advance reservation systems to ensure efficient resource usage.

13.

Optimized task allocation on private cloud for hybrid simulation of large-scale critical systems

《Future Generation Computer Systems》2017

Simulation represents a powerful technique for the analysis of dependability and performance aspects of distributed systems. For large-scale critical systems, simulation demands complex experimentation environments and the integration of different tools, in turn requiring sophisticated modeling skills. Moreover, the criticality of the involved systems implies the set-up of expensive testbeds on private infrastructures. This paper presents a middleware for performing hybrid simulation of large-scale critical systems. The services offered by the middleware allow the integration and interoperability of simulated and emulated subsystems, compliant with the reference interoperability standards, which can provide greater realism of the scenario under test. The hybrid simulation of complex critical systems is a research challenge due to the interoperability issues of emulated and simulated subsystems and to the cost associated with the scenarios to set up, which involve a large number of entities and expensive long running simulations. Therefore, a multi-objective optimization approach is proposed to optimize the simulation task allocation on a private cloud.

14.

Energy-aware assignment and scheduling for hybrid main memory in embedded systems

Guohui Wang Yong Guan Yi Wang Zili Shao 《Computing》2016,98(3):279-301


15.

Design of a hybrid main memory based on SRAM and PRAM

Yao Yingbiao, Chen Yuejia 《计算机工程与应用》 (Computer Engineering and Applications) 2016,52(13):69-75

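Because of the very high static power consumption of DRAM chips, building the large-capacity main memory of high-performance computer systems from DRAM leads to excessive energy consumption, which has motivated research into new large-capacity main memory structures. To address this problem, this paper designs a hybrid main memory system based on SRAM and PRAM. The system uses SRAM as a dedicated write cache for PRAM and applies an improved LRFU algorithm to the SRAM write cache, thereby effectively reducing the energy consumption of the main memory system and extending the usable lifetime of PRAM with little impact on main memory performance. Simulation results show that the energy-delay product (EDP) of the proposed hybrid memory structure is 40% of that of a pure DRAM memory structure. In addition, it reduces the number of PRAM writes by 28.5% compared with a pure PRAM memory structure, and by 13% compared with using SRAM as a cache.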

16.

Axiomatics for memory allocation

Prof. A. P. Ershov 《Acta Informatica》1976,6(1):61-75

Summary. In this paper, a method of labelling is applied to construct a correct and complete transformation system which allows one, for any program scheme, to construct systematically any permissible memory allocation for the variables of the scheme. Permissible memory allocations are those that preserve all information flow connections of the initial scheme.
Notations: B: set of internal statements (2.3); D: information flow graph (2.2); D_j: component of an information flow graph (2.2); E: set of statements reachable downwards arcs (2.3); G: skeleton (2.1); i: input; L: set of statements reachable upwards arcs (2.3); N: information carrier (2.1); o: output (2.1); (o, i): information pair (2.1); p: pole (2.1); R: memory allocation (2.1); L: Lavrov scheme (2.1); T, U: statements (2.1); V: poles allocation (2.1); W: inconsistency graph (2.3); X: memory (2.1); x, y: variables (2.1); empty set; calculability relation (2.1); equivalence relation (3.1); transformability relation (3.1); end of a lemma proof; end of a theorem proof; empty word.
The author is grateful to Miss E. L. Gorel, whose research on the axiomatics of generalized Yanov schemata stimulated this writing.

17.

Utility-function-based task allocation for multi-robot systems

Li Ping, Yang Yimin, Lian Jiale 《计算机应用研究》 (Application Research of Computers) 2009,26(2):537-539

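Most task allocation methods currently used for multi-robot systems ignore the solution quality of the allocation. Taking a quantitative approach, this paper proposes a utility-function-based task allocation strategy for multi-robot systems. A mathematical model of the utility function is built on the robots' capability vectors and the capability vectors required by the subtasks, and tasks are allocated according to the value of the utility function. Simulation experiments on a soccer robot simulation competition platform show that the algorithm generalizes well to cooperation in heterogeneous multi-robot systems, is fast and simple, and achieves an optimal mapping of tasks to robots.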
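The abstract above does not state the utility function itself, so the following is only a sketch of the general approach under stated assumptions: the `utility` formula (zero if a required capability is missing, otherwise the capability vector weighted by the requirement vector) and the exhaustive search in `allocate` are both hypothetical stand-ins, not the paper's model. Exhaustive search is used because it yields the best one-to-one mapping for small teams; it is not practical for large ones.

```python
from itertools import permutations


def utility(capability, requirement):
    """Hypothetical utility of one robot for one subtask.

    A robot that lacks any required capability scores 0; otherwise the
    score is the capability vector weighted by the requirement vector.
    """
    if any(r > 0 and c <= 0 for c, r in zip(capability, requirement)):
        return 0.0
    return sum(c * r for c, r in zip(capability, requirement))


def allocate(robots, tasks):
    """Return (mapping, total utility) maximizing the summed utility.

    robots: dict of robot name -> capability vector
    tasks:  dict of task name  -> required capability vector
    Each task gets a distinct robot; if there are fewer robots than
    tasks, no assignment exists and ({}, -inf) is returned.
    """
    task_names = list(tasks)
    best_score, best_map = float("-inf"), {}
    for chosen in permutations(robots, len(task_names)):
        score = sum(utility(robots[r], tasks[t])
                    for r, t in zip(chosen, task_names))
        if score > best_score:
            best_score, best_map = score, dict(zip(task_names, chosen))
    return best_map, best_score
```

With `robots = {"r1": [1.0, 0.0], "r2": [0.5, 1.0]}` and `tasks = {"shoot": [1.0, 0.0], "defend": [0.0, 1.0]}`, the search assigns "shoot" to r1 and "defend" to r2, since r1 has no capability in the second dimension.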

18.

Towards efficient multiagent task allocation in the RoboCup Rescue: a biologically-inspired approach

Fernando dos Santos Ana L. C. Bazzan 《Autonomous Agents and Multi-Agent Systems》2011,22(3):465-486

This paper addresses team formation in the RoboCup Rescue centered on task allocation. We follow a previous approach that is based on so-called extreme teams, which have four key characteristics: agents act in domains that are dynamic; agents may perform multiple tasks; agents have overlapping functionality regarding the execution of each task but differing levels of capability; and some tasks may depict constraints such as simultaneous execution. So far these four characteristics have not been fully tested in domains such as the RoboCup Rescue. We use a swarm intelligence based approach, address all characteristics, and compare it to two other GAP-based algorithms. Experiments in which computational effort, communication load, and the score obtained in the RoboCup Rescue are measured show that our approach outperforms the others.

19.

Algorithms for parallel memory allocation

Ellis Carla Schlatter Olson Thomas J. 《International journal of parallel programming》1988,17(4):303-345

Dynamic storage allocation is a vital component of programming systems intended for multiprocessor architectures that support globally shared memory. Highly parallel algorithms for access to system data structures lie at the core of effective memory allocation strategies as well as solutions to other parallel systems problems. In this paper, we investigate four algorithms, all based on the first fit approach, that provide different granularities of parallel access to the allocator's data structures. These solutions employ a variety of design techniques including specialized locking protocols, the use of atomic fetch-and-Φ operations, and structural modifications. We describe experiments designed to compare the performance of these schemes. The results show that simple algorithms are appropriate when the expected number of concurrent requests per memory is low and the request pattern is not bursty. Algorithms that support finer granularity access while avoiding locking protocols are successful in a range of larger processor/memory ratios. This research was supported in part by the National Science Foundation under Grant Number DCR 8320136, DARPA/U.S. Army Engineer Topographic Laboratories under contract number DACA76-85-C-0001, and Unisys Corporation. A preliminary version appeared in International Conference on Parallel Processing, August 1987.
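To make the first fit approach named in the abstract above concrete, here is a minimal sketch of a first-fit allocator over a free list guarded by one global lock. This corresponds only to the coarsest granularity of parallel access; it does not reproduce any of the paper's four algorithms, and the class name, the `(start, length)` free-list representation, and the coalescing strategy are assumptions chosen for brevity.

```python
import threading


class FirstFitAllocator:
    """First-fit allocation over a sorted free list with a single lock."""

    def __init__(self, size: int):
        self._free = [(0, size)]       # sorted (start, length) free blocks
        self._lock = threading.Lock()  # one lock serializes all requests

    def alloc(self, n: int):
        """Return the start of the first free block of at least n units,
        or None if no block is large enough."""
        if n <= 0:
            raise ValueError("allocation size must be positive")
        with self._lock:
            for i, (start, length) in enumerate(self._free):
                if length >= n:
                    if length == n:
                        del self._free[i]            # exact fit: drop block
                    else:
                        self._free[i] = (start + n, length - n)  # split
                    return start
        return None

    def free(self, start: int, n: int) -> None:
        """Return a block to the free list and merge adjacent blocks."""
        with self._lock:
            self._free.append((start, n))
            self._free.sort()
            merged = []
            for s, length in self._free:
                if merged and merged[-1][0] + merged[-1][1] == s:
                    merged[-1] = (merged[-1][0], merged[-1][1] + length)
                else:
                    merged.append((s, length))
            self._free = merged
```

Because every request takes the same lock, this version behaves well only when concurrent requests are rare, which matches the abstract's observation about simple algorithms; finer-grained schemes trade this simplicity for more parallelism.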

20.

Application of a hybrid distributed task allocation mechanism in soccer robot systems

Ji Xiucai, Cui Lianhu, Zheng Zhiqiang 《计算机应用》 (Journal of Computer Applications) 2008,28(3):706-709

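Based on an analysis of the strengths and weaknesses of various multi-agent task allocation mechanisms, this paper combines market-based and rule-based task allocation mechanisms and proposes a hybrid distributed multi-robot task allocation mechanism for role assignment in soccer robot systems. The role assignment algorithm assigns roles dynamically while effectively avoiding undesired role oscillation. Both simulations and real matches verify the effectiveness of the algorithm.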