Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Peer-to-Peer Networking and Applications - With the exponential increase in the number of IoT devices and the amount of emitted data from these devices, it is expensive and inefficient to offload...

2.
Graph convolutional networks (GCNs) have been applied successfully in social networks and recommendation systems to analyze graph data. Unlike conventional neura...

3.
The problem of maximizing the lifetime of PCM has been well studied. Non-volatile memory devices have arrived as a replacement for traditional DRAM, yet they still have significant limitations in endurance and in the high power cost of write operations. A number of earlier designs try to maximize PCM lifetime by caching main memory in the available DRAM, but they fall short in reducing power consumption and increasing memory utilization. To reduce power consumption and maximize lifetime, this paper presents a categorical model. The proposed method categorizes processes according to their memory access activity and allocates each categorized process to the corresponding part of the hybrid memory, which favors maximum reads and minimum writes in PCM. The proposed method extends the lifetime of PCM more than other methods.

4.
An efficient task allocation scheme for 2D mesh architectures   Cited by: 1 (self-citations: 0, citations by others: 1)
Efficient allocation of processors to incoming tasks in parallel computer systems is very important for achieving the desired high performance. It requires recognizing the free processors with minimum overhead. In this paper, we present an efficient task allocation scheme for 2D mesh architectures. By employing a new approach for searching the mesh, our scheme can find an available submesh without scanning the entire mesh, unlike earlier designs. Comprehensive computer simulation reveals that the average allocation time and waiting delay are much smaller than those of earlier schemes of comparable performance, irrespective of the size of the meshes and the distribution of the shapes of the incoming tasks.
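
For contrast with the scheme described above, here is a minimal sketch of the naive baseline it improves on: a first-fit search that scans the whole mesh for a free w-by-h submesh. This is not the paper's algorithm, which the abstract does not detail; the function names and the boolean busy-matrix representation are assumptions made for illustration.

```python
def find_free_submesh(busy, w, h):
    """Return the (x, y) origin of the first free w-by-h submesh, or None.

    busy is a list of rows; busy[y][x] is True if processor (x, y) is allocated.
    This is an exhaustive row-major scan (hypothetical baseline).
    """
    rows = len(busy)
    cols = len(busy[0]) if rows else 0
    for y in range(rows - h + 1):
        for x in range(cols - w + 1):
            # Accept the candidate origin only if every processor in it is free.
            if all(not busy[y + dy][x + dx]
                   for dy in range(h) for dx in range(w)):
                return (x, y)
    return None


def allocate(busy, w, h):
    """Find a free submesh, mark it busy, and return its origin (or None)."""
    origin = find_free_submesh(busy, w, h)
    if origin is not None:
        x, y = origin
        for dy in range(h):
            for dx in range(w):
                busy[y + dy][x + dx] = True
    return origin
```

Each call costs time proportional to the mesh area times the task area in the worst case, which is the overhead a smarter search tries to avoid.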

5.
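Zhang Zhen (张震), Fu Yinjin (付印金), Hu Guyu (胡谷雨). 《计算机应用》 (Journal of Computer Applications), 2018, 38(8): 2230-2235
Phase change memory (PCM) is expected to become the next generation of main memory thanks to its low power consumption, but its limited endurance is a major obstacle to wide adoption. Existing dynamic random access memory (DRAM) caching techniques and wear leveling extend PCM lifetime from two angles: reducing the number of PCM writes and evening out the distribution of write operations. However, the former does not consider the read/write tendency of data when writing it back, and the latter suffers from problems of data-swap granularity, space overhead, and randomness in scenarios with strong spatial locality. A new hybrid memory architecture is therefore designed. A caching policy that distinguishes the read/write tendency of data is proposed by combining the Least Recently Used (LRU) algorithm with the Least Frequently Used with Aging (LFU-Aging) algorithm, and a dynamic wear-leveling algorithm for working sets with strong spatial locality is designed based on Bloom filters (BF). Together they reduce redundant writes effectively while achieving inter-group wear leveling with low space overhead. Experimental results show that the strategy reduces write operations on PCM by 13.4% to 38.6% and effectively evens out the write distribution across more than 90% of the groups.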

6.
The Journal of Supercomputing - This research is to design an effective prefetching method required for hybrid main memory systems consisting of dynamic random-access memory (DRAM) and phase-change...

7.
Hybrid memory systems composed of dynamic random access memory (DRAM) and non-volatile memory (NVM) often exploit page migration to take full advantage of the different memory media. Most previous proposals migrate data at a granularity of 4 KB pages, and thus waste memory bandwidth and DRAM resources. In this paper, we propose Mocha, a non-hierarchical architecture that physically organizes DRAM and NVM in a flat address space but manages them as a cache/memory hierarchy. Since a commercial NVM device, Intel Optane DC Persistent Memory Modules (DCPMM), actually accesses the physical media at a granularity of 256 bytes (an Optane block), we manage the DRAM cache at the 256-byte granularity to match this feature of Optane. This design not only enables fine-grained data migration and management for the DRAM cache, but also avoids write amplification for Intel Optane DCPMM. We also create an Indirect Address Cache (IAC) in the Hybrid Memory Controller (HMC) and propose a reverse address mapping table in the DRAM to speed up address translation and cache replacement. Moreover, we exploit a utility-based caching mechanism to filter cold blocks in the NVM and further improve the efficiency of the DRAM cache. We implement Mocha in an architectural simulator. Experimental results show that, compared with a typical hybrid memory architecture, HSCC, Mocha improves application performance by 8.2% on average (up to 24.6%) and reduces energy consumption by 6.9% and data migration traffic by 25.9% on average.

8.
Packet classification (matching) is one of the critical operations in networking, widely used in many different devices and tasks ranging from switching or routing to a variety of monitoring and security applications such as firewalls or IDS. To satisfy the ever-growing performance demands of current and future high-speed networks, specially designed hardware-accelerated architectures implementing packet classification are necessary. These demands are now growing to such an extent that, in order to keep up with the rising throughputs of network links, FPGA-accelerated architectures are required to match multiple packets in every single clock cycle. To meet this requirement, a simple replication approach can be used: instantiate multiple copies of a processing pipeline that match incoming packets in parallel. However, simple replication of pipelines inevitably brings a significant increase in the utilization of FPGA resources of all types, which is especially costly for the rather scarce on-chip memories used in matching tables. We propose and examine a unique parallel hardware architecture for hash-based exact-match classification of multiple packets in each clock cycle that reduces memory replication requirements. The core idea of the proposed architecture is to exploit the basic memory organization present in all modern FPGAs, where hundreds of individual block or distributed memory tiles are available and can be accessed (addressed) independently. This way, we are able to maintain a rather high throughput of matching multiple packets per clock cycle even without fully replicated memory resources in matching tables. Our results show that the designed approach can use on-chip memory resources very efficiently and even scales exceptionally well with increased capacities of match tables. For example, the proposed architecture is able to achieve a throughput of more than 2 Tbps (over 3 000 Mpps) with an effective capacity of more than 40 000 IPv4 flow records at the cost of only a few hundred block memory tiles (366 BlockRAM for Xilinx or 672 M20K for Intel FPGAs), utilizing only a small fraction of the available logic resources (around 68 000 LUTs for Xilinx or 95 000 ALMs for Intel).

9.
10.
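A parallel task scheduling method with low energy consumption and low response time*   Cited by: 1 (self-citations: 0, citations by others: 1)
Networked embedded systems such as digital signal processing, vehicle tracking, and infrastructure monitoring generally require both low energy consumption and low response time. Traditional methods usually concentrate only on saving energy and do not consider the optimal makespan, so the schedule length can sometimes become intolerably long. This work minimizes energy consumption under a constraint on the schedule length and proposes EATA, an algorithm that balances energy consumption and response time. Experimental results confirm the effectiveness of the algorithm.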

11.
Non-volatile memories are good candidates to replace DRAM as main memory in embedded systems, and they have many desirable characteristics. Nevertheless, the disadvantages of non-volatile memory co-exist with its advantages. First, the lifetime of some non-volatile memories is limited by the number of erase operations. Second, read and write operations have asymmetric speed or power consumption in non-volatile memory. This paper focuses on embedded systems that use non-volatile memory as main memory. We propose a register allocation technique with re-computation to reduce the number of store instructions. When non-volatile memory is used as the main memory, reducing store instructions reduces write activity on the non-volatile memory. To re-compute the spills effectively during register allocation, a novel potential spill selection strategy is proposed. During this process, live range splitting is used to split certain long live ranges so that they are more likely to be assigned to registers. In addition, techniques for reducing re-computation overhead are proposed for systems with multiple functional units. With the proposed approach, the lifetime of non-volatile memory is extended accordingly. The experimental results demonstrate that the proposed technique can reduce the number of store instructions on systems with non-volatile memory by 33% on average.

12.
In many business domains, Grids and Service Oriented Architectures are considered as ways to improve application design, integration and execution. In the audiovisual industry, applications are very data-intensive, time-constrained and computationally demanding, and designing a Service Oriented Architecture in this domain is no straightforward task. Efficient resource allocation, especially in terms of network usage, is paramount to meet users' requirements in terms of deadlines and responsiveness while offering high scalability. We present a resource- and network-aware management architecture that addresses these issues in media environments, incorporating a number of scheduling algorithms and advance reservation systems to ensure efficient resource usage.

13.
Simulation represents a powerful technique for the analysis of dependability and performance aspects of distributed systems. For large-scale critical systems, simulation demands complex experimentation environments and the integration of different tools, in turn requiring sophisticated modeling skills. Moreover, the criticality of the involved systems implies the set-up of expensive testbeds on private infrastructures. This paper presents a middleware for performing hybrid simulation of large-scale critical systems. The services offered by the middleware allow the integration and interoperability of simulated and emulated subsystems, compliant with the reference interoperability standards, which can provide greater realism of the scenario under test. The hybrid simulation of complex critical systems is a research challenge because of the interoperability issues between emulated and simulated subsystems and the cost of setting up the scenarios, which involve a large number of entities and expensive long-running simulations. Therefore, a multi-objective optimization approach is proposed to optimize the allocation of simulation tasks on a private cloud.

14.
15.
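Because of the very high static power of DRAM chips, building large-capacity main memory for high-performance computer systems out of DRAM runs into excessive energy consumption, which has motivated research on new large-capacity main memory structures. To address this problem, a hybrid main memory system based on SRAM and PRAM is designed. The system uses SRAM as a dedicated write buffer for PRAM and applies an improved LRFU algorithm to the SRAM write buffer, effectively reducing the energy consumption of the main memory system and extending the usable lifetime of PRAM without much impact on main memory performance. Simulation results show that the energy-delay product (EDP) of the designed hybrid memory structure is 40% of that of a pure DRAM structure. In addition, the number of PRAM write operations falls by 28.5% compared with a pure PRAM structure, and by 13% compared with using SRAM as a cache.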
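
The write-buffer idea above can be sketched as follows. This is plain LRFU, not the paper's improved variant, which the abstract does not describe. The class name, the decay parameter lam, and the weighting function F(x) = (1/2)^(lam * x) are assumptions chosen for illustration: each buffered block keeps a combined recency and frequency (CRF) score, and the block with the lowest current score is evicted, which is when it would be written to PRAM.

```python
class LRFUWriteBuffer:
    """Hypothetical SRAM write buffer with LRFU replacement (illustrative only)."""

    def __init__(self, capacity, lam=0.5):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.lam = lam          # lam -> 0 behaves like LFU, larger lam like LRU
        self.clock = 0          # logical time, advanced once per write
        self.blocks = {}        # addr -> (crf_at_last_access, last_access_time)

    def _weight(self, age):
        # Assumed weighting function F(age) = (1/2) ** (lam * age).
        return 0.5 ** (self.lam * age)

    def _crf_now(self, addr):
        # Decay the stored CRF score to the current logical time.
        crf, last = self.blocks[addr]
        return crf * self._weight(self.clock - last)

    def write(self, addr):
        """Buffer a write; return the evicted address to flush, or None."""
        self.clock += 1
        evicted = None
        if addr in self.blocks:
            # Hit: new CRF = F(0) + decayed old CRF, with F(0) = 1.
            self.blocks[addr] = (1.0 + self._crf_now(addr), self.clock)
        else:
            if len(self.blocks) >= self.capacity:
                evicted = min(self.blocks, key=self._crf_now)
                del self.blocks[evicted]
            self.blocks[addr] = (1.0, self.clock)
        return evicted
```

With a small lam, a block written several times survives a block written once more recently, so repeated writes to hot blocks are absorbed in the buffer instead of reaching PRAM.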

16.
Summary: In this paper, a labelling method is applied to construct a correct and complete transformation system that allows one, for any program scheme, to systematically construct any permissible memory allocation for the variables of the scheme. Permissible memory allocations are those that preserve all information flow connections of the initial scheme.
Notations: B, set of internal statements (2.3); D, information flow graph (2.2); D_j, component of an information flow graph (2.2); E, set of statements reachable downwards arcs (2.3); G, skeleton (2.1); i, input; L, set of statements reachable upwards arcs (2.3); N, information carrier (2.1); o, output (2.1); (o, i), information pair (2.1); p, pole (2.1); R, memory allocation (2.1); L, Lavrov scheme (2.1); T, U, statements (2.1); V, poles allocation (2.1); W, inconsistency graph (2.3); X, memory (2.1); x, y, variables (2.1). Further symbols denote the empty set, the calculability relation (2.1), the equivalence relation (3.1), the transformability relation (3.1), the end of a lemma proof, the end of a theorem proof, and the empty word.
The author is grateful to Miss E. L. Gorel, whose research on the axiomatics of generalized Yanov schemata stimulated this writing.

17.
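Most task allocation methods currently used for multi-robot systems ignore the solution quality of the allocation. Taking a quantitative approach, this work proposes a utility-function-based task allocation strategy for multi-robot systems. A mathematical model of the utility function is built from the robots' capability vectors and the capability vectors required by the subtasks, and tasks are allocated according to the utility values. Simulation experiments on a soccer-robot simulation competition platform show that the task allocation algorithm generalizes well to cooperation in heterogeneous multi-robot systems, is fast and simple, and can achieve an optimal mapping from tasks to robots.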
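
A utility-driven allocation of this kind can be sketched as follows. The abstract does not give the utility model, so the one used here is an assumption: a robot that falls short of any required capability scores 0 for that task, and otherwise scores the dot product of its capability vector with the task's requirement vector. The function names are hypothetical, and the exhaustive search over assignments is only practical for small teams.

```python
from itertools import permutations


def utility(capability, requirement):
    """Assumed utility: 0 if any requirement is unmet, else the dot product."""
    if any(c < r for c, r in zip(capability, requirement)):
        return 0.0
    return float(sum(c * r for c, r in zip(capability, requirement)))


def allocate(robots, tasks):
    """Return (total_utility, mapping), where mapping[t] is the robot index for task t.

    Tries every one-to-one assignment of robots to tasks and keeps the best,
    so the result is optimal under the assumed utility model.
    """
    if len(robots) < len(tasks):
        raise ValueError("need at least as many robots as tasks")
    best_total, best = -1.0, ()
    for perm in permutations(range(len(robots)), len(tasks)):
        total = sum(utility(robots[r], tasks[t]) for t, r in enumerate(perm))
        if total > best_total:
            best_total, best = total, perm
    return best_total, list(best)
```

For example, with robots [(1.0, 0.2), (0.3, 0.9)] and tasks [(0.2, 0.8), (0.9, 0.1)], the first task goes to the second robot and the second task to the first robot, because the reverse assignment leaves both requirements unmet.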

18.
This paper addresses team formation in the RoboCup Rescue centered on task allocation. We follow a previous approach that is based on so-called extreme teams, which have four key characteristics: agents act in domains that are dynamic; agents may perform multiple tasks; agents have overlapping functionality regarding the execution of each task but differing levels of capability; and some tasks may impose constraints such as simultaneous execution. So far these four characteristics have not been fully tested in domains such as the RoboCup Rescue. We use a swarm-intelligence-based approach, address all four characteristics, and compare it to two other GAP-based algorithms. Experiments that measure computational effort, communication load, and the score obtained in the RoboCup Rescue show that our approach outperforms the others.

19.
Dynamic storage allocation is a vital component of programming systems intended for multiprocessor architectures that support globally shared memory. Highly parallel algorithms for access to system data structures lie at the core of effective memory allocation strategies as well as solutions to other parallel systems problems. In this paper, we investigate four algorithms, all based on the first fit approach, that provide different granularities of parallel access to the allocator's data structures. These solutions employ a variety of design techniques including specialized locking protocols, the use of atomic fetch-and-Φ operations, and structural modifications. We describe experiments designed to compare the performance of these schemes. The results show that simple algorithms are appropriate when the expected number of concurrent requests per memory is low and the request pattern is not bursty. Algorithms that support finer granularity access while avoiding locking protocols are successful in a range of larger processor/memory ratios. This research was supported in part by the National Science Foundation under Grant Number DCR 8320136, DARPA/U.S. Army Engineer Topographic Laboratories under contract number DACA76-85-C-0001, and Unisys Corporation. A preliminary version appeared in International Conference on Parallel Processing, August 1987.
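
The coarsest end of the design space described above can be sketched as a first-fit allocator whose free list is protected by a single lock. This is a minimal illustration under stated assumptions, not one of the paper's four algorithms: the class name, the (start, length) free-list representation, and the one-lock policy are all hypothetical.

```python
import threading


class FirstFitAllocator:
    """First-fit allocator over [0, size) guarded by one global lock."""

    def __init__(self, size):
        self._free = [(0, size)]        # address-ordered list of (start, length)
        self._lock = threading.Lock()   # coarse-grained: serializes all requests

    def alloc(self, n):
        """Return the start of the first free block of at least n units, or None."""
        with self._lock:
            for i, (start, length) in enumerate(self._free):
                if length >= n:
                    if length == n:
                        del self._free[i]                        # exact fit
                    else:
                        self._free[i] = (start + n, length - n)  # split the block
                    return start
        return None

    def free(self, start, n):
        """Return a block to the free list and coalesce adjacent free blocks."""
        with self._lock:
            self._free.append((start, n))
            self._free.sort()
            merged = []
            for s, length in self._free:
                if merged and merged[-1][0] + merged[-1][1] == s:
                    merged[-1] = (merged[-1][0], merged[-1][1] + length)
                else:
                    merged.append((s, length))
            self._free = merged
```

A single lock is simple and adequate when concurrent requests are rare, which matches the abstract's observation about low request rates; finer-grained schemes trade this simplicity for more parallel access to the free list.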

20.
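Based on an analysis of the strengths and weaknesses of various multi-agent task allocation mechanisms, a hybrid distributed multi-robot task allocation mechanism is proposed for role assignment in soccer robot systems, combining market-based and rule-based task allocation mechanisms. The role assignment algorithm assigns roles dynamically while effectively avoiding undesired role oscillation. Both simulation and real matches verify the effectiveness of the algorithm.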
