1.

NoC synthesis flow for customized domain specific multiprocessor systems-on-chip Total citations: 4, self-citations: 0, citations by others: 4

Bertozzi D. Jalabert A. Srinivasan Murali Tamhankar R. Stergiou S. Benini L. De Micheli G. 《Parallel and Distributed Systems, IEEE Transactions on》2005,16(2):113-129

The growing complexity of customizable single-chip multiprocessors requires communication resources that can only be provided by a highly scalable communication infrastructure. This trend is exemplified by the growing number of network-on-chip (NoC) architectures that have been proposed recently for system-on-chip (SoC) integration. Developing NoC-based systems tailored to a particular application domain is crucial for achieving high-performance, energy-efficient customized solutions. The effectiveness of this approach largely depends on the availability of an ad hoc design methodology that, starting from a high-level application specification, derives an optimized NoC configuration with respect to different design objectives and instantiates the selected application-specific on-chip micronetwork. Automatic execution of these design steps is highly desirable to increase SoC design productivity. This work illustrates a complete synthesis flow, called Netchip, for customized NoC architectures, that partitions the development work into major steps (topology mapping, selection, and generation) and provides proper tools for their automatic execution (SUNMAP, xpipescompiler). The entire flow leverages the flexibility of a fully reusable and scalable network components library called xpipes, consisting of highly parameterizable network building blocks (network interface, switches, switch-to-switch links) that are design-time tunable and composable to achieve arbitrary topologies and customized domain-specific NoC architectures. Several experimental case studies are presented in the work, showing the powerful design space exploration capabilities of the proposed methodology and tools.

2.

A modeling tool for simulating and design of on-chip network systems

Gul N. Khan Victor Dumitriu 《Microprocessors and Microsystems》2010,34(2-4):84-95

The concept of Network-on-Chip (NoC) presents system designers with a new approach to the design of on-chip interconnection structures. However, such networks present designers with a large number of design parameters and decisions, many of which are critical to the efficient operation of the overall on-chip system. To aid the design process of complex systems-on-chip, this paper presents a NoC simulation environment that has been developed and implemented using the SystemC transaction-level modeling language. The simulation environment consists of on-chip components as well as traffic generators, which can generate various types of traffic patterns. The simulation environment has also been integrated with the NoC topology generation tool being developed in our group. A set of simulation results demonstrates the types of parameters that can affect the performance of on-chip systems, including topology variations, network latency and achievable throughput. These results also verify the modeling capabilities of the proposed simulation environment.

3.

A fault tolerant NoC architecture using quad-spare mesh topology and dynamic reconfiguration

《Journal of Systems Architecture》2013,59(7):482-491

Network-on-Chip (NoC) is widely used as a communication scheme in modern many-core systems. To guarantee the reliability of communication, effective fault-tolerant techniques are critical for an NoC. In this paper, a novel fault-tolerant architecture employing redundant routers is proposed to maintain the functionality of a network in the presence of failures. This architecture consists of a mesh of 2 × 2 router blocks with a spare router placed in the center of each block. This spare router provides a viable alternative when a router fails in a block. The proposed fault-tolerant architecture is therefore referred to as a quad-spare mesh. The quad-spare mesh can be dynamically reconfigured by changing control signals without altering the underlying topology. This dynamic reconfiguration and its corresponding routing algorithm are demonstrated in detail. Since the topology after reconfiguration is consistent with the original error-free 2D mesh, the proposed design is transparent to operating systems and application software. Experimental results show that the proposed design achieves significant improvements in reliability compared with those reported in the literature. Comparing the error-free system with a single router failure case, the throughput only decreases by 5.19% and latency increases by 2.40%, with about 45.9% hardware redundancy.
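
The block structure described in this abstract can be illustrated with a small sketch. It assumes even mesh dimensions, integer (x, y) router coordinates, and that each spare can stand in for at most one failed router in its own 2 × 2 block. The class and method names are hypothetical, and the paper's control signals and routing algorithm are not modeled.

```python
from typing import Dict, Set, Tuple

Coord = Tuple[int, int]


class QuadSpareMesh:
    """Toy model of a mesh tiled into 2x2 router blocks, one spare per block.

    Assumption (not from the abstract): a spare covers at most one failed
    router of its own block; a second failure in that block is unrecoverable.
    """

    def __init__(self, width: int, height: int) -> None:
        if width % 2 or height % 2:
            raise ValueError("width and height must be even to tile 2x2 blocks")
        self.width, self.height = width, height
        self.failed: Set[Coord] = set()
        # Maps a block to the logical router its spare currently replaces.
        self.spare_used_by: Dict[Coord, Coord] = {}

    @staticmethod
    def block_of(router: Coord) -> Coord:
        """Return the (column, row) index of the 2x2 block holding router."""
        return (router[0] // 2, router[1] // 2)

    def fail(self, router: Coord) -> bool:
        """Mark router as failed; return True if its block's spare covers it."""
        block = self.block_of(router)
        if router in self.failed:
            return self.spare_used_by.get(block) == router
        self.failed.add(router)
        if block in self.spare_used_by:
            return False  # the spare is already replacing another router
        self.spare_used_by[block] = router
        return True

    def physical(self, router: Coord) -> str:
        """Name the physical unit that serves a logical mesh position."""
        if router not in self.failed:
            return f"router{router}"
        block = self.block_of(router)
        if self.spare_used_by.get(block) == router:
            return f"spare{block}"
        return "unavailable"
```

Because the logical coordinates never change, software addressing position (1, 1) is unaffected by which physical unit serves it, which is the transparency property the abstract highlights.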

4.

Evaluating the performance of network-on-chip architectures with NS2 Total citations: 2, self-citations: 0, citations by others: 2

杨智峰 田泽 《计算机工程与应用》2010,46(18):74-76

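As SoC complexity keeps growing, bus-based interconnects face increasingly severe challenges, and the NoC, characterized by network-style interconnection, has emerged in response. This paper analyzes several important metrics that affect NoC performance and uses the network simulator NS2 to evaluate several performance parameters of several commonly used topologies. It arrives at a guiding conclusion for NoC design: a suitable architecture should be selected for the specific design at hand by trading off performance parameters such as transmission latency, throughput, area, power consumption and reusability.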

5.

Modern architecture for photonic networks-on-chip

Sharma Kapil Sehgal Vivek Kumar 《The Journal of supercomputing》2020,76(12):9901-9921

Developments in photonic integrated circuits (PICs) provide a promising solution for on-chip optical computation and communication. PICs provide an alternative to traditional networks-on-chip (NoC) circuits, which face serious challenges such as bandwidth, latency and power consumption. Integrated optics have demonstrated the ability to accomplish low-power communication and low-power data processing at ultra-high speeds. In this work, we propose a new architecture for NoC, which might improve overall on-chip network performance by reducing its power consumption, providing large channel capacity for communication, decreasing latency among nodes and reducing hop count. A key feature of the proposed architecture is that it reduces the waveguide network needed for communication among nodes, and the architecture can be used as a brick to construct other architectures. The architecture uses micro-ring resonators (MRRs) to provide a high-bandwidth connection among nodes with fewer waveguide networks. Furthermore, results show that this architecture of PICs provides better performance in terms of low communication latency, low power consumption and high bandwidth. It also provides acceptable FSR value, FWHR value, finesse value and Q-factor of micro-ring resonators used for the design of MRR in this architecture.
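
For orientation, the ring-resonator figures of merit named at the end of this abstract are commonly related as follows. This is the standard textbook form, not taken from the paper, and it assumes that the abstract's "FWHR" denotes the resonance linewidth usually written as FWHM. The symbols $\lambda_0$ (resonance wavelength), $n_g$ (group index), $R$ (ring radius) and $\Delta\lambda_{\mathrm{FWHM}}$ are introduced here for illustration.

```latex
\begin{align}
  \mathrm{FSR} &\approx \frac{\lambda_0^{2}}{n_g \, L}, \qquad L = 2\pi R, \\
  Q &= \frac{\lambda_0}{\Delta\lambda_{\mathrm{FWHM}}}, \\
  \mathcal{F} &= \frac{\mathrm{FSR}}{\Delta\lambda_{\mathrm{FWHM}}}.
\end{align}
```

Under these definitions a narrower linewidth raises both the Q-factor and the finesse, while a smaller ring widens the free spectral range.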

6.

Distributed computing architecture for effective management of multimedia streams on DPE

Choong Seon Hong Shinkuro Honda Kiyoto Kawauchi Yutaka Matsushita 《Multimedia Tools and Applications》1996,2(3):233-252

We study a networking architecture model that is built on a distributed processing environment (DPE) for multimedia services suitable for high speed transport networks such as ATM networks. In this architecture, the applications are deployed as units of software building blocks. Each building block provides a layered view for the effective management and control of the multimedia network resources and services according to the concept of telecommunications management network (TMN) and telecommunications information networking architecture (TINA). For the purpose of flexible service provision to users and effective service introduction by service providers, this architecture proposes the adoption of ad hoc service building blocks such as a video on demand building block and a CSCW building block that have interactions with a general purpose building block. This paper also proposes a naming structure for the management of user profiles and session profiles using a directory service system, and an effective control model for multimedia logical device objects using a stream process approach. The proposed model is implemented on a DPE platform that provides various transparencies, ANSAware.

7.

An architectural co-synthesis algorithm for energy-aware Network-on-Chip design

Yi-Jung Chen Chia-Lin Yang Yen-Sheng Chang 《Journal of Systems Architecture》2009,55(5-6):299-309

Network-on-Chip (NoC) has been proposed to overcome the complex on-chip communication problem of System-on-Chip (SoC) design in deep sub-micron technology. A complete NoC design requires exploration of both hardware and software architectures. The hardware architecture includes the selection of Processing Elements (PEs) of multiple types and their topology. The software architecture covers the allocation of tasks to PEs and the scheduling of tasks and their communications. To find the best hardware design for the target tasks, both hardware and software architectures need to be considered simultaneously. Previous works on NoC design have concentrated on solving only one or two design parameters at a time. In this paper, we propose a hardware–software co-synthesis algorithm for a heterogeneous NoC architecture. The design goal is to minimize energy consumption while meeting the real-time requirements commonly seen in embedded applications. The proposed algorithm is based on Simulated Annealing (SA). To compare the solution quality and efficiency of the proposed algorithm, we also implement branch-and-bound and iterative algorithms to solve the hardware–software co-synthesis problem of a heterogeneous NoC. With the given synthetic task sets, the experimental results show that the proposed SA-based algorithm achieves a near-optimal solution in a reasonable time, while the branch-and-bound algorithm takes a very long time to find the optimal solution, and the iterative algorithm fails to achieve good solution quality. When applying the co-synthesis algorithms to a real-world application with a PE library that has little variation in PE performance and energy consumption, the iterative algorithm achieves solution quality comparable to that of the proposed SA-based algorithm.
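
To make the simulated-annealing idea in this abstract concrete, the following is a minimal sketch of SA applied to one sub-problem only: placing tasks on the tiles of a 2D mesh. It is not the paper's co-synthesis algorithm. PE selection, scheduling and real-time constraints are left out, and the cost function (traffic volume times Manhattan hop distance, used as a rough proxy for communication energy), the function names and all parameter values are assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Tuple

Traffic = Dict[Tuple[int, int], float]  # (task_a, task_b) -> traffic volume


def comm_cost(mapping: List[int], traffic: Traffic, width: int) -> float:
    """Sum of volume x Manhattan hop distance over communicating task pairs.

    mapping[i] is the tile index of task i; tiles are numbered row by row.
    This is an assumed proxy for communication energy, not the paper's model.
    """
    total = 0.0
    for (a, b), volume in traffic.items():
        ax, ay = mapping[a] % width, mapping[a] // width
        bx, by = mapping[b] % width, mapping[b] // width
        total += volume * (abs(ax - bx) + abs(ay - by))
    return total


def anneal(traffic: Traffic, n_tasks: int, width: int, height: int,
           steps: int = 5000, t0: float = 10.0, alpha: float = 0.999,
           seed: int = 0) -> Tuple[List[int], float]:
    """Search for a low-cost task-to-tile mapping with simulated annealing."""
    rng = random.Random(seed)
    n_tiles = width * height
    if not 2 <= n_tiles or n_tasks > n_tiles:
        raise ValueError("need at least two tiles and no more tasks than tiles")
    # current[i] is the tile of slot i; slots >= n_tasks stand for empty tiles.
    current = list(range(n_tiles))
    rng.shuffle(current)
    cur_cost = comm_cost(current, traffic, width)
    best, best_cost = current[:], cur_cost
    temp = t0
    for _ in range(steps):
        # Neighbor move: swap the tiles of two slots.
        i, j = rng.sample(range(n_tiles), 2)
        current[i], current[j] = current[j], current[i]
        new_cost = comm_cost(current, traffic, width)
        delta = new_cost - cur_cost
        # Always accept improvements; accept worse moves with Boltzmann odds.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
        temp *= alpha  # geometric cooling schedule
    return best[:n_tasks], best_cost
```

A call such as `anneal({(0, 1): 10.0, (1, 2): 5.0}, n_tasks=3, width=2, height=2)` returns the best mapping found and its cost. Accepting some uphill moves at high temperature is what lets the search escape local minima that would trap a purely greedy iterative scheme.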

8.

Floorplan-aware application-specific network-on-chip topology synthesis using genetic algorithm technique

G. Lai X. Lin 《The Journal of supercomputing》2012,61(3):418-437

Communication plays a critical role in the design and performance of multi-core systems-on-chip (SoCs). Networks-on-chip (NoCs) have been proposed as a promising solution to complex on-chip communication problems. Because regular NoC topologies cannot satisfy the performance demands of application-specific NoCs, customized topology synthesis is desirable. However, the NoC topology synthesis problem is NP-hard. In this paper, we propose a suboptimal genetic-algorithm-based technique to synthesize application-specific NoC topologies with system-level floorplan awareness. The method minimizes the power consumption and router resources while satisfying latency and bandwidth performance constraints. We have evaluated the proposed technique by running a number of representative benchmark applications, and the results indicate that our method generates approximately optimal topologies effectively and efficiently for all benchmarks under consideration.

9.

DimRouter: A Multi-Mode Router Architecture for Higher Energy-Proportionality of On-Chip Networks

Shi-Qi Lian Ying Wang Yin-He Han 《计算机科学技术学报》2018,33(5):984-997

In the dark silicon era, many independent components of many-core processors are becoming voluntarily inactive due to the constraint of power consumption on a chip. However, to keep network connectivity, the on-chip interconnection must still be kept activated and wastes considerable energy to avoid the isolation of these inactive components, harming the energy-proportionality of the whole processor chip. In this paper, we propose a novel design to provide more energy-proportional on-chip connection without damaging the network connectivity. To achieve this goal, we redesign the router architecture. The new architecture, DimRouter, supports three modes: normal, dark and dim. In the dim mode, only part of the router is active and provides flexible connection, while the dark mode puts all router elements in the sleep state. Moreover, to maximize the number of dark routers, we also propose a reconfiguration algorithm based on the degree-constrained Steiner tree. The evaluation result under synthetic traffic shows that the new design can reduce the energy consumption by up to 85% compared with the common design. For real application traffic, the new design can also save 46% of energy consumption on average with a 4% performance improvement.

10.

A survey of routing algorithm for mesh Network-on-Chip

Yue WU Chao LU Yunji CHEN 《Frontiers of Computer Science》2016,10(4):591-601

With the rapid development of the semiconductor industry, the number of cores integrated on a chip is increasing quickly, which brings tough challenges such as bandwidth, scalability and power to on-chip interconnection. Against this background, Network-on-Chip (NoC) has been proposed and is gradually replacing traditional on-chip interconnections such as the shared bus and crossbar. For the convenience of physical layout, mesh is the most used topology in NoC design. The routing algorithm, which decides the paths of packets, has a significant impact on the latency and throughput of the network. Thus the routing algorithm plays a vital role in a well-performing network. This study mainly focuses on the routing algorithms of mesh NoC. Depending on whether network information is taken into consideration in the routing decision, routing algorithms of NoC can be roughly classified into oblivious routing and adaptive routing. Oblivious routing costs less but lacks adaptiveness, while adaptive routing is the opposite. To combine the advantages of oblivious and adaptive routing algorithms, half-adaptive algorithms were proposed. In this paper, the concepts, taxonomy and features of routing algorithms of NoC are introduced. Then the importance of routing algorithms in mesh NoC is highlighted, and representative routing algorithms with their respective features are reviewed and summarized. Finally, we try to shed light upon the future work of NoC routing algorithms.
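
As a concrete instance of the oblivious class mentioned in this abstract, the sketch below implements dimension-order (XY) routing, a widely used deterministic scheme for 2D meshes. It is offered as a generic illustration rather than as an algorithm taken from the survey. The function name and the (x, y) coordinate convention are assumptions.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def xy_route(src: Coord, dst: Coord) -> List[Coord]:
    """Return the routers visited from src to dst under XY routing.

    The path depends only on source and destination, never on network
    load, which is what makes the scheme oblivious.
    """
    x, y = src
    path = [(x, y)]
    # Phase 1: move along the X dimension until the column matches.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Phase 2: move along the Y dimension until the row matches.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

Routing a packet from (0, 0) to (2, 1) visits (0, 0), (1, 0), (2, 0) and (2, 1). Since a packet never turns from the Y dimension back into the X dimension, the channel dependencies cannot form a cycle on a mesh, so the scheme is deadlock-free, at the cost of being unable to steer around congestion.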

11.

Thermal management in 3d networks-on-chip using dynamic link sharing

《Microprocessors and Microsystems》2017

3D integration is a practical solution for overcoming the problems of long and slow global wires in current and future generations of integrated circuits. This emerging technology stacks several die slices on top of each other in a single chip. It provides higher bandwidth and lower latency in the third dimension than a 2D design because the inter-layer distances are much shorter. However, thermal challenges are a key impediment to stacking logic dies on top of each other. In particular, routers in a 3D network-on-chip (NoC) are a main source of thermal hotspots, limiting the potential performance gains of 3D integration. In this paper, we take advantage of the low-latency 3D vertical links to design a temperature-aware router architecture for 3D NoCs. This architecture reduces the peak temperature of routers, particularly routers that are farther from the heat sink, by balancing the traffic across all layers in a temperature-aware distributed way. In this way, a router with high temperature can borrow the link and crossbar bandwidth of the routers in the layers closer to the heat sink to forward its packets, effectively offloading part of its traffic to them to reduce its temperature. Experimental results show that the proposed method can control the temperature of 3D NoCs and reduce the temperature gradient across the network with minimized negative impact on performance, compared to a state-of-the-art 3D NoC temperature management method.

12.

A generic systolic array building block for neural networks with on-chip learning

Lehmann C. Viredaz M. Blayo F. 《Neural Networks, IEEE Transactions on》1993,4(3):400-407

Neural networks require VLSI implementations for on-board systems. Size and real-time considerations show that on-chip learning is necessary for a large range of applications. A flexible digital design is preferred here to more compact analog or optical realizations. As opposed to many current implementations, the two-dimensional systolic array system presented is an attempt to define a novel computer architecture inspired by neurobiology. It is composed of generic building blocks for basic operations rather than predefined neural models. A full custom VLSI design of a first prototype has demonstrated the efficacy of this design. A complete board dedicated to Hopfield's model has been designed using these building blocks. Beyond the very specific application presented, the underlying principles can be used for designing efficient hardware for most neural network models.

13.

Buffer planning for application-specific networks-on-chip design

ShouYi Yin LeiBo Liu ShaoJun Wei 《中国科学F辑(英文版)》2009,52(4):547-558

Networks-on-chip (NoC) is a promising communication architecture for next-generation SoCs. The size of the buffers used in on-chip routers has a dominant impact on the silicon area and power consumption of an NoC. For an efficient NoC design, it is important to plan the total buffer size and the buffer allocation of each router carefully. In this paper, we propose two buffer planning algorithms for application-specific NoC design. More precisely, given the traffic parameters and performance constraints of the target application, the proposed algorithms automatically determine the minimal buffer budget and assign the buffer depth for each input channel in different routers. The experimental results show that the proposed algorithms can significantly reduce total buffer usage and guarantee the performance requirements. Supported by the National Natural Science Foundation of China (Grant No. 60803018)

14.

An efficient communication model supporting dynamically reconfigurable systems-on-chip

钟生海 温东新 吴峰 王玲 《计算机工程》2009,35(11):263-265

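To address the high communication latency of networks-on-chip and their inability to support dynamic reconfiguration effectively, a new on-chip communication model, GNLB, is proposed. The model introduces the idea of local communication on top of network communication, adopts a network DMA communication mechanism, and uses the network interface's control over communication to control dynamic reconfiguration. Experimental results show that, compared with bus-based and mesh-based systems, a system based on the GNLB architecture improves performance by 25% and 14%, respectively, and reduces hardware resource consumption by 15.87% and 30.15%, respectively.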

15.

Improving performance of multi-core NUCA coherent systems using NoC-assisted mechanisms

Kuei-Chung Chang Ing-Ming Liao Chiu-Han Liao 《The Journal of supercomputing》2012,62(3):1318-1337

The significant speed gap between processor and memory makes last-level cache performance crucial for multi-core architectures (MCA). Non-uniform cache architecture (NUCA) has been proposed to overcome the performance limitations of MCA for many embedded applications. The cache is partitioned into sub-banks, with each sub-bank being an independently accessible entity connected with a fast on-chip network (NoC). This paper presents two NoC-assisted mechanisms to improve the performance and power consumption of NUCA coherence. The first mechanism provides priority-based communication based on the wormhole routing architecture to support NUCA coherence. High-priority coherent packets are transmitted first to save time. The second mechanism offers multicasting communication based on the proposed priority-based NoC to provide efficient cache coherency for NUCA. We dispatch and collect coherence packets at the collecting nodes (CN) to further decrease the number of coherent messages flowing in the NoC. Experimental results show that the priority-based transmission can improve performance by approximately 10%. The proposed multicasting mechanism can further improve performance and decrease power consumption of the NoC in NUCA by approximately 15%. Together, the two proposed mechanisms can enhance performance by 25% on average.

16.

FINA: an interaction-based framework model for network architecture Total citations: 7, self-citations: 0, citations by others: 7

沈苏彬 顾冠群 《计算机研究与发展》2001,38(1):81-87

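Networks have entered an application-oriented stage, and traditional network architectures oriented toward system interconnection can no longer meet the needs of network applications that are judged by high performance. By studying the history of computer network architecture and drawing on results from software architecture research, this paper proposes an interaction-based framework model for network architecture (FINA). FINA describes network architecture at three levels of abstraction: the macroscopic layered network structure, the component-based framework model, and network components together with their interaction templates. It retains the open interconnection structure of peer-layer interaction found in traditional networks and also introduces the customizable structure of adjacent-layer interaction found in modern networks. FINA is used to describe and analyze traditional and programmable network architectures, which shows that it is suitable for describing and evaluating past and present high-performance network architectures that require flexible service customization.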

17.

Architecture of a J2EE Web development framework Total citations: 6, self-citations: 0, citations by others: 6

杜小刚 李舟军 《计算机科学》2006,33(8):236-239

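The emergence of open-source frameworks (such as the MVC framework Struts, the OR mapping framework Hibernate and the logging framework Log4j) has greatly improved the efficiency of J2EE application development, but each of them provides a framework for only one layer of an application and is not a complete application framework. An application framework is a reusable design of the whole system and a template for building applications. In essence it is an abstract implementation of a set of design patterns, and it provides some basic framework services. Building on the integration of various frameworks, we designed and implemented a J2EE Web development framework at a higher level. The framework has a sound software architecture and adopts several architectural design patterns (such as a multi-tier structure, the MVC pattern and the IoC pattern), which ensures that programs are loosely coupled and easy to extend. It also provides some commonly used reusable components that implement the basic functions of Web application systems. It helps developers achieve the greatest possible framework reuse and develop application systems quickly.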

18.

A communication scheme for multi-core chipsets based on network-on-chip

侯宁 卢亚鹏 张多利 《计算机时代》2014,(10):17-18

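Having multiple chips work cooperatively is an inexpensive, low-risk solution for high-density computing applications. Because data communication in a Network On Chip (NoC) is concurrent and decoupled, several NoC multi-core chips can be conveniently integrated at board level to work together, forming an NoC multi-core chipset that quickly delivers greater processing capability. Based on a high-performance image processing project whose hardware system consists mainly of four fully interconnected NoC multi-core chips, this paper studies the transmission of packet data between different multi-core chips and proposes a hardware-implemented communication scheme for multi-core chipsets. The scheme has been applied in that project.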

19.

Deadlock free routing algorithms for irregular mesh topology NoC systems with rectangular regions

Rickard Maurizio Shashi 《Journal of Systems Architecture》2008,54(3-4):427-440

The simplicity of the regular mesh topology Network on Chip (NoC) architecture leads to reductions in design time and manufacturing cost. A weakness of the regular-shaped architecture is its inability to efficiently support cores of different sizes. An approach proposed in the literature to deal with this is the region concept, which helps to accommodate cores larger than the tile size in mesh topology NoC architectures. The region concept offers many new opportunities for NoC design and also raises new design issues and challenges. One of the most important among these is the design of an efficient deadlock-free routing algorithm. Available adaptive routing algorithms developed for regular mesh topology cannot ensure freedom from deadlocks. In this paper, we list and discuss many new design issues which need to be handled when designing NoC systems incorporating cores larger than the tile size. We also present and compare two deadlock-free routing algorithms for mesh topology NoC with regions. The idea of the first algorithm is borrowed from the area of fault-tolerant networks, where a network topology is rendered irregular due to faults in routers or links, and is adapted for the new context. We compare this with an algorithm designed using a methodology for the design of application-specific routing algorithms for communication networks. The application-specific routing algorithm tries to maximize adaptivity by using static and dynamic communication requirements of the application. Our study shows that the application-specific routing algorithm provides not only much higher adaptivity but also superior performance compared to the other algorithm in all traffic cases. However, this higher performance of the second algorithm comes at a higher area cost for implementing network routers.

20.

Binary image denoising using a quantum multilayer self organizing neural network

《Applied Soft Computing》2014

Several classical techniques have evolved over the years for the purpose of denoising binary images. The main disadvantage of these classical techniques is that a priori information regarding the noise characteristics is required during the extraction process. Among the intelligent techniques in vogue, the multilayer self organizing neural network (MLSONN) architecture is suitable for binary image preprocessing tasks. In this article, we propose a quantum version of the MLSONN architecture. Similar to the MLSONN architecture, the proposed quantum multilayer self organizing neural network (QMLSONN) architecture comprises three processing layers, viz. input, hidden and output layers. The different layers contain qubit-based neurons. Single-qubit rotation gates are designated as the network layer interconnection weights. A quantum measurement at the output layer destroys the quantum states of the processed information, thereby inducing incorporation of linear indices of fuzziness as the network system errors used to adjust network interconnection weights through a quantum backpropagation algorithm. Results of application of the proposed QMLSONN are demonstrated on a synthetic and a real-life binary image with varying degrees of Gaussian and uniform noise. A comparative study with the results obtained with the MLSONN architecture and the supervised Hopfield network reveals that the QMLSONN outperforms the MLSONN and the Hopfield network in terms of computation time.