Similar literature
 20 similar articles found (search time: 0 ms)
1.
The energy consumption of parallel computers has become an obstacle to higher-performance systems. In this paper, we focus on power optimization of high-performance interconnection networks for MPI applications on high-performance parallel computers. In contrast to earlier history-based work, we propose compiler-directed, power-aware on/off network links. Network links have idle intervals during the execution of parallel applications, and during these intervals the links still consume large amounts of energy. With on/off network links, the compiler first divides load-balanced MPI applications into communication intervals and computation intervals, and then inserts on/off instructions into the applications to switch the link state. To hide the time overhead of state switching, we use a time estimation technique to analyze the computation time and insert the on instruction before the communication interval is reached. Results from simulations and experiments show that the proposed compiler-directed method can reduce the energy consumption of interconnection networks by 20∼70%, at a cost of less than 1% in network latency and performance.

2.
The performance of multiple-bus interconnection networks for multiprocessor systems is analyzed, taking into account conflict arising from memory and bus interference. A discrete stochastic model of bandwidth is presented for systems in which each memory is connected either to all the buses or to a subset of the available buses. The effects of the assumptions made concerning independence among requests for different memories (spatial independence) and resubmission of blocked requests (temporal independence) are investigated systematically. The basic bandwidth model is extended to account for spatial dependence, and compared to previously proposed models. Finally, the various analytic models are shown to be in close agreement with simulation results.
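
The kind of bandwidth model this abstract describes can be sketched as follows. This is a textbook-style approximation, not necessarily the paper's own model: it assumes every memory is connected to all buses, that each processor issues a request with probability `p_req` per cycle to a uniformly chosen memory, and that requests are spatially and temporally independent. The function and parameter names are illustrative.

```python
from math import comb


def multibus_bandwidth(n_proc, n_mem, n_bus, p_req):
    """Expected number of memory requests served per cycle.

    Assumptions (illustrative, not taken from the paper): full bus-memory
    connection, uniform memory references, independent requests.
    """
    # Probability that a given memory is requested by at least one processor.
    q = 1.0 - (1.0 - p_req / n_mem) ** n_proc
    total = 0.0
    for i in range(n_mem + 1):
        # Approximate the number of requested memories as Binomial(n_mem, q).
        prob_i = comb(n_mem, i) * q**i * (1.0 - q) ** (n_mem - i)
        # Memory interference is already in q; bus interference caps service at n_bus.
        total += min(i, n_bus) * prob_i
    return total


if __name__ == "__main__":
    for buses in (2, 4, 8):
        print(buses, round(multibus_bandwidth(8, 8, buses, 1.0), 3))
```

With as many buses as memories the cap never binds, and the estimate reduces to the crossbar value `n_mem * q`.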

3.
Several conjectures on interconnection networks   Total citations: 2 (self-citations: 0, citations by others: 2)
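The n-cube is a well-known interconnection network, and the star graph, the pancake graph, and the bubble-sort graph are important interconnection networks designed from the Cayley graph model. Cayley graphs of transposition trees are a special class of Cayley graphs; the star graph and the bubble-sort graph are the Cayley graphs whose transposition trees are a star and a path, respectively. One conjecture each is given for the n-cube, the star graph, the pancake graph, the bubble-sort graph, and Cayley graphs of transposition trees. The concept of the Cayley graph of a transposition graph is proposed, and from this concept two interconnection networks, the cycle graph and the wheel graph, are designed. It is proved that the bubble-sort graph and the star graph can be embedded in the cycle graph and the wheel graph, respectively.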
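
The relationship between transposition trees and the star and bubble-sort graphs can be made concrete with a small sketch. It uses the standard definitions (vertices are the permutations of n symbols, and each tree edge (i, j) contributes a generator that swaps positions i and j); the helper names are illustrative and not taken from the paper.

```python
from collections import deque
from itertools import permutations


def transposition_cayley_graph(n, tree_edges):
    """Adjacency lists of the Cayley graph on S_n generated by the
    position swaps (i, j) listed in tree_edges."""
    adj = {}
    for perm in permutations(range(n)):
        neighbours = []
        for i, j in tree_edges:
            swapped = list(perm)
            swapped[i], swapped[j] = swapped[j], swapped[i]
            neighbours.append(tuple(swapped))
        adj[perm] = neighbours
    return adj


def star_edges(n):
    """Transposition tree that is a star: position 0 joined to every other."""
    return [(0, i) for i in range(1, n)]


def path_edges(n):
    """Transposition tree that is a path: adjacent positions joined."""
    return [(i, i + 1) for i in range(n - 1)]


def eccentricity(adj, source):
    """Largest BFS distance from source; equals the diameter here because
    Cayley graphs are vertex-transitive."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


if __name__ == "__main__":
    identity = tuple(range(4))
    star = transposition_cayley_graph(4, star_edges(4))
    bubble = transposition_cayley_graph(4, path_edges(4))
    print(len(star), eccentricity(star, identity))
    print(len(bubble), eccentricity(bubble, identity))
```

Both graphs have n! vertices of degree n-1; only the choice of tree differs, which is why a single construction covers both networks.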

4.
Performance of multiprocessor interconnection networks   Total citations: 1 (self-citations: 0, citations by others: 1)
A tutorial is provided on the performance evaluation of multiprocessor interconnection networks, to guide system designers in their design process. A classification of parallel/distributed systems is followed by a classification of multiprocessor interconnection networks. Basic terminology for performance evaluation is presented. The performance of crossbar interconnection networks, multistage interconnection networks, and multiple-bus systems is then addressed, and a comparison is made among them.

5.
In recent years, many multistage interconnection networks using 2 × 2 switching cells have been proposed for parallel architecture. Here we state a correct and easy graph characterization of all the networks topologically equivalent to the Omega, Flip, Baseline, Reverse Baseline, Indirect Binary Cube, and Modified Data Manipulator networks.

6.
In this paper, we introduce the FLUX interconnection networks, a scheme where the interconnections of a parallel system are established on demand before or during program execution. We present a programming paradigm which can be utilized to make the proposed solution feasible. We perform several experiments to show the viability of our approach and the potential performance gain of using the most suitable network configuration for a given parallel program. We experiment on several case studies, evaluate different algorithms, developed for meshes or trees, and map them on “grid”-like or reconfigurable physical interconnection networks. Our results clearly show that, based on the underlying network, different mappings are suitable for different algorithms. Even for a single algorithm different mappings are more appropriate, when the processing data size, the number of utilized nodes or the hardware cost of the processing elements changes. The implication of the above is that changing interconnection topologies/mappings (dynamically) on demand depending on the program needs can be beneficial.

7.
8.
The interconnection network equivalence notions reported in the literature are formalized via conjugation maps over the sets of interconnections of such networks. Various forms of relations including group isomorphisms among interconnection networks are introduced. Equivalence relations express the degrees of freedom in “making one network behave like another.” Examples of these relations for commutative cube-connected networks with individual stage control are also included. In addition, an algorithm is provided to construct equivalence maps among such networks.

9.
We study the cross product as a method for generating and analyzing interconnection network topologies for multiprocessor systems. Consider two interconnection graphs G1 and G2 each with some established properties such as symmetry, low degree and diameter, scalability, simple optimal routing, recursive structure (partitionability), fault tolerance, existence of node-disjoint paths, low cost embedding, and efficient broadcasting. We investigate and evaluate the corresponding properties for the cross product of G1 and G2 based on the properties of G1 and those of G2. We also give a mathematical characterization of product families of graphs which are closed under the cross product operation. This investigation is useful in two ways. On one hand, it gives a new tool for further studying some of the known interconnection topologies, such as the hypercube and the mesh, which can be defined using the cross product operation. On the other hand, it can be used in defining and evaluating new interconnection graphs using the cross product operation on known topologies.
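
As a rough illustration of how the hypercube and the mesh arise from a product operation, the sketch below assumes that "cross product" means the Cartesian product of graphs, where (u1, u2) is adjacent to (v1, v2) when the two agree in one coordinate and are adjacent in the other. That reading is an assumption; the abstract does not spell out the definition. All function names are illustrative.

```python
from itertools import product


def cross_product(g1, g2):
    """Cartesian product of two graphs given as adjacency-list dicts."""
    result = {}
    for u1, u2 in product(g1, g2):
        # Move along an edge of g1 while holding the g2 coordinate fixed, or vice versa.
        neighbours = [(v1, u2) for v1 in g1[u1]] + [(u1, v2) for v2 in g2[u2]]
        result[(u1, u2)] = neighbours
    return result


def path_graph(n):
    """Path on nodes 0..n-1."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def edge_count(graph):
    """Number of undirected edges (each appears in two adjacency lists)."""
    return sum(len(nbrs) for nbrs in graph.values()) // 2


if __name__ == "__main__":
    k2 = path_graph(2)                                  # a single edge
    cube3 = cross_product(cross_product(k2, k2), k2)    # 3-dimensional hypercube
    mesh = cross_product(path_graph(3), path_graph(4))  # 3 x 4 mesh
    print(len(cube3), edge_count(cube3))
    print(len(mesh), edge_count(mesh))
```

Under this definition node degrees add across the two factors, one example of a product property that follows directly from the properties of G1 and G2.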

10.
Topology optimization approaches   Total citations: 4 (self-citations: 0, citations by others: 4)
Topology optimization has undergone a tremendous development since its introduction in the seminal paper by Bendsøe and Kikuchi in 1988. By now, the concept is developing in many different directions, including “density”, “level set”, “topological derivative”, “phase field”, “evolutionary” and several others. The paper gives an overview, comparison and critical review of the different approaches, their strengths, weaknesses, similarities and dissimilarities and suggests guidelines for future research.

11.
Central to all parallel architectures is a switching network which facilitates the communication between a machine's components necessary to support their cooperation. Multistage interconnection networks (MINs) are classified and analytic models are described for both packet-switched and circuit-switched MINs with asynchronous transmission mode. Under strong enough assumptions, packet switching can be modeled by standard queuing methods, hence providing a standard against which to assess approximate models. We describe one such approximate model with much weaker assumptions which is more widely applicable and can be implemented more efficiently. To model circuit switching requires a different approach because of the presence of passive resources, namely multiple links through the MIN which must be held before a message can be transmitted and throughout its transmission. An approximate analysis based upon the recursive structure of a particular MIN topology which yields accurate predictions when compared with simulation is described.

12.
It is important that a communication service be highly dependable. Many events can cause failures in a network: destruction of nodes or links, cable cuts, node interruptions, software errors, hardware failures, transmission failures at various points, and human error or accidents can all interrupt service for long periods of time. A communication network therefore requires, from the outset, a greater degree of stability, or less vulnerability. In this work, various stability measures of a communication network are defined, and these measures are given for some long-known static interconnection networks and for w-star networks, which form a new graph class.

13.
Several researchers have analysed the performance of k-ary n-cubes taking into account channel bandwidth constraints imposed by implementation technology, namely the constant wiring density and pin-out constraints for VLSI and multiple-chip technology respectively. For instance, Dally [IEEE Trans. Comput. 39(6) (1990) 775], Abraham [Issues in the architecture of direct interconnection networks schemes for multiprocessors, Ph.D. thesis, University of Illinois at Urbana-Champaign, 1992], and Agrawal [IEEE Trans. Parallel Distributed Syst. 2(4) (1991) 398] have shown that low-dimensional k-ary n-cubes (known as tori) outperform their high-dimensional counterparts (known as hypercubes) under the constant wiring density constraint. However, Abraham and Agrawal have arrived at an opposite conclusion when they considered the constant pin-out constraint. Most of these analyses have assumed deterministic routing, where a message always uses the same network path between a given pair of nodes. More recent multicomputers have incorporated adaptive routing to improve performance. This paper re-examines the relative performance merits of the torus and hypercube in the context of adaptive routing. Our analysis reveals that the torus manages to exploit its wider channels under light traffic. As traffic increases, however, the hypercube can provide better performance than the torus. Our conclusion under the constant wiring density constraint is different from that of the works mentioned above because adaptive routing enables the hypercube to exploit its richer connectivity to reduce message blocking.
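
The trade-off between a low-dimensional torus and a hypercube of the same size can be sketched with the basic k-ary n-cube quantities. The sketch assumes bidirectional links with wraparound and a deliberately simplified pin-out model in which a fixed per-node pin budget is split evenly over the node's channels. The `pins_per_node` value and the function name are hypothetical and do not come from the paper.

```python
def kary_ncube_stats(k, n, pins_per_node=64):
    """Basic topological figures for a k-ary n-cube.

    Assumes bidirectional wraparound links; pins_per_node is a hypothetical
    budget used only to illustrate the constant pin-out constraint.
    """
    if k < 2 or n < 1:
        raise ValueError("need k >= 2 and n >= 1")
    nodes = k ** n
    # With k == 2 the two directions in a dimension reach the same neighbour.
    degree = n if k == 2 else 2 * n
    # Worst case is k // 2 hops in each of the n dimensions.
    diameter = n * (k // 2)
    # Fewer channels per node leave more pins, and so more width, per channel.
    channel_width = pins_per_node // degree
    return {
        "nodes": nodes,
        "degree": degree,
        "diameter": diameter,
        "channel_width": channel_width,
    }


if __name__ == "__main__":
    print("torus    ", kary_ncube_stats(16, 2))
    print("hypercube", kary_ncube_stats(2, 8))
```

For 256 nodes the 16-ary 2-cube has wider channels but a longer diameter than the 2-ary 8-cube, which matches the tension the abstract describes between channel width and connectivity.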

14.
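Using the semidirect product from group theory as a tool, several existing methods for constructing interconnection networks are unified into a single Cayley graph model, CSC(q,p,l,k), giving it better scalability. It is proved that the CSC(q,p,l,k) network includes several important interconnection networks as special cases, for example the cube-connected cycles, the star-connected cycles, and the recently proposed and much-discussed Cayley graphs of degree k. The significance of the model is that designers of computer systems need only choose suitable parameters to determine the interconnection network model they need. In addition, the model helps, to some extent, to avoid redundant research on the construction of interconnection networks.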

15.
16.
Single-hop non-blocking networks have the advantage of providing uniform latency and throughput, which is important for cache-coherent network-on-chip systems. This paper focuses on high-performance circuit designs of multi-stage non-blocking networks as alternatives to crossbars. Existing work shows that Benes networks have a much lower transistor count and smaller circuit area but longer delay than crossbars. To reduce the timing delay, we propose to design the Clos network built with larger switches. Using fewer than half as many stages as the Benes network, the Clos network with 4 × 4 switches can significantly reduce the timing delay. The circuit designs of both Benes and Clos networks in different sizes are conducted considering two types of implementation of the configurable switch: with N-type metal-oxide-semiconductor logic (NMOS) transistors only and with full transmission gates (TGs). The layout and simulation results under 45 nm technology show that the TG-based implementation demonstrates much better signal integrity than its counterpart. Clos networks achieve on average 60% lower timing delay than Benes networks, with even smaller area and power consumption.
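
The stage-count claim can be checked with a small calculation. The sketch assumes the usual recursive construction, in which an N-port network built from r × r switches has 2·log_r(N) − 1 stages of N/r switches each; r = 2 gives the Benes network. Whether the paper's Clos design follows exactly this recursion is an assumption, and the function names are illustrative.

```python
def stage_count(n_ports, radix):
    """Stages in a recursively built rearrangeable network of radix x radix switches.

    Assumes n_ports is an exact power of radix (illustrative model only).
    """
    if radix < 2 or n_ports < radix:
        raise ValueError("need radix >= 2 and n_ports >= radix")
    levels, size = 0, 1
    while size < n_ports:
        size *= radix
        levels += 1
    if size != n_ports:
        raise ValueError("n_ports must be a power of radix")
    return 2 * levels - 1


def switch_count(n_ports, radix):
    """Total switches: n_ports / radix switches in each stage."""
    return (n_ports // radix) * stage_count(n_ports, radix)


if __name__ == "__main__":
    for ports in (16, 64, 256):
        print(ports, stage_count(ports, 2), stage_count(ports, 4))
```

Under this model a signal crosses 2·log2(N) − 1 switches in the Benes network but only log2(N) − 1 in the 4 × 4 design, which is fewer than half.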

17.
The star networks, which were originally proposed by Akers and Harel, have suffered from a strict restriction on the number of nodes. The general incomplete star networks (GISN) are proposed in this paper to relieve this restriction. An efficient labeling scheme for GISN is given, and routing and broadcasting algorithms are also presented for GISN. The communication diameter of GISN is shown to be bounded by 4n-7. The proposed single-node broadcasting algorithm is optimal with respect to time complexity, O(n log₂ n).

18.
Wireless operators, in developed or emerging regions, must support triple-play service offerings as demanded by the market or mandated by regulatory bodies through so-called Universal Service Obligations (USOs). Since individual operators might face different constraints such as available spectrum licenses, technologies, cost structures or a low energy footprint, the EU FP7 CARrier grade wireless MEsh Network (CARMEN) project has developed a carrier-grade heterogeneous multi-radio back-haul architecture which may be deployed to extend, complement or even replace traditional operator equipment. To support offloading of live triple-play content to broadcast-optimized, e.g., DVB-T, overlay cells, this heterogeneous wireless back-haul architecture integrates unidirectional broadcast technologies. In order to manage the physical and logical resources of such a network, a centralized coordinator approach has been chosen, where no routing state is kept at plain WiBACK Nodes (WNs) which merely store QoS-aware MPLS forwarding state. In this paper we present our Unidirectional Technology (UDT)-aware design of the centralized Topology Management Function (TMF), which provides a framework for different topology and spectrum allocation optimization strategies and algorithms to be implemented. Following the validation of the design, we present evaluation results using a hybrid local/centralized topology optimizer showing that our TMF design supports the reliable forming of optimized topologies as well as the timely recovery from node failures.

19.

Recently, topology optimization has drawn interest from both industry and academia as the ideal design method for additive manufacturing. Topology optimization, however, has a high entry barrier as it requires substantial expertise and development effort. The typical numerical methods for topology optimization are tightly coupled with the corresponding computational mechanics method such as a finite element method and the algorithms are intrusive, requiring an extensive understanding. This paper presents a modular paradigm for topology optimization using OpenMDAO, an open-source computational framework for multidisciplinary design optimization. This provides more accessible topology optimization algorithms that can be non-intrusively modified and easily understood, making them suitable as educational and research tools. This also opens up further opportunities to explore topology optimization for multidisciplinary design problems. Two widely used topology optimization methods—the density-based and level-set methods—are formulated in this modular paradigm. It is demonstrated that the modular paradigm enhances the flexibility of the architecture, which is essential for extensibility.


20.
The flow-control mechanism determines the manner in which communication resources are allocated. A well-designed flow-control mechanism should provide efficient allocation of communication resources in a wide variety of interconnection networks. The goal of this paper is to suggest a highly effective “Step-Back-on-Blocking” buffered flow control. The proposed flow-control mechanism combines the advantages of the Wormhole and Virtual Cut-Through flow controls, while adding a means for adaptive allocation of communication resources. The “Step-Back-on-Blocking” flow control provides low message latency and achieves a high fraction of the channel bandwidth by conditionally evading temporarily blocked network resources. The effectiveness of the proposed flow control has been evaluated on the basis of numerous experiments conducted in the OMNeT++ discrete event simulation environment.
