1.

Wei Jinmin (魏进民), Pang Yahong (庞亚红), Mao Youju (毛幼菊) 《计算机工程与设计》 (Computer Engineering and Design) 2005,26(4):961-963,992

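This paper introduces a new interconnection network for massively parallel computing, the hierarchical optical ring interconnection, a scalable network suited to multiprocessors and multicomputers. The hierarchical optical ring interconnection consists of a nonblocking, fault-tolerant, single-hop scalable interconnection topology, and it uses wavelength-division multiple access to make full use of the terahertz bandwidth of optical fiber. The optical network combines the attractive features of hierarchical rings (simple node interfaces, constant node degree, and fault tolerance) with the advantages of optical communication. The paper proposes the hierarchical optical ring interconnection topology, analyzes its structural characteristics, describes the optical design method, and gives a brief feasibility study of the interconnection.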

2.

A regular scalable fault tolerant interconnection network for distributed processing

Wei Shi Pradip K. Srimani 《Parallel Computing》2001,27(14):1897-1919

Bounded degree networks like deBruijn graphs or wrapped butterfly networks are very important from a VLSI implementation point of view as well as for applications where the computing nodes in the interconnection networks can have only a fixed number of I/O ports. One basic drawback of these networks is that they cannot provide a desired level of fault tolerance because of the bounded degree of the nodes. On the other hand, networks like the hypercube (where the degree of a node grows with the size of the network) can provide the desired fault tolerance but the design of a node becomes problematic for large networks. In their attempt to combine the best of both worlds, the authors of [IEEE Transactions on Parallel and Distributed Systems 4(9) (1993) 962] proposed hyper-deBruijn (HD) networks that have many additional features such as logarithmic diameter, partitionability, embedding, etc. However, HD networks are not regular, are not optimally fault tolerant, and their optimal routing is relatively complex. Our purpose in the present paper is to extend the concepts used in the above-mentioned reference to propose a new family of scalable network graphs that retain all the good features of HD networks and at the same time are regular and maximally fault tolerant; the optimal point-to-point routing algorithm is significantly simpler than that of the HD networks. We have developed some new interesting results on wrapped butterfly networks in the process.
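
The remark above that a hypercube node's degree grows with network size can be illustrated with a minimal sketch. It assumes the standard binary hypercube, in which two nodes are adjacent when their labels differ in exactly one bit; the function name `hypercube_neighbors` is hypothetical and does not come from the paper.

```python
def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Return the neighbors of `node` in a binary hypercube of dimension `dim`.

    Assumption: nodes are labeled 0 .. 2**dim - 1 and two nodes are adjacent
    when their labels differ in exactly one bit.
    """
    # Flipping each of the `dim` bits in turn yields one neighbor per bit.
    return [node ^ (1 << k) for k in range(dim)]


if __name__ == "__main__":
    for dim in (3, 10, 20):
        size = 2 ** dim
        degree = len(hypercube_neighbors(0, dim))
        # The degree equals log2(size), so the port count per node
        # keeps growing as the network is scaled up.
        print(f"nodes={size:>8}  degree={degree}")
```

Each node needs one port per dimension, which is why a bounded-degree alternative is attractive when the number of I/O ports is fixed.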

3.

Analytical model based on green criteria for optical backbone network interconnection

Jose Gutierrez Tahir Riaz Jens M. Pedersen Ahmed Patel Ole B. Madsen 《Computer Standards & Interfaces》2011,33(6):574-586

Key terms such as global warming, greenhouse gas emissions, and energy efficiency are currently within the scope of scientific research. In telecommunications networks, wireless applications, routing protocols, etc. are being designed following this new “Green” trend. This work contributes to the evaluation of the environmental impact of networks from a physical interconnection point of view. Network deployment, usage, and disposal are analyzed as contributing elements to ICT's (Information and Communications Technology) CO₂ emissions. This paper presents an analytical model for evaluating and quantifying the CO₂ emissions of optical backbone networks during their lifetime. The main goal of this work is to present the model and illustrate how to evaluate the physical interconnection of backbones from an environmental perspective. This model can be applied as a new type of decision support criterion for backbone interconnection, since minimization of CO₂ emissions is becoming an important factor. In addition, two case studies are presented to illustrate the use and application of this model, and the need for de facto and international standards to reduce CO₂ emissions through good network planning.

4.

A spanning multichannel linked hypercube: a gradually scalable optical interconnection network for massively parallel computing

Louri A. Weech B. Neocleous C. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(5):497-512

A new, scalable interconnection topology called the Spanning Multichannel Linked Hypercube (SMLH) is proposed. This proposed network is well suited to massively parallel systems and is highly amenable to optical implementation. The SMLH uses the hypercube topology as a basic building block and connects such building blocks using two-dimensional multichannel links (similar to spanning buses). In doing so, the SMLH combines positive features of both the hypercube (small diameter, high connectivity, symmetry, simple routing, and fault tolerance) and the spanning bus hypercube (SBH) (constant node degree, scalability, and ease of physical implementation), while at the same time circumventing their disadvantages. The SMLH topology supports many communication patterns found in different classes of computation, such as bus-based, mesh-based, and tree-based problems, as well as hypercube-based problems. A very attractive feature of the SMLH network is its ability to support a large number of processors with the possibility of maintaining a constant degree and a constant diameter. Other positive features include symmetry, incremental scalability, and fault tolerance. It is shown that the SMLH network provides better average message distance, average traffic density, and queuing delay than many similar networks, including the binary hypercube, the SBH, etc. Additionally, the SMLH has comparable performance to other high-performance hypercubic networks, including the Generalized Hypercube and the Hypermesh. An optical implementation methodology is proposed for the SMLH. The implementation methodology combines the advantages of free-space optics with those of wavelength division multiplexing techniques. A detailed analysis of the feasibility of the proposed network is also presented.

5.

Benchmarking parallel processing platforms: an applications perspective

Mueller-Thuns R.B. Saab D.G. Damiano R.F. Abraham J.A. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(8):947-954

Given the increased availability of general purpose parallel computers, two issues arise: one needs to compare the performance of the different available platforms using realistic examples, and it is necessary to write application software that can be ported easily in order to take advantage of different platforms. The authors address these issues from an applications point of view. They are interested in the use of general purpose parallel computers for simulation tasks needed during the design of very large scale integrated (VLSI) circuits. They characterize the simulation task as a useful benchmark and introduce a high level process view of parallel simulation that is helpful for deriving portable parallel programs. Details of the partitioning strategy and the simulation algorithm used in the application are given. They discuss their implementation on different parallel machines and give statistics of various experiments.

6.

Signal parallel input liquid-crystal devices for multichannel optical processing systems

M. S. Kuzmin S. A. Rogov 《Optical Memory & Neural Networks》2016,25(2):114-117

Variants of control devices for liquid-crystal spatial light modulators that provide partially or fully parallel information input in multichannel optical signal processing systems are suggested. Applying the proposed solutions makes it possible to increase the capacity of optical processors considerably, which is relevant for a number of practical problems.

7.

Computer network interconnection

《Computer Networks (1976)》1978,2(1):25-34

This report examines the current situation regarding the interconnection of computer networks, especially packet switched networks (PSNs). Four major types of interconnections are surveyed:

1. Circuit Switched Network to PSN
2. Star Network to PSN
3. Simple Terminal to PSN
4. PSN to PSN

The emphasis is on identifying the barriers to interconnection and on surveying approaches to a solution, rather than recommending any single course of action.

8.

A parallel processing simulator for a network system using multimicroprocessors

A Toda M Imai H Inamori K Hiyama M Hatada 《Microprocessors and Microsystems》1982,6(1):15-20

A parallel processing simulator, NEWTS, has been developed for network systems. This simulator consists of a number of microprocessors, which are interconnected with one another by hierarchical common buses and which execute simulation programs concurrently. This simulator is far cheaper than a general-purpose large computer and enables efficient, high-speed network simulations. This simulator has been applied to a telephone network model, and its usefulness has been confirmed. This paper describes an outline of the NEWTS hardware and software.

9.

Massively parallel Lucas Kanade optical flow for real-time video processing applications

Aurélien Plyer Guy Le Besnerais Frédéric Champagnat 《Journal of Real-Time Image Processing》2016,11(4):713-730

This paper deals with dense optical flow estimation from the perspective of the trade-off, required by real-world applications, between the quality of the estimated flow and computational cost. We propose a fast and robust local method, denoted by eFOLKI, and describe its implementation on GPU. It leads to very high performance even on large image formats such as 4K (3,840 × 2,160) resolution. In order to assess the merits of eFOLKI, we first present a comparative study with currently available GPU codes, including local and global methods, on a large set of data with ground truth. eFOLKI appears significantly faster while providing quite accurate and highly robust estimated flows. We then show, on four real-time video processing applications based on optical flow, that eFOLKI meets the requirements in terms of both estimated flow quality and processing rate.

10.

Multistage ring network: An interconnection network for large scale shared memory multiprocessors

《Journal of Systems Architecture》2000,46(9):765-778

Unidirectional ring-based networks are currently popular choices for high performance large scale shared memory multiprocessors. This class of networks is attractive for its simple hardware interfaces, high speed communication, wider data path, and easy addition of extra nodes. However, a single ring does not scale well due to its fixed bandwidth, and hierarchical ring networks, as a natural extension of a single ring, show limited scalability due to their limited bandwidth near the root. In this paper we present a new interconnection network called the Multistage Ring Network (MRN). The MRN has a 2-level hierarchy of rings, and its interconnection of global rings forms a type of multistage network. The architecture of the MRN is effective at diffusing the global traffic on the network to all global rings, and the bandwidth of the network increases proportionally with increases in the system size. Our results show that, in peak throughput, the MRN performs seven times better than the hierarchical ring network for a system size of 1024.

11.

Skeletons for parallel image processing: an overview of the SKIPPER project

Jocelyn Sérot Dominique Ginhac 《Parallel Computing》2002,28(12):1685-1708

This paper is a general overview of the SKIPPER project, run at Blaise Pascal University between 1996 and 2002. The main goal of the SKIPPER project was to demonstrate the applicability of skeleton-based parallel programming techniques to the fast prototyping of reactive vision applications. This project has produced several versions of a full-fledged integrated parallel programming environment (PPE). These PPEs have been used to implement realistic vision applications, such as road following or vehicle tracking for assisted driving, on embedded parallel platforms on board semi-autonomous vehicles. All versions of SKIPPER share a common front-end and repertoire of skeletons (presented in previous papers) but differ in the techniques used for implementing skeletons. This paper focuses on these implementation issues by making a comparative survey of these implementation techniques according to a set of four criteria (efficiency, expressivity, portability, predictability). It also gives an account of the lessons we have learned, both when dealing with these implementation issues and when using the resulting tools for prototyping vision applications.

12.

A new hierarchy of hypercube interconnection schemes for parallel computers

S. Lakshmivarahan Sudarshan K. Dhall 《The Journal of supercomputing》1988,2(1):81-108

This paper introduces a new hierarchy of cube-based interconnection schemes, called the base-b cube (which properly contains the well-known binary cube), for the design of parallel computers. This hierarchy admits a recursive definition and allows many more reconfigurations than are possible with the binary cube. Our analysis addresses the inherent cost-delay trade-off for this hierarchy along with a number of related topological properties such as sparsity, diameter, existence of node disjoint paths, and odd and even cycles. Embeddings of standard interconnection schemes including linear and two-dimensional arrays, rings, and complete binary trees in a base-b cube are illustrated.

13.

Reliable multistage interconnection network design

S. Rajkumar Neeraj Kumar Goyal 《Peer-to-Peer Networking and Applications》2016,9(6):979-990

High-performance supercomputers generally comprise millions of CPUs, in which interconnection networks play an important role in achieving high performance. New design paradigms for dynamic on-chip interconnection networks involve a) topology, b) synthesis, modeling and evaluation, c) quality of service, fault tolerance and reliability, and d) routing procedures. Constructing dynamic, highly fault-tolerant interconnection networks requires more disjoint paths for each source-destination node pair at each stage and a dynamic rerouting capability to use the various available paths effectively. A fast routing and rerouting strategy is needed to provide reliable performance under switch/link failures. This paper proposes two new architecture designs for fault-tolerant interconnection networks, named reliable interconnection networks (RIN-1 and RIN-2). The proposed layouts are multipath multistage interconnection networks (MINs) providing four disjoint paths for all source-destination node pairs with dynamic rerouting capability. The designs can withstand switch failures in all stages (including the input and output stages) and provide more reliability. The reliability of various MIN architectures is evaluated. Comparing the results with some existing MINs shows that the proposed designs provide higher reliability values and fault tolerance.

14.

Limits on interconnection network performance

Agarwal A. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(4):398-412

The latency of direct networks is modeled, taking into account both switch and wire delays. A simple closed-form expression for contention in buffered, direct networks is derived and found to agree closely with simulations. The model includes the effects of packet size and communication locality. Network analysis under various constraints and under different workload parameters reveals that performance is highly sensitive to these constraints and workloads. A two-dimensional network is shown to have the lowest latency only when switch delays and network contention are ignored; three- or four-dimensional networks are favored otherwise. If communication locality exists, two-dimensional networks regain their advantage. Communication locality decreases both the base network latency and the network bandwidth requirements of applications. It is shown that a much larger fraction of the resulting performance improvement arises from the reduction in bandwidth requirements than from the decrease in latency.

15.

DPillar: Dual-port server interconnection network for large scale data centers

Yong Liao Jiangtao Yin Dong Yin Lixin Gao 《Computer Networks》2012,56(8):2132-2147

To meet the huge demands of computation power and storage space, a future data center may have to include up to millions of servers. The conventional hierarchical tree-based data center network architecture faces several challenges in scaling a data center to that size. Previous research effort has shown that a server-centric architecture, where servers are not only computation and storage workstations but also intermediate nodes relaying traffic for other servers, performs well in scaling a data center to a huge number of servers. This paper presents a server-centric data center network called DPillar, whose topology is inspired by the classic butterfly network. DPillar provides several nice properties and achieves the balance between topological scalability, network performance, and cost efficiency, which make it suitable for building large scale future data centers. Using only commodity hardware, a DPillar network can easily accommodate millions of servers. The structure of a DPillar network is symmetric so that any network bottleneck is eliminated at the architectural level. With each server having only two ports, DPillar is able to provide the bandwidth to support communication intensive distributed applications. This paper studies the interconnection features of DPillar, how to compute routes in DPillar, and how to forward packets in DPillar. Extensive simulation experiments have been performed to evaluate the performance of DPillar. The results show that DPillar performs well even in the presence of a large number of server and switch failures.

16.

ISP: an optimal out-of-core image-set processing streaming architecture for parallel heterogeneous systems

Ha LK, Krüger J, Dihl Comba JL, Silva CT, Joshi S 《IEEE transactions on visualization and computer graphics》2012,18(6):838-851

Image population analysis is the class of statistical methods that plays a central role in understanding the development, evolution, and disease of a population. However, these techniques often require excessive computational power and memory that are compounded with a large number of volumetric inputs. Restricted access to supercomputing power limits its influence in general research and practical applications. In this paper we introduce ISP, an Image-Set Processing streaming framework that harnesses the processing power of commodity heterogeneous CPU/GPU systems and attempts to solve this computational problem. In ISP, we introduce specially designed streaming algorithms and data structures that provide an optimal solution for out-of-core multiimage processing problems both in terms of memory usage and computational efficiency. ISP makes use of the asynchronous execution mechanism supported by parallel heterogeneous systems to efficiently hide the inherent latency of the processing pipeline of out-of-core approaches. Consequently, with computationally intensive problems, the ISP out-of-core solution can achieve the same performance as the in-core solution. We demonstrate the efficiency of the ISP framework on synthetic and real datasets.

17.

Loop optimisation for parallel processing

Di Manzo M.; Frisiani A. L.; Olimpo G. 《Computer Journal》1979,22(3):234-239


18.

Computer trees: a concept for parallel processing

B. Buchberger J. Fegerl F. Lichtenberger 《Microprocessors and Microsystems》1979,3(6):244-248

This paper describes the multimicrocomputer concept called ‘computer tree’ which is currently being investigated at the institute. The first part summarizes the global idea and characteristic features of the concept and complexity considerations concerning the main application of computer trees: parallel execution of recursive algorithms. The paper then goes on to describe a simple hardware implementation of the concept, which has been undertaken in a first stage of the project, and a proposal for a future hardware implementation that goes beyond the capabilities of the first in that a flexible connection of processor modules and, thereby, an adaptation of the hardware structure to the structure of the algorithm during execution time is provided.

19.

The hierarchical Petersen network: a new interconnection network with fixed degree

Jung-Hyun Seo Jong-Seok Kim Hyung Jae Chang Hyeong-Ok Lee 《The Journal of supercomputing》2018,74(4):1636-1654

Network cost and the fixed-degree characteristic of the graph are important factors in evaluating interconnection networks. In this paper, we propose the hierarchical Petersen network (HPN), which is constructed as a recursive and hierarchical structure with a Petersen graph as the basic module. The degree of HPN(n) is 5, and HPN(n) has \(10^n\) nodes and \(2.5 \times 10^n\) edges. We analyze its basic topological properties, routing algorithm, diameter, spanning tree, broadcasting algorithm and embedding. From the analysis, we prove that the diameter and network cost of HPN(n) are \(3\log _{10}N-1\) and \(15 \log _{10}N-1\), respectively, and that it contains a spanning tree with a degree of 4. In addition, we propose a link-disjoint one-to-all broadcasting algorithm and show that HPN(n) can be embedded into FP\(_k\) with expansion 1, dilation 2k and congestion 4. For most of the fixed-degree networks proposed, network cost and diameter require \(O(\sqrt{N})\) and the degree of the graph requires O(N). However, HPN(n) requires O(1) for the degree and \(O(\log _{10}N)\) for both diameter and network cost. As a result, the interconnection network suggested in this paper is superior to current fixed-degree and hierarchical networks in terms of network cost, diameter and the degree of the graph.
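
To give a feel for the closed-form expressions quoted in this abstract, the following sketch simply evaluates them for small n. It assumes \(N = 10^n\), so that \(\log_{10} N = n\), and it takes the node count, edge count, diameter and network cost exactly as the abstract states them without verifying them independently. The function name `hpn_properties` is hypothetical.

```python
def hpn_properties(n: int) -> dict[str, int]:
    """Evaluate the HPN(n) figures as quoted in the abstract.

    Assumption: N = 10**n nodes, so log10(N) = n. All formulas are copied
    from the abstract; this sketch does not construct the graph itself.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    nodes = 10 ** n
    return {
        "nodes": nodes,
        "edges": 5 * nodes // 2,     # 2.5 * 10^n, as stated
        "degree": 5,                 # stated fixed degree
        "diameter": 3 * n - 1,       # 3 log10(N) - 1, as stated
        "network_cost": 15 * n - 1,  # 15 log10(N) - 1, as stated
    }


if __name__ == "__main__":
    for n in (1, 2, 3, 6):
        print(n, hpn_properties(n))
```

The point of the tabulation is that the node count grows tenfold per level while the quoted diameter and network cost grow only linearly in n.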

20.

CGIN: a fault tolerant modified Gamma interconnection network

Po-Jen Chuang 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(12):1301-1306

To improve the terminal reliability of the Gamma interconnection network (GIN), we consider altering its connecting patterns between stages to attain multiple disjoint paths between any source and destination pair. The new modified GIN, referred to as a CGIN because its connecting patterns between stages exhibit a cyclic feature, is able to tolerate any arbitrary single fault and to improve terminal reliability accordingly. If several rows of switching elements are fabricated in one chip using VLSI technology, a CGIN could lead to reduced cost because the pin count per chip decreases and the layout area taken by connections shrinks. To make routing and rerouting in the CGIN more efficient and simpler to implement, destination tag routing and rerouting are also provided.