Found 20 similar documents (search time: 15 ms)
1.
Mahmoud Moadeli, Wim Vanderbauwhede 《Journal of Systems and Software》2010,83(8):1327-1336
Multicast is one of the most frequently used collective communication operations in multi-core SoC platforms. The bus, the traditional interconnect architecture for SoC development, has been highly efficient in delivering multicast messages. Because the bus does not scale, it cannot meet the bandwidth requirements of large SoCs. Networks-on-chip (NoCs) emerged as a scalable alternative to address the increasing communication demands of such systems. However, because of their hop-by-hop communication, NoCs may not deliver multicast operations as efficiently as buses do. Adopting multi-port routers has been one approach to improving the performance of multicast operations in interconnection networks. This paper presents a novel analytical model to compute the communication latency of the multicast operation in wormhole-routed interconnection networks employing an asynchronous multi-port router scheme. The model is applied to the Quarc NoC, and its validity is verified by comparing the model predictions against results obtained from a discrete-event simulator developed using OMNeT++.
2.
Performance modeling of Cartesian product networks (total citations: 1, self-citations: 0, citations by others: 1)
Reza Moraveji, Hamid Sarbazi-Azad, Albert Y. Zomaya 《Journal of Parallel and Distributed Computing》2011,71(1):105-113
This paper presents a comprehensive performance model for fully adaptive routing in wormhole-switched Cartesian product networks. Besides the generality of the model, which makes it suitable for any product graph, experimental (simulation) results show that the proposed model is highly accurate even under heavy traffic and in the saturation region, where other models have severe problems predicting network performance. Most popular interconnection networks, including the mesh, hypercube, and torus, can be defined as a Cartesian product of two or more networks. Torus and mesh networks are the most popular topologies used in recent supercomputing parallel machines. They have also been widely used to realize on-chip networks in recent on-chip multicore and multiprocessor systems.
3.
In this paper, we present the mathematical background for a new approach to performance modeling of interconnection networks, based on analyzing the packet blocking and waiting time spent in each channel along all possible paths in the channel dependency graph. We propose a new, simple, and very accurate analytical model for deterministic routing in wormhole networks that is general in terms of network topology and traffic distribution. An accurate calculation of the variance of the service time has been developed, which overcomes the rough approximation typically used in existing models. The model supports two-dimensional mesh topologies, widely used in network-on-chip architectures, and multidimensional topologies, popular in multicomputer architectures. It is applicable even to irregular topologies and arbitrary application-specific traffic. Results obtained through simulation show that the model achieves a high degree of accuracy.
4.
Admission, congestion, and peak power control mechanisms are essential parts of a cluster network design that supports integrated traffic. While an admission control algorithm helps deliver the assured performance, a congestion control algorithm regulates traffic injection to avoid network saturation. Peak power control enforces pre-specified power constraints while maintaining service quality by regulating the injection of packets. In this paper, we propose these control algorithms for clusters, which are increasingly used in a diverse set of applications that require QoS guarantees. The uniqueness of our approach is that we develop these algorithms for wormhole-switched networks, which have been used in designing clusters. We use QoS-capable wormhole routers and QoS-capable network interface cards (NICs), referred to as Host Channel Adapters (HCAs) in the InfiniBand™ Architecture (IBA), to evaluate the effectiveness of these algorithms. The admission control is applied at the HCAs and the routers, while the congestion control and the peak power control are deployed only at the HCAs. A mixed workload consisting of best-effort, real-time, and control traffic is used to investigate the effectiveness of the proposed schemes.
5.
The design of teletext broadcast cycles (total citations: 1, self-citations: 0, citations by others: 1)
M. H. Ammar 《Performance Evaluation》1985,5(4):235-242
Teletext is a one-way picture information system in which pages of information are broadcast continuously to all users. System response time is an important consideration in designing teletext systems. Factors contributing to system response time include transmission speeds, the amount of processing required at user terminals, and the efficiency of picture encoding procedures. Equally important is the design of the teletext broadcast cycle, i.e., the order in which pages are broadcast cyclically. In this paper, we first derive a formula for the mean response time of a given cycle and a lower bound on the mean response time of any cycle. Next, we present a design procedure that yields a cycle with mean response time close to the theoretical lower bound. The use of these results is demonstrated through a numerical example.
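To make the notion of the mean response time of a cycle concrete, the following is a minimal sketch and not the formula derived in the paper. It assumes unit-length page slots, requests arriving uniformly at random in time, and known per-page request probabilities; it measures only the wait until the requested page next starts. The function name `mean_wait` and its parameters are hypothetical.

```python
from collections import defaultdict


def mean_wait(cycle, request_prob):
    """Mean wait, in slots, from a request until its page next starts.

    Assumptions (not from the paper): every page occupies one slot,
    requests arrive uniformly at random over the cycle, and
    request_prob maps each page to its request probability.
    """
    length = len(cycle)
    starts = defaultdict(list)
    for slot, page in enumerate(cycle):
        starts[page].append(slot)

    total = 0.0
    for page, prob in request_prob.items():
        slots = starts[page]
        if not slots:
            raise ValueError(f"page {page!r} never appears in the cycle")
        # Cyclic gaps between successive starts of this page; a page that
        # appears once has a single gap equal to the whole cycle length.
        gaps = [
            (slots[(i + 1) % len(slots)] - s) % length or length
            for i, s in enumerate(slots)
        ]
        # A uniformly random arrival lands in a gap of size g with
        # probability g / length and then waits g / 2 on average.
        total += prob * sum(g * g for g in gaps) / (2 * length)
    return total


if __name__ == "__main__":
    # Repeating the popular page A shortens the average wait.
    print(mean_wait(["A", "B", "A", "C"], {"A": 0.5, "B": 0.25, "C": 0.25}))
```

Under these assumptions, repeating a frequently requested page within the cycle shortens its gaps and lowers the weighted mean, which is the trade-off a cycle design procedure has to balance.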
6.
Analytical modelling is the most cost-effective method to evaluate the performance of a system. Several analytical models have been proposed in the literature for different interconnection network systems. This paper proposes an accurate analytical model to predict message latency in wormhole-switched star graphs with fully adaptive routing. Although this research focuses on the star graph, the modelling approach can also be applied to some other regular and irregular interconnection networks. The results obtained from simulation experiments confirm that the proposed model exhibits good accuracy for various network sizes and under different operating conditions.
7.
Building on the analytical models developed in [A. Shahrabi, M. Ould-Khoua, L. Mackenzie, Performance modelling of broadcast communication in multicomputer networks, International Journal of Parallel, Emergent, and Distributed Systems 20 (1) (2005); A. Shahrabi, M. Ould-Khoua, On the communication latency of wormhole routed interconnection networks, International Journal of Simulation 4 (5–6) (2003) 32–43; A. Shahrabi, L. Mackenzie, M. Ould-Khoua, Modelling of Adaptive Wormhole-Routed Hypercubes in the Presence of Broadcast Traffic, in: N.J. Dimopoulos, K.F. Li (Eds.), Chapter 10 in High Performance Computing Systems And Applications, Kluwer Academic Publishers, Boston, 2002; A. Shahrabi, M. Ould-Khoua, L. Mackenzie, An Analytical Model of Wormhole-Routed Hypercubes under Broadcast Traffic, Performance Evaluation 53 (1) (2003) 23–42; A. Shahrabi, M. Ould-Khoua, L. Mackenzie, Latency of double-tree broadcast in wormhole-routed hypercubes, in: Proceedings of International Conference on Parallel Processing (ICPP’01), IEEE Computer Society, 2001, pp. 401–408], this paper presents a comparative performance study of adaptive and deterministic routing algorithms in wormhole-switched interconnection networks carrying a broadcast traffic component, and investigates how their performance varies under a variety of network operating conditions. In contrast to previous work, which has reported the superiority of adaptive over deterministic routing, especially in high-dimensional networks such as hypercubes, our results show that adaptivity does not necessarily improve network performance even for high-dimensional networks, and that its advantage deteriorates as the broadcast fraction of generated traffic increases.
8.
9.
Evgeni Krimer, Isaac Keslassy 《Journal of Parallel and Distributed Computing》2011,71(5):687-699
Networks-on-chip (NoCs) are used in a growing number of SoCs and multi-core processors. Because messages compete for the NoC’s shared resources, quality of service and resource allocation are major concerns for system designers. In particular, a model for the properties of packet delivery through the network is desirable. We present a methodology for packet-level static timing analysis in NoCs. Our methodology quickly and accurately gauges the performance parameters of a virtual-channel wormhole NoC without simulation. The network model can handle any topology, link capacities, and buffer sizes. It provides per-flow delay analysis that is orders of magnitude faster than simulation while being significantly more accurate than prior static modeling techniques. Using a carefully derived and reduced Markov chain, the model can statically represent the dynamic network state. Usage of the model in a placement optimization problem is shown as an example application.
10.
Piotr Zieliński 《Distributed Computing》2008,20(6):435-450
The Atomic Broadcast algorithm described in this paper can deliver messages in two communication steps, even if multiple processes broadcast at the same time. It tags all broadcast messages with the local real time, and delivers all messages in the order of these timestamps. Both positive and negative statements are used: “m broadcast at time 51” vs. “no messages broadcast between times 31 and 51”. To prevent crashed processes from blocking the system, the elected leader broadcasts negative statements on behalf of the processes it suspects to have crashed. A new cheap Generic Broadcast algorithm is used to ensure consistency between conflicting statements. It requires only a majority of correct processes (n > 2f) and, in failure-free runs, delivers all non-conflicting messages in two steps. The main algorithm satisfies several new lower bounds, which are proved in this paper.
11.
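An efficient service composition optimization algorithm (total citations: 1, self-citations: 0, citations by others: 1)
With the emergence of large numbers of Web services that share the same functional properties but differ in non-functional properties, how to select a component service for each task in a service composition workflow so as to maximize the QoS (quality of service) of the composite service, while also satisfying the requirements of different users, has become an active research topic both in China and abroad. Because of the complexity of the problem (NP-hard), most existing methods are not well suited to Web service composition systems that require relatively accurate, real-time decisions. This paper therefore proposes a composite service optimization algorithm based on convex hull construction (CM-HEU) to solve the QoS-aware service composition optimization problem. CM-HEU first constructs a convex hull for each group of tasks in the composite service to reduce the search space. It then applies multiple upgrade operations and one downgrade operation to the initial solution vector to reach the goal of global optimization. Experiments show that, compared with some current mainstream methods, CM-HEU not only obtains a fairly good result but is also efficient.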
12.
Virtual channels yield a significant improvement in the performance of wormhole-routed networks because they can greatly reduce message blocking over network resources. K-ary n-cubes with deterministic routing have been widely analysed using analytical modelling tools. Most existing models, however, have either entirely ignored the effects of virtual channel multiplexing or have not considered the impact of virtual channel allocation on message latency. This paper discusses two different organisations of virtual channels in k-ary n-cubes, resulting in two deterministic routing algorithms. It then proposes an analytical model to compute message latency for the two routing algorithms. The proposed model is used in a case study to demonstrate the sensitivity of network latency to the way virtual channels are allocated to messages.
13.
The scarcity and diversity of resources among the devices of heterogeneous computing environments may affect their ability to execute services within the users’ requested Quality of Service levels, particularly in open real-time environments where the characteristics of the computational load cannot always be predicted in advance, yet responses to events must still be provided within precise timing constraints to guarantee a desired level of performance.
14.
Vassiliki Andronikou, Konstantinos Mamouras, Konstantinos Tserpes, Dimosthenis Kyriazis, Theodora Varvarigou 《Future Generation Computer Systems》2012,28(3):544-553
Data replication is a standard fault-tolerance approach for systems, especially large-scale ones, that store and provide data over wide geographical and administrative areas. The major topics that the task of data replication covers include replica creation, placement, relocation and retirement, replica consistency, and replica access. In a business context, a number of constraints exist, set by the infrastructure, network, and application capabilities in combination with the Quality of Service (QoS) requirements, that hinder the effectiveness of data replication schemes. In this paper, we examine how this combination affects the replication lifecycle in Data Grids, and we introduce a set of interoperable novel file replication algorithms that take into account the infrastructural constraints as well as the ‘importance’ of the data. The latter is approximated through a multi-parametric factor that encapsulates a set of data-specific parameters, such as popularity and content significance.
15.
Jun-Hong Shen, Ye-In Chang 《Journal of Systems and Software》2008,81(11):2091-2103
Data broadcast is an efficient method for disseminating information to mobile clients over a wireless channel. It allows a huge number of mobile clients to access data simultaneously in wireless environments. In real-life applications, more popular data may be accessed by clients more frequently than less popular data. Under such scenarios, Acharya et al.’s Broadcast Disks algorithm (BD) makes more popular data appear more often in a broadcast period than less popular data, i.e., a nonuniform broadcast, and performs well in reducing client waiting time. However, mobile devices must constantly tune in to the wireless broadcast channel to examine data, which consumes a lot of energy. Indexing the broadcast file can greatly reduce the energy consumption of mobile devices without significantly increasing client waiting time. In this paper, we propose an efficient nonuniform index over BD called the skewed index (SI). The proposed algorithm builds an index tree according to the skewed access patterns of clients, and allocates index nodes for popular data more often than those for less popular data in a broadcast cycle. Our experimental study shows that the proposed algorithm outperforms the flexible index and the flexible distributed index.
16.
Dealing with virtual channels has always been a critical issue in developing analytical performance models for interconnection networks. Almost all previous studies have relied on a method proposed by Dally to capture the effect of virtual channel multiplexing on the performance of interconnection networks. This paper presents a new method to model the effect of virtual channel multiplexing in high-speed wormhole-switched interconnection networks. Dally's method loses accuracy as the traffic load increases, owing to the blocking nature of wormhole-switched networks. Our new method is based on a finite-capacity queue, M/G/1/V, and, compared with Dally's method, achieves a higher degree of accuracy under low, moderate, and high traffic loads. Furthermore, its simplicity makes it easy to apply under different network conditions and setups. The presented model is validated by means of an event-driven simulator, and a detailed comparison with Dally's method is presented.
17.
In wireless broadcasting environments, mobile clients cannot receive data reliably over broadcast channels because a reliable transmission protocol is not applicable to such channels. If these broadcast errors are not properly handled by a concurrency control algorithm, the consequences can be fatal, especially when clients are permitted to issue update transactions. However, the effects of broadcast errors on concurrency control have received little research attention. In this paper, we propose a concurrency control algorithm to support update transactions issued by mobile clients, and we evaluate its performance with an analytic model, focusing on the effects of broadcast errors. The analytic results show that our algorithm is efficient in resolving broadcast errors.
18.
Gerardo Canfora, Raffaele Esposito 《Journal of Systems and Software》2008,81(10):1754-1769
QoS-aware dynamic binding of composite services provides the capability of binding each service invocation in a composition to a service chosen among a set of functionally equivalent ones to achieve a QoS goal, for example minimizing the response time while keeping the price below a maximum value. This paper proposes a QoS-aware binding approach based on Genetic Algorithms. The approach includes a feature for early run-time re-binding whenever the actual QoS deviates from initial estimates, or when a service is not available. The approach has been implemented in a framework and empirically assessed through two different service compositions.
19.
This paper shows how to tailor a game-theoretic approach to the issue of distributed resource allocation in a CDMA multiple-access wireless network with different quality of service constraints. According to the nature of the terminals (either fixed/vehicular or mobile/battery-powered, in one respect, and either primary or secondary in another), each user pursues a different goal in the network. Game theory is used as an expedient tool to ensure optimum coexistence of users with highly conflicting interests. In the proposed game, after an initial centralized stage of admission control, each user is allowed to jointly set its transmit power and data rate according to a utility-maximizing criterion, where the utility is defined as the ratio of the throughput to the transmit power. The noncooperative Nash solution of the game is investigated and closed-form expressions for this equilibrium are derived and compared with numerical results for a decentralized resource control algorithm.
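The utility described above, throughput divided by transmit power, can be written down in a few lines. This is a minimal sketch under stated assumptions: the efficiency function below, a packet success probability of the form (1 - e^(-SINR))^M, is a common modelling choice in the power-control literature and is not taken from this paper, and the names `efficiency`, `utility`, and `packet_bits` are hypothetical.

```python
import math


def efficiency(sinr: float, packet_bits: int = 100) -> float:
    """Assumed probability that a packet of packet_bits bits arrives intact.

    The form (1 - exp(-sinr)) ** packet_bits is an illustrative choice,
    not the function used in the paper.
    """
    return (1.0 - math.exp(-sinr)) ** packet_bits


def utility(rate_bps: float, power_w: float, sinr: float,
            packet_bits: int = 100) -> float:
    """Utility in bits per joule: throughput divided by transmit power."""
    if power_w <= 0:
        raise ValueError("transmit power must be positive")
    throughput = rate_bps * efficiency(sinr, packet_bits)
    return throughput / power_w


if __name__ == "__main__":
    # At a fixed SINR, spending twice the power halves the utility.
    print(utility(1e4, 0.1, 8.0), utility(1e4, 0.2, 8.0))
```

In a real network the SINR itself depends on every user's transmit power, which is what couples the users' choices and makes the problem a game; the sketch treats the SINR as a given input.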
20.
Recent embedded systems integrate a growing number of intellectual property cores into increasingly large designs. Implementation, prototyping, and verification of such large systems have become very challenging. One of the reasons is that chip/FPGA resources are limited, and therefore it is not always possible to implement the whole design in traditional system-on-a-chip solutions. The state of the art is to partition such systems into smaller sub-systems and implement each on a separate chip. Consequently, the separate chips/FPGAs must be interconnected. Since Networks-on-Chip (NoCs) have become common interconnection solutions in embedded designs, we propose to bridge NoC-based SoCs, enabling a generic multi-chip system interconnection. In this context, the contribution of this paper is threefold: (i) we explore the NoC protocol stack to determine the best layer for implementing the off-chip bridge, (ii) we propose a generic hardware architecture for the bridge, and (iii) we develop a new software architecture enabling seamless configuration and communication of multi-chip NoC-based SoCs. Finally, we demonstrate the performance, i.e., bandwidth and latency, of the bridge in a multi-FPGA platform, while the bridge guarantees QoS of traffic. The synthesis results indicate that the implementation area cost of the bridge is only 1% of a Xilinx Virtex6 FPGA.