
1.

A new approach for designing fault-tolerant WDM networks

Arunita Subir Yash 《Computer Networks》2008,52(18):3421-3432

In recent years, path protection has emerged as a widely accepted technique for designing survivable WDM networks. This approach is attractive, since it is able to provide bandwidth guarantees in the presence of link failures. However, it requires allocating resources for backup lightpaths, which remain idle under normal fault-free conditions. In this paper, we introduce a new approach for designing fault-tolerant WDM networks, based on the concept of survivable routing. Survivable routing of a logical topology ensures that the lightpaths are routed in such a way that a single link failure does not disconnect the network. When a topology is generated using our approach, it is guaranteed to have a survivable routing. We further ensure that the logical topology is able to handle the entire traffic demand after any single link failure. We first present an ILP that optimally designs a survivable logical topology, and then propose a heuristic for larger networks. Experimental results demonstrate that this new approach is able to provide guaranteed bandwidth, and is much more efficient in terms of resource utilization, compared to both dedicated and shared path protection.
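The survivability condition in this abstract (no single physical link failure may disconnect the logical topology) can be checked by brute force. The following is a minimal sketch, not the paper's ILP or heuristic; the function names, the `link` helper, and the encoding of a routing as a dictionary from lightpath to physical links are all assumptions made for illustration.

```python
from collections import defaultdict


def link(a, b):
    """Hypothetical encoding of an undirected physical link."""
    return frozenset((a, b))


def is_connected(nodes, edges):
    """Return True if the undirected graph (nodes, edges) is connected."""
    nodes = list(nodes)
    if not nodes:
        return True
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def is_survivable(nodes, routing):
    """routing maps each lightpath (u, v) to the physical links it traverses.

    The routing is survivable if, for every physical link in use, the
    lightpaths that avoid that link still connect all logical nodes.
    """
    physical_links = {l for path in routing.values() for l in path}
    for failed in physical_links:
        surviving = [lp for lp, path in routing.items() if failed not in path]
        if not is_connected(nodes, surviving):
            return False
    return True


nodes = ["A", "B", "C", "D"]
# Logical ring where every lightpath uses only its own direct physical link.
good = {
    ("A", "B"): [link("A", "B")],
    ("B", "C"): [link("B", "C")],
    ("C", "D"): [link("C", "D")],
    ("D", "A"): [link("D", "A")],
}
# Same logical ring, but three lightpaths share the physical link A-B.
bad = {
    ("A", "B"): [link("A", "B")],
    ("B", "C"): [link("B", "A"), link("A", "D"), link("D", "C")],
    ("C", "D"): [link("C", "D")],
    ("D", "A"): [link("D", "C"), link("C", "B"), link("B", "A")],
}
print(is_survivable(nodes, good), is_survivable(nodes, bad))
```

In the second routing a failure of the physical link A-B takes down three lightpaths at once, which is the situation a survivable routing is meant to rule out.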

2.

Architectural support for cooperative multiuser interfaces

Bentley R. Rodden T. Sawyer P. Sommerville I. 《Computer》1994,27(5):37-46

Computer support for cooperative work requires the construction of applications that support interaction by multiple users. The highly dynamic and flexible nature of cooperative work makes the need for rapid user-interface prototyping a central concern. We have designed and developed a software architecture that provides mechanisms to support rapid multiuser-interface construction and distributed user-interface management. Rapid prototyping requires mechanisms that make the information determining interface configuration visible, accessible, and tailorable. We developed the architecture as part of a project investigating support for the cooperative work of air traffic controllers. Extensive use of prolonged ethnographic investigation helped to uncover the nature of cooperation in air traffic control. The aim of the architecture is to support an environment in which a multidisciplinary team can experiment with a wide range of alternate user-interface designs for air traffic controllers. Thus, we use examples from this domain to illustrate the architecture.

3.

Architectural support for efficient multicasting in irregular networks

Sivaram R. Kesavan R. Panda D.K. Stunkel C.B. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(5):489-513

Parallel computing on networks of workstations is fast becoming a cost-effective high-performance computing alternative to MPPs. Such a computing environment typically consists of processing nodes interconnected through a switch-based irregular network. Many of the problems that were solved for regular networks have to be solved anew for these systems. One such problem is that of efficient multicast communication. In this paper, we propose two broad categories of schemes for efficient multicasting in such irregular networks: network interface-based (NI-based) and switch-based. The NI-based multicasting schemes use the network interface of intermediate destinations for absorbing and retransmitting messages to other destinations in the multicast tree. In contrast, the switch-based multicasting schemes use hardware support for packet replication at the switches of the network and a concept known as multidestination routing to convey a multicast message from one source to multiple destinations. We first present alternative schemes for efficient multipacket forwarding at the NI and derive an optimal k-binomial multicast tree for multipacket NI-based multicast. We then propose two switch-based multicasting schemes that differ in the power of the encoding scheme and the complexity of the decoding logic at the switches. These multicasting schemes use path-based multidestination worms that can cover all nodes connected to switches along a valid unicast path and tree-based multidestination worms that can cover entire destination sets in a single phase using one worm, respectively. For each scheme, we describe the associated header encoding and decoding operation, the method for deriving multidestination worms that cover arbitrary multicast destination sets, and the multicasting scheme using the derived multidestination worms.

4.

A formal specification framework for object-oriented distributed systems

Buchs D. Guelfi N. 《IEEE Transactions on Software Engineering》2000,26(7):635-652


5.

Guest editors' introduction: challenges in designing fault-tolerant routing in networks

《Parallel and Distributed Systems, IEEE Transactions on》1999,10(10):961-963


6.

SEDOS: designing open distributed systems

Diaz M. Vissers C. 《Software, IEEE》1989,6(6):24-33

The aim of the ESPRIT SEDOS (software environment for the design of open systems) project is to further develop Estelle, the extended state-transition language, and LOTOS, the language for temporal ordering specifications, to describe services and protocols for distributed architectures and to demonstrate their effectiveness as concretely as possible by deriving simulators and other support tools. The Estelle language is based on extended state machines that communicate through infinite FIFO links. The LOTOS language is based on a temporal ordering of events and rendezvous.

7.

An index-based checkpointing algorithm for autonomous distributed systems

Baldoni R. Quaglia F. Fornara P. 《Parallel and Distributed Systems, IEEE Transactions on》1999,10(2):181-192

This paper presents an index-based checkpointing algorithm for distributed systems with the aim of reducing the total number of checkpoints while ensuring that each checkpoint belongs to at least one consistent global checkpoint (or recovery line). The algorithm is based on an equivalence relation defined between pairs of successive checkpoints of a process, which allows us, in some cases, to advance the recovery line of the computation without forcing checkpoints in other processes. The algorithm is well-suited for autonomous and heterogeneous environments, where no process knows any private information about other processes, and where private information of the same type held by distinct processes is unrelated (e.g., clock granularity, local checkpointing strategy, etc.). We also present a simulation study that compares the checkpointing-recovery overhead of this algorithm with that of previous solutions.

8.

Architectural support for thread communications in multi-core processors

Sevin Varoglu Stephen Jenks 《Parallel Computing》2011,37(1):26-41

In the ongoing quest for greater computational power, efficiently exploiting parallelism is of paramount importance. Architectural trends have shifted from improving single-threaded application performance, often achieved through instruction level parallelism (ILP), to improving multithreaded application performance by supporting thread level parallelism (TLP). Thus, multi-core processors incorporating two or more cores on a single die have become ubiquitous. To achieve concurrent execution on multi-core processors, applications must be explicitly restructured to exploit parallelism, either by programmers or compilers. However, multithreaded parallel programming may introduce overhead due to communications among threads. Though some resources are shared among processor cores, current multi-core processors provide no explicit communications support for multithreaded applications that takes advantage of the proximity between cores. Currently, inter-core communications depend on cache coherence, resulting in demand-based cache line transfers with their inherent latency and overhead. In this paper, we explore two approaches to improve communications support for multithreaded applications. Prepushing is a software controlled data forwarding technique that sends data to destination’s cache before it is needed, eliminating cache misses in the destination’s cache as well as reducing the coherence traffic on the bus. Software Controlled Eviction (SCE) improves thread communications by placing shared data in shared caches so that it can be found in a much closer location than remote caches or main memory. Simulation results show significant performance improvement with the addition of these architecture optimizations to multi-core processors.

9.

Grid-based switch fabrics: a new approach in designing fault-tolerant ATM switches

《Computer Communications》2001,24(15-16):1589-1606

ATM is the switching and multiplexing technology chosen for the implementation of B-ISDN, because of its superiority in fast packet switching. However, the use of ATM switches with a large number of input and output ports has proven to be a bottleneck in wide area ATM networks. In this paper, we propose a new space-division grid-based ATM architecture with fault-tolerant characteristics and a minimal number of switching elements (SEs).

10.

An efficient protocol for checkpointing recovery in distributed systems

Kim J.L. Park T. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(8):955-960

The authors present an efficient synchronized checkpointing protocol that exploits the dependency relation between processes in distributed systems. In this protocol, a process takes a checkpoint when it knows that all processes on which it computationally depends have taken their checkpoints; hence the process need not always wait for the decision made by the checkpointing coordinator, as in conventional synchronized protocols. As a result, the checkpointing coordination time is substantially reduced and the possibility of total abort of the checkpointing coordination is reduced.
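The local rule described in this abstract (checkpoint once every process you depend on has checkpointed) can be illustrated with a small simulation. This is only a sketch of the ordering the rule induces, not the authors' protocol: it ignores message exchange and the coordinator, assumes the dependency relation is acyclic and known up front, and the function name `checkpoint_rounds` is hypothetical.

```python
def checkpoint_rounds(depends_on):
    """Group processes into rounds in which they may checkpoint.

    depends_on maps each process name to the set of processes it
    computationally depends on. A process becomes ready as soon as all
    of its dependencies have checkpointed in an earlier round.
    """
    done = set()
    pending = set(depends_on)
    rounds = []
    while pending:
        ready = sorted(p for p in pending if depends_on[p] <= done)
        if not ready:
            # Assumption of this sketch: cyclic or unknown dependencies
            # are treated as an error rather than resolved by a coordinator.
            raise ValueError("no process can proceed")
        rounds.append(ready)
        done.update(ready)
        pending.difference_update(ready)
    return rounds


deps = {
    "P1": set(),
    "P2": {"P1"},
    "P3": {"P1", "P2"},
    "P4": set(),
}
print(checkpoint_rounds(deps))
```

Processes with no dependencies (here P1 and P4) checkpoint immediately and independently of each other, which is the intuition behind the reduced coordination time claimed in the abstract.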

11.

An architecture for survivable coordination in large distributed systems

Malkhi D. Reiter M.K. 《Knowledge and Data Engineering, IEEE Transactions on》2000,12(2):187-202

Coordination among processes in a distributed system can be rendered very complex in a large-scale system where messages may be delayed or lost and when processes may participate only transiently or behave arbitrarily, e.g. after suffering a security breach. In this paper, we propose a scalable architecture to support coordination in such extreme conditions. Our architecture consists of a collection of persistent data servers that implement simple shared data abstractions for clients, without trusting the clients or even the servers themselves. We show that, by interacting with these untrusted servers, clients can solve distributed consensus, a powerful and fundamental coordination primitive. Our architecture is very practical, and we describe the implementation of its main components in a system called Fleet.

12.

Observer-a concept for formal on-line validation of distributed systems

《IEEE Transactions on Software Engineering》1994,20(12):900-913


13.

Architectural support for goal management in flat concurrent Prolog

Alkalaj L. Lang T. Ercegovac M.D. 《Computer》1992,25(8):34-47

Special-purpose architectural support that reduces the goal-management execution time in flat concurrent Prolog (FCP) is described. The architectural support consists of a dedicated goal-management unit that executes high-level goal-management operations concurrently with goal-reduction operations. The efficient execution of goal-management instructions is realized using a goal cache that stores recently spawned goals. It is shown that operations such as goal switching, spawning, and halting are efficiently performed by changing their status in the goal cache. More complex operations, such as goal suspension and activation, are decoupled from goal reduction by using two suspension tables and activation queues. Using an analytic performance model, it is shown that, for the systems development workload, which consists of large FCP programs, the overhead of software-implemented goal management is 50% of the program execution time. This is reduced to as little as 3% using the goal-management unit and the goal cache, resulting in a speedup of almost 2. The results are generalized for other workloads that exhibit different goal-management complexities.
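The "speedup of almost 2" follows from the two overhead figures by simple arithmetic. The derivation below is a sketch under two assumptions not stated in the abstract: goal-reduction time is unchanged by the hardware support, and the 3% overhead is measured against the new execution time. The symbols $T_{\text{old}}$, $T_{\text{new}}$, $T_{\text{red}}$ and $T_{\text{gm}}$ are introduced here for illustration.

```latex
\begin{align*}
T_{\text{old}} &= T_{\text{red}} + T_{\text{gm}}, \qquad
T_{\text{gm}} = 0.5\,T_{\text{old}} \;\Rightarrow\; T_{\text{red}} = 0.5\,T_{\text{old}} \\
T_{\text{red}} &= (1 - 0.03)\,T_{\text{new}} \;\Rightarrow\;
T_{\text{new}} = \frac{0.5}{0.97}\,T_{\text{old}} \approx 0.515\,T_{\text{old}} \\
\text{speedup} &= \frac{T_{\text{old}}}{T_{\text{new}}} = \frac{0.97}{0.5} = 1.94
\end{align*}
```

If the 3% were instead measured against the original execution time, the same reasoning would give $1 / 0.53 \approx 1.89$, which is also consistent with "almost 2".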

14.

Architectural support for inter-stream communication in an MSIMD system

Vivek Garg David E. Schimmel 《Future Generation Computer Systems》1995,11(6):617-629

This paper considers hardware support for the exploitation of control parallelism on data parallel architectures. It is well known that data parallel algorithms may also possess control parallel structure. However, the splitting of control leads to data dependency and synchronization issues that were implicitly handled in conventional SIMD architectures. These include synchronization of access to scalar and parallel variables, and synchronization for parallel communication operations. We propose a sharing mechanism for scalar variables and identify a strategy which allows synchronization of scalar variables between multiple streams. The techniques considered are based on a bit-interleaved register file structure which allows fast copy between register sets. Hardware cost estimates and timing analyses are provided, and comparison with an alternate scheme is presented. The register file structure has been designed and simulated for the HP 0.8μm CMOS process, and circuit simulation indicates access times are less than six nanoseconds. In addition, the impact of this structure on system performance is also studied.

15.

Replica management for fault-tolerant systems

Cherif A. Katayama T. 《Micro, IEEE》1998,18(5):54-65

Replication has traditionally been used to assure fault tolerance. Active parallel replication is a new technique that, in addition to ensuring fault tolerance, dramatically reduces computation time and cost.

16.

Dynamic configuration and fault tolerance of component-based distributed software

Cao Min, Wu Gengfeng, Xu Baiyu, Song Yi 《Computer Engineering and Applications》(计算机工程与应用) 2004,40(6):100-104

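This paper proposes a new structured approach that supports fault tolerance in component-based distributed software through dynamic configuration. Using a graph-oriented programming model, the software architecture of component-based distributed software can be represented as a logical graph, which can be refined into an explicit object and distributed over the network. Dynamic configuration of the software is carried out by executing a series of operations defined on the graph, and fault tolerance is supported by dynamically reconfiguring the software when a fault occurs. The paper describes the basic model of the approach, the system architecture, and a prototype implementation on CORBA.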

17.

A user-centred approach for designing driving support systems: the case of collision avoidance

P. C. Cacciabue M. Martinetto 《Cognition, Technology & Work》2006,8(3):201-214

The work described in this paper is focused on an approach for implementing, in real working contexts, the guidelines of user-centred design contained in formal standards and in many research studies. The application concerns the EUCLIDE project (enhanced human–machine interface for on vehicle integrated driving support system), which aimed at developing a driving support system to avoid collisions with obstacles in reduced visibility conditions. The design of the system followed a user-centred approach which started by identifying the model of cognition to be applied throughout the whole design process. The definition of the warning strategies of the system was first analysed with the aim of achieving the best balance between a totally supportive system and a non-disturbing system. Then an initial set of design solutions for the human–machine interface was tested in a static driving simulator. A second set of possible interfaces was evaluated in a dynamic simulator before developing a final design. This solution was implemented in two real vehicles and tested in real traffic situations. This paper describes the whole design process and concentrates on the final step, the “in-vehicle” integration process. The road tests performed at the end of the whole process are discussed in detail, focusing on the safety implications associated with the design solution finally selected and implemented.


18.

Context-dependent awareness support in open collaboration environments

Liliana Ardissono Gianni Bosio 《User Modeling and User-Adapted Interaction》2012,22(3):223-254

The widespread adoption of online services for performing work, home and leisure tasks enables users to operate in the ubiquitous environment provided by the Internet by managing a possibly high number of parallel (private and shared) activity contexts. The provision of awareness information is a key factor for keeping users up-to-date with what happens around them; e.g., with the operations performed by their collaborators. However, the delivery of notifications describing the events that have occurred can interrupt the users’ activities, with a possibly disruptive effect on their emotional and attentional states. As a possible solution to the trade-off between informing and interrupting users, we defined two context-dependent notification management policies which support the selection of the notifications to be delivered on the basis of the user’s current activities, at different granularity levels: general collaboration context versus task carried out. These policies are offered by the COntext depeNdent awaReness informAtion Delivery (CONRAD) framework. We tested such policies with users by applying them in a collaboration environment that includes a set of widely used Web 2.0 services. The experiments show that our policies reduce the levels of workload on users while supporting an up-to-the-moment understanding of the interaction with their shared contexts. This paper presents the CONRAD framework and the techniques underlying the proposed notification policies.

19.

Software-based rerouting for fault-tolerant pipelined communication

Young-Joo Suh Dao B.V. Yalamanchili S. 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(3):193-211

This paper presents a software-based approach to fault-tolerant routing in networks using wormhole or virtual cut-through switching. When a message encounters a faulty output link, it is removed from the network by the local router and delivered to the messaging layer of the local node's operating system. The message passing software can reroute this message, possibly along nonminimal paths. Alternatively, the message may be addressed to an intermediate node, which will forward the message to the destination. A message may encounter multiple faults and pass through multiple intermediate nodes. The proposed techniques are applicable to both obliviously and adaptively routed networks. The techniques are specifically targeted toward commercial multiprocessors where the mean time to repair (MTTR) is much smaller than the mean time between router failures (MTBF), i.e., it is sufficient to tolerate a maximum of three failures. This paper presents requirements for buffer management, deadlock freedom, and livelock freedom. Simulation results are presented to evaluate the degradation in latency and throughput as a function of the number and distribution of faults. There are several advantages of such an approach. Router designs are minimally impacted, and thus remain compact and fast. Only messages that encounter faulty components are affected, while the machine is ensured of continued operation until the faulty components can be replaced. The technique leverages existing network technology, and the concepts are portable across evolving switch and router designs. Therefore, we feel that the technique is a good candidate for incorporation into the next generation of multiprocessor networks.
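The idea in this abstract (hardware routes normally; on a faulty link the message is absorbed and software chooses a new, possibly nonminimal path) can be sketched on a 2D mesh. This is an illustration under several assumptions not taken from the paper: dimension-order (XY) routing stands in for the hardware router, the software layer is assumed to know all faulty links and uses breadth-first search, and buffer management, deadlock and livelock are ignored. All function names are hypothetical.

```python
from collections import deque


def xy_next_hop(cur, dst):
    """Dimension-order routing: correct the x coordinate first, then y."""
    (x, y), (dx, dy) = cur, dst
    if x != dx:
        return (x + (1 if dx > x else -1), y)
    return (x, y + (1 if dy > y else -1))


def neighbors(node, width, height):
    """Yield the mesh neighbours of node that lie inside the grid."""
    x, y = node
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def software_reroute(src, dst, faulty, width, height):
    """Breadth-first search over healthy links; returns a node path or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in neighbors(u, width, height):
            if v not in prev and frozenset((u, v)) not in faulty:
                prev[v] = u
                queue.append(v)
    return None


def route(src, dst, faulty, width, height):
    """Follow XY routing until a faulty link is met, then reroute in software."""
    path = [src]
    cur = src
    while cur != dst:
        nxt = xy_next_hop(cur, dst)
        if frozenset((cur, nxt)) in faulty:
            # The message is absorbed at cur and handed to the software layer.
            detour = software_reroute(cur, dst, faulty, width, height)
            return None if detour is None else path + detour[1:]
        path.append(nxt)
        cur = nxt
    return path


faulty = {frozenset(((1, 0), (2, 0)))}
print(route((0, 0), (2, 0), faulty, 3, 3))
```

Only the message that actually hits the faulty link pays for the detour; a message whose XY path avoids the fault is routed exactly as in the fault-free case.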

20.

Resource allocation for primary-site fault-tolerant systems

Huang Y. Tripathi S.K. 《IEEE Transactions on Software Engineering》1993,19(2):108-119

Resource allocation for a distributed system employing the primary site approach for fault tolerance is discussed. Two kinds of systems are considered. The first consists of fault-tolerant nodes where each node has many duplicated servers. One server is the primary, which serves user requests, and the rest are backup. The second does not have fault-tolerant nodes. To tolerate node failures, each node uses other nodes as backups. When a node fails, all requests initially allocated to the node are served by one of its backups. To study the resource allocation for such systems, an approximate model for each system is developed. Using these models, efficient allocation algorithms that take into account the failure/repair rates of the system and the fault-tolerant overheads are presented. Using experimental results, it is shown that the algorithms give the optimal or suboptimal allocations. The algorithms, which incur little overhead, can improve the system performance significantly over an intuitive allocation algorithm.
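The failover behaviour of the second kind of system (a failed node's requests are served by one of its backups) can be sketched as a load calculation. This is an illustration only, not the paper's allocation algorithm: it assumes each node has an ordered backup list and that the first live backup takes over the whole load, whereas the paper's algorithms account for failure/repair rates and fault-tolerance overheads. The name `effective_load` and the data layout are hypothetical.

```python
def effective_load(allocation, backups, failed):
    """Return the request load each live node carries after failover.

    allocation: node -> number of requests initially allocated to it
    backups:    node -> ordered list of backup nodes (assumed layout)
    failed:     set of nodes that are currently down
    """
    load = {n: r for n, r in allocation.items() if n not in failed}
    for node in failed:
        # Assumption of this sketch: the first live backup absorbs everything.
        target = next((b for b in backups.get(node, []) if b not in failed), None)
        if target is None:
            raise RuntimeError(f"no live backup for {node}")
        load[target] = load.get(target, 0) + allocation.get(node, 0)
    return load


allocation = {"n1": 10, "n2": 20, "n3": 5}
backups = {"n1": ["n2", "n3"], "n2": ["n3", "n1"], "n3": ["n1", "n2"]}
print(effective_load(allocation, backups, {"n1"}))
```

The example makes the trade-off visible: a node that serves as first backup for a heavily loaded primary can end up overloaded after a failure, which is why the initial allocation matters.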