Found 20 similar documents (search time: 0 ms)
1.
Response time is an important design criterion for real-time systems. A new analytic model is developed to estimate task response time. It considers such factors as interprocessor communication, module precedence relationships, module scheduling, interconnection network delay, and the assignment of modules and files to computers. Since module assignment and replication have a great impact on task response time, a new algorithm is developed to iteratively search for module assignments and replications that reduce task response time. An objective function is introduced that is based on the sum of task response time and a delay penalty for violations of thread response time requirements. With this objective function, the proposed algorithm can determine good module allocations and replications that minimize task response time while satisfying the thread response time requirements. To validate the algorithm, we compare the assignments it generates for some sample distributed systems against the optimal module assignments obtained from exhaustive search. The comparison shows that, with a very small number of initial module assignments, our algorithm generates optimal or close-to-optimal assignments. The algorithm is also applied to a real-time distributed system for space defense applications, where exhaustive search for the optimal assignment is not feasible. The generated module assignments (with replications) satisfy the specified thread response times and compare closely with the simulation results. A series of experiments is also performed to characterize the behavior of the algorithm. In conclusion, the algorithm can serve as a valuable tool for assigning modules with replications in distributed systems.
2.
There are few studies of deadlock resolution in the real-time distributed database environment. A specially designed distributed deadlock detection algorithm, the Enhanced Probe-Based Algorithm (EPBA), was previously evaluated by the authors and found to be very effective. The present work evaluates the EPBA in a firm real-time distributed database environment. The study also compares its performance with that of other existing deadlock resolution algorithms, such as the timeouts algorithm and the global sequential locking algorithm. Results indicate that, under high data contention and with slack transaction deadlines, the EPBA approach outperforms all the other deadlock resolution methods.
3.
An optimum distributed architecture with fault-tolerance capabilities for a given software application may be obtained by allowing allocation algorithms to evolve without any existing-hardware constraint. Distributed software partitioning and allocation is performed using the simulated annealing optimization algorithm. To define the cost function used by the optimization algorithm, a model of the interacting processes constituting the software application is presented. Tuning of the algorithm parameters has been considered to ensure convergence at a reasonable cost in terms of computation time.
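The annealing loop for such an allocation search can be sketched as follows. This is an illustrative sketch only: the move operator (reassign one module), the cooling parameters, and the `cost` callback are generic assumptions, not the authors' actual model.

```python
import math
import random

def anneal_assignment(n_modules, n_procs, cost, t0=10.0, cooling=0.95,
                      steps=2000, seed=0):
    """Simulated-annealing search over module-to-processor assignments.

    cost(assignment) -> float is a user-supplied objective (e.g. execution
    load plus inter-processor communication volume); lower is better.
    """
    rng = random.Random(seed)
    current = [rng.randrange(n_procs) for _ in range(n_modules)]
    cur_cost = cost(current)
    best, best_cost = list(current), cur_cost
    temp = t0
    for _ in range(steps):
        # Neighbor: move one randomly chosen module to a random processor.
        m = rng.randrange(n_modules)
        old = current[m]
        current[m] = rng.randrange(n_procs)
        new_cost = cost(current)
        delta = new_cost - cur_cost
        # Accept improving moves always; accept worsening moves with
        # probability exp(-delta/temp), which shrinks as temp decays.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost
        else:
            current[m] = old  # reject: undo the move
        temp *= cooling       # geometric cooling schedule
    return best, best_cost
```

Convergence behavior depends on the cooling rate and step budget, which is exactly the parameter-tuning trade-off the abstract mentions.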
4.
The authors address the issue of optimal design (in terms of the number of processors) of a distributed system based on a recursive algorithm for fault tolerance (RAFT). The reliability and performance of the system using RAFT are determined as a function of the reliability of individual processors and the number of fault modes in a processor. Also discussed is how to determine design policies when the objective is to minimize the average system failure. Several numerical examples illustrate the results.
5.
Two tightly coupled multicomputer testbeds, one providing efficient inter-node communications tailored to the application and the other providing more flexible full connectivity among processors and memories, are used to support validation of design techniques for distributed real-time systems. The testbeds are valuable tools for evaluating, analyzing, and studying the behavior of many algorithms for distributed systems. We have used the testbeds to study the distributed recovery block scheme for handling hardware and software faults. A testbed has also been used to analyze database locking techniques and a fault-tolerant locking protocol for recovery from faults that occur during updating of replicated copies of files in tightly coupled distributed systems. Testbeds can be configured to represent operating environments and input scenarios more accurately than software simulation. Therefore, testbed-based evaluation provides more accurate results than simulation and yields greater insight into the characteristics and limitations of proposed concepts. This is an important advantage in the complex field of distributed real-time system design evaluation and validation, and it makes testbed-based experimentation an effective approach to validating system concepts and design techniques for distributed real-time applications.
6.
This work is based on the observation that existing energy management techniques for mobile devices, such as Dynamic Voltage Scaling (DVS), are non-cooperative in the sense that they reduce the energy consumption of a single device, disregarding potential consequences for other constraints (e.g., end-to-end deadlines) and/or other devices (e.g., energy consumption on neighboring devices). This paper argues that energy management in distributed wireless real-time systems has to be end-to-end in nature, requiring a coordinated approach among communicating devices. A cooperative distributed energy management technique (Co-DVS) is proposed that (1) adapts and maintains end-to-end latencies within specified timeliness requirements (deadlines) and (2) enhances energy savings at the devices with the highest pay-off factors, which represent the relative benefits or significance of conserving energy at a device. The proposed technique employs a feedback-based approach to dynamically distribute end-to-end slack among the devices based on their pay-off factors.
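The slack-distribution step can be sketched in a few lines. The proportional rule and the function below are illustrative assumptions, not the paper's actual feedback controller:

```python
def distribute_slack(deadline, measured_latencies, payoff):
    """Split end-to-end slack among devices in proportion to pay-off factors.

    deadline           -- end-to-end deadline for the task chain
    measured_latencies -- per-device latency from the last sampling period
    payoff             -- per-device pay-off factor (relative benefit of
                          saving energy at that device)
    Returns per-device latency budgets for the next period.
    """
    slack = deadline - sum(measured_latencies)
    total = sum(payoff)
    # A device's budget is its measured latency plus its weighted slack
    # share; a device with a larger pay-off factor may then lower its
    # voltage/frequency further while the chain still meets the deadline.
    return [lat + slack * w / total
            for lat, w in zip(measured_latencies, payoff)]
```

For example, with a 100 ms deadline, measured latencies of 30 ms and 40 ms, and pay-off factors 3 and 1, the 30 ms of slack is split 22.5/7.5, so the high-pay-off device receives the larger budget.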
7.
ISO 11898 is a communication protocol based on the carrier sense multiple access with collision detection and arbitration on message priority (CSMA/CD+AMP) technique, which is currently widely used as a real-time network in industrial environments. Unfortunately, because of the peculiarities of the arbitration technique it adopts, it suffers from severe limitations on the maximum extension of the network, which cannot be overcome simply through improvements in transceiver technology, as they depend on the limited propagation speed of the signals on the communication medium. In this paper, a new kind of network is presented that behaves very similarly to ISO 11898 but allows noticeably larger areas to be covered without having to reduce the bit rate. It relies on a tree topology and adopts a new multistage hierarchical distributed arbitration technique that properly takes the increased propagation delays into account.
8.
P. Pop, P. Eles, Z. Peng, T. Pop, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2004, 12(8):793-811
In this paper, we present an approach to the mapping and scheduling of distributed embedded systems for hard real-time applications, aiming at minimizing the system modification cost. We consider an incremental design process that starts from an already existing system running a set of applications. We are interested in implementing new functionality such that the timing requirements are fulfilled and the following two requirements are also satisfied: 1) the already running applications are disturbed as little as possible, and 2) there is a good chance that new functionality can later be easily added to the resulting system. Thus, we propose a heuristic that finds the set of already running applications which have to be remapped and rescheduled together with the mapping and scheduling of the new application, such that the disturbance to the running system (expressed as the total cost implied by the modifications) is minimized. Once this set of applications has been determined, we outline a mapping and scheduling algorithm aimed at fulfilling the requirements stated above. The approaches have been evaluated through extensive experiments using a large number of generated benchmarks as well as a real-life example.
9.
A channel allocation algorithm in a cellular network consists of two parts: a channel acquisition algorithm and a channel selection algorithm. Some previous work in this field focused on centralized approaches to allocating channels, but centralized approaches are neither scalable nor reliable. Recently, distributed dynamic channel allocation algorithms have been proposed, and they have gained considerable attention due to their high reliability and scalability. In most of these algorithms, however, the cell that wants to borrow a channel has to wait for replies from all its interference neighbors and hence is not fault-tolerant. In this paper, we propose a new algorithm that is fault-tolerant and makes full use of the available channels. It can tolerate the failure of mobile nodes as well as static nodes without any significant degradation in service.
10.
Dynamic fault-tree models for fault-tolerant computer systems
Reliability analysis of fault-tolerant computer systems for critical applications is complicated by several factors. Systems designed to achieve high levels of reliability frequently employ high levels of redundancy, dynamic redundancy management, and complex fault and error recovery techniques. This paper describes dynamic fault-tree modeling techniques for handling these difficulties. Three advanced fault-tolerant computer systems are described: a fault-tolerant parallel processor, a mission avionics system, and a fault-tolerant hypercube. Fault-tree models for their analysis are presented. HARP (Hybrid Automated Reliability Predictor) is a software package developed at Duke University and NASA Langley Research Center that can solve these fault-tree models.
11.
One notable advantage of the Model-Driven Architecture (MDA) method is that software developers can perform sufficient analysis and testing of software models in the design phase, which helps build high confidence in the expected software behavior and performance, especially for safety-critical real-time software. Most existing literature on reliability analysis ignores the effects of task deadline requirements, which are critical properties of real-time software and thus cannot be ignored. Considering the contradictory relationship between deadline requirements and the time costs of fault tolerance in real-time tasks, we present in this paper a novel reliability model, which takes schedulability as one of the major factors affecting reliability, to analyze the reliability of the task execution model in the real-time software design phase. The tasks in this reliability model have no restrictions on their distribution and thus may be distributed on a multiprocessor or on a distributed system. Furthermore, the tasks also define fault arrival rates and fault-tolerance mechanisms to model the occurrence of non-permanent faults and the corresponding time costs of fault handling. By analyzing the probability that tasks remain schedulable in the worst-case execution scenario with faults occurring, reliability and schedulability are combined into a unified analysis framework, and two algorithms for reliability analysis are given. To make this reliability model more pragmatic, we also present a technique for estimating the fault arrival rate of each task. Through two case studies, we show the detailed derivation process under static-priority scheduling in a multiprocessor system and in the design process of avionics software, and then analyze the factors affecting the reliability analysis through simulation experiments. When no assumptions about fault occurrences are made on the task model, this reliability model reduces to a generic schedulability model.
12.
James C. Browne, Allen Emerson, Mohamed Gouda, Daniel Miranker, Aloysius Mok, Louis Rosier, Telematics and Informatics, 1990, 7(3-4):441-454
We introduce two systems concepts in the context of rule-based programs: bounded response time and self-stabilization. These concepts are essential for the design of rule-based programs that must be highly fault-tolerant and perform in a real-time environment. The mechanical analysis of programs for these two properties is discussed. We have also applied our techniques to analyze a NASA application.
13.
J. Lach, W.H. Mangione-Smith, M. Potkonjak, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 1998, 6(2):212-221
Fault tolerance is an important system metric for many operating environments, from automotive to space exploration. The conventional technique for improving system reliability is component replication, which usually comes at significant cost: increased design time, testing, power consumption, volume, and weight. We have developed a new fault-tolerance approach that capitalizes on the unique reconfiguration capabilities of field-programmable gate arrays (FPGAs). The physical design is partitioned into a set of tiles. In response to a component failure, a functionally equivalent tile that does not rely on the faulty component replaces the affected tile. Unlike application-specific integrated circuit (ASIC) and microprocessor design methods, which result in fixed structures, this technique allows a single physical component to provide redundant backup for several types of components. Experimental results on a subset of the MCNC benchmarks demonstrate a high level of reliability with low timing and hardware overhead.
14.
C.D. Gill, R.K. Cytron, D.C. Schmidt, Proceedings of the IEEE, 2003, 91(1):183-197
Increasingly complex requirements, coupled with tighter economic and organizational constraints, are making it hard to build complex distributed real-time embedded (DRE) systems entirely from scratch. Therefore, the proportion of DRE systems made up of commercial-off-the-shelf (COTS) hardware and software is increasing significantly. There are relatively few systematic empirical studies, however, that illustrate how suitable COTS-based hardware and software have become for mission-critical DRE systems. This paper provides the following contributions to the study of real-time quality-of-service (QoS) assurance and performance in COTS-based DRE systems: it presents evidence that flexible configuration of COTS middleware mechanisms, and the operating system (OS) settings they use, allows DRE systems to meet critical QoS requirements over a wider range of load and jitter conditions than statically configured systems; it shows that in addition to making critical QoS assurances, noncritical QoS performance can be improved through flexible support for alternative scheduling strategies; and it presents an empirical study of three canonical scheduling strategies, specifically the conditions that predict success of a strategy for a production-quality DRE avionics mission computing system. Our results show that applying a flexible scheduling framework to COTS hardware, OSs, and middleware improves real-time QoS assurance and performance for mission-critical DRE systems.
15.
A combined performance and reliability (performability) measure for gracefully degradable fault-tolerant systems is introduced, and a closed-form analytic solution is provided for computing the performability of a class of unrepairable systems that can be modeled by general acyclic Markov processes. This allows the study of models that consider the degradation of more than one type of system component, e.g., processors, memories, and buses. An efficient evaluation algorithm is provided, with an extensive analysis of its time and space complexity. A numerical example shows how the combined performance/reliability measure provides a complete evaluation of the relative merits of different multiprocessor structures.
16.
Under a voting strategy in a fault-tolerant software system, there is a difference between correctness and agreement. An independent N-version programming reliability model that distinguishes between correctness and agreement is proposed for treating small output spaces. An alternative voting strategy, consensus voting, is used to treat cases in which there can be agreement among incorrect outputs, which can occur with small output spaces. The consensus voting strategy automatically adapts the voting to various version reliability and output-space cardinality characteristics. The majority-voting strategy provides a lower bound, and the 2-out-of-n voting strategy an upper bound, on the reliability obtained by consensus voting. The reciprocal of the cardinality of the output space is a lower bound on the average reliability of fault-tolerant system versions, below which the system reliability begins to deteriorate as more versions are added.
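The three voting strategies can be contrasted in a few lines. This is a schematic sketch: tie-breaking and output-comparison details are simplified assumptions, not the model's formal definitions.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the output produced by more than half of the versions, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

def consensus_vote(outputs):
    """Return the output of the largest agreement group (plurality).

    With small output spaces, several incorrect versions can agree;
    consensus voting still adjudicates such cases, where majority voting
    returns no answer. Ties are broken arbitrarily here (Counter order).
    """
    value, _ = Counter(outputs).most_common(1)[0]
    return value

def two_out_of_n_vote(outputs):
    """Return an output on which at least two versions agree, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= 2 else None
```

For five version outputs ['a', 'a', 'b', 'c', 'd'], majority voting yields no decision (no output exceeds half), while consensus and 2-out-of-n voting both select 'a'.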
17.
Posbist reliability theory is based on the possibility assumption and the binary-state assumption. In this paper we discuss the posbist reliability behaviour of fault-tolerant systems, including cold redundant systems and warm redundant systems. In each type of fault-tolerant system, the conversion switches may behave in different ways: absolutely reliable, non-absolutely reliable with 0–1 mode, or non-absolutely reliable with continuous mode, so the system's posbist reliability behaviour varies accordingly. We express the system posbist reliability in terms of the system lifetime. When the system posbist reliability is itself a fuzzy variable, we redefine it by means of a new expression.
18.
Simon Perathoner, Ernesto Wandeler, Lothar Thiele, Arne Hamann, Simon Schliecker, Rafik Henia, Razvan Racu, Rolf Ernst, Michael González Harbour, Design Automation for Embedded Systems, 2009, 13(1-2):27-49
System level performance analysis plays a fundamental role in the design process of hard real-time embedded systems. Several different approaches have been presented so far to address the problem of accurate performance analysis of distributed embedded systems in early design stages. The existing formal analysis methods are based on essentially different concepts of abstraction. However, the influence of these different models on the accuracy of the system analysis is widely unknown, as a direct comparison of performance analysis methods has not been considered so far. We define a set of benchmarks aimed at the evaluation of performance analysis techniques for distributed systems. We apply different analysis methods to the benchmarks and compare the results obtained in terms of accuracy and analysis times, highlighting the specific effects of the various abstractions. We also point out several pitfalls for the analysis accuracy of single approaches and investigate the reasons for pessimistic performance predictions.
19.
Mean time to failure (MTTF) is one of the most frequently used dependability measures in practice. By convention, MTTF is the expected time for a system to reach any one of its failure states. For some systems, however, the mean time to absorb into a particular subset of the failure states is of interest, so the concept of conditional MTTF may well be useful. In this paper, we formalize the definitions of conditional MTTF and cumulative conditional MTTF, together with an efficient computation method in a finite-state-space Markov model. Analyses of a fault-tolerant disk array system and a fault-tolerant software structure illustrate the application of the conditional MTTF.
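For a discrete-time absorbing Markov chain, a conditional MTTF can be obtained from two small linear systems: solve for the absorption probabilities into the target failure subset, then for the expected time weighted by the absorption indicator, and normalize. The formulation below is a generic textbook-style sketch, not the paper's exact method.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def conditional_mttf(T, r, start=0):
    """Conditional mean time to absorb into a target failure subset A.

    T[i][j] -- one-step transition probability between transient states
    r[i]    -- one-step probability from transient state i into subset A
    Solves a = T a + r (absorption probabilities into A), then
    g = a + T g (expected time weighted by the absorb-in-A indicator),
    and returns g[start] / a[start], i.e. E[time | absorbed in A].
    """
    n = len(r)
    I_T = [[(1.0 if i == j else 0.0) - T[i][j] for j in range(n)]
           for i in range(n)]
    a = solve(I_T, r)   # P(absorb in A | start i)
    g = solve(I_T, a)   # E[time * 1{absorb in A} | start i]
    return g[start] / a[start]
```

For a chain in which state 0 absorbs into failure subset F1 with probability 0.5 in one step and otherwise passes through state 1 into F2, the conditional MTTF into F1 is 1 step and into F2 is 2 steps, while the unconditional MTTF is 1.5 steps.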
20.
This paper introduces the fundamentals of Cyber-Physical Systems and presents a real-time task system model for heterogeneous distributed Cyber-Physical Systems. Based on this model, and combining it with primary/backup version replication, two heuristic fault-tolerant scheduling algorithms suited to heterogeneous distributed real-time Cyber-Physical environments are proposed: the HDLMA algorithm and the HDLFA algorithm. Finally, for these two algorithms, we analyze their schedulability and load balancing, the effect of task granularity on load balancing, and the effect of the scheduling threshold on schedulability.