Similar literature
Found 20 similar documents (search time: 15 ms)
1.
Reliability of systems used in space, avionic, and biomedical applications is highly critical. Such systems consist of an analog front-end to collect data, an analog-to-digital converter (ADC) to convert the collected data to digital form, and a digital unit to process it. Though a considerable amount of research has been performed to increase the reliability of digital blocks, the same cannot be claimed for mixed-signal blocks. The reliability enhancement that we employ begins with fault-sensitivity analysis followed by redesign. The data obtained from the sensitivity analysis is used to grade blocks based on their sensitivity to faults. The highly sensitive blocks can then be replaced by more reliable alternatives. The improvement gained by opting for more robust implementations might be limited due to the number of possible implementations. In these cases, alternative reliability enhancement techniques such as adding redundancy may provide further improvements. The steps involved in the reliability enhancement of ADCs are illustrated in this paper by first proposing a sensitivity analysis methodology for α-particle induced transients and then suggesting redesign techniques to improve the reliability of the ADC. A novel concept of node weights specific to α-particle transients is introduced, which improves the accuracy of the sensitivity analysis. The fault simulations show that, using techniques such as alternative robust implementations, adding redundancy, pattern detection, and transistor sizing, considerable improvements in reliability can be attained.

2.
Nano-scale devices are continuously shrinking and operating at lower voltages and higher frequencies. This makes them more susceptible to environmental perturbations and gives them high dynamic fault rates. Redundancy techniques are widely used to increase the reliability of combinational logic circuits. In this work, soft-error reliability is improved by using such techniques, based on the probability of occurrence of the output combinations of the circuits. A generalized modular redundancy scheme to enhance the reliability of combinational circuits is proposed. Additionally, several aspects of applying this scheme are explored, including the types of redundant modules, the complexity of voters, and single- versus multiple-output protection. A methodology for applying the generalized modular redundancy scheme is also developed. Reliability analysis for various benchmarks from the LGSynth91 suite shows that the proposed methodology can achieve reliability figures higher than those of triple modular redundancy. In general, significant overhead savings are achieved in addition to this superior reliability.

3.
A new scheme for implementing highly reliable digital systems is proposed. The method has a circuitry overhead comparable to that of the triple modular redundancy (TMR) scheme, yet it is shown to improve reliability, and more importantly mean time to failure, well beyond what is expected from standard TMR systems. The reliability and mean time to failure are both derived from a discrete-state, continuous-time Markov model of the new system. The reliability and mean time to failure characteristics of this new system design, termed comparative redundancy, are compared to those of both TMR and a single unit.
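For orientation, the two baselines named in this abstract have well-known closed forms under assumptions that the abstract does not state and that are adopted here only for illustration: modules fail independently at a constant rate `lam`, and the TMR voter is perfect. A single unit then has R(t) = exp(-lam t) and MTTF = 1/lam, while TMR has R(t) = 3R^2 - 2R^3 and MTTF = 5/(6 lam). The comparative redundancy model itself is not given in the abstract, so the sketch below covers the baselines only; all function names are hypothetical.

```python
import math


def simplex_reliability(lam: float, t: float) -> float:
    """Reliability of a single unit with constant failure rate lam (assumed)."""
    return math.exp(-lam * t)


def tmr_reliability(lam: float, t: float) -> float:
    """Reliability of TMR with a perfect voter (assumed): at least 2 of 3 modules work."""
    r = simplex_reliability(lam, t)
    return 3 * r**2 - 2 * r**3


def simplex_mttf(lam: float) -> float:
    """Mean time to failure of a single unit: integral of exp(-lam t)."""
    return 1.0 / lam


def tmr_mttf(lam: float) -> float:
    """Mean time to failure of TMR: 3/(2 lam) - 2/(3 lam) = 5/(6 lam)."""
    return 5.0 / (6.0 * lam)
```

Under these assumptions TMR is more reliable than a single unit only while the module reliability stays above 0.5, and its MTTF is actually lower than that of a single unit, which is why an MTTF improvement is a notable claim for a scheme with TMR-like overhead.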

4.
A process for reliability-related quality programming is developed to fill existing gaps in software design and development so that a quality programming plan can be achieved. The tradeoffs among system reliability improvement, resource consumption, and other relevant constraints through the management phase are investigated. A software reliability-to-cost relation is developed both from a software reliability-related cost model and from software redundancy models with common-cause failures. A generic N-component redundancy model is also developed. The software reliability optimization problems can be formulated as a mixed-integer programming problem.

5.
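Redundancy techniques can substantially improve the mission reliability of control systems. The key to redundancy design is the redundancy management strategy and method; a system's fault tolerance is achieved mainly through redundancy management. This article describes in detail the design approach for implementing redundancy management algorithms in a quad-redundant fault-tolerant computer. The design keeps the system structure compact and improves mission reliability and safety while preserving real-time performance.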

6.
This paper presents a heuristic method for optimum redundancy allocation in non-coherent systems. The method uses two forms of redundancy, namely parallel and series forms. System reliability of non-coherent systems cannot be generally improved by using only parallel redundancy. So use of series or parallel redundancy, whichever gives better system reliability, is recommended. The proposed method retains all the advantages of the most recommended [2,4,5] heuristic reliability optimization techniques. The method is general and can be used with linear or non-linear, separable or non-separable constraints.

7.
The literature on the theoretical aspects of redundancy in digital computers is extensive, providing a sound basis for highly reliable design. This paper describes the design problems, the reliability prediction, the field performance, and the future application of redundancy techniques to digital systems. Triple modular redundancy (TMR) is described using the logic of the Launch Vehicle Digital Computer utilized in the uprated Saturn I and the Saturn V vehicles. The self-correcting memory of this computer is described along with the associated design problems and the design verification based on production experience. Consideration is given to system design problems involved with TMR logic. A Monte Carlo technique for predicting computer reliability is considered from a design engineering rather than a programmer's perspective. The unique means of indicating single-channel malfunctions, while continuing to mask these single-channel malfunctions with respect to system operation, is introduced. The results of field operation are given and compared with predicted reliability. Quad redundancy at the component part level is described using the circuitry of the primary processor and data storage (PPDS) for NASA's Orbital Astronomical Observatory. The process of arriving at a quad redundancy implementation is considered in light of the constraints of cost, schedule, and an initial reliability requirement of 95 percent for a year's operation in space. The circuit and system design problems associated with quad redundancy such as impedance and part parameter variations, power consumption, fan out limitations, and testing restrictions are indicated. The results of field operation are given and compared with predicted reliability.
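The abstract does not describe its Monte Carlo technique, so the following is only a generic sketch of how such a prediction can be set up for a TMR arrangement. It assumes independent modules with exponentially distributed lifetimes at rate `lam` and a perfect voter; the function name and parameters are hypothetical and not taken from the paper.

```python
import math
import random


def mc_tmr_reliability(lam: float, t: float, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the probability that a TMR system survives to time t.

    Assumptions (illustrative only): three independent modules with
    exponential lifetimes of rate lam, and a voter that never fails.
    """
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        # Draw one lifetime per module and sort them in ascending order.
        lifetimes = sorted(rng.expovariate(lam) for _ in range(3))
        # The voter masks one failed module, so the system fails
        # at the second module failure.
        if lifetimes[1] > t:
            survived += 1
    return survived / trials
```

With these assumptions the estimate can be checked against the closed form 3R^2 - 2R^3 with R = exp(-lam t); the value of simulation is that it extends to cases with no convenient closed form, such as non-exponential lifetimes or imperfect voters.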

8.
Applying system-level fault-tolerant techniques such as active redundancy is a promising way to enhance the system reliability for safety-related applications. Embedded system design using active redundancy is a challenging task that involves solving two major problems, namely finding the optimal redundancy configuration and mapping/scheduling of the application (including the redundant components) to the platform under timing and reliability constraints. This paper presents a framework for automatic synthesis of fault-tolerant designs on multiprocessor platforms. The core of the framework consists of: (1) a reliability analysis that computes the system-level reliability in the presence of spatial and temporal redundancy, and (2) an optimization approach for reliability-aware design space exploration. The proposed approach considers both transient and permanent faults and is among the first to support system design using imperfect fault detectors. The framework takes an application model, a platform model and a set of application requirements as input, and generates the recommended design parameters, including task-to-processor binding, task schedule and the selection/placement of redundancy. The effectiveness of our approach is illustrated using several case studies.

9.
CERN is working on a new particle accelerator that will require a very large number of power converters. In this context, the reliability of the whole powering system will be a major issue. The use of redundancy and modularity may help increase overall machine availability. However, the reliability of the redundancy system itself must be high enough to yield a significant improvement over simple module systems. This paper presents a comparative study of several modular and redundant configurations for optimising power converter reliability and draws some conclusions from previous experience with the LHC.

10.
The redundancy optimization policies for system reliability improvement have been studied in recent literature. In this paper, the uncertainty of the system environment is considered and the concept of system robustness is introduced to the redundancy optimization. The system robustness is considered as an inherent system ability to resist the uncertainty of the system environment. Two new criteria are suggested for selection of the component for parallel redundancy. An example is given to illustrate the new optimal policy.

11.
Advances in silicon technology and the shrinking of feature sizes to nanometer levels make random variations and the low reliability of nano-devices the most important concerns for fault-tolerant design. Design of reliable and fault-tolerant embedded processors is mostly based on developing techniques that compensate for reliability shortcomings by adding hardware or software redundancy. Recently proposed redundancy techniques are generally applied uniformly to all parts of a system and lead to heavy overheads and inefficiencies in terms of performance, power, and area. Efficient use of non-uniform redundancy becomes possible when a quantitative analysis of system behavior under transient faults is available. In this work, we present a quantitative analysis of the behavior of an embedded processor under transient faults and propose a new approach that accurately predicts the architecture vulnerability factor (AVF) in real time. Another critical concern in the design of new silicon processors is power consumption. Dynamic voltage and frequency scaling (DVFS) is an effective method for controlling both the energy consumption and the performance of a system. Since the rate of radiation-induced transient faults depends on operating frequency and supply voltage, DVFS techniques have recently been shown to compromise electronic system reliability. Therefore, ignoring the effects of voltage scaling on fault rate could considerably degrade system reliability. Here, by exploiting the proposed online AVF prediction methodology and an analytic derivation, we propose a reliability-aware adaptive DVFS approach, using as a case study a Multi-Processor System on Chip (MPSoC) with Multiple Clock Domain (MCD) pipeline architectures, in which frequency and voltage are scaled by simultaneously considering power consumption, reliability, and performance. Compared with traditional reliability-aware DVFS methods, the proposed method yields 50% better power saving at the same reliability level.

12.
This paper describes a number of reliability tasks and techniques, including reliability modeling and prediction, FMECA, ESS, failure analysis, redundancy, and others. Each of these is examined to assess its usefulness and general cost-effectiveness. Suggestions are offered for deciding whether or not to use any of these tasks and techniques during the design and manufacture of commercial equipment.

13.
Dynamic fault-tree models for fault-tolerant computer systems   (Cited by: 3 total, 0 self-citations, 3 by others)
Reliability analysis of fault-tolerant computer systems for critical applications is complicated by several factors. Systems designed to achieve high levels of reliability frequently employ high levels of redundancy, dynamic redundancy management, and complex fault and error recovery techniques. This paper describes dynamic fault-tree modeling techniques for handling these difficulties. Three advanced fault-tolerant computer systems are described: a fault-tolerant parallel processor, a mission avionics system, and a fault-tolerant hypercube. Fault-tree models for their analysis are presented. HARP (Hybrid Automated Reliability Predictor) is a software package developed at Duke University and NASA Langley Research Center that can solve those fault-tree models.

14.
The reliability of general systems using dynamic and static redundancy schemes is derived, and communication protocols are considered as a representative example. The system reliability for three broadcast protocols using various redundancy-allocation policies is studied. The analytic and simulation results show that, in some cases, static redundancy yields a more reliable system than dynamic redundancy. This is essential for distributed system applications. In some cases, the failure detection time is substantial, so that the hardware reliability and hence the system reliability are adversely affected when using dynamic redundancy. This can be a critical factor for distributed system applications, because a large overhead of communication can be required for error detection. In these cases, unreliable protocols can provide better system reliability than reliable protocols, especially when the communication network is highly reliable and when the machine failure rate is relatively large. Since unreliable protocols generate less load and less resource contention, they are preferable in such cases. The reliability should be analyzed to determine the optimal balance between reliable and unreliable protocols. Static redundancy can be more reliable than dynamic redundancy if the failure-detection time is large.

15.
This paper presents modeling and estimation techniques permitting the temperature-aware optimization of application-specific multiprocessor system-on-chip (MPSoC) reliability. Technology scaling and increasing power densities make MPSoC lifetime reliability problems more severe. MPSoC reliability strongly depends on system-level MPSoC architecture, redundancy, and thermal profile during operation. We propose an efficient temperature-aware MPSoC reliability analysis and prediction technique that enables MPSoC reliability optimization via redundancy and temperature-aware design planning. Reliability, performance, and area are concurrently optimized. Simulation results indicate that the proposed approach has the potential to substantially improve MPSoC system mean time to failure with small area overhead.

16.
The limitations of conventional redundancy techniques are pointed out and a novel redundancy technique is proposed for high-density DRAMs using multidivided data-line structures. The proposed technique features a flexible relationship between spare lines and spare decoders, as well as lower probability of unsuccessful repair. With this technique the yield improvement factor of 64-Mb DRAMs and beyond is estimated to be more than twice that with the conventional technique in the early stages of production.

17.
In many applications such as critical life-saving systems, safety is an important design issue as well as reliability. Among various commonly-used approaches in the implementation of on-line unrepairable fault-tolerant systems, standby systems achieve the most satisfactory reliability figure, followed by duplex systems, hybrid systems and triple modular redundancy. Nevertheless, the safety figure of duplex systems is superior to that of the standby approach. In this paper, the failure rate of system modules and hard core is predicted by the MIL-HDBK-217E model, and we show that the reliability and safety figure of cold standby systems can be further dramatically improved by increasing the number of spare units. Furthermore, comparative measures such as the reliability improvement factor, the safety improvement factor and the mission time improvement factor are proposed for showing that long-term unmaintained systems have reliability and safety as high as other systems that must be repaired.
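The effect of adding spares to a cold standby system can be illustrated with the textbook idealized model, which is simpler than the one in the paper: identical units with constant failure rate `lam`, spares that cannot fail while dormant, and a perfect, instantaneous switch. The paper's hard-core failure rate is deliberately left out here. Under these assumptions R(t) = exp(-lam t) * sum_{k=0..n} (lam t)^k / k! for n spares, and MTTF = (n + 1) / lam. Function names are hypothetical.

```python
import math


def cold_standby_reliability(lam: float, t: float, spares: int) -> float:
    """Reliability of one active unit plus `spares` cold spares.

    Assumptions (illustrative only): constant failure rate lam per active
    unit, no dormant failures, perfect switching, no hard-core failures.
    The system survives if at most `spares` failures occur by time t,
    which is a Poisson tail sum.
    """
    x = lam * t
    return math.exp(-x) * sum(x**k / math.factorial(k) for k in range(spares + 1))


def cold_standby_mttf(lam: float, spares: int) -> float:
    """MTTF under the same assumptions: each unit contributes 1/lam in turn."""
    return (spares + 1) / lam
```

In this idealized model every additional spare strictly raises reliability; in a more realistic model the switching and detection hardware limits how far that improvement can go.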

18.
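Airborne electromechanical systems are essential for normal, safe flight and support the proper operation of equipment such as aircraft engines and avionics. To improve the reliability of the airborne electromechanical system, a dual-redundant architecture is adopted to provide fault tolerance and thereby achieve high reliability and safety for the aircraft system as a whole. This article focuses on the design and operating principles of the fault-tolerant architecture of the airborne electromechanical control and management computer, analyzes the system's fault-tolerance management strategy, and examines key technologies such as fault detection and reconfiguration that support the dual-redundant computer implementation.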

19.
Optimization Techniques for System Reliability with Redundancy - A Review   (Cited by: 1 total, 0 self-citations, 1 by others)
This paper is a state-of-the-art review of the literature related to optimal system reliability with redundancy. The literature is classified as follows. Optimal system reliability models with redundancy: series; parallel; series-parallel; parallel-series; standby; complex (nonseries, nonparallel). Optimization techniques for obtaining optimal system configuration: integer programming; dynamic programming; maximum principle; linear programming; geometric programming; sequential unconstrained minimization technique (SUMT); modified sequential simplex pattern search; Lagrange multipliers and Kuhn-Tucker conditions; generalized Lagrangian function; generalized reduced gradient (GRG); heuristic approaches; parametric approaches; pseudo-Boolean programming; miscellaneous.
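As a small illustration of the simplest model classes in this classification, the sketch below evaluates series and parallel structures and a series arrangement of stages in which each stage holds several identical components in parallel. It assumes statistically independent components with active (hot) redundancy; the function names are hypothetical and the review's own notation is not reproduced.

```python
from math import prod
from typing import Sequence


def series_reliability(rs: Sequence[float]) -> float:
    """A series system works only if every component works."""
    return prod(rs)


def parallel_reliability(rs: Sequence[float]) -> float:
    """A parallel system fails only if every component fails."""
    return 1.0 - prod(1.0 - r for r in rs)


def staged_reliability(rs: Sequence[float], counts: Sequence[int]) -> float:
    """Stages in series; stage j has counts[j] identical parallel components
    of reliability rs[j]. This is the objective typically maximized in
    redundancy allocation, subject to cost or weight constraints.
    """
    return prod(1.0 - (1.0 - r) ** n for r, n in zip(rs, counts))
```

For example, two stages with component reliabilities 0.9 and 0.8 give 0.72 without redundancy and about 0.9504 when each stage is duplicated; the optimization techniques listed above choose the `counts` vector under resource constraints.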

20.
Hardware redundancy has been used in the design of fault-tolerant digital systems. A synthesis of protective hardware redundancy techniques is proposed and a generalized reliability model suitable for many fault-tolerant configurations is developed. This model, to be called General Modular Redundancy (GMR), yields as particular cases several known models of redundant structures.
