1.

Error Recovery in a Real-Time Multiprocessor System

Li Weihua Yuan Youguang 《计算机科学技术学报》1992,7(1):83-87

In this paper, a new scheme for recovering from errors due to transient faults in a real-time multiprocessor system is presented. The scheme, called dynamic redundancy at the task level, is implemented in a real-time multitasking environment. Utilizing the facilities of the operating system, the scheme creates backup tasks for the primary tasks as redundancy. The paper introduces an algorithm that generates a fault-tolerant schedule for the tasks so that they recover from errors as checkpoint retry does. A reliability model is proposed to evaluate the effectiveness of the scheme.

2.

Fault‐tolerant RT‐Mach (FT‐RT‐Mach) and an application to real‐time train control

A. Egan D. Kutz D. Mikulin R. Melhem D. Moss 《Software》1999,29(4):379-395

Even though real‐time systems have the stringent constraint of completing tasks before their deadlines, many existing real‐time operating systems do not implement fault‐tolerance capabilities. In this paper we summarize a fault‐tolerant real‐time scheduling policy for dynamic tasks with ready times and deadlines. Our focus in this paper is the implementation, which includes fault‐tolerant scheduling, re‐scheduling, and recovery mechanisms in the FT‐RT‐Mach operating system, a fault‐tolerant version of RT‐Mach. A real‐time train control application is then implemented using the FT‐RT‐Mach operating system. Copyright © 1999 John Wiley & Sons, Ltd.

3.

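Research on software fault tolerance for CPU usage monitoring in real-time operating systems

王余伟 曹东 施书成 《计算机工程与科学》2018,40(8):1337-1343

In a hardware real-time operating system, CPU usage is an important indicator of system performance. If one task occupies the entire CPU, other tasks cannot continue to run, with catastrophic consequences for the system. Based on an analysis of how software runs in a real-time operating system, the system design needs to adopt fault-tolerance strategies to improve reliability and fault tolerance. Tasks in flight control software are monitored in real time under the μC/OS-II real-time operating system. First, a method for calculating CPU usage under μC/OS-II is given, and a reasonable CPU monitoring period is proposed. Second, a fault detection algorithm for abnormal CPU usage is given, and detected faults are handled to improve the system's fault tolerance. Finally, four methods of handling abnormal CPU usage are verified by writing embedded flight control software on an MPC5674 flight control computer. Simulation results show that the software fault-tolerance method for the CPU in a real-time operating system can effectively improve system reliability and fault tolerance.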
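As an illustration of the kind of calculation this abstract refers to, the sketch below derives CPU usage from an idle-task counter, which is the general approach μC/OS-II's statistics task takes (it compares the idle count of the last period with the maximum idle count measured at startup). The function and class names, the 90% threshold, and the "three consecutive periods" rule are assumptions made for this example; they are not taken from the paper, whose detection algorithm and four handling methods are not described in the abstract.

```python
def cpu_usage_percent(idle_count: int, idle_count_max: int) -> float:
    """CPU usage in percent for one monitoring period.

    idle_count     -- idle-task counter value observed in this period
    idle_count_max -- counter value reached when only the idle task runs
    """
    if idle_count_max <= 0:
        raise ValueError("idle_count_max must be positive")
    # Clamp so that counter jitter never yields usage outside 0..100.
    idle_fraction = min(max(idle_count / idle_count_max, 0.0), 1.0)
    return 100.0 * (1.0 - idle_fraction)


class UsageMonitor:
    """Hypothetical detector: reports a fault once usage has stayed above
    `threshold` for `limit` consecutive monitoring periods."""

    def __init__(self, threshold: float = 90.0, limit: int = 3):
        self.threshold = threshold
        self.limit = limit
        self.streak = 0  # consecutive periods above the threshold

    def update(self, usage: float) -> bool:
        self.streak = self.streak + 1 if usage > self.threshold else 0
        return self.streak >= self.limit


if __name__ == "__main__":
    monitor = UsageMonitor()
    for idle in (50, 40, 500, 30, 20, 10):
        usage = cpu_usage_percent(idle, 1000)
        print(f"usage={usage:.1f}% fault={monitor.update(usage)}")
```

Requiring several consecutive over-threshold periods is one simple way to avoid reacting to a single short burst of load; a real system would tune both the period and the threshold to its task set.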

4.

A graphic modeling and analysis tool for human fault diagnosis tasks

《International Journal of Industrial Ergonomics》1999,23(1-2):67-81

This study presents a graphic modeling and analysis tool for use in constructing an operator's mental model in fault diagnosis tasks. In most automatic and complicated process control systems, human fault diagnosis tasks have become increasingly complex and specialized. The system designer should consider the cognitive process of the human operator to avert implementation failures caused by a lack of compatibility between humans and the aiding system interface. Here, an experiment is performed to investigate the nature of human fault diagnosis. A graphic modeling and analysis tool is then proposed to model the continuous process of human fault diagnosis. The approach proposed herein exploits both line charts and Petri nets to demonstrate the operator's thoughts and actions. Moreover, results in this study are integrated into an adaptive standard diagnosis model that can assess the operators' mental workload and accurately depict the interactions between the human operator and the aiding system. Relevance to industry: Automatic intelligent diagnosis systems cannot provide satisfactory operating performance. Human diagnosticians are more effective than computer ones. Results in this study offer further insight into operator behavior in graphic form and into how to design a better aiding system.

5.

基于C/OS-II的多MCU容错设计与应用

许强徐凯《计算机工程》2008,34(17):281-283

在分析C/OS-II系统的基础上增加多MCU容错实时操作功能,可以对当前任务运行状态实时进行一致性操作,使多MCU中的一个节点在出现死节而崩溃时,其他节点可以实现死节的决策,并恢复其坏死节点的任务状态使其继续运行。通过三MCU方式结构及运行验证,改进的系统表现出稳定可靠的容错能力。相似文献

6.

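Design of a fault simulator system for canister missile testing

王鑫 陈川 《计算机测量与控制》2017,25(8):20-23

To allow test personnel to carry out test training on a simulator system in peacetime, a fault simulator system for canister missile testing is designed and implemented. The structure of the system is designed and a resource configuration block diagram is given; the hardware design principle and software design flow of the system are determined from the task requirements. The system can simulate the basic electrical characteristics and test interfaces of a canister missile. By operating the fault simulator software, the user selects a fault type, and the corresponding instruction is sent over a network cable to the main control chassis. The main control chassis acts according to the selected fault and outputs the test results through the test cable to the canister missile test equipment, thereby simulating both normal canister missile test items and canister missile test faults. Application results show that the system runs reliably and that all functions and technical indicators meet the design requirements. It can effectively give test personnel basic training and training in fault diagnosis and troubleshooting, improves their working efficiency, and fulfils the task of training them to operate canister missile tests.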

7.

Case study on distributed and fault tolerant system modeling based on timed automata (total citations: 1; self-citations: 0; citations by others: 1)

Libor Jan Zdenk 《Journal of Systems and Software》2009,82(10):1678-1694

This article presents the modeling of a distributed fault-tolerant real-time application by timed automata. The application under consideration consists of several processors communicating via a Controller Area Network (CAN); each processor executes an application that consists of fault-tolerant tasks running on top of an operating system (e.g. OSEK/VDX compliant) and using inter-task synchronization primitives. For such a system, a model checking tool (e.g. UPPAAL) can be used to verify the complex time and logical properties formalized as safety or bounded liveness properties (e.g. end-to-end response time considering an occurrence of a fault). The proposed model reduces the size of the state-space by sharing clocks measuring the execution time of the tasks.

8.

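A distributed fault localization framework for high performance computing

高剑 于康 卿鹏 尉红梅 《计算机应用》2018,38(1):44-49

To address the difficulty and poor timeliness of fault localization in high performance computing systems, a message-passing-based fault localization framework (MPFL) is proposed, which includes fault detection (TFD) and fault analysis (TFA) algorithms based on a tree topology. First, when a parallel job is initialized, all participating compute nodes are logically partitioned into a tree, generating a fault localization tree (FLT), and the fault localization tasks are distributed across the nodes. Then, when components such as the message library or the operating system detect an abnormal node state, the TFD algorithm analyzes the job's FLT structure and selects the node that will receive the abnormal state according to factors such as load balance and performance overhead. Finally, that node uses the TFA algorithm to reason over the received abnormal states and identify the fault. TFA uses rule-based event correlation together with a lightweight active probing mechanism built on message passing; combining the two improves the accuracy of fault analysis. In the experiments, simulated node crash faults were the localization target, NPB-FT and NPB-IS were used as benchmarks, and MPFL was evaluated on a cluster. The results show that MPFL performs well in both fault localization capability and overhead savings.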

9.

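Research on hybrid task scheduling techniques and strategies for embedded operating systems

陆伟 张龙妹 《计算机工程与应用》2015,51(15):6-11

Given the characteristics of mixed time-triggered and event-triggered tasks in current embedded systems, a hybrid operating system kernel architecture that supports both time-triggered and event-triggered tasks is designed on the basis of the μC/OS-II operating system architecture. The architecture conforms to the OSEK/VDX standard and has good portability. For the hybrid task scheduling problem, a static, periodic, preemptive hybrid task scheduling strategy is proposed. The strategy supports task switching at both the interrupt level and the task level, and uses the EDF (earliest deadline first) algorithm to resume preempted time-triggered tasks. Compared with OSEKtime OS, which can switch tasks only at the interrupt level and resumes tasks in FIFO (first in, first out) order, it improves system resource utilization and preserves the real-time behavior of tasks as far as possible. Experimental analysis shows that the hybrid operating system architecture is easy to port, that the proposed scheduling strategy is feasible and effective, and that the scheduling process is highly predictable.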

10.

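Research on a real-time scheduling algorithm for an embedded Forth operating system

黄忠建 代红兵 王蕾 《计算机应用研究》2019,36(9)

To address the lack of a real-time scheduling mechanism in current embedded Forth operating systems, the key techniques of multitask scheduling in an embedded operating system based on the Forth virtual machine architecture are studied. Using Forth virtual machine technology, a new interrupt task type is defined to handle real-time burst events, and a new task scheduling algorithm is given to schedule the terminal tasks, background tasks, and interrupt tasks of the Forth system. Experimental results show that the improved Forth system can handle burst events through real-time scheduling with high real-time responsiveness. It is particularly suitable for embedded environments with real-time requirements, meeting the need of increasingly complex embedded environments for efficient operating systems and Forth technology.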

11.

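The time-sharing EDF algorithm and its application in multimedia operating systems (total citations: 2; self-citations: 0; citations by others: 2)

张怡 张拥军 彭宇行 陈福接 《计算机学报》2001,24(3):315-320

A new CPU scheduling algorithm, time-sharing EDF (Earliest Deadline First), is proposed. The algorithm guarantees that hard real-time tasks do not miss their deadlines and is easy to implement in a time-sharing system. Building on time-sharing EDF, a new hierarchical CPU scheduling algorithm, HRFSFQ, is proposed; when used in a multimedia operating system it guarantees the QoS of all classes of tasks. Finally, extensive experiments demonstrate the effectiveness and correctness of these algorithms.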
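For readers unfamiliar with EDF, the following minimal sketch simulates plain preemptive Earliest Deadline First on a single CPU in unit time slots. It is the textbook policy only, not the paper's time-sharing variant or HRFSFQ, whose details are not given in the abstract. The `Task` fields and the `edf_schedule` function are names invented for this example.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass
class Task:
    name: str
    release: int    # first slot in which the task may run
    deadline: int   # the task must have finished by this time
    remaining: int  # execution time still needed, in slots


def edf_schedule(tasks: List[Task], horizon: int) -> Tuple[List[Optional[str]], List[str]]:
    """Run preemptive EDF for `horizon` slots.

    Returns the per-slot timeline (task name or None when idle) and the
    sorted names of tasks that missed their deadline within the horizon.
    """
    pending = [replace(t) for t in tasks]  # work on copies
    finish = {}                            # task name -> completion time
    timeline: List[Optional[str]] = []
    for now in range(horizon):
        ready = [t for t in pending if t.release <= now and t.remaining > 0]
        if not ready:
            timeline.append(None)
            continue
        # Earliest deadline wins; the name breaks ties deterministically.
        current = min(ready, key=lambda t: (t.deadline, t.name))
        current.remaining -= 1
        if current.remaining == 0:
            finish[current.name] = now + 1
        timeline.append(current.name)
    missed = sorted(t.name for t in pending
                    if finish.get(t.name, horizon + 1) > t.deadline)
    return timeline, missed


if __name__ == "__main__":
    demo = [Task("A", 0, 7, 3), Task("B", 1, 4, 2)]
    print(edf_schedule(demo, 6))
```

In the demo, task B arrives at slot 1 with an earlier deadline and preempts A, which resumes once B has finished; both complete before their deadlines.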

12.

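Research on a fault diagnosis system for the anode working state of aluminum electrolysis cells (total citations: 1; self-citations: 0; citations by others: 1)

李春艳 曾水平 《自动化技术与应用》2004,23(7):49-51,57

Taking the anode working state of aluminum electrolysis cells as the object of study, and combining statistics of experimental data with digital filtering, fuzzy mathematics, and other techniques, this paper develops a fault diagnosis system for the anode working state of aluminum electrolysis cells. Experimental results confirm the effectiveness of the system.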

13.

Fault-tolerance through scheduling of aperiodic tasks in hard real-time multiprocessor systems

Ghosh S. Melhem R. Mosse D. 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(3):272-284

Real-time systems are being increasingly used in several applications which are time critical in nature. Fault tolerance is an important requirement of such systems, due to the catastrophic consequences of not tolerating faults. We study a scheme that provides fault tolerance through scheduling in real-time multiprocessor systems. We schedule multiple copies of dynamic, aperiodic, nonpreemptive tasks in the system, and use two techniques that we call deallocation and overloading to achieve a high acceptance ratio (percentage of arriving tasks scheduled by the system). The paper compares the performance of our scheme with that of other fault-tolerant scheduling schemes, and determines how much each of deallocation and overloading affects the acceptance ratio of tasks. The paper also provides a technique that can help real-time system designers determine the number of processors required to provide fault tolerance in dynamic systems. Lastly, a formal model is developed for the analysis of systems with uniform tasks.

14.

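A fault-tolerant real-time task scheduling algorithm for multiprocessors (total citations: 27; self-citations: 2; citations by others: 25)

张拥军 张怡 彭宇行 陈福接 《计算机研究与发展》2000,37(4):425-429

Fault tolerance is an important requirement of real-time systems. In a real-time system, if a real-time task is not completed within its specified time limit, the system is considered to have failed. A fault-tolerant scheduling algorithm is proposed for multiprocessor real-time systems. The algorithm uses primary/backup copies of tasks and a First-fit heuristic, and achieves fault tolerance by reserving time for re-running real-time tasks whose execution may fail because of a processor fault. It improves processor utilization and the system's task acceptance ratio by overlapping reserved time intervals and by reclaiming and reallocating reserved time when no fault occurs. Simulation results show that the algorithm is effective.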
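The primary/backup idea with a First-fit heuristic can be sketched as follows. This is a deliberately simplified illustration, not the paper's algorithm: tasks are nonpreemptive, each processor keeps a single "free from" time, and the overlapping of reserved backup intervals and the reclaiming of unused backup time that the abstract mentions are left out. The task tuple layout and the function name are assumptions for this example.

```python
from typing import Dict, List, Tuple

# (name, ready_time, deadline, exec_time)
TaskSpec = Tuple[str, int, int, int]


def schedule_primary_backup(tasks: List[TaskSpec],
                            num_processors: int) -> Dict[str, Tuple[int, int, int, int]]:
    """Place a primary and a backup copy of each task using First-fit.

    The backup runs on a different processor and starts only after the
    primary would have finished, so one processor fault cannot hit both.
    Returns {name: (primary_proc, primary_start, backup_proc, backup_start)}
    for accepted tasks; tasks that cannot be placed are rejected.
    """
    free_at = [0] * num_processors  # time from which each processor is idle
    accepted = {}
    for name, ready, deadline, exec_time in tasks:
        # First processor where primary plus a later backup can meet the deadline.
        primary = next(
            ((p, max(free_at[p], ready)) for p in range(num_processors)
             if max(free_at[p], ready) + 2 * exec_time <= deadline),
            None)
        if primary is None:
            continue
        p_proc, p_start = primary
        p_end = p_start + exec_time
        # First other processor that can run the backup after the primary ends.
        backup = next(
            ((b, max(free_at[b], p_end)) for b in range(num_processors)
             if b != p_proc and max(free_at[b], p_end) + exec_time <= deadline),
            None)
        if backup is None:
            continue
        b_proc, b_start = backup
        free_at[p_proc] = p_end
        free_at[b_proc] = b_start + exec_time
        accepted[name] = (p_proc, p_start, b_proc, b_start)
    return accepted


if __name__ == "__main__":
    demo = [("T1", 0, 10, 3), ("T2", 0, 10, 3), ("T3", 0, 8, 3)]
    print(schedule_primary_backup(demo, 2))
```

In this sketch every backup slot stays reserved even when the primary succeeds, which is exactly the waste that overlapping and reclaiming reserved time are meant to reduce.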

15.

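Implementation of an intelligent fault simulation system for substations

张静 宋存义 谢振华 白建明 《微计算机信息》2007,23(31):195-196,289

The simulated substation uses advanced computer technology to build a "realistic" and "energy-saving" substation. The system combines artificial intelligence with distributed interactive simulation based on HLA (High Level Architecture) to develop an intelligent substation fault simulation system. The system collects a large number of substation accident types, such as switching operation errors, unexpected in-service accidents that trigger protection action, and abnormal conditions such as single-phase grounding, and simulates these common phenomena. The software uses a platform-based design; during operation and accident handling on screen, it plays simulated sound and animation, enhancing realism.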

16.

PLR: A Software Approach to Transient Fault Tolerance for Multicore Architectures

Shye Alex Blomstedt Joseph Moseley Tipp Reddi Vijay Janapa Connors Daniel A. 《Dependable and Secure Computing, IEEE Transactions on》2009,6(2):135-148

Transient faults are emerging as a critical concern in the reliability of general-purpose microprocessors. As architectural trends point toward multicore designs, there is substantial interest in adapting such parallel hardware resources for transient fault tolerance. This paper presents process-level redundancy (PLR), a software technique for transient fault tolerance, which leverages multiple cores for low overhead. PLR creates a set of redundant processes per application process and systematically compares the processes to guarantee correct execution. Redundancy at the process level allows the operating system to freely schedule the processes across all available hardware resources. PLR uses a software-centric approach to transient fault tolerance, which shifts the focus from ensuring correct hardware execution to ensuring correct software execution. As a result, many benign faults that do not propagate to affect program correctness can be safely ignored. A real prototype is presented that is designed to be transparent to the application and can run on general-purpose single-threaded programs without modifications to the program, operating system, or underlying hardware. The system is evaluated for fault coverage and performance on a four-way SMP machine and provides improved performance over existing software transient fault tolerance techniques with a 16.9 percent overhead for fault detection on a set of optimized SPEC2000 binaries.
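The compare-the-replicas idea behind this kind of redundancy can be illustrated with a small sketch. It runs the same computation several times and compares the outputs, taking a majority vote when they disagree. This is only an analogy under stated assumptions: the replicas here are plain function calls rather than operating system processes, the majority vote is one possible way to act on a mismatch, and `run_redundant` is a name invented for this example, not part of PLR.

```python
from collections import Counter
from typing import Any, Callable, Tuple


def run_redundant(func: Callable[..., Any], args: tuple,
                  replicas: int = 3) -> Tuple[Any, bool]:
    """Run `func(*args)` `replicas` times and compare the outputs.

    Returns (majority_output, fault_detected). Raises RuntimeError when no
    output is produced by a strict majority of replicas. Outputs must be
    hashable so that they can be counted.
    """
    outputs = [func(*args) for _ in range(replicas)]
    value, votes = Counter(outputs).most_common(1)[0]
    if votes <= replicas // 2:
        raise RuntimeError("no majority among replicas")
    # Any disagreement at all means at least one replica was faulty.
    return value, votes < replicas


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_square(x: int) -> int:
        """Returns a corrupted result on its second call only."""
        calls["n"] += 1
        return x * x + (1 if calls["n"] == 2 else 0)

    print(run_redundant(flaky_square, (4,)))
```

Only the outputs are compared, so a fault that never changes an output goes unnoticed, which mirrors the abstract's point that benign faults can be safely ignored.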

17.

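Design of a test system for the TV electronics assembly of a certain missile type

尉广军 郭希维 王政 吴洪伟 《计算机与数字工程》2014,(7):1198-1201

With the aim of improving the test accuracy and fault localization accuracy for the TV electronics box and TV power supply box of a certain missile, a high-precision, highly reliable combined test system based on PC104 is designed. The composition of the system and the hardware circuit design of each module are described. The software is implemented by mixed programming in LabVIEW and Visual C++ to perform system testing and fault diagnosis. Application shows that the system meets the expected design requirements and can complete fault diagnosis tasks.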

18.

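Design of a CPCI-bus-based fault detection system for spacecraft communication signal equipment

张贺鑫 雷文礼 王雨婷 《计算机测量与控制》2021,29(2):1-4

Existing fault detection systems for spacecraft communication signal equipment are affected by noise, which leads to low accuracy and long detection times when detecting fault signals of communication equipment. To address this, a fault detection system for spacecraft communication signal equipment based on the CPCI bus is designed. For the hardware, the CPCI fault simulation module controls the injector over an RS232 serial line; a fault injector injects faults into the CPCI bus and receives control system parameters and instructions; a clock distribution chip distributes the clock signal; and the CPCI detection board module, together with an FPGA, implements interface control. For the software, the interdependence among tasks is used to let tasks check one another; the terminal network workstation periodically sends information about the multi-channel communication network and returns defect-free detection results; and a secondary correlation algorithm extracts detailed information on multi-channel communication fault signals, accurately estimates the communication signal delay, and eliminates communication faults caused by multi-channel network noise. Experimental results show that the fault signal detection time of the CPCI-bus-based system is only 1.8 s and that the fault signal amplitude ranges from a minimum of 1 dB to a maximum of 28 dB, consistent with the actual variation. The system detects communication equipment fault signals with high accuracy and effectively shortens the detection time.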

19.

Configuration Reusing in On-Line Task Scheduling for Reconfigurable Computing Systems

Maisam Mansub Bassiri Hadi Shahriar Shahhoseini 《计算机科学技术学报》2011,26(3):463-473

Reconfigurable computing systems can be reconfigured at runtime and support partial reconfigurability, which makes it possible to execute tasks in a true multitasking manner. To manage such systems at runtime, a reconfigurable operating system is needed. The main part of this operating system is the resource management unit, which performs on-line scheduling and placement of hardware tasks at runtime. Reconfiguration overhead is an important obstacle that limits the performance of on-line scheduling algorithms in reconfigurable computing systems and increases the overall execution time. Configuration reusing (task reusing) can decrease reconfiguration overhead considerably, particularly in periodic applications or applications in which the probability of task recurrence is high. In this paper, we present a technique called reusing-based scheduling (RBS) for on-line scheduling and placement, in which configuration reusing is considered a main characteristic in order to reduce reconfiguration overhead and decrease the total execution time of the tasks. Several experiments have been conducted on the proposed algorithm. The results show considerable improvement in the overall execution time of the tasks.
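Why configuration reuse lowers total execution time can be seen in a toy model. The sketch below runs tasks one after another on a device with a fixed number of configuration slots and charges the reconfiguration cost only when a task's configuration is not already loaded. It is not the RBS algorithm: sequential execution, the least-recently-used eviction rule, and all names and numbers are assumptions chosen for illustration.

```python
from collections import OrderedDict
from typing import List, Tuple


def total_time_with_reuse(tasks: List[Tuple[str, int]], slots: int,
                          reconfig_cost: int) -> Tuple[int, int]:
    """Return (total_time, reuse_count) for tasks run sequentially.

    tasks         -- list of (configuration_id, exec_time)
    slots         -- how many configurations the device can hold at once
    reconfig_cost -- time needed to load one configuration
    """
    loaded = OrderedDict()  # configuration_id -> True, ordered by recency
    total = 0
    reuses = 0
    for config, exec_time in tasks:
        if config in loaded:
            loaded.move_to_end(config)      # mark as most recently used
            reuses += 1                     # no reconfiguration needed
        else:
            if len(loaded) >= slots:
                loaded.popitem(last=False)  # evict least recently used
            loaded[config] = True
            total += reconfig_cost
        total += exec_time
    return total, reuses


if __name__ == "__main__":
    demo = [("fft", 5), ("aes", 3), ("fft", 5), ("fir", 2), ("aes", 3)]
    print(total_time_with_reuse(demo, slots=2, reconfig_cost=4))
```

With these numbers, reconfiguring before every task would take 38 time units, while the single reuse of the already loaded "fft" configuration brings the total down to 34.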

20.

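A fault-tolerant scheduling algorithm based on dynamically variable scheduling distance for tasks with precedence dependencies in heterogeneous distributed real-time systems

刘栋 孟庆鑫 潘哲 《计算机与网络》2014,(3):109-113

Most existing primary/backup fault-tolerant scheduling algorithms do not consider precedence dependencies between tasks, yet in practice many tasks have such dependencies. This paper proposes a task fault-tolerant scheduling algorithm based on a dynamically variable scheduling distance between primary and backup copies. The technique schedules the backup copies of tasks by comparing the difference between the latest start time and the earliest start time of tasks, and on this basis an overlapping technique usable for scheduling tasks with precedence dependencies is designed. The proposed algorithm improves system reliability while letting tasks finish as early as possible, and it schedules critical-path tasks first, reducing the system's fault-tolerance overhead. Finally, experiments demonstrate the effectiveness and superiority of the algorithm.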