1.

An integrated architecture for fault diagnosis and failure prognosis of complex engineering systems

Chaochao Chen Douglas Brown Chris Sconyers Bin Zhang George Vachtsevanos Marcos E. Orchard 《Expert systems with applications》2012,39(10):9031-9040

Complex engineering systems, such as aircraft, industrial processes, and transportation systems, are experiencing a paradigm shift in the way they are operated and maintained. Instead of traditional scheduled or breakdown maintenance practices, they are maintained on the basis of their current state/condition. Condition-Based Maintenance (CBM) is becoming the preferred practice since it improves significantly the reliability, safety and availability of these critical systems. CBM enabling technologies include sensing and monitoring, information processing, fault diagnosis and failure prognosis algorithms that are capable of detecting accurately and in a timely manner incipient failures and predicting the remaining useful life of failing components. If such technologies are to be implemented on-line and in real-time, it is essential that an integrating system architecture be developed that possesses features of modularity, flexibility and interoperability while exhibiting attributes of computational efficiency for both on-line and off-line applications. This paper presents a .NET framework as the integrating software platform linking all constituent modules of the fault diagnosis and failure prognosis architecture. The inherent characteristics of the .NET framework provide the proposed system with a generic architecture for fault diagnosis and failure prognosis for a variety of applications. Functioning as data processing, feature extraction, fault diagnosis and failure prognosis, the corresponding modules in the system are built as .NET components that are developed separately and independently in any of the .NET languages. With the use of Bayesian estimation theory, a generic particle-filtering-based framework is integrated in the system for fault diagnosis and failure prognosis. The system is tested in two different applications—bearing spalling fault diagnosis and failure prognosis and brushless DC motor turn-to-turn winding fault diagnosis. 
The results suggest that the system is capable of meeting performance requirements specified by both the developer and the user for a variety of engineering systems.
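The particle-filtering idea mentioned in this abstract can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions, not the authors' .NET implementation: the one-dimensional linear fault-growth model, the noise levels, the failure threshold, and every function name below are hypothetical.

```python
import math
import random


def particle_filter_step(particles, measurement, growth, process_std, meas_std, rng):
    """One predict/weight/resample cycle of a bootstrap particle filter."""
    # Predict: push every particle through the assumed fault-growth model.
    predicted = [p + growth + rng.gauss(0.0, process_std) for p in particles]
    # Weight: Gaussian likelihood of the measurement given each particle.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw a new population in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def remaining_useful_life(particles, growth, failure_threshold):
    """Mean number of steps until the particles reach the failure threshold."""
    steps = [max(0.0, (failure_threshold - p) / growth) for p in particles]
    return sum(steps) / len(steps)


def run_demo(seed=0, n_particles=500, n_steps=30):
    """Track a simulated fault dimension and estimate its remaining useful life."""
    rng = random.Random(seed)
    growth, process_std, meas_std = 0.1, 0.05, 0.2  # hypothetical values
    true_size = 1.0
    particles = [rng.uniform(0.0, 2.0) for _ in range(n_particles)]
    for _ in range(n_steps):
        true_size += growth
        measurement = true_size + rng.gauss(0.0, meas_std)
        particles = particle_filter_step(
            particles, measurement, growth, process_std, meas_std, rng)
    estimate = sum(particles) / len(particles)
    rul = remaining_useful_life(particles, growth, failure_threshold=6.0)
    return true_size, estimate, rul


if __name__ == "__main__":
    true_size, estimate, rul = run_demo()
    print(f"true={true_size:.2f} estimate={estimate:.2f} RUL={rul:.1f} steps")
```

Diagnosis corresponds to the state estimate (the particle mean), and prognosis to projecting the particles forward until they cross the failure threshold. A real system would replace the linear growth model with a physics-based or data-driven one.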

2.

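A system reliability modeling method combining AADL and IMC

Cheng Yihan (程亦涵), Huang Zhiqiu (黄志球), Kan Shuanglong (阚双龙) 《Computer Engineering & Science (计算机工程与科学)》2015,37(8):1517-1524

As embedded software is widely deployed in safety-critical domains, system reliability becomes ever more important as scale, complexity, and performance requirements grow. The Architecture Analysis and Design Language (AADL) is an important means of architecture modeling, analysis, and verification in the embedded domain. Because AADL is a semi-formal model, its semantics must be described precisely before quantitative analysis is possible. This paper proposes an AADL-based system reliability modeling method. First, the AADL model is combined with the AADL Error Model Annex to obtain an AADL reliability model. Next, a model transformation method is proposed that maps the basic elements of the AADL reliability model, together with special elements such as error propagation, to an Interactive Markov Chain (IMC) model for quantitative reliability analysis. Finally, a case study of a French air traffic control system demonstrates the feasibility and effectiveness of the method.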

3.

A fault-tolerant approach to robot teams

Adrian Martin M. Reza Emami 《Robotics and Autonomous Systems》2013,61(12):1360-1378

As the applications of mobile robotics evolve, it has become increasingly less practical for researchers to design custom hardware and control systems for each problem. This paper presents a new approach to control system design in order to look beyond end-of-lifecycle performance, and consider control system structure, flexibility, and extensibility. Towards these ends, the Control ad libitum philosophy was proposed, stating that to make significant progress in the real-world application of mobile robot teams the control system must be structured such that teams can be formed in real-time from diverse components. The Control ad libitum philosophy was applied to the design of the HAA (Host, Avatar, Agent) architecture: a modular hierarchical framework built with provably correct distributed algorithms. A control system for mapping, exploration, and foraging was developed using the HAA architecture and evaluated in three experiments. First, the basic functionality of the HAA architecture was studied, specifically the ability to: (a) dynamically form the control system, (b) dynamically form the robot team, (c) dynamically form the processing network, and (d) handle heterogeneous teams and allocate robots between tasks based on their capabilities. Secondly, the control system was tested with different rates of software failure and was able to successfully complete its tasks even when each module was set to fail every 0.5–1.5 min. Thirdly, the control system was subjected to concurrent software and hardware failures, and was still able to complete a foraging task in a 216 m² environment.

4.

An integrated architecture for future car generations

Roman Obermaisser Philipp Peti Fulvio Tagliabo 《Real-Time Systems》2007,36(1-2):101-133

The DECOS architecture is an integrated architecture that builds upon the validated services of a time-triggered network, which serves as a shared resource for the communication activities of more than one application subsystem. In addition, encapsulated partitions are used to share the computational resources of Electronic Control Units (ECUs) among software modules of multiple application subsystems. This paper investigates the benefits of the DECOS architecture as an electronic infrastructure for future car generations. The shift to an integrated architecture will result in quantifiable cost reductions in the areas of system hardware cost and system development. In the paper we present a current federated Fiat car E/E architecture and discuss a possible mapping to an integrated solution based on the DECOS architecture. The proposed architecture provides a foundation for mixed criticality integration with both safety-critical and non-safety-critical subsystems. In particular, this architecture supports applications up to the highest criticality classes (10^-9 failures per hour), thereby taking into account the emerging dependability requirements of by-wire functionality in the automotive industry.

5.

Reaching agreement on processor-group membership in synchronous distributed systems

Flaviu Cristian 《Distributed Computing》1991,4(4):175-187

Reaching agreement on the identity of correctly functioning processors of a distributed system in the presence of random communication delays, failures and processor joins is a fundamental problem in fault-tolerant distributed systems. Assuming a synchronous communication network that is not subject to partition occurrences, we specify the processor-group membership problem and we propose three simple protocols for solving it. The protocols provide all correct processors with consistent views of the processor-group membership and guarantee bounded processor failure detection and join delays. Flaviu Cristian is a computer scientist at the IBM Almaden Research Center in San Jose, California. He received his PhD from the University of Grenoble, France, in 1979. After carrying out research in operating systems and programming methodology in France and working on the specification, design, and verification of fault-tolerant software in England, he joined IBM in 1982. Since then he has worked in the area of fault-tolerant distributed systems and protocols. He has participated in the design and implementation of a highly available distributed system prototype at the Almaden Research Center, has reviewed and consulted for several fault-tolerant distributed system designs, both in Europe and the American divisions of IBM, and is now a technical leader in the design of a new Air Traffic Control System for the US which must satisfy very stringent availability requirements.

6.

Qualitatively modelling the effects of electrical circuit faults

《Artificial Intelligence in Engineering》1993,8(4):293-300

This paper presents a new method for the qualitative analysis of electrical circuit behaviour. This paper shows that a qualitative representation of electrical resistance provides a good intuitive model for reasoning about gross electrical effects due to connectivity faults. The motivation is to produce tools to assist engineers in the identification and analysis of circuit failures that have safety implications. This includes work in hazard analysis, safety-critical systems, and failure mode effects analysis (FMEA). The analysis algorithms efficiently locate state changes in circuits and assign qualitative symbols for voltage and current flow to all components. The input, Δr, is a list which specifies qualitative resistance changes between the nodes of a previously defined circuit and the output is a pair, (P_a, P_d), of sets of activated and deactivated paths, identifying the components that have changed state. The system has a layered priority approach precisely in keeping with failure mode analysis tasks and has been successfully tested in real applications.
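A small sketch may help show what reasoning with qualitative resistance can look like. The abstract does not list the value set or the combination rules, so the three values (zero, load, infinite) and the max/min rules for series and parallel connections below are assumptions chosen for illustration, and all names are hypothetical.

```python
from enum import IntEnum


class QR(IntEnum):
    """Assumed qualitative resistance values, ordered from lowest to highest."""
    ZERO = 0  # short circuit or closed switch
    LOAD = 1  # some finite, non-zero resistance
    INF = 2   # open circuit or broken connection


def series(*parts):
    """Series connection: the highest resistance dominates."""
    return max(parts)


def parallel(*parts):
    """Parallel connection: the lowest resistance dominates."""
    return min(parts)


def current_flows(resistance):
    """Current can flow along a path unless its resistance is infinite."""
    return resistance != QR.INF


def lamp_circuit(switch, lamp, bypass):
    """Hypothetical circuit: a switch in series with a lamp that has a bypass path."""
    return series(switch, parallel(lamp, bypass))


if __name__ == "__main__":
    print("normal:", lamp_circuit(QR.ZERO, QR.LOAD, QR.INF).name)
    print("short across lamp:", lamp_circuit(QR.ZERO, QR.LOAD, QR.ZERO).name)
    print("open switch:", lamp_circuit(QR.INF, QR.LOAD, QR.INF).name)
```

With these rules a connectivity fault is modelled as a change in one qualitative value, and its gross effect (normal load, short circuit, or no current) follows from re-evaluating the affected paths.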

7.

The development of the NLR ATC Research Simulator (Narsim): Design philosophy and potential for ATM research

《Simulation Practice and Theory》1993,1(1):31-39

8.

The TTA's Approach to Resilience after Transient Upsets

Wilfried Steiner Michael Paulitsch Hermann Kopetz 《Real-Time Systems》2006,32(3):213-233

The Time-Triggered Architecture, as an architecture for safety-critical real-time applications, incorporates fault-tolerance mechanisms to ensure correct system operation despite failures. The primary fault hypothesis of the TTA claims to tolerate either the arbitrary failure of any one of its nodes or the passively arbitrary failure of any one of its communication channels. To cover these failure modes, active redundancy techniques are used, which basically means that nodes and channels are physically replicated. The primary fault hypothesis is, however, not strong enough for certain applications that have to tolerate transient upsets of multiple, possibly all, components in the system. Such a transient upset of the system may break up the synchrony of the nodes and leave disjoint sets of nodes synchronized to each other while the overall synchronization is lost. Although the TTA provides a clique avoidance algorithm that is able to correct a wide class of such multiple transient failures, a stronger algorithm is needed for full coverage. In this paper we discuss a secondary fault hypothesis for the TTA that addresses the transient upset of multiple components and present a new clique resolving algorithm based on the TTA's integrated diagnosis and startup service. This paper is a revised version of Steiner et al. (2003). This work has been funded by the European Project DECOS (Project number: IST-511764). Michael Paulitsch is currently affiliated with Honeywell International.

9.

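Multi-radar data processing in an air traffic control command monitoring system  Total citations: 1, self-citations: 0, citations by others: 1

Shu Xuezhi (舒学智), Cui Deguang (崔德光), Jiang Weiwei (姜微微) 《Microcomputer Applications (微计算机应用)》2007,28(4):337-342

For the architecture of the information-integration-based Air Traffic Control Command Monitoring System (ATCCMS), this paper starts from a general multi-radar processing model, analyzes the algorithm of each stage of the model against the system's specific requirements, and presents a multi-radar data fusion method for information integration in ATCCMS, thereby giving upper-layer ATCCMS applications a real-time, accurate data foundation.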

10.

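Research progress on methods and techniques for assuring the reliability and correctness of device drivers

Zhang Yifan (张一帆), Huang Chao (黄超), Ou Jiansheng (欧建生), Tang Enyi (汤恩义), Chen Xin (陈鑫) 《Journal of Software (软件学报)》2015,26(2):239-253

With the continuing development of computer technology, computer systems are widely used in safety-critical domains, and their software is becoming an important enabling component. Within a computer system, device drivers act as the bridge between software and hardware devices. Because drivers are tied simultaneously to the computer platform, the operating system, and the device, they are complex, difficult, and costly to develop, and the errors and defects they contain often cause system failures, with irreparable losses in safety-critical domains. Taking the assurance of device-driver reliability and correctness as its goal, this survey analyzes current methods and techniques from three perspectives: fault isolation and recovery, correctness analysis and verification, and design modeling and complexity control, laying a foundation for further in-depth research.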

11.

An examination of distributed planning in the world of air traffic control

《Journal of Parallel and Distributed Computing》1986,3(3):411-431

A Distributed Planning System is a network whose nodes represent distinct processors, each cooperating with a selected set of others to achieve a common set of goals. We address two important issues in this paper: (i) How individual processors should be interconnected so that their capacities are fully utilized and their goals can be accomplished effectively and efficiently. (ii) What kind of planning activity the individual processors should engage in. We have chosen the Air Traffic Control (ATC) environment, an intellectually challenging domain of much practical importance, to study different ideas in distributed planning systems. Here we show the results obtained with one type of possible organizational architecture, Location-Centered Cooperative mode of operation. We describe and demonstrate algorithmically a simulation-based planning process based on this architecture. The work described in this paper is part of our continuing effort in examining ways in which greater responsibilities can be delegated to computers in ATC.

12.

An analysis of clustered failures on large supercomputing systems

Thomas J. Hacker Fabian Romero Christopher D. Carothers 《Journal of Parallel and Distributed Computing》2009

Large supercomputers are built today using thousands of commodity components, and suffer from poor reliability due to frequent component failures. The characteristics of failure observed on large-scale systems differ from smaller scale systems studied in the past. One striking difference is that system events are clustered temporally and spatially, which complicates failure analysis and application design. Developing a clear understanding of failures for large-scale systems is a critical step in building more reliable systems and applications that can better tolerate and recover from failures. In this paper, we analyze the event logs of two large IBM Blue Gene systems, statistically characterize system failures, present a model for predicting the probability of node failure, and assess the effects of differing rates of failure on job failures for large-scale systems. The work presented in this paper will be useful for developers and designers seeking to deploy efficient and reliable petascale systems.

13.

Probabilistic clock synchronization  Total citations: 18, self-citations: 0, citations by others: 18

Flaviu Cristian 《Distributed Computing》1989,3(3):146-158

A probabilistic method is proposed for reading remote clocks in distributed systems subject to unbounded random communication delays. The method can achieve clock synchronization precisions superior to those attainable by previously published clock synchronization algorithms. Its use is illustrated by presenting a time service which maintains externally (and hence, internally) synchronized clocks in the presence of process, communication and clock failures. Flaviu Cristian is a computer scientist at the IBM Almaden Research Center in San Jose, California. He received his PhD from the University of Grenoble, France, in 1979. After carrying out research in operating systems and programming methodology in France, and working on the specification, design, and verification of fault-tolerant programs in England, he joined IBM in 1982. Since then he has worked in the area of fault-tolerant distributed protocols and systems. He has participated in the design and implementation of a highly available system prototype at the Almaden Research Center and has reviewed and consulted for several fault-tolerant distributed system designs, both in Europe and in the American divisions of IBM. He is now a technical leader in the design of a new U.S. Air Traffic Control System which must satisfy very stringent availability requirements.
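The core of probabilistic remote clock reading can be sketched as a round-trip measurement that is retried until the round trip is short enough for the desired precision. The sketch below is a simplification under stated assumptions: clock drift is ignored, a minimum one-way delay is assumed to be known, and the function and class names, along with the simulated link used to exercise it, are hypothetical.

```python
def read_remote_clock(local_clock, request, min_delay, max_error, max_attempts=10):
    """Try to read a remote clock with an error of at most max_error seconds.

    local_clock: callable returning the local time in seconds.
    request: callable that performs one round trip and returns the remote
             clock value carried in the reply.
    min_delay: assumed minimum one-way message delay in seconds.
    Returns (estimate, error_bound), or None if every attempt was too slow.
    """
    for _ in range(max_attempts):
        sent = local_clock()
        remote_time = request()
        received = local_clock()
        round_trip = received - sent
        # The reply travelled for at least min_delay and at most
        # round_trip - min_delay, so the midpoint of that interval is the
        # estimate and half its width is the error bound.
        error_bound = round_trip / 2 - min_delay
        if error_bound <= max_error:
            return remote_time + round_trip / 2, error_bound
    return None


class SimulatedLink:
    """Hypothetical test harness: a remote clock reached over scripted delays."""

    def __init__(self, offset, delays):
        self.now = 0.0              # true time, also used as the local clock
        self.offset = offset        # remote clock = true time + offset
        self.delays = iter(delays)  # (request_delay, reply_delay) pairs

    def local_clock(self):
        return self.now

    def request(self):
        out, back = next(self.delays)
        self.now += out
        remote = self.now + self.offset
        self.now += back
        return remote


if __name__ == "__main__":
    link = SimulatedLink(offset=100.0, delays=[(0.5, 0.5), (0.010, 0.012)])
    print(read_remote_clock(link.local_clock, link.request,
                            min_delay=0.005, max_error=0.01))
```

The method is probabilistic because a short enough round trip is not guaranteed; more attempts raise the probability of success, and a tighter `max_error` lowers it.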

14.

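Safety-critical systems and their software methods

Yang Qiliang (杨启亮), Xing Jianchun (邢建春), Wang Ping (王平) 《Computer Applications and Software (计算机应用与软件)》2011,28(2)

Safety-critical systems are computer systems whose incorrect functioning or failure can lead to serious consequences such as casualties or property loss. Software is the core and the main difficulty of safety-critical systems research. This paper explains the basic concepts, main research topics, origins, and current state of safety-critical systems and their software, and focuses on safety-critical software methods, in particular the principles, related standards, and typical applications of formal methods. Based on an analysis of recent changes in safety-critical systems and the challenges their software faces, it proposes and discusses possible responses and development directions for formal methods.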

15.

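Design and implementation of an in-memory database in an air traffic control system

Liu Min (刘敏), Fei Xiangdong (费向东), Hu Shu (胡术), Yang Cheng (杨诚) 《Computer Engineering (计算机工程)》2010,36(21):52-53,56

Air traffic control (ATC) systems place ever higher demands on high-speed data synchronization, and the data access speed of the traditional centralized databases currently in use falls far short of what ATC systems need. To address this, and taking the special requirements of ATC systems into account, an in-memory database based on type-index-value is designed and implemented. The database is highly portable and effectively improves system performance and data access capability.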
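The abstract does not define the type-index-value layout, so the following is only one plausible reading, offered as a sketch: each record is addressed by a record type and an index within that type, and all data stays in memory. The class, method names, and sample records are hypothetical.

```python
from collections import defaultdict


class TypeIndexValueStore:
    """Hypothetical in-memory store addressed by (type, index) -> value."""

    def __init__(self):
        # One dictionary per record type, created on first write.
        self._tables = defaultdict(dict)

    def put(self, type_name, index, value):
        """Insert or overwrite the value stored under (type_name, index)."""
        self._tables[type_name][index] = value

    def get(self, type_name, index, default=None):
        """Return the stored value, or default if the key is absent."""
        return self._tables.get(type_name, {}).get(index, default)

    def items_of(self, type_name):
        """Return all (index, value) pairs of one type, sorted by index."""
        return sorted(self._tables.get(type_name, {}).items())


if __name__ == "__main__":
    store = TypeIndexValueStore()
    store.put("flight_plan", 2, "CCA1501 ZBAA-ZSSS")
    store.put("flight_plan", 1, "CES5102 ZSSS-ZBAA")
    store.put("radar_track", 7, (39.9, 116.4, 9800))
    print(store.items_of("flight_plan"))
    print(store.get("radar_track", 7))
```

Keeping one dictionary per type makes lookups constant-time on average and avoids a round trip to a central database, which is the kind of access-speed gain the abstract describes.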

16.

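Implementation of a distributed architecture based on CORBA technology  Total citations: 1, self-citations: 0, citations by others: 1

Li Jinlan (李金兰) 《Modern Computer (现代计算机)》2006,(4):71-74

Introduces the implementation principles, working process, and technical advantages of CORBA, analyzes a general approach to designing distributed application architectures based on CORBA, and proposes an overall approach to developing an air traffic control system (ATCS) using CORBA.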

17.

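An air traffic flow management system based on data warehouse technology  Total citations: 1, self-citations: 0, citations by others: 1

Xiang Tiancheng (项天成), Cui Deguang (崔德光) 《Computer Engineering and Applications (计算机工程与应用)》2001,37(16):135-137

The paper proposes applying information technology to airspace flow management, briefly introduces the characteristics of data warehouses and decision support systems, and, drawing on the flow management prototype system developed jointly by the Tsinghua University CIMS Center and the North China Air Traffic Management Bureau, describes several technical problems in building a decision support system on a data warehouse and their solutions.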

18.

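Security strategy for a LAN-based air traffic control information system

Zhang Lei (张蕾) 《Information Security and Technology (信息安全与技术)》2013,(6):39-41

The article first gives a comprehensive analysis of LAN-based air traffic control information systems and, on that basis, proposes security methods and strategies suited to such systems. It uses system permission assignment, user authentication, record tracking, protocol auditing, data backup, disaster recovery, alarm systems, and other techniques to give the air traffic control information system multi-layered database security protection and to raise the level of security management.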

19.

Software architecture reliability analysis using failure scenarios

Bedir Tekinerdogan Hasan Sozer 《Journal of Systems and Software》2008,81(4):558-575

With the increasing size and complexity of software in embedded systems, software has now become a primary threat to reliability. Several mature conventional reliability engineering techniques exist in the literature, but traditionally these have primarily addressed failures in hardware components and usually assume the availability of a running system. Software architecture analysis methods aim to analyze the quality of a software-intensive system early, at the software architecture design level and before the system is implemented. We propose a Software Architecture Reliability Analysis Approach (SARAH) that benefits from mature reliability engineering techniques and scenario-based software architecture analysis to provide an early software reliability analysis at the architecture design level. SARAH defines the notion of a failure scenario model that is based on the Failure Modes and Effects Analysis method (FMEA) in the reliability engineering domain. The failure scenario model is applied to represent so-called failure scenarios that are utilized to derive fault tree sets (FTS). Fault tree sets are utilized to provide a severity analysis for the overall software architecture and the individual architectural elements. Unlike conventional reliability analysis techniques, which prioritize failures based on criteria such as safety concerns, SARAH prioritizes failure scenarios based on severity from the end-user perspective. SARAH results in a failure analysis report that can be utilized to identify architectural tactics for improving the reliability of the software architecture. The approach is illustrated using an industrial case for analyzing reliability of the software architecture of the next release of a Digital TV.

20.

Compiling and verifying SC-SystemJ programs for safety-critical reactive systems

《Computer Languages, Systems and Structures》2015

Most of today's embedded systems are very complex. These systems, controlled by computer programs, continuously interact with their physical environments through a network of sensory input and output devices. Consequently, the operations of such embedded systems are highly reactive and concurrent. Since embedded systems are deployed in many safety-critical applications, where failures can lead to catastrophic events, an approach that combines mathematical logic and formal verification is employed in order to ensure correct behavior of the control algorithm. This paper presents a What You Prove Is What You Execute (WYPIWYE) compilation strategy for a Globally Asynchronous Locally Synchronous (GALS) programming language called Safety-Critical SystemJ. SC-SystemJ is a safety-critical subset of the SystemJ language. A formal big-step transition semantics of SC-SystemJ is developed for compiling SC-SystemJ programs into propositional Linear Temporal Logic formulas. These LTL formulas are then converted into a network of Mealy automata using a novel and efficient compilation algorithm. The resultant Mealy automata have a straightforward syntactic translation into Promela code. The resultant Promela models can be used for verifying correctness properties via the SPIN model-checker. Finally, there is a single translation procedure to compile both Promela and C/Java code for execution, which satisfies the De-Bruijn index, i.e. this final translation step is simple enough that it can be manually verified.