1.

Mutual Information Agreement in Multicomputer Systems with the Detection and Identification of Byzantine Faults

V. Yu. Grishin A. V. Lobanov V. G. Sirenko 《Automation and Remote Control》2003,64(4):626-634

A method is suggested for mutual information agreement in multicomputer systems with bus-type intercomputer communication channels and broadcast transfer of intercomputer messages. The method makes it possible to detect and identify symptoms (both by the places of their emergence and by the types of faults, such as malfunctions, programmed malfunctions, or failures) of multiple faults of computers and transmitting interfaces (interface devices) with communication channels, which can occur in all cycles of the interchange. In addition, the method enables one to distinguish, first, faults of computers from faults of transmitting interfaces, where this is possible, and, second, the nondelivery of a message in the initial cycles from the delivery of that message with distortions.
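As a rough illustration of the idea in this abstract (not the authors' algorithm, whose details the abstract does not give), the sketch below assumes every computer has broadcast a value over the bus and that each receiver's view is available as a row of a hypothetical `received` table. It takes a strict majority per sender and labels each deviating entry as either a nondelivery or a distortion. The function name `agree` and the symptom labels are assumptions made for this example.

```python
from collections import Counter

def agree(received):
    """received[i][j] is the value receiver i got from sender j (None = nothing delivered).

    Returns (agreed, symptoms): the majority value per sender (None if there is
    no strict majority) and a list of (receiver, sender, kind) symptom records.
    """
    n_senders = len(received[0])
    agreed, symptoms = [], []
    for j in range(n_senders):
        column = [row[j] for row in received]
        delivered = [v for v in column if v is not None]
        value, votes = Counter(delivered).most_common(1)[0] if delivered else (None, 0)
        # Accept a value only if a strict majority of all receivers saw it.
        agreed.append(value if votes * 2 > len(column) else None)
        for i, v in enumerate(column):
            if v is None:
                symptoms.append((i, j, "nondelivery"))
            elif v != agreed[-1]:
                symptoms.append((i, j, "distortion"))
    return agreed, symptoms

# Three receivers, three senders; receiver 2 missed sender 1 and got a
# distorted value from sender 2.
views = [[1, 5, 7],
         [1, 5, 7],
         [1, None, 9]]
agreed, symptoms = agree(views)
```

In this toy run the agreed values are `[1, 5, 7]`, and the two symptom records separate the missing message from the distorted one. A real protocol would also need further exchange cycles to decide whether the sender or the receiving interface is at fault.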

2.

On the Uncertainty in the Correctness of Computer Programs

《IEEE transactions on pattern analysis and machine intelligence》1985,(9):857-864

The use of digital computers in critical process control systems requires the formal assessment of the system reliability. Failures can be due to either component malfunctions or design faults. Only the latter are relevant in evaluating software reliability. Although it is preferable to prove whether the program meets its specification, this is not yet practical for real-time control programs. Further, the specification itself can be incorrect or incomplete due to the complex requirements.

3.

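My Views on Computer Lab Maintenance and Management

叶琦 唐澜 《数字社区&智能家居》2005,(14)

With the rapid development of computer technology, and especially the spread of the Internet, the number of computers in universities has grown sharply, which poses a major challenge for computer lab maintenance. The article discusses several effective measures for improving maintenance efficiency: installing and repairing software on many computers by cloning from a network server; fitting hard-disk protection cards to prevent virus infection; and solutions to several common faults.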

4.

Detecting FET Stuck-Open Faults in CMOS Latches And Flip-Flops

Reddy M.K. Reddy S.M. 《Design & Test of Computers, IEEE》1986,3(5):17-26

The authors present evidence that conventional tests cannot detect FET stuck-open faults in several CMOS latches and flip-flops. Examples are given to show that stuck-open faults can change static latches and flip-flops into dynamic devices. This is a danger to circuits whose operation requires static memory, since undetected FET stuck-open faults can cause malfunctions. Designs are given for several memory devices in which all single FET stuck-open faults are detectable. These memory devices include common latches, master-slave flip-flops, and scan-path flip-flops that can be used in applications requiring static memory elements whose operation can be reliably ascertained through conventional fault testing methods.

5.

Functional Diagnosis of Information Transmission in Computer Systems with Unknown Initial Information

V. G. Sirenko 《Automation and Remote Control》2005,66(11):1824-1840

Consideration was given to the multipath information transmission in the multi-level computer systems from one or more first-level computers to several last-level computers with the aim of improving reliability of information transmission. It was assumed that (i) at each system level known are system structure, information paths, and permissible number of faulty computers and (ii) there exists at least one path through-passing good computers. It was assumed in Part I that the valid values of the information received by all last-level computers are traceable. For these conditions, proposed was a method of determining each feasible variant of the valid value of the source information sent from the first-level computers with indication of each possible combination of admissible byzantine system faults under which this variant may occur. Conditions for which a subsystem consisting of the last-level computers can independently perform the functional self-diagnosis of information transmission in the system were established in Part II, and a method of self-diagnosis was proposed which determines each feasible variant of the initial information sent from different first-level computers with indication of each possible combination of the admissible byzantine system faults for which in the good last-level computers the true results of transmission under this variant of values may be generated.

6.

Models of closed multimachine computer systems with transient-fault-tolerance and fault-tolerance on the basis of replication under byzantine faults

A. V. Lobanov 《Automation and Remote Control》2009,70(2):328-343

Consideration was given to the key definitions, notions, and models that may be useful for transient-fault-tolerant and fault-tolerant computing in the unmanned multimachine computer systems having many interconnected autonomous computers without shared memory and centralized control organ and operating with high degree of computational parallelism, that is, executing on different computers simultaneously various tasks which interchange their information. These computations should establish reliable results under byzantine faults and controllable degradation of the system at detection of faults.

7.

Error detection and correction in switched linear controllers via periodic and non-concurrent checks

Shreyas Sundaram 《Automatica》2006,42(3):383-391

Control systems that utilize switched linear controllers have proven to be useful (and, in some cases, essential) for accomplishing certain control objectives in particular classes of plants. These controllers are often digital in nature and, as such, are subject to internal hardware malfunctions (faults). In this paper, we present a systematic methodology for constructing embeddings to protect switched linear controllers against hardware faults that corrupt their internal state. Our methodology is based on replacing the original controller with a redundant (higher dimensional) controller that preserves the functionality of the original controller while enabling error detection and correction. More importantly, this methodology allows an external mechanism to detect and identify transient state-transition faults through non-concurrent (e.g. periodic) parity checks. The resulting error detection and correction procedures can then be performed periodically, thereby relaxing the reliability requirements and overhead associated with the checking mechanism.
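The notion of a non-concurrent parity check can be illustrated with a small sketch. This is an assumed toy construction, not the embedding from the paper: a single (non-switched) linear controller state `x` is extended with one parity value `c`, and the parity update is chosen so that the syndrome `c - C·x` is left unchanged by every fault-free step. A transient corruption of `x` therefore stays visible until the next check, however late that check runs. The matrices `A`, `B`, `C` and all function names are hypothetical, and the sketch covers detection only, not correction.

```python
A = [[1, 1], [0, 1]]   # hypothetical controller state matrix
B = [0, 1]             # hypothetical input vector
C = [1, 2]             # hypothetical parity weights

def step(x, c, u):
    """Advance the redundant state (x, c) by one step with input u."""
    n = len(x)
    x_next = [sum(A[i][j] * x[j] for j in range(n)) + B[i] * u for i in range(n)]
    # Parity follows the *change* in x, so c - C·x is invariant when no fault occurs.
    c_next = c + sum(C[i] * (x_next[i] - x[i]) for i in range(n))
    return x_next, c_next

def syndrome(x, c):
    """Zero when the parity value is consistent with the state."""
    return c - sum(C[i] * x[i] for i in range(len(x)))

def run(steps, fault_at=None, fault_size=4):
    """Simulate `steps` updates; optionally corrupt x[0] once before step `fault_at`."""
    x, c = [0, 0], 0
    for k in range(steps):
        if k == fault_at:
            x = [x[0] + fault_size, x[1]]  # transient fault in the stored state
        x, c = step(x, c, u=1)
    return syndrome(x, c)  # a single check at the end of the run

clean = run(10)
faulty = run(10, fault_at=3)
```

Here `clean` is zero, while `faulty` is nonzero even though the only check happens seven steps after the fault. That persistence is the property that lets the check run periodically rather than at every step.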

8.

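Fault-Tolerant Routing Based on Extended Optimal Path Matrices in Hypercube Multiprocessor Systems (total citations: 10; self-citations: 1; citations by others: 10)

田绍槐 《计算机学报》2002,25(1):87-92

Building on the work of Gao Feng et al., this paper proposes the concept of Extended Optimal Path Matrices (EOPMs) for hypercube multiprocessor systems, and gives an algorithm for constructing EOPMs together with an EOPM-based fault-tolerant routing algorithm. It proves that EOPM-based fault-tolerant routing extends the fault-tolerant routing algorithms based on extended safety vectors (ESVs) [13] and on optimal path matrices (OPMs) [14]. Compared with the earlier work, the algorithm has the same storage overhead as OPMs, but the optimal-path information it records includes the information recorded by the earlier schemes, so its ability to find optimal paths improves on theirs.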
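For readers unfamiliar with optimal paths in a faulty hypercube, the following sketch shows the underlying question. It is a plain recursive search and does not implement EOPMs, ESVs, or OPMs. Nodes are integers whose bits are hypercube coordinates, an optimal path has length equal to the Hamming distance between source and destination, and the search only flips bits in which the two nodes still differ. The function name `optimal_path` is an assumption made for this example.

```python
def optimal_path(src, dst, faulty):
    """Return a Hamming-distance-length path from src to dst that avoids
    the nodes in `faulty`, or None if no such optimal path exists."""
    if src in faulty or dst in faulty:
        return None
    if src == dst:
        return [src]
    diff = src ^ dst          # bits (dimensions) that still have to be corrected
    bit = 1
    while bit <= diff:
        if diff & bit:
            neighbor = src ^ bit          # move along one preferred dimension
            if neighbor not in faulty:
                rest = optimal_path(neighbor, dst, faulty)
                if rest is not None:
                    return [src] + rest
        bit <<= 1
    return None

# 3-dimensional hypercube, routing from node 000 to node 111.
no_faults = optimal_path(0b000, 0b111, set())
two_faults = optimal_path(0b000, 0b111, {0b001, 0b010})
blocked = optimal_path(0b000, 0b111, {0b001, 0b010, 0b100})
```

With two faulty neighbors the search still finds an optimal path through node 100. With all three neighbors of the source faulty no path exists, so `blocked` is `None`. Schemes based on path matrices aim to answer such questions from precomputed per-node information instead of searching.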

9.

Graceful deadlock-free fault-tolerant routing algorithm for 3D Network-on-Chip architectures

Akram Ben Ahmed Abderazek Ben Abdallah 《Journal of Parallel and Distributed Computing》2014

Three-Dimensional Networks-on-Chip (3D-NoC) has been presented as an auspicious solution merging the high parallelism of Network-on-Chip (NoC) interconnect paradigm with the high-performance and lower interconnect-power of 3-dimensional integration circuits. However, 3D-NoC systems are exposed to a variety of manufacturing and design factors making them vulnerable to different faults that cause corrupted message transfer or even catastrophic system failures. Therefore, a 3D-NoC system should be fault-tolerant to transient malfunctions or permanent physical damages.

10.

Model-based condition monitoring of an actuator system driven by a brushless DC motor

《Control Engineering Practice》2001,9(5):545-554

Air pressure in passenger aircraft is controlled by opening and closing an outflow valve, which serves to release air from the cabin. Early identification of potential malfunctions in the underlying actuator-driven system is important both from the point of view of cost-efficient maintenance as well as overall safety. This paper presents a system for diagnosing faults in a valve actuator driven by a brushless DC motor during off-line pre-flight tests. First, a simple mathematical model of the drive is developed. Based on this, parity relations are derived with the aim of fault detection. Fault isolation is realised by means of an approximate reasoning technique referred to as transferable belief model (TBM). The problem of distinguishing between faults with the same fault signatures is addressed. It is shown that additional improvements in terms of diagnostic resolution can be achieved by applying parameter estimation. The proposed FDI scheme is tested on the actual drive under various faults. The achieved performance features high diagnostic resolution, diagnostic stability and accuracy.

11.

Fault-Tolerant Microprocessor-Based Systems

Johnson B.W. 《Micro, IEEE》1984,4(6):6-21

How do computers go wrong and what can we do about it? This tutorial outlines the causes of faults and the basic techniques for dealing with them.

12.

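A Hierarchical Memory Fault Injection Method for the Itanium Architecture

王波 左德承 钱军 张展 《计算机工程》2012,38(4):70-72

To study the impact of memory faults on high-availability servers, a multi-level memory fault injection method is proposed for Itanium-architecture computers, and a new fault injector (HMFI) is designed and implemented. By injecting memory faults at the physical layer, the operating system kernel layer, and the process layer, it examines the target system's tolerance of memory faults. Experimental results show that the memory faults injected by HMFI can effectively verify and analyze the fault tolerance of complex computer systems.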

13.

An Analysis of the Criteria for Evaluating Adequate Theories of Computation

Nir Fresco 《Minds and Machines》2008,18(3):379-401

This paper deals with the question: What are the criteria that an adequate theory of computation has to meet? (1) Smith’s answer: it has to meet the empirical criterion (i.e. doing justice to computational practice), the conceptual criterion (i.e. explaining all the underlying concepts) and the cognitive criterion (i.e. providing solid grounds for computationalism). (2) Piccinini’s answer: it has to meet the objectivity criterion (i.e. identifying computation as a matter of fact), the explanation criterion (i.e. explaining the computer’s behaviour), the right things compute criterion, the miscomputation criterion (i.e. accounting for malfunctions), the taxonomy criterion (i.e. distinguishing between different classes of computers) and the empirical criterion. (3) Von Neumann’s answer: it has to meet the precision and reliability of computers criterion, the single error criterion (i.e. addressing the impacts of errors) and the distinction between analogue and digital computers criterion. (4) “Everything” computes answer: it has to meet the implementation theory criterion by properly explaining the notion of implementation.

14.

Special Feature: A Survey of Methods of Achieving Reliable Software

Morgan D.E. Taylor D.J. 《Computer》1977,10(2):44-53

As organizations become more dependent on computers, they become more sensitive to computer system failures. The importance of computer reliability in realtime control systems (e.g., communications systems^8,15 and traffic control systems) has been recognized for some time. Many computer users are now becoming aware that they accomplish more on systems which seldom crash because of malfunctions than on systems which run very rapidly (and correctly) between frequent crashes. Consequently, increasing emphasis is being placed by users and vendors on the reliability of the total system, and particularly the system software. A notable example of this is the emphasis placed by IBM on reliability concerns in the design and implementation of OS/VS2 Release 2.^35,36

15.

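On Computer Fault Diagnosis and Solutions

芦冏耀 《办公自动化》2011,(6):10-12

As living standards keep rising, computers have become widespread, and people's daily life, study, and entertainment all depend on them. As the number of users keeps growing, however, computer faults of every kind appear as well. Many customers rely on after-sales service staff for maintenance and know nothing about the faults themselves, so a correct understanding of computer faults is necessary. On the one hand, the skill level of after-sales service staff or repair technicians directly affects the quality of after-sales service; on the other hand, users who understand something of how faults arise can actively avoid them and so stay in control when using the machine. Drawing on many years of practical experience in computer fault diagnosis, this article classifies computer faults and matches solutions to their causes.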

16.

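A Software Method for Detecting Transient Faults in Radiation Environments

李建立 谭庆平 徐建军 《计算机工程与科学》2010,32(3):115-118

In the space radiation environment, large numbers of cosmic rays frequently cause transient faults in on-board computers, and these transient faults lead to data errors or control-flow errors during program execution. To address program errors caused by transient faults, this paper proposes a software-implemented fault detection algorithm, SITFT, which combines software recomputation with label analysis, so that it can detect data errors during program execution and can also effectively detect control-flow errors. Fault injection experiments show that, at the cost of a 58% to 111% increase in performance overhead and a 153% to 225% increase in storage overhead relative to the source program, SITFT reduces the cases in which program execution produces wrong results by 49.0% to 73.2% compared with the source program.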
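The two ingredients named in this abstract, recomputation for data errors and labels for control-flow errors, can be sketched generically as follows. This is not the algorithm from the paper, whose details the abstract does not give. The names `TransientFault`, `recompute`, `FlowChecker`, and the block labels in `SUCCESSORS` are all hypothetical, and a real implementation would instrument compiled code rather than Python functions.

```python
class TransientFault(Exception):
    """Raised when a check suggests that a transient fault occurred."""

def recompute(func, *args):
    """Run func twice and compare the results to catch data errors."""
    first = func(*args)
    second = func(*args)
    if first != second:
        raise TransientFault("data error: recomputed results differ")
    return first

# Hypothetical control-flow graph: block label -> labels allowed to follow it.
SUCCESSORS = {"entry": {"loop", "exit"}, "loop": {"loop", "exit"}, "exit": set()}

class FlowChecker:
    """Tracks the current block label and rejects illegal transitions."""
    def __init__(self, start="entry"):
        self.current = start

    def enter(self, label):
        if label not in SUCCESSORS[self.current]:
            raise TransientFault(f"control-flow error: {self.current} -> {label}")
        self.current = label

def checked_sum(values):
    """Sum values with every addition recomputed and every block entry checked."""
    flow = FlowChecker()
    total = 0
    for v in values:
        flow.enter("loop")
        total = recompute(lambda a, b: a + b, total, v)
    flow.enter("exit")
    return total

result = checked_sum([1, 2, 3])
```

The duplicated additions and the per-block label checks are where the time and storage overhead of such schemes comes from, which matches the kind of cost the abstract reports.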

17.

Observer-based fault diagnosis for a class of non-linear systems: Application to a free radical copolymerization reaction

P. Kabore S. Othman T. F. Mckenna H. Hammouri 《International journal of control》2013,86(9):787-803

The problem of fault detection and isolation (FDI) is investigated for a class of non-linear systems and applied to the case of the free radical copolymerization of butyl acrylate (BuA) and vinyl acetate (VAc) in an ethyl acetate (EAc) solvent. The approach proposed in this work is based on decoupling techniques using differential geometry theory and observer synthesis for non-linear systems. A design procedure is provided for residual generation. The faults considered in the application are malfunctions of the feed pumps of both monomers and the initiator, and the presence of an inhibitor. A detailed construction of fault detection filters is presented, and their performances in the presence of parameter uncertainties are discussed through some simulations.

18.

A Distributed Functional Diagnostic System for Multi-Computer Systems

V. Yu. Grishin A. V. Lobanov V. G. Sirenko 《Automation and Remote Control》2002,63(1):139-144

A distributed self-testing method is designed for multi-computer systems of arbitrary mutual test graph. The method is helpful in detecting and identifying the location and type (defects, program failure, and faults) of faults in machines in the course of tests and hazardous faults in the course of exchange of local test syndromes between computers. The method is based on an s-nonfailure information matching algorithm for detecting and identifying faults in information exchange. The fault detection and identification mechanisms form a continuous functional diagnostic process.

19.

Automatic Contouring of Faulted Seismic Data Sets

M J McCullagh 《Computer Graphics Forum》1983,2(1):57-66

The theoretical basis for two alternative methods of structuring and then contouring 2D and 3D seismic data with faults is described. Both rely on triangulation systems, but one is more suitable for small memory computers, and the other for large memory machines.

20.

Conceptual characteristic features of fault-free hardware design

V. N. Dianov 《Automation and Remote Control》2012,73(7):1202-1215

We consider the concept of active diagnostics for detecting and registering hardware faults in computers, sensors, executive mechanisms, and optical electronic systems. We propose a collection of informative features to detect and register sources of the faults: connectors, contacting units of large integral circuits (LIC) and super-large integral circuits (SLIC), contacting conductors on printed circuit boards (including multilayered ones), interface buses, unshielded single- and multicore wires, grounding and power buses, connector blocks, brazing joints. Based on these new properties of passive elements in radioelectronic hardware, and in keeping with the notion of “fail-safeness,” we propose a new notion of reliability, “fault-safeness,” that establishes a connection between hardware faults and its hidden defects.