
1.

Cellular automata-based systems with fault-tolerance

Luděk Žaloudek Lukáš Sekanina 《Natural computing》2012,11(4):673-685

Cellular computing is one of the new computing paradigms that could overcome some of the problems of existing computing architectures. In the investigated scenario, cellular automata-based systems are intended for yet-unknown methods of fabrication, so they need to address fault-tolerance in a way that is not tightly coupled to the technology used. Our goal is to reach solutions that are not too complicated, which may not be possible with existing elaborate fault-tolerant systems. This paper presents a possible solution for increasing fault-tolerance in cellular automata in the form of static module redundancy. Further, a set of experiments evaluating this approach is described, using triple and quintuple module redundancy in automata with defects present. The results indicate that the concept works at low defect intensity for our selected benchmarks; however, the ability to cope with defects cannot be intuitively deduced beforehand, as shown by the varying outcomes. One of the problems, the majority task, is then explored further, investigating the cellular automaton's ability to cope not only with defects but also with transient errors.
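The abstract does not describe how the redundant modules are wired or voted on, so the following is only a minimal sketch of generic N-modular redundancy with a majority voter, not the authors' cellular-automaton design. The `parity` module, the bit-flip fault model, and all function names are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas, or None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


def run_redundant(module, inputs, faults):
    """Run one copy of `module` per entry in `faults` and vote on the results.

    Assumed fault model: a faulty copy returns the inverted output bit.
    """
    outputs = []
    for faulty in faults:
        result = module(inputs)
        outputs.append(1 - result if faulty else result)
    return majority_vote(outputs)


def parity(bits):
    """Hypothetical one-bit module used only to exercise the voter."""
    return sum(bits) % 2


data = [1, 0, 1, 1]  # the fault-free parity of this input is 1

# Triple redundancy masks a single faulty copy ...
tmr_one_fault = run_redundant(parity, data, [False, True, False])
# ... but two faulty copies outvote the healthy one.
tmr_two_faults = run_redundant(parity, data, [True, True, False])
# Quintuple redundancy still masks two faulty copies.
qmr_two_faults = run_redundant(parity, data, [True, True, False, False, False])
```

With this fault model, triple redundancy tolerates one faulty copy and quintuple redundancy tolerates two, which is the textbook motivation for comparing the two configurations.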

2.

Scalable hierarchical locking for distributed systems

《Journal of Parallel and Distributed Computing》2004,64(6):708-724


3.

Optimizing checkpoint-based fault-tolerance in distributed stream processing systems: Theory to practice

Sachini Jayasekara Shanika Karunasekera Aaron Harwood 《Software》2022,52(1):296-315

Fault-tolerance is an essential part of a stream processing system that guarantees data analysis can continue even after failures. State-of-the-art distributed stream processing systems use checkpointing to support fault-tolerance for stateful computations, where the state of the computations is periodically persisted. However, the frequency of performing checkpoints impacts the performance (utilization, latency, and throughput) of the system, as the checkpointing process consumes resources and time that could be used for actual computations. In practice, systems are often configured to perform checkpoints based on crude values, ignoring factors such as checkpoint and restart costs, which leads to suboptimal performance. In our previous work, we proposed a theoretical optimal checkpoint interval that maximizes system utilization for stream processing systems, minimizing the impact of checkpointing on system performance. In this article, we investigate the practical benefits of our proposed theoretical optimum by conducting experiments in a real-world cloud setting using different streaming applications; we use Apache Flink, a well-known stream processing system, for our experiments. The experimental results demonstrate that an optimal interval can achieve better utilization, confirming the practicality of the theoretical model when applied to real-world applications. We observed utilization improvements from 10% to 200% for a range of failure rates from 0.3 failures per hour to 0.075 failures per minute. Moreover, we explore how the performance measures latency and throughput are affected by the optimal interval. Our observations demonstrate that significant improvements can be achieved using the optimal interval for both latency and throughput.
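The abstract does not state the authors' formula for the optimal interval. As a stand-in, the sketch below uses Young's classical first-order approximation, in which the interval is the square root of twice the checkpoint cost times the mean time between failures. The 10-second checkpoint cost and 30-second restart cost are assumed values; only the failure rate of 0.3 failures per hour comes from the abstract.

```python
import math


def young_interval(checkpoint_cost_s, mtbf_s):
    """Young's first-order approximation of the optimal checkpoint interval."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


def utilization(interval_s, checkpoint_cost_s, restart_cost_s, mtbf_s):
    """First-order estimate of the fraction of time spent on useful work."""
    checkpoint_overhead = checkpoint_cost_s / interval_s
    # On average a failure loses half an interval of work plus the restart cost.
    failure_overhead = (interval_s / 2.0 + restart_cost_s) / mtbf_s
    return 1.0 - checkpoint_overhead - failure_overhead


mtbf_s = 3600.0 / 0.3      # 0.3 failures per hour -> 12000 s between failures
checkpoint_cost_s = 10.0   # assumed, not taken from the article
restart_cost_s = 30.0      # assumed, not taken from the article

best = young_interval(checkpoint_cost_s, mtbf_s)
for interval in (60.0, best, 3600.0):
    u = utilization(interval, checkpoint_cost_s, restart_cost_s, mtbf_s)
    print(f"interval={interval:8.1f} s  estimated utilization={u:.3f}")
```

Under this simplified model, checkpointing much more or much less often than the computed interval lowers the estimated utilization, which is the trade-off the abstract describes.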

4.

Modeling distributed real-time systems with MAST 2

《Journal of Systems Architecture》2013,59(6):331-340

Switched networks have an increasingly important role in real-time communications. The IEEE Ethernet standards have defined prioritized traffic (802.1p) and other QoS mechanisms (802.1q). The Avionics Full-Duplex Switched Ethernet (AFDX) standard defines a hard real-time network based on switched Ethernet. Clock synchronization is also an important service in some real-time distributed systems because it allows a global notion of time for event timing and timing requirements. In the process of defining the new MAST 2 model, clock synchronization modeling capabilities have been added, and the network elements have been enhanced to include switches and routers. This paper introduces the schedulability model that will enable an automatic schedulability analysis of a distributed application using switched networks and clock synchronization mechanisms.

5.

Modeling layered distributed communication systems

D. Herzberg M. Broy 《Formal Aspects of Computing》2005,17(1):1-18


6.

Modeling parallel and distributed systems with finite workloads

Ahmed M. Lester Reda 《Performance Evaluation》2005,60(1-4):303-325

In studying or designing parallel and distributed systems, one should have available a robust analytical model that includes the major parameters that determine system performance. Jackson networks have been very successful in modeling computer systems. However, the ability of Jackson networks to predict performance under system changes remains an open question, since they do not apply to systems with population size constraints. Also, the product-form solution of Jackson networks assumes steady state and exponential service centers or certain specialized queueing disciplines. In this paper, we present a transient model for Jackson networks that is applicable to any population size and any finite workload (no new arrivals). Using several non-exponential distributions, we show to what extent the exponential distribution can be used to approximate other distributions and transient systems with finite workloads. When the number of tasks to be executed is large enough, the model approaches the product-form solution (steady-state solution). We also study the case where the non-exponential servers have queueing (Jackson networks cannot be applied). Finally, we show how to use the model to analyze the performance of parallel and distributed systems.

7.

Modeling context in mobile distributed systems with the UML

《Journal of Visual Languages and Computing》2007,18(4):420-439

Context-awareness plays an important role in mobile distributed systems since it enables the adaptation of mobile devices to the users. However, one of the major challenges is the preservation of the users’ privacy. Many different approaches of modeling the context of the user exist, but the incorporation of privacy restrictions into context models, which makes the protection of privacy apparent, is missing. This paper presents the Context Modeling Profile (CMP), a lightweight UML (Unified Modeling Language) extension, as a visual language for context models in mobile distributed systems. The resulting models embody metainformation of the context, i.e. source and validity of context information, and reflect privacy restrictions. The profile provides several well-formedness rules for context models and supports the development of context-aware mobile applications through an adequate visual modeling language. A case study is used to illustrate the approach.

8.

Modeling and testing object-oriented distributed systems with linear-time temporal logic

F. Dietrich X. Logean J.-P. Hubaux 《Concurrency and Computation》2001,13(5):385-420

We present a framework for constructing formal models of object-oriented distributed systems and a property language to express behavioral constraints in such models. Most of the existing models have their origin in specific mathematical notations and/or concepts. In contrast, we have developed our model such that it accounts for a large set of phenomena associated with industrial implementations of object-oriented distributed systems. The model that we propose, while closer to industrial concerns and practice, still has the powerful features of formal approaches. It also offers the possibility to automatically check at service run-time that the final service implementation has not violated and is not violating properties expressed at the abstraction level of our model. In our model, which relies on event-based behavioral abstraction, we use linear-time temporal logic as the underlying formalism for the specification of properties. We introduce two novel operators which are especially useful for object-oriented systems and which provide a number of advantages over the well-known temporal logic operators. A recent decision of one of our industrial partners to adopt our proposal into one of their development platforms can be seen as strong evidence of the relevance of our work and as a promising step towards a better understanding between the academic formal methods community and industry. Copyright © 2001 John Wiley & Sons, Ltd.

9.

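Modeling real-time scheduling of distributed embedded systems (total citations: 2, self-citations: 0, citations by others: 2)

张海涛 邱联奎 艾云峰 《计算机应用》2008,28(8):2177-2180

To address the shortcomings of the RBTPN model in modeling real-time scheduling of distributed embedded systems, a new extended time Petri net model is proposed. By introducing a transition rate factor on transitions that require processor resources, the model obtains an execution rate function for transitions of equal priority, and thereby combines fixed-priority preemptive scheduling with round-robin scheduling on a single processor when modeling the scheduling of distributed embedded systems. A method for constructing the model's reachability graph is then given, so that various properties of the scheduling sequences can be obtained.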
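The abstract gives no details of the extended model or of its reachability-graph construction. The sketch below shows only the basic idea on a plain, untimed place/transition net: a breadth-first exploration of all markings reachable from an initial marking. The two-task, one-processor net and every name in it are hypothetical, and the exploration terminates only for bounded nets.

```python
from collections import deque


def enabled(marking, pre):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in pre.items())


def fire(marking, pre, post):
    """Return the marking reached by consuming `pre` and producing `post`."""
    new = dict(marking)
    for place, n in pre.items():
        new[place] -= n
    for place, n in post.items():
        new[place] = new.get(place, 0) + n
    return new


def reachability_graph(initial, transitions):
    """Breadth-first construction; assumes the net is bounded."""
    def key(marking):
        return tuple(sorted(marking.items()))

    seen = {key(initial)}
    queue = deque([initial])
    edges = []
    while queue:
        marking = queue.popleft()
        for name, (pre, post) in transitions.items():
            if enabled(marking, pre):
                nxt = fire(marking, pre, post)
                edges.append((key(marking), name, key(nxt)))
                if key(nxt) not in seen:
                    seen.add(key(nxt))
                    queue.append(nxt)
    return seen, edges


# Hypothetical net: two tasks compete for a single processor token.
initial = {"ready1": 1, "ready2": 1, "cpu": 1,
           "run1": 0, "run2": 0, "done1": 0, "done2": 0}
transitions = {
    "start1": ({"ready1": 1, "cpu": 1}, {"run1": 1}),
    "end1": ({"run1": 1}, {"done1": 1, "cpu": 1}),
    "start2": ({"ready2": 1, "cpu": 1}, {"run2": 1}),
    "end2": ({"run2": 1}, {"done2": 1, "cpu": 1}),
}
markings, edges = reachability_graph(initial, transitions)
print(len(markings), "markings,", len(edges), "edges")
```

Paths through such a graph correspond to possible scheduling sequences, so properties like mutual exclusion on the processor can be checked by inspecting the reachable markings.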

10.

A hierarchical watchdog mechanism for systemic fault awareness on distributed systems

《Future Generation Computer Systems》2015


11.

Some algorithms for the construction of coordinated solutions in hierarchical distributed systems

N. N. Trenev 《Cybernetics and Systems Analysis》1990,26(2):193-199

Algorithms for constructing coordinated solutions in multilevel distributed systems are proposed. These algorithms utilize the nonformalizable knowledge of the decision maker about each subsystem and modify the set of feasible solutions of the local problems. Convergence of the proposed algorithms is proved. Translated from Kibernetika, No. 2, pp. 42–46, March–April, 1990.

12.

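Robust and feedback fault tolerance for a class of complex systems with uncertainties (total citations: 5, self-citations: 0, citations by others: 5)

张颖伟 王福利 张嗣瀛 《控制理论与应用》2002,19(6):960-962

This paper discusses the robust feedback stabilization of large-scale composite systems formed by nonlinearly interconnecting similar linear subsystems that are subject to nonlinear perturbations. A design scheme is first given for a robust decentralized controller that stabilizes the system using structurally similar non-smooth controllers, followed by a controller design scheme for the case of actuator failure. The resulting controller keeps the system stable both in normal operation and when actuators fail.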

13.

Modeling of correlated resources availability in distributed computing systems

《Simulation Modelling Practice and Theory》2018

Volunteer computing systems are large-scale distributed systems with a large number of heterogeneous and unreliable Internet-connected hosts. Volunteer computing resources are suitable mainly for running High-Throughput Computing (HTC) applications due to their unavailability rate and frequent churn. Although they provide peta-scale computing power for many scientific projects across the globe, efficient usage of this platform for different types of applications has still not been investigated in depth. Characterizing, analyzing, and modeling the availability of such resources in volunteer computing is therefore becoming essential for efficient application scheduling. In this paper, we focus on statistical modeling of volunteer resources that exhibit non-random patterns in their availability time. The proposed models take into account the autocorrelation structure in individual hosts and in subsets of hosts whose availability is temporally correlated. We applied our methodology to real traces from the SETI@home project with more than 230,000 hosts. We showed that Markovian arrival processes and ARIMA time series can model the availability and unavailability intervals of volunteer resources with a reasonable to excellent level of accuracy.

14.

Scheduling algorithms for fault-tolerance in hard-real-time systems

Alan A. Bertossi Luigi V. Mancini 《Real-Time Systems》1994,7(3):229-245

Many time-critical applications require predictable performance in the presence of failures. This paper considers a distributed system with independent periodic tasks which can checkpoint their state on some reliable medium in order to handle failures. The problem of preemptively scheduling a set of such tasks is discussed where every occurrence of a task has to be completely executed before the next occurrence of the same task can start. Efficient scheduling algorithms are proposed which yield sub-optimal schedules when there is provision for fault-tolerance. The performance of the solutions proposed is evaluated in terms of the number of processors and the cost of the checkpoints needed. Moreover, analytical studies are used to reveal interesting trade-offs associated with the scheduling algorithms. This work has been supported by grants from the Italian Ministero dell'Università e della Ricerca Scientifica e Tecnologica and the Consiglio Nazionale delle Ricerche-Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo.

15.

Hierarchical Byzantine fault-tolerance protocol for permissioned blockchain systems

Thai Quang Tung Yim Jong-Chul Yoo Tae-Whan Yoo Hyun-Kyung Kwak Ji-Young Kim Sun-Me 《The Journal of supercomputing》2019,75(11):7337-7365

Emerging blockchain technology has introduced a new challenge to the distributed system research: Can Byzantine fault-tolerance protocols scale up to, for example,...

16.

Modeling and analysis of scheduling for distributed real-time embedded systems

Hai-Tao Zhang Gui-Fang Wu 《国际自动化与计算杂志》2010,7(4):525-530

To address the deficiencies of resource-based time Petri nets (RBTPN) in scheduling analysis for distributed real-time embedded systems, the assemblage condition of complex scheduling sequences is presented, which makes it easy to compute the scheduling length and simplifies scheduling analysis. Based on this, a new hierarchical RBTPN model is proposed. The model introduces the definition of a transition border set and represents it as an abstract transition. The abstract transition possesses all resources of the set and has the highest priority on each resource; the execution time of the abstract transition is the longest time over all possible scheduling sequences. According to the characteristics and assemblage condition of RBTPN, the refinement conditions of the transition border set are given, and these conditions ensure the correctness of the scheduling analysis. As a result, the scheduling model is easy to understand and scheduling analysis is easy to perform.

17.

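Modeling and implementation of embedded systems based on hierarchical PRES+

周青 张伟 占东生 《计算机工程与设计》2009,30(24)

To describe large systems effectively, a hierarchical mechanism is needed so that models can be built in an organized way. Such a model is composed of simple unit components, so that designers can easily understand it at every level of description. A notion of hierarchy is proposed and defined for the Petri nets used in the formal modeling of embedded systems, and it is shown how small components of a large system can be transformed using this notion. The approach makes descriptions of complex embedded systems more modular and reusable, and improves the efficiency of embedded system modeling and analysis. An example demonstrates the feasibility of the approach.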

18.

Power saving and fault-tolerance in real-time critical embedded systems

Rodrigo M. Santos Jorge Santos Javier D. Orozco 《Journal of Systems Architecture》2009,55(2):90-101

In this paper, a method with the double purpose of reducing the consumption of energy and giving a deterministic guarantee on the fault tolerance of real-time embedded systems operating under the Rate Monotonic discipline is presented. A lower bound exists on the slack left free by tasks being executed at their worst-case execution time. This deterministic slack can be redistributed and used for any of the two purposes. The designer can set the trade-off point between them. In addition, more slack can be reclaimed when tasks are executed in less than their worst-case time. Fault-tolerance is achieved by using the slack to recompute the faulty task. Energy consumption is reduced by lowering the operating frequency of the processor as much as possible while meeting all time-constraints. This leads to a multifrequency method; simulations are carried out to test it versus two single frequency methods (nominal and reduced frequencies). This is done under different trade-off points and rates of faults’ occurrence. The existence of an upper bound on the overhead caused by the transition time between frequencies in Rate Monotonic scheduled real-time systems is formally proved. The method can also be applied to multicore or multiprocessor systems.
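The abstract does not give the authors' slack computation. As background only, the sketch below applies the classical Liu and Layland utilization bound for Rate Monotonic scheduling, a sufficient but not necessary test, and derives a single uniform slowdown factor from it. The task set is hypothetical, and the assumption that execution time scales inversely with frequency is a simplification, not the paper's multifrequency method.

```python
def liu_layland_bound(n):
    """Utilization bound n * (2^(1/n) - 1) for n periodic tasks under Rate Monotonic."""
    return n * (2.0 ** (1.0 / n) - 1.0)


def total_utilization(tasks):
    """tasks: list of (worst-case execution time, period) pairs in the same unit."""
    return sum(wcet / period for wcet, period in tasks)


def passes_utilization_test(tasks):
    """Sufficient (not necessary) Rate Monotonic schedulability test."""
    return total_utilization(tasks) <= liu_layland_bound(len(tasks))


def min_frequency_fraction(tasks):
    """Smallest uniform frequency fraction that still passes the test.

    Assumes execution time scales as 1/frequency; returns None if the
    task set fails the test even at the nominal frequency.
    """
    fraction = total_utilization(tasks) / liu_layland_bound(len(tasks))
    return fraction if fraction <= 1.0 else None


# Hypothetical task set: (WCET, period) in milliseconds.
tasks = [(1.0, 4.0), (1.0, 5.0), (2.0, 10.0)]
print("utilization:", round(total_utilization(tasks), 3))
print("bound:", round(liu_layland_bound(len(tasks)), 3))
print("frequency fraction:", round(min_frequency_fraction(tasks), 3))
```

The gap between the utilization and the bound is one crude measure of spare capacity: it can be spent on slowing the processor, as here, or reserved for re-executing a faulty task, which is the trade-off the abstract describes.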

19.

Control of complex distributed systems with distributed intelligent agents (total citations: 1, self-citations: 0, citations by others: 1)

Eric Tatara Ali Çınar Fouad Teymour 《Journal of Process Control》2007,17(5):415

Control of spatially distributed systems is a challenging problem because of their complex nature, nonlinearity, and generally high order. The lack of accurate and computationally efficient model-based techniques for large, spatially distributed systems leads to challenges in controlling the system. Agent-based control structures provide a powerful tool to manage distributed systems by utilizing (organizing) local and global information obtained from the system. A hierarchical, agent-based system with local and global controller agents is developed to control networks of interconnected chemical reactors (CSTRs). The global controller agent dynamically updates the local controller agents' objectives as the reactor network conditions change. One challenge posed is control of the spatial distribution of autocatalytic species in a network of reactors hosting multiple species. The multi-agent control system is able to intelligently manipulate the network flow rates such that the desired spatial distribution of species is achieved. Furthermore, the robustness and flexibility of the agent-based control system are illustrated through examples of disturbance rejection and scalability with respect to the size of the network.

20.

Modeling Distributed Multimedia Synchronization with DSPN


Song Jun Gu Guanqun 《计算机科学技术学报》1998,13(5):448-454

Multimedia synchronization is the essential technology for the integration of multimedia in distributed multimedia systems. The multimedia synchronization model has been recognized by many researchers as a premise of the implementation of multimedia synchronization. In distributed multimedia systems, the characteristic of multimedia synchronization is dynamic, and the key medium has the priority in multimedia synchronization. The previously proposed multimedia synchronization models cannot meet these requirements. So a new multimedia dynamic synchronization model, DSPN, based on the timed Petri net, has been designed in this paper. This model can not only let the distributed multimedia system keep multimedia synchronization in a more precise and effective manner according to the runtime situation of the system, but also allow the user to interact with the presentation of multimedia.