Fault Management in Distributed Systems: A Policy-Driven Approach |
| |
Authors: | Hanan L. Lutfiyya, Michael A. Bauer, Andrew D. Marshall, David K. Stokes |
| |
Affiliation: | Department of Computer Science, The University of Western Ontario, London, Canada |
| |
Abstract: | Managing the availability and performance of a distributed system involves monitoring the behavior of the system, identifying system problems, and correcting those problems. Each of these tasks requires some expertise, such as an understanding of the mechanics of the underlying system components. As the size and complexity of these systems increase, and as the number of distributed applications executing on them grows, managing their availability and performance becomes more difficult. Little research has focused on embedding systems management expertise into a management application for a distributed system. In this paper we describe a rule-based management application for a commercially available distributed computing environment that is capable of monitoring the distributed system, detecting performance and availability problems related to system services, and generating corrective actions to resolve them. |
| |
Keywords: | Distributed systems; policy-driven management; DCE; distributed applications management |
This article is indexed in SpringerLink and other databases. |
|