Current Search: Wang, Yijun. (x)
-
-
Title
-
A unified methodology for software and hardware fault tolerance.
-
Creator
-
Wang, Yijun., Florida Atlantic University, Wu, Jie, College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
-
Abstract/Description
-
The growing demand for high availability of computer systems has led to a wide application range of fault-tolerant systems. In some real-time applications ultrareliable computer systems are required. Such computer systems should be capable of tolerating failures of not only their hardware components but also of their software components. This dissertation discusses three aspects of designing an ultrareliable system: (a) a hierarchical ultrareliable system structure; (b) a set of unified...
Show moreThe growing demand for high availability of computer systems has led to a wide application range of fault-tolerant systems. In some real-time applications ultrareliable computer systems are required. Such computer systems should be capable of tolerating failures of not only their hardware components but also of their software components. This dissertation discusses three aspects of designing an ultrareliable system: (a) a hierarchical ultrareliable system structure; (b) a set of unified methods to tolerate both software and hardware faults in combination; and (c) formal specifications in the system structure. The proposed hierarchical structure has four layers: Application, Software Fault Tolerance, Combined Fault Tolerance and Configuration. The Application Layer defines the structure of the application software in terms of the modular structure using a module interconnection language. The failure semantics of the service provided by the system is also defined at this layer. At the Software Fault Tolerance Layer each module can use software fault tolerance methods. The implementation of the software and hardware fault tolerance is achieved at the Combined Fault Tolerance Layer which utilizes the combined software/hardware fault tolerance methods. The Configuration Layer performs actual software and hardware resource management for the requests of fault identification and recovery from the Combined Fault Tolerance Layer. A combined software and hardware fault model is used as the system fault model. This model uses the concepts of fault pattern and fault set to abstract the various occurrences of software and hardware faults. We also discuss extended comparison models that consider faulty software as well. The combined software/hardware fault tolerance methods are based on recovery blocks, N-version programming, extended comparison methods and both forward and backward recovery methods. Formal specifications and verifications are used in the system design process and the system structure to show that the design and implementation of a fault-tolerant system satisfy the functional and non-functional requirements. Brief discussions and examples of using formal specifications in the hierarchical structure are given.
Show less
-
Date Issued
-
1995
-
PURL
-
http://purl.flvc.org/fcla/dt/12424
-
Subject Headings
-
Fault-tolerant computing, Computer architecture
-
Format
-
Document (PDF)