You are here

unified methodology for software and hardware fault tolerance

Download pdf | Full Screen View

Date Issued:
1995
Summary:
The growing demand for high availability of computer systems has led to a wide application range of fault-tolerant systems. In some real-time applications ultrareliable computer systems are required. Such computer systems should be capable of tolerating failures of not only their hardware components but also of their software components. This dissertation discusses three aspects of designing an ultrareliable system: (a) a hierarchical ultrareliable system structure; (b) a set of unified methods to tolerate both software and hardware faults in combination; and (c) formal specifications in the system structure. The proposed hierarchical structure has four layers: Application, Software Fault Tolerance, Combined Fault Tolerance and Configuration. The Application Layer defines the structure of the application software in terms of the modular structure using a module interconnection language. The failure semantics of the service provided by the system is also defined at this layer. At the Software Fault Tolerance Layer each module can use software fault tolerance methods. The implementation of the software and hardware fault tolerance is achieved at the Combined Fault Tolerance Layer which utilizes the combined software/hardware fault tolerance methods. The Configuration Layer performs actual software and hardware resource management for the requests of fault identification and recovery from the Combined Fault Tolerance Layer. A combined software and hardware fault model is used as the system fault model. This model uses the concepts of fault pattern and fault set to abstract the various occurrences of software and hardware faults. We also discuss extended comparison models that consider faulty software as well. The combined software/hardware fault tolerance methods are based on recovery blocks, N-version programming, extended comparison methods and both forward and backward recovery methods. Formal specifications and verifications are used in the system design process and the system structure to show that the design and implementation of a fault-tolerant system satisfy the functional and non-functional requirements. Brief discussions and examples of using formal specifications in the hierarchical structure are given.
Title: A unified methodology for software and hardware fault tolerance.
226 views
82 downloads
Name(s): Wang, Yijun.
Florida Atlantic University, Degree grantor
Wu, Jie, Thesis advisor
College of Engineering and Computer Science
Department of Computer and Electrical Engineering and Computer Science
Type of Resource: text
Genre: Electronic Thesis Or Dissertation
Issuance: monographic
Date Issued: 1995
Publisher: Florida Atlantic University
Place of Publication: Boca Raton, Fla.
Physical Form: application/pdf
Extent: 222 p.
Language(s): English
Summary: The growing demand for high availability of computer systems has led to a wide application range of fault-tolerant systems. In some real-time applications ultrareliable computer systems are required. Such computer systems should be capable of tolerating failures of not only their hardware components but also of their software components. This dissertation discusses three aspects of designing an ultrareliable system: (a) a hierarchical ultrareliable system structure; (b) a set of unified methods to tolerate both software and hardware faults in combination; and (c) formal specifications in the system structure. The proposed hierarchical structure has four layers: Application, Software Fault Tolerance, Combined Fault Tolerance and Configuration. The Application Layer defines the structure of the application software in terms of the modular structure using a module interconnection language. The failure semantics of the service provided by the system is also defined at this layer. At the Software Fault Tolerance Layer each module can use software fault tolerance methods. The implementation of the software and hardware fault tolerance is achieved at the Combined Fault Tolerance Layer which utilizes the combined software/hardware fault tolerance methods. The Configuration Layer performs actual software and hardware resource management for the requests of fault identification and recovery from the Combined Fault Tolerance Layer. A combined software and hardware fault model is used as the system fault model. This model uses the concepts of fault pattern and fault set to abstract the various occurrences of software and hardware faults. We also discuss extended comparison models that consider faulty software as well. The combined software/hardware fault tolerance methods are based on recovery blocks, N-version programming, extended comparison methods and both forward and backward recovery methods. Formal specifications and verifications are used in the system design process and the system structure to show that the design and implementation of a fault-tolerant system satisfy the functional and non-functional requirements. Brief discussions and examples of using formal specifications in the hierarchical structure are given.
Identifier: 12424 (digitool), FADT12424 (IID), fau:9319 (fedora)
Collection: FAU Electronic Theses and Dissertations Collection
Note(s): College of Engineering and Computer Science
Thesis (Ph.D.)--Florida Atlantic University, 1995.
Subject(s): Fault-tolerant computing
Computer architecture
Held by: Florida Atlantic University Libraries
Persistent Link to This Record: http://purl.flvc.org/fcla/dt/12424
Sublocation: Digital Library
Use and Reproduction: Copyright © is held by the author, with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Use and Reproduction: http://rightsstatements.org/vocab/InC/1.0/
Host Institution: FAU
Is Part of Series: Florida Atlantic University Digital Library Collections.