You are here

Rough Set-Based Software Quality Models and Quality of Data

Download pdf | Full Screen View

Date Issued:
2008
Summary:
In this dissertation we address two significant issues of concern. These are software quality modeling and data quality assessment. Software quality can be measured by software reliability. Reliability is often measured in terms of the time between system failures. A failure is caused by a fault which is a defect in the executable software product. The time between system failures depends both on the presence and the usage pattern of the software. Finding faulty components in the development cycle of a software system can lead to a more reliable final system and will reduce development and maintenance costs. The issue of software quality is investigated by proposing a new approach, rule-based classification model (RBCM) that uses rough set theory to generate decision rules to predict software quality. The new model minimizes over-fitting by balancing the Type I and Type II niisclassiflcation error rates. We also propose a model selection technique for rule-based models called rulebased model selection (RBMS). The proposed rule-based model selection technique utilizes the complete and partial matching rule sets of candidate RBCMs to determine the model with the least amount of over-fitting. In the experiments that were performed, the RBCMs were effective at identifying faulty software modules, and the RBMS technique was able to identify RBCMs that minimized over-fitting. Good data quality is a critical component for building effective software quality models. We address the significance of the quality of data on the classification performance of learners by conducting a comprehensive comparative study. Several trends were observed in the experiments. Class and attribute had the greatest impact on the performance of learners when it occurred simultaneously in the data. Class noise had a significant impact on the performance of learners, while attribute noise had no impact when it occurred in less than 40% of the most significant independent attributes. Random Forest (RF100), a group of 100 decision trees, was the most, accurate and robust learner in all the experiments with noisy data.
Title: Rough Set-Based Software Quality Models and Quality of Data.
68 views
30 downloads
Name(s): Bullard, Lofton A.
Khoshgoftaar, Taghi M., Thesis advisor
Florida Atlantic University, Degree grantor
College of Engineering and Computer Science
Department of Computer and Electrical Engineering and Computer Science
Type of Resource: text
Genre: Electronic Thesis Or Dissertation
Date Created: 2008
Date Issued: 2008
Publisher: Florida Atlantic University
Place of Publication: Boca Raton, Fla.
Physical Form: application/pdf
Extent: 181 p.
Language(s): English
Summary: In this dissertation we address two significant issues of concern. These are software quality modeling and data quality assessment. Software quality can be measured by software reliability. Reliability is often measured in terms of the time between system failures. A failure is caused by a fault which is a defect in the executable software product. The time between system failures depends both on the presence and the usage pattern of the software. Finding faulty components in the development cycle of a software system can lead to a more reliable final system and will reduce development and maintenance costs. The issue of software quality is investigated by proposing a new approach, rule-based classification model (RBCM) that uses rough set theory to generate decision rules to predict software quality. The new model minimizes over-fitting by balancing the Type I and Type II niisclassiflcation error rates. We also propose a model selection technique for rule-based models called rulebased model selection (RBMS). The proposed rule-based model selection technique utilizes the complete and partial matching rule sets of candidate RBCMs to determine the model with the least amount of over-fitting. In the experiments that were performed, the RBCMs were effective at identifying faulty software modules, and the RBMS technique was able to identify RBCMs that minimized over-fitting. Good data quality is a critical component for building effective software quality models. We address the significance of the quality of data on the classification performance of learners by conducting a comprehensive comparative study. Several trends were observed in the experiments. Class and attribute had the greatest impact on the performance of learners when it occurred simultaneously in the data. Class noise had a significant impact on the performance of learners, while attribute noise had no impact when it occurred in less than 40% of the most significant independent attributes. Random Forest (RF100), a group of 100 decision trees, was the most, accurate and robust learner in all the experiments with noisy data.
Identifier: FA00012567 (IID)
Degree granted: Dissertation (Ph.D.)--Florida Atlantic University, 2008.
Collection: FAU Electronic Theses and Dissertations Collection
Note(s): College of Engineering and Computer Science
Subject(s): Computer software--Quality control
Computer software--Reliability
Software engineering
Computer arithmetic
Held by: Florida Atlantic University Libraries
Sublocation: Digital Library
Persistent Link to This Record: http://purl.flvc.org/fau/fd/FA00012567
Use and Reproduction: Copyright © is held by the author with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Use and Reproduction: http://rightsstatements.org/vocab/InC/1.0/
Host Institution: FAU
Is Part of Series: Florida Atlantic University Digital Library Collections.