Current Search: Regression analysis
- Title
- Predictive discriminant analysis versus logistic regression for two-group classification problems in educational settings.
- Creator
- Meshbane, Alice., Florida Atlantic University, Morris, John D.
- Abstract/Description
-
The cross-validated classification accuracy of predictive discriminant analysis (PDA) and logistic regression (LR) models was compared for the two-group classification problem. Thirty-four real data sets varying in number of cases, number of predictor variables, degree of group separation, relative group size, and equality of group covariance matrices were employed for the comparison. PDA models were built based on assumptions of multivariate normality and equal covariance matrices, and cases were classified using Tatsuoka's (1988, p. 351) minimum chi-square rule. LR models were built using the International Mathematical and Statistical Library (IMSL) subroutine Categorical Generalized Linear Model (CTGLM), available with the 32-bit Microsoft Fortran v4.0 Powerstation. CTGLM uses a nonlinear approximation technique (Newton-Raphson) to determine maximum likelihood estimates of model parameters. The group with the higher log-likelihood probability was used as the LR prediction. Cross-validated hit-rate accuracy of PDA and LR models was estimated using the leave-one-out procedure. McNemar's (1947) statistic for correlated proportions was used in the statistical comparisons of PDA and LR hit-rate estimates for separate-group and total-sample proportions (z = 2.58, α = .01). Total-sample and separate-group cross-validated classification accuracy obtained by PDA was not significantly different from that obtained by LR in any of the 31 data sets for which maximum likelihood estimates of LR model parameters could be calculated. This was true regardless of assumptions made about population sizes (i.e., equal or unequal). Neither theoretical nor data-based considerations were helpful in predicting these results.
Although it does not appear from these data to make a difference which classification model is used, use of the method described in this study for comparing PDA and LR models will enable researchers to select the optimal classification model for a specific data set, regardless of data conditions.
- Date Issued
- 1996
- PURL
- http://purl.flvc.org/fcla/dt/12461
- Subject Headings
- Discriminant analysis, Regression analysis, Logistic distribution, Education, Higher--Research
- Format
- Document (PDF)
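The comparison described in the first abstract above (McNemar's statistic for correlated proportions applied to two models' hit rates) can be illustrated with a minimal Python sketch. This is not the thesis's code: the function name `mcnemar_z`, the uncorrected z form of the statistic, and the ten hit/miss outcomes are all assumptions made for illustration. Only the critical value of 2.58 comes from the abstract.

```python
import math


def mcnemar_z(correct_a, correct_b):
    """z form of McNemar's statistic for paired hit/miss outcomes.

    correct_a[i] / correct_b[i] are True when model A / model B
    classified case i correctly (e.g., under leave-one-out).
    No continuity correction is applied (an assumption of this sketch).
    """
    # Only discordant pairs (one model right, the other wrong) matter.
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    discordant = only_a + only_b
    if discordant == 0:
        return 0.0  # the two models never disagree
    return (only_a - only_b) / math.sqrt(discordant)


# Hypothetical per-case outcomes for two classifiers on the same ten cases.
pda_hits = [True, True, True, False, True, True, False, True, True, True]
lr_hits = [True, True, False, False, True, True, True, True, True, False]

z = mcnemar_z(pda_hits, lr_hits)
# Two-tailed test at alpha = .01, using the critical value quoted in the abstract.
significant = abs(z) > 2.58
print(f"z = {z:.3f}, significant at .01: {significant}")
```

Because both models are scored on the same cases, the statistic uses only the cases on which they disagree. This is why a test for correlated proportions is used rather than a test for independent proportions.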
- Title
- Predicting level of dissolved reactive phosphate in the Lafayette River, Virginia, from information on tide, wind, temperature, and sewage discharge.
- Creator
- Montgomery, John R., Harbor Branch Oceanographic Institute
- Date Issued
- 1979
- PURL
- http://purl.flvc.org/FCLA/DT/3172967
- Subject Headings
- Phosphate deposits, Sewage, Estuaries, Tidal power, Regression analysis
- Format
- Document (PDF)
- Title
- Light scattering and extinction in a highly turbid coastal inlet.
- Creator
- Thompson, M. John, Gilliland, Lewis E., Rosenfeld, Leslie K., Harbor Branch Oceanographic Institute
- Date Issued
- 1979
- PURL
- http://purl.flvc.org/FCLA/DT/3174007
- Subject Headings
- Inlets, Turbidity, Light --Scattering, Suspended sediments, Regression analysis
- Format
- Document (PDF)
- Title
- Regression analysis of a small business to determine optimal advertising media.
- Creator
- Harris, Jamie A., Harriet L. Wilkes Honors College
- Abstract/Description
-
Every day, business owners make important decisions in an effort to increase productivity. Smaller, family-owned companies, however, are at a financial disadvantage relative to larger corporations. Through the analysis of one small business, Gardens Pool Supply, we provide the owners with answers to questions on how to reduce costs and increase profits.
- Date Issued
- 2007
- PURL
- http://purl.flvc.org/FAU/11608
- Subject Headings
- Regression analysis, Economics, Statistical methods, Commercial statistics, Business forecasting, Advertising
- Format
- Document (PDF)
- Title
- Count models for software quality estimation.
- Creator
- Gao, Kehan, Florida Atlantic University, Khoshgoftaar, Taghi M., College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
- Abstract/Description
-
The primary aim of software engineering is to produce quality software that is delivered on time, within budget, and fulfils all its requirements. A timely estimation of software quality can serve as a prerequisite in achieving high reliability of software-based systems. More specifically, software quality assurance efforts can be prioritized for targeting program modules that are most likely to have a high number of faults. Software quality estimation models are generally of two types: a classification model that predicts the class membership of modules into two or more quality-based classes, and a quantitative prediction model that estimates the number of faults (or some other software quality factor) that are likely to occur in software modules. In the literature, a variety of techniques have been developed for software quality estimation, most of which are suited for either prediction or classification but not for both, e.g., the multiple linear regression (only for prediction) and logistic regression (only for classification).
- Date Issued
- 2003
- PURL
- http://purl.flvc.org/fcla/dt/12042
- Subject Headings
- Computer software--Quality control, Software engineering, Econometrics, Regression analysis
- Format
- Document (PDF)
- Title
- Translog and Cobb-Douglas analysis of tourist demand in Florida.
- Creator
- Collins, Donald Lawrence., Florida Atlantic University, Yuhn, Ky-hyang
- Abstract/Description
-
The purpose of this study is to determine what factors could influence an economic agent's decision to travel or vacation in Florida. This study measures this decision by analyzing the state Division of Tourism estimates for visitors in light of changes in national gross domestic product, non-aviation gasoline prices, average airfares, and exchange rates. These data were compiled on a quarterly basis from 1980 to 1993 and analyzed by employing Translog and Cobb-Douglas demand functional forms for use in regression analysis. Based upon the regression results, the Cobb-Douglas functional form best represents what has historically occurred in the real economic world and follows generally accepted micro-economic demand theory. The Cobb-Douglas techniques reveal that an economic agent's future income expectations, measured by GDP levels, have a significant influence on Florida visitor estimates and play a role in the decision to vacation in Florida.
- Date Issued
- 1996
- PURL
- http://purl.flvc.org/fcla/dt/15356
- Subject Headings
- Tourism--Florida, Economics, Mathematical, Prices, Regression analysis
- Format
- Document (PDF)
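For readers unfamiliar with the two functional forms named in the abstract above, their generic textbook versions are written out below. The abstract does not give the thesis's exact specification, so the symbols are assumptions: $V_t$ stands for the visitor estimate in quarter $t$, $X_{it}$ for the explanatory variables (GDP, gasoline price, airfare, exchange rate), and $\varepsilon_t$ for the error term.

```latex
% Translog demand function (generic form; symbols are illustrative)
\ln V_t = \alpha_0 + \sum_{i} \alpha_i \ln X_{it}
        + \frac{1}{2} \sum_{i} \sum_{j} \beta_{ij} \, \ln X_{it} \, \ln X_{jt}
        + \varepsilon_t

% Cobb-Douglas demand function: the special case \beta_{ij} = 0 for all i, j
\ln V_t = \alpha_0 + \sum_{i} \alpha_i \ln X_{it} + \varepsilon_t
```

In the Cobb-Douglas case each coefficient $\alpha_i$ is a constant elasticity of demand with respect to $X_i$, and the model is linear in logarithms, so it can be estimated by ordinary least squares.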
- Title
- Empirical likelihood method for segmented linear regression.
- Creator
- Liu, Zhihua., Charles E. Schmidt College of Science, Department of Mathematical Sciences
- Abstract/Description
-
For a segmented regression system with an unknown change-point over two domains of a predictor, a new empirical likelihood ratio test statistic is proposed to test the null hypothesis of no change. The proposed method is a non-parametric method which relaxes the assumption on the error distribution. Under the null hypothesis of no change, the proposed test statistic is shown empirically to be Gumbel distributed with robust location and scale parameters under various parameter settings and error distributions. Under the alternative hypothesis with a change-point, the comparisons with two other methods (Chen's SIC method and Muggeo's SEG method) show that the proposed method performs better when the slope change is small. A power analysis is conducted to illustrate the performance of the test. The proposed method is also applied to analyze two real datasets: the plasma osmolality dataset and the gasoline price dataset.
- Date Issued
- 2011
- PURL
- http://purl.flvc.org/FAU/3332719
- Subject Headings
- Change-point problems, Regression analysis, Econometrics, Limit theory (Probability theory)
- Format
- Document (PDF)
- Title
- A systematic evaluation of object detection and recognition approaches with context capabilities.
- Creator
- Giusti Urbina, Rafael J., College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
- Abstract/Description
-
Contemporary computer vision solutions to the problem of object detection aim at incorporating contextual information into the process. This thesis proposes a systematic evaluation of the usefulness of incorporating knowledge about the geometric context of a scene into a baseline object detection algorithm based on local features. This research extends publicly available MATLAB® implementations of leading algorithms in the field and integrates them in a coherent and extensible way. Experiments are presented to compare the performance and accuracy between baseline and context-based detectors, using images from the recently published SUN09 dataset. Experimental results demonstrate that adding contextual information about the geometry of the scene improves the detector performance over the baseline case in 50% of the tested cases.
- Date Issued
- 2011
- PURL
- http://purl.flvc.org/FAU/3183127
- Subject Headings
- Imaging systems, Mathematical models, Cognitive science, Optical pattern recognition, Computer vision, Logistic regression analysis
- Format
- Document (PDF)
- Title
- DEEP LEARNING REGRESSION MODELS FOR LIMITED BIOMEDICAL TIME-SERIES DATA.
- Creator
- Hssayeni, Murtadha D., Ghoraani, Behnaz, Florida Atlantic University, Department of Computer and Electrical Engineering and Computer Science, College of Engineering and Computer Science
- Abstract/Description
-
Time-series data in biomedical applications are gaining an increased interest to detect and predict underlying diseases and estimate their severity, such as Parkinson’s disease (PD) and cardiovascular diseases. This interest is driven by advances in wearable sensors and deep learning models to a large extent. In the literature, less attention has been paid to regression models for continuous outcomes in these applications, especially when dealing with limited data. Training deep learning models on raw limited data results in overfitted models, which is the main technical challenge we address in this dissertation. An example of limited and/or imbalanced time-series data is the motion signals needed for continuous severity estimation of PD. The significance of this continuous estimation is providing a tool for longitudinal monitoring of daily motor and non-motor fluctuations and managing PD medications. The dissertation objective is to train generalizable deep learning models for biomedical regression problems when dealing with limited training time-series data. The goal is designing, developing, and validating an automatic assessment system based on wearable sensors that can measure the severity of PD complications in the home-living environment while patients with PD perform their activities of daily living (ADL). We first propose using a combination of domain-specific feature engineering, transfer learning, and an ensemble of multiple modalities. Second, we utilize generative adversarial networks (GAN) and propose a new formulation of conditional GAN (cGAN) as a generative model for regression to handle an imbalanced training dataset. Next, we propose a dual-channel auxiliary regressor GAN (AR-GAN) trained using Wasserstein-MSE-correlation loss. The proposed AR-GAN is used as a data augmentation method in regression problems.
- Date Issued
- 2022
- PURL
- http://purl.flvc.org/fau/fd/FA00013992
- Subject Headings
- Deep learning (Machine learning), Regression analysis--Mathematical models, Biomedical engineering
- Format
- Document (PDF)
- Title
- Factors affecting success in organic chemistry.
- Creator
- Zaplatynski, Andrea Maria, Florida Atlantic University, Haky, Jerome E.
- Abstract/Description
-
In this study we correlate academic and non-academic descriptors with Organic Chemistry final grades for students enrolled at a Florida public university. Using multiple regression analysis, the following predictors are analyzed for a sample population of 904 students: age, gender, ethnicity, academic classification, SAT scores, major, overall grade point average (GPA), semesters lapsed between courses, institution where General Chemistry was taken, prerequisite grades, and number of math and science courses taken with their respective grades. Results indicate that strong correlations exist between final grade in Organic Chemistry, GPA, and General Chemistry final grade. Additionally, Organic Chemistry final grades correlate with academic course load and the type of institution where General Chemistry was studied. We believe these results can be employed as a tool for advising students in planning their academic programs.
- Date Issued
- 2006
- PURL
- http://purl.flvc.org/fcla/dt/13389
- Subject Headings
- Chemistry, Organic--Study and teaching, Education, Secondary, Regression analysis, Academic achievement--Education (Higher)--Florida
- Format
- Document (PDF)
- Title
- Using classification and regression tree to detect hematology abnormalities.
- Creator
- Qian, Cheng., Florida Atlantic University, Wu, Jie, College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
- Abstract/Description
-
The detection of the abnormal blood cells and particles in a blood test is essential in medical diagnosis. The detection rules, which are usually implemented in the widely used automated hematology analyzer, are therefore critical for the health and even lives of millions of people. The research endeavor of this thesis is on generating such detection rules using a supervised machine learning algorithm. The first part of this thesis studies the hematology data and surveys the popular classification algorithms. In the second part, the selected algorithm, CART, is implemented with deliberately selected parameters. In the third part, a modification of the algorithm, logical pruning with Enclose the Normal principle, is exercised. To extend the algorithm and to achieve better performance, I developed and implemented the idea of decision tree combinations. The research has proven to be successful by the achievement of good performance and reasonable detection rules.
- Date Issued
- 2004
- PURL
- http://purl.flvc.org/fcla/dt/13189
- Subject Headings
- Regression analysis, Health survey--Statistical methods, Medical statistics, Blood--Diseases--Diagnosis, Hematology, Blood--Examination
- Format
- Document (PDF)
- Title
- A min/max algorithm for cubic splines over k-partitions.
- Creator
- Golinko, Eric David, Charles E. Schmidt College of Science, Department of Mathematical Sciences
- Abstract/Description
-
The focus of this thesis is to statistically model violent crime rates against population over the years 1960-2009 for the United States. We consider this question to be of interest because the population trends of individual states follow different patterns. We propose here a method which employs cubic spline regression modeling. First we introduce a minimum/maximum algorithm that will identify potential knots. Then we employ least squares estimation to find potential regression coefficients based upon the cubic spline model and the knots chosen by the minimum/maximum algorithm. We then utilize the best subsets regression method to aid in model selection, in which we find the minimum value of the Bayesian Information Criterion. Finally, we present the adjusted R² as a measure of overall goodness of fit of our selected model. Among the fifty states and Washington D.C., 42 out of 51 showed an adjusted R² value greater than 90%. We also present an overall model of the United States. In addition, we show further applications of our algorithm to data which show a nonlinear association. It is hoped that our method can serve as a unified model for violent crime rate over future years.
- Date Issued
- 2012
- PURL
- http://purl.flvc.org/FAU/3342107
- Subject Headings
- Spline theory, Data processing, Bayesian statistical decision theory, Data processing, Neural networks (Computer science), Mathematical statistics, Uncertainty (Information theory), Probabilities, Regression analysis
- Format
- Document (PDF)
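The two model-assessment quantities named in the last abstract above, the Bayesian Information Criterion for model selection and the adjusted R-squared for goodness of fit, can be sketched with their standard textbook formulas. This is only an illustration, not the thesis's implementation: the function names, the Gaussian least-squares form of the BIC (up to an additive constant), and every numeric input below are hypothetical.

```python
import math


def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors (intercept excluded) and n cases."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def bic_least_squares(rss, n, k):
    """BIC for a least-squares fit, up to an additive constant.

    rss is the residual sum of squares, n the number of cases,
    and k the number of estimated coefficients.
    """
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical candidates: (label, residual sum of squares, number of coefficients).
n_years = 50  # 1960-2009 inclusive
candidates = [("fewer knots", 10.0, 4), ("more knots", 9.8, 8)]

# Best-subsets style selection: keep the candidate with the smallest BIC.
best = min(candidates, key=lambda c: bic_least_squares(c[1], n_years, c[2]))
print("selected:", best[0])
print("adjusted R^2 for R^2 = 0.95, p = 4:", adjusted_r2(0.95, n_years, 4))
```

With these made-up numbers the smaller model is selected, because the penalty term k ln(n) outweighs the small reduction in residual sum of squares that the extra knots provide.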