Current Search: Data Analysis
Pages
- Title
- TOPOLOGICAL DATA ANALYSIS FOR DATA SCIENCE: THE DELAUNAY-RIPS COMPLEX, TRIANGULATION STABILITIES, AND PROTEIN STABILITY PREDICTIONS.
- Creator
- Mishra, Amish, Motta, Francis, Florida Atlantic University, Department of Mathematical Sciences, Charles E. Schmidt College of Science
- Abstract/Description
-
Topological Data Analysis (TDA) is a relatively new field of research that utilizes topological notions to extract discriminating features from data. Within TDA, persistent homology (PH) is a robust method to compute multi-dimensional geometric and topological features of a dataset. Because these features are often stable under certain perturbations of the underlying data, are often discriminating, and can be used for visualization of structure in high-dimensional data and in statistical and machine learning modeling, PH has attracted the interest of researchers across scientific disciplines and in many industry applications. However, computational costs may present challenges to effectively using PH in certain data contexts, and theoretical stability results may not hold in practice. In this dissertation, we develop an algorithm that can reduce the computational burden of computing persistent homology on point cloud data. Naming it Delaunay-Rips (DR), we define, implement, and empirically test this computationally tractable simplicial complex construction for computing persistent homology of Euclidean point cloud data. We demonstrate the practical robustness of DR for persistent homology in comparison with other simplicial complexes in machine learning applications such as predicting sleep state from patient heart rate. To justify the theoretical stability of DR, we prove the stability of the Delaunay triangulation of a point cloud P under perturbations of the points of P. Specifically, we impose a notion of genericity on the points of P to ensure stability. In the final chapter, we contribute to the field of computational biology by taking a data-driven approach to learning topological features of designed proteins from their persistence diagrams. We find correlations between the learned topological features and biochemical features to investigate how protein structure relates to features identified by subject-matter experts. We train several machine learning models to assess the performance of incorporating topological features into training with biochemical features. Using cover-tree differencing via entropy reduction (CDER), we identify distinguishing regions of the persistence diagrams of stable/unstable proteins. More notably, we find statistically significant improvement in classification performance (in terms of average precision score) for certain designed secondary structure topologies.
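The construction described above lends itself to a compact sketch: take the simplices of the Delaunay triangulation and assign each the Rips-style filtration value, i.e., the longest pairwise distance among its vertices. The following Python sketch is a reader's reconstruction under those assumptions, using the scipy and gudhi libraries; it is not the dissertation's implementation.

```python
# Minimal Delaunay-Rips-style filtration: Delaunay simplices with
# max-pairwise-distance filtration values. Illustrative reconstruction only.
from itertools import combinations

import numpy as np
from scipy.spatial import Delaunay
from scipy.spatial.distance import pdist, squareform
import gudhi

def delaunay_rips_persistence(points):
    """Persistence of the Delaunay complex under a Rips-style filtration."""
    dist = squareform(pdist(points))
    st = gudhi.SimplexTree()
    for cell in Delaunay(points).simplices:      # top-dimensional Delaunay cells
        for k in range(1, len(cell) + 1):
            for face in combinations(sorted(cell.tolist()), k):
                # Rips-style value: longest edge among the face's vertices.
                f = max((dist[i, j] for i, j in combinations(face, 2)), default=0.0)
                st.insert(list(face), filtration=f)
    return st.persistence()                      # list of (dim, (birth, death))

pts = np.random.default_rng(0).random((60, 2))
for dim, (birth, death) in delaunay_rips_persistence(pts)[:5]:
    print(dim, birth, death)
```

Because every Delaunay cell has few vertices, the number of simplices grows far more slowly than in a full Vietoris-Rips complex, which is the computational saving the abstract describes.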
- Date Issued
- 2023
- PURL
- http://purl.flvc.org/fau/fd/FA00014311
- Subject Headings
- Data Science, Data Analysis, Topology--Data processing, Protein Stability
- Format
- Document (PDF)
- Title
- CONNECTING THE NOSE AND THE BRAIN: DEEP LEARNING FOR CHEMICAL GAS SENSING.
- Creator
- Stark, Emily Nicole, Barenholtz, Elan, Florida Atlantic University, Department of Psychology, Charles E. Schmidt College of Science
- Abstract/Description
-
The success of deep learning in applications including computer vision, natural language processing, and even the game of Go can only be afforded by powerful computational resources and vast data sets. Data sets coming from medical applications are often much smaller and harder to acquire. Here a novel data approach is explained and used to demonstrate how to use deep learning as a step in data discovery, classification, and ultimately support for further investigation. Data sets used to illustrate these successes come from common ion-separation techniques that allow for gas samples to be quantitatively analyzed. The success of this data approach allows for the deployment of deep learning to smaller data sets.
- Date Issued
- 2019
- PURL
- http://purl.flvc.org/fau/fd/FA00013416
- Subject Headings
- Deep Learning, Data sets, Gases--Analysis
- Format
- Document (PDF)
- Title
- Generalized Feature Embedding Learning for Clustering and Classification.
- Creator
- Golinko, Eric David, Zhu, Xingquan, Florida Atlantic University, College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
- Abstract/Description
-
Data comes in many different shapes and sizes. In real-life applications it is common that the data we are studying has features of varied data types, including numerical, categorical, and text. Most machine learning algorithms require numeric input, so data that is not originally numerical must be transformed before it can be used as input to these algorithms. Along with this transformation, it is common that the data we study has many features relative to the number of samples. It is often desirable to reduce the number of features being trained in a model to eliminate noise and reduce training time. This problem of high dimensionality can be approached through feature selection, feature extraction, or feature embedding. Feature selection seeks to identify the most essential variables in a dataset that will lead to a parsimonious model and high-performing results, while feature extraction and embedding are techniques that utilize a mathematical transformation of the data into a represented space. As a byproduct of using a new representation, we are able to reduce the dimension greatly without sacrificing performance. Oftentimes, by using embedded features we observe a gain in performance. Though extraction and embedding methods may be powerful for isolated machine learning problems, they do not always generalize well. Therefore, we are motivated to illustrate a methodology that can be applied to any data type with little pre-processing. The methods we develop can be applied in unsupervised, supervised, incremental, and deep learning contexts. Using 28 benchmark datasets as examples, which include different data types, we construct a framework that can be applied for general machine learning tasks. The techniques we develop contribute to the field of dimension reduction and feature embedding. Using this framework, we make additional contributions to eigendecomposition by creating an objective matrix that includes three main vital components. The first is a class-partitioned row and feature product representation of one-hot encoded data. The second is the derivation of a weighted adjacency matrix based on class label relationships. Finally, by the inner product of these aforementioned values, we are able to condition the one-hot encoded data generated from the original data prior to eigenvector decomposition. The use of class partitioning and adjacency enables subsequent projections of the data to be trained more effectively when compared side-by-side to baseline algorithm performance. Along with this improved performance, we can adjust the dimension of the subsequent data arbitrarily. In addition, we also show how these dense vectors may be used in applications to order the features of generic data for deep learning. In this dissertation, we examine a general approach to dimension reduction and feature embedding that utilizes a class-partitioned row and feature representation, a weighted approach to instance similarity, and an adjacency representation. This general approach has application to unsupervised, supervised, online, and deep learning. In our experiments on 28 benchmark datasets, we show significant performance gains in clustering, classification, and training time.
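As a rough illustration of the recipe sketched in the abstract (one-hot encode the data, weight it by a class-label adjacency, then eigendecompose and project), here is a toy numpy/scikit-learn version. The specific adjacency weighting and objective matrix below are assumptions made for illustration, not the dissertation's exact construction.

```python
# Toy class-conditioned eigendecomposition embedding; the adjacency choice
# (same-class pairs weighted 1) is an assumption for illustration only.
import numpy as np
from sklearn.preprocessing import OneHotEncoder

def embed(X_cat, y, dim=2):
    H = OneHotEncoder(sparse_output=False).fit_transform(X_cat)  # one-hot rows
    A = (y[:, None] == y[None, :]).astype(float)  # class-label adjacency
    M = H.T @ A @ H                               # class-conditioned objective matrix
    vals, vecs = np.linalg.eigh(M)
    top = vecs[:, np.argsort(vals)[::-1][:dim]]   # leading eigenvectors
    return H @ top                                # samples projected to `dim` dims

X_cat = np.array([["red", "S"], ["red", "M"], ["blue", "M"], ["blue", "L"]])
y = np.array([0, 0, 1, 1])
Z = embed(X_cat, y, dim=2)                        # 4 x 2 embedded samples
```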
- Date Issued
- 2018
- PURL
- http://purl.flvc.org/fau/fd/FA00013063
- Subject Headings
- Eigenvectors--Data processing., Algorithms., Cluster analysis.
- Format
- Document (PDF)
- Title
- Detection of multiple change-points in hazard models.
- Creator
- Zhang, Wei, Qian, Lianfen, Florida Atlantic University, Charles E. Schmidt College of Science, Department of Mathematical Sciences
- Abstract/Description
-
Change-point detection in the hazard rate function is an important research topic in survival analysis. In this dissertation, we first review existing methods for single change-point detection in the piecewise exponential hazard model. Then we consider the problem of estimating the change point in the presence of right censoring and long-term survivors while using the Kaplan-Meier estimator for the susceptible proportion. The maximum likelihood estimators are shown to be consistent. Taking one step further, we propose counting-process-based and least-squares-based change-point detection algorithms. For the single change-point case, consistency results are obtained. We then consider the detection of multiple change-points in the presence of long-term survivors via maximum-likelihood-based and counting-process-based methods. Last but not least, we use weighted-least-squares-based and counting-process-based methods for detection of multiple change-points with long-term survivors and covariates. For multiple change-point detection, simulation studies show good performance of our estimators under various parameter settings for both methods. All methods are applied to real data analyses.
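For the single-change-point piecewise exponential model with right censoring, the basic maximum likelihood computation can be sketched as a profile likelihood over candidate change points, since the segment-wise rate MLEs have closed forms (events divided by time at risk). The sketch below illustrates only this textbook baseline; the dissertation's extensions (long-term survivors, Kaplan-Meier adjustment, counting-process and least-squares methods) are not reproduced.

```python
# Profile-likelihood estimate of a single change point tau in a piecewise
# exponential hazard (rate lam1 before tau, lam2 after), with right censoring.
import numpy as np

def profile_change_point(t, d, grid):
    """t: numpy array of observed times; d: event indicators (1=event,
    0=censored); grid: candidate change points. Returns the best tau."""
    best = (-np.inf, None)
    for tau in grid:
        T1 = np.minimum(t, tau).sum()          # time at risk before tau
        T2 = np.maximum(t - tau, 0.0).sum()    # time at risk after tau
        d1 = d[t <= tau].sum()                 # events in each segment
        d2 = d[t > tau].sum()
        if min(d1, d2) == 0 or min(T1, T2) <= 0:
            continue
        lam1, lam2 = d1 / T1, d2 / T2          # closed-form segment MLEs
        loglik = d1 * np.log(lam1) + d2 * np.log(lam2) - lam1 * T1 - lam2 * T2
        if loglik > best[0]:
            best = (loglik, tau)
    return best[1]

# Example grid: tau_hat = profile_change_point(t, d, np.quantile(t, np.linspace(0.05, 0.95, 50)))
```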
- Date Issued
- 2014
- PURL
- http://purl.flvc.org/fau/fd/FA00004173
- Subject Headings
- Problem solving--Data processing., Process control--Statistical methods., Point processes., Mathematical statistics., Failure time data analysis--Data processing., Survival analysis (Biometry)--Data processing.
- Format
- Document (PDF)
- Title
- Bayesian approach to an exponential hazard regression model with a change point.
- Creator
- Abraha, Yonas Kidane, Qian, Lianfen, Florida Atlantic University, Charles E. Schmidt College of Science, Department of Mathematical Sciences
- Abstract/Description
-
This thesis contains two parts. The first part derives the Bayesian estimator of the parameters in a piecewise exponential Cox proportional hazard regression model with one unknown change point for right-censored survival data. The second part surveys the applications of change-point problems to various types of data, such as long-term survival data, longitudinal data, and time series data. Furthermore, the proposed method is used to analyze a real survival dataset.
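With conjugate Gamma priors on the two hazard rates, the rates integrate out analytically and the posterior over the change point reduces to a grid computation. The sketch below is an illustrative reconstruction under those assumptions (independent Gamma(a, b) priors, no covariates), not the thesis's Cox-regression formulation.

```python
# Grid posterior over the change point tau in a piecewise exponential model,
# with both segment rates integrated out via Gamma-Poisson conjugacy.
import numpy as np
from scipy.special import gammaln

def log_marginal(t, d, tau, a=1.0, b=1.0):
    """Log marginal likelihood given tau; t = times, d = event indicators."""
    T1, T2 = np.minimum(t, tau).sum(), np.maximum(t - tau, 0.0).sum()
    d1, d2 = d[t <= tau].sum(), d[t > tau].sum()
    out = 0.0
    for dk, Tk in ((d1, T1), (d2, T2)):
        # integral of lam^dk * exp(-lam*Tk) against a Gamma(a, b) prior
        out += a * np.log(b) - gammaln(a) + gammaln(a + dk) - (a + dk) * np.log(b + Tk)
    return out

def posterior_over_grid(t, d, grid):
    logs = np.array([log_marginal(t, d, tau) for tau in grid])
    w = np.exp(logs - logs.max())          # stabilize before normalizing
    return w / w.sum()                     # posterior probability of each tau
```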
- Date Issued
- 2014
- PURL
- http://purl.flvc.org/fau/fd/FA00004013
- Subject Headings
- Bayesian statistical decision theory, Mathematical statistics, Multivariate analysis -- Data processing
- Format
- Document (PDF)
- Title
- Hybrid stress analysis using digitized photoelastic data and numerical methods.
- Creator
- Mahfuz, Hassan, Florida Atlantic University, Case, Robert O., College of Engineering and Computer Science, Department of Ocean and Mechanical Engineering
- Abstract/Description
-
Equations of stress-difference elasticity, derived from the equations of equilibrium and compatibility for a two-dimensional stress field, are solved for arbitrarily digitized, singly and multiply connected domains. Photoelastic data determined experimentally along the boundary provide the boundary values for the solution of the three elliptic partial differential equations by the finite difference method. A computerized method is developed to generate grid mesh, weighting functions and nodal connectivity within the digitized boundary for the solution of these partial differential equations. A method is introduced to digitize the photoelastic fringes, namely isochromatics and isoclinics, and to estimate the values of σ1 − σ2, σx − σy, and τxy at each nodal point by an interpolation technique. Interpolated values of the stress parameters are used to improve the initial estimate and hence the convergence of the iterative solution of the system of equations. Superfluous boundary conditions are added from the digitized photoelastic data for further speeding up the rate of convergence. The boundary of the domain and the photoelastic fringes are digitized by physically traversing the cursor along the boundary, and the digitized information is scanned horizontally and vertically to generate internal and boundary nodal points. A linear search determines the nodal connectivity and isolates the boundary points for the input of the boundary values. A similar scanning method estimates the photoelastic parameters at each nodal point and also finds the points closest to the tint of passage of each photoelastic fringe. Stress values at these close points are determined without interpolation and are subsequently used as superfluous boundary conditions in the iteration scheme. Successive over-relaxation is applied to the classical Gauss-Seidel method for final enhancement of the convergence of the iteration process. The iteration scheme starts with an accelerating factor other than unity and estimates the spectral radius of the iteration matrix from two vector norms. This information is used to estimate a temporary value of the optimum relaxation parameter, ω_opt, which is used for a fixed number of iterations to approximate a better value of the accelerating factor. The process is continued until two successive estimates differ by a given tolerance or the stopping criteria are reached. Detailed techniques of developing the code for mesh generation, photoelastic data collection and boundary value interpolation to solve the elliptic boundary value problems are presented. Three separate examples with varying stress gradients and fringe patterns are presented to test the validity of the code and the overall method. Results are compared with the analytical and experimental solutions, and the significant improvement in the rate of convergence is demonstrated.
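The accelerated iteration at the heart of the scheme can be illustrated with a textbook SOR kernel for a model elliptic problem. The sketch below uses the classical closed-form ω_opt for the 5-point Laplacian in place of the paper's norm-based spectral-radius estimate; the grid and boundary values are placeholders.

```python
# Textbook successive over-relaxation (SOR) for a 2-D Laplace problem on a
# rectangular grid; boundary values of u are assumed to be set by the caller.
import numpy as np

def sor_laplace(u, tol=1e-6, max_iter=10_000):
    n, m = u.shape
    # Jacobi spectral radius of the 5-point Laplacian, and the classical
    # optimal relaxation parameter derived from it.
    rho = 0.5 * (np.cos(np.pi / (n - 1)) + np.cos(np.pi / (m - 1)))
    omega = 2.0 / (1.0 + np.sqrt(1.0 - rho * rho))
    for _ in range(max_iter):
        diff = 0.0
        for i in range(1, n - 1):
            for j in range(1, m - 1):
                new = 0.25 * (u[i-1, j] + u[i+1, j] + u[i, j-1] + u[i, j+1])
                delta = omega * (new - u[i, j])
                u[i, j] += delta                 # over-relaxed Gauss-Seidel update
                diff = max(diff, abs(delta))
        if diff < tol:
            break
    return u
```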
- Date Issued
- 1989
- PURL
- http://purl.flvc.org/fcla/dt/11934
- Subject Headings
- Strains and stresses, Photoelasticity, Numerical analysis--Data processing
- Format
- Document (PDF)
- Title
- Studies on information-theoretics based data-sequence pattern-discriminant algorithms: Applications in bioinformatic data mining.
- Creator
- Arredondo, Tomas Vidal., Florida Atlantic University, Neelakanta, Perambur S., College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
- Abstract/Description
-
This research refers to studies on information-theoretic (IT) aspects of data-sequence patterns, and to developing therefrom discriminant algorithms that enable distinguishing the features of underlying sequence patterns having characteristic, inherent stochastic attributes. The application potentials of such algorithms include bioinformatic data mining efforts. Consistent with the scope of the study as above, considered in this research are specific details on information-theoretics and entropy considerations vis-a-vis sequence patterns (having stochastic attributes) such as the DNA sequences of molecular biology. Applying information-theoretic concepts (essentially in Shannon's sense), the following distinct sets of metrics are developed and applied in the algorithms developed for data-sequence pattern-discrimination applications: (i) divergence or cross-entropy algorithms of the Kullback-Leibler type and of the general Csiszár class; (ii) statistical distance measures; (iii) ratio-metrics; (iv) a Fisher-type linear-discriminant measure; and (v) a complexity metric based on information redundancy. These measures are judiciously adopted in ascertaining codon-noncodon delineations in DNA sequences that consist of crisp and/or fuzzy nucleotide domains across their chains. The Fisher measure is also used in codon-noncodon delineation and in motif detection. Relevant algorithms are used to test DNA sequences of human and some bacterial organisms. The relative efficacy of the metrics and the algorithms is determined and discussed. The potentials of such algorithms in supplementing the prevailing methods are indicated. Scope for future studies is identified in terms of persisting open questions.
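The first family of metrics, divergence measures of the Kullback-Leibler type, can be illustrated with a small sliding-window scan that scores how far each window's nucleotide composition diverges from an assumed coding-region profile. The profile, window size, and step below are illustrative choices, not the study's calibrated values.

```python
# Sliding-window Kullback-Leibler divergence between a window's nucleotide
# distribution and an assumed coding-region base-frequency profile.
import numpy as np

BASES = "ACGT"

def base_freqs(seq, eps=1e-9):
    counts = np.array([seq.count(b) for b in BASES], dtype=float) + eps
    return counts / counts.sum()

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def scan(seq, coding_profile, window=120, step=30):
    """Score each window; low divergence suggests codon-like composition."""
    return [(i, kl(base_freqs(seq[i:i + window]), coding_profile))
            for i in range(0, len(seq) - window + 1, step)]

profile = np.array([0.28, 0.22, 0.22, 0.28])   # hypothetical coding profile
scores = scan("ACGT" * 100, profile)
```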
- Date Issued
- 2003
- PURL
- http://purl.flvc.org/fau/fd/FADT12057
- Subject Headings
- Data mining, Bioinformatics, Discriminant analysis, Information theory in biology
- Format
- Document (PDF)
- Title
- SOCIAL MEDIA AND CRIME ANALYSIS: THE INTERSECTION OF ONLINE POSTING AND LAW ENFORCEMENT INVESTIGATIONS.
- Creator
- Lopez, Kevin P., Dario, Lisa M., Florida Atlantic University, School of Criminology and Criminal Justice, College of Social Work and Criminal Justice
- Abstract/Description
-
The current use of social media platforms has expanded to wider audiences, including police departments and other law enforcement agencies. The vast material being posted online may lead to its use by police departments, since social media information is open-sourced. The following study investigates the police's use of social media data by collecting qualitative data from crime analysts through the International Association of Crime Analysts (IACA). Participants completed an open-ended survey describing their experience with collecting data from online social media sources and how it is used to assist with police activity. The results have implications for future research, such as further exploring the methods by which police are expanding their data collection. Caution may be required when sharing information online. Results from the study may inspire future research regarding the privacy and ethical considerations of using social media data collected from the public.
- Date Issued
- 2023
- PURL
- http://purl.flvc.org/fau/fd/FA00014352
- Subject Headings
- Crime analysis, Social media--Data processing, Law enforcement
- Format
- Document (PDF)
- Title
- AI COMPUTATION OF L1-NORM-ERROR PRINCIPAL COMPONENTS WITH APPLICATIONS TO TRAINING DATASET CURATION AND DETECTION OF CHANGE.
- Creator
- Varma, Kavita, Pados, Dimitris, Florida Atlantic University, Department of Computer and Electrical Engineering and Computer Science, College of Engineering and Computer Science
- Abstract/Description
-
The aim of this dissertation is to achieve a thorough understanding and develop an algorithmic framework for a crucial aspect of autonomous and artificial intelligence (AI) systems: data analysis. In the current era of AI and machine learning (ML), "data" holds paramount importance. For effective learning tasks, it is essential to ensure that the training dataset is accurate and comprehensive. Additionally, during system operation, it is vital to identify and address faulty data to prevent potentially catastrophic system failures. Our research in data analysis focuses on creating new mathematical theories and algorithms for outlier-resistant matrix decomposition using L1-norm principal component analysis (PCA). L1-norm PCA has demonstrated robustness against irregular data points and will be pivotal for future AI learning and autonomous system operations. This dissertation presents a comprehensive exploration of L1-norm techniques and their diverse applications. A summary of our contributions in this manuscript follows: Chapter 1 establishes the foundational mathematical notation and linear algebra concepts critical for the subsequent discussions, along with a review of the complexities of the current state-of-the-art in L1-norm matrix decomposition algorithms. In Chapter 2, we address the L1-norm error decomposition problem by introducing a novel method called "Individual L1-norm-error Principal Component Computation by 3-layer Perceptron" (Perceptron L1 error). Extensive studies demonstrate the efficiency of this greedy L1-norm PC calculator.
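As a generic illustration of outlier-resistant rank-one decomposition under an L1 error criterion, the sketch below uses iteratively reweighted least squares (IRLS). This is a simple stand-in chosen for brevity; it is explicitly not the dissertation's 3-layer-perceptron algorithm.

```python
# Rank-one L1-error approximation by IRLS: approximately minimize
# sum_ij |X_ij - u_i * v_j|, which downweights outlying entries.
import numpy as np

def l1_rank_one(X, iters=100, eps=1e-8):
    u, s, vt = np.linalg.svd(X, full_matrices=False)
    u, v = u[:, 0] * s[0], vt[0]          # L2 solution as the starting point
    for _ in range(iters):
        W = 1.0 / (np.abs(X - np.outer(u, v)) + eps)   # IRLS weights
        # Closed-form weighted least-squares update for each factor in turn.
        u = (W * X) @ v / (W @ (v * v) + eps)
        v = (W * X).T @ u / (W.T @ (u * u) + eps)
    return u, v
```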
- Date Issued
- 2024
- PURL
- http://purl.flvc.org/fau/fd/FA00014460
- Subject Headings
- Artificial intelligence, Machine learning, Neural networks (Computer science), Data Analysis
- Format
- Document (PDF)
- Title
- Big data and analytics: the future of music marketing.
- Creator
- Capodilupo, Daniella, Abrams, Ira, Florida Atlantic University, College of Business, Department of Management
- Abstract/Description
-
This is a comprehensive study of how Big Data and analytics will be the future of music marketing. There has been a recent trend of turning metrics into quantifiable, real-world predictions. With an increase in online music consumption along with the use of social media, there is now a clearer view than ever before of how this will happen. Instead of solely relying on big record companies for an artist to make it to the big time, there is now a plethora of data and analytics available not just to a small number of big companies, but to anyone.
- Date Issued
- 2015
- PURL
- http://purl.flvc.org/fau/fd/FA00004353
- Subject Headings
- Big data -- Economic aspects, Consumer behavior, Internet marketing, Marketing -- Data processing, Music and the Internet, Musical analysis -- Data processing
- Format
- Document (PDF)
- Title
- Shamir's secret sharing scheme using floating point arithmetic.
- Creator
- Finamore, Timothy., Charles E. Schmidt College of Science, Department of Mathematical Sciences
- Abstract/Description
-
Implementing Shamir's secret sharing scheme using floating point arithmetic would provide a faster and more efficient secret sharing scheme due to the speed with which GPUs perform floating point arithmetic. However, with the loss of a finite field, the properties of a perfect secret sharing scheme are not immediately attainable. The goal is to analyze the plausibility of Shamir's secret sharing scheme using floating point arithmetic achieving the properties of a perfect secret sharing scheme, and to propose improvements to attain these properties. Experiments indicate that property 2 of a perfect secret sharing scheme, "Any k-1 or fewer participants obtain no information regarding the shared secret," is compromised when Shamir's secret sharing scheme is implemented with floating point arithmetic. These experimental results also provide information regarding possible solutions and adjustments, one of which is selecting randomly generated points from a smaller interval in one of the proposed schemes of this thesis. Further experimental results indicate improvement using the scheme outlined. Possible attacks are run to test the desirable properties of the different schemes and reinforce the improvements observed in prior experiments.
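The scheme under study is easy to state concretely: shares are evaluations of a random polynomial with the secret as its constant term, and reconstruction is Lagrange interpolation at zero, all in ordinary floating point. The sketch below illustrates that setting (the interval choice is a placeholder); a production scheme would use finite-field arithmetic.

```python
# Shamir (k, n) threshold sharing evaluated in floating point rather than a
# finite field, the setting whose information leakage the thesis analyzes.
import random

def make_shares(secret, k, n, interval=(-1.0, 1.0)):
    """Random degree-(k-1) polynomial with f(0) = secret; shares are (x, f(x))."""
    coeffs = [secret] + [random.uniform(*interval) for _ in range(k - 1)]
    xs = [float(i) for i in range(1, n + 1)]
    return [(x, sum(c * x**j for j, c in enumerate(coeffs))) for x in xs]

def reconstruct(shares):
    """Lagrange interpolation at x = 0, in ordinary float arithmetic."""
    secret = 0.0
    for i, (xi, yi) in enumerate(shares):
        li = 1.0
        for j, (xj, _) in enumerate(shares):
            if i != j:
                li *= -xj / (xi - xj)
        secret += yi * li
    return secret   # subject to rounding error, unlike the finite-field scheme

shares = make_shares(3.14159, k=3, n=5)
print(reconstruct(shares[:3]))   # any 3 of the 5 shares recover the secret
```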
- Date Issued
- 2012
- PURL
- http://purl.flvc.org/FAU/3342048
- Subject Headings
- Signal processing, Digital techniques, Mathematics, Data encryption (Computer science), Computer file sharing, Security measures, Computer algorithms, Numerical analysis, Data processing
- Format
- Document (PDF)
- Title
- Text Mining and Topic Modeling for Social and Medical Decision Support.
- Creator
- Hurtado, Jose Luis, Zhu, Xingquan, Florida Atlantic University, College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
- Abstract/Description
-
Effective decision support plays a vital role in people's daily lives, as well as for professional practitioners such as health care providers. Without correct information and timely derived knowledge, a decision is often suboptimal and may result in significant financial loss or compromises of performance. In this dissertation, we study text mining and topic modeling and propose to use text mining methods, in combination with topic models, to discover knowledge from texts popularly available from a wide variety of sources, such as research publications, news, and medical diagnosis notes, and to further employ the discovered knowledge to assist social and medical decision support. Examples of such decisions include hospital patient readmission prediction, which is a national initiative for health care cost reduction; academic research topic discovery and trend modeling; and social preference modeling for friend recommendation in social networks. To carry out text mining, our research, in Chapter 3, first emphasizes single-document analysis to investigate textual stylometric features for user profiling and recognition. Our research confirms that by using properly designed features, it is possible to identify the author who wrote an article, using a number of sample articles written by the author as the training data. This study serves as the base to assert that text mining is a powerful tool for capturing knowledge in texts for better decision making. In Chapter 4, we advance our research from single documents to documents with interdependency relationships, and propose to model and predict citation relationships between documents. Given a collection of documents with known linkage relationships, our research discovers effective features to train prediction models and predicts the likelihood of two documents having a citation relationship. This study helps accurately model social network linkage relationships, and can be used to assist effective decision making for friend recommendation in social networking, reference recommendation in scientific writing, etc. In Chapter 5, we advance a topic discovery and trend prediction principle to discover meaningful topics from a data collection and further model the evolution trend of each topic. By proposing techniques to discover topics from text, and using temporal correlation between trends for prediction, our techniques can be used to summarize a large collection of documents as meaningful topics and further forecast the popularity of a topic in the near future. This study can help design systems that discover popular topics in social media and further assist resource planning and scheduling based on the discovered topics and their evolution trends. In Chapter 6, we employ both text mining and topic modeling in the medical domain for effective decision making. The goal is to discover knowledge from medical notes to predict the risk of a patient being readmitted in the near future. Our research emphasizes the challenge that readmitted patients are only a small portion of the patient population, although they bring significant financial loss. As a result, the datasets are highly imbalanced, which often results in poor accuracy for decision making. Our research proposes to use latent topic modeling to carry out localized sampling, and to combine models trained from multiple copies of sampled data for accurate prediction. This study can be directly used to assist hospital readmission assessment for early warning and decision support. The text mining and topic modeling techniques investigated in this dissertation can be applied to many other domains involving texts and social relationships, toward pattern- and knowledge-based effective decision making.
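The Chapter 6 recipe, topic modeling followed by localized sampling to counter class imbalance, can be illustrated with off-the-shelf scikit-learn pieces: fit LDA on the notes, assign each note its dominant topic, and balance classes within each topic before training an ensemble member. The pipeline below is an illustrative reconstruction, not the dissertation's exact procedure.

```python
# Topic-guided (localized) balanced sampling for imbalanced text classification;
# notes is a list of strings, y a numpy 0/1 label array.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

def topic_balanced_models(notes, y, n_topics=5, copies=3, seed=0):
    rng = np.random.default_rng(seed)
    X = CountVectorizer(max_features=5000, stop_words="english").fit_transform(notes)
    topics = LatentDirichletAllocation(n_components=n_topics,
                                       random_state=seed).fit_transform(X).argmax(1)
    models = []
    for _ in range(copies):                    # multiple sampled copies
        keep = []
        for t in range(n_topics):              # balance classes within each topic
            pos = np.where((topics == t) & (y == 1))[0]
            neg = np.where((topics == t) & (y == 0))[0]
            n = min(len(pos), len(neg))
            if n == 0:
                continue
            keep += list(rng.choice(pos, size=n, replace=False))
            keep += list(rng.choice(neg, size=n, replace=False))
        keep = np.array(keep)
        models.append(LogisticRegression(max_iter=1000).fit(X[keep], y[keep]))
    return models   # average predict_proba over models for the final risk score
```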
- Date Issued
- 2016
- PURL
- http://purl.flvc.org/fau/fd/FA00004782
- Subject Headings
- Social sciences--Research--Methodology., Data mining., Machine learning., Database searching., Discourse analysis--Data processing., Communication--Network analysis., Medical care--Quality control.
- Format
- Document (PDF)
- Title
- Statistics preserving spatial interpolation methods for missing precipitation data.
- Creator
- El Sharif, Husayn., College of Engineering and Computer Science, Department of Civil, Environmental and Geomatics Engineering
- Abstract/Description
-
Deterministic and stochastic weighting methods are commonly used for estimating missing precipitation rain gauge data based on values recorded at neighboring gauges. However, these spatial interpolation methods are seldom checked for their ability to preserve site and regional statistics. Such statistics are primarily defined by spatial correlations and other site-to-site statistics in a region. Preservation of site and regional statistics represents a means of assessing the validity of missing precipitation estimates at a site. This study evaluates the efficacy of traditional interpolation methods for estimation of missing data in preserving site and regional statistics. New optimal spatial interpolation methods intended to preserve these statistics are also proposed and evaluated in this study. Rain gauge sites in the state of Kentucky are used as a case study, and several error and performance measures are used to evaluate the trade-offs in accuracy of estimation and preservation of site and regional statistics.
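Inverse-distance weighting (IDW) is representative of the traditional deterministic interpolators evaluated here: a missing gauge value is estimated as a distance-weighted average of neighboring gauges. A minimal sketch follows; the power parameter is an illustrative choice.

```python
# Inverse-distance weighting estimate of a missing precipitation value.
import numpy as np

def idw_estimate(target_xy, gauge_xy, gauge_values, power=2.0):
    """target_xy: (2,) location; gauge_xy: (n, 2); gauge_values: (n,)."""
    d = np.linalg.norm(gauge_xy - target_xy, axis=1)
    if np.any(d == 0):                  # target coincides with a gauge
        return float(gauge_values[np.argmin(d)])
    w = 1.0 / d**power                  # nearer gauges dominate the estimate
    return float(np.sum(w * gauge_values) / np.sum(w))
```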
- Date Issued
- 2012
- PURL
- http://purl.flvc.org/FAU/3355568
- Subject Headings
- Numerical analysis, Meteorology, Statistical methods, Spatial analysis (Statistics), Data processing, Atmospheric physics, Statistical methods, Geographic information systems, Mathematical models
- Format
- Document (PDF)
- Title
- Discounting the role of causal attributions in the ANOVA model of attribution.
- Creator
- Hakala, Kori A., Charles E. Schmidt College of Science, Department of Psychology
- Abstract/Description
-
For years attribution research has been dominated by the ANOVA model of behavior which proposes that people construct their dispositional attributions of others by carefully comparing and weighing all situational information using mental computations similar to the processes used by researchers to analyze data. A preliminary experiment successfully determined that participants were able to distinguish differences in variability assessed across persons (high vs. low consensus) and across situations (high vs. low distinctiveness). Also, it was clear that the subjects could evaluate varying levels of situational constraint. A primary experiment administered to participants immediately following the preliminary study determined that participants grossly under-utilized those same variables when making dispositional attributions. Results gave evidence against the use of traditional ANOVA models and support for the use of the Behavior Averaging Principle of Attribution.
- Date Issued
- 2008
- PURL
- http://purl.flvc.org/FAU/166450
- Subject Headings
- Social sciences, Statistical methods, Analysis of variance, Data processing, Mathematical statistics, Attribution (Social psychology)
- Format
- Document (PDF)
- Title
- Non-destructive evaluation of reinforced asphalt pavement built over soft organic soils.
- Creator
- Pohly, Daniel D., College of Engineering and Computer Science, Department of Civil, Environmental and Geomatics Engineering
- Abstract/Description
-
Research, tests and analysis are presented on several reinforcements placed in the asphalt overlay of a roadway built over soft organic soils. Non-destructive Evaluation (NDE) methods and statistical analysis were used to characterize the pavement before and after rehabilitative construction. Before reconstruction, falling weight deflectometer, rut and ride tests were conducted to evaluate the existing pavement and determine the statistical variability of critical site characteristics. Twenty-four 500-ft test sections were constructed on the roadway including sixteen reinforced asphalt and eight control sections at two test locations that possessed significantly different subsoil characteristics. NDE tests were repeated after reconstruction to characterize the improvements of the test sections. Test results were employed to quantify the stiffness properties of the pavement based on load-deflection data to evaluate the relative performance of the reinforced sections. Statistical analysis of the data showed the stiffness of the reinforced sections was consistently higher than the control sections.
- Date Issued
- 2009
- PURL
- http://purl.flvc.org/FAU/368253
- Subject Headings
- Soil remediation, Technological innovations, Structural stability, Design, Pavements, Performance, Management, Data processing, Structural analysis (Engineering)
- Format
- Document (PDF)
- Title
- Hybrid model for optimization of cost operations for a university transit service.
- Creator
- Portal Palomo, Alicia Benazir., College of Engineering and Computer Science, Department of Civil, Environmental and Geomatics Engineering
- Abstract/Description
-
The demand on transportation infrastructure is dramatically increasing due to population growth, pushing transportation systems to their limits. With the projected population growth, not only for the U.S. overall but especially for higher education, university campuses are of great importance to transportation engineers. Urban university campuses are considered major trip generators, and with the population forecast many challenges are bound to arise. The implementation of an improved transit system provides a lower-cost solution to the continuously increasing congestion problems in university campus road networks and surrounding areas. This paper presents a methodology focused on the development of a hybrid system centered on three main aspects of transit functionality: access to bus stop locations, reasonable travel time, and low cost. Two methods for bus stop location assessment are presented for two levels of analysis: microscopic and mesoscopic. The resulting travel time from the improved bus stop locations is analyzed and compared to the initial conditions using a microsimulation platform. The development of a mathematical model targets the overall system's cost minimization, including user and operator cost, while maximizing service coverage. The results demonstrate the benefits of the bus stop assessment by the two applied methods, as well as the benefits of route and headway selection based on the mathematical model. Moreover, the results indicate that generating routes with travel time as the impedance factor produces the optimal routes for obtaining the minimum overall system cost.
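The user-versus-operator cost trade-off that drives headway selection has a classic closed form: operator cost scales with vehicles per hour (1/h) while average passenger wait scales with h/2, giving a square-root-optimal headway. The sketch below illustrates that textbook trade-off with made-up coefficients; it is not the thesis's calibrated model.

```python
# Classic headway optimization: minimize C(h) = veh_cost*cycle/h
# + wait_value*demand*h/2, whose minimizer is the square-root formula below.
import numpy as np

def optimal_headway(cycle_time_hr, veh_cost_hr, demand_pax_hr, wait_value_hr):
    h = np.sqrt(2 * veh_cost_hr * cycle_time_hr / (wait_value_hr * demand_pax_hr))
    fleet = np.ceil(cycle_time_hr / h)          # vehicles needed for that headway
    return h, fleet

h, fleet = optimal_headway(cycle_time_hr=1.0, veh_cost_hr=80.0,
                           demand_pax_hr=300.0, wait_value_hr=15.0)
print(f"headway = {h*60:.1f} min, fleet = {fleet:.0f} buses")
```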
- Date Issued
- 2012
- PURL
- http://purl.flvc.org/FAU/3352277
- Subject Headings
- Local transit, Statistics, Transportation planning, Mathematical models, System analysis, Statistical methods, Transportation, Data processing
- Format
- Document (PDF)
- Title
- An Empirical Study of Ordinal and Non-ordinal Classification Algorithms for Intrusion Detection in WLANs.
- Creator
- Gopalakrishnan, Leelakrishnan, Khoshgoftaar, Taghi M., Florida Atlantic University
- Abstract/Description
-
Ordinal classification refers to an important category of real-world problems, in which the attributes of the instances to be classified and the classes are linearly ordered. Many applications of machine learning frequently involve situations exhibiting an order among the different categories represented by the class attribute. In ordinal classification the class value is converted into a numeric quantity and regression algorithms are applied to the transformed data. The data is later translated back into a discrete class value in a postprocessing step. This thesis is devoted to an empirical study of ordinal and non-ordinal classification algorithms for intrusion detection in WLANs. We used ordinal classification in conjunction with nine classifiers for the experiments in this thesis. All classifiers are part of the WEKA machine-learning workbench. The results indicate that most of the classifiers give similar or better results with ordinal classification compared to non-ordinal classification.
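The ordinal-classification recipe described above is straightforward to sketch: map the ordered classes to integers, fit any regressor, and round predictions back to the nearest class in a post-processing step. The thesis uses regression schemes within WEKA; the scikit-learn stand-in below is illustrative only.

```python
# Ordinal classification via regression: ordered classes -> integers ->
# regression -> rounded back to the nearest class.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

class OrdinalViaRegression:
    def __init__(self, classes_in_order):
        self.classes = list(classes_in_order)       # e.g. ["normal", "probe", "attack"]
        self.to_num = {c: i for i, c in enumerate(self.classes)}
        self.reg = RandomForestRegressor(random_state=0)

    def fit(self, X, y):
        self.reg.fit(X, [self.to_num[c] for c in y])
        return self

    def predict(self, X):
        raw = self.reg.predict(X)                   # continuous predictions
        idx = np.clip(np.rint(raw), 0, len(self.classes) - 1).astype(int)
        return [self.classes[i] for i in idx]       # post-processing step
```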
- Date Issued
- 2006
- PURL
- http://purl.flvc.org/fau/fd/FA00012521
- Subject Headings
- Wireless LANs--Security measures, Computer networks--Security measures, Data structures (Computer science), Multivariate analysis
- Format
- Document (PDF)
- Title
- Data Envelopment Analysis Model for Assessment of Safety and Security of Intermodal Transportation Facilities.
- Creator
- Gundersen, Elisabeth, Kaisar, Evangelos I., Florida Atlantic University
- Abstract/Description
-
Following September 11, 2001, numerous security policies have been created which have caused a number of unique challenges in planning for the transportation networks. In particular, there is a need to enhance security by improving collaboration between various transportation modes. The transportation modes are disconnected and have unequal levels of security and efficiency. Tools need to be refined for collaboration and consensus building to serve as catalysts for efficient transportation solutions. In this study, we developed and investigated a mathematical model using Data Envelopment Analysis (DEA) to assess the safety and security of intermodal transportation facilities. The model identifies the best and worst performers by assessing several safety and security-related variables. The DEA model can assess the efficiency level of safety and security of intermodal facilities and identify potential solutions for improvement.
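The standard input-oriented CCR formulation of DEA scores each facility (decision-making unit) with a small linear program: shrink the unit's inputs by a factor θ while a convex combination of all units still matches its outputs. The sketch below solves that textbook LP with scipy; the safety and security inputs/outputs shown are placeholders, not the study's variables.

```python
# Input-oriented CCR efficiency score for one DMU:
#   min theta  s.t.  X.T @ lam <= theta * x_k,  Y.T @ lam >= y_k,  lam >= 0
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, k):
    """X: (n_dmus, n_inputs); Y: (n_dmus, n_outputs); k: DMU to score."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.r_[1.0, np.zeros(n)]                  # variables: [theta, lam_1..lam_n]
    A_in = np.c_[-X[k].reshape(m, 1), X.T]       # inputs:  X.T@lam - theta*x_k <= 0
    A_out = np.c_[np.zeros((s, 1)), -Y.T]        # outputs: -Y.T@lam <= -y_k
    res = linprog(c, A_ub=np.r_[A_in, A_out],
                  b_ub=np.r_[np.zeros(m), -Y[k]],
                  bounds=[(0, None)] * (1 + n))
    return res.fun   # efficiency in (0, 1]; 1.0 means frontier-efficient

X = np.array([[4.0, 2.0], [6.0, 3.0], [8.0, 2.5]])   # toy inputs (e.g. staffing, cost)
Y = np.array([[1.0], [1.2], [0.9]])                  # toy output (e.g. screening rate)
print([round(ccr_efficiency(X, Y, k), 3) for k in range(3)])
```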
- Date Issued
- 2008
- PURL
- http://purl.flvc.org/fau/fd/FA00012524
- Subject Headings
- Data envelopment analysis, Benchmarking (Management), Transportation and state--United States, Performance--Measurement
- Format
- Document (PDF)
- Title
- Comparing the use of second language communication strategies in oral interaction and synchronous computer-mediated communication.
- Creator
- Knierim, Markus., Florida Atlantic University, DuBravac, Stayc
- Abstract/Description
-
This study investigates whether synchronous computer-mediated communication (CMC) has the potential to foster second language learners' strategic competence (as a component of communicative competence). For this purpose, the use of communication strategies (CSs) by 15 fourth-semester students of German during four computer-mediated and four oral "jigsaw" tasks is compared. The students used more CSs in oral interaction, which is attributed to a lesser degree of utterance planning in oral interaction and stronger time constraints in synchronous CMC. However, this quantitative difference is due to only five students' use of significantly more CSs in oral interaction. The distribution of the various CS types was similar in both communication modes; only code-switching occurred much more frequently in synchronous CMC, which is attributed to stronger time constraints in this medium and less monitoring by the instructor. Hence, synchronous CMC is not superior to oral interaction as far as promoting CS use is concerned.
- Date Issued
- 2001
- PURL
- http://purl.flvc.org/fcla/dt/12814
- Subject Headings
- Telematics, Interaction analysis in education, Second language acquisition--Data processing
- Format
- Document (PDF)
- Title
- Visualization as a Qualitative Method for Analysis of Data from Location Tracking Technologies.
- Creator
- Mani, Mohan, VanHilst, Michael, Pandya, Abhijit S., Hsu, Sam, Florida Atlantic University, College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
- Abstract/Description
-
One of the biggest factors in the quest for better wireless communication is cellular call handoff, which in turn is a function of geographic location. In this thesis, our fundamental goal was to demonstrate the value added by spatial data visualization techniques for the analysis of geo-referenced data from two different location tracking technologies: GPS and cellular systems. Through our efforts, we unearthed some valuable and surprising insights from the data being analyzed that led to interesting observations about the data itself, as opposed to the entity or entities that the data is supposed to describe. In doing so, we underscored the value added by spatial data visualization techniques even in the incipient stages of analysis of geo-referenced data from cellular networks. We also demonstrated the value of visualization techniques as a verification tool to verify the results of analysis done through other methods, such as statistical analysis.
- Date Issued
- 2008
- PURL
- http://purl.flvc.org/fau/fd/FA00012536
- Subject Headings
- Mobile communication systems, Algorithms--Data analysis, Radio--Transmitters and transmissions, Code division multiple access
- Format
- Document (PDF)