Current Search: Big data--Data processing
- Title
- INVESTIGATING MACHINE LEARNING ALGORITHMS WITH IMBALANCED BIG DATA.
- Creator
- Hasanin, Tawfiq, Khoshgoftaar, Taghi M., Florida Atlantic University, College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
- Abstract/Description
-
Recent technological developments have engendered the rapid production of big data and have also enabled machine learning algorithms to produce high-performance models from such data. Nonetheless, class imbalance (in binary classification) between the majority and minority classes in big data can skew the predictive performance of classification algorithms toward the majority (negative) class, whereas the minority (positive) class usually holds greater value for decision makers. Such bias may lead to adverse consequences, some of them even life-threatening, when false negatives are generally costlier than false positives. The size of the minority class can vary from fair to extraordinarily small, which can lead to different performance scores for machine learning algorithms. Class imbalance is a well-studied area for traditional data, i.e., not big data. However, there is limited research focusing on both rarity and severe class imbalance in big data.
- Date Issued
- 2019
- PURL
- http://purl.flvc.org/fau/fd/FA00013316
- Subject Headings
- Algorithms, Machine learning, Big data--Data processing, Big data
- Format
- Document (PDF)
- Title
- SPATIAL NETWORK BIG DATABASE APPROACH TO RESOURCE ALLOCATION PROBLEMS.
- Creator
- Qutbuddin, Ahmad, Yang, KwangSoo, Florida Atlantic University, Department of Computer and Electrical Engineering and Computer Science, College of Engineering and Computer Science
- Abstract/Description
-
Resource allocation for Spatial Network Big Database is challenging due to the large size of spatial networks, the variety of spatial data types, and the fast update rate of spatial and temporal elements. It is challenging to learn, manage, and process the collected data and produce meaningful information in a limited time. The information produced must be concise and easy to understand, yet also descriptive and useful. My research aims to address these challenges through the development of fundamental data processing components for advanced spatial network queries that clearly and briefly deliver critical information. This thesis proposal studied two challenging Spatial Network Big Database problems: (1) Multiple Resource Network Voronoi Diagram and (2) Node-attributed Spatial Graph Partitioning. To address the challenge of query processing for multiple resource allocation in preparing for or after a disaster, we investigated the problem of the Multiple Resource Network Voronoi Diagram (MRNVD). Given a spatial network and a set of service centers from k different resource types, an MRNVD partitions the spatial network into a set of Service Areas that minimize the total cycle-distances of graph-nodes to their allotted k service centers with different resource types. The MRNVD problem is important for critical societal applications such as assigning essential survival supplies (e.g., food, water, gas, and medical assistance) to residents impacted by man-made or natural disasters. The MRNVD problem is NP-hard; it is computationally challenging due to the large size of the transportation network. Previous work proposed the Distance bounded Pruning (DP) approach to produce an optimal solution for MRNVD. However, we found that DP can be generalized to reduce the computational cost for the minimum cycle-distance. We extend our prior work and propose a novel approach that reduces the computational cost. Experiments using real-world datasets from five different regions demonstrate that the proposed approach creates MRNVD and significantly reduces the computational cost.
- Date Issued
- 2021
- PURL
- http://purl.flvc.org/fau/fd/FA00013854
- Subject Headings
- Spatial data infrastructures, Big data--Data processing, Resource allocation, Voronoi polygons
- Format
- Document (PDF)