Current Search: Electronic Health Records (x)
View All Items
- Title
- STREAMLINING CLINICAL DETECTION OF ALZHEIMER’S DISEASE USING ELECTRONIC HEALTH RECORDS AND MACHINE LEARNING TECHNIQUES.
- Creator
- Kleiman, Michael J., Barenholtz, Elan, Florida Atlantic University, Charles E. Schmidt College of Science, Department of Psychology
- Abstract/Description
-
Alzheimer’s disease is typically detected using a combination of cognitive-behavioral assessment exams and interviews of both the patient and a family member or caregiver, both administered and interpreted by a trained physician. This procedure, while standard in medical practice, can be time consuming and expensive for both the patient and the diagnostician especially because proper training is required to interpret the collected information and determine an appropriate diagnosis. The use of...
Show moreAlzheimer’s disease is typically detected using a combination of cognitive-behavioral assessment exams and interviews of both the patient and a family member or caregiver, both administered and interpreted by a trained physician. This procedure, while standard in medical practice, can be time consuming and expensive for both the patient and the diagnostician especially because proper training is required to interpret the collected information and determine an appropriate diagnosis. The use of machine learning techniques to augment diagnostic procedures has been previously examined in limited capacity but to date no research examines real-world medical applications of predictive analytics for health records and cognitive exam scores. This dissertation seeks to examine the efficacy of detecting cognitive impairment due to Alzheimer’s disease using machine learning, including multi-modal neural network architectures, with a real-world clinical dataset used to determine the accuracy and applicability of the generated models. An in-depth analysis of each type of data (e.g. cognitive exams, questionnaires, demographics) as well as the cognitive domains examined (e.g. memory, attention, language) is performed to identify the most useful targets, with cognitive exams and questionnaires being found to be the most useful features and short-term memory, attention, and language found to be the most important cognitive domains. In an effort to reduce medical costs and streamline procedures, optimally predictive and efficient groups of features were identified and selected, with the best performing and economical group containing only three questions and one cognitive exam component, producing an accuracy of 85%. The most effective diagnostic scoring procedure was examined, with simple threshold counting based on medical documentation being identified as the most useful. Overall predictive analysis found that Alzheimer’s disease can be detected most accurately using a bimodal multi-input neural network model using separated cognitive domains and questionnaires, with a detection accuracy of 88% using the real-world testing set, and that the technique of analyzing domains separately serves to significantly improve model efficacy compared to models that combine them.
Show less - Date Issued
- 2019
- PURL
- http://purl.flvc.org/fau/fd/FA00013326
- Subject Headings
- Alzheimer's disease, Electronic Health Records, Machine learning
- Format
- Document (PDF)
- Title
- PREDICTING MELANOMA RISK FROM ELECTRONIC HEALTH RECORDS WITH MACHINE LEARNING TECHNIQUES.
- Creator
- Richter, Aaron N., Khoshgoftaar, Taghi M., Florida Atlantic University, College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
- Abstract/Description
-
Melanoma is one of the fastest growing cancers in the world, and can affect patients earlier in life than most other cancers. Therefore, it is imperative to be able to identify patients at high risk for melanoma and enroll them in screening programs to detect the cancer early. Electronic health records collect an enormous amount of data about real-world patient encounters, treatments, and outcomes. This data can be mined to increase our understanding of melanoma as well as build personalized...
Show moreMelanoma is one of the fastest growing cancers in the world, and can affect patients earlier in life than most other cancers. Therefore, it is imperative to be able to identify patients at high risk for melanoma and enroll them in screening programs to detect the cancer early. Electronic health records collect an enormous amount of data about real-world patient encounters, treatments, and outcomes. This data can be mined to increase our understanding of melanoma as well as build personalized models to predict risk of developing the cancer. Cancer risk models built from structured clinical data are limited in current research, with most studies involving just a few variables from institutional databases or registries. This dissertation presents data processing and machine learning approaches to build melanoma risk models from a large database of de-identified electronic health records. The database contains consistently captured structured data, enabling the extraction of hundreds of thousands of data points each from millions of patient records. Several experiments are performed to build effective models, particularly to predict sentinel lymph node metastasis in known melanoma patients and to predict individual risk of developing melanoma. Data for these models suffer from high dimensionality and class imbalance. Thus, classifiers such as logistic regression, support vector machines, random forest, and XGBoost are combined with advanced modeling techniques such as feature selection and data sampling. Risk factors are evaluated using regression model weights and decision trees, while personalized predictions are provided through random forest decomposition and Shapley additive explanations. Random undersampling on the melanoma risk dataset shows that many majority samples can be removed without a decrease in model performance. To determine how much data is truly needed, we explore learning curve approximation methods on the melanoma data and three publicly-available large-scale biomedical datasets. We apply an inverse power law model as well as introduce a novel semi-supervised curve creation method that utilizes a small amount of labeled data.
Show less - Date Issued
- 2019
- PURL
- http://purl.flvc.org/fau/fd/FA00013342
- Subject Headings
- Melanoma, Electronic Health Records, Machine learning--Technique, Big Data
- Format
- Document (PDF)