Text Mining and Topic Modeling for Social and Medical Decision Support

Date Issued:
2016
Summary:
Effective decision support plays a vital role in people's daily lives, as well as for professional practitioners such as health care providers. Without correct information and timely derived knowledge, a decision is often suboptimal and may result in significant financial loss or compromised performance. In this dissertation, we study text mining and topic modeling and propose to use text mining methods, in combination with topic models, to discover knowledge from texts widely available from a variety of sources, such as research publications, news, and medical diagnosis notes, and then employ the discovered knowledge to assist social and medical decision support. Examples of such decisions include hospital patient readmission prediction, which is a national initiative for health care cost reduction; academic research topic discovery and trend modeling; and social preference modeling for friend recommendation in social networks. To carry out text mining, our research, in Chapter 3, first focuses on single-document analysis to investigate textual stylometric features for user profiling and recognition. Our research confirms that, with properly designed features, it is possible to identify the author of an article, using a number of sample articles written by that author as training data. This study serves as the basis for asserting that text mining is a powerful tool for capturing knowledge in texts for better decision making. In Chapter 4, we advance our research from single documents to documents with interdependency relationships, and propose to model and predict citation relationships between documents. Given a collection of documents with known linkage relationships, our research discovers effective features to train prediction models and predicts the likelihood that two documents share a citation relationship.
This study helps to accurately model social network linkage relationships, and can be used to assist effective decision making for friend recommendation in social networking and reference recommendation in scientific writing. In Chapter 5, we advance a topic discovery and trend prediction principle to discover meaningful topics from a data collection and to model the evolution trend of each topic. By proposing techniques to discover topics from text, and by using temporal correlation between trends for prediction, our techniques can be used to summarize a large collection of documents as meaningful topics and to forecast the popularity of a topic in the near future. This study can help design systems to discover popular topics in social media, and further assist resource planning and scheduling based on the discovered topics and their evolution trends. In Chapter 6, we apply both text mining and topic modeling to the medical domain for effective decision making. The goal is to discover knowledge from medical notes to predict the risk of a patient being readmitted in the near future. Our research focuses on the challenge that readmitted patients are only a small portion of the patient population, although they bring significant financial loss. As a result, the datasets are highly imbalanced, which often results in poor accuracy for decision making. Our research proposes to use latent topic modeling to carry out localized sampling, and to combine models trained from multiple copies of sampled data for accurate prediction. This study can be directly used to assist hospital readmission assessment for early warning and decision support. The text mining and topic modeling techniques investigated in the dissertation can be applied to many other domains involving texts and social relationships, toward pattern- and knowledge-based effective decision making.
Name(s): Hurtado, Jose Luis, author
Zhu, Xingquan, Thesis advisor
Florida Atlantic University, Degree grantor
College of Engineering and Computer Science
Department of Computer and Electrical Engineering and Computer Science
Type of Resource: text
Genre: Electronic Thesis Or Dissertation
Date Created: 2016
Date Issued: 2016
Publisher: Florida Atlantic University
Place of Publication: Boca Raton, Fla.
Physical Form: application/pdf
Extent: 142 p.
Language(s): English
Identifier: FA00004782 (IID)
Degree granted: Dissertation (Ph.D.)--Florida Atlantic University, 2016.
Collection: FAU Electronic Theses and Dissertations Collection
Note(s): Includes bibliography.
Subject(s): Social sciences--Research--Methodology.
Data mining.
Machine learning.
Database searching.
Discourse analysis--Data processing.
Communication--Network analysis.
Medical care--Quality control.
Held by: Florida Atlantic University Libraries
Sublocation: Digital Library
Links: http://purl.flvc.org/fau/fd/FA00004782
Persistent Link to This Record: http://purl.flvc.org/fau/fd/FA00004782
Use and Reproduction: Copyright © is held by the author with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Use and Reproduction: http://rightsstatements.org/vocab/InC/1.0/
Host Institution: FAU
Is Part of Series: Florida Atlantic University Digital Library Collections.