You are here
An evaluation of machine learning algorithms for tweet sentiment analysis
- Date Issued:
- 2015
- Summary:
- Sentiment analysis of tweets is an application of mining Twitter, and is growing in popularity as a means of determining public opinion. Machine learning algorithms are used to perform sentiment analysis; however, data quality issues such as high dimensionality, class imbalance or noise may negatively impact classifier performance. Machine learning techniques exist for targeting these problems, but have not been applied to this domain, or have not been studied in detail. In this thesis we discuss research that has been conducted on tweet sentiment classification, its accompanying data concerns, and methods of addressing these concerns. We test the impact of feature selection, data sampling and ensemble techniques in an effort to improve classifier performance. We also evaluate the combination of feature selection and ensemble techniques and examine the effects of high dimensionality when combining multiple types of features. Additionally, we provide strategies and insights for potential avenues of future work.
Title: | An evaluation of machine learning algorithms for tweet sentiment analysis. |
233 views
71 downloads |
---|---|---|
Name(s): |
Prusa, Joseph D., author Khoshgoftaar, Taghi M., Thesis advisor Florida Atlantic University, Degree grantor College of Engineering and Computer Science Department of Computer and Electrical Engineering and Computer Science |
|
Type of Resource: | text | |
Genre: | Electronic Thesis Or Dissertation | |
Date Created: | 2015 | |
Date Issued: | 2015 | |
Publisher: | Florida Atlantic University | |
Place of Publication: | Boca Raton, Fla. | |
Physical Form: | application/pdf | |
Extent: | 87 p. | |
Language(s): | English | |
Summary: | Sentiment analysis of tweets is an application of mining Twitter, and is growing in popularity as a means of determining public opinion. Machine learning algorithms are used to perform sentiment analysis; however, data quality issues such as high dimensionality, class imbalance or noise may negatively impact classifier performance. Machine learning techniques exist for targeting these problems, but have not been applied to this domain, or have not been studied in detail. In this thesis we discuss research that has been conducted on tweet sentiment classification, its accompanying data concerns, and methods of addressing these concerns. We test the impact of feature selection, data sampling and ensemble techniques in an effort to improve classifier performance. We also evaluate the combination of feature selection and ensemble techniques and examine the effects of high dimensionality when combining multiple types of features. Additionally, we provide strategies and insights for potential avenues of future work. | |
Identifier: | FA00004460 (IID) | |
Degree granted: | Thesis (M.S.)--Florida Atlantic University, 2015 | |
Collection: | FAU Electronic Theses and Dissertations Collection | |
Note(s): | Includes bibliography. | |
Subject(s): |
Social media. Natural language processing (Computer science) Machine learning. Algorithms. Fuzzy expert systems. Artificial intelligence. |
|
Held by: | Florida Atlantic University Libraries | |
Sublocation: | Digital Library | |
Links: | http://purl.flvc.org/fau/fd/FA00004460 | |
Persistent Link to This Record: | http://purl.flvc.org/fau/fd/FA00004460 | |
Use and Reproduction: | Copyright © is held by the author, with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. | |
Use and Reproduction: | http://rightsstatements.org/vocab/InC/1.0/ | |
Host Institution: | FAU | |
Is Part of Series: | Florida Atlantic University Digital Library Collections. |