[PAST EVENT] Honors Thesis Defense - Grace Smith   

May 3, 2022
5:15pm - 6:45pm
Jones Hall, Room 301
200 Ukrop Way
Williamsburg, VA 23185

Title: Investigating Text Mining Techniques Within the Context of Politicized Social Media Data

Abstract: Social media data has recently been looked to as a source of public opinion on elections, public policies, and the economy. To use this data effectively, natural language processing (NLP) techniques have been developed. Topic modeling, one branch of NLP, works to uncover latent topics within a large collection of tweets. Many topic modeling methods, such as LDA and k-medoids clustering, are unsupervised. We propose adding a supervised Random Forest layer before performing topic modeling in order to incorporate externally known topics. We find that implementing this layer helps increase the interpretability of topics and uncovers unique topics. Sentiment analysis, another branch of NLP, measures the polarity of a tweet in order to gain insight into the author’s opinions. We apply several sentiment analysis methods to our dataset and examine the results; we then identify weaknesses in these methods and propose steps for improvement.
Grace Smith