alishahassan/data_mining_project

Social Sentiment Shift: The Impact of Celebrity Endorsements

Overview

This project analyzes the impact of Elon Musk's endorsement of Donald Trump on public sentiment and engagement across various Reddit communities during the 2024 Presidential Election. Using sentiment analysis and engagement metrics, we track how different political, social, and demographic groups reacted, and highlight shifts in discourse before and after the endorsement.

How to Use

  1. Clone or download this repository to your local machine.
  2. Open the project in a Python-supported IDE such as PyCharm or Visual Studio Code.
  3. Install the required dependencies by running:
    pip install praw pandas nltk
  4. Use the 400project.py script to collect Reddit data through the PRAW API.
  5. Use the DataProcessing.py script to clean and process the collected data using NLTK for sentiment analysis.
  6. Visualize the results and trends through engagement metrics and sentiment graphs generated from the analysis.
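
As a rough illustration of the collection step, here is a minimal sketch of a keyword search across subreddits with PRAW. It is not the contents of 400project.py; the function names `make_client` and `collect_posts`, the output field names, and the credentials passed in are all assumptions made for this example.

```python
from datetime import datetime, timezone


def make_client(client_id, client_secret, user_agent):
    """Build a PRAW client. praw is imported lazily so the helper below
    can be exercised without the package installed."""
    import praw  # installed via `pip install praw`

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )


def collect_posts(reddit, subreddits, keyword, limit=100):
    """Search each subreddit for `keyword` and return one flat dict per post."""
    rows = []
    for name in subreddits:
        # Subreddit.search yields Submission objects matching the query.
        for post in reddit.subreddit(name).search(keyword, sort="new", limit=limit):
            created = datetime.fromtimestamp(post.created_utc, tz=timezone.utc)
            rows.append({
                "subreddit": name,
                "id": post.id,
                "title": post.title,
                "text": post.selftext,
                "score": post.score,
                "num_comments": post.num_comments,
                "created": created.isoformat(),  # hypothetical field name
            })
    return rows
```

A call might look like `collect_posts(make_client(...), ["politics"], "Musk endorsement")`, with the subreddit list and keyword chosen for the analysis; the values shown here are placeholders.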

Project Structure

  • Data Collection: Scrapes data from Reddit using the PRAW API based on selected subreddits and keywords.
  • Data Processing: Cleans and analyzes the collected data using natural language processing (NLP) tools such as NLTK's VADER for sentiment scoring.
  • Engagement Metrics: Evaluates the engagement trends (e.g., number of posts, upvotes, comments) before and after the endorsement.
  • Graphs and Visualizations: Generates sentiment distribution, trend analysis, and engagement metric visualizations.
  • JSON Reddit Analysis: Parses the raw Reddit data into structured formats (JSON) for further analysis and reproducibility.
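
The before/after engagement comparison and JSON output described in this list could be sketched as follows. This is an illustration under stated assumptions, not the project's code: the cutoff date, the `engagement_summary` name, and the record fields (`created`, `score`, `num_comments`) are all hypothetical and should be replaced with whatever the analysis actually uses.

```python
import json
from datetime import datetime, timezone
from statistics import mean

# Assumption: cutoff marking the endorsement; adjust to the date used in the analysis.
ENDORSEMENT = datetime(2024, 7, 13, tzinfo=timezone.utc)


def engagement_summary(posts, cutoff=ENDORSEMENT):
    """Split posts at `cutoff` and report post count and mean engagement per period."""
    buckets = {"before": [], "after": []}
    for post in posts:
        # `created` is expected to be a timezone-aware ISO 8601 string.
        created = datetime.fromisoformat(post["created"])
        buckets["before" if created < cutoff else "after"].append(post)

    summary = {}
    for period, items in buckets.items():
        summary[period] = {
            "posts": len(items),
            "mean_score": mean(p["score"] for p in items) if items else 0.0,
            "mean_comments": mean(p["num_comments"] for p in items) if items else 0.0,
        }
    return summary


def to_json(summary):
    """Serialize the summary so the result can be stored and reproduced."""
    return json.dumps(summary, indent=2, sort_keys=True)
```

Keeping the summary as a plain dictionary makes it easy to write to disk with `to_json` or to load into a Pandas DataFrame for plotting.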

Features

  • Reddit Data Collection: Uses the PRAW API to gather posts and comments from targeted subreddits.
  • Sentiment Analysis: Uses the NLTK VADER tool to classify sentiment into positive, negative, and neutral categories.
  • Engagement Metrics: Measures upvotes, comments, and post frequency over time to observe shifts in user engagement.
  • Topic Modeling (Optional): Uses jsLDA to identify key discussion topics before and after the endorsement.
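
To make the three-way sentiment classification concrete, here is a minimal sketch using NLTK's VADER. The ±0.05 cutoff on the compound score is a commonly used convention and is an assumption here; the project may use different thresholds. The function names `label_from_compound` and `score_texts` are hypothetical.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label.
    The threshold is an assumed convention, not a value taken from the project."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_texts(texts):
    """Score each text with VADER and attach a label.
    nltk is imported lazily; the lexicon is downloaded on first use."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)
    analyzer = SentimentIntensityAnalyzer()

    results = []
    for text in texts:
        compound = analyzer.polarity_scores(text)["compound"]
        results.append({
            "text": text,
            "compound": compound,
            "label": label_from_compound(compound),
        })
    return results
```

Separating the thresholding from the scoring keeps the labeling rule easy to adjust and to check without downloading the lexicon.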

Requirements

  • Python 3.x
  • PRAW (Python Reddit API Wrapper)
  • Pandas
  • NLTK (Natural Language Toolkit)

Repository Link

GitHub Repository

About

This project investigates public sentiment toward the 2024 Presidential Election before and after celebrity endorsements. In particular, it focuses on Elon Musk's endorsement of Donald Trump and how it affected sentiment among Reddit users from different political, social, and demographic groups.
