This repository contains an end-to-end data science project focused on modeling and analyzing climate change indicators using real-world climate data and user sentiment. The project combines Natural Language Processing (NLP), exploratory data analysis, machine learning, and geospatial visualizations to assess and project climate change impacts.
- Predict and analyze climate change indicators such as temperature anomalies, CO₂ emissions, and extreme weather trends.
- Analyze public sentiment around climate change using Facebook comments from NASA's climate page.
- Apply machine learning to forecast climate metrics and visualize their progression over time and geography.
- Language: Python
- Libraries: pandas, numpy, matplotlib, seaborn, scikit-learn, folium, geopandas, tqdm
- ML Models: Random Forest, XGBoost, LSTM (future scope)
- Development Tools: Jupyter Notebook, VS Code
- Exploratory Data Analysis (EDA):
  - CO₂ emission patterns across time and location
  - Sentiment trends on climate change topics
  - Outlier detection, skewness handling, and missing value imputation
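
The three cleaning steps named in the last EDA item can be sketched on a single numeric column. This is a minimal illustration rather than the notebook's actual code: the function name `clean_numeric_column`, the choice of median imputation, the 1.5 × IQR rule, and the `log1p` transform are all assumptions, and the values are made up.

```python
import numpy as np
import pandas as pd


def clean_numeric_column(series: pd.Series) -> pd.DataFrame:
    """Impute, flag outliers in, and de-skew one non-negative numeric column."""
    # 1. Missing-value imputation: the median is robust to extreme values.
    filled = series.fillna(series.median())

    # 2. Outlier detection with the 1.5 * IQR rule (Tukey's fences).
    q1, q3 = filled.quantile(0.25), filled.quantile(0.75)
    iqr = q3 - q1
    is_outlier = (filled < q1 - 1.5 * iqr) | (filled > q3 + 1.5 * iqr)

    # 3. Skewness handling: log1p compresses a long right tail.
    transformed = np.log1p(filled)

    return pd.DataFrame(
        {"filled": filled, "is_outlier": is_outlier, "log1p": transformed}
    )


# Made-up values for illustration only (not project data).
co2 = pd.Series([1.0, 2.0, 2.0, 3.0, np.nan, 3.0, 4.0, 100.0], name="co2")
result = clean_numeric_column(co2)
print(result)
print("skew before:", result["filled"].skew(), "after:", result["log1p"].skew())
```

Flagging outliers instead of dropping them keeps the decision (remove, cap, or keep) visible in later analysis steps.
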
- Sentiment Analysis (NLP)
  - Analyze over 500 user comments from NASA's Facebook Climate Change page (2020–2023).
  - Perform trend analysis and topic modeling using NLP.
  - Preserve user privacy by replacing user identifiers with SHA-256 hashes.
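
As a rough sketch of the SHA-256 anonymization step, the snippet below replaces each user identifier with its hex digest using Python's standard `hashlib`. The record layout, the `anonymize_user` helper, and the use of a salt are assumptions made for illustration; the project may hash identifiers differently.

```python
import hashlib


def anonymize_user(user_id: str, salt: str) -> str:
    """Return a hex SHA-256 digest that stands in for the real user ID."""
    # Prefixing a secret salt makes it harder to recover IDs by hashing
    # guessed usernames; the salt itself is an assumption of this sketch.
    return hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()


# Hypothetical comment records; the field names are illustrative.
comments = [
    {"user": "alice.example", "text": "The ice data is alarming."},
    {"user": "bob.example", "text": "Thanks for sharing this."},
    {"user": "alice.example", "text": "Where can I read more?"},
]

SALT = "replace-with-a-secret-value"
anonymized = [
    {"user": anonymize_user(c["user"], SALT), "text": c["text"]} for c in comments
]

for row in anonymized:
    print(row["user"][:12], row["text"])
```

Because the same input always yields the same digest, comments by one user can still be grouped for trend analysis without storing the original name.
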
- Climate Data Modeling
  - Dataset includes climate metrics such as CO₂ levels, solar radiation, temperature, sea level, and more.
  - Feature engineering and preprocessing (normalization, encoding, handling outliers and missing values).
  - Advanced time-series visualizations by week, year, and geographical coordinates.
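
The preprocessing steps listed above (missing-value handling, normalization, encoding) could be combined in a scikit-learn pipeline along these lines. The column names `co2`, `temperature`, and `region` and all values are hypothetical, and the specific choices (median imputation, standard scaling, one-hot encoding) are assumptions, not a record of what the notebook does.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical columns; the real dataset's schema may differ.
numeric_cols = ["co2", "temperature"]
categorical_cols = ["region"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        # Categorical columns: one-hot encode, ignoring unseen categories.
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Made-up rows for illustration only.
df = pd.DataFrame({
    "co2": [410.0, 415.0, np.nan, 420.0],
    "temperature": [14.1, np.nan, 14.6, 14.9],
    "region": ["north", "south", "north", "south"],
})

features = preprocess.fit_transform(df)
print(features.shape)
```

Keeping the steps in one transformer means the same fitted imputation and scaling parameters are reused when transforming new data.
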
- Machine Learning
  - Trained multiple models: Random Forest, Gradient Boosting, Neural Networks, and LSTM.
  - Model evaluation using MAE, MSE, R², and cross-validation.
  - Future forecasting and scenario simulation.
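
To make the evaluation step concrete, here is a small sketch that trains a Random Forest regressor and reports MAE, MSE, R², and 5-fold cross-validated R² with scikit-learn. The data is synthetic and the hyperparameters are arbitrary assumptions; it shows the metric calls only and does not reproduce the project's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data: the target depends on two features plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Hold-out metrics on the test split.
mae = mean_absolute_error(y_test, pred)
mse = mean_squared_error(y_test, pred)
r2 = r2_score(y_test, pred)

# 5-fold cross-validated R² on the training split.
cv_r2 = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")

print(f"MAE={mae:.3f}  MSE={mse:.3f}  R2={r2:.3f}")
print(f"CV R2: mean={cv_r2.mean():.3f}, std={cv_r2.std():.3f}")
```

For time-ordered climate data, scikit-learn's `TimeSeriesSplit` may be a better fit than the default folds, since it avoids training on observations that come after the validation period.
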
├── main.ipynb   # Main notebook for climate modeling
├── data/        # Raw and cleaned datasets
└── README.md    # Project documentation
- Recommended: Create a virtual environment
python -m venv venv
source venv/bin/activate # on Windows: venv\Scripts\activate
- Install required packages:
pip install -r requirements.txt
- Run the notebook:
Open main.ipynb in Jupyter Notebook and run the cells in order, from top to bottom.
- Public sentiment shows increasing concern and awareness about climate change.
- High emissions correlate with specific aerosol and cloud indicators.
- Significant seasonal and geographical emission trends were discovered.

- Integrate real-time data updates using APIs.
- Deploy the model as a Flask/Django web app.
- Collaborate with domain experts for deeper scientific validation.

- NASA Climate Change Facebook page for social sentiment data
- NOAA, IPCC, and Kaggle for climate datasets
- BriantOliveira for reference EDA structure
Amrutha C
GitHub: @amruthadevops