This project analyses air pollution levels in India.
We would like to analyse trends in the concentrations of various pollutants (NO2, SO2, O3, PM10, PM2.5 and CO) over the years 2016-2018.
We also attempt to attribute changes in pollutant levels to phenomena such as rainfall, temperature and other weather conditions.
We are working with data collected by OpenAQ, which can be found here: https://openaq-data.s3.amazonaws.com/index.html
We have scraped the part of the data that covers India, and our dataset can be found here: https://www.kaggle.com/ruben99/air-pollution-dataset-india20162018
There are around 63 lakh (6.3 million) rows with 11 features, as listed below:
- Location : The location (monitoring site) where the measurement was made. Locations are spread throughout the country.
- City : The city in which the reading was taken. This is a coarser grouping than Location.
- Country : The country. In our case it is India, abbreviated as IN.
- utc : The UTC/GMT time at which the measurement was made.
- local : The time of the measurement in the local timezone.
- parameter : The pollutant that was measured.
- Value : The measured value for the pollutant.
- Unit : The unit in which the measurement was made.
- Latitude : Latitude of the corresponding location.
- Longitude : Longitude of the corresponding location.
- Attribution : The organisation from which the measurement was obtained.
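
To make the feature list concrete, here is a minimal sketch of how a yearly trend per pollutant could be computed with pandas. It is not taken from the project notebooks. The function name `yearly_mean`, the lowercase column names (the code lowercases whatever headers it finds), and the pollutant codes `pm25` and `no2` are assumptions, and the sample rows are made up purely for illustration.

```python
import io

import pandas as pd

# Made-up rows for illustration only; they are NOT taken from the real dataset.
# The header mirrors the 11 features listed above (assumed lowercase here).
SAMPLE_CSV = """location,city,country,utc,local,parameter,value,unit,latitude,longitude,attribution
Station A,City X,IN,2016-01-01T00:00:00.000Z,2016-01-01T05:30:00+05:30,pm25,100.0,ug/m3,10.0,70.0,Example Org
Station A,City X,IN,2016-06-01T00:00:00.000Z,2016-06-01T05:30:00+05:30,pm25,200.0,ug/m3,10.0,70.0,Example Org
Station A,City X,IN,2017-01-01T00:00:00.000Z,2017-01-01T05:30:00+05:30,pm25,50.0,ug/m3,10.0,70.0,Example Org
Station B,City Y,IN,2016-03-01T00:00:00.000Z,2016-03-01T05:30:00+05:30,no2,40.0,ug/m3,20.0,80.0,Example Org
"""


def yearly_mean(df: pd.DataFrame) -> pd.DataFrame:
    """Return the mean measured value per pollutant (rows) and year (columns)."""
    # Normalise header case so "Value" and "value" are treated alike.
    df = df.rename(columns=str.lower)
    # Parse the UTC timestamp and derive the calendar year from it.
    df["utc"] = pd.to_datetime(df["utc"], utc=True)
    df["year"] = df["utc"].dt.year
    # Average per pollutant and year, then pivot the years into columns.
    return df.groupby(["parameter", "year"])["value"].mean().unstack("year")


table = yearly_mean(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(table)
```

To run this on the real data, pass the CSV downloaded from the Kaggle link above to `pd.read_csv` in place of the in-memory sample. Pollutant and year combinations with no measurements appear as NaN in the resulting table.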
- extract.R : Script for extracting required data from OpenAQ database.
- Stocktaking.ipynb : Basic summary statistics of the data. Visualizations capturing some key aspects of the data.
- Stocktaking.html : HTML version obtained from nbviewer, in case the .ipynb files are not rendered.
- WeRAnalysers_LiteratureSurveyReport.pdf : Literature survey report.
- WeRAnalysers_FinalReport.pdf : Final report for the project.
- AnalysisOfAirPollutantsInIndia.ipynb : Complete code for the analysis.

The .ipynb notebook can be uploaded to Kaggle and run against this dataset to replicate the tests.