-
Notifications
You must be signed in to change notification settings - Fork 5
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Richard Dirauf
committed
Jan 8, 2025
1 parent
5cf7ec7
commit 38fd91d
Showing
2 changed files
with
175 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,168 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<h1>MLTS Exercise 08 - Data Exploration</h1>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Your task is to read, explore and preprocess the following timeseries dataset. All information you will gather will be useful for the next notebook, were we will train a model based on this data.\n", | ||
"\n", | ||
"The dataset can be downloaded from [Individual Household Electric Power Consumption](https://archive.ics.uci.edu/dataset/235/individual+household+electric+power+consumption)\n", | ||
"\n", | ||
"It contains \"Measurements of electric power consumption in one household with a one-minute sampling rate over a period of almost 4 years. Different electrical quantities and some sub-metering values are available.\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Reference** \n", | ||
"Hebrail, G. & Berard, A. (2006). Individual Household Electric Power Consumption [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C58K54." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# import packages\n", | ||
"import pandas as pd\n", | ||
"import matplotlib.pyplot as plt\n", | ||
"import seaborn as sns" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Load the dataset" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Importing dataset\n", | ||
"path = 'data/household_power_consumption.txt'\n", | ||
"\n", | ||
"Household_consumption = pd.read_csv(path, sep=';', low_memory=False)\n", | ||
"Household_consumption" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Preprocess Data\n", | ||
"\n", | ||
"* Convert seperate date and time columns into datetime column\n", | ||
"* Convert numeric columns to correct type\n", | ||
"* Find and replace missing values" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# TODO" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# TODO" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# TODO" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Explore the data\n", | ||
"\n", | ||
"Find different trends by looking at:\n", | ||
"* Monthly Global Active Power\n", | ||
"* Energy Usage Comparison Across Sub-Meterings\n", | ||
"* Proportion of Total Energy Usage by Sub-Metering\n", | ||
"* Total Global Active Power Consumption by Month\n", | ||
"* Total Global Active Power Consumption by Day of the Week\n", | ||
"* Total Global Active Power Consumption by Hour of the Day\n", | ||
"* Average Hourly Global Active Power Consumption\n", | ||
"\n", | ||
"Its recommended to use seaborn for some of these plots." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# TODO" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Time Series Specific Data Analysis\n", | ||
"\n", | ||
"Look at the\n", | ||
"* Daily Global Active Power Consumption with 7-Day Rolling Average\n", | ||
"* Daily Global Active Power Consumption with 30-Day Rolling Average" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# TODO" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "venv", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.12.7" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
matplotlib==3.9.2 | ||
numpy==2.1.2 | ||
ipykernel==6.29.5 | ||
ipympl==0.9.4 | ||
pandas==2.2.3 | ||
seaborn==0.13.2 | ||
torch==2.5.1 |