Homework part 4 ‐ Data Analysis
The goal of this homework is to perform data analysis on a Smart Home dataset that includes data exploration, visualization, and deriving meaningful insights. The analysis process will involve the following steps:
- Preparation: Importing the necessary Python libraries and ensuring the environment is set up for analysis.
- Data Loading: Reading the dataset from a specified source (URL) and examining its structure, including the column names and types of data available.
- Data Profiling: Generating a detailed summary report to understand the dataset’s characteristics, such as missing values, distribution of variables, and basic statistics.
- Data Visualization: Creating various visualizations to explore the dataset in depth:
  - Bar charts to illustrate categorical distributions (e.g., device types).
  - Histograms to analyze the distribution of continuous variables (e.g., energy consumption).
  - Boxplots to examine variability and patterns across different groups (e.g., device types or efficiency levels).
  - Scatterplots to uncover relationships between key variables (e.g., energy consumption vs. usage hours).
  - Heatmaps to visualize correlations between continuous variables in the dataset.
- Insights: Using the visualizations and correlations identified to understand trends, patterns, and potential relationships within the data.
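The loading, profiling, and correlation steps above can be sketched with Pandas alone. This is a minimal illustration, not the notebook's solution: the rows in `SAMPLE_CSV` are made up and stand in for the real dataset, and in the notebook `pd.read_csv` would receive the dataset URL given there instead of an in-memory string.

```python
import io

import pandas as pd

# Made-up rows standing in for the real dataset (assumption for illustration only).
SAMPLE_CSV = """UserID,DeviceType,UsageHoursPerDay,EnergyConsumption,SmartHomeEfficiency
1,Lights,5.0,1.2,1
2,Thermostat,20.0,8.5,0
3,Lights,3.5,0.9,1
4,Camera,24.0,6.1,0
5,Thermostat,12.0,4.0,1
"""

# Data loading: read the CSV and look at the column names and types.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(df.dtypes)

# Basic profiling: missing values per column and summary statistics.
missing_per_column = df.isna().sum()
summary = df.describe()
print(missing_per_column)
print(summary)

# Categorical distribution: the numbers behind a bar chart of device types.
device_counts = df["DeviceType"].value_counts()
print(device_counts)

# Correlations between continuous variables: the matrix behind a heatmap.
corr = df[["UsageHoursPerDay", "EnergyConsumption"]].corr()
print(corr)
```

Each intermediate result (`device_counts`, `corr`) is a regular Pandas object, so it can be handed to hvPlot or Plotly for the actual chart.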
The dataset contains the following features:
- UserID: Unique identifier for each user.
- DeviceType: Type of smart home device (e.g., Lights, Thermostat).
- UsageHoursPerDay: Average hours per day the device is used.
- EnergyConsumption: Daily energy consumption of the device (kWh).
- UserPreferences: User preference for device usage (0 - Low, 1 - High).
- MalfunctionIncidents: Number of malfunction incidents reported.
- DeviceAgeMonths: Age of the device in months.
- SmartHomeEfficiency (Target Variable): Efficiency status of the smart home device (0 - Inefficient, 1 - Efficient).
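Before plotting, it can help to confirm that the loaded data matches the feature list above. The sketch below is one possible sanity check, assuming the column names are spelled exactly as listed; `check_schema` is a hypothetical helper written for this illustration and the two sample rows are made up.

```python
import pandas as pd

# Column names taken from the feature list above.
EXPECTED_COLUMNS = [
    "UserID",
    "DeviceType",
    "UsageHoursPerDay",
    "EnergyConsumption",
    "UserPreferences",
    "MalfunctionIncidents",
    "DeviceAgeMonths",
    "SmartHomeEfficiency",
]

# Features described as 0/1 flags in the list above.
BINARY_COLUMNS = ["UserPreferences", "SmartHomeEfficiency"]


def check_schema(df):
    """Return a list of readable problems; an empty list means all checks passed."""
    problems = []
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    for col in BINARY_COLUMNS:
        # Any value outside {0, 1} (including NaN) is reported.
        if col in df.columns and not df[col].isin([0, 1]).all():
            problems.append(f"{col} has values other than 0 and 1")
    return problems


# Two made-up rows, used only to exercise the helper.
sample = pd.DataFrame(
    {
        "UserID": [1, 2],
        "DeviceType": ["Lights", "Thermostat"],
        "UsageHoursPerDay": [5.0, 20.0],
        "EnergyConsumption": [1.2, 8.5],
        "UserPreferences": [0, 1],
        "MalfunctionIncidents": [0, 3],
        "DeviceAgeMonths": [12, 40],
        "SmartHomeEfficiency": [1, 0],
    }
)

print(check_schema(sample))
```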
The following packages are recommended for completing the assignment:
- Pandas - For handling datasets - https://pandas.pydata.org/docs/
- Ydata-profiling - For data profiling - https://docs.profiling.ydata.ai/latest/
- hvPlot - For interactive visualization - https://hvplot.holoviz.org/
- Plotly - For interactive visualization - https://plotly.com/python/
You can access the homework notebook here: https://colab.research.google.com/drive/1mnoRCkiEDfYvQubrLrqrEbVWSmzJsSaT?usp=sharing
Preparation:
Make a personal copy (File -> Save a copy in Drive) of the Colab Notebook provided and work on the tasks within your copy. Rename the file to include your team name. Share the notebook with your teammates so you can work together (in the top right corner, click Share and either add people with their email addresses or create a link).
Tasks:
The task descriptions are provided as comments within the notebook cells. Complete each task in its respective cell and add a Text cell after each task to describe your observations.
Submission:
To submit the assignment, download the completed notebook file (File -> Download -> Download .ipynb) and upload it to your team's GitHub repository on the hw4 branch.
Tip: Use the Colab Notebook provided during the practical session as a reference for solving the tasks. Pay attention to labeling the axes on your charts appropriately.