GitHub - Dev-Parmar17/End-To-End-CAR-RENTAL-PIPELINE

🚗💨 Car Rental Batch Data Ingestion with SCD2 Merge in Snowflake ❄

Overview

This project implements a batch data ingestion pipeline for car rental data and applies an SCD2 (Slowly Changing Dimension Type 2) merge to the customer dimension table in Snowflake. The pipeline is built with Python, PySpark, GCP Dataproc, Airflow, and Snowflake.

Architecture Diagram

Tech Stack

Python 🐍
PySpark 🚀
GCP Dataproc ☁️
Airflow ✈️
Snowflake ❄️

Key Features

SCD2 Implementation: Tracks changes in customer data over time by keeping earlier versions of each customer record instead of overwriting them.
Data Ingestion: Reads data from Google Cloud Storage and loads it into Snowflake tables ❄️.
Data Processing: Uses PySpark for data transformations and aggregations.
Orchestration: Airflow schedules and runs the pipeline.
Scalability: Runs the PySpark processing on GCP Dataproc so it can scale with data volume.
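
The SCD2 behaviour described above can be illustrated with a minimal sketch in plain Python. This is not the repository's code: the project performs the merge in Snowflake, and the column names used here (`email`, `phone`, `effective_date`, `end_date`, `is_current`) as well as the `scd2_merge` helper are assumptions made for illustration only. The sketch shows the three cases an SCD2 merge has to handle: a changed customer (expire the old row, add a new version), an unchanged customer (do nothing), and a new customer (add a first version).

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CustomerRow:
    # Hypothetical customer dimension columns, chosen for illustration.
    customer_id: str
    email: str
    phone: str
    effective_date: date
    end_date: Optional[date] = None
    is_current: bool = True


def scd2_merge(dimension, incoming, load_date):
    """Return a new dimension list after applying an SCD2 merge.

    dimension: list of CustomerRow (existing dimension table)
    incoming:  list of dicts with customer_id, email, phone (new batch)
    load_date: date used to close old versions and open new ones
    """
    result = list(dimension)
    # Position of the current version of each customer.
    current = {row.customer_id: i for i, row in enumerate(result) if row.is_current}

    for rec in incoming:
        idx = current.get(rec["customer_id"])
        if idx is not None:
            old = result[idx]
            if (old.email, old.phone) == (rec["email"], rec["phone"]):
                continue  # unchanged: keep the current version as-is
            # Changed: expire the old version instead of overwriting it.
            result[idx] = replace(old, end_date=load_date, is_current=False)
        # New customer or changed customer: append a fresh current version.
        result.append(
            CustomerRow(rec["customer_id"], rec["email"], rec["phone"],
                        effective_date=load_date)
        )
        current[rec["customer_id"]] = len(result) - 1
    return result


# Usage with made-up sample data.
dim = [
    CustomerRow("C1", "a@example.com", "111", date(2024, 1, 1)),
    CustomerRow("C2", "b@example.com", "222", date(2024, 1, 1)),
]
batch = [
    {"customer_id": "C1", "email": "a.new@example.com", "phone": "111"},  # changed
    {"customer_id": "C2", "email": "b@example.com", "phone": "222"},      # unchanged
    {"customer_id": "C3", "email": "c@example.com", "phone": "333"},      # new
]
merged = scd2_merge(dim, batch, date(2024, 2, 1))
for row in merged:
    print(row)
```

In a warehouse, the same logic is typically expressed as an update that closes the current row followed by an insert of the new version; the sketch keeps everything in memory only to make the row-level effect easy to inspect.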

