📊 A project to pipeline nutrition data into a data warehouse

njsfield/nutrition-data

Nutrition Data Project

An ELT project that extracts data from FatSecret, loads it into BigQuery, then transforms it into star schema models with dbt.

Install

Dependencies

  • pip: Follow the steps here to install pip
  • uv: Follow the steps here to install uv
  • sqlfluff: Follow the steps here to install sqlfluff

Configure Env Variables

  1. Create a .envrc file with the following variables:
# FatSecret API keys
export FATSECRET_CLIENT_ID=*********
export FATSECRET_CLIENT_SECRET=*********
export UV_NO_ENV_FILE=1

# BigQuery
export GCP_PROJECT=*********
export GCP_SA_KEYFILE=********* # service account JSON path
export GCP_PROJECT_RAW_DATASET=*********
export FAT_SECRET_FOOD_TABLE_NAME=*********
  2. Download direnv
  3. Configure your shell profile:
UV_NO_ENV_FILE=1
eval "$(direnv hook zsh)" # or bash, etc.
  4. Restart your shell to load env variables
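
To confirm that direnv has loaded everything after the restart, a small check like the one below can help. It is a minimal sketch, not part of the repository: the variable names come from the .envrc listing above, while the `missing_env_vars` helper is hypothetical.

```python
import os

# Variable names copied from the .envrc listing in this README.
REQUIRED_VARS = [
    "FATSECRET_CLIENT_ID",
    "FATSECRET_CLIENT_SECRET",
    "GCP_PROJECT",
    "GCP_SA_KEYFILE",
    "GCP_PROJECT_RAW_DATASET",
    "FAT_SECRET_FOOD_TABLE_NAME",
]


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the current process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` from a Python shell started in the project directory should return an empty list once the .envrc file has been allowed by direnv.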

Set up VSCode

  • Install Ruff for Python formatting
  • Install sqlfluff for SQL formatting

1. Load

uv run --package load init_storage
uv run --package load extract_data_to_storage
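
The load package's code is not shown here, so the following is only a sketch of how the BigQuery variables from the .envrc file could combine into a fully qualified table ID in BigQuery's `project.dataset.table` form. The `raw_table_id` helper is hypothetical, and it is an assumption that the load scripts target a table named this way.

```python
import os


def raw_table_id(environ=None):
    """Build a `project.dataset.table` ID from the .envrc variables.

    Hypothetical helper; the actual load package may assemble the
    destination differently.
    """
    env = os.environ if environ is None else environ
    return ".".join(
        [
            env["GCP_PROJECT"],
            env["GCP_PROJECT_RAW_DATASET"],
            env["FAT_SECRET_FOOD_TABLE_NAME"],
        ]
    )
```

A missing variable raises `KeyError`, which surfaces a misconfigured environment before any request is sent.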

2. Transform

uvx --directory data/transform --from dbt-core dbt deps
uvx --directory data/transform --from dbt-bigquery dbt build
uvx --directory data/transform --from dbt-bigquery dbt build --select
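
If running the transform steps from a script is more convenient, the commands above could be wrapped as in the sketch below. This is not part of the repository: `dbt_commands` and `run_transform` are hypothetical helpers, and the selector passed to `--select` is whatever the caller supplies.

```python
import subprocess

DBT_PROJECT_DIR = "data/transform"  # path used in the commands above


def dbt_commands(select=None):
    """Return the uvx argument lists for `dbt deps` followed by `dbt build`."""
    base = ["uvx", "--directory", DBT_PROJECT_DIR, "--from"]
    build = base + ["dbt-bigquery", "dbt", "build"]
    if select:
        # Selector value is supplied by the caller (hypothetical example input).
        build += ["--select", select]
    return [base + ["dbt-core", "dbt", "deps"], build]


def run_transform(select=None):
    """Run each command in order, stopping at the first failure."""
    for argv in dbt_commands(select):
        subprocess.run(argv, check=True)
```

Calling `run_transform()` runs a full build, while passing a selector string limits the build to the matching models.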

Lint

sqlfluff lint
# fix
sqlfluff fix
