feat(inventory): add Open Food Facts cache with Deno ingest script #5

justanotheratom · 2025-09-30T10:17:18Z

Summary

This PR adds a caching layer for Open Food Facts product data to improve barcode lookup performance and reliability.

Changes

New table: inventory_cache, with RLS policies (reads for authenticated users, writes for the service role)
Deno script: local/off_ingest.ts, which downloads the OFF JSONL.gz dump, transforms each record to our product format, and batch-upserts to Supabase
Authentication: updated to use SUPABASE_SECRET_KEY (new key type), with fallback to the legacy key
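
The key fallback described in the last item could be sketched roughly as follows. SUPABASE_SECRET_KEY is taken from the Usage section; the legacy variable name SUPABASE_SERVICE_ROLE_KEY and the helper resolveSupabaseKey are assumptions made for illustration and are not confirmed by this PR.

```typescript
// Hypothetical helper: prefer the new-style secret key, else fall back to a legacy key.
// The legacy variable name below is an assumption; substitute the project's real one.
type Env = Record<string, string | undefined>;

function resolveSupabaseKey(env: Env): string {
  // `||` (not `??`) so that an empty string also triggers the fallback.
  const key = env["SUPABASE_SECRET_KEY"] || env["SUPABASE_SERVICE_ROLE_KEY"];
  if (!key) {
    throw new Error("Set SUPABASE_SECRET_KEY (or the legacy key) before running the ingest");
  }
  return key;
}
```

In a Deno script this would typically be called as resolveSupabaseKey(Deno.env.toObject()).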

Usage

# Set environment variables
export SUPABASE_URL=your_url
export SUPABASE_SECRET_KEY=your_secret_key

# Run the ingest script
deno run -A --import-map=local/import_map.json local/off_ingest.ts

Benefits

Faster barcode lookups (local cache vs API calls)
More reliable (offline-capable with cached data)
Fresh data (regular updates from OFF dumps)
Scalable (handles 4M+ products efficiently)

Database Impact

Estimated table size: ~8-12 GB for full OFF dataset
Indexed on barcode for fast lookups
JSONB columns for flexible ingredient/image storage

Co-authored-by: owner <[email protected]>

…nore
- Move off_ingest.ts and off_upload_batch.ts to local/openfoodfacts/ folder
- Add off_inventory_cache.jsonl to .gitignore (large generated file)
- Fix parallel upload logic to prevent memory exhaustion
- Add deduplication to handle duplicate barcodes in batches
- Successfully uploaded 3M+ products to Supabase inventory_cache
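
The batch deduplication mentioned above might look like the sketch below. Postgres generally rejects a single INSERT ... ON CONFLICT DO UPDATE that touches the same conflict key twice, which is a likely reason a batch needs at most one row per barcode. The helper name dedupeByBarcode and the "last row wins" policy are assumptions, not the script's confirmed behavior.

```typescript
// Hypothetical helper: keep only the last row seen for each barcode so that
// one upsert batch never contains the same conflict key twice.
function dedupeByBarcode<T extends { barcode: string }>(rows: T[]): T[] {
  const byBarcode = new Map<string, T>();
  for (const row of rows) {
    // A later row overwrites an earlier one that has the same barcode.
    byBarcode.set(row.barcode, row);
  }
  return Array.from(byBarcode.values());
}
```

Rows come back in the order in which each barcode was first seen, carrying the values of its most recent occurrence.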

- Removed log_inventory table definition from tables.sql
- Updated get_check_history and get_list_items functions to use inventory_cache
- Removed background/log_inventory endpoint
- Created consolidated getProductFromCache() function as single source of truth
- Updated all code to query inventory_cache instead of log_inventory
- Removed redundant lookupProduct wrapper function
- inventory.ts now checks cache first, falls back to fresh API fetch

- Removed fetchProduct() function and OpenFoodFacts API integration
- Removed processOpenFoodFactsProductData() and helper functions
- Simplified get() endpoint to only read from inventory_cache
- Reduced file from 252 lines to 104 lines (59% reduction)
- Now returns 404 if product not found in cache

- Add barcodeToPath() function to convert barcodes to proper OFF path format (e.g., 088/491/237/3946)
- Fix image URL format: full size uses imgId.jpg instead of imgId.WxH.jpg
- Add validateImageUrlExists() function for optional HEAD request validation (currently disabled)
- Update extractDisplayImageUrls() to use correct barcode paths in URLs
- All image URLs now follow format: /images/products/{barcodePath}/{imgId}.{size}.jpg
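
A minimal sketch of the path conversion and URL format described above, assuming the barcode is zero-padded to 13 digits and split 3/3/3/rest (consistent with the 088/491/237/3946 example). The handling of codes of 8 digits or fewer and the helper imagePath are assumptions; the actual barcodeToPath() in this PR may differ.

```typescript
// Sketch of barcodeToPath(): "0884912373946" -> "088/491/237/3946".
function barcodeToPath(barcode: string): string {
  const digits = barcode.replace(/\D/g, "");
  // Assumption: short codes (8 digits or fewer) are used as a single path segment.
  if (digits.length <= 8) return digits;
  const padded = digits.padStart(13, "0");
  const m = padded.match(/^(\d{3})(\d{3})(\d{3})(\d+)$/);
  return m ? `${m[1]}/${m[2]}/${m[3]}/${m[4]}` : padded;
}

// Hypothetical helper: builds the path part of an image URL.
// Omitting `size` yields the full-size form {imgId}.jpg.
function imagePath(barcode: string, imgId: string, size?: string): string {
  const suffix = size ? `.${size}` : "";
  return `/images/products/${barcodeToPath(barcode)}/${imgId}${suffix}.jpg`;
}
```

For example, imagePath("0884912373946", "1", "400") gives /images/products/088/491/237/3946/1.400.jpg, and omitting the size gives /images/products/088/491/237/3946/1.jpg.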

- Only upsert barcode and images fields during upload
- Remove updated_at field from upserts (column may not exist)
- Add ignoreDuplicates: false to ensure existing rows are updated
- Improve error logging with JSON.stringify for better debugging
- This is a temporary change to fix image URLs without reprocessing all data

- Add barcode_matches() SQL function that pads upward only to avoid false matches
- Update inventory query to try multiple barcode format variants (EAN-8, UPC-A, EAN-13, ITF-14)
- Apply barcode_matches() to JOINs in get_check_history() and get_list_items()
- Prevents 8-digit barcodes from matching unrelated 13-digit barcodes
- Fixes case where barcode 884912373946 should match 0884912373946
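
The "pad upward only" rule can be illustrated in TypeScript as below. This is a sketch of the idea, not a port of the barcode_matches() SQL function, whose exact rules are not shown in this PR. The names barcodeVariants and barcodeMatches and the list of standard lengths are assumptions.

```typescript
// Standard lengths: EAN-8, UPC-A, EAN-13, ITF-14.
const STANDARD_LENGTHS = [8, 12, 13, 14];

// Variants of a barcode obtained by zero-padding to each longer standard length.
// Digits are never stripped, so a code is never shortened into another format.
function barcodeVariants(barcode: string): string[] {
  const variants = new Set<string>([barcode]);
  for (const len of STANDARD_LENGTHS) {
    if (len > barcode.length) variants.add(barcode.padStart(len, "0"));
  }
  return Array.from(variants);
}

// Two codes match only if the shorter one, left-padded with zeros to the
// longer one's length, is identical to the longer one.
function barcodeMatches(a: string, b: string): boolean {
  const width = Math.max(a.length, b.length);
  return a.padStart(width, "0") === b.padStart(width, "0");
}
```

Under this rule 884912373946 matches 0884912373946, while an 8-digit code does not match a 13-digit code that merely ends in the same eight digits.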

- Remove temporary image-only update logic
- Restore full batch upserts with all product data
- Will populate fresh inventory_cache table with:
  - Fixed image URLs with correct barcode paths
  - All product metadata (name, brand, ingredients, etc.)
  - Proper last_refreshed_at timestamps

- Change ImageMetadata type to minimal structure (type, language, imgid, sizes[])
- Extract all image types: front, ingredients, nutrition, packaging
- Extract images for all languages (en, fr, de, es, it, pt, nl, pl, ru, ja, zh, etc.)
- Remove URL construction logic (moved to runtime in inventory.ts)
- Remove unused helper functions: isValidImageUrl, validateImageUrlExists, barcodeToPath
- Add comprehensive documentation about image storage strategy
- Store only language keys, drop numeric keys to reduce storage by ~60%

Image URLs will be constructed at runtime with smart selection:
- Prefer English, fallback to other languages
- Prefer 400px (medium), fallback to next available size
- Return one image per type (front, ingredients, nutrition, packaging)
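
The runtime selection described above could be sketched as follows. The ImageMetadata fields follow the commit message (type, language, imgid, sizes[]). The element type of sizes, the reading of "next available size" as the next larger size and otherwise the largest smaller one, and the names pickSize and selectImages are all assumptions.

```typescript
interface ImageMetadata {
  type: string;      // "front", "ingredients", "nutrition", or "packaging"
  language: string;  // e.g. "en", "fr"
  imgid: string;
  sizes: number[];   // assumption: available widths in px, stored as numbers
}

interface SelectedImage {
  imgid: string;
  language: string;
  size: number;
}

const PREFERRED_LANGUAGE = "en";
const PREFERRED_SIZE = 400;

// Prefer 400px; otherwise the smallest larger size; otherwise the largest available.
function pickSize(sizes: number[]): number | undefined {
  if (sizes.length === 0) return undefined;
  if (sizes.includes(PREFERRED_SIZE)) return PREFERRED_SIZE;
  const larger = sizes.filter((s) => s > PREFERRED_SIZE).sort((a, b) => a - b);
  return larger.length > 0 ? larger[0] : Math.max(...sizes);
}

// Return one image per type, preferring English over any other language.
function selectImages(images: ImageMetadata[]): Record<string, SelectedImage> {
  const selected: Record<string, SelectedImage> = {};
  for (const img of images) {
    const size = pickSize(img.sizes);
    if (size === undefined) continue; // no usable size for this image
    const current = selected[img.type];
    const isPreferred = img.language === PREFERRED_LANGUAGE;
    if (!current || (isPreferred && current.language !== PREFERRED_LANGUAGE)) {
      selected[img.type] = { imgid: img.imgid, language: img.language, size };
    }
  }
  return selected;
}
```

In this sketch language preference outranks size preference: an English image without a 400px rendition still wins over a French image that has one.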

cursoragent and others added 22 commits September 28, 2025 15:45

Add script to estimate OFF payload sizes

e40f7ed

Co-authored-by: owner <[email protected]>

Add off_jsonl_linecount.txt with line count

a64fa1d

Co-authored-by: owner <[email protected]>

Add SSE inventory+analysis stream and shared helpers (#4)

e213cf0

feat(inventory): add Deno OFF ingest script and inventory_cache table; remove estimation artifacts

10e72c2

feat: add Deno OFF ingest script for inventory caching

62805fd

chore: remove extraneous estimator files from PR

028cbb1

fix: resolve merge conflict in tables.sql

c48d7a2

feat: add .env template and load env vars in off_ingest script

97505a7

refactor: use npm: specifier for standalone script (no import map needed)

54622bf

feat: add batch upload with user confirmation every 5 batches

a7ad823

feat: add progress indicators for download and processing

51b9068

feat: add detailed validation statistics to track invalid/empty products

e414031

perf: optimize for speed with larger batches, parallel uploads, and better I/O

162c9e7

Merge remote-tracking branch 'origin/main' into feature/inventory-cache-refresh

28c478d

justanotheratom force-pushed the main branch from e213cf0 to 82d1483 Compare October 8, 2025 10:40

justanotheratom force-pushed the main branch from fe6d4a7 to 39e7ece Compare October 16, 2025 06:02

justanotheratom force-pushed the main branch from 5b105b7 to a67ddc7 Compare November 25, 2025 06:36

justanotheratom force-pushed the main branch 5 times, most recently from cba0124 to 0f645c6 Compare December 18, 2025 11:46

claude bot mentioned this pull request Jan 1, 2026

fix(security): Implement JWKS-based JWT verification #18

Merged

3 tasks

3 participants
