Skip to content

deanhu0822/smart-city-management

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

78 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NYC Smart City Nexus

Energy, Waste & Urban Intelligence Platform

Spark Hack NYC 2026 | Environmental Impact Track | NVIDIA GB10

A decision-support platform that merges NYC's energy and waste data into one unified intelligence layer. Built to run entirely on-device using NVIDIA RAPIDS, NIMs, and cuOpt on the Acer Veriton GN100.

Watch the demo


Quick Start

# Run the full pipeline
python smart-city-management/build_silver_layer.py
python smart-city-management/build_bridge_layer.py
python smart-city-management/build_spatial_layer.py
python smart-city-management/build_gold_layer.py
python smart-city-management/add_missing_outputs.py

Project Structure

smart-city-management/
├── build_silver_layer.py     # Raw → Silver (12 Parquet datasets)
├── build_bridge_layer.py    # Create property_to_bbl bridge
├── build_spatial_layer.py  # cuSpatial feature joins
├── build_gold_layer.py     # → Gold (3 tables + handoffs)
├── add_missing_outputs.py   # depots + site_profiles
├── README.md
├── design doc/              # Technical design docs
├── frontend/                # Dashboard UI
├── AI/                     # LLM ranker
└── data/
    ├── raw/                # 12 raw CSVs from NYC Open Data
    ├── silver/              # Cleaned Parquet (partitioned)
    ├── gold/                # Unified tables + handoffs
    │   ├── route_inputs/    # cuOpt inputs
    │   ├── dispatch/        # BESS dispatch
    │   └── nim/            # NIM site batches
    └── dictionaries/        # Data documentation

Datasets (12 sources from NYC Open Data)

Energy

Status ID Dataset Rows Source
E1 Energy Cost Savings Program 2,363 NYC Open Data
E3 Electric Consumption & Cost 553,666 NYC Open Data
E4 NYC EV Fleet Station Network 1,638 NYC Open Data
E5 Municipal Solar Readiness (LL24) 4,270 NYC Open Data
E7 LL84 Monthly Energy Data ~2.2M NYC Open Data
E10 LL84 Annual Benchmarking 103,259 NYC Open Data

Waste

Status ID Dataset Rows Source
W1 DSNY Monthly Tonnage 24,883 NYC Open Data
W2 311 Service Requests (DSNY) 1,556,494 NYC Open Data
W3 Litter Basket Inventory 20,413 NYC Open Data
W7 Food Scrap Drop-Off Locations 591 NYC Open Data
W8 Disposal Facility Locations 39 NYC Open Data
W12 Waste Characterization 2,112 NYC Open Data

Pipeline Outputs

Silver Layer (data/silver/)

File Description Partitioned
E3_electric_consumption.parquet NYCHA electric usage No
E4_ev_fleet_stations.parquet EV charging locations No
E5_solar_readiness.parquet Solar-ready buildings No
E5_solar_readiness_enriched.parquet + cuSpatial features No
E7_ll84_monthly.parquet Monthly energy ✅ calendar_year
E10_ll84_benchmarking.parquet Annual benchmarking No
property_to_bbl.parquet Bridge table (48,310 unique) No
W1_dsny_monthly_tonnage.parquet District tonnage No
W2_311_dsny.parquet 311 complaints ✅ created_year
W7_food_scrap_dropoffs.parquet Compost sites No
W8_disposal_facilities.parquet Transfer facilities No

Gold Layer (data/gold/)

File Description
unified_sites.parquet Gold Table 1: E5 + E10 + spatial features
district_waste.parquet Gold Table 2: W1 aggregated by district
time_series.parquet Gold Table 3: E7 monthly energy
route_inputs/depots.parquet cuOpt depot locations (39)
route_inputs/demand_nodes.parquet cuOpt demand nodes (59 districts)
nim/site_batches.jsonl NIM scoring batch (20 sites)
dispatch/site_profiles.parquet BESS dispatch profiles (4,268 sites)

Interface Contracts (GB10 Runbook)

Join Rules

E5.bbl ← E10.bbl (via property_to_bbl)
E7.property_id → E10.property_id → E10.bbl → E5.bbl

cuSpatial Features

From E5 building points:

  • EV ports within 500m
  • EV ports within 1km
  • Compost sites within 1km
  • Nearest transfer facility distance

Complaint Schema

Map both Problem/Problem Detail and Complaint Type/Descriptor into one canonical category column.


Tech Stack

Layer Tool Purpose
Data Processing RAPIDS cuDF + cuSpatial GPU-accelerated data processing
Route Optimization RAPIDS cuOpt Waste collection vehicle routing
LLM Inference NVIDIA NIMs (Llama 3.1 8B) Local site scoring
Simulation Custom Python (GPU-vectorized) BESS dispatch
Dashboard Streamlit + Folium Interactive maps, charts

Running the Next Stages

cuOpt (Route Optimization)

import cuopt

# Load demand nodes and depots
demand_nodes = pd.read_parquet("data/gold/route_inputs/demand_nodes.parquet")
depots = pd.read_parquet("data/gold/route_inputs/depots.parquet")

# Solve route
# (See: https://docs.nvidia.com/cuopt/user-guide/latest/introduction.html)

NIM (Local LLM Scoring)

# Run NIM container
docker run -d --gpus=1 --name nimLlama \
  -v ./smart-city-management/data/gold/nim:/app/nim \
  nvcr.io/nim/nvidia/llama-3.1-nim:8b-instruct-q4

# Score sites
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @nim/site_batches.jsonl

BESS Dispatch

# Load site profiles and time series
profiles = pd.read_parquet("data/gold/dispatch/site_profiles.parquet")
time_series = pd.read_parquet("data/gold/time_series.parquet")

# Run dispatch simulation
# (See: design doc/BESS_simulation.md)

Future Things to Implement

Local LLM Scoring (Qwen 3 80B)

The mini workstation already has the Qwen 3 80B model activated and available for local inference. For future local LLM scoring of sites, we can leverage this model directly:

# Using the local Qwen 3 80B endpoint (already running on the workstation)
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-80b",
    "messages": [{"role": "user", "content": "Score this site for solar potential..."}]
  }'

The Qwen 3 80B provides significantly more capacity than Llama 3.1 8B NIM for nuanced site scoring, natural language reasoning about building characteristics, and batch processing of site evaluations.


Team

Built for Spark Hack NYC 2026 — Environmental Impact Track.

NVIDIA GB10: 128 GB unified memory, 1 PFLOP FP4 on Acer Veriton GN100

Dataset & Model Training

Datasets Used

Data Preparation

Both datasets were preprocessed and reorganized into four unified classification categories:

  • Organics
  • Electronic
  • Plastic
  • Others

Images from both sources were partitioned into folder-based class directories to maintain consistent labeling and balanced training structure.

Base Model

  • Model Name: NVIDIA Nemotron Nano 12B v2 VL FP8
  • Model ID: nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-FP8

Fine-Tuning Method

This project uses LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning, enabling adaptation of the multimodal vision-language model with lower memory and compute overhead.

Training Goal

The objective is to fine-tune the model for garbage image classification across mixed-source datasets, improving recognition accuracy for real-world waste sorting applications.

About

NYC Smart City Nexus — Energy, Waste & Urban Intelligence Platform

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Python 58.5%
  • JavaScript 33.0%
  • TypeScript 6.8%
  • Other 1.7%