BYOB: Bring Your Own Bike

Understanding How Apple ESG Features Could Reshape How We Move Using AI Agents

Project Overview

An end-to-end agentic mobility pipeline built on the Microsoft Geolife GPS dataset:

Classify how people commute from raw GPS alone - no extra sensors, no user input
Estimate each user's carbon footprint by mode
Generate a personalised sustainability report using Claude, fact-checked by a Judge+Critic agent loop
Simulate how Apple ESG nudges could shift commute behaviour using LLM-powered agents
Propose three Apple product features grounded in the above work
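
The first step above works from raw GPS fixes alone, so features such as speed have to be derived from consecutive points. The following is a minimal sketch of that derivation, not the feature code in eda/modeling.ipynb; the function names and the sample points are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def segment_speeds_kmh(points):
    """Speed in km/h for each consecutive pair of (lat, lon, datetime) fixes.

    Pairs with a zero or negative time gap are skipped rather than guessed.
    """
    speeds = []
    for (lat1, lon1, t1), (lat2, lon2, t2) in zip(points, points[1:]):
        dt = (t2 - t1).total_seconds()
        if dt <= 0:
            continue
        speeds.append(haversine_m(lat1, lon1, lat2, lon2) / dt * 3.6)
    return speeds


# Hypothetical fixes, made up for illustration (not taken from the dataset).
trip = [
    (39.9840, 116.3180, datetime(2008, 10, 23, 2, 53, 4)),
    (39.9842, 116.3184, datetime(2008, 10, 23, 2, 53, 10)),
    (39.9845, 116.3190, datetime(2008, 10, 23, 2, 53, 15)),
]
speeds = segment_speeds_kmh(trip)
```

Per-trip summaries of these speeds (mean, maximum, variance) are the kind of input a tree-based classifier can use, although the exact feature set used in this project is not listed here.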

Project Structure

geolife-human-psychology/
│
├── dashboard.py                  # Streamlit app entry point
├── requirements.txt
│
├── dashboard_utils/              # One module per dashboard page
│   ├── hypothesis.py             # Hypothesis page
│   ├── data_analysis.py          # EDA page
│   ├── modeling.py               # Commute Mode Predictor page
│   ├── agentic_eval.py           # Agentic Evaluation vs Human page
│   ├── agentic_simulation.py     # Agentic Behaviour Simulation page
│   ├── products.py               # Proposed Apple Products page
│   ├── references.py             # References page
│   └── theme.py                  # Apple-style CSS theme
│
├── eda/                          # Notebooks (analysis only - no rerun needed)
│   ├── eda.ipynb                 # EDA: data quality, mode distribution, emissions
│   ├── modeling.ipynb            # Feature engineering + LightGBM classifier
│   ├── agentic_evaluation.ipynb  # Report generation + Judge+Critic loop
│   └── agentic_simulation.ipynb  # 5-level Apple nudge simulation (525 Claude calls)
│
├── data/
│   └── processed/
│       └── features.parquet      # Cached feature dataset (built by modeling.ipynb)
│
├── models/
│   └── lgbm_mode_classifier.pkl  # Trained LightGBM model + label encoder
│
└── reports/                      # All generated outputs
    ├── user_reports.json         # Claude-generated sustainability reports
    ├── eval_results.json         # Judge+Critic + baseline verdicts
    ├── human_labels.csv          # Hand-labeled claims for Cohen's κ
    ├── simulation_index.json     # Simulation sweep metadata
    └── sim_strength_X.XX.json    # One file per nudge level (0.00 – 1.00)
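
Because the simulation writes one sim_strength_X.XX.json file per nudge level, a sweep can be collected by reading the nudge strength back from each filename. This is a hedged sketch: the helper names are hypothetical, and the JSON schema inside each file is not described in this README, so the payload is loaded as-is.

```python
import json
import re
from pathlib import Path

# Matches names such as sim_strength_0.25.json (two decimals, as in the tree above).
PATTERN = re.compile(r"^sim_strength_(\d+\.\d{2})\.json$")


def parse_strength(filename):
    """Return the nudge strength encoded in a filename, or None if it does not match."""
    match = PATTERN.match(filename)
    return float(match.group(1)) if match else None


def load_sweep(reports_dir="reports"):
    """Map nudge strength -> parsed JSON payload for every sweep file found."""
    sweep = {}
    for path in sorted(Path(reports_dir).glob("sim_strength_*.json")):
        strength = parse_strength(path.name)
        if strength is None:
            continue  # ignore files that only look similar
        with path.open(encoding="utf-8") as fh:
            sweep[strength] = json.load(fh)
    return sweep
```

Calling load_sweep() from the project root would return a dictionary keyed by nudge strength, which can then be sorted to compare levels in order.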

Hypotheses

ID	Hypothesis
H2	Using only GPS data - no phone sensors, no user input - we can train a model to tell apart walking, biking, taking the bus, and driving.
H3	Cars produce far more CO₂ than their share of trips would suggest. Switching short car trips (under 3 km) to cycling would remove a measurable share of each user's carbon footprint.
H4	Running a 3-step pipeline - Judge to Critic to Revised Judge - produces fact-check verdicts that match human labels more closely than a simple single-prompt judge, measured on 28 claims hand-labeled across 6 user reports using Cohen's κ.
H5	Simulating 15 agentic commuters over 7 days shows that stronger Apple ESG nudges (from a quiet route suggestion, to a leaderboard, to a real-time Watch alert) progressively reduce car usage, lower CO₂ emissions, and improve population health and mood - with the effect growing at each feature level.
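
H4 is scored with Cohen's κ, which corrects raw agreement between two raters for the agreement expected by chance. A minimal standard-library sketch of the statistic follows; the label names and the four example verdicts are made up for illustration and are not rows from reports/human_labels.csv.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items where the two raters match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; treated as full
        # agreement here by convention (the statistic is otherwise undefined).
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical verdicts for four claims.
human = ["supported", "supported", "unsupported", "supported"]
judge = ["supported", "unsupported", "unsupported", "supported"]
kappa = cohens_kappa(human, judge)  # observed 0.75, chance 0.50 -> kappa 0.5
```

Comparing the κ of the Judge+Critic verdicts against the human labels with the κ of the single-prompt baseline against the same labels is what the hypothesis asks for.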

Demo

EDA

Commute Mode Predictor

Agent Claims Evaluation vs Human Claims Evaluation

Social Network Simulation via Agents

How to Run

1. Install dependencies

pip install -r requirements.txt

2. Set up environment variables

Create a .env file in the project root:

ANTHROPIC_API_KEY=your_key_here
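
A .env file is not read by Python on its own, and this README does not state how the project loads it. The sketch below therefore only checks that the variable has reached the process environment and that the placeholder value was replaced; the function name is hypothetical.

```python
import os


def require_api_key(env=None, name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment, or raise with a clear message."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key or key == "your_key_here":
        raise RuntimeError(
            f"{name} is not set; add it to .env or export it in your shell"
        )
    return key
```

Calling require_api_key() at the top of a notebook fails early with a readable error instead of failing on the first API request.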

3. Download the dataset

Run the first cell of eda/eda.ipynb - it downloads the Geolife dataset via kagglehub automatically.

4. Run the notebooks (optional - outputs already saved)

The reports/ and models/ directories are pre-populated. You only need to rerun notebooks if you want to regenerate results from scratch:

eda/eda.ipynb               : EDA (no outputs saved, read-only)
eda/modeling.ipynb          : builds features.parquet + lgbm_mode_classifier.pkl
eda/agentic_evaluation.ipynb: builds user_reports.json + eval_results.json
eda/agentic_simulation.ipynb: builds sim_strength_*.json (costs ~$0.08 in API calls)
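
The simulation figures quoted in this README are consistent with one Claude call per agent per simulated day at each nudge level: 15 agents × 7 days × 5 levels = 525 calls. That reading is an inference from the numbers, not something the notebook is confirmed to do. Under that assumption, the cost of a larger sweep can be estimated as follows.

```python
N_AGENTS = 15           # commuters in the simulation (from the hypotheses)
N_DAYS = 7              # simulated days (from the hypotheses)
N_LEVELS = 5            # nudge levels (from the project structure)
TOTAL_CALLS = 525       # Claude calls quoted for the simulation notebook
TOTAL_COST_USD = 0.08   # approximate cost quoted above

# Assumption: one call per agent per day per level.
calls = N_AGENTS * N_DAYS * N_LEVELS
assert calls == TOTAL_CALLS

cost_per_call = TOTAL_COST_USD / calls


def estimate_cost(agents, days, levels, per_call=cost_per_call):
    """Rough cost in USD of a sweep, scaling linearly with the number of calls."""
    return agents * days * levels * per_call
```

Doubling the population to 30 agents would roughly double the cost, assuming prompt and response lengths stay similar.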

5. Launch the dashboard

streamlit run dashboard.py

The dashboard runs at http://localhost:8501.

Docker

# Build
docker build -t geolife-dashboard .

# Run
docker run -p 8501:8501 \
  -e ANTHROPIC_API_KEY=your_key_here \
  -v ~/.cache/kagglehub:/root/.cache/kagglehub \
  geolife-dashboard

Open http://localhost:8501.

Notes:

-e ANTHROPIC_API_KEY - passes the key at runtime; never bake it into the image

-v ~/.cache/kagglehub - mounts the local dataset cache so the EDA tab works; omit this flag if you only need the other tabs

Results Summary

Hypothesis	Result
H2 - Mode classification	CV macro-F1 = 0.553 ± 0.037 across 5 folds (subject-independent)
H3 - Emissions attribution	Car = 24% of distance but 52% of CO₂. Switching sub-3 km car trips to bike saves a measurable share of per-user emissions
H4 - Agentic evaluation	Judge+Critic loop (κ) vs. single-prompt baseline - see Agentic Evaluation tab
H5 - Behaviour simulation	Car usage and CO₂ dropped at every nudge level; leaderboard level produced the biggest shift
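
The H3 result compares each mode's share of distance with its share of emissions. The sketch below shows that calculation on made-up numbers: the emission factors and distances are illustrative assumptions, not the values used in eda/eda.ipynb, and they are not meant to reproduce the 24% and 52% figures.

```python
# Illustrative emission factors in kg CO2 per km (assumed, not from the project).
EMISSION_FACTORS_KG_PER_KM = {"walk": 0.0, "bike": 0.0, "bus": 0.10, "car": 0.20}


def shares(values):
    """Normalise a mode -> value mapping so the values sum to 1."""
    total = sum(values.values())
    return {mode: (v / total if total else 0.0) for mode, v in values.items()}


def emissions_by_mode(distance_km, factors=EMISSION_FACTORS_KG_PER_KM):
    """Multiply each mode's distance by its emission factor."""
    return {mode: km * factors[mode] for mode, km in distance_km.items()}


# Hypothetical distances for one user, in km.
distance_km = {"walk": 16.0, "bike": 20.0, "bus": 40.0, "car": 24.0}

distance_share = shares(distance_km)
emission_share = shares(emissions_by_mode(distance_km))
```

With these made-up inputs the car covers 24% of the distance but accounts for about 55% of the emissions, which is the same kind of gap the H3 row reports.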

Proposed Apple Products

Feature	What it does
Commute Copilot	On-device trip classification + weekly Claude-generated carbon report, fact-checked by Judge+Critic
Carbon Ring	A new Apple Health ring that rewards low-carbon travel with points for carbon offsets or Watch features
Apple Green Impact	Privately aggregated user footprints feed a stronger Apple ESG score, which in turn broadens the investor base

References

Zheng, Y. et al., Microsoft Research Asia. Microsoft GeoLife GPS Trajectory Dataset.
Park, J.S. et al. (2023). Generative agents: Interactive simulacra of human behavior. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST).
Dabiri, S. & Heaslip, K. (2018). Inferring transportation modes from GPS trajectories using a convolutional neural network. Transportation Research Part C.
Xiao, X. et al. (2012). Inferring social ties between users with human location history. Journal of Ambient Intelligence and Humanized Computing.
Zheng, Y. et al. (2010). GeoLife: A collaborative social networking service among user, location and trajectory. IEEE Data Engineering Bulletin.
Zheng, Y. et al. (2008). Understanding mobility based on GPS data. UbiComp.

License

Code is released under the MIT License - see LICENSE.

The Microsoft Geolife dataset is the property of Microsoft Research Asia and is redistributed (in derived form) under the terms of its original release. Please cite the Geolife papers above if you use any derived artifact from this repository.
