An end-to-end agentic mobility pipeline built on the Microsoft Geolife GPS dataset:
- Classify how people commute from raw GPS alone - no extra sensors, no user input
- Estimate each user's carbon footprint by mode
- Generate a personalised sustainability report using Claude, fact-checked by a Judge+Critic agent loop
- Simulate how Apple ESG nudges could shift commute behaviour using LLM-powered agents
- Propose three Apple product features grounded in the above work
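As a rough illustration of the first step above, classifying mode from raw GPS usually starts by turning timestamped fixes into motion features such as speed. The sketch below derives per-segment speeds with the haversine formula. The function names and the `(lat, lon, unix_seconds)` tuple layout are assumptions made for this example, not the repository's actual feature code.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def segment_speeds(points):
    """Speed in m/s for each consecutive pair of (lat, lon, unix_seconds) fixes."""
    speeds = []
    for (la1, lo1, t1), (la2, lo2, t2) in zip(points, points[1:]):
        dt = t2 - t1
        if dt <= 0:  # skip duplicate or out-of-order timestamps
            continue
        speeds.append(haversine_m(la1, lo1, la2, lo2) / dt)
    return speeds
```

Summary statistics of these speeds (mean, maximum, variance) are the kind of input a tree-based classifier can use without any extra sensors.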
geolife-human-psychology/
│
├── dashboard.py                  # Streamlit app entry point
├── requirements.txt
│
├── dashboard_utils/              # One module per dashboard page
│   ├── hypothesis.py             # Hypothesis page
│   ├── data_analysis.py          # EDA page
│   ├── modeling.py               # Commute Mode Predictor page
│   ├── agentic_eval.py           # Agentic Evaluation vs Human page
│   ├── agentic_simulation.py     # Agentic Behaviour Simulation page
│   ├── products.py               # Proposed Apple Products page
│   ├── references.py             # References page
│   └── theme.py                  # Apple-style CSS theme
│
├── eda/                          # Notebooks (analysis only - no rerun needed)
│   ├── eda.ipynb                 # EDA: data quality, mode distribution, emissions
│   ├── modeling.ipynb            # Feature engineering + LightGBM classifier
│   ├── agentic_evaluation.ipynb  # Report generation + Judge+Critic loop
│   └── agentic_simulation.ipynb  # 5-level Apple nudge simulation (525 Claude calls)
│
├── data/
│   └── processed/
│       └── features.parquet      # Cached feature dataset (built by modeling.ipynb)
│
├── models/
│   └── lgbm_mode_classifier.pkl  # Trained LightGBM model + label encoder
│
└── reports/                      # All generated outputs
    ├── user_reports.json         # Claude-generated sustainability reports
    ├── eval_results.json         # Judge+Critic + baseline verdicts
    ├── human_labels.csv          # Hand-labeled claims for Cohen's κ
    ├── simulation_index.json     # Simulation sweep metadata
    └── sim_strength_X.XX.json    # One file per nudge level (0.00 – 1.00)
| ID | Hypothesis |
|---|---|
| H2 | Using only GPS data - no phone sensors, no user input - we can train a model to tell apart walking, biking, taking the bus, and driving. |
| H3 | Cars produce far more CO₂ than their share of trips would suggest. Swapping short car trips (under 3 km) to bike would eliminate a measurable chunk of each user's carbon footprint. |
| H4 | Running a 3-step pipeline - Judge to Critic to Revised Judge - produces fact-check verdicts that match human labels more closely than a simple single-prompt judge, measured on 28 claims hand-labeled across 6 user reports using Cohen's κ. |
| H5 | Simulating 15 agentic commuters over 7 days shows that stronger Apple ESG nudges (quiet route suggestion to leaderboard to real-time Watch alert) progressively reduce car usage, lower CO₂ emissions, and improve population health and mood - with the effect growing at each feature level. |
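H4 compares agent verdicts with human labels using Cohen's κ, which corrects raw agreement for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal sketch of that computation follows. The function name and the verdict strings are hypothetical and are not taken from `human_labels.csv`.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items where the raters match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical verdicts for four claims.
human = ["supported", "supported", "refuted", "refuted"]
judge = ["supported", "refuted", "refuted", "refuted"]
print(cohens_kappa(human, judge))  # 0.5
```

Here the raters agree on 3 of 4 claims (p_o = 0.75) while chance agreement is p_e = 0.5, which gives κ = 0.5.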
pip install -r requirements.txt

Create a .env file in the project root:
ANTHROPIC_API_KEY=your_key_here
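Before spending API calls, it can help to confirm that the key was actually filled in. The sketch below parses `.env`-style text and rejects a missing key or the `your_key_here` placeholder. Both helper names are hypothetical, and the parser only handles simple `NAME=value` lines; how the notebooks themselves load `.env` is not shown in this README.

```python
def read_env_text(text):
    """Parse simple NAME=value lines, ignoring blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(values):
    """Return ANTHROPIC_API_KEY, or raise if it is missing or still the placeholder."""
    key = values.get("ANTHROPIC_API_KEY", "")
    if not key or key == "your_key_here":
        raise RuntimeError("ANTHROPIC_API_KEY is not set; edit .env in the project root")
    return key
```

A typical call would be `require_api_key(read_env_text(open(".env").read()))`, run once before the agentic notebooks.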
Run the first cell of eda/eda.ipynb - it downloads the Geolife dataset via kagglehub automatically.
The reports/ and models/ directories are pre-populated. You only need to rerun notebooks if you want to regenerate results from scratch:
- `eda/eda.ipynb`: EDA (read-only, no outputs saved)
- `eda/modeling.ipynb`: builds `features.parquet` + `lgbm_mode_classifier.pkl`
- `eda/agentic_evaluation.ipynb`: builds `user_reports.json` + `eval_results.json`
- `eda/agentic_simulation.ipynb`: builds `sim_strength_*.json` (costs ~$0.08 in API calls)
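To check which simulation outputs are already present before deciding whether to rerun anything, the nudge strength can be read back from the `sim_strength_X.XX.json` file names. This is a minimal sketch: the function names are hypothetical, and it makes no assumption about the JSON contents.

```python
import re
from pathlib import Path

# Matches names such as sim_strength_0.25.json and captures the strength.
_SIM_NAME = re.compile(r"^sim_strength_(\d+\.\d{2})\.json$")


def nudge_strength(filename):
    """Return the nudge strength encoded in a simulation file name, or None."""
    match = _SIM_NAME.match(Path(filename).name)
    return float(match.group(1)) if match else None


def list_simulation_files(reports_dir="reports"):
    """Map nudge strength -> path for every simulation file in reports_dir."""
    found = {}
    for path in sorted(Path(reports_dir).glob("sim_strength_*.json")):
        strength = nudge_strength(path)
        if strength is not None:
            found[strength] = path
    return found
```

If `list_simulation_files()` returns one entry per nudge level, the simulation notebook does not need to be rerun.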
streamlit run dashboard.py

The dashboard runs at http://localhost:8501.
# Build
docker build -t geolife-dashboard .
# Run
docker run -p 8501:8501 \
-e ANTHROPIC_API_KEY=your_key_here \
-v ~/.cache/kagglehub:/root/.cache/kagglehub \
geolife-dashboard

Open http://localhost:8501.
Notes:
- `-e ANTHROPIC_API_KEY`: never bake the key into the image
- `-v ~/.cache/kagglehub`: mounts the local dataset cache so the EDA tab works; omit this flag if you only need the other tabs
| Hypothesis | Result |
|---|---|
| H2 - Mode classification | CV macro-F1 = 0.553 ± 0.037 across 5 folds (subject-independent) |
| H3 - Emissions attribution | Car = 24% of distance but 52% of CO₂. Switching car trips under 3 km to bike saves a measurable share of per-user emissions |
| H4 - Agentic evaluation | Judge+Critic loop (κ) vs. single-prompt baseline - see Agentic Evaluation tab |
| H5 - Behaviour simulation | Car usage and CO₂ dropped at every nudge level; leaderboard level produced the biggest shift |
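The H2 score is reported as subject-independent, which is read here as meaning that all trips from a given user fall in the same fold, so the model is always evaluated on people it has never seen. A minimal pure-Python sketch of such a split follows. The function name and the round-robin assignment are assumptions, not the notebook's code. scikit-learn's `GroupKFold` provides the same guarantee, though whether `modeling.ipynb` uses it is not stated.

```python
def subject_independent_folds(user_ids, n_folds=5):
    """Yield (train_idx, test_idx) pairs in which no user appears on both sides."""
    users = sorted(set(user_ids))
    if len(users) < n_folds:
        raise ValueError("need at least as many distinct users as folds")
    # Assign whole users to folds round-robin so every fold gets some users.
    fold_of_user = {user: i % n_folds for i, user in enumerate(users)}
    for fold in range(n_folds):
        test_idx = [i for i, u in enumerate(user_ids) if fold_of_user[u] == fold]
        train_idx = [i for i, u in enumerate(user_ids) if fold_of_user[u] != fold]
        yield train_idx, test_idx
```

Splitting by row instead of by user would let the same person's trips leak into both train and test sets, which tends to inflate the reported F1.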
| Feature | What it does |
|---|---|
| Commute Copilot | On-device trip classification + weekly Claude-generated carbon report, fact-checked by Judge+Critic |
| Carbon Ring | A new Apple Health ring that rewards low-carbon travel with points for carbon offsets or Watch features |
| Apple Green Impact | Privately aggregates user footprints to support a stronger Apple ESG score and, in turn, a broader investor base |
- Zheng et al., Microsoft Research Asia — Microsoft GeoLife GPS Trajectory Dataset.
- Park, J.S. et al. (2023). Generative agents: Interactive simulacra of human behavior. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST).
- Dabiri, S. & Heaslip, K. (2018). Inferring transportation modes from GPS trajectories using a convolutional neural network. Transportation Research Part C.
- Xiao, Z. et al. (2012). Inferring social ties between users with human location history. Journal of Ambient Intelligence and Humanized Computing.
- Zheng, Y. et al. (2010). GeoLife: A collaborative social networking service among user, location and trajectory. IEEE Data Engineering Bulletin.
- Zheng, Y. et al. (2008). Understanding mobility based on GPS data. UbiComp.
Code is released under the MIT License — see LICENSE.
The Microsoft Geolife dataset is the property of Microsoft Research Asia and is redistributed (in derived form) under the terms of its original release. Please cite the Geolife papers above if you use any derived artifact from this repository.