🗺️ Spatial Join & Enrich any urban layer given any external urban dataset of interest, streamline your urban analysis with Scikit-Learn-Like pipelines, and share your insights with the urban research community!

VIDA-NYU/UrbanMapper

UrbanMapper

Enrich Urban Layers Given Urban Datasets

with an easy-to-use API and a Sklearn-like, Shareable & Reproducible Urban Pipeline

Badges: Beartype compliant | UV compliant | RUFF compliant | Jupyter | Python 3.10+ | Compilation Status

Important

  1. We support JupyterGIS as a bridge to export your Urban Pipeline for collaborative exploration. Shout-out to @mfisher87 for his tremendous help.
  2. We highly recommend exploring the examples/ folder for Jupyter Notebook-based examples 🎉
  3. This library is under active development and is not yet stable. Expect bugs & frequent changes!

🌆 UrbanMapper –– In a Nutshell

UrbanMapper –– f(.) –– brings urban layers (e.g. Street Roads / Intersections or Sidewalks / Cross Walks) –– X –– and your urban datasets –– Y –– together through the function f(X, Y) = X ⋈ Y, allowing you to spatially join these components and enrich X with Y's attributes, features and information.
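To make f(X, Y) = X ⋈ Y more concrete, here is a minimal pure-Python sketch of the idea. It is not UrbanMapper's API: the function names (`haversine_m`, `nearest_layer_id`, `enrich_mean`), the record layout and the sample coordinates are all hypothetical. Each record of Y is mapped to its nearest element of X, then one attribute is averaged per element.

```python
import math
from collections import defaultdict


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def nearest_layer_id(layer, lat, lon):
    """Return the id of the layer element (X) closest to the given point."""
    closest = min(layer, key=lambda e: haversine_m(e["lat"], e["lon"], lat, lon))
    return closest["id"]


def enrich_mean(layer, records, attribute, output):
    """Join each record (Y) to its nearest element of X, then average `attribute`."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[nearest_layer_id(layer, rec["lat"], rec["lon"])].append(rec[attribute])
    enriched = []
    for element in layer:
        values = grouped.get(element["id"], [])
        mean = sum(values) / len(values) if values else None
        enriched.append({**element, output: mean})
    return enriched


# Hypothetical sample data: three intersections (X) and three buildings (Y).
intersections = [
    {"id": "A", "lat": 40.6920, "lon": -73.9870},
    {"id": "B", "lat": 40.6900, "lon": -73.9820},
    {"id": "C", "lat": 40.7000, "lon": -73.9700},
]
buildings = [
    {"lat": 40.6921, "lon": -73.9869, "floors": 10},
    {"lat": 40.6919, "lon": -73.9872, "floors": 20},
    {"lat": 40.6901, "lon": -73.9821, "floors": 4},
]

enriched = enrich_mean(intersections, buildings, "floors", "avg_floors")
for row in enriched:
    print(row["id"], row["avg_floors"])
```

The brute-force nearest search is only for readability; a real spatial join would normally rely on a spatial index.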

UrbanMapper is built with a Scikit-Learn-like philosophy: (I) from loading to visualisation, by way of mapping and enriching, we want to cover as many user needs as possible in a welcoming way, without requiring 20 to 50+ lines for a single non-reproducible, non-shareable, non-updatable piece of code; and (II) the library's flexibility allows easy contributions to sub-modules without having to start from scratch “all the time”.

This means that UrbanMapper lets you build a reproducible, shareable, and updatable urban pipeline in a few lines of code 🎉 It can therefore serve as a stepping stone or accelerator for further analysis, such as machine-learning-based work.

The only thing we ask is that your datasets Y are spatial datasets (i.e. with latitude and longitude coordinates); UrbanMapper then enriches your urban layer of interest with the insights your datasets carry.
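A quick way to check that requirement before using a dataset could look like the sketch below. It is an illustration only and does not reflect how UrbanMapper validates inputs; the helper name `has_valid_coordinates` and the column names `latitude` / `longitude` are assumptions.

```python
def has_valid_coordinates(record, lat_key="latitude", lon_key="longitude"):
    """Return True if the record carries a usable latitude/longitude pair."""
    try:
        lat = float(record[lat_key])
        lon = float(record[lon_key])
    except (KeyError, TypeError, ValueError):
        # Missing column, None, or a non-numeric string such as "".
        return False
    # NaN fails both range comparisons, so it is rejected as well.
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


# Hypothetical rows, e.g. as read by csv.DictReader.
rows = [
    {"latitude": "40.69", "longitude": "-73.98"},  # valid
    {"latitude": "", "longitude": "-73.98"},       # empty latitude
    {"latitude": 95, "longitude": 10},             # latitude out of range
    {"address": "no coordinates at all"},          # columns missing
]

validity = [has_valid_coordinates(row) for row in rows]
print(validity)
```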


Installation

We highly recommend using uv to install from source: it avoids the hassle of Conda and other package managers, is among the fastest installers we know of, and manages dependencies without manual environment activation (biggest flex!). If you prefer not to use uv, that is fine: a pip-based alternative is described further below, and the upcoming documentation will cover it in more detail.

Prerequisites

  • First, ensure uv is installed on your machine by following these instructions.

  • Second, make sure you have Python 3.10 or later installed. If you are not sure:

uv python install 3.10
uv python pin 3.10

And you are ready to go! 🎉

Steps

  1. Clone the UrbanMapper repository:
    git clone git@github.com:VIDA-NYU/UrbanMapper.git
    # git clone https://github.com/VIDA-NYU/UrbanMapper.git
    cd UrbanMapper
  2. Lock and sync dependencies with uv:
    uv lock
    uv sync
  3. (Recommended) Install Jupyter extensions for interactive visualisations requiring Jupyter widgets:
    uv run jupyter labextension install @jupyter-widgets/jupyterlab-manager
  4. Launch Jupyter Lab to explore UrbanMapper (Way faster than running Jupyter without uv):
    uv run --with jupyter jupyter lab

Voilà! Next, we recommend the Getting Started with UrbanMapper section to see how to use the tool.

🫣 Different ways to install UrbanMapper (e.g. with pip)

Note on Alternative Dependency Management Methods

While we strongly recommend using uv for managing dependencies due to its superior speed and ease of use, alternative methods are available for those who prefer not to use uv. These alternatives are not as efficient, as they are slower and require more manual intervention.

Please be aware that the following assumptions are made for these alternative methods:

  • You have pip installed.
  • You are working within a virtual environment or a conda environment.

If you are not currently using a virtual or conda environment, we highly recommend setting one up to prevent potential conflicts and maintain a clean development workspace. For assistance, refer to the official Python venv or conda documentation.

  1. Clone the UrbanMapper repository:

     git clone git@github.com:VIDA-NYU/UrbanMapper.git
     # git clone https://github.com/VIDA-NYU/UrbanMapper.git
     cd UrbanMapper
  2. Install UrbanMapper dependencies using pip:

     pip install -r requirements.txt
  3. Install UrbanMapper:

     pip install -e .
     # Run this from the repository root (step 1 already moved you into UrbanMapper/), inside your virtual environment.
     # From the parent directory instead, run: pip install -e ./UrbanMapper
     # Note that -e means "editable" mode, which allows you to make changes to the code and see them reflected.
     # If you don't want to use editable mode, drop -e: pip install .
  4. (Recommended) Install Jupyter extensions for interactive visualisations requiring Jupyter widgets:

     jupyter labextension install @jupyter-widgets/jupyterlab-manager
  5. Launch Jupyter Lab to explore UrbanMapper:

     jupyter lab

🗺️ Urban Layers Currently Supported

UrbanMapper currently supports the following urban layers:

  1. Streets Roads –– UrbanMapper can load street road networks from OpenStreetMap (OSM) using OSMnx.
  2. Streets Intersections –– UrbanMapper can load street intersections from OpenStreetMap (OSM) using OSMnx.
  3. Sidewalks –– UrbanMapper can load sidewalks via Tile2Net, which uses deep learning for automated mapping of pedestrian infrastructure from aerial imagery.
  4. Cross Walks –– UrbanMapper can load crosswalks via Tile2Net, which uses deep learning for automated mapping of pedestrian infrastructure from aerial imagery.
  5. Cities' Features –– UrbanMapper can load OSM city features such as buildings, parks, bike lanes, etc. via the OSMnx API.
  6. Region Neighborhoods –– UrbanMapper can load neighborhood boundaries from OpenStreetMap (OSM) using the OSMnx features module.
  7. Region Cities –– UrbanMapper can load city boundaries from OpenStreetMap (OSM) using the OSMnx features module.
  8. Region States –– UrbanMapper can load state boundaries from OpenStreetMap (OSM) using the OSMnx features module.
  9. Region Countries –– UrbanMapper can load country boundaries from OpenStreetMap (OSM) using the OSMnx features module.

More will be added in the future, e.g. Subway/Tube networks, States/Provinces, Countries/Regions, Continents, etc.

🚀 Getting Started with UrbanMapper

Are you ready to dive into urban data analysis? The simplest approach to get started with UrbanMapper is to look through the hands-on examples in the examples/ directory. These Jupyter notebooks walk you through the library's features, from loading and prepping data to enriching urban layers and visualising the results.

Whether you are new to urban data science or an experienced urban planner or data scientist, these examples will help you realise UrbanMapper's full potential and accelerate your urban data science workflow.

The examples/ directory is organised into four main sections: Basics/, End-to-End/, Study Cases and External Libraries Usages. Here's a quick look at what each notebook covers:

Tip

You can download the public datasets used throughout all examples via two channels:

  • Channel 1: Our Google Drive public folder

    • Option A: Download all datasets at once using the command:
      # If you do not have gdown installed, install it first
      # brew install gdown or pip install gdown
      gdown https://drive.google.com/drive/folders/1n-5zkNqT97W-I9Dc7X_mG4kezskfVtlb -O ./data --folder
    • Option B: Manually download specific datasets from the same Google Drive folder on demand.
  • Channel 2: Official data sources

    • Follow the data source links provided in the various notebooks.
    • Download the datasets directly from their official channels.
    • Place the downloaded files in the data/ folder or any other folder of your choice.

Voilà! You are ready to go! 🎉

🧩 Basics

  • [1] loader.ipynb: Learn how to load urban data from various formats into UrbanMapper.

    • What it does: Demonstrates loading PLUTO (CSV), taxi trip (Parquet), and NYC Pluto buildings information (Shapefile) data, setting the stage for analysis.
  • [2] urban_layer.ipynb: Discover how to create urban layers like streets or intersections and more!

    • What it does: Builds a streets layer for Downtown Brooklyn and previews it statically. It also demonstrates further urban layer primitives, mostly with static previews and some interactive ones.
  • [3] imputer.ipynb: Handle missing geospatial data with ease.

    • What it does: Uses SimpleGeoImputer to fill in missing coordinates in PLUTO data. It also shows that other imputer techniques are available and that more could be implemented.
  • [4] filter.ipynb: Focus your data on specific areas. Use case: you have data for the entire Big Apple but your study focuses on Downtown Brooklyn, so it makes little sense to keep the records that fall outside Downtown Brooklyn.

    • What it does: Applies a BoundingBoxFilter to keep only data within Downtown Brooklyn. It also shows that more filter techniques could be added.
  • [5] enricher.ipynb: Add valuable insights to your urban layers from your urban data information.

    • What it does: Enriches a street intersections layer with average building floors from PLUTO data.
  • [6] visualiser.ipynb: Bring your data to life with maps.

    • What it does: Creates static and interactive maps (e.g. dark-themed) of an enriched urban layer.
  • [7] urban_pipeline.ipynb: Streamline your workflow with a pipeline. Save and Share!

    • What it does: Builds and runs an urban pipeline that loads, processes, enriches, and visualises.
    • Beyond: It shows how to save and load your pipeline for future use, e.g. for ML exploration, one example of which is showcased.
    • Bonus: We also show how to export your urban pipeline to JupyterGIS.
  • [8] pipeline_generator.ipynb: Let an LLM suggest a pipeline for you based on your user input.

    • What it does: Generates a pipeline from a description (e.g., mapping PLUTO data to intersections) using an LLM of your choice among those available. The example uses gpt-4o.
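As an illustration of the filtering idea behind filter.ipynb, the sketch below keeps only the records that fall inside a bounding box. It is not the library's BoundingBoxFilter: the `BoundingBox` class, the `filter_to_bbox` helper and the box coordinates (a rough rectangle around Downtown Brooklyn, not an official boundary) are assumptions made for the example.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Axis-aligned box in longitude/latitude degrees (hypothetical helper)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def filter_to_bbox(records, bbox):
    """Keep records whose coordinates are present and inside the box."""
    kept = []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        if lat is None or lon is None:
            continue  # records without coordinates cannot be placed on the map
        if bbox.contains(lat, lon):
            kept.append(rec)
    return kept


# Rough, illustrative box around Downtown Brooklyn.
downtown_bk = BoundingBox(min_lon=-73.995, min_lat=40.685, max_lon=-73.975, max_lat=40.700)

records = [
    {"name": "brooklyn", "lat": 40.6920, "lon": -73.9870},
    {"name": "midtown", "lat": 40.7580, "lon": -73.9855},
    {"name": "unknown", "lat": None, "lon": -73.9870},
]

kept_names = [rec["name"] for rec in filter_to_bbox(records, downtown_bk)]
print(kept_names)
```

Here records with missing coordinates are simply skipped; in the notebooks, that case is what the imputer step is meant to handle before filtering.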

🔄 End-to-End

These notebooks showcase complete workflows, tying all the pieces together.

  • [1] step_by_step.ipynb: Walk through the UrbanMapper workflow manually.

    • What it does: Loads PLUTO data, creates an intersections urban layer, imputes, filters, enriches with average floors, and visualises it, all step by step.
  • [2] pipeline_way.ipynb: Achieve the same results with an urban pipeline.

    • What it does: Streamlines the step-by-step workflow into a single UrbanPipeline, showcasing efficiency and reusability.
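The difference between the two notebooks can be pictured with a small sketch: individual steps become named entries in one object that can be run, saved, and reloaded. This is a toy in plain Python, not the real UrbanPipeline; `MiniPipeline`, `STEP_REGISTRY`, the step functions and the JSON format are all hypothetical.

```python
import json
from collections import Counter


def drop_missing(records, keys):
    """Remove records lacking any of the given keys."""
    return [r for r in records if all(r.get(k) is not None for k in keys)]


def count_by(records, key):
    """Count records per value of `key` (e.g. per intersection id)."""
    return dict(Counter(r[key] for r in records))


# Maps a serialisable step kind to the function implementing it.
STEP_REGISTRY = {"drop_missing": drop_missing, "count_by": count_by}


class MiniPipeline:
    """Ordered list of (name, kind, params) steps applied one after another."""

    def __init__(self, steps):
        self.steps = list(steps)

    def run(self, data):
        for _name, kind, params in self.steps:
            data = STEP_REGISTRY[kind](data, **params)
        return data

    def to_json(self):
        return json.dumps(
            [{"name": n, "kind": k, "params": p} for n, k, p in self.steps]
        )

    @classmethod
    def from_json(cls, text):
        return cls([(s["name"], s["kind"], s["params"]) for s in json.loads(text)])


pipeline = MiniPipeline([
    ("clean", "drop_missing", {"keys": ["lat", "lon"]}),
    ("enrich", "count_by", {"key": "intersection"}),
])

# Hypothetical records already mapped to an intersection id.
records = [
    {"lat": 40.6920, "lon": -73.9870, "intersection": "A"},
    {"lat": 40.6921, "lon": -73.9869, "intersection": "A"},
    {"lat": None, "lon": -73.9820, "intersection": "B"},
    {"lat": 40.6900, "lon": -73.9820, "intersection": "B"},
]

result = pipeline.run(records)
restored = MiniPipeline.from_json(pipeline.to_json())  # shareable as plain text
print(result, restored.run(records) == result)
```

Storing step names and parameters rather than code objects keeps the saved form readable and easy to share, at the cost of needing a registry on the loading side.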

📊 Study Cases

Ready to see UrbanMapper tackle real urban challenges? These study cases apply the library to specific datasets, showing its power in action.

🚗 Downtown BK Collisions Study

  • [1] Downtown_BK_Collisions_StepByStep.ipynb: Get hands-on with collision data analysis.

    • What it does: Step by step, you'll load collision data, build an intersections layer, handle missing coordinates, filter to Downtown Brooklyn, map collisions to intersections, count them up, and visualize the hotspots.
  • [2] Downtown_BK_Collisions_Pipeline.ipynb: Simplify the process with a pipeline.

    • What it does: Wraps the entire workflow into an UrbanPipeline, making it a breeze to run and reuse.
  • [3] Downtown_BK_Collisions_Advanced_Pipeline.ipynb: Take it up a notch with extra metrics.

    • What it does: Adds total injuries and fatalities per intersection to the analysis, giving you a fuller picture of collision impacts.
  • [4] Downtown_BK_Collisions_Advanced_Pipeline_Extras.ipynb: Get even more insights, with more enrichments than [3].

    • What it does: Adds more metrics than [3] by using the custom function from the enricher module, which gives more flexibility but requires more coding.

🚖 Downtown BK Taxi Trips Study

  • [1] Downtown_BK_Taxi_Trips_StepByStep.ipynb: Dive into taxi trip data analysis.

    • What it does: Manually load taxi data, create a streets layer, impute missing coordinates, filter to the area, map pickups and dropoffs to streets, count them, and visualize the busiest spots.
  • [2] Downtown_BK_Taxi_Trips_Pipeline.ipynb: Streamline your taxi trip analysis.

    • What it does: Bundles all the steps into an UrbanPipeline, saving you time and effort.
  • [3] Downtown_BK_Taxi_Trips_Advanced_Pipeline.ipynb: Get more insights with additional enrichments.

    • What it does: Adds average fare amount per pickup segment, helping you understand not just where taxis go, but how much they earn.
  • [4] Downtown_BK_Collisions_Advanced_Pipeline_Extras.ipynb: Get even more insights, with more enrichments than [3].

    • What it does: Adds more metrics than [3] by using the custom function from the enricher module, which gives more flexibility but requires more coding.

🚖 Remarkable Trees Paris Study

  • [1] Paris_Remarquable_Trees_Pipeline.ipynb: Explore the remarkable trees of Paris within its neighborhoods.

    • What it does: This notebook demonstrates how to load the remarkable trees dataset, create a neighborhoods layer, and enrich the neighborhoods with remarkable trees information, e.g. the number of trees per neighborhood or their average circumference per neighborhood.
  • [2] Paris_Remarquable_Trees_Advanced_Pipeline.ipynb: Go further with the remarkable trees of Paris and its neighborhoods.

    • What it does: Covers the same steps as [1]: loading the remarkable trees dataset, creating a neighborhoods layer, and enriching the neighborhoods (e.g. tree count and average circumference per neighborhood). As an extra, it uses an LLM to summarise why the trees are remarkable and what their impact is on the neighborhoods.

🔗 External Libraries Usage

UrbanMapper doesn't work in isolation: it plays nicely with other powerful tools to make your user journey even more pleasant. To showcase these integrations, we've prepared a few notebooks that demonstrate how to use the mixins that bridge UrbanMapper with other libraries.

  • [1] auctus_search.ipynb: Find and load datasets with Auctus from https://auctus.vida-nyu.org/.

    • What it does: Demonstrates searching for urban datasets (like PLUTO) using Auctus, a data discovery tool. You'll learn to profile datasets and load them directly into UrbanMapper for analysis. See more at https://github.com/VIDA-NYU/auctus_search.
  • [2] interactive_table_vis.ipynb: Visualise data interactively with Skrub from https://skrub-data.org/.

    • What it does: Loads a CSV file and uses Skrub's interactive table visualisation to explore the data. This integration allows you to sort, filter, and inspect your urban datasets dynamically.
  • [3] Multi Urban Pipeline via Jupyter GIS: Combines collisions, taxi trips, and 311 NYC sidewalk inquiries for a holistic view of urban dynamics. It showcases UrbanMapper's capability to handle multiple urban pipelines and visualise them on a single interactive, shareable, collaborative map. This approach allows for a deeper understanding of urban interactions, potentially uncovering correlations between traffic incidents, taxi usage, and public concerns.


🗺️ Roadmap / Future Work

Note

For more about future work, explore the repository's Issues tab!

API

Important

Full documentation is forthcoming; until then, expect some breaking changes in the API. Bear with us, the docs are cooking! ⚠️


Licence

UrbanMapper is released under the MIT Licence.
