Database Sharding with PostgreSQL

This project demonstrates database sharding with PostgreSQL across six shard databases: shard_0, shard_1, shard_2, shard_asia, shard_eu, and shard_na. Each sharding strategy is exposed as a Python API built with FastAPI, with psycopg2 used to connect to the shards.

Shards Overview

  • shard_0, shard_1, shard_2: Used for sharding based on hash keys or customer IDs.
  • shard_asia, shard_eu, shard_na: Used for sharding based on geographic regions.

Features

  1. Hash-Based Sharding:

    • Distributes users across shard_0, shard_1, and shard_2 using a hash function on user_id.
    • APIs:
      • Insert user: /users/ (POST)
      • Retrieve user: /users/ (GET)
      • Get shard index: /shard/{user_id} (GET)
  2. Geographic Sharding:

    • Distributes users across shard_asia, shard_eu, and shard_na based on the region.
    • APIs:
      • Insert user: /users (POST)
      • Retrieve user: /get_user (GET)
  3. Customer ID-Based Sharding:

    • Distributes customers across shard_0, shard_1, and shard_2 based on customer_id ranges.
    • APIs:
      • Insert customer: /customers/ (POST)
      • Retrieve customer: /get_customers (GET)
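
The three routing strategies above can be sketched in a few lines of Python. This is a minimal illustration, not the code in hash_key.py, geo.py, or key_range.py: the MD5-based hash, the region keys (asia, eu, na), and the customer_id range boundaries are all assumptions chosen for the example.

```python
import hashlib

# Shard names come from this README; everything else here is hypothetical.
HASH_SHARDS = ["shard_0", "shard_1", "shard_2"]
REGION_SHARDS = {"asia": "shard_asia", "eu": "shard_eu", "na": "shard_na"}

# Hypothetical inclusive customer_id ranges, one per shard.
CUSTOMER_RANGES = [
    (1, 1000, "shard_0"),
    (1001, 2000, "shard_1"),
    (2001, 3000, "shard_2"),
]


def hash_shard(user_id: int) -> str:
    """Pick a shard from a stable hash of user_id (MD5 is an assumption)."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return HASH_SHARDS[int(digest, 16) % len(HASH_SHARDS)]


def region_shard(region: str) -> str:
    """Pick a shard from the user's region, ignoring case and whitespace."""
    key = region.strip().lower()
    if key not in REGION_SHARDS:
        raise ValueError(f"unknown region: {region!r}")
    return REGION_SHARDS[key]


def range_shard(customer_id: int) -> str:
    """Pick the shard whose range contains customer_id."""
    for low, high, shard in CUSTOMER_RANGES:
        if low <= customer_id <= high:
            return shard
    raise ValueError(f"customer_id {customer_id} is outside every range")
```

A stable hash keeps a given user_id on the same shard across processes and restarts. Range-based routing keeps neighbouring IDs together, but the ranges have to be extended when IDs grow past the last boundary.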

Setup Instructions

Prerequisites

  • PostgreSQL installed and running.
  • Python 3.9+ installed.
  • Required Python libraries:
    • fastapi
    • uvicorn
    • psycopg2
    • pydantic

Database Configuration

  1. Create the following databases in PostgreSQL:

    • shard_0
    • shard_1
    • shard_2
    • shard_asia
    • shard_eu
    • shard_na
  2. Create a users table in each shard:

    CREATE TABLE users (
        id SERIAL PRIMARY KEY,
        name VARCHAR(100),
        email VARCHAR(100),
        region VARCHAR(50)
    );
  3. Create a customers table in shard_0, shard_1, and shard_2:

    CREATE TABLE customers (
        customer_id SERIAL PRIMARY KEY,
        name VARCHAR(100),
        email VARCHAR(100)
    );
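
Running the same DDL by hand against six databases is easy to get wrong, so the table creation above could be scripted. The following is a sketch under assumptions: the helper names (ddl_plan, apply_plan) and the connection parameters are hypothetical, and IF NOT EXISTS is added here so the script can be re-run safely.

```python
ALL_SHARDS = ["shard_0", "shard_1", "shard_2", "shard_asia", "shard_eu", "shard_na"]
CUSTOMER_SHARDS = {"shard_0", "shard_1", "shard_2"}

USERS_DDL = """
CREATE TABLE IF NOT EXISTS users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100),
    region VARCHAR(50)
);
"""

CUSTOMERS_DDL = """
CREATE TABLE IF NOT EXISTS customers (
    customer_id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100)
);
"""


def ddl_plan() -> dict[str, list[str]]:
    """Map each shard database to the statements it should receive."""
    plan = {}
    for shard in ALL_SHARDS:
        statements = [USERS_DDL]
        if shard in CUSTOMER_SHARDS:
            statements.append(CUSTOMERS_DDL)
        plan[shard] = statements
    return plan


def apply_plan(user: str, password: str, host: str = "localhost", port: int = 5432) -> None:
    """Run the plan against a live server (credentials are assumptions)."""
    import psycopg2  # imported lazily so ddl_plan() works without the driver

    for dbname, statements in ddl_plan().items():
        conn = psycopg2.connect(
            dbname=dbname, user=user, password=password, host=host, port=port
        )
        try:
            with conn:  # commits on success, rolls back on error
                with conn.cursor() as cur:
                    for statement in statements:
                        cur.execute(statement)
        finally:
            conn.close()
```

The databases themselves still need to exist first (step 1); this sketch only creates the tables inside them.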

Installation

  1. Clone the repository:

    git clone https://github.com/Rajkumardev/db-sharding.git
    cd db-sharding
  2. Install dependencies:

    pip install -r requirements.txt
  3. Run the FastAPI applications:

    • Hash-Based Sharding:
      uvicorn hash_key:app --host 0.0.0.0 --port 5002 --reload
    • Geographic Sharding:
      uvicorn geo:app --host 0.0.0.0 --port 5003 --reload
    • Customer ID-Based Sharding:
      uvicorn key_range:app --host 0.0.0.0 --port 5001 --reload

API Endpoints

Hash-Based Sharding

  • Insert User: POST /users/

    {
        "user_id": 1,
        "name": "John Doe",
        "email": "john.doe@example.com"
    }
  • Retrieve User: GET /users/?user_id=1

  • Get Shard Index: GET /shard/{user_id}
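
The three hash-based endpoints can be exercised from Python with only the standard library. This is a rough client sketch: the base URL assumes the service is running locally on port 5002 as in the installation steps, the helper names are hypothetical, and the assumption that responses are JSON is not confirmed by this README.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:5002"  # assumption: hash_key app running locally


def insert_user_request(user_id: int, name: str, email: str) -> Request:
    """Build the POST /users/ request with a JSON body."""
    body = json.dumps({"user_id": user_id, "name": name, "email": email})
    return Request(
        f"{BASE_URL}/users/",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_user_request(user_id: int) -> Request:
    """Build the GET /users/?user_id=... request."""
    query = urlencode({"user_id": user_id})
    return Request(f"{BASE_URL}/users/?{query}", method="GET")


def shard_index_request(user_id: int) -> Request:
    """Build the GET /shard/{user_id} request."""
    return Request(f"{BASE_URL}/shard/{user_id}", method="GET")


def send(request: Request):
    """Send a request and decode the body, assuming the API returns JSON."""
    with urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `send(shard_index_request(1))` would ask the running service which shard holds user 1.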

Geographic Sharding

  • Insert User: POST /users

  • Retrieve User: GET /get_user

Customer ID-Based Sharding

  • Insert Customer: POST /customers/

    {
        "customer_id": 1001,
        "name": "Alice",
        "email": "alice@example.com"
    }
  • Retrieve Customer: GET /get_customers?customer_id=1001
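
Once the shard for a customer_id has been chosen, the retrieve endpoint presumably comes down to a single lookup on that shard's customers table. The sketch below shows one way to write that lookup with a parameterized query. The function name fetch_customer is hypothetical, conn stands for an open psycopg2 connection to the selected shard, and key_range.py may do this differently.

```python
from typing import Any, Optional


def fetch_customer(conn: Any, customer_id: int) -> Optional[dict]:
    """Return the customer row as a dict, or None if it is not on this shard."""
    with conn.cursor() as cur:
        # %s is the psycopg2 placeholder; passing the value separately
        # lets the driver escape it instead of building SQL by hand.
        cur.execute(
            "SELECT customer_id, name, email FROM customers WHERE customer_id = %s",
            (customer_id,),
        )
        row = cur.fetchone()
    if row is None:
        return None
    return {"customer_id": row[0], "name": row[1], "email": row[2]}
```

Returning None for a missing row leaves it to the API layer to decide how to report a customer that was not found.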

File Structure

.
├── geo.py                # Geographic sharding implementation
├── hash_key.py           # Hash-based sharding implementation
├── key_range.py          # Customer ID-based sharding implementation
└── README.md             # Documentation

License

This project is licensed under the MIT License.
