This project demonstrates database sharding using PostgreSQL with multiple shards: shard_0, shard_1, shard_2, shard_asia, shard_eu, and shard_na. The implementation includes Python-based APIs using FastAPI and psycopg2 for shard management.
- shard_0, shard_1, shard_2: Used for sharding based on hash keys or customer IDs.
- shard_asia, shard_eu, shard_na: Used for sharding based on geographic regions.
- Hash-Based Sharding:
  - Distributes users across shard_0, shard_1, and shard_2 using a hash function on user_id.
  - APIs:
    - Insert user: /users/ (POST)
    - Retrieve user: /users/ (GET)
    - Get shard index: /shard/{user_id} (GET)
- Geographic Sharding:
  - Distributes users across shard_asia, shard_eu, and shard_na based on the region.
  - APIs:
    - Insert user: /users (POST)
    - Retrieve user: /get_user (GET)
- Customer ID-Based Sharding:
  - Distributes customers across shard_0, shard_1, and shard_2 based on customer_id ranges.
  - APIs:
    - Insert customer: /customers/ (POST)
    - Retrieve customer: /get_customers (GET)
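The three routing strategies above can be sketched in a few lines of Python. This is a minimal illustration, not the code from hash_key.py, geo.py, or key_range.py: the modulo hash, the region codes other than ASIA, and the customer_id range boundaries are all assumptions chosen for the example, and the function names are hypothetical.

```python
"""Illustrative shard routing for the three strategies (assumptions noted inline)."""

HASH_SHARDS = ["shard_0", "shard_1", "shard_2"]

# Assumption: region codes map one-to-one onto the geographic shards.
REGION_SHARDS = {"ASIA": "shard_asia", "EU": "shard_eu", "NA": "shard_na"}

# Assumption: hypothetical inclusive ID ranges; the project's real
# boundaries are not stated in this README.
CUSTOMER_RANGES = [
    (1, 1000, "shard_0"),
    (1001, 2000, "shard_1"),
    (2001, 3000, "shard_2"),
]


def hash_shard(user_id: int) -> str:
    """Hash-based: user_id modulo the shard count (an assumed hash function)."""
    return HASH_SHARDS[user_id % len(HASH_SHARDS)]


def geo_shard(region: str) -> str:
    """Geographic: look the region up, ignoring case and surrounding spaces."""
    key = region.strip().upper()
    if key not in REGION_SHARDS:
        raise ValueError(f"unknown region: {region!r}")
    return REGION_SHARDS[key]


def range_shard(customer_id: int) -> str:
    """Range-based: return the shard whose ID range contains customer_id."""
    for low, high, shard in CUSTOMER_RANGES:
        if low <= customer_id <= high:
            return shard
    raise ValueError(f"customer_id {customer_id} is outside every range")
```

Under these assumptions, hash_shard(1) selects shard_1, geo_shard("ASIA") selects shard_asia, and range_shard(1001) selects shard_1. A modulo hash spreads sequential IDs evenly but reshuffles most keys when the shard count changes, while range and region routing keep related rows together at the cost of possible hot spots.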
- PostgreSQL installed and running.
- Python 3.9+ installed.
- Required Python libraries:
  - fastapi
  - uvicorn
  - psycopg2
  - pydantic
- Create the following databases in PostgreSQL:
  - shard_0
  - shard_1
  - shard_2
  - shard_asia
  - shard_eu
  - shard_na
- Create a users table in each shard:

      CREATE TABLE users (
          id SERIAL PRIMARY KEY,
          name VARCHAR(100),
          email VARCHAR(100),
          region VARCHAR(50)
      );
- Create a customers table in shard_0, shard_1, and shard_2:

      CREATE TABLE customers (
          customer_id SERIAL PRIMARY KEY,
          name VARCHAR(100),
          email VARCHAR(100)
      );
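Running the same DDL by hand in six databases is easy to get wrong, so the table setup above could be scripted. The sketch below is one possible approach, not part of the repository: schema_plan and apply_plan are hypothetical helpers, and IF NOT EXISTS is added to the statements as an assumption so the script can be re-run safely.

```python
"""Sketch: plan the DDL per shard, then apply it via a DB-API connect function."""

ALL_SHARDS = ["shard_0", "shard_1", "shard_2",
              "shard_asia", "shard_eu", "shard_na"]
CUSTOMER_SHARDS = ["shard_0", "shard_1", "shard_2"]

# Same columns as the README's statements; IF NOT EXISTS is an addition.
USERS_DDL = (
    "CREATE TABLE IF NOT EXISTS users ("
    " id SERIAL PRIMARY KEY, name VARCHAR(100),"
    " email VARCHAR(100), region VARCHAR(50) );"
)
CUSTOMERS_DDL = (
    "CREATE TABLE IF NOT EXISTS customers ("
    " customer_id SERIAL PRIMARY KEY, name VARCHAR(100),"
    " email VARCHAR(100) );"
)


def schema_plan():
    """Return (database, statement) pairs: users everywhere, customers on 0-2."""
    plan = [(db, USERS_DDL) for db in ALL_SHARDS]
    plan += [(db, CUSTOMERS_DDL) for db in CUSTOMER_SHARDS]
    return plan


def apply_plan(connect, plan=None):
    """Execute each statement using connect(db_name) -> DB-API connection."""
    for db_name, ddl in (plan if plan is not None else schema_plan()):
        conn = connect(db_name)
        try:
            cur = conn.cursor()
            cur.execute(ddl)
            conn.commit()
            cur.close()
        finally:
            conn.close()
```

With psycopg2 installed, a call might look like apply_plan(lambda db: psycopg2.connect(dbname=db, user="postgres", password="...", host="localhost")), where the user, password, and host are placeholders for your own settings. Passing the connect function in keeps the planning logic testable without a running database.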
- Clone the repository:

      git clone https://github.com/Rajkumardev/db-sharding.git
      cd db-sharding
- Install dependencies:

      pip install -r requirements.txt
- Run the FastAPI applications:
  - Hash-Based Sharding:

        uvicorn hash_key:app --host 0.0.0.0 --port 5002 --reload

  - Geographic Sharding:

        uvicorn geo:app --host 0.0.0.0 --port 5003 --reload

  - Customer ID-Based Sharding:

        uvicorn key_range:app --host 0.0.0.0 --port 5001 --reload
- Hash-Based Sharding:
  - Insert User: POST /users/

        { "user_id": 1, "name": "John Doe", "email": "[email protected]" }

  - Retrieve User: GET /users/?user_id=1
  - Get Shard Index: GET /shard/{user_id}
- Geographic Sharding:
  - Insert User: POST /users

        { "user_id": 1, "name": "Jane Doe", "email": "[email protected]", "region": "ASIA" }

  - Retrieve User: GET /get_user?[email protected]&region=ASIA
- Customer ID-Based Sharding:
  - Insert Customer: POST /customers/

        { "customer_id": 1001, "name": "Alice", "email": "[email protected]" }

  - Retrieve Customer: GET /get_customers?customer_id=1001
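The requests above can also be built from Python with only the standard library. This is a rough client sketch rather than anything shipped in the repository: post_json and get_query are hypothetical helpers, localhost is an assumed host, and the ports are taken from the uvicorn commands in this README.

```python
"""Sketch: build requests for the three sharding APIs using urllib only."""
import json
from urllib.parse import urlencode
from urllib.request import Request

# Ports come from the uvicorn commands; localhost is an assumption.
HASH_API = "http://localhost:5002"
GEO_API = "http://localhost:5003"
RANGE_API = "http://localhost:5001"


def post_json(base: str, path: str, payload: dict) -> Request:
    """Build a POST request carrying payload as a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        base + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_query(base: str, path: str, params: dict) -> Request:
    """Build a GET request with params URL-encoded into the query string."""
    return Request(f"{base}{path}?{urlencode(params)}", method="GET")


# Example requests mirroring the endpoints listed above.
insert_customer = post_json(
    RANGE_API, "/customers/", {"customer_id": 1001, "name": "Alice"}
)
fetch_customer = get_query(RANGE_API, "/get_customers", {"customer_id": 1001})
fetch_user = get_query(HASH_API, "/users/", {"user_id": 1})
```

To actually send one of these, pass it to urllib.request.urlopen while the matching service is running, for example urlopen(fetch_user). Because urlencode escapes the values, query parameters containing characters such as @ or & are transmitted safely.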
.
├── geo.py # Geographic sharding implementation
├── hash_key.py # Hash-based sharding implementation
├── key_range.py # Customer ID-based sharding implementation
└── README.md # Documentation
This project is licensed under the MIT License.