A distributed, fault-tolerant job execution system built in Go. Relay handles job scheduling, execution, and orchestration with support for retries, job chaining, and distributed locking.
Relay is designed to reliably execute background jobs across distributed workers. It combines PostgreSQL for durable storage, Kafka for distributed messaging, and Redis for distributed locking to ensure jobs are executed exactly once, even in the presence of failures.
- Reliable Job Execution — Jobs are persisted before processing, ensuring no work is lost
- Distributed Processing — Scale horizontally with multiple workers consuming from Kafka
- Exactly-Once Semantics — Redis-based distributed locking prevents duplicate execution
- Shell Task Execution — Run external scripts and binaries with stdout/stderr capture
- Automatic Retries — Configurable retry logic with exponential backoff
- Job Chaining — Define workflows where completing one job triggers the next
- Dead Letter Queue — Failed jobs are quarantined for manual inspection
- CLI Interface — Submit, monitor, and manage jobs from the command line
```
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│    Client    │────▶│  Relay API   │────▶│  PostgreSQL  │
│    (CLI)     │     │              │     │  (Storage)   │
└──────────────┘     └──────┬───────┘     └──────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │    Kafka     │
                     │   (Queue)    │
                     └──────┬───────┘
                            │
               ┌────────────┼────────────┐
               ▼            ▼            ▼
          ┌──────────┐ ┌──────────┐ ┌──────────┐
          │ Worker 1 │ │ Worker 2 │ │ Worker N │
          └────┬─────┘ └────┬─────┘ └────┬─────┘
               │            │            │
               └────────────┼────────────┘
                            ▼
                     ┌──────────────┐
                     │    Redis     │
                     │   (Locks)    │
                     └──────────────┘
```
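The Redis locking step in the diagram works on a compare-and-set principle: a worker atomically sets a per-job key with a TTL and proceeds only if the key was previously absent (in Redis, `SET key value NX PX ttl`). The in-memory stand-in below is purely to illustrate those semantics; the real implementation talks to Redis so the lock is shared across machines, and the TTL handling is omitted here:

```go
package main

import (
	"fmt"
	"sync"
)

// memLock mimics Redis SET NX semantics in memory: Acquire succeeds only for
// the first caller per key. A production worker would use Redis instead, with
// a TTL so a crashed worker's lock eventually expires.
type memLock struct {
	mu   sync.Mutex
	held map[string]string // job key -> owner ID
}

func newMemLock() *memLock {
	return &memLock{held: make(map[string]string)}
}

// Acquire returns true if owner obtained the lock for key.
func (l *memLock) Acquire(key, owner string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, taken := l.held[key]; taken {
		return false
	}
	l.held[key] = owner
	return true
}

// Release frees the lock, but only for the owner that currently holds it,
// so a slow worker cannot release a lock it has already lost.
func (l *memLock) Release(key, owner string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.held[key] == owner {
		delete(l.held, key)
	}
}

func main() {
	locks := newMemLock()
	fmt.Println(locks.Acquire("job:42", "worker-1")) // first claim wins
	fmt.Println(locks.Acquire("job:42", "worker-2")) // duplicate claim rejected
	locks.Release("job:42", "worker-1")
	fmt.Println(locks.Acquire("job:42", "worker-2")) // free again after release
}
```

Only the worker that wins the lock executes the job, which is what turns Kafka's at-least-once delivery into effectively-once execution.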
```
PENDING ──▶ RUNNING ──▶ COMPLETED
               │
               ├──▶ FAILED ──▶ (retry) ──▶ PENDING
               │
               └──▶ DEAD (after max retries)
```
- PENDING — Job is created and queued for execution
- RUNNING — Worker has acquired the lock and is executing the job
- COMPLETED — Job finished successfully
- FAILED — Job execution failed, may be retried
- DEAD — Job exhausted all retries, moved to dead letter queue
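These transitions form a small state machine. A sketch using the state names above (the Go types and the transition-table approach are illustrative, not Relay's internal representation):

```go
package main

import "fmt"

type JobState string

const (
	Pending   JobState = "PENDING"
	Running   JobState = "RUNNING"
	Completed JobState = "COMPLETED"
	Failed    JobState = "FAILED"
	Dead      JobState = "DEAD"
)

// validNext encodes the diagram: FAILED may return to PENDING (retry) or
// move to DEAD once retries are exhausted. COMPLETED and DEAD are terminal,
// so they have no entry.
var validNext = map[JobState][]JobState{
	Pending: {Running},
	Running: {Completed, Failed},
	Failed:  {Pending, Dead},
}

func canTransition(from, to JobState) bool {
	for _, s := range validNext[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(Failed, Pending))    // retry is allowed
	fmt.Println(canTransition(Completed, Running)) // terminal states never move
}
```

Guarding state updates with a table like this (enforced in the database transaction that persists the change) prevents illegal moves such as a retried job skipping straight to COMPLETED.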
- Go 1.21+
- PostgreSQL 16+
- Apache Kafka 3.0+
- Redis 7.0+
```sh
git clone https://github.com/tomiwa-a/Relay.git
cd Relay
go build -o relay ./cmd/api

# Run migrations
make db/migrations/up
```

Relay is configured via environment variables or command-line flags:
| Variable | Flag | Default | Description |
|---|---|---|---|
| `RELAY_DB_DSN` | `-db-dsn` | — | PostgreSQL connection string |
| `RELAY_KAFKA_BROKERS` | `-kafka-brokers` | `localhost:9092` | Kafka broker addresses |
| `RELAY_REDIS_ADDR` | `-redis-addr` | `localhost:6379` | Redis server address |
| `RELAY_PORT` | `-port` | `4000` | API server port |
| `RELAY_ENV` | `-env` | `development` | Environment mode |
```sh
make run/api
```

```sh
# Submit a job from a JSON file
relay submit job.json

# List jobs with optional status filter
relay list --status=pending
relay list --status=failed

# View logs for a specific job
relay logs <job_id>

# Retry a failed job
relay retry <job_id>
```

Example job definition (`job.json`):

```json
{
  "type": "SHELL",
  "payload": {
    "command": "/usr/local/bin/process-data.sh",
    "args": ["--input", "/data/file.csv"],
    "timeout": "5m"
  },
  "on_success": {
    "type": "SHELL",
    "payload": {
      "command": "/usr/local/bin/notify.sh"
    }
  },
  "max_retries": 3,
  "backoff": "exponential"
}
```

```
relay/
├── cmd/
│   └── api/            # Application entrypoint
├── internal/
│   ├── api/            # HTTP handlers and routing
│   ├── worker/         # Kafka consumer and job execution
│   ├── executor/       # Shell and task executors
│   ├── repository/     # Database access (sqlc generated)
│   └── queries/        # SQL query definitions
├── migrations/         # Database migrations
└── Makefile
```
- Phase 1: Job ingestion, persistence, and basic execution
- Phase 2: Kafka distribution and Redis locking
- Phase 3: Shell executor with output capture
- Phase 4: Retry logic, job chaining, and dead letter queue
- Phase 5: CLI interface
This project is licensed under the MIT License — see the LICENSE file for details.