A local-first job scraping stack with Docker Compose by Firas Lamouchi.
- agent-ui: Streamlit UI at http://localhost:8501
- scraper: FastAPI service at http://localhost:8000
- automation-engine: n8n at http://localhost:5678
- Configure
Edit files in ./config:
- sites.txt: one site per line
- keywords.txt: one keyword or job title per line
- cv.txt: your CV text
- .env.example: copy to .env and set GROQ_API_KEY (optional)
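The one-entry-per-line files are straightforward to parse; a minimal sketch (the blank-line and `#`-comment handling here is an assumption, not documented project behavior):

```python
from pathlib import Path

def load_list(path):
    """Read a one-entry-per-line config file, skipping blanks and # comments."""
    return [line.strip()
            for line in Path(path).read_text().splitlines()
            if line.strip() and not line.lstrip().startswith("#")]

# Example sites.txt (URLs are placeholders):
Path("sites.txt").write_text(
    "https://example.com/careers\n"
    "\n"
    "# temporarily disabled\n"
    "https://example.org/jobs\n"
)
print(load_list("sites.txt"))  # ['https://example.com/careers', 'https://example.org/jobs']
```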
- Run
docker compose up --build
- Use
- Open UI: http://localhost:8501
- Open n8n: http://localhost:5678
Run without the UI:
python cli.py run --lite
python cli.py run --api-key YOUR_GROQ_KEY
Export saved results from the local SQLite database:
python cli.py export --format json --limit 200
python cli.py export --format csv --out jobs.csv
You can override paths:
python cli.py --config ./config --data ./data run --lite
Make targets:
make build
make up
make down
make logs
make ps
- Provide a Groq API key in the UI to enable AI scoring.
- Toggle Lite Mode to use keyword-only scoring without an API key.
- Scraper can also be triggered from n8n via POST http://scraper:8000/run
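Lite Mode's keyword-only scoring could look roughly like this (a sketch only; the actual scoring logic is not documented in this README):

```python
def lite_score(text, keywords):
    """Fraction of keywords found in the text, in [0, 1].
    A hypothetical stand-in for Lite Mode's keyword-only scoring."""
    haystack = text.lower()
    hits = sum(1 for kw in keywords if kw.lower() in haystack)
    return hits / len(keywords) if keywords else 0.0

print(lite_score("Senior Data Engineer (Remote)", ["data engineer", "golang"]))  # 0.5
```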
All state is stored in ./data:
- SQLite database and logs
- n8n workflows and database
Running docker compose down will not delete ./data.
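Because results persist in the SQLite database under ./data, they can also be read directly; in this sketch the database filename, table name, and columns are assumptions, not the project's actual schema:

```python
import json
import sqlite3

def export_jobs(db_path, limit=200):
    """Dump the most recent rows from a hypothetical `jobs` table as JSON.
    Adjust the path, table, and columns to match the real database."""
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row  # rows become dict-like, keyed by column name
    rows = con.execute(
        "SELECT * FROM jobs ORDER BY rowid DESC LIMIT ?", (limit,)
    ).fetchall()
    con.close()
    return json.dumps([dict(r) for r in rows], indent=2)
```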
A typical workflow is: trigger a run, read the results, then export them (via the CLI commands above or the HTTP endpoint below).
The scraper supports basic delay and retry tuning via environment variables:
- REQUEST_DELAY_SECONDS (default 0.6)
- RETRY_MAX_ATTEMPTS (default 4)
- RETRY_BASE_SECONDS (default 0.6)
- RETRY_MAX_SECONDS (default 8)
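These variables suggest capped exponential backoff; a sketch of how the delay per attempt might be computed (the doubling factor and the jitter are assumptions; the service only documents the bounds):

```python
import random

def backoff_delay(attempt, base=0.6, cap=8.0):
    """Capped exponential backoff with jitter, mirroring the
    RETRY_BASE_SECONDS and RETRY_MAX_SECONDS defaults above."""
    return min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.0)
```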
Send a POST to http://scraper:8000/run with JSON:
{"api_key":"your_key","lite_mode":false}
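The same request can be built from the host with the Python standard library (use http://localhost:8000 outside the Compose network; `trigger_run` is a hypothetical helper, not part of the project):

```python
import json
import urllib.request

def trigger_run(base_url, api_key=None, lite_mode=False):
    """Build the POST /run request with the JSON body shown above.
    Send it with urllib.request.urlopen(req)."""
    payload = {"api_key": api_key, "lite_mode": lite_mode}
    return urllib.request.Request(
        f"{base_url}/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = trigger_run("http://localhost:8000", api_key="your_key", lite_mode=False)
# urllib.request.urlopen(req)  # uncomment to actually trigger a run
```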