A privacy-respecting metasearch engine infrastructure designed specifically for AI agents and LLMs (Large Language Models).
This setup provides sanitized, JSON-structured output from multiple search engines (Google, Bing, DuckDuckGo, etc.) without tracking, upstream rate limits (mitigated via engine rotation), or ad clutter, making it an ideal knowledge-retrieval tool for RAG (Retrieval-Augmented Generation) pipelines.
The stack is composed of three lightweight services orchestrated via Docker Compose:
- SearXNG: The metasearch core engine.
- Valkey (Redis Fork): High-performance caching and rate-limiting management.
- Caddy: Reverse proxy with automatic HTTPS termination.
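The three services above might be wired together roughly as follows. This is a minimal illustrative sketch, not the repository's actual `docker-compose.yml`; image tags, mount paths, and ports are assumptions:

```yaml
services:
  searxng:
    image: searxng/searxng:latest   # metasearch core
    env_file: .env                  # secrets injected, not hardcoded
    volumes:
      - searxng-data:/etc/searxng   # persisted settings
  valkey:
    image: valkey/valkey:latest     # caching / rate-limit state
    volumes:
      - valkey-data:/data           # persisted cache
  caddy:
    image: caddy:latest             # reverse proxy, automatic HTTPS
    ports:
      - "80:80"
      - "443:443"

volumes:
  searxng-data:
  valkey-data:
```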
Prerequisites:

- Docker & Docker Compose
- Git

Quickstart:

1. Clone the repository:

   ```shell
   git clone https://github.com/dacatano/SearXNG.git
   cd SearXNG
   ```

2. Configure environment: create a `.env` file from the example template.

   ```shell
   cp .env.example .env
   # Edit .env to set your hostname and generate a secure secret
   ```

3. Deploy:

   ```shell
   docker compose up -d
   ```

4. Verify: open http://localhost:8080 in your browser.
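A filled-in `.env` might look like the following. The variable names are illustrative assumptions, since the actual keys in `.env.example` are not shown in this README; check that file for the real names:

```
# Hostname Caddy should serve (assumption: adjust to your domain)
SEARXNG_HOSTNAME=search.example.com
# Secret key for SearXNG; generate one, e.g. with: openssl rand -hex 32
SEARXNG_SECRET=replace-me-with-a-generated-secret
```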
To retrieve results in JSON format suitable for LangChain, AutoGPT, or custom agents, append `&format=json` to the query URL.

Example endpoint:

```
GET http://localhost:8080/search?q=latest+advancements+in+LLMs&format=json
```
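A minimal Python sketch of how an agent might build such a query URL and read the response. The base URL matches the local deployment above; the `build_search_url` helper and the sample response schema are illustrative assumptions (the real SearXNG JSON response carries more fields than shown here):

```python
from urllib.parse import urlencode

# Assumed local instance from the Deploy step above.
BASE_URL = "http://localhost:8080/search"

def build_search_url(query: str, fmt: str = "json") -> str:
    """Build a SearXNG search URL with the JSON output format appended."""
    return f"{BASE_URL}?{urlencode({'q': query, 'format': fmt})}"

url = build_search_url("latest advancements in LLMs")
print(url)

# Illustrative sketch of a response: the JSON body contains a top-level
# "results" list whose entries carry title/url/content fields.
sample_response = {
    "query": "latest advancements in LLMs",
    "results": [
        {"title": "Example result", "url": "https://example.com", "content": "snippet..."},
    ],
}
titles = [r["title"] for r in sample_response["results"]]
print(titles)
```

In a real pipeline the `sample_response` dict would come from an HTTP GET against `url` (e.g. via `urllib.request` or `requests`), with the JSON body decoded before extracting `results`.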
🛠 Configuration
This project follows the 12-Factor App methodology.
Secrets: injected via `.env` and referenced from `docker-compose.yml`; never hardcoded.
Persistence: Managed via Docker Named Volumes (valkey-data, searxng-data).
Logs: Docker's `json-file` log driver enabled for seamless integration with observability stacks (ELK/Loki).
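The log driver can be declared per service in Compose. A hedged sketch using Docker's standard `json-file` driver options; the rotation values here are illustrative, not the project's actual settings:

```yaml
services:
  searxng:
    logging:
      driver: json-file
      options:
        max-size: "10m"   # rotate after 10 MB
        max-file: "3"     # keep at most 3 rotated files
```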
For architectural decisions and trade-offs, see docs/ARCHITECTURE.md.