I build production-grade AI systems, not demos.
Senior AI Engineer with 5+ years of experience designing and operating efficient, scalable, predictable, and cost-effective systems, now specializing in:
- LLM-powered applications
- AI Agents & tool-using systems
- Retrieval-Augmented Generation (RAG)
- Distributed AI platforms & microservices
- Build end-to-end AI systems (retrieval → reasoning → tool execution → response)
- Design and implement AI agents (planning, tool use, memory, multi-step reasoning)
- Develop MCP (Model Context Protocol) servers & clients for tool orchestration
- Engineer high-quality RAG pipelines (retrieval, reranking, compression)
- Run LLMs in production with focus on latency, cost, and reliability
- Architect microservices-based AI platforms for multi-tenant environments
- Optimize systems for predictability, observability, and cost efficiency
- Prompt engineering, instruction tuning, evaluation
- Agent design (ReAct, tool calling, multi-step reasoning)
- Function calling & external tool integration
- Memory systems (short-term, long-term, vector-based)
- Advanced chunking & context compression
- Embeddings & semantic retrieval
- Approximate nearest neighbor (ANN) search (HNSW, IVF)
- Hybrid search (BM25 + vector)
- Cross-encoder reranking
- Quantization & quantized fine-tuning (QLoRA)
- Efficient inference & batching strategies
- Local LLM deployment (7B scale)
- Latency optimization & caching layers
- Retrieval debugging & failure analysis
- Prompt tracing & observability
- Evaluation pipelines (offline + online)
- Languages: Python (primary), JavaScript, TypeScript
- Backend: FastAPI, Node.js, Express
- Architecture: Microservices, event-driven systems
- Databases: PostgreSQL, MongoDB, Redis
- Vector DB: Qdrant
- Async & Jobs: Celery, Redis queues
- AI Stack: LLMs (LLaMA), embeddings, rerankers, agent frameworks
- Protocols: MCP (Model Context Protocol), REST, Webhooks
- Infra: Docker, Kubernetes (K8s), distributed systems
- Building scalable, cost-efficient AI systems in production
- Designing agentic workflows for enterprise automation
- Developing MCP-based tool ecosystems
- Improving RAG quality, evaluation, and observability at scale
- AI agent platforms
- MCP-based ecosystems
- Distributed AI systems
- Production-grade RAG pipelines
- Email: [email protected]
⚡ I focus on building AI systems that are scalable, predictable, and cost-efficient in real-world production environments.



