Dharmendra Kashaudhan (kashaudhan)

👋 Hi, I’m Dharmendra

🧠 Senior Engineer | Distributed Architecture

I build production-grade AI systems, not demos.

Senior AI Engineer with 5+ years of experience designing and operating efficient, scalable, predictable, and cost-effective systems, now specializing in:

LLM-powered applications
AI Agents & tool-using systems
Retrieval-Augmented Generation (RAG)
Distributed AI platforms & microservices

🚀 What I Do

Build end-to-end AI systems (retrieval → reasoning → tool execution → response)
Design and implement AI agents (planning, tool use, memory, multi-step reasoning)
Develop MCP (Model Context Protocol) servers & clients for tool orchestration
Engineer high-quality RAG pipelines (retrieval, reranking, compression)
Run LLMs in production with a focus on latency, cost, and reliability
Architect microservices-based AI platforms for multi-tenant environments
Optimize systems for predictability, observability, and cost efficiency

🧩 Core AI Skills

🤖 LLM & Agent Systems

Prompt engineering, instruction tuning, evaluation
Agent design (ReAct, tool calling, multi-step reasoning)
Function calling & external tool integration
Memory systems (short-term, long-term, vector-based)

🔎 Retrieval & RAG

Advanced chunking & context compression
Embeddings & semantic retrieval
Approximate nearest neighbor (ANN) indexes (HNSW, IVF)
Hybrid search (BM25 + vector)
Cross-encoder reranking
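
A minimal sketch of the hybrid search idea (BM25 + vector), assuming reciprocal rank fusion (RRF) as the merge step. The function name, the sample document IDs, and the constant k=60 are illustrative assumptions, not a description of any specific production system:

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID to keep output stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword (BM25) and a vector retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

The fused list would typically be passed to a cross-encoder reranker, which rescores only the top candidates.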

⚙️ Model Optimization

Quantization & quantized fine-tuning (QLoRA)
Efficient inference & batching strategies
Local LLM deployment (7B-parameter scale)

📊 AI System Design

Latency optimization & caching layers
Retrieval debugging & failure analysis
Prompt tracing & observability
Evaluation pipelines (offline + online)

🛠 Tech Stack

Languages: Python (primary), JavaScript, TypeScript
Backend: FastAPI, Node.js, Express
Architecture: Microservices, event-driven systems
Databases: PostgreSQL, MongoDB, Redis
Vector DB: Qdrant
Async & Jobs: Celery, Redis queues
AI Stack: LLMs (LLaMA), embeddings, rerankers, agent frameworks
Protocols: MCP (Model Context Protocol), REST, Webhooks
Infra: Docker, Kubernetes (K8s), distributed systems

🔬 Current Focus

Building scalable, cost-efficient AI systems in production
Designing agentic workflows for enterprise automation
Developing MCP-based tool ecosystems
Improving RAG quality, evaluation, and observability at scale

🤝 Looking to Collaborate On

AI agent platforms
MCP-based ecosystems
Distributed AI systems
Production-grade RAG pipelines

📫 Reach Me

Email: [email protected]

⚡ I focus on building AI systems that are scalable, predictable, and cost-efficient in real-world production environments.
