kashaudhan/README.md

👋 Hi, I’m Dharamendra

🧠 Senior Engineer | Distributed Architecture

I build production-grade AI systems, not demos.

Senior AI Engineer with 5+ years of experience designing and operating efficient, scalable, predictable, and cost-effective systems, now specializing in:

  • LLM-powered applications
  • AI Agents & tool-using systems
  • Retrieval-Augmented Generation (RAG)
  • Distributed AI platforms & microservices

🚀 What I Do

  • Build end-to-end AI systems (retrieval → reasoning → tool execution → response)
  • Design and implement AI agents (planning, tool use, memory, multi-step reasoning)
  • Develop MCP (Model Context Protocol) servers & clients for tool orchestration
  • Engineer high-quality RAG pipelines (retrieval, reranking, compression)
  • Run LLMs in production with a focus on latency, cost, and reliability
  • Architect microservices-based AI platforms for multi-tenant environments
  • Optimize systems for predictability, observability, and cost efficiency

🧩 Core AI Skills

🤖 LLM & Agent Systems

  • Prompt engineering, instruction tuning, evaluation
  • Agent design (ReAct, tool calling, multi-step reasoning)
  • Function calling & external tool integration
  • Memory systems (short-term, long-term, vector-based)

🔎 Retrieval & RAG

  • Advanced chunking & context compression
  • Embeddings & semantic retrieval
  • ANN (HNSW, IVF)
  • Hybrid search (BM25 + vector)
  • Cross-encoder reranking

⚙️ Model Optimization

  • Quantization & quantized fine-tuning (QLoRA)
  • Efficient inference & batching strategies
  • Local LLM deployment (7B scale)

📊 AI System Design

  • Latency optimization & caching layers
  • Retrieval debugging & failure analysis
  • Prompt tracing & observability
  • Evaluation pipelines (offline + online)

🛠 Tech Stack

  • Languages: Python (primary), JavaScript, TypeScript
  • Backend: FastAPI, Node.js, Express
  • Architecture: Microservices, event-driven systems
  • Databases: PostgreSQL, MongoDB, Redis
  • Vector DB: Qdrant
  • Async & Jobs: Celery, Redis queues
  • AI Stack: LLMs (LLaMA), embeddings, rerankers, agent frameworks
  • Protocols: MCP (Model Context Protocol), REST, Webhooks
  • Infra: Docker, Kubernetes (K8s), distributed systems

🔬 Current Focus

  • Building scalable, cost-efficient AI systems in production
  • Designing agentic workflows for enterprise automation
  • Developing MCP-based tool ecosystems
  • Improving RAG quality, evaluation, and observability at scale

🀝 Looking to Collaborate On

  • AI agent platforms
  • MCP-based ecosystems
  • Distributed AI systems
  • Production-grade RAG pipelines

📫 Reach Me


⚡ I focus on building AI systems that are scalable, predictable, and cost-efficient in real-world production environments.

Popular repositories

  1. questionPairing (Public)

    MERN stack + machine learning

    JavaScript, 34 stars, 4 forks

  2. Pairing_Similar_Questions (Public)

    Quora Question Pairing

    Jupyter Notebook, 2 stars

  3. react-native-animated-player (Public)

    React Native animated video player with Expo

    TypeScript, 2 stars

  4. CNN_Binary_Image_Classifiaction (Public)

    Jupyter Notebook, 1 star

  5. Auction (Public)

    Decentralized Auction on Ethereum

    JavaScript, 1 star, 4 forks

  6. vue-tel-input (Public)

    Forked from iamstevendao/vue-tel-input

    International Telephone Input with Vue https://iamstevendao.github.io/vue-tel-input/

    CSS, 1 star