KuntaiDu/README.md

Hi, I'm Kuntai Du 👋

I'm a graduating PhD student at UChicago, working on large language model inference. Check my home page for more about me!

🔧 Experiences

  • 🚀 Working on the vLLM project as a vLLM team member. My contributions:
    • Performance dashboard: perf.vllm.ai.
    • Performance comparison with other LLM inference engines: the end of the blog.
    • Features: Disaggregated prefilling and CPU offloading.
  • 💾 Contributing to the LMCache project, exploring fun ideas in KV caches.

🎮 Hobbies and Interests

  • 🎮 Gaming: League of Legends, Stardew Valley, Go
  • 💃 Street Dance: Locking main, but I also dance waacking.
  • 🎤 Singing: Loch Lomond and 传奇 Legend

📧 Contact

Popular repositories

  1. dds Public

    Server-driven Video Streaming for Deep Learning Inference

    Python 88 33

  2. AccMPEG Public

    Jupyter Notebook 28 6

  3. vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 13 4

  4. Video-Aalytic-Overview Public

    10 3

  5. Awesome-LLM-Inference Public

    Forked from xlite-dev/Awesome-LLM-Inference

    📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

    2

  6. Asteroids-fun-ver Public

    Racket 1

291 contributions in the last year

Contribution activity

March 2025

Created a pull request in vllm-project/vllm that received 2 comments

Serialize using safetensors for KV caches

Serializing the KV caches using safetensors instead of pickle in Mooncake Pipe for safety. Note that I just tested the behavior of this class local…

+6 −6 lines changed 2 comments
Opened 3 other pull requests in 2 repositories
Reviewed 10 pull requests in 2 repositories
Opened 1 issue in 1 repository
vllm-project/production-stack 1 open