feat: add support for TPUs on llm-d reference architecture #420

Draft: syeda-anjum wants to merge 5 commits into main from sanjum-llmdontpus

syeda-anjum (Collaborator) commented May 1, 2026

Overview

This PR adds support for deploying Gemma-4 and Qwen-3 on TPUs with llm-d, including Kustomize manifests and a new guide.

Key Changes

  • Added docs/platforms/gke/base/use-cases/inference-ref-arch/llmd/llmd-vllm-with-hf-model-tpu.md.
  • Added Kustomize manifests for v6e-gemma-4-26b-a4b, v6e-gemma-4-31b, and v6e-qwen3-32b in online-inference-tpu/llmd/vllm.
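
For reviewers who want to try the new manifests, here is a minimal sketch of how the render and apply commands could be composed. It assumes each variant listed above is its own directory containing a kustomization.yaml under online-inference-tpu/llmd/vllm; this PR description does not confirm that layout, and the helper name `kustomize_commands` is hypothetical. The new guide remains the authoritative source for the actual steps.

```python
from pathlib import PurePosixPath

# Base directory and variant names as listed under Key Changes.
BASE = PurePosixPath("online-inference-tpu/llmd/vllm")
VARIANTS = ["v6e-gemma-4-26b-a4b", "v6e-gemma-4-31b", "v6e-qwen3-32b"]


def kustomize_commands(variant: str) -> list[str]:
    """Return the render and apply commands for one variant.

    Assumption: each variant is a directory under BASE that holds a
    kustomization.yaml. The PR description does not state this.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant}")
    overlay = BASE / variant
    return [
        f"kubectl kustomize {overlay}",  # render the manifests for review
        f"kubectl apply -k {overlay}",   # apply to the current cluster context
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing touches a cluster.
    for name in VARIANTS:
        for cmd in kustomize_commands(name):
            print(cmd)
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; rendering with `kubectl kustomize` before applying gives a chance to inspect the TPU-specific settings first.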

Impact of Change

Enables users to deploy Gemma-4 and Qwen-3 on TPUs using llm-d, and documents the deployment in the new guide.

References

@syeda-anjum syeda-anjum changed the title from "feat: add Gemma-4 templates and README for llm-d on TPUs" to "feat: add support for TPUs on llm-d reference architecture" on May 20, 2026
@syeda-anjum syeda-anjum requested a review from gushob21 May 20, 2026 19:17