
feat: add Qwen3-VL tensor parallel #178

Merged: kcz358 merged 2 commits into main from feat/qwen3-vl-tp on May 21, 2026


kcz358 (Collaborator) commented on May 21, 2026

Summary

  • add tp_degree mesh setup and Qwen3-VL text-decoder tensor parallelization
  • shard Qwen3-VL attention q/k/v/o and MLP gate/up/down projections over the TP mesh
  • account for TP in FSDP2 throughput/MFU token de-duplication
  • document Qwen3-VL TP usage in README, model docs, and examples
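The mesh layout, projection split, and token de-duplication described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from this PR: the helper names `build_mesh`, `TP_PLAN`, and `global_tokens` are hypothetical, ranks in a TP group are assumed to be contiguous, and the column-wise/row-wise assignment follows the conventional Megatron-style split, which the PR description does not confirm.

```python
# Hypothetical helpers for illustration; none of these names come from the PR.


def build_mesh(world_size: int, tp_degree: int) -> list:
    """Arrange ranks as a (dp, tp) grid; each inner list is one TP group."""
    if tp_degree < 1 or world_size % tp_degree != 0:
        raise ValueError("world_size must be divisible by tp_degree")
    dp_size = world_size // tp_degree
    # Assumption: ranks in the same TP group are contiguous.
    return [
        [dp * tp_degree + tp for tp in range(tp_degree)]
        for dp in range(dp_size)
    ]


# Assumed Megatron-style plan: projections that fan out are split by output
# columns, projections that fan back in are split by input rows, so each
# attention or MLP block needs only one all-reduce on its output.
TP_PLAN = {
    "self_attn.q_proj": "colwise",
    "self_attn.k_proj": "colwise",
    "self_attn.v_proj": "colwise",
    "self_attn.o_proj": "rowwise",
    "mlp.gate_proj": "colwise",
    "mlp.up_proj": "colwise",
    "mlp.down_proj": "rowwise",
}


def global_tokens(tokens_per_rank: list, tp_degree: int) -> int:
    """Count each batch once: all ranks in a TP group see the same tokens."""
    if tp_degree < 1 or len(tokens_per_rank) % tp_degree != 0:
        raise ValueError("rank count must be divisible by tp_degree")
    # Take one representative rank per (contiguous) TP group.
    return sum(tokens_per_rank[::tp_degree])


if __name__ == "__main__":
    print(build_mesh(world_size=8, tp_degree=2))
    # Ranks 0-1 share one batch of 100 tokens, ranks 2-3 share one of 200.
    print(global_tokens([100, 100, 200, 200], tp_degree=2))
```

Summing tokens over every rank would overcount by a factor of `tp_degree`, which in turn would inflate throughput and MFU figures; counting one rank per TP group avoids that.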

Verification

  • User ran a training smoke test with TP enabled
  • pre-commit hooks (black, isort) passed on the commits

kcz358 merged commit 4158d7f into main on May 21, 2026
3 checks passed
kcz358 deleted the feat/qwen3-vl-tp branch on May 21, 2026 at 11:44