
feat: renderer emits numpy, not torch #18

Merged
hallerite merged 1 commit into main from feat/renderer-numpy-mm-items on May 12, 2026

Conversation

@hallerite (Member)

Summary

HF processors return torch tensors by default; the multimodal renderers were passing those tensors through to mm_items, which leaked torch into the renderer's data model and forced downstream transport layers (verifiers, which doesn't declare torch as a dependency) to handle torch tensors.

Switch all three multimodal processors (Qwen3-VL, Qwen3.5/3.6, Kimi-K2.5) to return_tensors="np". mm_items[i]["pixel_values"] and friends now ship numpy arrays. The renderer is torch-free; downstream consumers (the vLLM-glue helper in client.py, the trainer in prime-rl) convert via torch.from_numpy / torch.as_tensor at their boundary, where torch is already a hard dependency.
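
A minimal sketch of the boundary contract this sets up (function names and the mm_items shape here are illustrative, not the actual renderer API):

```python
import numpy as np
import torch

def process_image(processor, image):
    # Renderer side: HF processors accept return_tensors="np" and
    # hand back numpy arrays, so nothing torch-typed enters mm_items.
    features = processor(images=image, return_tensors="np")
    return {"pixel_values": features["pixel_values"]}  # np.ndarray

def tensorize(mm_item):
    # Consumer side (client/trainer): convert at the boundary,
    # where torch is already a hard dependency.
    return {k: torch.from_numpy(v) if isinstance(v, np.ndarray)
            else torch.as_tensor(v)
            for k, v in mm_item.items()}
```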

Why

Architectural cleanup: the renderer shouldn't know about tensorization. numpy is the lowest-common-denominator format that's already a dep of every realistic consumer (HF, torch, vLLM, jax) and is natively serializable by verifiers' msgpack encoder. This lets verifiers drop its torch handling branch (companion PR opens shortly).
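
For context, the generic pattern for round-tripping numpy arrays through msgpack looks like the sketch below; this illustrates why numpy is the convenient wire type, and is not verifiers' actual encoder:

```python
import msgpack
import numpy as np

def encode(obj):
    # Pack an ndarray as dtype + shape + raw bytes.
    if isinstance(obj, np.ndarray):
        return {"__ndarray__": True, "dtype": str(obj.dtype),
                "shape": list(obj.shape), "data": obj.tobytes()}
    raise TypeError(f"unsupported type: {type(obj)!r}")

def decode(obj):
    # Restore tagged dicts back into ndarrays on unpack.
    if obj.get("__ndarray__"):
        return np.frombuffer(obj["data"], dtype=obj["dtype"]).reshape(obj["shape"])
    return obj

packed = msgpack.packb(np.arange(6, dtype=np.float32).reshape(2, 3), default=encode)
restored = msgpack.unpackb(packed, object_hook=decode)
```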

What changes

  • Qwen3VLRenderer._process_image — return_tensors="pt" → "np"
  • Qwen35Renderer._process_image — same
  • KimiK25Renderer._process_image — same
  • client._build_qwen_vl_features — wrap numpy items in torch.as_tensor before torch.cat / BatchFeature (sketched after this list). Torch is already imported here (the vLLM payload encoder is a torch-required code path).
  • Bump version 0.1.7 → 0.1.8.
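
A hedged sketch of that client-side conversion (the image_grid_thw key and exact batching are assumptions based on the Qwen-VL processor family, not this repo's code):

```python
import torch
from transformers import BatchFeature

def build_qwen_vl_features(mm_items):
    # mm_items now carry numpy arrays; tensorize before torch.cat.
    pixel_values = torch.cat(
        [torch.as_tensor(item["pixel_values"]) for item in mm_items], dim=0)
    grid_thw = torch.cat(
        [torch.as_tensor(item["image_grid_thw"]) for item in mm_items], dim=0)
    return BatchFeature(data={"pixel_values": pixel_values,
                              "image_grid_thw": grid_thw})
```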

Test plan

  • Full renderer suite green (980 passed, 49 skipped, 1 xfailed)
  • Multimodal byte-parity vs HF processor unchanged (compares input_ids, not pixel arrays)
  • Companion verifiers PR drops torch branch from msgpack_encoder and bumps to renderers>=0.1.8
  • prime-rl bumps renderers>=0.1.8; trainer-side decode_tensor_payload already handles the encoded wire format identically (a numpy or torch source produces the same __torch_tensor__-tagged payload; illustrated below)
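
To illustrate that last bullet, a hypothetical normalize-then-tag encoder: the payload layout, field names, and the encode_tensor_payload helper are assumptions for illustration; only the __torch_tensor__ tag name comes from the PR text.

```python
import numpy as np
import torch

def encode_tensor_payload(x):
    # Hypothetical encoder: normalize torch -> numpy first, so both
    # source types produce byte-identical tagged payloads.
    if isinstance(x, torch.Tensor):
        x = x.detach().cpu().numpy()
    return {
        "__torch_tensor__": True,
        "dtype": str(x.dtype),   # e.g. "float32"
        "shape": list(x.shape),
        "data": x.tobytes(),
    }

a = np.ones((2, 3), dtype=np.float32)
# numpy source and torch source encode identically.
assert encode_tensor_payload(a) == encode_tensor_payload(torch.from_numpy(a))
```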

@hallerite force-pushed the feat/renderer-numpy-mm-items branch from ebc1f1b to a2aa322 on May 12, 2026 17:24
HF processors return torch by default; multimodal renderers were
passing those tensors through to ``mm_items``, which leaked torch
into the renderer's data model and forced downstream transport
layers (verifiers) to handle torch tensors despite not declaring
torch as a dependency.

Switch all three multimodal processors (Qwen3-VL, Qwen3.5/3.6,
Kimi-K2.5) to ``return_tensors="np"``. ``mm_items[i]["pixel_values"]``
and friends now ship numpy arrays. Renderer is torch-free; downstream
consumers (vLLM-glue helper here in client.py, trainer in prime-rl)
convert via ``torch.from_numpy`` / ``torch.as_tensor`` at their
boundary where torch is already a hard dependency.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@hallerite force-pushed the feat/renderer-numpy-mm-items branch from a2aa322 to 4c78db5 on May 12, 2026 17:29
@hallerite merged commit 8d2a7d3 into main on May 12, 2026
6 checks passed
@hallerite deleted the feat/renderer-numpy-mm-items branch on May 12, 2026 17:40