@guan404ming
Member

guan404ming commented Nov 28, 2025

Related Issue

closes #17715

Why

  • Phi-4 uses partial_rotary_factor = 0.75 (rotary_dim = 96) together with longrope scaling.
  • Longrope requires both long_factors and short_factors to be packed into a single buffer.
  • Expected buffer shape: (rotary_dim,) = (96,) in total
    • First half [0:48] = long_factors
    • Second half [48:96] = short_factors
  • llama4_rope_with_position_map still used the old shape (rotary_dim // 2,) = (48,), which is too small to hold both halves.
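
The packed layout described above can be sketched in plain Python. This is only an illustration of the indexing, not the TVM implementation: the helper names `pack_longrope_factors` and `split_longrope_factors` are hypothetical, the factor values are placeholders, and `head_dim = 128` is an assumption chosen so that `partial_rotary_factor = 0.75` gives `rotary_dim = 96`.

```python
# Illustrative sketch only: these helpers are hypothetical and are not TVM APIs.


def pack_longrope_factors(long_factors, short_factors):
    """Pack long and short factors into one flat buffer laid out as [long | short]."""
    if len(long_factors) != len(short_factors):
        raise ValueError("long_factors and short_factors must have equal length")
    return list(long_factors) + list(short_factors)


def split_longrope_factors(ext_factors, rotary_dim):
    """Recover (long_factors, short_factors) from a packed buffer of length rotary_dim."""
    if len(ext_factors) != rotary_dim:
        raise ValueError(
            f"expected {rotary_dim} packed factors, got {len(ext_factors)}"
        )
    half = rotary_dim // 2
    return ext_factors[:half], ext_factors[half:]


head_dim = 128  # assumption: not stated in this PR
partial_rotary_factor = 0.75
rotary_dim = int(head_dim * partial_rotary_factor)

# Placeholder values; real factors would come from the model config.
long_factors = [1.0 + 0.01 * i for i in range(rotary_dim // 2)]
short_factors = [1.0] * (rotary_dim // 2)

packed = pack_longrope_factors(long_factors, short_factors)
long_out, short_out = split_longrope_factors(packed, rotary_dim)
```

A buffer with the old size of `rotary_dim // 2` entries fails the length check in `split_longrope_factors`, which mirrors the size mismatch this PR addresses.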

@guan404ming changed the title from "Fix llama4_rope_with_position_map to support partial rotary factor" to "[Relax] Fix llama4_rope_with_position_map to support partial rotary factor" on Nov 28, 2025
@guan404ming marked this pull request as ready for review on November 28, 2025, 07:52
@guan404ming
Member Author

cc @tlopex @mshr-h

@tlopex
Member

tlopex commented Nov 28, 2025

cc @MasterJH5574

Development

Successfully merging this pull request may close these issues.

[Bug] Inference - Phi-4 mini instruct

2 participants