Double RMSNorm in PatchedQwen3_5TextModel #322

@AirRunner

@will-lms This was introduced in #298 (Qwen 3.5 Unified).

PatchedQwen3_5TextModel.__call__ returns self.norm(hidden_states), but the class it overrides (Qwen3_5TextModel) is expected to return un-normed hidden states. mlx-lm's TextModel.__call__ applies self.model.norm itself before the lm_head:

# mlx_lm/models/qwen3_5.py in TextModel.__call__
hidden = self.model(...)          # expects un-normed hidden
normed = self.model.norm(hidden)
out = self.lm_head(normed)

Since self.model is now PatchedQwen3_5TextModel, hidden is already normed, and self.model.norm is applied a second time.

As a result, RMSNorm is applied twice before the lm_head on every Qwen3.5 text and vision inference path.
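To illustrate why the second application is not harmless, here is a minimal pure-Python sketch of RMSNorm applied once and then twice to the same vector. The `rms_norm` helper, the `eps` value, and the numbers are made up for illustration and are not taken from mlx-lm. With a non-unit learned weight, the second pass generally changes the output rather than leaving it untouched.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm over a 1-D vector: x / sqrt(mean(x^2) + eps) * weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]


# Made-up hidden state and norm weight, for illustration only.
hidden = [0.5, -2.0, 3.0, 1.5]
weight = [1.0, 0.25, 2.0, 0.5]

once = rms_norm(hidden, weight)   # what the lm_head should see
twice = rms_norm(once, weight)    # what it sees when the norm runs twice

max_diff = max(abs(a - b) for a, b in zip(once, twice))
print(f"max |once - twice| = {max_diff:.4f}")
```

The difference comes from the weight: after the first pass the vector no longer has unit RMS, so the second pass rescales it and multiplies by the weight again.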

Fix

In mlx_engine/model_kit/patches/qwen3_5.py, have PatchedQwen3_5TextModel.__call__ return the un-normed hidden states and leave the final norm to TextModel.__call__:

- return self.norm(hidden_states)
+ return hidden_states
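A regression check for this could count how many times the final norm runs per forward pass. The sketch below uses toy stand-ins only: `CountingNorm`, `ToyInnerModel`, and `ToyTextModel` are hypothetical names that mirror the call structure described above, not the real mlx-lm or mlx-engine classes.

```python
class CountingNorm:
    """Stand-in for RMSNorm that records how often it is called."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return x


class ToyInnerModel:
    """Hypothetical stand-in for PatchedQwen3_5TextModel."""

    def __init__(self, norm_in_call):
        self.norm = CountingNorm()
        self.norm_in_call = norm_in_call

    def __call__(self, x):
        hidden_states = x  # decoder layers omitted
        if self.norm_in_call:
            return self.norm(hidden_states)  # behavior before the fix
        return hidden_states  # behavior after the fix


class ToyTextModel:
    """Hypothetical stand-in for the outer TextModel, which norms before lm_head."""

    def __init__(self, inner):
        self.model = inner

    def __call__(self, x):
        hidden = self.model(x)
        normed = self.model.norm(hidden)
        return normed  # lm_head omitted


def norm_calls(norm_in_call):
    """Run one forward pass and report how many times the norm was applied."""
    inner = ToyInnerModel(norm_in_call)
    ToyTextModel(inner)([1.0, 2.0])
    return inner.norm.calls


buggy_calls = norm_calls(True)
fixed_calls = norm_calls(False)
print("norm calls before fix:", buggy_calls)
print("norm calls after fix:", fixed_calls)
```

The same idea could be applied to the real model by temporarily wrapping its norm layer with a counter, assuming the layer can be swapped out for the duration of a test.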
