Double RMSNorm in PatchedQwen3_5TextModel #322

@will-lms This was introduced in #298 (Qwen 3.5 Unified).

`PatchedQwen3_5TextModel.__call__` returns `self.norm(hidden_states)`, but the class it overrides (`Qwen3_5TextModel`) is expected to return un-normed hidden states. mlx-lm's `TextModel.__call__` applies `self.model.norm` itself before the lm_head:

```python
# mlx_lm/models/qwen3_5.py in TextModel.__call__
hidden = self.model(...)          # expects un-normed hidden
normed = self.model.norm(hidden)
out = self.lm_head(normed)
```

Since `self.model` is now `PatchedQwen3_5TextModel`, `hidden` is already normed, and `self.model.norm` is applied a second time.

As a result, RMSNorm runs twice before the lm_head on every Qwen3.5 text and vision inference path.
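To illustrate why the second norm is not a harmless no-op, here is a minimal pure-Python sketch of a plain RMSNorm with a per-channel gain. The `rms_norm` helper, the hidden values, and the gain values are all made up for illustration; they are not taken from mlx-lm or from any Qwen3.5 checkpoint, and the real norm may differ in detail.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Plain RMSNorm: scale x to unit root-mean-square, then apply a per-channel gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(p - q) for p, q in zip(a, b))


hidden = [1.0, -2.0, 3.0, -4.0]   # made-up hidden state (assumption)
learned = [0.5, 1.0, 1.5, 2.0]    # made-up non-trivial gain (assumption)
unit = [1.0, 1.0, 1.0, 1.0]       # gain of all ones, for comparison

# With a learned gain, norming twice applies the gain twice and rescales again.
once = rms_norm(hidden, learned)
twice = rms_norm(once, learned)
diff_learned = max_abs_diff(once, twice)

# With a unit gain, the second pass leaves the vector almost unchanged.
once_unit = rms_norm(hidden, unit)
twice_unit = rms_norm(once_unit, unit)
diff_unit = max_abs_diff(once_unit, twice_unit)

print("learned gain:", diff_learned)
print("unit gain:   ", diff_unit)
```

In this toy setup the double application only looks idempotent when the gain is all ones, so a trained gain vector would be expected to change the values fed to the lm_head.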

## Fix

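In `mlx_engine/model_kit/patches/qwen3_5.py`, have `PatchedQwen3_5TextModel.__call__` return the hidden states without calling `self.norm`, and leave the final norm to mlx-lm's `TextModel.__call__`: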

```diff
- return self.norm(hidden_states)
+ return hidden_states
```
