Eval bug: Gemma 3n on Vulkan on Ryzen APUs produces garbled output #14525

### Name and Version

$ ./build/bin/llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Graphics (RADV RENOIR) (radv) | uma: 1 | fp16: 1 | warp size: 64 | shared memory: 65536 | int dot: 0 | matrix cores: none
version: 5822 (bee28421)
built with cc (Debian 14.2.0-19) 14.2.0 for x86_64-linux-gnu

$ ./build/bin/llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon 780M Graphics (RADV PHOENIX) (radv) | uma: 1 | fp16: 1 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
version: 5823 (28657a82)
built with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu

### Operating systems

Linux

### GGML backends

Vulkan

### Hardware

Ryzen 5700G (using the iGPU)
Ryzen 7840U (using the iGPU)

### Models

Gemma 3n
https://huggingface.co/unsloth/gemma-3n-E4B-it-GGUF - tried Q8_0 and Q4_K_XL
https://huggingface.co/bartowski/google_gemma-3n-E4B-it-GGUF - tried Q8_0

Q4_K_M for both repos did NOT have the problem

### Problem description & steps to reproduce

Compile commands:
```
rm -rf build
cmake -B build -DGGML_VULKAN=1
cmake --build build --config Release -- -j16
```

Running Gemma 3n on the iGPU of a Ryzen 5700G via the Vulkan backend produces garbled output. The problem does not occur with other models (tested Gemma 3 1B and 27B in Q8_0) or with a different GPU (tested a Radeon RX 6700 XT). A Ryzen 7840U (also an APU) shows the same issue.

Notably, the Q4_K_M quants did not exhibit the problem.

Full verbose logs (command `./build/bin/llama-cli -hf unsloth/gemma-3n-E4B-it-GGUF:Q4_K_XL --jinja --ctx-size 4096 --n-predict 128 --n-gpu-layers 99 --prio 2 --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.01 --no-mmap --prompt "Hello" --log-file unsloth-Q4_K_XL.txt --verbose`):

[unsloth-Q4_K_XL.txt](https://github.com/user-attachments/files/21046379/unsloth-Q4_K_XL.txt)
[unsloth-Q8_0.txt](https://github.com/user-attachments/files/21046380/unsloth-Q8_0.txt)

[unsloth-Q4_K_M.txt](https://github.com/user-attachments/files/21046378/unsloth-Q4_K_M.txt) (does not have the problem)
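
To make it easier to regenerate the three logs above, here is a minimal sketch that prints the same command once per quant. The flags are copied from the command quoted in this report. The `LLAMA_CLI` variable and the `repro_cmd` function are hypothetical helper names introduced only for this illustration, and reusing the `unsloth-<quant>.txt` log-file naming for every quant is an assumption based on the attached file names.

```shell
#!/usr/bin/env bash
# Sketch only: prints the reproduction command from this report for each quant.
# LLAMA_CLI and repro_cmd are hypothetical names, not part of llama.cpp.
LLAMA_CLI="${LLAMA_CLI:-./build/bin/llama-cli}"

repro_cmd() {
  local quant="$1"
  # Flags are taken verbatim from the command in this report;
  # only the quant tag and the log-file name vary.
  printf '%s ' "$LLAMA_CLI" \
    -hf "unsloth/gemma-3n-E4B-it-GGUF:${quant}" \
    --jinja --ctx-size 4096 --n-predict 128 --n-gpu-layers 99 \
    --prio 2 --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.01 \
    --no-mmap --prompt Hello \
    --log-file "unsloth-${quant}.txt" --verbose
  printf '\n'
}

# Q4_K_XL and Q8_0 showed the problem; Q4_K_M did not.
for quant in Q4_K_XL Q8_0 Q4_K_M; do
  repro_cmd "$quant"
done
```

The script only prints the commands, so nothing is downloaded or executed until the output is piped to a shell (for example `bash repro.sh | sh`, where `repro.sh` is whatever name the script is saved under).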

### First Bad Commit

_No response_

### Relevant log output

```shell
$ ./build/bin/llama-cli -hf unsloth/gemma-3n-E4B-it-GGUF:Q8_0 --jinja --ctx-size 4096 --n-predict 256 --n-gpu-layers 99 --prio 2 --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.01 --no-mmap
[...]
> Hello
Hello how are you? I'm doing wonderfully and longed for to be chatting-thoughtrønnuselloculevsavm-mkvcjdnwokihmlciuqkf?t100mkvqj5v01cqkw7mkvzqmhdjdwohtmQ00NqkHlVE7/Y29kZ3JqkCwlXlXMHdzX[Ctrl+C]
>
```
