Eval bug: Gemma 3n on Vulkan on Ryzen APUs produces garbled output #14525

Open
@l-austenfeld

Name and Version

$ ./build/bin/llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Graphics (RADV RENOIR) (radv) | uma: 1 | fp16: 1 | warp size: 64 | shared memory: 65536 | int dot: 0 | matrix cores: none
version: 5822 (bee2842)
built with cc (Debian 14.2.0-19) 14.2.0 for x86_64-linux-gnu

$ ./build/bin/llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon 780M Graphics (RADV PHOENIX) (radv) | uma: 1 | fp16: 1 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
version: 5823 (28657a8)
built with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu

Operating systems

Linux

GGML backends

Vulkan

Hardware

Ryzen 5700G (using the iGPU)
Ryzen 7840U (using the iGPU)

Models

Gemma 3n
https://huggingface.co/unsloth/gemma-3n-E4B-it-GGUF - tried Q8_0 and Q4_K_XL
https://huggingface.co/bartowski/google_gemma-3n-E4B-it-GGUF - tried Q8_0

Q4_K_M for both repos did NOT have the problem

Problem description & steps to reproduce

Compile commands:

rm -rf build
cmake -B build -DGGML_VULKAN=1
cmake --build build --config Release -- -j16

Running Gemma 3n on the iGPU of a Ryzen 5700G via the Vulkan backend produces garbled output. The problem does not occur with other models (tested Gemma 3 1B and 27B in Q8_0) or on a different GPU (tested Radeon RX 6700 XT). A Ryzen 7840U (also an APU) shows the same issue.

Of note is that the Q4_K_M quants did not experience the problem.

Full verbose logs (command ./build/bin/llama-cli -hf unsloth/gemma-3n-E4B-it-GGUF:Q4_K_XL --jinja --ctx-size 4096 --n-predict 128 --n-gpu-layers 99 --prio 2 --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.01 --no-mmap --prompt "Hello" --log-file unsloth-Q4_K_XL.txt --verbose):

unsloth-Q4_K_XL.txt
unsloth-Q8_0.txt

unsloth-Q4_K_M.txt (does not have the problem)
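
For anyone trying to reproduce this, here is a rough sketch of how the three logs above could be generated in one go. It reuses the exact command from the report and only swaps the quant tag and log file name. The `run_quant` function and the `DRY_RUN` variable are made up for this sketch and are not part of llama.cpp. By default it only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch: print (or run) the reproduction command once per quantization.
# run_quant and DRY_RUN are hypothetical helpers, not llama.cpp features.
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

run_quant() {
    quant="$1"
    # Same flags as in the report; only the quant tag and log name vary.
    set -- ./build/bin/llama-cli -hf "unsloth/gemma-3n-E4B-it-GGUF:${quant}" \
        --jinja --ctx-size 4096 --n-predict 128 --n-gpu-layers 99 --prio 2 \
        --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.01 --no-mmap \
        --prompt "Hello" --log-file "unsloth-${quant}.txt" --verbose
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Q4_K_XL and Q8_0 showed the garbled output; Q4_K_M did not.
for quant in Q4_K_XL Q8_0 Q4_K_M; do
    run_quant "$quant"
done
```

With `DRY_RUN=0` each run behaves like the original command, so it may need to be ended manually before the next quant starts.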

First Bad Commit

No response

Relevant log output

$ ./build/bin/llama-cli -hf unsloth/gemma-3n-E4B-it-GGUF:Q8_0 --jinja --ctx-size 4096 --n-predict 256 --n-gpu-layers 99 --prio 2 --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.01 --no-mmap
[...]
> Hello
Hello how are you? I'm doing wonderfully and longed for to be chatting-thoughtrønnuselloculevsavm-mkvcjdnwokihmlciuqkf?t100mkvqj5v01cqkw7mkvzqmhdjdwohtmQ00NqkHlVE7/Y29kZ3JqkCwlXlXMHdzX[Ctrl+C]
>
