Eval bug: llama-serve ignores SIGINT and SIGTERM when running within a container. #11742

Closed · rhatdan opened this issue Feb 7, 2025 · 13 comments

rhatdan commented Feb 7, 2025

Name and Version

llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = llvmpipe (LLVM 19.1.3, 256 bits) (llvmpipe) | uma: 0 | fp16: 1 | warp size: 8 | matrix cores: none
ggml_vulkan: Warning: Device type is CPU. This is probably not the device you want.
version: 4607 (aa6fb13)
built with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2) for x86_64-redhat-linux

Operating systems

Linux

GGML backends

Vulkan

Hardware

When we run llama-serve in a Podman container, it ignores kill -TERM and kill -INT, whether the signal is sent from inside the container or from the outside.
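
For reference, a sketch of how such signals can be sent (the container name and PID are placeholders):

podman kill --signal SIGTERM <container>    # from the host
kill -TERM <llama-server-pid>               # from a shell inside the container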

Models

Granite, but I believe this has nothing to do with the model.

Problem description & steps to reproduce

llama-server --port 8080 -m /mnt/models/model.file -c 2048 --temp 0.8 -ngl -1 --host 0.0.0.0

First Bad Commit

/bin/ramalama --image quay.io/ramalama/vulkan bench granite
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (RPL-S) (Intel open-source Mesa driver) | uma: 1 | fp16: 1 | warp size: 32 | matrix cores: none

| model | size | params | backend | ngl | test | t/s |
^C^C

Relevant log output

None.

ngxson (Collaborator) commented Feb 7, 2025

Should be related to #11731

kth8 commented Feb 8, 2025

Running the container with --init will properly forward the signals.
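
A minimal sketch of what that looks like (the image name is a placeholder): --init tells the container runtime to start a small init process as PID 1, which reaps children and forwards SIGTERM/SIGINT to the actual workload.

podman run --rm --init -p 8080:8080 <your-llama-server-image>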

magicse (Contributor) commented Feb 9, 2025

rhatdan (Author) commented Feb 10, 2025

I know how to stop an app that is ignoring signals inside of a container. The issue here is that llama-serve should not be ignoring these signals. If you run an app like top and then press ^C, it exits instantly.

Running with --init does not change the behavior.

ngxson (Collaborator) commented Feb 10, 2025

@rhatdan I think this could be something Dockerfile-related and not llama-server itself. I have often seen the same mistake when people use a shell script as the Dockerfile entrypoint that calls another binary, which results in signals not being properly forwarded.

I noticed that we're using ENTRYPOINT ["/app/tools.sh"] in https://github.com/ggerganov/llama.cpp/blob/master/.devops/vulkan.Dockerfile#L89 , so this is potentially the cause. Can you try overriding the entrypoint to bypass the tools.sh script, something like docker run --entrypoint ...?
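
For example, a sketch of such an override (the binary path /app/llama-server, the mount path, and the image name here are assumptions for illustration, not verified against the image):

podman run --rm -p 8080:8080 -v /path/to/models:/mnt/models \
    --entrypoint /app/llama-server quay.io/ramalama/vulkan \
    -m /mnt/models/model.file --host 0.0.0.0 --port 8080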

ngxson (Collaborator) commented Feb 10, 2025

Also, your command line runs bench, not server:

/bin/ramalama --image quay.io/ramalama/vulkan bench

magicse (Contributor) commented Feb 11, 2025

> I know how to stop an app that is ignoring signals inside of a container. The issue here is that llama-serve should not be ignoring these signals. If you run an app like top and then press ^C, it exits instantly.
>
> Running with --init does not change the behavior.

-i and --init are different options.

rhatdan (Author) commented Feb 11, 2025

All containers run by RamaLama run with -i

kth8 commented Feb 11, 2025

-i is short for --interactive, not --init

ngxson (Collaborator) commented Feb 11, 2025

I don't get why -i is important here.

Copied from https://docs.docker.com/reference/cli/docker/container/run/#interactive

> The --interactive (or -i) flag keeps the container's STDIN open, and lets you send input to the container through standard input.

But signals are not delivered via stdin... some apps terminate on Ctrl+D because it signifies EOF, but that does not emit SIGTERM.

https://forums.docker.com/t/docker-run-cannot-be-killed-with-ctrl-c/13108/11

@magicse the last comment in this post mentions ENTRYPOINT, which aligns with my speculation above. We should test this theory instead.
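
If the entrypoint script is indeed the culprit, the usual fix is for the script to exec the final binary so it replaces the shell as PID 1 and receives signals directly. A hypothetical wrapper in the style of tools.sh (an illustration under that assumption, not the actual script):

#!/bin/sh
# Without "exec", the shell stays alive as PID 1 and does not
# forward SIGTERM/SIGINT to the llama-server child process.
exec /app/llama-server "$@"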

kth8 commented Feb 11, 2025

When I run llama-server inside a container with

podman run -d --name llama1b --init -p 8001:8080/tcp ghcr.io/kth8/llama-server:llama-3.2-1b-instruct

then run

podman stop llama1b

it immediately stops with Exited (143) (128 + SIGTERM). Without --init it hangs for 10 seconds before getting killed with Exited (137) (128 + SIGKILL):

WARN[0010] StopSignal SIGTERM failed to stop container llama1b in 10 seconds, resorting to SIGKILL

magicse (Contributor) commented Feb 11, 2025

> WARN[0010] StopSignal SIGTERM failed to stop container llama1b in 10 seconds, resorting to SIGKILL

Try without --init; after stopping the container, try refreshing the browser page with the llama server open, or simply close it.

rhatdan (Author) commented Feb 11, 2025

I tested this locally and will attempt to make a similar change in RamaLama. Thanks. We need to make sure we have the latest llama.cpp in our containers.

rhatdan closed this as completed Feb 11, 2025