Misc. bug: RPC attempt fails with a specific error, but I cannot find any info on troubleshooting it #11929

Open
maglore9900 opened this issue Feb 17, 2025 · 3 comments

Comments

@maglore9900

Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3070, compute capability 8.6, VMM: yes
version: 4735 (73e2ed3)
built with cc (Ubuntu 13.2.0-23ubuntu4) 13.2.0 for x86_64-linux-gnu

The GPU system above is built per the RPC instructions and launches fine.

The other system attempting to use it is also Ubuntu, but has no GPU:
version: 4735 (73e2ed3)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line

`bin/llama-cli -m ../models/llama-3.2-3b-instruct-q4_k_m.gguf` runs with no issues
`bin/llama-cli -m ../models/llama-3.2-3b-instruct-q4_k_m.gguf --rpc 10.0.0.125:52415` fails with the error listed below

Problem description & steps to reproduce

It was built with `cmake -B build -DGGML_RPC=ON` and compiles fine.
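
For reference, the backend side of the RPC setup described in the llama.cpp docs looks roughly like this. This is only a sketch: the listen address and port below are placeholders, and the exact `rpc-server` flags may differ by version (check `rpc-server --help`).

```sh
# On the backend host: build with the RPC backend enabled.
cmake -B build -DGGML_RPC=ON
cmake --build build --config Release

# Start the RPC server, binding to an address the client can reach
# (0.0.0.0 and port 50052 are placeholders; the docs use 50052 as the example port).
bin/rpc-server -H 0.0.0.0 -p 50052
```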

I can run `bin/llama-cli -m ../models/llama-3.2-3b-instruct-q4_k_m.gguf` without any issues, and the same command also runs without any issues on the other system.

But when I run
`bin/llama-cli -m ../models/llama-3.2-3b-instruct-q4_k_m.gguf --rpc 10.0.0.125:52415`
I get the following error:

build: 4735 (73e2ed3c) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
main: llama backend init
main: load the model and apply lora adapter, if any
/mnt/test_zone/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:755: GGML_ASSERT(status) failed
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)

I have tried rebuilding, googling, and reviewing the docs, but I cannot figure out why I get this error.

I have confirmed that both systems can communicate with each other on my network.
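
A quick way to confirm that the RPC endpoint itself (not just the host) is reachable, assuming `ss` and `nc` are available on the two machines:

```sh
# On the server (10.0.0.125): check that something is actually listening on the RPC port.
ss -ltn | grep 52415

# On the client: check that a TCP connection to that port can be opened.
nc -vz 10.0.0.125 52415
```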

First Bad Commit

No response

Relevant log output

@maglore9900 (Author)

Here is the output with the command run with sudo, so that ptrace is permitted:

build: 4735 (73e2ed3c) with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
main: llama backend init
main: load the model and apply lora adapter, if any
/mnt/test_zone/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:755: GGML_ASSERT(status) failed
[New LWP 380474]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x0000706684aea42f in __GI___wait4 (pid=380475, stat_loc=0x7ffc128ba924, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30      ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0  0x0000706684aea42f in __GI___wait4 (pid=380475, stat_loc=0x7ffc128ba924, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30      in ../sysdeps/unix/sysv/linux/wait4.c
#1  0x000070668505bd5a in ggml_abort () from /mnt/test_zone/llama.cpp/build/bin/libggml-base.so
#2  0x0000706684c32f30 in ggml_backend_rpc_get_device_memory () from /mnt/test_zone/llama.cpp/build/bin/libggml-rpc.so
#3  0x000070668518369f in llama_model_load_from_file_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, llama_model_params) () from /mnt/test_zone/llama.cpp/build/bin/libllama.so
#4  0x00007066851843c6 in llama_model_load_from_file () from /mnt/test_zone/llama.cpp/build/bin/libllama.so
#5  0x000058448083d111 in common_init_from_params(common_params&) ()
#6  0x00005844807e11f0 in main ()
[Inferior 1 (process 380473) detached]
Aborted


br00t4c commented Feb 19, 2025

I run into the same issue when attempting to offload model layers to a CPU-only RPC backend.
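
For context, the client side of layer offload normally points `--rpc` at a running `rpc-server` and requests offload with `-ngl`. A minimal sketch, with placeholder model path, host, and port:

```sh
# Offload up to 99 layers to the remote RPC backend; model path and host:port are placeholders.
bin/llama-cli -m model.gguf --rpc 192.168.1.10:50052 -ngl 99 -p "Hello" -n 64
```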

@maglore9900 (Author)

Is it necessary to run the llama-cli command with the --rpc flag ONLY on a system that is also running the RPC server?

If so, this would indicate that only a system with a GPU can take advantage of the RPC option, which would account for the error I am receiving.
