
unable to run ramalama using --runtime vllm on macOS #801

Open
benoitf opened this issue Feb 13, 2025 · 5 comments

benoitf commented Feb 13, 2025

Trying ramalama on macOS 15 with --runtime vllm, I got:

Trying to pull quay.io/modh/vllm:rhoai-2.18-cuda...
Error: choosing an image from manifest list docker://quay.io/modh/vllm:rhoai-2.18-cuda: no image found in image index for architecture "arm64", variant "v8", OS "linux"

It seems to fetch a CUDA image even though I am on Apple silicon.
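
(Not part of the original report, but for anyone reproducing this: a quick way to check which platforms that manifest list actually provides is to inspect it directly, assuming skopeo and jq are installed. If only amd64 entries come back, that matches the error above.)

$ skopeo inspect --raw docker://quay.io/modh/vllm:rhoai-2.18-cuda | jq '.manifests[].platform'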

benoitf commented Feb 13, 2025

https://docs.vllm.ai/en/latest/getting_started/installation/cpu/index.html

It's not optimal, but vLLM can run on macOS:

$ podman run --rm -it -p 8090:8000 localhost/vllm-cpu-env --model TinyLlama/TinyLlama-1.1B-Chat-v1.0
INFO 02-13 10:52:31 __init__.py:190] Automatically detected platform cpu.
INFO 02-13 10:52:31 api_server.py:840] vLLM API server version 0.7.3.dev116+g578087e5
...
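
(The localhost/vllm-cpu-env image above is one built locally from the vLLM source tree. Roughly, following the linked doc, the build looks like the commands below; the Dockerfile name and flags may differ between vLLM releases, and the doc uses docker rather than podman.)

$ git clone https://github.com/vllm-project/vllm.git
$ cd vllm
$ podman build -f Dockerfile.cpu -t vllm-cpu-env .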

ericcurtin (Collaborator) commented

I think you found your answer in the above doc:

"vLLM has experimental support for macOS with Apple silicon. For now, users shall build from the source vLLM to natively run on macOS."

This is one for the vLLM folks.

ericcurtin (Collaborator) commented

A general issue around vLLM support should be opened if there isn't one already; it could do with some Containerfile work.

llama.cpp is the more suitable runtime for macOS today.

benoitf commented Feb 13, 2025

I don't see why this is being closed as fixed.

The user experience is bad: it tries to fetch an image that does not exist.

It should report a clear error message.

benoitf reopened this Feb 13, 2025

benoitf commented Feb 13, 2025

"vLLM has experimental support for macOS with Apple silicon. For now, users shall build from the source vLLM to natively run on macOS."

Some ramalama images already build llama.cpp from source, so they could also build vLLM from source for later arm/macOS usage.
I'm not talking about running vLLM natively on my laptop, but about running it containerized (where it is already compiled).
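
(A rough, untested sketch of what such a Containerfile build stage might look like. The base image, requirements file name, and the VLLM_TARGET_DEVICE knob follow the upstream CPU build instructions and may change between vLLM releases.)

FROM registry.access.redhat.com/ubi9/python-311 AS vllm-cpu
USER 0
# build tools needed to compile the CPU backend from source
RUN dnf install -y gcc-c++ git cmake && dnf clean all
RUN git clone https://github.com/vllm-project/vllm.git /opt/vllm-src
WORKDIR /opt/vllm-src
# dependencies for the CPU backend (file name may vary by release)
RUN pip install -r requirements-cpu.txt
# build vLLM targeting the CPU backend instead of CUDA
RUN VLLM_TARGET_DEVICE=cpu python setup.py install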
