I've followed the llm-d quick start to install llm-d on my OpenShift 4.17 cluster and deployed the meta-llama/Llama-3.2-3B-Instruct model as part of the "sampleApplication" during setup. This model deploys properly and all the resources, including the HTTPRoute, are created. I can run ./test-request.sh and everything completes successfully.
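For reference, this is roughly how I verified the sample deployment (the `llm-d` namespace and the `modelservice` resource name are from my environment and my reading of the CRD, so they may differ):

```shell
# All sampleApplication resources are present, including the HTTPRoute
kubectl get modelservice,httproute -n llm-d

# End-to-end request through the gateway completes successfully
./test-request.sh
```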
When I try to deploy my own model, applying the ModelService does not create an HTTPRoute, and the model is not registered with the gateway. I can query the model directly from the decode pod, but not through the gateway. I have tried sample Scenario 2 (https://github.com/llm-d/llm-d-model-service/tree/main/samples/nixl-xpyd): when applying the baseconfig and ModelService CR from that sample, all the components are created except the HTTPRoute. I have also tried the config that the sampleApplication uses, basic-gpu-with-nixl-and-redis-lookup-preset, and I run into the same issue.
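Concretely, this is what I observe after applying the sample's baseconfig and ModelService CR (the deployment name below is a placeholder for my environment, and the port assumes the vLLM default):

```shell
# Every component from the ModelService is created except the HTTPRoute:
kubectl get httproute -n llm-d
# -> only the sampleApplication route is listed, nothing for the new ModelService

# Querying the decode pod directly works:
kubectl exec -n llm-d deploy/<decode-deployment> -- \
  curl -s http://localhost:8000/v1/models

# The same request sent through the gateway never reaches the new model.
```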
I have also tried specifying the model I want to deploy, RedHatAI/granite-3.1-2b-instruct-quantized.w4a16, in my own ModelService, and I hit the same issue described above. However, when I modify values.yaml (https://github.com/llm-d/llm-d-deployer/blob/main/charts/llm-d/values.yaml#L109) to point at RedHatAI/granite-3.1-2b-instruct-quantized.w4a16 (or any other model), all the components, including the HTTPRoute, are created successfully.
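For concreteness, this is the shape of the ModelService I applied for the granite model (adapted from the nixl-xpyd sample; the apiVersion and field names are my best reading of that sample and may not match the current CRD exactly):

```yaml
apiVersion: llm-d.ai/v1alpha1   # best-effort, taken from the sample manifests
kind: ModelService
metadata:
  name: granite-instruct
  namespace: llm-d
spec:
  # Same preset the sampleApplication uses
  baseConfigMapRef:
    name: basic-gpu-with-nixl-and-redis-lookup-preset
  routing:
    # Model name clients should use when calling the gateway
    modelName: RedHatAI/granite-3.1-2b-instruct-quantized.w4a16
  modelArtifacts:
    uri: hf://RedHatAI/granite-3.1-2b-instruct-quantized.w4a16
  decode:
    replicas: 1
  prefill:
    replicas: 1
```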
In short, I have not been able to get any ModelService outside of the sampleApplication to create an HTTPRoute.