Description
Describe the Bug
The frontend never registers workers for multimodal models. After digging through the code, I found that three multimodal init functions are missing the register_llm_with_readiness_gate() call that all the text workers have.
File: components/src/dynamo/sglang/main.py
Missing registration in:
init_multimodal_worker() (line 401)
init_multimodal_encode_worker() (line 360)
init_multimodal_prefill_worker() (line 441)
Have registration (working fine):
init_worker() (line 167)
init_prefill() (line 221)
init_embedding() (line 289)
Even init_multimodal_processor() has it (line 344)
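For anyone who wants to confirm the list above without reading the file by hand, here is a rough sketch that uses Python's ast module to report which init_* functions never call register_llm_with_readiness_gate. The SAMPLE string is a toy stand-in I made up for illustration (including the serve() call and the argument names); it is not the real contents of main.py. To check the real file, pass its source text instead.

```python
import ast

REGISTER_CALL = "register_llm_with_readiness_gate"


def functions_missing_call(source: str, call_name: str = REGISTER_CALL) -> list[str]:
    """Return the sorted names of init_* functions that never call `call_name`."""
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not node.name.startswith("init_"):
            continue
        # Collect the name of every call made anywhere inside this function.
        called = set()
        for sub in ast.walk(node):
            if isinstance(sub, ast.Call):
                func = sub.func
                if isinstance(func, ast.Name):
                    called.add(func.id)
                elif isinstance(func, ast.Attribute):
                    called.add(func.attr)
        if call_name not in called:
            missing.append(node.name)
    return sorted(missing)


# Toy stand-in for main.py (hypothetical bodies and arguments, NOT the real file).
SAMPLE = '''
async def init_worker(runtime, config):
    await register_llm_with_readiness_gate(runtime, config)

async def init_multimodal_worker(runtime, config):
    await serve(runtime, config)
'''

print(functions_missing_call(SAMPLE))
```

Against the real file this would be something like functions_missing_call(open("components/src/dynamo/sglang/main.py").read()). It only matches direct calls by name, so a registration done through a helper or wrapper would show up as a false positive.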
Steps to Reproduce
```bash
# Start frontend
python -m dynamo.frontend --http-port 8000

# Start multimodal worker
python -m dynamo.sglang --model-path Qwen/Qwen2.5-VL-7B-Instruct --multimodal-worker

# Check models
curl http://localhost:8000/v1/models
# Returns: {"data": []} ← model not registered

# Try to use it
curl -X POST http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "Qwen/Qwen2.5-VL-7B-Instruct",
  "messages": [{"role": "user", "content": "Hello!"}]
}'
```
Expected Behavior
- Worker should register with frontend via etcd/NATS discovery
- Frontend logs should show: added model model_name="Qwen/Qwen2.5-VL-7B-Instruct"
- curl /v1/models should return the model
Actual Behavior
- Worker starts but never registers
- Frontend logs show no model addition
- curl /v1/models returns empty list
- Requests return 503 Service Unavailable
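The registration check can also be scripted, which may be handy for verifying a fix. This is a minimal sketch that assumes the /v1/models response follows the usual OpenAI-style shape, where each entry in "data" carries an "id" field; only the empty {"data": []} case was actually observed here, so the populated shape is an assumption. The helper names are my own, and fetch_models() is defined but not called so the example runs without a live frontend.

```python
import json
import urllib.request


def model_ids(models_response: dict) -> list[str]:
    """Extract model ids from a /v1/models style response (assumed shape)."""
    return [
        entry["id"]
        for entry in models_response.get("data", [])
        if isinstance(entry, dict) and "id" in entry
    ]


def is_registered(models_response: dict, model_id: str) -> bool:
    """True if `model_id` appears in the response's model list."""
    return model_id in model_ids(models_response)


def fetch_models(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> dict:
    """Fetch and decode GET {base_url}/v1/models (needs a running frontend)."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        return json.load(resp)


MODEL = "Qwen/Qwen2.5-VL-7B-Instruct"

# Response observed in this report: nothing registered.
observed = json.loads('{"data": []}')
print(is_registered(observed, MODEL))

# Assumed shape of a healthy response once the worker registers.
expected = {"data": [{"id": MODEL}]}
print(is_registered(expected, MODEL))
```

With the frontend running, is_registered(fetch_models(), MODEL) gives the same check against the live endpoint.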
Environment
Dynamo: v0.6.1
Backend: SGLang
Hardware: 8x H100
Additional Context
No response
Screenshots
No response