Conversation

derekhiggins
Contributor

- Introduces vLLM provider support to the record/replay testing framework
- Enables both recording and replay of vLLM API interactions alongside existing Ollama support

The changes enable testing of vLLM functionality. vLLM tests focus on
inference capabilities, while Ollama continues to exercise the full API surface
including vision features.
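
For illustration, here is a minimal sketch of the record/replay idea, not the framework's actual code: it assumes vLLM's OpenAI-compatible endpoint, and the `INFERENCE_MODE` variable, `recordings/` directory, endpoint URL, and `chat()` helper are all hypothetical.

```python
import hashlib
import json
import os
from pathlib import Path

from openai import OpenAI  # vLLM serves an OpenAI-compatible API

MODE = os.environ.get("INFERENCE_MODE", "replay")  # "record" or "replay" (hypothetical)
RECORDINGS = Path("recordings")                    # hypothetical storage location
RECORDINGS.mkdir(exist_ok=True)

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

def chat(model: str, messages: list[dict]) -> dict:
    # Key the recording on the request so replay is deterministic.
    key = hashlib.sha256(
        json.dumps({"model": model, "messages": messages}, sort_keys=True).encode()
    ).hexdigest()
    rec = RECORDINGS / f"{key}.json"
    if MODE == "replay":
        return json.loads(rec.read_text())  # replay from disk, no live server needed
    resp = client.chat.completions.create(model=model, messages=messages)
    rec.write_text(resp.model_dump_json())  # record the live response for later replay
    return json.loads(rec.read_text())
```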

--
This is an alternative to #3128; Qwen3 appears to be more capable at structured output and tool calls than Llama 3.2 1B.

```diff
     },
     defaults={
-        "text_model": "vllm/meta-llama/Llama-3.2-1B-Instruct",
+        "text_model": "vllm/Qwen/Qwen3-0.6B",
```
Contributor

@derekhiggins anything blocking with this PR?

Contributor Author

I need to fix a problem in the record mechanism that arose over the last few days. Hopefully I'll have it working again later today.

```diff
@@ -168,6 +168,11 @@ class Setup(BaseModel):
         roots=base_roots,
         default_setup="ollama",
     ),
+    "base-vllm-subset": Suite(
```
Collaborator

is this needed anymore?

Contributor Author

My intent here was to add this job with only the tests in "tests/integration/inference", and then, once we're happy we haven't caused any major disruption, we could expand to the entire suite. Roughly the shape I have in mind is sketched below.
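
A sketch of the subset suite entry (illustrative only; the exact roots value and setup name are assumptions, not final code):

```python
"base-vllm-subset": Suite(
    roots=["tests/integration/inference"],  # inference tests only for now
    default_setup="vllm",                   # assumed setup name
),
```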

Contributor

@derekhiggins you feel like this is ready?

@derekhiggins force-pushed the vllm-ci-qwen branch 7 times, most recently from feb3ec9 to 1c3ea50 on October 14, 2025 07:49
It performs better in tool calling and structured output tests

Signed-off-by: Derek Higgins <[email protected]>
Add vLLM provider support to integration test CI workflows alongside
existing Ollama support. Configure provider-specific test execution
where vLLM runs only inference-specific tests (excluding vision tests)
while Ollama continues to run the full test suite.

This enables comprehensive CI testing of both inference providers while
keeping the vLLM footprint small; it can be expanded later if it proves
not to be too disruptive.

Also updated test skips that were marked with "inline::vllm"; this
should be "remote::vllm". This causes some failing log probs tests
to be skipped and should be revisited.

Signed-off-by: Derek Higgins <[email protected]>
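
For reference, the kind of provider-based skip the last paragraph refers to, as a sketch only; the helper name and its call site are assumptions, not the repo's actual code:

```python
import pytest

def maybe_skip_log_probs(provider_type: str) -> None:
    # Log-probs support is unverified for the remote vLLM provider,
    # so the corresponding tests are skipped for now.
    if provider_type == "remote::vllm":  # previously marked as "inline::vllm"
        pytest.skip(f"log probs tests skipped for {provider_type}")
```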