ci: Add vLLM support to integration testing infrastructure #3128
Conversation
@ashwinb With this we'll need to run the record tests for 2 providers, but they can't be run in parallel (it works if you run them sequentially). To avoid conflicts, what would you think about removing the index.sqlite file altogether?
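For context, recording has to happen one provider at a time while index.sqlite is still shared. A minimal sketch of a sequential run (the script name and flags here are illustrative, not taken from this PR):

```bash
# Record for each provider one after the other so the two runs don't
# race on the shared index.sqlite (hypothetical script name and flags).
for setup in ollama vllm; do
  ./scripts/integration-tests.sh --inference-mode record --setup "$setup"
done
```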
I also dealt with
index.sqlite has been removed here: #3254
```bash
# Additional exclusions for vllm setup
if [[ "$TEST_SETUP" == "vllm" ]]; then
    EXCLUDE_TESTS="${EXCLUDE_TESTS} or test_inference_store_tool_calls or test_text_chat_completion_structured_output"
fi
```
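For context, a sketch of how an exclusion string like this is typically consumed, assuming the workflow turns EXCLUDE_TESTS into a pytest -k filter (the exact plumbing lives elsewhere in the CI script):

```bash
# EXCLUDE_TESTS is a list of test names joined with "or"; wrapping it in
# "not (...)" deselects those tests (assumed -k plumbing, for illustration).
EXCLUDE_TESTS="test_inference_store_tool_calls or test_text_chat_completion_structured_output"
pytest tests/integration -k "not (${EXCLUDE_TESTS})"
```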
what about adding these to the skips in the test files directly?
The problem here is the model: our skips in the test files are all based on provider.
I put the skips here so that they only apply in CI; anybody running the integration tests with a more capable model will still be able to use those tests.
If we can get to the point that this job is running, I'll happily test other models to see if I can get rid of this line altogether.
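To illustrate, since the exclusion lives only in the CI script, a local run against a stronger model still picks these tests up (plain pytest invocation; any provider/model selection flags are omitted here):

```bash
# Locally, nothing deselects these tests; they run as part of the suite
# (selection shown explicitly here just for illustration).
pytest tests/integration/inference \
  -k "test_inference_store_tool_calls or test_text_chat_completion_structured_output"
```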
yes please, CI with a model that passes more tests.
Having a gap between what CI tests and what developers see in the test suite is going to lead to bugs and confusion.
I've opened an alternative PR that instead uses qwen3: #3545.
I can close whichever one we don't want to go with.
tl;dr: I took this commit and removed all of the trivial changes that the ollama record CI job produced (these aren't needed).
- Update earth question to be more specific, with a multiple-choice format, to prevent Llama-3.2-1B-Instruct from rambling about other planets
- Skip test_text_chat_completion_structured_output as it sometimes times out during CI execution, again with Llama-3.2-1B-Instruct on vllm

Signed-off-by: Derek Higgins <[email protected]>
Add vLLM provider support to integration test CI workflows alongside the existing Ollama support. Configure provider-specific test execution where vLLM runs only inference-specific tests (excluding vision tests) while Ollama continues to run the full test suite. This enables comprehensive CI testing of both inference providers but keeps the vLLM footprint small; it can be expanded later if it proves not to be too disruptive.

Signed-off-by: Derek Higgins <[email protected]>
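A rough sketch of the provider-specific selection the commit message describes (illustrative only; the real logic lives in the workflow scripts, and names other than TEST_SETUP and EXCLUDE_TESTS are assumptions):

```bash
# vLLM runs only the inference tests and skips vision; Ollama keeps the
# full suite (TEST_SUBDIRS and the vision test name are hypothetical).
if [[ "$TEST_SETUP" == "vllm" ]]; then
  TEST_SUBDIRS="inference"
  EXCLUDE_TESTS="${EXCLUDE_TESTS} or test_vision"
else
  TEST_SUBDIRS="$(ls tests/integration)"
fi
```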
Closing this in favour of the qwen version (as per discussion in the community meeting).


- Introduces vLLM provider support to the record/replay testing framework
- Enables both recording and replay of vLLM API interactions alongside the existing Ollama support

The changes enable testing of vLLM functionality. vLLM tests focus on inference capabilities, while Ollama continues to exercise the full API surface, including vision features.
Related: #2888
--
See the alternative using qwen here: #3545