
Commit 5468bab

ci: integrate vLLM inference tests with GitHub Actions workflows
Add vLLM provider support to the integration-test CI workflows alongside the existing Ollama support. Configure provider-specific test execution: vLLM runs only the inference-specific tests (excluding vision tests), while Ollama continues to run the full test suite. This gives CI coverage of both inference providers while keeping the vLLM footprint small; it can be expanded later if it proves not to be too disruptive. Also update test skips that were marked with "inline::vllm" to the correct "remote::vllm" provider type. This causes some failing log-probs tests to be skipped and should be revisited.

Signed-off-by: Derek Higgins <[email protected]>
1 parent 4c6693a commit 5468bab

File tree: 4 files changed (+13, −9 lines)

  .github/actions/run-and-record-tests/action.yml
  .github/workflows/integration-tests.yml
  tests/integration/inference/test_openai_completion.py
  tests/integration/suites.py

.github/actions/run-and-record-tests/action.yml

Lines changed: 4 additions & 2 deletions
@@ -68,7 +68,8 @@ runs:
          echo "New recordings detected, committing and pushing"
          git add tests/integration/recordings/

-         git commit -m "Recordings update from CI (suite: ${{ inputs.suite }})"
+         git commit -m "Recordings update from CI (setup: ${{ inputs.setup }}, suite: ${{ inputs.suite }})"
+
          git fetch origin ${{ github.ref_name }}
          git rebase origin/${{ github.ref_name }}
          echo "Rebased successfully"
@@ -82,7 +83,8 @@
      if: ${{ always() }}
      shell: bash
      run: |
-       sudo docker logs ollama > ollama-${{ inputs.inference-mode }}.log || true
+       sudo docker logs ollama > ollama-${{ inputs.inference-mode }}.log 2>&1 || true
+       sudo docker logs vllm > vllm-${{ inputs.inference-mode }}.log 2>&1 || true

    - name: Upload logs
      if: ${{ always() }}
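
The 2>&1 added above matters because docker logs replays the container's stderr stream on its own stderr, so without the redirect most of the server's output never lands in the uploaded artifact. A minimal Python sketch of the same step, assuming the container names (ollama, vllm) and the <container>-<inference-mode>.log naming used by this action; everything else is illustrative:

import subprocess

def dump_container_logs(container: str, inference_mode: str) -> None:
    """Write a container's stdout *and* stderr to one log file, never failing the step."""
    with open(f"{container}-{inference_mode}.log", "wb") as f:
        subprocess.run(
            ["docker", "logs", container],
            stdout=f,
            stderr=subprocess.STDOUT,  # shell equivalent of 2>&1
            check=False,               # shell equivalent of || true: a missing container must not fail the step
        )

for name in ("ollama", "vllm"):
    dump_container_logs(name, "record")  # "record" stands in for ${{ inputs.inference-mode }}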

.github/workflows/integration-tests.yml

Lines changed: 2 additions & 5 deletions
@@ -21,7 +21,6 @@ on:
  schedule:
    # If changing the cron schedule, update the provider in the test-matrix job
    - cron: '0 0 * * *' # (test latest client) Daily at 12 AM UTC
-   - cron: '1 0 * * 0' # (test vllm) Weekly on Sunday at 1 AM UTC
  workflow_dispatch:
    inputs:
      test-all-client-versions:
@@ -57,11 +56,9 @@ jobs:
        # Default (including test-setup=ollama): both ollama+base and ollama-vision+vision
        config: >-
          ${{
-           github.event.schedule == '1 0 * * 0'
-           && fromJSON('[{"setup": "vllm", "suite": "base"}]')
-           || github.event.inputs.test-setup == 'ollama-vision'
+           github.event.inputs.test-setup == 'ollama-vision'
            && fromJSON('[{"setup": "ollama-vision", "suite": "vision"}]')
-           || fromJSON('[{"setup": "ollama", "suite": "base"}, {"setup": "ollama-vision", "suite": "vision"}]')
+           || fromJSON('[{"setup": "ollama", "suite": "base"}, {"setup": "ollama-vision", "suite": "vision"}, {"setup": "vllm", "suite": "base-vllm-subset"}]')
          }}

      steps:
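
To make the ternary chain above easier to read, here is a rough Python rendering of how the matrix config now resolves; the trigger handling is simplified and the real logic is the GitHub expression in the diff:

def matrix_config(test_setup: str | None) -> list[dict[str, str]]:
    # Manual dispatch asking specifically for the vision setup.
    if test_setup == "ollama-vision":
        return [{"setup": "ollama-vision", "suite": "vision"}]
    # Everything else, including the daily cron: full Ollama coverage plus the
    # small vLLM inference-only subset added by this commit. The weekly
    # vLLM-only cron branch is gone.
    return [
        {"setup": "ollama", "suite": "base"},
        {"setup": "ollama-vision", "suite": "vision"},
        {"setup": "vllm", "suite": "base-vllm-subset"},
    ]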

tests/integration/inference/test_openai_completion.py

Lines changed: 2 additions & 2 deletions
@@ -39,7 +39,7 @@ def skip_if_model_doesnt_support_openai_completion(client_with_models, model_id):
    if provider.provider_type in (
        "inline::meta-reference",
        "inline::sentence-transformers",
-       "inline::vllm",
+       "remote::vllm",
        "remote::bedrock",
        "remote::databricks",
        # Technically Nvidia does support OpenAI completions, but none of their hosted models
@@ -119,7 +119,7 @@ def skip_if_model_doesnt_support_openai_chat_completion(client_with_models, model_id):
    if provider.provider_type in (
        "inline::meta-reference",
        "inline::sentence-transformers",
-       "inline::vllm",
+       "remote::vllm",
        "remote::bedrock",
        "remote::databricks",
        "remote::cerebras",

tests/integration/suites.py

Lines changed: 5 additions & 0 deletions
@@ -168,6 +168,11 @@ class Setup(BaseModel):
        roots=base_roots,
        default_setup="ollama",
    ),
+   "base-vllm-subset": Suite(
+       name="base-vllm-subset",
+       roots=["tests/integration/inference"],
+       default_setup="vllm",
+   ),
    "responses": Suite(
        name="responses",
        roots=["tests/integration/responses"],
