feat: add LiteLLM as AI gateway provider #357

Open
RheagalFire wants to merge 2 commits into groq:main from RheagalFire:feat/add-litellm-provider

@RheagalFire RheagalFire commented Jun 25, 2026

Summary

Add LiteLLM as a model provider, enabling benchmarks to run against 100+ LLM providers (OpenAI, Anthropic, Azure, Bedrock, Vertex, Groq, Ollama, etc.) through a single LiteLLM proxy endpoint.

What are you adding?

  • New model provider

Changes Made

  • src/openbench/model/_providers/litellm.py - New LiteLLMAPI class extending OpenAICompatibleAPI (follows the exact DeepInfraAPI pattern)
  • src/openbench/_registry.py - Registered litellm provider via @modelapi decorator
  • src/openbench/provider_config.py - Added LITELLM to ProviderType enum and PROVIDER_CONFIGS
  • tests/test_litellm_provider.py - 7 unit tests covering initialization, API key handling, base URL configuration, and model name prefix stripping
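
The behaviour those tests cover (API key handling, URL config, model name stripping) can be illustrated with a minimal, standard-library-only sketch. This is not the code in the PR: `resolve_litellm_settings` is a hypothetical helper, and the fallback to `http://localhost:4000/v1` is an assumption taken from the usage example rather than a confirmed default. Only the `LITELLM_API_KEY` and `LITELLM_BASE_URL` variable names come from the description.

```python
import os
from typing import Mapping, Optional, Tuple

PROVIDER_PREFIX = "litellm/"
# Assumed fallback: the usage example in this PR runs the proxy on port 4000.
DEFAULT_BASE_URL = "http://localhost:4000/v1"


def resolve_litellm_settings(
    model_name: str,
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str, str]:
    """Return (proxy_model_name, api_key, base_url) for a litellm/... model.

    Hypothetical helper for illustration; explicit arguments win over
    environment variables, which win over the assumed default URL.
    """
    env = os.environ if env is None else env

    # Strip only the leading provider prefix, so that
    # "litellm/openai/gpt-4o" becomes "openai/gpt-4o".
    if model_name.startswith(PROVIDER_PREFIX):
        proxy_model = model_name[len(PROVIDER_PREFIX):]
    else:
        proxy_model = model_name

    key = api_key or env.get("LITELLM_API_KEY")
    if not key:
        raise ValueError(
            "LiteLLM API key not found. Set LITELLM_API_KEY or pass api_key."
        )

    # Drop a trailing slash so paths can be appended consistently.
    url = (base_url or env.get("LITELLM_BASE_URL") or DEFAULT_BASE_URL).rstrip("/")
    return proxy_model, key, url
```

Passing `env` explicitly keeps the helper easy to unit test without touching the real process environment.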

Testing

  • I have run the existing test suite (pytest)
  • I have added tests for my changes
  • I have tested with multiple model providers (if applicable)
  • I have run pre-commit hooks (pre-commit run --all-files)
tests/test_litellm_provider.py .......    7 passed
Full suite: 395 passed, 10 skipped in 60.73s
ruff format -> unchanged
ruff check -> All checks passed!

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (if applicable)
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Context

Usage

  1. Start a LiteLLM proxy with your providers:
# litellm_config.yaml
model_list:
  - model_name: anthropic/claude-sonnet-4-6
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: sk-ant-...
  - model_name: openai/gpt-4o
    litellm_params:
      model: gpt-4o
      api_key: sk-...

litellm_settings:
  drop_params: true
# The proxy server is shipped in the "proxy" extra
pip install 'litellm[proxy]'
litellm --config litellm_config.yaml --port 4000
  2. Run benchmarks through LiteLLM:
export LITELLM_API_KEY=sk-...
export LITELLM_BASE_URL=http://localhost:4000/v1

# Evaluate Claude via LiteLLM proxy
bench eval mmlu --model litellm/anthropic/claude-sonnet-4-6 --limit 10

# Evaluate GPT-4o via same proxy
bench eval mmlu --model litellm/openai/gpt-4o --limit 10

# Any other model your proxy serves (it must be defined in the proxy's model_list)
bench eval humaneval --model litellm/groq/llama-3.3-70b-versatile --limit 5
  3. Or use the provider programmatically:
from openbench.model._providers.litellm import LiteLLMAPI
from inspect_ai.model import GenerateConfig

provider = LiteLLMAPI(
    model_name="litellm/anthropic/claude-sonnet-4-6",
    api_key="sk-your-litellm-key",
    base_url="http://localhost:4000/v1",
    config=GenerateConfig(temperature=0.0, max_tokens=1024),
)

Everything after the litellm/ prefix is passed to the proxy unchanged, so use the model identifiers that your LiteLLM config defines under model_name. No new dependencies are required, because the provider builds on the existing OpenAICompatibleAPI base class from inspect-ai.
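
Since the identifiers must match what the proxy serves, it can help to list them before starting a run. The following is a sketch under assumptions, not part of this PR: it assumes the proxy exposes the OpenAI-compatible `GET /models` route under the base URL and returns the usual `{"data": [{"id": ...}]}` shape. Both function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List


def to_bench_model_names(models_payload: Dict[str, Any]) -> List[str]:
    """Convert an OpenAI-style /models response into litellm/-prefixed names."""
    ids = [entry["id"] for entry in models_payload.get("data", []) if "id" in entry]
    return [f"litellm/{model_id}" for model_id in sorted(ids)]


def fetch_proxy_models(base_url: str, api_key: str, timeout: float = 10.0) -> Dict[str, Any]:
    """Fetch the model list from the proxy (assumes an OpenAI-compatible /models route)."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example usage against a running proxy (not executed here):
#   payload = fetch_proxy_models("http://localhost:4000/v1", "sk-...")
#   for name in to_bench_model_names(payload):
#       print(name)  # pass any of these to: bench eval <benchmark> --model <name>
```

Keeping the parsing step separate from the HTTP call means the name mapping can be checked without a live proxy.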

@RheagalFire RheagalFire requested a review from nmayorga7 as a code owner June 25, 2026 11:44
@RheagalFire

Author

cc @AarushSah @nmayorga7
