Releases: BerriAI/litellm
v1.66.1-nightly
What's Changed
- build(model_prices_and_context_window.json): add gpt-4.1 pricing by @krrishdholakia in #9990
- [Fixes/QA] For gpt-4.1 costs by @ishaan-jaff in #9991
- Fix cost for Phi-4-multimodal output tokens by @emerzon in #9880
- chore(docs): update ordering of logging & observability docs by @marcklingen in #9994
- Updated cohere v2 passthrough by @krrishdholakia in #9997
- [Feat] Add support for `cache_control_injection_points` for Anthropic API, Bedrock API by @ishaan-jaff in #9996 (see the sketch after this list)
- [UI] Allow setting prompt `cache_control_injection_points` by @ishaan-jaff in #10000
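The new `cache_control_injection_points` setting lets LiteLLM mark selected messages (for example, a long system prompt) for Anthropic/Bedrock prompt caching without the caller editing its own messages. A minimal SDK-side sketch, assuming the list-of-`{location, role}` dict shape referenced in #9996; check the LiteLLM prompt-caching docs for the exact schema:

```python
# Hedged sketch: ask LiteLLM to attach cache_control to the system message.
# The injection-point shape below is an assumption based on #9996.
import litellm

response = litellm.completion(
    model="anthropic/claude-3-7-sonnet-20250219",
    messages=[
        {"role": "system", "content": "Very long, reusable system prompt ..."},
        {"role": "user", "content": "Hello!"},
    ],
    cache_control_injection_points=[
        {"location": "message", "role": "system"},  # cache the system message
    ],
)
print(response.choices[0].message.content)
```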
Full Changelog: v1.66.0-nightly...v1.66.1-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.1-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 243.74385918230334 | 6.268015361621096 | 0.0 | 1876 | 0 | 197.45038600001408 | 3855.600032000012 |
Aggregated | Passed ✅ | 220.0 | 243.74385918230334 | 6.268015361621096 | 0.0 | 1876 | 0 | 197.45038600001408 | 3855.600032000012 |
v1.66.0-stable
What's Changed
- build(deps): bump @babel/runtime from 7.26.0 to 7.27.0 in /docs/my-website by @dependabot in #9934
- fix: correct the cost for 'gemini/gemini-2.5-pro-preview-03-25' by @n1lanjan in #9896
- Litellm add managed files db by @krrishdholakia in #9930
- [DB / Infra] Add new column team_member_permissions by @ishaan-jaff in #9941
- fix(factory.py): correct indentation for message index increment in ollama, This fixes bug #9822 by @djshaw01 in #9943
- fix(litellm_proxy_extras): add baselining db script by @krrishdholakia in #9942
- [Team Member permissions] - Fixes by @ishaan-jaff in #9945
- Litellm managed files docs by @krrishdholakia in #9948
- [v1.66.0-stable] Release notes by @ishaan-jaff in #9952
- [Docs] v1.66.0-stable fixes by @ishaan-jaff in #9953
- stable release note fixes by @ishaan-jaff in #9954
- Fix filtering litellm-dashboard keys for internal users + prevent flooding spend logs with admin endpoint errors by @krrishdholakia in #9955
- [UI QA checklist] by @ishaan-jaff in #9957
Full Changelog: v1.65.8-nightly...v1.66.0-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.66.0-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 282.9933559715544 | 5.995117478456652 | 0.0 | 1793 | 0 | 223.97943800001485 | 5176.803935999998 |
Aggregated | Passed ✅ | 250.0 | 282.9933559715544 | 5.995117478456652 | 0.0 | 1793 | 0 | 223.97943800001485 | 5176.803935999998 |
v1.66.0-nightly
What's Changed
- build(deps): bump @babel/runtime from 7.26.0 to 7.27.0 in /docs/my-website by @dependabot in #9934
- fix: correct the cost for 'gemini/gemini-2.5-pro-preview-03-25' by @n1lanjan in #9896
- Litellm add managed files db by @krrishdholakia in #9930
- [DB / Infra] Add new column team_member_permissions by @ishaan-jaff in #9941
- fix(factory.py): correct indentation for message index increment in ollama, This fixes bug #9822 by @djshaw01 in #9943
- fix(litellm_proxy_extras): add baselining db script by @krrishdholakia in #9942
- [Team Member permissions] - Fixes by @ishaan-jaff in #9945
- Litellm managed files docs by @krrishdholakia in #9948
- [v1.66.0-stable] Release notes by @ishaan-jaff in #9952
- [Docs] v1.66.0-stable fixes by @ishaan-jaff in #9953
- stable release note fixes by @ishaan-jaff in #9954
- Fix filtering litellm-dashboard keys for internal users + prevent flooding spend logs with admin endpoint errors by @krrishdholakia in #9955
- [UI QA checklist] by @ishaan-jaff in #9957
Full Changelog: v1.65.8-nightly...v1.66.0-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.0-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 252.49209995793416 | 6.279241190720279 | 0.0 | 1878 | 0 | 200.85592700002053 | 5135.250711999987 |
Aggregated | Passed ✅ | 230.0 | 252.49209995793416 | 6.279241190720279 | 0.0 | 1878 | 0 | 200.85592700002053 | 5135.250711999987 |
v1.65.8-nightly
What's Changed
- Revert avglogprobs change + Add azure/gpt-4o-realtime-audio cost tracking by @krrishdholakia in #9893
- Realtime API: Support 'base_model' cost tracking + show response in spend logs (if enabled) by @krrishdholakia in #9897
- Simplify calling gemini models w/ file id by @krrishdholakia in #9903
- feat: add extraEnvVars to the helm deployment by @mknet3 in #9292
- [Feat - UI] - Allow setting Default Team setting when LiteLLM SSO auto creates teams by @ishaan-jaff in #9918
- Fix typo: Entrata -> Entra in docs by @msabramo in #9921
- [Feat - PR1] Add xAI grok-3 models to LiteLLM by @ishaan-jaff in #9920
- [Feat - Team Member Permissions] - CRUD Endpoints for managing team member permissions by @ishaan-jaff in #9919
- [Feat] Add litellm.supports_reasoning() util to track if an llm supports reasoning by @ishaan-jaff in #9923
- [Feat] Add reasoning_effort support for `xai/grok-3-mini-beta` model family by @ishaan-jaff in #9932 (see the sketch after this list)
- [UI] Render Reasoning content, ttft, usage metrics on test key page by @ishaan-jaff in #9931
- [UI] - Add Managing Team Member permissions on UI by @ishaan-jaff in #9927
- [UI] Linting fixes by @ishaan-jaff in #9933
- Support CRUD endpoints for Managed Files by @krrishdholakia in #9924
- fix(databricks/common_utils.py): fix custom endpoint check by @krrishdholakia in #9925
- fix(transformation.py): correctly translate 'thinking' param for lite… by @krrishdholakia in #9904
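Two of the items above pair naturally: `litellm.supports_reasoning()` (#9923) reports whether a model can emit reasoning content, and `reasoning_effort` (#9932) tunes it for the `xai/grok-3-mini-beta` family. A rough sketch, assuming `XAI_API_KEY` is set; accepted effort values may vary by provider:

```python
# Sketch combining litellm.supports_reasoning() (#9923) with reasoning_effort (#9932).
# Assumes XAI_API_KEY is exported; effort levels are provider-dependent.
import litellm

model = "xai/grok-3-mini-beta"
if litellm.supports_reasoning(model=model):
    response = litellm.completion(
        model=model,
        messages=[{"role": "user", "content": "What is 101 * 17?"}],
        reasoning_effort="low",
    )
    print(response.choices[0].message.content)
else:
    print(f"{model} does not advertise reasoning support")
```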
Full Changelog: v1.65.7-nightly...v1.65.8-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.8-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 248.0753682003237 | 6.194614175051195 | 0.0 | 1852 | 0 | 194.34754100001328 | 4413.887686999999 |
Aggregated | Passed ✅ | 220.0 | 248.0753682003237 | 6.194614175051195 | 0.0 | 1852 | 0 | 194.34754100001328 | 4413.887686999999 |
v1.65.7-nightly
What's Changed
- [Feat SSO] - Allow admins to set `default_team_params` to have default params for when litellm SSO creates default teams by @ishaan-jaff in #9895
- [Feat] Emit Key, Team Budget metrics on a cron job schedule by @ishaan-jaff in #9528
- [Bug Fix MSFT SSO] Use correct field for user email when using MSFT SSO by @ishaan-jaff in #9886
- [Docs] Tutorial using MSFT auto team assignment with LiteLLM by @ishaan-jaff in #9898
Full Changelog: v1.65.6-nightly...v1.65.7-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.7-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 261.69840098746454 | 6.131387078558505 | 0.0 | 1835 | 0 | 214.285206999989 | 3626.6518760000395 |
Aggregated | Passed ✅ | 240.0 | 261.69840098746454 | 6.131387078558505 | 0.0 | 1835 | 0 | 214.285206999989 | 3626.6518760000395 |
v1.65.6-nightly
What's Changed
- Fix anthropic prompt caching cost calc + trim logged message in db by @krrishdholakia in #9838
- feat(realtime/): add token tracking + log usage object in spend logs … by @krrishdholakia in #9843
- fix(cost_calculator.py): handle custom pricing at deployment level fo… by @krrishdholakia in #9855
Full Changelog: v1.65.5-nightly...v1.65.6-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.6-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 209.99145276997868 | 6.188819872716192 | 0.0 | 1852 | 0 | 167.33176299999286 | 4428.401366999992 |
Aggregated | Passed ✅ | 190.0 | 209.99145276997868 | 6.188819872716192 | 0.0 | 1852 | 0 | 167.33176299999286 | 4428.401366999992 |
v1.65.5-nightly
What's Changed
- build: bump litellm-proxy-extras version by @krrishdholakia in #9771
- Update model_prices by @aoaim in #9768
- Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables by @krrishdholakia in #9772
- Add inference providers support for Hugging Face (#8258) (#9738) by @krrishdholakia in #9773
- [UI Bug fix] Don't show duplicate models on Team Admin models page by @ishaan-jaff in #9775
- [UI QA/Bug Fix] - Don't change team, key, org, model values on scroll by @ishaan-jaff in #9776
- [UI Polish] - Polish login screen by @ishaan-jaff in #9778
- Litellm 04 05 2025 release notes by @krrishdholakia in #9785
- feat: add offline swagger docs by @devdev999 in #7653
- fix(gemini/transformation.py): handle file_data being passed in by @krrishdholakia in #9786
- Realtime API Cost tracking by @krrishdholakia in #9795
- fix(vertex_ai.py): move to only passing in accepted keys by vertex ai response schema by @krrishdholakia in #8992
- fix(databricks/chat/transformation.py): remove reasoning_effort from … by @krrishdholakia in #9811
- Handle pydantic base model in message tool calls + Handle tools = [] + handle fireworks ai w/ 'strict' param in function call + support fake streaming on tool calls for meta.llama3-3-70b-instruct-v1:0 by @krrishdholakia in #9774
- Allow passing `thinking` param to litellm proxy via client sdk + Code QA Refactor on get_optional_params (get correct values) by @krrishdholakia in #9386 (see the sketch after this list)
- [Feat] LiteLLM Tag/Policy Management by @ishaan-jaff in #9813
- Remove redundant `apk update` in Dockerfiles (cc #5016) by @PeterDaveHello in #9055
- [Security fix - CVE-2025-0330] - Leakage of Langfuse API keys in team exception handling by @ishaan-jaff in #9830
- [Security Fix CVE-2024-6825] Fix remote code execution in post call rules by @ishaan-jaff in #9826
- Bump next from 14.2.25 to 14.2.26 in /ui/litellm-dashboard by @dependabot in #9716
- fix: claude haiku cache read pricing per token by @hewliyang in #9834
- Add service annotations to litellm-helm chart by @mlhynfield in #9840
- Reflect key and team update in UI by @crisshaker in #9825
- Add user alias to API endpoint by @Jacobh2 in #9859
- Update Azure Phi-4 pricing by @emerzon in #9862
- feat: add enterpriseWebSearch tool for vertex-ai by @qvalentin in #9856
- VertexAI non-jsonl file storage support by @krrishdholakia in #9781
- [Bug Fix] Add support for UploadFile on LLM Pass through endpoints (OpenAI, Azure etc) by @ishaan-jaff in #9853
- [Feat SSO] Debug route - allow admins to debug SSO JWT fields by @ishaan-jaff in #9835
- [Feat] - SSO - Use MSFT Graph API to assign users to teams by @ishaan-jaff in #9865
- Cost tracking for `gemini-2.5-pro` by @krrishdholakia in #9837
- [SSO] Connect LiteLLM to Azure Entra ID Enterprise Application by @ishaan-jaff in #9872
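For #9386 above, the `thinking` parameter can now travel from a standard OpenAI client through the LiteLLM proxy to Anthropic-style models. A minimal sketch, assuming a proxy on localhost:4000 and a model alias named `claude-3-7-sonnet` (both are illustrative placeholders):

```python
# Sketch: forward Anthropic's `thinking` param through the LiteLLM proxy (#9386).
# base_url, api_key, and the model alias are placeholders for your deployment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

response = client.chat.completions.create(
    model="claude-3-7-sonnet",
    messages=[{"role": "user", "content": "Briefly: why is 97 prime?"}],
    extra_body={"thinking": {"type": "enabled", "budget_tokens": 1024}},
)
print(response.choices[0].message.content)
```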
New Contributors
- @aoaim made their first contribution in #9768
- @hewliyang made their first contribution in #9834
- @mlhynfield made their first contribution in #9840
- @crisshaker made their first contribution in #9825
- @qvalentin made their first contribution in #9856
Full Changelog: v1.65.4-nightly...v1.65.5-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.5-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 281.015845046053 | 6.098441913282575 | 0.0 | 1824 | 0 | 213.96507000002885 | 5930.206827000006 |
Aggregated | Passed ✅ | 240.0 | 281.015845046053 | 6.098441913282575 | 0.0 | 1824 | 0 | 213.96507000002885 | 5930.206827000006 |
v1.65.4.dev8
What's Changed
- fix: claude haiku cache read pricing per token by @hewliyang in #9834
- Add service annotations to litellm-helm chart by @mlhynfield in #9840
- Reflect key and team update in UI by @crisshaker in #9825
- Add user alias to API endpoint by @Jacobh2 in #9859
- Update Azure Phi-4 pricing by @emerzon in #9862
- feat: add enterpriseWebSearch tool for vertex-ai by @qvalentin in #9856
- VertexAI non-jsonl file storage support by @krrishdholakia in #9781
- [Bug Fix] Add support for UploadFile on LLM Pass through endpoints (OpenAI, Azure etc) by @ishaan-jaff in #9853
- [Feat SSO] Debug route - allow admins to debug SSO JWT fields by @ishaan-jaff in #9835
New Contributors
- @hewliyang made their first contribution in #9834
- @mlhynfield made their first contribution in #9840
- @crisshaker made their first contribution in #9825
- @qvalentin made their first contribution in #9856
Full Changelog: v1.65.4.dev6...v1.65.4.dev8
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4.dev8
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 271.9459418950253 | 6.118160191369328 | 0.0 | 1829 | 0 | 215.48997299998973 | 3681.300501999999 |
Aggregated | Passed ✅ | 240.0 | 271.9459418950253 | 6.118160191369328 | 0.0 | 1829 | 0 | 215.48997299998973 | 3681.300501999999 |
v1.65.4.dev6
What's Changed
- build: bump litellm-proxy-extras version by @krrishdholakia in #9771
- Update model_prices by @aoaim in #9768
- Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables by @krrishdholakia in #9772
- Add inference providers support for Hugging Face (#8258) (#9738) by @krrishdholakia in #9773
- [UI Bug fix] Don't show duplicate models on Team Admin models page by @ishaan-jaff in #9775
- [UI QA/Bug Fix] - Don't change team, key, org, model values on scroll by @ishaan-jaff in #9776
- [UI Polish] - Polish login screen by @ishaan-jaff in #9778
- Litellm 04 05 2025 release notes by @krrishdholakia in #9785
- feat: add offline swagger docs by @devdev999 in #7653
- fix(gemini/transformation.py): handle file_data being passed in by @krrishdholakia in #9786
- Realtime API Cost tracking by @krrishdholakia in #9795
- fix(vertex_ai.py): move to only passing in accepted keys by vertex ai response schema by @krrishdholakia in #8992
- fix(databricks/chat/transformation.py): remove reasoning_effort from … by @krrishdholakia in #9811
- Handle pydantic base model in message tool calls + Handle tools = [] + handle fireworks ai w/ 'strict' param in function call + support fake streaming on tool calls for meta.llama3-3-70b-instruct-v1:0 by @krrishdholakia in #9774
- Allow passing `thinking` param to litellm proxy via client sdk + Code QA Refactor on get_optional_params (get correct values) by @krrishdholakia in #9386
- [Feat] LiteLLM Tag/Policy Management by @ishaan-jaff in #9813
- Remove redundant `apk update` in Dockerfiles (cc #5016) by @PeterDaveHello in #9055
- [Security fix - CVE-2025-0330] - Leakage of Langfuse API keys in team exception handling by @ishaan-jaff in #9830
- [Security Fix CVE-2024-6825] Fix remote code execution in post call rules by @ishaan-jaff in #9826
- Bump next from 14.2.25 to 14.2.26 in /ui/litellm-dashboard by @dependabot in #9716
Full Changelog: v1.65.4-nightly...v1.65.4.dev6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4.dev6
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 264.1081772527121 | 6.162437450043016 | 0.0 | 1844 | 0 | 200.65376200000173 | 5098.356198000033 |
Aggregated | Passed ✅ | 230.0 | 264.1081772527121 | 6.162437450043016 | 0.0 | 1844 | 0 | 200.65376200000173 | 5098.356198000033 |
v1.65.4-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.4-stable
pip install LiteLLM Proxy
pip install litellm==1.65.4.post1
What's Changed
- Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9618
- fix(logging): add json formatting for uncaught exceptions (#9615) by @krrishdholakia in #9619
- fix: wrong indentation of ttlSecondsAfterFinished in chart by @Dbzman in #9611
- Fix anthropic thinking + response_format by @krrishdholakia in #9594
- Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9625
- fix(openrouter/chat/transformation.py): raise informative message for openrouter key error by @krrishdholakia in #9626
- [Reliability] - Reduce DB Deadlocks by storing spend updates in Redis and then committing to DB by @ishaan-jaff in #9608
- [Refactor] - Use single class for managing DB update spend transactions by @ishaan-jaff in #9600
- Add bedrock latency optimized inference support + Vertex AI Multimodal embedding cost tracking by @krrishdholakia in #9623
- build(pyproject.toml): add new dev dependencies - for type checking by @krrishdholakia in #9631
- install prisma migration files - connects litellm proxy to litellm's prisma migration files by @krrishdholakia in #9637
- update docs for openwebui by @tan-yong-sheng in #9636
- Add gemini audio input support + handle special tokens in sagemaker response by @krrishdholakia in #9640
- [Docs - Release notes v0] v1.65.0-stable by @ishaan-jaff in #9643
- [Feat] - MCP improvements, add support for using SSE MCP servers by @ishaan-jaff in #9642
- [FIX] - Add password to sync sentinel client by @jmarshall-medallia in #9622
- fix: Anthropic prompt caching on GCP Vertex AI by @sammcj in #9605
- Fixes Databricks llama3.3-70b endpoint and add databricks claude 3.7 sonnet endpoint by @anton164 in #9661
- fix(docs): update xAI Grok vision model reference by @colesmcintosh in #9286
- docs(gemini): fix typo by @GabrielLoiseau in #9581
- Update all_caches.md by @KPCOFGS in #9562
- [Bug fix] - Sagemaker endpoint with inference component streaming by @ishaan-jaff in #9515
- Revert "Correct Databricks llama3.3-70b endpoint and add databricks c… by @krrishdholakia in #9668
- Revert "fix: Anthropic prompt caching on GCP Vertex AI" by @krrishdholakia in #9670
- [Refactor] - Expose litellm.messages.acreate() and litellm.messages.create() to make LLM API calls in Anthropic API spec by @ishaan-jaff in #9567
- Openrouter streaming fixes + Anthropic 'file' message support by @krrishdholakia in #9667
- fix(cost_calculator.py): allows checking received + sent model name w… by @krrishdholakia in #9669
- Revert "Revert "Correct Databricks llama3.3-70b endpoint and add databricks c…" by @krrishdholakia in #9676
- Update model_prices_and_context_window.json add gemini-2.5-pro-exp-03-25 by @superpoussin22 in #9650
- fix(proxy_server.py): Fix "Circular reference detected" error when max_parallel_requests = 0 by @krrishdholakia in #9671
- UI (new_usage.tsx): Report 'total_tokens' + report success/failure calls by @krrishdholakia in #9675
- [Reliability] - Ensure new Redis + DB architecture tracks spend accurately by @ishaan-jaff in #9673
- [Bug fix] - Service accounts - only apply `service_account_settings.enforced_params` on service accounts by @ishaan-jaff in #9683
- UI - New Usage Tab fixes by @krrishdholakia in #9696
- [Reliability Fixes] - Ensure no deadlocks occur when updating `DailyUserSpendTransaction` by @ishaan-jaff in #9690
- Virtual key based policies in Aim Guardrails by @hxtomer in #9499
- fix(streaming_handler.py): fix completion start time tracking + Anthropic 'reasoning_effort' param mapping by @krrishdholakia in #9688
- Litellm user daily activity allow non admin usage by @krrishdholakia in #9695
- fix(model_management_endpoints.py): fix allowing team admins to update team models by @krrishdholakia in #9697
- Add support for max_completion_tokens to the Cohere chat transformati… by @simha104 in #9701
- fix(gemini/): add gemini/ route embedding optional param mapping support by @krrishdholakia in #9677
- Add Google AI Studio `/v1/files` upload API support by @krrishdholakia in #9645
- [Docs] High Availability Setup (Resolve DB Deadlocks) by @ishaan-jaff in #9714
- Bump image-size from 1.1.1 to 1.2.1 in /docs/my-website by @dependabot in #9708
- [Bug fix] Azure o-series tool calling by @ishaan-jaff in #9694
- [Reliability Fix] - Use Redis for PodLock Manager instead of PG (ensures no deadlocks occur) by @ishaan-jaff in #9715
- Ban hardcoded numbers - merge of #9513 by @krrishdholakia in #9709
- [Feat] Add VertexAI gemini-2.0-flash by @Dobiasd in #9723
- Fix: Use request body in curl log for Gemini streaming mode by @fengjiajie in #9736
- LiteLLM Minor Fixes & Improvements (04/02/2025) by @krrishdholakia in #9725
- fix:Gemini Flash 2.0 implementation is not returning the logprobs by @sajdakabir in #9713
- UI Improvements + Fixes - remove 'default key' on user signup + fix showing user models available for personal key creation by @krrishdholakia in #9741
- Fix prompt caching for Anthropic tool calls by @aorwall in #9706
- passthrough kwargs during acompletion, and unwrap extra_body for openrouter by @adrianlyjak in #9747
- [Feat] UI - Test Key v2 page - allow testing image endpoints + polish the page by @ishaan-jaff in #9748
- [Feat] Allow assigning SSO users to teams on MSFT SSO by @ishaan-jaff in #9745
- Fix VertexAI Credential Caching issue by @krrishdholakia in #9756
- [Reliability] v2 DB Deadlock Reduction Architecture – Add Max Size for In-Memory Queue + Backpressure Mechanism by @ishaan-jaff in #9759
- fix(router.py): support reusable credentials via passthrough router by @krrishdholakia in #9758
- Allow team members to see team models by @krrishdholakia in #9742
- fix(xai/chat/transformation.py): filter out 'name' param for xai non-… by @krrishdholakia in #9761
- Gemini image generation output support by @krrishdholakia in #9646
- [Fix] issue that metadata key exist, but value is None by @chaosddp in #9764
- fix(asr-groq): add groq whisper models to model cost map by @liuhu in #9648
- Update model_prices_and_context_window.json by @caramulrooney in #9620
- [Reliability] Emit operational metrics for new DB Transaction architecture by @ishaan-jaff in #9719
- [Security feature] Allow adding authentication on /metrics endpoints by @ishaan-jaff in #9766
- [Reliability] Prometheus emit llm provider on failure metric - make it easy to differentiate litellm error vs llm api error by @ishaan-jaff in #9760
- Fix prisma migrate deploy to use correct directory by @krrishdholakia in #9767
- Add DBRX Anthropic w/ thinking + response_format support by @krrishdholakia in #9744
- build: bump litellm-proxy-extras version by @krrishdholakia in #9771
- Update model_prices by @aoaim in #9768
- Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables by @krrishdholakia in https://gith...