fix: clear model cache when run.yaml model list changes #3198

Ygnas · 2025-08-19T08:59:35Z

What does this PR do?

closes: #3150

Fixes the model cache not being cleared when the model list in run.yaml changes, by adding cleanup logic, and adds unit tests.

Not sure what to make of all the models marked as listed_from_provider; at the moment they are still there.

Test Plan

Add two models, one via the register API:

curl -X POST http://127.0.0.1:8321/v1/models \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "my_llm",
    "provider_model_id": "gpt-3.5-turbo-0125",
    "provider_id": "openai",
    "model_type": "llm",
    "metadata": {}
  }'

And one in run.yaml:

models:
  - metadata: {}
    model_id: testing-model
    provider_id: openai
    model_type: llm
    provider_model_id: gpt-4o-mini

The returned models look like:

    "data": [
        {
            "identifier": "my_llm",
            "provider_resource_id": "gpt-3.5-turbo-0125",
            "provider_id": "openai",
            "type": "model",
            "metadata": {},
            "model_type": "llm"
        },
        {
            "identifier": "testing-model",
            "provider_resource_id": "gpt-4o-mini",
            "provider_id": "openai",
            "type": "model",
            "metadata": {},
            "model_type": "llm"
        },
...

After removing the model from run.yaml and restarting the server, only the manually added one remains:

    "data": [
        {
            "identifier": "my_llm",
            "provider_resource_id": "gpt-3.5-turbo-0125",
            "provider_id": "openai",
            "type": "model",
            "metadata": {},
            "model_type": "llm"
        },
...
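The before/after comparison in this test plan could be scripted roughly as follows. This is only a sketch: it assumes that `GET /v1/models` on the same host and port as the curl command returns a JSON body with a `data` list, as in the excerpts above. The helper names (`fetch_models`, `model_ids`, `check_after_restart`) are hypothetical and are not part of llama_stack.

```python
import json
from urllib.request import urlopen

# Same host/port as the curl command in the test plan.
BASE_URL = "http://127.0.0.1:8321"


def fetch_models(base_url: str = BASE_URL) -> dict:
    """Fetch and parse the model list (assumes GET /v1/models is available)."""
    with urlopen(f"{base_url}/v1/models") as resp:
        return json.load(resp)


def model_ids(payload: dict) -> set[str]:
    """Collect the `identifier` of every entry under `data`."""
    return {m["identifier"] for m in payload.get("data", [])}


def check_after_restart(before: dict, after: dict, removed: str, kept: str) -> bool:
    """True if `removed` was listed before but not after, and `kept` is still listed."""
    ids_before, ids_after = model_ids(before), model_ids(after)
    return removed in ids_before and removed not in ids_after and kept in ids_after
```

To use it, call `fetch_models()` once before editing run.yaml and once after the restart, then pass both payloads to `check_after_restart(before, after, removed="testing-model", kept="my_llm")`.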

llama_stack/core/routing_tables/models.py

ashwinb

I don't think we should do "store + cleanup". Instead, let's just not store things we don't want to be stored persistently. It seems like what we want to say generically is that there are three ways of registering models:

user provided (via the register API)
admin provided (via run.yaml)
provider provided (listed via the provider)

We need to store only the first kind; the rest are dynamically refreshed / loaded on every single startup of the server.
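A minimal sketch of that idea, assuming a plain dict stands in for the persistent store. Every name here (`ModelSource`, `ModelEntry`, `ModelRegistry`) is hypothetical and chosen for illustration; this is not the llama_stack routing table code.

```python
from dataclasses import dataclass
from enum import Enum


class ModelSource(Enum):
    USER = "user"          # registered via the register API
    ADMIN = "admin"        # declared in run.yaml
    PROVIDER = "provider"  # listed by the provider


@dataclass(frozen=True)
class ModelEntry:
    identifier: str
    provider_id: str
    source: ModelSource


class ModelRegistry:
    """Toy registry: only USER entries are written to the persistent store."""

    def __init__(self, persistent_store: dict[str, ModelEntry]):
        self._store = persistent_store              # survives restarts
        self._dynamic: dict[str, ModelEntry] = {}   # rebuilt on every startup

    def register(self, entry: ModelEntry) -> None:
        # Persist user registrations; keep everything else in memory only.
        if entry.source is ModelSource.USER:
            self._store[entry.identifier] = entry
        else:
            self._dynamic[entry.identifier] = entry

    def list_models(self) -> list[str]:
        # Identifiers from both the persisted and the in-memory entries.
        return sorted({**self._dynamic, **self._store})
```

Creating a second `ModelRegistry` over the same store models a server restart. A model that came from run.yaml and is no longer registered at startup disappears without any cleanup step, because it was never persisted.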

llama_stack/core/stack.py

cdoern · 2025-09-17T23:39:22Z

@Ygnas are you still working on this one?

meta-cla bot added the CLA Signed label Aug 19, 2025

Ygnas changed the title ~~Fix model cache not clearing when run.yaml model list changes~~ fix: clear model cache when run.yaml model list changes Aug 19, 2025

Ygnas force-pushed the model-cache branch 3 times, most recently from f32e42a to 91cf6a7 Compare August 19, 2025 11:31

Ygnas marked this pull request as ready for review August 19, 2025 11:35

Ygnas requested review from ashwinb, bbrowning, ehhuang, hardikjshah, leseb, mattf, raghotham, reluctantfuturist, slekkala1, terrytangyuan and yanxi0830 as code owners August 19, 2025 11:35

mattf mentioned this pull request Aug 19, 2025

fix: list models only for active providers #3143

Closed

ashwinb reviewed Aug 20, 2025

llama_stack/core/routing_tables/models.py (outdated, resolved)

ashwinb requested changes Sep 11, 2025

llama_stack/core/stack.py (resolved)

llama_stack/core/stack.py (outdated, resolved)

Ygnas force-pushed the model-cache branch 2 times, most recently from 12d9a08 to db63f94 Compare September 18, 2025 10:30

Ygnas requested a review from ashwinb September 18, 2025 10:30

Ygnas changed the title ~~fix: clear model cache when run.yaml model list changes~~ WIP: fix: clear model cache when run.yaml model list changes Sep 18, 2025

Ygnas added 2 commits September 18, 2025 14:08

fix: clear model cache when run.yaml model list changes

9e79e91

test: add tests for model not persistant models

5250349

Ygnas force-pushed the model-cache branch from db63f94 to 5250349 Compare September 18, 2025 13:09

Ygnas changed the title ~~WIP: fix: clear model cache when run.yaml model list changes~~ fix: clear model cache when run.yaml model list changes Sep 18, 2025

4 participants
