analogrelay
Member

This PR adds an initial version of the Container Metadata Cache, which caches container metadata that is needed frequently but changes rarely. Right now, that's limited to the RID, but the patterns are here to allow us to add more cached metadata as needed.

To create an async single-flight (defined shortly) cache, I decided to use a third-party crate rather than building it myself. I did start down the path of building it myself, and we can certainly do so, but as I got into testing I realized that I'd much rather use something well-tested that already exists for now. I chose the top result on crates.io when looking for an async cache: moka, which crates.io itself uses. It is frequently updated and widely used in the ecosystem. Plus, we don't expose anything from it in our public API, so we can replace it later without risk to users.

The main reason the cache is complicated is that I really wanted to make sure it was both async-friendly (i.e. works with async/await) and "single-flight", meaning there's only ever one outstanding request to update a given cache key. Those familiar with .NET might think of Lazy<T>'s LazyThreadSafetyMode. A single-flight cache is similar to LazyThreadSafetyMode.ExecutionAndPublication, in that it ensures that only a single initialization function is running at a time. Moka does this via get_with (we actually use get_with_by_ref, but it's very similar), which guarantees:

... that concurrent calls on the same not-existing key are coalesced into one evaluation of the init future. Only one of the calls evaluates its future, and other calls wait for that future to resolve.

https://docs.rs/moka/0.12.11/moka/future/struct.Cache.html#method.get_with
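To make the single-flight guarantee concrete, here is a minimal sketch of the same idea using only the standard library. It is a thread-based analogue, not the async moka cache this PR uses, and it has no eviction or expiry. The names `SingleFlightCache`, `get_with`, `run_demo`, and the placeholder key and RID strings are all hypothetical and chosen for illustration only.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;
use std::time::Duration;

/// Hypothetical single-flight cache: at most one initializer runs per key.
pub struct SingleFlightCache<V> {
    slots: Mutex<HashMap<String, Arc<OnceLock<V>>>>,
}

impl<V: Clone> SingleFlightCache<V> {
    pub fn new() -> Self {
        Self { slots: Mutex::new(HashMap::new()) }
    }

    /// Returns the cached value for `key`, running `init` only if no other
    /// caller has already initialized (or is currently initializing) it.
    pub fn get_with(&self, key: &str, init: impl FnOnce() -> V) -> V {
        // Hold the map lock only long enough to find or create the slot,
        // so a slow initializer for one key does not block other keys.
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            let slot: &Arc<OnceLock<V>> = slots.entry(key.to_owned()).or_default();
            Arc::clone(slot)
        };
        // OnceLock executes exactly one initializer; concurrent callers
        // block until it finishes and then observe the same value.
        slot.get_or_init(init).clone()
    }
}

/// Spawns eight threads that all request the same key and reports the
/// values they saw plus how many times the (simulated) fetch actually ran.
pub fn run_demo() -> (Vec<String>, usize) {
    let cache = Arc::new(SingleFlightCache::<String>::new());
    let fetches = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..8)
        .map(|_| {
            let cache = Arc::clone(&cache);
            let fetches = Arc::clone(&fetches);
            thread::spawn(move || {
                cache.get_with("dbs/db1/colls/c1", || {
                    // Stand-in for a network round trip to read metadata.
                    fetches.fetch_add(1, Ordering::SeqCst);
                    thread::sleep(Duration::from_millis(20));
                    "rid-placeholder".to_string()
                })
            })
        })
        .collect();

    let rids: Vec<String> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    (rids, fetches.load(Ordering::SeqCst))
}
```

Calling `run_demo()` should report a single fetch even though eight callers asked for the key. An async version needs the waiting callers to yield rather than block a thread, which is the part that is hard to do without depending on a specific runtime.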

Not all of our SDKs do this kind of optimization, but I believe the .NET one does (from my reading) and if the Rust SDK is to serve as a core reference implementation, it seems important for it to do it as well.

Building an async single-flight cache in a runtime-agnostic way (i.e. not directly depending on tokio, smol, or some other async runtime) is possible, but complicated, so I wanted to let Moka do all that work for us for now ;).

@github-actions github-actions bot added the Cosmos The azure_cosmos crate label Oct 3, 2025
"AssetsRepo": "Azure/azure-sdk-assets",
"AssetsRepoPrefixPath": "rust",
"Tag": "rust/azure_data_cosmos_a39b424a5b",
"Tag": "rust/azure_data_cosmos_69ad1e4995",

github-actions bot commented Oct 3, 2025

API Change Check

APIView identified API level changes in this PR and created the following API reviews

azure_data_cosmos

@analogrelay analogrelay force-pushed the ashleyst/container-meta-cache branch from 41f99c8 to f11022e Compare October 7, 2025 17:45
@analogrelay analogrelay marked this pull request as ready for review October 10, 2025 17:59
@Copilot Copilot AI review requested due to automatic review settings October 10, 2025 17:59
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Adds a container metadata cache (single‑flight, async) to reduce repeated metadata fetches (e.g. for throughput reads) and introduces a ResourceId newtype. Refactors the public surface from an internal CosmosPipeline to a CosmosConnection while wiring caching into ContainerClient and updating tests to validate reduced network calls. Also extends tooling (crate name extraction) and dictionaries, and adds the moka dependency.

  • Introduces ContainerMetadataCache using moka for single-flight async caching of stable container metadata (currently resource ID).
  • Refactors clients (CosmosClient, DatabaseClient, ContainerClient, query executor) to use CosmosConnection and ResourceId newtype.
  • Adds test and LocalRecorder policy to assert metadata is fetched once across repeated throughput reads.
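As a rough illustration of the newtype pattern mentioned in the list above, the sketch below declares a string-backed `ResourceId` through a small macro. The macro name `string_newtype!`, the derive set, the `new`/`as_str` methods, and the `distinct_count` helper are assumptions made for this example; the PR's actual macro and API are not shown in this conversation and may differ.

```rust
use std::collections::HashSet;

/// Hypothetical macro that declares a string-backed newtype with the
/// derives needed to compare it and use it in hashed collections.
macro_rules! string_newtype {
    ($(#[$meta:meta])* $name:ident) => {
        $(#[$meta])*
        #[derive(Clone, Debug, PartialEq, Eq, Hash)]
        pub struct $name(String);

        impl $name {
            /// Wraps any string-like value.
            pub fn new(value: impl Into<String>) -> Self {
                Self(value.into())
            }

            /// Borrows the underlying string.
            pub fn as_str(&self) -> &str {
                &self.0
            }
        }
    };
}

string_newtype!(
    /// Opaque resource ID (RID) assigned by the service.
    ResourceId
);

/// Counts distinct IDs; only compiles because the newtype derives Eq + Hash.
pub fn distinct_count(ids: &[ResourceId]) -> usize {
    ids.iter().collect::<HashSet<_>>().len()
}
```

The benefit of the wrapper is that a function taking `&ResourceId` cannot be handed an arbitrary `&str` such as a container name by mistake, while the value still behaves like a string wherever `as_str()` is called.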

Reviewed Changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| sdk/cosmos/azure_data_cosmos/tests/framework/test_account.rs | Adds optional LocalRecorder injection into test client options. |
| sdk/cosmos/azure_data_cosmos/tests/framework/mod.rs | Exposes LocalRecorder and TestAccountOptions, adjusts lint allowances. |
| sdk/cosmos/azure_data_cosmos/tests/framework/local_recorder.rs | Implements a simple recording policy for test assertions. |
| sdk/cosmos/azure_data_cosmos/tests/cosmos_containers.rs | Adds test ensuring container metadata is fetched only once due to caching. |
| sdk/cosmos/azure_data_cosmos/src/types.rs | Introduces ResourceId newtype via macro. |
| sdk/cosmos/azure_data_cosmos/src/resource_context.rs | Derives Hash for ResourceType/ResourceLink to allow cache keying. |
| sdk/cosmos/azure_data_cosmos/src/query/executor.rs | Switches from CosmosPipeline to CosmosConnection for query operations. |
| sdk/cosmos/azure_data_cosmos/src/models/mod.rs | Changes system_properties.resource_id to ResourceId. |
| sdk/cosmos/azure_data_cosmos/src/lib.rs | Adds cache, connection, types modules; re-exports ResourceId; removes pipeline module. |
| sdk/cosmos/azure_data_cosmos/src/connection/signature_target.rs | Updates module path after refactor. |
| sdk/cosmos/azure_data_cosmos/src/connection/mod.rs | Implements CosmosConnection with shared pipeline and metadata cache; updates throughput methods. |
| sdk/cosmos/azure_data_cosmos/src/connection/authorization_policy.rs | Updates module path references. |
| sdk/cosmos/azure_data_cosmos/src/clients/database_client.rs | Refactors to use CosmosConnection. |
| sdk/cosmos/azure_data_cosmos/src/clients/cosmos_client.rs | Refactors to use CosmosConnection; adjusts derives and endpoints. |
| sdk/cosmos/azure_data_cosmos/src/clients/container_client.rs | Integrates metadata caching for container reads and throughput operations. |
| sdk/cosmos/azure_data_cosmos/src/cache.rs | Implements ContainerMetadataCache and caching logic. |
| sdk/cosmos/azure_data_cosmos/Cargo.toml | Adds moka dependency. |
| sdk/cosmos/assets.json | Updates assets tag. |
| eng/scripts/update-cratenames.rs | Expands crate name extraction to include all dependency sections and removes duplicates. |
| eng/dict/rust-custom.txt | Adds newtypes to custom dictionary. |
| eng/dict/crates.txt | Regenerates crate name list including new / reorganized crates and moka. |
| Cargo.toml | Adds moka to workspace dependencies. |

Member

@heaths heaths left a comment

Changes outside sdk/cosmos LGTM.

@heaths
Member

heaths commented Oct 10, 2025

Blocked by #3180. I overrode the pipeline trigger for now. Will try to kick this...

@Azure Azure deleted a comment from azure-pipelines bot Oct 10, 2025
@heaths
Member

heaths commented Oct 10, 2025

/azp run rust - pullrequest

Azure Pipelines successfully started running 1 pipeline(s).

@heaths
Member

heaths commented Oct 10, 2025

/azp run rust - storage-blob - perf

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

@heaths
Member

heaths commented Oct 10, 2025

Can't get rid of the broken pipeline now. Will try to close and re-open.

@heaths heaths closed this Oct 10, 2025
@heaths heaths reopened this Oct 10, 2025
@heaths
Member

heaths commented Oct 10, 2025

Didn't work. @analogrelay when you're ready to merge, let me know and I can force it in. /cc @LarryOsterman
