Cosmos: Container Metadata Cache #3109
"AssetsRepo": "Azure/azure-sdk-assets",
"AssetsRepoPrefixPath": "rust",
"Tag": "rust/azure_data_cosmos_a39b424a5b",
"Tag": "rust/azure_data_cosmos_69ad1e4995",
API Change Check
APIView identified API level changes in this PR and created the following API reviews
41f99c8 to f11022e
Pull Request Overview
Adds a container metadata cache (single‑flight, async) to reduce repeated metadata fetches (e.g. for throughput reads) and introduces a ResourceId newtype. Refactors the public surface from an internal CosmosPipeline to a CosmosConnection while wiring caching into ContainerClient and updating tests to validate reduced network calls. Also extends tooling (crate name extraction) and dictionaries, and adds the moka dependency.
- Introduces ContainerMetadataCache using moka for single-flight async caching of stable container metadata (currently resource ID).
- Refactors clients (CosmosClient, DatabaseClient, ContainerClient, query executor) to use CosmosConnection and ResourceId newtype.
- Adds test and LocalRecorder policy to assert metadata is fetched once across repeated throughput reads.
Reviewed Changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 4 comments.
File | Description |
---|---|
sdk/cosmos/azure_data_cosmos/tests/framework/test_account.rs | Adds optional LocalRecorder injection into test client options. |
sdk/cosmos/azure_data_cosmos/tests/framework/mod.rs | Exposes LocalRecorder and TestAccountOptions, adjusts lint allowances. |
sdk/cosmos/azure_data_cosmos/tests/framework/local_recorder.rs | Implements a simple recording policy for test assertions. |
sdk/cosmos/azure_data_cosmos/tests/cosmos_containers.rs | Adds test ensuring container metadata is fetched only once due to caching. |
sdk/cosmos/azure_data_cosmos/src/types.rs | Introduces ResourceId newtype via macro. |
sdk/cosmos/azure_data_cosmos/src/resource_context.rs | Derives Hash for ResourceType/ResourceLink to allow cache keying. |
sdk/cosmos/azure_data_cosmos/src/query/executor.rs | Switches from CosmosPipeline to CosmosConnection for query operations. |
sdk/cosmos/azure_data_cosmos/src/models/mod.rs | Changes system_properties.resource_id to ResourceId. |
sdk/cosmos/azure_data_cosmos/src/lib.rs | Adds cache, connection, types modules; re-exports ResourceId; removes pipeline module. |
sdk/cosmos/azure_data_cosmos/src/connection/signature_target.rs | Updates module path after refactor. |
sdk/cosmos/azure_data_cosmos/src/connection/mod.rs | Implements CosmosConnection with shared pipeline and metadata cache; updates throughput methods. |
sdk/cosmos/azure_data_cosmos/src/connection/authorization_policy.rs | Updates module path references. |
sdk/cosmos/azure_data_cosmos/src/clients/database_client.rs | Refactors to use CosmosConnection. |
sdk/cosmos/azure_data_cosmos/src/clients/cosmos_client.rs | Refactors to use CosmosConnection; adjusts derives and endpoints. |
sdk/cosmos/azure_data_cosmos/src/clients/container_client.rs | Integrates metadata caching for container reads and throughput operations. |
sdk/cosmos/azure_data_cosmos/src/cache.rs | Implements ContainerMetadataCache and caching logic. |
sdk/cosmos/azure_data_cosmos/Cargo.toml | Adds moka dependency. |
sdk/cosmos/assets.json | Updates assets tag. |
eng/scripts/update-cratenames.rs | Expands crate name extraction to include all dependency sections and removes duplicates. |
eng/dict/rust-custom.txt | Adds newtypes to custom dictionary. |
eng/dict/crates.txt | Regenerates crate name list including new / reorganized crates and moka. |
Cargo.toml | Adds moka to workspace dependencies. |
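For readers unfamiliar with the pattern, the "ResourceId newtype via macro" entry for src/types.rs can be pictured roughly as below. This is only a sketch: the macro name string_newtype, the methods new and as_str, and the derive list are assumptions made for illustration and are not taken from the PR. The Hash derive is included because the summary mentions deriving Hash elsewhere to allow cache keying.

```rust
use std::fmt;

// Hypothetical macro, not the one in the PR: wraps a String in a distinct
// type so that, for example, a resource ID cannot be passed where some
// other string-typed value is expected.
macro_rules! string_newtype {
    ($(#[$meta:meta])* $name:ident) => {
        $(#[$meta])*
        #[derive(Clone, Debug, PartialEq, Eq, Hash)]
        pub struct $name(String);

        impl $name {
            /// Wraps any string-like value.
            pub fn new(value: impl Into<String>) -> Self {
                Self(value.into())
            }

            /// Borrows the underlying string.
            pub fn as_str(&self) -> &str {
                &self.0
            }
        }

        impl fmt::Display for $name {
            fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                f.write_str(&self.0)
            }
        }
    };
}

string_newtype!(
    /// Illustrative stand-in for the ResourceId type named in the PR.
    ResourceId
);
```

Because the inner field is private, callers go through new and as_str, which leaves room to change the representation later without touching call sites.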
Changes outside sdk/cosmos LGTM.
Co-authored-by: Copilot <[email protected]>
Blocked by #3180. I overrode the pipeline trigger for now. Will try to kick this...
/azp run rust - pullrequest
Azure Pipelines successfully started running 1 pipeline(s).
/azp run rust - storage-blob - perf
Azure Pipelines could not run because the pipeline triggers exclude this branch/path.
Can't get rid of the broken pipeline now. Will try to close and re-open.
Didn't work. @analogrelay when you're ready to merge, let me know and I can force it in. /cc @LarryOsterman
This PR adds an initial version of the Container Metadata Cache, which caches metadata about a container that is frequently needed but infrequently changes. Right now, that's limited to the RID, but the patterns are here to allow us to add more cached metadata as needed.
To create an async single-flight (defined shortly) cache, I decided to use a third-party crate rather than building it myself. I did start along the path of building it myself and we can certainly do so, but as I got into testing I realized that I'd much rather use something well-tested that already exists for now. I chose the top result on crates.io when looking for an async cache: moka, which crates.io itself uses. It is frequently updated and highly used in the ecosystem. Plus, we don't expose anything from it on our public API, so we can replace it later without risk to users.
The main reason the cache is complicated is that I really wanted to make sure the cache was both async-friendly (i.e. works with async/await) and "single-flight", meaning there's only ever one outstanding request to update a given cache key. Those familiar with .NET might think about Lazy<T>'s "LazyThreadSafetyMode". A single-flight cache is similar to LazyThreadSafetyMode.ExecutionAndPublication, in that it ensures that only a single initialization function is running at a time. Moka does this via get_with (we actually use get_with_by_ref, but it's very similar), which guarantees:
Not all of our SDKs do this kind of optimization, but I believe the .NET one does (from my reading) and if the Rust SDK is to serve as a core reference implementation, it seems important for it to do it as well.
Building an async single-flight cache in a runtime-agnostic way (i.e. not directly depending on tokio, smol, or some other async runtime) is possible, but complicated, so I wanted to let Moka do all that work for us for now ;).
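To make "single-flight" concrete, here is a minimal sketch of the idea using only the standard library. It is deliberately synchronous (threads plus OnceLock) and is not what this PR does: the PR relies on moka's async cache. The type SingleFlightCache, its get_with method, the demo function, the key string, and the value "hypothetical-rid" are all made up for illustration.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;

/// Hypothetical std-only, blocking single-flight cache (not the PR's type).
struct SingleFlightCache {
    // One shared slot per key. The map lock is held only long enough to
    // find or create the slot, never while the initializer runs.
    slots: Mutex<HashMap<String, Arc<OnceLock<String>>>>,
}

impl SingleFlightCache {
    fn new() -> Self {
        Self { slots: Mutex::new(HashMap::new()) }
    }

    /// Returns the cached value for `key`, running `init` only if no value
    /// has been stored yet.
    fn get_with(&self, key: &str, init: impl FnOnce() -> String) -> String {
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            slots.entry(key.to_owned()).or_default().clone()
        };
        // OnceLock::get_or_init runs at most one initializer per slot;
        // concurrent callers block until that initializer has finished.
        slot.get_or_init(init).clone()
    }
}

/// Spawns eight threads that request the same key and reports the values
/// they saw plus how many times the simulated fetch actually ran.
fn demo() -> (Vec<String>, usize) {
    let cache = Arc::new(SingleFlightCache::new());
    let fetches = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..8)
        .map(|_| {
            let cache = Arc::clone(&cache);
            let fetches = Arc::clone(&fetches);
            thread::spawn(move || {
                cache.get_with("dbs/db1/colls/c1", || {
                    // Stand-in for the network call that reads metadata.
                    fetches.fetch_add(1, Ordering::SeqCst);
                    "hypothetical-rid".to_string()
                })
            })
        })
        .collect();

    let values = handles.into_iter().map(|h| h.join().unwrap()).collect();
    (values, fetches.load(Ordering::SeqCst))
}
```

Calling demo() should report a single fetch no matter how the threads interleave, which mirrors what the new test asserts about repeated throughput reads. An async version additionally has to park waiting tasks without blocking an executor thread, which is the part that is hard to do in a runtime-agnostic way.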