Update caching documentation #1267

Merged
merged 2 commits into from
Feb 14, 2025
11 changes: 8 additions & 3 deletions doc/CONFIGURATION.md
@@ -319,7 +319,7 @@ See [mounthelper.go](https://github.com/awslabs/mountpoint-s3/tree/main/examples

## Caching configuration

Mountpoint can optionally cache object metadata and content to reduce cost and improve performance for repeated reads to the same file.
Mountpoint can optionally cache file system metadata and object content to reduce cost and improve performance for repeated reads to the same file.
Mountpoint can serve [supported file system requests](./SEMANTICS.md) from the cache, excluding listing of directory contents.

### Metadata Cache
@@ -331,8 +331,13 @@ It can be set to a positive numerical value in seconds, or to one of the pre-con
> Caching of metadata entries relaxes the strong read-after-write consistency offered by Amazon S3 and Mountpoint in its default configuration.
> See the [consistency and concurrency section of the semantics documentation](./SEMANTICS.md#consistency-and-concurrency) for more details.

When configured with metadata caching, on its own or in conjunction with local cache or shared cache, Mountpoint will typically perform fewer requests to the mounted S3 bucket, but will not guarantee that the information it reports is up to date with the content of the mounted S3 bucket.
You can use the `--metadata-ttl` flag to choose the appropriate trade off between consistency (`--metadata-ttl minimal`) and performance/cost optimization (`--metadata-ttl indefinite`), depending on the requirements of your workload.
The `--metadata-ttl` flag controls how long Mountpoint considers its file system metadata (file existence, size, object etag, etc.) accurate before re-fetching it from S3.
When configured, on its own or in conjunction with local cache or shared cache, Mountpoint will typically perform fewer requests to the mounted S3 bucket, but will not guarantee that the information it reports
is up to date with the content of the mounted S3 bucket.
When configured with a local cache or shared cache, the stored data is considered accurate until the metadata TTL expires.
After this period, Mountpoint revalidates the cached data by checking that the object's etag has not changed.
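
The behavior described above can be pictured with a small toy model. This is only a sketch of the idea (serve from cache within the TTL, then compare etags once it expires), not Mountpoint's implementation: `RevalidatingCache`, `CacheEntry`, the in-memory `bucket` dictionary, and the fake clock are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CacheEntry:
    etag: str            # etag observed when the entry was last validated
    data: bytes          # cached object content
    validated_at: float  # time (seconds) of the last successful validation


class RevalidatingCache:
    """Toy model: serve from cache within the TTL, revalidate by etag after it."""

    def __init__(self, ttl_seconds: float,
                 fetch_etag: Callable[[str], str],
                 fetch_data: Callable[[str], bytes],
                 clock: Callable[[], float]) -> None:
        self.ttl_seconds = ttl_seconds
        self.fetch_etag = fetch_etag
        self.fetch_data = fetch_data
        self.clock = clock
        self.entries: Dict[str, CacheEntry] = {}
        self.requests = 0  # counts simulated requests to the bucket

    def read(self, key: str) -> bytes:
        now = self.clock()
        entry = self.entries.get(key)
        # Within the TTL the entry is trusted without contacting the bucket.
        if entry is not None and now - entry.validated_at < self.ttl_seconds:
            return entry.data
        # TTL expired (or nothing cached): look up the current etag.
        self.requests += 1
        etag = self.fetch_etag(key)
        if entry is not None and entry.etag == etag:
            entry.validated_at = now  # unchanged: keep the data, restart the TTL
            return entry.data
        # Missing or changed object: fetch the content again.
        self.requests += 1
        data = self.fetch_data(key)
        self.entries[key] = CacheEntry(etag, data, now)
        return data


# Hypothetical in-memory "bucket": key -> (etag, content).
bucket = {"report.csv": ("etag-1", b"old")}
now = [0.0]  # fake clock, in seconds
cache = RevalidatingCache(
    ttl_seconds=300,
    fetch_etag=lambda key: bucket[key][0],
    fetch_data=lambda key: bucket[key][1],
    clock=lambda: now[0],
)

first = cache.read("report.csv")            # cache miss: fetched from the bucket
bucket["report.csv"] = ("etag-2", b"new")   # another client overwrites the object
now[0] = 100.0
within_ttl = cache.read("report.csv")       # still inside the TTL: stale content
now[0] = 300.0
after_ttl = cache.read("report.csv")        # TTL expired: etag differs, refetch
```

In this model the read at 100 seconds still returns the old content, and the read at 300 seconds notices the changed etag and returns the new content.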

Mountpoint provides two presets at opposite ends of the trade-off: `--metadata-ttl minimal` favors consistency, while `--metadata-ttl indefinite` favors performance and cost optimization. Choose between them based on the requirements of your workload.
In scenarios where the content of the mounted S3 bucket is modified by another client, and you require Mountpoint to return up-to-date information, setting `--metadata-ttl minimal` is most appropriate.
A setting of `--metadata-ttl 300` would instead allow Mountpoint to perform fewer requests to the mounted S3 bucket by delaying updates for up to 300 seconds.
If your workload does not require consistency, for example because the content of the mounted S3 bucket does not change, you should use `--metadata-ttl indefinite`.
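
To get a feel for the trade-off, the following sketch counts how many metadata lookups a single, repeatedly read file would trigger under different TTLs. It is a deliberately simplified model, assuming one lookup whenever the cached entry is older than the TTL; `count_metadata_lookups` is a hypothetical helper, and the real request pattern depends on Mountpoint's internals and the workload.

```python
from typing import Iterable, Optional


def count_metadata_lookups(read_times: Iterable[float],
                           ttl_seconds: Optional[float]) -> int:
    """Count simulated metadata lookups for one file read at the given times.

    ttl_seconds=None models a TTL that never expires (in the spirit of
    `indefinite`). A TTL of 0 is just this model's way of revalidating on
    every read; it is not a claim about what `minimal` does internally.
    """
    lookups = 0
    fetched_at: Optional[float] = None
    for t in read_times:
        expired = (
            fetched_at is None
            or (ttl_seconds is not None and t - fetched_at >= ttl_seconds)
        )
        if expired:
            lookups += 1
            fetched_at = t
    return lookups


# One read every 10 seconds for an hour: 360 reads in total.
reads = list(range(0, 3600, 10))

every_read = count_metadata_lookups(reads, 0)     # revalidate on every read
five_minutes = count_metadata_lookups(reads, 300) # like `--metadata-ttl 300`
never_expire = count_metadata_lookups(reads, None)
```

Under these assumptions, the 360 reads cost 360 lookups when every read revalidates, 12 lookups with a 300-second TTL, and a single lookup when the entry never expires.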
2 changes: 1 addition & 1 deletion doc/SEMANTICS.md
@@ -83,7 +83,7 @@ By default, Mountpoint ensures that new file uploads to a single key are atomic.
Mountpoint also offers optional metadata and object content caching.
See the [caching section of the configuration documentation](./CONFIGURATION.md#caching) for more information.
When opting into caching, the strong read-after-write consistency model is relaxed,
and you may see stale metadata or object data for up to the cache's metadata time-to-live (TTL),
and you may see stale file system metadata or object data for up to the cache's metadata time-to-live (TTL),
which defaults to 1 minute but can be configured using the `--metadata-ttl` flag.

For example, with local and/or shared caching enabled, you can successfully open and read a file that has been deleted from the mounted S3 bucket if it is already cached.