[EPIC] Add metadata to cache format #1610

Open
loewenheim opened this issue Feb 3, 2025 · 2 comments
@loewenheim
Contributor

Currently, an on-disk cache entry consists of the file contents if the file was successfully fetched or looked up, an error prefix plus message if there was an error, or nothing if the file wasn't found. No other metadata is associated with the file: the source location is used to compute the cache path, but that mapping is not reversible. File deletion operates purely on the mtime in the file system.

Instead, we propose to associate both data and metadata with each cache entry. The data would be the file contents in the success case and empty otherwise. The metadata could contain:

  • status (success, not found, various errors)
  • scope (project ID or else "global")
  • version number
  • the time it was written

and possibly other data.

Saving the metadata in the filesystem could be accomplished by serializing it to JSON and writing it either to a separate file or into the same file before the data (from where it could then be deserialized as in https://github.com/getsentry/relay/blob/eb2b79ce88d2b1323bc99aab72f21fbe56619010/relay-server/src/envelope.rs#L1539-L1541).
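As a rough illustration of the "same file, metadata first" variant, here is a minimal sketch that writes a JSON header followed by the raw data. The field names, the `Status` variants, the `encode_entry` helper, and the choice of Unix seconds for the timestamp are all assumptions made for this example, and the JSON is formatted by hand only to keep the sketch free of dependencies; a real implementation would more likely use serde.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical status values; the proposal only says
/// "success, not found, various errors".
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Success,
    NotFound,
    Error,
}

impl Status {
    fn as_str(self) -> &'static str {
        match self {
            Status::Success => "success",
            Status::NotFound => "not_found",
            Status::Error => "error",
        }
    }
}

/// Hypothetical metadata mirroring the bullet list above.
struct Metadata {
    status: Status,
    scope: String, // project ID or "global"
    version: u32,
    written: SystemTime,
}

/// Escapes quotes, backslashes and control characters for a JSON string.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds the bytes of a cache file: JSON metadata first, then the data.
fn encode_entry(meta: &Metadata, data: &[u8]) -> Vec<u8> {
    // Assumption: the write time is stored as whole seconds since the epoch.
    let written_secs = meta
        .written
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    let header = format!(
        "{{\"status\":\"{}\",\"scope\":\"{}\",\"version\":{},\"written\":{}}}",
        meta.status.as_str(),
        escape_json(&meta.scope),
        meta.version,
        written_secs
    );
    let mut out = header.into_bytes();
    out.extend_from_slice(data);
    out
}
```

For an entry with empty data (the not-found or error case), the file would consist of the header alone.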

Open questions:

  • Should the list of metadata be fixed, partially fixed, or completely freeform?
  • Should the metadata be written to the same file or a separate file?
  • How does this interact with CacheEntry/CacheError?
@loewenheim loewenheim self-assigned this Feb 3, 2025
@loewenheim loewenheim changed the title Add metadata to cache format [EPIC] Add metadata to cache format Feb 4, 2025
@loewenheim
Contributor Author

loewenheim commented Feb 4, 2025

Design

In-memory structure

Option 1: the error is stored as part of the metadata.

struct Metadata {
    error: Option<CacheError>,
    version: u32,
    created: SystemTime,
    scope: Scope,
    source: String,
}

struct CacheEntry {
    metadata: Metadata,
    contents: ByteView<'static>,
}

Option 2: the error takes the place of the contents.

struct Metadata {
    version: u32,
    created: SystemTime,
    scope: Scope,
    source: String,
}

struct CacheEntry<T> {
    metadata: Metadata,
    contents: Result<T, CacheError>,
}

This version hews closer to the current implementation, where each cache entry is just Result<T, CacheError>.
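To make the trade-off a little more concrete, here is a small sketch of the variant with `contents: Result<T, CacheError>`. The `CacheError` and `Scope` definitions are simplified stand-ins invented for this example, not the real types, and the `map` and `error` helpers are hypothetical. The point is only that the error of the first variant stays recoverable, and that the contents can be transformed while the metadata is carried along unchanged.

```rust
use std::time::SystemTime;

/// Stand-in for the real error type (assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum CacheError {
    NotFound,
    Malformed(String),
}

/// Stand-in for the real scope type (assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Scope {
    Global,
    Scoped(String),
}

struct Metadata {
    version: u32,
    created: SystemTime,
    scope: Scope,
    source: String,
}

struct CacheEntry<T> {
    metadata: Metadata,
    contents: Result<T, CacheError>,
}

impl<T> CacheEntry<T> {
    /// The error that the other variant would store in `Metadata::error`.
    fn error(&self) -> Option<&CacheError> {
        self.contents.as_ref().err()
    }

    /// Transforms successful contents (e.g. raw bytes into a parsed object)
    /// while keeping the metadata as it is.
    fn map<U>(self, f: impl FnOnce(T) -> U) -> CacheEntry<U> {
        CacheEntry {
            metadata: self.metadata,
            contents: self.contents.map(f),
        }
    }
}
```

A caller could, for instance, load a `CacheEntry<Vec<u8>>` from disk and `map` it into a parsed type without touching `version`, `created`, `scope`, or `source`.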

On-disk structure

  1. Single file: Depending on whether we pick option 1 or 2 above, the "error" would be stored either as part of the metadata or after it, in place of the contents.

    1. {"error": "…", "version": 17, …, "source": "…"}<FILE CONTENTS>
    2. {"version": 17, …, "source": "…"}<FILE CONTENTS or ERROR>

    It's possible to parse an initial JSON object and leave the rest of the file alone using a Deserializer directly, as in https://github.com/getsentry/relay/blob/eb2b79ce88d2b1323bc99aab72f21fbe56619010/relay-server/src/envelope.rs#L1539-L1541.

  2. Separate metadata file: This would mean putting a separate file, named e.g. foobar.md, next to the cache file foobar. This file would contain valid JSON.
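For the single-file layout, the reader has to find where the JSON header ends and the payload begins. With serde_json this would presumably be done by deserializing one value and asking the stream deserializer for its byte offset, as in the linked Relay code. The sketch below shows the same idea without dependencies by scanning for the closing brace of the initial object; the helper names `json_object_len` and `split_entry` are hypothetical, and the scan only finds the boundary without validating the JSON.

```rust
/// Returns the length in bytes of the JSON object that starts at the
/// beginning of `bytes`, or `None` if there is no complete object.
/// Braces inside string literals (including escaped quotes) are ignored.
fn json_object_len(bytes: &[u8]) -> Option<usize> {
    if bytes.first() != Some(&b'{') {
        return None;
    }
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for (i, &b) in bytes.iter().enumerate() {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' => depth += 1,
            b'}' => {
                // `depth` is at least 1 here because the first byte is `{`
                // and we return as soon as it drops back to 0.
                depth -= 1;
                if depth == 0 {
                    return Some(i + 1);
                }
            }
            _ => {}
        }
    }
    None
}

/// Splits a cache file into its metadata header and the remaining bytes
/// (file contents, or the error in the second on-disk variant).
fn split_entry(bytes: &[u8]) -> Option<(&[u8], &[u8])> {
    let len = json_object_len(bytes)?;
    Some(bytes.split_at(len))
}
```

The payload is never inspected, so arbitrary binary contents after the header are left alone.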

@loewenheim
Contributor Author

The design is complicated somewhat by the fact that we use different (meta)data to compute the cache key in different caches.

  • symcache: scope + source of object file + (optionally) source of il2cpp file + (optionally) source of bcsymbolmap file
  • all other native caches, as well as proguard: scope + source of object file
  • sourcemapcache: scope + source or contents of minified source file + source or contents of sourcemap file
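One way to picture the consequence for the metadata: a single `source: String` field cannot describe every cache, because some keys depend on several inputs. The following sketch models the three cases listed above as an enum and flattens them into labelled parts. The type `CacheKeyInputs`, the function `key_parts`, and the label strings are all hypothetical, and "source or contents" is simplified to a plain string that holds either a source location or some digest of the contents.

```rust
/// Hypothetical description of what goes into the cache key per cache kind.
enum CacheKeyInputs<'a> {
    /// symcache: object file plus optional il2cpp and bcsymbolmap files.
    SymCache {
        object: &'a str,
        il2cpp: Option<&'a str>,
        bcsymbolmap: Option<&'a str>,
    },
    /// All other native caches, as well as proguard.
    Native { object: &'a str },
    /// sourcemapcache: each input is a source location or (a digest of)
    /// the file contents; both are plain strings in this sketch.
    SourceMapCache { minified: &'a str, sourcemap: &'a str },
}

/// Lists the labelled inputs that would have to be recorded as metadata
/// for the key to be reconstructible.
fn key_parts(scope: &str, inputs: &CacheKeyInputs<'_>) -> Vec<String> {
    let mut parts = vec![format!("scope: {scope}")];
    match inputs {
        CacheKeyInputs::SymCache { object, il2cpp, bcsymbolmap } => {
            parts.push(format!("object: {object}"));
            if let Some(src) = il2cpp {
                parts.push(format!("il2cpp: {src}"));
            }
            if let Some(src) = bcsymbolmap {
                parts.push(format!("bcsymbolmap: {src}"));
            }
        }
        CacheKeyInputs::Native { object } => {
            parts.push(format!("object: {object}"));
        }
        CacheKeyInputs::SourceMapCache { minified, sourcemap } => {
            parts.push(format!("minified: {minified}"));
            parts.push(format!("sourcemap: {sourcemap}"));
        }
    }
    parts
}
```

Whether the metadata should store such a list, a map, or something per cache is exactly the kind of question the design still has to settle.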
