Blueprint Rendezvous Table Garbage Collection #9592

@smklein

Background

Blueprints propagate some information to the database using rendezvous tables, as described in https://rfd.shared.oxide.computer/rfd/541.

This is implemented in nexus/src/app/background/tasks/blueprint_rendezvous.rs as a background task, which invokes reconcile_blueprint_rendezvous_tables.

In short, this operation is:

  • Read latest blueprint
  • Read latest inventory
  • "Do Reconciliation", which means many calls to INSERT INTO my_rendezvous_table, either written so that a conflict is a no-op or with conflict errors explicitly ignored.
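
To make the shape of that last step concrete, here is a minimal in-memory sketch in Rust. It is not the real reconcile_blueprint_rendezvous_tables: the RendezvousTable type, insert_ignoring_conflict, and reconcile are hypothetical names, and a map stands in for the database table.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one rendezvous table, keyed by row id.
#[derive(Default)]
struct RendezvousTable {
    rows: BTreeMap<String, String>,
}

impl RendezvousTable {
    /// Models an INSERT whose conflicts are ignored: returns true only
    /// if a new row was actually written.
    fn insert_ignoring_conflict(&mut self, id: &str, value: &str) -> bool {
        if self.rows.contains_key(id) {
            return false;
        }
        self.rows.insert(id.to_string(), value.to_string());
        true
    }
}

/// One reconciliation pass: try to insert every row the blueprint calls
/// for, and report how many rows were new.
fn reconcile(table: &mut RendezvousTable, wanted: &[(&str, &str)]) -> usize {
    let mut inserted = 0;
    for (id, value) in wanted {
        if table.insert_ignoring_conflict(id, value) {
            inserted += 1;
        }
    }
    inserted
}
```

Running reconcile twice with the same input writes the row once and then does nothing, which is what makes the pass safe to repeat. It is also why a row that has been removed will be written again by any pass that still wants it.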

As implemented today (1/5/2026), we do not perform "hard deletion" of any of these rendezvous table rows.

The Issue

If we ever want to perform hard deletion of rendezvous table rows (suppose they're extremely old, no longer relevant, taking up space, etc.), it's hard to guarantee that the background task won't re-insert old data.

For example, suppose we have the following sequence of events:

  • Nexus A reads blueprint @ generation 1, gets ready to do reconciliation (which will insert into a rendezvous table with id = foo-bar-baz)
  • Nexus A hangs unexpectedly. We'll treat this as a long sleep
  • Nexus B creates, reads, and executes many more blueprints. The database row with id = foo-bar-baz is created, and later we want to hard delete it.
  • At some point in the future, Nexus A wakes back up and resumes reconciliation. It tries to insert the entry with id = foo-bar-baz. If that row has already been hard-deleted, the INSERT succeeds, which would be a bug: a "zombie record" coming back to life unexpectedly after deletion.
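
The sequence can be replayed against a toy table to show the failure. This is only an illustration: UnguardedTable and replay_zombie_sequence are hypothetical names, a set stands in for the rendezvous table, and the two Nexus instances are modeled as sequential steps rather than real concurrent tasks.

```rust
use std::collections::BTreeSet;

/// Hypothetical rendezvous table with no generation guard.
#[derive(Default)]
struct UnguardedTable {
    ids: BTreeSet<String>,
}

impl UnguardedTable {
    /// Insert that ignores conflicts; returns true if a row was written.
    fn insert(&mut self, id: &str) -> bool {
        self.ids.insert(id.to_string())
    }

    /// Hard deletion: the row is removed and no tombstone is left behind.
    fn hard_delete(&mut self, id: &str) -> bool {
        self.ids.remove(id)
    }
}

/// Replays the sequence of events and reports whether the deleted row
/// is present again at the end.
fn replay_zombie_sequence() -> bool {
    let mut table = UnguardedTable::default();

    // Nexus A read the generation 1 blueprint and planned this insert
    // before it stalled.
    let planned_by_a = "foo-bar-baz";

    // Nexus B works through newer blueprints: the row is created, and
    // later it is hard-deleted.
    table.insert("foo-bar-baz");
    table.hard_delete("foo-bar-baz");

    // Nexus A wakes up and resumes with its stale plan. Nothing in the
    // table tells it that the row was deleted on purpose.
    table.insert(planned_by_a);

    table.ids.contains("foo-bar-baz")
}
```

In this model replay_zombie_sequence() returns true: the final insert cannot distinguish "never existed" from "deliberately deleted".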

Resolution

There are a few ways we could mitigate this:

  1. Guard the INSERT. Using a generation number stored somewhere, convert the INSERT INTO operations into a transaction or CTE that is effectively "INSERT INTO ..., but only while the generation number of this rendezvous is the latest". This should cause all old rendezvous operations to be "locked out".
  2. Track ongoing rendezvous operations and guard the DELETE. If we know which rendezvous operations are in progress, we could hold off on "hard deletion" until we know the row can't be revived.
  3. Rely on timeouts. Use timeouts to justify the assumption that "no really old Nexuses exist". E.g., rendezvous operations have a timeout of 30 sec, but we only perform hard deletions after 24 hours. (Note: this has huge issues with e.g. mupdate and time sync, so we'd prefer to avoid it, but it's listed for completeness.)
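
As a rough sketch of what option 2 might look like, the registry below counts in-progress rendezvous operations per blueprint generation and only permits a hard delete once no older operation remains. Everything here is hypothetical (InFlightRendezvous, begin, finish, can_hard_delete), and it assumes we know the generation at which a row stopped appearing in blueprints (retired_at).

```rust
use std::collections::BTreeMap;

/// Hypothetical registry of rendezvous operations that have started but
/// not yet finished, counted per blueprint generation.
#[derive(Default)]
struct InFlightRendezvous {
    by_generation: BTreeMap<u64, usize>,
}

impl InFlightRendezvous {
    /// Record that an operation based on `generation` has started.
    fn begin(&mut self, generation: u64) {
        *self.by_generation.entry(generation).or_insert(0) += 1;
    }

    /// Record that an operation based on `generation` has finished.
    fn finish(&mut self, generation: u64) {
        let remaining = match self.by_generation.get_mut(&generation) {
            Some(count) => {
                *count -= 1;
                *count
            }
            None => return,
        };
        if remaining == 0 {
            self.by_generation.remove(&generation);
        }
    }

    /// Assumption: a row retired at `retired_at` can only be revived by
    /// an operation working from an older blueprint, so deletion is
    /// allowed once the oldest in-flight generation is at least that.
    fn can_hard_delete(&self, retired_at: u64) -> bool {
        match self.by_generation.keys().next() {
            Some(oldest) => *oldest >= retired_at,
            None => true,
        }
    }
}
```

Note that in this model a hung Nexus that never calls finish blocks deletion indefinitely, so the guard is only as good as the tracking behind it.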

(My personal preference is for option 1: it has slightly more boilerplate, but it seems the most "clearly correct", and it strictly limits how long old rendezvous operations can keep happening.)
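
A minimal sketch of option 1, again against an in-memory stand-in rather than the real tables. GuardedTable, InsertOutcome, advance_generation, and guarded_insert are hypothetical names, and where the "latest generation" would actually be stored is left open, as it is above. The check and the insert happen in one method call, which stands in for doing both inside a single transaction or CTE.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
enum InsertOutcome {
    Inserted,
    AlreadyPresent,
    StaleGeneration,
}

/// Hypothetical table plus the generation of the latest blueprint whose
/// rendezvous is allowed to write to it.
#[derive(Default)]
struct GuardedTable {
    latest_generation: u64,
    ids: BTreeSet<String>,
}

impl GuardedTable {
    /// Called when a newer blueprint takes over; never moves backwards.
    fn advance_generation(&mut self, generation: u64) {
        self.latest_generation = self.latest_generation.max(generation);
    }

    /// Insert only if the caller is working from the latest generation.
    fn guarded_insert(&mut self, caller_generation: u64, id: &str) -> InsertOutcome {
        if caller_generation < self.latest_generation {
            return InsertOutcome::StaleGeneration;
        }
        if self.ids.insert(id.to_string()) {
            InsertOutcome::Inserted
        } else {
            InsertOutcome::AlreadyPresent
        }
    }

    /// Hard deletion, as in the unguarded case.
    fn hard_delete(&mut self, id: &str) -> bool {
        self.ids.remove(id)
    }
}
```

With this guard, the sleeping Nexus A from the example is rejected when it wakes: once the generation has advanced past 1 and foo-bar-baz has been hard-deleted, guarded_insert(1, "foo-bar-baz") yields StaleGeneration and the row stays gone.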

With the ongoing work in #9552, this is relevant to fault management reconciliation as well. Frankly, it might be more relevant there, because alerts are probably going to churn more frequently than our blueprint rendezvous tables (e.g. dataset configurations).
