feat: Hard delete database and table #26553

Merged: 3 commits, Jun 24, 2025

@stuartcarnie commented Jun 23, 2025

Summary

This PR implements hard deletion functionality for databases and tables in InfluxDB 3 Core (OSS). Previously, only soft deletion was supported, which marked objects as deleted but did not remove the underlying data or schema objects. This feature enables the complete removal of .parquet snapshot files for the deleted tables. The schema will be completely removed after a configurable grace period.

Configuration

| Environment Variable | Description | Default Value |
| --- | --- | --- |
| INFLUXDB3_HARD_DELETE_DEFAULT_DURATION | The duration from when a database or table is soft-deleted until the data is scheduled to be hard deleted. | 7 days (from Catalog::DEFAULT_HARD_DELETE_DURATION) |
| INFLUXDB3_DELETE_GRACE_PERIOD | Grace period for hard deleted databases and tables before they are removed permanently from the catalog. | 24h |

Usage Examples

# Set hard delete to occur 30 days after soft deletion
export INFLUXDB3_HARD_DELETE_DEFAULT_DURATION=30d

# Set grace period to 48 hours before catalog removal
export INFLUXDB3_DELETE_GRACE_PERIOD=48h

Notes

  • Both values accept human-readable durations (e.g., "1h", "7d", "30m")
  • The hard delete duration determines when data files are removed from object storage
  • The grace period provides additional time after hard deletion before the catalog entry is permanently removed
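
To make the relationship between the two settings concrete, here is a minimal Rust sketch of the timeline described in the notes above. It is an illustration, not the server's code: `parse_duration`, `DeletionTimeline`, and `timeline` are hypothetical names, the parser handles only the simple single-unit forms quoted above, and it assumes the grace period is counted from the scheduled hard-delete time.

```rust
use std::time::Duration;

/// Hypothetical parser for the simple duration forms quoted in the notes
/// ("30m", "1h", "7d"). The real server may accept more formats.
fn parse_duration(s: &str) -> Option<Duration> {
    let s = s.trim();
    // Split the numeric prefix from the single-letter unit suffix.
    let unit = s.chars().last()?;
    let digits = &s[..s.len() - unit.len_utf8()];
    let n: u64 = digits.parse().ok()?;
    let secs_per_unit: u64 = match unit {
        's' => 1,
        'm' => 60,
        'h' => 60 * 60,
        'd' => 24 * 60 * 60,
        _ => return None,
    };
    Some(Duration::from_secs(n.checked_mul(secs_per_unit)?))
}

/// Offsets measured from the moment of soft deletion.
struct DeletionTimeline {
    /// When data files become eligible for removal from object storage.
    hard_delete_after: Duration,
    /// When the catalog entry becomes eligible for permanent removal
    /// (assumption: hard-delete duration plus grace period).
    catalog_removal_after: Duration,
}

/// Combine the two settings into one timeline; returns None on bad input.
fn timeline(hard_delete_duration: &str, grace_period: &str) -> Option<DeletionTimeline> {
    let hard = parse_duration(hard_delete_duration)?;
    let grace = parse_duration(grace_period)?;
    Some(DeletionTimeline {
        hard_delete_after: hard,
        catalog_removal_after: hard.checked_add(grace)?,
    })
}
```

With the values from the usage examples (`30d` and `48h`), this model makes data files eligible for removal 30 days after soft deletion and the catalog entry eligible for removal 32 days after it.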

Architecture

This builds on the existing retention-policy enforcement processing, which now also collects the files that belong to deleted databases and tables. These files are recorded as removed in the next snapshot and then deleted from object storage.

The PersistedFiles structure implements the ObjectDeleter trait:

/// Trait for clients to be notified when a database or table should be deleted.
pub trait ObjectDeleter: std::fmt::Debug + Send + Sync {
    /// Deletes a database.
    fn delete_database(&self, db_id: DbId);
    /// Deletes a table.
    fn delete_table(&self, db_id: DbId, table_id: TableId);
}

This ensures that PersistedFiles is notified when it should delete a database or table, and it accounts for the individual hard-delete time of each table.
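
As a rough illustration of how a type might implement this trait, the sketch below records deletion requests so they can be picked up later, for example when the next snapshot is built. It is not the PersistedFiles implementation: `RecordingDeleter`, `PendingDelete`, and `take_pending` are hypothetical, and `DbId` and `TableId` are stand-in newtypes so the example compiles on its own. Because the trait methods take `&self`, the sketch uses a `Mutex` for interior mutability.

```rust
use std::sync::Mutex;

// Hypothetical stand-ins for the catalog identifier types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DbId(pub u32);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TableId(pub u32);

/// Trait from the PR description, repeated so the example is self-contained.
pub trait ObjectDeleter: std::fmt::Debug + Send + Sync {
    fn delete_database(&self, db_id: DbId);
    fn delete_table(&self, db_id: DbId, table_id: TableId);
}

/// A recorded deletion request (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PendingDelete {
    Database(DbId),
    Table(DbId, TableId),
}

/// Hypothetical implementor that only records requests in arrival order.
#[derive(Debug, Default)]
pub struct RecordingDeleter {
    pending: Mutex<Vec<PendingDelete>>,
}

impl RecordingDeleter {
    /// Drain the recorded requests, leaving the queue empty.
    pub fn take_pending(&self) -> Vec<PendingDelete> {
        let mut guard = self.pending.lock().unwrap();
        std::mem::take(&mut *guard)
    }
}

impl ObjectDeleter for RecordingDeleter {
    fn delete_database(&self, db_id: DbId) {
        self.pending.lock().unwrap().push(PendingDelete::Database(db_id));
    }

    fn delete_table(&self, db_id: DbId, table_id: TableId) {
        self.pending
            .lock()
            .unwrap()
            .push(PendingDelete::Table(db_id, table_id));
    }
}
```

Queuing requests rather than deleting immediately mirrors the description above, where removals are attached to the next snapshot; whether the real code queues in the same way is an assumption.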

As a result, files are deleted only while the node is accepting writes and therefore snapshotting regularly; retention-policy enforcement has the same constraint. With the current implementation, files may never be cleaned up if the following sequence occurs:

  1. A table is deleted.
  2. The server isn't receiving writes.
  3. The grace period passes, so the schema is deleted from the catalog.
  4. The server is shut down.
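
The sequence above can be sketched as a small state model. This is a deliberately simplified, hypothetical model (`NodeModel` and all of its methods are invented for illustration), and it assumes that pending removals live only in memory and are not re-queued once the catalog entry is gone.

```rust
/// Hypothetical, simplified model of one node; no names here come from the PR.
#[derive(Debug, Default)]
struct NodeModel {
    /// Files of hard-deleted tables waiting for the next snapshot (in memory only).
    files_pending_removal: Vec<String>,
    /// Files still present in object storage.
    object_store: Vec<String>,
    /// Whether the catalog still holds the deleted table's schema.
    catalog_has_entry: bool,
}

impl NodeModel {
    fn new(files: &[&str]) -> Self {
        NodeModel {
            files_pending_removal: Vec::new(),
            object_store: files.iter().map(|f| f.to_string()).collect(),
            catalog_has_entry: true,
        }
    }

    /// Hard-delete time reached: files are queued, not yet removed.
    fn hard_delete_table(&mut self) {
        self.files_pending_removal = self.object_store.clone();
    }

    /// Snapshots run only while writes arrive; queued files are removed here.
    fn snapshot(&mut self) {
        let pending = std::mem::take(&mut self.files_pending_removal);
        self.object_store.retain(|f| !pending.contains(f));
    }

    /// Grace period elapsed: the schema leaves the catalog.
    fn grace_period_elapsed(&mut self) {
        self.catalog_has_entry = false;
    }

    /// Shutdown and restart: the in-memory queue is lost (assumption).
    fn restart(&mut self) {
        self.files_pending_removal.clear();
    }

    /// Files left in object storage with no catalog entry referring to them.
    fn orphaned_files(&self) -> Vec<String> {
        if self.catalog_has_entry {
            Vec::new()
        } else {
            self.object_store.clone()
        }
    }
}
```

In this model, calling `hard_delete_table`, then `grace_period_elapsed`, then `restart` without an intervening `snapshot` leaves every file orphaned, whereas a `snapshot` before the grace period elapses leaves none.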

@stuartcarnie self-assigned this Jun 23, 2025
@stuartcarnie marked this pull request as ready for review June 24, 2025 00:40
@stuartcarnie requested a review from a team June 24, 2025 00:40
@praveen-influx left a comment

Looks good - should --hard-delete-default-duration be added to serve-all.txt as well? I couldn't see it in enterprise either. It's not a blocker; if it should be added, we could add it in a separate PR.

@praveen-influx merged commit 4c8283d into main on Jun 24, 2025
12 checks passed

Successfully merging this pull request may close these issues.

  • Add hard delete database
  • Automatic Hard Delete Table function
  • Hard Delete Table Option
2 participants