
Feature/slayerfs gc compact clean #560

Open
Tyooughtul wants to merge 12 commits into rk8s-dev:main from Tyooughtul:feature/slayerfs-gc-compact-clean

Conversation

@Tyooughtul

This PR mainly adds GC and compact mechanisms to slayerfs.

Compact mechanism:

  • Trigger condition: configurable thresholds
  • Two modes:
    • Async Compact: slice count in [min_slice_count, sync_threshold); runs in the background and does not block writes
    • Sync Compact: slice count >= sync_threshold; acquires a distributed lock, blocks writes to that chunk, then runs the compaction
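The threshold logic above can be sketched as a pure selection function. The names `CompactMode` and `choose_compact_mode` are illustrative, not the PR's actual API:

```rust
/// Hypothetical mirror of the PR's mode selection: thresholds would come
/// from a CompactConfig-like struct; only the decision logic is shown.
#[derive(Debug, PartialEq, Eq)]
pub enum CompactMode {
    /// Below the minimum slice count: nothing to do.
    Skip,
    /// Background compaction; does not block writers.
    Async,
    /// Takes a distributed lock and blocks writes to the chunk.
    Sync,
}

pub fn choose_compact_mode(
    slice_count: usize,
    min_slice_count: usize,
    sync_threshold: usize,
) -> CompactMode {
    if slice_count < min_slice_count {
        CompactMode::Skip
    } else if slice_count < sync_threshold {
        CompactMode::Async
    } else {
        CompactMode::Sync
    }
}
```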

GC mechanism:

  • Two-phase deletion:
    • Phase 1: soft delete (mark slices as delayed and record them in the delayed_slice table)
    • Phase 2: hard delete (a GC worker cleans them up periodically)
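The two-phase flow can be sketched with an in-memory stand-in for the delayed_slice table (the struct and method names here are hypothetical, and the real store keys this in the database):

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the delayed_slice table:
/// slice_id -> deletion timestamp (seconds).
pub struct DelayedSlices {
    delayed: HashMap<u64, u64>,
}

impl DelayedSlices {
    pub fn new() -> Self {
        Self { delayed: HashMap::new() }
    }

    /// Phase 1: soft delete — record the slice as delayed instead of
    /// removing its data immediately.
    pub fn soft_delete(&mut self, slice_id: u64, now_secs: u64) {
        self.delayed.insert(slice_id, now_secs);
    }

    /// Phase 2: hard delete — the GC worker periodically reclaims slices
    /// delayed for at least `min_age_secs`, up to `batch_size` per cycle.
    /// Returns the slice ids it removed.
    pub fn hard_delete(&mut self, now_secs: u64, min_age_secs: u64, batch_size: usize) -> Vec<u64> {
        let mut victims: Vec<u64> = self
            .delayed
            .iter()
            .filter(|(_, &t)| now_secs.saturating_sub(t) >= min_age_secs)
            .map(|(&id, _)| id)
            .collect();
        victims.sort(); // deterministic order for the sketch
        victims.truncate(batch_size);
        for id in &victims {
            self.delayed.remove(id);
        }
        victims
    }
}
```

The age check is what makes phase 2 safe: a slice only becomes a hard-delete candidate after it has sat in the delayed set long enough for in-flight readers to finish.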

Storage-layer support:

  • DatabaseStore: full compact and GC implementation

The relevant interfaces and configuration are kept in place, leaving room to wire in more components later.

#355

Signed-off-by: Tyooughtul <yooughtul@126.com>
- Adapt daemon, fuse, main, vfs to new module paths
- Update examples and benchmarks

Signed-off-by: Tyooughtul <yooughtul@126.com>
…a module

Signed-off-by: Tyooughtul <yooughtul@126.com>

@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee82e88441

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +346 to +349
// Start background compaction and gc tasks
let bg_config = VfsBackgroundConfig::with_layout(&layout);
let background_tasks =
Self::start_background_tasks(&meta_client, store.clone(), layout, bg_config).await;

P1: Honor read_only before starting compaction and GC workers

VFS::with_meta_client_config starts background compaction/GC unconditionally, so even when the caller sets MetaClientConfig.options.read_only (or disables background jobs), this path still launches workers that mutate metadata and block data through raw MetaStore APIs. That violates read-only mount semantics because the background tasks can compact chunks and delete delayed slices despite the client being configured as read-only.
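A minimal sketch of the gating this comment asks for, assuming a `read_only` flag and a background-jobs toggle on the mount options (the struct and function names are illustrative, not the crate's real API):

```rust
/// Hypothetical subset of MetaClientConfig options relevant here.
pub struct MountOptions {
    pub read_only: bool,
    pub enable_background_jobs: bool,
}

/// Background compaction/GC mutate metadata and delete block data, so
/// both flags must be checked before any worker is spawned.
pub fn should_start_background_tasks(opts: &MountOptions) -> bool {
    !opts.read_only && opts.enable_background_jobs
}
```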


Comment on lines +3096 to +3100
let current_slices: Vec<slice_meta::Model> = SliceMeta::find()
.filter(slice_meta::Column::ChunkId.eq(chunk_id as i64))
.lock_exclusive()
.all(&txn)
.await

P1: Lock out late slice inserts during versioned heavy compaction

This optimistic check reads and validates the current slice set once, but it does not prevent a concurrent writer from inserting a new slice for the same chunk after the read and before commit. In async compaction (no chunk lock), that can let replace_slices_for_compact_with_version commit stale merged data while a fresh write slips in, and if the compacted slice has a newer ID it can mask the just-written update. The transaction needs chunk-level exclusion or a commit-time conflict check that also catches phantom inserts.
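One shape a commit-time conflict check could take, sketched with plain sets; the real fix would re-read inside the transaction against `slice_meta`, and `slices_unchanged` is a hypothetical helper:

```rust
use std::collections::BTreeSet;

/// Hypothetical commit-time validation: before replacing slices with the
/// compacted result, re-read the chunk's slice ids inside the transaction
/// and require them to equal the snapshot taken when compaction started.
/// Any mismatch — including a phantom insert — aborts the compaction.
pub fn slices_unchanged(snapshot: &[u64], current: &[u64]) -> bool {
    let before: BTreeSet<u64> = snapshot.iter().copied().collect();
    let after: BTreeSet<u64> = current.iter().copied().collect();
    before == after
}
```

Comparing id sets (rather than counts) catches both a phantom insert and a concurrent delete-plus-insert that happens to preserve the count.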


Comment on lines +2629 to +2633
.select_only()
.column(slice_meta::Column::ChunkId)
.distinct()
.into_tuple()
.all(&self.db)

P2: Push list_chunk_ids limit into the database query

The scheduler limit is applied only after loading all distinct chunk IDs into memory, so max_chunks_per_run does not bound query cost. On large slice tables this forces a full distinct scan/allocation every cycle and can cause avoidable latency and memory spikes. Apply .limit(limit as u64) in SQL so the compaction scan remains bounded.
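The actual fix is `.limit(limit as u64)` on the SeaORM query, as the comment says. A std-only analogue of the difference — stopping the scan once enough distinct ids are collected, instead of collecting everything and truncating — looks like this (function name is illustrative):

```rust
use std::collections::HashSet;

/// Std-only analogue of pushing the limit into the query: stop once
/// `limit` distinct chunk ids have been seen, instead of materializing
/// every distinct id and truncating afterwards.
pub fn bounded_distinct_chunk_ids<I>(rows: I, limit: usize) -> Vec<i64>
where
    I: IntoIterator<Item = i64>,
{
    let mut seen = HashSet::new();
    let mut out = Vec::with_capacity(limit);
    for id in rows {
        if seen.insert(id) {
            out.push(id);
            if out.len() == limit {
                break; // stop early, bounding allocation and work
            }
        }
    }
    out
}
```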


Comment on lines +69 to +73
let pending_deletions = self
.meta_store
.process_delayed_slices(config.batch_size, config.min_age_secs)
.await
.map_err(GCError::MetaError)?;

P2: Treat unsupported delayed-slice APIs as GC no-op

GC immediately maps process_delayed_slices errors to hard failures. In this commit, etcd/redis stores do not implement delayed-slice processing, so this returns NotImplemented; combined with always-on background workers, non-database backends will log GC cycle failed every interval forever. Handle MetaError::NotImplemented as a skip (like compaction does) to avoid persistent false-error loops.
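The suggested handling is a plain match on the error variant. A sketch, with `MetaError` and `GcCycle` as simplified stand-ins for the crate's real types:

```rust
/// Hypothetical error type mirroring the stores' MetaError.
#[derive(Debug)]
pub enum MetaError {
    NotImplemented,
    Other(String),
}

/// Outcome of one GC cycle against a backend that may not support
/// delayed-slice processing.
#[derive(Debug, PartialEq, Eq)]
pub enum GcCycle {
    Ran(usize),
    Skipped,
    Failed(String),
}

/// Treat NotImplemented as a benign skip instead of a hard failure, so
/// etcd/redis backends do not log "GC cycle failed" on every interval.
pub fn classify_gc_result(result: Result<usize, MetaError>) -> GcCycle {
    match result {
        Ok(n) => GcCycle::Ran(n),
        Err(MetaError::NotImplemented) => GcCycle::Skipped,
        Err(MetaError::Other(msg)) => GcCycle::Failed(msg),
    }
}
```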


Signed-off-by: Tyooughtul <yooughtul@126.com>
…apper

Signed-off-by: Tyooughtul <yooughtul@126.com>

@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fad9c8444d


merged_data: &mut [u8],
) -> Result<(), CompactorError> {
let mut sorted: Vec<_> = slices.to_vec();
sorted.sort_by_key(|s| s.slice_id);

P1: Preserve write order when merging overlapping slices

Heavy compaction currently decides the winning overlap by slice_id, but the live read path resolves overlaps by metadata insertion order (get_slices returns rows ordered by slice_meta.id, then reader walks them newest-first). When two writes to the same chunk commit out of order (e.g., one allocates a larger slice_id but commits earlier), this sort can pick a different winner than normal reads and rewrite stale bytes into the compacted slice, changing file content after compaction.
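A sketch of the fix: sort by the same key the read path uses. `meta_row_id` here is a hypothetical field standing in for the `slice_meta.id` insertion order mentioned above:

```rust
/// Hypothetical slice record: `meta_row_id` is the slice_meta insertion
/// order the live read path resolves overlaps by; `slice_id` is the
/// allocated data id, which may commit out of order.
#[derive(Clone)]
pub struct SliceRef {
    pub slice_id: u64,
    pub meta_row_id: u64,
}

/// Sort slices the way reads resolve overlaps — by metadata insertion
/// order, not by slice_id — so compaction picks the same winning bytes
/// as a normal read would. The last element after sorting is the winner.
pub fn sort_for_merge(slices: &mut [SliceRef]) {
    slices.sort_by_key(|s| s.meta_row_id);
}
```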


config.compact_config.clone(),
config.compact_config.lock_ttl.clone(),
);
let (compaction_handle, gc_handle) = worker.start(config.compaction, config.gc);

P2: Add lifecycle shutdown for spawned compaction/GC workers

This starts two infinite background loops but there is no corresponding cancellation/abort path on VFS teardown. The handles are only stored and never awaited/aborted, so dropping a VFS (for example during unmount/remount or repeated test setup) detaches the tasks and leaves them running against the same metadata store, causing duplicate GC/compaction cycles and post-teardown mutations.
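One shape the shutdown path could take, sketched with std threads and a stop flag rather than the crate's async runtime (the real workers are spawned tasks, so the equivalent would signal and await/abort the stored handles):

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical wrapper: owns the worker handle and stops the loop when
/// it is dropped, instead of detaching an infinite task on VFS teardown.
pub struct BackgroundTasks {
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl BackgroundTasks {
    pub fn start(interval: Duration, cycles: Arc<AtomicU64>) -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let flag = stop.clone();
        let handle = thread::spawn(move || {
            while !flag.load(Ordering::Relaxed) {
                // One compaction/GC cycle would run here; the counter
                // stands in for it so the behavior is observable.
                cycles.fetch_add(1, Ordering::Relaxed);
                thread::sleep(interval);
            }
        });
        Self { stop, handle: Some(handle) }
    }
}

impl Drop for BackgroundTasks {
    fn drop(&mut self) {
        // Signal the loop and wait for it: no worker outlives the owner,
        // so there are no post-teardown mutations.
        self.stop.store(true, Ordering::Relaxed);
        if let Some(h) = self.handle.take() {
            let _ = h.join();
        }
    }
}
```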


Signed-off-by: Tyooughtul <yooughtul@126.com>

@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 494617ffb4


return None;
}

let meta_store = meta_client.store();

P1: Run compaction through MetaClient to keep slice cache coherent

start_background_tasks passes meta_client.store() (the raw MetaStore) into CompactionWorker, so compaction/GC change slice metadata without touching MetaClient’s slice-cache update/invalidation paths. In read-heavy workloads where get_slices is served from cache, this can leave stale slice IDs live after compaction; once GC removes old blocks, reads for hot chunks can return incorrect data (zeros/missing bytes) until eviction. Route these mutations through MetaClient or explicitly invalidate affected chunk slice cache entries after each compaction commit.
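The explicit-invalidation option can be sketched with a plain map as the per-chunk cache (`SliceCache` and the hook name are illustrative; the real cache lives inside MetaClient):

```rust
use std::collections::HashMap;

/// Hypothetical per-chunk slice cache, as a MetaClient-like layer
/// might keep it: chunk_id -> cached slice ids.
pub struct SliceCache {
    by_chunk: HashMap<u64, Vec<u64>>,
}

impl SliceCache {
    pub fn new() -> Self {
        Self { by_chunk: HashMap::new() }
    }

    pub fn put(&mut self, chunk_id: u64, slice_ids: Vec<u64>) {
        self.by_chunk.insert(chunk_id, slice_ids);
    }

    pub fn get(&self, chunk_id: u64) -> Option<&Vec<u64>> {
        self.by_chunk.get(&chunk_id)
    }

    /// Hook to run after a compaction commit: drop the stale entry so the
    /// next read refetches the compacted slice set from the store instead
    /// of serving slice ids whose blocks GC may have removed.
    pub fn on_compaction_committed(&mut self, chunk_id: u64) {
        self.by_chunk.remove(&chunk_id);
    }
}
```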


Comment on lines +347 to +349
let bg_config = VfsBackgroundConfig::with_layout(&layout);
let background_tasks =
Self::start_background_tasks(&meta_client, store.clone(), layout, bg_config).await;

P2: Plumb configured CompactConfig into background worker startup

This path always builds VfsBackgroundConfig from with_layout, which hardcodes CompactConfig::default(), so user-provided compaction settings are silently ignored even though Config.compact was added in this change. That makes thresholds/intervals/toggles in config ineffective and can trigger unexpected compaction behavior in production. Thread the parsed compact settings into start_background_tasks instead of constructing defaults here.
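A sketch of the plumbing fix: accept the parsed user config and fall back to defaults only when none was provided. Field and type shapes here are hypothetical simplifications of the crate's config:

```rust
/// Hypothetical simplified CompactConfig with one representative knob.
#[derive(Clone, Debug, PartialEq)]
pub struct CompactConfig {
    pub sync_threshold: usize,
}

impl Default for CompactConfig {
    fn default() -> Self {
        Self { sync_threshold: 64 } // assumed default for the sketch
    }
}

pub struct VfsBackgroundConfig {
    pub compact: CompactConfig,
}

impl VfsBackgroundConfig {
    /// Instead of a constructor that hardcodes `CompactConfig::default()`,
    /// thread the user's parsed settings through and default only when
    /// the user supplied none.
    pub fn with_compact(user: Option<CompactConfig>) -> Self {
        Self { compact: user.unwrap_or_default() }
    }
}
```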


…onfig

- Add a compaction success hook that invalidates per-chunk slice cache entries in MetaClient to prevent stale reads after Light/Heavy compaction succeeds.

- Replace hardcoded defaults by plumbing user-provided CompactConfig into VfsBackgroundConfig to enable configurable compaction intervals and thresholds.

The hook is currently only enabled for the database store.

Signed-off-by: Tyooughtul <yooughtul@126.com>
@Tyooughtul requested a review from mon3stera April 2, 2026 03:11