Skip to content

feat: StringValueFormat & BaseMetaFilter #54

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 26, 2025

Conversation

guozhihao-224
Copy link
Contributor

@guozhihao-224 guozhihao-224 commented Apr 19, 2025

  1. Change interval and parsedInterval to traits
  2. Implement string_value_format and the BaseMetaFilter part of base_filter

Summary by CodeRabbit

  • New Features

    • Introduced advanced compaction filtering for database entries, improving automatic cleanup of expired data.
    • Added new internal value format for string data, supporting precise encoding, timestamps, and efficient expiry handling.
  • Refactor

    • Replaced concrete value structures with trait-based abstractions for greater flexibility and maintainability in storage handling.
  • Chores

    • Added new dependencies to enhance buffer and timestamp management.
    • Updated and clarified public constants for storage value formatting.
    • Suppressed compiler warnings for unused code in storage error handling and encoding utilities.

Copy link

coderabbitai bot commented Apr 19, 2025

Warning

Rate limit exceeded

@guozhihao-224 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 0 minutes and 20 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 0578604 and a340b0a.

📒 Files selected for processing (1)
  • src/storage/base_filter.rs (1 hunks)

Walkthrough

This update introduces a new abstraction for value encoding and parsing in the storage module by replacing concrete structs with trait-based designs for internal and parsed values. It adds a dedicated module for string value format handling, implements a compaction filter and its factory for RocksDB, and revises error handling for data type conversions. The changes also update module imports, expose new public constants, and add dependencies on the bytes and thiserror crates. Unit tests are included to verify the new compaction filter logic for string values with TTL. Additionally, dead code warnings are suppressed on some traits and error enums with TODO comments.

Changes

File(s) Change Summary
Cargo.toml Added bytes (v1) and thiserror (v1.0) as dependencies; reordered chrono dependency.
src/storage/base_filter.rs Introduced BaseMetaFilter and BaseMetaFilterFactory implementing RocksDB compaction filter traits; includes unit test for string value TTL compaction.
src/storage/base_value_format.rs Replaced concrete InternalValue/ParsedInternalValue structs with traits; added unified error handling for DataType conversion; updated imports.
src/storage/strings_value_format.rs New module: Implements StringValue and ParsedStringsValue for string value encoding/parsing; both implement new traits; handles TTL and compaction decisions.
src/storage/storage_define.rs Made TYPE_LENGTH a public constant; added new public constant STRING_VALUE_SUFFIXLENGTH.
src/storage/mod.rs Commented out base_data_value_format import; added imports for strings_value_format, error, and base_filter.
src/storage/coding.rs Added #[allow(dead_code)] attributes with TODO comments to suppress warnings on FixedInt trait and related functions.
src/storage/error.rs Added #[allow(dead_code)] attribute with TODO comment to suppress warnings on StorageError enum.

Sequence Diagram(s)

sequenceDiagram
    participant RocksDB
    participant BaseMetaFilterFactory
    participant BaseMetaFilter
    participant ParsedStringsValue

    RocksDB->>BaseMetaFilterFactory: create_compaction_filter()
    BaseMetaFilterFactory->>BaseMetaFilter: new()
    RocksDB->>BaseMetaFilter: filter(key, value)
    BaseMetaFilter->>ParsedStringsValue: parse value
    ParsedStringsValue->>BaseMetaFilter: filter_decision(current_time)
    BaseMetaFilter->>RocksDB: return CompactionDecision
Loading

Possibly related PRs

  • feat: implement coding #16: Introduced the FixedInt trait and related encoding/decoding functions in src/storage/coding.rs, which are now annotated with #[allow(dead_code)] in this PR.
  • feat: impl coding #40: Added implementations and tests for the FixedInt trait and its functions, related to the same code affected by the dead code allowance in this PR.
  • feat: add rust storage #24: Provided the initial Rust storage implementation including Cargo.toml and foundational modules; this PR extends the storage codebase with compaction filters and value format abstractions.

Suggested reviewers

  • baerwang
  • marsevilspirit

Poem

In bytes and chrono, we now delight,
New filters compaction, keeping data right.
Traits for values, parsing anew,
Strings with TTLs, handled true.
Constants revealed, old code replaced—
This rabbit hops on, storage embraced!
🐇✨


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@guozhihao-224 guozhihao-224 changed the title feat:StringValueFormat & BaseMetaFilter feat: StringValueFormat & BaseMetaFilter Apr 19, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (6)
src/storage/mod.rs (1)

15-15: Consider updating comment or removing entirely.

The commented-out module (base_data_value_format) suggests a transition to the new trait-based design. If this module is no longer needed, consider removing it entirely rather than leaving it as a comment to maintain code clarity.

src/storage/base_filter.rs (2)

92-108: Test might be flaky due to sleep dependency.

The test relies on std::thread::sleep to verify the TTL expiration, which can make it flaky in CI environments with variable performance. Consider using a time-independent approach where you can directly manipulate the current time or the expiration time for testing.

     #[test]
     fn test_strings_filter() {
         let mut filter = BaseMetaFilter::default();
 
         let string_val: &'static [u8] = b"filter_val";
         let mut string_val = crate::storage::strings_value_format::StringValue::new(string_val);
-        let ttl = 1_000_000; // 1 秒 = 1,000,000 微秒
+        let ttl = 1_000_000; // 1 second = 1,000,000 microseconds
         crate::storage::base_value_format::InternalValue::set_relative_timestamp(&mut string_val, ttl);
 
-        let decision = filter.filter(0, b"filter_key", &crate::storage::base_value_format::InternalValue::encode(&string_val));
+        // Test when current time is before expiration
+        let encoded_val = crate::storage::base_value_format::InternalValue::encode(&string_val);
+        let decision = filter.filter(0, b"filter_key", &encoded_val);
         assert!(matches!(decision, CompactionDecision::Keep));
 
-        std::thread::sleep(std::time::Duration::from_secs(2));
-
-        let decision = filter.filter(0, b"filter_key", &crate::storage::base_value_format::InternalValue::encode(&string_val));
+        // Create a separate test for expiration
+        // This eliminates the need for sleep and makes the test more reliable
+    }
+
+    #[test]
+    fn test_strings_filter_expired() {
+        let mut filter = BaseMetaFilter::default();
+        
+        // Create a value that's already expired
+        let string_val: &'static [u8] = b"filter_val";
+        let mut string_val = crate::storage::strings_value_format::StringValue::new(string_val);
+        
+        // Set expiration time to 1 microsecond in the past
+        let now = Utc::now().timestamp_micros() as u64;
+        string_val.set_etime(now - 1);
+        
+        let encoded_val = crate::storage::base_value_format::InternalValue::encode(&string_val);
+        let decision = filter.filter(0, b"filter_key", &encoded_val);
         assert!(matches!(decision, CompactionDecision::Remove));
     }

Note: The above implementation assumes you can add a set_etime method to directly set the expiration time for testing purposes. If such a method doesn't exist, you might need to modify the approach.


98-98: Consider translating Chinese comment.

For consistency and clarity in an open-source project, consider translating the Chinese comment to English. This will help ensure all contributors can understand the code.

-        let ttl = 1_000_000; // 1 秒 = 1,000,000 微秒
+        let ttl = 1_000_000; // 1 second = 1,000,000 microseconds
src/storage/base_value_format.rs (2)

37-53: Consider leveraging num_enum::TryFromPrimitive to shrink boilerplate

The handcrafted impl TryFrom<u8> exhaustively matches every variant. Using a derive‑based helper like num_enum::TryFromPrimitive (or strum) would:

  • remove this 13‑line match,
  • guarantee the mapping always stays in sync when the enum is extended,
  • and still return a custom StorageError for unknown values.
-use super::error::{
-    Result,
-    StorageError::{self, InvalidFormat},
-};
+use num_enum::TryFromPrimitive;
+
+use super::error::{Result, StorageError::InvalidFormat};
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, TryFromPrimitive)]
+#[repr(u8)]
+pub enum DataType {
+    String = 0,
+    Hash   = 1,
+    Set    = 2,
+    List   = 3,
+    ZSet   = 4,
+    None   = 5,
+    All    = 6,
+}

This trims maintenance overhead and removes the lingering TODO: use unified Result.


74-81: Expose an immutable buffer to avoid unnecessary cloning

encode(&self) -> BytesMut forces every caller to clone/ freeze the buffer before storage. Returning bytes::Bytes (or a BytesMut by value plus a helper to freeze) reduces foot‑guns and better conveys immutability of the encoded payload.

-pub trait InternalValue {
-    fn encode(&self) -> BytesMut;
+pub trait InternalValue {
+    /// Returns an immutable, ready‑to‑persist slice.
+    fn encode(&self) -> Bytes;

Adapting concrete impls is trivial (BytesMut::freeze).

src/storage/strings_value_format.rs (1)

96-100: Avoid an extra copy when wrapping the input slice

Bytes::copy_from_slice always allocates and copies. When the lifetime allows, Bytes::from(input_data.to_vec()) performs the same copy, but if the slice already originates from a Bytes/BytesMut, you could instead accept a Bytes directly or use Bytes::from_static for zero‑copy.

This is a micro‑optimisation, but worthwhile in hot paths like compaction filtering.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62a1204 and 721402f.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • Cargo.toml (1 hunks)
  • src/storage/base_filter.rs (1 hunks)
  • src/storage/base_value_format.rs (3 hunks)
  • src/storage/mod.rs (2 hunks)
  • src/storage/storage_define.rs (2 hunks)
  • src/storage/strings_value_format.rs (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/storage/base_filter.rs (3)
src/storage/base_key_format.rs (1)
  • key (108-110)
src/storage/strings_value_format.rs (5)
  • new (30-39)
  • new (80-109)
  • data_type (111-113)
  • set_relative_timestamp (65-67)
  • encode (43-55)
src/storage/base_value_format.rs (5)
  • new (86-88)
  • data_type (91-91)
  • try_from (41-52)
  • set_relative_timestamp (80-80)
  • encode (77-77)
src/storage/base_value_format.rs (3)
src/storage/strings_value_format.rs (12)
  • encode (43-55)
  • set_etime (57-59)
  • etime (123-125)
  • set_ctime (61-63)
  • ctime (119-121)
  • set_relative_timestamp (65-67)
  • new (30-39)
  • new (80-109)
  • filter_decision (131-136)
  • data_type (111-113)
  • user_value (115-117)
  • reserve (127-129)
src/storage/redis_multi.rs (1)
  • ttl (188-219)
src/kstd/slice.rs (2)
  • new (30-32)
  • data (43-45)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: cargo miri (ubuntu-latest)
  • GitHub Check: cargo miri (windows-latest)
  • GitHub Check: cargo miri (macos-latest)
🔇 Additional comments (8)
Cargo.toml (1)

21-22: Dependencies added correctly for new functionality.

The additions of bytes for buffer management and chrono for timestamp handling align well with the new string value format and compaction filter implementation. Both are established libraries with stable versions.

src/storage/mod.rs (1)

25-27: New modules properly registered.

The new modules for string value format, error handling, and compaction filtering are correctly registered. The organization follows the project structure and makes the new functionality available within the storage module.

src/storage/storage_define.rs (2)

24-24: Appropriate exposure of TYPE_LENGTH constant.

Making TYPE_LENGTH public is appropriate as it's now used by the new string value format implementation across module boundaries.


35-35: Well-defined constant for string value suffix.

The STRING_VALUE_SUFFIXLENGTH constant is well-defined as the sum of double timestamp length and suffix reserve length, which matches the string value encoding format implemented in strings_value_format.rs.

src/storage/base_filter.rs (2)

20-24: Implementation of CompactionFilter name method.

The name method is correctly implemented to provide a C-style string name for the filter. This is required by the RocksDB API.


73-86: CompactionFilterFactory implementation looks good.

The implementation of CompactionFilterFactory is correct and follows the RocksDB API requirements, creating a new filter instance and providing a name for the factory.

src/storage/strings_value_format.rs (2)

27-39: ttl unit is ambiguous – microseconds vs seconds

set_relative_timestamp adds ttl to Utc::now().timestamp_micros(), implying that the caller must supply micro‑seconds:

self.etime = Utc::now().timestamp_micros() as u64 + ttl;

Most APIs expose TTL in seconds. A mismatch will either expire keys ~1 000 000× later than intended or instantly.

Please document the expected unit and consider enforcing seconds for ergonomics:

-    fn set_relative_timestamp(&mut self, ttl: u64) {
-        self.etime = Utc::now().timestamp_micros() as u64 + ttl;
+    /// `ttl_secs` – time‑to‑live in seconds
+    fn set_relative_timestamp(&mut self, ttl_secs: u64) {
+        self.etime =
+            (Utc::now().timestamp_micros() as u64) + ttl_secs * 1_000_000;
     }

131-136: The time base for current_time should match etime

filter_decision compares etime—encoded in micro‑seconds—to a current_time value that the compaction filter passes in. Verify that the caller also supplies micro‑seconds; RocksDB’s CompactionFilter::FilterV2 typically deals in seconds.

If units differ, entries may never be purged. Please add a helper to fetch Utc::now().timestamp_micros() and use it consistently, or convert inside this method.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/storage/strings_value_format.rs (1)

79-109: ⚠️ Potential issue

Type check needed in ParsedStringsValue::new

The method converts the first byte into a DataType but doesn't verify that it's actually a DataType::String. This could lead to incorrect parsing of non-string values, causing field offsets to be computed on the wrong layout and producing incorrect data.

Add a type check:

let data = Bytes::copy_from_slice(input_data);
let data_type = DataType::try_from(data[0])?;
+ if data_type != DataType::String {
+     return Err(super::error::StorageError::InvalidFormat(
+         format!("Expected string type, got {:?}", data_type)
+     ));
+ }
🧹 Nitpick comments (6)
src/storage/strings_value_format.rs (6)

115-117: Remove unnecessary Range::clone in accessors

The clone() operation on the range is unnecessary and creates a new Range object on each method call.

fn user_value(&self) -> &[u8] {
-    &self.data[self.user_value_range.clone()]
+    &self.data[self.user_value_range.clone()]
}

127-129: Remove unnecessary Range::clone in reserve method

Similar to the user_value method, the clone is unnecessary here.

fn reserve(&self) -> &[u8] {
-    &self.data[self.reserve_range.clone()]
+    &self.data[self.reserve_range.clone()]
}

20-25: Add documentation for timestamp fields

The ctime and etime fields lack documentation explaining what they represent and their units (microseconds since epoch).

#[derive(Debug, Clone)]
pub struct StringValue {
    user_value: Bytes,
+   /// Expiration time in microseconds since epoch (0 means no expiration)
    etime: u64,
+   /// Creation time in microseconds since epoch
    ctime: u64,
}

28-29: Address TODO comment

There's a TODO comment to remove the dead code attribute. Either remove the attribute if the function is now being used, or add a more specific TODO with a ticket reference.

-    /// TODO: remove allow dead code
-    #[allow(dead_code)]
+    // This function will be used when implementing string operations

44-46: Check for potential overflow in buffer size calculation

The buffer size calculation does not check for potential integer overflow when adding the lengths.

-        let needed =
-            TYPE_LENGTH + self.user_value.len() + SUFFIX_RESERVE_LENGTH + 2 * TIMESTAMP_LENGTH;
+        let needed = TYPE_LENGTH
+            .checked_add(self.user_value.len())
+            .and_then(|sum| sum.checked_add(SUFFIX_RESERVE_LENGTH))
+            .and_then(|sum| sum.checked_add(2 * TIMESTAMP_LENGTH))
+            .ok_or_else(|| panic!("Buffer size calculation overflow"))?;

51-52: Document endianness assumptions

The code uses little-endian encoding/decoding methods but doesn't document this assumption explicitly.

+        // Store timestamps in little-endian format
        buf.put_u64_le(self.ctime);
        buf.put_u64_le(self.etime);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 721402f and 3b09739.

📒 Files selected for processing (4)
  • src/storage/base_filter.rs (1 hunks)
  • src/storage/mod.rs (1 hunks)
  • src/storage/storage_define.rs (2 hunks)
  • src/storage/strings_value_format.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/storage/mod.rs
  • src/storage/storage_define.rs
  • src/storage/base_filter.rs
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: cargo miri (macos-latest)
  • GitHub Check: cargo miri (windows-latest)
  • GitHub Check: cargo miri (ubuntu-latest)
🔇 Additional comments (1)
src/storage/strings_value_format.rs (1)

131-136:

❌ Incorrect review comment

Verify data type in filter_decision method

The filter_decision method doesn't check that the data type is String before applying expiration logic, which might lead to incorrect compaction decisions for non-string types.


🏁 Script executed:

#!/bin/bash
# Check if filter_decision is called on non-string types
rg -A 5 "filter_decision" src/

Length of output: 1351


We need full context from src/storage/base_filter.rs—particularly how different ValueFormat implementations (including StringsValueFormat) are selected and how filter_decision is invoked. This will confirm whether non‐string types ever invoke the string‐specific expiration logic. Please run:

Then we can determine if an explicit type check is necessary.


🏁 Script executed:

#!/bin/bash
# Show how filter_decision is dispatched in base_filter.rs
sed -n '1,200p' src/storage/base_filter.rs

Length of output: 3427


filter_decision only applies to strings

The expiration logic in ParsedStringsValue::filter_decision is only ever invoked when BaseMetaFilter has already matched DataType::String (see the match data_type in base_filter.rs). Non‑string values never reach this code path, so no additional type check is needed inside filter_decision.

Likely an incorrect or invalid review comment.

Copy link
Collaborator

@marsevilspirit marsevilspirit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check comment

@marsevilspirit
Copy link
Collaborator

@guozhihao-224 代码冲突了 需要解决一下

1. Change interval and parsedInterval to traits
2. Implement string_value_format and the BaseMetaFilter part of base_filter
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba682d4 and cd40e78.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • Cargo.toml (1 hunks)
  • src/storage/base_filter.rs (1 hunks)
  • src/storage/base_value_format.rs (3 hunks)
  • src/storage/coding.rs (3 hunks)
  • src/storage/error.rs (1 hunks)
  • src/storage/mod.rs (1 hunks)
  • src/storage/storage_define.rs (2 hunks)
  • src/storage/strings_value_format.rs (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/storage/storage_define.rs
🚧 Files skipped from review as they are similar to previous changes (6)
  • src/storage/mod.rs
  • src/storage/error.rs
  • Cargo.toml
  • src/storage/coding.rs
  • src/storage/base_filter.rs
  • src/storage/strings_value_format.rs
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: cargo miri (macos-latest)
  • GitHub Check: cargo miri (ubuntu-latest)
  • GitHub Check: cargo miri (windows-latest)
  • GitHub Check: cargo clippy (ubuntu-latest)
  • GitHub Check: cargo clippy (windows-latest)
  • GitHub Check: cargo clippy (macos-latest)
🔇 Additional comments (4)
src/storage/base_value_format.rs (4)

16-22: Good choice of structured imports.

The imports are well-organized, bringing in the necessary external dependencies (bytes::BytesMut, rocksdb::CompactionDecision) and the project's error types. This approach clearly separates external and internal dependencies.


26-26: Good practice adding standard derivations to DataType enum.

Adding these derive traits (Debug, Clone, Copy, PartialEq, Eq) to the DataType enum follows Rust best practices for type definitions. This provides important functionality for debugging, comparisons, and using the enum in various contexts.


74-81:

❓ Verification inconclusive

Good abstraction with the InternalValue trait.

The transition to a trait-based approach for value encoding and timestamp management provides better abstraction and flexibility. This will allow for different implementations tailored to specific data types.

I notice there's a TODO comment about removing the #[allow(dead_code)] annotation. Do you have a plan for addressing these TODOs?


🏁 Script executed:

#!/bin/bash
# Count all #[allow(dead_code)] annotations with TODOs in the codebase
rg "#\[allow\(dead_code\)\]" -B 1 --glob "*.rs" | grep -c "TODO"

Length of output: 67


Plan for removing #[allow(dead_code)] annotations?

I’ve found that this trait definition in src/storage/base_value_format.rs (lines 74–81) still carries a // TODO: remove allow dead code and, in total, there are 38 such annotations across the repository. It’d be great to know how you plan to track and remove these once the implementations land—e.g., opening an issue to clean them up, relying on coverage/lints, or another process.

• src/storage/base_value_format.rs (lines 74–81)
• 38 occurrences of // TODO: remove allow dead code in *.rs files

Let me know your strategy for cleaning up these dead‐code allowances!


83-96:

✅ Verification successful

ParsedInternalValue trait provides good flexibility.

The trait-based approach for parsed values provides a clean interface with methods for compaction decisions, accessing data types, and timestamp management. The default implementation for version() is a nice touch for backward compatibility.

Based on previous review comments, there was concern about validation in the ParsedInternalValue::new method. While the concrete implementation isn't shown here, make sure all implementations perform proper validation in both debug and release builds.


🏁 Script executed:

#!/bin/bash
# Find implementations of the ParsedInternalValue trait
rg "impl\s+ParsedInternalValue<" --glob "*.rs" -A 20 | grep -B 5 -A 15 "fn new"

Length of output: 81


🏁 Script executed:

#!/bin/bash
# Broad search for any impl of ParsedInternalValue to locate fn new implementations
rg -n "ParsedInternalValue<'a> for" --glob "*.rs" -A5
# Fallback search for any impl blocks mentioning the trait
rg -n "impl.*ParsedInternalValue" --glob "*.rs" -A5

Length of output: 1290


ParsedInternalValue trait provides good flexibility.

The trait-based approach for parsed values provides a clean interface with methods for compaction decisions, accessing data types, and timestamp management. The default implementation for version() is a nice touch for backward compatibility.

I verified that ParsedStringsValue::new (in src/storage/strings_value_format.rs) uses both a debug_assert! and a runtime if check returning an error for invalid input length, ensuring proper validation in both debug and release builds. Continue applying this pattern to any other implementations to guarantee robust error handling.

@guozhihao-224
Copy link
Contributor Author

@marsevilspirit 这里合并merge pr时,是不是能够将多个commit合并成一个(Squash and Merge)

@marsevilspirit
Copy link
Collaborator

@marsevilspirit 这里合并merge pr时,是不是能够将多个commit合并成一个(Squash and Merge)

yes.
我一般是用rebase。

@marsevilspirit
Copy link
Collaborator

这是pr完成了吗?
我看又添加了一些代码。

@guozhihao-224
Copy link
Contributor Author

这是完成了吗? 我看又添加了一些代码。

是的。base_data_key_format 和 base_filter里面的BaseDataFilter我单独另外提一个PR吧

Copy link
Collaborator

@marsevilspirit marsevilspirit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@marsevilspirit marsevilspirit merged commit 25ce99c into arana-db:main Apr 26, 2025
11 checks passed
This was referenced Jun 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
✏️ Feature new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants