Skip to content

Implement SCAN Command for Key Pattern Matching #24

@ngocbd

Description

@ngocbd

Overview

MerkleKV currently lacks a SCAN command that allows clients to retrieve keys matching a specific pattern or prefix. This feature would be valuable for key discovery, debugging, and administrative operations.

Feature Request

Add a new SCAN command that returns all keys starting with a specified prefix string.

Command Syntax

SCAN <prefix>

Examples

  • `SCAN Hell` → Returns all keys starting with "Hell" (e.g., "Hell", "Hello", "Hello!!", "Hello123")
  • `SCAN user:` → Returns all user-related keys (e.g., "user:123", "user:admin", "user:guest")
  • `SCAN session_` → Returns all session keys (e.g., "session_abc", "session_xyz")

Response Format

KEYS <count>
<key1>
<key2>
<key3>
...

Or if no keys found:

KEYS 0

Implementation Requirements

Protocol Changes

  • Add SCAN command to `Command` enum in `src/protocol.rs`
  • Update protocol parser to handle SCAN syntax
  • Add validation for prefix parameter

Storage Engine Changes

  • Add `scan(prefix: &str) -> Vec` method to `KvEngine`
  • Implement efficient prefix matching using existing HashMap keys
  • Consider performance implications for large key sets

Server Integration

  • Add SCAN command handling in `src/server.rs`
  • Format response according to protocol specification
  • Handle empty results gracefully

Testing Requirements

  • Unit tests for protocol parsing
  • Unit tests for storage engine scan functionality
  • Integration tests for SCAN command via TCP
  • Performance tests with large datasets
  • Edge case tests (empty prefix, special characters)

Design Considerations

Performance

  • Use iterator-based approach to avoid loading all keys into memory
  • Consider adding optional LIMIT parameter for large result sets
  • Benchmark against datasets with 10K+ keys

Pattern Matching

  • Start with simple prefix matching (most common use case)
  • Consider future extension to glob patterns (*, ?, [])
  • Ensure case-sensitive matching for consistency

Memory Usage

  • Stream results rather than building large response strings
  • Consider pagination for very large result sets

Error Handling

  • Validate prefix parameter (non-empty, valid characters)
  • Handle edge cases gracefully
  • Provide clear error messages

Example Implementation Outline

Protocol Parser

"SCAN" => {
    if rest.is_empty() {
        return Err(anyhow!(\"SCAN command requires a prefix\"));
    }
    if rest.contains(' ') {
        return Err(anyhow!(\"SCAN command accepts only one argument\"));
    }
    Ok(Command::Scan {
        prefix: rest.to_string(),
    })
}

Storage Engine

pub fn scan(&self, prefix: &str) -> Vec<String> {
    self.data.keys()
        .filter(|key| key.starts_with(prefix))
        .cloned()
        .collect()
}

Server Response

Command::Scan { prefix } => {
    let keys = store.scan(&prefix);
    let mut response = format!(\"KEYS {}\\r\\n\", keys.len());
    for key in keys {
        response.push_str(&format!(\"{}\\r\\n\", key));
    }
    response
}

Use Cases

  1. Development & Debugging: Quickly find keys during development
  2. Administrative Tasks: List all keys in a namespace
  3. Cache Management: Find expired or related cache entries
  4. Data Migration: Identify keys to migrate or backup
  5. Monitoring: Discover key patterns and usage

Future Enhancements

  • Add LIMIT parameter: `SCAN prefix LIMIT 100`
  • Add cursor-based pagination: `SCAN prefix CURSOR 0 LIMIT 10`
  • Support glob patterns: `SCAN user:*:profile`
  • Add COUNT option to return only the number of matching keys

Priority

Medium - This is a useful administrative and development feature that enhances the usability of MerkleKV.


Labels: enhancement, protocol, feature-request
Milestone: v0.2.0"

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions