HBASE-29402: Comprehensive key management for encryption at rest #7111

Open
wants to merge 62 commits into
base: master

haridsv

@haridsv haridsv commented Jun 18, 2025

JIRA: HBASE-29402
Please refer to parent JIRA: HBASE-29368 for details and a design doc.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 29s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 1s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 🆗 | buf | 0m 0s | | buf was not available. |
| +0 🆗 | buf | 0m 0s | | buf was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 31s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 15s | | master passed |
| +1 💚 | compile | 8m 15s | | master passed |
| +1 💚 | checkstyle | 1m 12s | | master passed |
| +1 💚 | spotbugs | 12m 42s | | master passed |
| +1 💚 | spotless | 0m 46s | | branch has no errors when running spotless:check. |
| -0 ⚠️ | patch | 1m 26s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 12s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 2m 55s | | the patch passed |
| +1 💚 | compile | 7m 56s | | the patch passed |
| +1 💚 | cc | 7m 56s | | the patch passed |
| -0 ⚠️ | javac | 7m 56s | /results-compile-javac-root.txt | root generated 15 new + 1740 unchanged - 0 fixed = 1755 total (was 1740) |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 1m 12s | | the patch passed |
| -0 ⚠️ | rubocop | 0m 27s | /results-rubocop.txt | The patch generated 1 new + 84 unchanged - 0 fixed = 85 total (was 84) |
| +1 💚 | spotbugs | 13m 7s | | the patch passed |
| +1 💚 | hadoopcheck | 11m 14s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | hbaseprotoc | 4m 48s | | the patch passed |
| -1 ❌ | spotless | 0m 13s | | patch has 64 errors when running spotless:check, run spotless:apply to fix. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 1m 1s | | The patch does not generate ASF License warnings. |
| | | 78m 38s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7111/11/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #7111 |
| JIRA Issue | HBASE-29402 |
| Optional Tests | dupname asflicense codespell detsecrets spotless javac spotbugs checkstyle compile hadoopcheck hbaseanti cc buflint bufcompat hbaseprotoc rubocop |
| uname | Linux 876e5c091df4 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 346659c |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| spotless | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7111/11/artifact/yetus-general-check/output/patch-spotless.txt |
| Max. process+thread count | 188 (vs. ulimit of 30000) |
| modules | C: hbase-protocol-shaded hbase-common hbase-client hbase-server hbase-testing-util hbase-shell . U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7111/11/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 rubocop=1.37.1 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 29s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 4s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 12s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 2m 57s | | master passed |
| +1 💚 | compile | 2m 9s | | master passed |
| +1 💚 | javadoc | 3m 33s | | master passed |
| +1 💚 | shadedjars | 6m 4s | | branch has no errors when building our shaded downstream artifacts. |
| -0 ⚠️ | patch | 6m 52s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 3s | | the patch passed |
| +1 💚 | compile | 2m 11s | | the patch passed |
| +1 💚 | javac | 2m 11s | | the patch passed |
| +1 💚 | javadoc | 3m 27s | | the patch passed |
| +1 💚 | shadedjars | 6m 8s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 273m 11s | /patch-unit-root.txt | root in the patch failed. |
| | | 314m 34s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7111/11/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #7111 |
| JIRA Issue | HBASE-29402 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux c99a6e8e890b 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 346659c |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7111/11/testReport/ |
| Max. process+thread count | 6868 (vs. ulimit of 30000) |
| modules | C: hbase-protocol-shaded hbase-common hbase-client hbase-server hbase-testing-util hbase-shell . U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7111/11/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

haridsv added 2 commits July 7, 2025 12:04
It took several back-and-forth cycles to tell Cursor what I wanted and to correct the code, so I asked it to summarize the exchange as a comprehensive prompt; the result is below. I am saving it as a quick future reference and will follow up with manual improvements to this change and better test coverage.

Here's a comprehensive prompt that captures all the incremental changes and corrections we made throughout this refactoring:

---

**Comprehensive Refactoring Prompt for ManagedKeyDataCache and ManagedKeyAccessor**

I want to refactor the key management system in HBase to improve performance, simplify the architecture, and eliminate redundant code. Here are the specific requirements:

 ## Initial Refactoring Goals

1. **Refactor ManagedKeyDataCache.getEntry() to use CacheLoader interface**: Remove redundant methods and delegate responsibility properly. Use Caffeine's `get()` method with a lambda function for loading to allow passing context parameters (`key_cust` and `keyNamespace`) directly.

2. **Refactor ManagedKeyAccessor to delegate all key retrieval logic to the cache**: Include dynamic provider access, simplifying the accessor to be a thin wrapper.

3. **Merge getActiveKeys method into getRandomEntry**: Make `cacheByNS` a separate Caffeine cache with its own config to handle only active keys efficiently.

4. **Simplify active keys cache structure**: Use a 2-level cache with a composite key (custodian + namespace) and a regular Map for keys, then change to a List for O(1) random selection.
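
As a rough illustration of goals 1 and 4, the sketch below shows a composite key and a per-call loader lambda that captures the custodian and namespace from the call site. It is not the code in this PR: `CustodianNamespaceKey`, `KeyLookupSketch`, and the `provider` parameter are hypothetical names, values are simplified to `String`, and a `ConcurrentHashMap` with `computeIfAbsent` stands in for Caffeine's `Cache.get(key, mappingFunction)` so that the example needs only the JDK.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

/** Hypothetical composite cache key: custodian plus namespace. */
final class CustodianNamespaceKey {
  private final String custodian;
  private final String namespace;

  CustodianNamespaceKey(String custodian, String namespace) {
    this.custodian = Objects.requireNonNull(custodian);
    this.namespace = Objects.requireNonNull(namespace);
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof CustodianNamespaceKey)) {
      return false;
    }
    CustodianNamespaceKey other = (CustodianNamespaceKey) o;
    return custodian.equals(other.custodian) && namespace.equals(other.namespace);
  }

  @Override
  public int hashCode() {
    return Objects.hash(custodian, namespace);
  }
}

/** Stand-in for the cache: a ConcurrentHashMap plays the role of the Caffeine cache. */
final class KeyLookupSketch {
  private final Map<CustodianNamespaceKey, String> cache = new ConcurrentHashMap<>();

  /**
   * Returns the cached value, calling the provider only on a miss. The lambda captures
   * custodian and namespace from the call site, so no separate loader state is needed.
   */
  String getEntry(String custodian, String namespace,
      BiFunction<String, String, String> provider) {
    return cache.computeIfAbsent(new CustodianNamespaceKey(custodian, namespace),
      k -> provider.apply(custodian, namespace));
  }

  public static void main(String[] args) {
    KeyLookupSketch sketch = new KeyLookupSketch();
    // First call misses and invokes the provider; the result is stored automatically.
    System.out.println(sketch.getEntry("cust1", "ns1", (c, n) -> "key-for-" + c + "-" + n));
    // Second call hits, so this provider is never invoked.
    System.out.println(sketch.getEntry("cust1", "ns1", (c, n) -> "never-used"));
  }
}
```

Because the value returned by the loader is stored by the cache itself, an explicit add call inside the loader would be redundant in this sketch.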

 ## Cache Architecture Changes

5. **Change active keys cache from Map to List**: Change the value type in `activeKeysCache` from `Map<String, ManagedKeyData>` to `List<ManagedKeyData>` to make random selection lightweight (O(1) instead of O(n)).

6. **Remove redundant addEntry() calls**: Explicit `addEntry()` calls inside Caffeine callbacks are redundant since Caffeine automatically caches returned values. Remove these from `getEntry()`, `getRandomEntry()`, and `getActiveKeys()` methods.

7. **Update removeEntry to work with both caches**: The `removeEntry` method should remove from both the main cache and the active keys cache.

8. **Update getEntryCount to include both caches**: Make it lightweight and clearly indicate it's an estimate that may include double-counting.
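
To make items 5, 7, and 8 concrete, here is a minimal sketch under simplifying assumptions: `TwoCacheSketch` and its method signatures are hypothetical, key metadata and key data are plain strings, and two `ConcurrentHashMap` instances stand in for the two Caffeine caches. It shows an indexed read from a `List` for random selection, removal from both structures, and a count that knowingly double-counts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical two-cache holder; plain maps stand in for the two Caffeine caches. */
final class TwoCacheSketch {
  // Main cache: key metadata -> key data (both simplified to String here).
  private final Map<String, String> mainCache = new ConcurrentHashMap<>();
  // Active keys cache: "custodian:namespace" -> immutable list of active key metadata.
  private final Map<String, List<String>> activeKeysCache = new ConcurrentHashMap<>();

  void putMain(String metadata, String keyData) {
    mainCache.put(metadata, keyData);
  }

  void putActive(String custodianAndNamespace, List<String> activeMetadata) {
    activeKeysCache.put(custodianAndNamespace, List.copyOf(activeMetadata));
  }

  /** Picks a random active key with one indexed read; returns null if none is cached. */
  String getRandomEntry(String custodianAndNamespace) {
    List<String> active = activeKeysCache.get(custodianAndNamespace);
    if (active == null || active.isEmpty()) {
      return null;
    }
    return active.get(ThreadLocalRandom.current().nextInt(active.size()));
  }

  /** Removes the entry from the main cache and from the active list, if present. */
  void removeEntry(String custodianAndNamespace, String metadata) {
    mainCache.remove(metadata);
    activeKeysCache.computeIfPresent(custodianAndNamespace, (k, list) -> {
      List<String> remaining = new ArrayList<>(list);
      remaining.remove(metadata);
      // Returning null drops the mapping once no active keys are left.
      return remaining.isEmpty() ? null : List.copyOf(remaining);
    });
  }

  /** Cheap estimate: an active key that is also in the main cache is counted twice. */
  long getEntryCount() {
    long active = 0;
    for (List<String> list : activeKeysCache.values()) {
      active += list.size();
    }
    return mainCache.size() + active;
  }
}
```

With a `Map` as the value type, picking a random element would require iterating to the chosen position; the list keeps that step to a single indexed access. The lists are replaced rather than mutated so that concurrent readers never observe a partially updated list.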

 ## Active Keys Cache Behavior

9. **Remove automatic addition to active keys cache**: `addEntry()` should not add to the active keys cache to avoid suppressing bulk loading logic. During random key selection, include individual active entries from the main cache if missing from the active keys cache.

10. **Simplify getRandomEntry**: Use `cache.getIfPresent` instead of `cache.get` to avoid triggering loading callbacks when just checking what's already cached.

11. **Remove dynamic lookup from active keys cache callback**: The `isDynamicLookupEnabled()` check should not be used in the callback for `activeKeysCache` since the callback is for loading active keys from the cache, not for determining dynamic lookup.

12. **Ignore main cache entries in getRandomEntry**: The `getRandomEntry` method should only return entries from the active keys cache and ignore any entries that might exist in the main cache.
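
The difference that item 10 relies on can be sketched as follows. This is an illustration only: `PeekVsLoadSketch` is a hypothetical JDK-only stand-in whose two methods mirror the semantics of Caffeine's `Cache.get(key, mappingFunction)` and `Cache.getIfPresent(key)`, with a `ConcurrentHashMap` doing the storage.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical stand-in that separates "load on miss" from "peek only". */
final class PeekVsLoadSketch<K, V> {
  private final Map<K, V> entries = new ConcurrentHashMap<>();

  /** Load-through read: invokes the loader on a miss and stores its result. */
  V get(K key, Function<? super K, ? extends V> loader) {
    return entries.computeIfAbsent(key, loader);
  }

  /** Peek: returns what is already cached, or null, and never invokes a loader. */
  V getIfPresent(K key) {
    return entries.get(key);
  }

  public static void main(String[] args) {
    PeekVsLoadSketch<String, String> cache = new PeekVsLoadSketch<>();
    // Nothing cached yet, so the peek returns null without any provider call.
    System.out.println(cache.getIfPresent("cust1:ns1"));
    // The load-through read populates the entry.
    cache.get("cust1:ns1", k -> "active-keys-for-" + k);
    // The peek now sees the cached value.
    System.out.println(cache.getIfPresent("cust1:ns1"));
  }
}
```

A random-selection method built on the peek variant returns nothing when the active keys have not been loaded yet, which is the trade-off for avoiding a loading callback on every check.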

 ## Method Cleanup

13. **Remove getActiveKeys() method**: Since `activeKeysCache` already contains only active keys, the `getActiveKeys()` method is redundant. Remove it and simplify `ManagedKeyAccessor.getAnActiveKey()` method.

14. **Remove ManagedKeyAccessor entirely**: Once `ManagedKeyAccessor` has no meaningful logic left, remove it and replace usage with `ManagedKeyDataCache` directly. Update the `Server` interface to return `ManagedKeyDataCache` instead of `ManagedKeyAccessor`.

15. **Remove addEntry method**: If `addEntry` is not being used by any callers, remove it. If it's only used internally, replace with direct cache access.

16. **Remove unused methods**: Remove `getActiveEntriesFromCache()` method if it's not being used anywhere.

 ## Inheritance and Code Reuse

17. **Make KeyManagementBase the base class**: Make `ManagedKeyDataCache` extend `KeyManagementBase` so it can reuse the `getKeyProvider()` method and other common functionality like `getServer()`.

 ## Testing and Validation

18. **Update all test files**: Replace all usages of `ManagedKeyAccessor` with `ManagedKeyDataCache` in test files and mock classes.

19. **Fix test compilation issues**: Update test mocks to properly configure the server mock to return a valid Configuration object.

20. **Update test logic**: Modify tests to work with the new architecture where `getRandomEntry()` only looks at the active keys cache, not the main cache.

 ## Key Design Principles

- **Performance**: Random key selection should be O(1)
- **Simplicity**: Remove redundant layers and methods
- **Correctness**: Maintain proper L1 → L2 → Dynamic Provider hierarchy
- **Testability**: Ensure tests can still validate functionality
- **Memory efficiency**: Use appropriate data structures for each use case

 ## Expected Final Architecture

- **Main cache**: Caffeine cache for general key data with automatic loading
- **Active keys cache**: Caffeine cache with List values for O(1) random selection
- **No accessor layer**: Direct usage of `ManagedKeyDataCache`
- **Inheritance**: `ManagedKeyDataCache` extends `KeyManagementBase`
- **Clean separation**: Active keys cache only for active keys, main cache for all keys

Please implement these changes incrementally, ensuring each step compiles and tests pass before proceeding to the next.
@haridsv
Author

haridsv commented Jul 9, 2025

Closing temporarily to rework the caching layers.

@haridsv haridsv closed this Jul 9, 2025
@haridsv haridsv reopened this Jul 11, 2025