HBASE-28447 New site configuration option "hfile.block.size" #5820

Open
wants to merge 1 commit into base: master

@apurtell apurtell commented Apr 10, 2024

Introduce a new configuration setting, "hfile.block.size", that, if set, defines the default block size to use when writing HFiles if a column family schema does not define its own non-default block size. This is a bit complicated but required for compatibility. The rules are:

  • If the schema specifies a non-default block size, use it.
  • Otherwise, if the configuration specifies a non-default block size, use it.
  • Otherwise, use the default block size.

The default is defined by HConstants.DEFAULT_BLOCKSIZE.

Given how compound configurations work, the precedence order for a non-default block size is: BLOCKSIZE in the column family schema > "hfile.block.size" in CF or table level schema > "hfile.block.size" in site configuration > HConstants.DEFAULT_BLOCKSIZE
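
The three rules can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class name, the `resolveBlockSize` helper, and the use of a plain `Map` in place of a Hadoop `Configuration` are assumptions made to keep the example self-contained, and `DEFAULT_BLOCKSIZE` is assumed to be 64 KB as a stand-in for HConstants.DEFAULT_BLOCKSIZE. The map is meant to represent the already-merged (compound) view of schema-level and site-level settings.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; a hypothetical stand-in, not the StoreUtils code in this PR. */
final class BlockSizeResolutionSketch {
  // Assumed stand-in for HConstants.DEFAULT_BLOCKSIZE (64 KB).
  static final int DEFAULT_BLOCKSIZE = 64 * 1024;
  static final String BLOCK_SIZE_KEY = "hfile.block.size";

  /**
   * @param conf            merged configuration view (a plain map here, by assumption)
   * @param schemaBlockSize BLOCKSIZE as specified in the column family schema
   */
  static int resolveBlockSize(Map<String, String> conf, int schemaBlockSize) {
    // Rule 1: a non-default schema block size always wins.
    if (schemaBlockSize != DEFAULT_BLOCKSIZE) {
      return schemaBlockSize;
    }
    // Rule 2: otherwise, a non-default configured block size is used.
    String configured = conf.get(BLOCK_SIZE_KEY);
    if (configured != null) {
      int configuredSize = Integer.parseInt(configured.trim());
      if (configuredSize != DEFAULT_BLOCKSIZE) {
        return configuredSize;
      }
    }
    // Rule 3: otherwise, fall back to the default.
    return DEFAULT_BLOCKSIZE;
  }

  public static void main(String[] args) {
    int eightK = 8 * 1024;
    int oneM = 1024 * 1024;
    Map<String, String> conf = new HashMap<>();

    // Nothing configured anywhere: the default applies.
    System.out.println(resolveBlockSize(conf, DEFAULT_BLOCKSIZE)); // 65536

    // Only the configuration is set: it applies.
    conf.put(BLOCK_SIZE_KEY, String.valueOf(oneM));
    System.out.println(resolveBlockSize(conf, DEFAULT_BLOCKSIZE)); // 1048576

    // Schema and configuration both set: the schema wins.
    System.out.println(resolveBlockSize(conf, eightK)); // 8192
  }
}
```

Because the schema value is compared against the default rather than checked for presence, a schema that explicitly sets the default value is indistinguishable from one that sets nothing.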

@apurtell apurtell changed the title from "HBASE-28447 New configuration to override the hfile specific blocksize" to "HBASE-28447 New site configuration option "hfile.block.size"" on Apr 10, 2024
Contributor

@virajjasani virajjasani left a comment

+1

Comment on lines +42 to +45
@Test
public void testGetBlockSize() throws IOException {
int eightK = 8 * 1024;
int oneM = 1024 * 1024;
Contributor

I wonder if this could have been part of an existing test class, but no problem having its own separate class either

Contributor Author

I think it's best to start a TestStoreUtils class for testing StoreUtils. Some other units might be moved in here or added later. That would make more sense to me.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 30s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for branch
+1 💚 mvninstall 3m 17s master passed
+1 💚 compile 3m 13s master passed
+1 💚 checkstyle 0m 51s master passed
+0 🆗 refguide 2m 44s branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
+1 💚 spotless 0m 48s branch has no errors when running spotless:check.
+1 💚 spotbugs 2m 12s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 12s Maven dependency ordering for patch
+1 💚 mvninstall 3m 2s the patch passed
+1 💚 compile 3m 11s the patch passed
+1 💚 javac 3m 11s the patch passed
+1 💚 checkstyle 0m 50s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 xml 0m 1s The patch has no ill-formed XML file.
+0 🆗 refguide 2m 14s patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
+1 💚 hadoopcheck 5m 42s Patch does not cause any errors with Hadoop 3.3.6.
-1 ❌ spotless 0m 44s patch has 1 errors when running spotless:check, run spotless:apply to fix.
+1 💚 spotbugs 2m 21s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 18s The patch does not generate ASF License warnings.
39m 42s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #5820
Optional Tests dupname asflicense javac refguide spotless xml spotbugs hadoopcheck hbaseanti checkstyle compile
uname Linux f054e1e46748 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 5d694da
Default Java Eclipse Adoptium-11.0.17+8
refguide https://nightlies.apache.org/hbase/HBase-PreCommit-GitHub-PR/PR-5820/1/yetus-general-check/output/branch-site/book.html
refguide https://nightlies.apache.org/hbase/HBase-PreCommit-GitHub-PR/PR-5820/1/yetus-general-check/output/patch-site/book.html
spotless https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/artifact/yetus-general-check/output/patch-spotless.txt
Max. process+thread count 81 (vs. ulimit of 30000)
modules C: hbase-common hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/console
versions git=2.34.1 maven=3.8.6 spotbugs=4.7.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@apurtell
Contributor Author

The spotless check fails but when I run 'mvn spotless:apply' on my local branch there are no changes.

@apurtell
Contributor Author

Ah, there was something wrong with CI during the spotless check

[ERROR] An internal error occurred during: "Periodic workspace save.".
java.lang.IllegalStateException: Job manager has been shut down.

@bbeaudreault
Contributor

Ya that happens sometimes, not sure why

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 29s Docker mode activated.
-0 ⚠️ yetus 0m 4s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for branch
+1 💚 mvninstall 3m 16s master passed
+1 💚 compile 1m 13s master passed
+1 💚 shadedjars 5m 40s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 41s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for patch
+1 💚 mvninstall 2m 56s the patch passed
+1 💚 compile 1m 11s the patch passed
+1 💚 javac 1m 11s the patch passed
+1 💚 shadedjars 5m 37s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 39s the patch passed
_ Other Tests _
+1 💚 unit 2m 9s hbase-common in the patch passed.
+1 💚 unit 209m 3s hbase-server in the patch passed.
237m 50s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #5820
Optional Tests javac javadoc unit shadedjars compile
uname Linux 0b4ed2a636ac 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 5d694da
Default Java Eclipse Adoptium-17.0.10+7
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/testReport/
Max. process+thread count 5484 (vs. ulimit of 30000)
modules C: hbase-common hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 28s Docker mode activated.
-0 ⚠️ yetus 0m 4s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 12s Maven dependency ordering for branch
+1 💚 mvninstall 3m 1s master passed
+1 💚 compile 1m 2s master passed
+1 💚 shadedjars 5m 40s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 38s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 12s Maven dependency ordering for patch
+1 💚 mvninstall 2m 45s the patch passed
+1 💚 compile 1m 1s the patch passed
+1 💚 javac 1m 1s the patch passed
+1 💚 shadedjars 5m 32s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 36s the patch passed
_ Other Tests _
+1 💚 unit 2m 16s hbase-common in the patch passed.
+1 💚 unit 217m 36s hbase-server in the patch passed.
246m 16s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #5820
Optional Tests javac javadoc unit shadedjars compile
uname Linux 0f5a6ff8f380 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 5d694da
Default Java Eclipse Adoptium-11.0.17+8
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/testReport/
Max. process+thread count 5060 (vs. ulimit of 30000)
modules C: hbase-common hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 37s Docker mode activated.
-0 ⚠️ yetus 0m 2s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 11s Maven dependency ordering for branch
+1 💚 mvninstall 2m 40s master passed
+1 💚 compile 1m 1s master passed
+1 💚 shadedjars 5m 17s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 41s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for patch
+1 💚 mvninstall 2m 28s the patch passed
+1 💚 compile 1m 0s the patch passed
+1 💚 javac 1m 0s the patch passed
+1 💚 shadedjars 5m 13s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 41s the patch passed
_ Other Tests _
+1 💚 unit 1m 56s hbase-common in the patch passed.
+1 💚 unit 240m 34s hbase-server in the patch passed.
267m 30s
Subsystem Report/Notes
Docker ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #5820
Optional Tests javac javadoc unit shadedjars compile
uname Linux 593234cf39aa 5.4.0-172-generic #190-Ubuntu SMP Fri Feb 2 23:24:22 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 5d694da
Default Java Temurin-1.8.0_352-b08
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/testReport/
Max. process+thread count 4936 (vs. ulimit of 30000)
modules C: hbase-common hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5820/1/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache9
Contributor

Apache9 commented Apr 11, 2024

Introduce a new configuration setting, "hfile.block.size", that, if set, defines the default block size to use when writing HFiles if a column family schema does not define its own non-default block size. This is a bit complicated but required for compatibility. The rules are:

  • If the schema specifies a non-default block size, use it.
  • Otherwise, if the configuration specifies a non-default block size, use it.
  • Otherwise, use the default block size.

The default is defined by HConstants.DEFAULT_BLOCKSIZE.

Given how compound configurations work, the precedence order for a non-default block size is: BLOCKSIZE in the column family schema > "hfile.block.size" in CF or table level schema > "hfile.block.size" in site configuration > HConstants.DEFAULT_BLOCKSIZE

No table level configuration?


@gourabtaparia gourabtaparia left a comment

Overall LGTM - added one clarifying question.

* @param schemaBlockSize The block size as specified in the column family schema.
* @return The block size to use.
*/
public static int getBlockSize(Configuration conf, int schemaBlockSize) {

A clarification:

You have already called out the "non-default block size" condition, but one question: right now there can be a case where the site configuration has a non-default block size, say 1 MB (to be applied to all tables/CFs), and for some CF/table one explicitly wants 64 KB. Setting 64 KB explicitly in the schema won't be picked up, because that is the default.

In such cases, one would need to explicitly set the schema block size (either BLOCKSIZE or a configuration override in the schema) to 1 MB on each table that requires it.
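
The scenario in this comment can be reproduced with a small sketch. It assumes the rules exactly as stated in the PR description and a 64 KB default; the class and method names are hypothetical, and a plain `Map` stands in for the real configuration object, so this is an illustration of the concern rather than a test against the actual patch.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical illustration of the "explicit default" edge case; not code from the patch. */
final class ExplicitDefaultEdgeCase {
  // Assumed stand-in for HConstants.DEFAULT_BLOCKSIZE (64 KB).
  static final int DEFAULT_BLOCKSIZE = 64 * 1024;

  // Schema value wins only when it differs from the default; otherwise the configuration is consulted.
  static int resolve(Map<String, String> conf, int schemaBlockSize) {
    if (schemaBlockSize != DEFAULT_BLOCKSIZE) {
      return schemaBlockSize;
    }
    String configured = conf.get("hfile.block.size");
    return configured == null ? DEFAULT_BLOCKSIZE : Integer.parseInt(configured);
  }

  public static void main(String[] args) {
    // Site configuration asks for 1 MB blocks everywhere.
    Map<String, String> site =
      Collections.singletonMap("hfile.block.size", String.valueOf(1024 * 1024));

    // A column family that explicitly sets 64 KB still resolves to 1 MB,
    // because 64 KB is indistinguishable from "not set".
    System.out.println(resolve(site, 64 * 1024)); // 1048576

    // Any other explicit schema value is honored.
    System.out.println(resolve(site, 32 * 1024)); // 32768
  }
}
```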

@bbeaudreault
Contributor

@apurtell is this ready to be merged? It currently has fixVersion 2.6.0, and I'm looking to create the next RC tomorrow
