HBASE-28399 region size can be wrong from RegionSizeCalculator #5700
base: master
Conversation
💔 -1 overall
This message was automatically generated.
Some unit tests failed. Let me try to fix them.
🎊 +1 overall
This message was automatically generated.
@@ -82,8 +81,8 @@ private void init(RegionLocator regionLocator, Admin admin) throws IOException {
       regionLocator.getName())) {

       byte[] regionId = regionLoad.getRegionName();
-      long regionSizeBytes =
-        ((long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE)) * MEGABYTE;
+      long regionSizeBytes = (long) regionLoad.getMemStoreSize().get(Size.Unit.BYTE)
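For illustration, here is a minimal, hypothetical demo (not part of the patch; the class name and values are made up) of why reading the Size in whole megabytes can truncate a small region to 0 bytes, while reading it in bytes does not:

```java
import org.apache.hadoop.hbase.Size;

public class SizeTruncationDemo {
  private static final long MEGABYTE = 1024L * 1024L;

  public static void main(String[] args) {
    // A region holding roughly 300 KB of store files.
    Size storeFileSize = new Size(0.3, Size.Unit.MEGABYTE);

    // Old computation: read the value in MB, cast to long (0.3 truncates to 0),
    // then scale back to bytes -- the region appears empty.
    long oldBytes = ((long) storeFileSize.get(Size.Unit.MEGABYTE)) * MEGABYTE; // 0

    // New computation: read the value directly in bytes.
    long newBytes = (long) storeFileSize.get(Size.Unit.BYTE); // 314572

    System.out.println("old=" + oldBytes + " new=" + newBytes);
  }
}
```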
Why did we use Unit.MEGABYTE and then multiply by 1024 * 1024 in the past? Can you find any related issues about this? Seems strange...
Seems it was like this in the first place...
Thanks for reviewing, Duo.
From my understanding, before we introduced the Size API we got the region size via getMemstoreSizeMB and getStoreFileSizeMB, so I guess this was just to keep the behavior the same when the new Size API was applied?
OK, I checked the code for how we construct the store file size and memstore size: the default unit is MB, so it is useless to pass a unit smaller than MB here...
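An illustrative snippet of that point (assumed value, not from the codebase): if the Size was constructed from a whole-megabyte figure, converting to bytes only rescales it and cannot recover sub-MB precision:

```java
// The metric was built from a whole-MB value, so sub-MB precision is already
// gone; asking for bytes just multiplies it back up.
Size storeFileSize = new Size(5, Size.Unit.MEGABYTE);  // e.g. reported as 5 MB
long bytes = (long) storeFileSize.get(Size.Unit.BYTE); // 5 * 1024 * 1024 = 5242880
```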
But anyway, I think the problem here is that we also need to account for the memstore size when calculating the region size?
Yes, otherwise the data in the memstore will be left out.
I think Hadoop assumes these split sizes are in megabytes, so we follow suit.
To protect against loss of precision, when the bytes-unit value is non-zero we can apply a minimum of 1 MB. We should always add this minimum when running against an online cluster.
When running against a snapshot, I'm not sure. MR over snapshots instantiates the region in the mapper process -- I assume that also reads the WAL and populates a memstore. In that case, we need the 1 MB minimum here too. If not, we can let the 0 pass through and give the empty-split optimization a chance.
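A minimal sketch of that floor, assuming a hypothetical helper name (clampRegionSize is not in the codebase):

```java
private static final long MIN_REGION_SIZE_BYTES = 1024L * 1024L; // 1 MB

// Hypothetical helper: preserve a genuine 0 so the empty-split optimization
// can still apply, but never let a non-zero region size round down below 1 MB.
static long clampRegionSize(long regionSizeBytes) {
  if (regionSizeBytes == 0L) {
    return 0L;
  }
  return Math.max(regionSizeBytes, MIN_REGION_SIZE_BYTES);
}
```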
> To protect against loss of precision, when the bytes-unit value is non-zero we can apply a minimum of 1 MB. We should always add this minimum when running against an online cluster.

Agree. And I believe this problem has been solved in https://issues.apache.org/jira/browse/HBASE-26609

> When running against a snapshot, I'm not sure. MR over snapshots instantiates the region in the mapper process -- I assume that also reads the WAL and populates a memstore. In that case, we need the 1 MB minimum here too. If not, we can let the 0 pass through and give the empty-split optimization a chance.

The value of the snapshot input split length will always be 0:
https://github.com/apache/hbase/blob/rel/3.0.0-beta-1/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl.java#L185
I think maybe this should be increased to 1 MB too?
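A hedged sketch of that suggestion, assuming the linked getLength() currently returns a constant 0 (the exact surrounding code is not shown here):

```java
// Hypothetical adjustment to the linked TableSnapshotInputFormatImpl split:
// report a 1 MB floor instead of a constant 0 so the MR framework does not
// treat every snapshot split as empty.
public long getLength() {
  return 1024L * 1024L; // was: return 0
}
```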