#314: Add support for sum of truncated values #324

Merged
4 commits merged into master from 314-add-support-for-sum-of-truncated-values on Mar 12, 2025

Conversation

@ABLL526 (Contributor) commented Mar 4, 2025

- Add support for sum of truncated values.
- Added the aggregatedTruncTotal and absAggregatedTruncTotal Measures.

Closes #314

Release notes:
- Added the aggregatedTruncTotal and absAggregatedTruncTotal Measures.
- Added the tests for these Measures.

Commits:
- Added the aggregatedTruncTotal Measure and the absAggregatedTruncTotal Measure.
- Added the tests for these Measures.
@ABLL526 added the "enhancement" (New feature or request) and "Agent" (Issues touching the agent part of the project) labels on Mar 4, 2025
@ABLL526 self-assigned this on Mar 4, 2025
@ABLL526 linked an issue on Mar 4, 2025 that may be closed by this pull request

github-actions bot commented Mar 4, 2025

JaCoCo model module code coverage report - scala 2.13.11

Overall Project 56.51% 🍏

There is no coverage information present for the Files changed


github-actions bot commented Mar 4, 2025

JaCoCo agent module code coverage report - scala 2.13.11

Overall Project 78.98% -8.83% 🍏
Files changed 61.54%

File Coverage
MeasuresBuilder.scala 100% 🍏
Measure.scala 88.17% -31.12%


github-actions bot commented Mar 4, 2025

JaCoCo reader module code coverage report - scala 2.13.11

Overall Project 95.16% 🍏

There is no coverage information present for the Files changed


github-actions bot commented Mar 4, 2025

JaCoCo server module code coverage report - scala 2.13.11

Overall Project 68.39% 🍏

There is no coverage information present for the Files changed

@@ -44,6 +44,8 @@ object AtumMeasure {
DistinctRecordCount.measureName,
SumOfValuesOfColumn.measureName,
AbsSumOfValuesOfColumn.measureName,
SumOfTruncatedValuesOfColumn.measureName,
Collaborator:

val supportedMeasureNames is not used

Contributor (author):

I have only added the names; it is possibly dead code. I will comment out the val and rerun the tests.

@@ -117,6 +119,42 @@ object AtumMeasure {
def apply(measuredCol: String): AbsSumOfValuesOfColumn = AbsSumOfValuesOfColumn(measureName, measuredCol)
}

case class SumOfTruncatedValuesOfColumn private (measureName: String, measuredCol: String) extends AtumMeasure {
Collaborator:

Why have you decided to use the double-casting approach instead of the standard functions from the org.apache.spark.sql package to round values before summing them?

  /**
   * Returns the value of the column `e` rounded to 0 decimal places with HALF_UP round mode.
   *
   * @group math_funcs
   * @since 1.5.0
   */
  def round(e: Column): Column = round(e, 0)

  /**
   * Round the value of `e` to `scale` decimal places with HALF_UP round mode
   * if `scale` is greater than or equal to 0 or at integral part when `scale` is less than 0.
   *
   * @group math_funcs
   * @since 1.5.0
   */
  def round(e: Column, scale: Int): Column = withExpr { Round(e.expr, Literal(scale)) }

  /**
   * Returns the value of the column `e` rounded to 0 decimal places with HALF_EVEN round mode.
   *
   * @group math_funcs
   * @since 2.0.0
   */
  def bround(e: Column): Column = bround(e, 0)

  /**
   * Round the value of `e` to `scale` decimal places with HALF_EVEN round mode
   * if `scale` is greater than or equal to 0 or at integral part when `scale` is less than 0.
   *
   * @group math_funcs
   * @since 2.0.0
   */
  def bround(e: Column, scale: Int): Column = withExpr { BRound(e.expr, Literal(scale)) }

Contributor (author):

Thank you. It was mentioned in the issue that this method was used in ATUM, but let me change it accordingly, since I think the method you have mentioned is more correct.

Contributor (author):

Unfortunately, the round function does not give the behaviour we need for negative numbers; we need a truncation that simply removes the decimals. I have found another way that ensures proper functionality without resorting to a double cast.
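
For illustration, a minimal standalone sketch (data, object and app names are made up for this example) of why rounding is not the truncation needed for negative values:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, round}

object RoundVsTruncSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("round-vs-trunc").getOrCreate()
    import spark.implicits._

    val df = Seq(-1.7, -0.8, 1.7).toDF("value")

    // round() uses HALF_UP, so -1.7 becomes -2.0 and -0.8 becomes -1.0,
    // whereas truncation (simply dropping the decimals) should give -1 and 0.
    df.select(col("value"), round(col("value"))).show()

    spark.stop()
  }
}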

Collaborator:

Yup, a combination of floor and ceil should work fine:

scala> df.show
+-----+
|value|
+-----+
| -1.1|
|  1.1|
| -1.7|
|  1.7|
| -0.8|
|  0.8|
+-----+


scala> df.select(when(col("value") >= 0, floor(col("value"))).otherwise(ceil(col("value")))).show
+-------------------------------------------------------------+
|CASE WHEN (value >= 0) THEN FLOOR(value) ELSE CEIL(value) END|
+-------------------------------------------------------------+
|                                                           -1|
|                                                            1|
|                                                           -1|
|                                                            1|
|                                                            0|
|                                                            0|
+-------------------------------------------------------------+
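
Tying the suggestion above together, a minimal standalone sketch (data, column, object and app names are illustrative, not the actual Measure.scala code) of truncating towards zero with floor/ceil and then summing:

import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{ceil, col, floor, sum, when}

object TruncatedSumSketch {
  // Truncate towards zero: floor for non-negative values, ceil for negative ones.
  def truncate(c: Column): Column = when(c >= 0, floor(c)).otherwise(ceil(c))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("trunc-sum").getOrCreate()
    import spark.implicits._

    val df = Seq(-1.1, 1.1, -1.7, 1.7, -0.8, 0.8).toDF("value")

    // Truncated values are -1, 1, -1, 1, 0, 0, so the sum is 0.
    df.select(sum(truncate(col("value")))).show()

    spark.stop()
  }
}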

Commits:
- Added the aggregatedTruncTotal Measure and the absAggregatedTruncTotal Measure.
- Added the tests for these Measures.
- Made amendments to the function to not include double casts.
}

override def measuredColumns: Seq[String] = Seq(measuredCol)
override val resultValueType: ResultValueType = ResultValueType.BigDecimalValue
Collaborator:

ResultValueType.LongValue?

Contributor (author):

Yes, I like that. Good catch. Thank you.

Commits:
- Added the aggregatedTruncTotal Measure and the absAggregatedTruncTotal Measure.
- Added the tests for these Measures.
- Made amendments to the function to not include double casts.
- Changed the result from BigDecimal to LongValue.
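
As a side note on the result-type change: floor and ceil on a double column produce a long in Spark, and summing longs stays long, which is why a Long result fits naturally. A small sketch (illustrative data and names) showing the resulting schema:

import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{ceil, col, floor, sum, when}

object TruncatedSumTypeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("trunc-sum-type").getOrCreate()
    import spark.implicits._

    val truncated: Column =
      when(col("value") >= 0, floor(col("value"))).otherwise(ceil(col("value")))

    // The aggregated truncated total comes out as LongType.
    Seq(-1.7, 1.7, -0.8).toDF("value")
      .select(sum(truncated).as("truncated_total"))
      .printSchema()

    spark.stop()
  }
}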
@salamonpavel (Collaborator) left a comment:

LGTM

@ABLL526 merged commit 07e4b4f into master on Mar 12, 2025 (9 checks passed)
@ABLL526 deleted the 314-add-support-for-sum-of-truncated-values branch on March 12, 2025 at 15:24