Skip to content

Conversation

@winterhazel
Copy link
Member

Description

removeRawUsageRecords calculates the next moment in which Usage will execute in order to prevent the removal of usage records in a window of 15 minutes before and after the Usage job executes. However, this calculation is performed incorrectly because the date is not rolled to the next day in case the initial calculated date has already passed. Due to this, if the Usage job is scheduled to execute at 00:00 and the current time is 23:59 (the job will execute in a minute), for instance, this check fails and the usage records are removed.

This PR fixes the calculation.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

The tests were performed in a VM with its date set to 2026-01-26 GMT-3.

Before the changes

  • I scheduled Usage to execute at 00:00 and set the Management Server's date to 23:50. I executed removeRawUsageRecords. I verified that the calculated execution date was 2026-01-26 00:00 (incorrect value), and that the usage records were removed.

    2026-01-27 02:50:37,340 DEBUG [c.c.u.UsageServiceImpl] (qtp1438988851-19:[ctx-406f7af2, ctx-443960b1]) (logid:3f310c43) Trying to remove old raw cloud_usage records older than 30 day(s), current time=1769482237337 next job execution time=1769472000000
    

With the patch

Regarding removeRawUsageRecords:

  • I scheduled Usage to execute at 00:00 and set the Management Server's date to 23:50. I executed removeRawUsageRecords. I verified that the calculated execution date was correct, and that the usage records were not removed.

    2026-01-27 02:51:48,514 DEBUG [c.c.u.UsageServiceImpl] (qtp1404565079-24:[ctx-41b413d8, ctx-9398b56c]) (logid:e7cc40c5) Next Usage job is scheduled to execute at [2026-01-27T00:00:00-0300]; previous execution was at [2026-01-26T00:00:00-0300].
    2026-01-27 02:51:48,514 INFO  [c.c.u.UsageServiceImpl] (qtp1404565079-24:[ctx-41b413d8, ctx-9398b56c]) (logid:e7cc40c5) Not removing any cloud_usage records because the next Usage job is scheduled to execute in less than 15 minute(s).
    
  • I scheduled Usage to execute at 00:00 and set the Management Server's date to 00:05. I executed removeRawUsageRecords. I verified that the calculated execution date was correct, and that the usage records were not removed.

    2026-01-26 03:07:20,495 DEBUG [c.c.u.UsageServiceImpl] (qtp1404565079-23:[ctx-bec9b411, ctx-07e46009]) (logid:c37addd5) Next Usage job is scheduled to execute at [2026-01-27T00:00:00-0300]; previous execution was at [2026-01-26T00:00:00-0300].
    2026-01-26 03:07:20,496 INFO  [c.c.u.UsageServiceImpl] (qtp1404565079-23:[ctx-bec9b411, ctx-07e46009]) (logid:c37addd5) Not removing any cloud_usage records because the last Usage job executed in less than 15 minute(s) ago.
    
  • I scheduled Usage to execute at 00:00 and set the Management Server's date to 23:40. I executed removeRawUsageRecords. I verified that the calculated execution date was correct, and that the usage records were removed.

    2026-01-27 02:40:34,826 DEBUG [c.c.u.UsageServiceImpl] (qtp1404565079-23:[ctx-0b325ec7, ctx-c3715fae]) (logid:cf95c7ea) Next Usage job is scheduled to execute at [2026-01-27T00:00:00-0300]; previous execution was at [2026-01-26T00:00:00-0300].
    2026-01-27 02:40:34,826 INFO  [c.c.u.UsageServiceImpl] (qtp1404565079-23:[ctx-0b325ec7, ctx-c3715fae]) (logid:cf95c7ea) Removing cloud_usage records older than 30 day(s).
    
  • I scheduled Usage to execute at 00:00 and set the Management Server's date to 00:20. I executed removeRawUsageRecords. I verified that the calculated execution date was correct, and that the usage records were removed.

    2026-01-26 03:20:47,151 DEBUG [c.c.u.UsageServiceImpl] (qtp1404565079-23:[ctx-2dc17d46, ctx-8c10ccad]) (logid:7f4dbf36) Next Usage job is scheduled to execute at [2026-01-27T00:00:00-0300]; previous execution was at [2026-01-26T00:00:00-0300].
    2026-01-26 03:20:47,151 INFO  [c.c.u.UsageServiceImpl] (qtp1404565079-23:[ctx-2dc17d46, ctx-8c10ccad]) (logid:7f4dbf36) Removing cloud_usage records older than 30 day(s).
    

Regarding the job execution time in the Usage Server itself:

  • I scheduled Usage to execute at 00:00 and set the Management Server's date to 23:50. I restarted the Usage Server. I verified that the calculated execution date was correct.

    2026-01-27 02:50:08,724 INFO  [cloud.usage.UsageManagerImpl_EnhancerByCloudStack_68b35dd4] (main:[]) (logid:) Usage is configured to execute in time zone [GMT-03:00], at [00:00], each [1440] minutes; the current time in that timezone is [2026-01-26T23:50:08-0300] and the next job is scheduled to execute at [2026-01-27T00:00:00-0300]. During its execution, Usage will aggregate stats according to the time zone [GMT-03:00] defined in global setting [usage.aggregation.timezone].
    
  • I scheduled Usage to execute at 00:00 and set the Management Server's date to 00:20. I restarted the Usage Server. I verified that the calculated execution date was correct.

    2026-01-26 03:20:07,917 INFO  [cloud.usage.UsageManagerImpl_EnhancerByCloudStack_68b35dd4] (main:[]) (logid:) Usage is configured to execute in time zone [GMT-03:00], at [00:00], each [1440] minutes; the current time in that timezone is [2026-01-26T00:20:07-0300] and the next job is scheduled to execute at [2026-01-27T00:00:00-0300]. During its execution, Usage will aggregate stats according to the time zone [GMT-03:00] defined in global setting [usage.aggregation.timezone].
    
  • I scheduled Usage to execute at 00:10 and set the Management Server's date to 00:05. I restarted the Usage Server. I verified that the calculated execution date was correct.

    2026-01-26 03:05:08,964 INFO  [cloud.usage.UsageManagerImpl_EnhancerByCloudStack_68b35dd4] (main:[]) (logid:) Usage is configured to execute in time zone [GMT-03:00], at [00:10], each [1440] minutes; the current time in that timezone is [2026-01-26T00:05:08-0300] and the next job is scheduled to execute at [2026-01-26T00:10:00-0300]. During its execution, Usage will aggregate stats according to the time zone [GMT-03:00] defined in global setting [usage.aggregation.timezone].
    
  • I scheduled Usage to execute at 00:10 and set the Management Server's date to 00:12. I restarted the Usage Server. I verified that the calculated execution date was correct.

    2026-01-26 03:12:09,134 INFO  [cloud.usage.UsageManagerImpl_EnhancerByCloudStack_68b35dd4] (main:[]) (logid:) Usage is configured to execute in time zone [GMT-03:00], at [00:10], each [1440] minutes; the current time in that timezone is [2026-01-26T00:12:09-0300] and the next job is scheduled to execute at [2026-01-27T00:10:00-0300]. During its execution, Usage will aggregate stats according to the time zone [GMT-03:00] defined in global setting [usage.aggregation.timezone].
    

@winterhazel
Copy link
Member Author

@blueorangutan package

@blueorangutan
Copy link

@winterhazel a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.

@codecov
Copy link

codecov bot commented Jan 26, 2026

Codecov Report

❌ Patch coverage is 43.10345% with 33 lines in your changes missing coverage. Please review.
✅ Project coverage is 16.02%. Comparing base (8db065a) to head (bd5d8f9).
⚠️ Report is 12 commits behind head on 4.20.

Files with missing lines Patch % Lines
...rc/main/java/com/cloud/usage/UsageServiceImpl.java 0.00% 21 Missing ⚠️
...rc/main/java/com/cloud/usage/UsageManagerImpl.java 10.00% 9 Missing ⚠️
.../org/apache/cloudstack/utils/usage/UsageUtils.java 88.88% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #12518      +/-   ##
============================================
- Coverage     16.24%   16.02%   -0.22%     
- Complexity    13396    13403       +7     
============================================
  Files          5658     5704      +46     
  Lines        499169   513122   +13953     
  Branches      60579    64702    +4123     
============================================
+ Hits          81087    82246    +1159     
- Misses       409041   421774   +12733     
- Partials       9041     9102      +61     
Flag Coverage Δ
uitests 4.28% <ø> (+0.25%) ⬆️
unittests 17.10% <43.10%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@blueorangutan
Copy link

Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 16534

@winterhazel winterhazel changed the base branch from main to 4.20 January 26, 2026 13:25
@winterhazel winterhazel reopened this Jan 26, 2026
@winterhazel
Copy link
Member Author

@blueorangutan package

@blueorangutan
Copy link

@winterhazel a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.


private TimeZone usageExecutionTimeZone = TimeZone.getTimeZone("GMT");

private static final long REMOVE_RAW_USAGE_RECORDS_WINDOW_IN_MS = 15 * 60 * 1000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@winterhazel , why the “magic” 15 minutes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never mind, traditional magic.. :(

@blueorangutan
Copy link

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16535

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants