Add server level dynamically configurable segment download throttler #15001

Open · somandal wants to merge 5 commits into master from segment-download-throttle-dynamic-config
Conversation

@somandal (Contributor) commented Feb 5, 2025

This PR adds a new server level, dynamically configurable throttler for segment downloads to reduce the impact of resource exhaustion (disk, CPU, memory) on the servers. The throttler is used for both deep store and peer downloads, and it reuses the mechanism introduced in #14894 to allow dynamic config updates.

The new server level config is meant to work alongside the existing table level config pinot.server.instance.table.level.max.parallel.segment.downloads for deep store downloads. The table level semaphore is acquired first, followed by the server level semaphore. The table level semaphore ensures there is no starvation among tables when a large number of segments are suddenly added or updated for a single table. The server level semaphore protects the server from having too many resources consumed by downloads.

This PR also renames SegmentPreprocessThrottler to SegmentOperationsThrottler, since with the newly added SegmentDownloadThrottler the class is no longer limited to preprocessing.

Lock ordering for deep store downloads:

  • Table level
  • Server level
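The lock ordering above can be sketched with plain `java.util.concurrent.Semaphore`s. This is an illustrative sketch only: the class and field names are hypothetical, not Pinot's actual implementation, and in the real setup each table would hold its own table level semaphore while the server level one is shared across tables.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of the two-level download throttle; not Pinot's actual code.
public class TwoLevelDownloadThrottler {
  // Per-table semaphore: prevents one table's download burst from starving other tables
  private final Semaphore _tableSemaphore;
  // Server-wide semaphore: caps total concurrent downloads to protect disk/CPU/memory
  private final Semaphore _serverSemaphore;

  public TwoLevelDownloadThrottler(int tablePermits, int serverPermits) {
    _tableSemaphore = new Semaphore(tablePermits);
    _serverSemaphore = new Semaphore(serverPermits);
  }

  public void downloadSegment(Runnable download) {
    // Lock ordering: table level first, then server level; release in reverse order
    _tableSemaphore.acquireUninterruptibly();
    try {
      _serverSemaphore.acquireUninterruptibly();
      try {
        download.run();
      } finally {
        _serverSemaphore.release();
      }
    } finally {
      _tableSemaphore.release();
    }
  }
}
```

Acquiring the two semaphores in a fixed order (and releasing in reverse) is what keeps the combination deadlock-free when many tables contend on the shared server level permit.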

Disabling throttling:

  • The table level config can be disabled by setting a value <= 0. This continues to work as is.
  • The newly added server level config cannot be disabled, but setting a sufficiently large value effectively disables throttling.
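For reference, disabling the table level throttle might look like this in the server config. The property name is the one given in this PR description; the value shown is just one way to satisfy the <= 0 condition:

```properties
# Any value <= 0 disables the table level segment download throttle
pinot.server.instance.table.level.max.parallel.segment.downloads=-1
```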

The upper bound on the number of threads that can be downloading segments is always limited to: STATE_TRANSITIONS.maxThreads + min(USER_DEFINE_MSG.maxThreads, max.parallel.refresh.threads)

OSS Helix defaults STATE_TRANSITIONS.maxThreads and USER_DEFINE_MSG.maxThreads to 40.
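Plugging in those Helix defaults gives a quick worked example of the bound. The refresh-thread value of 10 below is purely illustrative, not a default:

```java
// Worked example of the upper bound formula above, using OSS Helix defaults (40)
// and an illustrative max.parallel.refresh.threads value of 10.
int stateTransitionsMaxThreads = 40; // STATE_TRANSITIONS.maxThreads (Helix default)
int userDefineMsgMaxThreads = 40;    // USER_DEFINE_MSG.maxThreads (Helix default)
int maxParallelRefreshThreads = 10;  // example value, not a default

int upperBound = stateTransitionsMaxThreads
    + Math.min(userDefineMsgMaxThreads, maxParallelRefreshThreads);
// upperBound = 40 + min(40, 10) = 50
```

With the defaults alone (no refresh-thread cap below 40), the bound works out to 40 + 40 = 80 download threads.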

cc @Jackie-Jiang @klsince @npawar @jackjlli @jasperjiaguo @vvivekiyer

@Jackie-Jiang (Contributor) left a comment


Seems like the download throttling is on each table. Not introduced in this PR, but that is wrong. The semaphore needs to be passed from outside so that it can be shared across all tables.

codecov-commenter commented Feb 6, 2025

Codecov Report

Attention: Patch coverage is 65.42056% with 37 lines in your changes missing coverage. Please review.

Project coverage is 63.69%. Comparing base (59551e4) to head (c845a4d).
Report is 1692 commits behind head on master.

Files with missing lines | Patch % | Lines
.../pinot/core/data/manager/BaseTableDataManager.java | 51.51% | 13 Missing and 3 partials ⚠️
.../pinot/server/starter/helix/BaseServerStarter.java | 0.00% | 9 Missing ⚠️
...nt/local/utils/BaseSegmentOperationsThrottler.java | 78.57% | 3 Missing and 3 partials ⚠️
...ata/manager/realtime/RealtimeTableDataManager.java | 0.00% | 2 Missing ⚠️
...ocal/segment/index/loader/SegmentPreProcessor.java | 60.00% | 0 Missing and 2 partials ⚠️
...egment/local/utils/SegmentOperationsThrottler.java | 80.00% | 1 Missing ⚠️
...server/starter/helix/HelixInstanceDataManager.java | 0.00% | 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #15001      +/-   ##
============================================
+ Coverage     61.75%   63.69%   +1.94%     
- Complexity      207     1482    +1275     
============================================
  Files          2436     2728     +292     
  Lines        133233   152480   +19247     
  Branches      20636    23569    +2933     
============================================
+ Hits          82274    97119   +14845     
- Misses        44911    48044    +3133     
- Partials       6048     7317    +1269     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.65% <65.42%> (+1.94%) ⬆️
java-21 63.57% <65.42%> (+1.94%) ⬆️
skip-bytebuffers-false 63.68% <65.42%> (+1.94%) ⬆️
skip-bytebuffers-true 63.54% <65.42%> (+35.81%) ⬆️
temurin 63.69% <65.42%> (+1.94%) ⬆️
unittests 63.68% <65.42%> (+1.94%) ⬆️
unittests1 56.23% <38.14%> (+9.34%) ⬆️
unittests2 34.02% <49.53%> (+6.29%) ⬆️

Flags with carried forward coverage won't be shown.


@somandal (Contributor, Author) commented Feb 6, 2025

> Seems like the download throttling is on each table. Not introduced in this PR, but that is wrong. The semaphore needs to be passed from outside so that it can be shared across all tables.

I think the original intention was for this to be a separate semaphore for each table (e.g. see the related comments and docs), and passing it in from outside as a semaphore shared across all tables would be a big behavior change. For now I would like to keep the overall functionality the same.

We can open an issue for consolidating these under a single semaphore, or perhaps for treating these as per-table overridable configs (another idea I had discussed with @klsince). What do you think @Jackie-Jiang?

Reference PR of original change: #8694

Edit:

Capturing discussion on original intent of the table level config:

> the intention is to avoid affecting the other tables from correctly downloading their segments. Think of the scenario where 1k+ tables live on the same server: a burst of segment downloads from one table could prevent the other tables from correctly downloading.

@somandal somandal force-pushed the segment-download-throttle-dynamic-config branch from 5de67e3 to 0a49709 on February 8, 2025 02:08
@somandal somandal requested a review from Jackie-Jiang February 8, 2025 02:08
@somandal somandal changed the title Make pinot.server.instance.table.level.max.parallel.segment.downloads dynamically configurable via ZK cluster configs Add server level dynamically configurable segment download throttler Feb 8, 2025