
Resource-Pool Bottleneck fix #29

Open

arjunkathuria wants to merge 1 commit into juspay:master from arjunkathuria:resource-pool-bottleneck-fix

Conversation

@arjunkathuria

This PR fixes a performance issue on the hedis fork's master branch that was found during profiling.

Isolated issue

We were running into throughput bottlenecks: profiling showed that the resource-pool was blocking requests whenever the maximum number of resources per stripe was in use.

Proposed fix

The proposed solution is to use more stripes when creating the connection pool, so that when one stripe is saturated, requests are not blocked. This increases throughput at the cost of a bit more memory.

  • This PR increases the number of stripes from the default of 1 to 4.

  • The number 4 was chosen because it gave the largest gains without losing stability in local load-testing. A full report on the different stripe counts tried and the results they produced was presented and can be referred to for a fuller picture.
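As a sketch of what the change amounts to, the pool construction moves from 1 stripe to 4. This is illustrative, not the actual hedis source: the helpers `createConnection`/`destroyConnection` are hypothetical names, and the `createPool` signature shown is the classic one from the resource-pool package (pre-0.3) that hedis builds on.

```haskell
import Data.Pool (Pool, createPool)

-- Illustrative sketch of the stripe-count change, assuming the classic API:
-- createPool :: IO a -> (a -> IO ()) -> Int -> NominalDiffTime -> Int -> IO (Pool a)
connectPool :: ConnectInfo -> IO (Pool Connection)
connectPool cInfo =
  createPool
    (createConnection cInfo)      -- acquire a Redis connection (hypothetical helper)
    destroyConnection             -- release it (hypothetical helper)
    4                             -- numStripes: was 1; each stripe blocks independently
    (connectMaxIdleTime cInfo)    -- idle time before an unused connection is destroyed
    (connectMaxConnections cInfo) -- max resources *per stripe*
```

Since the per-stripe limit is unchanged, total pool capacity effectively multiplies by the stripe count, and `takeResource` only blocks when the stripe a request lands on is exhausted rather than when the single shared stripe is.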

Results observed after this fix

  • Changing the number of stripes in the connection pool settings on the hedis
    master-branch fork affects performance to a good degree:

    RPS observed before -> 188
    RPS observed after  -> 294

    The API tested here went from 188 RPS to 294 RPS with this fix, a gain of more than 100 RPS.

  • The time-allocation percentage of the main takeResource offender dropped from 19.6% to 2.8% when profiling cost centres with this fix applied.

A full report on the issue, the RCA, the fix, and the different stripe-count combinations tried was shared internally.

Caveats

There are some minor caveats to be aware of:

  1. This hasn't been extensively tested across all the repos or scenarios it might be used with, though it should just work.

  2. If 4 stripes prove unstable for some repos, or for deploys with low or tight memory limits, we can lower this to 3 for added stability. That reduces the performance gain, but still leaves it a good amount above the current status quo.
