Commit 0467aca
[SPARK-49783][YARN] Fix resource leak of yarn allocator
### What changes were proposed in this pull request?
Fix a resource leak in the YARN allocator.
### Why are the changes needed?
When the target number of executors is lower than the number of running containers, containers newly assigned by the ResourceManager are skipped, but they are not released by calling `amClient.releaseAssignedContainer`. As a result, these containers stay reserved in the YARN ResourceManager for at least 10 minutes, so a large share of cluster resources is wasted.
This also shows up in the accounting: the vcore * seconds statistics reported by YARN are greater than the result computed from the Spark event logs.
From my statistics, the cluster resource waste ratio is ~25% when Spark jobs are the only workload on the cluster.
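The skip-without-release behavior described above can be illustrated with a minimal, self-contained sketch. It is not the actual Spark code: `ContainerId`, `AmClient`, `RecordingAmClient`, and `AllocatorSketch.handleAssigned` are hypothetical stand-ins, and only the method name `releaseAssignedContainer` is taken from the description. The sketch also handles a whole batch at once as a simplification.

```scala
// Hypothetical stand-ins for the YARN client types; only the method name
// releaseAssignedContainer comes from the commit description.
final case class ContainerId(id: Int)

trait AmClient {
  def releaseAssignedContainer(id: ContainerId): Unit
}

// Test double that records every release so the behavior can be inspected.
final class RecordingAmClient extends AmClient {
  private val buf = scala.collection.mutable.ArrayBuffer.empty[ContainerId]
  override def releaseAssignedContainer(id: ContainerId): Unit = buf += id
  def released: List[ContainerId] = buf.toList
}

object AllocatorSketch {
  /** Uses assigned containers until `target` is reached and releases the rest.
    * Returns the containers that would be used to launch executors. */
  def handleAssigned(
      amClient: AmClient,
      assigned: Seq[ContainerId],
      running: Int,
      target: Int): Seq[ContainerId] = {
    // No container is needed when running >= target.
    val needed = math.max(0, target - running)
    val (toLaunch, surplus) = assigned.splitAt(needed)
    // Assumed fix: hand skipped containers back instead of only skipping them,
    // so they do not stay reserved in the ResourceManager.
    surplus.foreach(c => amClient.releaseAssignedContainer(c))
    toLaunch
  }
}
```

For example, with `running = 4`, `target = 5`, and three assigned containers, the sketch uses the first container and releases the other two. With `running = 5` and `target = 3`, it releases all of them.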
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Tested in our internal Hadoop cluster.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #48238 from zuston/patch-1.
Authored-by: Junfan Zhang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent e68b98a commit 0467aca
1 file changed (+1, -0) under resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn
Diff: one line added at new line 823, between original lines 822 and 823 (line content not shown).