
Commit 0467aca

zuston authored and dongjoon-hyun committed
[SPARK-49783][YARN] Fix resource leak of yarn allocator
### What changes were proposed in this pull request?

Fix a resource leak in the YARN allocator.

### Why are the changes needed?

When the target executor count is lower than the number of running containers, newly assigned containers from the ResourceManager are skipped, but they are never released by invoking `amClient.releaseAssignedContainer`. Those containers therefore stay reserved in the YARN ResourceManager for at least 10 minutes, so cluster resources are wasted at a high ratio. This also shows up as the vcore * seconds statistics on the YARN side being greater than the numbers derived from the Spark event logs. From my statistics, the cluster resource waste ratio is ~25% when the Spark jobs have the cluster to themselves.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

In our internal Hadoop cluster.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #48238 from zuston/patch-1.

Authored-by: Junfan Zhang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
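For readers less familiar with the YARN allocator, here is a minimal sketch of the bug and of the fix pattern, written directly against Hadoop's `AMRMClient` API. The helper `handleAssignedContainers` and its parameters are hypothetical names invented for illustration, not Spark's actual code; only `releaseAssignedContainer` is the real AMRMClient call named above.

```scala
import org.apache.hadoop.yarn.api.records.Container
import org.apache.hadoop.yarn.client.api.AMRMClient
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

object AllocatorSketch {
  // Hypothetical helper (not Spark's actual code): decide what to do with
  // containers the ResourceManager has just assigned to this application.
  def handleAssignedContainers(
      amClient: AMRMClient[ContainerRequest],
      assigned: Seq[Container],
      targetNumExecutors: Int,
      runningExecutors: Int): Unit = {
    var running = runningExecutors
    assigned.foreach { container =>
      if (running >= targetNumExecutors) {
        // Before this fix, the container was only skipped here. The RM then
        // kept it reserved for this application (for roughly 10 minutes per
        // the commit message), wasting vcores. Handing it back explicitly
        // lets the cluster reuse it immediately:
        amClient.releaseAssignedContainer(container.getId)
      } else {
        running += 1
        // ... launch an executor in this container ...
      }
    }
  }
}
```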
1 parent e68b98a


resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala

Lines changed: 1 addition & 0 deletions
```diff
@@ -820,6 +820,7 @@ private[yarn] class YarnAllocator(
           logInfo(log"Skip launching executorRunnable as running executors count: " +
             log"${MDC(LogKeys.COUNT, rpRunningExecs)} reached target executors count: " +
             log"${MDC(LogKeys.NUM_EXECUTOR_TARGET, getOrUpdateTargetNumExecutorsForRPId(rpId))}.")
+          internalReleaseContainer(container)
         }
       }
     }
```
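Note on the patch itself: the release goes through the allocator's existing `internalReleaseContainer` helper rather than calling the AMRMClient directly; presumably that helper wraps `amClient.releaseAssignedContainer` and also updates the allocator's own bookkeeping of released containers, keeping its state consistent with the ResourceManager.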
