You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[SPARK-51030][CORE][TESTS] Add a check before Utils.deleteRecursively(tempDir) to ensure tempDir won't be cleaned up by the ShutdownHook in afterEach of LocalRootDirsTest
### What changes were proposed in this pull request?
This pr add a check before `Utils.deleteRecursively(tempDir)` to ensure `tempDir` won't be cleaned up by the ShutdownHook in `afterEach` of `LocalRootDirsTest` to minimize the risk of test failures caused by race conditions in multi-threaded deletion of the `tempdir`.
### Why are the changes needed?
In the daily tests on macOS, you may see test failures similar to the following:
- https://github.com/apache/spark/actions/runs/13014395379/job/36325277534
```
SslExternalShuffleServiceSuite:
...
- SPARK-38640: memory only blocks can unpersist using shuffle service cache fetching
SUITE ABORTED - SslExternalShuffleServiceSuite: Failed to list files for dir: /Users/runner/work/spark/spark/core/target/tmp/local-6e9944e3-27c0-44bb-a535-8a7f09c1580b/55fe65f6-2d1c-4ced-af5f-abb64b135345/spark-3985444c-54a0-462c-b1d3-6df45ad55143/executor-8c55fc55-b74f-4793-bf3f-3d11a889fd8c/blockmgr-1a3455e7-bd31-4ef8-a9c6-910fde151bb3/11
java.io.IOException: Failed to list files for dir: /Users/runner/work/spark/spark/core/target/tmp/local-6e9944e3-27c0-44bb-a535-8a7f09c1580b/55fe65f6-2d1c-4ced-af5f-abb64b135345/spark-3985444c-54a0-462c-b1d3-6df45ad55143/executor-8c55fc55-b74f-4793-bf3f-3d11a889fd8c/blockmgr-1a3455e7-bd31-4ef8-a9c6-910fde151bb3/11
at org.apache.spark.network.util.JavaUtils.listFilesSafely(JavaUtils.java:201)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:135)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:123)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:137)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:123)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:137)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:123)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:137)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:123)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:137)
...
```
This appears to be an issue caused by multi-threaded deletion of the `tempdir`.
https://github.com/apache/spark/blob/42898c62b4c197e88b842eaa840986cb70d93a9c/core/src/test/scala/org/apache/spark/LocalRootDirsTest.scala#L42
Refer to the code above, in `LocalRootDirsTest#beforeEach` method, `tempDir = Utils.createTempDir(namePrefix = "local")` invokes `ShutdownHookManager.registerShutdownDeleteDir` to register `tempDir` for deletion via a shutdown hook. Therefore, if `tempDir` remains unchanged during the test case, manual cleanup is unnecessary in `LocalRootDirsTest#afterEach` method.
However, given that `tempDir` is defined as `protected var tempDir`, it is possible for it to be altered within the test case. Hence, additional checks are implemented before manually cleaning up `tempDir`:
1. `tempDir` has not been registered for cleanup via a shutdown hook.
2. `tempDir` is not a subpath of any path that has been registered for cleanup via a shutdown hook.
This approach helps to minimize the risk of a race condition where multiple threads attempt to clean up the `tempDir` concurrently.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- Pass GitHub Actions
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#49723 from LuciferYang/LocalRootDirsTest-hasRootAsShutdownDeleteDir.
Lead-authored-by: yangjie01 <[email protected]>
Co-authored-by: YangJie <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 19fbeaa)
Signed-off-by: Dongjoon Hyun <[email protected]>
0 commit comments