Skip to content

[test] add end-of-run leak detection to the forked JVM runner#1692

Merged
fwbrasil merged 3 commits into
mainfrom
kyo-test-leak-detection-pr
Jun 21, 2026
Merged

[test] add end-of-run leak detection to the forked JVM runner#1692
fwbrasil merged 3 commits into
mainfrom
kyo-test-leak-detection-pr

Conversation

@fwbrasil

Copy link
Copy Markdown
Collaborator

Problem

kyo-test had no automatic check for resources left open at the end of a test run. A leaked fiber, an unclosed file descriptor, or a non-daemon thread started by a test left the forked JVM dirty without any signal to the author. Two real leaks had reached the codebase undetected:

  • StreamSubscription.loopPoll did not check for cancellation between chunks while it held a large outstanding demand (request(Long.MaxValue)). After cancel() the loop kept pulling and calling onNext for the rest of the stream, leaving the consumer fiber running long after the subscriber was gone, and delivered a terminal onComplete the Reactive Streams spec forbids after cancel().
  • Several tests in kyo-ffi, kyo-tasty, and kyo-website called Files.list(...) / Files.createTempDirectory(...) without closing the returned DirectoryStream, leaking a file descriptor per call.

Solution

LeakCheck (new, JVM-only, kyo-test/runner/jvm) runs three end-of-run probes once at SbtRunner.done(), after every suite in the forked JVM has finished, the one point where the fork is quiescent and holds only this run's resources:

  • Fiber probe: Scheduler.get.loadAvg() catches spinning or repeatedly-rescheduling fibers.
  • Thread probe: Thread.getAllStackTraces diffed against a baseline captured at runner construction catches raw Thread starts and un-shutdown executors. sbt's own ForkMain pool threads are registered via LeakCheck.registerCarrierThread() in SbtTask.execute() and excluded.
  • Descriptor probe: /proc/self/fd symlinks diffed against the construction-time baseline catch unclosed files, pipes, and sockets. Classpath jars, native libraries, and JVM-internal anonymous inodes are excluded. Linux-only; a no-op on macOS and Windows.

RunConfig gains leakCheck (master), leakCheckFileDescriptors, leakCheckThreads, leakCheckFibers, and leakCheckSockets (all default true), plus an additive leakCheckAllowlist. SuiteReport carries the same fields so each suite's effective config survives serialization to the aggregation step; SbtRunner OR-aggregates them across the fork's suites, so a category runs if any suite enabled it. TestRunner.constructorFailureReport threads the resolved config into the report, so a suite whose constructor throws reports its real leak config rather than the field defaults.

StreamSubscription.loopPoll now checks requestChannel.closed() at the top of each iteration and raises Abort.fail(StreamCanceled) on cancel, stopping emission immediately and suppressing the post-cancel onComplete.

Tests in kyo-ffi, kyo-tasty, and kyo-website are migrated from java.nio.file to kyo.Path, which closes the underlying DirectoryStream in a try/finally.

Notes

  • leakCheck is a no-op outside a forked test JVM; LeakCheck.isForked detects the fork via sun.java.command and skips the check in the main sbt JVM, whose baseline would include sbt's own resources.
  • kyo-http's process-global NioIoDriver scheduler fiber is in LeakCheck.defaultAllowlist, excused fork-wide until the network stack rewrite gives the driver a proper lifecycle.
  • The sbt.ForkMain pool threads are identified structurally by registering the carrier thread at the top of each SbtTask.execute(), not by a name pattern, so no test thread pool name can collide with the exclusion.

fwbrasil added 3 commits June 21, 2026 11:31
## Problem

kyo-test had no automatic check for resources left open at the end of a
test run. A leaked fiber, an unclosed file descriptor, or a non-daemon
thread started by a test left the forked JVM dirty without any signal to
the author. Two real leaks had reached the codebase undetected:

- `StreamSubscription.loopPoll` did not check for cancellation between
  chunks while it held a large outstanding demand
  (`request(Long.MaxValue)`). After `cancel()` the loop kept pulling and
  calling `onNext` for the rest of the stream, leaving the consumer fiber
  running long after the subscriber was gone, and delivered a terminal
  `onComplete` the Reactive Streams spec forbids after `cancel()`.
- Several tests in kyo-ffi, kyo-tasty, and kyo-website called
  `Files.list(...)` / `Files.createTempDirectory(...)` without closing the
  returned `DirectoryStream`, leaking a file descriptor per call.

## Solution

`LeakCheck` (new, JVM-only, kyo-test/runner/jvm) runs three end-of-run
probes once at `SbtRunner.done()`, after every suite in the forked JVM has
finished, the one point where the fork is quiescent and holds only this
run's resources:

- Fiber probe: `Scheduler.get.loadAvg()` catches spinning or
  repeatedly-rescheduling fibers.
- Thread probe: `Thread.getAllStackTraces` diffed against a baseline
  captured at runner construction catches raw `Thread` starts and
  un-shutdown executors. sbt's own `ForkMain` pool threads are registered
  via `LeakCheck.registerCarrierThread()` in `SbtTask.execute()` and
  excluded.
- Descriptor probe: `/proc/self/fd` symlinks diffed against the
  construction-time baseline catch unclosed files, pipes, and sockets.
  Classpath jars, native libraries, and JVM-internal anonymous inodes are
  excluded. Linux-only; a no-op on macOS and Windows.

`RunConfig` gains `leakCheck` (master), `leakCheckFileDescriptors`,
`leakCheckThreads`, `leakCheckFibers`, and `leakCheckSockets` (all default
`true`), plus an additive `leakCheckAllowlist`. `SuiteReport` carries the
same fields so each suite's effective config survives serialization to the
aggregation step; `SbtRunner` OR-aggregates them across the fork's suites,
so a category runs if any suite enabled it. `TestRunner.constructorFailureReport`
threads the resolved config into the report, so a suite whose constructor
throws reports its real leak config rather than the field defaults.

`StreamSubscription.loopPoll` now checks `requestChannel.closed()` at the
top of each iteration and raises `Abort.fail(StreamCanceled)` on cancel,
stopping emission immediately and suppressing the post-cancel `onComplete`.

Tests in kyo-ffi, kyo-tasty, and kyo-website are migrated from
`java.nio.file` to `kyo.Path`, which closes the underlying
`DirectoryStream` in a `try/finally`.

## Notes

- `leakCheck` is a no-op outside a forked test JVM; `LeakCheck.isForked`
  detects the fork via `sun.java.command` and skips the check in the main
  sbt JVM, whose baseline would include sbt's own resources.
- kyo-http's process-global `NioIoDriver` scheduler fiber is in
  `LeakCheck.defaultAllowlist`, excused fork-wide until the network stack
  rewrite gives the driver a proper lifecycle.
- The `sbt.ForkMain` pool threads are identified structurally by
  registering the carrier thread at the top of each `SbtTask.execute()`,
  not by a name pattern, so no test thread pool name can collide with the
  exclusion.
## Problem

`HttpServer` closed only its listening socket on shutdown. Accepted
keep-alive connections (the kind a long-lived `HttpClient` keeps pooled)
were left open until a 60-second idle timer fired. A suite that holds a
static `HttpClient` accumulated open sockets across iterations because the
client kept the peer side pooled. `HttpClientBackend` tracked the
connections it created in a hand-rolled map with the same
close-on-shutdown intent but a separate implementation.

## Solution

A shared `ConnectionRegistry` (new, kyo-http internal) gives client and
server one track-all/close-all mechanism. `HttpServer` registers each
accepted connection and closes the lot in `closeFiber`; `HttpClientBackend`
registers each created connection, replacing its `allConnections` map with
the same registry.

`register` and `closeAll` both claim a connection by removing it from the
backing set, and only the caller that wins the remove closes it, so a
connection is closed exactly once even when a registration races a
shutdown. `register` closes the connection itself when it observes a
shutdown already in progress, so a handler interrupted right after
registering never leaks its connection and `closeAll` never drops a
still-open one. Double-close on the rare race is benign because
`Connection.close()` is idempotent.

## Notes

- This closes the connections kyo-http owns. The NIO transport separately
  defers a closed channel's real fd close to its idle selector's next
  `select()`, which nothing wakes; releasing that fd promptly is the
  transport's job (frozen for the kyo-net rewrite), which is why suites
  that hold a connection on it still opt out of socket leak detection.
…eak checks

## Problem

A few suites hold a resource that the NIO transport or an external process
keeps open past the end of the run. The transport defers a closed channel's
real fd close to its idle selector's next `select()`, which nothing wakes,
so a server listener or a pooled client connection outlives the run; a
shared headless Chrome process holds its CDP socket and stdio pipes for the
whole run. Each is an opaque `socket:[inode]` / `pipe:[inode]` target no
allowlist substring can match.

## Solution

Each affected suite disables the narrowest leak category, never the master
toggle:

- `BaseHttpTest`, `BaseCalibanTest`, `BasePodTest`, and
  `JsonRpcHttpTransportTest` disable `leakCheckSockets`: a server listener
  or client connection on the NIO transport whose fd close it defers.
- `BaseBrowserTest` and kyo-ui's `UITest` disable `leakCheckSockets` and
  `leakCheckFileDescriptors`: the shared Chrome process's CDP socket and
  stdio pipes.

Thread and fiber detection stay on for all of them, and every other module
and category, sockets included, stays checked.

## Notes

- These opt-outs track external or transport-owned resources, not test
  bugs. The kyo-net transport rewrite, which releases a channel's fd
  promptly on close, is the fix path that lets the socket opt-outs be
  removed.
@fwbrasil fwbrasil merged commit 8cf1491 into main Jun 21, 2026
8 of 11 checks passed
@fwbrasil fwbrasil deleted the kyo-test-leak-detection-pr branch June 21, 2026 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant