Skip to content

fix/async-issues#53

Merged
PolarBearEs merged 12 commits into
masterfrom
fix/async-issues
May 23, 2026
Merged

fix/async-issues#53
PolarBearEs merged 12 commits into
masterfrom
fix/async-issues

Conversation

@PolarBearEs
Copy link
Copy Markdown
Owner

No description provided.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 23, 2026

Greptile Summary

This PR addresses a broad set of async correctness issues across the pull pipeline: blocking credential-helper invocation, coarse-grained AtomicBool cancellation, missing I/O timeouts in the embedded cache registry, and JoinSet/FuturesUnordered tasks that were aborted but never drained.

  • Auth (src/auth.rs): invoke_helper_command is now fully async using tokio::process::Command, with a 30-second tokio::time::timeout around wait_with_output and kill_on_drop(true), replacing the previous blocking std::thread::sleep poll loop.
  • Cancellation (src/pull/, src/serve_registry/, src/signal.rs): Arc<AtomicBool> stop flags are replaced with CancellationToken; every abort site now drains its JoinSet / FuturesUnordered after abort_all(), and sleep calls mid-download respect cancellation via sleep_or_interrupt.
  • Cache registry I/O (src/serve_registry/): Read-request headers and all response writes gain idle timeouts (120 s and 300 s respectively); TemporaryCacheRegistry gains an explicit shutdown() that awaits and propagates the background task's result instead of silently discarding it.

Confidence Score: 5/5

Clean async refactor with no regressions found; all abort paths are properly drained and the I/O timeout additions are conservative.

Every changed code path was reviewed end-to-end: credential helper stdin is closed before wait_with_output, CancellationToken semantics are applied consistently, JoinSets and FuturesUnordered queues are drained after abort_all, and the new TemporaryCacheRegistry::shutdown correctly consumes self so the Drop impl never double-fires. No data loss, stall, or correctness issues were identified.

No files require special attention.

Important Files Changed

Filename Overview
src/auth.rs Migrates credential helper invocation from blocking std::process to async tokio::process::Command, adds a 30-second timeout around wait_with_output, and correctly uses kill_on_drop(true) for cleanup on timeout.
src/signal.rs Refactors signal handling into reusable wait_for_shutdown_signal; fixes the pre-existing bug where a None signal arm resolved immediately by returning std::future::pending() instead.
src/pull/load.rs TemporaryCacheRegistry gains an explicit async shutdown() that awaits the serve task and propagates its result; the task field changes from JoinHandle<()> to Option<JoinHandle<Result<()>>> so errors are no longer silently swallowed.
src/pull/mod.rs Replaces Arc stop flag with CancellationToken; abort_downloads now properly drains the FuturesUnordered queue after aborting rather than calling queue.clear().
src/pull/orchestrator.rs Switches AtomicBool stop to CancellationToken; adds abort_pull_tasks helper that drains the JoinSet after abort_all, logging unexpected (non-cancellation) panics.
src/pull/download.rs All stop-flag checks switch from AtomicBool::load to CancellationToken; retry sleep upgraded to sleep_or_interrupt so downloads stop promptly on cancellation.
src/serve_registry/server.rs CancellationToken replaces AtomicBool for pull_context.stop; ServeConfig gains an optional shutdown oneshot receiver wired to the signal handler introduced in runtime.rs.
src/serve_registry/request.rs Wraps read_request in a 120-second timeout to prevent a stalled connection from holding up the server indefinitely.
src/serve_registry/response.rs Replaces raw tokio::io::copy with a custom copy_with_idle_timeout that resets a 5-minute per-chunk deadline, preventing a slow Docker client from blocking the server socket forever.
src/docker/layers.rs Adds abort_inspect_tasks helper mirroring the JoinSet drain pattern; warns on unexpected (non-cancellation) errors from tasks aborted mid-flight.
src/runtime.rs resolve_compose_images wraps the blocking compose::resolve_images in spawn_blocking; serve command gains a shutdown oneshot fed by wait_for_shutdown_signal.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[runtime: run] --> B[orchestrator: pull_references]
    A --> C[serve_registry: serve with shutdown]

    B --> D[auth: async invoke_helper]
    D --> E[spawn tokio process]
    E --> F{tokio timeout around wait_with_output}
    F -->|elapsed| G[warn and return None]
    F -->|finished| H[parse credentials]

    B --> I[download blob loop]
    I --> J{select: fetch vs cancellation}
    J -->|cancelled| K[return Interrupted]
    J -->|chunk| L[write to file]
    L --> M{backoff needed?}
    M -->|yes| N[sleep_or_interrupt]
    N --> J

    C --> O[request read with timeout 120s]
    O --> P[route request]
    P --> Q[write response with idle timeout 300s per chunk]
Loading

Reviews (2): Last reviewed commit: "Use async timeout for credential helpers" | Re-trigger Greptile

Comment thread src/auth.rs Outdated
Comment thread src/signal.rs
@PolarBearEs PolarBearEs merged commit d7305cc into master May 23, 2026
11 checks passed
@PolarBearEs PolarBearEs deleted the fix/async-issues branch May 23, 2026 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant