Allow logs to fully flush before disposing of `DcpExecutor`. #8423

afscrome · 2025-03-30T21:48:30Z

Description

Updated `DcpExecutor` to wait for log streams to fully flush during `StopAsync`, rather than immediately cancelling them. Log streams should shut down once the associated resource is shut down and the logs have been processed.

Previously, the last logs from a process could be lost because the log streams could be shut down before those logs were forwarded from DCP to `ILogger`. This was particularly problematic when DistributedApplicationBuilderTests failed whilst waiting for a resource to enter a particular state: the final error lines are usually the ones you need to work out why a container or executable exited or failed to start.

Fixes #8206

Checklist

Is this feature complete?
- Yes. Ready to ship.
- No. Follow-up changes expected.
Are you including unit tests for the changes and scenario tests if relevant?
- Yes
- No
Did you add public API?
- Yes
  - If yes, did you have an API Review for it?
    - Yes
    - No
  - Did you add <remarks /> and <code /> elements on your triple slash comments?
    - Yes
    - No
- No
Does the change make any security assumptions or guarantees?
- Yes
  - If yes, have you done a threat model and had a security review?
    - Yes
    - No
- No
Does the change require an update in our Aspire docs?
- Yes
  - Is this introducing a breaking change?
    - Yes
      - Link to aspire-docs issue (please use this breaking-change template):
    - No
      - Link to aspire-docs issue (please use this doc-idea template):
- No

danmoseley · 2025-04-04T18:19:22Z

Seems reasonable but it's not my area. Maybe Karol can review

karolz-ms · 2025-04-04T18:34:28Z

src/Aspire.Hosting/Dcp/DcpExecutor.cs

        if (_resourceWatchTask is { } resourceTask)
        {
            tasks.Add(resourceTask);
        }

        foreach (var (_, (cancellation, logTask)) in _logStreams)
        {
-            cancellation.Cancel();
+            cancellationToken.Register(cancellation.Cancel);


Maybe I am not understanding the change, but I am not sure it has the effect you intended.

If you want to stop DCP and then stop the log streams, you should do just that, i.e. move the call that waits on the log stream tasks to end so that it happens after the call to stop DCP.

The cancellation token passed to StopAsync() is a cancellation token for the whole stopping operation. It may be cancelled after a long time, or never (CancellationToken.None), meaning that with your proposed change the log streams may never be cancelled, depending on what the cancellation token is.
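The point about the token can be checked in isolation. The following is a minimal sketch, not `DcpExecutor` code: `logStreamCts`, `stopCts` and the other names are hypothetical stand-ins for a log stream's cancellation source and for the token passed to `StopAsync()`.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for one log stream's cancellation source.
using var logStreamCts = new CancellationTokenSource();

// Case 1: the stop token is CancellationToken.None, which can never be cancelled,
// so a callback registered on it never runs and the log stream is never cancelled.
CancellationToken stopToken = CancellationToken.None;
using var neverFires = stopToken.Register(logStreamCts.Cancel);

Console.WriteLine(stopToken.CanBeCanceled);              // False
Console.WriteLine(logStreamCts.IsCancellationRequested); // False

// Case 2: the log stream is cancelled only if and when the caller cancels the stop token.
using var stopCts = new CancellationTokenSource();
using var otherLogStreamCts = new CancellationTokenSource();
using var firesOnCancel = stopCts.Token.Register(otherLogStreamCts.Cancel);

stopCts.Cancel();
Console.WriteLine(otherLogStreamCts.IsCancellationRequested); // True
```

In both cases the lifetime of the log stream is tied to the caller's token rather than to the shutdown sequence itself, which is the concern raised above.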

afscrome · 2025-04-04

Sorry, my bad, I meant to abandon this PR - see #8206 (comment). Whilst I got my failing test passing, I broke others, as I stopped Aspire from shutting down any still-running resources - unintended effects, as you said.

I assume this needs some changes on the DCP side to ensure that when DCP is shutting down, it fully flushes logs before finishing the shutdown process (although without seeing the DCP code it's hard for me to tell).

karolz-ms · 2025-04-04

Absolutely no worries. We are in huge debt to you for all the help you have given us with making Aspire better.

I think though you were on the right track with this change. Log capture from services and streaming them to clients is asynchronous in DCP, so there is a bit of lag, but it will keep going until the API server shuts down, which is after the resource cleanup is complete. So at least part of the fix IMO should be to stop cancelling the log streams before the app host requests DCP shutdown.

I also agree though that the second part might be fine-tuning how we handle the logs during shutdown inside DCP. Specifically, in the current implementation, there is no delay between the end of resource cleanup and the shutdown of the DCP/DCP API server. So there might be cases when logs are captured and the resource (Executable or Container) is stopped, but a portion of the logs is not streamed to the client before the stream is closed from the DCP side because DCP is going away.
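The ordering discussed in this thread (shut the source down first, then let the log forwarder drain, and cancel only as a fallback) can be sketched with a plain channel. This is an illustrative model under stated assumptions, not the Aspire or DCP implementation: the channel stands in for a DCP log stream, `ForwardAsync` is a hypothetical stand-in for the task that forwards lines to `ILogger`, and the five-second fallback is an arbitrary choice.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical forwarder: reads lines until the stream completes or the token is cancelled.
static async Task<List<string>> ForwardAsync(ChannelReader<string> reader, CancellationToken ct)
{
    var forwarded = new List<string>();
    try
    {
        await foreach (var line in reader.ReadAllAsync(ct))
        {
            forwarded.Add(line);
        }
    }
    catch (OperationCanceledException)
    {
        // Cancelled early: lines still buffered in the channel are never forwarded.
    }
    return forwarded;
}

var channel = Channel.CreateUnbounded<string>();
using var cts = new CancellationTokenSource();
var forwardTask = ForwardAsync(channel.Reader, cts.Token);

channel.Writer.TryWrite("starting");
channel.Writer.TryWrite("fatal: could not bind port");

// Stop the "resource" first, which completes the stream...
channel.Writer.Complete();

// ...then wait for the forwarder to drain, keeping cancellation only as a fallback.
cts.CancelAfter(TimeSpan.FromSeconds(5));
var lines = await forwardTask;

Console.WriteLine(lines.Count); // 2
```

Calling `cts.Cancel()` before `Complete()` instead would model the original behaviour, where the forwarder can stop before the final lines are read.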

afscrome requested a review from ReubenBond as a code owner March 30, 2025 21:48

github-actions bot added the area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication label Mar 30, 2025

dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 30, 2025

afscrome mentioned this pull request Mar 30, 2025

_kubernetesService.CleanupResourcesAsync Taking surprisingly long time in tests. #8424

Open

1 task

afscrome force-pushed the dont-swallow-final-logs branch from fe13058 to 978ccb0 Compare March 30, 2025 23:14

afscrome force-pushed the dont-swallow-final-logs branch from 978ccb0 to 1b678fd Compare March 30, 2025 23:17

Merge remote-tracking branch 'upstream/main' into dont-swallow-final-…

d9bb077

…logs

danmoseley requested a review from karolz-ms April 4, 2025 18:19

karolz-ms reviewed Apr 4, 2025

afscrome closed this Apr 4, 2025

karolz-ms mentioned this pull request Apr 4, 2025

Final container logs missing when disposing DistributedApplicationTestingBuilder #8206

Open

1 task
