WaitBehavior being ignored when waiting for resource to be healthy in a test. #7601

Closed
1 task done
afscrome opened this issue Feb 13, 2025 · 7 comments · Fixed by #7650 or #7709
Labels
area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication

@afscrome
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

I'm trying the following in 9.1. I expect the test below to fail, because dependency should fail to start once failToStart fails to start.

   [Test]
   public async Task Test()
   {
      using var appHost = DistributedApplicationTestingBuilder.Create();

      appHost.Services.AddLogging(x => x
         .AddNUnit()
         .AddFilter("Default", LogLevel.Information)
         .AddFilter("Microsoft.AspNetCore", LogLevel.Warning)
         .AddFilter("Aspire.Hosting.Dcp", LogLevel.Warning)
         .AddFilter("Aspire.Hosting", LogLevel.Debug)
      );

      var failToStart = appHost.AddExecutable("failToStart", "does-not-exist", ".");
      var dependency = appHost.AddContainer("redis", "redis");

      dependency.WaitFor(failToStart, WaitBehavior.StopOnDependencyFailure);

      using var app = appHost.Build();
      await app.StartAsync();
      await app.ResourceNotifications.WaitForResourceHealthyAsync(dependency.Resource.Name).WaitAsync(TimeSpan.FromSeconds(15));
   }

Expected Behavior

I'd expect waiting for redis to become healthy to fail as the resource has entered the terminal FailedToStart state.

Steps To Reproduce

See above code

Exceptions (if any)

info: Aspire.Hosting.DistributedApplication[0]
      Aspire version: 9.1.0-preview.1.25111.2+6adbbbaf7db1fdefae489033cfb30681d2f7bc2f
info: Aspire.Hosting.DistributedApplication[0]
      Distributed application starting.
info: Aspire.Hosting.DistributedApplication[0]
      Application host directory is: S:\REDACTED.Aspire\src\REDACTED.Aspire.AppHost.Tests
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Resource redis/redis-zfrbuahn changed state: Starting
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Resource redis/redis-zfrbuahn changed state: Starting -> Waiting
info: REDACTED.Aspire.AppHost.Tests.Resources.redis[0]
      1: 2025-02-13T19:25:32.6919100Z Waiting for resource 'failToStart' to enter the 'Running' state.
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Waiting for resource 'redis' to match predicate.
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Resource failToStart/failToStart-cafpnxdq changed state: Starting
fail: Aspire.Hosting.Dcp.dcpctrl.ExecutableReconciler[0]
      failed to start a process	{"Executable": {"name":"failToStart-cafpnxdq"}, "Reconciliation": 2, "error": "exec: \"does-not-exist\": executable file not found in %PATH%"}

fail: Aspire.Hosting.Dcp.dcpctrl.ExecutableReconciler[0]
      failed to start Executable	{"Executable": {"name":"failToStart-cafpnxdq"}, "Reconciliation": 2, "error": "exec: \"does-not-exist\": executable file not found in %PATH%"}

dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Resource failToStart/failToStart-cafpnxdq changed state: FailedToStart
fail: REDACTED.Aspire.AppHost.Tests.Resources.redis[0]
      2: 2025-02-13T19:25:36.1905556Z Dependency resource 'failToStart' failed to start.
fail: REDACTED.Aspire.AppHost.Tests.Resources.redis[0]
      3: 2025-02-13T19:25:36.2000362Z Failed to create container resource redis
Aspire.Hosting.DistributedApplicationException: Dependency resource 'failToStart' failed to start.
         at Aspire.Hosting.ApplicationModel.ResourceNotificationService.<>c__DisplayClass18_0.<<WaitUntilHealthyAsync>g__Core|1>d.MoveNext() in /_/src/Aspire.Hosting/ApplicationModel/ResourceNotificationService.cs:line 173
      --- End of stack trace from previous location ---
         at Aspire.Hosting.ApplicationModel.ResourceNotificationService.WaitUntilHealthyAsync(IResource resource, IResource dependency, WaitBehavior waitBehavior, CancellationToken cancellationToken) in /_/src/Aspire.Hosting/ApplicationModel/ResourceNotificationService.cs:line 157
         at Aspire.Hosting.ApplicationModel.ResourceNotificationService.WaitForDependenciesAsync(IResource resource, CancellationToken cancellationToken) in /_/src/Aspire.Hosting/ApplicationModel/ResourceNotificationService.cs:line 330
         at Aspire.Hosting.Orchestrator.ApplicationOrchestrator.WaitForInBeforeResourceStartedEvent(BeforeResourceStartedEvent event, CancellationToken cancellationToken) in /_/src/Aspire.Hosting/Orchestrator/ApplicationOrchestrator.cs:line 76
         at Aspire.Hosting.Eventing.DistributedApplicationEventing.<>c__DisplayClass4_0`1.<<Subscribe>b__0>d.MoveNext() in /_/src/Aspire.Hosting/Eventing/DistributedApplicationEventing.cs:line 82
      --- End of stack trace from previous location ---
         at Aspire.Hosting.Eventing.DistributedApplicationEventing.PublishAsync[T](T event, EventDispatchBehavior dispatchBehavior, CancellationToken cancellationToken) in /_/src/Aspire.Hosting/Eventing/DistributedApplicationEventing.cs:line 69
         at Aspire.Hosting.Orchestrator.ApplicationOrchestrator.OnResourceStarting(OnResourceStartingContext context) in /_/src/Aspire.Hosting/Orchestrator/ApplicationOrchestrator.cs:line 133
         at Aspire.Hosting.Dcp.DcpExecutorEvents.PublishAsync[T](T context) in /_/src/Aspire.Hosting/Dcp/DcpExecutorEvents.cs:line 33
         at Aspire.Hosting.Dcp.DcpExecutor.CreateContainerAsync(AppResource cr, ILogger resourceLogger, CancellationToken cancellationToken) in /_/src/Aspire.Hosting/Dcp/DcpExecutor.cs:line 1164
         at Aspire.Hosting.Dcp.DcpExecutor.<CreateContainersAsync>g__CreateContainerAsyncCore|67_0(AppResource cr, CancellationToken cancellationToken) in /_/src/Aspire.Hosting/Dcp/DcpExecutor.cs:line 1118
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Resource redis/redis-zfrbuahn changed state: Waiting -> FailedToStart
info: Aspire.Hosting.DistributedApplication[0]
      Distributed application started. Press Ctrl+C to shut down.
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Waiting for resource 'redis' to enter the 'Healthy' state.

.NET Version info

.NET SDK:
 Version:           9.0.200
 Commit:            90e8b202f2
 Workload version:  9.0.200-manifests.c4f6226a
 MSBuild version:   17.13.8+cbc39bea8

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.22621
 OS Platform: Windows
 RID:         win-x64
 Base Path:   C:\Program Files\dotnet\sdk\9.0.200\

.NET workloads installed:
There are no installed workloads to display.
Configured to use loose manifests when installing new manifests.

Host:
  Version:      9.0.2
  Architecture: x64
  Commit:       80aa709f5d

.NET SDKs installed:
  6.0.428 [C:\Program Files\dotnet\sdk]
  8.0.406 [C:\Program Files\dotnet\sdk]
  9.0.100 [C:\Program Files\dotnet\sdk]
  9.0.200 [C:\Program Files\dotnet\sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 6.0.36 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 7.0.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 7.0.20 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 8.0.13 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 9.0.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 9.0.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 6.0.36 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 8.0.12 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 8.0.13 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 9.0.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 9.0.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 6.0.36 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 8.0.12 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 8.0.13 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 9.0.0 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 9.0.2 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

Other architectures found:
  x86   [C:\Program Files (x86)\dotnet]
    registered at [HKLM\SOFTWARE\dotnet\Setup\InstalledVersions\x86\InstallLocation]

Environment variables:
  Not set

global.json file:
  Not found

Learn more:
  https://aka.ms/dotnet/info

Download .NET:
  https://aka.ms/dotnet/download

Anything else?

  <Sdk Name="Aspire.AppHost.Sdk" Version="9.1.0-preview.1.25111.2" />
    <PackageVersion Include="Aspire.Hosting" Version="9.1.0-preview.1.25111.2" />
    <PackageVersion Include="Aspire.Hosting.AppHost" Version="9.1.0-preview.1.25111.2" />
    <PackageVersion Include="Aspire.Hosting.Redis" Version="9.1.0-preview.1.25111.2" />
    <PackageVersion Include="Aspire.Hosting.Testing" Version="9.1.0-preview.1.25111.2" />
@afscrome afscrome changed the title WaitBehavior being ignored waiting for resource to be healthy in a test. WaitBehavior being ignored when waiting for resource to be healthy in a test. Feb 13, 2025
@danmoseley danmoseley added the area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication label Feb 14, 2025
@davidfowl davidfowl added this to the 9.1 milestone Feb 14, 2025
@danmoseley
Member

@eerhardt do you have cycles to look?

@eerhardt
Member

> @eerhardt do you have cycles to look?

It won't be until next week.

@davidfowl
Member

Looks like it is working, but WaitForHealthy does what it says: it waits for the resource to be healthy.

I have a change that adds an overload of wait-for-healthy that also takes a wait behavior, but that method currently only looks at the health state.

@mitchdenny mitchdenny self-assigned this Feb 17, 2025
@mitchdenny
Member

Hey @afscrome, just getting back into things after being out of office for a few months. Looking at your test case, it should fail due to timeout, but what you are suggesting is that it should fail due to the dependency startup error?

This is what I currently see with your test (adapted) in the repo:

Image

@afscrome
Contributor Author

afscrome commented Feb 17, 2025

Welcome back @mitchdenny.

In the above test, WaitBehavior.StopOnDependencyFailure means the dependency resource goes into the terminal FailedToStart state rather than remaining in the Waiting state, from which it could potentially recover. Once in the terminal state, the resource can never become healthy, so I'd expect the test to fail fast, clearly stating that the resource entered a terminal state, as soon as it's impossible for the resource to become healthy, rather than timing out.

This is particularly important for tests (which default to StopOnDependencyFailure). If you just end up with a TimeoutException, you then have to trawl the logs to work out why the resource failed to start, and everything logged between the resource entering the FailedToStart state and the TimeoutException being thrown is just noise that makes trawling the logs even more difficult.
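
To make the fail-fast expectation concrete, here is a minimal, self-contained sketch of the decision being asked for. It is not Aspire's implementation: WaitOutcome, HealthWait and Classify are hypothetical names invented for illustration, and the terminal-state set only contains FailedToStart because that is the only terminal state named in this thread.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in types; these are NOT part of Aspire.Hosting.
public enum WaitOutcome { KeepWaiting, Healthy, FailFast }

public static class HealthWait
{
    // Assumption: only FailedToStart is treated as terminal here, since it is
    // the only terminal state mentioned in this issue. Extend as needed.
    private static readonly HashSet<string> TerminalStates =
        new(StringComparer.Ordinal) { "FailedToStart" };

    // Decides what a wait-for-healthy loop should do after each state update.
    public static WaitOutcome Classify(string? state, bool isHealthy, bool stopOnTerminalState)
    {
        // A healthy resource always completes the wait successfully.
        if (isHealthy)
        {
            return WaitOutcome.Healthy;
        }

        // A terminal state can never become healthy, so stop immediately
        // instead of letting the caller run into a timeout.
        if (stopOnTerminalState && state is not null && TerminalStates.Contains(state))
        {
            return WaitOutcome.FailFast;
        }

        // Anything else (Starting, Waiting, ...) might still recover.
        return WaitOutcome.KeepWaiting;
    }
}
```

With stopOnTerminalState set to false the sketch keeps waiting even in FailedToStart, which corresponds to the timeout behavior reported above.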

@mitchdenny
Member

We have to consider two modes of interaction with this API. One is in test cases like we are discussing here, and another is in interactive modes via the dashboard. In the dashboard, if a dependency fails to start, we can actually restart the dependency and then restart the resource, so it's not necessarily terminal, and in unit tests we need to be able to simulate this (at least in the aspire repo :)).

But yes, I agree: in the normal case, if your dependency moves into the terminal state, you probably don't want to wait endlessly unless you are specifically writing code to test this behavior. I've put up a PR with a proposed fix for this - feel free to take a look. There are a few API design considerations that need to be resolved before we merge (it might end up needing to be a new enumeration).

@afscrome
Contributor Author

afscrome commented Feb 19, 2025

Just tried this out - almost there, but I think there's one last quirk.

   [Test]
   public async Task Test()
   {
      using var appHost = DistributedApplicationTestingBuilder.Create();

      appHost.Services.AddLogging(x => x
         .AddFilter("Default", LogLevel.Information)
         .AddFilter("Microsoft.AspNetCore", LogLevel.Warning)
         .AddFilter("Aspire.Hosting.Dcp", LogLevel.Warning)
         .AddFilter("Aspire.Hosting", LogLevel.Debug)
      );

      var failToStart = appHost.AddExecutable("failToStart", "does-not-exist", ".");
      var dependency = appHost.AddContainer("redis", "redis");

      dependency.WaitFor(failToStart);

      using var app = appHost.Build();
      await app.StartAsync();
   }

This works if I explicitly provide WaitBehavior.StopOnResourceUnavailable, but doesn't otherwise. I would expect WaitForResourceHealthyAsync without an explicit WaitBehavior to use the DefaultWaitBehavior (i.e. WaitOnResourceUnavailable if the dashboard is present, otherwise StopOnResourceUnavailable), to match how WaitFor() without a WaitBehavior behaves.

    public async Task<ResourceEvent> WaitForResourceHealthyAsync(string resourceName, CancellationToken cancellationToken = default)
    {
        return await WaitForResourceHealthyAsync(
            resourceName,
-           WaitBehavior.WaitOnResourceUnavailable, // Retain default behavior.
+           DefaultWaitBehavior,
            cancellationToken).ConfigureAwait(false);
    }

If not, should the default test templates be updated to include WaitBehavior.StopOnResourceUnavailable as a follow-on to #7619?

- await app.ResourceNotifications.WaitForResourceHealthyAsync(dependency.Resource.Name).WaitAsync(TimeSpan.FromSeconds(15));
+ await app.ResourceNotifications.WaitForResourceHealthyAsync(dependency.Resource.Name, WaitBehavior.StopOnResourceUnavailable).WaitAsync(TimeSpan.FromSeconds(15));
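
The default-selection rule described in this comment (wait when the dashboard is present, otherwise stop) can be sketched in isolation as follows. This is an illustration under stated assumptions, not the Aspire code: SketchWaitBehavior and SketchDefaults are hypothetical stand-ins that only reuse the two member names mentioned in this thread, and dashboardEnabled is an assumed input rather than a real Aspire option.

```csharp
// Hypothetical stand-in enum; it reuses the two names from this thread
// but is NOT the Aspire.Hosting WaitBehavior type.
public enum SketchWaitBehavior
{
    WaitOnResourceUnavailable,
    StopOnResourceUnavailable
}

public static class SketchDefaults
{
    // Returns the explicit behavior when the caller supplied one; otherwise
    // picks a default based on whether a dashboard is available.
    public static SketchWaitBehavior Resolve(
        bool dashboardEnabled,
        SketchWaitBehavior? explicitBehavior = null)
    {
        return explicitBehavior ?? (dashboardEnabled
            // Interactive runs: a failed dependency can be restarted by hand,
            // so keep waiting.
            ? SketchWaitBehavior.WaitOnResourceUnavailable
            // Headless runs such as tests: nothing will restart it, so stop.
            : SketchWaitBehavior.StopOnResourceUnavailable);
    }
}
```

Under this rule a test host without a dashboard would resolve to StopOnResourceUnavailable without the caller having to pass it explicitly, which is the behavior requested above.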

@github-actions github-actions bot locked and limited conversation to collaborators Mar 21, 2025