Run only quarantined tests in outer loop #8199

RussKie · 2025-03-20T05:16:39Z

Allow quarantine flaky tests, which excludes the test from being executed by the normal CI pipelines. The quarantined tests get executed on preset schedule by a cron job for some period of time, after which they could manually unquarantined, if their stability improves.

The quarantine attribute and the discoverer as well as some of the MSBuild infra are copied from dotnet/aspnetcore.

The test execution is implemented in an Arcade SDK way, which should make it easy to maintain and enhance, if required.

The quarantined test can be invoked with the following command:

./build.cmd -projects ./tests/Shared/SolutionTests.proj
            -restore -build -test
            /p:IsGitHubActions=true
            /p:RunQuarantinedTests=true

Setting RunQuarantinedTests=false (which is default, so can be omitted) will execute all the non-quarantined tests.

Pull Request Overview

This PR implements a mechanism to quarantine flaky tests from the main CI execution and run them separately in an outer loop. It introduces a new custom test attribute and trait discoverer, adds a dedicated GitHub workflow for quarantined tests, and applies the new attribute across various test projects.

Added QuarantinedTestAttribute and QuarantinedTestTraitDiscoverer in the Aspire.Components.Common.Tests project.
Created a new GitHub workflow (.github/workflows/tests-outerloop.yml) to execute quarantined tests.
Updated multiple test files to mark tests as quarantined using the [QuarantinedTest] attribute.

Reviewed Changes

Copilot reviewed 23 out of 28 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
tests/Aspire.Components.Common.Tests/QuarantinedTestAttribute.cs	Introduces the custom attribute for quarantining tests.
tests/Aspire.Components.Common.Tests/QuarantinedTestTraitDiscoverer.cs	Implements trait discovery for quarantined tests via xUnit.
.github/workflows/tests-outerloop.yml	Adds a new workflow to run quarantined tests on a scheduled outer loop.
tests/Aspire.Hosting.NodeJs.Tests/NodeFunctionalTests.cs	Applies the quarantined test attribute to Node.js functional tests.
tests/Aspire.Hosting.Testing.Tests/TestingFactoryTests.cs	Applies the quarantined test attribute to a test ensuring resource consumption.
tests/Aspire.Hosting.Testing.Tests/TestingBuilderTests.cs	Applies the quarantined test attribute to a test verifying argument propagation.
tests/Aspire.Hosting.Redis.Tests/RedisFunctionalTests.cs	Applies the quarantined test attribute to several Redis functionality tests.
tests/Aspire.Dashboard.Tests/Integration/Playwright/BrowserTokenAuthenticationTests.cs	Applies the quarantined test attribute to a Playwright integration test.
tests/Aspire.Hosting.Elasticsearch.Tests/ElasticsearchFunctionalTests.cs	Applies the quarantined test attribute to Elasticsearch related tests.
tests/Aspire.Cli.Tests/Hosting/CliBackchannelTests.cs	Applies the quarantined test attribute to a CLI backchannel test.
tests/Aspire.Hosting.Azure.Tests/AzureCosmosDBEmulatorFunctionalTests.cs	Applies the quarantined test attribute to an Azure Cosmos DB emulator test.
tests/Aspire.Dashboard.Tests/Integration/Playwright/AppBarTests.cs	Applies the quarantined test attribute to Playwright tests for the AppBar.
tests/Aspire.Hosting.MongoDB.Tests/MongoDbFunctionalTests.cs	Applies the quarantined test attribute to MongoDB functionality tests.

Files not reviewed (5)

Aspire.sln: Language not supported
Directory.Build.targets: Language not supported
eng/Testing.props: Language not supported
eng/Testing.targets: Language not supported
global.json: Language not supported

RussKie · 2025-03-20T05:22:54Z

Directory.Build.targets

@@ -15,7 +15,7 @@
  </ItemGroup>

  <Import Condition="'$(SampleProject)' == 'true' or '$(CI)' != 'true' " Project="eng\Versions.dev.targets" />
-  <Import Condition="'$(SampleProject)' != 'true' and '$(CI)' == 'true' " Project="eng\Versions.targets" />
+  <Import Condition="'$(SampleProject)' != 'true' and '$(CI)' == 'true' and '$(IsGitHubActions)' != 'true' " Project="eng\Versions.targets" />


This import caused me a fair a bit of grief (e.g. here):

/home/runner/work/aspire/aspire/eng/Versions.targets(49,67): error MSB4057: The target "GitInfo" does not exist in the project. [/home/runner/work/aspire/aspire/src/Aspire.AppHost.Sdk/Aspire.RuntimeIdentifier.Tool/Aspire.RuntimeIdentifier.Tool.csproj] /home/runner/work/aspire/aspire/eng/Versions.targets(49,67): error MSB4057: The target "GitInfo" does not exist in the project. [/home/runner/work/aspire/aspire/src/Aspire.AppHost.Sdk/Aspire.RuntimeIdentifier.Tool/Aspire.RuntimeIdentifier.Tool.csproj] /home/runner/work/aspire/aspire/eng/Versions.targets(49,67): error MSB4057: The target "GitInfo" does not exist in the project. [/home/runner/work/aspire/aspire/src/Aspire.Cli/Aspire.Cli.csproj] /home/runner/work/aspire/aspire/eng/Versions.targets(49,67): error MSB4057: The target "GitInfo" does not exist in the project. [/home/runner/work/aspire/aspire/src/Aspire.Hosting.Analyzers/Aspire.Hosting.Analyzers.csproj]

In general, these two imports look scary and nothing other dotnet/* repos use (except for maui). I'm questioning the need for these, but that's a separate discussion.
These are definitely not required for the outer loop runs.

We should be able to clean this up.

In the meantime, check what we do in other workflows where we export CI env variable and set it right before trying to build.

RussKie · 2025-03-20T07:24:48Z

Here's a test run to show it'll look: https://github.com/RussKie/aspire/actions/runs/13963811532

radical · 2025-03-20T07:30:52Z

Does RunQuarantinedTests=true mean that the quarantined tests will run in addition to all the other tests?
I think it would be better to run essentially the tests in the same way as in the regular workflow, so they can run in parallel, and have the same test environment. I'll have a PR up soon to combine the Templates, and Integrations workflows, which should make reusing them easier.

joperezr · 2025-03-20T23:25:43Z

Aspire.sln

@@ -3915,7 +3915,7 @@ Global
 		{39AD3DBA-6B79-44FC-9637-232544D0CA46} = {27381127-6C45-4B4C-8F18-41FF48DFE4B2}
 		{0C97F423-DAAE-4B63-B1B1-CB85E2840B98} = {27381127-6C45-4B4C-8F18-41FF48DFE4B2}
 		{23075FC1-D893-434D-9A20-02870B96ABA7} = {C424395C-1235-41A4-BF55-07880A04368C}
-		{F371C3EE-8AB3-4261-A5F2-9A1FF77ADC19} = {C424395C-1235-41A4-BF55-07880A04368C}
+		{F371C3EE-8AB3-4261-A5F2-9A1FF77ADC19} = {4981B3A5-4AFD-4191-BF7D-8692D9783D60}


Is this intentional?

Yes, moved Aspire.Components.Common.Tests.csproj to the "tests" v-folder because it is used across all tests.

I'm of an option the project should be renamed to not imply it is a project that contains tests (e.g., to "Aspire.TestsUtilities.csproj"), as its current names implies just that.

radical · 2025-03-20T23:34:48Z

I would suggest re-using the existing workflows that we have for tests, with the extra property passed to the test runs. This would mean that the flaky tests will be running with everything else in the same environment as they do on PRs, having the same load on resources.

cc @davidfowl @eerhardt

eng/Testing.props

eng/Testing.targets

joperezr · 2025-03-21T16:28:47Z

tests/Aspire.Cli.Tests/Hosting/CliBackchannelTests.cs

@@ -50,6 +51,7 @@ public async Task AppHostRespondsToPingWithMatchingTimestamp()

    [Fact]
    [ActiveIssue("https://github.com/dotnet/aspire/issues/8113")]
+    [QuarantinedTest("https://github.com/dotnet/aspire/issues/8113")]


What happens when a test is both ActiveIssue and QuarantinedTest? Is there an intention of separating a test that is flaky vs a test that is just failing consistently? The original discussion mentioned that there was value in separating plain failing tests vs flaky ones, and having the flaky use QuarantinedTest and failing tests using ActiveIssue.

Not saying this is how it has to work, I'm interested in what you think we should do here.

That's a good question.

I thought these two can conicide, but after some tests (and some lost hair) I came to a conclusion that, essentially, these two attributes clash. ActiveIssue appears to act as a conditional attribute by skipping tests for a specified platform. And this prevents tests to be executed in the outerloop.

joperezr · 2025-03-21T16:36:12Z

tests/Shared/SolutionTests.proj

+
+    Generally, this project should invoked in the following way:
+
+      ./build.cmd -projects ./tests/Shared/SolutionTests.proj -restore -build -test /p:IsGitHubActions=true /p:RunQuarantinedTests=[true|false]


Can you help me understand a bit better why this is needed? I think that running build.cmd -restore -build -test ... already works and will test all the projects in the repo as we have this:

aspire/eng/Build.props

Line 7 in fca6ddb

<ProjectToBuild Include="$(RepoRoot)tests\**\*.csproj" />

. If what you want is to make it so we only build tests and not the product, feels like we could reuse that Build.props and just add conditions that can be set via build.cmd command for this without having to introduce a new file. As an example, we already have build.cmd -build -testnobuild which builds all of the product but not the tests.

The reason I created a dedicated proj file was to execute only the required test projects, following the established filtering pattern in

aspire/tests/Shared/GetTestProjects.proj

Lines 17 to 27 in b3cc78f

<ItemGroup>

<_TestProjectsToExclude Include="$(RepoRoot)tests\Shared\**\*Tests.csproj" />

<_TestProjectsToExclude Include="$(RepoRoot)tests\testproject\**\*Tests.csproj" />

<_TestProjectsToExclude Include="$(RepoRoot)tests\TestingAppHost1\**\*Tests.csproj" />



<_TestProjectsToExclude Include="$(RepoRoot)tests\Aspire.Templates.Tests\**\*Tests.csproj" />

<_TestProjects Include="$(RepoRoot)tests\**\*Tests.csproj"

Exclude="@(_TestProjectsToExclude)" />

</ItemGroup>

I treat this is an intermediary step to address the immediate task. The current implementations are currently a) fluid and b) complicated, and it doesn't feel appropriate to splice in these.

I'm not 100% clear on the target state, but I think if each test project declared what test infra it's expected to be executed (e.g., by a virtue of having an MSBuild property), it could allow us to simplify the current infra.

RussKie · 2025-03-24T01:08:45Z

Does RunQuarantinedTests=true mean that the quarantined tests will run in addition to all the other tests?

No, the flow will only run the quarantined tests.

I think it would be better to run essentially the tests in the same way as in the regular workflow, so they can run in parallel, and have the same test environment. I'll have a PR up soon to combine the Templates, and Integrations workflows, which should make reusing them easier.

I disagree.

The quarantined flow has different requirements to a check run for each build such as

it runs on a dedicated schedule;
it doesn't have time constraints - i.e., it can run for 20 minutes, if necessary.
we can't know up front (without a significant very custom effort) which test project contain quarantined tests.

Following from the above:

all affected tests can run serially, if required (points 1 and 2);
no need to use excessive compute (point 2 and 3).

* Remove remnants of test coverage infra * Do not force to restore/build for test flows

Allow quarantine flaky tests, which excludes the test from being executed by the normal CI pipelines. The quarantined tests get executed on preset schedule by a cron job for some period of time, after which they could manually unquarantined, if their stability improves. The quarantine attribute and the discoverer as well as some of the MSBuild infra are copied from dotnet/aspnetcore. The test execution is implemented in an Arcade SDK way, which should make it easy to maintain and enhance, if required. The quarantined test can be invoked with the following command: ./build.cmd -projects ./tests/Shared/SolutionTests.proj -restore -build -test /p:IsGitHubActions=true /p:RunQuarantinedTests=true Setting "RunQuarantinedTests=false" (which is default, so can be omitted) will execute all the non-quarantined tests.

RussKie · 2025-03-24T10:53:01Z

Here's how it'd look:

radical · 2025-03-24T19:20:35Z

I disagree.

IMHO, the problem here is that the tests are not without side-effects. The tests can be using resources like docker containers, creating/deleting networks, or can have dcp maybe taking too long to quit etc. All of which means that if the individual tests are run by themselves then they are running in a very different environment. This would mean the signal about the tests passing or failing might not really apply to the real test environment in the main CI.

Maybe we can identity the test projects that have at least one Quarantined test, and run those with run-tests.yml, same as the other tests. And in the reporting section only look at the ones that were Quarantined.

RussKie · 2025-03-24T22:08:22Z

The tests can be using resources like docker containers, creating/deleting networks, or can have dcp maybe taking too long to quit etc. All of which means that if the individual tests are run by themselves then they are running in a very different environment. This would mean the signal about the tests passing or failing might not really apply to the real test environment in the main CI.

This would indicate that the tests are poorly implemented, and they should be either redesigned to make stable or removed entirely. In such shape the tests are pretty much unusable to maintainers or contributors.
We can't/shouldn't make any assumptions or requirements that tests should only be executed in a vacuum.

Maybe we can identity the test projects that have at least one Quarantined test, and run those with run-tests.yml, same as the other tests. And in the reporting section only look at the ones that were Quarantined.

It will likely be complex and very likely unreliable. Please feel free to take over this task.

radical · 2025-03-24T23:20:38Z

My opinion is really an assumption based on looking at flaky tests in the past. But I haven't tried running them separately.

We can go with running them like this to start with. And you can track if re-enabling these in the main tests shows any issues related to the difference in test environment. And we can try to improve those tests.

radical · 2025-03-24T23:22:08Z

re:moving-Aspire.Components.Common.Tests, the condition for archiving the tests needs to be updated so it is not triggered for test support projects.

Copilot bot review requested due to automatic review settings March 20, 2025 05:16

RussKie requested review from radical, eerhardt, mitchdenny and ReubenBond as code owners March 20, 2025 05:16

github-actions bot added the area-integrations Issues pertaining to Aspire Integrations packages label Mar 20, 2025

Copilot AI reviewed Mar 20, 2025

View reviewed changes

RussKie commented Mar 20, 2025

View reviewed changes

RussKie force-pushed the quarantined_outerloop branch from 7a7e679 to 03292b1 Compare March 20, 2025 07:08

joperezr reviewed Mar 20, 2025

View reviewed changes

davidfowl added area-engineering-systems infrastructure helix infra engineering repo stuff and removed area-integrations Issues pertaining to Aspire Integrations packages labels Mar 21, 2025