Skip to content

Fix SimpleSearchIT.doTestSimpleTerminateAfterTrackTotalHitsUpTo flaky test #18235

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 9, 2025

Conversation

prudhvigodithi
Copy link
Member

@prudhvigodithi prudhvigodithi commented May 7, 2025

Description

Fix SimpleSearchIT.doTestSimpleTerminateAfterTrackTotalHitsUpTo flaky test, coming from #16851 (comment) rather than updating the approximation code with +1 to get the relation": "gte", this PR is an attempt to fix the test with approximation behavior on long fields.

Falling back to +1 based on analysis part of #16851 (comment)

Background

Coming from the PR #18018 +1 is added and which was removed in this PR #18189 to honor the approximation behavior.

Related Issues

Coming from #16851 (comment)

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Member

@andrross andrross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harshavamsi Can you take a look here as well?

@github-project-automation github-project-automation bot moved this from Todo to In Progress in Performance Roadmap May 7, 2025
Copy link
Contributor

@harshavamsi harshavamsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this makes sense to me. Can you run this test 1000 times locally to ensure that we have no flakiness?

@prudhvigodithi prudhvigodithi marked this pull request as draft May 8, 2025 00:08
Copy link
Contributor

github-actions bot commented May 8, 2025

❌ Gradle check result for 79c76a8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented May 8, 2025

❕ Gradle check result for b285671: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented May 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.46%. Comparing base (560ac10) to head (1925af0).
Report is 13 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18235      +/-   ##
============================================
- Coverage     72.56%   72.46%   -0.11%     
+ Complexity    67261    67169      -92     
============================================
  Files          5476     5476              
  Lines        310478   310478              
  Branches      45133    45133              
============================================
- Hits         225313   224996     -317     
- Misses        66840    67127     +287     
- Partials      18325    18355      +30     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@prudhvigodithi
Copy link
Member Author

Hey @andrross @harshavamsi based on analysis #16851 (comment) I have updated the PR to add +1.

Also I ran and did not see any failures.

for i in {1..500}; do
  SEED=$(od -vAn -N8 -tx8 < /dev/urandom | tr -d ' ' | tr 'a-f' 'A-F')
  SEED=$(printf "%016s" "$SEED")
  echo "The run is $i"
  ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.simple.SimpleSearchIT.testSimpleTerminateAfterTrackTotalHitsUpToSize"   -Dtests.seed=$SEED 
done

Thanks

@prudhvigodithi
Copy link
Member Author

@andrross and @harshavamsi if you folks can take another review, falling back to +1 based on analysis part of #16851 (comment). Thanks

@andrross
Copy link
Member

andrross commented May 8, 2025

@prudhvigodithi What about Bowen's comment here? Can we improve the test to make it so it would consistently fail?

@prudhvigodithi
Copy link
Member Author

prudhvigodithi commented May 8, 2025

@prudhvigodithi What about Bowen's comment here? Can we improve the test to make it so it would consistently fail?

If we are going with +1, its always "relation": "gte", so we dont need any if conditions. @bowenlan-amzn correct me if I'm wrong.

Ya based on the analysis part of #16851 (comment), we cannot have consistently get the same behavior even with test size < track_total_size and > conditions. So moving with +1 to get the same behavior which used be even before approximation.

@bowenlan-amzn
Copy link
Member

@Prudhvi Godithi commented on May 8, 2025, 11:37 AM PDT:

What about Bowen's comment here? Can we improve the test to make it so it would consistently fail?

If we are going with +1, its always "relation": "gte", so we dont need any if conditions. correct me if I'm wrong.

Ya based on the analysis part of #16851 (comment), we cannot have consistently get the same behavior even with test size < track_total_size and > conditions. So moving with +1 to get the same behavior which used be even before approximation.

Basically if we separate that one random size test to 2 scenarios

  1. size <= track_total_size(5)
  2. size > track_total_size(5)

Scenario 2 should never fail regarding the approximation framework
Scenario 1 now is still flaky not consistently fail, thanks to @prudhvigodithi discovered "approximation" doesn't return exactly size length of docId iterator, it can return > size, and if > 5, then test pass.

I think still worth to refactor to express our understanding here. Seems just a few lines change.

@prudhvigodithi
Copy link
Member Author

prudhvigodithi commented May 8, 2025

I think still worth to refactor to express our understanding here. Seems just a few lines change.

Just getting back to the PR, are you suggesting to update the tests by adding the following scenarios and remove the +1 ?
size <= track_total_size(5)
size > track_total_size(5)

Because with +1 it will look something like this.

        if (size <=5 ) {
            assertEquals(GREATER_THAN_OR_EQUAL_TO, searchResponse.getHits().getTotalHits().relation());
        }
        if (size >= 5 ) {
            assertEquals(GREATER_THAN_OR_EQUAL_TO, searchResponse.getHits().getTotalHits().relation());
        }

If we remove the +1 we can have something like

  if (size <= 5) {
      assertTrue(
          "Relation should be either EQUAL_TO or GREATER_THAN_OR_EQUAL_TO",
          searchResponse.getHits().getTotalHits().relation() == EQUAL_TO ||
          searchResponse.getHits().getTotalHits().relation() == GREATER_THAN_OR_EQUAL_TO
      );

  } else {
      assertEquals(GREATER_THAN_OR_EQUAL_TO, searchResponse.getHits().getTotalHits().relation());
  }

Do we need an if condition if we go with +1 @bowenlan-amzn ?

@bowenlan-amzn
Copy link
Member

No, the idea is to refactor testSimpleTerminateAfterTrackTotalHitsUpToSize to 2 tests, one for size <= 5, one for size >5
You see 5 is kind of a magic number here, which is the setTrackTotalHitsUpTo number used in the current flaky part.

searchResponse = client().prepareSearch("test")
            .setQuery(QueryBuilders.rangeQuery("field").gte(1).lte(numDocs))
            .setSize(size)
            .setTrackTotalHitsUpTo(5) <-- why 5
            .get();
        assertEquals(5, searchResponse.getHits().getTotalHits().value());
        assertEquals(GREATER_THAN_OR_EQUAL_TO, searchResponse.getHits().getTotalHits().relation()); <-- flaky line

@prudhvigodithi
Copy link
Member Author

Make sense thanks @bowenlan-amzn, just updated the test please check.

Signed-off-by: Prudhvi Godithi <[email protected]>
Copy link
Contributor

github-actions bot commented May 9, 2025

✅ Gradle check result for 1925af0: SUCCESS

@prudhvigodithi
Copy link
Member Author

Here is the backport to 3.0 PR #18220.
Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants