Ci add notice about outcomes #52
Conversation
enhance CAT AI statistical reporting with pass count and total count
move helper functions to helpers.py
…les and clarify report generation details
Fix: CI missing variable
Pull Request Overview
This PR improves the testing framework and error handling by introducing a modular script for generating statistical reports, refactoring test functions for clarity, and consolidating helper functions.
- Extracts the CAT AI statistical report script for workflow modularity
- Adds a new function for JSON schema validation and refactors tests to use helper functions
- Improves test configuration and organization by refactoring redundant functions
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| examples/team_recommender/src/response_matches_json_schema.py | Adds a function to validate JSON responses against a schema |
| examples/team_recommender/tests/helpers.py | Introduces new helper functions (e.g., natural sort, success rate assertion) |
| examples/team_recommender/tests/example_9_threshold/test_measurement_is_within_threshold.py | Refactors threshold measurement tests and error messages |
| examples/team_recommender/tests/example_7_schema_validators/test_response_has_valid_schema.py | Updates schema validation test with the new helper function |
| .github/workflows/cat-test-examples.yml | Updates workflow step to use the new statistical report script |
| examples/team_recommender/tests/conftest.py | Refactors fixtures and example discovery logic |
Comments suppressed due to low confidence (2)
examples/team_recommender/tests/helpers.py:189
- The expected order in this assertion relies on lexicographical sorting, which does not reflect natural number ordering; consider using natural_sort_key in the sort, or updating the expected order accordingly (see the sketch after these comments).
assert [ "example_10_threshold", "example_1_text_response", "example_2_unit", "example_8_retry_network", "example_9_retry_with_open_telemetry", ] == sorted(unsorted), "The list should be sorted by the number in the name"
examples/team_recommender/tests/example_9_threshold/test_measurement_is_within_threshold.py:85
- [nitpick] The error message 'Expected {expected_success_rate_measured} to be within of the success rate' is unclear; consider revising it to clearly indicate that the success rate is expected to lie within a specific confidence interval.
assert is_within_expected(expected_success_rate_measured, failure_count, sample_size), (
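For illustration, here is a minimal sketch of the kind of natural sort key the first suppressed comment refers to; the actual natural_sort_key in helpers.py may be implemented differently, and the printed output is only what this sketch would produce:

```python
import re

# Hypothetical illustration: split a name into text and integer chunks so that
# "example_10_threshold" sorts after "example_9_...", not after "example_1_...".
def natural_sort_key(name: str):
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", name)]

examples = [
    "example_10_threshold",
    "example_1_text_response",
    "example_2_unit",
    "example_8_retry_network",
    "example_9_retry_with_open_telemetry",
]

# Plain sorted(examples) keeps this lexicographical order (example_10 before example_1);
# sorting with the key yields the natural order instead.
print(sorted(examples, key=natural_sort_key))
# ['example_1_text_response', 'example_2_unit', 'example_8_retry_network',
#  'example_9_retry_with_open_telemetry', 'example_10_threshold']
```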
…t_sample and update related references
…ability with constants
The issue is that the condition uses success(), which means the step will only run if all previous steps in the job were successful.
…people_is_allocated for clarity
…n response handling
…tistical analysis tests
correct data type annotation
Co-authored-by: Copilot <[email protected]>
austinworks left a comment
🙈
This pull request includes several changes aimed at improving the testing framework, refactoring code for better organization, and enhancing error handling and reporting. The most important changes include the extraction of a script for generating statistical reports, the addition of new helper functions and fixtures, and the refactoring of existing test functions for better clarity and structure.
Improvements to testing framework and error handling:
- .github/workflows/cat-test-examples.yml: Refactored the step for showing the CAT AI statistical report to use a separate script (show-statistical-report.sh) for better modularity.
- .github/workflows/show-statistical-report.sh: Created a new script to generate a statistical report of test results, improving the readability and maintainability of the workflow.

Refactoring and code organization:
- examples/team_recommender/src/response_matches_json_schema.py: Added a new function, response_matches_json_schema, for validating JSON responses against a schema, improving code reuse and separation of concerns (a sketch follows below).
- examples/team_recommender/tests/conftest.py: Added new fixtures and refactored existing functions to improve test configuration and setup. Removed redundant sorting functions and integrated them into the helpers module.
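For orientation, a minimal sketch of what a schema-validation helper like this could look like, assuming it wraps the jsonschema library; the function body here is an assumption for illustration, not the PR's actual code:

```python
import json

import jsonschema


def response_matches_json_schema(response_text: str, schema: dict) -> bool:
    """Return True if the response parses as JSON and validates against the schema."""
    try:
        payload = json.loads(response_text)
        jsonschema.validate(instance=payload, schema=schema)
        return True
    except (json.JSONDecodeError, jsonschema.ValidationError):
        return False
```

A test can then reduce to a single check, e.g. `assert response_matches_json_schema(response, expected_schema)`.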
Enhancements to existing tests:
- examples/team_recommender/tests/example_7_schema_validators/test_response_has_valid_schema.py: Refactored the test for validating JSON schema responses to use the newly added response_matches_json_schema function and added retry logic for handling API connection errors.
- examples/team_recommender/tests/example_9_threshold/test_measurement_is_within_threshold.py: Removed redundant validation functions and integrated them into the helpers module. Refactored the test for measuring success rates to improve clarity and maintainability.

Addition of helper functions:
- examples/team_recommender/tests/helpers.py: Added new helper functions for sorting, success rate assertion, and generating test examples. These functions improve code reuse and simplify test logic across multiple test files. A sketch of the success-rate assertion idea follows below.
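Relatedly, the review nitpick above about the unclear "to be within of the success rate" message could be addressed with an assertion along these lines. The helper name, the normal-approximation confidence interval, and the message wording here are assumptions for illustration, not the PR's actual implementation:

```python
import math


def assert_success_rate_within(expected_rate: float, failure_count: int,
                               sample_size: int, z: float = 1.96) -> None:
    """Assert that the expected success rate lies inside the confidence interval
    implied by the measured failures over the sample."""
    measured = (sample_size - failure_count) / sample_size
    # Normal-approximation confidence interval around the measured success rate.
    margin = z * math.sqrt(measured * (1.0 - measured) / sample_size)
    low, high = measured - margin, measured + margin
    assert low <= expected_rate <= high, (
        f"Expected success rate {expected_rate:.2f} to be within the "
        f"[{low:.2f}, {high:.2f}] confidence interval measured from "
        f"{failure_count} failures over {sample_size} samples"
    )
```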