Improve pipeline validation #4
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 38ce9c7dd8
    candidates.extend([stripped, normalized.replace("-", "_")])
    return _dedupe_preserve_order(candidates)
Stop probing non-importable package specifiers
_package_to_import_candidates now includes the raw requirement string (stripped) as an import target, which produces invalid probes for hyphenated packages such as pytest-cov and pytest-timeout (both are always added by extract_dependencies). A hyphen is not valid in a Python module name, so those probes can never succeed: _discover_missing_dependencies keeps reporting the packages as missing, and install_dependencies_with_retry re-runs pip install until retries are exhausted, even when the dependencies are already installed. In CI or offline environments this can turn normal runs into repeated dependency-install failures and wasted time.
    )
    return failed_results, failed_evaluation, False

    replay_results = run_tests(test_file, codebase_path)
Preserve mutation validation in final acceptance replay
Final acceptance replays the saved suite with run_tests(test_file, codebase_path) but does not enable mutation testing, then overwrites the pipeline’s final evaluation with this replay result. If the accepted iteration previously ran mutation testing, the final report now drops to mutation_score = 0 and loses mutation evidence, while the replay check can still pass because the signature comparison ignores mutation fields. This silently weakens final validation and reporting.
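One way to keep the mutation evidence, sketched under assumptions: treat evaluations as plain dicts and carry the mutation fields from the accepted iteration into the final evaluation whenever the replay did not run mutation testing itself. The field names in `MUTATION_FIELDS`, the `replay_ran_mutation` flag, and the function `finalize_evaluation_sketch` are all hypothetical; the pipeline's real evaluation type may differ.

```python
# Hypothetical field names; only mutation_score appears in the review comment.
MUTATION_FIELDS = ("mutation_score", "mutants_killed", "mutants_total")


def finalize_evaluation_sketch(accepted: dict, replay: dict, replay_ran_mutation: bool) -> dict:
    """Build the final evaluation without discarding earlier mutation results."""
    final = dict(replay)  # copy so the replay result itself is not mutated
    if not replay_ran_mutation:
        # The replay has no mutation data of its own, so reuse what the
        # accepted iteration measured instead of reporting a zero score.
        for field in MUTATION_FIELDS:
            if field in accepted:
                final[field] = accepted[field]
    return final
```

The alternative is to re-run mutation testing during the replay and include the mutation fields in the signature comparison. That is stricter but slower, so which option fits depends on how long a mutation run takes in this pipeline.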
38ce9c7 to 055c449
No description provided.