
Conversation

@titusfortner (Member) commented Jan 10, 2026

User description

🔗 Related Issues

This is what #16840 was supposed to be.
I separated it from #16858

  • Reruns are disabled by default; all tests are run here to show that this does not break existing setups
  • Currently we upload all logs on failure, not just the failing ones, and they are nested deep inside directories; this PR changes that
  • Note that with debugging enabled, the output is likely too large for the console, which is why getting the uploads right is important

💥 What does this PR do?

  1. Run Bazel step:
    • Now executes in a subshell so output is still written to the log file when the command fails, while the step itself still fails
  2. Rerun failures with debug step:
    • Executes rerun-failures.sh with the run inputs and whether or not to rerun
    • Parses failing targets from console log generated in Run Bazel step into build/failures/_run1.txt
    • Exits if not rerun
    • Rerun failing targets only with SE_DEBUG=true and flakiness=1; tee console log to build/bazel-console2.log
    • Parses failing targets from rerun console log into failures/_run2.txt
  3. Collect Failed logs step
    • Iterate over targets in _run2.txt if a rerun happened, otherwise _run1.txt
    • Copy each log file into build/failures/, renaming it after its target path so everything that failed ends up in one directory
  4. Upload Logs step
    • The uploaded file is a zip of the failures directory, containing _run1.txt, an optional _run2.txt, and one log file per failure
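The subshell-plus-tee pattern described in step 1 can be sketched as follows. This is a minimal illustration rather than the exact workflow code (the real step uses `set -euxo pipefail` and toggles `set +e` around the pipeline); the `run_logged` helper and the sample failing command are invented for the demo, while the `build/bazel-console.log` path comes from this PR:

```shell
#!/usr/bin/env bash
# Sketch of the "Run Bazel" pattern: run the command inside a group so all
# of its output is captured to a log file via tee, while the command's own
# exit status (not tee's) decides whether the step fails.
set -u

run_logged() {
  local log="$1"; shift
  { "$@"; } 2>&1 | tee "$log"
  return "${PIPESTATUS[0]}"   # tee exits 0; keep the wrapped command's status
}

mkdir -p build
if run_logged build/bazel-console.log sh -c 'echo "//foo:bar FAILED in 1.2s"; exit 3'; then
  echo "step would pass"
else
  echo "step would fail (status $?)"
fi
```

Without `PIPESTATUS` (or `pipefail`), the pipeline's status would be tee's, and a failing Bazel run would leave the step green.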

🔧 Implementation Notes

Files go in the build directory since it is gitignored and we don't want to disrupt the step that generates a git patch with changes. We could put them somewhere else, or add a failures directory to .gitignore.

💡 Additional Considerations

You can see the failing logs here: https://github.com/SeleniumHQ/selenium/actions/runs/20857910080/job/59929556499#step:23:30

🔄 Types of changes

  • New feature (non-breaking change which adds functionality and tests!)

PR Type

Enhancement


Description

  • Extract test failure handling into separate reusable scripts

  • Parse failures from Bazel console logs and optionally rerun with debug

  • Collect failed test logs from testlogs directory into single location

  • Upload organized failure logs as artifacts with improved structure


Diagram Walkthrough

flowchart LR
  A["Run Bazel<br/>output to console.log"] --> B["Parse failures<br/>from console.log"]
  B --> C{Rerun<br/>enabled?}
  C -->|Yes| D["Rerun with debug<br/>SE_DEBUG=true"]
  D --> E["Parse rerun failures<br/>to _run2.txt"]
  C -->|No| F["Use _run1.txt"]
  E --> G["Collect test logs<br/>from testlogs dir"]
  F --> G
  G --> H["Upload failures<br/>as artifact"]

File Walkthrough

Relevant files
Enhancement
rerun-failures.sh
Parse failures and optionally rerun with debug                     

scripts/github-actions/rerun-failures.sh

  • New script to parse Bazel console logs for failed targets using awk
  • Extracts base command and reruns failed tests with SE_DEBUG=true and
    flaky_test_attempts=1
  • Captures rerun output to separate console log and parses failures
    again
  • Handles conditional rerun based on input parameter
+31/-0   
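The awk extraction this script performs can be exercised against a fabricated console-log snippet. The log lines and target names below are invented for illustration; the awk expression itself is the one quoted later in this review:

```shell
#!/usr/bin/env bash
# Demo of the failure-parsing awk from rerun-failures.sh on a fake log.
set -euo pipefail

mkdir -p build/failures
cat > build/bazel-console.log <<'EOF'
INFO: Build completed, 2 tests FAILED
//java/test/org/openqa:CookieTest                          FAILED in 12.3s
//java/test/org/openqa:WindowTest                          PASSED in 4.1s
//py/test:unit                                             FAILED in 8.0s
EOF

# Keep only lines whose first field is a //-prefixed target, whose second
# field mentions FAILED, and whose third field is "in" (the timing column).
awk '$1 ~ /^\/\// && $2 ~ /FAILED/ && $3 == "in" { print $1 }' \
  build/bazel-console.log > build/failures/_run1.txt

cat build/failures/_run1.txt
```

The `$3 == "in"` guard is what keeps summary lines like `INFO: ... 2 tests FAILED` out of the target list.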
collect-test-logs.sh
Collect and organize failed test logs                                       

scripts/github-actions/collect-test-logs.sh

  • New script to collect failed test logs from Bazel testlogs directory
  • Reads target list from _run2.txt if present, otherwise _run1.txt
  • Converts Bazel target paths to safe filenames and copies logs to
    build/failures/
  • Handles missing log files with warnings
+38/-0   
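The target-to-filename conversion works roughly like this. The target label here is hypothetical; the `tr`-based path conversion mirrors the snippet quoted in the suggestions below, and the flattening scheme is a sketch of the "rename with the name of the path" idea:

```shell
#!/usr/bin/env bash
# Sketch: map a Bazel test label to its bazel-testlogs path and to a flat,
# filesystem-safe name for the build/failures/ directory.
set -euo pipefail

target="//java/test/org/openqa:CookieTest"   # hypothetical failing target

# Strip the leading // and turn the : separator into a path component,
# giving the directory layout bazel-testlogs uses.
rel_path=$(tr ':' '/' <<< "${target#//}")
echo "log source: <testlogs>/${rel_path}/test.log"

# Flatten the label into a single safe filename so every failure can live
# side by side in one directory.
safe_name=$(tr '/:' '__' <<< "${target#//}")
echo "collected as: build/failures/${safe_name}.log"
```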
bazel.yml
Integrate new failure handling scripts into workflow         

.github/workflows/bazel.yml

  • Refactored "Run Bazel" step to execute in subshell for consistent
    logging on failure
  • Replaced inline failure parsing with call to rerun-failures.sh script
  • Added new "Collect failed test logs" step calling collect-test-logs.sh
  • Added new "Upload failed test logs" step to upload organized failures
    directory
  • Removed old "Parse Bazel log for failures" and inline rerun logic
  • Removed old upload-artifact step that uploaded raw testlogs directory
+15/-25 
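Put together, the tail of bazel.yml might look roughly like this. This is a sketch, not the actual workflow diff: the "Rerun failures with debug" step is quoted from the suggestion below, while the artifact name and upload-artifact version are assumptions:

```yaml
- name: Rerun failures with debug
  if: always()
  run: ./scripts/github-actions/rerun-failures.sh '${{ inputs.run }}' '${{ inputs.rerun-with-debug }}'

- name: Collect failed test logs
  if: always()
  run: ./scripts/github-actions/collect-test-logs.sh

- name: Upload failed test logs
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: failed-test-logs   # assumed artifact name
    path: build/failures/
```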

@selenium-ci selenium-ci added the B-build Includes scripting, bazel and CI integrations label Jan 10, 2026
@qodo-code-review bot (Contributor) commented Jan 10, 2026

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Command injection

Description: The script derives base_cmd from untrusted/variable RUN_CMD and executes it unquoted
($base_cmd ...), enabling command injection if inputs.run (or the passed argument) can be
influenced to include shell metacharacters or additional commands. rerun-failures.sh [21-27]

Referred Code
base_cmd=$(echo "$RUN_CMD" | sed 's| //[^ ]*||g')
targets=$(tr '\n' ' ' < build/failures/_run1.txt)
echo "Rerunning tests: $base_cmd --test_env=SE_DEBUG=true --flaky_test_attempts=1 $targets"
set +e
{
  $base_cmd --test_env=SE_DEBUG=true --flaky_test_attempts=1 $targets
} 2>&1 | tee build/bazel-console2.log
Ticket Compliance
🟡 🎫 #1234
🔴 Ensure that click() on a link with javascript: in the href triggers the JavaScript in Firefox (regression from Selenium 2.47.1 to 2.48.x). Provide a fix or change that restores prior behavior for the provided test case (alerts in 2.47.1 but not 2.48.x) on Firefox 42.
🟡 🎫 #5678
🔴 Investigate and fix repeated ChromeDriver instantiation producing ConnectFailure (Connection refused) errors after the first instance on the Ubuntu/Chrome/ChromeDriver versions provided. Ensure multiple ChromeDriver instances can be created without connection-refused errors and document any required setup/limits.
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed


Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed


🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing prereq checks: The script assumes build/bazel-console.log exists and will fail with little context if the
file is missing or empty when running awk to produce build/failures/_run1.txt.

Referred Code
mkdir -p build/failures
awk '$1 ~ /^\/\// && $2 ~ /FAILED/ && $3 == "in" { print $1 }' build/bazel-console.log > build/failures/_run1.txt


Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Command tracing enabled: The use of set -x will echo executed commands and expanded variables to CI logs, which
could inadvertently expose sensitive values if any appear in paths, environment, or
inputs.

Referred Code
set -euxo pipefail


Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Unvalidated command execution: The script derives base_cmd from the externally-provided RUN_CMD and executes it without
validation/allowlisting, which could be unsafe if inputs.run can be influenced beyond
trusted workflow configuration.

Referred Code
RUN_CMD="${1:-}"
RERUN_WITH_DEBUG="${2:-false}"

mkdir -p build/failures
awk '$1 ~ /^\/\// && $2 ~ /FAILED/ && $3 == "in" { print $1 }' build/bazel-console.log > build/failures/_run1.txt

if [ "$RERUN_WITH_DEBUG" != "true" ]; then
  exit 0
fi

if [ ! -s build/failures/_run1.txt ]; then
  echo "No failed tests to rerun."
  exit 0
fi

base_cmd=$(echo "$RUN_CMD" | sed 's| //[^ ]*||g')
targets=$(tr '\n' ' ' < build/failures/_run1.txt)
echo "Rerunning tests: $base_cmd --test_env=SE_DEBUG=true --flaky_test_attempts=1 $targets"
set +e
{
  $base_cmd --test_env=SE_DEBUG=true --flaky_test_attempts=1 $targets


 ... (clipped 1 lines)


Compliance status legend:
🟢 - Fully compliant
🟡 - Partially compliant
🔴 - Not compliant
⚪ - Requires further human verification
🏷️ - Compliance label

@qodo-code-review bot (Contributor) commented Jan 10, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
High-level
Use Bazel's structured output instead

Instead of parsing plain text console logs with awk to find failed tests, use
Bazel's Build Event Protocol (--build_event_json_file). This provides structured
JSON output for more robustly determining test results and log file locations.

Examples:

scripts/github-actions/rerun-failures.sh [10]
awk '$1 ~ /^\/\// && $2 ~ /FAILED/ && $3 == "in" { print $1 }' build/bazel-console.log > build/failures/_run1.txt
scripts/github-actions/collect-test-logs.sh [25-27]
  rel_path=$(tr ':' '/' <<< "${target#//}")

  log_source="$TESTLOGS_ROOT/${rel_path}/test.log"

Solution Walkthrough:

Before:

# scripts/github-actions/rerun-failures.sh
# Parse human-readable console output to find failures
awk '$1 ~ /^\/\// && $2 ~ /FAILED/ ...' build/bazel-console.log > build/failures/_run1.txt

# scripts/github-actions/collect-test-logs.sh
# Iterate through the parsed list of failed targets
while IFS= read -r target; do
  # Manually construct the log file path from the target string
  rel_path=$(tr ':' '/' <<< "${target#//}")
  log_source="$TESTLOGS_ROOT/${rel_path}/test.log"
  
  if [ -f "$log_source" ]; then
    cp "$log_source" "build/failures/${safe_name}.log"
  fi
done < "$LIST_FILE"

After:

# .github/workflows/bazel.yml
# Run Bazel with the build event protocol flag
- run: |
    ${{ inputs.run }} --build_event_json_file=build/bep.json

# New script to parse structured BEP output (e.g., using Python/jq)
# This script would replace both `rerun-failures.sh` and `collect-test-logs.sh`
# or be called by them.

# Pseudocode for a parser:
failed_targets = []
for event in parse_json_lines('build/bep.json'):
  if event.is_test_summary and event.status == 'FAILED':
    failed_targets.append(event.label)
    log_path = find_log_uri(event)
    copy_log(log_path, 'build/failures/')

# Rerun logic would use the reliably parsed `failed_targets` list.
Suggestion importance[1-10]: 9


Why: This suggestion addresses a core design weakness by replacing fragile text parsing with Bazel's robust, machine-readable Build Event Protocol, significantly improving the reliability and maintainability of the failure detection logic.

High
Possible issue
Allow rerun step to continue on error

In the bazel.yml workflow, add continue-on-error: true to the "Rerun failures
with debug" step to ensure that subsequent log collection steps run even if the
rerun step fails.

.github/workflows/bazel.yml [198-201]

 - name: Rerun failures with debug
   if: always()
+  continue-on-error: true
   run: ./scripts/github-actions/rerun-failures.sh '${{ inputs.run }}' '${{ inputs.rerun-with-debug }}'
Suggestion importance[1-10]: 8


Why: The suggestion correctly identifies that the rerun-failures.sh script's exit code would halt the workflow, preventing subsequent log collection steps from running. Adding continue-on-error: true is critical for ensuring failed test logs are always collected and uploaded.

Medium
Use xargs to prevent command-line length errors

To prevent potential "Argument list too long" errors when rerunning a large
number of failed tests, use xargs to read targets from the failure list file
instead of concatenating them into a single command.

scripts/github-actions/rerun-failures.sh [21-29]

 base_cmd=$(echo "$RUN_CMD" | sed 's| //[^ ]*||g')
-targets=$(tr '\n' ' ' < build/failures/_run1.txt)
-echo "Rerunning tests: $base_cmd --test_env=SE_DEBUG=true --flaky_test_attempts=1 $targets"
+echo "Rerunning failed tests with debug..."
 set +e
 {
-  $base_cmd --test_env=SE_DEBUG=true --flaky_test_attempts=1 $targets
+  xargs --no-run-if-empty --arg-file=build/failures/_run1.txt \
+    $base_cmd --test_env=SE_DEBUG=true --flaky_test_attempts=1
 } 2>&1 | tee build/bazel-console2.log
 status=$?
 set -e
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies a potential "Argument list too long" error when many tests fail and proposes a robust solution using xargs, significantly improving the script's reliability.

Medium
Improve awk pattern to parse failures accurately
Suggestion Impact: The same awk line was modified to change how failures are detected, replacing the original `/FAILED/` check with a more targeted regex that matches specific statuses (`FAILED|TIMEOUT|INCOMPLETE`). However, it did not implement the suggested exact equality check (`== "FAILED"`).

code diff:

-awk '$1 ~ /^\/\// && $2 ~ /FAILED/ && $3 == "in" { print $1 }' build/bazel-console.log > build/failures/_run1.txt
+awk '$1 ~ /^\/\// && $2 ~ /(FAILED|TIMEOUT|INCOMPLETE)/ && $3 == "in" { print $1 }' build/bazel-console.log > build/failures/_run1.txt

Refine the awk command to use an exact match (== "FAILED") instead of a pattern
match (~ /FAILED/) to ensure only actual test failures are parsed from the Bazel
log.

scripts/github-actions/rerun-failures.sh [10]

-awk '$1 ~ /^\/\// && $2 ~ /FAILED/ && $3 == "in" { print $1 }' build/bazel-console.log > build/failures/_run1.txt
+awk '$1 ~ /^\/\// && $2 == "FAILED" && $3 == "in" { print $1 }' build/bazel-console.log > build/failures/_run1.txt

[Suggestion processed]

Suggestion importance[1-10]: 6


Why: The suggestion correctly improves the precision of the awk command by changing a loose match (~ /FAILED/) to a strict equality check (== "FAILED"), making the failure parsing more robust and less prone to incorrect matches.

Low
Validate provided run command

Add a check at the beginning of the rerun-failures.sh script to validate that
the run command argument ($1) is not empty, and exit with an error if it is.

scripts/github-actions/rerun-failures.sh [6-7]

 RUN_CMD="${1:-}"
 RERUN_WITH_DEBUG="${2:-false}"
 
+if [ -z "$RUN_CMD" ]; then
+  echo "Error: No run command provided" >&2
+  exit 1
+fi
+
Suggestion importance[1-10]: 5


Why: This is a good defensive programming practice that adds input validation to the script, making it more robust by failing early with a clear error message if a required argument is missing.

Low
Learned best practice
Guard external command outputs

Guard bazel info results and ensure the destination directory exists so the
script fails with a clear error (or no-ops) instead of copying into a
missing/invalid path.

scripts/github-actions/collect-test-logs.sh [5-16]

 set -euxo pipefail
 
-TESTLOGS_ROOT=$(bazel info bazel-testlogs)
+mkdir -p build/failures
+
+if ! TESTLOGS_ROOT="$(bazel info bazel-testlogs)"; then
+  echo "Warning: unable to determine bazel-testlogs root; skipping log collection." >&2
+  exit 0
+fi
+if [ -z "${TESTLOGS_ROOT//[[:space:]]/}" ] || [ ! -d "$TESTLOGS_ROOT" ]; then
+  echo "Warning: invalid bazel-testlogs root '$TESTLOGS_ROOT'; skipping log collection." >&2
+  exit 0
+fi
 
 LIST_FILE=""
 if [ -s build/failures/_run2.txt ]; then
   LIST_FILE="build/failures/_run2.txt"
 elif [ -s build/failures/_run1.txt ]; then
   LIST_FILE="build/failures/_run1.txt"
 else
   exit 0
 fi
Suggestion importance[1-10]: 5


Why: Relevant best practice - Add explicit validation/guards at integration boundaries (external commands/filesystem outputs) before using them.

Low


Labels

B-build Includes scripting, bazel and CI integrations Review effort 3/5
