[py] Reduce traceback output on test failures #16854

cgoldberg · 2026-01-06T02:40:29Z

User description

🔗 Related Issues

💥 What does this PR do?

This PR changes how tracebacks are displayed on test failures when running the Python test suite (locally/RBE/GHA). Previously, we were displaying hundreds of lines of source code and traceback frames on failures.

To fix this, this PR replaces pytest's traceback display with the rich.traceback module: https://rich.readthedocs.io/en/latest/traceback.html

This allows finer grained control and better formatting than pytest's built-in functionality.

What this does:

improves formatting of tracebacks when exceptions occur
shows colored tracebacks with syntax highlighting (except when running on RBE since we don't want ANSI color codes in logs)
reduces number of surrounding source code lines shown in traceback context
reduces number of traceback frames to display (max of 5)
filters out pytest's internal frames from tracebacks
filters out detailed tracebacks from modules that don't have source code to display
filters out DeprecationWarning from output

Example output:

unit test failure on GHA:
https://github.com/SeleniumHQ/selenium/actions/runs/20736146980/job/59534158251?pr=16854#step:20:441
integration test failure on GHA:
https://github.com/SeleniumHQ/selenium/actions/runs/20736146980/job/59534158276?pr=16854#step:20:4315
integration test failure on RBE:
https://gypsum.cluster.engflow.com/invocations/default/c1c859d6-84a4-4c81-8f1d-3c2849e24803

💡 Additional Considerations

By replacing pytest's traceback handling, we are losing the traceback introspection from standard pytest asserts. However, introspection is only done when a custom assertion message isn't displayed, and we usually add them. I think this is a reasonable trade-off to reduce noise in output.

🔄 Types of changes

Dev/Test/CI

PR Type

Enhancement, Tests

Description

Replaces pytest's traceback display with rich.traceback module
Filters out pytest internals and frames without source files
Limits traceback frames to maximum of 5 for clarity
Disables ANSI colors on RBE to prevent log file corruption
Ignores DeprecationWarning messages in test output

Diagram Walkthrough

flowchart LR
  A["pytest test failure"] -->|"pytest_runtest_makereport hook"| B["Extract traceback frames"]
  B -->|"filter_frames"| C["Remove pytest internals"]
  C -->|"rebuild_traceback"| D["Reconstruct traceback object"]
  D -->|"rich.traceback.Traceback"| E["Format with syntax highlighting"]
  E -->|"console.print"| F["Display colored output"]

File Walkthrough

Relevant files

Enhancement

conftest.py `Add rich traceback formatting with pytest hook` py/conftest.py Added imports for `types`, `rich.console`, and `rich.traceback` modules Created `extract_traceback_frames()` to extract frames from traceback objects Created `filter_frames()` to remove pytest internal frames Created `rebuild_traceback()` to reconstruct traceback from filtered frames Implemented `pytest_runtest_makereport()` hook to intercept test failures and display rich-formatted tracebacks with max 5 frames Configured console with conditional color support based on RBE environment detection	+58/-0

Dependencies

BUILD.bazel `Add rich dependency for tests` py/BUILD.bazel Added `rich` package to TEST_DEPS requirement list	+1/-0

Configuration changes

pyproject.toml `Configure pytest to suppress default tracebacks` py/pyproject.toml Added pytest options to disable default traceback display (`--tb=no`) and enable output capture suppression (`-s`) Added filter to ignore DeprecationWarning messages from test output	+4/-0

qodo-code-review · 2026-01-06T03:56:15Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢	No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
⚪	🎫 No ticket provided Create ticket/issue
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
🟢	Generic: Comprehensive Audit Trails Objective: To create a detailed and reliable record of critical system actions for security analysis and compliance. Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Meaningful Naming and Self-Documenting Code Objective: Ensure all identifiers clearly express their purpose and intent, making code self-documenting Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Security-First Input Validation and Data Handling Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
⚪	Generic: Robust Error Handling and Edge Case Management Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation Status: Edge cases unhandled: The new `pytest_runtest_makereport` traceback reconstruction uses private `call.excinfo._excinfo` and does not handle cases where frame extraction/filtering yields an empty traceback (potentially impacting failure reporting behavior). Referred Code def pytest_runtest_makereport(item, call): """Hook to print Rich traceback for test failures.""" if call.excinfo is None: return exc_type, exc_value, exc_tb = call.excinfo._excinfo frames = extract_traceback_frames(exc_tb) filtered_frames = filter_frames(frames) new_tb = rebuild_traceback(filtered_frames) tb = rich.traceback.Traceback.from_exception( exc_type, exc_value, new_tb, show_locals=False, max_frames=5, width=130, ) console.print("\n", tb) Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Error Handling Objective: To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. Status: Stack trace exposure: The hook prints rich-formatted stack traces directly to console output which may expose internal details in CI logs depending on who can access them and what exceptions contain. Referred Code def pytest_runtest_makereport(item, call): """Hook to print Rich traceback for test failures.""" if call.excinfo is None: return exc_type, exc_value, exc_tb = call.excinfo._excinfo frames = extract_traceback_frames(exc_tb) filtered_frames = filter_frames(frames) new_tb = rebuild_traceback(filtered_frames) tb = rich.traceback.Traceback.from_exception( exc_type, exc_value, new_tb, show_locals=False, max_frames=5, width=130, ) console.print("\n", tb) Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Logging Practices Objective: To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. Status: Increased log exposure: Enabling `-s` (no output capture) and printing tracebacks to stdout can increase the chance that secrets/PII emitted during tests appear in CI logs, which may violate logging policy depending on what tests output. Referred Code addopts = ["-s", "--tb=no"] filterwarnings = [ "ignore::DeprecationWarning", ] Learn more about managing compliance generic rules or creating your own custom rules
Update

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

qodo-code-review · 2026-01-06T03:57:49Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
High-level	Use existing package for rich tracebacks It is suggested to replace the custom traceback reconstruction logic with the `pytest-rich` library. This would simplify the code, improve robustness, and restore pytest's assertion introspection feature, which is disabled by the current implementation. Examples: py/conftest.py [55-102] def extract_traceback_frames(tb): """Extract frames from a traceback object.""" frames = [] while tb: if hasattr(tb, "tb_frame") and hasattr(tb, "tb_lineno"): # Skip frames without source files if Path(tb.tb_frame.f_code.co_filename).exists(): frames.append((tb.tb_frame, tb.tb_lineno, getattr(tb, "tb_lasti", 0))) tb = getattr(tb, "tb_next", None) return frames ... (clipped 38 lines) Solution Walkthrough: Before: # py/conftest.py def extract_traceback_frames(tb): # ... manually walk traceback ... def filter_frames(frames): # ... manually filter frames ... def rebuild_traceback(frames): # ... manually create new traceback object ... def pytest_runtest_makereport(item, call): if call.excinfo: # ... extract, filter, rebuild traceback ... tb = rich.traceback.Traceback.from_exception(...) console.print(tb) # py/pyproject.toml addopts = ["-s", "--tb=no"] After: # py/conftest.py # (The custom functions extract_traceback_frames, filter_frames, # rebuild_traceback, and pytest_runtest_makereport are removed) # py/BUILD.bazel TEST_DEPS = [ ..., requirement("pytest-rich"), # Use pytest-rich instead of rich ... ] # py/pyproject.toml # Configuration is simplified, likely removing `--tb=no` # and relying on pytest-rich's defaults. addopts = ["-s"] Suggestion importance[1-10]: 9 __ Why: The suggestion correctly identifies that the PR's manual traceback manipulation is complex and disables pytest's assertion introspection, proposing a more robust and simpler solution by using the `pytest-rich` library.	High
Possible issue	✅ ~~Reverse frame iteration for correct traceback~~ Suggestion Impact: The commit introduced reversed(frames) iteration when processing traceback frames (in filter_frames), aligning with the suggestion’s intent to reverse frame ordering to produce correct traceback output, though it was applied in a different function than the suggested rebuild_traceback change. code diff: - for frame, lineno, lasti in frames: + for frame, lineno, lasti in reversed(frames): mod_name = frame.f_globals.get("__name__", "") if not any(skip in mod_name for skip in skip_modules): filtered.append((frame, lineno, lasti)) Reverse the frame iteration in `rebuild_traceback` to construct the traceback in the correct order by using `reversed(frames)`. py/conftest.py [78-83] def rebuild_traceback(frames): """Rebuild a traceback object from frames list.""" new_tb = None - for frame, lineno, lasti in frames: + for frame, lineno, lasti in reversed(frames): new_tb = types.TracebackType(new_tb, frame, lasti, lineno) return new_tb `[Suggestion processed]` Suggestion importance[1-10]: 9 __ Why: The suggestion correctly identifies a critical bug where the traceback is reconstructed in reverse order, and provides the simple, correct fix using `reversed()`.	High
General	Avoid duplicate tracebacks on setup failures Modify `pytest_runtest_makereport` to only generate a rich traceback for failures occurring in the 'call' phase of a test by adding a `call.when != "call"` check. py/conftest.py [86-102] def pytest_runtest_makereport(item, call): """Hook to print Rich traceback for test failures.""" - if call.excinfo is None: + if call.excinfo is None or call.when != "call": return exc_type, exc_value, exc_tb = call.excinfo._excinfo frames = extract_traceback_frames(exc_tb) filtered_frames = filter_frames(frames) new_tb = rebuild_traceback(filtered_frames) tb = rich.traceback.Traceback.from_exception( exc_type, exc_value, new_tb, show_locals=False, max_frames=5, width=130, ) console.print("\n", tb) Apply / Chat Suggestion importance[1-10]: 7 __ Why: The suggestion correctly points out that the hook runs for all test phases and proposes limiting it to the 'call' phase, which is a sensible refinement to avoid printing custom tracebacks for setup/teardown failures.	Medium
Learned best practice	✅ ~~Stop using private pytest APIs~~ Suggestion Impact: The hook was updated to stop using the private call.excinfo._excinfo tuple and instead access call.excinfo.type, call.excinfo.value, and call.excinfo.tb. (The commit also introduced a TRACEBACK_WIDTH constant, which is adjacent but not required by the suggestion.) code diff: - exc_type, exc_value, exc_tb = call.excinfo._excinfo + exc_type = call.excinfo.type + exc_value = call.excinfo.value + exc_tb = call.excinfo.tb Avoid using pytest’s private `call.excinfo._excinfo`; use the public `ExceptionInfo` attributes and guard for missing fields to keep this hook compatible across pytest versions. py/conftest.py [90] -exc_type, exc_value, exc_tb = call.excinfo._excinfo +exc_type = call.excinfo.type +exc_value = call.excinfo.value +exc_tb = call.excinfo.tb `[Suggestion processed]` Suggestion importance[1-10]: 6 __ Why: Relevant best practice - Add explicit validation/guards at integration boundaries and avoid relying on private/internal APIs.	Low
Learned best practice	✅ ~~Centralize duplicated configuration values~~ Suggestion Impact: Added a TRACEBACK_WIDTH constant set to 130 and replaced the duplicated width=130 literals in both rich.console.Console and rich.traceback.Traceback.from_exception with this constant. code diff: +TRACEBACK_WIDTH = 130 # don't force colors on RBE since errors get redirected to a log file force_terminal = "REMOTE_BUILD" not in os.environ -console = rich.console.Console(force_terminal=force_terminal, width=130) +console = rich.console.Console(force_terminal=force_terminal, width=TRACEBACK_WIDTH) def extract_traceback_frames(tb): @@ -87,7 +88,9 @@ """Hook to print Rich traceback for test failures.""" if call.excinfo is None: return - exc_type, exc_value, exc_tb = call.excinfo._excinfo + exc_type = call.excinfo.type + exc_value = call.excinfo.value + exc_tb = call.excinfo.tb frames = extract_traceback_frames(exc_tb) filtered_frames = filter_frames(frames) new_tb = rebuild_traceback(filtered_frames) @@ -97,7 +100,7 @@ new_tb, show_locals=False, max_frames=5, - width=130, + width=TRACEBACK_WIDTH, ) Replace the duplicated `width=130` literals with a single constant to keep console/traceback formatting consistent and easier to change. py/conftest.py [52-101] -console = rich.console.Console(force_terminal=force_terminal, width=130) +TRACEBACK_WIDTH = 130 +console = rich.console.Console(force_terminal=force_terminal, width=TRACEBACK_WIDTH) ... tb = rich.traceback.Traceback.from_exception( exc_type, exc_value, new_tb, show_locals=False, max_frames=5, - width=130, + width=TRACEBACK_WIDTH, ) `[To ensure code accuracy, apply this suggestion manually]` Suggestion importance[1-10]: 5 __ Why: Relevant best practice - Reduce duplication by centralizing shared configuration values.	Low
Update

navin772

LGTM!

cgoldberg added 5 commits January 5, 2026 17:40

[py] Reduce pytest tracebacks

68f4bd9

[py] Fix tracebacks

8bdcd12

[py] no colors on RBE

42288c4

[py] Add 2 bad tests to show tracebacks

172cbde

[py] Don't try to figure out terminal width

effb128

cgoldberg self-assigned this Jan 6, 2026

cgoldberg added C-py Python Bindings B-build Includes scripting, bazel and CI integrations labels Jan 6, 2026

cgoldberg added 2 commits January 5, 2026 21:41

[py] Remove line

f22c3bd

[py] Remove intentionally failing tests

c184da0

cgoldberg marked this pull request as ready for review January 6, 2026 03:55

cgoldberg requested review from navin772 and titusfortner January 6, 2026 03:56

qodo-code-review bot added the Review effort 2/5 label Jan 6, 2026

titusfortner and others added 2 commits January 6, 2026 07:19

Merge branch 'trunk' into py-reduce-test-tracebacks

4184086

[py] Refinements

07950dc

navin772 approved these changes Jan 6, 2026

View reviewed changes

cgoldberg added 2 commits January 7, 2026 08:28

Merge branch 'trunk' into py-reduce-test-tracebacks

16bf37b

[py] Reverse traceback frames

39fce33

cgoldberg merged commit a076bad into SeleniumHQ:trunk Jan 7, 2026
24 checks passed

cgoldberg deleted the py-reduce-test-tracebacks branch January 7, 2026 20:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[py] Reduce traceback output on test failures #16854

[py] Reduce traceback output on test failures #16854

cgoldberg commented Jan 6, 2026 •

edited by qodo-code-review bot

Loading

Uh oh!

qodo-code-review bot commented Jan 6, 2026 •

edited

Loading

Uh oh!

qodo-code-review bot commented Jan 6, 2026 •

edited

Loading

Examples:

Solution Walkthrough:

Before:

After:

Uh oh!

navin772 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

[py] Reduce traceback output on test failures #16854

[py] Reduce traceback output on test failures #16854

Conversation

cgoldberg commented Jan 6, 2026 • edited by qodo-code-review bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

🔗 Related Issues

💥 What does this PR do?

💡 Additional Considerations

🔄 Types of changes

PR Type

Description

Diagram Walkthrough

File Walkthrough

Uh oh!

qodo-code-review bot commented Jan 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Compliance Guide 🔍

Uh oh!

qodo-code-review bot commented Jan 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Examples:

Solution Walkthrough:

Before:

After:

Uh oh!

navin772 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

cgoldberg commented Jan 6, 2026 •

edited by qodo-code-review bot

Loading

qodo-code-review bot commented Jan 6, 2026 •

edited

Loading

qodo-code-review bot commented Jan 6, 2026 •

edited

Loading