Enrich retry log messages with task/sample/model context #3240
Open
sjawhar wants to merge 4 commits into UKGovernmentBEIS:main
Force-pushed: 8e3cb0b to dac225e
sjawhar added a commit to METR/inspect_ai that referenced this pull request on Feb 16, 2026

Merged branches:
- retry-log (PR UKGovernmentBEIS#3240): Enrich retry log messages with task/sample/model context
- fix/find-band-search (PR UKGovernmentBEIS#3237): Improve Ctrl+F search: wrap-around, match count, virtualization support
- feature/viewer-flat-view: Add flat view toggle to transcript viewer
4 tasks
revmischa added a commit to METR/inspect-action that referenced this pull request on Feb 17, 2026

## Summary
- Updates inspect-ai git pin from cherry-picked release (`4bfe32e7`) to proper octopus merge release (`f2e836ec`) based on PyPI `0.3.179`
- The previous release was built by cherry-picking commits, missing several open PRs. This release is an octopus merge of all METR PR branches.

## Included METR PRs (on top of 0.3.179)
| PR | Branch | Title |
|----|--------|-------|
| [#3240](UKGovernmentBEIS/inspect_ai#3240) | `retry-log` | Enrich retry log messages with task/sample/model context |
| [#3237](UKGovernmentBEIS/inspect_ai#3237) | `fix/find-band-search` | Improve Ctrl+F search: wrap-around, match count, virtualization support |
| - | `feature/viewer-flat-view` | Add flat view toggle to transcript viewer |

## Testing & Validation
- [ ] CI passes
- [ ] Smoke tests pass against dev environment

## Code Quality
- [x] Lock files updated (root + all lambda modules)
- [x] No code changes beyond `pyproject.toml` and lock files

---------

Co-authored-by: Mischa Spiegelmock <[email protected]>
## Summary

When inspect retries failed model requests, log messages like `Retrying request to /responses in 0.396765 seconds` lack context about which task, sample, and provider triggered the retry. In a concurrent runner processing many samples across many tasks, this makes debugging difficult.

This PR enriches all retry log messages with a compact context prefix and an error summary.

Format: `[{sample_uuid} {task}/{sample_id}/{epoch} {model}]`

## What changed

New helpers in `src/inspect_ai/_util/retry.py`:

- `sample_context_prefix()`: builds the compact prefix from the `sample_active()` ContextVar
- `retry_error_summary()`: extracts exception type/status/code without leaking message content
- `SampleContextFilter`: a `logging.Filter` for SDK loggers; adds both the inline prefix and structured fields
- `install_sample_context_logging()`: attaches the filter at eval startup

Enriched existing loggers:

- `log_model_retry()` in `model/_model.py`: prefix + error summary
- `log_httpx_retry_attempt()` in `_util/httpx.py`: prefix

Wired into startup:

- `init_eval_context()` in `_eval/context.py` calls `install_sample_context_logging()`

## Evidence of working

### E2E: Real `inspect eval` against mock 429 server

Ran `inspect eval examples/hello_world.py` against a local mock server returning 429s.

The prefix `[EeDA74nwfs3uirgyprfa4b hello_world/1/1 openai/gpt-4o-mini]` appears on the SDK's own retry message, confirming that the `SampleContextFilter` on `openai._base_client` works.

### Structured JSON logging

When using a JSON log formatter (e.g. `python-json-logger`), the structured fields appear as top-level keys:

    {
      "message": "[Abc12xY mmlu/42/1 openai/gpt-4o] Retrying request to /responses in 0.396765 seconds",
      "name": "openai._base_client",
      "levelname": "INFO",
      "sample_uuid": "Abc12xY",
      "sample_task": "mmlu",
      "sample_id": 42,
      "sample_epoch": 1,
      "sample_model": "openai/gpt-4o"
    }

### Unit tests: 24 passing

## Review-driven fixes

Code review caught three issues, all fixed with regression tests added before the fix:

- The filter was attached to `openai`, but the SDK logs from `openai._base_client`; parent logger filters don't run for child records during propagation. Fix: attach to `openai._base_client` directly.
- `TypeError` on non-string `.code`: `getattr(ex, "code")` can return an int, crashing `' '.join()`. Fix: `str(raw_code)` cast.
- `%` in prefix breaks formatting: mutating `record.msg` with `%` chars would corrupt `msg % args`. Fix: call `record.getMessage()` first, then set the resolved msg and clear args.

Linear: ENG-594
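
For reviewers who want to see the shape of these fixes without opening the diff, here is a minimal, self-contained sketch. It is not the PR's code: `ActiveSample`, `_active`, `context_prefix`, `error_summary`, `PrefixFilter`, `ListHandler`, and the logger name `demo_sdk._base_client` are hypothetical stand-ins, and the exact summary format is an assumption. It only illustrates resolving `%`-args before prefixing, casting `.code` to `str`, and attaching the filter to the logger that actually emits the record.

```python
import logging
from contextvars import ContextVar
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class ActiveSample:
    """Hypothetical stand-in for the object behind sample_active()."""
    uuid: str
    task: str
    sample_id: Union[int, str]
    epoch: int
    model: str


# Hypothetical ContextVar holding the sample currently being processed.
_active: ContextVar[Optional[ActiveSample]] = ContextVar("_active", default=None)


def context_prefix() -> str:
    """Return '[{sample_uuid} {task}/{sample_id}/{epoch} {model}]', or '' if no sample is active."""
    s = _active.get()
    if s is None:
        return ""
    return f"[{s.uuid} {s.task}/{s.sample_id}/{s.epoch} {s.model}]"


def error_summary(ex: BaseException) -> str:
    """Summarise type/status/code only; the exception message is never included."""
    parts = [type(ex).__name__]
    for attr in ("status_code", "code"):
        raw = getattr(ex, attr, None)
        if raw is not None:
            parts.append(str(raw))  # cast: .code may be an int, and join() needs str
    return " ".join(parts)


class PrefixFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        s = _active.get()
        if s is not None:
            resolved = record.getMessage()  # apply msg % args before adding the prefix
            record.msg = f"{context_prefix()} {resolved}"
            record.args = None  # already formatted; a '%' in the prefix is now harmless
            # Structured fields for JSON formatters.
            record.sample_uuid = s.uuid
            record.sample_task = s.task
            record.sample_id = s.sample_id
            record.sample_epoch = s.epoch
            record.sample_model = s.model
        return True  # never drop the record


class ListHandler(logging.Handler):
    """Collects records in memory so the result can be inspected."""
    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


# Attach to the child logger that emits the record: logger-level filters
# on a parent are not applied to records created on a child logger.
sdk_logger = logging.getLogger("demo_sdk._base_client")
sdk_logger.setLevel(logging.INFO)
sdk_logger.propagate = False
sdk_logger.addFilter(PrefixFilter())
handler = ListHandler()
sdk_logger.addHandler(handler)

# A task name containing '%' exercises the formatting fix.
_active.set(ActiveSample("Abc12xY", "50%_task", 42, 1, "openai/gpt-4o"))
sdk_logger.info("Retrying request to %s in %f seconds", "/responses", 0.396765)
print(handler.records[0].getMessage())
```

Had the filter prepended the prefix to `record.msg` while leaving `record.args` in place, the `%` in the task name would be interpreted as a format specifier when the handler later formatted the message.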