test(research): record full-lifecycle Deep Research cassette reaching completed by teng-lin · Pull Request #1566 · teng-lin/notebooklm-py

teng-lin · 2026-06-12T13:09:20Z

Summary

The committed tests/cassettes/research_deep_poll_long.yaml was deliberately scoped to START_DEEP_RESEARCH + 6 in-progress POLL_RESEARCH calls and never reached a terminal state, so research wait-to-completion against a real deep outcome was uncovered. Two earlier full-lifecycle recording attempts (2026-05-15) died with httpx.PoolTimeout ~15-20 min into the idle poll window.

This PR re-records the full Deep Research lifecycle to completed and wires it into the replay test.

What changed

Poll-until-terminal restructure

_poll_until_terminal replaces the old fixed-count _poll_n_times: it polls research.poll every 30s until a terminal ResearchStatus (COMPLETED/FAILED), capped at 80 polls (~40 min) so a hung run can't spin forever. The replay test now asserts the final status is ResearchStatus.COMPLETED (enum membership, not a string), and a separate cassette-inspection test decodes the last POLL_RESEARCH body through the project's own decode_response + parse_research_task_models to confirm the terminal state independently of the live client.
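The poll-until-terminal shape described above can be sketched roughly as follows. This is not the PR's `_poll_until_terminal`: `Status`, `poll_until_terminal`, and `make_fake_poll` are hypothetical stand-ins, and raising `TimeoutError` when the cap is hit is an assumption about how a hard cap might be enforced.

```python
import asyncio
import enum
from typing import Awaitable, Callable, List, Tuple


class Status(enum.Enum):
    """Stand-in for the project's ResearchStatus (hypothetical subset)."""

    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


TERMINAL = frozenset({Status.COMPLETED, Status.FAILED})
MAX_POLLS = 80          # 80 polls x 30s is roughly 40 minutes
POLL_INTERVAL_S = 30.0


async def poll_until_terminal(
    poll: Callable[[], Awaitable[Status]],
    *,
    max_polls: int = MAX_POLLS,
    interval: float = POLL_INTERVAL_S,
) -> Tuple[Status, int]:
    """Poll until a terminal status; raise if the cap is reached first."""
    for attempt in range(1, max_polls + 1):
        status = await poll()
        if status in TERMINAL:
            return status, attempt
        await asyncio.sleep(interval)
    # Assumption: a capped-out run is reported as an error, not a status.
    raise TimeoutError(f"no terminal status after {max_polls} polls")


def make_fake_poll(script: List[Status]) -> Callable[[], Awaitable[Status]]:
    """Build a poller that replays a scripted status sequence."""
    remaining = iter(script)

    async def fake_poll() -> Status:
        return next(remaining)

    return fake_poll
```

Calling `asyncio.run(poll_until_terminal(make_fake_poll([...]), interval=0))` with a scripted sequence exercises the loop without waiting, which is similar in spirit to how the replay path avoids real sleeps.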

PoolTimeout-resilient recording client (record mode only)

The previous PoolTimeout was stale-connection reuse: the default 30s keepalive let the pool hand back a half-dead connection on a later poll across the multi-minute idle gap. In record mode the client is now built with a generous 60s pool-acquire/read timeout and a short 10s keepalive_expiry, so an idle connection is closed proactively and a fresh one is opened on the next poll. Both knobs go through the public constructor (timeout= → httpx's pool acquire timeout; limits=ConnectionLimits(keepalive_expiry=...)), so no private seam is needed and replay is unaffected. A @pytest.mark.timeout(3600) overrides the global 60s pytest-timeout for the long live recording.
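A minimal sketch of the record-only configuration, assuming a hypothetical `EXAMPLE_VCR_RECORD` environment flag and a stand-in `ConnectionLimits` dataclass (the project's real type and record-mode detection are not shown in this PR text). The point is only that replay receives an empty kwargs dict, so it keeps the default client config.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class ConnectionLimits:
    """Stand-in for the project's ConnectionLimits (one field modeled)."""

    keepalive_expiry: float


RECORD_TIMEOUT_S = 60.0     # generous pool-acquire/read timeout while recording
RECORD_KEEPALIVE_S = 10.0   # close idle connections well before the next 30s poll


def is_record_mode(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when the hypothetical EXAMPLE_VCR_RECORD flag is set."""
    source = os.environ if env is None else env
    return source.get("EXAMPLE_VCR_RECORD", "").lower() in {"1", "true", "yes"}


def recording_client_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Return hardening kwargs in record mode, and nothing in replay."""
    if not is_record_mode(env):
        return {}
    return {
        "timeout": RECORD_TIMEOUT_S,
        "limits": ConnectionLimits(keepalive_expiry=RECORD_KEEPALIVE_S),
    }
```

With plain httpx, the analogous knobs would be `httpx.Timeout(..., pool=...)` and `httpx.Limits(keepalive_expiry=...)`. How the project's constructor maps onto them is as described in the paragraph above, not verified here.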

Task-id pinning fix (load-bearing)

Verified live: Deep Research's start() task_id does not equal the task_id the poll endpoint reports for the same task. Threading start's id made every poll return NOT_FOUND. The loop now mirrors wait_for_completion — first poll unfiltered, then pins the POLL-reported id forward.
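The pinning pattern can be illustrated with a small synchronous sketch. `PollResult`, `FakeServer`, and `poll_with_pinning` are hypothetical names, the status strings are placeholders for the real enum, and the fake server only imitates the observed behavior (a filter that does not match the poll-side id yields NOT_FOUND).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class PollResult:
    status: str
    task_id: Optional[str]


class FakeServer:
    """Imitates a poll endpoint whose task id differs from start()'s id."""

    def __init__(self, poll_task_id: str, script: Iterable[str]) -> None:
        self.poll_task_id = poll_task_id
        self._script = iter(script)
        self.filters_seen: List[Optional[str]] = []

    def poll(self, task_id: Optional[str]) -> PollResult:
        self.filters_seen.append(task_id)
        if task_id is not None and task_id != self.poll_task_id:
            return PollResult("not_found", None)
        return PollResult(next(self._script), self.poll_task_id)


def poll_with_pinning(
    poll: Callable[[Optional[str]], PollResult], max_polls: int = 80
) -> List[PollResult]:
    """First poll unfiltered, then reuse whatever id the poll reported."""
    pinned: Optional[str] = None
    results: List[PollResult] = []
    for _ in range(max_polls):
        result = poll(pinned)
        results.append(result)
        if pinned is None and result.task_id:
            pinned = result.task_id  # pin the POLL-reported id, not start()'s
        if result.status in {"completed", "failed"}:
            break
    return results
```

Passing the start-side id straight into `FakeServer.poll` returns `not_found` on every call, which is the failure mode described above. The pinned loop never sends that id.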

Trimming + scrubbing

The full recording's accumulating in-progress report bodies grew to ~400 KB (the cassette scrubber is super-linear on large markdown payloads, which blew up replay time). Trimmed to the in_progress → completed skeleton (3 polls) plus the create / add_text / start / delete lifecycle. The cassette was scrubbed on record and verified clean — cassette guards pass and a manual grep found no live cookies, OAuth/Bearer tokens, API keys, account email, or SAPISIDHASH. The scratch notebook self-cleans via try/finally.
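A manual credential grep of this kind could be approximated with a short script. The patterns below are illustrative assumptions about what such tokens commonly look like, not the project's scrubber rules or cassette guards, and a clean result from them does not prove a cassette is safe.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; real tokens may take other shapes.
SUSPECT_PATTERNS = {
    "bearer_token": re.compile(r"Bearer\s+[A-Za-z0-9._~+/-]{20,}"),
    "google_oauth_token": re.compile(r"ya29\.[A-Za-z0-9_-]{20,}"),
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_-]{35}"),
    "sapisidhash": re.compile(r"SAPISIDHASH\s+\d+_[0-9a-f]{40}"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def find_suspect_credentials(text: str) -> List[Tuple[str, int]]:
    """Return (pattern name, 1-based line number) for every suspicious line."""
    hits: List[Tuple[str, int]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                hits.append((name, lineno))
    return hits
```

Running `find_suspect_credentials(Path(...).read_text())` over a cassette gives a list to review by hand; an empty list only means none of these particular patterns matched.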

Results

Live recording reached completed (no_research/in_progress → completed; PoolTimeout never recurred).
Final cassette: 0.815 MB (well under the 5 MB cap), 11 interactions.
Replay: ~12s, 3 tests pass.
tests/_guardrails/ (1022 tests), mypy src/notebooklm, and pre-commit run --all-files all clean.

🤖 Generated with Claude Code

Summary by CodeRabbit

Tests
- Rewrote Deep Research polling integration tests to record and replay the full START → polls → completed lifecycle.
- Tests now assert the run reaches a terminal COMPLETED status and that the poll-reported task identifier stabilizes once present.
- Updated cassette verification and guidance to support trimming redundant intermediate polls for replay size limits.

… completed

The prior `research_deep_poll_long.yaml` was scoped to START_DEEP_RESEARCH + 6 in-progress POLL_RESEARCH calls and never reached a terminal state, leaving `research wait`-to-completion uncovered. Two earlier full-lifecycle recording attempts died with `httpx.PoolTimeout` ~15-20 min into the idle poll window.

This re-records the full lifecycle to `completed`:

- Poll-until-terminal loop: polls `research.poll` every 30s until a terminal `ResearchStatus` (COMPLETED/FAILED), capped at 80 polls. Replay asserts the final status is `ResearchStatus.COMPLETED` via enum membership, and a cassette-inspection test decodes the last POLL body through the real decoder + parser to confirm the terminal state independently of the client.
- PoolTimeout-resilient recording client (record mode only): a generous 60s pool-acquire/read timeout plus a short 10s keepalive_expiry so an idle connection across the multi-minute poll gap is closed proactively and a fresh one is opened on the next poll. Both knobs go through the public constructor (`timeout=` -> httpx pool timeout; `limits=ConnectionLimits(keepalive_expiry=...)`), so no private seam is needed and replay is unaffected. A `@pytest.mark.timeout(3600)` overrides the global 60s pytest-timeout for the long live recording.
- Task-id pinning fix: Deep Research's `start()` task_id does NOT equal the poll-reported task_id (verified live), so the loop mirrors `wait_for_completion`: first poll unfiltered, then pins the POLL-reported id. Threading start's id made every poll return NOT_FOUND.
- Trimming: the full recording's accumulating in_progress report bodies blew past a usable replay time (the cassette scrubber is super-linear on the large markdown payloads). Trimmed to the in_progress -> completed skeleton (3 polls) plus the create/add_text/start/delete lifecycle. Final cassette is 0.815 MB (well under the 5 MB cap); replay runs in ~12s.

Cassette scrubbed on record and verified clean: cassette guards pass and a manual grep found no live cookies, OAuth/Bearer tokens, API keys, account email, or SAPISIDHASH. The scratch notebook is self-cleaned via try/finally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

teng-lin · 2026-06-12T13:09:26Z

@claude review

coderabbitai · 2026-06-12T13:09:34Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19825892-35fd-40f9-9ef2-c042b473008a

📥 Commits

Reviewing files that changed from the base of the PR and between 1599f1c and 7627b68.

📒 Files selected for processing (2)

tests/cassettes/research_deep_poll_long.yaml
tests/integration/test_research_deep_poll_vcr.py

🚧 Files skipped from review as they are similar to previous changes (1)

tests/integration/test_research_deep_poll_vcr.py

📝 Walkthrough

Walkthrough

This PR refactors a Deep Research integration test to capture the full poll-until-terminal lifecycle instead of fixed-iteration polling, adds record-mode client hardening, implements a _poll_until_terminal helper with task-id pinning and max caps, strengthens cassette decoding/validation, and updates related docs and fixtures.

Changes

Deep Research Polling Lifecycle Test

Layer / File(s)	Summary
Module documentation and test foundation `tests/integration/test_research_deep_poll_vcr.py`	Module docstring and supporting documentation updated to describe the full `START` → terminal `POLL_RESEARCH` lifecycle. Sleep-mocking and recording/replay/trimming guidance clarified. Imports and types added for record-mode client configuration and ResearchStatus constants.
Polling configuration and record-mode helpers `tests/integration/test_research_deep_poll_vcr.py`	Terminal status constants, a maximum poll cap, and recording-only timeouts introduced. New `_is_record_mode()` and `_recording_client_kwargs()` helpers provide record-mode detection and client hardening (timeouts, connection limits, keepalive) during recording.
Poll-until-terminal helper and fixture updates `tests/integration/test_research_deep_poll_vcr.py`	New `_poll_until_terminal()` replaces the removed `_poll_n_times()`, polling until a terminal ResearchStatus is reached with first-appearance task ID pinning and hard cap enforcement. `fast_sleep` fixture now consults `_is_record_mode()` for replay vs record behavior.
Main test lifecycle and assertions `tests/integration/test_research_deep_poll_vcr.py`	Test now executes the full lifecycle: start Deep Research, poll until terminal using the new helper, verify task ID becomes stable once present, and assert the final decoded result is `ResearchStatus.COMPLETED`. Old fixed-poll-count logic removed.
Cassette inspection and strengthened validation `tests/integration/test_research_deep_poll_vcr.py`	Cassette parsing improved to track POLL_RESEARCH interactions and their bodies. Final POLL_RESEARCH response now decoded through the real `decode_response` + `parse_research_task_models` pipeline and asserted to be `ResearchStatus.COMPLETED`. Size-cap documentation updated to reference trimming redundant middle `in_progress` polls.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

teng-lin/notebooklm-py#645: Earlier refactor of the same test module that introduced _poll_n_times, which this PR replaces with _poll_until_terminal and enhanced cassette decoding validation.

Poem

🐰 I hopped from START through polls galore,
Pinned task IDs steady, till COMPLETED at the core,
Cassettes replay quickly, sleeps turned light,
The lifecycle now recorded and stitched up right. 🎬✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: recording a full-lifecycle Deep Research cassette reaching completed state.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

claude · 2026-06-12T13:09:42Z

Claude finished @teng-lin's task in 3m 4s

Code Review — PR #1566: Full-lifecycle Deep Research cassette

Load tools
Read changed test file
Read related source files (_research.py, _research_task_parser.py, _types/research.py, vcr_config.py)
Analyze cassette structure
Post review findings

This is a well-designed PR that solves a genuine infrastructure problem (PoolTimeout during recording, task-id mismatch between start and poll). The approach — record-only resilience config, independent cassette-inspection test that decodes through the real pipeline, task-id pinning mirroring wait_for_completion — is all sound. The documentation is thorough. A few observations below, roughly in priority order.

`_poll_until_terminal`: `NOT_FOUND` and `NO_RESEARCH` are not terminal

# tests/integration/test_research_deep_poll_vcr.py:138
_TERMINAL_STATUSES: frozenset[ResearchStatus] = frozenset(
    {ResearchStatus.COMPLETED, ResearchStatus.FAILED}
)

After pinned_task_id is set, if the poll endpoint returns NOT_FOUND (task evicted / transient absence) or NO_RESEARCH, the loop silently continues until _MAX_POLLS rather than breaking or raising. This intentionally mirrors wait_for_completion (treating absence as replication lag), but the docstring only mentions the terminal-status break condition. A reader seeing NOT_FOUND come back mid-loop during a debug session won't immediately understand why the loop kept going.

Suggestion: Add one sentence to the _poll_until_terminal docstring: "NOT_FOUND and NO_RESEARCH are not treated as terminal — they continue the loop (mirrors wait_for_completion's replication-lag policy)." This is a docs-only nit; the logic is correct by design.

Redundant `@pytest.mark.vcr` on the test method

# line 124
pytestmark = [pytest.mark.vcr, skip_no_cassettes]

# line 356-358 — @pytest.mark.vcr is redundant here
@pytest.mark.timeout(_RECORD_TEST_TIMEOUT_SECONDS)
@pytest.mark.vcr
@pytest.mark.asyncio
@notebooklm_vcr.use_cassette(CASSETTE_NAME)
async def test_deep_research_polling_loop(self, fast_sleep: None) -> None:

pytest.mark.vcr is already on the module via pytestmark, so the per-method decorator is redundant. It does no harm (markers are additive), but it could mislead a reader into thinking the module-level mark is absent.

`fast_sleep` acquires `monkeypatch` unnecessarily in record mode

# line 271
def fast_sleep(monkeypatch: pytest.MonkeyPatch) -> None:
    if _is_record_mode():
        return  # monkeypatch acquired but never used
    ...
    monkeypatch.setattr(asyncio, "sleep", instant_sleep)

In record mode monkeypatch is resolved by pytest (a non-trivial fixture that installs its own finalizer) but then never used. Functionally harmless, but a minor waste.

One alternative: move the _is_record_mode() guard to the test itself by checking it in the test body before calling _poll_until_terminal, and making fast_sleep unconditionally patch. But since the existing code works correctly and the setup cost is trivial, this is a very low priority nit.

`test_cassette_reaches_completed`: cassette interaction ordering assumption

# lines 498-510
for interaction in cassette.get("interactions", []):
    ...
    if rpc_id == RPCMethod.POLL_RESEARCH.value:
        ...
        last_poll_body = string  # silently overwritten each iteration

This relies on the cassette's interactions list being in recording order, which is always true for VCR.py cassettes. But if a future hand-trimming accidentally reorders interactions (keeping completed before an in_progress entry by mistake), last_poll_body would silently land on the wrong response without any diagnostic.

Adding rpcids_sequence.index(RPCMethod.POLL_RESEARCH.value) offset tracking would catch this, but it's likely over-engineering for this use case. The minimum-count assertion (poll_count >= MIN_POLL_INTERACTIONS) combined with the terminal-status check already provides good coverage. No action required, just noting the dependency.
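For illustration only, an ordering guard of the kind alluded to above might look like the sketch below. It assumes the poll statuses have already been decoded into strings in cassette order; `check_poll_order` is a hypothetical helper, not something in the PR.

```python
from typing import List

TERMINAL_STATUSES = {"completed", "failed"}


def check_poll_order(statuses: List[str]) -> None:
    """Raise if a terminal poll appears anywhere except the last position."""
    if not statuses:
        raise ValueError("cassette contains no POLL_RESEARCH interactions")
    for index, status in enumerate(statuses[:-1]):
        if status in TERMINAL_STATUSES:
            raise ValueError(
                f"terminal status {status!r} at poll {index} is followed by later polls"
            )
    if statuses[-1] not in TERMINAL_STATUSES:
        raise ValueError(f"last poll is {statuses[-1]!r}, not a terminal status")
```

A hand-trimmed cassette that accidentally kept `completed` before an `in_progress` entry would then fail loudly instead of letting the last body silently win.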

✅ Things that are explicitly correct

Task-id pinning logic (lines 333–339): The unfiltered first poll → pin the POLL-reported id pattern exactly mirrors wait_for_completion in _research.py:559–560. This is load-bearing for Deep Research and correct.
decode_response + parse_research_task_models in test_cassette_reaches_completed: Using the real decode pipeline to assert terminal status (rather than string-grepping the YAML) is the right approach — the wire format encodes status numerically.
_recording_client_kwargs() empty-in-replay guard: ConnectionLimits(keepalive_expiry=10.0) only applied in record mode; replay sees the default client config. Clean isolation.
try/finally cleanup: The DELETE_NOTEBOOK is in finally, ensuring scratch notebooks are deleted even on mid-recording failures.
Cassette scrubbing: Verified SCRUBBED placeholders on all cookies; no raw auth tokens visible. Notebook UUID in URLs is not a secret.
MIN_POLL_INTERACTIONS = 3 floor: Low enough for hand-trimmed skeletons (no_research → in_progress → completed) but still catches a broken recorder that captured only 1–2 polls.
test_cassette_under_size_cap with @pytest.mark.allow_no_vcr: Correctly skips when the cassette is absent rather than failing.

Summary

Two actionable items (both minor):

Docs nit: Add a note to _poll_until_terminal's docstring that NOT_FOUND/NO_RESEARCH don't break the loop by design.
Cosmetic cleanup: Remove the redundant @pytest.mark.vcr from test_deep_research_polling_loop.

Neither blocks merge — the core logic, scrubbing, and test coverage are all sound. Approving in spirit.

gemini-code-assist

Code Review

This pull request updates the integration test test_research_deep_poll_vcr.py to record and replay the full-lifecycle Deep Research polling loop until it reaches a terminal state, replacing the previous scoped-down approach. It addresses previous httpx.PoolTimeout issues during recording by applying resilient client configurations (generous timeouts and short keepalive expiry) only in record mode. Additionally, it updates the polling logic to pin the task ID and assert that the final status is COMPLETED, while providing guidelines for trimming cassettes to stay under the 5 MB size limit. The code reviewer suggested adding a defensive assertion in _poll_until_terminal to verify that the poll response is not None before accessing its attributes, ensuring type safety and fail-fast behavior.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/test_research_deep_poll_vcr.py`:
- Around line 413-452: The test must assert the cassette exercised the
pinned-task-id path by verifying a non-empty poll.task_id appears in a non-final
poll and is reused later; update the loop that tracks seen_poll_task_id (and/or
introduce a counter like pinned_task_id_count) to record when a non-final poll
exposes a task_id and assert afterwards that there are at least two polls with
the same non-empty task_id (i.e. the seen_poll_task_id was observed on a prior
non-terminal poll and again on a later poll), while keeping the existing
terminal-status checks (final, _TERMINAL_STATUSES, ResearchStatus.COMPLETED)
intact.
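The requested coverage check boils down to counting repeated non-empty ids across the recorded polls. A minimal sketch, assuming the poll task ids have already been extracted in order (`pinned_path_exercised` is a hypothetical name):

```python
from collections import Counter
from typing import List, Optional


def pinned_path_exercised(task_ids: List[Optional[str]]) -> bool:
    """True when at least two polls surfaced the same non-empty task_id."""
    counts = Counter(task_id for task_id in task_ids if task_id)
    return any(count >= 2 for count in counts.values())
```

A cassette whose only task_id appears on the terminal poll returns `False` here, which is the case the comment above wants the test to reject.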

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 94e99ab5-2021-4df2-b17c-cd63799c47e1

📥 Commits

Reviewing files that changed from the base of the PR and between 928eab3 and 1599f1c.

📒 Files selected for processing (2)

tests/cassettes/research_deep_poll_long.yaml
tests/integration/test_research_deep_poll_vcr.py

The first trim round-tripped the cassette through yaml.safe_dump, which re-wrapped the long URL-encoded request/response scalars differently from VCR's native serializer (max line 224167 vs 222932). On Windows CI the re-wrapped scalars parsed to slightly different bytes, so the CREATE_NOTEBOOK (CCqFvf) request no longer matched its recorded entry and the replay raised CannotOverwriteExistingCassetteException (Linux/macOS were unaffected).

Re-trim by slicing the VCR-native recording at interaction boundaries (no yaml re-emission), so every kept interaction is a byte-exact slice of the original recording.

Verified: kept create/completed-poll/delete interactions are identical to the source recording, cassette guards pass, replay is green (~12s, 0.855 MB), and the cassette has no live credentials.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
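Slicing at interaction boundaries without re-emitting YAML could look roughly like this. The sketch assumes a cassette layout where each interaction begins with a top-level `- request:` line and a top-level key such as `version:` follows the last interaction. `split_cassette` and `trim_cassette` are hypothetical helpers, not the script used for this PR.

```python
from typing import List, Set, Tuple


def split_cassette(raw: str) -> Tuple[str, List[str], str]:
    """Split raw cassette text into (header, interaction chunks, footer)."""
    starts: List[int] = []
    footer_at = None
    offset = 0
    for line in raw.split("\n"):
        if line.startswith("- request:"):
            starts.append(offset)
            footer_at = None
        elif (
            starts
            and footer_at is None
            and line.strip()
            and not line.startswith((" ", "-"))
        ):
            footer_at = offset  # first top-level key after the interactions
        offset += len(line) + 1
    if not starts:
        raise ValueError("no '- request:' boundaries found")
    end = len(raw) if footer_at is None else footer_at
    bounds = starts + [end]
    chunks = [raw[bounds[i]:bounds[i + 1]] for i in range(len(starts))]
    return raw[: starts[0]], chunks, raw[end:]


def trim_cassette(raw: str, keep: Set[int]) -> str:
    """Keep only the chosen interactions, each as an exact slice of `raw`."""
    header, chunks, footer = split_cassette(raw)
    kept = "".join(chunk for index, chunk in enumerate(chunks) if index in keep)
    return header + kept + footer
```

Because every kept chunk is a substring of the original text, line wrapping inside long scalars is untouched. To stay byte-exact on disk, the file would need to be read and written without newline translation (for example `open(path, newline="")`).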

- Require the cassette to exercise the task-id pinning path: assert at least two polls surface the same non-empty task_id, so a later filtered poll reuses the captured id (CodeRabbit: a cassette where task_id only appears on the terminal poll would skip the pinned-filter path). The recorded cassette has 3 in_progress/completed polls all carrying the same task_id.
- Add a defensive `assert poll is not None` in `_poll_until_terminal` to keep the fail-fast invariant explicit before attribute access (gemini).
- Document in `_poll_until_terminal` that NOT_FOUND / NO_RESEARCH are not terminal and continue the loop by design (claude review).
- Drop the redundant `@pytest.mark.vcr` from the test method; it is already applied module-wide via `pytestmark` (claude review).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

teng-lin · 2026-06-12T13:33:02Z

Thanks for the review @claude. Addressed both actionable items in 7627b68:

Added a docstring note to _poll_until_terminal that NOT_FOUND / NO_RESEARCH are not terminal and continue the loop by design (mirrors wait_for_completion's replication-lag policy).
Removed the redundant per-method @pytest.mark.vcr (already applied module-wide via pytestmark).

The fast_sleep record-mode monkeypatch nit and the cassette-ordering note were both flagged as no-action-required, so I left them as-is.

claude · 2026-06-12T13:33:17Z

Claude finished @teng-lin's task in 46s

Follow-up verification — `7627b68`

Read current test file
Confirm docstring note on _poll_until_terminal
Confirm @pytest.mark.vcr removal

Both items from the earlier review are correctly in place in 7627b68.

1. NOT_FOUND/NO_RESEARCH docstring note (test_research_deep_poll_vcr.py:327–331)

    Only ``COMPLETED`` and ``FAILED`` break the loop. ``NOT_FOUND`` /
    ``NO_RESEARCH`` are NOT treated as terminal — they continue polling
    (mirrors ``wait_for_completion``'s replication-lag policy: a pinned task
    temporarily absent from a poll is a transient condition, not an end
    state), bounded by :data:`_MAX_POLLS`.

Exactly what was requested. The wording explains the why (replication-lag policy, transient condition) rather than just stating the behaviour, which is what makes it useful in a debug session.

2. Redundant @pytest.mark.vcr removed (test_research_deep_poll_vcr.py:367–369)

    # ``pytest.mark.vcr`` is applied module-wide via ``pytestmark`` — no need
    # to repeat it here.
    @pytest.mark.timeout(_RECORD_TEST_TIMEOUT_SECONDS)
    @pytest.mark.asyncio
    @notebooklm_vcr.use_cassette(CASSETTE_NAME)
    async def test_deep_research_polling_loop(self, fast_sleep: None) -> None:

Gone, and the inline comment makes the intention explicit for future readers.

Bonus: gemini-code-assist and CodeRabbit items also landed cleanly

assert poll is not None (line 345) — the fail-fast narrowing gemini requested is in.
polls_with_task_id >= 2 assertion (lines 459–463) — the CodeRabbit pinned-task-id coverage assertion is in, with a clear message that shows the observed count on failure.

Everything from the review is addressed. The PR looks good to merge.

@gemini-code-assist

…cipe (#1567)

* docs: document the long-running (deep-research) cassette re-record recipe

Capture the non-obvious snags from the v0.8 full-lifecycle Deep Research re-record (PR #1566) under docs/development.md "Long-running recordings": record-mode httpx tuning to dodge httpx.PoolTimeout (timeout=60 + ConnectionLimits(keepalive_expiry=10); async_client_factory is not public), @pytest.mark.timeout(3600), the deep start() task_id != poll-reported id NOT_FOUND trap, byte-exact YAML trimming (yaml.safe_dump breaks Windows parsing), and credential-grep beyond the name-anchored scrubber.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Apply suggestion from @gemini-code-assist[bot]

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

gemini-code-assist Bot reviewed Jun 12, 2026

Comment thread tests/integration/test_research_deep_poll_vcr.py

coderabbitai Bot reviewed Jun 12, 2026

Comment thread tests/integration/test_research_deep_poll_vcr.py

claude added 2 commits June 12, 2026 09:30

teng-lin merged commit 8bcbef9 into main Jun 12, 2026
32 checks passed

teng-lin deleted the feat/deep-research-completed-cassette branch June 12, 2026 13:49

teng-lin mentioned this pull request Jun 12, 2026

docs: document the long-running (deep-research) cassette re-record recipe #1567

Merged

teng-lin merged 3 commits into main from feat/deep-research-completed-cassette
