test(research): VCR integration coverage for import_sources_with_verification (real recording)#1568
Conversation
|
@claude review |
📝 WalkthroughWalkthroughThis PR adds a new integration test module that uses VCR cassette replay to verify the ChangesImport Sources Verification VCR Testing
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Claude finished @teng-lin's task in 5m 4s —— View job Code Review: PR #1568
SummaryThe test design, documentation, and happy-path coverage are all high quality and follow the established patterns from Bug: Missing taxonomy allowlist entry (will fail CI)
The guardrail at Compare: Fix needed: add this line to And add this line to Minor observations
What's done well
Overall this is a clean and well-designed addition; the taxonomy allowlist fix is the only required change before merge. |
There was a problem hiding this comment.
Code Review
This pull request introduces a new integration test file, test_research_import_verification_vcr.py, which implements VCR replay tests for the import_sources_with_verification happy path. The feedback suggests reusing the existing _is_vcr_record_mode helper using dynamic path-based loading to maintain a single source of truth and prevent namespace clashes in the non-package tests/ directory.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
…fication (real recording)
The timeout-tolerant deep-research import path
(ResearchAPI.import_sources_with_verification) had unit + e2e coverage but
no integration/VCR test locking the actual wire. This adds a REAL fast-research
recording of its happy path: the baseline sources.list snapshot (GET_NOTEBOOK)
plus the IMPORT_RESEARCH import, returning the imported entries.
A fast research is sufficient because the IMPORT_RESEARCH wire is identical
fast-vs-deep; the deep-only report-markdown-source path stays unit-tested.
Replay drives import_sources_with_verification and asserts it returns a
non-empty list[dict] of {id, title} entries, and asserts the cassette's
request rpcids include BOTH GET_NOTEBOOK and IMPORT_RESEARCH by their
RPCMethod constants (rotation-proof). The timeout/verify/retry branch stays
unit-tested (tests/unit/test_research_import_with_verification.py) because VCR
cannot synthesize the client-side RPCTimeoutError that triggers it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
3c9430b to
88311e1
Compare
…d overlap nodeid - Reuse tests.vcr_config._is_vcr_record_mode for NOTEBOOKLM_VCR_RECORD parsing (single source of truth; gemini-code-assist). Direct import works via pytest pythonpath=["."], so no dynamic module loading is needed. - Record the test's vcr/allow_no_vcr overlap nodeid in integration_vcr_allow_no_vcr_nodeids.txt (claude[bot]). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Thanks for the review. Addressing the findings:
Also addressed gemini-code-assist's DRY point: |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/_fixtures/integration_allow_no_vcr_files.txt (1)
31-31: ⚡ Quick winUpdate the comment to accurately reflect the test's nature.
The comment on line 31 describes the entry as an "existing mock-only integration exception," but per the PR objectives, this is a new VCR-backed integration test that intentionally carries both
@pytest.mark.allow_no_vcrand VCR cassettes. The comment was apparently copied from the existing entries (lines 1–30) without customization, creating confusion about why a VCR-named file (_vcrsuffix) is allowlisted for "no VCR."Correct the comment to reflect the test's actual status and rationale.
📝 Proposed comment update
-tests/integration/test_research_import_verification_vcr.py # existing mock-only integration exception; migrate per test-suite taxonomy cleanup +tests/integration/test_research_import_verification_vcr.py # NEW: VCR-backed test with dual `@pytest.mark.allow_no_vcr` + cassette marking (see integration_vcr_allow_no_vcr_nodeids.txt)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/_fixtures/integration_allow_no_vcr_files.txt` at line 31, Update the comment that currently calls tests/integration/test_research_import_verification_vcr.py an "existing mock-only integration exception" to instead state that this is a new VCR-backed integration test which intentionally includes `@pytest.mark.allow_no_vcr` while still using VCR cassettes (i.e., an allowlist entry for a VCR-backed test); locate the entry referencing tests/integration/test_research_import_verification_vcr.py and replace the misleading description with a concise line explaining it is a new VCR-backed test and why it is allowlisted.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/_fixtures/integration_allow_no_vcr_files.txt`:
- Line 31: Update the comment that currently calls
tests/integration/test_research_import_verification_vcr.py an "existing
mock-only integration exception" to instead state that this is a new VCR-backed
integration test which intentionally includes `@pytest.mark.allow_no_vcr` while
still using VCR cassettes (i.e., an allowlist entry for a VCR-backed test);
locate the entry referencing
tests/integration/test_research_import_verification_vcr.py and replace the
misleading description with a concise line explaining it is a new VCR-backed
test and why it is allowlisted.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e425abfe-fb48-4376-9820-bde7817ef003
📒 Files selected for processing (4)
tests/_fixtures/integration_allow_no_vcr_files.txttests/_fixtures/integration_vcr_allow_no_vcr_nodeids.txttests/cassettes/research_import_verification.yamltests/integration/test_research_import_verification_vcr.py
✅ Files skipped from review due to trivial changes (1)
- tests/_fixtures/integration_vcr_allow_no_vcr_nodeids.txt
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/integration/test_research_import_verification_vcr.py
Gap
ResearchAPI.import_sources_with_verification(src/notebooklm/_research.py) — the timeout-tolerant deep-research import path (issue #315) — had unit + e2e coverage but no integration/VCR test locking the actual wire. The rawimport_sourcesRPC has a VCR backfill (research_import_sources_direct.yaml), but the verification wrapper that snapshots the baseline and toleratesIMPORT_RESEARCHtimeouts did not.What this adds
A REAL fast-research recording (
tests/cassettes/research_import_verification.yaml, 0.48 MB) of the verification wrapper's happy path, captured live against the API on a self-cleaning scratch notebook:CREATE_NOTEBOOK+ 3×ADD_SOURCE(substantive public Wikipedia paragraphs — no PII)START_FAST_RESEARCH→POLL_RESEARCHtocompleted(viawait_for_completion, which pins the POLL-reported task id)GET_NOTEBOOK(the pre-import baselinesources.listsnapshot) +IMPORT_RESEARCH(the import under test)DELETE_NOTEBOOKcleanup (runs in record AND replay)A fast research is sufficient because the
IMPORT_RESEARCHwire is identical fast-vs-deep; this keeps the recording cheap (~30 s) while the deep-only report-markdown-source import path stays unit-tested.Replay assertions (
record_mode=none, <30 s viafast_sleep)import_sources_with_verification(...)returns a non-emptylist[dict]of{"id": ..., "title": ...}imported entries.RPCMethod.GET_NOTEBOOK.value(baseline snapshot) andRPCMethod.IMPORT_RESEARCH.value(the import), asserted byRPCMethodconstant — rotation-proof throughrpc/types.py, not string literals.Timeout / verify / retry branch stays unit-tested
The timeout-driven verify-and-retry branch cannot be exercised through VCR — VCR replays the recorded
IMPORT_RESEARCHresponse verbatim and cannot synthesize the client-sideRPCTimeoutErrorthat triggers the probe/retry logic. That branch is covered exhaustively bytests/unit/test_research_import_with_verification.py(baseline-snapshot failure, partial-commit retry filtering, report-entry dropping, duplicate suppression, cancellation propagation, …). This VCR test deliberately scopes to the no-timeout happy path so replay stays deterministic and fast.Verification
completed+ imported sources; scratch notebook deleted (no orphans).tests/_guardrails/test_cassettes_clean.py,test_cassette_shapes.py,test_no_pinned_cassette_values.pyall pass; manual grep confirms no liveSID/HSID/SAPISIDHASH/Bearer/account-email shapes (cookies + body keys scrubbed).uv run pytest tests/integration/test_research_import_verification_vcr.py -q→ 3 passed in <1 s.uv run mypy src/notebooklm --ignore-missing-importsclean;uv run pre-commit run --all-filespasses.🤖 Generated with Claude Code
Summary by CodeRabbit