
feat: add source management (list, add, read, remove)#31

Open
chazmaniandinkle wants to merge 9 commits into PleasePrompto:master from chazmaniandinkle:feat/source-management

@chazmaniandinkle chazmaniandinkle commented Apr 11, 2026

Summary

Adds full source and notebook management to the NotebookLM skill via browser automation. All operations run headless by default.

Source Management (source_manager.py)

| Command | Description |
| --- | --- |
| `list` | List all sources in a notebook (writes to library) |
| `add-text` | Add copied text as a source |
| `add-file` | Upload files (PDF, images, docs, audio) via file chooser |
| `add-website` | Add website/YouTube URLs using Angular formcontrolname selector |
| `read` | Extract source guide content (AI summary + document text) |
| `rename` | Rename a source via context menu |
| `select --names` | Select specific sources for querying (deselect others) |
| `select --all` | Reset to all sources selected |
| `remove` | Delete a source with CDK overlay force-click |
| `test` | Integration round-trip: add → read → remove |

Notebook Management (notebook_manager.py)

| Command | Description |
| --- | --- |
| `sync` | Discover all notebooks from web, upsert into local library |
| `sync --deep` | Also scrape full source lists per notebook |
| `sync --deep --stale` | Only re-scrape notebooks where source count changed |
| `sync --library-only` | Only refresh existing entries |
| `find` | Fuzzy search across names, descriptions, topics, AND source names |
| `import --url` | Auto-discover title + source count from a notebook URL |
| `create --title` | Create new notebook on web + add to library |
| `rename --id --title` | Rename on web + update library |
| `delete --id --confirm` | Delete from web + remove from library |
| `exclude --id` | Remove from library + prevent sync re-add |
| `list --format json` | Machine-readable output for piping |
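
The `find` command's matching logic is not shown in this description. As a minimal sketch, assuming library entries are dicts with hypothetical `name`, `description`, `topics`, and `sources` keys, a fuzzy search over all four fields could look like this (the `difflib` scoring and the 0.6 threshold are illustrative choices, not the script's confirmed behaviour):

```python
from difflib import SequenceMatcher


def fuzzy_score(query: str, text: str) -> float:
    """Return 1.0 for a case-insensitive substring hit, else a similarity ratio."""
    q, t = query.lower(), text.lower()
    if not q or not t:
        return 0.0
    if q in t:
        return 1.0
    return SequenceMatcher(None, q, t).ratio()


def find_notebooks(library: list[dict], query: str, threshold: float = 0.6):
    """Return (score, notebook name, matched text) tuples, best match first.

    The entry keys used here are assumptions about the library layout.
    """
    hits = []
    for nb in library:
        candidates = [nb.get("name", ""), nb.get("description", "")]
        candidates += nb.get("topics", [])
        candidates += nb.get("sources", [])
        # Keep the best-scoring field so the caller can show what matched.
        score, matched = max((fuzzy_score(query, c), c) for c in candidates)
        if score >= threshold:
            hits.append((score, nb.get("name", ""), matched))
    return sorted(hits, key=lambda h: h[0], reverse=True)
```

Returning the matched text alongside the notebook name lets the caller display which source triggered the hit.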

Targeted Querying (ask_question.py)

| Flag | Description |
| --- | --- |
| `--sources "name1,name2"` | Select specific sources before asking (others excluded) |
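
How the `--sources` value is matched against source names is not spelled out here. A minimal sketch, assuming a comma-separated value and case-insensitive substring matching (both function names are hypothetical), might be:

```python
def parse_sources_flag(value: str) -> list[str]:
    """Split a comma-separated --sources value, dropping empty items."""
    return [name.strip() for name in value.split(",") if name.strip()]


def plan_selection(available: list[str], wanted: list[str]) -> dict[str, bool]:
    """Map each available source name to True (select) or False (deselect).

    Substring matching is an assumption; the script may match differently.
    """
    wanted_lower = [w.lower() for w in wanted]
    return {
        name: any(w in name.lower() for w in wanted_lower)
        for name in available
    }
```

The resulting mapping can then drive the checkbox clicks: tick every source mapped to `True` and untick the rest before the question is submitted.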

Architecture

  • Library-first principle: every web read writes to library.json first, then displays from the library. Output is always read from the library; the web scrape only ever writes to it.
  • Source registry: list and all mutations write source names back to the notebook's library entry with a sources_scraped_at timestamp.
  • Staleness detection: list shows warnings when homepage source count diverges from last deep scrape.
  • Angular FormControl: URL input uses [formcontrolname='urls'] + Insert button (credit: DataNath/notebooklm_source_automation).
  • Card UUID matching: link.closest('mat-card.project-button-card') for homepage operations.
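
The library-first write-back described above can be sketched as follows. This is an illustration only: the `library.json` layout (a top-level `notebooks` mapping keyed by notebook ID) and the function name are assumptions, while the `sources` and `sources_scraped_at` field names come from this description.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def update_library_sources(library_path: Path, notebook_id: str, scraped: list[str]) -> list[str]:
    """Persist scraped source names, then return them as read back from disk."""
    if library_path.exists():
        library = json.loads(library_path.read_text())
    else:
        library = {"notebooks": {}}  # assumed layout

    entry = library.setdefault("notebooks", {}).setdefault(notebook_id, {})
    entry["sources"] = scraped
    entry["sources_scraped_at"] = datetime.now(timezone.utc).isoformat()
    library_path.write_text(json.dumps(library, indent=2))

    # Display path: re-read the library instead of reusing the scraped list,
    # so what the user sees is always what was persisted.
    return json.loads(library_path.read_text())["notebooks"][notebook_id]["sources"]
```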

New selectors in config.py

Source management + notebook management selectors for Angular Material UI elements.

Test plan

All features verified against live NotebookLM:

  • source list, add-text, add-file, add-website, read, remove
  • source select/deselect, select --all
  • source integration test (add → read → remove round-trip)
  • notebook sync (21 notebooks discovered)
  • notebook find (fuzzy search across titles + source names)
  • notebook create → rename → delete (full lifecycle)
  • ask_question --sources (scoped query verified)
  • library write-back on all mutations
  • list --format json

🤖 Generated with Claude Code

chazmaniandinkle and others added 9 commits April 11, 2026 15:38
New source_manager.py script for managing NotebookLM sources via browser
automation. Supports listing sources, adding text/file sources, reading
source content, and removing sources — all headless by default.

Operations:
- list: enumerate all sources in a notebook
- add-text: add copied text as a source (NLM auto-titles)
- add-file: upload files via Playwright file chooser
- add-website: semi-automated (opens browser for manual Submit)
- read: extract source guide content
- remove: delete with force-click through CDK overlay
- test: integration round-trip (add → read → remove)

Also adds source management selectors to config.py and documents
all new commands in SKILL.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Angular MatInput textarea doesn't trigger reactive form validation
with standard Playwright input methods (fill, type, pressSequentially,
execCommand, clipboard paste). The fix uses [formcontrolname='urls']
selector which binds directly to Angular's reactive FormControl, and
submits via the "Insert" button instead of the disabled "Submit" arrow.

Credit: DataNath/notebooklm_source_automation for the formcontrolname
approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
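
A minimal sketch of the formcontrolname approach, assuming `page` is a Playwright sync-API `Page` with the website-source dialog already open. The function name is hypothetical, and which input method the script actually calls on the bound element is not stated in this commit; `fill` is assumed here.

```python
def add_website_source(page, url: str) -> None:
    """Enter a URL via the Angular-bound control and submit with Insert."""
    # Target the element bound to the reactive FormControl rather than the
    # generic MatInput textarea.
    url_input = page.locator("[formcontrolname='urls']")
    url_input.fill(url)  # assumption: fill is enough once the selector is right

    # Submit with the "Insert" button, not the disabled "Submit" arrow.
    page.get_by_role("button", name="Insert").click()
```
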
… delete)

Extends notebook_manager.py with browser automation for:
- sync: discover all notebooks from web, upsert into local library
- sync --library-only: only refresh existing entries
- sync --force: ignore exclusion list
- find: fuzzy search across name, description, topics
- import: auto-discover title + source count from URL
- create: create new notebook on web + add to library
- rename: rename on web + update library
- delete: delete from web + remove from library (requires --confirm)
- exclude: remove from library + prevent sync re-add

Uses JavaScript-based scraping for reliable title extraction from
Angular Material cards. IDs include 8-char UUID suffix for uniqueness.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
source_manager.py:
- rename: rename sources via more_vert menu
- select --names: select specific sources (deselect others)
- select --all: reset to all sources selected

notebook_manager.py:
- list --format json: machine-readable output for piping

ask_question.py:
- --sources: select specific sources before asking
  (e.g. --sources "joes-book,cybernetics" queries only those)

The source selection feature lets agents scope queries to specific
documents instead of searching all 300 sources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All source operations now persist to library.json as single source of
truth. list_sources writes scraped names back before printing.
Mutating ops (add/remove/rename) refresh the library after each change.

- Add _update_library_sources() to SourceManager for write-back
- list_sources reads from library after scrape, not raw page
- sync --deep scrapes source names per notebook into library
- sync --deep --stale only re-scrapes when counts diverge
- find searches source names and shows matched source
- list shows staleness when homepage count != scraped count
- update_notebook accepts sources and sources_scraped_at fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
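
The staleness rule above (homepage count != scraped count) can be sketched as a pure check. The function names and the shape of `library` and `homepage_counts` are assumptions; treating a never-scraped notebook as stale is also an assumption.

```python
def is_stale(entry: dict, homepage_count: int) -> bool:
    """True when the last deep scrape no longer matches the homepage count."""
    scraped = entry.get("sources")
    if scraped is None:
        return True  # never deep-scraped (assumed to count as stale)
    return len(scraped) != homepage_count


def notebooks_to_rescrape(library: dict[str, dict], homepage_counts: dict[str, int]) -> list[str]:
    """Return the notebook IDs that a --deep --stale pass would revisit."""
    return [
        nb_id
        for nb_id, count in homepage_counts.items()
        if is_stale(library.get(nb_id, {}), count)
    ]
```

Note that comparing counts alone cannot detect a source being swapped for another of the same total; only a full deep scrape catches that case.
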
The rename/delete functions were using card.evaluate("e.closest('a')")
to find the link, but the <a> tag is a sibling overlay, not a parent.
Added _find_notebook_card_by_uuid that finds the link by UUID then
navigates to its sibling card element.

Note: card matching still fails in headless mode — the evaluate_handle
→ as_element() pattern needs further debugging. Tracked as known issue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs fixed:

1. create_notebook_web: Added page.wait_for_url() after clicking Create
   to capture the real notebook UUID URL instead of the transitional
   /notebook/creating URL.

2. _find_notebook_card_by_uuid: Replaced broken evaluate_handle →
   as_element() approach with JS-index strategy. JS finds the link
   matching the UUID, locates its sibling mat-card, and returns the
   card's integer index among all cards. Python fetches by index.
   Avoids crossing the JS/Python boundary with DOM handles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
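
The first fix can be sketched as below, assuming `page` is a Playwright sync-API `Page` and that the final notebook URL ends in a lowercase, hyphenated UUID. The regex and the function name are illustrative, not taken from the script.

```python
import re
from typing import Optional

# Assumed URL shape: .../notebook/<uuid>; "/notebook/creating" does not match.
NOTEBOOK_URL_RE = re.compile(
    r"/notebook/([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def wait_for_notebook_uuid(page, timeout_ms: int = 30_000) -> Optional[str]:
    """Block until the page leaves the transitional URL, then return the UUID."""
    # wait_for_url accepts a compiled pattern and raises on timeout.
    page.wait_for_url(NOTEBOOK_URL_RE, timeout=timeout_ms)
    match = NOTEBOOK_URL_RE.search(page.url)
    return match.group(1) if match else None
```
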
The <a> link is a child of <mat-card>, not a sibling. Using
link.closest('mat-card.project-button-card') correctly finds the
parent card. All notebook operations (create, rename, delete) now
pass end-to-end testing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
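
Combining the `closest()` lookup with the index hand-off from the previous commit gives roughly the following. This is a sketch: matching the link by an `href` that contains the UUID is an assumption, and whether the final code still returns an index is not stated here.

```python
FIND_CARD_INDEX_JS = """
(uuid) => {
  const selector = 'mat-card.project-button-card';
  const cards = Array.from(document.querySelectorAll(selector));
  // Assumption: the notebook link's href contains the notebook UUID.
  const link = Array.from(document.querySelectorAll('a'))
    .find(a => (a.getAttribute('href') || '').includes(uuid));
  if (!link) return -1;
  // The link is a descendant of the card, so closest() walks up to it.
  return cards.indexOf(link.closest(selector));
}
"""


def find_notebook_card_by_uuid(page, uuid: str):
    """Return a locator for the card containing the notebook link, or None."""
    # Only an integer crosses the JS/Python boundary, never a DOM handle.
    index = page.evaluate(FIND_CARD_INDEX_JS, uuid)
    if index < 0:
        return None
    return page.locator("mat-card.project-button-card").nth(index)
```
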
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>