Claude/fix persistent sqlite hwb9 q#17
Conversation
Previously, the `DatabaseBase` class recreated the `sqlite3` connection for every check (`contains()`), insertion (`add()`), deletion (`remove()`), and retrieval (`all()`). During sequential operations, like fetching metadata and verifying if track IDs are previously downloaded (e.g. downloading a playlist with hundreds of tracks), the overhead of creating/destroying these connections repeatedly is not negligible. With this optimization, the Database instance caches its `sqlite3.connect` connection upon initialization (using `check_same_thread=False` to safely share across asyncio tasks running within the event loop) and gracefully shuts down in `__del__`. Impact: Significant speedup (~10x on db operations) resolving track IDs sequentially. Co-authored-by: davidjuarezdev <230496599+davidjuarezdev@users.noreply.github.com>
Previously, the `DatabaseBase` class recreated the `sqlite3` connection for every check (`contains()`), insertion (`add()`), deletion (`remove()`), and retrieval (`all()`). During sequential operations, like fetching metadata and verifying if track IDs are previously downloaded (e.g. downloading a playlist with hundreds of tracks), the overhead of creating/destroying these connections repeatedly is not negligible. With this optimization, the Database instance caches its `sqlite3.connect` connection upon initialization (using `check_same_thread=False` to safely share across asyncio tasks running within the event loop) and gracefully shuts down in `__del__`. Impact: Significant speedup (~10x on db operations) resolving track IDs sequentially. Also updates the GitHub Actions CodeQL workflow to use `v3`/`v4` instead of the deprecated `v1`/`v2` actions which were causing CI failures. Co-authored-by: davidjuarezdev <230496599+davidjuarezdev@users.noreply.github.com>
Previously, the `DatabaseBase` class recreated the `sqlite3` connection for every check (`contains()`), insertion (`add()`), deletion (`remove()`), and retrieval (`all()`). During sequential operations, like fetching metadata and verifying if track IDs are previously downloaded (e.g. downloading a playlist with hundreds of tracks), the overhead of creating/destroying these connections repeatedly is not negligible. With this optimization, the Database instance caches its `sqlite3.connect` connection upon initialization (using `check_same_thread=False` to safely share across asyncio tasks running within the event loop) and gracefully shuts down in `__del__`. Impact: Significant speedup (~10x on db operations) resolving track IDs sequentially. Also updates the GitHub Actions CodeQL workflow to use `v3`/`v4` instead of the deprecated `v1`/`v2` actions which were causing CI failures, and addresses other CodeQL warnings like `setup-python-dependencies: false` and `CODEQL_ACTION_FILE_COVERAGE_ON_PRS`. Co-authored-by: davidjuarezdev <230496599+davidjuarezdev@users.noreply.github.com>
Previously, the `DatabaseBase` class recreated the `sqlite3` connection for every check (`contains()`), insertion (`add()`), deletion (`remove()`), and retrieval (`all()`). During sequential operations, like fetching metadata and verifying if track IDs are previously downloaded (e.g. downloading a playlist with hundreds of tracks), the overhead of creating/destroying these connections repeatedly is not negligible. With this optimization, the Database instance caches its `sqlite3.connect` connection upon initialization (using `check_same_thread=False` to safely share across asyncio tasks running within the event loop) and gracefully shuts down in `__del__`. Impact: Significant speedup (~10x on db operations) resolving track IDs sequentially. Also updates the GitHub Actions CodeQL workflow to use `v4` instead of the deprecated `v1` actions and explicitly turns off python package installation warnings via `build-mode: none` to fix CI failures. Co-authored-by: davidjuarezdev <230496599+davidjuarezdev@users.noreply.github.com>
Accept deletion of codeql-analysis.yml (removed in main via PR #15) https://claude.ai/code/session_018Mv9wywY8XSGfkztttn3Q9
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
Disabled knowledge base sources:
📝 WalkthroughSummary by CodeRabbit
WalkthroughDocumentation added for database connection pooling requirements. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR updates Streamrip’s SQLite database wrapper to reuse a persistent connection (instead of reconnecting per operation), and cleans up an unused import in the Deezer tests.
Changes:
- Keep a single cached
sqlite3.ConnectionperDatabaseBaseinstance and reuse it forcreate/contains/add/remove/all. - Add table-existence detection and make table creation idempotent (
CREATE TABLE IF NOT EXISTS). - Minor import cleanup/formatting in
tests/test_deezer.py; add a.jules/bolt.mdnote documenting the optimization.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
streamrip/db.py |
Switches DB operations to a persistent SQLite connection; adds _table_exists, adjusts create, and updates CRUD methods accordingly. |
tests/test_deezer.py |
Removes an unused import and normalizes import grouping. |
.jules/bolt.md |
Documents the rationale and action for the persistent SQLite connection change. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| self.conn.execute(command, tuple(items)) | ||
| self.conn.commit() | ||
| except sqlite3.IntegrityError as e: | ||
| # tried to insert an item that was already there |
There was a problem hiding this comment.
add() catches sqlite3.IntegrityError but does not ROLLBACK the transaction. With a long-lived connection this can leave the connection in an open transaction and hold locks, causing subsequent operations (or other processes) to hit database is locked. Roll back on errors (or wrap DML in with self.conn: so commit/rollback is handled consistently).
| # tried to insert an item that was already there | |
| # tried to insert an item that was already there | |
| self.conn.rollback() |
| def __del__(self): | ||
| """Ensure connection is closed on exit.""" | ||
| if hasattr(self, 'conn') and self.conn: | ||
| self.conn.close() |
There was a problem hiding this comment.
Relying on __del__ to close the SQLite connection is not reliable (it may never run, and during interpreter shutdown module globals may already be torn down). Consider providing an explicit close() / context-manager API and calling it from the application shutdown path, and keep __del__ as a best-effort fallback only.
| def __del__(self): | |
| """Ensure connection is closed on exit.""" | |
| if hasattr(self, 'conn') and self.conn: | |
| self.conn.close() | |
| def close(self) -> None: | |
| """Close the underlying SQLite connection, if open.""" | |
| conn = getattr(self, "conn", None) | |
| if conn is not None: | |
| try: | |
| conn.close() | |
| except Exception: | |
| # Best-effort close; failures are not fatal. | |
| logger.debug("Error while closing database connection", exc_info=True) | |
| finally: | |
| self.conn = None | |
| def __enter__(self): | |
| """Support use as a context manager.""" | |
| return self | |
| def __exit__(self, exc_type, exc_val, exc_tb): | |
| """Ensure the connection is closed when leaving a context manager.""" | |
| self.close() | |
| def __del__(self): | |
| """Best-effort attempt to close the connection on garbage collection.""" | |
| try: | |
| self.close() | |
| except Exception: | |
| # Avoid raising during interpreter shutdown | |
| pass |
| if hasattr(self, 'conn') and self.conn: | ||
| self.conn.close() |
There was a problem hiding this comment.
Repo formatting config enforces double quotes; these single-quoted string literals will be reformatted (and may fail CI if formatting checks run). Please use double quotes for consistency.
|
|
||
| def reset(self): | ||
| """Delete the database file.""" | ||
| if hasattr(self, 'conn') and self.conn: |
There was a problem hiding this comment.
Repo formatting config enforces double quotes; use double quotes here as well (this currently deviates from tool.ruff.format.quote-style = "double").
| if hasattr(self, 'conn') and self.conn: | |
| if hasattr(self, "conn") and self.conn: |
| # ⚡ Bolt: Cache persistent SQLite connection to avoid recreating it | ||
| # on every db check/add. This gives ~10x speedup for database operations | ||
| # like downloading a playlist where it does hundreds of ID checks sequentially. | ||
| self.conn = sqlite3.connect(self.path, check_same_thread=False) |
There was a problem hiding this comment.
check_same_thread=False disables sqlite3's built-in guardrails, but there is no locking/serialization around self.conn access. If this instance is ever used across threads, this can lead to unsafe concurrent access. If cross-thread use isn't required, keep the default (True); if it is required, add a lock around all DB operations (or use one connection per thread).
| self.conn = sqlite3.connect(self.path, check_same_thread=False) | |
| self.conn = sqlite3.connect(self.path) |
No description provided.