feat: Add persistent device database for offline/degraded mode operation #548
maxfield-allison wants to merge 6 commits into
This implements the device database foundation that Wez requested for resolving Issue wez#76 (LAN-only fallback when APIs are unavailable).

## Problem Solved

When all Govee APIs fail (authentication error, rate limiting, network issues) and the SQLite cache is empty/cleared, govee2mqtt would crash with the ISSUE_76_EXPLANATION error. This prevented LAN control even though LAN devices were reachable.

## Solution

A persistent JSON device database that:

- Stores device metadata (id, sku, name, room) learned from APIs
- Survives cache clears and container restarts
- Enables graceful degradation to LAN-only mode
- Preserves device names for Home Assistant entity stability

## Changes

### New: src/device_database.rs

- JSON storage at /data/devices.json (HA addon) or ~/.cache/govee2mqtt/
- Atomic writes (temp file + rename) for crash safety
- StartupMode detection: Fresh, Upgrade, or Normal
- User override fields for future editing capability

### Modified: src/commands/serve.rs

- Load device database on startup
- Fallback to cached devices when APIs fail (with warnings)
- Populate in-memory state from database in degraded mode
- Update database on successful API/LAN discovery

### Modified: src/service/device.rs

- Added cached_name/cached_room fields
- name() and room_name() fall back to cached values

## Tested Scenarios

1. Valid credentials: Database populated, all devices work
2. Invalid creds + SQLite cache: Cache returns stale data
3. Invalid creds + NO cache: Loads from devices.json, LAN works
4. Fresh install + no creds: Fails as expected (no data)

## Non-Breaking

- All existing tests pass
- Existing SQLite cache behavior unchanged
- Database is purely additive resilience layer
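The "atomic writes (temp file + rename)" item above can be illustrated with a minimal, standard-library-only sketch. This is not the code from the PR: `save_atomically` and the `.json.tmp` suffix are hypothetical names chosen for illustration, and a later commit in this thread switches to `NamedTempFile` instead.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper (not from the PR): write `contents` to `path` so that
/// a reader sees either the previous file or the complete new one.
fn save_atomically(path: &Path, contents: &[u8]) -> io::Result<()> {
    // Keep the temp file next to the target so the rename never crosses
    // filesystems. For "devices.json" this yields "devices.json.tmp".
    let tmp = path.with_extension("json.tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(contents)?;
        // Push the data to disk before the rename makes it visible.
        file.sync_all()?;
    }
    // Replace the destination in one step; on POSIX filesystems the rename
    // is atomic, so a crash leaves the old or the new file, never a mix.
    fs::rename(&tmp, path)
}
```

A crash before the rename leaves at worst a stray `.tmp` file, while the previous `devices.json` stays intact.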
Whoa that was fast!! Looking forward to per-device LAN API only mode :)
I really just want to not worry about getting locked out for abnormal activity ever again if i have connectivity issues or bounce my services multiple times while messing around lol
@inventor7777 not to say i don't also want enhanced features, lol, im sure I'll contribute more as time permits. (mister thumbs down... XD)
OOOPS I'm sorry i didn't mean to do that 😂😂
wez left a comment
Thank you again for this! Sorry for the delay in responding; it's a busy time of year!
I completely understand. We hope for side project time but life always seems to find a way to make it difficult!
one more commit incoming shortly to address the remaining items from review. or i can close and push one clean if you prefer, apologies.
- Avoid unnecessary clone in device iteration (use reference)
- Use defensive slicing with .get().unwrap_or() to prevent panics
- Fix TOCTOU race condition in database load
- Use NamedTempFile for robust atomic writes
- Rename cached_name/cached_room to name/room (database is source of truth)
- Restructure discovery flow: always load database first
- Remove ISSUE_76 crash - fresh installs use generated names
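As a rough illustration of the "defensive slicing with .get().unwrap_or()" item, here is a minimal sketch. The helper `short_id` and its use on device IDs are assumptions made for the example; the commit does not say which strings are being sliced.

```rust
/// Hypothetical helper (not from the PR): return at most the first `n` bytes
/// of `id`, falling back to the whole string instead of panicking.
fn short_id(id: &str, n: usize) -> &str {
    // `&id[..n]` panics when `n` exceeds the length or lands inside a
    // multi-byte UTF-8 character; `get(..n)` returns `None` in both cases.
    id.get(..n).unwrap_or(id)
}
```

With this shape, a short or unusual ID from a device degrades to "use the whole string" rather than taking the service down.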
e755006 to a3838fd
need to correct database saves on LAN status query |
The save was being called on every LAN status response, causing multiple writes per minute. Database saves should only happen after API enumeration, not on every status update.
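One way to picture the fix is sketched below, under stated assumptions. The `DeviceDb` type, its method names, and the dirty flag are all hypothetical; the commit only says that saves should follow API enumeration, not every LAN status response. The `saves` counter stands in for the actual disk write.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory view of the device database (not the PR's type).
#[derive(Default)]
struct DeviceDb {
    names: BTreeMap<String, String>,
    dirty: bool,
    saves: usize, // stand-in for the number of disk writes performed
}

impl DeviceDb {
    /// Called after API enumeration: record metadata and mark the database
    /// dirty only if something actually changed.
    fn record_from_api(&mut self, id: &str, name: &str) {
        let changed = self.names.get(id).map(String::as_str) != Some(name);
        if changed {
            self.names.insert(id.to_string(), name.to_string());
            self.dirty = true;
        }
    }

    /// Called on every LAN status response: persisted metadata is untouched,
    /// so no save is ever triggered from here.
    fn on_lan_status(&mut self, _id: &str) {}

    /// Write only when there are unsaved changes; returns whether it wrote.
    fn save_if_dirty(&mut self) -> bool {
        if !self.dirty {
            return false;
        }
        self.saves += 1; // a real implementation would write the file here
        self.dirty = false;
        true
    }
}
```

The dirty flag is an extra safeguard added in this sketch: even if `save_if_dirty` were called from a hot path by mistake, unchanged data would not be rewritten.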
Not sure if this helps, but there is similar functionality in my fork (AlgoClaw/govee2mqtt) after creating a feature request in this issue (which I closed shortly after). You can map the internal directory /JSONs as a persistent volume which can store saved scene commands for LAN control. For example, if you save H61A8_final.json in /JSONs, the Govee2MQTT bridge will use the scenes defined in the JSON file. What's cool too is that you can keep old/deleted scenes Govee removed. I am a total noob when it comes to PRs and git stuff, so I do not know how to help there.
Nice thinking. that's another pain point I have with the current implementation - it doesn't pull everything (I have hundreds of DIYs and it only pulls half or less)
When can we get this? I can't use adaptive lighting until this is fixed. My service just insta-pings over the daily or minute or second or hour limit, and then my lights are stuck at some arbitrary brightness.
Summary
This PR implements the persistent device database foundation we discussed in #76 and #537. It enables govee2mqtt to gracefully degrade to LAN-only mode when Govee APIs are unavailable, rather than crashing.
Problem
When all Govee APIs fail (authentication error, rate limiting, abnormal activity detection, or network issues) and the SQLite cache is empty or cleared, govee2mqtt crashes with the `ISSUE_76_EXPLANATION` error. This prevents LAN control even though LAN devices are reachable on the local network.
Solution
A persistent JSON device database that:
Implementation Details
Following the architecture you outlined in #537:
Storage (src/device_database.rs)
- `/data/devices.json` (HA addon) or `~/.cache/govee2mqtt/devices.json`
- `user_name` and `user_room` for future editing capability
Startup Flow (src/commands/serve.rs)
Fallback Logic
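A minimal sketch of what such a fallback could look like, assuming the behavior described in the commit messages (use database devices with a warning when the APIs fail, and no crash on an empty database). `Device`, `Mode`, and `resolve_devices` are hypothetical names for illustration, not the types in src/commands/serve.rs.

```rust
/// Hypothetical device record; the real one also carries sku, room, etc.
#[derive(Debug, Clone, PartialEq)]
struct Device {
    id: String,
    name: String,
}

/// Hypothetical operating mode chosen at startup.
#[derive(Debug, PartialEq)]
enum Mode {
    Online,
    LanOnly,
}

/// Prefer fresh API results; otherwise fall back to whatever the persistent
/// database holds and continue in LAN-only mode instead of failing.
fn resolve_devices(
    api: Result<Vec<Device>, String>,
    database: &[Device],
) -> (Vec<Device>, Mode) {
    match api {
        Ok(devices) => (devices, Mode::Online),
        Err(reason) => {
            eprintln!(
                "warning: Govee API unavailable ({reason}); using {} cached device(s)",
                database.len()
            );
            // An empty database is not an error in this sketch: LAN
            // discovery can still find devices later.
            (database.to_vec(), Mode::LanOnly)
        }
    }
}
```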
Changes
- `src/device_database.rs`
- `src/commands/serve.rs`
- `src/service/device.rs`: `cached_name`/`cached_room` fields
- `src/service/state.rs`
- `src/main.rs`
- `Cargo.toml`
Testing
Tested on a Docker Swarm deployment with 7 Govee devices:
Key Test Output (Scenario 3)
Non-Breaking
- All existing tests pass (`cargo test`)
Future Work (Not in this PR)
This is intentionally a focused foundational PR. Future enhancements could include:
Related
Happy to address any feedback or make adjustments to the approach!