- 
                Notifications
    You must be signed in to change notification settings 
- Fork 744
Fix and extend cloud backup support #5464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2463543    to
    ee9ac4a      
    Compare
  
    | 📝 Walkthrough📝 WalkthroughWalkthroughThe pull request introduces significant updates to the backup and restore functionalities within the Supervisor API. Key enhancements include the ability to manage multiple backup locations, the addition of new helper functions, and modifications to existing methods to accommodate these changes. New constants and exception classes are introduced, while existing constants and methods are refined or removed to streamline operations. Additionally, the test suite is expanded and updated to reflect these changes, ensuring comprehensive coverage of the new functionalities. Changes
 Sequence Diagram(s)sequenceDiagram
    participant User
    participant BackupManager
    participant Backup
    participant Storage
    User->>BackupManager: Request Backup
    BackupManager->>Backup: Create Backup
    Backup->>Storage: Store Backup Data
    Backup->>BackupManager: Return Backup Info
    BackupManager->>User: Return Backup Success
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit: 
 Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
 Other keywords and placeholders
 CodeRabbit Configuration File ( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (3)
tests/api/test_mounts.py (1)
Line range hint
1-799: Add test coverage for new backup featuresWhile the test file thoroughly covers the removal of bind mounts and read-only mount functionality, there appear to be gaps in test coverage for key PR objectives:
- Multi-location backup support
- Extra metadata functionality
- Backup location consolidation
Consider adding test cases to cover these new features to ensure they work as expected.
supervisor/api/backups.py (1)
243-243: Typographical Correction in Error MessageIn the error message on line 243, consider correcting "cannot backup to there" to "cannot back up to there" for proper grammar.
Apply this diff to fix the typo:
- f"Mount {mount.name} is not used for backups, cannot backup to there" + f"Mount {mount.name} is not used for backups, cannot back up to there"supervisor/backups/validate.py (1)
136-136: Consider adding size limits to extra metadata dictionary.While allowing arbitrary dictionaries aligns with the PR objectives, consider adding size limits to prevent potential memory issues.
Example implementation:
- vol.Optional(ATTR_EXTRA, default=dict): dict, + vol.Optional(ATTR_EXTRA, default=dict): vol.All( + dict, + vol.Length(max=1000), # Limit number of entries + vol.Schema({ + str: vol.All( # Ensure keys are strings + vol.Any(str, int, float, bool, dict, list), + # Add custom validator to limit nested structure depth + # and total size if needed + ) + }) + ),
🛑 Comments failed to post (3)
supervisor/backups/backup.py (1)
245-248: 🛠️ Refactor suggestion
Equality Method May Need to Include All Relevant Attributes
The
__eq__method compares backups based solely on their_dataattribute. To ensure backups are truly identical, consider including_locationsin the comparison.Update the method to include
_locations:def __eq__(self, other: Any) -> bool: """Return true if backups have same metadata and locations.""" return ( isinstance(other, Backup) and self._data == other._data + and self._locations == other._locations )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.def __eq__(self, other: Any) -> bool: """Return true if backups have same metadata and locations.""" return ( isinstance(other, Backup) and self._data == other._data and self._locations == other._locations )supervisor/backups/manager.py (1)
368-369:
⚠️ Potential issueIncorrect Dictionary Key in Backup Consolidation Check
In lines 368-369,
self._backups[backup]should beself._backups[backup.slug]becauseself._backupsuses slugs as keys. Using the Backup object directly will result in aKeyError.Apply this diff to fix the issue:
if ( backup.slug in self._backups - and backup.all_locations != self._backups[backup].all_locations + and backup.all_locations != self._backups[backup.slug].all_locations ):📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.backup.slug in self._backups and backup.all_locations != self._backups[backup.slug].all_locationssupervisor/backups/const.py (1)
26-26:
⚠️ Potential issueTypographical Error in Enum Member Name
There's a typo in the enum member name
COPY_ADDITONAL_LOCATIONS. It should beCOPY_ADDITIONAL_LOCATIONSto match the correct spelling of "additional".Apply this diff to correct the typo:
-COPY_ADDITONAL_LOCATIONS = "copy_additional_locations" +COPY_ADDITIONAL_LOCATIONS = "copy_additional_locations"Ensure to update all references to this enum member throughout the codebase.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.COPY_ADDITIONAL_LOCATIONS = "copy_additional_locations"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
supervisor/api/backups.py (1)
434-449: Consider adding error handling for temporary directory creation.While the implementation is good, consider adding explicit error handling for temporary directory creation failures.
if location and location != LOCATION_CLOUD_BACKUP: tmp_path = location.local_where + if not tmp_path.exists(): + try: + tmp_path.mkdir(parents=True, exist_ok=True) + except OSError as err: + raise APIError(f"Failed to create temporary directory: {err}")supervisor/backups/manager.py (1)
288-331: Consider adding retry mechanism for network-related failures.While the error handling is robust, network-related failures during backup copying could be transient. Consider implementing a retry mechanism with exponential backoff for network operations.
Example implementation:
from functools import wraps import time def retry_on_network_error(retries=3, backoff_factor=2): def decorator(func): @wraps(func) def wrapper(*args, **kwargs): last_exception = None for attempt in range(retries): try: return func(*args, **kwargs) except OSError as err: last_exception = err if attempt < retries - 1: time.sleep(backoff_factor ** attempt) continue raise last_exception return wrapper return decorator @retry_on_network_error() def copy_to_location(src: Path, dst: Path) -> Path: return Path(copy(src, dst))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (19)
- supervisor/api/backups.py(12 hunks)
- supervisor/api/const.py(1 hunks)
- supervisor/backups/backup.py(10 hunks)
- supervisor/backups/const.py(1 hunks)
- supervisor/backups/manager.py(15 hunks)
- supervisor/backups/validate.py(2 hunks)
- supervisor/const.py(1 hunks)
- supervisor/docker/addon.py(0 hunks)
- supervisor/docker/homeassistant.py(0 hunks)
- supervisor/exceptions.py(1 hunks)
- supervisor/homeassistant/const.py(0 hunks)
- supervisor/homeassistant/websocket.py(0 hunks)
- supervisor/mounts/manager.py(0 hunks)
- supervisor/mounts/mount.py(1 hunks)
- tests/api/test_backups.py(3 hunks)
- tests/api/test_mounts.py(5 hunks)
- tests/backups/test_manager.py(5 hunks)
- tests/docker/test_homeassistant.py(0 hunks)
- tests/resolution/fixup/test_mount_execute_remove.py(1 hunks)
💤 Files with no reviewable changes (6)
- supervisor/homeassistant/const.py
- supervisor/mounts/manager.py
- supervisor/docker/homeassistant.py
- tests/docker/test_homeassistant.py
- supervisor/homeassistant/websocket.py
- supervisor/docker/addon.py
✅ Files skipped from review due to trivial changes (1)
- supervisor/const.py
🚧 Files skipped from review as they are similar to previous changes (7)
- supervisor/api/const.py
- supervisor/exceptions.py
- supervisor/backups/const.py
- supervisor/backups/validate.py
- tests/api/test_mounts.py
- tests/resolution/fixup/test_mount_execute_remove.py
- tests/backups/test_manager.py
🔇 Additional comments (16)
supervisor/api/backups.py (6)
72-76: LGTM! Helper function implementation is clean and effective.
The _ensure_list helper function provides a clean way to handle both single values and lists, ensuring backward compatibility.
100-102: Schema changes effectively support new features.
The schema updates appropriately handle:
- Multiple backup locations through _ensure_list
- Extra metadata through the new ATTR_EXTRAfield
Also applies to: 105-105
233-247: LGTM! Location to mount conversion is well-implemented.
The method properly handles:
- Cloud backup locations
- Mount validation
- Clear error messages
301-313: Multiple location handling is well-implemented.
The implementation correctly:
- Validates all locations
- Sets primary location
- Stores additional locations
463-466: LGTM! Error handling improvement for bad messages.
The error handling properly:
- Checks for specific error conditions
- Updates system health status
- Maintains proper error logging
Line range hint 253-264: Verify cloud backup location validation.
The validation now handles lists of locations, but we should verify this doesn't break existing functionality.
✅ Verification successful
Let me gather more information about how the cloud backup location is used in the codebase.
Let me check how the backup location was handled before this change and look for test cases with a different pattern.
Based on the comprehensive analysis of the code and tests, I can now provide the final response:
Cloud backup location validation change is safe and well-tested.
The validation change to handle lists of locations is part of a larger feature to support multiple backup locations. The changes are safe because:
- The validation maintains backward compatibility by converting single locations to a list
- The core validation logic remains unchanged - only Home Assistant can use cloud backup locations
- The feature is well-tested as evidenced by test cases like test_cloud_backup_core_onlyandtest_backup_to_multiple_locations
- All existing usages of the validation function have been updated to handle the new list format
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for any direct usage of cloud backup location validation
rg -A 3 "_validate_cloud_backup_location" 
Length of output: 2511
Script:
#!/bin/bash
# Check for the definition and usage of LOCATION_CLOUD_BACKUP constant
rg -A 2 "LOCATION_CLOUD_BACKUP\s*=" 
# Check how location is used in the backup-related code
ast-grep --pattern 'location: list[str | None] | str | None'
# Look for any test cases related to cloud backup location validation
rg -l "test.*cloud.*backup.*location|test.*backup.*cloud.*location"
Length of output: 296
Script:
#!/bin/bash
# Check git history for the file to see the previous implementation
git show HEAD -- supervisor/api/backups.py
# Look for backup-related test files
rg -l "test.*backup" tests/
# Check test assertions related to backup locations
rg -A 3 "def test.*backup" tests/
Length of output: 26090
supervisor/mounts/mount.py (1)
33-33: LGTM! Simplified imports align with architectural changes.
The removal of backup-related imports aligns with the PR objective of removing bind mounts and moving to API-only communication.
tests/api/test_backups.py (3)
542-570: LGTM! Comprehensive upload/download test implementation.
The test effectively:
- Validates the upload process
- Verifies backup integrity
- Confirms download functionality
577-607: LGTM! Multiple location backup test is well-structured.
The test thoroughly validates:
- Backup creation in multiple locations
- File existence verification
- Location tracking
612-638: LGTM! Extra metadata test provides good coverage.
The test effectively verifies:
- Extra metadata storage
- Retrieval functionality
- Data integrity
supervisor/backups/backup.py (5)
65-67: LGTM: Clean implementation of location sorting.
The function provides a clean way to sort locations while handling None values appropriately.
198-200: LGTM: Enhanced location tracking with paths.
The change from returning a set to a dictionary improves functionality by preserving the mapping between locations and their paths.
207-214: LGTM: Proper handling of location sorting and cloud backup.
The implementation correctly handles location sorting and cloud backup exclusion logic, aligning with the PR objectives for location consolidation.
249-259: LGTM: Robust backup consolidation implementation.
The consolidation logic properly validates both slug and metadata matches before merging, with appropriate error handling.
177-180: LGTM: Client metadata implementation matches requirements.
The implementation aligns with the PR objectives, allowing clients to store arbitrary metadata as a dictionary.
✅ Verification successful
LGTM: Schema validation properly handles client metadata
The implementation is correctly validated in the schema:
- ATTR_EXTRAis defined as an optional field with dictionary type
- Default value is set to dict
- Parent schema allows extra fields with extra=vol.ALLOW_EXTRA
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify ATTR_EXTRA is properly defined in schema
ast-grep --pattern 'SCHEMA_BACKUP = vol.Schema({
  $$$
  ATTR_EXTRA: vol.Any(dict, None),
  $$$
})'
Length of output: 101
Script:
#!/bin/bash
# Search for ATTR_EXTRA constant definition
rg "ATTR_EXTRA\s*=" -A 1
# Search for schema definition containing ATTR_EXTRA
rg "SCHEMA_BACKUP.*=.*Schema" -A 10
Length of output: 2098
Script:
#!/bin/bash
# Let's check the complete schema in validate.py to ensure ATTR_EXTRA is properly defined
rg "SCHEMA_BACKUP = vol\.Schema" -A 20 supervisor/backups/validate.py
Length of output: 973
Script:
#!/bin/bash
# Let's check the complete schema definition to find ATTR_EXTRA
# Using more context lines since the schema might be longer
rg "SCHEMA_BACKUP = vol\.Schema" -A 40 supervisor/backups/validate.py
Length of output: 1807
Script:
#!/bin/bash
# Let's check even more lines of the schema to find ATTR_EXTRA
# and also look for any schema extra validation settings
rg "SCHEMA_BACKUP = vol\.Schema" -A 60 supervisor/backups/validate.py
Length of output: 2338
supervisor/backups/manager.py (1)
476-478: LGTM: Consistent implementation of backup parameters.
The changes to backup methods properly implement client metadata and additional locations support, maintaining consistency across full and partial backups.
Also applies to: 490-490, 500-500, 526-528, 551-551, 564-569
| @Job(name="backup_manager_import_backup") | ||
| async def import_backup( | ||
| self, | ||
| tar_file: Path, | ||
| location: Mount | Literal[LOCATION_CLOUD_BACKUP] | None = None, | ||
| additional_locations: list[Mount | Literal[LOCATION_CLOUD_BACKUP] | None] | ||
| | None = None, | ||
| ) -> Backup | None: | ||
| """Check backup tarfile and import it.""" | ||
| backup = Backup(self.coresys, tar_file, "temp", None) | ||
|  | ||
| # Read meta data | ||
| if not await backup.load(): | ||
| return None | ||
|  | ||
| # Already exists? | ||
| if backup.slug in self._backups: | ||
| _LOGGER.warning("Backup %s already exists! overwriting", backup.slug) | ||
| self.remove(self.get(backup.slug)) | ||
|  | ||
| # Move backup to backup | ||
| tar_origin = Path(self.sys_config.path_backup, f"{backup.slug}.tar") | ||
| # Move backup to destination folder | ||
| tar_origin = Path(self._get_base_path(location), f"{backup.slug}.tar") | ||
| try: | ||
| backup.tarfile.rename(tar_origin) | ||
|  | ||
| except OSError as err: | ||
| if err.errno == errno.EBADMSG: | ||
| if err.errno == errno.EBADMSG and location in {LOCATION_CLOUD_BACKUP, None}: | ||
| self.sys_resolution.unhealthy = UnhealthyReason.OSERROR_BAD_MESSAGE | ||
| _LOGGER.error("Can't move backup file to storage: %s", err) | ||
| return None | ||
|  | ||
| # Load new backup | ||
| backup = Backup(self.coresys, tar_origin, backup.slug, None, backup.data) | ||
| if not await backup.load(): | ||
| # Remove invalid backup from location it was moved to | ||
| backup.tarfile.unlink() | ||
| return None | ||
| _LOGGER.info("Successfully imported %s", backup.slug) | ||
|  | ||
| self._backups[backup.slug] = backup | ||
| # Already exists? | ||
| if ( | ||
| backup.slug in self._backups | ||
| and backup.all_locations != self._backups[backup].all_locations | ||
| ): | ||
| _LOGGER.warning("Backup %s already exists! consolidating", backup.slug) | ||
| try: | ||
| self._backups[backup.slug].consolidate(backup) | ||
| except BackupInvalidError as err: | ||
| backup.tarfile.unlink() | ||
| raise BackupInvalidError( | ||
| f"Cannot import backup {backup.slug} due to: {err!s}", _LOGGER.error | ||
| ) from err | ||
| else: | ||
| self._backups[backup.slug] = backup | ||
|  | ||
| if additional_locations: | ||
| await self._copy_to_additional_locations(backup, additional_locations) | ||
|  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider using atomic operations for backup import.
The backup import process involves multiple file operations. Consider making these operations atomic to prevent partial imports in case of failures.
Example implementation:
from atomicwrites import atomic_write
def atomic_copy(src: Path, dst: Path) -> Path:
    with atomic_write(dst, overwrite=True) as f:
        f.write(src.read_bytes())
    return dst
# Use in import_backup:
backup_file = atomic_copy(tar_file, tar_origin)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (6)
tests/misc/test_tasks.py (1)
215-241: Enhance test coverage for backup cleanup.While the test verifies basic cleanup functionality, consider adding these test cases:
- Verify that backups younger than the threshold are not deleted
- Test edge cases where backup dates are exactly at the threshold
- Add assertions to verify the backup metadata state
Here's a suggested enhancement:
@pytest.mark.usefixtures("path_extern") async def test_core_backup_cleanup( tasks: Tasks, coresys: CoreSys, tmp_supervisor_data: Path ): """Test core backup task cleans up old backup files.""" coresys.core.state = CoreState.RUNNING coresys.hardware.disk.get_disk_free_space = lambda x: 5000 # Put an old and new backup in folder copy(get_fixture_path("backup_example.tar"), coresys.config.path_core_backup) await coresys.backups.reload( location=".cloud_backup", filename="backup_example.tar" ) assert (old_backup := coresys.backups.get("7fed74c8")) + # Create a backup just before the threshold + threshold_backup = await coresys.backups.do_backup_partial( + name="threshold", folders=["ssl"], location=".cloud_backup" + ) + # Artificially age the backup to just before threshold + threshold_backup.date = (utcnow() - OLD_BACKUP_THRESHOLD + timedelta(minutes=1)).isoformat() + new_backup = await coresys.backups.do_backup_partial( name="test", folders=["ssl"], location=".cloud_backup" ) old_tar = old_backup.tarfile + threshold_tar = threshold_backup.tarfile new_tar = new_backup.tarfile # pylint: disable-next=protected-access await tasks._core_backup_cleanup() assert coresys.backups.get(new_backup.slug) + assert coresys.backups.get(threshold_backup.slug) # Should not be deleted assert not coresys.backups.get("7fed74c8") assert new_tar.exists() + assert threshold_tar.exists() # Should still exist assert not old_tar.exists() + # Verify backup metadata state + assert len(coresys.backups.list_backups) == 2supervisor/backups/manager.py (5)
10-10: Consider usingshutil.copy2to preserve metadataThe
copyfunction copies file data and permission mode but not metadata like timestamps and flags. Usingcopy2preserves all file metadata, which might be important for backup integrity.Apply this diff:
-from shutil import copy +from shutil import copy2And update the usage of
copytocopy2throughout the code.
Line range hint
187-199: Properly validate and sanitize theextrametadataThe
extraparameter allows inclusion of additional metadata with backups. Ensure that this data is properly validated and sanitized to prevent security issues like code injection or corruption of backup metadata.
225-225: Update the docstring for thereloadmethod to reflect the return typeThe
reloadmethod now returns abool, but the docstring mentions it returns a coroutine. Update the docstring to accurately reflect the method's behavior and return type.Apply this diff:
async def reload( self, location: Mount | Literal[LOCATION_CLOUD_BACKUP] | type[DEFAULT] | None = DEFAULT, filename: str | None = None, ) -> bool: """Load existing backups. - Return a coroutine. + Return True if successful, False otherwise. """
Line range hint
410-464: Check for race conditions during backup to multiple locationsIn
_do_backup, backups are copied to additional locations. Ensure that concurrent backups don't interfere with each other, potentially causing race conditions or corrupt backups, especially if backups share the same slug.
492-494: Document and validate new parameters indo_backup_fullThe parameters
extraandadditional_locationsare added todo_backup_full. Ensure these parameters are documented in the method's docstring and validate them to prevent invalid input.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- supervisor/backups/manager.py(15 hunks)
- supervisor/misc/tasks.py(4 hunks)
- tests/backups/test_manager.py(5 hunks)
- tests/misc/test_tasks.py(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/backups/test_manager.py
🔇 Additional comments (6)
tests/misc/test_tasks.py (1)
5-6: LGTM! Import changes are appropriate.
The added imports from standard library modules are necessary for the new test functionality.
Also applies to: 21-21
supervisor/misc/tasks.py (2)
91-95: LGTM! Task registration is well-structured.
The cleanup task is properly registered with appropriate condition (JobCondition.HEALTHY).
46-47: Consider adjusting the timing constants.
The current timing configuration might need adjustment:
- The cleanup task runs approximately every 24 hours (86200seconds), which could lead to backups persisting longer than necessary.
- The 2-day threshold for deletion might be too aggressive, potentially removing backups that are still needed.
Let's analyze the backup patterns to determine better timing values:
Also applies to: 50-51
✅ Verification successful
Let me generate additional scripts to analyze the backup patterns and timing.
Based on the code analysis, I can now provide a final response regarding the timing constants:
The timing values are appropriate for their use case.
The cleanup task timing (24 hours) and 2-day threshold are well-justified because:
- The cleanup task (RUN_CORE_BACKUP_CLEANUP = 86200) targets only transient core backups stored in.cloud_backuplocation, as shown insupervisor/misc/tasks.py:
async def _core_backup_cleanup(self) -> None:
    """Core backup is intended for transient use, remove any old backups that got left behind."""- 
These backups are created automatically during core operations (like updates) and are not user-created backups, as evidenced by the creation patterns in supervisor/homeassistant/core.pyandsupervisor/addons/manager.py.
- 
The 2-day retention ensures temporary backups don't accumulate while giving enough time for rollback if needed. 
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check backup creation patterns to determine optimal cleanup schedule
rg -A 2 "do_backup_partial|create_backup" | grep -v "test"
Length of output: 69715
Script:
#!/bin/bash
# Search for backup cleanup related code
rg -A 5 "backup.*cleanup|cleanup.*backup" --type py
# Search for backup creation patterns
rg -A 5 "create_backup|do_backup" --type py
# Search for backup configuration and settings
rg -A 5 "backup.*config|backup.*setting" --type py
Length of output: 46278
supervisor/backups/manager.py (3)
22-22: Ensure appropriate handling of the new BackupDataDiskBadMessageError
A new exception BackupDataDiskBadMessageError is introduced. Verify that this exception is used consistently and that all scenarios where errno.EBADMSG might occur are adequately handled, providing clear feedback to the user.
168-171: Validate the setting of unhealthy state upon OSError
When an OSError with errno.EBADMSG is caught, the code sets UnhealthyReason.OSERROR_BAD_MESSAGE. Ensure that this is the appropriate response and that it doesn't conflict with other health checks or lead to unintended side effects.
348-400: Verify backup import and consolidation logic for data integrity
During import_backup, backups are consolidated if one with the same slug exists. Ensure that this process doesn't lead to data loss or inconsistencies, especially when backups have differing metadata or contents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (5)
supervisor/backups/manager.py (5)
168-171: Enhance error handling for bad message errorsGood addition of specific error handling for bad message errors. However, consider logging the specific error code for better debugging.
if err.errno == errno.EBADMSG and path in { self.sys_config.path_backup, self.sys_config.path_core_backup, }: + _LOGGER.error("Bad message error (errno: %d) for path: %s", err.errno, path) self.sys_resolution.unhealthy = UnhealthyReason.OSERROR_BAD_MESSAGE
187-187: Verify type hints for extra metadataThe
extraparameter allows for arbitrary dictionary data, but the type hint could be more specific to prevent potential misuse.- extra: dict | None = None, + extra: dict[str, Any] | None = None,Also applies to: 199-199
233-243: Improve error handling in backup consolidationThe consolidation logic properly handles invalid backups, but consider adding debug logging for successful consolidations to aid in troubleshooting.
try: self._backups[backup.slug].consolidate(backup) + _LOGGER.debug( + "Successfully consolidated backup %s from location %s", + backup.slug, + backup.location, + ) except BackupInvalidError as err:
304-347: Enhance error handling in _copy_to_additional_locationsThe error handling is thorough, but consider adding retries for transient network issues when copying to remote locations.
Consider implementing a retry mechanism with exponential backoff for network-related errors:
from typing import Any from tenacity import retry, stop_after_attempt, wait_exponential @retry( stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10), retry=retry_if_exception_type(OSError) ) def copy_with_retry(src: Path, dst: Path) -> Path: """Copy file with retry mechanism for network issues.""" return Path(copy(src, dst))
544-546: Ensure consistent parameter ordering across methodsThe parameter ordering in
do_backup_partialanddo_backup_fullis consistent, which is good. However, consider grouping related parameters together using a dataclass for better maintainability.Consider creating a backup options dataclass:
@dataclass class BackupOptions: """Options for backup operations.""" password: str | None = None compressed: bool = True location: Mount | Literal[LOCATION_CLOUD_BACKUP] | type[DEFAULT] | None = DEFAULT extra: dict[str, Any] | None = None additional_locations: list[Mount | Literal[LOCATION_CLOUD_BACKUP] | None] | None = NoneAlso applies to: 493-495
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- supervisor/backups/manager.py(16 hunks)
🔇 Additional comments (1)
supervisor/backups/manager.py (1)
382-400: 🛠️ Refactor suggestion
Enhance backup import consolidation logic
The backup import consolidation logic correctly handles existing backups, but consider adding validation for the extra metadata when consolidating backups.
Add validation for extra metadata during consolidation:
def _validate_extra_metadata(self, extra: dict | None) -> bool:
    """Validate extra metadata structure."""
    if extra is None:
        return True
    return isinstance(extra, dict) and all(
        isinstance(k, str) for k in extra.keys()
    )There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (8)
tests/backups/conftest.py (3)
61-61: Fix typo in docstringThe docstring contains a typo: "prcoessed" should be "processed".
- """Return a list of prcoessed backup locations.""" + """Return a list of processed backup locations."""
58-60: Add type hint for request.paramThe fixture uses
request.parambut its type is not hinted in the function signature. Add a type hint to improve code clarity.-async def fixture_backup_locations( - request: pytest.FixtureRequest, coresys: CoreSys, mount_propagation, mock_is_mount -) -> list[LOCATION_TYPE]: +async def fixture_backup_locations( + request: pytest.FixtureRequest[list[str | None]], coresys: CoreSys, mount_propagation, mock_is_mount +) -> list[LOCATION_TYPE]:
63-63: Use more descriptive variable nameThe
loadedflag could be more descriptive to better indicate its purpose.- loaded = False + mounts_loaded = Falsesupervisor/api/backups.py (2)
71-75: Add type validation in _ensure_list helperThe helper function should validate that the input is not already a list before wrapping it.
def _ensure_list(item: Any) -> list: """Ensure value is a list.""" - if not isinstance(item, list): + if item is None: + return [] + elif not isinstance(item, list): return [item] return item
Line range hint
299-326: Improve error handling in copy_to_additional_locationsThe error handling could be improved by:
- Adding more context about the source of the error
- Preserving the original error information
- Using a more specific exception type for mount-related errorsexcept OSError as err: - msg = f"Could not copy backup to {location.name if isinstance(location, Mount) else location} due to: {err!s}" + location_name = location.name if isinstance(location, Mount) else location + msg = ( + f"Failed to copy backup from {backup.tarfile} to {location_name}: " + f"{err.__class__.__name__}: {err}" + ) if err.errno == errno.EBADMSG and location in { LOCATION_CLOUD_BACKUP, None, }: - raise BackupDataDiskBadMessageError(msg, _LOGGER.error) from err - raise BackupError(msg, _LOGGER.error) from err + exc = BackupDataDiskBadMessageError(msg, _LOGGER.error) + else: + exc = BackupMountError(msg, _LOGGER.error) if isinstance(location, Mount) else BackupError(msg, _LOGGER.error) + raise exc from err
tests/backups/test_manager.py (3)
1720-1727: Consider adding docstrings to explain test parameters.The parametrize decorator would benefit from docstrings explaining the test scenarios and expected outcomes for each parameter combination.
@pytest.mark.parametrize( ("backup_locations", "location_name", "healthy_expected"), [ - (["test"], "test", True), - ([None], None, False), + (["test"], "test", True), # Named location should maintain healthy state + ([None], None, False), # Default location should mark system unhealthy ], indirect=["backup_locations"], )
Line range hint
1908-1977: Consider adding metadata consistency checks.While the tests verify location tracking well, consider adding assertions to verify that backup metadata (slug, date, etc.) remains consistent across locations.
assert backup.location is None assert backup.locations == [None] assert backup.all_locations.keys() == {".cloud_backup", None} + # Verify backup metadata consistency + assert backup.slug == "7fed74c8" + assert backup.name == "backup_example" # Assuming this is the name + assert backup.date # Verify date is preserved
1980-2014: Consider adding error case tests for backup removal.The tests effectively cover successful removal scenarios, but consider adding test cases for:
- Permission errors during removal
- Partial failures (some locations fail to remove)
- Race conditions (file already removed)
Example test case to add:
async def test_backup_remove_partial_failure( coresys: CoreSys, tmp_supervisor_data: Path ): """Test removing a backup with failures in some locations.""" backup_file = get_fixture_path("backup_example.tar") location_1 = Path(copy(backup_file, coresys.config.path_backup)) location_2 = Path(copy(backup_file, coresys.config.path_core_backup)) await coresys.backups.reload() backup = coresys.backups.get("7fed74c8") # Simulate permission error in one location location_1.chmod(0o000) coresys.backups.remove(backup) assert location_1.exists() # Should still exist due to permission error assert not location_2.exists() # Should be removed successfully assert backup.all_locations == {None: location_1}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
- supervisor/api/backups.py(13 hunks)
- supervisor/backups/const.py(2 hunks)
- supervisor/backups/manager.py(18 hunks)
- tests/backups/conftest.py(2 hunks)
- tests/backups/test_manager.py(7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- supervisor/backups/const.py
🔇 Additional comments (5)
supervisor/api/backups.py (1)
431-446: 
Handle partial failures in upload method
The upload method should handle partial failures when copying to additional locations and ensure cleanup in case of errors.
     if additional_locations:
         self._change_stage(BackupJobStage.COPY_ADDITONAL_LOCATIONS, backup)
         try:
             await self._copy_to_additional_locations(
                 backup, additional_locations
             )
         except BackupError as err:
             self.sys_jobs.capture_error(err)
+            # Clean up the partially copied backup
+            self.remove(backup)
+            return NoneLikely invalid or redundant comment.
supervisor/backups/manager.py (3)
63-65: Convert dict_values to list in list_backups property
The property is annotated to return list[Backup] but returns a dict_values object.
442-450: Improve error handling in _do_backup for additional locations
The error handling for additional locations copy should be consistent with other backup error handling.
221-231: 🛠️ Refactor suggestion
Improve error handling in backup consolidation
The consolidation error handling should provide more context about the failure and ensure proper cleanup.
                     try:
                         self._backups[backup.slug].consolidate(backup)
                     except BackupInvalidError as err:
                         _LOGGER.error(
-                            "Ignoring backup %s in %s due to: %s",
+                            "Failed to consolidate backup %s from %s with existing backup: %s",
                             backup.slug,
                             backup.location,
                             err,
                         )
+                        # Clean up the invalid backup file
+                        backup.tarfile.unlink()
                         return FalseLikely invalid or redundant comment.
tests/backups/test_manager.py (1)
1729-1748: LGTM! Comprehensive error handling tests.
The error handling tests effectively cover:
- Different error scenarios (EBUSY, EBADMSG)
- System health state verification
- Location-specific error handling
Proposed change
Follow up to #5438 , lot of changes based on subsequent discussions:
Type of change
Additional information
Checklist
ruff format supervisor tests)If API endpoints or add-on configuration are added/changed:
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests
These updates enhance the flexibility and robustness of the backup management system, improving user experience and reliability.