Fix #1: [0 BOUNTY] [Python] Add retry/backoff to health_check.py for tran #17
Open
Nexussyn wants to merge 1 commit into
Closes #1
Summary
This PR adds a generic retry decorator with exponential backoff and applies it to all health check functions in health_check.py, so that transient failures are handled gracefully while existing functionality is preserved.
Changes
Added retry mechanism with exponential backoff
retry_with_backoff decorator that wraps health check functions
Updated health check functions
All health check functions now have retry support:
check_cpu_health() - retries on psutil errors and extreme spike readings
check_memory_health() - retries on psutil errors
check_disk_health() - retries on psutil, file, and permission errors
check_network_health() - retries on timeout and connection errors
Added TransientError exception class
Custom exception class to identify recoverable failures that should trigger retries.
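For reviewers who want a concrete picture, here is a minimal sketch of how a retry decorator with exponential backoff and a TransientError class could fit together. It is not the code from this PR: the parameter names (max_attempts, base_delay, max_delay, retry_on, sleep), their defaults, and the flaky_check function are assumptions made for illustration only.

```python
import functools
import time


class TransientError(Exception):
    """Marks a failure that is expected to clear up if the call is retried."""


def retry_with_backoff(max_attempts=3, base_delay=0.5, max_delay=8.0,
                       retry_on=(TransientError,), sleep=time.sleep):
    """Retry the wrapped function, doubling the delay after each failure.

    All parameter names and defaults are assumptions for this sketch.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # retries exhausted: surface the last error
                    # Delays grow as base, 2*base, 4*base, ... capped at max_delay.
                    delay = min(base_delay * 2 ** (attempt - 1), max_delay)
                    sleep(delay)
        return wrapper
    return decorator


# Usage: a hypothetical check that fails twice, then succeeds.
calls = {"count": 0}
delays = []  # records the requested delays instead of actually sleeping


@retry_with_backoff(max_attempts=4, base_delay=0.1, sleep=delays.append)
def flaky_check():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary failure")
    return {"status": "healthy"}


result = flaky_check()
# delays == [0.1, 0.2]: one backoff per failed attempt
```

Passing the sleep function in as a parameter is a design choice of this sketch: it lets a test record the backoff schedule without waiting in real time.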
Added type hints
Full type annotations added to all functions for better IDE support and code clarity.
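As a rough illustration of what fully annotated health check code can look like, the sketch below types both the arguments and the returned dictionary. It is an assumption-laden stand-in, not the PR's code: the DiskHealth type, the disk_health_sketch function, and the warn_percent parameter are hypothetical, and it uses the standard library's shutil.disk_usage in place of psutil.

```python
import shutil
from typing import TypedDict


class DiskHealth(TypedDict):
    """Shape of the result; field names are assumptions for this sketch."""
    status: str
    percent_used: float


def disk_health_sketch(path: str = "/", warn_percent: float = 90.0) -> DiskHealth:
    """Report disk usage for `path` as a typed dictionary."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    # Guard against a zero-sized filesystem to avoid dividing by zero.
    percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
    status = "healthy" if percent_used < warn_percent else "warning"
    return {"status": status, "percent_used": percent_used}


report = disk_health_sketch(".")
```

With the return type declared as a TypedDict, an editor or type checker can flag a misspelled key such as report["stauts"] before the code runs.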
Preserved existing functionality
retries_exhausted flag when applicable
Testing
python3 build.py
Diagnostic Artifacts
See attached diagnostic/build-XXX.logd file.