[FEATURE] RecoveryStrategy #336

@leeeunkoo

RecoveryStrategy Implementation Specification

📝 Requirement Description

  • Implement a comprehensive RecoveryStrategy framework that manages recovery processes for all components according to their Criticality levels
  • The RecoveryStrategy must correctly handle state transitions during recovery operations
  • The system should be resilient, with self-recovery capabilities across different failure scenarios
  • To meet software quality goals and earn the Eclipse Foundation Gold Badge, test code coverage shall be over 80%
  • To reduce potential issues and risks, keep build warnings at 10 or fewer
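
As a starting point for the state-transition requirement, here is a minimal sketch of a transition check. The state names (`Running`, `Failed`, `Recovering`) and the allowed transitions are assumptions made for illustration; the actual states should come from the StateManager LLD.

```rust
/// Hypothetical recovery states. The real set is defined by the StateManager.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecoveryState {
    Running,
    Failed,
    Recovering,
}

impl RecoveryState {
    /// Returns true if moving from `self` to `next` is allowed in this sketch.
    pub fn can_transition_to(self, next: RecoveryState) -> bool {
        use RecoveryState::*;
        matches!(
            (self, next),
            (Running, Failed)            // a failure is detected
                | (Failed, Recovering)   // recovery starts
                | (Recovering, Running)  // recovery succeeded
                | (Recovering, Failed)   // recovery failed and may be retried
        )
    }
}

/// Applies a transition and rejects anything the table above does not allow.
pub fn transition(
    current: RecoveryState,
    next: RecoveryState,
) -> Result<RecoveryState, String> {
    if current.can_transition_to(next) {
        Ok(next)
    } else {
        Err(format!("invalid transition: {:?} -> {:?}", current, next))
    }
}
```

Keeping the allowed transitions in one `matches!` table makes it straightforward to cover each pair in a unit test.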

📋 Acceptance Criteria

  • All recovery functions and strategies are implemented
  • RecoveryStrategy is integrated with StateManager for state persistence and retrieval
  • Criticality-specific recovery time objectives (RTO) are implemented
  • Recovery actions execute asynchronously to prevent blocking
  • Recovery policies are correctly applied according to component types
  • The implementation runs without errors under all test scenarios
  • Code coverage measured with cargo tarpaulin is over 80%
  • The number of warnings reported by cargo build is 10 or fewer
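
The RTO and non-blocking criteria could be prototyped roughly as follows. This is only a sketch: the `Criticality` variants and the RTO durations are placeholders not taken from this issue, and a plain `std::thread` stands in for whatever async runtime the project uses. It also only measures the elapsed time after the action finishes, and does not cancel an action that overruns its RTO.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical criticality levels; this issue does not name them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Criticality {
    High,
    Medium,
    Low,
}

impl Criticality {
    /// Placeholder RTO values, chosen only for illustration.
    pub fn rto(self) -> Duration {
        match self {
            Criticality::High => Duration::from_millis(100),
            Criticality::Medium => Duration::from_millis(500),
            Criticality::Low => Duration::from_secs(2),
        }
    }
}

/// Whether a finished recovery action met the RTO of its criticality level.
#[derive(Debug, PartialEq, Eq)]
pub enum RecoveryOutcome {
    WithinRto,
    RtoExceeded,
}

/// Runs `action` on a worker thread so the caller is not blocked, and
/// returns a receiver that yields the outcome once the action completes.
pub fn spawn_recovery<F>(level: Criticality, action: F) -> mpsc::Receiver<RecoveryOutcome>
where
    F: FnOnce() + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let started = Instant::now();
        action();
        let outcome = if started.elapsed() <= level.rto() {
            RecoveryOutcome::WithinRto
        } else {
            RecoveryOutcome::RtoExceeded
        };
        // The receiver may already be dropped; ignore the send error then.
        let _ = tx.send(outcome);
    });
    rx
}
```

The caller can continue with other work and later call `recv()` (or `try_recv()`) on the returned receiver to learn whether the RTO was met.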

📎 Related Documents/References

  • PICCOLO Recovery System Documentation
  • Eclipse Pullpiri StateManager implementation
  • Criticality Safety Standards Documentation
  • StateManager LLD Documentation

📌 Subtasks

  • Implement RecoveryStrategy trait with core functionality
  • Develop recovery handlers for different resource types (Scenario, Package, Model, Container)
  • Implement state persistence and retrieval mechanisms using ETCD
  • Create recovery policies based on Criticality levels
  • Implement automatic recovery trigger mechanisms
  • Develop health monitoring integration for early issue detection
  • Implement recovery logging and metrics collection
  • Add tests for RecoveryStrategy
  • Add tests for StateManager recovery integration
  • Add tests for Criticality-level recovery policies
  • Add tests for recovery time objectives
  • Reduce warnings in the recovery code
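
One possible shape for the trait and the per-resource handlers from the first two subtasks is sketched below. The method names, the `RecoveryError` type, and the handler structs are hypothetical; only the four resource types come from this issue, and the handler bodies are stubs.

```rust
use std::fmt;

/// Resource types named in the subtasks.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResourceType {
    Scenario,
    Package,
    Model,
    Container,
}

/// Hypothetical error type for failed recovery attempts.
#[derive(Debug, PartialEq, Eq)]
pub struct RecoveryError(pub String);

impl fmt::Display for RecoveryError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "recovery failed: {}", self.0)
    }
}

impl std::error::Error for RecoveryError {}

/// Hypothetical trait shape: one implementation per resource type.
pub trait RecoveryStrategy {
    fn resource_type(&self) -> ResourceType;
    fn recover(&self, name: &str) -> Result<String, RecoveryError>;
}

pub struct ContainerRecovery;

impl RecoveryStrategy for ContainerRecovery {
    fn resource_type(&self) -> ResourceType {
        ResourceType::Container
    }

    fn recover(&self, name: &str) -> Result<String, RecoveryError> {
        if name.is_empty() {
            return Err(RecoveryError("empty container name".to_string()));
        }
        // Stub: a real handler would restart the container here.
        Ok(format!("restarted container {}", name))
    }
}

pub struct ModelRecovery;

impl RecoveryStrategy for ModelRecovery {
    fn resource_type(&self) -> ResourceType {
        ResourceType::Model
    }

    fn recover(&self, name: &str) -> Result<String, RecoveryError> {
        // Stub: a real handler would re-apply the model here.
        Ok(format!("reapplied model {}", name))
    }
}

/// Dispatches to the first registered handler for `kind`.
pub fn recover_with(
    handlers: &[Box<dyn RecoveryStrategy>],
    kind: ResourceType,
    name: &str,
) -> Result<String, RecoveryError> {
    handlers
        .iter()
        .find(|h| h.resource_type() == kind)
        .ok_or_else(|| RecoveryError(format!("no handler for {:?}", kind)))?
        .recover(name)
}
```

Using trait objects keeps the dispatcher independent of concrete handlers, so Scenario and Package handlers can be added later without changing `recover_with`.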

🧪 Testing Plan

  • Unit Test:

    • Test RecoveryStrategy trait implementation
    • Test individual recovery handlers
    • Test Criticality-level recovery policies
    • Test recovery time objectives for different Criticality levels
    • Test state transitions during recovery
  • Integration Test:

    • Test RecoveryStrategy integration with StateManager
    • Test recovery flows across different component failures
    • Test recovery process with ETCD persistence
    • Test recovery under high load conditions
  • Performance Test:

    • Measure recovery times for different Criticality levels
    • Test recovery under resource constraints
    • Test concurrent recovery operations
    • Verify recovery time objectives are met under various conditions
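
For the unit tests that touch persistence, it may help to hide ETCD behind a small storage trait so the recovery flow can be exercised without a running cluster. The sketch below assumes a hypothetical `StateStore` trait, an in-memory stand-in, and a made-up key layout (`recovery/<component>/state`); none of these are the actual StateManager or etcd client API.

```rust
use std::collections::HashMap;

/// Hypothetical storage abstraction; not the StateManager or etcd client API.
pub trait StateStore {
    fn put(&mut self, key: &str, value: &str);
    fn get(&self, key: &str) -> Option<String>;
}

/// In-memory stand-in for ETCD, intended for unit tests only.
#[derive(Default)]
pub struct InMemoryStore {
    entries: HashMap<String, String>,
}

impl StateStore for InMemoryStore {
    fn put(&mut self, key: &str, value: &str) {
        self.entries.insert(key.to_string(), value.to_string());
    }

    fn get(&self, key: &str) -> Option<String> {
        self.entries.get(key).cloned()
    }
}

/// Simulates a successful recovery: persists "recovering", then "running",
/// under an assumed key layout, and returns the states in the order written.
pub fn recover_and_persist<S: StateStore>(store: &mut S, component: &str) -> Vec<String> {
    let key = format!("recovery/{}/state", component);
    let mut history = Vec::new();
    for state in ["recovering", "running"] {
        store.put(&key, state);
        history.push(state.to_string());
    }
    history
}
```

Integration tests would then swap `InMemoryStore` for an ETCD-backed implementation of the same trait and assert on the same keys.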

📊 Test Results
