[FEATURE] Clear Monitoring Metrics when Node Failure/Disconnection is detected #349

@daeyoung-jeong-lge

Description

📝 Requirement Description

Currently, MonitoringServer has no way to know whether a Node is alive. When a Node fails or is disconnected, its latest metrics remain in ETCD and continue to be shown on the Dashboard.
To remove these stale metrics from the Dashboard, MonitoringServer shall know the status of each Node and clear the related information from ETCD.
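
The clearing step can be sketched as a prefix delete. This is only an illustration in Python: a plain dict stands in for ETCD, and the key layout `/metric/<node_name>/<metric_name>` plus the names `node_prefix` and `clear_node_metrics` are assumptions made for the example, not the actual Pullpiri schema or API.

```python
from typing import Dict, List

# Hypothetical key layout: /metric/<node_name>/<metric_name>.
# The real Pullpiri key schema is not given in this issue.
METRIC_PREFIX = "/metric"


def node_prefix(node_name: str) -> str:
    """Build the key prefix that owns every metric of one Node."""
    # The trailing slash keeps "node1" from also matching "node10".
    return f"{METRIC_PREFIX}/{node_name}/"


def clear_node_metrics(store: Dict[str, str], node_name: str) -> List[str]:
    """Delete every key under the Node's prefix and return the removed keys."""
    prefix = node_prefix(node_name)
    # Collect first, then delete, so the dict is not mutated while iterating.
    stale = [key for key in store if key.startswith(prefix)]
    for key in stale:
        del store[key]
    return stale
```

Against a real ETCD, the same idea would likely map to a single prefix (range) delete rather than a scan in the client, which avoids leaving a partially cleared Node on the Dashboard.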

Pullpiri already has a heartbeat mechanism between APIServer and NodeAgent for checking liveness, but it is not mature yet. Once it is fully implemented, we can add a gRPC message from APIServer to MonitoringServer that carries the liveness information of Nodes, so that MonitoringServer can erase the affected Nodes' information from ETCD.
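
To make the proposed flow concrete, here is a rough Python sketch of how heartbeat timestamps could be turned into a liveness report. Everything named here (`NodeLiveness`, `HeartbeatTracker`, `failed_nodes`, the timeout value) is hypothetical; the actual gRPC message fields and the heartbeat timeout policy are still to be defined.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class NodeLiveness:
    """Hypothetical payload of the proposed APIServer -> MonitoringServer message."""
    node_name: str
    alive: bool


class HeartbeatTracker:
    """Tracks the last heartbeat time per Node (APIServer side, assumed design)."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._last_seen: Dict[str, float] = {}

    def record(self, node_name: str, now: float) -> None:
        """Remember when a heartbeat from this Node last arrived."""
        self._last_seen[node_name] = now

    def report(self, now: float) -> List[NodeLiveness]:
        """A Node counts as alive if its last heartbeat is within the timeout."""
        return [
            NodeLiveness(name, (now - seen) <= self.timeout_s)
            for name, seen in sorted(self._last_seen.items())
        ]


def failed_nodes(report: List[NodeLiveness]) -> List[str]:
    """Names of Nodes whose metrics MonitoringServer would need to clear."""
    return [entry.node_name for entry in report if not entry.alive]
```

Passing `now` explicitly instead of reading the clock inside the tracker keeps the timeout logic deterministic, which should make the unit tests in the testing plan easier to write.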

📋 Acceptance Criteria

  • All functions are implemented
  • Runs without errors

📎 Related Documents/References

related to TBD

📌 Subtasks

  • Add a gRPC message from APIServer to MonitoringServer that carries Node liveness information
  • Clear monitoring metrics from ETCD when a Node failure or disconnection is detected

🧪 Testing Plan

  • Unit Test:
  • Integration Test:
  • Performance Test:

📊 Test Results
