fix(logserver): Prevent race condition during reconciliation

Resolves an issue where the logserver StatefulSet would hang for over
10 minutes during a rollout, a problem unique to this component. Pod
events showed repeated "FailedMount" warnings, eventually timing out
with the error: "Unable to attach or mount volumes: timed out waiting
for the condition".

The root cause was a race condition between the operator's reconciliation
loop and the Kubernetes controller managing the StatefulSet update.
Immediately after triggering a rollout, the operator would proceed to
reconcile the associated PVC. This interfered with the kubelet's process
of detaching the volume from the old pod and attaching it to the new one,
causing the prolonged timeout.

This commit fixes the race condition by ensuring that the DeployLogserver
function exits its reconciliation loop immediately after a StatefulSet
update has been triggered. This gives the Kubernetes volume controller
uninterrupted time to manage the PVC handover.
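
The early-return pattern described above can be sketched roughly as follows.
This is a simplified illustration, not the operator's actual code: the
fakeCluster type, its methods, and the lowercase deployLogserver function are
hypothetical stand-ins, and the real DeployLogserver works against the
Kubernetes API rather than an in-memory struct.

```go
package main

// statefulSetSpec is a hypothetical stand-in for the desired StatefulSet state.
type statefulSetSpec struct {
	Image string
}

// fakeCluster is a hypothetical in-memory stand-in for the cluster. It records
// the current StatefulSet spec and how often the PVC was reconciled.
type fakeCluster struct {
	current       statefulSetSpec
	pvcReconciles int
}

// ensureStatefulSet applies the desired spec and reports whether an update
// (and therefore a rollout) was triggered.
func (c *fakeCluster) ensureStatefulSet(desired statefulSetSpec) bool {
	if c.current == desired {
		return false
	}
	c.current = desired
	return true
}

// reconcilePVC stands in for the PVC reconciliation step.
func (c *fakeCluster) reconcilePVC() {
	c.pvcReconciles++
}

// deployLogserver returns true once the component is fully reconciled.
// When a StatefulSet update has just been triggered, it returns false
// immediately and leaves the PVC untouched, so the volume handover between
// the old and new pod is not disturbed. The caller is expected to requeue.
func deployLogserver(c *fakeCluster, desired statefulSetSpec) bool {
	if updated := c.ensureStatefulSet(desired); updated {
		return false // rollout in progress: skip the rest of this pass
	}
	c.reconcilePVC()
	return true
}
```

In this sketch the first call after a spec change returns false without
touching the PVC; a later call, once the spec matches, reconciles the PVC and
returns true.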

This change aligns the logserver controller's behavior with all other
StatefulSet controllers in the operator, which already followed this
pattern, correcting a historical inconsistency that only affected logserver.

Change-Id: I164ef03e0e4ef8557a1ec5effb0415b77a7c053f