Description
Abstract
This proposal introduces an enhancement to Helm upgrade behavior. When a release becomes stuck in the PENDING_UPGRADE state, Helm currently blocks further operations on that release and offers no built-in way to recover.
This HIP proposes adding a --force-rollback-on-pending-upgrade flag (or similar) to allow automatic safe rollback to the last successful revision when a release is detected in PENDING_UPGRADE, thereby unblocking operations without requiring manual intervention.
Motivation
Currently, when a helm upgrade fails or is interrupted unexpectedly (e.g., crash, timeout), the Helm release may become stuck in a PENDING_UPGRADE state.
Once a release is in PENDING_UPGRADE:
- Any subsequent helm upgrade, helm rollback, or helm uninstall on the same release fails.
- Users receive an error like:
  "Another operation (upgrade/rollback) is in progress for release"
- Users must manually delete or modify Helm storage (Secrets) to recover, which is risky and error-prone.
This behavior breaks CI/CD pipelines, which are designed for automatic, unattended deployments.
When manual intervention is required:
- Pipelines fail unexpectedly.
- Automated rollout and recovery processes are halted.
- Human operators must step in, leading to delays, production risks, and increased operational burden.
A native, safe Helm mechanism to auto-recover releases stuck in PENDING_UPGRADE would improve reliability, automation, and user experience, especially in large-scale environments. As the major cloud providers roll out Kubernetes services across multiple regions, a stuck release in any one of them can stall an automated rollout, so this problem becomes increasingly important to solve.
Proposal
Introduce a new flag for helm upgrade:
--force-rollback-on-pending-upgrade
Behavior when this flag is used:
- Before starting an upgrade, Helm checks the current release's status.
- If the release status is PENDING_UPGRADE:
  - Perform an automatic rollback to the last successful revision.
  - Log a clear message:
    "Release was in PENDING_UPGRADE state. Rolling back to revision before proceeding."
  - After rollback, proceed with the requested upgrade.
- If no previous successful revision is available, Helm should fail gracefully with a clear error message like:
  "No successful revision found to rollback for release . Manual intervention required."
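The decision logic described above could be sketched roughly as follows. This is a minimal, standalone illustration and not Helm's actual implementation: the `Revision` type, the `rollbackTarget` function, the error values, and the status constants are all hypothetical names introduced here, and the status strings simply follow the spelling used in this proposal. It also assumes that a revision counts as "successful" if its status is DEPLOYED or SUPERSEDED, which the proposal does not specify.

```go
package main

import (
	"errors"
	"fmt"
)

// Status names follow the spelling used in this proposal. They are
// illustrative only and are not taken from Helm's source code.
const (
	StatusDeployed       = "DEPLOYED"
	StatusSuperseded     = "SUPERSEDED"
	StatusFailed         = "FAILED"
	StatusPendingUpgrade = "PENDING_UPGRADE"
)

// Revision is a hypothetical, simplified view of one release revision.
type Revision struct {
	Number int
	Status string
}

// ErrOperationInProgress models today's behavior when the flag is not set.
var ErrOperationInProgress = errors.New("another operation is in progress for release")

// ErrNoSuccessfulRevision models the proposed "fail gracefully" case.
var ErrNoSuccessfulRevision = errors.New("no successful revision found; manual intervention required")

// rollbackTarget returns the revision number to roll back to before the
// upgrade proceeds, or 0 if no rollback is needed. history must be ordered
// from oldest to newest. forceRollback stands in for the proposed flag.
func rollbackTarget(history []Revision, forceRollback bool) (int, error) {
	if len(history) == 0 {
		return 0, nil // no recorded revisions, nothing to recover
	}
	latest := history[len(history)-1]
	if latest.Status != StatusPendingUpgrade {
		return 0, nil // release is not stuck; upgrade as usual
	}
	if !forceRollback {
		return 0, ErrOperationInProgress
	}
	// Walk backwards to find the most recent revision assumed successful.
	for i := len(history) - 2; i >= 0; i-- {
		s := history[i].Status
		if s == StatusDeployed || s == StatusSuperseded {
			return history[i].Number, nil
		}
	}
	return 0, ErrNoSuccessfulRevision
}

func main() {
	history := []Revision{
		{Number: 1, Status: StatusSuperseded},
		{Number: 2, Status: StatusDeployed},
		{Number: 3, Status: StatusPendingUpgrade},
	}
	target, err := rollbackTarget(history, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if target > 0 {
		fmt.Printf("release stuck in %s, rolling back to revision %d first\n", StatusPendingUpgrade, target)
	}
	fmt.Println("proceeding with upgrade")
}
```

Keeping the check as a pure function over the release history makes the three outcomes (no rollback needed, roll back to a given revision, or fail with a clear error) easy to test in isolation before any cluster state is touched.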