Automatic Recovery from Releases Stuck in PendingUpgrade #397

Open
@godhanipayal

Abstract

This proposal introduces an enhancement to Helm upgrade behavior. When a release becomes stuck in the PENDING_UPGRADE state, Helm currently blocks further operations on that release and offers no built-in recovery path.
This HIP proposes adding a --force-rollback-on-pending-upgrade flag (or similar) that lets Helm automatically and safely roll back to the last successful revision when it detects a release in PENDING_UPGRADE, unblocking operations without manual intervention.

Motivation

Currently, when a helm upgrade fails or is interrupted unexpectedly (e.g., crash, timeout), the Helm release may become stuck in a PENDING_UPGRADE state.
Once a release is in PENDING_UPGRADE:

  • Any subsequent helm upgrade, helm rollback, or helm uninstall on the same release fails.
  • Users receive an error like: "Another operation (upgrade/rollback) is in progress for release"
  • Users must manually delete or modify Helm storage (Secrets) to recover, which is a risky, manual operation.

This behavior breaks CI/CD pipelines, which are designed for automatic, unattended deployments.

When manual intervention is required:

  • Pipelines fail unexpectedly.
  • Automated rollout and recovery processes are halted.
  • Human operators must step in, leading to delays, production risks, and increased operational burden.

A native, safe Helm mechanism to auto-recover releases stuck in PENDING_UPGRADE would significantly improve reliability, automation, and user experience, especially in large-scale environments. As the major cloud providers roll out Kubernetes services across many regions, solving this problem becomes increasingly important for a smooth experience.

Proposal

Introduce a new flag for helm upgrade:

--force-rollback-on-pending-upgrade

Behavior when this flag is used:

  • Before starting an upgrade, Helm checks the current release's status.
  • If the release status is PENDING_UPGRADE, Helm:
      • Performs an automatic rollback to the last successful revision.
      • Logs a clear message: "Release was in PENDING_UPGRADE state. Rolling back to revision before proceeding."
      • After the rollback, proceeds with the requested upgrade.
  • If no previous successful revision is available, Helm should fail gracefully with a clear error message like: "No successful revision found to rollback for release . Manual intervention required."

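The control flow proposed above could be sketched roughly as follows. This is only an illustration in Python, not Helm's implementation or API: `Revision`, `upgrade_with_recovery`, `last_successful_revision`, and the `rollback`/`upgrade` callbacks are hypothetical names. It also assumes that a revision counts as "successful" if its status is `deployed` or `superseded`, which the proposal does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Status strings as Helm 3 reports them; which ones count as "successful"
# is an assumption of this sketch, not something the proposal defines.
PENDING_UPGRADE = "pending-upgrade"
SUCCESSFUL_STATUSES = ("deployed", "superseded")


@dataclass
class Revision:
    """Hypothetical stand-in for one entry of a release's history."""
    number: int
    status: str


class NoSuccessfulRevisionError(Exception):
    """Raised when there is nothing safe to roll back to."""


def last_successful_revision(history: List[Revision]) -> Optional[Revision]:
    """Return the highest-numbered revision with a successful status, if any."""
    candidates = [r for r in history if r.status in SUCCESSFUL_STATUSES]
    return max(candidates, key=lambda r: r.number) if candidates else None


def upgrade_with_recovery(
    release: str,
    history: List[Revision],
    rollback: Callable[[str, int], None],
    upgrade: Callable[[str], None],
    log: Callable[[str], None] = print,
) -> None:
    """Roll back first if the latest revision is stuck, then upgrade."""
    if history:
        current = max(history, key=lambda r: r.number)
        if current.status == PENDING_UPGRADE:
            target = last_successful_revision(history)
            if target is None:
                # Fail gracefully: no revision is safe to roll back to.
                raise NoSuccessfulRevisionError(
                    f"No successful revision found to rollback for release "
                    f"{release}. Manual intervention required."
                )
            log(
                "Release was in PENDING_UPGRADE state. "
                f"Rolling back to revision {target.number} before proceeding."
            )
            rollback(release, target.number)
    # Whether or not a rollback happened, continue with the requested upgrade.
    upgrade(release)
```

Passing the rollback and upgrade actions in as callbacks keeps the decision logic separate from the actual cluster operations, so the recovery path can be exercised without a live cluster.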