Summary
After upgrading from RunsOn v2.12.6 (CloudFormation) to v3.1.1 (Terraform flex module), runs-on/snapshot stopped restoring the default-branch snapshot on pull-request runs. Every PR's first run now starts from a blank volume instead of inheriting the default branch's cache.
Root cause: the v3 flexd control-plane writes "defaultBranch": "" (empty) into the per-runner config at $RUNS_ON_HOME/config.json. The runs-on/snapshot action only performs its "fall back to the default branch" restore when that field is non-empty, so the fallback is silently disabled.
Our default branch is main — i.e. config.json should contain "defaultBranch": "main", and it did under v2.12.6.
Environment
| Component | Value |
| --- | --- |
| RunsOn (before, working) | v2.12.6, CloudFormation install |
| RunsOn (after, broken) | v3.1.1, Terraform flex module 3.1.1 (also reproduces conceptually on 3.1.2) |
| Control-plane service | flexd (ECS), app_version: v3.1.1 |
| runs-on/snapshot action | v1.1.1 (commit d3bcc42), unchanged across the upgrade |
| Runner | linux / amd64 |
| Default branch | main |
Expected vs actual
runs-on/snapshot documents this restore order:
1. snapshot for the current branch
2. else snapshot for the repository default branch
3. else a blank volume
For a PR, step 1 can never match on the first run — the ref is the PR merge ref (<PR>/merge), unique per PR. So PRs rely entirely on step 2.
- Expected: step 2 restores the main snapshot (which exists and is tagged runs-on-snapshot-branch=main).
- Actual: step 2 is skipped because defaultBranch is empty, and a blank volume is created.
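To make the expected and actual outcomes concrete, here is a minimal sketch of the documented restore order. It is not the action's code: pickSnapshot, the snapshotsByBranch map, the snapshot ID snap-main, and the PR ref 123/merge are all hypothetical names invented for illustration. It only assumes that the default-branch step is gated on a non-empty defaultBranch, as described in this report.

```go
package main

import "fmt"

// snapshotsByBranch is a hypothetical stand-in for the snapshot lookup:
// it maps a runs-on-snapshot-branch tag value to the latest snapshot ID.
type snapshotsByBranch map[string]string

// pickSnapshot follows the documented restore order. It returns the snapshot
// ID to restore and the step that matched; an empty ID means a blank volume.
func pickSnapshot(snaps snapshotsByBranch, ref, defaultBranch string) (string, string) {
	// Step 1: snapshot for the current branch (the git ref of this run).
	if id, ok := snaps[ref]; ok {
		return id, "current-branch"
	}
	// Step 2: snapshot for the default branch, only if the value is non-empty.
	if defaultBranch != "" {
		if id, ok := snaps[defaultBranch]; ok {
			return id, "default-branch"
		}
	}
	// Step 3: nothing matched, so a blank volume is created.
	return "", "blank"
}

func main() {
	// Only the default branch has a snapshot, as on a PR's first run.
	snaps := snapshotsByBranch{"main": "snap-main"}

	// Compare the v2 behaviour ("main") with the v3 behaviour ("").
	for _, defaultBranch := range []string{"main", ""} {
		id, source := pickSnapshot(snaps, "123/merge", defaultBranch)
		fmt.Printf("defaultBranch=%q -> id=%q source=%s\n", defaultBranch, id, source)
	}
}
```

With defaultBranch set to main the sketch picks the main snapshot through step 2; with an empty defaultBranch the same inputs fall through to a blank volume, which matches what we see on PR runs.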
Where it's gated (action side — for reference)
In runs-on/snapshot (d3bcc42), the fallback is conditional on a non-empty value read from the control-plane-provided config:
// internal/snapshot/restore.go
} else if s.config.RunnerConfig.DefaultBranch != "" { // ← fallback only runs if non-empty
// search snapshots tagged with the default branch
}
// internal/config/config.go
configBytes, _ := os.ReadFile(filepath.Join(os.Getenv("RUNS_ON_HOME"), "config.json"))
// → RunnerConfig.DefaultBranch (json key "defaultBranch")
The action is behaving correctly given its input; the input is empty.
Evidence
1. Runner-side: config.json has an empty defaultBranch
The action logs PrettyPrint(cfg.RunnerConfig) on every run. customTags populate correctly (so the control-plane is writing the file), but defaultBranch is empty:
Runner config: {
"defaultBranch": "",
"customTags": [ { "key": "runs-on-stack-name", "value": "<redacted>" }, ... ]
}
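As a sketch of why this output disables the fallback, the snippet below decodes a config of the same shape and applies the same non-empty check. The runnerConfig and tag types model only the two fields visible in the logged output, and fallbackEnabled is a hypothetical helper; the real struct in the action may look different.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tag and runnerConfig model only the fields seen in the logged config.
// They are assumptions for illustration, not the action's actual types.
type tag struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

type runnerConfig struct {
	DefaultBranch string `json:"defaultBranch"`
	CustomTags    []tag  `json:"customTags"`
}

// fallbackEnabled reports whether the default-branch fallback could run
// for the given config.json contents (hypothetical helper).
func fallbackEnabled(raw []byte) (bool, error) {
	var cfg runnerConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return false, err
	}
	return cfg.DefaultBranch != "", nil
}

func main() {
	// Shape observed under v3.1.1: tags are populated, defaultBranch is empty.
	broken := []byte(`{"defaultBranch": "", "customTags": [{"key": "runs-on-stack-name", "value": "<redacted>"}]}`)
	// Shape we expect, and had under v2.12.6.
	working := []byte(`{"defaultBranch": "main", "customTags": []}`)

	for _, raw := range [][]byte{broken, working} {
		ok, err := fallbackEnabled(raw)
		fmt.Println("fallback enabled:", ok, "error:", err)
	}
}
```

The file parses without error in both cases, so nothing is logged as a failure; the only difference is the empty string, which is why the regression is silent.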
2. Default-branch run works; PR run does not
On a push to the default branch, the ref is main, so it matches its own branch-tagged snapshot via step 1 and never needs the fallback:
RestoreSnapshot: Using git ref: main
RestoreSnapshot: Found latest snapshot snap-xxxx for branch main
CreateSnapshot: Using git ref: main → Snapshot created: snap-yyyy (~88% savings)
On a PR the fallback is needed but is skipped:
RestoreSnapshot: Using git ref: <PR>/merge
RestoreSnapshot: Searching ... tag:runs-on-snapshot-branch = ["<PR>/merge"]
RestoreSnapshot: Creating a new blank volume
This means the main snapshot is healthy and present; PRs simply can't reach it because the fallback never fires.