improve update/upgrade --org: version tags, current version display, unified repo discovery#41627
Conversation
…or current version Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. |
|
✅ Test Quality Sentinel completed test quality analysis. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
There was a problem hiding this comment.
Pull request overview
This PR improves the org-wide gh aw update / gh aw upgrade UX and consistency by unifying repository discovery, surfacing friendlier version identifiers during updates, and showing the current compiler version during upgrades.
Changes:
- Unifies org repo discovery for both update/upgrade using a GitHub code search for
.github/workflows/*.lock.yml. - Adds tag-name resolution for SHA refs in
update --orgpreviews (with per-source-repo caching). - Adds an upgrade scan phase to detect workflow count and current
compiler_versionfrom lock metadata, and appends workflow counts to progress messages.
Show a summary per file
| File | Description |
|---|---|
| pkg/cli/upgrade_org.go | Switches upgrade-org to lock-file discovery, adds scan phase + report including current compiler version and workflow counts. |
| pkg/cli/upgrade_org_test.go | Updates org-upgrade tests for the new discovery and scan behavior; adds a version-display assertion. |
| pkg/cli/update_version_labels.go | New helper to resolve commit SHAs to tag names via GitHub tags API with caching. |
| pkg/cli/update_org.go | Stores/prints version labels (tag/branch/short SHA) instead of always shortening refs in reports and issue bodies; adds CurrentVersion field to preview struct. |
| pkg/cli/update_org_search.go | Changes update-org discovery to search for .lock.yml files (aligning with upgrade). |
| pkg/cli/org_runner.go | Appends per-repo workflow counts to apply/issue progress messages. |
Review details
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 6/6 changed files
- Comments generated: 4
- Review effort level: Low
| if r.CurrentVersion != "" { | ||
| target := "" | ||
| if targetVersion != "" && r.CurrentVersion != targetVersion { | ||
| target = " -> " + targetVersion | ||
| } | ||
| versionPart = fmt.Sprintf(" (v%s%s)", r.CurrentVersion, target) | ||
| } else if r.TotalWorkflows > 0 { |
| if version == "" { | ||
| return "" | ||
| } | ||
| return ", compiler: v" + version | ||
| } |
| tagMap := loadRepoTagMap(ctx, sourceRepo) | ||
|
|
||
| versionLabelMu.Lock() | ||
| versionLabelCache[sourceRepo] = tagMap | ||
| versionLabelMu.Unlock() | ||
|
|
||
| if tag, ok := tagMap[ref]; ok { | ||
| return tag | ||
| } | ||
| return shortRef(ref) |
| // searchOrgWorkflowRepos searches an organization's repositories for compiled | ||
| // agentic workflow lock files (.lock.yml) in .github/workflows, which indicates | ||
| // the repository has source-managed agentic workflows eligible for bulk updates. |
…display Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (388 new lines in 📄 Draft ADR committed:
📋 What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. ❓ Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
|
🧪 Test Quality Sentinel Report✅ Test Quality Score: 82/100 — Excellent
📊 Metrics & Test Classification (10 tests analyzed)
Go: 10 (
|
There was a problem hiding this comment.
REQUEST_CHANGES — one confirmed display bug plus several medium-severity structural issues.
Blocking
Double v prefix in version display (upgrade_org.go lines 121, 188): lock files store "compiler_version":"v0.1.2" with a v prefix, but both renderOrgUpgradeReport and formatCurrentVersionSuffix unconditionally prepend another "v". Production output will read (vv0.1.2 -> vv1.4.0). The new test uses a synthetic "1.2.3" value (without the prefix) and therefore does not catch this.
Other issues flagged
countWorkflowMDFilescounts all.mdfiles (line 139): includesREADME.md, docs, etc. — repos without real workflows could be misclassified as upgrade targets.FindGitRoot()+ensureUpdateTargetRepoGitignore()called once per repo (line 61): both are constant for a run; should be computed once and threaded in.- Each repo is shallow-cloned twice (line 72): scan phase clones then apply phase clones again — doubles network+disk per repo for large org runs.
- Tag fetch capped at 100 (
update_version_labels.goline 71): repos with >100 releases silently fall back to short SHAs even for tagged commits. versionLabelCacheglobal never cleared (line 21): stale entries survive multiple invocations; test isolation depends on execution order.- Multiple tags on same SHA (line 86): last-write-wins — may prefer an alias over a semver tag.
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
patchdiff.githubusercontent.com
To allow these domains, add them to the
network.allowedlist in your workflow frontmatter:
network:
allowed:
- defaults
- "patchdiff.githubusercontent.com"See Network Configuration for more information.
🔎 Code quality review by PR Code Quality Reviewer · 125.3 AIC · ⌖ 8.24 AIC · ⊞ 5.2K
| if targetVersion != "" && r.CurrentVersion != targetVersion { | ||
| target = " -> " + targetVersion | ||
| } | ||
| versionPart = fmt.Sprintf(" (v%s%s)", r.CurrentVersion, target) |
There was a problem hiding this comment.
Double v prefix: version displays as vv1.2.3 on real lock files. CurrentVersion comes from meta.CompilerVersion, which the lock schema stores with a v prefix (e.g. "v0.1.2" — confirmed by lock_schema_test.go line 338). Prepending another "v" in the format string produces (vv0.1.2 -> vv1.4.0). The test TestRunUpgradeForOrgDryRunShowsVersion uses the synthetic value "1.2.3" (no prefix), so it passes while masking the production bug. Same issue on line 188 (", compiler: v" + version).
💡 Suggested fix
Strip any existing v prefix before formatting, e.g.:
func normalizeVersion(v string) string {
return strings.TrimPrefix(v, "v")
}Then use normalizeVersion(r.CurrentVersion) and normalizeVersion(version) at both call sites. Also update the test to use CurrentVersion: "v1.2.3" to match actual lock-file data and prevent future regressions.
| count := 0 | ||
| for _, e := range entries { | ||
| name := e.Name() | ||
| if !e.IsDir() && strings.HasSuffix(name, ".md") { |
There was a problem hiding this comment.
countWorkflowMDFiles counts any .md file, including README.md. The function matches strings.HasSuffix(name, ".md") with no further validation. A repo that stores a README.md (or any documentation markdown) under .github/workflows/ will be counted as having a workflow file, causing scanUpgradeRepo to treat it as an upgrade candidate and report a bogus workflow count. The original .md-search query explicitly excluded README (NOT filename:README), but that guard is now gone.
💡 Suggested fix
Add a README.md exclusion or, better, check for the presence of a source: frontmatter field to distinguish agentic workflow files from plain markdown:
if !e.IsDir() && strings.HasSuffix(name, ".md") && !strings.EqualFold(name, "README.md") {
count++
}A stronger guard would check that each .md file actually contains a source: field (consistent with how update --org ultimately validates files), but the simple exclusion above closes the immediate false-positive case.
| // It returns (preview, true, nil) when the repo has workflows, or | ||
| // (orgRepoPreview{}, false, nil) when none are found. | ||
| func scanUpgradeRepo(ctx context.Context, repo string, verbose bool) (orgRepoPreview, bool, error) { | ||
| gitRoot, err := gitutil.FindGitRoot() |
There was a problem hiding this comment.
FindGitRoot() and ensureUpdateTargetRepoGitignore() are called once per repo during the org scan loop. For an org with N repos, this results in N redundant filesystem traversals and N potential .gitignore writes before any clone even starts. Both results are constant for the entire scan run and should be computed once before the loop and threaded in.
💡 Suggested fix
Extract the one-time setup into the caller or accept gitRoot/updatesDir as parameters:
func scanUpgradeRepo(ctx context.Context, repo, gitRoot, updatesDir string, verbose bool) (orgRepoPreview, bool, error) {
checkoutDir := filepath.Join(updatesDir, sanitizeRepoPath(repo))
// ... rest of function
}The ScanFn wrapper in runUpgradeForOrg computes gitRoot/updatesDir once before calling the loop. This mirrors the same issue that already exists in runUpgradeForTargetRepo (line 194), so fixing both together would be cleaner.
| } | ||
|
|
||
| checkoutDir := filepath.Join(updatesDir, sanitizeRepoPath(repo)) | ||
| if err := shallowCloneTargetRepo(ctx, repo, checkoutDir); err != nil { |
There was a problem hiding this comment.
Each repo is shallow-cloned twice: once in scanUpgradeRepo and again in runUpgradeForTargetRepo. The scan clone reads version metadata, then the apply path independently calls shallowCloneTargetRepo into the same checkoutDir. At minimum this doubles network + disk I/O per repo. For large orgs this will make the apply phase materially slower.
💡 Suggested fix
Options, in order of preference:
- Reuse the scan checkout: have
runUpgradeForTargetRepoaccept an optional pre-existingcheckoutDirand skip the clone when it already exists at the right ref. - Skip the scan clone: store only the metadata fields (
TotalWorkflows,CurrentVersion) from the org search results rather than doing a full clone; defer the clone entirely to the apply phase.
Either way, the orgRepoPreview struct already carries TotalWorkflows and CurrentVersion so the apply path can read those without needing the checkout.
| return tagMap | ||
| } | ||
|
|
||
| endpoint := fmt.Sprintf("/repos/%s/%s/tags?per_page=100", owner, repoName) |
There was a problem hiding this comment.
Tag resolution is capped at the first 100 tags with no pagination. For source repos that have released more than 100 versions (e.g. a prolific library), commits pinned to releases beyond the first page will always fall back to a short SHA. Worse, this happens silently — the preview will show abc1234 -> v2.5.0 even if abc1234 is itself a tagged release (say v1.8.0), misleadingly implying the workflow is on an untagged commit.
💡 Suggested fix
Either paginate through all tags (up to a reasonable limit, e.g. 5 pages), or use the commits-to-tag API to look up a specific SHA directly:
GET /repos/{owner}/{repo}/git/refs/tags
Or, since GitHub code-search results may already carry tag information, query /repos/{owner}/{repo}/commits/{sha}/tags (if available for the GH version in use). At minimum, log a warning when len(tags) == 100 so users know the result may be incomplete.
| } `json:"commit"` | ||
| } | ||
|
|
||
| var ( |
There was a problem hiding this comment.
versionLabelCache is a package-global that is never cleared. It accumulates entries for every source repo resolved during a process lifetime. Two risks: (1) stale entries — if a user runs update --org twice in the same session after a new tag is published, the second run uses the cached (outdated) tag map and may show wrong labels; (2) test pollution — tests that call previewOrgRepoUpdates (directly or via integration paths) will silently share cached tag maps, making test execution order matter.
💡 Suggested fix
Expose a clearVersionLabelCache() function (or make the cache an argument) and call it in tests:
// clearVersionLabelCache resets the in-process tag-resolution cache.
// Intended for use in tests and between command invocations.
func clearVersionLabelCache() {
versionLabelMu.Lock()
versionLabelCache = make(map[string]map[string]string)
versionLabelMu.Unlock()
}Call it at the start of each test that exercises the tag-resolution path, or make the cache struct-scoped so it cannot leak across tests.
| sha := strings.TrimSpace(t.Commit.SHA) | ||
| name := strings.TrimSpace(t.Name) | ||
| if sha != "" && name != "" { | ||
| tagMap[sha] = name |
There was a problem hiding this comment.
When multiple tags point at the same commit SHA, the one stored is API-response order dependent. tagMap[sha] = name is a last-write-wins overwrite. GitHub returns tags newest-first by default, so for a SHA with both a release tag (v1.2.3) and a lightweight alias (latest), whichever appears last in the 100-element response "wins" — which may be the less meaningful name.
💡 Suggested fix
Prefer semver tags over non-semver names when there is a collision:
if existing, ok := tagMap[sha]; !ok || !isSemver(existing) {
tagMap[sha] = name
}where isSemver checks for a vX.Y.Z pattern. This ensures v1.2.3 wins over latest or stable aliases.
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /tdd, /zoom-out — requesting changes on one confirmed bug and several quality gaps.
📋 Key Themes & Highlights
Key Themes
- Bug: missing
vprefix on target version —renderOrgUpgradeReportformatsCurrentVersionwith avprefix buttargetVersion(fromGetVersion()) is inserted without one, producing(v1.2.3 -> 1.4.0)instead of the documented(v1.2.3 -> v1.4.0). The test does not catch this. - Performance regression —
scanUpgradeReposhallow-clones every repo at scan time; the previous design usedScanFn: niland skipped this phase entirely. For large orgs this is N sequential clones before any PRs are opened. - Test coverage gaps —
update_version_labels.go(new file, 100 lines) has zero unit tests. The new upgrade test uses a single weakContainsassertion. countWorkflowMDFilesincludesREADME.md— the previous search query excluded README files; the new directory-scan does not.- Caching correctness — the version label cache uses a double-check lock pattern that allows duplicate API calls if two goroutines miss simultaneously, and provides no reset mechanism for tests.
Positive Highlights
- ✅ Unified repo discovery query (
filename:.lock.yml) is a solid fix — previouslyupdateandupgradeoperated on different repo sets. - ✅ Per-repo caching in
resolveVersionLabelis well-motivated; one API call per source repo is the right shape. - ✅ All existing tests updated with consistent
origFn/deferrestore patterns. - ✅
orgWorkflowCountSuffixis a clean, well-named helper with a zero-count guard.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 130 AIC · ⌖ 8.16 AIC · ⊞ 6.5K
| if r.CurrentVersion != "" { | ||
| target := "" | ||
| if targetVersion != "" && r.CurrentVersion != targetVersion { | ||
| target = " -> " + targetVersion |
There was a problem hiding this comment.
[/tdd] Bug: targetVersion from GetVersion() returns the version string without a v prefix (e.g. "1.4.0"), so target = " -> " + targetVersion produces " -> 1.4.0" — the output becomes (v1.2.3 -> 1.4.0) instead of the documented (v1.2.3 -> v1.4.0).
💡 Suggested fix
Prepend v to targetVersion in the assignment:
target = " -> v" + targetVersionThe companion test does not catch this because assert.Contains(t, output, "v1.2.3") never checks the target part.
|
|
||
| assert.Contains(t, output, "Dry-run preview of upgrade pull requests") | ||
| assert.Contains(t, output, "octo/api") | ||
| assert.Contains(t, output, "v1.2.3") |
There was a problem hiding this comment.
[/tdd] Weak assertion: assert.Contains(t, output, "v1.2.3") passes even when the -> targetVersion part is malformatted. Also, there's no test for the already
|
|
||
| assert.Contains(t, output, "Dry-run preview of upgrade pull requests") | ||
| assert.Contains(t, output, "octo/api") | ||
| assert.Contains(t, output, "v1.2.3") |
There was a problem hiding this comment.
[/tdd] Weak assertion: assert.Contains(t, output, "v1.2.3") passes even when the -> targetVersion part is malformatted. The missing v prefix bug on line 119 of upgrade_org.go is not caught by this test. Also, there is no test for the "already current" branch (currentVersion == targetVersion) which should render (v1.3.0) with no arrow.
💡 Suggested improvements
Add a tighter assertion and a second scenario:
// Existing test: verify arrow format, not just presence of current version
assert.Contains(t, output, "(v1.2.3 ->")
// New test: set CurrentVersion = GetVersion() and assert no arrow appears
// e.g. assert.Contains(t, output, "(v"+getVersion+")")
// e.g. assert.NotContains(t, output, "->")| } | ||
| return shortRef(ref) | ||
| } | ||
| versionLabelMu.Unlock() |
There was a problem hiding this comment.
[/zoom-out] Double-check locking: the mutex is released on line 47 before calling loadRepoTagMap (which makes a network request). If two goroutines miss the cache simultaneously for the same sourceRepo, both will call loadRepoTagMap and make duplicate API requests. The last writer wins but the duplicate fetch is wasteful.
💡 Suggested fix using singleflight
Use golang.org/x/sync/singleflight to deduplicate concurrent fetches:
var versionLabelGroup singleflight.Group
func resolveVersionLabel(ctx context.Context, sourceRepo, ref string) string {
if !IsCommitSHA(ref) {
return ref
}
result, _, _ := versionLabelGroup.Do(sourceRepo, func() (any, error) {
versionLabelMu.Lock()
if m, ok := versionLabelCache[sourceRepo]; ok {
versionLabelMu.Unlock()
return m, nil
}
versionLabelMu.Unlock()
m := loadRepoTagMap(ctx, sourceRepo)
versionLabelMu.Lock()
versionLabelCache[sourceRepo] = m
versionLabelMu.Unlock()
return m, nil
})
tagMap := result.(map[string]string)
if tag, ok := tagMap[ref]; ok {
return tag
}
return shortRef(ref)
}Since scans are currently sequential in org_runner.go, this is low-urgency but the existing mutex signals concurrent use was anticipated.
| return tagMap | ||
| } | ||
|
|
||
| endpoint := fmt.Sprintf("/repos/%s/%s/tags?per_page=100", owner, repoName) |
There was a problem hiding this comment.
[/tdd] Only the first 100 tags are fetched. For active repos with many releases (>100 tags), older commit SHAs won't match any tag entry and will silently fall back to the short SHA display. The code comment says "first page" but this is worth calling out as a known limitation in the error path.
💡 Options
- Accept the trade-off — add a comment noting that tags beyond page 1 fall back to short SHA; this is fine for most repos.
- Paginate — follow
Link: <...>; rel="next"headers to fetch all pages. This adds complexity and more API calls per source repo but guarantees full coverage. - Sort by newest — the GitHub tags API can be sorted; sorting by date descending makes page 1 most likely to contain the tags in use.
For the common case (a handful of semver releases) option 1 is fine — just document the behaviour explicitly.
|
|
||
| var ( | ||
| versionLabelMu sync.Mutex | ||
| versionLabelCache = make(map[string]map[string]string) // repo -> sha -> tag name |
There was a problem hiding this comment.
[/tdd] Package-level mutable cache with no reset mechanism. Tests that call resolveVersionLabel directly (or indirectly via previewOrgRepoUpdates) would leave state in this map, potentially leaking across test runs when the cache warms up from one test and silently influences another.
💡 Suggested fix
Add a reset helper for tests (unexported or behind a build tag):
// resetVersionLabelCacheForTest clears the cache; call in test cleanup.
func resetVersionLabelCacheForTest() {
versionLabelMu.Lock()
versionLabelCache = make(map[string]map[string]string)
versionLabelMu.Unlock()
}Alternatively, make the cache injectable via a struct so tests can pass an isolated instance.
| count := 0 | ||
| for _, e := range entries { | ||
| name := e.Name() | ||
| if !e.IsDir() && strings.HasSuffix(name, ".md") { |
There was a problem hiding this comment.
[/zoom-out] countWorkflowMDFiles counts every .md file in .github/workflows/, including README.md and any documentation files. The previous search query explicitly excluded README files (NOT filename:README). A repo with a README.md in .github/workflows/ would be counted as having 1 extra workflow and would not be filtered out by the mdCount == 0 guard below.
💡 Suggested fix
Exclude files by name to match the previous search intent:
if !e.IsDir() && strings.HasSuffix(name, ".md") &&
!strings.EqualFold(name, "README.md") {
count++
}Or, more precisely, only count files that pass the same source: field check used by previewOrgRepoUpdates — though that requires reading each file and would be expensive here.
| } | ||
|
|
||
| checkoutDir := filepath.Join(updatesDir, sanitizeRepoPath(repo)) | ||
| if err := shallowCloneTargetRepo(ctx, repo, checkoutDir); err != nil { |
There was a problem hiding this comment.
[/zoom-out] scanUpgradeRepo shallow-clones every discovered repo unconditionally. For orgs with many repositories this means N sequential git clones at scan time — all before any upgrade PRs are opened. The previous design used ScanFn: nil to skip this phase entirely; this PR adds significant per-repo I/O in exchange for displaying the current compiler version.
💡 Lighter alternative
Use the GitHub contents API to list .lock.yml files and fetch the first one without cloning:
// GET /repos/{owner}/{repo}/contents/.github/workflows
// Filter for *.lock.yml entries, then fetch the raw blob of the first hit.
// This gives compiler_version without a full checkout.This trades the clone for 2 API calls per repo, which is much faster and avoids disk I/O. The clone is still needed in runUpgradeForTargetRepo (apply phase) where it's already done, so you'd only defer it to that phase.
If cloning is intentionally preferred for consistency, document the performance trade-off in the function comment.
| sha := strings.TrimSpace(t.Commit.SHA) | ||
| name := strings.TrimSpace(t.Name) | ||
| if sha != "" && name != "" { | ||
| tagMap[sha] = name |
There was a problem hiding this comment.
[/tdd] There are no unit tests for resolveVersionLabel or loadRepoTagMap. The new file introduces caching logic, SHA-to-tag resolution, and a graceful error fallback — all behaviours that are worth specifying with tests independent of the org-runner integration.
💡 Suggested test cases
// Cache hit — second call should not trigger a second API fetch
// Cache miss + tag found — should return tag name, not short SHA
// Cache miss + no matching tag — should return shortRef(sha)
// API error — loadRepoTagMap returns empty map, resolveVersionLabel falls back to shortRef
// Non-SHA ref — should be returned as-is without any API callThese can use a stub for workflow.RunGHContext (or mock the fn var) to avoid real network calls.
|
@copilot merge main and recompile |
…e-commands-org # Conflicts: # pkg/cli/update_org_search.go Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Merged |
|
@copilot run pr-finisher skill |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Completed a PR-finisher pass. Applied fixes in |
gh aw update --orgshowed raw short SHAs andgh aw upgrade --orgshowed no version info; both commands also used different repo discovery strategies, causing them to operate on different sets of repositories.Changes
Version labels in
update --orgupdate_version_labels.go: resolves commit SHAs to tag names via GitHub tags API with per-repo cachingci-doctor: e15e57b -> v1.4.0instead of raw SHAs; falls back to short SHA when no tag matchesCurrent compiler version in
upgrade --orgscanUpgradeReposhallow-clones each repo, reads the first.lock.yml, and extractscompiler_versionfrom the# gh-aw-metadata:JSON header(v1.2.3 -> v1.4.0)per repo, or(v1.3.0)when already currentWorkflow count in progress messages
orgWorkflowCountSuffixappends(N workflow(s))to the per-repo progress line during apply/issue phases:Unified repo discovery
updateandupgradenow usepath:.github/workflows filename:.lock.ymlas their GitHub code search query — previouslyupgradeused an.mdextension search, producing a different (larger) set of repos