Refactor forecast command into focused modules and isolate render helper tests#42274
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
✅ Test Quality Sentinel completed test quality analysis. Comment and APPROVE review for PR #42274 were already posted in the previous workflow run (both tool limits reached). Score: 93/100 Excellent — 8/8 design tests, 0 implementation tests, build tag present, no mocks. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
There was a problem hiding this comment.
Pull request overview
This PR refactors the gh aw forecast CLI implementation by splitting the former monolithic pkg/cli/forecast.go into smaller, domain-focused modules (types, discovery/resolution, metadata parsing, computation, and rendering) while keeping the same command entrypoint behavior and output paths.
Changes:
- Decomposed forecast logic into focused files (
forecast_types.go,forecast_resolution.go,forecast_metadata.go,forecast_compute.go,forecast_render.go). - Reduced
forecast.goto orchestration + error normalization. - Moved formatting-helper tests into a dedicated
forecast_render_test.go.
Show a summary per file
| File | Description |
|---|---|
| pkg/cli/forecast.go | Slimmed to orchestration and exit/error normalization. |
| pkg/cli/forecast_types.go | Introduces shared forecast output/data model types used across modules. |
| pkg/cli/forecast_resolution.go | Contains workflow discovery and GitHub rate-limit backoff helpers. |
| pkg/cli/forecast_metadata.go | Contains frontmatter/workflow metadata extraction helpers. |
| pkg/cli/forecast_compute.go | Contains run sampling, cache/artifact AIC loading, and evaluation computation. |
| pkg/cli/forecast_render.go | Contains JSON + table rendering and formatting helpers. |
| pkg/cli/forecast_test.go | Removes render-helper unit tests (relocated) while keeping forecast behavior tests. |
| pkg/cli/forecast_render_test.go | Adds unit tests for render/format helper functions. |
Review details
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 8/8 changed files
- Comments generated: 2
- Review effort level: Low
| } | ||
|
|
||
| if len(config.WorkflowIDs) > 0 { | ||
| // Resolve each provided ID to a canonical lock-file workflow name. |
|
|
||
| // ForecastWorkflowResult contains the projected metrics for a single workflow. | ||
| type ForecastWorkflowResult struct { | ||
| // WorkflowID is the short identifier of the workflow (basename without .md). |
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (1,330 new lines in 📄 Draft ADR committed:
📋 What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. ❓ Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
|
🧪 Test Quality Sentinel Report✅ Test Quality Score: 93/100 — Excellent
📊 Metrics & Test Classification (8 tests analyzed)
Go: 8 (
|
There was a problem hiding this comment.
Review: Refactor forecast command into focused modules
Clean, behaviour-preserving refactor. The decomposition follows domain boundaries well and the module names are self-explanatory. No correctness regressions found.
5 non-blocking suggestions:
| # | File | Issue |
|---|---|---|
| 1 | forecast_types.go:133 |
workflowMeta belongs in forecast_metadata.go |
| 2 | forecast_types.go:141 |
forecastTableRow belongs in forecast_render.go |
| 3 | forecast_render.go:247 |
formatForecastSignedAIC missing + prefix for positive errors |
| 4 | forecast_render_test.go:46 |
No tests for formatForecastSignedAIC |
| 5 | forecast_compute.go:403 |
evaluateForecast bypasses listRunsWithBackoff (reliability) |
Detail
Cohesion (1, 2): workflowMeta is used only in forecast_metadata.go and forecastTableRow only in forecast_render.go. Both landed in forecast_types.go, which is otherwise the public API surface. Moving them to their owner files prevents forecast_types.go from becoming a catch-all.
Signed AIC display (3, 4): P50ErrorAbs is a signed delta (actual − P50). Positive values currently render without a + prefix, making them visually indistinguishable from plain AIC amounts in the backtesting eval table. Adding + for positive values and corresponding tests for all three branches (zero / negative / positive) closes this gap.
Rate-limit consistency (5): The training path uses listRunsWithBackoff; the eval path calls listWorkflowRunsWithPagination directly. A transient rate limit during --eval silently returns an empty evaluation rather than retrying. This was pre-existing but worth fixing while the code is being reorganised.
🧵 Reviewed using Impeccable skills by Impeccable Skills Reviewer · 83.6 AIC · ⌖ 6.99 AIC · ⊞ 4.9K
| Workflows []ForecastWorkflowResult `json:"workflows"` | ||
| } | ||
|
|
||
| // workflowMeta holds parsed metadata from a workflow's Markdown frontmatter. |
There was a problem hiding this comment.
Cohesion: workflowMeta is an internal struct used exclusively by forecast_metadata.go. Placing it here mixes an implementation detail with the public API types. Consider moving it to forecast_metadata.go to keep each file self-contained.
@copilot please address this.
| engines []string | ||
| } | ||
|
|
||
| // forecastTableRow is a flattened struct used for console table rendering. |
There was a problem hiding this comment.
Cohesion: forecastTableRow is a render-specific struct only used in forecast_render.go. Placing it here couples the types file to the render implementation. Move it to forecast_render.go for better encapsulation.
@copilot please address this.
|
|
||
| // formatForecastSignedAIC formats a signed AIC value, preserving | ||
| // the sign so callers can display positive/negative deltas (e.g., error abs). | ||
| func formatForecastSignedAIC(value float64) string { |
There was a problem hiding this comment.
Missing + prefix for positive signed errors: P50ErrorAbs is described as a signed difference (actual − P50). When the actual exceeds the forecast the value is positive, but without a + prefix the display is visually identical to an unsigned AIC value, making the direction ambiguous in the eval table.
Suggested fix:
func formatForecastSignedAIC(value float64) string {
if value == 0 {
return "0"
}
sign := ""
v := value
if value < 0 {
sign = "-"
v = math.Abs(value)
} else {
sign = "+"
}
return sign + formatForecastAIC(v)
}@copilot please address this.
|
|
||
| func TestFormatForecastAIC_Mega(t *testing.T) { | ||
| assert.Equal(t, "1.20M", formatForecastAIC(1_200_000)) | ||
| } |
There was a problem hiding this comment.
Missing test coverage for formatForecastSignedAIC: The new test file covers formatForecastPercent and formatForecastAIC but omits formatForecastSignedAIC, which has its own branching logic (zero, negative, positive). Consider adding:
func TestFormatForecastSignedAIC_Zero(t *testing.T) {
assert.Equal(t, "0", formatForecastSignedAIC(0))
}
func TestFormatForecastSignedAIC_Negative(t *testing.T) {
assert.Equal(t, "-500", formatForecastSignedAIC(-500))
}
func TestFormatForecastSignedAIC_Positive(t *testing.T) {
assert.Equal(t, "+500", formatForecastSignedAIC(500))
}@copilot please address this.
| Verbose: config.Verbose, | ||
| } | ||
| opts.Context = ctx | ||
| runs, _, err := listWorkflowRunsWithPagination(opts) |
There was a problem hiding this comment.
Inconsistent rate-limit handling in eval path: evaluateForecast calls listWorkflowRunsWithPagination directly, bypassing the backoff+retry logic in listRunsWithBackoff. A rate-limit hit during backtesting will immediately return an empty eval instead of retrying. The training path (forecastWorkflow) uses listRunsWithBackoff consistently.
Consider using listRunsWithBackoff here too:
runs, _, err := listRunsWithBackoff(ctx, opts, workflowName)@copilot please address this.
Review SummaryApplied 🔴 Correctness (1 issue)
🟡 Module boundaries (4 issues)
🟡 Test coverage (4 issues)
@copilot please address the review comments above.
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /zoom-out, /improve-codebase-architecture, /tdd, and /grill-with-docs — requesting changes on module boundary inconsistencies and missing test coverage for complex logic.
📋 Key Themes & Highlights
Key Themes
- Module boundary leakage:
forecastTableRowandworkflowMetawere placed inforecast_types.gobut belong in their owning modules (forecast_render.goandforecast_metadata.gorespectively).computeVariantFractionsis a compute concern sitting in the metadata module. - Compute → Render coupling:
emitPartialForecastResults(inforecast_compute.go) callsrenderForecastJSON/renderForecastTabledirectly, creating a top-down dependency that the refactoring was intended to eliminate. - Rate-limit asymmetry:
evaluateForecastcallslistWorkflowRunsWithPaginationdirectly, bypassing both the injectable var and thelistRunsWithBackoffretry wrapper used everywhere else — a correctness regression in--evalmode. - Test coverage gaps:
extractEngineNames(recursive type-switch visitor),computeVariantFractions(substring matching logic), andevaluateForecast's date-boundary filter are all complex/subtle with zero test coverage.
Positive Highlights
- ✅ Excellent decomposition decision — 1,440-line file → 6 focused modules is a substantial readability win.
- ✅ Clean entrypoint:
forecast.gonow contains only orchestration and error normalization. - ✅ Good test alignment: moving format-helper tests to
forecast_render_test.gois the right call. - ✅ All var declarations for injectability (
forecastLoadCachedRunAIC,forecastListWorkflowRunsPaginated, etc.) are correctly retained.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 134.5 AIC · ⌖ 8.48 AIC · ⊞ 6.6K
Comment /matt to run again
| Verbose: config.Verbose, | ||
| } | ||
| opts.Context = ctx | ||
| runs, _, err := listWorkflowRunsWithPagination(opts) |
There was a problem hiding this comment.
[/improve-codebase-architecture] evaluateForecast calls listWorkflowRunsWithPagination directly, bypassing both the injectable forecastListWorkflowRunsPaginated var (needed for testing) and the listRunsWithBackoff retry wrapper used by forecastWorkflow. Rate-limit hits during --eval mode will silently return partial results rather than retrying, creating an asymmetry with the non-eval path.
💡 Suggested fix
Replace the direct call:
// Before
runs, _, err := listWorkflowRunsWithPagination(opts)
// After
runs, _, err := listRunsWithBackoff(ctx, opts, workflowName)This aligns eval-mode run fetching with the retry semantics in forecastWorkflow and makes evaluateForecast independently testable via the injected var.
@copilot please address this.
| AsOf: now.UTC().Format(time.RFC3339), | ||
| EvalMode: config.EvalMode, | ||
| Workflows: results, | ||
| } |
There was a problem hiding this comment.
[/zoom-out] emitPartialForecastResults calls renderForecastJSON/renderForecastTable directly, creating a dependency from forecast_compute.go → forecast_render.go. One of the goals of this refactoring was clean module boundaries, but this coupling means the compute module must import rendering concerns.
💡 Suggested refactor
Accept a render function as a parameter, or relocate emitPartialForecastResults to forecast_render.go alongside the other output functions it calls. Either approach removes the render dependency from the compute layer:
// Option A: move to forecast_render.go
func emitPartialForecastResults(results []ForecastWorkflowResult, config ForecastConfig, now time.Time) {
// sorting logic stays here (or moves to a helper)
renderForecastOutput(output, config) // already in render.go
}@copilot please address this.
| } | ||
|
|
||
| // forecastTableRow is a flattened struct used for console table rendering. | ||
| type forecastTableRow struct { |
There was a problem hiding this comment.
[/improve-codebase-architecture] forecastTableRow is a render-specific layout struct used only by renderForecastTable in forecast_render.go, yet it lives in the shared types file. This is an implementation detail of the table renderer, not a domain type.
💡 Suggestion
Move forecastTableRow (and its console: struct tags) to forecast_render.go next to renderForecastTable. This keeps the shared types file focused on domain data models (ForecastResult, ForecastWorkflowResult, etc.) rather than UI layout structs.
@copilot please address this.
| } | ||
|
|
||
| // workflowMeta holds parsed metadata from a workflow's Markdown frontmatter. | ||
| type workflowMeta struct { |
There was a problem hiding this comment.
[/improve-codebase-architecture] workflowMeta is an internal parsing type used exclusively by functions in forecast_metadata.go (loadWorkflowMeta, extractTriggerNames, etc.). Placing it in forecast_types.go blurs the module boundary: shared types should be the domain API surface, not internal parsing helpers.
💡 Suggestion
Move workflowMeta to forecast_metadata.go alongside the functions that produce and consume it. This localises ownership — if workflowMeta's fields change, you only need to update forecast_metadata.go.
@copilot please address this.
| // computeVariantFractions populates run counts and fractions on the variant stubs | ||
| // by examining the DisplayTitle of sampled runs (gh-aw encodes the variant there). | ||
| // When no stubs are present (workflow has no experiments), returns nil. | ||
| func computeVariantFractions(stubs []ForecastVariantResult, runs []WorkflowRun) []ForecastVariantResult { |
There was a problem hiding this comment.
[/improve-codebase-architecture] computeVariantFractions accepts []WorkflowRun and computes statistical fractions from run history — that is a compute concern, not a metadata/frontmatter concern. extractWorkflowIDFromName (line 221) is a pure string utility unrelated to frontmatter parsing. Both are misplaced in forecast_metadata.go.
💡 Suggestion
- Move
computeVariantFractionstoforecast_compute.go— it is already called fromforecastWorkflowin that file. - Move
extractWorkflowIDFromNametoforecast_types.go(it strips type-name suffixes) or inline it at the call site.
This keeps forecast_metadata.go focused on its stated purpose: reading and parsing frontmatter from .md workflow files.
@copilot please address this.
| return meta | ||
| } | ||
|
|
||
| func extractEngineNames(cfg *workflow.FrontmatterConfig) []string { |
There was a problem hiding this comment.
[/tdd] extractEngineNames uses a recursive closure with a 4-branch any type-switch dispatching on string, []any, and map[string]any keys (id, engine, fallback, fallbacks, engines). This complex dispatch logic has no tests, making it opaque to future maintainers.
💡 Suggested tests
func TestExtractEngineNames(t *testing.T) {
cases := []struct {
name string
input *workflow.FrontmatterConfig
want []string
}{
{"plain string engine", cfgWithEngine("copilot"), []string{"copilot"}},
{"list of engines", cfgWithEngine([]any{"copilot", "gemini"}), []string{"copilot", "gemini"}},
{"map with engine key", cfgWithEngine(map[string]any{"engine": "gpt-4"}), []string{"gpt-4"}},
{"map with fallbacks key", cfgWithEngine(map[string]any{"fallbacks": []any{"a", "b"}}), []string{"a", "b"}},
{"deduplication", cfgWithEngine([]any{"x", "x"}), []string{"x"}},
{"nil engine", &workflow.FrontmatterConfig{}, nil},
}
...
}@copilot please address this.
| // Count how many run titles contain each variant name. | ||
| for i, stub := range stubs { | ||
| count := 0 | ||
| for _, r := range runs { |
There was a problem hiding this comment.
[/tdd] The variant-matching logic uses strings.Contains(r.DisplayTitle, stub.Variant) as its signal. This is a substring match, so short variant names (e.g., "A", "v1") can match unrelated run titles, silently inflating counts. No tests currently guard this behavior.
💡 Suggested tests
Add table-driven tests for computeVariantFractions covering:
- Empty stubs (should return nil)
- All runs matching one variant
- No runs matching any variant (counts stay 0)
- Variant name as a substring of another variant (
"exp-a"vs"exp-ab") - Runs with empty
DisplayTitle - Single variant with partial match
Also consider whether an exact-word or prefix match would be more robust than a plain substring match.
@copilot please address this.
|
|
||
| // Filter to completed runs that fall within the validation window. | ||
| validationEnd := time.Now() | ||
| validationStart, _ := time.Parse("2006-01-02", validationStartDate) |
There was a problem hiding this comment.
[/tdd] The date-filter in evaluateForecast uses r.StartedAt.Before(validationStart) and r.StartedAt.After(validationEnd) — both exclusive boundaries. A run whose StartedAt equals validationStart exactly is included, which may not be the intent and is easy to break with a sign flip or <=/>= substitution. This logic has no unit test.
💡 Suggested test
func TestEvaluateForecastDateFilter(t *testing.T) {
anchor := time.Date(2024, 1, 15, 0, 0, 0, 0, time.UTC)
cases := []struct {
runStart time.Time
included bool
}{
{anchor, true}, // exactly at start: included
{anchor.Add(-time.Second), false}, // one second before: excluded
{anchor.Add(time.Hour), true}, // within window: included
{time.Now().Add(time.Hour), false}, // future: excluded
{time.Time{}, false}, // zero timestamp: excluded
}
...
}@copilot please address this.
| @@ -1,36 +1,16 @@ | |||
| package cli | |||
There was a problem hiding this comment.
[/grill-with-docs] The 9-line module-level doc comment (// This file implements the forecast command, which samples a workflow's recent GitHub Actions run history and projects forward AI Credit (AIC) usage...) was removed from forecast.go and not redistributed to any of the new modules. A reader landing in any of the six new files has no orientation to the overall design or how the modules relate.
💡 Suggestion
Add brief file-level comments to the new modules explaining their purpose, for example:
// forecast_compute.go — run sampling, AIC loading, Monte Carlo simulation,
// and backtest evaluation for the forecast command.The original doc comment (or a condensed version) could also live in forecast.go as a package-level orientation block.
@copilot please address this.
|
|
||
| func TestFormatForecastAIC_Mega(t *testing.T) { | ||
| assert.Equal(t, "1.20M", formatForecastAIC(1_200_000)) | ||
| } |
There was a problem hiding this comment.
[/tdd] The test file covers formatForecastPercent and formatForecastAIC but leaves several format helpers in forecast_render.go untested: formatTriggerList (the +N truncation path at >3 triggers), formatForecastSignedAIC (negative values, zero), formatEngineList (empty/single/multiple), and roundForecastAIC.
💡 Suggested additions
func TestFormatTriggerList_Truncation(t *testing.T) {
assert.Equal(t, "push, pull_request, schedule +1", formatTriggerList([]string{"push", "pull_request", "schedule", "workflow_dispatch"}))
}
func TestFormatForecastSignedAIC_Negative(t *testing.T) {
assert.Equal(t, "-500", formatForecastSignedAIC(-500))
}
func TestFormatForecastSignedAIC_Zero(t *testing.T) {
assert.Equal(t, "0", formatForecastSignedAIC(0))
}The truncation path is particularly worth pinning as it controls user-visible output.
@copilot please address this.
There was a problem hiding this comment.
REQUEST_CHANGES — The decomposition is structurally sound, but the refactoring surfaced a cluster of pre-existing correctness bugs that are now clearly visible in isolation, plus one new one introduced in evaluateForecast. Four issues are blocking.
🔴 Blocking issues (4)
-
evaluateForecastignoresvalidationEndDate(forecast_compute.go:410) —validationEnd := time.Now()makes the validation window non-deterministic and inconsistent witheval.ValidationEndDate. Workflows processed later in the loop get a wider window than earlier ones. -
SuccessRateandAvgDurationSecondsare conditioned on artifact availability (forecast_compute.go:140–141) — only runs that produced non-zero AIC enter both denominators. Failed runs with missing artifacts are silently excluded, inflating success rate and biasing the duration fed to Monte Carlo. -
Variant fractions can sum to >100% (
forecast_metadata.go:210) —strings.ContainsonDisplayTitledouble-counts runs for overlapping variant names, producing mathematically invalid experiment breakdowns. -
Rate-limit fallback returns fabricated workflow entries (
forecast_resolution.go:138) — when retries are exhausted, syntheticGitHubWorkflow{Name: id, Path: id}entries are returned as resolved metadata. Downstream run-list queries filter on the fabricated name, producing silent empty forecasts.
🟡 Non-blocking (inline comments)
- Eval path calls
listWorkflowRunsWithPaginationdirectly, bypassing the backoff wrapper used by the training path (forecast_compute.go:403). emitPartialForecastResultsduplicates the 15-line P50 sort comparator fromRunForecast(forecast_compute.go:329). Extract toforecastSortByP50Deschelper.- No tests cover
evaluateForecastdate filtering, error metrics, or rate-limit behavior. Thetime.Now()bug in item 1 would have been caught by a test that mocks the validation window boundary. - No tests for
resolveForecastWorkflowsFromRemote,fetchWorkflowsWithBackoff, ormatchRemoteWorkflowName— all new and non-trivial paths.
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
proxy.golang.org
To allow these domains, add them to the
network.allowedlist in your workflow frontmatter:
network:
allowed:
- defaults
- "proxy.golang.org"See Network Configuration for more information.
🔎 Code quality review by PR Code Quality Reviewer · 163.6 AIC · ⌖ 7.48 AIC · ⊞ 5.2K
Comment /review to run again
| result.AvgDurationSeconds = totalDurSec / float64(n) | ||
| result.SuccessRate = float64(successCount) / float64(n) |
There was a problem hiding this comment.
Both AvgDurationSeconds and SuccessRate are biased by AIC availability, not just run completion: successCount and totalDurSec are only incremented inside the runAIC > 0 guard, so runs with missing/failed artifacts disappear from both denominators. A workflow where 40% of runs fail with no artifacts will report SuccessRate = 100%. This inflated success probability is fed directly into the Monte Carlo simulation, distorting weekly/monthly projections.
💡 Suggested fix
Accumulate successCount and totalDurSec over the full completed slice before the AIC filter, and divide by len(completed):
// Separate passes: metrics that are independent of AIC availability
for _, r := range completed {
if r.Conclusion == "success" {
successCount++
}
totalDurSec += r.Duration.Seconds()
}
result.SuccessRate = float64(successCount) / float64(len(completed))
result.AvgDurationSeconds = totalDurSec / float64(len(completed))The AIC loop can then be a separate pass only computing aicObservations and totalAIC.
| } | ||
|
|
||
| // Filter to completed runs that fall within the validation window. | ||
| validationEnd := time.Now() |
There was a problem hiding this comment.
Eval validation window is non-deterministic: validationEndDate is accepted and stored in eval.ValidationEndDate, but the actual row filter uses time.Now() — so the true window and the reported window diverge, and results vary based on when during the run this function is called.
💡 Suggested fix
// Parse validationEndDate instead of hard-coding time.Now()
validationEnd, err := time.Parse("2006-01-02", validationEndDate)
if err != nil {
validationEnd = time.Now() // fallback only
}With validationEndDate = "2026-06-29", the parsed value is midnight UTC on that date, while time.Now() at execution is hours later. For a large workflow set, the last workflow gets a materially wider validation window than the first — making per-workflow comparison invalid. eval.ValidationEndDate also falsely documents the boundary.
| for i, stub := range stubs { | ||
| count := 0 | ||
| for _, r := range runs { | ||
| if strings.Contains(r.DisplayTitle, stub.Variant) { |
There was a problem hiding this comment.
Variant attribution is broken for overlapping or short names: strings.Contains(r.DisplayTitle, stub.Variant) will double-count runs when variant names are substrings of each other (e.g. variant "a" matches any title containing "a", variant "prod" matches "production"). Since counts are computed independently per variant, fractions can easily sum to more than 100%, making the reported experiment breakdown mathematically invalid.
💡 Suggested fix
If the variant name is always a distinct token in the title (e.g. "[variant] task"), match word-boundary rather than substring:
// Require the variant to appear as a whole word/token
for _, word := range strings.Fields(r.DisplayTitle) {
if strings.EqualFold(strings.Trim(word, "[]"), stub.Variant) {
count++
break
}
}Better still: encode the variant as a structured prefix with a reliable delimiter so the extraction is unambiguous.
|
|
||
| partialWorkflows := make(map[string]*GitHubWorkflow, len(ids)) | ||
| for _, id := range ids { | ||
| partialWorkflows[id] = &GitHubWorkflow{Name: id, Path: id, State: "unknown"} |
There was a problem hiding this comment.
Rate-limit fallback fabricates workflow metadata that silently produces empty results: when all retry attempts are exhausted and explicit ids are provided, this constructs synthetic GitHubWorkflow{Name: id, Path: id} entries and returns them as if resolution succeeded. Downstream, matchRemoteWorkflowName trivially matches the fabricated key, so resolved contains the raw user-supplied ID. Run-list queries then filter by this string, which rarely matches the actual GitHub API display name, yielding silent empty forecasts — indistinguishable from a workflow that genuinely has no runs.
💡 Options
- Return an error: rate-limit exhaustion on workflow discovery is a hard failure; let the caller decide whether to continue.
- Mark synthetic entries: add a flag to
GitHubWorkflow(e.g.Synthetic bool) so callers can warn the user that metadata is unavailable. - At minimum, change the warning message to make clear that the resulting forecast data is likely empty, not partial.
| slices.SortFunc(results, func(a, b ForecastWorkflowResult) int { | ||
| pi := a.ProjectedAIC | ||
| if mc := a.MonteCarlo; mc != nil { | ||
| pi = mc.P50ProjectedAIC | ||
| } | ||
| pj := b.ProjectedAIC | ||
| if mc := b.MonteCarlo; mc != nil { | ||
| pj = mc.P50ProjectedAIC | ||
| } | ||
| if pi > pj { | ||
| return -1 | ||
| } | ||
| if pi < pj { | ||
| return 1 | ||
| } | ||
| return 0 | ||
| }) |
There was a problem hiding this comment.
Duplicate sort comparator: the 15-line sort-by-P50-descending comparator here is byte-for-byte identical to the one in RunForecast (forecast.go:157). If the sort criterion ever changes (e.g. switching tie-breaking or adding a secondary key), both copies must be updated.
💡 Fix
Extract to a package-level helper:
func forecastSortByP50Desc(a, b ForecastWorkflowResult) int {
pi := a.ProjectedAIC
if mc := a.MonteCarlo; mc != nil {
pi = mc.P50ProjectedAIC
}
pj := b.ProjectedAIC
if mc := b.MonteCarlo; mc != nil {
pj = mc.P50ProjectedAIC
}
switch {
case pi > pj: return -1
case pi < pj: return 1
default: return 0
}
}Then both call sites become slices.SortFunc(results, forecastSortByP50Desc).
| Verbose: config.Verbose, | ||
| } | ||
| opts.Context = ctx | ||
| runs, _, err := listWorkflowRunsWithPagination(opts) |
There was a problem hiding this comment.
Eval path bypasses the rate-limit retry wrapper: forecastWorkflow fetches training runs through listRunsWithBackoff (up to 3 attempts with exponential backoff), but evaluateForecast calls listWorkflowRunsWithPagination directly. A rate-limited API call here silently returns an empty ForecastEvaluation with all-zero metrics — so under the same API conditions that the main path handles gracefully, eval mode silently degrades.
💡 Fix
Replace the direct call with the same backoff wrapper:
runs, _, err := listRunsWithBackoff(ctx, opts, workflowName)Also pass the workflowName string for the warning message, consistent with the training path.
|
@copilot run pr-finisher skill |
…name resolution Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done. Fixed both misleading comments in
Actions taken: Corrected two comment inaccuracies flagged by the automated reviewer — Hand-off: CI must be re-triggered by a maintainer (close/reopen PR, Still needed: Human review approval. |
|
@copilot please run the Unresolved reviews (newest first):
The branch has also been queued for a base-branch update — once updated, rerun any failing checks and verify the PR is ready to merge.
|
pkg/cli/forecast.gohad grown into a 1,440-line mixed-responsibility file spanning orchestration, API resolution, metadata parsing, computation, and rendering. This PR decomposes it into focused modules while preserving the existing forecast API surface and behavior.File decomposition by domain
forecast_types.go: moved forecast data models (ForecastRunSample,ForecastWorkflowResult,ForecastVariantResult,ForecastEvaluation,ForecastResult,workflowMeta,forecastTableRow).forecast_resolution.go: moved workflow discovery and rate-limit retry logic (resolveForecastWorkflows*,fetchWorkflowsWithBackoff,listRunsWithBackoff, matching/backoff helpers, related vars/constants).forecast_metadata.go: moved frontmatter metadata extraction (loadWorkflowMeta, trigger/engine/concurrency/variant/workflow-id helpers).forecast_compute.go: moved run sampling/AIC loading/artifact download/eval computation (forecastWorkflow,loadCachedRunAIC,forecastDownloadUsageArtifact,isCompletedNonSkippedRun,evaluateForecast, partial-result emission).forecast_render.go: moved JSON/table rendering and formatting helpers.Slimmed command entrypoint
forecast.gonow contains only high-level orchestration (RunForecast) and error normalization (normalizeForecastRunError) plus command-level globals.Test organization
forecast_test.gointoforecast_render_test.goto align test placement with implementation boundaries.