
pkg: add reusable opencost + prom modules for FinOps v2 #492

Draft
nadaverell wants to merge 2 commits into main from feature/finops-v2



nadaverell commented on Apr 19, 2026

Summary

Moves the Prometheus client + cost math out of internal/ and into pkg/ so koala-backend and skyhook-connector can share the code for the FinOps v2 Costs Explorer. Net result:

  • In internal/prometheus, client.go and discovery.go shrink to thin delegates, and queries.go moves under pkg/prom/.
  • internal/opencost/handlers.go shrinks 688 → 147 lines — handlers are now thin HTTP wrappers around pkg/opencost.
  • HTTP endpoints (/summary, /workloads, /trend, /nodes) keep their JSON shape unchanged.

pkg/prom — extracted from internal/prometheus

  • Client, HTTPTransport, Discover(), and all metric-query builders.
  • Optional Headers map[string]string on HTTPTransport for auth-protected backends (Authorization, X-Scope-OrgID, …) — applied after Accept.
  • Memory-scrape dedupe (max-by container inside the outer sum) folded into BuildQuery / BuildNamespaceQuery / BuildClusterQuery so callers don't double-count series from multiple scrape jobs.
  • internal/prometheus is now a thin facade re-exporting via aliases.go; client.go and discovery.go delegate, queries.go moved under pkg/.

pkg/opencost — two paths into the same data

REST (ComputeCostSummary, ComputeCostTrend) — new code path, hits OpenCost's /allocation API:

  • windowHours() normalizes /allocation.totalCost (summed over the window) to $/hr so callers project monthly via × 730 regardless of window. Regression test pins this.
  • Coarse default steps (24h→6h, 7d→1d, 30d→2d) fit CAC's 30s proxy budget.
  • Pod/controller aggregation via SummaryOptions.Aggregate.
  • Client-side namespace post-filter for OpenCost versions that silently ignore the REST filter param.
  • Drops synthetic __unallocated__ controller/pod drilldown rows so child totals don't exceed the parent namespace.
  • Surfaces NetworkCost per row + TotalNetworkCost in the summary.

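The namespace post-filter and the removal of synthetic `__unallocated__` drilldown rows might look roughly like this. `Row` and `filterRows` are hypothetical, and treating an empty namespace as "no filter" is an assumption.

```go
package main

import "fmt"

// unallocatedName is the synthetic row name mentioned in the PR description.
const unallocatedName = "__unallocated__"

// Row is a hypothetical drilldown row (a controller or pod in a namespace).
type Row struct {
	Namespace string
	Name      string
	Cost      float64
}

// filterRows keeps rows in the requested namespace and drops synthetic
// __unallocated__ rows. It filters on the client side, so it still works
// when the server ignored the filter parameter.
func filterRows(rows []Row, namespace string) []Row {
	out := make([]Row, 0, len(rows))
	for _, r := range rows {
		if namespace != "" && r.Namespace != namespace {
			continue // server returned a row from another namespace
		}
		if r.Name == unallocatedName {
			continue // synthetic row; would push child totals past the parent
		}
		out = append(out, r)
	}
	return out
}

func main() {
	rows := []Row{
		{Namespace: "payments", Name: "api", Cost: 3.0},
		{Namespace: "payments", Name: unallocatedName, Cost: 0.5},
		{Namespace: "search", Name: "indexer", Cost: 1.0},
	}
	for _, r := range filterRows(rows, "payments") {
		fmt.Println(r.Namespace, r.Name, r.Cost)
	}
}
```
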
PromQL (ComputeCostSummaryFromProm, ComputeCostTrendFromProm, ComputeWorkloadsFromProm, ComputeNodeCosts) — extracted from internal/opencost/handlers.go:

  • The workloads path takes a PodOwnerLookup callback so pkg/opencost stays free of k8s.io/client-go. radar supplies it from its informer cache; koala-backend / connector can supply it from CAC-served pod metadata.
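
A minimal sketch of how a lookup callback keeps the cost package independent of any Kubernetes client. The `PodOwnerLookup` signature, the `Owner` and `podKey` types, and `costByOwner` are assumptions for illustration; only the callback's name comes from the PR.

```go
package main

import "fmt"

// Owner identifies the controller that owns a pod. Hypothetical shape.
type Owner struct {
	Kind string
	Name string
}

// podKey identifies a pod by namespace and name. Hypothetical.
type podKey struct {
	Namespace string
	Pod       string
}

// PodOwnerLookup resolves a pod to its owner; ok is false when unknown.
// The signature is an assumption, not the one in pkg/opencost.
type PodOwnerLookup func(namespace, pod string) (owner Owner, ok bool)

// costByOwner rolls per-pod costs up to their owners. Pods without a known
// owner are reported under their own name.
func costByOwner(podCosts map[podKey]float64, lookup PodOwnerLookup) map[Owner]float64 {
	totals := make(map[Owner]float64)
	for key, cost := range podCosts {
		owner, ok := lookup(key.Namespace, key.Pod)
		if !ok {
			owner = Owner{Kind: "Pod", Name: key.Pod}
		}
		totals[owner] += cost
	}
	return totals
}

// mapLookup adapts a static map to the callback. An informer cache or a
// pod-metadata service could be adapted the same way.
func mapLookup(owners map[podKey]Owner) PodOwnerLookup {
	return func(namespace, pod string) (Owner, bool) {
		o, ok := owners[podKey{Namespace: namespace, Pod: pod}]
		return o, ok
	}
}

func main() {
	lookup := mapLookup(map[podKey]Owner{
		{"payments", "api-1"}: {Kind: "Deployment", Name: "api"},
		{"payments", "api-2"}: {Kind: "Deployment", Name: "api"},
	})
	costs := map[podKey]float64{
		{"payments", "api-1"}:  1.5,
		{"payments", "api-2"}:  2.5,
		{"payments", "debug"}: 1.0,
	}
	for owner, total := range costByOwner(costs, lookup) {
		fmt.Println(owner.Kind, owner.Name, total)
	}
}
```

Because the cost code only sees a function value, each consumer chooses its own metadata source without pulling that dependency into the shared package.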

Tag published: pkg/v1.4.4. Consumed by:

Test plan

  • go test ./... in both root and pkg/ submodules — all green (includes window-normalization regression, header pass-through, memory-scrape dedupe).
  • Live verified against OpenCost v1.108 on nonprod-cluster-us-east1: explore rows per namespace, trend buckets, pod/controller drill-down, namespace post-filter.
  • Radar's existing /api/opencost/{summary,workloads,trend,nodes} endpoints continue to return the same JSON shape after extraction.

nadaverell force-pushed the feature/finops-v2 branch from e43924c to 4ded49b on May 20, 2026 16:04
Comment thread pkg/prom/transport.go Dismissed
Moves the Prometheus client + cost math out of internal/ and into pkg/
so koala-backend and skyhook-connector can share the code for the FinOps
v2 Costs Explorer.

pkg/prom — extracted from internal/prometheus:
- Client, HTTPTransport, Discover(), and the metric-query builders.
- Optional Headers map on HTTPTransport for auth-protected backends
  (Authorization, X-Scope-OrgID, ...) — applied after Accept.
- Memory-scrape dedupe (max-by container inside the outer sum) folded
  into BuildQuery / BuildNamespaceQuery / BuildClusterQuery so callers
  do not double-count series from multiple scrape jobs.
- internal/prometheus is now a thin facade re-exporting via aliases.go;
  client.go and discovery.go delegate, queries.go moved under pkg/.

pkg/opencost has two paths into the same data:
- REST (ComputeCostSummary, ComputeCostTrend) hits OpenCost's
  /allocation API. windowHours() normalizes /allocation.totalCost
  (summed over the window) to dollars/hr so callers project monthly via
  x 730 regardless of window. Coarse default steps (24h->6h, 7d->1d,
  30d->2d) fit CAC's 30s proxy budget. Includes pod/controller
  aggregation, client-side namespace post-filter for OpenCost versions
  that silently ignore the REST filter param, removal of synthetic
  __unallocated__ drilldown rows, and per-row NetworkCost surfacing.
- PromQL (ComputeCostSummaryFromProm, ComputeCostTrendFromProm,
  ComputeWorkloadsFromProm, ComputeNodeCosts) extracted from
  internal/opencost/handlers.go. The workloads path takes a
  PodOwnerLookup callback so pkg/opencost stays free of k8s.io/client-go;
  radar supplies it from its informer cache, koala-backend / connector
  can supply it from CAC-served pod metadata.

internal/opencost/handlers.go shrinks 688 -> 147 lines; the four
endpoints (/summary, /workloads, /trend, /nodes) keep their JSON shape
unchanged.
nadaverell force-pushed the feature/finops-v2 branch from 4ded49b to 226ff2b on May 20, 2026 18:43
The /prometheus/connect endpoint accepted ?url=<any HTTP(S) URL> and
called SetURL() to redirect all subsequent Prometheus queries to that
host. The scheme was validated but the host was not, and radar binds to
0.0.0.0 by default, so anyone who could reach the listen port could
exfiltrate the configured --prometheus-header credentials to an
arbitrary host or port-scan internal services via Prometheus query
latency.

The override was undocumented and unused by the radar UI (it only
POSTs /prometheus/connect with no body or params). Operators set the
Prometheus URL via --prometheus-url at startup; there's no UX cost to
removing the per-request override.

Also removes the now-orphaned (*Client).SetURL method.

Flagged by CodeQL go/request-forgery (alert #330).