93 changes: 93 additions & 0 deletions .design/project-log/2026-06-01-postgres-store.md
@@ -0,0 +1,93 @@
# PostgreSQL Store Implementation

**Date:** 2026-06-01

## Motivation

The hub needs to run stateless in a hosted/cloud topology where the database is
a GitOps-configured external service (e.g. Cloud SQL, Lakebase, any managed
Postgres). SQLite is process-local and cannot be shared across replicas.
The `database.driver` + `database.url` fields already existed in `GlobalConfig`
to hold a connection URL; what was missing was a `Store` implementation that
consumed them. See `.design/hosted/resource-storage-refactor.md` §1.1
("Cloud / hosted mode — the storage backend is GCS") for the broader hosted
architecture context that motivated a stateless control plane.

## What landed

### `pkg/store/postgres/`

A new package, parallel in shape to `pkg/store/sqlite/`, implementing the full
`store.Store` interface against PostgreSQL.

- **`postgres.go`** — `PostgresStore` struct wrapping `*sql.DB`, `New(connURL
string)`, `Migrate(ctx)`, `Ping`, `Close`. Connection pool fixed at
`MaxOpenConns=4` / `MaxIdleConns=4`.
- **`driver.go`** — blank import of `github.com/lib/pq` (database/sql driver
name `postgres`) guarded by `//go:build !no_postgres`.
- **`migrations.go`** — 53 versioned migrations tracked in a
`schema_migrations` table (`version INTEGER PRIMARY KEY`). `Migrate` is
idempotent: it reads `MAX(version)` and skips already-applied steps. Each
migration runs in its own transaction; a `foreignKeysOffMigrations` map is
preserved for shape-parity with the SQLite runner (in Postgres, FK deferral
is handled inside the migration SQL itself via `CASCADE`/explicit FK drops,
so the function body is a plain transaction).
- **Per-entity files** (`agents.go`, `users.go`, `projects.go`, `secrets.go`,
`messages.go`, `groups.go`, `policies.go`, `tokens.go`, `invites.go`,
`brokers.go`, `envvars.go`, `schedule.go`, `scheduled_event.go`,
`notification.go`, `templates.go`, `harness_configs.go`, `allowlist.go`,
`brokersecret.go`, `providers.go`, `project_sync_state.go`,
`gcp_service_account.go`, `github_installation.go`, `maintenance.go`) —
one file per entity group, matching the sqlite layout.
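
The skip-already-applied behavior described for `Migrate` can be illustrated with a minimal sketch. The `migration` type and `pending` function below are hypothetical names chosen for illustration, not the package's actual identifiers; the sketch only assumes that `applied` holds the `MAX(version)` value read from `schema_migrations` (0 for an empty table).

```go
package main

import (
	"fmt"
	"sort"
)

// migration is a hypothetical stand-in for one versioned schema step.
type migration struct {
	Version int
	SQL     string
}

// pending returns the migrations whose version is greater than applied,
// sorted ascending. applied corresponds to MAX(version) in
// schema_migrations, or 0 when the table is empty.
func pending(all []migration, applied int) []migration {
	out := make([]migration, 0, len(all))
	for _, m := range all {
		if m.Version > applied {
			out = append(out, m)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Version < out[j].Version })
	return out
}

func main() {
	all := []migration{
		{Version: 1, SQL: "CREATE TABLE a (id TEXT)"},
		{Version: 2, SQL: "CREATE TABLE b (id TEXT)"},
		{Version: 3, SQL: "ALTER TABLE a ADD COLUMN n INTEGER"},
	}
	// With V1 already recorded, only V2 and V3 remain.
	for _, m := range pending(all, 1) {
		fmt.Printf("would apply V%d in its own transaction\n", m.Version)
	}
	// A second run after everything is applied finds nothing to do.
	fmt.Println("pending after full run:", len(pending(all, 3)))
}
```

Because the pending set is derived from the recorded version alone, re-running the selection after a complete pass yields an empty list, which is what makes a repeated `Migrate` call a no-op.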

### `initStore` case in `cmd/server_foreground.go`

`initStore` gained a `"postgres"` branch: `postgres.New(cfg.Database.URL)` →
`pgStore.Migrate` → `CREATE SCHEMA IF NOT EXISTS ent` → `entc.OpenPostgres(entDSN)`
(search_path pinned to `ent`) → `entc.AutoMigrate` → `entadapter.NewCompositeStore`.
The grove→project data backfill (`entc.MigrateGroveToProjectData`) is **not**
called on the postgres path (see below).
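
In the accompanying `cmd/server_foreground.go` change, the Ent client is opened on a DSN whose `search_path` is pinned to a dedicated `ent` schema by the `withSearchPath` helper. The sketch below is a trimmed, URL-form-only variant under a hypothetical name (`pinSearchPathURL`) that shows what the rewritten DSN looks like; the host, credentials, and database name are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
)

// pinSearchPathURL is a hypothetical, URL-only variant of withSearchPath.
// It sets the `options` query parameter so that every connection opened
// on the DSN starts with search_path set to schema.
func pinSearchPathURL(dsn, schema string) (string, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", fmt.Errorf("parsing postgres URL: %w", err)
	}
	q := u.Query()
	q.Set("options", "-c search_path="+schema) // replaces any existing options value
	u.RawQuery = q.Encode()                    // Encode sorts keys and escapes the value
	return u.String(), nil
}

func main() {
	// Placeholder DSN for illustration only.
	out, err := pinSearchPathURL("postgres://scion:secret@db.example.com:5432/hub?sslmode=require", "ent")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints: postgres://scion:secret@db.example.com:5432/hub?options=-c+search_path%3Dent&sslmode=require
}
```

The space and `=` inside the option value are percent-encoded by `url.Values.Encode`, so the parameter survives URL parsing as the single string `-c search_path=ent`.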

### Dialect translation rules applied throughout

| SQLite pattern | PostgreSQL replacement |
|---|---|
| `?` positional placeholder | `$N` numbered placeholder |
| `INSERT OR IGNORE` / `INSERT OR REPLACE` | `ON CONFLICT … DO NOTHING` / `ON CONFLICT … DO UPDATE SET` |
| `sqlite_master` / `pragma_table_info` | `information_schema.tables` / `information_schema.columns` (both scoped to `table_schema='public'`, queried with `$1`/`$2` params) |
| `randomblob(16)` | `gen_random_uuid()::text` (built into PostgreSQL 13+; provided by the `pgcrypto` extension on older versions; used in several data-backfill migrations) |
| `json_each(…)` | `json_array_elements_text(…::json)` (used in agent ancestry filter) |
| Case-insensitive email uniqueness via `UNIQUE` on TEXT | `CREATE UNIQUE INDEX … ON allow_list (LOWER(email))` (functional unique index) |
| `BLOB` | `BYTEA` (broker secret key column) |
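
The first row of the table, `?` to `$N`, is the most mechanical of these rules. The sketch below illustrates the rule with a hypothetical `rebind` helper; it is not taken from this change, which may well have rewritten each query by hand. It assumes queries contain no double-quoted identifiers or comments with a literal `?`, and it does not account for Postgres operators that use `?`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// rebind rewrites SQLite-style `?` placeholders as Postgres `$1`, `$2`, ...
// A `?` inside a single-quoted string literal is left untouched. A doubled
// quote ('') toggles the state twice, so escaped quotes are handled.
func rebind(query string) string {
	var b strings.Builder
	n := 0
	inString := false
	for i := 0; i < len(query); i++ {
		c := query[i]
		switch {
		case c == '\'':
			inString = !inString
			b.WriteByte(c)
		case c == '?' && !inString:
			n++
			b.WriteByte('$')
			b.WriteString(strconv.Itoa(n))
		default:
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(rebind("SELECT id FROM users WHERE email = ? AND status = ?"))
	// prints: SELECT id FROM users WHERE email = $1 AND status = $2
}
```

Numbered placeholders also allow the same argument to be referenced more than once in a statement, which positional `?` cannot express.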

## What was deliberately skipped

**Grove→project data backfill** (`entc.MigrateGroveToProjectData`) is omitted
from the postgres `initStore` path. A fresh postgres database starts with the
post-rename schema (V50 renames `groves` → `projects` and all `grove_id`
columns → `project_id` in-place); there is no legacy ent sqlite data to
backfill. The backfill only applies to existing SQLite deployments upgrading
in-place.

## How it is tested

`pkg/store/postgres/postgres_test.go` contains integration tests (migration
idempotency, CRUD + filter coverage for users, projects, agents, secrets,
groups, policies, invite codes, env vars) that run against a live Postgres
instance. Tests skip automatically when `SCION_TEST_POSTGRES_URL` is not set:

```go
const envVarDSN = "SCION_TEST_POSTGRES_URL"
// ...
if dsn == "" {
	t.Skipf("set %s to run Postgres tests", envVarDSN)
}
```

Each test calls `resetSchema` (`DROP SCHEMA public CASCADE; CREATE SCHEMA
public`) before applying migrations, giving a clean slate per test function.

`make test-fast` (which passes `-tags no_sqlite`) excludes the SQLite driver
and exercises the rest of the codebase including the postgres package files; CI
runs this path. The full Postgres integration suite requires a live DSN and is
not wired into CI at this time.
12 changes: 12 additions & 0 deletions changelog/2026-06-01-changelog.md
@@ -0,0 +1,12 @@
# Release Notes (Jun 1, 2026)

This release introduces a Postgres store backend for the hub, enabling stateless control-plane deployments backed by an external database instead of a node-local SQLite PVC. The ent layer has supported Postgres for some time; this change brings the hand-written store layer to parity.

## 🚀 Features
* **[Store]: Postgres Backend.** The hub now accepts `database.driver: postgres` (environment variables `SCION_SERVER_DATABASE_DRIVER=postgres` and `SCION_SERVER_DATABASE_URL=<dsn>`) to connect to an external Postgres database instead of the default node-local SQLite file.
* **Stateless Control Plane.** With Postgres as the backing store the hub StatefulSet and its associated PVC are no longer required for state durability, enabling fully stateless hub deployments that can scale horizontally or restart without data loss.
* **Ent Parity.** The ent-generated layer has supported Postgres since its introduction; this change adds the hand-written store's Postgres twin so that all hub persistence paths (agents, sessions, secrets, projects) are covered by both drivers.
* **Migration Note.** The grove → project backfill that runs on SQLite databases at startup is skipped automatically on a fresh Postgres deployment; no manual intervention is needed.

## 🐛 Fixes
* **[Infrastructure]:** Continued monitoring and stabilization of the agent dispatch pipeline.
114 changes: 114 additions & 0 deletions cmd/server_foreground.go
@@ -23,6 +23,7 @@ import (
"io"
"log"
"log/slog"
"net/url"
"os"
"os/signal"
"path/filepath"
@@ -47,6 +48,7 @@ import (
"github.com/GoogleCloudPlatform/scion/pkg/storage"
"github.com/GoogleCloudPlatform/scion/pkg/store"
"github.com/GoogleCloudPlatform/scion/pkg/store/entadapter"
"github.com/GoogleCloudPlatform/scion/pkg/store/postgres"
"github.com/GoogleCloudPlatform/scion/pkg/store/sqlite"
"github.com/GoogleCloudPlatform/scion/pkg/util"
"github.com/GoogleCloudPlatform/scion/pkg/util/logging"
@@ -680,12 +682,94 @@ func initStore(cfg *config.GlobalConfig) (store.Store, error) {
return nil, fmt.Errorf("database ping failed: %w", err)
}

return s, nil
case "postgres":
pgStore, err := postgres.New(cfg.Database.URL)
if err != nil {
return nil, fmt.Errorf("failed to open database: %w", err)
}

if err := pgStore.Migrate(context.Background()); err != nil {
pgStore.Close()
return nil, fmt.Errorf("failed to run migrations: %w", err)
}

// Isolate the Ent-managed tables from the raw-store tables. On SQLite
// these two table sets live in physically separate database files
// (entDSN := cfg.Database.URL + "_ent" above); several tables exist in
// both worlds with deliberately different column types — e.g. the raw
// migrations create projects.id as TEXT while the Ent schema models it
// as UUID. Pointing Ent at the same Postgres schema as the raw store
// makes Ent's auto-migration try to ALTER those shared tables in place
// ("column \"id\" cannot be cast automatically to type uuid"). A
// dedicated `ent` schema is the Postgres analog of the separate _ent
// file, keeping the two table sets from colliding.
if _, err := pgStore.DB().ExecContext(context.Background(), "CREATE SCHEMA IF NOT EXISTS ent"); err != nil {
pgStore.Close()
return nil, fmt.Errorf("failed to create ent schema: %w", err)
}
entDSN, err := withSearchPath(cfg.Database.URL, "ent")
if err != nil {
pgStore.Close()
return nil, fmt.Errorf("failed to build ent DSN: %w", err)
}
entClient, err := entc.OpenPostgres(entDSN)
if err != nil {
pgStore.Close()
return nil, fmt.Errorf("failed to open ent database: %w", err)
}
if err := entc.AutoMigrate(context.Background(), entClient); err != nil {
entClient.Close()
pgStore.Close()
return nil, fmt.Errorf("failed to run ent migrations: %w", err)
}

// grove->project backfill is a SQLite-era data repair; a fresh Postgres DB has no legacy grove rows, so it is intentionally skipped.

s := entadapter.NewCompositeStore(pgStore, entClient)

if err := s.Ping(context.Background()); err != nil {
entClient.Close()
pgStore.Close()
return nil, fmt.Errorf("database ping failed: %w", err)
}

return s, nil
default:
return nil, fmt.Errorf("unsupported database driver: %s", cfg.Database.Driver)
}
}

// withSearchPath returns the Postgres DSN with its connection search_path
// pinned to schemaName, so an Ent client opened on it confines all of its
// tables to that schema. It understands both DSN flavors lib/pq accepts: a
// URL form ("postgres://user:pass@host/db?...") and the keyword/value form
// ("host=... dbname=..."). For the URL form the schema is set via the
// `options` query parameter (-c search_path=...); for the keyword form an
// `options` keyword is appended. An existing search_path/options is replaced.
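func withSearchPath(dsn, schemaName string) (string, error) {
	opt := "-c search_path=" + schemaName
	if strings.HasPrefix(dsn, "postgres://") || strings.HasPrefix(dsn, "postgresql://") {
		u, err := url.Parse(dsn)
		if err != nil {
			return "", fmt.Errorf("parsing postgres URL: %w", err)
		}
		q := u.Query()
		q.Set("options", opt)
		u.RawQuery = q.Encode()
		return u.String(), nil
	}
	// Keyword/value DSN: drop any existing options token, then append ours.
	// Split on whitespace outside single quotes only: a quoted value such as
	// options='-c statement_timeout=5s' contains a space, so strings.Fields
	// would cut it in two and leave a stray fragment behind.
	fields := make([]string, 0)
	var tok strings.Builder
	inQuote, escaped := false, false
	flush := func() {
		if f := tok.String(); f != "" && !strings.HasPrefix(f, "options=") {
			fields = append(fields, f)
		}
		tok.Reset()
	}
	for _, r := range dsn {
		switch {
		case escaped:
			escaped = false
		case r == '\\':
			escaped = true
		case r == '\'':
			inQuote = !inQuote
		case !inQuote && strings.ContainsRune(" \t\r\n", r):
			flush()
			continue
		}
		tok.WriteRune(r)
	}
	flush()
	fields = append(fields, "options='"+opt+"'")
	return strings.Join(fields, " "), nil
}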

// initDevAuth initializes dev authentication and returns the token.
func initDevAuth(cfg *config.GlobalConfig, globalDir string) (string, error) {
devAuthCfg := apiclient.DevAuthConfig{
@@ -825,6 +909,16 @@ func initHubServer(ctx context.Context, cfg *config.GlobalConfig, s store.Store,
ClientID: cfg.OAuth.Web.GitHub.ClientID,
ClientSecret: cfg.OAuth.Web.GitHub.ClientSecret,
},
Generic: hub.OAuthProviderConfig{
ClientID: cfg.OAuth.Web.Generic.ClientID,
ClientSecret: cfg.OAuth.Web.Generic.ClientSecret,
DiscoveryURL: cfg.OAuth.Web.Generic.DiscoveryURL,
Issuer: cfg.OAuth.Web.Generic.Issuer,
AuthorizationURL: cfg.OAuth.Web.Generic.AuthorizationURL,
TokenURL: cfg.OAuth.Web.Generic.TokenURL,
UserInfoURL: cfg.OAuth.Web.Generic.UserInfoURL,
Scopes: cfg.OAuth.Web.Generic.Scopes,
},
},
CLI: hub.OAuthClientConfig{
Google: hub.OAuthProviderConfig{
@@ -835,6 +929,16 @@ func initHubServer(ctx context.Context, cfg *config.GlobalConfig, s store.Store,
ClientID: cfg.OAuth.CLI.GitHub.ClientID,
ClientSecret: cfg.OAuth.CLI.GitHub.ClientSecret,
},
Generic: hub.OAuthProviderConfig{
ClientID: cfg.OAuth.CLI.Generic.ClientID,
ClientSecret: cfg.OAuth.CLI.Generic.ClientSecret,
DiscoveryURL: cfg.OAuth.CLI.Generic.DiscoveryURL,
Issuer: cfg.OAuth.CLI.Generic.Issuer,
AuthorizationURL: cfg.OAuth.CLI.Generic.AuthorizationURL,
TokenURL: cfg.OAuth.CLI.Generic.TokenURL,
UserInfoURL: cfg.OAuth.CLI.Generic.UserInfoURL,
Scopes: cfg.OAuth.CLI.Generic.Scopes,
},
},
Device: hub.OAuthClientConfig{
Google: hub.OAuthProviderConfig{
@@ -845,6 +949,16 @@ func initHubServer(ctx context.Context, cfg *config.GlobalConfig, s store.Store,
ClientID: cfg.OAuth.Device.GitHub.ClientID,
ClientSecret: cfg.OAuth.Device.GitHub.ClientSecret,
},
Generic: hub.OAuthProviderConfig{
ClientID: cfg.OAuth.Device.Generic.ClientID,
ClientSecret: cfg.OAuth.Device.Generic.ClientSecret,
DiscoveryURL: cfg.OAuth.Device.Generic.DiscoveryURL,
Issuer: cfg.OAuth.Device.Generic.Issuer,
AuthorizationURL: cfg.OAuth.Device.Generic.AuthorizationURL,
TokenURL: cfg.OAuth.Device.Generic.TokenURL,
UserInfoURL: cfg.OAuth.Device.Generic.UserInfoURL,
Scopes: cfg.OAuth.Device.Generic.Scopes,
},
},
},
MaintenanceConfig: resolveMaintenanceConfig(cfg),
14 changes: 14 additions & 0 deletions pkg/config/hub_config.go
@@ -163,6 +163,16 @@ type OAuthProviderConfig struct {
ClientID string `json:"clientId" yaml:"clientId" koanf:"clientId"`
// ClientSecret is the OAuth application client secret.
ClientSecret string `json:"clientSecret" yaml:"clientSecret" koanf:"clientSecret"`
// The following fields are only used by the generic OAuth/OIDC provider
// (e.g. Dex); field names mirror Better Auth's genericOAuth config. Set
// DiscoveryURL or Issuer for OIDC discovery, or set the endpoints
// explicitly. Left empty for Google/GitHub.
DiscoveryURL string `json:"discoveryUrl" yaml:"discoveryUrl" koanf:"discoveryUrl"`
Issuer string `json:"issuer" yaml:"issuer" koanf:"issuer"`
AuthorizationURL string `json:"authorizationUrl" yaml:"authorizationUrl" koanf:"authorizationUrl"`
TokenURL string `json:"tokenUrl" yaml:"tokenUrl" koanf:"tokenUrl"`
UserInfoURL string `json:"userInfoUrl" yaml:"userInfoUrl" koanf:"userInfoUrl"`
Scopes string `json:"scopes" yaml:"scopes" koanf:"scopes"`
}

// OAuthClientConfig holds OAuth provider configurations for a specific client type.
@@ -171,6 +181,10 @@ type OAuthClientConfig struct {
Google OAuthProviderConfig `json:"google" yaml:"google" koanf:"google"`
// GitHub OAuth settings for this client type.
GitHub OAuthProviderConfig `json:"github" yaml:"github" koanf:"github"`
// Generic is a configurable OAuth2/OIDC provider (e.g. Dex) for this client
// type. Configure via SCION_SERVER_OAUTH_<CLIENT>_GENERIC_{CLIENTID,CLIENTSECRET}
// plus GENERIC_{DISCOVERYURL,ISSUER} (discovery) or explicit GENERIC_{AUTHORIZATIONURL,TOKENURL,USERINFOURL}.
Generic OAuthProviderConfig `json:"generic" yaml:"generic" koanf:"generic"`
}

// OAuthConfig holds OAuth provider configurations.
35 changes: 34 additions & 1 deletion pkg/hub/oauth.go
@@ -22,6 +22,7 @@ import (
"net/http"
"net/url"
"strings"
"sync"
"time"

"github.com/GoogleCloudPlatform/scion/pkg/hubclient"
@@ -31,17 +32,32 @@ import (
type OAuthProviderConfig struct {
ClientID string
ClientSecret string
// The following fields are only used by the generic OAuth/OIDC provider
// (Google/GitHub leave them empty). Field names mirror Better Auth's
// genericOAuth config. Endpoints resolve in this order: explicit
// AuthorizationURL/TokenURL/UserInfoURL win; else OIDC discovery against
// DiscoveryURL; else discovery derived from Issuer
// (Issuer + "/.well-known/openid-configuration").
DiscoveryURL string // full .well-known/openid-configuration URL
Issuer string // issuer identifier; also derives DiscoveryURL when that is unset
AuthorizationURL string
TokenURL string
UserInfoURL string
Scopes string // space-separated; defaults to "openid email profile"
}

// OAuthClientConfig holds OAuth provider configurations for a specific client type.
type OAuthClientConfig struct {
Google OAuthProviderConfig
GitHub OAuthProviderConfig
// Generic is a configurable OAuth2/OIDC provider (e.g. Dex): discovery via
// DiscoveryURL or Issuer, or explicit AuthorizationURL/TokenURL/UserInfoURL.
Generic OAuthProviderConfig
}

// IsConfigured returns true if at least one OAuth provider is configured.
func (c *OAuthClientConfig) IsConfigured() bool {
- return c.Google.ClientID != "" || c.GitHub.ClientID != ""
+ return c.Google.ClientID != "" || c.GitHub.ClientID != "" || c.Generic.ClientID != ""
}

// IsProviderConfigured returns true if the specified provider is configured.
@@ -51,6 +67,12 @@ func (c *OAuthClientConfig) IsProviderConfigured(provider string) bool {
return c.Google.ClientID != "" && c.Google.ClientSecret != ""
case hubclient.OAuthProviderGitHub:
return c.GitHub.ClientID != "" && c.GitHub.ClientSecret != ""
case hubclient.OAuthProviderGeneric:
// Needs credentials plus a way to resolve endpoints: a discovery URL or
// issuer (for discovery), or explicit authorize+token endpoints.
hasEndpoints := c.Generic.DiscoveryURL != "" || c.Generic.Issuer != "" ||
(c.Generic.AuthorizationURL != "" && c.Generic.TokenURL != "")
return c.Generic.ClientID != "" && c.Generic.ClientSecret != "" && hasEndpoints
default:
return false
}
@@ -63,6 +85,8 @@ func (c *OAuthClientConfig) GetProvider(provider string) OAuthProviderConfig {
return c.Google
case hubclient.OAuthProviderGitHub:
return c.GitHub
case hubclient.OAuthProviderGeneric:
return c.Generic
default:
return OAuthProviderConfig{}
}
@@ -110,6 +134,10 @@ func oauthProviderOrder() []string {
type OAuthService struct {
config OAuthConfig
httpClient *http.Client

// oidcCache memoizes OIDC discovery documents keyed by issuer URL.
oidcMu sync.RWMutex
oidcCache map[string]*oidcDiscovery
}

// NewOAuthService creates a new OAuth service.
@@ -119,6 +147,7 @@ func NewOAuthService(config OAuthConfig) *OAuthService {
httpClient: &http.Client{
Timeout: 30 * time.Second,
},
oidcCache: make(map[string]*oidcDiscovery),
}
}

@@ -212,6 +241,8 @@ func (s *OAuthService) GetAuthorizationURLForClient(clientType OAuthClientType,
return s.getGoogleAuthURLWithConfig(cfg.Google, callbackURL, state)
case hubclient.OAuthProviderGitHub:
return s.getGitHubAuthURLWithConfig(cfg.GitHub, callbackURL, state)
case hubclient.OAuthProviderGeneric:
return s.getGenericAuthURLWithConfig(cfg.Generic, callbackURL, state)
default:
return "", fmt.Errorf("unsupported OAuth provider: %s", provider)
}
@@ -278,6 +309,8 @@ func (s *OAuthService) ExchangeCodeForClient(ctx context.Context, clientType OAu
return s.exchangeGoogleCodeWithConfig(ctx, cfg.Google, code, callbackURL)
case "github":
return s.exchangeGitHubCodeWithConfig(ctx, cfg.GitHub, code, callbackURL)
case hubclient.OAuthProviderGeneric:
return s.exchangeGenericCodeWithConfig(ctx, cfg.Generic, code, callbackURL)
default:
return nil, fmt.Errorf("unsupported OAuth provider: %s", provider)
}
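
The endpoint resolution order documented on `OAuthProviderConfig` in this file (explicit endpoints first, then `DiscoveryURL`, then a discovery URL derived from `Issuer`) can be sketched as follows. The `providerConfig` type and `discoveryURLFor` function are hypothetical names for illustration, not the hub's implementation. Two choices are assumptions: "explicit endpoints" is taken to mean that both the authorization and token URLs are set (mirroring `IsProviderConfigured`), and a trailing slash on the issuer is trimmed before the well-known path is appended.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// providerConfig is a hypothetical, trimmed copy of the generic-provider fields.
type providerConfig struct {
	DiscoveryURL     string
	Issuer           string
	AuthorizationURL string
	TokenURL         string
}

// discoveryURLFor returns the OIDC discovery document URL to fetch, or ""
// when explicit endpoints make discovery unnecessary.
func discoveryURLFor(c providerConfig) (string, error) {
	switch {
	case c.AuthorizationURL != "" && c.TokenURL != "":
		return "", nil // explicit endpoints win; no discovery needed
	case c.DiscoveryURL != "":
		return c.DiscoveryURL, nil
	case c.Issuer != "":
		// Assumption: trim a trailing slash so the path is not doubled.
		return strings.TrimSuffix(c.Issuer, "/") + "/.well-known/openid-configuration", nil
	default:
		return "", errors.New("generic provider: no endpoints, discovery URL, or issuer configured")
	}
}

func main() {
	// Placeholder issuer for illustration only.
	u, err := discoveryURLFor(providerConfig{Issuer: "https://dex.example.com"})
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
	// prints: https://dex.example.com/.well-known/openid-configuration
}
```

Keeping the precedence in one function makes it straightforward to memoize fetched documents by the returned URL, in the spirit of the `oidcCache` field added to `OAuthService`.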