Add UI 'None' auth option and no-auth harness behavior config#246
Add UI 'None' auth option and no-auth harness behavior config#246ptone wants to merge 13 commits into
Conversation
…s and permissions integer
…oogleCloudPlatform#403) * Add `scion build` command for local harness-config image builds Introduces a top-level `scion build <harness-config-name>` CLI command that builds a container image from a Dockerfile bundled in a harness-config directory. Supports --tag, --base-image, --push, --platform, and --dry-run flags. After a successful build, updates the harness-config's config.yaml image field to reference the built image. Also fixes Dockerfile content hashing: Dockerfiles were previously excluded from ComputeHarnessConfigRevision, so changes to them did not trigger re-sync to the Hub. Removes Dockerfile from the skipBasenames exclusion list. Extracts detectContainerRuntime() from pkg/hub/maintenance_executors.go into a shared pkg/runtime/container.go so both the new build command and the Hub executor can use it. * Address PR review: yaml.Node config update, error handling, dedup settings H1: Replace destructive yaml.Unmarshal/Marshal round-trip with targeted yaml.Node edit for config.yaml image update. Preserves comments, field order, and unknown fields. M1: Handle all os.Stat errors on Dockerfile, not just IsNotExist. M2: Load settings once instead of twice when both --base-image is unset and --push is set. L3: Remove extra blank line in maintenance_executors.go. * Guard against empty harness-config path and non-mapping YAML nodes Add empty-path check after FindHarnessConfigDir to prevent synthetic harness-configs (e.g. 'generic') from resolving Dockerfile against CWD. Verify yaml.MappingNode kind before manipulating doc.Content[0] to handle malformed config.yaml gracefully. --------- Co-authored-by: Scion Agent (harness-local-build-p1) <agent@scion.dev>
* fix: test-login hardening, agent CLI access, and UI nits - M1: Distinguish store.ErrNotFound from transient DB errors in GetUserByEmail — return 500 for unexpected failures instead of silently creating duplicate users. - L1: Add http.MaxBytesReader(4096) body size limit. - L2: Validate email contains "@" before processing. - L3: Track whether displayName was explicitly provided so the default (email) doesn't overwrite an existing user's display name. - Add harness-config subcommands to the agentAllowed map so agents can manage harness configs via the CLI. Fixes 3 (file-browser badge) and 4 (page title) were already resolved in the current codebase. * style: fix gofmt alignment in cli_mode.go * ci: retry after flaky TestPipeline_LogHandlerRegistered port conflict * test: update cli_mode tests to expect harness-config in agentAllowed --------- Co-authored-by: Scion Agent (dev-followup-pr) <agent@scion.dev>
…GoogleCloudPlatform#401) * feat: wire hub OTel metric recorders to Cloud Monitoring The hub's dbmetrics and dispatchmetrics recorders were created with NewDisabled() — all OTel instruments recorded to no-op sinks. This wires them to a real GCP Cloud Monitoring MeterProvider during server startup, lighting up ~20 existing instruments (pg LISTEN/NOTIFY latency, notifications published/delivered/dropped, subscriber lag, dispatch lifecycle, pool stats). New package pkg/observability/hubmetrics provides NewMeterProvider() which creates a PeriodicReader (60s) with the GCP metric exporter. Metric groups (db-notify, db-pool, dispatch, hub-auth, hub-gcp) can be independently disabled via env vars using OTel View Drop(). Graceful degradation: when GCPProjectID is empty or the exporter fails, the hub logs a warning and continues with disabled recorders — identical to the previous behavior. * test: clean up TestIsGroupDisabled table-driven test Use the table's 'want' field directly instead of re-deriving the expected value in the assertion body. * fix: add 10s timeout to hub MeterProvider shutdown Prevents indefinite hang if the GCP metric exporter is unresponsive during server shutdown. * feat: replace hub in-memory counters with OTel instruments (Phase 2) Add OTelMetricsRecorder and OTelGCPTokenMetrics that implement the existing MetricsRecorder and new GCPTokenMetricsRecorder interfaces using OTel instruments for Cloud Monitoring export. Both use a dual-write pattern — OTel instruments for cloud export plus embedded atomic structs for the /api/metrics JSON snapshot endpoint. New metrics: scion.hub.auth.*, scion.hub.registration.count, scion.hub.join.*, scion.hub.rotation.count, scion.hub.dispatch.*, scion.hub.brokers.connected, scion.hub.gcp.token.*, scion.hub.gcp.iam.duration. Closes #240 * fix: address Phase 2 review findings (M1, M2, M3) M1: Add operation attribute to RecordDispatch OTel instruments so dispatch failures can be broken down by operation type. M2: Expand hub-auth metric group to cover all broker lifecycle instruments (registration, join, rotation, brokers, dispatch) — not just scion.hub.auth.*. M3: Gate SetMetrics(otelMetrics) on broker auth being enabled, preventing /api/metrics from showing an all-zeros broker block when auth is disabled. * feat: add pipeline health gauge, export error counter, and structured logging Phase 3 of metrics-delivery: add observability to the agent-side telemetry pipeline so we can confirm it's working end-to-end in production. - Add scion.telemetry.pipeline.status gauge (Int64, value=1) that self-reports via the cloud exporter on a 60s ticker, confirming the pipeline is alive - Add scion.telemetry.export.errors counter with signal and error_type attributes, incrementing on metric/span/log export failures - Add classifyError() to bucket errors into timeout/auth/quota/other - Upgrade credential logging at pipeline startup from log.Info format strings to structured slog.Info with credentials_file, source, project_id, provider, and cloud_configured fields - Add structured slog.Warn when cloud export is not configured, including the env var and well-known path that were checked - Fix slog handler in pkg/sciontool/log to render key=value attributes instead of silently dropping them - Add pipeline_health_test.go with tests for health gauge lifecycle, nil-safe export error recording, and classifyError bucketing * fix: shut down unused TracerProvider and LoggerProvider in initSelfMetrics initSelfMetrics() creates providers via NewProviders() but only needs the MeterProvider. The TracerProvider and LoggerProvider were never shut down, leaking goroutines. Shut them down immediately after creation. * fix(metrics-delivery): address errcheck and staticcheck CI failures - Replace os.Setenv+cleanup with t.Setenv in hubmetrics tests - Wrap standalone os.Unsetenv calls with error checks - Handle Shutdown errors on TracerProvider, LoggerProvider, MeterProvider - Check pipeline.Stop error in test cleanup - Convert switch to tagged form (staticcheck QF1002) - Fix MeterProvider leak on early return in startHealthGauge --------- Co-authored-by: Scion Agent (metrics-phase1-dev) <agent@scion.dev>
…ogleCloudPlatform#405) * fix: suppress commentary (TypeAssistantReply) messages in Discord Add unconditional early return in broker.Publish() to filter TypeAssistantReply before per-channel logic, covering all channels including those with ShowAssistantReply=true from the buggy setup default. Fix the setup default to false for new channel links. Clean up now-dead per-channel isAssistantReply check. * style: run gofmt on skill-related files inherited from main Fix formatting issues in files that were merged from main without gofmt, causing CI Build & Test check to fail. * fix: move commentary filter before webhook routing, remove dead branch Move the TypeAssistantReply early return before webhook routing and message formatting to avoid computing text that gets immediately discarded. Remove the now-unreachable TypeAssistantReply branch from the useWebhook condition. --------- Co-authored-by: Scion Agent (discord-commentary-filter-dev) <agent@scion.dev>
There was a problem hiding this comment.
Code Review
This pull request introduces a 'NoAuth' mode allowing agents to start without credentials, dropping them directly to a shell with custom setup instructions. It updates harness configurations, schemas, Go structs, backend propagation logic, and the frontend UI to support this new authentication option. The review feedback highlights a critical logic issue in both pkg/runtime/common.go and pkg/runtime/k8s_runtime.go where dropping to a shell is bypassed if NoAuthMessage is empty; it is recommended to check NoAuth independently and default to launching a shell even when no message is provided.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if config.NoAuth && config.NoAuthMessage != "" { | ||
| harnessArgs = []string{"sh", "-c", fmt.Sprintf("printf '%%s\\n' %s; exec bash", shellQuote(config.NoAuthMessage))} | ||
| } else if config.Harness != nil { |
There was a problem hiding this comment.
If config.NoAuth is true but config.NoAuthMessage is empty (for example, if a harness config defines behavior: drop-to-shell but omits the optional message field), the condition config.NoAuth && config.NoAuthMessage != "" will evaluate to false. This causes the runtime to fall through and launch the harness CLI instead of dropping to a shell, which breaks the expected NoAuth behavior.
We should check config.NoAuth independently and drop to a shell, optionally printing the message if it is provided.
if config.NoAuth {
if config.NoAuthMessage != "" {
harnessArgs = []string{"sh", "-c", fmt.Sprintf("printf '%%s\\n' %s; exec bash", shellQuote(config.NoAuthMessage))}
} else {
harnessArgs = []string{"bash"}
}
} else if config.Harness != nil {| if config.NoAuth && config.NoAuthMessage != "" { | ||
| harnessArgs = []string{"sh", "-c", fmt.Sprintf("printf '%%s\\n' %s; exec bash", shellQuote(config.NoAuthMessage))} | ||
| } else if config.Harness != nil { |
There was a problem hiding this comment.
If config.NoAuth is true but config.NoAuthMessage is empty (for example, if a harness config defines behavior: drop-to-shell but omits the optional message field), the condition config.NoAuth && config.NoAuthMessage != "" will evaluate to false. This causes the runtime to fall through and launch the harness CLI instead of dropping to a shell, which breaks the expected NoAuth behavior.
We should check config.NoAuth independently and drop to a shell, optionally printing the message if it is provided.
| if config.NoAuth && config.NoAuthMessage != "" { | |
| harnessArgs = []string{"sh", "-c", fmt.Sprintf("printf '%%s\\n' %s; exec bash", shellQuote(config.NoAuthMessage))} | |
| } else if config.Harness != nil { | |
| if config.NoAuth { | |
| if config.NoAuthMessage != "" { | |
| harnessArgs = []string{"sh", "-c", fmt.Sprintf("printf '%%s\\n' %s; exec bash", shellQuote(config.NoAuthMessage))} | |
| } else { | |
| harnessArgs = []string{"bash"} | |
| } | |
| } else if config.Harness != nil { |
Add "No Authentication" option to the harness auth dropdown in both agent-create and agent-configure pages. When selected, the Hub maps harnessAuth="none" to NoAuth=true on the agent's AppliedConfig, which propagates through the broker dispatch chain to StartOptions. Add no_auth config section to all harness config.yaml files (claude, gemini, codex, opencode) with drop-to-shell behavior. When an agent starts with NoAuth=true and the harness config specifies drop-to-shell, the container prints the setup message and gives the user a shell instead of launching the harness CLI.
e36aa7d to
434e57a
Compare
Use t.Setenv() instead of os.Setenv() in pipeline_health_test.go to properly check error returns and follow Go test best practices.
Remove duplicate no_auth sections from all four harness config.yaml files (claude, gemini, codex, opencode) — the first occurrence between capabilities and auth is kept, the second at end-of-file is removed. Add the "field" property to the harnessAuthFileRequirement JSON schema definition so config validation accepts the new field attribute on required_files entries.
Summary
harnessAuth="none"toNoAuth=trueon agent'sAppliedConfigin Hub handlers, with full propagation through the broker dispatch chain (RemoteCreateAgentRequest→ brokerCreateAgentRequest→startContextInputs→StartOptions)no_authconfig section (withdrop-to-shellbehavior and setup message) to all four harness config.yaml files (claude, gemini, codex, opencode)HarnessNoAuthConfigstruct andNoAuthConfigfield toHarnessConfigEntry, plus JSON schema validationNoAuth=trueand harness config specifiesdrop-to-shell, the container prints the setup instructions and drops to a shell instead of launching the harness CLITest plan
go build ./...passesgofmtclean on all modified filesgolangci-lint run --new-from-rev=origin/main— 0 new issuesgo vetclean on all modified packagesno_authsection)pkg/agentandpkg/configconfirmed unrelated (fail onorigin/maintoo)