Refactor to modular architecture, fix bugs, and improve test infrastructure#217
Merged
Conversation
- Extract tenant_id from context in security events (was logging empty) - Remove broken testTx/NewServerWithTx/DB, add TestTransactionMiddleware - BroadcastRequest accepts both 'message' and 'event'+'payload' formats - DDL handler invalidates GraphQL schema alongside REST cache - Edge functions guide uses FLUXBASE_SERVICE_TOKEN (was FLUXBASE_API_KEY) - Clarify comments that keys are for external client use
Tenancy & FDW: - Rename TenantStore→TenantRepository, Storage→Repository across all call sites - Remove dead setupFDWLegacy (~80 lines), simplify SetupFDW to 3 params - Add 10 platform tables (without tenant_id) to FDW exclude list - Fix MFA service direct pool bypass (m.db.Pool()→m.db.Exec, 14 sites) Supabase removal: - Remove supabase/supabase-demo issuers from JWT validation - Clean ~30 Supabase references from comments, tests, and docs Extractions: - Extract ConnectWithRetry, EnsureDefaultTenantAndKeys, BackfillTenantIDToDefault from main.go (988→529 lines) into dedicated packages - Split Execute (390-line function) into orchestrator + execute_helpers.go Other: - Add exponential backoff to job retry (5s·2^n capped at ~5min) - Remove deprecated SignInAnonymous (handler, service, tests) - Stop clearing error_message on job retry - Document JWT scope bypass rationale and MCP/REST scope separation
… registry - Remove POST /api/v1/realtime/broadcast endpoint, handler, type, OpenAPI spec, and ScopeRealtimeBroadcast scope (clients should use WebSocket publish) - Replace 8-way method switch with router.Add()/router.All() in route registry - Cache RequireAuthOrServiceKey/OptionalAuthOrServiceKey middleware instances (was creating 24 duplicate handlers, now 2 cached on Server struct) - Remove testTx/DB() from server.go, update test helpers accordingly
Critical: - Fix job automatic retries: RequeueJob now handles running->pending transition (was requiring failed status, making backoff dead code) - Fix realtime cross-tenant message leakage: broadcasts, presence, and sync events now scoped by TenantID - Apply SSRF BlockedDomains to function/job/MCP executions (was only in test code; metadata endpoints were accessible) - Fix leader election: use dedicated connection for advisory locks (session-level locks were released immediately with pgxpool) High: - Fix race condition on Connection.Claims/Role reads (add GetRoleAndClaims) - Fix Prometheus metric cardinality: normalize slug/name path segments - Fix hasPendingMigrations always returning false - Fix service_role rate limiter creating fresh instance per request - Fix job function file operations bypassing tenant isolation (RLS) - Restrict --allow-read/--allow-write to /tmp (was full filesystem) Medium/Low: - Replace fmt.Sprintf JSON construction with json.Marshal (2 sites) - Remove dead code: execContext, SubscriptionFilter, NewConnectionFromPool - Pre-compile regexes in SanitizeSQL (was compiling on every call) - Fix panic on short refresh tokens in rate limiter - Fix stub handler returning 200 instead of 501 - Fix OOM detection for functions (was only for jobs)
- Job retry: verify RequeueJob (running->pending) and RequeueFailedJob (failed->pending) method signatures exist and are distinct - Realtime tenant isolation: verify BroadcastToChannel filters by tenantID, only matching-tenant connections receive messages, and BroadcastGlobal propagates tenantID through pubsub - SSRF protection: verify DefaultFunctionPermissions and DefaultJobPermissions include BlockedDomains with known SSRF targets (metadata endpoints, localhost) - Connection race safety: verify GetRoleAndClaims is concurrent-safe with concurrent UpdateAuth calls (no panics under -race) - Leader election: verify Start is idempotent, Stop is safe with nil dedicatedConn, and leadership state is properly cleared on stop
Critical: - Fix cookie JSON.parse crash in auth-store (try/catch + cleanup) - Consolidate triplicated auth refresh logic into single helper - Remove duplicate FluxbaseResponse type (response.ts deleted) Data fetching: - Branch selector rewritten with TanStack Query (was useEffect) - useTenant/useTenants rewritten with TanStack Query - Tenant ID added to React Query cache keys (prevents stale cross-tenant data) Error handling: - Extract getErrorMessage() helper (replaced 5 inline patterns) - Fix SDK fetch error wrapping to preserve original error message Architecture: - Document Axios vs SDK usage in client.ts JSDoc - Fix impersonation token race (read from store, not localStorage) - Add public getters to FluxbaseFetch (replaces bracket notation access) - Add auth loading spinner during async check Low priority: - Document sidebar/route coupling in sidebar-data.ts - Document dashboardAuthAPI raw axios usage - Add React hooks: useInvokeFunction, useSubmitJob, useJobStatus, useBranches - Remove deprecated SupabaseResponse/SupabaseAuthResponse aliases
- Remove Supabase alternative wording - Document JWT issuer (fluxbase-only), scope bypass, security events - Add new extracted files (retry.go, bootstrap_keys.go, backfill.go, execute_helpers.go) - Document SSRF protection (BlockedDomains applied to all executions) - Document realtime tenant isolation (broadcast + presence filtering) - Document leader election (dedicated connection, idempotent start) - Update admin UI auth (safe cookie parsing, consolidated refresh) - Update Prometheus normalizePath (handles slug-like segments)
HIGH: - Add per-connection WebSocket rate limiting (10 msgs/sec, burst 20) via golang.org/x/time/rate token bucket - Add subscription limit per WebSocket connection (max 100) - Fix rlsCache.cleanup() goroutine leak (accept context, add stop()) MEDIUM: - Fix FDW role name collision (use full UUID instead of 8 chars) - Fix partial FDW setup returning nil on failure (track failed schemas) - Fix bootstrap failure leaking orphaned database (DROP on failure) - Add array size limits on bulk operations (max 1000 items) - Fix scheduler namespace collisions (use namespace/name as cron key) - Fix path traversal in chunked upload (validate uploadID regex) Dead code removed: - Executor.ValidateMigration (never called) - normalizeSlug in tenant_loader.go (unused) - SyncFromPostgres in extensions/service.go (collected nothing) - @fluxbase:timeout/@fluxbase:memory annotation parsing (never applied)
MEDIUM: - M1: GetAppliedMigrations only swallows missing-table errors - M2: DDL wrapped in transactions in declarative schema - M3: Declarative apply uses shared pool (no connection exhaustion) - M6: BroadcastToChannel O(n) documented (tenant filtering limits scan) - M8: TenantExecutor uses advisory locks for concurrent migration safety - M12: Cron functions scheduled immediately on create/update - M13: Scheduled functions now receive settings secrets - M16: Tenant config env var parse errors logged as warnings - M18: Dead workers removed from slice (no unbounded growth) - M19: Restart count resets after 5 minutes of stability LOW: - L1: err.Error() string compare → errors.Is(err, pgx.ErrNoRows) - L2: bcrypt.DefaultCost → DefaultBcryptCost - L12: Secret names no longer logged via fmt.Printf - L15: Rate limiter panics → error returns - L17: err.Error() removed from API responses (generic messages) - L22: FailJob errors now logged - L23: Redundant inline role checks removed from monitoring - L24: context.Background() → request context in handlers - L31: Tracing version uses overridable var (ldflags) - L32: Custom itoa → strconv.Itoa - L33: Custom contains → strings.Contains
Goroutine leaks: - L19: slowQueryTracker cleanupLoop stopped on Connection.Close() - L20: endpointRateLimiter cleanup stopped on TriggerService.Stop() Data safety: - L21: perKeyLimiters map evicts idle entries (>1h) every 10m - L25: Idempotency check uses INSERT-first pattern (eliminates TOCTOU race) - L26: PresenceInfo.State map deep-copied on store Dead code: - L6: Removed ComputeDeduplicationKey, EnqueueJobWithDedup, ProgressToJSON, JSONToProgress from jobs/storage.go - L18: Extracted shared enrichJobWithETA to listener_helpers.go, documented duplication between Listener/ListenerPool Error responses: - M15: Migrated server.go to structured error helpers, removed remaining err.Error() leak in schema refresh handler
Foundation for modular server initialization: - Add Module interface with Name() and Init() methods - Add ServiceRegistry for type-keyed service lookup with GetService[T] - Add InitModules for sequential module initialization with error propagation - Extract 4 simplest modules: Email, Secrets, Webhook, Extensions - Modules coexist with existing init methods (no breaking changes) - Server.registry field holds the ServiceRegistry for future module use This is the first step of Phase 3. The old init methods for email, secrets, webhook, and extensions are replaced by module-based init while all other modules continue using the existing pattern.
…methods (Phase 3b) Moves initStorage, initLogging, initAuth into Module implementations. Also deletes initEmail, initWebhook, initSecrets (previously extracted). LoggingModule receives pubSub via struct field to work around GetService[T] limitation with interface types. Unused imports cleaned from server_init.go.
…e 3c) Extracts TenancyModule, SettingsModule, SchemaModule, FunctionsModule, JobsModule, RPCModule, AIModule, RealtimeModule, BranchingModule, GraphQLModule from server_init.go into dedicated module files. Adds PubSub field to ServiceRegistry (interface types can't be looked up via reflect-based GetService[T]). Deletes 579 lines from server_init.go. All modules now use dependency injection via the ServiceRegistry. Remaining: setupMCPServer, initMetrics, initBackgroundServices, setupMiddlewares, setupRoutes.
… 3d) Extracts MCPModule (setupMCPServer) and MetricsModule (initMetrics) into dedicated module files. Moves branchTenantResolver and branchFDWRepairer helper types to module_branching.go. Adds PubSub field to ServiceRegistry for interface-typed access. Registers AI chat handler and KB storage for cross-module lookup. server_init.go reduced from 632 to 369 lines. Remaining: initCore, initBackgroundServices, setupMiddlewares, setupRoutes (lifecycle/infrastructure, not business logic modules).
The cross-init dependency fields (emailManager, emailService, storageManager, storageService, loggingService, authService, captchaService, systemSettingsService, userMgmtService, invitationService, secretsStorage) were only written during init wiring but never read from the Server struct. Modules now pass dependencies directly to handlers. Removes 26 lines of dead code.
Add module system description to Internal Modules section. Update directory structure with module files. Document the 19-module dependency injection architecture.
… services, split files - Remove dead pubsub.GetGlobalPubSub/SetGlobalPubSub (defined but never called) - Inject rate limiter into functions.Handler via SetRateLimiter() instead of global ratelimit.GetGlobalStore() - Add RateLimiter field to ServiceRegistry (same pattern as PubSub) - Extract BackgroundServicesModule (starts realtime listener, schedulers, jobs manager, RPC scheduler via leader election) - Extract leaderElect() helper as standalone function - Split server_init.go into server_middlewares.go and server_routes.go - server_init.go reduced to 76 lines (only initCore remains) Net: -366 lines, 2 globals eliminated, 20 modules total.
- Reduce test pool sizes (MaxConns=3, MinConns=0, shorter idle/lifetime) - Add 50ms drain delay after Server.Shutdown() for connection release - Respect embedded_worker_count=0 to disable job workers in tests - Fix auth.set_tenant_id_from_client_key_or_context schema (already fixed in auth.sql but trigger function in DB was stale) - Increase devcontainer PostgreSQL max_connections to 400
…s during schema apply The declarative schema system sets search_path to include all schemas before applying SQL files. When a SECURITY DEFINER function with SET search_path=public references auth.client_keys, PostgreSQL normalizes it to just client_keys because auth is in the session search_path during CREATE FUNCTION. At runtime with search_path=public, the unqualified reference fails because public.client_keys doesn't exist. Fix: exclude the target schema from the session search_path during applySchemaDirect. This prevents normalization while still allowing cross-schema references.
Go E2E tests now reset and bootstrap their own database on every run, matching the Playwright UI test pattern. No external setup step needed. Changes: - Add resetTestDatabase() + bootstrapTestDatabase() to setup_test.go - resetTestDatabase drops all schemas, ensures users exist (as postgres) - bootstrapTestDatabase runs EnsureBootstrap() + declarative schema via Go - grantRLSTestPermissions adds missing schemas (ai, rpc, logging, branching) and roles (tenant_service, tenant_migration_role) using loop-based SQL - Default database changed from fluxbase_test to fluxbase_go_e2e - CI simplified from 6 steps to 3 (create DB + users + run tests) - Makefile E2E targets no longer depend on test-setup-db
- go fmt on 15 files (gofumpt formatting) - branch-selector: derive error state from query instead of setState in effect - SDK storage tests: add getBaseUrl() to MockFetch - SDK fetch tests: update 'Unknown error occurred' expectations to 'string error'
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Full-stack overhaul of Fluxbase internals: module-based architecture, 67 codebase findings fixed, frontend cleanup across 3 packages, and self-sufficient E2E test infrastructure.
Architecture: Module System
Replaced the monolithic server_init.go (1,211 lines) with a dependency-injection module system:
Bug Fixes (6 items)
Codebase Sweep
E2E Test Infrastructure
Files Changed
173 files, +4,997 / -4,354 lines
Test Results
Breaking Changes
None — all changes are backward compatible. Existing installations upgrade smoothly via declarative schema re-application on startup.
Not Included (Deferred)