-
Notifications
You must be signed in to change notification settings - Fork 0
feat: move fga subroutine from account operator #107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ts for readiness, merges parent/owner tuples, and keeps accountInfo store IDs aligned
…and a requeue scenario with fakes
WalkthroughAdds OpenFGA integration and FGA config fields, implements a WorkspaceFGA subroutine for writing tuples, extends workspace initializer with cross-cluster Store and AccountInfo orchestration, updates authorization-model DSL and coreModule, and expands tests and mocks for new error and requeue paths. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…ine-from-account-operator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can understand the idea behind the changeset you provided. But I would need to ask you to change your approach due to the following reason:
If we write the owner and parent tuple via the store resource, it would be reconciled regularly, which is not the desired behaviour. Also, if for every account we add to the tuple list of the Store resource, for a growing number of Accounts we would quickly reach a limit in size of the Store resource.
So I would ask you to reimplement this PR and provide a reimplementation with the following changes in mind:
- The initial
creatorshould only be written once through the initializer. Also make sure, that everything works in case the organization is of typeorgvs typeaccount. - The
parenttuple can be reconciled regularly, but for now it would be sufficient to write it once, since we want to handle the reconciliation of other things (hr for realms, etc) within the scope of a separate ticket. - Use an FGA client to achieve the goal.
If there are any questions on the approach, please let me know!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (3)
internal/subroutine/workspace_initializer.go (2)
40-53: Avoid panic on CoreModule read; fail fast gracefullyPanic in constructor will crash the operator on misconfig or transient IO. Prefer returning an error (or validating at startup) and surfacing it via controller setup to fail fast with a clear message.
102-104: Hardcoded 5s timeout is tight; make configurableCross-cluster GETs and subsequent create/update may exceed 5s under load. Consider a configurable timeout (e.g., cfg.FGA/K8s timeouts) and reuse it consistently.
internal/config/config.go (1)
13-16: Add FGA client TLS and timeout settings to configConsider adding fields like fga-tls-enabled, fga-ca-file, fga-client-cert/key, and per-call dial/call timeouts. This lets cmd/initializer dial securely and bound operations, avoiding hardcoded insecure dials later.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
cmd/initializer.go(2 hunks)internal/config/config.go(1 hunks)internal/controller/initializer_controller.go(2 hunks)internal/subroutine/workspace_initializer.go(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
cmd/initializer.go (1)
internal/controller/initializer_controller.go (1)
NewLogicalClusterReconciler(30-42)
internal/controller/initializer_controller.go (2)
internal/config/config.go (1)
Config(11-44)internal/subroutine/workspace_initializer.go (1)
NewWorkspaceInitializer(37-54)
internal/subroutine/workspace_initializer.go (2)
internal/config/config.go (1)
Config(11-44)api/v1alpha1/store_types.go (1)
Store(44-50)
🔇 Additional comments (1)
internal/controller/initializer_controller.go (1)
30-35: LGTM: FGA client properly plumbedNo outdated call sites found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
cmd/initializer.go (1)
104-111: Fix undefined variable causing compilation failure.The variable
appCfgis undefined, causing the build to fail. Based on the usage pattern elsewhere in this function (lines 34, 100), it should beinitializerCfg.Apply this diff to fix the undefined variable:
- conn, err := grpc.NewClient(appCfg.FGA.Target, grpc.WithTransportCredentials(insecure.NewCredentials())) + conn, err := grpc.NewClient(initializerCfg.FGA.Target, grpc.WithTransportCredentials(insecure.NewCredentials()))Note: A previous review comment recommended using
grpc.DialContextwith a timeout andgrpc.WithBlock()instead ofgrpc.NewClientfor more reliable connection establishment in the initializer context. That suggestion remains valid and should be addressed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cmd/initializer.go(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cmd/initializer.go (1)
internal/controller/initializer_controller.go (1)
NewLogicalClusterReconciler(30-42)
🪛 GitHub Actions: ci
cmd/initializer.go
[error] 104-104: undefined: appCfg
🪛 GitHub Check: pipe / lint / lint
cmd/initializer.go
[failure] 113-113:
undefined: appCfg (typecheck)
[failure] 104-104:
undefined: appCfg
🪛 GitHub Check: pipe / testSource / test
cmd/initializer.go
[failure] 113-113:
undefined: appCfg
[failure] 104-104:
undefined: appCfg
🔇 Additional comments (1)
cmd/initializer.go (1)
9-9: LGTM: Imports support OpenFGA gRPC integration.The added imports are appropriate for establishing a gRPC connection to OpenFGA services.
Also applies to: 15-16
…h the colon and dotted service account forms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
internal/subroutine/workspace_initializer.go (1)
272-281: Creator validation fix looks goodNow blocks both colon and dotted service account formats. Resolves the earlier gap.
🧹 Nitpick comments (5)
cmd/initializer.go (1)
118-129: Optional: prefer a health/test RPC over manual connectivity loopThe Connect/GetState/WaitForStateChange loop works, but a small unary health RPC (if server exposes gRPC Health) or a cheap FGA call (e.g., ListStores) better validates end-to-end readiness and credentials.
internal/subroutine/workspace_initializer.go (4)
40-43: Improve panic message for core module readAdd path context to the panic for faster diagnostics.
Apply:
- data, err := os.ReadFile(coreModulePath) - if err != nil { - panic(err) - } + data, err := os.ReadFile(coreModulePath) + if err != nil { + panic(fmt.Errorf("read core module from %q: %w", coreModulePath, err)) + }
205-207: Use bounded context for annotation patchPatch currently uses the unbounded ctx. Prefer the existing 5s bounded ctx to avoid hangs on API stalls.
Apply:
- if err := w.updateAnnotations(ctx, clusterRef.GetClient(), lc, ownerWritten, parentWritten); err != nil { + if err := w.updateAnnotations(ctxWithTimeout, clusterRef.GetClient(), lc, ownerWritten, parentWritten); err != nil {
224-231: Minor: derive store name from the existing path without re-reading annotationsYou already have path; reuse it to compute the name for clarity.
Apply:
- return logicalcluster.Name(path), generateStoreName(lc), nil + return logicalcluster.Name(path), generateStoreNameFromPath(path), nil
263-270:formatUserSA branch is currently unreachableSince
validateCreatorrejects service accounts, the SA-specific normalization informatUserwon’t execute. Either drop the SA case or leave a comment explaining future intent.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
cmd/initializer.go(2 hunks)internal/controller/store_controller.go(1 hunks)internal/subroutine/tuples.go(1 hunks)internal/subroutine/workspace_initializer.go(3 hunks)
✅ Files skipped from review due to trivial changes (2)
- internal/controller/store_controller.go
- internal/subroutine/tuples.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-15T14:34:26.498Z
Learnt from: kekcleader
PR: platform-mesh/security-operator#107
File: cmd/initializer.go:0-0
Timestamp: 2025-10-15T14:34:26.498Z
Learning: In gRPC Go library, `grpc.DialContext` is deprecated in favor of `grpc.NewClient`. NewClient establishes connections lazily (on first RPC call) rather than blocking on dial, which is the preferred behavior for most use cases.
Applied to files:
cmd/initializer.go
🧬 Code graph analysis (2)
cmd/initializer.go (1)
internal/controller/initializer_controller.go (1)
NewLogicalClusterReconciler(30-42)
internal/subroutine/workspace_initializer.go (2)
internal/config/config.go (1)
Config(11-44)api/v1alpha1/store_types.go (1)
Store(44-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: pipe / dockerBuild / docker
🔇 Additional comments (1)
cmd/initializer.go (1)
136-139: Good wiring of OpenFGA client into the reconcilerClient creation and passing fga into NewLogicalClusterReconciler looks correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
cmd/initializer.go (2)
37-69: Consider improving error messages in certificate loading.The function works correctly but could provide better diagnostics:
- Line 52:
fs.ErrInvaliddoesn't indicate which file failed or whyAppendCertsFromPEMrejected it.- The function succeeds even if no certificates were actually loaded from the provided path.
Consider these improvements:
func loadCertPool(path string) (*x509.CertPool, error) { pool, _ := x509.SystemCertPool() if pool == nil { pool = x509.NewCertPool() } info, err := os.Stat(path) if err != nil { return nil, err } + loaded := 0 add := func(p string) error { pem, err := os.ReadFile(p) if err != nil { return err } if ok := pool.AppendCertsFromPEM(pem); !ok { - return fs.ErrInvalid + return fmt.Errorf("failed to parse certificates from %s", p) } + loaded++ return nil } if info.IsDir() { - return pool, filepath.WalkDir(path, func(p string, d fs.DirEntry, err error) error { + err := filepath.WalkDir(path, func(p string, d fs.DirEntry, err error) error { if err != nil { return err } if d.IsDir() { return nil } return add(p) }) + if err != nil { + return nil, err + } + if loaded == 0 { + return nil, fmt.Errorf("no valid certificates found in directory %s", path) + } + return pool, nil - }) } - return pool, add(path) + if err := add(path); err != nil { + return nil, err + } + return pool, nil }
184-208: Prefer lazy connection or health check over manual state loop.
grpc.NewClientestablishes connections lazily on the first RPC call. The eagerConnect()+ state-checking loop (lines 192-203) works but is non-idiomatic and adds complexity that the deprecatedgrpc.DialContext+grpc.WithBlock()provided out-of-the-box.Based on learnings
Option 1 (Recommended): Remove eager connection logic and rely on lazy connection behavior:
- ctxDial, cancel := context.WithTimeout(ctx, 10*time.Second) - defer cancel() conn, err := grpc.NewClient(initializerCfg.FGA.Target, dialOpts...) if err != nil { log.Error().Err(err).Msg("unable to create OpenFGA client connection") os.Exit(1) } - // Eagerly connect and wait until Ready or timeout to fail fast on bad endpoints. - conn.Connect() - ready := false - for { - s := conn.GetState() - if s == connectivity.Ready { - ready = true - break - } - if !conn.WaitForStateChange(ctxDial, s) { - break // context expired - } - } - if !ready { - log.Error().Msg("OpenFGA connection not ready before deadline") - os.Exit(1) - } defer func() { _ = conn.Close() }() fga := openfgav1.NewOpenFGAServiceClient(conn)The first FGA RPC will surface any connectivity issues. If you need fail-fast validation, consider Option 2.
Option 2: Replace state loop with a health check or no-op RPC:
+ ctxCheck, cancel := context.WithTimeout(ctx, 10*time.Second) + defer cancel() conn, err := grpc.NewClient(initializerCfg.FGA.Target, dialOpts...) if err != nil { log.Error().Err(err).Msg("unable to create OpenFGA client connection") os.Exit(1) } - // Eagerly connect and wait until Ready or timeout to fail fast on bad endpoints. - conn.Connect() - ready := false - for { - s := conn.GetState() - if s == connectivity.Ready { - ready = true - break - } - if !conn.WaitForStateChange(ctxDial, s) { - break // context expired - } - } - if !ready { - log.Error().Msg("OpenFGA connection not ready before deadline") - os.Exit(1) - } defer func() { _ = conn.Close() }() fga := openfgav1.NewOpenFGAServiceClient(conn) + + // Fail fast: validate connectivity with a lightweight RPC + if _, err := fga.ListStores(ctxCheck, &openfgav1.ListStoresRequest{PageSize: wrapperspb.Int32(1)}); err != nil { + log.Error().Err(err).Msg("OpenFGA connectivity check failed") + os.Exit(1) + }(Add
"google.golang.org/protobuf/types/known/wrapperspb"import if using Option 2.)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
cmd/initializer.go(3 hunks)internal/config/config.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/config/config.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-15T14:34:26.498Z
Learnt from: kekcleader
PR: platform-mesh/security-operator#107
File: cmd/initializer.go:0-0
Timestamp: 2025-10-15T14:34:26.498Z
Learning: In gRPC Go library, `grpc.DialContext` is deprecated in favor of `grpc.NewClient`. NewClient establishes connections lazily (on first RPC call) rather than blocking on dial, which is the preferred behavior for most use cases.
Applied to files:
cmd/initializer.go
🧬 Code graph analysis (1)
cmd/initializer.go (2)
internal/config/config.go (1)
Config(11-48)internal/controller/initializer_controller.go (1)
NewLogicalClusterReconciler(30-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: pipe / dockerBuild / docker
- GitHub Check: pipe / lint / lint
🔇 Additional comments (1)
cmd/initializer.go (1)
154-157: LGTM: FGA target validation and client wiring.Lines 154-157 correctly validate that
FGA.Targetis non-empty and fail fast, addressing previous review feedback. Lines 210 and 212 properly create the OpenFGA service client and wire it into the reconciler as expected.Also applies to: 210-210, 212-212
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
cmd/initializer.go (1)
37-69: LGTM: Certificate loading logic is correct.The function properly handles both file and directory paths, starts with the system cert pool, and returns appropriate errors. The directory walk adds all files, which will fail if non-PEM files are present (since
AppendCertsFromPEMreturns false for invalid PEM), providing clear error feedback.Optionally, enhance the function documentation to clarify the error behavior:
-// loadCertPool loads PEM certificates from a file or directory into a CertPool. +// loadCertPool loads PEM certificates from a file or directory into a CertPool. +// If path is a directory, all files are processed; non-PEM files will cause an error. +// Falls back to system cert pool if unavailable.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cmd/initializer.go(3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-15T14:34:26.521Z
Learnt from: kekcleader
PR: platform-mesh/security-operator#107
File: cmd/initializer.go:0-0
Timestamp: 2025-10-15T14:34:26.521Z
Learning: In gRPC Go library, `grpc.DialContext` is deprecated in favor of `grpc.NewClient`. NewClient establishes connections lazily (on first RPC call) rather than blocking on dial, which is the preferred behavior for most use cases.
Applied to files:
cmd/initializer.go
🧬 Code graph analysis (1)
cmd/initializer.go (2)
internal/config/config.go (1)
Config(11-48)internal/controller/initializer_controller.go (1)
NewLogicalClusterReconciler(30-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: pipe / lint / lint
- GitHub Check: pipe / dockerBuild / docker
🔇 Additional comments (6)
cmd/initializer.go (6)
4-10: LGTM: Import additions are necessary and well-organized.All added imports (context, x509, fs, filepath, time, openfgav1, grpc packages) are used in the OpenFGA client setup and supporting functions.
Also applies to: 14-14, 20-23
71-78: LGTM: Bearer token implementation enforces transport security.The
bearerTokentype correctly implementsPerRPCCredentialsand enforcesRequireTransportSecurity() → true, ensuring bearer tokens are only transmitted over TLS. This pairs well with the validation at lines 159-162 that rejects incompatible configurations.
154-162: LGTM: Validation logic provides fail-fast checks.The validations correctly enforce:
- Non-empty FGA target (lines 154-157)
- Incompatibility of bearer token with insecure transport (lines 159-162)
The second check prevents a runtime failure that would occur when
bearerToken.RequireTransportSecurity()rejects the call. Clear error messages guide users to correct configuration.
164-187: LGTM: TLS setup is secure by default with appropriate fallbacks.The transport credentials logic:
- Prefers TLS unless explicitly configured as insecure (lines 166-177)
- Supports custom CA bundles with graceful fallback to system roots (lines 169-176)
- Logs warnings on CA load failures, keeping users informed (line 174)
- Conditionally adds bearer token credentials when provided (lines 185-187)
This addresses the previous review concern about using insecure transport in production.
189-213: LGTM: Eager connection establishment enables fail-fast validation.The connection logic correctly:
- Uses
grpc.NewClient(appropriate per learnings—DialContextis deprecated)- Implements eager connection with
conn.Connect()and a readiness loop (lines 196-212) to fail fast on misconfigured endpoints at startup, rather than deferring failures to the first RPC call- Waits up to 10 seconds for the
Readystate, exiting with a clear error if the connection doesn't succeedThis intentional fail-fast approach improves startup reliability.
Based on learnings
215-217: LGTM: OpenFGA client creation and wiring are correct.The OpenFGA service client is properly created from the established connection (line 215) and passed to
NewLogicalClusterReconciler(line 217), matching the updated constructor signature shown in the relevant code snippets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first batch of review... Second batch will follow..
|
Before I spend too much time writing line by line comments again here is another list of changes that need to be done:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (1)
internal/subroutine/workspace_initializer.go (1)
191-205: Optional: Reduce duplication between path-parsing helpers.Both
generateStoreNameandgenerateStoreNameFromPathcontain identical logic for extracting the last path segment. Consider consolidating to improve maintainability.Apply this diff:
func generateStoreName(lc *kcpv1alpha1.LogicalCluster) string { if path, ok := lc.Annotations["kcp.io/path"]; ok { - pathElements := strings.Split(path, ":") - return pathElements[len(pathElements)-1] + return generateStoreNameFromPath(path) } return "" }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
internal/config/config.go(1 hunks)internal/subroutine/workspace_initializer.go(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/subroutine/workspace_initializer.go (2)
internal/config/config.go (1)
Config(11-44)api/v1alpha1/store_types.go (1)
Store(44-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: pipe / dockerBuild / docker
🔇 Additional comments (5)
internal/config/config.go (1)
13-16: LGTM! Clean FGA configuration additions.The three new FGA configuration fields follow the existing pattern with appropriate mapstructure tags and sensible defaults for OpenFGA relationship modeling.
internal/subroutine/workspace_initializer.go (4)
33-36: Startup panic is acceptable for required configuration.The panic on
ReadFilefailure is appropriate here sinceCoreModulePathis essential configuration that must be valid at startup. Failures surface immediately rather than during reconciliation.
61-70: LGTM! Lifecycle methods correctly delegate finalization.The comment clearly indicates that finalization is handled elsewhere, which aligns with the separation of concerns in the refactored design.
72-154: Verify FGA tuple writing is handled elsewhere.Previous review comments referenced
writeTuplecalls and creator validation logic that are no longer present in this file. Given the PR objectives state "Place FGA logic in a separate subroutine," please confirm that FGA tuple creation/management has been moved to a dedicated subroutine rather than removed entirely.Can you confirm where the FGA tuple writing logic now resides, or if it's planned for a follow-up change?
156-166: LGTM! AccountInfo update logic is clean.The
CreateOrUpdatepattern correctly ensures the AccountInfo contains the resolved StoreID, with appropriate error wrapping.
|
|
||
| if accountInfo.Spec.Account.Name == "" || accountInfo.Spec.Account.OriginClusterId == "" { | ||
| return ctrl.Result{RequeueAfter: 2 * time.Second}, nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may misunderstand it, but the account info resource is created at the end of this subroutine but we're trying to get it already here even with data which is set by account-operator. Does it work ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right! This is indeed a timing dependency. The WorkspaceFGA subroutine expects accountInfo.Spec.Account.Name and accountInfo.Spec.Account.OriginClusterId to be populated by account-operator, but we create the AccountInfo in WorkspaceInitializer without these fields.
Currently it works because:
- WorkspaceInitializer creates AccountInfo with empty Account fields
- account-operator watches AccountInfo and populates Account.Name and Account.OriginClusterId
- WorkspaceFGA checks these fields and requeues (2s) if they're empty
- Eventually account-operator fills them and WorkspaceFGA succeeds
However, this creates an implicit dependency on account-operator's timing. We should make this more explicit by either:
- Adding a status condition to AccountInfo that indicates when account-operator has finished populating it
- Or documenting this dependency clearly in comments
For now, the requeue mechanism handles this race condition, but it's worth discussing whether we want to make this more robust.
…ub.com:platform-mesh/security-operator into feat/move-fga-subroutine-from-account-operator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
internal/subroutine/workspace_initializer.go (5)
30-46: Constructor panics on missing core module file; confirm desired fail-fast behaviorIf CoreModulePath is misconfigured at runtime, the operator will crash. If that’s intended (startup validation), fine. Otherwise, consider returning a clear error earlier (config load) or logging and surfacing a structured operator error.
72-75: Add safe type assertion to avoid panic on unexpected instance typesAvoid a hard panic if another object slips through.
- lc := instance.(*kcpv1alpha1.LogicalCluster) + lc, ok := instance.(*kcpv1alpha1.LogicalCluster) + if !ok { + return ctrl.Result{}, errors.NewOperatorError( + fmt.Errorf("unexpected instance type %T", instance), false, false) + }
94-101: Bound the owner Account Get with a timeout (consistency and resilience)Other calls use 5s deadlines. Do the same here to avoid hangs on network stalls.
- if err := ownerClusterRef.GetClient().Get(ctx, client.ObjectKey{Name: lc.Spec.Owner.Name}, &account); err != nil { + ctxOwnerGet, cancelOwner := context.WithTimeout(ctx, 5*time.Second) + defer cancelOwner() + if err := ownerClusterRef.GetClient().Get(ctxOwnerGet, client.ObjectKey{Name: lc.Spec.Owner.Name}, &account); err != nil {
151-157: Requeue strategy: consider immediate requeue instead of fixed 2sStoreID usually populates very fast. Immediate requeue can reduce overall latency and controller idle time.
- if store.Status.StoreID == "" { - return ctrl.Result{RequeueAfter: 2 * time.Second}, nil - } + if store.Status.StoreID == "" { + return ctrl.Result{Requeue: true}, nil + }
219-225: Tighten generateStoreNameFromPath guardstrings.Split always returns at least one element; the len==0 branch is unreachable. Simplify.
-func generateStoreNameFromPath(path string) string { - pathElements := strings.Split(path, ":") - if len(pathElements) == 0 { - return "" - } - return pathElements[len(pathElements)-1] -} +func generateStoreNameFromPath(path string) string { + parts := strings.Split(path, ":") + return parts[len(parts)-1] +}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/subroutine/workspace_initializer.go(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/subroutine/workspace_initializer.go (2)
internal/config/config.go (1)
Config(11-44)api/v1alpha1/store_types.go (1)
Store(44-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: pipe / lint / lint
- GitHub Check: pipe / dockerBuild / docker
🔇 Additional comments (3)
internal/subroutine/workspace_initializer.go (3)
102-117: Good: per-operation timeouts; fresh context for store opsNice separation of deadlines for AccountInfo retrieval and store operations; addresses premature expiration risk.
Also applies to: 123-127
184-195: Org path branch: validated store name — looks goodPath presence is checked and store name is validated before returning.
197-208: Non‑org branch validations are soundChecks for org path, derived store name, and parent account readiness are appropriate.
Please confirm that downstream FGA subroutine uses these preconditions to gate tuple writes (to avoid partial state).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (3)
internal/subroutine/workspace_initializer_test.go (1)
17-51: Consider extracting test setup helper to reduce duplication.Both test functions share identical setup logic (temp file creation, mock initialization, config setup). Extract this into a helper function to improve maintainability.
Consider adding a test helper like:
func setupWorkspaceInitializerTest(t *testing.T) (*subroutine.WorkspaceInitializer, *mocks.MockManager, *mocks.MockClient, *mocks.MockCluster, *mocks.MockClient, func()) { tmpFile, err := os.CreateTemp("", "coreModule*.fga") assert.NoError(t, err) _, err = tmpFile.WriteString("model\n schema 1.1") assert.NoError(t, err) err = tmpFile.Close() assert.NoError(t, err) cleanup := func() { _ = os.Remove(tmpFile.Name()) } mgr := mocks.NewMockManager(t) orgsClient := mocks.NewMockClient(t) fga := mocks.NewMockOpenFGAServiceClient(t) wsCluster := mocks.NewMockCluster(t) wsClient := mocks.NewMockClient(t) cfg := config.Config{CoreModulePath: tmpFile.Name()} cfg.FGA.ObjectType = "core_platform-mesh_io_account" cfg.FGA.ParentRelation = "parent" cfg.FGA.CreatorRelation = "owner" sub := subroutine.NewWorkspaceInitializer(orgsClient, cfg, mgr, fga) return sub, mgr, orgsClient, wsCluster, wsClient, cleanup }Also applies to: 53-89
internal/subroutine/workspace_initializer.go (2)
77-77: Replace debug print statements with structured logging.The
Processmethod contains numerousfmt.Printfdebug statements. Use a structured logger (e.g., from controller-runtime) for proper observability and log level control.Consider passing a logger through the struct or obtaining it from the context:
// In the constructor or struct type workspaceInitializer struct { // ... existing fields log logr.Logger } // In Process method log := w.log.WithValues("workspace", lc.Name) log.V(1).Info("Workspace no longer initializing", "phase", lc.Status.Phase) log.V(1).Info("Successfully fetched account", "account", account.Name, "type", account.Spec.Type) // ... etcThis enables:
- Runtime log level control
- Structured log aggregation
- Consistent log formatting
- Production-grade observability
Also applies to: 99-99, 102-102, 105-105, 112-112, 119-119, 128-128, 133-133, 136-136, 139-139, 144-144, 149-149, 152-152, 161-161, 195-195, 198-198, 201-201, 212-212, 214-214, 219-219, 222-222, 224-224
107-139: Document AccountInfo creation race condition handling.The logic at lines 116-120 sets
Creatorduring AccountInfo creation to avoid a race with account-operator. This is subtle and important behavior that would benefit from inline documentation.Consider adding a comment:
accountInfo := &accountsv1alpha1.AccountInfo{ObjectMeta: metav1.ObjectMeta{Name: accountinfo.DefaultAccountInfoName}} op, err := controllerutil.CreateOrUpdate(ctxGetTimeout, workspaceClient, accountInfo, func() error { + // Set Creator immediately during creation to avoid race condition where account-operator + // might populate AccountInfo before we can set Creator, leading to lost creator information. + // This ensures creator ownership is captured at initialization time. if accountInfo.Spec.Creator == nil && account.Spec.Creator != nil { creatorValue := *account.Spec.Creator accountInfo.Spec.Creator = &creatorValue fmt.Printf("[DEBUG] Setting Creator to: %s during AccountInfo creation\n", creatorValue) } return nil })
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
internal/subroutine/workspace_initializer.go(2 hunks)internal/subroutine/workspace_initializer_test.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
internal/subroutine/workspace_initializer_test.go (6)
internal/subroutine/mocks/mock_Manager.go (1)
NewMockManager(827-837)internal/subroutine/mocks/mock_Client.go (1)
NewMockClient(835-845)internal/subroutine/mocks/mock_OpenFGAServiceClient.go (1)
NewMockOpenFGAServiceClient(1436-1446)internal/subroutine/mocks/mock_Cluster.go (1)
NewMockCluster(509-519)internal/config/config.go (1)
Config(11-44)internal/subroutine/workspace_initializer.go (1)
NewWorkspaceInitializer(30-47)
internal/subroutine/workspace_initializer.go (2)
internal/config/config.go (1)
Config(11-44)api/v1alpha1/store_types.go (1)
Store(44-50)
🪛 GitHub Check: pipe / lint / lint
internal/subroutine/workspace_initializer_test.go
[failure] 59-59:
Error return value of tmpFile.Close is not checked (errcheck)
[failure] 58-58:
Error return value of tmpFile.WriteString is not checked (errcheck)
[failure] 57-57:
Error return value of os.Remove is not checked (errcheck)
[failure] 23-23:
Error return value of tmpFile.Close is not checked (errcheck)
[failure] 22-22:
Error return value of tmpFile.WriteString is not checked (errcheck)
[failure] 21-21:
Error return value of os.Remove is not checked (errcheck)
internal/subroutine/workspace_initializer.go
[failure] 276-276:
func generateStoreNameFromPath is unused (unused)
[failure] 241-241:
func (*workspaceInitializer).resolveStoreTarget is unused (unused)
[failure] 228-228:
func (*workspaceInitializer).ensureAccountInfo is unused (unused)
[failure] 196-196:
ST1005: error strings should not be capitalized (staticcheck)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: pipe / dockerBuild / docker
…fixed apis_kcp_io_apibinding type; store crd definition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/subroutine/authorization_model_generation.go (1)
216-218: Critical: out-of-range slice when trimming group; can panic
group = resourceSchema.Spec.Group[len(longestRelationName)-50:]indexesgroupwith an offset computed from a longer string; iflen(longestRelationName)-50 > len(group), this panics.Use a safe budget based on the 50-char cap for
create_{group}_{plural}and trimgroupaccordingly:- if len(longestRelationName) > 50 { - group = resourceSchema.Spec.Group[len(longestRelationName)-50:] - } + if len(longestRelationName) > 50 { + // budget for group within the 50-char constraint: + const prefix = "create_" + budget := 50 - len(prefix) - 1 /* '_' */ - len(resourceSchema.Spec.Names.Plural) + if budget < 0 { + budget = 0 + } + if len(resourceSchema.Spec.Group) > budget { + group = resourceSchema.Spec.Group[len(resourceSchema.Spec.Group)-budget:] + } else { + group = resourceSchema.Spec.Group + } + }
♻️ Duplicate comments (3)
internal/subroutine/workspace_initializer.go (3)
52-59: Remove unused FGA fields from initializer (struct + ctor)
These fields aren’t used in this file after moving FGA logic to the dedicated subroutine; keep config in the FGA subroutine.type workspaceInitializer struct { orgsClient client.Client mgr mcmanager.Manager coreModule string - fga openfgav1.OpenFGAServiceClient - fgaObjectType string - fgaParentRel string - fgaCreatorRel string } func NewWorkspaceInitializer(orgsClient client.Client, cfg config.Config, mgr mcmanager.Manager, fga openfgav1.OpenFGAServiceClient) *workspaceInitializer { @@ - fga: fga, - fgaObjectType: cfg.FGA.ObjectType, - fgaParentRel: cfg.FGA.ParentRelation, - fgaCreatorRel: cfg.FGA.CreatorRelation,Also applies to: 30-46
230-268: Remove dead helpers (unused)
ensureAccountInfo,resolveStoreTarget, andgenerateStoreNameFromPathare no longer used.- func (w *workspaceInitializer) ensureAccountInfo(... - } - - func (w *workspaceInitializer) resolveStoreTarget(... - } - - func generateStoreNameFromPath(path string) string { ... }Also applies to: 278-284
196-199: Lowercase error string (“Store not ready”)Follow Go style for errors.
- return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("Store not ready"), true, false) + return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("store not ready"), true, false)
🧹 Nitpick comments (4)
internal/subroutine/authorization_model.go (1)
144-149: Reduce log noise; handle marshal error; avoid extra transform
- Don’t ignore marshal errors.
- Gate the verbose model dump behind debug or truncate it.
- Reuse desiredRaw if already computed.
Apply:
- // DEBUG: Log the model being written - modelJSON, _ := protojson.Marshal(authorizationModel) - modelDSL, err := language.TransformJSONStringToDSL(string(modelJSON)) - if err == nil && modelDSL != nil { - log.Info().Str("store", store.Name).Str("model", *modelDSL).Msg("Writing authorization model") - } + // Debug-only: log (truncated) DSL before write + if log.GetLevel() <= logger.DebugLevel { // or a project-appropriate check + if modelJSON, err := protojson.Marshal(authorizationModel); err == nil { + if modelDSL, derr := language.TransformJSONStringToDSL(string(modelJSON)); derr == nil && modelDSL != nil { + const max = 2_000 + dsl := *modelDSL + if len(dsl) > max { + dsl = dsl[:max] + "...(truncated)" + } + log.Debug().Str("store", store.Name).Str("model", dsl).Msg("writing authorization model") + } + } else { + log.Debug().Err(err).Msg("skip model dump: marshal failed") + } + }internal/subroutine/workspace_initializer.go (3)
75-77: Use project logger; avoid fmt.Printf and unify levelsReplace prints with the project logger (e.g., logger.LoadLoggerFromContext(ctx)) and structured fields. This keeps logs consistent and filterable.
- fmt.Printf("[DEBUG] Workspace phase=%s, ensuring resources remain consistent\n", lc.Status.Phase) + log := logger.LoadLoggerFromContext(ctx) + log.Debug().Str("phase", string(lc.Status.Phase)).Msg("ensuring resources remain consistent") ... - fmt.Printf("[ERROR] Failed to get account %s: %v\n", lc.Spec.Owner.Name, err) + log.Error().Err(err).Str("account", lc.Spec.Owner.Name).Msg("failed to get account")Also applies to: 92-104, 121-127, 139-160, 187-201, 202-227
128-134: Avoid brittle string match for CRD readinessUse meta.IsNoMatchError(err) instead of strings.Contains("no matches for kind").
- if strings.Contains(err.Error(), "no matches for kind") { + if meta.IsNoMatchError(err) { ... }
30-46: Constructor: avoid panic on coreModule readPrefer returning an OperatorError at setup or failing fast via caller logging; panic in library code complicates testing.
- data, err := os.ReadFile(coreModulePath) - if err != nil { - panic(err) - } + data, err := os.ReadFile(coreModulePath) + if err != nil { + // consider log.Fatal at wiring site or propagate via a factory that returns (init, error) + panic(fmt.Errorf("read core module: %w", err)) + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
data/coreModule.fga(1 hunks)internal/subroutine/authorization_model.go(1 hunks)internal/subroutine/authorization_model_generation.go(1 hunks)internal/subroutine/workspace_initializer.go(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/subroutine/workspace_initializer.go (2)
internal/config/config.go (1)
Config(11-44)api/v1alpha1/store_types.go (1)
Store(44-50)
🔇 Additional comments (1)
internal/subroutine/authorization_model_generation.go (1)
62-64: LGTM: explicit member/owner inheritanceThe added member/owner-from-parent relations align with the parent types in core model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (4)
internal/subroutine/workspace_initializer.go (3)
220-274: Remove unused helper methods.The helpers
ensureAccountInfo(Lines 220-231),resolveStoreTarget(Lines 233-258), andgenerateStoreNameFromPath(Lines 268-274) are defined but never called. These appear to be remnants from refactoring where the logic was inlined into theProcessmethod.Apply this diff to remove the dead code:
return ctrl.Result{RequeueAfter: time.Minute}, nil } -func (w *workspaceInitializer) ensureAccountInfo(ctx context.Context, workspaceClient client.Client, storeID string, creator *string) errors.OperatorError { - accountInfo := &accountsv1alpha1.AccountInfo{ObjectMeta: metav1.ObjectMeta{Name: accountinfo.DefaultAccountInfoName}} - _, err := controllerutil.CreateOrUpdate(ctx, workspaceClient, accountInfo, func() error { - accountInfo.Spec.FGA.Store.Id = storeID - accountInfo.Spec.Creator = creator - return nil - }) - if err != nil { - return errors.NewOperatorError(fmt.Errorf("unable to create/update accountInfo: %w", err), true, true) - } - return nil -} - -func (w *workspaceInitializer) resolveStoreTarget(lc *kcpv1alpha1.LogicalCluster, account accountsv1alpha1.Account, accountInfo *accountsv1alpha1.AccountInfo) (logicalcluster.Name, string, errors.OperatorError) { - if account.Spec.Type == accountsv1alpha1.AccountTypeOrg { - path, ok := lc.Annotations["kcp.io/path"] - if !ok { - return "", "", errors.NewOperatorError(fmt.Errorf("unable to get workspace path"), true, false) - } - storeName := generateStoreName(lc) - if storeName == "" { - return "", "", errors.NewOperatorError(fmt.Errorf("unable to generate store name from workspace path"), true, false) - } - return logicalcluster.Name(path), storeName, nil - } - - if accountInfo.Spec.Organization.Path == "" { - return "", "", errors.NewOperatorError(fmt.Errorf("organization path not yet set"), true, false) - } - storeName := generateStoreNameFromPath(accountInfo.Spec.Organization.Path) - if storeName == "" { - return "", "", errors.NewOperatorError(fmt.Errorf("unable to derive store name from organization path"), true, false) - } - if accountInfo.Spec.ParentAccount == nil || accountInfo.Spec.ParentAccount.Name == "" || accountInfo.Spec.ParentAccount.OriginClusterId == "" { - return "", "", errors.NewOperatorError(fmt.Errorf("parent account information not ready"), true, false) - } - - return logicalcluster.Name(accountInfo.Spec.Organization.Path), storeName, nil -} - func generateStoreName(lc *kcpv1alpha1.LogicalCluster) string { if path, ok := lc.Annotations["kcp.io/path"]; ok { pathElements := strings.Split(path, ":") return pathElements[len(pathElements)-1] } return "" } - -func generateStoreNameFromPath(path string) string { - pathElements := strings.Split(path, ":") - if len(pathElements) == 0 { - return "" - } - return pathElements[len(pathElements)-1] -}
30-47: Remove unused FGA fields from workspaceInitializer.The fields
fgaObjectType,fgaParentRel, andfgaCreatorRelare initialized in the constructor (Lines 43-45) but never used in any methods ofworkspaceInitializer. These fields are used inworkspace_fga.gowithin theworkspaceFGASubroutinestruct. The FGA logic has been correctly separated into its own subroutine.Apply this diff:
func NewWorkspaceInitializer(orgsClient client.Client, cfg config.Config, mgr mcmanager.Manager, fga openfgav1.OpenFGAServiceClient) *workspaceInitializer { coreModulePath := cfg.CoreModulePath data, err := os.ReadFile(coreModulePath) if err != nil { panic(err) } return &workspaceInitializer{ orgsClient: orgsClient, mgr: mgr, coreModule: string(data), - fga: fga, - fgaObjectType: cfg.FGA.ObjectType, - fgaParentRel: cfg.FGA.ParentRelation, - fgaCreatorRel: cfg.FGA.CreatorRelation, } } type workspaceInitializer struct { orgsClient client.Client mgr mcmanager.Manager coreModule string - fga openfgav1.OpenFGAServiceClient - fgaObjectType string - fgaParentRel string - fgaCreatorRel string }Also applies to: 56-58
189-192: Fix error string capitalization.Error strings should not be capitalized per Go style guidelines.
Apply this diff:
if store.Status.StoreID == "" { fmt.Printf("[DEBUG] Store not ready yet (StoreID empty), will retry reconcile\n") - return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("Store not ready"), true, false) + return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("store not ready"), true, false) }internal/subroutine/workspace_initializer_test.go (1)
18-22: Check error returns in test setup.The test setup calls
os.Remove,tmpFile.WriteString, andtmpFile.Closewithout checking their returned errors. While these are in test code, unchecked errors can mask setup failures and cause confusing test results.Apply this diff to check or explicitly ignore these errors:
// Create temp file for core module tmpFile, err := os.CreateTemp("", "coreModule*.fga") assert.NoError(t, err) - defer os.Remove(tmpFile.Name()) - tmpFile.WriteString("model\n schema 1.1") - tmpFile.Close() + defer func() { _ = os.Remove(tmpFile.Name()) }() + _, err = tmpFile.WriteString("model\n schema 1.1") + assert.NoError(t, err) + err = tmpFile.Close() + assert.NoError(t, err)
🧹 Nitpick comments (2)
internal/subroutine/workspace_fga_test.go (2)
300-332: Add LogicalCluster owner setup for consistency.The test sets up AccountInfo with a parent account but doesn't configure the LogicalCluster's owner fields. While the test may pass currently, this creates an inconsistent fixture that could mask bugs if the implementation changes to validate owner presence earlier in the flow.
Apply this diff to add the missing owner setup:
sub := subroutine.NewWorkspaceFGASubroutine(mgr, fga, "core_platform-mesh_io_account", "parent", "owner") lc := &kcpcorev1alpha1.LogicalCluster{} + lc.Spec.Owner = &kcpcorev1alpha1.LogicalClusterOwner{ + Cluster: logicalcluster.Name("ws-owner").String(), + Name: "acc", + } ctx := mccontext.WithCluster(context.Background(), "ws") res, opErr := sub.Process(ctx, lc)
360-363: Add LogicalCluster owner setup for test consistency.Similar to the previous test, this test is missing the LogicalCluster owner configuration, which could cause the test to fail or mask bugs if the implementation validates owner presence.
Apply this diff:
sub := subroutine.NewWorkspaceFGASubroutine(mgr, fga, "core_platform-mesh_io_account", "parent", "owner") lc := &kcpcorev1alpha1.LogicalCluster{} + lc.Spec.Owner = &kcpcorev1alpha1.LogicalClusterOwner{ + Cluster: logicalcluster.Name("ws-owner").String(), + Name: "acc", + } ctx := mccontext.WithCluster(context.Background(), "ws") _, opErr := sub.Process(ctx, lc)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
internal/subroutine/workspace_fga.go(1 hunks)internal/subroutine/workspace_fga_test.go(1 hunks)internal/subroutine/workspace_initializer.go(3 hunks)internal/subroutine/workspace_initializer_test.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
internal/subroutine/workspace_initializer_test.go (5)
internal/subroutine/mocks/mock_Manager.go (1)
NewMockManager(827-837)internal/subroutine/mocks/mock_Client.go (1)
NewMockClient(835-845)internal/subroutine/mocks/mock_OpenFGAServiceClient.go (1)
NewMockOpenFGAServiceClient(1436-1446)internal/config/config.go (1)
Config(11-44)internal/subroutine/workspace_initializer.go (1)
NewWorkspaceInitializer(30-47)
internal/subroutine/workspace_fga_test.go (6)
internal/subroutine/mocks/mock_Manager.go (1)
NewMockManager(827-837)internal/subroutine/mocks/mock_Cluster.go (1)
NewMockCluster(509-519)internal/subroutine/mocks/mock_Client.go (1)
NewMockClient(835-845)internal/subroutine/mocks/mock_OpenFGAServiceClient.go (1)
NewMockOpenFGAServiceClient(1436-1446)internal/subroutine/workspace_fga.go (1)
NewWorkspaceFGASubroutine(32-40)internal/subroutine/mocks/mock_StatusWriter.go (1)
NewMockStatusWriter(17-25)
internal/subroutine/workspace_initializer.go (2)
internal/config/config.go (1)
Config(11-44)api/v1alpha1/store_types.go (1)
Store(44-50)
🔇 Additional comments (8)
internal/subroutine/workspace_fga_test.go (1)
22-555: LGTM! Comprehensive test coverage.The test suite provides excellent coverage of the WorkspaceFGA subroutine:
- Requeue scenarios (missing/incomplete data)
- Successful tuple writes for parent and creator relationships
- Creator validation (service accounts, empty values, dotted formats)
- Error propagation from FGA writes and status updates
- Interface method implementations
The tests are well-structured with clear naming and appropriate use of mocks.
internal/subroutine/workspace_initializer_test.go (1)
16-47: Test correctly validates empty owner.cluster error.The test properly verifies that the WorkspaceInitializer returns an error when the LogicalCluster's owner.cluster field is empty, which aligns with the updated validation logic in the implementation.
internal/subroutine/workspace_fga.go (4)
24-40: LGTM! Clean struct and constructor.The struct and constructor are well-designed:
- All fields are used in the implementation
- Constructor parameters are clear and properly assigned
- No unused fields remain (orgsClient was correctly removed)
52-120: Process implementation is solid with proper error handling.The Process method correctly implements the FGA tuple write flow:
- Nil guard for FGA client (Line 53-55)
- Differentiated error handling for NotFound vs other errors (Lines 67-73)
- Write-once semantics for creator tuples using status flag (Line 91)
- Proper validation before tuple writes (Lines 74-76, 94-96)
- Status updates after successful writes (Lines 113-116)
122-136: writeTuple helper correctly handles duplicates.The helper properly:
- Detects duplicate writes using the FGA helpers library (Line 129)
- Returns nil for duplicates (idempotent behavior)
- Wraps other errors appropriately for operator retry logic (Line 133)
138-155: Creator formatting and validation logic is correct.The two-stage validation properly prevents service accounts from being used as creators:
formatUser(Lines 140-145) converts Kubernetes service account format (colon-separated) to dot-separated format for FGA compatibilityvalidateCreator(Lines 147-155) blocks both formats (colon and dot-separated service accounts)This prevents service accounts from bypassing the check by using either format.
internal/subroutine/workspace_initializer.go (2)
75-134: Owner validation and AccountInfo creation logic is correct.The implementation properly:
- Validates owner.cluster is not empty before proceeding (Lines 80-84)
- Fetches the account with proper NotFound handling (Lines 95-102)
- Creates AccountInfo with immediate Creator propagation to avoid races (Lines 108-116)
- Handles CRD availability issues gracefully (Lines 126-129)
The debug statements are numerous but useful for troubleshooting initialization flows.
135-140: Store ID not populated for non-org accounts—verify external population mechanism.The code returns early for non-org accounts (line 139) before any Store ID resolution. Store creation and ID population (line 199) only occurs for org accounts. The unused functions
ensureAccountInfoandresolveStoreTargetsuggest incomplete implementation.After the 1-minute requeue, this initializer won't populate
AccountInfo.Spec.FGA.Store.Idfor non-org accounts. However, workspace_fga.go requires this field (line 74) to proceed.Verify that either:
- account-operator or another component populates the parent org's Store ID, or
- Additional reconciliation logic exists that handles non-org Store ID resolution on subsequent runs
If neither is true, the unused helper functions should be integrated into the non-org flow to fetch and set the parent org's Store ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
internal/subroutine/workspace_initializer.go (2)
183-186: Fix error string capitalization.Error strings should not be capitalized per Go style guidelines (ST1005).
Apply this diff:
- return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("Store not ready"), true, false) + return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("store not ready"), true, false)
129-134: CRITICAL: Non-org accounts never get FGA Store ID, blocking all FGA operations.For non-org accounts, this early return leaves
AccountInfo.Spec.FGA.Store.Idempty. The workspace_fga subroutine (per past review context at workspace_fga.go:67) checks for empty Store.Id and returns early, preventing any FGA tuple writes for non-org accounts.Before returning, you must populate the Store ID from the parent org. Here's the fix pattern:
// Only create Store for org accounts during initialization // For account-type accounts, Store already exists in parent org if account.Spec.Type != accountsv1alpha1.AccountTypeOrg { fmt.Printf("[DEBUG] Account type is '%s', skipping Store creation (using parent org Store)\n", account.Spec.Type) + + // Non-org accounts must still have their parent org's Store ID populated + // Re-fetch AccountInfo to get Organization.Path set by account-operator + accountInfo := &accountsv1alpha1.AccountInfo{} + if err := workspaceClient.Get(ctx, client.ObjectKey{Name: accountinfo.DefaultAccountInfoName}, accountInfo); err != nil { + return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("unable to get accountInfo: %w", err), true, true) + } + + // Wait for account-operator to populate Organization.Path + if accountInfo.Spec.Organization.Path == "" { + fmt.Printf("[DEBUG] Organization path not set yet, requeuing\n") + return ctrl.Result{Requeue: true}, nil + } + + // Resolve parent org's Store + parentOrgPath := logicalcluster.Name(accountInfo.Spec.Organization.Path) + parentStoreName := accountInfo.Spec.Organization.Path[strings.LastIndex(accountInfo.Spec.Organization.Path, ":")+1:] + + ctxParentStore := mccontext.WithCluster(ctx, parentOrgPath.String()) + parentStore := &v1alpha1.Store{} + if err := w.orgsClient.Get(ctxParentStore, client.ObjectKey{Name: parentStoreName}, parentStore); err != nil { + if kerrors.IsNotFound(err) { + fmt.Printf("[DEBUG] Parent org Store not found yet, requeuing\n") + return ctrl.Result{Requeue: true}, nil + } + return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("unable to get parent store: %w", err), true, true) + } + + if parentStore.Status.StoreID == "" { + fmt.Printf("[DEBUG] Parent Store not ready yet, requeuing\n") + return ctrl.Result{Requeue: true}, nil + } + + // Update AccountInfo with parent Store ID + _, err := controllerutil.CreateOrUpdate(ctx, workspaceClient, accountInfo, func() error { + accountInfo.Spec.FGA.Store.Id = parentStore.Status.StoreID + if account.Spec.Creator != nil { + creatorValue := *account.Spec.Creator + accountInfo.Spec.Creator = &creatorValue + } + return nil + }) + if err != nil { + return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("unable to update accountInfo with parent store: %w", err), true, true) + } + fmt.Printf("[DEBUG] Updated non-org AccountInfo with parent Store ID: %s\n", parentStore.Status.StoreID) + return ctrl.Result{RequeueAfter: time.Minute}, nil }
🧹 Nitpick comments (1)
internal/subroutine/workspace_initializer.go (1)
70-71: Consider structured logging instead of fmt.Printf.The extensive debug logging using
fmt.Printfprovides good visibility but could be improved. Consider migrating to structured logging (e.g., usinglogr.Loggerfrom controller-runtime) for better production observability, log levels, and filtering.Example pattern:
// In struct type workspaceInitializer struct { orgsClient client.Client mgr mcmanager.Manager coreModule string fga openfgav1.OpenFGAServiceClient log logr.Logger // Add logger } // Usage w.log.V(1).Info("Successfully fetched account", "account", account.Name, "type", account.Spec.Type) w.log.Error(err, "Failed to get account", "account", lc.Spec.Owner.Name)Also applies to: 91-91, 94-94, 97-97, 100-100, 107-107, 116-116, 121-121, 124-124, 127-127, 132-132, 137-137, 140-140, 145-145, 149-149, 175-175, 184-184, 187-187, 190-190, 198-198, 200-200, 205-205, 208-208, 210-210
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/subroutine/workspace_initializer.go(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/subroutine/workspace_initializer.go (2)
internal/config/config.go (1)
Config(11-44)api/v1alpha1/store_types.go (1)
Store(44-50)
🔇 Additional comments (6)
internal/subroutine/workspace_initializer.go (6)
30-44: LGTM! Constructor properly wired for FGA integration.The constructor signature has been updated to accept the FGA client and correctly stores it in the initializer struct, aligning with the PR objective of moving FGA functionality into this operator.
48-53: LGTM! Struct properly streamlined.The struct has been simplified to remove unused FGA configuration fields while retaining the FGA client. This addresses the past review concern about unused fields.
99-127: LGTM! AccountInfo creation properly handles timing and races.The early creation of AccountInfo with immediate Creator propagation avoids race conditions with account-operator. The APIBinding readiness check gracefully handles CRD availability.
136-149: LGTM! Store resolution properly validates the generated name.The store name validation at lines 144-147 addresses the past review concern about empty store names being returned without error.
151-176: LGTM! Store creation now updates CoreModule on existing stores.The condition at line 165 properly updates CoreModule whenever it differs from the current config, addressing the past review concern about only setting it during creation.
189-208: LGTM! AccountInfo update properly handles Store ID and Creator propagation.The Creator value copy at lines 196-197 correctly avoids pointer sharing issues, and the null checking is appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (11)
internal/subroutine/workspace_fga_test.go (2)
39-40: Avoid brittle nanosecond equality in requeue assertions.Comparing int(RequeueAfter)==1 ties tests to an implementation detail (1ns). Assert “greater than zero” or compare to a named constant instead for stability.
- assert.Equal(t, 1, int(res.RequeueAfter), "Expected immediate requeue") + assert.Greater(t, res.RequeueAfter, time.Duration(0), "Expected immediate requeue")Add:
+ import "time"Also applies to: 63-65
97-113: Consider asserting exact tuple intents, not just call counts.Where feasible, match WriteRequest contents (User/Relation/Object) for each expected write to guard against regressions in tuple formatting or relation names, not only Times(n).
Also applies to: 239-254, 321-332
internal/subroutine/workspace_fga.go (3)
64-71: Use explicit duration for “immediate” requeue.
RequeueAfter: 1is 1ns. Make intent clear withtime.Nanosecond(or a named const), improving readability and avoiding accidental unit changes.+ "time" @@ - // AccountInfo not created yet by workspace_initializer, requeue immediately - return ctrl.Result{RequeueAfter: 1}, nil + // AccountInfo not created yet by workspace_initializer, requeue immediately + return ctrl.Result{RequeueAfter: time.Nanosecond}, nil @@ - return ctrl.Result{RequeueAfter: 1}, nil + return ctrl.Result{RequeueAfter: time.Nanosecond}, nil
111-115: Status update can conflict; prefer Status Patch with MergeFrom.A stale
Updatewill 409 on concurrent writers. Patch the Status to reduce conflicts and retries.- if err := workspaceClient.Status().Update(ctx, accountInfo); err != nil { + orig := accountInfo.DeepCopy() + if err := workspaceClient.Status().Patch(ctx, accountInfo, client.MergeFrom(orig)); err != nil { return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("unable to update accountInfo status: %w", err), true, true) }
121-135: Bound FGA writes with per-call timeout.Network stalls can hang reconciles. Wrap
w.fga.Writein a small timeout (e.g., 5s).+ "time" @@ func (w *workspaceFGASubroutine) writeTuple(ctx context.Context, storeID string, tuple *openfgav1.TupleKey) errors.OperatorError { - _, err := w.fga.Write(ctx, &openfgav1.WriteRequest{ + ctxWrite, cancel := context.WithTimeout(ctx, 5*time.Second) + defer cancel() + _, err := w.fga.Write(ctxWrite, &openfgav1.WriteRequest{ StoreId: storeID, Writes: &openfgav1.WriteRequestWrites{ TupleKeys: []*openfgav1.TupleKey{tuple}, }, })internal/subroutine/workspace_initializer.go (5)
90-93: Use RequeueAfter consistently; avoid deprecated/ambiguous Requeue.Switch to a small
RequeueAfterto match the rest of the code and keep timing explicit.- return ctrl.Result{Requeue: true}, nil + return ctrl.Result{RequeueAfter: time.Second}, nil
120-123: Use typed detection instead of string matching for CRD “no match”.String contains is brittle. Prefer
meta.IsNoMatchError(err).- if strings.Contains(err.Error(), "no matches for kind") { + if apimeta.IsNoMatchError(err) { fmt.Printf("[DEBUG] CRD not ready yet (no matches for kind), will retry reconcile\n") return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("apiBinding not ready: %w", err), true, false) }Add import:
+ apimeta "k8s.io/apimachinery/pkg/api/meta"
235-238: Don’t return an error for transient “store not ready”.Surface as a timed requeue rather than an error to reduce noise and align with earlier requeue paths.
- return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("store not ready"), true, false) + return ctrl.Result{RequeueAfter: time.Second}, nil
185-186: Avoid unconditional periodic requeues; let watches drive reconciles.Unconditionally requeuing every minute increases KCP load. Return
ctrl.Result{}after successful updates and rely on events.- return ctrl.Result{RequeueAfter: time.Minute}, nil + return ctrl.Result{}, nil @@ - return ctrl.Result{RequeueAfter: time.Minute}, nil + return ctrl.Result{}, nilAlso applies to: 262-264
70-71: Replace fmt.Printf with controller-runtime logger.Use
log := ctrl.LoggerFrom(ctx)and structured fields. This consolidates logs and respects verbosity.Also applies to: 91-99, 116-117, 124-126, 139-144, 152-167, 183-186, 189-199, 227-228, 239-243, 260-263
data/coreModule.fga (1)
13-20: Model aligns with tuple writes; keep relations stable across releases.The
role#assignee -> owner -> accountchain matches the subroutine. Consider documenting backward-compat constraints since changingowner/memberderivations is a breaking auth change.Also applies to: 36-44, 86-95
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
data/coreModule.fga(1 hunks)internal/subroutine/workspace_fga.go(1 hunks)internal/subroutine/workspace_fga_test.go(1 hunks)internal/subroutine/workspace_initializer.go(3 hunks)internal/subroutine/workspace_initializer_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/subroutine/workspace_initializer_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/subroutine/workspace_fga_test.go (6)
internal/subroutine/mocks/mock_Manager.go (1)
NewMockManager(827-837)internal/subroutine/mocks/mock_Cluster.go (1)
NewMockCluster(509-519)internal/subroutine/mocks/mock_Client.go (1)
NewMockClient(835-845)internal/subroutine/mocks/mock_OpenFGAServiceClient.go (1)
NewMockOpenFGAServiceClient(1436-1446)internal/subroutine/workspace_fga.go (1)
NewWorkspaceFGASubroutine(31-39)internal/subroutine/mocks/mock_StatusWriter.go (1)
NewMockStatusWriter(17-25)
internal/subroutine/workspace_initializer.go (2)
internal/config/config.go (1)
Config(11-44)api/v1alpha1/store_types.go (1)
Store(44-50)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (3)
go.mod(1 hunks)internal/controller/initializer_controller.go(2 hunks)internal/subroutine/authorization_model_generation.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- go.mod
🧰 Additional context used
🧬 Code graph analysis (1)
internal/controller/initializer_controller.go (3)
internal/config/config.go (1)
Config(11-44)internal/subroutine/workspace_initializer.go (1)
NewWorkspaceInitializer(30-44)internal/subroutine/workspace_fga.go (1)
NewWorkspaceFGASubroutine(31-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: pipe / dockerBuild / docker
- GitHub Check: pipe / lint / lint
- GitHub Check: pipe / testSource / test
🔇 Additional comments (4)
internal/controller/initializer_controller.go (3)
7-7: LGTM! Import is properly added.The OpenFGA proto import is necessary for the
OpenFGAServiceClienttype used in the controller wiring.
30-30: LGTM! Function signature correctly extended.The
fgaparameter addition follows the existing dependency injection pattern and enables FGA client wiring to the initialization subroutines.
34-35: LGTM! Initialization sequence correctly implements reviewer requirements.The changes properly wire the FGA client into the initialization flow:
- Line 34:
NewWorkspaceInitializernow receives the FGA client for store orchestration.- Line 35:
NewWorkspaceFGASubroutineis positioned after the initializer, satisfying the requirement that "FGA logic must be in a separate subroutine, executed after the store is initialized."The parameter passing matches the constructor signatures, and the FGA configuration fields (
ObjectType,ParentRelation,CreatorRelation) are sourced from config with appropriate defaults.internal/subroutine/authorization_model_generation.go (1)
67-68: LGTM! New status permission relations added correctly.The
statusUpdateandstatusPatchrelations are properly defined and correctly map to thememberprivilege level, granting status modification permissions to all members of the resource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/subroutine/authorization_model_generation.go (1)
214-221: Critical: possible slice out-of-bounds when truncating groupUsing len(longestRelationName)-50 as a start index into resourceSchema.Spec.Group can exceed group length and panic. Compute an allowed group length instead and clamp.
- longestRelationName := fmt.Sprintf("create_%s_%s", resourceSchema.Spec.Group, resourceSchema.Spec.Names.Plural) - - group := resourceSchema.Spec.Group - - if len(longestRelationName) > 50 { - group = resourceSchema.Spec.Group[len(longestRelationName)-50:] - } + const maxRelLen = 50 // TODO: document why 50 is required/desired + group := resourceSchema.Spec.Group + // relation = "create_" + group + "_" + plural + relFixed := len("create_") + 1 + len(resourceSchema.Spec.Names.Plural) + allowedGroupLen := maxRelLen - relFixed + if allowedGroupLen < 1 { + allowedGroupLen = 1 + } + if len(group) > allowedGroupLen { + group = group[len(group)-allowedGroupLen:] + }Also consider extracting maxRelLen into a named constant with a brief comment.
🧹 Nitpick comments (3)
internal/subroutine/workspace_initializer.go (3)
32-41: Constructor panics on missing CoreModule fileFail‑fast is okay, but consider returning an error or providing a Must‑variant constructor to make the behavior explicit.
66-71: Validate owner name tooYou guard owner.cluster, but a blank owner.name will cause a Get with empty name. Add a check and requeue early.
- if lc.Spec.Owner.Cluster == "" { + if lc.Spec.Owner.Cluster == "" { return ctrl.Result{}, errors.NewOperatorError( fmt.Errorf("spec.owner.cluster is empty for LogicalCluster %s", lc.Name), true, true) } + if lc.Spec.Owner.Name == "" { + return ctrl.Result{Requeue: true}, nil + }
89-105: Prefer typed check over substring for APIBinding readinessSubstring match on “no matches for kind” is brittle. Use meta.IsNoMatchError(err) (or unwrap) to detect missing CRDs/APIBindings.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
internal/controller/initializer_controller.go(2 hunks)internal/subroutine/authorization_model_generation.go(1 hunks)internal/subroutine/invite_test.go(1 hunks)internal/subroutine/workspace_fga.go(1 hunks)internal/subroutine/workspace_initializer.go(4 hunks)internal/subroutine/workspace_initializer_test.go(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- internal/subroutine/invite_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
- internal/subroutine/workspace_initializer_test.go
- internal/subroutine/workspace_fga.go
- internal/controller/initializer_controller.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/subroutine/workspace_initializer.go (1)
api/v1alpha1/store_types.go (1)
Store(44-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: pipe / dockerBuild / docker
- GitHub Check: pipe / lint / lint
🔇 Additional comments (5)
internal/subroutine/authorization_model_generation.go (2)
61-66: Parent-derived member/owner: confirm intended semanticsDuplicates are gone; good. This drops direct role-based assignment on the resource in favor of “from parent” only. Confirm this change is intentional and covered by tests.
64-65: New statusUpdate/statusPatch relationsLooks fine. Please confirm all callers and RBAC checks use these exact relation names.
internal/subroutine/workspace_initializer.go (3)
107-156: Non‑org path: parent Store propagation looks correctResolves parent store via organization path, waits for StoreID, and persists it to AccountInfo with Creator. This satisfies the subroutine ordering and avoids extra Store creation.
Please confirm this is the minimal KCP call path in steady state (one Get for AccountInfo, one Get for Store, one CreateOrUpdate).
171-196: Org path: Store creation/update flow LGTMCreateOrUpdate CoreModule, refresh for status, and gate on StoreID. Error text casing fixed.
198-213: AccountInfo update with StoreID/Creator LGTMValue copy of Creator avoids pointer aliasing; requeue-after is reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
internal/subroutine/store_test.go (1)
70-84: Remove duplicate test case.This test case is functionally identical to the existing test at lines 140-153 ("should fail if get store fails when verifying existing store"). Both tests verify the same scenario: GetStore fails when a Store has StoreID set in status. The only difference is the error message string, which doesn't affect the behavior being tested.
Consider removing this duplicate test case to reduce maintenance burden and improve test suite clarity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (4)
go.mod(1 hunks)internal/subroutine/authorization_model.go(1 hunks)internal/subroutine/store_test.go(1 hunks)internal/subroutine/workspace_authorization_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- internal/subroutine/authorization_model.go
- internal/subroutine/workspace_authorization_test.go
- go.mod
🧰 Additional context used
🧬 Code graph analysis (1)
internal/subroutine/store_test.go (2)
api/v1alpha1/store_types.go (2)
Store(44-50)StoreStatus(31-36)internal/subroutine/mocks/mock_OpenFGAServiceClient.go (1)
MockOpenFGAServiceClient(30-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: pipe / dockerBuild / docker
- GitHub Check: pipe / testSource / test
- GitHub Check: pipe / lint / lint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/subroutine/authorization_model_test.go(3 hunks)
🔇 Additional comments (2)
internal/subroutine/authorization_model_test.go (2)
225-225: Test fixture data is internally consistent.The changes in the
moduleFilesarray correctly mirror the updates inmergedModule(lines 76-77, 80-81), ensuring that theTransformModuleFilesToModeltest case validates the expected DSL transformation behavior.Also applies to: 237-238, 244-245
80-81: New status permissions are properly integrated with test coverage.The
statusPatchandstatusUpdatepermissions are consistently defined across the authorization model generation and multiple test scenarios. The pattern mirrors the existingpatchandupdatepermissions, indicating proper alignment with Kubernetes status subresource conventions.
…c via from parent
Move the FGA subroutine functionality from the Account Operator to the Security Operator
This PR depends on changes from the account-operator repository that add new fields to the
AccountInfoCRD:Creatorfield inAccountInfoSpecCreatorTupleWrittenfield inAccountInfoStatusCurrently, the security-operator uses a pseudo-version of account-operator: v0.5.5-0.20251017120838-b8c73d0b347e
Steps to update after pushing account-operator to GitHub:
This will automatically resolve to the latest commit hash and update the pseudo-version accordingly.
Summary by CodeRabbit
New Features
Enhancements
Bug Fixes
Tests