-
Notifications
You must be signed in to change notification settings - Fork 1
[HYPERFLEET-264] Status reporter for GCP validator #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[HYPERFLEET-264] Status reporter for GCP validator #1
Conversation
WalkthroughAdds a new status-reporter subsystem that watches adapter result files, parses a JSON result contract, and updates a Kubernetes Job's status. Implements environment-driven configuration loading and validation, a Kubernetes client wrapper for Job status updates and pod/container inspection, a reporter orchestrator with polling, timeout and termination handling, a CLI entrypoint with graceful shutdown, container build artifacts (Dockerfile, Makefile), parser and result contract handling (size limits, UTF-8-safe truncation), and comprehensive unit tests across config, k8s, result, parser, reporter, and main packages. README expanded with documentation and adapter contract details. Sequence DiagramsequenceDiagram
participant Main
participant Config
participant Reporter
participant FS as Filesystem
participant Parser
participant K8sWrapper as K8s
participant KubeAPI
Main->>Config: Load() from env
Config-->>Main: Validated Config
Main->>Reporter: NewReporter(config)
Reporter->>K8sWrapper: NewClient(namespace, jobName)
K8sWrapper->>KubeAPI: in-cluster auth / connect
KubeAPI-->>K8sWrapper: client ready
Main->>Reporter: Run(ctx)
loop Polling
Reporter->>FS: Stat resultsPath
alt result file found
FS-->>Reporter: file bytes
Reporter->>Parser: Parse(data)
alt valid AdapterResult
Parser-->>Reporter: AdapterResult
Reporter->>K8sWrapper: UpdateJobStatus(condition)
K8sWrapper->>KubeAPI: PATCH Job status
KubeAPI-->>K8sWrapper: PATCH response
else parse error
Parser-->>Reporter: error
Reporter->>K8sWrapper: UpdateJobStatus(invalid-result)
end
Reporter-->>Main: Exit (done)
else no file and maxWait reached
Reporter->>K8sWrapper: GetAdapterContainerStatus(pod, container)
K8sWrapper->>KubeAPI: GET Pod
KubeAPI-->>K8sWrapper: Pod/container state
alt container terminated
Reporter->>K8sWrapper: UpdateJobStatus(terminated-reason)
else container running or unknown
Reporter->>K8sWrapper: UpdateJobStatus(timeout-reason)
end
Reporter-->>Main: Exit (timeout)
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45–90 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (7)
README.md (1)
210-221: Add a language identifier to the repository-structure code blockmarkdownlint (MD040) is flagging the fenced block starting at Line 210 because it lacks a language. Consider changing it to something like:
-``` +```text adapter-validation-gcp/ ...to keep linting clean. </blockquote></details> <details> <summary>status-reporter/Dockerfile (1)</summary><blockquote> `1-32`: **Align Dockerfile build flags with Makefile metadata (optional)** The multi‑stage build and distroless nonroot runtime look solid. To keep version/commit/buildDate consistent with `make binary`, consider reusing the same `-ldflags` here (or driving image builds via the Makefile) instead of hard‑coding `-ldflags="-w -s"`. </blockquote></details> <details> <summary>status-reporter/Makefile (1)</summary><blockquote> `1-171`: **Consider adding an `all` target and optionally shortening long target bodies** The Makefile is well‑structured and covers the usual workflows. To satisfy tools like checkmake and provide a conventional default: ```diff +.PHONY: all +all: verifyIf you care about the
maxbodylengthwarnings forlintandfmt, you could also move the shell bodies into small helper scripts or secondary targets, but that’s purely stylistic at this point.status-reporter/pkg/reporter/reporter_test.go (1)
399-402: Usefilepath.Joinfor path construction.String concatenation for path construction can lead to issues on different platforms and with edge cases.
BeforeEach(func() { tempDir = GinkgoT().TempDir() - resultsPath = tempDir + "/adapter-result.json" + resultsPath = filepath.Join(tempDir, "adapter-result.json") })You'll also need to add
"path/filepath"to the imports.status-reporter/pkg/k8s/client.go (1)
30-46: Consider validating required parameters.The constructor doesn't validate that
namespaceandjobNameare non-empty. While the Kubernetes API will return errors later, validating upfront provides clearer error messages.func NewClient(namespace, jobName string) (*Client, error) { + if namespace == "" { + return nil, fmt.Errorf("namespace is required") + } + if jobName == "" { + return nil, fmt.Errorf("jobName is required") + } + config, err := rest.InClusterConfig()status-reporter/pkg/config/config.go (1)
162-170: DuplicateValidationErrortype exists inresultpackage.There's a
ValidationErrortype in bothconfigandresultpackages (see relevant snippet fromresult/types.golines 68-71). While they're in different packages, this could lead to confusion. Consider whether these could share a common type in a shared package, or if the duplication is intentional for package isolation.status-reporter/pkg/reporter/reporter.go (1)
115-120: Usingtime.Sleepfor goroutine cleanup is fragile.The
time.Sleep(goroutineCleanupDelay)approach for waiting for the polling goroutine to exit is timing-dependent and could fail under high load. Consider using async.WaitGroupor a done channel that the goroutine signals when it exits.+import "sync" + func (r *StatusReporter) Run(ctx context.Context) error { // ... existing setup ... resultChan := make(chan *result.AdapterResult, 1) errorChan := make(chan error, 1) done := make(chan struct{}) + var wg sync.WaitGroup + wg.Add(1) - go r.pollForResults(timeoutCtx, resultChan, errorChan, done) + go func() { + defer wg.Done() + r.pollForResults(timeoutCtx, resultChan, errorChan, done) + }() // ... select block ... close(done) cancel() - time.Sleep(goroutineCleanupDelay) + wg.Wait() return reportErr }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
status-reporter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (20)
README.md(1 hunks)status-reporter/Dockerfile(1 hunks)status-reporter/Makefile(1 hunks)status-reporter/cmd/reporter/main.go(1 hunks)status-reporter/go.mod(1 hunks)status-reporter/pkg/config/config.go(1 hunks)status-reporter/pkg/config/config_suite_test.go(1 hunks)status-reporter/pkg/config/config_test.go(1 hunks)status-reporter/pkg/k8s/client.go(1 hunks)status-reporter/pkg/k8s/client_test.go(1 hunks)status-reporter/pkg/k8s/k8s_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter.go(1 hunks)status-reporter/pkg/reporter/reporter_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter_test.go(1 hunks)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go(1 hunks)status-reporter/pkg/result/parser.go(1 hunks)status-reporter/pkg/result/parser_test.go(1 hunks)status-reporter/pkg/result/result_suite_test.go(1 hunks)status-reporter/pkg/result/types.go(1 hunks)status-reporter/pkg/result/types_test.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (9)
status-reporter/pkg/result/types_test.go (2)
status-reporter/pkg/result/types.go (6)
AdapterResult(21-33)StatusSuccess(10-10)StatusFailure(11-11)DefaultReason(13-13)DefaultMessage(14-14)ValidationError(69-72)status-reporter/pkg/config/config.go (1)
ValidationError(163-166)
status-reporter/pkg/k8s/client_test.go (1)
status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/config/config_test.go (1)
status-reporter/pkg/config/config.go (2)
Load(48-96)Config(13-23)
status-reporter/pkg/result/parser_test.go (2)
status-reporter/pkg/result/parser.go (2)
Parser(15-15)NewParser(18-20)status-reporter/pkg/result/types.go (4)
StatusSuccess(10-10)StatusFailure(11-11)DefaultReason(13-13)DefaultMessage(14-14)
status-reporter/pkg/result/parser.go (1)
status-reporter/pkg/result/types.go (1)
AdapterResult(21-33)
status-reporter/cmd/reporter/main.go (2)
status-reporter/pkg/config/config.go (2)
Load(48-96)Config(13-23)status-reporter/pkg/reporter/reporter.go (1)
NewReporter(54-65)
status-reporter/pkg/result/types.go (1)
status-reporter/pkg/config/config.go (1)
ValidationError(163-166)
status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (1)
status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/config/config.go (2)
status-reporter/pkg/reporter/reporter.go (1)
DefaultConditionType(18-18)status-reporter/pkg/result/types.go (1)
ValidationError(69-72)
🪛 checkmake (0.2.2)
status-reporter/Makefile
[warning] 85-85: Target body for "lint" exceeds allowed length of 5 (8).
(maxbodylength)
[warning] 96-96: Target body for "fmt" exceeds allowed length of 5 (6).
(maxbodylength)
[warning] 170-170: Missing required phony target "all"
(minphony)
🪛 markdownlint-cli2 (0.18.1)
README.md
210-210: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (23)
README.md (1)
5-205: Status reporter documentation is clear and adapter-focusedThe overview, adapter contract, and status examples form a coherent spec for adapter authors; the JSON schema, validation rules, and Job condition outputs look self‑consistent and aligned with the described status‑reporter behavior.
status-reporter/pkg/k8s/k8s_suite_test.go (1)
1-13: Suite bootstrap for k8s tests looks standardThe
TestK8sentrypoint cleanly wires the suite and matches the pattern used in your other packages; nothing to change here.status-reporter/pkg/result/result_suite_test.go (1)
1-13: Result suite bootstrap is consistent and correctThe
TestResultfunction provides the expected Ginkgo entrypoint for the result package and is consistent with the other suites.status-reporter/pkg/k8s/client_test.go (1)
1-42: JobCondition tests cover key struct behaviorsThe tests validate both a fully populated
JobConditionand the zeroLastTransitionTimecase, which is a good minimal sanity check for this type; assertions look correct.status-reporter/pkg/config/config_suite_test.go (1)
1-13: Config suite bootstrap is wired correctly
TestConfigregisters the fail handler and runs specs under a clear suite name; this matches the other packages and looks good.status-reporter/go.mod (1)
36-37: Double-check thego.yaml.in/yamlmodule pathsThe indirect requirements
go.yaml.in/yaml/v2 v2.4.2 // indirect go.yaml.in/yaml/v3 v3.0.4 // indirectlook unusual and may be a typo of the standard
gopkg.in/yaml.v2/gopkg.in/yaml.v3modules (you already havegopkg.in/yaml.v3listed separately at Line 50). If these paths are incorrect,go mod tidy/go buildwill fail to resolve them. Please verify and adjust or drop them as appropriate.status-reporter/cmd/reporter/main.go (1)
86-102: LGTM!The configuration logging is clear and helpful for debugging, with appropriate handling of the auto-detect case for the adapter container name.
status-reporter/pkg/reporter/reporter_suite_test.go (1)
1-13: LGTM!Standard Ginkgo/Gomega test suite bootstrap following best practices.
status-reporter/pkg/result/types.go (2)
9-33: LGTM!The constants and struct definition are well-designed. Using
json.RawMessagefor Details is the right choice for preserving arbitrary JSON data.
68-76: LGTM!The ValidationError implementation is straightforward and follows Go conventions.
status-reporter/pkg/config/config_test.go (1)
1-271: LGTM!Comprehensive test coverage for configuration loading and validation. Tests cover all important cases including required fields, defaults, validation constraints, and error conditions.
status-reporter/pkg/result/types_test.go (1)
1-198: LGTM!Comprehensive test coverage for AdapterResult validation, JSON marshaling, and error formatting. The tests cover all important edge cases including length limits, whitespace handling, and nested JSON details.
status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (1)
1-35: LGTM!The mock implementation is clean and suitable for testing. The injectable function fields provide good flexibility for test scenarios.
status-reporter/pkg/result/parser_test.go (1)
1-153: LGTM!Comprehensive parser tests covering all important cases including file size limits, invalid JSON, validation errors, and default value application.
status-reporter/pkg/result/parser.go (1)
1-53: LGTM!The parser implementation is clean and includes appropriate safety checks:
- 1MB file size limit prevents memory exhaustion
- Empty file detection
- Clear error messages with context
status-reporter/pkg/reporter/reporter_test.go (1)
1-647: Comprehensive test coverage with good structure.The test suite covers all the key paths including success/failure scenarios, timeout handling, error propagation, and edge cases like empty files and invalid JSON. The use of
GinkgoT().TempDir()ensures proper cleanup.status-reporter/pkg/k8s/client.go (2)
58-97: Well-implemented retry logic with proper conflict handling.The
UpdateJobStatusmethod correctly usesretry.RetryOnConflictfor optimistic concurrency control, properly handles the NotFound case by breaking out of retry, and correctly updates or appends conditions.
116-129: Verify Go version requirement for loop variable safety.Taking the address of
csinside the range loop (lines 119 and 127) depends on the Go version:
- Go 1.22+: Safe — loop variables have per-iteration scope
- Go < 1.22: Unsafe — all iterations share the same variable, so the returned pointer would reference the last element
Confirm the
go.modfile specifiesgo 1.22or later to ensure this pattern is safe.status-reporter/pkg/config/config.go (2)
117-150: The..check has false positives but is acceptable as defense-in-depth.The check
strings.Contains(c.ResultsPath, "..")on line 119 could trigger false positives for paths likefoo..bar.json. However, sincefilepath.Cleanis called afterward and the path must be under/results, this additional check is a reasonable defense-in-depth measure against traversal attacks.
48-96: Clean configuration loading with proper validation flow.The
Loadfunction follows a good pattern: required variables first with early return on error, optional variables with sensible defaults, then validation. Error handling is consistent throughout.status-reporter/pkg/reporter/reporter.go (3)
122-169: Well-structured polling with proper channel handling.The
pollForResultsfunction correctly handles all exit paths (done signal, context cancellation, error, success) with non-blocking sends to channels. The ticker cleanup viadeferis appropriate.
214-242: Good use of separate timeout context for container status check.Using
context.Background()withcontainerStatusCheckTimeouton line 219 is the right approach since the parent context may have already timed out. This ensures the container status check can complete and provide more specific error information.
244-274: Comprehensive handling of container termination states.The function correctly distinguishes between OOMKilled, non-zero exit codes, and the edge case of exit code 0 without a result file (treated as a crash). The error messages provide useful debugging information.
bfb02ec to
ae37adf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
status-reporter/cmd/reporter/main.go (1)
44-75: Signal shutdown path ignores reporter errors and always exits 0.In the SIGTERM/SIGINT branch you cancel
ctxand wait fordoneChanor the shutdown timer, but you never read fromerrChan. Ifrep.Run(ctx)returns an error during shutdown (e.g., failed status update), the process still exits with status 0, which can hide real failures from Kubernetes/Job controllers.If you want exit codes to consistently reflect reporter success/failure, consider reading from
errChanwhendoneChanis closed in this branch as well (mirroring the non‑signal path) and exiting with 1 on error.
🧹 Nitpick comments (5)
status-reporter/pkg/errors/validation.go (1)
3-11: ValidationError implementation looks good; consider package naming.The
ValidationErrortype and itsError()method are straightforward and usable across config/result validation. One minor thought: using package nameerrorscan be confusing alongside the stdliberrorspackage and forces callers to alias one of them; consider a more specific name (e.g.,valerrorsor similar) if it ever becomes annoying in callers.status-reporter/Makefile (1)
84-96: Optional: add analltarget and keep make checkers happy.Tools like
checkmakeare flagging the missingalltarget and long bodies forlint/fmt. If you care about those checks, you could add a simple aggregator:.PHONY: all all: verifyand, if desired, factor helper recipes for
lint/fmtto shorten their bodies. Not required for functionality, just to quiet static analysis.Also applies to: 170-171
status-reporter/pkg/result/parser.go (1)
9-12: Size limit is enforced after full read; consider a streaming read if you worry about huge files.
ParseFilereads the whole file withos.ReadFileand only then enforcesmaxResultFileSize. That’s functionally fine, but it doesn’t actually prevent a single large read from consuming memory once.If you anticipate untrusted or accidentally huge result files, consider switching to an
os.Open+io.LimitReader(or similar) approach so you never read more thanmaxResultFileSize+εbytes into memory before failing. Otherwise the current approach is OK, just keep in mind the comment slightly overstates the protection.Also applies to: 22-38, 40-52
status-reporter/pkg/reporter/reporter.go (2)
89-122: Potential race between timeout and result delivery inRun.Because
Rundoes a singleselectoverresultChan,errorChan, andtimeoutCtx.Done(), if the adapter result arrives very close tomaxWaitTime, bothresultChanandtimeoutCtx.Done()can be ready and the timeout case may win. That would incorrectly report a timeout even though a valid result was parsed.If you care about that edge, consider restructuring to give precedence to an already‑arrived result/error over the timeout, e.g.:
- Use a small loop that first drains
resultChan/errorChannon‑blocking when the timeout fires, or- Drive everything off a single context but have
pollForResultsclose a dedicated “finished” channel and letRunread the result/error after noticing completion, instead of racing it against the timeout case.The current behavior is acceptable for most scenarios but may occasionally misclassify in near‑deadline situations.
219-246: Defensively handle nilContainerStatusfromGetAdapterContainerStatus.
UpdateFromTimeoutassumesGetAdapterContainerStatusreturns a non‑nil*corev1.ContainerStatuswhenerr == nil:containerStatus, err := r.k8sClient.GetAdapterContainerStatus(...) if err != nil { ... } else if containerStatus.State.Terminated != nil { ... }If an implementation ever returns
(nil, nil)(e.g., “pod not found” encoded as no error + nil status), this will panic oncontainerStatus.State. It’s unlikely but cheap to guard against:-} else if containerStatus.State.Terminated != nil { +} else if containerStatus != nil && containerStatus.State.Terminated != nil {This keeps the reporter robust against slightly looser client implementations.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
status-reporter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (21)
README.md(1 hunks)status-reporter/Dockerfile(1 hunks)status-reporter/Makefile(1 hunks)status-reporter/cmd/reporter/main.go(1 hunks)status-reporter/go.mod(1 hunks)status-reporter/pkg/config/config.go(1 hunks)status-reporter/pkg/config/config_suite_test.go(1 hunks)status-reporter/pkg/config/config_test.go(1 hunks)status-reporter/pkg/errors/validation.go(1 hunks)status-reporter/pkg/k8s/client.go(1 hunks)status-reporter/pkg/k8s/client_test.go(1 hunks)status-reporter/pkg/k8s/k8s_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter.go(1 hunks)status-reporter/pkg/reporter/reporter_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter_test.go(1 hunks)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go(1 hunks)status-reporter/pkg/result/parser.go(1 hunks)status-reporter/pkg/result/parser_test.go(1 hunks)status-reporter/pkg/result/result_suite_test.go(1 hunks)status-reporter/pkg/result/types.go(1 hunks)status-reporter/pkg/result/types_test.go(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- status-reporter/pkg/result/types_test.go
- README.md
🚧 Files skipped from review as they are similar to previous changes (12)
- status-reporter/pkg/config/config_suite_test.go
- status-reporter/pkg/result/result_suite_test.go
- status-reporter/pkg/result/parser_test.go
- status-reporter/Dockerfile
- status-reporter/pkg/k8s/client_test.go
- status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go
- status-reporter/pkg/k8s/k8s_suite_test.go
- status-reporter/go.mod
- status-reporter/pkg/config/config_test.go
- status-reporter/pkg/k8s/client.go
- status-reporter/pkg/result/types.go
- status-reporter/pkg/reporter/reporter_suite_test.go
🧰 Additional context used
🧬 Code graph analysis (4)
status-reporter/pkg/reporter/reporter_test.go (5)
status-reporter/pkg/reporter/reporter.go (10)
StatusReporter(42-51)NewReporterWithClient(68-70)DefaultConditionType(19-19)ConditionStatusTrue(21-21)ConditionStatusFalse(22-22)ReasonAdapterCrashed(24-24)ReasonAdapterOOMKilled(25-25)ReasonAdapterExitedWithError(26-26)ReasonAdapterTimeout(27-27)ReasonInvalidResultFormat(28-28)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (2)
MockK8sClient(12-16)NewMockK8sClient(18-20)status-reporter/pkg/config/config.go (1)
DefaultConditionType(31-31)status-reporter/pkg/result/types.go (3)
AdapterResult(24-36)StatusSuccess(13-13)StatusFailure(14-14)status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/result/parser.go (1)
status-reporter/pkg/result/types.go (1)
AdapterResult(24-36)
status-reporter/cmd/reporter/main.go (2)
status-reporter/pkg/config/config.go (2)
Load(50-98)Config(15-25)status-reporter/pkg/reporter/reporter.go (1)
NewReporter(54-65)
status-reporter/pkg/config/config.go (2)
status-reporter/pkg/reporter/reporter.go (1)
DefaultConditionType(19-19)status-reporter/pkg/errors/validation.go (1)
ValidationError(4-7)
🪛 checkmake (0.2.2)
status-reporter/Makefile
[warning] 85-85: Target body for "lint" exceeds allowed length of 5 (8).
(maxbodylength)
[warning] 96-96: Target body for "fmt" exceeds allowed length of 5 (6).
(maxbodylength)
[warning] 170-170: Missing required phony target "all"
(minphony)
🔇 Additional comments (3)
status-reporter/Makefile (1)
19-26: Verify that ldflags targets (version/commit/buildDate/tag) exist inmain.The
LDFLAGSinjectmain.version,main.commit,main.buildDate, and optionallymain.tag. This works well if those are defined asstringvars in themainpackage; otherwisego buildwill fail at link time.Please double‑check that you have something like:
// in cmd/reporter (same main package) var ( version string commit string buildDate string tag string )If these vars don’t exist yet, adding them (or dropping the corresponding
-Xflags) will avoid build issues.Also applies to: 115-115
status-reporter/pkg/config/config.go (1)
49-98: Config loading/validation flow is clean and predictable.The
Load→Validatepipeline withValidationErrorfor env/field issues, plusGetPollInterval/GetMaxWaitTimehelpers, is structured well and should be easy to reason about and test. Required envs, sensible defaults, and timing invariants (poll < maxWait) are all enforced in one place.Also applies to: 100-117, 164-195
status-reporter/pkg/reporter/reporter_test.go (1)
20-648: Thorough reporter test coverage looks solid.The suite exercises all the important paths (success/failure results, parse errors, timeouts, container termination variants, k8s update failures, and context cancellation) using the mock client and temp files. Poll intervals vs sleeps look conservative enough that flakiness risk is low. This should give high confidence in the reporter behavior.
ae37adf to
3253941
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
status-reporter/pkg/config/config.go (1)
126-151: TightenResultsPathvalidation and remove broad..substring checkCurrent validation still (a) rejects legitimate filenames containing
..and (b) does not enforce that the path stays under the intended/resultsarea, which weakens the sandboxing story compared to the design intent in earlier review comments.You can both avoid false positives and constrain the path to the results directory by relying on
filepath.Cleanplus a base-dir check, and dropping the rawstrings.Contains(".."):func (c *Config) validateResultsPath() error { - if strings.Contains(c.ResultsPath, "..") { - return &ValidationError{ - Field: "ResultsPath", - Message: "path traversal (..) not allowed", - } - } - if strings.HasSuffix(c.ResultsPath, "/") { return &ValidationError{ Field: "ResultsPath", Message: "path must be a file, not a directory", } } cleanPath := filepath.Clean(c.ResultsPath) if !filepath.IsAbs(cleanPath) { return &ValidationError{ Field: "ResultsPath", Message: "path must be absolute", } } + + base := filepath.Clean(filepath.Dir(DefaultResultsPath)) // e.g. "/results" + if cleanPath != base && !strings.HasPrefix(cleanPath, base+string(os.PathSeparator)) { + return &ValidationError{ + Field: "ResultsPath", + Message: fmt.Sprintf("path must be within %s directory", base), + } + } return nil }This preserves the existing invariants (absolute path, file not directory) while ensuring the configured path can’t escape the
/resultstree, and no longer rejects harmless names like/results/adapter..json.
🧹 Nitpick comments (2)
status-reporter/Makefile (1)
61-171: Consider adding a default "all" target.Following Makefile conventions, projects typically define an "all" target that specifies the default build artifacts. This makes
make(without arguments) predictable and serves as documentation for primary project deliverables.Suggested implementation:
.PHONY: all all: binary image ## Build binary and image (default)This makes the project's primary outputs explicit and allows users to build everything with a single
makecommand.status-reporter/pkg/config/config.go (1)
25-44: Consider centralizingDefaultConditionTypeto avoid divergenceRelevant snippets show
DefaultConditionTypeis also defined instatus-reporter/pkg/reporter/reporter.go. It would be safer to have a single source of truth (e.g., keep it here inconfigand reference it fromreporter) to avoid future drift.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
status-reporter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (20)
README.md(1 hunks)status-reporter/Dockerfile(1 hunks)status-reporter/Makefile(1 hunks)status-reporter/cmd/reporter/main.go(1 hunks)status-reporter/go.mod(1 hunks)status-reporter/pkg/config/config.go(1 hunks)status-reporter/pkg/config/config_suite_test.go(1 hunks)status-reporter/pkg/config/config_test.go(1 hunks)status-reporter/pkg/k8s/client.go(1 hunks)status-reporter/pkg/k8s/client_test.go(1 hunks)status-reporter/pkg/k8s/k8s_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter.go(1 hunks)status-reporter/pkg/reporter/reporter_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter_test.go(1 hunks)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go(1 hunks)status-reporter/pkg/result/parser.go(1 hunks)status-reporter/pkg/result/parser_test.go(1 hunks)status-reporter/pkg/result/result.go(1 hunks)status-reporter/pkg/result/result_suite_test.go(1 hunks)status-reporter/pkg/result/result_test.go(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- README.md
🚧 Files skipped from review as they are similar to previous changes (9)
- status-reporter/pkg/result/result_suite_test.go
- status-reporter/pkg/config/config_suite_test.go
- status-reporter/go.mod
- status-reporter/pkg/reporter/reporter_suite_test.go
- status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go
- status-reporter/cmd/reporter/main.go
- status-reporter/pkg/k8s/k8s_suite_test.go
- status-reporter/pkg/reporter/reporter.go
- status-reporter/Dockerfile
🧰 Additional context used
🧬 Code graph analysis (7)
status-reporter/pkg/config/config_test.go (1)
status-reporter/pkg/config/config.go (2)
Load(57-105)Config(13-23)
status-reporter/pkg/result/result_test.go (1)
status-reporter/pkg/result/result.go (6)
AdapterResult(32-44)StatusSuccess(11-11)StatusFailure(12-12)DefaultReason(14-14)DefaultMessage(15-15)ResultError(22-25)
status-reporter/pkg/k8s/client_test.go (1)
status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/result/parser_test.go (2)
status-reporter/pkg/result/parser.go (2)
Parser(15-15)NewParser(18-20)status-reporter/pkg/result/result.go (4)
StatusSuccess(11-11)StatusFailure(12-12)DefaultReason(14-14)DefaultMessage(15-15)
status-reporter/pkg/result/parser.go (1)
status-reporter/pkg/result/result.go (1)
AdapterResult(32-44)
status-reporter/pkg/config/config.go (1)
status-reporter/pkg/reporter/reporter.go (1)
DefaultConditionType(19-19)
status-reporter/pkg/reporter/reporter_test.go (5)
status-reporter/pkg/reporter/reporter.go (10)
StatusReporter(42-51)NewReporterWithClient(68-70)DefaultConditionType(19-19)ConditionStatusTrue(21-21)ConditionStatusFalse(22-22)ReasonAdapterCrashed(24-24)ReasonAdapterOOMKilled(25-25)ReasonAdapterExitedWithError(26-26)ReasonAdapterTimeout(27-27)ReasonInvalidResultFormat(28-28)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (2)
MockK8sClient(12-16)NewMockK8sClient(18-20)status-reporter/pkg/config/config.go (1)
DefaultConditionType(29-29)status-reporter/pkg/result/result.go (3)
AdapterResult(32-44)StatusSuccess(11-11)StatusFailure(12-12)status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
🪛 checkmake (0.2.2)
status-reporter/Makefile
[warning] 85-85: Target body for "lint" exceeds allowed length of 5 (8).
(maxbodylength)
[warning] 96-96: Target body for "fmt" exceeds allowed length of 5 (6).
(maxbodylength)
[warning] 170-170: Missing required phony target "all"
(minphony)
🔇 Additional comments (34)
status-reporter/Makefile (1)
1-171: Well-structured Makefile with comprehensive build automation.The Makefile demonstrates solid infrastructure practices: proper metadata capture (version, commit, build date), intelligent container runtime detection with fallback error handling, race-detection-enabled testing, coverage reporting, and multi-registry image push support. The phony target declarations and variable organization follow Make conventions well.
status-reporter/pkg/k8s/client_test.go (2)
14-29: LGTM!The test correctly validates that all fields of JobCondition can be set and retrieved as expected.
31-40: LGTM!The test correctly validates zero-value handling for LastTransitionTime.
status-reporter/pkg/result/parser_test.go (3)
21-25: LGTM!The test correctly validates parser instantiation.
27-116: LGTM!Excellent test coverage for ParseFile, including both valid inputs and comprehensive error scenarios (empty file, invalid JSON, invalid status, oversized file, nonexistent file). The use of temp directories with proper cleanup is appropriate.
118-153: LGTM!The tests appropriately validate JSON parsing, default value application for missing fields, and error handling for malformed input. The coverage is thorough.
status-reporter/pkg/result/result_test.go (4)
14-29: LGTM!The IsSuccess tests correctly validate all status scenarios (success, failure, invalid).
31-152: LGTM!Excellent comprehensive test coverage for validation logic, including:
- Valid success/failure results
- Invalid status error handling
- Default value application for empty/whitespace fields
- Whitespace trimming
- Length truncation enforcement
The truncation tests are particularly valuable given the UTF-8-safe implementation.
154-186: LGTM!The JSON marshaling tests appropriately validate unmarshaling of basic results and results with optional details (including nested structures).
188-198: LGTM!The ResultError formatting tests correctly validate the error message format.
status-reporter/pkg/config/config_test.go (4)
39-148: LGTM!Excellent comprehensive test coverage for the Load function, including:
- Valid configuration with defaults
- Custom value overrides
- Whitespace trimming
- Missing required field error handling
- Invalid integer parsing error handling
The nested Context organization improves readability.
150-246: LGTM!Excellent validation test coverage, including:
- Valid configuration acceptance
- Timing parameter validation (positive values, poll < max wait)
- Security-critical path validation (path traversal prevention, absolute path requirement, file type validation)
The security-focused path validation tests are particularly valuable.
248-253: LGTM!The test correctly validates duration conversion for poll interval.
255-260: LGTM!The test correctly validates duration conversion for max wait time.
status-reporter/pkg/result/parser.go (3)
9-20: LGTM!The 1MB size limit appropriately prevents memory exhaustion attacks. The stateless Parser design is clean.
22-38: LGTM!The ParseFile implementation correctly validates file size and content before parsing. The error messages appropriately include contextual information (path, size) for debugging.
40-53: LGTM!The Parse implementation correctly unmarshals JSON and delegates validation to AdapterResult.Validate(). Good separation of concerns.
status-reporter/pkg/reporter/reporter_test.go (7)
41-52: LGTM!The test appropriately validates that exported constants have the expected values.
54-80: LGTM!The tests correctly validate reporter instantiation with custom and default condition types.
82-167: LGTM!Excellent test coverage for UpdateFromResult, including success/failure outcomes, error propagation, and custom condition type usage. The tests appropriately validate all JobCondition fields.
169-198: LGTM!The tests correctly validate UpdateFromError behavior, including the InvalidResultFormat reason and error propagation.
200-284: LGTM!Excellent test coverage for UpdateFromTerminatedContainer, including OOMKilled, non-zero exit, and zero exit scenarios. The comment at lines 239-245 provides valuable context explaining why zero exit code is treated as AdapterCrashed in the timeout path.
286-392: LGTM!Excellent comprehensive test coverage for UpdateFromTimeout, including various container termination states, fallback to timeout reason when status fetch fails, and error propagation.
394-648: LGTM!Exceptional end-to-end test coverage for the Run method, including:
- Immediate and delayed result file availability
- Timeout scenarios
- Invalid file content (malformed JSON, empty, invalid status)
- Context cancellation
- Error propagation from status updates
- Container termination detection
The tests appropriately use temp directories, realistic delays, and goroutines to simulate concurrent behavior.
status-reporter/pkg/result/result.go (4)
10-29: LGTM!The constants are well-defined with appropriate length limits. The ResultError implementation correctly follows the error interface pattern.
31-49: LGTM!The AdapterResult struct is well-documented with clear field descriptions. The IsSuccess method correctly implements status checking.
51-77: LGTM!The Validate method correctly:
- Enforces valid status values
- Normalizes Reason and Message (trim, default, truncate)
- Uses UTF-8-safe truncation to avoid corruption
This addresses the past review concern about UTF-8 corruption.
79-94: LGTM!The truncateUTF8 helper correctly prevents UTF-8 corruption by:
- Checking length before truncation
- Finding the last valid rune boundary before maxBytes using utf8.RuneStart
- Including a safety fallback
This properly addresses the UTF-8 safety concern from the past review.
status-reporter/pkg/k8s/client.go (5)
17-27: LGTM!The constant and Client struct are appropriately defined.
29-46: LGTM!The NewClient implementation follows standard Kubernetes client-go patterns for in-cluster configuration with appropriate error handling.
48-97: LGTM!The UpdateJobStatus implementation correctly follows Kubernetes best practices:
- Uses RetryOnConflict for optimistic concurrency control
- Handles NotFound errors explicitly
- Defaults LastTransitionTime when not set
- Implements proper upsert logic (update existing condition or append new)
99-107: LGTM!The GetPodStatus implementation is straightforward with appropriate error context.
109-132: LGTM!The GetAdapterContainerStatus implementation provides good flexibility:
- Supports explicit container name lookup
- Falls back to finding the first non-status-reporter container
- Includes appropriate error context in messages
status-reporter/pkg/config/config.go (1)
56-124: Config loading and timing validation look solid
Load()andValidate()handle required envs, whitespace, integer parsing, and thePollIntervalSeconds < MaxWaitTimeSecondsinvariant cleanly, returning structuredValidationErrors on misconfiguration. No issues from a correctness perspective.
3253941 to
b748930
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
README.md (2)
57-186: Verify documentation claims against implementation for contract enforcement.The Adapter Contract section provides comprehensive JSON schema, validation rules, and scenario examples. However, several implementation details require verification:
- Line 20 claims the reporter "always exits successfully (never fails the parent Job)"; confirm the exit handler never returns non-zero
- Line 62-64 specifies a 300-second default timeout and
/results/adapter-result.jsondefault path; ensure these match the implementation- Lines 79-83 define specific field validation (trimming, truncation, defaults); verify all rules are enforced in the result parser
Please verify the following:
- The status reporter's main entrypoint exits with code 0 regardless of adapter outcome
- Default timeout is 300 seconds (and configurable via env var)
- Default results path is
/results/adapter-result.json(and configurable viaRESULTS_PATHenv var)- The result parser enforces all documented validation rules:
reasonis trimmed and truncated to 128 characters with default "NoReasonProvided"messageis trimmed and truncated to 1024 characters with default "No message provided"statusvalue is case-sensitive and only accepts "success" or "failure"Consider adding a "Testing the Status Reporter" section with:
- Example Kubernetes Job manifest integrating status reporter
- Sample adapter container that writes valid/invalid results
- Expected Job status outputs for each scenario
This would improve the onboarding experience for future validators.
187-207: Document Kubernetes volume ownership and cleanup expectations.The shared volume configuration (lines 187–207) shows the volume mount but does not clarify:
- Whether the status reporter should clean up the results file after reading
- Implications if the adapter writes multiple result files
- Whether persistent storage or
emptyDiris requiredAdd clarification on:
- Result file lifecycle (created by adapter, read by reporter, cleaned up by reporter or left for debugging?)
- Volume cleanup behavior (emptyDir will be destroyed with Pod; is this intentional?)
- Handling of partial or incomplete result files
This will prevent integration issues for teams deploying the status reporter.
status-reporter/Makefile (3)
84-93: Lint target needs simplification to comply with Makefile conventions.The lint target body (lines 84–93) exceeds the standard 5-line body length limit for Makefile targets. While the implementation is correct, refactoring for readability and compliance is recommended.
Extract the golangci-lint check into a shell script or simplify the conditional:
.PHONY: lint lint: ## Run golangci-lint @echo "Running golangci-lint..." @command -v golangci-lint >/dev/null || (echo "Error: golangci-lint not found"; exit 1) golangci-lint cache clean && golangci-lint runAlternatively, document that golangci-lint must be pre-installed via CI setup.
95-102: Fmt target should be simplified to reduce body length.The fmt target (lines 95–102) has conditional logic that slightly exceeds recommended body length. Simplify or extract the goimports logic.
Simplify by defaulting to goimports with a fallback:
.PHONY: fmt fmt: ## Format code with gofmt and goimports @echo "Formatting code..." @command -v goimports >/dev/null && goimports -w . || gofmt -w .This is more concise and achieves the same behavior.
110-171: Image building and verification targets are well-structured; missing "all" phony target.The binary, clean, and image targets are clear and functional. The verify target chains lint and test appropriately. However, the Makefile lacks a default "all" phony target, which is a Makefile convention for projects with multiple build artifacts.
Add an "all" phony target to define the default build goal:
.PHONY: all all: verify binary ## Default: verify code and build binary .PHONY: verify verify: lint test ## Run all verification checks (lint + test)Place the "all" target near the help target (after line 61) and update the verify target to depend on it, or simply have "all" depend on verify and binary. This establishes a convention where
makealone performs the most common build workflow.status-reporter/pkg/config/config.go (1)
126-152: Simplify the path traversal check.The
strings.Contains(c.ResultsPath, "..")check on line 128 is overly broad and would reject legitimate filenames like/results/adapter..json. Since you already usefilepath.Cleanandfilepath.IsAbs, path traversal sequences are normalized away, making this explicit check redundant.Consider removing the
".."substring check:- if strings.Contains(c.ResultsPath, "..") { - return &ValidationError{ - Field: "ResultsPath", - Message: "path traversal (..) not allowed", - } - } -The combination of
filepath.Cleanandfilepath.IsAbsprovides sufficient protection against path traversal.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
status-reporter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (20)
README.md(1 hunks)status-reporter/Dockerfile(1 hunks)status-reporter/Makefile(1 hunks)status-reporter/cmd/reporter/main.go(1 hunks)status-reporter/go.mod(1 hunks)status-reporter/pkg/config/config.go(1 hunks)status-reporter/pkg/config/config_suite_test.go(1 hunks)status-reporter/pkg/config/config_test.go(1 hunks)status-reporter/pkg/k8s/client.go(1 hunks)status-reporter/pkg/k8s/client_test.go(1 hunks)status-reporter/pkg/k8s/k8s_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter.go(1 hunks)status-reporter/pkg/reporter/reporter_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter_test.go(1 hunks)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go(1 hunks)status-reporter/pkg/result/parser.go(1 hunks)status-reporter/pkg/result/parser_test.go(1 hunks)status-reporter/pkg/result/result.go(1 hunks)status-reporter/pkg/result/result_suite_test.go(1 hunks)status-reporter/pkg/result/result_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (10)
- status-reporter/pkg/k8s/client_test.go
- status-reporter/pkg/config/config_suite_test.go
- status-reporter/pkg/result/result_test.go
- status-reporter/pkg/result/result_suite_test.go
- status-reporter/pkg/result/parser_test.go
- status-reporter/go.mod
- status-reporter/pkg/k8s/client.go
- status-reporter/Dockerfile
- status-reporter/pkg/config/config_test.go
- status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go
🧰 Additional context used
🧬 Code graph analysis (4)
status-reporter/pkg/result/parser.go (1)
status-reporter/pkg/result/result.go (1)
AdapterResult(32-44)
status-reporter/cmd/reporter/main.go (2)
status-reporter/pkg/config/config.go (2)
Load(57-105)Config(13-23)status-reporter/pkg/reporter/reporter.go (1)
NewReporter(52-59)
status-reporter/pkg/reporter/reporter.go (3)
status-reporter/pkg/k8s/client.go (2)
JobCondition(49-55)NewClient(30-46)status-reporter/pkg/result/parser.go (2)
Parser(15-15)NewParser(18-20)status-reporter/pkg/result/result.go (1)
AdapterResult(32-44)
status-reporter/pkg/reporter/reporter_test.go (4)
status-reporter/pkg/reporter/reporter.go (9)
StatusReporter(40-49)NewReporterWithClient(62-64)ConditionStatusTrue(19-19)ConditionStatusFalse(20-20)ReasonAdapterCrashed(22-22)ReasonAdapterOOMKilled(23-23)ReasonAdapterExitedWithError(24-24)ReasonAdapterTimeout(25-25)ReasonInvalidResultFormat(26-26)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (2)
MockK8sClient(12-16)NewMockK8sClient(18-20)status-reporter/pkg/result/result.go (3)
AdapterResult(32-44)StatusSuccess(11-11)StatusFailure(12-12)status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
🪛 checkmake (0.2.2)
status-reporter/Makefile
[warning] 85-85: Target body for "lint" exceeds allowed length of 5 (8).
(maxbodylength)
[warning] 96-96: Target body for "fmt" exceeds allowed length of 5 (6).
(maxbodylength)
[warning] 170-170: Missing required phony target "all"
(minphony)
🔇 Additional comments (35)
README.md (1)
1-30: Documentation establishes clear component mission and cloud-agnostic design.The Overview and Status Reporter sections accurately reflect the PR objectives: cloud-agnostic sidecar, JSON result contract, multiple failure scenario handling, and zero adapter implementation dependencies.
status-reporter/Makefile (4)
1-60: Build metadata, environment detection, and tooling configuration look solid.The Makefile establishes clear metadata, embeds version/commit/buildDate into the binary, and robustly detects container runtimes (Docker → Podman → error). The PKG_DIRS discovery appropriately excludes vendor and test directories, and Go tooling is well-configured.
61-109: Test and coverage targets properly configured with race detection.The test targets include race detection (
-race) and comprehensive coverage reporting (HTML output). Test timeout is reasonable (30m).
42-55: Container runtime detection is robust with good error messaging.The fallback from Docker to Podman and the explicit error message if neither is available provides good developer experience. The error exit ensures CI pipelines fail fast if the environment is misconfigured.
147-168: Dev image workflow requires QUAY_USER; validation is clear and user-friendly.The image-dev target requires the QUAY_USER environment variable and provides helpful error messaging. The DEV_TAG based on GIT_COMMIT ensures unique tags for each iteration.
status-reporter/pkg/k8s/k8s_suite_test.go (1)
1-13: LGTM!The test suite bootstrap follows the standard Ginkgo/Gomega pattern and is consistent with the other test suite files in the repository.
status-reporter/pkg/reporter/reporter_suite_test.go (1)
1-13: LGTM!The test suite bootstrap follows the standard Ginkgo/Gomega pattern and is consistent with other test suite files in the repository.
status-reporter/cmd/reporter/main.go (4)
1-17: LGTM!The imports and shutdown timeout constant are appropriate for the status reporter service.
19-56: LGTM!The configuration loading, reporter setup, and goroutine initialization are correct. The channel setup properly handles the asynchronous reporter execution.
58-90: LGTM!The graceful shutdown logic and exit code handling are correct. The code properly handles both signal-initiated shutdown and normal completion, exiting with appropriate status codes based on errors.
92-108: LGTM!The configuration logging helper provides comprehensive visibility into the runtime configuration, with clear handling of the auto-detect case for the adapter container name.
status-reporter/pkg/result/parser.go (3)
1-20: LGTM!The 1MB file size limit is a reasonable safeguard against memory exhaustion, and the stateless Parser design is clean.
22-38: LGTM!The ParseFile method includes appropriate safeguards: empty file detection, size limits, and informative error messages.
40-53: LGTM!The Parse method correctly unmarshals JSON and validates the result format, with proper error wrapping.
status-reporter/pkg/reporter/reporter_test.go (5)
1-51: LGTM!The test setup and constants validation are well-structured. Testing exported constants ensures API stability.
53-166: LGTM!The tests comprehensively cover reporter creation and result update flows, including error propagation and custom condition types.
168-283: LGTM!The tests thoroughly cover error handling and container termination scenarios. The treatment of zero exit code without a result file as AdapterCrashed is well-reasoned and documented.
285-391: LGTM!The timeout tests cover all scenarios including container termination, running state, and fallback behavior when container status cannot be retrieved.
393-647: LGTM!The Run integration tests provide excellent coverage of the complete reporter workflow, including timing scenarios, invalid inputs, context cancellation, and error handling.
status-reporter/pkg/result/result.go (4)
1-29: LGTM!The constants and error type are well-defined. The max length limits (128 for reason, 1024 for message) are appropriate for Kubernetes condition fields.
31-49: LGTM!The AdapterResult structure is well-designed with clear field documentation. Using json.RawMessage for Details provides flexibility for adapter-specific data.
51-77: LGTM!The validation logic correctly enforces the status contract and safely normalizes Reason and Message fields with UTF-8-aware truncation.
79-94: LGTM!The truncateUTF8 helper correctly handles UTF-8 boundaries by finding the last valid rune start before the byte limit. This properly addresses the UTF-8 truncation concern from previous reviews.
status-reporter/pkg/config/config.go (5)
1-44: LGTM!The Config struct and default constants are well-designed with sensible default values for polling and timeouts.
46-105: LGTM!The Load function correctly handles required and optional environment variables and validates the configuration before returning.
107-124: LGTM!The validation logic correctly enforces positive values and the relationship constraint between poll interval and max wait time.
154-162: LGTM!The duration conversion helpers are straightforward and correct.
164-195: LGTM!The environment variable helpers properly handle whitespace trimming, required fields, and integer parsing with clear error messages.
status-reporter/pkg/reporter/reporter.go (7)
1-37: LGTM!The constants and K8sClientInterface are well-defined. The 10-second timeout for container status checks is appropriate.
39-77: LGTM!The StatusReporter structure and constructors are well-designed. The separation between production and test constructors enables easy testing with mock clients.
79-120: LGTM!The Run method's orchestration is excellent. The nested select on timeout (lines 104-112) cleverly handles race conditions by giving precedence to results/errors that arrive just before the timeout fires.
122-171: LGTM!The polling loop correctly handles all edge cases with proper channel send protection using select statements to prevent blocking during shutdown.
173-195: LGTM!The UpdateFromResult method correctly maps adapter results to Kubernetes condition status and propagates the adapter's reason and message.
197-214: LGTM!The UpdateFromError method correctly marks the job as failed due to invalid result format and preserves the original parsing error for debugging.
216-276: LGTM!The timeout handling is sophisticated and provides excellent diagnostics. The use of a fresh context for the container status check (line 221) ensures the check completes even if the parent context is cancelled. The termination reason mapping correctly distinguishes between OOMKilled, error exits, and logic bugs (zero exit without result file).
93ca488 to
1468e95
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
status-reporter/pkg/config/config.go (1)
107-152: ResultsPath validation is mostly good; the"..")substring check is still overly broadThe timing validations and basic path checks (disallow directory-like values and require an absolute path) look good. The remaining rough edge is:
strings.Contains(c.ResultsPath, "..")will also reject harmless filenames like/results/adapter..json, not just paths with..segments. Since you already normalize withfilepath.Clean, you can either drop this check or tighten it to reject only actual..path segments (e.g., splitting onos.PathSeparatorand checking for a segment equal to"..").If you later decide you need strict confinement to a base directory (e.g.
/results), you could also reintroduce a prefix check in terms offilepath.Clean(ResultsBasePath)plus a path‑separator boundary as was discussed previously.
🧹 Nitpick comments (10)
status-reporter/pkg/result/result.go (1)
10-19: Result validation and UTF‑8 truncation look solid; clarify byte vs character limitsThe
Validateflow (status check, trimming/defaulting, then truncation viatruncateUTF8) is coherent and matches how the contract is documented. The UTF‑8 helper safely avoids splitting multi‑byte runes by walking back to autf8.RuneStartboundary, which is good.Two small suggestions:
- The limits (
maxReasonLength = 128,maxMessageLength = 1024) are enforced in bytes, not runes. Since the README calls these “128 chars / 1024 chars”, non‑ASCII text will be truncated earlier in terms of visible characters. Either (a) switch the limits to be rune‑based, or (b) explicitly document that the limits are in bytes to avoid surprises.- For future‑proofing, you could add a quick guard in
truncateUTF8for non‑positivemaxBytes(e.g.if maxBytes <= 0 { return "" }) so it remains safe if ever called with dynamic values.Also applies to: 51-77, 79-93
README.md (1)
69-82: Minor contract/documentation nits: chars vs bytes wording and stray backtickThe README is nicely detailed and matches the code behavior for status/reason/message overall, but two small points:
- The JSON schema section describes
reasonas “max 128 chars” andmessageas “max 1024 chars”, while the implementation enforces 128/1024 bytes (with UTF‑8‑safe truncation). For non‑ASCII text this can mean fewer visible characters. Consider either clarifying that these are byte limits or switching the enforcement to rune counts to match the current wording.- There’s a small typo in the failure example:
Adapter writes to the result file:` has an extra backtick after “file”.You could fix the typo with something like:
- Adapter writes to the result file`: + Adapter writes to the result file:Also applies to: 110-121
status-reporter/pkg/result/result_test.go (1)
129-150: Consider adding a UTF‑8 truncation test caseThe truncation tests for
ReasonandMessageuse ASCII-only strings and assertlen(...) == 128/1024, which is great for the byte-limit behavior. To fully lock intruncateUTF8’s intent, I’d add a case whereReasonorMessagecontains multi-byte runes (e.g., accented or CJK characters), callValidate(), and assert that:
- The resulting string is still valid UTF‑8 (no rune split), and
- Its length does not exceed the configured limit.
This would guard against future regressions in the UTF‑8 truncation logic.
status-reporter/Dockerfile (1)
1-32: Multi-stage build and distroless runtime look solidThe Dockerfile cleanly builds a static binary and runs it in a nonroot distroless image with the expected path and entrypoint; nothing blocking here. If you want extra robustness, you could later pin the base images by digest to avoid surprises when upstream tags move.
status-reporter/pkg/result/parser.go (1)
9-37: Result parsing and validation are correct; consider stricter size enforcementThe Parser flow (read → empty check → max-size check → JSON unmarshal → AdapterResult.Validate) is sane and matches the intended contract. If, in the future, you expect untrusted or potentially large result files, you might consider switching from
os.ReadFileto anio.LimitReader-style approach so the max-size check happens before allocating the full file into memory; for the current 1MB cap this is more of a hardening tweak than a blocker.status-reporter/pkg/config/config_test.go (1)
13-260: Config tests give good coverage; consider reusing env key constantsThe test suite does a nice job exercising Load, Validate, and the duration helpers, including error paths and path validation. To reduce duplication and keep tests in sync with the config package, you might consider importing and using the
config.Env*constants instead of hardcoding"JOB_NAME","POD_NAME", etc., in expectations and setup.status-reporter/Makefile (1)
61-171: Makefile workflow is solid; minor checkmake nits if you care about that toolThe test, fmt, lint, and image targets are well-structured and should work fine as-is. If you plan to keep
checkmakehappy, you could:
- Add a simple
alltarget (e.g.,all: verify) to satisfy the missing-phony warning.- Optionally move the longer
lintandfmtshell blocks into small helper scripts (or secondary targets) to get under the configured max body length.These are style-only and don’t affect correctness.
status-reporter/pkg/reporter/reporter_test.go (1)
20-646: Reporter tests give strong behavioral coverageThe test suite thoroughly exercises the reporter’s branching (success/failure results, parse errors, timeouts, container terminations, k8s update failures, and context cancellation) using the mock client—this should make refactoring much safer. The only nit is the
"uses default condition type when empty"test name, which doesn’t currently assert the condition type itself; if that defaulting behavior is important to the API contract, you might add an assertion later.status-reporter/pkg/reporter/reporter.go (2)
216-244: Consider tying the timeout container-status check to the caller’s context
UpdateFromTimeoutcurrently uses:checkCtx, cancel := context.WithTimeout(context.Background(), containerStatusCheckTimeout)This means the container-status lookup ignores cancellation on the caller’s
ctxand will always run up tocontainerStatusCheckTimeout. If you want the reporter to honor shutdown signals more strictly, you could instead parent this on the providedctx:checkCtx, cancel := context.WithTimeout(ctx, containerStatusCheckTimeout)so that a canceled/expiring caller context also cancels the status lookup.
66-76: Optional: default the condition type when empty
newReporterWithClientpassesconditionTypestraight through; if it’s an empty string (as in one of the tests), later JobCondition updates will use an empty type. This is harmless but not very useful. If you want the constructor to be more forgiving, you could treat""as “use the default” (e.g.,"Available"from the config package) rather than requiring callers to always supply a non-empty type.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
status-reporter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (20)
README.md(1 hunks)status-reporter/Dockerfile(1 hunks)status-reporter/Makefile(1 hunks)status-reporter/cmd/reporter/main.go(1 hunks)status-reporter/go.mod(1 hunks)status-reporter/pkg/config/config.go(1 hunks)status-reporter/pkg/config/config_suite_test.go(1 hunks)status-reporter/pkg/config/config_test.go(1 hunks)status-reporter/pkg/k8s/client.go(1 hunks)status-reporter/pkg/k8s/client_test.go(1 hunks)status-reporter/pkg/k8s/k8s_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter.go(1 hunks)status-reporter/pkg/reporter/reporter_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter_test.go(1 hunks)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go(1 hunks)status-reporter/pkg/result/parser.go(1 hunks)status-reporter/pkg/result/parser_test.go(1 hunks)status-reporter/pkg/result/result.go(1 hunks)status-reporter/pkg/result/result_suite_test.go(1 hunks)status-reporter/pkg/result/result_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- status-reporter/pkg/reporter/reporter_suite_test.go
- status-reporter/go.mod
- status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go
- status-reporter/pkg/result/result_suite_test.go
🧰 Additional context used
🧬 Code graph analysis (8)
status-reporter/pkg/k8s/client_test.go (1)
status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/cmd/reporter/main.go (2)
status-reporter/pkg/config/config.go (2)
Load(57-105)Config(13-23)status-reporter/pkg/reporter/reporter.go (1)
NewReporter(52-59)
status-reporter/pkg/result/result_test.go (1)
status-reporter/pkg/result/result.go (6)
AdapterResult(32-44)StatusSuccess(11-11)StatusFailure(12-12)DefaultReason(14-14)DefaultMessage(15-15)ResultError(22-25)
status-reporter/pkg/result/parser_test.go (2)
status-reporter/pkg/result/parser.go (2)
Parser(15-15)NewParser(18-20)status-reporter/pkg/result/result.go (4)
StatusSuccess(11-11)StatusFailure(12-12)DefaultReason(14-14)DefaultMessage(15-15)
status-reporter/pkg/config/config_test.go (1)
status-reporter/pkg/config/config.go (2)
Load(57-105)Config(13-23)
status-reporter/pkg/reporter/reporter_test.go (4)
status-reporter/pkg/reporter/reporter.go (9)
StatusReporter(40-49)NewReporterWithClient(62-64)ConditionStatusTrue(19-19)ConditionStatusFalse(20-20)ReasonAdapterCrashed(22-22)ReasonAdapterOOMKilled(23-23)ReasonAdapterExitedWithError(24-24)ReasonAdapterTimeout(25-25)ReasonInvalidResultFormat(26-26)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (2)
MockK8sClient(12-16)NewMockK8sClient(18-20)status-reporter/pkg/result/result.go (3)
AdapterResult(32-44)StatusSuccess(11-11)StatusFailure(12-12)status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/reporter/reporter.go (3)
status-reporter/pkg/k8s/client.go (2)
JobCondition(49-55)NewClient(30-46)status-reporter/pkg/result/parser.go (2)
Parser(15-15)NewParser(18-20)status-reporter/pkg/result/result.go (1)
AdapterResult(32-44)
status-reporter/pkg/result/parser.go (1)
status-reporter/pkg/result/result.go (1)
AdapterResult(32-44)
🪛 checkmake (0.2.2)
status-reporter/Makefile
[warning] 85-85: Target body for "lint" exceeds allowed length of 5 (8).
(maxbodylength)
[warning] 96-96: Target body for "fmt" exceeds allowed length of 5 (6).
(maxbodylength)
[warning] 170-170: Missing required phony target "all"
(minphony)
🔇 Additional comments (7)
status-reporter/pkg/config/config_suite_test.go (1)
1-13: Ginkgo test suite bootstrap looks correct
TestConfigcorrectly registers the Gomega fail handler and runs the "Config Suite" specs frompackage config_test, which is the expected Ginkgo v2 pattern for black‑box tests. No changes needed.status-reporter/pkg/k8s/client_test.go (1)
12-40: JobCondition tests are straightforward and sufficientBoth tests exercise construction of
JobConditionwith and withoutLastTransitionTime, and validate fields as expected (includingIsZero()for the zero case). This gives good baseline coverage for the struct with minimal complexity.status-reporter/pkg/k8s/k8s_suite_test.go (1)
1-13: K8s Ginkgo suite setup is correct
TestK8sproperly wires Ginkgo/Gomega for thek8spackage underpackage k8s_test. This matches the pattern used in the other suites; no changes needed.status-reporter/pkg/result/parser_test.go (1)
14-153: Parser tests give good end‑to‑end coverageThese tests thoroughly exercise
Parserconstruction, file handling (including size limit and missing file), JSON parsing failures, invalid status values, and defaulting behavior. The use of per‑test temporary directories and explicit error message checks makes the suite robust. No changes needed.status-reporter/pkg/k8s/client.go (1)
29-132: Kubernetes client wrapper looks correct and idiomatic
NewClient,UpdateJobStatus, and the container-status helpers follow standard client-go patterns: retry-on-conflict around Job status updates, clean error wrapping, and sensible adapter container selection (named container when provided, otherwise first non‑status‑reporter container). I don’t see correctness issues here for the intended in-cluster use.status-reporter/pkg/config/config.go (1)
154-195: Env loading and numeric parsing behavior look correct
Load()’s use ofgetRequiredEnvfor required keys andgetEnvIntOrDefaultfor numeric options gives clear failure modes: unset values fall back to sane defaults, while malformed integers produce explicitValidationErrors. Combined with the duration helpers, this should make misconfiguration relatively easy to diagnose and avoids silent clamping.status-reporter/pkg/reporter/reporter.go (1)
79-120: Run/poll orchestration looks robust; nice use of channels and WaitGroupThe lifecycle around
Run→pollForResults(buffered channels,donesignal,timeoutCtx, andWaitGroup) is clean and avoids goroutine leaks, while still giving precedence to late-arriving results/errors on timeout. This is a good structure for a sidecar-style watcher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
status-reporter/cmd/reporter/main.go (1)
66-77: Shutdown timeout can still block indefinitely onerrChan.After the select on lines 66-71, if the shutdown timer fires before
rep.Run(ctx)returns,doneChanis still open and no value has been sent toerrChan. The subsequent blocking read<-errChanon line 74 will hang indefinitely, defeating the purpose of the shutdown timeout.You should only read from
errChanwhendoneChanhas actually closed (the graceful path), and treat a timeout as a forced exit.Apply this diff to fix the blocking issue:
shutdownTimer := time.NewTimer(shutdownTimeout) defer shutdownTimer.Stop() select { case <-doneChan: + // Reporter exited within the shutdown window + if err := <-errChan; err != nil { + log.Printf("Reporter finished with error during shutdown: %v", err) + os.Exit(1) + } log.Println("Reporter finished during shutdown") + os.Exit(0) case <-shutdownTimer.C: + // Reporter did not stop in time; force exit log.Printf("Shutdown timeout (%s) exceeded", shutdownTimeout) + os.Exit(1) } - - // Check for errors even during signal shutdown - if err := <-errChan; err != nil { - log.Printf("Reporter finished with error: %v", err) - os.Exit(1) - } - - log.Println("Shutdown complete") - os.Exit(0)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
status-reporter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (20)
README.md(1 hunks)status-reporter/Dockerfile(1 hunks)status-reporter/Makefile(1 hunks)status-reporter/cmd/reporter/main.go(1 hunks)status-reporter/go.mod(1 hunks)status-reporter/pkg/config/config.go(1 hunks)status-reporter/pkg/config/config_suite_test.go(1 hunks)status-reporter/pkg/config/config_test.go(1 hunks)status-reporter/pkg/k8s/client.go(1 hunks)status-reporter/pkg/k8s/client_test.go(1 hunks)status-reporter/pkg/k8s/k8s_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter.go(1 hunks)status-reporter/pkg/reporter/reporter_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter_test.go(1 hunks)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go(1 hunks)status-reporter/pkg/result/parser.go(1 hunks)status-reporter/pkg/result/parser_test.go(1 hunks)status-reporter/pkg/result/result.go(1 hunks)status-reporter/pkg/result/result_suite_test.go(1 hunks)status-reporter/pkg/result/result_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
- status-reporter/go.mod
- status-reporter/pkg/result/parser.go
- status-reporter/pkg/result/parser_test.go
- status-reporter/pkg/result/result_test.go
- status-reporter/pkg/result/result.go
- status-reporter/Dockerfile
- status-reporter/pkg/config/config_suite_test.go
🧰 Additional context used
🧬 Code graph analysis (6)
status-reporter/pkg/k8s/client_test.go (1)
status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/cmd/reporter/main.go (2)
status-reporter/pkg/config/config.go (2)
Load(57-105)Config(13-23)status-reporter/pkg/reporter/reporter.go (1)
NewReporter(52-59)
status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (1)
status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/reporter/reporter_test.go (4)
status-reporter/pkg/reporter/reporter.go (9)
StatusReporter(40-49)NewReporterWithClient(62-64)ConditionStatusTrue(19-19)ConditionStatusFalse(20-20)ReasonAdapterCrashed(22-22)ReasonAdapterOOMKilled(23-23)ReasonAdapterExitedWithError(24-24)ReasonAdapterTimeout(25-25)ReasonInvalidResultFormat(26-26)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (2)
MockK8sClient(12-16)NewMockK8sClient(18-20)status-reporter/pkg/result/result.go (3)
AdapterResult(32-44)StatusSuccess(11-11)StatusFailure(12-12)status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/config/config_test.go (1)
status-reporter/pkg/config/config.go (2)
Load(57-105)Config(13-23)
status-reporter/pkg/reporter/reporter.go (3)
status-reporter/pkg/k8s/client.go (2)
JobCondition(49-55)NewClient(30-46)status-reporter/pkg/result/parser.go (2)
Parser(15-15)NewParser(18-20)status-reporter/pkg/result/result.go (1)
AdapterResult(32-44)
🪛 checkmake (0.2.2)
status-reporter/Makefile
[warning] 85-85: Target body for "lint" exceeds allowed length of 5 (8).
(maxbodylength)
[warning] 96-96: Target body for "fmt" exceeds allowed length of 5 (6).
(maxbodylength)
[warning] 170-170: Missing required phony target "all"
(minphony)
🔇 Additional comments (16)
status-reporter/Makefile (1)
1-171: LGTM!The Makefile provides comprehensive build automation with proper error handling, container runtime detection, and helpful user feedback. The static analysis warnings about target body length are acceptable given the conditional logic requirements, and the missing "all" target is optional for this use case.
status-reporter/pkg/k8s/client_test.go (1)
1-42: LGTM!The tests properly validate JobCondition struct behavior, covering both complete initialization and zero-value cases.
status-reporter/pkg/result/result_suite_test.go (1)
1-13: LGTM!Standard Ginkgo/Gomega test suite bootstrap following the established pattern across the codebase.
status-reporter/pkg/k8s/k8s_suite_test.go (1)
1-13: LGTM!Standard Ginkgo/Gomega test suite bootstrap following the established pattern across the codebase.
status-reporter/pkg/reporter/reporter_suite_test.go (1)
1-13: LGTM!Standard Ginkgo/Gomega test suite bootstrap following the established pattern across the codebase.
status-reporter/cmd/reporter/main.go (3)
19-42: LGTM!Configuration loading and reporter initialization are properly structured with appropriate error handling.
82-89: LGTM!The normal completion path correctly reads from
errChanand exits with the appropriate status code based on whether an error occurred.
92-108: LGTM!The configuration logging helper provides clear visibility into runtime settings, including the auto-detect behavior for adapter container name.
status-reporter/pkg/config/config_test.go (1)
1-250: LGTM!Comprehensive test coverage with proper test isolation, validation of required/optional fields, error handling, and edge cases. The tests effectively validate the configuration loading and validation logic.
status-reporter/pkg/reporter/reporter_test.go (1)
1-647: Exemplary test coverage!The test suite is comprehensive and well-structured, covering:
- All public constants and constructors
- Success and failure paths for result processing
- Container termination scenarios (OOMKilled, error exits, crashes)
- Timeout handling with various container states
- Edge cases (empty files, invalid JSON, context cancellation)
- Error propagation from the Kubernetes client
The use of Ginkgo/Gomega with a mock Kubernetes client enables deterministic, readable tests without requiring a live cluster.
status-reporter/pkg/k8s/client.go (1)
1-132: LGTM!The Kubernetes client wrapper is well-implemented:
- Proper retry-on-conflict handling for status updates (line 59)
- Intelligent fallback logic in
GetAdapterContainerStatusthat auto-detects the adapter container when name is unspecified- Clear error messages with contextual information
- Correct JobCondition upsert semantics (lines 81-92)
status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (1)
1-35: LGTM!The mock client is clean and serves its testing purpose well. It properly records conditions for assertions and allows flexible behavior injection through function fields.
status-reporter/pkg/config/config.go (1)
126-145: Verify that path restriction removal is intentional.The past review comments mentioned issues with
ResultsPathvalidation (prefix check and..traversal detection). The current implementation has simplified validation to only check:
- Path is not a directory (line 128)
- Path is absolute (line 137)
However, there's no longer any restriction on which directories can be accessed. If the intent was to sandbox file access to a specific directory (e.g.,
/results), this validation is insufficient.If path sandboxing is required, consider adding validation to ensure the cleaned path is within an allowed base directory. If unrestricted absolute paths are intentional (adapter can specify any path), this is acceptable but should be documented as a security consideration.
Based on past review comments indicating path restriction was intended.
status-reporter/pkg/reporter/reporter.go (3)
79-120: Excellent concurrency pattern!The
Runmethod demonstrates strong concurrent programming practices:
- Proper timeout context usage (line 87)
- WaitGroup for goroutine coordination (line 93-95)
- Graceful precedence handling on timeout (lines 104-112) that checks for late-arriving results
- Correct cleanup sequence:
close(done),cancel(),wg.Wait()(lines 115-117)This prevents false timeouts when results arrive just before the deadline and ensures clean shutdown.
122-171: LGTM!The polling implementation is correct:
- Proper ticker cleanup with
defer ticker.Stop()(line 127)- Distinguishes between file-not-exist (continue polling) and actual errors (line 140-141)
- Non-blocking channel sends with
donesignal checks (lines 143-147, 154-158, 163-167) prevent goroutine leaks- Clean shutdown on context cancellation or done signal
173-276: LGTM!The update methods are well-designed:
UpdateFromResultcorrectly maps adapter status to Kubernetes condition statusUpdateFromErrorproperly preserves the original error while updating Job status (line 213)UpdateFromTimeoutintelligently checks container status first and delegates toUpdateFromTerminatedContainerwhen applicableUpdateFromTerminatedContainercorrectly distinguishes between OOMKilled, error exits, and the edge case of zero exit without result file (treated as crash)The error propagation and status update logic is sound.
1468e95 to
fd11719
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
status-reporter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (22)
README.md(1 hunks)status-reporter/Dockerfile(1 hunks)status-reporter/Makefile(1 hunks)status-reporter/cmd/reporter/main.go(1 hunks)status-reporter/cmd/reporter/main_suite_test.go(1 hunks)status-reporter/cmd/reporter/main_test.go(1 hunks)status-reporter/go.mod(1 hunks)status-reporter/pkg/config/config.go(1 hunks)status-reporter/pkg/config/config_suite_test.go(1 hunks)status-reporter/pkg/config/config_test.go(1 hunks)status-reporter/pkg/k8s/client.go(1 hunks)status-reporter/pkg/k8s/client_test.go(1 hunks)status-reporter/pkg/k8s/k8s_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter.go(1 hunks)status-reporter/pkg/reporter/reporter_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter_test.go(1 hunks)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go(1 hunks)status-reporter/pkg/result/parser.go(1 hunks)status-reporter/pkg/result/parser_test.go(1 hunks)status-reporter/pkg/result/result.go(1 hunks)status-reporter/pkg/result/result_suite_test.go(1 hunks)status-reporter/pkg/result/result_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- status-reporter/pkg/result/result_test.go
- README.md
- status-reporter/pkg/result/parser_test.go
- status-reporter/go.mod
- status-reporter/pkg/config/config_suite_test.go
- status-reporter/pkg/k8s/client.go
- status-reporter/Dockerfile
- status-reporter/pkg/k8s/client_test.go
🧰 Additional context used
🧬 Code graph analysis (5)
status-reporter/pkg/reporter/reporter_test.go (4)
status-reporter/pkg/reporter/reporter.go (8)
StatusReporter(40-49)ConditionStatusTrue(19-19)ConditionStatusFalse(20-20)ReasonAdapterCrashed(22-22)ReasonAdapterOOMKilled(23-23)ReasonAdapterExitedWithError(24-24)ReasonAdapterTimeout(25-25)ReasonInvalidResultFormat(26-26)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (2)
MockK8sClient(12-16)NewMockK8sClient(18-20)status-reporter/pkg/result/result.go (3)
AdapterResult(32-44)StatusSuccess(11-11)StatusFailure(12-12)status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/config/config_test.go (1)
status-reporter/pkg/config/config.go (2)
Load(57-105)Config(13-23)
status-reporter/pkg/result/parser.go (1)
status-reporter/pkg/result/result.go (1)
AdapterResult(32-44)
status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (1)
status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/reporter/reporter.go (3)
status-reporter/pkg/k8s/client.go (2)
JobCondition(49-55)NewClient(30-46)status-reporter/pkg/result/parser.go (2)
Parser(15-15)NewParser(18-20)status-reporter/pkg/result/result.go (1)
AdapterResult(32-44)
🪛 checkmake (0.2.2)
status-reporter/Makefile
[warning] 85-85: Target body for "lint" exceeds allowed length of 5 (8).
(maxbodylength)
[warning] 96-96: Target body for "fmt" exceeds allowed length of 5 (6).
(maxbodylength)
[warning] 170-170: Missing required phony target "all"
(minphony)
🔇 Additional comments (24)
status-reporter/cmd/reporter/main_suite_test.go (1)
1-13: LGTM! Standard Ginkgo test suite bootstrap.The test suite setup follows Ginkgo v2 conventions correctly.
status-reporter/pkg/k8s/k8s_suite_test.go (1)
1-13: LGTM! Standard Ginkgo test suite bootstrap.The K8s test suite setup follows Ginkgo v2 conventions correctly.
status-reporter/pkg/result/result_suite_test.go (1)
1-13: LGTM! Standard Ginkgo test suite bootstrap.The Result test suite setup follows Ginkgo v2 conventions correctly.
status-reporter/pkg/reporter/reporter_suite_test.go (1)
1-13: LGTM! Standard Ginkgo test suite bootstrap.The Reporter test suite setup follows Ginkgo v2 conventions correctly.
status-reporter/cmd/reporter/main.go (4)
22-68: LGTM! Well-structured main function with proper error handling.The main function correctly:
- Sets up configuration and logging
- Implements panic recovery for the reporter goroutine
- Establishes signal handling for graceful shutdown
- Delegates exit code determination to waitForCompletion
70-82: LGTM! Clean completion handling.The waitForCompletion function properly coordinates between normal completion and signal-driven shutdown, returning appropriate exit codes.
84-120: LGTM! Shutdown handling correctly addresses past review feedback.The completion and shutdown handlers properly:
- Return correct exit codes (1 for errors, 0 for success)
- Avoid blocking indefinitely on errChan by reading it only within the graceful shutdown case
- Force exit on timeout without waiting for the done channel
- Distinguish between expected context.Canceled errors and real errors during shutdown
The past review concerns about exit codes and shutdown timeout blocking have been successfully resolved.
122-138: LGTM! Clear configuration logging.The logConfig function provides helpful visibility into the loaded configuration, including the auto-detection behavior for AdapterContainerName.
status-reporter/pkg/config/config_test.go (4)
13-37: LGTM! Proper test isolation with environment variable management.The BeforeEach/AfterEach hooks correctly save, clear, and restore environment variables to ensure test independence.
39-148: LGTM! Comprehensive test coverage for config loading.The Load() tests thoroughly cover:
- Valid configurations with required and optional fields
- Default value application
- Custom value overrides
- Whitespace trimming
- Missing required environment variables
- Invalid integer parsing
150-235: LGTM! Thorough validation test coverage.The Validate() tests comprehensively verify:
- Valid configurations pass
- Timing parameters must be positive
- PollIntervalSeconds must be less than MaxWaitTimeSeconds
- ResultsPath must be absolute and point to a file
237-249: LGTM! Helper method tests verify duration conversions.The tests for GetPollInterval() and GetMaxWaitTime() correctly verify the conversion from seconds to time.Duration.
status-reporter/pkg/result/parser.go (3)
1-20: LGTM! Sensible size limit and constructor pattern.The 1MB maxResultFileSize is appropriate for JSON result files, and the constructor pattern provides flexibility for future enhancements.
22-38: LGTM! Robust file parsing with appropriate validation.ParseFile correctly validates file contents by checking for empty files and enforcing the size limit before parsing, with descriptive error messages.
40-53: LGTM! Clean parsing with delegated validation.Parse correctly unmarshals JSON and delegates validation to AdapterResult.Validate(), maintaining proper separation of concerns.
status-reporter/Makefile (4)
28-59: LGTM! Robust Go tooling and container runtime detection.The Makefile correctly:
- Configures race detection for tests
- Detects available container runtimes (Docker/Podman)
- Provides clear error messages when no runtime is found
61-109: LGTM! Well-structured development targets.The test, lint, and formatting targets properly:
- Enable race detection for tests
- Generate coverage reports
- Check for tool availability before use
- Prefer goimports over gofmt when available
The static analysis warnings about target body length are stylistic and don't indicate actual problems.
110-146: LGTM! Image build and push targets with runtime validation.The image targets correctly:
- Validate container runtime availability
- Build and tag images appropriately
- Provide clear error messages and status indicators
Note: The binary target uses LDFLAGS that were flagged in a separate comment.
147-171: LGTM! Dev workflow targets with proper validation.The image-dev target appropriately:
- Requires QUAY_USER to be set
- Provides clear usage instructions
- Uses a dev-specific tag with git commit
The verify target usefully chains linting and testing. The static analysis warning about a missing "all" target is a convention, not a requirement.
status-reporter/pkg/reporter/reporter_test.go (1)
1-647: Excellent comprehensive test coverage!The test suite thoroughly exercises the StatusReporter across all code paths:
- Constructor variations with custom and default condition types
- Result processing for success/failure cases
- Error handling for invalid formats and K8s update failures
- Container termination scenarios (OOMKilled, exit codes, crashes)
- Timeout handling with various container states
- End-to-end Run orchestration with file polling
The test structure is well-organized with clear Context blocks, and the use of the mock K8s client enables precise verification of status updates. The comment at lines 238-244 excellently explains the exit-code-zero-without-result-file edge case.
status-reporter/cmd/reporter/main_test.go (1)
1-382: Well-structured lifecycle tests with excellent edge case coverage.The test suite comprehensively validates the reporter's lifecycle management:
- Exit code handling for success, errors, and context cancellation
- Graceful shutdown with signal handling (SIGTERM/SIGINT)
- Shutdown timeout behavior and timer cleanup
- Race conditions between completion and signals
- Proper distinction between context.Canceled in shutdown (exit 0) vs normal paths (exit 1)
The tests cover critical operational scenarios that ensure the sidecar behaves correctly during Kubernetes pod lifecycle events.
status-reporter/pkg/result/result.go (1)
79-94: UTF-8 truncation correctly implemented.The
truncateUTF8helper properly addresses the past review concern about splitting multi-byte UTF-8 characters. The implementation walks backward frommaxBytesto find the last valid rune boundary usingutf8.RuneStart, ensuring the truncated string remains valid UTF-8.The logic correctly handles all cases:
- Returns the original string if already within limits
- Finds the last complete character that fits within
maxBytes- Never splits multi-byte runes
status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (1)
1-35: Clean and effective mock implementation.The
MockK8sClientprovides a straightforward test double with:
- Injectable behavior via function fields for custom error scenarios
- State capture via
LastUpdatedConditionfor assertion verification- Safe defaults (nil/no-error) when custom behavior isn't needed
This enables the comprehensive test coverage in
reporter_test.goby allowing precise control over Kubernetes interactions.status-reporter/pkg/reporter/reporter.go (1)
1-276: Excellent implementation with proper resource management and error handling.The
StatusReporterdemonstrates several strong patterns:Clean architecture:
K8sClientInterfaceabstraction enables testability- Dependency injection via constructors (production vs. test)
- Clear separation of concerns across update methods
Robust orchestration:
- Proper context usage with timeout and graceful cancellation
- WaitGroup ensures goroutine cleanup before return
- Double-select pattern (lines 104-112) gives precedence to results that arrive just before timeout
- Ticker and context properly stopped via defers
Intelligent error handling:
UpdateFromTimeoutcreates a newcontext.Background()(line 221) for container status checks, allowing the query even if the parent context is cancelled - this is correct defensive programmingUpdateFromTerminatedContainerdistinguishes between OOMKilled, non-zero exits, and the edge case of zero-exit without results- Error messages include relevant context (pod name, condition type, timing)
| LDFLAGS := -w -s | ||
| LDFLAGS += -X main.version=$(VERSION) | ||
| LDFLAGS += -X main.commit=$(GIT_COMMIT) | ||
| LDFLAGS += -X main.buildDate=$(BUILD_DATE) | ||
| ifneq ($(GIT_TAG),) | ||
| LDFLAGS += -X main.tag=$(GIT_TAG) | ||
| endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LDFLAGS references non-existent variables in main.go.
The LDFLAGS attempt to inject version metadata (main.version, main.commit, main.buildDate, main.tag) via -X flags, but these variables are not declared in status-reporter/cmd/reporter/main.go.
If you want to embed build metadata, add these variables to main.go:
package main
import (
// ... existing imports
)
var (
version = "dev"
commit = "unknown"
buildDate = "unknown"
tag = ""
)Alternatively, if version metadata is not needed, remove lines 20-26 from the Makefile.
🤖 Prompt for AI Agents
In status-reporter/Makefile around lines 20 to 26, the LDFLAGS use -X to set
main.version, main.commit, main.buildDate and main.tag which do not exist in
status-reporter/cmd/reporter/main.go; to fix either (A) add matching
package-level vars in main.go (e.g., version, commit, buildDate, tag) with
sensible defaults so the -X flags can set them at link time, or (B) if build
metadata is not required, remove the LDFLAGS lines 20–26 from the Makefile so it
no longer references non-existent symbols.
fd11719 to
910c755
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
♻️ Duplicate comments (2)
status-reporter/Makefile (1)
19-26: LDFLAGS-X main.*still needs confirmation in main.go.
This is the same concern as the prior review: ensurestatus-reporter/cmd/reporter/main.godeclaresversion/commit/buildDate/tagvariables, or remove these flags.#!/bin/bash set -euo pipefail # Verify main package variables exist for ldflags injection rg -n --no-heading '^\s*var\s*\(\s*$|^\s*version\s*=|^\s*commit\s*=|^\s*buildDate\s*=|^\s*tag\s*=' status-reporter/cmd/reporter/main.go || true echo "----" sed -n '1,120p' status-reporter/cmd/reporter/main.gostatus-reporter/pkg/config/config.go (1)
126-145: Reintroduce/resultssandboxing forResultsPath(currently allows arbitrary absolute paths).
Right now any absolute file path is accepted; this is a security footgun and was previously flagged.const ( DefaultResultsPath = "/results/adapter-result.json" + ResultsBasePath = "/results" DefaultPollIntervalSeconds = 2 DefaultMaxWaitTimeSeconds = 300 DefaultConditionType = "Available" DefaultLogLevel = "info" DefaultAdapterContainerName = "" ) func (c *Config) validateResultsPath() error { if strings.HasSuffix(c.ResultsPath, "/") { return &ValidationError{ Field: "ResultsPath", Message: "path must be a file, not a directory", } } cleanPath := filepath.Clean(c.ResultsPath) if !filepath.IsAbs(cleanPath) { return &ValidationError{ Field: "ResultsPath", Message: "path must be absolute", } } + + // Ensure cleanPath is within ResultsBasePath (avoid treating "/results2" as inside "/results") + base := filepath.Clean(ResultsBasePath) + if cleanPath != base && !strings.HasPrefix(cleanPath, base+string(os.PathSeparator)) { + return &ValidationError{ + Field: "ResultsPath", + Message: fmt.Sprintf("path must be within %s directory", ResultsBasePath), + } + } return nil }
🧹 Nitpick comments (5)
status-reporter/Dockerfile (1)
2-17: Avoid hardcoding amd64; prefer build args / multi-arch pattern.
If this image is expected to run on arm64 clusters too,GOARCH=amd64will be a problem. Consider usingARG TARGETOS TARGETARCHand building with BuildKit’s target platform variables.FROM golang:1.25-alpine AS builder +ARG TARGETOS=linux +ARG TARGETARCH=amd64 ... -RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags="-w -s" -o status-reporter ./cmd/reporter +RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -ldflags="-w -s" -o status-reporter ./cmd/reporterstatus-reporter/Makefile (1)
61-65: Consider addingallphony target to match common Make conventions.
This also addresses thecheckmake“missing phony target all” warning.+.PHONY: all +all: help + .PHONY: help help: ## Display this help messageAlso applies to: 170-171
status-reporter/pkg/reporter/reporter_test.go (1)
529-555: Potential CI flakes: replace fixed sleeps / exact error strings with Eventually + sentinel errors.
The tests coordinate withtime.Sleep(...)and sometimes asserterr.Error() == "timeout waiting for adapter results", which is brittle if wording changes. PreferEventuallyaround file existence/condition updates and typed/sentinel errors (orMatchErrorwith a shared exported error).Also applies to: 652-687, 747-842
status-reporter/pkg/k8s/client.go (1)
116-129: Avoid returning the address of a range variable (&cs).
This is a common Go pitfall and is easy to regress if the loop logic changes. Prefer indexing to take the address of the slice element.- for _, cs := range podStatus.ContainerStatuses { - if cs.Name == containerName { - return &cs, nil - } - } + for i := range podStatus.ContainerStatuses { + if podStatus.ContainerStatuses[i].Name == containerName { + return &podStatus.ContainerStatuses[i], nil + } + } ... - for _, cs := range podStatus.ContainerStatuses { - if cs.Name != StatusReporterContainerName { - return &cs, nil - } - } + for i := range podStatus.ContainerStatuses { + if podStatus.ContainerStatuses[i].Name != StatusReporterContainerName { + return &podStatus.ContainerStatuses[i], nil + } + }status-reporter/pkg/reporter/reporter.go (1)
204-226: Consider checkingLastTerminationState.Terminated(restart/crash-loop visibility) + clamp condition message size.
If the container restarts,State.Terminatedmay be nil whileLastTerminationState.Terminatedhas the real failure reason; also, conditionMessagecould grow large depending on parse errors.Also applies to: 375-405
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
status-reporter/go.sumis excluded by!**/*.sum
📒 Files selected for processing (22)
README.md(1 hunks)status-reporter/Dockerfile(1 hunks)status-reporter/Makefile(1 hunks)status-reporter/cmd/reporter/main.go(1 hunks)status-reporter/cmd/reporter/main_suite_test.go(1 hunks)status-reporter/cmd/reporter/main_test.go(1 hunks)status-reporter/go.mod(1 hunks)status-reporter/pkg/config/config.go(1 hunks)status-reporter/pkg/config/config_suite_test.go(1 hunks)status-reporter/pkg/config/config_test.go(1 hunks)status-reporter/pkg/k8s/client.go(1 hunks)status-reporter/pkg/k8s/client_test.go(1 hunks)status-reporter/pkg/k8s/k8s_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter.go(1 hunks)status-reporter/pkg/reporter/reporter_suite_test.go(1 hunks)status-reporter/pkg/reporter/reporter_test.go(1 hunks)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go(1 hunks)status-reporter/pkg/result/parser.go(1 hunks)status-reporter/pkg/result/parser_test.go(1 hunks)status-reporter/pkg/result/result.go(1 hunks)status-reporter/pkg/result/result_suite_test.go(1 hunks)status-reporter/pkg/result/result_test.go(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- status-reporter/cmd/reporter/main_suite_test.go
🚧 Files skipped from review as they are similar to previous changes (9)
- status-reporter/pkg/config/config_suite_test.go
- status-reporter/pkg/result/result_suite_test.go
- status-reporter/pkg/reporter/reporter_suite_test.go
- status-reporter/go.mod
- status-reporter/pkg/result/result_test.go
- status-reporter/cmd/reporter/main.go
- status-reporter/pkg/result/parser.go
- status-reporter/pkg/result/parser_test.go
- status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go
🧰 Additional context used
🧬 Code graph analysis (4)
status-reporter/pkg/config/config_test.go (1)
status-reporter/pkg/config/config.go (2)
Load(57-105)Config(13-23)
status-reporter/pkg/reporter/reporter.go (3)
status-reporter/pkg/k8s/client.go (2)
JobCondition(49-55)NewClient(30-46)status-reporter/pkg/result/result.go (1)
AdapterResult(32-44)status-reporter/pkg/result/parser.go (2)
Parser(16-16)NewParser(19-21)
status-reporter/pkg/reporter/reporter_test.go (4)
status-reporter/pkg/reporter/reporter.go (3)
StatusReporter(49-59)NewReporterWithClient(72-74)NewReporterWithClientAndIntervals(77-79)status-reporter/pkg/reporter/testhelpers/mock_k8s_client.go (2)
MockK8sClient(12-16)NewMockK8sClient(18-20)status-reporter/pkg/result/result.go (3)
AdapterResult(32-44)StatusSuccess(11-11)StatusFailure(12-12)status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
status-reporter/pkg/k8s/client_test.go (1)
status-reporter/pkg/k8s/client.go (1)
JobCondition(49-55)
🪛 checkmake (0.2.2)
status-reporter/Makefile
[warning] 85-85: Target body for "lint" exceeds allowed length of 5 (8).
(maxbodylength)
[warning] 96-96: Target body for "fmt" exceeds allowed length of 5 (6).
(maxbodylength)
[warning] 170-170: Missing required phony target "all"
(minphony)
🔇 Additional comments (8)
README.md (1)
88-90: Doc typo from earlier review looks fixed.
The “Adapter writes to the result file:” line no longer has the stray backtick.Also applies to: 110-112
status-reporter/pkg/k8s/k8s_suite_test.go (1)
1-13: Standard Ginkgo suite bootstrap looks correct.status-reporter/pkg/k8s/client_test.go (1)
12-41: LGTM (basic coverage).
These are simple smoke tests; consider expanding later to cover behavior around status update helpers (if/when present).status-reporter/Dockerfile (1)
2-2: Pin/validate base image tags against your supported toolchain.
Please confirmgolang:1.25-alpine(and the distroless tag) match what you want for CI/prod and actually exist in your registries.Also applies to: 19-19
status-reporter/pkg/config/config_test.go (1)
16-37: Env isolation approach looks solid.status-reporter/cmd/reporter/main_test.go (1)
15-40: Coverage of signal/normal completion paths is strong.Also applies to: 156-256
status-reporter/pkg/reporter/reporter_test.go (1)
20-843: Scenario coverage is thorough and matches the PR’s intent.
Good breadth across result parsing, termination, timeout, and status update propagation.status-reporter/pkg/result/result.go (1)
51-94: UTF-8-safe truncation + normalization looks solid.
This addresses the prior “byte slice can corrupt UTF-8” issue while keeping the contract simple.
| Resulting Kubernetes Job status: | ||
| ```yaml | ||
| status: | ||
| conditions: | ||
| - type: Available | ||
| status: "True" | ||
| reason: ValidationPassed | ||
| message: GCP environment validated successfully | ||
| lastTransitionTime: "2024-01-15T10:30:00Z" | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using Job-native condition types in examples (or explicitly call out “custom”).
Using type: Available in Job status examples can confuse readers and some tooling; if custom condition types are intentional, clarify that in the doc examples (or switch examples to Complete/Failed).
Also applies to: 123-132, 137-145
🤖 Prompt for AI Agents
In README.md around lines 97-106 (also review examples at 123-132 and 137-145),
the Job status YAML uses a non-standard/custom condition type "Available" which
can confuse readers and tooling; either change the example condition types to
native Job conditions such as "Complete" or "Failed" with appropriate
status/messages, or explicitly label the snippet as using a custom condition
type (e.g., add a comment/note above the YAML stating "custom condition type
used for illustration"). Ensure the chosen fix is applied consistently to the
other referenced snippets (123-132 and 137-145).
| Context("when reporter exceeds shutdown timeout", func() { | ||
| It("returns exit code 1 after timeout", func() { | ||
| // Simulate reporter hanging (never sends to done) | ||
| go func() { | ||
| <-ctx.Done() | ||
| // Never send to done - simulating hang | ||
| time.Sleep(10 * time.Second) | ||
| }() | ||
|
|
||
| start := time.Now() | ||
| exitCode := handleShutdown(syscall.SIGTERM, cancel, done) | ||
| duration := time.Since(start) | ||
|
|
||
| Expect(exitCode).To(Equal(1)) | ||
| Expect(duration).To(BeNumerically(">=", shutdownTimeout)) | ||
| Expect(duration).To(BeNumerically("<", shutdownTimeout+200*time.Millisecond)) | ||
| }) | ||
|
|
||
| It("does not block indefinitely", func() { | ||
| go func() { | ||
| <-ctx.Done() | ||
| // Hang forever | ||
| select {} | ||
| }() | ||
|
|
||
| done := make(chan int, 1) | ||
| go func() { | ||
| exitCode := handleShutdown(syscall.SIGTERM, cancel, make(chan error, 1)) | ||
| done <- exitCode | ||
| }() | ||
|
|
||
| Eventually(done, shutdownTimeout+1*time.Second).Should(Receive(Equal(1))) | ||
| }) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reduce flake risk: loosen timing assertions + avoid forever goroutines.
The < shutdownTimeout+200ms bound is tight under load, and select {} will leak a goroutine. Prefer a goroutine that exits on ctx.Done() and use Eventually for time-based expectations.
- go func() {
- <-ctx.Done()
- // Hang forever
- select {}
- }()
+ go func() {
+ <-ctx.Done()
+ // Simulate hang until test cancels context
+ <-ctx.Done()
+ }()Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In status-reporter/cmd/reporter/main_test.go around lines 104 to 137, the test
has a tight timing assertion and a goroutine that blocks forever causing
flakiness and leaks; change the first It to relax the upper timing bound (remove
the strict "< shutdownTimeout+200ms" and instead assert duration is >=
shutdownTimeout and <= shutdownTimeout+1s or use Eventually to allow variance),
and rewrite the hanging goroutines so they exit on ctx.Done() (replace select{}
with <-ctx.Done() or a sleep until ctx.Done()) and for the second It use
Eventually on the done channel with a timeout like shutdownTimeout+1s to assert
it receives 1, ensuring all goroutines can terminate and timing is more
tolerant.
| .PHONY: image-dev | ||
| image-dev: ## Build and push to personal Quay registry (requires QUAY_USER) | ||
| ifndef QUAY_USER | ||
| @echo "❌ ERROR: QUAY_USER is not set" | ||
| @echo "" | ||
| @echo "Usage: QUAY_USER=myuser make image-dev" | ||
| @echo "" | ||
| @echo "This will build and push to: quay.io/$$QUAY_USER/$(IMAGE_NAME):$(DEV_TAG)" | ||
| @exit 1 | ||
| endif | ||
| ifeq ($(CONTAINER_RUNTIME),none) | ||
| @echo "❌ ERROR: No container runtime found" | ||
| @echo "Please install Docker or Podman" | ||
| @exit 1 | ||
| else | ||
| @echo "Building dev image quay.io/$(QUAY_USER)/$(IMAGE_NAME):$(DEV_TAG)..." | ||
| $(CONTAINER_CMD) build -t quay.io/$(QUAY_USER)/$(IMAGE_NAME):$(DEV_TAG) . | ||
| @echo "Pushing dev image..." | ||
| $(CONTAINER_CMD) push quay.io/$(QUAY_USER)/$(IMAGE_NAME):$(DEV_TAG) | ||
| @echo "" | ||
| @echo "✅ Dev image pushed: quay.io/$(QUAY_USER)/$(IMAGE_NAME):$(DEV_TAG)" | ||
| endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix image-dev QUAY_USER guard (currently allows empty QUAY_USER).
Because QUAY_USER ?= defines the var, ifndef QUAY_USER won’t fire when it’s empty.
.PHONY: image-dev
image-dev: ## Build and push to personal Quay registry (requires QUAY_USER)
-ifndef QUAY_USER
+ifeq ($(strip $(QUAY_USER)),)
@echo "❌ ERROR: QUAY_USER is not set"
@echo ""
@echo "Usage: QUAY_USER=myuser make image-dev"
@echo ""
@echo "This will build and push to: quay.io/$$QUAY_USER/$(IMAGE_NAME):$(DEV_TAG)"
@exit 1
endif| It("uses default values for optional fields", func() { | ||
| cfg, err := config.Load() | ||
| Expect(err).NotTo(HaveOccurred()) | ||
| Expect(cfg.ResultsPath).To(Equal("/results/adapter-result.json")) | ||
| Expect(cfg.PollIntervalSeconds).To(Equal(2)) | ||
| Expect(cfg.MaxWaitTimeSeconds).To(Equal(300)) | ||
| Expect(cfg.ConditionType).To(Equal("Available")) | ||
| Expect(cfg.LogLevel).To(Equal("info")) | ||
| Expect(cfg.AdapterContainerName).To(Equal("")) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid hardcoding defaults in tests; reference config constants (prevents drift).
AdapterContainerName default asserted as "" is especially likely to diverge from DefaultAdapterContainerName.
- Expect(cfg.ResultsPath).To(Equal("/results/adapter-result.json"))
- Expect(cfg.PollIntervalSeconds).To(Equal(2))
- Expect(cfg.MaxWaitTimeSeconds).To(Equal(300))
- Expect(cfg.ConditionType).To(Equal("Available"))
- Expect(cfg.LogLevel).To(Equal("info"))
- Expect(cfg.AdapterContainerName).To(Equal(""))
+ Expect(cfg.ResultsPath).To(Equal(config.DefaultResultsPath))
+ Expect(cfg.PollIntervalSeconds).To(Equal(config.DefaultPollIntervalSeconds))
+ Expect(cfg.MaxWaitTimeSeconds).To(Equal(config.DefaultMaxWaitTimeSeconds))
+ Expect(cfg.ConditionType).To(Equal(config.DefaultConditionType))
+ Expect(cfg.LogLevel).To(Equal(config.DefaultLogLevel))
+ Expect(cfg.AdapterContainerName).To(Equal(config.DefaultAdapterContainerName))Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In status-reporter/pkg/config/config_test.go around lines 56 to 65, the test
hardcodes default values (notably AdapterContainerName as "") which can drift
from the package defaults; update the assertions to compare against the config
package constants (e.g. config.DefaultAdapterContainerName and the other
Default* constants such as DefaultResultsPath, DefaultPollIntervalSeconds,
DefaultMaxWaitTimeSeconds, DefaultConditionType, DefaultLogLevel) instead of
literal values so the test always reflects the source-of-truth defaults.
| // GetAdapterContainerStatus finds the adapter container status | ||
| func (c *Client) GetAdapterContainerStatus(ctx context.Context, podName, containerName string) (*corev1.ContainerStatus, error) { | ||
| podStatus, err := c.GetPodStatus(ctx, podName) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
|
|
||
| if containerName != "" { | ||
| for _, cs := range podStatus.ContainerStatuses { | ||
| if cs.Name == containerName { | ||
| return &cs, nil | ||
| } | ||
| } | ||
| return nil, fmt.Errorf("container not found: namespace=%s pod=%s container=%s", c.namespace, podName, containerName) | ||
| } | ||
|
|
||
| for _, cs := range podStatus.ContainerStatuses { | ||
| if cs.Name != StatusReporterContainerName { | ||
| return &cs, nil | ||
| } | ||
| } | ||
|
|
||
| return nil, fmt.Errorf("adapter container not found: namespace=%s pod=%s", c.namespace, podName) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# Find and examine the k8s client file and related code
find . -type f -name "*.go" | head -20Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 896
🏁 Script executed:
# Search for GetAdapterContainerStatus usage
rg "GetAdapterContainerStatus" --type go -B 2 -A 2Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 8709
🏁 Script executed:
# Search for StatusReporterContainerName definition
rg "StatusReporterContainerName" --type go -B 2 -A 2Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 729
🏁 Script executed:
# Search for ADAPTER_CONTAINER_NAME
rg "ADAPTER_CONTAINER_NAME" --type go -B 2 -A 2Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 1739
🏁 Script executed:
# Check the full client.go file to understand context
wc -l status-reporter/pkg/k8s/client.goRepository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 124
🏁 Script executed:
# Get the file structure and NewClient to understand initialization
head -150 status-reporter/pkg/k8s/client.go | tail -50Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 1594
🏁 Script executed:
# Check for README or deployment docs
find . -type f \( -name "README*" -o -name "*.md" \) | head -20Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 98
🏁 Script executed:
# Look at the main.go to see full context of how it's initialized
cat -n status-reporter/cmd/reporter/main.goRepository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 4757
🏁 Script executed:
# Check the config.go to see if AdapterContainerName has defaults
cat -n status-reporter/pkg/config/config.goRepository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 6163
🏁 Script executed:
# Check README for deployment guidance about container names
cat README.mdRepository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 8281
Require explicit ADAPTER_CONTAINER_NAME when sidecar injection is present—the fallback "first non-status-reporter" container selection is fragile and non-deterministic.
If no container name is specified, the function returns the first container that is not "status-reporter". In Kubernetes clusters with sidecar injection (Istio, service meshes, etc.), container order in ContainerStatuses is undefined, making it possible to select an injected sidecar instead of the adapter container and silently misreport its status.
Either:
- Require
ADAPTER_CONTAINER_NAMEto be explicitly set and fail fast with a clear error if the fallback path is triggered, or - Validate that exactly one non-status-reporter container exists before returning it.
The fallback is documented in logs as "(auto-detect)" but not enforced safely in code.
🤖 Prompt for AI Agents
In status-reporter/pkg/k8s/client.go around lines 109 to 132, the current
fallback that returns the first non-status-reporter container is unsafe with
sidecar injection; change the logic so that when containerName is empty you
collect all ContainerStatuses where Name != StatusReporterContainerName and
then: if the resulting slice length == 1 return that container, otherwise return
an explicit error that auto-detection failed and instructs the caller to set
ADAPTER_CONTAINER_NAME (include namespace, pod, and the count of non-status
containers in the message). This enforces deterministic behavior and fails fast
when multiple candidate containers (e.g. injected sidecars) exist.
| // Run starts the reporter and blocks until completion | ||
| func (r *StatusReporter) Run(ctx context.Context) error { | ||
| log.Printf("Status reporter starting...") | ||
| log.Printf(" Pod: %s", r.podName) | ||
| log.Printf(" Results path: %s", r.resultsPath) | ||
| log.Printf(" Poll interval: %s", r.pollInterval) | ||
| log.Printf(" Max wait time: %s", r.maxWaitTime) | ||
|
|
||
| timeoutCtx, cancel := context.WithTimeout(ctx, r.maxWaitTime) | ||
| defer cancel() | ||
|
|
||
| // Buffered channels (size 1) prevent goroutine leaks if the main select has already | ||
| // chosen another case when a sender tries to send | ||
| channels := &pollChannels{ | ||
| result: make(chan *result.AdapterResult, 1), | ||
| error: make(chan error, 1), | ||
| terminated: make(chan *corev1.ContainerStateTerminated, 1), | ||
| done: make(chan struct{}), | ||
| } | ||
|
|
||
| var wg sync.WaitGroup | ||
| wg.Add(2) | ||
| go r.pollForResultFile(timeoutCtx, channels, &wg) | ||
| go r.monitorContainerStatus(timeoutCtx, channels, &wg) | ||
|
|
||
| var reportErr error | ||
| select { | ||
| case adapterResult := <-channels.result: | ||
| reportErr = r.UpdateFromResult(ctx, adapterResult) | ||
| case err := <-channels.error: | ||
| reportErr = r.UpdateFromError(ctx, err) | ||
| case terminated := <-channels.terminated: | ||
| reportErr = r.HandleTermination(ctx, terminated) | ||
| case <-timeoutCtx.Done(): | ||
| // Give precedence to results/errors/termination that may have arrived just before timeout | ||
| select { | ||
| case adapterResult := <-channels.result: | ||
| reportErr = r.UpdateFromResult(ctx, adapterResult) | ||
| case err := <-channels.error: | ||
| reportErr = r.UpdateFromError(ctx, err) | ||
| case terminated := <-channels.terminated: | ||
| reportErr = r.HandleTermination(ctx, terminated) | ||
| default: | ||
| reportErr = r.UpdateFromTimeout(ctx) | ||
| } | ||
| } | ||
|
|
||
| close(channels.done) | ||
| wg.Wait() | ||
|
|
||
| return reportErr | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
cd status-reporter && cat -n pkg/reporter/reporter.go | sed -n '95,146p'Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 2173
🏁 Script executed:
# Also check surrounding code to understand the full context
cd status-reporter && cat -n pkg/reporter/reporter.go | sed -n '80,160p'Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 3597
🏁 Script executed:
# Check what UpdateFromTimeout does
cd status-reporter && rg -A 10 "func.*UpdateFromTimeout"Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 1025
🏁 Script executed:
# Check what UpdateFromError does (mentioned in the review as potentially embedding unbounded errors)
cd status-reporter && rg -A 10 "func.*UpdateFromError"Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 778
🌐 Web query:
Go context package DeadlineExceeded vs Canceled behavior
💡 Result:
Short answer
- ctx.Done() is closed in both cases; the difference is the error returned by ctx.Err():
- context.Canceled — the context was explicitly canceled (Cancel/CancelFunc called or parent canceled). [1]
- context.DeadlineExceeded — the context’s deadline/timeout expired. [1][2]
Details (practical points)
- Use WithCancel to produce a cancellation that yields context.Canceled when you call the CancelFunc. [1]
- Use WithDeadline/WithTimeout to produce a timeout that yields context.DeadlineExceeded when the deadline passes. Always still call the returned CancelFunc to release resources. [1]
- Many libraries treat both as “stop work now” (Done closed), but they may handle them differently (e.g., logging, connection handling) because the Err value indicates whether it was caller-initiated cancellation or a timeout. [1][3]
Sources
[1] Go context package docs (pkg.go.dev).
[2] Go stdlib context overview (golang.org / pkg docs).
[3] Discussion/examples about cancellation vs timeout behavior (blog/issue references).
🏁 Script executed:
cd status-reporter && rg -A 20 "func.*UpdateFromTimeout"Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 1619
🏁 Script executed:
# Check what UpdateFromTerminatedContainer does
cd status-reporter && rg -A 15 "func.*UpdateFromTerminatedContainer"Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 1227
🏁 Script executed:
# Check the Job condition update mechanism to understand side effects
cd status-reporter && rg -A 10 "UpdateJobStatus" | head -50Repository: openshift-hyperfleet/adapter-validation-gcp
Length of output: 3243
Distinguish "timeout" vs "context canceled" to avoid misreporting during shutdown/deletion.
timeoutCtx.Done() fires for both deadline-exceeded and parent context cancellation, but the code always runs the "timeout" path (unless a result/error arrived). When the parent context cancels (e.g., during pod/job deletion), this incorrectly marks the Job with ReasonAdapterTimeout instead of propagating the cancellation.
func (r *StatusReporter) Run(ctx context.Context) error {
...
case <-timeoutCtx.Done():
// Give precedence to results/errors/termination that may have arrived just before timeout
select {
case adapterResult := <-channels.result:
reportErr = r.UpdateFromResult(ctx, adapterResult)
case err := <-channels.error:
reportErr = r.UpdateFromError(ctx, err)
case terminated := <-channels.terminated:
reportErr = r.HandleTermination(ctx, terminated)
default:
+ if errors.Is(timeoutCtx.Err(), context.DeadlineExceeded) {
+ reportErr = r.UpdateFromTimeout(ctx)
+ } else {
+ // Parent canceled (shutdown/deletion). Avoid writing misleading "timeout" status.
+ reportErr = timeoutCtx.Err()
+ }
- reportErr = r.UpdateFromTimeout(ctx)
}
}
...
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Run starts the reporter and blocks until completion | |
| func (r *StatusReporter) Run(ctx context.Context) error { | |
| log.Printf("Status reporter starting...") | |
| log.Printf(" Pod: %s", r.podName) | |
| log.Printf(" Results path: %s", r.resultsPath) | |
| log.Printf(" Poll interval: %s", r.pollInterval) | |
| log.Printf(" Max wait time: %s", r.maxWaitTime) | |
| timeoutCtx, cancel := context.WithTimeout(ctx, r.maxWaitTime) | |
| defer cancel() | |
| // Buffered channels (size 1) prevent goroutine leaks if the main select has already | |
| // chosen another case when a sender tries to send | |
| channels := &pollChannels{ | |
| result: make(chan *result.AdapterResult, 1), | |
| error: make(chan error, 1), | |
| terminated: make(chan *corev1.ContainerStateTerminated, 1), | |
| done: make(chan struct{}), | |
| } | |
| var wg sync.WaitGroup | |
| wg.Add(2) | |
| go r.pollForResultFile(timeoutCtx, channels, &wg) | |
| go r.monitorContainerStatus(timeoutCtx, channels, &wg) | |
| var reportErr error | |
| select { | |
| case adapterResult := <-channels.result: | |
| reportErr = r.UpdateFromResult(ctx, adapterResult) | |
| case err := <-channels.error: | |
| reportErr = r.UpdateFromError(ctx, err) | |
| case terminated := <-channels.terminated: | |
| reportErr = r.HandleTermination(ctx, terminated) | |
| case <-timeoutCtx.Done(): | |
| // Give precedence to results/errors/termination that may have arrived just before timeout | |
| select { | |
| case adapterResult := <-channels.result: | |
| reportErr = r.UpdateFromResult(ctx, adapterResult) | |
| case err := <-channels.error: | |
| reportErr = r.UpdateFromError(ctx, err) | |
| case terminated := <-channels.terminated: | |
| reportErr = r.HandleTermination(ctx, terminated) | |
| default: | |
| reportErr = r.UpdateFromTimeout(ctx) | |
| } | |
| } | |
| close(channels.done) | |
| wg.Wait() | |
| return reportErr | |
| } | |
| // Run starts the reporter and blocks until completion | |
| func (r *StatusReporter) Run(ctx context.Context) error { | |
| log.Printf("Status reporter starting...") | |
| log.Printf(" Pod: %s", r.podName) | |
| log.Printf(" Results path: %s", r.resultsPath) | |
| log.Printf(" Poll interval: %s", r.pollInterval) | |
| log.Printf(" Max wait time: %s", r.maxWaitTime) | |
| timeoutCtx, cancel := context.WithTimeout(ctx, r.maxWaitTime) | |
| defer cancel() | |
| // Buffered channels (size 1) prevent goroutine leaks if the main select has already | |
| // chosen another case when a sender tries to send | |
| channels := &pollChannels{ | |
| result: make(chan *result.AdapterResult, 1), | |
| error: make(chan error, 1), | |
| terminated: make(chan *corev1.ContainerStateTerminated, 1), | |
| done: make(chan struct{}), | |
| } | |
| var wg sync.WaitGroup | |
| wg.Add(2) | |
| go r.pollForResultFile(timeoutCtx, channels, &wg) | |
| go r.monitorContainerStatus(timeoutCtx, channels, &wg) | |
| var reportErr error | |
| select { | |
| case adapterResult := <-channels.result: | |
| reportErr = r.UpdateFromResult(ctx, adapterResult) | |
| case err := <-channels.error: | |
| reportErr = r.UpdateFromError(ctx, err) | |
| case terminated := <-channels.terminated: | |
| reportErr = r.HandleTermination(ctx, terminated) | |
| case <-timeoutCtx.Done(): | |
| // Give precedence to results/errors/termination that may have arrived just before timeout | |
| select { | |
| case adapterResult := <-channels.result: | |
| reportErr = r.UpdateFromResult(ctx, adapterResult) | |
| case err := <-channels.error: | |
| reportErr = r.UpdateFromError(ctx, err) | |
| case terminated := <-channels.terminated: | |
| reportErr = r.HandleTermination(ctx, terminated) | |
| default: | |
| if errors.Is(timeoutCtx.Err(), context.DeadlineExceeded) { | |
| reportErr = r.UpdateFromTimeout(ctx) | |
| } else { | |
| // Parent canceled (shutdown/deletion). Avoid writing misleading "timeout" status. | |
| reportErr = timeoutCtx.Err() | |
| } | |
| } | |
| } | |
| close(channels.done) | |
| wg.Wait() | |
| return reportErr | |
| } |
🤖 Prompt for AI Agents
In status-reporter/pkg/reporter/reporter.go around lines 95 to 146, the select
branch for <-timeoutCtx.Done() treats all cancellations as a deadline timeout;
change it to distinguish context.DeadlineExceeded from parent cancellation by
checking timeoutCtx.Err(): if errors.Is(timeoutCtx.Err(),
context.DeadlineExceeded) keep the existing inner select and call
r.UpdateFromTimeout(...) when nothing else arrived, otherwise propagate the
cancellation immediately (e.g., return timeoutCtx.Err() or call
r.UpdateFromError with the cancellation error) so pod/job deletion doesn't get
misreported as a timeout.
Summary
Implements a cloud-agnostic, reusable Kubernetes sidecar container that monitors adapter operations and updates Job status based on execution results.
Changes
Summary by CodeRabbit
New Features
Documentation
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.