
Conversation

@rwsu (Contributor) commented Jan 23, 2025

When a proxy is configured with a self-signed certificate, the certificate needs to be made available to the node-joiner pod to allow it to communicate with the proxy.

It is assumed that the proxy certificate is configured in the cluster proxy spec as

spec:
  trustedCA:
    name: user-ca-bundle

The certificate is stored in a config map named "user-ca-bundle" in the "openshift-config" namespace in a file named ca-bundle.crt.

apiVersion: v1
data:
  ca-bundle.crt: |
    -----BEGIN CERTIFICATE-----
    ** redacted **
    -----END CERTIFICATE-----
kind: ConfigMap
metadata:
  name: user-ca-bundle
  namespace: openshift-config

If the certificate is included in the additionalTrustBundle field in install-config.yaml prior to cluster installation, the proxy and the user-ca-bundle config map are configured automatically as illustrated above.

Previously, the certificate was not mounted into the node-joiner pod, so when the pod attempted to pull images through the proxy, it failed.

Now the user-ca-bundle config map is copied to the node-joiner pod's namespace and the certificate is mounted and made available as /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 23, 2025
@openshift-ci-robot

@rwsu: This pull request references Jira Issue OCPBUGS-44637, which is invalid:

  • expected the bug to target the "4.19.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

When a proxy is configured with a self-signed certificate, the certificate needs to be made available to the node-joiner pod to allow it to communicate with the proxy.

Previously, the certificate wasn't mounted to the node-joiner pod so that when the pod attempted to pull images through the proxy, it failed.

Now the user-ca-bundle config map is copied to the node-joiner pod's namespace and the certificate is mounted and made available as /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Jan 23, 2025
openshift-ci bot (Contributor) commented Jan 23, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rwsu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 23, 2025
@rwsu (Contributor, Author) commented Jan 23, 2025

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jan 23, 2025
@openshift-ci-robot

@rwsu: This pull request references Jira Issue OCPBUGS-44637, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.0) matches configured target version for branch (4.19.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @mhanss

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from mhanss January 23, 2025 15:14
Name: "user-ca-bundle",
MountPath: "/etc/pki/ca-trust/extracted/pem",
})

A contributor commented:

Given that this seems the reasonable motivation for the issue, one point is not clear to me: why must the oc command be responsible for extracting a certificate and injecting it into the node-joiner container?
Couldn't that be managed directly by the node-joiner tool itself?
Personally I'd prefer to avoid additional logic in the oc wrapper layer beyond what is required for proper execution of the container (for the rest, the node-joiner must be able to work on its own).

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 7, 2025
@fkawakubo

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 21, 2025
@openshift-ci
Contributor

openshift-ci bot commented Jul 16, 2025

@rwsu: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/e2e-aws-ovn-serial-1of2
Commit: 9b1bb8a (link)
Required: true
Rerun command: /test e2e-aws-ovn-serial-1of2

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 15, 2025
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 15, 2025
@coderabbitai

coderabbitai bot commented Nov 15, 2025

Walkthrough

Adds support for copying cluster-wide user-ca-bundle ConfigMaps to the node-joiner pod for self-signed proxy scenarios. Includes new test coverage and refactors test infrastructure to inject additional runtime objects for testing proxy and ConfigMap handling.

Changes

  • CA bundle propagation for proxy setups — pkg/cli/admin/nodeimage/create.go
    Implements logic to retrieve the user-ca-bundle ConfigMap from openshift-config, create it in the node-joiner namespace, and mount it in the pod with proper volume and volume mount configuration. Returns early if no proxy is configured.
  • Test infrastructure and coverage — pkg/cli/admin/nodeimage/create_test.go
    Introduces a configObjects hook for injecting config-related Kubernetes objects and refactors the createFakes signature to accept a clientObjs parameter. Adds a new test case validating pod volume and volumeMount behavior with user-ca-bundle present and a proxy configured.
  • Test infrastructure updates — pkg/cli/admin/nodeimage/monitor_test.go
    Updates the createFakes call to pass nil for the new clientObjs parameter.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • create.go: Verify ConfigMap retrieval, error handling for missing ConfigMap, volume/volumeMount configuration in pod spec, and early return logic when proxy is absent
  • create_test.go: Review test infrastructure refactoring, especially the new clientObjs hook injection pattern and its impact on fake client setup; validate new test case assertions for volumes and volume mounts
  • Cross-file concern: Ensure createFakes signature change is consistently applied across all test files that use it

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
pkg/cli/admin/nodeimage/create.go (1)

801-871: CA bundle propagation is correct; consider idempotence and multi‑container robustness

The new logic to:

  • gate on HTTPProxy/HTTPSProxy,
  • copy user-ca-bundle from openshift-config,
  • mount it as tls-ca-bundle.pem under /etc/pki/ca-trust/extracted/pem,

matches the documented proxy + user CA flow and should address image pulls behind a self-signed proxy.

A couple of robustness tweaks you might consider:

  1. ConfigMap creation idempotence
    If user-ca-bundle already exists in the node-joiner namespace (e.g., leftover from a previous run), Create will fail. Treating IsAlreadyExists(err) as non-fatal would make this code resilient to partial cleanup or concurrent invocations:
cmClient := o.Client.CoreV1().ConfigMaps(o.nodeJoinerNamespace.GetName())
_, err = cmClient.Create(ctx, cm, metav1.CreateOptions{})
if err != nil {
    if kapierrors.IsAlreadyExists(err) {
        klog.V(2).Infof("user-ca-bundle already present in %s namespace", o.nodeJoinerNamespace.GetName())
        // fall through to volume wiring
    } else {
        klog.V(2).Infof("Error writing user-ca-bundle to %s namespace: %v", o.nodeJoinerNamespace.GetName(), err)
        return err
    }
}
  2. Future sidecars
    The volumeMount is added only to pod.Spec.Containers[0]. That’s fine with the current single-container pod, but if a sidecar is ever added that also needs proxy access, it will miss the CA bundle. A small helper that appends the volumeMount to all containers would future‑proof this.
pkg/cli/admin/nodeimage/create_test.go (1)

233-281: Strengthen the user‑CA bundle test with path/key assertions

The new test:

"node-joiner pod should mount user-ca-bundle as a volume if it is available and a proxy is configured"

correctly verifies that:

  • the pod spec includes a Volume named user-ca-bundle, and
  • container 0 has a VolumeMount for that volume.

To better guard the regression this PR is fixing, consider tightening the assertions to also check:

  • the Volume’s ConfigMap source has Name == "user-ca-bundle" and an Items entry mapping Key: "ca-bundle.crt" to Path: "tls-ca-bundle.pem", and
  • the VolumeMount’s MountPath equals /etc/pki/ca-trust/extracted/pem.

That way, any accidental change to the key, filename, or mount path (which are critical for the CA trust flow) will be caught by this test.


📥 Commits

Reviewing files that changed from the base of the PR and between e005223 and 9b1bb8a.

📒 Files selected for processing (3)
  • pkg/cli/admin/nodeimage/create.go (1 hunks)
  • pkg/cli/admin/nodeimage/create_test.go (7 hunks)
  • pkg/cli/admin/nodeimage/monitor_test.go (1 hunks)
🔇 Additional comments (2)
pkg/cli/admin/nodeimage/monitor_test.go (1)

87-87: Updated createFakes call wiring looks correct

Passing nil for the new clientObjs hook matches the updated helper signature and is appropriate here since the monitor tests don’t require extra cluster objects.

pkg/cli/admin/nodeimage/create_test.go (1)

134-148: Refactored test wiring between core client and config client looks consistent

The introduction of:

  • configObjects func(string, string) []runtime.Object in the test case struct, and
  • the updated createFakes(t, podName, clientObjs) helper that seeds the core client,

cleanly separates:

  • core client objects (e.g., the user-ca-bundle ConfigMap via tc.objects, passed through createFakes into fake.NewSimpleClientset), from
  • config client objects (e.g., ClusterVersion, Proxy via tc.configObjects, passed into configv1fake.NewSimpleClientset).

This mirrors how production code uses the two clients and ensures:

  • the “missing cluster connection” case is driven by omitting ClusterVersion in configObjects, and
  • proxy-related tests see configv1.Proxy only through the config client.

The updated createFakes implementation correctly:

  • derives the repo/digest from the fake registry and passes them into clientObjs, and
  • seeds the fake core client with those runtime.Objects before attaching the pod-creation reactor.

This restructuring looks sound and keeps tests maintainable as more config-side objects are introduced.

Also applies to: 152-161, 166-187, 193-232, 283-287, 291-305, 592-604

@fkawakubo

/remove-lifecycle stale


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
