-
Notifications
You must be signed in to change notification settings - Fork 41
OCPBUGS-57447,OCPBUGS-45056: Refrain from adding Egress IP to public LB backend pool #180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@arghosh93: This pull request references Jira Issue OCPBUGS-57447, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@arghosh93: This pull request references Jira Issue OCPBUGS-57447, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@arghosh93: This pull request references Jira Issue OCPBUGS-57447, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. This pull request references Jira Issue OCPBUGS-45056, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@arghosh93: This pull request references Jira Issue OCPBUGS-57447, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: This pull request references Jira Issue OCPBUGS-45056, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
arkadeepsen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any tests which can be used to verify the changes made in this PR will correctly solve the issue?
pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller.go
Outdated
Show resolved
Hide resolved
We lack knowledge of API of different cloud providers to fake it. That is the main reason behind not having enough unit tests. |
c2b5065 to
6dca85c
Compare
|
@arghosh93 Does this PR introduce any limitations on pod egress traffic? From my understanding, if we skip adding the EgressIP to the load balancer backend pools, the egress traffic will be restricted to the infra subnet. Is that correct? |
Yes @pperiyasamy , that is correct. The plan is to document this limitation along with a suggestion of using NAT gateway instead of a general purpose public load balancer. I am also gonna notify support team members so that everyone is well aware. |
Thanks @arghosh93 , If this is agreed by everyone, i'm fine with it. one comment on the sync function. |
|
/retest-required |
6dca85c to
57b79e5
Compare
57b79e5 to
02854f1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/cloudprovider/azure.go (1)
310-381: Well-structured cleanup implementation.The
SyncLBBackendmethod correctly implements one-time upgrade cleanup:✅ Gracefully handles deleted nodes (logs warning, continues processing)
✅ Uses anonymous function withdeferto avoid lock accumulation across loop iterations
✅ Idempotent updates (only modifies whenLoadBalancerBackendAddressPoolsis present)
✅ Proper per-node locking prevents concurrent modifications
✅ Comprehensive error wrapping for Azure API callsOptional: Consider adding observability for successful cleanups.
For better visibility during upgrades, consider logging when backend pools are actually removed:
🔎 View suggested enhancement
if loadBalancerBackendPoolModified { networkInterface.Properties.IPConfigurations = ipConfigurations + klog.Infof("Removing Egress IP %s from load balancer backend pool for node %s", ipc, node.Name) poller, err := a.createOrUpdate(networkInterface)
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (9)
cmd/cloud-network-config-controller/main.go(1 hunks)pkg/cloudprovider/azure.go(8 hunks)pkg/cloudprovider/cloudprovider.go(1 hunks)pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller.go(4 hunks)pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller_test.go(1 hunks)pkg/controller/configmap/configmap_controller.go(1 hunks)pkg/controller/controller.go(2 hunks)pkg/controller/node/node_controller.go(2 hunks)pkg/controller/secret/secret_controller.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- pkg/cloudprovider/cloudprovider.go
- pkg/controller/node/node_controller.go
- cmd/cloud-network-config-controller/main.go
- pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller.go
- pkg/controller/secret/secret_controller.go
- pkg/controller/controller.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller_test.gopkg/cloudprovider/azure.gopkg/controller/configmap/configmap_controller.go
🧬 Code graph analysis (2)
pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller_test.go (1)
pkg/cloudprovider/cloudprovider.go (1)
CloudProviderConfig(88-98)
pkg/cloudprovider/azure.go (1)
pkg/cloudprivateipconfig/cloudprivateipconfig.go (1)
NameToIP(28-39)
🔇 Additional comments (6)
pkg/controller/configmap/configmap_controller.go (1)
106-108: LGTM: Appropriate no-op implementation.The no-op
InitialSync()is correct for ConfigMapController, which only watches for configuration changes and triggers shutdown. No startup initialization is needed.pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller_test.go (1)
100-100: LGTM: Correct test setup update.The empty
CloudProviderConfig{}is appropriate for these tests, which use aFakeCloudProviderand don't exercise platform-specific behavior.pkg/cloudprovider/azure.go (4)
16-16: LGTM!The import additions correctly support the new
SyncLBBackendfunctionality. All imported packages are utilized in the implementation.Also applies to: 23-24, 26-27, 29-29, 32-32
121-170: LGTM!The removal of backend pool client initialization is correct and aligns with the PR objective to stop managing Egress IPs in Azure load balancer backend pools.
172-230: LGTM!The changes correctly implement the new behavior:
- Enhanced error messages with proper context wrapping improve debuggability
- Simplified IP configuration omits backend pool assignment as intended
- The warning about limited outbound connectivity (lines 223-224) is important for operators and aligns with the documented limitations in the PR
232-273: LGTM!The error message improvements provide better context for debugging Azure API interactions. The absence of backend pool cleanup logic is correct since Egress IPs are no longer added to backend pools.
|
/retest-required |
kyrtapz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last iteration looks great!
I have just a few more comments.
One thing that came to mind is that we will have to document that it won't be possible for the user to add the CPIC addresses to the backend pools on their own because cncc would remove it on restart.
8116e47 to
b32c33e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (9)
cmd/cloud-network-config-controller/main.gopkg/cloudprovider/azure.gopkg/cloudprovider/cloudprovider.gopkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller.gopkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller_test.gopkg/controller/configmap/configmap_controller.gopkg/controller/controller.gopkg/controller/node/node_controller.gopkg/controller/secret/secret_controller.go
✅ Files skipped from review due to trivial changes (1)
- pkg/cloudprovider/cloudprovider.go
🚧 Files skipped from review as they are similar to previous changes (4)
- pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller_test.go
- pkg/controller/secret/secret_controller.go
- pkg/controller/node/node_controller.go
- cmd/cloud-network-config-controller/main.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
pkg/controller/configmap/configmap_controller.gopkg/controller/controller.gopkg/cloudprovider/azure.gopkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller.go
🧬 Code graph analysis (1)
pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller.go (3)
pkg/cloudprovider/cloudprovider.go (1)
CloudProviderConfig(88-98)pkg/controller/controller.go (1)
CloudNetworkConfigController(48-68)pkg/cloudprovider/azure.go (2)
PlatformTypeAzure(38-38)Azure(49-60)
🔇 Additional comments (9)
pkg/controller/configmap/configmap_controller.go (1)
106-108: LGTM!The no-op
InitialSync()implementation correctly satisfies the interface requirement. ConfigMapController doesn't need startup cleanup, so returningnilis appropriate.pkg/cloudprovider/azure.go (3)
22-34: LGTM!The new imports are required for the
SyncLBBackendfunction andisCloudPrivateIPConfigAssignedhelper.
225-226: LGTM!The warning log clearly communicates the operational impact of not adding egress IPs to the LB backend pool. This aligns with the PR objective and helps operators understand the connectivity constraints.
683-690: LGTM!The helper correctly identifies assigned CloudPrivateIPConfigs by checking the
Assignedcondition status, which is consistent with the condition handling elsewhere in the codebase.pkg/controller/controller.go (2)
42-45: LGTM!The
InitialSync()interface method is well-documented and correctly positioned in the startup sequence. The comment clearly explains the intended use case.
104-108: LGTM!Calling
InitialSync()after cache sync but before worker startup is the correct placement for one-time cleanup operations that need full cluster state visibility. Failing startup on error is appropriate since incomplete cleanup could leave the cluster in an inconsistent state.pkg/controller/cloudprivateipconfig/cloudprivateipconfig_controller.go (3)
59-61: LGTM!The optional
initialSyncHookfield provides a clean extension point for platform-specific cleanup without coupling the controller to cloud provider implementations.
80-86: LGTM!The Azure-specific hook wiring is appropriately guarded by the platform type check. The type assertion is safe since
PlatformTypeAzureguarantees the client is an*Azureinstance. Capturing the listers in the closure ensures the hook has access to current cluster state duringInitialSync().
131-140: LGTM!The
InitialSync()implementation correctly delegates to the optional hook and provides appropriate logging. The nil check ensures non-Azure platforms safely skip cleanup.
The consensus is to not add egress IP to public load balancer backend pool regardless of the presence of an OutBoundRule. During upgrade this PR let cobtroller removes any egress IP added to public load balancer backend pool previously. Signed-off-by: Arnab Ghosh <[email protected]>
b32c33e to
ba4e72b
Compare
|
/retest-required |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: arghosh93, arkadeepsen, kyrtapz, pperiyasamy The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest-required |
|
/hold cancel |
|
/retest-required |
4 similar comments
|
/retest-required |
|
/retest-required |
|
/retest-required |
|
/retest-required |
|
/retest-requied |
|
/retest-required |
|
/verified by pre-merge testing |
|
@yingwang-0320: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest-required |
2 similar comments
|
/retest-required |
|
/retest-required |
|
@arghosh93: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This PR is to stop adding Egress IP to public load balancer
backend pool regardless of presence of an OutBoundRule in any
Azure cluster.
This change comes with a consequence of no outbound connectivity
except to the infrastructure subnet even if there is no OutBoundRule.
However this is required to tackle following situation:
this PR also let cobtroller remove any egress IP
added to public load balancer backend pool previously.