OCPBUGS-42303: Networking: reset ovn-remote config and allow ovnkube controller to s… #5123
@martinkennelly: GitHub didn't allow me to request PR reviews from the following users: martinkennelly. Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Force-pushed from 773c168 to dd6e137
@martinkennelly: This pull request references Jira Issue OCPBUGS-42303, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@martinkennelly: This pull request references Jira Issue OCPBUGS-42303, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
@martinkennelly: This pull request references Jira Issue OCPBUGS-42303. The bug has been updated to no longer refer to the pull request using the external bug tracker. In response to this:
Closing because there's a cost to pausing and resuming. Trying another approach. |
@martinkennelly: This pull request references Jira Issue OCPBUGS-42303, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Force-pushed from dd6e137 to 83ffb9e
…et it This fixes the issue where ovn-remote is set prior to reboot, so that when the node boots, ovn-controller quickly syncs with a stale SB DB. This PR is part of the EIP GARP issue fix. It's required because when the ovnkube-controller and ovn-controller containers start on boot, there is no guaranteed order in which they start, and we don't want ovn-controller to connect to the SB DB before ovnkube-controller has added the drop flows. Ideally, we would only allow ovn-controller to sync with the SB DB once ovnkube-controller has finished syncing and its changes are available in the SB DB. That may be future work. Signed-off-by: Martin Kennelly <[email protected]>
Force-pushed from 83ffb9e to 4d91920
The great Ben suggested moving this before the nmstate config check, since nmstate may be used to configure br-ex. |
/retest |
This patch only reset |
Yep! We don't want to clear it during an ovnkube-controller container restart, because ovn-controller would do a full sync with the SB DB again and that's very costly. We already cover this scenario with drop flows. |
So pod / container restart is covered by the GARP drop flows added to the external bridge, and this change covers node reboot. We need to add this because the drop flows added to the external bridge when ovnkube-controller shuts down do not persist following a reboot. |
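The split between the two mechanisms described in these comments can be summarised as a small decision function. This is only an illustrative sketch of the reasoning: the `Event` enum and `garp_protection` function are hypothetical names and do not come from the PR.

```python
from enum import Enum


class Event(Enum):
    """Hypothetical labels for the two scenarios discussed above."""
    CONTAINER_RESTART = "ovnkube-controller pod / container restart"
    NODE_REBOOT = "node reboot"


def garp_protection(event: Event) -> str:
    """Return which mechanism covers the given event, per the discussion."""
    if event is Event.CONTAINER_RESTART:
        # The drop flows added to the external bridge at shutdown are still
        # present, and clearing ovn-remote here would force ovn-controller
        # into a costly full sync with the SB DB.
        return "drop flows on the external bridge"
    # Drop flows do not persist across a reboot, so ovn-remote is reset and
    # ovn-controller cannot connect until ovnkube-controller sets it again.
    return "reset ovn-remote at boot"
```

The point of keeping the two cases separate is cost: resetting ovn-remote is only worth it when the cheaper protection (the drop flows) is known to be gone.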
/hold Revision 4d91920 was retested 3 times: holding |
CI is borked. Infra issues and also cannot find nmstate version - unrelated to this PR |
/unhold |
/retest maybe CI is back.. |
/test e2e-gcp-op CI is still borked. |
/unhold Trying again. |
/cherry-pick release-4.20 |
@martinkennelly: once the present PR merges, I will cherry-pick it on top of In response to this:
/retest |
/test e2e-aws-ovn |
@martinkennelly : https://prow.ci.openshift.org/job-history/gs/test-platform-results/pr-logs/directory/pull-ci-openshift-machine-config-operator-main-e2e-aws-ovn this job doesn't look that good, any idea if there are known TRT issues logged? |
There are a few issues with deprovisioning, I think. The one I see most on this PR is rate limiting in AWS, which should be fixed by openshift/installer#9958. I was unable to find a bug for it. I've also seen another deprovisioning issue that didn't mention rate limiting in AWS. I don't have a bug for that one either. |
@martinkennelly: The following tests failed, say
It seems there is a new deprovisioning bug: "Custom IAM endpoint not found, using default endpoint". I'm trying to find a bug for it and I'll look for an override. |
Cannot find a bug. I am engaging with installer team. |
At least the other de-provision bug looks solved :) |
https://redhat-internal.slack.com/archives/C68TNFWA2/p1759150383560899 Looking for override. |
@yuqi-zhang can you please override the |
/override ci/prow/e2e-aws-ovn Failure should be unrelated |
@yuqi-zhang: Overrode contexts on behalf of yuqi-zhang: ci/prow/e2e-aws-ovn In response to this:
Merged commit 4c5e822 into openshift:main
@martinkennelly: Jira Issue Verification Checks: Jira Issue OCPBUGS-42303 Jira Issue OCPBUGS-42303 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓 In response to this:
@martinkennelly: new pull request created: #5317 In response to this:
…et when syncd
Clear ovn-remote on startup to prevent ovn-controller from connecting to a stale OVN southbound database, since ovnkube-controller may not have synced yet. ovn-remote will be set by ovnkube-controller.
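As a rough illustration of what "clear ovn-remote" could look like, the sketch below removes the `ovn-remote` key from the `external_ids` column of the `Open_vSwitch` record using `ovs-vsctl`. The conversation does not show the actual change, so this is an assumption about the mechanism rather than the PR's implementation; the `clear_ovn_remote` helper and its injectable `run` parameter are hypothetical and exist only to make the sketch testable without Open vSwitch installed.

```python
import subprocess
from typing import Callable, List

# Assumption: ovn-remote lives in external_ids of the single Open_vSwitch
# record, which is where ovn-controller reads its SB DB address from.
CLEAR_OVN_REMOTE: List[str] = [
    "ovs-vsctl", "remove", "Open_vSwitch", ".", "external_ids", "ovn-remote",
]


def _run_checked(cmd: List[str]) -> None:
    """Run the command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)


def clear_ovn_remote(run: Callable[[List[str]], None] = _run_checked) -> List[str]:
    """Remove ovn-remote so ovn-controller has no SB DB to connect to.

    The caller is expected to invoke this once at boot; ovnkube-controller
    is then responsible for setting ovn-remote again.
    """
    cmd = list(CLEAR_OVN_REMOTE)
    run(cmd)
    return cmd
```

Passing a fake `run` (for example `calls.append`) lets the command be inspected without touching the host's OVS database.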