CNS Change for Subnet Overlay Expansion Job #4074
Pull Request Overview
This PR implements a mechanism to handle overlay subnet expansion in CNS by detecting primary IP address changes and forcing a CNS daemon restart. When a PrimaryCA mismatch is detected, the code now deletes the CNS state file and triggers a panic to restart the daemon, allowing it to pull fresh configuration from the updated NNC.
- Changes the behavior from returning an error code to panicking and deleting the state file
- Adds comprehensive test coverage for the new panic-based recovery mechanism
- Updates existing test to verify the new panic behavior instead of error return
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| cns/restserver/internalapi.go | Implements the core logic to delete state file and panic on PrimaryCA mismatch |
| cns/restserver/internalapi_test.go | Adds tests and updates existing test to verify panic behavior and state file deletion |
Deleting the state file is a nonstarter.
Just got notified that this PR has been replaced by the PR here: https://github.com/Azure/azure-container-networking/pull/4083/files
27f3eaa to d452d9f
/azp run Azure Container Networking PR
Azure Pipelines successfully started running 1 pipeline(s).
Let's verify full CIDR containment. Rest looks good.
* added logic to fix cns bug for overlay subnet expansion
* reverted a line change
* fixed spelling
* added unit test
* fixing go lint
* expanded on a comment
* updated logic
* updated test
* updated validate superset logic
* updated to return bool instead of error for checking cidr superset
* updated logic to check for containment

---------

Co-authored-by: Riya <[email protected]>
Co-authored-by: Riya <[email protected]>
Reason for Change:
Without this CNS change, expanding an overlay subnet causes the CNS daemonset to crash, resulting in a PrimaryCANotSame error for all clusters.
To enable overlay subnet expansion:
Simply restarting CNS does not regenerate the state file with updated information. Instead, this PR updates the state file with the new subnet address (the same one carried in the updated NNC). This prevents the PrimaryCANotSame error, and it applies only to overlay clusters where the pod CIDRs mismatch to start with.
Issue Fixed:
Requirements:
Notes: