NPEP-311: Best Practices for Multi-Cluster NetworkPolicy in a Flat Network #313
base: main
Changes from all commits
82314b9
2329a41
a7220a8
a88f72c
90f0528
c89c9fb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,137 @@ | ||
| # NPEP-311: Best Practices for Multi-Cluster NetworkPolicy in a Flat Network | ||
|
|
||
| * Issue: | ||
| [#311](https://github.com/kubernetes-sigs/network-policy-api/issues/311) | ||
| * Status: Informational | ||
|
|
||
| ## TLDR | ||
|
|
||
| This NPEP documents a recommended set of practices for applying standard | ||
| NetworkPolicy resources in a multi-cluster, flat network environment. It | ||
| proposes a conventional labeling scheme, aligned with SIG-Multicluster, to | ||
| enable consistent and predictable cross-cluster policy enforcement without | ||
| changes to the API. | ||
|
|
||
| ## Goals | ||
|
|
||
| * To establish and document a common, intuitive operational model for | ||
| multi-cluster network policy. | ||
|
|
||
| * To provide clear, reusable patterns for administrators to secure applications | ||
| that span multiple clusters. | ||
|
|
||
| * To align these practices with the cluster identification conventions | ||
| established by SIG-Multicluster in | ||
| [KEP-2149](https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/2149-clusterid). | ||
|
|
||
| ## Non-Goals | ||
aojea marked this conversation as resolved.
|
||
|
|
||
| * This proposal does not support multi-cluster setups where there are overlapping IPs between clusters (or more generally, any multi-cluster setup where traffic isn't freely routed between clusters). | ||
|
|
||
|
|
||
| * This proposal does not introduce any changes to the Kubernetes Network Policy | ||
| API specification or any extension based on labels and/or annotations. | ||
|
|
||
| * This proposal does not mandate any specific CNI implementation or | ||
| multi-cluster architecture beyond the prerequisite of a flat network. | ||
|
|
||
| ## Introduction | ||
|
|
||
| In a multi-cluster architecture with a flat network, every pod IP is unique and | ||
| directly routable from any other pod, regardless of its origin cluster. This | ||
| topology allows for a simplified security model where NetworkPolicy selectors | ||
| can be evaluated against a global inventory of all pods and namespaces within a | ||
| defined ClusterSet. | ||
|
|
||
| The core principles of this model are: | ||
|
|
||
| * Policies are Local: A NetworkPolicy resource is applied only to the cluster | ||
| where it is created. Policies are not replicated, which contains the impact of | ||
| changes and allows for per-cluster rollout strategies. | ||
|
|
||
| * Selectors are Global: The policy engine within each cluster evaluates policy | ||
| rules against all known pods and namespaces in the entire ClusterSet. | ||
|
|
||
| * Administrator-Defined Identity: The ability to differentiate between clusters | ||
| is important for cross-cluster communication. This model places the | ||
| responsibility on the cluster administrator to implement a consistent labeling | ||
| strategy that can be used for identity and policy selection. | ||
|
|
||
| This document formalizes these principles as a set of best practices for the | ||
| community. | ||
|
|
||
| ## User-Stories/Use-Cases | ||
|
|
||
| ### Story 1: Securing a "Stretched" Application | ||
|
|
||
| As a platform operator running an application across multiple clusters, | ||
|
|
||
| I want to write a single policy to allow traffic from all frontend pods to my | ||
| database pods, | ||
|
|
||
| so that I don't have to manage separate, IP-based policies for frontend | ||
| instances that are spread across different clusters. | ||
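As a rough illustration of this story, the sketch below assumes a policy engine that follows the "Selectors are Global" principle from the Introduction, so a peer selector that carries no cluster label matches pods in every cluster of the ClusterSet. The namespace names (`database`, `frontend`), the `app` labels, and the policy name are hypothetical and not part of the proposal.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-database   # hypothetical name
  namespace: database                # hypothetical namespace
spec:
  # Applies to the database pods in the cluster where this policy is created.
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Ingress
  ingress:
  - from:
    # No cluster label here: under the assumed "global selectors" behavior,
    # this matches frontend pods in the frontend namespace of every cluster.
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: frontend
      podSelector:
        matchLabels:
          app: frontend
```

Since policies are local in this model, the same manifest would presumably have to be applied in each cluster that hosts database pods.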
|
|
||
| ### Story 2: Enforcing Cross-Cluster Security Boundaries | ||
|
|
||
| As a security administrator, | ||
|
|
||
| I want to allow a database application in a production cluster to receive | ||
| traffic only from a specific billing application in a separate PCI-compliant | ||
| cluster, | ||
|
|
||
| so that I can enforce strict, auditable cross-cluster communication paths. | ||
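A minimal sketch of how this story might be expressed, assuming the namespaces of the PCI cluster carry the `cluster.clusterset.k8s.io` label recommended in this document and that the policy engine in the production cluster can see them. The cluster ID `pci-cluster`, the namespace names, and the `app` labels are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-billing-from-pci   # hypothetical name
  namespace: database            # hypothetical namespace in the production cluster
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Ingress
  ingress:
  - from:
    # namespaceSelector and podSelector in the same peer are ANDed:
    # only billing pods in the billing namespace of pci-cluster match.
    - namespaceSelector:
        matchLabels:
          cluster.clusterset.k8s.io: pci-cluster
          kubernetes.io/metadata.name: billing
      podSelector:
        matchLabels:
          app: billing
```

This only governs ingress on the production side; any egress restrictions in the PCI cluster would need their own policy there.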
|
|
||
| ### Story 3: Maintaining Local Policy Scope | ||
|
|
||
| As an application developer, | ||
|
|
||
| I want to apply a NetworkPolicy to my application and be confident that it only | ||
| affects traffic within my local cluster, | ||
|
|
||
| so that I don't accidentally expose my service to other clusters or break | ||
| connectivity by applying a policy that is too broad. | ||
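One possible way to keep a policy's peers scoped to the local cluster under this model is to name the local cluster explicitly in the selector. This is only a sketch: `cluster-one` is an assumed ID for the cluster where the policy is created, and the namespace and policy names are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-local-cluster-only   # hypothetical name
  namespace: my-app                # hypothetical namespace
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Matches only namespaces labeled as belonging to the local cluster
    # (cluster-one is an assumption; substitute the real cluster ID).
    - namespaceSelector:
        matchLabels:
          cluster.clusterset.k8s.io: cluster-one
```

Note that this requires the author to know the local cluster's ID; the review discussion on this PR raises exactly that limitation.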
|
|
||
| ## API | ||
|
|
||
| No changes are proposed to the NetworkPolicy v1 API, nor are any extensions | ||
| based on labels and/or annotations. This document describes a set of practices | ||
| that leverage the existing API. | ||
|
|
||
| The central recommendation is to adopt a consistent labeling scheme for cluster | ||
| identification. To effectively manage policies in a multi-cluster environment, | ||
| it is highly recommended to align with the conventions outlined in [KEP-2149: | ||
| ClusterId](https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/2149-clusterid). | ||
|
|
||
| ### Recommendation | ||
|
|
||
| Each namespace within a cluster should be labeled by the cluster administrator with a key that identifies its | ||
| parent cluster. The recommended label is: | ||
|
"should be labeled" by who? (In general, you should never use should/must in the passive voice in a specification. If something must be done, be clear about who must do it.)

added "by the cluster administrator"

What’s meant by “parent cluster”?

the cluster that contains the namespace

So presumably it will have the same value for all namespaces in a given cluster, won’t it? What’s the purpose of the label then? Couldn’t multicluster network policy support require KEP-2149 instead?

I opened this PR for exploration, and @skitt you are far more expert than anybody here, so if you have ideas I'm all ears. I will take a look at https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/2149-clusterid

The idea is that the NetworkPolicies can refer to namespaces in other clusters via this label.
So you can say "allow from namespace foo (in this cluster)", or "allow from namespace foo in cluster bar" or "allow from namespace foo in all clusters", etc.

@aojea I understand perfectly that this is about having a discussion 😉

Thinking about this more, I’m not convinced namespace sameness should limit what network policies can do: the way I see it, network policies apply on top of networking defaults, and exist to provide more control over those. If the multicluster default (in the SIG-MC worldview anyway) is that namespaces are the same across a clusterset, that doesn’t prevent network policies from clamping down on what’s possible there. In the same way that network policies nowadays allow administrators to say “I only want traffic between this and this”, it makes sense to me to consider that policies extended to multiple clusters should allow administrators to say “I only want traffic between this cluster and this other cluster”.

That being said, @danwinship I don’t understand your comment about using the label to refer to namespaces in other clusters. Does that mean that the idea here is to create namespaces in a local cluster to represent namespaces in another cluster, with the label indicating that the namespaces are for remote clusters and not the local one? That would indeed be inconsistent with other multicluster APIs where multicluster connectivity (or at least discovery) depends on having the same namespace in multiple clusters…

Put another way, given the practices described here, what would the CRs look like to create a network policy allowing a pod in a local cluster to send traffic to all pods in a given namespace in a remote cluster?

No, nothing is "representing" anything. This is just about making it possible to refer to namespaces in other clusters. If you have a NetworkPolicy in

    kind: NetworkPolicy
    apiVersion: networking.k8s.io/v1
    metadata:
      name: sample-multi-cluster-policy
      namespace: alpha
    spec:
      podSelector: {}
      ingress:
      - from:
        # "from" is a list of peers, so the selector is a list item
        - namespaceSelector:
            matchLabels:
              cluster.clusterset.k8s.io: cluster-two
              kubernetes.io/metadata.name: beta

That says that all pods in the namespace

So keep in mind that NetworkPolicy is "single-sided": you can say "Pod X is allowed to send traffic to Pod Y", but that doesn't automatically imply that Pod Y will accept packets from Pod X. It just means that if Pod X was "deny-all-egress" before, then it becomes "deny-all-egress-except-egress-to-Pod-Y" now. So the policy you're asking about is basically just what I gave above, except with

In a single cluster, with AdminNetworkPolicy, you can write "double-sided" policies that actually say "Pod X can send traffic to Pod Y", which implies both that egress from Pod X to Pod Y and ingress to Pod Y from Pod X are allowed. However, in a multi-cluster context, AdminNetworkPolicy would have the same problem plain NP does with cross-cluster policies. Each AdminNetworkPolicy would only have power over traffic in its own cluster, so an ANP in one cluster would not be able to override policies in a remote cluster.

Ah, thanks, I think I get it: this is just a way of reusing the existing label selectors to match on clusters in addition to everything else. So apart from requiring a “policy engine” somewhere that’s aware of other clusters’ namespaces, it doesn’t require anything, it’s just an agreement on label use so that policies can be written consistently across clusters. Is that correct?

On the one hand I find the approach a bit icky because it requires adding information that’s already known to each namespace; on the other hand I like that it requires an action from the administrator before any given namespace can participate in network-policy-controlled traffic…

yeah, the question is whether creating these implicit semantics simplifies something, or whether people will just prefer to operate with their own labeling, considering the cluster as stretched ... it would be nice to have some end-user feedback |
||
|
|
||
| `cluster.clusterset.k8s.io: <cluster-id>` | ||
|
|
||
| The <cluster-id> value should correspond to the unique name of the cluster | ||
| within the ClusterSet (e.g., cluster-a, us-west-2). This practice enables policy | ||
| authors to create selectors that precisely target peers from specific clusters. | ||
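For illustration, a namespace carrying the recommended label might look like the sketch below. The namespace name `beta` and the cluster ID `cluster-two` are borrowed from the example in the review discussion and are placeholders only; the `kubernetes.io/metadata.name` label is set automatically by Kubernetes and does not need to be added by hand.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: beta
  labels:
    # Applied by the cluster administrator; the value is assumed to be
    # this cluster's unique name within the ClusterSet.
    cluster.clusterset.k8s.io: cluster-two
```

Every namespace in the same cluster would carry the same value, which is what lets a `namespaceSelector` distinguish namespaces of the same name in different clusters.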
|
Comment on lines +108 to +115

FYI we also have something somewhat similar in Cilium ClusterMesh: https://docs.cilium.io/en/latest/network/clustermesh/policy/ (CiliumNetworkPolicy is described there, but the same behavior affects regular NetworkPolicy too). The main difference is that we operate at the pod level, not the namespace level (technically everything is translated into "pod" level things in the end in Cilium though...). It boils down to the following points:

Also, on a more advanced but semi-related topic than the cluster id alone: we are wondering how to add labels at the cluster level. Here is the relevant issue on the Cilium side: cilium/cilium#40413 (there are mostly implementation considerations in this issue though). To do that we are most likely going to rely on About API: https://multicluster.sigs.k8s.io/concepts/about-api/. But in our case we are a bit worried that each cluster would self-declare its properties/labels, while the cluster name in our case is much more strictly enforced on each cluster (the local cluster knows its name and the name of each remote cluster too, and for anything imported from remote clusters we enforce that it has the correct cluster name)... But yeah, probably this is an entirely different "problem"? |
||
|
|
||
| ## Conformance Details | ||
|
|
||
| Not applicable, as this is an informational proposal that does not introduce new | ||
| API features. | ||
|
|
||
| ## Alternatives | ||
|
|
||
| The primary alternative is the absence of a documented best practice. This leads | ||
|
I would say that the primary alternative would be adding

In particular, reading use cases like "I want to apply a NetworkPolicy to my application and be confident that it only affects traffic within my local cluster", I keep thinking there needs to be an easy way to write a policy that refers to "this cluster" without having to identify the cluster by name, and an obvious way to do that would be to just extend the existing two-tiered

NPs that do not include a

OTOH, having not paid much attention to SIG MC, I'm not sure what would be consistent with other multi-cluster APIs.

In SIG-MC, namespaces aren’t constrained to single clusters, so clusters and namespaces should be considered orthogonal: a clusterset admin familiar with MCS (multicluster services) would expect a multicluster-aware network policy controller to allow traffic across clusters following namespace constraints. (Yes, network administrators tend not to like this.)

End users ask about cross-cluster behaviour regularly, so it would be good to have some clarity; relying on a

This also avoids surprising administrators: existing network policies only apply to the local cluster (and if there’s a deny-all, no traffic is allowed across the clusterset).

What does that mean? A namespace can span multiple clusters? The namespace

These seem contrary? Network traffic should normally ignore cluster boundaries, but NetworkPolicies shouldn't?

That's automatic, given the semantics of "isolation" in NP.

Conceptually yes, namespaces span multiple clusters; more precisely, namespaces with the same name in multiple clusters are considered as effectively equivalent. The main consequence for SIG-MC is that a service made available across clusters in one namespace is always the same, regardless of the cluster it is hosted by. This means that, from a SIG-MC perspective, granting access at the network level to a service (or rather, its endpoints) in a given cluster is the same as granting access at the network level to the same service in all clusters.

The apparent contradiction is because my first paragraph tried to explain the SIG-MC worldview, while the second paragraph was about what would be acceptable (in my mind) for multicluster network policies. A purely SIG-MC interpretation of network policies, with what we call “namespace sameness”, would imply that namespace-based network policies should apply across a clusterset; however I also think that that’s unacceptable from a “principle of least surprise” perspective, and that existing network policies should be cluster-local.

I think this approach, based on @fasaxc's feedback, builds on namespace sameness.

This is where the point Dan is making comes up: do we want to embed the cluster concept in the API, or do we follow the "namespace sameness" concept?

If we consider the cluster concept, it looks to me like we definitely need a new CRD; we cannot make core APIs cluster-aware without extending the scope to all objects (we've seen in multi-network how the feature easily becomes viral and touches everything). I personally do not have time to pursue this, and after working on some related projects I do not think it is feasible or worth the effort.

If we treat this like the existing Multi Cluster Services, based on the "namespace sameness" concept, I think that is valid and backed by the existing Calico implementation and the Multi Cluster Services KEP. If we follow this path, does this raise any red flags, or are we OK to iterate? If so, do we document it in SIG MC or in this subproject of SIG Network? If we document it, is it a problem if kube-network-policies implements it? |
||
| to fragmented, implementation-specific approaches to multi-cluster policy, | ||
| reducing portability and creating a confusing experience for users. By | ||
| documenting a common pattern, we provide a consistent model that both users and | ||
| CNI implementers can reference. | ||
|
|
||
| ## References | ||
|
|
||
| * KEP-2149: ClusterId for ClusterSet identification: | ||
| https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/2149-clusterid | ||
|
|
||
| * Calico Multi-Cluster Flat model: Calico is one CNI that supports a flat | ||
| network model where policies can be applied across clusters, as described in | ||
| https://kubernetes.slack.com/archives/C01CWSHQWQJ/p1756283784627849 | ||
I don't feel like this is a "best practice".
If we want to document "some pod network implementations are doing something like this", then maybe we can document that, but if we think this use case is important, then I think we should support it with proper API.

I couldn't find the right name.
That is what I'm trying to figure out: whether people just need this stretched network policy mode, whether it needs something else, or whether it is a very custom thing ...