
Default ConfidentialInstanceType to SEV-SNP on supported machines when confidentialCompute is Enabled #1427

Open
bgartzi opened this issue Feb 14, 2025 · 4 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@bgartzi
Contributor

bgartzi commented Feb 14, 2025

/kind feature

Describe the solution you'd like
A patch that makes it possible to configure AMD SEV-SNP confidential computing instances was merged (see #1410).
This was implemented by extending confidentialCompute, a GCPMachineSpec attribute, to accept more values than just the previous Enabled/Disabled.
A similar effort is #1426, which aims at enabling TDX (Intel's equivalent technology) on c3 machines.

When only Enabled/Disabled was allowed, only AMD SEV machines could be configured: Enabled set compute.Instance's ConfidentialInstanceConfig.EnableConfidentialCompute to true and relied on the Google API's internal defaults to provision a SEV machine. Those defaults do not seem to be documented [1], though.
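
For illustration, here is a minimal Go sketch of roughly what that behaviour amounts to. The ConfidentialInstanceConfig fields come from compute/v1; the confidentialConfigFor helper is purely illustrative and not the provider's actual code:

```go
package main

import (
	"fmt"

	compute "google.golang.org/api/compute/v1"
)

// confidentialConfigFor is an illustrative helper: with only Enabled/Disabled
// allowed, Enabled simply turns confidential compute on and leaves the choice
// of confidential instance type to the GCE API.
func confidentialConfigFor(confidentialCompute string) *compute.ConfidentialInstanceConfig {
	if confidentialCompute != "Enabled" {
		return nil
	}
	// ConfidentialInstanceType is left empty, so the API falls back to its own
	// (undocumented) default, which currently provisions SEV on n2d machines.
	return &compute.ConfidentialInstanceConfig{
		EnableConfidentialCompute: true,
	}
}

func main() {
	fmt.Printf("%+v\n", confidentialConfigFor("Enabled"))
}
```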

AMD SEV-SNP extends AMD SEV by closing some of the attack channels that remain open with plain SEV [2]. That is, AMD SEV-SNP is considered a safer alternative to AMD SEV.
However, the confidential instance type currently relies on the Google API defaults, which select SEV (I guess for backward-compatibility reasons).
Wouldn't it be a safer choice (from the user's perspective) to default to AMD SEV-SNP when confidentialCompute: Enabled is configured for a machine type (n2d) that supports both SEV-SNP and SEV?

However, this could have backward-compatibility implications from my point of view. At the moment (as before #1410), setting confidentialCompute: Enabled for an n2d machine results in a SEV confidential instance. If the default for n2d with Enabled changed, the same configuration would result in a different, although safer on paper, SEV-SNP confidential instance.

In case it's needed, I could share a draft patch if that would explain the intentions better.
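
To make the intention concrete, here is a minimal sketch of what such a default might look like, assuming a hypothetical machineTypeSupportsSEVSNP helper. The ConfidentialInstanceType field and the SEV_SNP value exist in compute/v1; everything else is illustrative and not the actual draft patch:

```go
package main

import (
	"fmt"
	"strings"

	compute "google.golang.org/api/compute/v1"
)

// machineTypeSupportsSEVSNP is a hypothetical placeholder; a real
// implementation would consult the machine family's confidential computing
// capabilities rather than a simple prefix check.
func machineTypeSupportsSEVSNP(machineType string) bool {
	return strings.HasPrefix(machineType, "n2d-")
}

// confidentialConfigFor sketches the proposed defaulting: when the user asks
// for Enabled and the machine type supports SEV-SNP, request it explicitly
// instead of relying on the API's implicit SEV default.
func confidentialConfigFor(confidentialCompute, machineType string) *compute.ConfidentialInstanceConfig {
	if confidentialCompute != "Enabled" {
		return nil
	}
	cfg := &compute.ConfidentialInstanceConfig{EnableConfidentialCompute: true}
	if machineTypeSupportsSEVSNP(machineType) {
		cfg.ConfidentialInstanceType = "SEV_SNP"
	}
	return cfg
}

func main() {
	fmt.Printf("%+v\n", confidentialConfigFor("Enabled", "n2d-standard-4"))
}
```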

  1. https://pkg.go.dev/google.golang.org/[email protected]/compute/v1#ConfidentialInstanceConfig
  2. https://www.amd.com/content/dam/amd/en/documents/epyc-business-docs/white-papers/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 14, 2025
@bgartzi
Contributor Author

bgartzi commented Feb 14, 2025

cc @JoelSpeed, @damdo, @uril, @richardcase, who helped me on #1410

@JoelSpeed

This sounds like a breaking change to me; I think we should probably stick with the behaviour as is.

If particular users would prefer, they could implement policies (like the upcoming MutatingAdmissionPolicy, or the existing ValidatingAdmissionPolicy) to either mutate the value or prevent it from being Enabled, forcing users to pick the more secure value.

@bgartzi
Contributor Author

bgartzi commented Feb 18, 2025

Thanks for confirming @JoelSpeed!

So if I understood correctly, in kubernetes-for-dummies (me) terms:

  • ValidatingAdmissionPolicy would simply force some users to, for example, avoid confidentialCompute: Enabled.
  • MutatingAdmissionPolicy would let users introduce tricks such as mutating Enabled into AmdEncryptedVirtualizationNestedPaging where possible. However, that will only be available in Kubernetes 1.32, so it wouldn't be an option as of now, right?

And finally, just to understand the whole picture a bit better: this would be left for users to configure, so cluster admins would have to set up these policies themselves, right? I mean, this is not a solution that would be built into a downstream Kubernetes distro, but rather something left for users to decide for themselves whether or not to configure.

@JoelSpeed

Yes, I think you've understood what I was suggesting.

In theory, you could build the policy in at several levels:

  • We could include this in the code here, but it would be a breaking change
  • Distros could include VAPs/MAPs to enforce it at their level, if they so desired
  • Cluster Admins could add VAPs/MAPs to enforce it, if their platform does not

Leaving the behaviour as is, and perhaps documenting it where appropriate, is the most flexible option AFAICT.
