@MenD32 MenD32 commented Oct 15, 2025

Currently, OpenShift cannot create H200 machines: they are part of the a3 machineFamily but have no quota defined in the GCP compute library.

Signed-off-by: Amit Mendelevitch <[email protected]>
@openshift-ci openshift-ci bot requested review from nrb and theobarberbany October 15, 2025 12:32
@elmiko elmiko changed the title from "fix: added H200 support" to "NO-JIRA: added H200 support" Oct 15, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) Oct 15, 2025
@openshift-ci-robot
Contributor

@MenD32: This pull request explicitly references no jira issue.

Contributor

@elmiko elmiko left a comment

this makes sense to me, i do wonder if we shouldn't have some warning log message when we are skipping the accelerator validation. if there is no quota, or resource exhaustion, i'm not sure the user will be able to easily detect that.
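
A minimal sketch of the kind of warning discussed here, not the provider's actual code. The names `knownQuotaMetrics` and `checkAcceleratorQuota`, the accelerator type strings, and the metric name are all hypothetical and chosen only for illustration. It assumes validation is skipped when no quota metric is known for an accelerator, and logs that fact so a later quota or capacity failure is easier to trace.

```go
package main

import (
	"fmt"
	"log"
)

// knownQuotaMetrics is a hypothetical table mapping an accelerator type to the
// quota metric used to validate it. The entry below is made up.
var knownQuotaMetrics = map[string]string{
	"hypothetical-gpu-with-metric": "HYPOTHETICAL_GPUS",
}

// checkAcceleratorQuota reports whether quota validation actually ran.
// When no metric is known for the accelerator, it logs a warning and skips
// validation instead of failing the request.
func checkAcceleratorQuota(acceleratorType string, requested, available int64) (bool, error) {
	metric, ok := knownQuotaMetrics[acceleratorType]
	if !ok {
		log.Printf("WARNING: no quota metric known for accelerator %q; skipping quota validation, "+
			"so quota or capacity problems will only surface at instance creation", acceleratorType)
		return false, nil
	}
	if requested > available {
		return true, fmt.Errorf("requested %d of %s but only %d available", requested, metric, available)
	}
	return true, nil
}

func main() {
	types := []string{"hypothetical-gpu-with-metric", "hypothetical-gpu-without-metric"}
	for _, t := range types {
		validated, err := checkAcceleratorQuota(t, 8, 16)
		fmt.Printf("%s: validated=%t err=%v\n", t, validated, err)
	}
}
```

Returning a separate `validated` flag keeps "skipped" distinguishable from "passed", which callers could surface in events or status if a log line alone is not visible enough.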

Contributor

elmiko commented Oct 29, 2025

@MenD32 any thoughts about this question ?

also, cc @damdo ptal

Member

@damdo damdo left a comment

/approve

One nit

Contributor

openshift-ci bot commented Oct 29, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: damdo

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Oct 29, 2025
Author

MenD32 commented Oct 29, 2025

> this makes sense to me, i do wonder if we shouldn't have some warning log message when we are skipping the accelerator validation. if there is no quota, or resource exhaustion, i'm not sure the user will be able to easily detect that.

I think this is somewhat broader than just H200 support, since it also affects other machine types (g2, g4, a4, etc.). So I'm not sure how machine-api-provider should work with those...

Author

MenD32 commented Oct 29, 2025

/retest

Contributor

openshift-ci bot commented Oct 29, 2025

@MenD32: all tests passed!

@MenD32 MenD32 requested review from damdo and elmiko October 30, 2025 10:16
Member

@damdo damdo left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Oct 30, 2025
Member

damdo commented Oct 30, 2025

@MenD32 do you have a Jira card to track this?

Member

damdo commented Oct 30, 2025

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Oct 30, 2025
Author

MenD32 commented Oct 30, 2025

No, where should I open one?

Member

damdo commented Oct 30, 2025

@MenD32 probably on your team's Jira board

Author

MenD32 commented Oct 30, 2025

My team doesn't currently work on RH's Jira, so IDK if there'd be any integration between GitHub and the Jira instance. Nevertheless, I'll create an issue.

@MenD32 MenD32 changed the title from "NO-JIRA: added H200 support" to "JN-2789: added H200 support" Oct 30, 2025
@openshift-ci-robot
Contributor

@MenD32: No Jira issue with key JN-2789 exists in the tracker at https://issues.redhat.com/.
Once a valid jira issue is referenced in the title of this pull request, request a refresh with /jira refresh.

@openshift-ci-robot openshift-ci-robot removed the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) Oct 30, 2025
Author

MenD32 commented Oct 30, 2025

> @MenD32 probably on your team's Jira board

Added a JIRA ticket reference

Member

damdo commented Oct 30, 2025

TY, I am chatting internally to see how best to test/verify this. cc @elmiko
