DRAAdminAccess: add example #112
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ritazh

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
Signed-off-by: Rita Zhang <[email protected]>

Force-pushed from f6cbf90 to b399cba (Compare)
```go
info := &HostHardwareInfo{}

// Get CPU information
if cpuInfo, err := readFile("/proc/cpuinfo"); err == nil {
```
Since all the rest of the data coming out of the driver is entirely OS- and hardware-agnostic AFAIK, could we follow that here? Even if admin access is implemented here as simply as adding only DRA_ADMIN_ACCESS=true, I think that's enough to demonstrate the end-to-end flow. And it doesn't assume any details about the underlying OS or hardware, which I think is an important part of making the example driver usable in as many places as possible.
e.g. @lauralorenz just brought up the question of whether the example driver works on Windows. Without this change the answer is "maybe?" but reading /proc would make that a more conclusive "no." I know the driver wouldn't fail and we'd only get the "unavailable" value in that case, but I'd rather avoid any evidence of assumptions like that if possible.
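For illustration, a minimal sketch of what that could look like. This is not the PR's code: it assumes the v1beta1+ resource.k8s.io API (where the allocation result carries an AdminAccess flag), and adminAccessEnvs is a hypothetical helper name.

```go
import resourceapi "k8s.io/api/resource/v1beta1"

// adminAccessEnvs returns extra environment variables for an allocated claim.
// Sketch only: assumes DeviceRequestAllocationResult has an AdminAccess *bool
// field (present in v1beta1+ behind the DRAAdminAccess feature gate).
func adminAccessEnvs(claim *resourceapi.ResourceClaim) []string {
	if claim.Status.Allocation == nil {
		return nil
	}
	for _, result := range claim.Status.Allocation.Devices.Results {
		if result.AdminAccess != nil && *result.AdminAccess {
			// OS- and hardware-agnostic: no /proc reads, just a marker
			// the workload can inspect to confirm the end-to-end flow.
			return []string{"DRA_ADMIN_ACCESS=true"}
		}
	}
	return nil
}
```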
```go
func (d *driver) PrepareResourceClaims(ctx context.Context, claims []*resourceapi.ResourceClaim) (map[types.UID]kubeletplugin.PrepareResult, error) {
	klog.Infof("PrepareResourceClaims is called: number of claims: %d", len(claims))
	for i, claim := range claims {
		klog.Infof("Claim %d: UID=%s, Namespace=%s, Name=%s", i, claim.UID, claim.Namespace, claim.Name)
```
Could we add this log to prepareResourceClaim so we don't need this extra loop?
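Something like this sketch, assuming prepareResourceClaim keeps the driver's usual per-claim shape (the signature and body here are illustrative, not the PR's actual code):

```go
// Sketch: move the per-claim log into the per-claim helper so the extra
// loop in PrepareResourceClaims can go away.
func (d *driver) prepareResourceClaim(ctx context.Context, claim *resourceapi.ResourceClaim) kubeletplugin.PrepareResult {
	klog.Infof("Preparing claim: UID=%s, Namespace=%s, Name=%s", claim.UID, claim.Namespace, claim.Name)
	// ... existing device preparation logic ...
	return kubeletplugin.PrepareResult{}
}
```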
```go
checkpoint := newCheckpoint()
if err := s.checkpointManager.GetCheckpoint(DriverPluginCheckpointFile, checkpoint); err != nil {
	return nil, fmt.Errorf("unable to sync from checkpoint: %v", err)
	// Create a new checkpoint if the old one is corrupted or in a different format
```
Is this necessary here because AdminAccess is now part of the checkpoint?
```go
	return nil, fmt.Errorf("unable to sync from checkpoint: %v", err)
	// Create a new checkpoint if the old one is corrupted or in a different format
	checkpoint = newCheckpoint()
	if err := s.checkpointManager.CreateCheckpoint(DriverPluginCheckpointFile, checkpoint); err != nil {
```
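Reading the two hunks together, the intended flow appears to be: fall back to a fresh checkpoint instead of returning the error. A sketch of that recovery path, assuming the checkpointManager API shown above (the error message in the inner branch is my wording, not the PR's):

```go
checkpoint := newCheckpoint()
if err := s.checkpointManager.GetCheckpoint(DriverPluginCheckpointFile, checkpoint); err != nil {
	// Create a new checkpoint if the old one is corrupted or in a
	// different format (e.g. a pre-AdminAccess schema).
	checkpoint = newCheckpoint()
	if err := s.checkpointManager.CreateCheckpoint(DriverPluginCheckpointFile, checkpoint); err != nil {
		return nil, fmt.Errorf("unable to create checkpoint: %w", err)
	}
}
```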
Could we integrate this with the e2e job?
https://github.com/kubernetes-sigs/dra-example-driver/blob/b399cbaa49b9d52d8266cf604e5f3f873bdd9fdd/.github/workflows/e2e.yaml
That will help make sure once we update for 1.34 that everything here is working properly.
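A hypothetical sketch of what that step could look like; the step name, manifest path, and namespace are assumptions, not taken from the PR:

```yaml
# Exercise the admin-access example in the existing kind-based e2e job.
- name: Test admin access example
  run: |
    kubectl label namespace gpu-test7 resource.k8s.io/admin-access=true --overwrite
    kubectl apply -f demo/gpu-test7.yaml
    kubectl wait --for=condition=Ready pod --all -n gpu-test7 --timeout=120s
```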
```yaml
metadata:
  name: gpu-test7
  labels:
    resource.k8s.io/admin-access: "true"
```
Would you suggest that new users set both of these labels in anticipation of upgrading to 1.34? If so, I think we should do that here too.
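Something like this sketch, under the assumption that the 1.34 (v1 API) label name is resource.kubernetes.io/admin-access; please double-check that name against the 1.34 docs:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: gpu-test7
  labels:
    resource.k8s.io/admin-access: "true"         # recognized by 1.33 (v1beta2)
    resource.kubernetes.io/admin-access: "true"  # assumed 1.34 (v1) label name
```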
| echo "" | ||
| echo "=== Container will sleep for 5 minutes to allow inspection ===" | ||
| trap 'exit 0' TERM | ||
| sleep 300 & wait |
Is there a reason these only sleep for 5 minutes? The other examples sleep for 9999 seconds, which I slightly prefer so I don't feel like I have to race against the clock to look at this right away.
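That is, matching the other examples with something like this sketch:

```sh
# Long sleep for leisurely inspection, but still exits promptly on TERM.
trap 'exit 0' TERM
sleep 9999 & wait
```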
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Fixes: #97
Adds resource.k8s.io/v1beta2=true to the kind config.
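For reference, a minimal sketch of such a kind config; the field names follow kind's v1alpha4 Cluster schema, and the DRAAdminAccess feature gate shown here is an assumption based on the feature this PR exercises:

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  DRAAdminAccess: true
runtimeConfig:
  "resource.k8s.io/v1beta2": "true"
```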