[no-relnote] Update E2E test suite #1242

Open: 5 commits to be merged into main

ArangoGutierrez
Collaborator

This pull request introduces significant updates to the end-to-end (E2E) testing framework for the NVIDIA Kubernetes Device Plugin. The changes include enhancements to the CI workflow, a migration to the Ginkgo v2 testing framework, restructuring of test-related files, and updates to licensing information. Below is a categorized summary of the most important changes:

Workflow and CI Updates:

  • Added a new HELM_CHART environment variable to the E2E CI workflow (.github/workflows/e2e.yaml) and updated the make command to use a specific Makefile for E2E tests. This improves configurability and aligns the workflow with the new test structure.
  • Replaced the Slack alert notification step with a new Ginkgo log archiving step and updated the Slack notification configuration to use a JSON payload for better integration.

Migration to Ginkgo v2:

  • Updated the Makefile to include a ginkgo target for installing the Ginkgo v2 CLI and a new test-e2e target for running tests with Ginkgo. This replaces the previous custom test logic.
  • Renamed and restructured test files to align with Ginkgo's conventions, such as renaming common/kubernetes.go to cleanup_test.go and removing redundant utility functions. [1] [2]

Documentation Enhancements:

  • Added a comprehensive README.md for the E2E test suite, detailing prerequisites, environment variables, execution flow, and troubleshooting steps. This improves developer onboarding and test maintainability.

Licensing and Copyright:

  • Updated all test-related files to use the SPDX license headers, reflecting the transition to a 2025 copyright year and the Apache 2.0 license. [1] [2] [3]

Codebase Simplification:

  • Removed unused test utilities, such as gpu_job.go and taints.go, consolidating functionality into more focused test files. This reduces redundancy and simplifies the codebase. [1] [2]

These changes collectively modernize the testing framework, improve CI integration, and enhance developer experience.

ArangoGutierrez self-assigned this on Apr 24, 2025
Copilot AI left a comment

Pull Request Overview

This PR modernizes the NVIDIA Kubernetes Device Plugin’s E2E test framework by upgrading to Ginkgo v2, updating CI workflows, and adjusting file structure and licensing information.

  • Migrates tests from legacy framework patterns to Ginkgo v2 constructs
  • Updates CI workflow to use environment variables and a dedicated Makefile target
  • Removes deprecated utility files and refactors test helpers for consistency

Reviewed Changes

Copilot reviewed 405 out of 407 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/e2e/utils.go | Updated license headers and refactored helper functions with adjusted timeouts |
| tests/e2e/infra/aws.yaml | Modified configuration for helm chart deployments and toolkit settings |
| tests/e2e/gpu-feature-discovery_test.go | Migration to Ginkgo v2 and changes to use environment variables and updated client references |
| tests/e2e/e2e_test.go | Overhauled test entry point to use environment variables and new helper functions |
| tests/e2e/device-plugin_test.go | Refactored test structure and naming for Helm release names and client usage |
| tests/e2e/cleanup_test.go | Consolidated cleanup functions for node, resource, and CRD removal |
| tests/e2e/README.md | Added comprehensive documentation for running and configuring the E2E suite |
| .github/workflows/e2e.yaml | Updated workflow steps to use a Makefile in the E2E directory and improved artifact handling |
Files not reviewed (2)
  • tests/e2e/Makefile: Language not supported
  • tests/go.mod: Language not supported
Comments suppressed due to low confidence (3)

tests/e2e/device-plugin_test.go:80

  • [nitpick] Switching to randomSuffix() for generating helmReleaseName improves uniqueness; consider updating any accompanying comments or documentation to reflect this change in naming strategy.
helmReleaseName = "nvdp-e2e-test-" + randomSuffix()

.github/workflows/e2e.yaml:66

  • [nitpick] The workflow now explicitly references a Makefile in the tests/e2e directory; verify that this path remains accurate and that all necessary targets are defined in the referenced Makefile.
make -f tests/e2e/Makefile test-e2e

tests/e2e/utils.go:45

  • The timeout for eventuallyNonControlPlaneNodes has been increased from 10 seconds to 1 minute; please confirm that this longer duration aligns with the overall test responsiveness and cluster performance expectations.
}).WithPolling(1 * time.Second).WithTimeout(1 * time.Minute).WithContext(ctx)

Member

Is the reason for this change just consistency with the DRA driver?

Collaborator Author

Yes, I am now working to bring more consistency across all the repos we manage.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
Copilot AI left a comment

Pull Request Overview

This pull request modernizes the E2E test framework by migrating to Ginkgo v2, reorganizing test files, updating CI workflows, and refreshing licensing and documentation. Key changes include the restructuring and renaming of test files (e.g. renaming common/kubernetes.go to cleanup_test.go), enhancement of CI configurations (e.g. new Makefile target and Slack payload update), and removal of obsolete utilities.

Reviewed Changes

Copilot reviewed 406 out of 408 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/e2e/infra/aws.yaml | Adjusted configuration ordering and added new keys for CDI enablement. |
| tests/e2e/gpu-feature-discovery_test.go | Migrated test syntax and variable usage to align with Ginkgo v2 conventions. |
| tests/e2e/device-plugin_test.go | Updated Helm chart installation and job creation logic. |
| tests/e2e/cleanup_test.go | Refactored cleanup functions; moved functions from package common to e2e. |
| tests/e2e/README.md | Added comprehensive usage and troubleshooting documentation. |
| .github/workflows/e2e.yaml | Updated environment variables, Makefile target, and Slack notification steps. |
Files not reviewed (2)
  • tests/e2e/Makefile: Language not supported
  • tests/go.mod: Language not supported
Comments suppressed due to low confidence (3)

tests/e2e/cleanup_test.go:267

  • The 'ctx' variable is used in the cleanupNodeFeatureRules function without being defined. Consider passing a context parameter to this function to correctly scope API calls.
nfrs, err := cli.NfdV1alpha1().NodeFeatureRules().List(ctx, metav1.ListOptions{})

tests/e2e/cleanup_test.go:69

  • The variable 'ctx' is used in the cleanupTestPods function without being defined or passed as an argument. Consider either passing a context parameter or declaring 'ctx' locally.
podList, err := clientSet.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{

tests/e2e/cleanup_test.go:114

  • The variable 'ctx' is used without being defined in this cleanup helper. Ensure that a valid context is passed to this function or declared within its scope to avoid runtime errors.
err := clientSet.CoreV1().Namespaces().Delete(ctx, testNamespace.Name, metav1.DeleteOptions{})

@ArangoGutierrez
Collaborator Author

Thanks for the review and approval @tariq1890, any additional comments @elezar?

@ArangoGutierrez
Collaborator Author

Now that we have merged the PR in CTK, can we do a final round here @elezar?

ArangoGutierrez added the "testing" label (issue/PR to fix/edit/create/enhance a project unit/e2e test) on May 26, 2025
3 participants