[no-relnote] Update E2E test suite #1048

Open: wants to merge 4 commits into base: main

ArangoGutierrez
Collaborator

No description provided.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
@Copilot Copilot AI left a comment

Pull Request Overview

This PR updates the E2E test suite for the NVIDIA Container Toolkit. Key changes include:

  • Updating license headers to use SPDX identifiers.
  • Refactoring test files to use a global “runner” variable in place of a locally scoped “r” and adding log messages before running container commands.
  • Changing environment variable names for the container image from TOOLKIT_IMAGE to E2E_IMAGE_REPO and introducing E2E_IMAGE_TAG, with corresponding updates in the GitHub workflow.
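As a rough illustration of the split into repository and tag, the sketch below joins the two variables into a single image reference. The helper name `imageRef` and the error messages are hypothetical; only the variable names `E2E_IMAGE_REPO` and `E2E_IMAGE_TAG` come from this PR, and the actual test code may compose the image differently.

```go
package main

import (
	"fmt"
	"os"
)

// imageRef joins a repository and a tag into "repo:tag".
// Hypothetical helper for illustration; not taken from the PR.
func imageRef(repo, tag string) (string, error) {
	if repo == "" {
		return "", fmt.Errorf("E2E_IMAGE_REPO is not set")
	}
	if tag == "" {
		return "", fmt.Errorf("E2E_IMAGE_TAG is not set")
	}
	return repo + ":" + tag, nil
}

func main() {
	// Read both variables and fail early if either one is missing.
	ref, err := imageRef(os.Getenv("E2E_IMAGE_REPO"), os.Getenv("E2E_IMAGE_TAG"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(ref)
}
```

Validating both parts up front keeps a missing variable from surfacing later as a confusing pull failure.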

Reviewed Changes

Copilot reviewed 46 out of 48 changed files in this pull request and generated no comments.

| File | Description |
|------|-------------|
| tests/e2e/runner.go | Updated license header to use SPDX identifiers. |
| tests/e2e/nvidia-container-toolkit_test.go | Replaced local runner “r” with global “runner” along with added log messages. |
| tests/e2e/installer.go | Updated license header to use SPDX identifiers. |
| tests/e2e/e2e_test.go | Refactored env variable usage, introduced ImageRepo/ImageTag, and streamlined test env setup. |
| tests/e2e/README.md | New file documenting the E2E suite but the env variable names may need updating. |
| .github/workflows/e2e.yaml | Adjusted environment variables for image repository and tag accordingly. |

Files not reviewed (2)
  • tests/e2e/Makefile: Language not supported
  • tests/go.mod: Language not supported
Comments suppressed due to low confidence (2)

tests/e2e/nvidia-container-toolkit_test.go:59

  • The log message duplicates the '--gpus' flag; consider revising it to accurately reflect the command arguments (e.g. 'By("Running docker run with --runtime=nvidia --gpus=all")').
By("Running docker run with --gpus=all --runtime=nvidia --gpus all")

tests/e2e/README.md:62

  • The code now uses 'E2E_IMAGE_REPO' and 'E2E_IMAGE_TAG' for the container image instead of 'TOOLKIT_IMAGE'; please update the documentation accordingly.
| `TOOLKIT_IMAGE` | ✔ | `nvcr.io/nvidia/cuda:12.4.0-runtime-ubi9` | Image that will be pulled & executed. |

@ArangoGutierrez ArangoGutierrez force-pushed the updated_e2e branch 2 times, most recently from bf24666 to f5c3347 Compare April 24, 2025 19:13
@ArangoGutierrez ArangoGutierrez self-assigned this Apr 24, 2025
@ArangoGutierrez ArangoGutierrez force-pushed the updated_e2e branch 3 times, most recently from 8258aac to 5280561 Compare April 25, 2025 09:38
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
@ArangoGutierrez
Collaborator Author

PR is ready for review @elezar

IMAGE_NAME: ghcr.io/nvidia/container-toolkit
VERSION: ${{ inputs.version }}
E2E_IMAGE_REPO: ghcr.io/nvidia/container-toolkit
E2E_IMAGE_TAG: ${{ inputs.version }}-ubuntu20.04
Contributor

Can we move ubuntu20.04 to a variable? It could be called DIST

E2E_RUNTIME ?= docker
ginkgo:
	mkdir -p $(CURDIR)/bin
	GOBIN=$(CURDIR)/bin go install github.com/onsi/ginkgo/v2/ginkgo@latest
Contributor

Instead of `latest`, you could install the ginkgo version specified in go.mod. The `-modfile=go.mod` flag will help you with that.

See here for an example

### 6.1 Basic invocation
```bash
INSTALL_CTK=true \
TOOLKIT_IMAGE=nvcr.io/nvidia/cuda:12.4.0-runtime-ubi9 \
```
Contributor

Suggested change
- TOOLKIT_IMAGE=nvcr.io/nvidia/cuda:12.4.0-runtime-ubi9 \
+ TOOLKIT_IMAGE=nvcr.io/nvidia/cuda:12.8.1-runtime-ubi9 \

_, _, err = runner.Run("docker pull nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0")
Expect(err).ToNot(HaveOccurred())

_, _, err = runner.Run("docker pull nvcr.io/nvidia/cuda:12.8.0-base-ubi8")
Contributor

Suggested change
- _, _, err = runner.Run("docker pull nvcr.io/nvidia/cuda:12.8.0-base-ubi8")
+ _, _, err = runner.Run("docker pull nvcr.io/nvidia/cuda:12.8.1-base-ubi8")

return strconv.Itoa(defaultValue)
}
intValue, err := strconv.Atoi(value)
if err != nil {
Contributor

I think this method should return the error. If an error is returned, the caller one level up can fall back to defaultValue.

@@ -136,53 +117,52 @@ var _ = Describe("docker", Ordered, ContinueOnFailure, func() {
// The following should all produce the same result.
When("Running the cuda-deviceQuery sample", Ordered, func() {
BeforeAll(func(ctx context.Context) {
- _, _, err := r.Run("docker pull nvcr.io/nvidia/k8s/cuda-sample:devicequery-cuda12.5.0")
+ _, _, err := runner.Run("docker pull nvcr.io/nvidia/k8s/cuda-sample:devicequery-cuda12.5.0")
Contributor

Since this string is referenced in multiple places, can we move it to a string constant?

nvcr.io/nvidia/k8s/cuda-sample:devicequery-cuda12.5.0
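One possible shape for this, sketched under the assumption that the image reference is only used to build docker commands. The constant name `deviceQueryImage` and the helper `pullCommand` are hypothetical; the image string itself is the one quoted above.

```go
package main

import "fmt"

// deviceQueryImage holds the sample image reference in one place so that
// a version bump touches a single line. The constant name is hypothetical.
const deviceQueryImage = "nvcr.io/nvidia/k8s/cuda-sample:devicequery-cuda12.5.0"

// pullCommand builds the docker pull command line for an image.
// Hypothetical helper; the tests in the PR pass the string to runner.Run.
func pullCommand(image string) string {
	return "docker pull " + image
}

func main() {
	fmt.Println(pullCommand(deviceQueryImage))
}
```

In the test file, the call would then read along the lines of `runner.Run(pullCommand(deviceQueryImage))`, assuming `runner.Run` keeps accepting a single command string as in the excerpts above.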
