ArangoGutierrez
Collaborator

Builds on #1280

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for drop-in configuration files to the nvidia-ctk command by introducing a --drop-in-config flag. This allows NVIDIA-specific runtime configurations to be saved to separate files instead of modifying the main configuration files.

  • Adds DropInConfig field to container runtime options and related CLI flags
  • Refactors containerd and crio config builders to use separate source and destination configs
  • Implements drop-in config support for containerd with import management
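
The import management mentioned in the last bullet can be pictured with a minimal sketch. It assumes the top-level containerd config carries a list of import globs and that the drop-in file only needs a new entry when no existing glob already matches it. `ensureImport` is a hypothetical helper written for illustration; the PR's actual logic may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// ensureImport returns the import list with dropInPath covered by at least
// one glob entry. Hypothetical helper, not taken from the PR.
func ensureImport(imports []string, dropInPath string) ([]string, error) {
	for _, pattern := range imports {
		// filepath.Match reports whether the path matches the glob pattern.
		matched, err := filepath.Match(pattern, dropInPath)
		if err != nil {
			return nil, fmt.Errorf("invalid import pattern %q: %w", pattern, err)
		}
		if matched {
			// Already covered: leave the list untouched.
			return imports, nil
		}
	}
	// Not covered yet: import every TOML file in the drop-in directory.
	glob := filepath.Join(filepath.Dir(dropInPath), "*.toml")
	return append(imports, glob), nil
}

func main() {
	imports, err := ensureImport(nil, "/etc/containerd/conf.d/99-nvidia.toml")
	if err != nil {
		panic(err)
	}
	fmt.Println(imports)
}
```

Checking existing globs first keeps the operation idempotent, so running the configure step twice does not duplicate the entry.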

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pkg/config/engine/engine.go | Changes function signatures to use more specific interface instead of generic Interface |
| pkg/config/engine/crio/option.go | Renames path field to topLevelConfigPath for clarity |
| pkg/config/engine/crio/crio.go | Updates config structure to support source/destination pattern |
| pkg/config/engine/containerd/option.go | Renames path field to topLevelConfigPath for clarity |
| pkg/config/engine/containerd/containerd.go | Implements drop-in config support with source/destination pattern |
| pkg/config/engine/containerd/config_drop_in.go | New file implementing drop-in config functionality for containerd |
| pkg/config/engine/containerd/config.go | Renames function for better clarity |
| pkg/config/engine/config.go | New file defining the Config struct for source/destination pattern |
| cmd/nvidia-ctk/runtime/configure/configure.go | Adds drop-in-config flag and updates function calls |
| cmd/nvidia-ctk-installer/container/runtime/runtime.go | Adds drop-in config validation and flags |
| cmd/nvidia-ctk-installer/container/container.go | Updates flush logic to use drop-in config path when specified |
| Test files | Updates test expectations for new drop-in config behavior |

@coveralls
coveralls commented Sep 19, 2025

Pull Request Test Coverage Report for Build 18010993246

Details

  • 14 of 37 (37.84%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.9%) to 37.18%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| cmd/nvidia-ctk/runtime/configure/configure.go | 10 | 18 | 55.56% |
| pkg/config/engine/containerd/config_drop_in.go | 3 | 18 | 16.67% |

Totals
Change from base Build 17981864462: 0.9%
Covered Lines: 4958
Relevant Lines: 13335

💛 - Coveralls

@elezar elezar added this to the v1.18.0 milestone Sep 19, 2025
@elezar elezar linked an issue Sep 19, 2025 that may be closed by this pull request
@ArangoGutierrez ArangoGutierrez force-pushed the dropins_ctk_cmd branch 2 times, most recently from e7ef820 to 8738572 Compare September 19, 2025 11:47
@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

Comment on lines 127 to 133
topLevelConfig := &Config{
	Tree: func() *toml.Tree {
		t, _ := toml.FromFile(b.topLevelConfigPath).Load()
		return t
	}(),
	configOptions: sourceConfigOptions,
}
Copilot AI Sep 19, 2025

Error from toml.FromFile(b.topLevelConfigPath).Load() is being silently ignored with _. This could lead to unexpected behavior if the file loading fails. The error should be handled appropriately.
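
One way to surface the load error instead of discarding it is sketched below. The sketch uses `os.ReadFile` as a stand-in for the toolkit's TOML loader, and `rawConfig` and `loadTopLevelConfig` are hypothetical names. Treating a missing file as an empty config is an assumption, not behavior confirmed by the PR.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// rawConfig is a hypothetical stand-in for the parsed *toml.Tree in the PR.
type rawConfig struct {
	contents []byte
}

// loadTopLevelConfig reads the file at path. A missing file yields an empty
// config (an assumption); any other error is returned to the caller.
func loadTopLevelConfig(path string) (*rawConfig, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return &rawConfig{}, nil
	}
	if err != nil {
		return nil, fmt.Errorf("failed to load top-level config %q: %w", path, err)
	}
	return &rawConfig{contents: data}, nil
}

func main() {
	cfg, err := loadTopLevelConfig("/etc/containerd/config.toml")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("loaded %d bytes\n", len(cfg.contents))
}
```

Moving the load out of the struct literal gives the builder a place to return the error before any config object is constructed.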

Copilot uses AI. Check for mistakes.

Comment on lines 73 to 78
// TODO: Only do this if we've actually modified the config.
if err := c.topLevelConfig.flush(); err != nil {
	return 0, fmt.Errorf("failed to save top-level config: %w", err)
}
Copilot AI Sep 19, 2025

The TODO comment indicates that the top-level config is being flushed unconditionally, even when no modifications were made. This could result in unnecessary file I/O operations.
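
A common way to resolve a TODO like this is a modified flag that is set on change and cleared on flush. The sketch below is illustrative only: `trackedConfig` is a hypothetical type, and the write is simulated with a counter rather than real file I/O.

```go
package main

import "fmt"

// trackedConfig is a hypothetical stand-in for the PR's top-level config.
type trackedConfig struct {
	values   map[string]string
	modified bool
	writes   int // counts simulated writes, for demonstration only
}

// set records a value and marks the config as modified only on a real change.
func (c *trackedConfig) set(key, value string) {
	if old, ok := c.values[key]; ok && old == value {
		return
	}
	if c.values == nil {
		c.values = make(map[string]string)
	}
	c.values[key] = value
	c.modified = true
}

// flush writes the config only when it changed since the last flush.
func (c *trackedConfig) flush() error {
	if !c.modified {
		return nil
	}
	c.writes++ // a real implementation would write to disk here
	c.modified = false
	return nil
}

func main() {
	c := &trackedConfig{}
	_ = c.flush() // nothing changed, so nothing is written
	c.set("imports", "/etc/containerd/conf.d/*.toml")
	_ = c.flush() // first real write
	_ = c.flush() // no further change, so skipped
	fmt.Println("writes:", c.writes)
}
```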

}

// Apply the runtime-specific config changes.
// TODO: Add the runtime-specific DropInConfigs here.
Copilot AI Sep 19, 2025

This TODO comment suggests incomplete implementation. The runtime-specific drop-in config logic should be implemented or the TODO should be addressed.

@ArangoGutierrez ArangoGutierrez marked this pull request as ready for review September 19, 2025 12:51
@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

}
}

if !filepath.IsAbs(config.dropInConfigPath) {
Copilot AI Sep 19, 2025

This validation will fail when dropInConfigPath is empty for docker runtime, since an empty string is not an absolute path. The validation should only run when dropInConfigPath is not empty.

Suggested change
- if !filepath.IsAbs(config.dropInConfigPath) {
+ if config.dropInConfigPath != "" && !filepath.IsAbs(config.dropInConfigPath) {
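
The reason for the extra guard is that `filepath.IsAbs("")` returns false, so an unset path would be rejected as "not absolute". A minimal sketch of the guarded check follows; `validateDropInPath` is a hypothetical helper name, not a function from the PR.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// validateDropInPath mirrors the suggested guard: an empty path means
// "no drop-in file requested" and is accepted; otherwise it must be absolute.
func validateDropInPath(path string) error {
	if path != "" && !filepath.IsAbs(path) {
		return fmt.Errorf("drop-in config path %q must be absolute", path)
	}
	return nil
}

func main() {
	// The empty string is not an absolute path, hence the explicit guard.
	fmt.Println(filepath.IsAbs(""))

	paths := []string{"", "conf.d/99-nvidia.toml", "/etc/containerd/conf.d/99-nvidia.toml"}
	for _, p := range paths {
		fmt.Printf("%q -> %v\n", p, validateDropInPath(p))
	}
}
```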

defaultCrioConfigFilePath = "/etc/crio/crio.conf"
defaultDockerConfigFilePath = "/etc/docker/daemon.json"

defaultContainerdDropInConfigFilePath = "/etc/containerd/config.d/99-nvidia.toml"
Member

Nit: Did we want to use the /run/nvidia/{{ .something }} here or are we hoping that we could convince Containerd to start honoring this path?

Collaborator Author

Since this is for interactive usage, I thought it was OK to set it like that, and yes, with the hope that we can push the community in this direction in the future.

Collaborator Author

PR containerd/containerd#12323 was merged, so we can expect FUTURE containerd versions to have "/etc/containerd/conf.d/*.toml" by default

Member

OK. That's fine then. Let's also check the paths we specify in the GPU Operator as part of NVIDIA/gpu-operator#1710 as well.

Collaborator Author

@ArangoGutierrez ArangoGutierrez Sep 25, 2025

The GPU Operator is using the right default

https://github.com/NVIDIA/gpu-operator/blob/1dda24966ee5e793a5eb4aa1f7abe477ef410f3c/controllers/object_controls.go#L62-L63

	DefaultContainerdDropInConfigFile = "/etc/containerd/conf.d/99-nvidia.toml"
	// DefaultContainerdSocketFile indicates default containerd socket file

}
}

if config.dropInConfigPath == "" && (config.runtime == "containerd" || config.runtime == "crio") {
Member

Why are we checking the runtime here AND in the switch statement?

Suggested change
- if config.dropInConfigPath == "" && (config.runtime == "containerd" || config.runtime == "crio") {
+ if config.dropInConfigPath == "" {

Collaborator Author

suggestion adopted

return fmt.Errorf("failed to flush config to %q: %w", c.path, err)
func (c *topLevelConfig) Save(dropInPath string) (int64, error) {
saveToPath := c.path
if dropInPath == "" {
Member

We may want to add a comment noting that an empty path ("") means we're writing to STDOUT.

Collaborator Author

comment added
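
The convention discussed in this thread, where an empty path selects STDOUT, can be sketched as follows. `save` is a hypothetical free function written for illustration; it is not the `Save` method from the PR, and the file permissions are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// save writes contents to path and returns the number of bytes written.
// Hypothetical helper for illustration only.
func save(path string, contents []byte) (int64, error) {
	// An empty path means "do not touch the filesystem; write to STDOUT".
	if path == "" {
		n, err := os.Stdout.Write(contents)
		return int64(n), err
	}
	// Create the parent directory first, since a drop-in directory may not exist yet.
	if err := os.MkdirAll(filepath.Dir(path), 0755); err != nil {
		return 0, fmt.Errorf("failed to create directory for %q: %w", path, err)
	}
	if err := os.WriteFile(path, contents, 0644); err != nil {
		return 0, fmt.Errorf("failed to write %q: %w", path, err)
	}
	return int64(len(contents)), nil
}

func main() {
	n, err := save("", []byte("version = 2\n"))
	if err != nil {
		panic(err)
	}
	fmt.Println("bytes written:", n)
}
```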

}
}

if config.dropInConfigPath != "" && !filepath.IsAbs(config.dropInConfigPath) {
Member

Did we also want to raise an error if docker specifies a drop-in config path?

Collaborator Author

Error check added for the mentioned scenario
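
Taken together, the validation rules agreed in these threads might look like the sketch below. The `options` type and its `validate` method are hypothetical, the constant's value is copied from the GPU Operator snippet quoted earlier in the conversation, and cri-o defaulting is left out because its default path does not appear in this thread.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Value copied from the GPU Operator snippet quoted in the conversation.
const defaultContainerdDropInConfigFilePath = "/etc/containerd/conf.d/99-nvidia.toml"

// options is a hypothetical stand-in for the command's config struct.
type options struct {
	runtime          string
	dropInConfigPath string
}

// validate applies per-runtime rules, then the shared absolute-path check.
func (o *options) validate() error {
	switch o.runtime {
	case "docker":
		// Drop-in files are rejected for docker, as requested in review.
		if o.dropInConfigPath != "" {
			return fmt.Errorf("runtime %q does not support a drop-in config path", o.runtime)
		}
	case "containerd":
		// Defaulting lives in the switch, so no separate runtime check is needed.
		if o.dropInConfigPath == "" {
			o.dropInConfigPath = defaultContainerdDropInConfigFilePath
		}
	}
	if o.dropInConfigPath != "" && !filepath.IsAbs(o.dropInConfigPath) {
		return fmt.Errorf("drop-in config path %q must be absolute", o.dropInConfigPath)
	}
	return nil
}

func main() {
	o := &options{runtime: "containerd"}
	if err := o.validate(); err != nil {
		panic(err)
	}
	fmt.Println(o.dropInConfigPath)
}
```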

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
@ArangoGutierrez ArangoGutierrez force-pushed the dropins_ctk_cmd branch 4 times, most recently from cce3ebe to b41bd12 Compare September 24, 2025 13:52
@elezar elezar merged commit c51b2ea into NVIDIA:main Sep 26, 2025
16 checks passed
Development

Successfully merging this pull request may close these issues.

Use drop-in files to configure containerd or cri-o
3 participants