@kpedro88 kpedro88 commented Oct 2, 2025

What does the PR do?

This pull request updates the command-line interface for build.py with two main ideas in mind.

A. Use native features of argparse rather than (potentially brittle) manual parsing of multi-valued inputs:

  • automatically show default values in help message via built-in formatter
  • use choices to enforce limited sets of values:
    • --target-platform {linux,rhel,windows,igpu}
    • --build-type {Release,Debug,RelWithDebInfo,MinSizeRel}
  • use nargs instead of separate postprocessing to split by separating characters:
    • --image IMAGE Use specified Docker image in build as <image-name>,<full-image-name>. <image-name> can be "base", "gpu-base", or "pytorch".
      ->
      --image <image-name> <full-image-name> Use specified Docker image in build. <image-name> can be "base", "gpu-base", or "pytorch". (default: [])
    • other flags modified similarly, listed below (along with further reorganization)
  • use argument groups to provide more structure in help message (grouping available flags for different types of build elements together, providing introductory descriptions of each type)
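
For readers less familiar with these argparse features, here is a minimal sketch of how they combine. It is not the actual build.py code: the help strings are abbreviated, the registry URL is a placeholder, and using action="append" for repeatable flags is an assumption.

```python
import argparse


def build_parser():
    # ArgumentDefaultsHelpFormatter appends "(default: ...)" to each help string.
    parser = argparse.ArgumentParser(
        prog="build.py",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # choices rejects any value outside the listed set at parse time.
    parser.add_argument(
        "--build-type",
        choices=["Release", "Debug", "RelWithDebInfo", "MinSizeRel"],
        default="Release",
        help="Build type.",
    )
    # nargs=2 consumes two separate tokens, so no manual splitting on "," is needed.
    parser.add_argument(
        "--image",
        nargs=2,
        action="append",
        default=[],
        metavar=("<image-name>", "<full-image-name>"),
        help="Use specified Docker image in build.",
    )
    # Argument groups get their own titled section in the help message.
    group = parser.add_argument_group("cache", "Options to configure caches.")
    group.add_argument(
        "--cache-tag",
        nargs=2,
        action="append",
        default=[],
        metavar=("<cache>", "<tag>"),
        help="Select <tag> for specified <cache>.",
    )
    return parser


args = build_parser().parse_args(
    ["--build-type", "Debug", "--image", "base", "example.invalid/base:latest"]
)
print(args.build_type, args.image)
```

Each use of --image adds one [<image-name>, <full-image-name>] pair to the list, and an unsupported --build-type value makes argparse exit with a usage error.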

B. Unify the handling of different elements of the build to reduce code duplication and make it easier to add new features to control the build more precisely and flexibly.
The different elements are: components, backends, repoagents, caches, filesystems, endpoints, features.
Each type of element can be (internally) assigned different properties, which determine the available command-line arguments for that element type.
The properties include:

  • required: elements without this property get --enable and --disable flags.
  • strict: elements with this property are limited to a predefined set of choices (e.g. local, redis for caches).
  • tag: elements with this property get --<element>-tag <element> <tag> flags to specify a tag from the central repository.
  • org: elements with this property get --<element>-org <element> <org> flags to specify a different repository. (new feature)
  • cmake: elements with this property get --extra-<element>-cmake-arg <element> <name> <value> and --override-<element>-cmake-arg <element> <name> <value> flags to modify their build.
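
The mapping from properties to flags can be sketched as follows. This is an illustration only: add_element_flags is a hypothetical helper, and the property sets passed in the usage lines are assumptions, not the assignments made in build.py.

```python
import argparse


def add_element_flags(parser, element, properties, choices=None):
    """Register the flags implied by an element type's properties (hypothetical helper)."""
    group = parser.add_argument_group(element, f"Options to configure {element}s")
    # "strict" limits element names to a predefined set of choices.
    allowed = choices if "strict" in properties else None
    # Elements that are not "required" can be switched on and off.
    if "required" not in properties:
        for verb in ("enable", "disable"):
            group.add_argument(
                f"--{verb}-{element}",
                nargs="*",
                action="extend",
                default=[],
                choices=allowed,
                metavar=f"<{element}>",
            )
    # "tag" and "org" each add a repeatable two-value flag.
    for prop in ("tag", "org"):
        if prop in properties:
            group.add_argument(
                f"--{element}-{prop}",
                nargs=2,
                action="append",
                default=[],
                metavar=(f"<{element}>", f"<{prop}>"),
            )
    # "cmake" adds the extra/override pair of three-value flags.
    if "cmake" in properties:
        for kind in ("extra", "override"):
            group.add_argument(
                f"--{kind}-{element}-cmake-arg",
                nargs=3,
                action="append",
                default=[],
                metavar=(f"<{element}>", "<name>", "<value>"),
            )
    return group


parser = argparse.ArgumentParser(prog="build.py")
add_element_flags(
    parser, "cache", {"strict", "tag", "org", "cmake"}, choices=["local", "redis"]
)
add_element_flags(parser, "component", {"required", "tag"})
args = parser.parse_args(["--enable-cache", "local", "redis", "--cache-tag", "redis", "main"])
print(args.enable_cache, args.cache_tag)
```

With this layout, adding a new element type only requires a new call with the appropriate property set.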

The resulting arguments become:

  • features:
    --enable-logging, --enable-stats, --enable-metrics, --enable-gpu-metrics, --enable-cpu-metrics, --enable-tracing, --enable-nvtx, --enable-gpu, --enable-mali-gpu
    ->
    --enable-feature [<feature> [<feature> ...] ...]
    • add --disable-feature [<feature> [<feature> ...] ...] (in case starting from --enable-all)
  • endpoints:
    --endpoint ENDPOINT
    ->
    --enable-endpoint [<endpoint> [<endpoint> ...] ...]
    • add --disable-endpoint [<endpoint> [<endpoint> ...] ...]
  • filesystems:
    --filesystem FILESYSTEM
    ->
    --enable-filesystem [<filesystem> [<filesystem> ...] ...]
    • add --disable-filesystem [<filesystem> [<filesystem> ...] ...]
  • backends:
    --backend BACKEND Include specified backend in build as <backend-name>[:<repo-tag>]...
    --extra-backend-cmake-arg EXTRA_BACKEND_CMAKE_ARG Extra CMake argument for a backend build as <backend>:<name>=<value>.
    --override-backend-cmake-arg OVERRIDE_BACKEND_CMAKE_ARG Override specified backend CMake argument in the build as <backend>:<name>=<value>.
    ->
    --enable-backend [<backend> [<backend> ...] ...]
    --disable-backend [<backend> [<backend> ...] ...]
    --backend-tag <backend> <tag>
    --backend-org <backend> <org>
    --extra-backend-cmake-arg <backend> <name> <value>
    --override-backend-cmake-arg <backend> <name> <value>
  • components:
    --repo-tag REPO_TAG The version of a component to use in the build as <component-name>:<repo-tag>.
    --extra-core-cmake-arg EXTRA_CORE_CMAKE_ARG Extra CMake argument as <name>=<value>.
    --override-core-cmake-arg OVERRIDE_CORE_CMAKE_ARG Override specified CMake argument in the build as <name>=<value>.
    ->
    --component-tag <component> <tag>
    --extra-core-cmake-arg <name> <value>
    --override-core-cmake-arg <name> <value>
  • repoagents:
    --repoagent REPOAGENT Include specified repo agent in build as <repoagent-name>[:<repo-tag>]
    ->
    --enable-repoagent [<repoagent> [<repoagent> ...] ...]
    --disable-repoagent [<repoagent> [<repoagent> ...] ...]
    --repoagent-tag <repoagent> <tag>
    --repoagent-org <repoagent> <org>
    --extra-repoagent-cmake-arg <repoagent> <name> <value>
    --override-repoagent-cmake-arg <repoagent> <name> <value>
  • caches:
    --cache CACHE Include specified cache in build as <cache-name>[:<repo-tag>].
    ->
    --enable-cache [<cache> [<cache> ...] ...]
    --disable-cache [<cache> [<cache> ...] ...]
    --cache-tag <cache> <tag>
    --cache-org <cache> <org>
    --extra-cache-cmake-arg <cache> <name> <value>
    --override-cache-cmake-arg <cache> <name> <value>
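
As a rough sketch of how the enable/disable pairs could interact with --enable-all: resolve_enabled is a hypothetical function, and the rule that --disable-* always wins over --enable-* is an assumption made for this example, not a statement about build.py.

```python
def resolve_enabled(standard, enable_all, enabled, disabled):
    """Return the final element list for one element type (illustrative only)."""
    # Start from the standard list only when --enable-all was given.
    selected = list(standard) if enable_all else []
    # Add explicitly enabled elements, keeping order and avoiding duplicates.
    for name in enabled:
        if name not in selected:
            selected.append(name)
    # Assumption: an explicit disable takes precedence over any enable.
    return [name for name in selected if name not in disabled]


# Equivalent of: --enable-all --enable-endpoint sagemaker --disable-endpoint http
print(resolve_enabled(["grpc", "http"], True, ["sagemaker"], ["http"]))
```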

Several minor features are also added:

  1. Add flag --no-container-cache, which propagates to docker --no-cache.
  2. Add flag --default-repo-tag <tag> to override the calculated default value, which is not always appropriate. For example, when building a dev version (currently 25.08), the upstream container version is set to the previous version (25.07) and takes precedence in the calculated default, even though the corresponding dev versions of the component and backend repositories may be intended. Overriding the default globally avoids having to override it for each repo.
  3. Add flag --use-buildbase to use the temporary "buildbase" image as the "base" image for backends that need it (e.g. onnxruntime).
  4. The --library-paths flag was found to be unused anywhere in the script and is therefore removed (after demonstrating how it could be incorporated in the new scheme).
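
To illustrate item 2, here is a minimal sketch of the default-tag rule described in the help text (version YY.MM -> branch rYY.MM, version YY.MMdev -> branch main) with a global override on top. The function name and signature are hypothetical, and the real calculation in build.py may take more inputs into account.

```python
def default_repo_tag(container_version, override=None):
    """Pick the default git tag/branch for repos without an explicit tag (sketch)."""
    # A value passed via --default-repo-tag wins over the calculated default.
    if override is not None:
        return override
    # Development versions track the main branch.
    if container_version.endswith("dev"):
        return "main"
    # Release versions map to the matching release branch.
    return "r" + container_version


# The upstream container version 25.07 would select the release branch,
# while the override selects main for every repo at once.
print(default_repo_tag("25.07"))
print(default_repo_tag("25.07", override="main"))
```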

Checklist

  • I have read the Contribution guidelines and signed the Contributor License Agreement
  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • I ran pre-commit locally (pre-commit install, pre-commit run --all)
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type box here and add the label to the github PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

This PR includes the changes from #8362.

Where should the reviewer start?

Changes only affect build.py and associated documentation.

Here is the help message before any of these changes:

usage: build.py [-h] [-q | -v] [--dryrun] [--no-container-build] [--use-user-docker-config USE_USER_DOCKER_CONFIG] [--no-container-interactive]
                [--no-container-pull] [--container-memory CONTAINER_MEMORY] [--target-platform TARGET_PLATFORM] [--target-machine TARGET_MACHINE]
                [--build-id BUILD_ID] [--build-sha BUILD_SHA] [--build-dir BUILD_DIR] [--install-dir INSTALL_DIR] [--cmake-dir CMAKE_DIR]
                [--tmp-dir TMP_DIR] [--library-paths LIBRARY_PATHS] [--build-type BUILD_TYPE] [-j BUILD_PARALLEL]
                [--github-organization GITHUB_ORGANIZATION] [--version VERSION] [--container-version CONTAINER_VERSION]
                [--container-prebuild-command CONTAINER_PREBUILD_COMMAND] [--no-container-source] [--image IMAGE] [--enable-all] [--enable-logging]
                [--enable-stats] [--enable-metrics] [--enable-gpu-metrics] [--enable-cpu-metrics] [--enable-tracing] [--enable-nvtx] [--enable-gpu]
                [--enable-mali-gpu] [--min-compute-capability MIN_COMPUTE_CAPABILITY] [--endpoint ENDPOINT] [--filesystem FILESYSTEM] [--no-core-build]
                [--backend BACKEND] [--repo-tag REPO_TAG] [--repoagent REPOAGENT] [--cache CACHE] [--no-force-clone]
                [--extra-core-cmake-arg EXTRA_CORE_CMAKE_ARG] [--override-core-cmake-arg OVERRIDE_CORE_CMAKE_ARG]
                [--extra-backend-cmake-arg EXTRA_BACKEND_CMAKE_ARG] [--override-backend-cmake-arg OVERRIDE_BACKEND_CMAKE_ARG]
                [--release-version RELEASE_VERSION] [--triton-container-version TRITON_CONTAINER_VERSION]
                [--upstream-container-version UPSTREAM_CONTAINER_VERSION] [--ort-version ORT_VERSION] [--ort-openvino-version ORT_OPENVINO_VERSION]
                [--standalone-openvino-version STANDALONE_OPENVINO_VERSION] [--dcgm-version DCGM_VERSION] [--vllm-version VLLM_VERSION]
                [--rhel-py-version RHEL_PY_VERSION] [--build-secret key value]

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Disable console output.
  -v, --verbose         Enable verbose output.
  --dryrun              Output the build scripts, but do not perform build.
  --no-container-build  Do not use Docker container for build.
  --use-user-docker-config USE_USER_DOCKER_CONFIG
                        Path to the Docker configuration file to be used when performing container build.
  --no-container-interactive
                        Do not use -it argument to "docker run" when performing container build.
  --no-container-pull   Do not use Docker --pull argument when building container.
  --container-memory CONTAINER_MEMORY
                        Value for Docker --memory argument. Used only for windows builds.
  --target-platform TARGET_PLATFORM
                        Target platform for build, can be "linux", "rhel", "windows" or "igpu". If not specified, build targets the current platform.
  --target-machine TARGET_MACHINE
                        Target machine/architecture for build. If not specified, build targets the current machine/architecture.
  --build-id BUILD_ID   Build ID associated with the build.
  --build-sha BUILD_SHA
                        SHA associated with the build.
  --build-dir BUILD_DIR
                        Build directory. All repo clones and builds will be performed in this directory.
  --install-dir INSTALL_DIR
                        Install directory, default is <builddir>/opt/tritonserver.
  --cmake-dir CMAKE_DIR
                        Directory containing the CMakeLists.txt file for Triton server.
  --tmp-dir TMP_DIR     Temporary directory used for building inside docker. Default is /tmp.
  --library-paths LIBRARY_PATHS
                        Specify library paths for respective backends in build as <backend-name>[:<library_path>].
  --build-type BUILD_TYPE
                        Build type, one of "Release", "Debug", "RelWithDebInfo" or "MinSizeRel". Default is "Release".
  -j BUILD_PARALLEL, --build-parallel BUILD_PARALLEL
                        Build parallelism. Defaults to 2 * number-of-cores.
  --github-organization GITHUB_ORGANIZATION
                        The GitHub organization containing the repos used for the build. Defaults to "https://github.com/triton-inference-server".
  --version VERSION     The Triton version. If not specified defaults to the value in the TRITON_VERSION file.
  --container-version CONTAINER_VERSION
                        The Triton container version to build. If not specified the container version will be chosen automatically based on --version value.
  --container-prebuild-command CONTAINER_PREBUILD_COMMAND
                        When performing a container build, this command will be executed within the container just before the build it performed.
  --no-container-source
                        Do not include OSS source code in Docker container.
  --image IMAGE         Use specified Docker image in build as <image-name>,<full-image-name>. <image-name> can be "base", "gpu-base", or "pytorch".
  --enable-all          Enable all standard released Triton features, backends, repository agents, caches, endpoints and file systems.
  --enable-logging      Enable logging.
  --enable-stats        Enable statistics collection.
  --enable-metrics      Enable metrics reporting.
  --enable-gpu-metrics  Include GPU metrics in reported metrics.
  --enable-cpu-metrics  Include CPU metrics in reported metrics.
  --enable-tracing      Enable tracing.
  --enable-nvtx         Enable NVTX.
  --enable-gpu          Enable GPU support.
  --enable-mali-gpu     Enable ARM MALI GPU support.
  --min-compute-capability MIN_COMPUTE_CAPABILITY
                        Minimum CUDA compute capability supported by server.
  --endpoint ENDPOINT   Include specified endpoint in build. Allowed values are "grpc", "http", "vertex-ai" and "sagemaker".
  --filesystem FILESYSTEM
                        Include specified filesystem in build. Allowed values are "gcs", "azure_storage" and "s3".
  --no-core-build       Do not build Triton core shared library or executable.
  --backend BACKEND     Include specified backend in build as <backend-name>[:<repo-tag>]. If <repo-tag> starts with "pull/" then it refers to a pull-
                        request reference, otherwise <repo-tag> indicates the git tag/branch to use for the build. If the version is non-development then
                        the default <repo-tag> is the release branch matching the container version (e.g. version YY.MM -> branch rYY.MM); otherwise the
                        default <repo-tag> is "main" (e.g. version YY.MMdev -> branch main).
  --repo-tag REPO_TAG   The version of a component to use in the build as <component-name>:<repo-tag>. <component-name> can be "common", "core", "backend"
                        or "thirdparty". <repo-tag> indicates the git tag/branch to use for the build. Currently <repo-tag> does not support pull-request
                        reference. If the version is non-development then the default <repo-tag> is the release branch matching the container version (e.g.
                        version YY.MM -> branch rYY.MM); otherwise the default <repo-tag> is "main" (e.g. version YY.MMdev -> branch main).
  --repoagent REPOAGENT
                        Include specified repo agent in build as <repoagent-name>[:<repo-tag>]. If <repo-tag> starts with "pull/" then it refers to a pull-
                        request reference, otherwise <repo-tag> indicates the git tag/branch to use for the build. If the version is non-development then
                        the default <repo-tag> is the release branch matching the container version (e.g. version YY.MM -> branch rYY.MM); otherwise the
                        default <repo-tag> is "main" (e.g. version YY.MMdev -> branch main).
  --cache CACHE         Include specified cache in build as <cache-name>[:<repo-tag>]. If <repo-tag> starts with "pull/" then it refers to a pull-request
                        reference, otherwise <repo-tag> indicates the git tag/branch to use for the build. If the version is non-development then the
                        default <repo-tag> is the release branch matching the container version (e.g. version YY.MM -> branch rYY.MM); otherwise the default
                        <repo-tag> is "main" (e.g. version YY.MMdev -> branch main).
  --no-force-clone      Do not create fresh clones of repos that have already been cloned.
  --extra-core-cmake-arg EXTRA_CORE_CMAKE_ARG
                        Extra CMake argument as <name>=<value>. The argument is passed to CMake as -D<name>=<value> and is included after all CMake
                        arguments added by build.py for the core builds.
  --override-core-cmake-arg OVERRIDE_CORE_CMAKE_ARG
                        Override specified CMake argument in the build as <name>=<value>. The argument is passed to CMake as -D<name>=<value>. This flag
                        only impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the core build use --extra-core-
                        cmake-arg.
  --extra-backend-cmake-arg EXTRA_BACKEND_CMAKE_ARG
                        Extra CMake argument for a backend build as <backend>:<name>=<value>. The argument is passed to CMake as -D<name>=<value> and is
                        included after all CMake arguments added by build.py for the backend.
  --override-backend-cmake-arg OVERRIDE_BACKEND_CMAKE_ARG
                        Override specified backend CMake argument in the build as <backend>:<name>=<value>. The argument is passed to CMake as
                        -D<name>=<value>. This flag only impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the
                        backend build use --extra-backend-cmake-arg.
  --release-version RELEASE_VERSION
                        This flag sets the release version for Triton Inference Server to be built. Default: the latest released version.
  --triton-container-version TRITON_CONTAINER_VERSION
                        This flag sets the container version for Triton Inference Server to be built. Default: the latest released version.
  --upstream-container-version UPSTREAM_CONTAINER_VERSION
                        This flag sets the upstream container version for Triton Inference Server to be built. Default: the latest released version.
  --ort-version ORT_VERSION
                        This flag sets the ORT version for Triton Inference Server to be built. Default: the latest supported version.
  --ort-openvino-version ORT_OPENVINO_VERSION
                        This flag sets the OpenVino version for Triton Inference Server to be built. Default: the latest supported version.
  --standalone-openvino-version STANDALONE_OPENVINO_VERSION
                        This flag sets the standalon OpenVino version for Triton Inference Server to be built. Default: the latest supported version.
  --dcgm-version DCGM_VERSION
                        This flag sets the DCGM version for Triton Inference Server to be built. Default: the latest supported version.
  --vllm-version VLLM_VERSION
                        This flag sets the vLLM version for Triton Inference Server to be built. Default: the latest supported version.
  --rhel-py-version RHEL_PY_VERSION
                        This flag sets the Python version for RHEL platform of Triton Inference Server to be built. Default: the latest supported version.
  --build-secret key value
                        Add build secrets in the form of <key> <value>. These secrets are used during the build process for vllm. The secrets are passed to
                        the Docker build step as `--secret id=<key>`. The following keys are expected and their purposes are described below: - 'req': A
                        file containing a list of dependencies for pip (e.g., requirements.txt). - 'build_public_vllm': A flag (default is 'true')
                        indicating whether to build the public VLLM version. Ensure that the required environment variables for these secrets are set before
                        running the build.

Here is the help message after all of the changes:

usage: build.py [-h] [-q | -v] [--dryrun] [--no-container-build] [--use-user-docker-config USE_USER_DOCKER_CONFIG] [--no-container-interactive]
                [--no-container-pull] [--no-container-cache] [--container-memory CONTAINER_MEMORY] [--target-platform {linux,rhel,windows,igpu}]
                [--target-machine TARGET_MACHINE] [--build-id BUILD_ID] [--build-sha BUILD_SHA] [--build-dir BUILD_DIR] [--install-dir INSTALL_DIR]
                [--cmake-dir CMAKE_DIR] [--tmp-dir TMP_DIR] [--build-type {Release,Debug,RelWithDebInfo,MinSizeRel}] [-j BUILD_PARALLEL]
                [--github-organization GITHUB_ORGANIZATION] [--version VERSION] [--container-version CONTAINER_VERSION]
                [--container-prebuild-command CONTAINER_PREBUILD_COMMAND] [--no-container-source] [--image <image-name> <full-image-name>] [--use-buildbase]
                [--enable-all] [--enable-backend [<backend> [<backend> ...] ...]] [--disable-backend [<backend> [<backend> ...] ...]]
                [--backend-tag <backend> <tag>] [--backend-org <backend> <org>] [--extra-backend-cmake-arg <backend> <name> <value>]
                [--override-backend-cmake-arg <backend> <name> <value>] [--enable-repoagent [<repoagent> [<repoagent> ...] ...]]
                [--disable-repoagent [<repoagent> [<repoagent> ...] ...]] [--repoagent-tag <repoagent> <tag>] [--repoagent-org <repoagent> <org>]
                [--extra-repoagent-cmake-arg <repoagent> <name> <value>] [--override-repoagent-cmake-arg <repoagent> <name> <value>]
                [--enable-cache [<cache> [<cache> ...] ...]] [--disable-cache [<cache> [<cache> ...] ...]] [--cache-tag <cache> <tag>]
                [--cache-org <cache> <org>] [--extra-cache-cmake-arg <cache> <name> <value>] [--override-cache-cmake-arg <cache> <name> <value>]
                [--enable-filesystem [<filesystem> [<filesystem> ...] ...]] [--disable-filesystem [<filesystem> [<filesystem> ...] ...]]
                [--enable-endpoint [<endpoint> [<endpoint> ...] ...]] [--disable-endpoint [<endpoint> [<endpoint> ...] ...]]
                [--enable-feature [<feature> [<feature> ...] ...]] [--disable-feature [<feature> [<feature> ...] ...]] [--component-tag <component> <tag>]
                [--extra-core-cmake-arg <name> <value>] [--override-core-cmake-arg <name> <value>] [--min-compute-capability MIN_COMPUTE_CAPABILITY]
                [--no-core-build] [--no-force-clone] [--release-version RELEASE_VERSION] [--triton-container-version TRITON_CONTAINER_VERSION]
                [--upstream-container-version UPSTREAM_CONTAINER_VERSION] [--default-repo-tag DEFAULT_REPO_TAG] [--ort-version ORT_VERSION]
                [--ort-openvino-version ORT_OPENVINO_VERSION] [--standalone-openvino-version STANDALONE_OPENVINO_VERSION] [--dcgm-version DCGM_VERSION]
                [--vllm-version VLLM_VERSION] [--rhel-py-version RHEL_PY_VERSION] [--build-secret key value]

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Disable console output. (default: False)
  -v, --verbose         Enable verbose output. (default: False)
  --dryrun              Output the build scripts, but do not perform build. (default: False)
  --no-container-build  Do not use Docker container for build. (default: False)
  --use-user-docker-config USE_USER_DOCKER_CONFIG
                        Path to the Docker configuration file to be used when performing container build. (default: None)
  --no-container-interactive
                        Do not use -it argument to "docker run" when performing container build. (default: False)
  --no-container-pull   Do not use Docker --pull argument when building container. (default: False)
  --no-container-cache  Use Docker --no-cache argument when building container. (default: False)
  --container-memory CONTAINER_MEMORY
                        Value for Docker --memory argument. Used only for windows builds. (default: None)
  --target-platform {linux,rhel,windows,igpu}
                        Target platform for build. If not specified, build targets the current platform. (default: None)
  --target-machine TARGET_MACHINE
                        Target machine/architecture for build. If not specified, build targets the current machine/architecture. (default: None)
  --build-id BUILD_ID   Build ID associated with the build. (default: None)
  --build-sha BUILD_SHA
                        SHA associated with the build. (default: None)
  --build-dir BUILD_DIR
                        Build directory. All repo clones and builds will be performed in this directory. (default: None)
  --install-dir INSTALL_DIR
                        Install directory, default is <builddir>/opt/tritonserver. (default: None)
  --cmake-dir CMAKE_DIR
                        Directory containing the CMakeLists.txt file for Triton server. (default: None)
  --tmp-dir TMP_DIR     Temporary directory used for building inside docker. (default: /tmp)
  --build-type {Release,Debug,RelWithDebInfo,MinSizeRel}
                        Build type. (default: Release)
  -j BUILD_PARALLEL, --build-parallel BUILD_PARALLEL
                        Build parallelism. Defaults to 2 * number-of-cores. (default: None)
  --github-organization GITHUB_ORGANIZATION
                        The GitHub organization containing the repos used for the build. (default: https://github.com/triton-inference-server)
  --version VERSION     The Triton version. If not specified defaults to the value in the TRITON_VERSION file. (default: None)
  --container-version CONTAINER_VERSION
                        The Triton container version to build. If not specified the container version will be chosen automatically based on --version value.
                        (default: None)
  --container-prebuild-command CONTAINER_PREBUILD_COMMAND
                        When performing a container build, this command will be executed within the container just before the build it performed. (default:
                        None)
  --no-container-source
                        Do not include OSS source code in Docker container. (default: False)
  --image <image-name> <full-image-name>
                        Use specified Docker image in build. <image-name> can be "base", "gpu-base", or "pytorch". (default: [])
  --use-buildbase       Use local temporary "buildbase" Docker image as "base" image to build backends (default: False)
  --enable-all          Enable all standard released Triton features, backends, repository agents, caches, endpoints, and file systems. (default: False)
  --min-compute-capability MIN_COMPUTE_CAPABILITY
                        Minimum CUDA compute capability supported by server. (default: 6.0)
  --no-core-build       Do not build Triton core shared library or executable. (default: False)
  --no-force-clone      Do not create fresh clones of repos that have already been cloned. (default: False)
  --release-version RELEASE_VERSION
                        This flag sets the release version for Triton Inference Server to be built. Default: the latest released version. (default:
                        2.62.0dev)
  --triton-container-version TRITON_CONTAINER_VERSION
                        This flag sets the container version for Triton Inference Server to be built. Default: the latest released version. (default:
                        25.10dev)
  --upstream-container-version UPSTREAM_CONTAINER_VERSION
                        This flag sets the upstream container version for Triton Inference Server to be built. Default: the latest released version.
                        (default: 25.08)
  --default-repo-tag DEFAULT_REPO_TAG
                        Override the calculated default-repo-tag value (default: None)
  --ort-version ORT_VERSION
                        This flag sets the ORT version for Triton Inference Server to be built. Default: the latest supported version. (default: 1.23.0)
  --ort-openvino-version ORT_OPENVINO_VERSION
                        This flag sets the OpenVino version for Triton Inference Server to be built. Default: the latest supported version. (default:
                        2025.3.0)
  --standalone-openvino-version STANDALONE_OPENVINO_VERSION
                        This flag sets the standalon OpenVino version for Triton Inference Server to be built. Default: the latest supported version.
                        (default: 2025.3.0)
  --dcgm-version DCGM_VERSION
                        This flag sets the DCGM version for Triton Inference Server to be built. Default: the latest supported version. (default: 4.4.0-1)
  --vllm-version VLLM_VERSION
                        This flag sets the vLLM version for Triton Inference Server to be built. Default: the latest supported version. (default: 0.9.2)
  --rhel-py-version RHEL_PY_VERSION
                        This flag sets the Python version for RHEL platform of Triton Inference Server to be built. Default: the latest supported version.
                        (default: 3.12.3)
  --build-secret key value
                        Add build secrets in the form of <key> <value>. These secrets are used during the build process for vllm. The secrets are passed to
                        the Docker build step as `--secret id=<key>`. The following keys are expected and their purposes are described below: - 'req': A
                        file containing a list of dependencies for pip (e.g., requirements.txt). - 'build_public_vllm': A flag (default is 'true')
                        indicating whether to build the public VLLM version. Ensure that the required environment variables for these secrets are set before
                        running the build. (default: [])

backend:
  Options to configure backends, including: ensemble, identity, square, repeat, onnxruntime, python, dali, pytorch, openvino, fil, tensorrt, ...

  --enable-backend [<backend> [<backend> ...] ...]
                        Enable requested backend(s) (default: [])
  --disable-backend [<backend> [<backend> ...] ...]
                        Disable requested backend(s) (remove from --enable-all standard list) (default: [])
  --backend-tag <backend> <tag>
                        Select <tag> for specified <backend>. If <tag> starts with "pull/" then it refers to a pull-request reference, otherwise <tag>
                        indicates the git tag/branch to use for the build. If the version is non-development then the default <tag> is the release branch
                        matching the container version (e.g. version YY.MM -> branch rYY.MM); otherwise the default <tag> is "main" (e.g. version YY.MMdev
                        -> branch main). (default: [])
  --backend-org <backend> <org>
                        Select <org> for specified <backend>, to use the fork of the corresponding repository from <org> instead of the default --github-
                        organization value. (default: [])
  --extra-backend-cmake-arg <backend> <name> <value>
                        Extra CMake argument for backend build. The argument is passed to CMake as -D<name>=<value> and is included after all CMake
                        arguments added by build.py. (default: [])
  --override-backend-cmake-arg <backend> <name> <value>
                        Override specified backend CMake argument in the backend build. The argument is passed to CMake as -D<name>=<value>. This flag only
                        impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the backend build use --extra-backend-
                        cmake-arg. (default: [])

repoagent:
  Options to configure repoagents, including: checksum, ...

  --enable-repoagent [<repoagent> [<repoagent> ...] ...]
                        Enable requested repoagent(s) (default: [])
  --disable-repoagent [<repoagent> [<repoagent> ...] ...]
                        Disable requested repoagent(s) (remove from --enable-all standard list) (default: [])
  --repoagent-tag <repoagent> <tag>
                        Select <tag> for specified <repoagent>. If <tag> starts with "pull/" then it refers to a pull-request reference, otherwise <tag>
                        indicates the git tag/branch to use for the build. If the version is non-development then the default <tag> is the release branch
                        matching the container version (e.g. version YY.MM -> branch rYY.MM); otherwise the default <tag> is "main" (e.g. version YY.MMdev
                        -> branch main). (default: [])
  --repoagent-org <repoagent> <org>
                        Select <org> for specified <repoagent>, to use the fork of the corresponding repository from <org> instead of the default --github-
                        organization value. (default: [])
  --extra-repoagent-cmake-arg <repoagent> <name> <value>
                        Extra CMake argument for repoagent build. The argument is passed to CMake as -D<name>=<value> and is included after all CMake
                        arguments added by build.py. (default: [])
  --override-repoagent-cmake-arg <repoagent> <name> <value>
                        Override specified backend CMake argument in the repoagent build. The argument is passed to CMake as -D<name>=<value>. This flag
                        only impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the repoagent build use --extra-
                        repoagent-cmake-arg. (default: [])

cache:
  Options to configure caches, including: local, redis, ...

  --enable-cache [<cache> [<cache> ...] ...]
                        Enable requested cache(s) (default: [])
  --disable-cache [<cache> [<cache> ...] ...]
                        Disable requested cache(s) (remove from --enable-all standard list) (default: [])
  --cache-tag <cache> <tag>
                        Select <tag> for specified <cache>. If <tag> starts with "pull/" then it refers to a pull-request reference, otherwise <tag>
                        indicates the git tag/branch to use for the build. If the version is non-development then the default <tag> is the release branch
                        matching the container version (e.g. version YY.MM -> branch rYY.MM); otherwise the default <tag> is "main" (e.g. version YY.MMdev
                        -> branch main). (default: [])
  --cache-org <cache> <org>
                        Select <org> for specified <cache>, to use the fork of the corresponding repository from <org> instead of the default --github-
                        organization value. (default: [])
  --extra-cache-cmake-arg <cache> <name> <value>
                        Extra CMake argument for cache build. The argument is passed to CMake as -D<name>=<value> and is included after all CMake arguments
                        added by build.py. (default: [])
  --override-cache-cmake-arg <cache> <name> <value>
                        Override specified backend CMake argument in the cache build. The argument is passed to CMake as -D<name>=<value>. This flag only
                        impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the cache build use --extra-cache-
                        cmake-arg. (default: [])

filesystem:
  Options to configure filesystems, including: gcs, s3, azure_storage

  --enable-filesystem [<filesystem> [<filesystem> ...] ...]
                        Enable requested filesystem(s) (default: [])
  --disable-filesystem [<filesystem> [<filesystem> ...] ...]
                        Disable requested filesystem(s) (remove from --enable-all standard list) (default: [])

endpoint:
  Options to configure endpoints, including: http, grpc, sagemaker, vertex-ai

  --enable-endpoint [<endpoint> [<endpoint> ...] ...]
                        Enable requested endpoint(s) (default: [])
  --disable-endpoint [<endpoint> [<endpoint> ...] ...]
                        Disable requested endpoint(s) (remove from --enable-all standard list) (default: [])

feature:
  Options to configure features, including: logging, stats, metrics, gpu_metrics, cpu_metrics, tracing, nvtx, gpu, mali_gpu

  --enable-feature [<feature> [<feature> ...] ...]
                        Enable requested feature(s) (default: [])
  --disable-feature [<feature> [<feature> ...] ...]
                        Disable requested feature(s) (remove from --enable-all standard list) (default: [])

component:
  Options to configure components, including: common, core, backend, thirdparty

  --component-tag <component> <tag>
                        Select <tag> for specified <component>. If <tag> starts with "pull/" then it refers to a pull-request reference, otherwise <tag>
                        indicates the git tag/branch to use for the build. If the version is non-development then the default <tag> is the release branch
                        matching the container version (e.g. version YY.MM -> branch rYY.MM); otherwise the default <tag> is "main" (e.g. version YY.MMdev
                        -> branch main). (default: [])
  --extra-core-cmake-arg <name> <value>
                        Extra CMake argument for core build. The argument is passed to CMake as -D<name>=<value> and is included after all CMake arguments
                        added by build.py. (default: [])
  --override-core-cmake-arg <name> <value>
                        Override specified backend CMake argument in the core build. The argument is passed to CMake as -D<name>=<value>. This flag only
                        impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the core build use --extra-core-cmake-
                        arg. (default: [])
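
As a rough illustration of the paired flags above (not the actual build.py code), the sketch below shows how argparse can collect repeated `--backend-tag <backend> <tag>` pairs with `nargs=2` and `action="append"`, and how the default-tag rule from the help text might be applied afterwards. The helper names `default_tag` and `resolve_tags`, the trailing-`dev` check, and the example tag values are assumptions made for this example.

```python
import argparse


def default_tag(container_version):
    """Default rule from the help text: YY.MMdev -> main, YY.MM -> rYY.MM.

    Assumption: a development version is recognized by a trailing "dev".
    """
    if container_version.endswith("dev"):
        return "main"
    return f"r{container_version}"


def build_parser():
    parser = argparse.ArgumentParser(
        prog="build.py",
        # Built-in formatter that appends "(default: ...)" to each help string.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    group = parser.add_argument_group("backend", "Options to configure backends")
    group.add_argument(
        "--backend-tag",
        action="append",  # each use of the flag adds one [<backend>, <tag>] pair
        nargs=2,
        default=[],
        metavar=("<backend>", "<tag>"),
        help="Select <tag> for specified <backend>.",
    )
    return parser


def resolve_tags(pairs, names, container_version):
    """Start from the default tag for every name, then apply explicit pairs."""
    tags = {name: default_tag(container_version) for name in names}
    for name, tag in pairs:
        tags[name] = tag
    return tags
```

With this sketch, `build.py --backend-tag python pull/123/head --backend-tag onnxruntime r25.09` yields two pairs in `args.backend_tag`, and any backend not mentioned keeps the default tag (the pull-request reference here is a made-up value).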

Test plan:

I have used the script to build containers in several configurations and found that everything works as desired.

There are changes to some of the CLI flags, listed above.

My understanding from the discussion in #8362 is that there are internal build.py commands that need to be checked. I am happy to convert and check these if they can be provided, but I currently have no insight into the exact internal commands that need to be maintained.

Caveats:

More features could be added for the "component" elements, such as using forked repositories or modifying CMake arguments (beyond core). However, these changes would require modifying the CMakeLists.txt files in the various component repositories, so they are left for future PRs (keeping this PR limited to modifications of build.py).

Background

In the process of adding minor features in #8362, I noticed some friction: duplication of code, multiple parsing steps, etc. The partial refactor implemented here makes it much easier to add new options for different elements of the build process (see 304b1a7 for an example, which resolves an existing TODO item noted in the code).
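
To make the idea of property-driven flags concrete, here is a minimal sketch (not the code in this PR) of how one table of element types could generate the argument groups. The `ElementType` class, the `add_element_group` helper, and the particular property assignments are assumptions inferred from the help output; only the flag names and the `required` / `strict` / `tag` properties come from the description.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementType:
    name: str               # e.g. "backend"
    known: tuple            # names listed in the group description
    required: bool = False  # required elements get no --enable/--disable flags
    strict: bool = False    # strict elements only accept the known names
    tag: bool = False       # tagged elements get --<element>-tag


def add_element_group(parser, etype):
    """Register the flags implied by the properties of one element type."""
    suffix = "" if etype.strict else ", ..."
    group = parser.add_argument_group(
        etype.name,
        f"Options to configure {etype.name}s, including: "
        f"{', '.join(etype.known)}{suffix}",
    )
    meta = f"<{etype.name}>"
    if not etype.required:
        for verb in ("enable", "disable"):
            group.add_argument(
                f"--{verb}-{etype.name}",
                action="extend",  # repeated uses accumulate into one flat list
                nargs="*",
                default=[],
                choices=etype.known if etype.strict else None,
                metavar=meta,
                help=f"{verb.capitalize()} requested {etype.name}(s)",
            )
    if etype.tag:
        group.add_argument(
            f"--{etype.name}-tag",
            action="append",
            nargs=2,
            default=[],
            metavar=(meta, "<tag>"),
            help=f"Select <tag> for specified {meta}.",
        )


# Hypothetical property assignments, guessed from the help output.
ELEMENT_TYPES = (
    ElementType("backend", ("ensemble", "identity", "python"), tag=True),
    ElementType("filesystem", ("gcs", "s3", "azure_storage"), strict=True),
    ElementType("component", ("common", "core", "backend", "thirdparty"),
                required=True, strict=True, tag=True),
)


def build_parser():
    parser = argparse.ArgumentParser(
        prog="build.py",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    for etype in ELEMENT_TYPES:
        add_element_group(parser, etype)
    return parser
```

In a layout like this, adding a new kind of flag for every element type means touching one helper rather than each group separately.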

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

N/A


whoisj commented Oct 9, 2025

This does produce, in my opinion, a more helpful --help message:

usage: build.py [-h] [-q | -v] [--dryrun] [--no-container-build] [--use-user-docker-config USE_USER_DOCKER_CONFIG] [--no-container-interactive] [--no-container-pull] [--no-container-cache] [--container-memory CONTAINER_MEMORY] [--target-platform {linux,rhel,windows,igpu}]
                [--target-machine TARGET_MACHINE] [--build-id BUILD_ID] [--build-sha BUILD_SHA] [--build-dir BUILD_DIR] [--install-dir INSTALL_DIR] [--cmake-dir CMAKE_DIR] [--tmp-dir TMP_DIR] [--build-type {Release,Debug,RelWithDebInfo,MinSizeRel}] [-j BUILD_PARALLEL]
                [--github-organization GITHUB_ORGANIZATION] [--version VERSION] [--container-version CONTAINER_VERSION] [--container-prebuild-command CONTAINER_PREBUILD_COMMAND] [--no-container-source] [--image <image-name> <full-image-name>] [--use-buildbase] [--enable-all]
                [--enable-backend [<backend> [<backend> ...] ...]] [--disable-backend [<backend> [<backend> ...] ...]] [--backend-tag <backend> <tag>] [--backend-org <backend> <org>] [--extra-backend-cmake-arg <backend> <name> <value>] [--override-backend-cmake-arg <backend> <name> <value>]
                [--enable-repoagent [<repoagent> [<repoagent> ...] ...]] [--disable-repoagent [<repoagent> [<repoagent> ...] ...]] [--repoagent-tag <repoagent> <tag>] [--repoagent-org <repoagent> <org>] [--extra-repoagent-cmake-arg <repoagent> <name> <value>]
                [--override-repoagent-cmake-arg <repoagent> <name> <value>] [--enable-cache [<cache> [<cache> ...] ...]] [--disable-cache [<cache> [<cache> ...] ...]] [--cache-tag <cache> <tag>] [--cache-org <cache> <org>] [--extra-cache-cmake-arg <cache> <name> <value>]
                [--override-cache-cmake-arg <cache> <name> <value>] [--enable-filesystem [<filesystem> [<filesystem> ...] ...]] [--disable-filesystem [<filesystem> [<filesystem> ...] ...]] [--enable-endpoint [<endpoint> [<endpoint> ...] ...]] [--disable-endpoint [<endpoint> [<endpoint> ...]
                ...]] [--enable-feature [<feature> [<feature> ...] ...]] [--disable-feature [<feature> [<feature> ...] ...]] [--component-tag <component> <tag>] [--extra-core-cmake-arg <name> <value>] [--override-core-cmake-arg <name> <value>] [--min-compute-capability MIN_COMPUTE_CAPABILITY]
                [--no-core-build] [--no-force-clone] [--release-version RELEASE_VERSION] [--triton-container-version TRITON_CONTAINER_VERSION] [--upstream-container-version UPSTREAM_CONTAINER_VERSION] [--default-repo-tag DEFAULT_REPO_TAG] [--ort-version ORT_VERSION]
                [--ort-openvino-version ORT_OPENVINO_VERSION] [--standalone-openvino-version STANDALONE_OPENVINO_VERSION] [--dcgm-version DCGM_VERSION] [--vllm-version VLLM_VERSION] [--rhel-py-version RHEL_PY_VERSION] [--build-secret key value]

options:
  -h, --help            show this help message and exit
  -q, --quiet           Disable console output. (default: False)
  -v, --verbose         Enable verbose output. (default: False)
  --dryrun              Output the build scripts, but do not perform build. (default: False)
  --no-container-build  Do not use Docker container for build. (default: False)
  --use-user-docker-config USE_USER_DOCKER_CONFIG
                        Path to the Docker configuration file to be used when performing container build. (default: None)
  --no-container-interactive
                        Do not use -it argument to "docker run" when performing container build. (default: False)
  --no-container-pull   Do not use Docker --pull argument when building container. (default: False)
  --no-container-cache  Use Docker --no-cache argument when building container. (default: False)
  --container-memory CONTAINER_MEMORY
                        Value for Docker --memory argument. Used only for windows builds. (default: None)
  --target-platform {linux,rhel,windows,igpu}
                        Target platform for build. If not specified, build targets the current platform. (default: None)
  --target-machine TARGET_MACHINE
                        Target machine/architecture for build. If not specified, build targets the current machine/architecture. (default: None)
  --build-id BUILD_ID   Build ID associated with the build. (default: None)
  --build-sha BUILD_SHA
                        SHA associated with the build. (default: None)
  --build-dir BUILD_DIR
                        Build directory. All repo clones and builds will be performed in this directory. (default: None)
  --install-dir INSTALL_DIR
                        Install directory, default is <builddir>/opt/tritonserver. (default: None)
  --cmake-dir CMAKE_DIR
                        Directory containing the CMakeLists.txt file for Triton server. (default: None)
  --tmp-dir TMP_DIR     Temporary directory used for building inside docker. (default: /tmp)
  --build-type {Release,Debug,RelWithDebInfo,MinSizeRel}
                        Build type. (default: Release)
  -j BUILD_PARALLEL, --build-parallel BUILD_PARALLEL
                        Build parallelism. Defaults to 2 * number-of-cores. (default: None)
  --github-organization GITHUB_ORGANIZATION
                        The GitHub organization containing the repos used for the build. (default: https://github.com/triton-inference-server)
  --version VERSION     The Triton version. If not specified defaults to the value in the TRITON_VERSION file. (default: None)
  --container-version CONTAINER_VERSION
                        The Triton container version to build. If not specified the container version will be chosen automatically based on --version value. (default: None)
  --container-prebuild-command CONTAINER_PREBUILD_COMMAND
                        When performing a container build, this command will be executed within the container just before the build it performed. (default: None)
  --no-container-source
                        Do not include OSS source code in Docker container. (default: False)
  --image <image-name> <full-image-name>
                        Use specified Docker image in build. <image-name> can be "base", "gpu-base", or "pytorch". (default: [])
  --use-buildbase       Use local temporary "buildbase" Docker image as "base" image to build backends (default: False)
  --enable-all          Enable all standard released Triton features, backends, repository agents, caches, endpoints, and file systems. (default: False)
  --min-compute-capability MIN_COMPUTE_CAPABILITY
                        Minimum CUDA compute capability supported by server. (default: 6.0)
  --no-core-build       Do not build Triton core shared library or executable. (default: False)
  --no-force-clone      Do not create fresh clones of repos that have already been cloned. (default: False)
  --release-version RELEASE_VERSION
                        This flag sets the release version for Triton Inference Server to be built. Default: the latest released version. (default: 2.62.0dev)
  --triton-container-version TRITON_CONTAINER_VERSION
                        This flag sets the container version for Triton Inference Server to be built. Default: the latest released version. (default: 25.10dev)
  --upstream-container-version UPSTREAM_CONTAINER_VERSION
                        This flag sets the upstream container version for Triton Inference Server to be built. Default: the latest released version. (default: 25.08)
  --default-repo-tag DEFAULT_REPO_TAG
                        Override the calculated default-repo-tag value (default: None)
  --ort-version ORT_VERSION
                        This flag sets the ORT version for Triton Inference Server to be built. Default: the latest supported version. (default: 1.23.0)
  --ort-openvino-version ORT_OPENVINO_VERSION
                        This flag sets the OpenVino version for Triton Inference Server to be built. Default: the latest supported version. (default: 2025.3.0)
  --standalone-openvino-version STANDALONE_OPENVINO_VERSION
                        This flag sets the standalon OpenVino version for Triton Inference Server to be built. Default: the latest supported version. (default: 2025.3.0)
  --dcgm-version DCGM_VERSION
                        This flag sets the DCGM version for Triton Inference Server to be built. Default: the latest supported version. (default: 4.4.0-1)
  --vllm-version VLLM_VERSION
                        This flag sets the vLLM version for Triton Inference Server to be built. Default: the latest supported version. (default: 0.9.2)
  --rhel-py-version RHEL_PY_VERSION
                        This flag sets the Python version for RHEL platform of Triton Inference Server to be built. Default: the latest supported version. (default: 3.12.3)
  --build-secret key value
                        Add build secrets in the form of <key> <value>. These secrets are used during the build process for vllm. The secrets are passed to the Docker build step as `--secret id=<key>`. The following keys are expected and their purposes are described below: - 'req': A file
                        containing a list of dependencies for pip (e.g., requirements.txt). - 'build_public_vllm': A flag (default is 'true') indicating whether to build the public VLLM version. Ensure that the required environment variables for these secrets are set before running the build.
                        (default: [])

backend:
  Options to configure backends, including: ensemble, identity, square, repeat, onnxruntime, python, dali, pytorch, openvino, fil, tensorrt, ...

  --enable-backend [<backend> [<backend> ...] ...]
                        Enable requested backend(s) (default: [])
  --disable-backend [<backend> [<backend> ...] ...]
                        Disable requested backend(s) (remove from --enable-all standard list) (default: [])
  --backend-tag <backend> <tag>
                        Select <tag> for specified <backend>. If <tag> starts with "pull/" then it refers to a pull-request reference, otherwise <tag> indicates the git tag/branch to use for the build. If the version is non-development then the default <tag> is the release branch matching the
                        container version (e.g. version YY.MM -> branch rYY.MM); otherwise the default <tag> is "main" (e.g. version YY.MMdev -> branch main). (default: [])
  --backend-org <backend> <org>
                        Select <org> for specified <backend>, to use the fork of the corresponding repository from <org> instead of the default --github-organization value. (default: [])
  --extra-backend-cmake-arg <backend> <name> <value>
                        Extra CMake argument for backend build. The argument is passed to CMake as -D<name>=<value> and is included after all CMake arguments added by build.py. (default: [])
  --override-backend-cmake-arg <backend> <name> <value>
                        Override specified backend CMake argument in the backend build. The argument is passed to CMake as -D<name>=<value>. This flag only impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the backend build use --extra-backend-cmake-
                        arg. (default: [])

repoagent:
  Options to configure repoagents, including: checksum, ...

  --enable-repoagent [<repoagent> [<repoagent> ...] ...]
                        Enable requested repoagent(s) (default: [])
  --disable-repoagent [<repoagent> [<repoagent> ...] ...]
                        Disable requested repoagent(s) (remove from --enable-all standard list) (default: [])
  --repoagent-tag <repoagent> <tag>
                        Select <tag> for specified <repoagent>. If <tag> starts with "pull/" then it refers to a pull-request reference, otherwise <tag> indicates the git tag/branch to use for the build. If the version is non-development then the default <tag> is the release branch matching the
                        container version (e.g. version YY.MM -> branch rYY.MM); otherwise the default <tag> is "main" (e.g. version YY.MMdev -> branch main). (default: [])
  --repoagent-org <repoagent> <org>
                        Select <org> for specified <repoagent>, to use the fork of the corresponding repository from <org> instead of the default --github-organization value. (default: [])
  --extra-repoagent-cmake-arg <repoagent> <name> <value>
                        Extra CMake argument for repoagent build. The argument is passed to CMake as -D<name>=<value> and is included after all CMake arguments added by build.py. (default: [])
  --override-repoagent-cmake-arg <repoagent> <name> <value>
                        Override specified backend CMake argument in the repoagent build. The argument is passed to CMake as -D<name>=<value>. This flag only impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the repoagent build use --extra-repoagent-
                        cmake-arg. (default: [])

cache:
  Options to configure caches, including: local, redis, ...

  --enable-cache [<cache> [<cache> ...] ...]
                        Enable requested cache(s) (default: [])
  --disable-cache [<cache> [<cache> ...] ...]
                        Disable requested cache(s) (remove from --enable-all standard list) (default: [])
  --cache-tag <cache> <tag>
                        Select <tag> for specified <cache>. If <tag> starts with "pull/" then it refers to a pull-request reference, otherwise <tag> indicates the git tag/branch to use for the build. If the version is non-development then the default <tag> is the release branch matching the
                        container version (e.g. version YY.MM -> branch rYY.MM); otherwise the default <tag> is "main" (e.g. version YY.MMdev -> branch main). (default: [])
  --cache-org <cache> <org>
                        Select <org> for specified <cache>, to use the fork of the corresponding repository from <org> instead of the default --github-organization value. (default: [])
  --extra-cache-cmake-arg <cache> <name> <value>
                        Extra CMake argument for cache build. The argument is passed to CMake as -D<name>=<value> and is included after all CMake arguments added by build.py. (default: [])
  --override-cache-cmake-arg <cache> <name> <value>
                        Override specified backend CMake argument in the cache build. The argument is passed to CMake as -D<name>=<value>. This flag only impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the cache build use --extra-cache-cmake-arg.
                        (default: [])

filesystem:
  Options to configure filesystems, including: gcs, s3, azure_storage

  --enable-filesystem [<filesystem> [<filesystem> ...] ...]
                        Enable requested filesystem(s) (default: [])
  --disable-filesystem [<filesystem> [<filesystem> ...] ...]
                        Disable requested filesystem(s) (remove from --enable-all standard list) (default: [])

endpoint:
  Options to configure endpoints, including: http, grpc, sagemaker, vertex-ai

  --enable-endpoint [<endpoint> [<endpoint> ...] ...]
                        Enable requested endpoint(s) (default: [])
  --disable-endpoint [<endpoint> [<endpoint> ...] ...]
                        Disable requested endpoint(s) (remove from --enable-all standard list) (default: [])

feature:
  Options to configure features, including: logging, stats, metrics, gpu_metrics, cpu_metrics, tracing, nvtx, gpu, mali_gpu

  --enable-feature [<feature> [<feature> ...] ...]
                        Enable requested feature(s) (default: [])
  --disable-feature [<feature> [<feature> ...] ...]
                        Disable requested feature(s) (remove from --enable-all standard list) (default: [])

component:
  Options to configure components, including: common, core, backend, thirdparty

  --component-tag <component> <tag>
                        Select <tag> for specified <component>. If <tag> starts with "pull/" then it refers to a pull-request reference, otherwise <tag> indicates the git tag/branch to use for the build. If the version is non-development then the default <tag> is the release branch matching the
                        container version (e.g. version YY.MM -> branch rYY.MM); otherwise the default <tag> is "main" (e.g. version YY.MMdev -> branch main). (default: [])
  --extra-core-cmake-arg <name> <value>
                        Extra CMake argument for core build. The argument is passed to CMake as -D<name>=<value> and is included after all CMake arguments added by build.py. (default: [])
  --override-core-cmake-arg <name> <value>
                        Override specified backend CMake argument in the core build. The argument is passed to CMake as -D<name>=<value>. This flag only impacts CMake arguments that are used by build.py. To unconditionally add a CMake argument to the core build use --extra-core-cmake-arg.
                        (default: [])
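
For readers working out how `--enable-all`, `--enable-<element>`, and `--disable-<element>` might combine, here is a minimal sketch of one possible resolution order. It is an illustration only: the function name `resolve_enabled` is hypothetical, and the choice that a disabled name is removed even when it was also explicitly enabled is an assumption, since the help text only says that disabling removes names from the `--enable-all` standard list.

```python
def resolve_enabled(standard, enable_all, enabled, disabled):
    """Return the final list of element names to build.

    standard:   names that --enable-all would turn on
    enable_all: whether --enable-all was given
    enabled:    names passed to --enable-<element>
    disabled:   names passed to --disable-<element>
    """
    selected = list(standard) if enable_all else []
    for name in enabled:
        # Keep first-seen order and avoid duplicates.
        if name not in selected:
            selected.append(name)
    # Assumption: disabling wins over enabling.
    return [name for name in selected if name not in disabled]
```

Under these assumptions, `--enable-all --disable-endpoint sagemaker` would leave `http`, `grpc`, and `vertex-ai` from the standard endpoint list shown above.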

@whoisj whoisj requested a review from mc-nv October 9, 2025 20:11
whoisj left a comment

Not sure, but I think there might be an error in the examples in the docs.
