Skip to content

Conversation

@keivenchang
Copy link
Contributor

@keivenchang keivenchang commented Nov 22, 2025

Overview:

Adds environment variable support for dynamic port configuration in preparation for parallel testing. Removes hardcoded --http-port=8000 flags from launch scripts, allowing frontend and workers to use configurable ports via environment variables while maintaining backward compatibility with CLI flags and default values.

Details:

  • Frontend now accepts either --http-port CLI flag or DYN_HTTP_PORT env var (defaults to 8000)
  • Worker ports use fallback syntax: DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}
  • Multi-worker scenarios support distinct env vars (DYN_SYSTEM_PORT_PREFILL, DYN_SYSTEM_PORT_DECODE, DYN_SYSTEM_PORT_WORKER1, etc.)
  • Updated 40 launch scripts across vllm, sglang, and trtllm backends
  • Backward compatible: explicit CLI flags and default values still work

Where should the reviewer start?

  • Review comments in launch scripts: examples/backends/vllm/launch/agg.sh (typical single-worker pattern)
  • Multi-worker port handling: examples/backends/sglang/launch/agg_router.sh (2 workers)

Related Issues:

Relates to DIS-1022

/coderabbit profile chill

Summary by CodeRabbit

  • Chores
    • Refactored port configuration across deployment scripts to support environment variable overrides while maintaining default port behaviors.
    • Enhanced worker startup with additional runtime options for improved metrics, performance tuning, and code execution policies.

✏️ Tip: You can customize this high-level summary in your review settings.

@keivenchang keivenchang self-assigned this Nov 22, 2025
@github-actions github-actions bot added the feat label Nov 22, 2025
@keivenchang keivenchang changed the title feat: add environment variable support for dynamic port configuration… feat: add environment variable support for dynamic port (part 1), in preparation for parallel testing Nov 22, 2025
@keivenchang keivenchang changed the title feat: add environment variable support for dynamic port (part 1), in preparation for parallel testing feat: allow dynamic port in scripts in preparation for parallel testing (part 1) Nov 22, 2025
@keivenchang keivenchang force-pushed the keivenchang/DIS-1022__frontend-worker-dynamic-ports branch from b0a27eb to d3195ac Compare November 24, 2025 20:38
… (part 1)

Remove hardcoded --http-port=8000 flags from launch scripts. Frontend now
accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000).
Add DYN_SYSTEM_PORT fallback syntax ${DYN_SYSTEM_PORT:-8081} for worker ports.

Multi-worker scenarios use distinct env vars (DYN_SYSTEM_PORT_PREFILL,
DYN_SYSTEM_PORT_DECODE, DYN_SYSTEM_PORT_WORKER1, etc).

This enables flexible port configuration in preparation for parallel testing
while maintaining backward compatibility.

Related: DIS-1022
Signed-off-by: Keiven Chang <[email protected]>
@keivenchang keivenchang force-pushed the keivenchang/DIS-1022__frontend-worker-dynamic-ports branch from d3195ac to 52f645e Compare November 25, 2025 04:18
@keivenchang keivenchang marked this pull request as ready for review November 25, 2025 05:21
@keivenchang keivenchang requested review from a team as code owners November 25, 2025 05:21
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 25, 2025

Walkthrough

This pull request refactors shell scripts across multiple backend deployment configurations (sglang, trtllm, vllm) and test utilities to replace hardcoded HTTP and system port flags with environment-variable-driven configurations, enabling runtime port customization via defaults and environment variables.

Changes

Cohort / File(s) Summary
sglang backend launch scripts
examples/backends/sglang/launch/agg.sh, agg_embed.sh, agg_router.sh, disagg.sh, disagg_router.sh, disagg_same_gpu.sh, multimodal_agg.sh, multimodal_disagg.sh
Removed explicit --http-port=8000 flag from frontend invocations. Replaced hardcoded DYN_SYSTEM_PORT values with parameterized environment-variable defaults (e.g., ${DYN_SYSTEM_PORT:-8081}, ${DYN_SYSTEM_PORT_PREFILL:-8081}). Added clarifying comments about DYN_HTTP_PORT support.
trtllm backend launch scripts
examples/backends/trtllm/launch/agg.sh, agg_metrics.sh, agg_router.sh, disagg.sh, disagg_multimodal.sh, disagg_router.sh, disagg_same_gpu.sh, epd_disagg.sh, gpt_oss_disagg.sh
Removed --http-port 8000 or --http-port=8000 from frontend startup. Added documentation comments explaining DYN_HTTP_PORT environment variable support. Parameterized worker ports using bash parameter expansion with sensible defaults.
trtllm performance sweep script
examples/backends/trtllm/performance_sweeps/scripts/start_frontend.sh
Removed explicit --http-port 8000 flag; added comment documenting DYN_HTTP_PORT support with default 8000.
vllm backend launch scripts
examples/backends/vllm/launch/agg.sh, agg_kvbm.sh, agg_kvbm_router.sh, agg_lmcache.sh, agg_multimodal.sh, agg_multimodal_epd.sh, agg_multimodal_llama.sh, agg_request_planes.sh, agg_router.sh, dep.sh, disagg.sh, disagg_kvbm.sh, disagg_kvbm_2p2d.sh, disagg_kvbm_router.sh, disagg_lmcache.sh, disagg_multimodal_epd.sh, disagg_multimodal_llama.sh, disagg_router.sh, disagg_same_gpu.sh, dsr1_dep.sh
Replaced hardcoded port arguments (--http-port=8000) with environment-variable-driven defaults. Updated worker port configurations to use parameter expansion (e.g., ${DYN_SYSTEM_PORT:-8081}). Added comments clarifying port configuration via DYN_HTTP_PORT or flag.
Basics and test utilities
examples/basics/multinode/trtllm/start_frontend_services.sh, tests/serve/launch/template_verifier.sh
Removed explicit port flags from frontend invocations; added documentation about DYN_HTTP_PORT environment variable support with default port 8000.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~12 minutes

Rationale: This is a highly homogeneous refactor applied consistently across 40+ files with an identical pattern (remove hardcoded port flags, add environment-variable defaults, document changes). Verification requires:

  • Spot-checking a representative sample of scripts across different backends to confirm pattern consistency
  • Verifying environment variable naming conventions are sensible and non-conflicting
  • Confirming no inadvertent logic breakage from flag removal

All changes are cosmetic/configuration-level with no new business logic, error handling, or complexity.

Poem

🐰 A rabbit's ode to flexibility

Hardcoded ports once stood so tall,
But now they dance with env vars' call—
Default to eight-oh, or override with care,
Configuration flows through the air!
Parameters expand, and services stay free,
No more fixed ports, just defaults and... glee! 🎉

Pre-merge checks

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed Title clearly summarizes the main change: adding dynamic port configuration to scripts for parallel testing preparation.
Description check ✅ Passed Description includes all required template sections with detailed information about changes, specific file examples, and related issue reference.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
examples/backends/trtllm/launch/disagg_same_gpu.sh (1)

68-76: Critical: Decode worker must be backgrounded and its PID must be captured for cleanup.

The decode worker is missing the & background operator at line 76, which causes the script to block indefinitely waiting for the process to complete instead of running prefill and decode workers in parallel. Additionally, the decode worker's PID is not captured, so it won't be properly cleaned up by the trap handler.

Compare with the prefill worker (line 64), which correctly uses & and captures its PID.

Apply this diff to background the decode worker and capture its PID:

 # run decode worker (shares GPU with prefill)
 CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES \
 DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT2:-8082} \
 python3 -m dynamo.trtllm \
   --model-path "$MODEL_PATH" \
   --served-model-name "$SERVED_MODEL_NAME" \
   --extra-engine-args  "$DECODE_ENGINE_ARGS" \
   --modality "$MODALITY" \
   --publish-events-and-metrics \
-  --disaggregation-mode decode
+  --disaggregation-mode decode &
+DECODE_PID=$!

Then update the cleanup handler to include the decode worker:

 cleanup() {
     echo "Cleaning up background processes..."
-    kill $DYNAMO_PID $PREFILL_PID 2>/dev/null || true
-    wait $DYNAMO_PID $PREFILL_PID 2>/dev/null || true
+    kill $DYNAMO_PID $PREFILL_PID $DECODE_PID 2>/dev/null || true
+    wait $DYNAMO_PID $PREFILL_PID $DECODE_PID 2>/dev/null || true
     echo "Cleanup complete."
 }
examples/backends/trtllm/launch/gpt_oss_disagg.sh (1)

22-33: Set DYN_SYSTEM_PORT for each disaggregated worker to prevent port collisions.

The prefill (lines 22–33) and decode (lines 36–46) workers lack explicit DYN_SYSTEM_PORT environment variable configuration. Verification across all disaggregated test patterns—etcd_ha, fault_tolerance, and cancellation—shows that distinct ports must be assigned per worker to avoid collisions:

# tests/fault_tolerance/etcd_ha/test_trtllm.py, line 92
env["DYN_SYSTEM_PORT"] = port  # per-worker assignment

# tests/fault_tolerance/migration/test_vllm.py, line 55
env["DYN_SYSTEM_PORT"] = f"808{worker_id[-1]}"  # per-worker numbering

No automatic port reservation mechanism exists in trtllm (unlike SGLang's _reserve_disaggregation_bootstrap_port()). Without explicit configuration, both workers will attempt to use the same default port, causing startup failures or metrics conflicts.

Add DYN_SYSTEM_PORT assignments before each worker launch:

  • Prefill worker: assign a distinct port (e.g., 9345)
  • Decode worker: assign a different port (e.g., 9346)
🧹 Nitpick comments (4)
examples/basics/multinode/trtllm/start_frontend_services.sh (1)

15-17: Improve comment clarity about port configuration precedence.

The comment explains the options but could explicitly document the fallback chain for future maintainers. Consider clarifying the precedence: CLI flag > environment variable > default.

-# dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000)
+# dynamo.frontend port configuration precedence: --http-port flag > DYN_HTTP_PORT env var > default 8000
examples/backends/trtllm/launch/disagg_same_gpu.sh (1)

51-52: Explicitly export DYN_HTTP_PORT for consistency with worker configuration pattern.

The frontend correctly handles the DYN_HTTP_PORT environment variable with a default of 8000 (verified in components/src/dynamo/frontend/main.py line 95). However, to match the pattern used for worker configuration (lines 57, 69), consider explicitly exporting the variable:

DYN_HTTP_PORT=${DYN_HTTP_PORT:-8000} \
python3 -m dynamo.frontend --router-mode kv &

This makes the default port configuration explicit in the script rather than implicit in the application code, improving consistency and readability.

examples/backends/vllm/launch/agg_multimodal_epd.sh (1)

67-67: Consider making environment variable names explicit in comment.

The comment helpfully explains the frontend port behavior, but consider also documenting the exact environment variable names (e.g., DYN_HTTP_PORT) and the fallback logic more explicitly to aid future maintainers and parallel test configuration.

-# dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000)
+# dynamo.frontend accepts either --http-port CLI flag or DYN_HTTP_PORT env var
+# If neither is provided, it defaults to port 8000
examples/backends/trtllm/launch/disagg_router.sh (1)

25-25: Comment clarity: Specify how the application receives the port configuration.

The comment states that dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var, but it doesn't clarify whether the application must be coded to read DYN_HTTP_PORT at runtime, or if this script is expected to explicitly set/export the variable.

Consider enhancing the comment to be more explicit:

-# dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000)
+# dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000).
+# If DYN_HTTP_PORT is not set, the application defaults to 8000.
+# To customize the port at runtime, set: export DYN_HTTP_PORT=<port_number>

Alternatively, if the intent is to allow port override via environment in the current invocation, explicitly set it:

+export DYN_HTTP_PORT=${DYN_HTTP_PORT:-8000}
 python3 -m dynamo.frontend --router-mode kv &
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57c701f and 52f645e.

📒 Files selected for processing (40)
  • examples/backends/sglang/launch/agg.sh (1 hunks)
  • examples/backends/sglang/launch/agg_embed.sh (1 hunks)
  • examples/backends/sglang/launch/agg_router.sh (2 hunks)
  • examples/backends/sglang/launch/disagg.sh (2 hunks)
  • examples/backends/sglang/launch/disagg_router.sh (5 hunks)
  • examples/backends/sglang/launch/disagg_same_gpu.sh (2 hunks)
  • examples/backends/sglang/launch/multimodal_agg.sh (1 hunks)
  • examples/backends/sglang/launch/multimodal_disagg.sh (1 hunks)
  • examples/backends/trtllm/launch/agg.sh (1 hunks)
  • examples/backends/trtllm/launch/agg_metrics.sh (1 hunks)
  • examples/backends/trtllm/launch/agg_router.sh (1 hunks)
  • examples/backends/trtllm/launch/disagg.sh (1 hunks)
  • examples/backends/trtllm/launch/disagg_multimodal.sh (1 hunks)
  • examples/backends/trtllm/launch/disagg_router.sh (1 hunks)
  • examples/backends/trtllm/launch/disagg_same_gpu.sh (2 hunks)
  • examples/backends/trtllm/launch/epd_disagg.sh (1 hunks)
  • examples/backends/trtllm/launch/gpt_oss_disagg.sh (1 hunks)
  • examples/backends/trtllm/performance_sweeps/scripts/start_frontend.sh (1 hunks)
  • examples/backends/vllm/launch/agg.sh (1 hunks)
  • examples/backends/vllm/launch/agg_kvbm.sh (1 hunks)
  • examples/backends/vllm/launch/agg_kvbm_router.sh (1 hunks)
  • examples/backends/vllm/launch/agg_lmcache.sh (1 hunks)
  • examples/backends/vllm/launch/agg_multimodal.sh (2 hunks)
  • examples/backends/vllm/launch/agg_multimodal_epd.sh (1 hunks)
  • examples/backends/vllm/launch/agg_multimodal_llama.sh (1 hunks)
  • examples/backends/vllm/launch/agg_request_planes.sh (1 hunks)
  • examples/backends/vllm/launch/agg_router.sh (1 hunks)
  • examples/backends/vllm/launch/dep.sh (1 hunks)
  • examples/backends/vllm/launch/disagg.sh (1 hunks)
  • examples/backends/vllm/launch/disagg_kvbm.sh (1 hunks)
  • examples/backends/vllm/launch/disagg_kvbm_2p2d.sh (1 hunks)
  • examples/backends/vllm/launch/disagg_kvbm_router.sh (1 hunks)
  • examples/backends/vllm/launch/disagg_lmcache.sh (1 hunks)
  • examples/backends/vllm/launch/disagg_multimodal_epd.sh (1 hunks)
  • examples/backends/vllm/launch/disagg_multimodal_llama.sh (1 hunks)
  • examples/backends/vllm/launch/disagg_router.sh (1 hunks)
  • examples/backends/vllm/launch/disagg_same_gpu.sh (2 hunks)
  • examples/backends/vllm/launch/dsr1_dep.sh (1 hunks)
  • examples/basics/multinode/trtllm/start_frontend_services.sh (1 hunks)
  • tests/serve/launch/template_verifier.sh (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 3035
File: lib/runtime/src/metrics/prometheus_names.rs:49-53
Timestamp: 2025-09-16T00:26:37.092Z
Learning: keivenchang prefers consistency in metric naming standardization over strict adherence to Prometheus conventions about gauge vs counter suffixes. When standardizing metrics naming, prioritize consistency across the codebase rather than technical pedantry about individual metric type conventions.
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 3051
File: container/templates/Dockerfile.trtllm.j2:424-437
Timestamp: 2025-09-16T17:16:03.785Z
Learning: keivenchang prioritizes maintaining exact backward compatibility during migration/refactoring PRs, even when bugs are identified in the original code. Fixes should be deferred to separate PRs after the migration is complete.
📚 Learning: 2025-10-28T05:48:37.621Z
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 3634
File: components/src/dynamo/vllm/multimodal_utils/model.py:39-42
Timestamp: 2025-10-28T05:48:37.621Z
Learning: In components/src/dynamo/vllm/multimodal_utils/model.py, the AutoModel.from_pretrained call with trust_remote_code=True in the load_vision_model function is intentional and expected for the vLLM multimodal implementation.

Applied to files:

  • examples/backends/vllm/launch/agg_multimodal.sh
📚 Learning: 2025-06-05T01:46:15.509Z
Learnt from: GuanLuo
Repo: ai-dynamo/dynamo PR: 1371
File: examples/llm/benchmarks/vllm_multinode_setup.sh:18-25
Timestamp: 2025-06-05T01:46:15.509Z
Learning: In multi-node setups with head/worker architecture, the head node typically doesn't need environment variables pointing to its own services (like NATS_SERVER, ETCD_ENDPOINTS) because local processes can access them via localhost. Only worker nodes need these environment variables to connect to the head node's external IP address.

Applied to files:

  • examples/backends/vllm/launch/disagg_multimodal_llama.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: deploy-test-fault-tolerance (vllm)
🔇 Additional comments (53)
examples/backends/vllm/launch/disagg_multimodal_epd.sh (1)

73-76: Verify environment variable support in dynamo.frontend.

The change removes the explicit --http-port=8000 flag to enable dynamic port configuration via the DYN_HTTP_PORT environment variable. While the comment accurately describes the intended behavior, I need to confirm that the dynamo.frontend module actually implements this environment variable with a proper default fallback to 8000 when unset.

To verify this change is properly implemented, please confirm:

  1. The dynamo.frontend module accepts and respects the DYN_HTTP_PORT environment variable
  2. If DYN_HTTP_PORT is not set, it defaults to port 8000
  3. CLI flags (e.g., --http-port) continue to take precedence over the environment variable (backward compatibility)

You can verify by checking the frontend module's entry point or startup code. If you'd like, I can generate a verification script to search for the environment variable handling logic.

examples/backends/vllm/launch/disagg_kvbm_router.sh (1)

13-16: ✓ Frontend port configuration change is clear and backward-compatible.

Removing the hardcoded --http-port 8000 flag while adding the explanatory comment successfully enables environment-based port configuration. The documented behavior—accepting DYN_HTTP_PORT or defaulting to 8000—preserves backward compatibility for callers not setting the environment variable, and the comment accurately conveys this to future maintainers.

The worker configurations remain appropriately unchanged, which is correct since they use their own port configuration patterns (ZMQ ports, side-channel ports).

examples/backends/sglang/launch/multimodal_disagg.sh (2)

63-64: Frontend port configuration refactored correctly.

The removal of the hardcoded --http-port=8000 flag and reliance on the DYN_HTTP_PORT environment variable (with default 8000) is the intended behavior per the PR objectives. The comment clearly documents the new configuration mechanism, and backward compatibility is preserved.


85-85: Verify if disaggregation bootstrap ports require dynamic configuration.

The worker processes continue to use a hardcoded --disaggregation-bootstrap-port 12345. For parallel testing scenarios, concurrent test runs may conflict on this shared port. Confirm whether these bootstrap ports need to be made dynamic (e.g., via environment variables like DYN_SYSTEM_PORT_PREFILL and DYN_SYSTEM_PORT_DECODE mentioned in the PR objectives), or if they are scoped/isolated appropriately for parallel execution.

As a follow-up, verify whether this change is intentional (deferred to a later PR or handled via other mechanisms) or if bootstrap port configuration needs to be updated as part of this PR's parallel testing enablement.

Also applies to: 100-100

examples/backends/vllm/launch/disagg_multimodal_llama.sh (1)

48-49: Frontend port configuration support verified.

The change is correct. Verification confirms that dynamo.frontend (components/src/dynamo/frontend/main.py) implements exactly what the comment describes:

  • Accepts --http-port CLI flag (line 93)
  • Reads DYN_HTTP_PORT environment variable (line 95)
  • Defaults to port 8000 when neither is specified (line 95)

The removal of the hardcoded flag and reliance on environment variable or default aligns with the PR objective for dynamic port configuration and enables parallel testing scenarios.

examples/backends/vllm/launch/agg_kvbm.sh (2)

13-14: Clarify worker port configuration approach for this script.

The PR objectives mention that worker ports should use environment-variable-based configuration (e.g., DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}), but the worker launch in this script doesn't include any port environment variables.

Is this intentional for the KVBM-specific variant, or should the worker also be updated to accept dynamic ports?


8-9: All verification requirements confirmed - no issues found.

The dynamo.frontend module correctly supports the DYN_HTTP_PORT environment variable with the documented default of 8000. The implementation uses int(os.environ.get("DYN_HTTP_PORT", "8000")), and users can still override via the --http-port CLI flag, preserving backward compatibility. The code change is correct and safe.

examples/backends/trtllm/performance_sweeps/scripts/start_frontend.sh (1)

22-25: Change verified as correct. The dynamo.frontend module supports the DYN_HTTP_PORT environment variable with a default of 8000, and backward compatibility is preserved—the --http-port flag remains functional. The comment accurately describes the behavior, and this pattern is already in use elsewhere in the codebase (e.g., examples/basics/multinode/trtllm/start_frontend_services.sh).

examples/backends/vllm/launch/disagg_router.sh (2)

23-53: Worker ports remain hardcoded; verify if intentional for "part 1" scope.

The vllm worker invocations still contain hardcoded port values:

  • VLLM_NIXL_SIDE_CHANNEL_PORT environment variables (20097, 20098, 20099 at lines 29, 39, 47)
  • KV event endpoint ports (20080, 20081, 20082, 20083 embedded in JSON strings at lines 27, 34, 45, 53)

Given the PR description mentions updating "40 launch scripts" with environment variables for "multi-worker scenarios," confirm whether:

  1. Worker port configuration should also use environment variables (e.g., DYN_SYSTEM_PORT_PREFILL_WORKER1, similar to what the PR summary describes), or
  2. This file intentionally defers worker port changes to a subsequent phase.

If worker ports should be made dynamic, the pattern from the PR summary suggests:

VLLM_NIXL_SIDE_CHANNEL_PORT=${DYN_VLLM_SIDE_CHANNEL_PORT_PREFILL_WORKER1:-20097}

And KV event endpoints within JSON should similarly use environment variable substitution or be externalized.


15-19: Verified: Frontend correctly respects DYN_HTTP_PORT environment variable.

The dynamo.frontend code properly implements support for the DYN_HTTP_PORT environment variable with a default of 8000, confirming that removing the explicit --http-port 8000 flag from the script is the correct approach. The --http-port CLI flag remains supported for backward compatibility.

examples/backends/vllm/launch/disagg_kvbm_2p2d.sh (1)

8-9: All verification checks passed; no issues found.

The implementation has been confirmed:

  • The dynamo.frontend code correctly implements support for both --http-port flag and DYN_HTTP_PORT environment variable with a default of 8000.
  • The comment on line 8 is accurate.
  • Removal of the explicit --http-port=8000 flag is safe and correct; the default will be applied if neither the flag nor the environment variable is set.
  • This pattern is consistent across the broader codebase.
examples/basics/multinode/trtllm/start_frontend_services.sh (1)

15-17: Change verified—dynamo.frontend supports DYN_HTTP_PORT environment variable with correct default.

The --http-port argument in components/src/dynamo/frontend/main.py reads from the DYN_HTTP_PORT environment variable and defaults to port 8000, confirming the shell script change maintains backward compatibility while adopting environment-variable-based configuration.

examples/backends/sglang/launch/multimodal_agg.sh (1)

63-64: Port configuration change is consistent and maintains backward compatibility.

Frontend now relies on environment-driven port selection (DYN_HTTP_PORT env var or default 8000) instead of hardcoded flag.

examples/backends/trtllm/launch/disagg.sh (1)

28-29: Port configuration removed from frontend invocation.

Aligns with dynamic port preparation by relying on DYN_HTTP_PORT environment variable (default 8000).

examples/backends/vllm/launch/disagg_lmcache.sh (1)

8-9: Frontend port configuration delegated to environment variables.

Clean change consistent with other scripts; --router-mode kv behavior preserved.

examples/backends/trtllm/launch/disagg_multimodal.sh (1)

26-27: Frontend port configuration externalized to environment.

Change is consistent with PR objectives and maintains multimodal worker setup unchanged.

examples/backends/trtllm/launch/agg_metrics.sh (2)

22-23: Frontend port configuration externalized.

Consistent pattern with environment-driven port selection.


27-27: Worker port configured via environment variable with sensible default.

DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081} correctly sets the env var for the worker subprocess, defaulting to 8081 when unset. Enables runtime port customization for parallel testing.

examples/backends/vllm/launch/agg_multimodal.sh (2)

48-49: Frontend port configuration moved to environment.

Properly documents dynamic port behavior; TCP transport config and multimodal setup remain intact.


63-63: Worker port configuration parameterized with environment variable.

DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081} enables dynamic port allocation while preserving all vLLM worker configurations (--enforce-eager, --connector none).

examples/backends/sglang/launch/agg.sh (2)

47-49: Frontend port configuration externalized; OTEL service naming preserved.

Comment documents dynamic port behavior; OTEL_SERVICE_NAME environment setup correctly positioned before frontend invocation.


52-61: Worker port parameterized with environment variable and runtime flags expanded.

DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081} enables dynamic port allocation. Runtime flags (--page-size, --tp, --trust-remote-code, --skip-tokenizer-init, --enable-metrics) are orthogonal improvements that enable metrics collection on the configured port.

examples/backends/sglang/launch/agg_embed.sh (2)

47-49: Frontend port configuration externalized with OTEL service naming preserved.

Consistent pattern; OTEL configuration correctly ordered before frontend invocation.


53-53: Worker port and OTEL service name configured for embedding worker.

DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081} correctly combined with OTEL_SERVICE_NAME for environment setup; embedding-specific flags preserved.

examples/backends/vllm/launch/agg_lmcache.sh (1)

8-12: Environment-variable-driven port configuration is properly implemented.

The shell syntax for variable expansion (${DYN_SYSTEM_PORT:-8081}) is correct, and the comment accurately describes the intended port selection behavior. The changes maintain backward compatibility by preserving CLI flag support.

However, please confirm that the underlying Python implementations (dynamo.frontend and dynamo.vllm) properly support:

  1. The DYN_HTTP_PORT environment variable for the frontend (with 8000 as default)
  2. The DYN_SYSTEM_PORT environment variable for the worker

You can verify this by checking the argument parser or port-binding logic in those modules. If these features are not yet implemented, the environment variables will be silently ignored and the processes may fail to bind or use unexpected ports.

examples/backends/sglang/launch/disagg_same_gpu.sh (1)

40-46: Multi-worker port configuration follows a clear, consistent pattern.

The use of numbered environment variables (DYN_SYSTEM_PORT1, DYN_SYSTEM_PORT2) provides unambiguous port assignment for the prefill and decode workers. The defaults (8081, 8082) are sequential and easy to track.

Confirm that the worker processes (dynamo.sglang in disaggregation modes) respect the DYN_SYSTEM_PORT environment variable for port binding.

Also applies to: 75-76

examples/backends/sglang/launch/disagg.sh (1)

47-68: Semantic environment-variable naming improves readability.

Using role-based names (DYN_SYSTEM_PORT_PREFILL, DYN_SYSTEM_PORT_DECODE) instead of numeric indices makes the configuration more self-documenting and easier to understand at a glance. The defaults (8081, 8082) are consistent with other scripts.

Please verify that dynamo.sglang workers in both prefill and decode modes respect the DYN_SYSTEM_PORT environment variable for metrics/telemetry port binding.

examples/backends/sglang/launch/disagg_router.sh (1)

48-111: Hierarchical environment-variable naming scales well for complex multi-worker scenarios.

The naming scheme (DYN_SYSTEM_PORT_PREFILL_ROUTER, DYN_SYSTEM_PORT_PREFILL_WORKER1, etc.) is hierarchical, clear, and supports future expansion. The sequential port defaults (8081–8085) prevent conflicts.

Verify that all worker types (dynamo.router, dynamo.sglang in various disaggregation modes) properly bind to the ports specified in their respective DYN_SYSTEM_PORT_* environment variables. Given the complexity here, consider testing a full startup to ensure no port collisions or binding failures occur.

examples/backends/sglang/launch/agg_router.sh (1)

47-64: Two-worker port configuration follows the established pattern consistently.

Environment variables DYN_SYSTEM_PORT_WORKER1 and DYN_SYSTEM_PORT_WORKER2 with defaults (8081, 8082) provide clear, unambiguous port assignment. The pattern aligns with other multi-worker scripts in this PR.

examples/backends/trtllm/launch/agg.sh (1)

25-26: Frontend port configuration aligns with the PR's environment-driven approach.

Removing the hardcoded --http-port=8000 flag and relying on DYN_HTTP_PORT (or a built-in default) is consistent with changes across other backend launch scripts.

Confirm that dynamo.frontend without any explicit --http-port flag will:

  1. Check for DYN_HTTP_PORT environment variable
  2. Fall back to 8000 if the variable is unset
tests/serve/launch/template_verifier.sh (1)

20-21: Test utility correctly adopts environment-driven port configuration.

The pattern is consistent with production launch scripts. The frontend will use DYN_HTTP_PORT or default to 8000.

examples/backends/trtllm/launch/epd_disagg.sh (1)

32-33: Frontend startup correctly removes hardcoded port flag.

This change is consistent with the broader port-configuration refactor across all backend launch scripts in this PR.

Confirm that the three worker processes (encode, prefill, decode) either:

  1. Use their own environment-variable-based port configuration (if modified in the source), or
  2. Use default ports that will not conflict with the frontend or each other

Without explicit port configuration for the workers in this script, verify there are no port-binding conflicts in actual deployments.

examples/backends/vllm/launch/agg_router.sh (1)

15-18: Backend port configuration change looks solid.

Removing the hardcoded port and documenting the env var fallback pattern is clean and maintains backward compatibility. The frontend will use DYN_HTTP_PORT if set, otherwise defaults to 8000.

examples/backends/vllm/launch/dsr1_dep.sh (1)

85-88: Frontend port configuration aligns with PR pattern.

The removal of --http-port=8000 and documentation of env var fallback is consistent across the backend launch scripts. Change preserves backward compatibility and enables runtime customization.

examples/backends/vllm/launch/dep.sh (1)

8-9: Frontend port change follows established pattern.

Removing hardcoded frontend port and relying on DYN_HTTP_PORT env var (or default 8000) is consistent with the broader PR pattern. The DP worker loop already uses dynamic port configuration, so the full stack is now env-driven.

examples/backends/vllm/launch/agg.sh (1)

8-14: Frontend and worker port configuration is correct and well-parameterized.

The use of bash parameter expansion (${DYN_SYSTEM_PORT:-8081}) allows external port configuration for parallel testing scenarios while maintaining a sensible default. Both frontend (env var or default 8000) and worker (env var or default 8081) now support dynamic port binding.

examples/backends/vllm/launch/disagg.sh (2)

8-9: Frontend port configuration is consistent.

The removal of hardcoded frontend port and reliance on DYN_HTTP_PORT env var matches the pattern from other launch scripts.


12-19: Worker ports should be parameterized for multi-worker scenario.

The AI summary and PR objectives mention support for distinct env vars (DYN_SYSTEM_PORT_PREFILL, DYN_SYSTEM_PORT_DECODE, etc.) for multi-worker scenarios. However, the worker invocations in this file do not show env var parameterization like the pattern in agg.sh (line 13: DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}).

For consistency and to enable parallel testing with distinct ports for prefill and decode workers, these should likely be configured. For example:

DYN_SYSTEM_PORT_PREFILL=${DYN_SYSTEM_PORT_PREFILL:-8081} \
CUDA_VISIBLE_DEVICES=0 python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager &

DYN_SYSTEM_PORT_DECODE=${DYN_SYSTEM_PORT_DECODE:-8082} \
DYN_VLLM_KV_EVENT_PORT=20081 \
VLLM_NIXL_SIDE_CHANNEL_PORT=20097 \
CUDA_VISIBLE_DEVICES=1 python3 -m dynamo.vllm ...

Please confirm whether worker port parameterization is intentionally omitted from disagg.sh or if this is a missing implementation per the PR objectives for multi-worker scenarios.

examples/backends/trtllm/launch/agg_router.sh (2)

27-31: Review comment is based on incorrect assumptions about file content.

The file is exactly 31 lines—the code snippet is complete and not truncated. The worker configuration (lines 27-31) shows a single worker setup with no DYN_SYSTEM_PORT_WORKER1, DYN_SYSTEM_PORT_WORKER2, or port configuration flags. The file does not contain Worker 1/Worker 2 configurations as the original review expected. The code snippet accurately reflects what exists in the file.

Likely an incorrect or invalid review comment.


21-24: No issues found — the frontend default port behavior is verified and correct.

The removal of the explicit --http-port=8000 flag from line 23 is safe. Verification confirms that dynamo.frontend (in components/src/dynamo/frontend/main.py line 95) uses default=int(os.environ.get("DYN_HTTP_PORT", "8000")), meaning it defaults to port 8000 when neither the --http-port flag nor the DYN_HTTP_PORT environment variable is provided. The comment on line 22 accurately documents this behavior, and the change maintains backward compatibility.

examples/backends/vllm/launch/agg_kvbm_router.sh (2)

1-41: Maintain backward compatibility and consistency across backends.

Per the learnings from your prior work (PR 3051), ensure exact backward compatibility during this refactoring. Given this is one of 40 scripts being updated, verify that:

  1. All backend scripts (vllm, sglang, trtllm) follow the same port configuration pattern for the frontend
  2. Existing deployments that rely on the hardcoded port 8000 continue to work without modification
  3. No test fixtures or CI/CD workflows hardcode port assumptions that would break

Consider running a quick smoke test with and without DYN_HTTP_PORT set to confirm both paths work.


14-17: The search found no references to DYN_HTTP_PORT in the codebase. This is concerning. Let me verify if the frontend module actually implements this environment variable handling:

The web search found the general pattern that CLI arguments can be configured via environment variables using the DYN_ prefix with uppercase parameter names, but no specific confirmation that DYN_HTTP_PORT is actually implemented. Let me check the frontend source code directly:

Based on my verification, I found a critical issue: The review comment documents support for a DYN_HTTP_PORT environment variable, but there is no evidence this feature actually exists in the codebase.

Search results show:

  • All tests explicitly pass --http-port with dynamic values
  • No references to DYN_HTTP_PORT anywhere in the codebase
  • Documentation mentions the DYN_ prefix pattern for Kubernetes deployments, but provides no specific example for --http-port
  • Official examples continue to show explicit --http-port 8000 usage

The change removes the hardcoded --http-port 8000 flag without confirming the environment variable fallback is implemented, creating a risk that the frontend will start with an unspecified port.

Verify the frontend actually implements DYN_HTTP_PORT environment variable handling before approving this change. Check the dynamo.frontend module's argument parsing code to confirm:

  1. The --http-port parameter can be overridden by a DYN_HTTP_PORT env var
  2. The default behavior (port 8000) is preserved when neither is provided
  3. Existing tests that expect port 8000 still pass without the explicit flag
examples/backends/vllm/launch/agg_multimodal_epd.sh (1)

67-68: I need to examine the actual EPD script and compare it with multimodal patterns in other backends.

Now let me retrieve the actual EPD script to verify the specific implementation:

Let me retrieve the actual file content to verify the worker port configuration:

#!/bin/bash
cat examples/backends/vllm/launch/agg_multimodal_epd.sh | head -100

Once I have the file content, I'll be able to definitively verify whether the worker processes need port configuration and provide an accurate review comment. Could you run this command so I can see the full script?

examples/backends/trtllm/launch/disagg_router.sh (1)

24-26: Let me search for source code documentation on port configuration:

Based on my verification of the codebase and Dynamo documentation, I can now provide the final rewritten review comment:


Remove the review comment—the code changes are correct and the inconsistency is with the AI summary, not the implementation.

All CLI arguments can be converted to environment variables with a DYN_ prefix. The frontend can be launched with --http-port 8000 or the DYN_HTTP_PORT environment variable, which matches the code at lines 24-26 that removes the hardcoded --http-port=8000 flag and relies on environment variable or application defaults.

The original review comment flagged worker port environment variables (e.g., DYN_SYSTEM_PORT_PREFILL_WORKER1) as missing. However, no such variables exist in the Dynamo codebase or documentation. Worker port assignment is implicit—handled internally by the application through the distributed runtime framework and service registration. The AI summary overstated what was changed; the actual code changes are correct and complete.

Likely an incorrect or invalid review comment.

examples/backends/vllm/launch/agg_multimodal_llama.sh (2)

11-12: All verification checks passed — no action required.

The shell output confirms that dynamo.frontend has been properly updated:

  • DYN_HTTP_PORT is supported with a default of 8000 in the frontend code
  • Backward compatibility is preserved: existing deployments without DYN_HTTP_PORT set will continue to work with the default port
  • The changes in lines 11-12 are safe and align with the PR's env-var-driven configuration goal

1-21: Configure port environments for worker processes to enable parallel execution.

The workers in this script may conflict when running multiple instances simultaneously. The disagg variant (disagg_multimodal_llama.sh) sets explicit port configuration:

  • VLLM_NIXL_SIDE_CHANNEL_PORT (differs per worker: 20097 vs 20098)
  • --kv-events-config with unique ZMQ endpoints (ports 20080 vs 20081)

The agg script should similarly configure these to avoid port collisions. Reference examples/backends/vllm/launch/disagg_multimodal_llama.sh for the port configuration pattern needed here (lines 15 and 18).

examples/backends/vllm/launch/agg_request_planes.sh (2)

44-49: All verification concerns confirmed—no issues found.

The refactoring is correct and follows established patterns throughout the codebase:

  • Frontend accepts DYN_HTTP_PORT env var with default 8000
  • Backend workers accept DYN_SYSTEM_PORT env var, defaulting to -1 (disabled) but enabling with positive values like 8081
  • The ${DYN_SYSTEM_PORT:-8081} pattern is consistently used across 30+ launch scripts in examples/backends/
  • The comment accurately reflects the module behavior

This refactoring maintains backward compatibility and aligns with the codebase's existing env var configuration patterns.


47-47: Port defaults are consistent across backends—no changes needed.

Verification confirms that DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081} in vllm agg_request_planes.sh aligns with the established pattern across all backends (vllm, sglang, trtllm). Multi-worker scenarios are already handled via distinct environment variables (DYN_SYSTEM_PORT_PREFILL, DYN_SYSTEM_PORT_DECODE, or numbered variants like DYN_SYSTEM_PORT_WORKER1), keeping concerns separate and avoiding conflicts.

examples/backends/trtllm/launch/gpt_oss_disagg.sh (1)

17-18: ✓ Frontend port configuration aligns with PR objectives.

Removing the hardcoded --http-port 8000 flag and documenting the fallback behavior (DYN_HTTP_PORT env var with default 8000) is consistent with the PR goal of enabling dynamic port configuration. The comment on Line 17 clearly signals the expected behavior to future maintainers.

examples/backends/vllm/launch/disagg_same_gpu.sh (3)

68-77: Prefill worker uses distinct env var following multi-worker pattern.

The use of DYN_SYSTEM_PORT_PREFILL with default 8082 correctly differentiates the prefill worker from the decode worker, aligning with the PR's multi-worker support. This is intentional per the PR objectives.

Verify that the prefill worker (dynamo.vllm --is-prefill-worker) correctly reads DYN_SYSTEM_PORT_PREFILL for its port configuration.


51-57: Decode worker port configuration looks correct.

The use of DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081} with default 8081 is correct and maintains backward compatibility. However, verify that dynamo.vllm module respects this environment variable.


45-46: Verify that dynamo.frontend application handles DYN_HTTP_PORT environment variable.

The removal of the explicit --http-port=8000 flag assumes the frontend application has been updated to read DYN_HTTP_PORT and default to 8000. While the comment states this, the implementation change needs verification.

examples/backends/vllm/launch/disagg_kvbm.sh (1)

8-9: No action needed—verified that frontend supports environment variable configuration with correct defaults.

Frontend module correctly implements DYN_HTTP_PORT environment variable support with a default fallback to port 8000, and the --http-port CLI flag remains functional for explicit override. The removal of the flag from this script is safe.

@keivenchang keivenchang merged commit a77558d into main Nov 25, 2025
32 of 33 checks passed
@keivenchang keivenchang deleted the keivenchang/DIS-1022__frontend-worker-dynamic-ports branch November 25, 2025 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants