-
Notifications
You must be signed in to change notification settings - Fork 714
feat: allow dynamic port in scripts in preparation for parallel testing (part 1) #4546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: allow dynamic port in scripts in preparation for parallel testing (part 1) #4546
Conversation
b0a27eb to
d3195ac
Compare
… (part 1)
Remove hardcoded --http-port=8000 flags from launch scripts. Frontend now
accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000).
Add DYN_SYSTEM_PORT fallback syntax ${DYN_SYSTEM_PORT:-8081} for worker ports.
Multi-worker scenarios use distinct env vars (DYN_SYSTEM_PORT_PREFILL,
DYN_SYSTEM_PORT_DECODE, DYN_SYSTEM_PORT_WORKER1, etc).
This enables flexible port configuration in preparation for parallel testing
while maintaining backward compatibility.
Related: DIS-1022
Signed-off-by: Keiven Chang <[email protected]>
d3195ac to
52f645e
Compare
WalkthroughThis pull request refactors shell scripts across multiple backend deployment configurations (sglang, trtllm, vllm) and test utilities to replace hardcoded HTTP and system port flags with environment-variable-driven configurations, enabling runtime port customization via defaults and environment variables. Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~12 minutes Rationale: This is a highly homogeneous refactor applied consistently across 40+ files with an identical pattern (remove hardcoded port flags, add environment-variable defaults, document changes). Verification requires:
All changes are cosmetic/configuration-level with no new business logic, error handling, or complexity. Poem
Pre-merge checks✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
examples/backends/trtllm/launch/disagg_same_gpu.sh (1)
68-76: Critical: Decode worker must be backgrounded and its PID must be captured for cleanup.The decode worker is missing the
&background operator at line 76, which causes the script to block indefinitely waiting for the process to complete instead of running prefill and decode workers in parallel. Additionally, the decode worker's PID is not captured, so it won't be properly cleaned up by the trap handler.Compare with the prefill worker (line 64), which correctly uses
&and captures its PID.Apply this diff to background the decode worker and capture its PID:
# run decode worker (shares GPU with prefill) CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES \ DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT2:-8082} \ python3 -m dynamo.trtllm \ --model-path "$MODEL_PATH" \ --served-model-name "$SERVED_MODEL_NAME" \ --extra-engine-args "$DECODE_ENGINE_ARGS" \ --modality "$MODALITY" \ --publish-events-and-metrics \ - --disaggregation-mode decode + --disaggregation-mode decode & +DECODE_PID=$!Then update the cleanup handler to include the decode worker:
cleanup() { echo "Cleaning up background processes..." - kill $DYNAMO_PID $PREFILL_PID 2>/dev/null || true - wait $DYNAMO_PID $PREFILL_PID 2>/dev/null || true + kill $DYNAMO_PID $PREFILL_PID $DECODE_PID 2>/dev/null || true + wait $DYNAMO_PID $PREFILL_PID $DECODE_PID 2>/dev/null || true echo "Cleanup complete." }examples/backends/trtllm/launch/gpt_oss_disagg.sh (1)
22-33: SetDYN_SYSTEM_PORTfor each disaggregated worker to prevent port collisions.The prefill (lines 22–33) and decode (lines 36–46) workers lack explicit
DYN_SYSTEM_PORTenvironment variable configuration. Verification across all disaggregated test patterns—etcd_ha, fault_tolerance, and cancellation—shows that distinct ports must be assigned per worker to avoid collisions:# tests/fault_tolerance/etcd_ha/test_trtllm.py, line 92 env["DYN_SYSTEM_PORT"] = port # per-worker assignment # tests/fault_tolerance/migration/test_vllm.py, line 55 env["DYN_SYSTEM_PORT"] = f"808{worker_id[-1]}" # per-worker numberingNo automatic port reservation mechanism exists in trtllm (unlike SGLang's
_reserve_disaggregation_bootstrap_port()). Without explicit configuration, both workers will attempt to use the same default port, causing startup failures or metrics conflicts.Add
DYN_SYSTEM_PORTassignments before each worker launch:
- Prefill worker: assign a distinct port (e.g., 9345)
- Decode worker: assign a different port (e.g., 9346)
🧹 Nitpick comments (4)
examples/basics/multinode/trtllm/start_frontend_services.sh (1)
15-17: Improve comment clarity about port configuration precedence.The comment explains the options but could explicitly document the fallback chain for future maintainers. Consider clarifying the precedence: CLI flag > environment variable > default.
-# dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000) +# dynamo.frontend port configuration precedence: --http-port flag > DYN_HTTP_PORT env var > default 8000examples/backends/trtllm/launch/disagg_same_gpu.sh (1)
51-52: Explicitly exportDYN_HTTP_PORTfor consistency with worker configuration pattern.The frontend correctly handles the
DYN_HTTP_PORTenvironment variable with a default of 8000 (verified incomponents/src/dynamo/frontend/main.pyline 95). However, to match the pattern used for worker configuration (lines 57, 69), consider explicitly exporting the variable:DYN_HTTP_PORT=${DYN_HTTP_PORT:-8000} \ python3 -m dynamo.frontend --router-mode kv &This makes the default port configuration explicit in the script rather than implicit in the application code, improving consistency and readability.
examples/backends/vllm/launch/agg_multimodal_epd.sh (1)
67-67: Consider making environment variable names explicit in comment.The comment helpfully explains the frontend port behavior, but consider also documenting the exact environment variable names (e.g.,
DYN_HTTP_PORT) and the fallback logic more explicitly to aid future maintainers and parallel test configuration.-# dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000) +# dynamo.frontend accepts either --http-port CLI flag or DYN_HTTP_PORT env var +# If neither is provided, it defaults to port 8000examples/backends/trtllm/launch/disagg_router.sh (1)
25-25: Comment clarity: Specify how the application receives the port configuration.The comment states that
dynamo.frontendaccepts either--http-portflag orDYN_HTTP_PORTenv var, but it doesn't clarify whether the application must be coded to readDYN_HTTP_PORTat runtime, or if this script is expected to explicitly set/export the variable.Consider enhancing the comment to be more explicit:
-# dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000) +# dynamo.frontend accepts either --http-port flag or DYN_HTTP_PORT env var (defaults to 8000). +# If DYN_HTTP_PORT is not set, the application defaults to 8000. +# To customize the port at runtime, set: export DYN_HTTP_PORT=<port_number>Alternatively, if the intent is to allow port override via environment in the current invocation, explicitly set it:
+export DYN_HTTP_PORT=${DYN_HTTP_PORT:-8000} python3 -m dynamo.frontend --router-mode kv &
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (40)
examples/backends/sglang/launch/agg.sh(1 hunks)examples/backends/sglang/launch/agg_embed.sh(1 hunks)examples/backends/sglang/launch/agg_router.sh(2 hunks)examples/backends/sglang/launch/disagg.sh(2 hunks)examples/backends/sglang/launch/disagg_router.sh(5 hunks)examples/backends/sglang/launch/disagg_same_gpu.sh(2 hunks)examples/backends/sglang/launch/multimodal_agg.sh(1 hunks)examples/backends/sglang/launch/multimodal_disagg.sh(1 hunks)examples/backends/trtllm/launch/agg.sh(1 hunks)examples/backends/trtllm/launch/agg_metrics.sh(1 hunks)examples/backends/trtllm/launch/agg_router.sh(1 hunks)examples/backends/trtllm/launch/disagg.sh(1 hunks)examples/backends/trtllm/launch/disagg_multimodal.sh(1 hunks)examples/backends/trtllm/launch/disagg_router.sh(1 hunks)examples/backends/trtllm/launch/disagg_same_gpu.sh(2 hunks)examples/backends/trtllm/launch/epd_disagg.sh(1 hunks)examples/backends/trtllm/launch/gpt_oss_disagg.sh(1 hunks)examples/backends/trtllm/performance_sweeps/scripts/start_frontend.sh(1 hunks)examples/backends/vllm/launch/agg.sh(1 hunks)examples/backends/vllm/launch/agg_kvbm.sh(1 hunks)examples/backends/vllm/launch/agg_kvbm_router.sh(1 hunks)examples/backends/vllm/launch/agg_lmcache.sh(1 hunks)examples/backends/vllm/launch/agg_multimodal.sh(2 hunks)examples/backends/vllm/launch/agg_multimodal_epd.sh(1 hunks)examples/backends/vllm/launch/agg_multimodal_llama.sh(1 hunks)examples/backends/vllm/launch/agg_request_planes.sh(1 hunks)examples/backends/vllm/launch/agg_router.sh(1 hunks)examples/backends/vllm/launch/dep.sh(1 hunks)examples/backends/vllm/launch/disagg.sh(1 hunks)examples/backends/vllm/launch/disagg_kvbm.sh(1 hunks)examples/backends/vllm/launch/disagg_kvbm_2p2d.sh(1 hunks)examples/backends/vllm/launch/disagg_kvbm_router.sh(1 hunks)examples/backends/vllm/launch/disagg_lmcache.sh(1 hunks)examples/backends/vllm/launch/disagg_multimodal_epd.sh(1 hunks)examples/backends/vllm/launch/disagg_multimodal_llama.sh(1 hunks)examples/backends/vllm/launch/disagg_router.sh(1 hunks)examples/backends/vllm/launch/disagg_same_gpu.sh(2 hunks)examples/backends/vllm/launch/dsr1_dep.sh(1 hunks)examples/basics/multinode/trtllm/start_frontend_services.sh(1 hunks)tests/serve/launch/template_verifier.sh(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 3035
File: lib/runtime/src/metrics/prometheus_names.rs:49-53
Timestamp: 2025-09-16T00:26:37.092Z
Learning: keivenchang prefers consistency in metric naming standardization over strict adherence to Prometheus conventions about gauge vs counter suffixes. When standardizing metrics naming, prioritize consistency across the codebase rather than technical pedantry about individual metric type conventions.
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 3051
File: container/templates/Dockerfile.trtllm.j2:424-437
Timestamp: 2025-09-16T17:16:03.785Z
Learning: keivenchang prioritizes maintaining exact backward compatibility during migration/refactoring PRs, even when bugs are identified in the original code. Fixes should be deferred to separate PRs after the migration is complete.
📚 Learning: 2025-10-28T05:48:37.621Z
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 3634
File: components/src/dynamo/vllm/multimodal_utils/model.py:39-42
Timestamp: 2025-10-28T05:48:37.621Z
Learning: In components/src/dynamo/vllm/multimodal_utils/model.py, the AutoModel.from_pretrained call with trust_remote_code=True in the load_vision_model function is intentional and expected for the vLLM multimodal implementation.
Applied to files:
examples/backends/vllm/launch/agg_multimodal.sh
📚 Learning: 2025-06-05T01:46:15.509Z
Learnt from: GuanLuo
Repo: ai-dynamo/dynamo PR: 1371
File: examples/llm/benchmarks/vllm_multinode_setup.sh:18-25
Timestamp: 2025-06-05T01:46:15.509Z
Learning: In multi-node setups with head/worker architecture, the head node typically doesn't need environment variables pointing to its own services (like NATS_SERVER, ETCD_ENDPOINTS) because local processes can access them via localhost. Only worker nodes need these environment variables to connect to the head node's external IP address.
Applied to files:
examples/backends/vllm/launch/disagg_multimodal_llama.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: deploy-test-fault-tolerance (vllm)
🔇 Additional comments (53)
examples/backends/vllm/launch/disagg_multimodal_epd.sh (1)
73-76: Verify environment variable support indynamo.frontend.The change removes the explicit
--http-port=8000flag to enable dynamic port configuration via theDYN_HTTP_PORTenvironment variable. While the comment accurately describes the intended behavior, I need to confirm that thedynamo.frontendmodule actually implements this environment variable with a proper default fallback to 8000 when unset.To verify this change is properly implemented, please confirm:
- The
dynamo.frontendmodule accepts and respects theDYN_HTTP_PORTenvironment variable- If
DYN_HTTP_PORTis not set, it defaults to port 8000- CLI flags (e.g.,
--http-port) continue to take precedence over the environment variable (backward compatibility)You can verify by checking the frontend module's entry point or startup code. If you'd like, I can generate a verification script to search for the environment variable handling logic.
examples/backends/vllm/launch/disagg_kvbm_router.sh (1)
13-16: ✓ Frontend port configuration change is clear and backward-compatible.Removing the hardcoded
--http-port 8000flag while adding the explanatory comment successfully enables environment-based port configuration. The documented behavior—acceptingDYN_HTTP_PORTor defaulting to 8000—preserves backward compatibility for callers not setting the environment variable, and the comment accurately conveys this to future maintainers.The worker configurations remain appropriately unchanged, which is correct since they use their own port configuration patterns (ZMQ ports, side-channel ports).
examples/backends/sglang/launch/multimodal_disagg.sh (2)
63-64: Frontend port configuration refactored correctly.The removal of the hardcoded
--http-port=8000flag and reliance on theDYN_HTTP_PORTenvironment variable (with default 8000) is the intended behavior per the PR objectives. The comment clearly documents the new configuration mechanism, and backward compatibility is preserved.
85-85: Verify if disaggregation bootstrap ports require dynamic configuration.The worker processes continue to use a hardcoded
--disaggregation-bootstrap-port 12345. For parallel testing scenarios, concurrent test runs may conflict on this shared port. Confirm whether these bootstrap ports need to be made dynamic (e.g., via environment variables likeDYN_SYSTEM_PORT_PREFILLandDYN_SYSTEM_PORT_DECODEmentioned in the PR objectives), or if they are scoped/isolated appropriately for parallel execution.As a follow-up, verify whether this change is intentional (deferred to a later PR or handled via other mechanisms) or if bootstrap port configuration needs to be updated as part of this PR's parallel testing enablement.
Also applies to: 100-100
examples/backends/vllm/launch/disagg_multimodal_llama.sh (1)
48-49: Frontend port configuration support verified.The change is correct. Verification confirms that
dynamo.frontend(components/src/dynamo/frontend/main.py) implements exactly what the comment describes:
- Accepts
--http-portCLI flag (line 93)- Reads
DYN_HTTP_PORTenvironment variable (line 95)- Defaults to port 8000 when neither is specified (line 95)
The removal of the hardcoded flag and reliance on environment variable or default aligns with the PR objective for dynamic port configuration and enables parallel testing scenarios.
examples/backends/vllm/launch/agg_kvbm.sh (2)
13-14: Clarify worker port configuration approach for this script.The PR objectives mention that worker ports should use environment-variable-based configuration (e.g.,
DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}), but the worker launch in this script doesn't include any port environment variables.Is this intentional for the KVBM-specific variant, or should the worker also be updated to accept dynamic ports?
8-9: All verification requirements confirmed - no issues found.The
dynamo.frontendmodule correctly supports theDYN_HTTP_PORTenvironment variable with the documented default of 8000. The implementation usesint(os.environ.get("DYN_HTTP_PORT", "8000")), and users can still override via the--http-portCLI flag, preserving backward compatibility. The code change is correct and safe.examples/backends/trtllm/performance_sweeps/scripts/start_frontend.sh (1)
22-25: Change verified as correct. Thedynamo.frontendmodule supports theDYN_HTTP_PORTenvironment variable with a default of 8000, and backward compatibility is preserved—the--http-portflag remains functional. The comment accurately describes the behavior, and this pattern is already in use elsewhere in the codebase (e.g.,examples/basics/multinode/trtllm/start_frontend_services.sh).examples/backends/vllm/launch/disagg_router.sh (2)
23-53: Worker ports remain hardcoded; verify if intentional for "part 1" scope.The vllm worker invocations still contain hardcoded port values:
VLLM_NIXL_SIDE_CHANNEL_PORTenvironment variables (20097, 20098, 20099 at lines 29, 39, 47)- KV event endpoint ports (20080, 20081, 20082, 20083 embedded in JSON strings at lines 27, 34, 45, 53)
Given the PR description mentions updating "40 launch scripts" with environment variables for "multi-worker scenarios," confirm whether:
- Worker port configuration should also use environment variables (e.g.,
DYN_SYSTEM_PORT_PREFILL_WORKER1, similar to what the PR summary describes), or- This file intentionally defers worker port changes to a subsequent phase.
If worker ports should be made dynamic, the pattern from the PR summary suggests:
VLLM_NIXL_SIDE_CHANNEL_PORT=${DYN_VLLM_SIDE_CHANNEL_PORT_PREFILL_WORKER1:-20097}And KV event endpoints within JSON should similarly use environment variable substitution or be externalized.
15-19: Verified: Frontend correctly respectsDYN_HTTP_PORTenvironment variable.The
dynamo.frontendcode properly implements support for theDYN_HTTP_PORTenvironment variable with a default of 8000, confirming that removing the explicit--http-port 8000flag from the script is the correct approach. The--http-portCLI flag remains supported for backward compatibility.examples/backends/vllm/launch/disagg_kvbm_2p2d.sh (1)
8-9: All verification checks passed; no issues found.The implementation has been confirmed:
- The
dynamo.frontendcode correctly implements support for both--http-portflag andDYN_HTTP_PORTenvironment variable with a default of 8000.- The comment on line 8 is accurate.
- Removal of the explicit
--http-port=8000flag is safe and correct; the default will be applied if neither the flag nor the environment variable is set.- This pattern is consistent across the broader codebase.
examples/basics/multinode/trtllm/start_frontend_services.sh (1)
15-17: Change verified—dynamo.frontendsupportsDYN_HTTP_PORTenvironment variable with correct default.The
--http-portargument incomponents/src/dynamo/frontend/main.pyreads from theDYN_HTTP_PORTenvironment variable and defaults to port 8000, confirming the shell script change maintains backward compatibility while adopting environment-variable-based configuration.examples/backends/sglang/launch/multimodal_agg.sh (1)
63-64: Port configuration change is consistent and maintains backward compatibility.Frontend now relies on environment-driven port selection (DYN_HTTP_PORT env var or default 8000) instead of hardcoded flag.
examples/backends/trtllm/launch/disagg.sh (1)
28-29: Port configuration removed from frontend invocation.Aligns with dynamic port preparation by relying on DYN_HTTP_PORT environment variable (default 8000).
examples/backends/vllm/launch/disagg_lmcache.sh (1)
8-9: Frontend port configuration delegated to environment variables.Clean change consistent with other scripts; --router-mode kv behavior preserved.
examples/backends/trtllm/launch/disagg_multimodal.sh (1)
26-27: Frontend port configuration externalized to environment.Change is consistent with PR objectives and maintains multimodal worker setup unchanged.
examples/backends/trtllm/launch/agg_metrics.sh (2)
22-23: Frontend port configuration externalized.Consistent pattern with environment-driven port selection.
27-27: Worker port configured via environment variable with sensible default.
DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}correctly sets the env var for the worker subprocess, defaulting to 8081 when unset. Enables runtime port customization for parallel testing.examples/backends/vllm/launch/agg_multimodal.sh (2)
48-49: Frontend port configuration moved to environment.Properly documents dynamic port behavior; TCP transport config and multimodal setup remain intact.
63-63: Worker port configuration parameterized with environment variable.
DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}enables dynamic port allocation while preserving all vLLM worker configurations (--enforce-eager, --connector none).examples/backends/sglang/launch/agg.sh (2)
47-49: Frontend port configuration externalized; OTEL service naming preserved.Comment documents dynamic port behavior; OTEL_SERVICE_NAME environment setup correctly positioned before frontend invocation.
52-61: Worker port parameterized with environment variable and runtime flags expanded.
DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}enables dynamic port allocation. Runtime flags (--page-size, --tp, --trust-remote-code, --skip-tokenizer-init, --enable-metrics) are orthogonal improvements that enable metrics collection on the configured port.examples/backends/sglang/launch/agg_embed.sh (2)
47-49: Frontend port configuration externalized with OTEL service naming preserved.Consistent pattern; OTEL configuration correctly ordered before frontend invocation.
53-53: Worker port and OTEL service name configured for embedding worker.
DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}correctly combined withOTEL_SERVICE_NAMEfor environment setup; embedding-specific flags preserved.examples/backends/vllm/launch/agg_lmcache.sh (1)
8-12: Environment-variable-driven port configuration is properly implemented.The shell syntax for variable expansion (
${DYN_SYSTEM_PORT:-8081}) is correct, and the comment accurately describes the intended port selection behavior. The changes maintain backward compatibility by preserving CLI flag support.However, please confirm that the underlying Python implementations (
dynamo.frontendanddynamo.vllm) properly support:
- The
DYN_HTTP_PORTenvironment variable for the frontend (with 8000 as default)- The
DYN_SYSTEM_PORTenvironment variable for the workerYou can verify this by checking the argument parser or port-binding logic in those modules. If these features are not yet implemented, the environment variables will be silently ignored and the processes may fail to bind or use unexpected ports.
examples/backends/sglang/launch/disagg_same_gpu.sh (1)
40-46: Multi-worker port configuration follows a clear, consistent pattern.The use of numbered environment variables (
DYN_SYSTEM_PORT1,DYN_SYSTEM_PORT2) provides unambiguous port assignment for the prefill and decode workers. The defaults (8081, 8082) are sequential and easy to track.Confirm that the worker processes (
dynamo.sglangin disaggregation modes) respect theDYN_SYSTEM_PORTenvironment variable for port binding.Also applies to: 75-76
examples/backends/sglang/launch/disagg.sh (1)
47-68: Semantic environment-variable naming improves readability.Using role-based names (
DYN_SYSTEM_PORT_PREFILL,DYN_SYSTEM_PORT_DECODE) instead of numeric indices makes the configuration more self-documenting and easier to understand at a glance. The defaults (8081, 8082) are consistent with other scripts.Please verify that
dynamo.sglangworkers in both prefill and decode modes respect theDYN_SYSTEM_PORTenvironment variable for metrics/telemetry port binding.examples/backends/sglang/launch/disagg_router.sh (1)
48-111: Hierarchical environment-variable naming scales well for complex multi-worker scenarios.The naming scheme (
DYN_SYSTEM_PORT_PREFILL_ROUTER,DYN_SYSTEM_PORT_PREFILL_WORKER1, etc.) is hierarchical, clear, and supports future expansion. The sequential port defaults (8081–8085) prevent conflicts.Verify that all worker types (
dynamo.router,dynamo.sglangin various disaggregation modes) properly bind to the ports specified in their respectiveDYN_SYSTEM_PORT_*environment variables. Given the complexity here, consider testing a full startup to ensure no port collisions or binding failures occur.examples/backends/sglang/launch/agg_router.sh (1)
47-64: Two-worker port configuration follows the established pattern consistently.Environment variables
DYN_SYSTEM_PORT_WORKER1andDYN_SYSTEM_PORT_WORKER2with defaults (8081, 8082) provide clear, unambiguous port assignment. The pattern aligns with other multi-worker scripts in this PR.examples/backends/trtllm/launch/agg.sh (1)
25-26: Frontend port configuration aligns with the PR's environment-driven approach.Removing the hardcoded
--http-port=8000flag and relying onDYN_HTTP_PORT(or a built-in default) is consistent with changes across other backend launch scripts.Confirm that
dynamo.frontendwithout any explicit--http-portflag will:
- Check for
DYN_HTTP_PORTenvironment variable- Fall back to 8000 if the variable is unset
tests/serve/launch/template_verifier.sh (1)
20-21: Test utility correctly adopts environment-driven port configuration.The pattern is consistent with production launch scripts. The frontend will use
DYN_HTTP_PORTor default to 8000.examples/backends/trtllm/launch/epd_disagg.sh (1)
32-33: Frontend startup correctly removes hardcoded port flag.This change is consistent with the broader port-configuration refactor across all backend launch scripts in this PR.
Confirm that the three worker processes (encode, prefill, decode) either:
- Use their own environment-variable-based port configuration (if modified in the source), or
- Use default ports that will not conflict with the frontend or each other
Without explicit port configuration for the workers in this script, verify there are no port-binding conflicts in actual deployments.
examples/backends/vllm/launch/agg_router.sh (1)
15-18: Backend port configuration change looks solid.Removing the hardcoded port and documenting the env var fallback pattern is clean and maintains backward compatibility. The frontend will use DYN_HTTP_PORT if set, otherwise defaults to 8000.
examples/backends/vllm/launch/dsr1_dep.sh (1)
85-88: Frontend port configuration aligns with PR pattern.The removal of
--http-port=8000and documentation of env var fallback is consistent across the backend launch scripts. Change preserves backward compatibility and enables runtime customization.examples/backends/vllm/launch/dep.sh (1)
8-9: Frontend port change follows established pattern.Removing hardcoded frontend port and relying on DYN_HTTP_PORT env var (or default 8000) is consistent with the broader PR pattern. The DP worker loop already uses dynamic port configuration, so the full stack is now env-driven.
examples/backends/vllm/launch/agg.sh (1)
8-14: Frontend and worker port configuration is correct and well-parameterized.The use of bash parameter expansion (
${DYN_SYSTEM_PORT:-8081}) allows external port configuration for parallel testing scenarios while maintaining a sensible default. Both frontend (env var or default 8000) and worker (env var or default 8081) now support dynamic port binding.examples/backends/vllm/launch/disagg.sh (2)
8-9: Frontend port configuration is consistent.The removal of hardcoded frontend port and reliance on DYN_HTTP_PORT env var matches the pattern from other launch scripts.
12-19: Worker ports should be parameterized for multi-worker scenario.The AI summary and PR objectives mention support for distinct env vars (
DYN_SYSTEM_PORT_PREFILL,DYN_SYSTEM_PORT_DECODE, etc.) for multi-worker scenarios. However, the worker invocations in this file do not show env var parameterization like the pattern inagg.sh(line 13:DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}).For consistency and to enable parallel testing with distinct ports for prefill and decode workers, these should likely be configured. For example:
DYN_SYSTEM_PORT_PREFILL=${DYN_SYSTEM_PORT_PREFILL:-8081} \ CUDA_VISIBLE_DEVICES=0 python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager & DYN_SYSTEM_PORT_DECODE=${DYN_SYSTEM_PORT_DECODE:-8082} \ DYN_VLLM_KV_EVENT_PORT=20081 \ VLLM_NIXL_SIDE_CHANNEL_PORT=20097 \ CUDA_VISIBLE_DEVICES=1 python3 -m dynamo.vllm ...Please confirm whether worker port parameterization is intentionally omitted from
disagg.shor if this is a missing implementation per the PR objectives for multi-worker scenarios.examples/backends/trtllm/launch/agg_router.sh (2)
27-31: Review comment is based on incorrect assumptions about file content.The file is exactly 31 lines—the code snippet is complete and not truncated. The worker configuration (lines 27-31) shows a single worker setup with no
DYN_SYSTEM_PORT_WORKER1,DYN_SYSTEM_PORT_WORKER2, or port configuration flags. The file does not contain Worker 1/Worker 2 configurations as the original review expected. The code snippet accurately reflects what exists in the file.Likely an incorrect or invalid review comment.
21-24: No issues found — the frontend default port behavior is verified and correct.The removal of the explicit
--http-port=8000flag from line 23 is safe. Verification confirms thatdynamo.frontend(incomponents/src/dynamo/frontend/main.pyline 95) usesdefault=int(os.environ.get("DYN_HTTP_PORT", "8000")), meaning it defaults to port 8000 when neither the--http-portflag nor theDYN_HTTP_PORTenvironment variable is provided. The comment on line 22 accurately documents this behavior, and the change maintains backward compatibility.examples/backends/vllm/launch/agg_kvbm_router.sh (2)
1-41: Maintain backward compatibility and consistency across backends.Per the learnings from your prior work (PR 3051), ensure exact backward compatibility during this refactoring. Given this is one of 40 scripts being updated, verify that:
- All backend scripts (vllm, sglang, trtllm) follow the same port configuration pattern for the frontend
- Existing deployments that rely on the hardcoded port 8000 continue to work without modification
- No test fixtures or CI/CD workflows hardcode port assumptions that would break
Consider running a quick smoke test with and without
DYN_HTTP_PORTset to confirm both paths work.
14-17: The search found no references toDYN_HTTP_PORTin the codebase. This is concerning. Let me verify if the frontend module actually implements this environment variable handling:The web search found the general pattern that CLI arguments can be configured via environment variables using the DYN_ prefix with uppercase parameter names, but no specific confirmation that
DYN_HTTP_PORTis actually implemented. Let me check the frontend source code directly:Based on my verification, I found a critical issue: The review comment documents support for a
DYN_HTTP_PORTenvironment variable, but there is no evidence this feature actually exists in the codebase.Search results show:
- All tests explicitly pass
--http-portwith dynamic values- No references to
DYN_HTTP_PORTanywhere in the codebase- Documentation mentions the DYN_ prefix pattern for Kubernetes deployments, but provides no specific example for
--http-port- Official examples continue to show explicit
--http-port 8000usageThe change removes the hardcoded
--http-port 8000flag without confirming the environment variable fallback is implemented, creating a risk that the frontend will start with an unspecified port.Verify the frontend actually implements
DYN_HTTP_PORTenvironment variable handling before approving this change. Check the dynamo.frontend module's argument parsing code to confirm:
- The
--http-portparameter can be overridden by aDYN_HTTP_PORTenv var- The default behavior (port 8000) is preserved when neither is provided
- Existing tests that expect port 8000 still pass without the explicit flag
examples/backends/vllm/launch/agg_multimodal_epd.sh (1)
67-68: I need to examine the actual EPD script and compare it with multimodal patterns in other backends.Now let me retrieve the actual EPD script to verify the specific implementation:
Let me retrieve the actual file content to verify the worker port configuration:
#!/bin/bash cat examples/backends/vllm/launch/agg_multimodal_epd.sh | head -100Once I have the file content, I'll be able to definitively verify whether the worker processes need port configuration and provide an accurate review comment. Could you run this command so I can see the full script?
examples/backends/trtllm/launch/disagg_router.sh (1)
24-26: Let me search for source code documentation on port configuration:Based on my verification of the codebase and Dynamo documentation, I can now provide the final rewritten review comment:
Remove the review comment—the code changes are correct and the inconsistency is with the AI summary, not the implementation.
All CLI arguments can be converted to environment variables with a DYN_ prefix. The frontend can be launched with
--http-port 8000or theDYN_HTTP_PORTenvironment variable, which matches the code at lines 24-26 that removes the hardcoded--http-port=8000flag and relies on environment variable or application defaults.The original review comment flagged worker port environment variables (e.g.,
DYN_SYSTEM_PORT_PREFILL_WORKER1) as missing. However, no such variables exist in the Dynamo codebase or documentation. Worker port assignment is implicit—handled internally by the application through the distributed runtime framework and service registration. The AI summary overstated what was changed; the actual code changes are correct and complete.Likely an incorrect or invalid review comment.
examples/backends/vllm/launch/agg_multimodal_llama.sh (2)
11-12: All verification checks passed — no action required.The shell output confirms that
dynamo.frontendhas been properly updated:
DYN_HTTP_PORTis supported with a default of 8000 in the frontend code- Backward compatibility is preserved: existing deployments without
DYN_HTTP_PORTset will continue to work with the default port- The changes in lines 11-12 are safe and align with the PR's env-var-driven configuration goal
1-21: Configure port environments for worker processes to enable parallel execution.The workers in this script may conflict when running multiple instances simultaneously. The disagg variant (
disagg_multimodal_llama.sh) sets explicit port configuration:
VLLM_NIXL_SIDE_CHANNEL_PORT(differs per worker: 20097 vs 20098)--kv-events-configwith unique ZMQ endpoints (ports 20080 vs 20081)The agg script should similarly configure these to avoid port collisions. Reference
examples/backends/vllm/launch/disagg_multimodal_llama.shfor the port configuration pattern needed here (lines 15 and 18).examples/backends/vllm/launch/agg_request_planes.sh (2)
44-49: All verification concerns confirmed—no issues found.The refactoring is correct and follows established patterns throughout the codebase:
- Frontend accepts
DYN_HTTP_PORTenv var with default8000- Backend workers accept
DYN_SYSTEM_PORTenv var, defaulting to-1(disabled) but enabling with positive values like8081- The
${DYN_SYSTEM_PORT:-8081}pattern is consistently used across 30+ launch scripts in examples/backends/- The comment accurately reflects the module behavior
This refactoring maintains backward compatibility and aligns with the codebase's existing env var configuration patterns.
47-47: Port defaults are consistent across backends—no changes needed.Verification confirms that
DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}in vllmagg_request_planes.shaligns with the established pattern across all backends (vllm, sglang, trtllm). Multi-worker scenarios are already handled via distinct environment variables (DYN_SYSTEM_PORT_PREFILL,DYN_SYSTEM_PORT_DECODE, or numbered variants likeDYN_SYSTEM_PORT_WORKER1), keeping concerns separate and avoiding conflicts.examples/backends/trtllm/launch/gpt_oss_disagg.sh (1)
17-18: ✓ Frontend port configuration aligns with PR objectives.Removing the hardcoded
--http-port 8000flag and documenting the fallback behavior (DYN_HTTP_PORT env var with default 8000) is consistent with the PR goal of enabling dynamic port configuration. The comment on Line 17 clearly signals the expected behavior to future maintainers.examples/backends/vllm/launch/disagg_same_gpu.sh (3)
68-77: Prefill worker uses distinct env var following multi-worker pattern.The use of
DYN_SYSTEM_PORT_PREFILLwith default 8082 correctly differentiates the prefill worker from the decode worker, aligning with the PR's multi-worker support. This is intentional per the PR objectives.Verify that the prefill worker (
dynamo.vllm --is-prefill-worker) correctly readsDYN_SYSTEM_PORT_PREFILLfor its port configuration.
51-57: Decode worker port configuration looks correct.The use of
DYN_SYSTEM_PORT=${DYN_SYSTEM_PORT:-8081}with default 8081 is correct and maintains backward compatibility. However, verify thatdynamo.vllmmodule respects this environment variable.
45-46: Verify thatdynamo.frontendapplication handlesDYN_HTTP_PORTenvironment variable.The removal of the explicit
--http-port=8000flag assumes the frontend application has been updated to readDYN_HTTP_PORTand default to 8000. While the comment states this, the implementation change needs verification.examples/backends/vllm/launch/disagg_kvbm.sh (1)
8-9: No action needed—verified that frontend supports environment variable configuration with correct defaults.Frontend module correctly implements
DYN_HTTP_PORTenvironment variable support with a default fallback to port 8000, and the--http-portCLI flag remains functional for explicit override. The removal of the flag from this script is safe.
Overview:
Adds environment variable support for dynamic port configuration in preparation for parallel testing. Removes hardcoded --http-port=8000 flags from launch scripts, allowing frontend and workers to use configurable ports via environment variables while maintaining backward compatibility with CLI flags and default values.
Details:
Where should the reviewer start?
examples/backends/vllm/launch/agg.sh(typical single-worker pattern)examples/backends/sglang/launch/agg_router.sh(2 workers)Related Issues:
Relates to DIS-1022
/coderabbit profile chill
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.