@Fager10086 Fager10086 commented Nov 26, 2025

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message and fill in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds CI configurations for multi-node tests with a layerwise connector for DeepSeek-V3 and Qwen3 models. My review focuses on the correctness and consistency of these new YAML configuration files.

I've found a few issues:

  • The Qwen3-235B-W8A8-layerwise.yaml file seems to be an incomplete copy of the DeepSeek-V3 configuration. It's missing flags for setting execution modes (--enforce-eager for the producer and graph mode config for the consumer) and has an empty benchmarks section. These omissions will likely lead to incorrect or incomplete CI test runs.
  • The DeepSeek-V3-layerwise.yaml file contains a misleading comment at the top, describing a 4-node setup while the configuration is for 2 nodes.

I've provided suggestions to fix these inconsistencies and complete the configuration.
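Putting the suggestions below together, here is a rough sketch of how the corrected Qwen3-235B-W8A8-layerwise.yaml server arguments could look. This only recombines flags already quoted in this review, assuming the file mirrors DeepSeek-V3-layerwise.yaml; the surrounding YAML keys and any other flags in the actual file are omitted and may differ.

Producer (prefill node), eager mode:

        --data-parallel-size 2
        --data-parallel-size-local 2
        --tensor-parallel-size 8
        --seed 1024
        --enforce-eager

Consumer (decode node), torchair graph mode:

        --enable-expert-parallel
        --trust-remote-code
        --no-enable-prefix-caching
        --gpu-memory-utilization 0.9
        --additional-config '{\"torchair_graph_config\":{\"enabled\":true}}'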

Comment on lines +4 to +11
# Suppose we have **4 nodes** running a 2P1D setup (2 Prefillers + 1 Decoder):
# ┌───────────────┬───────────────┬───────────────┬───────────────┐
# │    node0      │    node1      │    node2      │    node3      │
# │  Prefiller #1 │  Prefiller #2 │    Decoder    │    Decoder    │
# └───────────────┴───────────────┴───────────────┴───────────────┘
# For the prefiller nodes. the hosts should be node0 and node1
# For the decoder nodes. we only have 1 decoder node(dp+tp+ep across node2 and node3. Where node3 is running with headless mode)
# So the prefiller_host_index is [0, 1], and the decoder_host_index is [2]
Contributor

high

The comment describes a 4-node setup, but the configuration below uses num_nodes: 2. This is misleading. Please update the comment to reflect the actual 2-node (1 prefiller, 1 decoder) setup being configured.

# Suppose we have **2 nodes** running a 1P1D setup (1 Prefiller + 1 Decoder):
#   ┌───────────────┬───────────────┐
#   │   node0       │   node1       │
#   │   Prefiller   │   Decoder     │
#   └───────────────┴───────────────┘
# For the prefiller node, the host is node0.
# For the decoder node, the host is node1.
# So the prefiller_host_index is [0], and the decoder_host_index is [1].
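As a rough illustration only, the 2-node layout above would presumably translate to the following config fields. The key names are taken from the comment itself; the exact schema of these CI config files is an assumption here and may differ.

# illustrative only — actual key names/structure may differ
num_nodes: 2
prefiller_host_index: [0]
decoder_host_index: [1]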

--data-parallel-size 2
--data-parallel-size-local 2
--tensor-parallel-size 8
--seed 1024
Contributor

high

The producer (prefill node) deployment is missing the --enforce-eager flag. In disaggregated prefill setups, it's common to run the prefill stage in eager mode for flexibility, while the decode stage runs in graph mode for performance. The DeepSeek-V3-layerwise.yaml configuration in this PR follows this pattern. For consistency and to ensure correct behavior, please add this flag.

        --seed 1024
        --enforce-eager

--enable-expert-parallel
--trust-remote-code
--no-enable-prefix-caching
--gpu-memory-utilization 0.9
Contributor

high

The consumer (decode node) deployment is missing the configuration to enable graph mode (--additional-config '{\"torchair_graph_config\":{\"enabled\":true}}'). This is inconsistent with the DeepSeek-V3-layerwise.yaml config, where the decoder is explicitly configured to run in graph mode for better performance. Please add this configuration for consistency and to leverage performance optimizations.

        --gpu-memory-utilization 0.9
        --additional-config '{\"torchair_graph_config\":{\"enabled\":true}}'

}
}
}'
benchmarks:
Contributor

high

The benchmarks section is empty. For the CI to run a meaningful test, it needs benchmark configuration, such as an accuracy test. This seems to be an oversight, as the DeepSeek-V3-layerwise.yaml file includes a benchmark configuration. Please add the appropriate benchmark configuration, similar to the other test file.

benchmarks:
  acc:
    case_type: accuracy
    dataset_path: vllm-ascend/gsm8k-lite
    request_conf: vllm_api_general_chat
    dataset_conf: gsm8k/gsm8k_gen_0_shot_cot_chat_prompt
    max_out_len: 4096
    batch_size: 512
    baseline: 95
    threshold: 5
