Merge branch 'pr_ds32_mtp' of https://github.com/rjg-lyh/vllm-ascend into pr_ds32_mtp

ZYang6263 · ZYang6263 · commit b8b22b9b26f0 · 2025-11-26T16:29:25.000+08:00
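
The ModelScope fallback that this commit documents in `installation.md` relies on a shell assignment written without spaces around `=`; with spaces, the shell passes `=` and `true` to `export` as separate arguments and the variable is never set. A minimal sketch, assuming bash; the helper name `use_modelscope_fallback` is hypothetical and not part of vllm-ascend:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of vllm-ascend): switch model downloads
# to ModelScope by setting the environment variable named in the docs.
use_modelscope_fallback() {
    # No spaces around '=': this is a single NAME=value argument to export.
    export VLLM_USE_MODELSCOPE=true
}

use_modelscope_fallback
echo "VLLM_USE_MODELSCOPE=${VLLM_USE_MODELSCOPE}"

# The documented steps then continue with:
#   pip install modelscope
#   python example.py
```

The install and run steps are left as comments so the sketch only demonstrates the assignment itself.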
diff --git a/docs/source/installation.md b/docs/source/installation.md
@@ -261,8 +261,14 @@ for output in outputs:
 Then run:
 
 ```bash
-# Try `export VLLM_USE_MODELSCOPE=true` and `pip install modelscope`
-# to speed up download if huggingface is not reachable.
+python example.py
+```
+
+If you encounter a connection error with Hugging Face (e.g., `We couldn't connect to 'https://huggingface.co' to load the files, and couldn't find them in the cached files.`), run the following commands to use ModelScope as an alternative:
+
+```bash
+export VLLM_USE_MODELSCOPE=true
+pip install modelscope
 python example.py
 ```
 
diff --git a/docs/source/tutorials/multi_node_kimi.md b/docs/source/tutorials/multi_node_kimi.md
@@ -5,7 +5,7 @@
 Refer to [multi_node.md](https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_node.html#verification-process).
 
 ## Run with Docker
-Assume you have two Atlas 800 A3 (64G*16)  or four A2 nodes, and want to deploy the `Kimi-K2-Instruct-W8A8` quantitative model across multiple nodes.
+Assume you have two Atlas 800 A3 (64G*16) or four A2 nodes, and want to deploy the `Kimi-K2-Instruct-W8A8` quantized model across multiple nodes.
 
 ```{code-block} bash
    :substitutions:
diff --git a/docs/source/user_guide/feature_guide/kv_pool_mooncake.md b/docs/source/user_guide/feature_guide/kv_pool_mooncake.md
@@ -21,10 +21,10 @@
     Also, you need to set environment variables to point to them `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib64/python3.11/site-packages/mooncake`, or copy the .so files to the `/usr/local/lib64` directory after compilation
 
 ### KV Pooling Parameter Description
-**kv_connector_extra_config**:Additional Configurable Parameters for Pooling.  
-**mooncake_rpc_port**:Port for RPC Communication Between Pooling Scheduler Process and Worker Process: Each Instance Requires a Unique Port Configuration.  
-**load_async**:Whether to Enable Asynchronous Loading. The default value is false.  
-**register_buffer**:Whether to Register Video Memory with the Backend. Registration is Not Required When Used with MooncakeConnectorV1; It is Required in All Other Cases. The Default Value is false.
+**kv_connector_extra_config**: Additional configurable parameters for pooling.  
+**mooncake_rpc_port**: Port for RPC communication between the pooling scheduler process and the worker process. Each instance requires a unique port.  
+**load_async**: Whether to enable asynchronous loading. The default value is false.  
+**register_buffer**: Whether to register video memory with the backend. Registration is not required when used with MooncakeConnectorV1; it is required in all other cases. The default value is false.
 
 ## Run Mooncake Master