Description
The model to consider.
https://ai.gitcode.com/ascend-tribe/openPangu-Ultra-MoE-718B-V1.1
The architecture of openPangu-Ultra-MoE-718B-V1.1 adopts mainstream Multi-head Latent Attention (MLA), Multi-Token Prediction (MTP), and high MoE sparsity, and features several distinctive designs:
- Depth-Scaled Sandwich-Norm and TinyInit: these techniques adjust the layer-normalization structure and the parameter initialization to improve training stability (see the first sketch after this list).
- EP-Group load-balancing loss: this technique computes the load-balancing loss at the expert-parallel (EP) group level, achieving better expert specialization (see the second sketch after this list).
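To make the Depth-Scaled Sandwich-Norm and TinyInit point concrete, here is a minimal PyTorch sketch of a sandwich-norm block: each sublayer is normalized on both its input and its output, the post-norm gain is damped by depth, and weights start small. The `1/sqrt(2L)` scaling and the init std are illustrative assumptions, not the published openPangu recipe.

```python
import math
import torch
import torch.nn as nn


class SandwichNormBlock(nn.Module):
    """One transformer sublayer with sandwich norm, depth-scaled post-norm
    gain, and TinyInit-style small weight initialization (assumed forms)."""

    def __init__(self, hidden_size: int, num_layers: int):
        super().__init__()
        self.pre_norm = nn.RMSNorm(hidden_size)
        self.post_norm = nn.RMSNorm(hidden_size)
        # Stand-in for the real sublayer (MLA attention or an MoE FFN).
        self.sublayer = nn.Linear(hidden_size, hidden_size)

        # Depth-scaled post-norm: damp each layer's contribution so the
        # residual stream stays stable as depth L grows (assumed scaling).
        with torch.no_grad():
            self.post_norm.weight.fill_(1.0 / math.sqrt(2 * num_layers))

        # TinyInit-style small initialization (assumed std).
        nn.init.normal_(self.sublayer.weight,
                        std=0.02 / math.sqrt(2 * num_layers))
        nn.init.zeros_(self.sublayer.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sandwich norm: normalize both the sublayer input and its output,
        # then add the residual.
        return x + self.post_norm(self.sublayer(self.pre_norm(x)))
```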
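Similarly, here is a single-process sketch of an EP-group load-balancing loss: the usual auxiliary loss over routed-token fractions and mean router probabilities, but with the statistics pooled across all ranks of an expert-parallel group before the dot product, so balance is enforced per EP group rather than per micro-batch. The leading group dimension stands in for a real all-gather over the EP group; shapes and the aggregation step are assumptions.

```python
import torch


def ep_group_aux_loss(router_logits: torch.Tensor,
                      top_k: int,
                      num_experts: int) -> torch.Tensor:
    # router_logits: [ep_group_size, num_tokens, num_experts], i.e. the
    # logits gathered from every rank in the expert-parallel group
    # (real code would all-gather these across the EP group).
    probs = torch.softmax(router_logits, dim=-1)
    topk_idx = probs.topk(top_k, dim=-1).indices

    # f_i: fraction of tokens whose top-k includes expert i,
    # pooled over the whole EP group rather than one micro-batch.
    one_hot = torch.zeros_like(probs).scatter_(-1, topk_idx, 1.0)
    f = one_hot.mean(dim=(0, 1))            # [num_experts]

    # p_i: mean router probability for expert i, pooled over the group.
    p = probs.mean(dim=(0, 1))              # [num_experts]

    # Scaled so a perfectly uniform router yields a loss of 1.0.
    return num_experts * torch.sum(f * p) / top_k


# Toy usage: 2 ranks in the EP group, 16 tokens each, 8 experts, top-2.
logits = torch.randn(2, 16, 8)
loss = ep_group_aux_loss(logits, top_k=2, num_experts=8)
```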
The closest model vllm already supports.
The closest model is DeepSeek-V3.
What's your difficulty of supporting the model you want?
Most of the related mainstream modules are already well implemented in vLLM, but the Depth-Scaled Sandwich-Norm structure and the EP-Group router still need some adaptation (a registration sketch follows below).
Furthermore, more dense models in the openPangu series will be added in the future.
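For wiring such an adaptation into vLLM without forking the tree, one option is the out-of-tree model registration path. The sketch below assumes a hypothetical class `PanguUltraMoEForCausalLM` in a hypothetical package `pangu_vllm` that reuses vLLM's DeepSeek-V3 components while overriding the decoder layer's normalization structure; the registry call itself is vLLM's documented plugin mechanism.

```python
from vllm import ModelRegistry

# Register the (hypothetical) architecture string from the checkpoint's
# config.json so vLLM can resolve it to the custom implementation.
# "pangu_vllm.model:PanguUltraMoEForCausalLM" is an illustrative
# module path, not an existing package.
ModelRegistry.register_model(
    "PanguUltraMoEForCausalLM",
    "pangu_vllm.model:PanguUltraMoEForCausalLM",
)
```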
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.