
Conversation

@yt0428 yt0428 commented Oct 26, 2025

Purpose

Add support for openPangu_Ultra_MoE models
FIX #27019

Test Plan

vllm serve $LOCAL_CKPT_DIR/openpangu-ultra-moe-718b-model \
    --data-parallel-size 4 \
    --data-parallel-size-local 1 \
    --data-parallel-start-rank $NODE_RANK \
    --data-parallel-address $MASTER_NODE_IP \
    --data-parallel-rpc-port 13389 \
    --tensor-parallel-size 8 \
    --served-model-name pangu_ultra_moe \
    --enable-expert-parallel \
    --trust-remote-code

Test Result

The server starts normally.
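As an additional sanity check, the served model can be queried through vLLM's OpenAI-compatible API. The snippet below is a minimal sketch that only builds the request payload; the default port 8000 and the chat-completions route are assumptions based on vLLM's standard serving setup, not part of this PR:

```python
# Build a chat-completions request for the served model. The model name
# matches --served-model-name from the serve command above; the port and
# route assume vLLM's default OpenAI-compatible server configuration.
import json

payload = {
    "model": "pangu_ultra_moe",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 32,
}

# With the server running, this payload could be sent with e.g.:
#   curl http://localhost:8000/v1/chat/completions \
#        -H "Content-Type: application/json" -d "$(cat payload.json)"
print(json.dumps(payload))
```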


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

mergify bot commented Oct 26, 2025

Documentation preview: https://vllm--27521.org.readthedocs.build/en/27521/

@mergify mergify bot added the documentation (Improvements or additions to documentation), new-model (Requests to new models), speculative-decoding, and v1 labels Oct 26, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for the openPangu_Ultra_MoE model. The changes include a new model implementation file and updates to various configuration and registry files to integrate the new model. The implementation appears to be largely adapted from the existing deepseek_v2 model.

I've identified a critical issue in the scaling logic within the OpenPanguMoE module, which seems to have been carried over from the deepseek_v2 implementation. This logic flaw could lead to incorrect computations, particularly in float16 precision, potentially affecting the model's output. A detailed comment with a suggested fix is provided below. The other changes appear to be correct and consistent with adding a new model to the framework.
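For context on why such a scaling flaw bites specifically in float16: fp16 saturates to inf above roughly 65504, so multiplying activations by a routed scaling factor at the wrong point can overflow values that would otherwise stay in range. The snippet below is an illustration only, not the actual OpenPanguMoE code, and the numbers are hypothetical:

```python
# Illustrative float16 overflow: scaling a large intermediate value up
# before later scaling it back down produces inf, while applying the
# scale as a division (or folding it into later weights) stays finite.
import numpy as np

routed_scaling_factor = np.float16(16.0)
hidden = np.float16(8000.0)  # a plausible activation magnitude

scaled_up = hidden * routed_scaling_factor   # 128000 > fp16 max (~65504)
print(np.isinf(scaled_up))                   # True: overflowed to inf

scaled_down = hidden / routed_scaling_factor
print(np.isfinite(scaled_down))              # True: stays representable
```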

@Bye-legumes

Hi, can you give us (me and https://github.com/kcmnd) access to your fork repo? We tested it and it doesn't work now. We can fix some of the code.


Labels

documentation (Improvements or additions to documentation), new-model (Requests to new models), speculative-decoding, v1


Development

Successfully merging this pull request may close these issues.

[New Model]: Add support for openPangu-Ultra-MoE-718B
