
Conversation


@Bruce-x-1997 commented Sep 18, 2025

What does this PR do?

Support W4AFP8 quantization for DeepSeek-V3.1 (UE8M0 scale format).

Usage

Set `gemm_impl` to `fp8` and use `config_v3.1.json`.
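
For orientation, here is a minimal sketch of the dispatch that flag selects in the DeepSeek-V3 reference inference code. This is paraphrased, not the upstream implementation: the real `act_quant` and FP8 GEMM are Triton kernels, and the plain-PyTorch stand-ins below (including the 448.0 E4M3 max and the per-tensor scale) are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

# Sketch of what gemm_impl controls (paraphrased from DeepSeek-V3's
# inference code; stand-ins only, not the real Triton kernels).
gemm_impl = "fp8"  # this PR's usage note; "bf16" would skip activation quant

def act_quant(x: torch.Tensor, scale_fmt: str | None = None):
    """Stand-in: per-tensor FP8 quantization; ue8m0 handling is sketched further down."""
    scale = x.abs().max().float() / 448.0  # 448 = max finite float8_e4m3fn
    return (x / scale).to(torch.float8_e4m3fn), scale

def linear(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    if gemm_impl == "bf16":
        return F.linear(x, weight)            # dequantized, high-precision path
    x_q, s = act_quant(x, scale_fmt="ue8m0")  # the V3.1 config selects ue8m0
    return F.linear(x_q.float() * s, weight)  # stand-in for the FP8 GEMM kernel

print(linear(torch.randn(2, 64), torch.randn(32, 64)).shape)  # torch.Size([2, 32])
```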

Testing

After applying this patch, our model (DeepSeek-V3.1 with W4AFP8) reaches 50% on AIME25 and 60% on AIME24.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes/No
  • Did you write any new necessary tests?: Yes/No
  • Did you add or update any necessary documentation?: Yes/No
  • Did you update Changelog?: Yes/No

Additional Information

@Bruce-x-1997 requested a review from a team as a code owner, September 18, 2025 12:25

@Bruce-x-1997 (Author) commented:

@cjluo-nv please help review it, thanks

```diff
  assert weight_quantizer is None
  assert act_quantizer is None
- x, scale = act_quant(x, block_size)
+ x, scale = act_quant(x, block_size, scale_fmt)
```
Collaborator commented:

Could you walk through how updating the scale gives you W4A8?

In this case, what's the 4-bit weight? Is it NVFP4 or INT4?

Author commented:

Sorry, I didn't notice the comment earlier.
In my case, the 4-bit weight means INT4.
For the V3.1 case, when I tried to quantize the model there was an interface mismatch with deepseek-v3.git, so I fixed it here. @cjluo-nv
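
To make the mechanism concrete: the weights stay INT4 with their own per-group scales, while activations are quantized to FP8 at runtime, and `scale_fmt="ue8m0"` constrains each activation scale to a power of two. Below is a self-contained sketch of that scale handling; the real `act_quant` in DeepSeek-V3 is a Triton kernel, so the plain-PyTorch body, the 448.0 E4M3 max, and the block layout here are illustrative assumptions:

```python
import torch

E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn

def act_quant_sketch(x: torch.Tensor, block_size: int = 128, scale_fmt: str | None = None):
    """Per-block FP8 activation quantization (illustration, not the Triton kernel)."""
    blocks = x.reshape(-1, block_size)
    scale = blocks.abs().amax(dim=-1, keepdim=True).float() / E4M3_MAX
    if scale_fmt == "ue8m0":
        # UE8M0 = unsigned 8-bit exponent, 0 mantissa bits: the scale must be an
        # exact power of two. Round the exponent up so no block overflows E4M3.
        scale = torch.exp2(torch.ceil(torch.log2(scale)))
    y = (blocks / scale).to(torch.float8_e4m3fn)
    return y.reshape(x.shape), scale

x = torch.randn(4, 256)
_, s = act_quant_sketch(x, scale_fmt="ue8m0")
assert torch.all(s == torch.exp2(torch.log2(s).round()))  # scales are powers of two
```

The INT4 weight side is untouched by this diff; only the activation-scale format moves to UE8M0, matching the scale format DeepSeek-V3.1 was trained with.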

```
nvcr.io/nvidia/tensorrt-llm/release
```
Then we can run ModelOpt in the Docker pod as in the [TRT-LLM example](https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/models/core/deepseek_v3/README.md?plain=1).
Note that you should use the latest DeepSeek-V3.git rather than commit 1398800, which has a dtype bug in the bias proto.
Collaborator commented:

Would you recommend using a more recent commit?
