Support DeepSeekV3-style block FP8 quantization #372


Merged
merged 9 commits into from
Jul 21, 2025

Conversation

mgoin
Member

@mgoin mgoin commented Jun 30, 2025

Quite a few things are packed into this one, but the goal is to support the 128x128 weight and 1x128 input quantization adopted by the DeepSeek-V3 and Qwen3 models. See examples: https://huggingface.co/deepseek-ai/DeepSeek-V3 and https://huggingface.co/Qwen/Qwen3-0.6B-FP8

  • Added BLOCK static quantization paths for weight quantization.
  • Added GROUP dynamic quantization paths for per-token-group input quantization. I feel this is more understandable than the "1x128" block input quantization DeepSeek uses.
  • I’ve updated all of the places where block_structure was previously treated as an “NxM” string so that it now uses a Python list of two integers (e.g. [128, 128]). I added a pydantic validator that can convert this automatically for old checkpoints that use the string.
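
The string-to-list conversion described above could be sketched with a pydantic before-validator. This is a minimal, hypothetical sketch, not the actual compressed-tensors class (which has many more fields); only the `block_structure` handling is shown:

```python
# Sketch of a backward-compatibility validator: legacy checkpoints
# store block_structure as an "NxM" string (e.g. "128x128"); new
# configs use a list of two ints (e.g. [128, 128]).
from typing import List, Optional, Union

from pydantic import BaseModel, field_validator


class QuantizationArgs(BaseModel):
    block_structure: Optional[Union[str, List[int]]] = None

    @field_validator("block_structure", mode="before")
    @classmethod
    def parse_block_structure(cls, value):
        if isinstance(value, str):
            # Old-style "NxM" string from a legacy checkpoint.
            return [int(dim) for dim in value.split("x")]
        return value  # already a list (or None); pass through


print(QuantizationArgs(block_structure="128x128").block_structure)  # [128, 128]
```

Running the validator in `mode="before"` means old checkpoints deserialize transparently, while new configs that already carry a list are untouched.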

Here is the scheme I am proposing to support this:

# Block-wise FP8 (DeepSeek-V3-style quantization):
# static 128x128 per-block weights and
# dynamic per-token-group activations
FP8_BLOCK = dict(
    weights=QuantizationArgs(
        num_bits=8,
        type=QuantizationType.FLOAT,
        strategy=QuantizationStrategy.BLOCK,
        symmetric=True,
        dynamic=False,
        block_structure=[128, 128],
    ),
    input_activations=QuantizationArgs(
        num_bits=8,
        type=QuantizationType.FLOAT,
        strategy=QuantizationStrategy.GROUP,
        symmetric=True,
        dynamic=True,
        observer=None,
        group_size=128,
    ),
)
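
To make the two granularities concrete, here is an illustrative NumPy sketch of how the scales would be computed: one static scale per 128x128 weight block, and one dynamic scale per 128-element group of each token's activations. This assumes FP8 E4M3 with a max representable value of 448.0; it is a sketch of the idea, not the compressed-tensors implementation:

```python
import numpy as np

FP8_MAX = 448.0  # max representable value of float8_e4m3


def block_scales(weight: np.ndarray, block=(128, 128)) -> np.ndarray:
    """One static scale per (128, 128) weight block."""
    rows, cols = weight.shape
    br, bc = block
    scales = np.empty((rows // br, cols // bc), dtype=np.float32)
    for i in range(rows // br):
        for j in range(cols // bc):
            tile = weight[i * br:(i + 1) * br, j * bc:(j + 1) * bc]
            scales[i, j] = np.abs(tile).max() / FP8_MAX
    return scales


def group_scales(x: np.ndarray, group_size=128) -> np.ndarray:
    """One dynamic scale per contiguous group of 128 values per token."""
    tokens, hidden = x.shape
    groups = x.reshape(tokens, hidden // group_size, group_size)
    return np.abs(groups).max(axis=-1) / FP8_MAX


w = np.random.randn(256, 256).astype(np.float32)
print(block_scales(w).shape)   # (2, 2)

x = np.random.randn(4, 256).astype(np.float32)
print(group_scales(x).shape)   # (4, 2)
```

The weight scales are computed once at compression time (static), while the activation scales are recomputed per forward pass (dynamic), which is why the input config above sets `dynamic=True` with no observer.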

Added this model to Hugging Face: nm-testing/Qwen3-0.6B-FP8-BLOCK

mgoin added 4 commits July 1, 2025 00:44
Signed-off-by: mgoin <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: mgoin <[email protected]>
Collaborator

@dsikka dsikka left a comment


Can you produce a test model in nm-testing and add it to this PR?

@kylesayrs kylesayrs self-assigned this Jul 8, 2025
@kylesayrs kylesayrs assigned shanjiaz and unassigned kylesayrs Jul 9, 2025
shanjiaz added 2 commits July 10, 2025 21:14
Signed-off-by: shanjiaz <[email protected]>
Signed-off-by: shanjiaz <[email protected]>
@shanjiaz shanjiaz requested review from kylesayrs and dsikka July 21, 2025 14:05
Signed-off-by: shanjiaz <[email protected]>
Contributor

@kylesayrs kylesayrs left a comment


Looks good!

Contributor

@brian-dellabetta brian-dellabetta left a comment


Awesome work! Clear, with nice tests.

@shanjiaz shanjiaz merged commit 09b7ed4 into main Jul 21, 2025
1 check passed
@shanjiaz shanjiaz deleted the support-deepseek-block-fp8 branch July 21, 2025 16:44