Add support for Int4GroupwisePreshuffleTensor for fbgemm #2421

Merged: 1 commit merged into main on Jul 3, 2025

Conversation

Contributor

@jerryzh168 jerryzh168 commented Jun 22, 2025


pytorch-bot bot commented Jun 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2421

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 851dc01 with merge base 5a50667, one new CI job failure was reported.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 added a commit that referenced this pull request Jun 22, 2025
Note: slice is not working yet; other ops are working.

Test Plan:
python test/dtypes/test_int4_groupwise_preshuffle.py

Reviewers:

Subscribers:

Tasks:

Tags:

stack-info: PR: #2421, branch: jerryzh168/stack/1
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/1 branch from 565e596 to 65a1373 Compare June 22, 2025 04:28
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 22, 2025
@jerryzh168 jerryzh168 changed the title from "Summary:" to "Add Int4GroupwisePreshuffleTensor for fbgemm" Jun 22, 2025
@jerryzh168 jerryzh168 added the topic: new feature Use this tag if this PR adds a new feature label Jun 22, 2025
@jerryzh168 jerryzh168 changed the title from "Add Int4GroupwisePreshuffleTensor for fbgemm" to "Summary:" Jun 22, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/1 branch from 65a1373 to 8dcecf4 Compare June 22, 2025 04:35
@jerryzh168 jerryzh168 changed the title from "Summary:" to "Add support for Int4GroupwisePreshuffleTensor for fbgemm" Jun 22, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/1 branch 3 times, most recently from 44b79dd to 6ce4c7b Compare June 24, 2025 22:25
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/1 branch from 6ce4c7b to 027648f Compare June 24, 2025 22:28
import importlib.util  # assumed to be imported near the top of the file

if importlib.util.find_spec("fbgemm_gpu") is None:
    quantize_int4_preshuffle = None
else:
    from fbgemm_gpu.experimental.gen_ai.quantize import quantize_int4_preshuffle
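
A hedged sketch of how a call site might guard on this optional-dependency sentinel from the snippet above; the helper below is hypothetical and not part of the PR:

```python
def _require_quantize_int4_preshuffle():
    """Hypothetical helper (not from this PR): raise a clear error when
    fbgemm_gpu is missing, since the guard above leaves
    quantize_int4_preshuffle set to None in that case."""
    if quantize_int4_preshuffle is None:
        raise ImportError(
            "Int4GroupwisePreshuffleTensor requires fbgemm_gpu; "
            "quantize_int4_preshuffle could not be imported."
        )
    return quantize_int4_preshuffle
```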
Contributor

Is this a prototype API? If yes, should the torchao version also be prototype? What does "experimental" mean in the folder structure here?

Contributor Author

@jerryzh168 jerryzh168 Jun 25, 2025

It is stable, production-ready, and used in production. It's just bad naming according to @jwfromm, and they have a plan to get rid of it.

shape: shape of the original Tensor

Note:
preshuffle means the weight is rearranged for more efficient use of loading instructions
Contributor

It would be good to share the specifics of the preshuffle transformation, either here or via a link a user can follow.
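
As a purely illustrative aside (not fbgemm's actual layout, which is exactly what the comment above asks to document or link), here is a toy example of the general idea of a tile-friendly weight rearrangement:

```python
import torch

def toy_preshuffle(w: torch.Tensor, tile: int = 8) -> torch.Tensor:
    """Toy illustration only, NOT fbgemm's real preshuffle: rearrange an
    (N, K) weight so each tile x tile block is contiguous in memory, the kind
    of reordering that lets a kernel issue wide, aligned loads."""
    n, k = w.shape
    assert n % tile == 0 and k % tile == 0
    return (
        w.reshape(n // tile, tile, k // tile, tile)
        .permute(0, 2, 1, 3)   # bring each tile's elements next to each other
        .contiguous()
        .view(n, k)            # same elements, new memory order
    )

# Example: a 16x16 weight regrouped into 8x8 blocks keeps its shape.
print(toy_preshuffle(torch.arange(256.0).reshape(16, 16)).shape)  # torch.Size([16, 16])
```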

Int4GroupwisePreshuffleTensor,
)

Int4GroupwisePreshuffleTensor.__module__ = "torchao.quantization"
Contributor

Can we confirm (by actually testing it) that we can change the directory location later without breaking BC?

Contributor Author

I have a test that verifies the loaded weight has the module path torchao.quantization.Int4GroupwisePreshuffleTensor. This (type(tensor)) is what the load code path uses: https://github.com/pytorch/pytorch/blob/d4b8857e51a089b7e0e722689398c5c3ada274c9/torch/_tensor.py#L262, which gives us good confidence that it will keep working as long as we do this.

But I can do an e2e test a bit later by uploading the file to the Hugging Face Hub and changing the path locally to verify as well.
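
A minimal sketch of the idea, using a hypothetical stand-in class rather than the real tensor subclass:

```python
class Int4Demo:  # hypothetical stand-in for Int4GroupwisePreshuffleTensor
    pass

# The path recorded for a serialized object comes from type(obj).__module__,
# not from the file the class happens to be defined in, so pinning __module__
# to the public API path lets the defining file move later without changing
# what gets saved in checkpoints.
Int4Demo.__module__ = "torchao.quantization"

obj = Int4Demo()
print(type(obj).__module__, type(obj).__qualname__)
# prints: torchao.quantization Int4Demo
```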

Contributor Author

Added in #2437.

Contributor

@vkuzo vkuzo left a comment

Thanks for making the changes!

Summary:
Note: slice is not working yet; other ops are working.

Test Plan:
python test/dtypes/test_int4_groupwise_preshuffle.py

Reviewers:

Subscribers:

Tasks:

Tags:

stack-info: PR: #2421, branch: jerryzh168/stack/1
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/1 branch from e664a1e to 851dc01 Compare July 2, 2025 20:36
@jerryzh168 jerryzh168 merged commit 2d61be8 into main Jul 3, 2025
18 of 19 checks passed
Labels
CLA Signed, topic: new feature