Description
I applied three quantization methods to the same model, but the relationship between the resulting file sizes does not look reasonable. Why?
```python
from torchao.quantization import quantize_, int8_weight_only
quantize_(new_model, int8_weight_only())

# from torchao.quantization import quantize_, int8_dynamic_activation_int8_weight
# quantize_(new_model, int8_dynamic_activation_int8_weight())

# from torchao.quantization import quantize_, int8_dynamic_activation_int4_weight
# quantize_(new_model, int8_dynamic_activation_int4_weight())
```
The results:

```
20786584 Feb  5 13:46 a8w4SWaT.pte
20373272 Feb  5 13:45 a8w8SWaT.pte
29685120 Oct  5 13:12 pytorch_checkpoint.pth
20262664 Feb  5 13:44 w8onlySWaT.pte
```
Theoretically, the model quantized with A8W4 should be the smallest, but the actual results show otherwise.
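One possible explanation (my assumption, not confirmed against torchao/ExecuTorch internals): int4 weight quantization is typically groupwise, so it stores a scale and zero point per group, and if the export path serializes the 4-bit values unpacked (one value per byte, e.g. because the target backend has no packed-int4 kernels), the payload is as large as int8 plus the extra per-group metadata. A back-of-the-envelope sketch, with all storage-layout numbers being hypothetical assumptions for illustration:

```python
# Rough size estimate for a single weight matrix under three hypothetical
# storage layouts (NOT torchao's actual format, just an illustration):
# - W8:           1 byte per weight + one fp32 scale per output channel
# - W4 packed:    2 weights per byte + fp32 scale and zero point per group
# - W4 unpacked:  each 4-bit value stored in its own byte + group metadata

def w8_bytes(n_elements: int, n_channels: int) -> int:
    # int8 payload plus one 4-byte scale per output channel
    return n_elements * 1 + n_channels * 4

def w4_packed_bytes(n_elements: int, group_size: int = 32) -> int:
    # two int4 values per byte, plus 8 bytes (scale + zero point) per group
    n_groups = n_elements // group_size
    return n_elements // 2 + n_groups * 8

def w4_unpacked_bytes(n_elements: int, group_size: int = 32) -> int:
    # one byte per int4 value (no packing), plus 8 bytes per group
    n_groups = n_elements // group_size
    return n_elements * 1 + n_groups * 8

# Example: a 512x512 linear layer (262,144 weights)
n, channels = 512 * 512, 512
print(w8_bytes(n, channels))    # 264192
print(w4_packed_bytes(n))       # 196608  -> smaller than W8, as expected
print(w4_unpacked_bytes(n))     # 327680  -> larger than W8
```

If something like the unpacked case applies here, it would explain why `a8w4SWaT.pte` comes out slightly larger than the two int8 variants. One way to check is to inspect the dtypes and shapes of the quantized model's tensors (e.g. via `new_model.state_dict()`) before export and compare the actual bytes stored per layer.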