torchao/prototype/parq/README.md (+85 -4)
@@ -7,23 +7,104 @@ This library applies QAT without modifying model-level code. It instead interfac
* quantization method: computing the best set of discrete, quantized values
* proximal mapping: projection of weights onto quantized values

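To make the two components above concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not PARQ's implementation: a symmetric min/max uniform grid stands in for the quantization method, a nearest-value (hard) projection stands in for the proximal mapping, and the helper names `uniform_grid` and `hard_project` are hypothetical.

```python
def uniform_grid(weights, bits):
    """Return 2**bits evenly spaced values covering [-max|w|, +max|w|].

    Hypothetical helper; assumes bits >= 1 so there are at least two levels.
    """
    levels = 2 ** bits
    w_max = max(abs(w) for w in weights)
    step = 2.0 * w_max / (levels - 1)
    return [-w_max + i * step for i in range(levels)]


def hard_project(weights, grid):
    """Map each weight to its nearest grid value (a hard projection)."""
    return [min(grid, key=lambda q: abs(w - q)) for w in weights]


weights = [-1.0, -0.2, 0.1, 0.9]
grid = uniform_grid(weights, bits=2)     # stand-in for the quantization method
quantized = hard_project(weights, grid)  # stand-in for the proximal mapping
print(grid)
print(quantized)
```

The split mirrors the two bullets: the grid can be recomputed from the current weights without changing how the projection is applied, and vice versa.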
+
+## PARQ vs. torchao
+
+There are two main QAT interfaces in torchao:
+
+- Swap modules (e.g., `torch.nn.Linear`) with their quantized counterparts (e.g., `Int4WeightOnlyQATLinear`). See [Quantizer API (legacy)](torchao/quantization/qat#quantizer-api-legacy) for details.
+- Replace instances of `torch.Tensor` with a quantized tensor subclass such as `AffineQuantizedTensor`. The [`quantize_` API](quantization/qat#quantize_-api-recommended) uses this method by default.
+
+PARQ is conceptually more similar to the tensor subclass interface. It quantizes tensors through the optimizer's parameter groups, leaving the model untouched.
+
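The idea of quantizing through parameter groups can be sketched with a toy, dependency-free example. Everything here is hypothetical and only illustrates the pattern: `ToySGD`, `ToyQuantWrapper`, and `project_uniform` are not PARQ or torchao classes, plain Python lists stand in for tensors, and the `quant_bits` group key is used as an illustrative marker for groups that should be quantized. The toy overwrites weights directly after each step and omits any bookkeeping a real implementation would need.

```python
def project_uniform(values, bits):
    """Snap values to the nearest point of a symmetric uniform grid (bits >= 1)."""
    levels = 2 ** bits
    v_max = max(abs(v) for v in values)
    step = 2.0 * v_max / (levels - 1)
    grid = [-v_max + i * step for i in range(levels)]
    return [min(grid, key=lambda q: abs(v - q)) for v in values]


class ToySGD:
    """Hypothetical stand-in for a base optimizer with parameter groups."""

    def __init__(self, param_groups, lr):
        self.param_groups = param_groups
        self.lr = lr

    def step(self, grads):
        # Plain gradient descent, updating each group's list in place.
        for group, group_grads in zip(self.param_groups, grads):
            params = group["params"]
            for i, g in enumerate(group_grads):
                params[i] -= self.lr * g


class ToyQuantWrapper:
    """Hypothetical wrapper: run the base step, then project marked groups."""

    def __init__(self, base_optimizer):
        self.base_optimizer = base_optimizer

    def step(self, grads):
        self.base_optimizer.step(grads)
        for group in self.base_optimizer.param_groups:
            bits = group.get("quant_bits")
            if bits is not None:  # groups without the key stay full precision
                group["params"][:] = project_uniform(group["params"], bits)


quantized_weights = [0.5, -0.9, 0.2]
full_precision_bias = [0.3]
optimizer = ToyQuantWrapper(
    ToySGD(
        [
            {"params": quantized_weights, "quant_bits": 1},
            {"params": full_precision_bias},
        ],
        lr=0.1,
    )
)
optimizer.step(grads=[[1.0, -1.0, 0.0], [1.0]])
print(quantized_weights, full_precision_bias)
```

The code that owns `quantized_weights` and `full_precision_bias` is never rewritten; only the optimizer decides which groups are quantized, which is the property the paragraph above describes.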
+An example PARQ flow and its torchao equivalent are shown below. The prepare stage occurs before training, while the convert stage runs after training to produce a quantized model.
+Note that `UnifTorchaoQuantizer` calls the same quantization primitives as torchao to match the numerics (see [Affine Quantization Details](torchao/quantization#affine-quantization-details)).
+
## QAT arguments

| | description | choices |
| --- | --- | --- |
-|`quant-bits`| bit-width for quantized weights | 0 (ternary), 1—4 |
+|`quant-bits`| bit-width for quantized weights | 0 (ternary), 1-4 |
|`quant-method`| method for determining quantized values |`lsbq`, `uniform`|
|`anneal-start`| start epoch for QAT annealing period | (0, `total_steps` - 1) |
|`anneal-end`| end epoch for QAT annealing period | (`anneal_start`, `total_steps`) |
-|`anneal-steepness`| sigmoid steepness for PARQ inverse slope schedule |25—100|
+|`anneal-steepness`| sigmoid steepness for PARQ inverse slope schedule |1-20|

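To show how `anneal-start`, `anneal-end`, and `anneal-steepness` could interact, here is a hedged sketch of a sigmoid-shaped schedule. The exact formula PARQ uses is not given in this section, so the functional form, the function name `sigmoid_anneal`, and the choice to decay from 1 to 0 are all assumptions made for illustration.

```python
import math


def sigmoid_anneal(step, anneal_start, anneal_end, steepness):
    """Hypothetical schedule: 1.0 before the window, 0.0 after, sigmoid in between."""
    if step <= anneal_start:
        return 1.0
    if step >= anneal_end:
        return 0.0
    # Fraction of the annealing window completed, in (0, 1).
    progress = (step - anneal_start) / (anneal_end - anneal_start)
    # Larger steepness concentrates the transition around the window midpoint.
    return 1.0 / (1.0 + math.exp(steepness * (progress - 0.5)))


schedule = [sigmoid_anneal(s, 100, 900, steepness=10) for s in range(0, 1001, 50)]
print([round(v, 3) for v in schedule])
```

In this sketch a small steepness gives a gradual, nearly linear transition across the window, while a large one approaches a step function at the midpoint. The value also jumps slightly at the window edges because the sigmoid never reaches exactly 0 or 1.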
## Optimizer-only interface

The `QuantOptimizer` wrapper takes any `torch.optim.Optimizer` object. It is also initialized with a `Quantizer` and `ProxMap` object. Integration into new training pipelines is simple: