This library applies QAT without modifying model-level code. It instead interfaces with the optimizer, separating QAT into two components:

* quantization method: computing the best set of discrete, quantized values
* proximal mapping: projection of weights onto quantized values

## PARQ vs. torchao

There are two main QAT interfaces in torchao:

- Modules (e.g., `torch.nn.Linear`) are swapped with their quantized counterparts (e.g., `Int4WeightOnlyQATLinear`). See [Quantizer API (legacy)](torchao/quantization/qat#quantizer-api-legacy) for details; a rough sketch follows this list.
- The tensor subclass approach instead operates at a finer level of granularity. It replaces `torch.Tensor` instances with quantized `AffineQuantizedTensor` ones. The [`quantize_` API](quantization/qat#quantize_-api-recommended) uses this method by default.
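
For comparison, a minimal sketch of the module-swap flow, assuming the `Int8DynActInt4WeightQATQuantizer` class from the linked QAT docs (the class choice and its defaults here are illustrative, not prescriptive):

```python
import torch.nn as nn
from torchao.quantization.qat import Int8DynActInt4WeightQATQuantizer

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 64))

# prepare: swap nn.Linear modules for fake-quantized QAT counterparts
qat_quantizer = Int8DynActInt4WeightQATQuantizer()
model = qat_quantizer.prepare(model)

# ... train with fake-quantized forward passes ...

# convert: swap back and materialize the quantized weights
model = qat_quantizer.convert(model)
```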
PARQ is conceptually similar to the tensor subclass interface. It quantizes tensors through the optimizer (i.e., `optimizer.param_groups[i]["params"]`) without modifying the model.
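
As a short illustration of that mechanism (the `quant_bits` group key is an assumption for this sketch; the optimizer-only interface below shows the intended setup), the weights to be quantized simply live in a dedicated parameter group of a standard optimizer:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 10))

# weight matrices go in a group marked for quantization; biases stay full precision
params_quant = [p for p in model.parameters() if p.dim() > 1]
params_no_quant = [p for p in model.parameters() if p.dim() == 1]

base_optimizer = torch.optim.SGD(
    [{"params": params_quant, "quant_bits": 2}, {"params": params_no_quant}],
    lr=0.1,
)

# PARQ reads and updates these tensors through base_optimizer.param_groups[0]["params"];
# the model's modules are never replaced.
```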
An example PARQ flow and its torchao equivalent are shown below. The prepare stage takes place before training, while the convert stage runs after training to produce a quantized model.
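
As a rough sketch of the prepare and train stages (import paths and constructor arguments below are assumptions and may differ from the actual flow):

```python
import torch
import torch.nn as nn
from torchao.prototype.parq.optim import ProxHardQuant, QuantOptimizer
from torchao.prototype.parq.quant import UnifTorchaoQuantizer

model = nn.Linear(64, 64)

# prepare (before training): wrap a standard optimizer; the model is untouched
base_optimizer = torch.optim.AdamW(
    [{"params": model.parameters(), "quant_bits": 4}], lr=1e-3
)
optimizer = QuantOptimizer(base_optimizer, UnifTorchaoQuantizer(), ProxHardQuant())

# train as usual; optimizer.step() projects weights onto their quantized values
for _ in range(3):
    loss = model(torch.randn(8, 64)).pow(2).mean()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# convert (after training): the torchao convert step in the flow shown in this
# section turns the trained weights into a quantized model
```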
> `UnifTorchaoQuantizer` calls exactly the same quantization primitives as in torchao's tensor subclass interface (see [Affine Quantization Details](torchao/quantization#affine-quantization-details)).
## QAT arguments

| argument | description | choices |
| --- | --- | --- |
|`quant-bits`| bit-width for quantized weights | 0 (ternary), 1-4 |
|`quant-method`| method for determining quantized values |`lsbq`, `uniform`|
|`anneal-start`| start epoch for QAT annealing period | (0, `total_steps` - 1) |
|`anneal-end`| end epoch for QAT annealing period | (`anneal_start`, `total_steps`) |
|`anneal-steepness`| sigmoid steepness for PARQ inverse slope schedule |1-20|

## Optimizer-only interface

The `QuantOptimizer` wrapper takes any `torch.optim.Optimizer` object. It is also initialized with a `Quantizer` and `ProxMap` object. Integration into new training pipelines is simple:
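
The snippet below is a sketch of such an integration; the import paths, class names (`LSBQuantizer`, `ProxPARQ`), and constructor arguments are assumptions that mirror the QAT arguments table above and may not match the current module layout:

```python
import torch
import torch.nn as nn
from torchao.prototype.parq.optim import ProxPARQ, QuantOptimizer
from torchao.prototype.parq.quant import LSBQuantizer

model = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 10))

# quant-bits: mark the parameters to quantize via a dedicated param group
params_quant = [p for p in model.parameters() if p.dim() > 1]
params_no_quant = [p for p in model.parameters() if p.dim() == 1]
base_optimizer = torch.optim.SGD(
    [{"params": params_quant, "quant_bits": 2}, {"params": params_no_quant}],
    lr=0.1,
)

# quant-method selects the Quantizer; anneal-* configure the proximal map
quantizer = LSBQuantizer()
prox_map = ProxPARQ(anneal_start=0, anneal_end=10, steepness=10)

# the wrapped optimizer is then used exactly like the base optimizer in training
optimizer = QuantOptimizer(base_optimizer, quantizer, prox_map)
```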