diff --git a/releases/1.32.2/Examples/onnx/quantization/adaround.html b/releases/1.32.2/Examples/onnx/quantization/adaround.html new file mode 100644 index 00000000..c67abf8b --- /dev/null +++ b/releases/1.32.2/Examples/onnx/quantization/adaround.html @@ -0,0 +1,1476 @@ + + + + + + Adaptive Rounding (AdaRound) — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Adaptive Rounding (AdaRound)

+

This notebook shows a working code example of how to use AIMET to perform Adaptive Rounding (AdaRound).

+

AIMET quantization features typically use the “nearest rounding” technique for achieving quantization. When using the “nearest rounding” technique, the weight value is quantized to the nearest integer value.

+

AdaRound optimizes a loss function using unlabeled training data to decide whether to quantize a specific weight to the closer integer value or the farther one. Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.
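As a toy illustration of the difference (a sketch only, not AIMET’s implementation; the scale, weights, and up/down mask below are made up), consider quantizing a few weights with a fixed scale:

import numpy as np

scale = 0.1                                    # assumed quantization scale (illustration only)
w = np.array([0.23, 0.27, -0.41, 0.54])        # example FP32 weight values

# Nearest rounding: every weight snaps to its closest grid point
w_nearest = np.round(w / scale) * scale

# AdaRound instead chooses floor or ceil per weight; here the choice is an arbitrary mask,
# whereas AdaRound learns it by minimizing a layer-wise reconstruction loss on unlabeled data
round_up = np.array([True, False, False, True])
w_adaround = (np.floor(w / scale) + round_up) * scale

print(w_nearest)   # nearest grid points
print(w_adaround)  # per-weight floor/ceil choice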

+
+

Overall flow

+

This notebook covers the following:
1. Instantiate the example evaluation and training pipeline
2. Convert an FP32 PyTorch model to ONNX and evaluate the model’s baseline FP32 accuracy
3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score
4. Apply AdaRound and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results

  • +
  • For example, it uses a relatively quantization-friendly model like Resnet18

  • +
  • Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, use that. Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The dataloader provided in this example notebook relies on the ImageNet dataset having the following characteristics: - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per image sample.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still an ONNX model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import torch
+import onnxruntime as ort
+from Examples.common import image_net_config
+from Examples.onnx.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: ort.InferenceSession) -> float:
+        """
+        Given an onnxruntime inference session, evaluates the model's Top-1 accuracy on the dataset
+        :param sess: the onnxruntime InferenceSession to evaluate
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(sess, iterations=None)
+
+
+
+
+
+
+

2. Convert an FP32 PyTorch model to ONNX and evaluate the model’s baseline FP32 accuracy

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead or convert a model trained in a different framework altogether.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+import onnx
+
+input_shape = (1, 3, 224, 224)    # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)
+dummy_input = torch.randn(input_shape)
+filename = "./resnet18.onnx"
+
+# Load a pretrained ResNet-18 model in torch
+pt_model = resnet18(pretrained=True)
+
+# Export the torch model to onnx
+torch.onnx.export(pt_model.eval(),
+                  dummy_input,
+                  filename,
+                  export_params=True,
+                  do_constant_folding=True,
+                  input_names=['input'],
+                  output_names=['output'],
+                  dynamic_axes={
+                      'input' : {0 : 'batch_size'},
+                      'output' : {0 : 'batch_size'},
+                  }
+                  )
+
+model = onnx.load_model(filename)
+
+
+
+
+

We should decide whether to run the model on a CPU or CUDA device. This example code will use CUDA if available in your onnxruntime environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
# Fix cudnn_conv_algo_search to DEFAULT to avoid variations in accuracies/outputs across inference runs
+if 'CUDAExecutionProvider' in ort.get_available_providers():
+    providers = [('CUDAExecutionProvider', {'cudnn_conv_algo_search': 'DEFAULT'}), 'CPUExecutionProvider']
+    use_cuda = True
+else:
+    providers = ['CPUExecutionProvider']
+    use_cuda = False
+
+
+
+
+

Let’s create an onnxruntime session and determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
sess = ort.InferenceSession(filename, providers=providers)
+accuracy = ImageNetDataPipeline.evaluate(sess)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this?

+

On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided.

+

From a floating point compute perspective, a BN-folded model is mathematically equivalent at inference time to a model with BN layers, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers.

+

This can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). We want to simulate that on-target behavior by doing BN folding here.
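For intuition, the arithmetic behind BN folding looks roughly as follows (an illustrative sketch with assumed shapes, not the AIMET implementation invoked below):

import numpy as np

def fold_bn_into_conv(weight, bias, gamma, beta, running_mean, running_var, eps=1e-5):
    """Illustrative only: absorb a BatchNorm that follows a conv into the conv's weight and bias.
    weight has shape (out_channels, in_channels, kh, kw); the BN parameters are per output channel."""
    scale = gamma / np.sqrt(running_var + eps)            # per-channel BN scale
    folded_weight = weight * scale[:, None, None, None]   # scale each output channel's filter
    folded_bias = beta + (bias - running_mean) * scale    # fold the BN shift into the bias
    return folded_weight, folded_bias

Multiplying each filter by gamma/sqrt(var + eps) is exactly what can widen the per-channel weight ranges mentioned above.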

+

The following code calls AIMET to fold the BN layers in-place on the given model:

+
+
[ ]:
+
+
+
from aimet_onnx.batch_norm_fold import fold_all_batch_norms_to_weight
+
+_ = fold_all_batch_norms_to_weight(model)
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: We set this to “QuantScheme.post_training_tf_enhanced”. Supported options are ‘tf_enhanced’ or ‘tf’, or the Quant Scheme Enum values QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced
- default_activation_bw: Setting this to 8 essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision
- default_param_bw: Setting this to 8 essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision

+

In case the ONNX model has custom ops, we need to specify the paths of the compiled custom op libraries via the user_onnx_libs parameter. For example, user_onnx_libs=[‘path/to/custom_op1.so’, ‘path/to/custom_op2.so’].
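A hedged sketch of what that could look like is shown below; the .so paths are the placeholders from the example above, and the surrounding arguments mirror the simulation-model call used later in this notebook:

import copy
from aimet_common.defs import QuantScheme
from aimet_onnx.quantsim import QuantizationSimModel

# Placeholder library paths - replace them with your own compiled custom-op libraries
sim_with_custom_ops = QuantizationSimModel(model=copy.deepcopy(model),
                                           quant_scheme=QuantScheme.post_training_tf_enhanced,
                                           default_activation_bw=8,
                                           default_param_bw=8,
                                           use_cuda=use_cuda,
                                           user_onnx_libs=['path/to/custom_op1.so', 'path/to/custom_op2.so'])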

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
import copy
+from aimet_common.defs import QuantScheme
+from aimet_onnx.quantsim import QuantizationSimModel
+
+sim = QuantizationSimModel(model=copy.deepcopy(model),
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_activation_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+
+
+
+
+

Compute Encodings

+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node.

+

For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.
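As a rough illustration of what an encoding is, the simple min/max (‘tf’) scheme derives a scale and offset from the observed activation range roughly as sketched below. The ‘tf_enhanced’ scheme used in this notebook instead searches for a range that minimizes quantization noise, so treat this only as a conceptual sketch:

import numpy as np

def min_max_encoding(observed_activations, bitwidth=8):
    """Conceptual sketch: derive a scale/offset encoding from observed min/max statistics."""
    obs_min = min(float(observed_activations.min()), 0.0)   # make sure zero is representable
    obs_max = max(float(observed_activations.max()), 0.0)
    num_steps = 2 ** bitwidth - 1
    scale = (obs_max - obs_min) / num_steps                  # step size between quantized levels
    offset = round(obs_min / scale)                          # integer zero-point
    return scale, offset

scale, offset = min_max_encoding(np.random.randn(10000))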

+

We create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metrics, so we can just ignore the model output. A few pointers regarding the data samples:

+
    +
  • In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.

  • +
  • It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes be covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario, such as using only ‘dark’ or ‘light’ samples - e.g. only using pictures captured at night might not give ideal results.

  • +
+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
def pass_calibration_data(session, samples):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+    input_name = session.get_inputs()[0].name
+
+    batch_cntr = 0
+    for input_data, target_data in data_loader:
+
+        inputs_batch = input_data.numpy()
+        session.run(None, {input_name : inputs_batch})
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+
+
+

4. Apply AdaRound

+

We can now apply AdaRound to this model.

+

Some of the parameters for AdaRound are described below

+
    +
  • dataloader: AdaRound needs a dataloader that iterates over unlabeled data for the layer-by-layer optimization to learn the rounding vectors. The dataloader should comply with the class signature expected by AdaRound.

  • +
  • num_batches: The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples; with a batch size of 32, this translates to roughly 64 batches. To speed up the execution here we are using num_batches=1.

  • +
  • default_num_iterations: The number of iterations used to AdaRound each layer. The default value is 10000 and we strongly recommend not reducing this number. But in this example we are using 32 to speed up the execution.

  • +
+
+
[ ]:
+
+
+
import os
+from aimet_onnx.adaround.adaround_weight import Adaround, AdaroundParameters
+
+# Dataloader satisfying the class signature required by AdaRound
+class DataLoader:
+    """
+    This dataloader derives unlabeled samples in the form of numpy arrays from a torch dataloader
+    """
+    def __init__(self):
+        self._torch_data_loader = ImageNetDataPipeline.get_val_dataloader()
+        self._iterator = None
+        self.batch_size = self._torch_data_loader.batch_size
+
+    def __iter__(self):
+        self._iterator = iter(self._torch_data_loader)
+        return self
+
+    def __next__(self):
+        input_data, _ = next(self._iterator)
+        return input_data.numpy()
+
+    def __len__(self):
+        return len(self._torch_data_loader)
+
+data_loader = DataLoader()
+params = AdaroundParameters(data_loader=data_loader, num_batches=1, default_num_iterations=32,
+                            forward_fn=pass_calibration_data, forward_pass_callback_args=1000)
+
+os.makedirs('./output/', exist_ok=True)
+ada_model = Adaround.apply_adaround(model, params,
+                                    path="output",
+                                    filename_prefix='adaround',
+                                    default_param_bw=8,
+                                    default_quant_scheme=QuantScheme.post_training_tf_enhanced)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the model after applying AdaRound. We again create a simulation model like before and evaluate it to determine the simulated quantized accuracy.

+

Note: There are two important things to understand in the following cell. - Parameter Bitwidth Precision: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() call.

+
    +
  • Freezing the parameter encodings: After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called before calling the compute_encodings() API. While applying AdaRound, the parameter values were rounded up or down based on the initial encodings created internally. For Quantization Simulation accuracy, it is important to freeze these encodings. If the parameter encodings are NOT frozen, the call to compute_encodings() will alter their values and the Quantization Simulation accuracy will not reflect the AdaRounded accuracy.

  • +
+
+
[ ]:
+
+
+
sim = QuantizationSimModel(model=ada_model,
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_activation_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+sim.set_and_freeze_param_encodings(encoding_path=os.path.join("output", 'adaround.encodings'))
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization, using the newly AdaRounded model with updated parameters.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after applying AdaRound. The settings used in this notebook are meant only to serve as code examples that run quickly, and may not be optimal. Please try this workflow against the model of your choice and play with the number of samples and other parameters to get the best results.

+

The next step would be to take this model to target. We need to do two things: - export the model with the updated weights without the fake quantization ops - export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
sim.export(path='./output/', filename_prefix='resnet18_after_adaround')
+
+
+
+
+
+
+

Summary

+

This example illustrated how the AIMET AdaRound API is invoked to achieve post-training quantization. To use AIMET AdaRound for your specific needs, replace the model and the data pipeline with your own. As indicated above, some parameters in this example were chosen to make it execute faster.

+

We hope this notebook was useful for you to understand how to use AIMET for performing AdaRound.

+

A few additional resources: - Refer to the AIMET API docs for more details on the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET post-training quantization and QAT techniques

+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/onnx/quantization/adaround.ipynb b/releases/1.32.2/Examples/onnx/quantization/adaround.ipynb new file mode 100644 index 00000000..e47cecf8 --- /dev/null +++ b/releases/1.32.2/Examples/onnx/quantization/adaround.ipynb @@ -0,0 +1,558 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Adaptive Rounding (AdaRound)\n", + "This notebook shows a working code example of how to use AIMET to perform Adaptive Rounding (AdaRound).\n", + "\n", + "AIMET quantization features typically use the \"nearest rounding\" technique for achieving quantization.\n", + "When using the \"nearest rounding\" technique, the weight value is quantized to the nearest integer value.\n", + "\n", + "AdaRound optimizes a loss function using unlabeled training data to decide whether to quantize a specific weight to the closer integer value or the farther one.\n", + "Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following:\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Convert an FP32 PyTorch model to ONNX and evaluate the model's baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Apply AdaRound and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results\n", + "* For example, it uses a relatively quantization-friendly model like Resnet18\n", + "* Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification.\n", + "If you already have a version of the dataset readily available, use that.\n", + "Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The dataloader provided in this example notebook relies on the ImageNet dataset having the following characteristics:\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples.\n", + "Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset.\n", + "E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class.\n", + "But for the purpose of running this notebook, you could reduce the dataset to 2 samples per class.\n", + "This exercise is left up to the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a ONNX model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import torch\n", + "import onnxruntime as ort\n", + "from Examples.common import image_net_config\n", + "from Examples.onnx.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(sess: ort.InferenceSession) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param sess: the model to evaluate\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(sess, iterations=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Convert an FP32 PyTorch model to ONNX and evaluate the model's baseline FP32 accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead or convert a model trained in a different framework altogether." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "import onnx\n", + "\n", + "input_shape = (1, 3, 224, 224) # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)\n", + "dummy_input = torch.randn(input_shape)\n", + "filename = \"./resnet18.onnx\"\n", + "\n", + "# Load a pretrained ResNet-18 model in torch\n", + "pt_model = resnet18(pretrained=True)\n", + "\n", + "# Export the torch model to onnx\n", + "torch.onnx.export(pt_model.eval(),\n", + " dummy_input,\n", + " filename,\n", + " export_params=True,\n", + " do_constant_folding=True,\n", + " input_names=['input'],\n", + " output_names=['output'],\n", + " dynamic_axes={\n", + " 'input' : {0 : 'batch_size'},\n", + " 'output' : {0 : 'batch_size'},\n", + " }\n", + " )\n", + "\n", + "model = onnx.load_model(filename)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We should decide whether to run the model on a CPU or CUDA device. This example code will use CUDA if available in your onnxruntime environment. You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# cudnn_conv_algo_search is fixing it to default to avoid changing in accuracies/outputs at every inference\n", + "if 'CUDAExecutionProvider' in ort.get_available_providers():\n", + " providers = [('CUDAExecutionProvider', {'cudnn_conv_algo_search': 'DEFAULT'}), 'CPUExecutionProvider']\n", + " use_cuda = True\n", + "else:\n", + " providers = ['CPUExecutionProvider']\n", + " use_cuda = False" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's create an onnxruntime session and determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sess = ort.InferenceSession(filename, providers=providers)\n", + "accuracy = ImageNetDataPipeline.evaluate(sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model.\n", + "These layers get folded into adjacent Convolutional layers. 
The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers.\n", + "Doing so results in an inferences/sec speedup since unnecessary computation is avoided.\n", + "\n", + "From a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy.\n", + "However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers.\n", + "\n", + "This can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision).\n", + "We want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers in-place on the given model:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_onnx.batch_norm_fold import fold_all_batch_norms_to_weight\n", + "\n", + "_ = fold_all_batch_norms_to_weight(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_activation_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "In case the ONNX model has **custom ops**, we need to specify the paths of compiled custom ops via **user_onnx_libs** parameter.\n", + "For example, user_onnx_libs=['path/to/custom_op1.so', 'path/to/custom_op2.so']\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import copy\n", + "from aimet_common.defs import QuantScheme\n", + "from aimet_onnx.quantsim import QuantizationSimModel\n", + "\n", + "sim = QuantizationSimModel(model=copy.deepcopy(model),\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " default_activation_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Compute Encodings\n", + "Even though AIMET has added 'quantizer' nodes to the model graph, the model is not ready to be used yet.\n", + "Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node.\n", + "\n", + "For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters.\n", + "This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "We create a routine to pass unlabeled data samples through the model.\n", + "This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model.\n", + "We don't need to compute any loss metrics, so we can just ignore the model output. A few pointers regarding the data samples:\n", + "\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings.\n", + " For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed.\n", + " It's not necessary that all classes need to be covered since we are only looking at the range of values at every layer activation.\n", + " However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings.\n", + "This routine can be written in many different ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def pass_calibration_data(session, samples):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + " input_name = sess.get_inputs()[0].name\n", + "\n", + " batch_cntr = 0\n", + " for input_data, target_data in data_loader:\n", + "\n", + " inputs_batch = input_data.numpy()\n", + " session.run(None, {input_name : inputs_batch})\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings.\n", + "Encodings here refer to scale/offset quantization parameters." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference.\n", + "First we can pass this model to the same evaluation routine we used before.\n", + "The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. Apply Adaround\n", + "\n", + "We can now apply AdaRound to this model.\n", + "\n", + "Some of the parameters for AdaRound are described below\n", + "\n", + "- **dataloader:** AdaRound needs a dataloader that iterates over unlabeled data for the layer-by-layer optimization to learn the rounding vectors. We should comply with the class signature for the dataloader which is expected by AdaRound.\n", + "- **num_batches:** The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples. So with a batch size of 32, this may translate to 64 batches. To speed up the execution here we are using a batch size of 1.\n", + "- **default_num_iterations:** The number of iterations to adaround each layer. Default value is set to 10000 and we strongly recommend to not reduce this number. But in this example we are using 32 to speed up the execution runtime." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "import os\n", + "from aimet_onnx.adaround.adaround_weight import Adaround, AdaroundParameters\n", + "\n", + "# Dataloader satisfying the class signature required by AdaRound\n", + "class DataLoader:\n", + " \"\"\"\n", + " This dataloader derives unlabeled samples in the form of numpy arrays from a torch dataloader\n", + " \"\"\"\n", + " def __init__(self):\n", + " self._torch_data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " self._iterator = None\n", + " self.batch_size = self._torch_data_loader.batch_size\n", + "\n", + " def __iter__(self):\n", + " self._iterator = iter(self._torch_data_loader)\n", + " return self\n", + "\n", + " def __next__(self):\n", + " input_data, _ = next(self._iterator)\n", + " return input_data.numpy()\n", + "\n", + " def __len__(self):\n", + " return len(self._torch_data_loader)\n", + "\n", + "data_loader = DataLoader()\n", + "params = AdaroundParameters(data_loader=data_loader, num_batches=1, default_num_iterations=32, \n", + " forward_fn=pass_calibration_data, forward_pass_callback_args=1000)\n", + "\n", + "os.makedirs('./output/', exist_ok=True)\n", + "ada_model = Adaround.apply_adaround(model, params,\n", + " path=\"output\", \n", + " filename_prefix='adaround', \n", + " default_param_bw=8,\n", + " default_quant_scheme=QuantScheme.post_training_tf_enhanced)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the model after applying Adaround.\n", + "We again create a simulation model like before and evaluate to determine simulated quantized accuracy.\n", + "\n", + "**Note:** There are two important things to understand in the following cell.\n", + " - **Parameter Biwidth Precision**: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() created.\n", + " \n", + " - **Freezing the parameter encodings**:\n", + "After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called before calling the compute_encodings() API.\n", + "While applying AdaRound, the parameter values have been rounded up or down based on these initial encodings internally created.\n", + "For Quantization Simulation accuracy, it is important to freeze these encodings.\n", + "If the parameters encodings are NOT frozen, the call to compute_encodings() will alter the value of the parameters encodings and Quantization Simulation accuracy will not reflect the AdaRounded accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(model=ada_model,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " default_activation_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)\n", + "\n", + "sim.set_and_freeze_param_encodings(encoding_path=os.path.join(\"output\", 'adaround.encodings'))\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference.\n", + "First we can pass this model to the same evaluation routine we used before.\n", + "The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization, using the newly AdaRounded model with updated parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after applying AdaRound.\n", + "The settings used in this notebook are designed only to serve as code examples, designed to run quickly, but may not be optimal.\n", + "Please try this workflow against the model of your choice and play with the number of samples and other parameters to get the best results.\n", + "\n", + "The next step would be to take this model to target.\n", + "We need to do two things:\n", + "- export the model with the updated weights without the fake quantization ops\n", + "- export the encodings (scale/offset quantization parameters).\n", + "AIMET QuantizationSimModel provides an export API for this purpose." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "sim.export(path='./output/', filename_prefix='resnet18_after_adaround')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This example illustrated how the AIMET AdaRound API is invoked to achieve post training quantization.\n", + "To use AIMET AdaRound for your specific needs, replace the model with your model and replace the data pipeline with your data pipeline.\n", + "As indicated above, some parameters in this example have been chosen in such a way to make this example execute faster.\n", + "\n", + "We hope this notebook was useful for you to understand how to use AIMET for performing AdaRound.\n", + "\n", + "A few additional resources:\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/releases/1.32.2/Examples/onnx/quantization/cle.html b/releases/1.32.2/Examples/onnx/quantization/cle.html new file mode 100644 index 00000000..d210bea5 --- /dev/null +++ b/releases/1.32.2/Examples/onnx/quantization/cle.html @@ -0,0 +1,1416 @@ + + + + + + Cross-Layer Equalization (CLE) — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Cross-Layer Equalization (CLE)

+

This notebook showcases a working code example of how to use AIMET to apply Cross-Layer Equalization (CLE). CLE is a post-training quantization technique that aims to improve the quantized accuracy of a given model. CLE does not need any data samples. This technique helps recover quantized accuracy when the model is sensitive to parameter quantization as opposed to activation quantization.

+

To learn more about this technique, please refer to the “Data-Free Quantization Through Weight Equalization and Bias Correction” paper from ICCV 2019 - https://arxiv.org/abs/1906.04721

+

Cross-Layer Equalization: AIMET performs the following steps when running CLE:
1. Batch Norm Folding: Folds BN layers into Conv layers immediately before or after the Conv layers.
2. Cross-Layer Scaling: Given a set of consecutive Conv layers, equalizes the range of tensor values per-channel by scaling up/down the per-channel weight tensor values of a layer and correspondingly scaling down/up the per-channel weight tensor values of the subsequent layer.
3. High Bias Folding: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer’s parameters.
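To make the cross-layer scaling step concrete, here is a conceptual sketch (not the AIMET implementation) of equalizing the per-channel weight ranges of two consecutive layers. For ReLU-like activations, dividing output channel i of the first layer by s_i and multiplying the matching input channel of the second layer by s_i leaves the composed function unchanged:

import numpy as np

def cross_layer_scale(w1, w2):
    """Conceptual sketch: w1 has shape (c, m) with one row per output channel,
    w2 has shape (n, c) with one column per input channel."""
    r1 = np.abs(w1).max(axis=1)               # per-output-channel range of layer 1
    r2 = np.abs(w2).max(axis=0)               # per-input-channel range of layer 2
    s = np.sqrt(r1 / r2)                      # after rescaling, both ranges become sqrt(r1 * r2)
    return w1 / s[:, None], w2 * s[None, :]   # the layer-1 bias would be divided by s as well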

+
+

Overall flow

+

This notebook covers the following:
1. Instantiate the example evaluation and training pipeline
2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy
3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score
4. Apply CLE and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The ImageNet dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them: - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per image sample

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still an ONNX model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import torch
+import onnxruntime as ort
+from Examples.common import image_net_config
+from Examples.onnx.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: ort.InferenceSession) -> float:
+        """
+        Given an onnxruntime inference session, evaluates the model's Top-1 accuracy on the dataset
+        :param sess: the onnxruntime InferenceSession to evaluate
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(sess, iterations=None)
+
+
+
+
+
+
+

2. Convert an FP32 PyTorch model to ONNX and evaluate the model’s baseline FP32 accuracy

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead or convert a model trained in a different framework altogether.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+import onnx
+
+input_shape = (1, 3, 224, 224)    # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)
+dummy_input = torch.randn(input_shape)
+filename = "./resnet18.onnx"
+
+# Load a pretrained ResNet-18 model in torch
+pt_model = resnet18(pretrained=True)
+
+# Export the torch model to onnx
+torch.onnx.export(pt_model.eval(),
+                  dummy_input,
+                  filename,
+                  training=torch.onnx.TrainingMode.PRESERVE,
+                  export_params=True,
+                  do_constant_folding=False,
+                  input_names=['input'],
+                  output_names=['output'],
+                  dynamic_axes={
+                      'input' : {0 : 'batch_size'},
+                      'output' : {0 : 'batch_size'},
+                  }
+                  )
+
+model = onnx.load_model(filename)
+
+
+
+
+

We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
# Fix cudnn_conv_algo_search to DEFAULT to avoid variations in accuracies/outputs across inference runs
+if 'CUDAExecutionProvider' in ort.get_available_providers():
+    providers = [('CUDAExecutionProvider', {'cudnn_conv_algo_search': 'DEFAULT'}), 'CPUExecutionProvider']
+    use_cuda = True
+else:
+    providers = ['CPUExecutionProvider']
+    use_cuda = False
+
+
+
+
+

Let’s create an onnxruntime session and determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
sess = ort.InferenceSession(filename, providers=providers)
+accuracy = ImageNetDataPipeline.evaluate(sess)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating point compute perspective, a BN-folded model is mathematically equivalent at inference time to a model with BN layers, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. This can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+

The following code calls AIMET to fold the BN layers in-place on the given model

+
+
[ ]:
+
+
+
from aimet_onnx.batch_norm_fold import fold_all_batch_norms_to_weight
+
+_ = fold_all_batch_norms_to_weight(model)
+
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: We set this to “QuantScheme.post_training_tf_enhanced”. Supported options are ‘tf_enhanced’ or ‘tf’, or the Quant Scheme Enum values QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced
- default_activation_bw: Setting this to 8 essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision
- default_param_bw: Setting this to 8 essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision
- num_batches: The number of batches used to evaluate the model while calculating the quantization encodings. The number of images across these batches should be sufficient for computing encodings
- rounding_mode: The rounding mode used for quantization. There are two possible choices here - ‘nearest’ or ‘stochastic’. We will use “nearest.”

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_onnx.quantsim import QuantizationSimModel
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_activation_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+
+
+
+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metrics, so we can just ignore the model output for this purpose. A few pointers regarding the data samples:

+

In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples, yet for computing encodings we only need 500 to 1000 samples. It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes be covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario, such as using only ‘dark’ or ‘light’ samples - e.g. only using pictures captured at night might not give ideal results. The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
def pass_calibration_data(session, samples):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+    input_name = session.get_inputs()[0].name
+
+    batch_cntr = 0
+    for input_data, target_data in data_loader:
+
+        inputs_batch = input_data.numpy()
+        session.run(None, {input_name : inputs_batch})
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+
+
+

4.1 Cross-Layer Equalization

+

The next cell performs cross-layer equalization on the model. As noted before, the function folds batch norms, applies cross-layer scaling, and then folds high biases.

+

Note: Interestingly, CLE needs BN statistics for its procedure. If a BN folded model is provided, CLE will run the CLS (cross-layer scaling) optimization step but will skip the HBA (high-bias absorption) step. To avoid this, we simply load the original model again before running CLE.

+

Note: CLE equalizes the model in-place

+
+
[ ]:
+
+
+
filename = "./resnet18.onnx"
+model = onnx.load_model(filename)
+
+
+
+
+
[ ]:
+
+
+
from aimet_onnx.cross_layer_equalization import equalize_model
+
+equalize_model(model)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the equalized model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.

+
+
[ ]:
+
+
+
sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_activation_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful in helping you understand how to use AIMET to perform Cross-Layer Equalization (CLE).

+

A few additional resources:
- Refer to the AIMET API docs for more details on the APIs and optional parameters.
- Refer to the other example notebooks to learn how to use AIMET post-training quantization and QAT techniques.

+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/onnx/quantization/cle.ipynb b/releases/1.32.2/Examples/onnx/quantization/cle.ipynb new file mode 100644 index 00000000..463e5378 --- /dev/null +++ b/releases/1.32.2/Examples/onnx/quantization/cle.ipynb @@ -0,0 +1,551 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Cross-Layer Equalization (CLE)\n", + "\n", + "This notebook showcases a working code example of how to use AIMET to apply Cross-Layer Equalization (CLE). CLE is a post-training quantization techniques that aim to improve quantized accuracy of a given model. CLE does not need any data samples. This technique help recover quantized accuracy when the model quantization is sensitive to parameter quantization as opposed to activation quantization.\n", + "\n", + "To learn more about this technique, please refer to the \"Data-Free Quantization Through Weight Equalization and Bias Correction\" paper from ICCV 2019 - https://arxiv.org/abs/1906.04721\n", + "\n", + "**Cross-Layer Equalization**\n", + "AIMET performs the following steps when running CLE:\n", + "1. Batch Norm Folding: Folds BN layers into Conv layers immediate before or after the Conv layers.\n", + "2. Cross-Layer Scaling: Given a set of consecutive Conv layers, equalizes the range of tensor values per-channel by scaling up/down per-channel weight tensor values of a layer and corresponding scaling down/up per-channel weight tensor values of the subsequent layer.\n", + "3. High Bias Folding: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer's parameters.\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Apply CLE and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. 
Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import torch\n", + "import onnxruntime as ort\n", + "from Examples.common import image_net_config\n", + "from Examples.onnx.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(sess: ort.InferenceSession) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param sess: the model to evaluate\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(sess, iterations=None)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "\n", + "## 2. 
Convert an FP32 PyTorch model to ONNX and evaluate the model's baseline FP32 accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead or convert a model trained in a different framework altogether." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "import onnx\n", + "\n", + "input_shape = (1, 3, 224, 224) # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)\n", + "dummy_input = torch.randn(input_shape)\n", + "filename = \"./resnet18.onnx\"\n", + "\n", + "# Load a pretrained ResNet-18 model in torch\n", + "pt_model = resnet18(pretrained=True)\n", + "\n", + "# Export the torch model to onnx\n", + "torch.onnx.export(pt_model.eval(),\n", + " dummy_input,\n", + " filename,\n", + " training=torch.onnx.TrainingMode.PRESERVE,\n", + " export_params=True,\n", + " do_constant_folding=False,\n", + " input_names=['input'],\n", + " output_names=['output'],\n", + " dynamic_axes={\n", + " 'input' : {0 : 'batch_size'},\n", + " 'output' : {0 : 'batch_size'},\n", + " }\n", + " )\n", + "\n", + "model = onnx.load_model(filename)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "# cudnn_conv_algo_search is fixing it to default to avoid changing in accuracies/outputs at every inference\n", + "if 'CUDAExecutionProvider' in ort.get_available_providers():\n", + " providers = [('CUDAExecutionProvider', {'cudnn_conv_algo_search': 'DEFAULT'}), 'CPUExecutionProvider']\n", + " use_cuda = True\n", + "else:\n", + " providers = ['CPUExecutionProvider']\n", + " use_cuda = False" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "Let's create an onnxruntime session and determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + }, + "scrolled": true + }, + "outputs": [], + "source": [ + "sess = ort.InferenceSession(filename, providers=providers)\n", + "accuracy = ImageNetDataPipeline.evaluate(sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. 
The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers in-place on the given model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_onnx.batch_norm_fold import fold_all_batch_norms_to_weight\n", + "\n", + "_ = fold_all_batch_norms_to_weight(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## Create Quantization Sim Model\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "- **num_batches**: The number of batches used to evaluate the model while calculating the quantization encodings.Number of batches to use for computing encodings. Only 5 batches are used here to speed up the process. In addition, the number of images in these 5 batches should be sufficient for compute encodings\n", + "- **rounding_mode**: The rounding mode used for quantization. There are two possible choices here - 'nearest' or 'stochastic' We will use \"nearest.\"\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_onnx.quantsim import QuantizationSimModel\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " default_activation_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "\n", + "In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "def pass_calibration_data(session, samples):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + " input_name = sess.get_inputs()[0].name\n", + "\n", + " batch_cntr = 0\n", + " for input_data, target_data in data_loader:\n", + "\n", + " inputs_batch = input_data.numpy()\n", + " session.run(None, {input_name : inputs_batch})\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## 4. 1 Cross Layer Equalization\n", + "\n", + "The next cell performs cross-layer equalization on the model. As noted before, the function folds batch norms, applies cross-layer scaling, and then folds high biases.\n", + "\n", + "**Note:** Interestingly, CLE needs BN statistics for its procedure. If a BN folded model is provided, CLE will run the CLS (cross-layer scaling) optimization step but will skip the HBA (high-bias absorption) step. To avoid this, we simply load the original model again before running CLE.\n", + "\n", + "**Note:** CLE equalizes the model in-place" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "filename = \"./resnet18.onnx\"\n", + "model = onnx.load_model(filename)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_onnx.cross_layer_equalization import equalize_model\n", + "\n", + "equalize_model(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the equalized model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + }, + "scrolled": true + }, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " default_activation_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)\n", + "\n", + "accuracy = ImageNetDataPipeline.evaluate(sim.model)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing Cross Layer Equalization (CLE).\n", + "\n", + "Few additional resources\n", + "- Refer to the [AIMET API docs](https://quic.github.io/aimet-pages/AimetDocs/api_docs/index.html) to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/onnx/quantization/quantsim.html b/releases/1.32.2/Examples/onnx/quantization/quantsim.html new file mode 100644 index 00000000..5936194a --- /dev/null +++ b/releases/1.32.2/Examples/onnx/quantization/quantsim.html @@ -0,0 +1,1367 @@ + + + + + + Quantization Simulation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantization Simulation

+

This notebook shows a working code example of how to use AIMET to perform quantization simulation (quantsim). Quantsim is an AIMET feature that adds quantization simulation ops (also sometimes called fake quantization ops) to a trained ML model in order to compute quantization encodings and estimate the resulting accuracy of the model when deployed on quantized ML accelerators.

+
+

Overall flow

+

This notebook covers the following 1. Instantiate the example evaluation pipeline 2. Convert an FP32 PyTorch model to ONNX and evaluate the model’s baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art quantized accuracy. For example, it uses a relatively quantization-friendly model like Resnet18. Also, optimization techniques such as Quantization-Aware Training, AdaRound, and Cross-Layer Equalization can be employed to improve the accuracy of quantized models.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The ImageNet dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them:
- Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details.
- A subdirectory per class, and a file per image sample.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation pipeline

+

The following is an example validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model and provide a QuantizationSim session that acts as a regular onnxruntime inference session. However, it is recommended that users only use inference sessions created by the QuantizationSimModel, as this will automatically register the required custom operators.

  • +
+
+
[ ]:
+
+
+
import torch
+import onnxruntime as ort
+from Examples.common import image_net_config
+from Examples.onnx.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: ort.InferenceSession) -> float:
+        """
+        Given an onnxruntime inference session, evaluates its Top-1 accuracy on the dataset
+        :param sess: the onnxruntime inference session to evaluate
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(sess, iterations=None)
+
+
+
+
+
+
+

2. Convert an FP32 PyTorch model to ONNX and evaluate the model’s baseline FP32 accuracy

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead or convert a model trained in a different framework altogether.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+import onnx
+
+input_shape = (1, 3, 224, 224)    # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)
+dummy_input = torch.randn(input_shape)
+filename = "./resnet18.onnx"
+
+# Load a pretrained ResNet-18 model in torch
+pt_model = resnet18(pretrained=True)
+
+# Export the torch model to onnx
+torch.onnx.export(pt_model.eval(),
+                  dummy_input,
+                  filename,
+                  training=torch.onnx.TrainingMode.PRESERVE,
+                  export_params=True,
+                  do_constant_folding=False,
+                  input_names=['input'],
+                  output_names=['output'],
+                  dynamic_axes={
+                      'input' : {0 : 'batch_size'},
+                      'output' : {0 : 'batch_size'},
+                  }
+                  )
+
+model = onnx.load_model(filename)
+
+
+
+
+

We should decide whether to run the model on a CPU or CUDA device. This example code will use CUDA if available in your onnxruntime environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
# Fix cudnn_conv_algo_search to DEFAULT so that accuracies/outputs do not change across inference runs
+if 'CUDAExecutionProvider' in ort.get_available_providers():
+    providers = [('CUDAExecutionProvider', {'cudnn_conv_algo_search': 'DEFAULT'}), 'CPUExecutionProvider']
+    use_cuda = True
+else:
+    providers = ['CPUExecutionProvider']
+    use_cuda = False
+
+
+
+
+

Let’s create an onnxruntime session and determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
sess = ort.InferenceSession(filename, providers=providers)
+accuracy = ImageNetDataPipeline.evaluate(sess)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating-point compute perspective, a BN-folded model is mathematically equivalent to the model with BN layers for inference and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers, and this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+

The following code calls AIMET to fold the BN layers in-place on the given model

+
+
[ ]:
+
+
+
from aimet_onnx.batch_norm_fold import fold_all_batch_norms_to_weight
+
+_ = fold_all_batch_norms_to_weight(model)
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: We set this to QuantScheme.post_training_tf_enhanced. Supported options are ‘tf_enhanced’ or ‘tf’, or the Quant Scheme Enum values QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced.
- default_activation_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
- default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

If the ONNX model has custom ops, we need to specify the paths of the compiled custom op libraries via the user_onnx_libs parameter, for example user_onnx_libs=['path/to/custom_op1.so', 'path/to/custom_op2.so']. A minimal sketch of this is shown after the next cell.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_onnx.quantsim import QuantizationSimModel
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_activation_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+
+
+
+
+
+
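If your model did contain custom ops, the cell below sketches how the user_onnx_libs parameter mentioned above could be passed. This is only an illustrative sketch: the .so paths and the sim_custom_ops name are placeholders, and this cell is not needed for the plain ResNet18 model used in this notebook.

[ ]:

# Hypothetical paths to compiled custom op libraries - replace with real files if your model needs them
+custom_op_libs = ['/path/to/custom_op1.so', '/path/to/custom_op2.so']
+
+sim_custom_ops = QuantizationSimModel(model=model,
+                                      quant_scheme=QuantScheme.post_training_tf_enhanced,
+                                      default_activation_bw=8,
+                                      default_param_bw=8,
+                                      use_cuda=use_cuda,
+                                      user_onnx_libs=custom_op_libs)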

Compute Encodings

+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics, which then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples:
- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.
- It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes need to be covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all ‘dark’ or ‘light’ samples being used - e.g. only using pictures captured at night might not give ideal results.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many ways, this is just an example.

+
+
[ ]:
+
+
+
def pass_calibration_data(session, samples):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+    input_name = session.get_inputs()[0].name
+
+    batch_cntr = 0
+    for input_data, target_data in data_loader:
+
+        inputs_batch = input_data.numpy()
+        session.run(None, {input_name : inputs_batch})
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+
+

Summary

+

We hope this notebook was useful in helping you understand how to use AIMET to perform quantization simulation.

+

Additional resources:
- Refer to the AIMET API docs for more details on the APIs and optional parameters.

+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/onnx/quantization/quantsim.ipynb b/releases/1.32.2/Examples/onnx/quantization/quantsim.ipynb new file mode 100644 index 00000000..53eb57c4 --- /dev/null +++ b/releases/1.32.2/Examples/onnx/quantization/quantsim.ipynb @@ -0,0 +1,479 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "# Quantization Simulation\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform quantization simulation (quantsim). Quantsim is an AIMET feature that adds quantization simulation ops (also called fake quantization ops sometimes) to a trained ML model in order to compute quantization encodings and estimate the resulting accuracy of the model when deployed on quantized ML accelerators.\n", + "\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation pipeline\n", + "2. Convert an FP32 PyTorch model to ONNX and evaluate the model's baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art quantized accuracy. For example, it uses a relatively quantization-friendly model like Resnet18. Also, optimization techniques such as Quantization-Aware Training, AdaRound, and Cross-Layer Equalization can be employed to improve the accuracy of quantized models." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "## 1. 
Example evaluation pipeline\n", + "\n", + "The following is an example validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model and provide a QuantizationSim session that acts as a regular onnxruntime inference session. However, it is recommended that users only use inference sessions created by the QuantizationSimModel, as this will automatically register the required custom operators.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "import torch\n", + "import onnxruntime as ort\n", + "from Examples.common import image_net_config\n", + "from Examples.onnx.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(sess: ort.InferenceSession) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param sess: the model to evaluate\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(sess, iterations=None)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "## 2. Convert an FP32 PyTorch model to ONNX and evaluate the model's baseline FP32 accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead or convert a model trained in a different framework altogether." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "import onnx\n", + "\n", + "input_shape = (1, 3, 224, 224) # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)\n", + "dummy_input = torch.randn(input_shape)\n", + "filename = \"./resnet18.onnx\"\n", + "\n", + "# Load a pretrained ResNet-18 model in torch\n", + "pt_model = resnet18(pretrained=True)\n", + "\n", + "# Export the torch model to onnx\n", + "torch.onnx.export(pt_model.eval(),\n", + " dummy_input,\n", + " filename,\n", + " training=torch.onnx.TrainingMode.PRESERVE,\n", + " export_params=True,\n", + " do_constant_folding=False,\n", + " input_names=['input'],\n", + " output_names=['output'],\n", + " dynamic_axes={\n", + " 'input' : {0 : 'batch_size'},\n", + " 'output' : {0 : 'batch_size'},\n", + " }\n", + " )\n", + "\n", + "model = onnx.load_model(filename)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "We should decide whether to run the model on a CPU or CUDA device. This example code will use CUDA if available in your onnxruntime environment. You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + }, + "pycharm": { + "is_executing": true + } + }, + "outputs": [], + "source": [ + "# cudnn_conv_algo_search is fixing it to default to avoid changing in accuracies/outputs at every inference\n", + "if 'CUDAExecutionProvider' in ort.get_available_providers():\n", + " providers = [('CUDAExecutionProvider', {'cudnn_conv_algo_search': 'DEFAULT'}), 'CPUExecutionProvider']\n", + " use_cuda = True\n", + "else:\n", + " providers = ['CPUExecutionProvider']\n", + " use_cuda = False" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "Let's create an onnxruntime session and determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sess = ort.InferenceSession(filename, providers=providers)\n", + "accuracy = ImageNetDataPipeline.evaluate(sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. 
Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers in-place on the given model\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from aimet_onnx.batch_norm_fold import fold_all_batch_norms_to_weight\n", + "\n", + "_ = fold_all_batch_norms_to_weight(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_activation_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "In case the ONNX model has **custom ops**, we need to specify the paths of compiled custom ops via **user_onnx_libs** parameter.\n", + "For example, user_onnx_libs=['path/to/custom_op1.so', 'path/to/custom_op2.so']\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_onnx.quantsim import QuantizationSimModel\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " default_activation_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "## Compute Encodings\n", + "Even though AIMET has added 'quantizer' nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization node, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. 
AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "def pass_calibration_data(session, samples):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + " input_name = sess.get_inputs()[0].name\n", + "\n", + " batch_cntr = 0\n", + " for input_data, target_data in data_loader:\n", + "\n", + " inputs_batch = input_data.numpy()\n", + " session.run(None, {input_name : inputs_batch})\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing QuantizationSimulation.\n", + "\n", + "Additional resources\n", + "- Refer to the [AIMET API docs](https://quic.github.io/aimet-pages/AimetDocs/api_docs/index.html) to know more details of the APIs and optional parameters." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.10" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/releases/1.32.2/Examples/tensorflow/compression/channel_pruning.html b/releases/1.32.2/Examples/tensorflow/compression/channel_pruning.html new file mode 100644 index 00000000..20cf0590 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/compression/channel_pruning.html @@ -0,0 +1,1469 @@ + + + + + + Model Compression Using Channel Pruning — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Model Compression Using Channel Pruning

+

This notebook shows a working code example of how to use AIMET to perform model compression. The Channel Pruning technique is used in this notebook to achieve model compression.

+

Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.

+
    +
  1. Spatial SVD: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique decomposes a single convolutional layer into two. The weight tensor of the layer to be split is flattened to a 2D matrix, and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers.

  2. +
  3. Channel Pruning: In this technique, AIMET discards the least significant input channels (using a magnitude metric) of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have their channel dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance between the compressed layer output and the corresponding layer output of the original model. A small NumPy sketch of the magnitude metric is shown after this list.

  4. +
+
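To make the ‘magnitude metric’ mentioned in the Channel Pruning description above concrete, the cell below is a small standalone NumPy sketch of the channel-selection idea: rank the input channels of a convolution by the L2 norm of the weights that read from them and keep only the strongest ones. This is only an illustration of the ranking step, not AIMET’s actual implementation, which also reconstructs the remaining weights and adjusts the upstream layers.

[ ]:

import numpy as np
+
+# Toy Conv2D weight tensor in TensorFlow layout: (kernel_h, kernel_w, input_channels, output_channels)
+weight = np.random.randn(3, 3, 16, 8)
+
+# Magnitude metric per input channel: L2 norm over all weights reading from that channel
+channel_magnitudes = np.sqrt((weight ** 2).sum(axis=(0, 1, 3)))
+
+# Keep the 75% of input channels with the largest magnitudes and drop the rest
+num_keep = int(round(0.75 * weight.shape[2]))
+keep_idx = np.sort(np.argsort(channel_magnitudes)[-num_keep:])
+
+pruned_weight = weight[:, :, keep_idx, :]
+print(pruned_weight.shape)   # -> (3, 3, 12, 8)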

Both of the above techniques are structured pruning techniques that aim to reduce the computational MACs or memory requirements of the model. After applying either of these techniques, the compressed model needs to be fine-tuned (i.e., trained again for a few epochs) to recover an accuracy close to that of the original model.

+

This notebook shows a working code example of how technique #2 (Channel Pruning) can be used to compress the model. You can find separate notebooks for #1, and for #1 followed by #2, in the same folder.

+
+

Overall flow

+
+
This notebook covers the following 1. Instantiate the example evaluation and training pipeline 2. Load the model and evaluate it to find the baseline accuracy 3. Compress the model and fine-tune:
+
3.1 Compress model using Channel Pruning and evaluate it to find post-compression accuracy
+
3.2 Fine-tune the model
+
+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The ImageNet tfrecords dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them: a folder containing tfrecords files starting with ‘train*’ for training files and ‘valid*’ for validation files. Each tfrecord file should have the features ‘image/encoded’ for the image data and ‘image/class/label’ for its corresponding class.

+
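For reference, a record with the features named in Note1 could be parsed roughly as shown below. This is only a minimal sketch: the feature dtypes and JPEG encoding are assumptions about how the tfrecords were created, and the example dataloader used later in this notebook already handles parsing for you.

[ ]:

import tensorflow.compat.v1 as tf
+
+# Assumed schema for one ImageNet tfrecord example (feature names from Note1 above)
+feature_spec = {
+    'image/encoded': tf.io.FixedLenFeature([], tf.string),
+    'image/class/label': tf.io.FixedLenFeature([], tf.int64),
+}
+
+def parse_example(serialized_example):
+    parsed = tf.io.parse_single_example(serialized_example, feature_spec)
+    image = tf.image.decode_jpeg(parsed['image/encoded'], channels=3)
+    label = parsed['image/class/label']
+    return image, label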

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
TFRECORDS_DIR = '/path/to/tfrecords/dir/'        # Please replace this with a real directory
+
+
+
+

We disable logging at the INFO level and disable eager execution. We set the verbosity level to ERROR, so TensorFlow will only display messages that have severity ERROR or higher.

+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow.compat.v1 as tf
+tf.disable_eager_execution()
+tf.logging.set_verbosity(tf.logging.ERROR)
+
+
+
+
+
[ ]:
+
+
+
from typing import List
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.Session, iterations: int = None, use_cuda: bool = False) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The TF session whose graph is to be evaluated.
+        :param iterations: The number of validation batches to evaluate on.
+        :param use_cuda: Unused here; kept to match the eval callback signature expected by AIMET.
+        :return: The Top-1 accuracy of the model on the validation dataset.
+        """
+        evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess, iterations)
+
+
+    @staticmethod
+    def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):
+        """
+        Given a TF session, finetunes it to improve its accuracy
+        :param sess: The sess graph to fine-tune.
+        :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).
+                                tf.GraphKeys.UPDATE_OPS collections is always used
+                                in addition to this list
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.
+        """
+        trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                  data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_epochs=epochs, format_bgr=True)
+
+        trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)
+
+
+
+
+
+
+

2. Load the model and evaluate it to find the baseline accuracy

+

For this example notebook, we are going to load a pretrained ResNet50 model from keras and convert it to a tensorflow session. Similarly, you can load any pretrained tensorflow model instead.

+

Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

+

By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training.
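As a reminder of what that dependency typically looks like in TF1-style training code, here is a minimal, illustrative sketch. It is not part of this notebook's pipeline (the ImageNetTrainer used below handles this internally), and the loss, placeholder and optimizer are placeholders chosen only for illustration.

import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # already done earlier in this notebook

# Toy graph purely for illustration
labels = tf.placeholder(tf.float32, shape=[None], name='dummy_labels')
w = tf.Variable(1.0)
loss = tf.reduce_mean(tf.square(labels * w))

# BatchNorm moving-average updates live in the UPDATE_OPS collection; running
# them alongside the optimizer step keeps the BN statistics in sync.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9).minimize(loss)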

+
+
[ ]:
+
+
+
from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+
+model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+update_ops_name = [op.name for op in model.updates] # Used for finetuning
+
+
+
+

The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+model = update_keras_bn_ops_trainable_flag(model, load_save_path="./", trainable=False)
+
+
+
+

AIMET features currently support tensorflow sessions. add_image_net_computational_nodes_in_graph adds an output layer, softmax and loss functions to the Resnet50 model graph.

+
+
[ ]:
+
+
+
from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+
+sess = tf.keras.backend.get_session()
+
+# Creates the computation graph of ResNet within the tensorflow session.
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
+
+

Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here.

+
+
[ ]:
+
+
+
input_op_names = [model.input.name.split(":")[0]]
+output_op_names = [model.output.name.split(":")[0]]
+starting_op_names = input_op_names.copy()  # copy so appending 'labels' below does not modify input_op_names
+starting_op_names.append('labels')
+
+
+
+

We check whether TensorFlow is using a CPU or a CUDA device. This example code will use CUDA if it is available in your current execution environment.

+
+
[ ]:
+
+
+
use_cuda = tf.test.is_gpu_available(cuda_only=True)
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sess)
+print(accuracy)
+
+
+
+
+
+

3. Compress the model and fine-tune

+
+

3.1. Compress model using Channel Pruning and evaluate it to find post-compression accuracy

+

Now we use AIMET to define compression parameters for Channel Pruning, a few of which are explained here

+
    +
  • target_comp_ratio: The desired compression ratio for Channel Pruning. This value denotes the fraction of the original model’s cost that should remain after compression: to compress the model to 20% of its original size, use 0.2 (an 80% compression). In this notebook we use 0.9, which compresses the model by 10%.

  • +
  • num_comp_ratio_candidates: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes how many different compression ratios AIMET tries for each layer. We are using 3 here, which translates to candidate ratios of 0.33, 0.66 and 1.00 at each layer. A value of around 10 is more typical. The higher the number of candidates, the more granular the measurements for each layer, but also the longer these measurements take.

  • +
  • modules_to_ignore: This list can contain references to model layers that should be ignored during compression. We have added the first convolutional layer to preserve the way the input interacts with the model; other layers can be added too if desired.

  • +
  • mode: We are choosing Auto mode, which means AIMET performs a per-layer compressibility analysis and determines how much to compress each layer. The alternative choice is Manual.

  • +
  • data_loader: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.

  • +
  • num_reconstruction_samples: During the last stage of Channel Pruning, the Compression API maps the outputs of the pruned model to those of the original model through linear regression, and uses the result to update the weights of the pruned layer. The regression is performed with this many random samples. We are using 10 here, which is an unrealistically low number that simply lets this notebook execute quickly; a typical setting would be ~1000 samples.

  • +
  • allow_custom_downsample_ops: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can allow more convolutional layers to be considered for pruning, but it may increase the memory bandwidth overhead of the additional downsample layers, so there is a trade-off. We suggest leaving it disabled by default.

  • +
  • eval_callback: The model evaluation function. The expected signature of the evaluate function should be <function_name>(model, eval_iterations, use_cuda) and it is expected to return an accuracy metric.

  • +
  • eval_iterations: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. It is expected that the eval callback would use the same samples for every invocation of the callback.

  • +
  • compress_scheme: We choose the ‘channel pruning’ compression scheme.

  • +
  • cost_metric: Determines whether the desired compression ratio targets a reduction in MACs or in memory. We are choosing ‘mac’ here.

  • +
+

The next cell defines the actual Channel Pruning parameters. Parameters can be specified in one of two modes - Auto and Manual. For Auto, the only option is a greedy selection scheme, where a compression ratio is selected for each layer from a fixed list of candidates so that the overall target ratio (the target_comp_ratio discussed above) is reached. For Manual, you have to specify the compression ratio for each layer yourself; a general rule of thumb is to start from the ratios found by Auto mode.

+
+
[ ]:
+
+
+
from aimet_common.defs import CompressionScheme, CostMetric
+from aimet_tensorflow.defs import GreedySelectionParameters, ChannelPruningParameters
+from decimal import Decimal
+
+greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.9),
+                                          num_comp_ratio_candidates=3)
+
+modules_to_ignore = [sess.graph.get_operation_by_name('conv1_conv/Conv2D')]
+auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                      modules_to_ignore=modules_to_ignore)
+
+data_loader = ImageNetDataPipeline.get_val_dataloader()
+params = ChannelPruningParameters(input_op_names=starting_op_names,
+                                  output_op_names=output_op_names,
+                                  data_set=data_loader.dataset,
+                                  batch_size=data_loader.batch_size,
+                                  num_reconstruction_samples=10,
+                                  allow_custom_downsample_ops=False,
+                                  mode=ChannelPruningParameters.Mode.auto,
+                                  params=auto_params)
+
+eval_callback = ImageNetDataPipeline.evaluate
+eval_iterations = 1
+compress_scheme = CompressionScheme.channel_pruning
+cost_metric = CostMetric.mac
+
+
+
+
+

We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics.

+

Note: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline. This returns both the new model, which is saved, as well as relevant statistics. Finally, the compressed model is evaluated on the dataset.

+
+
[ ]:
+
+
+
from aimet_tensorflow.compress import ModelCompressor
+
+# Create the output directory used as the compression working directory and for saving models
+os.makedirs('./output/', exist_ok=True)
+
+compressed_sess, comp_stats = ModelCompressor.compress_model(sess=sess,
+                                                             working_dir="output",
+                                                             eval_callback=eval_callback,
+                                                             eval_iterations=eval_iterations,
+                                                             input_shape=(1, 3, 224, 224),
+                                                             compress_scheme=compress_scheme,
+                                                             cost_metric=cost_metric,
+                                                             parameters=params)
+
+print(comp_stats)
+
+
+
+
+

Now the compressed model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before to calculate the compressed model’s accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(compressed_sess, iterations=1)
+print(accuracy)
+
+
+
+
+

As you can see, the model accuracy fell sharply after compression. This is expected. We will use fine-tuning to recover the lost accuracy.

+
+
+

3.2. Fine-tune the model

+

After the model is compressed using Channel Pruning, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.
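For reference, the “drop the learning rate by a factor of 10 every 5 epochs” schedule mentioned above can be expressed as a staircase exponential decay in TF1-style code. This is only an illustration; the example trainer in this repository drives its schedule through the decay_steps argument of finetune() below, and the steps_per_epoch value here is a hypothetical placeholder.

import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # already done earlier in this notebook

global_step = tf.train.get_or_create_global_step()
steps_per_epoch = 5005  # hypothetical: number of training batches per epoch

learning_rate = tf.train.exponential_decay(learning_rate=1e-3,
                                            global_step=global_step,
                                            decay_steps=5 * steps_per_epoch,  # every 5 epochs...
                                            decay_rate=0.1,                   # ...multiply the LR by 0.1
                                            staircase=True)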

+

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+

Note: Since Channel Pruning replaces some BN ops with new BN ops whose names carry a ‘reduced_’ prefix, the update_ops_name list should be updated accordingly.

+
+
[ ]:
+
+
+
compr_graph_all_ops_name = [op.name for op in compressed_sess.graph.get_operations()]
+update_ops_name_after_CP = []
+for op_name in update_ops_name:
+    if 'reduced_'+op_name in compr_graph_all_ops_name:
+        update_ops_name_after_CP.append('reduced_'+op_name)
+    else:
+        update_ops_name_after_CP.append(op_name)
+
+ImageNetDataPipeline.finetune(compressed_sess, update_ops_name_after_CP, epochs=1, learning_rate=1e-3, decay_steps=5)
+
+
+
+
+

After we are done fine-tuning the compressed model, we can check its floating point accuracy against the same validation dataset to observe any improvement in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(compressed_sess)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you should have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this with the model of your choice and tune the hyper-parameters to get the best results.

+

So we have an improved model after compression using Channel Pruning. Optionally, this model can now be saved like a regular tensorflow model.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph_saver import save_model_to_meta
+
+save_model_to_meta(compressed_sess, meta_path='./output/finetuned_model')
+
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET to perform compression with Channel Pruning. As indicated above, some parameters have been chosen to make the example run faster.

+

A few additional resources - Refer to the AIMET API docs for more details on the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques

+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/compression/channel_pruning.ipynb b/releases/1.32.2/Examples/tensorflow/compression/channel_pruning.ipynb new file mode 100644 index 00000000..292a9b05 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/compression/channel_pruning.ipynb @@ -0,0 +1,509 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Model Compression Using Channel Pruning \n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform model compression. The Channel Pruning technique is used in this notebook to achieve model compression.\n", + "\n", + "Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.\n", + "\n", + "1. **Spatial SVD**: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique will decompose a single convolutional layer into two. The weight tensor of the layer to be split is flattended to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers.\n", + "2. **Channel Pruning**: In this technique AIMET will discard least significant (using a magnitude metric) input channels of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have the channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance of the compressed layer output to the corresponding layer output of the original model.\n", + "\n", + "Both of the above techniques are structured pruning techniques that aim to reduce computational macs or memory requirements of the model. Subsequent to applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover accuracy close to the original model.\n", + "\n", + "This notebook shows working code example of how the technique #2 can be used to compress the model. You can find a separate notebook for #1, and #1 followed by #2 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the model and evaluate it to find the baseline accuracy\n", + "3. Compress the model and fine-tune: \n", + " 3.1 Compress model using Channel Pruning and evaluate it to find post-compression accuracy \n", + " 3.2 Fine-tune the model\n", + "\n", + "\n", + "#### What this notebook is not \n", + "* This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. 
https://image-net.org/challenges/LSVRC/2012/index.php#) and convert them into tfrecords.\n", + "\n", + "**Note1**: The ImageNet tfrecords dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- A folder containing tfrecords files starting with **'train\\*'** for training files and **'valid\\*'** for validation files. Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class and then convert it into tfrecords. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "TFRECORDS_DIR = '/path/to/tfrecords/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We disable logs at the INFO level and disable eager execution. We set verbosity to the level as displayed (ERORR), so TensorFlow will display all messages that have the label ERROR (or more critical)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' ##TODO\n", + "\n", + "import tensorflow.compat.v1 as tf ## TODO Abhijit\n", + "tf.disable_eager_execution()\n", + "tf.logging.set_verbosity(tf.logging.ERROR)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from typing import List\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + " \n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + " \n", + " @staticmethod\n", + " def evaluate(sess: tf.Session, iterations: int = None, use_cuda: bool = False) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " 
batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess, iterations)\n", + "\n", + " \n", + " @staticmethod\n", + " def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):\n", + " \"\"\"\n", + " Given a TF session, finetunes it to improve its accuracy\n", + " :param sess: The sess graph to fine-tune.\n", + " :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).\n", + " tf.GraphKeys.UPDATE_OPS collections is always used\n", + " in addition to this list\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_epochs=epochs, format_bgr=True)\n", + "\n", + " trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate it to find the baseline accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to a tensorflow session. Similarly, you can load any pretrained tensorflow model instead.\n", + "\n", + "\n", + "Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.\n", + "\n", + "\n", + "By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.compat.v1.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "\n", + "model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))\n", + "update_ops_name = [op.name for op in model.updates] # Used for finetuning" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag\n", + "\n", + "model = update_keras_bn_ops_trainable_flag(model, load_save_path=\"./\", trainable=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET features currently support tensorflow sessions. **add_image_net_computational_nodes_in_graph** adds an output layer, softmax and loss functions to the Resnet50 model graph." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "\n", + "sess = tf.keras.backend.get_session()\n", + "\n", + "# Creates the computation graph of ResNet within the tensorflow session.\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "input_op_names = [model.input.name.split(\":\")[0]]\n", + "output_op_names = [model.output.name.split(\":\")[0]]\n", + "starting_op_names = input_op_names\n", + "starting_op_names.append('labels')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We are checking if TensorFlow is using CPU or CUDA device. This example code will use CUDA if available in your current execution environment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = tf.test.is_gpu_available(cuda_only=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sess=sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Compress the model and fine-tune\n", + "\n", + "### 3.1. Compress model using Channel Pruning and evaluate it to find post-compression accuracy\n", + "Now we use AIMET to define compression parameters for Channel Pruning, few of which are explained here\n", + "\n", + "- **target_comp_ratio**: The desired compession ratio using Channel Pruning. This value denotes the desired compression % of the original model. To compress the model to 20% of its original size, use 0.2. This would compress the model by 80%. The pre-specified value that is given is 50%. The desired compression ratio for Channel Pruning. We are using 0.9 to compress the model by 10%.\n", + "\n", + "- **num_comp_ratio_candidates**: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes the different compression ratios tried by the AIMET for each layer. We are using 3 here which translates to 0.33, 0.66 and 1.00 compression ratios at each layer. Optimal value is 10. The higher the number of candidates the more granular the measurements for each layer, but also the higher the time taken to complete these measurements.\n", + "\n", + "- **modules_to_ignore**: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.\n", + "\n", + "- **mode**: We are chossing **Auto** mode which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. 
The alternate choice is **Manual**.\n", + "\n", + "- **data_loader**: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.\n", + "\n", + "- **num_reconstruction_samples**: During the last stage of Channel Pruning, the Compression API tries to map the outputs of the pruned model with that of the original model through linear regression, and uses this attempt to change the weights in the pruned layer. The regression is done with this many random samples. The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here which is a ridiculously low number but enables this notebook to execute quickly. A typical setting here would ~1000 samples.\n", + "\n", + "- **allow_custom_downsample_ops**: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default.\n", + "\n", + "- **eval_callback**: The model evaluation function. The expected signature of the evaluate function should be `(model, eval_iterations, use_cuda)` and it is expected to return an accuracy metric.\n", + "\n", + "- **eval_iterations**: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. It is expected that the eval callback would use the same samples for every invocation of the callback.\n", + "\n", + "- **compress_scheme**: We choose the 'channel pruning' compression scheme.\n", + "\n", + "- **cost_metric**: Determines whether we want to target either to reduce MACs or memory by the desired compression ratio. We are chossing 'mac' here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next cell defines the actual Channel Pruning Parameters. There are two methods for which you can choose parameters - Auto and Manual. For Auto, the only option is a greedy selection scheme, where the optimal compression ratio is selected for each layer among a set list of candidates to reach the target ratio (which was specified in the previous cell). 
For Manual, you have to specify the compression ratios for each layer; a general rule of thumb, if one is to use Manual, is to start with the ratios found by Auto Mode and use it as a starting point.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_common.defs import CompressionScheme, CostMetric\n", + "from aimet_tensorflow.defs import GreedySelectionParameters, ChannelPruningParameters\n", + "from decimal import Decimal\n", + "\n", + "greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.9),\n", + " num_comp_ratio_candidates=3)\n", + "\n", + "modules_to_ignore = [sess.graph.get_operation_by_name('conv1_conv/Conv2D')]\n", + "auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params, \n", + " modules_to_ignore=modules_to_ignore)\n", + "\n", + "data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + "params = ChannelPruningParameters(input_op_names=starting_op_names,\n", + " output_op_names=output_op_names,\n", + " data_set=data_loader.dataset,\n", + " batch_size=data_loader.batch_size,\n", + " num_reconstruction_samples=10,\n", + " allow_custom_downsample_ops=False,\n", + " mode=ChannelPruningParameters.Mode.auto,\n", + " params=auto_params)\n", + "\n", + "eval_callback = ImageNetDataPipeline.evaluate\n", + "eval_iterations = 1\n", + "compress_scheme = CompressionScheme.channel_pruning\n", + "cost_metric = CostMetric.mac" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics. \n", + "\n", + "\n", + "**Note**: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline. This returns both the new model, which is saved, as well as relevant statistics. Finally, the compressed model is evaluated on the dataset." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.compress import ModelCompressor\n", + "\n", + "os.makedirs('./output/', exist_ok=True)\n", + "\n", + "#TODO: makedirs should be at top??\n", + "compressed_sess, comp_stats = ModelCompressor.compress_model(sess=sess,\n", + " working_dir=\"output\",\n", + " eval_callback=eval_callback,\n", + " eval_iterations=eval_iterations,\n", + " input_shape=(1, 3, 224, 224),\n", + " compress_scheme=compress_scheme,\n", + " cost_metric=cost_metric,\n", + " parameters=params)\n", + "\n", + "print(comp_stats)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the compressed model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before to calculated compressed model accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(compressed_sess, iterations=1)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "As you can see the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy back." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2. 
Fine-tune the model\n", + "\n", + "After the model is compressed using Channel Pruning, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.\n", + "\n", + "**Note:** Since Channel Pruning replaces few BNs by different BNs with 'reduced_' added in their original name, update_ops_name list should be updated accordingly" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "compr_graph_all_ops_name = [op.name for op in compressed_sess.graph.get_operations()]\n", + "update_ops_name_after_CP = []\n", + "for op_name in update_ops_name:\n", + " if 'reduced_'+op_name in compr_graph_all_ops_name:\n", + " update_ops_name_after_CP.append('reduced_'+op_name)\n", + " else:\n", + " update_ops_name_after_CP.append(op_name)\n", + " \n", + "ImageNetDataPipeline.finetune(compressed_sess, update_ops_name_after_CP, epochs=1, learning_rate=1e-3, decay_steps=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with finetuing the compressed model, we can check the floating point accuracy against the same validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(compressed_sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you should have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after compression using Channel Pruning. Optionally, this model now can be saved like a regular tensorflow model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph_saver import save_model_to_meta\n", + "\n", + "save_model_to_meta(compressed_sess, meta_path='./output/finetuned_model')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing compression with Channel Pruning. 
As indicated above, some parameters have been chosen in a way to run the example faster.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/tensorflow/compression/spatial_svd.html b/releases/1.32.2/Examples/tensorflow/compression/spatial_svd.html new file mode 100644 index 00000000..453b03a9 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/compression/spatial_svd.html @@ -0,0 +1,1476 @@ + + + + + + Model compression Using Spatial SVD — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Model compression Using Spatial SVD

+

This notebook shows a working code example of how to use AIMET to perform model compression. The Spatial SVD technique is used in this notebook to achieve model compression.

+

Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.

+
    +
  1. Spatial SVD: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique decomposes a single convolutional layer into two. The weight tensor of the layer to be split is flattened to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers (a small NumPy illustration of this idea follows the list below).

  2. +
  3. Channel Pruning: In this technique AIMET discards the least significant input channels (based on a magnitude metric) of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have their channel dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance between the compressed layer output and the corresponding layer output of the original model.

  4. +
+

Both of the above are structured compression techniques that aim to reduce the computational MACs or memory requirements of the model. After applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover accuracy close to the original model.
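As a purely illustrative aside (not the AIMET implementation, and the exact reshaping AIMET uses for Spatial SVD differs), the core idea behind the technique can be sketched with NumPy: flatten a weight tensor to a 2-D matrix, take a truncated SVD, and keep two smaller factors whose product approximates the original weights. All shapes below are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conv weight flattened to 2D: (out_channels, in_channels * kernel_h * kernel_w)
out_channels, in_channels, kernel_h, kernel_w = 64, 32, 3, 3
W = rng.standard_normal((out_channels, in_channels * kernel_h * kernel_w))

# Truncated SVD: keep only the k most significant singular values
U, S, Vt = np.linalg.svd(W, full_matrices=False)
k = 16
W1 = Vt[:k, :]               # weights of the first, smaller layer in the split
W2 = U[:, :k] * S[:k]        # weights of the second layer in the split

rel_error = np.linalg.norm(W2 @ W1 - W) / np.linalg.norm(W)
print(f"rank-{k} relative approximation error: {rel_error:.3f}")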

+

This notebook shows a working code example of how technique #1 can be used to compress the model. You can find separate notebooks for #2, and for #1 followed by #2, in the same folder.

+
+

Overall flow

+
+
This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the model and evaluate it to find the baseline accuracy 3. Compress the model and fine-tune:
+
3.1 Compress model using Spatial SVD and evaluate it to find post-compression accuracy
+
3.2 Fine-tune the model
+
+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The ImageNet tfrecords dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them - A folder containing tfrecords files starting with ‘train*’ for training files and ‘valid*’ for validation files. Each tfrecord file should have the features ‘image/encoded’ for the image data and ‘image/class/label’ for its corresponding class.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
TFRECORDS_DIR = '/path/to/tfrecords/dir/'        # Please replace this with a real directory
+
+
+
+

We disable logs at the INFO level and disable eager execution. We set the verbosity level to ERROR, so TensorFlow will only display messages that have the label ERROR (or more critical).

+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # Suppress TensorFlow C++ INFO/WARNING logs
+
+import tensorflow.compat.v1 as tf  # This example uses the TF1 compatibility API
+tf.disable_eager_execution()
+tf.logging.set_verbosity(tf.logging.ERROR)
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to compress it, and the resultant model is still a TensorFlow model (session). This compressed model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really, but the evaluate() method should return a single number representing the accuracy of the model. Ideally, you should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
from typing import List
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.Session, iterations: int = None, use_cuda: bool = False) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The TF session whose graph is to be evaluated.
+        :param iterations: The number of validation batches to evaluate on.
+        :param use_cuda: Unused here; kept to match the eval callback signature expected by AIMET.
+        :return: The Top-1 accuracy of the model on the validation dataset.
+        """
+        evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess, iterations)
+
+
+    @staticmethod
+    def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):
+        """
+        Given a TF session, finetunes it to improve its accuracy
+        :param sess: The sess graph to fine-tune.
+        :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).
+                                tf.GraphKeys.UPDATE_OPS collections is always used
+                                in addition to this list
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.
+        """
+        trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                  data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_epochs=epochs, format_bgr=True)
+
+        trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)
+
+
+
+
+
+
+

2. Load the model and evaluate it to find the baseline accuracy

+

For this example notebook, we are going to load a pretrained ResNet50 model from keras and convert it to a tensorflow session. Similarly, you can load any pretrained tensorflow model instead.

+

Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

+

By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training.

+
+
[ ]:
+
+
+
from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+
+model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+update_ops_name = [op.name for op in model.updates] # Used for finetuning
+
+
+
+

The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+model = update_keras_bn_ops_trainable_flag(model, load_save_path="./", trainable=False)
+
+
+
+

AIMET features currently support tensorflow sessions. add_image_net_computational_nodes_in_graph adds an output layer, softmax and loss functions to the Resnet50 model graph.

+
+
[ ]:
+
+
+
from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+
+sess = tf.keras.backend.get_session()
+
+# Creates the computation graph of ResNet within the tensorflow session.
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
+
+

Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here.

+
+
[ ]:
+
+
+
input_op_names = [model.input.name.split(":")[0]]
+output_op_names = [model.output.name.split(":")[0]]
+starting_op_names = input_op_names.copy()
+starting_op_names.append('labels')
+
+
+
+

We check whether TensorFlow is using a CPU or a CUDA device. This example code will use CUDA if it is available in your current execution environment.

+
+
[ ]:
+
+
+
use_cuda = tf.test.is_gpu_available(cuda_only=True)
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sess)
+print(accuracy)
+
+
+
+
+
+

3. Compress the model and fine-tune

+
+

3.1. Compress model using Spatial SVD and evaluate it to find post-compression accuracy

+

Now we use AIMET to define the compression parameters, a few of which are explained here

+
    +
  • target_comp_ratio: The desired compression ratio. This value denotes the fraction of the original model’s cost that should remain after compression: to compress the model to 20% of its original size, use 0.2 (an 80% compression). In this notebook we use 0.5, matching the code below, which compresses the model by 50%.

  • +
  • num_comp_ratio_candidates: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes how many different compression ratios AIMET tries for each layer. We are using 2 here, matching the code below, which translates to candidate ratios of 0.5 and 1.0 at each layer. A value of around 10 is more typical. The higher the number of candidates, the more granular the measurements for each layer, but also the longer these measurements take.

  • +
  • modules_to_ignore: This list can contain references to model layers that should be ignored during compression. We have added the first convolutional layer to preserve the way the input interacts with the model; other layers can be added too if desired.

  • +
  • mode: We are choosing Auto mode, which means AIMET performs a per-layer compressibility analysis and determines how much to compress each layer. The alternative choice is Manual.

  • +
  • data_loader: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.

  • +
  • num_reconstruction_samples: During the last stage of Channel Pruning, the Compression API maps the outputs of the pruned model to those of the original model through linear regression, and uses the result to update the weights of the pruned layer. The regression is performed with this many random samples. We are using 10 here, which is an unrealistically low number that simply lets this notebook execute quickly; a typical setting would be ~1000 samples.

  • +
  • allow_custom_downsample_ops: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can allow more convolutional layers to be considered for pruning, but it may increase the memory bandwidth overhead of the additional downsample layers, so there is a trade-off. We suggest leaving it disabled by default.

  • +
  • eval_callback: The model evaluation function. The expected signature of the evaluate function should be <function_name>(model, eval_iterations, use_cuda) and it is expected to return an accuracy metric.

  • +
  • eval_iterations: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. It is expected that the eval callback would use the same samples for every invocation of the callback.

  • +
  • compress_scheme: We choose the ‘spatial svd’ compression scheme.

  • +
  • cost_metric: Determines whether the desired compression ratio targets a reduction in MACs or in memory. We are choosing ‘mac’ here.

  • +
+

The next cell creates the actual parameters for Spatial SVD. Parameters can be specified in one of two modes - Auto and Manual. For Auto, the only option is a greedy selection scheme, where a compression ratio is selected for each layer from a fixed list of candidates so that the overall target ratio (the target_comp_ratio discussed above) is reached. For Manual, you have to specify the compression ratio for each layer yourself; a general rule of thumb is to start from the ratios found by Auto mode.

+
+
[ ]:
+
+
+
from decimal import Decimal
+from aimet_common.defs import CompressionScheme, CostMetric, GreedySelectionParameters
+from aimet_tensorflow.defs import SpatialSvdParameters
+
+greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.5),
+                                          num_comp_ratio_candidates=2)
+
+modules_to_ignore = [sess.graph.get_operation_by_name('conv1_conv/Conv2D')]
+auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                  modules_to_ignore=modules_to_ignore)
+
+params = SpatialSvdParameters(input_op_names=starting_op_names,
+                              output_op_names=output_op_names,
+                              mode=SpatialSvdParameters.Mode.auto,
+                              params=auto_params)
+
+
+eval_callback = ImageNetDataPipeline.evaluate
+eval_iterations = 1
+compress_scheme =  CompressionScheme.spatial_svd
+cost_metric = CostMetric.mac
+
+
+
+
+
+
We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics.
+
Note: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.
+
+
+
[ ]:
+
+
+
from aimet_tensorflow.compress import ModelCompressor
+
+# Create the output directory used as the compression working directory and for saving models
+os.makedirs('./output/', exist_ok=True)
+
+compressed_sess, comp_stats = ModelCompressor.compress_model(sess=sess,
+                                                             working_dir="output",
+                                                             eval_callback=ImageNetDataPipeline.evaluate,
+                                                             eval_iterations=eval_iterations,
+                                                             input_shape=(1, 3, 224, 224),
+                                                             compress_scheme=compress_scheme,
+                                                             cost_metric=cost_metric,
+                                                             parameters=params)
+
+print(comp_stats)
+
+
+
+
+

Now the compressed model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before to calculate the compressed model’s accuracy.

+
+
[ ]:
+
+
+
comp_accuracy = ImageNetDataPipeline.evaluate(compressed_sess)
+print(comp_accuracy)
+
+
+
+
+

As you can see, the model accuracy fell sharply after compression. This is expected. We will use fine-tuning to recover the lost accuracy.

+
+
+

3.2. Fine-tune the model

+

After the model is compressed using Spatial SVD, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.
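To make the suggested schedule concrete, here is a small sketch of a staircase decay in plain TF 1.x: start near the learning rate the original training ended with and divide it by 10 every 5 epochs. The finetune() call below already implements this behaviour through its learning_rate and decay_steps arguments, so this snippet is illustration only; steps_per_epoch is a hypothetical value.

# Staircase schedule sketch: LR starts at 1e-3 and is divided by 10 every 5 epochs.
steps_per_epoch = 1000                                    # hypothetical value
global_step = tf.train.get_or_create_global_step()
lr = tf.train.exponential_decay(learning_rate=1e-3,
                                global_step=global_step,
                                decay_steps=5 * steps_per_epoch,
                                decay_rate=0.1,
                                staircase=True)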

+

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+

Note: Since Channel Pruning replaces a few BNs with new BNs that have ‘reduced_’ prepended to their original names, the update_ops_name list should be updated accordingly.

+
+
[ ]:
+
+
+
compr_graph_all_ops_name = [op.name for op in compressed_sess.graph.get_operations()]
+update_ops_name_after_CP = []
+for op_name in update_ops_name:
+    if 'reduced_'+op_name in compr_graph_all_ops_name:
+        update_ops_name_after_CP.append('reduced_'+ op_name)
+    else:
+        update_ops_name_after_CP.append(op_name)
+
+ImageNetDataPipeline.finetune(compressed_sess, update_ops_name_after_CP, epochs=1, learning_rate=1e-3, decay_steps=5)
+
+
+
+
+

After finetuning the compressed model, we can check its floating point accuracy against the same validation dataset to observe any improvement in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(compressed_sess)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you should have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this with the model of your choice and experiment with the hyper-parameters to get the best results.

+

So we have an improved model after compression using Spatial SVD. Optionally, this model can now be saved like a regular TensorFlow model.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph_saver import save_model_to_meta
+
+save_model_to_meta(compressed_sess, meta_path='./output/finetuned_model')
+
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET to perform compression with Spatial SVD. As indicated above, some parameters have been chosen so that the example runs faster.

+

A few additional resources - Refer to the AIMET API docs for more details on the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/compression/spatial_svd.ipynb b/releases/1.32.2/Examples/tensorflow/compression/spatial_svd.ipynb new file mode 100644 index 00000000..e2fdb7ce --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/compression/spatial_svd.ipynb @@ -0,0 +1,519 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Model compression Using Spatial SVD \n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform model compression. The Spatial SVD technique is used in this notebook to achieve model compression.\n", + "\n", + "Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.\n", + "\n", + "1. **Spatial SVD**: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique will decompose a single convolutional layer into two. The weight tensor of the layer to be split is flattended to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers.\n", + "2. **Channel Pruning**: In this technique AIMET will discard least significant (using a magnitude metric) input channels of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have the channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance of the compressed layer output to the corresponding layer output of the original model.\n", + "\n", + "Both of the above techniques are structured pruning techniques that aim to reduce computational macs or memory requirements of the model. Subsequent to applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover accuracy close to the original model.\n", + "\n", + "This notebook shows working code example of how the technique #1 can be used to compress the model. You can find a separate notebook for #2, and #1 followed by #2 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the model and evaluate it to find the baseline accuracy\n", + "3. Compress the model and fine-tune: \n", + " 3.1 Compress model using Spatial SVD and evaluate it to find post-compression accuracy \n", + " 3.2 Fine-tune the model\n", + "\n", + "\n", + "#### What this notebook is not \n", + "* This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. 
https://image-net.org/challenges/LSVRC/2012/index.php#) and convert them into tfrecords.\n", + "\n", + "**Note1**: The ImageNet tfrecords dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- A folder containing tfrecords files starting with **'train\\*'** for training files and **'valid\\*'** for validation files. Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class and then convert it into tfrecords. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "TFRECORDS_DIR = '/path/to/tfrecords/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We disable logs at the INFO level and disable eager execution. We set verbosity to the level as displayed (ERORR), so TensorFlow will display all messages that have the label ERROR (or more critical)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' ##TODO\n", + "\n", + "import tensorflow.compat.v1 as tf ## TODO Abhijit\n", + "tf.disable_eager_execution()\n", + "tf.logging.set_verbosity(tf.logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to compress it and the resultant model is still a PyTorch model. This compressed model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really, but evaluate() method should return a single number representing the accuracy of the model. 
Ideally, You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from typing import List\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + " \n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + " \n", + " @staticmethod\n", + " def evaluate(sess: tf.Session, iterations: int = None, use_cuda: bool = False) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess, iterations)\n", + "\n", + " \n", + " @staticmethod\n", + " def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):\n", + " \"\"\"\n", + " Given a TF session, finetunes it to improve its accuracy\n", + " :param sess: The sess graph to fine-tune.\n", + " :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).\n", + " tf.GraphKeys.UPDATE_OPS collections is always used\n", + " in addition to this list\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_epochs=epochs, format_bgr=True)\n", + "\n", + " trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate it to find the baseline accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to a tensorflow session. 
Similarly, you can load any pretrained tensorflow model instead.\n", + "\n", + "\n", + "Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.\n", + "\n", + "\n", + "By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.compat.v1.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "\n", + "model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))\n", + "update_ops_name = [op.name for op in model.updates] # Used for finetuning" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag\n", + "\n", + "model = update_keras_bn_ops_trainable_flag(model, load_save_path=\"./\", trainable=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET features currently support tensorflow sessions. **add_image_net_computational_nodes_in_graph** adds an output layer, softmax and loss functions to the Resnet50 model graph." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "\n", + "sess = tf.keras.backend.get_session()\n", + "\n", + "# Creates the computation graph of ResNet within the tensorflow session.\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "input_op_names = [model.input.name.split(\":\")[0]]\n", + "output_op_names = [model.output.name.split(\":\")[0]]\n", + "starting_op_names = input_op_names.copy()\n", + "starting_op_names.append('labels')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We are checking if TensorFlow is using CPU or CUDA device. This example code will use CUDA if available in your current execution environment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = tf.test.is_gpu_available(cuda_only=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sess=sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. 
Compress the model and fine-tune\n", + "\n", + "### 3.1. Compress model using Channel Pruning and evaluate it to find post-compression accuracy\n", + "Now we use AIMET to define compression parameters for Channel Pruning, few of which are explained here\n", + "\n", + "- **target_comp_ratio**: The desired compession ratio using Channel Pruning. This value denotes the desired compression % of the original model. To compress the model to 20% of its original size, use 0.2. This would compress the model by 80%. The pre-specified value that is given is 50%. The desired compression ratio for Channel Pruning. We are using 0.9 to compress the model by 10%.\n", + "\n", + "- **num_comp_ratio_candidates**: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes the different compression ratios tried by the AIMET for each layer. We are using 3 here which translates to 0.33, 0.66 and 1.00 compression ratios at each layer. Optimal value is 10. The higher the number of candidates the more granular the measurements for each layer, but also the higher the time taken to complete these measurements.\n", + "\n", + "- **modules_to_ignore**: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.\n", + "\n", + "- **mode**: We are chossing **Auto** mode which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. The alternate choice is **Manual**.\n", + "\n", + "- **data_loader**: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.\n", + "\n", + "- **num_reconstruction_samples**: During the last stage of Channel Pruning, the Compression API tries to map the outputs of the pruned model with that of the original model through linear regression, and uses this attempt to change the weights in the pruned layer. The regression is done with this many random samples. The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here which is a ridiculously low number but enables this notebook to execute quickly. A typical setting here would ~1000 samples.\n", + "\n", + "- **allow_custom_downsample_ops**: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default.\n", + "\n", + "- **eval_callback**: The model evaluation function. The expected signature of the evaluate function should be `(model, eval_iterations, use_cuda)` and it is expected to return an accuracy metric.\n", + "\n", + "- **eval_iterations**: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. 
It is expected that the eval callback would use the same samples for every invocation of the callback.\n", + "\n", + "- **compress_scheme**: We choose the 'channel pruning' compression scheme.\n", + "\n", + "- **cost_metric**: Determines whether we want to target either to reduce MACs or memory by the desired compression ratio. We are chossing 'mac' here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next cell creates the actual parameters for Spatial SVD. There are two methods for which you can choose parameters - Auto and Manual. For Auto, the only option is a greedy selection scheme, where the optimal compression ratio is selected for each layer among a set list of candidates to reach the target ratio (which was specified in the previous cell). For Manual, you have to specify the compression ratios for each layer; a general rule of thumb, if one is to use Manual, is to start with the ratios found by Auto Mode and use it as a starting point." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from decimal import Decimal\n", + "from aimet_common.defs import CompressionScheme, CostMetric, GreedySelectionParameters\n", + "from aimet_tensorflow.defs import SpatialSvdParameters\n", + "\n", + "greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.5),\n", + " num_comp_ratio_candidates=2)\n", + "\n", + "modules_to_ignore = [sess.graph.get_operation_by_name('conv1_conv/Conv2D')] \n", + "auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,\n", + " modules_to_ignore=modules_to_ignore)\n", + "\n", + "params = SpatialSvdParameters(input_op_names=starting_op_names,\n", + " output_op_names=output_op_names,\n", + " mode=SpatialSvdParameters.Mode.auto,\n", + " params=auto_params)\n", + "\n", + "\n", + "eval_callback = ImageNetDataPipeline.evaluate\n", + "eval_iterations = 1\n", + "compress_scheme = CompressionScheme.spatial_svd\n", + "cost_metric = CostMetric.mac" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics. \n", + "**Note**: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.compress import ModelCompressor\n", + "\n", + "os.makedirs('./output/', exist_ok=True)\n", + "#TODO: makedirs should be at top??\n", + "\n", + "compressed_sess, comp_stats = ModelCompressor.compress_model(sess=sess,\n", + " working_dir=\"output\",\n", + " eval_callback=ImageNetDataPipeline.evaluate,\n", + " eval_iterations=eval_iterations,\n", + " input_shape=(1, 3, 224, 224),\n", + " compress_scheme=compress_scheme,\n", + " cost_metric=cost_metric,\n", + " parameters=params)\n", + "\n", + "print(comp_stats)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the compressed model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before to calculated compressed model accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "comp_accuracy = ImageNetDataPipeline.evaluate(compressed_sess)\n", + "print(comp_accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "As you can see the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy back." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2. Fine-tune the model\n", + "\n", + "After the model is compressed using Spatial SVD, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.\n", + "\n", + "Add this: Since Channel Pruning replaces few BNs by different BNs with 'reduced_' added in their original name, update_ops_name list should be updated accordingly" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "compr_graph_all_ops_name = [op.name for op in compressed_sess.graph.get_operations()]\n", + "update_ops_name_after_CP = []\n", + "for op_name in update_ops_name:\n", + " if 'reduced_'+op_name in compr_graph_all_ops_name:\n", + " update_ops_name_after_CP.append('reduced_'+ op_name)\n", + " else:\n", + " update_ops_name_after_CP.append(op_name)\n", + " \n", + "ImageNetDataPipeline.finetune(compressed_sess, update_ops_name_after_CP, epochs=1, learning_rate=1e-3, decay_steps=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with finetuing the compressed model, we can check the floating point accuracy against the same validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(compressed_sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you should have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after compression using spatial SVD. Optionally, this model now can be saved like a regular tensorflow model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph_saver import save_model_to_meta\n", + "\n", + "save_model_to_meta(compressed_sess, meta_path='./output/finetuned_model')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing compression with Spatial SVD. 
As indicated above, some parameters have been chosen in a way to run the example faster.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/tensorflow/compression/spatial_svd_channel_pruning.html b/releases/1.32.2/Examples/tensorflow/compression/spatial_svd_channel_pruning.html new file mode 100644 index 00000000..d349a324 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/compression/spatial_svd_channel_pruning.html @@ -0,0 +1,1622 @@ + + + + + + Model Compression Using Spatial SVD Followed by Channel Pruning — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + +
  • + View page source +
  • +
+
+
+
+
+ +
+

Model Compression Using Spatial SVD Followed by Channel Pruning

+

This notebook shows a working code example of how to use AIMET to perform model compression. Two model-compression techniques are applied back-to-back: Spatial SVD followed by Channel Pruning.

+

Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.

+
    +
  1. Spatial SVD: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique will decompose a single convolutional layer into two. The weight tensor of the layer to be split is flattened to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers. (A small NumPy illustration of this idea follows this list.)

  2. +
  3. Channel Pruning: In this technique AIMET will discard the least significant input channels (using a magnitude metric) of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have their channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance between the compressed layer output and the corresponding layer output of the original model.

  4. +
+
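To build intuition for what Spatial SVD does to a single layer’s weights, here is a small, self-contained NumPy illustration. This is not the AIMET implementation; the kernel shape and the kept rank are made up.

import numpy as np

# Flatten a Conv2D kernel to 2D, apply SVD, and keep only the top-k singular values.
kernel = np.random.randn(3, 3, 64, 128)                 # (kh, kw, in_channels, out_channels)
W = kernel.reshape(-1, kernel.shape[-1])                # 2D matrix of shape (kh*kw*in, out)
U, S, Vt = np.linalg.svd(W, full_matrices=False)

k = 32                                                  # rank kept - controls the compression
W1 = U[:, :k] * S[:k]                                   # weights of the first decomposed layer
W2 = Vt[:k, :]                                          # weights of the second decomposed layer

print(np.linalg.norm(W - W1 @ W2) / np.linalg.norm(W))  # relative reconstruction error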

Both of the above techniques are structured pruning techniques that aim to reduce the computational cost (MACs) or memory requirements of the model. After applying either of these techniques, the compressed model needs to be fine-tuned (i.e. trained again for a few epochs) to recover accuracy close to that of the original model.

+

This notebook shows a working code example of how both techniques (#1 and #2) can be used to compress a model. You can find a separate notebook for only #1 or only #2 in the same folder.

+
+

Overall flow

+
+
This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the model and evaluate it to find the baseline accuracy 3. Compress the model and fine-tune:
+
3.1 Compress model using Spatial SVD and evaluate it to find post-compression accuracy
+
3.2 Fine-tune the model after Spatial SVD
+
3.3 Compress model using Channel Pruning and evaluate it to find post-compression accuracy
+
3.4 Fine-tune the model after Channel Pruning
+
+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The ImageNet tfrecords dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them - A folder containing tfrecords files starting with ‘train*’ for training files and ‘valid*’ for validation files. Each tfrecord file should have the features ‘image/encoded’ for image data and ‘image/class/label’ for its corresponding class.
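For reference, records with those two features could be parsed with standard TensorFlow APIs along the lines of the sketch below. The ImageNetDataLoader used in this notebook already handles this (with its own preprocessing), so this is purely illustrative; the placeholder path stands in for the TFRECORDS_DIR set a few cells below.

import glob
import tensorflow.compat.v1 as tf

def _parse_imagenet_record(example_proto):
    # Each record carries the raw JPEG bytes and an integer class label.
    features = {
        'image/encoded': tf.io.FixedLenFeature([], tf.string),
        'image/class/label': tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(example_proto, features)
    image = tf.io.decode_jpeg(parsed['image/encoded'], channels=3)
    return image, parsed['image/class/label']

files = glob.glob('/path/to/tfrecords/dir/train*')       # placeholder path
dataset = tf.data.TFRecordDataset(files).map(_parse_imagenet_record)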

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
TFRECORDS_DIR = '/path/to/tfrecords/dir/'        # Please replace this with a real directory
+
+
+
+

We disable logs at the INFO level and disable eager execution. We set the verbosity to ERROR, so TensorFlow will display all messages that have the label ERROR (or more critical).

+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow.compat.v1 as tf
+tf.disable_eager_execution()
+tf.logging.set_verbosity(tf.logging.ERROR)
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to compress it and the resultant model is still a TensorFlow model (session). This compressed model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really, but the evaluate() method should return a single number representing the accuracy of the model. Ideally, you should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
from typing import List
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.Session, iterations: int = None, use_cuda: bool = False) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The sess graph to be evaluated.
+        :return: The Top-1 accuracy of the given session on the validation samples.
+        """
+        evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess, iterations)
+
+
+    @staticmethod
+    def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):
+        """
+        Given a TF session, finetunes it to improve its accuracy
+        :param sess: The sess graph to fine-tune.
+        :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).
+                                tf.GraphKeys.UPDATE_OPS collections is always used
+                                in addition to this list
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.
+        """
+        trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                  data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_epochs=epochs, format_bgr=True)
+
+        trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)
+
+
+
+
+
+
+

2. Load the model and evaluate it to find the baseline accuracy

+

For this example notebook, we are going to load a pretrained ResNet50 model from keras and convert it to a tensorflow session. Similarly, you can load any pretrained tensorflow model instead.

+

Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

+

By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training.
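For reference, the standard TF 1.x pattern for attaching the UPDATE_OPS collection to the training op looks roughly like the sketch below; the optimizer and loss names are hypothetical, and the ImageNetTrainer used later in this notebook already takes care of the update ops for you.

# Sketch only: make the training op depend on the batch-norm update ops.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)   # 'optimizer' and 'loss' are assumed to exist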

+
+
[ ]:
+
+
+
from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+
+model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+update_ops_name = [op.name for op in model.updates] # Used for finetuning
+
+
+
+

The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+model = update_keras_bn_ops_trainable_flag(model, load_save_path="./", trainable=False)
+
+
+
+

AIMET features currently support tensorflow sessions. add_image_net_computational_nodes_in_graph adds an output layer, softmax and loss functions to the Resnet50 model graph.

+
+
[ ]:
+
+
+
from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+
+sess = tf.keras.backend.get_session()
+
+# Creates the computation graph of ResNet within the tensorflow session.
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
+
+

Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here.

+
+
[ ]:
+
+
+
input_op_names = [model.input.name.split(":")[0]]
+output_op_names = [model.output.name.split(":")[0]]
+starting_op_names = input_op_names.copy()
+starting_op_names.append('labels')
+
+
+
+

We check whether TensorFlow is running on a CPU or a CUDA device. This example code will use CUDA if it is available in your current execution environment.

+
+
[ ]:
+
+
+
use_cuda = tf.test.is_gpu_available(cuda_only=True)
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sess)
+print(accuracy)
+
+
+
+
+
+

3. Compress the model and fine-tune

+
+

3.1. Compress model using Spatial SVD and evaluate it to find post-compression accuracy

+

Now we use AIMET to define the compression parameters, a few of which are explained here. (The parameters specific to Channel Pruning - data_loader, num_reconstruction_samples and allow_custom_downsample_ops - are described here as well, but they are only used later, in section 3.3.)

+
    +
  • target_comp_ratio: The desired compression ratio. This value denotes the target cost of the compressed model relative to the original; to compress the model to 20% of its original size, use 0.2 (an 80% reduction). The Spatial SVD code below uses 0.5, i.e. the model is compressed to roughly 50% of its original cost; the Channel Pruning step in section 3.3 later uses 0.9.

  • +
  • num_comp_ratio_candidates: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes how many different compression ratios are tried by AIMET for each layer. The code below uses 2, which translates to candidate ratios of 0.5 and 1.0 at each layer; a more typical value is 10. The higher the number of candidates, the more granular the per-layer measurements, but also the longer they take to complete.

  • +
  • modules_to_ignore: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.

  • +
  • mode: We are choosing Auto mode, which means AIMET performs a per-layer compressibility analysis and determines how much to compress each layer. The alternative choice is Manual.

  • +
  • data_loader: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.

  • +
  • num_reconstruction_samples: During the last stage of Channel Pruning, the compression API maps the outputs of the pruned layer to those of the original model through linear regression, and uses the result to update the weights of the pruned layer. The regression is performed with this many random samples. We are using 10 here, which is a very low number but lets this notebook execute quickly; a typical setting would be ~1000 samples.

  • +
  • allow_custom_downsample_ops: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default.

  • +
  • eval_callback: The model evaluation function. The expected signature of the evaluate function should be <function_name>(model, eval_iterations, use_cuda) and it is expected to return an accuracy metric.

  • +
  • eval_iterations: The number of batches of data to use for evaluating the model while it is being compressed. We are using 1 to speed up the notebook execution; in practice, please choose a number high enough that the resulting accuracy estimate can be trusted. The eval callback is expected to use the same samples for every invocation.

  • +
  • compress_scheme: We choose the ‘spatial svd’ compression scheme.

  • +
  • cost_metric: Determines whether we want to reduce MACs or memory by the desired compression ratio. We are choosing ‘mac’ here. (A rough illustration of a single layer’s MAC count follows this list.)

  • +
+
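To make the ‘mac’ cost metric concrete, here is a rough back-of-the-envelope count for one hypothetical Conv2D layer (illustration only, not how AIMET computes layer costs internally):

# MACs of one Conv2D layer ~= kh * kw * Cin * Cout * Hout * Wout
kh, kw, c_in, c_out = 3, 3, 64, 128      # hypothetical kernel and channel sizes
out_h, out_w = 56, 56                    # hypothetical output spatial size
macs = kh * kw * c_in * c_out * out_h * out_w
print(macs)                              # roughly 2.3e8 multiply-accumulates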

The next cell creates the actual parameters for Spatial SVD. There are two modes for which you can choose parameters - Auto and Manual. For Auto, the only option is a greedy selection scheme, where an optimal compression ratio is selected for each layer from a fixed list of candidates so that the target ratio (specified in the previous cell) is reached. For Manual, you have to specify the compression ratio for each layer yourself; a general rule of thumb, if one uses Manual mode, is to start from the ratios found by Auto mode.

+
+
[ ]:
+
+
+
from decimal import Decimal
+from aimet_common.defs import CompressionScheme, CostMetric, GreedySelectionParameters
+from aimet_tensorflow.defs import SpatialSvdParameters
+
+greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.5),
+                                          num_comp_ratio_candidates=2)
+
+modules_to_ignore = [sess.graph.get_operation_by_name('conv1_conv/Conv2D')]
+auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                  modules_to_ignore=modules_to_ignore)
+
+params = SpatialSvdParameters(input_op_names=input_op_names,
+                              output_op_names=output_op_names,
+                              mode=SpatialSvdParameters.Mode.auto,
+                              params=auto_params)
+
+
+eval_callback = ImageNetDataPipeline.evaluate
+eval_iterations = 1
+compress_scheme =  CompressionScheme.spatial_svd
+cost_metric = CostMetric.mac
+
+
+
+
+
+
We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics.
+
Note: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.
+
+
+
[ ]:
+
+
+
from aimet_tensorflow.compress import ModelCompressor
+
+os.makedirs('./output/', exist_ok=True)
+# Create the output directory used by the compression API and for saving the model
+
+ssvd_compressed_sess, ssvd_comp_stats = ModelCompressor.compress_model(sess=sess,
+                                                             working_dir="output",
+                                                             eval_callback=ImageNetDataPipeline.evaluate,
+                                                             eval_iterations=eval_iterations,
+                                                             input_shape=(1, 3, 224, 224),
+                                                             compress_scheme=compress_scheme,
+                                                             cost_metric=cost_metric,
+                                                             parameters=params)
+
+print(ssvd_comp_stats)
+
+
+
+
+

Now the compressed model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before to calculate the compressed model’s accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(ssvd_compressed_sess)
+print(accuracy)
+
+
+
+
+

As you can see, the model accuracy fell sharply after compression. This is expected. We will use fine-tuning to recover this accuracy.

+
+
+

3.2. Fine-tune the model after Spatial SVD

+

After the model is compressed using Spatial SVD, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+

Note: Since Channel Pruning replaces a few BNs with new BNs that have ‘reduced_’ prepended to their original names, the update_ops_name list should be updated accordingly.

+
+
[ ]:
+
+
+
# compr_graph_all_ops_name = [op.name for op in ssvd_compressed_sess.graph.get_operations()]
+# update_ops_name_after_CP = []
+# for op_name in update_ops_name:
+#     if 'reduced_'+op_name in compr_graph_all_ops_name:
+#         update_ops_name_after_CP.append('reduced_'+op_name)
+#     else:
+#         update_ops_name_after_CP.append(op_name)
+
+# ImageNetDataPipeline.finetune(ssvd_compressed_sess, update_ops_name_after_CP, epochs=1, learning_rate=1e-3, decay_steps=5)
+
+
+
+
+

After finetuning the compressed model, we can check its floating point accuracy against the same validation dataset to observe any improvement in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(ssvd_compressed_sess)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you should have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this with the model of your choice and experiment with the hyper-parameters to get the best results.

+

So we have an improved model after compression using Spatial SVD. Optionally, this model can now be saved like a regular TensorFlow model.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph_saver import save_model_to_meta
+
+save_model_to_meta(ssvd_compressed_sess, meta_path='./output/ssvd_finetuned_model')
+
+
+
+
+
+

3.3. Compress model using Channel Pruning and evaluate it to find post-compression accuracy

+
+
The fine-tuned model, compressed with Spatial SVD, can be further compressed using the Channel Pruning method.
+
Similar to Spatial SVD, we first define the parameters for Channel Pruning compression, most of which are the same as for Spatial SVD. The additional parameters specific to Channel Pruning are as follows:
+
+
    +
  • data_loader: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.

  • +
  • num_reconstruction_samples: The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here, which is a very low number but lets this notebook execute quickly; a typical setting would be ~1000 samples. (A tiny least-squares illustration of the reconstruction idea follows this list.)

  • +
  • allow_custom_downsample_ops: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default.

  • +
+
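To give a feel for the reconstruction step mentioned above, here is a tiny, purely illustrative least-squares sketch on random data. This is not AIMET’s actual procedure; the shapes and sample count are made up.

import numpy as np

# Find weights that map the pruned layer's inputs X to the original layer's outputs Y
# in a least-squares sense, mimicking the idea behind the reconstruction step.
X = np.random.randn(10, 256)    # 10 reconstruction samples of pruned-layer input features
Y = np.random.randn(10, 64)     # corresponding outputs captured from the original model
W_new, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
print(W_new.shape)              # (256, 64)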
+
[ ]:
+
+
+
starting_op_names  # Display the input op names (model input + 'labels') used for the parameters below
+
+
+
+
+
[ ]:
+
+
+
from aimet_common.defs import CompressionScheme, CostMetric
+from aimet_tensorflow.defs import GreedySelectionParameters, ChannelPruningParameters
+from decimal import Decimal
+
+greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.9),
+                                          num_comp_ratio_candidates=3)
+
+modules_to_ignore = [ssvd_compressed_sess.graph.get_operation_by_name('conv1_conv/Conv2D')]
+auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                      modules_to_ignore=modules_to_ignore)
+
+data_loader = ImageNetDataPipeline.get_val_dataloader()
+params = ChannelPruningParameters(input_op_names=starting_op_names,
+                                  output_op_names=output_op_names,
+                                  data_set=data_loader.dataset,
+                                  batch_size=data_loader.batch_size,
+                                  num_reconstruction_samples=10,
+                                  allow_custom_downsample_ops=False,
+                                  mode=ChannelPruningParameters.Mode.auto,
+                                  params=auto_params)
+
+eval_callback = ImageNetDataPipeline.evaluate
+eval_iterations = 1
+compress_scheme = CompressionScheme.channel_pruning
+cost_metric = CostMetric.mac
+
+
+
+
+

We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics.

+

Note: the ModelCompressor evaluates the model while compressing, using the same evaluate function from our data pipeline. The call returns both the new compressed session, which is saved, and relevant statistics; the compressed model is then evaluated on the dataset.

+
+
[ ]:
+
+
+
from aimet_tensorflow.compress import ModelCompressor
+
+os.makedirs('./output/', exist_ok=True)
+# Create the output directory used by the compression API and for saving the model
+
+ssvd_cp_compressed_sess, cp_comp_stats = ModelCompressor.compress_model(sess=ssvd_compressed_sess,
+                                                             working_dir="output",
+                                                             eval_callback=eval_callback,
+                                                             eval_iterations=eval_iterations,
+                                                             input_shape=(1, 3, 224, 224),
+                                                             compress_scheme=compress_scheme,
+                                                             cost_metric=cost_metric,
+                                                             parameters=params)
+
+print(cp_comp_stats)
+
+
+
+
+

Now we have a compressed model. We can pass this model to the same evaluation routine we used before to calculate the compressed model’s accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(ssvd_cp_compressed_sess, iterations=1)
+print(accuracy)
+
+
+
+
+

As you can see, the model accuracy fell sharply after compression. This is expected. We will use fine-tuning to recover this accuracy.

+
+
+

3.4. Fine-tune the model after Channel Pruning

+

After the model is compressed using Spatial SVD followed by Channel Pruning, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+

Note: Since Channel Pruning replaces a few BNs with new BNs that have ‘reduced_’ prepended to their original names, the update_ops_name list should be updated accordingly.

+
+
[ ]:
+
+
+
compr_graph_all_ops_name = [op.name for op in ssvd_cp_compressed_sess.graph.get_operations()]
+update_ops_name_after_CP = []
+for op_name in update_ops_name:
+    if 'reduced_'+op_name in compr_graph_all_ops_name:
+        update_ops_name_after_CP.append('reduced_'+op_name)
+    else:
+        update_ops_name_after_CP.append(op_name)
+
+ImageNetDataPipeline.finetune(ssvd_cp_compressed_sess, update_ops_name_after_CP, epochs=1, learning_rate=1e-3, decay_steps=5)
+
+
+
+
+

After finetuning the compressed model, we can check its floating point accuracy against the same validation dataset to observe any improvement in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(ssvd_cp_compressed_sess)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this with the model of your choice and experiment with the hyper-parameters to get the best results.

+

So we have an improved model after compression using Spatial SVD followed by Channel Pruning. Optionally, this model can now be saved like a regular TensorFlow model.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph_saver import save_model_to_meta
+
+save_model_to_meta(ssvd_cp_compressed_sess, meta_path='./output/finetuned_model')
+
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET to perform compression with Spatial SVD followed by Channel Pruning. As indicated above, some parameters have been chosen so that the example runs faster.

+

A few additional resources - Refer to the AIMET API docs for more details on the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques

+
+
[ ]:
+
+
+

+
+
+
+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/compression/spatial_svd_channel_pruning.ipynb b/releases/1.32.2/Examples/tensorflow/compression/spatial_svd_channel_pruning.ipynb new file mode 100644 index 00000000..04646490 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/compression/spatial_svd_channel_pruning.ipynb @@ -0,0 +1,714 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Model Compression Using Spatial SVD Followed by Channel Pruning \n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform model compression. Two model-compression techniques are applied back-to-back: Spatial SVD followed by Channel Pruning.\n", + "\n", + "Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.\n", + "\n", + "1. **Spatial SVD**: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique will decompose a single convolutional layer into two. The weight tensor of the layer to be split is flattended to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers.\n", + "2. **Channel Pruning**: In this technique AIMET will discard least significant (using a magnitude metric) input channels of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have the channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance of the compressed layer output to the corresponding layer output of the original model.\n", + "\n", + "Both of the above techniques are structured pruning techniques that aim to reduce computational macs or memory requirements of the model. Subsequent to applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover accuracy close to the original model.\n", + "\n", + "This notebook shows working code example of how both the techniques (#1 and #2) can be used to compress the model. You can find a separate notebook for only #1 or #2 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the model and evaluate it to find the baseline accuracy\n", + "3. Compress the model and fine-tune: \n", + " 3.1 Compress model using Spatial SVD and evaluate it to find post-compression accuracy \n", + " 3.2 Fine-tune the model after Spatial SVD \n", + " 3.3 Compress model using Channel Pruning and evaluate it to find post-compression accuracy \n", + " 3.4 Fine-tune the model after Channel Pruning \n", + "\n", + "\n", + "#### What this notebook is not \n", + "* This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. 
If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert them into tfrecords.\n", + "\n", + "**Note1**: The ImageNet tfrecords dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- A folder containing tfrecords files starting with **'train\\*'** for training files and **'valid\\*'** for validation files. Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class and then convert it into tfrecords. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "TFRECORDS_DIR = '/path/to/tfrecords/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We disable logs at the INFO level and disable eager execution. We set verbosity to the level as displayed (ERORR), so TensorFlow will display all messages that have the label ERROR (or more critical)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow.compat.v1 as tf\n", + "tf.disable_eager_execution()\n", + "tf.logging.set_verbosity(tf.logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to compress it and the resultant model is still a PyTorch model. This compressed model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really, but evaluate() method should return a single number representing the accuracy of the model. 
Ideally, You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from typing import List\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + " \n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + " \n", + " @staticmethod\n", + " def evaluate(sess: tf.Session, iterations: int = None, use_cuda: bool = False) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess, iterations)\n", + "\n", + " \n", + " @staticmethod\n", + " def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):\n", + " \"\"\"\n", + " Given a TF session, finetunes it to improve its accuracy\n", + " :param sess: The sess graph to fine-tune.\n", + " :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).\n", + " tf.GraphKeys.UPDATE_OPS collections is always used\n", + " in addition to this list\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_epochs=epochs, format_bgr=True)\n", + "\n", + " trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate it to find the baseline accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to a tensorflow session. 
Similarly, you can load any pretrained tensorflow model instead.\n", + "\n", + "\n", + "Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.\n", + "\n", + "\n", + "By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.compat.v1.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "\n", + "model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))\n", + "update_ops_name = [op.name for op in model.updates] # Used for finetuning" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag\n", + "\n", + "model = update_keras_bn_ops_trainable_flag(model, load_save_path=\"./\", trainable=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET features currently support tensorflow sessions. **add_image_net_computational_nodes_in_graph** adds an output layer, softmax and loss functions to the Resnet50 model graph." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "\n", + "sess = tf.keras.backend.get_session()\n", + "\n", + "# Creates the computation graph of ResNet within the tensorflow session.\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "input_op_names = [model.input.name.split(\":\")[0]]\n", + "output_op_names = [model.output.name.split(\":\")[0]]\n", + "starting_op_names = input_op_names.copy()\n", + "starting_op_names.append('labels')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We are checking if TensorFlow is using CPU or CUDA device. This example code will use CUDA if available in your current execution environment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = tf.test.is_gpu_available(cuda_only=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sess=sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. 
Compress the model and fine-tune\n", + "\n", + "### 3.1. Compress model using Channel Pruning and evaluate it to find post-compression accuracy\n", + "Now we use AIMET to define compression parameters for Channel Pruning, few of which are explained here\n", + "\n", + "- **target_comp_ratio**: The desired compession ratio using Channel Pruning. This value denotes the desired compression % of the original model. To compress the model to 20% of its original size, use 0.2. This would compress the model by 80%. The pre-specified value that is given is 50%. The desired compression ratio for Channel Pruning. We are using 0.9 to compress the model by 10%.\n", + "\n", + "- **num_comp_ratio_candidates**: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes the different compression ratios tried by the AIMET for each layer. We are using 3 here which translates to 0.33, 0.66 and 1.00 compression ratios at each layer. Optimal value is 10. The higher the number of candidates the more granular the measurements for each layer, but also the higher the time taken to complete these measurements.\n", + "\n", + "- **modules_to_ignore**: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.\n", + "\n", + "- **mode**: We are chossing **Auto** mode which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. The alternate choice is **Manual**.\n", + "\n", + "- **data_loader**: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.\n", + "\n", + "- **num_reconstruction_samples**: During the last stage of Channel Pruning, the Compression API tries to map the outputs of the pruned model with that of the original model through linear regression, and uses this attempt to change the weights in the pruned layer. The regression is done with this many random samples. The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here which is a ridiculously low number but enables this notebook to execute quickly. A typical setting here would ~1000 samples.\n", + "\n", + "- **allow_custom_downsample_ops**: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default.\n", + "\n", + "- **eval_callback**: The model evaluation function. The expected signature of the evaluate function should be `(model, eval_iterations, use_cuda)` and it is expected to return an accuracy metric.\n", + "\n", + "- **eval_iterations**: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. 
It is expected that the eval callback would use the same samples for every invocation of the callback.\n", + "\n", + "- **compress_scheme**: We choose the 'channel pruning' compression scheme.\n", + "\n", + "- **cost_metric**: Determines whether we want to target either to reduce MACs or memory by the desired compression ratio. We are chossing 'mac' here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next cell creates the actual parameters for Spatial SVD. There are two methods for which you can choose parameters - Auto and Manual. For Auto, the only option is a greedy selection scheme, where the optimal compression ratio is selected for each layer among a set list of candidates to reach the target ratio (which was specified in the previous cell). For Manual, you have to specify the compression ratios for each layer; a general rule of thumb, if one is to use Manual, is to start with the ratios found by Auto Mode and use it as a starting point." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from decimal import Decimal\n", + "from aimet_common.defs import CompressionScheme, CostMetric, GreedySelectionParameters\n", + "from aimet_tensorflow.defs import SpatialSvdParameters\n", + "\n", + "greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.5),\n", + " num_comp_ratio_candidates=2)\n", + "\n", + "modules_to_ignore = [sess.graph.get_operation_by_name('conv1_conv/Conv2D')] \n", + "auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,\n", + " modules_to_ignore=modules_to_ignore)\n", + "\n", + "params = SpatialSvdParameters(input_op_names=input_op_names,\n", + " output_op_names=output_op_names,\n", + " mode=SpatialSvdParameters.Mode.auto,\n", + " params=auto_params)\n", + "\n", + "\n", + "eval_callback = ImageNetDataPipeline.evaluate\n", + "eval_iterations = 1\n", + "compress_scheme = CompressionScheme.spatial_svd\n", + "cost_metric = CostMetric.mac" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics. \n", + "**Note**: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.compress import ModelCompressor\n", + "\n", + "os.makedirs('./output/', exist_ok=True)\n", + "#TODO: makedirs should be at top??\n", + "\n", + "ssvd_compressed_sess, ssvd_comp_stats = ModelCompressor.compress_model(sess=sess,\n", + " working_dir=\"output\",\n", + " eval_callback=ImageNetDataPipeline.evaluate,\n", + " eval_iterations=eval_iterations,\n", + " input_shape=(1, 3, 224, 224),\n", + " compress_scheme=compress_scheme,\n", + " cost_metric=cost_metric,\n", + " parameters=params)\n", + "\n", + "print(ssvd_comp_stats)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the compressed model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before to calculated compressed model accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(ssvd_compressed_sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "As you can see the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy back." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2. Fine-tune the model after Spatial SVD\n", + "\n", + "After the model is compressed using Spatial SVD, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.\n", + "\n", + "Add this: Since Channel Pruning replaces few BNs by different BNs with 'reduced_' added in their original name, update_ops_name list should be updated accordingly" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# compr_graph_all_ops_name = [op.name for op in ssvd_compressed_sess.graph.get_operations()]\n", + "# update_ops_name_after_CP = []\n", + "# for op_name in update_ops_name:\n", + "# if 'reduced_'+op_name in compr_graph_all_ops_name:\n", + "# update_ops_name_after_CP.append('reduced_'+op_name)\n", + "# else:\n", + "# update_ops_name_after_CP.append(op_name)\n", + "\n", + "# ImageNetDataPipeline.finetune(ssvd_compressed_sess, update_ops_name_after_CP, epochs=1, learning_rate=1e-3, decay_steps=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with finetuing the compressed model, we can check the floating point accuracy against the same validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(ssvd_compressed_sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you should have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after compression using spatial SVD. Optionally, this model now can be saved like a regular tensorflow model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph_saver import save_model_to_meta\n", + "\n", + "save_model_to_meta(ssvd_compressed_sess, meta_path='./output/ssvd_finetuned_model')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.3. Compress model using Channel Pruning and evaluate it to find post-compression accuracy\n", + "\n", + "The fine-tuned model, compressed with Spatial SVD, can be further compressed using Channel Pruning method. 
\n", + "Similar to Spatial SVD, we will first define the parameters for Channel Pruning compression, out of which mostly are same as of Spatial SVD. The other parameters for Channel Pruning are as follows:\n", + "\n", + "- **data_loader**: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.\n", + "\n", + "- **num_reconstruction_samples**: The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here which is a ridiculously low number but enables this notebook to execute quickly. A typical setting here would ~1000 samples.\n", + "\n", + "- **allow_custom_downsample_ops**: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "starting_op_names" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_common.defs import CompressionScheme, CostMetric\n", + "from aimet_tensorflow.defs import GreedySelectionParameters, ChannelPruningParameters\n", + "from decimal import Decimal\n", + "\n", + "greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.9),\n", + " num_comp_ratio_candidates=3)\n", + "\n", + "modules_to_ignore = [ssvd_compressed_sess.graph.get_operation_by_name('conv1_conv/Conv2D')]\n", + "auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params, \n", + " modules_to_ignore=modules_to_ignore)\n", + "\n", + "data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + "params = ChannelPruningParameters(input_op_names=starting_op_names,\n", + " output_op_names=output_op_names,\n", + " data_set=data_loader.dataset,\n", + " batch_size=data_loader.batch_size,\n", + " num_reconstruction_samples=10,\n", + " allow_custom_downsample_ops=False,\n", + " mode=ChannelPruningParameters.Mode.auto,\n", + " params=auto_params)\n", + "\n", + "eval_callback = ImageNetDataPipeline.evaluate\n", + "eval_iterations = 1\n", + "compress_scheme = CompressionScheme.channel_pruning\n", + "cost_metric = CostMetric.mac" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics. \n", + "\n", + "\n", + "**Note**: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline. This returns both the new model, which is saved, as well as relevant statistics. Finally, the compressed model is evaluated on the dataset." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.compress import ModelCompressor\n", + "\n", + "os.makedirs('./output/', exist_ok=True)\n", + "#TODO: makedirs should be at top??\n", + "\n", + "ssvd_cp_compressed_sess, cp_comp_stats = ModelCompressor.compress_model(sess=ssvd_compressed_sess,\n", + " working_dir=\"output\",\n", + " eval_callback=eval_callback,\n", + " eval_iterations=eval_iterations,\n", + " input_shape=(1, 3, 224, 224),\n", + " compress_scheme=compress_scheme,\n", + " cost_metric=cost_metric,\n", + " parameters=params)\n", + "\n", + "print(cp_comp_stats)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Ok so we have a compressed model. We can pass this model to the same evaluation routine we used before to calculated compressed model accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(ssvd_cp_compressed_sess, iterations=1)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "As you can see the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy back." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.4. Fine-tune the model after Channel Pruning\n", + "\n", + "After the model is compressed using Spatial SVD followed by Channel Pruning, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.\n", + "\n", + "\n", + "**Note:** Since Channel Pruning replaces few BNs by different BNs with 'reduced_' added in their original name, update_ops_name list should be updated accordingly" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "compr_graph_all_ops_name = [op.name for op in ssvd_cp_compressed_sess.graph.get_operations()]\n", + "update_ops_name_after_CP = []\n", + "for op_name in update_ops_name:\n", + " if 'reduced_'+op_name in compr_graph_all_ops_name:\n", + " update_ops_name_after_CP.append('reduced_'+op_name)\n", + " else:\n", + " update_ops_name_after_CP.append(op_name)\n", + " \n", + "ImageNetDataPipeline.finetune(ssvd_cp_compressed_sess, update_ops_name_after_CP, epochs=1, learning_rate=1e-3, decay_steps=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with finetuing the compressed model, we can check the floating point accuracy against the same validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(ssvd_cp_compressed_sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after one epoch of training. 
Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after compression using spatial SVD followed by Channel Pruning. Optionally, this model now can be saved like a regular Tensorflow model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph_saver import save_model_to_meta\n", + "\n", + "save_model_to_meta(ssvd_cp_compressed_sess, meta_path='./output/finetuned_model')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing compression with Spatial SVD followed by Channel Pruning. As indicated above, some parameters have been chosen in a way to run the example faster.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/adaround.html b/releases/1.32.2/Examples/tensorflow/quantization/adaround.html new file mode 100644 index 00000000..d807007c --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/adaround.html @@ -0,0 +1,1538 @@ + + + + + + Adaptive Rounding (AdaRound) — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Adaptive Rounding (AdaRound)

+

This notebook shows a working code example of how to use AIMET to perform Adaptive Rounding (AdaRound).

+

AIMET quantization features typically use the “nearest rounding” technique for achieving quantization. When using the “nearest rounding” technique, the weight value is quantized to the nearest integer value.

+

AdaRound optimizes a loss function using unlabeled training data to decide whether to quantize a specific weight to the closer integer value or the farther one. Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.

+
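To make the rounding choice concrete, the following is a minimal NumPy sketch (illustrative only, not part of AIMET or of this notebook's cells) showing the value picked by nearest rounding and the two integer grid points AdaRound chooses between for a single weight; the scale value is an arbitrary assumption for the example.

import numpy as np

w = 0.37                          # an example FP32 weight value
scale = 0.1                       # an assumed per-tensor quantization scale

w_scaled = w / scale              # ~3.7 on the integer grid
nearest = np.round(w_scaled)      # nearest rounding always picks 4 here
round_down = np.floor(w_scaled)   # 3: the candidate AdaRound may prefer instead
round_up = np.ceil(w_scaled)      # 4

# AdaRound learns, per weight, whether rounding down or up gives the lower loss
# on calibration data, instead of always taking the nearest integer.
print(nearest * scale, round_down * scale, round_up * scale)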
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score 4. Apply AdaRound and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results

  • +
  • For example, it uses a relatively quantization-friendly model like Resnet50

  • +
  • Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, use that. Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The dataloader provided in this example notebook relies on the ImageNet tfrecords dataset having the following characteristics: - A folder containing tfrecords files starting with ‘train*’ for training files and ‘valid*’ for validation files. - Each tfrecord file should have features: ‘image/encoded’ for image data and ‘image/class/label’ for its corresponding class.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. For the purpose of running this notebook, you could reduce the dataset to 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not necessary.

+
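For reference, below is a minimal sketch of how tfrecords with the feature names described in Note1 could be parsed with the tf.data API. This is illustrative only and uses assumed image dimensions; the ImageNetDataLoader used later in this notebook already handles the parsing internally. TFRECORDS_DIR is the directory specified in the cell below.

import os
import tensorflow.compat.v1 as tf

def parse_imagenet_tfrecord(serialized_example):
    # Feature names assumed to match the tfrecords description in Note1 above.
    features = tf.parse_single_example(
        serialized_example,
        features={'image/encoded': tf.FixedLenFeature([], tf.string),
                  'image/class/label': tf.FixedLenFeature([], tf.int64)})
    image = tf.image.decode_jpeg(features['image/encoded'], channels=3)
    image = tf.image.resize_images(image, (224, 224))
    return image, features['image/class/label']

# Build a validation dataset from files matching 'valid*'.
val_files = tf.gfile.Glob(os.path.join(TFRECORDS_DIR, 'valid*'))
val_dataset = tf.data.TFRecordDataset(val_files).map(parse_imagenet_tfrecord).batch(32)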

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
TFRECORDS_DIR = '/path/to/dataset/'        # Please replace this with a real directory
+
+
+
+

We disable logs at the INFO level and disable eager execution. We set the TensorFlow verbosity level to ERROR, so only messages labeled ERROR (or more critical) will be displayed.

+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow.compat.v1 as tf
+tf.disable_eager_execution()
+tf.logging.set_verbosity(tf.logging.ERROR)
+
+
+
+
+
+
+

1. Example Evaluation and Training Pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written?

    +

    Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.

    +
  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods?

    +

    Not really. You should be able to use your existing evaluate and train routines as-is.

    +
  • +
+
+
[ ]:
+
+
+
from typing import List
+
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.Session) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The sess graph to be evaluated.
+        :return: The accuracy for the sample with the maximum accuracy.
+        """
+        evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess)
+
+
+    @staticmethod
+    def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):
+        """
+        Given a TF session, finetunes it to improve its accuracy
+        :param sess: The sess graph to fine-tune.
+        :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).
+                                tf.GraphKeys.UPDATE_OPS collections is always used
+                                in addition to this list
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.
+        """
+        trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                  data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_epochs=epochs, format_bgr=True)
+
+        trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained ResNet50 model from keras and convert it to a tensorflow session. Similarly, you can load any other pretrained tensorflow model instead.

+

Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

+

By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training.

+
+
[ ]:
+
+
+
from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+
+model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+update_ops_name = [op.name for op in model.updates] # Used for finetuning
+
+
+
+
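The note above about tf.GraphKeys.UPDATE_OPS refers to the standard TF 1.x training pattern sketched below. The snippet is illustrative only and builds a small throwaway graph with assumed layer names; the ImageNetTrainer used in this notebook handles this dependency for you.

import tensorflow.compat.v1 as tf

# Illustrative sketch: attach BatchNorm's moving-average update ops (collected in
# tf.GraphKeys.UPDATE_OPS) as a dependency of the train op, so they run each step.
with tf.Graph().as_default():
    x = tf.placeholder(tf.float32, [None, 4])
    h = tf.layers.batch_normalization(tf.layers.dense(x, 2), training=True)
    loss = tf.reduce_mean(tf.square(h))

    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_op = tf.train.GradientDescentOptimizer(1e-3).minimize(loss)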

The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+model = update_keras_bn_ops_trainable_flag(model, load_save_path="./", trainable=False)
+
+
+
+

AIMET features currently support tensorflow sessions. add_image_net_computational_nodes_in_graph adds an output layer, softmax and loss functions to the Resnet50 model graph.

+
+
[ ]:
+
+
+
from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+
+sess = tf.keras.backend.get_session()
+
+# Creates the computation graph of ResNet within the tensorflow session.
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
+
+

Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here.

+
+
[ ]:
+
+
+
starting_op_names = [model.input.name.split(":")[0]]
+output_op_names = [model.output.name.split(":")[0]]
+
+
+
+

We are checking if TensorFlow is using CPU or CUDA device. This example code will use CUDA if available in your current execution environment.

+
+
[ ]:
+
+
+
use_cuda = tf.test.is_gpu_available(cuda_only=True)
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine:

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sess)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this?

+

On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided.

+

From a floating point compute perspective, a BN-folded model is mathematically equivalent to the original model with BN layers and produces the same inference accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers.

+

This can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). We want to simulate that on-target behavior by doing BN folding here.

+
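As a rough illustration of why folding can widen the weight ranges, below is the standard BN-folding arithmetic written out in NumPy with made-up per-channel values; AIMET's fold_all_batch_norms performs the equivalent transformation on the session graph for you.

import numpy as np

# Example per-output-channel BN parameters and statistics (made-up values).
gamma = np.array([1.2, 0.8])
beta = np.array([0.1, -0.3])
mean = np.array([0.0, 0.5])
var = np.array([0.04, 0.25])
eps = 1e-5

# Conv weights W with shape (out_channels, in_channels) for simplicity, plus bias b.
W = np.array([[0.3, -0.2],
              [0.5, 0.1]])
b = np.zeros(2)

scale = gamma / np.sqrt(var + eps)   # e.g. 1.2 / 0.2 = 6, so channel-0 weights grow ~6x
W_folded = W * scale[:, None]
b_folded = (b - mean) * scale + beta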

The following code calls AIMET to fold the BN layers on the given model and returns a new session:

+
+
[ ]:
+
+
+
from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms
+
+BN_folded_sess, _= fold_all_batch_norms(sess,
+                                        input_op_names=starting_op_names,
+                                        output_op_names=output_op_names)
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel.

+

Before we create the QuantizationSimModel, we save and load a version of the BN folded session for QuantSim to use. QuantSim will insert fake quantization ops in the session passed into it, and we want to maintain a fresh copy of the BN folded session for use in AdaRound later.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph_saver import save_and_load_graph
+BN_folded_sess_copy = save_and_load_graph("output", BN_folded_sess)
+
+
+
+

AIMET will insert fake quantization ops in the model graph and configure them. A few of the parameters are explained here: - quant_scheme: We set this to “post_training_tf_enhanced”. With this choice of quant scheme, AIMET will use the TF Enhanced quant scheme to initialize the quantization parameters like scale/offset. - default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision. - default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_tensorflow.quantsim import QuantizationSimModel
+
+sim = QuantizationSimModel(session=BN_folded_sess_copy,
+                           starting_op_names=starting_op_names,
+                           output_op_names=output_op_names,
+                           quant_scheme= QuantScheme.post_training_tf_enhanced,
+                           default_output_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+
+
+
+
+
+

Compute Encodings

+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node.

+

For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

We create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metrics, so we can just ignore the model output. A few pointers regarding the data samples:

+
    +
  • In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.

  • +
  • It may be beneficial if the samples used for computing encodings are well distributed. It is not necessary that all classes be covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid extreme scenarios in which, say, only ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results.

  • +
+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just one example.

+
+
[ ]:
+
+
+
def pass_calibration_data(session: tf.Session, _):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),
+                           session.graph.get_tensor_by_name('labels:0')]
+
+    train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]
+    train_tensors_dict = dict.fromkeys(train_tensors, False)
+
+    eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]
+
+    samples = 500
+
+    batch_cntr = 0
+    for input_label in data_loader:
+        input_label_tensors_dict = dict(zip(input_label_tensors, input_label))
+
+        feed_dict = {**input_label_tensors_dict, **train_tensors_dict}
+
+        with session.graph.as_default():
+            _ = session.run(eval_outputs, feed_dict=feed_dict)
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=None)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference or training. We can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+
+
+

4. Apply AdaRound

+

We can now apply AdaRound to this model.

+

Some of the parameters for AdaRound are described below:

+
    +
  • data_loader: AdaRound needs a dataloader to use data samples for the layer-by-layer optimization to learn the rounding vectors. Either a training or validation dataloader could be passed in.

  • +
  • num_batches: The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples. So with a batch size of 32, this translates to 64 batches. To speed up the execution here we are using 5 batches.

  • +
  • default_num_iterations: The number of optimization iterations used to AdaRound each layer. The default value is 10000 and we strongly recommend not reducing this number. But in this example we are using 32 to speed up the execution runtime.

  • +
  • path: The path where AdaRound parameter encodings are exported. Ensure that this folder exists prior to calling the API.

  • +
+
+
[ ]:
+
+
+
from aimet_tensorflow.adaround.adaround_weight import Adaround
+from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters
+
+num_batches = 5
+num_iterations = 32
+data_set = ImageNetDataPipeline.get_val_dataloader().dataset
+params = AdaroundParameters(data_set=data_set, num_batches=num_batches, default_num_iterations=num_iterations)
+ada_model = Adaround.apply_adaround(BN_folded_sess, starting_op_names=starting_op_names,
+                                    output_op_names=output_op_names, params=params,
+                                    path="output", filename_prefix="adaround", default_param_bw=8,
+                                    default_quant_scheme=QuantScheme.post_training_tf_enhanced)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the model after applying AdaRound. We again create a quantization simulation model like before and evaluate it to determine the simulated quantized accuracy.

+

Note: There are two important things to understand in the following cell. - Parameter Bitwidth Precision: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() call.

+
    +
  • Freezing the parameter encodings: After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called before calling the compute_encodings() API. While applying AdaRound, the parameter values were rounded up or down based on these internally created initial encodings. For Quantization Simulation accuracy, it is important to freeze these encodings. If the parameter encodings are NOT frozen, the call to compute_encodings() will alter the value of the parameter encodings and the Quantization Simulation accuracy will not reflect the AdaRounded accuracy.

  • +
+
+
[ ]:
+
+
+
sim = QuantizationSimModel(session=ada_model,
+                           starting_op_names=starting_op_names,
+                           output_op_names=output_op_names,
+                           quant_scheme= QuantScheme.post_training_tf_enhanced,
+                           default_output_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+sim.set_and_freeze_param_encodings(encoding_path=os.path.join("output", "adaround.encodings"))
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=None)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization, using the newly AdaRounded model with updated parameters.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after applying AdaRound. The settings used in this notebook are chosen only to serve as code examples that run quickly and may not be optimal. Please try this workflow against the model of your choice and play with the number of samples and other parameters to get the best results.

+

The next step would be to take this model to target. We need to do two things: - export the model with the updated weights without the fake quantization ops - export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
sim.export(path="./output/", filename_prefix="resnet50_after_adaround")
+
+
+
+
+
+
+

Summary

+

This example illustrated how the AIMET AdaRound API is invoked to achieve post-training quantization. To use AIMET AdaRound for your specific needs, replace the model with your model and replace the data pipeline with your data pipeline. As indicated above, some parameters in this example have been chosen so that the example executes faster.

+

We hope this notebook was useful for you to understand how to use AIMET for performing AdaRound.

+

A few additional resources: - Refer to the AIMET API docs to know more details of the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/adaround.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/adaround.ipynb new file mode 100644 index 00000000..87af0dfe --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/adaround.ipynb @@ -0,0 +1,803 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "# Adaptive Rounding (AdaRound)\n", + "This notebook shows a working code example of how to use AIMET to perform Adaptive Rounding (AdaRound).\n", + "\n", + "AIMET quantization features typically use the \"nearest rounding\" technique for achieving quantization.\n", + "When using the \"nearest rounding\" technique, the weight value is quantized to the nearest integer value.\n", + "\n", + "AdaRound optimizes a loss function using unlabeled training data to decide whether to quantize a specific weight to the closer integer value or the farther one.\n", + "Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following:\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Apply AdaRound and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results\n", + "* For example, it uses a relatively quantization-friendly model like Resnet50\n", + "* Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification.\n", + "If you already have a version of the dataset readily available, use that.\n", + "Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert them into tfrecords.\n", + "\n", + "**Note1**: The dataloader provided in this example notebook relies on the ImageNet tfrecords dataset having the following characteristics:\n", + "- A folder containing tfrecords files starting with **'train\\*'** for training files and **'valid\\*'** for validation files.\n", + "- Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset.\n", + "E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class.\n", + "For the purpose of running this notebook, you could reduce the dataset to 2 samples per class and then convert it into tfrecords.\n", + "This exercise is left up to the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "TFRECORDS_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "We disable logs at the INFO level and disable eager execution.\n", + "We set verbosity to the level as displayed (ERROR), so TensorFlow will display all messages that have the label ERROR (or more critical)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow.compat.v1 as tf\n", + "tf.disable_eager_execution()\n", + "tf.logging.set_verbosity(tf.logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## 1. Example Evaluation and Training Pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?**\n", + "\n", + " Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model.\n", + " This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?**\n", + "\n", + " Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from typing import List\n", + "\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(sess: tf.Session) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess)\n", + "\n", + 
"\n", + " @staticmethod\n", + " def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):\n", + " \"\"\"\n", + " Given a TF session, finetunes it to improve its accuracy\n", + " :param sess: The sess graph to fine-tune.\n", + " :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).\n", + " tf.GraphKeys.UPDATE_OPS collections is always used\n", + " in addition to this list\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_epochs=epochs, format_bgr=True)\n", + "\n", + " trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to a tensorflow session.\n", + "Similarly, you can load any pretrained tensorflow model instead.\n", + "\n", + "Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.\n", + "\n", + "By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op.\n", + "Since batchnorm ops are folded, these need to be ignored during training." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from tensorflow.compat.v1.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "\n", + "model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))\n", + "update_ops_name = [op.name for op in model.updates] # Used for finetuning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "The following utility method in AIMET sets BN layers in the model to eval mode.\n", + "This allows AIMET to more easily read the BN parameters from the graph.\n", + "Eventually we will fold BN layers into adjacent conv layers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag\n", + "\n", + "model = update_keras_bn_ops_trainable_flag(model, load_save_path=\"./\", trainable=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "AIMET features currently support tensorflow sessions.\n", + "**add_image_net_computational_nodes_in_graph** adds an output layer, softmax and loss functions to the Resnet50 model graph." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "\n", + "sess = tf.keras.backend.get_session()\n", + "\n", + "# Creates the computation graph of ResNet within the tensorflow session.\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "starting_op_names = [model.input.name.split(\":\")[0]]\n", + "output_op_names = [model.output.name.split(\":\")[0]]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "We are checking if TensorFlow is using CPU or CUDA device.\n", + "This example code will use CUDA if available in your current execution environment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "use_cuda = tf.test.is_gpu_available(cuda_only=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sess=sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model.\n", + "These layers get folded into adjacent Convolutional layers. 
The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers.\n", + "Doing so results in an inferences/sec speedup since unnecessary computation is avoided.\n", + "\n", + "From a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy.\n", + "However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers.\n", + "\n", + "This can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision).\n", + "We want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers on the given model and returns a new session:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "BN_folded_sess, _= fold_all_batch_norms(sess,\n", + " input_op_names=starting_op_names,\n", + " output_op_names=output_op_names)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel.\n", + "\n", + "Before we create the QuantizationSimModel, we save and load a version of the BN folded session for QuantSim to use.\n", + "QuantSim will insert fake quantization ops in the session passed into it, and we want to maintain a fresh copy of the BN folded session for use in AdaRound later." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph_saver import save_and_load_graph\n", + "BN_folded_sess_copy = save_and_load_graph(\"output\", BN_folded_sess)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "AIMET will insert fake quantization ops in the model graph and configure them.\n", + "A few of the parameters are explained here:\n", + "- **quant_scheme**:\n", + " - We set this to \"post_training_tf_enhanced\"\n", + " With this choice of quant scheme, AIMET will use the TF Enhanced quant scheme to initialize the quantization parameters like scale/offset.\n", + "- **default_output_bw**: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.\n", + "- **default_param_bw**: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.\n", + "\n", + "There are other parameters that are set to default values in this example.\n", + "Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_tensorflow.quantsim import QuantizationSimModel\n", + "\n", + "sim = QuantizationSimModel(session=BN_folded_sess_copy,\n", + " starting_op_names=starting_op_names,\n", + " output_op_names=output_op_names,\n", + " quant_scheme= QuantScheme.post_training_tf_enhanced,\n", + " default_output_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Compute Encodings\n", + "Even though AIMET has added 'quantizer' nodes to the model graph, the model is not ready to be used yet.\n", + "Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node.\n", + "\n", + "For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters.\n", + "This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "We create a routine to pass unlabeled data samples through the model.\n", + "This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model.\n", + "We don't need to compute any loss metrics, so we can just ignore the model output. A few pointers regarding the data samples:\n", + "\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings.\n", + " For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed.\n", + " It's not necessary that all classes need to be covered since we are only looking at the range of values at every layer activation.\n", + " However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings.\n", + "This routine can be written in many different ways, this is just an example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "def pass_calibration_data(session: tf.Session, _):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),\n", + " session.graph.get_tensor_by_name('labels:0')]\n", + "\n", + " train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]\n", + " train_tensors_dict = dict.fromkeys(train_tensors, False)\n", + "\n", + " eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]\n", + "\n", + " samples = 500\n", + "\n", + " batch_cntr = 0\n", + " for input_label in data_loader:\n", + " input_label_tensors_dict = dict(zip(input_label_tensors, input_label))\n", + "\n", + " feed_dict = {**input_label_tensors_dict, **train_tensors_dict}\n", + "\n", + " with session.graph.as_default():\n", + " _ = session.run(eval_outputs, feed_dict=feed_dict)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings.\n", + "Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference or training.\n", + "We can pass this model to the same evaluation routine we used before.\n", + "The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## 4. Apply AdaRound\n", + "\n", + "We can now apply AdaRound to this model.\n", + "\n", + "Some of the parameters for AdaRound are described below:\n", + "\n", + "- **data_loader:** AdaRound needs a dataloader to use data samples for the layer-by-layer optimization to learn the rounding vectors.\n", + " Either a training or validation dataloader could be passed in.\n", + "- **num_batches:** The number of batches used to evaluate the model while calculating the quantization encodings.\n", + " Typically we want AdaRound to use around 2000 samples. 
So with a batch size of 32, this translates to 64 batches.\n", + " To speed up the execution here we are using 5 batches.\n", + "- **default_num_iterations:** The number of iterations to AdaRound each layer.\n", + " Default value is set to 10000 and we strongly recommend to not reduce this number.\n", + " But in this example we are using 32 to speed up the execution runtime.\n", + "- **path:** The path where AdaRound parameter encodings are exported. Ensure that this folder exists prior to calling the API." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.adaround.adaround_weight import Adaround\n", + "from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters\n", + "\n", + "num_batches = 5\n", + "num_iterations = 32\n", + "data_set = ImageNetDataPipeline.get_val_dataloader().dataset\n", + "params = AdaroundParameters(data_set=data_set, num_batches=num_batches, default_num_iterations=num_iterations)\n", + "ada_model = Adaround.apply_adaround(BN_folded_sess, starting_op_names=starting_op_names,\n", + " output_op_names=output_op_names, params=params,\n", + " path=\"output\", filename_prefix=\"adaround\", default_param_bw=8,\n", + " default_quant_scheme=QuantScheme.post_training_tf_enhanced)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the model after applying AdaRound.\n", + "We again create a simulation model like before and evaluate to determine simulated quantized accuracy.\n", + "\n", + "**Note:** There are two important things to understand in the following cell.\n", + " - **Parameter Biwidth Precision**: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() created.\n", + "\n", + " - **Freezing the parameter encodings**:\n", + "After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called before calling the compute_encodings() API.\n", + "While applying AdaRound, the parameter values have been rounded up or down based on these initial encodings internally created.\n", + "For Quantization Simulation accuracy, it is important to freeze these encodings.\n", + "If the parameters encodings are NOT frozen, the call to compute_encodings() will alter the value of the parameters encodings and Quantization Simulation accuracy will not reflect the AdaRounded accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(session=ada_model,\n", + " starting_op_names=starting_op_names,\n", + " output_op_names=output_op_names,\n", + " quant_scheme= QuantScheme.post_training_tf_enhanced,\n", + " default_output_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)\n", + "\n", + "sim.set_and_freeze_param_encodings(encoding_path=os.path.join(\"output\", \"adaround.encodings\"))\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference.\n", + "First we can pass this model to the same evaluation routine we used before.\n", + "The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization, using the newly AdaRounded model with updated parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after applying AdaRound.\n", + "The settings used in this notebook are designed only to serve as code examples, designed to run quickly, but may not be optimal.\n", + "Please try this workflow against the model of your choice and play with the number of samples and other parameters to get the best results.\n", + "\n", + "The next step would be to take this model to target.\n", + "We need to do two things:\n", + "- export the model with the updated weights without the fake quantization ops\n", + "- export the encodings (scale/offset quantization parameters).\n", + "AIMET QuantizationSimModel provides an export API for this purpose." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "sim.export(path=\"./output/\", filename_prefix=\"resnet18_after_adaround\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## Summary\n", + "\n", + "This example illustrated how the AIMET AdaRound API is invoked to achieve post training quantization.\n", + "To use AIMET AdaRound for your specific needs, replace the model with your model and replace the data pipeline with your data pipeline.\n", + "As indicated above, some parameters in this example have been chosen in such a way to make this example execute faster.\n", + "\n", + "We hope this notebook was useful for you to understand how to use AIMET for performing AdaRound.\n", + "\n", + "A few additional resources:\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.6" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/autoquant.html b/releases/1.32.2/Examples/tensorflow/quantization/autoquant.html new file mode 100644 index 00000000..02b99720 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/autoquant.html @@ -0,0 +1,1453 @@ + + + + + + AutoQuant — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AutoQuant

+

This notebook shows a working code example of how to use AIMET AutoQuant feature.

+

AIMET offers a suite of neural network post-training quantization techniques. Often, applying these techniques in a specific sequence results in better accuracy and performance. Without the AutoQuant feature, the AIMET user needs to manually try out various combinations of AIMET quantization features. This manual process is error-prone and often time-consuming.

+

The AutoQuant feature analyzes the model, determines the sequence of AIMET quantization techniques, and applies these techniques. In addition, the user can specify, in the AutoQuant API, the amount of accuracy drop that can be tolerated. As soon as this threshold accuracy is reached, AutoQuant stops applying any additional quantization technique. In summary, the AutoQuant feature saves time and automates the quantization of neural networks.

+
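To make the stop-early behavior concrete, here is a small conceptual sketch (an illustration only, not the AIMET implementation; the function and variable names are made up for this example). The idea is that AutoQuant targets roughly the FP32 baseline accuracy minus allowed_accuracy_drop and stops trying further techniques once a candidate model reaches that target.

def auto_quant_sketch(model, techniques, eval_fn, fp32_accuracy, allowed_accuracy_drop):
    # Conceptual illustration: try techniques in sequence, stop at the target accuracy.
    target_accuracy = fp32_accuracy - allowed_accuracy_drop
    best_model, best_score = model, None
    for apply_technique in techniques:      # e.g. BN folding + CLE, then AdaRound, ...
        best_model = apply_technique(best_model)
        best_score = eval_fn(best_model)
        if best_score >= target_accuracy:   # early exit once the accuracy drop is acceptable
            break
    return best_model, best_score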
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load a pretrained FP32 model 3. Determine the baseline FP32 accuracy 4. Define constants and helper functions 5. Apply AutoQuant

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art AutoQuant results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The ImageNet tfrecords dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them: a folder containing tfrecords files starting with ‘train’ for training files and ‘valid’ for validation files. Each tfrecord file should have the features ‘image/encoded’ for image data and ‘image/class/label’ for its corresponding class.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not necessary.

+
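For reference, the short sketch below shows one possible way to decode records that follow the layout described in Note1, using plain TensorFlow ops. It is only an illustration and is not part of the example pipeline; the 224x224 resize and the file pattern are assumptions, and the provided ImageNetDataLoader already handles this for the notebook.

import tensorflow as tf

# Feature spec matching the tfrecord layout described in Note1 (assumed names).
_FEATURES = {
    'image/encoded': tf.io.FixedLenFeature([], tf.string),
    'image/class/label': tf.io.FixedLenFeature([], tf.int64),
}

def parse_imagenet_record(serialized_example):
    # Parse one serialized tf.train.Example containing the two features above.
    parsed = tf.io.parse_single_example(serialized_example, _FEATURES)
    image = tf.io.decode_jpeg(parsed['image/encoded'], channels=3)
    image = tf.image.resize(image, (224, 224))           # assumed input size
    label = tf.cast(parsed['image/class/label'], tf.int32)
    return image, label

# Hypothetical usage with validation shards:
# files = tf.io.gfile.glob(DATASET_DIR + 'valid*')
# dataset = tf.data.TFRecordDataset(files).map(parse_imagenet_record)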

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/tfrecords/dir/'       # Please replace this with a real directory
+
+
+
+

We disable logs at the INFO level and disable eager execution. We set verbosity to the level as displayed (ERROR), so TensorFlow will display all messages that have the label ERROR (or more critical).

+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow as tf
+tf.compat.v1.disable_eager_execution()
+tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and fine-tuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.compat.v1.Session) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The sess graph to be evaluated.
+        :return: The accuracy for the sample with the maximum accuracy.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess)
+
+
+
+
+
+

2. Load a pretrained FP32 model

+

For this example notebook, we are going to load a pretrained ResNet50 model from keras and convert it to a tensorflow session. Similarly, you can load any pretrained tensorflow model instead.

+

Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

+
+
[ ]:
+
+
+
from tensorflow.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+
+input_shape = (224, 224, 3)
+model = ResNet50(weights='imagenet', input_shape=input_shape)
+
+
+
+

The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+model = update_keras_bn_ops_trainable_flag(model, False, load_save_path='./')
+
+
+
+

AIMET features currently support tensorflow sessions. add_image_net_computational_nodes_in_graph adds an output layer, softmax and loss functions to the Resnet50 model graph.

+
+
[ ]:
+
+
+
from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+
+sess = tf.compat.v1.keras.backend.get_session()
+
+# Creates the computation graph of ResNet within the tensorflow session.
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
+
+

Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here.

+
+
[ ]:
+
+
+
input_tensor_name = model.input.name
+input_op_name, _ = input_tensor_name.split(":")
+output_tensor_name = model.output.name
+output_op_name, _ = output_tensor_name.split(":")
+
+
+
+

We check whether TensorFlow is using the CPU or a CUDA device. This example code will use CUDA if it is available in your current execution environment.

+
+
[ ]:
+
+
+
use_cuda = tf.test.is_gpu_available(cuda_only=True)
+
+
+
+
+
+

3. Determine the baseline FP32 accuracy

+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sess)
+print(accuracy)
+
+
+
+
+
+

4. Define Constants and Helper functions

+

In this section the constants and helper functions needed to run this example are defined.

+
    +
  • EVAL_DATASET_SIZE A typical value is 5000. To execute this example faster this value has been set to 50

  • +
  • CALIBRATION_DATASET_SIZE A typical value is 2000. To execute this example faster this value has been set to 20

  • +
  • BATCH_SIZE User sets the batch size. As an example, set to 10

  • +
+

The helper function _create_sampled_dataset() returns a sampled Dataset based on the dataset and the number of samples provided.

+
+
[ ]:
+
+
+
EVAL_DATASET_SIZE = 50
+CALIBRATION_DATASET_SIZE = 20
+BATCH_SIZE = 10
+
+_sampled_datasets = {}
+
+def _create_sampled_dataset(dataset: tf.compat.v1.data.Dataset,
+                            num_samples: int) -> tf.compat.v1.data.Dataset:
+    if num_samples in _sampled_datasets:
+        return _sampled_datasets[num_samples]
+
+    with dataset._graph.as_default():
+        SHUFFLE_BUFFER_SIZE = 300 # NOTE: Adjust the buffer size as necessary.
+        SHUFFLE_SEED = 22222
+        dataset = dataset.shuffle(buffer_size=SHUFFLE_BUFFER_SIZE, seed=SHUFFLE_SEED)\
+                         .take(num_samples)\
+                         .batch(BATCH_SIZE)
+        _sampled_datasets[num_samples] = dataset
+        return dataset
+
+
+
+
+
+
+

Prepare unlabeled dataset

+

The AutoQuant feature utilizes an unlabeled dataset to achieve quantization. The cell below shows how to get an unlabeled Dataset object from a labeled Dataset.

+
+
[ ]:
+
+
+
eval_dataset = ImageNetDataPipeline.get_val_dataloader().dataset
+
+with eval_dataset._graph.as_default():
+    image_dataset = eval_dataset.map(lambda images, labels: images)
+    unlabeled_dataset = image_dataset.batch(BATCH_SIZE)
+
+
+
+
+
+

Prepare the evaluation callback function

+

The eval_callback() function takes the session object to evaluate and the number of samples to use as arguments. If the num_samples argument is None, the whole evaluation dataset is used to evaluate the model.

+
+
[ ]:
+
+
+
import numpy as np
+from aimet_tensorflow.utils.common import iterate_tf_dataset
+from typing import Optional
+
+
+def eval_callback(sess: tf.compat.v1.Session,
+                  num_samples: Optional[int] = None) -> float:
+    if num_samples is None:
+        num_samples = EVAL_DATASET_SIZE
+
+    sampled_dataset = _create_sampled_dataset(eval_dataset, num_samples)
+
+    with sess.graph.as_default():
+        sess.run(tf.compat.v1.global_variables_initializer())
+        input_tensor = sess.graph.get_tensor_by_name(input_tensor_name)
+        output_tensor = sess.graph.get_tensor_by_name(output_tensor_name)
+
+        num_correct_predictions = 0
+        for images, labels in iterate_tf_dataset(sampled_dataset):
+            prob = sess.run(output_tensor, feed_dict={input_tensor: images})
+            predictions = np.argmax(prob, axis=1)
+            num_correct_predictions += np.sum(predictions == labels)
+
+        return int(num_correct_predictions) / num_samples
+
+
+
+
+
+

5. Apply AutoQuant

+

As a first step, the AutoQuant object is created.

+

The allowed_accuracy_drop parameter conveys to the AutoQuant feature how much accuracy drop the user can tolerate. AutoQuant applies a series of quantization features. When the allowed accuracy is reached, AutoQuant stops applying any subsequent quantization feature. Please refer to the AutoQuant User Guide and API documentation for complete details.

+
+
[ ]:
+
+
+
from aimet_tensorflow.auto_quant import AutoQuant
+
+auto_quant = AutoQuant(allowed_accuracy_drop=0.01,
+                       unlabeled_dataset=unlabeled_dataset,
+                       eval_callback=eval_callback)
+
+
+
+
+
+

Optionally set AdaRound Parameters

+

The AutoQuant feature internally uses default parameters to execute the AdaRound step. If and only if necessary, the default AdaRound Parameters should be modified using the API shown below.

+

Note: To execute this example faster, the default value of the num_iterations parameter has been reduced from 10000 to 2000

+
+
[ ]:
+
+
+
from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters
+
+ADAROUND_DATASET_SIZE = 2000
+adaround_dataset = _create_sampled_dataset(image_dataset, ADAROUND_DATASET_SIZE)
+adaround_params = AdaroundParameters(adaround_dataset,
+                                     num_batches=ADAROUND_DATASET_SIZE // BATCH_SIZE)
+auto_quant.set_adaround_params(adaround_params)
+
+
+
+
+
+

Run AutoQuant

+

This step applies the AutoQuant feature. The best possible quantized model, the associated eval_score and the path to the AdaRound encoding files are returned.

+
+
[ ]:
+
+
+
sess, accuracy, encoding_path =\
+    auto_quant.apply(tf.compat.v1.keras.backend.get_session(),
+                     starting_op_names=[input_op_name],
+                     output_op_names=[output_op_name])
+
+
+
+
+
[ ]:
+
+
+
print(accuracy)
+print(encoding_path)
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use the AIMET AutoQuant feature.

+

A few additional resources: - Refer to the AIMET API docs for more details on the APIs and parameters - Refer to the other example notebooks to understand how to use the AIMET CLE and AdaRound features in a standalone fashion.

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/autoquant.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/autoquant.ipynb new file mode 100644 index 00000000..93577491 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/autoquant.ipynb @@ -0,0 +1,539 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "# AutoQuant\n", + "\n", + "This notebook shows a working code example of how to use AIMET AutoQuant feature.\n", + "\n", + "AIMET offers a suite of neural network post-training quantization techniques. Often, applying these techniques in a specific sequence, results in better accuracy and performance. Without the AutoQuant feature, the AIMET user needs to manually try out various combinations of AIMET quantization features. This manual process is error-prone and often time-consuming.\n", + "\n", + "The AutoQuant feature, analyzes the model, determines the sequence of AIMET quantization techniques and applies these techniques. In addition, the user can specify the amount of accuracy drop that can be tolerated, in the AutoQuant API. As soon as this threshold accuracy is reached, AutoQuant stops applying any additional quantization technique. In summary, the AutoQuant feature saves time and automates the quantization of the neural networks.\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load a pretrained FP32 model\n", + "3. Determine the baseline FP32 accuracy\n", + "4. Define constants and helper functions\n", + "5. Apply AutoQuant\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art AutoQuant results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. [https://image-net.org/challenges/LSVRC/2012/index.php#](https://image-net.org/challenges/LSVRC/2012/index.php#)) and convert them into tfrecords.\n", + "\n", + "**Note1**: The ImageNet tfrecords dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- A folder containing tfrecords files starting with **'train'** for training files and **'valid'** for validation files. Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class and then convert it into tfrecords. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/tfrecords/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "We disable logs at the INFO level and disable eager execution. We set verbosity to the level as displayed (ERROR), so TensorFlow will display all messages that have the label ERROR (or more critical)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow as tf\n", + "tf.compat.v1.disable_eager_execution()\n", + "tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and fine-tuning using ImageNet Dataset.\n", + " \"\"\"\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(sess: tf.compat.v1.Session) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## 2. Load a pretrained FP32 model\n", + "For this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to a tensorflow session. 
Similarly, you can load any pretrained tensorflow model instead.\n", + "\n", + "Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from tensorflow.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "\n", + "input_shape = (224, 224, 3)\n", + "model = ResNet50(weights='imagenet', input_shape=input_shape)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag\n", + "\n", + "model = update_keras_bn_ops_trainable_flag(model, False, load_save_path='./')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "AIMET features currently support tensorflow sessions. **add_image_net_computational_nodes_in_graph** adds an output layer, softmax and loss functions to the Resnet50 model graph." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "\n", + "sess = tf.compat.v1.keras.backend.get_session()\n", + "\n", + "# Creates the computation graph of ResNet within the tensorflow session.\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "input_tensor_name = model.input.name\n", + "input_op_name, _ = input_tensor_name.split(\":\")\n", + "output_tensor_name = model.output.name\n", + "output_op_name, _ = output_tensor_name.split(\":\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "We are checking if TensorFlow is using CPU or CUDA device. This example code will use CUDA if available in your current execution environment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "use_cuda = tf.test.is_gpu_available(cuda_only=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## 3. Determine the baseline FP32 accuracy\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sess=sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## 4. 
Define Constants and Helper functions\n", + "\n", + "In this section the constants and helper functions needed to run this example are defined.\n", + "\n", + "- **EVAL_DATASET_SIZE** A typical value is 5000. To execute this example faster this value has been set to 50\n", + "- **CALIBRATION_DATASET_SIZE** A typical value is 2000. To execute this example faster this value has been set to 20\n", + "- **BATCH_SIZE** User sets the batch size. As an example, set to 10\n", + "\n", + "\n", + "The helper function **_create_sampled_data_loader()** returns a DataLoader based on the dataset and the number of samples provided." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "EVAL_DATASET_SIZE = 50\n", + "CALIBRATION_DATASET_SIZE = 20\n", + "BATCH_SIZE = 10\n", + "\n", + "_sampled_datasets = {}\n", + "\n", + "def _create_sampled_dataset(dataset: tf.compat.v1.data.Dataset,\n", + " num_samples: int) -> tf.compat.v1.data.Dataset:\n", + " if num_samples in _sampled_datasets:\n", + " return _sampled_datasets[num_samples]\n", + "\n", + " with dataset._graph.as_default():\n", + " SHUFFLE_BUFFER_SIZE = 300 # NOTE: Adjust the buffer size as necessary.\n", + " SHUFFLE_SEED = 22222\n", + " dataset = dataset.shuffle(buffer_size=SHUFFLE_BUFFER_SIZE, seed=SHUFFLE_SEED)\\\n", + " .take(num_samples)\\\n", + " .batch(BATCH_SIZE)\n", + " _sampled_datasets[num_samples] = dataset\n", + " return dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Prepare unlabeled dataset\n", + "\n", + "The AutoQuant feature utilizes an unlabeled dataset to achieve quantization. Below cell shows how to get an unlabeled Dataset object from a labeled Dataset." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "eval_dataset = ImageNetDataPipeline.get_val_dataloader().dataset\n", + "\n", + "with eval_dataset._graph.as_default():\n", + " image_dataset = eval_dataset.map(lambda images, labels: images)\n", + " unlabeled_dataset = image_dataset.batch(BATCH_SIZE)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## Prepare the evaluation callback function\n", + "\n", + "The **eval_callback()** function takes the session object to evaluate and the number of samples to use as arguments. If the **num_samples** argument is None, the whole evaluation dataset is used to evaluate the model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "from aimet_tensorflow.utils.common import iterate_tf_dataset\n", + "from typing import Optional\n", + "\n", + "\n", + "def eval_callback(sess: tf.compat.v1.Session,\n", + " num_samples: Optional[int] = None) -> float:\n", + " if num_samples is None:\n", + " num_samples = EVAL_DATASET_SIZE\n", + "\n", + " sampled_dataset = _create_sampled_dataset(eval_dataset, num_samples)\n", + "\n", + " with sess.graph.as_default():\n", + " sess.run(tf.compat.v1.global_variables_initializer())\n", + " input_tensor = sess.graph.get_tensor_by_name(input_tensor_name)\n", + " output_tensor = sess.graph.get_tensor_by_name(output_tensor_name)\n", + "\n", + " num_correct_predictions = 0\n", + " for images, labels in iterate_tf_dataset(sampled_dataset):\n", + " prob = sess.run(output_tensor, feed_dict={input_tensor: images})\n", + " predictions = np.argmax(prob, axis=1)\n", + " num_correct_predictions += np.sum(predictions == labels)\n", + "\n", + " return int(num_correct_predictions) / num_samples" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## 5. Apply AutoQuant\n", + "\n", + "As a first step, the AutoQuant object is created.\n", + "\n", + "The **allowed_accuracy_drop** parameter is set by the user to convey to the AutoQuant feature, how much accuracy drop is tolerated by the user. AutoQuant applies a series of quantization features. When the allowed accuracy is reached, AutoQuant stops applying any subsequent quantization feature. Please refer AutoQuant User Guide and API documentation for complete details." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.auto_quant import AutoQuant\n", + "\n", + "auto_quant = AutoQuant(allowed_accuracy_drop=0.01,\n", + " unlabeled_dataset=unlabeled_dataset,\n", + " eval_callback=eval_callback)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## Optionally set AdaRound Parameters\n", + "The AutoQuant feature internally uses default parameters to execute the AdaRound step.\n", + "If and only if necessary, the default AdaRound Parameters should be modified using the API shown below.\n", + "\n", + "**Note:**\n", + "To execute this example faster, the default value of the **num_iterations** parameter has been reduced from 10000 to 2000" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters\n", + "\n", + "ADAROUND_DATASET_SIZE = 2000\n", + "adaround_dataset = _create_sampled_dataset(image_dataset, ADAROUND_DATASET_SIZE)\n", + "adaround_params = AdaroundParameters(adaround_dataset,\n", + " num_batches=ADAROUND_DATASET_SIZE // BATCH_SIZE)\n", + "auto_quant.set_adaround_params(adaround_params)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## Run AutoQuant\n", + "\n", + "This step applies the AutoQuant feature. The best possible quantized model, the associated eval_score and the path to the AdaRound encoding files are returned." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "sess, accuracy, encoding_path =\\\n", + " auto_quant.apply(tf.compat.v1.keras.backend.get_session(),\n", + " starting_op_names=[input_op_name],\n", + " output_op_names=[output_op_name])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "print(accuracy)\n", + "print(encoding_path)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET AutoQuant feature.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET CLE and AdaRound features in a standalone fashion." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.6" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/bn_reestimation.html b/releases/1.32.2/Examples/tensorflow/quantization/bn_reestimation.html new file mode 100644 index 00000000..e6e918f5 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/bn_reestimation.html @@ -0,0 +1,1537 @@ + + + + + + Quantization-Aware Training with BatchNorm Re-estimation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + +
  • + View page source +
  • +
+
+
+
+
+ +
+

Quantization-Aware Training with BatchNorm Re-estimation

+

This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training) with batchnorm re-estimation. Batchnorm re-estimation is a technique for countering potential instability of batchnorm statistics (i.e. running mean and variance) during QAT. More specifically, batchnorm re-estimation recalculates the batchnorm statistics based on the model after QAT. By doing so, we aim to make our model learn batchnorm statistics from stable outputs after QAT, rather than from likely noisy outputs during QAT.

+
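As a rough mental model, re-estimation simply replaces the running statistics accumulated during noisy QAT updates with statistics measured on the trained model over a handful of calibration batches. The NumPy sketch below is a conceptual illustration of that math only, not the AIMET API used later in this notebook.

import numpy as np

def reestimate_bn_statistics(per_batch_activations):
    """Recompute a BN layer's running statistics from post-QAT activations.

    per_batch_activations: list of arrays of shape (N, C), collected by running
    a few batches through the trained model (an assumption for this sketch).
    """
    batch_means = [np.mean(batch, axis=0) for batch in per_batch_activations]
    batch_vars = [np.var(batch, axis=0) for batch in per_batch_activations]
    # Averaging the per-batch statistics gives stable estimates of mean and variance.
    running_mean = np.mean(batch_means, axis=0)
    running_var = np.mean(batch_vars, axis=0)
    return running_mean, running_var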
+

Overall flow

+

This notebook covers the following steps: 1. Create a quantization simulation model with fake quantization ops inserted. 2. Finetune and evaluate the quantization simulation model 3. Re-estimate batchnorm statistics and compare the eval score before and after re-estimation. 4. Fold the re-estimated batchnorm layers and export the quantization simulation model

+
+
+

What this notebook is not

+

In this notebook, we will focus on how to apply batchnorm re-estimation after QAT, rather than covering all the details about QAT itself. For more information about QAT, please refer to the QAT notebook or the QAT range learning notebook.

+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The ImageNet tfrecords dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them: a folder containing tfrecords files starting with ‘train*’ for training files and ‘valid*’ for validation files. Each tfrecord file should have the features ‘image/encoded’ for image data and ‘image/class/label’ for its corresponding class.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not necessary.

+
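If you do build such a reduced subset yourself, the sketch below shows one possible way to serialize an image/label pair into the tfrecord layout described in Note1. It is only an illustration and not part of the example pipeline; the file name and label value in the usage comment are hypothetical.

import tensorflow as tf

def make_imagenet_example(encoded_jpeg: bytes, label: int) -> tf.train.Example:
    # Serialize one image/label pair using the feature names from Note1.
    return tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': tf.train.Feature(bytes_list=tf.train.BytesList(value=[encoded_jpeg])),
        'image/class/label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }))

# Hypothetical usage:
# with tf.io.TFRecordWriter('valid-00000-of-00001.tfrecord') as writer:
#     example = make_imagenet_example(open('sample.jpg', 'rb').read(), 0)
#     writer.write(example.SerializeToString())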

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
TFRECORDS_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s session graph to create a QuantizationSim model which is still a Tensorflow graph. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+import tensorflow.compat.v1 as tf
+tf.disable_eager_execution()
+tf.logging.set_verbosity(tf.logging.ERROR)
+from typing import List
+
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.Session) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The sess graph to be evaluated.
+        :return: The top-1 accuracy of the session on the validation dataset.
+        """
+        evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess)
+
+
+    @staticmethod
+    def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):
+        """
+        Given a TF session, finetunes it to improve its accuracy
+        :param sess: The sess graph to fine-tune.
+        :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).
+                                tf.GraphKeys.UPDATE_OPS collections is always used
+                                in addition to this list
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.
+        """
+        trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                  data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_epochs=epochs, format_bgr=True)
+
+        trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)
+
+
+
+
+
+
+

2. Load FP32 model

+

AIMET currently supports BatchNorm re-estimation on Tensorflow sessions. In this example notebook, we are going to load a pretrained ResNet50 model from keras and convert it to work with a Tensorflow session. Similarly, you can load any pretrained Tensorflow model. Please refer to the QAT notebook for more detail.

+
+
[ ]:
+
+
+
from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+sess = tf.keras.backend.get_session()
+
+# Following lines are additional steps to make keras model work with AIMET.
+from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
+
+

We need the names of the model’s input and output ops to work with AIMET.

+
+
[ ]:
+
+
+
input_op_names = [model.input.op.name]
+output_op_names = [model.output.op.name]
+
+
+
+
+

BatchNorm Rewriter

+

Later in this notebook, we will make changes to the parameters of the BatchNorm layers to improve performance. However, depending on how the BatchNorm layers were configured, this might be difficult to achieve.

+

AIMET provides modify_sess_bn_mutable, which rewrites the BatchNorm layers in the session to make their parameters easier to modify.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.op.bn_mutable import modify_sess_bn_mutable
+modify_sess_bn_mutable(sess, input_op_names, output_op_names, training_tf_placeholder=False)
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sess)
+print(accuracy)
+
+
+
+
+
+
+
+

3. Create a quantization simulation model and Perform QAT

+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: The code below uses QuantScheme.training_range_learning_with_tf_init so that quantization ranges can be learned during QAT. Other supported options include QuantScheme.post_training_tf and QuantScheme.post_training_tf_enhanced.
- default_output_bw: Setting this to 8 essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
- default_param_bw: Setting this to 8 essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+

NOTE: Unlike in other QAT example scripts, we didn’t fold batchnorm layers before QAT. This is because we aim to finetune our model with the batchnorm layers present and re-estimate the batchnorm statistics for better accuracy. The batchnorm layers will be folded after re-estimation.

+
+
[ ]:
+
+
+
import json
+from aimet_common.defs import QuantScheme
+from aimet_tensorflow.quantsim import QuantizationSimModel
+
+default_config_per_channel = {
+            "defaults":
+                {
+                    "ops":
+                        {
+                            "is_output_quantized": "True"
+                        },
+                    "params":
+                        {
+                            "is_quantized": "True",
+                            "is_symmetric": "True"
+                        },
+                    "strict_symmetric": "False",
+                    "unsigned_symmetric": "True",
+                    "per_channel_quantization": "True"
+                },
+
+            "params":
+                {
+                    "bias":
+                        {
+                            "is_quantized": "False"
+                        }
+                },
+
+            "op_type":
+                {
+                    "Squeeze":
+                        {
+                            "is_output_quantized": "False"
+                        },
+                    "Pad":
+                        {
+                            "is_output_quantized": "False"
+                        },
+                    "Mean":
+                        {
+                            "is_output_quantized": "False"
+                        }
+                },
+
+            "supergroups":
+                [
+                    {
+                        "op_list": ["Conv", "Relu"]
+                    },
+                    {
+                        "op_list": ["Conv", "Clip"]
+                    },
+                    {
+                        "op_list": ["Conv", "BatchNormalization", "Relu"]
+                    },
+                    {
+                        "op_list": ["Add", "Relu"]
+                    },
+                    {
+                        "op_list": ["Gemm", "Relu"]
+                    }
+                ],
+
+            "model_input":
+                {
+                    "is_input_quantized": "True"
+                },
+
+            "model_output":
+                {}
+        }
+
+config_file_path = "/tmp/default_config_per_channel.json"
+with open(config_file_path, "w") as f:
+    json.dump(default_config_per_channel, f)
+
+sim = QuantizationSimModel(sess, input_op_names, output_op_names, use_cuda=True,
+                                   quant_scheme=QuantScheme.training_range_learning_with_tf_init,
+                                   config_file=config_file_path)
+
+
+
+
+
+
+
+

Compute Encodings

+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can just ignore the model output for this purpose. A few pointers regarding the data samples:
- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples; for computing encodings we only need 500 or 1000 samples.
- It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes be covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario where only ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
def pass_calibration_data(session: tf.compat.v1.Session, _):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),
+                           session.graph.get_tensor_by_name('labels:0')]
+
+    train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]
+    train_tensors_dict = dict.fromkeys(train_tensors, False)
+
+    eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]
+
+    samples = 500
+
+    batch_cntr = 0
+    for input_label in data_loader:
+        input_label_tensors_dict = dict(zip(input_label_tensors, input_label))
+
+        feed_dict = {**input_label_tensors_dict, **train_tensors_dict}
+
+        with session.graph.as_default():
+            _ = session.run(eval_outputs, feed_dict=feed_dict)
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=None)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+

Perform QAT

+

To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+
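As a concrete, hypothetical illustration of the rule of thumb above, a simple step-decay schedule can be written as follows. The numbers are placeholders for illustration, not recommendations from this notebook.

def step_decay_lr(initial_lr: float, epoch: int, drop_every: int = 5, factor: float = 10.0) -> float:
    """Return the learning rate for a given epoch, dividing by `factor` every `drop_every` epochs."""
    return initial_lr / (factor ** (epoch // drop_every))

# Example: starting near the original model's final learning rate (assumed 1e-5 here),
# epochs 0-4 use 1e-5, epochs 5-9 use 1e-6, and so on.
print([step_decay_lr(1e-5, e) for e in range(12)])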

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
update_ops_name = [op.name for op in model.updates] # Used for finetuning
+ImageNetDataPipeline.finetune(sim.session, update_ops_name=update_ops_name, epochs=1, learning_rate=5e-7, decay_steps=5)
+
+
+
+

After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy.

+
+
[ ]:
+
+
+
finetuned_accuracy  = ImageNetDataPipeline.evaluate(sim.session)
+print(finetuned_accuracy)
+
+
+
+
+
+
+
+

4. Perform BatchNorm Reestimation

+
+

Re-estimate BatchNorm Statistics

+

AIMET provides a helper function, reestimate_bn_stats, for re-estimating batchnorm statistics. The parameters passed in the code below are: * sim: The QuantizationSimModel whose batchnorm statistics will be re-estimated. * start_op_names / output_op_names: Names of the input and output ops of the graph. * bn_re_estimation_dataset: A dataset yielding unlabeled input batches used for re-estimation. * bn_num_batches: The number of batches to be used for re-estimation (the code below uses 100).

+
+
[ ]:
+
+
+
from aimet_tensorflow.bn_reestimation import reestimate_bn_stats
+import numpy as np
+
+data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+arrays=[]
+for input_label in data_loader:
+    arrays.append(input_label[0])
+real_inputs = np.vstack(arrays)
+
+dataset = tf.compat.v1.data.Dataset.from_tensor_slices(real_inputs)
+bn_re_estimation_dataset = dataset.batch(32)
+
+reestimate_bn_stats(sim, start_op_names=input_op_names, output_op_names=output_op_names,
+                    bn_re_estimation_dataset=bn_re_estimation_dataset, bn_num_batches=100)
+
+finetuned_accuracy_bn_reestimated = ImageNetDataPipeline.evaluate(sim.session)
+print(finetuned_accuracy_bn_reestimated)
+
+
+
+
+
+

Fold BatchNorm Layers

+

So far, we have improved our quantization simulation model through QAT and batchnorm re-estimation. The next step would be to actually take this model to target. But first, we should fold the batchnorm layers for our model to run on target devices more efficiently.

+
+
[ ]:
+
+
+
from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms_to_scale
+
+fold_all_batch_norms_to_scale(sim, input_op_names, output_op_names)
+
+
+
+
+
+
+
+

5. Export Model

+

As the final step, we will export the model to run it on actual target devices. AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+sim.export(path='./output/', filename_prefix='resnet50_after_qat')
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use the batchnorm re-estimation feature of AIMET.

+

A few additional resources:
- Refer to the AIMET API docs for more details on the APIs and optional parameters.
- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT methods.

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/bn_reestimation.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/bn_reestimation.ipynb new file mode 100644 index 00000000..0bb2da97 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/bn_reestimation.ipynb @@ -0,0 +1,565 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quantization-Aware Training with BatchNorm Re-estimation\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training) with batchnorm re-estimation.\n", + "Batchnorm re-estimation is a technique for countering potential instability of batchnrom statistics (i.e. running mean and variance) during QAT. More specifically, batchnorm re-estimation recalculates the batchnorm statistics based on the model after QAT. By doing so, we aim to make our model learn batchnorm statistics from from stable outputs after QAT, rather than from likely noisy outputs during QAT.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following steps:\n", + "1. Create a quantization simulation model with fake quantization ops inserted.\n", + "2. Finetune and evaluate the quantization simulation model\n", + "3. Re-estimate batchnorm statistics and compare the eval score before and after re-estimation.\n", + "4. Fold the re-estimated batchnorm layers and export the quantization simulation model\n", + "\n", + "#### What this notebook is not\n", + "In this notebook, we will focus how to apply batchnorm re-estimation after QAT, rather than covering all the details about QAT itself. For more information about QAT, please refer to [QAT notebook](https://github.com/quic/aimet/blob/develop/Examples/tensorflow/quantization/qat.ipynb) or [QAT range learning notebook](https://github.com/quic/aimet/blob/develop/Examples/tensorflow/quantization/qat_range_learning.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert them into tfrecords.\n", + "\n", + "**Note1**: The ImageNet tfrecords dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- A folder containing tfrecords files starting with **'train\\*'** for training files and **'valid\\*'** for validation files. Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class and then convert it into tfrecords. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "is_executing": true + } + }, + "outputs": [], + "source": [ + "TFRECORDS_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's session graph to create a QuantizationSim model which is still a Tensorflow graph. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "is_executing": true + } + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "import tensorflow.compat.v1 as tf\n", + "tf.disable_eager_execution()\n", + "tf.logging.set_verbosity(tf.logging.ERROR)\n", + "from typing import List\n", + "\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(sess: tf.Session) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess)\n", + "\n", + "\n", + " @staticmethod\n", + " def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):\n", + " \"\"\"\n", + " Given a TF session, finetunes it to improve its accuracy\n", + " :param sess: The sess graph to fine-tune.\n", + " :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).\n", + " tf.GraphKeys.UPDATE_OPS collections is always used\n", + " in addition to this list\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate 
used during the finetuning step.\n", + " :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_epochs=epochs, format_bgr=True)\n", + "\n", + " trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load FP32 model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET currently support BatchNorm Re-estimation on Tensorflow sessions. In this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to work with Tensorflow session. Similarly, you can load any pretrained Tensorflow model. Please refer to [QAT notebook](https://github.com/quic/aimet/blob/develop/Examples/tensorflow/quantization/quantization_aware_training.ipynb) for more detail." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.compat.v1.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))\n", + "sess = tf.keras.backend.get_session()\n", + "\n", + "# Following lines are additional steps to make keras model work with AIMET.\n", + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We need names of input and output of the model to work with AIMET." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "input_op_names = [model.input.op.name]\n", + "output_op_names = [model.output.op.name]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### BatchNorm Rewriter\n", + "In the later notebook, we will make changes to parameters of BatchNorms to improve performance.\n", + "However, depending on how the BatchNorm was configured, this might be difficult to achieve.\n", + "\n", + "AIMET provides `model_sess_bn_mutable` that changes BatchNorm layer to make it easier to modify parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.op.bn_mutable import modify_sess_bn_mutable\n", + "modify_sess_bn_mutable(sess, input_op_names, output_op_names, training_tf_placeholder=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sess=sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 3. 
Create a quantization simulation model and Perform QAT\n", + "\n", + "### Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.\n", + "\n", + "**NOTE**: Note that, unlike in other QAT example scripts, we didn't fold batchnorm layers before QAT. This is because we aim to finetune our model with batchnorm layers present and re-estimate the batchnorm statatistics for better accuracy. The batchnorm layers will be folded after re-estimation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import json\n", + "from aimet_common.defs import QuantScheme\n", + "from aimet_tensorflow.quantsim import QuantizationSimModel\n", + "\n", + "default_config_per_channel = {\n", + " \"defaults\":\n", + " {\n", + " \"ops\":\n", + " {\n", + " \"is_output_quantized\": \"True\"\n", + " },\n", + " \"params\":\n", + " {\n", + " \"is_quantized\": \"True\",\n", + " \"is_symmetric\": \"True\"\n", + " },\n", + " \"strict_symmetric\": \"False\",\n", + " \"unsigned_symmetric\": \"True\",\n", + " \"per_channel_quantization\": \"True\"\n", + " },\n", + "\n", + " \"params\":\n", + " {\n", + " \"bias\":\n", + " {\n", + " \"is_quantized\": \"False\"\n", + " }\n", + " },\n", + "\n", + " \"op_type\":\n", + " {\n", + " \"Squeeze\":\n", + " {\n", + " \"is_output_quantized\": \"False\"\n", + " },\n", + " \"Pad\":\n", + " {\n", + " \"is_output_quantized\": \"False\"\n", + " },\n", + " \"Mean\":\n", + " {\n", + " \"is_output_quantized\": \"False\"\n", + " }\n", + " },\n", + "\n", + " \"supergroups\":\n", + " [\n", + " {\n", + " \"op_list\": [\"Conv\", \"Relu\"]\n", + " },\n", + " {\n", + " \"op_list\": [\"Conv\", \"Clip\"]\n", + " },\n", + " {\n", + " \"op_list\": [\"Conv\", \"BatchNormalization\", \"Relu\"]\n", + " },\n", + " {\n", + " \"op_list\": [\"Add\", \"Relu\"]\n", + " },\n", + " {\n", + " \"op_list\": [\"Gemm\", \"Relu\"]\n", + " }\n", + " ],\n", + "\n", + " \"model_input\":\n", + " {\n", + " \"is_input_quantized\": \"True\"\n", + " },\n", + "\n", + " \"model_output\":\n", + " {}\n", + " }\n", + "\n", + "config_file_path = \"/tmp/default_config_per_channel.json\"\n", + "with open(config_file_path, \"w\") as f:\n", + " json.dump(default_config_per_channel, f)\n", + "\n", + "sim = QuantizationSimModel(sess, input_op_names, output_op_names, use_cuda=True,\n", + " quant_scheme=QuantScheme.training_range_learning_with_tf_init,\n", + " config_file=config_file_path)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Compute Encodings\n", + "Even 
though AIMET has added 'quantizer' nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def pass_calibration_data(session: tf.compat.v1.Session, _):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),\n", + " session.graph.get_tensor_by_name('labels:0')]\n", + "\n", + " train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]\n", + " train_tensors_dict = dict.fromkeys(train_tensors, False)\n", + "\n", + " eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]\n", + "\n", + " samples = 500\n", + "\n", + " batch_cntr = 0\n", + " for input_label in data_loader:\n", + " input_label_tensors_dict = dict(zip(input_label_tensors, input_label))\n", + "\n", + " feed_dict = {**input_label_tensors_dict, **train_tensors_dict}\n", + "\n", + " with session.graph.as_default():\n", + " _ = session.run(eval_outputs, feed_dict=feed_dict)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Perform QAT\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "update_ops_name = [op.name for op in model.updates] # Used for finetuning\n", + "ImageNetDataPipeline.finetune(sim.session, update_ops_name=update_ops_name, epochs=1, learning_rate=5e-7, decay_steps=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "finetuned_accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(finetuned_accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. Perform BatchNorm Reestimation\n", + "\n", + "### Re-estimate BatchNorm Statistics\n", + "AIMET provides a helper function, `reestimate_bn_stats`, for re-estimating batchnorm statistics.\n", + "Here is the full list of parameters for this function:\n", + "* **model**: Model to re-estimate the BatchNorm statistics.\n", + "* **dataloader** Train dataloader.\n", + "* **num_batches** (optional): The number of batches to be used for reestimation. (Default: 100)\n", + "* **forward_fn** (optional): Optional adapter function that performs forward pass given a model and a input batch yielded from the data loader. If not specified, it is expected that inputs yielded from dataloader can be passed directly to the model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.bn_reestimation import reestimate_bn_stats\n", + "import numpy as np\n", + "\n", + "data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + "arrays=[]\n", + "for input_label in data_loader:\n", + " arrays.append(input_label[0])\n", + "real_inputs = np.vstack(arrays)\n", + "\n", + "dataset = tf.compat.v1.data.Dataset.from_tensor_slices(real_inputs)\n", + "bn_re_restimation_dataset = dataset.batch(32)\n", + "\n", + "reestimate_bn_stats(sim, start_op_names=input_op_names, output_op_names=output_op_names,\n", + " bn_re_estimation_dataset=bn_re_restimation_dataset, bn_num_batches=100)\n", + "\n", + "finetuned_accuracy_bn_reestimated = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(finetuned_accuracy_bn_reestimated)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Fold BatchNorm Layers\n", + "\n", + "So far, we have improved our quantization simulation model through QAT and batchnorm re-estimation. The next step would be to actually take this model to target. But first, we should fold the batchnorm layers for our model to run on target devices more efficiently." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms_to_scale\n", + "\n", + "fold_all_batch_norms_to_scale(sim, input_op_names, output_op_names)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 5. Export Model\n", + "As the final step, we will export the model to run it on actual target devices. AIMET QuantizationSimModel provides an export API for this purpose." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "sim.export(path='./output/', filename_prefix='resnet50_after_qat')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use batchnorm re-estimation feature of AIMET.\n", + "\n", + "Few additional resources\n", + "- Refer to the [AIMET API docs](https://quic.github.io/aimet-pages/AimetDocs/api_docs/index.html) to know more details of the APIs and optional parameters.\n", + "- Refer to the [other example notebooks](https://github.com/quic/aimet/tree/develop/Examples/tensorflow/quantization) to understand how to use AIMET post-training quantization techniques and QAT methods." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/cle_bc.html b/releases/1.32.2/Examples/tensorflow/quantization/cle_bc.html new file mode 100644 index 00000000..fafa8545 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/cle_bc.html @@ -0,0 +1,1554 @@ + + + + + + Cross-Layer Equalization (CLE) and Bias Correction (BC) — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + +
  • + View page source +
  • +
+
+
+
+
+ +
+

Cross-Layer Equalization (CLE) and Bias Correction (BC)

+

This notebook showcases a working code example of how to use AIMET to apply Cross-Layer Equalization (CLE) and Bias Correction (BC). CLE and BC are post-training quantization techniques that aim to improve quantized accuracy of a given model. CLE does not need any data samples. BC may optionally need unlabelled data samples. These techniques help recover quantized accuracy when the model quantization is sensitive to parameter quantization as opposed to activation quantization.

+

To learn more about these techniques, please refer to the “Data-Free Quantization Through Weight Equalization and Bias Correction” paper from ICCV 2019 - https://arxiv.org/abs/1906.04721

+

Cross-Layer Equalization
AIMET performs the following steps when running CLE:
1. Batch Norm Folding: Folds BN layers into Conv layers immediately before or after the Conv layers.
2. Cross-Layer Scaling: Given a set of consecutive Conv layers, equalizes the range of tensor values per-channel by scaling up/down the per-channel weight tensor values of a layer and correspondingly scaling down/up the per-channel weight tensor values of the subsequent layer (a small numerical illustration of this step follows below).
3. High Bias Folding: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer’s parameters.

+
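The following small NumPy sketch illustrates the cross-layer scaling step (step 2) on two consecutive layers represented as matrices. It is an illustration of the idea only, not AIMET’s implementation; all shapes and numbers are made up. Because ReLU(s·x) = s·ReLU(x) for s > 0, scaling one layer’s output channels by s and the next layer’s matching inputs by 1/s leaves the network function unchanged while equalizing the per-channel weight ranges.

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8)) * np.array([[0.01], [1.0], [5.0], [0.1]])  # badly spread per-channel ranges
W2 = rng.normal(size=(3, 4))
x = rng.normal(size=(8,))

r1 = np.abs(W1).max(axis=1)        # per-output-channel range of layer 1
r2 = np.abs(W2).max(axis=0)        # per-input-channel range of layer 2
s = np.sqrt(r1 * r2) / r1          # scale that makes the two ranges equal

W1_eq = W1 * s[:, None]
W2_eq = W2 / s[None, :]

relu = lambda v: np.maximum(v, 0.0)
y_orig = W2 @ relu(W1 @ x)
y_eq = W2_eq @ relu(W1_eq @ x)
print(np.allclose(y_orig, y_eq))                              # True: the network function is preserved
print(np.abs(W1_eq).max(axis=1), np.abs(W2_eq).max(axis=0))   # per-channel ranges are now equalized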
+
Bias Correction
+
Quantization sometimes leads to a shift in layer outputs. This technique helps correct the shift by adjusting the bias parameters of that layer. Note that this technique is generally applied after CLE, but it is an optional step.
+
+
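The sketch below illustrates the idea of bias correction with synthetic numbers (it is not the AIMET API): if quantization introduces a systematic shift in a layer’s expected output, that shift can be measured on unlabeled samples and folded back into the bias.

import numpy as np

rng = np.random.default_rng(1)
fp32_out = rng.normal(size=(1000, 16))                                     # FP32 pre-activations, 16 channels
quant_out = fp32_out + rng.normal(loc=0.05, scale=0.02, size=(1000, 16))   # simulated quantization shift

bias = np.zeros(16)
bias_corrected = bias + (fp32_out - quant_out).mean(axis=0)   # fold the observed shift into the bias
print(bias_corrected[:4])                                     # roughly -0.05 per channel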
+

Overall flow

+

This notebook covers the following:
1. Instantiate the example evaluation and training pipeline
2. Load the FP32 model and evaluate it to find the baseline FP32 accuracy
3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score
4. Apply CLE and BC, and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
• This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The ImageNet tfrecords dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them - a folder containing tfrecord files starting with ‘train*’ for training files and ‘valid*’ for validation files. Each tfrecord file should have the features ‘image/encoded’ for image data and ‘image/class/label’ for its corresponding class.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could, for example, reduce the dataset to 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
TFRECORDS_DIR = '/path/to/tfrecords/dir/'        # Please replace this with a real directory
+
+
+
+

We disable logs at the INFO level and disable eager execution. We set the verbosity level to ERROR, so TensorFlow will only display messages that have the label ERROR (or more critical).

+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow.compat.v1 as tf
+tf.disable_eager_execution()
+tf.logging.set_verbosity(tf.logging.ERROR)
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
• Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s session graph to create a QuantizationSim model which is still a Tensorflow graph. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
from typing import List
+
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.Session) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The sess graph to be evaluated.
+        :return: The top-1 accuracy of the session on the validation dataset.
+        """
+        evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess)
+
+
+    @staticmethod
+    def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):
+        """
+        Given a TF session, finetunes it to improve its accuracy
+        :param sess: The sess graph to fine-tune.
+        :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).
+                                tf.GraphKeys.UPDATE_OPS collections is always used
+                                in addition to this list
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.
+        """
+        trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                  data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_epochs=epochs, format_bgr=True)
+
+        trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained ResNet50 model from keras and convert it to a tensorflow session. Similarly, you can load any pretrained tensorflow model instead.

+

Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

+

By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training.

+
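For reference only, the following is a minimal sketch (a toy graph, not part of this notebook’s pipeline) of the standard TF1 pattern referred to above, in which the train op is made to depend on the UPDATE_OPS collection so that BatchNorm moving averages are updated at each step.

import tensorflow.compat.v1 as tf  # eager execution is assumed to be disabled, as done earlier

with tf.Graph().as_default():      # toy graph, kept separate from the notebook's session
    x = tf.placeholder(tf.float32, [None, 4])
    hidden = tf.layers.dense(x, 2)
    bn = tf.layers.batch_normalization(hidden, training=True)   # adds moving-average ops to UPDATE_OPS
    loss = tf.reduce_mean(tf.square(bn))

    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):                   # BN statistics update with every train step
        train_op = tf.train.GradientDescentOptimizer(1e-4).minimize(loss)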
+
[ ]:
+
+
+
from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+
+model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+update_ops_name = [op.name for op in model.updates] # Used for finetuning
+
+
+
+

The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+model = update_keras_bn_ops_trainable_flag(model, load_save_path="./", trainable=False)
+
+
+
+

AIMET features currently support tensorflow sessions. add_image_net_computational_nodes_in_graph adds an output layer, softmax and loss functions to the Resnet50 model graph.

+
+
[ ]:
+
+
+
from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+
+sess = tf.keras.backend.get_session()
+
+# Creates the computation graph of ResNet within the tensorflow session.
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
+
+

Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here.

+
+
[ ]:
+
+
+
starting_op_names = [model.input.name.split(":")[0]]
+output_op_names = [model.output.name.split(":")[0]]
+
+
+
+

We check whether TensorFlow can see a CUDA device. This example code will use CUDA if it is available in your current execution environment.

+
+
[ ]:
+
+
+
use_cuda = tf.test.is_gpu_available(cuda_only=True)
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sess)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating-point compute perspective, a BN-folded model is mathematically equivalent at inference time to a model with BN layers, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+
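To make the folding step concrete, here is a small sketch of the standard batch-norm folding arithmetic for a single output channel (generic math with made-up numbers; not tied to AIMET’s internal implementation). Note how the folded weight is the original weight scaled by gamma / sqrt(var + eps), which is exactly why folding can widen the weight range.

import numpy as np

rng = np.random.default_rng(2)
W, b = rng.normal(size=(8,)), 0.1                        # one output channel's weights and bias
x = rng.normal(size=(8,))
gamma, beta, mean, var, eps = 1.5, -0.2, 0.3, 0.8, 1e-5  # that channel's BN parameters/statistics

scale = gamma / np.sqrt(var + eps)
y_bn = scale * (W @ x + b - mean) + beta                  # conv output followed by batch norm
y_folded = (W * scale) @ x + ((b - mean) * scale + beta)  # equivalent folded conv
print(np.allclose(y_bn, y_folded))                        # True; note |W * scale| may exceed |W|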

The following code calls AIMET to fold the BN layers in-place on the given model

+
+
[ ]:
+
+
+
from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms
+
+BN_folded_sess, _ = fold_all_batch_norms(sess,
+                                         input_op_names=starting_op_names,
+                                         output_op_names=output_op_names)
+
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: The code below uses QuantScheme.training_range_learning_with_tf_enhanced_init. Other supported options include QuantScheme.post_training_tf and QuantScheme.post_training_tf_enhanced.
- default_output_bw: Setting this to 8 essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
- default_param_bw: Setting this to 8 essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.
- num_batches: The number of batches used to evaluate the model while calculating the quantization encodings. Only 5 batches are used here to speed up the process; the number of images in these 5 batches should be sufficient for computing encodings.
- rounding_mode: The rounding mode used for quantization. There are two possible choices here - ‘nearest’ or ‘stochastic’. We will use ‘nearest’.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+

The next cell sets up the quantization simulation model. The new session that contains all the changes to the graph is sim.session, and this is what is then evaluated on the dataset. Note that we use the same evaluate function as the one defined in our data pipeline.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_tensorflow.quantsim import QuantizationSimModel
+
+sim = QuantizationSimModel(session=BN_folded_sess,
+                           starting_op_names=starting_op_names,
+                           output_op_names=output_op_names,
+                           quant_scheme= QuantScheme.training_range_learning_with_tf_enhanced_init,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+
+
+
+
+
+

Compute Encodings

+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples

+

In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples; for computing encodings we only need 500 or 1000 samples. It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes be covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario where only ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results. The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
def pass_calibration_data(session: tf.Session, _):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),
+                           session.graph.get_tensor_by_name('labels:0')]
+
+    train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]
+    train_tensors_dict = dict.fromkeys(train_tensors, False)
+
+    eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]
+
+    samples = 500
+
+    batch_cntr = 0
+    for input_label in data_loader:
+        input_label_tensors_dict = dict(zip(input_label_tensors, input_label))
+
+        feed_dict = {**input_label_tensors_dict, **train_tensors_dict}
+
+        with session.graph.as_default():
+            _ = session.run(eval_outputs, feed_dict=feed_dict)
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=None)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sim.session)
+print(accuracy)
+
+
+
+
+
+
+

4.1 Cross-Layer Equalization

+

The next cell performs cross-layer equalization on the model. As noted before, the function folds batch norms, applies cross-layer scaling, and then folds high biases.

+

Note: Interestingly, CLE needs BN statistics for its procedure. If a BN-folded model is provided, CLE will run the CLS (cross-layer scaling) optimization step but will skip the HBA (high-bias absorption) step. To avoid this, we pass the original session (with the BN layers still intact) to CLE, rather than the BN-folded session.

+

Note: CLE equalizes the model in-place

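Before calling the AIMET API in the next cell, here is a toy sketch of the cross-layer scaling step on random weight matrices (illustrative only, following the equalization idea from the paper referenced earlier; AIMET performs this internally on the TensorFlow graph):

+
[ ]:
+
+
import numpy as np
+
+# Toy sketch of cross-layer scaling for two consecutive layers with a ReLU
+# in between (illustrative only; not AIMET's implementation).
+rng = np.random.default_rng(0)
+W1 = rng.normal(size=(8, 16))    # layer 1 weights: 8 output channels
+W2 = rng.normal(size=(4, 8))     # layer 2 weights: consume those 8 channels
+
+r1 = np.abs(W1).max(axis=1)      # per-output-channel range of layer 1
+r2 = np.abs(W2).max(axis=0)      # per-input-channel range of layer 2
+s = np.sqrt(r1 * r2) / r2        # per-channel scaling factors
+
+W1_eq = W1 / s[:, None]          # scale layer 1 channel i down by s_i (its bias would be scaled too)
+W2_eq = W2 * s[None, :]          # compensate in layer 2 (valid because ReLU is positively homogeneous)
+
+# After scaling, the per-channel weight ranges of the two layers match
+assert np.allclose(np.abs(W1_eq).max(axis=1), np.abs(W2_eq).max(axis=0))
+
+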
+
+
[ ]:
+
+
+
from aimet_tensorflow import cross_layer_equalization as aimet_cle
+
+cle_applied_sess = aimet_cle.equalize_model(sess,
+                                            start_op_names=starting_op_names,
+                                            output_op_names=output_op_names)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the equalized model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.

+
+
[ ]:
+
+
+
sim = QuantizationSimModel(session=cle_applied_sess,
+                           starting_op_names=starting_op_names,
+                           output_op_names=output_op_names,
+                           quant_scheme= QuantScheme.training_range_learning_with_tf_enhanced_init,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=None)
+
+accuracy = ImageNetDataPipeline.evaluate(sess=sim.session)
+print(accuracy)
+
+
+
+
+
+
+

4.2 Bias Correction

+

This section shows how we can apply AIMET Bias Correction on top of the already equalized model from the previous step. Bias correction under the hood uses a reference FP32 model and a QuantizationSimModel to perform its procedure. More details are explained in the AIMET User Guide documentation.

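For intuition, the following is a generic sketch of the empirical bias-correction idea (illustrative only, with stand-in arrays; not AIMET's implementation): compare the mean pre-activation output of a layer in the FP32 model with that of its quantized counterpart over some data, and fold the difference into the layer's bias.

+
[ ]:
+
+
import numpy as np
+
+# Stand-in arrays: per-channel pre-activation outputs of one layer, collected over data
+fp32_outputs = np.random.randn(1000, 64)
+quant_outputs = fp32_outputs + 0.03 + 0.01 * np.random.randn(1000, 64)   # quantization introduces a small shift
+
+bias = np.zeros(64)
+shift = quant_outputs.mean(axis=0) - fp32_outputs.mean(axis=0)   # expected error per output channel
+corrected_bias = bias - shift                                    # adjust the bias to cancel the shift
+
+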
+

For the correct_bias API, we pass the following parameters:

+
    +
  • num_quant_samples: Number of samples used for computing encodings. We are setting this to a low number here to speed up execution. A typical number would be 500-1000.

  • +
  • num_bias_correct_samples: Number of samples used for bias correction. We are setting this to a low number here to speed up execution. A typical number would be 1000-2000.

  • +
  • data_loader: BC uses unlabeled data samples from this data loader.

  • +
+
+
[ ]:
+
+
+
from aimet_tensorflow import bias_correction as aimet_bc
+
+# BC draws unlabeled samples from the validation data loader defined in the pipeline above
+data_loader = ImageNetDataPipeline.get_val_dataloader()
+
+quant_params = aimet_bc.QuantParams(quant_mode=QuantScheme.post_training_tf_enhanced, round_mode="nearest",
+                                    use_cuda=use_cuda, ops_to_ignore=[])
+bias_correction_params = aimet_bc.BiasCorrectionParams(batch_size=56,
+                                                       num_quant_samples=16,
+                                                       num_bias_correct_samples=16,
+                                                       input_op_names=starting_op_names,
+                                                       output_op_names=output_op_names)
+
+# Apply BC on top of the CLE-equalized session from the previous step
+after_bc_sess = aimet_bc.BiasCorrection.correct_bias(cle_applied_sess,
+                                                     bias_correct_params=bias_correction_params,
+                                                     quant_params=quant_params,
+                                                     data_set=data_loader.dataset)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the bias-corrected model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.

+
+
[ ]:
+
+
+
sim = QuantizationSimModel(session=after_bc_sess,
+                           starting_op_names=starting_op_names,
+                           output_op_names=output_op_names,
+                           quant_scheme= QuantScheme.training_range_learning_with_tf_enhanced_init,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=None)
+
+accuracy = ImageNetDataPipeline.evaluate(sess=sim.session)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after applying CLE and BC. Of course, this was just an example. Please try this with the model of your choice and play with the hyper-parameters to get the best results.

+

So we should have an improved model after applying CLE and BC. The next step would be to take this model to target. For this purpose, we need to export the model with the updated weights and without the fake quant ops. AIMET QuantizationSimModel provides an export API for this purpose. This API would save the model as #TODO

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+sim.export(path='./output/', filename_prefix='resnet50_after_qat_range_learning')
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET for performing Cross-Layer Equalization (CLE) and Bias Correction (BC).

+

A few additional resources: refer to the AIMET API docs for more details on the APIs and optional parameters, and refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques.

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/cle_bc.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/cle_bc.ipynb new file mode 100644 index 00000000..47ebf527 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/cle_bc.ipynb @@ -0,0 +1,670 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "38bf01b3", + "metadata": {}, + "source": [ + "# Cross-Layer Equalization (CLE) and Bias Correction (BC)\n", + "\n", + "This notebook showcases a working code example of how to use AIMET to apply Cross-Layer Equalization (CLE) and Bias Correction (BC). CLE and BC are post-training quantization techniques that aim to improve quantized accuracy of a given model. CLE does not need any data samples. BC may optionally need unlabelled data samples. These techniques help recover quantized accuracy when the model quantization is sensitive to parameter quantization as opposed to activation quantization.\n", + "\n", + "To learn more about this techniques, please refer to the \"Data-Free Quantization Through Weight Equalization and Bias Correction\" paper from ICCV 2019 - https://arxiv.org/abs/1906.04721\n", + "\n", + "**Cross-Layer Equalization**\n", + "AIMET performs the following steps when running CLE:\n", + "1. Batch Norm Folding: Folds BN layers into Conv layers immediate before or after the Conv layers.\n", + "2. Cross-Layer Scaling: Given a set of consecutive Conv layers, equalizes the range of tensor values per-channel by scaling up/down per-channel weight tensor values of a layer and corresponding scaling down/up per-channel weight tensor values of the subsequent layer.\n", + "3. High Bias Folding: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer's parameters.\n", + "\n", + "**Bias Correction** \n", + "Quantization sometimes leads to a shift in layer outputs. This techniques helps correct this shift by adjusting the bias parameters of that layer. Note that this technique is generally applied after CLE, but it is a optional step.\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Apply CLE, BC and and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "id": "71116f26", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. 
https://image-net.org/challenges/LSVRC/2012/index.php#) and convert them into tfrecords.\n", + "\n", + "**Note1**: The ImageNet tfrecords dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- A folder containing tfrecords files starting with **'train\\*'** for training files and **'valid\\*'** for validation files. Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class and then convert it into tfrecords. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "14eaf1d4", + "metadata": {}, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/tfrecords/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "id": "e08683fa", + "metadata": {}, + "source": [ + "We disable logs at the INFO level and disable eager execution. We set verbosity to the level as displayed (ERORR), so TensorFlow will display all messages that have the label ERROR (or more critical)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7b45d3c7", + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow.compat.v1 as tf\n", + "tf.disable_eager_execution()\n", + "tf.logging.set_verbosity(tf.logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "id": "7b964ae2", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. 
You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1ff778eb", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import List\n", + "\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + " \n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + " \n", + " @staticmethod\n", + " def evaluate(sess: tf.Session) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess)\n", + "\n", + " \n", + " @staticmethod\n", + " def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):\n", + " \"\"\"\n", + " Given a TF session, finetunes it to improve its accuracy\n", + " :param sess: The sess graph to fine-tune.\n", + " :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).\n", + " tf.GraphKeys.UPDATE_OPS collections is always used\n", + " in addition to this list\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_epochs=epochs, format_bgr=True)\n", + "\n", + " trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)\n" + ] + }, + { + "cell_type": "markdown", + "id": "1e7b31bf", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "id": "1d5e6074", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to a tensorflow session. 
Similarly, you can load any pretrained tensorflow model instead.\n", + "\n", + "\n", + "Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.\n", + "\n", + "\n", + "By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "de92a9d0", + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.compat.v1.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "\n", + "model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))\n", + "update_ops_name = [op.name for op in model.updates] # Used for finetuning" + ] + }, + { + "cell_type": "markdown", + "id": "8c55a5be", + "metadata": {}, + "source": [ + "The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "de9c180c", + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag\n", + "\n", + "model = update_keras_bn_ops_trainable_flag(model, load_save_path=\"./\", trainable=False)" + ] + }, + { + "cell_type": "markdown", + "id": "c97b5289", + "metadata": {}, + "source": [ + "AIMET features currently support tensorflow sessions. **add_image_net_computational_nodes_in_graph** adds an output layer, softmax and loss functions to the Resnet50 model graph." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a55dee2c", + "metadata": {}, + "outputs": [], + "source": [ + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "\n", + "sess = tf.keras.backend.get_session()\n", + "\n", + "# Creates the computation graph of ResNet within the tensorflow session.\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "id": "bc385ad9", + "metadata": {}, + "source": [ + "Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "91368424", + "metadata": {}, + "outputs": [], + "source": [ + "starting_op_names = [model.input.name.split(\":\")[0]]\n", + "output_op_names = [model.output.name.split(\":\")[0]]" + ] + }, + { + "cell_type": "markdown", + "id": "712344c8", + "metadata": {}, + "source": [ + "We are checking if TensorFlow is using CPU or CUDA device. This example code will use CUDA if available in your current execution environment." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "12e8ae6c", + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = tf.test.is_gpu_available(cuda_only=True):" + ] + }, + { + "cell_type": "markdown", + "id": "050b6187", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b1b7c11b", + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sess=sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "id": "d4962c07", + "metadata": {}, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers in-place on the given model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3879d9fb", + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "BN_folded_sess, _ = fold_all_batch_norms(sess,\n", + " input_op_names=starting_op_names,\n", + " output_op_names=output_op_names)" + ] + }, + { + "cell_type": "markdown", + "id": "7adfba8d", + "metadata": {}, + "source": [ + "---\n", + "## Create Quantization Sim Model\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "- **num_batches**: The number of batches used to evaluate the model while calculating the quantization encodings.Number of batches to use for computing encodings. Only 5 batches are used here to speed up the process. 
In addition, the number of images in these 5 batches should be sufficient for compute encodings\n", + "- **rounding_mode**: The rounding mode used for quantization. There are two possible choices here - 'nearest' or 'stochastic' We will use \"nearest.\"\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "markdown", + "id": "68ad3150", + "metadata": {}, + "source": [ + "The next cell sets up the quantizer, and quantizes the model. The new session that contains all the changes to the graph is quantizer.session, and this is then evaluated on the dataset. Note that the quantizer uses the same evaluate function as the one defined in our data pipeline when computing the new weights." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7a8a06c5", + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_tensorflow.quantsim import QuantizationSimModel\n", + " \n", + "sim = QuantizationSimModel(session=BN_folded_sess,\n", + " starting_op_names=starting_op_names,\n", + " output_op_names=output_op_names,\n", + " quant_scheme= QuantScheme.training_range_learning_with_tf_enhanced_init,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)\n" + ] + }, + { + "cell_type": "markdown", + "id": "b692e06d", + "metadata": {}, + "source": [ + "---\n", + "## Compute Encodings\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "\n", + "In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fe4b6ebb", + "metadata": {}, + "outputs": [], + "source": [ + "def pass_calibration_data(session: tf.Session, _):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),\n", + " session.graph.get_tensor_by_name('labels:0')]\n", + " \n", + " train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]\n", + " train_tensors_dict = dict.fromkeys(train_tensors, False)\n", + " \n", + " eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]\n", + "\n", + " samples = 500\n", + "\n", + " batch_cntr = 0\n", + " for input_label in data_loader:\n", + " input_label_tensors_dict = dict(zip(input_label_tensors, input_label))\n", + "\n", + " feed_dict = {**input_label_tensors_dict, **train_tensors_dict}\n", + "\n", + " with session.graph.as_default():\n", + " _ = session.run(eval_outputs, feed_dict=feed_dict)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break\n", + " " + ] + }, + { + "cell_type": "markdown", + "id": "c58b6107", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6ede0ff0", + "metadata": {}, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=None)" + ] + }, + { + "cell_type": "markdown", + "id": "5323ed7a", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cef1c8a2", + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "id": "9cfaa11b", + "metadata": {}, + "source": [ + "---\n", + "## 4. 1 Cross Layer Equalization\n", + "\n", + "The next cell performs cross-layer equalization on the model. As noted before, the function folds batch norms, applies cross-layer scaling, and then folds high biases.\n", + "\n", + "**Note:** Interestingly, CLE needs BN statistics for its procedure. If a BN folded model is provided, CLE will run the CLS (cross-layer scaling) optimization step but will skip the HBA (high-bias absorption) step. To avoid this, we simply load the original model again before running CLE.\n", + "\n", + "**Note:** CLE equalizes the model in-place" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cd2ef6e8", + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow import cross_layer_equalization as aimet_cle\n", + "\n", + "cle_applied_sess = aimet_cle.equalize_model(sess,\n", + " start_op_names=start_op_names,\n", + " output_op_names=output_op_names)" + ] + }, + { + "cell_type": "markdown", + "id": "44e9d19e", + "metadata": {}, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the equalized model. 
We again create a simulation model like before and evaluate to determine simulated quantized accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7fa0d708", + "metadata": {}, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(session=cle_applied_sess,\n", + " starting_op_names=starting_op_names,\n", + " output_op_names=output_op_names,\n", + " quant_scheme= QuantScheme.training_range_learning_with_tf_enhanced_init,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=None)\n", + "\n", + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "id": "c6f8149d", + "metadata": {}, + "source": [ + "---\n", + "## 4. 2 Bias Correction\n", + "\n", + "This section shows how we can apply AIMET Bias Correction on top of the already equalized model from the previous step. Bias correction under the hood uses a reference FP32 model and a QuantizationSimModel to perform its procedure. More details are explained in the AIMET User Guide documentation.\n", + "\n", + "For the correct_bias API, we pass the following parameters\n", + "\n", + "- **num_quant_samples**: Number of samples used for computing encodings. We are setting this to a low number here to speed up execution. A typical number would be 500-1000.\n", + "- **num_bias_correct_samples**: Number of samples used for bias correction. We are setting this to a low number here to speed up execution. A typical number would be 1000-2000.\n", + "- **data_loader**: BC uses unlabeled data samples from this data loader." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b5b41ae3", + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow import bias_correction as aimet_bc\n", + "\n", + "quant_params = aimet_bc.QuantParams(quant_mode= QuantScheme.post_training_tf_enhanced, round_mode=\"nearest\",\n", + " use_cuda=use_cuda, ops_to_ignore=[])\n", + "bias_correction_params = aimet_bc.BiasCorrectionParams(batch_size=56,\n", + " num_quant_samples=16,\n", + " num_bias_correct_samples=16,\n", + " input_op_names=start_op_names,\n", + " output_op_names=output_op_names)\n", + "\n", + "after_bc_sess = aimet_bc.BiasCorrection.correct_bias(sess, bias_correct_params=bias_correction_params,\n", + " quant_params=quant_params,\n", + " data_set=data_loader.dataset)" + ] + }, + { + "cell_type": "markdown", + "id": "a90309ad", + "metadata": {}, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the bias-corrected model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "442a2402", + "metadata": {}, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(session=BN_folded_sess,\n", + " starting_op_names=['input_1'],\n", + " output_op_names=[model.output.name.split(\":\")[0]],\n", + " quant_scheme= QuantScheme.training_range_learning_with_tf_enhanced_init,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=None)\n", + "\n", + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "id": "f048b779", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we should have an improved model after QAT. Now the next step would be to actually take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. AIMET QuantizationSimModel provides an export API for this purpose. This API would save the model as #TODO" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "602e4e56", + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "sim.export(path='./output/', filename_prefix='resnet50_after_qat_range_learning')" + ] + }, + { + "cell_type": "markdown", + "id": "10b9cab9", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing Cross Layer Equalization (CLE) and Bias Correction (BC).\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/adaround.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/adaround.html new file mode 100644 index 00000000..c1acf44b --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/adaround.html @@ -0,0 +1,1408 @@ + + + + + + Adaptive Rounding (Adaround) — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Adaptive Rounding (Adaround)

+

This notebook illustrates the use of the AIMET AdaRound feature.

+

AIMET quantization features, by default, use the “nearest rounding” technique for achieving quantization. When using the “nearest rounding” technique, the weight value is quantized to the nearest integer value. The Adaptive Rounding (AdaRound) feature uses a smaller subset of the unlabelled training data to adaptively round the weights. AdaRound optimizes a loss function using the unlabelled training data to adaptively decide whether to quantize a specific weight to the integer value near it or away from it. Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.

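As a toy numeric illustration of the difference (not AIMET's implementation): with a quantization scale of 0.1, nearest rounding always maps a weight of 0.26 to the grid point 0.3, whereas AdaRound learns, per weight, whether the lower grid point (0.2) or the upper one (0.3) gives a smaller layer-output error.

+
[ ]:
+
+
import numpy as np
+
+# Toy sketch (not AIMET internals): nearest rounding always picks the closest
+# grid point; AdaRound learns a per-weight choice between floor and ceil.
+scale = 0.1
+w = np.array([0.26, -0.14, 0.43])
+
+w_int_nearest = np.round(w / scale)       # nearest rounding
+w_int_floor = np.floor(w / scale)         # lower rounding candidate
+
+# AdaRound effectively optimizes a per-weight binary choice h in {0, 1}
+# between the two candidates (values below are arbitrary, for illustration only)
+h = np.array([0, 1, 1])
+w_int_adaround = w_int_floor + h
+
+print(w_int_nearest * scale)              # de-quantized weights with nearest rounding
+print(w_int_adaround * scale)             # de-quantized weights with the learned choice
+
+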
+
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the FP32 model and evaluate it to find the baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score 4. Apply AdaRound and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = "/path/to/dataset/dir/"          # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a Keras model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import tensorflow as tf
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset
+from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator
+
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataset() -> tf.data.Dataset:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        :return: A tensorflow dataset
+        """
+        data_loader = ImageNetDataset(DATASET_DIR,
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'])
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(model, iterations=None) -> float:
+        """
+        Given a Keras model, evaluates its Top-1 accuracy on the validation dataset
+        :param model: The Keras model to be evaluated.
+        :param iterations: The number of iterations to run. If None, all the data will be used
+        :return: The accuracy for the sample with the maximum accuracy.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR,
+                                      image_size=image_net_config.dataset["image_size"],
+                                      batch_size=image_net_config.evaluation["batch_size"])
+
+        return evaluator.evaluate(model=model, iterations=iterations)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Alternatively, you can load any other pretrained Keras model instead.

+
+
[ ]:
+
+
+
from tensorflow.keras.applications.resnet50 import ResNet50
+
+model = ResNet50(include_top=True,
+                 weights="imagenet",
+                 input_tensor=None,
+                 input_shape=None,
+                 pooling=None,
+                 classes=1000)
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
ImageNetDataPipeline.evaluate(model=model, iterations=10)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating-point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers, and this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

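For intuition, the standard batch-norm folding identity can be sketched as follows (a generic sketch with random stand-in parameters, not AIMET's code); the per-channel factor gamma / sqrt(var + eps) applied to the conv weights is what can stretch their value range:

+
[ ]:
+
+
import numpy as np
+
+# Generic BN-folding sketch (illustrative only): a Conv followed by BN is
+# replaced by a single Conv with re-scaled weights and an adjusted bias.
+out_channels = 4
+W = np.random.randn(out_channels, 3, 3, 3)     # conv weights, one filter per output channel
+b = np.zeros(out_channels)
+
+gamma = np.random.rand(out_channels) + 0.5     # BN scale
+beta = np.random.randn(out_channels)           # BN shift
+mean = np.random.randn(out_channels)           # BN running mean
+var = np.random.rand(out_channels) + 0.1       # BN running variance
+eps = 1e-5
+
+factor = gamma / np.sqrt(var + eps)
+W_folded = W * factor[:, None, None, None]     # per-output-channel weight re-scaling
+b_folded = (b - mean) * factor + beta          # folded bias
+
+# The per-channel re-scaling is what can widen the weight value range
+print(np.abs(W).max(), np.abs(W_folded).max())
+
+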
+

The following code calls AIMET to fold the BN layers of a given model. NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms
+from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+from aimet_common.defs import QuantScheme
+
+_, model = fold_all_batch_norms(model)
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+
+
+
+

Although AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics, which then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can just ignore the model output for this purpose. A few pointers regarding the data samples - In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples, yet for computing encodings we only need 500 or 1000 samples. - It may be beneficial if the samples used for computing encodings are well distributed. It is not necessary to cover all classes, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario in which only ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just one example.

+
+
[ ]:
+
+
+
from tensorflow.keras.utils import Progbar
+from tensorflow.keras.applications.resnet import preprocess_input
+
+def pass_calibration_data(sim_model, samples):
+    tf_dataset = ImageNetDataPipeline.get_val_dataset()
+    dataset = tf_dataset.dataset
+    batch_size = tf_dataset.batch_size
+
+    progbar = Progbar(samples)
+
+    batch_cntr = 0
+    for inputs, _ in dataset:
+        sim_model(preprocess_input(inputs))
+
+        batch_cntr += 1
+        progbar_stat_update = \
+            batch_cntr * batch_size if (batch_cntr * batch_size) < samples else samples
+        progbar.update(progbar_stat_update)
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.evaluate(model=sim.model, iterations=10)
+
+
+
+
+
+
+

4. Apply Adaround

+

We can now apply AdaRound to this model.

+

Some of the parameters for AdaRound are described below

+
    +
  • data_set: AdaRound needs a dataset to use data samples for the layer-by-layer optimization to learn the rounding vectors. Either a training or validation dataloader could be passed in.

  • +
  • num_batches: The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples. So with a batch size of 32, this may translate to 64 batches. To speed up the execution here, we are using only 1 batch.

  • +
  • default_num_iterations: The number of iterations used to AdaRound each layer. The default value is 10000 and we strongly recommend not reducing this number. But in this example we are using 32 to speed up the execution runtime.

  • +
+
+
[ ]:
+
+
+
import os
+from tensorflow.keras.applications.resnet import preprocess_input
+from tensorflow.keras.preprocessing import image_dataset_from_directory
+from aimet_tensorflow.keras.adaround_weight import Adaround, AdaroundParameters
+
+ada_round_data = image_dataset_from_directory(directory=DATASET_DIR,
+                                              labels="inferred",
+                                              label_mode="categorical",
+                                              batch_size=image_net_config.evaluation["batch_size"],
+                                              shuffle=False,
+                                              image_size=(image_net_config.dataset["image_width"],
+                                                          image_net_config.dataset["image_height"]))
+ada_round_data = ada_round_data.map(lambda x, y: preprocess_input(x))
+
+params = AdaroundParameters(data_set=ada_round_data, num_batches=1, default_num_iterations=32)
+
+os.makedirs("./output/", exist_ok=True)
+ada_model = Adaround.apply_adaround(model, params, path="output", filename_prefix="adaround",
+                                    default_param_bw=8, default_quant_scheme=QuantScheme.post_training_tf)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the model after applying Adaround. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.

+

Note: There are two important things to understand in the following cell. - Parameter Bitwidth Precision: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() call.

+
    +
  • Freezing the parameter encodings: After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called before calling the compute_encodings() API. While applying AdaRound, the parameter values were rounded up or down based on the initial encodings created internally. For Quantization Simulation accuracy, it is important to freeze these encodings. If the parameter encodings are NOT frozen, the call to compute_encodings() will alter the value of the parameter encodings and the Quantization Simulation accuracy will not be correct.

  • +
+
+
[ ]:
+
+
+
sim = QuantizationSimModel(model=ada_model,
+                           quant_scheme=QuantScheme.post_training_tf,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+sim.set_and_freeze_param_encodings(encoding_path=os.path.join("output", "adaround.encodings"))
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.evaluate(model=sim.model, iterations=10)
+
+
+
+
+

Depending on your settings you may have observed a slight gain in accuracy after applying adaround. Of course, this was just an example. Please try this against the model of your choice and play with the number of samples to get the best results.

+

Now the next step would be to take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. And also to export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
sim.export(path="./output", filename_prefix="resnet50_after_adaround")
+
+
+
+
+
+
+

Summary

+

This example illustrated how the AIMET AdaRound API is invoked to achieve post-training quantization. To use AIMET AdaRound for your specific needs, replace the model with your model and replace the data pipeline with your data pipeline. This will provide you with a quick starting point. As indicated above, some parameters in this example have been chosen in such a way as to make this example execute faster.

+

We hope this notebook was useful for understanding how to use AIMET for performing AdaRound.

+

A few additional resources: refer to the AIMET API docs for more details on the APIs and optional parameters, and refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques.

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/adaround.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/adaround.ipynb new file mode 100644 index 00000000..d8ba566a --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/adaround.ipynb @@ -0,0 +1,481 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "# Adaptive Rounding (Adaround)\n", + "This notebook illustrates the use of AIMET Adaround feature.\n", + "\n", + "AIMET quantization features, by default, use the \"nearest rounding\" technique for achieving quantization. When using the \"nearest rounding\" technique, the weight value is quantized to the nearest integer value. The Adaptive Rounding (AdaRound) feature, uses a smaller subset of the unlabelled training data to adaptively round the weights. AdaRound, optimizes a loss function using the unlabelled training data to adaptively decide whether to quantize a specific weight to the integer value near it or away from it. Using the AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Apply Adaround and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "DATASET_DIR = \"/path/to/dataset/dir/\" # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## 1. 
Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a Keras model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset\n", + "from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator\n", + "\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + "\n", + " @staticmethod\n", + " def get_val_dataset() -> tf.data.Dataset:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " :return: A tensorflow dataset\n", + " \"\"\"\n", + " data_loader = ImageNetDataset(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'])\n", + "\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model, iterations=None) -> float:\n", + " \"\"\"\n", + " Given a Keras model, evaluates its Top-1 accuracy on the validation dataset\n", + " :param model: The Keras model to be evaluated.\n", + " :param iterations: The number of iterations to run. If None, all the data will be used\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR,\n", + " image_size=image_net_config.dataset[\"image_size\"],\n", + " batch_size=image_net_config.evaluation[\"batch_size\"])\n", + "\n", + " return evaluator.evaluate(model=model, iterations=iterations)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "For this example notebook, we are going to load a pretrained resnet50 model from Keras. 
Similarly, you can load any pretrained Keras model instead.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from tensorflow.keras.applications.resnet50 import ResNet50\n", + "\n", + "model = ResNet50(include_top=True,\n", + " weights=\"imagenet\",\n", + " input_tensor=None,\n", + " input_shape=None,\n", + " pooling=None,\n", + " classes=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "ImageNetDataPipeline.evaluate(model=model, iterations=10)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers of a given model.
\n", + "**NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms\n", + "from aimet_tensorflow.keras.quantsim import QuantizationSimModel\n", + "from aimet_common.defs import QuantScheme\n", + "\n", + "_, model = fold_all_batch_norms(model)\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.keras.utils import Progbar\n", + "from tensorflow.keras.applications.resnet import preprocess_input\n", + "\n", + "def pass_calibration_data(sim_model, samples):\n", + " tf_dataset = ImageNetDataPipeline.get_val_dataset()\n", + " dataset = tf_dataset.dataset\n", + " batch_size = tf_dataset.batch_size\n", + "\n", + " progbar = Progbar(samples)\n", + "\n", + " batch_cntr = 0\n", + " for inputs, _ in dataset:\n", + " sim_model(preprocess_input(inputs))\n", + "\n", + " batch_cntr += 1\n", + " progbar_stat_update = \\\n", + " batch_cntr * batch_size if (batch_cntr * batch_size) < samples else samples\n", + " progbar.update(progbar_stat_update)\n", + " if (batch_cntr * batch_size) > samples:\n", + " break\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "ImageNetDataPipeline.evaluate(model=sim.model, iterations=10)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## 4. Apply Adaround\n", + "\n", + "We can now apply AdaRound to this model.\n", + "\n", + "Some of the parameters for AdaRound are described below\n", + "\n", + "- **data_set:** AdaRound needs a dataset to use data samples for the layer-by-layer optimization to learn the rounding vectors. Either a training or validation dataloader could be passed in.\n", + "- **num_batches:** The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples. So with a batch size of 32, this may translate to 64 batches. To speed up the execution here we are using a batch size of 1.\n", + "- **default_num_iterations:** The number of iterations to adaround each layer. Default value is set to 10000 and we strongly recommend to not reduce this number. But in this example we are using 32 to speed up the execution runtime." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import os\n", + "from tensorflow.keras.applications.resnet import preprocess_input\n", + "from tensorflow.keras.preprocessing import image_dataset_from_directory\n", + "from aimet_tensorflow.keras.adaround_weight import Adaround, AdaroundParameters\n", + "\n", + "ada_round_data = image_dataset_from_directory(directory=DATASET_DIR,\n", + " labels=\"inferred\",\n", + " label_mode=\"categorical\",\n", + " batch_size=image_net_config.evaluation[\"batch_size\"],\n", + " shuffle=False,\n", + " image_size=(image_net_config.dataset[\"image_width\"],\n", + " image_net_config.dataset[\"image_height\"]))\n", + "ada_round_data = ada_round_data.map(lambda x, y: preprocess_input(x))\n", + "\n", + "params = AdaroundParameters(data_set=ada_round_data, num_batches=1, default_num_iterations=32)\n", + "\n", + "os.makedirs(\"./output/\", exist_ok=True)\n", + "ada_model = Adaround.apply_adaround(model, params, path=\"output\", filename_prefix=\"adaround\",\n", + " default_param_bw=8, default_quant_scheme=QuantScheme.post_training_tf)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the model after applying Adaround. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.\n", + "\n", + "**Note:** There are two important things to understand in the following cell.\n", + " - **Parameter Biwidth Precision**: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() created.\n", + "\n", + " - **Freezing the parameter encodings**:\n", + "After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called\n", + "before calling the compute_encodings() API. While applying AdaRound, the parameter values have\n", + "been rounded up or down based on these initial encodings internally created. Fo\n", + "r Quantization Simulation accuracy, it is important to freeze these encodings.\n", + "If the parameters encodings are NOT frozen, the call to compute_encodings() will alter\n", + "the value of the parameters encoding and Quantization Simulation accuracy will not be correct." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(model=ada_model,\n", + " quant_scheme=QuantScheme.post_training_tf,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8)\n", + "\n", + "sim.set_and_freeze_param_encodings(encoding_path=os.path.join(\"output\", \"adaround.encodings\"))\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "ImageNetDataPipeline.evaluate(model=sim.model, iterations=10)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after applying adaround. Of course, this was just an example. Please try this against the model of your choice and play with the number of samples to get the best results.\n", + "\n", + "Now the next step would be to take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. And also to export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "sim.export(path=\"./output\", filename_prefix=\"resnet50_after_adaround\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Summary\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "This example illustrated how AIMET AdaRound API is invoked to achieve post training quantization. To use AIMET AdaRound for your specific needs, replace the model with your model and\n", + "replace the data pipeline with your data pipeline. This will provide you a quick starting point. As indicated above, some parameters in this example have been chosen in such a way way to make this example execute faster.\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing Adaround.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3.8.0 64-bit", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "3.8.0" + }, + "vscode": { + "interpreter": { + "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" + } + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/autoquant.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/autoquant.html new file mode 100644 index 00000000..1e8e14ab --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/autoquant.html @@ -0,0 +1,1358 @@ + + + + + + AutoQuant — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AutoQuant

+

This notebook shows a working code example of how to use the AIMET AutoQuant feature.

+

AIMET offers a suite of neural network post-training quantization techniques. Often, applying these techniques in a specific sequence results in better accuracy and performance. Without the AutoQuant feature, the AIMET user needs to manually try out various combinations of AIMET quantization features. This manual process is error-prone and often time-consuming.

+

The AutoQuant feature analyzes the model, determines the sequence of AIMET quantization techniques, and applies these techniques. In addition, the user can specify in the AutoQuant API the amount of accuracy drop that can be tolerated. As soon as this accuracy threshold is reached, AutoQuant stops applying any additional quantization technique. In summary, the AutoQuant feature saves time and automates the quantization of neural networks.

+
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load a pretrained FP32 model 3. Determine the baseline FP32 accuracy 4. Define constants and helper functions 5. Apply AutoQuant

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art AutoQuant results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+
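For readers who want to create such a reduced subset, the following is a minimal sketch (not part of the original notebook); the source and destination paths and the per-class sample count are placeholders, and the sketch assumes the usual ImageNet layout of one subdirectory per class.

[ ]:

import os
import shutil

SRC_DIR = '/path/to/full/imagenet/val'      # full dataset: one subdirectory per class
DST_DIR = '/path/to/reduced/imagenet/val'   # reduced copy to point DATASET_DIR at
SAMPLES_PER_CLASS = 2

for class_name in os.listdir(SRC_DIR):
    src_class = os.path.join(SRC_DIR, class_name)
    dst_class = os.path.join(DST_DIR, class_name)
    os.makedirs(dst_class, exist_ok=True)
    # Copy only the first few images of each class into the reduced copy
    for fname in sorted(os.listdir(src_class))[:SAMPLES_PER_CLASS]:
        shutil.copy(os.path.join(src_class, fname), os.path.join(dst_class, fname))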

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dir/'       # Please replace this with a real directory
+
+
+
+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow as tf
+from aimet_tensorflow.keras.auto_quant import AutoQuant
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
from typing import Optional
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset
+from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator
+
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataset(batch_size: Optional[int] = None) -> tf.data.Dataset:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        :return: A tensorflow dataset
+        """
+        if batch_size is None:
+            batch_size = image_net_config.evaluation['batch_size']
+
+        data_loader = ImageNetDataset(DATASET_DIR,
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=batch_size)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(model, iterations=None) -> float:
+        """
+        Given a Keras model, evaluates its Top-1 accuracy on the validation dataset
+        :param model: The Keras model to be evaluated.
+        :param iterations: The number of iterations to run. If None, all the data will be used
+        :return: The accuracy for the sample with the maximum accuracy.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR,
+                                      image_size=image_net_config.dataset["image_size"],
+                                      batch_size=image_net_config.evaluation["batch_size"])
+
+        return evaluator.evaluate(model=model, iterations=iterations)
+
+
+
+
+
+

2. Load a pretrained FP32 model

+

For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead.

+
+
[ ]:
+
+
+
from tensorflow.keras.applications.resnet import ResNet50
+
+model = ResNet50(weights='imagenet')
+
+
+
+
+
+

3. Determine the baseline FP32 accuracy

+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
ImageNetDataPipeline.evaluate(model=model)
+
+
+
+
+
+

4. Define Constants and Helper functions

+

In this section the constants and helper functions needed to run this example are defined.

+
    +
  • EVAL_DATASET_SIZE A typical value is 5000. To execute this example faster, this value has been set to 50

  • +
  • CALIBRATION_DATASET_SIZE A typical value is 2000. To execute this example faster, this value has been set to 20

  • +
  • BATCH_SIZE The batch size set by the user. As an example, it is set to 10

  • +
+

The cells below use these constants to build the evaluation dataset and an unlabeled dataset (images only) directly from the validation data pipeline.

+
+
[ ]:
+
+
+
EVAL_DATASET_SIZE = 50
+CALIBRATION_DATASET_SIZE = 20
+BATCH_SIZE = 10
+
+
+
+
+
[ ]:
+
+
+
eval_dataset = ImageNetDataPipeline.get_val_dataset(BATCH_SIZE).dataset
+unlabeled_dataset = eval_dataset.map(lambda images, labels: images)
+
+
+
+
+
+

Prepare the evaluation callback function

+

The eval_callback() function takes as arguments the model object to evaluate (which it compiles internally) and the number of samples to use. If the num_samples argument is None, the whole evaluation dataset is used to evaluate the model.

+
+
[ ]:
+
+
+
from typing import Optional
+
+
+def eval_callback(model: tf.keras.Model,
+                  num_samples: Optional[int] = None) -> float:
+    if num_samples is None:
+        num_samples = EVAL_DATASET_SIZE
+
+    sampled_dataset = eval_dataset.take(num_samples)
+
+    # Model should be compiled before evaluation
+    model.compile(optimizer=tf.keras.optimizers.Adam(),
+                  loss=tf.keras.losses.CategoricalCrossentropy(),
+                  metrics=tf.keras.metrics.CategoricalAccuracy())
+    _, acc = model.evaluate(sampled_dataset)
+
+    return acc
+
+
+
+
+
+

5. Apply AutoQuant

+

As a first step, the AutoQuant object is created.

+

The allowed_accuracy_drop parameter is set by the user to convey to the AutoQuant feature how much accuracy drop can be tolerated. AutoQuant applies a series of quantization features. When the allowed accuracy is reached, AutoQuant stops applying any subsequent quantization feature. Please refer to the AutoQuant User Guide and API documentation for complete details.

+
+
[ ]:
+
+
+
auto_quant = AutoQuant(allowed_accuracy_drop=0.01,
+                       unlabeled_dataset=unlabeled_dataset,
+                       eval_callback=eval_callback)
+
+
+
+
+
+

Optionally set AdaRound Parameters

+

The AutoQuant feature internally uses default parameters to execute the AdaRound step. Only if necessary should the default AdaRound parameters be modified, using the API shown below.

+

Note: To execute this example faster, the default value of the num_iterations parameter has been reduced from 10000 to 2000

+
+
[ ]:
+
+
+
from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters
+
+ADAROUND_DATASET_SIZE = 2000
+adaround_dataset = unlabeled_dataset.take(ADAROUND_DATASET_SIZE)
+adaround_params = AdaroundParameters(adaround_dataset,
+                                     num_batches=ADAROUND_DATASET_SIZE // BATCH_SIZE)
+auto_quant.set_adaround_params(adaround_params)
+
+
+
+
+
+

Run AutoQuant

+

This step applies the AutoQuant feature. The best possible quantized model, the associated eval_score and the path to the AdaRound encoding files are returned.

+
+
[ ]:
+
+
+
model, accuracy, encoding_path = auto_quant.apply(model)
+
+
+
+
+
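As a quick sanity check (not part of the original notebook), the values returned by apply() above can be printed before moving on; accuracy and encoding_path come from that call.

[ ]:

# Inspect the AutoQuant results returned by apply() above
print(f"Quantized model eval score: {accuracy:.4f}")
print(f"AdaRound encodings written to: {encoding_path}")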
+
+

Summary

+

We hope this notebook was useful for understanding how to use the AIMET AutoQuant feature.

+

A few additional resources: - Refer to the AIMET API docs for more details on the APIs and parameters - Refer to the other example notebooks to understand how to use AIMET CLE and AdaRound features in a standalone fashion.

+
+
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/autoquant.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/autoquant.ipynb new file mode 100644 index 00000000..55981b8d --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/autoquant.ipynb @@ -0,0 +1,380 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "# AutoQuant\n", + "\n", + "This notebook shows a working code example of how to use AIMET AutoQuant feature.\n", + "\n", + "AIMET offers a suite of neural network post-training quantization techniques. Often, applying these techniques in a specific sequence, results in better accuracy and performance. Without the AutoQuant feature, the AIMET user needs to manually try out various combinations of AIMET quantization features. This manual process is error-prone and often time-consuming.\n", + "\n", + "The AutoQuant feature, analyzes the model, determines the sequence of AIMET quantization techniques and applies these techniques. In addition, the user can specify the amount of accuracy drop that can be tolerated, in the AutoQuant API. As soon as this threshold accuracy is reached, AutoQuant stops applying any additional quantization technique. In summary, the AutoQuant feature saves time and automates the quantization of the neural networks.\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load a pretrained FP32 model\n", + "3. Determine the baseline FP32 accuracy\n", + "4. Define constants and helper functions\n", + "5. Apply AutoQuant\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art AutoQuant results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. [https://image-net.org/challenges/LSVRC/2012/index.php#](https://image-net.org/challenges/LSVRC/2012/index.php#))\n", + "\n", + "**Note**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow as tf\n", + "from aimet_tensorflow.keras.auto_quant import AutoQuant" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from typing import Optional\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset\n", + "from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator\n", + "\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + "\n", + " @staticmethod\n", + " def get_val_dataset(batch_size: Optional[int] = None) -> tf.data.Dataset:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " :return: A tensorflow dataset\n", + " \"\"\"\n", + " if batch_size is None:\n", + " batch_size = image_net_config.evaluation['batch_size']\n", + "\n", + " data_loader = ImageNetDataset(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=batch_size)\n", + "\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model, iterations=None) -> float:\n", + " \"\"\"\n", + " Given a Keras model, evaluates its Top-1 accuracy on the validation dataset\n", + " :param model: The Keras model to be evaluated.\n", + " :param iterations: The number of iterations to run. If None, all the data will be used\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR,\n", + " image_size=image_net_config.dataset[\"image_size\"],\n", + " batch_size=image_net_config.evaluation[\"batch_size\"])\n", + "\n", + " return evaluator.evaluate(model=model, iterations=iterations)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## 2. Load a pretrained FP32 model\n", + "\n", + "For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from tensorflow.keras.applications.resnet import ResNet50\n", + "\n", + "model = ResNet50(weights='imagenet')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## 3. Determine the baseline FP32 accuracy\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "ImageNetDataPipeline.evaluate(model=model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## 4. Define Constants and Helper functions\n", + "\n", + "In this section the constants and helper functions needed to run this example are defined.\n", + "\n", + "- **EVAL_DATASET_SIZE** A typical value is 5000. To execute this example faster this value has been set to 50\n", + "- **CALIBRATION_DATASET_SIZE** A typical value is 2000. To execute this example faster this value has been set to 20\n", + "- **BATCH_SIZE** User sets the batch size. As an example, set to 10\n", + "\n", + "\n", + "The helper function **_create_sampled_data_loader()** returns a DataLoader based on the dataset and the number of samples provided." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "EVAL_DATASET_SIZE = 50\n", + "CALIBRATION_DATASET_SIZE = 20\n", + "BATCH_SIZE = 10" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "eval_dataset = ImageNetDataPipeline.get_val_dataset(BATCH_SIZE).dataset\n", + "unlabeled_dataset = eval_dataset.map(lambda images, labels: images)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## Prepare the evaluation callback function\n", + "\n", + "The **eval_callback()** function takes the model object to evaluate and compile option dictionary and the number of samples to use as arguments. If the **num_samples** argument is None, the whole evaluation dataset is used to evaluate the model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from typing import Optional\n", + "\n", + "\n", + "def eval_callback(model: tf.keras.Model,\n", + " num_samples: Optional[int] = None) -> float:\n", + " if num_samples is None:\n", + " num_samples = EVAL_DATASET_SIZE\n", + "\n", + " sampled_dataset = eval_dataset.take(num_samples)\n", + "\n", + " # Model should be compiled before evaluation\n", + " model.compile(optimizer=tf.keras.optimizers.Adam(),\n", + " loss=tf.keras.losses.CategoricalCrossentropy(),\n", + " metrics=tf.keras.metrics.CategoricalAccuracy())\n", + " _, acc = model.evaluate(sampled_dataset)\n", + "\n", + " return acc" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## 5. Apply AutoQuant\n", + "\n", + "As a first step, the AutoQuant object is created.\n", + "\n", + "The **allowed_accuracy_drop** parameter is set by the user to convey to the AutoQuant feature, how much accuracy drop is tolerated by the user. AutoQuant applies a series of quantization features. When the allowed accuracy is reached, AutoQuant stops applying any subsequent quantization feature. 
Please refer AutoQuant User Guide and API documentation for complete details." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "auto_quant = AutoQuant(allowed_accuracy_drop=0.01,\n", + " unlabeled_dataset=unlabeled_dataset,\n", + " eval_callback=eval_callback)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## Optionally set AdaRound Parameters\n", + "The AutoQuant feature internally uses default parameters to execute the AdaRound step.\n", + "If and only if necessary, the default AdaRound Parameters should be modified using the API shown below.\n", + "\n", + "**Note:**\n", + "To execute this example faster, the default value of the **num_iterations** parameter has been reduced from 10000 to 2000" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters\n", + "\n", + "ADAROUND_DATASET_SIZE = 2000\n", + "adaround_dataset = unlabeled_dataset.take(ADAROUND_DATASET_SIZE)\n", + "adaround_params = AdaroundParameters(adaround_dataset,\n", + " num_batches=ADAROUND_DATASET_SIZE // BATCH_SIZE)\n", + "auto_quant.set_adaround_params(adaround_params)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "## Run AutoQuant\n", + "\n", + "This step applies the AutoQuant feature. The best possible quantized model, the associated eval_score and the path to the AdaRound encoding files are returned." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "model, accuracy, encoding_path = auto_quant.apply(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET AutoQuant feature.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET CLE and AdaRound features in a standalone fashion." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.6" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/bn_reestimation.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/bn_reestimation.html new file mode 100644 index 00000000..cc476e54 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/bn_reestimation.html @@ -0,0 +1,1502 @@ + + + + + + Quantization-Aware Training with BatchNorm Re-estimation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +
+

Quantization-Aware Training with BatchNorm Re-estimation

+

This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training) with batchnorm re-estimation. Batchnorm re-estimation is a technique for countering potential instability of batchnorm statistics (i.e. running mean and variance) during QAT. More specifically, batchnorm re-estimation recalculates the batchnorm statistics based on the model after QAT. By doing so, we aim to make our model learn batchnorm statistics from stable outputs after QAT, rather than from likely noisy outputs during QAT.

+
+

Overall flow

+

This notebook covers the following steps: 1. Instantiate the example evaluation and training pipeline 2. Define constants and prepare datasets 3. Create the model in Keras 4. Train and evaluate the model 5. Quantize the model with QuantSim 6. Finetune and evaluate the quantization simulation model 7. Re-estimate batchnorm statistics and compare the eval score before and after re-estimation 8. Fold the re-estimated batchnorm layers and export the quantization simulation model

+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dir/'       # Please replace this with a real directory
+
+
+
+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow as tf
+
+
+
+
+
+
+

1. Instantiate the example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
from typing import Optional
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset
+from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator
+
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataset(batch_size: Optional[int] = None) -> tf.data.Dataset:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        :return: A tensorflow dataset
+        """
+        if batch_size is None:
+            batch_size = image_net_config.evaluation['batch_size']
+
+        data_loader = ImageNetDataset(DATASET_DIR,
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=batch_size)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(model, iterations=None) -> float:
+        """
+        Given a Keras model, evaluates its Top-1 accuracy on the validation dataset
+        :param model: The Keras model to be evaluated.
+        :param iterations: The number of iterations to run. If None, all the data will be used
+        :return: The accuracy for the sample with the maximum accuracy.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR,
+                                      image_size=image_net_config.dataset["image_size"],
+                                      batch_size=image_net_config.evaluation["batch_size"])
+
+        return evaluator.evaluate(model=model, iterations=iterations)
+
+
+
+
+
+

2. Define constants and prepare datasets

+

In this section the constants and helper functions needed to run this example are defined.

+
    +
  • EVAL_DATASET_SIZE To execute this example faster, this value has been set to 4

  • +
  • TRAIN_DATASET_SIZE To execute this example faster, this value has been set to 4

  • +
  • RE_ESTIMATION_DATASET_SIZE To execute this example faster, this value has been set to 4

  • +
  • BATCH_SIZE The batch size set by the user. As an example, it is set to 16

  • +
+
+
[ ]:
+
+
+
EVAL_DATASET_SIZE = 4
+TRAIN_DATASET_SIZE = 4
+RE_ESTIMATION_DATASET_SIZE = 4
+BATCH_SIZE = 16
+
+dataset = ImageNetDataPipeline.get_val_dataset(BATCH_SIZE).dataset
+eval_dataset = dataset.take(EVAL_DATASET_SIZE)
+train_dataset = dataset.take(TRAIN_DATASET_SIZE)
+unlabeled_dataset = dataset.map(lambda images, labels: images)
+re_estimation_dataset = unlabeled_dataset.take(RE_ESTIMATION_DATASET_SIZE)
+
+
+
+
+
+

3. Create the model in Keras

+

Currently, only Keras models built using the Sequential or Functional APIs are compatible with QuantSim - models making use of subclassed layers are incompatible. Therefore, we use the Functional API to create the model used in this example

+
+
[ ]:
+
+
+
tf.keras.backend.clear_session()
+inputs = tf.keras.Input(shape=(224, 224, 3), name="inputs")
+conv = tf.keras.layers.Conv2D(16, (3, 3), name ='conv1')(inputs)
+bn = tf.keras.layers.BatchNormalization(fused=True)(conv)
+relu = tf.keras.layers.ReLU()(bn)
+pool = tf.keras.layers.MaxPooling2D()(relu)
+conv2 = tf.keras.layers.Conv2D(8, (3, 3), name ='conv2')(pool)
+flatten = tf.keras.layers.Flatten()(conv2)
+dense  = tf.keras.layers.Dense(1000)(flatten)
+functional_model = tf.keras.Model(inputs=inputs, outputs=dense)
+
+
+
+
+
+

4. Train and evaluate the model

+

Before we can quantize the model and apply QAT, the FP32 model must be trained so that we can get a baseline accuracy.

+
+
[ ]:
+
+
+
loss_fn = tf.keras.losses.CategoricalCrossentropy()
+
+functional_model.compile(optimizer='adam',
+              loss=loss_fn,
+              metrics=['accuracy'])
+
+functional_model.fit(train_dataset, epochs=5)
+
+# Evaluate the model on the test data using `evaluate`
+print("Evaluate quantized model (post QAT) on test data")
+ImageNetDataPipeline.evaluate(model=functional_model)
+
+
+
+
+
+

5. Create a QuantizationSim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here: - quant_scheme: We set this to “QuantScheme.training_range_learning_with_tf_init”. Supported options are ‘tf_enhanced’ or ‘tf’, or the Quant Scheme Enums QuantScheme.post_training_tf and QuantScheme.post_training_tf_enhanced. - default_output_bw: Setting this to 8 essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision. - default_param_bw: Setting this to 8 essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
import json
+from aimet_common.defs import QuantScheme
+from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+
+default_config_per_channel = {
+        "defaults":
+            {
+                "ops":
+                    {
+                        "is_output_quantized": "True"
+                    },
+                "params":
+                    {
+                        "is_quantized": "True",
+                        "is_symmetric": "True"
+                    },
+                "strict_symmetric": "False",
+                "unsigned_symmetric": "True",
+                "per_channel_quantization": "True"
+            },
+
+        "params":
+            {
+                "bias":
+                    {
+                        "is_quantized": "False"
+                    }
+            },
+
+        "op_type":
+            {
+                "Squeeze":
+                    {
+                        "is_output_quantized": "False"
+                    },
+                "Pad":
+                    {
+                        "is_output_quantized": "False"
+                    },
+                "Mean":
+                    {
+                        "is_output_quantized": "False"
+                    }
+            },
+
+        "supergroups":
+            [
+                {
+                    "op_list": ["Conv", "Relu"]
+                },
+                {
+                    "op_list": ["Conv", "Clip"]
+                },
+                {
+                    "op_list": ["Conv", "BatchNormalization", "Relu"]
+                },
+                {
+                    "op_list": ["Add", "Relu"]
+                },
+                {
+                    "op_list": ["Gemm", "Relu"]
+                }
+            ],
+
+        "model_input":
+            {
+                "is_input_quantized": "True"
+            },
+
+        "model_output":
+            {}
+    }
+
+with open("/tmp/default_config_per_channel.json", "w") as f:
+    json.dump(default_config_per_channel, f)
+
+
+qsim = QuantizationSimModel(functional_model, quant_scheme=QuantScheme.training_range_learning_with_tf_init,
+                                config_file="/tmp/default_config_per_channel.json")
+
+
+
+
+
+

Prepare the evaluation callback function

+

The eval_callback() function takes as arguments the model object to evaluate (which it compiles internally) and the number of samples to use. If the num_samples argument is None, the whole evaluation dataset is used to evaluate the model.

+
+
[ ]:
+
+
+
from typing import Optional
+
+
+def eval_callback(model: tf.keras.Model,
+                  num_samples: Optional[int] = None) -> float:
+    if num_samples is None:
+        num_samples = EVAL_DATASET_SIZE
+
+    sampled_dataset = eval_dataset.take(num_samples)
+
+    # Model should be compiled before evaluation
+    model.compile(optimizer=tf.keras.optimizers.Adam(),
+                  loss=tf.keras.losses.CategoricalCrossentropy(),
+                  metrics=tf.keras.metrics.CategoricalAccuracy())
+    _, acc = model.evaluate(sampled_dataset)
+
+    return acc
+
+
+
+

Compute Encodings

+

Although AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics, which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, etc., so we can just ignore the model output for this purpose. A few pointers regarding the data samples: - In practice, we need a very small percentage of the overall data samples for computing encodings. - It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes are covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario, such as using all-positive or all-negative samples.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. Such a routine can be written in many different ways; in this notebook, the eval_callback defined above is simply reused for this purpose.

+
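As an illustration only (the notebook itself reuses eval_callback below), a dedicated calibration routine could look like the following sketch. It relies on the unlabeled_dataset and EVAL_DATASET_SIZE defined earlier, and the name pass_calibration_data is arbitrary.

[ ]:

from typing import Optional

def pass_calibration_data(sim_model: tf.keras.Model, num_batches: Optional[int] = None):
    # Forward-pass a few unlabeled batches so the quantizers can observe
    # activation ranges; the model outputs are intentionally ignored.
    if num_batches is None:
        num_batches = EVAL_DATASET_SIZE
    for images in unlabeled_dataset.take(num_batches):
        sim_model(images)

# Such a routine could be passed to compute_encodings in place of eval_callback:
# qsim.compute_encodings(pass_calibration_data, forward_pass_callback_args=None)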
+
[ ]:
+
+
+
qsim.compute_encodings(eval_callback, forward_pass_callback_args=None)
+
+
+
+

Next, we can evaluate the performance of the quantized model

+
+
[ ]:
+
+
+
print("Evaluate quantized model on test data")
+ImageNetDataPipeline.evaluate(model=qsim.model)
+
+
+
+
+
+

6. Perform QAT

+

To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so. For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+
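As an illustration of the learning-rate guidance above (not part of the original notebook), a step-decay schedule could be set up with a standard Keras callback; the starting value of 1e-5 is an assumed placeholder, not a value taken from the original training run.

[ ]:

# Assumed starting point: roughly the final learning rate of the original FP32 training.
initial_lr = 1e-5

def step_decay(epoch, lr):
    # Drop the learning rate by a factor of 10 every 5 epochs.
    return initial_lr * (0.1 ** (epoch // 5))

lr_callback = tf.keras.callbacks.LearningRateScheduler(step_decay)
# lr_callback could be added to the callbacks list in the fit() call below.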
+
[ ]:
+
+
+
quantized_callback = tf.keras.callbacks.TensorBoard(log_dir="./log/quantized")
+history = qsim.model.fit(
+    train_dataset, epochs=1, validation_data=eval_dataset,
+    callbacks=[quantized_callback]
+)
+
+
+
+

Finally, let’s evaluate the validation accuracy of our model after QAT.

+
+
[ ]:
+
+
+
print("Evaluate quantized model (post QAT) on test data")
+ImageNetDataPipeline.evaluate(model=qsim.model)
+
+
+
+

7. Re-estimate BatchNorm Statistics

+

AIMET provides a helper function, reestimate_bn_stats, for re-estimating batchnorm statistics. Here is the full list of parameters for this function: * model: Model whose BatchNorm statistics are to be re-estimated. * dataloader: Train dataloader. * num_batches (optional): The number of batches to be used for re-estimation. (Default: 100) * forward_fn (optional): Optional adapter function that performs a forward pass given a model and an input batch yielded from the data loader. If not specified, it is expected that inputs yielded from the dataloader can be passed directly to the model.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.bn_reestimation import reestimate_bn_stats
+
+reestimate_bn_stats(qsim.model, re_estimation_dataset, 1)
+
+
+
+
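For illustration only (not shown in the original notebook), a forward_fn adapter could be supplied when the dataloader yields (images, labels) tuples, as the labeled dataset defined earlier does; the function name below is arbitrary.

[ ]:

def forward_fn(model, batch):
    # Adapter: unpack an (images, labels) batch and run only the images
    # through the model; labels are not needed for re-estimation.
    images, _ = batch
    model(images)

# With the adapter, the labeled dataset could be passed directly, e.g.:
# reestimate_bn_stats(qsim.model, dataset.take(RE_ESTIMATION_DATASET_SIZE),
#                     num_batches=1, forward_fn=forward_fn)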
+

Fold BatchNorm Layers

+

So far, we have improved our quantization simulation model through QAT and batchnorm re-estimation. The next step would be to actually take this model to target. But first, we should fold the batchnorm layers for our model to run on target devices more efficiently.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms_to_scale
+fold_all_batch_norms_to_scale(qsim)
+
+
+
+
+
+
+
+

8. Export Model

+

As the final step, we will export the model to run it on actual target devices. AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
import os
+os.makedirs('./output/', exist_ok=True)
+qsim.export(path='./output/', filename_prefix='mnist_after_bn_re_estimation_qat_range_learning')
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use the batchnorm re-estimation feature of AIMET.

+

A few additional resources: - Refer to the AIMET API docs for more details on the APIs and optional parameters. - Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT methods.

+
+
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/bn_reestimation.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/bn_reestimation.ipynb new file mode 100644 index 00000000..100d10ad --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/bn_reestimation.ipynb @@ -0,0 +1,624 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Quantization-Aware Training with BatchNorm Re-estimation\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training) with batchnorm re-estimation.\n", + "Batchnorm re-estimation is a technique for countering potential instability of batchnorm statistics (i.e. running\n", + "mean and variance) during QAT. More specifically, batchnorm re-estimation recalculates the batchnorm statistics based on the model after QAT. By doing so, we aim to make our model learn batchnorm statistics from stable outputs after QAT, rather than from likely noisy outputs during QAT.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following steps:\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Define Constants and Datasets Prepare\n", + "3. Create the model in Keras\n", + "4. Train and evaluate the model\n", + "5. Quantize the model with QuantSim\n", + "6. Finetune and evaluate the quantization simulation model\n", + "7. Re-estimate batchnorm statistics and compare the eval score before and after re-estimation\n", + "8. Fold the re-estimated batchnorm layers and export the quantization simulation model\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. [https://image-net.org/challenges/LSVRC/2012/index.php#](https://image-net.org/challenges/LSVRC/2012/index.php#))\n", + "\n", + "**Note**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow as tf" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. 
Instantiate the example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from typing import Optional\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset\n", + "from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator\n", + "\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + "\n", + " @staticmethod\n", + " def get_val_dataset(batch_size: Optional[int] = None) -> tf.data.Dataset:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " :return: A tensorflow dataset\n", + " \"\"\"\n", + " if batch_size is None:\n", + " batch_size = image_net_config.evaluation['batch_size']\n", + "\n", + " data_loader = ImageNetDataset(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=batch_size)\n", + "\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model, iterations=None) -> float:\n", + " \"\"\"\n", + " Given a Keras model, evaluates its Top-1 accuracy on the validation dataset\n", + " :param model: The Keras model to be evaluated.\n", + " :param iterations: The number of iterations to run. If None, all the data will be used\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR,\n", + " image_size=image_net_config.dataset[\"image_size\"],\n", + " batch_size=image_net_config.evaluation[\"batch_size\"])\n", + "\n", + " return evaluator.evaluate(model=model, iterations=iterations)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## 2. Define Constants and Datasets Prepare\n", + "\n", + "In this section the constants and helper functions needed to run this example are defined.\n", + "\n", + "- **EVAL_DATASET_SIZE** To execute this example faster this value has been set to 4\n", + "- **TRAIN_DATASET_SIZE** To execute this example faster this value has been set to 4\n", + "- **RE_ESTIMATION_DATASET_SIZE** To execute this example faster this value has been set to 4\n", + "- **BATCH_SIZE** User sets the batch size. 
As an example, set to 16\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "EVAL_DATASET_SIZE = 4\n", + "TRAIN_DATASET_SIZE = 4\n", + "RE_ESTIMATION_DATASET_SIZE = 4\n", + "BATCH_SIZE = 16\n", + "\n", + "dataset = ImageNetDataPipeline.get_val_dataset(BATCH_SIZE).dataset\n", + "eval_dataset = dataset.take(EVAL_DATASET_SIZE)\n", + "train_dataset = dataset.take(TRAIN_DATASET_SIZE)\n", + "unlabeled_dataset = dataset.map(lambda images, labels: images)\n", + "re_estimation_dataset = unlabeled_dataset.take(RE_ESTIMATION_DATASET_SIZE)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## 2. Create the model in Keras\n", + "\n", + "Currently, only Keras models built using the Sequential or Functional APIs are compatible with QuantSim - models making use of subclassed layers are incompatible. Therefore, we use the Functional API to create the model used in this example" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "tf.keras.backend.clear_session()\n", + "inputs = tf.keras.Input(shape=(224, 224, 3), name=\"inputs\")\n", + "conv = tf.keras.layers.Conv2D(16, (3, 3), name ='conv1')(inputs)\n", + "bn = tf.keras.layers.BatchNormalization(fused=True)(conv)\n", + "relu = tf.keras.layers.ReLU()(bn)\n", + "pool = tf.keras.layers.MaxPooling2D()(relu)\n", + "conv2 = tf.keras.layers.Conv2D(8, (3, 3), name ='conv2')(pool)\n", + "flatten = tf.keras.layers.Flatten()(conv2)\n", + "dense = tf.keras.layers.Dense(1000)(flatten)\n", + "functional_model = tf.keras.Model(inputs=inputs, outputs=dense)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## 3. Train and evaluate the model\n", + "\n", + "Before we can quantize the model and apply QAT, the FP32 model must be trained so that we can get a baseline accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "loss_fn = tf.keras.losses.CategoricalCrossentropy()\n", + "\n", + "functional_model.compile(optimizer='adam',\n", + " loss=loss_fn,\n", + " metrics=['accuracy'])\n", + "\n", + "functional_model.fit(train_dataset, epochs=5)\n", + "\n", + "# Evaluate the model on the test data using `evaluate`\n", + "print(\"Evaluate quantized model (post QAT) on test data\")\n", + "ImageNetDataPipeline.evaluate(model=functional_model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## 4. Create a QuantizationSim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. 
This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.training_range_learning_with_tf_init\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import json\n", + "from aimet_common.defs import QuantScheme\n", + "from aimet_tensorflow.keras.quantsim import QuantizationSimModel\n", + "\n", + "default_config_per_channel = {\n", + " \"defaults\":\n", + " {\n", + " \"ops\":\n", + " {\n", + " \"is_output_quantized\": \"True\"\n", + " },\n", + " \"params\":\n", + " {\n", + " \"is_quantized\": \"True\",\n", + " \"is_symmetric\": \"True\"\n", + " },\n", + " \"strict_symmetric\": \"False\",\n", + " \"unsigned_symmetric\": \"True\",\n", + " \"per_channel_quantization\": \"True\"\n", + " },\n", + "\n", + " \"params\":\n", + " {\n", + " \"bias\":\n", + " {\n", + " \"is_quantized\": \"False\"\n", + " }\n", + " },\n", + "\n", + " \"op_type\":\n", + " {\n", + " \"Squeeze\":\n", + " {\n", + " \"is_output_quantized\": \"False\"\n", + " },\n", + " \"Pad\":\n", + " {\n", + " \"is_output_quantized\": \"False\"\n", + " },\n", + " \"Mean\":\n", + " {\n", + " \"is_output_quantized\": \"False\"\n", + " }\n", + " },\n", + "\n", + " \"supergroups\":\n", + " [\n", + " {\n", + " \"op_list\": [\"Conv\", \"Relu\"]\n", + " },\n", + " {\n", + " \"op_list\": [\"Conv\", \"Clip\"]\n", + " },\n", + " {\n", + " \"op_list\": [\"Conv\", \"BatchNormalization\", \"Relu\"]\n", + " },\n", + " {\n", + " \"op_list\": [\"Add\", \"Relu\"]\n", + " },\n", + " {\n", + " \"op_list\": [\"Gemm\", \"Relu\"]\n", + " }\n", + " ],\n", + "\n", + " \"model_input\":\n", + " {\n", + " \"is_input_quantized\": \"True\"\n", + " },\n", + "\n", + " \"model_output\":\n", + " {}\n", + " }\n", + "\n", + "with open(\"/tmp/default_config_per_channel.json\", \"w\") as f:\n", + " json.dump(default_config_per_channel, f)\n", + "\n", + "\n", + "qsim = QuantizationSimModel(functional_model, quant_scheme=QuantScheme.training_range_learning_with_tf_init,\n", + " config_file=\"/tmp/default_config_per_channel.json\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Prepare the evaluation callback function\n", + "\n", + "The **eval_callback()** function takes the model object to evaluate and compile option dictionary and the number of samples to use as arguments. If the **num_samples** argument is None, the whole evaluation dataset is used to evaluate the model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from typing import Optional\n", + "\n", + "\n", + "def eval_callback(model: tf.keras.Model,\n", + " num_samples: Optional[int] = None) -> float:\n", + " if num_samples is None:\n", + " num_samples = EVAL_DATASET_SIZE\n", + "\n", + " sampled_dataset = eval_dataset.take(num_samples)\n", + "\n", + " # Model should be compiled before evaluation\n", + " model.compile(optimizer=tf.keras.optimizers.Adam(),\n", + " loss=tf.keras.losses.CategoricalCrossentropy(),\n", + " metrics=tf.keras.metrics.CategoricalAccuracy())\n", + " _, acc = model.evaluate(sampled_dataset)\n", + "\n", + " return acc" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "**Compute Encodings**\n", + "\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all positive or negative samples are used.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "qsim.compute_encodings(eval_callback, forward_pass_callback_args=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Next, we can evaluate the performance of the quantized model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "print(\"Evaluate quantized model on test data\")\n", + "ImageNetDataPipeline.evaluate(model=qsim.model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## 5. Perform QAT\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. 
Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "quantized_callback = tf.keras.callbacks.TensorBoard(log_dir=\"./log/quantized\")\n", + "history = qsim.model.fit(\n", + " train_dataset, batch_size=4, epochs=1, validation_data=eval_dataset,\n", + " callbacks=[quantized_callback]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, let's evaluate the validation accuracy of our model after QAT." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "print(\"Evaluate quantized model (post QAT) on test data\")\n", + "ImageNetDataPipeline.evaluate(model=qsim.model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***6. Re-estimate BatchNorm Statistics***\n", + "\n", + "AIMET provides a helper function, `reestimate_bn_stats`, for re-estimating batchnorm statistics.\n", + "Here is the full list of parameters for this function:\n", + "* **model**: Model to re-estimate the BatchNorm statistics.\n", + "* **dataloader** Train dataloader.\n", + "* **num_batches** (optional): The number of batches to be used for reestimation. (Default: 100)\n", + "* **forward_fn** (optional): Optional adapter function that performs forward pass given a model and a input batch yielded from the data loader. If not specified, it is expected that inputs yielded from dataloader can be passed directly to the model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.bn_reestimation import reestimate_bn_stats\n", + "\n", + "reestimate_bn_stats(qsim.model, re_estimation_dataset, 1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Fold BatchNorm Layers\n", + "\n", + "So far, we have improved our quantization simulation model through QAT and batchnorm re-estimation. The next step would be to actually take this model to target. But first, we should fold the batchnorm layers for our model to run on target devices more efficiently." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms_to_scale\n", + "fold_all_batch_norms_to_scale(qsim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 5. Export Model\n", + "As the final step, we will export the model to run it on actual target devices. AIMET QuantizationSimModel provides an export API for this purpose." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import os\n", + "os.makedirs('./output/', exist_ok=True)\n", + "qsim.export(path='./output/', filename_prefix='mnist_after_bn_re_estimation_qat_range_learning')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use batchnorm re-estimation feature of AIMET.\n", + "\n", + "Few additional resources\n", + "- Refer to the [AIMET API docs](https://quic.github.io/aimet-pages/AimetDocs/api_docs/index.html) to know more details of the APIs and optional parameters.\n", + "- Refer to the [other example notebooks](https://github.com/quic/aimet/tree/develop/Examples/tensorflow/quantization/keras) to understand how to use AIMET post-training quantization techniques and QAT methods." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/keras_transformer_qat.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/keras_transformer_qat.html new file mode 100644 index 00000000..8dfe1c9c --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/keras_transformer_qat.html @@ -0,0 +1,1325 @@ + + + + + + Quantization-Aware Training with a Keras Transformer Model — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +
+

Quantization-Aware Training with a Keras Transformer Model

+

This notebook shows a working code example of how to use AIMET to perform QAT (quantization-aware training) for transformer models built in Keras. QAT is an AIMET feature that adds quantization simulation ops (sometimes called fake quantization ops) to a trained ML model and uses a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.

+
+

Overall flow

+

This notebook covers the following: 1. Load the dataset 2. Create the model in Keras 3. Train and evaluate the model 4. Quantize the model with QuantSim 5. Fine-tune the quantized model accuracy with QAT

+

1. Load the dataset

+

This notebook relies on the IMDB dataset for sentiment analysis, as provided by Keras.

+
+
[ ]:
+
+
+
from tensorflow import keras
+
+vocab_size = 20000  # Only consider the top 20k words
+maxlen = 200  # Only consider the first 200 words of each movie review
+
+(x_train, y_train), (x_val, y_val) = keras.datasets.imdb.load_data(num_words=vocab_size)
+print(len(x_train), "Training sequences")
+print(len(x_val), "Validation sequences")
+
+x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
+x_val = keras.preprocessing.sequence.pad_sequences(x_val, maxlen=maxlen)
+
+
+
+

Currently, only Keras models built using the Sequential or Functional APIs are compatible with QuantSim - models making use of subclassed layers are incompatible. Therefore, we use the Functional API to create the model used in this example.

+
+
[ ]:
+
+
+
import tensorflow as tf
+from tensorflow.keras import layers
+
+embed_dim = 32  # Embedding size for each token
+num_heads = 2  # Number of attention heads
+ff_dim = 32  # Hidden layer size in feed forward network inside transformer
+
+############## FUNCTIONAL MODEL ##############
+inputs = layers.Input(shape=(maxlen,))
+
+# Embedding Layer
+positions = tf.range(start=0, limit=maxlen, delta=1)
+positions = layers.Embedding(input_dim=maxlen, output_dim=embed_dim)(positions)
+x = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)(inputs)
+x = x + positions
+
+# Transformer Block
+x = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)(x, x)
+x = layers.Dropout(0.1)(x)
+x = layers.LayerNormalization(epsilon=1e-6)(x)
+x = layers.Dense(ff_dim, activation="relu")(x)
+x = layers.Dense(embed_dim)(x)
+x = layers.Dropout(0.1)(x)
+x = layers.LayerNormalization(epsilon=1e-6)(x)
+
+# Output layers
+x = layers.GlobalAveragePooling1D()(x)
+x = layers.Dropout(0.1)(x)
+x = layers.Dense(20, activation="relu")(x)
+x = layers.Dropout(0.1)(x)
+outputs = layers.Dense(2, activation="softmax")(x)
+################################################
+
+functional_model = keras.Model(inputs=inputs, outputs=outputs)
+
+
+
+
+

3. Train and evaluate the model to get a baseline accuracy

+

Before we can quantize the model and apply QAT, the FP32 model must be trained so that we can get a baseline accuracy.

+
+
[ ]:
+
+
+
functional_callback = tf.keras.callbacks.TensorBoard(log_dir="./log/functional", histogram_freq=1)
+functional_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
+history = functional_model.fit(
+    x_train, y_train, batch_size=32, epochs=1, validation_data=(x_val, y_val), callbacks=[functional_callback]
+)
+
+
+
+
+
[ ]:
+
+
+
# Evaluate the model on the test data using `evaluate`
+print("Evaluate model on test data")
+results = functional_model.evaluate(x_val, y_val, batch_size=128)
+print("test loss, test acc:", results)
+
+
+
+
+

4. Create a QuantizationSim Model and determine quantized accuracy

+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This means that AIMET will insert fake quantization ops in the model graph and configure them. A few of the parameters are explained here: - quant_scheme: We set this to “QuantScheme.post_training_tf_enhanced”. Supported options are ‘tf_enhanced’ or ‘tf’, or the Quant Scheme Enum values QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced. - default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision. - default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+
+model = QuantizationSimModel(model=functional_model,
+                             quant_scheme=QuantScheme.post_training_tf_enhanced,
+                             rounding_mode='nearest',
+                             default_output_bw=8,
+                             default_param_bw=8)
+
+
+
+

QuantSim works by wrapping each layer in the model with a Quantization Wrapper that simulates the effects of quantization on the inputs, outputs, and parameters of the layer (visualized below). A regular Conv2D Keras layer is displayed on the right, while a Conv2D layer after a quantization wrapper has been applied is displayed on the left. (Figures: a regular Conv2D layer; a Conv2D layer after a quantization wrapper has been applied.)

+

If a multi-head attention layer is encountered in the model, the original layer is replaced with a custom quantizable version that gives the QuantizationSimModel access to the inputs and outputs of internal ops within the layer, so that quantization wrappers can be applied at a more granular level than the entire MHA layer. This is necessary in order to accurately simulate the effects of on-target quantization.

+

This works by making use of Keras’s built-in clone_layer function, which allows us to clone and modify the FP32 model layer by layer. A more detailed call flow diagram is displayed below. (Figure: Keras QuantSim call flow.)

+
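To see the effect of this wrapping on the simulation model created above, one can simply list the layer types of its underlying Keras model (model.model here); a minimal sketch, noting that the exact wrapper class names may vary between AIMET versions:

# List the layer types of the wrapped simulation model to confirm that
+# quantization wrappers were inserted (class names are AIMET-version dependent).
+for layer in model.model.layers:
+    print(layer.name, "->", type(layer).__name__)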
+

Compute Encodings

+

Although AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics, which then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple: use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can simply ignore the model output. A few pointers regarding the data samples: - In practice, we need only a very small percentage of the overall data samples for computing encodings. - It may be beneficial if the samples used for computing encodings are well distributed. It is not necessary that every class be covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario, such as using all positive or all negative samples.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just one example.

+
+
[ ]:
+
+
+
model.compute_encodings(lambda m, _: m(x_val[0:1000]), None)
+model.export('./data', 'model', convert_to_pb=False) # Once the encodings have been computed, export them for later inspection
+
+
+
+
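The lambda passed to compute_encodings above is just one way to write the forward-pass callback. A minimal alternative sketch, assuming the x_val array loaded earlier in this notebook, feeds the same calibration samples in explicit batches:

def pass_calibration_data(sim_model, _):
+    # Feed a small number of unlabeled samples through the sim model in batches.
+    # The outputs are ignored; AIMET only needs activations to flow through the graph.
+    batch_size = 100
+    for start in range(0, 1000, batch_size):
+        sim_model(x_val[start:start + batch_size])
+
+# This function could be passed in place of the lambda above:
+# model.compute_encodings(pass_calibration_data, None)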

Next, we can evaluate the performance of the quantized model

+
+
[ ]:
+
+
+
model.model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]) # must compile model before evaluating
+
+print("Evaluate quantized model on test data")
+results = model.model.evaluate(x_val, y_val, batch_size=128)
+print("test loss, test acc:", results)
+
+
+
+
+

5. Perform QAT

+

To perform quantization-aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so (a sketch of such a schedule is shown below). For the purpose of this example notebook, we are going to train only for 1 epoch, but feel free to change these parameters as you see fit.

+
+
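As a hedged illustration of the learning-rate guidance above (not used in the short 1-epoch run below), a step schedule that drops the rate by 10x every 5 epochs could be written with a standard Keras callback; the starting rate of 1e-4 is an assumed, illustrative value:

# Illustrative step schedule: drop the learning rate by 10x every 5 epochs,
+# starting from an assumed initial rate of 1e-4.
+def qat_lr_schedule(epoch, lr, initial_lr=1e-4, drop_every=5):
+    return initial_lr * (0.1 ** (epoch // drop_every))
+
+lr_callback = tf.keras.callbacks.LearningRateScheduler(qat_lr_schedule)
+# lr_callback could be appended to the callbacks list of the fit() call below.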
[ ]:
+
+
+
quantized_callback = tf.keras.callbacks.TensorBoard(log_dir="./log/quantized")
+history = model.model.fit(
+    x_train[0:1024], y_train[0:1024], batch_size=32, epochs=1, validation_data=(x_val, y_val), callbacks=[quantized_callback]
+)
+
+
+
+

Now, let’s compute and export the encodings of the model after performing QAT. When comparing the encodings file generated by this step with the encodings generated before QAT, there should be some differences. These differences are an artifact of QAT (a sketch of such a comparison follows the next cell).

+
+
[ ]:
+
+
+
model.compute_encodings(lambda m, _: m(x_val[0:3000]), None)
+model.export('./data', 'model_after_qat', convert_to_pb=False)
+
+
+
+
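One way to inspect those differences is to load and compare the two exported encodings files. This is a minimal sketch, assuming the export calls above wrote JSON files named model.encodings and model_after_qat.encodings under ./data; the exact file layout may differ between AIMET versions:

import json
+
+with open('./data/model.encodings') as f:
+    encodings_before_qat = json.load(f)
+with open('./data/model_after_qat.encodings') as f:
+    encodings_after_qat = json.load(f)
+
+# Print a few activation encodings from each file to see how QAT shifted them.
+for name in list(encodings_before_qat.get('activation_encodings', {}))[:5]:
+    print(name)
+    print('  before QAT:', encodings_before_qat['activation_encodings'][name])
+    print('  after  QAT:', encodings_after_qat.get('activation_encodings', {}).get(name))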

Finally, let’s evaluate the validation accuracy of our model after QAT

+
+
[ ]:
+
+
+
print("Evaluate quantized model (post QAT) on test data")
+results = model.model.evaluate(x_val, y_val, batch_size=128)
+print("test loss, test acc:", results)
+
+
+
+

We can also use TensorBoard to visualize the FP32 and quantized models to see how they differ from one another. Comparing the two, we can see that most layers are now replaced with a quantization wrapper simulating the effects of quantization at the input and output nodes of the layer. In the case of more complex layers, like multi-head attention, QuantSim has custom pipelines to insert quantization wrappers around more elementary ops within the layer.

+
+
[ ]:
+
+
+
%tensorboard --logdir logs
+from tensorboard import notebook
+
+notebook.display(height=1000)
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET with Keras models. A few additional resources: refer to the AIMET API docs for more details on the APIs and optional parameters, and refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and the vanilla QAT method (without range learning).

+
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/keras_transformer_qat.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/keras_transformer_qat.ipynb new file mode 100644 index 00000000..3e6e36cb --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/keras_transformer_qat.ipynb @@ -0,0 +1,461 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "# Quantization-Aware Training with a Keras Transformer Model\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training) for transformer models built in Keras. QAT is an AIMET feature adding quantization simulation ops (also called fake quantization ops sometimes) to a trained ML model and using a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Load the dataset\n", + "2. Create the model in Keras\n", + "3. Train and evaluate the model\n", + "4. Quantize the model with QuantSim\n", + "5. Fine-tune the quantized model accuracy with QAT\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "***1. Load the dataset***\n", + "\n", + "This notebook relies on the IMDB dataset for sentiment analysis, as provided by Keras." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + }, + "pycharm": { + "is_executing": true + } + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "\n", + "vocab_size = 20000 # Only consider the top 20k words\n", + "maxlen = 200 # Only consider the first 200 words of each movie review\n", + "\n", + "(x_train, y_train), (x_val, y_val) = keras.datasets.imdb.load_data(num_words=vocab_size)\n", + "print(len(x_train), \"Training sequences\")\n", + "print(len(x_val), \"Validation sequences\")\n", + "\n", + "x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)\n", + "x_val = keras.preprocessing.sequence.pad_sequences(x_val, maxlen=maxlen)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "Currently, only Keras models built using the Sequential or Functional APIs are compatible with QuantSim - models making use of subclassed layers are incompatible. Therefore, we use the Functional API to create the model used in this example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from tensorflow.keras import layers\n", + "\n", + "embed_dim = 32 # Embedding size for each token\n", + "num_heads = 2 # Number of attention heads\n", + "ff_dim = 32 # Hidden layer size in feed forward network inside transformer\n", + "\n", + "############## FUNCTIONAL MODEL ##############\n", + "inputs = layers.Input(shape=(maxlen,))\n", + "\n", + "# Embedding Layer\n", + "positions = tf.range(start=0, limit=maxlen, delta=1)\n", + "positions = layers.Embedding(input_dim=maxlen, output_dim=embed_dim)(positions)\n", + "x = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)(inputs)\n", + "x = x + positions\n", + "\n", + "# Transformer Block\n", + "x = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)(x, x)\n", + "x = layers.Dropout(0.1)(x)\n", + "x = layers.LayerNormalization(epsilon=1e-6)(x)\n", + "x = layers.Dense(ff_dim, activation=\"relu\")(x)\n", + "x = layers.Dense(embed_dim)(x)\n", + "x = layers.Dropout(0.1)(x)\n", + "x = layers.LayerNormalization(epsilon=1e-6)(x)\n", + "\n", + "# Output layers\n", + "x = layers.GlobalAveragePooling1D()(x)\n", + "x = layers.Dropout(0.1)(x)\n", + "x = layers.Dense(20, activation=\"relu\")(x)\n", + "x = layers.Dropout(0.1)(x)\n", + "outputs = layers.Dense(2, activation=\"softmax\")(x)\n", + "################################################\n", + "\n", + "functional_model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "***3. Train and evaluate the model to get a baseline accuracy***\n", + "\n", + "Before we can quantize the model and apply QAT, the FP32 model must be trained so that we can get a baseline accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "functional_callback = tf.keras.callbacks.TensorBoard(log_dir=\"./log/functional\", histogram_freq=1)\n", + "functional_model.compile(optimizer=\"adam\", loss=\"sparse_categorical_crossentropy\", metrics=[\"accuracy\"])\n", + "history = functional_model.fit(\n", + " x_train, y_train, batch_size=32, epochs=1, validation_data=(x_val, y_val), callbacks=[functional_callback]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "# Evaluate the model on the test data using `evaluate`\n", + "print(\"Evaluate model on test data\")\n", + "results = functional_model.evaluate(x_val, y_val, batch_size=128)\n", + "print(\"test loss, test acc:\", results)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "***4. Create a QuantizationSim Model and determine quantized accuracy***\n", + "\n", + "**Create Quantization Sim Model**\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. 
This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_tensorflow.keras.quantsim import QuantizationSimModel\n", + "\n", + "model = QuantizationSimModel(model=functional_model,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " rounding_mode='nearest',\n", + " default_output_bw=8,\n", + " default_param_bw=8)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "QuantSim works by wrapping each layer in the model with a Quantization Wrapper that simulates the effects of quantization on the inputs, outputs, and parameters of the layer (visualized below). A regular Conv2D Keras layer is displayed on the right, while a Conv2D layer after a quantization wrapper has been applied is displayed on the left.\n", + "![A regular Conv2d layer](../images/keras_pre_quant_layer.png)    ![A Conv2D layer after a quantization wrapper has been applied](../images/keras_post_quant_layer.png)\n", + "\n", + "If a multi-head attention layer is encountered in the model, the original layer is replaced with a custom quantizable version that gives the QuantizationSimModel access to the inputs and outputs of internal ops within the layer, so that quantization wrappers can be applied at a more granular level than the entire MHA layer. This is necessary in order to accurately simulate the effects of on-target quantization.\n", + "\n", + "This works by making use of Keras's built-in `clone_layer` function, which allows us to clone and modify the FP32 model layer by layer. A more detailed call flow diagram is displayed below.\n", + "![Keras QuantSim call flow](../images/keras_quantsim_callflow.png)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "**Compute Encodings**\n", + "\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. 
This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all positive or negative samples are used.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "model.compute_encodings(lambda m, _: m(x_val[0:1000]), None)\n", + "model.export('./data', 'model', convert_to_pb=False) # Once the encodings have been computed, export them for later inspection" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "Next, we can evaluate the performance of the quantized model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "model.model.compile(optimizer=\"adam\", loss=\"sparse_categorical_crossentropy\", metrics=[\"accuracy\"]) # must compile model before evaluating\n", + "\n", + "print(\"Evaluate quantized model on test data\")\n", + "results = model.model.evaluate(x_val, y_val, batch_size=128)\n", + "print(\"test loss, test acc:\", results)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "***5. Perform QAT***\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "quantized_callback = tf.keras.callbacks.TensorBoard(log_dir=\"./log/quantized\")\n", + "history = model.model.fit(\n", + " x_train[0:1024], y_train[0:1024], batch_size=32, epochs=1, validation_data=(x_val, y_val), callbacks=[quantized_callback]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "Now, let's compute and export the encodings of the model after performing QAT. When comparing the encodings file generated by this step and the encodings generated before quantization, there should be some differences. These differences are an artifact of QAT." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "model.compute_encodings(lambda m, _: m(x_val[0:3000]), None)\n", + "model.export('./data', 'model_after_qat', convert_to_pb=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "Finally, let's evaluate the validation accuracy of our model after QAT" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "print(\"Evaluate quantized model (post QAT) on test data\")\n", + "results = model.model.evaluate(x_val, y_val, batch_size=128)\n", + "print(\"test loss, test acc:\", results)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "We can also use tensorboard to visualize the FP32 and quantized models to see how they are different from one another. Comparing the two, we can see that most layers are now replaced with a quantization wrapped simulating the effects of quantization at the input and output nodes of the layer. In the case of more complex layers, like multi-head attention, QuantSim has custom pipelines to insert quantization wrappers around more elementary ops within the layer." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "%tensorboard --logdir logs\n", + "from tensorboard import notebook\n", + "\n", + "notebook.display(height=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "---\n", + "***Summary***\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET with Keras models.\n", + "Few additional resources\n", + "Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and the vanilla QAT method (without range-learning)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/model_preparer.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/model_preparer.html new file mode 100644 index 00000000..3316c55c --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/model_preparer.html @@ -0,0 +1,1347 @@ + + + + + + Keras Model Preparer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Keras Model Preparer

+

This notebook shows how to prepare a Keras model for quantization. Specifically, this preparer converts a Keras model with subclass layers into a Keras model with functional layers. This is required for quantization because the AIMET quantization tooling only supports the Functional and Sequential Keras model-building APIs.

+

To learn more about the Keras Model Preparer, please refer to the API Docs in AIMET.

+
+

Overall flow

+

This notebook covers the following: 1. Creating a Keras model with subclass layers 2. Converting the Keras model with subclass layers to a Keras model with functional layers 3. Showing similarities and differences between the original and converted models 4. Discussing the limitations of the Keras Model Preparer

+
+
+

1. Creating a Keras model with subclass layers

+

First, we will create a Keras model with subclass layers. For this notebook example, we will use a model defined by Keras that utilizes subclass layers. This model is a text classification transformer model and can be found here. The subclass layers used in this model are TokenAndPositionEmbedding and TransformerBlock. They are defined below.

+
+
[ ]:
+
+
+
import tensorflow as tf
+
+class TransformerBlock(tf.keras.layers.Layer):
+    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
+        super(TransformerBlock, self).__init__()
+        self.att = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
+        self.ffn = tf.keras.Sequential(
+            [tf.keras.layers.Dense(ff_dim, activation="relu"), tf.keras.layers.Dense(embed_dim),]
+        )
+        self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
+        self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
+        self.dropout1 = tf.keras.layers.Dropout(rate)
+        self.dropout2 = tf.keras.layers.Dropout(rate)
+
+    def call(self, inputs, training, **kwargs):
+        attn_output = self.att(inputs, inputs)
+        attn_output = self.dropout1(attn_output, training=training)
+        out1 = self.layernorm1(inputs + attn_output)
+        ffn_output = self.ffn(out1)
+        ffn_output = self.dropout2(ffn_output, training=training)
+        return self.layernorm2(out1 + ffn_output)
+
+
+
+class TokenAndPositionEmbedding(tf.keras.layers.Layer):
+    def __init__(self, maxlen, vocab_size, embed_dim):
+        super(TokenAndPositionEmbedding, self).__init__()
+        self.token_emb = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
+        self.pos_emb = tf.keras.layers.Embedding(input_dim=maxlen, output_dim=embed_dim)
+
+    def call(self, x, **kwargs):
+        maxlen = tf.shape(x)[-1]
+        positions = tf.range(start=0, limit=maxlen, delta=1)
+        positions = self.pos_emb(positions)
+        x = self.token_emb(x)
+        x = x + positions
+        return x
+
+
+
+

With those subclass layers defined, we can now define the model. Since we are not training the model, we will use random weights and a random input tensor to build the model.

+
+
[ ]:
+
+
+
import numpy as np
+vocab_size = 20000
+maxlen = 200
+
+random_input = np.random.random((10, 200)) # Random input to build the model
+
+embed_dim = 32  # Embedding size for each token
+num_heads = 2  # Number of attention heads
+ff_dim = 32  # Hidden layer size in feed forward network inside transformer
+
+inputs = tf.keras.layers.Input(shape=(maxlen,))
+embedding_layer = TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim)
+x = embedding_layer(inputs)
+transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim)
+x = transformer_block(x)
+x = tf.keras.layers.GlobalAveragePooling1D()(x)
+x = tf.keras.layers.Dropout(0.1)(x)
+x = tf.keras.layers.Dense(20, activation="relu")(x)
+x = tf.keras.layers.Dropout(0.1)(x)
+outputs = tf.keras.layers.Dense(2, activation="softmax")(x)
+
+model = tf.keras.Model(inputs=inputs, outputs=outputs)
+_ = model(random_input)
+model.summary()
+
+
+
+

From the model.summary() output, we can see the model’s 2 subclass layers: token_and_position_embedding and transformer_block. Since these layers use other layers inside their classes, we need to extract them to create a symmetrical functional model (a quick way to spot the subclassed layers is sketched below).

+
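As a quick, heuristic way to spot which of these layers are subclassed rather than built-in Keras layers, one can inspect the module each layer class comes from; a minimal sketch using only standard Keras/Python introspection:

# Heuristic: layers whose class is defined outside the Keras/TensorFlow packages
+# (e.g. in this notebook's __main__) are the custom subclass layers.
+for layer in model.layers:
+    builtin = type(layer).__module__.startswith(('keras', 'tensorflow'))
+    print(f"{layer.name:40s} {type(layer).__name__:30s} {'built-in' if builtin else 'subclassed'}")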
+
+
+

2. Converting the Keras model with subclass layers to a Keras model with functional layers

+

The Keras Model Preparer can be used to convert a Keras model with subclass layers to a Keras model with functional layers. It can be imported from aimet_tensorflow.keras.model_preparer. The Keras Model Preparer takes in a Keras model with subclass layers and returns a Keras model with functional layers. Note that the prepare_model function takes an optional input_layer parameter. This parameter is required if the model begins with a subclass layer. In this case, our model does not begin with a subclass layer, so we do not need to provide an input_layer.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.model_preparer import prepare_model
+
+functional_model = prepare_model(model)
+functional_model.summary()
+
+
+
+

We can see that the Keras Model Preparer has converted the model with subclass layers to a model with functional layers. Specifically, it has extracted the call function of each of these layers and created a functional layer from it.

+
+
+
+

3. Showing similarities and differences between the original and converted models

+

We can see that the original model and the converted model are symmetrical. The only difference is that the subclass layers are unwrapped. This means that the converted model is functionally identical to the original model. We can test this in a few ways.

+
    +
  1. We can compare the total number of parameters in the original and converted models. We can see that the total number of parameters is the same.

  2. We can compare the weights of the original and converted models. We can see that the weights are the same.

    • Note that the order of the weights returned by get_weights() on each of these models is not the same, nor are the weight names. We can use an internal function to get the original model’s weights in the same order as the converted model’s weights.

  3. We can compare the outputs of the original and converted models. We can see that the outputs are the same.
+
+
[ ]:
+
+
+
from typing import Set, List
+
+# This function is a functional representation of the reordering done inside the model preparer for Keras.
+def get_original_models_weights_in_functional_model_order(
+    original_model: tf.keras.Model,
+    functional_model: tf.keras.Model,
+    class_names: Set[str]
+) -> List[np.ndarray]:
+    """Map the original model's weights to the functional model's weights.
+
+    Args:
+        original_model:
+            Original model to reference the weight order.
+        functional_model:
+            Prepared model to updates weight of.
+        class_names:
+            Names of the classes that the original model was subclassed from
+
+    Returns:
+        A list of the original model's weights in the order of the functional model's weights
+    """
+
+    # Make the original model's weights into a dictionary for quick lookup by name
+    # The original subclassed layers names are removed to match the new functional model's names
+    original_model_weights = {}
+    for weight in original_model.weights:
+        # pop out class_names of weight name
+        weight_name = weight.name
+        for class_name in class_names:
+            weight_name = weight_name.replace(class_name + '/', '')
+        original_model_weights[weight_name] = weight.numpy()
+
+    # Get the functional model's weights in order as a dictionary for quick lookup where the key is the weight name
+    # and the position of the weight's order is the value
+    functional_model_weight_order = {
+        weight.name: position
+        for position, weight in enumerate(functional_model.weights)
+    }
+
+    # Using the functional model's weights order, get the original model's weights in the same order. The lambda here
+    # uses the weight's name to get position in the functional model's weights order and the sorts the original model's
+    # weights by that position.
+    weights_in_correct_order = [
+        weight for _, weight in
+        sorted(original_model_weights.items(), key=lambda weight_info: functional_model_weight_order[weight_info[0]])
+    ]
+
+    return weights_in_correct_order
+
+
+
+
+
[ ]:
+
+
+
assert functional_model.count_params() == model.count_params()
+assert functional_model.input_shape == model.input_shape
+assert functional_model.output_shape == model.output_shape
+
+# NOTE: Since TextClassification Model has the internal layers out of order compared to the call method,
+# the weights are not in the order of what the actual architecture is (this is a Keras design).
+# Therefore, we get the original model's weights and sort them in the order of the actual
+# architecture and use those weights to compare to the functional model's weights.
+model_weights_in_correct_order = get_original_models_weights_in_functional_model_order(
+    model, functional_model, class_names=["token_and_position_embedding", "transformer_block"])
+
+for i, _ in enumerate(model_weights_in_correct_order):
+        np.testing.assert_array_equal(model_weights_in_correct_order[i], functional_model.get_weights()[i])
+
+np.testing.assert_array_equal(functional_model(random_input).numpy(), model(random_input).numpy())
+print("Models are equal")
+
+
+
+
+
+

4. Discussing the limitations of the Keras Model Preparer

+
    +
The AIMET Keras ModelPreparer API is able to convert subclass layers that have arithmetic expressions in their call function. However, this API and Keras will convert these operations to TFOpLambda layers, which are not currently supported by the AIMET Keras Quantization API. If possible, it is recommended to have the subclass layer’s call function resemble the Keras Functional API layers. For example, if a subclass layer has two convolution layers in its call function, the call function should look like the following:

    +
    def call(self, x, **kwargs):
    +    x = self.conv_1(x)
    +    x = self.conv_2(x)
    +    return x
    +
    +
    +
  • +
If the model starts with a subclassed layer, the AIMET Keras ModelPreparer API will need a Keras Input layer as input. This is because the Keras Functional API requires an Input layer as the first layer in the model. The AIMET Keras ModelPreparer API will raise an exception if the model starts with a subclassed layer and an Input layer is not provided (a sketch of this case follows this list).

  • +
+
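As a sketch of that second case, consider a hypothetical model whose first layer is one of the subclassed layers defined earlier; an explicit Input layer would then be passed to prepare_model. The exact call pattern below is an assumption based on the description above, so please confirm against the AIMET API docs:

# Hypothetical model whose first layer is subclassed, so it carries no Input layer
+# of its own; an explicit Input layer is supplied to prepare_model (assumed call
+# pattern based on the description above -- check the AIMET API docs for details).
+subclass_first_model = tf.keras.Sequential([
+    TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim),
+    tf.keras.layers.GlobalAveragePooling1D(),
+    tf.keras.layers.Dense(2, activation="softmax"),
+])
+
+prepared = prepare_model(subclass_first_model,
+                         input_layer=tf.keras.Input(shape=(maxlen,)))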
+
+
+

Summary

+

Hopefully this notebook was useful for you to understand how to use the Keras Model Preparer.

+

A few additional resources: AIMET API Docs

+
+
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/model_preparer.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/model_preparer.ipynb new file mode 100644 index 00000000..43ee42d8 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/model_preparer.ipynb @@ -0,0 +1,326 @@ +{ + "cells": [ + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Keras Model Preparer\n", + "\n", + "This notebook shows how to prepare a Keras model for quantization. Specifically, this preparer converts a Keras model with subclass layers into a Keras model with functional layers. This is required for quantization because the AIMET quantization tooling only supports the Functional and Sequantial Keras model building API's.\n", + "\n", + "To learn more about the Keras Model Preparer, please refer to the API Docs in AIMET.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Creating a Keras model with subclass layers\n", + "2. Converting the Keras model with subclass layers to a Keras model with functional layers\n", + "3. Showing similarities and differences between the original and converted models\n", + "4. Dicussing the limitations of the Keras Model Preparer" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Creating a Keras model with subclass layers\n", + "\n", + "First, we will create a Keras model with subclass layers. For this notebook example, we will use a model defined by Keras that utilizes subclass layers. This model is a text classification transformer model and can be found [here]( https://keras.io/examples/nlp/text_classification_with_transformer/). The subclass layers used in this model are - `TokenAndPositionEmbedding` and `TransformerBlock`. They are defined below. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "class TransformerBlock(tf.keras.layers.Layer):\n", + " def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):\n", + " super(TransformerBlock, self).__init__()\n", + " self.att = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)\n", + " self.ffn = tf.keras.Sequential(\n", + " [tf.keras.layers.Dense(ff_dim, activation=\"relu\"), tf.keras.layers.Dense(embed_dim),]\n", + " )\n", + " self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)\n", + " self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)\n", + " self.dropout1 = tf.keras.layers.Dropout(rate)\n", + " self.dropout2 = tf.keras.layers.Dropout(rate)\n", + "\n", + " def call(self, inputs, training, **kwargs):\n", + " attn_output = self.att(inputs, inputs)\n", + " attn_output = self.dropout1(attn_output, training=training)\n", + " out1 = self.layernorm1(inputs + attn_output)\n", + " ffn_output = self.ffn(out1)\n", + " ffn_output = self.dropout2(ffn_output, training=training)\n", + " return self.layernorm2(out1 + ffn_output)\n", + "\n", + "\n", + "\n", + "class TokenAndPositionEmbedding(tf.keras.layers.Layer):\n", + " def __init__(self, maxlen, vocab_size, embed_dim):\n", + " super(TokenAndPositionEmbedding, self).__init__()\n", + " self.token_emb = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)\n", + " self.pos_emb = tf.keras.layers.Embedding(input_dim=maxlen, output_dim=embed_dim)\n", + "\n", + " def call(self, x, **kwargs):\n", + " maxlen = tf.shape(x)[-1]\n", + " positions = tf.range(start=0, limit=maxlen, delta=1)\n", + " positions = self.pos_emb(positions)\n", + " x = self.token_emb(x)\n", + " x = x + positions\n", + " return x" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With those subclass layers defined, we can now define the model. Since we are not training the model, we will use random weights and a random input tensor to build the model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "vocab_size = 20000 \n", + "maxlen = 200\n", + "\n", + "random_input = np.random.random((10, 200)) # Random input to build the model\n", + "\n", + "embed_dim = 32 # Embedding size for each token\n", + "num_heads = 2 # Number of attention heads\n", + "ff_dim = 32 # Hidden layer size in feed forward network inside transformer\n", + "\n", + "inputs = tf.keras.layers.Input(shape=(maxlen,))\n", + "embedding_layer = TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim)\n", + "x = embedding_layer(inputs)\n", + "transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim)\n", + "x = transformer_block(x)\n", + "x = tf.keras.layers.GlobalAveragePooling1D()(x)\n", + "x = tf.keras.layers.Dropout(0.1)(x)\n", + "x = tf.keras.layers.Dense(20, activation=\"relu\")(x)\n", + "x = tf.keras.layers.Dropout(0.1)(x)\n", + "outputs = tf.keras.layers.Dense(2, activation=\"softmax\")(x)\n", + "\n", + "model = tf.keras.Model(inputs=inputs, outputs=outputs)\n", + "_ = model(random_input)\n", + "model.summary()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "From the `model.summary()` output, we can see the models 2 subclass layers - `token_and_position_embedding`, `transformer_block`. 
Since these layers are using layer inside they're classes, we need to extract them to create a symmetrical functional model. " + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Converting the Keras model with subclass layers to a Keras model with functional layers\n", + "\n", + "The Keras Model Preparer can be used to convert a Keras model with subclass layers to a Keras model with functional layers. The Keras Model Preparer can be imported from `aimet_tensorflow.keras.model_preparer`. The Keras Model Preparer takes in a Keras model with subclass layers and returns a Keras model with functional layers. Note that the `prepare_model` function takes an optional `input_layer` parameter. This parameter is required if the model begins with a subclass layer. In this case, the model does not begin with a subclass layer, so we do not need to provide an `input_shape` parameter." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.model_preparer import prepare_model\n", + "\n", + "functional_model = prepare_model(model) \n", + "functional_model.summary()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see that the Keras Model Preparer has converted the model with subclass layers to a model with functional layers. Specifically, it has extracted the call function of each of these layers and created a functional layer from it." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 3. Showing similarities and differences between the original and converted models\n", + "\n", + "We can see that the original model and the converted model are symmetrical. The only difference is that the subclass layers are unwrapped. This means that the converted model is functionally identical to the original model. We can test this in a few ways.\n", + "\n", + "1) We can compare the total number of parameters in the original and converted models. We can see that the total number of parameters is the same.\n", + "\n", + "2) We can compare the weights of the original and converted models. We can see that the weights are the same.\n", + " * Note that the order of the weights presented when calling `get_weights()` on each of these models are not the same and as is the names of the weights. We can use an internal function to get the original models weights in the same order as the converted models weights.\n", + "\n", + "3) We can compare the outputs of the original and converted models. We can see that the outputs are the same." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from typing import Set, List\n", + "\n", + "# This function is a functional representation of the reordering done inside the model preparer for Keras.\n", + "def get_original_models_weights_in_functional_model_order(\n", + " original_model: tf.keras.Model,\n", + " functional_model: tf.keras.Model,\n", + " class_names: Set[str]\n", + ") -> List[np.ndarray]:\n", + " \"\"\"Map the original model's weights to the functional model's weights.\n", + "\n", + " Args:\n", + " original_model:\n", + " Original model to reference the weight order.\n", + " functional_model: \n", + " Prepared model to updates weight of.\n", + " class_names: \n", + " Names of the classes that the original model was subclassed from\n", + "\n", + " Returns:\n", + " A list of the original model's weights in the order of the functional model's weights\n", + " \"\"\"\n", + "\n", + " # Make the original model's weights into a dictionary for quick lookup by name\n", + " # The original subclassed layers names are removed to match the new functional model's names\n", + " original_model_weights = {}\n", + " for weight in original_model.weights:\n", + " # pop out class_names of weight name\n", + " weight_name = weight.name\n", + " for class_name in class_names:\n", + " weight_name = weight_name.replace(class_name + '/', '')\n", + " original_model_weights[weight_name] = weight.numpy()\n", + "\n", + " # Get the functional model's weights in order as a dictionary for quick lookup where the key is the weight name\n", + " # and the position of the weight's order is the value\n", + " functional_model_weight_order = {\n", + " weight.name: position\n", + " for position, weight in enumerate(functional_model.weights)\n", + " }\n", + "\n", + " # Using the functional model's weights order, get the original model's weights in the same order. 
The lambda here\n", + " # uses the weight's name to get position in the functional model's weights order and the sorts the original model's\n", + " # weights by that position.\n", + " weights_in_correct_order = [\n", + " weight for _, weight in\n", + " sorted(original_model_weights.items(), key=lambda weight_info: functional_model_weight_order[weight_info[0]])\n", + " ]\n", + "\n", + " return weights_in_correct_order\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "assert functional_model.count_params() == model.count_params()\n", + "assert functional_model.input_shape == model.input_shape\n", + "assert functional_model.output_shape == model.output_shape\n", + "\n", + "# NOTE: Since TextClassification Model has the internal layers out of order compared to the call method,\n", + "# the weights are not in the order of what the actual architecture is (this is a Keras design).\n", + "# Therefore, we get the original model's weights and sort them in the order of the actual\n", + "# architecture and use those weights to compare to the functional model's weights.\n", + "model_weights_in_correct_order = get_original_models_weights_in_functional_model_order(\n", + " model, functional_model, class_names=[\"token_and_position_embedding\", \"transformer_block\"])\n", + "\n", + "for i, _ in enumerate(model_weights_in_correct_order):\n", + " np.testing.assert_array_equal(model_weights_in_correct_order[i], functional_model.get_weights()[i])\n", + "\n", + "np.testing.assert_array_equal(functional_model(random_input).numpy(), model(random_input).numpy())\n", + "print(\"Models are equal\")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Discussing the limitations of the Keras Model Preparer\n", + "\n", + "- The AIMET Keras ModelPreparer API is able to convert subclass layers that have arthmetic experssion in their call function.\n", + "However, this API and Keras, will convert these operations to TFOPLambda layers which are not currently supported by AIMET Keras Quantization API. \n", + "If possible, it is recommended to have the subclass layers call function ressemble the Keras Functional API layers.\n", + "For example, if a subclass layer has two convolution layers in its call function, the call function should look like\n", + "the following:\n", + "\n", + " ```python\n", + " def call(self, x, **kwargs):\n", + " x = self.conv_1(x)\n", + " x = self.conv_2(x)\n", + " return x\n", + " ```\n", + "\n", + "- If the model starts with a subclassed layer, the AIMET Keras ModelPreparer API will need an Keras Input Layer as input.\n", + "This is becuase the Keras Functional API requires an Input Layer as the first layer in the model. The AIMET Keras ModelPreparer API\n", + "will raise an exception if the model starts with a subclassed layer and an Input Layer is not provided as input." 
+ ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hopefully this notebook was useful for you to understand how to use the Keras Model Preparer.\n", + "\n", + "Few additional resources:\n", + "- [AIMET API Docs](https://quic.github.io/aimet-pages/releases/latest/user_guide/index.html)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + }, + "vscode": { + "interpreter": { + "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" + } + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/qat.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/qat.html new file mode 100644 index 00000000..fcae8c09 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/qat.html @@ -0,0 +1,1382 @@ + + + + + + Quantization-Aware Training — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantization-Aware Training

+

This notebook shows a working code example of how to use AIMET to perform Quantization-Aware Training (QAT). QAT is an AIMET feature that adds quantization simulation ops to a pre-trained model and uses a standard training pipeline to fine-tune the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.

+

The quantization parameters (such as encoding min/max/scale/offset) for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, while the quantization parameters are kept constant.

+
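For intuition, each inserted quantization simulation op can be thought of as a quantize-dequantize step applied to a tensor. The snippet below is a minimal NumPy sketch of that idea and is not AIMET's implementation; the encoding_min/encoding_max arguments stand in for the frozen activation encodings described above.

```python
import numpy as np

def fake_quantize(x, encoding_min, encoding_max, bitwidth=8):
    """Quantize-dequantize a tensor to mimic on-target integer behavior (illustrative only)."""
    num_steps = 2 ** bitwidth - 1
    scale = (encoding_max - encoding_min) / num_steps
    offset = np.round(encoding_min / scale)
    # Quantize: snap to the integer grid and clamp to the representable range
    q = np.clip(np.round(x / scale) - offset, 0, num_steps)
    # Dequantize: map back to float, carrying the rounding/clamping error forward
    return (q + offset) * scale

x = np.random.randn(4, 4).astype(np.float32)
x_sim = fake_quantize(x, encoding_min=float(x.min()), encoding_max=float(x.max()))
print(np.abs(x - x_sim).max())  # the quantization noise seen in the forward pass
```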
+

Overall flow

+

This notebook covers the following:
1. Instantiate the example evaluation and training pipeline
2. Load a pretrained FP32 model and determine the baseline FP32 accuracy
3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score
4. Fine-tune the quantization simulation model using QAT and evaluate the simulation model to get a post fine-tuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters like number of epochs are deliberately chosen to have the notebook execute more quickly.

  • +
+
+

Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline for QAT is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. For example, the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+
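If you decide to create such a reduced subset, one possible approach is sketched below. The source and destination paths and the 2-samples-per-class count are purely illustrative assumptions; this is not part of AIMET.

```python
import os
import random
import shutil

def make_subset(src_split_dir, dst_split_dir, samples_per_class=2):
    """Copy a few images per class folder into a small 'train'/'val' style subset."""
    for class_name in os.listdir(src_split_dir):
        src_class_dir = os.path.join(src_split_dir, class_name)
        if not os.path.isdir(src_class_dir):
            continue
        dst_class_dir = os.path.join(dst_split_dir, class_name)
        os.makedirs(dst_class_dir, exist_ok=True)
        files = os.listdir(src_class_dir)
        for fname in random.sample(files, min(samples_per_class, len(files))):
            shutil.copy(os.path.join(src_class_dir, fname), dst_class_dir)

# Hypothetical usage:
# make_subset('/data/imagenet/train', '/data/imagenet_subset/train')
# make_subset('/data/imagenet/val', '/data/imagenet_subset/val')
```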

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/imagenet_dir'        # Please replace this with a real directory
+BATCH_SIZE = 128
+IMAGE_SIZE = (224, 224)
+
+
+
+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow as tf
+
+
+
+
+
+

1. Load the dataset

+

We assign the training and validation datasets to dataset_train and dataset_valid, respectively.

+
+
[ ]:
+
+
+
dataset_train = tf.keras.preprocessing.image_dataset_from_directory(
+    directory=os.path.join(DATASET_DIR, "train"),
+    labels="inferred",
+    label_mode="categorical",
+    batch_size=BATCH_SIZE,
+    shuffle=True,
+    image_size=IMAGE_SIZE
+)
+dataset_valid = tf.keras.preprocessing.image_dataset_from_directory(
+    directory=os.path.join(DATASET_DIR, "val"),
+    labels="inferred",
+    label_mode="categorical",
+    batch_size=BATCH_SIZE,
+    shuffle=False,
+    image_size=IMAGE_SIZE
+)
+
+
+
+
+
+

2. Load a pretrained FP32 model

+

For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead.

+
+
[ ]:
+
+
+
from tensorflow.keras.applications.resnet import ResNet50
+
+model = ResNet50(weights="imagenet")
+model.compile(optimizer="adam", loss="categorical_crossentropy")
+
+
+
+
+
+

3. Determine the baseline FP32 accuracy

+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine.

+
+
[ ]:
+
+
+
model.evaluate(dataset_valid)
+
+
+
+
+
+

4. Create a QuantizationSim Model and determine quantized accuracy

+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating-point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers for inference and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers, and this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+
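To make the arithmetic behind folding concrete, here is a small NumPy sketch of the standard way BatchNormalization statistics can be absorbed into the preceding convolution's weights and bias. It is illustrative only; the AIMET call shown below performs the folding (and the graph rewiring) for you.

```python
import numpy as np

def fold_bn_into_conv(conv_w, conv_b, gamma, beta, moving_mean, moving_var, eps=1e-3):
    """Return folded (weight, bias) so that conv followed by BN equals the folded conv at inference.

    conv_w: (kh, kw, in_ch, out_ch) Keras Conv2D kernel
    conv_b: (out_ch,) bias (use zeros if the conv has no bias)
    eps:    BN epsilon (1e-3 is the Keras BatchNormalization default)
    """
    std = np.sqrt(moving_var + eps)
    scale = gamma / std                               # one scale factor per output channel
    folded_w = conv_w * scale.reshape(1, 1, 1, -1)    # scale each output channel of the kernel
    folded_b = (conv_b - moving_mean) * scale + beta  # shift the bias accordingly
    return folded_w, folded_b
```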

The following code calls AIMET to fold the BN layers of a given model. NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms
+
+_, model = fold_all_batch_norms(model)
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
  • quant_scheme: We set this to “QuantScheme.post_training_tf”. Other supported options for QAT are ‘tf_enhanced’ or ‘tf’, or the Quant Scheme Enum values QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced.
  • default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
  • default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+from aimet_common.defs import QuantScheme
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+
+
+
+
+

Compute Encodings

+

Although AIMET has wrapped the layers to act as being ‘quantized’, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ layer. For activation quantization layers, we need to pass unlabeled data samples through the model to collect range statistics, which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can just ignore the model output for this purpose. A few pointers regarding the data samples:

+

In practice, we need only a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples, but for computing encodings we only need 500 or 1000 samples. It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes are covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario in which only ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results. The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many ways; this is just an example.

+
+
[ ]:
+
+
+
from tensorflow.keras.utils import Progbar
+from tensorflow.keras.applications.resnet import preprocess_input
+
+def pass_calibration_data(sim_model, samples):
+    dataset = dataset_valid
+
+    progbar = Progbar(samples)
+
+    batch_cntr = 0
+    for inputs, _ in dataset:
+        sim_model(preprocess_input(inputs))
+
+        batch_cntr += 1
+        progbar_stat_update = \
+            batch_cntr * BATCH_SIZE if (batch_cntr * BATCH_SIZE) < samples else samples
+        progbar.update(progbar_stat_update)
+        if (batch_cntr * BATCH_SIZE) > samples:
+            break
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+
+

Compile the model

+

Configure the model for training and evaluation. The model must be compiled before evaluation.

+
+
[ ]:
+
+
+
sim.model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
+
+
+
+
+
+

Evaluate the performance of the quantized model

+

Next, we can evaluate the performance of the quantized model

+
+
[ ]:
+
+
+
sim.model.evaluate(dataset_valid)
+
+
+
+
+
+
+

5. Perform QAT

+

To perform quantization-aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyperparameters need to be tuned for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so (a possible schedule is sketched below). For the purpose of this example notebook, we are going to train only for 1 epoch, but feel free to change these parameters as you see fit.

+
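As a concrete illustration of that schedule, the sketch below wires the “divide by 10 every 5 epochs” heuristic into a standard Keras callback. The initial learning rate is an assumed placeholder; the cell that follows trains for just 1 epoch and does not use this callback.

```python
import tensorflow as tf

INITIAL_QAT_LR = 1e-5  # assumption: roughly the final LR used when the FP32 model was trained

def step_decay(epoch, lr):
    # Drop the learning rate by a factor of 10 every 5 epochs
    return INITIAL_QAT_LR * (0.1 ** (epoch // 5))

lr_schedule = tf.keras.callbacks.LearningRateScheduler(step_decay, verbose=1)
# For a longer QAT run, pass [lr_schedule, quantized_callback] to sim.model.fit(...)
```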
+
[ ]:
+
+
+
quantized_callback = tf.keras.callbacks.TensorBoard(log_dir="./log/quantized")
+history = sim.model.fit(dataset_train, epochs=1, validation_data=dataset_valid, callbacks=[quantized_callback])
+
+
+
+
+
+

6. Evaluate validation accuracy after QAT

+

Next, let’s evaluate the validation accuracy of our model after QAT

+
+
[ ]:
+
+
+
sim.model.evaluate(dataset_valid)
+
+
+
+
+
+

7. Export the encodings

+

Finally, let’s compute and export the encodings of the model after performing QAT. When comparing the encodings file generated by this step with the encodings computed before fine-tuning, there should be some differences. These differences are an artifact of QAT.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+sim.export('./data', 'model_after_qat')
+
+
+
+
+
+
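If you would like to sanity-check the exported artifacts, a quick way is to open the encodings file and look at the stored quantization parameters. The '.encodings' extension and JSON layout assumed below may vary between AIMET versions, so treat this purely as a sketch.

```python
import glob
import json

# sim.export above wrote its artifacts into './data'; look for an encodings file among them
for path in glob.glob('./data/*.encodings'):
    with open(path) as f:
        encodings = json.load(f)
    print(path, '->', list(encodings.keys()))  # peek at the top-level structure
```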

Summary

+

We hope this notebook was useful in helping you understand how to use AIMET to perform QAT.

+

A few additional resources:
  • Refer to the AIMET API docs for more details on the APIs and optional parameters
  • Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT with range-learning

+
+
+
+ + +
+
+
+ +
+ +
+


+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/qat.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/qat.ipynb new file mode 100644 index 00000000..2040a133 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/qat.ipynb @@ -0,0 +1,549 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "# Quantization-Aware Training\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform Quantization Aware Training(QAT). QAT is an AIMET feature adding quantization simulation ops to a pre-trained model and using a standard training pipeline to fine-tune the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.\n", + "\n", + "The quantization parameters(like encoding min/max/scale/offset) for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant.\n", + "\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load a pretrained FP32 model and determine the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score\n", + "4. Fine-tune the quantization simulation model using QAT and evaluate the simulation model to get a post fine-tuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters like number of epochs are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline for QAT is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. [https://image-net.org/challenges/LSVRC/2012/index.php#](https://image-net.org/challenges/LSVRC/2012/index.php#))\n", + "\n", + "**Note**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. 
the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/imagenet_dir' # Please replace this with a real directory\n", + "BATCH_SIZE = 128\n", + "IMAGE_SIZE = (224, 224)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow as tf" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 1. Load the dataset\n", + "\n", + "We defined a few utility functions and assign the training and validation dataset to `dataset_train` and `dataset_valid` respectively" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "dataset_train = dataset_valid = tf.keras.preprocessing.image_dataset_from_directory(\n", + " directory=os.path.join(DATASET_DIR, \"train\"),\n", + " labels=\"inferred\",\n", + " label_mode=\"categorical\",\n", + " batch_size=BATCH_SIZE,\n", + " shuffle=True,\n", + " image_size=IMAGE_SIZE\n", + ")\n", + "dataset_valid = tf.keras.preprocessing.image_dataset_from_directory(\n", + " directory=os.path.join(DATASET_DIR, \"val\"),\n", + " labels=\"inferred\",\n", + " label_mode=\"categorical\",\n", + " batch_size=BATCH_SIZE,\n", + " shuffle=False,\n", + " image_size=IMAGE_SIZE\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 2. Load a pretrained FP32 model\n", + "\n", + "For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from tensorflow.keras.applications.resnet import ResNet50\n", + "\n", + "model = ResNet50(weights=\"imagenet\")\n", + "model.compile(optimizer=\"adam\", loss=\"categorical_crossentropy\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 3. Determine the baseline FP32 accuracy\n", + "\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "model.evaluate(dataset_valid)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 4. 
Create a QuantizationSim Model and determine quantized accuracy\n", + "\n", + "### Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers of a given model.
\n", + "**NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "_, model = fold_all_batch_norms(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "\n", + "### Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf\"\n", + " - Other Supported options for QAT are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.quantsim import QuantizationSimModel\n", + "from aimet_common.defs import QuantScheme\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "### Compute Encodings\n", + "Even though AIMET has wrapped the layers to act as being 'quantized' but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' layer. For activation quantization layers, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "\n", + "In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. 
For computing encodings we only need 500 or 1000 samples.\n", + "It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from tensorflow.keras.utils import Progbar\n", + "from tensorflow.keras.applications.resnet import preprocess_input\n", + "\n", + "def pass_calibration_data(sim_model, samples):\n", + " dataset = dataset_valid\n", + "\n", + " progbar = Progbar(samples)\n", + "\n", + " batch_cntr = 0\n", + " for inputs, _ in dataset:\n", + " sim_model(preprocess_input(inputs))\n", + "\n", + " batch_cntr += 1\n", + " progbar_stat_update = \\\n", + " batch_cntr * BATCH_SIZE if (batch_cntr * BATCH_SIZE) < samples else samples\n", + " progbar.update(progbar_stat_update)\n", + " if (batch_cntr * BATCH_SIZE) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "### Compile the model\n", + "\n", + "Configure the model for training and evaluation. The model must be compiled before evaluation" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.model.compile(optimizer=\"adam\", loss=\"categorical_crossentropy\", metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "### Evaluate the performance of the quantized model\n", + "\n", + "Next, we can evaluate the performance of the quantized model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.model.evaluate(dataset_valid)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 5. Perform QAT\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyperparameters need to be searched for optimal results. 
Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + }, + "scrolled": true + }, + "outputs": [], + "source": [ + "quantized_callback = tf.keras.callbacks.TensorBoard(log_dir=\"./log/quantized\")\n", + "history = sim.model.fit(dataset_train, epochs=1, validation_data=dataset_valid, callbacks=[quantized_callback])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 6. Evaluate validation accuracy after QAT\n", + "\n", + "Next, let's evaluate the validation accuracy of our model after QAT" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.model.evaluate(dataset_valid)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 7. Export the encodings\n", + "\n", + "Finally, let's compute and export the encodings of the model after performing QAT. When comparing the encodings file generated by this step and the encodings generated before quantization, there should be some differences. These differences are an artifact of QAT." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)\n", + "sim.export('./data', 'model_after_qat')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Summary\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing QAT.\n", + "\n", + "Few additional resources\n", + "- Refer to the [AIMET API docs](https://quic.github.io/aimet-pages/AimetDocs/api_docs/index.html) to know more details of the APIs and optional parameters\n", + "- Refer to the [other example notebooks](https://github.com/quic/aimet/tree/develop/Examples/tensorflow/quantization/keras) to understand how to use AIMET post-training quantization techniques and QAT with range-learning" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + }, + "vscode": { + "interpreter": { + "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" + } + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/qat_range_learning.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/qat_range_learning.html new file mode 100644 index 00000000..f62d5e4b --- /dev/null +++ 
b/releases/1.32.2/Examples/tensorflow/quantization/keras/qat_range_learning.html @@ -0,0 +1,1388 @@ + + + + + + Quantization-Aware Training with Range Learning — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantization-Aware Training with Range Learning

+

This notebook shows a working code example of how to use AIMET to perform Quantization-Aware Training (QAT) with range-learning. QAT with range-learning is an AIMET feature that adds quantization simulation ops to a pre-trained model and uses a standard training pipeline to fine-tune both the model and the quantization parameters for a few epochs. While QAT fine-tunes only the model parameters, QAT with range-learning also learns the encoding min/max of parameter quantizers (hence the name range-learning). The resulting model should show improved accuracy on quantized ML accelerators.

+

The quantization parameters (such as encoding min/max/scale/offset) for activations are computed once initially. During QAT, both the model weights and the quantization parameters are jointly updated to minimize the effects of quantization in the forward pass.

+
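Conceptually, range-learning turns the encoding boundaries themselves into trainable variables so that gradients flow into them alongside the weights. The layer below is a minimal TensorFlow sketch of that idea, not AIMET's implementation; the initial min/max values are arbitrary placeholders.

```python
import tensorflow as tf

class LearnableFakeQuant(tf.keras.layers.Layer):
    """Illustrative only: a quantize-dequantize op whose encoding min/max are trainable."""

    def __init__(self, init_min=-6.0, init_max=6.0, bitwidth=8, **kwargs):
        super().__init__(**kwargs)
        self.encoding_min = tf.Variable(init_min, trainable=True, name="encoding_min")
        self.encoding_max = tf.Variable(init_max, trainable=True, name="encoding_max")
        self.bitwidth = bitwidth

    def call(self, inputs):
        # fake_quant_with_min_max_vars defines gradients with respect to min/max,
        # which is what lets the ranges be learned during fine-tuning.
        return tf.quantization.fake_quant_with_min_max_vars(
            inputs, self.encoding_min, self.encoding_max, num_bits=self.bitwidth)
```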
+

Overall flow

+

This notebook covers the following:
1. Instantiate the example evaluation and training pipeline
2. Load a pretrained FP32 model and determine the baseline FP32 accuracy
3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score
4. Fine-tune the quantization simulation model using QAT with range-learning and evaluate the simulation model to get a post fine-tuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters like number of epochs are deliberately chosen to have the notebook execute more quickly.

  • +
+
+

Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline for QAT with range-learning is written? Yes, there is a limitation on the training pipeline due to restrictions of Keras for range-learning. You cannot use a custom training loop to do QAT with range-learning; doing so would prevent the encoding min/max from updating during training. Instead, the only way to achieve range-learning is to:

    +
      +
    1. Compile the quantization simulation model directly with sim.compile

    2. +
    3. Run QAT directly on the simulation model with sim.fit

    4. +
    +
  • +
  • Does AIMET put any limitation on the interface of evaluate() or train() methods for QAT with range-learning? Only on the train method. You should be able to use your existing evaluation routine as-is, but there is a restriction on training, as mentioned above.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. For example, the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dir'        # Please replace this with a real directory
+BATCH_SIZE = 128
+IMAGE_SIZE = (224, 224)
+
+
+
+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow as tf
+
+
+
+
+
+

1. Load the dataset

+

We assign the training and validation datasets to dataset_train and dataset_valid, respectively.

+
+
[ ]:
+
+
+
dataset_train = tf.keras.preprocessing.image_dataset_from_directory(
+    directory=os.path.join(DATASET_DIR, "train"),
+    labels="inferred",
+    label_mode="categorical",
+    batch_size=BATCH_SIZE,
+    shuffle=True,
+    image_size=IMAGE_SIZE
+)
+dataset_valid = tf.keras.preprocessing.image_dataset_from_directory(
+    directory=os.path.join(DATASET_DIR, "val"),
+    labels="inferred",
+    label_mode="categorical",
+    batch_size=BATCH_SIZE,
+    shuffle=False,
+    image_size=IMAGE_SIZE
+)
+
+
+
+
+
+

2. Load a pretrained FP32 model

+

For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead.

+
+
[ ]:
+
+
+
from tensorflow.keras.applications.resnet import ResNet50
+
+model = ResNet50(weights='imagenet')
+model.compile(optimizer="adam", loss="categorical_crossentropy")
+
+
+
+
+
+

3. Determine the baseline FP32 accuracy

+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine.

+
+
[ ]:
+
+
+
model.evaluate(dataset_valid)
+
+
+
+
+
+

4. Create a QuantizationSim Model and determine quantized accuracy

+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating-point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers for inference and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers, and this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+

The following code calls AIMET to fold the BN layers of a given model. NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms
+
+_, model = fold_all_batch_norms(model)
+
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
  • quant_scheme: We set this to “training_range_learning_with_tf_init”. This is the key setting that enables “range learning”. With this choice of quant scheme, AIMET will use the TF quant scheme to initialize the quantization parameters like scale/offset, and those parameters are then set to be trainable so they can continue to be updated during fine-tuning. Another choice for quant_scheme is “training_range_learning_with_tf_enhanced_init”; it is similar to the above, but the initialization for scale/offset is done using the TF Enhanced scheme. Since in both schemes the quantization parameters are set to be trainable, there is not much benefit to using this choice instead of “training_range_learning_with_tf_init”.
  • default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
  • default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+from aimet_common.defs import QuantScheme
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.training_range_learning_with_tf_init,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+
+
+
+

Compute Encodings

+

Although AIMET has wrapped the layers to act as being ‘quantized’, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ layer. For activation quantization layers, we need to pass unlabeled data samples through the model to collect range statistics, which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can just ignore the model output for this purpose. A few pointers regarding the data samples:

+

In practice, we need only a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples, but for computing encodings we only need 500 or 1000 samples. It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes are covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario in which only ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results. The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many ways; this is just an example.

+
+
[ ]:
+
+
+
from tensorflow.keras.utils import Progbar
+from tensorflow.keras.applications.resnet import preprocess_input
+
+def pass_calibration_data(sim_model, samples):
+    dataset = dataset_valid
+    progbar = Progbar(samples)
+
+    batch_cntr = 0
+    for inputs, _ in dataset:
+        sim_model(preprocess_input(inputs))
+
+        batch_cntr += 1
+        progbar_stat_update = \
+            batch_cntr * BATCH_SIZE if (batch_cntr * BATCH_SIZE) < samples else samples
+        progbar.update(progbar_stat_update)
+        if (batch_cntr * BATCH_SIZE) > samples:
+            break
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+
+

Compile the model

+

Configure the model for training and evaluation. The model must be compiled before evaluation.

+
+
[ ]:
+
+
+
sim.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
+
+
+
+
+
+

Evaluate the performance of the quantized model

+

Next, we can evaluate the performance of the quantized model

+
+
[ ]:
+
+
+
sim.evaluate(dataset_valid)
+
+
+
+
+
+
+

5. Perform QAT

+

To perform quantization-aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyperparameters need to be tuned for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so. For the purpose of this example notebook, we are going to train only for 1 epoch, but feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
quantized_callback = tf.keras.callbacks.TensorBoard(log_dir="./log/quantized")
+history = sim.fit(dataset_train, epochs=1, validation_data=dataset_valid, callbacks=[quantized_callback])
+
+
+
+
+
+

6. Evaluate validation accuracy after QAT

+

Next, let’s evaluate the validation accuracy of our model after QAT

+
+
[ ]:
+
+
+
sim.evaluate(dataset_valid)
+
+
+
+
+
+

7. Export the encodings

+

Finally, let’s compute and export the encodings of the model after performing QAT. When comparing the encodings file generated by this step with the encodings computed before fine-tuning, there should be some differences. These differences are an artifact of QAT.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+sim.export('./data', 'model_after_qat')
+
+
+
+
+
+

Summary

+

We hope this notebook was useful in helping you understand how to use AIMET to perform QAT with range-learning.

+

A few additional resources:
  • Refer to the AIMET API docs for more details on the APIs and optional parameters
  • Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and vanilla QAT (without range-learning)

+
+
+
+ + +
+
+
+ +
+ +
+


+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/qat_range_learning.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/qat_range_learning.ipynb new file mode 100644 index 00000000..064f7d84 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/qat_range_learning.ipynb @@ -0,0 +1,551 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "# Quantization-Aware Training with Range Learning\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform Quantization-Aware Training(QAT) with range-learning. QAT with range-learning is an AIMET feature adding quantization simulation ops to a pre-trained model and using a standard training pipeline to fine-tune both the model and quantization parameters for a few epochs. While QAT fine-tunes only the model parameters, QAT with range-learning also learns encoding min/max of parameter quantizers(hence the name range-learning). The resulting model should show improved accuracy on quantized ML accelerators.\n", + "\n", + "The quantization parameters(like encoding min/max/scale/offset) for activations are computed once initially. During QAT, both the model weights and quantization parameters are jointly updated to minimize the effects of quantization in the forward pass.\n", + "\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load a pretrained FP32 model and determine the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score\n", + "4. Fine-tune the quantization simulation model using QAT with range-learning and evaluate the simulation model to get a post fine-tuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters like number of epochs are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline for QAT with range-learning is written?** **_Yes_**, there is limitation only on the training pipeline due to restrictions of keras for range-learning. You cannot use a custom training loop to do QAT with range-learning. Doing so would prevent the encoding min/max from updating during training. Instead, the only way to achieve range-learning is to:\n", + " 1. Compile the quantization simulation model directly with `sim.compile`\n", + " 2. Run QAT directly on the simulation model with `sim.fit`\n", + "\n", + "- **Does AIMET put any limitation on the interface of evaluate() or train() methods for QAT with range-learning?** Only on the train method. 
You should be able to use your existing evaluation routine as-is, but there is a restriction on training as mentioned above\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. [https://image-net.org/challenges/LSVRC/2012/index.php#](https://image-net.org/challenges/LSVRC/2012/index.php#))\n", + "\n", + "**Note**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dir' # Please replace this with a real directory\n", + "BATCH_SIZE = 128\n", + "IMAGE_SIZE = (224, 224)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow as tf" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 1. Load the dataset\n", + "\n", + "We defined a few utility functions and assign the training and validation dataset to `dataset_train` and `dataset_valid` respectively" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "dataset_train = dataset_valid = tf.keras.preprocessing.image_dataset_from_directory(\n", + " directory=os.path.join(DATASET_DIR, \"train\"),\n", + " labels=\"inferred\",\n", + " label_mode=\"categorical\",\n", + " batch_size=BATCH_SIZE,\n", + " shuffle=True,\n", + " image_size=IMAGE_SIZE\n", + ")\n", + "dataset_valid = tf.keras.preprocessing.image_dataset_from_directory(\n", + " directory=os.path.join(DATASET_DIR, \"val\"),\n", + " labels=\"inferred\",\n", + " label_mode=\"categorical\",\n", + " batch_size=BATCH_SIZE,\n", + " shuffle=False,\n", + " image_size=IMAGE_SIZE\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 2. Load a pretrained FP32 model\n", + "\n", + "For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from tensorflow.keras.applications.resnet import ResNet50\n", + "\n", + "model = ResNet50(weights='imagenet')\n", + "model.compile(optimizer=\"adam\", loss=\"categorical_crossentropy\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 3. Determine the baseline FP32 accuracy\n", + "\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "model.evaluate(dataset_valid)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 4. Create a QuantizationSim Model and determine quantized accuracy\n", + "\n", + "### Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers of a given model.
\n", + "**NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "_, model = fold_all_batch_norms(model)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"training_range_learning_with_tf_init\"\n", + " - This is the key setting that enables \"range learning\". With this choice of quant scheme, AIMET will use the TF quant scheme to initialize the quantization parameters like scale/offset. And then those parameters are set to be trainable so they can continue to be updated during fine-tuning.\n", + " - Another choice for quant_scheme is \"training_range_learning_with_tf_enhanced_init\". Similar to the above, but the initialization for scale/offset is doing using the TF Enhanced scheme. Since in both schemes the quantization parameters are set to be trainable, there is not much benefit to using this choice instead of \"training_range_learning_with_tf_init.\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.quantsim import QuantizationSimModel\n", + "from aimet_common.defs import QuantScheme\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.training_range_learning_with_tf_init,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "### Compute Encodings\n", + "Even though AIMET has wrapped the layers to act as being 'quantized' but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' layer. For activation quantization layers, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. 
This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "\n", + "In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "from tensorflow.keras.utils import Progbar\n", + "from tensorflow.keras.applications.resnet import preprocess_input\n", + "\n", + "def pass_calibration_data(sim_model, samples):\n", + " dataset = dataset_valid\n", + " progbar = Progbar(samples)\n", + "\n", + " batch_cntr = 0\n", + " for inputs, _ in dataset:\n", + " sim_model(preprocess_input(inputs))\n", + "\n", + " batch_cntr += 1\n", + " progbar_stat_update = \\\n", + " batch_cntr * BATCH_SIZE if (batch_cntr * BATCH_SIZE) < samples else samples\n", + " progbar.update(progbar_stat_update)\n", + " if (batch_cntr * BATCH_SIZE) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "### Compile the model\n", + "\n", + "Configure the model for training and evaluation. The model must be compiled before evaluation." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.compile(optimizer=\"adam\", loss=\"categorical_crossentropy\", metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "### Evaluate the performance of the quantized model\n", + "\n", + "Next, we can evaluate the performance of the quantized model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.evaluate(dataset_valid)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 5. Perform QAT\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyperparameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "quantized_callback = tf.keras.callbacks.TensorBoard(log_dir=\"./log/quantized\")\n", + "history = sim.fit(dataset_train, epochs=1, validation_data=dataset_valid, callbacks=[quantized_callback])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 6. Evaluate validation accuracy after QAT\n", + "\n", + "Next, let's evaluate the validation accuracy of our model after QAT" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.evaluate(dataset_valid)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## 7. Export the encodings\n", + "\n", + "Finally, let's compute and export the encodings of the model after performing QAT. When comparing the encodings file generated by this step and the encodings generated before quantization, there should be some differences. These differences are an artifact of QAT." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)\n", + "sim.export('./data', 'model_after_qat')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "source": [ + "## Summary\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing QAT with range-learning.\n", + "\n", + "Few additional resources\n", + "- Refer to the [AIMET API docs](https://quic.github.io/aimet-pages/AimetDocs/api_docs/index.html) to know more details of the APIs and optional parameters\n", + "- Refer to the [other example notebooks](https://github.com/quic/aimet/tree/develop/Examples/tensorflow/quantization/keras) to understand how to use AIMET post-training quantization techniques and vanilla QAT(without range-learning)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + }, + "vscode": { + "interpreter": { + "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" + } + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/quant_analyzer.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/quant_analyzer.html new file mode 100644 index 00000000..cf2d131d --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/quant_analyzer.html @@ -0,0 +1,1408 @@ + + + + + + Quant Analyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quant Analyzer

+

This notebook showcases a working code example of how to use AIMET to apply Quant Analyzer. Quant Analyzer is a feature which performs various analyses on a model to understand how each layer in the model responds to quantization.

+
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation pipeline 2. Load the FP32 model 3. Apply QuantAnalyzer to the model

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results.

  • +
  • For example, it uses a relatively quantization-friendly model like Resnet50.

  • +
  • Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, use that. Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. For example, the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class, but for the purpose of running this notebook you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.
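If you do choose to create such a subset, the following is a minimal sketch of one way to do it. It assumes the standard ImageNet directory layout (one sub-folder per class); the function name and paths are placeholders, not part of AIMET.

import os
import shutil

def make_reduced_subset(src_dir: str, dst_dir: str, samples_per_class: int = 2) -> None:
    # Copy the first few images of every class folder into a reduced dataset directory.
    for cls in sorted(os.listdir(src_dir)):
        src_cls = os.path.join(src_dir, cls)
        dst_cls = os.path.join(dst_dir, cls)
        os.makedirs(dst_cls, exist_ok=True)
        for fname in sorted(os.listdir(src_cls))[:samples_per_class]:
            shutil.copy(os.path.join(src_cls, fname), os.path.join(dst_cls, fname))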

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dir/'       # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written?

    +

    Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.

    +
  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods?

    +

    Not really. You should be able to use your existing evaluate and train routines as-is.

    +
  • +
+
+
[ ]:
+
+
+
import tensorflow as tf
+
+from typing import Optional
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset
+
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for getting ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataset(batch_size: Optional[int] = None) -> tf.data.Dataset:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        :return: A tensorflow dataset
+        """
+        if batch_size is None:
+            batch_size = image_net_config.evaluation['batch_size']
+
+        dataset = ImageNetDataset(DATASET_DIR,
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=batch_size)
+
+        return dataset
+
+
+
+
+
+
+

2. Load a pretrained FP32 model

+

For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead.

+
+
[ ]:
+
+
+
from tensorflow.keras.applications.resnet import ResNet50
+
+model = ResNet50(weights='imagenet')
+
+
+
+
+
+
+

3. Apply QuantAnalyzer to the model

+

QuantAnalyzer requires two functions to be defined by the user for passing data through the model:

+

Forward pass callback

+

One function will be used to pass representative data through a quantized version of the model to calibrate quantization parameters. This function should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metrics, so we can just ignore the model output.

+

The function must take two arguments, the first of which will be the model to run the forward pass on. The second argument can be anything additional which the function requires to run, and can be in the form of a single item or a tuple of items.

+

If no additional argument is needed, the user can specify a dummy “_” parameter for the function.

+

A few pointers regarding the forward pass data samples:

+
    +
  • In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.

  • +
  • It may be beneficial if the samples used for computing encoding are well distributed. It’s not necessary that all classes need to be covered since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results.

  • +
+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many ways; this is just an example. This function only requires unlabeled data as no loss or other evaluation metric is needed.

+
+
[ ]:
+
+
+
val_dataset = ImageNetDataPipeline.get_val_dataset()
+unlabeled_dataset = val_dataset.dataset
+
+def pass_calibration_data(sim_model: tf.keras.Model, _) -> None:
+    batch_size = val_dataset.batch_size
+    samples = 1000
+
+    sampled_dataset = unlabeled_dataset.take(samples // batch_size)
+    _ = sim_model.predict(sampled_dataset)
+
+
+
+

In order to pass this function to QuantAnalyzer, we need to wrap it in a CallbackFunc object, as shown below. The CallbackFunc takes two arguments: the callback function itself, and the inputs to pass into the callback function.

+
+
[ ]:
+
+
+
from aimet_common.utils import CallbackFunc
+
+forward_pass_callback = CallbackFunc(pass_calibration_data)
+
+
+
+
+
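If the calibration routine did need an extra input, it could be supplied as the second argument to CallbackFunc. The variant below is a hypothetical sketch (it is not used elsewhere in this notebook) that takes the number of calibration samples as that extra input.

def pass_calibration_data_n(sim_model: tf.keras.Model, samples: int) -> None:
    # Same idea as pass_calibration_data above, but the sample count is passed in.
    batch_size = val_dataset.batch_size
    sampled_dataset = unlabeled_dataset.take(samples // batch_size)
    _ = sim_model.predict(sampled_dataset)

forward_pass_callback_1000 = CallbackFunc(pass_calibration_data_n, 1000)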

Evaluation callback

+

The second function will be used to evaluate the model, and needs to return an accuracy metric. Here, the user may pass as much data through the model as they would like to use when evaluating the model’s accuracy.

+

Like the forward pass callback, this function also must take exactly two arguments: the model to evaluate, and any additional argument needed for the function to work. The second argument can be a tuple of items in case multiple items are needed.

+

Like the forward pass callback, we need to wrap the evaluation callback in a CallbackFunc object as well.

+
+
[ ]:
+
+
+
def eval_func(model: tf.keras.Model, _) -> float:
+    model.compile(optimizer=tf.keras.optimizers.Adam(),
+                  loss=tf.keras.losses.CategoricalCrossentropy(),
+                  metrics=tf.keras.metrics.CategoricalAccuracy())
+
+    _, acc = model.evaluate(val_dataset.dataset)
+    return acc
+
+eval_callback = CallbackFunc(eval_func)
+
+
+
+
+

Enabling MSE loss per layer analysis

+

An optional analysis step in QuantAnalyzer calculates the MSE loss per layer in the model, comparing the layer outputs from the original FP32 model vs. a quantized model. To perform this step, the user needs to also provide an unlabeled Dataset to QuantAnalyzer.

+

We will demonstrate this step by using the unlabeled dataset created from the ImageNetDataset imported above.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.quant_analyzer import QuantAnalyzer
+
+quant_analyzer = QuantAnalyzer(model, forward_pass_callback, eval_callback)
+
+
+
+

To enable the MSE loss analysis, we set the following:

+
+
[ ]:
+
+
+
quant_analyzer.enable_per_layer_mse_loss(unlabeled_dataset=unlabeled_dataset, num_batches=4)
+
+
+
+

Finally, to start the analyzer, we call .analyze().

+

A few of the parameters are explained here: - quant_scheme: We set this to “post_training_tf_enhanced”. With this choice of quant scheme, AIMET will use the TF Enhanced quant scheme to initialize the quantization parameters like scale/offset. - default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision. - default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+

When you call the analyze method, the following analyses are run:

+
    +
  • Compare fp32 accuracy, accuracy with only parameters quantized, and accuracy with only activations quantized

  • +
  • For each layer, track the model accuracy when quantization for all other layers is disabled (enabling quantization for only one layer in the model at a time)

  • +
  • For each layer, track the model accuracy when quantization for all other layers is enabled (disabling quantization for only one layer in the model at a time)

  • +
  • Track the minimum and maximum encoding parameters calculated by each quantizer in the model as a result of forward passes through the model with representative data

  • +
  • When the TF Enhanced quantization scheme is used, track the histogram of tensor ranges seen by each quantizer in the model as a result of forward passes through the model with representative data

  • +
  • If enabled, track the MSE loss seen at each layer by comparing layer outputs of the original fp32 model vs. a quantized model

  • +
+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+
+quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
+                       default_param_bw=8,
+                       default_output_bw=8,
+                       config_file=None,
+                       results_dir="./tmp/")
+
+
+
+

AIMET will also output .html plots and json files where appropriate for each analysis to help visualize the data.

+

The following output files will be produced in a folder specified by the user. The output directory structure will look like the below:

+
results_dir
+|-- per_layer_quant_enabled.html
+|-- per_layer_quant_enabled.json
+|-- per_layer_quant_disabled.html
+|-- per_layer_quant_disabled.json
+|-- min_max_ranges
+|   |-- activations.html
+|   |-- activations.json
+|   |-- weights.html
+|   +-- weights.json
+|-- activations_pdf
+|   |-- name_{input/output}_{index_0}.html
+|   |-- name_{input/output}_{index_1}.html
+|   |-- ...
+|   +-- name_{input/output}_{index_N}.html
+|-- weights_pdf
+|   |-- layer1
+|   |   |-- param_name_{channel_index_0}.html
+|   |   |-- param_name_{channel_index_1}.html
+|   |   |-- ...
+|   |   +-- param_name_{channel_index_N}.html
+|   |-- layer2
+|   |   |-- param_name_{channel_index_0}.html
+|   |   |-- param_name_{channel_index_1}.html
+|   |   |-- ...
+|   |   +-- param_name_{channel_index_N}.html
+|   |-- ...
+|   |-- layerN
+|   |   |-- param_name_{channel_index_0}.html
+|   |   |-- param_name_{channel_index_1}.html
+|   |   |-- ...
+|   +-- +-- param_name_{channel_index_N}.html
+|-- per_layer_mse_loss.html
++-- per_layer_mse_loss.json
+
+
+
+
+
+

Per-layer analysis by enabling/disabling quantization wrappers

+
    +
  • per_layer_quant_enabled.html: A plot with layers on the x-axis and model accuracy on the y-axis, where each layer’s accuracy represents the model accuracy when all quantizers in the model are disabled except for that layer’s parameter and activation quantizers.

  • +
  • per_layer_quant_enabled.json: A json file containing the data shown in per_layer_quant_enabled.html, associating layer names with model accuracy (see the sketch after this list).

  • +
  • per_layer_quant_disabled.html: A plot with layers on the x-axis and model accuracy on the y-axis, where each layer’s accuracy represents the model accuracy when all quantizers in the model are enabled except for that layer’s parameter and activation quantizers.

  • +
  • per_layer_quant_disabled.json: A json file containing the data shown in per_layer_quant_disabled.html, associating layer names with model accuracy.

  • +
+
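The json files above are easy to inspect programmatically. The following is a minimal sketch; it assumes the results_dir of “./tmp/” passed to analyze() above and the layer-name-to-accuracy mapping described in this list.

import json

with open("./tmp/per_layer_quant_enabled.json") as f:
    per_layer_quant_enabled = json.load(f)

# With only one layer quantized at a time, the layers showing the lowest accuracy
# are the ones most sensitive to quantization.
most_sensitive = sorted(per_layer_quant_enabled.items(), key=lambda kv: kv[1])[:5]
print(most_sensitive)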

per_layer_quant_enabled.html

+
+
+

Encoding min/max ranges

+
    +
  • min_max_ranges: A folder containing the following sets of files:

    +
      +
    • activations.html: A plot with output activations on the x-axis and min-max values on the y-axis, where each output activation’s range represents the encoding min and max parameters computed during forward pass calibration (explained below).

    • +
    • activations.json: A json file containing the data shown in activations.html, associating layer names with min and max encoding values.

    • +
    • weights.html: A plot with parameter names on the x-axis and min-max values on the y-axis, where each parameter’s range represents the encoding min and max parameters computed during forward pass calibration.

    • +
    • weights.json: A json file containing the data shown in weights.html, associating parameter names with min and max encoding values.

    • +
    +
  • +
+

min_max_ranges.html

+
+
+

PDF of statistics

+
    +
  • (If TF Enhanced quant scheme is used) activations_pdf: A folder containing html files for each layer, plotting the histogram of tensor values seen for that layer’s output activation seen during forward pass calibration.

  • +
  • (If TF Enhanced quant scheme is used) weights_pdf: A folder containing sub folders for each layer with weights. Each layer’s folder contains html files for each parameter of that layer, with a histogram plot of tensor values seen for that parameter seen during forward pass calibration.

  • +
+

weights_pdf.html

+
+
+

Per-layer MSE loss

+
    +
  • (Optional, if per layer MSE loss is enabled) per_layer_mse_loss.html: A plot with layers on the x-axis and MSE loss on the y-axis, where each layer’s MSE loss represents the MSE seen comparing that layer’s outputs in the FP32 model vs. the quantized model.

  • +
  • (Optional, if per layer MSE loss is enabled) per_layer_mse_loss.json: A json file containing the data shown in per_layer_mse_loss.html, associating layer names with MSE loss.

  • +
+

per_layer_mse_loss.html

+
+
+ + +
+
+
+ +
+ +
+


+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/quant_analyzer.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/quant_analyzer.ipynb new file mode 100644 index 00000000..c543984a --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/quant_analyzer.ipynb @@ -0,0 +1,459 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "# Quant Analyzer\n", + "\n", + "This notebook showcases a working code example of how to use AIMET to apply Quant Analyzer.\n", + "Quant Analyzer is a feature which performs various analyses on a model to understand how each layer in the model responds to quantization.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation pipeline\n", + "2. Load the FP32 model\n", + "3. Apply QuantAnalyzer to the model\n", + "\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results.\n", + "* For example, it uses a relatively quantization-friendly model like Resnet50.\n", + "* Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. [https://image-net.org/challenges/LSVRC/2012/index.php#](https://image-net.org/challenges/LSVRC/2012/index.php#))\n", + "\n", + "**Note**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?**\n", + "\n", + " Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model.\n", + " This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?**\n", + "\n", + " Not really. You should be able to use your existing evaluate and train routines as-is." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "from typing import Optional\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset\n", + "\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for getting ImageNet Dataset.\n", + " \"\"\"\n", + "\n", + " @staticmethod\n", + " def get_val_dataset(batch_size: Optional[int] = None) -> tf.data.Dataset:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " :return: A tensorflow dataset\n", + " \"\"\"\n", + " if batch_size is None:\n", + " batch_size = image_net_config.evaluation['batch_size']\n", + "\n", + " dataset = ImageNetDataset(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=batch_size)\n", + "\n", + " return dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 2. Load a pretrained FP32 model\n", + "\n", + "For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from tensorflow.keras.applications.resnet import ResNet50\n", + "\n", + "model = ResNet50(weights='imagenet')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 3. Apply QuantAnalyzer to the model\n", + "\n", + "QuantAnalyzer requires two functions to be defined by the user for passing data through the model:\n", + "\n", + "**Forward pass callback**\n", + "\n", + "One function will be used to pass representative data through a quantized version of the model to calibrate quantization parameters.\n", + "This function should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model.\n", + "We don't need to compute any loss metrics, so we can just ignore the model output.\n", + "\n", + "The function **must** take two arguments, the first of which will be the model to run the forward pass on.\n", + "The second argument can be anything additional which the function requires to run, and can be in the form of a single item or a tuple of items.\n", + "\n", + "If no additional argument is needed, the user can specify a dummy \"_\" parameter for the function.\n", + "\n", + "A few pointers regarding the forward pass data samples:\n", + "\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings.\n", + " For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed.\n", + " It's not necessary that all classes need to be covered since we are only looking at the range of values at every layer activation.\n", + " However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. 
only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings.\n", + "This routine can be written in many ways; this is just an example.\n", + "This function only requires unlabeled data as no loss or other evaluation metric is needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "val_dataset = ImageNetDataPipeline.get_val_dataset()\n", + "unlabeled_dataset = val_dataset.dataset\n", + "\n", + "def pass_calibration_data(sim_model: tf.keras.Model, _) -> None:\n", + " batch_size = val_dataset.batch_size\n", + " samples = 1000\n", + "\n", + " sampled_dataset = unlabeled_dataset.take(samples // batch_size)\n", + " _ = sim_model.predict(sampled_dataset)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "In order to pass this function to QuantAnalyzer, we need to wrap it in a CallbackFunc object, as shown below.\n", + "The CallbackFunc takes two arguments: the callback function itself, and the inputs to pass into the callback function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_common.utils import CallbackFunc\n", + "\n", + "forward_pass_callback = CallbackFunc(pass_calibration_data)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "**Evaluation callback**\n", + "\n", + "The second function will be used to evaluate the model, and needs to return an accuracy metric.\n", + "In here, the user should pass any amount of data through the model which they would like when evaluating their model for accuracy.\n", + "\n", + "Like the forward pass callback, this function also must take exactly two arguments: the model to evaluate, and any additional argument needed for the function to work.\n", + "The second argument can be a tuple of items in case multiple items are needed.\n", + "\n", + "Like the forward pass callback, we need to wrap the evaluation callback in a CallbackFunc object as well." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "def eval_func(model: tf.keras.Model, _) -> float:\n", + " model.compile(optimizer=tf.keras.optimizers.Adam(),\n", + " loss=tf.keras.losses.CategoricalCrossentropy(),\n", + " metrics=tf.keras.metrics.CategoricalAccuracy())\n", + "\n", + " _, acc = model.evaluate(val_dataset.dataset)\n", + " return acc\n", + "\n", + "eval_callback = CallbackFunc(eval_func)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "**Enabling MSE loss per layer analysis**\n", + "\n", + "An optional analysis step in QuantAnalyzer calculates the MSE loss per layer in the model, comparing the layer outputs from the original FP32 model vs. a quantized model.\n", + "To perform this step, the user needs to also provide an unlabeled Dataset to QuantAnalyzer.\n", + "\n", + "We will demonstrate this step by using the ImageNetDataLoader imported above." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.quant_analyzer import QuantAnalyzer\n", + "\n", + "quant_analyzer = QuantAnalyzer(model, forward_pass_callback, eval_callback)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "To enable the MSE loss analysis, we set the following:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "quant_analyzer.enable_per_layer_mse_loss(unlabeled_dataset=unlabeled_dataset, num_batches=4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "Finally, to start the analyzer, we call .analyze().\n", + "\n", + "A few of the parameters are explained here:\n", + "- **quant_scheme**:\n", + " - We set this to \"post_training_tf_enhanced\"\n", + " With this choice of quant scheme, AIMET will use the TF Enhanced quant scheme to initialize the quantization parameters like scale/offset.\n", + "- **default_output_bw**: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.\n", + "- **default_param_bw**: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.\n", + "\n", + "There are other parameters that are set to default values in this example.\n", + "Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "When you call the analyze method, the following analyses are run:\n", + "\n", + "- Compare fp32 accuracy, accuracy with only parameters quantized, and accuracy with only activations quantized\n", + "- For each layer, track the model accuracy when quantization for all other layers is disabled (enabling quantization for only one layer in the model at a time)\n", + "- For each layer, track the model accuracy when quantization for all other layers is enabled (disabling quantization for only one layer in the model at a time)\n", + "- Track the minimum and maximum encoding parameters calculated by each quantizer in the model as a result of forward passes through the model with representative data\n", + "- When the TF Enhanced quantization scheme is used, track the histogram of tensor ranges seen by each quantizer in the model as a result of forward passes through the model with representative data\n", + "- If enabled, track the MSE loss seen at each layer by comparing layer outputs of the original fp32 model vs. 
a quantized model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "\n", + "quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " default_param_bw=8,\n", + " default_output_bw=8,\n", + " config_file=None,\n", + " results_dir=\"./tmp/\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "AIMET will also output .html plots and json files where appropriate for each analysis to help visualize the data.\n", + "\n", + "The following output files will be produced, in a folder specified by the user:\n", + "Output directory structure will be like below\n", + "\n", + "```\n", + "results_dir\n", + "|-- per_layer_quant_enabled.html\n", + "|-- per_layer_quant_enabled.json\n", + "|-- per_layer_quant_disabled.html\n", + "|-- per_layer_quant_disabled.json\n", + "|-- min_max_ranges\n", + "| |-- activations.html\n", + "| |-- activations.json\n", + "| |-- weights.html\n", + "| +-- weights.json\n", + "|-- activations_pdf\n", + "| |-- name_{input/output}_{index_0}.html\n", + "| |-- name_{input/output}_{index_1}.html\n", + "| |-- ...\n", + "| +-- name_{input/output}_{index_N}.html\n", + "|-- weights_pdf\n", + "| |-- layer1\n", + "| | |-- param_name_{channel_index_0}.html\n", + "| | |-- param_name_{channel_index_1}.html\n", + "| | |-- ...\n", + "| | +-- param_name_{channel_index_N}.html\n", + "| |-- layer2\n", + "| | |-- param_name_{channel_index_0}.html\n", + "| | |-- param_name_{channel_index_1}.html\n", + "| | |-- ...\n", + "| | +-- param_name_{channel_index_N}.html\n", + "| |-- ...\n", + "| |-- layerN\n", + "| | |-- param_name_{channel_index_0}.html\n", + "| | |-- param_name_{channel_index_1}.html\n", + "| | |-- ...\n", + "| +-- +-- param_name_{channel_index_N}.html\n", + "|-- per_layer_mse_loss.html\n", + "+-- per_layer_mse_loss.json\n", + "```\n", + "\n", + "#### Per-layer analysis by enabling/disabling quantization wrappers\n", + "\n", + "- per_layer_quant_enabled.html: A plot with layers on the x-axis and model accuracy on the y-axis, where each layer's accuracy represents the model accuracy when all quantizers in the model are disabled except for that layer's parameter and activation quantizers.\n", + "- per_layer_quant_enabled.json: A json file containing the data shown in per_layer_quant_enabled.html, associating layer names with model accuracy.\n", + "- per_layer_quant_disabled.html: A plot with layers on the x-axis and model accuracy on the y-axis, where each layer's accuracy represents the model accuracy when all quantizers in the model are enabled except for that layer's parameter and activation quantizers.\n", + "- per_layer_quant_disabled.json: A json file containing the data shown in per_layer_quant_disabled.html, associating layer names with model accuracy.\n", + "\n", + "![per_layer_quant_enabled.html](../images/keras_per_layer_quant_enabled.PNG)\n", + "\n", + "#### Encoding min/max ranges\n", + "\n", + "- min_max_ranges: A folder containing the following sets of files:\n", + " - activations.html: A plot with output activations on the x-axis and min-max values on the y-axis, where each output activation's range represents the encoding min and max parameters computed during forward pass calibration (explained below).\n", + " - activations.json: A json file containing the data shown in activations.html, associating layer names with min and max encoding values.\n", + " - 
weights.html: A plot with parameter names on the x-axis and min-max values on the y-axis, where each parameter's range represents the encoding min and max parameters computed during forward pass calibration.\n", + " - weights.json: A json file containing the data shown in weights.html, associating parameter names with min and max encoding values.\n", + "\n", + "![min_max_ranges.html](../images/keras_min_max_ranges.PNG)\n", + "\n", + "#### PDF of statistics\n", + "\n", + "- (If TF Enhanced quant scheme is used) activations_pdf: A folder containing html files for each layer, plotting the histogram of tensor values seen for that layer's output activation seen during forward pass calibration.\n", + "- (If TF Enhanced quant scheme is used) weights_pdf: A folder containing sub folders for each layer with weights.\n", + " Each layer's folder contains html files for each parameter of that layer, with a histogram plot of tensor values seen for that parameter seen during forward pass calibration.\n", + "\n", + "![weights_pdf.html](../images/keras_weights_pdf.PNG)\n", + "\n", + "#### Per-layer MSE loss\n", + "- (Optional, if per layer MSE loss is enabled) per_layer_mse_loss.html: A plot with layers on the x-axis and MSE loss on the y-axis, where each layer's MSE loss represents the MSE seen comparing that layer's outputs in the FP32 model vs. the quantized model.\n", + "- (Optional, if per layer MSE loss is enabled) per_layer_mse_loss.json: A json file containing the data shown in per_layer_mse_loss.html, associating layer names with MSE loss.\n", + "\n", + "![per_layer_mse_loss.html](../images/keras_per_layer_mse_loss.PNG)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.6" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_adaround_pcq.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_adaround_pcq.html new file mode 100644 index 00000000..b200ed8c --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_adaround_pcq.html @@ -0,0 +1,1417 @@ + + + + + + Quantsim and Adaround - Per Channel Quantization (PCQ) — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+
+ +
+

Quantsim and Adaround - Per Channel Quantization (PCQ)

+

This notebook illustrates the use of the AIMET AdaRound feature.

+

AIMET quantization features, by default, use the “nearest rounding” technique for achieving quantization. When using the “nearest rounding” technique, the weight value is quantized to the nearest integer value. The Adaptive Rounding (AdaRound) feature uses a small subset of the unlabelled training data to adaptively round the weights: it optimizes a loss function on that data to decide whether to quantize a specific weight to the integer value near it or away from it. With AdaRound quantization, a model is able to achieve an accuracy closer to that of the FP32 model, while using low bit-width integer quantization.
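As a toy numeric illustration of the rounding choice (this is not the AdaRound algorithm itself): with a scale of 0.1, a weight of 0.68 lies between the grid points 0.6 and 0.7. Nearest rounding always picks 0.7, whereas AdaRound learns from unlabelled data whether rounding down or up gives the lower loss for that weight.

import numpy as np

w, scale = 0.68, 0.1
nearest = np.round(w / scale) * scale           # ~0.7, what nearest rounding always picks
round_down = np.floor(w / scale) * scale        # ~0.6, the alternative AdaRound may choose
round_up = (np.floor(w / scale) + 1) * scale    # ~0.7
print(nearest, round_down, round_up)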

+

Per Channel Quantization: AIMET by default performs quantization on a per-tensor basis. However, for better quantization performance (specifically for networks with many convolutions), quantization can be done on a per-channel basis.

+

The difference between per-tensor and per-channel quantization can be illustrated with a single 2D convolution layer in Keras. Imagine that we have a 2D convolution layer with 64 filters, a kernel size of 3, and an input of shape (28, 28, 1). If we were to quantize this layer per tensor, we would look at the entirety of the convolution’s weight matrix (the convolution kernel), take its overall max and overall min, and compute one encoding to quantize the entire matrix. In contrast, on a per-channel basis, we would repeat the process of finding a min and max and computing an encoding, but we would repeat it for every output channel in the convolution. In this example, that means we would have 64 encodings, one unique to each channel. This more detailed calculation is what contributes to better quantization performance.

+

Example Keras Conv2D Layer

+
from tensorflow.keras import layers
+conv2d_layer = layers.Conv2D(filters=64, kernel_size=3)
+
+
+
+
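The contrast can also be made concrete with plain TensorFlow. The snippet below is an illustrative sketch only (it is not an AIMET API): a per-tensor scheme derives a single min/max pair from the whole kernel, while a per-channel scheme derives one pair per output channel.

import tensorflow as tf
from tensorflow.keras import layers

conv2d_layer = layers.Conv2D(filters=64, kernel_size=3)
conv2d_layer.build(input_shape=(None, 28, 28, 1))      # kernel shape: (3, 3, 1, 64)
kernel = conv2d_layer.kernel

# Per-tensor: one (min, max) pair -> one encoding for the entire kernel
per_tensor_min, per_tensor_max = tf.reduce_min(kernel), tf.reduce_max(kernel)

# Per-channel: one (min, max) pair per output channel -> 64 encodings
per_channel_min = tf.reduce_min(kernel, axis=[0, 1, 2])
per_channel_max = tf.reduce_max(kernel, axis=[0, 1, 2])

print(per_tensor_min.numpy(), per_tensor_max.numpy(), per_channel_min.shape)   # scalar, scalar, (64,)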

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the FP32 model and evaluate it to find the baseline FP32 accuracy 3. Apply PCQ Adaround and get the corresponding model 4. Create a PCQ quantization simulation model (with fake quantization ops inserted) from the Adaround model and evaluate this simulation model to get a quantized accuracy score 5. Export the simulation model’s encodings and see how to take them to SNPE/QNN

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, use that. Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. For example, the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class, but for the purpose of running this notebook you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = "/path/to/dataset/dir/"          # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a Keras model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import tensorflow as tf
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset
+from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator
+
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataset() -> tf.data.Dataset:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        :return: A tensorflow dataset
+        """
+        data_loader = ImageNetDataset(DATASET_DIR,
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'])
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(model, iterations=None) -> float:
+        """
+        Given a Keras model, evaluates its Top-1 accuracy on the validation dataset
+        :param model: The Keras model to be evaluated.
+        :param iterations: The number of iterations to run. If None, all the data will be used
+        :return: The accuracy for the sample with the maximum accuracy.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR,
+                                      image_size=image_net_config.dataset["image_size"],
+                                      batch_size=image_net_config.evaluation["batch_size"])
+
+        return evaluator.evaluate(model=model, iterations=iterations)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead.

+
+
[ ]:
+
+
+
from tensorflow.keras.applications.resnet50 import ResNet50
+
+model = ResNet50(include_top=True,
+                 weights="imagenet",
+                 input_tensor=None,
+                 input_shape=None,
+                 pooling=None,
+                 classes=1000)
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.evaluate(model=model, iterations=10)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Note: To enable per-channel quantization, we pass a config file that has per_channel_quantization set to True. This config file will be used later on with AdaRound as well. Having the quantization style described in one place ensures that we don’t create a mismatch when applying QuantSim and AdaRound together.

+
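As a rough, hedged illustration of what such a config could contain (the exact schema is defined by AIMET’s quantsim configuration specification, so treat the section layout and file name below as assumptions and compare against the config file shipped with the examples), the per-channel flag sits alongside the other default quantization settings:

import json

# Assumed layout of a quantsim config that enables per-channel quantization.
# Verify the section names against AIMET's quantsim config documentation.
pcq_config = {
    "defaults": {
        "ops": {"is_output_quantized": "True"},
        "params": {"is_quantized": "True"},
        "per_channel_quantization": "True",   # the flag this note refers to
    },
    "params": {},
    "op_type": {},
    "supergroups": [],
    "model_input": {},
    "model_output": {}
}

with open("pcq_quantsim_config.json", "w") as f:   # hypothetical file name
    json.dump(pcq_config, f, indent=4)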

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers at inference time, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+

The following code calls AIMET to fold the BN layers of a given model. NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms
+from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+from aimet_common.defs import QuantScheme
+
+_, model = fold_all_batch_norms(model)
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8,
+                           config_file="Examples/tensorflow/utils/keras/pcq_quantsim_config")  # NOTE: We tell QuantSim to run per channel quantization through the config file defined here.
+
+
+
+
+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples: - In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples. - It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
from tensorflow.keras.utils import Progbar
+from tensorflow.keras.applications.resnet import preprocess_input
+
+def pass_calibration_data(sim_model, samples):
+    tf_dataset = ImageNetDataPipeline.get_val_dataset()
+    dataset = tf_dataset.dataset
+    batch_size = tf_dataset.batch_size
+
+    progbar = Progbar(samples)
+
+    batch_cntr = 0
+    for inputs, _ in dataset:
+        sim_model(preprocess_input(inputs))
+
+        batch_cntr += 1
+        progbar_stat_update = \
+            batch_cntr * batch_size if (batch_cntr * batch_size) < samples else samples
+        progbar.update(progbar_stat_update)
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
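As a rough illustration of what one such encoding is (this follows a common asymmetric-quantization convention and is not necessarily AIMET’s exact internal formula), the scale and offset of an 8-bit quantizer can be derived from an observed min/max range like this:

def example_encoding(tensor_min: float, tensor_max: float, bitwidth: int = 8):
    """Illustrative only: derive a scale/offset pair from an observed range."""
    num_steps = 2 ** bitwidth - 1                    # 255 steps for 8-bit
    scale = (tensor_max - tensor_min) / num_steps    # step size between quantized levels
    offset = round(tensor_min / scale)               # integer position of the minimum
    return scale, offset

# e.g. an activation observed in the range [-0.5, 1.5]
print(example_encoding(-0.5, 1.5))                   # -> (~0.00784, -64)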
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
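For example, the same evaluation call used for the FP32 model can simply be pointed at sim.model here (the identical call is used again later in this notebook, after AdaRound):

ImageNetDataPipeline.evaluate(model=sim.model, iterations=10)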
+
+
+

4. Apply Adaround

+

We can now apply AdaRound to this model in a per channel quantization fashion.

+

Note: To enable per-channel quantization, we pass a config file to apply_adaround that has per_channel_quantization set to True, just as we did with QuantSim.

+

Some of the parameters for AdaRound are described below

+
    +
  • data_set: AdaRound needs a dataset to use data samples for the layer-by-layer optimization to learn the rounding vectors. Either a training or validation dataloader could be passed in.

  • +
  • num_batches: The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples. So with a batch size of 32, this may translate to 64 batches. To speed up the execution here we are using only 1 batch.

  • +
  • default_num_iterations: The number of iterations used to AdaRound each layer. The default value is 10000 and we strongly recommend not reducing this number. But in this example we are using 32 to speed up the execution runtime.

  • +
+
+
[ ]:
+
+
+
import os
+from tensorflow.keras.applications.resnet import preprocess_input
+from tensorflow.keras.preprocessing import image_dataset_from_directory
+from aimet_tensorflow.keras.adaround_weight import Adaround, AdaroundParameters
+from aimet_common.defs import QuantScheme
+
+ada_round_data = image_dataset_from_directory(directory=DATASET_DIR,
+                                              labels="inferred",
+                                              label_mode="categorical",
+                                              batch_size=image_net_config.evaluation["batch_size"],
+                                              shuffle=False,
+                                              image_size=(image_net_config.dataset["image_width"],
+                                                          image_net_config.dataset["image_height"]))
+ada_round_data = ada_round_data.map(lambda x, y: preprocess_input(x))
+
+params = AdaroundParameters(data_set=ada_round_data,
+                            num_batches=1, default_num_iterations=32)
+
+os.makedirs("./output/", exist_ok=True)
+ada_model = Adaround.apply_adaround(model, params, path="output", filename_prefix="adaround",
+                                    default_param_bw=8, default_quant_scheme=QuantScheme.post_training_tf,
+                                    config_file="Examples/tensorflow/utils/keras/pcq_quantsim_config")  # NOTE: The same config file used in QuantSim is used here as well. Again, telling Adaround to enable PCQ.
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the model after applying Adaround. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.

+

Note: There are two important things to understand in the following cell. - Parameter Bitwidth Precision: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() call.

+
    +
  • Freezing the parameter encodings: After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called before calling the compute_encodings() API. While applying AdaRound, the parameter values have been rounded up or down based on the initial encodings created internally. For Quantization Simulation accuracy, it is important to freeze these encodings. If the parameter encodings are NOT frozen, the call to compute_encodings() will alter the value of the parameter encodings and the Quantization Simulation accuracy will not be correct.

  • +
+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+
+sim = QuantizationSimModel(model=ada_model,
+                           quant_scheme=QuantScheme.post_training_tf,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8,
+                           config_file="Examples/tensorflow/utils/keras/pcq_quantsim_config")  # NOTE: The same config file used in the first QuantSim.
+
+sim.set_and_freeze_param_encodings(encoding_path=os.path.join("output", 'adaround.encodings'))
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.evaluate(model=sim.model, iterations=10)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after applying AdaRound. Of course, this was just an example. Please try this against the model of your choice and play with the number of samples to get the best results.

+

Now the next step would be to take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops, and also export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
sim.export(path='./output', filename_prefix='resnet50_pcq_adaround')
+
+
+
+
+
+
+

Summary

+

This example illustrated how the AIMET AdaRound and QuantSim APIs are invoked to achieve post-training quantization on a per-channel basis. To use AIMET for your specific needs, replace the model with your model and replace the data pipeline with your data pipeline. This will provide you a quick starting point. As indicated above, some parameters in this example have been chosen in such a way as to make this example execute faster.

+

Hope this notebook was useful for you to understand how to use AIMET for performing AdaRound and QuantSim on a per-channel basis.

+

A few additional resources: - Refer to the AIMET API docs to know more details of the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques

+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_adaround_pcq.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_adaround_pcq.ipynb new file mode 100644 index 00000000..8b649a20 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_adaround_pcq.ipynb @@ -0,0 +1,498 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "# Quantsim and Adaround - Per Channel Quantization (PCQ)\n", + "\n", + "This notebook illustrates the use of AIMET Adaround feature.\n", + "\n", + "AIMET quantization features, by default, use the \"nearest rounding\" technique for achieving quantization. When using the \"nearest rounding\" technique, the weight value is quantized to the nearest integer value. The Adaptive Rounding (AdaRound) feature, uses a smaller subset of the unlabelled training data to adaptively round the weights. AdaRound, optimizes a loss function using the unlabelled training data to adaptively decide whether to quantize a specific weight to the integer value near it or away from it. Using the AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.\n", + "\n", + "\n", + "**Per Channel Quantization**\n", + "AIMET by default performs quantization on a per tensor basis. However, for better quantization performance (specifically networks with many convolutions) quantization can be done on a per channel quantization basis.\n", + "\n", + "The difference between per tensor and per channel quantization can be illustrated with a single 2D convolution layer in Keras. Imagine that we have a 2D convolution layer with 64 filters, a kernel size of 3, and input of shape of (28, 28, 1). If we were to per tensor quantize this layer, we would look at the entirety of the convolutions weight matrix (convolution kernel). From here, we would take its overall max, overall min, and compute the encoding to quantize the entire matrix. In contrast, on a per channel basis, we would repeat the process of finding a min, max, and computing an encoding. However, we would repeat this for every output channel in the convolution. In this example, that would mean we would have 64 encodings that are unique to each channel. This more detailed calculation is what attributes to better performance in quantization.\n", + "\n", + "*Example Keras Conv2D Layer*\n", + "```python\n", + "from tensorflow.keras import layers\n", + "conv2d_layer = layers.Conv2D(filters=64, kernel_size=3)\n", + "```\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Apply PCQ Adaround and get corresponding model\n", + "4. Create a PCQ quantization simulation model (with fake quantization ops inserted) from the Adaround model and evaluate this simuation model to get a quantized accuracy score\n", + "5. Exporting the simulation models encodings and how to take them to SNPE/QNN\n", + "\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet50. 
Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "DATASET_DIR = \"/path/to/dataset/dir/\" # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a Keras model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset\n", + "from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator\n", + "\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + "\n", + " @staticmethod\n", + " def get_val_dataset() -> tf.data.Dataset:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " :return: A tensorflow dataset\n", + " \"\"\"\n", + " data_loader = ImageNetDataset(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'])\n", + "\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model, iterations=None) -> float:\n", + " \"\"\"\n", + " Given a Keras model, evaluates its Top-1 accuracy on the validation dataset\n", + " :param model: The Keras model to be evaluated.\n", + " :param iterations: The number of iterations to run. 
If None, all the data will be used\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR,\n", + " image_size=image_net_config.dataset[\"image_size\"],\n", + " batch_size=image_net_config.evaluation[\"batch_size\"])\n", + "\n", + " return evaluator.evaluate(model=model, iterations=iterations)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from tensorflow.keras.applications.resnet50 import ResNet50\n", + "\n", + "model = ResNet50(include_top=True,\n", + " weights=\"imagenet\",\n", + " input_tensor=None,\n", + " input_shape=None,\n", + " pooling=None,\n", + " classes=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "ImageNetDataPipeline.evaluate(model=model, iterations=10)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Note**: For per channel to be enabled, we pass a config file in which the config file has *per_channel_quantization* set to *True*. This config file will be used later on with Adaround as well. Having one place describing the quantization style ensures that we don't mismatch when applying QuantSim and Adaround together.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers of a given model.
\n", + "**NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms\n", + "from aimet_tensorflow.keras.quantsim import QuantizationSimModel\n", + "from aimet_common.defs import QuantScheme\n", + "\n", + "_, model = fold_all_batch_norms(model)\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8,\n", + " config_file=\"Examples/tensorflow/utils/keras/pcq_quantsim_config\") # NOTE: We tell QuantSim to run per channel quantization through the config file defined here." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.keras.utils import Progbar\n", + "from tensorflow.keras.applications.resnet import preprocess_input\n", + "\n", + "def pass_calibration_data(sim_model, samples):\n", + " tf_dataset = ImageNetDataPipeline.get_val_dataset()\n", + " dataset = tf_dataset.dataset\n", + " batch_size = tf_dataset.batch_size\n", + "\n", + " progbar = Progbar(samples)\n", + "\n", + " batch_cntr = 0\n", + " for inputs, _ in dataset:\n", + " sim_model(preprocess_input(inputs))\n", + "\n", + " batch_cntr += 1\n", + " progbar_stat_update = \\\n", + " batch_cntr * batch_size if (batch_cntr * batch_size) < samples else samples\n", + " progbar.update(progbar_stat_update)\n", + " if (batch_cntr * batch_size) > samples:\n", + " break\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## 4. Apply Adaround\n", + "\n", + "We can now apply AdaRound to this model in a per channel quantization fashion.\n", + "\n", + "**Note**: For per channel to be enabled, we pass a config file to `apply_adaround` in which the config file has *per_channel_quantization* set to *True* just as we did with QuantSim.\n", + "\n", + "Some of the parameters for AdaRound are described below\n", + "\n", + "- **data_set:** AdaRound needs a dataset to use data samples for the layer-by-layer optimization to learn the rounding vectors. Either a training or validation dataloader could be passed in.\n", + "- **num_batches:** The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples. So with a batch size of 32, this may translate to 64 batches. To speed up the execution here we are using a batch size of 1.\n", + "- **default_num_iterations:** The number of iterations to adaround each layer. Default value is set to 10000 and we strongly recommend to not reduce this number. But in this example we are using 32 to speed up the execution runtime." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import os\n", + "from tensorflow.keras.applications.resnet import preprocess_input\n", + "from tensorflow.keras.preprocessing import image_dataset_from_directory\n", + "from aimet_tensorflow.keras.adaround_weight import Adaround, AdaroundParameters\n", + "from aimet_common.defs import QuantScheme\n", + "\n", + "ada_round_data = image_dataset_from_directory(directory=DATASET_DIR,\n", + " labels=\"inferred\",\n", + " label_mode=\"categorical\",\n", + " batch_size=image_net_config.evaluation[\"batch_size\"],\n", + " shuffle=False,\n", + " image_size=(image_net_config.dataset[\"image_width\"],\n", + " image_net_config.dataset[\"image_height\"]))\n", + "ada_round_data = ada_round_data.map(lambda x, y: preprocess_input(x))\n", + "\n", + "params = AdaroundParameters(data_set=ada_round_data,\n", + " num_batches=1, default_num_iterations=32)\n", + "\n", + "os.makedirs(\"./output/\", exist_ok=True)\n", + "ada_model = Adaround.apply_adaround(model, params, path=\"output\", filename_prefix=\"adaround\",\n", + " default_param_bw=8, default_quant_scheme=QuantScheme.post_training_tf,\n", + " config_file=\"Examples/tensorflow/utils/keras/pcq_quantsim_config\") # NOTE: The same config file used in QuantSim is used here as well. Again, telling Adaround to enable PCQ.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the model after applying Adaround. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.\n", + "\n", + "**Note:** There are two important things to understand in the following cell.\n", + " - **Parameter Biwidth Precision**: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() created.\n", + "\n", + " - **Freezing the parameter encodings**:\n", + "After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called\n", + "before calling the compute_encodings() API. While applying AdaRound, the parameter values have\n", + "been rounded up or down based on these initial encodings internally created. Fo\n", + "r Quantization Simulation accuracy, it is important to freeze these encodings.\n", + "If the parameters encodings are NOT frozen, the call to compute_encodings() will alter\n", + "the value of the parameters encoding and Quantization Simulation accuracy will not be correct." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.quantsim import QuantizationSimModel\n", + "\n", + "sim = QuantizationSimModel(model=ada_model,\n", + " quant_scheme=QuantScheme.post_training_tf,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8,\n", + " config_file=\"Examples/tensorflow/utils/keras/pcq_quantsim_config\") # NOTE: The same config file used in the first QuantSim.\n", + "\n", + "sim.set_and_freeze_param_encodings(encoding_path=os.path.join(\"output\", 'adaround.encodings'))\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "ImageNetDataPipeline.evaluate(model=sim.model, iterations=10)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after applying adaround. Of course, this was just an example. Please try this against the model of your choice and play with the number of samples to get the best results.\n", + "\n", + "Now the next step would be to take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. And also to export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "sim.export(path='./output', filename_prefix='resnet50_pcq_adaround')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Summary" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "This example illustrated how AIMET AdaRound and QuantSim API is invoked to achieve post training quantization on a per channel basis. To use AIMET for your specific needs, replace the model with your model and\n", + "replace the data pipeline with your data pipeline. This will provide you a quick starting point. 
As indicated above, some parameters in this example have been chosen in such a way way to make this example execute faster.\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing Adaround and QuantSim on a per channel basis.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "3.8.0" + }, + "vscode": { + "interpreter": { + "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" + } + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_cle.html b/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_cle.html new file mode 100644 index 00000000..e379db0f --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_cle.html @@ -0,0 +1,1398 @@ + + + + + + Cross-Layer Equalization (CLE) with QuantSim — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Cross-Layer Equalization (CLE) with QuantSim

+

This notebook showcases a working code example of how to use AIMET to apply Cross-Layer Equalization (CLE) and use QuantSim. CLE is a post-training quantization technique that aims to improve the quantized accuracy of a given model. CLE does not need any data samples. This technique helps recover quantized accuracy when the model quantization is sensitive to parameter quantization as opposed to activation quantization.

+

To learn more about this technique, please refer to the “Data-Free Quantization Through Weight Equalization and Bias Correction” paper from ICCV 2019 - https://arxiv.org/abs/1906.04721

+

Cross-Layer Equalization: AIMET performs the following steps when running CLE: 1. Batch Norm Folding: Folds BN layers into Conv layers immediately before or after the Conv layers. 2. Cross-Layer Scaling: Given a set of consecutive Conv layers, equalizes the range of tensor values per-channel by scaling up/down per-channel weight tensor values of a layer and correspondingly scaling down/up per-channel weight tensor values of the subsequent layer. 3. High Bias Folding: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer’s parameters.

+
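To make step 2 (Cross-Layer Scaling) a little more concrete, below is a small, self-contained numpy sketch for two fully-connected layers, following the per-channel scaling rule described in the paper linked above (s_i = (1/r2_i) * sqrt(r1_i * r2_i), where r1_i and r2_i are the per-channel weight ranges of the producing and consuming layer). It assumes a piecewise-linear activation such as ReLU between the layers, which is what makes the rescaling functionally equivalent; it is only an illustration of the idea, not AIMET’s implementation.

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(16, 8))   # layer 1 weights: 16 output channels
b1 = rng.normal(size=16)        # layer 1 bias
W2 = rng.normal(size=(4, 16))   # layer 2 weights: 16 input channels

r1 = np.abs(W1).max(axis=1)     # per-output-channel range of layer 1
r2 = np.abs(W2).max(axis=0)     # per-input-channel range of layer 2
s = np.sqrt(r1 * r2) / r2       # equalizing scale factor per channel

W1_eq = W1 / s[:, None]         # scale layer 1 outputs down ...
b1_eq = b1 / s
W2_eq = W2 * s[None, :]         # ... and layer 2 inputs up by the same factor

# After scaling, the per-channel weight ranges of the two layers match
assert np.allclose(np.abs(W1_eq).max(axis=1), np.abs(W2_eq).max(axis=0))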
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score 4. Apply CLE, and evaluate the simulation model to get a post-finetuned quantized accuracy score 5. Export the simulation model’s encodings and see how to take them to SNPE/QNN

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = "/path/to/dataset/dir/"  # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a Keras model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import tensorflow as tf
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset
+from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator
+
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataset() -> tf.data.Dataset:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        :return: A tensorflow dataset
+        """
+        data_loader = ImageNetDataset(DATASET_DIR,
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'])
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(model, iterations=None) -> float:
+        """
+        Given a Keras model, evaluates its Top-1 accuracy on the validation dataset
+        :param model: The Keras model to be evaluated.
+        :param iterations: The number of iterations to run. If None, all the data will be used
+        :return: The top-1 accuracy of the model on the validation dataset.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR,
+                                      image_size=image_net_config.dataset["image_size"],
+                                      batch_size=image_net_config.evaluation["batch_size"])
+
+        return evaluator.evaluate(model=model, iterations=iterations)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead.

+
+
[ ]:
+
+
+
from tensorflow.keras.applications.resnet50 import ResNet50
+
+model = ResNet50(include_top=True,
+                 weights="imagenet",
+                 input_tensor=None,
+                 input_shape=None,
+                 pooling=None,
+                 classes=1000)
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
ImageNetDataPipeline.evaluate(model=model, iterations=10)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers at inference time, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+

The following code calls AIMET to fold the BN layers of a given model. NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.

+
+
[ ]:
+
+
+
from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms
+
+_, model = fold_all_batch_norms(model)
+
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will wrap the Keras layers to mimic the layers as quantized. A few of the parameters are explained here - quant_scheme: We set this to “QuantScheme.post_training_tf” - Supported options are ‘tf_enhanced’ or ‘tf’, or using the Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced - default_output_bw: Setting this to 8 essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision - default_param_bw: Setting this to 8 essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision - num_batches: The number of batches used to evaluate the model while calculating the quantization encodings. Only 5 batches are used here to speed up the process. In addition, the number of images in these 5 batches should be sufficient for computing encodings - rounding_mode: The rounding mode used for quantization. There are two possible choices here - ‘nearest’ or ‘stochastic’. We will use “nearest.”

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+

The next cell sets up the quantizer, and quantizes the model. Note that the quantizer uses the same evaluate function as the one defined in our data pipeline when computing the new weights.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+
+
+
+
+
+

Compute Encodings

+

Even though AIMET has wrapped the layers to act as being ‘quantized’, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ layer. For activation quantization layers, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples:

+

In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples. It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results. The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
from tensorflow.keras.utils import Progbar
+from tensorflow.keras.applications.resnet import preprocess_input
+
+def pass_calibration_data(sim_model, samples):
+    tf_dataset = ImageNetDataPipeline.get_val_dataset()
+    dataset = tf_dataset.dataset
+    batch_size = tf_dataset.batch_size
+
+    progbar = Progbar(samples)
+
+    batch_cntr = 0
+    for inputs, _ in dataset:
+        sim_model(preprocess_input(inputs))
+
+        batch_cntr += 1
+        progbar_stat_update = \
+            batch_cntr * batch_size if (batch_cntr * batch_size) < samples else samples
+        progbar.update(progbar_stat_update)
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the model before equalization by evaluating the simulation model we just created.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.evaluate(sim.model)
+
+
+
+
+
+
+

4 Cross Layer Equalization

+

The next cell performs cross-layer equalization on the model. As noted before, the function folds batch norms, applies cross-layer scaling, and then folds high biases.

+

Note: Interestingly, CLE needs BN statistics for its procedure. If a BN folded model is provided, CLE will run the CLS (cross-layer scaling) optimization step but will skip the HBA (high-bias absorption) step. To avoid this, we simply load the original model again before running CLE.

+
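A hedged sketch of what that reload could look like, re-using the same ResNet50 loading call from section 2 (adjust this to however you originally built your model), so that the model passed to equalize_model below still contains its BN layers:

from tensorflow.keras.applications.resnet50 import ResNet50

# Reload the original (non-BN-folded) pretrained model so CLE can fold BN itself
# and use the BN statistics for high-bias absorption.
model = ResNet50(include_top=True, weights="imagenet", classes=1000)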
+
[ ]:
+
+
+
from aimet_tensorflow.keras import cross_layer_equalization as aimet_cle
+
+cle_applied_model = aimet_cle.equalize_model(model)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the equalized model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.

+
+
[ ]:
+
+
+
sim = QuantizationSimModel(model=cle_applied_model,
+                           quant_scheme=QuantScheme.post_training_tf,
+                           rounding_mode="nearest",
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=1000)
+
+ImageNetDataPipeline.evaluate(sim.model)
+
+
+
+
+
+
+

5 Exporting

+

Now that the encodings for the QuantizationSimModel have been computed, they can be exported. Exporting can be done with the export function. This function will export the encodings in both a JSON and YAML file, an h5 model without wrappers, a TensorFlow 2 SavedModel, and a protobuf model converted from the h5 model. The converted protobuf model and the exported encodings can then be consumed by either SNPE or QNN.

+

Note: export() takes a path to save to, and a filename_prefix

+
+
[ ]:
+
+
+
import os
+
+os.makedirs("./output/", exist_ok=True)
+sim.export(path="./output", filename_prefix="resnet50_after_cle")
+
+
+
+
+
+
+

Summary

+

Hopefully this notebook was useful for you to understand how to use AIMET for performing Cross Layer Equalization (CLE).

+

A few additional resources: - Refer to the AIMET API docs to know more details of the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques

+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_cle.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_cle.ipynb new file mode 100644 index 00000000..99aa1bad --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/keras/quantsim_cle.ipynb @@ -0,0 +1,467 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "22208c45", + "metadata": {}, + "source": [ + "# Cross-Layer Equalization (CLE) with QuantSim\n", + "\n", + "This notebook showcases a working code example of how to use AIMET to apply Cross-Layer Equalization (CLE) and use QuantSim. CLE is post-training quantization techniques that aims to improve quantized accuracy of a given model. CLE does not need any data samples. This technique help recover quantized accuracy when the model quantization is sensitive to parameter quantization as opposed to activation quantization.\n", + "\n", + "To learn more about this techniques, please refer to the \"Data-Free Quantization Through Weight Equalization and Bias Correction\" paper from ICCV 2019 - https://arxiv.org/abs/1906.04721\n", + "\n", + "**Cross-Layer Equalization**\n", + "AIMET performs the following steps when running CLE:\n", + "1. Batch Norm Folding: Folds BN layers into Conv layers immediate before or after the Conv layers.\n", + "2. Cross-Layer Scaling: Given a set of consecutive Conv layers, equalizes the range of tensor values per-channel by scaling up/down per-channel weight tensor values of a layer and corresponding scaling down/up per-channel weight tensor values of the subsequent layer.\n", + "3. High Bias Folding: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer's parameters.\n", + "\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Apply CLE, and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "5. Exporting the simulation models encodings and how to take them to SNPE/QNN\n", + "\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "id": "447958a2", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. 
This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c37b55bc", + "metadata": {}, + "outputs": [], + "source": [ + "DATASET_DIR = \"/path/to/dataset/dir/\" # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "id": "e348e923", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user\"s model to create a QuantizationSim model which is still a Keras model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e21aca42", + "metadata": {}, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.keras.image_net_dataset import ImageNetDataset\n", + "from Examples.tensorflow.utils.keras.image_net_evaluator import ImageNetEvaluator\n", + "\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + "\n", + " @staticmethod\n", + " def get_val_dataset() -> tf.data.Dataset:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " :return: A tensorflow dataset\n", + " \"\"\"\n", + " data_loader = ImageNetDataset(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'])\n", + "\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model, iterations=None) -> float:\n", + " \"\"\"\n", + " Given a Keras model, evaluates its Top-1 accuracy on the validation dataset\n", + " :param model: The Keras model to be evaluated.\n", + " :param iterations: The number of iterations to run. If None, all the data will be used\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR,\n", + " image_size=image_net_config.dataset[\"image_size\"],\n", + " batch_size=image_net_config.evaluation[\"batch_size\"])\n", + "\n", + " return evaluator.evaluate(model=model, iterations=iterations)\n" + ] + }, + { + "cell_type": "markdown", + "id": "52ed096a", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "id": "0ed81fba", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from Keras. Similarly, you can load any pretrained Keras model instead." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d9879523", + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.keras.applications.resnet50 import ResNet50\n", + "\n", + "model = ResNet50(include_top=True,\n", + " weights=\"imagenet\",\n", + " input_tensor=None,\n", + " input_shape=None,\n", + " pooling=None,\n", + " classes=1000)" + ] + }, + { + "cell_type": "markdown", + "id": "5d7183d3", + "metadata": { + "collapsed": false + }, + "source": [ + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "247f8b8b", + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.evaluate(model=model, iterations=10)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "e71a4de7", + "metadata": {}, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers of a given model.
\n", + "**NOTE: During folding, a new model is returned. Please use the returned model for the rest of the pipeline.**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fdb8cbca", + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "_, model = fold_all_batch_norms(model)" + ] + }, + { + "cell_type": "markdown", + "id": "72457743", + "metadata": {}, + "source": [ + "---\n", + "## Create Quantization Sim Model\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will wrap the Keras layers to mimic a layer as quantized.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "- **num_batches**: The number of batches used to evaluate the model while calculating the quantization encodings.Number of batches to use for computing encodings. Only 5 batches are used here to speed up the process. In addition, the number of images in these 5 batches should be sufficient for compute encodings\n", + "- **rounding_mode**: The rounding mode used for quantization. There are two possible choices here - 'nearest' or 'stochastic' We will use \"nearest.\"\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "markdown", + "id": "05e6ce14", + "metadata": {}, + "source": [ + "The next cell sets up the quantizer, and quantizes the model. Note that the quantizer uses the same evaluate function as the one defined in our data pipeline when computing the new weights." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b3770e74", + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_tensorflow.keras.quantsim import QuantizationSimModel\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8)" + ] + }, + { + "cell_type": "markdown", + "id": "8da84201", + "metadata": {}, + "source": [ + "---\n", + "## Compute Encodings\n", + "Even though AIMET has wrapped the layers to act as being 'quantized' but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' layer. For activation quantization layers, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. 
This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "\n", + "In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3f3346fa", + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from tensorflow.keras.utils import Progbar\n", + "from tensorflow.keras.applications.resnet import preprocess_input\n", + "\n", + "def pass_calibration_data(sim_model, samples):\n", + " tf_dataset = ImageNetDataPipeline.get_val_dataset()\n", + " dataset = tf_dataset.dataset\n", + " batch_size = tf_dataset.batch_size\n", + "\n", + " progbar = Progbar(samples)\n", + "\n", + " batch_cntr = 0\n", + " for inputs, _ in dataset:\n", + " sim_model(preprocess_input(inputs))\n", + "\n", + " batch_cntr += 1\n", + " progbar_stat_update = \\\n", + " batch_cntr * batch_size if (batch_cntr * batch_size) < samples else samples\n", + " progbar.update(progbar_stat_update)\n", + " if (batch_cntr * batch_size) > samples:\n", + " break\n" + ] + }, + { + "cell_type": "markdown", + "id": "c0ddbed9", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "48b9ed98", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)" + ] + }, + { + "cell_type": "markdown", + "id": "5337b23b", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the equalized model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5a78a26e", + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "ImageNetDataPipeline.evaluate(sim.model)" + ] + }, + { + "cell_type": "markdown", + "id": "f8e0a345", + "metadata": {}, + "source": [ + "---\n", + "## 4 Cross Layer Equalization\n", + "\n", + "The next cell performs cross-layer equalization on the model. As noted before, the function folds batch norms, applies cross-layer scaling, and then folds high biases.\n", + "\n", + "**Note:** Interestingly, CLE needs BN statistics for its procedure. 
If a BN folded model is provided, CLE will run the CLS (cross-layer scaling) optimization step but will skip the HBA (high-bias absorption) step. To avoid this, we simply load the original model again before running CLE." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ca3365e6", + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.keras import cross_layer_equalization as aimet_cle\n", + "\n", + "cle_applied_model = aimet_cle.equalize_model(model)" + ] + }, + { + "cell_type": "markdown", + "id": "7c769dee", + "metadata": {}, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the equalized model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "99c228e0", + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(model=cle_applied_model,\n", + " quant_scheme=QuantScheme.post_training_tf,\n", + " rounding_mode=\"nearest\",\n", + " default_output_bw=8,\n", + " default_param_bw=8)\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=1000)\n", + "\n", + "ImageNetDataPipeline.evaluate(sim.model)" + ] + }, + { + "cell_type": "markdown", + "id": "4beeb700", + "metadata": {}, + "source": [ + "---\n", + "## 5 Exporting\n", + "\n", + "Now the encodings for the QuantizationSimModel have been computed, they can be exported. Exporting can be done with the export function. This function will export the encodings in both a JSON and YAML file, a h5 model without wrappers, a Tensorflow 2 SavedModel, and a converted protobuff model from the h5 model. The converted protobuff model and the encodings exported can be then consumed by either SNPE/QNN.\n", + "\n", + "**Note:** `export()` takes a path to safe to, and a filename_prefix" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6a8bb414", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "os.makedirs(\"./output/\", exist_ok=True)\n", + "sim.export(path=\"./output\", filename_prefix=\"resnet50_after_cle\")" + ] + }, + { + "cell_type": "markdown", + "id": "f071a78c", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hopefully this notebook was useful for you to understand how to use AIMET for performing Cross Layer Equalization (CLE).\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.0" + }, + "vscode": { + "interpreter": { + "hash": "767d51c1340bd893661ea55ea3124f6de3c7a262a8b4abca0554b478b1e2ff90" + } + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/qat.html b/releases/1.32.2/Examples/tensorflow/quantization/qat.html new file mode 100644 index 00000000..eaf92f6a --- /dev/null +++ 
b/releases/1.32.2/Examples/tensorflow/quantization/qat.html @@ -0,0 +1,1474 @@ + + + + + + Quantization-Aware Training — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantization-Aware Training

+

This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training). QAT is an AIMET feature that adds quantization simulation ops (sometimes called fake quantization ops) to a trained ML model and uses a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.

+

AIMET supports two different types of QAT:
1. Simply referred to as QAT - quantization parameters like per-tensor scale/offsets for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant.
2. Referred to as QAT with range-learning - quantization parameters like per-tensor scale/offsets for activations are computed initially. Then both the quantization parameters and the model weights are jointly updated during fine-tuning to minimize the effects of quantization in the forward pass.

+

This notebook specifically shows a working code example for #1 above. You can find a separate notebook for #2 in the same folder.

+
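As a quick orientation, the two flavors above correspond to different quant-scheme settings when the quantization simulation model is created later in this flow. The following is a minimal sketch, assuming the enum members of aimet_common.defs.QuantScheme named below exist in your AIMET version; it is illustrative only and not part of this notebook's pipeline.

[ ]:

from aimet_common.defs import QuantScheme

# QAT (#1): encodings are computed once during calibration and then kept fixed
qat_scheme = QuantScheme.post_training_tf_enhanced

# QAT with range-learning (#2): encodings are initialized and then learned jointly with the weights
range_learning_scheme = QuantScheme.training_range_learning_with_tf_init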
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score 4. Fine-tune the quantization simulation model and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters like number of epochs to fine-tune are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The ImageNet tfrecords dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them: a folder containing tfrecords files starting with ‘train*’ for training files and ‘valid*’ for validation files. Each tfrecord file should have the features ‘image/encoded’ for image data and ‘image/class/label’ for its corresponding class.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
TFRECORDS_DIR = '/path/to/tfrecords/dir/'        # Please replace this with a real directory
+
+
+
+

We disable logs at the INFO level and disable eager execution. We set the verbosity level to ERROR, so TensorFlow will display only messages that have the label ERROR (or more critical).

+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow.compat.v1 as tf
+tf.disable_eager_execution()
+tf.logging.set_verbosity(tf.logging.ERROR)
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
from typing import List
+
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.Session) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The sess graph to be evaluated.
+        :return: The Top-1 accuracy on the validation dataset.
+        """
+        evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess)
+
+
+    @staticmethod
+    def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):
+        """
+        Given a TF session, finetunes it to improve its accuracy
+        :param sess: The sess graph to fine-tune.
+        :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).
+                                tf.GraphKeys.UPDATE_OPS collections is always used
+                                in addition to this list
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param decay_steps: A number used to adjust (decay) the learning rate after every decay_steps epochs in training.
+        """
+        trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                  data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_epochs=epochs, format_bgr=True)
+
+        trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained ResNet50 model from Keras and convert it to a TensorFlow session. Similarly, you can load any pretrained TensorFlow model instead.

+

Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

+

By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op.

+
+
[ ]:
+
+
+
from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+
+model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+update_ops_name = [op.name for op in model.updates] # Used for finetuning
+
+
+
+
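For reference, the earlier note about tf.GraphKeys.UPDATE_OPS corresponds to the canonical TF1 pattern sketched below. The example trainer used later in this notebook handles the update ops internally, so you do not need to run this cell; the labels placeholder and loss here are hypothetical stand-ins for illustration only.

[ ]:

# Hypothetical illustration of attaching update ops (e.g. BN moving-average updates)
# to a training op via a control dependency.
illustrative_labels = tf.placeholder(tf.int64, shape=(None,), name='illustrative_labels')
illustrative_loss = tf.reduce_mean(
    tf.keras.losses.sparse_categorical_crossentropy(illustrative_labels, model.output))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-4)
with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS) + model.updates):
    train_op = optimizer.minimize(illustrative_loss)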

The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+model = update_keras_bn_ops_trainable_flag(model, load_save_path="./", trainable=False)
+
+
+
+

AIMET features currently support tensorflow sessions. add_image_net_computational_nodes_in_graph adds an output layer, softmax and loss functions to the Resnet50 model graph.

+
+
[ ]:
+
+
+
from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+
+sess = tf.keras.backend.get_session()
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
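For intuition only, below is a rough, hypothetical sketch of the kind of nodes such a helper adds (a labels placeholder, a loss, and a top-1 accuracy op). The actual Examples utility may differ in its details, so treat this as an assumption rather than its implementation.

[ ]:

def add_classification_nodes_sketch(session, output_tensor_name):
    # Hypothetical helper: adds a 'labels' placeholder, a cross-entropy loss and
    # a 'top1-acc' op to the session graph.
    with session.graph.as_default():
        labels = tf.placeholder(tf.int64, shape=(None,), name='labels')
        probs = session.graph.get_tensor_by_name(output_tensor_name)
        loss = tf.reduce_mean(
            tf.keras.losses.sparse_categorical_crossentropy(labels, probs))
        correct = tf.nn.in_top_k(predictions=probs, targets=labels, k=1)
        top1_acc = tf.reduce_mean(tf.cast(correct, tf.float32), name='top1-acc')
        return loss, top1_acc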
+
+

Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here.

+
+
[ ]:
+
+
+
starting_op_names = [model.input.name.split(":")[0]]
+output_op_names = [model.output.name.split(":")[0]]
+
+
+
+

We check whether TensorFlow is using the CPU or a CUDA device. This example code will use CUDA if it is available in your current execution environment.

+
+
[ ]:
+
+
+
use_cuda = tf.test.is_gpu_available(cuda_only=True)
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sess)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers and produces the same accuracy at inference. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+

The following code calls AIMET to fold the BN layers of the given model and returns a new session.

+
+
[ ]:
+
+
+
from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms
+
+bn_folded_sess, _= fold_all_batch_norms(sess,
+                                        input_op_names=starting_op_names,
+                                        output_op_names=output_op_names)
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: We set this to “QuantScheme.post_training_tf_enhanced”. Supported options are ‘tf_enhanced’ or ‘tf’, or using the Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced
- default_output_bw: Setting this to 8 essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision
- default_param_bw: Setting this to 8 essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+

The next cell sets up the quantizer and quantizes the model. The new session that contains all the changes to the graph is sim.session, and this is then evaluated on the dataset. Note that the quantizer uses the same evaluate function as the one defined in our data pipeline when computing the new weights.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_tensorflow.quantsim import QuantizationSimModel
+
+sim = QuantizationSimModel(session=bn_folded_sess,
+                           starting_op_names=starting_op_names,
+                           output_op_names=output_op_names,
+                           quant_scheme= QuantScheme.post_training_tf_enhanced,
+                           default_output_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+
+
+
+
+
+

Compute Encodings

+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples:
- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.
- It may be beneficial if the samples used for computing encoding are well distributed. It’s not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
def pass_calibration_data(session: tf.compat.v1.Session, _):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),
+                           session.graph.get_tensor_by_name('labels:0')]
+
+    train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]
+    train_tensors_dict = dict.fromkeys(train_tensors, False)
+
+    eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]
+
+    samples = 500
+
+    batch_cntr = 0
+    for input_label in data_loader:
+        input_label_tensors_dict = dict(zip(input_label_tensors, input_label))
+
+        feed_dict = {**input_label_tensors_dict, **train_tensors_dict}
+
+        with session.graph.as_default():
+            _ = session.run(eval_outputs, feed_dict=feed_dict)
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=None)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+
+
+

4. Perform QAT

+

To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+
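To make the decay heuristic above concrete, here is a small sketch of a step-decay schedule with hypothetical numbers; it is not the exact schedule implemented by the example trainer.

[ ]:

# Hypothetical step-decay: start near the ending learning rate of the original training
# and drop by a factor of 10 every 5 epochs.
initial_lr = 1e-3
decay_factor = 0.1
decay_every = 5   # epochs

for epoch in range(15):
    lr = initial_lr * (decay_factor ** (epoch // decay_every))
    print("epoch {:2d}: learning rate = {:.1e}".format(epoch, lr))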

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.finetune(sim.session, update_ops_name=update_ops_name, epochs=1, learning_rate=1e-3, decay_steps=5)
+
+
+
+
+

After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.

+

So we should have an improved model after QAT. The next step would be to actually take this model to target. For this purpose, we need to export the model with the updated weights, without the fake quant ops. AIMET QuantizationSimModel provides an export API for this purpose. This API saves the model artifacts, such as the .meta graph and the .encodings file.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+sim.export(path='./output/', filename_prefix='resnet50_after_qat')
+
+
+
+
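As an optional sanity check, you can list what the export step wrote to the output directory; the exact artifact names depend on the AIMET version and the filename_prefix used above.

[ ]:

import os

print(sorted(os.listdir('./output/')))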
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET to perform QAT.

+

A few additional resources:
- Refer to the AIMET API docs for more details on the APIs and optional parameters
- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT with range-learning

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/qat.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/qat.ipynb new file mode 100644 index 00000000..0904e94d --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/qat.ipynb @@ -0,0 +1,536 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quantization-Aware Training\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training). QAT is an AIMET feature adding quantization simulation ops (also called fake quantization ops sometimes) to a trained ML model and using a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.\n", + "\n", + "AIMET supports two different types of QAT\n", + "1. Simply referred to as QAT - quantization parameters like per-tensor scale/offsets for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant.\n", + "2. Referred to as QAT with range-learning - quantization parameters like per-tensor scale/offsets for activations are computed initially. Then both the quantization parameters and the model weights are jointly updated during fine-tuning to minimize the effects of quantization in the forward pass.\n", + "\n", + "This notebook specifically shows working code example for #1 above. You can find a separate notebook for #2 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Fine-tune the quantization simulation model and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters like number of epochs to fine-tune are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert them into tfrecords.\n", + "\n", + "**Note1**: The ImageNet tfrecords dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- A folder containing tfrecords files starting with **'train\\*'** for training files and **'valid\\*'** for validation files. Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. 
the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class and then convert it into tfrecords. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "TFRECORDS_DIR = '/path/to/tfrecords/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We disable logs at the INFO level and disable eager execution. We set verbosity to the level as displayed (ERORR), so TensorFlow will display all messages that have the label ERROR (or more critical)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow.compat.v1 as tf\n", + "tf.disable_eager_execution()\n", + "tf.logging.set_verbosity(tf.logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. 
You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from typing import List\n", + "\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + " \n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + " \n", + " @staticmethod\n", + " def evaluate(sess: tf.Session) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess)\n", + "\n", + " \n", + " @staticmethod\n", + " def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):\n", + " \"\"\"\n", + " Given a TF session, finetunes it to improve its accuracy\n", + " :param sess: The sess graph to fine-tune.\n", + " :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).\n", + " tf.GraphKeys.UPDATE_OPS collections is always used\n", + " in addition to this list\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_epochs=epochs, format_bgr=True)\n", + "\n", + " trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to a tensorflow session. 
Similarly, you can load any pretrained tensorflow model instead.\n", + "\n", + "\n", + "Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.\n", + "\n", + "\n", + "By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from tensorflow.compat.v1.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "\n", + "model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))\n", + "update_ops_name = [op.name for op in model.updates] # Used for finetuning" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag\n", + "\n", + "model = update_keras_bn_ops_trainable_flag(model, load_save_path=\"./\", trainable=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET features currently support tensorflow sessions. **add_image_net_computational_nodes_in_graph** adds an output layer, softmax and loss functions to the Resnet50 model graph." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "\n", + "sess = tf.keras.backend.get_session()\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "starting_op_names = [model.input.name.split(\":\")[0]]\n", + "output_op_names = [model.output.name.split(\":\")[0]]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We are checking if TensorFlow is using CPU or CUDA device. This example code will use CUDA if available in your current execution environment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = tf.test.is_gpu_available(cuda_only=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sess=sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 3. 
Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers on the given model and returns a new session" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "bn_folded_sess, _= fold_all_batch_norms(sess,\n", + " input_op_names=starting_op_names,\n", + " output_op_names=output_op_names)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next cell sets up the quantizer, and quantizes the model. The new session that contains all the changes to the graph is quantizer.session, and this is then evaluated on the dataset. Note that the quantizer uses the same evaluate function as the one defined in our data pipeline when computing the new weights." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_tensorflow.quantsim import QuantizationSimModel\n", + "\n", + "sim = QuantizationSimModel(session=bn_folded_sess,\n", + " starting_op_names=starting_op_names,\n", + " output_op_names=output_op_names,\n", + " quant_scheme= QuantScheme.post_training_tf_enhanced,\n", + " default_output_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Compute Encodings\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def pass_calibration_data(session: tf.compat.v1.Session, _):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),\n", + " session.graph.get_tensor_by_name('labels:0')]\n", + " \n", + " train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]\n", + " train_tensors_dict = dict.fromkeys(train_tensors, False)\n", + " \n", + " eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]\n", + "\n", + " samples = 500\n", + "\n", + " batch_cntr = 0\n", + " for input_label in data_loader:\n", + " input_label_tensors_dict = dict(zip(input_label_tensors, input_label))\n", + "\n", + " feed_dict = {**input_label_tensors_dict, **train_tensors_dict}\n", + "\n", + " with session.graph.as_default():\n", + " _ = session.run(eval_outputs, feed_dict=feed_dict)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. Perform QAT\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.finetune(sim.session, update_ops_name=update_ops_name, epochs=1, learning_rate=1e-3, decay_steps=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we should have an improved model after QAT. Now the next step would be to actually take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. AIMET QuantizationSimModel provides an export API for this purpose. This API would save the model as .encodings .meta etc. TODO" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "sim.export(path='./output/', filename_prefix='resnet50_after_qat')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing QAT.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT with range-learning" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/qat_range_learning.html b/releases/1.32.2/Examples/tensorflow/quantization/qat_range_learning.html new file mode 100644 index 00000000..e24b2e78 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/qat_range_learning.html @@ -0,0 +1,1476 @@ + + + + + + Quantization-Aware Training with Range Learning — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantization-Aware Training with Range Learning

+

This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training). QAT is an AIMET feature that adds quantization simulation ops (sometimes called fake quantization ops) to a trained ML model and uses a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.

+

AIMET supports two different types of QAT:
1. Simply referred to as QAT - quantization parameters like per-tensor scale/offsets for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant.
2. Referred to as QAT with range-learning - quantization parameters like per-tensor scale/offsets for activations are computed initially. Then both the quantization parameters and the model weights are jointly updated during fine-tuning to minimize the effects of quantization in the forward pass.

+

This notebook specifically shows a working code example for #2 above. You can find a separate notebook for #1 in the same folder.

+
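When the quantization simulation model is created later in this flow, the range-learning behavior is selected through the quant scheme. As a hedged preview, the enum member name below is an assumption based on aimet_common.defs.QuantScheme and may differ across AIMET versions.

[ ]:

from aimet_common.defs import QuantScheme

# Range-learning variant: encodings are initialized (here with the 'tf' scheme)
# and then updated jointly with the weights during fine-tuning.
range_learning_scheme = QuantScheme.training_range_learning_with_tf_init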
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score 4. Fine-tune the quantization simulation model and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters like number of epochs to fine-tune are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The ImageNet tfrecords dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them: a folder containing tfrecords files starting with ‘train*’ for training files and ‘valid*’ for validation files. Each tfrecord file should have the features ‘image/encoded’ for image data and ‘image/class/label’ for its corresponding class.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
TFRECORDS_DIR = '/path/to/tfrecords/dir/'        # Please replace this with a real directory
+
+
+
+

We disable logs at the INFO level and disable eager execution. We set the verbosity level to ERROR, so TensorFlow will display only messages that have the label ERROR (or more critical).

+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow.compat.v1 as tf
+tf.disable_eager_execution()
+tf.logging.set_verbosity(tf.logging.ERROR)
+
+
+
+
+
+
+

1. Example Evaluation and Training Pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
from typing import List
+
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.Session) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The sess graph to be evaluated.
+        :return: The accuracy for the sample with the maximum accuracy.
+        """
+        evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess)
+
+
+    @staticmethod
+    def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):
+        """
+        Given a TF session, finetunes it to improve its accuracy
+        :param sess: The sess graph to fine-tune.
+        :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).
+                                tf.GraphKeys.UPDATE_OPS collections is always used
+                                in addition to this list
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.
+        """
+        trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                  data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_epochs=epochs, format_bgr=True)
+
+        trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained ResNet50 model from keras and convert it to a tensorflow session. Similarly, you can load any pretrained tensorflow model instead.

+

Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

+

By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training.
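For reference, the standalone sketch below shows the generic TF1 pattern this refers to; it is not part of this notebook's pipeline, and the tiny graph in it is purely hypothetical.

import tensorflow.compat.v1 as tf

# Toy graph with a batch-norm layer whose moving-average updates land in UPDATE_OPS.
x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])
hidden = tf.layers.batch_normalization(tf.layers.dense(x, 8), training=True)
loss = tf.losses.mean_squared_error(y, tf.layers.dense(hidden, 1))

# Make the collected update ops run before every optimizer step.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)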

+
+
[ ]:
+
+
+
from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+
+model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+update_ops_name = [op.name for op in model.updates] # Used for finetuning
+
+
+
+

The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+model = update_keras_bn_ops_trainable_flag(model, load_save_path="./", trainable=False)
+
+
+
+

AIMET features currently support tensorflow sessions. add_image_net_computational_nodes_in_graph adds an output layer, softmax and loss functions to the Resnet50 model graph.

+
+
[ ]:
+
+
+
from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+
+sess = tf.keras.backend.get_session()
+
+# Creates the computation graph of ResNet within the tensorflow session.
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
+
+

Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here.

+
+
[ ]:
+
+
+
starting_op_names = [model.input.name.split(":")[0]]
+output_op_names = [model.output.name.split(":")[0]]
+
+
+
+

We check whether TensorFlow is using the CPU or a CUDA device. This example code will use CUDA if it is available in your current execution environment.

+
+
[ ]:
+
+
+
use_cuda = tf.test.is_gpu_available(cuda_only=True)
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sess=sess)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, Snapdragon Neural Processing SDK, etc.), it is common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating-point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers at inference time and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers, and this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So we want to simulate that on-target behavior by doing BN folding here.

+
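To make the folding operation concrete, here is a toy numpy sketch of the per-output-channel math involved; it is for intuition only, and the AIMET API in the next cell performs the actual folding on the graph.

import numpy as np

def fold_bn_into_conv(W, b, gamma, beta, mean, var, eps=1e-3):
    """Fold a BatchNorm that follows a conv layer into the conv weights/bias.
    W has shape [kh, kw, cin, cout]; the BN parameters are per output channel."""
    scale = gamma / np.sqrt(var + eps)   # one multiplicative factor per output channel
    W_folded = W * scale                 # broadcasts over the last (output-channel) axis
    b_folded = (b - mean) * scale + beta
    return W_folded, b_folded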

The following code calls AIMET to fold the BN layers on the given model and returns a new session

+
+
[ ]:
+
+
+
from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms
+
+BN_folded_sess, _= fold_all_batch_norms(sess,
+                                        input_op_names=starting_op_names,
+                                        output_op_names=output_op_names)
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: We set this to “training_range_learning_with_tf_init”. This is the key setting that enables “range learning”. With this choice of quant scheme, AIMET will use the TF quant scheme to initialize the quantization parameters like scale/offset, and those parameters are then set to be trainable so they can continue to be updated during fine-tuning. Another choice for quant_scheme is “training_range_learning_with_tf_enhanced_init”. Similar to the above, but the initialization for scale/offset is done using the TF Enhanced scheme. Since in both schemes the quantization parameters are set to be trainable, there is not much benefit to using this choice instead of “training_range_learning_with_tf_init”. (Note that the code cell below uses the TF Enhanced variant.)
- default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
- default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_tensorflow.quantsim import QuantizationSimModel
+
+sim = QuantizationSimModel(session=BN_folded_sess,
+                           starting_op_names=starting_op_names,
+                           output_op_names=output_op_names,
+                           quant_scheme= QuantScheme.training_range_learning_with_tf_enhanced_init,
+                           default_output_bw=8,
+                           default_param_bw=8,
+                           use_cuda=use_cuda)
+
+
+
+
+
+
+

Compute Encodings

+

Although AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics, which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+
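As a toy illustration of what an ‘encoding’ is (this is not AIMET code), the scale/offset for an 8-bit quantizer can be derived from observed range statistics roughly as follows; the min/max values below are made up.

# Hypothetical activation range statistics gathered during calibration.
observed_min, observed_max = -0.8, 3.2

num_steps = 2 ** 8 - 1                             # 255 integer steps for 8-bit
scale = (observed_max - observed_min) / num_steps  # real-valued step size
offset = round(observed_min / scale)               # zero-point on the integer grid
print(f"scale={scale:.6f}, offset={offset}")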

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple: use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can just ignore the model output for this purpose. A few pointers regarding the data samples:
- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples, but for computing encodings we only need 500 or 1000 samples.
- It may be beneficial if the samples used for computing encodings are well distributed. It is not necessary that all classes are covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario where only ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just one example.

+
+
[ ]:
+
+
+
def pass_calibration_data(session: tf.compat.v1.Session, _):
+    # Use the existing validation data loader to feed a small number of
+    # unlabeled samples through the model; the outputs are ignored.
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    # Input image and label placeholders in the graph.
+    input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),
+                           session.graph.get_tensor_by_name('labels:0')]
+
+    # Keep the Keras learning phase at False so the forward pass runs in inference mode.
+    train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]
+    train_tensors_dict = dict.fromkeys(train_tensors, False)
+
+    # Any graph output works here; we only need the forward pass, not the metric itself.
+    eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]
+
+    # Roughly 500 samples are sufficient for computing encodings.
+    samples = 500
+
+    batch_cntr = 0
+    for input_label in data_loader:
+        input_label_tensors_dict = dict(zip(input_label_tensors, input_label))
+
+        feed_dict = {**input_label_tensors_dict, **train_tensors_dict}
+
+        with session.graph.as_default():
+            _ = session.run(eval_outputs, feed_dict=feed_dict)
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=None)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+
+
+

4. Perform QAT

+

To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+
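As a concrete illustration of that suggestion, a simple step-decay schedule might look like the following; the numbers are hypothetical and are not what ImageNetTrainer uses internally.

initial_lr = 1e-3   # e.g. close to the final learning rate of the original training run

def qat_learning_rate(epoch: int) -> float:
    """Drop the learning rate by 10x every 5 epochs."""
    return initial_lr * (0.1 ** (epoch // 5))

for epoch in range(15):
    print(epoch, qat_learning_rate(epoch))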

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.finetune(sim.session, update_ops_name=update_ops_name, epochs=1, learning_rate=1e-3, decay_steps=5)
+
+
+
+
+

After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.session)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.

+

So we should have an improved model after QAT. The next step would be to actually take this model to target. For this purpose, we need to export the model with the updated weights and without the fake quant ops. AIMET QuantizationSimModel provides an export API for this purpose. This API would save the model as #TODO

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+sim.export(path='./output/', filename_prefix='resnet50_after_qat_range_learning')
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET to perform QAT with range-learning.

+

A few additional resources:
- Refer to the AIMET API docs for more details on the APIs and optional parameters
- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and the vanilla QAT method (without range-learning)

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/qat_range_learning.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/qat_range_learning.ipynb new file mode 100644 index 00000000..4121e17e --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/qat_range_learning.ipynb @@ -0,0 +1,528 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quantization-Aware Training with Range Learning\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training). QAT is an AIMET feature adding quantization simulation ops (also called fake quantization ops sometimes) to a trained ML model and using a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.\n", + "\n", + "AIMET supports two different types of QAT\n", + "1. Simply referred to as QAT - quantization parameters like per-tensor scale/offsets for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant.\n", + "2. Referred to as QAT with range-learning - quantization parameters like per-tensor scale/offsets for activations are computed initially. Then both the quantization parameters and the model weights are jointly updated during fine-tuning to minimize the effects of quantization in the forward pass.\n", + "\n", + "This notebook specifically shows working code example for #2 above. You can find a separate notebook for #1 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Fine-tune the quantization simulation model and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet50. Also, some optimization parameters like number of epochs to fine-tune are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert them into tfrecords.\n", + "\n", + "**Note1**: The ImageNet tfrecords dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- A folder containing tfrecords files starting with **'train\\*'** for training files and **'valid\\*'** for validation files. Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. 
the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class and then convert it into tfrecords. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "TFRECORDS_DIR = '/path/to/tfrecords/dir/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We disable logs at the INFO level and disable eager execution. We set verbosity to the level as displayed (ERORR), so TensorFlow will display all messages that have the label ERROR (or more critical)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow.compat.v1 as tf\n", + "tf.disable_eager_execution()\n", + "tf.logging.set_verbosity(tf.logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example Evaluation and Training Pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. 
You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from typing import List\n", + "\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.tensorflow.utils.image_net_trainer import ImageNetTrainer\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + " \n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + " \n", + " @staticmethod\n", + " def evaluate(sess: tf.Session) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess)\n", + "\n", + " \n", + " @staticmethod\n", + " def finetune(sess: tf.Session, update_ops_name: List[str], epochs: int, learning_rate: float, decay_steps: int):\n", + " \"\"\"\n", + " Given a TF session, finetunes it to improve its accuracy\n", + " :param sess: The sess graph to fine-tune.\n", + " :param update_ops_name: list of name of update ops (mostly BatchNorms' moving averages).\n", + " tf.GraphKeys.UPDATE_OPS collections is always used\n", + " in addition to this list\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param decay_steps: A number used to adjust(decay) the learning rate after every decay_steps epochs in training.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_epochs=epochs, format_bgr=True)\n", + "\n", + " trainer.train(sess, update_ops_name=update_ops_name, learning_rate=learning_rate, decay_steps=decay_steps)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to a tensorflow session. 
Similarly, you can load any pretrained tensorflow model instead.\n", + "\n", + "\n", + "Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.\n", + "\n", + "\n", + "By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op. Since batchnorm ops are folded, these need to be ignored during training." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from tensorflow.compat.v1.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "\n", + "model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))\n", + "update_ops_name = [op.name for op in model.updates] # Used for finetuning" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag\n", + "\n", + "model = update_keras_bn_ops_trainable_flag(model, load_save_path=\"./\", trainable=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET features currently support tensorflow sessions. **add_image_net_computational_nodes_in_graph** adds an output layer, softmax and loss functions to the Resnet50 model graph." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "\n", + "sess = tf.keras.backend.get_session()\n", + "\n", + "# Creates the computation graph of ResNet within the tensorflow session.\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "starting_op_names = [model.input.name.split(\":\")[0]]\n", + "output_op_names = [model.output.name.split(\":\")[0]]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We are checking if TensorFlow is using CPU or CUDA device. This example code will use CUDA if available in your current execution environment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = tf.test.is_gpu_available(cuda_only=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sess=sess)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 3. 
Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers on the given model and returns a new session" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "BN_folded_sess, _= fold_all_batch_norms(sess,\n", + " input_op_names=starting_op_names,\n", + " output_op_names=output_op_names)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"training_range_learning_with_tf_init\"\n", + " - This is the key setting that enables \"range learning\". With this choice of quant scheme, AIMET will use the TF quant scheme to initialize the quantization parameters like scale/offset. And then those parameters are set to be trainable so they can continue to be updated during fine-tuning.\n", + " - Another choice for quant_scheme is \"training_range_learning_with_tf_enhanced_init\". Similar to the above, but the initialization for scale/offset is doing using the TF Enhanced scheme. Since in both schemes the quantization parameters are set to be trainable, there is not much benefit to using this choice instead of \"training_range_learning_with_tf_init.\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_tensorflow.quantsim import QuantizationSimModel\n", + "\n", + "sim = QuantizationSimModel(session=BN_folded_sess,\n", + " starting_op_names=starting_op_names,\n", + " output_op_names=output_op_names,\n", + " quant_scheme= QuantScheme.training_range_learning_with_tf_enhanced_init,\n", + " default_output_bw=8,\n", + " default_param_bw=8,\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Compute Encodings\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def pass_calibration_data(session: tf.compat.v1.Session, _):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),\n", + " session.graph.get_tensor_by_name('labels:0')]\n", + " \n", + " train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]\n", + " train_tensors_dict = dict.fromkeys(train_tensors, False)\n", + " \n", + " eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]\n", + "\n", + " samples = 500\n", + "\n", + " batch_cntr = 0\n", + " for input_label in data_loader:\n", + " input_label_tensors_dict = dict(zip(input_label_tensors, input_label))\n", + "\n", + " feed_dict = {**input_label_tensors_dict, **train_tensors_dict}\n", + "\n", + " with session.graph.as_default():\n", + " _ = session.run(eval_outputs, feed_dict=feed_dict)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. Perform QAT\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.finetune(sim.session, update_ops_name=update_ops_name, epochs=1, learning_rate=1e-3, decay_steps=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.session)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we should have an improved model after QAT. Now the next step would be to actually take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. AIMET QuantizationSimModel provides an export API for this purpose. This API would save the model as #TODO" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "sim.export(path='./output/', filename_prefix='resnet50_after_qat_range_learning')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing QAT with range-learning.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and the vanilla QAT method (without range-learning)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/tensorflow/quantization/quant_analyzer.html b/releases/1.32.2/Examples/tensorflow/quantization/quant_analyzer.html new file mode 100644 index 00000000..0b501468 --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/quant_analyzer.html @@ -0,0 +1,1491 @@ + + + + + + Quant Analyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quant Analyzer

+

This notebook showcases a working code example of how to use AIMET to apply Quant Analyzer. Quant Analyzer is a feature which performs various analyses on a model to understand how each op in the model responds to quantization.

+
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation pipeline 2. Load the FP32 model 3. Apply QuantAnalyzer to the model

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results.

  • +
  • For example, it uses a relatively quantization-friendly model like Resnet50.

  • +
  • Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert it into tfrecords.

+

Note1: The ImageNet tfrecords dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them: a folder containing tfrecord files starting with ‘train*’ for training files and ‘valid*’ for validation files. Each tfrecord file should have the features ‘image/encoded’ for image data and ‘image/class/label’ for its corresponding class.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class, but for the purpose of running this notebook you could reduce the dataset to, say, 2 samples per class and then convert it into tfrecords. This exercise is left up to the reader and is not required.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
TFRECORDS_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+

We disable logs at the INFO level and disable eager execution. We set the verbosity to ERROR, so TensorFlow will display only messages that have the label ERROR (or more critical).

+
+
[ ]:
+
+
+
import os
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
+
+import tensorflow.compat.v1 as tf
+tf.disable_eager_execution()
+tf.logging.set_verbosity(tf.logging.ERROR)
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written?

    +

    Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a TensorFlow model. This QuantizationSim model can be used in place of the original model when doing inference or training.

    +
  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods?

    +

    Not really. You should be able to use your existing evaluate and train routines as-is.

    +
  • +
+
+
[ ]:
+
+
+
from typing import List
+
+from Examples.common import image_net_config
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+
+class ImageNetDataPipeline:
+    """
+    Provides APIs for model evaluation and finetuning using ImageNet Dataset.
+    """
+
+    @staticmethod
+    def get_val_dataloader():
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(TFRECORDS_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         format_bgr=True)
+
+        return data_loader
+
+    @staticmethod
+    def evaluate(sess: tf.Session, iterations: int = None) -> float:
+        """
+        Given a TF session, evaluates its Top-1 accuracy on the validation dataset
+        :param sess: The sess graph to be evaluated.
+        :param iterations: No of batches to use. Default is complete dataset
+        :return: The accuracy for the sample with the maximum accuracy.
+        """
+        evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],
+                                      data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                      image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      format_bgr=True)
+
+        return evaluator.evaluate(sess, iterations)
+
+
+
+
+
+
+

2. Load the model

+

For this example notebook, we are going to load a pretrained ResNet50 model from keras and convert it to a tensorflow session. Similarly, you can load any pretrained tensorflow model instead.

+

Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.

+

By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op.

+
+
[ ]:
+
+
+
from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+tf.keras.backend.clear_session()
+
+model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+
+
+
+

The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers.

+
+
[ ]:
+
+
+
from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+model = update_keras_bn_ops_trainable_flag(model, load_save_path="./", trainable=False)
+
+
+
+

AIMET features currently support tensorflow sessions. add_image_net_computational_nodes_in_graph adds an output layer, softmax and loss functions to the Resnet50 model graph.

+
+
[ ]:
+
+
+
from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+
+sess = tf.keras.backend.get_session()
+add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+
+
+

Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here.

+
+
[ ]:
+
+
+
starting_op_names = [model.input.name.split(":")[0]]
+output_op_names = [model.output.name.split(":")[0]]
+
+
+
+

We check whether TensorFlow is using the CPU or a CUDA device. This example code will use CUDA if it is available in your current execution environment.

+
+
[ ]:
+
+
+
use_cuda = tf.test.is_gpu_available(cuda_only=True)
+
+
+
+
+
+
+

3. Apply QuantAnalyzer to the model

+

QuantAnalyzer requires two functions to be defined by the user for passing data through the model:

+

Forward pass callback

+

One function will be used to pass representative data through a quantized version of the model to calibrate quantization parameters. This function should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metrics, so we can just ignore the model output.

+

The function must take two arguments, the first of which will be the session to run the forward pass on. The second argument can be anything additional which the function requires to run, and can be in the form of a single item or a tuple of items.

+

If no additional argument is needed, the user can specify a dummy “_” parameter for the function.

+

A few pointers regarding the forward pass data samples:

+
    +
  • In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.

  • +
  • It may be beneficial if the samples used for computing encoding are well distributed. It’s not necessary that all classes need to be covered since we are only looking at the range of values at every op activation. However, we definitely want to avoid an extreme scenario like all ‘dark’ or ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results.

  • +
+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example. This function only requires unlabeled data as no loss or other evaluation metric is needed.

+
+
[ ]:
+
+
+
def pass_calibration_data(session: tf.compat.v1.Session, _):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),
+                           session.graph.get_tensor_by_name('labels:0')]
+
+    train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]
+    train_tensors_dict = dict.fromkeys(train_tensors, False)
+
+    eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]
+
+    samples = 500
+
+    batch_cntr = 0
+    for input_label in data_loader:
+        input_label_tensors_dict = dict(zip(input_label_tensors, input_label))
+
+        feed_dict = {**input_label_tensors_dict, **train_tensors_dict}
+
+        with session.graph.as_default():
+            _ = session.run(eval_outputs, feed_dict=feed_dict)
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+
+
+

In order to pass this function to QuantAnalyzer, we need to wrap it in a CallbackFunc object, as shown below. The CallbackFunc takes two arguments: the callback function itself, and the inputs to pass into the callback function.

+
+
[ ]:
+
+
+
from aimet_common.utils import CallbackFunc
+
+forward_pass_callback = CallbackFunc(func=pass_calibration_data, func_callback_args=None)
+
+
+
+
+

Evaluation callback

+

The second function will be used to evaluate the model, and needs to return an accuracy metric. Here, the user can pass as much data through the model as they would like when evaluating the model for accuracy.

+

Like the forward pass callback, this function also must take exactly two arguments: the session to evaluate, and any additional argument needed for the function to work. The second argument can be a tuple of items in case multiple items are needed.

+

We will use the ImageNetDataPipeline’s evaluate() method defined above for this purpose. Like the forward pass callback, we need to wrap the evaluation callback in a CallbackFunc object as well.

+
+
[ ]:
+
+
+
data_pipeline = ImageNetDataPipeline()
+eval_callback = CallbackFunc(func=ImageNetDataPipeline.evaluate)
+
+
+
+
+

Creating unlabeled dataset and defining number of batches for MSE loss per op analysis

+

An optional analysis step in QuantAnalyzer calculates the MSE loss per op in the model, comparing the op outputs from the original FP32 model vs. a quantized model. To perform this step, the user needs to also provide an unlabeled Dataset to QuantAnalyzer.

+

We will demonstrate this step by using the ImageNetDataLoader imported above.

+
+
[ ]:
+
+
+
dataset = data_pipeline.get_val_dataloader().dataset
+
+with dataset._graph.as_default():
+    unlabeled_dataset = dataset.map(lambda x,y: x)
+num_batches = 4
+
+
+
+
+

We are now ready to apply QuantAnalyzer.

+
+
[ ]:
+
+
+
from aimet_tensorflow.quant_analyzer import QuantAnalyzer
+
+quant_analyzer = QuantAnalyzer(sess, start_op_names=starting_op_names, output_op_names=output_op_names,
+                               forward_pass_callback=forward_pass_callback, eval_callback=eval_callback, use_cuda= use_cuda)
+
+
+
+

Finally, to start the analyzer, we call analyze()

+

A few of the parameters are explained here:
- quant_scheme: We set this to “post_training_tf_enhanced”. With this choice of quant scheme, AIMET will use the TF Enhanced quant scheme to initialize the quantization parameters like scale/offset.
- default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
- default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+

When the analyze method is called, the following analyses are run:
- Compare fp32 accuracy, accuracy with only parameters quantized, and accuracy with only activations quantized
- For each op, track the model accuracy when quantization for all other ops is disabled (enabling quantization for only one op in the model at a time)
- For each op, track the model accuracy when quantization for all other ops is enabled (disabling quantization for only one op in the model at a time)
- Track the minimum and maximum encoding parameters calculated by each quantizer in the model as a result of forward passes through the model with representative data
- When the TF Enhanced quantization scheme is used, track the histogram of tensor ranges seen by each quantizer in the model as a result of forward passes through the model with representative data
- Track the MSE loss seen at each op by comparing op outputs of the original fp32 model vs. a quantized model, when the user has provided an unlabeled dataset and a number of batches

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+
+quant_analyzer.analyze(default_param_bw=8, default_output_bw=8,
+                       quant_scheme=QuantScheme.post_training_tf_enhanced,
+                       config_file=None,
+                       unlabeled_dataset=unlabeled_dataset, num_batches=num_batches,
+                       results_dir='./tmp/')
+
+
+
+

AIMET will also output .html plots and json files where appropriate for each analysis to help visualize the data.

+

The following output files will be produced, in a folder specified by the user:

+
results_dir
+|-- per_op_quant_enabled.html
+|-- per_op_quant_enabled.json
+|-- per_op_quant_disabled.html
+|-- per_op_quant_disabled.json
+|-- min_max_ranges
+|   |-- activations.html
+|   |-- activations.json
+|   |-- weights.html
+|   +-- weights.json
+|-- activations_pdf
+|   |-- quant_op_name0.html
+|   |-- quant_op_name1.html
+|   |-- ...
+|   +-- quant_op_nameN.html
+|-- weights_pdf
+|   |-- op1
+|   |   |-- param_name_{channel_index_0}.html
+|   |   |-- param_name_{channel_index_1}.html
+|   |   |-- ...
+|   |   +-- param_name_{channel_index_x}.html
+|   |-- op2
+|   |   |-- param_name_{channel_index_0}.html
+|   |   |-- param_name_{channel_index_1}.html
+|   |   |-- ...
+|   |   +-- param_name_{channel_index_y}.html
+|   |-- ...
+|   |-- opn
+|   |   |-- param_name_{channel_index_0}.html
+|   |   |-- param_name_{channel_index_1}.html
+|   |   |-- ...
+|   +-- +-- param_name_{channel_index_z}.html
+|-- per_op_mse_loss.html
++-- per_op_mse_loss.json
+
+
+
+
+
+

Per op analysis by enabling/disabling quantization ops

+
    +
  • per_op_quant_enabled.html: A plot with ops on the x-axis and model accuracy on the y-axis, where each op’s accuracy represents the model accuracy when all quantizers in the model are disabled except for that op’s parameter and activation quantizers.

  • +
  • per_op_quant_enabled.json: A json file containing the data shown in per_op_quant_enabled.html, associating op names with model accuracy.

  • +
  • per_op_quant_disabled.html: A plot with ops on the x-axis and model accuracy on the y-axis, where each op’s accuracy represents the model accuracy when all quantizers in the model are enabled except for that op’s parameter and activation quantizers.

  • +
  • per_op_quant_disabled.json: A json file containing the data shown in per_op_quant_disabled.html, associating op names with model accuracy.

  • +
+

per_op_quant_enabled.html

+
+
+
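The json files are convenient for quick programmatic triage. Below is a minimal sketch, assuming each file simply maps op names to the corresponding model accuracy and was written to the results_dir ('./tmp/') used in the analyze() call above.

import json

# Ops whose individual quantization hurts accuracy the most float to the top.
with open('./tmp/per_op_quant_enabled.json') as f:
    per_op_accuracy = json.load(f)

for op_name, acc in sorted(per_op_accuracy.items(), key=lambda kv: kv[1])[:10]:
    print(f"{op_name}: {acc:.4f}")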

Encoding min/max ranges

+
  • min_max_ranges: A folder containing the following sets of files:

      • activations.html: A plot with output activations on the x-axis and min-max values on the y-axis, where each output activation’s range represents the encoding min and max parameters computed during forward pass calibration.

      • activations.json: A json file containing the data shown in activations.html, associating op names with min and max encoding values.

      • weights.html: A plot with parameter names on the x-axis and min-max values on the y-axis, where each parameter’s range represents the encoding min and max parameters computed during forward pass calibration.

      • weights.json: A json file containing the data shown in weights.html, associating parameter names with min and max encoding values.

min_max_ranges.html

+
+
+
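The min/max range data can be examined in the same way. The sketch below is illustrative only: it assumes each entry in activations.json maps an op name to a dict with 'min' and 'max' keys (adjust the key names to the actual file contents) and flags the activations with the widest ranges, which typically lose the most resolution at 8-bit:

[ ]:

import json
import os

results_dir = './tmp/'

with open(os.path.join(results_dir, 'min_max_ranges', 'activations.json')) as f:
    activation_ranges = json.load(f)

# Width of each activation's encoding range (assumed 'min'/'max' keys per entry)
range_widths = {name: entry['max'] - entry['min'] for name, entry in activation_ranges.items()}
for name, width in sorted(range_widths.items(), key=lambda kv: kv[1], reverse=True)[:5]:
    print(name, width)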

PDF of statistics

+
  • (If the TF Enhanced quant scheme is used) activations_pdf: A folder containing html files for each op, plotting the histogram of tensor values seen for that op’s output activation during forward pass calibration.

  • (If the TF Enhanced quant scheme is used) weights_pdf: A folder containing sub-folders for each op with weights. Each op’s folder contains html files for each parameter of that op, with a histogram plot of tensor values seen for that parameter during forward pass calibration.

weights_pdf.html

+
+
+

Per op MSE loss

+
  • (Optional, only enabled when the user has provided an unlabeled dataset and number of batches) per_op_mse_loss.html: A plot with ops on the x-axis and MSE loss on the y-axis, where each op’s MSE loss represents the MSE seen when comparing that op’s outputs in the FP32 model vs. the quantized model.

  • (Optional, only enabled when the user has provided an unlabeled dataset and number of batches) per_op_mse_loss.json: A json file containing the data shown in per_op_mse_loss.html, associating op names with MSE loss.

per_op_mse_loss.html

+
+
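Likewise, per_op_mse_loss.json can be used to rank ops by how far their outputs deviate from the FP32 model; a minimal sketch, again assuming a simple op-name-to-value mapping:

[ ]:

import json
import os

results_dir = './tmp/'

with open(os.path.join(results_dir, 'per_op_mse_loss.json')) as f:
    per_op_mse = json.load(f)

# Ops with the highest MSE vs. the FP32 model are good candidates for further
# investigation or for keeping at higher precision.
for op_name, mse in sorted(per_op_mse.items(), key=lambda kv: kv[1], reverse=True)[:5]:
    print(op_name, mse)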
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/tensorflow/quantization/quant_analyzer.ipynb b/releases/1.32.2/Examples/tensorflow/quantization/quant_analyzer.ipynb new file mode 100644 index 00000000..3c68ad1c --- /dev/null +++ b/releases/1.32.2/Examples/tensorflow/quantization/quant_analyzer.ipynb @@ -0,0 +1,618 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": true, + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Quant Analyzer\n", + "\n", + "This notebook showcases a working code example of how to use AIMET to apply Quant Analyzer.\n", + "Quant Analyzer is a feature which performs various analyses on a model to understand how each op in the model responds to quantization.\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation pipeline\n", + "2. Load the FP32 model\n", + "3. Apply QuantAnalyzer to the model\n", + "\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results.\n", + "* For example, it uses a relatively quantization-friendly model like Resnet50.\n", + "* Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#) and convert them into tfrecords.\n", + "\n", + "**Note1**: The ImageNet tfrecords dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- A folder containing tfrecords files starting with **'train\\*'** for training files and **'valid\\*'** for validation files. Each tfrecord file should have features: **'image/encoded'** for image data and **'image/class/label'** for its corresponding class.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class and then convert it into tfrecords. This exercise is left upto the reader and is not necessary.\n", + "\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "TFRECORDS_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We disable logs at the INFO level and disable eager execution. We set verbosity to the level as displayed (ERROR),\n", + "so TensorFlow will display all messages that have the label ERROR (or more critical)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'\n", + "\n", + "import tensorflow.compat.v1 as tf\n", + "tf.disable_eager_execution()\n", + "tf.logging.set_verbosity(tf.logging.ERROR)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?**\n", + "\n", + " Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model.\n", + " This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?**\n", + "\n", + " Not really. You should be able to use your existing evaluate and train routines as-is." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from typing import List\n", + "\n", + "from Examples.common import image_net_config\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetDataLoader\n", + "from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator\n", + "\n", + "class ImageNetDataPipeline:\n", + " \"\"\"\n", + " Provides APIs for model evaluation and finetuning using ImageNet Dataset.\n", + " \"\"\"\n", + " \n", + " @staticmethod\n", + " def get_val_dataloader():\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(TFRECORDS_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return data_loader\n", + " \n", + " @staticmethod\n", + " def evaluate(sess: tf.Session, iterations: int = None) -> float:\n", + " \"\"\"\n", + " Given a TF session, evaluates its Top-1 accuracy on the validation dataset\n", + " :param sess: The sess graph to be evaluated.\n", + " :param iterations: No of batches to use. Default is complete dataset\n", + " :return: The accuracy for the sample with the maximum accuracy.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(TFRECORDS_DIR, training_inputs=['keras_learning_phase:0'],\n", + " data_inputs=['input_1:0'], validation_inputs=['labels:0'],\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " format_bgr=True)\n", + "\n", + " return evaluator.evaluate(sess, iterations)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 2. Load the model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained ResNet50 model from keras and covert it to a tensorflow session. 
Similarly, you can load any pretrained tensorflow model instead.\n", + "\n", + "\n", + "Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited.\n", + "\n", + "\n", + "By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency to the train_op." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from tensorflow.compat.v1.keras.applications.resnet import ResNet50\n", + "\n", + "tf.keras.backend.clear_session()\n", + "\n", + "model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The following utility method in AIMET sets BN layers in the model to eval mode. This allows AIMET to more easily read the BN parameters from the graph. Eventually we will fold BN layers into adjacent conv layers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag\n", + "\n", + "model = update_keras_bn_ops_trainable_flag(model, load_save_path=\"./\", trainable=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET features currently support tensorflow sessions. **add_image_net_computational_nodes_in_graph** adds an output layer, softmax and loss functions to the Resnet50 model graph." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph\n", + "\n", + "sess = tf.keras.backend.get_session()\n", + "add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since all tensorflow input and output tensors have names, we identify the tensors needed by AIMET APIs here. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "starting_op_names = [model.input.name.split(\":\")[0]]\n", + "output_op_names = [model.output.name.split(\":\")[0]]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We are checking if TensorFlow is using CPU or CUDA device. This example code will use CUDA if available in your current execution environment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = tf.test.is_gpu_available(cuda_only=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "\n", + "## 3. 
Apply QuantAnalyzer to the model\n", + "\n", + "QuantAnalyzer requires two functions to be defined by the user for passing data through the model:\n", + "\n", + "**Forward pass callback**\n", + "\n", + "One function will be used to pass representative data through a quantized version of the model to calibrate quantization parameters.\n", + "This function should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model.\n", + "We don't need to compute any loss metrics, so we can just ignore the model output.\n", + "\n", + "The function **must** take two arguments, the first of which will be the session to run the forward pass on.\n", + "The second argument can be anything additional which the function requires to run, and can be in the form of a single item or a tuple of items.\n", + "\n", + "If no additional argument is needed, the user can specify a dummy \"_\" parameter for the function.\n", + "\n", + "A few pointers regarding the forward pass data samples:\n", + "\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings.\n", + " For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed.\n", + " It's not necessary that all classes need to be covered since we are only looking at the range of values at every op activation.\n", + " However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings.\n", + "This routine can be written in many different ways; this is just an example.\n", + "This function only requires unlabeled data as no loss or other evaluation metric is needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "def pass_calibration_data(session: tf.compat.v1.Session, _):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " input_label_tensors = [session.graph.get_tensor_by_name('input_1:0'),\n", + " session.graph.get_tensor_by_name('labels:0')]\n", + " \n", + " train_tensors = [session.graph.get_tensor_by_name('keras_learning_phase:0')]\n", + " train_tensors_dict = dict.fromkeys(train_tensors, False)\n", + " \n", + " eval_outputs = [session.graph.get_operation_by_name('top1-acc').outputs[0]]\n", + "\n", + " samples = 500\n", + "\n", + " batch_cntr = 0\n", + " for input_label in data_loader:\n", + " input_label_tensors_dict = dict(zip(input_label_tensors, input_label))\n", + "\n", + " feed_dict = {**input_label_tensors_dict, **train_tensors_dict}\n", + "\n", + " with session.graph.as_default():\n", + " _ = session.run(eval_outputs, feed_dict=feed_dict)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "In order to pass this function to QuantAnalyzer, we need to wrap it in a CallbackFunc object, as shown below.\n", + "The CallbackFunc takes two arguments: the callback function itself, and the inputs to pass into the callback function." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_common.utils import CallbackFunc\n", + "\n", + "forward_pass_callback = CallbackFunc(func=pass_calibration_data, func_callback_args=None)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "\n", + "**Evaluation callback**\n", + "\n", + "The second function will be used to evaluate the model, and needs to return an accuracy metric.\n", + "In here, the user should pass any amount of data through the model which they would like when evaluating their model for accuracy.\n", + "\n", + "Like the forward pass callback, this function also must take exactly two arguments: the session to evaluate, and any additional argument needed for the function to work.\n", + "The second argument can be a tuple of items in case multiple items are needed.\n", + "\n", + "We will be using the ImageNetDataPipeline's evaluate defined above for this purpose.\n", + "Like the forward pass callback, we need to wrap the evaluation callback in a CallbackFunc object as well." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "data_pipeline = ImageNetDataPipeline()\n", + "eval_callback = CallbackFunc(func=ImageNetDataPipeline.evaluate)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "\n", + "**Creating unlabeled dataset and defining number of batches for MSE loss per op analysis**\n", + "\n", + "An optional analysis step in QuantAnalyzer calculates the MSE loss per op in the model, comparing the op outputs from the original FP32 model vs. a quantized model.\n", + "To perform this step, the user needs to also provide an unlabeled Dataset to QuantAnalyzer.\n", + "\n", + "We will demonstrate this step by using the ImageNetDataLoader imported above." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "dataset = data_pipeline.get_val_dataloader().dataset\n", + " \n", + "with dataset._graph.as_default():\n", + " unlabeled_dataset = dataset.map(lambda x,y: x)\n", + "num_batches = 4 " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "We are now ready to apply QuantAnalyzer." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_tensorflow.quant_analyzer import QuantAnalyzer\n", + "\n", + "quant_analyzer = QuantAnalyzer(sess, start_op_names=starting_op_names, output_op_names=output_op_names,\n", + " forward_pass_callback=forward_pass_callback, eval_callback=eval_callback, use_cuda= use_cuda)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Finally, to start the analyzer, we call analyze()\n", + "\n", + "A few of the parameters are explained here:\n", + "- **quant_scheme**:\n", + " - We set this to \"post_training_tf_enhanced\"\n", + " With this choice of quant scheme, AIMET will use the TF Enhanced quant scheme to initialize the quantization parameters like scale/offset.\n", + "- **default_output_bw**: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.\n", + "- **default_param_bw**: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.\n", + "\n", + "There are other parameters that are set to default values in this example.\n", + "Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.\n", + "\n", + "When analyze method is called, the following analyses are run:\n", + "- Compare fp32 accuracy, accuracy with only parameters quantized and accuracy with only activations quantized\n", + "- For each op, track the model accuracy when quantization for all other ops is disabled (enabling quantization for only one op in the model at a time)\n", + "- For each op, track the model accuracy when quantization for all other ops is enabled (disabling quantization for only one op in the model at a time)\n", + "- Track the minimum and maximum encoding parameters calculated by each quantizer in the model as a result of forward passes through the model with representative data\n", + "- When the TF Enhanced quantization scheme is used, track the histogram of tensor ranges seen by each quantizer in the model as a result of forward passes through the model with representative data\n", + "- Track the MSE loss seen at each op by comparing op outputs of the original fp32 model vs. 
a quantized model when user has provided unlabeled dataset and number of batches" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "\n", + "quant_analyzer.analyze(default_param_bw=8, default_output_bw=8,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " config_file=None,\n", + " unlabeled_dataset=unlabeled_dataset, num_batches=num_batches,\n", + " results_dir='./tmp/')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET will also output .html plots and json files where appropriate for each analysis to help visualize the data.\n", + "\n", + "The following output files will be produced, in a folder specified by the user:\n", + "\n", + "```\n", + "results_dir\n", + "|-- per_op_quant_enabled.html\n", + "|-- per_op_quant_enabled.json\n", + "|-- per_op_quant_disabled.html\n", + "|-- per_op_quant_disabled.json\n", + "|-- min_max_ranges\n", + "| |-- activations.html\n", + "| |-- activations.json\n", + "| |-- weights.html\n", + "| +-- weights.json\n", + "|-- activations_pdf\n", + "| |-- quant_op_name0.html\n", + "| |-- quant_op_name1.html\n", + "| |-- ...\n", + "| +-- quant_op_nameN.html\n", + "|-- weights_pdf\n", + "| |-- op1\n", + "| | |-- param_name_{channel_index_0}.html\n", + "| | |-- param_name_{channel_index_1}.html\n", + "| | |-- ...\n", + "| | +-- param_name_{channel_index_x}.html\n", + "| |-- op2\n", + "| | |-- param_name_{channel_index_0}.html\n", + "| | |-- param_name_{channel_index_1}.html\n", + "| | |-- ...\n", + "| | +-- param_name_{channel_index_y}.html\n", + "| |-- ...\n", + "| |-- opn\n", + "| | |-- param_name_{channel_index_0}.html\n", + "| | |-- param_name_{channel_index_1}.html\n", + "| | |-- ...\n", + "| +-- +-- param_name_{channel_index_z}.html\n", + "|-- per_op_mse_loss.html\n", + "+-- per_op_mse_loss.json\n", + "```\n", + "\n", + "#### Per op analysis by enabling/disabling quantization ops\n", + "\n", + "- per_op_quant_enabled.html: A plot with ops on the x-axis and model accuracy on the y-axis, where each op's accuracy represents the model accuracy when all quantizers in the model are disabled except for that op's parameter and activation quantizers.\n", + "- per_op_quant_enabled.json: A json file containing the data shown in per_op_quant_enabled.html, associating op names with model accuracy.\n", + "- per_op_quant_disabled.html: A plot with ops on the x-axis and model accuracy on the y-axis, where each op's accuracy represents the model accuracy when all quantizers in the model are enabled except for that op's parameter and activation quantizers.\n", + "- per_op_quant_disabled.json: A json file containing the data shown in per_op_quant_disabled.html, associating op names with model accuracy.\n", + "\n", + "![per_op_quant_enabled.html](images/tf_quant_analyzer_per_op_quant_enabled.png)\n", + "\n", + "#### Encoding min/max ranges\n", + "\n", + "- min_max_ranges: A folder containing the following sets of files:\n", + " - activations.html: A plot with output activations on the x-axis and min-max values on the y-axis, where each output activation's range represents the encoding min and max parameters computed during forward pass calibration.\n", + " - activations.json: A json file containing the data shown in activations.html, associating op names with min and max encoding values.\n", + " - weights.html: A plot with parameter names on the x-axis and min-max values on the y-axis, 
where each parameter's range represents the encoding min and max parameters computed during forward pass calibration.\n", + " - weights.json: A json file containing the data shown in weights.html, associating parameter names with min and max encoding values.\n", + "\n", + "![min_max_ranges.html](images/tf_quant_analyzer_min_max_range_weights.png)\n", + "\n", + "#### PDF of statistics\n", + "\n", + "- (If TF Enhanced quant scheme is used) activations_pdf: A folder containing html files for each op, plotting the histogram of tensor values seen for that op's output activation seen during forward pass calibration.\n", + "- (If TF Enhanced quant scheme is used) weights_pdf: A folder containing sub folders for each op with weights.\n", + " Each op's folder contains html files for each parameter of that op, with a histogram plot of tensor values seen for that parameter seen during forward pass calibration.\n", + "\n", + "![weights_pdf.html](images/tf_quant_analyzer_pdf.png)\n", + "\n", + "#### Per op MSE loss\n", + "- (Optional, only enabled when user has provided unlabeled dataset and number of batches) per_op_mse_loss.html: A plot with ops on the x-axis and MSE loss on the y-axis, where each op's MSE loss represents the MSE seen comparing that op's outputs in the FP32 model vs. the quantized model.\n", + "- (Optional, only enabled when user has provided unlabeled dataset and number of batches) per_op_mse_loss.json: A json file containing the data shown in per_op_mse_loss.html, associating op names with MSE loss.\n", + "\n", + "![per_op_mse_loss.html](images/tf_quant_analyzer_mse_loss.png)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.0" + } + }, + "nbformat": 4, + "nbformat_minor": 1 +} diff --git a/releases/1.32.2/Examples/torch/compression/channel_pruning.html b/releases/1.32.2/Examples/torch/compression/channel_pruning.html new file mode 100644 index 00000000..0608889a --- /dev/null +++ b/releases/1.32.2/Examples/torch/compression/channel_pruning.html @@ -0,0 +1,1401 @@ + + + + + + Model compression using Channel Pruning — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Model compression using Channel Pruning

+

This notebook shows a working code example of how to use AIMET to perform model compression. The Channel Pruning technique is used in this notebook to achieve model compression.

+

Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.

+
  1. Spatial SVD: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique will decompose a single convolutional layer into two. The weight tensor of the layer to be split is flattened to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers.

  2. Channel Pruning: In this technique AIMET will discard the least significant (using a magnitude metric) input channels of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have their channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance of the compressed layer output to the corresponding layer output of the original model.

Both of the above are structured pruning techniques that aim to reduce the computational MACs or memory requirements of the model. After applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover an accuracy close to that of the original model.

+
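To build some intuition for the magnitude metric mentioned above, the sketch below ranks the input channels of a Conv2D weight tensor by their L2 norm and keeps only the most significant ones. This is purely an illustration of the idea and not AIMET’s implementation, which also adjusts the upstream layers and runs the layer-by-layer reconstruction step described above:

[ ]:

import torch

def select_input_channels(conv_weight: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Rank input channels of a Conv2D weight (out_ch, in_ch, kH, kW) by L2 norm
    and return the indices of the channels to keep."""
    in_channels = conv_weight.shape[1]
    # Magnitude of each input channel, aggregated over output channels and kernel positions
    channel_norms = conv_weight.pow(2).sum(dim=(0, 2, 3)).sqrt()
    num_keep = max(1, int(round(keep_ratio * in_channels)))
    return torch.topk(channel_norms, num_keep).indices.sort().values

# Keep 50% of the input channels of a random 64x128x3x3 conv weight
weight = torch.randn(64, 128, 3, 3)
kept = select_input_channels(weight, keep_ratio=0.5)
pruned_weight = weight[:, kept, :, :]
print(pruned_weight.shape)   # torch.Size([64, 64, 3, 3])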

This notebook shows a working code example of how technique #2 can be used to compress the model. You can find a separate notebook for #1, and for #1 followed by #2, in the same folder.

+
+

Overall flow

+
+
This notebook covers the following:

  1. Instantiate the example evaluation and training pipeline

  2. Load the model and evaluate it to find the baseline accuracy

  3. Compress the model and fine-tune:
+
3.1 Compress model using Channel Pruning and evaluate it to find post-compression accuracy
+
3.2 Fine-tune the model
+
+
+
+

What this notebook is not

+
  • This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The ImageNet dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them:

  • Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details.

  • A subdirectory per class, and a file per image sample.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to compress it, and the resultant model is still a PyTorch model. This compressed model can be used in place of the original model when doing inference or training.

  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really, but the evaluate() method should return a single number representing the accuracy of the model. Ideally, you should be able to use your existing evaluate and train routines as-is.
+
[ ]:
+
+
+
import os
+import torch
+from typing import List
+from Examples.common import image_net_config
+from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_trainer import ImageNetTrainer
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(model: torch.nn.Module, iterations: int, use_cuda: bool) -> float:
+        """
+        Given a torch model, evaluates its Top-1 accuracy on the dataset
+        :param model: the model to evaluate
+        :param iterations: the number of batches to be used to evaluate the model. A value of 'None' means the model will be
+                           evaluated on the entire dataset once.
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(model, iterations=iterations, use_cuda=use_cuda)
+
+    @staticmethod
+    def finetune(model: torch.nn.Module, epochs: int, learning_rate: float, learning_rate_schedule: List, use_cuda: bool):
+        """
+        Given a torch model, finetunes the model to improve its accuracy
+        :param model: the model to finetune
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param learning_rate_schedule: The learning rate schedule used during the finetuning step.
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_workers=image_net_config.train['num_workers'])
+
+        trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,
+                      learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)
+
+
+
+
+
+
+

2. Load the model and evaluate it to find the baseline accuracy

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+
+model = resnet18(pretrained=True)
+
+
+
+
+

We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
use_cuda = False
+if torch.cuda.is_available():
+    use_cuda = True
+    model.to(torch.device('cuda'))
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+
+

3. Compress the model and fine-tune

+
+

3.1. Compress model using Channel Pruning and evaluate it to find post-compression accuracy

+

Now we use AIMET to define compression parameters for Channel Pruning, a few of which are explained here:

+
  • target_comp_ratio: The desired compression ratio for Channel Pruning. We are using 0.9 to compress the model by 10%.

  • num_comp_ratio_candidates: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes the different compression ratios tried by AIMET for each layer. We are using 3 here, which translates to 0.33, 0.66 and 1.00 compression ratios at each layer. The optimal value is 10. The higher the number of candidates, the more granular the measurements for each layer, but also the higher the time taken to complete these measurements.

  • modules_to_ignore: This list can contain the references of model layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.

  • mode: We are choosing Auto mode, which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. The alternate choice is Manual.

  • data_loader: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.

  • num_reconstruction_samples: The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here, which is a ridiculously low number but enables this notebook to execute quickly. A typical setting here would be ~1000 samples.

  • allow_custom_downsample_ops: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can allow more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default.

  • eval_callback: The model evaluation function. The expected signature of the evaluate function should be <function_name>(model, eval_iterations, use_cuda) and it is expected to return an accuracy metric.

  • eval_iterations: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. It is expected that the eval callback would use the same samples for every invocation of the callback.

  • compress_scheme: We choose the ‘channel pruning’ compression scheme.

  • cost_metric: Determines whether we want to target reducing MACs or memory by the desired compression ratio. We are choosing ‘mac’ here.
+
[ ]:
+
+
+
from decimal import Decimal
+from aimet_torch.defs import GreedySelectionParameters, ChannelPruningParameters
+from aimet_common.defs import CompressionScheme, CostMetric
+
+greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.9),
+                                          num_comp_ratio_candidates=3)
+modules_to_ignore = [model.conv1]
+auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                      modules_to_ignore=modules_to_ignore)
+data_loader = ImageNetDataPipeline.get_val_dataloader()
+params = ChannelPruningParameters(data_loader=data_loader,
+                                  num_reconstruction_samples=10,
+                                  allow_custom_downsample_ops=False,
+                                  mode=ChannelPruningParameters.Mode.auto,
+                                  params=auto_params)
+
+eval_callback = ImageNetDataPipeline.evaluate
+eval_iterations = 1
+compress_scheme = CompressionScheme.channel_pruning
+cost_metric = CostMetric.mac
+
+
+
+
+
+
We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics.
+
Note: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.
+
+
+
[ ]:
+
+
+
from aimet_torch.compress import ModelCompressor
+compressed_model, comp_stats = ModelCompressor.compress_model(model=model,
+                                                              eval_callback=eval_callback,
+                                                              eval_iterations=eval_iterations,
+                                                              input_shape=(1, 3, 224, 224),
+                                                              compress_scheme=compress_scheme,
+                                                              cost_metric=cost_metric,
+                                                              parameters=params)
+
+print(comp_stats)
+
+
+
+
+

Now the compressed model is ready to be used for inference or training. First, we can pass this model to the same evaluation routine we used before to calculate the compressed model’s accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+

As you can see, the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy.

+
+
+

3.2. Fine-tune the model

+

After the model is compressed using Channel Pruning, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+
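For illustration, the “drop the learning rate by a factor of 10 every few epochs” heuristic can be expressed with a standard PyTorch MultiStepLR schedule. This is a generic sketch, not the internals of the ImageNetTrainer used by the pipeline above; its learning_rate_schedule argument plays a similar role:

[ ]:

import torch

# Hypothetical optimizer over the compressed model's parameters
optimizer = torch.optim.SGD(compressed_model.parameters(), lr=15e-4, momentum=0.9)

# Drop the learning rate by 10x after epochs 5 and 10
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[5, 10], gamma=0.1)

# In a training loop, step the scheduler once per epoch:
# for epoch in range(epochs):
#     train_one_epoch(compressed_model, optimizer)   # hypothetical training routine
#     scheduler.step()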

For the purpose of this example notebook, we are going to train for only 2 epochs. But feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.finetune(compressed_model, epochs=2, learning_rate=15e-4, learning_rate_schedule=[5, 10],
+                              use_cuda=use_cuda)
+
+
+
+
+

After we are done finetuning the compressed model, we can check its floating point accuracy against the same validation dataset to observe any improvement in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you should have observed a slight gain in accuracy after a couple of epochs of training. Of course, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.

+

So we have an improved model after compression using Channel Pruning. Optionally, this model can now be saved like a regular PyTorch model.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+torch.save(compressed_model, './output/finetuned_model')
+
+
+
+
+
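Because the full model object (not just a state_dict) was saved above, it can be restored later with torch.load, provided the class definitions used by the model are importable at load time; a minimal sketch:

[ ]:

import torch

# Reload the finetuned, compressed model saved above
loaded_model = torch.load('./output/finetuned_model')
loaded_model.eval()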
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET to perform compression with Channel Pruning. As indicated above, some parameters have been chosen so that the example runs faster.

+

A few additional resources:

  • Refer to the AIMET API docs for more details on the APIs and optional parameters

  • Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/compression/channel_pruning.ipynb b/releases/1.32.2/Examples/torch/compression/channel_pruning.ipynb new file mode 100644 index 00000000..e19e3606 --- /dev/null +++ b/releases/1.32.2/Examples/torch/compression/channel_pruning.ipynb @@ -0,0 +1,408 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Model compression using Channel Pruning \n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform model compression. The Channel Pruning technique is used in this notebook to achieve model compression.\n", + "\n", + "Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.\n", + "\n", + "1. **Spatial SVD**: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique will decompose a single convolutional layer into two. The weight tensor of the layer to be split is flattended to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers.\n", + "2. **Channel Pruning**: In this technique AIMET will discard least significant (using a magnitude metric) input channels of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have the channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance of the compressed layer output to the corresponding layer output of the original model.\n", + "\n", + "Both of the above techniques are structured pruning techniques that aim to reduce computational macs or memory requirements of the model. Subsequent to applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover accuracy close to the original model.\n", + "\n", + "This notebook shows working code example of how the technique #2 can be used to compress the model. You can find a separate notebook for #1, and #1 followed by #2 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the model and evaluate it to find the baseline accuracy\n", + "3. Compress the model and fine-tune: \n", + " 3.1 Compress model using Channel Pruning and evaluate it to find post-compression accuracy \n", + " 3.2 Fine-tune the model\n", + "\n", + "\n", + "#### What this notebook is not \n", + "* This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. 
https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to compress it and the resultant model is still a PyTorch model. This compressed model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really, but evaluate() method should return a single number representing the accuracy of the model. Ideally, You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import torch\n", + "from typing import List\n", + "from Examples.common import image_net_config\n", + "from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_trainer import ImageNetTrainer\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model: torch.nn.Module, iterations: int, use_cuda: bool) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param model: the model to evaluate\n", + " :param iterations: the number of batches to be used to evaluate the model. 
A value of 'None' means the model will be\n", + " evaluated on the entire dataset once.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(model, iterations=iterations, use_cuda=use_cuda)\n", + "\n", + " @staticmethod\n", + " def finetune(model: torch.nn.Module, epochs: int, learning_rate: float, learning_rate_schedule: List, use_cuda: bool):\n", + " \"\"\"\n", + " Given a torch model, finetunes the model to improve its accuracy\n", + " :param model: the model to finetune\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param learning_rate_schedule: The learning rate schedule used during the finetuning step.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_workers=image_net_config.train['num_workers'])\n", + "\n", + " trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,\n", + " learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate it to find the baseline accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "\n", + "model = resnet18(pretrained=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = False\n", + "if torch.cuda.is_available():\n", + " use_cuda = True\n", + " model.to(torch.device('cuda'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Compress the model and fine-tune\n", + "\n", + "### 3.1. Compress model using Channel Pruning and evaluate it to find post-compression accuracy\n", + "Now we use AIMET to define compression parameters for Channel Pruning, few of which are explained here\n", + "\n", + "- **target_comp_ratio**: The desired compression ratio for Channel Pruning. 
We are using 0.9 to compress the model by 10%.\n", + "\n", + "- **num_comp_ratio_candidates**: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes the different compression ratios tried by the AIMET for each layer. We are using 3 here which translates to 0.33, 0.66 and 1.00 compression ratios at each layer. Optimal value is 10. The higher the number of candidates the more granular the measurements for each layer, but also the higher the time taken to complete these measurements.\n", + "\n", + "- **modules_to_ignore**: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.\n", + "\n", + "- **mode**: We are chossing **Auto** mode which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. The alternate choice is **Manual**.\n", + "\n", + "- **data_loader**: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.\n", + "\n", + "- **num_reconstruction_samples**: The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here which is a ridiculously low number but enables this notebook to execute quickly. A typical setting here would ~1000 samples.\n", + "\n", + "- **allow_custom_downsample_ops**: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default.\n", + "\n", + "- **eval_callback**: The model evaluation function. The expected signature of the evaluate function should be `(model, eval_iterations, use_cuda)` and it is expected to return an accuracy metric.\n", + "\n", + "- **eval_iterations**: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. It is expected that the eval callback would use the same samples for every invocation of the callback.\n", + "\n", + "- **compress_scheme**: We choose the 'channel pruning' compression scheme.\n", + "\n", + "- **cost_metric**: Determines whether we want to target either to reduce MACs or memory by the desired compression ratio. We are chossing 'mac' here." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from decimal import Decimal\n", + "from aimet_torch.defs import GreedySelectionParameters, ChannelPruningParameters\n", + "from aimet_common.defs import CompressionScheme, CostMetric\n", + "\n", + "greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.9),\n", + " num_comp_ratio_candidates=3)\n", + "modules_to_ignore = [model.conv1]\n", + "auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params,\n", + " modules_to_ignore=modules_to_ignore)\n", + "data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + "params = ChannelPruningParameters(data_loader=data_loader,\n", + " num_reconstruction_samples=10,\n", + " allow_custom_downsample_ops=False,\n", + " mode=ChannelPruningParameters.Mode.auto,\n", + " params=auto_params)\n", + "\n", + "eval_callback = ImageNetDataPipeline.evaluate\n", + "eval_iterations = 1\n", + "compress_scheme = CompressionScheme.channel_pruning\n", + "cost_metric = CostMetric.mac" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics. \n", + "**Note**: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from aimet_torch.compress import ModelCompressor\n", + "compressed_model, comp_stats = ModelCompressor.compress_model(model=model,\n", + " eval_callback=eval_callback,\n", + " eval_iterations=eval_iterations,\n", + " input_shape=(1, 3, 224, 224),\n", + " compress_scheme=compress_scheme,\n", + " cost_metric=cost_metric,\n", + " parameters=params)\n", + "\n", + "print(comp_stats)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the compressed model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before to calculated compressed model accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "As you can see the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy back." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2. Fine-tune the model\n", + "\n", + "After the model is compressed using Channel Pruning, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.finetune(compressed_model, epochs=2, learning_rate=15e-4, learning_rate_schedule=[5, 10],\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with finetuing the compressed model, we can check the floating point accuracy against the same validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you should have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after compression using Channel Pruning. Optionally, this model now can be saved like a regular PyTorch model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "torch.save(compressed_model, './output/finetuned_model')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing compression with Channel Pruning. As indicated above, some parameters have been chosen in a way to run the example faster.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/torch/compression/spatial_svd.html b/releases/1.32.2/Examples/torch/compression/spatial_svd.html new file mode 100644 index 00000000..8876ec53 --- /dev/null +++ b/releases/1.32.2/Examples/torch/compression/spatial_svd.html @@ -0,0 +1,1380 @@ + + + + + + Model compression using Spatial SVD — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Model compression using Spatial SVD

+

This notebook shows a working code example of how to use AIMET to perform model compression using the Spatial SVD technique.

+

Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.

+
    +
  1. Spatial SVD: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique decomposes a single convolutional layer into two. The weight tensor of the layer to be split is flattened to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers. A small illustrative sketch of this idea follows this list.

  2. Channel Pruning: In this technique AIMET discards the least significant (by a magnitude metric) input channels of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have their channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance of the compressed layer output to the corresponding layer output of the original model.
+
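To make the Spatial SVD description above concrete, here is a minimal, hypothetical sketch of splitting a convolution weight with a truncated SVD. It only illustrates the idea described in the list; the function name, the particular reshape and the rank are assumptions for this example, not AIMET's actual implementation.

import torch

def split_conv_weight_with_svd(weight: torch.Tensor, rank: int):
    """Illustration only: approximate a Conv2d weight of shape (out_ch, in_ch, kh, kw)
    with two low-rank factors by flattening it to 2D and keeping the top 'rank'
    singular values."""
    out_ch, in_ch, kh, kw = weight.shape
    w2d = weight.reshape(out_ch, in_ch * kh * kw)           # flatten to a 2D matrix
    u, s, vh = torch.linalg.svd(w2d, full_matrices=False)   # SVD of the flattened weight
    u_k = u[:, :rank] * s[:rank]                            # keep and absorb the top singular values
    v_k = vh[:rank, :]
    # u_k @ v_k approximates w2d; the two factors play the role of the weights of the
    # two smaller convolutional layers that replace the original one.
    return u_k, v_k

Discarding more singular values (a smaller rank) gives a higher compression ratio at the cost of a larger approximation error, which is exactly the trade-off the per-layer compression-ratio search below explores.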

Both of the above techniques are structured pruning techniques that aim to reduce the computational MACs or memory requirements of the model. After applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover accuracy close to that of the original model.

+

This notebook shows a working code example of how technique #1 can be used to compress the model. You can find separate notebooks for #2, and for #1 followed by #2, in the same folder.

+
+

Overall flow

+
+
This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the model and evaluate it to find the baseline accuracy 3. Compress the model and fine-tune:
+
3.1 Compress model using Spatial SVD and evaluate it to find post-compression accuracy
+
3.2 Fine-tune the model
+
+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The ImageNet dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per image sample

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to compress it and the resultant model is still a PyTorch model. This compressed model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really, but the evaluate() method should return a single number representing the accuracy of the model. Ideally, you should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import os
+import torch
+from typing import List
+from Examples.common import image_net_config
+from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_trainer import ImageNetTrainer
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def evaluate(model: torch.nn.Module, iterations: int, use_cuda: bool) -> float:
+        """
+        Given a torch model, evaluates its Top-1 accuracy on the dataset
+        :param model: the model to evaluate
+        :param iterations: the number of batches to be used to evaluate the model. A value of 'None' means the model will be
+                           evaluated on the entire dataset once.
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(model, iterations=iterations, use_cuda=use_cuda)
+
+    @staticmethod
+    def finetune(model: torch.nn.Module, epochs: int, learning_rate: float, learning_rate_schedule: List, use_cuda: bool):
+        """
+        Given a torch model, finetunes the model to improve its accuracy
+        :param model: the model to finetune
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param learning_rate_schedule: The learning rate schedule used during the finetuning step.
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_workers=image_net_config.train['num_workers'])
+
+        trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,
+                      learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)
+
+
+
+
+
+
+

2. Load the model and evaluate it to find the baseline accuracy

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Alternatively, you can load any other pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+
+model = resnet18(pretrained=True)
+
+
+
+
+

We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
use_cuda = False
+if torch.cuda.is_available():
+    use_cuda = True
+    model.to(torch.device('cuda'))
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+
+

3. Compress the model and fine-tune

+
+

3.1. Compress model using Spatial SVD and evaluate it to find post-compression accuracy

+

Now we use AIMET to define compression parameters for Spatial SVD, a few of which are explained here:

+
    +
  • target_comp_ratio: The desired compression ratio for Spatial SVD. We are using 0.8 to compress the model by 20%.

  • +
  • num_comp_ratio_candidates: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes the different compression ratios tried by AIMET for each layer. We are using 3 here, which translates to compression ratios of 0.33, 0.66 and 1.00 at each layer. A value of 10 is usually a good choice. The higher the number of candidates, the more granular the measurements for each layer, but also the higher the time taken to complete these measurements.

  • +
  • modules_to_ignore: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.

  • +
  • mode: We are choosing Auto mode, which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. The alternate choice is Manual.

  • +
  • eval_callback: The model evaluation function. The expected signature of the evaluate function should be <function_name>(model, eval_iterations, use_cuda) and it is expected to return an accuracy metric.

  • +
  • eval_iterations: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. It is expected that the eval callback would use the same samples for every invocation of the callback; one simple way to arrange this is sketched after this list.

  • +
  • compress_scheme: We choose the ‘spatial svd’ compression scheme.

  • +
  • cost_metric: Determines whether we want to target either to reduce MACs or memory by the desired compression ratio. We are choosing ‘mac’ here.

  • +
+
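As noted for eval_iterations above, the callback is expected to see the same samples on every invocation so that the one-batch accuracy estimates stay comparable across the compression-ratio candidates. Here is a minimal, hypothetical sketch of one way to arrange that; the wrapper name and the re-seeding strategy are illustrative assumptions, not part of the AIMET API.

import torch

def deterministic_eval_callback(model, eval_iterations, use_cuda):
    # Re-seed the global RNG before every call so that any shuffling done by the
    # underlying dataloader happens identically each time, keeping the
    # limited-iteration accuracy estimate consistent across invocations.
    torch.manual_seed(0)
    return ImageNetDataPipeline.evaluate(model, iterations=eval_iterations, use_cuda=use_cuda)

A wrapper like this matches the expected (model, eval_iterations, use_cuda) signature, so it could be passed as eval_callback in place of ImageNetDataPipeline.evaluate.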
+
[ ]:
+
+
+
from decimal import Decimal
+from aimet_torch.defs import GreedySelectionParameters, SpatialSvdParameters
+from aimet_common.defs import CompressionScheme, CostMetric
+
+greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),
+                                          num_comp_ratio_candidates=3)
+modules_to_ignore = [model.conv1]
+auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                  modules_to_ignore=modules_to_ignore)
+params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.auto, params=auto_params)
+
+eval_callback = ImageNetDataPipeline.evaluate
+eval_iterations = 1
+compress_scheme = CompressionScheme.spatial_svd
+cost_metric = CostMetric.mac
+
+
+
+
+
+
We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics.
+
Note: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.
+
+
+
[ ]:
+
+
+
from aimet_torch.compress import ModelCompressor
+compressed_model, comp_stats = ModelCompressor.compress_model(model=model,
+                                                              eval_callback=eval_callback,
+                                                              eval_iterations=eval_iterations,
+                                                              input_shape=(1, 3, 224, 224),
+                                                              compress_scheme=compress_scheme,
+                                                              cost_metric=cost_metric,
+                                                              parameters=params)
+
+print(comp_stats)
+
+
+
+
+

Now the compressed model is ready to be used for inference or training. First, we can pass this model to the same evaluation routine we used before to calculate the compressed model accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+

As you can see, the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy.

+
+
+

3.2. Fine-tune the model

+

After the model is compressed using Spatial SVD, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+
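To make the learning-rate guidance above concrete, here is a minimal sketch of how such a schedule is typically expressed with a plain PyTorch optimizer and scheduler. The optimizer type and momentum value are assumptions for illustration; the notebook itself delegates all of this to ImageNetTrainer through the finetune() call below.

import torch

# Hypothetical fine-tuning setup: start near the original run's final learning rate
# and decay by 10x at the epochs listed in the learning-rate schedule.
optimizer = torch.optim.SGD(compressed_model.parameters(), lr=5e-7, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[5, 10], gamma=0.1)

# Inside a standard training loop, scheduler.step() would be called once per epoch;
# the learning rate then drops by a factor of 10 after epochs 5 and 10.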

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.finetune(compressed_model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10],
+                              use_cuda=use_cuda)
+
+
+
+
+

After we are done with finetuning the compressed model, we can check the floating point accuracy against the same validation dataset to observe any improvements in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you should have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this with the model of your choice and play with the hyper-parameters to get the best results.

+

So we have an improved model after compression using Spatial SVD. Optionally, this model can now be saved like a regular PyTorch model.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+torch.save(compressed_model, './output/finetuned_model')
+
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful in helping you understand how to use AIMET to perform compression with Spatial SVD. As indicated above, some parameters have been chosen so that the example runs faster.

+

A few additional resources: - Refer to the AIMET API docs for more details on the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques

+
+
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/compression/spatial_svd.ipynb b/releases/1.32.2/Examples/torch/compression/spatial_svd.ipynb new file mode 100644 index 00000000..2d16573b --- /dev/null +++ b/releases/1.32.2/Examples/torch/compression/spatial_svd.ipynb @@ -0,0 +1,384 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Model compression using Spatial SVD \n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform model compression. The Spatial SVD technique is used in this notebook to achieve model compression.\n", + "\n", + "Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.\n", + "\n", + "1. **Spatial SVD**: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique will decompose a single convolutional layer into two. The weight tensor of the layer to be split is flattended to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers.\n", + "2. **Channel Pruning**: In this technique AIMET will discard least significant (using a magnitude metric) input channels of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have the channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance of the compressed layer output to the corresponding layer output of the original model.\n", + "\n", + "Both of the above techniques are structured pruning techniques that aim to reduce computational macs or memory requirements of the model. Subsequent to applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover accuracy close to the original model.\n", + "\n", + "This notebook shows working code example of how the technique #1 can be used to compress the model. You can find a separate notebook for #2, and #1 followed by #2 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the model and evaluate it to find the baseline accuracy\n", + "3. Compress the model and fine-tune: \n", + " 3.1 Compress model using Spatial SVD and evaluate it to find post-compression accuracy \n", + " 3.2 Fine-tune the model\n", + "\n", + "\n", + "#### What this notebook is not \n", + "* This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. 
https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to compress it and the resultant model is still a PyTorch model. This compressed model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really, but evaluate() method should return a single number representing the accuracy of the model. Ideally, You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import torch\n", + "from typing import List\n", + "from Examples.common import image_net_config\n", + "from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_trainer import ImageNetTrainer\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def evaluate(model: torch.nn.Module, iterations: int, use_cuda: bool) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param model: the model to evaluate\n", + " :param iterations: the number of batches to be used to evaluate the model. 
A value of 'None' means the model will be\n", + " evaluated on the entire dataset once.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(model, iterations=iterations, use_cuda=use_cuda)\n", + "\n", + " @staticmethod\n", + " def finetune(model: torch.nn.Module, epochs: int, learning_rate: float, learning_rate_schedule: List, use_cuda: bool):\n", + " \"\"\"\n", + " Given a torch model, finetunes the model to improve its accuracy\n", + " :param model: the model to finetune\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param learning_rate_schedule: The learning rate schedule used during the finetuning step.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_workers=image_net_config.train['num_workers'])\n", + "\n", + " trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,\n", + " learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate it to find the baseline accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "\n", + "model = resnet18(pretrained=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = False\n", + "if torch.cuda.is_available():\n", + " use_cuda = True\n", + " model.to(torch.device('cuda'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Compress the model and fine-tune\n", + "\n", + "### 3.1. Compress model using Spatial SVD and evaluate it to find post-compression accuracy\n", + "Now we use AIMET to define compression parameters for Spatial SVD, few of which are explained here\n", + "\n", + "- **target_comp_ratio**: The desired compression ratio for Spatial SVD. 
We are using 0.8 to compress the model by 20%.\n", + "\n", + "- **num_comp_ratio_candidates**: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes the different compression ratios tried by the AIMET for each layer. We are using 3 here which translates to 0.33, 0.66 and 1.00 compression ratios at each layer. Optimal value is 10. The higher the number of candidates the more granular the measurements for each layer, but also the higher the time taken to complete these measurements.\n", + "\n", + "- **modules_to_ignore**: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.\n", + "\n", + "- **mode**: We are chossing **Auto** mode which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. The alternate choice is **Manual**.\n", + "\n", + "- **eval_callback**: The model evaluation function. The expected signature of the evaluate function should be `(model, eval_iterations, use_cuda)` and it is expected to return an accuracy metric.\n", + "\n", + "- **eval_iterations**: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. It is expected that the eval callback would use the same samples for every invocation of the callback.\n", + "\n", + "- **compress_scheme**: We choose the 'spatial svd' compression scheme.\n", + "\n", + "- **cost_metric**: Determines whether we want to target either to reduce MACs or memory by the desired compression ratio. We are chossing 'mac' here.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from decimal import Decimal\n", + "from aimet_torch.defs import GreedySelectionParameters, SpatialSvdParameters\n", + "from aimet_common.defs import CompressionScheme, CostMetric\n", + "\n", + "greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),\n", + " num_comp_ratio_candidates=3)\n", + "modules_to_ignore = [model.conv1]\n", + "auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,\n", + " modules_to_ignore=modules_to_ignore)\n", + "params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.auto, params=auto_params)\n", + "\n", + "eval_callback = ImageNetDataPipeline.evaluate\n", + "eval_iterations = 1\n", + "compress_scheme = CompressionScheme.spatial_svd\n", + "cost_metric = CostMetric.mac" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics. 
\n", + "**Note**: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from aimet_torch.compress import ModelCompressor\n", + "compressed_model, comp_stats = ModelCompressor.compress_model(model=model,\n", + " eval_callback=eval_callback,\n", + " eval_iterations=eval_iterations,\n", + " input_shape=(1, 3, 224, 224),\n", + " compress_scheme=compress_scheme,\n", + " cost_metric=cost_metric,\n", + " parameters=params)\n", + "\n", + "print(comp_stats)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the compressed model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before to calculated compressed model accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "As you can see the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy back." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2. Fine-tune the model\n", + "\n", + "After the model is compressed using Spatial SVD, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.finetune(compressed_model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10],\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with finetuing the compressed model, we can check the floating point accuracy against the same validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you should have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after compression using spatial SVD. Optionally, this model now can be saved like a regular PyTorch model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "torch.save(compressed_model, './output/finetuned_model')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing compression with Spatial SVD. As indicated above, some parameters have been chosen in a way to run the example faster.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/torch/compression/spatial_svd_channel_pruning.html b/releases/1.32.2/Examples/torch/compression/spatial_svd_channel_pruning.html new file mode 100644 index 00000000..f55bf926 --- /dev/null +++ b/releases/1.32.2/Examples/torch/compression/spatial_svd_channel_pruning.html @@ -0,0 +1,1504 @@ + + + + + + Model compression using Spatial SVD followed by Channel Pruning — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +
+

Model compression using Spatial SVD followed by Channel Pruning

+

This notebook shows a working code example of how to use AIMET to perform model compression. Two model-compression techniques are applied back-to-back: Spatial SVD followed by Channel Pruning.

+

Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.

+
    +
  1. Spatial SVD: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique decomposes a single convolutional layer into two. The weight tensor of the layer to be split is flattened to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers.

  2. Channel Pruning: In this technique AIMET discards the least significant (by a magnitude metric) input channels of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have their channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance of the compressed layer output to the corresponding layer output of the original model. A small illustrative sketch of this reconstruction idea follows this list.
+
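To give a concrete picture of the layer-by-layer reconstruction mentioned for Channel Pruning above, here is a minimal, hypothetical sketch of the underlying least-squares idea: given the activations reaching a pruned layer and the original layer's outputs for the same samples, solve for new weights that bring the pruned layer's outputs as close as possible to the original ones. The shapes, the synthetic data and the use of a plain linear least-squares solve are assumptions for illustration only, not AIMET's actual reconstruction code.

import torch

# Synthetic stand-ins: activations reaching the pruned layer, flattened to
# (num_samples, kept_inputs), and the ORIGINAL layer's outputs for the same samples.
num_samples, kept_inputs, out_features = 1000, 32, 64
x_pruned = torch.randn(num_samples, kept_inputs)
y_original = torch.randn(num_samples, out_features)

# Least-squares reconstruction: find W of shape (kept_inputs, out_features) minimizing
# || x_pruned @ W - y_original ||^2, i.e. make the pruned layer mimic the original layer.
w_reconstructed = torch.linalg.lstsq(x_pruned, y_original).solution
print(w_reconstructed.shape)   # torch.Size([32, 64])

In the actual flow, the num_reconstruction_samples parameter used later in this notebook controls how many such samples are drawn from the provided data loader for this procedure.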

Both of the above techniques are structured pruning techniques that aim to reduce the computational MACs or memory requirements of the model. After applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover accuracy close to that of the original model.

+

This notebook shows a working code example of how both techniques (#1 and #2) can be used to compress the model. You can find separate notebooks for only #1 or only #2 in the same folder.

+
+

Overall flow

+
+
This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the model and evaluate it to find the baseline accuracy 3. Compress the model and fine-tune:
+
3.1 Compress model using Spatial SVD and evaluate it to find post-compression accuracy
+
3.2 Fine-tune the model after Spatial SVD
+
3.3 Compress model using Channel Pruning and evaluate it to find post-compression accuracy
+
3.4 Fine-tune the model after Channel Pruning
+
+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The ImageNet dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per image sample

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to compress it and the resultant model is still a PyTorch model. This compressed model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really, but the evaluate() method should return a single number representing the accuracy of the model. Ideally, you should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import os
+import torch
+from typing import List
+from Examples.common import image_net_config
+from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_trainer import ImageNetTrainer
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(model: torch.nn.Module, iterations: int, use_cuda: bool) -> float:
+        """
+        Given a torch model, evaluates its Top-1 accuracy on the dataset
+        :param model: the model to evaluate
+        :param iterations: the number of batches to be used to evaluate the model. A value of 'None' means the model will be
+                           evaluated on the entire dataset once.
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(model, iterations=iterations, use_cuda=use_cuda)
+
+    @staticmethod
+    def finetune(model: torch.nn.Module, epochs: int, learning_rate: float, learning_rate_schedule: List, use_cuda: bool):
+        """
+        Given a torch model, finetunes the model to improve its accuracy
+        :param model: the model to finetune
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param learning_rate_schedule: The learning rate schedule used during the finetuning step.
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_workers=image_net_config.train['num_workers'])
+
+        trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,
+                      learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)
+
+
+
+
+
+
+

2. Load the model and evaluate it to find the baseline accuracy

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Alternatively, you can load any other pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+
+model = resnet18(pretrained=True)
+
+
+
+
+

We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
use_cuda = False
+if torch.cuda.is_available():
+    use_cuda = True
+    model.to(torch.device('cuda'))
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+
+

3. Compress the model and fine-tune

+
+

3.1. Compress model using Spatial SVD and evaluate it to find post-compression accuracy

+

Now we use AIMET to define compression parameters for Spatial SVD, a few of which are explained here:

+
    +
  • target_comp_ratio: The desired compression ratio for Spatial SVD. We are using 0.8 to compress the model by 20%.

  • +
  • num_comp_ratio_candidates: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes the different compression ratios tried by AIMET for each layer. We are using 3 here, which translates to compression ratios of 0.33, 0.66 and 1.00 at each layer. A value of 10 is usually a good choice. The higher the number of candidates, the more granular the measurements for each layer, but also the higher the time taken to complete these measurements.

  • +
  • modules_to_ignore: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.

  • +
  • mode: We are choosing Auto mode, which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. The alternate choice is Manual.

  • +
  • eval_callback: The model evaluation function. The expected signature of the evaluate function should be <function_name>(model, eval_iterations, use_cuda) and it is expected to return an accuracy metric.

  • +
  • eval_iterations: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. It is expected that the eval callback would use the same samples for every invocation of the callback.

  • +
  • compress_scheme: We choose the ‘spatial svd’ compression scheme.

  • +
  • cost_metric: Determines whether we want to target either to reduce MACs or memory by the desired compression ratio. We are choosing ‘mac’ here.

  • +
+
+
[ ]:
+
+
+
from decimal import Decimal
+from aimet_torch.defs import GreedySelectionParameters, SpatialSvdParameters
+from aimet_common.defs import CompressionScheme, CostMetric
+
+greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),
+                                          num_comp_ratio_candidates=3)
+modules_to_ignore = [model.conv1]
+auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                  modules_to_ignore=modules_to_ignore)
+params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.auto, params=auto_params)
+
+eval_callback = ImageNetDataPipeline.evaluate
+eval_iterations = 1
+compress_scheme = CompressionScheme.spatial_svd
+cost_metric = CostMetric.mac
+
+
+
+
+
+
We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics.
+
Note: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.
+
+
+
[ ]:
+
+
+
from aimet_torch.compress import ModelCompressor
+ssvd_compressed_model, ssvd_comp_stats = ModelCompressor.compress_model(model=model,
+                                                                        eval_callback=eval_callback,
+                                                                        eval_iterations=eval_iterations,
+                                                                        input_shape=(1, 3, 224, 224),
+                                                                        compress_scheme=compress_scheme,
+                                                                        cost_metric=cost_metric,
+                                                                        parameters=params)
+
+print(ssvd_comp_stats)
+
+
+
+
+

Now the compressed model is ready to be used for inference or training. First, we can pass this model to the same evaluation routine we used before to calculate the compressed model accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(ssvd_compressed_model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+

As you can see, the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy.

+
+
+

3.2. Fine-tune the model after Spatial SVD

+

After the model is compressed using Spatial SVD, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.finetune(ssvd_compressed_model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10],
+                              use_cuda=use_cuda)
+
+
+
+
+

After we are done with finetuning the compressed model, we can check the floating point accuracy against the same validation dataset to observe any improvements in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(ssvd_compressed_model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you should have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this with the model of your choice and play with the hyper-parameters to get the best results.

+

So we have an improved model after compression using Spatial SVD. Optionally, this model can now be saved like a regular PyTorch model.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+torch.save(ssvd_compressed_model, './output/ssvd_finetuned_model')
+
+
+
+
+
+

3.3. Compress model using Channel Pruning and evaluate it to find post-compression accuracy

+
+
The fine-tuned model, compressed with Spatial SVD, can be further compressed using the Channel Pruning method.
+
Similar to Spatial SVD, we will first define the parameters for Channel Pruning compression, most of which are the same as for Spatial SVD. The parameters specific to Channel Pruning are as follows:
+
+
    +
  • data_loader: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.

  • +
  • num_reconstruction_samples: The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here, which is a ridiculously low number but enables this notebook to execute quickly. A typical setting here would be ~1000 samples.

  • +
  • allow_custom_downsample_ops: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default.

  • +
+
+
[ ]:
+
+
+
from aimet_torch.defs import ChannelPruningParameters
+
+greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.9),
+                                          num_comp_ratio_candidates=3)
+modules_to_ignore = [ssvd_compressed_model.conv1]
+auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                      modules_to_ignore=modules_to_ignore)
+
+data_loader = ImageNetDataPipeline.get_val_dataloader()
+params = ChannelPruningParameters(data_loader=data_loader,
+                                  num_reconstruction_samples=10,
+                                  allow_custom_downsample_ops=False,
+                                  mode=ChannelPruningParameters.Mode.auto,
+                                  params=auto_params)
+
+eval_callback = ImageNetDataPipeline.evaluate
+eval_iterations = 1
+compress_scheme = CompressionScheme.channel_pruning
+cost_metric = CostMetric.mac
+
+
+
+
+
+
We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics.
+
Note: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.
+
+
+
[ ]:
+
+
+
ssvd_cp_compressed_model, cp_comp_stats = ModelCompressor.compress_model(model=ssvd_compressed_model,
+                                                                         eval_callback=eval_callback,
+                                                                         eval_iterations=eval_iterations,
+                                                                         input_shape=(1, 3, 224, 224),
+                                                                         compress_scheme=compress_scheme,
+                                                                         cost_metric=cost_metric,
+                                                                         parameters=params)
+
+print(cp_comp_stats)
+
+
+
+
+

OK, so we have a compressed model. We can pass this model to the same evaluation routine we used before to calculate the compressed model accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(ssvd_cp_compressed_model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+

As you can see, the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy.

+
+
+

3.4. Fine-tune the model after Channel Pruning

+

After the model is compressed using Spatial SVD followed by Channel Pruning, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+

For the purpose of this example notebook, we are going to train for only 2 epochs. But feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.finetune(ssvd_cp_compressed_model, epochs=2, learning_rate=15e-4, learning_rate_schedule=[1],
+                              use_cuda=use_cuda)
+
+
+
+
+

After we are done with finetuning the compressed model, we can check the floating point accuracy against the same validation dataset to observe any improvements in accuracy.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(ssvd_cp_compressed_model, iterations=None, use_cuda=use_cuda)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after this brief fine-tuning. Of course, this was just an example. Please try this with the model of your choice and play with the hyper-parameters to get the best results.

+

So we have an improved model after compression using Spatial SVD followed by Channel Pruning. Optionally, this model can now be saved like a regular PyTorch model.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+torch.save(ssvd_cp_compressed_model, './output/ssvd_cp_finetuned_model')
+
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful in helping you understand how to use AIMET to perform compression with Spatial SVD followed by Channel Pruning. As indicated above, some parameters have been chosen so that the example runs faster.

+

A few additional resources: - Refer to the AIMET API docs for more details on the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques

+
+
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/compression/spatial_svd_channel_pruning.ipynb b/releases/1.32.2/Examples/torch/compression/spatial_svd_channel_pruning.ipynb new file mode 100644 index 00000000..336803ff --- /dev/null +++ b/releases/1.32.2/Examples/torch/compression/spatial_svd_channel_pruning.ipynb @@ -0,0 +1,555 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Model compression using Spatial SVD followed by Channel Pruning \n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform model compression. Two model-compression techniques are applied back-to-back: Spatial SVD followed by Channel Pruning.\n", + "\n", + "Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.\n", + "\n", + "1. **Spatial SVD**: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique will decompose a single convolutional layer into two. The weight tensor of the layer to be split is flattended to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are combined back into two separate convolutional layers.\n", + "2. **Channel Pruning**: In this technique AIMET will discard least significant (using a magnitude metric) input channels of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have the channels dimension modified to get back to a working graph. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance of the compressed layer output to the corresponding layer output of the original model.\n", + "\n", + "Both of the above techniques are structured pruning techniques that aim to reduce computational macs or memory requirements of the model. Subsequent to applying either of these techniques, the compressed model needs to be fine-tuned (meaning trained again for a few epochs) to recover accuracy close to the original model.\n", + "\n", + "This notebook shows working code example of how both the techniques (#1 and #2) can be used to compress the model. You can find a separate notebook for only #1 or #2 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the model and evaluate it to find the baseline accuracy\n", + "3. Compress the model and fine-tune: \n", + " 3.1 Compress model using Spatial SVD and evaluate it to find post-compression accuracy \n", + " 3.2 Fine-tune the model after Spatial SVD \n", + " 3.3 Compress model using Channel Pruning and evaluate it to find post-compression accuracy \n", + " 3.4 Fine-tune the model after Channel Pruning \n", + "\n", + "\n", + "#### What this notebook is not \n", + "* This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. 
If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to compress it and the resultant model is still a PyTorch model. This compressed model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really, but evaluate() method should return a single number representing the accuracy of the model. 
Ideally, You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import torch\n", + "from typing import List\n", + "from Examples.common import image_net_config\n", + "from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_trainer import ImageNetTrainer\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model: torch.nn.Module, iterations: int, use_cuda: bool) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param model: the model to evaluate\n", + " :param iterations: the number of batches to be used to evaluate the model. A value of 'None' means the model will be\n", + " evaluated on the entire dataset once.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(model, iterations=iterations, use_cuda=use_cuda)\n", + "\n", + " @staticmethod\n", + " def finetune(model: torch.nn.Module, epochs: int, learning_rate: float, learning_rate_schedule: List, use_cuda: bool):\n", + " \"\"\"\n", + " Given a torch model, finetunes the model to improve its accuracy\n", + " :param model: the model to finetune\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param learning_rate_schedule: The learning rate schedule used during the finetuning step.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_workers=image_net_config.train['num_workers'])\n", + "\n", + " trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,\n", + " learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate it to find the baseline accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "\n", + "model = resnet18(pretrained=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = False\n", + "if torch.cuda.is_available():\n", + " use_cuda = True\n", + " model.to(torch.device('cuda'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Compress the model and fine-tune\n", + "\n", + "### 3.1. Compress model using Spatial SVD and evaluate it to find post-compression accuracy\n", + "Now we use AIMET to define compression parameters for Spatial SVD, few of which are explained here\n", + "\n", + "- **target_comp_ratio**: The desired compression ratio for Spatial SVD. We are using 0.8 to compress the model by 20%.\n", + "\n", + "- **num_comp_ratio_candidates**: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes the different compression ratios tried by the AIMET for each layer. We are using 3 here which translates to 0.33, 0.66 and 1.00 compression ratios at each layer. Optimal value is 10. The higher the number of candidates the more granular the measurements for each layer, but also the higher the time taken to complete these measurements.\n", + "\n", + "- **modules_to_ignore**: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.\n", + "\n", + "- **mode**: We are chossing **Auto** mode which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. The alternate choice is **Manual**.\n", + "\n", + "- **eval_callback**: The model evaluation function. The expected signature of the evaluate function should be `(model, eval_iterations, use_cuda)` and it is expected to return an accuracy metric.\n", + "\n", + "- **eval_iterations**: The number of batches of data to use for evaluating the model while the model is compressing. We are using 1 to speed up the notebook execution. But please choose a high enough number of samples so that we can trust the accuracy of the model given those samples. It is expected that the eval callback would use the same samples for every invocation of the callback.\n", + "\n", + "- **compress_scheme**: We choose the 'spatial svd' compression scheme.\n", + "\n", + "- **cost_metric**: Determines whether we want to target either to reduce MACs or memory by the desired compression ratio. 
We are chossing 'mac' here.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from decimal import Decimal\n", + "from aimet_torch.defs import GreedySelectionParameters, SpatialSvdParameters\n", + "from aimet_common.defs import CompressionScheme, CostMetric\n", + "\n", + "greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),\n", + " num_comp_ratio_candidates=3)\n", + "modules_to_ignore = [model.conv1]\n", + "auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,\n", + " modules_to_ignore=modules_to_ignore)\n", + "params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.auto, params=auto_params)\n", + "\n", + "eval_callback = ImageNetDataPipeline.evaluate\n", + "eval_iterations = 1\n", + "compress_scheme = CompressionScheme.spatial_svd\n", + "cost_metric = CostMetric.mac" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics. \n", + "**Note**: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from aimet_torch.compress import ModelCompressor\n", + "ssvd_compressed_model, ssvd_comp_stats = ModelCompressor.compress_model(model=model,\n", + " eval_callback=eval_callback,\n", + " eval_iterations=eval_iterations,\n", + " input_shape=(1, 3, 224, 224),\n", + " compress_scheme=compress_scheme,\n", + " cost_metric=cost_metric,\n", + " parameters=params)\n", + "\n", + "print(ssvd_comp_stats)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the compressed model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before to calculated compressed model accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(ssvd_compressed_model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "As you can see the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy back." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2. Fine-tune the model after Spatial SVD\n", + "\n", + "After the model is compressed using Spatial SVD, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.finetune(ssvd_compressed_model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10],\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with finetuing the compressed model, we can check the floating point accuracy against the same validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(ssvd_compressed_model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you should have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after compression using spatial SVD. Optionally, this model now can be saved like a regular PyTorch model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "torch.save(ssvd_compressed_model, './output/ssvd_finetuned_model')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.3. Compress model using Channel Pruning and evaluate it to find post-compression accuracy\n", + "\n", + "The fine-tuned model, compressed with Spatial SVD, can be further compressed using Channel Pruning method. \n", + "Similar to Spatial SVD, we will first define the parameters for Channel Pruning compression, out of which mostly are same as of Spatial SVD. The other parameters for Channel Pruning are as follows:\n", + "\n", + "- **data_loader**: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.\n", + "\n", + "- **num_reconstruction_samples**: The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here which is a ridiculously low number but enables this notebook to execute quickly. A typical setting here would ~1000 samples.\n", + "\n", + "- **allow_custom_downsample_ops**: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.defs import ChannelPruningParameters\n", + "\n", + "greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.9),\n", + " num_comp_ratio_candidates=3)\n", + "modules_to_ignore = [ssvd_compressed_model.conv1]\n", + "auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params,\n", + " modules_to_ignore=modules_to_ignore)\n", + "\n", + "data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + "params = ChannelPruningParameters(data_loader=data_loader,\n", + " num_reconstruction_samples=10,\n", + " allow_custom_downsample_ops=False,\n", + " mode=ChannelPruningParameters.Mode.auto,\n", + " params=auto_params)\n", + "\n", + "eval_callback = ImageNetDataPipeline.evaluate\n", + "eval_iterations = 1\n", + "compress_scheme = CompressionScheme.channel_pruning\n", + "cost_metric = CostMetric.mac" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics. \n", + "**Note**: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "ssvd_cp_compressed_model, cp_comp_stats = ModelCompressor.compress_model(model=ssvd_compressed_model,\n", + " eval_callback=eval_callback,\n", + " eval_iterations=eval_iterations,\n", + " input_shape=(1, 3, 224, 224),\n", + " compress_scheme=compress_scheme,\n", + " cost_metric=cost_metric,\n", + " parameters=params)\n", + "\n", + "print(cp_comp_stats)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Ok so we have a compressed model. We can pass this model to the same evaluation routine we used before to calculated compressed model accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(ssvd_cp_compressed_model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "As you can see the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover this accuracy back." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.4. Fine-tune the model after Channel Pruning\n", + "\n", + "After the model is compressed using Spatial SVD followed by Channel Pruning, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.finetune(ssvd_cp_compressed_model, epochs=2, learning_rate=15e-4, learning_rate_schedule=[1],\n", + " use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with finetuing the compressed model, we can check the floating point accuracy against the same validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(ssvd_cp_compressed_model, iterations=None, use_cuda=use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after compression using spatial SVD followed by Channel Pruning. Optionally, this model now can be saved like a regular PyTorch model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "torch.save(ssvd_cp_compressed_model, './output/ssvd_cp_finetuned_model')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing compression with Spatial SVD followed by Channel Pruning. As indicated above, some parameters have been chosen in a way to run the example faster.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/torch/quantization/adaround.html b/releases/1.32.2/Examples/torch/quantization/adaround.html new file mode 100644 index 00000000..b4286361 --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/adaround.html @@ -0,0 +1,1467 @@ + + + + + + Adaptive Rounding (AdaRound) — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Adaptive Rounding (AdaRound)

+

This notebook shows a working code example of how to use AIMET to perform Adaptive Rounding (AdaRound).

+

AIMET quantization features typically use the “nearest rounding” technique for achieving quantization. When using the “nearest rounding” technique, the weight value is quantized to the nearest integer value.

+

AdaRound optimizes a loss function using unlabeled training data to decide whether to quantize a specific weight to the closer integer value or the farther one. Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.

+
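To make the difference concrete, here is a small illustration (not AIMET code; the weight values and the scale of 0.1 are made up for this example):

import torch

w = torch.tensor([0.24, -0.37])
scale = 0.1
nearest = torch.round(w / scale)   # tensor([ 2., -4.]) - always the closest integer grid point
# AdaRound instead learns, per weight, whether to round down or up (2 vs 3, -4 vs -3)
# by optimizing a layer-wise reconstruction loss on unlabeled data samples.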
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the FP32 model and evaluate it to find the baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score 4. Apply AdaRound and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results

  • +
  • For example, it uses a relatively quantization-friendly model like Resnet18

  • +
  • Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, use that. Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The dataloader provided in this example notebook relies on the ImageNet dataset having the following characteristics: - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per each image sample.

+
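As a quick, optional sanity check of this layout, you can point torchvision's ImageFolder at the validation split (this snippet is illustrative only and is not part of the example pipeline; it assumes DATASET_DIR is set as in the cell below):

import os
from torchvision import datasets, transforms

val_dir = os.path.join(DATASET_DIR, 'val')
val_data = datasets.ImageFolder(val_dir, transform=transforms.ToTensor())
print(f'Found {len(val_data.classes)} classes and {len(val_data)} validation images')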

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import os
+import torch
+from Examples.common import image_net_config
+from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:
+        """
+        Given a torch model, evaluates its Top-1 accuracy on the dataset
+        :param model: the model to evaluate
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+
+model = resnet18(pretrained=True)
+
+
+
+

AIMET quantization simulation requires the user’s model definition to follow certain guidelines. For example, functionals defined in the forward pass should be changed to equivalent torch.nn.Module instances. The AIMET user guide lists all these guidelines. The following ModelPreparer API uses the graph transformation feature available in PyTorch 1.9+ and automates the model definition changes required to comply with these guidelines.

+
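For reference, the kind of change ModelPreparer automates looks like the following (a toy module written for illustration, not taken from the resnet18 definition):

import torch.nn as nn
import torch.nn.functional as F

class BeforePrepare(nn.Module):
    def forward(self, x):
        return F.relu(x)          # functional call inside forward() - not visible to AIMET as a layer

class AfterPrepare(nn.Module):
    def __init__(self):
        super().__init__()
        self.relu = nn.ReLU()     # equivalent torch.nn.Module that AIMET can wrap with a quantizer
    def forward(self, x):
        return self.relu(x)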
+
[ ]:
+
+
+
from aimet_torch.model_preparer import prepare_model
+
+model = prepare_model(model)
+
+
+
+
+

We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
use_cuda = False
+if torch.cuda.is_available():
+    use_cuda = True
+    model.to(torch.device('cuda'))
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(model, use_cuda)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this?

+

On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided.

+

From a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers at inference time, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers.

+

This can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). We want to simulate that on-target behavior by doing BN folding here.

+
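For intuition, the folding arithmetic for a Conv layer followed by BN looks roughly like the sketch below (simplified, per output channel; AIMET's fold_all_batch_norms performs this internally and also handles the cases the sketch ignores):

import torch

def fold_bn_into_conv(conv_w, conv_b, gamma, beta, running_mean, running_var, eps=1e-5):
    # Scale each output channel's weights and shift the bias by the BN statistics
    scale = gamma / torch.sqrt(running_var + eps)
    w_fold = conv_w * scale.reshape(-1, 1, 1, 1)
    b_fold = (conv_b - running_mean) * scale + beta
    return w_fold, b_fold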

The following code calls AIMET to fold the BN layers in-place on the given model:

+
+
[ ]:
+
+
+
from aimet_torch.batch_norm_fold import fold_all_batch_norms
+
+_ = fold_all_batch_norms(model, input_shapes=(1, 3, 224, 224))
+
+
+
+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_torch.quantsim import QuantizationSimModel
+
+dummy_input = torch.rand(1, 3, 224, 224)    # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)
+if use_cuda:
+    dummy_input = dummy_input.cuda()
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           dummy_input=dummy_input,
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+
+
+
+

We can check the modifications AIMET has made to the model graph. One way is to print the model, and we can see that AIMET has added quantization wrapper layers.

+

Note: use sim.model to access the modified PyTorch model. By default, AIMET creates a copy of the original model prior to modifying it. There is a parameter to override this behavior.

+
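For example, the cell above could instead be written with in_place=True (an optional QuantizationSimModel argument; please confirm against the API docs for your AIMET version), which modifies the given model directly instead of a copy:

sim = QuantizationSimModel(model=model,
                           quant_scheme=QuantScheme.post_training_tf_enhanced,
                           dummy_input=dummy_input,
                           default_output_bw=8,
                           default_param_bw=8,
                           in_place=True)   # wrap 'model' itself rather than an internal copy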
+
[ ]:
+
+
+
print(sim.model)
+
+
+
+
+

We can also check how AIMET has configured the added fake quantization nodes, which AIMET refers to as ‘quantizers’. You can see this by printing the sim object.

+
+
[ ]:
+
+
+
print(sim)
+
+
+
+
+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node.

+

For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

We create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metrics, so we can just ignore the model output. A few pointers regarding the data samples:

+
    +
  • In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.

  • +
  • It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes need to be covered since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario, such as using only ‘dark’ or ‘light’ samples - e.g. only using pictures captured at night might not give ideal results.

  • +
+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just one example.

+
+
[ ]:
+
+
+
def pass_calibration_data(sim_model, use_cuda):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    if use_cuda:
+        device = torch.device('cuda')
+    else:
+        device = torch.device('cpu')
+
+    sim_model.eval()
+    samples = 1000
+
+    batch_cntr = 0
+    with torch.no_grad():
+        for input_data, target_data in data_loader:
+
+            inputs_batch = input_data.to(device)
+            sim_model(inputs_batch)
+
+            batch_cntr += 1
+            if (batch_cntr * batch_size) > samples:
+                break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=use_cuda)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(accuracy)
+
+
+
+
+
+
+

4. Apply Adaround

+

We can now apply AdaRound to this model.

+

Some of the parameters for AdaRound are described below

+
    +
  • dataloader: AdaRound needs a dataloader to use data samples for the layer-by-layer optimization to learn the rounding vectors. Either a training or validation dataloader could be passed in.

  • +
  • num_batches: The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples; with a batch size of 32, that translates to roughly 64 batches. To speed up execution in this notebook we are using just 1 batch (see the short sketch after this list).

  • +
  • default_num_iterations: The number of iterations to adaround each layer. The default value is 10000 and we strongly recommend not reducing it. In this example, however, we are using 32 to speed up the execution runtime.

  • +
+
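The following is a minimal sketch of how one might size num_batches from the loader's batch size (illustrative only; the 2000-sample budget is the rule of thumb mentioned above, and the example cell below simply uses num_batches=1):

desired_samples = 2000
data_loader = ImageNetDataPipeline.get_val_dataloader()
num_batches = max(1, desired_samples // data_loader.batch_size)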
+
[ ]:
+
+
+
from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters
+
+data_loader = ImageNetDataPipeline.get_val_dataloader()
+params = AdaroundParameters(data_loader=data_loader, num_batches=1, default_num_iterations=32)
+
+dummy_input = torch.rand(1, 3, 224, 224)
+if use_cuda:
+    dummy_input = dummy_input.cuda()
+
+os.makedirs('./output/', exist_ok=True)
+ada_model = Adaround.apply_adaround(model, dummy_input, params,
+                                    path="output",
+                                    filename_prefix='adaround',
+                                    default_param_bw=8,
+                                    default_quant_scheme=QuantScheme.post_training_tf_enhanced)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the model after applying Adaround. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.

+

Note: There are two important things to understand in the following cell. - Parameter Bitwidth Precision: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() call.

+
    +
  • Freezing the parameter encodings: After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called before calling the compute_encodings() API. While applying AdaRound, the parameter values were rounded up or down based on these internally created initial encodings. For quantization simulation accuracy, it is important to freeze these encodings. If the parameter encodings are NOT frozen, the call to compute_encodings() will alter their values and the quantization simulation accuracy will not reflect the AdaRounded accuracy.

  • +
+
+
[ ]:
+
+
+
sim = QuantizationSimModel(model=ada_model,
+                           dummy_input=dummy_input,
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+sim.set_and_freeze_param_encodings(encoding_path=os.path.join("output", 'adaround.encodings'))
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=use_cuda)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization, using the newly AdaRounded model with updated parameters.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(accuracy)
+
+
+
+
+

Depending on your settings you may have observed a slight gain in accuracy after applying AdaRound. The settings used in this notebook are meant only to serve as code examples that run quickly, and may not be optimal. Please try this workflow against the model of your choice and play with the number of samples and other parameters to get the best results.

+

The next step would be to take this model to target. We need to do two things: - export the model with the updated weights without the fake quantization ops - export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
dummy_input = dummy_input.cpu()
+sim.export(path='./output/', filename_prefix='resnet18_after_adaround', dummy_input=dummy_input)
+
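Once the export has completed, a quick way to see what was produced is to list the output directory; the exact set of files (for example a model file plus an .encodings file) depends on the AIMET version:

import os
print(os.listdir('./output/'))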
+
+
+
+
+
+

Summary

+

This example illustrated how the AIMET AdaRound API is invoked to achieve post-training quantization. To use AIMET AdaRound for your specific needs, replace the model with your model and replace the data pipeline with your data pipeline. As indicated above, some parameters in this example have been chosen to make it execute faster.

+

We hope this notebook was useful for you to understand how to use AIMET for performing AdaRound.

+

A few additional resources: - Refer to the AIMET API docs to know more details of the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/quantization/adaround.ipynb b/releases/1.32.2/Examples/torch/quantization/adaround.ipynb new file mode 100644 index 00000000..de91ea3c --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/adaround.ipynb @@ -0,0 +1,571 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Adaptive Rounding (AdaRound)\n", + "This notebook shows a working code example of how to use AIMET to perform Adaptive Rounding (AdaRound).\n", + "\n", + "AIMET quantization features typically use the \"nearest rounding\" technique for achieving quantization.\n", + "When using the \"nearest rounding\" technique, the weight value is quantized to the nearest integer value.\n", + "\n", + "AdaRound optimizes a loss function using unlabeled training data to decide whether to quantize a specific weight to the closer integer value or the farther one.\n", + "Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following:\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Apply AdaRound and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results\n", + "* For example, it uses a relatively quantization-friendly model like Resnet18\n", + "* Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification.\n", + "If you already have a version of the dataset readily available, use that.\n", + "Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The dataloader provided in this example notebook relies on the ImageNet tfrecords dataset having the following characteristics:\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples.\n", + "Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample.\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset.\n", + "E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class.\n", + "But for the purpose of running this notebook, you could reduce the dataset to 2 samples per class.\n", + "This exercise is left up to the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import torch\n", + "from Examples.common import image_net_config\n", + "from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param model: the model to evaluate\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision.\n", + "Similarly, you can load any pretrained PyTorch model instead." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "\n", + "model = resnet18(pretrained=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET quantization simulation requires the user's model definition to follow certain guidelines.\n", + "For example, functionals defined in forward pass should be changed to equivalent torch.nn.Module.\n", + "AIMET user guide lists all these guidelines.\n", + "The following **ModelPreparer** API uses new graph transformation feature available in PyTorch 1.9+ version and automates model definition changes required to comply with the above guidelines. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.model_preparer import prepare_model\n", + "\n", + "model = prepare_model(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We should decide whether to place the model on a CPU or CUDA device.\n", + "This example code will use CUDA if available in your current execution environment.\n", + "You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = False\n", + "if torch.cuda.is_available():\n", + " use_cuda = True\n", + " model.to(torch.device('cuda'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model.\n", + "These layers get folded into adjacent Convolutional layers. 
The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers.\n", + "Doing so results in an inferences/sec speedup since unnecessary computation is avoided.\n", + "\n", + "From a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy.\n", + "However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers.\n", + "\n", + "This can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision).\n", + "We want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers in-place on the given model:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "_ = fold_all_batch_norms(model, input_shapes=(1, 3, 224, 224))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_torch.quantsim import QuantizationSimModel\n", + "\n", + "dummy_input = torch.rand(1, 3, 224, 224) # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)\n", + "if use_cuda:\n", + " dummy_input = dummy_input.cuda()\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " dummy_input=dummy_input,\n", + " default_output_bw=8,\n", + " default_param_bw=8)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We can check the modifications AIMET has made to the model graph.\n", + "One way is to print the model, and we can see that AIMET has added quantization wrapper layers.\n", + "\n", + "**Note**: use sim.model to access the modified PyTorch model.\n", + "By default, AIMET creates a copy of the original model prior to modifying it.\n", + "There is a parameter to override this behavior." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(sim.model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We can also check how AIMET has configured the added fake quantization nodes, which AIMET refers to as 'quantizers'.\n", + "You can see this by printing the sim object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(sim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Even though AIMET has added 'quantizer' nodes to the model graph, the model is not ready to be used yet.\n", + "Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node.\n", + "\n", + "For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters.\n", + "This process is sometimes referred to as calibration. 
AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "We create a routine to pass unlabeled data samples through the model.\n", + "This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model.\n", + "We don't need to compute any loss metrics, so we can just ignore the model output. A few pointers regarding the data samples:\n", + "\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings.\n", + " For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed.\n", + " It's not necessary that all classes need to be covered since we are only looking at the range of values at every layer activation.\n", + " However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings.\n", + "This routine can be written in many different ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def pass_calibration_data(sim_model, use_cuda):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " if use_cuda:\n", + " device = torch.device('cuda')\n", + " else:\n", + " device = torch.device('cpu')\n", + "\n", + " sim_model.eval()\n", + " samples = 1000\n", + "\n", + " batch_cntr = 0\n", + " with torch.no_grad():\n", + " for input_data, target_data in data_loader:\n", + "\n", + " inputs_batch = input_data.to(device)\n", + " sim_model(inputs_batch)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings.\n", + "Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference.\n", + "First we can pass this model to the same evaluation routine we used before.\n", + "The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. Apply Adaround\n", + "\n", + "We can now apply AdaRound to this model.\n", + "\n", + "Some of the parameters for AdaRound are described below\n", + "\n", + "- **dataloader:** AdaRound needs a dataloader to use data samples for the layer-by-layer optimization to learn the rounding vectors. 
Either a training or validation dataloader could be passed in.\n", + "- **num_batches:** The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples. So with a batch size of 32, this may translate to 64 batches. To speed up the execution here we are using a batch size of 1.\n", + "- **default_num_iterations:** The number of iterations to adaround each layer. Default value is set to 10000 and we strongly recommend to not reduce this number. But in this example we are using 32 to speed up the execution runtime." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters\n", + "\n", + "data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + "params = AdaroundParameters(data_loader=data_loader, num_batches=1, default_num_iterations=32)\n", + "\n", + "dummy_input = torch.rand(1, 3, 224, 224)\n", + "if use_cuda:\n", + " dummy_input = dummy_input.cuda()\n", + "\n", + "os.makedirs('./output/', exist_ok=True)\n", + "ada_model = Adaround.apply_adaround(model, dummy_input, params,\n", + " path=\"output\", \n", + " filename_prefix='adaround', \n", + " default_param_bw=8,\n", + " default_quant_scheme=QuantScheme.post_training_tf_enhanced)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the model after applying Adaround.\n", + "We again create a simulation model like before and evaluate to determine simulated quantized accuracy.\n", + "\n", + "**Note:** There are two important things to understand in the following cell.\n", + " - **Parameter Biwidth Precision**: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() created.\n", + " \n", + " - **Freezing the parameter encodings**:\n", + "After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called before calling the compute_encodings() API.\n", + "While applying AdaRound, the parameter values have been rounded up or down based on these initial encodings internally created.\n", + "For Quantization Simulation accuracy, it is important to freeze these encodings.\n", + "If the parameters encodings are NOT frozen, the call to compute_encodings() will alter the value of the parameters encodings and Quantization Simulation accuracy will not reflect the AdaRounded accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(model=ada_model,\n", + " dummy_input=dummy_input,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " default_output_bw=8, \n", + " default_param_bw=8)\n", + "\n", + "sim.set_and_freeze_param_encodings(encoding_path=os.path.join(\"output\", 'adaround.encodings'))\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference.\n", + "First we can pass this model to the same evaluation routine we used before.\n", + "The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization, using the newly AdaRounded model with updated parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after applying AdaRound.\n", + "The settings used in this notebook are designed only to serve as code examples, designed to run quickly, but may not be optimal.\n", + "Please try this workflow against the model of your choice and play with the number of samples and other parameters to get the best results.\n", + "\n", + "The next step would be to take this model to target.\n", + "We need to do two things:\n", + "- export the model with the updated weights without the fake quantization ops\n", + "- export the encodings (scale/offset quantization parameters).\n", + "AIMET QuantizationSimModel provides an export API for this purpose." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "dummy_input = dummy_input.cpu()\n", + "sim.export(path='./output/', filename_prefix='resnet18_after_adaround', dummy_input=dummy_input)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This example illustrated how the AIMET AdaRound API is invoked to achieve post training quantization.\n", + "To use AIMET AdaRound for your specific needs, replace the model with your model and replace the data pipeline with your data pipeline.\n", + "As indicated above, some parameters in this example have been chosen in such a way to make this example execute faster.\n", + "\n", + "We hope this notebook was useful for you to understand how to use AIMET for performing AdaRound.\n", + "\n", + "A few additional resources:\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/torch/quantization/autoquant.html b/releases/1.32.2/Examples/torch/quantization/autoquant.html new file mode 100644 index 00000000..aae8abdb --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/autoquant.html @@ -0,0 +1,1326 @@ + + + + + + AutoQuant — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AutoQuant

+

This notebook shows a working code example of how to use AIMET AutoQuant feature.

+

AIMET offers a suite of neural network post-training quantization (PTQ) techniques that can be applied in succession. However, the process of finding the right combination and sequence of techniques to apply is time-consuming and requires careful analysis, which can be challenging especially for non-expert users. We instead recommend AutoQuant to save time and effort.

+

AutoQuant is an API that applies various PTQ techniques in AIMET automatically based on analyzing the model and best-known heuristics. In AutoQuant, users specify the amount of tolerable accuracy drop, and AutoQuant will apply PTQ techniques cumulatively until the target accuracy is satisfied.
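For orientation, the overall call pattern boils down to a handful of calls, sketched below. The same calls are walked through step by step in the rest of this notebook; names such as unlabeled_data_loader, eval_callback and dummy_input are placeholders that are defined in the later sections.

from aimet_torch.auto_quant import AutoQuant

# Construct AutoQuant with the FP32 model, a dummy input, an unlabeled
# calibration data loader, and an evaluation callback that returns accuracy.
auto_quant = AutoQuant(model,
                       dummy_input=dummy_input,
                       data_loader=unlabeled_data_loader,
                       eval_callback=eval_callback)

# Baseline: plain quantization without any PTQ techniques applied.
sim, baseline_accuracy = auto_quant.run_inference()

# Apply PTQ techniques until the accuracy drop is within the given budget.
model, accuracy, encoding_path = auto_quant.optimize(allowed_accuracy_drop=0.01)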

+
+

Overall flow

+

This notebook covers the following: 1. Define constants and helper functions 2. Load a pretrained FP32 model 3. Run AutoQuant

+
+
+

What this notebook is not

+

This notebook is not designed to show state-of-the-art AutoQuant results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The ImageNet dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per image sample

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
import os
+from torchvision import transforms, datasets
+
+DATASET_DIR = '/path/to/dataset'   # Please replace this with a real directory
+
+val_transforms = transforms.Compose([
+    transforms.CenterCrop(224),
+    transforms.ToTensor(),
+    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
+])
+
+imagenet_dataset = datasets.ImageFolder(root=os.path.join(DATASET_DIR, 'val'), transform=val_transforms)
+
+
+
+
+
+

1. Define Constants and Helper functions

+

In this section, the constants and helper functions needed to run this example are defined.

+
    +
  • EVAL_DATASET_SIZE A typical value is 5000. In this notebook, this value has been set to 500 for faster execution.

  • +
  • CALIBRATION_DATASET_SIZE A typical value is 2000. In this notebook, this value has been set to 200 for faster execution.

  • +
+

The helper function **_create_sampled_data_loader()** returns a DataLoader based on the dataset and the number of samples provided.

+
+
[ ]:
+
+
+
import random
+from typing import Optional
+from tqdm import tqdm
+import torch
+from torch.utils.data import Dataset, DataLoader, SubsetRandomSampler, Subset
+from aimet_torch.utils import in_eval_mode, get_device
+
+EVAL_DATASET_SIZE = 500
+CALIBRATION_DATASET_SIZE = 200
+
+_datasets = {}
+
+def _create_sampled_data_loader(dataset, num_samples):
+    if num_samples not in _datasets:
+        indices = random.sample(range(len(dataset)), num_samples)
+        _datasets[num_samples] = Subset(dataset, indices)
+    return DataLoader(_datasets[num_samples], batch_size=32)
+
+
+def eval_callback(model: torch.nn.Module, num_samples: Optional[int] = None) -> float:
+    if num_samples is None:
+        num_samples = EVAL_DATASET_SIZE
+
+    data_loader = _create_sampled_data_loader(imagenet_dataset, num_samples)
+    device = get_device(model)
+
+    correct = 0
+    with in_eval_mode(model), torch.no_grad():
+        for image, label in tqdm(data_loader):
+            image = image.to(device)
+            label = label.to(device)
+            logits = model(image)
+            top1 = logits.topk(k=1).indices
+            correct += (top1 == label.view_as(top1)).sum()
+
+    return int(correct) / num_samples
+
+
+
+
+
+

2. Load a pretrained FP32 model

+

For this example, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+
+model = resnet18(pretrained=True).eval()
+
+if torch.cuda.is_available():
+    model.to(torch.device('cuda'))
+
+accuracy = eval_callback(model)
+print(f'- FP32 accuracy: {accuracy}')
+
+
+
+
+
+

3. Run AutoQuant

+
+

Create AutoQuant Object

+

The AutoQuant feature utilizes an unlabeled dataset to achieve quantization. The class UnlabeledDatasetWrapper creates an unlabeled Dataset object from a labeled Dataset.

+
+
[ ]:
+
+
+
from aimet_torch.auto_quant import AutoQuant
+
+class UnlabeledDatasetWrapper(Dataset):
+    def __init__(self, dataset):
+        self._dataset = dataset
+
+    def __len__(self):
+        return len(self._dataset)
+
+    def __getitem__(self, index):
+        images, _ = self._dataset[index]
+        return images
+
+
+unlabeled_imagenet_dataset = UnlabeledDatasetWrapper(imagenet_dataset)
+unlabeled_imagenet_data_loader = _create_sampled_data_loader(unlabeled_imagenet_dataset,
+                                                             CALIBRATION_DATASET_SIZE)
+
+dummy_input = torch.randn((1, 3, 224, 224)).to(get_device(model))
+
+auto_quant = AutoQuant(model,
+                       dummy_input=dummy_input,
+                       data_loader=unlabeled_imagenet_data_loader,
+                       eval_callback=eval_callback)
+
+
+
+
+
+

Run AutoQuant Inference

+

This step runs AutoQuant inference. AutoQuant inference will run evaluation using the eval_callback with the vanilla quantized model without applying PTQ techniques. This will be useful for measuring the baseline evaluation score before running AutoQuant optimization.

+
+
[ ]:
+
+
+
sim, initial_accuracy = auto_quant.run_inference()
+print(f"- Quantized Accuracy (before optimization): {initial_accuracy}")
+
+
+
+
+
+

Set AdaRound Parameters (optional)

+

AutoQuant uses a set of predefined default parameters for AdaRound. These values were determined empirically and work well with common models. However, if necessary, you can also use custom parameters for AdaRound. In this notebook, we will use very small AdaRound parameters for faster execution.

+
+
[ ]:
+
+
+
from aimet_torch.adaround.adaround_weight import AdaroundParameters
+
+ADAROUND_DATASET_SIZE = 200
+adaround_data_loader = _create_sampled_data_loader(unlabeled_imagenet_dataset, ADAROUND_DATASET_SIZE)
+adaround_params = AdaroundParameters(adaround_data_loader, num_batches=len(adaround_data_loader), default_num_iterations=2000)
+auto_quant.set_adaround_params(adaround_params)
+
+
+
+
+
+

Run AutoQuant Optimization

+

This step runs AutoQuant optimization, which returns the best possible quantized model, the corresponding evaluation score, and the path to the encoding file. The allowed_accuracy_drop parameter indicates the tolerable amount of accuracy drop. AutoQuant applies a series of quantization features until the target accuracy (FP32 accuracy - allowed accuracy drop) is satisfied. When the target accuracy is reached, AutoQuant will return immediately without applying further PTQ techniques. Please refer to the AutoQuant User Guide and API documentation for complete details.

+
+
[ ]:
+
+
+
model, optimized_accuracy, encoding_path = auto_quant.optimize(allowed_accuracy_drop=0.01)
+print(f"- Quantized Accuracy (after optimization):  {optimized_accuracy}")
+
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use the AIMET AutoQuant feature.

+

A few additional resources - Refer to the AIMET API docs for more details on the APIs and parameters - Refer to the other example notebooks to understand how to use AIMET CLE and AdaRound features in a standalone fashion.

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/quantization/autoquant.ipynb b/releases/1.32.2/Examples/torch/quantization/autoquant.ipynb new file mode 100644 index 00000000..e9369432 --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/autoquant.ipynb @@ -0,0 +1,318 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# AutoQuant\n", + "\n", + "This notebook shows a working code example of how to use AIMET AutoQuant feature.\n", + "\n", + "AIMET offers a suite of neural network post-training quantization (PTQ) techniques that can be applied in succession. However, the process of finding the right combination and sequence of techniques to apply is time-consuming and requires careful analysis, which can be challenging especially for non-expert users. We instead recommend AutoQuant to save time and effort.\n", + "\n", + "AutoQuant is an API that applies various PTQ techniques in AIMET automatically based on analyzing the model and best-known heuristics. In AutoQuant, users specify the amount of tolerable accuracy drop, and AutoQuant will apply PTQ techniques cumulatively until the target accuracy is satisfied.\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Define constants and helper functions\n", + "2. Load a pretrained FP32 model\n", + "3. Run AutoQuant\n", + "\n", + "#### What this notebook is not\n", + "This notebook is not designed to show state-of-the-art AutoQuant results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "is_executing": true + } + }, + "outputs": [], + "source": [ + "import os\n", + "from torchvision import transforms, datasets\n", + "\n", + "DATASET_DIR = '/path/to/dataset' # Please replace this with a real directory\n", + "\n", + "val_transforms = transforms.Compose([\n", + " transforms.CenterCrop(224),\n", + " transforms.ToTensor(),\n", + " transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),\n", + "])\n", + "\n", + "imagenet_dataset = datasets.ImageFolder(root=os.path.join(DATASET_DIR, 'val'), transform=val_transforms)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Define Constants and Helper functions\n", + "\n", + "In this section the constants and helper functions needed to run this eaxmple are defined.\n", + "\n", + "- **EVAL_DATASET_SIZE** A typical value is 5000. In this notebook, this value has been set to 500 for faster execution.\n", + "- **CALIBRATION_DATASET_SIZE** A typical value is 2000. In this notebook, this value has been set to 200 for faster execution.\n", + "\n", + "\n", + "The helper function **_create_sampled_data_loader()** returns a DataLoader based on the dataset and the number of samples provided." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import random\n", + "from typing import Optional\n", + "from tqdm import tqdm\n", + "import torch\n", + "from torch.utils.data import Dataset, DataLoader, SubsetRandomSampler, Subset\n", + "from aimet_torch.utils import in_eval_mode, get_device\n", + "\n", + "EVAL_DATASET_SIZE = 500\n", + "CALIBRATION_DATASET_SIZE = 200\n", + "\n", + "_datasets = {}\n", + "\n", + "def _create_sampled_data_loader(dataset, num_samples):\n", + " if num_samples not in _datasets:\n", + " indices = random.sample(range(len(dataset)), num_samples)\n", + " _datasets[num_samples] = Subset(dataset, indices)\n", + " return DataLoader(_datasets[num_samples], batch_size=32)\n", + "\n", + "\n", + "def eval_callback(model: torch.nn.Module, num_samples: Optional[int] = None) -> float:\n", + " if num_samples is None:\n", + " num_samples = EVAL_DATASET_SIZE\n", + "\n", + " data_loader = _create_sampled_data_loader(imagenet_dataset, num_samples)\n", + " device = get_device(model)\n", + " \n", + " correct = 0\n", + " with in_eval_mode(model), torch.no_grad():\n", + " for image, label in tqdm(data_loader):\n", + " image = image.to(device)\n", + " label = label.to(device)\n", + " logits = model(image)\n", + " top1 = logits.topk(k=1).indices\n", + " correct += (top1 == label.view_as(top1)).sum()\n", + "\n", + " return int(correct) / num_samples" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Load a pretrained FP32 model\n", + "For this example, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "\n", + "model = resnet18(pretrained=True).eval()\n", + "\n", + "if torch.cuda.is_available():\n", + " model.to(torch.device('cuda'))\n", + "\n", + "accuracy = eval_callback(model)\n", + "print(f'- FP32 accuracy: {accuracy}')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Run AutoQuant\n", + "### Create AutoQuant Object\n", + "\n", + "The AutoQuant feature utilizes an unlabeled dataset to achieve quantization. The class **UnlabeledDatasetWrapper** creates an unlabeled Dataset object from a labeled Dataset. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_torch.auto_quant import AutoQuant\n", + "\n", + "class UnlabeledDatasetWrapper(Dataset):\n", + " def __init__(self, dataset):\n", + " self._dataset = dataset\n", + "\n", + " def __len__(self):\n", + " return len(self._dataset)\n", + "\n", + " def __getitem__(self, index):\n", + " images, _ = self._dataset[index]\n", + " return images\n", + "\n", + "\n", + "unlabeled_imagenet_dataset = UnlabeledDatasetWrapper(imagenet_dataset)\n", + "unlabeled_imagenet_data_loader = _create_sampled_data_loader(unlabeled_imagenet_dataset,\n", + " CALIBRATION_DATASET_SIZE)\n", + "\n", + "dummy_input = torch.randn((1, 3, 224, 224)).to(get_device(model))\n", + "\n", + "auto_quant = AutoQuant(model,\n", + " dummy_input=dummy_input,\n", + " data_loader=unlabeled_imagenet_data_loader,\n", + " eval_callback=eval_callback)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Run AutoQuant Inference\n", + "This step runs AutoQuant inference. AutoQuant inference will run evaluation using the **eval_callback** with the vanilla quantized model without applying PTQ techniques. This will be useful for measuring the baseline evaluation score before running AutoQuant optimization." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sim, initial_accuracy = auto_quant.run_inference()\n", + "print(f\"- Quantized Accuracy (before optimization): {initial_accuracy}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Set AdaRound Parameters (optional)\n", + "AutoQuant uses a set of predefined default parameters for AdaRound.\n", + "These values were determined empirically and work well with the common models.\n", + "However, if necessary, you can also use your custom parameters for Adaround.\n", + "In this notebook, we will use very small AdaRound parameters for faster execution." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_torch.adaround.adaround_weight import AdaroundParameters\n", + "\n", + "ADAROUND_DATASET_SIZE = 200\n", + "adaround_data_loader = _create_sampled_data_loader(unlabeled_imagenet_dataset, ADAROUND_DATASET_SIZE)\n", + "adaround_params = AdaroundParameters(adaround_data_loader, num_batches=len(adaround_data_loader), default_num_iterations=2000)\n", + "auto_quant.set_adaround_params(adaround_params)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Run AutoQuant Optimization\n", + "This step runs AutoQuant optimization, which returns the best possible quantized model, corresponding evaluation score and the path to the encoding file.\n", + "The **allowed_accuracy_drop** parameter indicates the tolerable amount of accuracy drop. AutoQuant applies a series of quantization features until the target accuracy (FP32 accuracy - allowed accuracy drop) is satisfied. When the target accuracy is reached, AutoQuant will return immediately without applying furhter PTQ techniques. Please refer AutoQuant User Guide and API documentation for complete details." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": false + }, + "outputs": [], + "source": [ + "model, optimized_accuracy, encoding_path = auto_quant.optimize(allowed_accuracy_drop=0.01)\n", + "print(f\"- Quantized Accuracy (after optimization): {optimized_accuracy}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET AutoQuant feature.\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET CLE and AdaRound features in a standalone fashion." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/torch/quantization/bn_reestimation.html b/releases/1.32.2/Examples/torch/quantization/bn_reestimation.html new file mode 100644 index 00000000..ea0a5cf2 --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/bn_reestimation.html @@ -0,0 +1,1382 @@ + + + + + + Quantization-Aware Training with BatchNorm Re-estimation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + +
  • + View page source +
  • +
+
+
+
+
+ +
+

Quantization-Aware Training with BatchNorm Re-estimation

+

This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training) with batchnorm re-estimation. Batchnorm re-estimation is a technique for countering potential instability of batchnorm statistics (i.e. running mean and variance) during QAT. More specifically, batchnorm re-estimation recalculates the batchnorm statistics based on the model after QAT. By doing so, we aim to make our model learn batchnorm statistics from stable outputs after QAT, rather than from likely noisy outputs during QAT.

+
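Conceptually, re-estimation simply lets the BatchNorm running statistics update from a number of training batches after QAT has finished, without touching any learned weights. The rough sketch below illustrates the idea; it is not the AIMET implementation — the reestimate_bn_stats helper used later in this notebook additionally takes care of details such as how the statistics are averaged and restoring the layers’ original state.

import torch

def naive_bn_reestimation(model, data_loader, num_batches=100):
    # Keep the rest of the network in eval mode; only BatchNorm layers are
    # switched to train mode so that running_mean / running_var get updated.
    model.eval()
    bn_layers = [m for m in model.modules()
                 if isinstance(m, torch.nn.modules.batchnorm._BatchNorm)]
    for bn in bn_layers:
        bn.train()
    with torch.no_grad():
        for i, (images, _) in enumerate(data_loader):
            if i >= num_batches:
                break
            model(images)          # forward pass only; no parameter updates
    model.eval()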
+

Overall flow

+

This notebook covers the following steps: 1. Create a quantization simulation model with fake quantization ops inserted. 2. Finetune and evaluate the quantization simulation model 3. Re-estimate batchnorm statistics and compare the eval score before and after re-estimation. 4. Fold the re-estimated batchnorm layers and export the quantization simulation model

+
+
+

What this notebook is not

+

In this notebook, we will focus on how to apply batchnorm re-estimation after QAT, rather than covering all the details about QAT itself. For more information about QAT, please refer to the QAT notebook or the QAT range learning notebook.

+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The ImageNet dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per image sample

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
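Optionally, you can sanity-check that DATASET_DIR has the layout described in the notes above (a ‘train’ and a ‘val’ subfolder, each containing one subdirectory per class). This snippet is illustrative only and is not part of the original pipeline:

import os

for split in ('train', 'val'):
    split_dir = os.path.join(DATASET_DIR, split)
    assert os.path.isdir(split_dir), f'Missing expected subfolder: {split_dir}'
    num_classes = len([d for d in os.listdir(split_dir)
                       if os.path.isdir(os.path.join(split_dir, d))])
    print(f'{split}: {num_classes} class folders')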
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import os
+import torch
+from Examples.common import image_net_config
+from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_trainer import ImageNetTrainer
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:
+        """
+        Given a torch model, evaluates its Top-1 accuracy on the dataset
+        :param model: the model to evaluate
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)
+
+    @staticmethod
+    def finetune(model: torch.nn.Module, epochs, learning_rate, learning_rate_schedule, use_cuda):
+        """
+        Given a torch model, finetunes the model to improve its accuracy
+        :param model: the model to finetune
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param learning_rate_schedule: The learning rate schedule used during the finetuning step.
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_workers=image_net_config.train['num_workers'])
+
+        trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,
+                      learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)
+
+
+
+
+
+
+

2. Load FP32 model

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+from aimet_torch.model_preparer import prepare_model
+
+use_cuda = torch.cuda.is_available()
+if use_cuda:
+    device = torch.device("cuda")
+else:
+    device = torch.device("cpu")
+
+model = resnet18(pretrained=True).to(device)
+model = prepare_model(model)
+
+
+
+
+
+
+

3. Create a quantization simulation model and Perform QAT

+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here - quant_scheme: We set this to “QuantScheme.training_range_learning_with_tf_init”, which initializes the quantization encodings using the TF-enhanced scheme and then learns them during QAT (other supported options include QuantScheme.post_training_tf and QuantScheme.post_training_tf_enhanced) - default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision - default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+

NOTE: Unlike in other QAT example scripts, we didn’t fold batchnorm layers before QAT. This is because we aim to finetune our model with the batchnorm layers present and re-estimate the batchnorm statistics for better accuracy. The batchnorm layers will be folded after re-estimation.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_torch.quantsim import QuantizationSimModel
+
+dummy_input = torch.rand(1, 3, 224, 224, device=device)    # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.training_range_learning_with_tf_init,
+                           dummy_input=dummy_input,
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+def pass_calibration_data(sim_model, use_cuda):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    if use_cuda:
+        device = torch.device('cuda')
+    else:
+        device = torch.device('cpu')
+
+    samples = 1000
+    batch_cntr = 0
+
+    for input_data, target_data in data_loader:
+        inputs_batch = input_data.to(device)
+        sim_model(inputs_batch)
+
+        batch_cntr += 1
+        if (batch_cntr * batch_size) > samples:
+            break
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=use_cuda)
+
+
+
+
+
+

Perform QAT

+

To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+
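If you drive the training loop yourself instead of using the ImageNetTrainer helper wrapped by finetune() below, the schedule described above (drop the learning rate by 10x every 5 epochs) can be expressed with a standard PyTorch scheduler. This is only a sketch; the optimizer type and learning rate are placeholders:

import torch

optimizer = torch.optim.SGD(sim.model.parameters(), lr=1e-5, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

for epoch in range(15):
    # ... run one training epoch over sim.model here ...
    scheduler.step()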
+
[ ]:
+
+
+
ImageNetDataPipeline.finetune(sim.model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10], use_cuda=use_cuda)
+
+
+
+

After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy.

+
+
[ ]:
+
+
+
finetuned_accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(finetuned_accuracy)
+
+
+
+
+
+
+
+

4. Perform BatchNorm Reestimation

+
+

Re-estimate BatchNorm Statistics

+

AIMET provides a helper function, reestimate_bn_stats, for re-estimating batchnorm statistics. Here is the full list of parameters for this function: * model: Model whose BatchNorm statistics are to be re-estimated. * dataloader: Train dataloader. * num_batches (optional): The number of batches to be used for re-estimation. (Default: 100) * forward_fn (optional): Optional adapter function that performs a forward pass given a model and an input batch yielded from the data loader. If not specified, it is expected that inputs yielded from the dataloader can be passed directly to the model.

+
+
[ ]:
+
+
+
from aimet_torch.bn_reestimation import reestimate_bn_stats
+
+train_loader = ImageNetDataLoader(images_dir=DATASET_DIR,
+                                  image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  is_training=True,
+                                  num_workers=image_net_config.train['num_workers']).data_loader
+def forward_fn(model, inputs):
+    input_data, target_data = inputs
+    model(input_data)
+
+reestimate_bn_stats(sim.model, train_loader, forward_fn=forward_fn)
+
+finetuned_accuracy_bn_reestimated = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(finetuned_accuracy_bn_reestimated)
+
+
+
+
+
+

Fold BatchNorm Layers

+

So far, we have improved our quantization simulation model through QAT and batchnorm re-estimation. The next step would be to actually take this model to target. But first, we should fold the batchnorm layers for our model to run on target devices more efficiently.

+
+
[ ]:
+
+
+
from aimet_torch.batch_norm_fold import fold_all_batch_norms_to_scale
+
+fold_all_batch_norms_to_scale(sim)
+
+
+
+
+
+
+
+

5. Export Model

+

As the final step, we will export the model to run it on actual target devices. AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+dummy_input = dummy_input.cpu()
+sim.export(path='./output/', filename_prefix='resnet18_after_qat', dummy_input=dummy_input)
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use the batchnorm re-estimation feature of AIMET.

+

A few additional resources - Refer to the AIMET API docs for more details on the APIs and optional parameters. - Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT methods.

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/quantization/bn_reestimation.ipynb b/releases/1.32.2/Examples/torch/quantization/bn_reestimation.ipynb new file mode 100644 index 00000000..b468ec4b --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/bn_reestimation.ipynb @@ -0,0 +1,409 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Quantization-Aware Training with BatchNorm Re-estimation\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training) with batchnorm re-estimation.\n", + "Batchnorm re-estimation is a technique for countering potential instability of batchnrom statistics (i.e. running mean and variance) during QAT. More specifically, batchnorm re-estimation recalculates the batchnorm statistics based on the model after QAT. By doing so, we aim to make our model learn batchnorm statistics from from stable outputs after QAT, rather than from likely noisy outputs during QAT.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following steps:\n", + "1. Create a quantization simulation model with fake quantization ops inserted.\n", + "2. Finetune and evaluate the quantization simulation model\n", + "3. Re-estimate batchnorm statistics and compare the eval score before and after re-estimation.\n", + "4. Fold the re-estimated batchnorm layers and export the quantization simulation model\n", + "\n", + "#### What this notebook is not\n", + "In this notebook, we will focus how to apply batchnorm re-estimation after QAT, rather than covering all the details about QAT itself. For more information about QAT, please refer to [QAT notebook](https://github.com/quic/aimet/blob/develop/Examples/torch/quantization/quantization_aware_training.ipynb) or [QAT range learning notebook](https://github.com/quic/aimet/blob/develop/Examples/torch/quantization/qat_range_learning.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "is_executing": true + } + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "is_executing": true, + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import os\n", + "import torch\n", + "from Examples.common import image_net_config\n", + "from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_trainer import ImageNetTrainer\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param model: the model to evaluate\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)\n", + "\n", + " @staticmethod\n", + " def finetune(model: torch.nn.Module, epochs, learning_rate, learning_rate_schedule, use_cuda):\n", + " \"\"\"\n", + " Given a torch model, finetunes the model to improve its accuracy\n", + " :param model: the model to finetune\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param learning_rate_schedule: The learning rate schedule used during the finetuning step.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_workers=image_net_config.train['num_workers'])\n", + 
"\n", + " trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,\n", + " learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## 2. Load FP32 model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "from aimet_torch.model_preparer import prepare_model\n", + "\n", + "use_cuda = torch.cuda.is_available()\n", + "if use_cuda:\n", + " device = torch.device(\"cuda\")\n", + "else:\n", + " device = torch.device(\"cpu\")\n", + "\n", + "model = resnet18(pretrained=True).to(device)\n", + "model = prepare_model(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and Perform QAT\n", + "\n", + "### Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.\n", + "\n", + "**NOTE**: Note that, unlike in other QAT example scripts, we didn't fold batchnorm layers before QAT. This is because we aim to finetune our model with batchnorm layers present and re-estimate the batchnorm statatistics for better accuracy. The batchnorm layers will be folded after re-estimation." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_torch.quantsim import QuantizationSimModel\n", + "\n", + "dummy_input = torch.rand(1, 3, 224, 224, device=device) # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.training_range_learning_with_tf_init,\n", + " dummy_input=dummy_input,\n", + " default_output_bw=8,\n", + " default_param_bw=8)\n", + "\n", + "def pass_calibration_data(sim_model, use_cuda):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " if use_cuda:\n", + " device = torch.device('cuda')\n", + " else:\n", + " device = torch.device('cpu')\n", + "\n", + " samples = 1000\n", + " batch_cntr = 0\n", + "\n", + " for input_data, target_data in data_loader:\n", + " inputs_batch = input_data.to(device)\n", + " sim_model(inputs_batch)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break\n", + " \n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Perform QAT\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.finetune(sim.model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10], use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "finetuned_accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(finetuned_accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. Perform BatchNorm Reestimation\n", + "\n", + "### Re-estimate BatchNorm Statistics\n", + "AIMET provides a helper function, `reestimate_bn_stats`, for re-estimating batchnorm statistics.\n", + "Here is the full list of parameters for this function:\n", + "* **model**: Model to re-estimate the BatchNorm statistics.\n", + "* **dataloader** Train dataloader.\n", + "* **num_batches** (optional): The number of batches to be used for reestimation. (Default: 100)\n", + "* **forward_fn** (optional): Optional adapter function that performs forward pass given a model and a input batch yielded from the data loader. If not specified, it is expected that inputs yielded from dataloader can be passed directly to the model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.bn_reestimation import reestimate_bn_stats\n", + "\n", + "train_loader = ImageNetDataLoader(images_dir=DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " is_training=True,\n", + " num_workers=image_net_config.train['num_workers']).data_loader\n", + "def forward_fn(model, inputs):\n", + " input_data, target_data = inputs\n", + " model(input_data)\n", + "\n", + "reestimate_bn_stats(sim.model, train_loader, forward_fn=forward_fn)\n", + "\n", + "finetuned_accuracy_bn_reestimated = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(finetuned_accuracy_bn_reestimated)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Fold BatchNorm Layers\n", + "\n", + "So far, we have improved our quantization simulation model through QAT and batchnorm re-estimation. The next step would be to actually take this model to target. But first, we should fold the batchnorm layers for our model to run on target devices more efficiently." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.batch_norm_fold import fold_all_batch_norms_to_scale\n", + "\n", + "fold_all_batch_norms_to_scale(sim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 5. Export Model\n", + "As the final step, we will export the model to run it on actual target devices. AIMET QuantizationSimModel provides an export API for this purpose." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "dummy_input = dummy_input.cpu()\n", + "sim.export(path='./output/', filename_prefix='resnet18_after_qat', dummy_input=dummy_input)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use batchnorm re-estimation feature of AIMET.\n", + "\n", + "Few additional resources\n", + "- Refer to the [AIMET API docs](https://quic.github.io/aimet-pages/AimetDocs/api_docs/index.html) to know more details of the APIs and optional parameters.\n", + "- Refer to the [other example notebooks](https://github.com/quic/aimet/tree/develop/Examples/torch/quantization) to understand how to use AIMET post-training quantization techniques and QAT methods." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/releases/1.32.2/Examples/torch/quantization/cle_bc.html b/releases/1.32.2/Examples/torch/quantization/cle_bc.html new file mode 100644 index 00000000..fe200765 --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/cle_bc.html @@ -0,0 +1,1506 @@ + + + + + + Cross-Layer Equalization (CLE) and Bias Correction (BC) — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + +
  • + View page source +
  • +
+
+
+
+
+ +
+

Cross-Layer Equalization (CLE) and Bias Correction (BC)

+

This notebook showcases a working code example of how to use AIMET to apply Cross-Layer Equalization (CLE) and Bias Correction (BC). CLE and BC are post-training quantization techniques that aim to improve quantized accuracy of a given model. CLE does not need any data samples. BC may optionally need unlabelled data samples. These techniques help recover quantized accuracy when the model quantization is sensitive to parameter quantization as opposed to activation quantization.

+

To learn more about these techniques, please refer to the “Data-Free Quantization Through Weight Equalization and Bias Correction” paper from ICCV 2019 - https://arxiv.org/abs/1906.04721

+

Cross-Layer Equalization AIMET performs the following steps when running CLE: 1. Batch Norm Folding: Folds BN layers into Conv layers immediately before or after the Conv layers. 2. Cross-Layer Scaling: Given a set of consecutive Conv layers, equalizes the range of tensor values per-channel by scaling up/down per-channel weight tensor values of a layer and correspondingly scaling down/up per-channel weight tensor values of the subsequent layer. 3. High Bias Folding: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer’s parameters.

+
+
Bias Correction
+
Quantization sometimes leads to a shift in layer outputs. This technique helps correct this shift by adjusting the bias parameters of that layer. Note that this technique is generally applied after CLE, but it is an optional step.
+
+
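As a preview of the APIs exercised later in this notebook, applying CLE and (optionally) BC comes down to a couple of calls. Treat the snippet below as a sketch: the data loader and the sample counts are placeholders, and the exact arguments are set up in the sections that follow.

from aimet_common.defs import QuantScheme
from aimet_torch.cross_layer_equalization import equalize_model
from aimet_torch.bias_correction import correct_bias
from aimet_torch.quantsim import QuantParams

# Cross-Layer Equalization: BN folding + cross-layer scaling + high-bias folding,
# applied in place on the FP32 model. No data samples are required.
equalize_model(model, input_shapes=(1, 3, 224, 224))

# Empirical Bias Correction: uses a small amount of unlabeled data.
# data_loader and the sample counts below are placeholder values.
quant_params = QuantParams(weight_bw=8, act_bw=8, round_mode='nearest',
                           quant_scheme=QuantScheme.post_training_tf_enhanced)
correct_bias(model, quant_params, num_quant_samples=16,
             data_loader=data_loader, num_bias_correct_samples=16)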
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score 4. Apply CLE and BC, and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The ImageNet dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per image sample

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import os
+import torch
+from Examples.common import image_net_config
+from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_trainer import ImageNetTrainer
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:
+        """
+        Given a torch model, evaluates its Top-1 accuracy on the dataset
+        :param model: the model to evaluate
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. You can load any other pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+
+model = resnet18(pretrained=True)
+
+
+
+

AIMET quantization simulation requires the user’s model definition to follow certain guidelines. For example, functionals defined in the forward pass should be changed to equivalent torch.nn.Module objects. The AIMET user guide lists all these guidelines. The following ModelPreparer API uses the graph transformation feature available in PyTorch 1.9+ and automates the model definition changes required to comply with these guidelines.
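To give a sense of the kind of change prepare_model automates, here is a hypothetical before/after example (the BeforePrep and AfterPrep modules are illustrations, not AIMET code): a functional call in forward() is replaced by an equivalent torch.nn.Module so that AIMET can attach a quantizer to it.

import torch
import torch.nn.functional as F

class BeforePrep(torch.nn.Module):
    """Violates the guideline: uses a functional in the forward pass."""
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return F.relu(self.conv(x))      # functional call, not a module

class AfterPrep(torch.nn.Module):
    """Equivalent model after the kind of rewrite ModelPreparer performs automatically."""
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.relu = torch.nn.ReLU()      # functional replaced with a torch.nn.Module

    def forward(self, x):
        return self.relu(self.conv(x))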

+
+
[ ]:
+
+
+
from aimet_torch.model_preparer import prepare_model
+
+model = prepare_model(model)
+
+
+
+
+

We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
use_cuda = False
+if torch.cuda.is_available():
+    use_cuda = True
+    model.to(torch.device('cuda'))
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(model, use_cuda)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating-point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers, and this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.
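As a rough sketch of the arithmetic involved (the helper below is illustrative only; AIMET's fold_all_batch_norms handles this, and more cases, for you), folding a BatchNorm that follows a Conv rescales the Conv's weights per output channel using the BN's learned statistics and absorbs the BN shift into the Conv's bias.

import torch

def fold_bn_into_conv(conv: torch.nn.Conv2d, bn: torch.nn.BatchNorm2d) -> None:
    """Illustrative sketch: fold a BatchNorm (eval mode) into the Conv that precedes it."""
    with torch.no_grad():
        scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)   # per output channel
        conv.weight *= scale.view(-1, 1, 1, 1)
        if conv.bias is None:
            conv.bias = torch.nn.Parameter(torch.zeros_like(bn.running_mean))
        conv.bias.copy_((conv.bias - bn.running_mean) * scale + bn.bias)
        # After folding, the BN layer would be replaced with an identity op.

conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
bn = torch.nn.BatchNorm2d(8).eval()
bn.running_mean.uniform_(-1.0, 1.0)
bn.running_var.uniform_(0.5, 1.5)
x = torch.randn(1, 3, 16, 16)
reference = bn(conv(x))
fold_bn_into_conv(conv, bn)
print(torch.allclose(conv(x), reference, atol=1e-5))   # should print True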

+

The following code calls AIMET to fold the BN layers in-place on the given model

+
+
[ ]:
+
+
+
from aimet_torch.batch_norm_fold import fold_all_batch_norms
+
+_ = fold_all_batch_norms(model, input_shapes=(1, 3, 224, 224))
+
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: We set this to “QuantScheme.post_training_tf_enhanced”. Supported options are ‘tf_enhanced’ or ‘tf’, or the Quant Scheme Enum values QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced.
- default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
- default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.
- num_batches: The number of batches used to compute the quantization encodings. Only 5 batches are used here to speed up the process. In addition, the number of images in these 5 batches should be sufficient for computing encodings.
- rounding_mode: The rounding mode used for quantization. There are two possible choices here - ‘nearest’ or ‘stochastic’. We will use “nearest”.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_torch.quantsim import QuantizationSimModel
+
+dummy_input = torch.rand(1, 3, 224, 224)    # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)
+if use_cuda:
+    dummy_input = dummy_input.cuda()
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           dummy_input=dummy_input,
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+
+
+
+

We can check the modifications AIMET has made to the model graph. One way is to print the model, and we can see that AIMET has added quantization wrapper layers. Note: use sim.model to access the modified PyTorch model. By default, AIMET creates a copy of the original model prior to modifying it. There is a parameter to override this behavior.

+
+
[ ]:
+
+
+
print(sim.model)
+
+
+
+
+

We can also check how AIMET has configured the added fake quantization nodes, which AIMET refers to as ‘quantizers’. You can see this by printing the sim object.

+
+
[ ]:
+
+
+
print(sim)
+
+
+
+
+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics, which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.
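To make ‘computing encodings’ concrete, here is a simplified sketch of how a scale/offset pair could be derived from observed min/max activation statistics and used by a fake-quantization op. This is an illustration under simplified assumptions (per-tensor, asymmetric, 8-bit, plain min/max range selection), not AIMET's exact algorithm or internal conventions; the tf_enhanced scheme in particular chooses the range more carefully.

import torch

def compute_encoding(observed_min: float, observed_max: float, bitwidth: int = 8):
    """Illustrative sketch: derive a scale/offset encoding from an observed activation range."""
    observed_min = min(observed_min, 0.0)                 # keep zero exactly representable
    observed_max = max(observed_max, 0.0)
    num_steps = 2 ** bitwidth - 1
    scale = max((observed_max - observed_min) / num_steps, 1e-8)
    offset = round(observed_min / scale)                  # integer grid value mapped to observed_min
    return scale, offset

def fake_quantize(x: torch.Tensor, scale: float, offset: int, bitwidth: int = 8) -> torch.Tensor:
    """Quantize then dequantize, which is what the inserted simulation ops do."""
    q = torch.clamp(torch.round(x / scale) - offset, 0, 2 ** bitwidth - 1)
    return (q + offset) * scale

scale, offset = compute_encoding(observed_min=-0.8, observed_max=1.2)
print(fake_quantize(torch.tensor([-1.0, 0.0, 0.5, 2.0]), scale, offset))   # values clamped to [-0.8, 1.2]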

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can just ignore the model output. A few pointers regarding the data samples:

+

In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples, but for computing encodings we only need 500 or 1000 samples. It may be beneficial if the samples used for computing encodings are well distributed. It is not necessary that all classes are covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario in which only ‘dark’ or ‘light’ samples are used - e.g. using only pictures captured at night might not give ideal results. The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
def pass_calibration_data(sim_model, use_cuda):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    if use_cuda:
+        device = torch.device('cuda')
+    else:
+        device = torch.device('cpu')
+
+    sim_model.eval()
+    samples = 1000
+
+    batch_cntr = 0
+    with torch.no_grad():
+        for input_data, target_data in data_loader:
+
+            inputs_batch = input_data.to(device)
+            sim_model(inputs_batch)
+
+            batch_cntr += 1
+            if (batch_cntr * batch_size) > samples:
+                break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=use_cuda)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(accuracy)
+
+
+
+
+
+
+

4.1 Cross-Layer Equalization

+

The next cell performs cross-layer equalization on the model. As noted before, the function folds batch norms, applies cross-layer scaling, and then folds high biases.

+

Note: Interestingly, CLE needs BN statistics for its procedure. If a BN folded model is provided, CLE will run the CLS (cross-layer scaling) optimization step but will skip the HBA (high-bias absorption) step. To avoid this, we simply load the original model again before running CLE.

+

Note: CLE equalizes the model in-place

+
+
[ ]:
+
+
+
model = resnet18(pretrained=True)
+model = prepare_model(model)
+
+use_cuda = False
+if torch.cuda.is_available():
+    use_cuda = True
+    model.to(torch.device('cuda'))
+
+
+
+
+
[ ]:
+
+
+
from aimet_torch.cross_layer_equalization import equalize_model
+
+equalize_model(model, input_shapes=(1, 3, 224, 224))
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the equalized model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.

+
+
[ ]:
+
+
+
sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf,
+                           dummy_input=dummy_input,
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=use_cuda)
+
+accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(accuracy)
+
+
+
+
+
+
+

4.2 Bias Correction

+

This section shows how we can apply AIMET Bias Correction on top of the already equalized model from the previous step. Bias correction under the hood uses a reference FP32 model and a QuantizationSimModel to perform its procedure. More details are explained in the AIMET User Guide documentation.

+

For the correct_bias API, we pass the following parameters

+
    +
  • num_quant_samples: Number of samples used for computing encodings. We are setting this to a low number here to speed up execution. A typical number would be 500-1000.

  • +
  • num_bias_correct_samples: Number of samples used for bias correction. We are setting this to a low number here to speed up execution. A typical number would be 1000-2000.

  • +
  • data_loader: BC uses unlabeled data samples from this data loader.

  • +
+
+
[ ]:
+
+
+
from aimet_torch.quantsim import QuantParams
+from aimet_torch.bias_correction import correct_bias
+
+data_loader = ImageNetDataPipeline.get_val_dataloader()
+
+bc_params = QuantParams(weight_bw=8, act_bw=8, round_mode="nearest",
+                        quant_scheme=QuantScheme.post_training_tf_enhanced)
+
+correct_bias(model, bc_params, num_quant_samples=16,
+             data_loader=data_loader, num_bias_correct_samples=16)
+
+
+
+
+

Now, we can determine the simulated quantized accuracy of the bias-corrected model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy.

+
+
[ ]:
+
+
+
sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           dummy_input=dummy_input,
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=use_cuda)
+
+accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after applying CLE and BC. Of course, this was just an example. Please try this against the model of your choice and play with the number of samples to get the best results.

+

Now the next step would be to take this model to target. For this purpose, we need to export the model with the updated weights, without the fake quant ops, and also export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+dummy_input = dummy_input.cpu()
+sim.export(path='./output/', filename_prefix='resnet18_after_cle_bc', dummy_input=dummy_input)
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET to perform Cross-Layer Equalization (CLE) and Bias Correction (BC).

+

A few additional resources: - Refer to the AIMET API docs for more details on the APIs and optional parameters - Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/quantization/cle_bc.ipynb b/releases/1.32.2/Examples/torch/quantization/cle_bc.ipynb new file mode 100644 index 00000000..fc31b76f --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/cle_bc.ipynb @@ -0,0 +1,574 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Cross-Layer Equalization (CLE) and Bias Correction (BC)\n", + "\n", + "This notebook showcases a working code example of how to use AIMET to apply Cross-Layer Equalization (CLE) and Bias Correction (BC). CLE and BC are post-training quantization techniques that aim to improve quantized accuracy of a given model. CLE does not need any data samples. BC may optionally need unlabelled data samples. These techniques help recover quantized accuracy when the model quantization is sensitive to parameter quantization as opposed to activation quantization.\n", + "\n", + "To learn more about this techniques, please refer to the \"Data-Free Quantization Through Weight Equalization and Bias Correction\" paper from ICCV 2019 - https://arxiv.org/abs/1906.04721\n", + "\n", + "**Cross-Layer Equalization**\n", + "AIMET performs the following steps when running CLE:\n", + "1. Batch Norm Folding: Folds BN layers into Conv layers immediate before or after the Conv layers.\n", + "2. Cross-Layer Scaling: Given a set of consecutive Conv layers, equalizes the range of tensor values per-channel by scaling up/down per-channel weight tensor values of a layer and corresponding scaling down/up per-channel weight tensor values of the subsequent layer.\n", + "3. High Bias Folding: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer's parameters.\n", + "\n", + "**Bias Correction** \n", + "Quantization sometimes leads to a shift in layer outputs. This techniques helps correct this shift by adjusting the bias parameters of that layer. Note that this technique is generally applied after CLE, but it is a optional step.\n", + "\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Apply CLE, BC and and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. 
https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import torch\n", + "from Examples.common import image_net_config\n", + "from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_trainer import ImageNetTrainer\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param model: the model to evaluate\n", + " :param iterations: the number of batches to be used to evaluate the model. 
A value of 'None' means the model will be\n", + " evaluated on the entire dataset once.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "\n", + "model = resnet18(pretrained=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET quantization simulation requires the user's model definition to follow certain guidelines. For example, functionals defined in forward pass should be changed to equivalent torch.nn.Module.\n", + "AIMET user guide lists all these guidelines.\n", + "The following **ModelPreparer** API uses new graph transformation feature available in PyTorch 1.9+ version and automates model definition changes required to comply with the above guidelines. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.model_preparer import prepare_model\n", + "\n", + "model = prepare_model(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = False\n", + "if torch.cuda.is_available():\n", + " use_cuda = True\n", + " model.to(torch.device('cuda'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. 
Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers in-place on the given model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "_ = fold_all_batch_norms(model, input_shapes=(1, 3, 224, 224))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Create Quantization Sim Model\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "- **num_batches**: The number of batches used to evaluate the model while calculating the quantization encodings.Number of batches to use for computing encodings. Only 5 batches are used here to speed up the process. In addition, the number of images in these 5 batches should be sufficient for compute encodings\n", + "- **rounding_mode**: The rounding mode used for quantization. There are two possible choices here - 'nearest' or 'stochastic' We will use \"nearest.\"\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_torch.quantsim import QuantizationSimModel\n", + "\n", + "dummy_input = torch.rand(1, 3, 224, 224) # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)\n", + "if use_cuda:\n", + " dummy_input = dummy_input.cuda()\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " dummy_input=dummy_input,\n", + " default_output_bw=8,\n", + " default_param_bw=8)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We can check the modifications AIMET has made to the model graph. One way is to print the model, and we can see that AIMET has added quantization wrapper layers. Note: use sim.model to access the modified PyTorch model. By default, AIMET creates a copy of the original model prior to modifying it. There is a parameter to override this behavior." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(sim.model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We can also check how AIMET has configured the added fake quantization nodes, which AIMET refers to as 'quantizers'. You can see this by printing the sim object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "print(sim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "\n", + "In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def pass_calibration_data(sim_model, use_cuda):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " if use_cuda:\n", + " device = torch.device('cuda')\n", + " else:\n", + " device = torch.device('cpu')\n", + "\n", + " sim_model.eval()\n", + " samples = 1000\n", + "\n", + " batch_cntr = 0\n", + " with torch.no_grad():\n", + " for input_data, target_data in data_loader:\n", + "\n", + " inputs_batch = input_data.to(device)\n", + " sim_model(inputs_batch)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. 1 Cross Layer Equalization\n", + "\n", + "The next cell performs cross-layer equalization on the model. As noted before, the function folds batch norms, applies cross-layer scaling, and then folds high biases.\n", + "\n", + "**Note:** Interestingly, CLE needs BN statistics for its procedure. If a BN folded model is provided, CLE will run the CLS (cross-layer scaling) optimization step but will skip the HBA (high-bias absorption) step. To avoid this, we simply load the original model again before running CLE.\n", + "\n", + "**Note:** CLE equalizes the model in-place" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "model = resnet18(pretrained=True)\n", + "model = prepare_model(model)\n", + "\n", + "use_cuda = False\n", + "if torch.cuda.is_available():\n", + " use_cuda = True\n", + " model.to(torch.device('cuda'))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.cross_layer_equalization import equalize_model\n", + "\n", + "equalize_model(model, input_shapes=(1, 3, 224, 224))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the equalized model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf,\n", + " dummy_input=dummy_input,\n", + " default_output_bw=8,\n", + " default_param_bw=8)\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=use_cuda)\n", + "\n", + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. 2 Bias Correction\n", + "\n", + "This section shows how we can apply AIMET Bias Correction on top of the already equalized model from the previous step. Bias correction under the hood uses a reference FP32 model and a QuantizationSimModel to perform its procedure. More details are explained in the AIMET User Guide documentation.\n", + "\n", + "For the correct_bias API, we pass the following parameters\n", + "\n", + "- **num_quant_samples**: Number of samples used for computing encodings. We are setting this to a low number here to speed up execution. 
A typical number would be 500-1000.\n", + "- **num_bias_correct_samples**: Number of samples used for bias correction. We are setting this to a low number here to speed up execution. A typical number would be 1000-2000.\n", + "- **data_loader**: BC uses unlabeled data samples from this data loader." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.quantsim import QuantParams\n", + "from aimet_torch.bias_correction import correct_bias\n", + "\n", + "data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + "\n", + "bc_params = QuantParams(weight_bw=8, act_bw=8, round_mode=\"nearest\",\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced)\n", + "\n", + "correct_bias(model, bc_params, num_quant_samples=16,\n", + " data_loader=data_loader, num_bias_correct_samples=16)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now, we can determine the simulated quantized accuracy of the bias-corrected model. We again create a simulation model like before and evaluate to determine simulated quantized accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " dummy_input=dummy_input,\n", + " default_output_bw=8,\n", + " default_param_bw=8)\n", + "\n", + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=use_cuda)\n", + "\n", + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after applying CLE ad BC. Ofcourse, this was just an example. Please try this against the model of your choice and play with the number of samples to get the best results.\n", + "\n", + "Now the next step would be to take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. And also to export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "dummy_input = dummy_input.cpu()\n", + "sim.export(path='./output/', filename_prefix='resnet18_after_cle_bc', dummy_input=dummy_input)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing Cross Layer Equalization (CLE) and Bias Correction (BC).\n", + "\n", + "Few additional resources\n", + "- Refer to the AIMET API docs to know more details of the APIs and optional parameters\n", + "- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT techniques" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/torch/quantization/qat.html b/releases/1.32.2/Examples/torch/quantization/qat.html new file mode 100644 index 00000000..e3fb3e8e --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/qat.html @@ -0,0 +1,1440 @@ + + + + + + Quantization-Aware Training — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantization-Aware Training

+

This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training). QAT is an AIMET feature that adds quantization simulation ops (sometimes also called fake quantization ops) to a trained ML model and uses a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.

+

AIMET supports two different types of QAT: 1. Simply referred to as QAT - quantization parameters like per-tensor scale/offsets for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant. 2. Referred to as QAT with range-learning - quantization parameters like per-tensor scale/offsets for activations are computed initially, and then both the quantization parameters and the model weights are jointly updated during fine-tuning to minimize the effects of quantization in the forward pass.
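To illustrate the difference between the two flavors, here is a minimal sketch of a fake-quantization op that uses a straight-through estimator so gradients can reach the weights during fine-tuning. The FakeQuant class and its parameters are purely illustrative and are not AIMET's implementation: in plain QAT the encoding (scale) stays a fixed buffer after calibration, while in range-learning it would be a trainable nn.Parameter updated together with the weights.

import torch

class FakeQuant(torch.nn.Module):
    """Illustrative symmetric fake-quantization op with a straight-through estimator."""
    def __init__(self, scale: float, bitwidth: int = 8, learn_range: bool = False):
        super().__init__()
        scale = torch.tensor(float(scale))
        if learn_range:
            self.scale = torch.nn.Parameter(scale)        # QAT with range-learning
        else:
            self.register_buffer('scale', scale)          # plain QAT: encoding stays constant
        self.qmin = -(2 ** (bitwidth - 1))
        self.qmax = 2 ** (bitwidth - 1) - 1

    def forward(self, x):
        v = x / self.scale
        v = v + (torch.round(v) - v).detach()             # straight-through rounding
        return torch.clamp(v, self.qmin, self.qmax) * self.scale

fq = FakeQuant(scale=0.02)
w = torch.randn(4, requires_grad=True)
fq(w).sum().backward()                                    # gradients reach w despite the rounding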

+

This notebook specifically shows a working code example for #1 above. You can find a separate notebook for #2 in the same folder.

+
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation and training pipeline 2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy 3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score 4. Fine-tune the quantization simulation model and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters like number of epochs to fine-tune are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, use that. Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The dataloader provided in this example notebook relies on the ImageNet dataset having the following characteristics: - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per each image sample.

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import os
+import torch
+from Examples.common import image_net_config
+from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_trainer import ImageNetTrainer
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:
+        """
+        Given a torch model, evaluates its Top-1 accuracy on the dataset
+        :param model: the model to evaluate
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)
+
+    @staticmethod
+    def finetune(model: torch.nn.Module, epochs, learning_rate, learning_rate_schedule, use_cuda):
+        """
+        Given a torch model, finetunes the model to improve its accuracy
+        :param model: the model to finetune
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param learning_rate_schedule: The learning rate schedule used during the finetuning step.
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_workers=image_net_config.train['num_workers'])
+
+        trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,
+                      learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. You can load any other pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+
+model = resnet18(pretrained=True)
+
+
+
+

AIMET quantization simulation requires the user’s model definition to follow certain guidelines. For example, functionals defined in the forward pass should be changed to equivalent torch.nn.Module objects. The AIMET user guide lists all these guidelines. The following ModelPreparer API uses the graph transformation feature available in PyTorch 1.9+ and automates the model definition changes required to comply with these guidelines.

+
+
[ ]:
+
+
+
from aimet_torch.model_preparer import prepare_model
+
+model = prepare_model(model)
+
+
+
+
+

We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
use_cuda = False
+if torch.cuda.is_available():
+    use_cuda = True
+    model.to(torch.device('cuda'))
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(model, use_cuda)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating-point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers, and this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+

The following code calls AIMET to fold the BN layers in-place on the given model

+
+
[ ]:
+
+
+
from aimet_torch.batch_norm_fold import fold_all_batch_norms
+
+_ = fold_all_batch_norms(model, input_shapes=(1, 3, 224, 224))
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: We set this to “QuantScheme.post_training_tf_enhanced”. Supported options are ‘tf_enhanced’ or ‘tf’, or the Quant Scheme Enum values QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced.
- default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
- default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_torch.quantsim import QuantizationSimModel
+
+dummy_input = torch.rand(1, 3, 224, 224)    # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)
+if use_cuda:
+    dummy_input = dummy_input.cuda()
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           dummy_input=dummy_input,
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+
+
+
+

We can check the modifications AIMET has made to the model graph. One way is to print the model, and we can see that AIMET has added quantization wrapper layers. Note: use sim.model to access the modified PyTorch model. By default, AIMET creates a copy of the original model prior to modifying it. There is a parameter to override this behavior.

+
+
[ ]:
+
+
+
print(sim.model)
+
+
+
+
+

We can also check how AIMET has configured the added fake quantization nodes, which AIMET refers to as ‘quantizers’. You can see this by printing the sim object.

+
+
[ ]:
+
+
+
print(sim)
+
+
+
+
+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics, which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can just ignore the model output. A few pointers regarding the data samples - In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples, but for computing encodings we only need 500 or 1000 samples. - It may be beneficial if the samples used for computing encodings are well distributed. It is not necessary that all classes are covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario in which only ‘dark’ or ‘light’ samples are used - e.g. using only pictures captured at night might not give ideal results.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
def pass_calibration_data(sim_model, use_cuda):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    if use_cuda:
+        device = torch.device('cuda')
+    else:
+        device = torch.device('cpu')
+
+    sim_model.eval()
+    samples = 1000
+
+    batch_cntr = 0
+    with torch.no_grad():
+        for input_data, target_data in data_loader:
+
+            inputs_batch = input_data.to(device)
+            sim_model(inputs_batch)
+
+            batch_cntr += 1
+            if (batch_cntr * batch_size) > samples:
+                break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=use_cuda)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(accuracy)
+
+
+
+
+
+
+

4. Perform QAT

+

To perform quantization-aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched over for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.
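The cell below uses the example pipeline's finetune() helper; conceptually, though, QAT is just an ordinary supervised training loop run on sim.model. Here is a minimal sketch, assuming a labelled train_loader and the CUDA placement used above (the helper function and its defaults are illustrative, not part of AIMET):

import torch

def finetune_sim_model(sim_model, train_loader, epochs=1, lr=5e-7, device='cuda'):
    """Illustrative QAT sketch: ordinary supervised training applied to sim.model."""
    optimizer = torch.optim.SGD(sim_model.parameters(), lr=lr, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)
    criterion = torch.nn.CrossEntropyLoss()

    sim_model.train()
    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(sim_model(images), labels)
            loss.backward()              # gradients flow through the fake quantization ops
            optimizer.step()
        scheduler.step()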

+

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.finetune(sim.model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10], use_cuda=use_cuda)
+
+
+
+
+

After QAT is complete, we can run quantization simulation inference against the validation dataset to observe any improvements in accuracy.

+
+
[ ]:
+
+
+
finetuned_accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(finetuned_accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.

+

So we have an improved model after QAT. Now the next step would be to take this model to target. For this purpose, we need to export the model with the updated weights, without the fake quant ops, and also export the encodings (scale/offset quantization parameters). AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+dummy_input = dummy_input.cpu()
+sim.export(path='./output/', filename_prefix='resnet18_after_qat', dummy_input=dummy_input)
+
+
+
+
+
+

Summary

+

We hope this notebook was useful for understanding how to use AIMET to perform QAT.

+

A few additional resources: - Refer to the AIMET API docs for more details on the APIs and optional parameters. - Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and QAT with range-learning.

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/quantization/qat.ipynb b/releases/1.32.2/Examples/torch/quantization/qat.ipynb new file mode 100644 index 00000000..78ad8a57 --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/qat.ipynb @@ -0,0 +1,490 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quantization-Aware Training\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training). QAT is an AIMET feature adding quantization simulation ops (also called fake quantization ops sometimes) to a trained ML model and using a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.\n", + "\n", + "AIMET supports two different types of QAT\n", + "1. Simply referred to as QAT - quantization parameters like per-tensor scale/offsets for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant.\n", + "2. Referred to as QAT with range-learning - quantization parameters like per-tensor scale/offsets for activations are computed initially. Then both the quantization parameters and the model weights are jointly updated during fine-tuning to minimize the effects of quantization in the forward pass.\n", + "\n", + "This notebook specifically shows working code example for #1 above. You can find a separate notebook for #2 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Fine-tune the quantization simulation model and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters like number of epochs to fine-tune are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. 
the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "is_executing": true + } + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "is_executing": true + } + }, + "outputs": [], + "source": [ + "import os\n", + "import torch\n", + "from Examples.common import image_net_config\n", + "from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_trainer import ImageNetTrainer\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param model: the model to evaluate\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)\n", + "\n", + " @staticmethod\n", + " def finetune(model: torch.nn.Module, epochs, learning_rate, learning_rate_schedule, use_cuda):\n", + " \"\"\"\n", + " Given a torch model, finetunes the model to improve its accuracy\n", + " :param model: the model to finetune\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param learning_rate_schedule: The learning rate schedule 
used during the finetuning step.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_workers=image_net_config.train['num_workers'])\n", + "\n", + " trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,\n", + " learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "\n", + "model = resnet18(pretrained=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET quantization simulation requires the user's model definition to follow certain guidelines. For example, functionals defined in forward pass should be changed to equivalent torch.nn.Module.\n", + "AIMET user guide lists all these guidelines.\n", + "The following **ModelPreparer** API uses new graph transformation feature available in PyTorch 1.9+ version and automates model definition changes required to comply with the above guidelines. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.model_preparer import prepare_model\n", + "\n", + "model = prepare_model(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "use_cuda = False\n", + "if torch.cuda.is_available():\n", + " use_cuda = True\n", + " model.to(torch.device('cuda'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 3. Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. 
Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers in-place on the given model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "_ = fold_all_batch_norms(model, input_shapes=(1, 3, 224, 224))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"QuantScheme.post_training_tf_enhanced\"\n", + " - Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_torch.quantsim import QuantizationSimModel\n", + "\n", + "dummy_input = torch.rand(1, 3, 224, 224) # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)\n", + "if use_cuda:\n", + " dummy_input = dummy_input.cuda()\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " dummy_input=dummy_input,\n", + " default_output_bw=8,\n", + " default_param_bw=8)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We can check the modifications AIMET has made to the model graph. One way is to print the model, and we can see that AIMET has added quantization wrapper layers. Note: use sim.model to access the modified PyTorch model. By default, AIMET creates a copy of the original model prior to modifying it. There is a parameter to override this behavior." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(sim.model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We can also check how AIMET has configured the added fake quantization nodes, which AIMET refers to as 'quantizers'. You can see this by printing the sim object." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(sim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def pass_calibration_data(sim_model, use_cuda):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " if use_cuda:\n", + " device = torch.device('cuda')\n", + " else:\n", + " device = torch.device('cpu')\n", + "\n", + " sim_model.eval()\n", + " samples = 1000\n", + "\n", + " batch_cntr = 0\n", + " with torch.no_grad():\n", + " for input_data, target_data in data_loader:\n", + "\n", + " inputs_batch = input_data.to(device)\n", + " sim_model(inputs_batch)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. 
The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. Perform QAT\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.finetune(sim.model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10], use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "finetuned_accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(finetuned_accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after QAT. Now the next step would be to actually take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. And also to export the encodings (scale/offset quantization parameters) that were updated during training since we employed QAT with range-learning. AIMET QuantizationSimModel provides an export API for this purpose." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "dummy_input = dummy_input.cpu()\n", + "sim.export(path='./output/', filename_prefix='resnet18_after_qat', dummy_input=dummy_input)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing QAT.\n", + "\n", + "Few additional resources\n", + "- Refer to the [AIMET API docs](https://quic.github.io/aimet-pages/AimetDocs/api_docs/index.html) to know more details of the APIs and optional parameters.\n", + "- Refer to the [other example notebooks](https://github.com/quic/aimet/tree/develop/Examples/torch/quantization) to understand how to use AIMET post-training quantization techniques and QAT with range-learning." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/torch/quantization/qat_range_learning.html b/releases/1.32.2/Examples/torch/quantization/qat_range_learning.html new file mode 100644 index 00000000..2b5330af --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/qat_range_learning.html @@ -0,0 +1,1442 @@ + + + + + + Quantization-Aware Training with Range Learning — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantization-Aware Training with Range Learning

+

This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training). QAT is a technique where AIMET adds quantization simulation ops (sometimes also called fake quantization ops) to a trained ML model and uses a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.

+

AIMET supports two different types of QAT:
1. Simply referred to as QAT - quantization parameters like per-tensor scale/offsets for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant.
2. Referred to as QAT with range-learning - quantization parameters like per-tensor scale/offsets for activations are computed initially. Then both the quantization parameters and the model weights are jointly updated during fine-tuning to minimize the effects of quantization in the forward pass.

+

This notebook specifically shows a working code example for #2 above (QAT with range-learning). You can find a separate notebook for #1 in the same folder.
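To make the distinction concrete: in code, the only difference between the two flavors is the quant_scheme passed when the QuantizationSimModel is created later in this notebook. The following is a minimal sketch, assuming model and dummy_input are already defined as in the cells further below; it is illustrative only and not an additional step in the recipe.

[ ]:
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

# Type 1 (plain QAT): scale/offset encodings are computed once and then kept constant
sim_qat = QuantizationSimModel(model=model,
                               quant_scheme=QuantScheme.post_training_tf_enhanced,
                               dummy_input=dummy_input)

# Type 2 (QAT with range-learning, used in this notebook): encodings are initialized
# and then trained jointly with the model weights
sim_range_learning = QuantizationSimModel(model=model,
                                          quant_scheme=QuantScheme.training_range_learning_with_tf_init,
                                          dummy_input=dummy_input)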

+
+

Overall flow

+

This notebook covers the following:
1. Instantiate the example evaluation and training pipeline
2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy
3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simulation model to get a quantized accuracy score
4. Fine-tune the quantization simulation model and evaluate the simulation model to get a post-finetuned quantized accuracy score

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters, like the number of epochs to fine-tune, are deliberately chosen so that the notebook executes more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The dataloader provided in this example notebook relies on the ImageNet dataset having the following characteristics:
- Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details.
- A subdirectory per class, and a file per image sample.

+
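For illustration, the expected directory layout looks roughly like this (the class folder and file names below are hypothetical placeholders):

[ ]:
# DATASET_DIR/
# ├── train/
# │   ├── class_a/            # one subdirectory per class
# │   │   ├── img_0001.JPEG   # one file per image sample
# │   │   └── ...
# │   └── class_b/
# │       └── ...
# └── val/
#     ├── class_a/
#     │   └── ...
#     └── class_b/
#         └── ...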

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written? Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.

  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods? Not really. You should be able to use your existing evaluate and train routines as-is.

  • +
+
+
[ ]:
+
+
+
import os
+import torch
+from Examples.common import image_net_config
+from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_trainer import ImageNetTrainer
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:
+        """
+        Given a torch model, evaluates its Top-1 accuracy on the dataset
+        :param model: the model to evaluate
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)
+
+    @staticmethod
+    def finetune(model: torch.nn.Module, epochs, learning_rate, learning_rate_schedule, use_cuda):
+        """
+        Given a torch model, finetunes the model to improve its accuracy
+        :param model: the model to finetune
+        :param epochs: The number of epochs used during the finetuning step.
+        :param learning_rate: The learning rate used during the finetuning step.
+        :param learning_rate_schedule: The learning rate schedule used during the finetuning step.
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                  batch_size=image_net_config.train['batch_size'],
+                                  num_workers=image_net_config.train['num_workers'])
+
+        trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,
+                      learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)
+
+
+
+
+
+
+

2. Load the model and evaluate to get a baseline FP32 accuracy score

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. You can load any other pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+
+model = resnet18(pretrained=True)
+
+
+
+

AIMET quantization simulation requires the user’s model definition to follow certain guidelines. For example, functionals defined in the forward pass should be changed to equivalent torch.nn.Module instances. The AIMET user guide lists all these guidelines. The following ModelPreparer API uses the graph transformation feature available in PyTorch 1.9+ and automates the model definition changes required to comply with the above guidelines.

+
+
[ ]:
+
+
+
from aimet_torch.model_preparer import prepare_model
+
+model = prepare_model(model)
+
+
+
+
+

We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
use_cuda = False
+if torch.cuda.is_available():
+    use_cuda = True
+    model.to(torch.device('cuda'))
+
+
+
+
+

Let’s determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(model, use_cuda)
+print(accuracy)
+
+
+
+
+
+
+

3. Create a quantization simulation model and determine quantized accuracy

+
+
+

Fold Batch Normalization layers

+

Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.

+

Why do we need to do this? On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is common practice to fold the BN layers. Doing so results in an inferences/sec speedup since unnecessary computation is avoided. From a floating-point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers at inference time, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers, and this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.

+

The following code calls AIMET to fold the BN layers in-place on the given model.

+
+
[ ]:
+
+
+
from aimet_torch.batch_norm_fold import fold_all_batch_norms
+
+_ = fold_all_batch_norms(model, input_shapes=(1, 3, 224, 224))
+
+
+
+
+
+

Create Quantization Sim Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them. A few of the parameters are explained here:
- quant_scheme: We set this to “training_range_learning_with_tf_init”.
  - This is the key setting that enables “range learning”. With this choice of quant scheme, AIMET will use the TF quant scheme to initialize the quantization parameters like scale/offset, and those parameters are then set to be trainable so they can continue to be updated during fine-tuning.
  - Another choice for quant_scheme is “training_range_learning_with_tf_enhanced_init”. It is similar to the above, but the initialization for scale/offset is done using the TF Enhanced scheme. Since in both schemes the quantization parameters are set to be trainable, there is not much benefit to using this choice instead of “training_range_learning_with_tf_init”.
- default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.
- default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+from aimet_torch.quantsim import QuantizationSimModel
+
+dummy_input = torch.rand(1, 3, 224, 224)    # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)
+if use_cuda:
+    dummy_input = dummy_input.cuda()
+
+sim = QuantizationSimModel(model=model,
+                           quant_scheme=QuantScheme.training_range_learning_with_tf_init,
+                           dummy_input=dummy_input,
+                           default_output_bw=8,
+                           default_param_bw=8)
+
+
+
+
+
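If you would like to try the TF Enhanced initialization mentioned above instead, the call is identical except for the quant_scheme argument. A minimal sketch, reusing the imports, model and dummy_input from the cell above:

[ ]:
sim = QuantizationSimModel(model=model,
                           quant_scheme=QuantScheme.training_range_learning_with_tf_enhanced_init,
                           dummy_input=dummy_input,
                           default_output_bw=8,
                           default_param_bw=8)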

We can check the modifications AIMET has made to the model graph. One way is to print the model, and we can see that AIMET has added quantization wrapper layers. Note: use sim.model to access the modified PyTorch model. By default, AIMET creates a copy of the original model prior to modifying it. There is a parameter to override this behavior.

+
+
[ ]:
+
+
+
print(sim.model)
+
+
+
+
+

We can also check how AIMET has configured the added fake quantization nodes, which AIMET refers to as ‘quantizers’. You can see this by printing the sim object.

+
+
[ ]:
+
+
+
print(sim)
+
+
+
+
+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each ‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics, which then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple: use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can just ignore the model output for this purpose. A few pointers regarding the data samples:
- In practice, we need only a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples; for computing encodings we only need 500 or 1000 samples.
- It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary that all classes are covered, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario in which only ‘dark’ or only ‘light’ samples are used - e.g. only using pictures captured at night might not give ideal results.

+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways; this is just an example.

+
+
[ ]:
+
+
+
def pass_calibration_data(sim_model, use_cuda):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    if use_cuda:
+        device = torch.device('cuda')
+    else:
+        device = torch.device('cpu')
+
+    sim_model.eval()
+    samples = 1000
+
+    batch_cntr = 0
+    with torch.no_grad():
+        for input_data, target_data in data_loader:
+
+            inputs_batch = input_data.to(device)
+            sim_model(inputs_batch)
+
+            batch_cntr += 1
+            if (batch_cntr * batch_size) > samples:
+                break
+
+
+
+
+

Now we call AIMET to use the above routine to pass data through the model and then compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
+
[ ]:
+
+
+
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+                      forward_pass_callback_args=use_cuda)
+
+
+
+
+

Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before.

+
+
[ ]:
+
+
+
accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(accuracy)
+
+
+
+
+
+
+

4. Perform QAT

+

To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

+

For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit.

+
+
[ ]:
+
+
+
ImageNetDataPipeline.finetune(sim.model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10], use_cuda=use_cuda)
+
+
+
+
+
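The ImageNetTrainer used by finetune() applies the learning-rate schedule internally. If you are wiring QAT into your own training loop instead, a rough sketch of an equivalent schedule is shown below; it assumes a plain SGD optimizer, and train_one_epoch is a hypothetical placeholder for your own epoch loop:

[ ]:
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

optimizer = SGD(sim.model.parameters(), lr=5e-7, momentum=0.9)
# Drop the learning rate by a factor of 10 after epochs 5 and 10,
# mirroring learning_rate_schedule=[5, 10] in the call above
scheduler = MultiStepLR(optimizer, milestones=[5, 10], gamma=0.1)

# for epoch in range(num_epochs):
#     train_one_epoch(sim.model, optimizer)   # hypothetical: your own training step
#     scheduler.step()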

After we are done with QAT, we can run quantization simulation inference against the validation dataset to observe any improvements in accuracy.

+
+
[ ]:
+
+
+
finetuned_accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+print(finetuned_accuracy)
+
+
+
+
+

Depending on your settings, you may have observed a slight gain in accuracy after one epoch of training. Of course, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.

+

So we have an improved model after QAT. The next step would be to take this model to target. For this purpose, we need to export the model with the updated weights, without the fake quant ops, and also to export the encodings (scale/offset quantization parameters) that were updated during training, since we employed QAT with range-learning. AIMET QuantizationSimModel provides an export API for this purpose.

+
+
[ ]:
+
+
+
os.makedirs('./output/', exist_ok=True)
+dummy_input = dummy_input.cpu()
+sim.export(path='./output/', filename_prefix='resnet18_after_qat', dummy_input=dummy_input)
+
+
+
+
+
+
+

Summary

+

We hope this notebook was useful in helping you understand how to use AIMET for performing QAT with range-learning.

+

A few additional resources:
- Refer to the AIMET API docs for more details on the APIs and optional parameters.
- Refer to the other example notebooks to understand how to use AIMET post-training quantization techniques and the vanilla QAT method (without range-learning).

+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/quantization/qat_range_learning.ipynb b/releases/1.32.2/Examples/torch/quantization/qat_range_learning.ipynb new file mode 100644 index 00000000..e15c9c90 --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/qat_range_learning.ipynb @@ -0,0 +1,561 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Quantization-Aware Training with Range Learning\n", + "\n", + "This notebook shows a working code example of how to use AIMET to perform QAT (Quantization-aware training). QAT is a technique where AIMET adds quantization simulation ops (also called fake quantization ops sometimes) to a trained ML model and use a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.\n", + "\n", + "AIMET supports two different types of QAT\n", + "1. Simply referred to as QAT - quantization parameters like per-tensor scale/offsets for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant.\n", + "2. Referred to as QAT with range-learning - quantization parameters like per-tensor scale/offsets for activations are computed initially. Then both the quantization parameters and the model weights are jointly updated during fine-tuning to minimize the effects of quantization in the forward pass.\n", + "\n", + "This notebook specifically shows working code example for #2 above. You can find a separate notebook for #1 in the same folder.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation and training pipeline\n", + "2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy\n", + "3. Create a quantization simulation model (with fake quantization ops inserted) and evaluate this simuation model to get a quantized accuracy score\n", + "4. Fine-tune the quantization simulation model and evaluate the simulation model to get a post-finetuned quantized accuracy score\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art QAT results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters like number of epochs to fine-tune are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. 
Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "is_executing": true + } + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really. You should be able to use your existing evaluate and train routines as-is.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "is_executing": true, + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import os\n", + "import torch\n", + "from Examples.common import image_net_config\n", + "from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_trainer import ImageNetTrainer\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param model: the model to evaluate\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)\n", + "\n", + " @staticmethod\n", + " def finetune(model: torch.nn.Module, epochs, 
learning_rate, learning_rate_schedule, use_cuda):\n", + " \"\"\"\n", + " Given a torch model, finetunes the model to improve its accuracy\n", + " :param model: the model to finetune\n", + " :param epochs: The number of epochs used during the finetuning step.\n", + " :param learning_rate: The learning rate used during the finetuning step.\n", + " :param learning_rate_schedule: The learning rate schedule used during the finetuning step.\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.train['batch_size'],\n", + " num_workers=image_net_config.train['num_workers'])\n", + "\n", + " trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,\n", + " learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## 2. Load the model and evaluate to get a baseline FP32 accuracy score" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "\n", + "model = resnet18(pretrained=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "AIMET quantization simulation requires the user's model definition to follow certain guidelines. For example, functionals defined in forward pass should be changed to equivalent torch.nn.Module.\n", + "AIMET user guide lists all these guidelines.\n", + "The following **ModelPreparer** API uses new graph transformation feature available in PyTorch 1.9+ version and automates model definition changes required to comply with the above guidelines. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_torch.model_preparer import prepare_model\n", + "\n", + "model = prepare_model(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "use_cuda = False\n", + "if torch.cuda.is_available():\n", + " use_cuda = True\n", + " model.to(torch.device('cuda'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 3. 
Create a quantization simulation model and determine quantized accuracy\n", + "\n", + "## Fold Batch Normalization layers\n", + "Before we determine the simulated quantized accuracy using QuantizationSimModel, we will fold the BatchNormalization (BN) layers in the model. These layers get folded into adjacent Convolutional layers. The BN layers that cannot be folded are left as they are.\n", + "\n", + "**Why do we need to this?**\n", + "On quantized runtimes (like TFLite, SnapDragon Neural Processing SDK, etc.), it is a common practice to fold the BN layers. Doing so, results in an inferences/sec speedup since unnecessary computation is avoided. Now from a floating point compute perspective, a BN-folded model is mathematically equivalent to a model with BN layers from an inference perspective, and produces the same accuracy. However, folding the BN layers can increase the range of the tensor values for the weight parameters of the adjacent layers. And this can have a negative impact on the quantized accuracy of the model (especially when using INT8 or lower precision). So, we want to simulate that on-target behavior by doing BN folding here.\n", + "\n", + "The following code calls AIMET to fold the BN layers in-place on the given model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from aimet_torch.batch_norm_fold import fold_all_batch_norms\n", + "\n", + "_ = fold_all_batch_norms(model, input_shapes=(1, 3, 224, 224))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Create Quantization Sim Model\n", + "\n", + "Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in the model graph and will configure them.\n", + "A few of the parameters are explained here\n", + "- **quant_scheme**: We set this to \"training_range_learning_with_tf_init\"\n", + " - This is the key setting that enables \"range learning\". With this choice of quant scheme, AIMET will use the TF quant scheme to initialize the quantization parameters like scale/offset. And then those parameters are set to be trainable so they can continue to be updated during fine-tuning.\n", + " - Another choice for quant_scheme is \"training_range_learning_with_tf_enhanced_init\". Similar to the above, but the initialization for scale/offset is doing using the TF Enhanced scheme. Since in both schemes the quantization parameters are set to be trainable, there is not much benefit to using this choice instead of \"training_range_learning_with_tf_init.\n", + "- **default_output_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision\n", + "- **default_param_bw**: Setting this to 8, essentially means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision\n", + "\n", + "There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "from aimet_torch.quantsim import QuantizationSimModel\n", + "\n", + "dummy_input = torch.rand(1, 3, 224, 224) # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)\n", + "if use_cuda:\n", + " dummy_input = dummy_input.cuda()\n", + "\n", + "sim = QuantizationSimModel(model=model,\n", + " quant_scheme=QuantScheme.training_range_learning_with_tf_init,\n", + " dummy_input=dummy_input,\n", + " default_output_bw=8,\n", + " default_param_bw=8)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We can check the modifications AIMET has made to the model graph. One way is to print the model, and we can see that AIMET has added quantization wrapper layers. Note: use sim.model to access the modified PyTorch model. By default, AIMET creates a copy of the original model prior to modifying it. There is a parameter to override this behavior." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "print(sim.model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "We can also check how AIMET has configured the added fake quantization nodes, which AIMET refers to as 'quantizers'. You can see this by printing the sim object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "print(sim)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Even though AIMET has added 'quantizer' nodes to the model graph but the model is not ready to be used yet. Before we can use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each 'quantizer' node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is sometimes referred to as calibration. AIMET simply refers to it as 'computing encodings'.\n", + "\n", + "So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don't need to compute any loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed. It's not necessary that all classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many different ways, this is just an example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "def pass_calibration_data(sim_model, use_cuda):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " if use_cuda:\n", + " device = torch.device('cuda')\n", + " else:\n", + " device = torch.device('cpu')\n", + "\n", + " sim_model.eval()\n", + " samples = 1000\n", + "\n", + " batch_cntr = 0\n", + " with torch.no_grad():\n", + " for input_data, target_data in data_loader:\n", + "\n", + " inputs_batch = input_data.to(device)\n", + " sim_model(inputs_batch)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization encodings. Encodings here refer to scale/offset quantization parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "sim.compute_encodings(forward_pass_callback=pass_calibration_data,\n", + " forward_pass_callback_args=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Now the QuantizationSim model is ready to be used for inference or training. First we can pass this model to the same evaluation routine we used before. The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization instead of the FP32 accuracy score we saw before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "## 4. Perform QAT\n", + "\n", + "To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.\n", + "\n", + "For the purpose of this example notebook, we are going to train only for 1 epoch. But feel free to change these parameters as you see fit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ImageNetDataPipeline.finetune(sim.model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10], use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "After we are done with QAT, we can run quantization simulation inference against the validation dataset at the end to observe any improvements in accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "finetuned_accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)\n", + "print(finetuned_accuracy)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "Depending on your settings you may have observed a slight gain in accuracy after one epoch of training. Ofcourse, this was just an example. Please try this against the model of your choice and play with the hyper-parameters to get the best results.\n", + "\n", + "So we have an improved model after QAT. Now the next step would be to actually take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. And also to export the encodings (scale/offset quantization parameters) that were updated during training since we employed QAT with range-learning. AIMET QuantizationSimModel provides an export API for this purpose." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "os.makedirs('./output/', exist_ok=True)\n", + "dummy_input = dummy_input.cpu()\n", + "sim.export(path='./output/', filename_prefix='resnet18_after_qat', dummy_input=dummy_input)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "---\n", + "## Summary\n", + "\n", + "Hope this notebook was useful for you to understand how to use AIMET for performing QAT with range-learning.\n", + "\n", + "Few additional resources\n", + "- Refer to the [AIMET API docs](https://quic.github.io/aimet-pages/AimetDocs/api_docs/index.html) to know more details of the APIs and optional parameters.\n", + "- Refer to the [other example notebooks](https://github.com/quic/aimet/tree/develop/Examples/torch/quantization) to understand how to use AIMET post-training quantization techniques and the vanilla QAT method (without range-learning)." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/releases/1.32.2/Examples/torch/quantization/quant_analyzer.html b/releases/1.32.2/Examples/torch/quantization/quant_analyzer.html new file mode 100644 index 00000000..2766763d --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/quant_analyzer.html @@ -0,0 +1,1466 @@ + + + + + + Quant Analyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quant Analyzer

+

This notebook showcases a working code example of how to use AIMET to apply Quant Analyzer. Quant Analyzer is a feature that performs various analyses on a model to understand how each layer in the model responds to quantization.

+
+

Overall flow

+

This notebook covers the following: 1. Instantiate the example evaluation pipeline 2. Load the FP32 model 3. Apply QuantAnalyzer to the model

+
+
+

What this notebook is not

+
    +
  • This notebook is not designed to show state-of-the-art results.

  • +
  • For example, it uses a relatively quantization-friendly model like Resnet18.

  • +
  • Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.

  • +
+
+
+

Dataset

+

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, use that. Otherwise, download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

+

Note1: The dataloader provided in this example notebook relies on the ImageNet dataset having the following characteristics - Subfolders ‘train’ for the training samples and ‘val’ for the validation samples. Please see the pytorch dataset description for more details. - A subdirectory per class, and a file per image sample

+

Note2: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.

+

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

+
+
[ ]:
+
+
+
DATASET_DIR = '/path/to/dataset/'         # Please replace this with a real directory
+
+
+
+
+
+
+

1. Example evaluation and training pipeline

+

The following is an example training and validation loop for this image classification task.

+
    +
  • Does AIMET have any limitations on how the training, validation pipeline is written?

    +

    Not really. We will see later that AIMET will modify the user’s model to create a QuantizationSim model, which is still a PyTorch model. This QuantizationSim model can be used in place of the original model when doing inference or training.

    +
  • +
  • Does AIMET put any limitation on the interface of the evaluate() or train() methods?

    +

    Not really. You should be able to use your existing evaluate and train routines as-is.

    +
  • +
+
+
[ ]:
+
+
+
import torch
+from Examples.common import image_net_config
+from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
+from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
+
+class ImageNetDataPipeline:
+
+    @staticmethod
+    def get_val_dataloader() -> torch.utils.data.DataLoader:
+        """
+        Instantiates a validation dataloader for ImageNet dataset and returns it
+        """
+        data_loader = ImageNetDataLoader(DATASET_DIR,
+                                         image_size=image_net_config.dataset['image_size'],
+                                         batch_size=image_net_config.evaluation['batch_size'],
+                                         is_training=False,
+                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
+        return data_loader
+
+    @staticmethod
+    def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:
+        """
+        Given a torch model, evaluates its Top-1 accuracy on the dataset
+        :param model: the model to evaluate
+        :param use_cuda: whether or not the GPU should be used.
+        """
+        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
+                                      batch_size=image_net_config.evaluation['batch_size'],
+                                      num_workers=image_net_config.evaluation['num_workers'])
+
+        return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)
+
+
+
+
+
+
+

2. Load the model

+

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. You can load any other pretrained PyTorch model instead.

+
+
[ ]:
+
+
+
from torchvision.models import resnet18
+
+model = resnet18(pretrained=True)
+
+
+
+

AIMET quantization simulation requires the user’s model definition to follow certain guidelines. For example, functionals defined in the forward pass should be changed to equivalent torch.nn.Module instances. The AIMET user guide lists all these guidelines.

+

The following ModelPreparer API uses the graph transformation feature available in PyTorch 1.9+ and automates the model definition changes required to comply with the above guidelines.

+
+
[ ]:
+
+
+
from aimet_torch.model_preparer import prepare_model
+
+model = prepare_model(model)
+
+
+
+
+

We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

+
+
[ ]:
+
+
+
use_cuda = False
+if torch.cuda.is_available():
+    use_cuda = True
+    model.to(torch.device('cuda'))
+
+
+
+
+
+
+

3. Apply QuantAnalyzer to the model

+

QuantAnalyzer requires two functions to be defined by the user for passing data through the model:

+

Forward pass callback

+

One function will be used to pass representative data through a quantized version of the model to calibrate quantization parameters. This function should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metrics, so we can just ignore the model output.

+

The function must take two arguments, the first of which will be the model to run the forward pass on. The second argument can be anything additional which the function requires to run, and can be in the form of a single item or a tuple of items.

+

If no additional argument is needed, the user can specify a dummy “_” parameter for the function.

+
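For illustration only, here is a minimal sketch (not part of the original notebook) of a forward pass callback that needs no additional inputs and therefore ignores its second argument. The random ImageNet-shaped batch is an assumption used purely to show the signature; real calibration should use representative samples, as in the routine shown further below.

```python
import torch

# Hypothetical sketch: a forward-pass callback that ignores its second argument.
# Random data is used only to illustrate the two-argument signature; representative
# samples (as in pass_calibration_data below) should be used in practice.
def pass_random_calibration_data(sim_model, _):
    device = next(sim_model.parameters()).device          # run on whichever device the model is on
    dummy_batch = torch.rand(8, 3, 224, 224).to(device)   # assumed ImageNet-like input shape
    sim_model.eval()
    with torch.no_grad():
        sim_model(dummy_batch)
```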

A few pointers regarding the forward pass data samples:

+
    +
  • In practice, we need a very small percentage of the overall data samples for computing encodings. For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.

  • +
  • It may be beneficial if the samples used for computing encodings are well distributed. It’s not necessary to cover all classes, since we are only looking at the range of values at every layer activation. However, we definitely want to avoid extreme scenarios, such as using only ‘dark’ or only ‘light’ samples - e.g. using only pictures captured at night might not give ideal results.

  • +
+

The following shows an example of a routine that passes unlabeled samples through the model for computing encodings. This routine can be written in many ways; this is just an example. This function only requires unlabeled data as no loss or other evaluation metric is needed.

+
+
[ ]:
+
+
+
def pass_calibration_data(sim_model, use_cuda):
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+    batch_size = data_loader.batch_size
+
+    if use_cuda:
+        device = torch.device('cuda')
+    else:
+        device = torch.device('cpu')
+
+    sim_model.eval()
+    samples = 1000
+
+    batch_cntr = 0
+    with torch.no_grad():
+        for input_data, target_data in data_loader:
+
+            inputs_batch = input_data.to(device)
+            sim_model(inputs_batch)
+
+            batch_cntr += 1
+            if (batch_cntr * batch_size) > samples:
+                break
+
+
+
+

In order to pass this function to QuantAnalyzer, we need to wrap it in a CallbackFunc object, as shown below. The CallbackFunc takes two arguments: the callback function itself, and the inputs to pass into the callback function.

+
+
[ ]:
+
+
+
from aimet_torch.quant_analyzer import CallbackFunc
+
+forward_pass_callback = CallbackFunc(pass_calibration_data, use_cuda)
+
+
+
+
+

Evaluation callback

+

The second function will be used to evaluate the model, and needs to return an accuracy metric. Here, the user can pass as much data through the model as they would like when evaluating the model for accuracy.

+

Like the forward pass callback, this function also must take exactly two arguments: the model to evaluate, and any additional argument needed for the function to work. The second argument can be a tuple of items in case multiple items are needed.

+
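As a hedged illustration of the tuple form (not part of the original example), the sketch below unpacks two items from its second argument. The callback name and the tuple contents are hypothetical; ImageNetEvaluator, DATASET_DIR and image_net_config are the objects imported and used earlier in this notebook.

```python
# Hypothetical sketch: an evaluation callback whose second argument is a tuple.
def eval_with_args(model, args):
    eval_iterations, cuda_flag = args      # unpack the tuple passed via CallbackFunc
    evaluator = ImageNetEvaluator(DATASET_DIR,
                                  image_size=image_net_config.dataset['image_size'],
                                  batch_size=image_net_config.evaluation['batch_size'],
                                  num_workers=image_net_config.evaluation['num_workers'])
    return evaluator.evaluate(model, iterations=eval_iterations, use_cuda=cuda_flag)

# It would then be wrapped with the tuple as the second CallbackFunc argument, e.g.:
# eval_callback = CallbackFunc(eval_with_args, (100, use_cuda))
```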

We will use the ImageNetDataPipeline’s evaluate() method defined above for this purpose. Like the forward pass callback, we need to wrap the evaluation callback in a CallbackFunc object as well.

+
+
[ ]:
+
+
+
eval_callback = CallbackFunc(ImageNetDataPipeline.evaluate, use_cuda)
+
+
+
+
+

Enabling MSE loss per layer analysis

+

An optional analysis step in QuantAnalyzer calculates the MSE loss per layer in the model, comparing the layer outputs from the original FP32 model vs. a quantized model. To perform this step, the user needs to also provide an unlabeled DataLoader to QuantAnalyzer.

+

We will demonstrate this step by using the ImageNetDataLoader imported above.

+
+
[ ]:
+
+
+
data_loader = ImageNetDataPipeline.get_val_dataloader()
+
+
+
+
+

QuantAnalyzer also requires a dummy input to the model. This dummy input does not need to be representative of the dataset. All that matters is that the input shape is correct for the model to run on.

+
+
[ ]:
+
+
+
dummy_input = torch.rand(1, 3, 224, 224)    # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)
+if use_cuda:
+    dummy_input = dummy_input.cuda()
+
+
+
+
+

We are now ready to apply QuantAnalyzer.

+
+
[ ]:
+
+
+
from aimet_torch.quant_analyzer import QuantAnalyzer
+
+quant_analyzer = QuantAnalyzer(model, dummy_input, forward_pass_callback, eval_callback)
+
+
+
+

To enable the MSE loss analysis, we set the following:

+
+
[ ]:
+
+
+
quant_analyzer.enable_per_layer_mse_loss(data_loader, num_batches=4)
+
+
+
+

Finally, to start the analyzer, we call .analyze().

+

A few of the parameters are explained here: - quant_scheme: We set this to “post_training_tf_enhanced”. With this choice of quant scheme, AIMET will use the TF Enhanced quant scheme to initialize the quantization parameters like scale/offset. - default_output_bw: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision. - default_param_bw: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.

+

There are other parameters that are set to default values in this example. Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.

+

When you call the analyze method, the following analyses are run:

+
    +
  • Compare fp32 accuracy, accuracy with only parameters quantized, and accuracy with only activations quantized

  • +
  • For each layer, track the model accuracy when quantization for all other layers is disabled (enabling quantization for only one layer in the model at a time)

  • +
  • For each layer, track the model accuracy when quantization for all other layers is enabled (disabling quantization for only one layer in the model at a time)

  • +
  • Track the minimum and maximum encoding parameters calculated by each quantizer in the model as a result of forward passes through the model with representative data

  • +
  • When the TF Enhanced quantization scheme is used, track the histogram of tensor ranges seen by each quantizer in the model as a result of forward passes through the model with representative data

  • +
  • If enabled, track the MSE loss seen at each layer by comparing layer outputs of the original fp32 model vs. a quantized model

  • +
+
+
[ ]:
+
+
+
from aimet_common.defs import QuantScheme
+
+quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
+                       default_param_bw=8,
+                       default_output_bw=8,
+                       config_file=None,
+                       results_dir="./tmp/")
+
+
+
+

AIMET will also output .html plots and json files where appropriate for each analysis to help visualize the data.

+

The following output files will be produced in the folder specified by the user. The output directory structure will look like this:

+
results_dir
+|-- per_layer_quant_enabled.html
+|-- per_layer_quant_enabled.json
+|-- per_layer_quant_disabled.html
+|-- per_layer_quant_disabled.json
+|-- min_max_ranges
+|   |-- activations.html
+|   |-- activations.json
+|   |-- weights.html
+|   +-- weights.json
+|-- activations_pdf
+|   |-- name_{input/output}_{index_0}.html
+|   |-- name_{input/output}_{index_1}.html
+|   |-- ...
+|   +-- name_{input/output}_{index_N}.html
+|-- weights_pdf
+|   |-- layer1
+|   |   |-- param_name_{channel_index_0}.html
+|   |   |-- param_name_{channel_index_1}.html
+|   |   |-- ...
+|   |   +-- param_name_{channel_index_N}.html
+|   |-- layer2
+|   |   |-- param_name_{channel_index_0}.html
+|   |   |-- param_name_{channel_index_1}.html
+|   |   |-- ...
+|   |   +-- param_name_{channel_index_N}.html
+|   |-- ...
+|   |-- layerN
+|   |   |-- param_name_{channel_index_0}.html
+|   |   |-- param_name_{channel_index_1}.html
+|   |   |-- ...
+|   +-- +-- param_name_{channel_index_N}.html
+|-- per_layer_mse_loss.html
++-- per_layer_mse_loss.json
+
+
+
+
+
+

Per-layer analysis by enabling/disabling quantization wrappers

+
    +
  • per_layer_quant_enabled.html: A plot with layers on the x-axis and model accuracy on the y-axis, where each layer’s accuracy represents the model accuracy when all quantizers in the model are disabled except for that layer’s parameter and activation quantizers.

  • +
  • per_layer_quant_enabled.json: A json file containing the data shown in per_layer_quant_enabled.html, associating layer names with model accuracy.

  • +
  • per_layer_quant_disabled.html: A plot with layers on the x-axis and model accuracy on the y-axis, where each layer’s accuracy represents the model accuracy when all quantizers in the model are enabled except for that layer’s parameter and activation quantizers.

  • +
  • per_layer_quant_disabled.json: A json file containing the data shown in per_layer_quant_disabled.html, associating layer names with model accuracy.

  • +
+

per_layer_quant_enabled.html

+
+
+
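Because the .json outputs are plain dictionaries mapping layer names to values, they can be inspected programmatically. The sketch below is an illustration only; it assumes analyze() was run with results_dir="./tmp/" as above and relies on the layer-name-to-accuracy mapping described for per_layer_quant_enabled.json.

```python
# Illustrative sketch (assumes results_dir="./tmp/" from the analyze() call above).
import json

with open("./tmp/per_layer_quant_enabled.json") as f:
    per_layer_acc = json.load(f)   # maps layer name -> accuracy with only that layer quantized

# Layers with the lowest accuracy here are the most sensitive to quantization.
for layer_name, acc in sorted(per_layer_acc.items(), key=lambda kv: kv[1])[:10]:
    print(f"{layer_name}: {acc:.4f}")
```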

Encoding min/max ranges

+
    +
  • min_max_ranges: A folder containing the following sets of files:

    +
      +
    • activations.html: A plot with output activations on the x-axis and min-max values on the y-axis, where each output activation’s range represents the encoding min and max parameters computed during forward pass calibration (explained below).

    • +
    • activations.json: A json file containing the data shown in activations.html, associating layer names with min and max encoding values.

    • +
    • weights.html: A plot with parameter names on the x-axis and min-max values on the y-axis, where each parameter’s range represents the encoding min and max parameters computed during forward pass calibration.

    • +
    • weights.json: A json file containing the data shown in weights.html, associating parameter names with min and max encoding values.

    • +
    +
  • +
+

min_max_ranges.html

+
+
+

PDF of statistics

+
    +
  • (If TF Enhanced quant scheme is used) activations_pdf: A folder containing html files for each layer, plotting the histogram of tensor values seen for that layer’s output activation seen during forward pass calibration.

  • +
  • (If TF Enhanced quant scheme is used) weights_pdf: A folder containing sub folders for each layer with weights. Each layer’s folder contains html files for each parameter of that layer, with a histogram plot of tensor values seen for that parameter seen during forward pass calibration.

  • +
+

weights_pdf.html

+
+
+

Per-layer MSE loss

+
    +
  • (Optional, if per layer MSE loss is enabled) per_layer_mse_loss.html: A plot with layers on the x-axis and MSE loss on the y-axis, where each layer’s MSE loss represents the MSE seen comparing that layer’s outputs in the FP32 model vs. the quantized model.

  • +
  • (Optional, if per layer MSE loss is enabled) per_layer_mse_loss.json: A json file containing the data shown in per_layer_mse_loss.html, associating layer names with MSE loss.

  • +
+

per_layer_mse_loss.html

+
+
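Similarly, the per-layer MSE results can be read back to find the layer whose outputs drift furthest from FP32 under quantization. This is a sketch under the same assumptions as above (results_dir="./tmp/" and a layer-name-to-MSE mapping in per_layer_mse_loss.json).

```python
# Illustrative sketch (assumes per_layer_mse_loss.json maps layer names to MSE loss).
import json

with open("./tmp/per_layer_mse_loss.json") as f:
    per_layer_mse = json.load(f)

# The layer with the largest MSE deviates most from its FP32 outputs when quantized.
worst_layer = max(per_layer_mse, key=per_layer_mse.get)
print(worst_layer, per_layer_mse[worst_layer])
```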
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/Examples/torch/quantization/quant_analyzer.ipynb b/releases/1.32.2/Examples/torch/quantization/quant_analyzer.ipynb new file mode 100644 index 00000000..69cd7f8a --- /dev/null +++ b/releases/1.32.2/Examples/torch/quantization/quant_analyzer.ipynb @@ -0,0 +1,574 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "collapsed": true + }, + "source": [ + "# Quant Analyzer\n", + "\n", + "This notebook showcases a working code example of how to use AIMET to apply Quant Analyzer.\n", + "Quant Analyzer is a feature which performs various analyses on a model to understand how each layer in the model responds to quantization.\n", + "\n", + "#### Overall flow\n", + "This notebook covers the following\n", + "1. Instantiate the example evaluation pipeline\n", + "2. Load the FP32 model\n", + "3. Apply QuantAnalyzer to the model\n", + "\n", + "\n", + "#### What this notebook is not\n", + "* This notebook is not designed to show state-of-the-art results.\n", + "* For example, it uses a relatively quantization-friendly model like Resnet18.\n", + "* Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "## Dataset\n", + "\n", + "This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Else, please download the dataset from appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).\n", + "\n", + "**Note1**: The ImageNet dataset typically has the following characteristics and the dataloader provided in this example notebook rely on these\n", + "- Subfolders 'train' for the training samples and 'val' for the validation samples. Please see the [pytorch dataset description](https://pytorch.org/vision/0.8/_modules/torchvision/datasets/imagenet.html) for more details.\n", + "- A subdirectory per class, and a file per each image sample\n", + "\n", + "**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. E.g. the entire ILSVRC2012 dataset has 1000 classes, 1000 training samples per class and 50 validation samples per class. But for the purpose of running this notebook, you could perhaps reduce the dataset to say 2 samples per class. This exercise is left upto the reader and is not necessary.\n", + "\n", + "Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "DATASET_DIR = '/path/to/dataset/' # Please replace this with a real directory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 1. Example evaluation and training pipeline\n", + "\n", + "The following is an example training and validation loop for this image classification task.\n", + "\n", + "- **Does AIMET have any limitations on how the training, validation pipeline is written?**\n", + "\n", + " Not really. 
We will see later that AIMET will modify the user's model to create a QuantizationSim model which is still a TensorFlow model.\n", + " This QuantizationSim model can be used in place of the original model when doing inference or training.\n", + "\n", + "- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?**\n", + "\n", + " Not really. You should be able to use your existing evaluate and train routines as-is." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import torch\n", + "from Examples.common import image_net_config\n", + "from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n", + "from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader\n", + "\n", + "class ImageNetDataPipeline:\n", + "\n", + " @staticmethod\n", + " def get_val_dataloader() -> torch.utils.data.DataLoader:\n", + " \"\"\"\n", + " Instantiates a validation dataloader for ImageNet dataset and returns it\n", + " \"\"\"\n", + " data_loader = ImageNetDataLoader(DATASET_DIR,\n", + " image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " is_training=False,\n", + " num_workers=image_net_config.evaluation['num_workers']).data_loader\n", + " return data_loader\n", + "\n", + " @staticmethod\n", + " def evaluate(model: torch.nn.Module, use_cuda: bool) -> float:\n", + " \"\"\"\n", + " Given a torch model, evaluates its Top-1 accuracy on the dataset\n", + " :param model: the model to evaluate\n", + " :param use_cuda: whether or not the GPU should be used.\n", + " \"\"\"\n", + " evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],\n", + " batch_size=image_net_config.evaluation['batch_size'],\n", + " num_workers=image_net_config.evaluation['num_workers'])\n", + "\n", + " return evaluator.evaluate(model, iterations=None, use_cuda=use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 2. Load the model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "For this example notebook, we are going to load a pretrained resnet18 model from torchvision.\n", + "Similarly, you can load any pretrained PyTorch model instead." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from torchvision.models import resnet18\n", + "\n", + "model = resnet18(pretrained=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "AIMET quantization simulation requires the user's model definition to follow certain guidelines.\n", + "For example, functionals defined in forward pass should be changed to equivalent torch.nn.Module.\n", + "AIMET user guide lists all these guidelines.\n", + "\n", + "The following **ModelPreparer** API uses new graph transformation feature available in PyTorch 1.9+ version and automates model definition changes required to comply with the above guidelines." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_torch.model_preparer import prepare_model\n", + "\n", + "model = prepare_model(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "We should decide whether to place the model on a CPU or CUDA device.\n", + "This example code will use CUDA if available in your current execution environment.\n", + "You can change this logic and force a device placement if needed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "use_cuda = False\n", + "if torch.cuda.is_available():\n", + " use_cuda = True\n", + " model.to(torch.device('cuda'))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "## 3. Apply QuantAnalyzer to the model\n", + "\n", + "QuantAnalyzer requires two functions to be defined by the user for passing data through the model:\n", + "\n", + "**Forward pass callback**\n", + "\n", + "One function will be used to pass representative data through a quantized version of the model to calibrate quantization parameters.\n", + "This function should be fairly simple - use the existing train or validation data loader to extract some samples and pass them to the model.\n", + "We don't need to compute any loss metrics, so we can just ignore the model output.\n", + "\n", + "The function **must** take two arguments, the first of which will be the model to run the forward pass on.\n", + "The second argument can be anything additional which the function requires to run, and can be in the form of a single item or a tuple of items.\n", + "\n", + "If no additional argument is needed, the user can specify a dummy \"_\" parameter for the function.\n", + "\n", + "A few pointers regarding the forward pass data samples:\n", + "\n", + "- In practice, we need a very small percentage of the overall data samples for computing encodings.\n", + " For example, the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 to 1000 samples.\n", + "- It may be beneficial if the samples used for computing encoding are well distributed.\n", + " It's not necessary that all classes need to be covered since we are only looking at the range of values at every layer activation.\n", + " However, we definitely want to avoid an extreme scenario like all 'dark' or 'light' samples are used - e.g. only using pictures captured at night might not give ideal results.\n", + "\n", + "The following shows an example of a routine that passes unlabeled samples through the model for computing encodings.\n", + "This routine can be written in many ways; this is just an example.\n", + "This function only requires unlabeled data as no loss or other evaluation metric is needed." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "def pass_calibration_data(sim_model, use_cuda):\n", + " data_loader = ImageNetDataPipeline.get_val_dataloader()\n", + " batch_size = data_loader.batch_size\n", + "\n", + " if use_cuda:\n", + " device = torch.device('cuda')\n", + " else:\n", + " device = torch.device('cpu')\n", + "\n", + " sim_model.eval()\n", + " samples = 1000\n", + "\n", + " batch_cntr = 0\n", + " with torch.no_grad():\n", + " for input_data, target_data in data_loader:\n", + "\n", + " inputs_batch = input_data.to(device)\n", + " sim_model(inputs_batch)\n", + "\n", + " batch_cntr += 1\n", + " if (batch_cntr * batch_size) > samples:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "In order to pass this function to QuantAnalyzer, we need to wrap it in a CallbackFunc object, as shown below.\n", + "The CallbackFunc takes two arguments: the callback function itself, and the inputs to pass into the callback function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_torch.quant_analyzer import CallbackFunc\n", + "\n", + "forward_pass_callback = CallbackFunc(pass_calibration_data, use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "**Evaluation callback**\n", + "\n", + "The second function will be used to evaluate the model, and needs to return an accuracy metric.\n", + "In here, the user should pass any amount of data through the model which they would like when evaluating their model for accuracy.\n", + "\n", + "Like the forward pass callback, this function also must take exactly two arguments: the model to evaluate, and any additional argument needed for the function to work.\n", + "The second argument can be a tuple of items in case multiple items are needed.\n", + "\n", + "We will be using the ImageNetDataPipeline's evaluate defined above for this purpose.\n", + "Like the forward pass callback, we need to wrap the evaluation callback in a CallbackFunc object as well." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "eval_callback = CallbackFunc(ImageNetDataPipeline.evaluate, use_cuda)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "**Enabling MSE loss per layer analysis**\n", + "\n", + "An optional analysis step in QuantAnalyzer calculates the MSE loss per layer in the model, comparing the layer outputs from the original FP32 model vs. a quantized model.\n", + "To perform this step, the user needs to also provide an unlabeled DataLoader to QuantAnalyzer.\n", + "\n", + "We will demonstrate this step by using the ImageNetDataLoader imported above." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "data_loader = ImageNetDataPipeline.get_val_dataloader()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "\n", + "QuantAnalyzer also requires a dummy input to the model.\n", + "This dummy input does not need to be representative of the dataset.\n", + "All that matters is that the input shape is correct for the model to run on." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "dummy_input = torch.rand(1, 3, 224, 224) # Shape for each ImageNet sample is (3 channels) x (224 height) x (224 width)\n", + "if use_cuda:\n", + " dummy_input = dummy_input.cuda()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "---\n", + "We are now ready to apply QuantAnalyzer." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_torch.quant_analyzer import QuantAnalyzer\n", + "\n", + "quant_analyzer = QuantAnalyzer(model, dummy_input, forward_pass_callback, eval_callback)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "To enable the MSE loss analysis, we set the following:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "quant_analyzer.enable_per_layer_mse_loss(data_loader, num_batches=4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "Finally, to start the analyzer, we call .analyze().\n", + "\n", + "A few of the parameters are explained here:\n", + "- **quant_scheme**:\n", + " - We set this to \"post_training_tf_enhanced\"\n", + " With this choice of quant scheme, AIMET will use the TF Enhanced quant scheme to initialize the quantization parameters like scale/offset.\n", + "- **default_output_bw**: Setting this to 8 means that we are asking AIMET to perform all activation quantizations in the model using integer 8-bit precision.\n", + "- **default_param_bw**: Setting this to 8 means that we are asking AIMET to perform all parameter quantizations in the model using integer 8-bit precision.\n", + "\n", + "There are other parameters that are set to default values in this example.\n", + "Please check the AIMET API documentation of QuantizationSimModel to see reference documentation for all the parameters.\n", + "\n", + "When you call the analyze method, the following analyses are run:\n", + "\n", + "- Compare fp32 accuracy, accuracy with only parameters quantized, and accuracy with only activations quantized\n", + "- For each layer, track the model accuracy when quantization for all other layers is disabled (enabling quantization for only one layer in the model at a time)\n", + "- For each layer, track the model accuracy when quantization for all other layers is enabled (disabling quantization for only one layer in the model at a time)\n", + "- Track the minimum and maximum encoding parameters calculated by each quantizer in the model as a result of forward passes through the model with representative data\n", + "- When the TF Enhanced quantization scheme is used, track the histogram of tensor ranges seen by each quantizer in the model as a result of forward passes through the model with representative data\n", + "- If enabled, track the MSE loss seen at each layer by comparing layer outputs of the original fp32 model vs. 
a quantized model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "from aimet_common.defs import QuantScheme\n", + "\n", + "quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,\n", + " default_param_bw=8,\n", + " default_output_bw=8,\n", + " config_file=None,\n", + " results_dir=\"./tmp/\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": false + }, + "source": [ + "AIMET will also output .html plots and json files where appropriate for each analysis to help visualize the data.\n", + "\n", + "The following output files will be produced, in a folder specified by the user:\n", + "Output directory structure will be like below\n", + "\n", + "```\n", + "results_dir\n", + "|-- per_layer_quant_enabled.html\n", + "|-- per_layer_quant_enabled.json\n", + "|-- per_layer_quant_disabled.html\n", + "|-- per_layer_quant_disabled.json\n", + "|-- min_max_ranges\n", + "| |-- activations.html\n", + "| |-- activations.json\n", + "| |-- weights.html\n", + "| +-- weights.json\n", + "|-- activations_pdf\n", + "| |-- name_{input/output}_{index_0}.html\n", + "| |-- name_{input/output}_{index_1}.html\n", + "| |-- ...\n", + "| +-- name_{input/output}_{index_N}.html\n", + "|-- weights_pdf\n", + "| |-- layer1\n", + "| | |-- param_name_{channel_index_0}.html\n", + "| | |-- param_name_{channel_index_1}.html\n", + "| | |-- ...\n", + "| | +-- param_name_{channel_index_N}.html\n", + "| |-- layer2\n", + "| | |-- param_name_{channel_index_0}.html\n", + "| | |-- param_name_{channel_index_1}.html\n", + "| | |-- ...\n", + "| | +-- param_name_{channel_index_N}.html\n", + "| |-- ...\n", + "| |-- layerN\n", + "| | |-- param_name_{channel_index_0}.html\n", + "| | |-- param_name_{channel_index_1}.html\n", + "| | |-- ...\n", + "| +-- +-- param_name_{channel_index_N}.html\n", + "|-- per_layer_mse_loss.html\n", + "+-- per_layer_mse_loss.json\n", + "```\n", + "\n", + "#### Per-layer analysis by enabling/disabling quantization wrappers\n", + "\n", + "- per_layer_quant_enabled.html: A plot with layers on the x-axis and model accuracy on the y-axis, where each layer's accuracy represents the model accuracy when all quantizers in the model are disabled except for that layer's parameter and activation quantizers.\n", + "- per_layer_quant_enabled.json: A json file containing the data shown in per_layer_quant_enabled.html, associating layer names with model accuracy.\n", + "- per_layer_quant_disabled.html: A plot with layers on the x-axis and model accuracy on the y-axis, where each layer's accuracy represents the model accuracy when all quantizers in the model are enabled except for that layer's parameter and activation quantizers.\n", + "- per_layer_quant_disabled.json: A json file containing the data shown in per_layer_quant_disabled.html, associating layer names with model accuracy.\n", + "\n", + "![per_layer_quant_enabled.html](./images/quant_analyzer_per_layer_quant_enabled.PNG)\n", + "\n", + "#### Encoding min/max ranges\n", + "\n", + "- min_max_ranges: A folder containing the following sets of files:\n", + " - activations.html: A plot with output activations on the x-axis and min-max values on the y-axis, where each output activation's range represents the encoding min and max parameters computed during forward pass calibration (explained below).\n", + " - activations.json: A json file containing the data shown in activations.html, associating layer names with min and max encoding values.\n", + " - 
weights.html: A plot with parameter names on the x-axis and min-max values on the y-axis, where each parameter's range represents the encoding min and max parameters computed during forward pass calibration.\n", + " - weights.json: A json file containing the data shown in weights.html, associating parameter names with min and max encoding values.\n", + "\n", + "![min_max_ranges.html](./images/quant_analyzer_min_max_ranges.PNG)\n", + "\n", + "#### PDF of statistics\n", + "\n", + "- (If TF Enhanced quant scheme is used) activations_pdf: A folder containing html files for each layer, plotting the histogram of tensor values seen for that layer's output activation seen during forward pass calibration.\n", + "- (If TF Enhanced quant scheme is used) weights_pdf: A folder containing sub folders for each layer with weights.\n", + " Each layer's folder contains html files for each parameter of that layer, with a histogram plot of tensor values seen for that parameter seen during forward pass calibration.\n", + "\n", + "![weights_pdf.html](./images/quant_analyzer_weights_pdf.PNG)\n", + "\n", + "#### Per-layer MSE loss\n", + "- (Optional, if per layer MSE loss is enabled) per_layer_mse_loss.html: A plot with layers on the x-axis and MSE loss on the y-axis, where each layer's MSE loss represents the MSE seen comparing that layer's outputs in the FP32 model vs. the quantized model.\n", + "- (Optional, if per layer MSE loss is enabled) per_layer_mse_loss.json: A json file containing the data shown in per_layer_mse_loss.html, associating layer names with MSE loss.\n", + "\n", + "![per_layer_mse_loss.html](./images/quant_analyzer_per_layer_mse_loss.PNG)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.6" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/releases/1.32.2/_images/AIMET_index_no_fine_tune.png b/releases/1.32.2/_images/AIMET_index_no_fine_tune.png new file mode 100644 index 00000000..59803a20 Binary files /dev/null and b/releases/1.32.2/_images/AIMET_index_no_fine_tune.png differ diff --git a/releases/1.32.2/_images/adaround.png b/releases/1.32.2/_images/adaround.png new file mode 100644 index 00000000..40c4c9a4 Binary files /dev/null and b/releases/1.32.2/_images/adaround.png differ diff --git a/releases/1.32.2/_images/auto_quant_v2_flowchart.png b/releases/1.32.2/_images/auto_quant_v2_flowchart.png new file mode 100644 index 00000000..3f97910b Binary files /dev/null and b/releases/1.32.2/_images/auto_quant_v2_flowchart.png differ diff --git a/releases/1.32.2/_images/bias_correction_analytical.png b/releases/1.32.2/_images/bias_correction_analytical.png new file mode 100644 index 00000000..92e930ac Binary files /dev/null and b/releases/1.32.2/_images/bias_correction_analytical.png differ diff --git a/releases/1.32.2/_images/bias_correction_empirical.png b/releases/1.32.2/_images/bias_correction_empirical.png new file mode 100644 index 00000000..7dfe940c Binary files /dev/null and b/releases/1.32.2/_images/bias_correction_empirical.png differ diff --git a/releases/1.32.2/_images/bn_reestimation.png b/releases/1.32.2/_images/bn_reestimation.png new file mode 100644 index 00000000..93c2c2aa Binary files /dev/null and b/releases/1.32.2/_images/bn_reestimation.png 
differ diff --git a/releases/1.32.2/_images/channel_pruning_1.png b/releases/1.32.2/_images/channel_pruning_1.png new file mode 100644 index 00000000..68953c87 Binary files /dev/null and b/releases/1.32.2/_images/channel_pruning_1.png differ diff --git a/releases/1.32.2/_images/cle_1.png b/releases/1.32.2/_images/cle_1.png new file mode 100644 index 00000000..56c09217 Binary files /dev/null and b/releases/1.32.2/_images/cle_1.png differ diff --git a/releases/1.32.2/_images/cle_4.png b/releases/1.32.2/_images/cle_4.png new file mode 100644 index 00000000..57741e47 Binary files /dev/null and b/releases/1.32.2/_images/cle_4.png differ diff --git a/releases/1.32.2/_images/cle_5.png b/releases/1.32.2/_images/cle_5.png new file mode 100644 index 00000000..07e35158 Binary files /dev/null and b/releases/1.32.2/_images/cle_5.png differ diff --git a/releases/1.32.2/_images/compression_flow.png b/releases/1.32.2/_images/compression_flow.png new file mode 100644 index 00000000..12d15895 Binary files /dev/null and b/releases/1.32.2/_images/compression_flow.png differ diff --git a/releases/1.32.2/_images/compression_use_case.PNG b/releases/1.32.2/_images/compression_use_case.PNG new file mode 100644 index 00000000..bc429099 Binary files /dev/null and b/releases/1.32.2/_images/compression_use_case.PNG differ diff --git a/releases/1.32.2/_images/cp_2.png b/releases/1.32.2/_images/cp_2.png new file mode 100644 index 00000000..d25a132b Binary files /dev/null and b/releases/1.32.2/_images/cp_2.png differ diff --git a/releases/1.32.2/_images/cp_3.jpg b/releases/1.32.2/_images/cp_3.jpg new file mode 100644 index 00000000..8c02570c Binary files /dev/null and b/releases/1.32.2/_images/cp_3.jpg differ diff --git a/releases/1.32.2/_images/cp_4.jpg b/releases/1.32.2/_images/cp_4.jpg new file mode 100644 index 00000000..c047526b Binary files /dev/null and b/releases/1.32.2/_images/cp_4.jpg differ diff --git a/releases/1.32.2/_images/flow_diagram_cle.png b/releases/1.32.2/_images/flow_diagram_cle.png new file mode 100644 index 00000000..a9e56767 Binary files /dev/null and b/releases/1.32.2/_images/flow_diagram_cle.png differ diff --git a/releases/1.32.2/_images/greedy_1.png b/releases/1.32.2/_images/greedy_1.png new file mode 100644 index 00000000..4e5afd97 Binary files /dev/null and b/releases/1.32.2/_images/greedy_1.png differ diff --git a/releases/1.32.2/_images/greedy_2.png b/releases/1.32.2/_images/greedy_2.png new file mode 100644 index 00000000..937d4b08 Binary files /dev/null and b/releases/1.32.2/_images/greedy_2.png differ diff --git a/releases/1.32.2/_images/greedy_3.png b/releases/1.32.2/_images/greedy_3.png new file mode 100644 index 00000000..0088528a Binary files /dev/null and b/releases/1.32.2/_images/greedy_3.png differ diff --git a/releases/1.32.2/_images/greedy_4.jpg b/releases/1.32.2/_images/greedy_4.jpg new file mode 100644 index 00000000..653fc39f Binary files /dev/null and b/releases/1.32.2/_images/greedy_4.jpg differ diff --git a/releases/1.32.2/_images/greedy_5.jpg b/releases/1.32.2/_images/greedy_5.jpg new file mode 100644 index 00000000..39b02ebe Binary files /dev/null and b/releases/1.32.2/_images/greedy_5.jpg differ diff --git a/releases/1.32.2/_images/keras_min_max_ranges.PNG b/releases/1.32.2/_images/keras_min_max_ranges.PNG new file mode 100644 index 00000000..ec86807a Binary files /dev/null and b/releases/1.32.2/_images/keras_min_max_ranges.PNG differ diff --git a/releases/1.32.2/_images/keras_per_layer_mse_loss.PNG b/releases/1.32.2/_images/keras_per_layer_mse_loss.PNG new file mode 
100644 index 00000000..3b8cfde4 Binary files /dev/null and b/releases/1.32.2/_images/keras_per_layer_mse_loss.PNG differ diff --git a/releases/1.32.2/_images/keras_per_layer_quant_enabled.PNG b/releases/1.32.2/_images/keras_per_layer_quant_enabled.PNG new file mode 100644 index 00000000..896279a6 Binary files /dev/null and b/releases/1.32.2/_images/keras_per_layer_quant_enabled.PNG differ diff --git a/releases/1.32.2/_images/keras_post_quant_layer.png b/releases/1.32.2/_images/keras_post_quant_layer.png new file mode 100644 index 00000000..1f2dba9c Binary files /dev/null and b/releases/1.32.2/_images/keras_post_quant_layer.png differ diff --git a/releases/1.32.2/_images/keras_pre_quant_layer.png b/releases/1.32.2/_images/keras_pre_quant_layer.png new file mode 100644 index 00000000..8fe929c5 Binary files /dev/null and b/releases/1.32.2/_images/keras_pre_quant_layer.png differ diff --git a/releases/1.32.2/_images/keras_quantsim_callflow.png b/releases/1.32.2/_images/keras_quantsim_callflow.png new file mode 100644 index 00000000..4dd3f052 Binary files /dev/null and b/releases/1.32.2/_images/keras_quantsim_callflow.png differ diff --git a/releases/1.32.2/_images/keras_weights_pdf.PNG b/releases/1.32.2/_images/keras_weights_pdf.PNG new file mode 100644 index 00000000..6b6dd360 Binary files /dev/null and b/releases/1.32.2/_images/keras_weights_pdf.PNG differ diff --git a/releases/1.32.2/_images/logo-quic-on@h68.png b/releases/1.32.2/_images/logo-quic-on@h68.png new file mode 100644 index 00000000..a83b3d27 Binary files /dev/null and b/releases/1.32.2/_images/logo-quic-on@h68.png differ diff --git a/releases/1.32.2/_images/mapping_between_onnx_tensor_names_and_encodings.png b/releases/1.32.2/_images/mapping_between_onnx_tensor_names_and_encodings.png new file mode 100644 index 00000000..6771595d Binary files /dev/null and b/releases/1.32.2/_images/mapping_between_onnx_tensor_names_and_encodings.png differ diff --git a/releases/1.32.2/_images/pytorch_model_prep_and_validate.PNG b/releases/1.32.2/_images/pytorch_model_prep_and_validate.PNG new file mode 100644 index 00000000..bec69113 Binary files /dev/null and b/releases/1.32.2/_images/pytorch_model_prep_and_validate.PNG differ diff --git a/releases/1.32.2/_images/quant_2.png b/releases/1.32.2/_images/quant_2.png new file mode 100644 index 00000000..5c81db4f Binary files /dev/null and b/releases/1.32.2/_images/quant_2.png differ diff --git a/releases/1.32.2/_images/quant_3.png b/releases/1.32.2/_images/quant_3.png new file mode 100644 index 00000000..3e1bdcc9 Binary files /dev/null and b/releases/1.32.2/_images/quant_3.png differ diff --git a/releases/1.32.2/_images/quant_analyzer_min_max_ranges.PNG b/releases/1.32.2/_images/quant_analyzer_min_max_ranges.PNG new file mode 100644 index 00000000..6497e28f Binary files /dev/null and b/releases/1.32.2/_images/quant_analyzer_min_max_ranges.PNG differ diff --git a/releases/1.32.2/_images/quant_analyzer_per_layer_mse_loss.PNG b/releases/1.32.2/_images/quant_analyzer_per_layer_mse_loss.PNG new file mode 100644 index 00000000..870841f3 Binary files /dev/null and b/releases/1.32.2/_images/quant_analyzer_per_layer_mse_loss.PNG differ diff --git a/releases/1.32.2/_images/quant_analyzer_per_layer_quant_enabled.PNG b/releases/1.32.2/_images/quant_analyzer_per_layer_quant_enabled.PNG new file mode 100644 index 00000000..d8e2ad06 Binary files /dev/null and b/releases/1.32.2/_images/quant_analyzer_per_layer_quant_enabled.PNG differ diff --git a/releases/1.32.2/_images/quant_analyzer_weights_pdf.PNG 
b/releases/1.32.2/_images/quant_analyzer_weights_pdf.PNG new file mode 100644 index 00000000..cb932964 Binary files /dev/null and b/releases/1.32.2/_images/quant_analyzer_weights_pdf.PNG differ diff --git a/releases/1.32.2/_images/quant_use_case_1.PNG b/releases/1.32.2/_images/quant_use_case_1.PNG new file mode 100644 index 00000000..93a7cb72 Binary files /dev/null and b/releases/1.32.2/_images/quant_use_case_1.PNG differ diff --git a/releases/1.32.2/_images/quant_use_case_2.PNG b/releases/1.32.2/_images/quant_use_case_2.PNG new file mode 100644 index 00000000..075ec7ee Binary files /dev/null and b/releases/1.32.2/_images/quant_use_case_2.PNG differ diff --git a/releases/1.32.2/_images/quant_use_case_3.PNG b/releases/1.32.2/_images/quant_use_case_3.PNG new file mode 100644 index 00000000..c38a7d23 Binary files /dev/null and b/releases/1.32.2/_images/quant_use_case_3.PNG differ diff --git a/releases/1.32.2/_images/quantization_debugging_flow_chart.png b/releases/1.32.2/_images/quantization_debugging_flow_chart.png new file mode 100644 index 00000000..8ed4aba9 Binary files /dev/null and b/releases/1.32.2/_images/quantization_debugging_flow_chart.png differ diff --git a/releases/1.32.2/_images/quantization_workflow.PNG b/releases/1.32.2/_images/quantization_workflow.PNG new file mode 100644 index 00000000..1618222f Binary files /dev/null and b/releases/1.32.2/_images/quantization_workflow.PNG differ diff --git a/releases/1.32.2/_images/quantsim_config_file.png b/releases/1.32.2/_images/quantsim_config_file.png new file mode 100644 index 00000000..a3d5c7a8 Binary files /dev/null and b/releases/1.32.2/_images/quantsim_config_file.png differ diff --git a/releases/1.32.2/_images/spatial_svd.png b/releases/1.32.2/_images/spatial_svd.png new file mode 100644 index 00000000..6686a254 Binary files /dev/null and b/releases/1.32.2/_images/spatial_svd.png differ diff --git a/releases/1.32.2/_images/tf_quant_analyzer_min_max_range_weights.png b/releases/1.32.2/_images/tf_quant_analyzer_min_max_range_weights.png new file mode 100644 index 00000000..b7536080 Binary files /dev/null and b/releases/1.32.2/_images/tf_quant_analyzer_min_max_range_weights.png differ diff --git a/releases/1.32.2/_images/tf_quant_analyzer_mse_loss.png b/releases/1.32.2/_images/tf_quant_analyzer_mse_loss.png new file mode 100644 index 00000000..fd19e4a4 Binary files /dev/null and b/releases/1.32.2/_images/tf_quant_analyzer_mse_loss.png differ diff --git a/releases/1.32.2/_images/tf_quant_analyzer_pdf.png b/releases/1.32.2/_images/tf_quant_analyzer_pdf.png new file mode 100644 index 00000000..a6082f1b Binary files /dev/null and b/releases/1.32.2/_images/tf_quant_analyzer_pdf.png differ diff --git a/releases/1.32.2/_images/tf_quant_analyzer_per_op_quant_enabled.png b/releases/1.32.2/_images/tf_quant_analyzer_per_op_quant_enabled.png new file mode 100644 index 00000000..8f821876 Binary files /dev/null and b/releases/1.32.2/_images/tf_quant_analyzer_per_op_quant_enabled.png differ diff --git a/releases/1.32.2/_images/vis_1.png b/releases/1.32.2/_images/vis_1.png new file mode 100644 index 00000000..78ace7ce Binary files /dev/null and b/releases/1.32.2/_images/vis_1.png differ diff --git a/releases/1.32.2/_images/vis_3.png b/releases/1.32.2/_images/vis_3.png new file mode 100644 index 00000000..a00c69be Binary files /dev/null and b/releases/1.32.2/_images/vis_3.png differ diff --git a/releases/1.32.2/_images/vis_4.png b/releases/1.32.2/_images/vis_4.png new file mode 100644 index 00000000..e6551c7c Binary files /dev/null and 
b/releases/1.32.2/_images/vis_4.png differ diff --git a/releases/1.32.2/_images/vis_5.png b/releases/1.32.2/_images/vis_5.png new file mode 100644 index 00000000..f25ccccb Binary files /dev/null and b/releases/1.32.2/_images/vis_5.png differ diff --git a/releases/1.32.2/_images/vis_6.png b/releases/1.32.2/_images/vis_6.png new file mode 100644 index 00000000..5e0dc63d Binary files /dev/null and b/releases/1.32.2/_images/vis_6.png differ diff --git a/releases/1.32.2/_images/vis_7.png b/releases/1.32.2/_images/vis_7.png new file mode 100644 index 00000000..bfc33864 Binary files /dev/null and b/releases/1.32.2/_images/vis_7.png differ diff --git a/releases/1.32.2/_images/weight_svd.png b/releases/1.32.2/_images/weight_svd.png new file mode 100644 index 00000000..1c2b548d Binary files /dev/null and b/releases/1.32.2/_images/weight_svd.png differ diff --git a/releases/1.32.2/_images/winnow_1.png b/releases/1.32.2/_images/winnow_1.png new file mode 100644 index 00000000..ccbb3bfe Binary files /dev/null and b/releases/1.32.2/_images/winnow_1.png differ diff --git a/releases/1.32.2/_images/winnow_2.png b/releases/1.32.2/_images/winnow_2.png new file mode 100644 index 00000000..bc5a123d Binary files /dev/null and b/releases/1.32.2/_images/winnow_2.png differ diff --git a/releases/1.32.2/_modules/aimet_common/bias_correction.html b/releases/1.32.2/_modules/aimet_common/bias_correction.html new file mode 100644 index 00000000..cb5e3891 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_common/bias_correction.html @@ -0,0 +1,1264 @@ + + + + + + aimet_common.bias_correction — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_common.bias_correction

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+"""  holds common code for bias correction """
+
+from aimet_common.defs import ActivationType
+from aimet_common.utils import AimetLogger
+from aimet_common.connected_graph.operation import Op
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Utils)
+
+CONV_OP_TYPES = ['Conv1d', 'Conv2D', 'DepthwiseConv2dNative', 'Conv', 'ConvTranspose', 'Conv3d']
+LINEAR_OP_TYPES = ['Dense', 'Gemm', 'MatMul']
+BN_OP_TYPES = ['FusedBatchNormV3', 'FusedBatchNorm', 'BatchNormalization', 'BatchNorm3d']
+
+
[docs]class ConvBnInfoType: + """ + Type for hoding convs with bn info and activation types + Activation types supported are Relu and Relu6 + """ + def __init__(self, + input_bn=None, + output_bn=None, + in_activation_type: ActivationType = ActivationType.no_activation, + out_activation_type: ActivationType = ActivationType.no_activation): + """ + :param input_bn: Reference to Input BatchNorm to layer + :param output_bn: Reference to Output BatchNorm to layer + :param in_activation_type: Type of Activation + :param out_activation_type: Type of Activation + """ + + self.input_bn = input_bn + self.output_bn = output_bn + self.in_activation_type = in_activation_type + self.out_activation_type = out_activation_type
+ + +class ConvBnPatternHandler: + """ + common handler for matched patterns for bias correction and batchnorm fold. + """ + + def __init__(self): + self.conv_linears_with_bn_dict = {} + + def get_conv_linear_bn_info_dict(self): + """ + returns the dictionary created + :return: dictionary of convs/linears with bn and activation info + """ + return self.conv_linears_with_bn_dict + + def __call__(self, *args, **kwargs): + """ + custom pattern match handler that keeps a dictionary of convs/linears with bn and activation info. + """ + + _, op_subset = args + + bn_activation_info = ConvBnInfoType() + + activation_type = ActivationType.no_activation + conv_op = None + bn_op = None + + for op in op_subset: + if op.type in CONV_OP_TYPES + LINEAR_OP_TYPES: + conv_op = op + op_key = get_op_dict_key(conv_op) + if op_key in self.conv_linears_with_bn_dict.keys(): + bn_activation_info = self.conv_linears_with_bn_dict[op_key] + elif op.type in BN_OP_TYPES: + bn_op = op + elif op.type in ['Relu6', 'Clip']: + activation_type = ActivationType.relu6 + elif op.type in ['Relu']: + activation_type = ActivationType.relu + + if len(op_subset) >= 2: + if op_subset[0].type in BN_OP_TYPES: + bn_activation_info.input_bn = bn_op + bn_activation_info.in_activation_type = activation_type + # we do not match linear layers with preceding bn for bias correction + elif op_subset[0].type in CONV_OP_TYPES + LINEAR_OP_TYPES: + bn_activation_info.output_bn = bn_op + bn_activation_info.out_activation_type = activation_type + # in tf linear layer has two ops together [flatten/reshape -- dense] , check for len 3 + elif len(op_subset) >= 3 and op_subset[1].type in ['Dense']: + bn_activation_info.output_bn = bn_op + bn_activation_info.out_activation_type = activation_type + op_key = get_op_dict_key(conv_op) + self.conv_linears_with_bn_dict[op_key] = bn_activation_info + + +def get_op_dict_key(op: Op): + """ + Returns the object to be used as a key in the conv/linear BN dict. + For torch and tensorflow models, returns op.get_module(). For onnx models, returns the original op. + + :param op: connected graph layer to be used as a dictionary key + :return: object (op or op.get_module()) to be used as a key in the conv/linear BN dict + """ + module = op.get_module() + # ONNX NodeProto objects are not hashable, return the original Op object instead + if module.__hash__ is None: + return op + return module +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_common/defs.html b/releases/1.32.2/_modules/aimet_common/defs.html new file mode 100644 index 00000000..cd94f274 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_common/defs.html @@ -0,0 +1,1549 @@ + + + + + + aimet_common.defs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_common.defs

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Common type definitions that are used across aimet """
+import io
+from enum import Enum
+from typing import Union, Callable, Any, Optional, Dict, List
+from decimal import Decimal
+
+from aimet_common.layer_database import Layer
+import aimet_common.libpymo as libpymo
+
+
+# supported quantization schemes
+
[docs]class QuantScheme(Enum): + """ Enumeration of Quant schemes""" + + post_training_tf = 1 + """ For a Tensor, the absolute minimum and maximum value of the Tensor are used to compute the Quantization + encodings. """ + post_training_tf_enhanced = 2 + """ For a Tensor, searches and selects the optimal minimum and maximum value that minimizes the Quantization Noise. + The Quantization encodings are calculated using the selected minimum and maximum value. """ + training_range_learning_with_tf_init = 3 + """ For a Tensor, the encoding values are initialized with the post_training_tf scheme. Then, the encodings are + learned during training. """ + training_range_learning_with_tf_enhanced_init = 4 + """ For a Tensor, the encoding values are initialized with the post_training_tf_enhanced scheme. Then, the encodings + are learned during training. """ + training_range_learning = 5 + post_training_percentile = 6 + """ For a Tensor, adjusted minimum and maximum values are selected based on the percentile value passed. + The Quantization encodings are calculated using the adjusted minimum and maximum value."""
+ +MAP_QUANT_SCHEME_TO_PYMO = {QuantScheme.post_training_tf: libpymo.QuantizationMode.QUANTIZATION_TF, + QuantScheme.post_training_tf_enhanced: + libpymo.QuantizationMode.QUANTIZATION_TF_ENHANCED, + QuantScheme.training_range_learning_with_tf_init: + libpymo.QuantizationMode.QUANTIZATION_TF, + QuantScheme.training_range_learning_with_tf_enhanced_init: + libpymo.QuantizationMode.QUANTIZATION_TF_ENHANCED, + QuantScheme.post_training_percentile: + libpymo.QuantizationMode.QUANTIZATION_PERCENTILE} +MAP_ROUND_MODE_TO_PYMO = {'nearest': libpymo.RoundingMode.ROUND_NEAREST, + 'stochastic': libpymo.RoundingMode.ROUND_STOCHASTIC} + +RANGE_LEARNING_SCHEMES = {QuantScheme.training_range_learning_with_tf_init, + QuantScheme.training_range_learning_with_tf_enhanced_init} + + +
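For reference, a short illustrative sketch of how these module-level mappings are consumed when configuring a quantizer; it is not lifted from a specific AIMET call site.

from aimet_common.defs import (QuantScheme, MAP_QUANT_SCHEME_TO_PYMO,
                               MAP_ROUND_MODE_TO_PYMO, RANGE_LEARNING_SCHEMES)

scheme = QuantScheme.training_range_learning_with_tf_init

# Range-learning schemes initialize encodings with their post-training counterpart.
pymo_mode = MAP_QUANT_SCHEME_TO_PYMO[scheme]            # QUANTIZATION_TF
round_mode = MAP_ROUND_MODE_TO_PYMO['nearest']          # ROUND_NEAREST
is_range_learning = scheme in RANGE_LEARNING_SCHEMES    # True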
[docs]class ActivationType(Enum): + """ Enums to identify activation type""" + no_activation = 0 + """ No activation """ + + relu = 1 + """ ReLU activation """ + + relu6 = 2 + """ ReLU6 activation """
+ + +
[docs]class CostMetric(Enum): + """ Enumeration of metrics to measure cost of a model/layer """ + + mac = 1 + """ MAC: Cost modeled for compute requirements """ + + memory = 2 + """ Memory: Cost modeled for space requirements """
+ + +
[docs]class CompressionScheme(Enum): + """ Enumeration of compression schemes supported in aimet """ + + weight_svd = 1 + """ Weight SVD """ + + spatial_svd = 2 + """ Spatial SVD """ + + channel_pruning = 3 + """ Channel Pruning """
+ + +class RankSelectScheme(Enum): + """ Enumeration of rank selection schemes supported in aimet """ + + greedy = 1 + """ Greedy scheme""" + + tar = 2 + """ TAR scheme """ + + +class LayerCompRatioPair: + """ + Models a pair of (layer: nn.Module, CompRatio: Decimal) + """ + + def __init__(self, layer: Layer, comp_ratio: Union[Decimal, None]): + """ + Constructor + :param layer: Reference to layer + :param comp_ratio: Comp-ratio as a floating point number between 0 and 1 + """ + self.layer = layer + self.comp_ratio = comp_ratio + + def __str__(self): + return 'LayerCompRatioPair: layer={}, comp-ratio={}'.format(self.layer.name, self.comp_ratio) + + +class LayerCompRatioEvalScore: + """ + Models data element with (layer: nn.Module, CompRatio: Decimal, EvalScore: Decimal) attributes + """ + + def __init__(self, layer: Layer, comp_ratio: Union[Decimal, None], eval_score: Optional[Union[Decimal, None]]): + """ + Constructor + :param layer: Reference to layer + :param comp_ratio: Comp-ratio as a floating point number between 0 and 1 + :param eval_score: Eval score as floating point number + """ + self.layer = layer + self.comp_ratio = comp_ratio + self.eval_score = eval_score + + def __str__(self): + return 'LayerCompRatioEvalScore: layer={}, comp-ratio={}, eval_score={}'. \ + format(self.layer.name, self.comp_ratio, self.eval_score) + + +class TarPerRankIndexData: + """ + TAR based algo stats require a combination of + (layer: nn.Module, CompRatio: Decimal, EvalScore:Decimal) per rank index to be stored + """ + + def __init__(self, layer: Layer, comp_ratio: Union[Decimal, None], eval_score: Union[Decimal, None]): + """ + Constructor + :param layer: Reference to layer + :param comp_ratio: Comp-ratio as a floating point number between 0 and 1 + :param eval_score: Eval score as a floating point number + """ + self.layer = layer + self.comp_ratio = comp_ratio + self.eval_score = eval_score + + def __str__(self): + return 'TarPerRankIndexData: layer={}, comp-ratio={}, eval-score={}'.format(self.layer.name, self.comp_ratio, + self.eval_score) + + +
[docs]class TarRankSelectionParameters: + """ + Configuration parameters for the TAR compression-ratio selection algorithm + + :ivar num_rank_indices: Number of rank indices for ratio selection. + + """ + def __init__(self, num_rank_indices: int): + + # Sanity check + if num_rank_indices < 2: + raise ValueError("Error: num_rank_indices={}. Need at least 2 candidates for " + "TAR based compression-ratio selection".format(num_rank_indices)) + + self.num_rank_indices = num_rank_indices
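A brief construction example for the TAR selection parameters above; the candidate count is an arbitrary illustrative value.

from aimet_common.defs import TarRankSelectionParameters

# At least 2 rank indices are required; fewer raises a ValueError.
tar_params = TarRankSelectionParameters(num_rank_indices=4)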
+ + +EvalFunction = Callable[[Any, Optional[int], bool], float] + + +
[docs]class GreedySelectionParameters: + """ + Configuration parameters for the Greedy compression-ratio selection algorithm + + :ivar target_comp_ratio: Target compression ratio. Expressed as value between 0 and 1. + Compression ratio is the ratio of cost of compressed model to cost of the original model. + :ivar num_comp_ratio_candidates: Number of comp-ratio candidates to analyze per-layer + More candidates allows more granular distribution of compression at the cost + of increased run-time during analysis. Default value=10. Value should be greater than 1. + :ivar use_monotonic_fit: If True, eval scores in the eval dictionary are fitted to a monotonically increasing + function. This is useful if you see the eval dict scores for some layers are not monotonically increasing. + By default, this option is set to False. + :ivar saved_eval_scores_dict: Path to the eval_scores dictionary pickle file that was + saved in a previous run. This is useful to speed-up experiments when trying + different target compression-ratios for example. aimet will save eval_scores + dictionary pickle file automatically in a ./data directory relative to the + current path. num_comp_ratio_candidates parameter will be ignored when this option is used. + """ + + def __init__(self, + target_comp_ratio: float, + num_comp_ratio_candidates: int = 10, + use_monotonic_fit: bool = False, + saved_eval_scores_dict: Optional[str] = None): + + self.target_comp_ratio = target_comp_ratio + + # Sanity check + if num_comp_ratio_candidates < 2: + raise ValueError("Error: num_comp_ratio_candidates={}. Need more than 1 candidate for " + "Greedy compression-ratio selection".format(num_comp_ratio_candidates)) + + self.num_comp_ratio_candidates = num_comp_ratio_candidates + self.use_monotonic_fit = use_monotonic_fit + self.saved_eval_scores_dict = saved_eval_scores_dict
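A brief construction example for the greedy selection parameters documented above; the target ratio and candidate count are arbitrary illustrative values.

from aimet_common.defs import GreedySelectionParameters

# Aim for a compressed model at 50% of the original cost, sweeping
# 20 compression-ratio candidates per layer.
greedy_params = GreedySelectionParameters(target_comp_ratio=0.5,
                                          num_comp_ratio_candidates=20,
                                          use_monotonic_fit=True)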
+ + +class GreedyCompressionRatioSelectionStats: + """ Statistics for the greedy compression-ratio selection algorithm """ + + def __init__(self, eval_scores_dict: Dict[str, Dict[Decimal, float]]): + """ + Constructor + :param eval_scores_dict: Dictionary of {layer_name: {compression_ratio: eval_score}} + """ + self.eval_scores_dictionary = eval_scores_dict + + def __str__(self): + stream = io.StringIO(newline='\n') + stream.write('\nGreedy Eval Dict\n') + layer_dict = self.eval_scores_dictionary + for layer in layer_dict: + stream.write(' Layer: {}\n'.format(layer)) + + for ratio in sorted(layer_dict[layer]): + stream.write(' Ratio={}, Eval score={}\n'.format(ratio, layer_dict[layer][ratio])) + + return stream.getvalue() + + +class TarCompressionRatioSelectionStats: + """ Statistics for the TAR compression-ratio selection algorithm """ + + def __init__(self, layers_comp_ratio_eval_score_per_rank_index): + """ + Constructor + :param layers_comp_ratio_eval_score_per_rank_index: List of [layer_name: compression_ratio: eval_score] params + """ + self.layers_comp_ratio_eval_score_per_rank_index = layers_comp_ratio_eval_score_per_rank_index + + def __str__(self): + stream = io.StringIO(newline='\n') + stream.write('\nTar Eval table\n') + for data_to_print in self.layers_comp_ratio_eval_score_per_rank_index: + stream.write(' Layer: {}\n'.format(data_to_print.layer)) + stream.write(' Ratio={}, Eval score={}\n'.format((data_to_print.comp_ratio), + (data_to_print.eval_score))) + + return stream.getvalue() + + +class CompressionStats: + """ Statistics generated during model compression """ + + class LayerStats: + """ Statistics for every layer in the model that was compressed """ + + def __init__(self, name: str, comp_ratio: Decimal): + self.name = name + self.compression_ratio = comp_ratio + + def __init__(self, base_accuracy: float, comp_accuracy: float, + mem_comp_ratio: Decimal, mac_comp_ratio: Decimal, + per_layer_stats: List[LayerStats], + comp_ratio_select_stats: Union[GreedyCompressionRatioSelectionStats, None]): + + self.baseline_model_accuracy = format(base_accuracy, '.6f') + self.compressed_model_accuracy = format(comp_accuracy, '.6f') + self.memory_compression_ratio = format(mem_comp_ratio, '.6f') + self.mac_compression_ratio = format(mac_comp_ratio, '.6f') + self.per_layer_stats = per_layer_stats + self.compression_ratio_selection_stats = comp_ratio_select_stats + + def __str__(self): + + stream = io.StringIO(newline='\n') + stream.write('**********************************************************************************************\n') + stream.write('Compressed Model Statistics\n') + stream.write('Baseline model accuracy: {}, Compressed model accuracy: {}\n' + .format(self.baseline_model_accuracy, + self.compressed_model_accuracy)) + stream.write('Compression ratio for memory={}, mac={}\n'.format(self.memory_compression_ratio, + self.mac_compression_ratio)) + stream.write('\n') + stream.write('**********************************************************************************************\n') + + stream.write('\nPer-layer Stats\n') + for layer in self.per_layer_stats: + stream.write(' Name:{}, compression-ratio: {}\n'.format(layer.name, + layer.compression_ratio)) + stream.write('\n') + stream.write('**********************************************************************************************\n') + + stream.write('{}\n'.format(self.compression_ratio_selection_stats)) + stream.write('**********************************************************************************************\n') + + 
return stream.getvalue() + + +class AdaroundConstants: + """ Constants used for Adarounding """ + + GAMMA = -0.1 + ZETA = 1.1 + + +class QuantizationDataType(Enum): + """ Enumeration of tensor quantizer data types supported """ + undefined = 0 + int = 1 + float = 2 + +class SupportedKernelsAction(Enum): + """ Enumeration to specify the action to apply during supported_kernels validation""" + allow_error = 1 + warn_on_error = 2 + assert_on_error = 3 + + +class QuantDtypeBwInfo: + """ + QuantDtypeBwInfo holds activation dtype/bw and param dtype/bw + """ + + + def __init__(self, act_dtype: QuantizationDataType, act_bw: int, + param_dtype: QuantizationDataType = QuantizationDataType.undefined, param_bw: int = 0): + """ + Data class to hold dtype and bw info + :param act_dtype: Activation datatype of type QuantizationDataType + :param act_bw: Activation bitwidth of type int + :param param_dtype: Param datatype of type QuantizationDataType + :param param_bw: Param bitwidth of type int + """ + self.act_dtype = act_dtype + self.act_bw = act_bw + self.param_dtype = param_dtype + self.param_bw = param_bw + self._validate_inputs() + + def __repr__(self): + return f'(activation:({self.act_dtype}, {self.act_bw}) param:({self.param_dtype}, {self.param_bw})' + + def __str__(self): + return f'activation:({self.act_dtype}, {self.act_bw}) param:({self.param_dtype}, {self.param_bw})' + + def __eq__(self, other): + return self.act_dtype == other.act_dtype and self.act_bw == other.act_bw and \ + self.param_dtype == other.param_dtype and self.param_bw == other.param_bw + + def _validate_inputs(self): + """ + Validate inputs + """ + if self.param_dtype and self.param_bw: + if self.param_dtype == QuantizationDataType.float and self.param_bw not in [16, 32]: + raise ValueError( + 'float param_dtype can only be used when param_bw is set to 16, not ' + str(self.param_bw)) + + if self.act_dtype == QuantizationDataType.float and self.act_bw not in [16, 32]: + raise ValueError( + 'float act_dtype can only be used when act_bw is set to 16, not ' + str(self.act_bw)) + + def is_same_activation(self, dtype: QuantizationDataType, bw: int): + """ + helper function to check if activation of the object is same as input + :param bw: bitwidth to verify against + :param dtype: dtype to verify against + """ + return bw == self.act_bw and dtype == self.act_dtype + + def is_same_param(self, dtype: QuantizationDataType, bw: int): + """ + helper function to check if param of the object is same as input + :param bw: bitwidth to verify against + :param dtype: dtype to verify against + """ + return bw == self.param_bw and dtype == self.param_dtype + + def get_activation(self) -> tuple: + """ getter method for activation candidate""" + return self.act_dtype, self.act_bw + + def get_param(self) -> tuple: + """ getter method for param candidate""" + return self.param_dtype, self.param_bw +
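To round out the definitions above, a small illustrative example of constructing a QuantDtypeBwInfo candidate and querying it; the specific bitwidth choices are arbitrary.

from aimet_common.defs import QuantDtypeBwInfo, QuantizationDataType

# INT8 activations with INT4 parameters (a common mixed-precision candidate).
candidate = QuantDtypeBwInfo(act_dtype=QuantizationDataType.int, act_bw=8,
                             param_dtype=QuantizationDataType.int, param_bw=4)

assert candidate.is_same_activation(QuantizationDataType.int, 8)
assert candidate.get_param() == (QuantizationDataType.int, 4)

# Float candidates are validated on construction: only 16- or 32-bit widths pass.
fp16_candidate = QuantDtypeBwInfo(QuantizationDataType.float, 16,
                                  QuantizationDataType.float, 16)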
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_common/utils.html b/releases/1.32.2/_modules/aimet_common/utils.html new file mode 100644 index 00000000..62e2ea8b --- /dev/null +++ b/releases/1.32.2/_modules/aimet_common/utils.html @@ -0,0 +1,1629 @@ + + + + + + aimet_common.utils — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_common.utils

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2018-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+""" Utility classes and functions that are used by NightlyTests files as well as
+    common to both PyTorch and TensorFlow. """
+
+import sys
+from contextlib import contextmanager
+import json
+import logging
+import logging.config
+import logging.handlers
+import math
+import os
+import signal
+import subprocess
+import threading
+import time
+from enum import Enum
+from typing import Callable, Dict, List, Optional, TextIO, Union, Any
+import multiprocessing
+import yaml
+from tqdm import tqdm
+from bokeh.server.server import Server
+from bokeh.application import Application
+
+SAVE_TO_YAML = False
+
+try:
+    # The build system updates Product, Version and Feature set information in the package_info file.
+    from aimet_common.package_info import Product, Version_Info, Postfix
+
+except ImportError:
+    # Default values for Product, Version and Feature set information.
+    Product = 'AIMET'
+    Version_Info = ''
+    Postfix = ''
+
+
+class ModelApi(Enum):
+    """ Enum differentiating between Pytorch or Tensorflow """
+    pytorch = 0
+    tensorflow = 1
+    keras = 2
+    onnx = 3
+
+
+
[docs]class CallbackFunc: + """ + Class encapsulating a callback function and its argument(s) + """ + def __init__(self, func: Callable, func_callback_args=None): + """ + :param func: Callable Function + :param func_callback_args: Arguments passed to the callable function as-is. + """ + self.func = func + self.args = func_callback_args
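A short usage sketch for CallbackFunc: it simply packages a callable together with the argument object a consuming API should later hand back to it. The evaluation function and its arguments below are hypothetical placeholders.

def eval_model(model, num_samples):
    """Hypothetical evaluation routine returning a scalar accuracy score."""
    ...

eval_callback = CallbackFunc(eval_model, func_callback_args=(my_model, 500))

# A consuming AIMET feature later retrieves eval_callback.func and
# eval_callback.args and invokes the function with them.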
+ + +class SingletonType(type): + """ SingletonType is used as a metaclass by other classes for which only one instance must be created. + + A metaclass inherits from "type' and it's instances are other classes. + """ + _instances = {} + + def __call__(cls, *args, **kwargs): + """ This function overrides the behavior of type's __call__ function. + + The overriding behavior is needed so that only one instance of the derived + class is created. The argument cls is a class variable (similar to self for instances). + + Using AimetLogger class as an example, when AimetLogger() is called, SingletonType + (the metaclass) class's __call__ is called which in turn calls AimetLogger's __call__ + creating an instance of AimetLogger. The creation happens only once, making + aimetLooger a singleton. + """ + if cls not in cls._instances: + cls._instances[cls] = super(SingletonType, cls).__call__(*args, **kwargs) + return cls._instances[cls] + + +class AimetLogger(metaclass=SingletonType): + """ The aimet Logger class. Multiple Area Loggers have been defined. + Each Area Logger could be set at a different logging level. """ + _logger = None + + class LogAreas(Enum): + """ Defines the LogAreas used in aimet. """ + Quant = 'Quant' + Svd = 'Svd' + Test = 'Test' + Utils = 'Utils' + CompRatioSelect = 'CompRatioSelect' + ChannelPruning = 'ChannelPruning' + Winnow = 'Winnow' + ConnectedGraph = 'ConnectedGraph' + CrosslayerEqualization = 'CrossLayerEqualization' + MixedPrecision = 'MixedPrecision' + AutoQuant = 'AutoQuant' + Nas = 'Nas' + NasPipeline = 'NasPipeline' + DeviceFramework = 'DeviceFramework' + BatchNormFolding = "BatchNormFolding" + ModelPreparer = "ModelPreparer" + LayerOutputs = 'LayerOutputs' + QuantAnalyzer = 'QuantAnalyzer' + SeqMse = 'SeqMse' + + def __init__(self): + self._logger = logging.getLogger() + + dir_name = os.path.dirname(__file__) + rel_path = "default_logging_config.json" + abs_file_path = os.path.join(dir_name, rel_path) + + with open(abs_file_path, encoding='utf-8') as logging_configuration_file: + try: + config_dict = json.loads(logging_configuration_file.read()) + except: # pylint: disable=raise-missing-from + raise ValueError("Logging configuration file: default_logging_config.json contains invalid format") + + logging.config.dictConfig(config_dict) + + # Validate JSON file default_logging_config.json for correct Logging Areas + #TODO This results in a pylint error: Instance of 'RootLogger' has no 'loggerDict' member. + # Need to fix this issue and then remove the pylint disablement. + configured_items = list(logging.root.manager.loggerDict.items()) # pylint: disable=no-member + + log_areas_list = list() + for x in AimetLogger.LogAreas: + log_areas_list.append(x.value) + + configured_areas_list = list() + for name, _ in configured_items: + configured_areas_list.append(name) + + for area in log_areas_list: + if area not in configured_areas_list: + raise ValueError(" ERROR: LogArea: {} NOT configured".format(area)) + + log_package_info() + + @staticmethod + def get_area_logger(area): + """ Returns a specific Area logger. """ + AimetLogger() + area_logger = logging.getLogger(area.value) + return area_logger + + @staticmethod + def set_area_logger_level(area, level): + """ Sets a logging level for a single area logger. """ + area_logger = logging.getLogger(area.value) + area_logger.setLevel(level) + + @staticmethod + def set_level_for_all_areas(level): + """ Sets the same logging level for all area debuggers. 
""" + for area in AimetLogger.LogAreas: + AimetLogger.set_area_logger_level(area, level) + +def log_with_error_and_assert_if_false(condition: bool, logger: logging.Logger, error_msg: str): + """ + If condition is false, log an error and assert with the same error message. + + :param condition: Condition to check + :param logger: Logger to log error with + :param error_msg: Error message string + """ + if not condition: + logger.error(error_msg) + assert condition, error_msg + +def round_up_to_multiplicity(multiplicity: int, num: int, max_allowable_num: int): + """ + Function to round a number to the nearest multiplicity given the multiplicity + :param multiplicity: multiplicity for rounding + :param num: input number to be rounded + :param max_allowable_num: maximum value for num allowed + :return: number rounded up to nearest multiplicity + """ + larger_multiple = math.ceil(float(num) / float(multiplicity)) * multiplicity + if larger_multiple >= max_allowable_num: + return max_allowable_num + return int(larger_multiple) + + +def round_down_to_multiplicity(multiplicity: int, num: int): + """ + Function to round a number to the nearest multiplicity given the multiplicity + :param multiplicity: multiplicity for rounding + :param num: input number to be rounded + :return: number rounded down to nearest multiplicity + """ + if num - multiplicity <= 0: + return num + + if num % multiplicity == 0: + num = num - 1 + lower_multiple = math.floor(float(num) / float(multiplicity)) * multiplicity + return int(lower_multiple) + + +# Depending on pytorch or tensorflow, the ordering of dimensions in tensor/product shapes will be different. +# In pytorch, the number of channels is always index 1 +# In tensorflow, the number of channels is always the last dimension in the shape +api_channel_index_dict = {ModelApi.pytorch: 1, ModelApi.tensorflow: -1} + + +def kill_process_with_name_and_port_number(name: str, port_number: int): + """ Kill a process that is associated with a port number displayed by the command: ps -x """ + + logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Utils) + p = subprocess.Popen(['ps', '-x'], stdout=subprocess.PIPE) # pylint: disable=consider-using-with + out, _ = p.communicate() + + for line in out.splitlines(): + str_line = line.decode() + port_num_str = str(port_number) + if name in str_line and '--port=' + port_num_str in str_line: + pid = int(line.split(None, 1)[0]) + logger.info("Killing Bokeh server with process id: %s", format(pid)) + os.kill(pid, signal.SIGKILL) + break + + +def start_bokeh_server_session(port: int = None): + """ + start a bokeh server programmatically. Used for testing purposes. + :param port: Port number. If not specified, bokeh server will listen on an arbitrary free port. + :return: Returns the Bokeh Server URL and the process object used to create the child server process + """ + manager = multiprocessing.Manager() + d = manager.dict() + server_started = manager.Event() + + def start_bokeh_server(port: int = None): + os.setsid() + + # If port is 0, server automatically finds and listens on an arbitrary free port. 
+ port = port or 0 + try: + server = Server({'/': Application()}, port=port) + server.start() + d['port'] = server.port + server_started.set() + server.run_until_shutdown() + except Exception as e: + d['exception'] = e + raise + + proc = multiprocessing.Process(target=start_bokeh_server, args=(port,)) + + proc.start() + server_started.wait(timeout=10) + + if 'port' not in d: + if proc: + proc.terminate() + + if 'exception' in d: + e = d['exception'] + raise RuntimeError(f'Bokeh server failed with the following error: {e}') + + raise RuntimeError('Bokeh Server failed with an unknown error') + + port = d['port'] + address = f'http://localhost:{port}' + + return address, proc + + +def log_package_info(): + """ + Log the Product, Version and Postfix. + :return: + """ + + # The Product is always a non-empty string. + if Version_Info != '' and Postfix != '': + # Log Product-Version-Postfix + logging.info("%s-%s-%s", Product, Version_Info, Postfix) + elif Version_Info != '' and Postfix == '': + # Log Product-Version + logging.info("%s-%s", Product, Version_Info) + else: + # If Version is empty, the Postfix is not logged. + # Log Product. + logging.info("%s", Product) + + +def save_json_yaml(file_path: str, dict_to_save: dict): + """ + Function which saves encoding in YAML and JSON file format + :param file_path: file name to use to generate the yaml and json file + :param dict_to_save: dictionary to save + """ + encoding_file_path_json = file_path + with open(encoding_file_path_json, 'w') as encoding_fp_json: + json.dump(dict_to_save, encoding_fp_json, sort_keys=True, indent=4) + + if SAVE_TO_YAML: + encoding_file_path_yaml = file_path + '.yaml' + with open(encoding_file_path_yaml, 'w') as encoding_fp_yaml: + yaml.dump(dict_to_save, encoding_fp_yaml, default_flow_style=False, allow_unicode=True) + + +class TqdmStreamHandler(logging.StreamHandler): + """ + Logging handler for tqdm. + """ + def emit(self, record): + with tqdm.external_write_mode(file=self.stream): + super().emit(record) + + + +class Spinner(tqdm): + """ + Simple spinner that displays what's being performed under the hood. + This is helpful for providing a cue to the users that something is in + progress (not blocked) when showing a progress bar is not always possible, + e.g. when there is no loop, when the loop resides in the library, etc. + + NOTE: Being a subclass of tqdm, we should use AimetLogger when spinner is + activated to keep the standard output look as neat as it should be. + + Typical usage:: + >>> def do_something(): + ... do_part_1() + ... logger.info("Part 1 done") + ... do_part_2() + ... logger.info("Part 2 done") + ... do_part_3() + ... logger.info("Part 3 done") + ... + ... with Spinner("Doing task A"): + ... do_something() + Part 1 done + Part 2 done + Part 3 done + / Doing task A <- Spinning at the bottom until the end of with block + + This can also be used in a nested manner:: + >>> with Spinner("Doing task A"): + ... with Spinner("Part 1 in progress..."): + ... do_part_1() + ... with Spinner("Part 2 in progress..."): + ... do_part_2() + ... with Spinner("Part 3 in progress..."): + ... do_part_3() + / Doing task A <- Two spinners spinning independently + - Part 1 in progress... <- Two spinners spinning independently + """ + prefixes = ["/", "-", "\\", "|"] + + def __init__(self, title: str, refresh_interval: float = 0.5): + """ + :param title: Title that the spinner will display. + :param refresh_interval: Time interval (unit: sec) of refreshing the spinner. 
+ """ + def refresh_in_loop(): + while not self._stop.is_set(): + with self._lock: + self._index = (self._index + 1) % len(self.prefixes) + self.refresh(nolock=True) + time.sleep(refresh_interval) + + self._index = 0 + self._stop = threading.Event() + self._refresh_thread = threading.Thread(target=refresh_in_loop) + self._messages = [ + f"{prefix} {title}" for prefix in self.prefixes + ] + + super().__init__() + + def __str__(self): + return self._messages[self._index] + + def __enter__(self): + self._refresh_thread.start() + return super().__enter__() + + def __exit__(self, *args, **kwargs): # pylint: disable=arguments-differ + self._stop.set() + self._refresh_thread.join() + super().__exit__(*args, **kwargs) + + +class Handle: + """ Removable handle. """ + + def __init__(self, cleanup_fn): + self._cleanup_fn = cleanup_fn + self._removed = False + + def remove(self): + """ Run clean up function """ + if not self._removed: + self._cleanup_fn() + self._removed = True + + def __enter__(self): + return self + + def __exit__(self, *_): + self.remove() + + +def convert_configs_values_to_bool(dictionary: Dict): + """ + Recursively traverse all key value pairs in dictionary and set any string values representing booleans to + booleans. + :param dictionary: Dictionary to set values to True or False if applicable + """ + for key, value in dictionary.items(): + if value == 'True': + dictionary[key] = True + elif value == 'False': + dictionary[key] = False + elif isinstance(value, List): + for item in value: + if isinstance(item, Dict): + convert_configs_values_to_bool(item) + elif isinstance(value, Dict): + convert_configs_values_to_bool(value) + else: + pass + + +@contextmanager +def profile(label: str, file: Union[str, os.PathLike, TextIO] = None, new_file: bool = False, logger: Optional[logging.Logger] = None, + cleanup: Callable[[], Any] = None): + """ + Profile a block of code and save profiling information into a file. + + :param label: String label associated with the block of code to profile (shows up in the profiling print) + :param file: File path and name or a file-like object to send output text to (Default: stdout) + :param new_file: True if a new file is to be created to hold profiling info, False if an existing file should be + appended to. This flag is only valid when ``file`` is a path, not a file-like object. + :param logger: If logger is provided, profiling string will also be printed with INFO logging level + :param cleanup: If provided, this will be called before ending profiling. This can be useful for synchronizing cuda streams. + """ + should_close = False + if isinstance(file, (str, os.PathLike)): + mode = 'w' if new_file else 'a' + file = open(file, mode) # pylint: disable=consider-using-with + should_close = True + elif file is None: + file = sys.stdout + + assert hasattr(file, 'write') + + try: + with Spinner(label): + start = time.perf_counter() + yield + if cleanup: + cleanup() + end = time.perf_counter() + + profiling_string = f'{label}: {end - start:.2f}s' + + if logger: + logger.info(profiling_string) + + print(profiling_string, file=file) + finally: + if should_close: + file.close() +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/adaround/adaround_weight.html b/releases/1.32.2/_modules/aimet_tensorflow/adaround/adaround_weight.html new file mode 100644 index 00000000..9c78abbd --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/adaround/adaround_weight.html @@ -0,0 +1,1540 @@ + + + + + + aimet_tensorflow.adaround.adaround_weight — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +

Source code for aimet_tensorflow.adaround.adaround_weight

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2021-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Top level API for Adaptive Rounding - Post-Training Quantization (PTQ) """
+
+import os
+import json
+import shutil
+from typing import List, Tuple, Callable, Union, Dict
+import numpy as np
+from tqdm import tqdm
+import tensorflow as tf
+
+# Import AIMET specific modules
+import aimet_common.libpymo as libpymo
+from aimet_common.utils import AimetLogger
+from aimet_common.defs import QuantScheme
+from aimet_common.quantsim_config.json_config_importer import JsonConfigImporter, ConfigDictKeys, ConfigDictType
+from aimet_tensorflow.utils import graph_saver
+from aimet_tensorflow.utils.common import get_ordered_ops
+from aimet_tensorflow.utils.op.conv import WeightTensorUtils
+from aimet_tensorflow.quantsim_config.quantsim_config import MAP_TF_PARAM_NAME_TO_QUANTSIM_NAME
+from aimet_tensorflow.adaround.activation_sampler import ActivationSampler
+from aimet_tensorflow.adaround.adaround_loss import AdaroundHyperParameters
+from aimet_tensorflow.adaround.adaround_optimizer import AdaroundOptimizer
+from aimet_tensorflow.adaround.adaround_wrapper import AdaroundWrapper
+
+AdaroundSupportedOps = ('Conv2D', 'DepthwiseConv2dNative', 'MatMul', 'Conv2DBackpropInput')
+
+ActFuncMap = {'Relu': tf.nn.relu, 'Relu6': tf.nn.relu6, 'Tanh': tf.nn.tanh, 'Sigmoid': tf.nn.sigmoid,
+              'Softmax': tf.nn.softmax}
+
+tf_op_type_to_onnx_type_dict = {
+        "Conv2D": "Conv",
+        "DepthwiseConv2dNative": "Conv",
+        "MatMul": "Gemm",
+    }
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+WORKING_DIR = '/tmp/adaround/'
+
+
+
[docs]class AdaroundParameters: + """ + Configuration parameters for Adaround + """ + def __init__(self, data_set: tf.data.Dataset, num_batches: int, default_num_iterations: int = 10000, + default_reg_param: float = 0.01, default_beta_range: Tuple = (20, 2), default_warm_start: float = 0.2): + """ + :param data_set: TF Data set + :param num_batches: Number of batches + :param default_num_iterations: Number of iterations to adaround each layer. Default 10000 + :param default_reg_param: Regularization parameter, trading off between rounding loss vs reconstruction loss. + Default 0.01 + :param default_beta_range: Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta). + Default (20, 2) + :param default_warm_start: warm up period, during which rounding loss has zero effect. Default 20% (0.2) + """ + self.data_set = data_set + self.num_batches = num_batches + self.num_iterations = default_num_iterations + self.reg_param = default_reg_param + self.beta_range = default_beta_range + self.warm_start = default_warm_start + + def __eq__(self, other: "AdaroundParameters"): + return self.data_set == other.data_set and\ + self.num_batches == other.num_batches and\ + self.num_iterations == other.num_iterations and\ + self.reg_param == other.reg_param and\ + self.beta_range == other.beta_range and\ + self.warm_start == other.warm_start
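A construction sketch for AdaroundParameters using an unlabeled tf.data.Dataset; `calibration_images` and the reduced iteration count are illustrative placeholders only.

import tensorflow as tf
from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters

# `calibration_images` is assumed to be a numpy array of preprocessed inputs.
dataset = tf.data.Dataset.from_tensor_slices(calibration_images).batch(32)

params = AdaroundParameters(data_set=dataset,
                            num_batches=16,                 # batches drawn per layer
                            default_num_iterations=1000,    # reduced from 10000 for a quick run
                            default_reg_param=0.01,
                            default_beta_range=(20, 2),
                            default_warm_start=0.2)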
+ + +class Adaround: + """ + Weight-rounding mechanism for Post Training Quantization (PTQ) + """ + @classmethod + def apply_adaround(cls, session: tf.compat.v1.Session, starting_op_names: List[str], output_op_names: List[str], + params: AdaroundParameters, path: str, filename_prefix: str, default_param_bw: int = 4, + default_quant_scheme: QuantScheme = QuantScheme.post_training_tf_enhanced, + default_config_file: str = None) -> tf.compat.v1.Session: + """ + Returns Tf session - model with optimized weight rounding of every op (Conv and Linear) and also saves the + corresponding quantization encodings to a separate JSON-formatted file that can then be imported by + QuantSim for inference or QAT + + :param session: Tf session with model to adaround + :param starting_op_names: List of starting op names of the model + :param output_op_names: List of output op names of the model + :param params: Parameters for adaround + :param path: path where to store parameter encodings + :param filename_prefix: Prefix to use for filename of the encodings file + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters. Default 4 + :param default_quant_scheme: Quantization scheme. Supported options are QuantScheme.post_training_tf or + QuantScheme.post_training_tf_enhanced. Default QuantScheme.post_training_tf_enhanced + :param default_config_file: Default configuration file for model quantizers + :return: Tf session with Adarounded weight and saves corresponding parameter encodings JSON file + at provided path + """ + # pylint: disable=too-many-arguments + if not os.path.exists(WORKING_DIR): + os.makedirs(WORKING_DIR) + + param_encodings,\ + session_soft_rounded_weight = cls._apply_adaround_helper(session, + starting_op_names, + output_op_names, + params, + default_param_bw, + default_quant_scheme, + default_config_file) + + # Export quantization encodings to JSON-formatted file at provided path + cls.export_encoding_to_json(path, filename_prefix, param_encodings) + + if os.path.exists(WORKING_DIR): + logger.info('Deleting temporary working directory %s', WORKING_DIR) + shutil.rmtree(WORKING_DIR) + + logger.info('Completed Adarounding Model') + + return session_soft_rounded_weight + + @classmethod + def _apply_adaround_helper( # pylint: disable=too-many-locals + cls, + session: tf.compat.v1.Session, + starting_op_names: List[str], + output_op_names: List[str], + params: AdaroundParameters, + param_bw: int, + quant_scheme: QuantScheme, + config_file: str, + ) -> Tuple[Dict, tf.compat.v1.Session]: + """ + Helper for apply_adaround(). + + NOTE: Soft rounding is only used for op-wise optimization procedure as we need gradients + for the rounding to be learned and after that we switch to hard rounding (i.e. using + true fixed point numbers) to be used for collecting later layers activations data. + + When optimization is fully converged (i.e. wrapper.alpha is always exact 0 or 1), there + is no difference between soft rounding and hard rounding. + + :param session: Tf session with model to adaround. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :param params: Parameters for adaround. + :param param_bw: bitwidth (4-31) to use for quantizing layer parameters. + :param quant_scheme: Quantization scheme. + :param config_file: configuration file. + :return: Dictionary containing encoding for adarounded parameters, + TF session with soft rounding weights. 
+ """ + # Create copies which will have model's weights quantized with hard and soft rounding. + session_hard_rounded_weight = graph_saver.save_and_load_graph(WORKING_DIR, session) + session_soft_rounded_weight = graph_saver.save_and_load_graph(WORKING_DIR, session) + + # Get parameters from config file. + configs, strict_symmetric, unsigned_symmetric, enable_per_channel = Adaround.get_config_dict_keys(config_file) + + # Optimization Hyper parameters + opt_params = AdaroundHyperParameters(params.num_iterations, params.reg_param, params.beta_range, + params.warm_start) + # Activation sampler + act_sampler = ActivationSampler(params.data_set) + + # Get Adaround supported ops based on occurrence in the model + ordered_ops = cls._get_ordered_list_of_ops(session.graph, starting_op_names, output_op_names) + + param_encodings = {} + for op in tqdm(ordered_ops): + logger.info("Started Optimizing weight rounding of op: %s", op.name) + + # Using name, get corresponding op from session with soft and hard rounded weights. + hard_rounded_op = session_hard_rounded_weight.graph.get_operation_by_name(op.name) + soft_rounded_op = session_soft_rounded_weight.graph.get_operation_by_name(op.name) + + # Collect input and output activations data + all_inp_data, all_out_data = act_sampler.sample_activation(op, hard_rounded_op, session, + session_hard_rounded_weight, starting_op_names, + params.num_batches) + is_symmetric = cls.get_is_symmetric_flag_for_op_param(configs, op.type, + param_name="weight", + framework_to_onnx_type_dict=tf_op_type_to_onnx_type_dict) + + + # Find next following activation function + act_func = cls._get_act_func(op) + + # Perform Adaround optimization in separate graph + graph = tf.Graph() + with graph.as_default(): + output_height, output_width, output_channels = None, None, None + if op.type == 'Conv2DBackpropInput': + output_height, output_width, output_channels = \ + cls.get_conv2d_transpose_output_tensor_shape(op.get_attr("data_format").decode('utf-8'), + all_out_data) + wrapper = AdaroundWrapper(session, op, param_bw, quant_scheme, is_symmetric, + strict_symmetric, unsigned_symmetric, enable_per_channel, output_height, + output_width, output_channels) + hard_rounded_weight, soft_rounded_weight = AdaroundOptimizer().adaround_wrapper(wrapper, act_func, + all_inp_data, + all_out_data, + opt_params) + + # Update param encodings dictionary + cls._update_param_encodings_dict(param_encodings, op, wrapper.encoding, is_symmetric) + + # Update with hard and soft rounded weights + WeightTensorUtils.update_tensor_for_op(session_hard_rounded_weight, hard_rounded_op, hard_rounded_weight) + WeightTensorUtils.update_tensor_for_op(session_soft_rounded_weight, soft_rounded_op, soft_rounded_weight) + + # Close intermediate session + session_hard_rounded_weight.close() + + return param_encodings, session_soft_rounded_weight + + @staticmethod + def get_config_dict_keys(config_file: str) -> Tuple[ConfigDictType, bool, bool, bool]: + """ + Get config dictionary keys from config file. Config file will default if no provided one. + :param config_file: configuration file. + :return: Config dictionary, strict symmetric flag, unsigned symmetric flag, enable per channel flag. + """ + configs = JsonConfigImporter.import_json_config_file(config_file) + strict_symmetric = configs[ConfigDictKeys.DEFAULTS].get(ConfigDictKeys.STRICT_SYMMETRIC, False) + unsigned_symmetric = configs[ConfigDictKeys.DEFAULTS].get(ConfigDictKeys.UNSIGNED_SYMMETRIC, False) + + # Read per-channel quantization field. 
Default = False + per_channel_enabled = configs[ConfigDictKeys.DEFAULTS].get(ConfigDictKeys.PER_CHANNEL_QUANTIZATION, False) + + return configs, strict_symmetric, unsigned_symmetric, per_channel_enabled + + @staticmethod + def _get_ordered_list_of_ops(graph: tf.Graph, input_op_names: List[str], output_op_names: List[str]) \ + -> List[tf.Operation]: + """ + Get Adaround supported ops based on occurrence in the model + :param graph: Model represented as TF data flow graph + :param input_op_names: List of input op names + :param output_op_names: List of output op names of the model + :return: List of Adaround supported ops + """ + # Get all the ops in the model based on occurrence + list_of_ordered_ops = get_ordered_ops(graph, input_op_names, output_op_names) + + ordered_ops = [] + + for op in list_of_ordered_ops: + if op.type in AdaroundSupportedOps: + ordered_ops.append(op) + + return ordered_ops + + @staticmethod + def _get_act_func(op: tf.Operation) -> Union[Callable, None]: + """ + Gets immediate following activation function else returns None + :param op: Tf op + :return: Callable Tf activation function or None + """ + act_func = None + consumer_ops = op.outputs[0].consumers() + + if not consumer_ops: + return act_func + + # op -> act_func + if consumer_ops[0].type in ActFuncMap: + act_func = ActFuncMap[consumer_ops[0].type] + + # op -> bias_add -> act_func + elif consumer_ops[0].type in ['Add', 'BiasAdd']: + if consumer_ops[0].outputs[0].consumers() and consumer_ops[0].outputs[0].consumers()[0].type in ActFuncMap: + act_func = ActFuncMap[consumer_ops[0].outputs[0].consumers()[0].type] + + logger.info("op: %s 's next following act func: %s", op.name, act_func) + return act_func + + @classmethod + def export_encoding_to_json(cls, path: str, filename_prefix: str, param_encodings: Dict): + """ + Save Adadrounded op's parameter encodings to JSON file + :param path: path where to store param encodings + :param filename_prefix: filename to store exported weight encodings in JSON format + :param param_encodings: Parameter encodings dictionary + """ + # export encodings to JSON file + os.makedirs(os.path.abspath(path), exist_ok=True) + encoding_file_path = os.path.join(path, filename_prefix + '.encodings') + with open(encoding_file_path, 'w') as encoding_fp: + json.dump(param_encodings, encoding_fp, sort_keys=True, indent=4) + + @staticmethod + def _update_param_encodings_dict(encoding_dict: Dict, op: tf.Operation, + encoding: Union[libpymo.TfEncoding, List[libpymo.TfEncoding]], + is_symmetric: bool): + """ + Add op's parameter encoding to dictionary to be used for exporting + :param encoding_dict: Encoding dictionary + :param op: Tf op + :param encoding: Encoding + :param is_symmetric: Symmetric vs Asymmetric boolean + """ + tensor_name = op.inputs[1].name + # Wrap Per Tensor encoding in a list for list comprehension + encoding = encoding if isinstance(encoding, list) else [encoding] + encoding_dict[tensor_name] = [{'min': enc.min, + 'max': enc.max, + 'scale': enc.delta, + 'offset': int(enc.offset), + 'bitwidth': enc.bw, + 'is_symmetric': str(is_symmetric)} for enc in encoding] + + @staticmethod + def get_is_symmetric_flag_for_op_param(configs: ConfigDictType, tf_op_type: str, param_name: str, + framework_to_onnx_type_dict: dict) -> bool: + """ + NOTE: Checks config file in reverse order of specificity. + + Returns is_symmetric flag for op's param if it is set in config file else returns + False. 
First check all ops of specific types, second check all params of specific + and lastly check for default types. If not specified, it will return default is + symmetric False. + + :param configs: Dictionary containing configs. + :param tf_op_type: TensorFlow operation type. + :param param_name: Parameter name. + :param framework_to_onnx_type_dict: Dictionary mapping framework type to ONNX type. + :return: is_symmetric flag for given op's param. + """ + assert param_name in MAP_TF_PARAM_NAME_TO_QUANTSIM_NAME.keys(), "param name is invalid." + + # third level of specificity which applies to specific op_type's parameters. + try: + onnx_type = framework_to_onnx_type_dict[tf_op_type] + return configs[ConfigDictKeys.OP_TYPE] \ + [onnx_type] \ + [ConfigDictKeys.PARAMS] \ + [param_name] \ + [ConfigDictKeys.IS_SYMMETRIC] + except KeyError: + pass + + # Second level of specificity which applies to all parameters only. + try: + return configs[ConfigDictKeys.PARAMS] \ + [param_name] \ + [ConfigDictKeys.IS_SYMMETRIC] + except KeyError: + pass + + # First level of specificity which applies to all the ops and parameters. + try: + return configs[ConfigDictKeys.DEFAULTS] \ + [ConfigDictKeys.PARAMS] \ + [ConfigDictKeys.IS_SYMMETRIC] + except KeyError: + pass + + # Default is_symmetric False. + return False + + @staticmethod + def get_conv2d_transpose_output_tensor_shape(data_format: str, output_data: np.ndarray): + """ + Get output height, width, and channels from output_data for use in adarounding conv2d transpose op. + :param data_format: Data format for the op (NHWC or NCHW) + :param output_data: numpy array containing sampled output of the op + :return: Tuple containing output height, width, and channels of the op + """ + if data_format == 'NHWC': + output_height = output_data.shape[1] + output_width = output_data.shape[2] + output_channels = output_data.shape[3] + else: + output_height = output_data.shape[2] + output_width = output_data.shape[3] + output_channels = output_data.shape[1] + return output_height, output_width, output_channels +
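Finally, an end-to-end sketch of calling Adaround.apply_adaround on a TensorFlow session. The graph input/output names and output paths are placeholders, and `params` refers to an AdaroundParameters instance such as the one sketched earlier.

from aimet_common.defs import QuantScheme
from aimet_tensorflow.adaround.adaround_weight import Adaround

adarounded_session = Adaround.apply_adaround(
    session=sess,                          # tf.compat.v1.Session holding the FP32 graph
    starting_op_names=['input_1'],         # placeholder input op name
    output_op_names=['logits'],            # placeholder output op name
    params=params,
    path='./output',
    filename_prefix='adaround',
    default_param_bw=8,
    default_quant_scheme=QuantScheme.post_training_tf_enhanced)

# The returned session carries the optimized (soft-rounded) weights; the exported
# './output/adaround.encodings' file can then be loaded into a QuantizationSimModel
# via set_and_freeze_param_encodings before compute_encodings is run.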
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/auto_quant.html b/releases/1.32.2/_modules/aimet_tensorflow/auto_quant.html new file mode 100644 index 00000000..5de58f8c --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/auto_quant.html @@ -0,0 +1,1969 @@ + + + + + + aimet_tensorflow.auto_quant — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_tensorflow.auto_quant

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+"""Automatic Post-Training Quantization"""
+import contextlib
+from dataclasses import dataclass
+import os
+from typing import Any, Callable, Dict, List, Optional, Tuple
+import tensorflow as tf
+from tqdm import tqdm
+
+import jinja2
+from bokeh.resources import CDN
+
+from aimet_tensorflow.adaround.adaround_weight import Adaround, AdaroundParameters
+from aimet_tensorflow.cross_layer_equalization import equalize_model
+from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms
+from aimet_tensorflow.quantsim import QuantizationSimModel
+from aimet_tensorflow.utils.graph_saver import load_model_from_meta
+from aimet_tensorflow.utils.common import (
+    create_input_feed_dict,
+    deepcopy_tf_session,
+    iterate_tf_dataset,
+)
+from aimet_tensorflow.cache import TfSessionSerializationProtocol
+
+from aimet_common.auto_quant import Diagnostics
+from aimet_common.cache import Cache
+from aimet_common.defs import QuantScheme
+from aimet_common.utils import AimetLogger, Spinner
+from aimet_common.quantsim import validate_quantsim_inputs
+
+
+tf.compat.v1.disable_eager_execution()
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.AutoQuant)
+
+
+cache = Cache()
+
+
+# The number of samples to be used for performance evaluation.
+# NOTE: None means "all".
+NUM_SAMPLES_FOR_PERFORMANCE_EVALUATION = None
+
+
+
[docs]class AutoQuant: + """ + Integrate and apply post-training quantization techniques. + + AutoQuant includes 1) batchnorm folding, 2) cross-layer equalization, + and 3) Adaround. + These techniques will be applied in a best-effort manner until the model + meets the evaluation goal given as allowed_accuracy_drop. + """ + + def __init__( # pylint: disable=too-many-arguments + self, + allowed_accuracy_drop: float, + unlabeled_dataset: tf.compat.v1.data.Dataset, + eval_callback: Callable[[tf.compat.v1.Session, Optional[int]], float], + default_param_bw: int = 8, + default_output_bw: int = 8, + default_quant_scheme: QuantScheme = QuantScheme.post_training_tf_enhanced, + default_rounding_mode: str = 'nearest', + default_config_file: str = None, + ) -> None: + """ + :param allowed_accuracy_drop: Maximum allowed accuracy drop. + :param unlabeled_dataset: An unlabeled dataset for encoding computation. + By default, this dataset will be also used for Adaround unless + otherwise specified by `self.set_adaround_params`. + :param eval_callback: A function that maps a tf session and the number of samples + to the evaluation score. This callback is expected to return a + scalar value representing the model performance evaluated + against exactly `N` samples, where `N` is the number of samples + passed as the second argument of this callback. + NOTE: If `N` is None, the model is expected to be evaluated against + the whole evaluation dataset. + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters. + :param default_output_bw: Default bitwidth (4-31) to use for quantizing layer inputs andoutputs. + :param default_quant_scheme: Quantization scheme. Supported values are + QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced. + :param default_rounding_mode: Rounding mode. Supported options are 'nearest' or 'stochastic' + :param default_config_file: Path to configuration file for model quantizers + """ + if allowed_accuracy_drop < 0: + raise ValueError( + "`allowed_accuracy_drop` must be a positive value. Got {:.2f}" + .format(allowed_accuracy_drop) + ) + + validate_quantsim_inputs(default_quant_scheme, + default_rounding_mode, + default_output_bw, + default_param_bw) + + self.allowed_accuracy_drop = allowed_accuracy_drop + self.eval_callback = eval_callback + self.default_param_bw = default_param_bw + self.default_output_bw = default_output_bw + self.default_quant_scheme = default_quant_scheme + self.default_rounding_mode = default_rounding_mode + self.default_config_file = default_config_file + + self._unlabeled_dataset = unlabeled_dataset + self._unlabled_dataset_length = None + + self._adaround_params = None + + @property + def adaround_params(self): + """Returns the adaround parameter.""" + # If adaround_params is manually set, return it. + if self._adaround_params is not None: + return self._adaround_params + # Otherwise, return the default adaround params if the length of the + # dataset if known. + if self._unlabled_dataset_length is not None: + return AdaroundParameters(self._unlabeled_dataset, + self._unlabled_dataset_length) + return None + + def _evaluate_model_performance(self, sess: tf.compat.v1.Session) -> float: + """ + Evaluate the model performance. + + :param sess: tf.Session associated with the model to evaluate. + :return: Evaluation score. + """ + return self.eval_callback(sess, NUM_SAMPLES_FOR_PERFORMANCE_EVALUATION) + +
[docs] def set_adaround_params(self, adaround_params: AdaroundParameters) -> None: + """ + Set Adaround parameters. + If this method is not called explicitly by the user, AutoQuant will use + `unlabeled_dataset` (passed to `__init__`) for Adaround. + + :param adaround_params: Adaround parameters. + """ + self._adaround_params = adaround_params
+ + def _create_quantsim_and_encodings( # pylint: disable=too-many-arguments + self, + sess: tf.compat.v1.Session, + starting_op_names: List[str], + output_op_names: List[str], + quant_scheme: QuantScheme = None, + rounding_mode: str = None, + default_output_bw: int = None, + default_param_bw: int = None, + config_file: str = None, + encoding_path: str = None, + ) -> QuantizationSimModel: + """ + Create a QuantizationSimModel and compute encoding. If `encoding_path` is not None, + it is prioritized over other arguments (`default_output_bw`, `defalt_param_bw`, ...). + + NOTE: Input session is not mutated. + + :param sess: The input model as session to add quantize ops to. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :param quant_scheme: Quantization scheme. Defaults to self.default_quant_scheme. + :param rounding_mode: Rounding mode. Defaults to self.default_rounding_mode. + :param default_output_bw: Default bitwidth (4-31) to use for quantizing layer inputs andoutputs. + Defaults to self.default_output_bw. + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters. + Defaults to self.default_param_bw. + :param config_file: Path to configuration file for model quantizers. + Defaults to self.default_config_file. + :param encoding_path: Path to parameter encodings file. + :return: Quantsim model. + """ + kwargs = dict( + quant_scheme=(quant_scheme or self.default_quant_scheme), + rounding_mode=(rounding_mode or self.default_rounding_mode), + default_output_bw=(default_output_bw or self.default_output_bw), + default_param_bw=(default_param_bw or self.default_param_bw), + config_file=(config_file or self.default_config_file), + ) + with deepcopy_tf_session(sess) as sess: # pylint: disable=redefined-argument-from-local + sim = QuantizationSimModel(sess, starting_op_names, output_op_names, **kwargs) + + if encoding_path: + sim.set_and_freeze_param_encodings(encoding_path) + + def forward_pass_callback(sess: tf.compat.v1.Session, _: Any = None): + output_ops = [ + sess.graph.get_operation_by_name(op_name) + for op_name in output_op_names + ] + + count = 0 + iterator = iterate_tf_dataset(self._unlabeled_dataset) + for inputs in tqdm(iterator, total=self._unlabled_dataset_length): + feed_dict = create_input_feed_dict(sess.graph, starting_op_names, inputs) + sess.run(output_ops, feed_dict=feed_dict) + count += 1 + + self._unlabled_dataset_length = count + + sim.compute_encodings(forward_pass_callback, None) + + return sim + + def _apply_batchnorm_folding( # pylint: disable=no-self-use + self, + sess: tf.compat.v1.Session, + starting_op_names: List[str], + output_op_names: List[str], + ) -> Tuple[tf.compat.v1.Session, List[Tuple[tf.Operation, tf.Operation]]]: + """ + Apply batchnorm folding. + + NOTE: Input session is not mutated. + + :param sess: tf.Session associated with the model to apply cle. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :return: Output session and folded pairs. + """ + # NOTE: We don't apply caching to batchnorm folding because caching is + # likely going to have an adverse effect on the performance. + # Since a tf.Operation contains a reference to the graph it belongs + # to, serializing a subset of operations of a tf.Graph requires + # serializing the whole graph, making the serialization cost very + # likely to exceed the evaluation cost. 
+ with deepcopy_tf_session(sess) as sess: # pylint: disable=redefined-argument-from-local + return fold_all_batch_norms(sess, starting_op_names, output_op_names) + + @cache.mark("cle", TfSessionSerializationProtocol()) + def _apply_cross_layer_equalization( # pylint: disable=no-self-use + self, + sess: tf.compat.v1.Session, + starting_op_names: List[str], + output_op_names: List[str], + ) -> tf.compat.v1.Session: + """ + Apply cross-layer equalization. + + NOTE: Input session is not mutated. + + :param sess: tf.Session associated with the model to apply batchnorm folding. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :return: Output session. + """ + with deepcopy_tf_session(sess) as sess: # pylint: disable=redefined-argument-from-local + return equalize_model(sess, starting_op_names, output_op_names) + + def _apply_adaround( + self, + sess: tf.compat.v1.Session, + starting_op_names: List[str], + output_op_names: List[str], + results_dir: str, + ) -> Tuple[tf.compat.v1.Session, str]: + """ + Apply adaround. + + :param sess: tf.Session associated with the model to apply adaround. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :param results_dir: Directory to save the results of AdaRound. + :return: Output session and the path to the parameter encoding file. + """ + # NOTE: We dont need to make a deepcopy of model here, since Adaround.apply_adaround + # internally creates and returns a deepcopy of model. + if self.adaround_params is None: + raise RuntimeError + + filename_prefix = "adaround" + adaround_encoding_path = os.path.join(results_dir, + "{}.encodings".format(filename_prefix)) + _apply_adaround_cached =\ + cache.mark("adaround", TfSessionSerializationProtocol())\ + (Adaround.apply_adaround) + + sess = _apply_adaround_cached(sess, + starting_op_names, + output_op_names, + self.adaround_params, + path=results_dir, + filename_prefix=filename_prefix, + default_param_bw=self.default_param_bw, + default_quant_scheme=self.default_quant_scheme, + default_config_file=self.default_config_file) + + return sess, adaround_encoding_path + +
[docs] def apply( + self, + fp32_sess: tf.compat.v1.Session, + starting_op_names: List[str], + output_op_names: List[str], + results_dir: str = "/tmp", + cache_id: str = None, + ) -> Tuple[tf.compat.v1.Session, float, str]: + """ + Apply post-training quantization techniques. + + :param fp32_sess: tf.Session associated with the model to apply PTQ techniques. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :param results_dir: Directory to save the results. + :return: Tuple of (best session, eval score, encoding path). + """ + result = self._apply_helper(self._auto_quant_main, + fp32_sess, + starting_op_names, + output_op_names, + results_dir, + cache_id) + return result["model"],\ + result["accuracy"],\ + result["encoding_path"]
+ + def _apply_helper( + self, + auto_quant_main_fn: Callable, + fp32_sess: tf.compat.v1.Session, + starting_op_names: List[str], + output_op_names: List[str], + results_dir: str = "/tmp", + cache_id: str = None, + ) -> Dict[str, Any]: + """ + Helper for self.apply(). + + :param fp32_sess: tf.Session associated with the model to apply PTQ techniques. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :param results_dir: Directory to save the results. + :return: The best ptq result as a dictionary. + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + if cache_id is None: + cache_dir = None + else: + cache_dir = os.path.join(results_dir, ".auto_quant_cache", cache_id) + + with cache.enable(cache_dir): + _logger.info("Starting AutoQuant") + + fp32_acc = self._evaluate_model_performance(fp32_sess) + target_acc = fp32_acc - self.allowed_accuracy_drop + + _logger.info("Target eval score: %f", target_acc) + _logger.info("FP32 eval score (W32A32): %f", fp32_acc) + + eval_manager = _EvalManager( + quantsim_factory=self._create_quantsim_and_encodings, + eval_func=self._evaluate_model_performance, + starting_op_names=starting_op_names, + output_op_names=output_op_names, + results_dir=results_dir, + ) + + ret = auto_quant_main_fn(fp32_sess, target_acc, + starting_op_names, output_op_names, + eval_manager, results_dir) + + acc = ret["accuracy"] + _logger.info("Best eval score: %f", acc) + + if acc < target_acc: + _logger.info( + "AutoQuant is unable to match the target accuracy. " + "Consider Quantization Aware Training." + ) + + eval_manager.export_diagnostics() + + return ret + + def _auto_quant_main( + self, + fp32_sess: tf.compat.v1.Session, + target_acc: float, + starting_op_names: List[str], + output_op_names: List[str], + eval_manager: "_EvalManager", + results_dir: str = "/tmp", + ) -> Dict[str, Any]: + """ + Helper function of apply(). + + :param fp32_sess: Model to apply PTQ techniques. + :param target_acc: Target eval score. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :param eval_manager: _Evalmanager object. + :param results_dir: Directory to save the results. + :return: The best ptq result as a dictionary. 
+ """ + with eval_manager.analysis_session("Weight Quantization Sensitivity") as s: + acc = s.eval(fp32_sess, default_output_bw=32) + s.diagnostics.add( + f"Weight-quantized eval score (W{self.default_param_bw}A32): {acc:f}" + ) + + with eval_manager.analysis_session("Activation Quantization Sensitivity") as s: + acc = s.eval(fp32_sess, default_param_bw=32) + s.diagnostics.add( + f"Activation-quantized eval score (W32A{self.default_output_bw}): {acc:f}" + ) + + # Batchnorm Folding + with eval_manager.ptq_session("Batchnorm Folding") as s: + sess, folded_pairs = self._apply_batchnorm_folding(fp32_sess, + starting_op_names, + output_op_names) + for conv, bn in folded_pairs: + s.diagnostics.add(f"{conv} was merged with {bn}.") + s.set_ptq_result(sess=sess, applied_techniques=["batchnorm_folding"]) + + best_result = eval_manager.get_best_ptq_result() + if best_result.accuracy >= target_acc: + return best_result.as_dict() + + # Cross-Layer Equalization + with eval_manager.ptq_session("Cross-Layer Equalization") as s: + sess = self._apply_cross_layer_equalization(fp32_sess, + starting_op_names, + output_op_names) + s.set_ptq_result(sess=sess, applied_techniques=["cross_layer_equalization"]) + + best_result = eval_manager.get_best_ptq_result() + if best_result.accuracy >= target_acc: + return best_result.as_dict() + + # AdaRound + with eval_manager.ptq_session("AdaRound") as s: + sess, encoding_path = self._apply_adaround(best_result.load_model(), + starting_op_names, + output_op_names, + results_dir) + s.set_ptq_result(sess=sess, + encoding_path=encoding_path, + applied_techniques=[*best_result.applied_techniques, "adaround"]) + + return eval_manager.get_best_ptq_result().as_dict()
+ + +@dataclass +class PtqResult: + """ + Evaluation results. + :param tag: Identifier string of the evaluation result. + :param model_path: Path to the serialized model. + :param encoding_path: Path to the encoding file. + :param accuracy: Accuracy of the model. + """ + meta_path: str + checkpoint_path: str + encoding_path: str + accuracy: float + applied_techniques: List[str] + + def load_model(self) -> tf.compat.v1.Session: + """ + Load model. + :return: Loaded model. + """ + return load_model_from_meta(self.meta_path, self.checkpoint_path) + + def as_dict(self): + """Convert to dictionary""" + return dict(model=self.load_model(), + accuracy=self.accuracy, + encoding_path=self.encoding_path, + applied_techniques=self.applied_techniques) + + +class _EvalManager: + """ + Evaluation manager for AutoQuant. + """ + def __init__(self, + quantsim_factory: Callable, + eval_func: Callable[[tf.compat.v1.Session], float], + starting_op_names: List[str], + output_op_names: List[str], + results_dir: str): + """ + :param quantsim_factory: A factory function that returns QuantizationSimModel. + :param eval_func: Evaluation function. + :param dummy_input: Dummy input to the model. Assumed to be located on the same device as the model. + :param dummy_input_on_cpu: Dummy input to the model in CPU memory. + :param results_dir: Base directory to save the temporary serialized model. + """ + self._quantsim_factory = quantsim_factory + self._eval_func = eval_func + self._starting_op_names = starting_op_names + self._output_op_names = output_op_names + self._results_dir = results_dir + + os.makedirs(self._results_dir, exist_ok=True) + + self._all_sessions: List[_EvalSession] = [] + self._ptq_sessions: List[_PtqSession] = [] + + def get_best_ptq_result(self) -> PtqResult: + """ + Get the results with the highest evaluation score among the ptq results evaluated so far. + :return: The best evaluation result so far. + """ + if not self._ptq_sessions: + raise RuntimeError + + ptq_results = [sess.ptq_result for sess in self._ptq_sessions] + return max(ptq_results, key=lambda ptq_result: ptq_result.accuracy) + + def analysis_session(self, title: str) -> "_EvalSession": + """ + Return a session for analysis only. + :param title: Title of the session. + :return: Analysis session. + """ + return self._get_session(title, _EvalSession) + + def ptq_session(self, title: str) -> "_PtqSession": + """ + Return a session for analysis only. + :param title: Title of the session. + :return: PTQ session. + """ + sess = self._get_session(title, _PtqSession) + self._ptq_sessions.append(sess) + return sess + + def _get_session(self, title: str, session_cls: type): + """ + Session factory. + :param title: Title of the session. + :session_cls: Class of the session. + :return: Session object. + """ + session = session_cls(title, + self._quantsim_factory, + self._eval_func, + self._starting_op_names, + self._output_op_names, + results_dir=os.path.join(self._results_dir, ".trace")) + self._all_sessions.append(session) + return session + + def export_diagnostics(self) -> str: + """ + Export diagnostics in html format. + :return: Diagnostics string in html format. 
+ """ + loader = jinja2.FileSystemLoader(os.path.dirname(os.path.abspath(__file__))) + env = jinja2.Environment(loader=loader) + template = env.get_template("auto_quant_diagnostics_template.html") + + if any(sess.diagnostics.contains_bokeh() for sess in self._all_sessions): + head = CDN.render() + else: + head = "" + + body = { + sess.title: sess.diagnostics + for sess in self._all_sessions + if not sess.diagnostics.is_empty() + } + + html = template.render(head=head, body=body) + filename = os.path.join(self._results_dir, "diagnostics.html") + with open(filename, "w") as f: + f.write(html) + return html + + +class _EvalSession: + """ + Evaluation session for AutoQuant. + + Each session object contains a title and diagnostics produced during the session. + The collected diagnostics will be exported into a html file by _EvalManager. + """ + def __init__( + self, + title: str, + quantsim_factory: Callable, + eval_func: Callable[[tf.compat.v1.Session], float], + starting_op_names: List[str], + output_op_names: List[str], + results_dir: str + ): + """ + :param title: Title of the session. + :param quantsim_factory: A factory function that returns QuantizationSimModel. + :param eval_func: Evaluation function. + :param dummy_input: Dummy input to the model. Assumed to be located on the same device as the model. + :param dummy_input_on_cpu: Dummy input to the model in CPU memory. + :param results_dir: Base directory to save the temporary serialized model. + """ + self._title = title + self._quantsim_factory = quantsim_factory + self._eval_func = eval_func + self._starting_op_names = starting_op_names + self._output_op_names = output_op_names + self._results_dir = results_dir + self._spinner = None + + os.makedirs(self._results_dir, exist_ok=True) + + self._diagnostics = Diagnostics() + + # Map session title to file name. + # e.g. title: "Cross-Layer Equalization" -> filename: "cross_layer_equalization" + self._filename = self._title.lower().replace("-", " ") + self._filename = "_".join(self._filename.split()) + + @property + def title(self): + """Getter of self._title.""" + return self._title + + @property + def diagnostics(self): + """Getter of self._diagnostics.""" + return self._diagnostics + + def eval(self, sess: tf.compat.v1.Session, **kwargs): + """ + Evaluate the model. + :param sess: tf.Session associated with the model to evaluate. + :param **kwargs: Additional arguments to the quantsim factory. + :return: Eval score. + """ + sim = self._quantsim_factory(sess, + self._starting_op_names, + self._output_op_names, + **kwargs) + acc = self._eval_func(sim.session) + return acc + + def __enter__(self): + self._spinner = Spinner(self._title) + self._spinner.__enter__() + return self + + def __exit__(self, exc_type, exc_val, exc_tb): + try: + if self._spinner is not None: + self._spinner.__exit__(exc_type, exc_val, exc_tb) + finally: + if exc_val is not None: + raise exc_val + + +class _PtqSession(_EvalSession): + """ + PTQ session. + + Each PTQ session object should call `set_ptq_result` exactly once + inside a with-as block. 
+ """ + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + self._ptq_result = None + + @property + def ptq_result(self) -> PtqResult: + """Getter of self._ptq_result.""" + if self._ptq_result is None: + raise RuntimeError + return self._ptq_result + + def set_ptq_result( + self, + applied_techniques: List[str], + sess: tf.compat.v1.Session = None, + sim: QuantizationSimModel = None, + acc: float = None, + **kwargs + ) -> None: + """ + Set the result of PTQ. Should be called exactly once inside a with-as block. + + Exactly one among model and (sim, acc) pair should be specified. + 1) If sim and acc is specified, save them as the result of this session. + 2) If model is specified, evaluate the quantized accuracy of the model and save the result. + + :param sess: Result of PTQ. + :param sim: Result of PTQ. The quamtization encoding (compute_encodings()) is + assumed to have been computed in advance. + :param acc: Eval score. + :param **kwargs: Additional arguments to the quantsim factory. + :return: None + """ + if sim is None: + assert acc is None + assert sess is not None + sim = self._quantsim_factory(sess, + self._starting_op_names, + self._output_op_names, + **kwargs) + acc = self._eval_func(sim.session) + else: + assert acc is not None + assert sess is None + + self._set_ptq_result(sim, acc, applied_techniques) + + def _set_ptq_result( + self, + sim: QuantizationSimModel, + acc: float, + applied_techniques: List[str], + ) -> PtqResult: + """ + Set the result of PTQ. Should be called exactly once inside a with-as block. + + :param sim: Result of PTQ. The quamtization encoding (compute_encodings()) is + assumed to have been computed in advance. + :param acc: Eval score. + :return: PtqResult object. + """ + if self._ptq_result is not None: + raise RuntimeError( + "sess.eval() can be called only once per each _EvalSession instance." + ) + + meta_path, checkpoint_path, encoding_path = self._export(sim) + self._ptq_result = PtqResult( + meta_path=meta_path, + checkpoint_path=checkpoint_path, + encoding_path=encoding_path, + accuracy=acc, + applied_techniques=applied_techniques, + ) + return self._ptq_result + + def _export(self, sim: QuantizationSimModel) -> Tuple[str, str, str]: + """ + Export quantsim. + :param sim: QuantizationSimModel object to export. + :return: The paths where model and encoding are saved + """ + sim.export(path=self._results_dir, filename_prefix=self._filename) + checkpoint_path = os.path.join(self._results_dir, self._filename) + meta_path = f"{checkpoint_path}.meta" + encoding_path = f"{checkpoint_path}.encodings" + _logger.info("The results of %s is saved in %s, %s, and %s.", + self._title, checkpoint_path, meta_path, encoding_path) + return meta_path, checkpoint_path, encoding_path + + def __exit__(self, exc_type, exc_val, exc_tb): + """Raises error if set_ptq_result is not called.""" + super().__exit__(exc_type, exc_val, exc_tb) + + if self._ptq_result is None: + raise RuntimeError + + _logger.info("Session finished: %s. (eval score: %f)", + self._title, self._ptq_result.accuracy) + + +@contextlib.contextmanager +def spy_auto_quant(auto_quant: AutoQuant): + """ + Install a spy that collects the handles to the ptq result of + each stage of AutoQuant. + + Typical usage:: + >>> auto_quant = AutoQuant(...) + ... with auto_quant_spy(auto_quant) as spy: + ... _ = auto_quant.apply(...) + ... + ... for result in spy.get_all_ptq_results(): + ... print(result.applied_techniques) + ... print(result.accuracy) + ... print(result.encoding_path) + ... 
model = result.load_model() + ... ... + """ + # pylint: disable=protected-access + class Spy: + """ + Spy that collects the handles to the ptq result of + each stage of AutoQuant. + """ + def __init__(self): + self._eval_manager = None + + def get_all_ptq_results(self) -> List[PtqResult]: + """Return handles to the results of AutoQuant""" + if self._eval_manager is None: + return [] + return [sess.ptq_result for sess in self._eval_manager._ptq_sessions] + + spy = Spy() + + _auto_quant_main = auto_quant._auto_quant_main + + def _auto_quant_main_wrapper(fp32_sess, target_acc, starting_op_names, + output_op_names, eval_manager, results_dir="/tmp"): + spy._eval_manager = eval_manager + return _auto_quant_main(fp32_sess, target_acc, starting_op_names, + output_op_names, eval_manager, results_dir) + + try: + setattr(auto_quant, "_auto_quant_main", _auto_quant_main_wrapper) + yield spy + finally: + setattr(auto_quant, "_auto_quant_main", _auto_quant_main) +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/batch_norm_fold.html b/releases/1.32.2/_modules/aimet_tensorflow/batch_norm_fold.html new file mode 100644 index 00000000..24f8d51b --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/batch_norm_fold.html @@ -0,0 +1,1787 @@ + + + + + + aimet_tensorflow.batch_norm_fold — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + + +
  • +
  • +
+
+
+
+
+ +

Source code for aimet_tensorflow.batch_norm_fold

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" TF Code to fold batch-norm layers """
+
+from typing import List, Tuple, Union, Set
+import numpy as np
+import tensorflow as tf
+
+from aimet_common.graph_searcher import GraphSearcher
+from aimet_common.bias_correction import ConvBnPatternHandler
+from aimet_common.graph_pattern_matcher import PatternType
+from aimet_common.quantsim import compute_min_max_given_delta_offset
+from aimet_common.utils import AimetLogger
+import aimet_common.libpymo as libpymo
+
+from aimet_tensorflow.common.connectedgraph import ConnectedGraph
+from aimet_tensorflow.common.operation import OpWithMetaInfoType, Op
+from aimet_tensorflow.quantsim import QuantizationSimModel
+from aimet_tensorflow.utils.op.conv import WeightTensorUtils, BiasUtils
+from aimet_tensorflow.utils.op.fusedbatchnorm import BNUtils
+from aimet_tensorflow.utils.graph_saver import save_and_load_graph
+from aimet_tensorflow.utils.op.conv import get_weight_tensor_with_shape
+from aimet_tensorflow.utils.common import get_ordered_conv_linears, get_ordered_ops
+from aimet_tensorflow.quantizer_info import QuantizerInfo
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.BatchNormFolding)
+
+# save required information for performing bn fold on candidate bns as
+# <PairType> entries, each of which includes:
+# tf.Operation       : op which the bn needs to be folded into.
+# OpWithMetaInfoType : bn op, storing the input and output tensors along with the tf.Operation
+# bool               : flag indicating whether the bn op is folded upstream or downstream.
+PairType = Tuple[tf.Operation, Union[OpWithMetaInfoType, Op], bool]
+
+
+def _conv_bn_select_custom_pattern_init():
+    """
+    initialize the patterns used to pick (conv/linear, bn) layer pairs for batch-norm folding
+    :return: patterns and associated actions to be performed upon match
+    """
+
+    patterns_with_callbacks = []
+
+    # the types we want to handle
+    conv_layer_types = ['Conv2D', 'DepthwiseConv2dNative']
+    preceeding_linear_op_types = ['Flatten', 'Reshape']
+
+    # handler when pattern match
+    layer_select_handler = ConvBnPatternHandler()
+
+    # Linear layer combinations
+    for preceeding_linear_op_type in preceeding_linear_op_types:
+        # BN -> Linear
+        patterns_with_callbacks.append(PatternType(pattern=['FusedBatchNormV3', preceeding_linear_op_type, 'Dense'],
+                                                   action=layer_select_handler))
+
+        patterns_with_callbacks.append(PatternType(pattern=['FusedBatchNorm', preceeding_linear_op_type, 'Dense'],
+                                                   action=layer_select_handler))
+        # note: we cannot perform linear -> BN on TF
+
+    # conv layer combinations
+    for conv in conv_layer_types:
+
+        # BN -> Conv / Conv -> BN
+        patterns_with_callbacks.append(PatternType(pattern=[conv, 'FusedBatchNormV3'],
+                                                   action=layer_select_handler))
+
+        patterns_with_callbacks.append(PatternType(pattern=['FusedBatchNormV3', conv],
+                                                   action=layer_select_handler))
+
+        patterns_with_callbacks.append(PatternType(pattern=[conv, 'FusedBatchNorm'],
+                                                   action=layer_select_handler))
+
+        patterns_with_callbacks.append(PatternType(pattern=['FusedBatchNorm', conv],
+                                                   action=layer_select_handler))
+
+    return patterns_with_callbacks, layer_select_handler
+
+
+def _find_conv_bn_pairs(conn_graph: ConnectedGraph):
+    """
+    uses the graph searcher to find conv/linear layers with associated bn and activation info.
+    :param conn_graph: ConnectedGraph associated with the model
+    :return: dictionary of conv/linear layers with associated bn op / activation info
+    """
+    # create a list of patterns and corresponding handlers or actions to be applied for selecting
+    # conv/linear layers with associated batch norms for folding.
+    # layer_select_handler is an instance of the custom handler used for this selection.
+    patterns_with_callback, layer_select_handler = _conv_bn_select_custom_pattern_init()
+
+    # graph searcher looks for patterns and applies actions when matching patterns are found
+    graph_searcher = GraphSearcher(conn_graph, patterns_with_callback)
+    graph_searcher.find_all_patterns_in_graph_apply_actions()
+
+    # use the custom handler instance to fetch the selected conv/linear and bn layer info
+    convs_linears_bn_activation_info_dict = layer_select_handler.get_conv_linear_bn_info_dict()
+
+    return convs_linears_bn_activation_info_dict
+
+
+def find_all_batch_norms_to_fold(sess: tf.compat.v1.Session, start_op_names: Union[List[str], str],
+                                 output_op_names: Union[List[str], str], return_bn_conn_op=False) -> Tuple[List[PairType], Set[tf.Operation]]:
+    """
+    uses the graph searcher to find all batch norm layers that can be folded into adjacent conv/linear layers
+    :param sess: tf.compat.v1.Session type
+    :param start_op_names: list of strings with names of starting ops in the model
+    :param output_op_names: List of output op names of the model, used to help ConnectedGraph determine valid ops
+    (to ignore training ops for example). If None, all ops in the model are considered valid.
+    :param return_bn_conn_op: Return bn op as connected graph op instead of tf tensor
+
+    :return: List of (conv/linear, bn) pairs that can be folded and the set of bn ops selected for folding
+    """
+    if isinstance(start_op_names, str):
+        start_op_names = [start_op_names]
+
+    if isinstance(output_op_names, str):
+        output_op_names = [output_op_names]
+
+    conn_graph = ConnectedGraph(sess.graph, start_op_names, output_op_names)
+    bn_conv_linear_pairs, marked_bn_set = _find_all_batch_norms_to_fold(conn_graph, start_op_names, output_op_names,
+                                                                        return_bn_conn_op)
+    return bn_conv_linear_pairs, marked_bn_set
+
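+# Illustrative sketch (not part of the AIMET module above): inspecting which
+# (conv/linear, bn) pairs would be folded before actually folding them. `sess`
+# and the op names are placeholders for the user's own graph.
+def _example_list_foldable_batch_norms(sess: tf.compat.v1.Session):
+    pairs, picked_bns = find_all_batch_norms_to_fold(sess, 'input_1', 'logits/Softmax')
+    for conv_op, bn_op_info, fold_backward in pairs:
+        direction = 'backward' if fold_backward else 'forward'
+        print(f'{conv_op.name}: bn will be {direction}-folded')
+    return pairs, picked_bns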
+
+def _get_bias_tensor(sess: tf.compat.v1.Session, conv: tf.Operation) -> libpymo.TensorParams():
+    """
+    Get bias tensor in given conv op.
+    Packs bias in the format required for BN fold
+    (libpymo.TensorParams()).
+    :param sess: current session
+    :param conv: conv op
+    :return: return bias param in libpymo.TensorParams() format.
+    """
+    # Bias tensor
+    bias_tensor = libpymo.TensorParams()
+    with sess.graph.as_default():
+        if not BiasUtils.is_bias_none(conv):
+            bias_tensor.shape = BiasUtils.get_shape(conv)
+            bias_tensor.data = BiasUtils.get_bias_as_numpy_data(sess, conv)
+
+    return bias_tensor
+
+
+def _get_weight_tensor_transpose_reshape(sess: tf.compat.v1.Session, conv: tf.Operation) -> libpymo.TensorParams():
+    """
+    Get weight tensor from conv op
+    Converts to right format - performs transpose and reshape.
+    Packs it to the format required for BN fold (libpymo.TensorParams()).
+    :param sess: current session
+    :param conv: conv op
+    :return: return weight tensor in libpymo.TensorParams() format.
+    """
+    # Weight tensor libpymo format
+    weight_tensor = libpymo.TensorParams()
+    wt_tensor, shape = get_weight_tensor_with_shape(sess, conv)
+
+    # linear array to be sent for bn fold
+    weight_tensor.data = wt_tensor.reshape(-1)
+    weight_tensor.shape = shape
+
+    return weight_tensor
+
+
+def _get_bn_params(sess: tf.compat.v1.Session, bn: tf.Operation) -> libpymo.BNParams():
+    """
+    helper to populate BN params from given BN op, required for fold
+
+    :param sess: tf.compat.v1.Session type
+    :param bn: BatchNorm or FusedBatchNorm op
+    :return: bn_params
+    """
+    with sess.graph.as_default():
+        # create BNParams type and populate
+        bn_params = libpymo.BNParams()
+        bn_params.beta = BNUtils.get_beta_as_numpy_data(sess, bn).reshape(-1)
+        bn_params.gamma = BNUtils.get_gamma_as_numpy_data(sess, bn).reshape(-1)
+        bn_params.runningMean = BNUtils.get_moving_mean_as_numpy_data(sess, bn).reshape(-1)
+
+        if bn.type == 'Identity':
+            # can't find a way to read epsilon if BN type is Identity
+            epsilon = 0.001
+        else:
+            epsilon = BNUtils.get_epsilon(bn)
+
+        var = BNUtils.get_moving_variance_as_numpy_data(sess, bn).reshape(-1)
+        sigma = np.sqrt(var + epsilon)
+        bn_params.runningVar = sigma
+    return bn_params
+
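+# Conceptual sketch (not the AIMET/libpymo implementation) of the arithmetic that
+# libpymo.fold performs for a Conv -> BN pair with fold_backward=True, using the
+# sigma = sqrt(var + eps) convention of _get_bn_params above. Argument names and
+# the OIHW weight layout are assumptions made for illustration only.
+def _example_fold_conv_bn_backward(weight_oihw, bias, gamma, beta, mu, var, eps=1e-3):
+    sigma = np.sqrt(var + eps)            # matches bn_params.runningVar above
+    scale = gamma / sigma                 # per-output-channel scale
+    folded_weight = weight_oihw * scale[:, None, None, None]
+    folded_bias = beta + (bias - mu) * scale
+    return folded_weight, folded_bias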
+# pylint: disable=too-many-locals
+def _fold_given_auto_selected_batch_norms(sess: tf.compat.v1.Session, layer_pairs: List[PairType]) -> tf.compat.v1.Session:
+    """
+    Fold a given set of batch_norm layers into conv layers
+
+    :param sess: tf.compat.v1.Session
+    :param layer_pairs: pair of conv and bn layers
+    :return: new session with updated graph
+    """
+    with sess.graph.as_default():
+        for pair in layer_pairs:
+            conv_linear, bn, fold_backward = pair
+            assert conv_linear.type in ['Conv2D', 'DepthwiseConv2dNative', 'MatMul']
+            #  check flag
+            is_bias_valid = False
+            if not BiasUtils.is_bias_none(conv_linear):
+                is_bias_valid = True
+
+            bn_params = _get_bn_params(sess, bn.op)
+            weight_tensor = _get_weight_tensor_transpose_reshape(sess, conv_linear)
+            bias_tensor = _get_bias_tensor(sess, conv_linear)
+
+            bias = libpymo.fold(bn_params, weight_tensor, bias_tensor, is_bias_valid, fold_backward)
+
+            # converting back to TF format [kh, kw, Nic, Noc] before updating weight tensor value
+            if conv_linear.type == 'DepthwiseConv2dNative':
+                # Depthwise conv layers in TF have outputs(Noc) set to 1.
+                # we send in format [Nic, Noc, kh, kw]
+                numpy_weight_reshaped = np.reshape(weight_tensor.data, weight_tensor.shape).transpose((2, 3, 0, 1))
+            elif conv_linear.type == 'MatMul':
+                # o, i - convert to i , o
+                numpy_weight_reshaped = np.reshape(weight_tensor.data,
+                                                   [weight_tensor.shape[0], weight_tensor.shape[1]]).transpose(1, 0)
+            else:
+                # conv2D case
+                # we send in format [Noc, Nic, kh, kw]
+                numpy_weight_reshaped = np.reshape(weight_tensor.data, weight_tensor.shape).transpose((2, 3, 1, 0))
+
+            WeightTensorUtils.update_tensor_for_op(sess, conv_linear, numpy_weight_reshaped)
+
+            # remove bn op
+            BNUtils.skip_bn_op(sess, bn.op, bn.in_tensor, bn.out_tensor)
+
+            # update bias tensor, even in case there was no existing bias add op in given conv2D op.
+            bias_tensor_shape = [weight_tensor.shape[0]]
+            numpy_bias_reshaped = np.reshape(bias, bias_tensor_shape)
+            BiasUtils.update_bias_for_op(sess, conv_linear, numpy_bias_reshaped)
+
+        # we edited the graph, so we should load and save for the metagraph associated with the session to be updated
+        after_bn_fold_sess = save_and_load_graph('./temp_bn_fold', sess)
+
+    return after_bn_fold_sess
+
+
+
[docs]def fold_given_batch_norms(sess: tf.compat.v1.Session, input_op_names: Union[str, List[str]], + output_op_names: Union[str, List[str]], + layer_pairs: List[Tuple[tf.Operation, tf.Operation, bool]]) -> tf.compat.v1.Session: + + """ + Api to fold custom set of bn layers in a model + + :param sess: active tensorflow session + :param input_op_names: starting op in model or a list of starting ops in the model + :param layer_pairs: List of tuple with conv and bn op layers as tf.Operation and + a flag to indicate fold upstream or downstream + :param output_op_names: List of output op names of the model, used to help ConnectedGraph determine valid ops + (to ignore training ops for example). + :return: updated_session after fold + + """ + # check for valid types + if not isinstance(input_op_names, (str, List)): + logger.error('start op names must be passed as a string or a List of strings') + + # if passed start op name is a single string, create a list + if isinstance(input_op_names, str): + input_op_names = [input_op_names] + + connected_graph = ConnectedGraph(sess.graph, input_op_names, output_op_names) + + conn_tf_n_op_map = {} + for op in connected_graph.get_all_ops().values(): + if op.type in ['FusedBatchNormV3', 'FusedBatchNorm']: + conn_tf_n_op_map[op.get_module()] = op + + layer_pairs_internal_format = [] + for layer_pair in layer_pairs: + conv_op, bn_op, is_bn_op_second = layer_pair + layer_pairs_internal_format.append((conv_op, conn_tf_n_op_map[bn_op].get_tf_op_with_io_tensor(), is_bn_op_second)) + + # invoke internal api + new_sess = _fold_given_auto_selected_batch_norms(sess, layer_pairs_internal_format) + + # save and load graph + after_fold_sess = save_and_load_graph('./temp_graph', new_sess) + + return after_fold_sess
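+# Illustrative sketch (not part of the AIMET module above): folding a hand-picked
+# (conv, bn) pair with fold_given_batch_norms. The op names 'conv1/Conv2D',
+# 'bn1/FusedBatchNormV3', 'input_1', and 'logits/Softmax' are placeholders for
+# ops in the user's own graph.
+def _example_fold_given_batch_norms(sess: tf.compat.v1.Session):
+    conv_op = sess.graph.get_operation_by_name('conv1/Conv2D')
+    bn_op = sess.graph.get_operation_by_name('bn1/FusedBatchNormV3')
+    # (conv, bn, True) requests that the BN following the conv be folded back into it.
+    return fold_given_batch_norms(sess,
+                                  input_op_names='input_1',
+                                  output_op_names='logits/Softmax',
+                                  layer_pairs=[(conv_op, bn_op, True)])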
+ + +
[docs]def fold_all_batch_norms(sess: tf.compat.v1.Session, input_op_names: Union[str, List[str]], + output_op_names: Union[str, List[str]])\ + -> Tuple[tf.compat.v1.Session, List[Tuple[tf.Operation, tf.Operation]]]: + """ + Fold all batch_norm layers in a model into corresponding conv layers + + :param sess: active tf.compat.v1.Session + :param input_op_names: Name of the starting op in the given graph or a list of names in case of multi-input model + :param output_op_names: List of output op names of the model, used to help ConnectedGraph determine valid ops + (to ignore training ops for example). If None, all ops in the model are considered valid. + :return: A new session with edited graph and a list of pairs of layers [(Conv/Linear, BN layer that got folded)] + + """ + # check for valid types + if not isinstance(input_op_names, (str, List)): + logger.error('start op names must be passed as a string or a List of strings') + + # if passed start op name is only a string - create a list for connected graph + if isinstance(input_op_names, str): + input_op_names = [input_op_names] + + # if passed output op name is only a string - create a list for connected graph + if isinstance(output_op_names, str): + output_op_names = [output_op_names] + + bn_conv_linear_pairs, bns_to_fold = find_all_batch_norms_to_fold(sess, input_op_names, output_op_names) + + after_fold_sess = _fold_given_auto_selected_batch_norms(sess, bn_conv_linear_pairs) + + # When returning the pairs, we want the second element of the pair to be the BN + pairs_to_return = [] + + # tf.Operation type conv , pair[1] nis of type OpWithMetaInfoType + # bn op is stored as OpWithMetaInfoType, get the op from it. + # pair[0] is always conv op and bn op is pair[1] + for pair in bn_conv_linear_pairs: + pairs_to_return.append((pair[0], pair[1].op)) + + # Convert the standalone BNs which are not folded + bn_converted = convert_standalone_batchnorms(after_fold_sess, input_op_names, output_op_names, bns_to_fold) + if bn_converted: + logger.info("%d BatchNorms' weights got converted", len(bn_converted)) + + # we edited the graph, so we should load and save for the metagraph associated with the session to be updated + after_fold_sess = save_and_load_graph('./temp_bn_fold', after_fold_sess) + + return after_fold_sess, pairs_to_return
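+# Illustrative sketch (not part of the AIMET module above) of the common
+# "fold everything" flow using fold_all_batch_norms. `sess` and the op names
+# are placeholders for the user's own graph.
+def _example_fold_all_batch_norms(sess: tf.compat.v1.Session):
+    folded_sess, folded_pairs = fold_all_batch_norms(sess,
+                                                     input_op_names='input_1',
+                                                     output_op_names='logits/Softmax')
+    for conv_op, bn_op in folded_pairs:
+        print(f'{bn_op.name} was folded into {conv_op.name}')
+    return folded_sess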
+ + +def convert_standalone_batchnorms(sess, input_op_names: Union[str, List[str]], + output_op_names: Union[str, List[str]], bns_folded: List) -> List[tf.Operation]: + + """ + Converts the weights of standalone batch norms remaining in the model after BN folding. + + :param sess: TF session in which the graph is loaded + :param input_op_names: Name of the starting op in the given graph or a list of names in case of multi-input model + :param output_op_names: List of output op names of the model, used to help ConnectedGraph determine valid ops + (to ignore training ops for example). If None, all ops in the model are considered valid. + :param bns_folded: list of batch norms which got folded + :return: list of BatchNorms whose weights is converted + """ + + list_of_ordered_ops = get_ordered_ops(sess.graph, input_op_names, output_op_names) + + converted_bns = [] + # look for bn layers which are not folded + for op in list_of_ordered_ops: + if op.type in ['FusedBatchNormV3', 'FusedBatchNorm', 'BatchNormalization'] and op not in bns_folded: + convert_batchnorm_parameters(sess, op) + converted_bns.append(op) + logger.debug("%s weights got converted", op) + return converted_bns + + +def convert_batchnorm_parameters(sess, op): + """ + Convert the weights of BN such that it works as y = weights * x + bias + + :param sess: TF Session in which the graph is loaded + :param op: bn_op which whose weights need to be converted + """ + bn_params = _get_bn_params(sess, op) + weight = np.array(bn_params.gamma) / np.array(bn_params.runningVar) + bias = np.array(bn_params.beta) - np.array(bn_params.runningMean) * weight + BNUtils.modify_bn_params_to_weight_bias_form(sess, op, weight, bias) + + +
[docs]def fold_all_batch_norms_to_scale(sim: QuantizationSimModel, + starting_op_names: List[str], + output_op_names: List[str]): + """ + Fold all batch_norm layers in a model into the quantization scale parameter + of the corresponding conv layers + + :param sim: tf quantized model + :param starting_op_names: List of starting op names of the model + :param output_op_names: List of output op names of the model + """ + assert sim.session is not None + assert sim.connected_graph is not None + + connected_graph = sim.connected_graph + bn_conv_linear_pairs, _ = _find_all_batch_norms_to_fold(connected_graph, starting_op_names, output_op_names) + _fold_given_auto_selected_batch_norms_scale(sim, bn_conv_linear_pairs)
+ + +def _fold_given_auto_selected_batch_norms_scale(sim: QuantizationSimModel, layer_pairs: List[PairType]): + """ + Fold a given set of batch_norm layers into conv layers. + + NOTE: Need to retrieve operation(s) by name since TensorFlow graph associated with Connected graph + and sim.session are different (after save and load step). + + :param sim: QuantizationSimModel object. + :param layer_pairs pairs of conv and bn layers. + """ + sess = sim.session + with sess.graph.as_default(): + for pair in layer_pairs: + conv_linear, bn, fold_backward = pair + + bn_tf_op = sess.graph.get_operation_by_name(bn.op.name) + assert bn_tf_op.type in ['FusedBatchNormV3', 'Identity'], "Only Fused BN is supported." + bn_params = _get_bn_params(sess, bn_tf_op) + + conv_linear_tf_op = sess.graph.get_operation_by_name(conv_linear.name) + is_bias_valid = False + if not BiasUtils.is_bias_none(conv_linear_tf_op): + is_bias_valid = True + + # _fold_to_weight() using FP32 weights and bias (if exists). + weight_tensor = _get_weight_tensor_transpose_reshape(sess, conv_linear_tf_op) + bias_tensor = _get_bias_tensor(sess, conv_linear_tf_op) + bias = libpymo.fold(bn_params, weight_tensor, bias_tensor, is_bias_valid, fold_backward) + # converting back to TF format [kh, kw, Nic, Noc] before updating weight tensor value + if conv_linear_tf_op.type == 'DepthwiseConv2dNative': + # Depthwise conv layers in TF have outputs(Noc) set to 1. + # we send in format [Nic, Noc, kh, kw] + numpy_weight_reshaped = np.reshape(weight_tensor.data, weight_tensor.shape).transpose((2, 3, 0, 1)) + elif conv_linear_tf_op.type == 'MatMul': + # o, i - convert to i, o + numpy_weight_reshaped = np.reshape(weight_tensor.data, + [weight_tensor.shape[0], weight_tensor.shape[1]]).transpose((1, 0)) + else: + # conv2D case + # we sent in format [Noc, Nic, kh, kw] + numpy_weight_reshaped = np.reshape(weight_tensor.data, weight_tensor.shape).transpose((2, 3, 1, 0)) + WeightTensorUtils.update_tensor_for_op(sess, conv_linear_tf_op, numpy_weight_reshaped) + BiasUtils.update_bias_for_op(sess, conv_linear_tf_op, np.reshape(bias, [weight_tensor.shape[0]])) + + # fold to scale + conv_linear_w_quantizer, conv_linear_a_quantizer, bn_a_quantizer = \ + _find_quantizers(sim, conv_linear_tf_op, bn_tf_op, is_bias_valid) + _fold_pair_scale(conv_linear_w_quantizer, conv_linear_a_quantizer, bn_a_quantizer, bn_params) + + # remove bn op + _delete_bn_from_model(sess, bn, is_bias_valid) + + # we edited the graph, so we should load and save for the metagraph associated with the session to be updated + updated_sess = save_and_load_graph('./temp_bn_fold_to_scale', sess) + sim.session = updated_sess + + +def _delete_bn_from_model(sess: tf.compat.v1.Session, + bn_op: OpWithMetaInfoType, + is_bias_valid: bool): + """ + Delete BN and BN_quantized ops from the session.graph. + If BN's previous conv doesn't have bias, is_bias_valid must + be False. In that case, need to find the correct BN's input tensor. + + Note: supports only Fused BN op types (FusedBatchNormV3, Identity). + + :param sess: TensorFlow session. + :param bn_op: BN op with meta info. + :param is_bias_valid: False if BN's preceding Conv doesn't have bias, True otherwise. + """ + bn_tf_op = sess.graph.get_operation_by_name(bn_op.op.name) + bn_in_tensor = sess.graph.get_tensor_by_name(bn_op.in_tensor.name) + bn_out_tensor = sess.graph.get_tensor_by_name(bn_op.out_tensor.name) + + # Find BNs correct input tensor. 
+ if not is_bias_valid: + # bias was not present and was added between conv and quant op + bn_in_tensor = bn_in_tensor.consumers()[0].outputs[0].consumers()[0].outputs[0] + else: + bn_in_tensor = bn_in_tensor.consumers()[0].outputs[0] + assert bn_in_tensor.op.type == 'QcQuantize', 'BNs preceding op must be of type QcQuantize.' + + # Find BNs correct output tensor. + bn_out_tensor = bn_out_tensor.consumers()[0].outputs[0] + assert bn_out_tensor.op.type == 'QcQuantize', 'BNs output op must be of type QcQuantize.' + + # Detach BN and following BN_quantized ops from the graph. + BNUtils.skip_bn_op(sess, bn_tf_op, bn_in_tensor, bn_out_tensor) + + +def _fold_pair_scale(conv_linear_w_quantizer: QuantizerInfo, + conv_linear_a_quantizer: QuantizerInfo, + bn_a_quantizer: QuantizerInfo, + bn_params: libpymo.BNParams): + """ + Fold a batch_norm layer into conv_linear's scale + + :param conv_linear_w_quantizer: conv or Linear op weight quantizer. + :param conv_linear_a_quantizer: conv or Linear op activation quantizer + :param bn_a_quantizer: BN op activation quantizer + :param bn_params: bn_params + """ + if all(quantizer is None for quantizer in [conv_linear_w_quantizer, conv_linear_a_quantizer, bn_a_quantizer]): + raise RuntimeError + + encodings = conv_linear_w_quantizer.get_encoding() + if encodings is None: + raise RuntimeError + + if isinstance(encodings, libpymo.TfEncoding): + encodings = [encodings] + + gamma = np.array(bn_params.gamma) + sigma = np.array(bn_params.runningVar) + + new_encodings = [] + for old_encoding, c in zip(encodings, gamma/sigma): + new_encoding = libpymo.TfEncoding() + new_encoding.bw = old_encoding.bw + new_encoding.offset = old_encoding.offset + new_encoding.delta = old_encoding.delta * abs(c) + new_encoding.min, new_encoding.max = \ + compute_min_max_given_delta_offset(new_encoding.delta, + new_encoding.offset, + new_encoding.bw, + conv_linear_w_quantizer.use_symmetric_encoding, + conv_linear_w_quantizer.use_strict_symmetric) + new_encodings.append(new_encoding) + + conv_linear_w_quantizer.set_encoding(new_encodings) + + # Copy batchnorm's output quantizers to conv output quantizers + conv_linear_a_quantizer.enabled = bn_a_quantizer.enabled + + if bn_a_quantizer.get_encoding() is not None: + encoding = libpymo.TfEncoding() + bn_encoding = bn_a_quantizer.get_encoding() + encoding.delta = bn_encoding.delta + encoding.max = bn_encoding.max + encoding.min = bn_encoding.min + encoding.offset = bn_encoding.offset + encoding.bw = bn_encoding.bw + conv_linear_a_quantizer.set_op_mode(int(libpymo.TensorQuantizerOpMode.quantizeDequantize)) + conv_linear_a_quantizer.set_encoding(encoding) + + bn_a_quantizer.enabled = False + + +def _find_all_batch_norms_to_fold(conn_graph: ConnectedGraph, + start_op_names: List[str], + output_op_names: List[str], + return_bn_conn_op: bool = False) -> Tuple[List, Set]: + """ + Find all possible batch norm layers that can be folded. And returns a list of pairs such that (bn, layer) + means bn will be forward-folded into layer and (layer, bn) means bn will be backward-folded into layer + + :param conn_graph: Connected graph associated with the model. + :param start_op_names: List of starting op names of the model + :param output_op_names: List of output op names of the model + :param return_bn_conn_op: Return bn op as connected graph op instead of tf tensor if True. + :return: A list of (layer, bn) pairs and a list of (bn, layer) pairs, + where `bn` can be folded into to `layer', + A set of bn ops which can be folded. 
+ """ + conv_linear_bn_activation_info_dict = _find_conv_bn_pairs(conn_graph) + + # get all ordered conv/linear ops + ordered_conv_linear_op = get_ordered_conv_linears(conn_graph.graph, start_op_names, output_op_names) + + # get the in out tensor for bns found, we need this on TF to remove the bns after fold. + bn_conv_linear_pairs = [] + + # track BNs added for fold + bn_picked_for_folding = set() + + for conv_linear_op in ordered_conv_linear_op: + if conv_linear_op in conv_linear_bn_activation_info_dict.keys(): + bn_info = conv_linear_bn_activation_info_dict[conv_linear_op] + if bn_info.output_bn: + if bn_info.output_bn not in bn_picked_for_folding: + fold_backward = True + if return_bn_conn_op: + bn_conv_linear_pairs.append((conv_linear_op, bn_info.output_bn, fold_backward)) + else: + bn_conv_linear_pairs.append((conv_linear_op, bn_info.output_bn.get_tf_op_with_io_tensor(), + fold_backward)) + bn_picked_for_folding.add(bn_info.output_bn) + elif bn_info.input_bn: + if bn_info.input_bn not in bn_picked_for_folding: + fold_backward = False + if return_bn_conn_op: + bn_conv_linear_pairs.append((conv_linear_op, bn_info.input_bn, fold_backward)) + else: + bn_conv_linear_pairs.append((conv_linear_op, bn_info.input_bn.get_tf_op_with_io_tensor(), + fold_backward)) + bn_picked_for_folding.add(bn_info.input_bn) + return bn_conv_linear_pairs, bn_picked_for_folding + + +def _find_quantizers(sim: QuantizationSimModel, + conv_linear_tf_op: tf.Operation, + bn_tf_op: tf.Operation, + is_bias_valid: bool) -> Tuple[QuantizerInfo, QuantizerInfo, QuantizerInfo]: + """ + Find quantizers. + + :param sim: QuantizationSimModel object + :param conv_linear_tf_op: Conv/Linear tf operation. + :param bn_tf_op: BN tf operation + :param is_bias_valid: is bias valid. + :return: conv/linear weight quantizer, conv/linear activation quantizer, bn activation quantizer. + """ + if is_bias_valid: + bias_add_op = conv_linear_tf_op.outputs[0].consumers()[0] + assert bias_add_op.type == 'BiasAdd' + conv_linear_a_quantizer_op = bias_add_op.outputs[0].consumers()[0] + assert conv_linear_a_quantizer_op.type == 'QcQuantize' + conv_linear_a_quantizer_name = conv_linear_a_quantizer_op.name + else: + next_op = conv_linear_tf_op.outputs[0].consumers()[0] + if next_op.type == 'BiasAdd': # did bias get added between conv -> quant op? + next_op = next_op.outputs[0].consumers()[0] + assert next_op.type == 'QcQuantize' + conv_linear_a_quantizer_name = next_op.name + + bn_a_quantizer_name = bn_tf_op.name + "_quantized" + conv_linear_w_quantizer_name = conv_linear_tf_op.inputs[1].op.inputs[0].op.name + "_quantized" + + conv_linear_w_quantizer = sim.quantizer_config(conv_linear_w_quantizer_name) + conv_linear_a_quantizer = sim.quantizer_config(conv_linear_a_quantizer_name) + bn_a_quantizer = sim.quantizer_config(bn_a_quantizer_name) + + return conv_linear_w_quantizer, conv_linear_a_quantizer, bn_a_quantizer +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/bias_correction.html b/releases/1.32.2/_modules/aimet_tensorflow/bias_correction.html new file mode 100644 index 00000000..bccf6e51 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/bias_correction.html @@ -0,0 +1,1704 @@ + + + + + + aimet_tensorflow.bias_correction — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + + +
  • +
  • +
+
+
+
+
+ +

Source code for aimet_tensorflow.bias_correction

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019-2021, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Code to perform bias correction for layers """
+
+from typing import List, Union, Tuple, Dict
+import numpy as np
+import tensorflow as tf
+
+import aimet_common.libpymo as libpymo
+from aimet_common.bias_correction import ConvBnInfoType
+from aimet_common.defs import ActivationType, QuantScheme
+from aimet_common.utils import AimetLogger
+from aimet_common.graph_searcher import GraphSearcher
+from aimet_common.bias_correction import ConvBnPatternHandler
+from aimet_common.graph_pattern_matcher import PatternType
+
+from aimet_tensorflow.quantsim import QuantizationSimModel
+from aimet_tensorflow.utils.graph_saver import save_model_to_meta, save_and_load_graph, load_model_from_meta
+from aimet_tensorflow.utils.common import create_input_feed_dict, iter_first_x, get_ordered_conv_linears
+from aimet_tensorflow.utils.op.fusedbatchnorm import BNUtils
+from aimet_tensorflow.utils.op.conv import get_weight_tensor_with_shape, BiasUtils
+from aimet_tensorflow.common.connectedgraph import ConnectedGraph
+
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+
+
+
[docs]class QuantParams: + """ + Quant Params to be passed in by user + + """ + + def __init__(self, + quant_mode='tf_enhanced', + round_mode='nearest', + use_cuda=True, + ops_to_ignore=None): + """ + Constructor + + :param quant_mode: Indicates which quantization algorithm should be used, either + 'tf' or 'tf_enhanced'. Defaults to 'tf_enhanced' + :param round_mode: The rounding scheme to be used. One of: 'nearest' or 'stochastic'. Default is 'nearest'. + :param use_cuda: flag to indicate if GPU is to be used + :param ops_to_ignore: ops to be ignored + """ + self.quant_mode = quant_mode + self.round_mode = round_mode + self.ops_to_ignore = ops_to_ignore + self.use_cuda = use_cuda
+ + +
[docs]class BiasCorrectionParams: + """ + Input for bias correction to be passed by the user + + :param batch_size: input batch size to be used + :param num_quant_samples: samples to be used for quantization + :param num_bias_correct_samples: samples to be used for bias correction + :param input_op_names: list of input op names of the given model + :param output_op_names: list of output op names of the given model + + """ + + def __init__(self, + batch_size: int, + num_quant_samples: int, + num_bias_correct_samples: int, + input_op_names: List[str], + output_op_names: List[str]): + + self.batch_size = batch_size + self.num_quant_samples = num_quant_samples + self.num_bias_correct_samples = num_bias_correct_samples + self.input_op_names = input_op_names + self.output_op_names = output_op_names
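+# Illustrative sketch (not part of the AIMET module above) showing how the two
+# parameter objects defined above might be constructed. The sample counts and
+# op names are illustrative assumptions, not recommended values.
+def _example_bias_correction_params():
+    quant_params = QuantParams(quant_mode='tf_enhanced', round_mode='nearest', use_cuda=True)
+    bc_params = BiasCorrectionParams(batch_size=32,
+                                     num_quant_samples=512,
+                                     num_bias_correct_samples=1024,
+                                     input_op_names=['input_1'],
+                                     output_op_names=['logits/Softmax'])
+    return quant_params, bc_params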
+ + +class BiasCorrection: + """ + class for bias correction in tensorflow + """ + + @staticmethod + def _get_output_data(sess: tf.compat.v1.Session, input_op_names: List[str], output_op_name: str, + batch_data: Union[np.ndarray, Tuple[np.ndarray], List[np.ndarray]]) -> np.ndarray: + """ + Function to get output values of a layer + :param sess: tf.compat.v1.Session containing the layer to evaluate + :param input_op_names: List of names of input ops to the session graph + :param output_op_name: Name of the output layer to evaluate + :param batch_data: Batch of data to feed into model input + :return: Output of layer for all batches of images + """ + + feed_dict = create_input_feed_dict(sess.graph, input_op_names, batch_data) + tf_op = sess.graph.get_operation_by_name(output_op_name) + assert tf_op.outputs + assert tf_op.outputs[0].consumers() + assert tf_op.outputs[0].consumers()[0].outputs + biasadd_tensor = tf_op.outputs[0].consumers()[0].outputs[0] # Replace with a get BiasAdd utils later + output_data = sess.run(biasadd_tensor, feed_dict=feed_dict) + return output_data + + @staticmethod + def _call_mo_correct_bias(corrected_model: tf.compat.v1.Session, layer_name: str, + bias_correction: libpymo.BiasCorrection, + bias_shape: int, is_bias_none: bool): + """ + helper to perform bias correction using cpp backend + :param corrected_model: active tensorflow session with corrected model as tf.compat.v1.Session + :param layer_name: name of the layer to be bias corrected + :param bias_correction: bias correction inputs + :param bias_shape: shape of bias associated with the layer + :param is_bias_none: True if bias for a layer is None + :return: None, updates bias for the given layer + """ + + bias_tensor = libpymo.TensorParamBiasCorrection() + + layer_to_be_corrected = corrected_model.graph.get_operation_by_name(layer_name) + + with corrected_model.graph.as_default(): + assert(layer_to_be_corrected.type in ['Conv2D', 'DepthwiseConv2dNative', 'MatMul']) + + if is_bias_none: + bias_tensor.data = np.zeros(bias_shape) + else: + # read bias from given op + bias_tensor.data = BiasUtils.get_bias_as_numpy_data(corrected_model, layer_to_be_corrected) + + # perform bias correction + bias_correction.correctBias(bias_tensor) + + # this api updates bias or adds bias add to layer if not present + BiasUtils.update_bias_for_quantized_op(corrected_model, layer_to_be_corrected, np.array(bias_tensor.data), + is_bias_none) + + @staticmethod + def _get_quantized_model(corrected_model: tf.compat.v1.Session, quant_params: QuantParams, input_op_names: List[str], + output_op_names: List[str], num_quant_samples: int, batch_size: int, + data_set: tf.data.Dataset) -> QuantizationSimModel: + """ + api to get quantized session + :param corrected_model: active tensorflow session with corrected model as tf.compat.v1.Session + :param quant_params: quantization params from user + :param input_op_names: names of the input nodes of the given model + :param output_op_names: names of the output nodes of the given model + :param num_quant_samples: number of dataset samples to use during quantization + :param batch_size: batch size to use for dataset samples + :return: quantized sim model + """ + + def bias_correction_callback(session: tf.compat.v1.Session, iterations: int): + dataset_samples_quant_itr = iter_first_x(data_set, iterations) + output_ops = [] + for output_op_name in output_op_names: + output_ops.append(session.graph.get_operation_by_name(output_op_name)) + for data in dataset_samples_quant_itr: + feed_dict = 
create_input_feed_dict(session.graph, input_op_names, data) + for output_op in output_ops: + output_op.outputs[0].eval(session=session, feed_dict=feed_dict) + + save_model_to_meta(corrected_model, './bias_correction/temp') + + # Allocate the quantizer and quantize the network using the default 8 bit params/activations + quantsim = QuantizationSimModel(corrected_model, input_op_names, output_op_names, + quant_params.quant_mode, quant_params.round_mode) + + # Disable all output quantizers + # pylint:disable = protected-access + for quantize_op in quantsim._activation_quantizers: + if quantsim._activation_quantizers[quantize_op].enabled: + quantsim._activation_quantizers[quantize_op].enabled = False + + n_batches_quantization = int(np.ceil(num_quant_samples / batch_size)) + quantsim.compute_encodings(bias_correction_callback, forward_pass_callback_args=n_batches_quantization) + + return quantsim + + + # pylint: disable=too-many-locals + @staticmethod + def bias_correction_per_layer(reference_model: tf.compat.v1.Session, + corrected_model: tf.compat.v1.Session, + bias_correct_params: BiasCorrectionParams, + layer_name_to_be_corrected: str, + data_set: tf.data.Dataset) -> tf.compat.v1.Session: + """ + Helper function to perform empirical bias correction per layer. + + :param reference_model: active tensorflow session for reference model + :param corrected_model: active tensorflow session for corrected model + :param bias_correct_params: bias correction params + :param layer_name_to_be_corrected: name of layer on which bias correction is to be performed + :param quant_params: Quantization specific params from user + :return: None, updates corrected model in-place. + + """ + + ref_layer = reference_model.graph.get_operation_by_name(layer_name_to_be_corrected) + + bias_correction = libpymo.BiasCorrection() + logger.info('Correcting layer %s', ref_layer.name) + + n_batches_bias_correction = int(np.ceil(bias_correct_params.num_bias_correct_samples / + bias_correct_params.batch_size)) + + reduced_dataset_iter = iter_first_x(data_set, n_batches_bias_correction) + + for batch_input in reduced_dataset_iter: + # reference model without corrected nodes + reference_output_batch = BiasCorrection._get_output_data(reference_model, + bias_correct_params.input_op_names, + ref_layer.name, + batch_input) + + quantized_model_output_batch = BiasCorrection._get_output_data(corrected_model, + bias_correct_params.input_op_names, + ref_layer.name, + batch_input) + + + + if ref_layer.type == 'MatMul': + extended_shape = np.concatenate((reference_output_batch.shape, np.array([1, 1]))) + reference_output_batch = reference_output_batch.reshape(extended_shape) + quantized_model_output_batch = quantized_model_output_batch.reshape(extended_shape) + + # we need to reshape from tensorflow shape NxHxWxC to NxCxHxW + bias_correction.storePreActivationOutput(np.ascontiguousarray(reference_output_batch.transpose(0, 3, 1, 2))) + bias_correction.storeQuantizedPreActivationOutput(np.ascontiguousarray( + quantized_model_output_batch.transpose(0, 3, 1, 2))) + + bias_shape = None + is_bias_none = False + # get shape for bias if the layer does not have bias + if BiasUtils.is_bias_none(ref_layer): + is_bias_none = True + if ref_layer.type == 'MatMul': + bias_shape = reference_output_batch.shape[1] + elif ref_layer.type in ['Conv2D', 'DepthwiseConv2dNative']: + # for conv2d or depthwise conv2d + bias_shape = reference_output_batch.shape[3] + + # bias is to be corrected in the corrected model graph + 
BiasCorrection._call_mo_correct_bias(corrected_model, ref_layer.name, bias_correction, bias_shape, + is_bias_none) + + logger.info('Completed empirical bias correction for layer %s', ref_layer.name) + + @staticmethod + def _get_quantized_weights(weight_tensor, quant_params): + """ + helper function to get quantized dequantized weights + :param weight_tensor: weight tensor + :param quant_params: quantization params such as mode, rounding etc + :return: quantized de-quantized weight tensor + """ + + q_wt_tensor = weight_tensor + + quant_mode = libpymo.QuantizationMode.QUANTIZATION_TF_ENHANCED + if quant_params.quant_mode == QuantScheme.post_training_tf or quant_params.quant_mode == 'tf': + quant_mode = libpymo.QuantizationMode.QUANTIZATION_TF + + round_mode = libpymo.RoundingMode.ROUND_NEAREST + if quant_params.round_mode == 'stochastic': + round_mode = libpymo.RoundingMode.ROUND_STOCHASTIC + + bitwidth = 8 + + # use tensorQuantizerForPython to get quantizeDequantize weights + encoding_analyzer = libpymo.EncodingAnalyzerForPython(quant_mode) + encoding_analyzer.updateStats(weight_tensor, quant_params.use_cuda) + encoding, is_encoding_valid = encoding_analyzer.computeEncoding(bitwidth, False, False, False) + + if is_encoding_valid: + tensor_quantizer = libpymo.TensorQuantizationSimForPython() + q_wt_tensor = tensor_quantizer.quantizeDequantize(weight_tensor, encoding, round_mode, quant_params.use_cuda) + + return q_wt_tensor + + + @staticmethod + def _get_conv_linear_params(model, layer_to_be_corrected): + """ + Extract weights and bias of given conv/linear layer + :param model: tf.compat.v1.Session type + :param layer_to_be_corrected: conv/linear layer as tf.Operation + :return: bias, weight and quantized weights as TensorParamBiasCorrection types + """ + + bias_tensor = libpymo.TensorParamBiasCorrection() + + # get weight tensor + weight_tensor, _ = get_weight_tensor_with_shape(model, layer_to_be_corrected) + + if weight_tensor is None: + logger.error('Weight tensor extraction failed for layer {%s}', layer_to_be_corrected.name) + + bias_tensor.data = BiasUtils.get_bias_as_numpy_data(model, layer_to_be_corrected) + bias_tensor.shape = BiasUtils.get_shape(layer_to_be_corrected) + + return bias_tensor, weight_tensor + + @staticmethod + def _get_bn_params(model, bn_layer) -> libpymo.BnParamsBiasCorr(): + """ + get bn params for bn based bias correction + :param model: tf.compat.v1.Session type + :param bn_layer: tf.Operation type + :return: bn params as libpymo.BnParamsBiasCorr() type + """ + + bn_params = libpymo.BnParamsBiasCorr() + bn_params.beta = BNUtils.get_beta_as_numpy_data(model, bn_layer).reshape(-1) + bn_params.gamma = BNUtils.get_gamma_as_numpy_data(model, bn_layer).reshape(-1) + + return bn_params + + @staticmethod + def analytical_bias_correction_per_layer(corrected_model: tf.compat.v1.Session, layer: tf.Operation, + preceeding_bn_layer_info: ConvBnInfoType, quant_params: QuantParams, + is_first_conv: bool = False) -> tf.compat.v1.Session: + """ + Perform bn based bias correction (analytical bc). 
+ + :param corrected_model: active tensorflow session for corrected model + :param layer: conv/linear layer to be corrected + :param preceeding_bn_layer_info: corresponding preceeding bn/ activation info + :param quant_params: Quantization specific params from user + :param is_first_conv: flag to indicate if it's the first conv layer + :return: None, updates corrected_model in place + + """ + + layer = corrected_model.graph.get_operation_by_name(layer.name) + # get bn param and quantized weights from conv for this layer + bias_tensor, weight_tensor = BiasCorrection._get_conv_linear_params(corrected_model, layer) + quantized_weight = BiasCorrection._get_quantized_weights(weight_tensor, quant_params) + + bn_params = libpymo.BnParamsBiasCorr() + activation_type = libpymo.ActivationType.noActivation + + if preceeding_bn_layer_info: + input_tf_bn_op_name = preceeding_bn_layer_info.input_bn.get_module().name + bn_op = corrected_model.graph.get_operation_by_name(input_tf_bn_op_name) + bn_params = BiasCorrection._get_bn_params(corrected_model, bn_op) + if preceeding_bn_layer_info.in_activation_type == ActivationType.relu: + activation_type = libpymo.ActivationType.relu + elif preceeding_bn_layer_info.in_activation_type == ActivationType.relu6: + activation_type = libpymo.ActivationType.relu6 + elif preceeding_bn_layer_info.in_activation_type == ActivationType.no_activation: + activation_type = libpymo.ActivationType.noActivation + else: + assert(0, 'Unknown activation type', preceeding_bn_layer_info.in_activation_type) + else: + if is_first_conv: + # for the first conv layer case, we use gamma = 1 and beta = 0 + shape = weight_tensor.shape[1] + bn_params.gamma = np.ones(shape) + bn_params.beta = np.zeros(shape) + else: + assert 0, "layer info is None and is not first conv layer" + + # need to invoke cpp api for bn based bias correction + biasCorrection = libpymo.BnBasedBiasCorrection() + + biasCorrection.correctBias(bias_tensor, quantized_weight, weight_tensor, bn_params, activation_type) + + # this api updates bias or adds bias add to layer if not present + layer = corrected_model.graph.get_operation_by_name(layer.name) + BiasUtils.update_bias_for_quantized_op(corrected_model, layer, np.array(bias_tensor.data)) + logger.info('Completed analytical bias correction for layer %s', layer.name) + + @staticmethod + def _conv_bn_select_custom_pattern_init(): + """ + initialize the patterns we want to use to pick layers for bn based bias correction. + :return: patterns and associated actions to be performed upon match + """ + + patterns_with_callbacks = [] + + # the types we want to handle + conv_layer_types = ['Conv2D', 'DepthwiseConv2dNative'] + activation_types = ['Relu', 'Relu6'] + + # add the patterns we are interested in along with a handler + layer_select_handler = ConvBnPatternHandler() + + # conv layer combinations + for conv in conv_layer_types: + + for activation in activation_types: + patterns_with_callbacks.append(PatternType(pattern=['FusedBatchNormV3', activation, conv], + action=layer_select_handler)) + + patterns_with_callbacks.append(PatternType(pattern=['FusedBatchNormV3', conv], + action=layer_select_handler)) + + return patterns_with_callbacks, layer_select_handler + + @staticmethod + def find_all_convs_bn_with_activation(model, start_op_names: Union[List[str], str], + output_op_names: Union[List[str], str]): + """ + uses searcher to choose convs/ linears with bn and activation info. 
+ :param model: tf.compat.v1.Session type + :param start_op_names: list of strings with names of starting ops in the model + :param output_op_names: List of output op names of the model, used to help ConnectedGraph determine valid ops + (to ignore training ops for example). + :return: dictionary of conv/linear layers with associated bn op / activation info + """ + + if isinstance(start_op_names, str): + start_op_names = [start_op_names] + + if isinstance(output_op_names, str): + output_op_names = [output_op_names] + + conn_graph = ConnectedGraph(model.graph, start_op_names, output_op_names) + + # create a list of patterns and corresponding handlers or actions to be applied for selecting + # layers for bias correction. + # layer_select_handler is an instance of custom handler created for bias correction. + patterns_with_callback, layer_select_handler = BiasCorrection._conv_bn_select_custom_pattern_init() + + # graph searcher looks for patterns and applies actions when matching patterns are found + graph_searcher = GraphSearcher(conn_graph, patterns_with_callback) + graph_searcher.find_all_patterns_in_graph_apply_actions() + + # use custom handler instance and fetch the selected layer info for bias correction + convs_bn_activation_info_dict = layer_select_handler.get_conv_linear_bn_info_dict() + + return convs_bn_activation_info_dict + + @staticmethod + def refresh_op_ref(sess, conv_bn_dict): + """ + Updates the conv op references saved in user passed in conv bn dictionary. + + :param reference_model: active tf.compat.v1.Session for the model. + :param conv_bn_dict: Dict of conv and bn with activation info + :return: dict of conv and bn with updated conv references + + """ + conv_linears_with_bn_dict = {} + for conv in conv_bn_dict.keys(): + refreshed_conv = sess.graph.get_operation_by_name(conv.name) + bn_activation_info = conv_bn_dict[conv] + conv_linears_with_bn_dict[refreshed_conv] = bn_activation_info + + return conv_linears_with_bn_dict + + @staticmethod + def correct_bias(reference_model: tf.compat.v1.Session, bias_correct_params: BiasCorrectionParams, + quant_params: QuantParams, data_set: tf.data.Dataset, + conv_bn_dict: Union[Dict[tf.Operation, ConvBnInfoType], None] = None, + perform_only_empirical_bias_corr: bool = True): + """ + Top level function for bias correction + + :param reference_model: active tf.compat.v1.Session for the model to be corrected. + :param bias_correct_params: input params for bias correction + :param quant_params: QuantParams type with params for quantization simulation for bias correction. + :param data_set: input data set + :param conv_bn_dict: Dict of conv and bn with activation info. If None, the function looks for it. + This can be obtained on the model with bns and convs using + BiasCorrection.find_all_convs_bn_with_activation() api. + :param perform_only_empirical_bias_corr: a flag to indicate only empirical bias correction is to be performed. 
+ :return: updated session with corrected bias for given ops + + """ + + # one time initialization of all layers with bias param + reference_model = BiasUtils.initialize_model_with_bias(reference_model, bias_correct_params.input_op_names, + bias_correct_params.output_op_names) + + # Create a copy of the model as reference model + corrected_model = save_and_load_graph('./temp_meta_path', reference_model) + + # get all ordered convs/ linears and skip gradient ops + ordered_conv_linears = get_ordered_conv_linears(reference_model.graph, bias_correct_params.input_op_names, + bias_correct_params.output_op_names) + + # Get conv2D, depthwise with preceding BN ops info for analytical bias correction + # if user has not passed any dictionary + if conv_bn_dict is None: + convs_bn_activation_info_dict = BiasCorrection.find_all_convs_bn_with_activation(reference_model, + bias_correct_params.input_op_names, + bias_correct_params.output_op_names) + else: + convs_bn_activation_info_dict = BiasCorrection.refresh_op_ref(reference_model, conv_bn_dict) + + # Quantize model + quantsim = BiasCorrection._get_quantized_model(corrected_model, quant_params, + bias_correct_params.input_op_names, + bias_correct_params.output_op_names, + bias_correct_params.num_quant_samples, + bias_correct_params.batch_size, + data_set) + + # Perform analytical bias correction for first conv layer + # we always perform empirical bias correction for linear layers + if ordered_conv_linears: + if not perform_only_empirical_bias_corr and ordered_conv_linears[0].type not in ['MatMul']: + first_conv = ordered_conv_linears.pop(0) + BiasCorrection.analytical_bias_correction_per_layer(quantsim.session, + first_conv, + None, + quant_params, + is_first_conv=True) + + # for each candidate layer in an ordered list of conv/lieanr ops + # find the corresponding bn and activation info + for layer in ordered_conv_linears: + + # if this layer is in selected patterns of convs with preceding BN op and + # if empirical flag is false + # perform analytical Bias correction + if layer in convs_bn_activation_info_dict.keys() and not perform_only_empirical_bias_corr: + + preceding_bn_layer_info = convs_bn_activation_info_dict[layer] + + BiasCorrection.analytical_bias_correction_per_layer(quantsim.session, + layer, + preceding_bn_layer_info, + quant_params) + else: + # stand-alone convs/ linears or when perform_only_empirical_bias_corr is set to True + # perform empirical bias correction + BiasCorrection.bias_correction_per_layer(reference_model, + quantsim.session, + bias_correct_params, + layer.name, + data_set) + logger.info('Completed bias correction') + # Remove quantization nodes and save bias correction model + # pylint:disable = protected-access + quantsim._remove_quantization_nodes_and_save_graph('./temp_meta_path', 'bias_corrected_model') + corrected_model = load_model_from_meta(meta_path=str('./temp_meta_path' + '/' + 'bias_corrected_model' + + '.meta')) + return corrected_model +
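+# Illustrative usage sketch (not part of the original module): an end-to-end bias correction
+# call. `sess`, `dataset`, `bc_params` and `quant_params` are assumed to be defined by the
+# caller (see QuantParams / BiasCorrectionParams above); the op names are hypothetical
+# placeholders. If conv_bn_dict is omitted, correct_bias() discovers the conv/BN pairs
+# itself via find_all_convs_bn_with_activation().
+#
+#     conv_bn_dict = BiasCorrection.find_all_convs_bn_with_activation(sess,
+#                                                                     ['input_1'],
+#                                                                     ['dense/Softmax'])
+#     corrected_sess = BiasCorrection.correct_bias(sess, bc_params, quant_params, dataset,
+#                                                  conv_bn_dict=conv_bn_dict,
+#                                                  perform_only_empirical_bias_corr=False)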
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/bn_reestimation.html b/releases/1.32.2/_modules/aimet_tensorflow/bn_reestimation.html new file mode 100644 index 00000000..8cc9bc35 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/bn_reestimation.html @@ -0,0 +1,1405 @@ + + + + + + aimet_tensorflow.bn_reestimation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + + +
  • +
  • +
+
+
+
+
+ +

Source code for aimet_tensorflow.bn_reestimation

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+"""BatchNorm Re-estimation"""
+
+from typing import List, Tuple, Dict
+import numpy as np
+import tensorflow as tf
+
+from aimet_common.utils import Handle, AimetLogger
+from aimet_tensorflow.utils.op.fusedbatchnorm import BNUtils
+from aimet_tensorflow.common.graph_eval import initialize_uninitialized_vars
+from aimet_tensorflow.quantsim import QuantizationSimModel
+from aimet_tensorflow.utils.common import create_input_feed_dict, iterate_tf_dataset
+from aimet_tensorflow.utils.op.bn_mutable import get_active_bn_ops
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+
+# pylint: disable=too-many-locals
+def _get_all_tf_bn_vars_list(sim: QuantizationSimModel) -> Tuple[List[tf.Variable],
+                                                                 List[tf.Variable],
+                                                                 List[tf.Variable],
+                                                                 List[tf.Variable]]:
+    """
+    Find tf variable lists to access BNs' mean, variance, momentum and is_training
+
+    :param sim: tf quantized model
+    :return: tf.Variable lists to access BN layers' mean, var, momentum, is_training
+    """
+    conn_graph = sim.connected_graph
+    bn_conn_graph_ops = tuple(get_active_bn_ops(conn_graph))
+
+    with sim.session.graph.as_default():
+        tf_global_vars = tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.GLOBAL_VARIABLES)
+
+    mean_tf_var_names = []
+    variance_tf_var_names = []
+    is_training_tf_var_names = []
+    momentum_tf_var_names = []
+
+    for bn_conn_graph_op in bn_conn_graph_ops:
+        tf_op = bn_conn_graph_op.internal_ops[0]
+        assert tf_op.type in ['Identity'], 'Only fused Batch Norm with training tensor is supported.'
+        bn_mean_tf_var_name = tf_op.inputs[0].op.inputs[3].name
+        bn_var_tf_var_name = tf_op.inputs[0].op.inputs[4].name
+
+        bn_cond_1_tf_op = BNUtils.get_cond_1_identity_op(tf_op)
+        bn_momentum_tf_var_name = bn_cond_1_tf_op.inputs[0].op.inputs[1].name
+        bn_training_tf_var_name = tf_op.inputs[0].op.inputs[0].op.inputs[0].name
+
+        mean_tf_var_names.append(bn_mean_tf_var_name)
+        variance_tf_var_names.append(bn_var_tf_var_name)
+        momentum_tf_var_names.append(bn_momentum_tf_var_name)
+        is_training_tf_var_names.append(bn_training_tf_var_name)
+
+    mean_tf_vars = []
+    variance_tf_vars = []
+    is_training_tf_vars = []
+    momentum_tf_vars = []
+
+    for v in tf_global_vars:
+        if v.name in mean_tf_var_names:
+            mean_tf_vars.append(v)
+
+        if v.name in variance_tf_var_names:
+            variance_tf_vars.append(v)
+
+        if v.name in momentum_tf_var_names:
+            momentum_tf_vars.append(v)
+
+        if v.name in is_training_tf_var_names:
+            is_training_tf_vars.append(v)
+
+    return mean_tf_vars, variance_tf_vars, momentum_tf_vars, is_training_tf_vars
+
+
+def _reset_bn_stats(sess: tf.compat.v1.Session,
+                    bn_mean_checkpoints: Dict[tf.Variable, np.ndarray],
+                    bn_variance_checkpoints: Dict[tf.Variable, np.ndarray]) -> Handle:
+    """
+    Reset all BNs statistics to the initial values.
+
+    :param sess: tf session
+    :param bn_mean_checkpoints: Dict for original BN mean
+    :param bn_variance_checkpoints: Dict for original BN variance
+    :return: Handle that restores the original BN statistics upon handle.remove().
+    """
+    def cleanup():
+        """
+        Restore all BNs stats
+        """
+        with sess.graph.as_default():
+            sess.run([tf.compat.v1.assign(v, bn_mean_checkpoints[v]) for v in bn_mean_checkpoints])
+            sess.run([tf.compat.v1.assign(v, bn_variance_checkpoints[v]) for v in bn_variance_checkpoints])
+
+    try:
+        with sess.graph.as_default():
+            sess.run([tf.compat.v1.assign(v, np.zeros(v.shape, dtype=v.dtype.as_numpy_dtype))
+                      for v in bn_mean_checkpoints])
+            sess.run([tf.compat.v1.assign(v, np.ones(v.shape, dtype=v.dtype.as_numpy_dtype))
+                      for v in bn_variance_checkpoints])
+        return Handle(cleanup)
+    except:
+        cleanup()
+        raise
+
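+# Note on the Handle pattern used by the helpers in this module (a minimal sketch, not part
+# of the original module): each helper mutates session state and returns a Handle whose
+# remove() undoes the mutation, so callers can pair setup and restore explicitly.
+#
+#     handle = _reset_bn_stats(sess, mean_checkpoints, variance_checkpoints)
+#     try:
+#         ...  # forward passes that re-accumulate BN statistics
+#     finally:
+#         handle.remove()  # restores the BN mean/variance captured in the checkpoints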
+
+def _reset_momentum(sess: tf.compat.v1.Session,
+                    momentum_checkpoints: Dict[tf.Variable, np.float32]) -> Handle:
+    """
+    Set all BNs momentum to 0.0.
+
+    :param sess: tf session
+    :param momentum_checkpoints: Dict for original BN momentum[tf.Variable --> original_values]
+    :return: Handle that restores the original BN statistics upon handle.remove().
+    """
+    def cleanup():
+        """
+        Restore BNs momentum
+        """
+        with sess.graph.as_default():
+            sess.run([tf.compat.v1.assign(v, momentum_checkpoints[v]) for v in momentum_checkpoints])
+    try:
+        with sess.graph.as_default():
+            sess.run([tf.compat.v1.assign(v, 0.0) for v in momentum_checkpoints])
+        return Handle(cleanup)
+    except:
+        cleanup()
+        raise
+
+
+def _set_bn_in_train_mode(sess: tf.compat.v1.Session,
+                          is_training_checkpoints: Dict[tf.Variable, bool]) -> Handle:
+    """
+    Set BNs in training mode.
+
+    :param sess: tf session
+    :param is_training_checkpoints: Dict for original BNs is_training flag.
+    :return: Handle that sets all mutable BNs to eval mode upon handle.remove().
+    """
+    def cleanup():
+        """
+        Set all the BNs to eval mode.
+        """
+        with sess.graph.as_default():
+            sess.run([tf.compat.v1.assign(k, False) for k in is_training_checkpoints])
+
+    try:
+        # Set all the BNs to train mode
+        with sess.graph.as_default():
+            sess.run([tf.compat.v1.assign(k, True) for k in is_training_checkpoints])
+        return Handle(cleanup)
+    except:
+        cleanup()
+        raise
+
+
+def _get_tf_vars_and_orig_values(sim: QuantizationSimModel) -> Tuple[Dict[tf.Variable, np.ndarray],
+                                                                     Dict[tf.Variable, np.ndarray],
+                                                                     Dict[tf.Variable, np.float32],
+                                                                     Dict[tf.Variable, bool]]:
+    """
+    Save original values for all BNs' mean, variance, momentum and is_training tf Variables.
+
+    :param sim: QuantizationSimModel object.
+    :return: Dictionary [tf.Variable] --> original_value for all BNs mean, variance, momentum and is_training.
+    """
+    # setup tf variable list to access
+    mean_tf_vars, variance_tf_vars, momentum_tf_vars, is_training_tf_vars = _get_all_tf_bn_vars_list(sim)
+
+    with sim.session.graph.as_default():
+        mean_checkpoints = dict(zip(mean_tf_vars, sim.session.run(list(mean_tf_vars))))
+        variance_checkpoints = dict(zip(variance_tf_vars, sim.session.run(list(variance_tf_vars))))
+        momentum_checkpoints = dict(zip(momentum_tf_vars, sim.session.run(list(momentum_tf_vars))))
+        is_training_checkpoints = dict(zip(is_training_tf_vars, sim.session.run(list(is_training_tf_vars))))
+
+    return mean_checkpoints, variance_checkpoints, momentum_checkpoints, is_training_checkpoints
+
+
+DEFAULT_NUM_BATCHES = 100
+
+
+
[docs]def reestimate_bn_stats(sim: QuantizationSimModel, + start_op_names: List[str], + output_op_names: List[str], + dataset: tf.compat.v1.data.Dataset, + num_batches: int = DEFAULT_NUM_BATCHES) -> Handle: + """ + Reestimate BatchNorm statistics (running mean and var). + + :param sim: QuantizationSimModel object. + :param start_op_names: List of starting op names of the model + :param output_op_names: List of output op names of the model + :param dataset: Training dataset + :param num_batches: The number of batches to be used for reestimation + :returns: Handle that undos the effect of BN reestimation upon handle.remove() + """ + # setup tf variable list to access + mean_checkpoints, variance_checkpoints, momentum_checkpoints, is_training_checkpoints = \ + _get_tf_vars_and_orig_values(sim) + + sess = sim.session + # Set all the BNs in training mode + with _set_bn_in_train_mode(sess, is_training_checkpoints), _reset_momentum(sess, momentum_checkpoints): + handle = _reset_bn_stats(sess, mean_checkpoints, variance_checkpoints) + try: + with sess.graph.as_default(): + output_tensors = [sess.graph.get_tensor_by_name(name + ':0') for name in output_op_names] + update_ops = tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.UPDATE_OPS) + assert update_ops, "GraphKeys.UPDATE_OPS can not be empty." + + # GraphKeys.UPDATE_OPS is collection of moving mean and variance for BN layers. During training mode + # moving mean and variance need to be updated and added as a control dependency. + with tf.compat.v1.control_dependencies(update_ops): + output_tensors_dependencies = [] + for output_tensor in output_tensors: + output_tensor = tf.compat.v1.identity(output_tensor) + output_tensors_dependencies.append(output_tensor) + initialize_uninitialized_vars(sess) + + # BN statistics accumulation buffer + sum_mean = {v: np.zeros(v.shape, dtype=v.dtype.as_numpy_dtype) for v in mean_checkpoints} + sum_var = {v: np.zeros(v.shape, dtype=v.dtype.as_numpy_dtype) for v in variance_checkpoints} + + batches = 0 + iterator = iterate_tf_dataset(dataset) + for _ in range(num_batches): + try: + data = next(iterator) + batches += 1 + except StopIteration: + break + feed_dict = create_input_feed_dict(sess.graph, start_op_names, data) + sess.run(output_tensors_dependencies, feed_dict=feed_dict) + for v in mean_checkpoints: + sum_mean[v] += sess.run(v) + for v in variance_checkpoints: + sum_var[v] += sess.run(v) + + # Override BN stats with the reestimated stats. + with sess.graph.as_default(): + sess.run([tf.compat.v1.assign(v, sum_mean[v] / batches) for v in mean_checkpoints]) + sess.run([tf.compat.v1.assign(v, sum_var[v] / batches) for v in variance_checkpoints]) + + return handle + except: + handle.remove() + raise
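+
+# Illustrative usage sketch (not part of the original module): re-estimate BN statistics
+# on a QuantizationSimModel and later undo the re-estimation. `sim`, `train_dataset` and
+# the op names are hypothetical placeholders supplied by the caller.
+#
+#     handle = reestimate_bn_stats(sim,
+#                                  start_op_names=['input_1'],
+#                                  output_op_names=['dense/Softmax'],
+#                                  dataset=train_dataset,
+#                                  num_batches=100)
+#     # ... evaluate or export the sim model with the re-estimated statistics ...
+#     handle.remove()   # restores the BN statistics captured before re-estimation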
+
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/compress.html b/releases/1.32.2/_modules/aimet_tensorflow/compress.html new file mode 100644 index 00000000..e6011795 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/compress.html @@ -0,0 +1,1238 @@ + + + + + + aimet_tensorflow.compress — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_tensorflow.compress

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Top-level API for aimet compression library """
+
+from typing import Union, Tuple, List
+import tensorflow as tf
+
+from aimet_common.defs import CostMetric, CompressionScheme, EvalFunction, CompressionStats
+from aimet_common.bokeh_plots import BokehServerSession
+
+from aimet_tensorflow.utils.graph_saver import wrapper_func, save_and_load_graph
+from aimet_tensorflow.defs import SpatialSvdParameters, ChannelPruningParameters
+from aimet_tensorflow.compression_factory import CompressionFactory
+
+
+
[docs]class ModelCompressor: + """ aimet model compressor: Enables model compression using various schemes """ + + # pylint: disable=too-many-arguments + +
[docs] @staticmethod + def compress_model(sess: tf.compat.v1.Session, working_dir: str, eval_callback: EvalFunction, eval_iterations, + input_shape: Union[Tuple, List[Tuple]], + compress_scheme: CompressionScheme, cost_metric: CostMetric, + parameters: Union[SpatialSvdParameters, + ChannelPruningParameters], + trainer=None, visualization_url=None) -> Tuple[tf.compat.v1.Session, CompressionStats]: + """ + Compress a given model using the specified parameters + + :param sess: Model, represented by a tf.compat.v1.Session, to compress + :param working_dir: File path to save compressed TensorFlow meta file + :param eval_callback: Evaluation callback. Expected signature is evaluate(model, iterations, use_cuda). + Expected to return an accuracy metric. + :param eval_iterations: Iterations to run evaluation for + :param trainer: Training Class: Contains a callable, train_model, which takes model, layer which is being fine + tuned and an optional parameter train_flag as a parameter + None: If per layer fine tuning is not required while creating the final compressed model + :param input_shape: tuple or list of tuples of input shapes to the model (channels_last format) + :param compress_scheme: Compression scheme. See the enum for allowed values + :param cost_metric: Cost metric to use for the compression-ratio (either mac or memory) + :param parameters: Compression parameters specific to given compression scheme + :param trainer: Training function + None: If per layer fine tuning is not required while creating the final compressed model + :param visualization_url: url the user will need to input where visualizations will appear + :return: A tuple of the compressed model session, and compression statistics + """ + + # If no url is passed in, then do not create a bokeh server session + if not visualization_url: + bokeh_session = None + else: + # create a bokeh session to publish visualizations to the server document for compression + bokeh_session = BokehServerSession(url=visualization_url, session_id="compression") + + if parameters.multiplicity < 1: + raise ValueError('Rounding Multiplicity should be greater than 1') + + if compress_scheme == CompressionScheme.spatial_svd: + # wrapper_func saves and reloads the graph before evaluation + # In TF after making changes to the graph you must save and reload, then evaluate + eval_callback = wrapper_func(eval_callback) + + algo = CompressionFactory.create_spatial_svd_algo(sess, working_dir, eval_callback, eval_iterations, + input_shape, cost_metric, parameters, bokeh_session) + elif compress_scheme == CompressionScheme.channel_pruning: + algo = CompressionFactory.create_channel_pruning_algo(sess, working_dir, eval_callback, input_shape, + eval_iterations, cost_metric, parameters, + bokeh_session) + else: + raise ValueError("Compression scheme not supported: {}".format(compress_scheme)) + + compressed_layer_db, stats = algo.compress_model(cost_metric, trainer) + + # TODO: this is a temporary fix, needs to be resolved + # In TF after making changes to the graph you must save and reload, then evaluate + updated_model = save_and_load_graph('./saver', compressed_layer_db.model) + compressed_layer_db.model.close() + + return updated_model, stats
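+
+# Illustrative usage sketch (not part of the original module): compressing a session with
+# spatial SVD. `sess`, `evaluate_fn` and `svd_params` are assumed to be defined by the
+# caller (`svd_params` being a SpatialSvdParameters instance configured for the target
+# model); the input shape and iteration count are hypothetical placeholders.
+#
+#     compressed_sess, stats = ModelCompressor.compress_model(
+#         sess,
+#         working_dir='./compression',
+#         eval_callback=evaluate_fn,
+#         eval_iterations=10,
+#         input_shape=(1, 224, 224, 3),
+#         compress_scheme=CompressionScheme.spatial_svd,
+#         cost_metric=CostMetric.mac,
+#         parameters=svd_params)
+#     print(stats)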
+
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/cross_layer_equalization.html b/releases/1.32.2/_modules/aimet_tensorflow/cross_layer_equalization.html new file mode 100644 index 00000000..16e821de --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/cross_layer_equalization.html @@ -0,0 +1,1977 @@ + + + + + + aimet_tensorflow.cross_layer_equalization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + + +
  • +
  • +
+
+
+
+
+ +

Source code for aimet_tensorflow.cross_layer_equalization

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Auto Mode TF Cross Layer Equalization """
+
+from typing import Tuple, List, Union, Dict
+from enum import Enum
+import numpy as np
+import tensorflow as tf
+
+from aimet_common.utils import AimetLogger
+import aimet_common.libpymo as libpymo
+from aimet_tensorflow.common.connectedgraph import ConnectedGraph
+from aimet_tensorflow.common.operation import Op
+from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms
+from aimet_tensorflow.utils.graph_saver import save_and_load_graph
+from aimet_tensorflow.utils.op.conv import WeightTensorUtils, BiasUtils
+import aimet_tensorflow.utils.op.relu as ReluUtils
+from aimet_tensorflow.utils.op.fusedbatchnorm import BNUtils
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.CrosslayerEqualization)
+
+ScaleFactor = Union[np.ndarray, Tuple[np.ndarray]]
+
+ClsSet = Union[Tuple[tf.Operation, tf.Operation],
+               Tuple[tf.Operation, tf.Operation, tf.Operation]]
+
+# TODO below Enum is common with PyTorch impl. Move to a common file
+class ClsLayerType(Enum):
+    """Enum class to represent CLS layer types"""
+    Unsupported = 0
+    Conv = 1  # Overloaded for conv and ConvTranspose
+    DepthwiseConv = 2
+
+class GraphSearchUtils:
+
+    """ Implements graph search utils required by CLE feature"""
+
+    def __init__(self, model: tf.Graph, start_op_names: Union[str, List[str]], output_op_names: Union[str, List[str]]):
+        if isinstance(start_op_names, str):
+            start_op_names = [start_op_names]
+
+        if isinstance(output_op_names, str):
+            output_op_names = [output_op_names]
+
+        self._connected_graph = ConnectedGraph(model, start_op_names, output_op_names)
+
+    def find_and_replace_relu6_with_relu(self, sess: tf.compat.v1.Session) -> tf.compat.v1.Session:
+        """
+        finds and replaces Relu6 ops with Relu
+        :return: updated session
+        """
+        for op in self._connected_graph.get_all_ops().values():
+            if op.type in ['Relu6']:
+                # send the session here, so we make the update on sess.graph (active graph)
+                ReluUtils.replace_relu6_with_relu(sess, op.get_module())
+
+        # in the end update the session
+        after_relu_replace_sess = save_and_load_graph('./replace_relu6_with_relu', sess)
+
+        return after_relu_replace_sess
+
+    @staticmethod
+    def find_downstream_layer_groups_to_scale(op, layer_groups, visited_nodes, current_group=None):
+        """
+        Populates all the layer groups eligible for cross layer scaling
+        :param op: starting  op
+        :param layer_groups: layer_groups as empty list
+        :param visited_nodes: all the ops that have been visited
+        :param current_group: op groups
+        :return: None. Updates layer_groups[] if groups are found.
+        """
+
+        if not current_group:
+            current_group = []
+
+        if op in visited_nodes:
+            return
+
+        visited_nodes.append(op)
+        logger.debug("Visiting node: {%s}", op.dotted_name)
+
+        # If current node is Conv2D, add to the current group
+        if op.type in ['Conv2D', 'DepthwiseConv2dNative']:
+            current_group.append(op)
+
+        # Terminating condition for current group
+        if not (op.type in ['Conv2D', 'DepthwiseConv2dNative', 'Relu', 'PReLU', 'Pad', 'Identity']):
+            if (len(current_group) > 1) and (current_group not in layer_groups):
+                layer_groups.append(current_group)
+                node_set = [op.dotted_name for op in current_group]
+                logger.debug("Added new set of nodes: {%s}", node_set)
+            current_group = []
+
+        if op.output:
+            for consumer in op.output.consumers:
+                GraphSearchUtils.find_downstream_layer_groups_to_scale(consumer, layer_groups, visited_nodes,
+                                                                       current_group)
+
+        # Reached a leaf. See if the current group has something to grab
+        if (len(current_group) > 1) and (current_group not in layer_groups):
+            layer_groups.append(current_group)
+            node_set = [op.dotted_name for op in current_group]
+            logger.debug("Added new set of nodes: {%s}", node_set)
+
+    def find_layer_groups_to_scale_as_conn_ops(self) -> List[List[Op]]:
+        """
+        :return: List of groups of layers. Each group can be independently equalized
+        """
+
+        # Find the input node(s) in the graph
+        input_nodes = []
+        for op in self._connected_graph.get_all_ops().values():
+            if op.inputs and op.inputs[0].is_model_input:
+                input_nodes.append(op)
+
+        layer_groups = []
+        visited_nodes = []
+
+        for op in input_nodes:
+            self.find_downstream_layer_groups_to_scale(op=op, layer_groups=layer_groups,
+                                                       visited_nodes=visited_nodes)
+
+        return layer_groups
+
+    def find_layer_groups_to_scale(self):
+        """
+        Find layer groups for scaling as tf ops
+        :return: groups for scaling as tf ops
+        """
+
+        layer_groups_as_conn_graph_ops = self.find_layer_groups_to_scale_as_conn_ops()
+        layer_groups_as_tf_ops, tf_op_to_conn_graph_op_map = self.convert_conn_graph_ops_to_tf_op(layer_groups_as_conn_graph_ops)
+
+        return tf_op_to_conn_graph_op_map, layer_groups_as_tf_ops
+
+    @staticmethod
+    def convert_conn_graph_ops_to_tf_op(op_groups: List[List[Op]]) -> \
+            Tuple[List[List[tf.Operation]], Dict[tf.Operation, Op]]:
+        """
+         Helper function to get op list as tf.Operation type to be usable for updating/scaling weights and biases
+         using generic apis for tensor updates.
+        :param op_groups: list of op groups as connected graph Op type used by ConnectedGraph
+        :return: list of op groups as tf.Operation (standard TF op type) and a map of tf.Operation to connected graph Op
+        """
+        tf_op_to_conn_graph_op_map = {}
+        layer_groups_as_tf_ops = []
+        for ops in op_groups:
+            curr_group = []
+            for op in ops:
+                tf_op_to_conn_graph_op_map[op.get_module()] = op
+                curr_group.append(op.get_module())
+            layer_groups_as_tf_ops.append(curr_group)
+
+        return layer_groups_as_tf_ops, tf_op_to_conn_graph_op_map
+
+    @staticmethod
+    def convert_layer_group_to_cls_sets(layer_group: List[tf.Operation]):
+        """
+        Helper function to convert a layer group to a list of cls sets
+        :param layer_group: Given layer group to generate cls sets
+        :return: List of cls sets
+
+        Supported layer combinations for CLS are:
+        1. Conv + Conv
+        2. DepthwiseConv + Conv
+        3. Conv + DepthwiseConv + Conv
+        Can be rewritten as,
+        Conv
+            -> Conv
+            -> DepthwiseConv
+                -> Conv
+        DepthwiseConv
+            -> Conv
+        If a combination is partially supported, the cls_set is completely omitted and restarted from the next
+        supported layer
+        For example: Consider Conv + DepthwiseConv + Depthwise(unsupported)
+        - Since Depthwise(unsupported) is the last layer encountered, we need to omit all three layers and restart
+        the cls sets from the next supported layer.
+
+        """
+
+        # pylint: disable=too-many-branches
+        def convert_to_cls_layer_type(layer: tf.Operation) -> Tuple[ClsLayerType, tf.Operation]:
+            """
+            Given the layer, check if its supported in CLS
+            :param layer: layer to check
+            :return: Tuple of ClsLayerType and the layer
+            """
+            if layer.type in ['Conv', 'Conv2D', 'ConvTranspose', 'Conv2DTranspose']:
+                layer_type = ClsLayerType.Conv
+            elif layer.type == 'DepthwiseConv2dNative':
+                layer_type = ClsLayerType.DepthwiseConv
+            else:
+                layer_type = ClsLayerType.Unsupported
+
+            return layer_type, layer
+
+        def get_next_layer() -> Tuple[ClsLayerType, Union[tf.Operation, None]]:
+            """
+            :return: Tuple of ClsLayerType and the next layer in layer_group
+            """
+            if not layer_group:
+                return ClsLayerType.Unsupported, None
+            layer = layer_group.pop(0)
+            return convert_to_cls_layer_type(layer)
+
+        # TODO below code is common with PyTorch impl. Move to a common file
+        cls_sets = []
+        first_layer_to_scale = (ClsLayerType.Unsupported, None)
+        while layer_group:
+            while layer_group and first_layer_to_scale[0] is ClsLayerType.Unsupported:
+                first_layer_to_scale = get_next_layer()
+                if first_layer_to_scale[0] is ClsLayerType.Unsupported:
+                    logger.info('Layer %s is not supported. Ignoring for cls', first_layer_to_scale[1])
+
+            second_layer_to_scale = get_next_layer()
+            if first_layer_to_scale[0] == ClsLayerType.Conv:
+                if second_layer_to_scale[0] == ClsLayerType.Conv:
+                    cls_sets.append((first_layer_to_scale[1], second_layer_to_scale[1]))
+                    first_layer_to_scale = second_layer_to_scale
+                elif second_layer_to_scale[0] == ClsLayerType.DepthwiseConv:
+                    if layer_group:
+                        # do not pop third layer yet, determine its type and then pop it
+                        third_layer_to_scale = convert_to_cls_layer_type(layer_group[0])
+                        if third_layer_to_scale[0] == ClsLayerType.Conv:
+                            cls_sets.append(
+                                (first_layer_to_scale[1], second_layer_to_scale[1], third_layer_to_scale[1]))
+                            # adding third_layer_to_scale for the next round of CLS set determination
+                            first_layer_to_scale = get_next_layer()
+                        else:
+                            # unsupported combination encountered
+                            first_layer_to_scale = second_layer_to_scale
+                else:
+                    logger.info('Layer %s is not supported. Ignoring for cls', second_layer_to_scale[1])
+                    first_layer_to_scale = (ClsLayerType.Unsupported, None)
+            elif first_layer_to_scale[0] == ClsLayerType.DepthwiseConv:
+                if second_layer_to_scale[0] == ClsLayerType.Conv:
+                    cls_sets.append((first_layer_to_scale[1], second_layer_to_scale[1]))
+                first_layer_to_scale = second_layer_to_scale
+            else:
+                logger.info('Layer %s is not supported. Ignoring for cls', first_layer_to_scale[1])
+                first_layer_to_scale = second_layer_to_scale
+
+        return cls_sets
+
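+    # Illustrative example of the conversion above (comment only, not executed): for a
+    # hypothetical layer group [conv1, depthwise1, conv2, conv3], the supported
+    # combinations described in the docstring yield the cls sets
+    #     [(conv1, depthwise1, conv2), (conv2, conv3)]
+    # i.e. a Conv + DepthwiseConv + Conv triplet followed by a Conv + Conv pair that
+    # reuses conv2 as its first layer.
+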
+    @staticmethod
+    def is_relu_activation_present_in_cls_sets(cls_sets: List[ClsSet],
+                                               tf_op_to_conn_graph_op_map: Dict) -> List[bool]:
+        """
+        Check if there are Relu activations between cls sets
+        :param cls_sets: cls conv op pairs
+        :param tf_op_to_conn_graph_op_map: Map of tf-op => connected graph op
+        :return: list of Relu activation present flags (True or False)
+        corresponding to the input cls_sets list
+        """
+        is_relu_activation_in_cls_sets = []
+        for cls_set in cls_sets:
+            # We need to check activation functions for all layers but the last one in the set
+            # Because we are only interested in checking activation functions between the layers we will scale
+            cls_set = cls_set[:-1]
+
+            is_relu_activation_in_cls_set = ()
+            for conv_op in cls_set:
+                conn_graph_conv_op = tf_op_to_conn_graph_op_map[conv_op]
+                is_relu_activation_in_cls_set += (ReluUtils.does_conv_have_relu_activation(conn_graph_conv_op), )
+
+            if len(is_relu_activation_in_cls_set) == 1:
+                is_relu_activation_in_cls_set = is_relu_activation_in_cls_set[0]
+
+            is_relu_activation_in_cls_sets.append(is_relu_activation_in_cls_set)
+
+        return is_relu_activation_in_cls_sets
+
+    @staticmethod
+    def map_op_names_to_ops(sess: tf.compat.v1.Session) -> Dict[str, tf.Operation]:
+        """
+        After fold and CLS, the graph is updated and so are the ops.
+        So we need a way to map the ops stored from the graph we began with, to perform the
+        high bias fold operation on the latest ops in the updated graph.
+        :param sess: active tf.compat.v1.Session
+        :return: a dictionary of op names mapped to ops in the given new session.
+        Note: only stores info pertaining to bn and conv ops required by high bias fold.
+        """
+
+        tf_names_op_dict = {}
+        with sess.graph.as_default():
+            op_list = sess.graph.get_operations()
+            for op in op_list:
+                if op.type in ['Conv2D', 'DepthwiseConv2dNative', 'FusedBatchNormV3']:
+                    tf_names_op_dict[op.name] = op
+
+        return tf_names_op_dict
+
+
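+# Illustrative sketch (not part of the original module): how GraphSearchUtils might be
+# driven by a caller. `sess` and the op names are hypothetical placeholders; note that
+# find_layer_groups_to_scale() returns the tf-op -> connected-graph-op map first, then
+# the layer groups.
+#
+#     search = GraphSearchUtils(sess.graph, ['input_1'], ['dense/Softmax'])
+#     tf_op_to_cg_op_map, layer_groups = search.find_layer_groups_to_scale()
+#     for group in layer_groups:
+#         cls_sets = GraphSearchUtils.convert_layer_group_to_cls_sets(group)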
+
[docs]class ClsSetInfo: + """ + This class hold information about the layers in a CLS set, along with corresponding scaling factors + for CLS set layers + """ + +
[docs] class ClsSetLayerPairInfo: + """ + Models a pair of layers that were scaled using CLS. And related information. + + :param layer1: layer as tf.Operation + :param layer2: layer as tf.Operation + :param scale_factor: scale factors as np.ndarray + :param relu_activation_between_layers: list of flags per layer set indicating\ + if they have Relu activations in-between. + + """ + def __init__(self, layer1: tf.Operation, layer2: tf.Operation, + scale_factor: np.ndarray, relu_activation_between_layers): + + self.layer1 = layer1 + self.layer2 = layer2 + self.scale_factor = scale_factor + self.relu_activation_between_layers = relu_activation_between_layers
+ + def __init__(self, cls_pair_1: ClsSetLayerPairInfo, cls_pair_2: ClsSetLayerPairInfo = None): + if cls_pair_2: + self.cls_pair_info_list = [cls_pair_1, cls_pair_2] + else: + self.cls_pair_info_list = [cls_pair_1] + +
[docs] @staticmethod + def map_cls_sets_to_new_session(tf_names_op_dict: Dict[str, tf.Operation], cls_set_info_list): + """ + Helper function to updates ops stored during cls to be used by high bias fold with updated session. + + :param tf_names_op_dict: map of tf op names to ops + :param cls_set_info_list: list of ClsSetInfo type + :return: None /cls_set_info_list updated in-place + + """ + for cls_set_info in cls_set_info_list: + for cls_pair_info in cls_set_info.cls_pair_info_list: + # refresh the ops, so we can perform high bias fold with info saved during cls. + cls_pair_info.layer1 = tf_names_op_dict[cls_pair_info.layer1.name] + cls_pair_info.layer2 = tf_names_op_dict[cls_pair_info.layer2.name]
+ + +class CrossLayerScaling: + """ implements auto mode cross-layer-scaling technique to a model """ + + @staticmethod + def scale_cls_sets(sess: tf.compat.v1.Session, cls_sets: List[ClsSet]) -> List[ScaleFactor]: + + """ + Scale multiple CLS sets + + :param sess: Current session + :param cls_sets: List of CLS sets + :return: Scaling factors calculated and applied for each CLS set in order + + """ + scale_factor_list = [] + for cls_set in cls_sets: + scale_factor = CrossLayerScaling.scale_cls_set(sess, cls_set) + scale_factor_list.append(scale_factor) + return scale_factor_list + + @staticmethod + def scale_cls_set(sess: tf.compat.v1.Session, cls_set: ClsSet) -> ScaleFactor: + """ + Scale a CLS set + :param sess: Current session + :param cls_set: Either a pair or regular conv layers or a triplet of depthwise separable layers + :return: Scaling factor calculated and applied + """ + + if len(cls_set) == 3: + scale_factor = CrossLayerScaling.scale_cls_set_with_depthwise_layers(sess, cls_set) + else: + scale_factor = CrossLayerScaling.scale_cls_set_with_conv_layers(sess, cls_set) + + return scale_factor + + @staticmethod + def scale_cls_set_with_conv_layers(model: tf.compat.v1.Session, cls_set: Tuple[tf.Operation, tf.Operation]) -> np.ndarray: + """ + API to invoke equalize layer params (update for weights and bias is in place) + This function is currently supported for Conv+Conv, DepthwiseConv2D+Conv combinations only + :param model: active tf.compat.v1.Session + :param cls_set: Consecutive Conv layers Tuple whose weights and biases need to be equalized + :return: Scaling factor S_12 for each conv layer pair: numpy array + """ + + with model.graph.as_default(): + assert len(cls_set) == 2, "Two layers need to be present in the cls_set" + assert cls_set[0].type in ['DepthwiseConv2dNative', 'Conv2D'], "unsupported type for cls_set[0]" + assert cls_set[1].type == "Conv2D", "unsupported type for cls_set[1]" + + # Create structs for holding layer weights and bias parameters + prev_layer_params = libpymo.EqualizationParams() + curr_layer_params = libpymo.EqualizationParams() + + # send as [Noc, Nic, kh, kw], TF format is [kh, kw, Nic, Noc] + weight_shape = WeightTensorUtils.get_tensor_shape(cls_set[0]) + if cls_set[0].type == "Conv2D": + prev_layer_params.weight = WeightTensorUtils.get_tensor_as_numpy_data(model, cls_set[0]). \ + transpose((3, 2, 0, 1)).reshape(-1) + prev_layer_params.weightShape = [weight_shape[3], weight_shape[2], weight_shape[0], weight_shape[1]] + elif cls_set[0].type == "DepthwiseConv2dNative": + assert weight_shape[3] == 1, "Only depth_multiplier=1 is supported for DepthwiseConv2DNative" + prev_layer_params.weight = WeightTensorUtils.get_tensor_as_numpy_data(model, cls_set[0]). \ + transpose((2, 3, 0, 1)).reshape(-1) + prev_layer_params.weightShape = [weight_shape[2], weight_shape[3], weight_shape[0], weight_shape[1]] + else: + assert False, "unsupported layer encountered" + + prev_layer_params.isBiasNone = BiasUtils.is_bias_none(cls_set[0]) + + # send as [Noc, Nic, kh, kw], TF format is [kh, kw, Nic, Noc] + curr_layer_params.weight = WeightTensorUtils.get_tensor_as_numpy_data(model, cls_set[1]). 
\ + transpose((3, 2, 0, 1)).reshape(-1) + weight_shape = WeightTensorUtils.get_tensor_shape(cls_set[1]) + curr_layer_params.weightShape = [weight_shape[3], weight_shape[2], weight_shape[0], weight_shape[1]] + + if not BiasUtils.is_bias_none(cls_set[0]): + prev_layer_params.bias = BiasUtils.get_bias_as_numpy_data(model, cls_set[0]).reshape(-1) + else: + prev_layer_params.isBiasNone = True + + scaling_factor = libpymo.scaleLayerParams(prev_layer_params, curr_layer_params) + + # convert received formats back to TF + # TF format is [kh, kw, Nic, Noc] + if cls_set[0].type == "Conv2D": + numpy_weight_reshaped = np.reshape(prev_layer_params.weight, prev_layer_params.weightShape). \ + transpose((2, 3, 1, 0)) + elif cls_set[0].type == "DepthwiseConv2dNative": + numpy_weight_reshaped = np.reshape(prev_layer_params.weight, prev_layer_params.weightShape). \ + transpose((2, 3, 0, 1)) + else: + assert False, "unsupported layer encountered" + + WeightTensorUtils.update_tensor_for_op(model, cls_set[0], numpy_weight_reshaped) + + numpy_weight_reshaped = np.reshape(curr_layer_params.weight, curr_layer_params.weightShape). \ + transpose((2, 3, 1, 0)) + WeightTensorUtils.update_tensor_for_op(model, cls_set[1], numpy_weight_reshaped) + + if not BiasUtils.is_bias_none(cls_set[0]): + numpy_bias_reshaped = np.reshape(prev_layer_params.bias, BiasUtils.get_shape(cls_set[0])) + BiasUtils.update_bias_for_op(model, cls_set[0], numpy_bias_reshaped) + + return scaling_factor + + @staticmethod + def scale_cls_set_with_depthwise_layers(model: tf.compat.v1.Session, + cls_set: Tuple[tf.Operation, + tf.Operation, + tf.Operation]) -> [np.ndarray, np.ndarray]: + """ + API to invoke equalize layer params for the combination of conv+depthwiseConv+conv layer params + - update for weights and bias is in place + :param model: active tf.compat.v1.Session + :param cls_set: Consecutive Conv layers whose weights and biases need to be equalized. + Second Conv layer is a depth-wise conv and third conv layer is point-wise conv + :return: Scaling factors S_12 and S_23 : numpy arrays + + DepthwiseConv2D layer is handled the following way, Assume the depthwise layer has the following dimensions + [Dimensions: (kh, kw, Nic, Noc) example:(3, 3, 96, 2)], following operations are done on it, + 1. Pre-process + a. Merge the last two dimensions and transpose [(Nic*Noc, kh, kw, 1), (192, 3, 3, 1)] + - implementation is converting to [(Nic, Noc, kh, kw), (96, 2, 3, 3)] which would give same res after next step + b. Convert to a 1D array [(Nic*Noc*kh*kw), (192*3*3)] + 2. Perform scaling + 3. Bring it back to original dimensions, current dimensions [(Nic*Noc*kh*kw), (192*3*3)] + a. reshape to [(Noc, Nic, kh, kw), (2, 96, 3, 3)] + b. reorder the dimensions [(kh, kw, Nic, Noc), (3, 3, 96, 2)] + """ + + # make sure you define the session and graph scope before making any graph updates. 
+ with model.graph.as_default(): + assert len(cls_set) == 3, "Three layers need to be present in the cls_set" + assert cls_set[0].type == "Conv2D", "unsupported type for cls_set[0]" + assert cls_set[1].type == "DepthwiseConv2dNative", "unsupported type for cls_set[1]" + assert cls_set[2].type == "Conv2D", "unsupported type for cls_set[2]" + + # Create structs for holding layer weights and bias parameters + prev_layer_params = libpymo.EqualizationParams() + curr_layer_params = libpymo.EqualizationParams() + next_layer_params = libpymo.EqualizationParams() + + # send as [Noc, Nic, kh, kw], TF format is [kh, kw, Nic, Noc] + prev_layer_params.weight = WeightTensorUtils.get_tensor_as_numpy_data(model, cls_set[0]). \ + transpose((3, 2, 0, 1)).reshape(-1) + weight_shape = WeightTensorUtils.get_tensor_shape(cls_set[0]) + prev_layer_params.weightShape = [weight_shape[3], weight_shape[2], weight_shape[0], weight_shape[1]] + prev_layer_params.isBiasNone = BiasUtils.is_bias_none(cls_set[0]) + + # depthwise layer outputs is set to 1 in TF + # send as [Nic, Noc, kh, kw], TF format is [kh, kw, Nic, Noc] + curr_layer_params.weight = WeightTensorUtils.get_tensor_as_numpy_data(model, cls_set[1]). \ + transpose((2, 3, 0, 1)).reshape(-1) + weight_shape = WeightTensorUtils.get_tensor_shape(cls_set[1]) + + # depthwise layer outputs is set to 1 in TF + # send as [Nic, Noc, kh, kw], TF format is [kh, kw, Nic, Noc] + curr_layer_params.weightShape = [weight_shape[2] * weight_shape[3], weight_shape[0], weight_shape[1], 1] + assert weight_shape[3] == 1, "Only depth_multiplier=1 is supported for DepthwiseConv2D" + curr_layer_params.isBiasNone = BiasUtils.is_bias_none(cls_set[1]) + + # send as [Noc, Nic, kh, kw] , TF format is [kh, kw, Nic, Noc] + next_layer_params.weight = WeightTensorUtils.get_tensor_as_numpy_data(model, cls_set[2]). \ + transpose((3, 2, 0, 1)).reshape(-1) + weight_shape = WeightTensorUtils.get_tensor_shape(cls_set[2]) + next_layer_params.weightShape = [weight_shape[3], weight_shape[2], weight_shape[0], weight_shape[1]] + + if not BiasUtils.is_bias_none(cls_set[0]): + prev_layer_params.bias = BiasUtils.get_bias_as_numpy_data(model, cls_set[0]).reshape(-1) + else: + prev_layer_params.isBiasNone = True + + if not BiasUtils.is_bias_none(cls_set[1]): + curr_layer_params.bias = BiasUtils.get_bias_as_numpy_data(model, cls_set[1]).reshape(-1) + else: + curr_layer_params.isBiasNone = True + + scaling_params = libpymo.scaleDepthWiseSeparableLayer(prev_layer_params, curr_layer_params, + next_layer_params) + + # convert received formats (in [Noc, Nic, kh, kw] format) back to the TF format ([kh, kw, Nic, Noc]) + numpy_weight_reshaped_0 = np.reshape(prev_layer_params.weight, prev_layer_params.weightShape). \ + transpose((2, 3, 1, 0)) + WeightTensorUtils.update_tensor_for_op(model, cls_set[0], numpy_weight_reshaped_0) + + # depthwise layer + weight_shape_1 = WeightTensorUtils.get_tensor_shape(cls_set[1]) + numpy_weight_reshaped_1 = np.reshape(curr_layer_params.weight, (weight_shape_1[3], weight_shape_1[2], + weight_shape_1[0], weight_shape_1[1])) + numpy_weight_reshaped_1 = np.transpose(numpy_weight_reshaped_1, (2, 3, 1, 0)) + WeightTensorUtils.update_tensor_for_op(model, cls_set[1], numpy_weight_reshaped_1) + + # conv layer + numpy_weight_reshaped_2 = np.reshape(next_layer_params.weight, next_layer_params.weightShape). 
\ + transpose((2, 3, 1, 0)) + WeightTensorUtils.update_tensor_for_op(model, cls_set[2], numpy_weight_reshaped_2) + + if not BiasUtils.is_bias_none(cls_set[0]): + assert [len(prev_layer_params.bias)] == BiasUtils.get_shape(cls_set[0]), \ + "Unsupported dimension encountered" + numpy_bias_reshaped = np.reshape(prev_layer_params.bias, BiasUtils.get_shape(cls_set[0])) + BiasUtils.update_bias_for_op(model, cls_set[0], numpy_bias_reshaped) + + if not BiasUtils.is_bias_none(cls_set[1]): + assert [len(curr_layer_params.bias)] == BiasUtils.get_shape(cls_set[1]), \ + "Unsupported dimension encountered" + numpy_bias_reshaped = np.reshape(curr_layer_params.bias, BiasUtils.get_shape(cls_set[1])) + BiasUtils.update_bias_for_op(model, cls_set[1], numpy_bias_reshaped) + + return scaling_params.scalingMatrix12, scaling_params.scalingMatrix23 + + @staticmethod + def create_cls_set_info_list(cls_sets: List[ClsSet], scale_factors: List[ScaleFactor], + is_relu_activation_in_cls_sets): + """ + Binds information from there separate lists into one [ClsInfoSet] data-structure + + :param cls_sets: List of CLS sets + :param scale_factors: Scale-factors for each cls-set + :param is_relu_activation_in_cls_sets: Information if there is relu activation in each cls-set + :return: List of ClsSetInfo + """ + cls_set_info_list = [] + assert len(cls_sets) == len(scale_factors) == len(is_relu_activation_in_cls_sets) + + for index, cls_set in enumerate(cls_sets): + + if isinstance(scale_factors[index], tuple): + # If we are dealing with a triplet of layers, then we should have 2 scale factors and 2 relu flags + # Assert that this is true + assert len(cls_set) == 3 + assert len(scale_factors[index]) == len(is_relu_activation_in_cls_sets[index]) == 2 + + cls_pair_1 = ClsSetInfo.ClsSetLayerPairInfo(cls_set[0], cls_set[1], scale_factors[index][0], + is_relu_activation_in_cls_sets[index][0]) + cls_pair_2 = ClsSetInfo.ClsSetLayerPairInfo(cls_set[1], cls_set[2], scale_factors[index][1], + is_relu_activation_in_cls_sets[index][1]) + cls_set_info = ClsSetInfo(cls_pair_1, cls_pair_2) + + else: + cls_pair = ClsSetInfo.ClsSetLayerPairInfo(cls_set[0], cls_set[1], scale_factors[index], + is_relu_activation_in_cls_sets[index]) + cls_set_info = ClsSetInfo(cls_pair) + + cls_set_info_list.append(cls_set_info) + + return cls_set_info_list + + @staticmethod + def scale_model(sess: tf.compat.v1.Session, input_op_names: Union[str, List[str]], output_op_names: Union[str, List[str]])\ + -> (tf.compat.v1.Session, List[ClsSetInfo]): + """ + Uses cross-layer scaling to scale all applicable layers in the given model + + :param sess: Session containing graph to scale + :param input_op_names: Names of starting ops in the model + :param output_op_names: List of output op names of the model, used to help ConnectedGraph determine valid ops + (to ignore training ops for example). If None, all ops in the model are considered valid. 
+ :return: updated session, CLS information for each CLS set + + """ + + if isinstance(input_op_names, str): + input_op_names = [input_op_names] + + if isinstance(output_op_names, str): + output_op_names = [output_op_names] + + # Find layer groups + graph_search = GraphSearchUtils(sess.graph, input_op_names, output_op_names) + tf_op_to_conn_graph_op_map, layer_groups_as_tf_ops = graph_search.find_layer_groups_to_scale() + + # Find cls sets from the layer groups + cls_sets = [] + for layer_group in layer_groups_as_tf_ops: + cls_set = graph_search.convert_layer_group_to_cls_sets(layer_group) + cls_sets += cls_set + + # Scale the CLS sets + scale_factors = CrossLayerScaling.scale_cls_sets(sess, cls_sets) + + # Find if there were relu activations between layers of each cls set + is_relu_activation_in_cls_sets = graph_search.is_relu_activation_present_in_cls_sets(cls_sets, + tf_op_to_conn_graph_op_map) + + # Convert to a list of cls-set-info elements + cls_set_info_list = CrossLayerScaling.create_cls_set_info_list(cls_sets, scale_factors, + is_relu_activation_in_cls_sets) + + # save and load the updated graph after scaling + after_cls_sess = save_and_load_graph('./temp_cls', sess) + + return after_cls_sess, cls_set_info_list + + +class HighBiasFold: + """ + Class to apply the high-bias-fold technique to a given model + """ + + ActivationIsReluForFirstModule = bool + ScaleForFirstModule = np.ndarray + + @staticmethod + def get_bn_params_for_bias_fold(sess: tf.compat.v1.Session, bn_op: tf.Operation, scaling_parameter: np.ndarray): + """ + + :param sess: active tf.compat.v1.Session + :param bn_op: tf Operation type fused batchnorm op. + :param scaling_parameter: scaling param as np.ndarray + :return: bn_params as BNParamsHighBiasFold type. + """ + + bn_params = libpymo.BNParamsHighBiasFold() + # Scaling gamma and beta parameter of batch norm layer + gamma = BNUtils.get_gamma_as_numpy_data(sess, bn_op).reshape(-1) + bn_params.gamma = np.divide(gamma, scaling_parameter) + beta = BNUtils.get_beta_as_numpy_data(sess, bn_op).reshape(-1) + bn_params.beta = np.divide(beta, scaling_parameter) + + return bn_params + + @staticmethod + def _refresh_layer_set_info_before_hbf(sess: tf.compat.v1.Session, + folded_pairs: List[Tuple[tf.Operation, tf.Operation]], + cls_set_info_list: List[ClsSetInfo])\ + -> (List[ClsSetInfo], Dict[str, tf.Operation]): + """ + As the tensorflow session gets updated, info on op references need to be refreshed. + :param folded_pairs: bn conv op pairs saved during batchnorm fold. + :param cls_set_info_list: conv layer info saved during cross layer scaling + :return: refreshes both data sets to reflect references on new tf.compat.v1.Session. 
+ """ + + bn_dict = {} + dict_names_to_tf_ops = GraphSearchUtils.map_op_names_to_ops(sess) + + # update info saved during batchnorm fold + for conv_bn in folded_pairs: + # get the new op ref from it's name + bn_dict[conv_bn[0].name] = dict_names_to_tf_ops[conv_bn[1].name] + + # update info saved during cls + ClsSetInfo.map_cls_sets_to_new_session(dict_names_to_tf_ops, cls_set_info_list) + + return cls_set_info_list, bn_dict + + @staticmethod + def bias_fold(sess: tf.compat.v1.Session, folded_pairs: List[Tuple[tf.Operation, tf.Operation]], + cls_set_info_list: List[ClsSetInfo]) -> tf.compat.v1.Session: + + """ + Folds bias values greater than 3 * sigma to next layer's bias + + :param sess: Current session + :param folded_pairs: Key: Conv/Linear layer Value: Corresponding folded BN layer + :param cls_set_info_list: List of info elements for each cls set + :return: updated session after graph updates from hbf + + """ + + with sess.graph.as_default(): + + # refresh the references saved during bn fold and cls. + cls_set_info_list, bn_layers = HighBiasFold._refresh_layer_set_info_before_hbf(sess, folded_pairs, + cls_set_info_list) + + if not bn_layers: + logger.error('High Bias folding is not supported for models without BatchNorm Layers') + return sess + + for cls_set_info in cls_set_info_list: + + for cls_pair_info in cls_set_info.cls_pair_info_list: + + # check if we have a corresponding bn layer + if cls_pair_info.layer1.name in bn_layers.keys(): + + # check if bias present in given conv2D(s) + if BiasUtils.is_bias_none(cls_pair_info.layer1) or BiasUtils.is_bias_none(cls_pair_info.layer2): + continue + + prev_layer_params = libpymo.LayerParams() + curr_layer_params = libpymo.LayerParams() + + scaling_parameter = cls_pair_info.scale_factor + + prev_layer_bn_params =\ + HighBiasFold.get_bn_params_for_bias_fold(sess, + bn_layers[cls_pair_info.layer1.name], + scaling_parameter) + + prev_layer_params.activationIsRelu = cls_pair_info.relu_activation_between_layers + prev_layer_params.bias =\ + BiasUtils.get_bias_as_numpy_data(sess, cls_pair_info.layer1).reshape(-1) + prev_bias_shape = BiasUtils.get_shape(cls_pair_info.layer1) + + weight_shape = WeightTensorUtils.get_tensor_shape(cls_pair_info.layer1) + prev_layer_params.weightShape = [weight_shape[3], weight_shape[2], weight_shape[0], + weight_shape[1]] + + curr_layer_params.bias =\ + BiasUtils.get_bias_as_numpy_data(sess, cls_pair_info.layer2).reshape(-1) + curr_bias_shape = BiasUtils.get_shape(cls_pair_info.layer2) + + weight_shape = WeightTensorUtils.get_tensor_shape(cls_pair_info.layer2) + + # Handle depthwise layer case + # for a depthwise layer num outputs is set to 1 in TF + # send as [Nic, Noc, kh, kw], TF format is [kh, kw, Nic, Noc] + if cls_pair_info.layer2.type in ['DepthwiseConv2dNative']: + c_wt = WeightTensorUtils.get_tensor_as_numpy_data( + sess, cls_pair_info.layer2).transpose((2, 3, 0, 1)) + curr_layer_params.weight = c_wt.reshape(-1) + curr_layer_params.weightShape = [weight_shape[2], weight_shape[3], weight_shape[0], + weight_shape[1]] + + else: + # send as [Noc, Nic, kh, kw], TF format is [kh, kw, Nic, Noc] + c_wt = WeightTensorUtils.get_tensor_as_numpy_data( + sess, cls_pair_info.layer2).transpose((3, 2, 0, 1)) + curr_layer_params.weight = c_wt.reshape(-1) + curr_layer_params.weightShape = [weight_shape[3], weight_shape[2], weight_shape[0], + weight_shape[1]] + + libpymo.updateBias(prev_layer_params, curr_layer_params, prev_layer_bn_params) + + BiasUtils.update_bias_for_op(sess, cls_pair_info.layer1, 
np.reshape(prev_layer_params.bias, + prev_bias_shape)) + + BiasUtils.update_bias_for_op(sess, cls_pair_info.layer2, np.reshape(curr_layer_params.bias, + curr_bias_shape)) + else: + logger.info("skipping layer: {%s}", cls_pair_info.layer1.name) + + # save and load the updated graph after high bias fold update + aftr_hbf_sess = save_and_load_graph('./temp_hbf', sess) + + return aftr_hbf_sess + + +
[docs]def equalize_model(sess: tf.compat.v1.Session, start_op_names: Union[str, List[str]], + output_op_names: Union[str, List[str]]) -> tf.compat.v1.Session: + """ + High-level API to perform Cross-Layer Equalization (CLE) on the given model. The model is equalized in place. + + :param sess: tf.compat.v1.Session with model to equalize + :param start_op_names: Names of starting ops in the given model + :param output_op_names: List of output op names of the model, used to help ConnectedGraph determine valid ops + (to ignore training ops for example). + :return: updated session after bn fold, cls and hbf. + + """ + + if not isinstance(start_op_names, (str, List)): + logger.error('start op names must be passed as a string or a List of strings') + + if isinstance(start_op_names, str): + start_op_names = [start_op_names] + + # fold batchnorm layers + after_bn_fold_sess, folded_pairs = fold_all_batch_norms(sess, start_op_names, output_op_names) + + # replace any ReLU6 layers with ReLU + graph_util = GraphSearchUtils(after_bn_fold_sess.graph, start_op_names, output_op_names) + after_relu_replace_sess = graph_util.find_and_replace_relu6_with_relu(after_bn_fold_sess) + + # perform cross-layer scaling on applicable layer sets + after_cls_sess, cls_set_info_list = CrossLayerScaling.scale_model(after_relu_replace_sess, start_op_names, + output_op_names) + + # high-bias fold + after_hbf_sess = HighBiasFold.bias_fold(after_cls_sess, folded_pairs, cls_set_info_list) + + return after_hbf_sess
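+
+# A minimal usage sketch (illustrative only, not part of the original module). The
+# meta-graph path and the op names 'input_1' / 'logits' are placeholders that must
+# be replaced with the values from your own model.
+def _example_equalize_model():
+    """ Hypothetical end-to-end Cross-Layer Equalization call on a tf.compat.v1 graph """
+    sess = tf.compat.v1.Session(graph=tf.Graph())
+    with sess.graph.as_default():
+        # import or build your trained graph here ('./model.meta' is a placeholder path)
+        tf.compat.v1.train.import_meta_graph('./model.meta')
+    # returns a new session whose graph has been through BN fold, CLS and high-bias fold
+    return equalize_model(sess, start_op_names='input_1', output_op_names='logits')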
+
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/defs.html b/releases/1.32.2/_modules/aimet_tensorflow/defs.html new file mode 100644 index 00000000..b8c88f03 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/defs.html @@ -0,0 +1,1318 @@ + + + + + + aimet_tensorflow.defs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_tensorflow.defs

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2018-2020, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Common type definitions that are used across aimet """
+
+from enum import Enum
+from typing import List, Optional, Union
+
+from dataclasses import dataclass
+import tensorflow as tf
+
+from aimet_common.defs import GreedySelectionParameters
+
+
+
[docs]class ModuleCompRatioPair: + """ + Pair of tf.Operation and a compression-ratio + + :ivar module: Module of type tf.Operation + :ivar comp_ratio: Compression ratio. Compression ratio is the ratio of cost of compressed model + to cost of the original model. + """ + + def __init__(self, module: tf.Operation, comp_ratio: float): + self.module = module + self.comp_ratio = comp_ratio
+ + +
[docs]class SpatialSvdParameters: + """ Configuration parameters for spatial svd compression """ + +
[docs] class ManualModeParams: + """ + Configuration parameters for manual-mode spatial svd compression + """ + + def __init__(self, list_of_module_comp_ratio_pairs: List[ModuleCompRatioPair]): + """ + :param list_of_module_comp_ratio_pairs: List of (module, comp-ratio) pairs + """ + self.list_of_module_comp_ratio_pairs = list_of_module_comp_ratio_pairs
+ +
[docs] class AutoModeParams: + """ + Configuration parameters for auto-mode compression + """ + + def __init__(self, greedy_select_params: GreedySelectionParameters, + modules_to_ignore: Optional[List[tf.Operation]] = None): + """ + :param greedy_select_params: Params for greedy comp-ratio selection algorithm + :param modules_to_ignore: List of modules to ignore (None indicates nothing to ignore) + """ + self.greedy_params = greedy_select_params + self.modules_to_ignore = [] if modules_to_ignore is None else modules_to_ignore
+ +
[docs] class Mode(Enum): + """ Mode enumeration """ + + manual = 1 + """ Manual mode """ + + auto = 2 + """ Auto mode """
+ + def __init__(self, input_op_names: List[str], output_op_names: List[str], mode: Mode, + params: Union[ManualModeParams, AutoModeParams], multiplicity=1): + """ + :param input_op_names: list of input op names to the model + :param output_op_names: List of output op names of the model + :param mode: Either auto mode or manual mode + :param params: Parameters for the mode selected + :param multiplicity: The multiplicity to which ranks/input channels will get rounded. Default: 1 + """ + self.input_op_names = input_op_names + self.output_op_names = output_op_names + self.mode = mode + self.mode_params = params + self.multiplicity = multiplicity
+ + +
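+
+# A minimal construction sketch (illustrative only, not part of the original module).
+# The op names and the 0.5 target compression ratio are placeholder values; see
+# aimet_common.defs.GreedySelectionParameters for the remaining optional arguments.
+def _example_spatial_svd_parameters():
+    """ Hypothetical auto-mode configuration for spatial SVD compression """
+    greedy_params = GreedySelectionParameters(target_comp_ratio=0.5)
+    auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                      modules_to_ignore=None)
+    return SpatialSvdParameters(input_op_names=['input_1'], output_op_names=['logits'],
+                                mode=SpatialSvdParameters.Mode.auto, params=auto_params)
+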
[docs]class ChannelPruningParameters: + """ Configuration parameters for channel pruning compression """ + +
[docs] class ManualModeParams: + """ + Configuration parameters for manual-mode channel pruning compression + """ + + def __init__(self, list_of_module_comp_ratio_pairs: List[ModuleCompRatioPair]): + """ + :param list_of_module_comp_ratio_pairs: List of (module, comp-ratio) pairs + """ + self.list_of_module_comp_ratio_pairs = list_of_module_comp_ratio_pairs
+ +
[docs] class AutoModeParams: + """ + Configuration parameters for auto-mode compression + """ + + def __init__(self, greedy_select_params: GreedySelectionParameters, + modules_to_ignore: Optional[List[tf.Operation]] = None): + """ + :param greedy_select_params: Params for greedy comp-ratio selection algorithm + :param modules_to_ignore: List of modules to ignore (None indicates nothing to ignore) + """ + self.greedy_params = greedy_select_params + self.modules_to_ignore = [] if modules_to_ignore is None else modules_to_ignore
+ +
[docs] class Mode(Enum): + """ Mode enumeration """ + + manual = 1 + """ Manual mode: User specifies comp-ratio per layer """ + + auto = 2 + """ Auto mode: aimet computes optimal comp-ratio per layer """
+ + def __init__(self, input_op_names: List[str], output_op_names: List[str], data_set: tf.data.Dataset, + batch_size: int, num_reconstruction_samples: int, allow_custom_downsample_ops: bool, mode: Mode, + params: Union[ManualModeParams, AutoModeParams], multiplicity=1): + """ + + :param input_op_names: list of input op names to the model + :param output_op_names: List of output op names of the model + :param data_set: data set + :param batch_size: batch size + :param num_reconstruction_samples: number of samples to be used for reconstruction + :param allow_custom_downsample_ops: If set to True, DownSampleLayer and UpSampleLayer will be added as required + :param mode: indicates whether the mode is manual or auto + :param params: ManualModeParams or AutoModeParams, depending on the value of mode + :param multiplicity: The multiplicity to which ranks/input channels will get rounded. Default: 1 + """ + + # pylint: disable=too-many-arguments + self.input_op_names = input_op_names + self.output_op_names = output_op_names + self.data_set = data_set + self.batch_size = batch_size + self.num_reconstruction_samples = num_reconstruction_samples + self.allow_custom_downsample_ops = allow_custom_downsample_ops + self.mode = mode + self.mode_params = params + self.multiplicity = multiplicity
+ + +class ParameterInfo: + """ Store information required for parameter quantization """ + def __init__(self, param_type: str, op_with_param_name: List): + self.param_type = param_type + self.op_with_param_name = op_with_param_name + +@dataclass +class Tensorflow2Version: + """ + Enumeration for checking TensorFlow version for import statements + """ + + v2_4_3 = '2.4.3' + + v2_10_1 = '2.10.1' +
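+
+# A minimal construction sketch (illustrative only, not part of the original module).
+# The dummy dataset, op names and numeric values below are placeholders chosen for brevity.
+def _example_channel_pruning_parameters():
+    """ Hypothetical auto-mode configuration for channel pruning """
+    dummy_data_set = tf.data.Dataset.from_tensor_slices(tf.zeros((32, 224, 224, 3)))
+    greedy_params = GreedySelectionParameters(target_comp_ratio=0.5)
+    auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params)
+    return ChannelPruningParameters(input_op_names=['input_1'], output_op_names=['logits'],
+                                    data_set=dummy_data_set, batch_size=8,
+                                    num_reconstruction_samples=50,
+                                    allow_custom_downsample_ops=False,
+                                    mode=ChannelPruningParameters.Mode.auto, params=auto_params)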
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/keras/batch_norm_fold.html b/releases/1.32.2/_modules/aimet_tensorflow/keras/batch_norm_fold.html new file mode 100644 index 00000000..58f346ce --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/keras/batch_norm_fold.html @@ -0,0 +1,2150 @@ + + + + + + aimet_tensorflow.keras.batch_norm_fold — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +

Source code for aimet_tensorflow.keras.batch_norm_fold

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2021-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+#pylint: disable=too-many-lines
+""" Utility for batch norm fold in tf 2.x """
+from typing import Iterable, Optional, Tuple, Union, List, Dict, Set
+import numpy as np
+import tensorflow as tf
+import tensorflow.keras.backend as K
+from packaging import version  # pylint: disable=wrong-import-order
+
+if version.parse(tf.version.VERSION) >= version.parse("2.10"):
+    # Ignore pylint errors as keras module is not available in TF 2.4
+    from keras.layers.core.tf_op_layer import TFOpLambda # pylint: disable=import-error
+    from keras.engine.functional import Functional # pylint: disable=import-error
+else:
+    # Ignore pylint errors due to conditional imports
+    from tensorflow.python.keras.engine.functional import Functional # pylint: disable=ungrouped-imports
+    from tensorflow.python.keras.layers.core import TFOpLambda # pylint: disable=ungrouped-imports
+
+# pylint: disable=wrong-import-position
+from aimet_common.defs import QuantScheme, MAP_ROUND_MODE_TO_PYMO
+
+import aimet_common.libpymo as libpymo
+from aimet_common.utils import AimetLogger
+from aimet_tensorflow.keras.model_preparer import _KerasModelPreparer
+from aimet_tensorflow.keras.quant_sim.qc_quantize_wrapper import QcQuantizeWrapper
+from aimet_tensorflow.keras.quant_sim.tensor_quantizer import ParamPerTensorQuantizer
+from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+from aimet_tensorflow.keras.utils import common
+from aimet_tensorflow.keras.utils.model_connection_utils import ModelLayerConnections, ModelLayerConnectionsProperties
+from aimet_tensorflow.keras.utils.quantizer_utils import get_wrappers_bias_quantizer, get_wrappers_weight_quantizer
+from aimet_tensorflow.keras.utils.weight_tensor_utils import WeightTensorUtils
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Utils)
+
+LayerType = Union[
+    tf.keras.layers.Conv2D,
+    tf.keras.layers.Dense,
+    tf.keras.layers.Conv2DTranspose,
+    tf.keras.layers.DepthwiseConv2D
+]
+_supported_layers = LayerType.__args__
+
+PairType = Union[Tuple[LayerType, tf.keras.layers.BatchNormalization, bool],
+                 Tuple[tf.keras.layers.BatchNormalization, LayerType, bool]]
+
+BatchNormType = tf.keras.layers.BatchNormalization
+_supported_batchnorms = BatchNormType
+
+# Todo: search for more types of convolution
+LinearType = tf.keras.layers.Dense
+ConvType = Union[tf.keras.layers.Conv1D,
+                 tf.keras.layers.Conv2D,
+                 tf.keras.layers.DepthwiseConv2D,
+                 tf.keras.layers.Conv2DTranspose]
+_supported_convs = ConvType.__args__
+
+FlattenType = Union[tf.keras.layers.Flatten, tf.keras.layers.Reshape]
+
+MAP_PYMO_TO_ROUND_MODE = {v: k for k, v in MAP_ROUND_MODE_TO_PYMO.items()}
+def _check_layer_to_find_pattern(cur_layer: tf.keras.layers.Layer,
+                                 conv_linear_with_bn_dict: Dict[Union[ConvType, LinearType],
+                                                                List[Union[None, BatchNormType]]],
+                                 layer_out_node_ref: Dict,
+                                 has_seen: List[Union[None, ConvType, BatchNormType, FlattenType]]):
+    """
+    Inspect the given layer during traversal and record possible conv/linear and batch-norm patterns.
+
+    :param cur_layer: layer to investigate for finding a pattern
+    :param conv_linear_with_bn_dict: dictionary to store possible conv_bn pairs,
+        key: Dense or Conv layer & Value: list of BNS;
+        first index in this list shows bn_in and the second index shows bn_out
+    :param layer_out_node_ref: dictionary includes layer_ref as a key, outbound nodes as value
+    :param has_seen: for storing the layer which is useful for finding pattern in the next layers;
+        index 0 is for the conv op, index 1 is for the bn op and index 2 is for storing the flatten/reshape op
+    """
+
+    # pylint: disable=too-many-branches
+    if isinstance(cur_layer, _supported_convs):
+        if has_seen[1] is not None:
+            conv_linear_with_bn_dict[cur_layer] = [has_seen[1], None]
+            has_seen[1] = None
+        if (cur_layer.activation is tf.keras.activations.linear) and \
+                (cur_layer in layer_out_node_ref) and len(layer_out_node_ref[cur_layer]) == 1:
+            has_seen[0] = cur_layer
+    elif isinstance(cur_layer, BatchNormType):
+        if has_seen[0] is not None:
+            if has_seen[0] in conv_linear_with_bn_dict:
+                conv_linear_with_bn_dict[has_seen[0]][1] = cur_layer
+            else:
+                conv_linear_with_bn_dict[has_seen[0]] = [None, cur_layer]
+            has_seen[0] = None
+        if (cur_layer in layer_out_node_ref) and len(layer_out_node_ref[cur_layer]) == 1:
+            has_seen[1] = cur_layer
+    elif isinstance(cur_layer, (tf.keras.layers.Flatten, tf.keras.layers.Reshape)):
+        if (cur_layer in layer_out_node_ref) and len(layer_out_node_ref[cur_layer]) == 1:
+            if has_seen[1]:
+                has_seen[2] = cur_layer
+            else:
+                has_seen[1] = None
+        if has_seen[0]:
+            has_seen[0] = None
+    elif isinstance(cur_layer, LinearType):
+        if has_seen[1] is not None and has_seen[2] is not None:
+            conv_linear_with_bn_dict[cur_layer] = [has_seen[1], None]
+        has_seen[2] = None
+        has_seen[1] = None
+    else:
+        has_seen[0] = None
+        has_seen[1] = None
+        has_seen[2] = None
+
+
+def _add_children_layer_before_parent_layer(cur_layer: tf.keras.layers.Layer, node_layer_map: Dict,
+                                            layer_out_node_map: Dict,
+                                            visited_layers: Set[tf.keras.layers.Layer],
+                                            reversed_ordered_layers: List[tf.keras.layers.Layer]):
+    """
+    Function to use topological sorting for finding all the layers which are accessible
+    from the specific input_layer in the opposite order of occurrence.
+
+    :param cur_layer: layer that we want to find paths from
+    :param node_layer_map: dictionary includes node_ref as a key, in_layers and out_layer as value
+    :param layer_out_node_map: dictionary includes layer_ref as a key, outbound nodes as value
+    :param visited_layers: Set of all layers that have been visited
+    :param reversed_ordered_layers: List of layers in the opposite order of occurrence
+        for the layers that we have visited so far
+    """
+
+    # Mark the current layer as visited.
+    visited_layers.add(cur_layer)
+
+    if cur_layer in layer_out_node_map:
+        # Recur for all the layers adjacent to this layer
+        for next_node in layer_out_node_map[cur_layer]:
+            next_layer = node_layer_map[next_node][1]
+            if next_layer not in visited_layers:
+                _add_children_layer_before_parent_layer(next_layer, node_layer_map,
+                                                        layer_out_node_map, visited_layers,
+                                                        reversed_ordered_layers)
+            reversed_ordered_layers.append(cur_layer)
+    else:
+        reversed_ordered_layers.append(cur_layer)
+
+
+def _get_ordered_layers(node_layer_map: Dict,
+                        layer_out_node_map: Dict) -> List[tf.keras.layers.Layer]:
+    """
+    Function to return a list of all the layers, ordered so that parent layers come before their child layers.
+
+    :param node_layer_map: dictionary includes node_ref as a key, in_layers and out_layer as value
+    :param layer_out_node_map: dictionary includes layer_ref as a key, outbound nodes as value
+    :return: ordered_layers: List of all layers in the order of occurrence
+    """
+    # to find the input layers of the model
+    input_layers = common.find_input_layers(node_layer_map)
+
+    #  Set of all layers that have been visited (to cut short duplicate traversals)
+    visited_layers = set()
+
+    # List of all layers in the opposite of order of occurrence
+    reversed_ordered_layers = []
+
+    for input_layer in input_layers:
+        _add_children_layer_before_parent_layer(input_layer, node_layer_map, layer_out_node_map,
+                                                visited_layers, reversed_ordered_layers)
+
+    # reverse the list because layers are in reverse order
+    ordered_layers = reversed_ordered_layers[::-1]
+
+    # # filter ordered ops for only valid ops
+    # ordered_ops = [op for op in ordered_ops if op in valid_ops]
+
+    return ordered_layers
+
+
+def _get_ordered_conv_linears(node_layer_map: Dict,
+                              layer_out_node_map: Dict) -> List[Union[ConvType, LinearType]]:
+    """
+    helper to select a list of conv/linear layers in the order of occurrence
+
+    :param node_layer_map: dictionary includes node_ref as a key, in_layers and out_layer as value
+    :param layer_out_node_map: dictionary includes layer_ref as a key, outbound nodes as value
+    :return: return List of conv/linear layer refs
+    """
+    # get ordered layers list in node_layer map dictionary
+    list_of_ordered_layers = _get_ordered_layers(node_layer_map, layer_out_node_map)
+
+    # look for conv layers
+    ordered_conv_linears = []
+    for layer in list_of_ordered_layers:
+        if isinstance(layer, _supported_layers):
+            ordered_conv_linears.append(layer)
+    return ordered_conv_linears
+
+
+def _fill_conv_linear_bn_dict(cur_layer: tf.keras.layers.Layer, node_layer_ref: Dict,
+                              layer_out_node_ref: Dict,
+                              has_seen: List[Union[None, ConvType, BatchNormType, FlattenType]],
+                              visited_layer: Set[tf.keras.layers.Layer],
+                              conv_linear_with_bn_dict: Dict[Union[ConvType, LinearType],
+                                                             List[Union[None, BatchNormType]]]):
+    """
+    fill conv_linear_bn_dict for the model
+
+    :param cur_layer: dictionary includes node_ref as a key, in_layers and out_layer as value
+    :param node_layer_ref: dictionary includes node_ref as a key, in_layers and out_layer as value
+    :param layer_out_node_ref: dictionary includes layer_ref as a key, outbound nodes as value
+    :param has_seen: for storing the layer which is useful for finding a pattern in the next layers;
+        index 0 is for the conv op, index 1 is for the bn op and index 2 is for storing the flatten/reshape op
+    :param visited_layer: to store all the layers that have been visited so far in the dictionary
+    :param conv_linear_with_bn_dict: dictionary of all possible conv_bn pairs,
+        key: Dense or Conv layer & Value: list of BNS;
+        first index in this list shows bn_in and the second index shows bn_out
+    """
+
+    # Mark the current layer as visited to prevent passing from one layer more than once
+    visited_layer.add(cur_layer)
+
+    _check_layer_to_find_pattern(cur_layer, conv_linear_with_bn_dict, layer_out_node_ref, has_seen)
+
+    if cur_layer in layer_out_node_ref:
+        for next_node in layer_out_node_ref[cur_layer]:
+            next_layer = node_layer_ref[next_node][1]
+            if next_layer not in visited_layer:
+                _fill_conv_linear_bn_dict(next_layer, node_layer_ref, layer_out_node_ref, has_seen,
+                                          visited_layer, conv_linear_with_bn_dict)
+            else:
+                has_seen[0] = None
+                has_seen[1] = None
+                has_seen[2] = None
+
+
+def _find_possible_convs_linears_bn(node_layer_map: Dict, layer_out_node_map: Dict)\
+        -> Dict[Union[ConvType, LinearType], List[Union[None, BatchNormType]]]:
+    """
+    find all possible convs_linears_bn by traversing all paths in the model considering all inputs
+
+    :param node_layer_map:  dictionary includes node_ref as a key, in_layers and out_layer as value
+    :param layer_out_node_map: dictionary includes layer_ref as a key, outbound nodes as value
+    :return: return dictionary of all possible conv_bn pairs,
+        key: Dense or Conv layer & Value: list of BNS;
+        first index in this list shows bn_in and the second index shows bn_out
+    """
+
+    input_layers = common.find_input_layers(node_layer_map)
+    visited_layer = set()
+    conv_linear_with_bn_dict = {}
+
+    for input_layer in input_layers:
+        _fill_conv_linear_bn_dict(input_layer, node_layer_map, layer_out_node_map,
+                                  [None, None, None], visited_layer, conv_linear_with_bn_dict)
+
+    return conv_linear_with_bn_dict
+
+
+def _get_bn_params(bn: tf.keras.layers.BatchNormalization) -> libpymo.BNParams():
+    """
+    helper to populate BN params from given BN Layer, required for fold
+
+    :param bn: BatchNorm Layer
+    :return: return bn params in libpymo.TensorParams() format.
+    """
+    if bn.gamma is None:
+        _logger.warning("Gamma for BatchNormalization '%s' is None. Setting to ones.", bn.name)
+        # Batch Normalization layers can have missing gammas in two different cases. One is that the 'gamma' attribute
+        # is set to None. The second is if `scale` is set to False upon creation of the layer which turns off gamma.
+        with tf.name_scope(bn.name):
+            weights_with_gamma_and_before_rebuild = [np.ones_like(bn.beta)] + bn.get_weights()
+            bn.scale = True
+            bn.build(bn.input.shape)
+            bn.set_weights(weights_with_gamma_and_before_rebuild)
+            bn.gamma = next(filter(lambda w: 'gamma' in w.name, bn.weights))
+
+    bn_params = libpymo.BNParams()
+
+    bn_params.gamma = bn.gamma.numpy().reshape(-1)
+    bn_params.beta = bn.beta.numpy().reshape(-1)
+    bn_params.runningMean = bn.moving_mean.numpy().reshape(-1)
+    bn_params.runningVar = bn.moving_variance.numpy().reshape(-1)
+    epsilon = bn.epsilon
+    var = bn.moving_variance.numpy()
+    var_with_epsilon = var + epsilon
+    sigma = np.sqrt(var_with_epsilon)
+    bn_params.runningVar = sigma
+
+    return bn_params
+
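+
+# Reference sketch (illustrative only, not the libpymo implementation) of what the
+# parameters prepared above feed into: for a BN layer that follows a conv whose weights
+# are in [Noc, Nic, kh, kw] layout, folding scales each output channel by gamma / sigma
+# and absorbs the BN shift into the bias. All arguments are plain numpy arrays.
+def _fold_bn_into_conv_reference(weight, bias, gamma, beta, running_mean, sigma):
+    """ Numpy-only reference of the folding math; sigma is sqrt(running_var + eps) """
+    scale = gamma / sigma
+    folded_weight = weight * scale.reshape(-1, 1, 1, 1)
+    folded_bias = beta + (bias - running_mean) * scale
+    return folded_weight, folded_bias
+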
+
+def _get_bias_tensor(conv_linear: LayerType) -> libpymo.TensorParams():
+    """
+    Get bias tensor in given conv layer.
+
+    Packs bias in the format required for BN fold
+    (libpymo.TensorParams()).
+    :param conv_linear: conv Layer
+    :return: return bias param in libpymo.TensorParams() format.
+    """
+
+    bias_tensor = libpymo.TensorParams()
+    if conv_linear.bias is not None:
+        bias_tensor.data = conv_linear.bias.numpy().reshape(-1)
+        bias_tensor.shape = np.array(conv_linear.bias.shape)
+
+    return bias_tensor
+
+
+def _get_weight_tensor_transpose_reshape(conv_linear: LayerType) -> libpymo.TensorParams():
+    """
+    Get weight tensor from conv layer.
+
+    Converts to right format - performs transpose and reshape.
+    Packs it to the format required for BN fold (libpymo.TensorParams()).
+    :param conv_linear: conv layer
+    :return: return weight tensor in libpymo.TensorParams() format.
+    """
+
+    # Weight tensor libpymo format
+    weight_tensor = libpymo.TensorParams()
+
+    # linear array to be sent for bn fold
+    weight = conv_linear.get_weights()[0]
+    shape = weight.shape
+
+    if isinstance(conv_linear, tf.keras.layers.DepthwiseConv2D):
+        # Depthwise conv layers in TF have outputs(Noc) set to 1.
+        # we will use format [Nic, Noc, kh, kw] -
+        # to be compatible with cpp backend.
+        weight = np.transpose(weight, (2, 3, 0, 1))
+        # [Nic, Noc, kh, kw]
+        shape = np.array([shape[2], shape[3], shape[0], shape[1]])
+    elif isinstance(conv_linear, tf.keras.layers.Dense):
+        shape = np.concatenate((np.array([1, 1]), shape))
+        weight = np.transpose(weight, (1, 0))
+        # [Noc, Nic, kh, kw]
+        shape = np.array([shape[3], shape[2], shape[0], shape[1]])
+    elif isinstance(conv_linear, tf.keras.layers.Conv2DTranspose):
+        weight = np.transpose(weight, (2, 3, 0, 1))
+        # [Noc, Nic, kh, kw]
+        shape = np.array([shape[2], shape[3], shape[0], shape[1]])
+    elif isinstance(conv_linear, tf.keras.layers.Conv2D):
+        weight = np.transpose(weight, (3, 2, 0, 1))
+        # [Noc, Nic, kh, kw]
+        shape = np.array([shape[3], shape[2], shape[0], shape[1]])
+    else:
+        _logger.error("_get_weight_tensor_transpose_reshape(): Operation type unsupported")
+
+    weight_tensor.data = weight.reshape(-1)
+    weight_tensor.shape = shape
+
+    return weight_tensor
+
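+
+# Small illustration (not part of the original module) of the Conv2D branch above:
+# Keras stores Conv2D kernels as [kh, kw, Nic, Noc], while the folding backend
+# expects [Noc, Nic, kh, kw].
+def _conv2d_kernel_to_backend_layout(kernel: np.ndarray) -> np.ndarray:
+    """ [kh, kw, Nic, Noc] -> [Noc, Nic, kh, kw], e.g. (3, 3, 64, 128) -> (128, 64, 3, 3) """
+    return np.transpose(kernel, (3, 2, 0, 1))
+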
+
+class PassThroughOp(tf.keras.layers.Layer):
+    """
+    This is a pass-through op, used for purpose of making an op a no-op
+    """
+
+    # pylint: disable=arguments-differ
+    @staticmethod
+    def call(inputs):
+        """
+        This is a function to return input as an output
+        :param inputs: input to pass through
+        """
+        return inputs
+
+# pylint: disable=too-many-branches, protected-access, too-many-locals, too-many-nested-blocks
+@common.to_functional
+def _delete_bn_from_functional(model: tf.keras.Model,
+                               bn_layers_to_remove: List[tf.keras.layers.BatchNormalization]) -> tf.keras.Model:
+    """
+    This function is used to remove ALL batch normalization layers from a functional model passed via the
+    bn_layers_to_remove parameter. Removing in place is not possible for functional models as a layer's inbound and
+    outbound connections are immutable. This function returns a new model with the batch normalization layers removed.
+
+    :param model: Model to remove bn_layers from
+    :param bn_layers_to_remove: List of batch normalization layers to remove from the model
+    :return: A new model with the batch normalization layers removed
+    """
+
+    # In order to do this, we first need to know the original model's inbound and outbound connections to each layer.
+    # We then need to create a new model with the same inbound and outbound connections, but with the batch normalization
+    # layers removed. This is done by rerouting the inbound nodes of the batch normalization layers to the inbound nodes
+    # of the next layer. This can be seen in the following diagram:
+    #
+    # Original model flow ------------------------->
+    #   ______________        ______________        ______________
+    #  |             |       |             |       |             |
+    #  |    Conv     |  -X-> |  Batch Norm |  -X-> |    ReLU     |
+    #  |_____________|       |_____________|     ^ |_____________|
+    #  New model flow   \                       /
+    #                    \                     /
+    #                     \___________________/
+
+
+    def wrapped_bn_layer_in_bns_to_remove(layer: tf.keras.layers.Layer) -> bool:
+        return isinstance(layer, QcQuantizeWrapper) and layer._layer_to_wrap in bn_layers_to_remove
+
+    tf.keras.backend.clear_session() # clear session to not have tensor name conflicts
+
+    # Step 1: Get the inbound and outbound connections for each layer in the model
+    model_layer_connections = ModelLayerConnections.get_model_layers_connection_properties(model)
+
+    for inp in model.inputs:
+        model_layer_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS].update({inp.name: inp})
+
+    # Step 2: Create a new model with the batch normalization layers removed by iterating through the layers in the model
+    # and using the inbound and outbound connections to rerouting around the batch normalization layers.
+    batch_norms_replaced_with_names = {}
+    model_outputs = []
+    for current_layer in model.layers:
+        if isinstance(current_layer, tf.keras.layers.InputLayer):
+            continue
+
+        # Determine input tensors of the given layer
+        layer_input = [model_layer_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS][layer_aux]
+                       for layer_aux in model_layer_connections[ModelLayerConnectionsProperties.INBOUND_NODES][current_layer.name]]
+
+        layer_input = layer_input[0] if len(layer_input) == 1 else layer_input
+
+        # Reroute around batch normalization layers if the layer is in the list of layers to remove
+        if current_layer in bn_layers_to_remove or wrapped_bn_layer_in_bns_to_remove(current_layer):
+            _logger.debug("Removing Batch Normalization layer %s", current_layer.name)
+
+            for outbound_node in current_layer._outbound_nodes:  # pylint: disable=protected-access
+                # Find and replace the Batch Normalization output layers input that holds the Batch Normalization layer
+                # node and replace it with the input layers of the Batch Normalization layer.
+                # For example, if ReLU's inputs are [conv1_bn] and conv1_bn's inputs are [conv1], then we replace
+                # ReLU's inputs with [conv1]
+
+                all_batch_norms_inbound_layers_names = \
+                    [inbound_node.inbound_layers.name for inbound_node in current_layer._inbound_nodes]
+
+                # Go through all the outbound layers of the batch normalization layer and replace the batch normalization
+                # layer name with the input layer names of the batch normalization layer.
+                batch_norms_outbound_layers_new_inbound_layers_names = \
+                    [outlayer.replace(current_layer.name, *all_batch_norms_inbound_layers_names)
+                     for outlayer in model_layer_connections[ModelLayerConnectionsProperties.INBOUND_NODES][outbound_node.outbound_layer.name]]
+
+                # Keras Batch Norm only supports one input tensor, meaning a single layer feeds into it.
+                # Hence, 'inbound_nodes[0]'.
+                batch_norms_replaced_with_names[current_layer.name] = current_layer._inbound_nodes[0].inbound_layers.name
+
+                model_layer_connections[ModelLayerConnectionsProperties.INBOUND_NODES].update(
+                    {outbound_node.outbound_layer.name: batch_norms_outbound_layers_new_inbound_layers_names})
+
+                # The above updates our dict for the mapping of the inputs, but we need to also update what Keras thinks
+                # the inputs are. This is done by updating the inbound nodes of the output layer of the Batch Normalization.
+                # THIS IS ONLY FOR MAPPING THE INPUTS TO BUILD A NEW MODEL. The original models underlying structure is
+                # not changed.
+                outbound_node.outbound_layer._inbound_nodes = current_layer.inbound_nodes  # pylint: disable=protected-access
+
+        # Otherwise, treat like a normal layer
+        else:
+            # For layers that have multiple inputs, order matters for what is fed into the layer. For example, if we have
+            # an Add layer with inputs from a ReLU and a Batch Norm, the order they go into the Add matters. Furthermore,
+            # if the Batch Norm is deleted, then it needs to be replaced with its folded layer in the same order.
+
+            KERAS_SYMBOLIC_TENSORS_INDEX = 0
+            # Check if we need to change layer_input order. If there is just one input, there is no order.
+            # The special case of a Lambda layer with multiple inputs is handled separately
+            if isinstance(layer_input, List) and not isinstance(current_layer, TFOpLambda):
+                # Original models keras symbolic tensor order
+                original_keras_symbolic_tensors_order = model_layer_connections[ModelLayerConnectionsProperties.CALL_ARGS][
+                    current_layer.name][KERAS_SYMBOLIC_TENSORS_INDEX]
+
+                # Special case for Lambda layers. Lambda layers can be thought of as z = x + y. Unfortunately, their call
+                # args for the keras symbolic tensors will ONLY have the x portion. In our layer_input we have both x and y.
+                # This statement is added to wrap the x portion of the original call args and check if it's a batch norm
+                # folded out.
+                if not isinstance(original_keras_symbolic_tensors_order, List):
+                    original_keras_symbolic_tensors_order = [original_keras_symbolic_tensors_order]
+
+                # Check if a Batch Norm that was deleted is in the original keras symbolic order.
+                name_of_bn_replaced = [
+                    tensor._keras_history.layer.name
+                    for tensor in original_keras_symbolic_tensors_order
+                    if tensor._keras_history.layer.name in batch_norms_replaced_with_names
+                ]
+
+                # If a Batch Norm is found, then the keras symbolic tensor order is slightly updated to replace the
+                # Batch Norm with the folded layer. Otherwise, we can just use the original keras symbolic tensor order.
+                if name_of_bn_replaced:
+
+                    updated_keras_symbolic_tensors_order = []
+                    for keras_symbolic_tensor in original_keras_symbolic_tensors_order:
+                        if (name_of_bn := keras_symbolic_tensor._keras_history.layer.name) in name_of_bn_replaced: #pylint: disable=superfluous-parens
+                            updated_keras_symbolic_tensors_order.append(
+                                model_layer_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS][
+                                    batch_norms_replaced_with_names[name_of_bn]
+                                ]
+                            )
+                        else:
+                            updated_keras_symbolic_tensors_order.append(keras_symbolic_tensor)
+
+                    # Dictionary of the keras symbolic tensor name to the order.
+                    ordered_inputs = {k.name: v for v, k in enumerate(updated_keras_symbolic_tensors_order)}
+
+                    # Sort layer_input based on the above dictionary.
+                    layer_input = sorted(layer_input, key=lambda current_input, oi=ordered_inputs: oi[current_input.name])
+
+            # Since we are rerouting around the batch normalization layers, we need to temporarily remove the inbound
+            # and outbound nodes of the batch normalization layers so that the model can be built correctly and not
+            # duplicate the non batch normalization layers inbound/outbound nodes.
+            current_layer._inbound_nodes = []  # pylint: disable=protected-access
+            # Special case for when there is a Lambda operation with multiple inputs. For example, z = x + y.
+            if isinstance(current_layer, TFOpLambda):
+                kmp = _KerasModelPreparer.get_instance_for_common_layer_passthrough_functions(model_layer_connections)
+                x = kmp._handle_normal_keras_layer(current_layer)  # pylint: disable=protected-access
+                # Updating the Model layer connections
+                kmp._update_output_tensors_in_model_layers_connections(  # pylint: disable=protected-access
+                    current_layer, x, model
+                )
+            else:
+                x = current_layer(layer_input)
+            current_layer._outbound_nodes = []  # pylint: disable=protected-access
+
+            # Set new output tensor (in this case, it will be the same as the original model)
+            model_layer_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS].update({current_layer.name: x})
+
+        # Save tensor in output list if it is output in the initial model
+        if current_layer.name in model.output_names:
+            model_outputs.append(x)
+
+    return tf.keras.Model(inputs=model.inputs, outputs=model_outputs)
+
+
+def _delete_bn_from_sequential(layer: tf.keras.layers.Layer,
+                               bn: tf.keras.layers.BatchNormalization):
+
+    """
+    This is the function for removing batch normalization layers that are layers of sequential model
+
+    :param layer: model to obtain bn_layer that we want to remove
+    :param bn: batch normalization layer that needs to be removed
+    """
+
+    layers_after_bn = []
+    visited = False
+    idx = None
+    # pylint: disable=protected-access
+    for index, inner_layer in enumerate(layer.layers):
+        if visited:
+            layers_after_bn.append(inner_layer)
+
+        elif inner_layer == bn:
+            visited = True
+            idx = index
+
+        elif inner_layer.submodules:
+            _delete_bn_for_non_subclassed_model(inner_layer, bn)
+
+    if visited and idx is not None:
+        # pylint: disable=protected-access
+        for _ in range(len(layer.layers) - idx):
+            layer.pop()
+        for layer_to_add in layers_after_bn:
+            layer.add(layer_to_add)
+
+
+def _delete_bn_for_non_subclassed_model(model: Union[tf.keras.Model, tf.keras.layers.Layer],
+                                        bn_layer: tf.keras.layers.BatchNormalization):
+    """
+    Remove bn layer for those model which are not part of model subclassing
+
+    :param model: model to delete bn layers from
+    :param bn_layer: bn layer that should be removed
+    """
+
+    if isinstance(model, tf.keras.Sequential):
+        _delete_bn_from_sequential(model, bn_layer)
+
+    # In the elif branch we expect to find a Sequential model or a subclassed layer
+    # nested inside a functional model
+    elif isinstance(model, (tf.keras.layers.Layer, tf.keras.Model)):
+        for layer in model.layers:
+            if layer.submodules:
+                _delete_bn_for_non_subclassed_model(layer, bn_layer)
+
+
+def _delete_bn_from_model_subclassing(module_to_name_map: Dict[tf.keras.layers.Layer,
+                                                               Tuple[tf.keras.Model, str]],
+                                      bn_layer: tf.keras.layers.BatchNormalization):
+    """
+    Remove a bn layer that is part of a model built with the model subclassing API
+    or of a model inheriting from tf.keras.layers.Layer
+
+    :param module_to_name_map: model to remove bn from
+    :param bn_layer: bn layer that should be removed
+    """
+
+    parent_ref, module_name = module_to_name_map[bn_layer]
+    op = PassThroughOp()
+    setattr(parent_ref, module_name, op)
+
+# pylint: disable=inconsistent-return-statements
+def _delete_all_bns_from_model(model: Union[tf.keras.Model, tf.keras.layers.Layer],
+                               bn_layers: List[tf.keras.layers.BatchNormalization]) -> Optional[tf.keras.Model]:
+    """
+    Remove all bn layers for a given model.
+
+    :param model: Model to have the bn layers removed from
+    :param bn_layers: bn layers that should be removed
+    :return: new model with bn layers removed, if model is functional else None
+    """
+    if bn_layers:
+        # QuantizationSimModel's model will fall into this case.
+        if isinstance(model, Functional) and not isinstance(model, tf.keras.Sequential) or any(isinstance(l, QcQuantizeWrapper) for l in model.layers):
+            return _delete_bn_from_functional(model, bn_layers)
+
+        module_to_name_map = common.module_to_name_map(model)
+
+        for bn_layer in bn_layers:
+            if bn_layer in module_to_name_map:
+                _delete_bn_from_model_subclassing(module_to_name_map, bn_layer)
+            else:
+                _delete_bn_for_non_subclassed_model(model, bn_layer)
+
+
+def _find_all_batch_norms_to_fold(model: tf.keras.Model) -> Tuple[List[PairType], List[PairType], Set[tf.keras.layers.BatchNormalization]]:
+    """
+    Uses the node/layer maps to find all conv/linear and batch norm pairs that can be folded
+
+    :param model: model to obtain conv/linear and batch norm pairs for
+    :return: Lists of (conv/linear, bn) and (bn, conv/linear) pairs marked for folding, and
+            a Set of all the batch norms which are marked for folding.
+    """
+
+    node_layer_map = common.create_node_to_layer_map(model)
+    layer_out_node_map = common.create_layer_to_out_node_map(model)
+
+    possible_convs_linears_bn = _find_possible_convs_linears_bn(node_layer_map, layer_out_node_map)
+
+    # get all conv/linear layers in call order
+    ordered_conv_linears = _get_ordered_conv_linears(node_layer_map, layer_out_node_map)
+
+    bn_picked_for_folding = set()
+    def get_pairs(conv_is_first=False) -> List:
+        index = 1 if conv_is_first else 0
+
+        pairs_list = []
+        for conv_linear in ordered_conv_linears:
+            if conv_linear in possible_convs_linears_bn and (bn_info := possible_convs_linears_bn[conv_linear]):
+                if bn_info[index] and bn_info[index] not in bn_picked_for_folding:
+                    pairs_list.append((conv_linear, bn_info[index]) if conv_is_first else (bn_info[index], conv_linear))
+                    bn_picked_for_folding.add(bn_info[index])
+
+        return pairs_list
+
+    conv_bn_pairs = get_pairs(conv_is_first=True)
+    bn_conv_pairs = get_pairs(conv_is_first=False)
+
+    return conv_bn_pairs, bn_conv_pairs, bn_picked_for_folding
+
+
+
[docs]def fold_all_batch_norms(model: tf.keras.Model) \ + -> Tuple[List[Tuple[LayerType, BatchNormType]], tf.keras.Model]: + """ + Fold all batch_norm layers in a model into corresponding conv/linear layers + + :param model: model to find all batch norms for + :return: A tuple of List of conv/linear layers with associated bn op / activation info and a new model with the + Batch Normalization layers folded + """ + + conv_bn_pairs, bn_conv_pairs, folded_bns = _find_all_batch_norms_to_fold(model) + + # Potential new model is returned in case the model is a functional model + potential_new_model = _fold_given_batch_norms(model, conv_bn_pairs, bn_conv_pairs) + model = potential_new_model if potential_new_model else model + + # Convert the standalone BNs which are not folded + bn_converted = convert_standalone_batchnorms(model, folded_bns) + if bn_converted: + _logger.info("%d BatchNorms' weights got converted", len(bn_converted)) + model.compile() + _logger.warning("A new model is returned with the Batch Normalization layers removed for Keras models. " + "Please use this new model for the rest of the AIMET flow.") + + return conv_bn_pairs + [(conv, bn) for bn, conv in bn_conv_pairs], model
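+
+# A minimal usage sketch of fold_all_batch_norms (illustrative only). Any Keras model can be
+# passed; ResNet50 is used here purely as a placeholder.
+def _example_fold_all_batch_norms() -> tf.keras.Model:
+    """ Illustrative sketch, not an AIMET API: fold BNs and keep using the returned model. """
+    model = tf.keras.applications.ResNet50(weights=None)
+    # fold_all_batch_norms returns the folded (conv/linear, BN) pairs and a new model;
+    # the returned model must be used for the rest of the AIMET flow.
+    folded_pairs, folded_model = fold_all_batch_norms(model)
+    _logger.info("%d (conv/linear, BN) pairs were folded", len(folded_pairs))
+    return folded_model
+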
+ + +def convert_standalone_batchnorms(model: tf.keras.Model, folded_bns: set) -> List[tf.keras.layers.BatchNormalization]: + """ + Converts the weights of standalone batch norms remaining in the model after BN folding + :param model: keras model on which batch norm folding is being performed + :param folded_bns: list of batch norms which got folded + :return: list of BatchNorms whose weights is converted + """ + + bn_converted = [] + for layer in model.layers: + if isinstance(layer, tf.keras.layers.BatchNormalization) and layer not in folded_bns: + convert_batchnorm_parameters(layer) + _logger.debug("%s weights got converted", layer.name) + bn_converted.append(layer) + return bn_converted + + +def convert_batchnorm_parameters(bn: tf.keras.layers.BatchNormalization): + """ + Convert the weights of BN such that it works as y = weights * x + bias + :param bn: Batch Norm layer whose weights need to be converted + """ + bn_params = _get_bn_params(bn) + + # inv :: 1/ Sqrt(var + eps) + inv = tf.math.rsqrt(bn.moving_variance.numpy() + bn.epsilon) + weight = np.array(bn_params.gamma) * np.array(inv) + bias = np.array(bn_params.beta) - np.array(bn_params.runningMean) * weight + + new_bn_weights = [weight.data, bias.data, + np.zeros(shape=bn.moving_mean.shape, dtype=np.float32), + np.ones(shape=bn.moving_variance.shape, dtype=np.float32)] + bn.trainable = False + bn.set_weights(new_bn_weights) + bn.epsilon = 0 + + +# pylint: disable=protected-access +
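+
+# A minimal sanity-check sketch for convert_batchnorm_parameters (illustrative only). It checks
+# that a standalone BN behaves identically after being rewritten as the affine transform
+# y = weight * x + bias, with weight = gamma / sqrt(moving_var + eps) and
+# bias = beta - moving_mean * weight.
+def _example_convert_standalone_bn():
+    """ Illustrative sketch, not an AIMET API: BN inference output is unchanged after conversion. """
+    bn = tf.keras.layers.BatchNormalization()
+    x = tf.random.normal((4, 8))
+    y_before = bn(x, training=False)        # builds the layer with default statistics
+    convert_batchnorm_parameters(bn)
+    y_after = bn(x, training=False)
+    np.testing.assert_allclose(y_before.numpy(), y_after.numpy(), atol=1e-5)
+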
[docs]def fold_all_batch_norms_to_scale(sim: QuantizationSimModel) -> List[Tuple[QcQuantizeWrapper, QcQuantizeWrapper]]: + """ + Fold all batch_norm layers in a model into the quantization scale parameter + of the corresponding conv layers + + :param sim: QuantizationSimModel to be folded + :return: A list of pairs of layers [(Conv/Linear, BN layer that got folded)] + """ + + assert sim.model is not None, "QuantizationSimModel attribute 'model' is None." + + model = sim._model_without_wrappers + + quant_wrappers = { + quant_wrapper._layer_to_wrap: quant_wrapper + for quant_wrapper in sim.quant_wrappers() + } + + conv_bn_pairs, bn_conv_pairs, _ = _find_all_batch_norms_to_fold(model) + conv_bn_pairs = [ + (quant_wrappers[conv], quant_wrappers[bn]) for conv, bn in conv_bn_pairs + ] + bn_conv_pairs = [ + (quant_wrappers[bn], quant_wrappers[conv]) for bn, conv in bn_conv_pairs + ] + + old_model_without_wrappers = tf.keras.models.clone_model(model) + conv_bn_pairs_without_wrappers, _, _ = _find_all_batch_norms_to_fold(old_model_without_wrappers) + old_model_without_wrappers.set_weights(WeightTensorUtils.get_all_sim_models_layer_to_wrap_weights(sim.model)) + + # We fold both the sim.model and sim._model_without_wrappers because we rebuild the QuantizationSimModel during + # export and this utilizes the sim._model_without_wrappers to achieve this. + bn_fold_sim_model = _fold_given_batch_norms(sim.model, conv_bn_pairs, bn_conv_pairs) + sim.model = bn_fold_sim_model if bn_fold_sim_model else sim.model + + bn_fold_model = _fold_given_batch_norms(old_model_without_wrappers, conv_bn_pairs_without_wrappers, []) + sim._model_without_wrappers = bn_fold_model if bn_fold_model else old_model_without_wrappers + + return conv_bn_pairs + [(conv, bn) for bn, conv in bn_conv_pairs]
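+
+# A minimal usage sketch of fold_all_batch_norms_to_scale (illustrative only). It assumes `sim`
+# is an already-created QuantizationSimModel whose quant scheme is one of the range-learning
+# schemes (training_range_learning_with_tf_init / training_range_learning_with_tf_enhanced_init),
+# since folding to scale is only supported for those, as checked in _fold_to_scale below.
+def _example_fold_all_batch_norms_to_scale(sim: QuantizationSimModel):
+    """ Illustrative sketch, not an AIMET API. """
+    folded_pairs = fold_all_batch_norms_to_scale(sim)
+    _logger.info("%d (conv/linear, BN) pairs were folded to scale", len(folded_pairs))
+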
+ +
[docs]def fold_given_batch_norms(model: tf.keras.Model, layer_pairs: List[PairType]) -> Optional[tf.keras.Model]: + """ + Fold a given set of batch_norm layers into conv_linear layers + + :param model: Either a Keras Model or a QuantizationSimModel's model + :param layer_pairs: List of (conv/linear, bn) or (bn, conv/linear) layer pairs to fold + :return: new model with batch norm layers folded if model is a functional model, else None + """ + # pylint: disable=protected-access + conv_bn_pairs = [] + bn_conv_pairs = [] + + def is_batchnorm(layer: tf.keras.layers.Layer) -> bool: + if isinstance(layer, QcQuantizeWrapper): + layer = layer._layer_to_wrap + return isinstance(layer, _supported_batchnorms) + + def is_conv_linear(layer: tf.keras.layers.Layer) -> bool: + if isinstance(layer, QcQuantizeWrapper): + layer = layer._layer_to_wrap + return isinstance(layer, _supported_layers) + + for x, y in layer_pairs: + if is_batchnorm(x): + assert is_conv_linear(y) + bn = x + conv = y + bn_conv_pairs.append((bn, conv)) + else: + assert is_conv_linear(x) + assert is_batchnorm(y) + conv = x + bn = y + conv_bn_pairs.append((conv, bn)) + + return _fold_given_batch_norms(model, conv_bn_pairs, bn_conv_pairs)
+ +def _fold_given_batch_norms(model: tf.keras.Model, + conv_bn_pairs: Iterable[Tuple[tf.keras.layers.Layer, tf.keras.layers.Layer]], + bn_conv_pairs: Iterable[Tuple[tf.keras.layers.Layer, tf.keras.layers.Layer]]) -> \ + Optional[tf.keras.Model]: + """ + Fold a given set of batch_norm layers into conv layers + + :param model: Model + :param conv_bn_pairs: List of (conv, bn) pairs to fold + :param bn_conv_pairs: List of (bn, conv) pairs to fold + """ + for bn, conv in bn_conv_pairs: + if isinstance(conv, QcQuantizeWrapper): + raise RuntimeError(f"Forward folding to scale is not possible. Got {conv.name}") + + bn_layers = [] + + def _fold(conv, bn, fold_backward): + is_wrapped = isinstance(conv, QcQuantizeWrapper) or isinstance(bn, QcQuantizeWrapper) + try: + if is_wrapped: + assert isinstance(conv, QcQuantizeWrapper) and isinstance(bn, QcQuantizeWrapper) + bn._layer_to_wrap.trainable = False + _fold_to_scale(conv, bn) + bn_layers.append(bn._layer_to_wrap) + else: + bn.trainable = False + _fold_to_weight(conv, bn, fold_backward=fold_backward) + except _BatchNormFoldingNotSupported as e: + bn_name = bn._layer_to_wrap.name if is_wrapped else bn.name + conv_name = conv._layer_to_wrap.name if is_wrapped else conv.name + _logger.warning( + "Failed to fold %s to %s. [Reason] %s", bn_name, conv_name, str(e) + ) + else: + bn_layers.append(bn._layer_to_wrap if is_wrapped else bn) + + for conv, bn in conv_bn_pairs: + _fold(conv, bn, fold_backward=True) + + for bn, conv in bn_conv_pairs: + _fold(conv, bn, fold_backward=False) + + return _delete_all_bns_from_model(model, bn_layers) + +class _BatchNormFoldingNotSupported(RuntimeError): + pass + +def _fold_to_scale(conv_wrapper: QcQuantizeWrapper, bn_wrapper: QcQuantizeWrapper): + """ + Fold BatchNorm into the scale and bias of the given layer. + + :param conv_wrapper: QcQuantizeWrapper that wraps conv or linear layer + :param bn_wrapper: QcQuantizeWrapper that wraps the Batch Norm layer + """ + # pylint: disable=protected-access, too-many-statements, too-many-locals + conv = conv_wrapper._layer_to_wrap + bn = bn_wrapper._layer_to_wrap + + weight_quantizer = get_wrappers_weight_quantizer(conv_wrapper.param_quantizers) + bias_quantizer = get_wrappers_bias_quantizer(conv_wrapper.param_quantizers) + + # Checking QuantScheme as aimet_tensorflow.keras does not have LearnedGridTensorQuantizer + if weight_quantizer.quant_scheme not in [QuantScheme.training_range_learning_with_tf_init, + QuantScheme.training_range_learning_with_tf_enhanced_init]: + raise _BatchNormFoldingNotSupported( + "BatchNorm folding to scale supports training_range_learning_with_tf_init or " + "training_range_learning_with_tf_enhanced_init only. " + f"got {weight_quantizer.quant_scheme}" + ) + + output_quantizer = conv_wrapper.output_quantizers[0] + + if output_quantizer.is_enabled(): + raise _BatchNormFoldingNotSupported( + "BatchNorm should belong to the same supergroup with the layer to be folded to." + ) + + if bias_quantizer: + if bias_quantizer.is_enabled(): + raise _BatchNormFoldingNotSupported( + "Can't fold BatchNorm to scale if bias quantizer is enabled." 
+ ) + + enc_min = weight_quantizer._encoding_min + enc_max = weight_quantizer._encoding_max + + if not weight_quantizer.is_encoding_valid(): + raise RuntimeError + + with bn_wrapper._quantize_params(): + _fold_to_weight(conv, bn, fold_backward=True) + + gamma = bn.gamma + sigma = K.sqrt(bn.moving_variance + bn.epsilon) + + for i, c in enumerate(gamma/sigma): + c = float(c) + if c >= 0: + enc_max[i].assign(enc_max[i] * c) + enc_min[i].assign(enc_min[i] * c) + else: + enc_max_before_reassign = enc_max[i] + enc_max[i].assign(enc_min[i] * c) + enc_min[i].assign(enc_max_before_reassign * c) + + + # Copy batchnorm's output quantizers to conv output quantizers + for conv_output_quantizer, bn_output_quantizer in \ + zip(conv_wrapper.output_quantizers, bn_wrapper.output_quantizers): + if bn_output_quantizer.encoding is not None: + conv_output_quantizer._encoding_min.assign(bn_output_quantizer._encoding_min) + conv_output_quantizer._encoding_max.assign(bn_output_quantizer._encoding_max) + conv_output_quantizer._is_encoding_valid = True + + tensor_quantizers = conv_output_quantizer._tensor_quantizer if isinstance(conv_output_quantizer._tensor_quantizer, List) else [conv_output_quantizer._tensor_quantizer] + for tensor_quantizer in tensor_quantizers: + tensor_quantizer.isEncodingValid = True + + if bn_output_quantizer.is_enabled(): + conv_output_quantizer.enable() + else: + conv_output_quantizer.disable() + + + + bn_output_quantizer.disable() + + if bias_quantizer is None: + bias_quantizer = ParamPerTensorQuantizer(conv, + conv.bias.name.split(':')[0], + weight_quantizer.quant_scheme, + MAP_PYMO_TO_ROUND_MODE[weight_quantizer.round_mode], + weight_quantizer.bitwidth, + weight_quantizer.data_type, + weight_quantizer.is_symmetric, + weight_quantizer.use_strict_symmetric, + weight_quantizer.use_unsigned_symmetric, + enabled=False) + tensor_quantizers = bias_quantizer._tensor_quantizer if isinstance(bias_quantizer._tensor_quantizer, List) else [bias_quantizer._tensor_quantizer] + for tensor_quantizer in tensor_quantizers: + tensor_quantizer.isEncodingValid = True + conv_wrapper.param_quantizers.append(bias_quantizer) + + +def _fold_to_weight(conv_linear: LayerType, bn: BatchNormType, fold_backward: bool): + """ + Fold BatchNorm into the weight and bias of the given layer. + + :param conv_linear: Conv or linear layer to fold BN into. + :param bn: BatchNorm to fold. + :param fold_backward: To fold backwards or not + """ + + is_bias_valid = conv_linear.bias is not None + + bn_params = _get_bn_params(bn) + weight_tensor = _get_weight_tensor_transpose_reshape(conv_linear) + bias_tensor = _get_bias_tensor(conv_linear) + + # Updated weight and bias + bias = libpymo.fold(bn_params, weight_tensor, bias_tensor, is_bias_valid, fold_backward) + + if isinstance(conv_linear, tf.keras.layers.DepthwiseConv2D): + # Depthwise conv layers in TF have outputs(Noc) set to 1. 
+ # we send in format [Nic, Noc, kh, kw] + numpy_weight_reshaped = np.reshape(weight_tensor.data, weight_tensor.shape).transpose((2, 3, 0, 1)) + + elif isinstance(conv_linear, tf.keras.layers.Dense): + # o, i - convert to i , o + numpy_weight_reshaped = np.reshape( + weight_tensor.data, + [weight_tensor.shape[0], weight_tensor.shape[1]]).transpose(1, 0) + + elif isinstance(conv_linear, tf.keras.layers.Conv2DTranspose): + # we sent in format [Noc, Nic, kh, kw] + numpy_weight_reshaped = np.reshape(weight_tensor.data, weight_tensor.shape).transpose((2, 3, 0, 1)) + + else: + # conv2D case + # we sent in format [Noc, Nic, kh, kw] + numpy_weight_reshaped = np.reshape(weight_tensor.data, weight_tensor.shape).transpose((2, 3, 1, 0)) + + # update bias tensor, even in case there was no existing bias add op in given conv2D op. + bias_tensor_shape = [weight_tensor.shape[0]] + numpy_bias_reshaped = np.reshape(bias, bias_tensor_shape) + + if not is_bias_valid: + conv_linear.use_bias = True + conv_linear.bias = conv_linear.add_weight(name=f"{conv_linear.name}/bias", + shape=(weight_tensor.shape[0],), + dtype=conv_linear.dtype, + trainable=True) + conv_linear.set_weights([numpy_weight_reshaped.data, numpy_bias_reshaped]) +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/keras/bn_reestimation.html b/releases/1.32.2/_modules/aimet_tensorflow/keras/bn_reestimation.html new file mode 100644 index 00000000..03bdc3fb --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/keras/bn_reestimation.html @@ -0,0 +1,1254 @@ + + + + + + aimet_tensorflow.keras.bn_reestimation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +

Source code for aimet_tensorflow.keras.bn_reestimation

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+"""BatchNorm Reestimation"""
+from typing import List, Dict
+import numpy as np
+import tensorflow as tf
+from aimet_common.utils import Handle, AimetLogger
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Utils)
+
+def _get_bn_submodules(model: tf.keras.Model) -> List[tf.keras.layers.Layer]:
+    bn_layers = []
+    for layer in model.submodules:
+        if isinstance(layer, tf.keras.layers.BatchNormalization):
+            bn_layers.append(layer)
+    return bn_layers
+
+
+def _reset_bn_stats(bn_layers: List[tf.keras.layers.Layer], bn_mean_checkpoints: Dict, bn_var_checkpoints: Dict, bn_momentum_checkpoints: Dict) -> Handle:
+    """
+    Reset BN statistics by setting momentum to 0 so that each forward pass overwrites the
+    moving statistics with the statistics of the current batch
+    :param bn_layers: keras bn_layers
+    :param bn_mean_checkpoints: Dict holding the original bn moving means
+    :param bn_var_checkpoints: Dict holding the original bn moving variances
+    :param bn_momentum_checkpoints: Dict holding the original bn momentum values
+    :return: Handle whose remove() restores the original BN statistics and momentum
+    """
+
+    def cleanup():
+        """
+        Restore Bn stats
+        """
+        for layer in bn_layers:
+            move_mean = bn_mean_checkpoints[layer.name]
+            move_var = bn_var_checkpoints[layer.name]
+            gamma, beta, _, _ = layer.get_weights()
+            layer.set_weights([gamma, beta, move_mean, move_var])
+            layer.momentum = bn_momentum_checkpoints[layer.name]
+
+    try:
+        for layer in bn_layers:
+            layer.momentum = 0.0
+        return Handle(cleanup)
+    except:
+        cleanup()
+        raise ValueError('exception for reset_bn_stats')  # pylint: disable=raise-missing-from
+
+# pylint: disable=too-many-locals
+
[docs]def reestimate_bn_stats(model: tf.keras.Model, bn_re_estimation_dataset: tf.data.Dataset, + bn_num_batches: int = 100) -> Handle: + """ + Top-level API to re-estimate the BatchNorm statistics of a model on a given dataset + + :param model: tf.keras.Model + :param bn_re_estimation_dataset: Training dataset used for re-estimation + :param bn_num_batches: The number of batches to be used for re-estimation + :returns: Handle that undoes the effect of BN re-estimation upon handle.remove() + """ + + bn_layers = _get_bn_submodules(model) + + # save checkpoints of the original BN statistics and momentum + bn_mean_ori = {layer.name: layer.moving_mean.numpy() for layer in bn_layers} + bn_var_ori = {layer.name: layer.moving_variance.numpy() for layer in bn_layers} + bn_momentum_ori = {layer.name: layer.momentum for layer in bn_layers} + # 1. switch to re-estimation mode and set up the undo handle + handle = _reset_bn_stats(bn_layers, bn_mean_ori, bn_var_ori, bn_momentum_ori) + + # 2. initialize mean & var accumulation buffers + mean_sum_dict = {layer.name: np.zeros(layer.moving_mean.shape, dtype=layer.moving_mean.dtype.as_numpy_dtype) for layer in bn_layers} + var_sum_dict = {layer.name: np.zeros(layer.moving_variance.shape, dtype=layer.moving_variance.dtype.as_numpy_dtype) for layer in bn_layers} + + # 3. forward one batch at a time for BN re-estimation, accumulating into the mean & var buffers + bn_dataset_iterator = iter(bn_re_estimation_dataset) + for batch_index in range(bn_num_batches): + try: + batch_data = next(bn_dataset_iterator) + model(batch_data, training=True) + for layer in bn_layers: + mean_sum_dict[layer.name] += layer.moving_mean.numpy() + var_sum_dict[layer.name] += layer.moving_variance.numpy() + if batch_index == bn_num_batches - 1: + break + except tf.errors.OutOfRangeError: + logger.info("tf.errors.OutOfRangeError:: End of dataset.") + break + + # 4. average the mean & var buffers and override the BN stats with the re-estimated values + for layer in bn_layers: + move_mean = mean_sum_dict[layer.name]/bn_num_batches + move_var = var_sum_dict[layer.name]/bn_num_batches + gamma, beta, _, _ = layer.get_weights() + layer.set_weights([gamma, beta, move_mean, move_var]) + + return handle
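+
+# A minimal usage sketch of reestimate_bn_stats (illustrative only). `model` and `bn_dataset`
+# are placeholders for an already-trained Keras model (typically a QuantizationSimModel's model
+# after QAT) and a tf.data.Dataset yielding training batches.
+def _example_reestimate_bn_stats(model: tf.keras.Model, bn_dataset: tf.data.Dataset):
+    """ Illustrative sketch, not an AIMET API: re-estimate, use, then undo. """
+    handle = reestimate_bn_stats(model, bn_dataset, bn_num_batches=20)
+    # ... use the model with the re-estimated BN statistics here (e.g. fold BNs, export) ...
+    handle.remove()  # restores the original BN statistics and momentum
+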
+
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/keras/compress.html b/releases/1.32.2/_modules/aimet_tensorflow/keras/compress.html new file mode 100644 index 00000000..2ec0e806 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/keras/compress.html @@ -0,0 +1,1233 @@ + + + + + + aimet_tensorflow.keras.compress — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +

Source code for aimet_tensorflow.keras.compress

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Top-level API for aimet compression library """
+
+from typing import Union, Tuple, Callable
+import tensorflow as tf
+
+from aimet_common.defs import CostMetric, CompressionScheme, EvalFunction, CompressionStats
+from aimet_common.bokeh_plots import BokehServerSession
+
+from aimet_tensorflow.utils.graph_saver import keras_wrapper_func, keras_save_and_load_graph, keras_remove_hanging_nodes
+from aimet_tensorflow.defs import SpatialSvdParameters
+from aimet_tensorflow.keras.compression_factory import CompressionFactory
+
+
+
[docs]class ModelCompressor: + """ aimet model compressor: Enables model compression using various schemes """ + + # pylint: disable=too-many-arguments + +
[docs] @staticmethod + def compress_model(model: tf.keras.Model, eval_callback: EvalFunction, eval_iterations, + compress_scheme: CompressionScheme, cost_metric: CostMetric, + parameters: Union[SpatialSvdParameters], + trainer: Callable = None, visualization_url: str = None) -> Tuple[tf.keras.Model, CompressionStats]: + """ + Compress a given model using the specified parameters + + :param model: Model, represented by a tf.keras.Model, to compress + :param eval_callback: Evaluation callback. Expected signature is evaluate(model, iterations, use_cuda). + Expected to return an accuracy metric. + :param eval_iterations: Iterations to run evaluation for. + :param compress_scheme: Compression scheme. See the enum for allowed values + :param cost_metric: Cost metric to use for the compression-ratio (either mac or memory) + :param parameters: Compression parameters specific to given compression scheme + :param trainer: Training function + None: If per layer fine-tuning is not required while creating the final compressed model + :param visualization_url: url the user will need to input where visualizations will appear + :return: A tuple of the compressed model session, and compression statistics + """ + + # If no url is passed in, then do not create a bokeh server session + if not visualization_url: + bokeh_session = None + else: + # create a bokeh session to publish visualizations to the server document for compression + bokeh_session = BokehServerSession(url=visualization_url, session_id="compression") + + if parameters.multiplicity < 1: + raise ValueError('Rounding Multiplicity should be greater than 1') + + # wrapper_func saves and reloads the graph before evaluation + # In Keras after making changes to the graph you must save and reload, then evaluate + eval_callback = keras_wrapper_func(eval_callback) + + if compress_scheme == CompressionScheme.spatial_svd: + algo = CompressionFactory.create_spatial_svd_algo(model, eval_callback, eval_iterations, + cost_metric, parameters, bokeh_session) + elif compress_scheme == CompressionScheme.weight_svd: + raise NotImplementedError("Not yet implemented for: {}".format(compress_scheme)) + elif compress_scheme == CompressionScheme.channel_pruning: + raise NotImplementedError("Not yet implemented for: {}".format(compress_scheme)) + else: + raise ValueError("Compression scheme not supported: {}".format(compress_scheme)) + + compressed_layer_db, stats = algo.compress_model(cost_metric, trainer) + + # In keras after making changes to the model you must save and reload, then evaluate + tmp_dir = './data/saved_model' + updated_model = keras_save_and_load_graph(tmp_dir, compressed_layer_db.model) + + # Remove the hanging nodes + updated_model = keras_remove_hanging_nodes(updated_model) + + return updated_model, stats
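+
+# A minimal usage sketch of ModelCompressor.compress_model (illustrative only). `model`,
+# `evaluate_fn` and `params` are placeholders; `params` must be a fully constructed
+# SpatialSvdParameters instance (its constructor arguments are elided here).
+def _example_compress_model(model: tf.keras.Model, evaluate_fn, params: SpatialSvdParameters):
+    """ Illustrative sketch, not an AIMET API: spatial-SVD compression driven by MAC cost. """
+    compressed_model, stats = ModelCompressor.compress_model(
+        model,
+        eval_callback=evaluate_fn,      # expected signature: evaluate(model, iterations, use_cuda)
+        eval_iterations=10,
+        compress_scheme=CompressionScheme.spatial_svd,
+        cost_metric=CostMetric.mac,
+        parameters=params)
+    print(stats)                        # CompressionStats summary
+    return compressed_model
+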
+
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/keras/cross_layer_equalization.html b/releases/1.32.2/_modules/aimet_tensorflow/keras/cross_layer_equalization.html new file mode 100644 index 00000000..793f6f80 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/keras/cross_layer_equalization.html @@ -0,0 +1,1618 @@ + + + + + + aimet_tensorflow.keras.cross_layer_equalization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +

Source code for aimet_tensorflow.keras.cross_layer_equalization

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2021, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+"""Cross Layer Equalization"""
+
+import typing
+
+import numpy as np
+import tensorflow as tf
+
+import aimet_common.libpymo as libpymo
+from aimet_common.utils import AimetLogger
+from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms
+from aimet_tensorflow.keras.graphsearchtuils import GraphSearchUtils, ClsSet
+from aimet_tensorflow.keras.utils import model_transform_utils
+from aimet_tensorflow.keras.utils.weight_tensor_utils import WeightTensorUtils
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.CrosslayerEqualization)
+
+BatchNormFoldedPair = typing.Union[typing.Tuple[tf.keras.layers.Conv2D,
+                                                tf.keras.layers.BatchNormalization],
+                                   typing.Tuple[tf.keras.layers.Dense,
+                                                tf.keras.layers.BatchNormalization]]
+
+ScaleFactor = typing.Union[np.ndarray, typing.Tuple[np.ndarray, np.ndarray]]
+ReluFlag = typing.Union[bool, typing.Tuple[bool, bool]]
+
+Convs = typing.Union[tf.keras.layers.Conv2D,
+                     tf.keras.layers.DepthwiseConv2D,
+                     tf.keras.layers.Conv2DTranspose]
+
+_supported_convs = Convs.__args__
+
+
[docs]class ClsSetInfo: + """ + This class holds information about the layers in a CLS set, along with corresponding scaling factors + and other information such as whether there is a ReLU activation function between the CLS set layers + """ + +
[docs] class ClsSetLayerPairInfo: + """ + Models a pair of layers that were scaled using CLS, along with related information. + """ + + def __init__(self, layer1: tf.keras.layers.Conv2D, layer2: tf.keras.layers.Conv2D, scale_factor: np.ndarray, + relu_activation_between_layers: bool): + """ + :param layer1: Layer whose bias is folded + :param layer2: Layer to which the previous layer's bias is folded + :param scale_factor: Scale Factor found from Cross Layer Scaling to scale BN parameters + :param relu_activation_between_layers: If the activation between layer1 and layer2 is Relu + """ + self.layer1 = layer1 + self.layer2 = layer2 + self.scale_factor = scale_factor + self.relu_activation_between_layers = relu_activation_between_layers + + def __eq__(self, other): + if isinstance(self, other.__class__): + return self.layer1 == other.layer1 and \ + self.layer2 == other.layer2 and \ + np.allclose(self.scale_factor, other.scale_factor) and \ + self.relu_activation_between_layers == other.relu_activation_between_layers + return False
+ + def __init__(self, cls_pair_1: ClsSetLayerPairInfo, cls_pair_2: ClsSetLayerPairInfo = None): + """ + Constructor takes 2 pairs if Depth-wise separable layer is being folded + :param cls_pair_1: Pair between two conv or conv and depth-wise conv + :param cls_pair_2: Pair between depth-wise conv and point-wise conv + """ + if cls_pair_2: + self.cls_pair_info_list = [cls_pair_1, cls_pair_2] + else: + self.cls_pair_info_list = [cls_pair_1] + + def __eq__(self, other): + if isinstance(self, other.__class__): + return self.cls_pair_info_list == other.cls_pair_info_list + + return False
+ +class CrossLayerScaling: + """ + Code to apply the cross-layer-scaling technique to a model + """ + + @staticmethod + def scale_cls_set_with_conv_layers( + cls_set: typing.Tuple[tf.keras.layers.Conv2D, tf.keras.layers.Conv2D]) -> np.ndarray: + """ + API to invoke equalize layer params (update for weights and bias is in place) + :param cls_set: Consecutive Conv layers Tuple whose weights and biases need to be equalized + :return: Scaling factor S_12 for each conv layer pair: numpy array + """ + + for layer in cls_set: + # NOTE: DepthwiseConv2D and Conv2DTranspose is subclass of Conv2D + # The check below covers all of Conv2D, DepthwiseConv2D and Conv2DTranspose class + if not isinstance(layer, tf.keras.layers.Conv2D): + raise ValueError("Only Conv or Transposed Conv layers are supported for CLE") + + scaling_factor, prev_layer_params, curr_layer_params = CrossLayerScaling.call_mo_scale(cls_set) + + prev_layer, curr_layer = cls_set + weight_and_bias_0 = CrossLayerScaling._unpack_equalization_params(prev_layer, prev_layer_params, + unpack_bias=True) + prev_layer.set_weights(weight_and_bias_0) + + weight_and_bias_1 = CrossLayerScaling._unpack_equalization_params(curr_layer, curr_layer_params, + unpack_bias=False) + curr_layer.set_weights(weight_and_bias_1) + + return scaling_factor + + @staticmethod + def call_mo_scale(cls_set: typing.Tuple[tf.keras.layers.Conv2D, tf.keras.layers.Conv2D]) \ + -> typing.Tuple[np.ndarray, libpymo.EqualizationParams, libpymo.EqualizationParams]: + """ + Invokes scale API in model optimization library + :param cls_set: Consecutive Conv layers Tuple whose weights and biases need to be equalized + :return: Scaling factor, prev and current layer updated parameters + """ + prev_layer_params = CrossLayerScaling._pack_equalization_params(cls_set[0], pack_bias=True) + curr_layer_params = CrossLayerScaling._pack_equalization_params(cls_set[1], pack_bias=False) + + scaling_factor = libpymo.scaleLayerParams(prev_layer_params, curr_layer_params) + return scaling_factor, prev_layer_params, curr_layer_params + + @staticmethod + def scale_cls_set_with_depthwise_conv_layers( + cls_set: typing.Tuple[tf.keras.layers.Conv2D, + tf.keras.layers.DepthwiseConv2D, + tf.keras.layers.Conv2D]) -> typing.Tuple[np.ndarray, np.ndarray]: + """ + API to invoke equalize layer params (update for weights and bias is in place) + :param cls_set: Consecutive Conv layers whose weights and biases need to be equalized. 
+ Second Conv layer is a depth-wise conv and third conv layer is point-wise conv + :return: Scaling factors S_12 and S_23 : numpy arrays + """ + + for layer in cls_set: + if not isinstance(layer, _supported_convs): + raise ValueError("Only Conv or Transposed Conv layers are supported for CLE") + + scaling_params, prev_layer_params, curr_layer_params, next_layer_params = \ + CrossLayerScaling.call_mo_scale_depthwise_separable_layer(cls_set) + + prev_layer, curr_layer, next_layer = cls_set + weight_and_bias_0 = CrossLayerScaling._unpack_equalization_params(prev_layer, + prev_layer_params, + unpack_bias=True) + prev_layer.set_weights(weight_and_bias_0) + + weight_and_bias_1 = CrossLayerScaling._unpack_equalization_params(curr_layer, + curr_layer_params, + unpack_bias=True) + curr_layer.set_weights(weight_and_bias_1) + + weight_and_bias_2 = CrossLayerScaling._unpack_equalization_params(next_layer, + next_layer_params, + unpack_bias=False) + next_layer.set_weights(weight_and_bias_2) + + return scaling_params.scalingMatrix12, scaling_params.scalingMatrix23 + + @staticmethod + def call_mo_scale_depthwise_separable_layer( + cls_set: typing.Tuple[tf.keras.layers.Conv2D, + tf.keras.layers.DepthwiseConv2D, + tf.keras.layers.Conv2D]) -> typing.Tuple[libpymo.RescalingParamsVectors, + libpymo.EqualizationParams, + libpymo.EqualizationParams, + libpymo.EqualizationParams]: + """ + Invokes scale API in model optimization library + :param cls_set: Consecutive Conv layers whose weights and biases need to be equalized + :return: Scaling factors, prev, current and next layer updated parameters + """ + + prev_layer_params = CrossLayerScaling._pack_equalization_params(cls_set[0], pack_bias=True) + curr_layer_params = CrossLayerScaling._pack_equalization_params(cls_set[1], pack_bias=True) + next_layer_params = CrossLayerScaling._pack_equalization_params(cls_set[2], pack_bias=False) + + scaling_params = libpymo.scaleDepthWiseSeparableLayer(prev_layer_params, curr_layer_params, next_layer_params) + return scaling_params, prev_layer_params, curr_layer_params, next_layer_params + + @staticmethod + def _pack_equalization_params(layer: tf.keras.layers.Conv2D, pack_bias: bool) -> libpymo.EqualizationParams: + equalization_params = libpymo.EqualizationParams() + + param_tensors = layer.get_weights() + + weight_tensor = param_tensors[0] + weight_tensor = WeightTensorUtils.transpose_from_tf_to_libpymo_format(weight_tensor, layer) + + equalization_params.weight = weight_tensor.reshape(-1) + equalization_params.weightShape = np.array(weight_tensor.shape) + + if pack_bias: + if layer.use_bias: + equalization_params.bias = param_tensors[1] + else: + equalization_params.isBiasNone = True + + return equalization_params + + @staticmethod + def _unpack_equalization_params(layer: tf.keras.layers.Conv2D, + equalization_params: libpymo.EqualizationParams, + unpack_bias: bool) -> typing.List: + + weight_tensor = np.reshape(equalization_params.weight, equalization_params.weightShape) + weight_tensor = WeightTensorUtils.transpose_from_libpymo_to_tf_format(weight_tensor, layer) + + if layer.use_bias: + if unpack_bias: + bias_tensor = np.reshape(equalization_params.bias, equalization_params.weightShape[0]) + else: + _, bias_tensor = layer.get_weights() + + param_tensors = [weight_tensor, bias_tensor] + else: + param_tensors = [weight_tensor] + + return param_tensors + + @staticmethod + def scale_cls_sets(cls_sets: typing.List[ClsSet]) -> \ + typing.List[typing.Union[np.ndarray, typing.Tuple[np.ndarray, np.ndarray]]]: + """ + Scale each 
cls set + :param cls_sets: Cls sets to scale + :return: List of scale factors corresponding to each scaled cls set + """ + scale_factor_list = [] + for cls_set in cls_sets: + if len(cls_set) == 3: + scale_factor = CrossLayerScaling.scale_cls_set_with_depthwise_conv_layers(cls_set) + else: + scale_factor = CrossLayerScaling.scale_cls_set_with_conv_layers(cls_set) + scale_factor_list.append(scale_factor) + return scale_factor_list + + @staticmethod + def create_cls_set_info_list(cls_sets: typing.List[ClsSet], + scale_factors: typing.List[ScaleFactor], + is_relu_activation_in_cls_sets: typing.List[ReluFlag]) -> typing.List[ClsSetInfo]: + """ + Binds information from there separate lists into one [ClsInfoSet] data structure + :param cls_sets: List of CLS sets + :param scale_factors: List of scale factors for each cls set + :param is_relu_activation_in_cls_sets: List of ReLU flag whether there is ReLU activation in each cls set + :return: List of ClsSetInfo + """ + assert len(cls_sets) == len(scale_factors) == len(is_relu_activation_in_cls_sets) + + cls_set_info_list = [] + for cls_set, scale_factor, has_relu_activation in zip(cls_sets, + scale_factors, + is_relu_activation_in_cls_sets): + # Depthwise separable convolution layer case (triplet of layers) + # Should have two scale factors and ReLU flags + if isinstance(scale_factor, tuple): + assert len(cls_set) == 3 + assert len(scale_factor) == len(has_relu_activation) == 2 + + prev_layer, curr_layer, next_layer = cls_set + cls_pair_1 = ClsSetInfo.ClsSetLayerPairInfo(prev_layer, curr_layer, + scale_factor[0], has_relu_activation[0]) + cls_pair_2 = ClsSetInfo.ClsSetLayerPairInfo(curr_layer, next_layer, + scale_factor[1], has_relu_activation[1]) + cls_set_info = ClsSetInfo(cls_pair_1, cls_pair_2) + + # Standard convolution layer case (tuple of layers) + # Should have one scale factor and ReLU flag + else: + prev_layer, curr_layer = cls_set + cls_pair = ClsSetInfo.ClsSetLayerPairInfo(prev_layer, curr_layer, + scale_factor, has_relu_activation) + cls_set_info = ClsSetInfo(cls_pair) + + cls_set_info_list.append(cls_set_info) + + return cls_set_info_list + + @staticmethod + def scale_model(model: tf.keras.Model) -> typing.List[ClsSetInfo]: + """ + Uses cross-layer scaling to scale all applicable layers in the given model + :param model: tf.keras.Model + :return: CLS information for each CLS set + """ + + # Find layer groups + graph_search_util = GraphSearchUtils(model) + layer_groups = graph_search_util.find_layer_groups_to_scale() + + # Find cls sets from the layer groups + cls_sets = [] + for layer_group in layer_groups: + cls_set = GraphSearchUtils.convert_layer_group_to_cls_sets(layer_group) + cls_sets += cls_set + + # Scale the CLS sets + scale_factors = CrossLayerScaling.scale_cls_sets(cls_sets) + + # Find if there were ReLU activations between layers of each cls set + is_relu_activation_in_cls_sets = graph_search_util.is_relu_activation_present_in_cls_sets(cls_sets) + + # Convert to a list of cls set info elements + cls_set_info_list = CrossLayerScaling.create_cls_set_info_list(cls_sets, + scale_factors, + is_relu_activation_in_cls_sets) + + return cls_set_info_list + + +class HighBiasFold: + """ + Code to apply the high-bias-fold technique to a model + """ + + @staticmethod + def bias_fold(cls_set_info_list: typing.List[ClsSetInfo], + bn_layers: typing.Dict[tf.keras.layers.Conv2D, tf.keras.layers.BatchNormalization]): + """ + Folds bias values greater than 3 * sigma to next layer's bias + :param cls_set_info_list: List of info elements for 
each cls set + :param bn_layers: Key: Conv/Linear layer Value: Corresponding folded BN layer + """ + if not bn_layers: + _logger.info('High Bias folding is not supported for models without BatchNorm Layers') + return + + for cls_set_info in cls_set_info_list: + for cls_pair_info in cls_set_info.cls_pair_info_list: + if (not cls_pair_info.layer1.use_bias) or (not cls_pair_info.layer2.use_bias) or \ + (cls_pair_info.layer1 not in bn_layers): + continue + + prev_layer_params, curr_layer_params = HighBiasFold.call_mo_high_bias_fold(cls_pair_info, bn_layers) + + layer1 = cls_pair_info.layer1 + layer1_weight_tensor, _ = layer1.get_weights() + layer1_bias_tensor = np.array(prev_layer_params.bias) + layer1.set_weights([layer1_weight_tensor, layer1_bias_tensor]) + + layer2 = cls_pair_info.layer2 + layer2_weight_tensor, _ = layer2.get_weights() + layer2_bias_tensor = np.array(curr_layer_params.bias) + layer2.set_weights([layer2_weight_tensor, layer2_bias_tensor]) + + @staticmethod + def call_mo_high_bias_fold(cls_pair_info: ClsSetInfo.ClsSetLayerPairInfo, + bn_layers: typing.Dict[tf.keras.layers.Conv2D, tf.keras.layers.BatchNormalization]) \ + -> typing.Tuple[libpymo.LayerParams, libpymo.LayerParams]: + """ + Invokes high bias fold MO API + :param cls_pair_info: Pair of layers that were scaled using CLS and related information + :param bn_layers: Key: Conv/Linear layer Value: Corresponding folded BN layer + :return: Updated layer params + """ + + bn_layer = bn_layers[cls_pair_info.layer1] + prev_layer_bn_params = HighBiasFold._pack_bn_params_high_bias_fold(bn_layer, cls_pair_info.scale_factor) + + prev_layer_params, curr_layer_params = HighBiasFold._pack_layer_params(cls_pair_info) + + libpymo.updateBias(prev_layer_params, curr_layer_params, prev_layer_bn_params) + return prev_layer_params, curr_layer_params + + @staticmethod + def _pack_bn_params_high_bias_fold(bn_layer: tf.keras.layers.BatchNormalization, + scaling_parameter: np.ndarray) -> libpymo.BNParamsHighBiasFold: + """ + Helper method to pack BatchNormalization parameter for high bias fold + :param bn_layer: Target batch normalization layer + :param scaling_parameter: Scaling parameters for each channel obtained from cross layer scaling + :return: Packed BNParamsHighBiasFold + """ + bn_params = libpymo.BNParamsHighBiasFold() + + # Note: In BatchNormFold, we initialize gamma and beta to 1 and 0 respectively to work as Identity + # So if the original value was set, use it for High Bias Fold + if hasattr(bn_layer, "original_gamma") and hasattr(bn_layer, "original_beta"): + gamma, beta = bn_layer.original_gamma, bn_layer.original_beta + else: + gamma, beta, _, _ = bn_layer.get_weights() + + if len(scaling_parameter) != len(gamma) or len(scaling_parameter) != len(beta): + raise ValueError("High Bias absorption is not supported for networks with fold-forward BatchNorms") + + bn_params.gamma = np.divide(gamma, scaling_parameter) + bn_params.beta = np.divide(beta, scaling_parameter) + + return bn_params + + @staticmethod + def _pack_layer_params(cls_pair_info: ClsSetInfo.ClsSetLayerPairInfo) \ + -> typing.Tuple[libpymo.LayerParams, libpymo.LayerParams]: + """ + Helper method to pack information of previous and current layer + :param cls_pair_info: Pair of layers that were scaled using CLS and related information + :return: Packed layer parameter tuple + """ + # Pack parameters for previous layer + prev_layer_params = libpymo.LayerParams() + + prev_layer = cls_pair_info.layer1 + prev_layer_params.activationIsRelu = 
cls_pair_info.relu_activation_between_layers + + _, prev_layer_bias_tensor = prev_layer.get_weights() + prev_layer_params.bias = prev_layer_bias_tensor + + # Pack parameters for current layer + curr_layer_params = libpymo.LayerParams() + + curr_layer = cls_pair_info.layer2 + curr_layer_weight_tensor, curr_layer_bias_tensor = curr_layer.get_weights() + curr_layer_weight_tensor = WeightTensorUtils.transpose_from_tf_to_libpymo_format(curr_layer_weight_tensor, + curr_layer) + + curr_layer_params.bias = curr_layer_bias_tensor + curr_layer_params.weight = curr_layer_weight_tensor.reshape(-1) + curr_layer_params.weightShape = np.array(curr_layer_weight_tensor.shape) + + return prev_layer_params, curr_layer_params + + +
[docs]def equalize_model(model: tf.keras.Model) -> tf.keras.Model: + """ + High-level API to perform Cross-Layer Equalization (CLE) on the given model + :param model: tf.keras.Model + :return: CLE applied tf.keras.Model + """ + # replace any ReLU6 layers with ReLU + model_for_cle, _ = model_transform_utils.replace_relu6_with_relu(model) + + folded_pairs, model_for_cle = fold_all_batch_norms(model_for_cle) + + equalize_bn_folded_model(model_for_cle, folded_pairs) + + return model_for_cle
+ + +def equalize_bn_folded_model(model: tf.keras.Model, + folded_pairs: typing.List[BatchNormFoldedPair]): + """ + Perform Cross-Layer Scaling (CLS) and High Bias Folding (HBF) on a batchnorm-folded model in-place + :param model: BatchNorm-folded model to equalize + :param folded_pairs: List of pairs of folded layers + """ + bn_dict = {} + for conv_or_linear, bn in folded_pairs: + bn_dict[conv_or_linear] = bn + + # perform cross-layer scaling on applicable layer sets + cls_set_info_list = CrossLayerScaling.scale_model(model) + + # high-bias fold + HighBiasFold.bias_fold(cls_set_info_list, bn_dict) +
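+
+# A minimal usage sketch of equalize_model (illustrative only). Any Keras model can be passed;
+# MobileNet is used here purely as a placeholder. CLE returns a new model, which should be used
+# from this point on (e.g. to build a QuantizationSimModel).
+def _example_equalize_model() -> tf.keras.Model:
+    """ Illustrative sketch, not an AIMET API. """
+    model = tf.keras.applications.MobileNet(weights=None)
+    equalized_model = equalize_model(model)
+    return equalized_model
+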
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/keras/layer_output_utils.html b/releases/1.32.2/_modules/aimet_tensorflow/keras/layer_output_utils.html new file mode 100644 index 00000000..5dfef170 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/keras/layer_output_utils.html @@ -0,0 +1,1281 @@ + + + + + + aimet_tensorflow.keras.layer_output_utils — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +

Source code for aimet_tensorflow.keras.layer_output_utils

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" This module contains utilities to capture and save intermediate layer-outputs of a model. """
+
+import os
+from typing import Union, List, Tuple
+import re
+from collections import OrderedDict
+import json
+import numpy as np
+import tensorflow as tf
+from aimet_tensorflow.keras.quantsim import QcQuantizeWrapper, QcQuantizableMultiHeadAttention
+from aimet_common.layer_output_utils import SaveInputOutput
+from aimet_common.utils import AimetLogger
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.LayerOutputs)
+
+
[docs]class LayerOutputUtil: + """ Implementation to capture and save outputs of intermediate layers of a model (fp32/quantsim) """ + + def __init__(self, model: tf.keras.Model, save_dir: str = "./KerasLayerOutput"): + """ + Constructor for LayerOutputUtil. + + :param model: Keras (fp32/quantsim) model. + :param save_dir: Directory to save the layer outputs. + """ + # Freeze the model weights and state + model.trainable = False + + # Get intermediate model for layer-outputs + self.intermediate_model = self._get_intermediate_model(model) + + # Get actual Layer output name to modified layer output name dict + self.original_name_to_modified_name_mapper = self._get_original_name_to_modified_name_mapper(model) + + # Saving the actual layer output name to modified layer output name (valid file name to save) in a json file + os.makedirs(save_dir, exist_ok=True) + with open(os.path.join(save_dir, "LayerOutputNameMapper.json"), 'w', encoding='utf-8') as fp: + json.dump(self.original_name_to_modified_name_mapper, fp=fp, indent=4) + + # Identify the axis-layout used for representing an image tensor + axis_layout = 'NHWC' if tf.keras.backend.image_data_format() == 'channels_last' else 'NCHW' + + # Utility to save model inputs and their corresponding layer-outputs + self.save_inp_out_obj = SaveInputOutput(save_dir, axis_layout=axis_layout) + + @classmethod + def _get_layer_output_name(cls, layer: Union[QcQuantizeWrapper, QcQuantizableMultiHeadAttention, tf.keras.layers.Layer]): + """ + This function returns the actual layer output name for a given layer + :param layer: Keras model layer. + :return: Actual layer output name for the layer + """ + if isinstance(layer, QcQuantizeWrapper): + return layer.original_layer.output.name + return layer.output.name + + @classmethod + def _get_intermediate_model(cls, model: tf.keras.Model): + """ + This function instantiates the feature extraction model for per layer outputs + :param model: Keras model. + :return: Intermediate keras model for feature extraction + """ + outputs = [layer.output for layer in model.layers] + intermediate_model = tf.keras.models.Model(inputs=model.inputs, outputs=outputs) + intermediate_model.trainable = False + return intermediate_model + + @classmethod + def _get_original_name_to_modified_name_mapper(cls, model: tf.keras.Model): + """ + This function captures the per-layer output name and modifies it to make a valid file name + (by removing non-word characters) so that the layer output can be easily saved with the modified name. + :param model: Keras model. + :return: Actual layer name to modified layer name dict + """ + original_name_to_modified_name_mapper = OrderedDict() + for layer in model.layers: + layer_output_name = cls._get_layer_output_name(layer) + + # Replace all non-word characters with "_" to make it a valid file name for saving the results + # For Eg.: "conv2d/BiasAdd:0" gets converted to "conv2d_BiasAdd_0" + modified_layer_output_name = re.sub(r'\W+', "_", layer_output_name) + + original_name_to_modified_name_mapper[layer_output_name] = modified_layer_output_name + + return original_name_to_modified_name_mapper + + def get_outputs(self, input_batch: Union[tf.Tensor, List[tf.Tensor], Tuple[tf.Tensor]]): + """ + This function captures layer-outputs and renames them as per the AIMET exported model. + :param input_batch: Batch of inputs for which we want to obtain layer-outputs. 
+ :return: layer-output name to layer-output batch dict + """ + # Run in inference mode + outs = self.intermediate_model(input_batch, training=False) + output_pred = [out.numpy() for out in outs] + + return dict(zip(self.original_name_to_modified_name_mapper.values(), output_pred)) + +
[docs] def generate_layer_outputs(self, input_batch: Union[tf.Tensor, List[tf.Tensor], Tuple[tf.Tensor]]): + """ + This method captures output of every layer of a model & saves the inputs and corresponding layer-outputs to disk. + + :param input_batch: Batch of Inputs for which layer output need to be generated + :return: None + """ + layer_output_batch_dict = self.get_outputs(input_batch) + + # Skip constant scalar layer-outputs + const_scalar_layer_name = [] + for layer_name, layer_output in layer_output_batch_dict.items(): + if not isinstance(layer_output, np.ndarray): + const_scalar_layer_name.append(layer_name) + for layer_name in const_scalar_layer_name: + logger.info("Skipping constant scalar output of layer %s", layer_name) + _ = layer_output_batch_dict.pop(layer_name) + + self.save_inp_out_obj.save(np.array(input_batch), layer_output_batch_dict) + + logger.info("Layer Outputs Saved")
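+
+# A minimal usage sketch of LayerOutputUtil (illustrative only). `model` can be an fp32 Keras
+# model or a QuantizationSimModel's model; the (1, 224, 224, 3) input below simply assumes an
+# image model with that input shape.
+def _example_capture_layer_outputs(model: tf.keras.Model):
+    """ Illustrative sketch, not an AIMET API. """
+    layer_output_util = LayerOutputUtil(model, save_dir="./KerasLayerOutput")
+    input_batch = tf.random.normal((1, 224, 224, 3))
+    layer_output_util.generate_layer_outputs(input_batch)
+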
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/keras/model_preparer.html b/releases/1.32.2/_modules/aimet_tensorflow/keras/model_preparer.html new file mode 100644 index 00000000..8352c8a7 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/keras/model_preparer.html @@ -0,0 +1,1970 @@ + + + + + + aimet_tensorflow.keras.model_preparer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +

Source code for aimet_tensorflow.keras.model_preparer

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Implementation to automatically prepare keras models for AIMET by converting them to a functional model """
+import inspect
+import logging
+from typing import Any, Dict, List, Set, Union, Optional
+import re
+import numpy as np
+import tensorflow as tf
+
+import tensorflow.keras.backend as K
+from packaging import version  # pylint: disable=wrong-import-order
+
+if version.parse(tf.version.VERSION) >= version.parse("2.10"):
+    # Ignore pylint errors as keras module is not available in TF 2.4
+    from keras.engine.base_layer_utils import is_subclassed  # pylint: disable=import-error
+    from keras.engine.functional import Functional  # pylint: disable=import-error
+    from keras.engine.keras_tensor import KerasTensor  # pylint: disable=import-error
+    from keras.layers.core.tf_op_layer import TFOpLambda  # pylint: disable=import-error
+    from keras.layers.merging.base_merge import _Merge as MergeLayersParentClass  # pylint: disable=import-error
+else:
+    # Ignore pylint errors due to conditional imports
+    from tensorflow.python.keras.engine.base_layer_utils import is_subclassed  # pylint: disable=ungrouped-imports
+    from tensorflow.python.keras.engine.keras_tensor import KerasTensor  # pylint: disable=ungrouped-imports
+    from tensorflow.python.keras.engine.functional import Functional  # pylint: disable=ungrouped-imports
+    from tensorflow.python.keras.layers.core import TFOpLambda  # pylint: disable=ungrouped-imports
+    # pylint: disable=ungrouped-imports
+    from tensorflow.python.keras.layers.merge import \
+        _Merge as MergeLayersParentClass
+
+# pylint: disable=wrong-import-position
+from aimet_tensorflow.keras.utils.model_connection_utils import ModelLayerConnections, ModelLayerConnectionsProperties
+from aimet_tensorflow.keras.utils.model_transform_utils import replace_separable_conv_with_depthwise_pointwise, \
+    replace_relu6_with_relu
+from aimet_common.utils import AimetLogger
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.ModelPreparer)
+
+regex_for_camel_case_to_snake_case = re.compile(r'(?<!^)(?=[A-Z])')
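+# Used as regex_for_camel_case_to_snake_case.sub("_", name).lower(), e.g. "MyCustomBlock" -> "my_custom_block"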
+_TEMP_MODEL_NAME = "temp_aimet_intermediate_model"
+
+"""
+This file contains the implementation to automatically prepare keras models for AIMET by converting them to a functional model.
+"""
+
+
+class _KerasModelPreparer:
+
+    def __init__(
+        self,
+        original_model: Optional[tf.keras.Model] = None,
+        input_layer: Optional[tf.keras.layers.InputLayer] = None
+    ):
+        self.model_outputs = []  # Both normal init and "passthrough" init utilize this
+        if original_model:
+            self.input_layer = self._format_input_layer(original_model, input_layer)
+
+            if self._inherits_from_keras_model(original_model):
+                _logger.debug("This model inherits from tf.keras.Model. Need to connect model.")
+                self.original_model = self._connect_inherited_model(original_model, input_layer, is_original_model=True)
+
+            else:
+                self.original_model = original_model
+
+            # Used to fix weight names at end of unwrapping
+            # Originally set to the name of the original model's class in the case that there is an inherited model
+            self.class_names = self._get_class_names_in_model(self.original_model)
+
+            self.model_layers_connections = \
+                ModelLayerConnections.get_model_layers_connection_properties(self.original_model)
+            self._set_prepared_models_input_layer()
+
+            self.original_models_last_layer = self.original_model.layers[-1]
+
+            self.prepared_model = None
+            self.custom_objects = None
+            self.original_weights_in_prepared_model_order = None
+
+    @classmethod
+    def get_instance_for_common_layer_passthrough_functions(
+            cls, model_layers_connections: ModelLayerConnectionsProperties.TYPE
+    ):
+        """
+        Alternate constructor for classes outside _KerasModelPreparer that need access to helper functions
+        such as _handle_normal_keras_layer. For internal use ONLY. For normal Keras model preparation,
+        please use the prepare_model function.
+
+        :param model_layers_connections: Dictionary of Model Layer Connections for the functions to use.
+        :return: A slim instance of _KerasModelPreparer
+        """
+
+        self = cls(original_model=None, input_layer=None)
+        self.model_layers_connections = model_layers_connections
+        return self
+
+    def _get_original_models_weights_in_functional_model_order(self) -> List[np.ndarray]:
+        """
+        Map the original model's weights to the functional model's weights.
+
+        :return: A list of the original model's weights in the order of the functional model's weights
+        """
+        # Make the original model's weights into a dictionary for quick lookup by name
+        # The original subclassed layers names are removed to match the new functional model's names
+        original_model_weights = {}
+        for weight in self.original_model.weights:
+            # Strip the class names out of the weight name
+            weight_name = weight.name
+            for class_name in self.class_names:
+                weight_name = weight_name.replace(class_name + '/', '')
+            original_model_weights[weight_name] = weight.numpy()
+
+        # Get the functional model's weights in order as a dictionary for quick lookup where the key is the weight name
+        # and the position of the weight's order is the value
+        prepared_model_weight_order = {
+            weight.name: position
+            for position, weight in enumerate(self.prepared_model.weights)
+        }
+
+        # Using the functional model's weights order, get the original model's weights in the same order. The lambda
+        # here uses the weight's name to get its position in the functional model's weights order and then sorts the
+        # original model's weights by that position.
+        self.original_weights_in_prepared_model_order = [
+            weight for _, weight in
+            sorted(original_model_weights.items(), key=lambda weight_info: prepared_model_weight_order[weight_info[0]])
+        ]
+
+        return self.original_weights_in_prepared_model_order
+
+    def _set_prepared_models_weights(self):
+        """
+        Set the functional model's weights to the original model's weights in the correct order
+        """
+
+        assert self.prepared_model, (
+            "The prepared model must be created before setting weights. Please call "
+                "prepare_model() before calling set_weights()."
+        )
+
+        try:
+            self.prepared_model.set_weights(self._get_original_models_weights_in_functional_model_order())
+        except ValueError:
+            _logger.error(
+                "Could not copy weights from original model to the prepared model. This can occur when "
+                "custom sublayers are defined not in the same order as the sublayers call method. Please ensure that "
+                "the sublayers internal layers are defined in the same order as the sublayers call method.")
+            raise
+
+        _logger.debug("Functional model weights copied.")
+        _logger.info("Model prepared for AIMET in Functional API format.")
+
+    @staticmethod
+    def _format_input_layer(
+            original_model: tf.keras.Model,
+            input_layer: Union[tf.keras.layers.InputLayer, List[tf.keras.layers.InputLayer], dict]
+    ) -> tf.keras.layers.Layer:
+        """
+        This function formats the input layer by either using the original model's input layer or the user-provided
+        input layer. It will also raise an error if the model needs a defined input layer to be prepared
+        for AIMET.
+
+        :param original_model: The original model to be copied
+        :param input_layer: The input layer to be used for the functional model
+        :return: The input layer
+        """
+        if hasattr(original_model, "input"):
+            input_layer = original_model.input
+        elif isinstance(input_layer, tf.keras.layers.InputLayer):
+            _logger.info("Input layer explicitly passed in")
+            return input_layer
+        else:
+            _logger.info("Input layer not found. Using input layer passed in.")
+            if input_layer is None:
+                raise ValueError(
+                    "The top layer of this model is subclassed. Please provide an input layer via the "
+                    "\'input_layer\' parameter."
+                )
+
+        if isinstance(input_layer, dict):  # Keras allows passing in tensors via tensor_name : tensor
+            input_layer = list(input_layer.values())
+            if len(input_layer) == 1:
+                return input_layer[0]
+
+        return input_layer
+
+    @staticmethod
+    def _get_class_names_in_model(model: Union[tf.keras.Model, tf.keras.layers.Layer]) -> Set[str]:
+        """
+        Helper function to get the class name for a nested layer.
+
+        :param model: The 'layer' or 'model' to get the class names of
+        :return: A set containing the snake_case names of the layer/model and its class
+        """
+        return {
+            regex_for_camel_case_to_snake_case.sub("_", name).lower()
+            for name in (model.name, model.__class__.__name__)
+        }
+
+    @staticmethod
+    def _is_nested_layer(layer: tf.keras.layers.Layer) -> bool:
+        """
+        Checks if the given layer is a nested layer.
+
+        :param layer: The layer to check
+        :return: True if the layer is a nested layer, False otherwise
+        """
+        keras_defined_subclassed_layer = is_subclassed(layer)
+        # pylint: disable=use-a-generator
+        is_aimet_defined_subclassed = keras_defined_subclassed_layer and any(
+            [isinstance(v, tf.keras.layers.Layer) for v in layer.__dict__.values()]
+        )  # check if the subclass is holding subclassed layer attributes. we only care if this is the case.
+
+        return (
+            is_aimet_defined_subclassed or
+            _KerasModelPreparer._is_functional_model(layer) or
+            _KerasModelPreparer._is_sequential_model(layer)
+        )
+
+    @staticmethod
+    def _is_functional_model(layer: tf.keras.layers.Layer) -> bool:
+        """
+        Checks if the given layer is a functional layer.
+
+        :param layer: The layer to check
+        :return: True if the layer is a functional layer, False otherwise
+        """
+        return isinstance(layer, Functional) and not isinstance(layer, tf.keras.Sequential)
+
+    @staticmethod
+    def _is_sequential_model(layer: tf.keras.layers.Layer) -> bool:
+        """
+        Checks if the given layer is a sequential layer.
+
+        :param layer: The layer to check
+        :return: True if the layer is a sequential layer, False otherwise
+        """
+        return isinstance(layer, tf.keras.Sequential)
+
+    def _set_prepared_models_input_layer(self):
+        """
+        This function sets the input layer of the model to the input layer of the functional model.
+        """
+
+        def set_input_layer_factory(input_layer: Union[tf.keras.layers.InputLayer, List[tf.keras.layers.InputLayer]]):
+            if isinstance(input_layer, list):
+                for inp in input_layer:
+                    self.model_layers_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS].update(
+                        {inp.name: inp}
+                    )
+            else:
+                self.model_layers_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS].update(
+                    {input_layer.name: input_layer}
+                )
+
+        try:
+            set_input_layer_factory(self.input_layer)
+        except AttributeError:
+            # For models that are not connected
+            _logger.info("Model is not connected. Setting input layer to input layer passed in.")
+
+            input_layer_name = [inp.name for inp in self.input_layer] if isinstance(self.input_layer, list) else \
+                [self.input_layer.name]
+            self.model_layers_connections[ModelLayerConnectionsProperties.INBOUND_NODES].update(
+                {self.original_model.layers[0].name: [*input_layer_name]}
+            )
+
+            set_input_layer_factory(self.input_layer)
+
+    def _get_layer_input(self, layer: tf.keras.layers.Layer) -> tf.keras.layers.Layer:
+        """
+        Helper function to get the input layer of a layer.
+
+        :param layer: The layer to get the input layer of
+        :return: The input layer of the layer
+        """
+        try:
+            layer_input = [
+                self.model_layers_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS][layer_aux]
+                for layer_aux in
+                self.model_layers_connections[ModelLayerConnectionsProperties.INBOUND_NODES][layer.name]
+            ]
+
+            if len(layer_input) == 1:
+                layer_input = layer_input[0]
+        except KeyError:
+            layer_input = self._get_most_recently_added_output_tensor()
+            self.model_layers_connections[ModelLayerConnectionsProperties.INBOUND_NODES].update(
+                {layer.name: [layer_input.name]}
+            )
+            _logger.warning(
+                "Could not find input tensor for layer: %s. Using %s as input, the most recent output tensor.",
+                layer.name, layer_input.name
+            )
+
+        return layer_input
+
+    @staticmethod
+    def _is_tf_or_keras_tensor_input(arg: Any) -> bool:
+        """
+        Helper function to check if a given argument is a valid Keras tensor.
+
+        :param arg: The argument in question
+        :return: True if it is valid, False if not
+        """
+        if arg is not None:
+            if isinstance(arg, List):
+                return all(isinstance(x, KerasTensor) for x in arg) or all(isinstance(x, tf.Tensor) for x in arg)
+            return isinstance(arg, (KerasTensor, tf.Tensor))
+        return False
+
+    def _get_updated_call_args(self, layer: tf.keras.layers.Layer) -> List[Union[KerasTensor, List[KerasTensor], Any]]:
+        """
+        Helper function to get the call arguments of a layer.
+
+        :param layer: The layer to get the call arguments of
+        :return: The call arguments of the layer
+        """
+
+        def _is_tf_tensor(arg_in_question: Any) -> bool:
+            return isinstance(arg_in_question, tf.Tensor)
+
+        try:
+            original_call_args = self.model_layers_connections[ModelLayerConnectionsProperties.CALL_ARGS][layer.name]
+        except KeyError:
+            _logger.warning("Could not find call args for layer: '%s'. Using keras tensor only as input.", layer.name)
+            return [self._get_layer_input(layer)]
+
+        updated_call_args = []
+        found_keras_tensor = False
+        for arg in original_call_args:
+            if self._is_tf_or_keras_tensor_input(arg):
+
+                if found_keras_tensor and _is_tf_tensor(arg):
+                    updated_call_args.append(arg)
+
+                elif not found_keras_tensor:
+                    layer_input = self._get_layer_input(layer)
+                    if isinstance(layer_input, List):
+                        updated_call_args.extend(layer_input)
+                    else:
+                        updated_call_args.append(layer_input)
+                    found_keras_tensor = True
+
+            else:
+                updated_call_args.append(arg)
+
+        assert found_keras_tensor, f"No keras tensor found in call args of layer {layer.name}"
+        return updated_call_args
+
+    def _get_call_kwargs(self, layer: tf.keras.layers.Layer) -> Dict[Union[KerasTensor, List[KerasTensor]], Any]:
+        """
+        Helper function to get call keyword arguments for a given layer.
+
+        :param layer: The layer to get the call keyword arguments of
+        :return: The call keyword arguments of the layer
+        """
+        if original_call_kwargs := \
+                self.model_layers_connections[ModelLayerConnectionsProperties.CALL_KWARGS][layer.name]:
+            call_kwargs = {}
+            for key, value in original_call_kwargs.items():
+                # The Keras tensor is already in the call args, so we don't need to add it again. call_kwargs are for
+                # keyword arguments that are not Keras tensors such as 'axis', 'training', etc.
+                if self._is_tf_or_keras_tensor_input(value):
+                    continue
+                call_kwargs[key] = value
+        else:
+            _logger.debug("No kwargs for layer: '%s'", layer.name)
+            return {}
+        return call_kwargs
+
+    def _update_output_tensors_in_model_layers_connections(
+            self,
+            layer: tf.keras.layers.Layer,
+            new_output_tensor: KerasTensor,
+            model: tf.keras.Model
+    ):
+        """
+        Helper function to update the output tensors in the model layers connections dictionary.
+
+        :param layer: The layer to update the output tensors of
+        :param new_output_tensor: The new output tensor to update with
+        :param model: The model currently being checked. Used to add model outputs
+        """
+        if layer.name != new_output_tensor.name:
+            new_name = new_output_tensor.name
+            old_name_of_inputs = self.model_layers_connections[ModelLayerConnectionsProperties.INBOUND_NODES].pop(
+                layer.name
+            )
+            self.model_layers_connections[ModelLayerConnectionsProperties.INBOUND_NODES].update(
+                {new_name: old_name_of_inputs}
+            )
+
+            # Replace values in model_layers_connections[NetworkDictProperties.INBOUND_NODES] with new_name
+            for value in self.model_layers_connections[ModelLayerConnectionsProperties.INBOUND_NODES].values():
+                if layer.name in value:
+                    idx = value.index(layer.name)
+                    value[idx] = new_name
+
+            self.model_layers_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS].update(
+                {new_name: new_output_tensor}
+            )
+        else:
+            # Set new output tensor (in this case, it will be the same as the original model)
+            self.model_layers_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS].update(
+                {layer.name: new_output_tensor}
+            )
+
+        # Save tensor in output list if it is output in the initial model
+        # TODO: Update so that the last conditional is only checked when it's not the last layer.
+        if model.output_names and layer.name in model.output_names:
+            _logger.debug("Layer '%s' added as output layer", layer.name)
+            self.model_outputs.append(new_output_tensor)
+
+    def _get_most_recently_added_output_tensor(self) -> KerasTensor:
+        """
+        Helper function to get the most recently added output tensor from the model layers connections.
+
+        :return: The most recently added output tensor
+        """
+        return next(reversed(self.model_layers_connections[ModelLayerConnectionsProperties.OUTPUT_TENSORS].items()))[-1]
+
+    @staticmethod
+    def _get_temporary_model(layer: tf.keras.layers.Layer, layer_input: tf.keras.layers.Layer) -> tf.keras.Model:
+        """
+        Helper function to create a temporary functional model from a layer.
+
+        :param layer: The layer to create the temporary model from
+        :param layer_input: The input layer of the layer
+        :return: The temporary model
+        """
+
+        def verify_weights(original_layer_weights: Set[tf.Variable], temp_model_weights: Set[tf.Variable]):
+            if missing_weights := original_layer_weights.difference(temp_model_weights):
+                raise ValueError(f"""
+    The number of weights in the temporary model for unwrapping layer '{layer.name}' does not match the
+    number of weights of the original layer. The missing weight(s) are {missing_weights}. This occurs when the Keras 
+    symbolic tensor passed into the layer's call function does not interact with a layer defined inside of the nested
+    layer. Please refer to the documentation for more information.
+
+    This is the call function that is causing this error:
+    {inspect.getsource(layer.call)}
+    """)
+
+        layer_input = layer_input if isinstance(layer_input, List) else [layer_input]
+        temp_inputs = [
+            tf.keras.layers.Input(shape=inp.shape[1:], name=inp.name.split(':')[0] + "_temp_input")
+            for inp in layer_input
+        ]
+        if len(temp_inputs) == 1:
+            temp_inputs = temp_inputs[0]
+
+        try:
+            if _KerasModelPreparer._inherits_from_keras_model(layer):
+                temp_model = _KerasModelPreparer._connect_inherited_model(layer, temp_inputs)
+            else:
+                temp_model = tf.keras.Model(inputs=temp_inputs,
+                                            outputs=layer.call(temp_inputs, training=False),
+                                            name=_TEMP_MODEL_NAME)
+            _logger.debug("Model created for layer '%s'", layer.name)
+        except TypeError as e:
+            if "call() got an unexpected keyword argument 'training'" in e.__str__():
+                _logger.error(
+                    "Model preparer calls subclassed layers call functions with the parameter 'training=False', "
+                    "in the case that the layer behaves differently during evaluation. Please add **kwargs to your "
+                    "call function for layer '%s.'",
+                    layer.name
+                )
+            raise
+
+        temp_model.summary(print_fn=_logger.debug)
+        verify_weights({w.name for w in layer.weights}, {w.name for w in temp_model.weights})
+
+        return temp_model
+
+    @staticmethod
+    def _update_temporary_model_layers_connections_inbound_nodes(
+            temp_model_model_layers_connections: ModelLayerConnectionsProperties.TYPE,
+            temp_model: tf.keras.Model,
+            layer_input: tf.keras.layers.Layer
+    ):
+        """
+        Helper function to update the inbound nodes of the temporary model layers connections dictionary.
+
+        :param temp_model_model_layers_connections: The temporary model layers connections dictionary
+        :param temp_model: The temporary model
+        :param layer_input: The input layer of the layer
+        """
+        temp_model_input_names = [inp.name for inp in temp_model.input] if isinstance(temp_model.input, List) else \
+            [temp_model.input.name]
+        layer_inputs_name = [
+            inp.name for inp in (layer_input if isinstance(layer_input, List) else [layer_input])
+        ]  # pylint: disable=superfluous-parens
+
+        for layers_name, input_tensor_name in temp_model_model_layers_connections[
+                ModelLayerConnectionsProperties.INBOUND_NODES].items():
+            for idx, current_input_name in enumerate(input_tensor_name):
+                if current_input_name in temp_model_input_names:
+                    if len(layer_inputs_name) == 1:  # Special case where the same input is fed in multiple times
+                        temp_model_model_layers_connections[ModelLayerConnectionsProperties.INBOUND_NODES][layers_name][
+                            idx] = layer_inputs_name[0]
+                    else:
+                        temp_model_model_layers_connections[ModelLayerConnectionsProperties.INBOUND_NODES][layers_name][
+                            idx] = layer_inputs_name[idx]
+
+    def _handle_nested_layer(self, layer: tf.keras.layers.Layer) -> KerasTensor:
+        """
+        Helper function to handle nested layers such as subclass, functional, or sequential.
+
+        :param layer: The layer to handle
+        :return: The output tensor of the layer
+        """
+        _logger.debug("Extracting layers for '%s'", layer.name)
+
+        # Converts the nested layer's class name from CamelCase to snake_case
+        self.class_names.update([layer.name] if self._inherits_from_keras_model(
+            layer) else self._get_class_names_in_model(layer))
+
+        # Create a model based on the nested layer.
+        # This is done with the layer input from the model layers connections dictionary.
+        # 1) The input layer is used to create the temporary functional model
+        # 2) The input layer is used in the nested layer's call function as a symbolic tensor to get internal layers
+        layer_input = self._get_layer_input(layer)
+        temp_model = tf.keras.models.clone_model(self._get_temporary_model(layer, layer_input))
+
+        # Get the model layers connections dictionary for the temporary model and merge it with the model layers
+        # connections dictionary for the functional model. This is done so we can keep track of the sublayers and
+        # their inputs and outputs.
+        temp_model_model_layers_connections = ModelLayerConnections.get_model_layers_connection_properties(temp_model)
+        self._update_temporary_model_layers_connections_inbound_nodes(
+            temp_model_model_layers_connections, temp_model, layer_input
+        )
+
+        self.model_layers_connections = ModelLayerConnections.merge_model_layers_connections(
+            self.model_layers_connections, temp_model_model_layers_connections
+        )
+
+        return self._prepare_model_helper(temp_model)
+
+    def _handle_normal_keras_layer(self, layer: tf.keras.layers.Layer) -> KerasTensor:
+        """
+        Helper function to handle normal keras layers. This function will create a new output tensor for the layer
+        and return it.
+
+        :param layer: The layer to create the output tensor for
+        :return: The output tensor of the layer
+        """
+        call_args = self._get_updated_call_args(layer)
+
+        if isinstance(layer, TFOpLambda):
+            if call_kwargs := self._get_call_kwargs(layer):
+                # Special case for 'tf.concat', which takes a list of inputs with kwargs attached;
+                # this may need to be updated in the future
+
+                if "concat" in layer.name:
+                    new_output_tensor = layer.call([*call_args], **call_kwargs)
+                else:
+                    new_output_tensor = layer.call(*call_args, **call_kwargs)
+            else:
+                new_output_tensor = layer.call(*call_args)
+        # Special case for "Merge" layers that take a list of inputs such as "tf.keras.layers.Concatenate" and
+        # "tf.keras.layers.Add"
+        elif isinstance(layer, MergeLayersParentClass):
+            new_output_tensor = layer(call_args)
+        else:
+            new_output_tensor = layer(*call_args)
+
+        return new_output_tensor
+
+    def _prepare_model_helper(self, model: tf.keras.Model) -> KerasTensor:
+        """
+        Helper function to recursively prepare a model. This function will be recursively called if a nested layer is
+        found. This function will extract the layers from the nested layer and add them to the functional model.
+        Otherwise, it will add the layer to the functional model.
+
+        :param model: The model to prepare
+        :return: The last layer of the model
+        """
+        for current_layer in model.layers:
+            _logger.debug("Processing layer: '%s'", current_layer.name)
+            # Skip input layers
+            if isinstance(current_layer, tf.keras.layers.InputLayer):
+                continue
+
+            # If the current layer is either a subclassed layer, functional model or sequential model, we need to
+            # extract the layers from the nested layer and add them to the functional model.
+            if self._is_nested_layer(current_layer):
+                new_output_tensor = self._handle_nested_layer(current_layer)
+                # If we are at the end of the original model, we want the model_outputs to be the end model outputs
+                if current_layer == self.original_models_last_layer:
+                    _logger.debug(
+                        "Last layer was a nested layer. "
+                        "Using temp model's output from _handle_nested_layer as model_output"
+                    )
+                    continue
+                self.model_outputs.clear()
+            else:
+                new_output_tensor = self._handle_normal_keras_layer(current_layer)
+
+            self._update_output_tensors_in_model_layers_connections(current_layer, new_output_tensor, model)
+        return new_output_tensor
+
+    def prepare_model(self):
+        """
+        Function to get the prepared model. This function sets up the input layer and calls the helper function to
+        recursively prepare the model.
+        """
+        _ = self._prepare_model_helper(self.original_model)
+
+        # If the model outputs are empty, then we need to get the most recently added output tensor. This is the case
+        # when a model might be sparse and not fully connected or when a Functional model is inside an inherited model.
+        if not self.model_outputs:
+            _logger.warning(
+                "No model outputs found. This usually occurs when a models is made by inheriting from "
+                "'tf.keras.Model' and placing a Functional model inside. Using most recently added output tensor as "
+                "prepared models output."
+            )
+            self.model_outputs = self._get_most_recently_added_output_tensor()
+
+        setattr(self, "prepared_model", tf.keras.Model(
+            inputs=self.input_layer,
+            outputs=self.model_outputs,
+            name=f"{self.original_model.name}_prepared"
+        ))
+
+        # Cloning model to remove any references to the original model
+        K.clear_session()  # To avoid name conflicts
+        self.prepared_model = tf.keras.models.clone_model(self.prepared_model)
+        setattr(
+            self, "custom_objects",  # For acceptable subclass layers
+            self._get_models_custom_objects(self.prepared_model)
+        )
+        _logger.info("Prepared Model Summary: \n")
+        self.prepared_model.summary(print_fn=_logger.info)
+
+        # Copying over weights from original model to functional model
+        _logger.debug("Final class_names: %s", self.class_names)
+        self._set_prepared_models_weights()
+
+        # Extra prepare step to replace Separable Conv's with Depthwise Pointwise pattern.
+        self.prepared_model, _ = replace_separable_conv_with_depthwise_pointwise(
+            self.prepared_model,
+            custom_objects=self.custom_objects
+        )
+        self.prepared_model, _ = replace_relu6_with_relu(
+            self.prepared_model,
+            custom_objects=self.custom_objects
+        )
+
+        self.verify_prepared_model()
+
+    @staticmethod
+    def _get_models_custom_objects(model: tf.keras.Model) -> Optional[Dict[str, tf.keras.layers.Layer]]:
+        """
+        Helper function to return a model's `custom_objects` if any are present in the model.
+
+        :param model: The model to check
+        :return: A dictionary {layer name : layer obj} of the custom objects, or None if there are none
+        """
+
+        return {
+            layer.__class__.__name__: layer.__class__
+            for layer in model.layers
+            if not getattr(layer, "__module__", None).split(".")[0] == "keras" and  # TF 2.10.1 and up
+            not getattr(layer, "__module__", None).split(".")[0] == "tensorflow"    # TF 2.4.3 support
+        } or None
+
+    @staticmethod
+    def _model_has_nested_layers(model: tf.keras.Model) -> bool:
+        """
+        Helper function to check whether a model needs to be prepared, based on whether it contains nested layers
+        (subclassed, functional, or sequential).
+
+        :param model: The model to check
+        :return: If the model needs to be prepared or not
+        """
+        for layer in model.layers:
+            if _KerasModelPreparer._is_nested_layer(layer):
+                return True
+        return False
+
+    @staticmethod
+    def _inherits_from_keras_model(model: tf.keras.Model) -> bool:
+        """
+        Helper function to check if a model itself inherits from tf.keras.Model. If so, the model needs to be connected.
+
+        :param model: The model to check.
+        :return: If the model is inheriting from tf.keras.Model
+        """
+
+        return (
+            type(model).__bases__[0] == tf.keras.Model and
+            not _KerasModelPreparer._is_functional_model(model) and
+            not _KerasModelPreparer._is_sequential_model(model)
+        )
+
+    @staticmethod
+    def _connect_inherited_model(model: tf.keras.Model, input_layer: Union[
+            tf.keras.layers.InputLayer, List[tf.keras.layers.InputLayer]],
+                                 is_original_model: bool = False) -> tf.keras.Model:
+        """
+        Function to loop through models that inherit from tf.keras.Model and therefore could potentially have no
+        outbound nodes.
+
+        :param model: Model to connect.
+        :param input_layer: The input layer to connect the model.
+        :param is_original_model: Flag to clone the model if the original model is the one passed in.
+        This is to fix naming issues. Otherwise, the model is not cloned.
+        :return: A model with the outbound nodes generated.
+        """
+
+        # TODO: Fix case where the layers are all the same. Maybe user has to?
+        model = tf.keras.Model(inputs=input_layer, outputs=model.call(input_layer), name=_TEMP_MODEL_NAME)
+        if is_original_model:
+            try:
+                return tf.keras.models.clone_model(model)
+            except TypeError as e:
+                _logger.error("The layer %s inherits from tf.keras.Model and has layer that does not have a "
+                              "`get_config` defined. Due to this, Keras cannot clone this layer. Please override the "
+                              "`get_config` function and provide the missing keys mentioned in the Keras error logs.",
+                              model.name)
+                raise e
+        return model
+
+    def verify_prepared_model(self):
+        """
+        Function to verify that the prepared model is correct. This function will check that the prepared model has
+        the same weights as the original model and that the prepared model has the same outputs as the original model.
+        """
+
+        # Check that the prepared model has the same number of parameters as the original model
+        assert self.prepared_model.count_params() == self.original_model.count_params(), \
+            "Prepared model and original model do not have the same number of parameters"
+        _logger.debug("Prepared model and original model have the same number of parameters")
+
+        # Check the weights of the prepared model and the original model
+        for original_weight, prepared_weight in zip(
+                self.original_weights_in_prepared_model_order, self.prepared_model.get_weights()):
+            np.testing.assert_array_equal(
+                original_weight, prepared_weight,
+                err_msg="Weights of prepared model and original model do not match"
+            )
+        _logger.debug("Weights of prepared model and original model match")
+
+        # Create a random input to test the prepared model
+        if isinstance(self.prepared_model.input_shape, List):
+            random_input = []
+            for current_input_shape in self.original_model.input_shape:
+                input_shape = [shape if shape is not None else 1 for shape in current_input_shape]
+                random_input.append(np.random.rand(*input_shape).astype(np.float32))
+        else:
+            input_shape = [shape if shape is not None else 1 for shape in self.prepared_model.input_shape]
+            random_input = np.random.rand(*input_shape).astype(np.float32)
+
+        verbose = logging.DEBUG == _logger.level
+        original_model_output = self.original_model.predict(random_input, verbose=verbose)
+        prepared_model_output = self.prepared_model.predict(random_input, verbose=verbose)
+
+        # Check the outputs of the prepared model and the original model
+        err_msg = """
+        Outputs of prepared model and original model do not match. Since the weights match and params 
+        match, this is likely due to a mismatch in the model's architecture. Specifically, if there is a reuse of a 
+        layer, then the prepared model will not have the same output as the original model. For example, 
+        if a ReLU layer is defined once and then used twice, then the prepared model will only have one ReLU layer 
+        while the original model will have two ReLU layers. Please check the model's architecture to see if there are 
+        any layers that are reused.
+        """
+
+        if isinstance(original_model_output, Dict):
+            original_model_output = list(original_model_output.values())
+            if len(original_model_output) == 1:
+                original_model_output = original_model_output[0]
+
+        if isinstance(original_model_output, List):
+            for original_output, prepared_output in zip(original_model_output, prepared_model_output):
+                np.testing.assert_array_equal(original_output, prepared_output, err_msg=err_msg)
+        else:
+            np.testing.assert_array_equal(original_model_output, prepared_model_output, err_msg=err_msg)
+        _logger.debug("Outputs of prepared model and original model match")
+
+        _logger.info("Prepared model verified")
+
+
+
[docs]def prepare_model(original_model: tf.keras.Model, + input_layer: Union[tf.keras.layers.InputLayer, List[tf.keras.layers.InputLayer]] = None) \ + -> tf.keras.Model: + """ + This function prepares a Keras model before continuing on with AIMET. Specifically, it will convert the model into + a purely Functional API model and copy over the original model's weights. + + :param original_model: The original model to be prepared + :param input_layer: The input layer to be used for the new model. By default, the input layer is set to None. If the + beginning portion of the model is subclassed, then the input layer must be passed in. + :return: The prepared model if needed, or the original model + """ + + # Initial check to see if preparing the model is necessary + # pylint: disable=protected-access + if not _KerasModelPreparer._model_has_nested_layers(original_model) and \ + not _KerasModelPreparer._inherits_from_keras_model(original_model): + _logger.info("Model does not contain any nested layers. " + "Returning original model after going through " + "'replace_separable_conv_with_depthwise_pointwise' and 'replace_relu6_with_relu'.") + custom_objects = _KerasModelPreparer._get_models_custom_objects(original_model) + prepared_model, _ = replace_relu6_with_relu(original_model, custom_objects=custom_objects) + prepared_model, _ = replace_separable_conv_with_depthwise_pointwise(prepared_model, + custom_objects=custom_objects) + return prepared_model + + keras_model_preparer = _KerasModelPreparer(original_model, input_layer=input_layer) + + keras_model_preparer.prepare_model() + + return keras_model_preparer.prepared_model
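A minimal usage sketch follows (illustrative only, not taken from the AIMET sources). The subclassed ConvBlock layer is a hypothetical example of the nested-layer situation that prepare_model is designed to flatten; shapes and layer choices are placeholders.

import tensorflow as tf
from aimet_tensorflow.keras.model_preparer import prepare_model

class ConvBlock(tf.keras.layers.Layer):
    """Hypothetical subclassed (nested) layer used only for illustration."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.conv = tf.keras.layers.Conv2D(8, 3, padding="same")
        self.relu = tf.keras.layers.ReLU()

    def call(self, inputs, **kwargs):  # **kwargs so the preparer's 'training=False' call is accepted
        return self.relu(self.conv(inputs))

inputs = tf.keras.Input(shape=(32, 32, 3))
x = ConvBlock()(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10)(x)
model = tf.keras.Model(inputs, outputs)

# Returns a purely Functional API model with the original weights copied over and verified
prepared_model = prepare_model(model)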
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/keras/quant_analyzer.html b/releases/1.32.2/_modules/aimet_tensorflow/keras/quant_analyzer.html new file mode 100644 index 00000000..77e67ad7 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/keras/quant_analyzer.html @@ -0,0 +1,1734 @@ + + + + + + aimet_tensorflow.keras.quant_analyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +

Source code for aimet_tensorflow.keras.quant_analyzer

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+"""Quant Analyzer"""
+import os
+from collections import OrderedDict, defaultdict
+from typing import Dict, List, Tuple
+
+import tensorflow as tf
+
+from aimet_common.defs import QuantScheme
+from aimet_common.quant_analyzer import export_per_layer_sensitivity_analysis_plot, save_json, \
+    create_and_export_min_max_ranges_plot, export_stats_histogram_plot, export_per_layer_mse_plot
+from aimet_common.utils import CallbackFunc, AimetLogger, Spinner
+from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms
+from aimet_tensorflow.keras.graphsearchtuils import GraphSearchUtils
+from aimet_tensorflow.keras.quant_sim.qc_quantize_wrapper import QcQuantizeWrapper
+from aimet_tensorflow.keras.quant_sim.tensor_quantizer import TensorQuantizer
+from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+from aimet_tensorflow.keras.utils.quantizer_utils import get_enabled_activation_quantizers, enable_disable_quantizers, \
+    get_enabled_param_quantizers
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+
+
+def _sort_quant_wrappers_based_on_occurrence(sim: QuantizationSimModel) -> Dict[str, QcQuantizeWrapper]:
+    """
+    Sort quant wrappers based on occurrence for given quantsim model.
+
+    :param sim: Quantsim model.
+    :return: Ordered dictionary which maps wrapped layer name to quant wrapper.
+    """
+    sorted_quant_wrappers_dict = OrderedDict()
+    for wrapper in sim.model.layers:
+        if not isinstance(wrapper, QcQuantizeWrapper):
+            continue
+
+        sorted_quant_wrappers_dict[wrapper.original_layer.name] = wrapper
+
+    return sorted_quant_wrappers_dict
+
+
+def _get_enabled_quantizers(sorted_quant_wrappers: Dict[str, QcQuantizeWrapper]) -> \
+        Dict[QcQuantizeWrapper, List[TensorQuantizer]]:
+    """
+    For given sorted quant wrappers dict, get enabled quantizers.
+
+    :param sorted_quant_wrappers: Dictionary containing quant wrappers sorted based on occurrence.
+    :return: Dictionary which maps a quant wrapper to a list of enabled quantizers in it.
+    """
+    enabled_quant_wrappers = defaultdict(list)
+
+    for quant_wrapper in sorted_quant_wrappers.values():
+        for quantizer in quant_wrapper.param_quantizers:
+            if quantizer.is_enabled():
+                enabled_quant_wrappers[quant_wrapper].append(quantizer)
+
+        for quantizer in quant_wrapper.output_quantizers:
+            if quantizer.is_enabled():
+                enabled_quant_wrappers[quant_wrapper].append(quantizer)
+
+        for quantizer in quant_wrapper.input_quantizers:
+            if quantizer.is_enabled():
+                enabled_quant_wrappers[quant_wrapper].append(quantizer)
+
+    return enabled_quant_wrappers
+
+
+def _get_output_of_intermediate_layer(model: tf.keras.Model,
+                                      input_tensor: tf.Tensor,
+                                      layer_index: int) -> tf.Tensor:
+    """
+    Return output tensor from model extracted up to target intermediate layer
+
+    :param model: tf.keras.Model
+    :param input_tensor: Input tensor to feed
+    :param layer_index: Index of layer
+    :return: Output tensor from intermediate layer
+    """
+    layer_output = model.get_layer(index=layer_index).output
+    extracted_model = tf.keras.Model(inputs=model.inputs, outputs=layer_output)
+
+    return extracted_model(input_tensor)
+
+
+
[docs]class QuantAnalyzer: + """ + QuantAnalyzer tool provides + + 1) model sensitivity to weight and activation quantization + 2) per layer sensitivity analysis + 3) per layer encoding (min - max range) + 4) per PDF analysis and + 5) per layer MSE analysis + """ + + def __init__(self, + model: tf.keras.Model, + forward_pass_callback: CallbackFunc, + eval_callback: CallbackFunc): + """ + :param model: FP32 model to analyze for quantization. + :param forward_pass_callback: A callback function for model calibration that simply runs + forward passes on the model to compute encoding (delta/offset). This + callback function should use representative data and should be subset of + entire train/validation dataset (~1000 images/samples). + :param eval_callback: A callback function for model evaluation that determines model + performance. This callback function is expected to return scalar value + representing the model performance evaluated against entire test/evaluation dataset. + """ + if not isinstance(forward_pass_callback, CallbackFunc): + raise ValueError('forward_pass_callback and its argument(s) are not encapsulated by CallbackFunc class.') + if not isinstance(eval_callback, CallbackFunc): + raise ValueError('eval_callback and its argument(s) are not encapsulated by CallbackFunc class.') + + self._model = model + self._forward_pass_callback = forward_pass_callback + self._eval_callback = eval_callback + self._unlabeled_dataset = None + self._num_batches = None + + # pylint: disable=unused-argument, no-self-use +
[docs] def analyze(self, + quant_scheme: QuantScheme = QuantScheme.post_training_tf_enhanced, + rounding_mode: str = "nearest", + default_param_bw: int = 8, + default_output_bw: int = 8, + config_file: str = None, + results_dir: str = "./tmp/"): + """ + Analyze model for quantization and point out sensitive parts/hotspots of the model by performing + 1) model sensitivity to quantization, + 2) perform per layer sensitivity analysis by enabling and disabling quant wrappers, + 3) export per layer encodings min - max ranges, + 4) export per layer statistics histogram (PDF) when quant scheme is TF-Enhanced, + 5) per layer MSE analysis + + :param quant_scheme: Quantization scheme. Supported values are + QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced. + :param rounding_mode: The round scheme to used. One of: 'nearest' or 'stochastic', defaults to 'nearest' + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters. + :param default_output_bw: Default bitwidth (4-31) to use for quantizing layer inputs and outputs. + :param config_file: Path to configuration file for model quantizers. + :param results_dir: Directory to save the results. + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + sim = self._create_quantsim_and_encodings(quant_scheme, + rounding_mode, + default_param_bw, + default_output_bw, + config_file) + + # Check model sensitivity to weight and activation quantization individually. + self.check_model_sensitivity_to_quantization(sim, default_param_bw, default_output_bw) + + # Perform per layer analysis by enabling each quant wrapper (OPTION-1). + self.perform_per_layer_analysis_by_enabling_quant_wrappers(sim, results_dir) + + # Perform per layer analysis by disabling each quant wrapper (OPTION-2). + self.perform_per_layer_analysis_by_disabling_quant_wrappers(sim, results_dir) + + # Export encoding min-max range. + self.export_per_layer_encoding_min_max_range(sim, results_dir) + + # Export PDF of statistics + if quant_scheme == QuantScheme.post_training_tf_enhanced: + self.export_per_layer_stats_histogram(sim, results_dir) + + # Export per layer MSE loss between fp32 and quantized output activations. + if self._unlabeled_dataset and self._num_batches: + self.export_per_layer_mse_loss(sim, results_dir)
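+
+    # Illustrative usage sketch (not part of this module). 'fp32_model', 'calibration_fn',
+    # 'calibration_data', 'evaluation_fn', and 'evaluation_data' are placeholders supplied by the user:
+    #
+    #     from aimet_common.utils import CallbackFunc
+    #     from aimet_tensorflow.keras.quant_analyzer import QuantAnalyzer
+    #
+    #     forward_pass_cb = CallbackFunc(calibration_fn, calibration_data)  # ~1000 representative samples
+    #     eval_cb = CallbackFunc(evaluation_fn, evaluation_data)            # returns a scalar eval score
+    #     QuantAnalyzer(fp32_model, forward_pass_cb, eval_cb).analyze(results_dir="./quant_analyzer_results")
+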
+ + def _create_quantsim_and_encodings(self, + quant_scheme: QuantScheme, + rounding_mode: str, + default_param_bw: int, + default_output_bw: int, + config_file: str) -> QuantizationSimModel: + """ + Create Quantsim and compute encodings. + + :param quant_scheme: Quantization scheme. + :param rounding_mode: The round scheme to used. One of: 'nearest' or 'stochastic', defaults to 'nearest' + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters. + :param default_output_bw: Default bitwidth (4-31) to use for quantizing layer inputs and outputs. + :param config_file: Path to configuration file for model quantizers. + :return: Quantsim model. + """ + _, self._model = fold_all_batch_norms(self._model) # pylint: disable=attribute-defined-outside-init + sim = QuantizationSimModel(self._model, + quant_scheme=quant_scheme, + rounding_mode=rounding_mode, + default_output_bw=default_output_bw, + default_param_bw=default_param_bw, + config_file=config_file) + + sim.compute_encodings(forward_pass_callback=self._forward_pass_callback.func, + forward_pass_callback_args=self._forward_pass_callback.args) + + return sim + + def check_model_sensitivity_to_quantization(self, + sim: QuantizationSimModel, + default_param_bw: int, + default_output_bw: int): + """ + Perform the sensitivity analysis to weight and activation quantization + individually. + + :param sim: Quantsim model. + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters. + :param default_output_bw: Default bitwidth (4-31) to use for quantizing layer inputs and outputs. + :return: FP32 eval score, weight-quantized eval score, act-quantized eval score. + """ + fp32_eval_score = self._eval_model(self._model) + _logger.info("FP32 eval score (W32A32): %f", fp32_eval_score) + + weight_quantized_eval_score = self._eval_weight_quantized_model(sim) + _logger.info("Weight-quantized eval score (W%dA32): %f", default_param_bw, + weight_quantized_eval_score) + + act_quantized_eval_score = self._eval_activation_quantized_model(sim) + _logger.info("Activation-quantized eval score (W32A%d): %f", default_output_bw, + act_quantized_eval_score) + + def _eval_model(self, model: tf.keras.Model) -> float: + """ + Evaluate the model performance. + :param model: tf.keras.Model to be evaluated + :return: Scalar value representing model performance + """ + return self._eval_callback.func(model, self._eval_callback.args) + + def _eval_weight_quantized_model(self, sim: QuantizationSimModel) -> float: + """ + Evaluate weight quantized model performance. + For weight quantized model performance, disable enabled activation quantizers, measure + eval score and enable again. + + :param sim: Quantsim model. + :return: Quantized model performance. + """ + enabled_activation_quantizers = get_enabled_activation_quantizers(sim) + enable_disable_quantizers(enabled_activation_quantizers, enabled=False) + eval_score = self._eval_model(sim.model) + enable_disable_quantizers(enabled_activation_quantizers, enabled=True) + return eval_score + + def _eval_activation_quantized_model(self, sim: QuantizationSimModel) -> float: + """ + Evaluate activation quantized model performance. + For activation quantized model performance, disable enabled param quantizers, measure + eval score and enable again. + + :param sim: Quantsim model. + :return: Quantized model performance. 
+ """ + enabled_param_quantizers = get_enabled_param_quantizers(sim) + enable_disable_quantizers(enabled_param_quantizers, enabled=False) + eval_score = self._eval_model(sim.model) + enable_disable_quantizers(enabled_param_quantizers, enabled=True) + return eval_score + + def perform_per_layer_analysis_by_enabling_quant_wrappers(self, + sim: QuantizationSimModel, + results_dir: str) -> Dict[str, float]: + """ + NOTE: Option 1 + + 1. All quant wrappers' parameters and activations quantizers are disabled. + 2. For every quant wrappers, based on occurrence: + i. Each quant wrapper's parameters and activations quantizers are enabled as per JSON config file + and set to bit-width specified. + ii. Measure and record eval score on subset of dataset. + iii. Disable enabled quantizers in step i. + 3. Returns dictionary containing quant wrapper name and corresponding eval score. + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return: layer-wise eval score dictionary. dict[layer_name] = eval_score + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + _logger.info("\nOPTION-1:\nAll the quant wrappers are disabled.\n" + "Starting per-layer analysis by enabling quant wrappers as per config file.") + + layer_wise_eval_score_dict = self._perform_per_layer_analysis(sim, + disable_all_quantizers=True, + enabled_before=True, + enabled_after=False) + export_per_layer_sensitivity_analysis_plot(layer_wise_eval_score_dict, + results_dir, + title="per_layer_quant_enabled") + save_json(layer_wise_eval_score_dict, + results_dir, + title="per_layer_quant_enabled.json") + _logger.info("Exported per-layer quant analysis (enabled) plot.") + return layer_wise_eval_score_dict + + def perform_per_layer_analysis_by_disabling_quant_wrappers(self, + sim: QuantizationSimModel, + results_dir: str) -> Dict[str, float]: + """ + NOTE: Option 2 + + 1. All quant wrappers' parameters and activations quantizers are enabled as per JSON config file + and set to bit-width specified. + 2. For every quant wrappers, based on occurrence: + i. Each quant wrapper's parameters and activations quantizers are disabled. + ii. Measure and record eval score on subset of dataset. + iii. Enable disabled quantizers in step i. + 3. Returns dictionary containing quant wrapper name and corresponding eval score. + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return: layer wise eval score dictionary. 
dict[layer_name] = eval_score + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + _logger.info("\nOPTION-2:\nAll the quant wrappers are enabled as per config file.\n" + "Starting per-layer analysis by disabling quant wrappers.") + layer_wise_eval_score_dict = self._perform_per_layer_analysis(sim, + disable_all_quantizers=False, + enabled_before=False, + enabled_after=True) + export_per_layer_sensitivity_analysis_plot(layer_wise_eval_score_dict, + results_dir, + title="per_layer_quant_disabled") + save_json(layer_wise_eval_score_dict, + results_dir, + title="per_layer_quant_disabled.json") + _logger.info("Exported per-layer quant analysis (disabled) plot.") + return layer_wise_eval_score_dict + + def _perform_per_layer_analysis(self, + sim: QuantizationSimModel, + disable_all_quantizers: bool, + enabled_before: bool, + enabled_after: bool) -> Dict[str, float]: + """ + Helper function for perform_per_layer_analysis_by_enabling_quant_wrappers() and + perform_per_layer_analysis_by_disabling_quant_wrappers() + + :param sim: Quantsim model. + :param disable_all_quantizers: Flag to disable all the quantizers before per-layer analysis. + :param enabled_before: Flag to set enabled for quantizers before computing encodings. + :param enabled_after: Flag to set enabled for quantizers after computing encodings. + :return: layer wise eval score dictionary. dict[layer_name] = eval_score. + """ + # Sorted quant wrappers based on occurrence. + # maps wrapped module name to a quant wrapper. + sorted_quant_wrappers = _sort_quant_wrappers_based_on_occurrence(sim) + + # quant wrappers and it's enabled quantizers. + # maps quant wrapper to a list of enabled quantizers in it. + enabled_quant_wrappers = _get_enabled_quantizers(sorted_quant_wrappers) + + if disable_all_quantizers: + for enabled_quantizers in enabled_quant_wrappers.values(): + enable_disable_quantizers(enabled_quantizers, enabled=False) + + eval_score_dict = {} + for name, quant_wrapper in sorted_quant_wrappers.items(): + if quant_wrapper in enabled_quant_wrappers: + enabled_quantizers = enabled_quant_wrappers[quant_wrapper] + enable_disable_quantizers(enabled_quantizers, enabled=enabled_before) + + # Record eval score. + eval_score_dict[name] = self._eval_model(sim.model) + _logger.debug("For layer: %s, the eval score is: %f", name, eval_score_dict[name]) + + enable_disable_quantizers(enabled_quantizers, enabled=enabled_after) + + if disable_all_quantizers: + for enabled_quantizers in enabled_quant_wrappers.values(): + enable_disable_quantizers(enabled_quantizers, enabled=True) + + return eval_score_dict + + # pylint: disable=no-self-use + def export_per_layer_encoding_min_max_range(self, + sim: QuantizationSimModel, + results_dir: str) -> Tuple[Dict, Dict]: + """ + Export encoding min and max range for all weights and activations. results_dir should have + html files in following format. + + -results_dir + -activations.html + -weights.html + + If per channel quantization(PCQ) is enabled then, + + -results_dir + -activations.html + -{wrapped_module_name}_{param_name}.html + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return: layer wise min-max range for weights and activations. 
+ """ + min_max_ranges_dir = os.path.join(results_dir, "min_max_ranges") + + min_max_range_for_activations_dict = {} + min_max_range_for_weights_dict = {} + for quant_wrapper in sim.quant_wrappers(): + wrapped_layer_name = quant_wrapper.original_layer.name + + for index, quantizer in enumerate(quant_wrapper.input_quantizers): + if quantizer.is_enabled(): + name = f"{wrapped_layer_name}_input_{index}" + min_max_range_for_activations_dict[name] = (quantizer.encoding.min, quantizer.encoding.max) + + for index, quantizer in enumerate(quant_wrapper.output_quantizers): + if quantizer.is_enabled(): + name = f"{wrapped_layer_name}_output_{index}" + min_max_range_for_activations_dict[name] = (quantizer.encoding.min, quantizer.encoding.max) + + for quantizer in quant_wrapper.param_quantizers: + if quantizer.is_enabled(): + # Keras parameter name usually contains slash (/) and it can cause incorrect file path when saving + # Replace slash (/) with dash (-) to avoid it + quantizer_name = quantizer.name.replace("/", "-") + name = f"{wrapped_layer_name}_{quantizer_name}" + + if isinstance(quantizer.encoding, List): # per-channel + per_channel_encodings = {} + for index, encoding in enumerate(quantizer.encoding): + per_channel_encodings[f"{name}_{index}"] = (encoding.min, encoding.max) + min_max_range_for_weights_dict[name] = per_channel_encodings + else: # per-tensor + min_max_range_for_weights_dict[name] = (quantizer.encoding.min, quantizer.encoding.max) + + create_and_export_min_max_ranges_plot(min_max_range_for_weights_dict, + min_max_ranges_dir, + title="weights") + create_and_export_min_max_ranges_plot(min_max_range_for_activations_dict, + min_max_ranges_dir, + title="activations") + save_json(min_max_range_for_weights_dict, min_max_ranges_dir, title="weights.json") + save_json(min_max_range_for_activations_dict, min_max_ranges_dir, title="activations.json") + _logger.info("Exported per layer encodings min-max ranges plot(s).") + return min_max_range_for_weights_dict, min_max_range_for_activations_dict + + def export_per_layer_stats_histogram(self, sim: QuantizationSimModel, results_dir: str) -> None: + """ + NOTE: Not to invoke when quantization scheme is not TF-Enhanced. + + Export histogram that represents a PDF of collected statistics by a quantizer for every + quant wrapper. After invoking this API, results_dir should have html files in following + format for every quantizers of quant wrappers. + + -results_dir + -activations_pdf + name_{input/output}_{index}.html + -weights_pdf + -name + param_name_{channel_index}.html + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. 
+ """ + weights_pdf_dir = os.path.join(results_dir, "weights_pdf") + activations_pdf_dir = os.path.join(results_dir, "activations_pdf") + + for quant_wrapper in sim.quant_wrappers(): + wrapped_layer_name = quant_wrapper.original_layer.name + + for index, quantizer in enumerate(quant_wrapper.input_quantizers): + if quantizer.encoding: + self._create_and_export_stats_histogram_plot(quantizer, activations_pdf_dir, + title=f"{wrapped_layer_name}_input_q{index}") + + for index, quantizer in enumerate(quant_wrapper.output_quantizers): + if quantizer.encoding: + self._create_and_export_stats_histogram_plot(quantizer, activations_pdf_dir, + title=f"{wrapped_layer_name}_output_q{index}") + + for quantizer in quant_wrapper.param_quantizers: + if quantizer.encoding: + # Keras parameter name usually contains slash (/) and it can cause incorrect file path when saving + # Replace slash (/) with dash (-) to avoid it + param_name = quantizer.name.replace("/", "-") + self._create_and_export_stats_histogram_plot(quantizer, + os.path.join(weights_pdf_dir, wrapped_layer_name), + title=f"{wrapped_layer_name}_{param_name}") + _logger.info("Exported per layer stats histogram plot(s).") + + @staticmethod + def _create_and_export_stats_histogram_plot(quantizer: TensorQuantizer, + results_dir: str, + title: str) -> None: + """ + For given quantizer, create and export histogram (PDF) of statistics in html format. + + :param quantizer: Quantizer. + :param results_dir: Directory to save the results. + :param title: Title of the plot. + """ + os.makedirs(results_dir, exist_ok=True) + + histograms = quantizer.get_stats_histogram() + encodings = quantizer.encoding + if not isinstance(encodings, List): + encodings = [encodings] + + for index, (histogram, encoding) in enumerate(zip(histograms, encodings)): + export_stats_histogram_plot(histogram, encoding, results_dir, title=f"{title}_{index}") + + def export_per_layer_mse_loss(self, + sim: QuantizationSimModel, + results_dir: str) -> Dict[str, float]: + """ + NOTE: Need to pass same model input data through both fp32 and quantsim model to + tap output activations of each layer. + + Export MSE loss between fp32 and quantized output activations for each layer. + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return layer wise MSE loss. dict[layer_name] = MSE loss. + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + mse_loss_dict = {} + with Spinner("Calculating per-layer MSE loss"): + for index, layer in enumerate(self._model.layers): + if isinstance(layer, tf.keras.layers.InputLayer) or \ + GraphSearchUtils.is_folded_batch_normalization(layer): + continue + + loss = self._compute_mse_loss(sim, index) + mse_loss_dict[layer.name] = loss + + export_per_layer_mse_plot(mse_loss_dict, + results_dir, + title="per_layer_mse_loss") + save_json(mse_loss_dict, results_dir, title="per_layer_mse_loss.json") + _logger.info("Exported per layer MSE loss plot.") + return mse_loss_dict + + def _compute_mse_loss(self, + sim: QuantizationSimModel, + index: int) -> float: + """ + Compute MSE loss between fp32 and quantized output activations for each batch, add for + all the batches and return averaged mse loss. + + :param sim: Quantsim model. + :param index: Index of layer + :return: MSE loss between fp32 and quantized output activations. 
+ """ + loss = 0.0 + total = 0 + mse = tf.keras.losses.MeanSquaredError() + for tensor in self._unlabeled_dataset.take(self._num_batches): + quantized_output = _get_output_of_intermediate_layer(sim.model, tensor, index) + fp32_output = _get_output_of_intermediate_layer(self._model, tensor, index) + + loss += mse(quantized_output, fp32_output).numpy() + total += tensor.shape[0] + + return loss / total + + def enable_per_layer_mse_loss(self, unlabeled_dataset: tf.data.Dataset, num_batches: int) -> None: + """ + Enable per layer MSE loss analysis. + + :param unlabeled_dataset: tf.data.Dataset provided as input to the model + and used to calculate mse loss + :param num_batches: Maximum number of batches to be used for MSE loss calculation + """ + self._unlabeled_dataset = unlabeled_dataset + self._num_batches = num_batches
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/keras/quantsim.html b/releases/1.32.2/_modules/aimet_tensorflow/keras/quantsim.html new file mode 100644 index 00000000..61388c65 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/keras/quantsim.html @@ -0,0 +1,1957 @@ + + + + + + aimet_tensorflow.keras.quantsim — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
Source code for aimet_tensorflow.keras.quantsim

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+""" Quantsim for Keras """
+from __future__ import annotations
+
+from dataclasses import dataclass
+import json
+import os
+from typing import Union, Dict, Tuple, Optional, List
+
+import tensorflow as tf
+from aimet_common import libpymo
+
+from aimet_common.defs import QuantScheme, QuantizationDataType
+from aimet_common.utils import AimetLogger, save_json_yaml
+from aimet_common.quantsim import encoding_version, extract_global_quantizer_args
+from aimet_tensorflow.keras.connectedgraph import ConnectedGraph
+from aimet_tensorflow.keras.graphsearchtuils import GraphSearchUtils
+from aimet_tensorflow.keras.quant_sim.qc_quantize_wrapper import QcQuantizeWrapper, QuantizerSettings
+from aimet_tensorflow.keras.quant_sim.qc_mha_wrapper import QcQuantizableMultiHeadAttention
+from aimet_tensorflow.keras.rnn.qc_quant_LSTM import QuantizedLSTM
+from aimet_tensorflow.keras.quant_sim.tensor_quantizer import TensorQuantizer, ActivationTensorQuantizer, \
+    ParamPerTensorQuantizer, StaticGridPerChannelQuantizer, ParamPerChannelQuantizer
+from aimet_tensorflow.keras.quantsim_config.quantsim_config import QuantSimConfigurator, INPUT_QUANTIZERS, \
+    OUTPUT_QUANTIZERS, PARAM_QUANTIZERS
+from aimet_tensorflow.keras.utils.common import convert_h5_model_to_pb_model
+
+from aimet_tensorflow.keras.defs import AxisHandling
+import aimet_tensorflow.keras.utils.common as keras_common_utils
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+
+unquantizable_modules = (tf.keras.layers.InputLayer, QcQuantizeWrapper)
+substitutable_modules = {
+    tf.keras.layers.MultiHeadAttention: QcQuantizableMultiHeadAttention,
+    tf.keras.layers.LSTM: QuantizedLSTM
+}
+
+
+@dataclass
+class QuantizationSimModelParams:
+    """
+    Data class that holds parameters for QuantizationSimModel. Used specifically to rebuild after converting to TF frozen pb
+    """
+    quant_scheme: Union[QuantScheme, str] = 'tf_enhanced'
+    rounding_mode: str = 'nearest'
+    default_output_bw: int = 8
+    default_param_bw: int = 8
+    in_place: bool = False
+    config_file: str = None
+    default_data_type: QuantizationDataType = QuantizationDataType.int
+
+
+# pylint: disable=too-many-ancestors
+# pylint: disable=too-many-instance-attributes
+
[docs]class QuantizationSimModel(tf.keras.Model): + """ + Implements mechanism to add quantization simulations ops to a model. This allows for off-target simulation of + inference accuracy. Also allows the model to be fine-tuned to counter the effects of quantization. + """ + + # pylint: disable=too-many-arguments + # pylint: disable=unused-argument + def __init__(self, model, quant_scheme: Union[QuantScheme, str] = 'tf_enhanced', rounding_mode: str = 'nearest', + default_output_bw: int = 8, default_param_bw: int = 8, in_place: bool = False, + config_file: str = None, default_data_type: QuantizationDataType = QuantizationDataType.int): + """ + :param model: Model to quantize + :param quant_scheme: Quantization Scheme, currently supported schemes are post_training_tf and + post_training_tf_enhanced, defaults to post_training_tf_enhanced + :param rounding_mode: The round scheme to used. One of: 'nearest' or 'stochastic', defaults to 'nearest'. + :param default_output_bw: bitwidth to use for activation tensors, defaults to 8 + :param default_param_bw: bitwidth to use for parameter tensors, defaults to 8 + :param in_place: If True, then the given 'model' is modified in-place to add quant-sim nodes. + Only suggested use of this option is when the user wants to avoid creating a copy of the model + :param config_file: Path to a config file to use to specify rules for placing quant ops in the model + :param default_data_type: Default data type to use for quantizing all layer parameters. + Possible options are QuantizationDataType.int and QuantizationDataType.float. + Note that the mode default_data_type=QuantizationDataType.float is only supported with + default_output_bw=16 and default_param_bw=16 + """ + super().__init__() + + self._model_without_wrappers = model + if not in_place: + self._model_without_wrappers = tf.keras.models.clone_model(model) + n_weights = len(self._model_without_wrappers.weights) + self._model_without_wrappers.set_weights(model.get_weights()[:n_weights]) + self._layer_name_to_quant_wrapper = {} + self._substituted_layer = {} # to hold the substituted layers + self._validate_model() + self.connected_graph = ConnectedGraph(self._model_without_wrappers) + self._quantsim_configurator = self._initialize_quantsim_configurator(quant_scheme, rounding_mode, + default_output_bw, default_param_bw, + default_data_type, config_file) + self.quant_scheme = quant_scheme + self._percentile_value = 100 # default percentile value + self.per_channel_quantization_enabled = self._quantsim_configurator.per_channel_quantization_flag + self.model = self._add_quantization_wrappers(quant_scheme, rounding_mode, + default_output_bw, default_param_bw, default_data_type) + self.quant_args = extract_global_quantizer_args(quant_scheme, self._quantsim_configurator) + + self._params = QuantizationSimModelParams(quant_scheme, rounding_mode, default_output_bw, default_param_bw, + in_place, config_file, default_data_type) + + def _validate_model(self): + """ + Check that model is appropriate for quantsim. + """ + multiple_inbound_node_layers = [] + + for layer in self._model_without_wrappers.layers: + if len(layer.inbound_nodes) > 1: + multiple_inbound_node_layers.append(layer.name) + + if multiple_inbound_node_layers: + error_msg = (f'Layers with more than one inbound nodes are unsupported. 
This may occur if a layer is ' + f'reused multiple times in the model definition.\n' + f'Layers with multiple inbound nodes: {multiple_inbound_node_layers}') + _logger.error(error_msg) + raise NotImplementedError(error_msg) + + sep_conv_found = self.check_separable_conv(self._model_without_wrappers) + if sep_conv_found: + # Raising an assertion error incase there's SeparableConv2D in the model because in this case we have two sets of weights: Depthwise + # and Pointwise. For depthwise kernels, LAST TWO AXIS should be considered and for pointwise kernels LAST AXIS + # should be considered, which is not handled here. Running model preparer beforehand will resolve this as there the + # SeparableConv2D is splitted into two layers Depthwise and Pointwise seperately. + raise AssertionError("SeparableConv2D found in the model. Please run model preparer before calling QuantizationSimModel") + + def check_separable_conv(self, model: tf.keras.models.Model | tf.keras.Sequential) -> bool: + """ + Checks for SeparableConv2D layer in the model + :param model: Keras Model + :return: Boolean value, True if SeperableConv layer is found else False + """ + for layer in model.layers: + if isinstance(layer, tf.keras.Sequential): + if self.check_separable_conv(layer): + return True + elif isinstance(layer, tf.keras.layers.SeparableConv2D): + return True + return False + + def _get_quantizer_list(self) -> Tuple[List, List, List]: + """ + Method to provide a list of input, output and parameter quantizers + :return: Three lists containing input, paramater and output quantizers respectively + """ + input_quantizers = [] + parameter_quantizers = [] + output_quantizers = [] + + for wrapper in self.quant_wrappers(): + for quantizer in wrapper.input_quantizers: + input_quantizers.append(quantizer) + + for quantizer in wrapper.param_quantizers: + parameter_quantizers.append(quantizer) + + for quantizer in wrapper.output_quantizers: + output_quantizers.append(quantizer) + + return input_quantizers, parameter_quantizers, output_quantizers + + def set_percentile_value(self, percentile_value: float): + """ + Set the percentile value to be used while computing encodings for quantizers having percentile quant scheme. 
+ + :param percentile_value: Percentile value to be set to + """ + if percentile_value < 90 or percentile_value > 100: + raise ValueError("Percentile value must be in range [90, 100]") + self._percentile_value = percentile_value + + # Set the percentile value to the activation quantizers + input_quantizers, _, output_quantizers = self._get_quantizer_list() + for quantizer in input_quantizers + output_quantizers: + if quantizer.quant_scheme == QuantScheme.post_training_percentile: + quantizer.set_percentile_value(self._percentile_value) + + def _initialize_quantsim_configurator(self, quant_scheme: Union[QuantScheme, str], rounding_mode: str, + default_output_bw: int, default_param_bw: int, + default_data_type: QuantizationDataType = QuantizationDataType.int, + config_file: str = None) -> QuantSimConfigurator: + """ + Initialize quantsim configurator + :param quant_scheme: Quantization Scheme + :param rounding_mode: The round scheme to used + :param default_output_bw: bitwidth to use for activation tensors + :param default_param_bw: bitwidth to use for parameter tensors + :param default_data_type: data type to use for the parameter tensors + :param config_file: Path to a config file to use to specify rules for placing quant ops in the model + :return: QuantSimConfigurator + """ + return QuantSimConfigurator(self.connected_graph, quant_scheme, rounding_mode, + default_output_bw, default_param_bw, default_data_type, config_file) + + def _add_quantization_wrappers(self, quant_scheme, rounding_mode, + default_output_bw, default_param_bw, default_data_type): + """ + Add quantization wrappers to the model and return a new model with the wrappers inserted. + :param quant_scheme: Quantization scheme to use + :param rounding_mode: Rounding mode to use + :param default_output_bw: Default bitwidth for activation quantizers + :param default_param_bw: Default bitwidth for param quantizers + :param default_data_type: data type to use for param quantizers + """ + + def wrap_layer(layer) -> tf.keras.layers.Layer: + """ + Function to wrap layers with QcQuantizeWrappers, used by keras clone_model() + :param layer: Layer to wrap + :return: Wrapped layer, or original layer if layer is not to be wrapped + """ + if isinstance(layer, tuple(substitutable_modules.keys())): + new_class = substitutable_modules[type(layer)] + config = layer.get_config() + config["copy_source_weights"] = layer.get_weights() + + if isinstance(layer, tf.keras.layers.LSTM): + if isinstance(self._model_without_wrappers, tf.keras.Sequential): + config["is_sequential_model"] = True + + # pylint: disable=protected-access + if self._quantsim_configurator._layer_to_config_dict[layer]["is_input_quantized"]["setting"]: + config["is_input_quantized"] = True + config["quant_scheme"] = quant_scheme + config["rounding_mode"] = rounding_mode + config["default_output_bw"] = default_output_bw + config["default_param_bw"] = default_param_bw + config["default_data_type"] = default_data_type + + wrapped_layer = new_class.from_config(config) + self._substituted_layer[layer] = wrapped_layer + return wrapped_layer + + if isinstance(layer, tf.keras.Sequential): + return tf.keras.models.clone_model(layer, clone_function=wrap_layer) + + if isinstance(layer, unquantizable_modules) or layer.submodules: + return layer + + activation_quant_settings = QuantizerSettings(default_output_bw, default_data_type, rounding_mode, + quant_scheme, False, False, False) + param_quant_settings = QuantizerSettings(default_param_bw, default_data_type, rounding_mode, + quant_scheme, 
False, False, False) + + input_quantizers, output_quantizers, param_quantizers = self._get_quantizers_by_layer(layer) + wrapper = QcQuantizeWrapper(layer, activation_quant_settings, param_quant_settings, + num_inputs=len(layer.inbound_nodes[0].keras_inputs), + input_quantizers=input_quantizers, + output_quantizers=output_quantizers, + param_quantizers=param_quantizers, + per_channel_quantization_enabled=self.per_channel_quantization_enabled) + self._layer_name_to_quant_wrapper[layer.name] = wrapper + return wrapper + + return tf.keras.models.clone_model(self._model_without_wrappers, clone_function=wrap_layer) + + def _get_quantizers_by_layer(self, layer: tf.keras.layers.Layer) -> Tuple[Optional[ActivationTensorQuantizer], + Optional[ActivationTensorQuantizer], + Union[ParamPerTensorQuantizer, + ParamPerChannelQuantizer]]: + """ + Get input/output/param quantizers from quantizers dictionary or initialize quantizers if layer is not found + :param layer: Target layer + :return: tuple of input, output, param quantizers + """ + quantizers_dict = self._quantsim_configurator.get_quantizers_dict(layer) + if quantizers_dict is None: + _logger.warning("%s not found in quantizers dict, will generate quantizers automatically", layer.name) + input_quantizers = None + output_quantizers = None + param_quantizers = None + else: + input_quantizers = quantizers_dict.get(INPUT_QUANTIZERS) + output_quantizers = quantizers_dict.get(OUTPUT_QUANTIZERS) + param_quantizers = quantizers_dict.get(PARAM_QUANTIZERS) + + return input_quantizers, output_quantizers, param_quantizers + + @staticmethod + def _quantizer_to_name_tuple(quantizers: List[TensorQuantizer]) -> Tuple[Optional[List[str]]]: + """ + Converts a list of quantizers to a tuple of quantizer names + :param quantizers: quantizers + :return: tuple of quantizer names + """ + quant_list = [] + if not quantizers: + return None + + for quantizer in quantizers: + quant_list.append(quantizer.name) + return tuple(quant_list) + + def get_quantizer_name_by_layer(self, layer: tf.keras.layers.Layer) -> Tuple[Optional[List[str]], + Optional[List[str]], + Optional[List[str]]]: + """ + Get the names of input, output and param quantizers + :param layer: the keras layer + :return: Tuple of quantizer names + """ + input_quantizers, output_quantizers, param_quantizers = self._get_quantizers_by_layer(layer) + output_quantizers_names = self._quantizer_to_name_tuple(output_quantizers) + input_quantizers_names = self._quantizer_to_name_tuple(input_quantizers) + parameter_quantizers_names = self._quantizer_to_name_tuple(param_quantizers) + + return input_quantizers_names, output_quantizers_names, parameter_quantizers_names + + def _disable_quantizers_in_folded_batchnorm(self): + """ + Disable input/output/param quantizers if layer is folded batch normalization + """ + for quantsim_wrapper in self._layer_name_to_quant_wrapper.values(): + if GraphSearchUtils.is_folded_batch_normalization(quantsim_wrapper.original_layer): + for q in quantsim_wrapper.input_quantizers: + q.disable() + for q in quantsim_wrapper.output_quantizers: + q.disable() + for q in quantsim_wrapper.param_quantizers: + q.disable() + + @staticmethod + def _get_encoding_dict_for_quantizer(quantizer: TensorQuantizer) -> Union[List[Dict[str, Union[str, int, float]]], + Dict[str, Union[str, int, float]]]: + """ + Get encoding dict for a tensor quantizer. 
+ + :param quantizer: Quantizer to get encoding info from + :return: Dictionary or List of dictionaries containing encodings info for the tensor quantizer + """ + if not isinstance(quantizer, ParamPerChannelQuantizer) or quantizer.data_type == QuantizationDataType.float: + quantizer_encodings = [quantizer.encoding] + else: + quantizer_encodings = quantizer.encoding + return [ + { + 'min': encoding.min, + 'max': encoding.max, + 'scale': encoding.delta, + 'offset': int(encoding.offset), + 'bitwidth': encoding.bw, + 'is_symmetric': str(quantizer.is_symmetric), + 'dtype': 'int' + } if quantizer.data_type == QuantizationDataType.int + else {'dtype': 'float', 'bitwidth': int(quantizer.bitwidth)} + for encoding in quantizer_encodings + ] + + def get_encodings_dict(self) -> Dict[str, Union[str, Dict]]: + """ + Get encodings dict containing all activation and parameter encodings info in the model + :return: Dictionary containing all activation and parameter encodings info in the model + """ + # pylint: disable=protected-access, too-many-branches + model_input_tensor_names = [inp.name for inp in self.model.inputs] + activation_encodings = {} + param_encodings = {} + for wrapper in self.quant_wrappers(): + for idx, input_quantizer in enumerate(wrapper.input_quantizers): + if input_quantizer.is_encoding_valid() or input_quantizer.data_type == QuantizationDataType.float: + # because dense layers in quantizable MHA are not explicitly sublayers, they don't have their + # inbound_nodes parameter populated, so the name of the quantizer is used instead + if not wrapper._layer_to_wrap.inbound_nodes: + tensor_name = wrapper.name + "/" + input_quantizer.name + ":0" + else: + tensor_name = wrapper._layer_to_wrap.inbound_nodes[0].keras_inputs[idx].name + encoding_dict = self._get_encoding_dict_for_quantizer(input_quantizer) + if tensor_name in model_input_tensor_names: + tensor_name += ":0" + activation_encodings[tensor_name] = encoding_dict + for idx, param_quantizer in enumerate(wrapper.param_quantizers): + if param_quantizer.is_encoding_valid() or param_quantizer.data_type == QuantizationDataType.float: + param_name = wrapper._layer_to_wrap.weights[idx].name + encoding_dict = self._get_encoding_dict_for_quantizer(param_quantizer) + param_encodings[param_name] = encoding_dict + for idx, output_quantizer in enumerate(wrapper.output_quantizers): + if output_quantizer.is_encoding_valid() or output_quantizer.data_type == QuantizationDataType.float: + # because dense layers in quantizable MHA are not explicitly sublayers, they don't have their + # inbound_nodes parameter populated, so the name of the quantizer is used instead + if not wrapper._layer_to_wrap.inbound_nodes: + tensor_name = wrapper.name + ":0" + elif isinstance(wrapper._layer_to_wrap.output, List): + tensor_name = wrapper._layer_to_wrap.output[idx].name + else: + tensor_name = wrapper._layer_to_wrap.output.name + encoding_dict = self._get_encoding_dict_for_quantizer(output_quantizer) + activation_encodings[tensor_name] = encoding_dict + return { + 'version': encoding_version, + 'activation_encodings': activation_encodings, + 'param_encodings': param_encodings, + 'quantizer_args': self.quant_args if hasattr(self, "quant_args") else {} + } + +
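# Editorial note (not part of the original source): a minimal sketch showing how the
# dictionary built by get_encodings_dict() above might be persisted for inspection.
# Per the code above, its top-level keys are 'version', 'activation_encodings',
# 'param_encodings' and 'quantizer_args', and each int encoding entry carries
# min/max/scale/offset/bitwidth/'is_symmetric'/'dtype'. Note that export() below
# already writes this structure via save_json_yaml; this helper is only an aside.
def save_encodings_dict(sim, file_path: str):
    encodings = sim.get_encodings_dict()
    with open(file_path, "w") as json_file:
        json.dump(encodings, json_file, indent=4)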
[docs] def compute_encodings(self, forward_pass_callback, forward_pass_callback_args): + """ + Computes encodings for all quantization sim nodes in the model. + :param forward_pass_callback: A callback function that is expected to runs forward passes on a model. + This callback function should use representative data for the forward pass, so the calculated + encodings work for all data samples. + :param forward_pass_callback_args: These argument(s) are passed to the forward_pass_callback as-is. Up to + the user to determine the type of this parameter. E.g. could be simply an integer representing the number + of data samples to use. Or could be a tuple of parameters or an object representing something more + complex. + """ + ops_with_invalid_encodings = [] + self._compute_and_set_parameter_encodings(ops_with_invalid_encodings) + + self._set_op_mode_parameters(libpymo.TensorQuantizerOpMode.quantizeDequantize) + + forward_pass_callback(self.model, forward_pass_callback_args) + for quant_wrapper in self.quant_wrappers(): + quant_wrapper.compute_encoding(ops_with_invalid_encodings) + + op_mode = self._param_op_mode_after_analysis(self.quant_scheme) + + self._set_op_mode_parameters(op_mode) + + if ops_with_invalid_encodings: + _logger.info('The following quantizers did not have valid encodings and have been set to passThrough mode: ' + '%s', ops_with_invalid_encodings) + _logger.info('This can be due to the quantizers not having been evaluated during the forward pass in ' + 'compute encodings. Evaluation is required to collect statistics needed to compute valid ' + 'encodings.\n' + 'As a result, the quantizers have been set to passThrough mode, meaning no quantization noise ' + 'will be simulated for these ops if they are evaluated in the future.\n' + 'If this is not desired, amend the forward pass to evaluate tensors which require these ops ' + 'to be evaluated, and recompute encodings.')
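# Editorial sketch (not part of the original source): a representative
# forward_pass_callback for compute_encodings() above. Any callable works, since it is
# invoked as forward_pass_callback(sim.model, forward_pass_callback_args); here the
# args are assumed to be a (tf.data.Dataset of input batches, num_batches) tuple.
def example_forward_pass(model: tf.keras.Model, args):
    dataset, num_batches = args
    for inputs in dataset.take(num_batches):
        # Labels are not needed; only the forward-pass statistics matter
        model(inputs, training=False)

# Typical call, with `sim` a QuantizationSimModel and `calib_ds` representative data:
#     sim.compute_encodings(forward_pass_callback=example_forward_pass,
#                           forward_pass_callback_args=(calib_ds, 4))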
+ + def _set_op_mode_parameters(self, op_mode: libpymo.TensorQuantizerOpMode): + """ + Sets quant mode for parameters and if the encodings are invalid, then adds those wrappers + to wrappers_with_invalid_encodings + :param op_mode: Quant mode to set to + """ + + for quantizer_info in self.quant_wrappers(): + for param_quantizer in quantizer_info.param_quantizers: + if param_quantizer.is_enabled(): + param_quantizer.quant_mode = op_mode + +
[docs] def export(self, path, filename_prefix, custom_objects=None, convert_to_pb=True): + """ + This method exports out the quant-sim model so it is ready to be run on-target. + Specifically, the following are saved + 1. The sim-model is exported to a regular Keras model without any simulation ops + 2. The quantization encodings are exported to a separate JSON-formatted file that can + then be imported by the on-target runtime (if desired) + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames of the model pth and encodings files + :param custom_objects: If there are custom objects to load, Keras needs a dict of them to map them + """ + model_path = os.path.join(path, filename_prefix) + + #TF Version 2.4 has bug i.e. save() in tf format don't work for unrolled LSTM. + for layer in self._model_without_wrappers.layers: + if isinstance(layer, tf.keras.layers.LSTM): + break + else: + self._model_without_wrappers.save(model_path) + + self._model_without_wrappers.save(model_path + '.h5', save_format='h5') + + # Conversion of saved h5 model to pb model for consumption by SNPE/QNN + try: + if convert_to_pb: + convert_h5_model_to_pb_model(f'{model_path}.h5', custom_objects=custom_objects) + except ValueError: + _logger.error("Could not convert h5 to frozen pb. " + "Please call export() again with custom_objects defined.") + raise + finally: + encodings_dict = self.get_encodings_dict() + encoding_file_path = os.path.join(path, filename_prefix + '.encodings') + save_json_yaml(encoding_file_path, encodings_dict)
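# Editorial sketch (not part of the original source): export() above logs an error and
# re-raises a ValueError from the h5-to-pb conversion when the model contains
# user-defined layers that are not supplied via custom_objects. The wrapper below is a
# hypothetical convenience used only to illustrate passing that mapping.
def export_with_custom_layers(sim, path: str, filename_prefix: str, custom_objects: dict):
    os.makedirs(path, exist_ok=True)
    # e.g. custom_objects={"MyCustomLayer": MyCustomLayer} for a user-defined layer class
    sim.export(path, filename_prefix, custom_objects=custom_objects)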
+ + def _compute_and_set_parameter_encodings(self, ops_with_invalid_encodings: List): + # pylint: disable=too-many-nested-blocks + for quantizer_wrapper in self.quant_wrappers(): + for idx, param_quantizer in enumerate(quantizer_wrapper.param_quantizers): + if param_quantizer.is_enabled() and param_quantizer.data_type == QuantizationDataType.int: + # 0th input to our quant wrapper is the tensor being quantized + weight_tensor = quantizer_wrapper.original_layer.get_weights()[idx] + + # Per-channel + if isinstance(param_quantizer, StaticGridPerChannelQuantizer): + for index, tensor_quantizer in enumerate(param_quantizer.tensor_quantizer): + if param_quantizer.axis_handling == AxisHandling.LAST_TWO_AXES.value: + last_two_axes_combined_shape = list(weight_tensor.shape[:-2]) + [-1] + channel_slice = weight_tensor.reshape(*last_two_axes_combined_shape) + channel_slice = channel_slice.take(index, channel_slice.ndim - 1) + elif isinstance(quantizer_wrapper.original_layer, tf.keras.layers.Conv2DTranspose): + if weight_tensor.ndim == 4: + channel_slice = weight_tensor.take(index, weight_tensor.ndim - 2) + else: + # For bias in Transpose layers + channel_slice = weight_tensor.take(index, weight_tensor.ndim - 1) + else: + channel_slice = weight_tensor.take(index, weight_tensor.ndim - 1) + tensor_quantizer.updateStats(channel_slice, False) + + # Per-tensor + else: + tensor_quantizer = param_quantizer.tensor_quantizer + tensor_quantizer.updateStats(weight_tensor, False) + + param_quantizer.compute_encoding(ops_with_invalid_encodings) + + def set_and_freeze_param_encodings(self, encoding_path: str): + """ + Set and freeze parameter encodings from encodings JSON file + :param encoding_path: path from where to load parameter encodings file + """ + # Load parameter encodings file + with open(encoding_path) as json_file: + param_encodings = json.load(json_file) + + for quant_wrapper in self.quant_wrappers(): + quant_wrapper.set_and_freeze_param_encoding(param_encodings) + + def load_encodings_to_sim(self, encoding_file_path: str): + """ + Loads the saved encodings to quant sim model + + :param encoding_file_path: path from where to load encodings file + :return: + """ + # pylint: disable=protected-access, too-many-branches, too-many-locals, too-many-statements + # Load encodings file + with open(encoding_file_path) as json_file: + encodings = json.load(json_file) + + param_encodings = encodings['param_encodings'] + activation_encodings = encodings['activation_encodings'] + + model_input_tensor_names = [inp.name for inp in self.model.inputs] + + for wrapper in self.quant_wrappers(): + for idx, input_quantizer in enumerate(wrapper.input_quantizers): + # because dense layers in quantizable MHA and RNN are not explicitly sublayers, they don't have their + # inbound_nodes parameter populated, so the name of the quantizer is used instead + if not wrapper._layer_to_wrap.inbound_nodes: + tensor_name = wrapper.name + "/" + input_quantizer.name + ":0" + else: + tensor_name = wrapper._layer_to_wrap.inbound_nodes[0].keras_inputs[idx].name + if tensor_name in model_input_tensor_names: + tensor_name += ":0" + + if tensor_name in activation_encodings: + if not input_quantizer.is_enabled(): + _logger.info("Not loading encodings for quantizer: %s as it is disabled", tensor_name) + continue + encoding_dict = activation_encodings[tensor_name][0] + if encoding_dict['dtype'] == 'int': + encoding, is_symmetric = keras_common_utils.create_encoding_from_dict(encoding_dict) + input_quantizer.tensor_quantizer.isEncodingValid = True + 
input_quantizer.set_quantizer_encodings(encoding.bw, is_symmetric, encoding, + libpymo.TensorQuantizerOpMode.quantizeDequantize) + _logger.info("Setting encodings for : %s", tensor_name) + elif encoding_dict['dtype'] == 'float': + input_quantizer.data_type = QuantizationDataType.float + input_quantizer.bitwidth = encoding_dict['bitwidth'] + _logger.info("Setting quantizer dtype to float for : %s", tensor_name) + else: + raise RuntimeError("Unrecognized dtype %s for: %s" % (encoding_dict['dtype'], tensor_name)) + else: + if input_quantizer.is_enabled(): + input_quantizer.disable() + _logger.info("Encoding for quantizer: %s is not present thus disabling it.", tensor_name) + + for idx, param_quantizer in enumerate(wrapper.param_quantizers): + param_name = wrapper._layer_to_wrap.weights[idx].name + + if param_name in param_encodings: + if not param_quantizer.is_enabled(): + _logger.info("Not loading encodings for parameter: %s as quantizer is disabled", param_name) + continue + if isinstance(param_quantizer, StaticGridPerChannelQuantizer): + assert param_encodings[param_name][0]['dtype'] != 'float', "PerChannel Quantizers can't be set to float" + encoding, is_symmetric = keras_common_utils.create_encoding_from_dict( + param_encodings[param_name]) + for tensor_quantizer in param_quantizer.tensor_quantizer: + tensor_quantizer.isEncodingValid = True + bw = encoding[0].bw + param_quantizer.set_quantizer_encodings(bw, is_symmetric, encoding, + libpymo.TensorQuantizerOpMode.oneShotQuantizeDequantize) + _logger.info("Setting encodings for : %s", param_name) + else: + encoding_dict = param_encodings[param_name][0] + if encoding_dict['dtype'] == 'int': + encoding, is_symmetric = keras_common_utils.create_encoding_from_dict(encoding_dict) + param_quantizer.tensor_quantizer.isEncodingValid = True + bw = encoding.bw + param_quantizer.set_quantizer_encodings(bw, is_symmetric, encoding, + libpymo.TensorQuantizerOpMode.oneShotQuantizeDequantize) + _logger.info("Setting encodings for : %s", param_name) + elif encoding_dict['dtype'] == 'float': + param_quantizer.data_type = QuantizationDataType.float + param_quantizer.bitwidth = encoding_dict['bitwidth'] + _logger.info("Setting quantizer to float for : %s", param_name) + else: + raise RuntimeError("Unrecognized dtype %s for: %s" % (encoding_dict['dtype'], tensor_name)) + else: + if param_quantizer.is_enabled(): + param_quantizer.disable() + _logger.info("Encoding for parameter: %s not present thus disabling this quantizer.", + param_name) + + # Loading encodings means that compute encodings was called. Therefore, these two lines set the correct + # op mode for the correct quant scheme and if the quantization was per channel or not. + op_mode = self._param_op_mode_after_analysis(self.quant_scheme) + self._set_op_mode_parameters(op_mode) + + for idx, output_quantizer in enumerate(wrapper.output_quantizers): + # because dense layers in quantizable MHA are not explicitly sublayers, they don't have their + # inbound_nodes parameter populated, so the name of the quantizer is used instead + if not wrapper._layer_to_wrap.inbound_nodes: + tensor_names = [wrapper.name + ":0"] + else: + # There can be multiple outputs if there is a + # `tf.split` in the model. 
+ if isinstance(wrapper._layer_to_wrap.output, list): + tensor_names = [ + output.name + for output in wrapper._layer_to_wrap.output + ] + else: + tensor_names = [wrapper._layer_to_wrap.output.name] + + for tensor_name in tensor_names: + if tensor_name in activation_encodings: + if not output_quantizer.is_enabled(): + _logger.info("Not loading encodings for quantizer: %s as it is disabled", tensor_name) + continue + encoding_dict = activation_encodings[tensor_name][0] + if encoding_dict['dtype'] == 'int': + encoding, is_symmetric = keras_common_utils.create_encoding_from_dict(encoding_dict) + output_quantizer.tensor_quantizer.isEncodingValid = True + output_quantizer.set_quantizer_encodings(encoding.bw, is_symmetric, encoding, + libpymo.TensorQuantizerOpMode.quantizeDequantize) + _logger.info("Setting encodings for : %s", tensor_name) + elif encoding_dict['dtype'] == 'float': + output_quantizer.data_type = QuantizationDataType.float + output_quantizer.bitwidth = encoding_dict['bitwidth'] + _logger.info("Setting quantizer dtype to float for : %s", tensor_name) + else: + raise RuntimeError("Unrecognized dtype %s for: %s" % (encoding_dict['dtype'], tensor_name)) + else: + if output_quantizer.is_enabled(): + output_quantizer.disable() + _logger.info("Encoding for quantizer: %s is not present thus disabling it.", tensor_name) + + def _param_op_mode_after_analysis(self, quant_scheme) -> libpymo.TensorQuantizerOpMode: + """ + Returns quant mode to use for parameters after encodings have been computed + :param quant_scheme: Quantization scheme to use + :return: Quant mode to use + """ + if quant_scheme in [QuantScheme.training_range_learning_with_tf_init, + QuantScheme.training_range_learning_with_tf_enhanced_init] \ + or self.per_channel_quantization_enabled: + return libpymo.TensorQuantizerOpMode.quantizeDequantize + return libpymo.TensorQuantizerOpMode.oneShotQuantizeDequantize + + def quant_wrappers(self): + """ + Generator for yielding all quantization wrappers + """ + for layer in self.model.layers: + if isinstance(layer, QcQuantizeWrapper): + yield layer + if isinstance(layer, tuple(substitutable_modules.values())): + yield from layer.quant_wrappers() + + # For Getting Quantizers from Sequantial Block + if isinstance(layer, tf.keras.Sequential): + yield from quant_wrappers_for_sequential_block(layer) + + def get_quant_wrapper_for_layer_name(self, layer_name: str) -> QcQuantizeWrapper: + """ + Return qc quant wrapper corresponding to a layer name + :param layer_name: Layer name to get quantize wrapper for + :return: Qc quant wrapper corresponding to a layer name + """ + return self._layer_name_to_quant_wrapper.get(layer_name) + + # pylint: disable=too-many-locals + def _fill_missing_encoding_min_max_gradients(self, gradients: list): + """ + Computes the encoding min/max gradients and populates the gradients list + :param gradients: gradients computed using GradientTape(gradients for encoding min/max will be `None`) + """ + + def _find_weight_in_layer(weight_name: str, model_layer: tf.keras.layers.Layer): + + for weight in model_layer.weights: + if weight.name.split(":")[0] == weight_name: + return weight + + return None + + # Mapping used to get the gradients of weights(kernel, bias etc) + weight_name_to_gradient = dict(zip([weight.name.split(":")[0] for weight in self.model.trainable_weights], + gradients)) + + # Mapping used to get index of encoding min/max gradients (which would be `None`) and fill them + weight_name_to_index = dict(zip([weight.name for weight in 
self.model.trainable_weights], + range(len(self.model.trainable_weights)))) + + # Only process layers where 'param_quantizers' is defined (i.e. QcQuantizeWrapper layers) + for layer in filter(lambda _layer: hasattr(_layer, 'param_quantizers'), self.model.layers): + for param_quantizer in layer.param_quantizers: + if param_quantizer.name in weight_name_to_gradient: + # Value of weight associated with this param quantizer + weight_tensor = _find_weight_in_layer(param_quantizer.name, layer.original_layer) + + # Gradients of the weights + grad = weight_name_to_gradient[param_quantizer.name] + + # Using the weights and it's gradients, compute gradients for encoding min/max + dloss_by_dmin, dloss_by_dmax = param_quantizer.get_gradients_for_encoding_min_max(weight_tensor, + grad) + + enc_min_index = weight_name_to_index[param_quantizer.encoding_min.name] + enc_max_index = weight_name_to_index[param_quantizer.encoding_max.name] + + gradients[enc_min_index] = dloss_by_dmin + gradients[enc_max_index] = dloss_by_dmax + + # TODO: Remove this logic once this has been resolved in QNN/SNPE + # Go through activation quantizers (Input/Output) and set any ReLU's encoding min to 0 + relu_quantize_wrappers = [ + _layer for _layer in self.model.layers + if isinstance(_layer, QcQuantizeWrapper) and isinstance(_layer.original_layer, tf.keras.layers.ReLU) + ] + + def _set_encoding_min_grad_to_None(quantizer): + enc_min_index = weight_name_to_index[quantizer.encoding_min.name] + gradients[enc_min_index] = None + + for relu_quantizer in relu_quantize_wrappers: + for output_quantizer in relu_quantizer.output_quantizers: + _set_encoding_min_grad_to_None(output_quantizer) + + # pylint: disable=useless-super-delegation + def get_config(self): + return super().get_config() + + def call(self, inputs, training=None, mask=None): + return self.model.call(inputs, training, mask) + + def train_step(self, data): + """ + Custom training loop, equivalent to overriding `keras.Model.fit` function + Reference: https://keras.io/guides/customizing_what_happens_in_fit/ + Only relevant when using range-learning, otherwise equivalent to `keras.Model.fit` + Param quantizers are disconnected in the op graph of the wrapped model + Because of this, the gradients are not computed for encoding min/max(when range learning is enabled) + This custom train_step function computes the missing gradients for encoding min/max of param quantizers + """ + x, y = data + + with tf.GradientTape() as tape: + predictions = self(x, training=True) + loss = self.compiled_loss(y, predictions) + + gradients = tape.gradient(loss, self.model.trainable_weights) + + # Manually compute missing gradients for encoding min/max when using range learning + if self.quant_scheme in [QuantScheme.training_range_learning_with_tf_init, + QuantScheme.training_range_learning_with_tf_enhanced_init]: + self._fill_missing_encoding_min_max_gradients(gradients) + + gradients_to_apply = [(gradient, weight) for gradient, weight in zip(gradients, self.model.trainable_weights) + if gradient is not None] + + self.optimizer.apply_gradients(gradients_to_apply) + + self.compiled_metrics.update_state(y, predictions) + + return {m.name: m.result() for m in self.metrics}
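# Editorial sketch (not part of the original source): with a range-learning quant
# scheme, fine-tuning goes through the regular Keras fit() flow and the custom
# train_step() above fills in the missing encoding min/max gradients. The optimizer,
# learning rate and loss below are illustrative (they assume a classification model);
# compute_encodings() is expected to have been called beforehand.
def finetune_with_range_learning(sim, train_dataset, epochs: int = 1):
    sim.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-5),
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=["accuracy"])
    return sim.fit(train_dataset, epochs=epochs)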
+
+
+def quant_wrappers_for_sequential_block(seq_block: tf.keras.Sequential):
+    """
+    Generator for yielding all quantization wrappers for a Sequential block
+    """
+    for layer in seq_block.layers:
+        if isinstance(layer, QcQuantizeWrapper):
+            yield layer
+        if isinstance(layer, tuple(substitutable_modules.values())):
+            yield from layer.quant_wrappers()
+
+        # in case of a nested Sequential block
+        if isinstance(layer, tf.keras.Sequential):
+            yield from quant_wrappers_for_sequential_block(layer)
+
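# Editorial end-to-end sketch (not part of the original source): typical use of
# QuantizationSimModel as defined above, relying on the constructor defaults
# (tf_enhanced scheme, 8-bit weights and activations). `forward_pass_fn` is a user
# callback such as the example shown after compute_encodings(), `callback_args` is
# whatever that callback expects, and the export directory and filename prefix are
# illustrative.
def build_and_export_quantsim(model: tf.keras.Model, forward_pass_fn, callback_args,
                              export_dir: str = "./export"):
    sim = QuantizationSimModel(model, default_output_bw=8, default_param_bw=8)
    # Calibration pass: compute quantization encodings from representative data
    sim.compute_encodings(forward_pass_callback=forward_pass_fn,
                          forward_pass_callback_args=callback_args)
    # Writes the plain Keras model plus a JSON .encodings file for on-target runtimes
    os.makedirs(export_dir, exist_ok=True)
    sim.export(export_dir, filename_prefix="model_w8a8")
    return sim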
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/layer_output_utils.html b/releases/1.32.2/_modules/aimet_tensorflow/layer_output_utils.html new file mode 100644 index 00000000..d9acbb5d --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/layer_output_utils.html @@ -0,0 +1,1298 @@ + + + + + + aimet_tensorflow.layer_output_utils — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
Source code for aimet_tensorflow.layer_output_utils

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" This module contains utilities to capture and save intermediate layer-outputs of a model """
+
+import re
+from typing import List, Dict, Tuple, Union
+
+import numpy as np
+import tensorflow as tf
+
+from aimet_common.utils import AimetLogger
+from aimet_common.layer_output_utils import SaveInputOutput, save_layer_output_names
+
+from aimet_tensorflow.common.connectedgraph import ConnectedGraph
+from aimet_tensorflow.quantsim import QuantizationSimModel
+from aimet_tensorflow.utils.common import create_input_feed_dict
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.LayerOutputs)
+
+
+
[docs]class LayerOutputUtil: + """ Implementation to capture and save outputs of intermediate layers of a model (fp32/quantsim) """ + + def __init__(self, session: tf.compat.v1.Session, starting_op_names: List[str], output_op_names: List[str], + dir_path: str): + """ + Constructor for LayerOutputUtil. + + :param session: Session containing the model whose layer-outputs are needed. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :param dir_path: Directory wherein layer-outputs will be saved. + """ + self.session = session + self.starting_op_names = starting_op_names + + # Utility to capture layer-outputs + self.layer_output = LayerOutput(session=session, starting_op_names=starting_op_names, + output_op_names=output_op_names, dir_path=dir_path) + + # Identify the axis-layout used for representing an image tensor + axis_layout = 'NHWC' if tf.keras.backend.image_data_format() == 'channels_last' else 'NCHW' + + # Utility to save model inputs and their corresponding layer-outputs + self.save_input_output = SaveInputOutput(dir_path, axis_layout) + +
[docs] def generate_layer_outputs(self, input_batch: Union[np.ndarray, List[np.ndarray], Tuple[np.ndarray]]): + """ + This method captures output of every layer of a model & saves the inputs and corresponding layer-outputs to disk. + + :param input_batch: Batch of inputs for which we want to obtain layer-outputs. + :return: None + """ + logger.info("Generating layer-outputs for %d input instances", len(input_batch)) + + feed_dict = create_input_feed_dict(self.session.graph, self.starting_op_names, input_batch) + + # Obtain layer-output name to output dictionary + layer_output_batch_dict = self.layer_output.get_outputs(feed_dict) + + # Skip constant scalar layer-outputs + const_scalar_layer_name = [] + for layer_name, layer_output in layer_output_batch_dict.items(): + if not isinstance(layer_output, np.ndarray): + const_scalar_layer_name.append(layer_name) + for layer_name in const_scalar_layer_name: + logger.info("Skipping constant scalar output of layer %s", layer_name) + _ = layer_output_batch_dict.pop(layer_name) + + # Save inputs and layer-outputs + self.save_input_output.save(input_batch, layer_output_batch_dict) + + logger.info('Layer-outputs generated for %d input instances', len(input_batch))
+ + +class LayerOutput: + """ + This class creates a layer-output name to layer-output dictionary. The layer-output names are as per the AIMET exported + tensorflow model. + """ + def __init__(self, session: tf.compat.v1.Session, starting_op_names: List[str], output_op_names: List[str], dir_path: str): + """ + Constructor - It initializes few lists that are required for capturing and naming layer-outputs. + + :param session: Session containing TF model. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + """ + self.session = session + self.activation_tensor_names, self.activation_tensors = LayerOutput.get_activation_tensor_info( + session, starting_op_names, output_op_names) + + # Save activation tensor names which are in topological order of model graph. This order can be used while comparing layer-outputs. + save_layer_output_names(self.activation_tensor_names, dir_path) + + def get_outputs(self, feed_dict: Dict) -> Dict[str, np.ndarray]: + """ + This function creates layer-output name to layer-output dictionary. The layer-output names are as per the AIMET + exported TF model. + + :param feed_dict: input tensor to input batch map + :return: layer-output name to layer-output dictionary + """ + act_outputs = self.session.run(self.activation_tensors, feed_dict=feed_dict) + return dict(zip(self.activation_tensor_names, act_outputs)) + + @staticmethod + def get_activation_tensor_info(session: tf.compat.v1.Session, starting_op_names: List[str], output_op_names: List[str]) -> Tuple[List, List]: + """ + This function fetches the activation tensors and its names from the given TF model. These activation tensors contain + the layer-outputs of the given TF model. + + :param session: Session containing TF model. + :param starting_op_names: List of starting op names of the model. + :param output_op_names: List of output op names of the model. + :return: activation_tensor_names, activation_tensors + """ + connected_graph = ConnectedGraph(session.graph, starting_op_names, output_op_names) + # pylint: disable=protected-access + activation_op_names = QuantizationSimModel._get_ops_to_quantize_activations_for(session.graph, connected_graph) + + # Get activation quantization ops + activation_quant_op_names = [op_name for op_name in activation_op_names if op_name.endswith('_quantized')] + + # If activation quant ops are present then capture only their tensors + if activation_quant_op_names: + activation_op_names = activation_quant_op_names + + activation_tensor_names = [] + activation_tensors = [] + for activation_op_name in activation_op_names: + activation_op = session.graph.get_operation_by_name(activation_op_name) + for output in activation_op.outputs: + activation_tensor_names.append(output.name) + activation_tensors.append(output) + + # Update activation tensor names by removing 'quantized:' string and replacing '/' with '_'. + activation_tensor_names = [re.sub(r'\W+', "_", name.replace('quantized:', '')) for name in activation_tensor_names] + + return activation_tensor_names, activation_tensors +
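# Editorial usage sketch (not part of the original source): capturing and saving
# per-layer outputs for a few input batches. `session` is the tf.compat.v1.Session
# holding the fp32 or quantsim graph; the op names and output directory below are
# illustrative placeholders for the actual model's ops.
def dump_layer_outputs(session: tf.compat.v1.Session, input_batches,
                       starting_op_names=("input_1",),
                       output_op_names=("dense/Softmax",),
                       dir_path="./layer_outputs"):
    layer_output_util = LayerOutputUtil(session, list(starting_op_names),
                                        list(output_op_names), dir_path)
    for input_batch in input_batches:
        layer_output_util.generate_layer_outputs(input_batch)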
diff --git a/releases/1.32.2/_modules/aimet_tensorflow/plotting_utils.html b/releases/1.32.2/_modules/aimet_tensorflow/plotting_utils.html new file mode 100644 index 00000000..5e7f6a70 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/plotting_utils.html @@ -0,0 +1,1614 @@
aimet_tensorflow.plotting_utils — AI Model Efficiency Toolkit Documentation: ver 1.32.2
Source code for aimet_tensorflow.plotting_utils

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019-2021, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Create visualizations on the weights in each conv and linear layer in a model"""
+import os
+import math
+import holoviews as hv
+import numpy as np
+import pandas as pd
+from bokeh import plotting
+from bokeh.layouts import row
+from bokeh.models import HoverTool, WheelZoomTool, ColumnDataSource, Span, TableColumn, DataTable
+from bokeh.plotting import figure
+from bokeh.layouts import column
+from bokeh.models import Div
+
+# Importing hvplot.pandas has the side effect of registering the .hvplot plotting
+# accessor on pandas DataFrames; this import is required even though it looks unused.
+import hvplot.pandas  # pylint:disable=unused-import
+
+from aimet_tensorflow.utils.op.conv import WeightTensorUtils
+
+
+def get_weights(conv_module, sess):
+    """
+    Returns the weights of a conv_module in a 2d matrix, where each column is an output channel.
+
+    :param sess: tf.compat.v1.Session
+    :param conv_module: convNd module
+    :return: 2d numpy array
+    """
+    numpy_weight = WeightTensorUtils.get_tensor_as_numpy_data(sess, conv_module)
+    numpy_weight = np.reshape(numpy_weight, (numpy_weight.shape[3], numpy_weight.shape[2], numpy_weight.shape[0],
+                                             numpy_weight.shape[1]))
+    axis_0_length = numpy_weight.shape[0]
+    axis_1_length = np.prod(numpy_weight.shape[1:])
+    reshaped_weights = numpy_weight.reshape(int(axis_0_length), int(axis_1_length))
+    return reshaped_weights
+
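+
+# Editorial example (added for illustration; not part of the AIMET source file).
+# Shows how the 2d weight matrix returned by get_weights() is turned into the
+# per-channel summary statistics consumed by the plotting helpers below. `sess`
+# and `conv_op` are assumed to come from the caller's TF session and graph.
+def _example_weight_summary_stats(conv_op, sess):
+    weights_2d = get_weights(conv_op, sess)
+    # describe() yields count/mean/std/min/25%/50%/75%/max; transpose to one row per column
+    return pd.DataFrame(weights_2d).describe().T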
+
+def style(p):
+    """
+    Style bokeh figure object p and return the styled object
+    :param p: Bokeh figure object
+    :return: Bokeh figure object
+    """
+    # Title
+    p.title.align = 'center'
+    p.title.text_font_size = '14pt'
+    p.title.text_font = 'serif'
+
+    # Axis titles
+    p.xaxis.axis_label_text_font_size = '12pt'
+    # p.xaxis.axis_label_text_font_style = 'bold'
+    p.yaxis.axis_label_text_font_size = '12pt'
+    #     p.yaxis.axis_label_text_font_style = 'bold'
+
+    # Tick labels
+    p.xaxis.major_label_text_font_size = '10pt'
+    p.yaxis.major_label_text_font_size = '10pt'
+
+    p.add_tools(WheelZoomTool())
+
+    return p
+
+
+def line_plot_changes_in_summary_stats(data_before, data_after, x_axis_label=None, y_axis_label=None, title=None):
+    """
+    Returns a bokeh figure object showing a lineplot of min, max, and mean per output channel, shading in the area
+    difference between before and after.
+    :param data_before: pandas data frame with columns min, max, and mean.
+    :param data_after: pandas data frame with columns min, max, and mean
+    :param x_axis_label: string description of x axis
+    :param y_axis_label: string description of y axis
+    :param title: title for the plot
+    :return: bokeh figure object
+    """
+    layer_weights_old_model = convert_pandas_data_frame_to_bokeh_column_data_source(data_before)
+    layer_weights_new_model = convert_pandas_data_frame_to_bokeh_column_data_source(data_after)
+    plot = figure(x_axis_label=x_axis_label, y_axis_label=y_axis_label,
+                  title=title,
+                  tools="pan, box_zoom, crosshair, reset, save",
+                  width_policy="max", sizing_mode='stretch_both', output_backend="webgl")
+    plot.line(x='index', y='min', line_width=2, line_color="#2171b5", legend_label="Minimum After Optimization",
+              source=layer_weights_new_model, name="new model")
+    plot.line(x='index', y='max', line_width=2, line_color="green", legend_label="Maximum After Optimization", source=layer_weights_new_model,
+              name="new model")
+    plot.line(x='index', y='mean', line_width=2, line_color="orange", legend_label="Mean After Optimization",
+              source=layer_weights_new_model, name="new model")
+
+    plot.line(x='index', y='min', line_width=2, line_color="#2171b5", line_dash='dotted',
+              legend_label="Minimum Before Optimization", source=layer_weights_old_model, name="old model")
+    plot.line(x='index', y='max', line_width=2, line_color="green", line_dash='dotted',
+              legend_label="Maximum Before Optimization", source=layer_weights_old_model, name="old model")
+    plot.line(x='index', y='mean', line_width=2, line_color="orange", line_dash='dotted',
+              legend_label="Mean Before Optimization", source=layer_weights_old_model, name="old model")
+
+    plot.varea(x=data_after.index,
+               y1=data_after['min'],
+               y2=data_before['min'], fill_alpha=0.3, legend_label="shaded region", name="new model")
+
+    plot.varea(x=data_after.index,
+               y1=data_after['max'],
+               y2=data_before['max'], fill_color="green", fill_alpha=0.3, legend_label="shaded region")
+
+    plot.varea(x=data_after.index,
+               y1=data_after['mean'],
+               y2=data_before['mean'], fill_color="orange", fill_alpha=0.3, legend_label="shaded region")
+
+    plot.legend.location = "top_left"
+    plot.legend.click_policy = "hide"
+    plot.legend.background_fill_alpha = 0.3
+
+    if not x_axis_label or not y_axis_label or not title:
+        layout = row(plot)
+        return layout
+
+    # display a tooltip whenever the cursor is vertically in line with a glyph
+    hover1 = HoverTool(tooltips=[("Output Channel", "$index"),
+                                 ("Mean Before Optimization", "@mean{0.00}"),
+                                 ("Minimum Before Optimization", "@min{0.00}"),
+                                 ("Maximum Before Optimization", "@max{0.00}"),
+                                 ("25 Percentile Before Optimization", "@{25%}{0.00}"),
+                                 ("75 Percentile Before Optimization", "@{75%}{0.00}")], names=['old model'],
+                       mode='mouse'
+                       )
+    hover2 = HoverTool(tooltips=[("Output Channel", "$index"),
+                                 ("Mean After Optimization", "@mean{0.00}"),
+                                 ("Minimum After Optimization", "@min{0.00}"),
+                                 ("Maximum After Optimization", "@max{0.00}"),
+                                 ("25 Percentile After Optimization", "@{25%}{0.00}"),
+                                 ("75 Percentile After Optimization", "@{75%}{0.00}")], names=['new model'],
+                       mode='mouse'
+                       )
+    plot.add_tools(hover1)
+    plot.add_tools(hover2)
+    style(plot)
+
+    layout = row(plot)
+    return layout
+
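+
+# Editorial example (added for illustration; not part of the AIMET source file).
+# A sketch of comparing one layer's per-channel weight statistics before and after
+# an optimization (for example cross-layer equalization). `sess_before`/`sess_after`
+# are assumed sessions holding the two model versions and `conv_op` the layer op.
+def _example_compare_weight_stats(conv_op, sess_before, sess_after, results_dir='./visualization'):
+    stats_before = pd.DataFrame(get_weights(conv_op, sess_before)).describe().T
+    stats_after = pd.DataFrame(get_weights(conv_op, sess_after)).describe().T
+    layout = line_plot_changes_in_summary_stats(stats_before, stats_after,
+                                                x_axis_label="Output channel",
+                                                y_axis_label="Summary statistics",
+                                                title="Weight ranges before/after optimization")
+    plotting.output_file(os.path.join(results_dir, 'weight_stats_comparison.html'))
+    plotting.save(layout)
+    return layout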
+
+def line_plot(x, y, x_axis_label, y_axis_label, title, x_range=None):
+    """
+    Plots a line through the given data points and returns the bokeh figure.
+    :param x: x coordinates of data points
+    :param y: y coordinates of data points
+    :param x_axis_label: string description of x axis
+    :param y_axis_label: string description of y axis
+    :param title: title for the plot
+    :param x_range: optional explicit x range, e.g. a list of categorical labels
+    :return: bokeh figure object
+    """
+    plot = figure(x_axis_label=x_axis_label, y_axis_label=y_axis_label,
+                  title=title,
+                  tools="pan, box_zoom, crosshair, reset, save",
+                  x_range=x_range,
+                  width=1500)
+    plot.line(x=x, y=y, line_width=2, line_color="#2171b5")
+    plot.circle(x=x, y=y, color="black", alpha=0.7, size=10)
+    if isinstance(x_range, list) and isinstance(x_range[0], str):
+        plot.xaxis.major_label_orientation = math.pi / 4
+
+    style(plot)
+
+    return plot
+
+
+def scatter_plot_summary_stats(data_frame, x_axis_label_mean="mean", y_axis_label_mean="standard deviation",
+                               title_mean="Mean vs Standard Deviation",
+                               x_axis_label_min="Minimum",
+                               y_axis_label_min="Maximum", title_min="Minimum vs Maximum"):
+    """
+    Creates a scatter plot, plotting min vs max, and mean vs std side by side.
+    :param data_frame: pandas data frame object
+    :param x_axis_label_mean: string description of x axis in plot showing mean vs std
+    :param y_axis_label_mean: string description of y axis in plot showing mean vs std
+    :param title_mean: title for the mean vs standard deviation plot
+    :param x_axis_label_min: string description of x axis in plot showing min vs max
+    :param y_axis_label_min: string description of y axis in plot showing min vs max
+    :param title_min: title for the minimum vs maximum plot
+    :return: tuple of two bokeh figures: (mean vs std plot, min vs max plot)
+    """
+    plot1 = figure(x_axis_label=x_axis_label_mean, y_axis_label=y_axis_label_mean,
+                   title=title_mean,
+                   tools="box_zoom, crosshair,reset", output_backend="webgl")
+    plot1.circle(x=data_frame['mean'], y=data_frame['std'], size=10, color="orange", alpha=0.4)
+
+    plot2 = figure(x_axis_label=x_axis_label_min, y_axis_label=y_axis_label_min,
+                   title=title_min,
+                   tools="box_zoom, crosshair,reset", output_backend="webgl")
+    plot2.circle(x=data_frame['min'], y=data_frame['max'], size=10, color="#2171b5", alpha=0.4)
+    style(plot1)
+    style(plot2)
+    # layout = row(plot1, plot2)
+    return plot1, plot2
+
+
+def box_plot_max_ranges(data_frame, output_channels_needed, x_label=None, y_label=None, title=None):
+    """
+    Creates a figure of boxplots for the output channels that are most sensitive to outliers.
+    :param data_frame: pandas dataframe object with one column per output channel
+    :param output_channels_needed: list of output channel (column) names to box-plot
+    :param x_label: string description of x axis
+    :param y_label: string description of y axis
+    :param title: title for the plot
+    :return: a bokeh boxplot figure for the selected output channels
+    """
+    # CLEAN THIS UP
+    data_frame.columns = data_frame.columns.map(str)
+    max_range_df = data_frame[output_channels_needed]
+    columns = list(max_range_df.columns)
+    plot = max_range_df.hvplot.box(y=columns, legend=False, invert=False, box_fill_alpha=0.5,
+                                   outlier_fill_color="red",
+                                   outlier_alpha=0.3, width=1200, height=600,
+                                   xlabel=x_label,
+                                   ylabel=y_label,
+                                   title=title)
+    bokeh_plot = hv.render(plot)
+
+    style(bokeh_plot)
+    return bokeh_plot
+
+
+def identify_max_range_columns(data_frame, described_df, num_columns=50):
+    """
+    Returns a list of columns with the maximum absolute ranges.
+    :param data_frame: pandas data frame
+    :param described_df: pandas data frame with summary statistics
+    :param num_columns: number of output channels to return
+    :return: list of output channels with maximum ranges.
+    """
+    data_frame.columns = data_frame.columns.map(str)
+    described_df['range'] = described_df['max'] - described_df['min']
+    described_df = described_df.sort_values(by=['range'], ascending=False)
+    output_channels_needed = described_df[:num_columns].index
+
+    output_channels_needed = [str(i) for i in output_channels_needed]
+    return output_channels_needed
+
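+
+# Editorial example (added for illustration; not part of the AIMET source file).
+# identify_max_range_columns() and box_plot_max_ranges() are designed to be used
+# together: first pick the output channels with the widest weight ranges, then
+# box-plot only those channels. `layer_weights_df` is an assumed DataFrame with
+# one column per output channel.
+def _example_boxplot_widest_channels(layer_weights_df, num_columns=20):
+    described = layer_weights_df.describe().T
+    widest_channels = identify_max_range_columns(layer_weights_df, described, num_columns=num_columns)
+    return box_plot_max_ranges(layer_weights_df, widest_channels,
+                               x_label="Output channel", y_label="Weight value",
+                               title="Output channels with the largest weight ranges")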
+
+def line_plot_summary_statistics_model(layer_name, layer_weights_data_frame, height, width):
+    """
+    Given a layer, plots min, max and mean of its weights per output channel as line plots.
+    :param layer_name: name of the layer, used in the plot title
+    :param layer_weights_data_frame: pandas data frame of per-output-channel summary statistics (min, max, mean)
+    :param height: height of the bokeh figure
+    :param width: width of the bokeh figure
+    :return: bokeh figure object
+    """
+    layer_weights = convert_pandas_data_frame_to_bokeh_column_data_source(layer_weights_data_frame)
+    plot = figure(x_axis_label="Output Channels", y_axis_label="Summary Statistics",
+                  title="Weight Ranges per Output Channel: " + layer_name,
+                  tools="pan, box_zoom, crosshair, reset, save",
+                  width=width, height=height, output_backend="webgl")
+    plot.line(x='index', y='min', line_width=2, line_color="#2171b5",
+              legend_label="Minimum", source=layer_weights)
+    plot.line(x='index', y='max', line_width=2, line_color="green",
+              legend_label="Maximum", source=layer_weights)
+    plot.line(x='index', y='mean', line_width=2, line_color="orange",
+              legend_label="Average", source=layer_weights)
+
+    plot.legend.location = "top_left"
+    plot.legend.click_policy = "hide"
+    plot.legend.background_fill_alpha = 0.3
+
+    plot.add_tools(HoverTool(tooltips=[("Output Channel", "$index"),
+                                       ("Mean", "@mean{0.00}"),
+                                       ("Min", "@min{0.00}"),
+                                       ("Max", "@max{0.00}"),
+                                       ("25 percentile", "@{25%}{0.00}"),
+                                       ("75 percentile", "@{75%}{0.00}")],
+                             # display a tooltip whenever the cursor is vertically in line with a glyph
+                             mode='mouse'
+                             ))
+    style(plot)
+    return plot
+
+
+def identify_problematic_output_channels(module_weights_data_frame_described):
+    """
+    Returns the output channels that have disproportionately large weight ranges.
+    :param module_weights_data_frame_described: pandas data frame of summary statistics, one row per output channel
+    :return: tuple of (outlier output channels, relative weight range per output channel)
+    """
+    # data_frame.columns = data_frame.columns.map(str)
+    module_weights_data_frame_described['range'] = module_weights_data_frame_described['max'] - \
+                                                   module_weights_data_frame_described['min']
+    module_weights_data_frame_described["abs range"] = module_weights_data_frame_described["range"].abs()
+    variable = module_weights_data_frame_described["abs range"].min()
+    module_weights_data_frame_described["relative range"] = module_weights_data_frame_described["abs range"] / variable
+    described_df = module_weights_data_frame_described.sort_values(by=['relative range'], ascending=False)
+    all_output_channel_ranges = described_df["relative range"]
+    output_channels_needed = detect_outlier_channels(all_output_channel_ranges)
+
+    return output_channels_needed, all_output_channel_ranges
+
+
+def detect_outlier_channels(data_frame_with_relative_ranges):
+    """
+    Detects outliers for relative weight ranges.
+    :param data_frame_with_relative_ranges: pandas series of relative weight ranges, named "relative range"
+    :return: list of output channels whose relative weight range exceeds Q3 + 1.5 * IQR
+    """
+    Q1 = data_frame_with_relative_ranges.quantile(0.25)
+    Q3 = data_frame_with_relative_ranges.quantile(0.75)
+    IQR = Q3 - Q1
+    v = (data_frame_with_relative_ranges > (Q3 + 1.5 * IQR))
+    v_df = v.to_frame()
+    keep_only_outliers = v_df.loc[v_df['relative range']]
+    output_channels_list = keep_only_outliers.index
+    return output_channels_list
+
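+
+# Editorial example (added for illustration; not part of the AIMET source file).
+# A tiny self-contained check of the 1.5 * IQR rule used above: channel 4 has a
+# relative range far beyond Q3 + 1.5 * IQR and is therefore flagged as an outlier.
+def _example_detect_outliers():
+    relative_ranges = pd.Series([1.0, 1.1, 1.2, 1.3, 9.0], name="relative range")
+    return detect_outlier_channels(relative_ranges)   # -> index containing [4]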
+
+def add_vertical_line_to_figure(x_coordinate, figure_object):
+    """
+    adds a vertical line to a bokeh figure object
+    :param x_coordinate: x_coordinate to add line
+    :param figure_object: bokeh figure object
+    :return: None
+    """
+    # Vertical line
+    vertical_line = Span(location=x_coordinate, dimension='height', line_color='red', line_width=1)
+    figure_object.add_layout(vertical_line)
+
+
+def histogram(data_frame, column_name, num_bins, x_label=None, y_label=None, title=None):
+    """
+    Creates a histogram of the column in the input data frame.
+    :param data_frame: pandas data frame
+    :param column_name: column in data frame
+    :param num_bins: number of bins to divide data into for histogram
+    :param x_label: string description of x axis
+    :param y_label: string description of y axis
+    :param title: title for the plot
+    :return: bokeh figure object
+    """
+    hv_plot_object = data_frame.hvplot.hist(column_name, bins=num_bins, height=400, tools="", xlabel=x_label,
+                                            ylabel=y_label,
+                                            title=title, fill_alpha=0.5)
+
+    bokeh_plot = hv.render(hv_plot_object)
+    style(bokeh_plot)
+    return bokeh_plot
+
+
+def convert_pandas_data_frame_to_bokeh_data_table(data):
+    """
+    Converts a pandas data frame to a bokeh DataTable so that it can be displayed on a bokeh plot
+    :param data: pandas data frame
+    :return: bokeh layout containing the titled data table
+    """
+    data["index"] = data.index
+    data = data[['index'] + data.columns[:-1].tolist()]
+
+    data.columns.map(str)
+    source = ColumnDataSource(data=data)
+    columns = [TableColumn(field=column_str, title=column_str) for column_str in data.columns]  # bokeh columns
+    data_table = DataTable(source=source, columns=columns)
+    layout = add_title(data_table, "Table Summarizing Weight Ranges")
+    return layout
+
+
+def convert_pandas_data_frame_to_bokeh_column_data_source(data):
+    """
+    Converts a pandas data frame to a bokeh column data source object so that it can be pushed to a server document
+    :param data: pandas data frame
+    :return: bokeh ColumnDataSource that can be pushed to a bokeh server document
+    """
+    data["index"] = data.index
+    data = data[['index'] + data.columns[:-1].tolist()]
+
+    data.columns.map(str)
+    source = ColumnDataSource(data=data)
+    return source
+
+
+def add_title(layout, title):
+    """
+    Add a title above the given layout.
+    :param layout: bokeh layout or widget to wrap
+    :param title: title text to display
+    :return: layout wrapped with title div.
+    """
+    text_str = "<b>" + title + "</b>"
+    wrap_layout_with_div = column(Div(text=text_str), layout)
+    return wrap_layout_with_div
+
+
+
[docs]def visualize_weight_ranges_single_layer(sess, layer, results_dir): + """ + Given a layer, visualizes weight ranges with scatter plots and line plots + + :param sess: tf.compat.v1.Session + :param layer: layer with weights + :param results_dir: Directory to save the Bokeh plots + :return: Bokeh plot + """ + + file_path = os.path.join(results_dir, 'visualize_weight_ranges_single_layer.html') + plotting.output_file(file_path) + + layer_weights = pd.DataFrame(get_weights(layer, sess)) + layer_name = layer.name + layer_weights_summary_statistics = layer_weights.describe().T + + scatter_plot_mean, scatter_plot_min = scatter_plot_summary_stats(layer_weights_summary_statistics, + x_axis_label_mean="Mean Weights Per Output Channel", + y_axis_label_mean="Std Per Output Channel", + title_mean="Mean vs Standard Deviation: " + layer_name, + x_axis_label_min="Min Weights Per Output Channel", + y_axis_label_min="Max Weights Per Output Channel", + title_min="Minimum vs Maximum: " + layer_name) + + scatter_plots_layout = row(scatter_plot_mean, scatter_plot_min) + line_plots = line_plot_summary_statistics_model(layer_name=layer_name, + layer_weights_data_frame=layer_weights_summary_statistics, + width=1500, height=700) + layout = column(scatter_plots_layout, line_plots) + layout_with_title = add_title(layout, layer_name) + plotting.save(layout_with_title) + return layout_with_title
+ + +
[docs]def visualize_relative_weight_ranges_single_layer(sess, layer, results_dir): + """ + + Publishes a line plot showing weight ranges for each layer, summary statistics + for relative weight ranges, and a histogram showing weight ranges of output channels + + :param sess: tf.compat.v1.Session + :param layer: layer with weights + :param results_dir: Directory to save the Bokeh plots + :return: bokeh plot + + """ + + # pylint: disable=too-many-locals + file_path = os.path.join(results_dir, 'visualize_relative_weight_ranges_single_layer.html') + plotting.output_file(file_path) + + layer_weights_data_frame = pd.DataFrame(get_weights(layer, sess)).describe().T + layer_name = layer.name + plot = line_plot_summary_statistics_model(layer_name, layer_weights_data_frame, width=1150, height=700) + + # list of problematic output channels, data frame containing magnitude of range in each output channel + problematic_output_channels, output_channel_ranges_data_frame = identify_problematic_output_channels( + layer_weights_data_frame) + + histogram_plot = histogram(output_channel_ranges_data_frame, "relative range", 75, + x_label="Weight Range Relative to Smallest Output Channel", + y_label="Count", + title="Relative Ranges For All Output Channels") + output_channel_ranges_data_frame = output_channel_ranges_data_frame.describe().T.to_frame() + output_channel_ranges_data_frame = output_channel_ranges_data_frame.drop("count") + + output_channel_ranges_as_column_data_source = convert_pandas_data_frame_to_bokeh_data_table( + output_channel_ranges_data_frame) + + # add vertical lines to highlight problematic channels + for channel in problematic_output_channels: + add_vertical_line_to_figure(channel, plot) + + column_layout = column(histogram_plot, output_channel_ranges_as_column_data_source) + layout = row(plot, column_layout) + layout_with_title = add_title(layout, layer_name) + + plotting.save(layout_with_title) + return layout_with_title
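+
+# Editorial example (added for illustration; not part of the AIMET source file).
+# A minimal sketch of calling both visualization entry points above for a single
+# layer. `sess` is an assumed tf.compat.v1.Session and the op name is a placeholder.
+def _example_visualize_layer(sess, results_dir='./visualization',
+                             op_name='resnet_v1_18/conv1/Conv2D'):
+    conv_op = sess.graph.get_operation_by_name(op_name)
+    visualize_weight_ranges_single_layer(sess, conv_op, results_dir)
+    visualize_relative_weight_ranges_single_layer(sess, conv_op, results_dir)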
diff --git a/releases/1.32.2/_modules/aimet_tensorflow/quant_analyzer.html b/releases/1.32.2/_modules/aimet_tensorflow/quant_analyzer.html new file mode 100644 index 00000000..344d5fb5 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/quant_analyzer.html @@ -0,0 +1,1690 @@
aimet_tensorflow.quant_analyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2
Source code for aimet_tensorflow.quant_analyzer

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Quant Analyzer """
+
+import os
+from typing import List, Tuple, Dict
+import tensorflow.compat.v1 as tf
+from aimet_common.defs import QuantScheme
+from aimet_common.quant_analyzer import save_json, export_per_layer_sensitivity_analysis_plot,\
+    create_and_export_min_max_ranges_plot, export_per_layer_mse_plot, export_stats_histogram_plot
+from aimet_common.utils import AimetLogger, CallbackFunc
+from aimet_tensorflow.common.operation import Op
+from aimet_tensorflow.utils.common import create_input_feed_dict, iterate_tf_dataset
+from aimet_tensorflow.quantizer_info import QuantizerInfo
+from aimet_tensorflow.quantsim import QuantizationSimModel
+from aimet_tensorflow import batch_norm_fold
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+
+DEFAULT_BOKEH_FIGURE_HEIGHT = 300
+
+
[docs]class QuantAnalyzer: + """ + QuantAnalyzer tool provides + 1) Model sensitivity to weight and activation quantization + 2) Per layer encoding (min - max range) and PDF analysis + 3) Per op sensitivity analysis + 4) Per op MSE analysis + + """ + + def __init__(self, session: tf.compat.v1.Session, start_op_names: List[str], output_op_names: List[str], + forward_pass_callback: CallbackFunc, eval_callback: CallbackFunc, use_cuda: bool = True): + """ + :param session: The input model as session to add quantize ops to + :param start_op_names: List of starting op names of the model + :param output_op_names: List of output op names of the model + :param forward_pass_callback: A callback function that is expected to run forward passes on a session. + This callback function should use representative data for the forward pass, so the calculated + encodings work for all data samples. This callback internally chooses the number of data samples + it wants to use for calculating encodings. + :param eval_callback: A callback function for model evaluation that determines model + performance. This callback function is expected to return scalar value + representing the model performance evaluated against entire test/evaluation dataset. + :param use_cuda: If True, places quantization ops on GPU. Defaults to True + """ + if not isinstance(forward_pass_callback, CallbackFunc): + raise ValueError('forward_pass_callback and its argument(s) are not encapsulated by CallbackFunc class.') + if not isinstance(eval_callback, CallbackFunc): + raise ValueError('eval_callback and its argument(s) are not encapsulated by CallbackFunc class.') + + self._session = session + self._start_op_names = start_op_names + self._output_op_names = output_op_names + self._forward_pass_callback = forward_pass_callback + self._eval_callback = eval_callback + self._use_cuda = use_cuda + self._default_output_bw = None + self._default_param_bw = None + self._unlabeled_dataset = None + self._num_batches = None + + # pylint: disable=too-many-arguments +
[docs] def analyze(self, + quant_scheme: QuantScheme = QuantScheme.post_training_tf_enhanced, + rounding_mode: str = 'nearest', + default_param_bw: int = 8, + default_output_bw: int = 8, + config_file: str = None, + unlabeled_dataset: tf.compat.v1.data.Dataset = None, + num_batches: int = None, + results_dir: str = "./tmp/"): + """ + Analyze model for quantization and point out sensitive parts/hotspots of the model by performing + 1) model sensitivity to quantization + 2) export per layer encoding (min - max range) + 3) export per layer statistics histogram (PDF) when quant scheme is TF-Enhanced + 4) perform per op sensitivity analysis by enabling and disabling quant ops + 5) per op MSE loss between fp32 and quantized output activations + + :param quant_scheme: Quantization Scheme, currently supported schemes are post_training_tf and + post_training_tf_enhanced, defaults to post_training_tf_enhanced + :param rounding_mode: The round scheme to used. One of: 'nearest' or 'stochastic', defaults to 'nearest' + :param default_param_bw: bitwidth to use for parameter tensors, defaults to 8 + :param default_output_bw: bitwidth to use for activation tensors, defaults to 8 + :param config_file: Path to a config file to use to specify rules for placing quant ops in the model + :param results_dir: Directory to save the results. + :param unlabeled_dataset: Unlabeled TF dataset + Used in per op MSE loss calculation + :param num_batches: Number of batches. Approximately 256 samples/images are recommended, + so if batch size of data loader is 64, then 4 number of batches leads to 256 samples/images + Used in per op MSE loss calculation + """ + self._unlabeled_dataset = unlabeled_dataset + self._num_batches = num_batches + self._default_param_bw = default_param_bw + self._default_output_bw = default_output_bw + sim = self._create_quantsim_and_encodings(quant_scheme, rounding_mode, config_file) + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + # Check model sensitivity to weight and activation quantization individually. + self._check_model_sensitivity_to_quantization(sim) + + # Export encoding min-max range. + self._export_per_layer_encoding_min_max_range(sim, results_dir) + + # Export PDF of statistics. + if quant_scheme == QuantScheme.post_training_tf_enhanced: + self._export_per_layer_stats_histogram(sim, results_dir) + + # Perform per op analysis by enabling each quant op (OPTION-1). + self._perform_per_op_analysis_by_enabling_quant_ops(sim, results_dir) + + # Perform per op analysis by disabling each quant op (OPTION-2). + self._perform_per_op_analysis_by_disabling_quant_ops(sim, results_dir) + + # Perform per op MSE loss between fp32 and quantized output activations. + if self._unlabeled_dataset and self._num_batches: + self._perform_per_op_mse_loss(sim, results_dir)
+ + def _create_quantsim_and_encodings(self, quant_scheme: QuantScheme, rounding_mode: str, + config_file: str) -> QuantizationSimModel: + """" + Create Quantsim and compute encodings. + + :param quant_scheme: Quantization Scheme + :param rounding_mode: The round scheme to used + :param config_file: Path to a config file + :return: Quantsim model + """ + bn_folded_sess, _ = batch_norm_fold.fold_all_batch_norms(self._session, input_op_names=self._start_op_names, + output_op_names=self._output_op_names) + self._session = bn_folded_sess + quant_sim_model = QuantizationSimModel(session=bn_folded_sess, + starting_op_names=self._start_op_names, + output_op_names=self._output_op_names, + quant_scheme=quant_scheme, rounding_mode=rounding_mode, + default_output_bw=self._default_output_bw, + default_param_bw=self._default_param_bw, + use_cuda=self._use_cuda, + config_file=config_file) + quant_sim_model.compute_encodings(forward_pass_callback=self._forward_pass_callback.func, + forward_pass_callback_args=self._forward_pass_callback.args) + + return quant_sim_model + + def _check_model_sensitivity_to_quantization(self, sim: QuantizationSimModel) -> Tuple[float, float, float]: + """ + Perform the sensitivity analysis to weight and activation quantization + individually. + + :param sim: Quantsim model. + :return: FP32 eval score, weight-quantized eval score, act-quantized eval score. + """ + fp32_eval_score = self._eval_model(self._session) + _logger.info("FP32 eval score (W32A32): %f", fp32_eval_score) + + act_quantized_eval_score = self._eval_activation_quantized_model(sim) + _logger.info("Activation-quantized eval score (W32A%d): %f", self._default_output_bw, + act_quantized_eval_score) + + weight_quantized_eval_score = self._eval_weight_quantized_model(sim) + _logger.info("Weight-quantized eval score (W%dA32): %f", self._default_param_bw, + weight_quantized_eval_score) + + return fp32_eval_score, weight_quantized_eval_score, act_quantized_eval_score + + def _eval_model(self, session: tf.compat.v1.Session) -> float: + """ + Evaluate the model performance. + + :param session: TensorFlow session to be evaluated. + :return: Scaler value representing model performance. + """ + return self._eval_callback.func(session, self._eval_callback.args) + + def _eval_weight_quantized_model(self, sim): + """ + Evaluate weight quantized model performance. + For weight quantized model performance, disable enabled activation quantizers, measure + eval score and enable again. + + :param sim: Quantsim model. + :return: Quantized model performance. + """ + enabled_activation_quantizers = sim.get_enabled_activation_quantizers() + self._enable_disable_quantizers(enabled_activation_quantizers, enabled=False) + eval_score = self._eval_model(sim.session) + self._enable_disable_quantizers(enabled_activation_quantizers, enabled=True) + return eval_score + + def _eval_activation_quantized_model(self, sim): + """ + Evaluate activation quantized model performance. + For activation quantized model performance, disable enabled param quantizers, measure + eval score and enable again. + + :param sim: Quantsim model. + :return: Quantized model performance. 
+ """ + enabled_param_quantizers = sim.get_enabled_parameter_quantizers() + self._enable_disable_quantizers(enabled_param_quantizers, enabled=False) + eval_score = self._eval_model(sim.session) + self._enable_disable_quantizers(enabled_param_quantizers, enabled=True) + return eval_score + + def _perform_per_op_analysis_by_enabling_quant_ops(self, + sim: QuantizationSimModel, + results_dir: str = "./tmp/", + ) -> Dict: + """ + 1. All activations and parameters quantizers are disabled. + 2. For every activations and parameters quantizers: + i. Quantizer is enabled + ii. Measure and record eval score on subset of dataset. + iii. Disable enabled quantizer in step i. + 3. Returns dictionary containing quant op name and corresponding eval score. + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return: op wise eval score dictionary. dict[op_name] = eval_score + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + _logger.info("\nOPTION-1:\nAll the quant ops are disabled.\n" + "Starting per-op analysis by enabling quant ops as per config file.") + op_wise_eval_score_dict = self._perform_per_op_analysis(sim, + disable_all_quantizers=True, + enabled_before=True, + enabled_after=False) + export_per_layer_sensitivity_analysis_plot(op_wise_eval_score_dict, + results_dir, + title="per_op_quant_enabled") + save_json(op_wise_eval_score_dict, + results_dir, + title="per_op_quant_enabled.json") + return op_wise_eval_score_dict + + def _perform_per_op_analysis_by_disabling_quant_ops(self, + sim: QuantizationSimModel, + results_dir: str = "./tmp/", + ) -> Dict: + """ + 1. All activations and parameters quantizers are enabled as per JSON config file. + 2. For every activations and parameters quantizers: + i. Quantizer is disabled + ii. Measure and record eval score on subset of dataset. + iii. Enable disabled quantizer in step i. + 3. Returns dictionary containing quant op name and corresponding eval score. + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return: op wise eval score dictionary. dict[op_name] = eval_score + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + _logger.info("\nOPTION-2:\nAll the quant ops are enabled as per config file.\n" + "Starting per-op analysis by disabling quant ops.") + op_wise_eval_score_dict = self._perform_per_op_analysis(sim, + disable_all_quantizers=False, + enabled_before=False, + enabled_after=True) + export_per_layer_sensitivity_analysis_plot(op_wise_eval_score_dict, + results_dir, + title="per_op_quant_disabled") + save_json(op_wise_eval_score_dict, + results_dir, + title="per_op_quant_disabled.json") + return op_wise_eval_score_dict + + def _perform_per_op_analysis(self, + sim: QuantizationSimModel, + disable_all_quantizers: bool, + enabled_before: bool, + enabled_after: bool, + ) -> Dict: + """ + Helper function for perform_per_op_analysis_by_enabling_quant_ops() and + perform_per_op_analysis_by_disabling_quant_ops() + + :param sim: Quantsim model. + :param disable_all_quantizers: Flag to disable all the quantizers before per-op analysis. + :param enabled_before: Flag to set enabled for quantizers before computing encodings. + :param enabled_after: Flag to set enabled for quantizers after computing encodings. + :return: op wise eval score dictionary. dict[conn_graph_op] = eval_score. 
+ """ + + enabled_quant_ops = self._get_enabled_quantizer_groups(sim) + + if disable_all_quantizers: + for quantizer_group_list in enabled_quant_ops.values(): + if quantizer_group_list: + self._enable_disable_quantizers(quantizer_group_list, enabled=False) + + eval_score_dict = {} + for conn_graph_op, quantizer_info_list in enabled_quant_ops.items(): + if quantizer_info_list: + conn_graph_op = str(conn_graph_op) + self._enable_disable_quantizers(quantizer_info_list, enabled=enabled_before) + + # Record eval score. + eval_score_dict[conn_graph_op] = self._eval_model(sim.session) + _logger.info("For connected graph op: %s, the eval score is: %f", conn_graph_op, eval_score_dict[conn_graph_op]) + + self._enable_disable_quantizers(quantizer_info_list, enabled=enabled_after) + + if disable_all_quantizers: + for quantizer_group_list in enabled_quant_ops.values(): + if quantizer_group_list: + self._enable_disable_quantizers(quantizer_group_list, enabled=True) + + return eval_score_dict + + # pylint: disable=too-many-locals + def _perform_per_op_mse_loss(self, + sim: QuantizationSimModel, + results_dir: str, + ) -> Dict: + """ + MSE loss computation between fp32 and quantized output activations for each op. + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return op wise MSE loss. dict[op_name] = MSE loss. + """ + + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + # pylint: disable=protected-access + output_op_names = QuantizationSimModel._get_ops_to_quantize_activations_for(self._session.graph, sim.connected_graph) + mse_loss_dict = {} + + for _, output_op_name in enumerate(output_op_names): + mse_loss_dict[output_op_name] = self._compute_mse_loss(sim, output_op_name) + + export_per_layer_mse_plot(mse_loss_dict, + results_dir, + title="per_op_mse_loss") + save_json(mse_loss_dict, results_dir, title="per_op_mse_loss.json") + _logger.info("Exported per op MSE loss plot.") + return mse_loss_dict + + def _compute_mse_loss(self, sim: QuantizationSimModel, output_op_name) -> float: + """ + Compute MSE loss between fp32 and quantized output activations for each batch, add for + all the batches and return averaged mse loss. + :param sim: Quantsim model. + :param output_op_name: Output op name. + :return: MSE loss between fp32 and quantized output activations. 
+ """ + total = 0 + loss = 0.0 + mse_loss = tf.keras.losses.MeanSquaredError() + iterator = iterate_tf_dataset(self._unlabeled_dataset) + for _ in range(self._num_batches): + try: + model_inputs = next(iterator) + except StopIteration: + raise ValueError(f'Can not fetch {self._num_batches} batches from dataset') # pylint: disable=raise-missing-from + + # Collect output activation data from original op + feed_dict = create_input_feed_dict(self._session.graph, self._start_op_names, model_inputs) + orig_op = self._session.graph.get_operation_by_name(output_op_name) + orig_out_data = self._session.run(orig_op.outputs[0], feed_dict=feed_dict) + + # Collect output activation data from quant sim op + feed_dict = create_input_feed_dict(sim.session.graph, self._start_op_names, model_inputs) + quant_op = sim.session.graph.get_operation_by_name(output_op_name + "_quantized") + quantized_out_data = sim.session.run(quant_op.outputs[0], feed_dict=feed_dict) + + # Calculate MSE loss + mse = mse_loss(orig_out_data, quantized_out_data) + with tf.compat.v1.Session().as_default(): + loss += mse.eval() + total += orig_out_data.shape[0] + return loss/total + + @staticmethod + def _get_enabled_quantizer_groups(sim: QuantizationSimModel)-> Dict[Op, List[QuantizerInfo]]: + """ + For given quantsim model, get all enabled activation and parameter quantizers. + :param sim: Quantsim model. + :return: Dictionary which maps a connected graph op to a list of enabled quantizer info in it. + """ + enabled_quantizers_dict = {} + # pylint: disable=protected-access + for conn_graph_op, quantizer_group in sim._op_to_quant_ops_dict.items(): + group = [] + # pylint: disable=protected-access + param_quant_op_dict, act_quant_op = quantizer_group + activation_quantize_info = sim._activation_quantizers.get(act_quant_op.name) + if activation_quantize_info.enabled: + group.append(activation_quantize_info) + for param_op_set in param_quant_op_dict.values(): + for param_op in param_op_set: + # pylint: disable=protected-access + param_quantize_info = sim._param_quantizers.get(param_op.name) + if param_quantize_info.enabled: + group.append(param_quantize_info) + enabled_quantizers_dict[conn_graph_op] = group + return enabled_quantizers_dict + + @staticmethod + def _enable_disable_quantizers(quantizer_list: List[QuantizerInfo], enabled: bool): + """ + For given list of quantizers, set (enable/disable) quantizer's enabled. + + :param quantizer_list: List of quantizers. + :param enabled: Enabled flag. + """ + for quantizer_info in quantizer_list: + if enabled: + quantizer_info.enable_keeping_encoding() + else: + quantizer_info.enabled = enabled + + def _export_per_layer_stats_histogram(self, sim: QuantizationSimModel, + results_dir: str = "./tmp/"): + """ + NOTE: Not to invoke when quantization scheme is not TF-Enhanced. + + Export histogram that represents a PDF of collected statistics by a quantizer for every + quant op. After invoking this API, results_dir should have html files in following + format for every quantizers of quant ops. + + -results_dir + -activations_pdf + quant_op_name.html + -weights_pdf + -quant_op_name + quant_op_name_{channel_index}.html + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. 
+ """ + # pylint: disable=protected-access + weights_pdf_dir = os.path.join(results_dir, "weights_pdf") + activations_pdf_dir = os.path.join(results_dir, "activations_pdf") + + for quant_op_name, quantizer_info in sim._activation_quantizers.items(): + quant_op_name = quant_op_name.replace("/", "_") + if quantizer_info.is_encoding_valid(): + self._create_and_export_stats_histogram_plot(quantizer_info, + activations_pdf_dir, + title=f"{quant_op_name}") + for quant_op_name, quantizer_info in sim._param_quantizers.items(): + quant_op_name = quant_op_name.replace("/", "_") + if quantizer_info.is_encoding_valid(): + self._create_and_export_stats_histogram_plot(quantizer_info, + os.path.join(weights_pdf_dir, quant_op_name), + title=f"{quant_op_name}") + + _logger.info("Exported per layer stats histogram.") + + # pylint: disable=no-self-use + def _export_per_layer_encoding_min_max_range(self, sim: QuantizationSimModel, + results_dir: str = "./tmp/" + ) -> Tuple[Dict, Dict]: + """ + Export encoding min and max range for all weights and activations. results_dir should have + html files in following format. + + -results_dir + -activations.html + -weights.html + + If per channel quantization(PCQ) is enabled then, + + -results_dir + -activations.html + -{quant_op_name}_{param_name}.html + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return: layer wise min-max range for weights and activations. + """ + # pylint: disable=protected-access + min_max_ranges_dir = os.path.join(results_dir, "min_max_ranges") + + min_max_range_for_activations_dict = {} + min_max_range_for_weights_dict = {} + for quant_op_name, quantizer_info in sim._activation_quantizers.items(): + quant_op_name = quant_op_name.replace("/", "_") + if quantizer_info.enabled: + encoding = quantizer_info.get_encoding() + min_max_range_for_activations_dict[quant_op_name] = (encoding.min, encoding.max) + + for quant_op_name, quantizer_info in sim._param_quantizers.items(): + quant_op_name = quant_op_name.replace("/", "_") + if quantizer_info.enabled: + encoding = quantizer_info.get_encoding() + if isinstance(encoding, List): # per-channel + per_channel_encodings = {} + for index, enc in enumerate(encoding): + per_channel_encodings[f"{quant_op_name}_{index}"] = (enc.min, enc.max) + min_max_range_for_weights_dict[quant_op_name] = per_channel_encodings + else: # per-tensor + min_max_range_for_weights_dict[quant_op_name] = (encoding.min, encoding.max) + + create_and_export_min_max_ranges_plot(min_max_range_for_weights_dict, + min_max_ranges_dir, + title="weights") + create_and_export_min_max_ranges_plot(min_max_range_for_activations_dict, + min_max_ranges_dir, + title="activations") + save_json(min_max_range_for_weights_dict, min_max_ranges_dir, title="weights.json") + save_json(min_max_range_for_activations_dict, min_max_ranges_dir, title="activations.json") + _logger.info("Exported per layer encoding min-max ranges.") + return min_max_range_for_weights_dict, min_max_range_for_activations_dict + + + def _create_and_export_stats_histogram_plot(self, quantizer_info: QuantizerInfo, + results_dir: str, + title: str): + """ + For given quantizer, create and export histogram (PDF) of statistics in html format. + + :param quantizer_info: Quantizer. + :param results_dir: Directory to save the results. + :param title: Title of the plot. 
+ """ + os.makedirs(results_dir, exist_ok=True) + + histograms = quantizer_info.get_stats_histogram() + encodings = quantizer_info.get_encoding() + if not isinstance(encodings, List): + encodings = [encodings] + + for index, (histogram, encoding) in enumerate(zip(histograms, encodings)): + export_stats_histogram_plot(histogram, encoding, results_dir, title=f"{title}_{index}")
diff --git a/releases/1.32.2/_modules/aimet_tensorflow/quantsim.html b/releases/1.32.2/_modules/aimet_tensorflow/quantsim.html new file mode 100644 index 00000000..67860f9d --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/quantsim.html @@ -0,0 +1,2762 @@
aimet_tensorflow.quantsim — AI Model Efficiency Toolkit Documentation: ver 1.32.2
Source code for aimet_tensorflow.quantsim

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2020-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Implementation for simulating models running on Quantized hardware """
+
+from typing import List, Union, Dict, Callable, Any, Tuple
+import os
+import shutil
+import json
+import numpy as np
+import tensorflow as tf
+from tensorflow.python.framework import ops as tf_ops
+from packaging import version  # pylint: disable=wrong-import-order
+
+import aimet_common.libpymo as libpymo
+import aimet_common.libaimet_tf_ops as qcops
+from aimet_common.defs import QuantScheme, QuantizationDataType
+from aimet_common.quantsim import encoding_version, validate_quantsim_inputs, \
+    recompute_grid_params, extract_global_quantizer_args
+from aimet_common.quant_utils import get_conv_accum_bounds
+from aimet_common.utils import AimetLogger, save_json_yaml
+from aimet_tensorflow import graph_editor
+from aimet_tensorflow.utils.common import update_variables_with_values, save_data_to_pickle_file, \
+    load_data_from_pickle_file, get_valid_ops
+from aimet_tensorflow import utils
+from aimet_tensorflow.utils import transformer_utils
+from aimet_tensorflow.utils.constants import QuantizeOpIndices
+from aimet_tensorflow.utils.op.embedding import get_embedding_params_using_patterns
+from aimet_tensorflow.utils.quantsim import create_op_to_quant_ops_dict, is_op_quantizable, \
+    get_time_steps_tensor_from_rnn_inner_ops, swap_last_two_dim
+from aimet_tensorflow.utils.graph import updated_graph_flow_context_to_loop_context, set_graph_flow_context, \
+    op_not_in_loop_control_flow_context
+from aimet_tensorflow.common.connectedgraph import ConnectedGraph
+from aimet_tensorflow.defs import ParameterInfo
+from aimet_tensorflow.quantizer_info import QuantizerInfo, QuantizerType, quant_scheme_to_libpymo
+from aimet_tensorflow.quantsim_config.quantsim_config import QuantSimConfigurator
+from aimet_tensorflow.quantsim_recurrent import _select_simple_rnn_internal_ops_to_quantize, \
+    _select_lstm_internal_ops_to_quantize, SUPPORTED_RECURRENT_TYPES
+
+from aimet_tensorflow.keras.defs import AxisHandling
+from aimet_tensorflow.keras.utils.common import create_encoding_from_dict
+
+# this is required to associate gradient with QcQuantize op
+from aimet_tensorflow import quantsim_straight_through_grad      # pylint: disable=unused-import
+
+
+# pylint: disable=too-many-lines
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+WORKING_DIR = '/tmp/quantsim/'
+
+
+# Op types which we will not place quantize ops after
+op_types_to_ignore = {'branch', 'Flatten', 'Shape', 'Identity', 'Reshape', 'Transpose', 'ResourceGather', 'Tile'}
+
+# Connected graph types to ignore parameter quantization
+param_quant_conn_op_ignore_list = {'FusedBatchNorm', 'FusedBatchNormV3', 'BatchNorm'}
+
+DTYPES_QUANTIZE_NOT_REQUIRED = [tf.dtypes.int8, tf.dtypes.uint8, tf.dtypes.int16, tf.dtypes.uint16,
+                                tf.dtypes.int32, tf.dtypes.uint32, tf.dtypes.int64, tf.dtypes.uint64,
+                                tf.bool, tf.dtypes.string]
+
+class PickleableQuantSimState:
+    """
+    State variables to be saved while pickling
+    """
+    def __init__(self, quant_scheme, rounding_mode, use_cuda,
+                 param_quantizer_dict, activation_quantizer_dict):
+        """
+        class type to save pickle-able info pertaining to quantsim config
+        :param quant_scheme: quant scheme
+        :param rounding_mode: rounding mode
+        :param use_cuda: flag to indicate usage of GPU
+        :param param_quantizer_dict: param quantizers dictionary
+        :param activation_quantizer_dict: activation quantizers dictionary
+        """
+
+        self.quant_scheme = quant_scheme
+        self.rounding_mode = rounding_mode
+        self.use_cuda = use_cuda
+        self.param_quantizers = param_quantizer_dict
+        self.activation_quantizers = activation_quantizer_dict
+
+
+
[docs]class QuantizationSimModel: + + """ + Creates a QuantSim model by adding quantization simulations ops to a given model. + + This enables + + #. off-target simulation of inference accuracy + #. the model to be fine-tuned to counter the effects of quantization + + """ + # pylint: disable=too-many-arguments + # pylint: disable=too-many-instance-attributes + def __init__(self, session: tf.compat.v1.Session, starting_op_names: List[str], output_op_names: List[str], + quant_scheme: Union[str, QuantScheme] = 'tf_enhanced', rounding_mode: str = 'nearest', + default_output_bw: int = 8, default_param_bw: int = 8, use_cuda: bool = True, config_file: str = None, + default_data_type: QuantizationDataType = QuantizationDataType.int): + """ + :param session: The input model as session to add quantize ops to + :param starting_op_names: List of starting op names of the model + :param output_op_names: List of output op names of the model + :param quant_scheme: Quantization Scheme, currently supported schemes are post_training_tf and + post_training_tf_enhanced, defaults to post_training_tf_enhanced + :param rounding_mode: The round scheme to used. One of: 'nearest' or 'stochastic', defaults to 'nearest'. + :param default_output_bw: bitwidth to use for activation tensors, defaults to 8 + :param default_param_bw: bitwidth to use for parameter tensors, defaults to 8 + :param use_cuda: If True, places quantization ops on GPU. Defaults to True + :param config_file: Path to a config file to use to specify rules for placing quant ops in the model + :param default_data_type: Default data type to use for quantizing all layer parameters. + Possible options are QuantizationDataType.int and QuantizationDataType.float. + Note that the mode default_data_type=QuantizationDataType.float is only supported with + default_output_bw=16 and default_param_bw=16 + + :returns: An object which can be used to perform quantization on a tensorflow graph + :raises: ValueError: An error occurred processing one of the input parameters. 
+ + """ + # sanity checks + validate_quantsim_inputs(quant_scheme, + rounding_mode, + default_output_bw, + default_param_bw, + default_data_type) + + self.session = session + + if isinstance(quant_scheme, str): + quant_scheme_lookup = {'tf': QuantScheme.post_training_tf, + 'tf_enhanced': QuantScheme.post_training_tf_enhanced, + 'percentile': QuantScheme.post_training_percentile} + quant_scheme = quant_scheme_lookup[quant_scheme] + self._quant_scheme = quant_scheme + self._rounding_mode = rounding_mode + self._use_cuda = use_cuda + self._param_quantizers = {} + self._activation_quantizers = {} + self._default_output_bw = default_output_bw + self._default_param_bw = default_param_bw + self._percentile_value = 100 # default percentile value + self._op_to_quant_ops_dict = {} + self.connected_graph = ConnectedGraph(self.session.graph, starting_op_names, output_op_names) + + # We save a copy of the original model (to be used during export later) + with self.session.graph.as_default(): + saver = tf.compat.v1.train.Saver() + saver.save(self.session, save_path=WORKING_DIR+'orig_model_before_quantsim') + self._quantsim_configurator = QuantSimConfigurator(session, self.connected_graph, config_file, default_output_bw, + default_param_bw, default_data_type) + self._supported_kernels = self._quantsim_configurator.get_supported_kernels() + self.per_channel_quantization_enabled = self._quantsim_configurator.per_channel_quantization_flag + self._op_name_to_output_channels_axis_handling_dict = {} + + self.quant_args = extract_global_quantizer_args(quant_scheme, self._quantsim_configurator) + + with self.session.graph.as_default(): + self._add_and_configure_quant_nodes(starting_op_names, output_op_names, default_param_bw, default_output_bw, + default_data_type) + + self._override_quant_config_for_transformer_mask_add() + # Save and load the session so the graph changes can take effect + self._save_and_load_sim_model() + + def __getstate__(self): + # convert object to pickle-able state + state = PickleableQuantSimState(self._quant_scheme, self._rounding_mode, + self._use_cuda, self._param_quantizers, + self._activation_quantizers) + return state + + def __setstate__(self, state): + self.session = None + self._quant_scheme = state.quant_scheme + self._rounding_mode = state.rounding_mode + self._use_cuda = state.use_cuda + self._param_quantizers = state.param_quantizers + self._activation_quantizers = state.activation_quantizers + + def quantizer_config(self, quant_op_name: str) -> Union[QuantizerInfo, None]: + """ + gets QuantizerInfo associated with given quantize op + :param quant_op_name: Name of the Quantize op + :return: QuantizerInfo associated with the Quant op + """ + + if quant_op_name in self._param_quantizers: + return self._param_quantizers[quant_op_name] + + if quant_op_name in self._activation_quantizers: + return self._activation_quantizers[quant_op_name] + + _logger.error('Could not find Quantizer for given op {%s} ', quant_op_name) + return None + + def set_percentile_value(self, percentile_value: float): + """ + Set the percentile value to be used while computing encodings for quantizers having percentile quant scheme. 
+ + :param percentile_value: Percentile value to set + """ + if percentile_value < 90 or percentile_value > 100: + raise ValueError("Percentile value must be in range [90, 100]") + self._percentile_value = percentile_value + + if self._quant_scheme == QuantScheme.post_training_percentile: + # Set the percentile value to the activation quantizers: + for quant_info in self._activation_quantizers.values(): + quant_info.set_percentile_value(self._percentile_value) + + def get_supported_kernels(self) -> Dict: + """ + Return _supported_kernels parsed from the config file + :return: Dictionary containing supported_kernels + """ + return self._supported_kernels + + def _get_op_variable_value(self, quant_op: tf.Operation, var_index: int): + """ + Utility to load variable values from quant op + :param quant_op: quantize op + :param var_index: variable index to be read + :return: variable value + """ + + op_var_tensor = quant_op.inputs[var_index] + return self.session.run(op_var_tensor) + + def configure_quantization_ops(self, conn_graph: ConnectedGraph, ops_with_param_names: List[str], indices: List[int], + params_to_quantize: Dict[str, ParameterInfo], activation_op_names: List[str]): + """ + Configure inserted quantize ops using config file + :param conn_graph: Connected graph of the model + :param ops_with_param_names: List of ops for which param quantization ops were inserted for + :param indices: List of input indices (one-to-one for each entry in ops) + :param params_to_quantize: Dictionary of parameters to quantize + :param activation_op_names: List of ops for which activation quantization ops were inserted for + """ + if not conn_graph: + error_msg = ('Connected graph passed into configure_quantization_ops() is None. If manual insertion of ' + 'quantization ops is being done, and get_ops_to_quantize_activations_for() has been ' + 'overriden, please override configure_quantization_ops() as well.') + _logger.error(error_msg) + raise AssertionError(error_msg) + self._op_to_quant_ops_dict = create_op_to_quant_ops_dict(self.session.graph, conn_graph, ops_with_param_names, indices, + params_to_quantize, activation_op_names) + self._quantsim_configurator.configure_quantizers(self._op_to_quant_ops_dict, self._param_quantizers, + self._activation_quantizers) + +
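+
+    # Usage sketch (illustrative, not part of the library source): constructing a QuantizationSimModel
+    # around an existing tf.compat.v1.Session and selecting the percentile calibration scheme. The import
+    # paths, op names and `sess` variable below are assumptions/placeholders.
+    #
+    #     from aimet_common.defs import QuantScheme                     # assumed import path
+    #     from aimet_tensorflow.quantsim import QuantizationSimModel    # assumed import path
+    #
+    #     sim = QuantizationSimModel(session=sess,
+    #                                starting_op_names=['input_1'],
+    #                                output_op_names=['dense/Softmax'],
+    #                                quant_scheme=QuantScheme.post_training_percentile,  # or the string 'percentile'
+    #                                default_output_bw=8,
+    #                                default_param_bw=8,
+    #                                use_cuda=False)
+    #     sim.set_percentile_value(99.9)        # only takes effect for the percentile quant scheme
+    #     print(sim.get_supported_kernels())    # supported kernels parsed from the optional config file
+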
[docs] def compute_encodings(self, forward_pass_callback: Callable[[tf.compat.v1.Session, Any], None], + forward_pass_callback_args): + """ + Computes encodings for all quantization sim nodes in the model. + This is also used to set initial encodings for Range Learning. + + :param forward_pass_callback: A callback function that is expected to runs forward passes on a session. + This callback function should use representative data for the forward pass, so the calculated + encodings work for all data samples. This callback internally chooses the number of data samples + it wants to use for calculating encodings. + + :param forward_pass_callback_args: These argument(s) are passed to the forward_pass_callback as-is. Up to + the user to determine the type of this parameter. E.g. could be simply an integer representing the number + of data samples to use. Or could be a tuple of parameters or an object representing something more + complex. + + :return: None + + """ + + self._compute_and_set_parameter_encodings() + + # At the beginning before we do forward pass we want to set parameters to quantize dequantize mode and once we + # compute the encodings for activations we set it to the required op mode based on quant scheme & if per channel + # quantization is enabled + self._set_op_mode_parameters(libpymo.TensorQuantizerOpMode.quantizeDequantize, []) + + ops_with_invalid_encodings = [] + + # Run data through the quantsim so we can compute activation encodings + forward_pass_callback(self.session, forward_pass_callback_args) + + # For activations, calculate encodings and update min-max parameters + for op_name, quantizer_info in self._activation_quantizers.items(): + # Calculate encodings + if quantizer_info.get_op_mode() != int(libpymo.TensorQuantizerOpMode.passThrough): + op_bitwidth, op_use_symmetric_encodings = quantizer_info.bitwidth, quantizer_info.use_symmetric_encoding + encoding = quantizer_info.compute_encoding(op_bitwidth, op_use_symmetric_encodings) + # encoding would be invalid for dtype=fp because there is no encoding computed in float mode through the + # tensor_quantizer + if quantizer_info.data_type == QuantizationDataType.float: + quantizer_info.set_op_mode(libpymo.TensorQuantizerOpMode.quantizeDequantize) + else: + if quantizer_info.is_encoding_valid(): + quantizer_info.set_encoding(encoding) + quantizer_info.set_op_mode(libpymo.TensorQuantizerOpMode.quantizeDequantize) + else: + quantizer_info.set_op_mode(libpymo.TensorQuantizerOpMode.passThrough) + ops_with_invalid_encodings.append(op_name) + + # For post-training mode, params will always be in one-shot mode + op_mode = self._param_op_mode_after_analysis(self._quant_scheme) + + self._set_op_mode_parameters(op_mode, ops_with_invalid_encodings) + + if ops_with_invalid_encodings: + _logger.info('The following quantizers did not have valid encodings and have been set to passThrough mode: ' + '%s', ops_with_invalid_encodings) + _logger.info('This can be due to the quantizers not having been evaluated during the forward pass in ' + 'compute encodings. Evaluation is required to collect statistics needed to compute valid ' + 'encodings.\n' + 'As a result, the quantizers have been set to passThrough mode, meaning no quantization noise ' + 'will be simulated for these ops if they are evaluated in the future.\n' + 'If this is not desired, amend the forward pass to evaluate tensors which require these ops ' + 'to be evaluated, and recompute encodings.') + + self._clamp_transformer_attention_mask_encoding()
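+
+    # Usage sketch (illustrative, not part of the library source): a minimal forward-pass callback used
+    # to drive compute_encodings(). The tensor names and the `calibration_batches` iterable are
+    # placeholders; any representative, unlabeled data can be used.
+    #
+    #     def pass_calibration_data(session, num_batches):
+    #         in_tensor = session.graph.get_tensor_by_name('input_1:0')
+    #         out_tensor = session.graph.get_tensor_by_name('dense/Softmax:0')
+    #         for batch_idx, images in enumerate(calibration_batches):
+    #             if batch_idx >= num_batches:
+    #                 break
+    #             session.run(out_tensor, feed_dict={in_tensor: images})
+    #
+    #     sim.compute_encodings(forward_pass_callback=pass_calibration_data,
+    #                           forward_pass_callback_args=10)
+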
+ + def get_enabled_parameter_quantizers(self): + """ + For the given quantsim model, get all enabled param quantizers. + :return: List of enabled param quantizers. + """ + enabled_param_quantizers = [] + for quantizer_info in self._param_quantizers.values(): + if quantizer_info.enabled: + enabled_param_quantizers.append(quantizer_info) + return enabled_param_quantizers + + def get_enabled_activation_quantizers(self): + """ + For the given quantsim model, get all enabled activation quantizers. + :return: List of enabled activation quantizers. + """ + enabled_activation_quantizers = [] + for quantizer_info in self._activation_quantizers.values(): + if quantizer_info.enabled: + enabled_activation_quantizers.append(quantizer_info) + return enabled_activation_quantizers + + def _set_op_mode_parameters(self, op_mode: libpymo.TensorQuantizerOpMode, + ops_with_invalid_encodings: List): + """ + Sets op mode for parameters and, if the encodings are invalid, adds those ops to ops_with_invalid_encodings + :param op_mode: libpymo.TensorQuantizerOpMode + :param ops_with_invalid_encodings: list of ops that don't have valid encodings + """ + for op_name, quantizer_info in self._param_quantizers.items(): + if quantizer_info.get_op_mode() != int(libpymo.TensorQuantizerOpMode.passThrough): + # encoding would be invalid for dtype=fp because there is no encoding computed in float mode through the + # tensor_quantizer + if quantizer_info.data_type == QuantizationDataType.float: + quantizer_info.set_op_mode(libpymo.TensorQuantizerOpMode.quantizeDequantize) + else: + if quantizer_info.is_encoding_valid(): + quantizer_info.set_op_mode(op_mode) + else: + quantizer_info.set_op_mode(libpymo.TensorQuantizerOpMode.passThrough) + ops_with_invalid_encodings.append(op_name) + +
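+
+    # Usage sketch (illustrative, not part of the library source): inspecting which quantizers are
+    # currently enabled and disabling a specific activation quantizer via its QuantizerInfo. The quant
+    # op name below is a placeholder; quant op names follow the '<op_name>_quantized' convention.
+    #
+    #     print(len(sim.get_enabled_parameter_quantizers()), 'parameter quantizers enabled')
+    #     print(len(sim.get_enabled_activation_quantizers()), 'activation quantizers enabled')
+    #
+    #     quantizer_info = sim.quantizer_config('conv1/Relu_quantized')
+    #     if quantizer_info is not None:
+    #         quantizer_info.enabled = False
+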
[docs] def export(self, path: str, filename_prefix: str, orig_sess: tf.compat.v1.Session = None): + """ + This method exports out the quant-sim model so it is ready to be run on-target. + + Specifically, the following are saved + + 1. The sim-model is exported to a regular tensorflow meta/checkpoint without any simulation ops + + 2. The quantization encodings are exported to a separate JSON-formatted file that can + then be imported by the on-target runtime (if desired) + + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames of the model pth and encodings files + :param orig_sess: optional param to pass in original session without quant nodes for export + :return: None + + """ + # this is required to update the encoding for last iteration of backward pass for QAT 1.0 only + if self._quant_scheme in [QuantScheme.post_training_tf, QuantScheme.post_training_tf_enhanced]: + self._compute_and_set_parameter_encodings() + # save session without quant nodes + if orig_sess is not None: + with orig_sess.graph.as_default(): + saver = tf.compat.v1.train.Saver() + saver.save(orig_sess, save_path=WORKING_DIR+'orig_model_before_quantsim') + else: + _logger.info('Original session is not provided, use orig_model_before_quantsim.meta to export') + + self._remove_quantization_nodes_and_save_graph(path, filename_prefix) + self._export_encodings(os.path.join(path, filename_prefix) + '.encodings')
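+
+    # Usage sketch (illustrative, not part of the library source): exporting the simulated model after
+    # compute_encodings(). The output directory, filename prefix and original session are placeholders.
+    #
+    #     sim.export(path='/tmp/quantsim_export', filename_prefix='model_quantized', orig_sess=sess)
+    #     # writes a quantization-op-free meta/checkpoint under /tmp/quantsim_export/ plus a
+    #     # JSON-formatted /tmp/quantsim_export/model_quantized.encodings file
+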
+ + def _compute_and_set_parameter_encodings(self): + + for quantizer_info in self._param_quantizers.values(): + + if quantizer_info.enabled and quantizer_info.data_type == QuantizationDataType.int: + # 0th input to our quant op is the tensor being quantized - in this case the parameter tensor + weight_tensor = quantizer_info.get_variable_from_op(0) + + # Per-channel + if isinstance(quantizer_info.tensor_quantizer, list): + for index, tensor_quantizer in enumerate(quantizer_info.tensor_quantizer): + if quantizer_info.axis_handling == AxisHandling.LAST_TWO_AXES: + last_two_axes_combined_shape = list(weight_tensor.shape[:-2]) + [-1] + channel_slice = weight_tensor.reshape(*last_two_axes_combined_shape) + channel_slice = channel_slice.take(index, channel_slice.ndim - 1) + tensor_quantizer.updateStats(channel_slice, False) + else: + channel_slice = weight_tensor.take(index, weight_tensor.ndim - 1) + tensor_quantizer.updateStats(channel_slice, False) + + # Per-tensor + else: + tensor_quantizer = quantizer_info.tensor_quantizer + tensor_quantizer.updateStats(weight_tensor, False) + + encoding = quantizer_info.compute_encoding(quantizer_info.bitwidth, + quantizer_info.use_symmetric_encoding) + + quantizer_info.set_encoding(encoding) + + def _remove_quantization_nodes_and_save_graph(self, path: str, filename_prefix: str): + """ + This function removes the quantization nodes from quantized graph and saves it + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames of the model pth and encodings files + """ + vars_to_save = [] + with self.session.graph.as_default(): + for var in tf.compat.v1.global_variables(): + if not var.name[:-2].endswith(('_quantized', '_quantized_op_mode', '_quantized_quant_ref', + '_quantized_encoding_min', '_quantized_encoding_max', + '_quantized_bit_width', '_quantized_use_symmetric_encoding', + '_quantized_axis_handling', '_quantized_data_type')): + vars_to_save.append(var) + + saver = tf.compat.v1.train.Saver(vars_to_save) + saver.save(self.session, save_path=os.path.join(path, filename_prefix)) + shutil.copyfile(WORKING_DIR + 'orig_model_before_quantsim.meta', + os.path.join(path, filename_prefix) + '.meta') + + def save_to_keras(self, temp_dir_path: str = "/tmp/") -> tf.compat.v1.Session: + """ + This method exports out the quant-sim model so it is ready to be eval/trained using a Keras pipeline + + :param temp_dir_path: temporary directory to store intermediate files + :return: Session to import into a Keras model + + """ + current_graph = self.session.graph + with current_graph.as_default(): + ops = current_graph.get_operations() + for op in ops: + if op.type in ['QcQuantize', 'QcQuantizeRecurrentParam']: + + # Read the config + # ----------------- + quant_config = self.quantizer_config(op.name) + config_tuple = self.session.run([op.inputs[QuantizeOpIndices.op_mode], + op.inputs[QuantizeOpIndices.encoding_min], + op.inputs[QuantizeOpIndices.encoding_max], + op.inputs[QuantizeOpIndices.bit_width], + op.inputs[QuantizeOpIndices.use_symmetric_encoding]]) + op_mode, encoding_min, encoding_max, bitwidth, is_symmetric = config_tuple + + # Create the static op + # -------------------- + if not self._use_cuda: + with tf.device('/cpu:0'): + static_op = qcops.qc_quantize_static(name=op.name+"_static", in_tensor=op.inputs[0], + encoding_min=encoding_min, encoding_max=encoding_max, + bitwidth=bitwidth, quant_scheme=quant_config.quant_scheme, + op_mode=op_mode, is_symmetric=bool(is_symmetric)) + else: + static_op = 
qcops.qc_quantize_static(name=op.name + "_static", in_tensor=op.inputs[0], + encoding_min=encoding_min, encoding_max=encoding_max, + bitwidth=bitwidth, quant_scheme=quant_config.quant_scheme, + op_mode=op_mode, is_symmetric=bool(is_symmetric)) + + # Replace in graph + # ----------------- + graph_editor.reroute_ts(ts0=[static_op], ts1=[op.outputs[0]], + can_modify=op.outputs[0].consumers()) + graph_editor.detach_inputs(op) + + new_sess = utils.graph_saver.save_and_load_graph(temp_dir_path, self.session) + return new_sess + + def save_model_with_embedded_quantization_nodes(self, checkpoint_path: str, encoding_path: str = None, + orig_sess: tf.compat.v1.Session = None): + """ + This method is to export model embedded with native tensorflow quantization nodes + :param checkpoint_path: path to save the checkpoint files + :param encoding_path: optional param to pass the path from where to load parameter encodings file + :param orig_sess: optional param to pass in original session without quant nodes + """ + # Load encodings file + encodings_dicts = {} + if encoding_path and os.path.exists(encoding_path): + with open(encoding_path) as json_file: + encodings_dicts = json.load(json_file) + encodings_dicts = dict(encodings_dicts["activation_encodings"], **encodings_dicts["param_encodings"]) + + if orig_sess is None: + _logger.info('Original session is not provided, use orig_model_before_quantsim.meta as default graph') + orig_sess = utils.graph_saver.load_model_from_meta(meta_path=os.path.join(WORKING_DIR, + 'orig_model_before_quantsim' + '.meta')) + with orig_sess.graph.as_default(): + for op_name, quantizer_info in dict(self._param_quantizers, **self._activation_quantizers).items(): + tensor_name = self.session.graph.get_operation_by_name(op_name).inputs[0].name + op = orig_sess.graph.get_tensor_by_name(tensor_name).op + consumers = [consumer for consumer in op.outputs[0].consumers() if 'gradients' not in consumer.name] + if tensor_name in encodings_dicts: + # Check for per channel quantization + if self.per_channel_quantization_enabled and len(encodings_dicts[tensor_name]) > 1: + encoding_min = [channel_dict['min'] for channel_dict in encodings_dicts[tensor_name]] + encoding_max = [channel_dict['max'] for channel_dict in encodings_dicts[tensor_name]] + encoding_bw = encodings_dicts[tensor_name][0]['bitwidth'] + else: + encoding_max = encodings_dicts[tensor_name][0].get('max') + encoding_min = encodings_dicts[tensor_name][0].get('min') + encoding_bw = encodings_dicts[tensor_name][0].get('bitwidth') + + else: + if not quantizer_info.is_encoding_valid(): + if quantizer_info.data_type == QuantizationDataType.float and quantizer_info.get_op_mode() in\ + [int(libpymo.TensorQuantizerOpMode.oneShotQuantizeDequantize), + int(libpymo.TensorQuantizerOpMode.quantizeDequantize)]: + # Cast input tensor to data_type and dequant it to fp32 + with tf.device('' if self._use_cuda else '/cpu:0'): + tf_quantization_op = tf.cast(tf.cast(op.outputs[0], tf.float16), tf.float32) + # Replace in graph + # ----------------- + graph_editor.reroute_ts(ts0=tf_quantization_op, ts1=[op.outputs[0]], + can_modify=consumers) + continue + _logger.info("Can't find %s in encodings file, encodings in QuantizationSimModel will be used", + self._get_quantized_name(op.name)) + encoding_min, encoding_max = self.read_min_max(self._get_quantized_name(op.name)) + # if per channel quantization is enabled, then min and max are numpy arrays, and this function gates the array + encoding_bw = 
int(self._get_op_variable_value(self.session.graph.get_operation_by_name(op_name), + QuantizeOpIndices.bit_width)) + + _logger.info("Adding native tensorflow quantization op %s", self._get_quantized_name(op.name)) + # inser native tensorflow quantization nodes into graph + with tf.device('' if self._use_cuda else '/cpu:0'): + if not isinstance(encoding_max, (list, np.ndarray)): + tf_quantization_op = \ + tf.quantization.fake_quant_with_min_max_vars(op.outputs[0], min=encoding_min, max=encoding_max, + num_bits=encoding_bw, narrow_range=False, + name=self._get_quantized_name(op.name)) + else: + tf_quantization_op = \ + tf.quantization.fake_quant_with_min_max_vars_per_channel(op.outputs[0], min=np.array(encoding_min), + max=np.array(encoding_max), num_bits=encoding_bw, + narrow_range=False, name=self._get_quantized_name(op.name)) + + # Replace in graph + # ----------------- + graph_editor.reroute_ts(ts0=tf_quantization_op, ts1=[op.outputs[0]], + can_modify=consumers) + + utils.graph_saver.save_model_to_meta(orig_sess, os.path.join(checkpoint_path + '_embedded_quant_nodes')) + return utils.graph_saver.load_model_from_meta(meta_path=str(checkpoint_path + '_embedded_quant_nodes.meta')) + + def set_and_freeze_param_encodings(self, encoding_path: str): + """ + Set and freeze parameter encodings from encodings JSON file + :param encoding_path: path from where to load parameter encodings file + """ + # Load parameter encodings file + with open(encoding_path) as json_file: + param_encodings = json.load(json_file) + + # op mode will be Quantize dequantize + op_mode = libpymo.TensorQuantizerOpMode.quantizeDequantize + + for op_name, quantizer_info in self._param_quantizers.items(): + quant_op = self.session.graph.get_operation_by_name(op_name) + tensor_name = quant_op.inputs[0].name + if tensor_name in param_encodings: + encoding_dict = param_encodings[tensor_name] if self.per_channel_quantization_enabled else \ + param_encodings[tensor_name][0] + encoding, is_symmetric = create_encoding_from_dict(encoding_dict) + quantizer_info.use_symmetric_encoding = is_symmetric + quantizer_info.set_and_freeze_encoding_and_op_mode(encoding, op_mode) + _logger.info("Setting and freezing quantization encodings for parameter: %s", tensor_name) + + def load_encodings_to_sim(self, encoding_path: str): + """ + Set parameter and activation encodings from encodings JSON file + :param encoding_path: path from where to load encodings file + """ + # Load parameter encodings file + with open(encoding_path) as json_file: + encodings = json.load(json_file) + param_encodings = encodings['param_encodings'] + activation_encodings = encodings['activation_encodings'] + + for op_name, quantizer_info in self._param_quantizers.items(): + quant_op = self.session.graph.get_operation_by_name(op_name) + tensor_name = quant_op.inputs[0].name + if tensor_name in param_encodings: + # Check if the quantizer is disabled + if not quantizer_info.enabled: + _logger.info("Not loading encodings for parameter: %s as quantizer is disabled", tensor_name) + continue + encoding_dict = param_encodings[tensor_name] if self.per_channel_quantization_enabled else \ + param_encodings[tensor_name][0] + encoding, is_symmetric = create_encoding_from_dict(encoding_dict) + bitwidth = encoding_dict[0].get('bitwidth') if self.per_channel_quantization_enabled else \ + encoding_dict.get('bitwidth') + quantizer_info.set_encodings_to_quantizer(bitwidth, is_symmetric, encoding, + libpymo.TensorQuantizerOpMode.oneShotQuantizeDequantize) + _logger.info("Setting quantization 
encodings for parameter: %s", tensor_name) + else: + # Case where encoding is not present in the encoding file + # So we will disable the quantizer if its active + if quantizer_info.enabled: + quantizer_info.enabled = False + _logger.info("Encoding for parameter: %s not present thus disabling this quantizer.", tensor_name) + + for op_name, quantizer_info in self._activation_quantizers.items(): + quant_op = self.session.graph.get_operation_by_name(op_name) + tensor_name = quant_op.inputs[0].name + if tensor_name in activation_encodings: + # Check if the quantizer is disabled + if not quantizer_info.enabled: + _logger.info("Not loading encodings for parameter: %s as quantizer is disabled", tensor_name) + continue + encoding_dict = activation_encodings[tensor_name][0] + encoding, is_symmetric = create_encoding_from_dict(encoding_dict) + quantizer_info.set_encodings_to_quantizer(encoding_dict.get('bitwidth'), is_symmetric, + encoding, libpymo.TensorQuantizerOpMode.quantizeDequantize) + _logger.info("Setting quantization encodings for activation: %s", tensor_name) + else: + # Case where encoding is not present in the encoding file + # So we will disable the quantizer if its active + if quantizer_info.enabled: + quantizer_info.enabled = False + _logger.info("Encoding for parameter: %s not present thus disabling this quantizer.", tensor_name) + + def _param_op_mode_after_analysis(self, quant_scheme) -> libpymo.TensorQuantizerOpMode: + """ + Returns op mode to use for parameters after encodings have been computed + :param quant_scheme: Quantization scheme to use + :return: + """ + if quant_scheme in [QuantScheme.training_range_learning_with_tf_init, + QuantScheme.training_range_learning_with_tf_enhanced_init]: + op_mode = libpymo.TensorQuantizerOpMode.quantizeDequantize + else: + op_mode = libpymo.TensorQuantizerOpMode.oneShotQuantizeDequantize + + if self.per_channel_quantization_enabled: + op_mode = libpymo.TensorQuantizerOpMode.quantizeDequantize + + return op_mode + + def get_min_max_var_dict(self) -> Dict: + """ + Fetches all the min max variables in given Quantized graph. + :return: dictionary of min/ max variable names to var mapping + """ + variable_dict = {} + with self.session.graph.as_default(): + for var in tf.compat.v1.global_variables(): + if var.name.endswith('_encoding_min:0') or var.name.endswith('_encoding_max:0'): + variable_dict[var.name] = var + + return variable_dict + + def read_min_max(self, quant_op_name: str, variable_dict: Dict = None) -> (float, float): + """ + Reads min and max params from quantize op + :param quant_op_name: quantize op name to read min and max variables from. + :param variable_dict: dictionary of min/max variable names to variable mapping for given quantized graph, optional + :return: min and max variable values from the given quant op. + """ + if not variable_dict: + # get a variable dict if one is not provided + variable_dict = self.get_min_max_var_dict() + + min_var = variable_dict[quant_op_name + '_encoding_min:0'] + max_var = variable_dict[quant_op_name + '_encoding_max:0'] + return self.session.run([min_var, max_var]) + + def _export_encodings(self, encoding_file_path: str): + """ + Export encodings to the given file path. 
+ + :param encoding_file_path: File path to export encodings to + """ + def update_encoding_dict_entry_float(encoding_dict: Dict, op_name: str): + quant_op = self.session.graph.get_operation_by_name(op_name) + op_bitwidth = int(self._get_op_variable_value(quant_op, QuantizeOpIndices.bit_width)) + if op_bitwidth != 16: + raise ValueError('dtype is set to float but bitwidth is not 16 for the layer:', op_name) + + tensor_name = quant_op.inputs[0].name + encoding_dict[tensor_name] = [{'dtype': 'float', + 'bitwidth': op_bitwidth}] + + def update_encoding_dict_entry_int(encoding_dict: Dict, quantizer_info: QuantizerInfo): + encoding = quantizer_info.get_encoding() + quant_op = self.session.graph.get_operation_by_name(quantizer_info.quant_op_name) + + # Min and max will be numpy arrays, so to make them JSON serializable + if self.per_channel_quantization_enabled and isinstance(encoding, list): + min_val = [enc.min for enc in encoding] + max_val = [enc.max for enc in encoding] + delta = [enc.delta for enc in encoding] + offset = [enc.offset for enc in encoding] + else: + # Wrap single min/max value in a list to support list comprehension + min_val = [encoding.min] + max_val = [encoding.max] + delta = [encoding.delta] + offset = [encoding.offset] + + tensor_name = quant_op.inputs[0].name + if quant_op.type in ['QcQuantizePerChannel'] and 'EagerPyFunc' in tensor_name: + tensor_name = quant_op.inputs[0].op.inputs[0].name + encoding_dict[tensor_name] = [{'min': min_val[idx], + 'max': max_val[idx], + 'scale': delta[idx], + 'offset': int(offset[idx]), + 'bitwidth': int(quantizer_info.bitwidth), + 'is_symmetric': str(quantizer_info.use_symmetric_encoding), + 'dtype': 'int'} for idx in range(len(min_val))] + + param_encodings = {} + for quant_op_name, quantizer_info in self._param_quantizers.items(): + if quantizer_info.data_type == QuantizationDataType.float: + update_encoding_dict_entry_float(param_encodings, quant_op_name) + else: + if not quantizer_info.is_encoding_valid(): + continue + update_encoding_dict_entry_int(param_encodings, self._param_quantizers[quant_op_name]) + + activation_encodings = {} + for quant_op_name, quantizer_info in self._activation_quantizers.items(): + if quantizer_info.data_type == QuantizationDataType.float: + update_encoding_dict_entry_float(activation_encodings, quant_op_name) + else: + if not quantizer_info.is_encoding_valid(): + continue + update_encoding_dict_entry_int(activation_encodings, self._activation_quantizers[quant_op_name]) + + encodings_dict = {'version': encoding_version, + 'activation_encodings': activation_encodings, + 'param_encodings': param_encodings, + 'quantizer_args': self.quant_args} + + save_json_yaml(encoding_file_path, encodings_dict) + + def _save_and_load_sim_model(self): + self.session = utils.graph_saver.save_and_load_graph(WORKING_DIR, self.session) + update_tensor_quantizer_references(self.session, self._activation_quantizers) + update_tensor_quantizer_references(self.session, self._param_quantizers) + + def _add_quant_nodes_recurrent(self, conn_graph: ConnectedGraph, default_param_bw: int, default_output_bw: int) \ + -> Tuple[List[str], List[int], List[str]]: + """ + Utility to add quant nodes to recurrent module + :param conn_graph: Connected graph of the model + :param default_param_bw: default param bitwidth + :param default_output_bw: default output bitwidth + :return: Tuple[List[str], List[int], List[str]], param op names, input indices and activation op names + """ + # pylint: disable=protected-access + # pylint: 
disable=too-many-locals + + # Register custom handlers to select internal ops to quantize in a given recurrent module type + switcher = { + "SimpleRNN": _select_simple_rnn_internal_ops_to_quantize, + "LSTM": _select_lstm_internal_ops_to_quantize + } + + ops_with_param_names = [] + input_indices = [] + activation_op_names = [] + + for op in conn_graph.get_all_ops().values(): + # we can configure custom layer selectors per recurrent type or use default one + if op.type in SUPPORTED_RECURRENT_TYPES: + if version.parse(tf.version.VERSION) >= version.parse("2.00"): + raise AssertionError('Recurrent layers are not supported with TF2.x, instead use TF1.15.') + internal_ops = op.internal_ops + + # select internal ops to quantize in this recurrent type + select_internal_ops_to_quantize = switcher.get(op.type) + module_ops_with_param_names, module_op_input_indices, module_activation_op_names = \ + select_internal_ops_to_quantize(self.session.graph, internal_ops) + + # insert the quant nodes + self._insert_param_quantization_ops_loop_context(module_ops_with_param_names, module_op_input_indices, + default_param_bw, internal_ops) + + self._insert_activation_quantization_ops(module_activation_op_names, default_output_bw, + in_loop_context=True) + + # if there are multiple recurrent modules, we want a list containing all the param + # and activation info + if module_ops_with_param_names and module_op_input_indices: + ops_with_param_names.extend(module_ops_with_param_names) + input_indices.extend(module_op_input_indices) + if module_activation_op_names: + activation_op_names.extend(module_activation_op_names) + + return ops_with_param_names, input_indices, activation_op_names + + def _add_and_configure_quant_nodes(self, starting_op_names: List[str], output_op_names: List[str], + default_param_bw: int, default_output_bw: int, + default_data_type: QuantizationDataType): + """ + Utility to add quant nodes + :param starting_op_names: List of starting op names of the model + :param output_op_names: List of output op names of the model + :param default_param_bw: default param bitwidth + :param default_output_bw: default output bitwidth + :param default_data_type: Default data type to use for quantizing all layer parameters + """ + + # Get list of ops with params to insert quantizers for, as well as the input indices to insert on. + params_to_quantize = QuantizationSimModel._get_ops_to_quantize_params_for(self.session.graph, + self.connected_graph, + starting_op_names, + output_op_names) + + # Get list of activation ops to insert quantizers for + activation_op_names = QuantizationSimModel._get_ops_to_quantize_activations_for(self.session.graph, + self.connected_graph) + + + self._insert_param_quantization_ops(params_to_quantize, default_param_bw, data_type=default_data_type) + self._insert_activation_quantization_ops(activation_op_names, default_output_bw, data_type=default_data_type) + + # this takes care of quant node insertion in loop context of recurrent layer, which makes a cell + recurrent_ops_with_param_names, recurrent_input_indices, recurrent_activation_op_names = \ + self._add_quant_nodes_recurrent(self.connected_graph, default_param_bw, default_output_bw) + + if recurrent_activation_op_names: + activation_op_names.extend(recurrent_activation_op_names) + + # Note: at this point, the session used to construct conn_graph is different than the current + # self.session, however we still use the connected graph to traverse the graph structure. 
+ self.configure_quantization_ops(self.connected_graph, recurrent_ops_with_param_names, recurrent_input_indices, + params_to_quantize, activation_op_names) + + @staticmethod + def _get_quantized_name(op_name: str) -> str: + """ + Small utility function to name a quantized parameter + :param op_name: Name of the op being quantized + :return: Returns an appropriate name for the quantized op + """ + return op_name + '_quantized' + + @staticmethod + def _get_unquantized_name(quant_op_name: str) -> str: + """ + Small utility function to get the name of the op being quantized + :param quant_op_name: Name of the quant op + :return: Returns the name of the op being quantized + """ + assert quant_op_name.endswith('_quantized') + return quant_op_name[:-len('_quantized')] + + @staticmethod + def _get_op_to_modify_with_param_in(op: tf.Operation, index: int) -> (tf.Operation, tf.Tensor): + """ + utility to get op to modify along with param input + :param op: TensorFlow operation + :param index: input index to get param from + :return: Tuple of TF operation and param in tensor + """ + + op_to_modify = None + param_in = None + # case of params being depth 2 input nodes to MatMul + # via strided-slice or split op + if op.inputs[index].op.type in ['StridedSlice', 'Split']: + strided_slice_op = op.inputs[index].op + for inp in strided_slice_op.inputs: + if inp.op.type in ['ReadVariableOp']: + op_to_modify = strided_slice_op + param_in = inp + else: + # case of params being direct input nodes to MatMul + op_to_modify = op + param_in = op.inputs[index] + + return op_to_modify, param_in + + def _insert_param_quantization_ops(self, params_to_quantize: Dict[str, ParameterInfo], default_param_bw: int, + data_type: QuantizationDataType = QuantizationDataType.int): + """ + Inserts quantization ops for individual parameters + :param params_to_quantize: dictionary of parameters to quantize + :param default_param_bw : default param bitwidth + :return: None + """ + # pylint: disable=too-many-locals + for param_name, param_info in params_to_quantize.items(): + param_in = self.session.graph.get_operation_by_name(param_name).outputs[0] + can_modify_ops = [self.session.graph.get_operation_by_name(consumer) \ + for consumer in param_info.op_with_param_name] + # Assume all ops that are consumers of the param are of the same type for axis handling purposes + can_modify_op_type = can_modify_ops[0].type + if param_in is not None: + num_output_channels, quantization_axis_handling = \ + QuantizationSimModel._get_number_of_output_channels_and_quantization_axis_handling( + param_in.get_shape().as_list(), can_modify_op_type) + quant_op_name = self._get_quantized_name(param_name) + + self._op_name_to_output_channels_axis_handling_dict[quant_op_name] = [num_output_channels, + quantization_axis_handling] + _logger.info("Adding weight quantization op %s", quant_op_name) + op_mode = libpymo.TensorQuantizerOpMode.oneShotQuantizeDequantize + + # If per channel quantization is enabled we tranpose the weights of tranpose op and then + # perform per channel quantization + if can_modify_op_type in ['Conv2DTranspose', 'Conv2DBackpropInput'] and \ + self.per_channel_quantization_enabled: + + fout = tf.py_function(func=swap_last_two_dim, inp=[param_in], Tout=tf.float32) + + q_op_out = self._insert_post_training_quant_op(fout, quant_op_name, + op_mode, self._param_quantizers, QuantizerType.param, + default_param_bw, data_type) + + q_op_out = tf.py_function(func=swap_last_two_dim, inp=[q_op_out], Tout=tf.float32) + else: + q_op_out = 
self._insert_post_training_quant_op(param_in, quant_op_name, + op_mode, self._param_quantizers, QuantizerType.param, + default_param_bw, data_type) + + nodes_modified_count = graph_editor.reroute_ts(tf_ops.convert_to_tensor(q_op_out), param_in, + can_modify=can_modify_ops) + + if nodes_modified_count != len(can_modify_ops): + raise ValueError(f'Issue quantizing {param_in.name}') + + def _insert_param_quantization_ops_loop_context(self, op_names: List[str], indices: List[int], + default_param_bw: int, + inner_ops: List[tf.Operation], + data_type: QuantizationDataType = QuantizationDataType.int): + """ + Inserts quantization ops for individual parameters + :param op_names: List of ops whose parameters are being quantized + :param indices: List of input indices (one-to-one for each entry in ops) + :param default_param_bw : default param bitwidth + :param inner_ops: list of tf.Operations inside a RNN op + :param data_type: Default data type to use for quantizing all layer parameters + :return: None + """ + # pylint: disable=too-many-locals + ops = [self.session.graph.get_operation_by_name(op_name) for op_name in op_names] + assert len(ops) == len(indices) + + for op, index in zip(ops, indices): + # Modify the weight/bias inputs to use the quantized inputs + can_modify_op, param_in = QuantizationSimModel._get_op_to_modify_with_param_in(op, index) + + if param_in is not None: + num_output_channels, quantization_axis_handling = \ + QuantizationSimModel._get_number_of_output_channels_and_quantization_axis_handling( + can_modify_op.inputs[index].get_shape().as_list(), can_modify_op.type) + quant_op_name = self._get_quantized_name(param_in.op.name) + + self._op_name_to_output_channels_axis_handling_dict[quant_op_name] = [num_output_channels, + quantization_axis_handling] + _logger.info("Adding weight quantization op %s", quant_op_name) + op_mode = libpymo.TensorQuantizerOpMode.oneShotQuantizeDequantize + + q_op_out = self._insert_param_quantizer_loop_context(inner_ops, param_in, quant_op_name, + op_mode, self._param_quantizers, + QuantizerType.param, + default_param_bw, data_type) + + nodes_modified_count = graph_editor.reroute_ts(tf_ops.convert_to_tensor(q_op_out), param_in, + can_modify=can_modify_op) + if nodes_modified_count != 1: + raise ValueError('Input ' + param_in.name + ' not quantized!') + + + @staticmethod + def _get_number_of_output_channels_and_quantization_axis_handling(weight_shape: List[int], + consumer_op_type: str) -> \ + Tuple[int, AxisHandling]: + """ + Gets number of output channels and quantization axis handling for an op for per channel quantization + :param weight_shape: list containing tensor shape of weight + :param consumer_op_type: type of op that consumes weight + :return number of output channel and axis handling from weight_shape + """ + # Initialize axis_handling and num_output_channels with values fitting most ops + axis_handling = AxisHandling.LAST_AXIS + num_output_channels = weight_shape[-1] + if consumer_op_type in ['Conv2DTranspose', 'Conv2DBackpropInput']: + num_output_channels = weight_shape[2] + elif consumer_op_type == 'DepthwiseConv2dNative': + num_output_channels *= weight_shape[-2] + axis_handling = AxisHandling.LAST_TWO_AXES + + # If op is not any special op, fall through and return the unmodified values. 
+ return num_output_channels, axis_handling + + @staticmethod + def _is_op_quantizable(op: tf.Operation) -> bool: + """ + utility to check if the quantization can be supported for this op + :param op: op as tf.Operation type + :return: True if the op can be quantized, False otherwise + """ + + if op.outputs: + if op.outputs[0].dtype not in DTYPES_QUANTIZE_NOT_REQUIRED: + return True + + return False + + def _insert_activation_quantization_ops(self, valid_op_names: List[str], default_output_bw, + in_loop_context: bool = False, + data_type: QuantizationDataType = QuantizationDataType.int): + """ + Inserts quantization ops at the outputs of given ops + :param valid_op_names: List of op names to insert activation quantizers for + :param default_output_bw: default activation bitwidth + :param in_loop_context: True, if the ops belong to a loop control flow context + :param data_type: Default data type to use for quantizing all layer activations + return: + """ + for op_name in valid_op_names: + quant_op_name = self._get_quantized_name(op_name) + op = self.session.graph.get_operation_by_name(op_name) + _logger.info("Adding activation quantization op %s", quant_op_name) + + consumers = [consumer for consumer in op.outputs[0].consumers() if 'gradients' not in consumer.name] + + if not QuantizationSimModel._is_op_quantizable(op): + error_msg = f'Unsupported dtype {op.outputs[0].dtype} detected for op {op_name}.' + _logger.error(error_msg) + raise AssertionError(error_msg) + + if in_loop_context: + q_op_out = self._insert_post_training_quant_op_in_loop_context(op.outputs[0], quant_op_name, + libpymo.TensorQuantizerOpMode.updateStats, + self._activation_quantizers, + QuantizerType.activation, + default_output_bw, data_type) + else: + q_op_out = self._insert_post_training_quant_op(op.outputs[0], quant_op_name, + libpymo.TensorQuantizerOpMode.updateStats, + self._activation_quantizers, QuantizerType.activation, + default_output_bw, data_type) + + # Re-route + num_rerouted_outputs = graph_editor.reroute_ts(tf_ops.convert_to_tensor(q_op_out), + op.outputs[0], can_modify=consumers) + if num_rerouted_outputs != len(consumers): + raise ValueError('Failed to map ' + str(len(consumers)) + ' quantization output(s). Only mapped ' + + str(num_rerouted_outputs)) + + def _create_encoding_min_max_vars(self, q_op_name: str, quantizer_type: QuantizerType = None) -> (tf.Variable, tf.Variable): + """ + creates encoding min and max variables for quant op. 
+ :param q_op_name: name of quantize op + :param quantizer_type: Quantizer type param or activation + :return: encoding min and max as tf.Variable type + """ + + is_trainable = False + if self._quant_scheme in [QuantScheme.training_range_learning_with_tf_init, + QuantScheme.training_range_learning_with_tf_enhanced_init]: + is_trainable = True + + initial_min_val = 0.0 + initial_max_val = 0.0 + + if quantizer_type == QuantizerType.param and self.per_channel_quantization_enabled: + num_output_channels, _ = self._op_name_to_output_channels_axis_handling_dict[q_op_name] + initial_min_val = [0.0] * num_output_channels + initial_max_val = [0.0] * num_output_channels + + encoding_min_var = tf.Variable(initial_value=initial_min_val, + name=q_op_name + '_encoding_min', + trainable=is_trainable, dtype=tf.double) + encoding_max_var = tf.Variable(initial_value=initial_max_val, + name=q_op_name + '_encoding_max', + trainable=is_trainable, dtype=tf.double) + + return encoding_min_var, encoding_max_var + + @staticmethod + def _get_ops_to_quantize_params_for(graph: tf.Graph, conn_graph: ConnectedGraph, starting_op_names: List[str], + output_op_names: List[str]) -> Dict[str, ParameterInfo]: + """ + Get names of ops to insert param quantizers for, as well as corresponding indices + :param graph: TensorFlow graph to get names of ops to quantize weights for + :param conn_graph: Connected graph of the model + :param starting_op_names: List of starting op names of the model + :param output_op_names: List of output op names of the model + :return: Dictionary with name of parameters to quantize as keys and information about parameters as values + """ + if conn_graph is None: + _logger.error("Connected graph is not passed as a parameter") + raise AssertionError("Connected graph is not passed as a parameter") + + # Get available connected graphs + valid_conns = [conn for conn in conn_graph.get_all_ops().values() + if conn.type not in param_quant_conn_op_ignore_list] + + valid_ops = get_valid_ops(graph, starting_op_names, output_op_names) + + # Get parameters of connected graphs + params_to_quantize = {} + for conn in valid_conns: + for param_name, param_info in conn.parameters.items(): + for consumer_name in param_info.op_with_param_name: + consumer = graph.get_operation_by_name(consumer_name) + if op_not_in_loop_control_flow_context(graph, consumer) and consumer in valid_ops: + if param_name in params_to_quantize: + # Parameter can be a weight shared parameter, that was used for a different op that was + # processed earlier. In this case, there will already be a parameter info entry for this + # parameter, and we need to update the op_with_param_name list to include the current op. 
+ params_to_quantize[param_name].op_with_param_name.extend(param_info.op_with_param_name) + else: + params_to_quantize[param_name] = param_info + + params_to_quantize.update(get_embedding_params_using_patterns(conn_graph)) + + return params_to_quantize + + @staticmethod + def _get_ops_to_quantize_activations_for(graph: tf.Graph, conn_graph: ConnectedGraph) -> List[str]: + """ + Get names of ops to insert activation quantizers for + :param graph: TensorFlow graph to get names of ops to quantize weights for + :param conn_graph: Connected graph of the model + :return: List of op names to insert activation quantize ops for + """ + valid_ops = [op for op in conn_graph.get_all_ops().values() if op.type not in op_types_to_ignore] + op_names_to_quantize = [conn_graph_op.output_op_node.name for conn_graph_op in valid_ops if + is_op_quantizable(conn_graph_op.output_op_node) + and op_not_in_loop_control_flow_context(graph, conn_graph_op.output_op_node)] + + return op_names_to_quantize + + def _insert_post_training_quant_op_in_loop_context(self, preceeding_tensor, + quant_op_name: str, + op_mode: libpymo.QuantizationMode, + quantizer_dict: Dict[str, QuantizerInfo], + quantizer_type: QuantizerType, + bit_width: int = 8, + data_type: QuantizationDataType = QuantizationDataType.int): + """ + Create and insert a post-training quant op after a given tensor in a loop control flow context. + :param preceeding_tensor: Preceeding tensor to insert the quant op after + :param quant_op_name: Name to give to the new quant op + :param op_mode: Starting mode to configure for the new quant op + :param quantizer_dict: dictionary of op and QuantizerInfo + :param quantizer_type : indicate param or activation quantizer + :param bit_width : bit-width to be used (output or param quantization bit-width), default set to 8 + :param data_type: data type to use for quantizing all layer parameters + :return: None + """ + + # this handles cases such as conditional blocks that are defined in their own context + context_bk = updated_graph_flow_context_to_loop_context(self.session.graph, preceeding_tensor) + q_op_out = self._insert_post_training_quant_op(preceeding_tensor, quant_op_name, op_mode, quantizer_dict, + quantizer_type, bit_width, data_type) + + # revert the context back to graph level from op context + set_graph_flow_context(self.session.graph, context_bk) + + return q_op_out + + def _insert_param_quantizer_loop_context(self, inner_ops, preceeding_tensor, + quant_op_name: str, + op_mode: libpymo.QuantizationMode, + quantizer_dict: Dict[str, QuantizerInfo], + quantizer_type: QuantizerType, + bit_width: int = 8, + data_type: QuantizationDataType = QuantizationDataType.int): + """ + Create and insert a post-training quant op after a given tensor in a loop control flow context. + :param preceeding_tensor: Preceeding tensor to insert the quant op after + :param quant_op_name: Name to give to the new quant op + :param op_mode: Starting mode to configure for the new quant op + :param quantizer_dict: dictionary of op and QuantizerInfo + :param quantizer_type : indicate param or activation quantizer + :param bit_width: bit-width to be used (output or param quantization bit-width), default set to 8. 
+ :param data_type: data type to use for quantizing all layer parameters + :return: None + """ + + # this handles cases such as conditional blocks that are defined in their own context + context_bk = updated_graph_flow_context_to_loop_context(self.session.graph, preceeding_tensor) + q_op_out = self._insert_param_quantizer_recurrent(inner_ops, preceeding_tensor, quant_op_name, op_mode, quantizer_dict, + quantizer_type, bit_width, data_type) + + # revert the context back to graph level from op context + set_graph_flow_context(self.session.graph, context_bk) + + return q_op_out + + # pylint: disable=too-many-locals + def _create_and_init_quant_op_input_vars(self, quant_op_name: str, quantizer_dict: Dict[str, QuantizerInfo], + quantizer_type, op_mode: libpymo.QuantizationMode, bit_width: int = 8, + data_type: QuantizationDataType = QuantizationDataType.int): + """ + creates input variables to Quantize op and initializes them + :param quant_op_name: quantize op name + :param quantizer_dict: dictionary of op and QuantizerInfo + :param quantizer_type: indicate param or activation quantizer + :param op_mode: Starting mode to configure for the new quant op + :param bit_width: bit-width to be used (output or param quantization bit-width), default set to 8 + :param data_type: data type to use for quantizing all layer parameters + :return: quant op input variables created + """ + with self.session.graph.as_default(): + op_mode_var = tf.Variable(int(op_mode), + name=quant_op_name + '_op_mode', trainable=False, + dtype=tf.int32) + + bit_width = tf.Variable(initial_value=bit_width, + name=quant_op_name + '_bit_width', + trainable=False, dtype=tf.int8) + + # Note: Later, is_symmetric_encoding value is to be read from config file + use_symmetric_encoding = tf.Variable(initial_value=False, + name=quant_op_name + '_use_symmetric_encoding', + trainable=False, dtype=tf.bool) + axis_handling = AxisHandling.LAST_AXIS + if quantizer_type == QuantizerType.param and self.per_channel_quantization_enabled: + tensor_quantizer, tensor_quant_ref, encoding_min, encoding_max, axis_handling = \ + self._create_per_channel_quantizers_and_encodings(quant_op_name) + else: + tensor_quantizer, tensor_quant_ref, \ + encoding_min, encoding_max = self._create_per_tensor_quantizers_and_encodings(quant_op_name) + + quantization_axis_handling = tf.Variable(initial_value=axis_handling.value, + name=quant_op_name + '_axis_handling', + trainable=False, dtype=tf.int32) + + is_int_data_type = tf.Variable(initial_value=(data_type == QuantizationDataType.int), + name=quant_op_name + '_data_type', trainable=False, dtype=tf.bool) + + # Add to quantizer dict + quantizer_info = QuantizerInfo(self.session, tensor_quantizer, quant_op_name, quantizer_type, + self._quant_scheme, self.per_channel_quantization_enabled, axis_handling) + quantizer_dict[quant_op_name] = quantizer_info + + self.session.run([op_mode_var.initializer, tensor_quant_ref.initializer, encoding_min.initializer, + encoding_max.initializer, bit_width.initializer, use_symmetric_encoding.initializer, + quantization_axis_handling.initializer, is_int_data_type.initializer]) + + return op_mode_var, tensor_quant_ref, encoding_min, encoding_max, bit_width, use_symmetric_encoding, \ + quantization_axis_handling, is_int_data_type + + def _create_per_channel_quantizers_and_encodings(self, quant_op_name: str) -> \ + Tuple[List[libpymo.TensorQuantizer], tf.Variable, tf.Variable, tf.Variable, AxisHandling]: + """ + Creates per channel quantizers and encoding min max variables + :param 
quant_op_name: Name of quantization op with parameter to create per channel quantizers for + :return: Tensor quantizers, variable with quantizer pointer, encoding min variable, encoding max variable, and + axis handling enum + """ + num_output_channels, axis_handling = self._op_name_to_output_channels_axis_handling_dict[quant_op_name] + tensor_quantizer_int64 = [None] * num_output_channels + tensor_quantizers = [None] * num_output_channels + # Create a tensor_quantizer per channel + for i in range(num_output_channels): + tensor_quantizer = libpymo.TensorQuantizer(quant_scheme_to_libpymo[self._quant_scheme], + libpymo.RoundingMode.ROUND_NEAREST) + + tensor_quantizers[i] = tensor_quantizer + val = libpymo.PtrToInt64(tensor_quantizer) + tensor_quantizer_int64[i] = val + + tensor_quant_ref = tf.Variable(tensor_quantizer_int64, name=quant_op_name + '_quant_ref', + trainable=False, dtype=tf.int64) + + encoding_min, encoding_max = self._create_encoding_min_max_vars(quant_op_name, + quantizer_type=QuantizerType.param) + + return tensor_quantizers, tensor_quant_ref, encoding_min, encoding_max, axis_handling + + def _create_per_tensor_quantizers_and_encodings(self, quant_op_name: str): + """ + Creates per tensor quantizers and encoding min max variables + """ + tensor_quantizer = libpymo.TensorQuantizer(quant_scheme_to_libpymo[self._quant_scheme], + libpymo.RoundingMode.ROUND_NEAREST) + tensor_quantizer_int64 = libpymo.PtrToInt64(tensor_quantizer) + tensor_quant_ref = tf.Variable(tensor_quantizer_int64, name=quant_op_name + '_quant_ref', + trainable=False, dtype=tf.int64) + encoding_min, encoding_max = self._create_encoding_min_max_vars(quant_op_name) + return tensor_quantizer, tensor_quant_ref, encoding_min, encoding_max + + def _insert_param_quantizer_recurrent(self, inner_ops, preceeding_tensor, quant_op_name: str, + op_mode: libpymo.QuantizationMode, + quantizer_dict: Dict[str, QuantizerInfo], quantizer_type: QuantizerType, + bit_width: int = 8, + data_type: QuantizationDataType = QuantizationDataType.int): + """ + Create and insert a post-training quant op after a given tensor + :param preceeding_tensor: Preceeding tensor to insert the quant op after + :param quant_op_name: Name to give to the new quant op + :param op_mode: Starting mode to configure for the new quant op + :param quantizer_dict: dictionary of op and QuantizerInfo + :param quantizer_type : indicate param or activation quantizer + :param bit_width : bit-width to be used (output or param quantization bit-width), default set to 8 + :param data_type: data type to use for quantizing all layer parameters + :return: None + """ + # pylint: disable=too-many-locals + # Create variables for op_mode, tensor_quantizer_reference, encoding_min, encoding_max, bitwidth and + # is_symmetric_encoding flag + # (so we can change these in the future, if needed) + + op_mode_var, tensor_quant_ref, encoding_min, encoding_max, bit_width, use_symmetric_encoding, _, _ = \ + self._create_and_init_quant_op_input_vars(quant_op_name, quantizer_dict, quantizer_type, op_mode, + bit_width, data_type) + + # extract loop cond bool variable + time_step_tensor = get_time_steps_tensor_from_rnn_inner_ops(inner_ops) + + # CPU device assignment for QcQuantize op + q_op_out = self._create_and_place_recurrent_param_quantize_op(quant_op_name, preceeding_tensor, + op_mode_var, + tensor_quant_ref, + encoding_min, + encoding_max, + bit_width, + use_symmetric_encoding, + time_step_tensor) + + return q_op_out + + def _insert_post_training_quant_op(self, preceeding_tensor, 
quant_op_name: str, op_mode: libpymo.QuantizationMode, + quantizer_dict: Dict[str, QuantizerInfo], quantizer_type: QuantizerType, + bit_width: int = 8, data_type: QuantizationDataType = QuantizationDataType.int): + """ + Create and insert a post-training quant op after a given tensor + :param preceeding_tensor: Preceeding tensor to insert the quant op after + :param quant_op_name: Name to give to the new quant op + :param op_mode: Starting mode to configure for the new quant op + :param quantizer_dict: dictionary of op and QuantizerInfo + :param quantizer_type : indicate param or activation quantizer + :param bit_width : bit-width to be used (output or param quantization bit-width), default set to 8. + :param data_type: data type to use for quantizing the op + :return: None + """ + # pylint: disable=too-many-locals + # Create variables for op_mode, tensor_quantizer_reference, encoding_min, encoding_max, bitwidth and + # is_symmetric_encoding flag + # (so we can change these in the future, if needed) + + op_mode_var, tensor_quant_ref, encoding_min, encoding_max, bit_width, use_symmetric_encoding, \ + quantization_axis_handling, is_int_data_type = self._create_and_init_quant_op_input_vars(quant_op_name, + quantizer_dict, + quantizer_type, + op_mode, + bit_width, data_type) + + # CPU device assignment for QcQuantize op + q_op_out = self._create_and_place_quantize_op(quant_op_name, preceeding_tensor, op_mode_var, tensor_quant_ref, + encoding_min, encoding_max, bit_width, use_symmetric_encoding, + quantizer_type, quantization_axis_handling, is_int_data_type) + + return q_op_out + + def _create_and_place_quantize_op(self, quant_op_name: str, preceeding_tensor, + op_mode_var: tf.Variable, tensor_quant_ref: tf.Variable, + encoding_min: tf.Variable, encoding_max: tf.Variable, bit_width: tf.Variable, + use_symmetric_encoding: tf.Variable, quantizer_type: QuantizerType, + quantization_axis_handling: tf.Variable, is_int_data_type: tf.Variable): + """ + Create a QcQuantize op and place it on CPU/CPU and with the right custom-gradient function registered + """ + # pylint: disable=too-many-arguments + + def create_quantize_op(): + if self.per_channel_quantization_enabled and quantizer_type == QuantizerType.param: + + is_training = tf.keras.backend.learning_phase() + + op = qcops.qc_quantize_per_channel(name=quant_op_name, in_tensor=preceeding_tensor, + op_mode=op_mode_var, + tensor_quantizer_reference=tensor_quant_ref, + encoding_min=encoding_min, + encoding_max=encoding_max, + bit_width=bit_width, + is_int_data_type=is_int_data_type, + use_symmetric_encoding=use_symmetric_encoding, + axis_handling=quantization_axis_handling, is_training=is_training) + else: + op = qcops.qc_quantize(name=quant_op_name, in_tensor=preceeding_tensor, + op_mode=op_mode_var, + tensor_quantizer_reference=tensor_quant_ref, + encoding_min=encoding_min, encoding_max=encoding_max, + bit_width=bit_width, + use_symmetric_encoding=use_symmetric_encoding, + is_int_data_type=is_int_data_type) + + return op + + if not self._use_cuda: + with tf.device('/cpu:0'): + if self._quant_scheme in [QuantScheme.training_range_learning_with_tf_init, + QuantScheme.training_range_learning_with_tf_enhanced_init]: + with self.session.graph.gradient_override_map( + {"QcQuantize": "QcQuantizeRangeLearningCustomGradient", + "QcQuantizePerChannel": "QcQuantizePerChannelRangeLearningCustomGradient"}): + q_op_out = create_quantize_op() + else: + q_op_out = create_quantize_op() + + # GPU device assignment for QcQuantize op + else: + if self._quant_scheme in 
[QuantScheme.training_range_learning_with_tf_init, + QuantScheme.training_range_learning_with_tf_enhanced_init]: + with self.session.graph.gradient_override_map( + {"QcQuantize": "QcQuantizeRangeLearningCustomGradient", + "QcQuantizePerChannel": "QcQuantizePerChannelRangeLearningCustomGradient"}): + q_op_out = create_quantize_op() + else: + q_op_out = create_quantize_op() + + return q_op_out + + def _create_and_place_recurrent_param_quantize_op(self, quant_op_name: str, preceeding_tensor, + op_mode_var: tf.Variable, tensor_quant_ref: tf.Variable, + encoding_min: tf.Variable, encoding_max: tf.Variable, + bit_width: tf.Variable, + use_symmetric_encoding: tf.Variable, time_steps): + def create_recurrent_param_quantize_op(): + op = qcops.qc_quantize_recurrent_param(name=quant_op_name, in_tensor=preceeding_tensor, + op_mode=op_mode_var, tensor_quantizer_reference=tensor_quant_ref, + encoding_min=encoding_min, encoding_max=encoding_max, + bit_width=bit_width, use_symmetric_encoding=use_symmetric_encoding, + time_steps=time_steps) + return op + + if not self._use_cuda: + with tf.device('/cpu:0'): + q_op_out = create_recurrent_param_quantize_op() + + # GPU device assignment for QcQuantize op + else: + q_op_out = create_recurrent_param_quantize_op() + + return q_op_out + + @staticmethod + def _is_op_transformer_mask(quant_op_name: str) -> bool: + """ + Check if quant_op_name is transformer mask add op + :param quant_op_name: op name to check + :return: True if quant_op_name belongs to transformer mask add op + """ + for supported_mask in transformer_utils.SUPPORTED_ATTENTION_MASK_OVERRIDE: + if quant_op_name.endswith(supported_mask + '_quantized'): + return True + return False + + def _override_quant_config_for_transformer_mask_add(self): + """ + Find transformer mask add op and change bitwidth to 16 and quant_scheme to tf + """ + for quant_op_name, quantizer_info in self._activation_quantizers.items(): + if self._is_op_transformer_mask(quant_op_name) and quantizer_info.data_type == QuantizationDataType.int: + quantizer_info.bitwidth = 16 + quantizer_info.quant_scheme = QuantScheme.post_training_tf + + def _clamp_transformer_attention_mask_encoding(self): + """ + Clamp the quantizer encoding min associated with mask adder op within an attention head. + """ + for quant_op_name, quantizer_info in self._activation_quantizers.items(): + if self._is_op_transformer_mask(quant_op_name) and quantizer_info.enabled \ + and quantizer_info.data_type == QuantizationDataType.int: + encoding = quantizer_info.get_encoding() + encoding.min = max(encoding.min, transformer_utils.MASK_OVERRIDE_VALUE) + + clamped_encoding = recompute_grid_params(encoding, self._default_output_bw, + quantizer_info.use_symmetric_encoding) + quantizer_info.bitwidth = self._default_output_bw + quantizer_info.quant_scheme = self._quant_scheme + quantizer_info.set_encoding(clamped_encoding) + quantizer_info.freeze_encoding()
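+
+# Usage sketch (illustrative, not part of the library source): re-applying previously exported encodings
+# to a QuantizationSimModel. set_and_freeze_param_encodings() consumes a parameter-only encodings file
+# (for example, one produced by AdaRound), while load_encodings_to_sim() restores both parameter and
+# activation encodings from a full export. The file paths below are placeholders.
+#
+#     sim.set_and_freeze_param_encodings(encoding_path='/tmp/adaround/model_param.encodings')
+#     # ...or, to restore a complete set of encodings:
+#     sim.load_encodings_to_sim(encoding_path='/tmp/quantsim_export/model_quantized.encodings')
+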
+ + +# load and save utilities +def update_tensor_quantizer_references(quant_sim_sess: tf.compat.v1.Session, quantizer_dict: Dict[str, QuantizerInfo]): + """ + updates the param / activation quant ops in the passed-in session with new tensor quantizer references. + :param quant_sim_sess: tensorflow session held by quantsim object + :param quantizer_dict: dictionary with quant ops and associated quantizer info + :return: None, updates passed-in session quant ops with new tensor quantizer references. + """ + + vars_with_value = {} + for q_op_name in quantizer_dict: + # also update the session held by tensor quantizer object + quantizer_dict[q_op_name].session = quant_sim_sess + # For per channel quantization of parameters + tensor_quantizers = quantizer_dict[q_op_name].tensor_quantizer + tensor_quantizer_ref = [] + if isinstance(tensor_quantizers, list): + for tensor_quantizer in tensor_quantizers: + ptr_to_int64_val = libpymo.PtrToInt64(tensor_quantizer) + tensor_quantizer_ref.append(ptr_to_int64_val) + else: + ptr_to_int64_val = libpymo.PtrToInt64(tensor_quantizers) + tensor_quantizer_ref.append(ptr_to_int64_val) + tensor_quantizer_ref = tensor_quantizer_ref[0] + vars_with_value[q_op_name + '_quant_ref'] = tensor_quantizer_ref + + update_variables_with_values(quant_sim_sess, vars_with_value) + + +def save_checkpoint(quantsim: QuantizationSimModel, meta_path: str, file_name_prefix: str): + """ + Saves a checkpoint of the QuantSim model which can be loaded at a later point to continue fine-tuning. + See also load_checkpoint(). + + :param quantsim: QuantizationSimModel to be saved + :param meta_path: path to save the meta file + :param file_name_prefix: filename prefix string + """ + if not os.path.exists(meta_path): + os.mkdir(meta_path) + + save_path = os.path.join(meta_path, file_name_prefix) + + # save the model with quant ops + utils.graph_saver.save_model_to_meta(quantsim.session, save_path) + + # save info in the quantsim object + save_data_to_pickle_file(quantsim, meta_path, 'orig_quantsim_config') + + +def load_checkpoint(meta_path: str, file_name_prefix: str) -> QuantizationSimModel: + """ + Loads QuantSim model from saved checkpoint and pickle files. 
+ + :param meta_path: to load meta from + :param file_name_prefix: filename prefix string + :return: returns new QuantSim object + """ + #pylint: disable=protected-access + + # load saved session with quant ops + new_sess = utils.graph_saver.load_model_from_meta(meta_path=str(meta_path + '/' + file_name_prefix + '.meta')) + + # load quant sim model object with params from saved pickle data + new_quant_sim = load_data_from_pickle_file(meta_path + '/orig_quantsim_config') + + # set session for the new quantsim object + new_quant_sim.session = new_sess + + # update tensor references in the new quantsim object + update_tensor_quantizer_references(new_sess, new_quant_sim._param_quantizers) + update_tensor_quantizer_references(new_sess, new_quant_sim._activation_quantizers) + + return new_quant_sim + + +def check_accumulator_overflow(sess: tf.compat.v1.Session, quant_bw: int, accum_bw: int): + """ + Checks for any potential for accumulator overflow across all the layers of the given model + :param sess: Tensorflow session + :param quant_bw: Bitwidth the layers are quantized at + :param accum_bw: Bitwidth of the accumulator + :return: Name of the layer with the most accumulator range used and range used + """ + + most_accum_range_used = 0 + most_accum_range_used_layer = None + + for op in sess.graph.get_operations(): + if op.type == 'Conv2D': + weights = utils.op.conv.WeightTensorUtils.get_tensor_as_numpy_data(sess, op) + weights = np.transpose(weights, (3, 2, 0, 1)) # Reshape from HWIO to OIHW + was_accum_range_exceeded, accum_range_used = get_conv_accum_bounds(weights, quant_bw, accum_bw) + if accum_range_used > most_accum_range_used: + most_accum_range_used = accum_range_used + most_accum_range_used_layer = op.name + + if was_accum_range_exceeded: + _logger.info('Possible accumulator overflow for layer: %s', op.name) + + if most_accum_range_used < 1: + _logger.info('No overflow detected. Layer %s had the most accumulator range used: %f%%', + most_accum_range_used_layer, most_accum_range_used * 100) + else: + _logger.info('Overflow detected. Layer %s had the most accumulator range used: %f%%', + most_accum_range_used_layer, most_accum_range_used * 100) + + return most_accum_range_used_layer, most_accum_range_used +
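+
+# Illustrative usage sketch (not part of this module). It assumes an existing
+# QuantizationSimModel instance named `sim`; the checkpoint directory and file
+# name prefix below are placeholders chosen for the example.
+#
+#   save_checkpoint(sim, meta_path='/tmp/quantsim_ckpt', file_name_prefix='model_quantsim')
+#   ...
+#   sim_restored = load_checkpoint(meta_path='/tmp/quantsim_ckpt', file_name_prefix='model_quantsim')
+#
+#   # Optionally, report the layer that uses the most accumulator range when
+#   # weights are quantized to 8 bits and accumulated in 32 bits:
+#   layer_name, range_used = check_accumulator_overflow(sim_restored.session, quant_bw=8, accum_bw=32)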
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/svd.html b/releases/1.32.2/_modules/aimet_tensorflow/svd.html new file mode 100644 index 00000000..a4ca8633 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/svd.html @@ -0,0 +1,2176 @@ + + + + + + aimet_tensorflow.svd — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
Source code for aimet_tensorflow.svd

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2017-2018, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+# pylint: disable=too-many-lines
+
+""" Implementation of the SVD model compression technique for TensorFlow """
+
+import os
+from functools import reduce
+import operator
+from enum import Enum
+import numpy as np
+import tensorflow as tf
+
+from aimet_tensorflow import graph_editor
+from aimet_tensorflow.common import core, graph_eval
+import aimet_common.libpymo as pymo
+from aimet_common import statistics_util as stats_u
+from aimet_common.utils import AimetLogger
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Svd)
+
+_SVD_TYPES = {'svd': pymo.TYPE_SINGLE,
+              'ssvd': pymo.TYPE_SUCCESSIVE}
+_SVD_LAYER_TYPES = {'Conv2D': pymo.LAYER_TYPE_CONV,
+                    'MatMul': pymo.LAYER_TYPE_FC}
+
+_MIN_LAYER_DIM_FOR_SVD = 10
+_SVD_SUPPORTED_LAYER_TYPES = ['Conv2D', 'MatMul']
+
+
+class CostMetric(Enum):
+    """ Enumeration of metrics to measure cost of a model/layer """
+    mac = 1
+    memory = 2
+
+
+class LayerAttributes:
+    """ Holds attributes for a given layer """
+
+    def __init__(self, layer_ref, cost, weight_shape):
+        """
+        Constructor
+        :param layer_ref: Reference to the layer object in TensorFlow
+        :param cost: Cost of the layer
+        :param weight_shape: Shape of the layer's weight tensor
+        """
+        self.layer_ref = layer_ref
+        self.cost = cost
+        self.weight_shape = weight_shape
+
+
+
[docs]class Svd: + """A class for performing singular value decomposition on a tensorflow model. + + The Svd class enables model compression through singular value decomposition (SVD). + It can analyze convolution and fully connected layers and perform + some analysis to find the optimal ranks for balancing compression and the + accuracy of the network. + """ + # pylint: disable=too-many-instance-attributes + + def __init__(self, graph, checkpoint, metric, output_file='./svd_graph', svd_type='svd', + num_layers=0, layers=None, layer_ranks=None, num_ranks=20, gpu=True, debug=False, no_evaluation=False, + layer_selection_threshold=0.6): + """ + Constructor for the Svd class + + Constructs the Svd class from a set of options passed in at construction. The class takes + a number of named arguments which are detailed below. + + :param graph: The file path to the meta graph. + :param checkpoint: The file path to the tensorflow checkpoint file. + :param metric: The metric to use for determining the optimal compression. Either + 'mac' for optimizing compression to minimize multiplies and accumulates or 'memory' which + optimizes for overall memory footprint. Defaults to 'memory' + :param output_file: The file path for saving the compressed tensorflow graph. + aimet will save to the directory specified, using output_file as a filename prefix + :param svd_type: Indicates which algorithm should be used, either + 'svd' or 'ssvd'. Defaults to 'svd'. + :param num_layers: The number of layers to compress. Defaults to '0' which uses a + heuristic to determine the optimal number of layers to compress. + :param layers: A list of op names to compress. All other layers will be ignored. + Overrides num_layers and sets it to the length of this list. + :param layer_ranks: required only if no_evaluation is set to True. A list of tuples to compress + layers specified in layers argument. + :param num_ranks: The number of ranks (compression_points) to evaluate for compression. + Defaults to 20. Value should be greater than 2. + :param gpu: Indicates if the algorithm should run on GPU or CPU. Defaults to GPU. To + use CPU set to false + :param debug: If true debug messages will be printed. Defaults to False. + :param no_evaluation: If true, ranks will be set manually from user. Defaults to False. + :param layer_selection_threshold: Threshold (0-1) to use to select the top layers in the network + + :raises: ValueError: An error occurred processing one of the input parameters. 
+ """ + # pylint: disable=too-many-arguments + + self._sanity_check_constructor_parameters(layer_selection_threshold, layers, no_evaluation, num_layers, + num_ranks, svd_type) + + self._gpu = gpu + self._debug = debug + self._default_meta_graph = graph + self._default_checkpoint = checkpoint + self._output_file = output_file + self._output_dir = os.path.dirname(output_file) + if not os.path.exists(self._output_dir): + os.makedirs(self._output_dir) + logger.info('Saving SVD output as: %s', output_file) + + self.svd_type = _SVD_TYPES[svd_type] + + self._metric = metric + + self._num_layers = num_layers + + self._selected_layers = [] + self._networkCost = None + + if layers: + logger.debug('Attempting to compress: %s', layers) + self._layers_to_compress = layers + else: + self._layers_to_compress = [] + + if num_ranks < 0: + raise ValueError("num_ranks must be >= 0") + self._num_ranks = num_ranks + + if layer_ranks: + self._layer_ranks = layer_ranks + self._num_layer_ranks = len(layer_ranks) + logger.debug('Attempting to compress model with user provided ranks : %s', layer_ranks) + + # Setup the SVD instance and load the graph + self._svd = pymo.GetSVDInstance() + self._no_eval = no_evaluation + self._layer_selection_threshold = layer_selection_threshold + self._model_performance_candidate_ranks = list() + + # Todo: Need to look at these attributes and see how to handle them better + # Very likely these attributes don't need to be object attributes + self._generator = None + self._eval_names = None + self._eval_func = None + self._iterations = None + self._run_graph = None + self._baseline_perf = None + self._error_margin = None + self._compressible_ops = None + + @staticmethod + def _sanity_check_constructor_parameters(layer_selection_threshold, layers, no_evaluation, num_layers, + num_ranks, svd_type): + if svd_type not in _SVD_TYPES: + raise ValueError('Invalid SVD mode: ' + svd_type) + if no_evaluation: + if not layers: + raise ValueError('Both layers and layer_rank parameters are needed for Manual mode') + if layer_selection_threshold < 0 or layer_selection_threshold > 1: + raise ValueError('Layer selection threshold should be between 0 and 1') + if not no_evaluation: + if num_ranks <= 2: + raise ValueError('Number of ranks should be greater than 2 for auto mode') + if num_layers < 0: + raise ValueError("num_layers must be >= 0") + + def _compute_per_layer_compression_ratio(self, split_layers_shape, output_shape, original_layer_shape, op_type): + """ + Updates the per layer statistics + + :param orig_layer: The layer before it was split + :param split_layers: List of split layers + :return: The compression ratio of split layers + """ + orig_layer_cost = self._compute_layer_cost(original_layer_shape, output_shape, op_type) + + split_layers_mem_cost = 0 + split_layers_mac_cost = 0 + + for layer_shape in split_layers_shape: + mem_cost, mac_cost = self._compute_layer_cost(layer_shape, output_shape, op_type) + if not isinstance(mem_cost, int): + mem_cost = mem_cost.value + if not isinstance(mac_cost, int): + mac_cost = mac_cost.value + split_layers_mem_cost += mem_cost + split_layers_mac_cost += mac_cost + + if self._metric is CostMetric.memory: + savings = orig_layer_cost[0] - split_layers_mem_cost + ratio = savings / orig_layer_cost[0] + logger.debug('Original Layer Cost: %s Memory Compression Ratio: %s', orig_layer_cost[0], ratio) + else: + savings = orig_layer_cost[1] - split_layers_mac_cost + ratio = savings / orig_layer_cost[1] + logger.debug('Original Layer Cost: %s MAC Compression 
Ratio: %s', orig_layer_cost[1], ratio) + + return ratio + + @staticmethod + def _reset_session(sess): + """ + Reset the given tf.compat.v1.Session + :param sess: tf.compat.v1.Session + :return: None + """ + tf.compat.v1.reset_default_graph() + sess.close() + + @staticmethod + def _load_graph(graph, meta_graph, checkpoint): + """ + Load a graph and checkpoint and create a new tf.compat.v1.Session + :param graph: TF graph + :param meta_graph: Meta file + :param checkpoint: Checkpoint file + :return: Newly created session + """ + logger.info('Loading graph: %s', meta_graph) + sess = tf.compat.v1.Session(graph=graph) + + # Open the graph and retore the parameters + saver = tf.compat.v1.train.import_meta_graph(meta_graph) + saver.restore(sess, checkpoint) + return sess, saver + + @staticmethod + def _get_layer_type(op): + """ + Converts TF layer types into corresponding PyMo layer enumerated values + :param op: TF op + :return: PyMo enumerated value corresponding to the type of op + """ + if op.type in _SVD_LAYER_TYPES: + return _SVD_LAYER_TYPES[op.type] + return pymo.LAYER_TYPE_OTHER + + class LayerSelectionScheme(Enum): + """ Enumeration of schemes supported to select layers for SVD compression """ + manual = 1 + top_n_layers = 2 + top_x_percent = 3 + + @staticmethod + def _pick_compression_layers(sess, cost_metric, layer_select_scheme, **kwargs): + """ + Pick layers for SVD compression given parameters + :param sess: tf.compat.v1.Session + :param cost_metric: Metric to use for evaluating layer cost (either in terms of memory or mac) + :param layer_select_scheme: Layer selection scheme to use + :param kwargs: Keyword arguments that depend on which layer selection scheme is specified + top_n_layers:: num_layers: Number of layers to pick + top_x_percent:: percent_thresh: Top layers up to this parameter will be selected + manual:: layers_to_compress: List of layers (names) to compress + :return: + """ + # pylint: disable=too-many-locals,too-many-branches + + if not isinstance(cost_metric, CostMetric): + raise TypeError("cost_metric is not of type CostMetric") + + if not isinstance(layer_select_scheme, Svd.LayerSelectionScheme): + raise TypeError("layer_selection_scheme is not of type Svd.LayerSelectionScheme") + + # Find all compressible ops + query = core.OpQuery(sess.graph) + compressible_ops = query.get_weight_ops() + compressible_ops = [op for op in compressible_ops if op.type in _SVD_SUPPORTED_LAYER_TYPES] + + layer_attributes_list = Svd._create_layer_attributes_list(compressible_ops, sess) + network_cost = Svd._compute_network_cost(layer_attributes_list) + + # Heuristic1: Reject any ops whose param shape does not meet a base criterion + pruned_list = [] + for layer_attributes in layer_attributes_list: + h, w, n, c = layer_attributes.weight_shape + if (n >= _MIN_LAYER_DIM_FOR_SVD) and ((c * h * w) >= _MIN_LAYER_DIM_FOR_SVD): + pruned_list.append(layer_attributes) + else: + print("Pruning out {}: shape is {}".format(layer_attributes.layer_ref.name, + layer_attributes.weight_shape)) + + # Reset layer_attributes_list for the next phase + layer_attributes_list = pruned_list + pruned_list = [] + + # Sort the attribute list based on cost + if cost_metric == CostMetric.memory: + layer_attributes_list.sort(key=lambda x: x.cost[0], reverse=True) + else: + layer_attributes_list.sort(key=lambda x: x.cost[1], reverse=True) + + if layer_select_scheme == Svd.LayerSelectionScheme.top_n_layers: + num_layers = kwargs['num_layers'] + pruned_list = layer_attributes_list[:num_layers] + + elif 
layer_select_scheme == Svd.LayerSelectionScheme.top_x_percent: + percent_thresh = kwargs['percent_thresh'] + accum_cost = 0. + total_cost = network_cost[0] if (cost_metric == CostMetric.memory) else network_cost[1] + + for layer in layer_attributes_list: + cost = layer.cost[0] if (cost_metric == CostMetric.memory) else layer.cost[1] + + if (100 * (cost + accum_cost)/total_cost) < percent_thresh: + pruned_list.append(layer) + accum_cost += cost + + elif layer_select_scheme == Svd.LayerSelectionScheme.manual: + layers_to_compress = kwargs['layers_to_compress'] + for layer in layer_attributes_list: + if layer.layer_ref.name in layers_to_compress: + pruned_list.append(layer) + + if not pruned_list: + raise RuntimeError('No suitable layers found in the model.') + return pruned_list, network_cost + + + @staticmethod + def _create_layer_attributes_list(ops_to_use, sess): + """ + Creates list of layer attributes given a set of TF ops + :param ops_to_use: TF ops to collect layer attributes for + :param sess: tf.compat.v1.Session to use + :return: Created list of layer attributes + """ + query = core.OpQuery(sess.graph) + layer_attributes_list = [] + for op in ops_to_use: + + weight_shape = query.get_weights_for_op(op).eval(session=sess).shape + if op.type == 'MatMul': + n, c = weight_shape + weight_shape = (1, 1, n, c) + output_dims = op.outputs[0].shape + + cost = Svd._compute_layer_cost(weight_shape, output_dims, op.type) + + + layer_attributes_list.append(LayerAttributes(op, cost, weight_shape)) + + return layer_attributes_list + + @staticmethod + def _compute_network_cost(layer_attributes_list): + """ + Compute aggregate cost of the layers included in the layer attributes list + :param layer_attributes_list: List of layer attributes + :return: Computed cost + """ + mac_cost = 0 + mem_cost = 0 + for layer_attributes in layer_attributes_list: + op_mem_cost, op_mac_cost = layer_attributes.cost + mem_cost += op_mem_cost + mac_cost += op_mac_cost + + return mem_cost, mac_cost + + @staticmethod + def _compute_layer_cost(weights_shape, output_dims, op_type): + """ + Compute cost of a layer + :param weights_shape: Shape of the weights of this layer + :param output_dims: Shape of the output of this layer + :param op_type: Type of this TF op + :return: Computed layer cost + """ + # for outputs, TF uses dims [N,H,W,C] + mem_cost = reduce(operator.mul, weights_shape) + + if op_type == 'Conv2D': + mac_cost = mem_cost * int(output_dims[1]) * int(output_dims[2]) + elif op_type == 'MatMul': + mac_cost = mem_cost + + return mem_cost, mac_cost + + def _compute_compression_ratio(self, sess, cost_metric): + """ + Compute compression ratio + :param sess: tf.compat.v1.Session + :return: Computed compression ratio + """ + query = core.OpQuery(sess.graph) + compressible_ops = query.get_weight_ops() + compressible_ops = [op for op in compressible_ops if op.type in _SVD_SUPPORTED_LAYER_TYPES] + + layer_attributes_list = Svd._create_layer_attributes_list(compressible_ops, sess) + selected_layers_ops = [layer.layer_ref.name for layer in self._selected_layers] + layer_attributes_list = [layer for layer in layer_attributes_list if layer.layer_ref.name not in selected_layers_ops] + compressed_network_cost = Svd._compute_network_cost(layer_attributes_list) + + if cost_metric is CostMetric.memory: + savings = self._networkCost[0] - compressed_network_cost[0] + ratio = savings/self._networkCost[0] + + else: + savings = self._networkCost[1] - compressed_network_cost[1] + ratio = savings/self._networkCost[1] + + return ratio + + 
def _store_net_stats(self, sess): + """ + Store layer attributes in the PyMo library instance + :param sess: tf.compat.v1.Session + :return: None + """ + # pylint: disable=too-many-locals,too-many-branches,too-many-statements + + if self._metric == CostMetric.memory: + pymo_metric = pymo.COST_TYPE_MEMORY + else: + pymo_metric = pymo.COST_TYPE_MAC + + self._svd.SetCostMetric(pymo_metric) + + # Layer-selection + if self._layers_to_compress: + selected_layers, network_cost = self._pick_compression_layers(sess, + self._metric, + self.LayerSelectionScheme.manual, + layers_to_compress=self._layers_to_compress) + elif self._num_layers > 0: + selected_layers, network_cost = self._pick_compression_layers(sess, + self._metric, + self.LayerSelectionScheme.top_n_layers, + num_layers=self._num_layers) + else: + percent_thresh = self._layer_selection_threshold * 100 + selected_layers, network_cost = self._pick_compression_layers(sess, + self._metric, + self.LayerSelectionScheme.top_x_percent, + percent_thresh=percent_thresh) + + self._networkCost = network_cost + + print("Selected Layers:") + for layer in selected_layers: + print(layer.layer_ref.name) + + self._selected_layers = selected_layers + + # Get the op query module and query for all Conv/FC layers + query = core.OpQuery(sess.graph) + self._compressible_ops = query.get_weight_ops() + + # Set up the layer attributes for each Conv/FC layer (this also checks for trailing + # bias adds + for i, op in enumerate(self._compressible_ops): + + # If op is not a selected layer, skip + if not any(op is layer.layer_ref for layer in selected_layers): + continue + + attr = pymo.LayerAttributes() + layerName = op.name + output_dims = op.outputs[0].shape # TF uses dims [N,H,W,C] + attr.layerType = self._get_layer_type(op) + if self.svd_type == pymo.TYPE_SINGLE: + attr.mode = self._svd.GetCompressionType(attr.layerType, 'single') + else: + attr.mode = self._svd.GetCompressionType(attr.layerType, 'successive') + + if op.type == 'Conv2D' or op.type == 'MatMul': + logger.info('Setting layer attributes for: %s', layerName+'('+op.type+')') + + # Get weights + weights = query.get_weights_for_op(op).eval(session=sess) + w_shape = weights.shape + logger.debug('Got weight shape: %s', w_shape) + + # Check for bias op + bias = None + if (i+1) < len(self._compressible_ops): + bias = query.get_bias_for_op(self._compressible_ops[i+1]) + if bias is not None: + bias = bias.eval(session=sess) + logger.debug('Got %s w/bias. Shape: %s', op.type, str(bias.shape)) + + if op.type == 'Conv2D': + attr.shape = [w_shape[3], w_shape[2], w_shape[0], w_shape[1]] # TF Conv weight order [KH,KW,ID,OD] + attr.activation_dims = (output_dims[1], output_dims[2]) # (H,W) + + # CONV weights are stored in the order {H,W,I,O} in Tensorflow + # Re-order them to the form {O,I,H,W} + weights = np.transpose(weights, (3, 2, 0, 1)) + + elif op.type == 'MatMul': + attr.shape = [w_shape[1], w_shape[0], 1, 1] # TF FC weight order [ID,OD], SVD expects [OD,ID] + attr.activation_dims = (1, 1) + weights = np.transpose(weights, (1, 0)) + + # blobs is a numpy array... 
add to list then set + params = [weights.flatten()] + if bias is not None: + params.append(bias.flatten()) + attr.blobs = params + + # Save the attributes for this layer + self._svd.StoreLayerAttributes(layerName, attr) + + def _compute_objective_score(self, model_perf, compression_score): + """ + Compute objective score of a given compression model + :param model_perf: Performance of compressed model + :param compression_score: Compression ratio + :return: Computed objective score + """ + if model_perf + (self._error_margin / 100) >= self._baseline_perf: + objective_score = 1 - model_perf + (1 - compression_score) + else: + objective_score = 1 + (1 - compression_score) # treat lower accuracies as 0 + + return objective_score + + def _split_conv_layer(self, sess, svd_ranks, attr, op_name, bias_op_name=None): + """ + Split a given conv layer given a rank + :param sess: tf.compat.v1.Session + :param svd_ranks: Rank to split the layer with (two ranks in case of SSVD) + :param attr: Reference to the corresponding layer attribute + :param op_name: Name of the op to split + :param bias_op_name: Name of the corresponding bias op (if any) + :return: None + """ + # pylint: disable=too-many-statements,too-many-branches,too-many-locals + + logger.info('Splitting conv op: %s', op_name) + + # Retrieve the op(s) from the current graph + op = sess.graph.get_operation_by_name(op_name) + + bias_op = None + if bias_op_name: + bias_op = sess.graph.get_operation_by_name(bias_op_name) + + # Create new 'conv_a' layer + pad_mode = op.get_attr('padding') + data_format = op.get_attr('data_format').decode('utf-8') + strides = op.get_attr('strides') + + # Print current conv weight shape + query = core.OpQuery(sess.graph) + w_shape = query.get_weights_for_op(op).get_shape().as_list() + logger.debug('Original %s weight shape: %s', op.name, str(w_shape)) + split_weights, weight_sizes = [], [] + split_biases, bias_sizes = [], [] + + # TF weights are in [H,W,I,O] order. 
We must reshape the split weights to SVD format [O,I,H,W] + # and then transpose back + # Conv a weights are: [1, 1, w_shape[2], svd_ranks[0]] + split_conv_a_w_shape = (svd_ranks[0], w_shape[2], 1, 1) + conv_a_weights = np.zeros(split_conv_a_w_shape) # transpose(2,3,1,0) + split_weights.append(conv_a_weights.flatten().tolist()) + weight_sizes.append(conv_a_weights.size) + if bias_op: + conv_a_bias = np.zeros(svd_ranks[0]) + split_biases.append(conv_a_bias.flatten().tolist()) + bias_sizes.append(conv_a_bias.size) + + num_filters = w_shape[3] + if len(svd_ranks) >= 2 and attr.mode == pymo.TYPE_SUCCESSIVE: + # Output channels = output_rank (s) + num_filters = svd_ranks[1] + + # Conv b weights are: [w_shape[0],w_shape[1],svd_ranks[0],num_filters] + split_conv_b_w_shape = (num_filters, svd_ranks[0], w_shape[0], w_shape[1]) + conv_b_weights = np.zeros(split_conv_b_w_shape) + conv_b_bias = np.zeros(num_filters) + split_weights.append(conv_b_weights.flatten().tolist()) + weight_sizes.append(conv_b_weights.size) + if bias_op: + split_biases.append(conv_b_bias.flatten().tolist()) + bias_sizes.append(conv_b_bias.size) + + # Only create a third conv layer when performing successive SVD + if len(svd_ranks) >= 2 and attr.mode == pymo.TYPE_SUCCESSIVE: + # Conv c weights are: [1,1,num_filters,w_shape[3]] + split_conv_c_w_shape = (w_shape[3], num_filters, 1, 1) + conv_c_weights = np.zeros(split_conv_c_w_shape) + conv_c_bias = np.zeros(w_shape[3]) + split_weights.append(conv_c_weights.flatten().tolist()) + weight_sizes.append(conv_c_weights.size) + if bias_op: + split_biases.append(conv_c_bias.flatten().tolist()) + bias_sizes.append(conv_c_bias.size) + + # Split the weights and biases according to the number of layers and ranks + split_weights = self._svd.SplitLayerWeights(op.name, split_weights, weight_sizes, svd_ranks) + split_biases = self._svd.SplitLayerBiases(op.name, split_biases, bias_sizes, svd_ranks) + if split_weights: + conv_a_name = op.name+'_a' + conv_a_weights = np.array(split_weights[0]).reshape(split_conv_a_w_shape).transpose(2, 3, 1, 0) + conv_a_w = tf.Variable(initial_value=conv_a_weights, name=conv_a_name+'_w', dtype=tf.float32) + logger.debug('%s weight shape: %s', conv_a_name, str(conv_a_weights.shape)) + + # Create conv_a using default strides (1,1) + # pylint: disable=no-member + conv_acts = tf.nn.conv2d(op.inputs[0], conv_a_w, strides=[1, 1, 1, 1], data_format=data_format, + padding=pad_mode, name=op.name+'_a') # dilation_rate=dilation_rate + if bias_op: + conv_a_bias = tf.Variable(initial_value=split_biases[0], name=conv_a_name+'_bias', dtype=tf.float32) + conv_acts = conv_acts + conv_a_bias # tf.nn.bias_add(conv_acts, split_biases[0]) + + if len(split_weights) > 1: + # Create conv_b + conv_b_name = op.name+'_b' + conv_b_weights = np.array(split_weights[1]).reshape(split_conv_b_w_shape).transpose(2, 3, 1, 0) + conv_b_w = tf.Variable(initial_value=conv_b_weights, name=conv_b_name+'_w', dtype=tf.float32) + logger.debug('%s weight shape: %s', conv_b_name, str(conv_b_weights.shape)) + + # pylint: disable=no-member + conv_acts = tf.nn.conv2d(conv_acts, conv_b_w, strides=strides, data_format=data_format, padding=pad_mode, name=conv_b_name) #dilation_rate=dilation_rate + if bias_op: + conv_b_bias = tf.Variable(initial_value=split_biases[1], name=conv_b_name+'_bias', dtype=tf.float32) + conv_acts = conv_acts + conv_b_bias # tf.nn.bias_add(conv_acts, split_biases[1]) + ratio = self._compute_per_layer_compression_ratio([conv_a_w.shape, conv_b_w.shape], conv_acts.shape, w_shape, "Conv2D") + # 
Only create a third conv layer when performing successive SVD + if len(split_weights) > 2 and len(svd_ranks) >= 2 and attr.mode == pymo.TYPE_SUCCESSIVE: + # Create conv_c, using default strides (1,1) + conv_c_name = op.name+'_c' + conv_c_weights = np.array(split_weights[2]).reshape(split_conv_c_w_shape).transpose(2, 3, 1, 0) + conv_c_w = tf.Variable(initial_value=conv_c_weights, name=conv_c_name+'_w', dtype=tf.float32) + logger.debug('%s weight shape: %s', conv_c_name, str(conv_c_weights.shape)) + + # pylint: disable=no-member + conv_acts = tf.nn.conv2d(conv_acts, conv_c_w, strides=[1, 1, 1, 1], data_format=data_format, + padding=pad_mode, name=conv_c_name) + if bias_op: + conv_c_bias = tf.Variable(initial_value=split_biases[2], name=conv_c_name+'_bias', dtype=tf.float32) + conv_acts = conv_acts + conv_c_bias # tf.nn.bias_add(conv_acts, split_biases[2]) + + consumers = [] + rerouted_inputs = [bias_op.outputs[0]] if bias_op else [op.outputs[0]] + for inp in rerouted_inputs: + for consumer in inp.consumers(): + consumers.append(consumer) + _ = graph_editor.reroute_ts(conv_acts, rerouted_inputs, can_modify=consumers) + + return ratio + + def _split_fc_layer(self, sess, svd_ranks, op_name, bias_op_name=None): + """ + Split a given conv layer given a rank + :param sess: tf.compat.v1.Session + :param svd_ranks: Rank to split the layer with (two ranks in case of SSVD) + :param op_name: Name of the op to split + :param bias_op_name: Name of the corresponding bias op (if any) + :return: None + """ + # pylint: disable=too-many-statements, too-many-locals + + logger.info('Splitting fully connected op: %s', op_name) + + # Retrieve the op(s) from the current graph + op = sess.graph.get_operation_by_name(op_name) + bias_op = None + if bias_op_name: + bias_op = sess.graph.get_operation_by_name(bias_op_name) + + # Print current conv weight shape + query = core.OpQuery(sess.graph) + w_shape = query.get_weights_for_op(op).get_shape().as_list() + logger.debug('Original %s weight shape: %s', op.name, str(w_shape)) + split_weights, weight_sizes = [], [] + split_biases, bias_sizes = [], [] + + # FC weights are: [w_shape[2],svd_ranks[0]] in [I,O] order. + # We must reshape the split weights to SVD format [O,I] and then transpose to NHWC + split_fc_a_w_shape = (svd_ranks[0], w_shape[0]) + fc_a_weights = np.zeros(split_fc_a_w_shape) + fc_a_bias = np.zeros(svd_ranks[0]) + split_weights.append(fc_a_weights.flatten().tolist()) + weight_sizes.append(fc_a_weights.size) + if bias_op: + split_biases.append(fc_a_bias.flatten().tolist()) + bias_sizes.append(fc_a_bias.size) + + # FC b weights are: [svd_ranks[0],num_filters] in [H,W,I,O] order. 
+ # We must reshape the split weights to SVD format [O,I,H,W] and then transpose to NHWC + split_fc_b_w_shape = (w_shape[1], svd_ranks[0]) + fc_b_weights = np.zeros(split_fc_b_w_shape) + split_weights.append(fc_b_weights.flatten().tolist()) + weight_sizes.append(fc_b_weights.size) + if bias_op: + fc_b_bias = np.zeros(w_shape[1]) + split_biases.append(fc_b_bias.flatten().tolist()) + bias_sizes.append(fc_b_bias.size) + + # Split the weights and biases according to the number of layers and ranks + split_weights = self._svd.SplitLayerWeights(op.name, split_weights, weight_sizes, svd_ranks) + split_biases = self._svd.SplitLayerBiases(op.name, split_biases, bias_sizes, svd_ranks) + + if split_weights: + fc_a_name = op.name+'_a' + fc_a_weights = np.array(split_weights[0]).reshape(split_fc_a_w_shape).transpose(1, 0) + fc_a_w = tf.Variable(initial_value=fc_a_weights, name=fc_a_name+'_w', dtype=tf.float32) + logger.debug('%s weight shape: %s', fc_a_name, str(fc_a_weights.shape)) + + # Create fc_a using default strides (1,1) + fc_acts = tf.matmul(op.inputs[0], fc_a_w, name=fc_a_name) + if bias_op: + fc_a_bias = tf.Variable(initial_value=split_biases[0], name=fc_a_name+'_bias', dtype=tf.float32) + fc_acts = fc_acts + fc_a_bias + + if len(split_weights) > 1: + # Create fc_b + fc_b_name = op.name+'_b' + fc_b_weights = np.array(split_weights[1]).reshape(split_fc_b_w_shape).transpose(1, 0) + fc_b_w = tf.Variable(initial_value=fc_b_weights, name=fc_b_name+'_w', dtype=tf.float32) + logger.debug('%s weight shape: %s', fc_b_name, str(fc_b_weights.shape)) + fc_acts = tf.matmul(fc_acts, fc_b_w, name=fc_b_name) + if bias_op: + fc_b_bias = tf.Variable(initial_value=split_biases[1], name=fc_b_name+'_bias', dtype=tf.float32) + fc_acts = fc_acts + fc_b_bias + ratio = self._compute_per_layer_compression_ratio([fc_a_w.shape, fc_b_w.shape], fc_acts.shape, w_shape, 'MatMul') + consumers = [] + rerouted_inputs = [bias_op.outputs[0]] if bias_op else [op.outputs[0]] + for inp in rerouted_inputs: + for consumer in inp.consumers(): + consumers.append(consumer) + _ = graph_editor.reroute_ts(fc_acts, rerouted_inputs, can_modify=consumers) + return ratio + + def _split_layers(self, sess, rank_index, use_best_ranks): + """ + Split all the selected layers given a rank index + :param sess: tf.compat.v1.Session + :param rank_index: Rank index to use for finding the ranks + :param use_best_ranks: Use the best rank index (for final compressed network) + :return: None + """ + layer_stats = list() + for i, op in enumerate(self._compressible_ops): + + # If op is not a selected layer, skip + if not any(op is layer.layer_ref for layer in self._selected_layers): + continue + + # Bias is taken care of as part of the Conv/FC op + if op.type in ['Add', 'BiasAdd']: + continue + + # Get the stored attributes for this op + attr = self._svd.GetLayerAttributes(op.name) + if not attr: + raise RuntimeError("Layer attributes not available for layer"+op.name) + + if use_best_ranks: + svd_ranks = attr.bestRanks + else: + svd_ranks = self._svd.GetCandidateRanks(op.name, rank_index) + if svd_ranks: + bias_op = None + if i+1 < len(self._compressible_ops): + bias_op = self._compressible_ops[i+1] + bias_op = bias_op.name if bias_op.type in ['Add', 'BiasAdd'] else None + if op.type in ['Conv2D']: + ratio = self._split_conv_layer(sess, svd_ranks, attr, op.name, bias_op) + elif op.type in ['MatMul']: + ratio = self._split_fc_layer(sess, svd_ranks, op.name, bias_op) + per_layer_stats = stats_u.SvdStatistics.PerSelectedLayer(op.name, svd_ranks, ratio) + 
layer_stats.append(per_layer_stats) + return layer_stats + + def _create_compressed_network(self, sess, rank_index, use_best_ranks): + """ + Create a compressed network for a given rank index + :param sess: tf.compat.v1.Session + :param rank_index: Rank index to use for finding the ranks + :param use_best_ranks: Use the best rank index (for final compressed network) + :return: None + """ + # Split the network layers and update the connections + per_layer_stats = self._split_layers(sess, rank_index, use_best_ranks) + return per_layer_stats + + def _perform_rank_selection(self): + """ + Perform rank selection procedure + :return: None + """ + # pylint: disable=too-many-locals + stats_per_rank_index = list() + self._svd.ComputeNetworkCost() + self._num_ranks = self._svd.SetCandidateRanks(self._num_ranks) + + if not self._num_ranks: + raise RuntimeError('No good candidate ranks found for compressing specified layers.') + + # Ranks are in order from least compression to highest + best_index = -1 + optimal_score = 0.0 + + for rank_index in range(self._num_ranks): + g = tf.Graph() + with g.as_default(): + # Create a new network for each rank_index + self._svd.PrintCandidateRanks(rank_index, False) + + # Load the default graph so we are operating on a fresh copy of the original graph + sess, saver = self._load_graph(g, self._default_meta_graph, self._default_checkpoint) + per_layer_stats = self._create_compressed_network(sess, rank_index, False) + + # Save the temp model + output_file = os.path.join(self._output_dir, 'svd_rank_index_' + str(rank_index)) + self._save_graph(sess, saver, output_file) + + # Reset the session and start a new graph for loading the compressed model + self._reset_session(sess) + + g = tf.Graph() + with g.as_default(): + + # In TF after making changes to the graph you must save and reload, then evaluate + sess, saver = self._load_graph(g, output_file+'.meta', output_file) + model_perf = self._run_graph(sess, self._generator, self._eval_names, self._eval_func, self._iterations) + logger.info('%s performance: %s', output_file, str(model_perf)) + self._model_performance_candidate_ranks.append(model_perf * 100) + + # Estimate relative compression score for this rank_index + compression_score = self._compute_compression_ratio(sess, self._metric) + objective_score = self._compute_objective_score(model_perf, compression_score) + rank_data = stats_u.SvdStatistics.PerRankIndex(rank_index=rank_index, model_accuracy=model_perf, + model_compression_ratio=compression_score, + layer_stats_list=per_layer_stats) + stats_per_rank_index.append(rank_data) + + logger.info('Compressed network with rank_index %i/%i: accuracy = %f percent ' + 'with %f percent compression (%r option) and an objective score of %f', + rank_index, self._num_ranks, model_perf * 100, compression_score * 100, + self._metric, objective_score) + + if rank_index == 0: + optimal_score = objective_score + logger.info('Initializing objective score to %f at rank index %i', optimal_score, rank_index) + + if model_perf + self._error_margin/100 < self._baseline_perf: + logger.info('Model performance %f falls below %f percent of baseline performance %f' + ' Ending rank selection', model_perf, self._error_margin, self._baseline_perf) + break + + if objective_score <= optimal_score: + optimal_score = objective_score + logger.info('Found a better value for the objective score %f at rank_index %i', + optimal_score, rank_index) + best_index = rank_index + + if best_index != -1: + self._svd.StoreBestRanks(best_index) + 
memory_compression_ratio = self._compute_compression_ratio(sess, CostMetric.memory) + mac_compression_ratio = self._compute_compression_ratio(sess, CostMetric.mac) + stats = stats_u.SvdStatistics(self._baseline_perf, model_perf, self._metric, best_index, + mem_comp_ratio=memory_compression_ratio, mac_comp_ratio=mac_compression_ratio, + rank_stats_list=stats_per_rank_index) + # close the session and reset the default graph + self._reset_session(sess) + return stats + + # close the session and reset the default graph + self._reset_session(sess) + raise RuntimeError('No suitable ranks found to compress model within defined error bounds.') + + def manual_rank_svd(self): + """ + Set provided ranks in the PyMo library + :return: None + """ + # Store total net cost + self._svd.ComputeNetworkCost() + + # Ensure proper layer names are provided in no_eval mode + if not self._layer_ranks: + raise ValueError('Layer names MUST be specified in no_eval mode.') + + # Ensure layer_ranks is in list of tuples format + if not all(isinstance(item, tuple) for item in self._layer_ranks): + raise ValueError('layer_ranks should be in list of tuples format for both SVD and SSVD') + + # Check number of input ranks match with number of input layers + if len(self._layers_to_compress) != self._num_layer_ranks: + raise ValueError('Number of Input SVD ranks does not match number of layers.') + + for layer_name, rank in zip(self._layers_to_compress, self._layer_ranks): + rank_list = list() + rank_list.append(rank[1]) + if self.svd_type == _SVD_TYPES['ssvd']: + rank_list.append(rank[1]) + self._svd.StoreBestRanks(layer_name, rank_list) + stats = self._stats_for_manual_rank_svd() + return stats + + @staticmethod + def _save_graph(sess, saver, output_graph): + """ + Utility function to save a graph + :param sess: tf.compat.v1.Session + :param saver: TF save + :param output_graph: Filename and path for saving the output + :return: + """ + logger.info('Saving graph: %s', output_graph) + saver.save(sess, output_graph) + _ = tf.compat.v1.summary.FileWriter(os.path.dirname(output_graph)+"/models", sess.graph) + + def _save_compressed_network(self): + """ + Create and save a compressed network (using the best ranks identified) + :return: + """ + logger.info('Saving final compressed network') + g = tf.Graph() + with g.as_default(): + sess, saver = self._load_graph(g, self._default_meta_graph, self._default_checkpoint) + per_layer_stats = self._create_compressed_network(sess, 0, True) + + # Save the final network + self._save_graph(sess, saver, self._output_file) + self._reset_session(sess) + return per_layer_stats + + def _stats_for_manual_rank_svd(self): + per_layer_stats = self._save_compressed_network() + g = tf.Graph() + with g.as_default(): + # Load and evaluate the final network + sess, _ = self._load_graph(g, self._output_file+'.meta', self._output_file) + model_perf = self._run_graph(sess, self._generator, self._eval_names, self._eval_func, self._iterations) + logger.info('%s performance: %s', self._output_file, str(model_perf)) + + # Estimate relative compression score for this rank_index + self._svd.PrintCandidateRanks(0, True) + # Estimate relative compression score for this rank_index + compression_score = self._compute_compression_ratio(sess, self._metric) + logger.info('Evaluating final model using layer(s): %s. 
' + 'Final accuracy = %f percent with %f percent compression (%r option).', + self._eval_names, model_perf*100, compression_score*100, self._metric) + + memory_compression_ratio = self._compute_compression_ratio(sess, + CostMetric.memory) + mac_compression_ratio = self._compute_compression_ratio(sess, + CostMetric.mac) + rank_data = stats_u.SvdStatistics.PerRankIndex(rank_index=0, model_accuracy=model_perf, + model_compression_ratio=compression_score, + layer_stats_list=per_layer_stats) + rank_data_list = list() + rank_data_list.append(rank_data) + stats = stats_u.SvdStatistics(self._baseline_perf, model_perf, self._metric, 0, + mem_comp_ratio=memory_compression_ratio, + mac_comp_ratio=mac_compression_ratio, + rank_stats_list=rank_data_list) + return stats + +
[docs] def compress_net(self, generator, eval_names=None, run_graph=graph_eval.evaluate_graph, + eval_func=graph_eval.default_eval_func, error_margin=2, iterations=100): + """ + Compresses the network using SVD + + Runs rank selection on the network, and compresses it using the method and parameters + passed during construction of the Svd object. + + :param generator: The generator which should be used for generating data for quantization + :param eval_names: The list of names to use for calculating model performance + :param run_graph: The function to use for running data through the graph and evaluating + the network's performance. This function must return only a single number representing the + avg performance of the model over the dataset batches. + See the 'graph_eval' module's 'evaluate_graph' function for the prototype + :param eval_func: The function to use for evaluating the network performance. This function should always + return a single number that can be used for comparing different graph's performance. + (The default is accuracy) + :param error_margin: The acceptable degradation in network accuracy from the original. + 1 for 1% drop, etc. Defaults to 2%. + :param iterations: The number of iterations (data batches) to run through the network for analysis + :return: An object containing compression statistics + + :raises: - ValueError: An invalid parameter was passed + - RuntimeError: An error occurred analyzing or compressing the network. The associated error + and other information will be returned with the error. + """ + + self._generator = generator + + if not eval_names: + eval_names = ['accuracy'] + + self._eval_names = eval_names + self._run_graph = run_graph + self._eval_func = eval_func + if error_margin <= 0: + raise ValueError('Invalid error_margin: '+str(error_margin)+'. Must pass error_margin > 0') + self._error_margin = error_margin + if iterations <= 0: + raise ValueError('Invalid iterations: '+str(iterations)+'. Number of iterations must be > 0') + self._iterations = iterations + + # Get baseline accuracy, then store the network stats + g = tf.Graph() + with g.as_default(): + sess, _ = self._load_graph(g, self._default_meta_graph, self._default_checkpoint) + self._baseline_perf = run_graph(sess, generator, eval_names, eval_func, iterations) + logger.info('Baseline performance: %f', self._baseline_perf) + self._store_net_stats(sess) + + self._reset_session(sess) + + if self._no_eval: + # Set Manual rank + stats = self.manual_rank_svd() + else: + # Perform rank selection + stats = self._perform_rank_selection() + self._save_compressed_network() + + return stats
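+
+# Illustrative usage sketch (not part of this module). The meta graph path,
+# checkpoint path, and `data_generator` are placeholders supplied by the caller.
+#
+#   svd = Svd(graph='./model.meta', checkpoint='./model.ckpt',
+#             metric=CostMetric.memory, output_file='./svd_graph/compressed',
+#             layer_selection_threshold=0.6)
+#   stats = svd.compress_net(generator=data_generator, eval_names=['accuracy'],
+#                            error_margin=2, iterations=100)
+#   # `stats` is an SvdStatistics object; the compressed graph is saved using the
+#   # `output_file` prefix passed to the constructor.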
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/utils/convert_tf_sess_to_keras.html b/releases/1.32.2/_modules/aimet_tensorflow/utils/convert_tf_sess_to_keras.html new file mode 100644 index 00000000..95dc1021 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/utils/convert_tf_sess_to_keras.html @@ -0,0 +1,1304 @@ + + + + + + aimet_tensorflow.utils.convert_tf_sess_to_keras — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
Source code for aimet_tensorflow.utils.convert_tf_sess_to_keras

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2020, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+""" Utilities to convert TF session to Keras """
+# pylint: skip-file
+import shutil
+from typing import List, Tuple
+import tensorflow as tf
+
+
+
[docs]def save_tf_session_single_gpu(sess: tf.compat.v1.Session(), path: 'str', input_tensor: 'str', output_tensor: 'str'): + """ + Saves TF session, meta graph and variables in the provided path + + :param sess: Input: tf.compat.v1.Session + :param path: Path to save the session + :param input_tensor: Name of starting op to the given graph + :param output_tensor: Name of output op of the graph + :return: None + + """ + + # Initilzing the given Tensorflow session + with sess.graph.as_default(): + init = tf.compat.v1.global_variables_initializer() + sess.run(init) + + # Getting the input and output tensors of the graph using provided names + inputs = sess.graph.get_tensor_by_name(input_tensor) + train_out = sess.graph.get_tensor_by_name(output_tensor) + + # Saving the input session, meta graph and variables in the provided path + with sess.graph.as_default(): + train_signature = tf.compat.v1.saved_model.predict_signature_def(inputs={'x': inputs}, outputs={'out': train_out}) + shutil.rmtree(path, ignore_errors=True) + builder = tf.compat.v1.saved_model.Builder(path) + builder.add_meta_graph_and_variables(sess, ['serve'], signature_def_map={'train': train_signature}) + builder.save()
+ + +def change_name_of_compressed_op(x: str): + """ + Splits op name and adds kernel:0 to it + :param x: Name of op + :return: + """ + return x.split('/')[0]+'/kernel'+':0' + + +
[docs]def load_tf_sess_variables_to_keras_single_gpu(path: 'str', compressed_ops: List['str']) -> tf.compat.v1.keras.Model: + """ + Creates a Keras model subclass and loads the saved session, meta graph and variables to Keras model + + :param path: Path to load the tf session saved using save_session_graph_and_variables + :param compressed_ops: List of ops names skipped in Keras model creations. These are the the ops + that AIMET compressed and are isolated from rest of the graph. + :return: Subclassed Keras Model + + """ + + to_ignore = map(change_name_of_compressed_op, compressed_ops) + + class Model(tf.compat.v1.keras.Model): + """ Keras Model subclassing and loading the saved variables""" + def __init__(self): + super(Model, self).__init__() + self.imported = tf.compat.v1.saved_model.load_v2(path) + self.variables_list = [v for v in self.imported.variables if v.name not in to_ignore] + + def call(self, inputs, training=None): + """ + Creates a Keras model from the saved object in path + :param inputs: Input to model + :param training: If model is to be trained + :return: + """ + if training: + return self.imported.signatures['train'](inputs) + return self.imported.signatures['serving_default'](input) + + return Model()
+ + +
[docs]def save_as_tf_module_multi_gpu(loading_path: 'str', saving_path: 'str', compressed_ops: List['str'], input_shape: Tuple): + """ + Loads a Keras model and re-saves the loaded object in the form of tf.Module + + :param loading_path: Path to load the Keras Model + :param saving_path: Path to save the object + :param compressed_ops: List of ops names for which we need to skip in Keras model creation. These are the the + ops that AIMET compressed and are isolated from rest of the graph. + :param input_shape: shape of input to the model + :return: None + + """ + + def trace_model(inputs): + tf.keras.backend.set_learning_phase(1) + model = load_tf_sess_variables_to_keras_single_gpu(loading_path, compressed_ops) + train_out = model(inputs, training=True) + return train_out + + def export(): + tf.keras.backend.clear_session() + with tf.compat.v1.keras.backend.get_session() as sess: + + fn = tf.wrap_function(trace_model, signature=[tf.TensorSpec((None, input_shape[0], input_shape[1], + input_shape[2]), tf.float32)]) + train_fn = fn.prune(feeds=fn.inputs[0], fetches=fn.outputs[0]) + obj = tf.Module() + obj.variables_list = list(fn.graph.variables) + sess.run(tf.compat.v1.global_variables_initializer()) + tf.saved_model.save(obj, saving_path, {'train': train_fn, 'serving_default': train_fn}) + + export()
+ + +
[docs]def load_keras_model_multi_gpu(loading_path: 'str', input_shape: List): + """ + This function loads the Keras model back, which can be used for funetuning within a strategy + + :param loading_path: Path to load the Keras Model + :param input_shape: the shape of stating tensor in graph ; for instance (224,224,3) for ResNet50 and MoblinetV1 + :return: subclassed Keras model + """ + + class Model(tf.compat.v1.keras.Model): + """ Keras Model subclassing and loading the saved variables """ + def __init__(self): + super(Model, self).__init__() + self.imported = tf.compat.v1.saved_model.load_v2(loading_path) + self.variables_list = self.imported.variables_list + + def call(self, inputs, training=None): + """ + Creates a Keras model from the saved object in path + :param inputs: Input to model + :param training: If training is True or False + :return: + """ + if training: + return self.imported.signatures['train'](inputs) + return self.imported.signatures['serving_default'](inputs) + + tf.keras.backend.set_learning_phase(1) + + x = tf.keras.Input(shape=tuple(input_shape)) + return tf.compat.v1.keras.Model(x, Model()(x, training=True))
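+
+# Illustrative end-to-end sketch (not part of this module). The session `sess`,
+# tensor names, compressed op names, and paths below are placeholders.
+#
+#   # 1. Save the TF session (with AIMET-compressed ops) as a SavedModel
+#   save_tf_session_single_gpu(sess, path='/tmp/tf_saved',
+#                              input_tensor='input_1:0',
+#                              output_tensor='predictions/Softmax:0')
+#
+#   # 2. Reload it as a subclassed Keras model on a single GPU
+#   model = load_tf_sess_variables_to_keras_single_gpu('/tmp/tf_saved',
+#                                                      compressed_ops=['conv2d_1/Conv2D_a'])
+#
+#   # 3. Re-save as a tf.Module and reload for multi-GPU fine-tuning
+#   save_as_tf_module_multi_gpu('/tmp/tf_saved', '/tmp/tf_module',
+#                               compressed_ops=['conv2d_1/Conv2D_a'],
+#                               input_shape=(224, 224, 3))
+#   keras_model = load_keras_model_multi_gpu('/tmp/tf_module', input_shape=[224, 224, 3])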
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_tensorflow/utils/graph.html b/releases/1.32.2/_modules/aimet_tensorflow/utils/graph.html new file mode 100644 index 00000000..54452cd6 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_tensorflow/utils/graph.html @@ -0,0 +1,1240 @@ + + + + + + aimet_tensorflow.utils.graph — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
Source code for aimet_tensorflow.utils.graph

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2020, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+""" utilities for tf graph related operations """
+
+import os
+import tensorflow as tf
+from tensorflow.keras.models import load_model, save_model
+
+
+def op_not_in_loop_control_flow_context(graph: tf.Graph, input_op: tf.Operation) -> bool:
+    """
+    checks whether the given op is outside of a loop control flow context
+    :param graph: tf.Graph is the active graph
+    :param input_op: op as tf.Operation
+    :return: True if op is not in a loop control flow context, False otherwise.
+    """
+    # pylint: disable=protected-access
+    active_ctxt = graph._get_control_flow_context()
+    input_ctxt = input_op._get_control_flow_context()
+
+    if not input_ctxt or input_ctxt is active_ctxt:
+        # input_op isn't in 'a' loop control flow context or
+        # input_op is in the same context as op.
+        return True
+
+    return False
+
+
+def updated_graph_flow_context_to_loop_context(graph: tf.Graph, preceeding_tensor: tf.Tensor):
+    """
+    updates graph flow context to loop context
+    :param graph: TensorFlow Graph (tf.Graph)
+    :param preceeding_tensor: TF tensor that feeds into the op which needs modification
+    :return: old graph context object
+    """
+
+    # pylint: disable=protected-access
+    old_graph_context = graph._get_control_flow_context()
+    graph._set_control_flow_context(preceeding_tensor.op._get_control_flow_context())
+
+    return old_graph_context
+
+
+def set_graph_flow_context(graph: tf.Graph, active_context):
+    """
+    sets graph context to active context provided
+    :param graph: TensorFlow Graph (tf.Graph)
+    :param active_context: context object to be set as current graph's context
+    :return:
+    """
+
+    # pylint: disable=protected-access
+    graph._set_control_flow_context(active_context)
+
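+
+# --- Usage sketch (illustrative only, not part of the original module) ---
+# Pairs the two helpers above: temporarily adopt the control flow context of
+# `preceeding_tensor` so that newly created ops land in the same (loop) context,
+# then restore the graph's original context.
+def _example_create_op_in_loop_context(graph: tf.Graph, preceeding_tensor: tf.Tensor):
+    """ Illustrative sketch; new ops would be created where the `pass` statement is """
+    old_context = updated_graph_flow_context_to_loop_context(graph, preceeding_tensor)
+    try:
+        pass  # ops created here share the control flow context of preceeding_tensor
+    finally:
+        set_graph_flow_context(graph, old_context)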
+
+
[docs]def update_keras_bn_ops_trainable_flag(model: tf.keras.Model, trainable: bool, load_save_path: str) -> tf.keras.Model:
+    """
+    Helper method to update the trainable state of Keras BN ops in a given Keras model.
+    :param model: Keras model to be updated with BN ops trainable flag
+    :param trainable: Flag indicating whether the BN ops should be set trainable (True) or frozen (False)
+    :param load_save_path: temp folder to perform load/save, cleans up the file created
+    :return: updated Keras model
+    """
+
+    if not os.path.exists(load_save_path):
+        os.mkdir(load_save_path)
+
+    output_file_with_path = os.path.join(load_save_path, 't.h5')
+
+    # update BN ops trainable flag
+    for layer in model.layers:
+        if isinstance(layer, tf.keras.layers.BatchNormalization):
+            layer.trainable = trainable
+    save_model(model, output_file_with_path)
+    tf.compat.v1.keras.backend.clear_session()
+    model = load_model(output_file_with_path)
+
+    # clean up file after use
+    if os.path.exists(output_file_with_path):
+        os.remove(output_file_with_path)
+
+    # return updated keras model
+    return model
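+
+# --- Usage sketch (illustrative only, not part of the original module) ---
+# Freezes every BatchNormalization layer of a Keras model; the model choice and the
+# temp folder are hypothetical placeholders.
+def _example_freeze_bn_layers():
+    """ Illustrative sketch: set all BN layers of a ResNet50 model to non-trainable """
+    model = tf.keras.applications.ResNet50(weights=None)
+    return update_keras_bn_ops_trainable_flag(model, trainable=False, load_save_path='/tmp/bn_update')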
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/adaround/adaround_weight.html b/releases/1.32.2/_modules/aimet_torch/adaround/adaround_weight.html new file mode 100644 index 00000000..02503f9a --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/adaround/adaround_weight.html @@ -0,0 +1,1771 @@ + + + + + + aimet_torch.adaround.adaround_weight — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +

Source code for aimet_torch.adaround.adaround_weight

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2021-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Top level API for Adaptive Rounding - Post-Training Quantization (PTQ) """
+
+import os
+import contextlib
+import itertools
+import json
+import shutil
+from typing import Tuple, Union, Dict, List, Callable, Any, Optional
+import torch
+from torch.utils.data import DataLoader
+from tqdm import tqdm
+
+# Import AIMET specific modules
+from aimet_common.utils import AimetLogger, convert_configs_values_to_bool
+from aimet_common.defs import QuantScheme, QuantizationDataType
+
+from aimet_torch import utils
+from aimet_torch.save_utils import SaveUtils
+from aimet_torch.meta import connectedgraph_utils
+from aimet_torch.quantsim import QuantizationSimModel, QcQuantizeWrapper, ExportableQuantModule
+from aimet_torch.qc_quantize_op import StaticGridQuantWrapper, QcQuantizeOpMode
+from aimet_torch.tensor_quantizer import TensorQuantizer
+from aimet_torch.adaround.adaround_wrapper import AdaroundWrapper
+from aimet_torch.adaround.adaround_optimizer import AdaroundOptimizer
+from aimet_torch.adaround.adaround_loss import AdaroundHyperParameters
+from aimet_torch.adaround.activation_sampler import create_modulelist_for_group_modules, get_block_inputs, \
+    get_block_outputs, create_cached_block_schedule_list
+from aimet_torch.utils import get_named_module
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+
+# The following modules with weights are supported by Adaround
+AdaroundSupportedModules = (torch.nn.Conv2d, torch.nn.ConvTranspose2d, torch.nn.Linear)
+WORKING_DIR = '/tmp/adaround/'
+
+
+
[docs]class AdaroundParameters:
+    """
+    Configuration parameters for Adaround
+    """
+    def __init__(self, data_loader: DataLoader, num_batches: int,
+                 default_num_iterations: int = None, default_reg_param: float = 0.01,
+                 default_beta_range: Tuple = (20, 2), default_warm_start: float = 0.2,
+                 forward_fn: Callable[[torch.nn.Module, Any], Any] = None):
+        """
+        :param data_loader: Data loader
+        :param num_batches: Number of batches to be used for Adaround.
+         A commonly recommended value for this parameter is the smaller value among (1) len(data_loader) and (2) ceil(2000/batch_size)
+        :param default_num_iterations: Number of iterations to adaround each layer.
+         The default value is 10K for models with 8- or higher bit weights, and 15K for models with lower than 8 bit weights.
+        :param default_reg_param: Regularization parameter, trading off between rounding loss vs reconstruction loss.
+         Default 0.01
+        :param default_beta_range: Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta).
+         Default (20, 2)
+        :param default_warm_start: warm up period, during which rounding loss has zero effect. Default 20% (0.2)
+        :param forward_fn: Optional adapter function that performs forward pass given a model and inputs
+         yielded from the data loader. The function expects model as first argument and inputs to model
+         as second argument.
+        """
+        if len(data_loader) < num_batches:
+            raise ValueError(f'Can not fetch {num_batches} batches from '
+                             f'a data loader of length {len(data_loader)}.')
+
+        self.data_loader = data_loader
+        self.num_batches = num_batches
+        self.num_iterations = default_num_iterations
+        self.reg_param = default_reg_param
+        self.beta_range = default_beta_range
+        self.warm_start = default_warm_start
+        self.forward_fn = forward_fn
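+
+
+# --- Usage sketch (illustrative only, not part of the original module) ---
+# Builds AdaroundParameters from an unlabeled DataLoader. The random tensors below are
+# placeholder calibration data; in practice roughly 2000 real samples are recommended,
+# i.e. num_batches ~= ceil(2000 / batch_size).
+def _example_adaround_parameters() -> AdaroundParameters:
+    import math
+    from torch.utils.data import TensorDataset
+    calibration_data = TensorDataset(torch.randn(256, 3, 224, 224))   # placeholder unlabeled data
+    data_loader = DataLoader(calibration_data, batch_size=32, shuffle=False)
+    num_batches = min(len(data_loader), math.ceil(2000 / 32))
+    return AdaroundParameters(data_loader=data_loader,
+                              num_batches=num_batches,
+                              default_num_iterations=1000)            # reduced iteration count for a quick run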
+ + +class Adaround: + """ + Weight-rounding mechanism for Post Training Quantization (PTQ) + """ + @classmethod + def apply_adaround(cls, model: torch.nn.Module, dummy_input: Union[torch.Tensor, Tuple], params: AdaroundParameters, + path: str, filename_prefix: str, default_param_bw: int = 4, + param_bw_override_list: List[Tuple[torch.nn.Module, int]] = None, + ignore_quant_ops_list: List[torch.nn.Module] = None, + default_quant_scheme: QuantScheme = QuantScheme.post_training_tf_enhanced, + default_config_file: str = None) -> torch.nn.Module: + """ + Returns model with optimized weight rounding of every module (Conv and Linear) and also saves the + corresponding quantization encodings to a separate JSON-formatted file that can then be imported by + QuantSim for inference or QAT + + :param model: Model to Adaround + :param dummy_input: Dummy input to the model. Used to parse model graph. If the model has more than one input, + pass a tuple. User is expected to place the tensors on the appropriate device. + :param params: Parameters for Adaround + :param path: path where to store parameter encodings + :param filename_prefix: Prefix to use for filename of the encodings file + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters + :param param_bw_override_list: List of Tuples. Each Tuple is a module and the corresponding parameter bitwidth + to be used for that module. + :param ignore_quant_ops_list: Ops listed here are skipped during quantization needed for AdaRounding. Do not + specify Conv and Linear modules in this list. Doing so, will affect accuracy. + :param default_quant_scheme: Quantization scheme. Supported options are using Quant Scheme Enum + QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced + :param default_config_file: Default configuration file for model quantizers + :return: Model with Adarounded weights and saves corresponding parameter encodings JSON file at provided path + """ + # pylint: disable=too-many-arguments + # Create Quant sim with given parameters + quant_sim = cls._get_quantsim(model, dummy_input=dummy_input, quant_scheme=default_quant_scheme, + default_param_bw=default_param_bw, + config_file=default_config_file) + + # For the modules in the param_bw_override_list, override the default parameter bitwidths in the QuantSim + if param_bw_override_list: + cls._override_param_bitwidth(model, quant_sim, param_bw_override_list) + + if ignore_quant_ops_list: + cls._exclude_modules(model, quant_sim, ignore_quant_ops_list) + + # Compute only param encodings + cls._compute_param_encodings(quant_sim) + + return cls._apply_adaround(quant_sim, model, dummy_input, params, path, filename_prefix) + + @classmethod + def _apply_adaround(cls, quant_sim: QuantizationSimModel, model: torch.nn.Module, + dummy_input: Union[torch.Tensor, Tuple], params: AdaroundParameters, + path: str, filename_prefix: str, checkpoints_config: str = None) -> torch.nn.Module: + """ + Returns model with optimized weight rounding of every module (Conv and Linear) and also saves the + corresponding quantization encodings to a separate JSON-formatted file that can then be imported by + QuantSim for inference or QAT + + :param quant_sim: QuantizationSimModel object to optimize weight rounding. + The activation quantizers are expected to have been disabled. + :param model: Original fp32 model from which quant_sim was created. + :param dummy_input: Dummy input to the model. Used to parse model graph. If the model has more than one input, + pass a tuple. 
User is expected to place the tensors on the appropriate device. + :param params: Parameters for Adaround + :param path: path where to store parameter encodings + :param filename_prefix: Prefix to use for filename of the encodings file + :param checkpoints_config: Config files to split fp32/quant model by checkpoints + :return: Model with Adarounded weights and saves corresponding parameter encodings JSON file at provided path + """ + + # Sanity check: All the input/output quantizers should be disabled + cls._check_input_output_quantizers_for_adaround(quant_sim.model) + + # Get the module - activation function pair using ConnectedGraph + module_act_func_pair = connectedgraph_utils.get_module_act_func_pair(model, dummy_input) + + cls._adaround_model(model, quant_sim, module_act_func_pair, params, dummy_input, checkpoints_config) + + # Export quantization encodings to JSON-formatted file + cls._export_encodings_to_json(path, filename_prefix, quant_sim) + + cls._remove_quantization_wrappers(quant_sim.model) + logger.info('Completed Adarounding Model') + return quant_sim.model + + @classmethod + def _adaround_model(cls, model: torch.nn.Module, quant_sim: QuantizationSimModel, module_act_func_pair: Dict, + params: AdaroundParameters, dummy_input: Union[torch.Tensor, Tuple], + checkpoints_config: str = None): + """ + Optimize weight rounding of every module (AdaroundSupportedModules) of model in sequential manner + based on occurrence + + NOTE: When checkpoints_config file is provided, assumption is that the outputs from previous group modules (block) + should feed directly into next group modules (block) + + :param model: Original fp32 model from which quant_sim was created. + :param quant_sim: QuantizationSimModel object to optimize weight rounding. + The activation quantizers are expected to have been disabled. 
+ :param module_act_func_pair: Dictionary of module to immediate following activation function + :param params: Adaround parameters + :param dummy_input: Dummy input to the model + :param checkpoints_config: Config files to split fp32/quant model by checkpoints to speedup activations sampling + """ + # pylint: disable=too-many-locals, protected-access, too-many-branches, too-many-statements + + num_iterations = params.num_iterations + + if num_iterations is None: + lowest_weight_bw = cls._get_lowest_weight_bw(quant_sim.model) + # If the lowest wegith bitwidth is < 8, then set num_iterations to 15K by default + if lowest_weight_bw < 8: + num_iterations = 15000 + else: + num_iterations = 10000 + try: + # Cache model input data to WORKING_DIR + cached_dataset = utils.CachedDataset(params.data_loader, params.num_batches, WORKING_DIR) + + # Optimization Hyper parameters + opt_params = AdaroundHyperParameters(num_iterations, params.reg_param, params.beta_range, + params.warm_start) + + # AdaRound must be applied to modules in the order of occurrence + if checkpoints_config: + # Load the predefined json file for checkpoints info + checkpoint_config = json.load(open(checkpoints_config)) + convert_configs_values_to_bool(checkpoint_config) + + assert 'cache_on_cpu' in checkpoint_config.keys(), \ + "Please define cache_on_cpu to determine whether to cache intermediate tensors on CPU" + cache_on_cpu = checkpoint_config['cache_on_cpu'] + + checkpoint_type = checkpoint_config.get('checkpoint_type', 'sequential') + if checkpoint_type == 'sequential': + assert 'grouped_modules' in checkpoint_config.keys(), \ + "Please provide a dictionary of grouped_modules in the file to define checkpoints" + assert 'include_static_inputs' in checkpoint_config.keys(), \ + "Please provide a dictionary of include_static_inputs in the file to define checkpoints" + + grouped_modules = checkpoint_config['grouped_modules'] + breakpoint_module_name = checkpoint_config['grouped_modules'][list(grouped_modules.keys())[0]][0] + include_static_inputs = checkpoint_config['include_static_inputs'] + cached_fp_dataset, cached_quant_dataset = get_block_inputs(model, quant_sim, + breakpoint_module_name, + cached_dataset, cache_on_cpu, + params.forward_fn, params.num_batches, + WORKING_DIR) + # Get the device of model to latter be used to place input tensor on the same device + device = utils.get_device(model) + model.cpu() + quant_sim.model.cpu() + + # Forward function for the ModuleList object + def fwd_mod_ls(mod_ls, x): + for mod in mod_ls: + x = params.forward_fn(mod, x) + return x + + sub_fp_models, sub_sim_models = create_modulelist_for_group_modules(model, quant_sim, grouped_modules) + for i, (fp_block, quant_sim_block, static_input) in enumerate(zip(sub_fp_models, + sub_sim_models, + include_static_inputs)): + modules = utils.get_ordered_list_of_modules(fp_block, cached_fp_dataset[0], fwd_mod_ls) + cls._run_adaround_model(modules, fp_block, quant_sim_block, + module_act_func_pair, opt_params, + fwd_mod_ls, + cached_fp_dataset, cached_quant_dataset) + + # Get the outputs from the current block and assign to be the inputs for next block + # except for the last block + if i < len(sub_fp_models) - 1: + get_block_outputs(fp_block, quant_sim_block, static_input, + cached_fp_dataset, cached_quant_dataset, cache_on_cpu, + fwd_mod_ls, device, WORKING_DIR) + + # After finishing Adaround, placing the quant model back to its original device + quant_sim.model.to(device) + else: + assert 'cached_blocks' in checkpoint_config.keys(), \ + "Please 
provide a list of modules that can be cached" + + block_list = create_cached_block_schedule_list( + model, dummy_input, checkpoint_config['cached_blocks'], AdaroundSupportedModules) + + for block_cfg, modules in tqdm(block_list, desc='block'): + if block_cfg is None: # doesn't belong to a cached block + cls._run_adaround_model(modules, model, quant_sim.model, module_act_func_pair, opt_params, + params.forward_fn, cached_dataset) + else: + block_name, fp_block = block_cfg + quant_sim_block: torch.nn.Module = get_named_module(quant_sim.model, block_name) + + cached_fp_dataset, cached_quant_dataset = get_block_inputs(model, quant_sim, + block_name, + cached_dataset, cache_on_cpu, + params.forward_fn, + params.num_batches, + WORKING_DIR, + incl_kwargs=True) + + def block_fwd(_model, x): + return _model(*x) + + cls._run_adaround_model(modules, fp_block, quant_sim_block, module_act_func_pair, + opt_params, + block_fwd, cached_fp_dataset, cached_quant_dataset) + del cached_fp_dataset + del cached_quant_dataset + else: + modules = utils.get_ordered_list_of_modules(model, dummy_input) + cls._run_adaround_model(modules, model, quant_sim.model, module_act_func_pair, opt_params, + params.forward_fn, cached_dataset) + finally: + try: + logger.info('Deleting model inputs from location: %s', WORKING_DIR) + shutil.rmtree(WORKING_DIR) + except FileNotFoundError: + pass + + @classmethod + def _run_adaround_model(cls, modules: List, model: torch.nn.Module, quant_sim_model: torch.nn.Module, + module_act_func_pair: Dict, opt_params: AdaroundHyperParameters, forward_fn: Callable, + cached_dataset: utils.CachedDataset, + cached_quant_dataset: Optional[utils.CachedDataset] = None): + """ + Iterate through all modules to find out Adaround supported modules and + apply Adaround optimization to those modules + + :param modules: Candidate modules + :param model: Original fp32 model + :param quant_sim_model: QuantSim model + :param module_act_func_pair: Activation function pairs + :param opt_params: Optimization parameters + :param forward_fn: Adapter function that performs forward pass given a model and inputs + yielded from the data loader + :param cached_dataset: Cached dataset for the fp32 model + :param cached_quant_dataset: Cached dataset for the quant model + """ + # pylint: disable=too-many-arguments, too-many-locals, protected-access + for name, module in tqdm(modules): + if isinstance(module, AdaroundSupportedModules): + # Using name, get corresponding quantized wrapper module from Quant sim model + quant_wrapper = cls._get_quant_wrapper(quant_sim_model, name) + if not quant_wrapper: + continue + + # Wraps the quant module with adaround wrapper + # and temporarily replace quant module with wrapped module + with cls._replace_quantization_layer(quant_sim_model, name) as adaround_wrapper: + + # Get module's next following activation function + act_func = module_act_func_pair[module] + + logger.info("Started Optimizing weight rounding of module: %s", name) + AdaroundOptimizer.adaround_module(module, adaround_wrapper, model, quant_sim_model, act_func, + cached_dataset, forward_fn, opt_params, cached_quant_dataset) + weight = adaround_wrapper.weight + + # Fold trained alpha to weight + with torch.no_grad(): + # Use soft rounding to compute Adarounded weight + adaround_wrapper.use_soft_rounding = True + adarounded_weight = adaround_wrapper.apply_adaround(weight) + weight.copy_(adarounded_weight) + del adarounded_weight + + @staticmethod + def _compute_param_encodings(quant_sim: QuantizationSimModel): + """ + 
Compute encodings for parameters, needed for initializing Adaround quantizers + :param quant_sim: Quant sim + """ + for quant_module in quant_sim.model.modules(): + if isinstance(quant_module, StaticGridQuantWrapper): + # Adaround requires input and output quantizers to be disabled + for quatizer in quant_module.input_quantizers: + quatizer.enabled = False + for quatizer in quant_module.output_quantizers: + quatizer.enabled = False + + # pylint: disable=protected-access + for name, param in quant_module._module_to_wrap.named_parameters(): + param_quantizer = quant_module.param_quantizers[name] + param_quantizer.reset_encoding_stats() + param_quantizer.update_encoding_stats(param.data) + param_quantizer.compute_encoding() + + # Wrapper mode must be set to ACTIVE because the wrapper's quantize_dequantize_params() will only call + # into the param tensor quantizer's quantize_dequantize() if the mode is not PASSTHROUGH. + quant_module.set_mode(QcQuantizeOpMode.ACTIVE) + + @staticmethod + def _get_quantsim(model: torch.nn.Module, dummy_input: torch.Tensor, + quant_scheme: QuantScheme, default_param_bw: int, config_file: str): + return QuantizationSimModel(model, dummy_input=dummy_input, quant_scheme=quant_scheme, + default_param_bw=default_param_bw, + config_file=config_file) + + @staticmethod + def _get_adaround_wrapper(quant_module: QcQuantizeWrapper): + return AdaroundWrapper(quant_module) + + @staticmethod + def _remove_quantization_wrappers(module: torch.nn.Module): + SaveUtils.remove_quantization_wrappers(module) + + @staticmethod + @contextlib.contextmanager + def _patch_module_layer(model, layer_name, new_layer): + """ + Temporarily replace model layer + """ + original_layer = getattr(model, layer_name) + setattr(model, layer_name, new_layer) + yield + setattr(model, layer_name, original_layer) + + @staticmethod + def _validate_quant_module_for_adaround(quant_module: StaticGridQuantWrapper): + assert quant_module.param_quantizers['weight'], '%s does not have weight parameter.' % quant_module + assert quant_module.param_quantizers['weight'].encoding, '%s encoding needs to be set.' 
% quant_module + + @staticmethod + def _check_input_output_quantizers_for_adaround(quant_model: torch.nn.Module): + _, input_quantizers, output_quantizers = utils.get_all_quantizers(quant_model) + for quantizer in itertools.chain(input_quantizers, output_quantizers): + assert not quantizer.enabled + + @staticmethod + def _get_lowest_weight_bw(quant_model: torch.nn.Module): + param_quantizers, _, _ = utils.get_all_quantizers(quant_model) + return min( + quantizer.bitwidth for quantizer in param_quantizers + if quantizer.enabled and quantizer.data_type == QuantizationDataType.int + ) + + @classmethod + @contextlib.contextmanager + def _replace_quantization_layer(cls, quant_sim_model: torch.nn.Module, module_name: str): + """ + Replace the quantized module's weight tensor quantizer with the Adaround tensor quantizer + :param quant_module: quant module + """ + quant_module = utils.get_named_module(quant_sim_model, module_name) + cls._validate_quant_module_for_adaround(quant_module) + adaround_layer = cls._get_adaround_wrapper(quant_module) + + # We need to look for the container to patch for modules inside submodule + upper_module = quant_sim_model + upper_module_name, _, target_module_name = module_name.rpartition('.') + if upper_module_name: + upper_module = utils.get_named_module(quant_sim_model, upper_module_name) + + # Temporarily replace quant module with wrapped module + with cls._patch_module_layer(upper_module, target_module_name, adaround_layer): + yield adaround_layer + + @staticmethod + def _get_quant_wrapper(quant_sim_model: torch.nn.Module, module_name: str) -> Union[StaticGridQuantWrapper, None]: + """ + For given module name, get the quantized wrapper module from the QuantSim model + :param quant_sim_model: Model with simulation ops + :param module_name: Module name + :return: Quantized wrapper module or None + """ + quant_module = None + + for name, module in quant_sim_model.named_modules(): + if name == module_name and isinstance(module, StaticGridQuantWrapper): + quant_module = module + break + + return quant_module + + @classmethod + def _export_encodings_to_json(cls, path: str, filename_prefix: str, quant_sim: QuantizationSimModel): + """ + Save Adadrounded module's parameter encodings to JSON file + :param path: path where to store param encodings + :param filename_prefix: filename to store exported weight encodings in JSON format + :param quant_sim: QunatSim that contains the model and Adaround tensor quantizers + """ + # pylint: disable=protected-access + # Create a dictionary to export to JSON file + param_encodings = {} + + for name, quant_module in quant_sim.model.named_modules(): + if isinstance(quant_module, ExportableQuantModule) and \ + isinstance(quant_module.get_original_module(), AdaroundSupportedModules): + + if 'weight' in quant_module.param_quantizers: + cls._update_param_encodings_dict(quant_module, name, param_encodings) + + # Unify the encoding format to be same as that of full encoding export file + encoding = {'param_encodings': param_encodings} + # export encodings to JSON file + os.makedirs(os.path.abspath(path), exist_ok=True) + encoding_file_path = os.path.join(path, filename_prefix + '.encodings') + with open(encoding_file_path, 'w') as encoding_fp: + json.dump(encoding, encoding_fp, sort_keys=True, indent=4) + + @classmethod + def _update_param_encodings_dict(cls, quant_module: ExportableQuantModule, name: str, param_encodings: Dict): + """ + Add module's weight parameter encodings to dictionary to be used for exporting encodings + :param 
quant_module: quant module + :param name: name of module + :param param_encodings: Dictionary of param encodings + """ + for orig_param_name, encodings in quant_module.export_param_encodings().items(): + if orig_param_name == 'weight' and encodings: + param_name = name + '.' + orig_param_name + param_encodings[param_name] = encodings + + @staticmethod + def _create_encodings_dict_for_quantizer(quantizer: TensorQuantizer) -> List[Dict]: + """ + Return encodings for given qunatizer + :param quantizer: Tensor quantizer associated with module's param + :return: Dictionary containing encodings + """ + quant_encodings = quantizer.encoding + if not isinstance(quantizer.encoding, list): + quant_encodings = [quant_encodings] + + encodings_dict = [] + for enc in quant_encodings: + encodings_dict.append({'min': enc.min, + 'max': enc.max, + 'scale': enc.delta, + 'offset': int(enc.offset), + 'bitwidth': enc.bw, + 'is_symmetric': str(quantizer.use_symmetric_encodings), + 'dtype': 'int' if quantizer.data_type == QuantizationDataType.int else 'float'}) + return encodings_dict + + @staticmethod + def _override_param_bitwidth(model: torch.nn.Module, quant_sim: QuantizationSimModel, + param_bw_override_list: List[Tuple[torch.nn.Module, int]]): + """ + For the QuantSim, for the list of modules in the param_bw_override_list, + overrides the default parameter bitwidths with the provided bitwidth. + + :param model: The original model + :param quant_sim: The QuantSim that was created using a deepcopy of the original model. + :param param_bw_override_list: List of Tuples. Each Tuple is a module and the corresponding parameter bitwidth + to be used for that module. + """ + # Create a mapping of original model's AdaRoundable module and their name + module_to_name = {} + for name, module in model.named_modules(): + if isinstance(module, AdaroundSupportedModules): + module_to_name[module] = name + + # Create a mapping of QuantSim model's AdaRoundable module name and their module + name_to_module = {} + for q_name, q_module in quant_sim.model.named_modules(): + if isinstance(q_module, ExportableQuantModule): + if isinstance(q_module.get_original_module(), AdaroundSupportedModules): # pylint: disable=protected-access + name_to_module[q_name] = q_module + + # For the modules specified in the param_bw_override_list, set the weight quantizer bitwidth + for (module, bw) in param_bw_override_list: + module_name = module_to_name[module] + quant_wrapper = name_to_module[module_name] + quant_wrapper.param_quantizers['weight'].bitwidth = bw + + @classmethod + def _exclude_modules(cls, model: torch.nn.Module, quant_sim: QuantizationSimModel, + ignore_quant_ops_list: List[torch.nn.Module]): + """ + For the modules mentioned in the ignore_quant_ops_list, remove the corresponding quant wrappers from the + quantSim and excludes modules from adaround optimization. + + :param model: The original model + :param quant_sim: The QuantSim that was created using a deepcopy of the original model. + :param ignore_quant_ops_list: The list of modules for which the Quantization wrappers are removed from the + QuantSim object. 
+ """ + quant_wrappers_to_exclude = [] + for module in ignore_quant_ops_list: + for m in module.modules(): + name = utils.get_layer_name(model, m) + quant_wrapper = cls._get_quant_wrapper(quant_sim.model, name) + if quant_wrapper: + quant_wrappers_to_exclude.append(quant_wrapper) + + quant_sim.exclude_layers_from_quantization(quant_wrappers_to_exclude) + + @classmethod + def apply_adaround_with_cache(cls, model: torch.nn.Module, dummy_input: Union[torch.Tensor, Tuple], + params: AdaroundParameters, + path: str, filename_prefix: str, default_param_bw: int = 4, + param_bw_override_list: List[Tuple[torch.nn.Module, int]] = None, + ignore_quant_ops_list: List[torch.nn.Module] = None, + default_quant_scheme: QuantScheme = QuantScheme.post_training_tf_enhanced, + default_config_file: str = None, + checkpoints_config: str = None) -> torch.nn.Module: + """ + Returns model with optimized weight rounding of every module (Conv and Linear) and also saves the + corresponding quantization encodings to a separate JSON-formatted file that can then be imported by + QuantSim for inference or QAT + :param model: Model to Adaround + :param dummy_input: Dummy input to the model. Used to parse model graph. If the model has more than one input, + pass a tuple. User is expected to place the tensors on the appropriate device. + :param params: Parameters for Adaround + :param path: path where to store parameter encodings + :param filename_prefix: Prefix to use for filename of the encodings file + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters + :param param_bw_override_list: List of Tuples. Each Tuple is a module and the corresponding parameter bitwidth + to be used for that module. + :param ignore_quant_ops_list: Ops listed here are skipped during quantization needed for AdaRounding. Do not + specify Conv and Linear modules in this list. Doing so, will affect accuracy. + :param default_quant_scheme: Quantization scheme. Supported options are using Quant Scheme Enum + QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced + :param default_config_file: Default configuration file for model quantizers + :param checkpoints_file: JSON file to define checkpoints for caching intermediate tensors of fp32/quant model + :return: Model with Adarounded weights and saves corresponding parameter encodings JSON file at provided path + """ + # pylint: disable=too-many-arguments + assert checkpoints_config is not None, "To run Adaround with cached tensors, please provide a JSON file with checkpoints defined" + # Create Quant sim with given parameters + quant_sim = cls._get_quantsim(model, dummy_input=dummy_input, quant_scheme=default_quant_scheme, + default_param_bw=default_param_bw, + config_file=default_config_file) + + # For the modules in the param_bw_override_list, override the default parameter bitwidths in the QuantSim + if param_bw_override_list: + cls._override_param_bitwidth(model, quant_sim, param_bw_override_list) + + if ignore_quant_ops_list: + cls._exclude_modules(model, quant_sim, ignore_quant_ops_list) + + # Compute only param encodings + cls._compute_param_encodings(quant_sim) + + return cls._apply_adaround(quant_sim, model, dummy_input, params, path, filename_prefix, checkpoints_config) +
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/auto_quant.html b/releases/1.32.2/_modules/aimet_torch/auto_quant.html new file mode 100644 index 00000000..14d9dfbb --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/auto_quant.html @@ -0,0 +1,2575 @@ + + + + + + aimet_torch.auto_quant — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +

Source code for aimet_torch.auto_quant

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+# pylint: disable=too-many-lines
+
+""" Implementation of AIMET AutoQuantBase and v1 AutoQuant """
+import abc
+import copy
+import contextlib
+from collections import OrderedDict, defaultdict
+from dataclasses import dataclass
+import functools
+import itertools
+import math
+import traceback
+import os
+import sys
+import io
+from unittest.mock import patch
+from typing import Any, Callable, Dict, List, Optional, Tuple, Union, Mapping
+import pickle
+from uuid import uuid4
+import torch
+from torch.utils.data import DataLoader
+import jinja2
+from bokeh.resources import CDN
+from tqdm import tqdm
+
+from aimet_torch import utils
+from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters
+from aimet_torch.cross_layer_equalization import equalize_model
+from aimet_torch.batch_norm_fold import fold_all_batch_norms
+from aimet_torch.quantsim import QuantizationSimModel
+from aimet_torch.utils import get_all_quantizers, in_eval_mode
+from aimet_torch.onnx_utils import OnnxExportApiArgs
+from aimet_torch.model_preparer import prepare_model
+from aimet_torch.model_validator.model_validator import ModelValidator
+
+from aimet_common.auto_quant import Diagnostics
+from aimet_common.cache import Cache
+from aimet_common.defs import QuantScheme
+from aimet_common.utils import AimetLogger, Spinner
+from aimet_common.quantsim import validate_quantsim_inputs
+
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.AutoQuant)
+
+cache = Cache()
+
+
+# The number of samples to be used for performance evaluation.
+# NOTE: None means "all".
+NUM_SAMPLES_FOR_PERFORMANCE_EVALUATION = None
+
+class _StageSkipped(Exception):
+    pass
+
+
+@dataclass(frozen=True)
+class _QuantSchemePair:
+    param_quant_scheme: QuantScheme
+    output_quant_scheme: QuantScheme
+    param_percentile: Optional[float] = None
+    output_percentile: Optional[float] = None
+
+    def __str__(self):
+        def scheme_to_str(quant_scheme, percentile):
+            if quant_scheme == QuantScheme.post_training_percentile:
+                return f"{percentile}%ile"
+            if quant_scheme in (QuantScheme.post_training_tf,
+                                QuantScheme.training_range_learning_with_tf_init):
+                return "tf"
+            if quant_scheme in (QuantScheme.post_training_tf_enhanced,
+                                QuantScheme.training_range_learning_with_tf_enhanced_init):
+                return "tf-enhanced"
+            raise ValueError
+
+        param_str = scheme_to_str(self.param_quant_scheme, self.param_percentile)
+        output_str = scheme_to_str(self.output_quant_scheme, self.output_percentile)
+        return f"W@{param_str} / A@{output_str}"
+
+
+_QUANT_SCHEME_CANDIDATES = (
+    # Weight:     tf
+    # Activation: tf
+    _QuantSchemePair(QuantScheme.post_training_tf,
+                     QuantScheme.post_training_tf),
+
+    # Weight:     tf_enhanced
+    # Activation: tf
+    _QuantSchemePair(QuantScheme.post_training_tf_enhanced,
+                     QuantScheme.post_training_tf),
+
+    # Weight:     tf_enhanced
+    # Activation: tf_enhanced
+    _QuantSchemePair(QuantScheme.post_training_tf_enhanced,
+                     QuantScheme.post_training_tf_enhanced),
+
+    # Weight:     tf_enhanced
+    # Activation: percentile(99.9)
+    _QuantSchemePair(QuantScheme.post_training_tf_enhanced,
+                     QuantScheme.post_training_percentile,
+                     output_percentile=99.9),
+
+    # Weight:     tf_enhanced
+    # Activation: percentile(99.99)
+    _QuantSchemePair(QuantScheme.post_training_tf_enhanced,
+                     QuantScheme.post_training_percentile,
+                     output_percentile=99.99),
+)
+
+
+def _validate_inputs(model: torch.nn.Module, # pylint: disable=too-many-arguments
+                     data_loader: DataLoader,
+                     eval_callback: Callable[[torch.nn.Module], float],
+                     dummy_input: torch.Tensor,
+                     results_dir: str,
+                     strict_validation: bool,
+                     quant_scheme: QuantScheme,
+                     param_bw: int,
+                     output_bw: int,
+                     rounding_mode: str):
+    """
+    Confirms inputs are of the correct type
+    :param model: Model to be quantized
+    :param data_loader: A collection that iterates over an unlabeled dataset, used for computing encodings
+    :param eval_callback: Function that calculates the evaluation score
+    :param dummy_input: Dummy input for the model
+    :param results_dir: Directory to save the results of PTQ techniques
+    :param strict_validation: Flag set to True by default. When False, AutoQuant will proceed with execution and try to handle errors internally if possible. This may produce suboptimal or unintuitive results.
+    :param quant_scheme: Quantization scheme
+    :param param_bw: Parameter bitwidth
+    :param output_bw: Output bitwidth
+    :param rounding_mode: Rounding mode
+    """
+    if not isinstance(model, torch.nn.Module):
+        raise ValueError('Model must be of type torch.nn.Module, not ' + str(type(model).__name__))
+
+    if not isinstance(data_loader, DataLoader):
+        raise ValueError('data_loader must be of type DataLoader, not ' + str(
+            type(data_loader).__name__))
+
+    if not isinstance(eval_callback, Callable):  # pylint: disable=isinstance-second-argument-not-valid-type
+        raise ValueError('eval_callback must be of type Callable, not ' + str(type(eval_callback).__name__))
+
+    if not isinstance(dummy_input, (torch.Tensor, Tuple)):
+        raise ValueError(
+            'dummy_input must be of type torch.Tensor or Tuple, not ' + str(type(dummy_input).__name__))
+
+    if not isinstance(results_dir, str):
+        raise ValueError('results_dir must be of type str, not ' + str(type(results_dir).__name__))
+
+    results_dir = os.path.abspath(results_dir)
+    os.makedirs(results_dir, exist_ok=True)
+
+    if not isinstance(strict_validation, bool):
+        raise ValueError('strict_validation must be of type bool, not ' + str(type(strict_validation).__name__))
+
+    validate_quantsim_inputs(quant_scheme, rounding_mode, output_bw, param_bw)
+
+
+class AutoQuantBase(abc.ABC): # pylint: disable=too-many-instance-attributes
+    """
+    Integrate and apply post-training quantization techniques.
+
+    AutoQuant includes 1) batchnorm folding, 2) cross-layer equalization,
+    and 3) Adaround.
+    These techniques will be applied in a best-effort manner until the model
+    meets the evaluation goal given as allowed_accuracy_drop.
+    """
+
+    def __init__( # pylint: disable=too-many-arguments, too-many-locals
+            self,
+            model: torch.nn.Module,
+            dummy_input: Union[torch.Tensor, Tuple],
+            data_loader: DataLoader,
+            eval_callback: Callable[[torch.nn.Module], float],
+            param_bw: int = 8,
+            output_bw: int = 8,
+            quant_scheme: QuantScheme = QuantScheme.post_training_tf_enhanced,
+            rounding_mode: str = 'nearest',
+            config_file: str = None,
+            results_dir: str = "/tmp",
+            cache_id: str = None,
+            strict_validation: bool = True,
+            model_prepare_required: bool = True) -> None:
+        '''
+        :param model: Model to be quantized. Assumes model is on the correct device
+        :param dummy_input: Dummy input for the model. Assumes that dummy_input is on the correct device
+        :param data_loader: A collection that iterates over an unlabeled dataset, used for computing encodings
+        :param eval_callback: Function that calculates the evaluation score
+        :param param_bw: Parameter bitwidth
+        :param output_bw: Output bitwidth
+        :param quant_scheme: Quantization scheme
+        :param rounding_mode: Rounding mode
+        :param config_file: Path to configuration file for model quantizers
+        :param results_dir: Directory to save the results of PTQ techniques
+        :param cache_id: ID associated with cache results
+        :param strict_validation: Flag set to True by default. When False, AutoQuant will proceed with execution and handle errors internally if possible. This may produce suboptimal or unintuitive results.
+        :param model_prepare_required: Flag set to True by default. If False, AutoQuant will skip the model prepare step in the pipeline.
+        '''
+        _validate_inputs(model, data_loader, eval_callback, dummy_input, results_dir,
+                         strict_validation, quant_scheme, param_bw, output_bw, rounding_mode)
+
+        self.fp32_model = model
+        self.dummy_input = dummy_input
+        self.data_loader = data_loader
+        self.eval_callback = eval_callback
+
+        self._quantsim_params = dict(
+            param_bw=param_bw,
+            output_bw=output_bw,
+            quant_scheme=_QuantSchemePair(quant_scheme, quant_scheme),
+            rounding_mode=rounding_mode,
+            config_file=config_file,
+        )
+
+        self.results_dir = results_dir
+        if cache_id:
+            self.cache_dir = os.path.join(results_dir, ".auto_quant_cache", cache_id)
+        else:
+            self.cache_dir = None
+
+        self.model_prepare_required = model_prepare_required
+
+        def forward_pass_callback(model, _: Any = None):
+            device = utils.get_device(model)
+            with in_eval_mode(model), torch.no_grad():
+                for input_data in tqdm(data_loader):
+                    input_data = utils.change_tensor_device_placement(input_data, device)
+                    if isinstance(input_data, torch.Tensor):
+                        model(input_data)
+                    else:
+                        assert isinstance(input_data, (tuple, list))
+                        model(*input_data)
+
+        self.forward_pass_callback = forward_pass_callback
+
+        @functools.wraps(eval_callback)
+        def eval_callback_wrapper(model: torch.nn.Module, *args, **kwargs) -> float:
+            """
+            Wrapper to ensure that model is in eval mode before entering eval_callback.
+            """
+            with in_eval_mode(model), torch.no_grad():
+                return eval_callback(model, *args, **kwargs)
+
+        self.eval_callback = eval_callback_wrapper
+
+        # Use at most 2000 samples for AdaRound.
+        num_samples = min(len(self.data_loader.dataset), 2000)
+        batch_size = self.data_loader.batch_size or 1
+        num_batches = math.ceil(num_samples / batch_size)
+        num_batches = min(num_batches, len(self.data_loader))
+        self.adaround_params = self._get_adaround_parameters(self.data_loader, num_batches)
+
+        self._export_kwargs = dict(
+            onnx_export_args=OnnxExportApiArgs(),
+            propagate_encodings=False,
+        )
+        self._model_preparer_kwargs = dict(
+            modules_to_exclude=None,
+            module_classes_to_exclude=None,
+            concrete_args=None,
+        )
+
+        self.eval_manager = _EvalManager(
+            quantsim_factory=self._create_quantsim_and_encodings,
+            eval_func=self._evaluate_model_performance,
+            dummy_input_on_cpu=utils.change_tensor_device_placement(dummy_input, torch.device("cpu")),
+            results_dir=self.results_dir,
+            strict_validation=strict_validation)
+
+        self._quant_scheme_candidates = _QUANT_SCHEME_CANDIDATES
+        self._fp32_acc = None
+
+    @staticmethod
+    @abc.abstractmethod
+    def _get_adaround():
+        """ returns AdaRound """
+
+    @staticmethod
+    @abc.abstractmethod
+    def _get_adaround_parameters(data_loader, num_batches):
+        """ Returns AdaroundParameters(data_loader, num_batches) """
+
+
+    def _evaluate_model_performance(self, model) -> float:
+        """
+        Evaluate the model performance.
+        """
+        return self.eval_callback(model, NUM_SAMPLES_FOR_PERFORMANCE_EVALUATION)
+
+    def run_inference(self) -> Tuple[QuantizationSimModel, float]:
+        '''
+        Creates a quantization model and performs inference
+
+        :return: QuantizationSimModel, model accuracy as float
+        '''
+        model = self.fp32_model
+
+        if self.model_prepare_required:
+            with self.eval_manager.session("Prepare Model") as sess:
+                model = sess.wrap(self._prepare_model)(self.fp32_model)
+
+        # Batchnorm Folding
+        with self.eval_manager.session("Batchnorm Folding", ptq=True) as sess:
+            model, _ = sess.wrap(self._apply_batchnorm_folding)(model)
+            if sess.ptq_result is None:
+                sess.set_ptq_result(model=model,
+                                    applied_techniques=["batchnorm_folding"],
+                                    export_kwargs=self._export_kwargs)
+
+        sim = self._create_quantsim_and_encodings(model)
+
+        if sess.ptq_result is None:
+            # BN folding failed. Need to measure the eval score
+            acc = self._evaluate_model_performance(sim.model)
+        else:
+            # BN folding success. No need to measure the eval score again
+            acc = sess.ptq_result.accuracy
+
+        return sim, acc
+
+    def optimize(self, allowed_accuracy_drop: float = 0.0) -> Tuple[torch.nn.Module, float, str]:
+        """
+        Integrate and apply post-training quantization techniques.
+
+        :param allowed_accuracy_drop: Maximum allowed accuracy drop
+        :return: Tuple of (best model, eval score, encoding path)
+        """
+        result = self._optimize_helper(self._optimize_main, allowed_accuracy_drop)
+        return result["model"],\
+               result["accuracy"],\
+               result["encoding_path"]
+
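+    # --- Usage sketch (illustrative only, not part of the original module) ---
+    # `AutoQuant` refers to the concrete v1 subclass of AutoQuantBase defined later in
+    # this module; `model`, `dummy_input`, `unlabeled_data_loader` and `eval_callback`
+    # are assumed to be provided by the caller.
+    #
+    #   auto_quant = AutoQuant(model,
+    #                          dummy_input=dummy_input,
+    #                          data_loader=unlabeled_data_loader,
+    #                          eval_callback=eval_callback)
+    #   sim, initial_accuracy = auto_quant.run_inference()
+    #   model, optimized_accuracy, encoding_path = auto_quant.optimize(allowed_accuracy_drop=0.01)
+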
+    def set_adaround_params(self, adaround_params) -> None:
+        """
+        Set Adaround parameters.
+        If this method is not called explicitly by the user, AutoQuant will use
+        `data_loader` (passed to `__init__`) for Adaround.
+
+        :param adaround_params: Adaround parameters.
+        """
+        self.adaround_params = adaround_params
+
+    def set_export_params(self,
+                          onnx_export_args: OnnxExportApiArgs = -1,
+                          propagate_encodings: bool = None) -> None:
+        """
+        Set parameters for QuantizationSimModel.export.
+
+        :param onnx_export_args: optional export argument with onnx specific overrides
+                if not provided, export is done via the torchscript graph
+        :param propagate_encodings: If True, encoding entries for intermediate ops
+                (when one PyTorch ops results in multiple ONNX nodes) are filled with
+                the same BW and data_type as the output tensor for that series of ops.
+        """
+        # Here, we use -1 to indicate `onnx_export_args` wasn't specified
+        # since onnx_export_args being None has its own meaning.
+        if onnx_export_args != -1:
+            self._export_kwargs.update(onnx_export_args=onnx_export_args)
+        if propagate_encodings is not None:
+            self._export_kwargs.update(propagate_encodings=propagate_encodings)
+
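+    # --- Usage sketch (illustrative only, not part of the original module) ---
+    # `auto_quant` is assumed to be an already constructed AutoQuant instance.
+    #
+    #   auto_quant.set_export_params(onnx_export_args=OnnxExportApiArgs(),
+    #                                propagate_encodings=True)
+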
+    def set_model_preparer_params(
+            self,
+            modules_to_exclude: Optional[List[torch.nn.Module]] = None,
+            module_classes_to_exclude: Optional[List[torch.nn.Module]] = None,
+            concrete_args: Optional[Dict[str, Any]] = None,
+    ):
+        """
+        Set parameters for model preparer.
+
+        :param modules_to_exclude: List of modules to exclude when tracing.
+        :param module_classes_to_exclude: List of module classes to exclude when tracing.
+        :param concrete_args: Parameter for model preparer. Allows you to partially specialize
+            your function, whether it's to remove control flow or data structures. If the
+            model has control flow, torch.fx won't be able to trace the model. Check
+            torch.fx.symbolic_trace API in detail.
+        """
+        self._model_preparer_kwargs["modules_to_exclude"] = copy.copy(modules_to_exclude)
+        self._model_preparer_kwargs["module_classes_to_exclude"] = copy.copy(module_classes_to_exclude)
+        self._model_preparer_kwargs["concrete_args"] = copy.copy(concrete_args)
+
+    def _create_quantsim_and_encodings( # pylint: disable=too-many-arguments, too-many-locals, too-many-branches
+            self,
+            model: torch.nn.Module,
+            rounding_mode: str = None,
+            output_bw: int = None,
+            output_quant_scheme: QuantScheme = None,
+            output_percentile: float = None,
+            param_bw: int = None,
+            param_quant_scheme: QuantScheme = None,
+            param_percentile: float = None,
+            config_file: str = None,
+            encoding_path: str = None,
+    ) -> QuantizationSimModel:
+        """
+        Create a QuantizationSimModel and compute encoding. If `encoding_path` is not None,
+        it is prioritized over other arguments (`output_bw`, `param_bw`, ...).
+
+        :param model: Model to quantize.
+        :param rounding_mode: Rounding mode. Defaults to self._quantsim_params["rounding_mode"].
+        :param output_bw: Default bitwidth (4-31) to use for quantizing layer inputs and outputs.
+            Defaults to self._quantsim_params["output_bw"].
+        :param output_quant_scheme: Quantization scheme for output quantizers.
+            Defaults to self._quantsim_params["quant_scheme"].output_quant_scheme.
+        :param output_percentile: Percentile value for outputs.
+            Only valid if output quant scheme is percentile scheme.
+        :param param_bw: Default bitwidth (4-31) to use for quantizing layer parameters.
+            Defaults to self._quantsim_params["param_bw"].
+        :param param_quant_scheme: Quantization scheme for param quantizers.
+            Defaults to self._quantsim_params["quant_scheme"].param_quant_scheme.
+        :param param_percentile: Percentile value for parameters.
+            Only valid if param quant scheme is percentile scheme.
+        :param config_file: Path to configuration file for model quantizers.
+                            Defaults to self._quantsim_params["config_file"].
+        :param encoding_path: Path to parameter encodings file.
+        :return: Quantsim model.
+        """
+        if output_bw is not None:
+            assert output_bw <= 32
+
+        if param_bw is not None:
+            assert param_bw <= 32
+
+        if output_quant_scheme is None or param_quant_scheme is None:
+            assert self._quantsim_params["quant_scheme"] is not None
+
+        kwargs = dict(
+            rounding_mode=(rounding_mode or self._quantsim_params["rounding_mode"]),
+            default_output_bw=(output_bw or self._quantsim_params["output_bw"]),
+            default_param_bw=(param_bw or self._quantsim_params["param_bw"]),
+            config_file=(config_file or self._quantsim_params["config_file"]),
+        )
+        sim = self._get_quantsim(model, self.dummy_input, **kwargs)
+
+        default_quant_scheme = self._quantsim_params.get("quant_scheme")
+        if default_quant_scheme is not None:
+            output_quant_scheme = output_quant_scheme or\
+                                  default_quant_scheme.output_quant_scheme
+            output_percentile = output_percentile or default_quant_scheme.output_percentile
+            param_quant_scheme = param_quant_scheme or\
+                                 default_quant_scheme.param_quant_scheme
+            param_percentile = param_percentile or default_quant_scheme.param_percentile
+
+        self._configure_quantsim(sim,
+                                 output_bw,
+                                 output_quant_scheme,
+                                 output_percentile,
+                                 param_bw,
+                                 param_quant_scheme,
+                                 param_percentile,
+                                 encoding_path)
+
+        if self._has_enabled_quantizers(sim):
+            sim.compute_encodings(self.forward_pass_callback, None)
+
+        return sim
+
+    @staticmethod
+    @abc.abstractmethod
+    def _get_quantsim(model, dummy_input, **kwargs):
+        """ Returns QuantizationSimModel(model, dummy_input, **kwargs) """
+
+    @abc.abstractmethod
+    def _configure_quantsim(self, # pylint: disable=too-many-arguments
+                            sim,
+                            output_bw,
+                            output_quant_scheme,
+                            output_percentile,
+                            param_bw,
+                            param_quant_scheme,
+                            param_percentile,
+                            encoding_path):
+        """Configures quantizers in sim with given bitwidths, quantschemes, and percentiles then loads encodings
+
+        Any 32 bit quantizers are disabled after loading and freezing encodings
+        """
+
+    @staticmethod
+    @abc.abstractmethod
+    def _has_enabled_quantizers(sim):
+        """ Returns True if any quantizer in sim is enabled """
+
+    def _prepare_model(self, model):
+        prepared_model = prepare_model(model, **self._model_preparer_kwargs)
+
+        if ModelValidator.validate_model(prepared_model, self.dummy_input):
+            _logger.info(
+                "Model validation has succeeded. Proceeding to AutoQuant algorithm."
+            )
+        else:
+            raise ValueError(
+                "Model validation has failed."
+                " Please make the necessary changes to the model and run again."
+            )
+        return prepared_model
+
+    @cache.mark("batchnorm_folding")
+    def _apply_batchnorm_folding(self, model: torch.nn.Module)\
+            -> Tuple[torch.nn.Module, List[Tuple]]:
+        """
+        Apply batchnorm folding.
+
+        NOTE: Input model is not mutated.
+
+        :param model: Model to apply batchnorm folding.
+        :return: Output model and folded pairs.
+        """
+        model = copy.deepcopy(model)
+        folded_pairs = fold_all_batch_norms(model, None, self.dummy_input)
+        return model, folded_pairs
+
+    @cache.mark("cle")
+    def _apply_cross_layer_equalization(self, model: torch.nn.Module) -> torch.nn.Module:
+        """
+        Apply cross-layer equalization.
+
+        NOTE: Input model is not mutated.
+
+        :param model: Model to apply cross-layer-equalization.
+        :return: Output model.
+        """
+        model = copy.deepcopy(model)
+        if isinstance(self.dummy_input, torch.Tensor):
+            input_shape = tuple(self.dummy_input.shape)
+        else:
+            input_shape = [tuple(x.shape) for x in self.dummy_input]
+        equalize_model(model, input_shape)
+        return model
+
+    @cache.mark("adaround")
+    def _apply_adaround(self, model: torch.nn.Module) -> Tuple[torch.nn.Module, str]:
+        """
+        Apply adaround.
+
+        NOTE1: Input model is not mutated.
+        NOTE2: Parameters `param_bw_override_list` and `ignore_quant_ops_list` are always set to None.
+
+        :param model: Model to apply adaround.
+        :return: Output model and the path to the parameter encoding file.
+        """
+        # NOTE: We dont need to make a deepcopy of model here, since Adaround.apply_adaround
+        # internally creates and returns a deepcopy of model.
+        filename_prefix = "adaround"
+        adaround_encoding_path = os.path.join(self.results_dir,
+                                              "{}.encodings".format(filename_prefix))
+
+        sim = self._create_quantsim_and_encodings(model)
+
+        self._disable_activation_quantizers(sim)
+
+        model = self._get_adaround()._apply_adaround(sim, model, self.dummy_input, self.adaround_params, # pylint: disable=protected-access
+                                                     path=self.results_dir, filename_prefix=filename_prefix)
+
+        return model, adaround_encoding_path
+
+    @staticmethod
+    @abc.abstractmethod
+    def _disable_activation_quantizers(sim):
+        """ Disables all input and output quantizers in sim """
+
+    def _optimize_helper(
+            self,
+            optimize_fn: Callable,
+            allowed_accuracy_drop: float) -> Tuple[torch.nn.Module, float, str]:
+        """
+        Integrate and apply post-training quantization techniques.
+
+        :param allowed_accuracy_drop: Maximum allowed accuracy drop
+        :return: Tuple of (best model, eval score, encoding path)
+        """
+        allowed_accuracy_drop = float(allowed_accuracy_drop)
+        if allowed_accuracy_drop < 0:
+            raise ValueError(
+                "`allowed_accuracy_drop` must be a positive value. Got {:.2f}"
+                .format(allowed_accuracy_drop)
+            )
+
+        self.eval_manager.clear()
+
+        try:
+            with in_eval_mode(self.fp32_model), cache.enable(self.cache_dir):
+                _logger.info("Starting AutoQuant")
+
+                self._fp32_acc = self._evaluate_model_performance(self.fp32_model)
+                target_acc = self._fp32_acc - allowed_accuracy_drop
+                _logger.info("Target eval score: %f", target_acc)
+                _logger.info("FP32 eval score (W32A32): %f", self._fp32_acc)
+
+                ret = optimize_fn(self.fp32_model, target_acc)
+
+                acc = ret["accuracy"]
+                if acc is not None:
+                    _logger.info("Best eval score: %f", acc)
+
+                    if acc < target_acc:
+                        _logger.info(
+                            "AutoQuant is unable to match the target accuracy. "
+                            "Consider Quantization Aware Training."
+                        )
+
+                return ret
+        finally:
+            self.eval_manager.export_diagnostics()
+
+    def get_quant_scheme_candidates(self) -> Tuple[_QuantSchemePair, ...]:
+        """
+        Return the candidates for quant scheme search.
+        During :meth:`~AutoQuant.optimize`, the candidate with the highest accuracy
+        will be selected among them.
+
+        :return: Candidates for quant scheme search
+        """
+        return self._quant_scheme_candidates
+
+    def set_quant_scheme_candidates(self, candidates: Tuple[_QuantSchemePair, ...]):
+        """
+        Set candidates for quant scheme search.
+        During :meth:`~AutoQuant.optimize`, the candidate with the highest accuracy
+        will be selected among them.
+
+        :param candidates: Candidates for quant scheme search
+        """
+        self._quant_scheme_candidates = copy.copy(candidates)
+
+    def _choose_default_quant_scheme(self):
+        def eval_fn(pair: _QuantSchemePair):
+            sim = self._create_quantsim_and_encodings(
+                self.fp32_model,
+                param_quant_scheme=pair.param_quant_scheme,
+                param_percentile=pair.param_percentile,
+                output_quant_scheme=pair.output_quant_scheme,
+                output_percentile=pair.output_percentile,
+            )
+            eval_score = self._evaluate_model_performance(sim.model)
+            _logger.info("Evaluation finished: %s (eval score: %f)", pair, eval_score)
+            return eval_score
+
+        param_bw = self._quantsim_params["param_bw"]
+        output_bw = self._quantsim_params["output_bw"]
+
+        candidates = self.get_quant_scheme_candidates()
+
+        # If the weight representation has sufficient precision (i.e. bitwidth >= 16),
+        # always use tf scheme
+        if param_bw >= 16:
+            candidates = [
+                candidate for candidate in candidates
+                if candidate.param_quant_scheme == QuantScheme.post_training_tf
+            ]
+
+        # If the output representation has sufficient precision (i.e. bitwidth >= 16),
+        # always use tf scheme
+        if output_bw >= 16:
+            candidates = [
+                candidate for candidate in candidates
+                if candidate.output_quant_scheme == QuantScheme.post_training_tf
+            ]
+
+        # If we have only one candidate left, we don't need to evaluate
+        # the quant schemes for comparison
+        if len(candidates) == 1:
+            return candidates[0]
+
+        assert candidates
+
+        # Find the quant scheme that yields the best eval score
+        return max(candidates, key=eval_fn)
+
+    def _optimize_main(self, fp32_model: torch.nn.Module, target_acc: float): # pylint: disable=too-many-branches
+        """
+        Helper function of apply().
+
+        :param fp32_model: Model to apply PTQ techniques.
+        :param target_acc: Target eval score.
+
+        :raises RuntimeError: If none of the PTQ techniques were finished successfully.
+
+        :return: The best ptq result as a dictionary.
+        """
+        fp32_model = self.fp32_model
+
+        with self.eval_manager.session("Prepare Model") as sess:
+            if self.model_prepare_required:
+                fp32_model = sess.wrap(self._prepare_model)(self.fp32_model)
+            else:
+                raise _StageSkipped("Skipping Model Preparer")
+
+        # Choose quant scheme automatically.
+        with self.eval_manager.session("QuantScheme Selection") as sess:
+            self._quantsim_params["quant_scheme"] = sess.wrap(self._choose_default_quant_scheme)()
+
+        with self.eval_manager.session("W32 Evaluation") as sess:
+            w32_eval_score = sess.wrap(sess.eval)(model=fp32_model, param_bw=32)
+            _logger.info("Evaluation finished: W32A%d (eval score: %f)",
+                         self._quantsim_params["output_bw"], w32_eval_score)
+
+            # Early exit
+            if w32_eval_score < target_acc:
+                _logger.info(
+                    "W32A%d eval score (%f) is lower "
+                    "than the target eval score (%f). This means it is unlikely that "
+                    "the target eval score can be met using PTQ techniques. "
+                    "Please consider finetuning the model using range learning.",
+                    self._quantsim_params["output_bw"], w32_eval_score, target_acc
+                )
+
+                # Since AutoQuant pipeline exited early, all the return values are set to None
+                return {
+                    "model": None,
+                    "accuracy": None,
+                    "encoding_path": None,
+                    "applied_techniques": None,
+                }
+
+            sess.result["target_satisfied"] = True
+
+        # Batchnorm Folding
+        with self.eval_manager.session("Batchnorm Folding", ptq=True) as sess:
+            model, _ = sess.wrap(self._apply_batchnorm_folding)(fp32_model)
+            if sess.ptq_result is None:
+                sess.set_ptq_result(model=model,
+                                    applied_techniques=["batchnorm_folding"],
+                                    export_kwargs=self._export_kwargs)
+
+        best_result = self.eval_manager.get_best_ptq_result()
+        if best_result and best_result.accuracy >= target_acc:
+            sess.result["target_satisfied"] = True
+            return best_result.as_dict()
+
+        # Cross-Layer Equalization
+        with self.eval_manager.session("Cross-Layer Equalization", ptq=True) as sess:
+            model = sess.wrap(self._apply_cross_layer_equalization)(fp32_model)
+            if sess.ptq_result is None:
+                sess.set_ptq_result(model=model,
+                                    applied_techniques=["cross_layer_equalization"],
+                                    export_kwargs=self._export_kwargs)
+
+        best_result = self.eval_manager.get_best_ptq_result()
+        if best_result and best_result.accuracy >= target_acc:
+            sess.result["target_satisfied"] = True
+            return best_result.as_dict()
+
+        if best_result is None:
+            model = fp32_model
+            applied_techniques = []
+        else:
+            if "cross_layer_equalization" not in best_result.applied_techniques:
+                sess.result["effective"] = False
+            model = best_result.load_model()
+            applied_techniques = best_result.applied_techniques
+
+        # AdaRound
+        with self.eval_manager.session("AdaRound", ptq=True) as sess:
+            model, encoding_path = self._apply_adaround(model)
+            if sess.ptq_result is None:
+                sess.set_ptq_result(model=model,
+                                    encoding_path=encoding_path,
+                                    applied_techniques=[*applied_techniques, "adaround"],
+                                    export_kwargs=self._export_kwargs)
+
+        best_result = self.eval_manager.get_best_ptq_result()
+        if best_result:
+            if "adaround" not in best_result.applied_techniques:
+                sess.result["effective"] = False
+            if best_result.accuracy >= target_acc:
+                sess.result["target_satisfied"] = True
+            return best_result.as_dict()
+
+        raise RuntimeError("None of batchnorm folding, CLE, or Adaround "
+                           "has been finished successfully.")
+
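+
+# Illustrative sketch (not part of the AIMET source): the dictionary contract returned by
+# _optimize_main above. On early exit (W32 eval score below target) every value is None;
+# otherwise the values come from the best PtqResult found so far. Example values are made up.
+_EXAMPLE_OPTIMIZE_MAIN_RESULT = {
+    "model": None,               # torch.nn.Module with the applied PTQ techniques
+    "accuracy": None,            # eval score of that model, e.g. 0.74
+    "encoding_path": None,       # path to the exported ".encodings" file
+    "applied_techniques": None,  # e.g. ["batchnorm_folding", "adaround"]
+}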
+
+@dataclass
+class PtqResult:
+    """
+    Evaluation results.
+    :param model_path: Path to the serialized model.
+    :param device: Device on which to load the model.
+    :param encoding_path: Path to the encoding file.
+    :param accuracy: Accuracy of the model.
+    :param applied_techniques: List of PTQ techniques that were applied.
+    """
+    model_path: str
+    device: torch.device
+    encoding_path: str
+    accuracy: float
+    applied_techniques: List[str]
+
+    def load_model(self) -> torch.nn.Module:
+        """
+        Load model.
+        :return: Loaded model.
+        """
+        return torch.load(self.model_path).to(self.device)
+
+    def as_dict(self):
+        """Convert to dictionary"""
+        return dict(model=self.load_model(),
+                    accuracy=self.accuracy,
+                    encoding_path=self.encoding_path,
+                    applied_techniques=self.applied_techniques)
+
+
+class _EvalManager:
+    """
+    Evaluation manager for AutoQuant.
+    """
+    def __init__(self,
+                 quantsim_factory: Callable,
+                 eval_func: Callable[[torch.nn.Module], float],
+                 dummy_input_on_cpu: Union[torch.Tensor, Tuple],
+                 results_dir: str,
+                 strict_validation: bool):
+        """
+        :param quantsim_factory: A factory function that returns QuantizationSimModel.
+        :param eval_func: Evaluation function.
+        :param dummy_input_on_cpu: Dummy input to the model in CPU memory.
+        :param results_dir: Base directory to save the temporary serialized model.
+        :param strict_validation: If True, errors raised during evaluation sessions are propagated;
+                                  otherwise they are logged and ignored.
+        """
+        self._quantsim_factory = quantsim_factory
+        self._eval_func = eval_func
+        self._dummy_input_on_cpu = dummy_input_on_cpu
+        self._results_dir = results_dir
+        self._strict_validation = strict_validation
+
+        os.makedirs(self._results_dir, exist_ok=True)
+
+        self._all_sessions = OrderedDict() # type: OrderedDict[str, _EvalSession]
+
+    def clear(self):
+        """
+        Clear all the session status saved in the previous run
+        """
+        for sess in self._all_sessions.values():
+            sess.reset_status()
+
+    def get_best_ptq_result(self) -> Optional[PtqResult]:
+        """
+        Get the result with the highest evaluation score among the ptq results evaluated so far.
+        :return: The best evaluation result so far.
+        """
+        ptq_results = [sess.ptq_result for sess in self._all_sessions.values()
+                       if sess.ptq_result is not None]
+        if not ptq_results:
+            return None
+
+        return max(ptq_results, key=lambda ptq_result: ptq_result.accuracy)
+
+    def session(self, title: str, ptq: bool = False):
+        """
+        Session factory.
+        :param title: Title of the session.
+        :param ptq: True if this session is a ptq session
+        :return: Session object.
+        """
+        if title not in self._all_sessions:
+            session = _EvalSession(title,
+                                   self._quantsim_factory,
+                                   self._eval_func,
+                                   self._dummy_input_on_cpu,
+                                   results_dir=os.path.join(self._results_dir, ".trace"),
+                                   strict_validation=self._strict_validation,
+                                   ptq=ptq)
+            self._all_sessions[title] = session
+        return self._all_sessions[title]
+
+    HTML_TEMPLATE_FILE = os.path.join(
+        os.path.dirname(os.path.abspath(__file__)),
+        "auto_quant_diagnostics_template.html",
+    )
+
+    def export_diagnostics(self) -> str:
+        """
+        Export diagnostics in html format.
+        :return: Diagnostics string in html format.
+        """
+        loader = jinja2.FileSystemLoader(os.path.dirname(self.HTML_TEMPLATE_FILE))
+        env = jinja2.Environment(loader=loader)
+        template = env.get_template(os.path.basename(self.HTML_TEMPLATE_FILE))
+
+        if any(sess.diagnostics.contains_bokeh() for sess in self._all_sessions.values()):
+            head = CDN.render()
+        else:
+            head = ""
+
+        log = io.StringIO()
+        for sess in self._all_sessions.values():
+            if sess.diagnostics.is_empty():
+                continue
+            log.write(
+                f"<h1> {sess.title} </h1>\n"
+            )
+            content = "\n".join(
+                line.get_html_elem() for line in sess.diagnostics
+            )
+            log.write(f"{content}\n")
+
+        result = OrderedDict()
+        result["ptq_techniques"] = OrderedDict()
+
+        for sess in self._all_sessions.values():
+            if sess.is_ptq_session():
+                result["ptq_techniques"][sess.title_lowercase] = sess.result
+            else:
+                result[sess.title_lowercase] = sess.result
+
+        flowchart_metadata = _build_flowchart_metadata(result)
+
+        html = template.render(head=head, log=log.getvalue(), **flowchart_metadata)
+
+        filename = os.path.join(self._results_dir, "diagnostics.html")
+        with open(filename, "w") as f:
+            f.write(html)
+        return html
+
+
+class _EvalSession: # pylint: disable=too-many-instance-attributes
+    """
+    Evaluation session for AutoQuant.
+
+    Each session object contains a title and diagnostics produced during the session.
+    The collected diagnostics will be exported into a html file by _EvalManager.
+    """
+    def __init__(
+            self,
+            title: str,
+            quantsim_factory: Callable,
+            eval_func: Callable[[torch.nn.Module], float],
+            dummy_input_on_cpu: Union[torch.Tensor, Tuple],
+            results_dir: str,
+            strict_validation: bool,
+            ptq: bool,
+    ):
+        """
+        :param title: Title of the session.
+        :param quantsim_factory: A factory function that returns QuantizationSimModel.
+        :param eval_func: Evaluation function.
+        :param dummy_input_on_cpu: Dummy input to the model in CPU memory.
+        :param results_dir: Base directory to save the temporary serialized model.
+        :param strict_validation: If True, errors raised during the session are propagated;
+                                  otherwise they are logged and ignored.
+        :param ptq: True if this session is a ptq session
+        """
+        self.title = title
+        self._quantsim_factory = quantsim_factory
+        self._eval_func = eval_func
+        self._dummy_input_on_cpu = dummy_input_on_cpu
+        self._results_dir = results_dir
+        self._strict_validation = strict_validation
+        self._ptq = ptq
+
+        self._spinner = None
+
+        self.result = {
+            "status": None,
+            "error": None,
+            "target_satisfied": False,
+            "effective": True,
+        }
+
+        os.makedirs(self._results_dir, exist_ok=True)
+
+        self.diagnostics = Diagnostics()
+
+        # Map session title to file name.
+        # e.g. title: "Cross-Layer Equalization" -> filename: "cross_layer_equalization"
+        self.title_lowercase = self.title.lower().replace("-", " ")
+        self.title_lowercase = "_".join(self.title_lowercase.split())
+
+        stdout_write = sys.stdout.write
+        self._log = io.StringIO()
+
+        # Redirects stdout to self._log
+        def write_wrapper(*args, **kwargs):
+            self._log.write(*args, **kwargs)
+            return stdout_write(*args, **kwargs)
+
+        self._stdout_redirect = patch.object(sys.stdout, "write", write_wrapper)
+        self._ptq_result = None
+        self._cached_result = None
+
+    def is_ptq_session(self):
+        """
+        Getter method of self._ptq flag
+        """
+        return self._ptq
+
+    def reset_status(self):
+        """
+        Reset the session status saved in the previous run
+        """
+        self.result = {
+            "status": None,
+            "error": None,
+            "target_satisfied": False,
+            "effective": True,
+        }
+
+    def wrap(self, fn):
+        """
+        Return a wrapper function that caches the return value.
+
+        :param fn: Function to wrap.
+        :returns: Function whose return value is cached.
+        """
+
+        results_dir = self._results_dir
+        class CachedResult:
+            """Cached result """
+            def __init__(self, obj):
+                self._filename = os.path.join(results_dir, f".{uuid4()}")
+                while os.path.exists(self._filename):
+                    self._filename = os.path.join(results_dir, f".{uuid4()}")
+                with open(self._filename, "wb") as f:
+                    pickle.dump(obj, f)
+
+            def load(self):
+                """Load cached result """
+                with open(self._filename, "rb") as f:
+                    return pickle.load(f)
+
+        @functools.wraps(fn)
+        def wrapper(*args, **kwargs):
+            if self._cached_result:
+                return self._cached_result.load()
+            ret = fn(*args, **kwargs)
+            self._cached_result = CachedResult(ret)
+            return ret
+        return wrapper
+
+    def eval(self, model: torch.nn.Module, **kwargs):
+        """
+        Evaluate the model.
+        :param model: Model to evaluate.
+        :param **kwargs: Additional arguments to the quantsim factory.
+        :return: Eval score.
+        """
+        sim = self._quantsim_factory(model, **kwargs)
+        acc = self._eval_func(sim.model)
+        return acc
+
+    def __enter__(self):
+        self._spinner = Spinner(self.title)
+        self._spinner.__enter__()
+        self._stdout_redirect.start()
+        return self
+
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        if self._ptq_result is not None:
+            _logger.info("Session finished: %s. (eval score: %f)",
+                         self.title, self._ptq_result.accuracy)
+
+        self._spinner.__exit__(exc_type, exc_val, exc_tb)
+
+        if exc_val:
+            buffer = io.StringIO()
+            traceback.print_exception(exc_type, exc_val, exc_tb, file=buffer)
+
+            if exc_type == _StageSkipped:
+                print(exc_val.args[0])
+            else:
+                if self._strict_validation:
+                    print(buffer.getvalue())
+                else:
+                    print(
+                        "################################################################\n"
+                        "################################################################\n"
+                        "################################################################\n"
+                        "WARNING: The following exception was raised but ignored:\n\n"
+                        f"{buffer.getvalue()}"
+                        "################################################################\n"
+                        "################################################################\n"
+                        "################################################################\n"
+                    )
+
+        self._stdout_redirect.stop()
+        self.diagnostics.add(self._log.getvalue())
+
+        self.result["error"] = exc_val
+        if not exc_val:
+            self.result["status"] = "success"
+        elif exc_type == _StageSkipped:
+            self.result["status"] = "discarded"
+            return True
+        elif self._strict_validation:
+            self.result["status"] = "error-failed"
+        else:
+            self.result["status"] = "error-ignored"
+
+        if exc_val and not self._strict_validation:
+            # Return True so that the error doesn't propagate further
+            return True
+        return None
+
+    @property
+    def ptq_result(self) -> Optional[PtqResult]:
+        """Getter of self._ptq_result."""
+        return self._ptq_result
+
+    def set_ptq_result(
+            self,
+            applied_techniques: List[str],
+            model: torch.nn.Module = None,
+            sim: QuantizationSimModel = None,
+            acc: float = None,
+            export_kwargs: Mapping = None,
+            **kwargs
+    ) -> None:
+        """
+        Set the result of PTQ. Should be called exactly once inside a with-as block.
+
+        Exactly one among model and (sim, acc) pair should be specified.
+        1) If sim and acc are specified, save them as the result of this session.
+        2) If model is specified, evaluate the quantized accuracy of the model and save the result.
+
+        :param applied_techniques: List of PTQ techniques applied to the model so far.
+        :param model: Result of PTQ.
+        :param sim: Result of PTQ. The quantization encoding (compute_encodings()) is
+                    assumed to have been computed in advance.
+        :param acc: Eval score.
+        :param export_kwargs: Additional kwargs for sim.export.
+        :param **kwargs: Additional arguments to the quantsim factory.
+        :return: None
+        """
+        if export_kwargs is None:
+            export_kwargs = {}
+
+        if sim is None:
+            assert acc is None
+            assert model is not None
+            sim = self._quantsim_factory(model, **kwargs)
+            acc = self._eval_func(sim.model)
+        else:
+            assert acc is not None
+            assert model is None
+
+        self._set_ptq_result(sim, acc, applied_techniques, export_kwargs)
+
+    def _set_ptq_result(
+            self,
+            sim: QuantizationSimModel,
+            acc: float,
+            applied_techniques: List[str],
+            export_kwargs: Mapping,
+    ) -> PtqResult:
+        """
+        Set the result of PTQ. Should be called exactly once inside a with-as block.
+
+        :param sim: Result of PTQ. The quantization encoding (compute_encodings()) is
+                    assumed to have been computed in advance.
+        :param acc: Eval score.
+        :param export_kwargs: Additional kwargs for sim.export
+        :return: PtqResult object.
+        """
+        if self._ptq_result is not None:
+            raise RuntimeError(
+                "sess.eval() can be called only once per each _EvalSession instance."
+            )
+
+        device = utils.get_device(sim.model)
+        model_path, encoding_path = self._export(sim, export_kwargs)
+        self._ptq_result = PtqResult(
+            model_path=model_path,
+            device=device,
+            encoding_path=encoding_path,
+            accuracy=acc,
+            applied_techniques=applied_techniques,
+        )
+        return self._ptq_result
+
+    def _export(self, sim: QuantizationSimModel, export_kwargs: Mapping) -> Tuple[str, str]:
+        """
+        Export quantsim.
+        :param sim: QuantizationSimModel object to export.
+        :param export_kwargs: Additional kwargs for sim.export
+        :return: The paths where model and encoding are saved
+        """
+        sim.export(path=self._results_dir,
+                   filename_prefix=self.title_lowercase,
+                   dummy_input=self._dummy_input_on_cpu,
+                   **export_kwargs)
+        model_path = os.path.join(self._results_dir, f"{self.title_lowercase}.pth")
+        encoding_path = os.path.join(self._results_dir, f"{self.title_lowercase}.encodings")
+        _logger.info("The results of %s is saved in %s and %s.",
+                     self.title, model_path, encoding_path)
+        return model_path, encoding_path
+
+
+@contextlib.contextmanager
+def spy_auto_quant(auto_quant: AutoQuantBase):
+    """
+    Install a spy that collects the handles to the ptq result of
+    each stage of AutoQuant.
+
+    Typical usage::
+        >>> auto_quant = AutoQuant(...)
+        ... with spy_auto_quant(auto_quant) as spy:
+        ...     _ = auto_quant.apply(...)
+        ...
+        ... for result in spy.get_all_ptq_results():
+        ...     print(result.applied_techniques)
+        ...     print(result.accuracy)
+        ...     print(result.encoding_path)
+        ...     model = result.load_model()
+        ...     ...
+    """
+    # pylint: disable=protected-access
+    class Spy:
+        """
+        Spy that collects the handles to the ptq result of
+        each stage of AutoQuant.
+        """
+        def __init__(self, eval_manager):
+            self._eval_manager = eval_manager
+
+        def get_all_ptq_results(self) -> List[PtqResult]:
+            """Return handles to the results of AutoQuant"""
+            if self._eval_manager is None:
+                return []
+            return [sess.ptq_result for sess in self._eval_manager._all_sessions.values()
+                    if sess.ptq_result is not None]
+
+    spy = Spy(auto_quant.eval_manager)
+
+    _optimize_main = auto_quant._optimize_main
+
+    def _optimize_main_wrapper(fp32_model, target_acc):
+        return _optimize_main(fp32_model, target_acc)
+
+    try:
+        setattr(auto_quant, "_optimize_main", _optimize_main_wrapper)
+        yield spy
+    finally:
+        setattr(auto_quant, "_optimize_main", _optimize_main)
+
+
+def _build_flowchart_metadata(result: Mapping) -> Dict: # pylint: disable=too-many-return-statements
+    """
+    Build flowchart metadata for the html template of summary report
+
+    :param result: Result of AutoQuant with the following format:
+
+        result := {
+            "prepare_model": _stage_result,
+            "quantscheme_selection": _stage_result,
+            "w32_evaluation": _stage_result,
+            "ptq_techniques" [
+                "batchnorm_folding": _stage_result,
+                "cross_layer_equalization": _stage_result,
+                "adaround": _stage_result,
+            ]
+
+        }
+
+        where _stage_result is a dictionary defined as below:
+
+        _stage_result := {
+            "status": str,
+            "error": Exception,
+            "target_satisfied": bool,
+            "effective": bool,
+        }
+
+    :return: Dictionary that contains flowchart metadata for html template
+    """
+    metadata = defaultdict(str)
+    metadata.update(
+        edge_prepare_model_in='data-visited="true"',
+        node_prepare_model='data-visited="true"',
+    )
+
+    status = result['prepare_model']['status']
+    metadata.update(
+        node_prepare_model=f'data-visited="true" data-stage-result="{status}"',
+    )
+
+    if status == 'error-failed':
+        return metadata
+
+    metadata.update(
+        edge_prepare_model_out='data-visited="true"',
+    )
+
+    if "quantscheme_selection" in result:
+        status = result['quantscheme_selection']['status']
+        metadata.update(
+            node_quant_scheme_selection=f'data-visited="true" data-stage-result="{status}"',
+        )
+
+        if status == 'error-failed':
+            return metadata
+
+    metadata.update(
+        edge_quant_scheme_selection_out='data-visited="true"',
+        node_test_w32_eval_score='data-visited="true"',
+    )
+
+    if not result["w32_evaluation"]["target_satisfied"]:
+        metadata.update(
+            edge_test_w32_eval_score_if_false='data-visited="true"',
+            node_result_fail='data-visited="true"',
+        )
+        return metadata
+
+    metadata.update(
+        edge_test_w32_eval_score_if_true='data-visited="true"',
+    )
+
+
+    for ptq_name, ptq_result in result["ptq_techniques"].items():
+        status = ptq_result['status']
+        effective = ptq_result['effective']
+        if status == "success" and not effective:
+            status = "discarded"
+        metadata.update({
+            f"node_{ptq_name}": f'data-visited="true" data-stage-result="{status}"',
+        })
+
+        if status == 'error-failed':
+            return metadata
+
+        metadata.update({
+            f'edge_{ptq_name}_out': 'data-visited="true"',
+            f'node_test_{ptq_name}': 'data-visited="true"',
+        })
+
+        if ptq_result['target_satisfied']:
+            metadata.update({
+                f'edge_test_{ptq_name}_if_true': 'data-visited="true"',
+                'node_result_success': 'data-visited="true"',
+            })
+            return metadata
+
+        metadata.update({
+            f'edge_test_{ptq_name}_if_false': 'data-visited="true"',
+        })
+
+    metadata.update(
+        node_result_fail='data-visited="true"',
+    )
+
+    return metadata
+
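+
+# Illustrative sketch (not part of the AIMET source): how _build_flowchart_metadata above
+# behaves for a minimal result mapping. The layout follows the docstring; all values are made up
+# and the helper name is hypothetical.
+def _example_flowchart_metadata() -> Dict:
+    stage = {"status": "success", "error": None, "target_satisfied": False, "effective": True}
+    result = {
+        "prepare_model": dict(stage),
+        "quantscheme_selection": dict(stage),
+        "w32_evaluation": dict(stage),
+        "ptq_techniques": OrderedDict(),
+    }
+    # The prepare-model and quant-scheme nodes get data-visited="true"; since the W32 evaluation
+    # did not satisfy the target, the flow is routed to `node_result_fail`.
+    return _build_flowchart_metadata(result)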
+
+
+[docs]class AutoQuant(AutoQuantBase): # pylint: disable=too-many-instance-attributes
+    """
+    Integrate and apply post-training quantization techniques.
+
+    AutoQuant includes 1) batchnorm folding, 2) cross-layer equalization,
+    and 3) Adaround.
+    These techniques will be applied in a best-effort manner until the model
+    meets the evaluation goal given as allowed_accuracy_drop.
+    """
+
+    @staticmethod
+    def _get_adaround():
+        """ returns AdaRound """
+        return Adaround
+
+    @staticmethod
+    def _get_adaround_parameters(data_loader, num_batches):
+        return AdaroundParameters(data_loader, num_batches)
+
+    @staticmethod
+    def _get_quantsim(model, dummy_input, **kwargs):
+        return QuantizationSimModel(model, dummy_input, **kwargs)
+
+    def _configure_quantsim(self, # pylint: disable=too-many-arguments
+                            sim,
+                            output_bw,
+                            output_quant_scheme,
+                            output_percentile,
+                            param_bw,
+                            param_quant_scheme,
+                            param_percentile,
+                            encoding_path):
+
+        param_quantizers, input_quantizers, output_quantizers = utils.get_all_quantizers(sim.model)
+
+        # Set input/output quantizers' quant schemes
+        for quantizer in itertools.chain(input_quantizers, output_quantizers):
+            quantizer.quant_scheme = output_quant_scheme
+            if quantizer.quant_scheme == QuantScheme.post_training_percentile and\
+                    output_percentile is not None:
+                quantizer.set_percentile_value(output_percentile)
+
+        # Set param quantizers' quant schemes
+        for quantizer in param_quantizers:
+            quantizer.quant_scheme = param_quant_scheme
+            if quantizer.quant_scheme == QuantScheme.post_training_percentile and\
+                    param_percentile is not None:
+                quantizer.set_percentile_value(param_percentile)
+
+        if encoding_path:
+            sim.set_and_freeze_param_encodings(encoding_path)
+
+        param_quantizers, input_quantizers, output_quantizers = utils.get_all_quantizers(sim.model)
+
+        # Disable input/output quantizers, using fp32 to simulate int32.
+        if output_bw == 32:
+            for quantizer in input_quantizers + output_quantizers:
+                quantizer.enabled = False
+
+        # Disable param quantizers, using fp32 to simulate int32.
+        if param_bw == 32:
+            for quantizer in param_quantizers:
+                quantizer.enabled = False
+
+    @staticmethod
+    def _has_enabled_quantizers(sim):
+        param_quantizers, input_quantizers, output_quantizers = utils.get_all_quantizers(sim.model)
+        return any(quantizer.enabled for quantizer in param_quantizers +\
+                                                      input_quantizers +\
+                                                      output_quantizers)
+
+    @staticmethod
+    def _disable_activation_quantizers(sim):
+        _, input_quantizers, output_quantizers = get_all_quantizers(sim.model)
+        for quantizer in itertools.chain(input_quantizers, output_quantizers):
+            quantizer.enabled = False
+
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/batch_norm_fold.html b/releases/1.32.2/_modules/aimet_torch/batch_norm_fold.html new file mode 100644 index 00000000..68f848f5 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/batch_norm_fold.html @@ -0,0 +1,1751 @@ + + + + + + aimet_torch.batch_norm_fold — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_torch.batch_norm_fold

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Optimization code to fold batch-norm layers """
+
+import contextlib
+import math
+from typing import List, Tuple, Union, Dict, Iterable, Set, Any
+import numpy as np
+import torch
+import torch.nn
+from torch.nn.modules.batchnorm import BatchNorm1d, BatchNorm2d
+from torch.nn.modules.conv import _ConvTransposeNd
+
+import aimet_common.libpymo as libpymo
+
+from aimet_common.bias_correction import ConvBnPatternHandler, CONV_OP_TYPES, LINEAR_OP_TYPES, BN_OP_TYPES
+from aimet_common.graph_pattern_matcher import PatternType
+from aimet_common.graph_searcher import GraphSearcher
+from aimet_common.utils import AimetLogger
+
+# pylint: disable=unused-import
+from aimet_torch.defs import PassThroughOp
+from aimet_torch import utils
+from aimet_torch.meta.connectedgraph import ConnectedGraph
+from aimet_torch.quantsim import QuantizationSimModel
+from aimet_torch.qc_quantize_op import QcQuantizeWrapper
+from aimet_torch.tensor_quantizer import LearnedGridTensorQuantizer
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.BatchNormFolding)
+
+
+LayerType = Union[
+    torch.nn.Linear,
+    torch.nn.Conv1d,
+    torch.nn.Conv2d,
+    torch.nn.ConvTranspose2d,
+]
+_supported_layers = LayerType.__args__
+
+BatchNormType = Union[BatchNorm1d, BatchNorm2d]
+_supported_batchnorms = BatchNormType.__args__
+
+
+def _delete_bn_from_model(model: torch.nn.Module, bn_layer_list: Iterable[BatchNormType]):
+    utils.replace_modules_with_instances_of_new_type(model, bn_layer_list, torch.nn.Identity)
+
+
+@contextlib.contextmanager
+def _expand_shape_to_4d(weight_tensor: libpymo.TensorParams):
+    """ Expand the shape of the weight into 4d.  """
+    dims = len(weight_tensor.shape)
+
+    if dims > 5:
+        raise RuntimeError
+
+    if dims == 4:
+        yield weight_tensor
+
+    else:
+        orig_shape = weight_tensor.shape
+        if dims < 4:
+            # If we have less dimensions, we add 1s to make 4 dimensions
+            _4d_shape = np.append(orig_shape, [1 for _ in range(4-dims)]).astype(int)
+        else:
+            # If we have more dimensions, we concatenate all the dimensions beyond 3 into one dimension
+            _4d_shape = np.array(orig_shape[:3] + [math.prod(orig_shape[3:])])
+
+        try:
+            weight_tensor.shape = _4d_shape
+            yield weight_tensor
+        finally:
+            weight_tensor.shape = orig_shape
+
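+
+# Illustrative sketch (not part of the AIMET source): the shape arithmetic performed by
+# _expand_shape_to_4d above. Weights with fewer than 4 dims are padded with trailing 1s,
+# and a 5-D weight collapses its trailing dims into one. The helper name is hypothetical.
+def _example_expanded_shape(shape: Tuple[int, ...]) -> Tuple[int, ...]:
+    if len(shape) >= 4:
+        return tuple(shape[:3]) + (math.prod(shape[3:]),)   # e.g. (N, C, D, H, W) -> (N, C, D, H*W)
+    return tuple(shape) + (1,) * (4 - len(shape))           # e.g. (N, C) -> (N, C, 1, 1)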
+
+def _call_mo_batch_norm_fold(weight: torch.Tensor,
+                             bias: torch.Tensor,
+                             bn: BatchNormType,
+                             fold_backward: bool):
+    """
+    Calls C++ batch norm folding API.
+
+    :param weight: Weight or scale tensor to fold BN into.
+    :param bias: Bias tensor to fold BN into.
+    :param bn: Batch Norm layer
+    :param fold_backward: True if BatchNorm comes after Conv/Linear layer
+    """
+    with torch.no_grad():
+        bn_params = libpymo.BNParams()
+        bn_params.gamma = bn.weight.detach().cpu().numpy().reshape(-1)
+        bn_params.beta = bn.bias.detach().cpu().numpy().reshape(-1)
+        bn_params.runningMean = bn.running_mean.detach().cpu().numpy().reshape(-1)
+        sigma = torch.sqrt(bn.running_var + bn.eps)
+        bn_params.runningVar = sigma.detach().cpu().numpy().reshape(-1)
+
+        weight_tensor = libpymo.TensorParams()
+
+        weight_tensor.data = weight.detach().cpu().numpy().reshape(-1)
+        weight_tensor.shape = np.array(weight.shape)
+
+        bias_tensor = libpymo.TensorParams()
+
+        bias_tensor.data = bias.detach().cpu().numpy().reshape(-1)
+        bias_tensor.shape = np.array(bias.shape)
+        is_bias_valid = True
+
+        with _expand_shape_to_4d(weight_tensor):
+            _bias = libpymo.fold(bn_params, weight_tensor, bias_tensor, is_bias_valid, fold_backward)
+
+        bias.copy_(torch.tensor(_bias, device=bias.device, dtype=bias.dtype)
+                   .reshape_as(bias))
+
+        weight.copy_(torch.tensor(weight_tensor.data, device=weight.device, dtype=weight.dtype)
+                     .reshape_as(weight))
+
+
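+# Illustrative sketch (not part of the AIMET source): the per-channel arithmetic the C++ fold
+# routine above performs when a BatchNorm follows a Conv/Linear layer (fold_backward=True):
+# y = gamma * (Wx + b - mu) / sigma + beta  ==>  W' = (gamma/sigma) * W,  b' = (gamma/sigma) * (b - mu) + beta.
+# The helper name is hypothetical and exists only for illustration.
+def _reference_fold_backward(weight: torch.Tensor, bias: torch.Tensor, bn: BatchNormType):
+    with torch.no_grad():
+        sigma = torch.sqrt(bn.running_var + bn.eps)
+        scale = bn.weight / sigma                                    # gamma / sigma, per output channel
+        weight.mul_(scale.reshape(-1, *([1] * (weight.dim() - 1))))  # scale each output channel's weights
+        bias.copy_(scale * (bias - bn.running_mean) + bn.bias)
+
+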
+class _BatchNormFoldingNotSupported(RuntimeError):
+    pass
+
+
+def _fold_to_scale(conv_wrapper: QcQuantizeWrapper, bn_wrapper: QcQuantizeWrapper):
+    """
+    Fold BatchNorm into the scale and bias of the given layer.
+
+    :param conv_wrapper: QcQuantizeWrapper that wraps conv or linear layer.
+    :param bn_wrapper: QcQuantizeWrapper that wraps bn.
+    """
+    # pylint: disable=protected-access, too-many-locals, too-many-branches, too-many-statements
+    conv = conv_wrapper._module_to_wrap
+    bn = bn_wrapper._module_to_wrap
+
+    weight_quantizer = conv_wrapper.param_quantizers["weight"]
+
+    if not isinstance(weight_quantizer, LearnedGridTensorQuantizer):
+        raise _BatchNormFoldingNotSupported(
+            "BatchNorm folding to scale supports LearnedGridTensorQuantizer only; "
+            f"got {type(weight_quantizer)}."
+        )
+
+    output_quantizer = conv_wrapper.output_quantizers[0]
+
+    if output_quantizer.enabled:
+        raise _BatchNormFoldingNotSupported(
+            "BatchNorm should belong to the same supergroup with the layer to be folded to."
+        )
+
+    if "bias" in conv_wrapper.param_quantizers:
+        bias_quantizer = conv_wrapper.param_quantizers["bias"]
+        if bias_quantizer.enabled:
+            raise _BatchNormFoldingNotSupported(
+                "Can't fold BatchNorm to scale if bias quantizer is enabled."
+            )
+
+    encodings = weight_quantizer.encoding
+
+    if encodings is None:
+        raise RuntimeError
+
+    if isinstance(encodings, libpymo.TfEncoding):
+        encodings = [encodings]
+
+    if isinstance(conv, _ConvTransposeNd) and conv.groups != 1:
+        raise _BatchNormFoldingNotSupported(
+            "BatchNorm folding to scale is not supported for grouped ConvTransposeNd."
+        )
+
+    # Add quantization noise to the BN params (bn weight & bn bias) before folding.
+    # NOTE: Quantization of foldable batchnorms is automatically disabled when
+    #       initializing quantsim. However, it is still safer to call _quantize_params here
+    #       as we can't guarantee this is always the case.
+    #       For example, the user can manually enable quantization of batchnorms, etc...
+    #       (FYI: _quantize_params takes effect only when the parameter quantizers are enabled)
+    with bn_wrapper._quantize_params():
+        _fold_to_weight(conv, bn, fold_backward=True)
+
+        gamma = bn.weight
+        sigma = torch.sqrt(bn.running_var + bn.eps)
+
+        new_encodings = []
+        for old_encoding, c in zip(encodings, gamma/sigma):
+            new_encoding = libpymo.TfEncoding()
+            new_encoding.delta = old_encoding.delta * abs(c)
+            if c >= 0:
+                new_encoding.max = old_encoding.max * c
+                new_encoding.min = old_encoding.min * c
+            else:
+                new_encoding.max = old_encoding.min * c
+                new_encoding.min = old_encoding.max * c
+            new_encoding.offset = old_encoding.offset
+            new_encoding.bw = old_encoding.bw
+            new_encodings.append(new_encoding)
+
+        weight_quantizer.encoding = new_encodings
+
+    # Copy batchnorm's output quantizers to conv output quantizers
+    for conv_output_quantizer, bn_output_quantizer in\
+            zip(conv_wrapper.output_quantizers, bn_wrapper.output_quantizers):
+        conv_output_quantizer.enabled = bn_output_quantizer.enabled
+
+        if bn_output_quantizer.encoding is not None:
+            encoding = libpymo.TfEncoding()
+            encoding.delta  = bn_output_quantizer.encoding.delta
+            encoding.max    = bn_output_quantizer.encoding.max
+            encoding.min    = bn_output_quantizer.encoding.min
+            encoding.offset = bn_output_quantizer.encoding.offset
+            encoding.bw     = bn_output_quantizer.encoding.bw
+            conv_output_quantizer.encoding = encoding
+
+        bn_output_quantizer.enabled = False
+
+    if "bias" not in conv_wrapper.param_quantizers:
+        bias_quantizer = LearnedGridTensorQuantizer(weight_quantizer.bitwidth,
+                                                    weight_quantizer.round_mode,
+                                                    weight_quantizer.quant_scheme,
+                                                    weight_quantizer.use_symmetric_encodings,
+                                                    enabled_by_default=False,
+                                                    data_type=weight_quantizer.data_type)
+        bias_quantizer._ch_axis = weight_quantizer._ch_axis
+        conv_wrapper.param_quantizers["bias"] = bias_quantizer
+
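+
+# Illustrative sketch (not part of the AIMET source): how _fold_to_scale above rescales an
+# existing TfEncoding for a per-channel folding factor c = gamma/sigma. The grid step scales
+# by |c| and the range endpoints swap when c is negative. The helper name is hypothetical.
+def _example_rescaled_encoding(enc_min: float, enc_max: float, enc_delta: float, c: float):
+    new_delta = enc_delta * abs(c)
+    if c >= 0:
+        new_min, new_max = enc_min * c, enc_max * c
+    else:
+        new_min, new_max = enc_max * c, enc_min * c
+    return new_min, new_max, new_delta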
+
+def _fold_to_weight(conv_linear: LayerType, bn: BatchNormType, fold_backward: bool):
+    """
+    Fold BatchNorm into the weight and bias of the given layer.
+
+    :param conv_linear: Conv or linear layer to fold BN into.
+    :param bn: BatchNorm to fold.
+    """
+    # Transpose weights to C, N, H, W from N, C, H, W since axes are flipped for transposed conv
+    # However depthwise conv layers are always N, 1, H, W whether transposed-conv or not, so no need to transpose
+    if isinstance(conv_linear, torch.nn.ConvTranspose2d) and conv_linear.groups == 1:
+        conv_linear.weight.data = conv_linear.weight.data.permute(1, 0, 2, 3)
+
+    if conv_linear.bias is None:
+        out_channels = conv_linear.out_features if isinstance(conv_linear, torch.nn.Linear)\
+                       else conv_linear.out_channels
+        bias = torch.zeros(out_channels,
+                           device=conv_linear.weight.device,
+                           dtype=conv_linear.weight.dtype)
+        conv_linear.bias = torch.nn.Parameter(bias)
+
+    _call_mo_batch_norm_fold(conv_linear.weight, conv_linear.bias, bn, fold_backward=fold_backward)
+
+    # Transpose weight back to N, C, H, W for transposed Conv2D, for non-depthwise layers
+    if isinstance(conv_linear, torch.nn.ConvTranspose2d) and conv_linear.groups == 1:
+        conv_linear.weight.data = conv_linear.weight.data.permute(1, 0, 2, 3)
+
+
+
+[docs]def fold_given_batch_norms(model, layer_pairs):
+    """
+    Fold a given set of batch_norm layers into conv layers
+
+    :param model: Model
+    :param layer_pairs: Pairs of conv and batch_norm layers to use for folding
+    :return: None
+    """
+    # pylint: disable=protected-access
+    conv_bn_pairs = []
+    bn_conv_pairs = []
+
+    def is_batchnorm(module: torch.nn.Module) -> bool:
+        if isinstance(module, QcQuantizeWrapper):
+            module = module._module_to_wrap
+        return isinstance(module, _supported_batchnorms)
+
+    def is_conv_linear(module: torch.nn.Module) -> bool:
+        if isinstance(module, QcQuantizeWrapper):
+            module = module._module_to_wrap
+        return isinstance(module, _supported_layers)
+
+    for x, y in layer_pairs:
+        if is_batchnorm(x):
+            assert is_conv_linear(y)
+            bn = x
+            conv = y
+            bn_conv_pairs.append((bn, conv))
+        else:
+            assert is_conv_linear(x)
+            assert is_batchnorm(y)
+            conv = x
+            bn = y
+            conv_bn_pairs.append((conv, bn))
+
+    _fold_given_batch_norms(model, conv_bn_pairs, bn_conv_pairs)
+
+
+def _fold_given_batch_norms(model,
+                            conv_bn_pairs: Iterable[Tuple[torch.nn.Module, torch.nn.Module]],
+                            bn_conv_pairs: Iterable[Tuple[torch.nn.Module, torch.nn.Module]]):
+    """
+    Fold a given set of batch_norm layers into conv layers
+
+    :param model: Model
+    :param conv_bn_pairs: List of (conv, bn) pairs to fold
+    :param bn_conv_pairs: List of (bn, conv) pairs to fold
+    :return: None
+    """
+    # pylint: disable=protected-access
+    for bn, conv in bn_conv_pairs:
+        if isinstance(conv, QcQuantizeWrapper):
+            raise RuntimeError(f"Forward folding to scale is not possible. Got {conv}")
+
+    bn_modules = []
+
+    def _fold(conv, bn, fold_backward):
+        is_wrapped = isinstance(conv, QcQuantizeWrapper) or isinstance(bn, QcQuantizeWrapper)
+        try:
+            if is_wrapped:
+                assert isinstance(conv, QcQuantizeWrapper) and isinstance(bn, QcQuantizeWrapper)
+                _fold_to_scale(conv, bn)
+                bn_modules.append(bn._module_to_wrap)
+            else:
+                _fold_to_weight(conv, bn, fold_backward=fold_backward)
+        except _BatchNormFoldingNotSupported as e:
+            bn_name = utils.get_layer_name(model, bn)
+            conv_name = utils.get_layer_name(model, conv)
+            _logger.warning(
+                "Failed to fold %s to %s. [Reason] %s", bn_name, conv_name, str(e)
+            )
+        else:
+            bn_modules.append(bn._module_to_wrap if is_wrapped else bn)
+
+    with utils.in_eval_mode(model), torch.no_grad():
+        for conv, bn in conv_bn_pairs:
+            _fold(conv, bn, fold_backward=True)
+
+        for bn, conv in bn_conv_pairs:
+            _fold(conv, bn, fold_backward=False)
+
+        _delete_bn_from_model(model, bn_modules)
+
+
+def find_all_batch_norms_to_fold(model, input_shapes, dummy_input: Union[torch.Tensor, Tuple] = None):
+    """
+    Find all possible batch norm layers that can be folded, and return a list of pairs such that (bn, layer)
+    means bn will be forward-folded into layer and (layer, bn) means bn will be backward-folded into layer
+    :param model: Model to search
+    :param input_shapes: Input shapes to use for the model (can be one or multiple inputs)
+    :param dummy_input: A dummy input to the model. Can be a Tensor or a Tuple of Tensors
+    :return: List of pairs of bn and layers to fold bn into
+    """
+    device = utils.get_device(model)
+    if dummy_input is not None:
+        connected_graph = ConnectedGraph(model, dummy_input)
+    else:
+        device = utils.get_device(model)
+        inp_tensor_list = utils.create_rand_tensors_given_shapes(input_shapes, device)
+        connected_graph = ConnectedGraph(model, inp_tensor_list)
+
+    conv_bn_pairs, bn_conv_pairs, _ = _find_all_batch_norms_to_fold(connected_graph)
+    return conv_bn_pairs + bn_conv_pairs
+
+
+def _find_all_batch_norms_to_fold(connected_graph: ConnectedGraph) -> Tuple[
+        List[Tuple[LayerType, BatchNormType]], List[Tuple[BatchNormType, LayerType]]]:
+    """
+    Find all possible batch norm layers that can be folded, and return a list of pairs such that (bn, layer)
+    means bn will be forward-folded into layer and (layer, bn) means bn will be backward-folded into layer
+    :param connected_graph: Connected graph associated with the model.
+    :return: A list of (layer, bn) pairs and a list of (bn, layer) pairs,
+             where `bn` can be folded into `layer`.
+    """
+    conv_bn_pairs, bn_conv_pairs, bn_to_fold = _find_foldable_bn_pair_and_bn_picked_for_folding(connected_graph)
+    return conv_bn_pairs, bn_conv_pairs, bn_to_fold
+
+
+def _find_foldable_bn_pair_and_bn_picked_for_folding(connected_graph: ConnectedGraph) -> Tuple[
+        List[Tuple[LayerType, BatchNormType]], List[Tuple[BatchNormType, LayerType]], Set]:
+    """
+    Find all possible batch norm layers that can be folded, and return a list of pairs such that (bn, layer)
+    means bn will be forward-folded into layer and (layer, bn) means bn will be backward-folded into layer
+    :param connected_graph: Connected graph associated with the model.
+    :return: A list of (layer, bn) pairs and a list of (bn, layer) pairs,
+             where `bn` can be folded into `layer`.
+             A set of bn ops which can be folded in to immediate convs.
+    """
+    conv_linear_bn_activation_info_dict = find_all_conv_bn_with_activation_in_graph(connected_graph)
+
+    # To mark BN's already picked for backward folding
+    bn_picked_for_folding = set()
+
+    _conv_linear_optypes = CONV_OP_TYPES + LINEAR_OP_TYPES
+    ordered_conv_fc_modules = [op.get_module() for op in connected_graph.ordered_ops if op.type in _conv_linear_optypes]
+
+    conv_bn_pairs = []
+    # Backward fold is given priority over Forward fold
+    for module in ordered_conv_fc_modules:
+        if module in conv_linear_bn_activation_info_dict.keys() and _is_valid_bn_fold(module, True):
+            bn_info = conv_linear_bn_activation_info_dict[module]
+            if bn_info.output_bn and bn_info.output_bn not in bn_picked_for_folding:
+                conv_bn_pairs.append((module, bn_info.output_bn.get_module()))
+                bn_picked_for_folding.add(bn_info.output_bn)
+
+    bn_conv_pairs = []
+    for module in ordered_conv_fc_modules:
+        if module in conv_linear_bn_activation_info_dict.keys() and _is_valid_bn_fold(module, False):
+            bn_info = conv_linear_bn_activation_info_dict[module]
+            if bn_info.input_bn and bn_info.input_bn not in bn_picked_for_folding:
+                bn_conv_pairs.append((bn_info.input_bn.get_module(), module))
+                bn_picked_for_folding.add(bn_info.input_bn)
+
+    return conv_bn_pairs, bn_conv_pairs, bn_picked_for_folding
+
+
+def find_standalone_batchnorm_ops(connected_graph: ConnectedGraph) -> set:
+    """
+    Find all batchnorm ops that cannot be folded.
+    :param connected_graph: Connected graph associated with the model.
+    :return stand_alone_bn_ops: Set of batchnorm ops that cannot be folded.
+    """
+    _, _, bn_picked_for_folding = _find_foldable_bn_pair_and_bn_picked_for_folding(connected_graph)
+    bn_ops = {op for op in connected_graph.get_all_ops().values() if op.type in BN_OP_TYPES}
+    stand_alone_bn_ops = bn_ops - bn_picked_for_folding
+
+    return stand_alone_bn_ops
+
+
+def _is_valid_bn_fold(conv: LayerType, fold_backward: bool) -> bool:
+    """
+    Determine if a given layer can successfully absorb a BatchNorm given the layer type and parameters
+    :param conv: The Conv/Linear layer to fold a BatchNorm into.
+    :param fold_backward: True if BatchNorm comes after Conv/Linear layer
+    :return: True if a BatchNorm layer can be folded without causing output error.
+    """
+    valid = True
+    if not fold_backward:
+        # Cannot fold BN -> Conv with padding. AIMET does not support forward folding to grouped or DW Conv
+        if isinstance(conv, (torch.nn.Conv2d, torch.nn.Conv1d, torch.nn.Conv3d)):
+            valid &= all(item == 0 for item in conv.padding)
+            valid &= conv.groups == 1
+        # AIMET does not support forward folding to ConvTranspose
+        elif isinstance(conv, torch.nn.ConvTranspose2d):
+            valid = False
+    else:
+        # AIMET does not support backwards folding to grouped ConvTranspose
+        if isinstance(conv, torch.nn.ConvTranspose2d):
+            valid &= conv.groups in (1, conv.in_channels)
+    return valid
+
+
+def fold_all_batch_norms_to_weight(
+        model: torch.nn.Module,
+        input_shapes: Union[Tuple, List[Tuple]],
+        dummy_input: Union[torch.Tensor, Tuple] = None
+) -> List[Tuple[LayerType, BatchNormType]]:
+    """
+    Fold all batch_norm layers in a model into the weight of the corresponding conv layers
+
+    :param model: Model
+    :param input_shapes: Input shapes for the model (can be one or multiple inputs)
+    :param dummy_input: A dummy input to the model. Can be a Tensor or a Tuple of Tensors
+    :return: A list of pairs of layers [(Conv/Linear, BN layer that got folded)]
+    """
+    if isinstance(model, torch.nn.DataParallel):
+        return fold_all_batch_norms_to_weight(model.module, input_shapes, dummy_input)
+    device = utils.get_device(model)
+    if dummy_input is None:
+        inp_tensor_list = utils.create_rand_tensors_given_shapes(input_shapes, device)
+    else:
+        inp_tensor_list = dummy_input
+    connected_graph = ConnectedGraph(model, inp_tensor_list)
+
+    conv_bn_pairs, bn_conv_pairs, bn_to_fold = _find_all_batch_norms_to_fold(connected_graph)
+
+    _fold_given_batch_norms(model, conv_bn_pairs, bn_conv_pairs)
+
+    # Convert the standalone BNs which are not folded
+    bn_converted = convert_standalone_batchnorms(model, inp_tensor_list, bn_to_fold)
+    _logger.info("%d BatchNorms' weights got converted", len(bn_converted))
+    return conv_bn_pairs + [(conv, bn) for bn, conv in bn_conv_pairs]
+
+
+def convert_standalone_batchnorms(model: torch.nn.Module,
+                                  dummy_input: Union[torch.Tensor, Tuple],
+                                  folded_bn: set) -> List[Tuple[Any, BatchNorm2d]]:
+    """
+    Convert the weights of all the standalone batchnorms of a model which didn't get folded.
+    :param model: torch model for which batch norm folding is being performed
+    :param dummy_input: dummy input for the model
+    :param folded_bn: list of BNs which got folded
+    :return: List of tuple(name, bn_module) whose weights got converted
+    """
+
+    module_list = utils.get_ordered_list_of_modules(model, dummy_input)
+    bn_converted = []
+    for name, module in module_list:
+        if isinstance(module, (torch.nn.BatchNorm1d, torch.nn.BatchNorm2d)) and module not in folded_bn:
+            convert_batchnorm_parameters(model, module)
+            _logger.debug("%s weights got converted", name)
+            bn_converted.append((name, module))
+    return bn_converted
+
+
+def convert_batchnorm_parameters(model: torch.nn.Module, bn: Union[torch.nn.BatchNorm1d, torch.nn.BatchNorm2d]):
+    """
+    Convert the weight of a batchnorm so that it takes the form y = weight*input + bias
+    :param model: torch model for which batch norm folding is being performed
+    :param bn: BatchNorm module whose weights need to be converted
+    """
+    with utils.in_eval_mode(model), torch.no_grad():
+        gamma = bn.weight
+        beta = bn.bias
+        running_mean = bn.running_mean
+        inv_sigma = torch.rsqrt(bn.running_var + bn.eps)
+
+        weight = gamma*inv_sigma
+        bias = beta - running_mean * weight
+
+        # Update the values
+        bn.eps = 0
+        bn.track_running_stats = False
+        bn.weight.copy_(weight.clone().detach())
+        bn.bias.copy_(bias.clone().detach())
+        bn.running_mean = torch.zeros(bn.running_mean.shape, device=bn.running_mean.device, dtype=bn.running_mean.dtype)
+        bn.running_var = torch.ones(bn.running_var.shape, device=bn.running_var.device, dtype=bn.running_var.dtype)
+
+
+fold_all_batch_norms = fold_all_batch_norms_to_weight
+
+
[docs]def fold_all_batch_norms_to_scale( + sim: QuantizationSimModel, +) -> List[Tuple[QcQuantizeWrapper, QcQuantizeWrapper]]: + """ + Fold all batch_norm layers in a model into the quantization scale parameter + of the corresponding conv layers + + :param sim: QuantizationSimModel + :return: A list of pairs of layers [(Conv/Linear, BN layer that got folded)] + """ + # pylint: disable=protected-access + assert sim.model is not None + assert sim.connected_graph is not None + + model = sim.model + connected_graph = sim.connected_graph + + quant_wrappers = { + quant_wrapper._module_to_wrap: quant_wrapper + for _, quant_wrapper in sim.quant_wrappers() + } + conv_bn_pairs, bn_conv_pairs, _ = _find_all_batch_norms_to_fold(connected_graph) + conv_bn_pairs = [ + (quant_wrappers[conv], quant_wrappers[bn]) for conv, bn in conv_bn_pairs + ] + bn_conv_pairs = [ + (quant_wrappers[bn], quant_wrappers[conv]) for bn, conv in bn_conv_pairs + ] + + _fold_given_batch_norms(model, conv_bn_pairs, bn_conv_pairs) + + return conv_bn_pairs + [(conv, bn) for bn, conv in bn_conv_pairs]
+ + +def find_all_conv_bn_with_activation(model: torch.nn.Module, input_shape: Tuple) -> Dict: + """ + Uses searcher to find preceding and next bn layers for a conv/linear layer + :param model: PyTorch model + :param input_shape: shape of input to the model + :return: dictionary of conv/linear layers with associated bn op / activation info + """ + device = utils.get_device(model) + inp_tensor_list = utils.create_rand_tensors_given_shapes(input_shape, device) + connected_graph = ConnectedGraph(model, inp_tensor_list) + return find_all_conv_bn_with_activation_in_graph(connected_graph) + + +def find_all_conv_bn_with_activation_in_graph(connected_graph: ConnectedGraph) -> Dict: + """ + Uses searcher to find preceding and next bn layers for a conv/linear layer + :param connected_graph: ConnectedGraph object. + :return: dictionary of conv/linear layers with associated bn op / activation info + """ + + # initialize all patterns to be matched and associated call back functions + patterns_with_callbacks = [] + layer_select_handler = ConvBnPatternHandler() + conv_types = ['Conv1d', 'Conv', 'ConvTranspose'] + linear_types = ['Gemm'] + + for op_type in conv_types + linear_types: + patterns_with_callbacks.append(PatternType(pattern=['BatchNormalization', op_type], + action=layer_select_handler)) + patterns_with_callbacks.append(PatternType(pattern=[op_type, 'BatchNormalization'], + action=layer_select_handler)) + patterns_with_callbacks.append(PatternType(pattern=['Conv3d', 'BatchNorm3d'], action=layer_select_handler)) + patterns_with_callbacks.append(PatternType(pattern=['BatchNorm3d', 'Conv3d'], action=layer_select_handler)) + + # create graph searcher instance with connected graph and patterns to search + graph_searcher = GraphSearcher(connected_graph, patterns_with_callbacks) + + # get all conv/linear and bn info + graph_searcher.find_all_patterns_in_graph_apply_actions() + convs_bn_activation_dict = layer_select_handler.get_conv_linear_bn_info_dict() + + return convs_bn_activation_dict +
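A minimal usage sketch for the folding entry points above, assuming a torchvision ResNet-18 purely as an example model; only the signatures shown in this module are relied upon.

from torchvision import models
from aimet_torch.batch_norm_fold import fold_all_batch_norms

model = models.resnet18().eval()

# Fold BN parameters into the weights of the adjacent Conv/Linear layers.
# Returns the list of (Conv/Linear, BN) pairs that were folded.
conv_bn_pairs = fold_all_batch_norms(model, input_shapes=(1, 3, 224, 224))

# For a QuantizationSimModel `sim`, fold_all_batch_norms_to_scale(sim) instead folds
# each BN into the quantization scale parameters of its Conv/Linear quant wrapper.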
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/bias_correction.html b/releases/1.32.2/_modules/aimet_torch/bias_correction.html new file mode 100644 index 00000000..5772aaed --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/bias_correction.html @@ -0,0 +1,1521 @@ + + + + + + aimet_torch.bias_correction — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +

Source code for aimet_torch.bias_correction

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+# TODO Need to exclude this file for PyLint checking. We get the following error that needs to be investigated:
+# RecursionError: maximum recursion depth exceeded while calling a Python object
+# pylint: skip-file
+
+""" Code to perform bias correction for layers """
+from typing import Callable, Tuple, List, Union, Dict
+import copy
+
+import torch
+import torch.nn
+import numpy as np
+import aimet_common.libpymo as libpymo
+
+from aimet_common.graph_pattern_matcher import PatternType
+from aimet_common.graph_searcher import GraphSearcher
+
+from aimet_torch import utils
+from aimet_torch import quantsim as qsim
+from aimet_torch.meta.connectedgraph import ConnectedGraph
+from aimet_torch.quantsim import QcQuantizeWrapper
+from aimet_torch.save_utils import SaveUtils
+from aimet_common.utils import AimetLogger
+from aimet_common.bias_correction import ConvBnInfoType, ConvBnPatternHandler
+from aimet_common.defs import ActivationType
+from aimet_torch.utils import get_ordered_lists_of_conv_fc
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+
+
+class StopForwardException(Exception):
+    """ Dummy exception to early-terminate forward-pass """
+
+
+def forward_pass(model: torch.nn.Module, batch: torch.Tensor):
+    """
+    Runs a forward pass on CPU or GPU, depending on where the model is allocated, until a
+    StopForwardException is raised by a registered hook
+    :param model: Model to run the forward pass on
+    :param batch: Batch of input data to feed to the model
+    :return: Nothing
+    """
+    # first check if the model is on GPU or not
+    if utils.is_model_on_gpu(model):
+        batch = batch.cuda()
+    try:
+        with utils.in_eval_mode(model), torch.no_grad():
+            _ = model(batch)
+    except StopForwardException:
+        pass
+
+
+def get_quantized_dequantized_weight(layer: torch.nn.Module) -> torch.Tensor:
+    """
+    Gets the quantized-dequantized weights of a layer
+    :param layer: Conv/FC layer
+    :return: Quantized-dequantized weights
+    """
+    weight_tensor = layer._module_to_wrap.weight
+    weight_quantizer = layer.param_quantizers['weight']
+
+    quant_dequant_weights = weight_quantizer.quantize_dequantize(weight_tensor, weight_quantizer.round_mode)
+
+    return quant_dequant_weights
+
+
+def register_fwd_hook_for_layer(layer: torch.nn.Module, hook: Callable) -> torch.utils.hooks.RemovableHandle:
+    """
+    register forward hook for given layer
+    :param layer: layer
+    :param hook: hook function
+    :return: hook handle
+    """
+    hook_handle = layer.register_forward_hook(hook)
+    return hook_handle
+
+
+def get_output_data(layer: torch.nn.Module, model: torch.nn.Module, images_in_one_batch: torch.Tensor) -> np.ndarray:
+    """
+    Function to get the output values of a layer for one batch of images
+    :param layer: Layer whose output is to be captured
+    :param model: Model containing the layer
+    :param images_in_one_batch: Single batch of input images
+    :return: Output of the layer for the given batch, stacked into a numpy array
+    """
+    def _hook_to_collect_output_data(module, _, out_data):
+        """
+        hook to collect output data
+        """
+        out_data = utils.to_numpy(out_data)
+        orig_layer_out_data.append(out_data)
+        raise StopForwardException
+
+    hook_handles = list()
+
+    orig_layer_out_data = list()
+
+    # register forward hooks
+    hook_handles.append(register_fwd_hook_for_layer(layer, _hook_to_collect_output_data))
+
+    # forward pass for 1 batch for model
+    forward_pass(model, images_in_one_batch)
+    output_data = np.vstack(orig_layer_out_data)
+
+    # remove hook handles
+    for hook_handle in hook_handles:
+        hook_handle.remove()
+
+    return output_data
+
+
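# A minimal illustrative sketch (not part of this module): get_output_data() registers a
# forward hook that raises StopForwardException as soon as the target layer produces its
# output, so the rest of the network is never executed. `model`, `model.layer1` and
# `data_loader` below are placeholders.
#
#     images_in_one_batch, _ = next(iter(data_loader))
#     layer_output = get_output_data(model.layer1, model, images_in_one_batch)  # numpy array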
+def call_empirical_mo_correct_bias(layer: torch.nn.Module, bias_correction: libpymo.BiasCorrection):
+    """
+    :param layer: Layer to be corrected
+    :param bias_correction: BiasCorrection object to call pymo interface
+    """
+    device = layer.bias.device
+
+    bias_tensor = libpymo.TensorParamBiasCorrection()
+    bias_tensor.data = layer.bias.detach().cpu().numpy()
+
+    bias_correction.correctBias(bias_tensor)
+
+    bias = torch.nn.Parameter(torch.Tensor(bias_tensor.data))
+
+    layer.bias.data = bias.to(device=device)
+
+
+def call_analytical_mo_correct_bias(layer: torch.nn.Module, bn: Union[torch.nn.BatchNorm2d, None],
+                                    activation_type: Union[ActivationType, None]):
+    """
+    :param layer: Layer to be corrected
+    :param bn: Input BN to layer
+    :param activation_type: Input activation to layer
+    """
+    bias_correction = libpymo.BnBasedBiasCorrection()
+    # The wrapped layer is passed in, since the quantized network is the one being corrected
+    device = layer._modules['_module_to_wrap'].bias.device
+
+    quant_dequant_weight = get_quantized_dequantized_weight(layer)
+
+    weight_tensor = layer._module_to_wrap.weight
+
+    # Transpose weights to C, N, H, W from N, C, H, W since axes are flipped for transposed conv
+    if isinstance(layer._module_to_wrap, torch.nn.ConvTranspose2d) and layer._module_to_wrap.groups == 1:
+        weight_tensor = weight_tensor.permute(1, 0, 2, 3)
+        quant_dequant_weight = quant_dequant_weight.permute(1, 0, 2, 3)
+
+    quant_dequant_weight = quant_dequant_weight.detach().cpu().numpy()
+
+    weight_tensor = weight_tensor.detach().cpu().numpy()
+    bias_tensor = libpymo.TensorParamBiasCorrection()
+    bias_tensor.data = layer._module_to_wrap.bias.detach().cpu().numpy()
+
+    # Assigning activation to No Activation
+    activation = libpymo.ActivationType.noActivation
+    bn_params = libpymo.BnParamsBiasCorr()
+    if bn is None:
+        shape = weight_tensor.shape[1]
+        bn_params.gamma = np.ones(shape)
+        bn_params.beta = np.zeros(shape)
+    else:
+        bn_params.gamma = bn.get_module().weight.detach().cpu().numpy()
+        bn_params.beta = bn.get_module().bias.detach().cpu().numpy()
+
+        if activation_type == ActivationType.relu:
+            activation = libpymo.ActivationType.relu
+        # Relu6's type in connected graph is hardtanh
+        elif activation_type == ActivationType.relu6:
+            activation = libpymo.ActivationType.relu6
+
+    bias_correction.correctBias(bias_tensor, quant_dequant_weight, weight_tensor, bn_params, activation)
+
+    # Assigning the updated bias back to the layer
+    bias = torch.nn.Parameter(torch.Tensor(bias_tensor.data))
+
+    layer._module_to_wrap.bias.data = bias.to(device=device)
+
+
+
[docs]def correct_bias(model: torch.nn.Module, quant_params: qsim.QuantParams, + num_quant_samples: int, data_loader, num_bias_correct_samples: int, + conv_bn_dict: Union[Dict[torch.nn.Module, ConvBnInfoType], None] = None, + perform_only_empirical_bias_corr: bool = True, + layers_to_ignore: List[torch.nn.Module] = None): + """ + Corrects bias for each Conv layer of model (unless ignored). A combination of Analytical and Empirical Bias + Correction is used i.e. all the layers which can be corrected using Analytical Bias Correction are corrected + using Analytical Bias Correction and remaining layers are corrected using Empirical method. + + Returns an in-place corrected floating point model + + :param model: Model to be corrected + :param quant_params: Named tuple for quantization simulation for bias correction + :param num_quant_samples: number of samples of images to pass through quantization sim for bias correction. + :param data_loader: data loader for the model + :param num_bias_correct_samples: number of samples for Bias correction + :param conv_bn_dict: Dict of conv and bn with information related to activation. If None, the function calc it + :param perform_only_empirical_bias_corr: Default True. If true will perform only empirical Bias Corr for all layers + irrespective of the fact that layer is eligible for Analytical Bias Corr. + :param layers_to_ignore: list of layer names for which we need to skip bias correction. + + """ + + if layers_to_ignore is None: + layers_to_ignore = [] + + # Find batch size and shape of input tensor + batch_size, input_shape = utils.get_input_shape_batch_size(data_loader) + + # Rounding up number of samples to batch size + n_batches_bias_correction = int(np.ceil(num_bias_correct_samples / batch_size)) + n_batches_quantization = int(np.ceil(num_quant_samples / batch_size)) + + data_loader_n_samples_bias_corr = utils.IterFirstX(data_loader, n_batches_bias_correction) + data_loader_n_samples_quant = utils.IterFirstX(data_loader, n_batches_quantization) + + # TODO: Remove wrapper function + # Create a wrapping function for data loader for quantization + def pass_data_through_model(model, early_stopping_iterations=None, use_cuda=False): + # pylint: disable=unused-argument + # forward pass for given number of batches for model + for (images_in_one_batch, *_) in data_loader_n_samples_quant: + forward_pass(model, images_in_one_batch) + + ordered_conv_linear_nodes = get_ordered_lists_of_conv_fc(model, input_shape) + + if conv_bn_dict is None: + conv_bn_dict = find_all_conv_bn_with_activation(model, input_shape) + + # Create a copy of the model as reference model + model_copy = copy.deepcopy(model) + + # Add bias for all the layers whose bias is None + for name, module in ordered_conv_linear_nodes: + if module.bias is None: + if isinstance(module, (torch.nn.Conv2d, torch.nn.ConvTranspose2d)): + output_size = module.out_channels + elif isinstance(module, torch.nn.Linear): + output_size = module.out_features + module.bias = torch.nn.Parameter(torch.zeros(output_size)) + module.bias.data = module.bias.data.to(device=module.weight.device) + + # Quantize full model + dummy_tensors = utils.create_rand_tensors_given_shapes(input_shape, utils.get_device(model)) + q = qsim.QuantizationSimModel(model=model, quant_scheme=quant_params.quant_scheme, + rounding_mode=quant_params.round_mode, + default_output_bw=quant_params.act_bw, + default_param_bw=quant_params.weight_bw, + in_place=True, + dummy_input=dummy_tensors, config_file=quant_params.config_file) + + # make sure 
model got updated in-place before we use it for bc updates + assert(q.model is model) + + # updates to skip_output_activation and layers_to_ignore + for name, module in model.named_modules(): + # Skip all layer's output quantization + if isinstance(module, QcQuantizeWrapper): + module.output_quantizers[0].enabled = False + + q.compute_encodings(pass_data_through_model, None) + + # For first conv layer, perform analytical bc if perform_only_empirical_bias_corr is set to False + # and layer is not marked to be ignored during bc. + if not perform_only_empirical_bias_corr: + module_name, module = ordered_conv_linear_nodes[0] + if module not in layers_to_ignore: + logger.info('Correcting layer %s using Analytical Bias Correction', module_name) + quantize_layer = utils.get_layer_by_name(model, module_name) + call_analytical_mo_correct_bias(quantize_layer, None, None) + logger.info('Corrected bias for the layer') + ordered_conv_linear_nodes.pop(0) + + for module_name, module in ordered_conv_linear_nodes: + # Ignore all layers which are skipped by user + if module in layers_to_ignore: + continue + else: + # make sure module is in the model used by qsim. + assert(module in list(q.model.modules())) + # Analytical Bias Correction is only done for Conv layers + reference_layer = utils.get_layer_by_name(model_copy, module_name) + quantize_layer = utils.get_layer_by_name(model, module_name) + + if module in conv_bn_dict.keys(): + + bn_layer_info = conv_bn_dict[module] + + if perform_only_empirical_bias_corr or bn_layer_info is None or bn_layer_info.input_bn is None: + logger.info('Correcting layer %s using Empirical Bias Correction', module_name) + bias_correction = libpymo.BiasCorrection() + + # Get output from quantized model and reference model + + for images_in_one_batch, *_ in data_loader_n_samples_bias_corr: + reference_output_batch = get_output_data(reference_layer, model_copy, images_in_one_batch) + quantized_model_output_batch = get_output_data(quantize_layer, model, images_in_one_batch) + + if isinstance(reference_layer, torch.nn.Linear): + extended_shape = np.concatenate((reference_output_batch.shape, np.array([1, 1]))) + reference_output_batch = reference_output_batch.reshape(extended_shape) + quantized_model_output_batch = quantized_model_output_batch.reshape(extended_shape) + + bias_correction.storePreActivationOutput(reference_output_batch) + bias_correction.storeQuantizedPreActivationOutput(quantized_model_output_batch) + + call_empirical_mo_correct_bias(module, bias_correction) + + else: + logger.info('Correcting layer %s using Analytical Bias Correction', module_name) + call_analytical_mo_correct_bias(quantize_layer, bn_layer_info.input_bn, + bn_layer_info.in_activation_type) + + logger.info('Corrected bias for the layer') + + SaveUtils.remove_quantization_wrappers(model) + + logger.info('Completed bias correction')
+ + +def find_all_conv_bn_with_activation(model: torch.nn.Module, input_shape: Tuple) -> Dict: + """ + Uses searcher to find preceding and next bn layers for a conv/linear layer + :param model: PyTorch model + :param input_shape: shape of input to the model + :return: dictionary of conv/linear layers with associated bn op / activation info + """ + + activation_types = ['Relu', 'Clip'] + + # initialize all patterns to be matched and associated call back functions + patterns_with_callbacks = [] + layer_select_handler = ConvBnPatternHandler() + patterns_with_callbacks.append(PatternType(pattern=['BatchNormalization', 'Conv'], + action=layer_select_handler)) + + patterns_with_callbacks.append(PatternType(pattern=['BatchNormalization', 'ConvTranspose'], + action=layer_select_handler)) + + patterns_with_callbacks.append(PatternType(pattern=['Conv'], + action=layer_select_handler)) + + patterns_with_callbacks.append(PatternType(pattern=['Gemm'], + action=layer_select_handler)) + + for activation in activation_types: + patterns_with_callbacks.append(PatternType(pattern=['BatchNormalization', activation, 'Conv'], + action=layer_select_handler)) + + patterns_with_callbacks.append(PatternType(pattern=['BatchNormalization', activation, 'ConvTranspose'], + action=layer_select_handler)) + + device = utils.get_device(model) + connected_graph = ConnectedGraph(model, (torch.rand(input_shape).to(device),)) + + # create graph searcher instance with connected graph and patterns to search + graph_searcher = GraphSearcher(connected_graph, patterns_with_callbacks) + + # get all conv/linear and bn info + graph_searcher.find_all_patterns_in_graph_apply_actions() + convs_bn_activation_dict = layer_select_handler.get_conv_linear_bn_info_dict() + + return convs_bn_activation_dict +
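A minimal usage sketch for correct_bias, assuming `model` is a trained FP32 module and `data_loader` yields (images, labels) batches; the QuantParams values are illustrative defaults rather than recommendations.

from aimet_torch import quantsim as qsim
from aimet_torch.bias_correction import correct_bias

params = qsim.QuantParams(weight_bw=8, act_bw=8, round_mode="nearest",
                          quant_scheme="tf_enhanced")

# Corrects the model in place: analytical correction is used where a preceding BN is
# available (when perform_only_empirical_bias_corr is False), empirical correction otherwise.
correct_bias(model, params, num_quant_samples=1024, data_loader=data_loader,
             num_bias_correct_samples=1024, perform_only_empirical_bias_corr=False)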
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/bn_reestimation.html b/releases/1.32.2/_modules/aimet_torch/bn_reestimation.html new file mode 100644 index 00000000..821d6b2b --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/bn_reestimation.html @@ -0,0 +1,1308 @@ + + + + + + aimet_torch.bn_reestimation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +

Source code for aimet_torch.bn_reestimation

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+"""BatchNorm Reestimation"""
+
+import itertools
+from typing import Iterable, List, Callable, Any
+
+from tqdm import tqdm
+import torch
+from torch.utils.data import DataLoader
+from torch.nn.modules.batchnorm import _BatchNorm
+from aimet_torch.utils import in_eval_mode, in_train_mode
+from aimet_common.utils import Handle
+
+def _get_active_bn_modules(model: torch.nn.Module) -> Iterable[_BatchNorm]:
+    for module in model.modules():
+        if isinstance(module, _BatchNorm):
+            bn = module
+            if bn.running_mean is not None and bn.running_var is not None:
+                yield bn
+
+
+def _for_each_module(modules: Iterable[torch.nn.Module],
+                     action: Callable[[torch.nn.Module], Handle]) -> Handle:
+    """
+    Apply an undoable action to each module.
+
+    :param modules: Modules to apply the action to.
+    :param action: Action to be applied to the modules.
+    :returns: Handle that undoes the applied action.
+    """
+
+    handles: List[Handle] = []
+
+    def cleanup():
+        for handle in handles:
+            handle.remove()
+
+    try:
+        for module in modules:
+            handle = action(module)
+            assert isinstance(handle, Handle)
+            handles.append(handle)
+        return Handle(cleanup)
+    except:
+        cleanup()
+        raise
+
+
+def _reset_bn_stats(module: _BatchNorm) -> Handle:
+    """
+    Reset BN statistics to the initial values.
+
+    :param module: BatchNorm module.
+    :returns: Handle that restores the original BN statistics upon handle.remove().
+    """
+    orig_running_mean = module.running_mean.clone()
+    orig_running_var = module.running_var.clone()
+    orig_num_batches_tracked = module.num_batches_tracked.clone()
+
+    def cleanup():
+        module.running_mean.copy_(orig_running_mean)
+        module.running_var.copy_(orig_running_var)
+        module.num_batches_tracked.copy_(orig_num_batches_tracked)
+
+    try:
+        module.reset_running_stats()
+        return Handle(cleanup)
+    except:
+        cleanup()
+        raise
+
+
+def _reset_momentum(module: _BatchNorm) -> Handle:
+    """
+    Set BN momentum to 1.0.
+
+    :param module: BatchNorm module.
+    :returns: Handle that restores the original BN momentum upon handle.remove().
+    """
+    momentum = module.momentum
+
+    def cleanup():
+        module.momentum = momentum
+
+    try:
+        module.momentum = 1.0
+        return Handle(cleanup)
+    except:
+        cleanup()
+        raise
+
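# A minimal illustrative sketch (not part of this module) of the Handle idiom used by the
# helpers above: _for_each_module() applies an undoable action to every BN layer and
# returns a single Handle whose remove() undoes all of them. `model` is a placeholder.
#
#     bn_modules = tuple(_get_active_bn_modules(model))
#     handle = _for_each_module(bn_modules, action=_reset_momentum)
#     try:
#         pass  # run forward passes while every BN momentum is 1.0
#     finally:
#         handle.remove()  # each BN layer gets its original momentum back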
+
+DEFAULT_NUM_BATCHES = 100
+
+
+
[docs]def reestimate_bn_stats(model: torch.nn.Module, + dataloader: DataLoader, + num_batches: int = DEFAULT_NUM_BATCHES, + forward_fn: Callable[[torch.nn.Module, Any], Any] = None) -> Handle: + """ + Reestimate BatchNorm statistics (running mean and var). + + :param model: Model to reestimate the BN stats. + :param dataloader: Training dataset. + :param num_batches: The number of batches to be used for reestimation. + :param forward_fn: Optional adapter function that performs forward pass + given a model and a input batch yielded from the data loader. + :returns: Handle that undos the effect of BN reestimation upon handle.remove(). + """ + forward_fn = forward_fn or (lambda model, data: model(data)) + bn_modules = tuple(_get_active_bn_modules(model)) + + # Set all the layers to eval mode except batchnorm layers + with in_eval_mode(model), in_train_mode(bn_modules), torch.no_grad(): + with _for_each_module(bn_modules, action=_reset_momentum): + handle = _for_each_module(bn_modules, action=_reset_bn_stats) + + try: + # Batchnorm statistics accumulation buffer + buffer = { + bn: {"sum_mean": torch.zeros_like(bn.running_mean), + "sum_var": torch.zeros_like(bn.running_var)} + for bn in bn_modules + } + + num_batches = min(len(dataloader), num_batches) + dataloader_slice = itertools.islice(dataloader, num_batches) + + for data in tqdm(dataloader_slice, + total=num_batches, + desc="batchnorm reestimation"): + forward_fn(model, data) + + for bn in bn_modules: + buffer[bn]["sum_mean"] += bn.running_mean + buffer[bn]["sum_var"] += bn.running_var + + for bn in bn_modules: + sum_mean = buffer[bn]["sum_mean"] + sum_var = buffer[bn]["sum_var"] + + # Override BN stats with the reestimated stats. + bn.running_mean.copy_(sum_mean / min(len(dataloader), num_batches)) + bn.running_var.copy_(sum_var / min(len(dataloader), num_batches)) + + return handle + except: + handle.remove() + raise
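A minimal usage sketch for reestimate_bn_stats, assuming `model` and a training DataLoader `train_loader` that yields (images, labels) batches; the forward_fn adapter unpacks the images from each batch.

from aimet_torch.bn_reestimation import reestimate_bn_stats

# Re-estimate the BN running mean/var over (at most) 100 training batches.
handle = reestimate_bn_stats(model, train_loader, num_batches=100,
                             forward_fn=lambda m, batch: m(batch[0]))

# ... use the model with the re-estimated statistics (e.g. fold them or export) ...

handle.remove()  # undo: restores the original BN running statistics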
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/compress.html b/releases/1.32.2/_modules/aimet_torch/compress.html new file mode 100644 index 00000000..59e71068 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/compress.html @@ -0,0 +1,1236 @@ + + + + + + aimet_torch.compress — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +

Source code for aimet_torch.compress

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2018, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Top-level API to AIMET compression library """
+
+from typing import Union, Tuple
+import torch
+
+from aimet_common.defs import CostMetric, CompressionScheme, EvalFunction, CompressionStats
+from aimet_common.bokeh_plots import BokehServerSession
+
+from aimet_torch.defs import SpatialSvdParameters, WeightSvdParameters, ChannelPruningParameters
+from aimet_torch.compression_factory import CompressionFactory
+
+
+
[docs]class ModelCompressor: + """ AIMET model compressor: Enables model compression using various schemes """ + # Too many arguments in this function, disabling pylint for now +
[docs] @staticmethod + def compress_model(model: torch.nn.Module, eval_callback: EvalFunction, eval_iterations, + input_shape: Tuple, + compress_scheme: CompressionScheme, cost_metric: CostMetric, + parameters: Union[SpatialSvdParameters, + WeightSvdParameters, + ChannelPruningParameters], + trainer=None, visualization_url=None) -> Tuple[torch.nn.Module, CompressionStats]: + + """ + Compress a given model using the specified parameters + + :param model: Model to compress + :param eval_callback: Evaluation callback. Expected signature is evaluate(model, iterations, use_cuda). + Expected to return an accuracy metric. + :param eval_iterations: Iterations to run evaluation for + :param trainer: Training Class: Contains a callable, train_model, which takes model, layer which is being fine + tuned and an optional parameter train_flag as a parameter + None: If per layer fine tuning is not required while creating the final compressed model + :param input_shape: Shape of the input tensor for model + :param compress_scheme: Compression scheme. See the enum for allowed values + :param cost_metric: Cost metric to use for the compression-ratio (either mac or memory) + :param parameters: Compression parameters specific to given compression scheme + :param visualization_url: url the user will need to input where visualizations will appear + :return: A tuple of the compressed model, and compression statistics + """ + # pylint:disable=too-many-arguments + # If no url is passed in, then do not create a bokeh server session + if not visualization_url: + bokeh_session = None + else: + # create a bokeh session to publish visualizations to the server document for compression + bokeh_session = BokehServerSession(url=visualization_url, session_id="compression") + + # put model in eval mode. This is important because otherwise running a forward pass can change module buffers + # e.g. for batchnorm layers that can affect model evaluation results + if trainer is not None: + trainer.train_model(model, model, train_flag=False) + + model = model.eval() + + if parameters.multiplicity < 1: + raise ValueError('Rounding Multiplicity should be greater than 1') + + if compress_scheme == CompressionScheme.spatial_svd: + algo = CompressionFactory.create_spatial_svd_algo(model, eval_callback, eval_iterations, + input_shape, cost_metric, parameters, bokeh_session) + + elif compress_scheme == CompressionScheme.weight_svd: + algo = CompressionFactory.create_weight_svd_algo(model, eval_callback, eval_iterations, + input_shape, cost_metric, parameters, bokeh_session) + + elif compress_scheme == CompressionScheme.channel_pruning: + algo = CompressionFactory.create_channel_pruning_algo(model, eval_callback, eval_iterations, + input_shape, cost_metric, parameters, bokeh_session) + + else: + raise ValueError("Compression scheme not supported: {}".format(compress_scheme)) + + compressed_layer_db, stats = algo.compress_model(cost_metric, trainer) + return compressed_layer_db.model, stats
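A minimal usage sketch for ModelCompressor.compress_model, assuming `model` is a trained torch.nn.Module, `evaluate(model, iterations, use_cuda)` is a user-supplied accuracy callback, and `params` is a SpatialSvdParameters object (its construction lives in aimet_torch.defs and is not shown here).

from aimet_common.defs import CompressionScheme, CostMetric
from aimet_torch.compress import ModelCompressor

compressed_model, stats = ModelCompressor.compress_model(
    model,
    eval_callback=evaluate,
    eval_iterations=10,
    input_shape=(1, 3, 224, 224),
    compress_scheme=CompressionScheme.spatial_svd,
    cost_metric=CostMetric.mac,
    parameters=params)

print(stats)  # CompressionStats summary for the compressed model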
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/cross_layer_equalization.html b/releases/1.32.2/_modules/aimet_torch/cross_layer_equalization.html new file mode 100644 index 00000000..6620d4b4 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/cross_layer_equalization.html @@ -0,0 +1,2017 @@ + + + + + + aimet_torch.cross_layer_equalization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +

Source code for aimet_torch.cross_layer_equalization

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Cross Layer Equalization
+
+Some terminology for this code.
+CLS set: Set of layers (2 or 3) that can be used for cross-layer scaling
+Layer groups: Groups of layers that are immediately connected and can be decomposed further into CLS sets
+"""
+
+from typing import Tuple, List, Union, Dict
+from enum import Enum
+import numpy as np
+import torch
+
+import aimet_common.libpymo as libpymo      # pylint: disable=import-error
+
+from aimet_common.utils import AimetLogger
+from aimet_torch import utils
+from aimet_torch.meta.connectedgraph import ConnectedGraph
+from aimet_torch.batch_norm_fold import fold_all_batch_norms
+from aimet_torch.utils import get_device, get_ordered_list_of_modules, create_rand_tensors_given_shapes
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+
+
+ClsSet = Union[Tuple[torch.nn.Conv2d, torch.nn.Conv2d],
+               Tuple[torch.nn.Conv2d, torch.nn.Conv2d, torch.nn.Conv2d]]
+
+ClsSupportedLayer = Union[torch.nn.Conv1d, torch.nn.Conv2d, torch.nn.ConvTranspose1d, torch.nn.ConvTranspose2d]
+
+ScaleFactor = Union[np.ndarray, Tuple[np.ndarray]]
+
+cls_supported_layers = (torch.nn.Conv2d, torch.nn.ConvTranspose2d, torch.nn.Conv1d, torch.nn.ConvTranspose1d)
+cls_supported_activations = (torch.nn.ReLU, torch.nn.PReLU)
+
+
+class ClsLayerType(Enum):
+    """Enum class to represent CLS layer types"""
+    Unsupported = 0
+    Conv = 1  # Overloaded for conv and ConvTranspose
+    DepthwiseConv = 2
+
+
+
[docs]class ClsSetInfo: + """ + This class holds information about the layers in a CLS set, along with the corresponding scaling factors + and other information, such as whether there is a ReLU activation function between the CLS set layers + """ + +
[docs] class ClsSetLayerPairInfo: + """ + Models a pair of layers that were scaled using CLS, along with related information. + """ + + def __init__(self, layer1: torch.nn.Conv2d, layer2: torch.nn.Conv2d, scale_factor: np.ndarray, + relu_activation_between_layers: bool): + """ + :param layer1: Layer whose bias is folded + :param layer2: Layer into which the previous layer's bias is folded + :param scale_factor: Scale factor found from Cross Layer Scaling to scale BN parameters + :param relu_activation_between_layers: True if the activation between layer1 and layer2 is a ReLU + """ + self.layer1 = layer1 + self.layer2 = layer2 + self.scale_factor = scale_factor + self.relu_activation_between_layers = relu_activation_between_layers
+ + def __init__(self, cls_pair_1: ClsSetLayerPairInfo, cls_pair_2: ClsSetLayerPairInfo = None): + """ + Constructor takes 2 pairs if Depth-wise separable layer is being folded + + :param cls_pair_1: Pair between two conv or conv and depth-wise conv + :param cls_pair_2: Pair between depth-wise conv and point-wise conv + """ + if cls_pair_2: + self.cls_pair_info_list = [cls_pair_1, cls_pair_2] + else: + self.cls_pair_info_list = [cls_pair_1]
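# A minimal illustrative sketch (not part of this module): CrossLayerScaling.scale_model,
# defined later in this file, returns one ClsSetInfo per scaled CLS set; each entry exposes
# the scaled layer pairs and their scale factors. `model` is a placeholder.
#
#     cls_set_info_list = CrossLayerScaling.scale_model(model, input_shapes=(1, 3, 224, 224))
#     for cls_set_info in cls_set_info_list:
#         for pair in cls_set_info.cls_pair_info_list:
#             print(pair.layer1, pair.layer2, pair.scale_factor, pair.relu_activation_between_layers)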
+ + +def get_ordered_list_of_conv_modules(model: torch.nn.Module, dummy_input: Union[torch.Tensor, Tuple]) -> List: + """ + Finds order of nodes in graph + :param model: model + :param dummy_input: Dummy input to the model. Used to parse model graph. + :return: List of names in graph in order + """ + module_list = get_ordered_list_of_modules(model, dummy_input) + module_list = [[name, module] for name, module in module_list if isinstance(module, cls_supported_layers)] + return module_list + + +class GraphSearchUtils: + """ + Code to search a model graph to find nodes to use for cross-layer-scaling and high-bias-fold + """ + + def __init__(self, model: torch.nn.Module, input_shapes: Union[Tuple, List[Tuple]], dummy_input: Union[torch.Tensor, List[torch.Tensor]] = None): + + if dummy_input is None: + inp_tensor_list = tuple(utils.create_rand_tensors_given_shapes(input_shapes, get_device(model))) + else: + inp_tensor_list = dummy_input + self._connected_graph = ConnectedGraph(model, inp_tensor_list) + self._ordered_module_list = get_ordered_list_of_conv_modules(model, inp_tensor_list) + + + @staticmethod + def find_downstream_layer_groups_to_scale(op, layer_groups, current_group=None, visited_nodes=None): + """ + Recursive function to find cls layer groups downstream from a given op + :param op: Starting op to search from + :param layer_groups: Running list of layer groups + :param current_group: Running current layer group + :param visited_nodes: Running list of visited nodes (to short-circuit recursion) + :return: None + """ + + if not visited_nodes: + visited_nodes = [] + if not current_group: + current_group = [] + + if op in visited_nodes: + return + visited_nodes.append(op) + # print("Visiting node: {}".format(op.dotted_name)) + + # If current node is Conv2D, add to the current group + if op.model_module and isinstance(op.model_module.get_module(), cls_supported_layers): + current_group.append(op.model_module.get_module()) + + # Terminating condition for current group + if not op.model_module or not isinstance(op.model_module.get_module(), + cls_supported_layers + cls_supported_activations): + + if (len(current_group) > 1) and (current_group not in layer_groups): + layer_groups.append(current_group) + current_group = [] + + if op.output: + for consumer in op.output.consumers: + GraphSearchUtils.find_downstream_layer_groups_to_scale(consumer, layer_groups, + current_group, visited_nodes) + + # Reached a leaf.. See if the current group has something to grab + if (len(current_group) > 1) and (current_group not in layer_groups): + layer_groups.append(current_group) + + @staticmethod + def convert_layer_group_to_cls_sets(layer_group): + """ + Helper function to convert a layer group to a list of cls sets + :param layer_group: Given layer group to generate cls sets + :return: List of cls sets + + Supported layer combinations for CLS are: + 1. Conv + Conv + 2. DepthwiseConv + Conv + 3. Conv + DepthwiseConv + Conv + + Can be rewritten as, + Conv + -> Conv + -> DepthwiseConv + -> Conv + DepthwiseConv + -> Conv + + If a combination is partially supported, the cls_set is completely omitted and restarted from the next + supported layer + For example: Consider Conv + DepthwiseConv + Depthwise(unsupported) + - Since Depthwise(unsupported) is the last layer encountered, we need to omit all the three layers and restart + the cls sets from the next supported layer. 
+ + """ + + # pylint: disable=too-many-branches + def convert_to_cls_layer_type(layer: ClsSupportedLayer) -> Tuple[ClsLayerType, ClsSupportedLayer]: + """ + Given the layer, check if its supported in CLS + :param layer: layer to check + :return: Tuple of ClsLayerType and the layer + """ + if layer.groups == 1: + layer_type = ClsLayerType.Conv + elif layer.groups == layer.in_channels and layer.in_channels == layer.out_channels: + # depthwiseConv layer with depth multiplier = 1 + layer_type = ClsLayerType.DepthwiseConv + else: + layer_type = ClsLayerType.Unsupported + return layer_type, layer + + def get_next_layer() -> Union[Tuple[ClsLayerType, Union[ClsSupportedLayer, None]]]: + """ + :return: Tuple of ClsLayerType and the next layer in layer_group + """ + if not layer_group: + return ClsLayerType.Unsupported, None + layer = layer_group.pop(0) + return convert_to_cls_layer_type(layer) + + cls_sets = [] + + first_layer_to_scale = (ClsLayerType.Unsupported, None) + while layer_group: + while layer_group and first_layer_to_scale[0] is ClsLayerType.Unsupported: + first_layer_to_scale = get_next_layer() + if first_layer_to_scale[0] is ClsLayerType.Unsupported: + logger.info('Layer %s is not supported. Ignoring for cls', first_layer_to_scale[1]) + + second_layer_to_scale = get_next_layer() + if first_layer_to_scale[0] == ClsLayerType.Conv: + if second_layer_to_scale[0] == ClsLayerType.Conv: + cls_sets.append((first_layer_to_scale[1], second_layer_to_scale[1])) + first_layer_to_scale = second_layer_to_scale + elif second_layer_to_scale[0] == ClsLayerType.DepthwiseConv: + if layer_group: + # do not pop third layer yet, determine its type and then pop it + third_layer_to_scale = convert_to_cls_layer_type(layer_group[0]) + if third_layer_to_scale[0] == ClsLayerType.Conv: + cls_sets.append( + (first_layer_to_scale[1], second_layer_to_scale[1], third_layer_to_scale[1])) + # adding third_layer_to_scale for the next round of CLS set determination + first_layer_to_scale = get_next_layer() + else: + # unsupported combination encountered + first_layer_to_scale = second_layer_to_scale + else: + logger.info('Layer %s is not supported. Ignoring for cls', second_layer_to_scale[1]) + first_layer_to_scale = (ClsLayerType.Unsupported, None) + elif first_layer_to_scale[0] == ClsLayerType.DepthwiseConv: + if second_layer_to_scale[0] == ClsLayerType.Conv: + cls_sets.append((first_layer_to_scale[1], second_layer_to_scale[1])) + first_layer_to_scale = second_layer_to_scale + else: + logger.info('Layer %s is not supported. Ignoring for cls', first_layer_to_scale[1]) + first_layer_to_scale = second_layer_to_scale + + return cls_sets + + def find_layer_groups_to_scale(self) -> List[List[torch.nn.Conv2d]]: + """ + :return: List of groups of layers. 
Each group can be independently equalized + """ + + # Find the input node(s) in the graph + input_nodes = [] + for op in self._connected_graph.get_all_ops().values(): + if op.inputs and op.inputs[0].is_model_input: + input_nodes.append(op) + + layer_groups = [] + for op in input_nodes: + self.find_downstream_layer_groups_to_scale(op, layer_groups) + + # Sort the layer groups in order of occurrence in the model + ordered_layer_groups = [] + for _, module in self._ordered_module_list: + for layer_group in layer_groups: + if layer_group[0] is module: + ordered_layer_groups.append(layer_group) + + return ordered_layer_groups + + @staticmethod + def does_module_have_relu_activation(connected_graph: ConnectedGraph, module: torch.nn.Module) -> bool: + """ + Finds if a given module has a ReLU activation + :param connected_graph: Reference to ConnectedGraph instance + :param module: PyTorch module to find activation for + :return: True if module has a relu activation + """ + + for op in connected_graph.get_all_ops().values(): + + if op.model_module and op.model_module.get_module() is module: + assert len(op.output.consumers) == 1 + is_relu_activation = isinstance(op.output.consumers[0].model_module.get_module(), + (torch.nn.ReLU, torch.nn.PReLU)) + return is_relu_activation + + return False + + def is_relu_activation_present_in_cls_sets(self, cls_sets: List[ClsSet]): + """ + :param cls_sets: CLS sets to find relu activations in + :return: List of groups of layers. Each group can be independently equalized + """ + + is_relu_activation_in_cls_sets = [] + for cls_set in cls_sets: + + # We need to check activation functions for all layers but the last one in the set + # Because we are only interested in checking activation functions between the layers we will scale + cls_set = cls_set[:-1] + + is_relu_activation_in_cls_set = () + for module in cls_set: + is_relu_activation_in_cls_set += (self.does_module_have_relu_activation(self._connected_graph, + module), ) + + if len(is_relu_activation_in_cls_set) == 1: + is_relu_activation_in_cls_set = is_relu_activation_in_cls_set[0] + + is_relu_activation_in_cls_sets.append(is_relu_activation_in_cls_set) + + return is_relu_activation_in_cls_sets + + +class CrossLayerScaling: + """ + Code to apply the cross-layer-scaling technique to a model + """ + + @staticmethod + def scale_cls_sets(cls_sets: List[ClsSet]) -> List[ScaleFactor]: + """ + Scale multiple CLS sets + + :param cls_sets: List of CLS sets + :return: Scaling factors calculated and applied for each CLS set in order + """ + scale_factor_list = [] + for cls_set in cls_sets: + scale_factor = CrossLayerScaling.scale_cls_set(cls_set) + scale_factor_list.append(scale_factor) + return scale_factor_list + + @staticmethod + def scale_cls_set(cls_set: ClsSet) -> ScaleFactor: + """ + Scale a CLS set + :param cls_set: Either a pair or regular conv layers or a triplet of depthwise separable layers + :return: Scaling factor calculated and applied + """ + if len(cls_set) == 3: + scale_factor = CrossLayerScaling.scale_cls_set_with_depthwise_layers(cls_set) + else: + scale_factor = CrossLayerScaling.scale_cls_set_with_conv_layers(cls_set) + + return scale_factor + + @classmethod + def scale_cls_set_with_conv_layers(cls, cls_set: ClsSet) -> np.ndarray: + """ + API to invoke equalize layer params (update for weights and bias is in place) + + :param cls_set: Consecutive Conv layers Tuple whose weights and biases need to be equalized + :return: Scaling factor S_12 for each conv layer pair: numpy array + """ + on_gpu = False 
+ for module in cls_set: + if not isinstance(module, cls_supported_layers): + raise ValueError("Only Conv or Transposed Conv layers are supported for cross layer equalization") + if module.weight.is_cuda: + on_gpu = True + module.cpu() + + # Create structs for holding layer weights and bias parameters + prev_layer_params = libpymo.EqualizationParams() + curr_layer_params = libpymo.EqualizationParams() + + # Prepare and pack data structures for cls set. + cls._pack_params_for_conv(cls_set, prev_layer_params, curr_layer_params) + + # Scales weights and bias for consecutive layers and updates data structures in-place. + scaling_factor = libpymo.scaleLayerParams(prev_layer_params, curr_layer_params) + + # Update weight and biases for cls set using updated data structures. + cls._update_params_for_conv(cls_set, prev_layer_params, curr_layer_params) + + if on_gpu: + for module in cls_set: + module.to(device="cuda") + + return scaling_factor + + @classmethod + def scale_cls_set_with_depthwise_layers(cls, cls_set: ClsSet) -> [np.ndarray, np.ndarray]: + """ + API to invoke equalize layer params for depth wise separable layers(update for weights and bias is in place) + + :param cls_set: Consecutive Conv layers whose weights and biases need to be equalized. + Second Conv layer is a depth-wise conv and third conv layer is point-wise conv + :return: Scaling factors S_12 and S_23 : numpy arrays + """ + on_gpu = False + for module in cls_set: + if not isinstance(module, cls_supported_layers): + raise ValueError("Only conv layers are supported for cross layer equalization") + if module.weight.is_cuda: + on_gpu = True + module.cpu() + + # Create structs for holding layer weights and bias parameters + prev_layer_params = libpymo.EqualizationParams() + curr_layer_params = libpymo.EqualizationParams() + next_layer_params = libpymo.EqualizationParams() + + # Prepare and pack data structures for cls set. + cls._pack_params_for_depthwise_conv(cls_set, prev_layer_params, curr_layer_params, next_layer_params) + + # Scales weights and bias for consecutive layers and updates data structures in-place. + scaling_params = libpymo.scaleDepthWiseSeparableLayer(prev_layer_params, curr_layer_params, next_layer_params) + + # Update weight and biases for cls set using updated data structures. 
+ cls._update_params_for_depthwise_conv(cls_set, prev_layer_params, curr_layer_params, next_layer_params) + + if on_gpu: + for module in cls_set: + module.to(device="cuda") + + return scaling_params.scalingMatrix12, scaling_params.scalingMatrix23 + + @staticmethod + def create_cls_set_info_list(cls_sets: List[ClsSet], scale_factors: List[ScaleFactor], + is_relu_activation_in_cls_sets): + """ + Binds information from there separate lists into one [ClsInfoSet] data-structure + :param cls_sets: List of CLS sets + :param scale_factors: Scale-factors for each cls-set + :param is_relu_activation_in_cls_sets: Information if there is relu activation in each cls-set + :return: List of ClsSetInfo + """ + cls_set_info_list = [] + assert len(cls_sets) == len(scale_factors) == len(is_relu_activation_in_cls_sets) + + for index, cls_set in enumerate(cls_sets): + + if isinstance(scale_factors[index], tuple): + # If we are dealing with a triplet of layers, then we should have 2 scale factors and 2 relu flags + # Assert that this is true + assert len(cls_set) == 3 + assert len(scale_factors[index]) == len(is_relu_activation_in_cls_sets[index]) == 2 + + cls_pair_1 = ClsSetInfo.ClsSetLayerPairInfo(cls_set[0], cls_set[1], scale_factors[index][0], + is_relu_activation_in_cls_sets[index][0]) + cls_pair_2 = ClsSetInfo.ClsSetLayerPairInfo(cls_set[1], cls_set[2], scale_factors[index][1], + is_relu_activation_in_cls_sets[index][1]) + + cls_set_info = ClsSetInfo(cls_pair_1, cls_pair_2) + + else: + cls_pair = ClsSetInfo.ClsSetLayerPairInfo(cls_set[0], cls_set[1], scale_factors[index], + is_relu_activation_in_cls_sets[index]) + + cls_set_info = ClsSetInfo(cls_pair) + + cls_set_info_list.append(cls_set_info) + + return cls_set_info_list + + @staticmethod + def scale_model(model: torch.nn.Module, input_shapes: Union[Tuple, List[Tuple]], dummy_input: Union[torch.Tensor, List[torch.Tensor]] = None) -> List[ClsSetInfo]: + """ + Uses cross-layer scaling to scale all applicable layers in the given model + + :param model: Model to scale + :param input_shapes: Input shape for the model (can be one or multiple inputs) + :param dummy_input: Dummy input to the model. Used to parse model graph. User is expected to place the tensors on the appropriate device. 
+ :return: CLS information for each CLS set + """ + if isinstance(model, torch.nn.DataParallel): + return CrossLayerScaling.scale_model(model.module, input_shapes, dummy_input=dummy_input) + device = get_device(model) + model.cpu() + + # Find layer groups + graph_search = GraphSearchUtils(model, input_shapes, dummy_input=dummy_input) + layer_groups = graph_search.find_layer_groups_to_scale() + + # Find cls sets from the layer groups + cls_sets = [] + for layer_group in layer_groups: + cls_set = GraphSearchUtils.convert_layer_group_to_cls_sets(layer_group) + cls_sets += cls_set + + # Scale the CLS sets + scale_factors = CrossLayerScaling.scale_cls_sets(cls_sets) + + # Find if there were relu activations between layers of each cls set + is_relu_activation_in_cls_sets = graph_search.is_relu_activation_present_in_cls_sets(cls_sets) + + # Convert to a list of cls-set-info elements + cls_set_info_list = CrossLayerScaling.create_cls_set_info_list(cls_sets, scale_factors, + is_relu_activation_in_cls_sets) + + model.to(device=device) + return cls_set_info_list + + @staticmethod + def _pack_params_for_conv(cls_set: ClsSet, + prev_layer_params: libpymo.EqualizationParams, + curr_layer_params: libpymo.EqualizationParams): + """ + Prepare and pack data structure for previous and current layer in given cls set. + + :param cls_set: Consecutive Conv layers Tuple whose weights and biases need to be equalized. + :param prev_layer_params: Data structure holding weight and bias for previous layer in cls set. + :param curr_layer_params: Data structure holding weight and bias for current layer in cls set. + """ + weight_set_0 = cls_set[0].weight + + # Transpose weights to C, N, H, W from N, C, H, W since axis are flipped for transposed conv + if isinstance(cls_set[0], torch.nn.ConvTranspose2d): + weight_set_0 = weight_set_0.permute(1, 0, 2, 3) + if isinstance(cls_set[0], torch.nn.ConvTranspose1d): + weight_set_0 = weight_set_0.permute(1, 0, 2) + + prev_layer_params.weight = weight_set_0.detach().numpy().reshape(-1) + prev_layer_params.weightShape = np.array(weight_set_0.shape) + if len(prev_layer_params.weightShape) == 3: + prev_layer_params.weightShape = prev_layer_params.weightShape + [1] + + weight_set_1 = cls_set[1].weight + + # Transpose weights to C, N, H, W from N, C, H, W since axis are flipped for transposed conv + if isinstance(cls_set[1], torch.nn.ConvTranspose2d): + weight_set_1 = weight_set_1.permute(1, 0, 2, 3) + if isinstance(cls_set[1], torch.nn.ConvTranspose1d): + weight_set_1 = weight_set_1.permute(1, 0, 2) + + curr_layer_params.weight = weight_set_1.detach().numpy().reshape(-1) + curr_layer_params.weightShape = np.array(weight_set_1.shape) + if len(curr_layer_params.weightShape) == 3: + curr_layer_params.weightShape = curr_layer_params.weightShape + [1] + + if cls_set[0].bias is not None: + prev_layer_params.bias = cls_set[0].bias.detach().numpy() + else: + prev_layer_params.isBiasNone = True + + @staticmethod + def _update_params_for_conv(cls_set: ClsSet, + prev_layer_params: libpymo.EqualizationParams, + curr_layer_params: libpymo.EqualizationParams): + """ + Update weight and biases for cls set using updated data structures. + + :param cls_set: Consecutive Conv layers Tuple whose weights and biases need to be equalized. + :param prev_layer_params: Data structure holding weight and bias for previous layer in cls set. + :param curr_layer_params: Data structure holding weight and bias for current layer in cls set. 
+ """ + if isinstance(cls_set[0], (torch.nn.Conv1d, torch.nn.ConvTranspose1d)): + prev_layer_params.weightShape = prev_layer_params.weightShape[:-1] + cls_set[0].weight.data = torch.from_numpy(np.reshape(prev_layer_params.weight, + prev_layer_params.weightShape)) + cls_set[0].weight.data = cls_set[0].weight.data.type(torch.FloatTensor) + + + # Transpose weight back to N, C, H, W for transposed Conv2D + if isinstance(cls_set[0], torch.nn.ConvTranspose2d): + cls_set[0].weight.data = cls_set[0].weight.data.permute(1, 0, 2, 3).contiguous() + if isinstance(cls_set[0], torch.nn.ConvTranspose1d): + cls_set[0].weight.data = cls_set[0].weight.data.permute(1, 0, 2).contiguous() + + if isinstance(cls_set[1], (torch.nn.Conv1d, torch.nn.ConvTranspose1d)): + curr_layer_params.weightShape = curr_layer_params.weightShape[:-1] + cls_set[1].weight.data = torch.from_numpy(np.reshape(curr_layer_params.weight, + curr_layer_params.weightShape)) + cls_set[1].weight.data = cls_set[1].weight.data.type(torch.FloatTensor) + + # Transpose weight back to N, C, H, W for transposed Conv2D + if isinstance(cls_set[1], torch.nn.ConvTranspose2d): + cls_set[1].weight.data = cls_set[1].weight.data.permute(1, 0, 2, 3).contiguous() + if isinstance(cls_set[1], torch.nn.ConvTranspose1d): + cls_set[1].weight.data = cls_set[1].weight.data.permute(1, 0, 2).contiguous() + + if cls_set[0].bias is not None: + cls_set[0].bias.data = torch.from_numpy(np.reshape(prev_layer_params.bias, + prev_layer_params.weightShape[0])) + cls_set[0].bias.data = cls_set[0].bias.data.type(torch.FloatTensor) + + @staticmethod + def _pack_params_for_depthwise_conv(cls_set: ClsSet, + prev_layer_params: libpymo.EqualizationParams, + curr_layer_params: libpymo.EqualizationParams, + next_layer_params: libpymo.EqualizationParams): + """ + Prepare and pack data structure for previous, current and next layer in given cls set. + + :param cls_set: Consecutive Conv layers Tuple whose weights and biases need to be equalized. + :param prev_layer_params: Data structure holding weight and bias for previous layer in cls set. + :param curr_layer_params: Data structure holding weight and bias for current layer in cls set. + :param next_layer_params: Data structure holding weight and bias for next layer in cls set. 
+ """ + if isinstance(cls_set[0], torch.nn.ConvTranspose2d): + cls_set[0].weight.data = cls_set[0].weight.data.permute(1, 0, 2, 3).contiguous() + if isinstance(cls_set[0], torch.nn.ConvTranspose1d): + cls_set[0].weight.data = cls_set[0].weight.data.permute(1, 0, 2).contiguous() + + if isinstance(cls_set[2], torch.nn.ConvTranspose2d): + cls_set[2].weight.data = cls_set[2].weight.data.permute(1, 0, 2, 3).contiguous() + if isinstance(cls_set[2], torch.nn.ConvTranspose1d): + cls_set[2].weight.data = cls_set[2].weight.data.permute(1, 0, 2).contiguous() + + assert cls_set[1].groups > 1 + + prev_layer_params.weight = cls_set[0].weight.detach().numpy().flatten() + prev_layer_params.weightShape = np.array(cls_set[0].weight.shape) + if len(prev_layer_params.weightShape) == 3: + prev_layer_params.weightShape = prev_layer_params.weightShape + [1] + + curr_layer_params.weight = cls_set[1].weight.detach().numpy().flatten() + curr_layer_params.weightShape = np.array(cls_set[1].weight.shape) + if len(curr_layer_params.weightShape) == 3: + curr_layer_params.weightShape = curr_layer_params.weightShape + [1] + + next_layer_params.weight = cls_set[2].weight.detach().numpy().flatten() + next_layer_params.weightShape = np.array(cls_set[2].weight.shape) + if len(next_layer_params.weightShape) == 3: + next_layer_params.weightShape = next_layer_params.weightShape + [1] + + + if cls_set[0].bias is not None: + prev_layer_params.bias = cls_set[0].bias.detach().numpy() + else: + prev_layer_params.isBiasNone = True + + if cls_set[1].bias is not None: + curr_layer_params.bias = cls_set[1].bias.detach().numpy() + else: + curr_layer_params.isBiasNone = True + + @staticmethod + def _update_params_for_depthwise_conv(cls_set: ClsSet, + prev_layer_params: libpymo.EqualizationParams, + curr_layer_params: libpymo.EqualizationParams, + next_layer_params: libpymo.EqualizationParams): + """ + Update weight and biases for cls set using updated data structures. + + :param cls_set: Consecutive Conv layers Tuple whose weights and biases need to be equalized. + :param prev_layer_params: Data structure holding weight and bias for previous layer in cls set. + :param curr_layer_params: Data structure holding weight and bias for current layer in cls set. + :param next_layer_params: Data structure holding weight and bias for next layer in cls set. 
+ """ + if isinstance(cls_set[0], (torch.nn.Conv1d, torch.nn.ConvTranspose1d)): + prev_layer_params.weightShape = prev_layer_params.weightShape[:-1] + cls_set[0].weight.data = torch.from_numpy(np.reshape(prev_layer_params.weight, + prev_layer_params.weightShape)) + cls_set[0].weight.data = cls_set[0].weight.data.type(torch.FloatTensor) + if isinstance(cls_set[0], torch.nn.ConvTranspose2d): + cls_set[0].weight.data = cls_set[0].weight.data.permute(1, 0, 2, 3).contiguous() + if isinstance(cls_set[0], torch.nn.ConvTranspose1d): + cls_set[0].weight.data = cls_set[0].weight.data.permute(1, 0, 2).contiguous() + + if isinstance(cls_set[1], (torch.nn.Conv1d, torch.nn.ConvTranspose1d)): + curr_layer_params.weightShape = curr_layer_params.weightShape[:-1] + cls_set[1].weight.data = torch.from_numpy(np.reshape(curr_layer_params.weight, + curr_layer_params.weightShape)) + cls_set[1].weight.data = cls_set[1].weight.data.type(torch.FloatTensor) + + if isinstance(cls_set[2], (torch.nn.Conv1d, torch.nn.ConvTranspose1d)): + next_layer_params.weightShape = next_layer_params.weightShape[:-1] + + cls_set[2].weight.data = torch.from_numpy(np.reshape(next_layer_params.weight, + next_layer_params.weightShape)) + cls_set[2].weight.data = cls_set[2].weight.data.type(torch.FloatTensor) + if isinstance(cls_set[2], torch.nn.ConvTranspose2d): + cls_set[2].weight.data = cls_set[2].weight.data.permute(1, 0, 2, 3).contiguous() + if isinstance(cls_set[2], torch.nn.ConvTranspose1d): + cls_set[2].weight.data = cls_set[2].weight.data.permute(1, 0, 2).contiguous() + + if cls_set[0].bias is not None: + cls_set[0].bias.data = torch.from_numpy(np.reshape(prev_layer_params.bias, + prev_layer_params.weightShape[0])) + cls_set[0].bias.data = cls_set[0].bias.data.type(torch.FloatTensor) + + if cls_set[1].bias is not None: + cls_set[1].bias.data = torch.from_numpy(np.reshape(curr_layer_params.bias, + curr_layer_params.weightShape[0])) + cls_set[1].bias.data = cls_set[1].bias.data.type(torch.FloatTensor) + + +class HighBiasFold: + """ + Code to apply the high-bias-fold technique to a model + """ + + ActivationIsReluForFirstModule = bool + ScaleForFirstModule = np.ndarray + + @classmethod + def bias_fold(cls, cls_set_info_list: List[ClsSetInfo], + bn_layers: Dict[Union[torch.nn.Conv2d, torch.nn.ConvTranspose2d], torch.nn.BatchNorm2d]): + """ + Folds bias values greater than 3 * sigma to next layer's bias + + :param cls_set_info_list: List of info elements for each cls set + :param bn_layers: Key: Conv/Linear layer Value: Corresponding folded BN layer + :return: None + """ + if not bn_layers: + logger.info('High Bias folding is not supported for models without BatchNorm Layers') + return + + for cls_set_info in cls_set_info_list: + for cls_pair_info in cls_set_info.cls_pair_info_list: + + if (cls_pair_info.layer1.bias is None) or (cls_pair_info.layer2.bias is None) or \ + (cls_pair_info.layer1 not in bn_layers): + continue + + # Create data structures for holding layer weights and bias parameters. + prev_layer_params = libpymo.LayerParams() + curr_layer_params = libpymo.LayerParams() + prev_layer_bn_params = libpymo.BNParamsHighBiasFold() + + # Prepare and pack data structures for high bias fold. + cls._pack_bn_layer_params(cls_pair_info, bn_layers, prev_layer_bn_params) + cls._pack_previous_and_current_layer_params(cls_pair_info, prev_layer_params, curr_layer_params) + + # Update bias for previous and current layer and data structures in-place. 
+ libpymo.updateBias(prev_layer_params, curr_layer_params, prev_layer_bn_params) + + # Set updated biases for previous and current layer. + cls._update_previous_and_current_layer_bias(cls_pair_info, prev_layer_params, curr_layer_params) + + @staticmethod + def _pack_bn_layer_params(cls_pair_info: ClsSetInfo.ClsSetLayerPairInfo, + bn_layers: Dict[torch.nn.Module, torch.nn.BatchNorm2d], + prev_layer_bn_params: libpymo.BNParamsHighBiasFold): + """ + Helper method to pack batch norm layer parameter for high bias fold. + + :param cls_pair_info: Layer pairs that were scaled using CLS and related information. + :param bn_layers: Dictionary with Key being Conv/Linear layer and value being corresponding folded BN layer. + :param prev_layer_bn_params: Data structure to pack batch norm parameter. + """ + scaling_parameter = cls_pair_info.scale_factor + + # Scaling gamma and beta parameter of batch norm layer + prev_layer_bn_params.gamma = bn_layers[cls_pair_info.layer1].weight.detach().numpy().reshape(-1) + prev_layer_bn_params.beta = bn_layers[cls_pair_info.layer1].bias.detach().numpy().reshape(-1) + + if len(scaling_parameter) != len(prev_layer_bn_params.gamma) or \ + len(scaling_parameter) != len(prev_layer_bn_params.beta): + raise ValueError("High Bias absorption is not supported for networks with fold-forward BatchNorms") + prev_layer_bn_params.gamma = np.divide(prev_layer_bn_params.gamma, scaling_parameter) + prev_layer_bn_params.beta = np.divide(prev_layer_bn_params.beta, scaling_parameter) + + @staticmethod + def _pack_previous_and_current_layer_params(cls_pair_info, prev_layer_params, curr_layer_params): + """ + Helper method to pack information of previous and current layer. + + :param cls_pair_info: Layer pairs that were scaled using CLS and related information. + :param prev_layer_params: Data structure to pack previous layer parameters. + :param curr_layer_params: Data structure to pack current layer parameters. + """ + prev_layer_params.activationIsRelu = cls_pair_info.relu_activation_between_layers + prev_layer_params.bias = cls_pair_info.layer1.bias.detach().numpy() + + weight = cls_pair_info.layer2.weight + + if isinstance(cls_pair_info.layer2, (torch.nn.Conv1d, torch.nn.ConvTranspose1d)): + weight = torch.unsqueeze(weight, dim=-1) + + # Transpose weights to C, N, H, W from N, C, H, W since axis are flipped for transposed conv + if isinstance(cls_pair_info.layer2, (torch.nn.ConvTranspose1d, torch.nn.ConvTranspose2d)) and \ + cls_pair_info.layer2.groups == 1: + weight = weight.permute(1, 0, 2, 3) + + curr_layer_params.bias = cls_pair_info.layer2.bias.detach().numpy() + curr_layer_params.weight = weight.detach().numpy().reshape(-1) + curr_layer_params.weightShape = np.array(weight.shape) + + @staticmethod + def _update_previous_and_current_layer_bias(cls_pair_info: ClsSetInfo.ClsSetLayerPairInfo, + prev_layer_params: libpymo.LayerParams, + curr_layer_params: libpymo.LayerParams): + """ + Update biases for previous and current layer. + + :param cls_pair_info: Layer pairs that were scaled using CLS and related information. + :param prev_layer_params: Data structure holding weight and bias for previous layer in cls set. + :param curr_layer_params: Data structure holding weight and bias for current layer in cls set. 
+ """ + prev_layer_bias_shape = cls_pair_info.layer1.weight.shape[0] + if (isinstance(cls_pair_info.layer1, (torch.nn.ConvTranspose1d, torch.nn.ConvTranspose2d))) and \ + (cls_pair_info.layer1.groups == 1): + prev_layer_bias_shape = cls_pair_info.layer1.weight.shape[1] + + cls_pair_info.layer1.bias.data = torch.from_numpy(np.reshape(prev_layer_params.bias, + prev_layer_bias_shape)) + cls_pair_info.layer1.bias.data = cls_pair_info.layer1.bias.data.type(torch.FloatTensor) + + cls_pair_info.layer2.bias.data = torch.from_numpy(np.reshape(curr_layer_params.bias, + curr_layer_params.weightShape[0])) + cls_pair_info.layer2.bias.data = cls_pair_info.layer2.bias.data.type(torch.FloatTensor) + + +
[docs]def equalize_model(model: torch.nn.Module, input_shapes: Union[Tuple, List[Tuple]], + dummy_input: Union[torch.Tensor, Tuple] = None): + """ + High-level API to perform Cross-Layer Equalization (CLE) on the given model. The model is equalized in place. + + :param model: Model to equalize + :param input_shapes: Shape of the input (can be a tuple or a list of tuples if multiple inputs) + :param dummy_input: A dummy input to the model. Can be a Tensor or a Tuple of Tensors + :return: None + """ + if dummy_input is None: + # The use of input_shapes will be removed in a future release. It is maintained now for backward compatibility. + # Note, create_rand_tensors_given_shapes() creates all FP32 tensors where as some multi-input models might + # additionally use Integer Tensors. + dummy_input = create_rand_tensors_given_shapes(input_shapes, torch.device('cpu')) + if isinstance(dummy_input, (list, tuple)): + input_shapes = [i.shape for i in dummy_input] + else: + input_shapes = dummy_input.shape + + if isinstance(model, torch.nn.DataParallel): + equalize_model(model.module, input_shapes, dummy_input) + else: + device = get_device(model) + model.cpu() + # fold batchnorm layers + folded_pairs = fold_all_batch_norms(model, input_shapes, dummy_input) + equalize_bn_folded_model(model, input_shapes, folded_pairs, dummy_input=dummy_input) + + model.to(device=device)
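+# Illustrative usage sketch of the equalize_model() API above: the input shape is a placeholder and
+# any FP32 torch.nn.Module can be passed in. The model is modified in place; nothing is returned.
+def _example_equalize_model(model: torch.nn.Module):
+    """ Sketch: apply Cross-Layer Equalization (BN fold + CLS + high-bias fold) in place. """
+    input_shape = (1, 3, 224, 224)                      # placeholder shape for an image model
+    dummy_input = torch.rand(*input_shape)
+    equalize_model(model, input_shapes=input_shape, dummy_input=dummy_input)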
+ +def equalize_bn_folded_model(model: torch.nn.Module, + input_shapes: Union[Tuple, List[Tuple]], + folded_pairs: List[Tuple[torch.nn.Module, torch.nn.BatchNorm2d]], + dummy_input: Union[torch.Tensor, Tuple] = None + ): + """ + Perform Cross-Layer Scaling (CLS) and High Bias Folding (HBF) on a batchnorm-folded model. + The model is equalized in place. + + :param model: Batchnorm-folded model to equalize + :param input_shapes: Shape of the input (can be a tuple or a list of tuples if multiple inputs) + :param dummy_input: Dummy input to the model. Used to parse model graph. User is expected to place the tensors on the appropriate device. + :param folded_pairs: List of pairs of folded layers + :return: None + """ + if isinstance(model, torch.nn.DataParallel): + equalize_bn_folded_model(model.module, input_shapes, folded_pairs, dummy_input=dummy_input) + else: + device = get_device(model) + model.cpu() + bn_dict = {} + for conv_bn in folded_pairs: + bn_dict[conv_bn[0]] = conv_bn[1] + + # replace any ReLU6 layers with ReLU + utils.replace_modules_of_type1_with_type2(model, torch.nn.ReLU6, torch.nn.ReLU) + + # perform cross-layer scaling on applicable layer sets + cls_set_info_list = CrossLayerScaling.scale_model(model, input_shapes, dummy_input=dummy_input) + + # high-bias fold + HighBiasFold.bias_fold(cls_set_info_list, bn_dict) + + model.to(device=device) +
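+# Illustrative sketch of the two-step flow: callers that need the (conv, bn) pairs themselves can
+# fold batchnorms explicitly with fold_all_batch_norms -- the same utility equalize_model() calls
+# internally -- and then invoke equalize_bn_folded_model().
+def _example_equalize_bn_folded_model(model: torch.nn.Module):
+    """ Sketch: explicit BN fold followed by cross-layer scaling and high-bias fold, in place. """
+    input_shape = (1, 3, 224, 224)                              # placeholder shape
+    dummy_input = torch.rand(*input_shape)
+    folded_pairs = fold_all_batch_norms(model, input_shape, dummy_input)    # list of (conv, bn) pairs
+    equalize_bn_folded_model(model, input_shape, folded_pairs, dummy_input=dummy_input)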
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/defs.html b/releases/1.32.2/_modules/aimet_torch/defs.html new file mode 100644 index 00000000..e54250e8 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/defs.html @@ -0,0 +1,1358 @@ + + + + + + aimet_torch.defs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_torch.defs

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2018, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Common type definitions that are used across AIMET """
+
+from enum import Enum
+from typing import List, Optional, Union
+
+import torch.utils.data
+
+from aimet_common.defs import GreedySelectionParameters, TarRankSelectionParameters, RankSelectScheme
+
+
+
[docs]class ModuleCompRatioPair: + """ + Pair of torch.nn.module and a compression-ratio + + :ivar module: Module of type torch.nn.module + :ivar comp_ratio: Compression ratio. Compression ratio is the ratio of cost of compressed model + to cost of the original model. + """ + + def __init__(self, module: torch.nn.Module, comp_ratio: float): + self.module = module + self.comp_ratio = comp_ratio
+ + +class OpToIOTensors: + """ + Data class to store the input and output tensor names of an operation as a lists. + """ + def __init__(self, node_inputs: List[str], node_outputs: List[str]): + """ + :param node_inputs: name of inputs to the node + :param node_outputs: name of output from the node + """ + + self.inputs = node_inputs + self.outputs = node_outputs + + +
[docs]class SpatialSvdParameters: + """ Configuration parameters for spatial svd compression """ + +
[docs] class ManualModeParams: + """ + Configuration parameters for manual-mode spatial svd compression + """ + + def __init__(self, list_of_module_comp_ratio_pairs: List[ModuleCompRatioPair]): + """ + :param list_of_module_comp_ratio_pairs: List of (module, comp-ratio) pairs + """ + self.list_of_module_comp_ratio_pairs = list_of_module_comp_ratio_pairs
+ +
[docs] class AutoModeParams: + """ + Configuration parameters for auto-mode compression + """ + + def __init__(self, greedy_select_params: GreedySelectionParameters, + modules_to_ignore: Optional[List[torch.nn.Module]] = None): + """ + :param greedy_select_params: Params for greedy comp-ratio selection algorithm + :param modules_to_ignore: List of modules to ignore (None indicates nothing to ignore) + """ + self.greedy_params = greedy_select_params + self.modules_to_ignore = [] if modules_to_ignore is None else modules_to_ignore
+ +
[docs] class Mode(Enum): + """ Mode enumeration """ + + manual = 1 + """ Manual mode """ + + auto = 2 + """ Auto mode """
+ + def __init__(self, mode: Mode, params: Union[ManualModeParams, AutoModeParams], multiplicity=1): + """ + :param mode: Either auto mode or manual mode + :param params: Parameters for the mode selected + :param multiplicity: The multiplicity to which ranks/input channels will get rounded. Default: 1 + """ + self.mode = mode + self.mode_params = params + self.multiplicity = multiplicity
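+# Illustrative sketch: composing a manual-mode SpatialSvdParameters from the classes above.
+# `conv_layer` is a placeholder for any torch.nn.Conv2d taken from the user's model; this only
+# builds the configuration object, it does not run compression.
+def _example_spatial_svd_params(conv_layer: torch.nn.Module) -> SpatialSvdParameters:
+    pairs = [ModuleCompRatioPair(conv_layer, 0.5)]      # target compressed cost = 0.5x original cost
+    manual = SpatialSvdParameters.ManualModeParams(pairs)
+    return SpatialSvdParameters(mode=SpatialSvdParameters.Mode.manual, params=manual)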
+ + +
[docs]class ChannelPruningParameters: + """ Configuration parameters for channel pruning compression """ + +
[docs] class ManualModeParams: + """ + Configuration parameters for manual-mode channel pruning compression + """ + + def __init__(self, list_of_module_comp_ratio_pairs: List[ModuleCompRatioPair]): + """ + :param list_of_module_comp_ratio_pairs: List of (module, comp-ratio) pairs + """ + self.list_of_module_comp_ratio_pairs = list_of_module_comp_ratio_pairs
+ +
[docs] class AutoModeParams: + """ + Configuration parameters for auto-mode compression + """ + + def __init__(self, greedy_select_params: GreedySelectionParameters, + modules_to_ignore: Optional[List[torch.nn.Module]] = None): + """ + :param greedy_select_params: Params for greedy comp-ratio selection algorithm + :param modules_to_ignore: List of modules to ignore (None indicates nothing to ignore) + """ + self.greedy_params = greedy_select_params + self.modules_to_ignore = [] if modules_to_ignore is None else modules_to_ignore
+ +
[docs] class Mode(Enum): + """ Mode enumeration """ + + manual = 1 + """ Manual mode: User specifies comp-ratio per layer """ + + auto = 2 + """ Auto mode: AIMET computes optimal comp-ratio per layer """
+ + def __init__(self, data_loader: torch.utils.data.DataLoader, num_reconstruction_samples: int, + allow_custom_downsample_ops: bool, + mode: Mode, params: Union[ManualModeParams, AutoModeParams], multiplicity=1): + self.data_loader = data_loader + self.num_reconstruction_samples = num_reconstruction_samples + self.allow_custom_downsample_ops = allow_custom_downsample_ops + self.mode = mode + self.mode_params = params + self.multiplicity = multiplicity
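+# Illustrative sketch: channel pruning configuration mirrors the spatial-SVD pattern above, but it
+# additionally needs a data loader for layer-output reconstruction. The random TensorDataset below
+# is a stand-in for a subset of the user's real training data; `conv_layer` is a placeholder module.
+def _example_channel_pruning_params(conv_layer: torch.nn.Module) -> ChannelPruningParameters:
+    from torch.utils.data import DataLoader, TensorDataset
+    data_loader = DataLoader(TensorDataset(torch.rand(32, 3, 224, 224)), batch_size=8)
+    pairs = [ModuleCompRatioPair(conv_layer, 0.75)]
+    return ChannelPruningParameters(data_loader=data_loader,
+                                    num_reconstruction_samples=16,
+                                    allow_custom_downsample_ops=False,
+                                    mode=ChannelPruningParameters.Mode.manual,
+                                    params=ChannelPruningParameters.ManualModeParams(pairs))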
+ + +
[docs]class WeightSvdParameters: + """ Configuration parameters for weight svd compression """ + +
[docs] class ManualModeParams: + """ + Configuration parameters for manual-mode weight svd compression + """ + + def __init__(self, list_of_module_comp_ratio_pairs: List[ModuleCompRatioPair]): + """ + :param list_of_module_comp_ratio_pairs: List of (module, comp-ratio) pairs + """ + self.list_of_module_comp_ratio_pairs = list_of_module_comp_ratio_pairs
+ +
[docs] class AutoModeParams: + """ + Configuration parameters for auto-mode compression + """ + + def __init__(self, + rank_select_scheme: RankSelectScheme, + select_params: Union[GreedySelectionParameters, + TarRankSelectionParameters], + modules_to_ignore: Optional[List[torch.nn.Module]] = None): + """ + :param rank_select_scheme: supports two options greedy and tar + :param select_params: Params for greedy/TAR comp-ratio selection algorithm + :param modules_to_ignore: List of modules to ignore (None indicates nothing to ignore) + """ + self.rank_select_scheme = rank_select_scheme + self.select_params = select_params + self.modules_to_ignore = [] if modules_to_ignore is None else modules_to_ignore
+ +
[docs] class Mode(Enum): + """ Mode enumeration """ + + manual = 1 + """ Manual mode """ + + auto = 2 + """ Auto mode """
+ + def __init__(self, mode: Mode, params: Union[ManualModeParams, AutoModeParams], multiplicity=1): + """ + :param mode: Either auto mode or manual mode + :param params: Parameters for the mode selected + :param multiplicity: The multiplicity to which ranks/input channels will get rounded. Default: 1 + """ + self.mode = mode + self.mode_params = params + self.multiplicity = multiplicity
+ + +class PassThroughOp(torch.nn.Module): + """ + This is a pass-through op, used for purpose of making an op a no-op + """ + # pylint:disable=arguments-differ + @staticmethod + def forward(inputx): + """ + Forward pass for passthrough op + """ + return inputx +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/layer_output_utils.html b/releases/1.32.2/_modules/aimet_torch/layer_output_utils.html new file mode 100644 index 00000000..901b17e3 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/layer_output_utils.html @@ -0,0 +1,1521 @@ + + + + + + aimet_torch.layer_output_utils — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +

Source code for aimet_torch.layer_output_utils

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" This module contains utilities to capture and save intermediate layer-outputs of a model. """
+
+import os
+from typing import Union, Dict, List, Tuple
+from enum import Enum
+import shutil
+import re
+
+import numpy as np
+import onnx
+import torch
+
+from aimet_common.utils import AimetLogger
+from aimet_common.layer_output_utils import SaveInputOutput, save_layer_output_names
+
+from aimet_torch.quantsim import ExportableQuantModule, QuantizationSimModel
+from aimet_torch import utils
+from aimet_torch import torchscript_utils
+from aimet_torch.onnx_utils import OnnxSaver, OnnxExportApiArgs
+from aimet_torch.qc_quantize_recurrent import QcQuantizeRecurrent
+from aimet_torch.v2.nn.base import BaseQuantizationMixin
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.LayerOutputs)
+
+
+
[docs]class NamingScheme(Enum): + """ Enumeration of layer-output naming schemes. """ + + PYTORCH = 1 + """ Names outputs according to exported pytorch model. Layer names are used. """ + ONNX = 2 + """ Names outputs according to exported onnx model. Layer output names are generally numeric. """ + TORCHSCRIPT = 3 + """ Names outputs according to exported torchscript model. Layer output names are generally numeric. """
+ + +
[docs]class LayerOutputUtil: + """ Implementation to capture and save outputs of intermediate layers of a model (fp32/quantsim). """ + + def __init__(self, model: torch.nn.Module, dir_path: str, naming_scheme: NamingScheme = NamingScheme.PYTORCH, + dummy_input: Union[torch.Tensor, Tuple, List] = None, onnx_export_args: Union[OnnxExportApiArgs, Dict] = None): + """ + Constructor for LayerOutputUtil. + + :param model: Model whose layer-outputs are needed. + :param dir_path: Directory wherein layer-outputs will be saved. + :param naming_scheme: Naming scheme to be followed to name layer-outputs. There are multiple schemes as per + the exported model (pytorch, onnx or torchscript). Refer the NamingScheme enum definition. + :param dummy_input: Dummy input to model. Required if naming_scheme is 'NamingScheme.ONNX' or 'NamingScheme.TORCHSCRIPT'. + :param onnx_export_args: Should be same as that passed to quantsim export API to have consistency between + layer-output names present in exported onnx model and generated layer-outputs. Required if naming_scheme is + 'NamingScheme.ONNX'. + """ + + # Utility to capture layer-outputs + self.layer_output = LayerOutput(model=model, naming_scheme=naming_scheme, dir_path=dir_path, dummy_input=dummy_input, + onnx_export_args=onnx_export_args) + + # Utility to save model inputs and their corresponding layer-outputs + self.save_input_output = SaveInputOutput(dir_path=dir_path, axis_layout='NCHW') + +
[docs] def generate_layer_outputs(self, input_batch: Union[torch.Tensor, List[torch.Tensor], Tuple[torch.Tensor]]): + """ + This method captures output of every layer of a model & saves the inputs and corresponding layer-outputs to disk. + + :param input_batch: Batch of inputs for which we want to obtain layer-outputs. + :return: None + """ + + input_instance_count = len(input_batch) if isinstance(input_batch, torch.Tensor) else len(input_batch[0]) + logger.info("Generating layer-outputs for %d input instances", input_instance_count) + + # Obtain layer-output name to output dictionary + layer_output_batch_dict = self.layer_output.get_outputs(input_batch) + + # Place inputs and layer-outputs on CPU + input_batch = LayerOutputUtil._get_input_batch_in_numpy(input_batch) + layer_output_batch_dict = LayerOutputUtil._get_layer_output_batch_in_numpy(layer_output_batch_dict) + + # Save inputs and layer-outputs + self.save_input_output.save(input_batch, layer_output_batch_dict) + + logger.info('Successfully generated layer-outputs for %d input instances', input_instance_count)
+ + @staticmethod + def _get_input_batch_in_numpy(input_batch: Union[torch.Tensor, List[torch.Tensor], Tuple[torch.Tensor]]) -> \ + Union[np.ndarray, List[np.ndarray], Tuple[np.ndarray]]: + """ + Coverts the torch tensors into numpy arrays + :param input_batch: input batch with torch tensors + :return: input batch with numpy arrays + """ + if isinstance(input_batch, (List, Tuple)): + numpy_input_batch = [] + for ith_input in input_batch: + numpy_input_batch.append(ith_input.cpu().numpy()) + return numpy_input_batch + return input_batch.cpu().numpy() + + @staticmethod + def _get_layer_output_batch_in_numpy(layer_output_dict: Dict[str, torch.Tensor]) -> Dict[str, np.ndarray]: + """ + Converts the torch tensors into numpy arrays + :param layer_output_dict: layer output dictionary with torch tensors + :return: layer output dictionary with numpy arrays + """ + layer_output_numpy_dict = {} + for output_name, output_tensor in layer_output_dict.items(): + layer_output_numpy_dict[output_name] = output_tensor.cpu().numpy() + return layer_output_numpy_dict
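+# Illustrative sketch of the capture flow using LayerOutputUtil above. `model` may be an FP32 model
+# or a QuantizationSimModel's .model, and `data_loader` is a placeholder for any iterable of input
+# batches. The default NamingScheme.PYTORCH is used; the ONNX/TORCHSCRIPT schemes additionally need
+# dummy_input (and onnx_export_args for ONNX), as described in the constructor docstring.
+def _example_generate_layer_outputs(model: torch.nn.Module, data_loader, dir_path: str = './layer_outputs'):
+    util = LayerOutputUtil(model=model, dir_path=dir_path)
+    for input_batch in data_loader:
+        util.generate_layer_outputs(input_batch)     # saves inputs and per-layer outputs under dir_path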
+ + +class LayerOutput: + """ + This class creates a layer-output name to layer-output dictionary. The layer-output names are as per the AIMET exported + pytorch/onnx/torchscript model. + """ + def __init__(self, model: torch.nn.Module, dir_path: str, naming_scheme: NamingScheme = NamingScheme.PYTORCH, + dummy_input: Union[torch.Tensor, Tuple, List] = None, onnx_export_args: Union[OnnxExportApiArgs, Dict] = None): + """ + Constructor - It initializes few dictionaries that are required for capturing and naming layer-outputs. + + :param model: Model whose layer-outputs are needed. + :param dir_path: Directory wherein layer-output names arranged in topological order will be saved. It will also + be used to temporarily save onnx/torchscript equivalent of the given model. + :param naming_scheme: Naming scheme to be followed to name layer-outputs. There are multiple schemes as per + the exported model (pytorch, onnx or torchscript). Refer the NamingScheme enum definition. + :param dummy_input: Dummy input to model (required if naming_scheme is 'onnx'). + :param onnx_export_args: Should be same as that passed to quantsim export API to have consistency between + layer-output names present in exported onnx model and generated layer-outputs (required if naming_scheme is + 'onnx'). + """ + self.model = model + self.module_to_name_dict = utils.get_module_to_name_dict(model=model, prefix='') + + # Check whether the given model is quantsim model + self.is_quantsim_model = any(isinstance(module, (ExportableQuantModule, QcQuantizeRecurrent)) for module in model.modules()) + + # Obtain layer-name to layer-output name mapping + self.layer_name_to_layer_output_dict = {} + self.layer_name_to_layer_output_name_dict = {} + if naming_scheme == NamingScheme.PYTORCH: + for name, module in model.named_modules(): + if utils.is_leaf_module(module) or isinstance(module, BaseQuantizationMixin): + name = name.replace('._module_to_wrap', '') + self.layer_name_to_layer_output_name_dict[name] = name + else: + self.layer_name_to_layer_output_name_dict = LayerOutput.get_layer_name_to_layer_output_name_map( + self.model, naming_scheme, dummy_input, onnx_export_args, dir_path) + + # Replace any delimiter in layer-output name string with underscore + for layer_name, output_name in self.layer_name_to_layer_output_name_dict.items(): + self.layer_name_to_layer_output_name_dict[layer_name] = re.sub(r'\W+', "_", output_name) + + # Save layer-output names which are in topological order of model graph. This order can be used while comparing layer-outputs. + layer_output_names = list(self.layer_name_to_layer_output_name_dict.values()) + save_layer_output_names(layer_output_names, dir_path) + + def get_outputs(self, input_batch: Union[torch.Tensor, List[torch.Tensor], Tuple[torch.Tensor]]) -> Dict[str, torch.Tensor]: + """ + This function captures layer-outputs and renames them as per the AIMET exported pytorch/onnx/torchscript model. + + :param input_batch: Batch of inputs for which we want to obtain layer-outputs. 
+ :return: layer-name to layer-output batch dict + """ + + # Fetch outputs of all the layers + self.layer_name_to_layer_output_dict = {} + if self.is_quantsim_model: + # Apply record-output hook to QuantizeWrapper modules (one node above leaf node in model graph) + utils.run_hook_for_layers_with_given_input(self.model, input_batch, self.record_outputs, + module_type_for_attaching_hook=(ExportableQuantModule, QcQuantizeRecurrent), + leaf_node_only=False) + else: + # Apply record-output hook to Original modules (leaf node in model graph) + utils.run_hook_for_layers_with_given_input(self.model, input_batch, self.record_outputs, leaf_node_only=True) + + # Rename outputs according to pytorch/onnx/torchscript model + layer_output_name_to_layer_output_dict = LayerOutput.rename_layer_outputs(self.layer_name_to_layer_output_dict, + self.layer_name_to_layer_output_name_dict) + + return layer_output_name_to_layer_output_dict + + def record_outputs(self, module: torch.nn.Module, _, output: torch.Tensor): + """ + Hook function to capture output of a layer. + + :param module: Layer-module in consideration. + :param _: Placeholder for the input of the layer-module. + :param output: Output of the layer-module. + :return: None + """ + layer_name = self.module_to_name_dict[module] + if isinstance(output, torch.Tensor): + self.layer_name_to_layer_output_dict[layer_name] = output.clone() + else: + logger.info("Skipping constant scalar output of layer %s", layer_name) + + @staticmethod + def rename_layer_outputs(layer_name_to_layer_output_dict: Dict[str, torch.Tensor], + layer_name_to_layer_output_name_dict: Dict[str, str]) -> Dict[str, torch.Tensor]: + """ + Rename layer-outputs based on the layer-name to layer-output name map + + :param layer_name_to_layer_output_dict: Dict containing layer-outputs + :param layer_name_to_layer_output_name_dict: Dict containing layer-output names + :return: layer_output_name_to_layer_output_dict + """ + layer_names = list(layer_name_to_layer_output_dict.keys()) + + for layer_name in layer_names: + if layer_name in layer_name_to_layer_output_name_dict: + # Rename the layer-output by using layer-output name, instead of layer-name + layer_output_name = layer_name_to_layer_output_name_dict[layer_name] + layer_name_to_layer_output_dict[layer_output_name] = layer_name_to_layer_output_dict.pop(layer_name) + else: + # Delete the layer-output as it doesn't have a name + layer_name_to_layer_output_dict.pop(layer_name) + + return layer_name_to_layer_output_dict + + @staticmethod + def get_layer_name_to_layer_output_name_map(model, naming_scheme: NamingScheme, dummy_input: Union[torch.Tensor, Tuple, List], + onnx_export_args: Union[OnnxExportApiArgs, Dict], dir_path: str) -> Dict[str, str]: + """ + This function produces layer-name to layer-output name map w.r.t the AIMET exported onnx/torchscript model. If a + layer gets expanded into multiple layers in the exported model then the intermediate layers are ignored and + output-name of last layer is used. + + :param model: model + :param naming_scheme: onnx/torchscript + :param dummy_input: dummy input that is used to construct onnx/torchscript model + :param onnx_export_args: OnnxExportApiArgs instance same as that passed to quantsim export API + :param dir_path: directory to temporarily save the constructed onnx/torchscrip model + :return: dictionary of layer-name to layer-output name + """ + + # Restore original model by removing quantization wrappers if present. 
+ original_model = QuantizationSimModel.get_original_model(model) + + # Set path to store exported onnx/torchscript model. + LayerOutput._validate_dir_path(dir_path) + exported_model_dir = os.path.join(dir_path, 'exported_models') + os.makedirs(exported_model_dir, exist_ok=True) + + # Get node to i/o tensor name map from the onnx/torchscript model + if naming_scheme == NamingScheme.ONNX: + exported_model_node_to_io_tensor_map = LayerOutput.get_onnx_node_to_io_tensor_map( + original_model, exported_model_dir, dummy_input, onnx_export_args) + else: + exported_model_node_to_io_tensor_map = LayerOutput.get_torchscript_node_to_io_tensor_map( + original_model, exported_model_dir, dummy_input) + + layer_names_list = [name for name, module in original_model.named_modules() if utils.is_leaf_module(module)] + layers_missing_in_exported_model = [] + layer_name_to_layer_output_name_map = {} + + # Get mapping between layer names and layer-output names. + logger.info("Layer Name to Layer Output-name Mapping") + # pylint: disable=protected-access + for layer_name in layer_names_list: + if layer_name in exported_model_node_to_io_tensor_map: + # pylint: disable=protected-access, unused-variable + layer_output_names, intermediate_layer_output_names = QuantizationSimModel._get_layer_activation_tensors( + layer_name, exported_model_node_to_io_tensor_map) + layer_name_to_layer_output_name_map[layer_name] = layer_output_names[0] + logger.info("%s -> %s", layer_name, layer_output_names[0]) + else: + layers_missing_in_exported_model.append(layer_name) + + if layers_missing_in_exported_model: + logger.warning("The following layers were not found in the exported model:\n" + "%s\n" + "This can be due to below reason:\n" + "\t- The layer was not seen while exporting using the dummy input provided in sim.export(). " + "Ensure that the dummy input covers all layers.", + layers_missing_in_exported_model) + + # Delete onnx/torchscript models + shutil.rmtree(exported_model_dir, ignore_errors=False, onerror=None) + + return layer_name_to_layer_output_name_map + + @staticmethod + def get_onnx_node_to_io_tensor_map(model: torch.nn.Module, exported_model_dir: str, dummy_input: Union[torch.Tensor, Tuple, List], + onnx_export_args: Union[OnnxExportApiArgs, Dict]) -> Dict[str, Dict]: + """ + This function constructs an onnx model equivalent to the give pytorch model and then generates node-name to i/o + tensor-name map. + :param model: pytorch model without quantization wrappers + :param exported_model_dir: directory to save onnx model + :param dummy_input: dummy input to be used for constructing onnx model + :param onnx_export_args: configurations to generate onnx model + :return: onnx_node_to_io_tensor_map + """ + LayerOutput._validate_dummy_input(dummy_input) + LayerOutput._validate_onnx_export_args(onnx_export_args) + + onnx_path = os.path.join(exported_model_dir, 'model.onnx') + + OnnxSaver.create_onnx_model_with_pytorch_layer_names(onnx_model_path=onnx_path, pytorch_model=model, + dummy_input=dummy_input, onnx_export_args=onnx_export_args) + onnx_model = onnx.load(onnx_path) + onnx_node_to_io_tensor_map, _ = OnnxSaver.get_onnx_node_to_io_tensor_names_map(onnx_model) + + return onnx_node_to_io_tensor_map + + @staticmethod + def get_torchscript_node_to_io_tensor_map(model: torch.nn.Module, exported_model_dir: str, + dummy_input: Union[torch.Tensor, Tuple, List]) -> Dict[str, Dict]: + """ + This function constructs a torchscript model equivalent to the give pytorch model and then generates node-name to i/o + tensor-name map. 
+ :param model: pytorch model without quantization wrappers + :param exported_model_dir: directory to save onnx model + :param dummy_input: dummy input to be used for constructing onnx model + :return: torchscript_node_to_io_tensor_map + """ + LayerOutput._validate_dummy_input(dummy_input) + + ts_path = os.path.join(exported_model_dir, 'model.torchscript.pth') + + with utils.in_eval_mode(model), torch.no_grad(): + torchscript_utils.create_torch_script_model(ts_path, model, dummy_input) + trace = torch.jit.load(ts_path) + torch_script_node_to_io_tensor_map, _ = \ + torchscript_utils.get_node_to_io_tensor_names_map(model, trace, dummy_input) + + return torch_script_node_to_io_tensor_map + + @staticmethod + def _validate_dir_path(dir_path: str): + """ + Validate directory path in which onnx/torchscript models will be temporarily saved + :param dir_path: directory path + :return: + """ + if dir_path is None: + raise ValueError("Missing directory path to save onnx/torchscript models") + + @staticmethod + def _validate_dummy_input(dummy_input: Union[torch.Tensor, Tuple, List]): + """ + Validates dummy input which is used to generate onnx/torchscript model + :param dummy_input: single input instance + :return: + """ + if not isinstance(dummy_input, (torch.Tensor, tuple, list)): + raise ValueError("Invalid dummy_input data-type") + + @staticmethod + def _validate_onnx_export_args(onnx_export_args: Union[OnnxExportApiArgs, Dict]): + """ + Validates export arguments which are used to generate an onnx model + :param onnx_export_args: export arguments + :return: + """ + if onnx_export_args is None: + onnx_export_args = OnnxExportApiArgs() + if not isinstance(onnx_export_args, (OnnxExportApiArgs, dict)): + raise ValueError("Invalid onnx_export_args data-type") +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/model_preparer.html b/releases/1.32.2/_modules/aimet_torch/model_preparer.html new file mode 100644 index 00000000..e4347abc --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/model_preparer.html @@ -0,0 +1,1904 @@ + + + + + + aimet_torch.model_preparer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_torch.model_preparer

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2021-2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  From PyTorch:
+#
+#  Copyright (c) 2016-     Facebook, Inc            (Adam Paszke)
+#  Copyright (c) 2014-     Facebook, Inc            (Soumith Chintala)
+#  Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
+#  Copyright (c) 2012-2014 Deepmind Technologies    (Koray Kavukcuoglu)
+#  Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
+#  Copyright (c) 2011-2013 NYU                      (Clement Farabet)
+#  Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
+#  Copyright (c) 2006      Idiap Research Institute (Samy Bengio)
+#  Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
+#
+#  From Caffe2:
+#
+#  Copyright (c) 2016-present, Facebook Inc. All rights reserved.
+#
+#  All contributions by Facebook:
+#  Copyright (c) 2016 Facebook Inc.
+#
+#  All contributions by Google:
+#  Copyright (c) 2015 Google Inc.
+#  All rights reserved.
+#
+#  All contributions by Yangqing Jia:
+#  Copyright (c) 2015 Yangqing Jia
+#  All rights reserved.
+#
+#  All contributions by Kakao Brain:
+#  Copyright 2019-2020 Kakao Brain
+#
+#  All contributions by Cruise LLC:
+#  Copyright (c) 2022 Cruise LLC.
+#  All rights reserved.
+#
+#  All contributions from Caffe:
+#  Copyright(c) 2013, 2014, 2015, the respective contributors
+#  All rights reserved.
+#
+#  All other contributions:
+#  Copyright(c) 2015, 2016 the respective contributors
+#  All rights reserved.
+#
+#  Caffe2 uses a copyright model similar to Caffe: each contributor holds
+#  copyright over their contributions to Caffe2. The project versioning records
+#  all such contribution and copyright details. If a contributor wants to further
+#  mark their specific copyright on a particular contribution, they should
+#  indicate their copyright solely in the commit message of the change when it is
+#  committed.
+#
+#  All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright
+#     notice, this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright
+#     notice, this list of conditions and the following disclaimer in the
+#     documentation and/or other materials provided with the distribution.
+#
+#  3. Neither the names of Facebook, Deepmind Technologies, NYU, NEC Laboratories America
+#     and IDIAP Research Institute nor the names of its contributors may be
+#     used to endorse or promote products derived from this software without
+#     specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Implementation to automatically prepare pytorch models for AIMET features """
+
+# --------------------------------------------------------------------------------------------------------
+# Reference : https://github.com/pytorch/pytorch/blob/main/torch/fx/proxy.py#L26
+#             https://github.com/pytorch/pytorch/blob/main/torch/fx/proxy.py#L57
+
+# Above PyTorch code is used to get node_name_to_scope information by overriding call_module and create_node methods
+# of torch.fx.Tracer base class:
+# TODO: node_name_to_scope should be removed and instead use node.meta[] after upgrading to torch 2.0
+# ----------------------------------------------------------------------------------------------------------
+
+import copy
+import re
+from typing import Any, Optional, Dict, Union, List, Callable, Tuple
+import torch
+import torch.fx
+from aimet_common.utils import AimetLogger
+from aimet_torch.utils import in_eval_mode
+from aimet_torch.utils import replace_modules_of_type1_with_type2
+import aimet_torch.elementwise_ops as elementwise_ops
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.ModelPreparer)
+
+# this is a map of torch.nn.functional type to corresponding module type
+functional_op_to_module_map = {
+    torch.nn.functional.relu: torch.nn.ReLU,
+    torch.nn.functional.gelu: torch.nn.GELU
+}
+
+# In this functional --> module map, the corresponding module is of type torch.nn and stateful.
+functional_with_stateful_api = {
+    'relu'          : torch.nn.ReLU,
+    'relu6'         : torch.nn.ReLU6,
+    'hardtanh'      : torch.nn.Hardtanh,
+    'hardswish'     : torch.nn.Hardswish,
+    'elu'           : torch.nn.ELU,
+    'selu'          : torch.nn.SELU,
+    'celu'          : torch.nn.CELU,
+    'leaky_relu'    : torch.nn.LeakyReLU,
+    'prelu'         : torch.nn.PReLU,
+    'rrelu'         : torch.nn.RReLU,
+    'glu'           : torch.nn.GLU,
+    'gelu'          : torch.nn.GELU,
+    'logsigmoid'    : torch.nn.LogSigmoid,
+    'hardshrink'    : torch.nn.Hardshrink,
+    'tanhshrink'    : torch.nn.Tanhshrink,
+    'softsign'      : torch.nn.Softsign,
+    'softplus'      : torch.nn.Softplus,
+    'softmin'       : torch.nn.Softmin,
+    'softmax'       : torch.nn.Softmax,
+    'softshrink'    : torch.nn.Softshrink,
+    'log_softmax'   : torch.nn.LogSoftmax,
+    'tanh'          : torch.nn.Tanh,
+    'sigmoid'       : torch.nn.Sigmoid,
+    'hardsigmoid'   : torch.nn.Hardsigmoid,
+    'silu'          : torch.nn.SiLU,
+}
+
+
+# Functions that require special transformation.
+functional_with_special_handling = {
+    'cat'           : elementwise_ops.Concat,
+    'conv2d'        : torch.nn.Conv2d
+}
+
+# In this functional --> module map, corresponding custom module is of type torch.nn and uses stateless API.
+functional_with_stateless_api = {
+    '_pad'                      : elementwise_ops.Pad,
+    'pad'                      : elementwise_ops.Pad,
+    'sum'                       : elementwise_ops.Sum,
+    'add'                       : elementwise_ops.Add,
+    'subtract'                  : elementwise_ops.Subtract,
+    'sub'                       : elementwise_ops.Subtract,
+    'mul'                       : elementwise_ops.Multiply,
+    'div'                       : elementwise_ops.Divide,
+    'truediv'                   : elementwise_ops.Divide,
+    'floordiv'                  : elementwise_ops.FloorDivide,
+    'matmul'                    : elementwise_ops.MatMul,
+    'exp'                       : elementwise_ops.Exponential,
+    'interpolate'               : elementwise_ops.Interpolate,
+    'max_pool2d'                : elementwise_ops.MaxPool2d,
+    'max_pool2d_with_indices'   : elementwise_ops.MaxPool2d,
+    'adaptive_avg_pool2d'       : elementwise_ops.AdaptiveAvgPool2d,
+    'avg_pool2d'                : elementwise_ops.AvgPool2d,
+    'norm'                      : elementwise_ops.Norm,
+    'batch_norm'                : elementwise_ops.BatchNorm,
+    'group_norm'                : elementwise_ops.GroupNorm,
+    'mean'                      : elementwise_ops.Mean,
+    'pow'                       : elementwise_ops.Pow,
+    'where'                     : elementwise_ops.Where,
+    'addmm'                     : elementwise_ops.Addmm,
+    'bmm'                       : elementwise_ops.Bmm,
+    'baddbmm'                   : elementwise_ops.Baddbmm,
+    'cumsum'                    : elementwise_ops.CumSum,
+    'masked_fill'               : elementwise_ops.MaskedFill,
+    'square'                    : elementwise_ops.Square,
+    'rsqrt'                     : elementwise_ops.RSqrt,
+}
+
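+# Illustrative sketch: the maps above drive the functional-to-module rewrite. A forward() that calls
+# torch.nn.functional.relu, as in the toy model below, ends up with an equivalent torch.nn.ReLU
+# submodule after preparation (see functional_with_stateful_api), which AIMET can then wrap with
+# quantizers. prepare_model is assumed to be this module's public entry point, defined further down.
+def _example_functional_model() -> torch.nn.Module:
+    import torch.nn.functional as F
+
+    class TinyNet(torch.nn.Module):
+        def __init__(self):
+            super().__init__()
+            self.conv = torch.nn.Conv2d(3, 8, kernel_size=3)
+
+        def forward(self, x):
+            return F.relu(self.conv(x))     # functional call: invisible to module-based quantization
+
+    model = TinyNet()
+    # prepared = prepare_model(model)       # assumed entry point; would replace F.relu with torch.nn.ReLU
+    return model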
+
+class Scope:
+    """
+    Code adapted from: https://github.com/pytorch/pytorch/blob/main/torch/fx/proxy.py#L26
+
+    Scope object that records the module path and the module type of module.
+    Scope is used to track the information of the module that contains a Node
+    in a Graph of GraphModule.
+    """
+    def __init__(self, module_path: str, module_type: Any):
+        super().__init__()
+        self.module_path = module_path
+        self.module_type = module_type
+
+
+class ScopeContextManager:
+    """
+    Code adapted from: https://github.com/pytorch/pytorch/blob/main/torch/fx/proxy.py#L57
+
+    A context manager to track the Scope of Node during symbolic tracing.
+    When entering a forward function of a Module, we'll update the scope information of
+    the current module, and when we exit, we'll restore the previous scope information.
+    """
+    def __init__(self, scope: Scope, current_scope: Scope):
+        super().__init__()
+        # Keep a copy of prev scope.
+        self._prev_scope = copy.copy(scope)
+        # Update scope to current scope
+        scope.module_path = current_scope.module_path
+        scope.module_type = current_scope.module_type
+        # Save a reference so, we can restore tracer.scope with prev scope on exit.
+        self._scope = scope
+
+    def __enter__(self):
+        return
+
+    def __exit__(self, *args):
+        self._scope.module_path = self._prev_scope.module_path
+        self._scope.module_type = self._prev_scope.module_type
+
+
+def conv2d_create_node(traced_model: torch.fx.GraphModule, module_name: str, node: torch.fx.node) \
+        -> torch.fx.node:
+    """
+    Create the node to be inserted in the graph model.
+
+    :param traced_model: Symbolically traced model
+    :param module_name: Qualified module name in symbolic_traced_model hierarchy corresponding to new node
+    :param node: Current node in the graph after which new node will be inserted
+    :return: torch.fx.node to be inserted in the graph
+    """
+
+    n_args = len(node.args)
+    # input tensors must be passed as args, not kwargs for QcQuantizeWrapper
+    input_tensor = []
+    # input and weight are guaranteed to exist, but bias can be None
+    # Since None cannot be passed as args in QcQuantizeWrapper, do not add it to input_tensor
+    for index, key in [[0, 'input'], [1, 'weight'], [2, 'bias']]:
+        value = None
+        if n_args > index:
+            value = node.args[index]
+        elif key in node.kwargs:
+            value = node.kwargs[key]
+
+        if value is not None:
+            input_tensor.append(value)
+        else:
+            break
+
+    with traced_model.graph.inserting_after(node):
+        if check_dynamic_conv2d(traced_model, module_name):
+            new_node = traced_model.graph.call_module(module_name, args=tuple(input_tensor))
+        else:
+            new_node = traced_model.graph.call_module(module_name, args=tuple([input_tensor[0]]))
+        return new_node
+
+
+def check_dynamic_conv2d(traced_model: torch.fx.GraphModule, module_name: str) -> bool:
+    """
+    Return True if the module named by module_name is an elementwise_ops.DynamicConv2d.
+    """
+    m = traced_model
+    for name in module_name.split('.'):
+        m = getattr(m, name)
+
+    return isinstance(m, elementwise_ops.DynamicConv2d)
+
+
+def conv2d_create_module(node: torch.fx.node) -> torch.nn.Module:
+    """
+    Create the replacement module for a functional conv2d node.
+
+    :param node: Functional conv2d node in the graph that is being replaced
+    :return: New module.
+    """
+
+    # Get weight and bias from argument
+    params = merge_args_and_kwargs(node, {1: 'weight', 2: 'bias'})
+
+    # Convert F.conv2d arguments to nn.Conv2d arguments
+    kwargs = merge_args_and_kwargs(node, {3: 'stride', 4: 'padding', 5: 'dilation', 6: 'groups'})
+
+    # If weight or bias is from activation of another layer, use dynamic_conv2d
+    use_dynamic_conv2d = False
+    for param in params.values():
+        if param.op != 'get_attr':
+            use_dynamic_conv2d = True
+            break
+
+    if use_dynamic_conv2d:
+        module = elementwise_ops.DynamicConv2d(**kwargs)
+    else:
+        for key, param_node in params.items():
+            params[key] = get_node_attr(param_node)
+
+        # Fetch additional info from the weight parameter; its shape is
+        # (out_channels, in_channels // groups, kernel_height, kernel_width)
+        out_channels, in_channels, kernel_size, _ = params['weight'].shape
+        bias = 'bias' in params
+
+        # For depthwise/grouped conv, multiply by groups to recover the true in_channels;
+        # if groups is not passed as an arg, use its default value of 1
+        kwargs['in_channels'] = in_channels * kwargs.get('groups', 1)
+        kwargs['out_channels'] = out_channels
+        kwargs['kernel_size'] = kernel_size
+        kwargs['bias'] = bias
+
+        module = torch.nn.Conv2d(**kwargs)
+        # Replace nn.Conv2d params using F.conv2d arguments
+        module.weight = torch.nn.Parameter(params['weight'])
+        if bias:
+            module.bias = torch.nn.Parameter(params['bias'])
+    return module
+
+
+def merge_args_and_kwargs(node: torch.fx.node, arguments_to_fetch: Dict) -> Dict:
+    """
+    Merge args and kwargs into a single kwargs and return it
+    :param node: node to fetch args and kwargs from
+    :param arguments_to_fetch: dictionary containing arguments' indices in args and keys in kwargs
+    :return: single merged kwargs
+    """
+    n_args = len(node.args)
+    kwargs = {}
+    for index, key in arguments_to_fetch.items():
+        value = None
+        if n_args > index:
+            value = node.args[index]
+        elif key in node.kwargs:
+            value = node.kwargs[key]
+
+        if value is not None:
+            kwargs[key] = value
+    return kwargs
+
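+
+# --- Illustrative sketch (not part of the original module): what
+# --- merge_args_and_kwargs returns for a traced F.conv2d call. The tiny
+# --- module and helper name below are hypothetical.
+def _example_merge_args_and_kwargs():
+    """ Trace a functional conv2d call and merge its stride argument into kwargs. """
+    class TinyConv(torch.nn.Module):
+        def forward(self, x, weight):
+            return torch.nn.functional.conv2d(x, weight, stride=2)
+
+    traced = torch.fx.symbolic_trace(TinyConv())
+    conv_node = next(n for n in traced.graph.nodes if n.op == 'call_function')
+    # Positional args hold (x, weight); 'stride' arrives as a kwarg and 'padding' is absent
+    merged = merge_args_and_kwargs(conv_node, {3: 'stride', 4: 'padding'})
+    assert merged == {'stride': 2}
+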
+
+def get_node_attr(node: torch.fx.node):
+    """
+    Code adapted from https://pytorch.org/docs/stable/fx.html#the-interpreter-pattern
+
+    :param node: node to fetch data from
+    :return: value returned from node
+    """
+    def fetch_attr(target: str):
+        target_atoms = target.split('.')
+        attr_itr = node.graph.owning_module
+        for i, atom in enumerate(target_atoms):
+            if not hasattr(attr_itr, atom):
+                raise RuntimeError(f"Node referenced nonexistant target {'.'.join(target_atoms[:i])}")
+            attr_itr = getattr(attr_itr, atom)
+        return attr_itr
+
+    assert node.op == 'get_attr'
+
+    return fetch_attr(node.target)
+
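+
+# --- Illustrative sketch (not part of the original module): fetching the tensor
+# --- behind a 'get_attr' node. The module and helper name below are hypothetical.
+def _example_get_node_attr():
+    """ Trace a module whose forward uses a bare parameter and resolve its get_attr node. """
+    class WeightHolder(torch.nn.Module):
+        def __init__(self):
+            super().__init__()
+            self.weight = torch.nn.Parameter(torch.randn(4, 4))
+
+        def forward(self, x):
+            return torch.nn.functional.linear(x, self.weight)
+
+    traced = torch.fx.symbolic_trace(WeightHolder())
+    attr_node = next(n for n in traced.graph.nodes if n.op == 'get_attr')
+    weight = get_node_attr(attr_node)
+    assert weight.shape == (4, 4)
+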
+
+def concat_create_node(traced_model: torch.fx.GraphModule, module_name: str, node: torch.fx.node) \
+        -> torch.fx.node:
+    """
+    Create the node to be inserted in the graph model.
+
+    :param traced_model: Symbolically traced model
+    :param module_name: Qualified module name in symbolic_traced_model hierarchy corresponding to new node
+    :param node: Current node in the graph after which new node will be inserted
+    :return: torch.fx.node to be inserted in the graph
+    """
+
+    with traced_model.graph.inserting_after(node):
+        # call_module only accepts a tuple as args, but node.args[0] can be a list; convert it to a tuple.
+        # If node.args[0] is already a tuple, tuple() is a no-op.
+        new_node = traced_model.graph.call_module(module_name, args=tuple(node.args[0]))
+        return new_node
+
+
+def concat_create_module(node: torch.fx.node) -> torch.nn.Module:
+    """
+    Create the replacement module for a functional concat (torch.cat) node.
+
+    :param node: Functional concat node in the graph that is being replaced
+    :return: New module.
+    """
+
+    num_args = len(node.args)
+    if num_args == 1 and 'dim' not in node.kwargs:
+        # Handle torch.cat being called with default parameter dim
+        kwargs = node.kwargs
+        module = elementwise_ops.Concat()
+    else:
+        axis = node.args[1] if num_args > 1 else node.kwargs['dim']
+        module = elementwise_ops.Concat(axis)
+        kwargs = {'axis': axis}
+
+    for key, value in kwargs.items():
+        setattr(module, key, value)
+
+    return module
+
+special_handler_functions = {
+    # Special handling functions for creating node and module
+    'cat': {'node_fn': concat_create_node, 'module_fn': concat_create_module},
+    'conv2d': {'node_fn': conv2d_create_node, 'module_fn': conv2d_create_module}
+}
+
+
+
[docs]def prepare_model(model: torch.nn.Module, + modules_to_exclude: List[torch.nn.Module] = None, + module_classes_to_exclude: List[Callable] = None, + concrete_args: Optional[Dict[str, Any]] = None) -> torch.fx.GraphModule: + """ + Prepare and modify the pytorch model for AIMET features using torch.FX symbolic tracing API. + + 1. Replace torch.nn.functional by module of type torch.nn.Module + 2. Create new independent torch.nn.Module instances for reused/duplicate module + + :param model: pytorch Model to be modified. + :param modules_to_exclude: List of modules to exclude when tracing. + :param module_classes_to_exclude: List of module classes to exclude when tracing. + :param concrete_args: Allows you to partially specialize your function, whether it's to remove control flow or + data structures. If the model has control flow, torch.fx won't be able to trace the model. Check + torch.fx.symbolic_trace API in detail. + :return: Modified pytorch Model + """ + with in_eval_mode(model): + traced_model, node_name_to_scope = \ + _trace_model(model, modules_to_exclude, module_classes_to_exclude, concrete_args) + + # Prepare model and perform checks to make sure the graph is well-formed. + _prepare_traced_model(traced_model, node_name_to_scope) + return traced_model
+ + +def _trace_model(model: torch.nn.Module, + modules_to_exclude: Optional[List[torch.nn.Module]], + module_classes_to_exclude: Optional[List[Callable]], + concrete_args: Optional[Dict[str, Any]]) -> [torch.fx.GraphModule, Dict]: + """ + Returns traced model and dictionary of node name to the scope of module which contains the node. + + :param model: pytorch Model to be modified. + :param modules_to_exclude: List of modules to exclude when tracing. + :param module_classes_to_exclude: List of module classes to exclude when tracing. + :param concrete_args: Concrete arguments that should not be treated as Proxies. + :return: (Traced model, node_name_to_scope) + """ + class Tracer(torch.fx.Tracer): + """ + Override is_leaf_module(), call_module() and create_node() methods of parent class. + """ + def __init__(self): + super().__init__() + self.scope = Scope("", None) + self.node_name_to_scope = {} + + def is_leaf_module(self, m: torch.nn.Module, module_qualified_name: str) -> bool: + return ( + modules_to_exclude and m in modules_to_exclude + or module_classes_to_exclude and type(m) in module_classes_to_exclude # pylint: disable=unidiomatic-typecheck + or super().is_leaf_module(m, module_qualified_name) + ) + + def call_module(self, m: torch.nn.Module, forward: Callable[..., Any], args: Tuple[Any, ...], + kwargs: Dict[str, Any]) -> Any: + module_qualified_name = self.path_of_module(m) + with ScopeContextManager(self.scope, Scope(module_qualified_name, type(m))): + return super().call_module(m, forward, args, kwargs) + + def create_node(self, kind: str, target, args, kwargs, name: Optional[str] = None, + type_expr: Optional[Any] = None) -> torch.fx.Node: + node = super().create_node(kind, target, args, kwargs, name, type_expr) + self.node_name_to_scope[node.name] = (self.scope.module_path, self.scope.module_type) + return node + + # Symbolic tracing frontend - captures the semantics of the module + tracer = Tracer() + graph = tracer.trace(model, concrete_args=concrete_args) + traced_model = torch.fx.GraphModule(tracer.root, graph) + return traced_model, tracer.node_name_to_scope + + +def _prepare_traced_model(traced_model: torch.fx.GraphModule, + node_name_to_scope: Dict[str, Tuple[str, type]] = None): + """ + Helper for prepare_model(). This prepares the given traced_model in-place. + + :param traced_model: Symbolically traced model. + :param node_name_to_scope: Mapping from node name to the scope of module which contains the node. 
+ """ + unique_nodes = set() + + # Modify the symbolically traced model by iterating over all the nodes + for node in traced_model.graph.nodes: + + # Create new module for functional nodes + if node.op in ['call_function', 'call_method']: + functional_name = _find_functional_name_for_node(node.name) + if functional_name: + # Instantiate new module for functional node + new_module = _create_module_for_functional_node(node, functional_name) + parent_module, new_module_name, new_module_qualified_name = \ + _get_info_for_functional_node(traced_model, node, node_name_to_scope) + setattr(parent_module, new_module_name, new_module) + # Insert the node for new module in the graph + _insert_node_for_new_module(traced_model, node, new_module_qualified_name, functional_name) + logger.info("Functional : Adding new module for node: {%s} ", new_module_qualified_name) + + # Create new module for reused/duplicate nodes + elif node.target in unique_nodes: + if node.op == 'call_module': + # Instantiate new module for reused node + new_module = _create_module_for_reused_node(node, traced_model) + parent_module, new_module_name, new_module_qualified_name = \ + _get_info_for_reused_node(traced_model, node, node_name_to_scope) + setattr(parent_module, new_module_name, new_module) + # Insert the node for new module in the graph + _insert_node_for_new_module(traced_model, node, new_module_qualified_name) + logger.info("Reused/Duplicate : Adding new module for node: {%s} ", new_module_qualified_name) + else: + unique_nodes.add(node.target) + + _verify_traced_model(traced_model) + + # Replace SiLU with CustomSiLU + replace_modules_of_type1_with_type2(traced_model, torch.nn.SiLU, elementwise_ops.CustomSiLU) + + +def _verify_traced_model(traced_model: torch.fx.GraphModule): + """ + Does some checks to make sure the graph is well-formed and recompile the forward() method of symbolic_traced + model from its graph + + :param traced_model: Symbolically traced model + """ + traced_model.graph.lint() + traced_model.recompile() + + +def _insert_node_for_new_module(traced_model: torch.fx.GraphModule, + node: torch.fx.node, + module_qualified_name: str, + functional_name: str = None): + """ + Insert 'call module' node into graph and replace all the uses of 'node' with newly added node and erase the + old node from graph + :param traced_model: Symbolically traced model + :param node: Current node in the graph after which new node will be inserted + :param module_qualified_name: Qualified module name in symbolic_traced_model hierarchy corresponding to new node + :param functional_name: Original functional name + """ + with traced_model.graph.inserting_after(node): + if functional_name: + if functional_name in functional_with_special_handling: + new_node = special_handler_functions[functional_name]['node_fn'](traced_model, module_qualified_name, node) + elif functional_name in functional_with_stateless_api: + new_node = traced_model.graph.call_module(module_qualified_name, args=node.args, kwargs=node.kwargs) + elif functional_name in functional_with_stateful_api: + new_node = traced_model.graph.call_module(module_qualified_name, args=node.args) + else: + raise ValueError("Unsupported module: {}".format(functional_name)) + else: + new_node = traced_model.graph.call_module(module_qualified_name, args=node.args) + + node.replace_all_uses_with(new_node) + traced_model.graph.erase_node(node) + + +def _find_functional_name_for_node(node_name: str) -> Union[str, None]: + """ + For given node name, find corresponding functional name 
from combined lookup + + :param node_name: torch.fx Node name + :return: corresponding functional name if found, else None + """ + combined_lookup = {**functional_with_stateful_api, **functional_with_special_handling, **functional_with_stateless_api} + + # Functional operations with similar names are differentiated using "_count" suffix + # when symbolically traced. For example, two add operations will have name 'add' and 'add_1'. + # Split given node name by occurrence of pattern. \d is used to match [0-9] followed by '_'. + strings = re.split(pattern=r'_\d', string=node_name) + for string in strings: + if string in combined_lookup.keys(): + return string + + logger.debug("Couldn't find functional: %s in the lookup. If functional op isn't math invariant," + " add an entry in the lookup.", node_name) + return None + + +def _create_module_for_functional_node(node: torch.fx.node, functional_name: str) -> torch.nn.Module: + """ + For given node and functional name, create torch.nn.Module with same parameters as functional node parameters + :param node: torch.fx Node + :param functional_name: Functional name for given node + :return: New module + """ + # Instantiate new module from lookup + if functional_name in functional_with_stateful_api: + module = functional_with_stateful_api[functional_name]() + # Set the parameters for module from node.kwargs + for key, value in node.kwargs.items(): + setattr(module, key, value) + elif functional_name in functional_with_special_handling: + module = special_handler_functions[functional_name]['module_fn'](node) + elif functional_name in functional_with_stateless_api: + module = functional_with_stateless_api[functional_name]() + else: + raise ValueError("Unsupported module: {}".format(functional_name)) + return module + + +def _create_module_for_reused_node(node: torch.fx.node, symbolic_traced_model: torch.fx.GraphModule) ->\ + torch.nn.Module: + """ + For given reused/Duplicate node in symbolically traced model, create new module with same parameters as + original module + :param node: Reused/Duplicate torch.fx Node + :param symbolic_traced_model: Symbolically traced model + :return: New module + """ + # Get the original module and return newly deep copied module + module = _get_module_for_dotted_name(symbolic_traced_model, node.target) + new_module = copy.deepcopy(module) + + return new_module + + +def _get_module_for_dotted_name(module: torch.fx.GraphModule, dotted_name: str) -> torch.nn.Module: + """ + For given dotted name, find the module + :param module: module to be found + :param dotted_name: dotted name of module + :return: module + """ + if '.' in dotted_name: + module_name, _, remainder = dotted_name.partition('.') + return _get_module_for_dotted_name(module._modules[module_name], remainder) # pylint: disable=protected-access + + return getattr(module, dotted_name) + + +def get_module_for_activation_fn(act_fn: torch.nn.functional): + """ + returns module instance for functional tyoe handled within PT transformers for activation functions + :param act_fn: activation function implemented as a functional. + :return: module equivalent for the activation function. 
+ """ + + if act_fn not in functional_op_to_module_map: + logger.error("Unsupported activation function {%s}", act_fn) + return None + module = functional_op_to_module_map[act_fn]() + return module + + +def prepare_pt_transformer_for_quantsim(transformer_model: torch.nn.Module): + """ + Replaces functionals with modules for activation function, updates model in-place + :param transformer_model: model with PyTorch nn.Transformer layer + :return: updated model with modules for activation function. + """ + + for module in transformer_model.modules(): + + # encoder layer or decoder layer type is the leaf level node to be updated within nn.transformer layer + if isinstance(module, torch.nn.TransformerEncoderLayer) and not isinstance(module.activation, torch.nn.Module): + module.activation = get_module_for_activation_fn(module.activation) + + if isinstance(module, torch.nn.TransformerDecoderLayer) and not isinstance(module.activation, torch.nn.Module): + module.activation = get_module_for_activation_fn(module.activation) + + +def _get_info_for_functional_node(traced_model: torch.fx.GraphModule, + node: torch.fx.Node, + node_name_to_scope: Dict[str, Tuple[str, type]])\ + -> Tuple[torch.fx.GraphModule, str, str]: + """ + For functional node, get module which contains the node, corresponding new module's name and fully qualified name. + This information will be used to add new module at either module-level scope or model-level scope. + + NOTE: If node_name_to_scope is not provided, then the corresponding new module will be added at model-level scope. + Also, if exception is raised, new module will be added at model-level scope. + + :param traced_model: Traced model + :param node: torch.fx Node + :param node_name_to_scope: Mapping from node name to the scope of module which contains the node. + :return: (parent_module, new_module_name, new_module_qualified_name) + """ + parent_module = traced_model + new_module_name = "module_" + node.name + new_module_qualified_name = new_module_name + + if node_name_to_scope: + try: + module_path, _ = node_name_to_scope[node.name] + parent_module = traced_model.get_submodule(module_path) + if module_path == "": + new_module_qualified_name = new_module_name + else: + new_module_qualified_name = module_path + "." + new_module_name + except (KeyError, AttributeError): + pass + + return parent_module, new_module_name, new_module_qualified_name + + +def _get_info_for_reused_node(traced_model: torch.fx.GraphModule, + node: torch.fx.Node, + node_name_to_scope: Dict[str, Tuple[str, type]])\ + -> Tuple[torch.fx.GraphModule, str, str]: + """ + For reused node, get module which contains the node, corresponding new module's name and fully qualified name. + This information will be used to add new module at either module-level scope or model-level scope. + + NOTE: If node_name_to_scope is not provided, then the corresponding new module will be added at model-level scope. + Also, if exception is raised, new module will be added at model-level scope. + + :param traced_model: Traced model + :param node: torch.fx Node + :param node_name_to_scope: Mapping from node name to the scope of module which contains the node. + :return: (parent_module, new_module_name, new_module_qualified_name) + """ + parent_module = traced_model + new_module_name = "module_" + node.name + new_module_qualified_name = new_module_name + + if node_name_to_scope: + try: + module_path, _ = node_name_to_scope[node.name] + if "." 
in module_path: + parent_name, child_name = module_path.rsplit(".", maxsplit=1) + else: + parent_name, child_name = "", module_path + parent_module = traced_model.get_submodule(parent_name) + new_module_name = "module_" + child_name + "_" + node.name.rsplit("_", maxsplit=1)[1] + if parent_name == "": + new_module_qualified_name = new_module_name + else: + new_module_qualified_name = parent_name + "." + new_module_name + except (KeyError, AttributeError): + pass + + return parent_module, new_module_name, new_module_qualified_name +
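+
+# --- Illustrative usage sketch (not part of the original module). It assumes a
+# --- torchvision ResNet18 is available; any traceable FP32 model works the same way.
+def _example_prepare_model():
+    """ Prepare a model so functionals and reused modules become distinct nn.Modules. """
+    from torchvision.models import resnet18
+
+    model = resnet18().eval()
+    prepared_model = prepare_model(model)
+
+    # The prepared model should be functionally equivalent to the original
+    dummy_input = torch.randn(1, 3, 224, 224)
+    with torch.no_grad():
+        assert torch.allclose(model(dummy_input), prepared_model(dummy_input))
+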
\ No newline at end of file
diff --git a/releases/1.32.2/_modules/aimet_torch/peft.html b/releases/1.32.2/_modules/aimet_torch/peft.html
new file mode 100644
index 00000000..ed334bce
--- /dev/null
+++ b/releases/1.32.2/_modules/aimet_torch/peft.html
@@ -0,0 +1,1569 @@
+aimet_torch.peft — AI Model Efficiency Toolkit Documentation: ver 1.32.2
Source code for aimet_torch.peft

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Implementation for handling LoRA adapters added using PEFT """
+from typing import Dict, Type
+import os
+import pickle
+from collections import defaultdict
+import torch.nn as nn
+import torch
+import onnx
+from safetensors.torch import save_file
+from safetensors import safe_open
+
+# pylint: disable=import-error
+# pylint: disable=no-name-in-module
+from peft.tuners.lora.layer import LoraLayer as PeftLoraLayer
+from peft.tuners.lora.layer import Conv2d as PeftConv2d
+
+from aimet_torch.utils import replace_modules_of_type1_using_constructor
+from aimet_torch.elementwise_ops import Add
+from aimet_torch.v2.quantsim import QuantizationSimModel
+from aimet_torch.quantsim import ExportableQuantModule
+from aimet_torch.v2.nn import BaseQuantizationMixin
+from aimet_torch.onnx_utils import OnnxSaver, get_layers_in_io_tensor_map
+
+
+class LoraLayer(torch.nn.Module):
+    """
+    Quantizable lora layer
+    """
+    # pylint: disable=too-many-instance-attributes
+    def __init__(self, lora_layer: PeftLoraLayer):
+        """
+        :param lora_layer: Lora layer we want to replace
+        """
+        super().__init__()
+        self.base_layer = lora_layer.base_layer
+        self.r = lora_layer.r
+        self.lora_alpha = lora_layer.lora_alpha
+        self.scaling = lora_layer.scaling
+        self.lora_dropout = nn.ModuleList([])
+        self.adapter_name_to_index = {}
+        self.index_to_adapter_name = {}
+        self.lora_A = nn.ModuleList([])
+        self.lora_B = nn.ModuleList([])
+        self.active_adapters = {}
+        self._swap_module_dict_with_list(lora_layer)
+        self.in_features = lora_layer.in_features
+        self.out_features = lora_layer.out_features
+        self.add_lora_to_res = Add()
+
+    def _swap_module_dict_with_list(self, lora_layer):
+        for index, adapter_name in enumerate(lora_layer.lora_A):
+            self.lora_A.append(lora_layer.lora_A[adapter_name])
+            self.lora_B.append(lora_layer.lora_B[adapter_name])
+            self.lora_dropout.append(lora_layer.lora_dropout[adapter_name])
+            self.adapter_name_to_index[adapter_name] = index
+            if adapter_name in lora_layer.active_adapter:
+                self.active_adapters[adapter_name] = True
+            else:
+                self.active_adapters[adapter_name] = False
+        for adapter_name in self.adapter_name_to_index:
+            self.index_to_adapter_name[self.adapter_name_to_index[adapter_name]] = adapter_name
+
+    def forward(self, x: torch.Tensor, *args, **kwargs) -> torch.Tensor:
+        """ Forward pass for replaced layer"""
+        result = self.base_layer(x, *args, **kwargs)
+        torch_result_dtype = result.dtype
+        for active_adapter in self.active_adapters:
+            if active_adapter not in self.adapter_name_to_index:
+                continue
+            lora_A = self.lora_A[self.adapter_name_to_index[active_adapter]]
+            lora_B = self.lora_B[self.adapter_name_to_index[active_adapter]]
+            dropout = self.lora_dropout[self.adapter_name_to_index[active_adapter]]
+            scaling = self.scaling[active_adapter]
+            x = x.to(lora_A.weight.dtype)
+
+            result = self.add_lora_to_res(result, lora_B(lora_A(dropout(x)) * scaling))
+
+        result = result.to(torch_result_dtype)
+        return result
+
+
+def replace_lora_layers_with_quantizable_layers(model: torch.nn.Module):
+    """
+    Utility to replace lora layers with Quantizable Lora layers
+
+    :param model: PEFT model
+    """
+    replace_modules_of_type1_using_constructor(model, PeftLoraLayer, LoraLayer)
+    replace_modules_of_type1_using_constructor(model, PeftConv2d, LoraLayer)
+
+
+def save_lora_weights_after_adaptation(model: torch.nn.Module, path: str, filename_prefix: str):
+    """
+    Utility to save model weights after model adaptations
+
+    :param model: PEFT model
+    :param path: path where to store model pth and encodings
+    :param filename_prefix: Prefix to use for filenames
+    """
+    param_to_name = {}
+
+    for name, param in model.named_parameters():
+        param_to_name[param] = name
+
+    lora_weights = {}
+    for name, module in model.named_modules():
+        if isinstance(module, LoraLayer):
+            for _, param in module.lora_A.named_parameters():
+                name = param_to_name[param]
+                lora_weights[name] = param
+            for _, param in module.lora_B.named_parameters():
+                name = param_to_name[param]
+                lora_weights[name] = param
+
+    filename_prefix = filename_prefix + '.safetensor'
+    model_params_path = os.path.join(path, filename_prefix)
+    save_file(lora_weights, model_params_path)
+
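+
+# --- Illustrative usage sketch (not part of the original module). `peft_model` is
+# --- assumed to be a model returned by peft.get_peft_model(); the path and prefix
+# --- below are placeholders.
+def _example_prepare_peft_model_for_quantization(peft_model: torch.nn.Module):
+    """ Swap PEFT LoRA layers for quantizable ones and save the adapted LoRA weights. """
+    replace_lora_layers_with_quantizable_layers(peft_model)
+    save_lora_weights_after_adaptation(peft_model, path="./artifacts", filename_prefix="lora_weights")
+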
+
+
[docs]class AdapterMetaData: + """ + Tracks meta data for lora layers. Tracks names of lora_a & b as well as alpha values + Attributes: + lora_A, lora_B, alpha + """ + def __init__(self): + self.lora_A = [] + self.lora_B = [] + self.alpha = None
+ + +def track_lora_meta_data(model: torch.nn.Module, path: str, filename_prefix: str, + replaced_module_type: Type[torch.nn.Module] = None) -> Dict[str, AdapterMetaData]: + """ + Utility to track and save meta data for adapters. The meta data has adapter names and corresponding lora layers & alphas + + :param model: PEFT model + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames + :param replaced_module_type: If lora linear layer is replaced by another torch module, then replaced_module_type + represents the type with which linear layer was replaced. Otherwise pass None + """ + module_to_name_d = {} + + for name, module in model.named_modules(): + module_to_name_d[module] = name + + adapter_name_to_meta_data = defaultdict(AdapterMetaData) + for name, module in model.named_modules(): + if isinstance(module, LoraLayer): + for index, lora_layer in enumerate(module.lora_A): + if replaced_module_type and isinstance(lora_layer, replaced_module_type): + lora_layer = lora_layer.conv2d + adapter_name_to_meta_data[module.index_to_adapter_name[index]].lora_A.append( + module_to_name_d[lora_layer]) + for index, lora_layer in enumerate(module.lora_B): + if replaced_module_type and isinstance(lora_layer, replaced_module_type): + lora_layer = lora_layer.conv2d + adapter_name_to_meta_data[module.index_to_adapter_name[index]].lora_B.append( + module_to_name_d[lora_layer]) + for lora_adapter_name in module.lora_alpha: + adapter_name_to_meta_data[lora_adapter_name].alpha = module.lora_alpha[lora_adapter_name] + + + file_name = os.path.join(path, f"{filename_prefix}.pkl") + with open(file_name, 'wb') as file: + pickle.dump(adapter_name_to_meta_data, file) + return adapter_name_to_meta_data + + +
[docs]class PeftQuantUtils: + """ + Utilities for quantizing peft model + """ + def __init__(self, adapater_name_to_meta_data: Dict[str, AdapterMetaData], name_to_module_dict=None): + """ + Init for Peft utilities for quantization + + :param adapater_name_to_meta_data: Dict mapping adapter name to meta data. Output of track_meta_data + :param name_to_module_dict: PT Name to module prepared model name mapping + """ + self.adapter_name_to_meta_data = adapater_name_to_meta_data + self.lora_layers = self._get_lora_layers() + self.pt_name_to_prepared_name, self.prepared_name_to_pt_name = None, None + self.pt_to_lora_name = dict.fromkeys(self.lora_layers, '') + if name_to_module_dict: + self.pt_name_to_prepared_name, self.prepared_name_to_pt_name = self._get_pytorch_name_to_prepared_name(name_to_module_dict) + self.lora_to_pt_name, self.pt_to_lora_name = self._get_lora_name_to_pytorch_name() + + @staticmethod + def _get_pytorch_name_to_prepared_name(name_to_module_dict): + """ + Gets onnx names to pytorch names mapping and vice versa + + :param model: PT model + """ + pt_name_to_onnx_name = {} + onnx_name_to_pt_name = {} + for pytorch_name in name_to_module_dict: + onnx_name = name_to_module_dict[pytorch_name][0] + pt_name_to_onnx_name[pytorch_name] = onnx_name + onnx_name_to_pt_name[onnx_name] = pytorch_name + return pt_name_to_onnx_name, onnx_name_to_pt_name + + def _get_lora_name_to_pytorch_name(self): + """ + Gets most similar pytorch name for every lora name + """ + lora_to_pytorch_name = {} + pytorch_to_lora_name = {} + for pt_name in self.pt_name_to_prepared_name: + for lora_name in self.lora_layers: + if pt_name in lora_name: + lora_to_pytorch_name[lora_name] = pt_name + pytorch_to_lora_name[pt_name] = lora_name + return lora_to_pytorch_name, pytorch_to_lora_name + + def _get_lora_layers(self) -> set: + """ + Gets all lora layers + """ + lora_layers = set() + for adapter_name in self.adapter_name_to_meta_data: + for lora_module in self.adapter_name_to_meta_data[adapter_name].lora_A: + lora_layers.add(lora_module) + for lora_module in self.adapter_name_to_meta_data[adapter_name].lora_B: + lora_layers.add(lora_module) + return lora_layers + + @staticmethod + def _freeze_quantizer(quantizer): + """ + Disables compute encodings and gradient update for a quantizer + + :param quantizer: Param, output or Input quantizer + """ + # pylint:disable = protected-access + quantizer._allow_overwrite = False + quantizer.requires_grad_(False) + +
[docs] def freeze_base_model_param_quantizers(self, sim: QuantizationSimModel): + """ + Freeze parameter quantizers of base model + + :param sim: QuantSim model + """ + for module_name, module in sim.model.named_modules(): + if self.prepared_name_to_pt_name and module_name in self.prepared_name_to_pt_name: + module_name = self.prepared_name_to_pt_name[module_name] + if isinstance(module, BaseQuantizationMixin) and module_name not in self.pt_to_lora_name: + for _, param_quantizer in module.param_quantizers.items(): + if param_quantizer: + self._freeze_quantizer(param_quantizer)
+ +
[docs] def freeze_base_model_activation_quantizers(self, sim: QuantizationSimModel): + """ + Freeze activation quantizers of base model + + :param sim: QuantSim model + """ + for module_name, module in sim.model.named_modules(): + if self.prepared_name_to_pt_name and module_name in self.prepared_name_to_pt_name: + module_name = self.prepared_name_to_pt_name[module_name] + if isinstance(module, BaseQuantizationMixin) and module_name not in self.pt_to_lora_name: + for input_quantizer, output_quantizer in zip(module.input_quantizers, module.output_quantizers): + if input_quantizer: + self._freeze_quantizer(input_quantizer) + if output_quantizer: + self._freeze_quantizer(output_quantizer)
+ +
[docs] def freeze_base_model(self, sim: QuantizationSimModel): + """ + Freeze entire base model + + :param sim: QuantSim model + """ + self.freeze_base_model_activation_quantizers(sim) + self.freeze_base_model_param_quantizers(sim)
+ +
[docs] def set_bitwidth_for_lora_adapters(self, sim: QuantizationSimModel, + output_bw: int, param_bw: int): + """ + Sets output and param bitwidth for all Lora adapters added to the model + + :param sim: QuantSim model + :param output_bw: Output BW + :param param_bw: Parameter BW + """ + for module_name, module in sim.model.named_modules(): + if self.prepared_name_to_pt_name and module_name in self.prepared_name_to_pt_name: + module_name = self.prepared_name_to_pt_name[module_name] + if isinstance(module, BaseQuantizationMixin) and module_name in self.pt_to_lora_name: + self._set_bitwidth_for_module(module, output_bw, param_bw)
+ +
[docs] def get_quantized_lora_layer(self, sim: QuantizationSimModel): + """ + This function can be used to generate lora quantized layers + Use cases: 1) New quantizers can be created and assigned to lora quantized layer. + New quantizers may be required if changing - Changing dtype, per channel to per tensor + and vice versa + 2) Assign new values to symmetric, bitwidth + + :param sim: QuantSim model + """ + for module_name, module in sim.model.named_modules(): + if self.prepared_name_to_pt_name and module_name in self.prepared_name_to_pt_name: + module_name = self.prepared_name_to_pt_name[module_name] + if isinstance(module, BaseQuantizationMixin) and module_name in self.pt_to_lora_name: + yield module_name, module
+ + @staticmethod + def _set_bitwidth_for_module(module: BaseQuantizationMixin, output_bw: int, param_bw: int): + """ + Sets bitwidth for a QcQuantizeWrapper module + + :param module: QcQuantize wrapper module + :param output_bw: Output BW + :param param_bw: Parameter BW + """ + for output_quantizer in module.output_quantizers: + output_quantizer.bitwidth = output_bw + for _, param_quantizer in module.param_quantizers.items(): + param_quantizer.bitwidth = param_bw + +
[docs] def export_adapter_weights(self, sim: QuantizationSimModel, path: str, filename_prefix: str, onnx_model_path: str): + """ + Exports adapter weights to safetensor format + + :param sim: QuantSim model + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames of the model pth and encodings files + :param onnx_model_path: Path from where we can load the exported onnx model. This can be the same path to where + QuantSim exported the ONNX model + """ + # pylint: disable=too-many-locals + assert os.path.exists(onnx_model_path), 'The onnx model does not exist in the location specified' + + onnx_model = onnx.load(onnx_model_path) + onnx_node_to_io_tensor_map, _ = OnnxSaver.get_onnx_node_to_io_tensor_names_map(onnx_model) + layers_to_onnx_op_names = get_layers_in_io_tensor_map(onnx_node_to_io_tensor_map) + + tensors = {} + + for module_name, module in sim.model.named_modules(): + if not isinstance(module, ExportableQuantModule): + continue + + if module_name in layers_to_onnx_op_names: + onnx_name = layers_to_onnx_op_names[module_name][0] + if self.prepared_name_to_pt_name and module_name in self.prepared_name_to_pt_name: + pt_name = self.prepared_name_to_pt_name[module_name] + if pt_name in self.pt_to_lora_name: + module_name = self.pt_to_lora_name[pt_name] + if module_name in self.lora_layers: + for param_name, param in module.named_parameters(): + if param_name in ['weight', 'bias']: + tensor_name = onnx_name + '.' + param_name + tensors[tensor_name] = param + filename_prefix = filename_prefix + '.safetensor' + model_params_path = os.path.join(path, filename_prefix) + save_file(tensors, model_params_path)
+ +
[docs] def enable_adapter_and_load_weights(self, sim: QuantizationSimModel, adapter_weights_path, + use_safetensor: bool = True): + """ + Enables adapter effect on base model by loading weights to model + + :param sim: QuantSim model + :param adapter_weights_path: Path to adapter weights (adapter weights should be either bin file or safetensor) + :param use_safetensor: True if adapter weights path point to a safetensor file. False if points to bin file + """ + tensors = {} + if use_safetensor: + with safe_open(adapter_weights_path, framework="pt", device=0) as f: + for key in f.keys(): + tensors[key] = f.get_tensor(key) + else: + tensors = torch.load(adapter_weights_path) + + onnx_names_tensors = {} + for key in tensors.keys(): + tensor_name = key + if self.prepared_name_to_pt_name: + temp_key = key[0:key.find('.weight')] + tensor_name = self.pt_name_to_prepared_name[self.lora_to_pt_name[temp_key]] + '.weight' + onnx_names_tensors[tensor_name] = tensors[key] + + sim.model.load_state_dict(onnx_names_tensors, strict=False)
+ +
[docs] def disable_lora_adapters(self, sim: QuantizationSimModel): + """ + Disables adapter (zero out weights for lora A & B) effect on base model by loading weights to model + + :param sim: QuantSim model + """ + tensors = {} + for module_name, module in sim.model.named_modules(): + org_name = module_name + if self.prepared_name_to_pt_name and module_name in self.prepared_name_to_pt_name: + pt_name = self.prepared_name_to_pt_name[module_name] + if pt_name in self.pt_to_lora_name: + module_name = self.pt_to_lora_name[pt_name] + if module_name in self.lora_layers: + for param_name, param in module.named_parameters(): + if param_name in ['weight', 'bias']: + tensor_name = org_name + '.' + param_name + tensors[tensor_name] = torch.zeros_like(param) + + sim.model.load_state_dict(tensors, strict=False)
+
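+
+# --- Illustrative end-to-end sketch (not part of the original module). `model`,
+# --- `dummy_input` and the paths below are placeholders supplied by the caller.
+def _example_quantize_peft_adapters(model: torch.nn.Module, dummy_input: torch.Tensor):
+    """ Track adapter meta data, build a QuantSim, and restrict tuning to the LoRA layers. """
+    meta_data = track_lora_meta_data(model, path="./artifacts", filename_prefix="meta_data")
+    peft_utils = PeftQuantUtils(meta_data)
+
+    sim = QuantizationSimModel(model, dummy_input)
+    # Keep the base model fixed and quantize only the LoRA adapters more aggressively
+    peft_utils.freeze_base_model(sim)
+    peft_utils.set_bitwidth_for_lora_adapters(sim, output_bw=16, param_bw=4)
+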
\ No newline at end of file
diff --git a/releases/1.32.2/_modules/aimet_torch/quant_analyzer.html b/releases/1.32.2/_modules/aimet_torch/quant_analyzer.html
new file mode 100644
index 00000000..79737e44
--- /dev/null
+++ b/releases/1.32.2/_modules/aimet_torch/quant_analyzer.html
@@ -0,0 +1,1886 @@
+aimet_torch.quant_analyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2
Source code for aimet_torch.quant_analyzer

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Quant Analyzer """
+
+import os
+import contextlib
+from collections import OrderedDict, defaultdict
+from typing import Union, Tuple, Dict, List, Collection, Type, Generator
+import torch
+from torch.utils.data import DataLoader
+
+from aimet_common.quant_analyzer import save_json, export_per_layer_sensitivity_analysis_plot,\
+    create_and_export_min_max_ranges_plot, export_per_layer_mse_plot, export_stats_histogram_plot
+from aimet_common.utils import AimetLogger, CallbackFunc
+from aimet_common.defs import QuantScheme
+from aimet_torch import utils
+from aimet_torch.tensor_quantizer import TensorQuantizer, StaticGridTensorQuantizer
+from aimet_torch.qc_quantize_op import QcQuantizeWrapper
+from aimet_torch.qc_quantize_recurrent import QcQuantizeRecurrent
+from aimet_torch.quantsim import QuantizationSimModel
+from aimet_torch.batch_norm_fold import fold_all_batch_norms
+
+_logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.QuantAnalyzer)
+
+DEFAULT_BOKEH_FIGURE_HEIGHT = 300
+
+
+
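+
+# --- Illustrative usage sketch (not part of the original module) for the
+# --- QuantAnalyzer defined below. `model`, `dummy_input` and the two callbacks are
+# --- placeholders supplied by the caller; the callbacks are expected to be
+# --- aimet_common.utils.CallbackFunc instances.
+def _example_run_quant_analyzer(model, dummy_input, forward_pass_callback, eval_callback):
+    """ Run the full analysis and dump plots/JSON results under ./quant_analyzer_results. """
+    analyzer = QuantAnalyzer(model, dummy_input,
+                             forward_pass_callback=forward_pass_callback,
+                             eval_callback=eval_callback)
+    analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
+                     default_param_bw=8,
+                     default_output_bw=8,
+                     results_dir="./quant_analyzer_results")
+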
[docs]class QuantAnalyzer: + """ + QuantAnalyzer tool provides + + 1) model sensitivity to weight and activation quantization + 2) per layer sensitivity analysis + 3) per layer encoding (min - max range) + 4) per PDF analysis and + 5) per layer MSE analysis + """ + def __init__(self, + model: torch.nn.Module, + dummy_input: Union[torch.Tensor, Tuple], + forward_pass_callback: CallbackFunc, + eval_callback: CallbackFunc, + modules_to_ignore: List[torch.nn.Module] = None, + ): + """ + :param model: FP32 model to analyze for quantization. + :param dummy_input: Dummy input to model. + :param forward_pass_callback: A callback function for model calibration that simply runs + forward passes on the model to compute encoding (delta/offset). This + callback function should use representative data and should be subset of + entire train/validation dataset (~1000 images/samples). + :param eval_callback: A callback function for model evaluation that determines model + performance. This callback function is expected to return scalar value + representing the model performance evaluated against entire test/evaluation dataset. + :param modules_to_ignore: Excludes certain modules from being analyzed. + """ + if not isinstance(forward_pass_callback, CallbackFunc): + raise ValueError('forward_pass_callback and its argument(s) are not encapsulated by CallbackFunc class.') + if not isinstance(eval_callback, CallbackFunc): + raise ValueError('eval_callback and its argument(s) are not encapsulated by CallbackFunc class.') + + self._model = model + self._dummy_input = dummy_input + self._forward_pass_callback = forward_pass_callback + self._eval_callback = eval_callback + self._unlabeled_dataset_iterable = None + self._num_batches = None + self._modules_to_ignore = modules_to_ignore + +
[docs] def analyze(self, + quant_scheme: QuantScheme = QuantScheme.post_training_tf_enhanced, + default_param_bw: int = 8, + default_output_bw: int = 8, + config_file: str = None, + results_dir: str = "./tmp/", + ): + """ + Analyze model for quantization and point out sensitive parts/hotspots of the model by performing + 1) model sensitivity to quantization, + 2) perform per layer sensitivity analysis by enabling and disabling quant wrappers, + 3) export per layer encodings min - max ranges, + 4) export per layer statistics histogram (PDF) when quant scheme is TF-Enhanced, + 5) per layer MSE analysis + + :param quant_scheme: Quantization scheme. Supported values are + QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced. + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters. + :param default_output_bw: Default bitwidth (4-31) to use for quantizing layer inputs and outputs. + :param config_file: Path to configuration file for model quantizers. + :param results_dir: Directory to save the results. + """ + sim = self._create_quantsim_and_encodings(quant_scheme, + default_param_bw, + default_output_bw, + config_file) + + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + # Check model sensitivity to weight and activation quantization individually. + self.check_model_sensitivity_to_quantization(sim) + + # Perform per layer analysis by enabling each quant wrapper (OPTION-1). + self.perform_per_layer_analysis_by_enabling_quant_wrappers(sim, results_dir) + + # Perform per layer analysis by disabling each quant wrapper (OPTION-2). + self.perform_per_layer_analysis_by_disabling_quant_wrappers(sim, results_dir) + + # Export encoding min-max range. + self.export_per_layer_encoding_min_max_range(sim, results_dir) + + # Export PDF of statistics. + if quant_scheme == QuantScheme.post_training_tf_enhanced: + self.export_per_layer_stats_histogram(sim, results_dir) + + # Export per layer MSE loss between fp32 and quantized output activations. + if self._unlabeled_dataset_iterable: + self.export_per_layer_mse_loss(sim, results_dir)
+ +
[docs] def enable_per_layer_mse_loss(self, unlabeled_dataset_iterable: Union[DataLoader, Collection], num_batches: int): + """ + Enable per layer MSE loss analysis. + + :param unlabeled_dataset_iterable: A collection (i.e. iterable with `__len__`) + that iterates over an unlabeled dataset. The values yielded by this iterable are expected + to be able to be passed directly to the model. + :param num_batches: Number of batches. Approximately 256 samples/images are recommended, + so if batch size of data loader is 64, then 4 number of batches leads to 256 samples/images. + """ + # TODO: Make per layer MSE loss analysis as part of top level API. + if len(unlabeled_dataset_iterable) < num_batches: + raise ValueError(f'Can not fetch {num_batches} batches from ' + f'a data loader of length {len(unlabeled_dataset_iterable)}.') + + self._unlabeled_dataset_iterable = unlabeled_dataset_iterable + self._num_batches = num_batches
+ + def _create_quantsim_and_encodings(self, quant_scheme: QuantScheme, default_param_bw: int, + default_output_bw: int, config_file: str) \ + -> QuantizationSimModel: + """ + Create Quantsim and compute encodings. + + :param quant_scheme: Quantization scheme. + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters. + :param default_output_bw: Default bitwidth (4-31) to use for quantizing layer inputs and outputs. + :param config_file: Path to configuration file for model quantizers. + :return: Quantsim model. + """ + if isinstance(self._dummy_input, torch.Tensor): + input_shape = tuple(self._dummy_input.shape) + else: + input_shape = [tuple(x.shape) for x in self._dummy_input] + _ = fold_all_batch_norms(self._model, input_shape, dummy_input=self._dummy_input) + + kwargs = dict( + quant_scheme=quant_scheme, + default_output_bw=default_output_bw, + default_param_bw=default_param_bw, + config_file=config_file, + ) + sim = self._get_quantsim_cls()(self._model, self._dummy_input, **kwargs) + if self._modules_to_ignore: + self._exclude_modules_from_quantization(self._model, sim, self._modules_to_ignore) + + self.patch_quantsim_to_store_histogram(sim) + sim.compute_encodings(self._forward_pass_callback.func, self._forward_pass_callback.args) + return sim + + def _eval_weight_quantized_model(self, sim: QuantizationSimModel)-> float: + """ + Evaluate weight quantized model performance. + For weight quantized model performance, disable enabled activation quantizers, measure + eval score and enable again. + + :param sim: Quantsim model. + :return: Quantized model performance. + """ + with self._disable_activation_quantizers(sim): + eval_score = self._eval_model(sim.model) + return eval_score + + def _eval_activation_quantized_model(self, sim: QuantizationSimModel)-> float: + """ + Evaluate activation quantized model performance. + For activation quantized model performance, disable enabled param quantizers, measure + eval score and enable again. + + :param sim: Quantsim model. + :return: Quantized model performance. + """ + with self._disable_param_quantizers(sim): + eval_score = self._eval_model(sim.model) + return eval_score + + def _eval_model(self, model: torch.nn.Module) -> float: + """ + Evaluate the model performance. + + :param model: PyTorch model to be evaluated. + :return: Scaler value representing model performance. + """ + with utils.in_eval_mode(model), torch.no_grad(): + return self._eval_callback.func(model, self._eval_callback.args) + + def _sort_quant_wrappers_based_on_occurrence(self, sim: QuantizationSimModel) -> Dict: + """ + Sort quant wrappers based on occurrence for given quantsim model. + + :param sim: Quantsim model. + :return: Ordered dictionary which maps wrapped module name to quant wrapper. + """ + def sorting_hook(quant_wrapper: torch.nn.Module, *_): + """ + Hook-function to sort quant wrappers based on occurrence. + + :param quant_wrapper: Quant wrapper. + :param _: Additional args. 
+ """ + quant_wrapper_name = module_to_name_dict[quant_wrapper] + sorted_quant_wrappers_dict[quant_wrapper_name] = quant_wrapper + + module_to_name_dict = {} + for name, module in sim.model.named_modules(): + module_to_name_dict[module] = name + + sorted_quant_wrappers_dict = OrderedDict() + utils.run_hook_for_layers_with_given_input(sim.model, self._dummy_input, sorting_hook, + module_type_for_attaching_hook=self._get_quant_wrapper_type(), + leaf_node_only=False) + return sorted_quant_wrappers_dict + + @classmethod + def _get_enabled_quantizers(cls, sorted_quant_wrappers: Dict)\ + -> Dict[Union[QcQuantizeWrapper, QcQuantizeRecurrent], List[TensorQuantizer]]: + """ + For given sorted quant wrappers dict, get enabled quantizers. + + :param sorted_quant_wrappers: Dictionary containing quant wrappers sorted based on occurrence. + :return: Dictionary which maps a quant wrapper to a list of enabled quantizers in it. + """ + enabled_quant_wrappers = defaultdict(list) + + for quant_wrapper in sorted_quant_wrappers.values(): + for quantizer in quant_wrapper.param_quantizers.values(): + if cls._is_quantizer_enabled(quantizer): + enabled_quant_wrappers[quant_wrapper].append(quantizer) + for quantizer in quant_wrapper.output_quantizers: + if cls._is_quantizer_enabled(quantizer): + enabled_quant_wrappers[quant_wrapper].append(quantizer) + for quantizer in quant_wrapper.input_quantizers: + if cls._is_quantizer_enabled(quantizer): + enabled_quant_wrappers[quant_wrapper].append(quantizer) + + return enabled_quant_wrappers + + @classmethod + def _get_enabled_param_quantizers(cls, sim: QuantizationSimModel) -> List[TensorQuantizer]: + """ + For given quantsim model, get all enabled param quantizers. + :param sim: Quantsim model. + :return: List of enabled param quantizers. + """ + enabled_param_quantizers = [] + for quant_wrapper in cls._get_quantized_modules(sim): + for quantizer in quant_wrapper.param_quantizers.values(): + if cls._is_quantizer_enabled(quantizer): + enabled_param_quantizers.append(quantizer) + + return enabled_param_quantizers + + @classmethod + def _get_enabled_activation_quantizers(cls, sim: QuantizationSimModel) -> List[TensorQuantizer]: + """ + For given quantsim model, get all enabled activation quantizers. + :param sim: Quantsim model. + :return: List of enabled activation quantizers. + """ + enabled_activation_quantizers = [] + for quant_wrapper in cls._get_quantized_modules(sim): + for quantizer in quant_wrapper.input_quantizers: + if cls._is_quantizer_enabled(quantizer): + enabled_activation_quantizers.append(quantizer) + for quantizer in quant_wrapper.output_quantizers: + if cls._is_quantizer_enabled(quantizer): + enabled_activation_quantizers.append(quantizer) + + return enabled_activation_quantizers + + @staticmethod + def _enable_disable_quantizers(quantizers: List[TensorQuantizer], enabled: bool): + """ + For given list of quantizers, set (enable/disable) quantizer's enabled. + + :param quantizers: List of quantizers. + :param enabled: Enabled flag. + """ + for quantizer in quantizers: + quantizer.enabled = enabled + + def _perform_per_layer_analysis(self, + sim: QuantizationSimModel, + disable_all_quantizers: bool, + enabled_before: bool, + enabled_after: bool, + ) -> Dict: + """ + Helper function for perform_per_layer_analysis_by_enabling_quant_wrappers() and + perform_per_layer_analysis_by_disabling_quant_wrappers() + + :param sim: Quantsim model. + :param disable_all_quantizers: Flag to disable all the quantizers before per-layer analysis. 
+ :param enabled_before: Flag to set enabled for quantizers before computing encodings. + :param enabled_after: Flag to set enabled for quantizers after computing encodings. + :return: layer wise eval score dictionary. dict[layer_name] = eval_score. + """ + # Validate input arguments + assert (disable_all_quantizers, enabled_before, enabled_after) in \ + ((True, True, False), (False, False, True)) + + # Sorted quant wrappers based on occurrence. + # maps wrapped module name to a quant wrapper. + sorted_quant_wrappers = self._sort_quant_wrappers_based_on_occurrence(sim) + + # quant wrappers and it's enabled quantizers. + # maps quant wrapper to a list of enabled quantizers in it. + enabled_quant_wrappers = self._get_enabled_quantizers(sorted_quant_wrappers) + + eval_score_dict = {} + for name, quant_wrapper in sorted_quant_wrappers.items(): + if quant_wrapper not in enabled_quant_wrappers: + continue + + with contextlib.ExitStack() as stack: + if disable_all_quantizers and enabled_before: + # Disable all quantizers except quant_wrapper + for enabled_quant_wrapper in enabled_quant_wrappers.keys(): + if enabled_quant_wrapper == quant_wrapper: + continue + stack.enter_context(self._disable_quant_wrapper(enabled_quant_wrapper)) + else: + # Disable only quant_wrapper + stack.enter_context(self._disable_quant_wrapper(quant_wrapper)) + + # Record eval score. + eval_score_dict[name] = self._eval_model(sim.model) + _logger.debug("For layer: %s, the eval score is: %f", name, eval_score_dict[name]) + + return eval_score_dict + + # pylint: disable=no-self-use + def _create_and_export_stats_histogram_plot(self, + quantizer: StaticGridTensorQuantizer, + results_dir: str, + title: str, + ): + """ + For given quantizer, create and export histogram (PDF) of statistics in html format. + + :param quantizer: Quantizer. + :param results_dir: Directory to save the results. + :param title: Title of the plot. + """ + os.makedirs(results_dir, exist_ok=True) + + histograms = quantizer.get_stats_histogram() + encodings = quantizer.encoding + if not isinstance(encodings, List): + encodings = [encodings] + + for index, (histogram, encoding) in enumerate(zip(histograms, encodings)): + export_stats_histogram_plot(histogram, encoding, results_dir, title=f"{title}_{index}") + +
[docs] def check_model_sensitivity_to_quantization(self, + sim: QuantizationSimModel, + ) -> Tuple[float, float, float]: + """ + Perform the sensitivity analysis to weight and activation quantization + individually. + + :param sim: Quantsim model. + :return: FP32 eval score, weight-quantized eval score, act-quantized eval score. + """ + # pylint: disable=protected-access + fp32_eval_score = self._eval_model(self._model) + _logger.info("FP32 eval score (W32A32): %f", fp32_eval_score) + + weight_quantized_eval_score = self._eval_weight_quantized_model(sim) + _logger.info("Weight-quantized eval score (W%dA32): %f", sim._default_param_bw, + weight_quantized_eval_score) + + act_quantized_eval_score = self._eval_activation_quantized_model(sim) + _logger.info("Activation-quantized eval score (W32A%d): %f", sim._default_output_bw, + act_quantized_eval_score) + + return fp32_eval_score, weight_quantized_eval_score, act_quantized_eval_score
+ +
[docs] def perform_per_layer_analysis_by_enabling_quant_wrappers(self, + sim: QuantizationSimModel, + results_dir: str, + ) -> Dict: + """ + NOTE: Option 1 + + 1. All quant wrappers' parameters and activations quantizers are disabled. + 2. Based on occurrence for every quant wrappers + - Each quant wrapper's parameters and activations quantizers are enabled as per JSON config file and set to bit-width specified. + - Measure and record eval score on subset of dataset. + - Disable enabled quantizers in step 1. + 3. Returns dictionary containing quant wrapper name and corresponding eval score. + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return: layer wise eval score dictionary. dict[layer_name] = eval_score + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + _logger.info("\nOPTION-1:\nAll the quant wrappers are disabled.\n" + "Starting per-layer analysis by enabling quant wrappers as per config file.") + layer_wise_eval_score_dict = self._perform_per_layer_analysis(sim, + disable_all_quantizers=True, + enabled_before=True, + enabled_after=False) + export_per_layer_sensitivity_analysis_plot(layer_wise_eval_score_dict, + results_dir, + title="per_layer_quant_enabled") + save_json(layer_wise_eval_score_dict, + results_dir, + title="per_layer_quant_enabled.json") + _logger.info("Exported per-layer quant analysis (enabled) plot.") + return layer_wise_eval_score_dict
+ +
[docs] def perform_per_layer_analysis_by_disabling_quant_wrappers(self, + sim: QuantizationSimModel, + results_dir: str, + ) -> Dict: + """ + NOTE: Option 2 + + 1. All quant wrappers' parameters and activations quantizers are enabled as per JSON config file and set to bit-width specified. + 2. Based on occurrence for every quant wrappers + - Each quant wrapper's parameters and activations quantizers are disabled. + - Measure and record eval score on subset of dataset. + - Enable disabled quantizers in step 1. + 3. Returns dictionary containing quant wrapper name and corresponding eval score. + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return: layer wise eval score dictionary. dict[layer_name] = eval_score + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + _logger.info("\nOPTION-2:\nAll the quant wrappers are enabled as per config file.\n" + "Starting per-layer analysis by disabling quant wrappers.") + layer_wise_eval_score_dict = self._perform_per_layer_analysis(sim, + disable_all_quantizers=False, + enabled_before=False, + enabled_after=True) + export_per_layer_sensitivity_analysis_plot(layer_wise_eval_score_dict, + results_dir, + title="per_layer_quant_disabled") + save_json(layer_wise_eval_score_dict, + results_dir, + title="per_layer_quant_disabled.json") + _logger.info("Exported per-layer quant analysis (disabled) plot.") + return layer_wise_eval_score_dict
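A hedged usage sketch for the two per-layer analysis options above. `analyzer` and `sim` are assumed to be pre-constructed, and the results directory name is only illustrative.

def run_per_layer_analysis(analyzer, sim, results_dir="./quant_analyzer_results"):
    """Run both per-layer analysis options and print the most sensitive layers."""
    # Option 1: all quantizers disabled, then enabled one quant wrapper at a time.
    enabled_scores = analyzer.perform_per_layer_analysis_by_enabling_quant_wrappers(sim, results_dir)
    # Option 2: all quantizers enabled, then disabled one quant wrapper at a time.
    disabled_scores = analyzer.perform_per_layer_analysis_by_disabling_quant_wrappers(sim, results_dir)

    # With option 1, the lowest-scoring layers are the hardest to quantize in isolation.
    for name, score in sorted(enabled_scores.items(), key=lambda kv: kv[1])[:5]:
        print(f"{name}: eval score {score:.4f} when only this layer is quantized")
    return enabled_scores, disabled_scores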
+ + # pylint: disable=no-self-use +
[docs] def export_per_layer_encoding_min_max_range(self, + sim: QuantizationSimModel, + results_dir: str, + ) -> Tuple[Dict, Dict]: + """ + Export encoding min and max range for all weights and activations. results_dir should have + html files in following format. + + -results_dir + -activations.html + -weights.html + + If per channel quantization(PCQ) is enabled then, + + -results_dir + -activations.html + -{wrapped_module_name}_{param_name}.html + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return: layer wise min-max range for weights and activations. + """ + # pylint: disable=too-many-locals + min_max_ranges_dir = os.path.join(results_dir, "min_max_ranges") + + module_to_name_dict = {} + for name, module in sim.model.named_modules(): + module_to_name_dict[module] = name + + min_max_range_for_activations_dict = {} + min_max_range_for_weights_dict = {} + for quant_wrapper in self._get_quantized_modules(sim): + wrapped_module_name = module_to_name_dict[quant_wrapper] + for index, quantizer in enumerate(quant_wrapper.input_quantizers): + if self._is_quantizer_enabled(quantizer): + name = f"{wrapped_module_name}_input_{index}" + encoding = self._get_quantizer_encodings(quantizer)[0] + min_max_range_for_activations_dict[name] = (encoding.min, encoding.max) + for index, quantizer in enumerate(quant_wrapper.output_quantizers): + if self._is_quantizer_enabled(quantizer): + name = f"{wrapped_module_name}_output_{index}" + encoding = self._get_quantizer_encodings(quantizer)[0] + min_max_range_for_activations_dict[name] = (encoding.min, encoding.max) + for param_name, quantizer in quant_wrapper.param_quantizers.items(): + if self._is_quantizer_enabled(quantizer): + name = f"{wrapped_module_name}_{param_name}" + encodings = self._get_quantizer_encodings(quantizer) + if len(encodings) > 1: # per-channel + per_channel_encodings = {} + for index, encoding in enumerate(encodings): + per_channel_encodings[f"{name}_{index}"] = (encoding.min, encoding.max) + min_max_range_for_weights_dict[name] = per_channel_encodings + else: # per-tensor + min_max_range_for_weights_dict[name] = (encodings[0].min, encodings[0].max) + + create_and_export_min_max_ranges_plot(min_max_range_for_weights_dict, + min_max_ranges_dir, + title="weights") + create_and_export_min_max_ranges_plot(min_max_range_for_activations_dict, + min_max_ranges_dir, + title="activations") + save_json(min_max_range_for_weights_dict, min_max_ranges_dir, title="weights.json") + save_json(min_max_range_for_activations_dict, min_max_ranges_dir, title="activations.json") + _logger.info("Exported per layer encodings min-max ranges plot(s).") + return min_max_range_for_weights_dict, min_max_range_for_activations_dict
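A minimal sketch of consuming the dictionaries returned by export_per_layer_encoding_min_max_range, assuming `analyzer` and `sim` already exist; it accounts for the per-channel case, where a weight entry is a dict of per-channel (min, max) tuples rather than a single tuple.

def print_widest_weight_ranges(analyzer, sim, results_dir="./quant_analyzer_results", top_k=5):
    """Export encoding min/max ranges and list the weights with the widest ranges."""
    weight_ranges, activation_ranges = \
        analyzer.export_per_layer_encoding_min_max_range(sim, results_dir)

    def width(value):
        # Per-tensor entries are (min, max); per-channel entries are dicts of (min, max) tuples.
        if isinstance(value, dict):
            return max(enc_max - enc_min for enc_min, enc_max in value.values())
        enc_min, enc_max = value
        return enc_max - enc_min

    for name, value in sorted(weight_ranges.items(), key=lambda kv: width(kv[1]), reverse=True)[:top_k]:
        print(f"{name}: encoding range width {width(value):.4f}")
    return weight_ranges, activation_ranges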
+ +
[docs] def export_per_layer_stats_histogram(self, + sim: QuantizationSimModel, + results_dir: str, + ): + """ + NOTE: Not to invoke when quantization scheme is not TF-Enhanced. + + Export histogram that represents a PDF of collected statistics by a quantizer for every + quant wrapper. After invoking this API, results_dir should have html files in following + format for every quantizers of quant wrappers. + + -results_dir + -activations_pdf + name_{input/output}_{index}.html + -weights_pdf + -name + param_name_{channel_index}.html + + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + """ + weights_pdf_dir = os.path.join(results_dir, "weights_pdf") + activations_pdf_dir = os.path.join(results_dir, "activations_pdf") + + module_to_name_dict = {} + for name, module in sim.model.named_modules(): + module_to_name_dict[module] = name + + for quant_wrapper in self._get_quantized_modules(sim): + wrapped_module_name = module_to_name_dict[quant_wrapper] + for index, quantizer in enumerate(quant_wrapper.input_quantizers): + if quantizer is not None and self._get_quantizer_encodings(quantizer): + self._create_and_export_stats_histogram_plot(quantizer, + activations_pdf_dir, + title=f"{wrapped_module_name}_input_q{index}") + for index, quantizer in enumerate(quant_wrapper.output_quantizers): + if quantizer is not None and self._get_quantizer_encodings(quantizer): + self._create_and_export_stats_histogram_plot(quantizer, + activations_pdf_dir, + title=f"{wrapped_module_name}_output_q{index}") + for param_name, quantizer in quant_wrapper.param_quantizers.items(): + if quantizer is not None and self._get_quantizer_encodings(quantizer): + self._create_and_export_stats_histogram_plot(quantizer, + os.path.join(weights_pdf_dir, wrapped_module_name), + title=f"{wrapped_module_name}_{param_name}") + _logger.info("Exported per layer stats histogram plot(s).")
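A short sketch for the histogram export above, assuming the sim was created with the TF-enhanced quantization scheme and that `analyzer` and `sim` already exist.

def export_histograms(analyzer, sim, results_dir="./quant_analyzer_results"):
    """Export PDF histograms of collected statistics (TF-enhanced scheme only)."""
    # Per the note above, this API is only meant for sims built with
    # QuantScheme.post_training_tf_enhanced.
    analyzer.export_per_layer_stats_histogram(sim, results_dir)
    # HTML plots are written under results_dir/activations_pdf and results_dir/weights_pdf.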
+ +
[docs] def export_per_layer_mse_loss(self, + sim: QuantizationSimModel, + results_dir: str, + ) -> Dict: + """ + NOTE: Need to pass same model input data through both fp32 and quantsim model to + tap output activations of each layer. + + Export MSE loss between fp32 and quantized output activations for each layer. + :param sim: Quantsim model. + :param results_dir: Directory to save the results. + :return layer wise MSE loss. dict[layer_name] = MSE loss. + """ + results_dir = os.path.abspath(results_dir) + os.makedirs(results_dir, exist_ok=True) + + name_to_quant_wrapper_dict = {} + for name, module in sim.model.named_modules(): + name_to_quant_wrapper_dict[name] = module + + modules = utils.get_ordered_list_of_modules(self._model, self._dummy_input) + mse_loss_dict = {} + for name, module in modules: + quant_wrapper = name_to_quant_wrapper_dict[name] + loss = self._compute_mse_loss(module, quant_wrapper, self._model, sim) + mse_loss_dict[name] = loss + + export_per_layer_mse_plot(mse_loss_dict, + results_dir, + title="per_layer_mse_loss") + save_json(mse_loss_dict, results_dir, title="per_layer_mse_loss.json") + _logger.info("Exported per layer MSE loss plot.") + return mse_loss_dict
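A hedged sketch of ranking layers by the MSE loss returned above; `analyzer` and `sim` are assumed to be pre-constructed.

def rank_layers_by_mse(analyzer, sim, results_dir="./quant_analyzer_results", top_k=5):
    """Export per-layer MSE loss and list the layers with the largest loss."""
    mse_loss_dict = analyzer.export_per_layer_mse_loss(sim, results_dir)
    ranked = sorted(mse_loss_dict.items(), key=lambda kv: kv[1], reverse=True)
    for name, loss in ranked[:top_k]:
        print(f"{name}: MSE between FP32 and quantized outputs = {loss:.6f}")
    return ranked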
+ + def _compute_mse_loss(self, module: torch.nn.Module, quant_wrapper: torch.nn.Module, + fp32_model: torch.nn.Module, sim: QuantizationSimModel) -> float: + """ + Compute MSE loss between fp32 and quantized output activations for each batch, add for + all the batches and return averaged mse loss. + + :param module: module from the fp32_model. + :param quant_wrapper: Corresponding quant wrapper from the QuantSim model. + :param fp32_model: PyTorch model. + :param sim: Quantsim model. + :return: MSE loss between fp32 and quantized output activations. + """ + # output activations collector. + orig_module_collector = utils.ModuleData(fp32_model, module) + quant_module_collector = utils.ModuleData(sim.model, quant_wrapper) + + total = 0 + loss = 0.0 + batch_index = 0 + for model_inputs in self._unlabeled_dataset_iterable: + assert isinstance(model_inputs, (torch.Tensor, tuple, list)) + _, quantized_out_acts = quant_module_collector.collect_inp_out_data(model_inputs, + collect_input=False, + collect_output=True) + _, fp32_out_acts = orig_module_collector.collect_inp_out_data(model_inputs, + collect_input=False, + collect_output=True) + loss += torch.nn.functional.mse_loss(fp32_out_acts, quantized_out_acts).item() + total += fp32_out_acts.size(0) + batch_index += 1 + if batch_index == self._num_batches: + break + + average_loss = loss/total + return average_loss + + @staticmethod + def _exclude_modules_from_quantization(model: torch.nn.Module, sim: QuantizationSimModel, + modules_to_ignore: List[torch.nn.Module]): + """ + For the modules in the modules_to_ignore, remove the corresponding quant wrappers. + + :param model: Original model. + :param sim: Quantsim model. + :param modules_to_ignore: The list of modules for which the quant wrappers are removed. 
+ """ + name_to_quant_wrapper_dict = {} + for name, module in sim.model.named_modules(): + name_to_quant_wrapper_dict[name] = module + + module_to_name_dict = {} + for name, module in model.named_modules(): + module_to_name_dict[module] = name + + quant_wrappers_to_ignore = [] + for module in modules_to_ignore: + name = module_to_name_dict[module] + quant_wrapper = name_to_quant_wrapper_dict[name] + quant_wrappers_to_ignore.append(quant_wrapper) + + sim.exclude_layers_from_quantization(quant_wrappers_to_ignore) + + @staticmethod + def patch_quantsim_to_store_histogram(_): + """ + Placeholder function to prevent patching v1 quantsim + """ + + @staticmethod + def _get_quantsim_cls() -> Type[QuantizationSimModel]: + return QuantizationSimModel + + @staticmethod + def _get_quant_wrapper_type() -> Tuple[Type]: + return (QcQuantizeWrapper, QcQuantizeRecurrent) + + @staticmethod + def _is_quantizer_enabled(quantizer: TensorQuantizer): + return quantizer.enabled + + @staticmethod + def _get_quantizer_encodings(quantizer: TensorQuantizer): + if quantizer.encoding and not isinstance(quantizer.encoding, List): + return [quantizer.encoding] + return quantizer.encoding + + @classmethod + @contextlib.contextmanager + def _disable_param_quantizers(cls, sim: QuantizationSimModel): + enabled_param_quantizers = cls._get_enabled_param_quantizers(sim) + cls._enable_disable_quantizers(enabled_param_quantizers, enabled=False) + yield + cls._enable_disable_quantizers(enabled_param_quantizers, enabled=True) + + @classmethod + @contextlib.contextmanager + def _disable_activation_quantizers(cls, sim: QuantizationSimModel): + enabled_activation_quantizers = cls._get_enabled_activation_quantizers(sim) + cls._enable_disable_quantizers(enabled_activation_quantizers, enabled=False) + yield + cls._enable_disable_quantizers(enabled_activation_quantizers, enabled=True) + + @staticmethod + def _disable_quant_wrapper(module: QcQuantizeWrapper): + return utils.disable_all_quantizers(module) + + @staticmethod + def _get_quantized_modules(sim: QuantizationSimModel) -> Generator[QcQuantizeWrapper, None, None]: + for module in sim.model.modules(): + if isinstance(module, (QcQuantizeWrapper, QcQuantizeRecurrent)): + yield module
+
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/_modules/aimet_torch/quantsim.html b/releases/1.32.2/_modules/aimet_torch/quantsim.html new file mode 100644 index 00000000..79f2e5f3 --- /dev/null +++ b/releases/1.32.2/_modules/aimet_torch/quantsim.html @@ -0,0 +1,3346 @@ + + + + + + aimet_torch.quantsim — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_torch.quantsim

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019-2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Implementation for simulating models running on Quantized hardware """
+# pylint: disable=too-many-lines
+from itertools import chain
+import contextlib
+import os
+import io
+import copy
+import pickle
+from typing import Tuple, List, Union, Dict, Callable, Optional, Any, runtime_checkable, Protocol, Mapping
+from collections import OrderedDict, defaultdict
+import json
+import torch
+import onnx
+from packaging import version  # pylint: disable=wrong-import-order
+
+import aimet_common.libpymo as libpymo
+from aimet_common import quantsim
+
+from aimet_common.connected_graph.connectedgraph_utils import CG_SPLIT
+from aimet_common.utils import AimetLogger, save_json_yaml, log_with_error_and_assert_if_false
+from aimet_common.defs import QuantScheme, QuantizationDataType, SupportedKernelsAction, QuantDtypeBwInfo
+from aimet_common.quantsim import validate_quantsim_inputs, extract_global_quantizer_args
+from aimet_common.quant_utils import get_conv_accum_bounds
+
+from aimet_torch import elementwise_ops
+from aimet_torch.quantsim_config.quantsim_config import QuantSimConfigurator
+from aimet_torch.qc_quantize_op import QcQuantizeStandAloneBase, QcQuantizeWrapper, QcQuantizeOpMode, \
+    StaticGridQuantWrapper, LearnedGridQuantWrapper, NativeTorchQuantWrapper, QUANTIZER_TYPE_INPUT, QUANTIZER_TYPE_OUTPUT
+from aimet_torch.tensor_quantizer import initialize_learned_grid_quantizer_attributes, TensorQuantizer
+from aimet_torch.qc_quantize_op import get_encoding_by_quantizer as _get_encoding_by_quantizer
+from aimet_torch import torchscript_utils, utils, onnx_utils
+from aimet_torch.utils import deprecated
+from aimet_torch.onnx_utils import (
+    OnnxSaver,
+    OnnxExportApiArgs,
+    CustomMarker,
+    save_initializer_restored_onnx_graph,
+)
+from aimet_torch.meta.connectedgraph import ConnectedGraph, Op
+from aimet_torch.qc_quantize_recurrent import QcQuantizeRecurrent
+from aimet_torch.quantsim_config.builder import LazyQuantizeWrapper
+from aimet_torch.experimental.v2.quantsim.export_utils import VALID_ENCODING_VERSIONS, _export_to_1_0_0
+
+
+logger = AimetLogger.get_area_logger(AimetLogger.LogAreas.Quant)
+
+# If a torch module type is in this dictionary, call the corresponding quantized module constructor instead of wrapping
+# it with QcQuantizeWrapper.
+qc_quantize_modules_dict = {
+    torch.nn.RNN: QcQuantizeRecurrent,
+    torch.nn.LSTM: QcQuantizeRecurrent,
+    torch.nn.GRU: QcQuantizeRecurrent
+}
+
+# Length of the string '._module_to_wrap'
+MODULE_TO_WRAP_STRING_REVERSE_INDEX = -16
+
+MAP_PYMO_TO_ROUND_MODE = {libpymo.RoundingMode.ROUND_NEAREST: 'nearest',
+                          libpymo.RoundingMode.ROUND_STOCHASTIC: 'stochastic'}
+
+SUPPORTED_KERNELS_ACTION = SupportedKernelsAction.warn_on_error
+
+
+
+
[docs]class QuantParams: + """ + Data type to hold quantization related params. + """ + + def __init__(self, + weight_bw: int = 8, + act_bw: int = 8, + round_mode: str = 'nearest', + quant_scheme: Union[QuantScheme, str] = QuantScheme.post_training_tf_enhanced, + config_file: str = None): + """ + Constructor + + :param weight_bw: Weight bitwidth (4-31) to use for quantizing layer weights. Default = 8 + :param act_bw: Activation bitwidth(4-31) to use for quantizing layer activations. Default = 8 + :param round_mode: Rounding mode. Supported options are 'nearest' or 'stochastic' + :param quant_scheme: Quantization scheme. Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum + QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced + :param config_file: Path to Configuration file for model quantizers + """ + + self.weight_bw = weight_bw + self.act_bw = act_bw + self.round_mode = round_mode + self.quant_scheme = quant_scheme + self.config_file = config_file
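A minimal construction sketch for QuantParams; the argument values simply restate the documented defaults.

from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantParams

# 8-bit weights and activations, TF-enhanced range estimation, no custom config file.
quant_params = QuantParams(weight_bw=8,
                           act_bw=8,
                           round_mode='nearest',
                           quant_scheme=QuantScheme.post_training_tf_enhanced,
                           config_file=None)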
+ + +@runtime_checkable +class ExportableQuantModule(Protocol): + """ + Defines the minimum interface requirements for exporting encodings from a module. + """ + + def export_input_encodings(self) -> List[List[Dict]]: + """ + Returns a list of input encodings, each represented as a List of Dicts + """ + + def export_output_encodings(self) -> List[List[Dict]]: + """ + Returns a list of output encodings, each represented as a List of Dicts + """ + + def export_param_encodings(self) -> Dict[str, List[Dict]]: + """ + Returns a dict of {param name: param encodings}, with each encoding represented as a List of Dicts + """ + + def import_input_encodings(self, + encodings: Mapping[str, Mapping], + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + """ + Import input encodings represented in below format: + { + '0': dict, + '1': dict, + ... + } + """ + + def import_output_encodings(self, + encodings: Mapping[str, Mapping], + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + """ + Import output encodings represented in below format: + { + '0': dict, + '1': dict, + ... + } + """ + + def import_param_encodings(self, + encodings: Mapping[str, Mapping], + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + """ + Import parameter encodings represented in below format: + { + 'param_name_0': [dict, dict, ...], + 'param_name_1': [dict, dict, ...], + ... + } + """ + + def get_original_module(self) -> torch.nn.Module: + """ + Returns the floating point version of quantized module + """ + + +# Types of modules which cannot be quantized +unquantizable_modules = ( + QcQuantizeWrapper, + QcQuantizeStandAloneBase, + QcQuantizeRecurrent, + ExportableQuantModule, + torch.nn.Identity, +) + + +
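Because ExportableQuantModule is a runtime_checkable Protocol, isinstance checks can be used to find modules that satisfy the export interface. A small sketch, assuming `sim` is an existing QuantizationSimModel:

from aimet_torch.quantsim import ExportableQuantModule

def list_exportable_modules(sim):
    """Yield (name, module) pairs for modules that satisfy the export interface."""
    for name, module in sim.model.named_modules():
        # runtime_checkable protocols only verify that the methods exist, not their signatures.
        if isinstance(module, ExportableQuantModule):
            yield name, module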
[docs]class QuantizationSimModel: + """ + Implements mechanism to add quantization simulations ops to a model. This allows for off-target simulation of + inference accuracy. Also allows the model to be fine-tuned to counter the effects of quantization. + """ + + # pylint: disable=too-many-arguments, too-many-instance-attributes, too-many-locals, too-many-public-methods + def __init__(self, model: torch.nn.Module, dummy_input: Union[torch.Tensor, Tuple], + quant_scheme: Union[str, QuantScheme] = QuantScheme.post_training_tf_enhanced, + rounding_mode: str = 'nearest', default_output_bw: int = 8, default_param_bw: int = 8, + in_place: bool = False, config_file: str = None, + default_data_type: QuantizationDataType = QuantizationDataType.int): + + """ + Constructor for QuantizationSimModel. + + :param model: Model to add simulation ops to + :param dummy_input: Dummy input to the model. Used to parse model graph. If the model has more than one input, + pass a tuple. User is expected to place the tensors on the appropriate device. + :param quant_scheme: Quantization scheme. The Quantization scheme is used to compute the Quantization encodings. + There are multiple schemes available. Please refer the QuantScheme enum definition. + :param rounding_mode: Rounding mode. Supported options are 'nearest' or 'stochastic' + :param default_output_bw: Default bitwidth (4-31) to use for quantizing all layer inputs and outputs + :param default_param_bw: Default bitwidth (4-31) to use for quantizing all layer parameters + :param in_place: If True, then the given 'model' is modified in-place to add quant-sim nodes. + Only suggested use of this option is when the user wants to avoid creating a copy of the model + :param config_file: Path to Configuration file for model quantizers + :param default_data_type: Default data type to use for quantizing all layer inputs, outputs and parameters. + Possible options are QuantizationDataType.int and QuantizationDataType.float. + Note that the mode default_data_type=QuantizationDataType.float is only supported with + default_output_bw=16 or 32 and default_param_bw=16 or 32. 
+ """ + # Perform sanity checks on inputs + validate_quantsim_inputs(quant_scheme, rounding_mode, default_output_bw, default_param_bw, + default_data_type) + # save some parameters + if in_place: + self.model = model + else: + self.model = copy.deepcopy(model) + + try: + self.connected_graph = ConnectedGraph(self.model, dummy_input) + except (torch.jit.TracingCheckError, AssertionError): + self.connected_graph = None + + if isinstance(quant_scheme, str): + if quant_scheme == 'tf': + quant_scheme = QuantScheme.post_training_tf + elif quant_scheme == 'tf_enhanced': + quant_scheme = QuantScheme.post_training_tf_enhanced + elif quant_scheme == 'percentile': + quant_scheme = QuantScheme.post_training_percentile + self._quant_scheme = quant_scheme + self._rounding_mode = rounding_mode + self._default_output_bw = default_output_bw + self._default_param_bw = default_param_bw + self._config_file = config_file + self._is_conditional = False + self._module_marker_map = {} + self._percentile_value = 100 # default percentile value + self._excluded_layer_names = [] + + # Add quantization layers + inout_tensor_shape = utils.get_inout_tensor_shape_per_module(self.model, dummy_input) + num_inout_tensors = self._get_num_inout_tensors_from_tensor_shape_dict(inout_tensor_shape) + inout_tensors_dtypes_for_cast_ops = utils.get_inout_tensors_dtypes_for_cast_modules(self.model, dummy_input) + + self._add_quantization_wrappers(self.model, num_inout_tensors, default_data_type) + self._set_tensor_quantizers_for_consts(inout_tensor_shape) + + # Disable bias quantization + self.exclude_param_from_quantization("bias") + + quantsim_configurator = self.configure_quantization_ops(config_file, default_output_bw, default_param_bw, + default_data_type) + + self.quant_args = extract_global_quantizer_args(quant_scheme, quantsim_configurator) + + self._enable_output_quantizers_for_specific_cast_ops(inout_tensors_dtypes_for_cast_ops) + + # pylint: disable=protected-access + self._hw_version = quantsim_configurator._get_hw_version() + self._supported_kernels = quantsim_configurator.get_supported_kernels() + self._validate_supported_kernels_for_quantizers(SUPPORTED_KERNELS_ACTION) + + self._apply_exception_rules() + + # Initialize real wrappers using collected information + self._realize_quant_wrappers_in_model(self.model) + + def _realize_quant_wrappers_in_model(self, model: torch.nn.Module): + """ + Prepare QuantSim for compute encodings. Resets encodings for each quantizable layer and sets mode to Analysis. + Realize quant wrappers using collected information in LazyQuantWrapper. 
+ + :param model: model containing modules wrapped with LazyQuantWrapper + """ + for module_name, module_ref in model.named_children(): + if isinstance(module_ref, LazyQuantizeWrapper): + quantized_module = self._realize_quant_wrapper(module_ref) + setattr(model, module_name, quantized_module) + + elif not utils.is_leaf_module(module_ref): + self._realize_quant_wrappers_in_model(module_ref) + + @staticmethod + def _realize_quant_wrapper(module: torch.nn.Module) -> QcQuantizeWrapper: + return module.realize_v1_wrapper() + + def get_supported_kernels(self) -> Dict: + """ + Return _supported_kernels parsed from the config file + :return: Dictionary containing supported_kernels + """ + return self._supported_kernels + + def __str__(self): + """ + Pretty-printed output indicating where in the model, quantizers have been activated + :return: + """ + + def print_quantizer_state(stream, quantizer, prefix_string): + if quantizer.enabled: + stream.write(f' {prefix_string}: bw={quantizer.bitwidth}, ' + f'encoding-present={bool(quantizer.encoding)}\n') + + if quantizer.encoding: + stream.write(f' {quantizer}') + else: + stream.write(f' {prefix_string}: Not quantized\n') + + stream.write(' -------\n') + + stream = io.StringIO(newline='\n') + stream.write("-------------------------\n") + stream.write("Quantized Model Report\n") + stream.write("-------------------------\n") + + for layer_name, layer in self._get_qc_quantized_layers(self.model): + stream.write('----------------------------------------------------------\n') + stream.write('Layer: {}\n'.format(layer_name)) + + # Inputs + if isinstance(layer.input_quantizers, dict): + for name, quantizer in layer.input_quantizers.items(): + print_quantizer_state(stream, quantizer, prefix_string=f"Input[{name}]") + else: + for index, quantizer in enumerate(layer.input_quantizers): + print_quantizer_state(stream, quantizer, prefix_string=f"Input[{index}]") + + # Params + for param_name, quantizer in layer.param_quantizers.items(): + print_quantizer_state(stream, quantizer, prefix_string=f"Param[{param_name}]") + + # Outputs + if isinstance(layer.output_quantizers, dict): + for name, quantizer in layer.output_quantizers.items(): + print_quantizer_state(stream, quantizer, prefix_string=f"Output[{name}]") + else: + for index, quantizer in enumerate(layer.output_quantizers): + print_quantizer_state(stream, quantizer, prefix_string=f"Output[{index}]") + + return stream.getvalue() + + @staticmethod + def prepare_sim_for_compute_encodings(sim: 'QuantizationSimModel'): + """ + Prepare QuantSim for compute encodings. Resets encodings for each quantizable layer and sets mode to Analysis. + + :param sim: QuantSim to prepare + """ + # pylint: disable=protected-access + quantized_layers = sim._get_qc_quantized_layers(sim.model) + + for _, layer in quantized_layers: + # Clear stats and encodings if they are present + layer.reset_encodings() + + # And set the mode to analysis + layer.set_mode(QcQuantizeOpMode.ANALYSIS) + + for _, layer in quantized_layers: + # call only when quant scheme is percentile + if sim._quant_scheme == QuantScheme.post_training_percentile: + layer.set_percentile_value(sim._percentile_value) + + @staticmethod + def compute_layer_encodings_for_sim(sim: 'QuantizationSimModel'): + """ + Compute encodings for each quantizable layer in sim after forward pass has been called. 
+ + :param sim: QuantSim to compute encodings for + """ + # pylint: disable=protected-access + quantized_layers = sim._get_qc_quantized_layers(sim.model) + # Get the computed per-layer encodings and log them + for name, layer in quantized_layers: + layer.compute_encoding() + + # Before we return we set the mode to active - meaning ready for quantize/de-quantize + # for layers with valid_encoding, otherwise we set to pass through + if isinstance(layer, QcQuantizeRecurrent): + sim.set_mode_for_recurrent_module(layer, name) + else: + # By default we want to set the Quantization wrappers to ACTIVE mode + layer.set_mode(QcQuantizeOpMode.ACTIVE) + + sim.replace_wrappers_for_quantize_dequantize() + +
[docs] def compute_encodings(self, forward_pass_callback, forward_pass_callback_args): + """ + Computes encodings for all quantization sim nodes in the model. It is also used to find initial encodings for + Range Learning + + :param forward_pass_callback: A callback function that simply runs forward passes on the model. This callback + function should use representative data for the forward pass, so the calculated encodings work for all + data samples. This callback internally chooses the number of data samples it wants to use for calculating + encodings. + :param forward_pass_callback_args: These argument(s) are passed to the forward_pass_callback as-is. Up to + the user to determine the type of this parameter. E.g. could be simply an integer representing the number + of data samples to use. Or could be a tuple of parameters or an object representing something more complex. + If set to None, forward_pass_callback will be invoked with no parameters. + :return: None + + """ + + QuantizationSimModel.prepare_sim_for_compute_encodings(self) + + # Run forward iterations so we can collect statistics to compute the appropriate encodings + with utils.in_eval_mode(self.model), torch.no_grad(): + _ = forward_pass_callback(self.model, forward_pass_callback_args) + + QuantizationSimModel.compute_layer_encodings_for_sim(self)
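A hedged end-to-end sketch of constructing a QuantizationSimModel and calling compute_encodings. The torchvision ResNet18, the input shape, and `calibration_loader` are illustrative assumptions, not part of this module.

import torch
from torchvision.models import resnet18  # placeholder FP32 model for illustration
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

model = resnet18().eval()                      # placeholder model
dummy_input = torch.randn(1, 3, 224, 224)      # placeholder input shape

sim = QuantizationSimModel(model,
                           dummy_input=dummy_input,
                           quant_scheme=QuantScheme.post_training_tf_enhanced,
                           default_output_bw=8,
                           default_param_bw=8)

def pass_calibration_data(sim_model, args):
    """Feed a few batches of representative data through the sim model."""
    data_loader, num_batches = args
    with torch.no_grad():
        for batch_index, (images, _) in enumerate(data_loader):
            sim_model(images)
            if batch_index + 1 == num_batches:
                break

# `calibration_loader` is an assumed, user-provided DataLoader of representative images.
sim.compute_encodings(forward_pass_callback=pass_calibration_data,
                      forward_pass_callback_args=(calibration_loader, 10))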
+ + @classmethod + def set_mode_for_recurrent_module(cls, layer: QcQuantizeRecurrent, name: str): + """ + Sets Recurrent module to active or pass through mode based on quantizer state + + :param layer: Qc Quantizer layer for recurrent module + :param name: layer name + :return: True if the encoding is invalid + + """ + for quantizer_name, output_quantizer in layer.output_quantizers.items(): + if output_quantizer.enabled: + if output_quantizer.encoding: + encoding = output_quantizer.encoding + logger.debug("Encoding for %s-%s: min=%f, max=%f, offset=%f. delta=%f, bw=%f", + name, quantizer_name, encoding.min, encoding.max, + encoding.delta, encoding.offset, encoding.bw) + + for quantizer_name, input_quantizer in layer.input_quantizers.items(): + if input_quantizer.enabled: + if input_quantizer.encoding: + encoding = input_quantizer.encoding + logger.debug("Encoding for %s-%s: min=%f, max=%f, offset=%f. delta=%f, bw=%f", + name, quantizer_name, encoding.min, encoding.max, + encoding.delta, encoding.offset, encoding.bw) + + layer.set_mode(QcQuantizeOpMode.ACTIVE) + + def set_percentile_value(self, percentile_value: float): + """ + Set the percentile value to be used while computing encodings + """ + if percentile_value < 90 or percentile_value > 100: + raise ValueError("Percentile value must be in range [90, 100]") + self._percentile_value = percentile_value + +
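A short sketch of the percentile workflow, reusing the placeholder `model` and `dummy_input` from the previous sketch; the value 99.9 is only an example within the allowed [90, 100] range.

from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

# Percentile calibration clips the observed range at the given percentile instead of min/max.
sim = QuantizationSimModel(model,
                           dummy_input=dummy_input,
                           quant_scheme=QuantScheme.post_training_percentile)
sim.set_percentile_value(99.9)   # must lie in [90, 100], as enforced above
# compute_encodings() would then be called as usual with a calibration callback.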
[docs] def export(self, path: str, filename_prefix: str, dummy_input: Union[torch.Tensor, Tuple], + onnx_export_args: Optional[Union[OnnxExportApiArgs, Dict]] = None, propagate_encodings: bool = False, + export_to_torchscript: bool = False, use_embedded_encodings: bool = False, export_model: bool = True, + filename_prefix_encodings: str = None): + """ + This method exports out the quant-sim model so it is ready to be run on-target. + + Specifically, the following are saved: + + 1. The sim-model is exported to a regular PyTorch model without any simulation ops + 2. The quantization encodings are exported to a separate JSON-formatted file that can + then be imported by the on-target runtime (if desired) + 3. Optionally, An equivalent model in ONNX format is exported. In addition, nodes in the ONNX model are named + the same as the corresponding PyTorch module names. This helps with matching ONNX node to their quant + encoding from #2. + + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames of the model pth and encodings files + :param dummy_input: Dummy input to the model. Used to parse model graph. It is required for the dummy_input to + be placed on CPU. + :param onnx_export_args: Optional export argument with onnx specific overrides provided as a dictionary or + OnnxExportApiArgs object. If not provided, defaults to "opset_version" = None, "input_names" = None, + "output_names" = None, and for torch version < 1.10.0, "enable_onnx_checker" = False. + :param propagate_encodings: If True, encoding entries for intermediate ops (when one PyTorch ops results in + multiple ONNX nodes) are filled with the same BW and data_type as the output tensor for that series of + ops. Defaults to False. + :param export_to_torchscript: If True, export to torchscript. Export to onnx otherwise. Defaults to False. + :param use_embedded_encodings: If True, another onnx model embedded with fakequant nodes will be exported + :param export_model: If True, then ONNX model is exported. When False, only encodings are exported. User should + disable (False) this flag only if the corresponding ONNX model already exists in the path + specified + :param filename_prefix_encodings: File name prefix to be used when saving encodings. + If None, then user defaults to filename_prefix value + """ + + warning_str = 'Exporting encodings to yaml will be deprecated in a future release. Ensure that your ' \ + 'code can work with the exported files ending in ".encodings" which are saved using json ' \ + 'format. For the time being, if yaml export is needed, set aimet_common.utils.SAVE_TO_YAML to ' \ + 'True.' 
+ logger.warning(warning_str) + + if not filename_prefix_encodings: + filename_prefix_encodings = filename_prefix + + if quantsim.encoding_version not in VALID_ENCODING_VERSIONS: + raise NotImplementedError(f'Encoding version {quantsim.encoding_version} not in set of valid encoding ' + f'versions {VALID_ENCODING_VERSIONS}.') + # save the quantized model and encodings + model_filename = filename_prefix + '.pth' + model_path = os.path.join(path, model_filename) + + # Create a version of the model without any quantization ops + model_to_export = QuantizationSimModel.get_original_model(self.model) + + torch.save(model_to_export, model_path) + + if onnx_export_args is None: + onnx_export_args = {'opset_version': None, + 'input_names': None, + 'output_names': None} + if version.parse(torch.__version__) < version.parse("1.10.0") and isinstance(onnx_export_args, dict): + onnx_export_args['enable_onnx_checker'] = False + log_with_error_and_assert_if_false(isinstance(onnx_export_args, (OnnxExportApiArgs, dict)), + logger, + f'unsupported opt_args type={type(onnx_export_args)}') + + if use_embedded_encodings: + QuantizationSimModel.save_model_with_embedded_quantization_nodes(self.model, path, filename_prefix, dummy_input, + onnx_export_args, export_to_torchscript, self._is_conditional) + else: + if export_to_torchscript: + self.export_torch_script_model_and_encodings(path, filename_prefix, filename_prefix_encodings, + model_to_export, self.model, + dummy_input, + self._excluded_layer_names) + else: + self.export_onnx_model_and_encodings(path, filename_prefix, model_to_export, self.model, + dummy_input, onnx_export_args, propagate_encodings, + self._module_marker_map, self._is_conditional, + self._excluded_layer_names, quantizer_args=self.quant_args, + export_model=export_model, + filename_prefix_encodings=filename_prefix_encodings)
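A hedged sketch of calling export on the sim built in the earlier sketches; the output directory and filename prefix are illustrative.

import os
import torch

export_dir = './quantsim_export'          # assumed output directory
os.makedirs(export_dir, exist_ok=True)

# dummy_input must live on the CPU for export, even if calibration was done on GPU.
sim.export(path=export_dir,
           filename_prefix='resnet18_w8a8',
           dummy_input=torch.randn(1, 3, 224, 224))
# Produces resnet18_w8a8.pth, resnet18_w8a8.onnx, resnet18_w8a8.encodings and a
# resnet18_w8a8_torch.encodings file under export_dir.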
+ + @staticmethod + def export_torch_script_model_and_encodings(path: str, filename_prefix: str, + filename_prefix_encodings: str, + original_model: torch.nn.Module, + sim_model: torch.nn.Module, + dummy_input: Union[torch.Tensor, Tuple], + excluded_layer_names: List = None): + """ + This method exports a torchscript mode and the corresponding encodings + + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames of the model pth and encodings files + :param filename_prefix_encodings: File name prefix for encodings. Can be same as filename_prefix + :param original_model: model without the quantsim wrappers + :param sim_model: model with the quantsim wrappers + :param dummy_input: Dummy input to the model. Used to parse model graph. + :param excluded_layer_names: List of names of layers that have been excluded from quantization. + :return: None + """ + # Create torchscript model and obtain node to i/o tensor name map + ts_path = os.path.join(path, filename_prefix + '.torchscript.pth') + with utils.in_eval_mode(original_model), torch.no_grad(): + torchscript_utils.create_torch_script_model(ts_path, original_model, dummy_input) + + trace = torch.jit.load(ts_path) + torch_script_node_io_tensor_map, valid_param_set = \ + torchscript_utils.get_node_to_io_tensor_names_map(original_model, trace, dummy_input) + + # Export encodings + QuantizationSimModel._export_encodings_to_files(sim_model, path, filename_prefix_encodings, + torch_script_node_io_tensor_map, valid_param_set, + excluded_layer_names, propagate_encodings=False) + + @staticmethod + def export_onnx_model_and_encodings(path: str, filename_prefix: str, original_model: torch.nn.Module, + sim_model: torch.nn.Module, dummy_input: Union[torch.Tensor, Tuple], + onnx_export_args: Union[OnnxExportApiArgs, dict], propagate_encodings: bool, + module_marker_map: Dict[torch.nn.Module, torch.Tensor] = None, + is_conditional: bool = False, excluded_layer_names: List = None, + quantizer_args: Dict = None, export_model: bool = True, + filename_prefix_encodings: str = None): + """ + This method exports a onnx model and the corresponding encodings + + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames of the model pth and encodings files + :param original_model: model without the quantsim wrappers + :param sim_model: model with the quantsim wrappers + :param dummy_input: Dummy input to the model. Used to parse model graph. + :param onnx_export_args: Additional onnx export args including export api overrides + :param propagate_encodings: If True, encoding entries for intermediate ops (when one PyTorch ops results in + multiple ONNX nodes) are filled with the same BW and data_type as the output tensor for that series of + ops. + :param module_marker_map: Maps module names to traced custom markers (only used for conditional models) + :param is_conditional: True if model is conditional, False otherwise + :param excluded_layer_names: List of names of layers that have been excluded from quantization. + :param export_model: If True, then ONNX model is exported. When False, only encodings are exported. User should + disable (False) this flag only if the corresponding ONNX model already exists in the path + specified + :param filename_prefix_encodings: File name prefix to be used when saving encodings. 
+ If None, then user defaults to filename_prefix value + :return: None + + """ + # pylint: disable=too-many-locals + if not filename_prefix_encodings: + filename_prefix_encodings = filename_prefix + onnx_path = os.path.join(path, filename_prefix + '.onnx') + if export_model: + if version.parse(torch.__version__) >= version.parse("1.13.0") and onnx_utils.EXPORT_TO_ONNX_DIRECT: + logger.debug('Exporting quantsim using torch.onnx.export directly') + original_model.cpu() + if isinstance(onnx_export_args, OnnxExportApiArgs): + kwargs = onnx_export_args.kwargs + else: + kwargs = onnx_export_args + torch.onnx.export(original_model, dummy_input, onnx_path, **kwargs) + save_initializer_restored_onnx_graph(onnx_path, onnx_path) + else: + # Create onnx model and obtain node to i/o tensor name map + OnnxSaver.create_onnx_model_with_pytorch_layer_names(onnx_path, original_model, dummy_input, is_conditional, + module_marker_map, onnx_export_args) + + assert os.path.exists(onnx_path), 'The onnx model does not exist in the location specified. Please re-run export' \ + 'with export_model flag as True or check path/file_name' + onnx_model = onnx.load(onnx_path) + onnx_node_to_io_tensor_map, valid_param_set = OnnxSaver.get_onnx_node_to_io_tensor_names_map(onnx_model) + + # Export encodings + QuantizationSimModel._export_encodings_to_files(sim_model, path, filename_prefix_encodings, + onnx_node_to_io_tensor_map, valid_param_set, + excluded_layer_names, propagate_encodings, + quantizer_args=quantizer_args) + + def save_encodings_to_json(self, path: str, filename_prefix: str): + """ + Save encodings in the model to json. + + :param path: Path to save file + :param filename_prefix: Filename to use for saved file + """ + activation_encodings, param_encodings = self.get_activation_param_encodings() + encodings_dict = {'activation_encodings': activation_encodings, 'param_encodings': param_encodings} + with open(os.path.join(path, filename_prefix + '.json'), 'w') as encoding_json: + json.dump(encodings_dict, encoding_json, sort_keys=True, indent=4) + + def get_activation_param_encodings(self): + """ + Get activation and param encodings from sim.model. + + :return: Tuple of activation and param encodings dictionaries mapping torch module names to encodings + """ + activation_encodings = OrderedDict() + param_encodings = OrderedDict() + + for module_name, module in self.model.named_modules(): + if not isinstance(module, ExportableQuantModule): + continue + + activation_encodings[module_name] = defaultdict(OrderedDict) + + for i, encoding in enumerate(module.export_input_encodings()): + if not encoding: + continue + if len(encoding) == 1: + encoding = encoding[0] + activation_encodings[module_name]['input'][i] = encoding + + for i, encoding in enumerate(module.export_output_encodings()): + if not encoding: + continue + if len(encoding) == 1: + encoding = encoding[0] + activation_encodings[module_name]['output'][i] = encoding + + if not activation_encodings[module_name]: + del activation_encodings[module_name] + + for param_name, encoding in module.export_param_encodings().items(): + if not encoding: + continue + param_encodings[f'{module_name}.{param_name}'] = encoding + + return activation_encodings, param_encodings + + def exclude_layers_from_quantization(self, layers_to_exclude: List[torch.nn.Module]): + """ + Excludes certain layers from being quantized-dequantized by the simulator + :param layers_to_exclude: List of torch layers to exclude + :return: None + """ + # Save the excluded layer names. 
Do not save the modules since the wrapper removal depends on + # reference count to automatically remove the layers. + module_to_name_dict = utils.get_module_to_name_dict(self.model) + quant_layers_to_exclude = [] + quant_cls = (QcQuantizeRecurrent, + LazyQuantizeWrapper, + ExportableQuantModule) + for layer in layers_to_exclude: + for module in layer.modules(): + if isinstance(module, quant_cls): + quant_layers_to_exclude.append(module) + excluded_module_name = module_to_name_dict.get(module) + self._excluded_layer_names.append(excluded_module_name) + + self._remove_quantization_wrappers(self.model, quant_layers_to_exclude) + + def exclude_param_from_quantization(self, param_name_to_exclude: str): + """ + Excludes all parameters matching 'param_name' from quantization + :param param_name_to_exclude: Name of the parameter to exclude + :return: None + """ + for module in self.model.modules(): + if isinstance(module, (QcQuantizeWrapper, QcQuantizeRecurrent, LazyQuantizeWrapper)): + if param_name_to_exclude in module.param_quantizers: + module.param_quantizers[param_name_to_exclude].enabled = False + + def _replace_quantization_wrapper(self, model, device): + """ + Recursively remove quantization wrappers from all appropriate modules starting with a given module + :param model: model for which PostTrainingWrapper gets replaced with Trainable wrapped module + :param device: device on which model is present + :return: None + """ + for module_name, module_ref in model.named_children(): + + if isinstance(module_ref, StaticGridQuantWrapper): + # Create a Trainable wrapper and copy properties of PostTrainingWrapper to the Trainable wrapper + quantized_module = self._construct_and_initialize_trainable_wrapper(module_ref, device) + setattr(model, module_name, quantized_module) + + elif isinstance(module_ref, QcQuantizeRecurrent): + # Set Recurrent layer for training mode + module_ref.construct_and_initialize_trainable_quantizers(self._quant_scheme) + + # Recursively call children modules if present + if not utils.is_leaf_module(module_ref): + self._replace_quantization_wrapper(module_ref, device) + + def _construct_and_initialize_trainable_wrapper(self, post_training_module: StaticGridQuantWrapper, + device: torch.device) -> LearnedGridQuantWrapper: + """ + Copies following tensor quantizer attributes from StaticGridQuantWrapper to LearnedGridQuantWrapper + to avoid any mismatch. 
+ - enabled + - bitwidth + - encoding + - use_symmetric_encodings + - use_strict_symmetric + - use_unsigned_symmetric + + :param post_training_module: StaticGridQuantWrapper wrapped module + :param device: device on which model is present + :return: trainable_module: QcTrainable wrapper module + """ + + # pylint: disable=protected-access + module = post_training_module._module_to_wrap + + num_inputs = len(post_training_module.input_quantizers) + num_outputs = len(post_training_module.output_quantizers) + + # Creating a LearnedGridQuantWrapper module + trainable_module = LearnedGridQuantWrapper(module, self._default_param_bw, + self._default_output_bw, self._rounding_mode, self._quant_scheme, + device=device, num_inputs=num_inputs, num_outputs=num_outputs, + data_type=QuantizationDataType.int) + # Copy user settable attributes for outputs + for index, quantizer in enumerate(post_training_module.output_quantizers): + initialize_learned_grid_quantizer_attributes(trainable_module.output_quantizers[index], quantizer) + if trainable_module.output_quantizers[index].encoding_min_max_fixed_vals is not None: + trainable_module.output_quantizers[index].freeze_encoding() + # Copy user settable attributes for inputs + for index, quantizer in enumerate(post_training_module.input_quantizers): + initialize_learned_grid_quantizer_attributes(trainable_module.input_quantizers[index], quantizer) + if trainable_module.input_quantizers[index].encoding_min_max_fixed_vals is not None: + trainable_module.input_quantizers[index].freeze_encoding() + # Copy user settable attributes for params + for name, quantizer in post_training_module.param_quantizers.items(): + learned_grid_quantizer = trainable_module.param_quantizers[name] + initialize_learned_grid_quantizer_attributes(learned_grid_quantizer, quantizer) + if learned_grid_quantizer.encoding_min_max_fixed_vals is not None: + learned_grid_quantizer.freeze_encoding() + + return trainable_module + + def replace_wrappers_for_quantize_dequantize(self): + """ + Replaces StaticGridWrapper with LearnedGridWrapper + """ + if self._quant_scheme == QuantScheme.training_range_learning_with_tf_init or self._quant_scheme == \ + QuantScheme.training_range_learning_with_tf_enhanced_init: + try: + device = utils.get_device(self.model) + except StopIteration: + # Model doesn't have any parameter. + # Set device to cpu by default. + device = torch.device('cpu') + + self._replace_quantization_wrapper(self.model, device) + + @staticmethod + def _validate_quantsim_inputs(quant_scheme: Union[str, QuantScheme], rounding_mode: str, default_output_bw: int, + default_param_bw: int, data_type: QuantizationDataType = QuantizationDataType.int): + """ + Perform sanity checks on inputs to QuantSim + + NOTE: This method will be deprecated. + Call aimet_common.quantsim.validate_quantsim_inputs directly instead. + + :param quant_scheme: Quantization scheme. Supported options are 'tf_enhanced' or 'tf' or using Quant Scheme Enum + QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced + :param rounding_mode: Rounding mode. Supported options are 'nearest' or 'stochastic' + :param default_output_bw: Default bitwidth (4-31) to use for quantizing layer inputs and outputs + :param default_param_bw: Default bitwidth (4-31) to use for quantizing layer parameters + :param data_type: Data type of the quantized values (int or float). 
+ """ + validate_quantsim_inputs(quant_scheme, + rounding_mode, + default_output_bw, + default_param_bw, + data_type) + + @staticmethod + def _find_next_downstream_modules(op): + downstream_modules = [] + for succeeding_op in list(op.output.consumers): + if succeeding_op.get_module(): + downstream_modules.append(succeeding_op.get_module()) + + elif succeeding_op.type == CG_SPLIT: + downstream_modules += QuantizationSimModel._find_next_downstream_modules(succeeding_op) + + return downstream_modules + + @staticmethod + def _export_encodings_to_files(sim_model: torch.nn.Module, path: str, filename_prefix: str, + op_to_io_tensor_map: Dict, valid_param_set: set, excluded_layer_names, + propagate_encodings: bool, quantizer_args: Dict = None): + """ + Save the quantized model weight encodings + + :param sim_model: Quantsim model to export encodings for + :param path: path where to store model pth and encodings + :param filename_prefix: filename to store exported weight encodings in json format + :param op_to_io_tensor_map: Dictionary of layer to I/O tensor mapping from onnx or torch script model + :param valid_param_set: a set of valid param input names in model + :param excluded_layer_names: List of names of layers that have been excluded from quantization. + :param propagate_encodings: If True, encoding entries for intermediate ops (when one PyTorch ops results in + multiple ONNX nodes) are filled with the same BW and data_type as the output tensor for that series of + ops. + :param quantizer_args + """ + + # pylint: disable=too-many-locals + + # Create a dictionary to export to JSON + activation_encodings_onnx = {} + activation_encodings_torch = {} + param_encodings = {} + layers_to_onnx_op_names = onnx_utils.get_layers_in_io_tensor_map(op_to_io_tensor_map) + tensor_to_consumer_map = onnx_utils.get_tensor_to_consumer_map(op_to_io_tensor_map) + layer_names_not_found = [] + tensor_to_quantizer_map = {} + + for layer_name, layer in sim_model.named_modules(): + if not isinstance(layer, (ExportableQuantModule, QcQuantizeRecurrent)): + continue + if not has_valid_encodings(layer): + continue + # TODO: specifically call out dropout layers here since they are specifically switched out during export. + # These ops should eventually be reworked as part of math invariant ops to ignore quantization altogether. + # pylint: disable=protected-access + if isinstance(layer, ExportableQuantModule) and isinstance(layer.get_original_module(), utils.DROPOUT_TYPES): + continue + + if layer_name not in layers_to_onnx_op_names.keys(): + layer_names_not_found.append(layer_name) + else: + QuantizationSimModel._update_encoding_dicts_for_layer(layer, layer_name, activation_encodings_onnx, + activation_encodings_torch, + param_encodings, op_to_io_tensor_map, + valid_param_set, propagate_encodings, + tensor_to_consumer_map, layers_to_onnx_op_names, + tensor_to_quantizer_map) + + if layer_names_not_found: + logger.warning("The following layers were not found in the exported onnx model. Encodings for these layers" + " will not appear in the exported encodings file:\n" + "%s\n" + "This can be due to several reasons:\n" + "\t- The layer is set to quantize with float datatype, but was not exercised in compute " + "encodings. Not an issue if the layer is not meant to be run.\n" + "\t- The layer has valid encodings but was not seen while exporting to onnx using the dummy " + "input provided in sim.export(). 
Ensure that the dummy input covers all layers.", + layer_names_not_found) + + if quantsim.encoding_version == '0.6.1': + encodings_dict_onnx = {'version': quantsim.encoding_version, + 'activation_encodings': activation_encodings_onnx, + 'param_encodings': param_encodings, + 'excluded_layers': excluded_layer_names} + + if quantizer_args: + encodings_dict_onnx.update({'quantizer_args': quantizer_args}) + + logger.info("Layers excluded from quantization: %s", excluded_layer_names) + + # export weight encodings to output json file + encoding_file_path = os.path.join(path, filename_prefix + '.encodings') + save_json_yaml(encoding_file_path, encodings_dict_onnx) + else: + _export_to_1_0_0(path, filename_prefix, activation_encodings_onnx, param_encodings, tensor_to_quantizer_map, + excluded_layer_names, quantizer_args) + + # Export torch.encodings used for saving/loading common to 0.6.1 and 1.0.0 versions + encodings_dict_pytorch = {'version': quantsim.encoding_version, + 'activation_encodings': activation_encodings_torch, + 'param_encodings': param_encodings, + 'excluded_layers': excluded_layer_names} + + if quantizer_args: + encodings_dict_pytorch.update({'quantizer_args': quantizer_args}) + + encoding_file_path_pytorch = os.path.join(path, filename_prefix + '_torch' + '.encodings') + save_json_yaml(encoding_file_path_pytorch, encodings_dict_pytorch) + + @staticmethod + def _update_param_encodings_dict_for_layer(layer: ExportableQuantModule, layer_name: str, param_encodings: Dict, + valid_param_set: set, tensor_to_quantizer_map: Dict): + """ + :param layer: layer as torch.nn.Module + :param layer_name : Name of the layer + :param param_encodings: dictionary of param encodings + :param valid_param_set: a set of valid param input names in model + """ + + for orig_param_name, param_encoding in layer.export_param_encodings().items(): + param_name = layer_name + '.' + orig_param_name + if param_encoding is None: + continue + if param_name not in valid_param_set: + logger.error('Param tensor {%s} not found in valid param set', param_name) + continue + param_encodings[param_name] = param_encoding + tensor_to_quantizer_map[param_name] = layer.param_quantizers[orig_param_name] + + @staticmethod + def _update_encoding_dicts_for_layer(layer: ExportableQuantModule, layer_name: str, activation_encodings_onnx: Dict, + activation_encodings_torch: Dict, param_encodings: Dict, + op_to_io_tensor_map: Dict, valid_param_set: set, propagate_encodings: bool, + tensor_to_consumer_map: Dict[str, str], + layers_to_onnx_op_names: Dict[str, str], + tensor_to_quantizer_map: Dict): + """ + Add given layer param and activation encodings to respective dictionaries to be used for exporting encodings + :param layer: layer as torch.nn.Module + :param layer_name: Name of the layer + :param activation_encodings_onnx: dictionary of activation encodings which maps onnx attribute to encodings + :param activation_encodings_torch: dictionary of activation encodings which maps pytorch names to encodings + :param param_encodings: dictionary of param encodings + :param op_to_io_tensor_map: ONNX or Torch Script map of layer name to it's input/output tensors + :param valid_param_set: a set of valid param input names in model + :param propagate_encodings: If True, encoding entries for intermediate ops (when one PyTorch ops results in + multiple ONNX nodes) are filled with the same BW and data_type as the output tensor for that series of + ops. 
+ :param tensor_to_consumer_map: Dictionary mapping tensor names to op names which consume the tensor + :param layers_to_onnx_op_names: Dictionary mapping PyTorch layer names to names of corresponding ONNX ops + """ + + if isinstance(layer, ExportableQuantModule): + + # -------------------------------------- + # Update encodings for Input activations + # -------------------------------------- + QuantizationSimModel._update_encoding_dict_for_input_activations(layer, layer_name, op_to_io_tensor_map, + activation_encodings_onnx, + activation_encodings_torch, + layers_to_onnx_op_names, + tensor_to_quantizer_map) + # --------------------------------------- + # Update encodings for output activations + # --------------------------------------- + QuantizationSimModel._update_encoding_dict_for_output_activations(layer, layer_name, + op_to_io_tensor_map, + activation_encodings_onnx, + activation_encodings_torch, + propagate_encodings, + tensor_to_consumer_map, + layers_to_onnx_op_names, + tensor_to_quantizer_map) + # --------------------------- + # Update encodings for Params + # --------------------------- + QuantizationSimModel._update_param_encodings_dict_for_layer(layer, layer_name, param_encodings, + valid_param_set, tensor_to_quantizer_map) + + if isinstance(layer, QcQuantizeRecurrent): + # Update encodings for Recurrent layers + QuantizationSimModel._update_encoding_dict_for_recurrent_layers(layer, layer_name, op_to_io_tensor_map, + activation_encodings_onnx, + param_encodings, propagate_encodings, + tensor_to_quantizer_map) + + @staticmethod + def find_op_names_for_layer(layer_name: str, op_to_io_tensor_map: Dict, + tensor_to_consumer_map: Optional[Dict[str, str]], + layers_to_onnx_op_names: Optional[Dict[str, str]]) -> Tuple[List[str], List[str]]: + """ + This function returns the last ONNX op and the list of ONNX Ops that were mapped from a PyTorch Op. + + :param layer_name: Name of the PyTorch layer + :param op_to_io_tensor_map: ONNX or Torch Script map of layer name to it's input/output tensors + :param tensor_to_consumer_map: Dictionary mapping tensor names to op names which consume the tensor + :param layers_to_onnx_op_names: Dictionary mapping PyTorch layer names to names of corresponding ONNX ops + :return: tuple(end op names, all op names) + """ + if version.parse(torch.__version__) < version.parse("1.13.0") or not onnx_utils.EXPORT_TO_ONNX_DIRECT: + op_names = [key for key in op_to_io_tensor_map if (key.startswith(layer_name) and layer_name+'#' in key) + or key == layer_name] + if len(op_names) == 1: + return op_names, op_names + + end_op_names = [op_name for op_name in op_names if op_name.endswith('.end')] + return end_op_names, op_names + + assert tensor_to_consumer_map is not None + assert layers_to_onnx_op_names is not None + # Get all ops which correspond to the current PyTorch layer being processed. + op_names = layers_to_onnx_op_names.get(layer_name, []) + op_name_set = set(op_names) + + end_op_names = [] + end_op_names_set = set() + for op_name in op_names: + # Loop through outputs of each op and check whether the output leads to an op not in + for output in op_to_io_tensor_map[op_name].outputs: + assert output in tensor_to_consumer_map.keys() + if not tensor_to_consumer_map[output]: + if op_name not in end_op_names_set: + # output has no consumers, and can either be a model output or an unused op output. + # List it as an end_op_name all the same. 
+ end_op_names.append(op_name) + end_op_names_set.add(op_name) + else: + for consumer in tensor_to_consumer_map[output]: + if consumer not in op_name_set and op_name not in end_op_names_set: + end_op_names.append(op_name) + end_op_names_set.add(op_name) + + return end_op_names, op_names + + @staticmethod + def _update_encoding_dict_for_output_activations(layer: ExportableQuantModule, layer_name: str, op_to_io_tensor_map: Dict, + activation_encodings_onnx: Dict, activation_encodings_torch: Dict, + propagate_encodings: bool, tensor_to_consumer_map: Dict[str, str], + layers_to_onnx_op_names: Dict[str, str], + tensor_to_quantizer_map: Dict): + # pylint: disable=too-many-locals + output_tensors, propagate_tensors = QuantizationSimModel._get_layer_activation_tensors(layer_name, + op_to_io_tensor_map, + tensor_to_consumer_map, + layers_to_onnx_op_names) + output_encodings = layer.export_output_encodings() + + if len(output_tensors) != len(output_encodings): + logger.warning("number of output quantizers: %d available for layer: %s " + "doesn't match with number of output tensors: %d", len(output_encodings), layer_name, + len(output_tensors)) + + for index, (output_tensor, encoding) in enumerate(zip(output_tensors, output_encodings)): + + if encoding is not None: + activation_encodings_onnx[output_tensor] = encoding + tensor_to_quantizer_map[output_tensor] = layer.output_quantizers[index] + if layer_name not in activation_encodings_torch: + activation_encodings_torch[layer_name] = {} + if QUANTIZER_TYPE_OUTPUT not in activation_encodings_torch[layer_name]: + activation_encodings_torch[layer_name][QUANTIZER_TYPE_OUTPUT] = {} + activation_encodings_torch[layer_name][QUANTIZER_TYPE_OUTPUT][index] = encoding[0] + + if propagate_encodings: + valid_encodings = [enc for enc in output_encodings if enc is not None] + if valid_encodings: + encoding = valid_encodings[0] + for activation_tensor in propagate_tensors: + activation_encodings_onnx[activation_tensor] = utils.get_propagated_encoding_dict(encoding) + + + @staticmethod + def _update_encoding_dict_for_input_activations(layer: ExportableQuantModule, layer_name: str, op_to_io_tensor_map: Dict, + activation_encodings_onnx: Dict, activation_encodings_torch: Dict, + layers_to_onnx_op_names: Dict[str, str], + tensor_to_quantizer_map: Dict): + input_encodings = layer.export_input_encodings() + # skip layer if it has no input encodings. 
+ if all(encoding is None for encoding in input_encodings): + return + + input_tensors = QuantizationSimModel._get_layer_input_tensors(layer, layer_name, op_to_io_tensor_map, + layers_to_onnx_op_names) + + if len(input_tensors) != len(input_encodings): + logger.warning("number of input quantizers: %d available for layer: %s " + "doesn't match with number of input tensors: %d", len(input_encodings), layer_name, + len(input_tensors)) + + for index, (input_tensor, encoding) in enumerate(zip(input_tensors, input_encodings)): + if encoding is not None: + activation_encodings_onnx[input_tensor] = encoding + # TODO: Modify this so quantsim does not make assumptions about the length of input_quantizers + tensor_to_quantizer_map[input_tensor] = layer.input_quantizers[min(index, len(layer.input_quantizers) - 1)] + # Check if layer exists in the pytorch encoding dictionary + if layer_name not in activation_encodings_torch: + activation_encodings_torch[layer_name] = {} + if QUANTIZER_TYPE_INPUT not in activation_encodings_torch[layer_name]: + activation_encodings_torch[layer_name][QUANTIZER_TYPE_INPUT] = {} + # Store encodings for a particular index so that they can be used to check if a quantizer was + # enabled or not + activation_encodings_torch[layer_name][QUANTIZER_TYPE_INPUT][index] = encoding[0] + + @staticmethod + def _get_layer_input_tensors(layer: torch.nn.Module, layer_name: str, op_to_io_tensor_map: Dict, + layers_to_onnx_op_names: Dict[str, str] = None) -> List[str]: + """ + This function returns the list of input tensor names mapped from a PyTorch Op. + + :param layer: layer as torch.nn.Module + :param layer_name: Name of the PyTorch layer + :param op_to_io_tensor_map: ONNX or Torch Script map of layer name to it's input/output tensors + :param layers_to_onnx_op_names: Dictionary mapping PyTorch layer names to names of corresponding ONNX ops + :return: list of input tensor names. + """ + + param_inputs = [layer_name + '.' + param_name for param_name, _ in layer.named_parameters()] + if version.parse(torch.__version__) < version.parse("1.13.0") or not onnx_utils.EXPORT_TO_ONNX_DIRECT: + start_op_names = [key for key in op_to_io_tensor_map + if (key.startswith(layer_name) and '#0' in key) or key == layer_name] + else: + assert layers_to_onnx_op_names is not None + op_names = layers_to_onnx_op_names.get(layer_name, []) + op_name_set = set(op_names) + start_op_names = set() + for op_name in op_names: + # For each op's inputs, if the input comes from an op not associated with this layer, add it to + # start_op_names. + for inp in op_to_io_tensor_map[op_name].inputs: + if inp not in op_name_set: + start_op_names.add(op_name) + + input_tensors = [] + input_tensors_set = set() + for name in start_op_names: + for input_tensor in op_to_io_tensor_map[name].inputs: + if input_tensor not in param_inputs and input_tensor not in input_tensors_set: + input_tensors.append(input_tensor) + input_tensors_set.add(input_tensor) + + return input_tensors + + @classmethod + def _get_layer_activation_tensors(cls, layer_name: str, op_to_io_tensor_map: Dict, + tensor_to_consumer_map: Dict[str, str] = None, + layers_to_onnx_op_names: Dict[str, str] = None) -> Tuple[List[str], List[str]]: + """ + This function returns the list of output tensor and intermediate tensor names mapped from a PyTorch Op. 
+ + :param layer_name: Name of the PyTorch layer + :param op_to_io_tensor_map: ONNX or Torch Script map of layer name to it's input/output tensors + :param tensor_to_consumer_map: Dictionary mapping tensor names to op names which consume the tensor + :param layers_to_onnx_op_names: Dictionary mapping PyTorch layer names to names of corresponding ONNX ops + :return: tuple containing list of output tensor names and list of intermediate tensors + """ + end_op_names, op_names = cls.find_op_names_for_layer(layer_name, op_to_io_tensor_map, tensor_to_consumer_map, + layers_to_onnx_op_names) + + if len(end_op_names) > 1: + output_op_map_str = cls._get_output_map_str(end_op_names, layer_name, op_to_io_tensor_map) + logger.info("layer_name: %s, has multiple output onnx ops: %s", layer_name, output_op_map_str) + + output_tensors = [] + intermediate_tensors = [] + for name in op_names: + if name in end_op_names: + output_tensors.extend(op_to_io_tensor_map[name].outputs) + else: + intermediate_tensors.extend(op_to_io_tensor_map[name].outputs) + + return output_tensors, intermediate_tensors + + @staticmethod + def _get_output_map_str(end_op_names, layer_name, op_to_io_tensor_map) -> str: + """ + This function returns formatted list of output ops tensor mapping + + :param end_op_names: list of output onnx ops + :param layer_name: Name of the PyTorch layer + :param op_to_io_tensor_map: ONNX or Torch Script map of layer name to it's input/output tensors + :return: formatted string with output ops and their corresponding output count. + """ + num_output_ops = len(end_op_names) + op_map_str = ','.join([f'{name.replace(layer_name, "")}:{len(op_to_io_tensor_map[name].outputs)}' + for name in end_op_names[:5]]) + if num_output_ops > 5: + op_map_str += ', ..' + return f'{num_output_ops},[{op_map_str}]' + + @staticmethod + def _update_encoding_dict_for_recurrent_layers(layer: torch.nn.Module, layer_name: str, op_to_io_tensor_map: Dict, + activation_encodings_onnx: Dict, param_encodings: Dict, + propagate_encodings: bool, tensor_to_quantizer_map: Dict): + """ + + :param layer: + :param layer_name: + :param op_to_io_tensor_map: + :param activation_encodings_onnx: + :param param_encodings: + :param propagate_encodings: + :return: + """ + + # pylint: disable=too-many-nested-blocks + # pylint: disable=too-many-locals + + onnx_activations_to_quantizers, onnx_params_to_quantizers = \ + layer.get_activation_param_quantizers_for_onnx_tensors(op_to_io_tensor_map[layer_name + + '#root_node']) + # ------------------ + # Activations + # ------------------ + quantizer = None + for tensor, quantizer in onnx_activations_to_quantizers.items(): + quantizer_encoding = _get_encoding_by_quantizer(quantizer) + encoding = QuantizationSimModel._create_encoding_dict(quantizer_encoding, quantizer, + propagate_encodings=False) + activation_encodings_onnx[tensor] = [encoding] + tensor_to_quantizer_map[tensor] = quantizer + + if propagate_encodings and quantizer: + _, op_names = QuantizationSimModel.find_op_names_for_layer(layer_name, op_to_io_tensor_map, None, None) + for op_name in op_names: + io_tensor_list = op_to_io_tensor_map[op_name] + if not isinstance(io_tensor_list, list): + io_tensor_list = [io_tensor_list] + + for io_tensors in io_tensor_list: + + if io_tensors.outputs: + for output_tensor in io_tensors.outputs: + if output_tensor in onnx_activations_to_quantizers: + continue + quantizer_encoding = _get_encoding_by_quantizer(quantizer) + encoding = QuantizationSimModel._create_encoding_dict(quantizer_encoding, quantizer, + True) 
+ + activation_encodings_onnx[output_tensor] = [encoding] + tensor_to_quantizer_map[output_tensor] = quantizer + + # ------------------ + # Params + # ------------------ + for tensor, quantizer in onnx_params_to_quantizers.items(): + quantizer_encoding = _get_encoding_by_quantizer(quantizer) + encoding = QuantizationSimModel._create_encoding_dict(quantizer_encoding, quantizer, + propagate_encodings=False) + param_encodings[tensor] = [encoding] + tensor_to_quantizer_map[tensor] = quantizer + + @staticmethod + def _get_qc_quantized_layers(model) -> List[Tuple[str, QcQuantizeWrapper]]: + quantized_layers = [] + for name, module in model.named_modules(): + if isinstance(module, (QcQuantizeRecurrent, LazyQuantizeWrapper, ExportableQuantModule)): + quantized_layers.append((name, module)) + return quantized_layers + + @staticmethod + def _is_quantizable_module(module_ref): + """ Function to check if a module is eligible for quantization. + If the module is NOT an PyTorch module type or if the module was already + Quantized or if the module is in the layers_to_ignore list, don't quantize. + """ + + if isinstance(module_ref, unquantizable_modules): + logger.debug("Module %s not quantizable", module_ref) + return False + + logger.debug("Module %s is quantizable", module_ref) + return True + + def _create_quantizer_module(self, module_to_quantize: torch.nn.Module, num_inout_tensors: Dict, + data_type: QuantizationDataType) -> torch.nn.Module: + """Instantiates wrapper based on quant scheme + """ + assert self._quant_scheme in [QuantScheme.post_training_tf, QuantScheme.post_training_tf_enhanced, + QuantScheme.training_range_learning_with_tf_enhanced_init, + QuantScheme.training_range_learning_with_tf_init, + QuantScheme.post_training_percentile] + + # We lookup the number of input and output tensors already determined + # Special case, we are adding a wrapper for a module not in the forward pass: Use default of 1, 1 + num_in_tensors, num_out_tensors = num_inout_tensors.get(module_to_quantize, (1, 1)) + + # Set quantizer to be a module replacer if it is in qc_quantize_modules_dict, otherwise set as + # StaticGridQuantWrapper. 
+ quantizer_wrapper_type = qc_quantize_modules_dict.get(type(module_to_quantize), LazyQuantizeWrapper) + + if quantizer_wrapper_type == LazyQuantizeWrapper: + quant_scheme_for_initialization = self._quant_scheme + else: + quant_scheme_for_initialization = utils.get_v1_quant_scheme_for_initialization(self._quant_scheme) + + # TODO add quant_scheme_for_initialization for FP8 case + quantized_module = quantizer_wrapper_type(module_to_quantize, self._default_param_bw, self._default_output_bw, + self._rounding_mode, quant_scheme_for_initialization, num_inputs=num_in_tensors, + num_outputs=num_out_tensors, data_type=data_type) + + return quantized_module + + def _add_quantization_wrappers(self, module, num_inout_tensors, default_data_type: QuantizationDataType): + """Recursively add quantization wrappers to all appropriate modules starting with module + """ + for module_name, module_ref in module.named_children(): + logger.debug("nn.Module found : %s", module_ref) + + # check if the module already quantized then ignore + if not self._is_quantizable_module(module_ref): + continue + + # check if the module is leaf or not + if utils.is_leaf_module(module_ref): + + # Create a new QcQuantize wrapper module + quantized_module = self._create_quantizer_module(module_ref, num_inout_tensors, default_data_type) + + setattr(module, module_name, quantized_module) + + # recursively call children modules + else: + self._add_quantization_wrappers(module_ref, num_inout_tensors, default_data_type) + + def _set_tensor_quantizers_for_consts(self, inout_tensor_shape_dict: Dict): + """ + Identify and set is_const for tensor quantizers which correspond to constant inputs in the model. + """ + + if self.connected_graph is not None: + for _, qc_quantize_wrapper in self.quant_wrappers(): + if isinstance(qc_quantize_wrapper, (QcQuantizeWrapper, LazyQuantizeWrapper)): + # Only handling QcQuantWrappers and not QcQuantizeRecurrents + # pylint: disable=protected-access + conn_graph_op = self.connected_graph._module_to_op_dict.get(qc_quantize_wrapper._module_to_wrap) + input_tensor_shape_list = inout_tensor_shape_dict.get(qc_quantize_wrapper._module_to_wrap) + if conn_graph_op is not None: + for idx, (input_quantizer, inp) in \ + enumerate(zip(qc_quantize_wrapper.input_quantizers, conn_graph_op.inputs)): + input_quantizer.is_const = inp.is_const + input_quantizer.is_singleton = (input_tensor_shape_list is not None \ + and input_tensor_shape_list[0][idx] is not None \ + and input_tensor_shape_list[0][idx].numel() == 1) + + @staticmethod + def _create_encoding_dict(encoding: libpymo.TfEncoding, quantizer, propagate_encodings: bool) -> Union[Dict, None]: + """ + Create encoding dictionary from encoding object + :param encoding: Encoding of the quantizer + :param quantizer: Tensor Quantizer + :param propagate_encodings: If True, encoding entries for intermediate ops (when one PyTorch ops results in + multiple ONNX nodes) are filled with the same BW and data_type as the output tensor for that series of + ops. 
+ :return: Encoding Dictionary + """ + return utils.create_encoding_dict(encoding, quantizer, propagate_encodings) + + @classmethod + def _remove_quantization_wrappers(cls, starting_module, list_of_modules_to_exclude): + """ + Recursively remove quantization wrappers from all appropriate modules starting with a given module + :param starting_module: Module to recursive search downstream from + :param list_of_modules_to_exclude: List of torch modules to remove quantization wrappers from (if present) + :return: None + """ + for module_name, module_ref in starting_module.named_children(): + + # If modules is in the exclude list, remove the wrapper + if module_ref in list_of_modules_to_exclude: + + if isinstance(module_ref, ExportableQuantModule): + # Remove the wrapper, gets auto-deleted + # pylint: disable=protected-access + setattr(starting_module, module_name, module_ref.get_original_module()) + + elif isinstance(module_ref, QcQuantizeStandAloneBase): + setattr(starting_module, module_name, torch.nn.Identity()) + + elif isinstance(module_ref, QcQuantizeRecurrent): + module_ref.update_params() + setattr(starting_module, module_name, module_ref.module_to_quantize) + + # Recursively call children modules if present + if not utils.is_leaf_module(module_ref): + cls._remove_quantization_wrappers(module_ref, list_of_modules_to_exclude) + + @staticmethod + def get_original_model(model: torch.nn.Module): + """ + This function returns the model with all quantization wrappers removed. + :return: Model without quantization wrappers. + """ + original_model = copy.deepcopy(model) + # pylint: disable=unnecessary-comprehension + all_modules_in_original_model = [module for module in original_model.modules()] + QuantizationSimModel._remove_quantization_wrappers(original_model, all_modules_in_original_model) + return original_model + + def _get_leaf_module_to_name_map(self): + """ + Returns a mapping from leaf modules to module name, where any ExportableQuantModule is considered a leaf module, + and is therefore not further recursed (since we do not want to retrieve all internal quantizers/modules). + """ + def recursively_populate_map(starting_module, module_map, start_str): + for name, module in starting_module.named_children(): + if isinstance(module, ExportableQuantModule) or utils.is_leaf_module(module): + module_map[module] = start_str + name + else: + recursively_populate_map(module, module_map, start_str + name + ".") + module_to_name_map = {} + recursively_populate_map(self.model, module_to_name_map, "") + return module_to_name_map + + def _add_inputs_hook(self, hooks): + module_to_name_map = self._get_leaf_module_to_name_map() + + def inputs_hook(module_ref, inputs, _): + # Need to remove hook here, otherwise the jit trace of CustomMarker with module ref will error since the + # hook will be recursively hit. + hooks[module_ref].remove() + del hooks[module_ref] + module_name = module_to_name_map[module_ref] + if isinstance(module_ref, ExportableQuantModule): + module_ref = module_ref.get_original_module() + marker_layer = torch.jit.trace(CustomMarker(module_ref, module_name, 'True'), + inputs) + self._module_marker_map[module_name] = marker_layer + + for name, module in self.model.named_modules(): + if name in module_to_name_map.values(): + hooks[module] = module.register_forward_hook(inputs_hook) + + def _validate_module_marker_map(self): + """ + Check to make sure all leaf modules have traced Custom Markers associated with them. 
+ """ + all_leaf_modules = self._get_leaf_module_to_name_map().values() + missing_inputs_entries = [] + + for leaf_module in all_leaf_modules: + if leaf_module not in self._module_marker_map.keys(): + missing_inputs_entries.append(leaf_module) + + if missing_inputs_entries: + logger.info('In order to export a conditional model, all leaf modules need to be run with some input so ' + 'torch trace can be done.') + logger.info('The following modules were not run during compute encodings:') + logger.info(missing_inputs_entries) + logger.info('Please use the sim.run_modules_for_traced_custom_marker(<module list>, dummy_input) api to ' + 'pass dummy inputs to these modules.') + logger.info('Modules which can take the same dummy input can be ' + 'grouped as a list. For groups of modules with different input shapes, please call ' + 'sim.run_modules_for_traced_custom_markers() for each group.') + logger.info('Exiting quantsim export early.') + return False + return True + + def _export_conditional(self, path: str, filename_prefix: str, dummy_input: Union[torch.Tensor, Tuple], + forward_pass_callback: Callable, forward_pass_callback_args, + onnx_export_args: Union[OnnxExportApiArgs, None] = OnnxExportApiArgs(), + propagate_encodings: bool = False): + """ + Export function for conditional models. Performs another round of forward passes to create and store traced + CustomMarker info for each leaf module to be later used when scripting the model for export. + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames of the model pth and encodings files + :param dummy_input: Dummy input to the model. Used to parse model graph. It is required for the dummy_input to + be placed on CPU. + :param forward_pass_callback: A callback function that simply runs forward passes on the model. This callback + function should use representative data for the forward pass, so the calculated encodings work for all + data samples. This callback internally chooses the number of data samples it wants to use for calculating + encodings. The callback should exercise all paths of the conditional model. + :param forward_pass_callback_args: These argument(s) are passed to the forward_pass_callback as-is. Up to + the user to determine the type of this parameter. E.g. could be simply an integer representing the number + of data samples to use. Or could be a tuple of parameters or an object representing something more complex. + If set to None, forward_pass_callback will be invoked with no parameters. + :param onnx_export_args: onnx specific export arguments + :param propagate_encodings: If True, encoding entries for intermediate ops (when one PyTorch ops results in + multiple ONNX nodes) are filled with the same BW and data_type as the output tensor for that series of + ops. + :return: None + """ + self._is_conditional = True + if onnx_export_args is None: + onnx_export_args = OnnxExportApiArgs() + + # If model is conditional, we need to create traced CustomMarkers to be used later during export. Create hooks + # here for creating a traced CustomMarker for each leaf module during the forward pass callback. + hooks = {} + if self._is_conditional: + self._add_inputs_hook(hooks) + + with utils.in_eval_mode(self.model), torch.no_grad(): + _ = forward_pass_callback(self.model, forward_pass_callback_args) + + # Any hooks that were hit during forward pass callback would have removed themselves. Remove the remaining + # hooks that were not run. 
+ for h in hooks.values(): + h.remove() + + # Check that all paths were exercised + if not self._validate_module_marker_map(): + return + self.export(path, filename_prefix, dummy_input, onnx_export_args, propagate_encodings) + + def configure_quantization_ops(self, config_file: str, default_output_bw: int, default_param_bw: int, + default_data_type: QuantizationDataType) -> QuantSimConfigurator: + """ + Configure inserted quantize ops using config file and fill in all the supported kernels + :param config_file: Configuration file to use + :param default_output_bw: default bitwidth for activations + :param default_param_bw: default bitwidth for params + :param default_data_type: default data type + :return: QuantSimConfigurator object + """ + if self.connected_graph is None: + error_msg = ('A connected graph failed to be built.\n' + 'Unable to proceed with automatically configuring quantization ops using the config file.\n' + 'Please configure quantization ops manually by redefining ' + 'QuantizationSimModel.configure_quantization_ops()') + logger.error(error_msg) + raise AssertionError(error_msg) + return QuantSimConfigurator(self.model, self.connected_graph, config_file, default_output_bw, + default_param_bw, default_data_type) + + def load_encodings(self, encodings: Union[Mapping, str, os.PathLike], + strict: bool = True, + partial: bool = True, + requires_grad: Optional[bool] = None, + allow_overwrite: bool = True): + """ + :param encodings: Encoding dictionary or path to the encoding dictionary json file. + :param bool strict: If True, an error will be thrown if the model doesn't + have a quantizer corresponding to the specified encodings. + :param bool partial: If True, the encoding will be interpreted as a partial encoding, + and the dangling quantizers with no corresponding encoding will be kept untouched. + Otherwise, the dangling quantizers will be removed from the model. + :param bool requires_grad: Whether or not the quantization parameters loaded from the + encodings require gradient computation during training. + If None, ``requires_grad`` flag of the quantization parameters will be kept unchanged. + :param bool allow_overwrite: Whether or not the quantization parameters loaded from the + encodings can be overwritten by :ref:`compute_encodings` or another :ref:`load_encodings`. + If None, whether the quantizer is overwritable will be kept unchanged. + """ + if isinstance(encodings, (str, os.PathLike)): + with open(encodings, mode='r') as f: + encodings = json.load(f) + + self._load_encodings_impl(encodings, strict, partial, requires_grad, allow_overwrite) + + def _load_encodings_impl(self, encodings: Mapping, + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + if 'param_encodings' not in encodings: + param_encodings = encodings + activation_encodings = {} + logger.warning("An older AdaRound exported encoding file type has been detected! " + "Please regenerate it using the AdaRound export function from the latest " + "AIMET (version 1.32 or higher) if necessary. 
" + "Support for this encoding file will be deprecated in AIMET version 1.33.0.") + else: + param_encodings = encodings.get('param_encodings', {}) + activation_encodings = encodings.get('activation_encodings', {}) + + if not param_encodings and not activation_encodings: + raise RuntimeError + + if strict is True: + encoding_keys = param_encodings.keys() | activation_encodings.keys() + model_keys = set(name.replace("._module_to_wrap", "") for name, _ + in chain(self.model.named_modules(), utils.get_all_named_parameters(self.model))) + keys_not_found = encoding_keys - model_keys + if keys_not_found: + keys_not_found = ', '.join(sorted(keys_not_found)) + msg = f"Encoding dictionary contains modules/parameters that doesn't exist in the model: {keys_not_found}" + raise RuntimeError(msg) + + if param_encodings is not None: + self._set_param_encodings(param_encodings, + strict, partial, requires_grad, allow_overwrite) + + if activation_encodings is not None: + self._set_activation_encodings(activation_encodings, + strict, partial, requires_grad, allow_overwrite) + + @deprecated(f"Use {load_encodings.__qualname__} instead.") + def load_and_freeze_encodings(self, encoding_path: str, ignore_when_quantizer_disabled: bool = False): + """ + Functionality to set encodings (both activation and parameter) as per the given encodings JSON file and + freeze them. + .. note: + The encodings JSON file should be the {prefix}_torch.encodings json exported during sim.export() + + :param encoding_path: JSON file path from where to load the encodings. + :param ignore_when_quantizer_disabled: ignore raising RuntimeError while setting encodings, + when quantizers are disabled. + """ + self.load_encodings(encoding_path, + strict=not ignore_when_quantizer_disabled, + partial=True, + requires_grad=False, + allow_overwrite=False) + + def _set_param_encodings(self, + encoding_dict: Mapping, + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + for name, quant_module in self.model.named_modules(): + if isinstance(quant_module, ExportableQuantModule): + param_encoding = { + param_name: encoding_dict[f'{name}.{param_name}'] + for param_name, _ in quant_module.param_quantizers.items() + if f'{name}.{param_name}' in encoding_dict + } + quant_module.import_param_encodings(param_encoding, + strict, + partial, + requires_grad, + allow_overwrite) + + def _set_activation_encodings(self, + activation_encoding_dict: Mapping, + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + for module_name, module in self.model.named_modules(): + if not isinstance(module, ExportableQuantModule): + continue + + try: + input_encoding = activation_encoding_dict[module_name]['input'] + except KeyError: + input_encoding = {} + + module.import_input_encodings(input_encoding, + strict, + partial, + requires_grad, + allow_overwrite) + + try: + output_encoding = activation_encoding_dict[module_name]['output'] + except KeyError: + output_encoding = {} + + module.import_output_encodings(output_encoding, + strict, + partial, + requires_grad, + allow_overwrite) + + + @deprecated(f"Use {load_encodings.__qualname__} instead.") + def set_and_freeze_param_encodings(self, encoding_path: str): + """ + Set and freeze parameter encodings from encodings JSON file. + .. note: + The loaded json file should contain ONLY weight encodings. This is different from the json file used in + `load_and_freeze_encodings`, which contains both weight and activation dictionaries. 
+ + :param encoding_path: path from where to load parameter encodings file + """ + with open(encoding_path, mode='r') as f: + encodings = json.load(f) + + if 'activation_encodings' in encodings: + del encodings['activation_encodings'] + + self.load_encodings(encodings, + strict=True, + partial=True, + requires_grad=False, + allow_overwrite=False) + + def named_qmodules(self): + """Generator that yields all quantized modules in the model and their names + """ + for name, module in self.model.named_modules(): + if isinstance(module, (QcQuantizeRecurrent, LazyQuantizeWrapper, ExportableQuantModule)): + yield name, module + + def qmodules(self): + """Generator that yields all quantized modules in the model + """ + yield from (module for _, module in self.named_qmodules()) + + quant_wrappers = named_qmodules + + def run_modules_for_traced_custom_marker(self, module_list: List[torch.nn.Module], dummy_input): + """ + Given a list of modules to run and dummy input for the module, create a traced CustomMarker for each module + and store it in the module_marker map. The same dummy input will be used for all modules. + + :param module_list: List of modules to create traced CustomMarkers for + :param dummy_input: Dummy input for all modules + """ + + module_to_name_map = self._get_leaf_module_to_name_map() + + for module in module_list: + # Only perform init and trace if the given module is a leaf module, and we have not recorded it before + if module in module_to_name_map and module_to_name_map[module] not in self._module_marker_map: + name = module_to_name_map[module] + module = module.get_original_module() if isinstance(module, ExportableQuantModule) else module + with utils.in_eval_mode(module), torch.no_grad(): + marker_layer = torch.jit.trace(CustomMarker(module, name, True), dummy_input) + self._module_marker_map[name] = marker_layer + + def _validate_supported_kernels_for_quantizers(self, action: SupportedKernelsAction): + """ + Validate supported kernels for all the Quantizers in the QuantSimModel + :param action: The action to be performed when incorrect candidate is set in a quantizer + """ + + def apply_act_param_rules(curr_candidate: QuantDtypeBwInfo, allowed_supported_kernels: List[QuantDtypeBwInfo], module_name): + """ + helper function to validate both activation and param against the supported_kernels passed + :param curr_candidate: candidate of interest + :param allowed_supported_kernels: List of supported kernels for the given module + :param module_name: name of the module + """ + if action != SupportedKernelsAction.allow_error: + for k in allowed_supported_kernels: + if curr_candidate == k: + return + + if action == SupportedKernelsAction.warn_on_error: + logger.warning("candidate:%s is not under the supported_kernels for the module %s", curr_candidate, + module_name) + + if action == SupportedKernelsAction.assert_on_error: + error_msg = f'candidate: {curr_candidate} is not under the supported_kernels for the module {module_name}' + raise RuntimeError(error_msg) + + def apply_act_rules(act: Tuple[int, QuantizationDataType], allowed_supported_kernels: List[QuantDtypeBwInfo], module_name): + """ + helper function to validate both activation only against the supported_kernels passed + :param act: act of the candidate to be validated + :param allowed_supported_kernels: List of supported kernels for the given module + :param module_name: name of the module + """ + if action != SupportedKernelsAction.allow_error: + for k in allowed_supported_kernels: + if k.is_same_activation(act[1], 
act[0]): + return + + if action == SupportedKernelsAction.warn_on_error: + logger.warning("activation:%s is not under the supported_kernels for the module %s", act, module_name) + + if action == SupportedKernelsAction.assert_on_error: + error_msg = f'activation: {act} is not under the supported_kernels for the module {module_name}' + raise RuntimeError(error_msg) + + # retrieve all the act and param quantizer candidates, and validate them against supported_kernels + for name, module in self.model.named_modules(): + if isinstance(module, (QcQuantizeWrapper, LazyQuantizeWrapper)) and module.supported_kernels: + supported_kernels = [] + for supported_kernel in module.supported_kernels: + # ((activation bitwidth, activation data type), (param bitwidth, param data type)) + # TODO modify this once reformat_supported_kernels generates of type QuantDtypeBwInfo + if isinstance(supported_kernel[1], tuple): + supported_kernels.append( + QuantDtypeBwInfo(supported_kernel[0][1], supported_kernel[0][0], + supported_kernel[1][1], supported_kernel[1][0])) + else: + supported_kernels.append( + QuantDtypeBwInfo(supported_kernel[1], supported_kernel[0])) + act_candidates = [] + param_candidate = () + for quantizer in module.input_quantizers + module.output_quantizers: + act_candidates.append((quantizer.bitwidth, quantizer.data_type)) + + if 'weight' in module.param_quantizers: + param_candidate = (module.param_quantizers['weight'].bitwidth, + module.param_quantizers['weight'].data_type) + + if param_candidate: + # we need to check weights against all the activations + for act_candidate in set(act_candidates): + apply_act_param_rules(QuantDtypeBwInfo(act_candidate[1], act_candidate[0], param_candidate[1], + param_candidate[0]), supported_kernels, name) + else: + for candidate in set(act_candidates): + apply_act_rules(candidate, supported_kernels, name) + + @staticmethod + def _replace_quantization_wrapper_with_native_torch_quantization_nodes(quant_sim_model, device: torch.device): + """ + Recursively remove quantization wrappers from all appropriate modules starting with a given module + :param quant_sim_model: model for which QcQuantizeWrapper gets replaced with wrapped module using + native torch quantization nodes + :param device: device on which model is present + :return: + """ + # Recursively replace quantization wrappers to native torch quantization nodes + for module_name, module_ref in quant_sim_model.named_children(): + # Create a native torch quantization node + if isinstance(module_ref, QcQuantizeWrapper): + embedded_module = NativeTorchQuantWrapper(module_ref, '_module_to_wrap', device) + setattr(quant_sim_model, module_name, embedded_module) + + elif isinstance(module_ref, QcQuantizeRecurrent): + logger.error('Do not support save model embedded native torch quantization nodes using QcQuantizeRecurrent.') + raise AssertionError + + # Recursively call children modules if present + if not utils.is_leaf_module(module_ref): + QuantizationSimModel._replace_quantization_wrapper_with_native_torch_quantization_nodes(module_ref, device) + + # pylint: disable=protected-access, too-many-branches, too-many-locals + def _apply_exception_rules(self): + """ + Apply exception rules to specific op. 
For example, a rule can override high bitwidth to Embedding module + """ + if self._hw_version not in {'V66', 'V68', 'V69', 'V73', 'V75', 'V79'}: + return + + module_to_quant_wrapper = {} + for _, wrapper in self.quant_wrappers(): + module_to_quant_wrapper[wrapper._module_to_wrap] = wrapper + + for name, wrapper in self.quant_wrappers(): + original_module = wrapper._module_to_wrap + + # A module that doesn't require exception rules + if not isinstance(original_module, (torch.nn.Embedding, torch.nn.GroupNorm, elementwise_ops.MatMul)): + continue + + if isinstance(original_module, torch.nn.Embedding): + if self._hw_version not in {'V73', 'V75', 'V79'}: + continue + weight_quantizer = wrapper.param_quantizers['weight'] + output_quantizer = wrapper.output_quantizers[0] + + weight_quantizer.bitwidth = output_quantizer.bitwidth + weight_quantizer.use_symmetric_encodings = output_quantizer.use_symmetric_encodings + elif isinstance(original_module, torch.nn.GroupNorm): + if self._hw_version not in {'V73', 'V75', 'V79'}: + continue + if 'weight' in wrapper.param_quantizers: + output_quantizer = wrapper.output_quantizers[0] + for _, param_quantizer in wrapper.param_quantizers.items(): + param_quantizer.bitwidth = output_quantizer.bitwidth + param_quantizer.use_symmetric_encodings = output_quantizer.use_symmetric_encodings + elif isinstance(original_module, elementwise_ops.MatMul): + first_input_quantizer, second_input_quantizer = wrapper.input_quantizers + + op = self.connected_graph._module_to_op_dict[original_module] + first_input_op = op.input_ops[0] if (not first_input_quantizer.enabled) else None + second_input_op = op.input_ops[1] if (not second_input_quantizer.enabled) else None + + target_quantizer_for_first_input = self._get_target_quantizer(first_input_quantizer, first_input_op, module_to_quant_wrapper) + target_quantizer_for_second_input = self._get_target_quantizer(second_input_quantizer, second_input_op, module_to_quant_wrapper) + + if not target_quantizer_for_second_input: + continue + + # According to opdef for Matmul in HTP: + # 16bit Weight(second input for dynamic MatMul) must have 16bit Activation(first input for dynamic MatMul). + # 16bit Activation and 16bit Weight require minimum arch V73. + # 16bit Weight must be symmetric quantized. + + # Below are the possible combinations for MatMul with 8/16 bitwidth: + # If version is V73/V75: {input0->8, input1->8 symm/asymm} {input0->16 , input1->8 symm/asymm} {input0->16, input1->16 symmetric} + # If version is lesser than V73: {input0->8, input1->8 symmetric} {input0->16, input1->8 symmetric} + + if self._hw_version in {'V66', 'V68', 'V69'}: + target_quantizer_for_second_input.use_symmetric_encodings = True + target_quantizer_for_second_input.bitwidth = 8 + elif self._hw_version in {'V73', 'V75', 'V79'}: + if target_quantizer_for_second_input.bitwidth == 16: + target_quantizer_for_second_input.use_symmetric_encodings = True + if target_quantizer_for_first_input: + target_quantizer_for_first_input.bitwidth = 16 + else: + raise ValueError(f'Not expected hardware version to apply exception rules: {self._hw_version}') + + else: + raise ValueError(f'A module not expected to apply exception rules: {name}') + + def _get_target_quantizer(self, input_quantizer: TensorQuantizer, input_op: Op, module_to_quant_wrapper: Dict[torch.nn.Module, QcQuantizeWrapper]) -> TensorQuantizer: + """ + Returns input quantizer if enabled otherwise returns closest enabled parent output quantizer. 
+ + :param input_quantizer: Input quantizer + :param input_op: Input Op + :param module_to_quant_wrapper: Dict of module to quant wrapper + :return: Target quantizer + """ + target_quantizer = None + if input_quantizer.enabled: + target_quantizer = input_quantizer + elif input_op: + closest_producer_wrapper = self._get_closest_producer_wrapper( + input_op, module_to_quant_wrapper + ) + if closest_producer_wrapper: + target_quantizer = closest_producer_wrapper.output_quantizers[0] + else: + logger.warning("The closest wrapper could not be found. MatMul exception rule does not apply. " + "If you haven't used model preparer, consider using it.") + return target_quantizer + + + def _get_closest_producer_wrapper(self, + op: Op, + module_to_quant_wrapper: Dict[torch.nn.Module, QcQuantizeWrapper]) -> \ + Optional[QcQuantizeWrapper]: + """ + Find the closest producer QcQuantizeWrapper and return it + + :param op: Target operation + :param module_to_quant_wrapper: Module to Wrapper dictionary + :return: QcQuantizerWrapper if exists else None + """ + def get_quant_wrapper() -> Optional[QcQuantizeWrapper]: + module = op.get_module() + return module_to_quant_wrapper.get(module) if module else None + + wrapper = get_quant_wrapper() + if wrapper and wrapper.output_quantizers[0].enabled: + return wrapper + + if wrapper and not wrapper.output_quantizers[0].enabled: + # pylint: disable=no-else-return + if len(op.input_ops) == 1: + return self._get_closest_producer_wrapper(op.input_ops[0], module_to_quant_wrapper) + else: + logger.warning("A wrapper of %s with output quantization disabled has no input or more than one input exists. " + "It's ambiguous to find the nearest producer in this case", str(op.get_module())) + return None + + if not wrapper: + if not op.input_ops: + logger.warning("No input exists for navigation for traversal, it's not possible to find the closest producer") + return None + + if len(op.input_ops) > 1: + logger.warning("Multiple input ops exist, traversal to find closest producer is performed based on the first input") + + return self._get_closest_producer_wrapper(op.input_ops[0], module_to_quant_wrapper) + + @staticmethod + def save_model_with_embedded_quantization_nodes(sim_model, path: str, filename_prefix: str, dummy_input: Union[torch.Tensor, Tuple], + onnx_export_args: Optional[Union[OnnxExportApiArgs, Dict]] = None, + export_to_torchscript: bool = False, is_conditional: bool = False): + """ + Export model embedded with native torch quantization nodes. These nodes will be exported + as default onnx or torch script quantized nodes. + :param sim_model: model with the quantsim wrappers + :param path: path where to store model pth and encodings + :param filename_prefix: Prefix to use for filenames of the model pth and encodings files + :param dummy_input: Dummy input to the model. Used to parse model graph + :param onnx_export_args: optional export argument with onnx specific overrides if not provide export via + torchscript graph. Int16 can only be exported by torchscript + :param export_to_torchscript: If True, export to torchscript. Export to onnx otherwise. Defaults to False. 
+ :param is_conditional: True if model is conditional, False otherwise + :return: + """ + def _validate_torchquantizer(quant_sim_model): + # To avoid non 8 bit TorchQuantizer are exported to ONNX + for _, module in quant_sim_model.named_modules(): + if isinstance(module, NativeTorchQuantWrapper): + quantizers = module.input_quantizers + module.output_quantizers + if 'weight' in module.param_quantizers: + quantizers += [module.param_quantizers['weight']] + if 'bias' in module.param_quantizers: + quantizers += [module.param_quantizers['bias']] + + for quantizer in quantizers: + if quantizer.enabled and quantizer.data_type == QuantizationDataType.int and quantizer.bitwidth != 8: + raise ValueError('Only 8 bit quantizers are supported by exporting to ONNX model.' + 'Please enable export_to_torchscript if you want to export non 8 bit quantizers.') + + model_filename = filename_prefix + '_embedded' + '.onnx' + model_path = os.path.join(path, model_filename) + quant_sim_model = copy.deepcopy(sim_model) + + device = utils.get_device(quant_sim_model) + if isinstance(dummy_input, torch.Tensor): + dummy_input = dummy_input.to(device) + else: + dummy_input = tuple([input.to(device) for input in dummy_input]) # pylint: disable=consider-using-generator + QuantizationSimModel._replace_quantization_wrapper_with_native_torch_quantization_nodes(quant_sim_model, device) + + if export_to_torchscript: + with utils.in_eval_mode(quant_sim_model), torch.no_grad(): + trace = torch.jit.trace(quant_sim_model, dummy_input) + ts_path = os.path.join(path, filename_prefix + '_embedded' + '.torchscript.pth') + trace.save(ts_path) + else: + _validate_torchquantizer(quant_sim_model) + OnnxSaver._export_model_to_onnx(quant_sim_model, dummy_input, model_path, is_conditional, onnx_export_args) # pylint: disable=protected-access + + def _enable_output_quantizers_for_specific_cast_ops(self, inout_tensors_dtypes: Dict[torch.nn.Module, Tuple[torch.dtype, torch.dtype]]): + """ + Enable output quantizer for Cast Ops where datatype of input tensor is int/bool + and data type of output tensor is float. + """ + # pylint: disable=protected-access + model_prefix = self.connected_graph._model_name + '.' + torch_int_dtypes = {torch.int8, torch.int16, torch.int32, torch.int64, torch.bool, torch.uint8} + torch_float_dtypes = {torch.float16, torch.float32, torch.float64} + + for module, inout_dtypes in inout_tensors_dtypes.items(): + input_tensor_dtype = inout_dtypes[0] + output_tensor_dtype = inout_dtypes[1] + # pylint: disable=protected-access + module_name = self.connected_graph._module_to_name[module].split(model_prefix)[-1] + + if input_tensor_dtype != output_tensor_dtype and input_tensor_dtype in torch_int_dtypes and output_tensor_dtype in torch_float_dtypes: + logger.info("Enabling output quantizer for module %s", module_name) + wrapped_module = getattr(self.model, module_name) + for output_quantizer in wrapped_module.output_quantizers: + setattr(output_quantizer, 'enabled', True) + + @staticmethod + def _get_num_inout_tensors_from_tensor_shape_dict(inout_tensor_shape_dict): + """ + Convert tensor shape dictionary to num inout tensors dictionary + """ + num_inout_tensors = {} + + for module, inout_tensor_shape in inout_tensor_shape_dict.items(): + input_tensor_shape_list, output_tensor_shape_list = inout_tensor_shape + num_inout_tensors[module] = (len(input_tensor_shape_list), + len(output_tensor_shape_list)) + + return num_inout_tensors
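The encoding-handling APIs documented in the class above are typically driven as sketched below. This is a minimal usage sketch rather than part of the module source; torchvision's resnet18, the random calibration batch, and the /tmp output directory are assumptions chosen only for illustration.

import os
import torch
from torchvision.models import resnet18
from aimet_torch.quantsim import QuantizationSimModel

# Assumptions: a pretrained torchvision ResNet18 stands in for the user's FP32 model,
# a random tensor stands in for representative calibration data.
model = resnet18(pretrained=True).eval()
dummy_input = torch.randn(1, 3, 224, 224)
output_dir = '/tmp/quantsim_out'
os.makedirs(output_dir, exist_ok=True)

sim = QuantizationSimModel(model, dummy_input=dummy_input)

def forward_pass(sim_model, _):
    # A single random batch is used here purely to keep the sketch self-contained.
    with torch.no_grad():
        sim_model(dummy_input)

# Calibrate the quantizers, then export the model together with its encodings.
sim.compute_encodings(forward_pass, forward_pass_callback_args=None)
sim.export(path=output_dir, filename_prefix='resnet18', dummy_input=dummy_input)

# The exported '<prefix>_torch.encodings' file can later be loaded back into a sim
# built around the same model architecture via the load_encodings() API above.
new_sim = QuantizationSimModel(resnet18(pretrained=True).eval(), dummy_input=dummy_input)
new_sim.load_encodings(os.path.join(output_dir, 'resnet18_torch.encodings'),
                       strict=True, partial=False)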
+ + +def save_checkpoint(quant_sim_model: QuantizationSimModel, file_path: str): + """ + This API provides a way for the user to save a checkpoint of the quantized model which can + be loaded at a later point to continue fine-tuning, for example. + See also load_checkpoint() + + :param quant_sim_model: QuantizationSimModel to save checkpoint for + :param file_path: Path to the file where you want to save the checkpoint + :return: None + """ + with open(file_path, 'wb') as file: + pickle.dump(quant_sim_model, file) + + +def load_checkpoint(file_path: str) -> QuantizationSimModel: + """ + Load a quantized model from a previously saved checkpoint + + :param file_path: Path to the checkpoint file to load + :return: A new instance of the QuantizationSimModel created after loading the checkpoint + """ + with open(file_path, 'rb') as file: + sim = pickle.load(file) + return sim + + +def check_accumulator_overflow(model: torch.nn.Module, quant_bw: int, accum_bw: int): + """ + Checks for potential accumulator overflow across all the layers of the given model + :param model: Model + :param quant_bw: Bitwidth the layers are quantized at + :param accum_bw: Bitwidth of the accumulator + :return: Name of the layer with the most accumulator range used, and the range used + """ + + most_accum_range_used = 0 + most_accum_range_used_layer = None + + for layer_name, layer in model.named_modules(): + + if isinstance(layer, torch.nn.Conv2d): + was_accum_range_exceeded, accum_range_used = get_conv_accum_bounds(layer.weight.detach().numpy(), + quant_bw, accum_bw) + if accum_range_used > most_accum_range_used: + most_accum_range_used = accum_range_used + most_accum_range_used_layer = layer_name + + if was_accum_range_exceeded: + logger.info('Possible accumulator overflow for layer: %s', layer_name) + + if most_accum_range_used < 1: + logger.info('No overflow detected. Layer %s had the most accumulator range used: %f%%', + most_accum_range_used_layer, most_accum_range_used * 100) + else: + logger.info('Overflow detected. Layer %s had the most accumulator range used: %f%%', + most_accum_range_used_layer, most_accum_range_used * 100) + + return most_accum_range_used_layer, most_accum_range_used + + +@deprecated(f"Use {QuantizationSimModel.load_encodings.__qualname__} instead.") +def load_encodings_to_sim(quant_sim_model: QuantizationSimModel, pytorch_encoding_path: str): + """ + Loads the saved encodings into the quant sim model. The encoding filename to load should end in _torch.encodings, + generated as part of quantsim export. + + :param quant_sim_model: Quantized model to load encodings for. Note: The model configuration should be the same as + when encodings were exported. + :param pytorch_encoding_path: Path of the encodings file to load. + """ + for module in quant_sim_model.model.modules(): + if isinstance(module, QcQuantizeWrapper): + module.set_mode(QcQuantizeOpMode.ACTIVE) + + quant_sim_model.load_encodings(pytorch_encoding_path, + strict=True, + partial=False, + requires_grad=None, + allow_overwrite=None) + + if isinstance(quant_sim_model, QuantizationSimModel): + # Only for V1 quantsim + quant_sim_model.replace_wrappers_for_quantize_dequantize() + + +def has_valid_encodings(qc_quantize_op: ExportableQuantModule) -> bool: + """ + Utility for determining whether a given qc_quantize_op has any valid encodings.
+ + :param qc_quantize_op: Qc quantize op to evaluate + :return: True if any input, param, or output quantizers have valid encodings, False otherwise + """ + if not isinstance(qc_quantize_op, (ExportableQuantModule, QcQuantizeRecurrent)): + logger.error("has_valid_encodings only supported for QcQuantizeWrapper and QcQuantizeRecurrent " + "modules") + assert isinstance(qc_quantize_op, (ExportableQuantModule, QcQuantizeRecurrent)) + if isinstance(qc_quantize_op, ExportableQuantModule): + all_encodings = qc_quantize_op.export_output_encodings() + qc_quantize_op.export_input_encodings() + \ + list(qc_quantize_op.export_param_encodings().values()) + return any([encoding is not None for encoding in all_encodings]) # pylint: disable=consider-using-generator,use-a-generator + input_quantizers = list(qc_quantize_op.input_quantizers.values()) + output_quantizers = list(qc_quantize_op.output_quantizers.values()) + + for quantizer in input_quantizers + output_quantizers + list(qc_quantize_op.param_quantizers.values()): + if quantizer.enabled and (quantizer.encoding is not None or quantizer.data_type is QuantizationDataType.float): + return True + + return False + + +def compute_encodings_for_sims(sim_list: List[QuantizationSimModel], forward_pass_callback: Callable, + forward_pass_callback_args: Any): + """ + Compute encodings for a list of QuantSims. + + :param sim_list: List of QuantSims to compute encodings for. + :param forward_pass_callback: A callback function that simply runs forward passes on the models. This callback + function should use representative data for the forward pass, so the calculated encodings work for all + data samples. This callback internally chooses the number of data samples it wants to use for calculating + encodings. + The callback expects exactly two inputs: + - List of models which are involved in the forward pass. The models are taken directly from calling + sim.model for each sim in sim_list, passed in the same order in which the sims appear in sim_list. + - Forward pass callback args + :param forward_pass_callback_args: These argument(s) are passed to the forward_pass_callback as-is. Up to + the user to determine the type of this parameter. E.g. could be simply an integer representing the number + of data samples to use. Or could be a tuple of parameters or an object representing something more complex. + If set to None, forward_pass_callback will be invoked with no parameters. + """ + ctx_managers = [torch.no_grad()] + for sim in sim_list: + ctx_managers.append(utils.in_eval_mode(sim.model)) + QuantizationSimModel.prepare_sim_for_compute_encodings(sim) + + with contextlib.ExitStack() as stack: + for mgr in ctx_managers: + stack.enter_context(mgr) + _ = forward_pass_callback([sim.model for sim in sim_list], forward_pass_callback_args) + + for sim in sim_list: + QuantizationSimModel.compute_layer_encodings_for_sim(sim) +
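The module-level helpers defined above (save_checkpoint, load_checkpoint, check_accumulator_overflow and compute_encodings_for_sims) can be combined as in the following sketch. It is not part of the module source; the two toy models, the input shapes and the /tmp path are assumptions chosen only to keep the example self-contained.

import torch
from aimet_torch.quantsim import (QuantizationSimModel, save_checkpoint, load_checkpoint,
                                  check_accumulator_overflow, compute_encodings_for_sims)

# Two toy models whose outputs chain together end to end.
model_a = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
model_b = torch.nn.Sequential(torch.nn.Conv2d(8, 4, 3), torch.nn.ReLU())
dummy_a = torch.randn(1, 3, 32, 32)
dummy_b = torch.randn(1, 8, 30, 30)

sim_a = QuantizationSimModel(model_a, dummy_input=dummy_a)
sim_b = QuantizationSimModel(model_b, dummy_input=dummy_b)

def joint_forward_pass(models, _):
    # 'models' arrives as [sim_a.model, sim_b.model], in the order the sims were passed in.
    intermediate = models[0](dummy_a)
    models[1](intermediate)

# Calibrate both sims through a single callback, then checkpoint and restore one of them.
compute_encodings_for_sims([sim_a, sim_b], joint_forward_pass, forward_pass_callback_args=None)
save_checkpoint(sim_a, '/tmp/sim_a_checkpoint.pth')
sim_a_restored = load_checkpoint('/tmp/sim_a_checkpoint.pth')

# Rough screen of the FP32 Conv2d weights for accumulator saturation risk.
layer_name, range_used = check_accumulator_overflow(model_a, quant_bw=8, accum_bw=32)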
\ No newline at end of file
diff --git a/releases/1.32.2/_modules/aimet_torch/visualize_model.html b/releases/1.32.2/_modules/aimet_torch/visualize_model.html
new file mode 100644
index 00000000..6a18ab79
--- /dev/null
+++ b/releases/1.32.2/_modules/aimet_torch/visualize_model.html
@@ -0,0 +1,1283 @@
+aimet_torch.visualize_model — AI Model Efficiency Toolkit Documentation: ver 1.32.2

Source code for aimet_torch.visualize_model

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019-2021, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Top level API for visualizing a pytorch model. """
+import os
+from typing import List
+import torch
+from bokeh import plotting
+from bokeh.layouts import column
+from aimet_torch import plotting_utils
+from aimet_torch.utils import get_layer_by_name
+
+
+
[docs]def visualize_changes_after_optimization( + old_model: torch.nn.Module, + new_model: torch.nn.Module, + results_dir: str, + selected_layers: List = None +) -> List[plotting.figure]: + """ + Visualizes changes before and after some optimization has been applied to a model. + + :param old_model: pytorch model before optimization + :param new_model: pytorch model after optimization + :param results_dir: Directory to save the Bokeh plots + :param selected_layers: a list of layers a user can choose to have visualized. If selected layers is None, + all Linear and Conv layers will be visualized. + :return: A list of bokeh plots + """ + file_path = os.path.join(results_dir, 'visualize_changes_after_optimization.html') + plotting.output_file(file_path) + subplots = [] + if selected_layers: + for name, module in new_model.named_modules(): + if name in selected_layers and hasattr(module, "weight"): + old_model_module = get_layer_by_name(old_model, name) + new_model_module = module + subplots.append( + plotting_utils.visualize_changes_after_optimization_single_layer( + name, old_model_module, new_model_module + ) + ) + + else: + for name, module in new_model.named_modules(): + if hasattr(module, "weight") and\ + isinstance(module, (torch.nn.modules.conv.Conv2d, torch.nn.modules.linear.Linear)): + old_model_module = get_layer_by_name(old_model, name) + new_model_module = module + subplots.append( + plotting_utils.visualize_changes_after_optimization_single_layer( + name, old_model_module, new_model_module + ) + ) + plotting.save(column(subplots)) + return subplots
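A minimal way to drive visualize_changes_after_optimization is sketched below; it is not part of the module source. The "optimized" model is just a deep copy standing in for a model produced by an AIMET optimization such as cross-layer equalization, and torchvision's resnet18 plus the results directory are assumptions for illustration.

import copy
import os
from torchvision.models import resnet18
from aimet_torch.visualize_model import visualize_changes_after_optimization

results_dir = '/tmp/aimet_visualizations'      # assumed output location for the Bokeh HTML
os.makedirs(results_dir, exist_ok=True)

old_model = resnet18(pretrained=True).eval()
new_model = copy.deepcopy(old_model)           # placeholder for a genuinely optimized model

# Plot every Conv/Linear layer, or restrict the report to a few layers of interest.
plots = visualize_changes_after_optimization(old_model, new_model, results_dir)
plots = visualize_changes_after_optimization(old_model, new_model, results_dir,
                                             selected_layers=['layer1.0.conv1'])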
+ + +
+def visualize_weight_ranges(
+        model: torch.nn.Module,
+        results_dir: str,
+        selected_layers: List = None
+) -> List[plotting.figure]:
+    """
+    Visualizes weight ranges for each layer through a scatter plot showing mean plotted against the standard deviation,
+    the minimum plotted against the max, and a line plot with min, max, and mean for each output channel.
+
+    :param model: pytorch model
+    :param selected_layers: a list of layers a user can choose to have visualized. If selected layers is None,
+        all Linear and Conv layers will be visualized.
+    :param results_dir: Directory to save the Bokeh plots
+    :return: A list of bokeh plots
+    """
+
+    file_path = os.path.join(results_dir, 'visualize_weight_ranges.html')
+    plotting.output_file(file_path)
+    subplots = []
+    if selected_layers:
+        for name, module in model.named_modules():
+            if name in selected_layers and hasattr(module, "weight"):
+                subplots.append(plotting_utils.visualize_weight_ranges_single_layer(module, name))
+    else:
+        for name, module in model.named_modules():
+            if hasattr(module, "weight") and \
+                    isinstance(module, (torch.nn.modules.conv.Conv2d, torch.nn.modules.linear.Linear)):
+                subplots.append(plotting_utils.visualize_weight_ranges_single_layer(module, name))
+
+    plotting.save(column(subplots))
+    return subplots
+
+
+def visualize_relative_weight_ranges_to_identify_problematic_layers(
+        model: torch.nn.Module,
+        results_dir: str,
+        selected_layers: List = None
+) -> List[plotting.figure]:
+    """
+    For each of the selected layers, publishes a line plot showing weight ranges for each layer, summary statistics
+    for relative weight ranges, and a histogram showing weight ranges of output channels
+    with respect to the minimum weight range.
+
+    :param model: pytorch model
+    :param results_dir: Directory to save the Bokeh plots
+    :param selected_layers: a list of layers a user can choose to have visualized. If selected layers is None,
+        all Linear and Conv layers will be visualized.
+    :return: A list of bokeh plots
+    """
+
+    file_path = os.path.join(results_dir, 'visualize_relative_weight_ranges_to_identify_problematic_layers.html')
+    plotting.output_file(file_path)
+    subplots = []
+    # layer name -> module weights data frame mapping
+    if not selected_layers:
+        for name, module in model.named_modules():
+            if hasattr(module, "weight") and \
+                    isinstance(module, (torch.nn.modules.conv.Conv2d, torch.nn.modules.linear.Linear)):
+                subplots.append(plotting_utils.visualize_relative_weight_ranges_single_layer(module, name))
+    else:
+        for name, module in model.named_modules():
+            if hasattr(module, "weight") and \
+                    isinstance(module, (torch.nn.modules.conv.Conv2d, torch.nn.modules.linear.Linear)) and \
+                    name in selected_layers:
+                subplots.append(plotting_utils.visualize_relative_weight_ranges_single_layer(module, name))
+
+    plotting.save(column(subplots))
+    return subplots
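The two weight-range utilities above share the same calling pattern; the hypothetical sketch below runs both on a toy network. As before, the aimet_torch.visualize_model import path, the model, and the output directory are illustrative assumptions.

import os

import torch

# Assumed import path for the functions listed above; adjust to your AIMET release.
from aimet_torch import visualize_model

results_dir = '/tmp/aimet_weight_range_plots'      # placeholder output directory
os.makedirs(results_dir, exist_ok=True)

# Any model with Conv2d/Linear layers works; a toy network keeps the sketch small.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, kernel_size=3),
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 32, kernel_size=3),
    torch.nn.Flatten(),
    torch.nn.Linear(32 * 28 * 28, 10),
)

# Per-layer scatter/line plots of weight statistics, saved to visualize_weight_ranges.html.
visualize_model.visualize_weight_ranges(model, results_dir)

# Flags layers whose relative per-channel weight ranges are unusually wide, a common
# symptom of layers that are hard to quantize.
visualize_model.visualize_relative_weight_ranges_to_identify_problematic_layers(model, results_dir)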
diff --git a/releases/1.32.2/_modules/aimet_torch/visualize_serialized_data.html b/releases/1.32.2/_modules/aimet_torch/visualize_serialized_data.html
new file mode 100644
index 00000000..0f20a751
--- /dev/null
+++ b/releases/1.32.2/_modules/aimet_torch/visualize_serialized_data.html
+aimet_torch.visualize_serialized_data — AI Model Efficiency Toolkit Documentation: ver 1.32.2
Source code for aimet_torch.visualize_serialized_data

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2019, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+
+""" Class for visualizing after compression is completed"""
+import pickle
+import pandas as pd
+from bokeh.models import ColumnDataSource, DataTable, TableColumn
+from aimet_common.compression_algo import CompressionAlgo
+from aimet_common.bokeh_plots import BokehServerSession
+from aimet_common import plotting_utils
+
+
+
+class VisualizeCompression:
+    """ Updates bokeh server session document and publishes graphs/tables to the server with session id compression. """
+
+    def __init__(self, visualization_url):
+        self.bokeh_session = BokehServerSession(visualization_url, session_id="compression")
+        self.__document = self.bokeh_session.document
+
+    def display_eval_scores(self, saved_eval_scores_dict_path):
+        """
+        Publishes the evaluation scores table to the server.
+
+        :param saved_eval_scores_dict_path: file path to the evaluation scores for each layer
+        :return: None
+        """
+        with open(saved_eval_scores_dict_path, 'rb') as infile:
+            eval_scores_dict = pickle.load(infile)
+
+        eval_scores_data_frame = pd.DataFrame.from_dict(eval_scores_dict).T
+        eval_scores_data_frame.columns = eval_scores_data_frame.columns.map(str)
+        eval_scores_data_frame.insert(0, 'layers', eval_scores_data_frame.index)
+
+        source = ColumnDataSource(data=eval_scores_data_frame)
+        columns = [TableColumn(field=Ci, title=Ci) for Ci in eval_scores_data_frame.columns]  # bokeh columns
+        eval_scores_data_table = DataTable(source=source, columns=columns, width=1500)
+
+        self.__document.add_root(eval_scores_data_table)
+
+    def display_comp_ratio_plot(self, comp_ratio_list_path):
+        """
+        Publishes the optimal compression ratios to the server.
+
+        :param comp_ratio_list_path: Path to the pkl file with compression ratios for each layer
+        :return: None
+        """
+        layer_comp_ratio_list = CompressionAlgo.unpickle_comp_ratios_list(comp_ratio_list_path=comp_ratio_list_path)
+
+        # visualize comp ratios vs layers in a plot and add it to a server session document.
+        comp_ratios = []
+        layer_names = []
+        for layer_name, comp_ratio in layer_comp_ratio_list:
+            comp_ratios.append(comp_ratio)
+            layer_names.append(layer_name)
+
+        plot = plotting_utils.plot_optimal_compression_ratios(comp_ratios, layer_names)
+        self.__document.add_root(plot)
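Putting the class together, a hypothetical driver might look like the sketch below. It assumes a Bokeh server session is reachable at the given URL and that the two pickle files were produced by an earlier AIMET compression run; the URL and all file paths are placeholders.

from aimet_torch.visualize_serialized_data import VisualizeCompression

# URL of the Bokeh server used for AIMET visualization (placeholder).
visualization_url = 'http://localhost:5006/'

# Artifacts serialized by a previous compression run (placeholder paths).
eval_scores_path = './data/greedy_selection_eval_scores_dict.pkl'
comp_ratios_path = './data/greedy_selection_comp_ratio_list.pkl'

compression_visualizer = VisualizeCompression(visualization_url)

# Table of per-layer evaluation scores for each candidate compression ratio.
compression_visualizer.display_eval_scores(eval_scores_path)

# Plot of the optimal compression ratio selected for each layer.
compression_visualizer.display_comp_ratio_plot(comp_ratios_path)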
diff --git a/releases/1.32.2/_modules/index.html b/releases/1.32.2/_modules/index.html
new file mode 100644
index 00000000..a18828fa
--- /dev/null
+++ b/releases/1.32.2/_modules/index.html
+Overview: module code — AI Model Efficiency Toolkit Documentation: ver 1.32.2
diff --git a/releases/1.32.2/_static/_sphinx_javascript_frameworks_compat.js b/releases/1.32.2/_static/_sphinx_javascript_frameworks_compat.js
diff --git a/releases/1.32.2/_static/basic.css b/releases/1.32.2/_static/basic.css
diff --git a/releases/1.32.2/_static/brain_logo.png b/releases/1.32.2/_static/brain_logo.png
diff --git a/releases/1.32.2/_static/css/badge_only.css b/releases/1.32.2/_static/css/badge_only.css
diff --git a/releases/1.32.2/_static/css/fonts/Roboto-Slab-Bold.woff b/releases/1.32.2/_static/css/fonts/Roboto-Slab-Bold.woff
diff --git a/releases/1.32.2/_static/css/fonts/Roboto-Slab-Bold.woff2 b/releases/1.32.2/_static/css/fonts/Roboto-Slab-Bold.woff2
diff --git a/releases/1.32.2/_static/css/fonts/Roboto-Slab-Regular.woff b/releases/1.32.2/_static/css/fonts/Roboto-Slab-Regular.woff
diff --git a/releases/1.32.2/_static/css/fonts/Roboto-Slab-Regular.woff2 b/releases/1.32.2/_static/css/fonts/Roboto-Slab-Regular.woff2
diff --git a/releases/1.32.2/_static/css/fonts/fontawesome-webfont.eot b/releases/1.32.2/_static/css/fonts/fontawesome-webfont.eot
diff --git a/releases/1.32.2/_static/css/fonts/fontawesome-webfont.svg b/releases/1.32.2/_static/css/fonts/fontawesome-webfont.svg
diff --git a/releases/1.32.2/_static/css/fonts/fontawesome-webfont.ttf b/releases/1.32.2/_static/css/fonts/fontawesome-webfont.ttf
diff --git a/releases/1.32.2/_static/css/fonts/fontawesome-webfont.woff b/releases/1.32.2/_static/css/fonts/fontawesome-webfont.woff
diff --git a/releases/1.32.2/_static/css/fonts/fontawesome-webfont.woff2 b/releases/1.32.2/_static/css/fonts/fontawesome-webfont.woff2
diff --git a/releases/1.32.2/_static/css/fonts/lato-bold-italic.woff b/releases/1.32.2/_static/css/fonts/lato-bold-italic.woff
diff --git a/releases/1.32.2/_static/css/fonts/lato-bold-italic.woff2 b/releases/1.32.2/_static/css/fonts/lato-bold-italic.woff2
diff --git a/releases/1.32.2/_static/css/fonts/lato-bold.woff b/releases/1.32.2/_static/css/fonts/lato-bold.woff
diff --git a/releases/1.32.2/_static/css/fonts/lato-bold.woff2 b/releases/1.32.2/_static/css/fonts/lato-bold.woff2
diff --git a/releases/1.32.2/_static/css/fonts/lato-normal-italic.woff b/releases/1.32.2/_static/css/fonts/lato-normal-italic.woff
diff --git a/releases/1.32.2/_static/css/fonts/lato-normal-italic.woff2 b/releases/1.32.2/_static/css/fonts/lato-normal-italic.woff2
diff --git a/releases/1.32.2/_static/css/fonts/lato-normal.woff b/releases/1.32.2/_static/css/fonts/lato-normal.woff
diff --git a/releases/1.32.2/_static/css/fonts/lato-normal.woff2 b/releases/1.32.2/_static/css/fonts/lato-normal.woff2
diff --git a/releases/1.32.2/_static/css/theme.css b/releases/1.32.2/_static/css/theme.css
vimeo-square:before{content:""}.fa-try:before,.fa-turkish-lira:before{content:""}.fa-plus-square-o:before,.wy-menu-vertical li button.toctree-expand:before{content:""}.fa-space-shuttle:before{content:""}.fa-slack:before{content:""}.fa-envelope-square:before{content:""}.fa-wordpress:before{content:""}.fa-openid:before{content:""}.fa-bank:before,.fa-institution:before,.fa-university:before{content:""}.fa-graduation-cap:before,.fa-mortar-board:before{content:""}.fa-yahoo:before{content:""}.fa-google:before{content:""}.fa-reddit:before{content:""}.fa-reddit-square:before{content:""}.fa-stumbleupon-circle:before{content:""}.fa-stumbleupon:before{content:""}.fa-delicious:before{content:""}.fa-digg:before{content:""}.fa-pied-piper-pp:before{content:""}.fa-pied-piper-alt:before{content:""}.fa-drupal:before{content:""}.fa-joomla:before{content:""}.fa-language:before{content:""}.fa-fax:before{content:""}.fa-building:before{content:""}.fa-child:before{content:""}.fa-paw:before{content:""}.fa-spoon:before{content:""}.fa-cube:before{content:""}.fa-cubes:before{content:""}.fa-behance:before{content:""}.fa-behance-square:before{content:""}.fa-steam:before{content:""}.fa-steam-square:before{content:""}.fa-recycle:before{content:""}.fa-automobile:before,.fa-car:before{content:""}.fa-cab:before,.fa-taxi:before{content:""}.fa-tree:before{content:""}.fa-spotify:before{content:""}.fa-deviantart:before{content:""}.fa-soundcloud:before{content:""}.fa-database:before{content:""}.fa-file-pdf-o:before{content:""}.fa-file-word-o:before{content:""}.fa-file-excel-o:before{content:""}.fa-file-powerpoint-o:before{content:""}.fa-file-image-o:before,.fa-file-photo-o:before,.fa-file-picture-o:before{content:""}.fa-file-archive-o:before,.fa-file-zip-o:before{content:""}.fa-file-audio-o:before,.fa-file-sound-o:before{content:""}.fa-file-movie-o:before,.fa-file-video-o:before{content:""}.fa-file-code-o:before{content:""}.fa-vine:before{content:""}.fa-codepen:before{content:""}.fa-jsfiddle:before{content:""}.fa-life-bouy:before,.fa-life-buoy:before,.fa-life-ring:before,.fa-life-saver:before,.fa-support:before{content:""}.fa-circle-o-notch:before{content:""}.fa-ra:before,.fa-rebel:before,.fa-resistance:before{content:""}.fa-empire:before,.fa-ge:before{content:""}.fa-git-square:before{content:""}.fa-git:before{content:""}.fa-hacker-news:before,.fa-y-combinator-square:before,.fa-yc-square:before{content:""}.fa-tencent-weibo:before{content:""}.fa-qq:before{content:""}.fa-wechat:before,.fa-weixin:before{content:""}.fa-paper-plane:before,.fa-send:before{content:""}.fa-paper-plane-o:before,.fa-send-o:before{content:""}.fa-history:before{content:""}.fa-circle-thin:before{content:""}.fa-header:before{content:""}.fa-paragraph:before{content:""}.fa-sliders:before{content:""}.fa-share-alt:before{content:""}.fa-share-alt-square:before{content:""}.fa-bomb:before{content:""}.fa-futbol-o:before,.fa-soccer-ball-o:before{content:""}.fa-tty:before{content:""}.fa-binoculars:before{content:""}.fa-plug:before{content:""}.fa-slideshare:before{content:""}.fa-twitch:before{content:""}.fa-yelp:before{content:""}.fa-newspaper-o:before{content:""}.fa-wifi:before{content:""}.fa-calculator:before{content:""}.fa-paypal:before{content:""}.fa-google-wallet:before{content:""}.fa-cc-visa:before{content:""}.fa-cc-mastercard:before{content:""}.fa-cc-discover:before{content:""}.fa-cc-amex:before{content:""}.fa-cc-paypal:before{content:""}.fa-cc-stripe:before{content:""}.fa-b
ell-slash:before{content:""}.fa-bell-slash-o:before{content:""}.fa-trash:before{content:""}.fa-copyright:before{content:""}.fa-at:before{content:""}.fa-eyedropper:before{content:""}.fa-paint-brush:before{content:""}.fa-birthday-cake:before{content:""}.fa-area-chart:before{content:""}.fa-pie-chart:before{content:""}.fa-line-chart:before{content:""}.fa-lastfm:before{content:""}.fa-lastfm-square:before{content:""}.fa-toggle-off:before{content:""}.fa-toggle-on:before{content:""}.fa-bicycle:before{content:""}.fa-bus:before{content:""}.fa-ioxhost:before{content:""}.fa-angellist:before{content:""}.fa-cc:before{content:""}.fa-ils:before,.fa-shekel:before,.fa-sheqel:before{content:""}.fa-meanpath:before{content:""}.fa-buysellads:before{content:""}.fa-connectdevelop:before{content:""}.fa-dashcube:before{content:""}.fa-forumbee:before{content:""}.fa-leanpub:before{content:""}.fa-sellsy:before{content:""}.fa-shirtsinbulk:before{content:""}.fa-simplybuilt:before{content:""}.fa-skyatlas:before{content:""}.fa-cart-plus:before{content:""}.fa-cart-arrow-down:before{content:""}.fa-diamond:before{content:""}.fa-ship:before{content:""}.fa-user-secret:before{content:""}.fa-motorcycle:before{content:""}.fa-street-view:before{content:""}.fa-heartbeat:before{content:""}.fa-venus:before{content:""}.fa-mars:before{content:""}.fa-mercury:before{content:""}.fa-intersex:before,.fa-transgender:before{content:""}.fa-transgender-alt:before{content:""}.fa-venus-double:before{content:""}.fa-mars-double:before{content:""}.fa-venus-mars:before{content:""}.fa-mars-stroke:before{content:""}.fa-mars-stroke-v:before{content:""}.fa-mars-stroke-h:before{content:""}.fa-neuter:before{content:""}.fa-genderless:before{content:""}.fa-facebook-official:before{content:""}.fa-pinterest-p:before{content:""}.fa-whatsapp:before{content:""}.fa-server:before{content:""}.fa-user-plus:before{content:""}.fa-user-times:before{content:""}.fa-bed:before,.fa-hotel:before{content:""}.fa-viacoin:before{content:""}.fa-train:before{content:""}.fa-subway:before{content:""}.fa-medium:before{content:""}.fa-y-combinator:before,.fa-yc:before{content:""}.fa-optin-monster:before{content:""}.fa-opencart:before{content:""}.fa-expeditedssl:before{content:""}.fa-battery-4:before,.fa-battery-full:before,.fa-battery:before{content:""}.fa-battery-3:before,.fa-battery-three-quarters:before{content:""}.fa-battery-2:before,.fa-battery-half:before{content:""}.fa-battery-1:before,.fa-battery-quarter:before{content:""}.fa-battery-0:before,.fa-battery-empty:before{content:""}.fa-mouse-pointer:before{content:""}.fa-i-cursor:before{content:""}.fa-object-group:before{content:""}.fa-object-ungroup:before{content:""}.fa-sticky-note:before{content:""}.fa-sticky-note-o:before{content:""}.fa-cc-jcb:before{content:""}.fa-cc-diners-club:before{content:""}.fa-clone:before{content:""}.fa-balance-scale:before{content:""}.fa-hourglass-o:before{content:""}.fa-hourglass-1:before,.fa-hourglass-start:before{content:""}.fa-hourglass-2:before,.fa-hourglass-half:before{content:""}.fa-hourglass-3:before,.fa-hourglass-end:before{content:""}.fa-hourglass:before{content:""}.fa-hand-grab-o:before,.fa-hand-rock-o:before{content:""}.fa-hand-paper-o:before,.fa-hand-stop-o:before{content:""}.fa-hand-scissors-o:before{content:""}.fa-hand-lizard-o:before{content:""}.fa-hand-spock-o:before{content:""}.fa-hand-pointer-o:before{content:""}.fa-hand-peace-o:before{content:""}.fa-trademark:before{content:""}.fa-register
ed:before{content:""}.fa-creative-commons:before{content:""}.fa-gg:before{content:""}.fa-gg-circle:before{content:""}.fa-tripadvisor:before{content:""}.fa-odnoklassniki:before{content:""}.fa-odnoklassniki-square:before{content:""}.fa-get-pocket:before{content:""}.fa-wikipedia-w:before{content:""}.fa-safari:before{content:""}.fa-chrome:before{content:""}.fa-firefox:before{content:""}.fa-opera:before{content:""}.fa-internet-explorer:before{content:""}.fa-television:before,.fa-tv:before{content:""}.fa-contao:before{content:""}.fa-500px:before{content:""}.fa-amazon:before{content:""}.fa-calendar-plus-o:before{content:""}.fa-calendar-minus-o:before{content:""}.fa-calendar-times-o:before{content:""}.fa-calendar-check-o:before{content:""}.fa-industry:before{content:""}.fa-map-pin:before{content:""}.fa-map-signs:before{content:""}.fa-map-o:before{content:""}.fa-map:before{content:""}.fa-commenting:before{content:""}.fa-commenting-o:before{content:""}.fa-houzz:before{content:""}.fa-vimeo:before{content:""}.fa-black-tie:before{content:""}.fa-fonticons:before{content:""}.fa-reddit-alien:before{content:""}.fa-edge:before{content:""}.fa-credit-card-alt:before{content:""}.fa-codiepie:before{content:""}.fa-modx:before{content:""}.fa-fort-awesome:before{content:""}.fa-usb:before{content:""}.fa-product-hunt:before{content:""}.fa-mixcloud:before{content:""}.fa-scribd:before{content:""}.fa-pause-circle:before{content:""}.fa-pause-circle-o:before{content:""}.fa-stop-circle:before{content:""}.fa-stop-circle-o:before{content:""}.fa-shopping-bag:before{content:""}.fa-shopping-basket:before{content:""}.fa-hashtag:before{content:""}.fa-bluetooth:before{content:""}.fa-bluetooth-b:before{content:""}.fa-percent:before{content:""}.fa-gitlab:before,.icon-gitlab:before{content:""}.fa-wpbeginner:before{content:""}.fa-wpforms:before{content:""}.fa-envira:before{content:""}.fa-universal-access:before{content:""}.fa-wheelchair-alt:before{content:""}.fa-question-circle-o:before{content:""}.fa-blind:before{content:""}.fa-audio-description:before{content:""}.fa-volume-control-phone:before{content:""}.fa-braille:before{content:""}.fa-assistive-listening-systems:before{content:""}.fa-american-sign-language-interpreting:before,.fa-asl-interpreting:before{content:""}.fa-deaf:before,.fa-deafness:before,.fa-hard-of-hearing:before{content:""}.fa-glide:before{content:""}.fa-glide-g:before{content:""}.fa-sign-language:before,.fa-signing:before{content:""}.fa-low-vision:before{content:""}.fa-viadeo:before{content:""}.fa-viadeo-square:before{content:""}.fa-snapchat:before{content:""}.fa-snapchat-ghost:before{content:""}.fa-snapchat-square:before{content:""}.fa-pied-piper:before{content:""}.fa-first-order:before{content:""}.fa-yoast:before{content:""}.fa-themeisle:before{content:""}.fa-google-plus-circle:before,.fa-google-plus-official:before{content:""}.fa-fa:before,.fa-font-awesome:before{content:""}.fa-handshake-o:before{content:""}.fa-envelope-open:before{content:""}.fa-envelope-open-o:before{content:""}.fa-linode:before{content:""}.fa-address-book:before{content:""}.fa-address-book-o:before{content:""}.fa-address-card:before,.fa-vcard:before{content:""}.fa-address-card-o:before,.fa-vcard-o:before{content:""}.fa-user-circle:before{content:""}.fa-user-circle-o:before{content:""}.fa-user-o:before{content:""}.fa-id-badge:before{content:""}.fa-drivers-license:before,.fa-id-card:before{content:""}.fa-drivers-license-o:before,.fa-id-card-o:before{c
ontent:""}.fa-quora:before{content:""}.fa-free-code-camp:before{content:""}.fa-telegram:before{content:""}.fa-thermometer-4:before,.fa-thermometer-full:before,.fa-thermometer:before{content:""}.fa-thermometer-3:before,.fa-thermometer-three-quarters:before{content:""}.fa-thermometer-2:before,.fa-thermometer-half:before{content:""}.fa-thermometer-1:before,.fa-thermometer-quarter:before{content:""}.fa-thermometer-0:before,.fa-thermometer-empty:before{content:""}.fa-shower:before{content:""}.fa-bath:before,.fa-bathtub:before,.fa-s15:before{content:""}.fa-podcast:before{content:""}.fa-window-maximize:before{content:""}.fa-window-minimize:before{content:""}.fa-window-restore:before{content:""}.fa-times-rectangle:before,.fa-window-close:before{content:""}.fa-times-rectangle-o:before,.fa-window-close-o:before{content:""}.fa-bandcamp:before{content:""}.fa-grav:before{content:""}.fa-etsy:before{content:""}.fa-imdb:before{content:""}.fa-ravelry:before{content:""}.fa-eercast:before{content:""}.fa-microchip:before{content:""}.fa-snowflake-o:before{content:""}.fa-superpowers:before{content:""}.fa-wpexplorer:before{content:""}.fa-meetup:before{content:""}.sr-only{position:absolute;width:1px;height:1px;padding:0;margin:-1px;overflow:hidden;clip:rect(0,0,0,0);border:0}.sr-only-focusable:active,.sr-only-focusable:focus{position:static;width:auto;height:auto;margin:0;overflow:visible;clip:auto}.fa,.icon,.rst-content .admonition-title,.rst-content .code-block-caption .headerlink,.rst-content .eqno .headerlink,.rst-content code.download span:first-child,.rst-content dl dt .headerlink,.rst-content h1 .headerlink,.rst-content h2 .headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 .headerlink,.rst-content p.caption .headerlink,.rst-content p .headerlink,.rst-content table>caption .headerlink,.rst-content tt.download span:first-child,.wy-dropdown .caret,.wy-inline-validate.wy-inline-validate-danger .wy-input-context,.wy-inline-validate.wy-inline-validate-info .wy-input-context,.wy-inline-validate.wy-inline-validate-success .wy-input-context,.wy-inline-validate.wy-inline-validate-warning .wy-input-context,.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a button.toctree-expand,.wy-menu-vertical li button.toctree-expand{font-family:inherit}.fa:before,.icon:before,.rst-content .admonition-title:before,.rst-content .code-block-caption .headerlink:before,.rst-content .eqno .headerlink:before,.rst-content code.download span:first-child:before,.rst-content dl dt .headerlink:before,.rst-content h1 .headerlink:before,.rst-content h2 .headerlink:before,.rst-content h3 .headerlink:before,.rst-content h4 .headerlink:before,.rst-content h5 .headerlink:before,.rst-content h6 .headerlink:before,.rst-content p.caption .headerlink:before,.rst-content p .headerlink:before,.rst-content table>caption .headerlink:before,.rst-content tt.download span:first-child:before,.wy-dropdown .caret:before,.wy-inline-validate.wy-inline-validate-danger .wy-input-context:before,.wy-inline-validate.wy-inline-validate-info .wy-input-context:before,.wy-inline-validate.wy-inline-validate-success .wy-input-context:before,.wy-inline-validate.wy-inline-validate-warning .wy-input-context:before,.wy-menu-vertical li.current>a button.toctree-expand:before,.wy-menu-vertical li.on a button.toctree-expand:before,.wy-menu-vertical li 
button.toctree-expand:before{font-family:FontAwesome;display:inline-block;font-style:normal;font-weight:400;line-height:1;text-decoration:inherit}.rst-content .code-block-caption a .headerlink,.rst-content .eqno a .headerlink,.rst-content a .admonition-title,.rst-content code.download a span:first-child,.rst-content dl dt a .headerlink,.rst-content h1 a .headerlink,.rst-content h2 a .headerlink,.rst-content h3 a .headerlink,.rst-content h4 a .headerlink,.rst-content h5 a .headerlink,.rst-content h6 a .headerlink,.rst-content p.caption a .headerlink,.rst-content p a .headerlink,.rst-content table>caption a .headerlink,.rst-content tt.download a span:first-child,.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a button.toctree-expand,.wy-menu-vertical li a button.toctree-expand,a .fa,a .icon,a .rst-content .admonition-title,a .rst-content .code-block-caption .headerlink,a .rst-content .eqno .headerlink,a .rst-content code.download span:first-child,a .rst-content dl dt .headerlink,a .rst-content h1 .headerlink,a .rst-content h2 .headerlink,a .rst-content h3 .headerlink,a .rst-content h4 .headerlink,a .rst-content h5 .headerlink,a .rst-content h6 .headerlink,a .rst-content p.caption .headerlink,a .rst-content p .headerlink,a .rst-content table>caption .headerlink,a .rst-content tt.download span:first-child,a .wy-menu-vertical li button.toctree-expand{display:inline-block;text-decoration:inherit}.btn .fa,.btn .icon,.btn .rst-content .admonition-title,.btn .rst-content .code-block-caption .headerlink,.btn .rst-content .eqno .headerlink,.btn .rst-content code.download span:first-child,.btn .rst-content dl dt .headerlink,.btn .rst-content h1 .headerlink,.btn .rst-content h2 .headerlink,.btn .rst-content h3 .headerlink,.btn .rst-content h4 .headerlink,.btn .rst-content h5 .headerlink,.btn .rst-content h6 .headerlink,.btn .rst-content p .headerlink,.btn .rst-content table>caption .headerlink,.btn .rst-content tt.download span:first-child,.btn .wy-menu-vertical li.current>a button.toctree-expand,.btn .wy-menu-vertical li.on a button.toctree-expand,.btn .wy-menu-vertical li button.toctree-expand,.nav .fa,.nav .icon,.nav .rst-content .admonition-title,.nav .rst-content .code-block-caption .headerlink,.nav .rst-content .eqno .headerlink,.nav .rst-content code.download span:first-child,.nav .rst-content dl dt .headerlink,.nav .rst-content h1 .headerlink,.nav .rst-content h2 .headerlink,.nav .rst-content h3 .headerlink,.nav .rst-content h4 .headerlink,.nav .rst-content h5 .headerlink,.nav .rst-content h6 .headerlink,.nav .rst-content p .headerlink,.nav .rst-content table>caption .headerlink,.nav .rst-content tt.download span:first-child,.nav .wy-menu-vertical li.current>a button.toctree-expand,.nav .wy-menu-vertical li.on a button.toctree-expand,.nav .wy-menu-vertical li button.toctree-expand,.rst-content .btn .admonition-title,.rst-content .code-block-caption .btn .headerlink,.rst-content .code-block-caption .nav .headerlink,.rst-content .eqno .btn .headerlink,.rst-content .eqno .nav .headerlink,.rst-content .nav .admonition-title,.rst-content code.download .btn span:first-child,.rst-content code.download .nav span:first-child,.rst-content dl dt .btn .headerlink,.rst-content dl dt .nav .headerlink,.rst-content h1 .btn .headerlink,.rst-content h1 .nav .headerlink,.rst-content h2 .btn .headerlink,.rst-content h2 .nav .headerlink,.rst-content h3 .btn .headerlink,.rst-content h3 .nav .headerlink,.rst-content h4 .btn .headerlink,.rst-content h4 .nav .headerlink,.rst-content h5 .btn 
.headerlink,.rst-content h5 .nav .headerlink,.rst-content h6 .btn .headerlink,.rst-content h6 .nav .headerlink,.rst-content p .btn .headerlink,.rst-content p .nav .headerlink,.rst-content table>caption .btn .headerlink,.rst-content table>caption .nav .headerlink,.rst-content tt.download .btn span:first-child,.rst-content tt.download .nav span:first-child,.wy-menu-vertical li .btn button.toctree-expand,.wy-menu-vertical li.current>a .btn button.toctree-expand,.wy-menu-vertical li.current>a .nav button.toctree-expand,.wy-menu-vertical li .nav button.toctree-expand,.wy-menu-vertical li.on a .btn button.toctree-expand,.wy-menu-vertical li.on a .nav button.toctree-expand{display:inline}.btn .fa-large.icon,.btn .fa.fa-large,.btn .rst-content .code-block-caption .fa-large.headerlink,.btn .rst-content .eqno .fa-large.headerlink,.btn .rst-content .fa-large.admonition-title,.btn .rst-content code.download span.fa-large:first-child,.btn .rst-content dl dt .fa-large.headerlink,.btn .rst-content h1 .fa-large.headerlink,.btn .rst-content h2 .fa-large.headerlink,.btn .rst-content h3 .fa-large.headerlink,.btn .rst-content h4 .fa-large.headerlink,.btn .rst-content h5 .fa-large.headerlink,.btn .rst-content h6 .fa-large.headerlink,.btn .rst-content p .fa-large.headerlink,.btn .rst-content table>caption .fa-large.headerlink,.btn .rst-content tt.download span.fa-large:first-child,.btn .wy-menu-vertical li button.fa-large.toctree-expand,.nav .fa-large.icon,.nav .fa.fa-large,.nav .rst-content .code-block-caption .fa-large.headerlink,.nav .rst-content .eqno .fa-large.headerlink,.nav .rst-content .fa-large.admonition-title,.nav .rst-content code.download span.fa-large:first-child,.nav .rst-content dl dt .fa-large.headerlink,.nav .rst-content h1 .fa-large.headerlink,.nav .rst-content h2 .fa-large.headerlink,.nav .rst-content h3 .fa-large.headerlink,.nav .rst-content h4 .fa-large.headerlink,.nav .rst-content h5 .fa-large.headerlink,.nav .rst-content h6 .fa-large.headerlink,.nav .rst-content p .fa-large.headerlink,.nav .rst-content table>caption .fa-large.headerlink,.nav .rst-content tt.download span.fa-large:first-child,.nav .wy-menu-vertical li button.fa-large.toctree-expand,.rst-content .btn .fa-large.admonition-title,.rst-content .code-block-caption .btn .fa-large.headerlink,.rst-content .code-block-caption .nav .fa-large.headerlink,.rst-content .eqno .btn .fa-large.headerlink,.rst-content .eqno .nav .fa-large.headerlink,.rst-content .nav .fa-large.admonition-title,.rst-content code.download .btn span.fa-large:first-child,.rst-content code.download .nav span.fa-large:first-child,.rst-content dl dt .btn .fa-large.headerlink,.rst-content dl dt .nav .fa-large.headerlink,.rst-content h1 .btn .fa-large.headerlink,.rst-content h1 .nav .fa-large.headerlink,.rst-content h2 .btn .fa-large.headerlink,.rst-content h2 .nav .fa-large.headerlink,.rst-content h3 .btn .fa-large.headerlink,.rst-content h3 .nav .fa-large.headerlink,.rst-content h4 .btn .fa-large.headerlink,.rst-content h4 .nav .fa-large.headerlink,.rst-content h5 .btn .fa-large.headerlink,.rst-content h5 .nav .fa-large.headerlink,.rst-content h6 .btn .fa-large.headerlink,.rst-content h6 .nav .fa-large.headerlink,.rst-content p .btn .fa-large.headerlink,.rst-content p .nav .fa-large.headerlink,.rst-content table>caption .btn .fa-large.headerlink,.rst-content table>caption .nav .fa-large.headerlink,.rst-content tt.download .btn span.fa-large:first-child,.rst-content tt.download .nav span.fa-large:first-child,.wy-menu-vertical li .btn 
button.fa-large.toctree-expand,.wy-menu-vertical li .nav button.fa-large.toctree-expand{line-height:.9em}.btn .fa-spin.icon,.btn .fa.fa-spin,.btn .rst-content .code-block-caption .fa-spin.headerlink,.btn .rst-content .eqno .fa-spin.headerlink,.btn .rst-content .fa-spin.admonition-title,.btn .rst-content code.download span.fa-spin:first-child,.btn .rst-content dl dt .fa-spin.headerlink,.btn .rst-content h1 .fa-spin.headerlink,.btn .rst-content h2 .fa-spin.headerlink,.btn .rst-content h3 .fa-spin.headerlink,.btn .rst-content h4 .fa-spin.headerlink,.btn .rst-content h5 .fa-spin.headerlink,.btn .rst-content h6 .fa-spin.headerlink,.btn .rst-content p .fa-spin.headerlink,.btn .rst-content table>caption .fa-spin.headerlink,.btn .rst-content tt.download span.fa-spin:first-child,.btn .wy-menu-vertical li button.fa-spin.toctree-expand,.nav .fa-spin.icon,.nav .fa.fa-spin,.nav .rst-content .code-block-caption .fa-spin.headerlink,.nav .rst-content .eqno .fa-spin.headerlink,.nav .rst-content .fa-spin.admonition-title,.nav .rst-content code.download span.fa-spin:first-child,.nav .rst-content dl dt .fa-spin.headerlink,.nav .rst-content h1 .fa-spin.headerlink,.nav .rst-content h2 .fa-spin.headerlink,.nav .rst-content h3 .fa-spin.headerlink,.nav .rst-content h4 .fa-spin.headerlink,.nav .rst-content h5 .fa-spin.headerlink,.nav .rst-content h6 .fa-spin.headerlink,.nav .rst-content p .fa-spin.headerlink,.nav .rst-content table>caption .fa-spin.headerlink,.nav .rst-content tt.download span.fa-spin:first-child,.nav .wy-menu-vertical li button.fa-spin.toctree-expand,.rst-content .btn .fa-spin.admonition-title,.rst-content .code-block-caption .btn .fa-spin.headerlink,.rst-content .code-block-caption .nav .fa-spin.headerlink,.rst-content .eqno .btn .fa-spin.headerlink,.rst-content .eqno .nav .fa-spin.headerlink,.rst-content .nav .fa-spin.admonition-title,.rst-content code.download .btn span.fa-spin:first-child,.rst-content code.download .nav span.fa-spin:first-child,.rst-content dl dt .btn .fa-spin.headerlink,.rst-content dl dt .nav .fa-spin.headerlink,.rst-content h1 .btn .fa-spin.headerlink,.rst-content h1 .nav .fa-spin.headerlink,.rst-content h2 .btn .fa-spin.headerlink,.rst-content h2 .nav .fa-spin.headerlink,.rst-content h3 .btn .fa-spin.headerlink,.rst-content h3 .nav .fa-spin.headerlink,.rst-content h4 .btn .fa-spin.headerlink,.rst-content h4 .nav .fa-spin.headerlink,.rst-content h5 .btn .fa-spin.headerlink,.rst-content h5 .nav .fa-spin.headerlink,.rst-content h6 .btn .fa-spin.headerlink,.rst-content h6 .nav .fa-spin.headerlink,.rst-content p .btn .fa-spin.headerlink,.rst-content p .nav .fa-spin.headerlink,.rst-content table>caption .btn .fa-spin.headerlink,.rst-content table>caption .nav .fa-spin.headerlink,.rst-content tt.download .btn span.fa-spin:first-child,.rst-content tt.download .nav span.fa-spin:first-child,.wy-menu-vertical li .btn button.fa-spin.toctree-expand,.wy-menu-vertical li .nav button.fa-spin.toctree-expand{display:inline-block}.btn.fa:before,.btn.icon:before,.rst-content .btn.admonition-title:before,.rst-content .code-block-caption .btn.headerlink:before,.rst-content .eqno .btn.headerlink:before,.rst-content code.download span.btn:first-child:before,.rst-content dl dt .btn.headerlink:before,.rst-content h1 .btn.headerlink:before,.rst-content h2 .btn.headerlink:before,.rst-content h3 .btn.headerlink:before,.rst-content h4 .btn.headerlink:before,.rst-content h5 .btn.headerlink:before,.rst-content h6 .btn.headerlink:before,.rst-content p .btn.headerlink:before,.rst-content table>caption 
.btn.headerlink:before,.rst-content tt.download span.btn:first-child:before,.wy-menu-vertical li button.btn.toctree-expand:before{opacity:.5;-webkit-transition:opacity .05s ease-in;-moz-transition:opacity .05s ease-in;transition:opacity .05s ease-in}.btn.fa:hover:before,.btn.icon:hover:before,.rst-content .btn.admonition-title:hover:before,.rst-content .code-block-caption .btn.headerlink:hover:before,.rst-content .eqno .btn.headerlink:hover:before,.rst-content code.download span.btn:first-child:hover:before,.rst-content dl dt .btn.headerlink:hover:before,.rst-content h1 .btn.headerlink:hover:before,.rst-content h2 .btn.headerlink:hover:before,.rst-content h3 .btn.headerlink:hover:before,.rst-content h4 .btn.headerlink:hover:before,.rst-content h5 .btn.headerlink:hover:before,.rst-content h6 .btn.headerlink:hover:before,.rst-content p .btn.headerlink:hover:before,.rst-content table>caption .btn.headerlink:hover:before,.rst-content tt.download span.btn:first-child:hover:before,.wy-menu-vertical li button.btn.toctree-expand:hover:before{opacity:1}.btn-mini .fa:before,.btn-mini .icon:before,.btn-mini .rst-content .admonition-title:before,.btn-mini .rst-content .code-block-caption .headerlink:before,.btn-mini .rst-content .eqno .headerlink:before,.btn-mini .rst-content code.download span:first-child:before,.btn-mini .rst-content dl dt .headerlink:before,.btn-mini .rst-content h1 .headerlink:before,.btn-mini .rst-content h2 .headerlink:before,.btn-mini .rst-content h3 .headerlink:before,.btn-mini .rst-content h4 .headerlink:before,.btn-mini .rst-content h5 .headerlink:before,.btn-mini .rst-content h6 .headerlink:before,.btn-mini .rst-content p .headerlink:before,.btn-mini .rst-content table>caption .headerlink:before,.btn-mini .rst-content tt.download span:first-child:before,.btn-mini .wy-menu-vertical li button.toctree-expand:before,.rst-content .btn-mini .admonition-title:before,.rst-content .code-block-caption .btn-mini .headerlink:before,.rst-content .eqno .btn-mini .headerlink:before,.rst-content code.download .btn-mini span:first-child:before,.rst-content dl dt .btn-mini .headerlink:before,.rst-content h1 .btn-mini .headerlink:before,.rst-content h2 .btn-mini .headerlink:before,.rst-content h3 .btn-mini .headerlink:before,.rst-content h4 .btn-mini .headerlink:before,.rst-content h5 .btn-mini .headerlink:before,.rst-content h6 .btn-mini .headerlink:before,.rst-content p .btn-mini .headerlink:before,.rst-content table>caption .btn-mini .headerlink:before,.rst-content tt.download .btn-mini span:first-child:before,.wy-menu-vertical li .btn-mini button.toctree-expand:before{font-size:14px;vertical-align:-15%}.rst-content .admonition,.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .danger,.rst-content .error,.rst-content .hint,.rst-content .important,.rst-content .note,.rst-content .seealso,.rst-content .tip,.rst-content .warning,.wy-alert{padding:12px;line-height:24px;margin-bottom:24px;background:#e7f2fa}.rst-content .admonition-title,.wy-alert-title{font-weight:700;display:block;color:#fff;background:#6ab0de;padding:6px 12px;margin:-12px -12px 12px}.rst-content .danger,.rst-content .error,.rst-content .wy-alert-danger.admonition,.rst-content .wy-alert-danger.admonition-todo,.rst-content .wy-alert-danger.attention,.rst-content .wy-alert-danger.caution,.rst-content .wy-alert-danger.hint,.rst-content .wy-alert-danger.important,.rst-content .wy-alert-danger.note,.rst-content .wy-alert-danger.seealso,.rst-content .wy-alert-danger.tip,.rst-content 
.wy-alert-danger.warning,.wy-alert.wy-alert-danger{background:#fdf3f2}.rst-content .danger .admonition-title,.rst-content .danger .wy-alert-title,.rst-content .error .admonition-title,.rst-content .error .wy-alert-title,.rst-content .wy-alert-danger.admonition-todo .admonition-title,.rst-content .wy-alert-danger.admonition-todo .wy-alert-title,.rst-content .wy-alert-danger.admonition .admonition-title,.rst-content .wy-alert-danger.admonition .wy-alert-title,.rst-content .wy-alert-danger.attention .admonition-title,.rst-content .wy-alert-danger.attention .wy-alert-title,.rst-content .wy-alert-danger.caution .admonition-title,.rst-content .wy-alert-danger.caution .wy-alert-title,.rst-content .wy-alert-danger.hint .admonition-title,.rst-content .wy-alert-danger.hint .wy-alert-title,.rst-content .wy-alert-danger.important .admonition-title,.rst-content .wy-alert-danger.important .wy-alert-title,.rst-content .wy-alert-danger.note .admonition-title,.rst-content .wy-alert-danger.note .wy-alert-title,.rst-content .wy-alert-danger.seealso .admonition-title,.rst-content .wy-alert-danger.seealso .wy-alert-title,.rst-content .wy-alert-danger.tip .admonition-title,.rst-content .wy-alert-danger.tip .wy-alert-title,.rst-content .wy-alert-danger.warning .admonition-title,.rst-content .wy-alert-danger.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-danger .admonition-title,.wy-alert.wy-alert-danger .rst-content .admonition-title,.wy-alert.wy-alert-danger .wy-alert-title{background:#f29f97}.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .warning,.rst-content .wy-alert-warning.admonition,.rst-content .wy-alert-warning.danger,.rst-content .wy-alert-warning.error,.rst-content .wy-alert-warning.hint,.rst-content .wy-alert-warning.important,.rst-content .wy-alert-warning.note,.rst-content .wy-alert-warning.seealso,.rst-content .wy-alert-warning.tip,.wy-alert.wy-alert-warning{background:#ffedcc}.rst-content .admonition-todo .admonition-title,.rst-content .admonition-todo .wy-alert-title,.rst-content .attention .admonition-title,.rst-content .attention .wy-alert-title,.rst-content .caution .admonition-title,.rst-content .caution .wy-alert-title,.rst-content .warning .admonition-title,.rst-content .warning .wy-alert-title,.rst-content .wy-alert-warning.admonition .admonition-title,.rst-content .wy-alert-warning.admonition .wy-alert-title,.rst-content .wy-alert-warning.danger .admonition-title,.rst-content .wy-alert-warning.danger .wy-alert-title,.rst-content .wy-alert-warning.error .admonition-title,.rst-content .wy-alert-warning.error .wy-alert-title,.rst-content .wy-alert-warning.hint .admonition-title,.rst-content .wy-alert-warning.hint .wy-alert-title,.rst-content .wy-alert-warning.important .admonition-title,.rst-content .wy-alert-warning.important .wy-alert-title,.rst-content .wy-alert-warning.note .admonition-title,.rst-content .wy-alert-warning.note .wy-alert-title,.rst-content .wy-alert-warning.seealso .admonition-title,.rst-content .wy-alert-warning.seealso .wy-alert-title,.rst-content .wy-alert-warning.tip .admonition-title,.rst-content .wy-alert-warning.tip .wy-alert-title,.rst-content .wy-alert.wy-alert-warning .admonition-title,.wy-alert.wy-alert-warning .rst-content .admonition-title,.wy-alert.wy-alert-warning .wy-alert-title{background:#f0b37e}.rst-content .note,.rst-content .seealso,.rst-content .wy-alert-info.admonition,.rst-content .wy-alert-info.admonition-todo,.rst-content .wy-alert-info.attention,.rst-content .wy-alert-info.caution,.rst-content 
.wy-alert-info.danger,.rst-content .wy-alert-info.error,.rst-content .wy-alert-info.hint,.rst-content .wy-alert-info.important,.rst-content .wy-alert-info.tip,.rst-content .wy-alert-info.warning,.wy-alert.wy-alert-info{background:#e7f2fa}.rst-content .note .admonition-title,.rst-content .note .wy-alert-title,.rst-content .seealso .admonition-title,.rst-content .seealso .wy-alert-title,.rst-content .wy-alert-info.admonition-todo .admonition-title,.rst-content .wy-alert-info.admonition-todo .wy-alert-title,.rst-content .wy-alert-info.admonition .admonition-title,.rst-content .wy-alert-info.admonition .wy-alert-title,.rst-content .wy-alert-info.attention .admonition-title,.rst-content .wy-alert-info.attention .wy-alert-title,.rst-content .wy-alert-info.caution .admonition-title,.rst-content .wy-alert-info.caution .wy-alert-title,.rst-content .wy-alert-info.danger .admonition-title,.rst-content .wy-alert-info.danger .wy-alert-title,.rst-content .wy-alert-info.error .admonition-title,.rst-content .wy-alert-info.error .wy-alert-title,.rst-content .wy-alert-info.hint .admonition-title,.rst-content .wy-alert-info.hint .wy-alert-title,.rst-content .wy-alert-info.important .admonition-title,.rst-content .wy-alert-info.important .wy-alert-title,.rst-content .wy-alert-info.tip .admonition-title,.rst-content .wy-alert-info.tip .wy-alert-title,.rst-content .wy-alert-info.warning .admonition-title,.rst-content .wy-alert-info.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-info .admonition-title,.wy-alert.wy-alert-info .rst-content .admonition-title,.wy-alert.wy-alert-info .wy-alert-title{background:#6ab0de}.rst-content .hint,.rst-content .important,.rst-content .tip,.rst-content .wy-alert-success.admonition,.rst-content .wy-alert-success.admonition-todo,.rst-content .wy-alert-success.attention,.rst-content .wy-alert-success.caution,.rst-content .wy-alert-success.danger,.rst-content .wy-alert-success.error,.rst-content .wy-alert-success.note,.rst-content .wy-alert-success.seealso,.rst-content .wy-alert-success.warning,.wy-alert.wy-alert-success{background:#dbfaf4}.rst-content .hint .admonition-title,.rst-content .hint .wy-alert-title,.rst-content .important .admonition-title,.rst-content .important .wy-alert-title,.rst-content .tip .admonition-title,.rst-content .tip .wy-alert-title,.rst-content .wy-alert-success.admonition-todo .admonition-title,.rst-content .wy-alert-success.admonition-todo .wy-alert-title,.rst-content .wy-alert-success.admonition .admonition-title,.rst-content .wy-alert-success.admonition .wy-alert-title,.rst-content .wy-alert-success.attention .admonition-title,.rst-content .wy-alert-success.attention .wy-alert-title,.rst-content .wy-alert-success.caution .admonition-title,.rst-content .wy-alert-success.caution .wy-alert-title,.rst-content .wy-alert-success.danger .admonition-title,.rst-content .wy-alert-success.danger .wy-alert-title,.rst-content .wy-alert-success.error .admonition-title,.rst-content .wy-alert-success.error .wy-alert-title,.rst-content .wy-alert-success.note .admonition-title,.rst-content .wy-alert-success.note .wy-alert-title,.rst-content .wy-alert-success.seealso .admonition-title,.rst-content .wy-alert-success.seealso .wy-alert-title,.rst-content .wy-alert-success.warning .admonition-title,.rst-content .wy-alert-success.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-success .admonition-title,.wy-alert.wy-alert-success .rst-content .admonition-title,.wy-alert.wy-alert-success .wy-alert-title{background:#1abc9c}.rst-content 
.wy-alert-neutral.admonition,.rst-content .wy-alert-neutral.admonition-todo,.rst-content .wy-alert-neutral.attention,.rst-content .wy-alert-neutral.caution,.rst-content .wy-alert-neutral.danger,.rst-content .wy-alert-neutral.error,.rst-content .wy-alert-neutral.hint,.rst-content .wy-alert-neutral.important,.rst-content .wy-alert-neutral.note,.rst-content .wy-alert-neutral.seealso,.rst-content .wy-alert-neutral.tip,.rst-content .wy-alert-neutral.warning,.wy-alert.wy-alert-neutral{background:#f3f6f6}.rst-content .wy-alert-neutral.admonition-todo .admonition-title,.rst-content .wy-alert-neutral.admonition-todo .wy-alert-title,.rst-content .wy-alert-neutral.admonition .admonition-title,.rst-content .wy-alert-neutral.admonition .wy-alert-title,.rst-content .wy-alert-neutral.attention .admonition-title,.rst-content .wy-alert-neutral.attention .wy-alert-title,.rst-content .wy-alert-neutral.caution .admonition-title,.rst-content .wy-alert-neutral.caution .wy-alert-title,.rst-content .wy-alert-neutral.danger .admonition-title,.rst-content .wy-alert-neutral.danger .wy-alert-title,.rst-content .wy-alert-neutral.error .admonition-title,.rst-content .wy-alert-neutral.error .wy-alert-title,.rst-content .wy-alert-neutral.hint .admonition-title,.rst-content .wy-alert-neutral.hint .wy-alert-title,.rst-content .wy-alert-neutral.important .admonition-title,.rst-content .wy-alert-neutral.important .wy-alert-title,.rst-content .wy-alert-neutral.note .admonition-title,.rst-content .wy-alert-neutral.note .wy-alert-title,.rst-content .wy-alert-neutral.seealso .admonition-title,.rst-content .wy-alert-neutral.seealso .wy-alert-title,.rst-content .wy-alert-neutral.tip .admonition-title,.rst-content .wy-alert-neutral.tip .wy-alert-title,.rst-content .wy-alert-neutral.warning .admonition-title,.rst-content .wy-alert-neutral.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-neutral .admonition-title,.wy-alert.wy-alert-neutral .rst-content .admonition-title,.wy-alert.wy-alert-neutral .wy-alert-title{color:#404040;background:#e1e4e5}.rst-content .wy-alert-neutral.admonition-todo a,.rst-content .wy-alert-neutral.admonition a,.rst-content .wy-alert-neutral.attention a,.rst-content .wy-alert-neutral.caution a,.rst-content .wy-alert-neutral.danger a,.rst-content .wy-alert-neutral.error a,.rst-content .wy-alert-neutral.hint a,.rst-content .wy-alert-neutral.important a,.rst-content .wy-alert-neutral.note a,.rst-content .wy-alert-neutral.seealso a,.rst-content .wy-alert-neutral.tip a,.rst-content .wy-alert-neutral.warning a,.wy-alert.wy-alert-neutral a{color:#2980b9}.rst-content .admonition-todo p:last-child,.rst-content .admonition p:last-child,.rst-content .attention p:last-child,.rst-content .caution p:last-child,.rst-content .danger p:last-child,.rst-content .error p:last-child,.rst-content .hint p:last-child,.rst-content .important p:last-child,.rst-content .note p:last-child,.rst-content .seealso p:last-child,.rst-content .tip p:last-child,.rst-content .warning p:last-child,.wy-alert p:last-child{margin-bottom:0}.wy-tray-container{position:fixed;bottom:0;left:0;z-index:600}.wy-tray-container li{display:block;width:300px;background:transparent;color:#fff;text-align:center;box-shadow:0 5px 5px 0 rgba(0,0,0,.1);padding:0 24px;min-width:20%;opacity:0;height:0;line-height:56px;overflow:hidden;-webkit-transition:all .3s ease-in;-moz-transition:all .3s ease-in;transition:all .3s ease-in}.wy-tray-container li.wy-tray-item-success{background:#27ae60}.wy-tray-container 
li.wy-tray-item-info{background:#2980b9}.wy-tray-container li.wy-tray-item-warning{background:#e67e22}.wy-tray-container li.wy-tray-item-danger{background:#e74c3c}.wy-tray-container li.on{opacity:1;height:56px}@media screen and (max-width:768px){.wy-tray-container{bottom:auto;top:0;width:100%}.wy-tray-container li{width:100%}}button{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle;cursor:pointer;line-height:normal;-webkit-appearance:button;*overflow:visible}button::-moz-focus-inner,input::-moz-focus-inner{border:0;padding:0}button[disabled]{cursor:default}.btn{display:inline-block;border-radius:2px;line-height:normal;white-space:nowrap;text-align:center;cursor:pointer;font-size:100%;padding:6px 12px 8px;color:#fff;border:1px solid rgba(0,0,0,.1);background-color:#27ae60;text-decoration:none;font-weight:400;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;box-shadow:inset 0 1px 2px -1px hsla(0,0%,100%,.5),inset 0 -2px 0 0 rgba(0,0,0,.1);outline-none:false;vertical-align:middle;*display:inline;zoom:1;-webkit-user-drag:none;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;user-select:none;-webkit-transition:all .1s linear;-moz-transition:all .1s linear;transition:all .1s linear}.btn-hover{background:#2e8ece;color:#fff}.btn:hover{background:#2cc36b;color:#fff}.btn:focus{background:#2cc36b;outline:0}.btn:active{box-shadow:inset 0 -1px 0 0 rgba(0,0,0,.05),inset 0 2px 0 0 rgba(0,0,0,.1);padding:8px 12px 6px}.btn:visited{color:#fff}.btn-disabled,.btn-disabled:active,.btn-disabled:focus,.btn-disabled:hover,.btn:disabled{background-image:none;filter:progid:DXImageTransform.Microsoft.gradient(enabled = false);filter:alpha(opacity=40);opacity:.4;cursor:not-allowed;box-shadow:none}.btn::-moz-focus-inner{padding:0;border:0}.btn-small{font-size:80%}.btn-info{background-color:#2980b9!important}.btn-info:hover{background-color:#2e8ece!important}.btn-neutral{background-color:#f3f6f6!important;color:#404040!important}.btn-neutral:hover{background-color:#e5ebeb!important;color:#404040}.btn-neutral:visited{color:#404040!important}.btn-success{background-color:#27ae60!important}.btn-success:hover{background-color:#295!important}.btn-danger{background-color:#e74c3c!important}.btn-danger:hover{background-color:#ea6153!important}.btn-warning{background-color:#e67e22!important}.btn-warning:hover{background-color:#e98b39!important}.btn-invert{background-color:#222}.btn-invert:hover{background-color:#2f2f2f!important}.btn-link{background-color:transparent!important;color:#2980b9;box-shadow:none;border-color:transparent!important}.btn-link:active,.btn-link:hover{background-color:transparent!important;color:#409ad5!important;box-shadow:none}.btn-link:visited{color:#9b59b6}.wy-btn-group .btn,.wy-control .btn{vertical-align:middle}.wy-btn-group{margin-bottom:24px;*zoom:1}.wy-btn-group:after,.wy-btn-group:before{display:table;content:""}.wy-btn-group:after{clear:both}.wy-dropdown{position:relative;display:inline-block}.wy-dropdown-active .wy-dropdown-menu{display:block}.wy-dropdown-menu{position:absolute;left:0;display:none;float:left;top:100%;min-width:100%;background:#fcfcfc;z-index:100;border:1px solid #cfd7dd;box-shadow:0 2px 2px 0 rgba(0,0,0,.1);padding:12px}.wy-dropdown-menu>dd>a{display:block;clear:both;color:#404040;white-space:nowrap;font-size:90%;padding:0 12px;cursor:pointer}.wy-dropdown-menu>dd>a:hover{background:#2980b9;color:#fff}.wy-dropdown-menu>dd.divider{border-top:1px solid #cfd7dd;margin:6px 
0}.wy-dropdown-menu>dd.search{padding-bottom:12px}.wy-dropdown-menu>dd.search input[type=search]{width:100%}.wy-dropdown-menu>dd.call-to-action{background:#e3e3e3;text-transform:uppercase;font-weight:500;font-size:80%}.wy-dropdown-menu>dd.call-to-action:hover{background:#e3e3e3}.wy-dropdown-menu>dd.call-to-action .btn{color:#fff}.wy-dropdown.wy-dropdown-up .wy-dropdown-menu{bottom:100%;top:auto;left:auto;right:0}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu{background:#fcfcfc;margin-top:2px}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu a{padding:6px 12px}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu a:hover{background:#2980b9;color:#fff}.wy-dropdown.wy-dropdown-left .wy-dropdown-menu{right:0;left:auto;text-align:right}.wy-dropdown-arrow:before{content:" ";border-bottom:5px solid #f5f5f5;border-left:5px solid transparent;border-right:5px solid transparent;position:absolute;display:block;top:-4px;left:50%;margin-left:-3px}.wy-dropdown-arrow.wy-dropdown-arrow-left:before{left:11px}.wy-form-stacked select{display:block}.wy-form-aligned .wy-help-inline,.wy-form-aligned input,.wy-form-aligned label,.wy-form-aligned select,.wy-form-aligned textarea{display:inline-block;*display:inline;*zoom:1;vertical-align:middle}.wy-form-aligned .wy-control-group>label{display:inline-block;vertical-align:middle;width:10em;margin:6px 12px 0 0;float:left}.wy-form-aligned .wy-control{float:left}.wy-form-aligned .wy-control label{display:block}.wy-form-aligned .wy-control select{margin-top:6px}fieldset{margin:0}fieldset,legend{border:0;padding:0}legend{width:100%;white-space:normal;margin-bottom:24px;font-size:150%;*margin-left:-7px}label,legend{display:block}label{margin:0 0 .3125em;color:#333;font-size:90%}input,select,textarea{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle}.wy-control-group{margin-bottom:24px;max-width:1200px;margin-left:auto;margin-right:auto;*zoom:1}.wy-control-group:after,.wy-control-group:before{display:table;content:""}.wy-control-group:after{clear:both}.wy-control-group.wy-control-group-required>label:after{content:" *";color:#e74c3c}.wy-control-group .wy-form-full,.wy-control-group .wy-form-halves,.wy-control-group .wy-form-thirds{padding-bottom:12px}.wy-control-group .wy-form-full input[type=color],.wy-control-group .wy-form-full input[type=date],.wy-control-group .wy-form-full input[type=datetime-local],.wy-control-group .wy-form-full input[type=datetime],.wy-control-group .wy-form-full input[type=email],.wy-control-group .wy-form-full input[type=month],.wy-control-group .wy-form-full input[type=number],.wy-control-group .wy-form-full input[type=password],.wy-control-group .wy-form-full input[type=search],.wy-control-group .wy-form-full input[type=tel],.wy-control-group .wy-form-full input[type=text],.wy-control-group .wy-form-full input[type=time],.wy-control-group .wy-form-full input[type=url],.wy-control-group .wy-form-full input[type=week],.wy-control-group .wy-form-full select,.wy-control-group .wy-form-halves input[type=color],.wy-control-group .wy-form-halves input[type=date],.wy-control-group .wy-form-halves input[type=datetime-local],.wy-control-group .wy-form-halves input[type=datetime],.wy-control-group .wy-form-halves input[type=email],.wy-control-group .wy-form-halves input[type=month],.wy-control-group .wy-form-halves input[type=number],.wy-control-group .wy-form-halves input[type=password],.wy-control-group .wy-form-halves input[type=search],.wy-control-group .wy-form-halves input[type=tel],.wy-control-group .wy-form-halves 
input[type=text],.wy-control-group .wy-form-halves input[type=time],.wy-control-group .wy-form-halves input[type=url],.wy-control-group .wy-form-halves input[type=week],.wy-control-group .wy-form-halves select,.wy-control-group .wy-form-thirds input[type=color],.wy-control-group .wy-form-thirds input[type=date],.wy-control-group .wy-form-thirds input[type=datetime-local],.wy-control-group .wy-form-thirds input[type=datetime],.wy-control-group .wy-form-thirds input[type=email],.wy-control-group .wy-form-thirds input[type=month],.wy-control-group .wy-form-thirds input[type=number],.wy-control-group .wy-form-thirds input[type=password],.wy-control-group .wy-form-thirds input[type=search],.wy-control-group .wy-form-thirds input[type=tel],.wy-control-group .wy-form-thirds input[type=text],.wy-control-group .wy-form-thirds input[type=time],.wy-control-group .wy-form-thirds input[type=url],.wy-control-group .wy-form-thirds input[type=week],.wy-control-group .wy-form-thirds select{width:100%}.wy-control-group .wy-form-full{float:left;display:block;width:100%;margin-right:0}.wy-control-group .wy-form-full:last-child{margin-right:0}.wy-control-group .wy-form-halves{float:left;display:block;margin-right:2.35765%;width:48.82117%}.wy-control-group .wy-form-halves:last-child,.wy-control-group .wy-form-halves:nth-of-type(2n){margin-right:0}.wy-control-group .wy-form-halves:nth-of-type(odd){clear:left}.wy-control-group .wy-form-thirds{float:left;display:block;margin-right:2.35765%;width:31.76157%}.wy-control-group .wy-form-thirds:last-child,.wy-control-group .wy-form-thirds:nth-of-type(3n){margin-right:0}.wy-control-group .wy-form-thirds:nth-of-type(3n+1){clear:left}.wy-control-group.wy-control-group-no-input .wy-control,.wy-control-no-input{margin:6px 0 0;font-size:90%}.wy-control-no-input{display:inline-block}.wy-control-group.fluid-input input[type=color],.wy-control-group.fluid-input input[type=date],.wy-control-group.fluid-input input[type=datetime-local],.wy-control-group.fluid-input input[type=datetime],.wy-control-group.fluid-input input[type=email],.wy-control-group.fluid-input input[type=month],.wy-control-group.fluid-input input[type=number],.wy-control-group.fluid-input input[type=password],.wy-control-group.fluid-input input[type=search],.wy-control-group.fluid-input input[type=tel],.wy-control-group.fluid-input input[type=text],.wy-control-group.fluid-input input[type=time],.wy-control-group.fluid-input input[type=url],.wy-control-group.fluid-input input[type=week]{width:100%}.wy-form-message-inline{padding-left:.3em;color:#666;font-size:90%}.wy-form-message{display:block;color:#999;font-size:70%;margin-top:.3125em;font-style:italic}.wy-form-message p{font-size:inherit;font-style:italic;margin-bottom:6px}.wy-form-message p:last-child{margin-bottom:0}input{line-height:normal}input[type=button],input[type=reset],input[type=submit]{-webkit-appearance:button;cursor:pointer;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;*overflow:visible}input[type=color],input[type=date],input[type=datetime-local],input[type=datetime],input[type=email],input[type=month],input[type=number],input[type=password],input[type=search],input[type=tel],input[type=text],input[type=time],input[type=url],input[type=week]{-webkit-appearance:none;padding:6px;display:inline-block;border:1px solid #ccc;font-size:80%;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;box-shadow:inset 0 1px 3px #ddd;border-radius:0;-webkit-transition:border .3s linear;-moz-transition:border .3s linear;transition:border 
.3s linear}input[type=datetime-local]{padding:.34375em .625em}input[disabled]{cursor:default}input[type=checkbox],input[type=radio]{padding:0;margin-right:.3125em;*height:13px;*width:13px}input[type=checkbox],input[type=radio],input[type=search]{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}input[type=search]::-webkit-search-cancel-button,input[type=search]::-webkit-search-decoration{-webkit-appearance:none}input[type=color]:focus,input[type=date]:focus,input[type=datetime-local]:focus,input[type=datetime]:focus,input[type=email]:focus,input[type=month]:focus,input[type=number]:focus,input[type=password]:focus,input[type=search]:focus,input[type=tel]:focus,input[type=text]:focus,input[type=time]:focus,input[type=url]:focus,input[type=week]:focus{outline:0;outline:thin dotted\9;border-color:#333}input.no-focus:focus{border-color:#ccc!important}input[type=checkbox]:focus,input[type=file]:focus,input[type=radio]:focus{outline:thin dotted #333;outline:1px auto #129fea}input[type=color][disabled],input[type=date][disabled],input[type=datetime-local][disabled],input[type=datetime][disabled],input[type=email][disabled],input[type=month][disabled],input[type=number][disabled],input[type=password][disabled],input[type=search][disabled],input[type=tel][disabled],input[type=text][disabled],input[type=time][disabled],input[type=url][disabled],input[type=week][disabled]{cursor:not-allowed;background-color:#fafafa}input:focus:invalid,select:focus:invalid,textarea:focus:invalid{color:#e74c3c;border:1px solid #e74c3c}input:focus:invalid:focus,select:focus:invalid:focus,textarea:focus:invalid:focus{border-color:#e74c3c}input[type=checkbox]:focus:invalid:focus,input[type=file]:focus:invalid:focus,input[type=radio]:focus:invalid:focus{outline-color:#e74c3c}input.wy-input-large{padding:12px;font-size:100%}textarea{overflow:auto;vertical-align:top;width:100%;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif}select,textarea{padding:.5em .625em;display:inline-block;border:1px solid #ccc;font-size:80%;box-shadow:inset 0 1px 3px #ddd;-webkit-transition:border .3s linear;-moz-transition:border .3s linear;transition:border .3s linear}select{border:1px solid #ccc;background-color:#fff}select[multiple]{height:auto}select:focus,textarea:focus{outline:0}input[readonly],select[disabled],select[readonly],textarea[disabled],textarea[readonly]{cursor:not-allowed;background-color:#fafafa}input[type=checkbox][disabled],input[type=radio][disabled]{cursor:not-allowed}.wy-checkbox,.wy-radio{margin:6px 0;color:#404040;display:block}.wy-checkbox input,.wy-radio input{vertical-align:baseline}.wy-form-message-inline{display:inline-block;*display:inline;*zoom:1;vertical-align:middle}.wy-input-prefix,.wy-input-suffix{white-space:nowrap;padding:6px}.wy-input-prefix .wy-input-context,.wy-input-suffix .wy-input-context{line-height:27px;padding:0 8px;display:inline-block;font-size:80%;background-color:#f3f6f6;border:1px solid #ccc;color:#999}.wy-input-suffix .wy-input-context{border-left:0}.wy-input-prefix .wy-input-context{border-right:0}.wy-switch{position:relative;display:block;height:24px;margin-top:12px;cursor:pointer}.wy-switch:before{left:0;top:0;width:36px;height:12px;background:#ccc}.wy-switch:after,.wy-switch:before{position:absolute;content:"";display:block;border-radius:4px;-webkit-transition:all .2s ease-in-out;-moz-transition:all .2s ease-in-out;transition:all .2s ease-in-out}.wy-switch:after{width:18px;height:18px;background:#999;left:-3px;top:-3px}.wy-switch 
span{position:absolute;left:48px;display:block;font-size:12px;color:#ccc;line-height:1}.wy-switch.active:before{background:#1e8449}.wy-switch.active:after{left:24px;background:#27ae60}.wy-switch.disabled{cursor:not-allowed;opacity:.8}.wy-control-group.wy-control-group-error .wy-form-message,.wy-control-group.wy-control-group-error>label{color:#e74c3c}.wy-control-group.wy-control-group-error input[type=color],.wy-control-group.wy-control-group-error input[type=date],.wy-control-group.wy-control-group-error input[type=datetime-local],.wy-control-group.wy-control-group-error input[type=datetime],.wy-control-group.wy-control-group-error input[type=email],.wy-control-group.wy-control-group-error input[type=month],.wy-control-group.wy-control-group-error input[type=number],.wy-control-group.wy-control-group-error input[type=password],.wy-control-group.wy-control-group-error input[type=search],.wy-control-group.wy-control-group-error input[type=tel],.wy-control-group.wy-control-group-error input[type=text],.wy-control-group.wy-control-group-error input[type=time],.wy-control-group.wy-control-group-error input[type=url],.wy-control-group.wy-control-group-error input[type=week],.wy-control-group.wy-control-group-error textarea{border:1px solid #e74c3c}.wy-inline-validate{white-space:nowrap}.wy-inline-validate .wy-input-context{padding:.5em .625em;display:inline-block;font-size:80%}.wy-inline-validate.wy-inline-validate-success .wy-input-context{color:#27ae60}.wy-inline-validate.wy-inline-validate-danger .wy-input-context{color:#e74c3c}.wy-inline-validate.wy-inline-validate-warning .wy-input-context{color:#e67e22}.wy-inline-validate.wy-inline-validate-info .wy-input-context{color:#2980b9}.rotate-90{-webkit-transform:rotate(90deg);-moz-transform:rotate(90deg);-ms-transform:rotate(90deg);-o-transform:rotate(90deg);transform:rotate(90deg)}.rotate-180{-webkit-transform:rotate(180deg);-moz-transform:rotate(180deg);-ms-transform:rotate(180deg);-o-transform:rotate(180deg);transform:rotate(180deg)}.rotate-270{-webkit-transform:rotate(270deg);-moz-transform:rotate(270deg);-ms-transform:rotate(270deg);-o-transform:rotate(270deg);transform:rotate(270deg)}.mirror{-webkit-transform:scaleX(-1);-moz-transform:scaleX(-1);-ms-transform:scaleX(-1);-o-transform:scaleX(-1);transform:scaleX(-1)}.mirror.rotate-90{-webkit-transform:scaleX(-1) rotate(90deg);-moz-transform:scaleX(-1) rotate(90deg);-ms-transform:scaleX(-1) rotate(90deg);-o-transform:scaleX(-1) rotate(90deg);transform:scaleX(-1) rotate(90deg)}.mirror.rotate-180{-webkit-transform:scaleX(-1) rotate(180deg);-moz-transform:scaleX(-1) rotate(180deg);-ms-transform:scaleX(-1) rotate(180deg);-o-transform:scaleX(-1) rotate(180deg);transform:scaleX(-1) rotate(180deg)}.mirror.rotate-270{-webkit-transform:scaleX(-1) rotate(270deg);-moz-transform:scaleX(-1) rotate(270deg);-ms-transform:scaleX(-1) rotate(270deg);-o-transform:scaleX(-1) rotate(270deg);transform:scaleX(-1) rotate(270deg)}@media only screen and (max-width:480px){.wy-form button[type=submit]{margin:.7em 0 0}.wy-form input[type=color],.wy-form input[type=date],.wy-form input[type=datetime-local],.wy-form input[type=datetime],.wy-form input[type=email],.wy-form input[type=month],.wy-form input[type=number],.wy-form input[type=password],.wy-form input[type=search],.wy-form input[type=tel],.wy-form input[type=text],.wy-form input[type=time],.wy-form input[type=url],.wy-form input[type=week],.wy-form label{margin-bottom:.3em;display:block}.wy-form input[type=color],.wy-form input[type=date],.wy-form 
input[type=datetime-local],.wy-form input[type=datetime],.wy-form input[type=email],.wy-form input[type=month],.wy-form input[type=number],.wy-form input[type=password],.wy-form input[type=search],.wy-form input[type=tel],.wy-form input[type=time],.wy-form input[type=url],.wy-form input[type=week]{margin-bottom:0}.wy-form-aligned .wy-control-group label{margin-bottom:.3em;text-align:left;display:block;width:100%}.wy-form-aligned .wy-control{margin:1.5em 0 0}.wy-form-message,.wy-form-message-inline,.wy-form .wy-help-inline{display:block;font-size:80%;padding:6px 0}}@media screen and (max-width:768px){.tablet-hide{display:none}}@media screen and (max-width:480px){.mobile-hide{display:none}}.float-left{float:left}.float-right{float:right}.full-width{width:100%}.rst-content table.docutils,.rst-content table.field-list,.wy-table{border-collapse:collapse;border-spacing:0;empty-cells:show;margin-bottom:24px}.rst-content table.docutils caption,.rst-content table.field-list caption,.wy-table caption{color:#000;font:italic 85%/1 arial,sans-serif;padding:1em 0;text-align:center}.rst-content table.docutils td,.rst-content table.docutils th,.rst-content table.field-list td,.rst-content table.field-list th,.wy-table td,.wy-table th{font-size:90%;margin:0;overflow:visible;padding:8px 16px}.rst-content table.docutils td:first-child,.rst-content table.docutils th:first-child,.rst-content table.field-list td:first-child,.rst-content table.field-list th:first-child,.wy-table td:first-child,.wy-table th:first-child{border-left-width:0}.rst-content table.docutils thead,.rst-content table.field-list thead,.wy-table thead{color:#000;text-align:left;vertical-align:bottom;white-space:nowrap}.rst-content table.docutils thead th,.rst-content table.field-list thead th,.wy-table thead th{font-weight:700;border-bottom:2px solid #e1e4e5}.rst-content table.docutils td,.rst-content table.field-list td,.wy-table td{background-color:transparent;vertical-align:middle}.rst-content table.docutils td p,.rst-content table.field-list td p,.wy-table td p{line-height:18px}.rst-content table.docutils td p:last-child,.rst-content table.field-list td p:last-child,.wy-table td p:last-child{margin-bottom:0}.rst-content table.docutils .wy-table-cell-min,.rst-content table.field-list .wy-table-cell-min,.wy-table .wy-table-cell-min{width:1%;padding-right:0}.rst-content table.docutils .wy-table-cell-min input[type=checkbox],.rst-content table.field-list .wy-table-cell-min input[type=checkbox],.wy-table .wy-table-cell-min input[type=checkbox]{margin:0}.wy-table-secondary{color:grey;font-size:90%}.wy-table-tertiary{color:grey;font-size:80%}.rst-content table.docutils:not(.field-list) tr:nth-child(2n-1) td,.wy-table-backed,.wy-table-odd td,.wy-table-striped tr:nth-child(2n-1) td{background-color:#f3f6f6}.rst-content table.docutils,.wy-table-bordered-all{border:1px solid #e1e4e5}.rst-content table.docutils td,.wy-table-bordered-all td{border-bottom:1px solid #e1e4e5;border-left:1px solid #e1e4e5}.rst-content table.docutils tbody>tr:last-child td,.wy-table-bordered-all tbody>tr:last-child td{border-bottom-width:0}.wy-table-bordered{border:1px solid #e1e4e5}.wy-table-bordered-rows td{border-bottom:1px solid #e1e4e5}.wy-table-bordered-rows tbody>tr:last-child td{border-bottom-width:0}.wy-table-horizontal td,.wy-table-horizontal th{border-width:0 0 1px;border-bottom:1px solid #e1e4e5}.wy-table-horizontal tbody>tr:last-child td{border-bottom-width:0}.wy-table-responsive{margin-bottom:24px;max-width:100%;overflow:auto}.wy-table-responsive 
table{margin-bottom:0!important}.wy-table-responsive table td,.wy-table-responsive table th{white-space:nowrap}a{color:#2980b9;text-decoration:none;cursor:pointer}a:hover{color:#3091d1}a:visited{color:#9b59b6}html{height:100%}body,html{overflow-x:hidden}body{font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;font-weight:400;color:#404040;min-height:100%;background:#edf0f2}.wy-text-left{text-align:left}.wy-text-center{text-align:center}.wy-text-right{text-align:right}.wy-text-large{font-size:120%}.wy-text-normal{font-size:100%}.wy-text-small,small{font-size:80%}.wy-text-strike{text-decoration:line-through}.wy-text-warning{color:#e67e22!important}a.wy-text-warning:hover{color:#eb9950!important}.wy-text-info{color:#2980b9!important}a.wy-text-info:hover{color:#409ad5!important}.wy-text-success{color:#27ae60!important}a.wy-text-success:hover{color:#36d278!important}.wy-text-danger{color:#e74c3c!important}a.wy-text-danger:hover{color:#ed7669!important}.wy-text-neutral{color:#404040!important}a.wy-text-neutral:hover{color:#595959!important}.rst-content .toctree-wrapper>p.caption,h1,h2,h3,h4,h5,h6,legend{margin-top:0;font-weight:700;font-family:Roboto Slab,ff-tisa-web-pro,Georgia,Arial,sans-serif}p{line-height:24px;font-size:16px;margin:0 0 24px}h1{font-size:175%}.rst-content .toctree-wrapper>p.caption,h2{font-size:150%}h3{font-size:125%}h4{font-size:115%}h5{font-size:110%}h6{font-size:100%}hr{display:block;height:1px;border:0;border-top:1px solid #e1e4e5;margin:24px 0;padding:0}.rst-content code,.rst-content tt,code{white-space:nowrap;max-width:100%;background:#fff;border:1px solid #e1e4e5;font-size:75%;padding:0 5px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;color:#e74c3c;overflow-x:auto}.rst-content tt.code-large,code.code-large{font-size:90%}.rst-content .section ul,.rst-content .toctree-wrapper ul,.rst-content section ul,.wy-plain-list-disc,article ul{list-style:disc;line-height:24px;margin-bottom:24px}.rst-content .section ul li,.rst-content .toctree-wrapper ul li,.rst-content section ul li,.wy-plain-list-disc li,article ul li{list-style:disc;margin-left:24px}.rst-content .section ul li p:last-child,.rst-content .section ul li ul,.rst-content .toctree-wrapper ul li p:last-child,.rst-content .toctree-wrapper ul li ul,.rst-content section ul li p:last-child,.rst-content section ul li ul,.wy-plain-list-disc li p:last-child,.wy-plain-list-disc li ul,article ul li p:last-child,article ul li ul{margin-bottom:0}.rst-content .section ul li li,.rst-content .toctree-wrapper ul li li,.rst-content section ul li li,.wy-plain-list-disc li li,article ul li li{list-style:circle}.rst-content .section ul li li li,.rst-content .toctree-wrapper ul li li li,.rst-content section ul li li li,.wy-plain-list-disc li li li,article ul li li li{list-style:square}.rst-content .section ul li ol li,.rst-content .toctree-wrapper ul li ol li,.rst-content section ul li ol li,.wy-plain-list-disc li ol li,article ul li ol li{list-style:decimal}.rst-content .section ol,.rst-content .section ol.arabic,.rst-content .toctree-wrapper ol,.rst-content .toctree-wrapper ol.arabic,.rst-content section ol,.rst-content section ol.arabic,.wy-plain-list-decimal,article ol{list-style:decimal;line-height:24px;margin-bottom:24px}.rst-content .section ol.arabic li,.rst-content .section ol li,.rst-content .toctree-wrapper ol.arabic li,.rst-content .toctree-wrapper ol li,.rst-content section ol.arabic li,.rst-content section ol li,.wy-plain-list-decimal li,article ol 
li{list-style:decimal;margin-left:24px}.rst-content .section ol.arabic li ul,.rst-content .section ol li p:last-child,.rst-content .section ol li ul,.rst-content .toctree-wrapper ol.arabic li ul,.rst-content .toctree-wrapper ol li p:last-child,.rst-content .toctree-wrapper ol li ul,.rst-content section ol.arabic li ul,.rst-content section ol li p:last-child,.rst-content section ol li ul,.wy-plain-list-decimal li p:last-child,.wy-plain-list-decimal li ul,article ol li p:last-child,article ol li ul{margin-bottom:0}.rst-content .section ol.arabic li ul li,.rst-content .section ol li ul li,.rst-content .toctree-wrapper ol.arabic li ul li,.rst-content .toctree-wrapper ol li ul li,.rst-content section ol.arabic li ul li,.rst-content section ol li ul li,.wy-plain-list-decimal li ul li,article ol li ul li{list-style:disc}.wy-breadcrumbs{*zoom:1}.wy-breadcrumbs:after,.wy-breadcrumbs:before{display:table;content:""}.wy-breadcrumbs:after{clear:both}.wy-breadcrumbs>li{display:inline-block;padding-top:5px}.wy-breadcrumbs>li.wy-breadcrumbs-aside{float:right}.rst-content .wy-breadcrumbs>li code,.rst-content .wy-breadcrumbs>li tt,.wy-breadcrumbs>li .rst-content tt,.wy-breadcrumbs>li code{all:inherit;color:inherit}.breadcrumb-item:before{content:"/";color:#bbb;font-size:13px;padding:0 6px 0 3px}.wy-breadcrumbs-extra{margin-bottom:0;color:#b3b3b3;font-size:80%;display:inline-block}@media screen and (max-width:480px){.wy-breadcrumbs-extra,.wy-breadcrumbs li.wy-breadcrumbs-aside{display:none}}@media print{.wy-breadcrumbs li.wy-breadcrumbs-aside{display:none}}html{font-size:16px}.wy-affix{position:fixed;top:1.618em}.wy-menu a:hover{text-decoration:none}.wy-menu-horiz{*zoom:1}.wy-menu-horiz:after,.wy-menu-horiz:before{display:table;content:""}.wy-menu-horiz:after{clear:both}.wy-menu-horiz li,.wy-menu-horiz ul{display:inline-block}.wy-menu-horiz li:hover{background:hsla(0,0%,100%,.1)}.wy-menu-horiz li.divide-left{border-left:1px solid #404040}.wy-menu-horiz li.divide-right{border-right:1px solid #404040}.wy-menu-horiz a{height:32px;display:inline-block;line-height:32px;padding:0 16px}.wy-menu-vertical{width:300px}.wy-menu-vertical header,.wy-menu-vertical p.caption{color:#55a5d9;height:32px;line-height:32px;padding:0 1.618em;margin:12px 0 0;display:block;font-weight:700;text-transform:uppercase;font-size:85%;white-space:nowrap}.wy-menu-vertical ul{margin-bottom:0}.wy-menu-vertical li.divide-top{border-top:1px solid #404040}.wy-menu-vertical li.divide-bottom{border-bottom:1px solid #404040}.wy-menu-vertical li.current{background:#e3e3e3}.wy-menu-vertical li.current a{color:grey;border-right:1px solid #c9c9c9;padding:.4045em 2.427em}.wy-menu-vertical li.current a:hover{background:#d6d6d6}.rst-content .wy-menu-vertical li tt,.wy-menu-vertical li .rst-content tt,.wy-menu-vertical li code{border:none;background:inherit;color:inherit;padding-left:0;padding-right:0}.wy-menu-vertical li button.toctree-expand{display:block;float:left;margin-left:-1.2em;line-height:18px;color:#4d4d4d;border:none;background:none;padding:0}.wy-menu-vertical li.current>a,.wy-menu-vertical li.on a{color:#404040;font-weight:700;position:relative;background:#fcfcfc;border:none;padding:.4045em 1.618em}.wy-menu-vertical li.current>a:hover,.wy-menu-vertical li.on a:hover{background:#fcfcfc}.wy-menu-vertical li.current>a:hover button.toctree-expand,.wy-menu-vertical li.on a:hover button.toctree-expand{color:grey}.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a 
button.toctree-expand{display:block;line-height:18px;color:#333}.wy-menu-vertical li.toctree-l1.current>a{border-bottom:1px solid #c9c9c9;border-top:1px solid #c9c9c9}.wy-menu-vertical .toctree-l1.current .toctree-l2>ul,.wy-menu-vertical .toctree-l2.current .toctree-l3>ul,.wy-menu-vertical .toctree-l3.current .toctree-l4>ul,.wy-menu-vertical .toctree-l4.current .toctree-l5>ul,.wy-menu-vertical .toctree-l5.current .toctree-l6>ul,.wy-menu-vertical .toctree-l6.current .toctree-l7>ul,.wy-menu-vertical .toctree-l7.current .toctree-l8>ul,.wy-menu-vertical .toctree-l8.current .toctree-l9>ul,.wy-menu-vertical .toctree-l9.current .toctree-l10>ul,.wy-menu-vertical .toctree-l10.current .toctree-l11>ul{display:none}.wy-menu-vertical .toctree-l1.current .current.toctree-l2>ul,.wy-menu-vertical .toctree-l2.current .current.toctree-l3>ul,.wy-menu-vertical .toctree-l3.current .current.toctree-l4>ul,.wy-menu-vertical .toctree-l4.current .current.toctree-l5>ul,.wy-menu-vertical .toctree-l5.current .current.toctree-l6>ul,.wy-menu-vertical .toctree-l6.current .current.toctree-l7>ul,.wy-menu-vertical .toctree-l7.current .current.toctree-l8>ul,.wy-menu-vertical .toctree-l8.current .current.toctree-l9>ul,.wy-menu-vertical .toctree-l9.current .current.toctree-l10>ul,.wy-menu-vertical .toctree-l10.current .current.toctree-l11>ul{display:block}.wy-menu-vertical li.toctree-l3,.wy-menu-vertical li.toctree-l4{font-size:.9em}.wy-menu-vertical li.toctree-l2 a,.wy-menu-vertical li.toctree-l3 a,.wy-menu-vertical li.toctree-l4 a,.wy-menu-vertical li.toctree-l5 a,.wy-menu-vertical li.toctree-l6 a,.wy-menu-vertical li.toctree-l7 a,.wy-menu-vertical li.toctree-l8 a,.wy-menu-vertical li.toctree-l9 a,.wy-menu-vertical li.toctree-l10 a{color:#404040}.wy-menu-vertical li.toctree-l2 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l3 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l4 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l5 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l6 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l7 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l8 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l9 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l10 a:hover button.toctree-expand{color:grey}.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a,.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a,.wy-menu-vertical li.toctree-l4.current li.toctree-l5>a,.wy-menu-vertical li.toctree-l5.current li.toctree-l6>a,.wy-menu-vertical li.toctree-l6.current li.toctree-l7>a,.wy-menu-vertical li.toctree-l7.current li.toctree-l8>a,.wy-menu-vertical li.toctree-l8.current li.toctree-l9>a,.wy-menu-vertical li.toctree-l9.current li.toctree-l10>a,.wy-menu-vertical li.toctree-l10.current li.toctree-l11>a{display:block}.wy-menu-vertical li.toctree-l2.current>a{padding:.4045em 2.427em}.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a{padding:.4045em 1.618em .4045em 4.045em}.wy-menu-vertical li.toctree-l3.current>a{padding:.4045em 4.045em}.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a{padding:.4045em 1.618em .4045em 5.663em}.wy-menu-vertical li.toctree-l4.current>a{padding:.4045em 5.663em}.wy-menu-vertical li.toctree-l4.current li.toctree-l5>a{padding:.4045em 1.618em .4045em 7.281em}.wy-menu-vertical li.toctree-l5.current>a{padding:.4045em 7.281em}.wy-menu-vertical li.toctree-l5.current li.toctree-l6>a{padding:.4045em 1.618em .4045em 8.899em}.wy-menu-vertical li.toctree-l6.current>a{padding:.4045em 
8.899em}.wy-menu-vertical li.toctree-l6.current li.toctree-l7>a{padding:.4045em 1.618em .4045em 10.517em}.wy-menu-vertical li.toctree-l7.current>a{padding:.4045em 10.517em}.wy-menu-vertical li.toctree-l7.current li.toctree-l8>a{padding:.4045em 1.618em .4045em 12.135em}.wy-menu-vertical li.toctree-l8.current>a{padding:.4045em 12.135em}.wy-menu-vertical li.toctree-l8.current li.toctree-l9>a{padding:.4045em 1.618em .4045em 13.753em}.wy-menu-vertical li.toctree-l9.current>a{padding:.4045em 13.753em}.wy-menu-vertical li.toctree-l9.current li.toctree-l10>a{padding:.4045em 1.618em .4045em 15.371em}.wy-menu-vertical li.toctree-l10.current>a{padding:.4045em 15.371em}.wy-menu-vertical li.toctree-l10.current li.toctree-l11>a{padding:.4045em 1.618em .4045em 16.989em}.wy-menu-vertical li.toctree-l2.current>a,.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a{background:#c9c9c9}.wy-menu-vertical li.toctree-l2 button.toctree-expand{color:#a3a3a3}.wy-menu-vertical li.toctree-l3.current>a,.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a{background:#bdbdbd}.wy-menu-vertical li.toctree-l3 button.toctree-expand{color:#969696}.wy-menu-vertical li.current ul{display:block}.wy-menu-vertical li ul{margin-bottom:0;display:none}.wy-menu-vertical li ul li a{margin-bottom:0;color:#d9d9d9;font-weight:400}.wy-menu-vertical a{line-height:18px;padding:.4045em 1.618em;display:block;position:relative;font-size:90%;color:#d9d9d9}.wy-menu-vertical a:hover{background-color:#4e4a4a;cursor:pointer}.wy-menu-vertical a:hover button.toctree-expand{color:#d9d9d9}.wy-menu-vertical a:active{background-color:#2980b9;cursor:pointer;color:#fff}.wy-menu-vertical a:active button.toctree-expand{color:#fff}.wy-side-nav-search{display:block;width:300px;padding:.809em;margin-bottom:.809em;z-index:200;background-color:#2980b9;text-align:center;color:#fcfcfc}.wy-side-nav-search input[type=text]{width:100%;border-radius:50px;padding:6px 12px;border-color:#2472a4}.wy-side-nav-search img{display:block;margin:auto auto .809em;height:45px;width:45px;background-color:#2980b9;padding:5px;border-radius:100%}.wy-side-nav-search .wy-dropdown>a,.wy-side-nav-search>a{color:#fcfcfc;font-size:100%;font-weight:700;display:inline-block;padding:4px 6px;margin-bottom:.809em;max-width:100%}.wy-side-nav-search .wy-dropdown>a:hover,.wy-side-nav-search>a:hover{background:hsla(0,0%,100%,.1)}.wy-side-nav-search .wy-dropdown>a img.logo,.wy-side-nav-search>a img.logo{display:block;margin:0 auto;height:auto;width:auto;border-radius:0;max-width:100%;background:transparent}.wy-side-nav-search .wy-dropdown>a.icon img.logo,.wy-side-nav-search>a.icon img.logo{margin-top:.85em}.wy-side-nav-search>div.version{margin-top:-.4045em;margin-bottom:.809em;font-weight:400;color:hsla(0,0%,100%,.3)}.wy-nav .wy-menu-vertical header{color:#2980b9}.wy-nav .wy-menu-vertical a{color:#b3b3b3}.wy-nav .wy-menu-vertical a:hover{background-color:#2980b9;color:#fff}[data-menu-wrap]{-webkit-transition:all .2s ease-in;-moz-transition:all .2s ease-in;transition:all .2s 
ease-in;position:absolute;opacity:1;width:100%;opacity:0}[data-menu-wrap].move-center{left:0;right:auto;opacity:1}[data-menu-wrap].move-left{right:auto;left:-100%;opacity:0}[data-menu-wrap].move-right{right:-100%;left:auto;opacity:0}.wy-body-for-nav{background:#fcfcfc}.wy-grid-for-nav{position:absolute;width:100%;height:100%}.wy-nav-side{position:fixed;top:0;bottom:0;left:0;padding-bottom:2em;width:300px;overflow-x:hidden;overflow-y:hidden;min-height:100%;color:#9b9b9b;background:#343131;z-index:200}.wy-side-scroll{width:320px;position:relative;overflow-x:hidden;overflow-y:scroll;height:100%}.wy-nav-top{display:none;background:#2980b9;color:#fff;padding:.4045em .809em;position:relative;line-height:50px;text-align:center;font-size:100%;*zoom:1}.wy-nav-top:after,.wy-nav-top:before{display:table;content:""}.wy-nav-top:after{clear:both}.wy-nav-top a{color:#fff;font-weight:700}.wy-nav-top img{margin-right:12px;height:45px;width:45px;background-color:#2980b9;padding:5px;border-radius:100%}.wy-nav-top i{font-size:30px;float:left;cursor:pointer;padding-top:inherit}.wy-nav-content-wrap{margin-left:300px;background:#fcfcfc;min-height:100%}.wy-nav-content{padding:1.618em 3.236em;height:100%;max-width:800px;margin:auto}.wy-body-mask{position:fixed;width:100%;height:100%;background:rgba(0,0,0,.2);display:none;z-index:499}.wy-body-mask.on{display:block}footer{color:grey}footer p{margin-bottom:12px}.rst-content footer span.commit tt,footer span.commit .rst-content tt,footer span.commit code{padding:0;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;font-size:1em;background:none;border:none;color:grey}.rst-footer-buttons{*zoom:1}.rst-footer-buttons:after,.rst-footer-buttons:before{width:100%;display:table;content:""}.rst-footer-buttons:after{clear:both}.rst-breadcrumbs-buttons{margin-top:12px;*zoom:1}.rst-breadcrumbs-buttons:after,.rst-breadcrumbs-buttons:before{display:table;content:""}.rst-breadcrumbs-buttons:after{clear:both}#search-results .search li{margin-bottom:24px;border-bottom:1px solid #e1e4e5;padding-bottom:24px}#search-results .search li:first-child{border-top:1px solid #e1e4e5;padding-top:24px}#search-results .search li a{font-size:120%;margin-bottom:12px;display:inline-block}#search-results .context{color:grey;font-size:90%}.genindextable li>ul{margin-left:24px}@media screen and (max-width:768px){.wy-body-for-nav{background:#fcfcfc}.wy-nav-top{display:block}.wy-nav-side{left:-300px}.wy-nav-side.shift{width:85%;left:0}.wy-menu.wy-menu-vertical,.wy-side-nav-search,.wy-side-scroll{width:auto}.wy-nav-content-wrap{margin-left:0}.wy-nav-content-wrap .wy-nav-content{padding:1.618em}.wy-nav-content-wrap.shift{position:fixed;min-width:100%;left:85%;top:0;height:100%;overflow:hidden}}@media screen and (min-width:1100px){.wy-nav-content-wrap{background:rgba(0,0,0,.05)}.wy-nav-content{margin:0;background:#fcfcfc}}@media print{.rst-versions,.wy-nav-side,footer{display:none}.wy-nav-content-wrap{margin-left:0}}.rst-versions{position:fixed;bottom:0;left:0;width:300px;color:#fcfcfc;background:#1f1d1d;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;z-index:400}.rst-versions a{color:#2980b9;text-decoration:none}.rst-versions .rst-badge-small{display:none}.rst-versions .rst-current-version{padding:12px;background-color:#272525;display:block;text-align:right;font-size:90%;cursor:pointer;color:#27ae60;*zoom:1}.rst-versions .rst-current-version:after,.rst-versions .rst-current-version:before{display:table;content:""}.rst-versions 
.rst-current-version:after{clear:both}.rst-content .code-block-caption .rst-versions .rst-current-version .headerlink,.rst-content .eqno .rst-versions .rst-current-version .headerlink,.rst-content .rst-versions .rst-current-version .admonition-title,.rst-content code.download .rst-versions .rst-current-version span:first-child,.rst-content dl dt .rst-versions .rst-current-version .headerlink,.rst-content h1 .rst-versions .rst-current-version .headerlink,.rst-content h2 .rst-versions .rst-current-version .headerlink,.rst-content h3 .rst-versions .rst-current-version .headerlink,.rst-content h4 .rst-versions .rst-current-version .headerlink,.rst-content h5 .rst-versions .rst-current-version .headerlink,.rst-content h6 .rst-versions .rst-current-version .headerlink,.rst-content p .rst-versions .rst-current-version .headerlink,.rst-content table>caption .rst-versions .rst-current-version .headerlink,.rst-content tt.download .rst-versions .rst-current-version span:first-child,.rst-versions .rst-current-version .fa,.rst-versions .rst-current-version .icon,.rst-versions .rst-current-version .rst-content .admonition-title,.rst-versions .rst-current-version .rst-content .code-block-caption .headerlink,.rst-versions .rst-current-version .rst-content .eqno .headerlink,.rst-versions .rst-current-version .rst-content code.download span:first-child,.rst-versions .rst-current-version .rst-content dl dt .headerlink,.rst-versions .rst-current-version .rst-content h1 .headerlink,.rst-versions .rst-current-version .rst-content h2 .headerlink,.rst-versions .rst-current-version .rst-content h3 .headerlink,.rst-versions .rst-current-version .rst-content h4 .headerlink,.rst-versions .rst-current-version .rst-content h5 .headerlink,.rst-versions .rst-current-version .rst-content h6 .headerlink,.rst-versions .rst-current-version .rst-content p .headerlink,.rst-versions .rst-current-version .rst-content table>caption .headerlink,.rst-versions .rst-current-version .rst-content tt.download span:first-child,.rst-versions .rst-current-version .wy-menu-vertical li button.toctree-expand,.wy-menu-vertical li .rst-versions .rst-current-version button.toctree-expand{color:#fcfcfc}.rst-versions .rst-current-version .fa-book,.rst-versions .rst-current-version .icon-book{float:left}.rst-versions .rst-current-version.rst-out-of-date{background-color:#e74c3c;color:#fff}.rst-versions .rst-current-version.rst-active-old-version{background-color:#f1c40f;color:#000}.rst-versions.shift-up{height:auto;max-height:100%;overflow-y:scroll}.rst-versions.shift-up .rst-other-versions{display:block}.rst-versions .rst-other-versions{font-size:90%;padding:12px;color:grey;display:none}.rst-versions .rst-other-versions hr{display:block;height:1px;border:0;margin:20px 0;padding:0;border-top:1px solid #413d3d}.rst-versions .rst-other-versions dd{display:inline-block;margin:0}.rst-versions .rst-other-versions dd a{display:inline-block;padding:6px;color:#fcfcfc}.rst-versions.rst-badge{width:auto;bottom:20px;right:20px;left:auto;border:none;max-width:300px;max-height:90%}.rst-versions.rst-badge .fa-book,.rst-versions.rst-badge .icon-book{float:none;line-height:30px}.rst-versions.rst-badge.shift-up .rst-current-version{text-align:right}.rst-versions.rst-badge.shift-up .rst-current-version .fa-book,.rst-versions.rst-badge.shift-up .rst-current-version .icon-book{float:left}.rst-versions.rst-badge>.rst-current-version{width:auto;height:30px;line-height:30px;padding:0 6px;display:block;text-align:center}@media screen and 
(max-width:768px){.rst-versions{width:85%;display:none}.rst-versions.shift{display:block}}.rst-content .toctree-wrapper>p.caption,.rst-content h1,.rst-content h2,.rst-content h3,.rst-content h4,.rst-content h5,.rst-content h6{margin-bottom:24px}.rst-content img{max-width:100%;height:auto}.rst-content div.figure,.rst-content figure{margin-bottom:24px}.rst-content div.figure .caption-text,.rst-content figure .caption-text{font-style:italic}.rst-content div.figure p:last-child.caption,.rst-content figure p:last-child.caption{margin-bottom:0}.rst-content div.figure.align-center,.rst-content figure.align-center{text-align:center}.rst-content .section>a>img,.rst-content .section>img,.rst-content section>a>img,.rst-content section>img{margin-bottom:24px}.rst-content abbr[title]{text-decoration:none}.rst-content.style-external-links a.reference.external:after{font-family:FontAwesome;content:"\f08e";color:#b3b3b3;vertical-align:super;font-size:60%;margin:0 .2em}.rst-content blockquote{margin-left:24px;line-height:24px;margin-bottom:24px}.rst-content pre.literal-block{white-space:pre;margin:0;padding:12px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;display:block;overflow:auto}.rst-content div[class^=highlight],.rst-content pre.literal-block{border:1px solid #e1e4e5;overflow-x:auto;margin:1px 0 24px}.rst-content div[class^=highlight] div[class^=highlight],.rst-content pre.literal-block div[class^=highlight]{padding:0;border:none;margin:0}.rst-content div[class^=highlight] td.code{width:100%}.rst-content .linenodiv pre{border-right:1px solid #e6e9ea;margin:0;padding:12px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;user-select:none;pointer-events:none}.rst-content div[class^=highlight] pre{white-space:pre;margin:0;padding:12px;display:block;overflow:auto}.rst-content div[class^=highlight] pre .hll{display:block;margin:0 -12px;padding:0 12px}.rst-content .linenodiv pre,.rst-content div[class^=highlight] pre,.rst-content pre.literal-block{font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;font-size:12px;line-height:1.4}.rst-content div.highlight .gp,.rst-content div.highlight span.linenos{user-select:none;pointer-events:none}.rst-content div.highlight span.linenos{display:inline-block;padding-left:0;padding-right:12px;margin-right:12px;border-right:1px solid #e6e9ea}.rst-content .code-block-caption{font-style:italic;font-size:85%;line-height:1;padding:1em 0;text-align:center}@media print{.rst-content .codeblock,.rst-content div[class^=highlight],.rst-content div[class^=highlight] pre{white-space:pre-wrap}}.rst-content .admonition,.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .danger,.rst-content .error,.rst-content .hint,.rst-content .important,.rst-content .note,.rst-content .seealso,.rst-content .tip,.rst-content .warning{clear:both}.rst-content .admonition-todo .last,.rst-content .admonition-todo>:last-child,.rst-content .admonition .last,.rst-content .admonition>:last-child,.rst-content .attention .last,.rst-content .attention>:last-child,.rst-content .caution .last,.rst-content .caution>:last-child,.rst-content .danger .last,.rst-content .danger>:last-child,.rst-content .error .last,.rst-content .error>:last-child,.rst-content .hint .last,.rst-content .hint>:last-child,.rst-content .important .last,.rst-content .important>:last-child,.rst-content .note .last,.rst-content .note>:last-child,.rst-content .seealso 
.last,.rst-content .seealso>:last-child,.rst-content .tip .last,.rst-content .tip>:last-child,.rst-content .warning .last,.rst-content .warning>:last-child{margin-bottom:0}.rst-content .admonition-title:before{margin-right:4px}.rst-content .admonition table{border-color:rgba(0,0,0,.1)}.rst-content .admonition table td,.rst-content .admonition table th{background:transparent!important;border-color:rgba(0,0,0,.1)!important}.rst-content .section ol.loweralpha,.rst-content .section ol.loweralpha>li,.rst-content .toctree-wrapper ol.loweralpha,.rst-content .toctree-wrapper ol.loweralpha>li,.rst-content section ol.loweralpha,.rst-content section ol.loweralpha>li{list-style:lower-alpha}.rst-content .section ol.upperalpha,.rst-content .section ol.upperalpha>li,.rst-content .toctree-wrapper ol.upperalpha,.rst-content .toctree-wrapper ol.upperalpha>li,.rst-content section ol.upperalpha,.rst-content section ol.upperalpha>li{list-style:upper-alpha}.rst-content .section ol li>*,.rst-content .section ul li>*,.rst-content .toctree-wrapper ol li>*,.rst-content .toctree-wrapper ul li>*,.rst-content section ol li>*,.rst-content section ul li>*{margin-top:12px;margin-bottom:12px}.rst-content .section ol li>:first-child,.rst-content .section ul li>:first-child,.rst-content .toctree-wrapper ol li>:first-child,.rst-content .toctree-wrapper ul li>:first-child,.rst-content section ol li>:first-child,.rst-content section ul li>:first-child{margin-top:0}.rst-content .section ol li>p,.rst-content .section ol li>p:last-child,.rst-content .section ul li>p,.rst-content .section ul li>p:last-child,.rst-content .toctree-wrapper ol li>p,.rst-content .toctree-wrapper ol li>p:last-child,.rst-content .toctree-wrapper ul li>p,.rst-content .toctree-wrapper ul li>p:last-child,.rst-content section ol li>p,.rst-content section ol li>p:last-child,.rst-content section ul li>p,.rst-content section ul li>p:last-child{margin-bottom:12px}.rst-content .section ol li>p:only-child,.rst-content .section ol li>p:only-child:last-child,.rst-content .section ul li>p:only-child,.rst-content .section ul li>p:only-child:last-child,.rst-content .toctree-wrapper ol li>p:only-child,.rst-content .toctree-wrapper ol li>p:only-child:last-child,.rst-content .toctree-wrapper ul li>p:only-child,.rst-content .toctree-wrapper ul li>p:only-child:last-child,.rst-content section ol li>p:only-child,.rst-content section ol li>p:only-child:last-child,.rst-content section ul li>p:only-child,.rst-content section ul li>p:only-child:last-child{margin-bottom:0}.rst-content .section ol li>ol,.rst-content .section ol li>ul,.rst-content .section ul li>ol,.rst-content .section ul li>ul,.rst-content .toctree-wrapper ol li>ol,.rst-content .toctree-wrapper ol li>ul,.rst-content .toctree-wrapper ul li>ol,.rst-content .toctree-wrapper ul li>ul,.rst-content section ol li>ol,.rst-content section ol li>ul,.rst-content section ul li>ol,.rst-content section ul li>ul{margin-bottom:12px}.rst-content .section ol.simple li>*,.rst-content .section ol.simple li ol,.rst-content .section ol.simple li ul,.rst-content .section ul.simple li>*,.rst-content .section ul.simple li ol,.rst-content .section ul.simple li ul,.rst-content .toctree-wrapper ol.simple li>*,.rst-content .toctree-wrapper ol.simple li ol,.rst-content .toctree-wrapper ol.simple li ul,.rst-content .toctree-wrapper ul.simple li>*,.rst-content .toctree-wrapper ul.simple li ol,.rst-content .toctree-wrapper ul.simple li ul,.rst-content section ol.simple li>*,.rst-content section ol.simple li ol,.rst-content section ol.simple li 
ul,.rst-content section ul.simple li>*,.rst-content section ul.simple li ol,.rst-content section ul.simple li ul{margin-top:0;margin-bottom:0}.rst-content .line-block{margin-left:0;margin-bottom:24px;line-height:24px}.rst-content .line-block .line-block{margin-left:24px;margin-bottom:0}.rst-content .topic-title{font-weight:700;margin-bottom:12px}.rst-content .toc-backref{color:#404040}.rst-content .align-right{float:right;margin:0 0 24px 24px}.rst-content .align-left{float:left;margin:0 24px 24px 0}.rst-content .align-center{margin:auto}.rst-content .align-center:not(table){display:block}.rst-content .code-block-caption .headerlink,.rst-content .eqno .headerlink,.rst-content .toctree-wrapper>p.caption .headerlink,.rst-content dl dt .headerlink,.rst-content h1 .headerlink,.rst-content h2 .headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 .headerlink,.rst-content p.caption .headerlink,.rst-content p .headerlink,.rst-content table>caption .headerlink{opacity:0;font-size:14px;font-family:FontAwesome;margin-left:.5em}.rst-content .code-block-caption .headerlink:focus,.rst-content .code-block-caption:hover .headerlink,.rst-content .eqno .headerlink:focus,.rst-content .eqno:hover .headerlink,.rst-content .toctree-wrapper>p.caption .headerlink:focus,.rst-content .toctree-wrapper>p.caption:hover .headerlink,.rst-content dl dt .headerlink:focus,.rst-content dl dt:hover .headerlink,.rst-content h1 .headerlink:focus,.rst-content h1:hover .headerlink,.rst-content h2 .headerlink:focus,.rst-content h2:hover .headerlink,.rst-content h3 .headerlink:focus,.rst-content h3:hover .headerlink,.rst-content h4 .headerlink:focus,.rst-content h4:hover .headerlink,.rst-content h5 .headerlink:focus,.rst-content h5:hover .headerlink,.rst-content h6 .headerlink:focus,.rst-content h6:hover .headerlink,.rst-content p.caption .headerlink:focus,.rst-content p.caption:hover .headerlink,.rst-content p .headerlink:focus,.rst-content p:hover .headerlink,.rst-content table>caption .headerlink:focus,.rst-content table>caption:hover .headerlink{opacity:1}.rst-content p a{overflow-wrap:anywhere}.rst-content .wy-table td p,.rst-content .wy-table td ul,.rst-content .wy-table th p,.rst-content .wy-table th ul,.rst-content table.docutils td p,.rst-content table.docutils td ul,.rst-content table.docutils th p,.rst-content table.docutils th ul,.rst-content table.field-list td p,.rst-content table.field-list td ul,.rst-content table.field-list th p,.rst-content table.field-list th ul{font-size:inherit}.rst-content .btn:focus{outline:2px solid}.rst-content table>caption .headerlink:after{font-size:12px}.rst-content .centered{text-align:center}.rst-content .sidebar{float:right;width:40%;display:block;margin:0 0 24px 24px;padding:24px;background:#f3f6f6;border:1px solid #e1e4e5}.rst-content .sidebar dl,.rst-content .sidebar p,.rst-content .sidebar ul{font-size:90%}.rst-content .sidebar .last,.rst-content .sidebar>:last-child{margin-bottom:0}.rst-content .sidebar .sidebar-title{display:block;font-family:Roboto Slab,ff-tisa-web-pro,Georgia,Arial,sans-serif;font-weight:700;background:#e1e4e5;padding:6px 12px;margin:-24px -24px 24px;font-size:100%}.rst-content .highlighted{background:#f1c40f;box-shadow:0 0 0 2px #f1c40f;display:inline;font-weight:700}.rst-content .citation-reference,.rst-content .footnote-reference{vertical-align:baseline;position:relative;top:-.4em;line-height:0;font-size:90%}.rst-content .citation-reference>span.fn-bracket,.rst-content 
.footnote-reference>span.fn-bracket{display:none}.rst-content .hlist{width:100%}.rst-content dl dt span.classifier:before{content:" : "}.rst-content dl dt span.classifier-delimiter{display:none!important}html.writer-html4 .rst-content table.docutils.citation,html.writer-html4 .rst-content table.docutils.footnote{background:none;border:none}html.writer-html4 .rst-content table.docutils.citation td,html.writer-html4 .rst-content table.docutils.citation tr,html.writer-html4 .rst-content table.docutils.footnote td,html.writer-html4 .rst-content table.docutils.footnote tr{border:none;background-color:transparent!important;white-space:normal}html.writer-html4 .rst-content table.docutils.citation td.label,html.writer-html4 .rst-content table.docutils.footnote td.label{padding-left:0;padding-right:0;vertical-align:top}html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.field-list,html.writer-html5 .rst-content dl.footnote{display:grid;grid-template-columns:auto minmax(80%,95%)}html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.field-list>dt,html.writer-html5 .rst-content dl.footnote>dt{display:inline-grid;grid-template-columns:max-content auto}html.writer-html5 .rst-content aside.citation,html.writer-html5 .rst-content aside.footnote,html.writer-html5 .rst-content div.citation{display:grid;grid-template-columns:auto auto minmax(.65rem,auto) minmax(40%,95%)}html.writer-html5 .rst-content aside.citation>span.label,html.writer-html5 .rst-content aside.footnote>span.label,html.writer-html5 .rst-content div.citation>span.label{grid-column-start:1;grid-column-end:2}html.writer-html5 .rst-content aside.citation>span.backrefs,html.writer-html5 .rst-content aside.footnote>span.backrefs,html.writer-html5 .rst-content div.citation>span.backrefs{grid-column-start:2;grid-column-end:3;grid-row-start:1;grid-row-end:3}html.writer-html5 .rst-content aside.citation>p,html.writer-html5 .rst-content aside.footnote>p,html.writer-html5 .rst-content div.citation>p{grid-column-start:4;grid-column-end:5}html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.field-list,html.writer-html5 .rst-content dl.footnote{margin-bottom:24px}html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.field-list>dt,html.writer-html5 .rst-content dl.footnote>dt{padding-left:1rem}html.writer-html5 .rst-content dl.citation>dd,html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.field-list>dd,html.writer-html5 .rst-content dl.field-list>dt,html.writer-html5 .rst-content dl.footnote>dd,html.writer-html5 .rst-content dl.footnote>dt{margin-bottom:0}html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.footnote{font-size:.9rem}html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.footnote>dt{margin:0 .5rem .5rem 0;line-height:1.2rem;word-break:break-all;font-weight:400}html.writer-html5 .rst-content dl.citation>dt>span.brackets:before,html.writer-html5 .rst-content dl.footnote>dt>span.brackets:before{content:"["}html.writer-html5 .rst-content dl.citation>dt>span.brackets:after,html.writer-html5 .rst-content dl.footnote>dt>span.brackets:after{content:"]"}html.writer-html5 .rst-content dl.citation>dt>span.fn-backref,html.writer-html5 .rst-content dl.footnote>dt>span.fn-backref{text-align:left;font-style:italic;margin-left:.65rem;word-break:break-word;word-spacing:-.1rem;max-width:5rem}html.writer-html5 .rst-content dl.citation>dt>span.fn-backref>a,html.writer-html5 
.rst-content dl.footnote>dt>span.fn-backref>a{word-break:keep-all}html.writer-html5 .rst-content dl.citation>dt>span.fn-backref>a:not(:first-child):before,html.writer-html5 .rst-content dl.footnote>dt>span.fn-backref>a:not(:first-child):before{content:" "}html.writer-html5 .rst-content dl.citation>dd,html.writer-html5 .rst-content dl.footnote>dd{margin:0 0 .5rem;line-height:1.2rem}html.writer-html5 .rst-content dl.citation>dd p,html.writer-html5 .rst-content dl.footnote>dd p{font-size:.9rem}html.writer-html5 .rst-content aside.citation,html.writer-html5 .rst-content aside.footnote,html.writer-html5 .rst-content div.citation{padding-left:1rem;padding-right:1rem;font-size:.9rem;line-height:1.2rem}html.writer-html5 .rst-content aside.citation p,html.writer-html5 .rst-content aside.footnote p,html.writer-html5 .rst-content div.citation p{font-size:.9rem;line-height:1.2rem;margin-bottom:12px}html.writer-html5 .rst-content aside.citation span.backrefs,html.writer-html5 .rst-content aside.footnote span.backrefs,html.writer-html5 .rst-content div.citation span.backrefs{text-align:left;font-style:italic;margin-left:.65rem;word-break:break-word;word-spacing:-.1rem;max-width:5rem}html.writer-html5 .rst-content aside.citation span.backrefs>a,html.writer-html5 .rst-content aside.footnote span.backrefs>a,html.writer-html5 .rst-content div.citation span.backrefs>a{word-break:keep-all}html.writer-html5 .rst-content aside.citation span.backrefs>a:not(:first-child):before,html.writer-html5 .rst-content aside.footnote span.backrefs>a:not(:first-child):before,html.writer-html5 .rst-content div.citation span.backrefs>a:not(:first-child):before{content:" "}html.writer-html5 .rst-content aside.citation span.label,html.writer-html5 .rst-content aside.footnote span.label,html.writer-html5 .rst-content div.citation span.label{line-height:1.2rem}html.writer-html5 .rst-content aside.citation-list,html.writer-html5 .rst-content aside.footnote-list,html.writer-html5 .rst-content div.citation-list{margin-bottom:24px}html.writer-html5 .rst-content dl.option-list kbd{font-size:.9rem}.rst-content table.docutils.footnote,html.writer-html4 .rst-content table.docutils.citation,html.writer-html5 .rst-content aside.footnote,html.writer-html5 .rst-content aside.footnote-list aside.footnote,html.writer-html5 .rst-content div.citation-list>div.citation,html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.footnote{color:grey}.rst-content table.docutils.footnote code,.rst-content table.docutils.footnote tt,html.writer-html4 .rst-content table.docutils.citation code,html.writer-html4 .rst-content table.docutils.citation tt,html.writer-html5 .rst-content aside.footnote-list aside.footnote code,html.writer-html5 .rst-content aside.footnote-list aside.footnote tt,html.writer-html5 .rst-content aside.footnote code,html.writer-html5 .rst-content aside.footnote tt,html.writer-html5 .rst-content div.citation-list>div.citation code,html.writer-html5 .rst-content div.citation-list>div.citation tt,html.writer-html5 .rst-content dl.citation code,html.writer-html5 .rst-content dl.citation tt,html.writer-html5 .rst-content dl.footnote code,html.writer-html5 .rst-content dl.footnote tt{color:#555}.rst-content .wy-table-responsive.citation,.rst-content .wy-table-responsive.footnote{margin-bottom:0}.rst-content .wy-table-responsive.citation+:not(.citation),.rst-content .wy-table-responsive.footnote+:not(.footnote){margin-top:24px}.rst-content .wy-table-responsive.citation:last-child,.rst-content 
.wy-table-responsive.footnote:last-child{margin-bottom:24px}.rst-content table.docutils th{border-color:#e1e4e5}html.writer-html5 .rst-content table.docutils th{border:1px solid #e1e4e5}html.writer-html5 .rst-content table.docutils td>p,html.writer-html5 .rst-content table.docutils th>p{line-height:1rem;margin-bottom:0;font-size:.9rem}.rst-content table.docutils td .last,.rst-content table.docutils td .last>:last-child{margin-bottom:0}.rst-content table.field-list,.rst-content table.field-list td{border:none}.rst-content table.field-list td p{line-height:inherit}.rst-content table.field-list td>strong{display:inline-block}.rst-content table.field-list .field-name{padding-right:10px;text-align:left;white-space:nowrap}.rst-content table.field-list .field-body{text-align:left}.rst-content code,.rst-content tt{color:#000;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;padding:2px 5px}.rst-content code big,.rst-content code em,.rst-content tt big,.rst-content tt em{font-size:100%!important;line-height:normal}.rst-content code.literal,.rst-content tt.literal{color:#e74c3c;white-space:normal}.rst-content code.xref,.rst-content tt.xref,a .rst-content code,a .rst-content tt{font-weight:700;color:#404040;overflow-wrap:normal}.rst-content kbd,.rst-content pre,.rst-content samp{font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace}.rst-content a code,.rst-content a tt{color:#2980b9}.rst-content dl{margin-bottom:24px}.rst-content dl dt{font-weight:700;margin-bottom:12px}.rst-content dl ol,.rst-content dl p,.rst-content dl table,.rst-content dl ul{margin-bottom:12px}.rst-content dl dd{margin:0 0 12px 24px;line-height:24px}.rst-content dl dd>ol:last-child,.rst-content dl dd>p:last-child,.rst-content dl dd>table:last-child,.rst-content dl dd>ul:last-child{margin-bottom:0}html.writer-html4 .rst-content dl:not(.docutils),html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple){margin-bottom:24px}html.writer-html4 .rst-content dl:not(.docutils)>dt,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt{display:table;margin:6px 0;font-size:90%;line-height:normal;background:#e7f2fa;color:#2980b9;border-top:3px solid #6ab0de;padding:6px;position:relative}html.writer-html4 .rst-content dl:not(.docutils)>dt:before,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt:before{color:#6ab0de}html.writer-html4 .rst-content dl:not(.docutils)>dt .headerlink,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt .headerlink{color:#404040;font-size:100%!important}html.writer-html4 .rst-content dl:not(.docutils) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt{margin-bottom:6px;border:none;border-left:3px solid #ccc;background:#f0f0f0;color:#555}html.writer-html4 .rst-content dl:not(.docutils) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt .headerlink,html.writer-html5 .rst-content 
dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt .headerlink{color:#404040;font-size:100%!important}html.writer-html4 .rst-content dl:not(.docutils)>dt:first-child,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt:first-child{margin-top:0}html.writer-html4 .rst-content dl:not(.docutils) code.descclassname,html.writer-html4 .rst-content dl:not(.docutils) code.descname,html.writer-html4 .rst-content dl:not(.docutils) tt.descclassname,html.writer-html4 .rst-content dl:not(.docutils) tt.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) code.descclassname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) code.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) tt.descclassname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) tt.descname{background-color:transparent;border:none;padding:0;font-size:100%!important}html.writer-html4 .rst-content dl:not(.docutils) code.descname,html.writer-html4 .rst-content dl:not(.docutils) tt.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) code.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) tt.descname{font-weight:700}html.writer-html4 .rst-content dl:not(.docutils) .optional,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .optional{display:inline-block;padding:0 4px;color:#000;font-weight:700}html.writer-html4 .rst-content dl:not(.docutils) .property,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .property{display:inline-block;padding-right:8px;max-width:100%}html.writer-html4 .rst-content dl:not(.docutils) .k,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .k{font-style:italic}html.writer-html4 .rst-content dl:not(.docutils) .descclassname,html.writer-html4 .rst-content dl:not(.docutils) .descname,html.writer-html4 .rst-content dl:not(.docutils) .sig-name,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .descclassname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .sig-name{font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;color:#000}.rst-content .viewcode-back,.rst-content .viewcode-link{display:inline-block;color:#27ae60;font-size:80%;padding-left:24px}.rst-content .viewcode-back{display:block;float:right}.rst-content p.rubric{margin-bottom:12px;font-weight:700}.rst-content 
code.download,.rst-content tt.download{background:inherit;padding:inherit;font-weight:400;font-family:inherit;font-size:inherit;color:inherit;border:inherit;white-space:inherit}.rst-content code.download span:first-child,.rst-content tt.download span:first-child{-webkit-font-smoothing:subpixel-antialiased}.rst-content code.download span:first-child:before,.rst-content tt.download span:first-child:before{margin-right:4px}.rst-content .guilabel,.rst-content .menuselection{font-size:80%;font-weight:700;border-radius:4px;padding:2.4px 6px;margin:auto 2px}.rst-content .guilabel,.rst-content .menuselection{border:1px solid #7fbbe3;background:#e7f2fa}.rst-content :not(dl.option-list)>:not(dt):not(kbd):not(.kbd)>.kbd,.rst-content :not(dl.option-list)>:not(dt):not(kbd):not(.kbd)>kbd{color:inherit;font-size:80%;background-color:#fff;border:1px solid #a6a6a6;border-radius:4px;box-shadow:0 2px grey;padding:2.4px 6px;margin:auto 0}.rst-content .versionmodified{font-style:italic}@media screen and (max-width:480px){.rst-content .sidebar{width:100%}}span[id*=MathJax-Span]{color:#404040}.math{text-align:center}@font-face{font-family:Lato;src:url(fonts/lato-normal.woff2?bd03a2cc277bbbc338d464e679fe9942) format("woff2"),url(fonts/lato-normal.woff?27bd77b9162d388cb8d4c4217c7c5e2a) format("woff");font-weight:400;font-style:normal;font-display:block}@font-face{font-family:Lato;src:url(fonts/lato-bold.woff2?cccb897485813c7c256901dbca54ecf2) format("woff2"),url(fonts/lato-bold.woff?d878b6c29b10beca227e9eef4246111b) format("woff");font-weight:700;font-style:normal;font-display:block}@font-face{font-family:Lato;src:url(fonts/lato-bold-italic.woff2?0b6bb6725576b072c5d0b02ecdd1900d) format("woff2"),url(fonts/lato-bold-italic.woff?9c7e4e9eb485b4a121c760e61bc3707c) format("woff");font-weight:700;font-style:italic;font-display:block}@font-face{font-family:Lato;src:url(fonts/lato-normal-italic.woff2?4eb103b4d12be57cb1d040ed5e162e9d) format("woff2"),url(fonts/lato-normal-italic.woff?f28f2d6482446544ef1ea1ccc6dd5892) format("woff");font-weight:400;font-style:italic;font-display:block}@font-face{font-family:Roboto Slab;font-style:normal;font-weight:400;src:url(fonts/Roboto-Slab-Regular.woff2?7abf5b8d04d26a2cafea937019bca958) format("woff2"),url(fonts/Roboto-Slab-Regular.woff?c1be9284088d487c5e3ff0a10a92e58c) format("woff");font-display:block}@font-face{font-family:Roboto Slab;font-style:normal;font-weight:700;src:url(fonts/Roboto-Slab-Bold.woff2?9984f4a9bda09be08e83f2506954adbe) format("woff2"),url(fonts/Roboto-Slab-Bold.woff?bed5564a116b05148e3b3bea6fb1162a) format("woff");font-display:block} \ No newline at end of file diff --git a/releases/1.32.2/_static/doctools.js b/releases/1.32.2/_static/doctools.js new file mode 100644 index 00000000..527b876c --- /dev/null +++ b/releases/1.32.2/_static/doctools.js @@ -0,0 +1,156 @@ +/* + * doctools.js + * ~~~~~~~~~~~ + * + * Base JavaScript utilities for all Sphinx HTML documentation. + * + * :copyright: Copyright 2007-2022 by the Sphinx team, see AUTHORS. + * :license: BSD, see LICENSE for details. + * + */ +"use strict"; + +const BLACKLISTED_KEY_CONTROL_ELEMENTS = new Set([ + "TEXTAREA", + "INPUT", + "SELECT", + "BUTTON", +]); + +const _ready = (callback) => { + if (document.readyState !== "loading") { + callback(); + } else { + document.addEventListener("DOMContentLoaded", callback); + } +}; + +/** + * Small JavaScript module for the documentation. 
+ */ +const Documentation = { + init: () => { + Documentation.initDomainIndexTable(); + Documentation.initOnKeyListeners(); + }, + + /** + * i18n support + */ + TRANSLATIONS: {}, + PLURAL_EXPR: (n) => (n === 1 ? 0 : 1), + LOCALE: "unknown", + + // gettext and ngettext don't access this so that the functions + // can safely bound to a different name (_ = Documentation.gettext) + gettext: (string) => { + const translated = Documentation.TRANSLATIONS[string]; + switch (typeof translated) { + case "undefined": + return string; // no translation + case "string": + return translated; // translation exists + default: + return translated[0]; // (singular, plural) translation tuple exists + } + }, + + ngettext: (singular, plural, n) => { + const translated = Documentation.TRANSLATIONS[singular]; + if (typeof translated !== "undefined") + return translated[Documentation.PLURAL_EXPR(n)]; + return n === 1 ? singular : plural; + }, + + addTranslations: (catalog) => { + Object.assign(Documentation.TRANSLATIONS, catalog.messages); + Documentation.PLURAL_EXPR = new Function( + "n", + `return (${catalog.plural_expr})` + ); + Documentation.LOCALE = catalog.locale; + }, + + /** + * helper function to focus on search bar + */ + focusSearchBar: () => { + document.querySelectorAll("input[name=q]")[0]?.focus(); + }, + + /** + * Initialise the domain index toggle buttons + */ + initDomainIndexTable: () => { + const toggler = (el) => { + const idNumber = el.id.substr(7); + const toggledRows = document.querySelectorAll(`tr.cg-${idNumber}`); + if (el.src.substr(-9) === "minus.png") { + el.src = `${el.src.substr(0, el.src.length - 9)}plus.png`; + toggledRows.forEach((el) => (el.style.display = "none")); + } else { + el.src = `${el.src.substr(0, el.src.length - 8)}minus.png`; + toggledRows.forEach((el) => (el.style.display = "")); + } + }; + + const togglerElements = document.querySelectorAll("img.toggler"); + togglerElements.forEach((el) => + el.addEventListener("click", (event) => toggler(event.currentTarget)) + ); + togglerElements.forEach((el) => (el.style.display = "")); + if (DOCUMENTATION_OPTIONS.COLLAPSE_INDEX) togglerElements.forEach(toggler); + }, + + initOnKeyListeners: () => { + // only install a listener if it is really needed + if ( + !DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS && + !DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS + ) + return; + + document.addEventListener("keydown", (event) => { + // bail for input elements + if (BLACKLISTED_KEY_CONTROL_ELEMENTS.has(document.activeElement.tagName)) return; + // bail with special keys + if (event.altKey || event.ctrlKey || event.metaKey) return; + + if (!event.shiftKey) { + switch (event.key) { + case "ArrowLeft": + if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) break; + + const prevLink = document.querySelector('link[rel="prev"]'); + if (prevLink && prevLink.href) { + window.location.href = prevLink.href; + event.preventDefault(); + } + break; + case "ArrowRight": + if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) break; + + const nextLink = document.querySelector('link[rel="next"]'); + if (nextLink && nextLink.href) { + window.location.href = nextLink.href; + event.preventDefault(); + } + break; + } + } + + // some keyboard layouts may need Shift to get / + switch (event.key) { + case "/": + if (!DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS) break; + Documentation.focusSearchBar(); + event.preventDefault(); + } + }); + }, +}; + +// quick alias for translations +const _ = Documentation.gettext; + +_ready(Documentation.init); diff --git 
a/releases/1.32.2/_static/documentation_options.js b/releases/1.32.2/_static/documentation_options.js new file mode 100644 index 00000000..b57ae3b8 --- /dev/null +++ b/releases/1.32.2/_static/documentation_options.js @@ -0,0 +1,14 @@ +var DOCUMENTATION_OPTIONS = { + URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'), + VERSION: '', + LANGUAGE: 'en', + COLLAPSE_INDEX: false, + BUILDER: 'html', + FILE_SUFFIX: '.html', + LINK_SUFFIX: '.html', + HAS_SOURCE: true, + SOURCELINK_SUFFIX: '.txt', + NAVIGATION_WITH_KEYS: false, + SHOW_SEARCH_SUMMARY: true, + ENABLE_SEARCH_SHORTCUTS: true, +}; \ No newline at end of file diff --git a/releases/1.32.2/_static/file.png b/releases/1.32.2/_static/file.png new file mode 100644 index 00000000..a858a410 Binary files /dev/null and b/releases/1.32.2/_static/file.png differ diff --git a/releases/1.32.2/_static/jquery-3.6.0.js b/releases/1.32.2/_static/jquery-3.6.0.js new file mode 100644 index 00000000..fc6c299b --- /dev/null +++ b/releases/1.32.2/_static/jquery-3.6.0.js @@ -0,0 +1,10881 @@ +/*! + * jQuery JavaScript Library v3.6.0 + * https://jquery.com/ + * + * Includes Sizzle.js + * https://sizzlejs.com/ + * + * Copyright OpenJS Foundation and other contributors + * Released under the MIT license + * https://jquery.org/license + * + * Date: 2021-03-02T17:08Z + */ +( function( global, factory ) { + + "use strict"; + + if ( typeof module === "object" && typeof module.exports === "object" ) { + + // For CommonJS and CommonJS-like environments where a proper `window` + // is present, execute the factory and get jQuery. + // For environments that do not have a `window` with a `document` + // (such as Node.js), expose a factory as module.exports. + // This accentuates the need for the creation of a real `window`. + // e.g. var jQuery = require("jquery")(window); + // See ticket #14549 for more info. + module.exports = global.document ? + factory( global, true ) : + function( w ) { + if ( !w.document ) { + throw new Error( "jQuery requires a window with a document" ); + } + return factory( w ); + }; + } else { + factory( global ); + } + +// Pass this if window is not defined yet +} )( typeof window !== "undefined" ? window : this, function( window, noGlobal ) { + +// Edge <= 12 - 13+, Firefox <=18 - 45+, IE 10 - 11, Safari 5.1 - 9+, iOS 6 - 9.1 +// throw exceptions when non-strict code (e.g., ASP.NET 4.5) accesses strict mode +// arguments.callee.caller (trac-13335). But as of jQuery 3.0 (2016), strict mode should be common +// enough that all such attempts are guarded in a try block. +"use strict"; + +var arr = []; + +var getProto = Object.getPrototypeOf; + +var slice = arr.slice; + +var flat = arr.flat ? function( array ) { + return arr.flat.call( array ); +} : function( array ) { + return arr.concat.apply( [], array ); +}; + + +var push = arr.push; + +var indexOf = arr.indexOf; + +var class2type = {}; + +var toString = class2type.toString; + +var hasOwn = class2type.hasOwnProperty; + +var fnToString = hasOwn.toString; + +var ObjectFunctionString = fnToString.call( Object ); + +var support = {}; + +var isFunction = function isFunction( obj ) { + + // Support: Chrome <=57, Firefox <=52 + // In some browsers, typeof returns "function" for HTML elements + // (i.e., `typeof document.createElement( "object" ) === "function"`). + // We don't want to classify *any* DOM node as a function. 
+ // Support: QtWeb <=3.8.5, WebKit <=534.34, wkhtmltopdf tool <=0.12.5 + // Plus for old WebKit, typeof returns "function" for HTML collections + // (e.g., `typeof document.getElementsByTagName("div") === "function"`). (gh-4756) + return typeof obj === "function" && typeof obj.nodeType !== "number" && + typeof obj.item !== "function"; + }; + + +var isWindow = function isWindow( obj ) { + return obj != null && obj === obj.window; + }; + + +var document = window.document; + + + + var preservedScriptAttributes = { + type: true, + src: true, + nonce: true, + noModule: true + }; + + function DOMEval( code, node, doc ) { + doc = doc || document; + + var i, val, + script = doc.createElement( "script" ); + + script.text = code; + if ( node ) { + for ( i in preservedScriptAttributes ) { + + // Support: Firefox 64+, Edge 18+ + // Some browsers don't support the "nonce" property on scripts. + // On the other hand, just using `getAttribute` is not enough as + // the `nonce` attribute is reset to an empty string whenever it + // becomes browsing-context connected. + // See https://github.com/whatwg/html/issues/2369 + // See https://html.spec.whatwg.org/#nonce-attributes + // The `node.getAttribute` check was added for the sake of + // `jQuery.globalEval` so that it can fake a nonce-containing node + // via an object. + val = node[ i ] || node.getAttribute && node.getAttribute( i ); + if ( val ) { + script.setAttribute( i, val ); + } + } + } + doc.head.appendChild( script ).parentNode.removeChild( script ); + } + + +function toType( obj ) { + if ( obj == null ) { + return obj + ""; + } + + // Support: Android <=2.3 only (functionish RegExp) + return typeof obj === "object" || typeof obj === "function" ? + class2type[ toString.call( obj ) ] || "object" : + typeof obj; +} +/* global Symbol */ +// Defining this global in .eslintrc.json would create a danger of using the global +// unguarded in another place, it seems safer to define global only for this module + + + +var + version = "3.6.0", + + // Define a local copy of jQuery + jQuery = function( selector, context ) { + + // The jQuery object is actually just the init constructor 'enhanced' + // Need init if jQuery is called (just allow error to be thrown if not included) + return new jQuery.fn.init( selector, context ); + }; + +jQuery.fn = jQuery.prototype = { + + // The current version of jQuery being used + jquery: version, + + constructor: jQuery, + + // The default length of a jQuery object is 0 + length: 0, + + toArray: function() { + return slice.call( this ); + }, + + // Get the Nth element in the matched element set OR + // Get the whole matched element set as a clean array + get: function( num ) { + + // Return all the elements in a clean array + if ( num == null ) { + return slice.call( this ); + } + + // Return just the one element from the set + return num < 0 ? this[ num + this.length ] : this[ num ]; + }, + + // Take an array of elements and push it onto the stack + // (returning the new matched element set) + pushStack: function( elems ) { + + // Build a new jQuery matched element set + var ret = jQuery.merge( this.constructor(), elems ); + + // Add the old object onto the stack (as a reference) + ret.prevObject = this; + + // Return the newly-formed element set + return ret; + }, + + // Execute a callback for every element in the matched set. 
+ each: function( callback ) { + return jQuery.each( this, callback ); + }, + + map: function( callback ) { + return this.pushStack( jQuery.map( this, function( elem, i ) { + return callback.call( elem, i, elem ); + } ) ); + }, + + slice: function() { + return this.pushStack( slice.apply( this, arguments ) ); + }, + + first: function() { + return this.eq( 0 ); + }, + + last: function() { + return this.eq( -1 ); + }, + + even: function() { + return this.pushStack( jQuery.grep( this, function( _elem, i ) { + return ( i + 1 ) % 2; + } ) ); + }, + + odd: function() { + return this.pushStack( jQuery.grep( this, function( _elem, i ) { + return i % 2; + } ) ); + }, + + eq: function( i ) { + var len = this.length, + j = +i + ( i < 0 ? len : 0 ); + return this.pushStack( j >= 0 && j < len ? [ this[ j ] ] : [] ); + }, + + end: function() { + return this.prevObject || this.constructor(); + }, + + // For internal use only. + // Behaves like an Array's method, not like a jQuery method. + push: push, + sort: arr.sort, + splice: arr.splice +}; + +jQuery.extend = jQuery.fn.extend = function() { + var options, name, src, copy, copyIsArray, clone, + target = arguments[ 0 ] || {}, + i = 1, + length = arguments.length, + deep = false; + + // Handle a deep copy situation + if ( typeof target === "boolean" ) { + deep = target; + + // Skip the boolean and the target + target = arguments[ i ] || {}; + i++; + } + + // Handle case when target is a string or something (possible in deep copy) + if ( typeof target !== "object" && !isFunction( target ) ) { + target = {}; + } + + // Extend jQuery itself if only one argument is passed + if ( i === length ) { + target = this; + i--; + } + + for ( ; i < length; i++ ) { + + // Only deal with non-null/undefined values + if ( ( options = arguments[ i ] ) != null ) { + + // Extend the base object + for ( name in options ) { + copy = options[ name ]; + + // Prevent Object.prototype pollution + // Prevent never-ending loop + if ( name === "__proto__" || target === copy ) { + continue; + } + + // Recurse if we're merging plain objects or arrays + if ( deep && copy && ( jQuery.isPlainObject( copy ) || + ( copyIsArray = Array.isArray( copy ) ) ) ) { + src = target[ name ]; + + // Ensure proper type for the source value + if ( copyIsArray && !Array.isArray( src ) ) { + clone = []; + } else if ( !copyIsArray && !jQuery.isPlainObject( src ) ) { + clone = {}; + } else { + clone = src; + } + copyIsArray = false; + + // Never move original objects, clone them + target[ name ] = jQuery.extend( deep, clone, copy ); + + // Don't bring in undefined values + } else if ( copy !== undefined ) { + target[ name ] = copy; + } + } + } + } + + // Return the modified object + return target; +}; + +jQuery.extend( { + + // Unique for each copy of jQuery on the page + expando: "jQuery" + ( version + Math.random() ).replace( /\D/g, "" ), + + // Assume jQuery is ready without the ready module + isReady: true, + + error: function( msg ) { + throw new Error( msg ); + }, + + noop: function() {}, + + isPlainObject: function( obj ) { + var proto, Ctor; + + // Detect obvious negatives + // Use toString instead of jQuery.type to catch host objects + if ( !obj || toString.call( obj ) !== "[object Object]" ) { + return false; + } + + proto = getProto( obj ); + + // Objects with no prototype (e.g., `Object.create( null )`) are plain + if ( !proto ) { + return true; + } + + // Objects with prototype are plain iff they were constructed by a global Object function + Ctor = hasOwn.call( proto, "constructor" ) && 
proto.constructor; + return typeof Ctor === "function" && fnToString.call( Ctor ) === ObjectFunctionString; + }, + + isEmptyObject: function( obj ) { + var name; + + for ( name in obj ) { + return false; + } + return true; + }, + + // Evaluates a script in a provided context; falls back to the global one + // if not specified. + globalEval: function( code, options, doc ) { + DOMEval( code, { nonce: options && options.nonce }, doc ); + }, + + each: function( obj, callback ) { + var length, i = 0; + + if ( isArrayLike( obj ) ) { + length = obj.length; + for ( ; i < length; i++ ) { + if ( callback.call( obj[ i ], i, obj[ i ] ) === false ) { + break; + } + } + } else { + for ( i in obj ) { + if ( callback.call( obj[ i ], i, obj[ i ] ) === false ) { + break; + } + } + } + + return obj; + }, + + // results is for internal usage only + makeArray: function( arr, results ) { + var ret = results || []; + + if ( arr != null ) { + if ( isArrayLike( Object( arr ) ) ) { + jQuery.merge( ret, + typeof arr === "string" ? + [ arr ] : arr + ); + } else { + push.call( ret, arr ); + } + } + + return ret; + }, + + inArray: function( elem, arr, i ) { + return arr == null ? -1 : indexOf.call( arr, elem, i ); + }, + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + merge: function( first, second ) { + var len = +second.length, + j = 0, + i = first.length; + + for ( ; j < len; j++ ) { + first[ i++ ] = second[ j ]; + } + + first.length = i; + + return first; + }, + + grep: function( elems, callback, invert ) { + var callbackInverse, + matches = [], + i = 0, + length = elems.length, + callbackExpect = !invert; + + // Go through the array, only saving the items + // that pass the validator function + for ( ; i < length; i++ ) { + callbackInverse = !callback( elems[ i ], i ); + if ( callbackInverse !== callbackExpect ) { + matches.push( elems[ i ] ); + } + } + + return matches; + }, + + // arg is for internal usage only + map: function( elems, callback, arg ) { + var length, value, + i = 0, + ret = []; + + // Go through the array, translating each of the items to their new values + if ( isArrayLike( elems ) ) { + length = elems.length; + for ( ; i < length; i++ ) { + value = callback( elems[ i ], i, arg ); + + if ( value != null ) { + ret.push( value ); + } + } + + // Go through every key on the object, + } else { + for ( i in elems ) { + value = callback( elems[ i ], i, arg ); + + if ( value != null ) { + ret.push( value ); + } + } + } + + // Flatten any nested arrays + return flat( ret ); + }, + + // A global GUID counter for objects + guid: 1, + + // jQuery.support is not used in Core but other projects attach their + // properties to it so it needs to exist. 
+ support: support +} ); + +if ( typeof Symbol === "function" ) { + jQuery.fn[ Symbol.iterator ] = arr[ Symbol.iterator ]; +} + +// Populate the class2type map +jQuery.each( "Boolean Number String Function Array Date RegExp Object Error Symbol".split( " " ), + function( _i, name ) { + class2type[ "[object " + name + "]" ] = name.toLowerCase(); + } ); + +function isArrayLike( obj ) { + + // Support: real iOS 8.2 only (not reproducible in simulator) + // `in` check used to prevent JIT error (gh-2145) + // hasOwn isn't used here due to false negatives + // regarding Nodelist length in IE + var length = !!obj && "length" in obj && obj.length, + type = toType( obj ); + + if ( isFunction( obj ) || isWindow( obj ) ) { + return false; + } + + return type === "array" || length === 0 || + typeof length === "number" && length > 0 && ( length - 1 ) in obj; +} +var Sizzle = +/*! + * Sizzle CSS Selector Engine v2.3.6 + * https://sizzlejs.com/ + * + * Copyright JS Foundation and other contributors + * Released under the MIT license + * https://js.foundation/ + * + * Date: 2021-02-16 + */ +( function( window ) { +var i, + support, + Expr, + getText, + isXML, + tokenize, + compile, + select, + outermostContext, + sortInput, + hasDuplicate, + + // Local document vars + setDocument, + document, + docElem, + documentIsHTML, + rbuggyQSA, + rbuggyMatches, + matches, + contains, + + // Instance-specific data + expando = "sizzle" + 1 * new Date(), + preferredDoc = window.document, + dirruns = 0, + done = 0, + classCache = createCache(), + tokenCache = createCache(), + compilerCache = createCache(), + nonnativeSelectorCache = createCache(), + sortOrder = function( a, b ) { + if ( a === b ) { + hasDuplicate = true; + } + return 0; + }, + + // Instance methods + hasOwn = ( {} ).hasOwnProperty, + arr = [], + pop = arr.pop, + pushNative = arr.push, + push = arr.push, + slice = arr.slice, + + // Use a stripped-down indexOf as it's faster than native + // https://jsperf.com/thor-indexof-vs-for/5 + indexOf = function( list, elem ) { + var i = 0, + len = list.length; + for ( ; i < len; i++ ) { + if ( list[ i ] === elem ) { + return i; + } + } + return -1; + }, + + booleans = "checked|selected|async|autofocus|autoplay|controls|defer|disabled|hidden|" + + "ismap|loop|multiple|open|readonly|required|scoped", + + // Regular expressions + + // http://www.w3.org/TR/css3-selectors/#whitespace + whitespace = "[\\x20\\t\\r\\n\\f]", + + // https://www.w3.org/TR/css-syntax-3/#ident-token-diagram + identifier = "(?:\\\\[\\da-fA-F]{1,6}" + whitespace + + "?|\\\\[^\\r\\n\\f]|[\\w-]|[^\0-\\x7f])+", + + // Attribute selectors: http://www.w3.org/TR/selectors/#attribute-selectors + attributes = "\\[" + whitespace + "*(" + identifier + ")(?:" + whitespace + + + // Operator (capture 2) + "*([*^$|!~]?=)" + whitespace + + + // "Attribute values must be CSS identifiers [capture 5] + // or strings [capture 3 or capture 4]" + "*(?:'((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\"|(" + identifier + "))|)" + + whitespace + "*\\]", + + pseudos = ":(" + identifier + ")(?:\\((" + + + // To reduce the number of selectors needing tokenize in the preFilter, prefer arguments: + // 1. quoted (capture 3; capture 4 or capture 5) + "('((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\")|" + + + // 2. simple (capture 6) + "((?:\\\\.|[^\\\\()[\\]]|" + attributes + ")*)|" + + + // 3. 
anything else (capture 2) + ".*" + + ")\\)|)", + + // Leading and non-escaped trailing whitespace, capturing some non-whitespace characters preceding the latter + rwhitespace = new RegExp( whitespace + "+", "g" ), + rtrim = new RegExp( "^" + whitespace + "+|((?:^|[^\\\\])(?:\\\\.)*)" + + whitespace + "+$", "g" ), + + rcomma = new RegExp( "^" + whitespace + "*," + whitespace + "*" ), + rcombinators = new RegExp( "^" + whitespace + "*([>+~]|" + whitespace + ")" + whitespace + + "*" ), + rdescend = new RegExp( whitespace + "|>" ), + + rpseudo = new RegExp( pseudos ), + ridentifier = new RegExp( "^" + identifier + "$" ), + + matchExpr = { + "ID": new RegExp( "^#(" + identifier + ")" ), + "CLASS": new RegExp( "^\\.(" + identifier + ")" ), + "TAG": new RegExp( "^(" + identifier + "|[*])" ), + "ATTR": new RegExp( "^" + attributes ), + "PSEUDO": new RegExp( "^" + pseudos ), + "CHILD": new RegExp( "^:(only|first|last|nth|nth-last)-(child|of-type)(?:\\(" + + whitespace + "*(even|odd|(([+-]|)(\\d*)n|)" + whitespace + "*(?:([+-]|)" + + whitespace + "*(\\d+)|))" + whitespace + "*\\)|)", "i" ), + "bool": new RegExp( "^(?:" + booleans + ")$", "i" ), + + // For use in libraries implementing .is() + // We use this for POS matching in `select` + "needsContext": new RegExp( "^" + whitespace + + "*[>+~]|:(even|odd|eq|gt|lt|nth|first|last)(?:\\(" + whitespace + + "*((?:-\\d)?\\d*)" + whitespace + "*\\)|)(?=[^-]|$)", "i" ) + }, + + rhtml = /HTML$/i, + rinputs = /^(?:input|select|textarea|button)$/i, + rheader = /^h\d$/i, + + rnative = /^[^{]+\{\s*\[native \w/, + + // Easily-parseable/retrievable ID or TAG or CLASS selectors + rquickExpr = /^(?:#([\w-]+)|(\w+)|\.([\w-]+))$/, + + rsibling = /[+~]/, + + // CSS escapes + // http://www.w3.org/TR/CSS21/syndata.html#escaped-characters + runescape = new RegExp( "\\\\[\\da-fA-F]{1,6}" + whitespace + "?|\\\\([^\\r\\n\\f])", "g" ), + funescape = function( escape, nonHex ) { + var high = "0x" + escape.slice( 1 ) - 0x10000; + + return nonHex ? + + // Strip the backslash prefix from a non-hex escape sequence + nonHex : + + // Replace a hexadecimal escape sequence with the encoded Unicode code point + // Support: IE <=11+ + // For values outside the Basic Multilingual Plane (BMP), manually construct a + // surrogate pair + high < 0 ? 
+ String.fromCharCode( high + 0x10000 ) : + String.fromCharCode( high >> 10 | 0xD800, high & 0x3FF | 0xDC00 ); + }, + + // CSS string/identifier serialization + // https://drafts.csswg.org/cssom/#common-serializing-idioms + rcssescape = /([\0-\x1f\x7f]|^-?\d)|^-$|[^\0-\x1f\x7f-\uFFFF\w-]/g, + fcssescape = function( ch, asCodePoint ) { + if ( asCodePoint ) { + + // U+0000 NULL becomes U+FFFD REPLACEMENT CHARACTER + if ( ch === "\0" ) { + return "\uFFFD"; + } + + // Control characters and (dependent upon position) numbers get escaped as code points + return ch.slice( 0, -1 ) + "\\" + + ch.charCodeAt( ch.length - 1 ).toString( 16 ) + " "; + } + + // Other potentially-special ASCII characters get backslash-escaped + return "\\" + ch; + }, + + // Used for iframes + // See setDocument() + // Removing the function wrapper causes a "Permission Denied" + // error in IE + unloadHandler = function() { + setDocument(); + }, + + inDisabledFieldset = addCombinator( + function( elem ) { + return elem.disabled === true && elem.nodeName.toLowerCase() === "fieldset"; + }, + { dir: "parentNode", next: "legend" } + ); + +// Optimize for push.apply( _, NodeList ) +try { + push.apply( + ( arr = slice.call( preferredDoc.childNodes ) ), + preferredDoc.childNodes + ); + + // Support: Android<4.0 + // Detect silently failing push.apply + // eslint-disable-next-line no-unused-expressions + arr[ preferredDoc.childNodes.length ].nodeType; +} catch ( e ) { + push = { apply: arr.length ? + + // Leverage slice if possible + function( target, els ) { + pushNative.apply( target, slice.call( els ) ); + } : + + // Support: IE<9 + // Otherwise append directly + function( target, els ) { + var j = target.length, + i = 0; + + // Can't trust NodeList.length + while ( ( target[ j++ ] = els[ i++ ] ) ) {} + target.length = j - 1; + } + }; +} + +function Sizzle( selector, context, results, seed ) { + var m, i, elem, nid, match, groups, newSelector, + newContext = context && context.ownerDocument, + + // nodeType defaults to 9, since context defaults to document + nodeType = context ? 
context.nodeType : 9; + + results = results || []; + + // Return early from calls with invalid selector or context + if ( typeof selector !== "string" || !selector || + nodeType !== 1 && nodeType !== 9 && nodeType !== 11 ) { + + return results; + } + + // Try to shortcut find operations (as opposed to filters) in HTML documents + if ( !seed ) { + setDocument( context ); + context = context || document; + + if ( documentIsHTML ) { + + // If the selector is sufficiently simple, try using a "get*By*" DOM method + // (excepting DocumentFragment context, where the methods don't exist) + if ( nodeType !== 11 && ( match = rquickExpr.exec( selector ) ) ) { + + // ID selector + if ( ( m = match[ 1 ] ) ) { + + // Document context + if ( nodeType === 9 ) { + if ( ( elem = context.getElementById( m ) ) ) { + + // Support: IE, Opera, Webkit + // TODO: identify versions + // getElementById can match elements by name instead of ID + if ( elem.id === m ) { + results.push( elem ); + return results; + } + } else { + return results; + } + + // Element context + } else { + + // Support: IE, Opera, Webkit + // TODO: identify versions + // getElementById can match elements by name instead of ID + if ( newContext && ( elem = newContext.getElementById( m ) ) && + contains( context, elem ) && + elem.id === m ) { + + results.push( elem ); + return results; + } + } + + // Type selector + } else if ( match[ 2 ] ) { + push.apply( results, context.getElementsByTagName( selector ) ); + return results; + + // Class selector + } else if ( ( m = match[ 3 ] ) && support.getElementsByClassName && + context.getElementsByClassName ) { + + push.apply( results, context.getElementsByClassName( m ) ); + return results; + } + } + + // Take advantage of querySelectorAll + if ( support.qsa && + !nonnativeSelectorCache[ selector + " " ] && + ( !rbuggyQSA || !rbuggyQSA.test( selector ) ) && + + // Support: IE 8 only + // Exclude object elements + ( nodeType !== 1 || context.nodeName.toLowerCase() !== "object" ) ) { + + newSelector = selector; + newContext = context; + + // qSA considers elements outside a scoping root when evaluating child or + // descendant combinators, which is not what we want. + // In such cases, we work around the behavior by prefixing every selector in the + // list with an ID selector referencing the scope context. + // The technique has to be used as well when a leading combinator is used + // as such selectors are not recognized by querySelectorAll. + // Thanks to Andrew Dupont for this technique. + if ( nodeType === 1 && + ( rdescend.test( selector ) || rcombinators.test( selector ) ) ) { + + // Expand context for sibling selectors + newContext = rsibling.test( selector ) && testContext( context.parentNode ) || + context; + + // We can use :scope instead of the ID hack if the browser + // supports it & if we're not changing the context. + if ( newContext !== context || !support.scope ) { + + // Capture the context ID, setting it first if necessary + if ( ( nid = context.getAttribute( "id" ) ) ) { + nid = nid.replace( rcssescape, fcssescape ); + } else { + context.setAttribute( "id", ( nid = expando ) ); + } + } + + // Prefix every selector in the list + groups = tokenize( selector ); + i = groups.length; + while ( i-- ) { + groups[ i ] = ( nid ? 
"#" + nid : ":scope" ) + " " + + toSelector( groups[ i ] ); + } + newSelector = groups.join( "," ); + } + + try { + push.apply( results, + newContext.querySelectorAll( newSelector ) + ); + return results; + } catch ( qsaError ) { + nonnativeSelectorCache( selector, true ); + } finally { + if ( nid === expando ) { + context.removeAttribute( "id" ); + } + } + } + } + } + + // All others + return select( selector.replace( rtrim, "$1" ), context, results, seed ); +} + +/** + * Create key-value caches of limited size + * @returns {function(string, object)} Returns the Object data after storing it on itself with + * property name the (space-suffixed) string and (if the cache is larger than Expr.cacheLength) + * deleting the oldest entry + */ +function createCache() { + var keys = []; + + function cache( key, value ) { + + // Use (key + " ") to avoid collision with native prototype properties (see Issue #157) + if ( keys.push( key + " " ) > Expr.cacheLength ) { + + // Only keep the most recent entries + delete cache[ keys.shift() ]; + } + return ( cache[ key + " " ] = value ); + } + return cache; +} + +/** + * Mark a function for special use by Sizzle + * @param {Function} fn The function to mark + */ +function markFunction( fn ) { + fn[ expando ] = true; + return fn; +} + +/** + * Support testing using an element + * @param {Function} fn Passed the created element and returns a boolean result + */ +function assert( fn ) { + var el = document.createElement( "fieldset" ); + + try { + return !!fn( el ); + } catch ( e ) { + return false; + } finally { + + // Remove from its parent by default + if ( el.parentNode ) { + el.parentNode.removeChild( el ); + } + + // release memory in IE + el = null; + } +} + +/** + * Adds the same handler for all of the specified attrs + * @param {String} attrs Pipe-separated list of attributes + * @param {Function} handler The method that will be applied + */ +function addHandle( attrs, handler ) { + var arr = attrs.split( "|" ), + i = arr.length; + + while ( i-- ) { + Expr.attrHandle[ arr[ i ] ] = handler; + } +} + +/** + * Checks document order of two siblings + * @param {Element} a + * @param {Element} b + * @returns {Number} Returns less than 0 if a precedes b, greater than 0 if a follows b + */ +function siblingCheck( a, b ) { + var cur = b && a, + diff = cur && a.nodeType === 1 && b.nodeType === 1 && + a.sourceIndex - b.sourceIndex; + + // Use IE sourceIndex if available on both nodes + if ( diff ) { + return diff; + } + + // Check if b follows a + if ( cur ) { + while ( ( cur = cur.nextSibling ) ) { + if ( cur === b ) { + return -1; + } + } + } + + return a ? 
1 : -1; +} + +/** + * Returns a function to use in pseudos for input types + * @param {String} type + */ +function createInputPseudo( type ) { + return function( elem ) { + var name = elem.nodeName.toLowerCase(); + return name === "input" && elem.type === type; + }; +} + +/** + * Returns a function to use in pseudos for buttons + * @param {String} type + */ +function createButtonPseudo( type ) { + return function( elem ) { + var name = elem.nodeName.toLowerCase(); + return ( name === "input" || name === "button" ) && elem.type === type; + }; +} + +/** + * Returns a function to use in pseudos for :enabled/:disabled + * @param {Boolean} disabled true for :disabled; false for :enabled + */ +function createDisabledPseudo( disabled ) { + + // Known :disabled false positives: fieldset[disabled] > legend:nth-of-type(n+2) :can-disable + return function( elem ) { + + // Only certain elements can match :enabled or :disabled + // https://html.spec.whatwg.org/multipage/scripting.html#selector-enabled + // https://html.spec.whatwg.org/multipage/scripting.html#selector-disabled + if ( "form" in elem ) { + + // Check for inherited disabledness on relevant non-disabled elements: + // * listed form-associated elements in a disabled fieldset + // https://html.spec.whatwg.org/multipage/forms.html#category-listed + // https://html.spec.whatwg.org/multipage/forms.html#concept-fe-disabled + // * option elements in a disabled optgroup + // https://html.spec.whatwg.org/multipage/forms.html#concept-option-disabled + // All such elements have a "form" property. + if ( elem.parentNode && elem.disabled === false ) { + + // Option elements defer to a parent optgroup if present + if ( "label" in elem ) { + if ( "label" in elem.parentNode ) { + return elem.parentNode.disabled === disabled; + } else { + return elem.disabled === disabled; + } + } + + // Support: IE 6 - 11 + // Use the isDisabled shortcut property to check for disabled fieldset ancestors + return elem.isDisabled === disabled || + + // Where there is no isDisabled, check manually + /* jshint -W018 */ + elem.isDisabled !== !disabled && + inDisabledFieldset( elem ) === disabled; + } + + return elem.disabled === disabled; + + // Try to winnow out elements that can't be disabled before trusting the disabled property. + // Some victims get caught in our net (label, legend, menu, track), but it shouldn't + // even exist on them, let alone have a boolean value. 
+ } else if ( "label" in elem ) { + return elem.disabled === disabled; + } + + // Remaining elements are neither :enabled nor :disabled + return false; + }; +} + +/** + * Returns a function to use in pseudos for positionals + * @param {Function} fn + */ +function createPositionalPseudo( fn ) { + return markFunction( function( argument ) { + argument = +argument; + return markFunction( function( seed, matches ) { + var j, + matchIndexes = fn( [], seed.length, argument ), + i = matchIndexes.length; + + // Match elements found at the specified indexes + while ( i-- ) { + if ( seed[ ( j = matchIndexes[ i ] ) ] ) { + seed[ j ] = !( matches[ j ] = seed[ j ] ); + } + } + } ); + } ); +} + +/** + * Checks a node for validity as a Sizzle context + * @param {Element|Object=} context + * @returns {Element|Object|Boolean} The input node if acceptable, otherwise a falsy value + */ +function testContext( context ) { + return context && typeof context.getElementsByTagName !== "undefined" && context; +} + +// Expose support vars for convenience +support = Sizzle.support = {}; + +/** + * Detects XML nodes + * @param {Element|Object} elem An element or a document + * @returns {Boolean} True iff elem is a non-HTML XML node + */ +isXML = Sizzle.isXML = function( elem ) { + var namespace = elem && elem.namespaceURI, + docElem = elem && ( elem.ownerDocument || elem ).documentElement; + + // Support: IE <=8 + // Assume HTML when documentElement doesn't yet exist, such as inside loading iframes + // https://bugs.jquery.com/ticket/4833 + return !rhtml.test( namespace || docElem && docElem.nodeName || "HTML" ); +}; + +/** + * Sets document-related variables once based on the current document + * @param {Element|Object} [doc] An element or document object to use to set the document + * @returns {Object} Returns the current document + */ +setDocument = Sizzle.setDocument = function( node ) { + var hasCompare, subWindow, + doc = node ? node.ownerDocument || node : preferredDoc; + + // Return early if doc is invalid or already selected + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( doc == document || doc.nodeType !== 9 || !doc.documentElement ) { + return document; + } + + // Update global variables + document = doc; + docElem = document.documentElement; + documentIsHTML = !isXML( document ); + + // Support: IE 9 - 11+, Edge 12 - 18+ + // Accessing iframe documents after unload throws "permission denied" errors (jQuery #13936) + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( preferredDoc != document && + ( subWindow = document.defaultView ) && subWindow.top !== subWindow ) { + + // Support: IE 11, Edge + if ( subWindow.addEventListener ) { + subWindow.addEventListener( "unload", unloadHandler, false ); + + // Support: IE 9 - 10 only + } else if ( subWindow.attachEvent ) { + subWindow.attachEvent( "onunload", unloadHandler ); + } + } + + // Support: IE 8 - 11+, Edge 12 - 18+, Chrome <=16 - 25 only, Firefox <=3.6 - 31 only, + // Safari 4 - 5 only, Opera <=11.6 - 12.x only + // IE/Edge & older browsers don't support the :scope pseudo-class. + // Support: Safari 6.0 only + // Safari 6.0 supports :scope but it's an alias of :root there. 
+ support.scope = assert( function( el ) { + docElem.appendChild( el ).appendChild( document.createElement( "div" ) ); + return typeof el.querySelectorAll !== "undefined" && + !el.querySelectorAll( ":scope fieldset div" ).length; + } ); + + /* Attributes + ---------------------------------------------------------------------- */ + + // Support: IE<8 + // Verify that getAttribute really returns attributes and not properties + // (excepting IE8 booleans) + support.attributes = assert( function( el ) { + el.className = "i"; + return !el.getAttribute( "className" ); + } ); + + /* getElement(s)By* + ---------------------------------------------------------------------- */ + + // Check if getElementsByTagName("*") returns only elements + support.getElementsByTagName = assert( function( el ) { + el.appendChild( document.createComment( "" ) ); + return !el.getElementsByTagName( "*" ).length; + } ); + + // Support: IE<9 + support.getElementsByClassName = rnative.test( document.getElementsByClassName ); + + // Support: IE<10 + // Check if getElementById returns elements by name + // The broken getElementById methods don't pick up programmatically-set names, + // so use a roundabout getElementsByName test + support.getById = assert( function( el ) { + docElem.appendChild( el ).id = expando; + return !document.getElementsByName || !document.getElementsByName( expando ).length; + } ); + + // ID filter and find + if ( support.getById ) { + Expr.filter[ "ID" ] = function( id ) { + var attrId = id.replace( runescape, funescape ); + return function( elem ) { + return elem.getAttribute( "id" ) === attrId; + }; + }; + Expr.find[ "ID" ] = function( id, context ) { + if ( typeof context.getElementById !== "undefined" && documentIsHTML ) { + var elem = context.getElementById( id ); + return elem ? [ elem ] : []; + } + }; + } else { + Expr.filter[ "ID" ] = function( id ) { + var attrId = id.replace( runescape, funescape ); + return function( elem ) { + var node = typeof elem.getAttributeNode !== "undefined" && + elem.getAttributeNode( "id" ); + return node && node.value === attrId; + }; + }; + + // Support: IE 6 - 7 only + // getElementById is not reliable as a find shortcut + Expr.find[ "ID" ] = function( id, context ) { + if ( typeof context.getElementById !== "undefined" && documentIsHTML ) { + var node, i, elems, + elem = context.getElementById( id ); + + if ( elem ) { + + // Verify the id attribute + node = elem.getAttributeNode( "id" ); + if ( node && node.value === id ) { + return [ elem ]; + } + + // Fall back on getElementsByName + elems = context.getElementsByName( id ); + i = 0; + while ( ( elem = elems[ i++ ] ) ) { + node = elem.getAttributeNode( "id" ); + if ( node && node.value === id ) { + return [ elem ]; + } + } + } + + return []; + } + }; + } + + // Tag + Expr.find[ "TAG" ] = support.getElementsByTagName ? 
+ function( tag, context ) { + if ( typeof context.getElementsByTagName !== "undefined" ) { + return context.getElementsByTagName( tag ); + + // DocumentFragment nodes don't have gEBTN + } else if ( support.qsa ) { + return context.querySelectorAll( tag ); + } + } : + + function( tag, context ) { + var elem, + tmp = [], + i = 0, + + // By happy coincidence, a (broken) gEBTN appears on DocumentFragment nodes too + results = context.getElementsByTagName( tag ); + + // Filter out possible comments + if ( tag === "*" ) { + while ( ( elem = results[ i++ ] ) ) { + if ( elem.nodeType === 1 ) { + tmp.push( elem ); + } + } + + return tmp; + } + return results; + }; + + // Class + Expr.find[ "CLASS" ] = support.getElementsByClassName && function( className, context ) { + if ( typeof context.getElementsByClassName !== "undefined" && documentIsHTML ) { + return context.getElementsByClassName( className ); + } + }; + + /* QSA/matchesSelector + ---------------------------------------------------------------------- */ + + // QSA and matchesSelector support + + // matchesSelector(:active) reports false when true (IE9/Opera 11.5) + rbuggyMatches = []; + + // qSa(:focus) reports false when true (Chrome 21) + // We allow this because of a bug in IE8/9 that throws an error + // whenever `document.activeElement` is accessed on an iframe + // So, we allow :focus to pass through QSA all the time to avoid the IE error + // See https://bugs.jquery.com/ticket/13378 + rbuggyQSA = []; + + if ( ( support.qsa = rnative.test( document.querySelectorAll ) ) ) { + + // Build QSA regex + // Regex strategy adopted from Diego Perini + assert( function( el ) { + + var input; + + // Select is set to empty string on purpose + // This is to test IE's treatment of not explicitly + // setting a boolean content attribute, + // since its presence should be enough + // https://bugs.jquery.com/ticket/12359 + docElem.appendChild( el ).innerHTML = "" + + ""; + + // Support: IE8, Opera 11-12.16 + // Nothing should be selected when empty strings follow ^= or $= or *= + // The test attribute must be unknown in Opera but "safe" for WinRT + // https://msdn.microsoft.com/en-us/library/ie/hh465388.aspx#attribute_section + if ( el.querySelectorAll( "[msallowcapture^='']" ).length ) { + rbuggyQSA.push( "[*^$]=" + whitespace + "*(?:''|\"\")" ); + } + + // Support: IE8 + // Boolean attributes and "value" are not treated correctly + if ( !el.querySelectorAll( "[selected]" ).length ) { + rbuggyQSA.push( "\\[" + whitespace + "*(?:value|" + booleans + ")" ); + } + + // Support: Chrome<29, Android<4.4, Safari<7.0+, iOS<7.0+, PhantomJS<1.9.8+ + if ( !el.querySelectorAll( "[id~=" + expando + "-]" ).length ) { + rbuggyQSA.push( "~=" ); + } + + // Support: IE 11+, Edge 15 - 18+ + // IE 11/Edge don't find elements on a `[name='']` query in some cases. + // Adding a temporary attribute to the document before the selection works + // around the issue. + // Interestingly, IE 10 & older don't seem to have the issue. 
+ input = document.createElement( "input" ); + input.setAttribute( "name", "" ); + el.appendChild( input ); + if ( !el.querySelectorAll( "[name='']" ).length ) { + rbuggyQSA.push( "\\[" + whitespace + "*name" + whitespace + "*=" + + whitespace + "*(?:''|\"\")" ); + } + + // Webkit/Opera - :checked should return selected option elements + // http://www.w3.org/TR/2011/REC-css3-selectors-20110929/#checked + // IE8 throws error here and will not see later tests + if ( !el.querySelectorAll( ":checked" ).length ) { + rbuggyQSA.push( ":checked" ); + } + + // Support: Safari 8+, iOS 8+ + // https://bugs.webkit.org/show_bug.cgi?id=136851 + // In-page `selector#id sibling-combinator selector` fails + if ( !el.querySelectorAll( "a#" + expando + "+*" ).length ) { + rbuggyQSA.push( ".#.+[+~]" ); + } + + // Support: Firefox <=3.6 - 5 only + // Old Firefox doesn't throw on a badly-escaped identifier. + el.querySelectorAll( "\\\f" ); + rbuggyQSA.push( "[\\r\\n\\f]" ); + } ); + + assert( function( el ) { + el.innerHTML = "" + + ""; + + // Support: Windows 8 Native Apps + // The type and name attributes are restricted during .innerHTML assignment + var input = document.createElement( "input" ); + input.setAttribute( "type", "hidden" ); + el.appendChild( input ).setAttribute( "name", "D" ); + + // Support: IE8 + // Enforce case-sensitivity of name attribute + if ( el.querySelectorAll( "[name=d]" ).length ) { + rbuggyQSA.push( "name" + whitespace + "*[*^$|!~]?=" ); + } + + // FF 3.5 - :enabled/:disabled and hidden elements (hidden elements are still enabled) + // IE8 throws error here and will not see later tests + if ( el.querySelectorAll( ":enabled" ).length !== 2 ) { + rbuggyQSA.push( ":enabled", ":disabled" ); + } + + // Support: IE9-11+ + // IE's :disabled selector does not pick up the children of disabled fieldsets + docElem.appendChild( el ).disabled = true; + if ( el.querySelectorAll( ":disabled" ).length !== 2 ) { + rbuggyQSA.push( ":enabled", ":disabled" ); + } + + // Support: Opera 10 - 11 only + // Opera 10-11 does not throw on post-comma invalid pseudos + el.querySelectorAll( "*,:x" ); + rbuggyQSA.push( ",.*:" ); + } ); + } + + if ( ( support.matchesSelector = rnative.test( ( matches = docElem.matches || + docElem.webkitMatchesSelector || + docElem.mozMatchesSelector || + docElem.oMatchesSelector || + docElem.msMatchesSelector ) ) ) ) { + + assert( function( el ) { + + // Check to see if it's possible to do matchesSelector + // on a disconnected node (IE 9) + support.disconnectedMatch = matches.call( el, "*" ); + + // This should fail with an exception + // Gecko does not error, returns false instead + matches.call( el, "[s!='']:x" ); + rbuggyMatches.push( "!=", pseudos ); + } ); + } + + rbuggyQSA = rbuggyQSA.length && new RegExp( rbuggyQSA.join( "|" ) ); + rbuggyMatches = rbuggyMatches.length && new RegExp( rbuggyMatches.join( "|" ) ); + + /* Contains + ---------------------------------------------------------------------- */ + hasCompare = rnative.test( docElem.compareDocumentPosition ); + + // Element contains another + // Purposefully self-exclusive + // As in, an element does not contain itself + contains = hasCompare || rnative.test( docElem.contains ) ? + function( a, b ) { + var adown = a.nodeType === 9 ? a.documentElement : a, + bup = b && b.parentNode; + return a === bup || !!( bup && bup.nodeType === 1 && ( + adown.contains ? 
+ adown.contains( bup ) : + a.compareDocumentPosition && a.compareDocumentPosition( bup ) & 16 + ) ); + } : + function( a, b ) { + if ( b ) { + while ( ( b = b.parentNode ) ) { + if ( b === a ) { + return true; + } + } + } + return false; + }; + + /* Sorting + ---------------------------------------------------------------------- */ + + // Document order sorting + sortOrder = hasCompare ? + function( a, b ) { + + // Flag for duplicate removal + if ( a === b ) { + hasDuplicate = true; + return 0; + } + + // Sort on method existence if only one input has compareDocumentPosition + var compare = !a.compareDocumentPosition - !b.compareDocumentPosition; + if ( compare ) { + return compare; + } + + // Calculate position if both inputs belong to the same document + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + compare = ( a.ownerDocument || a ) == ( b.ownerDocument || b ) ? + a.compareDocumentPosition( b ) : + + // Otherwise we know they are disconnected + 1; + + // Disconnected nodes + if ( compare & 1 || + ( !support.sortDetached && b.compareDocumentPosition( a ) === compare ) ) { + + // Choose the first element that is related to our preferred document + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( a == document || a.ownerDocument == preferredDoc && + contains( preferredDoc, a ) ) { + return -1; + } + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( b == document || b.ownerDocument == preferredDoc && + contains( preferredDoc, b ) ) { + return 1; + } + + // Maintain original order + return sortInput ? + ( indexOf( sortInput, a ) - indexOf( sortInput, b ) ) : + 0; + } + + return compare & 4 ? -1 : 1; + } : + function( a, b ) { + + // Exit early if the nodes are identical + if ( a === b ) { + hasDuplicate = true; + return 0; + } + + var cur, + i = 0, + aup = a.parentNode, + bup = b.parentNode, + ap = [ a ], + bp = [ b ]; + + // Parentless nodes are either documents or disconnected + if ( !aup || !bup ) { + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + /* eslint-disable eqeqeq */ + return a == document ? -1 : + b == document ? 1 : + /* eslint-enable eqeqeq */ + aup ? -1 : + bup ? 1 : + sortInput ? + ( indexOf( sortInput, a ) - indexOf( sortInput, b ) ) : + 0; + + // If the nodes are siblings, we can do a quick check + } else if ( aup === bup ) { + return siblingCheck( a, b ); + } + + // Otherwise we need full lists of their ancestors for comparison + cur = a; + while ( ( cur = cur.parentNode ) ) { + ap.unshift( cur ); + } + cur = b; + while ( ( cur = cur.parentNode ) ) { + bp.unshift( cur ); + } + + // Walk down the tree looking for a discrepancy + while ( ap[ i ] === bp[ i ] ) { + i++; + } + + return i ? + + // Do a sibling check if the nodes have a common ancestor + siblingCheck( ap[ i ], bp[ i ] ) : + + // Otherwise nodes in our document sort first + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. 
+ /* eslint-disable eqeqeq */ + ap[ i ] == preferredDoc ? -1 : + bp[ i ] == preferredDoc ? 1 : + /* eslint-enable eqeqeq */ + 0; + }; + + return document; +}; + +Sizzle.matches = function( expr, elements ) { + return Sizzle( expr, null, null, elements ); +}; + +Sizzle.matchesSelector = function( elem, expr ) { + setDocument( elem ); + + if ( support.matchesSelector && documentIsHTML && + !nonnativeSelectorCache[ expr + " " ] && + ( !rbuggyMatches || !rbuggyMatches.test( expr ) ) && + ( !rbuggyQSA || !rbuggyQSA.test( expr ) ) ) { + + try { + var ret = matches.call( elem, expr ); + + // IE 9's matchesSelector returns false on disconnected nodes + if ( ret || support.disconnectedMatch || + + // As well, disconnected nodes are said to be in a document + // fragment in IE 9 + elem.document && elem.document.nodeType !== 11 ) { + return ret; + } + } catch ( e ) { + nonnativeSelectorCache( expr, true ); + } + } + + return Sizzle( expr, document, null, [ elem ] ).length > 0; +}; + +Sizzle.contains = function( context, elem ) { + + // Set document vars if needed + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( ( context.ownerDocument || context ) != document ) { + setDocument( context ); + } + return contains( context, elem ); +}; + +Sizzle.attr = function( elem, name ) { + + // Set document vars if needed + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( ( elem.ownerDocument || elem ) != document ) { + setDocument( elem ); + } + + var fn = Expr.attrHandle[ name.toLowerCase() ], + + // Don't get fooled by Object.prototype properties (jQuery #13807) + val = fn && hasOwn.call( Expr.attrHandle, name.toLowerCase() ) ? + fn( elem, name, !documentIsHTML ) : + undefined; + + return val !== undefined ? + val : + support.attributes || !documentIsHTML ? + elem.getAttribute( name ) : + ( val = elem.getAttributeNode( name ) ) && val.specified ? 
+ val.value : + null; +}; + +Sizzle.escape = function( sel ) { + return ( sel + "" ).replace( rcssescape, fcssescape ); +}; + +Sizzle.error = function( msg ) { + throw new Error( "Syntax error, unrecognized expression: " + msg ); +}; + +/** + * Document sorting and removing duplicates + * @param {ArrayLike} results + */ +Sizzle.uniqueSort = function( results ) { + var elem, + duplicates = [], + j = 0, + i = 0; + + // Unless we *know* we can detect duplicates, assume their presence + hasDuplicate = !support.detectDuplicates; + sortInput = !support.sortStable && results.slice( 0 ); + results.sort( sortOrder ); + + if ( hasDuplicate ) { + while ( ( elem = results[ i++ ] ) ) { + if ( elem === results[ i ] ) { + j = duplicates.push( i ); + } + } + while ( j-- ) { + results.splice( duplicates[ j ], 1 ); + } + } + + // Clear input after sorting to release objects + // See https://github.com/jquery/sizzle/pull/225 + sortInput = null; + + return results; +}; + +/** + * Utility function for retrieving the text value of an array of DOM nodes + * @param {Array|Element} elem + */ +getText = Sizzle.getText = function( elem ) { + var node, + ret = "", + i = 0, + nodeType = elem.nodeType; + + if ( !nodeType ) { + + // If no nodeType, this is expected to be an array + while ( ( node = elem[ i++ ] ) ) { + + // Do not traverse comment nodes + ret += getText( node ); + } + } else if ( nodeType === 1 || nodeType === 9 || nodeType === 11 ) { + + // Use textContent for elements + // innerText usage removed for consistency of new lines (jQuery #11153) + if ( typeof elem.textContent === "string" ) { + return elem.textContent; + } else { + + // Traverse its children + for ( elem = elem.firstChild; elem; elem = elem.nextSibling ) { + ret += getText( elem ); + } + } + } else if ( nodeType === 3 || nodeType === 4 ) { + return elem.nodeValue; + } + + // Do not include comment or processing instruction nodes + + return ret; +}; + +Expr = Sizzle.selectors = { + + // Can be adjusted by the user + cacheLength: 50, + + createPseudo: markFunction, + + match: matchExpr, + + attrHandle: {}, + + find: {}, + + relative: { + ">": { dir: "parentNode", first: true }, + " ": { dir: "parentNode" }, + "+": { dir: "previousSibling", first: true }, + "~": { dir: "previousSibling" } + }, + + preFilter: { + "ATTR": function( match ) { + match[ 1 ] = match[ 1 ].replace( runescape, funescape ); + + // Move the given value to match[3] whether quoted or unquoted + match[ 3 ] = ( match[ 3 ] || match[ 4 ] || + match[ 5 ] || "" ).replace( runescape, funescape ); + + if ( match[ 2 ] === "~=" ) { + match[ 3 ] = " " + match[ 3 ] + " "; + } + + return match.slice( 0, 4 ); + }, + + "CHILD": function( match ) { + + /* matches from matchExpr["CHILD"] + 1 type (only|nth|...) + 2 what (child|of-type) + 3 argument (even|odd|\d*|\d*n([+-]\d+)?|...) + 4 xn-component of xn+y argument ([+-]?\d*n|) + 5 sign of xn-component + 6 x of xn-component + 7 sign of y-component + 8 y of y-component + */ + match[ 1 ] = match[ 1 ].toLowerCase(); + + if ( match[ 1 ].slice( 0, 3 ) === "nth" ) { + + // nth-* requires argument + if ( !match[ 3 ] ) { + Sizzle.error( match[ 0 ] ); + } + + // numeric x and y parameters for Expr.filter.CHILD + // remember that false/true cast respectively to 0/1 + match[ 4 ] = +( match[ 4 ] ? 
+ match[ 5 ] + ( match[ 6 ] || 1 ) : + 2 * ( match[ 3 ] === "even" || match[ 3 ] === "odd" ) ); + match[ 5 ] = +( ( match[ 7 ] + match[ 8 ] ) || match[ 3 ] === "odd" ); + + // other types prohibit arguments + } else if ( match[ 3 ] ) { + Sizzle.error( match[ 0 ] ); + } + + return match; + }, + + "PSEUDO": function( match ) { + var excess, + unquoted = !match[ 6 ] && match[ 2 ]; + + if ( matchExpr[ "CHILD" ].test( match[ 0 ] ) ) { + return null; + } + + // Accept quoted arguments as-is + if ( match[ 3 ] ) { + match[ 2 ] = match[ 4 ] || match[ 5 ] || ""; + + // Strip excess characters from unquoted arguments + } else if ( unquoted && rpseudo.test( unquoted ) && + + // Get excess from tokenize (recursively) + ( excess = tokenize( unquoted, true ) ) && + + // advance to the next closing parenthesis + ( excess = unquoted.indexOf( ")", unquoted.length - excess ) - unquoted.length ) ) { + + // excess is a negative index + match[ 0 ] = match[ 0 ].slice( 0, excess ); + match[ 2 ] = unquoted.slice( 0, excess ); + } + + // Return only captures needed by the pseudo filter method (type and argument) + return match.slice( 0, 3 ); + } + }, + + filter: { + + "TAG": function( nodeNameSelector ) { + var nodeName = nodeNameSelector.replace( runescape, funescape ).toLowerCase(); + return nodeNameSelector === "*" ? + function() { + return true; + } : + function( elem ) { + return elem.nodeName && elem.nodeName.toLowerCase() === nodeName; + }; + }, + + "CLASS": function( className ) { + var pattern = classCache[ className + " " ]; + + return pattern || + ( pattern = new RegExp( "(^|" + whitespace + + ")" + className + "(" + whitespace + "|$)" ) ) && classCache( + className, function( elem ) { + return pattern.test( + typeof elem.className === "string" && elem.className || + typeof elem.getAttribute !== "undefined" && + elem.getAttribute( "class" ) || + "" + ); + } ); + }, + + "ATTR": function( name, operator, check ) { + return function( elem ) { + var result = Sizzle.attr( elem, name ); + + if ( result == null ) { + return operator === "!="; + } + if ( !operator ) { + return true; + } + + result += ""; + + /* eslint-disable max-len */ + + return operator === "=" ? result === check : + operator === "!=" ? result !== check : + operator === "^=" ? check && result.indexOf( check ) === 0 : + operator === "*=" ? check && result.indexOf( check ) > -1 : + operator === "$=" ? check && result.slice( -check.length ) === check : + operator === "~=" ? ( " " + result.replace( rwhitespace, " " ) + " " ).indexOf( check ) > -1 : + operator === "|=" ? result === check || result.slice( 0, check.length + 1 ) === check + "-" : + false; + /* eslint-enable max-len */ + + }; + }, + + "CHILD": function( type, what, _argument, first, last ) { + var simple = type.slice( 0, 3 ) !== "nth", + forward = type.slice( -4 ) !== "last", + ofType = what === "of-type"; + + return first === 1 && last === 0 ? + + // Shortcut for :nth-*(n) + function( elem ) { + return !!elem.parentNode; + } : + + function( elem, _context, xml ) { + var cache, uniqueCache, outerCache, node, nodeIndex, start, + dir = simple !== forward ? "nextSibling" : "previousSibling", + parent = elem.parentNode, + name = ofType && elem.nodeName.toLowerCase(), + useCache = !xml && !ofType, + diff = false; + + if ( parent ) { + + // :(first|last|only)-(child|of-type) + if ( simple ) { + while ( dir ) { + node = elem; + while ( ( node = node[ dir ] ) ) { + if ( ofType ? 
+ node.nodeName.toLowerCase() === name : + node.nodeType === 1 ) { + + return false; + } + } + + // Reverse direction for :only-* (if we haven't yet done so) + start = dir = type === "only" && !start && "nextSibling"; + } + return true; + } + + start = [ forward ? parent.firstChild : parent.lastChild ]; + + // non-xml :nth-child(...) stores cache data on `parent` + if ( forward && useCache ) { + + // Seek `elem` from a previously-cached index + + // ...in a gzip-friendly way + node = parent; + outerCache = node[ expando ] || ( node[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ node.uniqueID ] || + ( outerCache[ node.uniqueID ] = {} ); + + cache = uniqueCache[ type ] || []; + nodeIndex = cache[ 0 ] === dirruns && cache[ 1 ]; + diff = nodeIndex && cache[ 2 ]; + node = nodeIndex && parent.childNodes[ nodeIndex ]; + + while ( ( node = ++nodeIndex && node && node[ dir ] || + + // Fallback to seeking `elem` from the start + ( diff = nodeIndex = 0 ) || start.pop() ) ) { + + // When found, cache indexes on `parent` and break + if ( node.nodeType === 1 && ++diff && node === elem ) { + uniqueCache[ type ] = [ dirruns, nodeIndex, diff ]; + break; + } + } + + } else { + + // Use previously-cached element index if available + if ( useCache ) { + + // ...in a gzip-friendly way + node = elem; + outerCache = node[ expando ] || ( node[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ node.uniqueID ] || + ( outerCache[ node.uniqueID ] = {} ); + + cache = uniqueCache[ type ] || []; + nodeIndex = cache[ 0 ] === dirruns && cache[ 1 ]; + diff = nodeIndex; + } + + // xml :nth-child(...) + // or :nth-last-child(...) or :nth(-last)?-of-type(...) + if ( diff === false ) { + + // Use the same loop as above to seek `elem` from the start + while ( ( node = ++nodeIndex && node && node[ dir ] || + ( diff = nodeIndex = 0 ) || start.pop() ) ) { + + if ( ( ofType ? + node.nodeName.toLowerCase() === name : + node.nodeType === 1 ) && + ++diff ) { + + // Cache the index of each encountered element + if ( useCache ) { + outerCache = node[ expando ] || + ( node[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ node.uniqueID ] || + ( outerCache[ node.uniqueID ] = {} ); + + uniqueCache[ type ] = [ dirruns, diff ]; + } + + if ( node === elem ) { + break; + } + } + } + } + } + + // Incorporate the offset, then check against cycle size + diff -= last; + return diff === first || ( diff % first === 0 && diff / first >= 0 ); + } + }; + }, + + "PSEUDO": function( pseudo, argument ) { + + // pseudo-class names are case-insensitive + // http://www.w3.org/TR/selectors/#pseudo-classes + // Prioritize by case sensitivity in case custom pseudos are added with uppercase letters + // Remember that setFilters inherits from pseudos + var args, + fn = Expr.pseudos[ pseudo ] || Expr.setFilters[ pseudo.toLowerCase() ] || + Sizzle.error( "unsupported pseudo: " + pseudo ); + + // The user may use createPseudo to indicate that + // arguments are needed to create the filter function + // just as Sizzle does + if ( fn[ expando ] ) { + return fn( argument ); + } + + // But maintain support for old signatures + if ( fn.length > 1 ) { + args = [ pseudo, pseudo, "", argument ]; + return Expr.setFilters.hasOwnProperty( pseudo.toLowerCase() ) ? 
this.show() : this.hide(); + } + + return this.each( function() { + if ( isHiddenWithinTree( this ) ) { + jQuery( this ).show(); + } else { + jQuery( this ).hide(); + } + } ); + } +} ); +var rcheckableType = ( /^(?:checkbox|radio)$/i ); + +var rtagName = ( /<([a-z][^\/\0>\x20\t\r\n\f]*)/i ); + +var rscriptType = ( /^$|^module$|\/(?:java|ecma)script/i ); + + + +( function() { + var fragment = document.createDocumentFragment(), + div = fragment.appendChild( document.createElement( "div" ) ), + input = document.createElement( "input" ); + + // Support: Android 4.0 - 4.3 only + // Check state lost if the name is set (#11217) + // Support: Windows Web Apps (WWA) + // `name` and `type` must use .setAttribute for WWA (#14901) + input.setAttribute( "type", "radio" ); + input.setAttribute( "checked", "checked" ); + input.setAttribute( "name", "t" ); + + div.appendChild( input ); + + // Support: Android <=4.1 only + // Older WebKit doesn't clone checked state correctly in fragments + support.checkClone = div.cloneNode( true ).cloneNode( true ).lastChild.checked; + + // Support: IE <=11 only + // Make sure textarea (and checkbox) defaultValue is properly cloned + div.innerHTML = ""; + support.noCloneChecked = !!div.cloneNode( true ).lastChild.defaultValue; + + // Support: IE <=9 only + // IE <=9 replaces "; + support.option = !!div.lastChild; +} )(); + + +// We have to close these tags to support XHTML (#13200) +var wrapMap = { + + // XHTML parsers do not magically insert elements in the + // same way that tag soup parsers do. So we cannot shorten + // this by omitting or other required elements. + thead: [ 1, "", "
" ], + col: [ 2, "", "
" ], + tr: [ 2, "", "
" ], + td: [ 3, "", "
" ], + + _default: [ 0, "", "" ] +}; + +wrapMap.tbody = wrapMap.tfoot = wrapMap.colgroup = wrapMap.caption = wrapMap.thead; +wrapMap.th = wrapMap.td; + +// Support: IE <=9 only +if ( !support.option ) { + wrapMap.optgroup = wrapMap.option = [ 1, "" ]; +} + + +function getAll( context, tag ) { + + // Support: IE <=9 - 11 only + // Use typeof to avoid zero-argument method invocation on host objects (#15151) + var ret; + + if ( typeof context.getElementsByTagName !== "undefined" ) { + ret = context.getElementsByTagName( tag || "*" ); + + } else if ( typeof context.querySelectorAll !== "undefined" ) { + ret = context.querySelectorAll( tag || "*" ); + + } else { + ret = []; + } + + if ( tag === undefined || tag && nodeName( context, tag ) ) { + return jQuery.merge( [ context ], ret ); + } + + return ret; +} + + +// Mark scripts as having already been evaluated +function setGlobalEval( elems, refElements ) { + var i = 0, + l = elems.length; + + for ( ; i < l; i++ ) { + dataPriv.set( + elems[ i ], + "globalEval", + !refElements || dataPriv.get( refElements[ i ], "globalEval" ) + ); + } +} + + +var rhtml = /<|&#?\w+;/; + +function buildFragment( elems, context, scripts, selection, ignored ) { + var elem, tmp, tag, wrap, attached, j, + fragment = context.createDocumentFragment(), + nodes = [], + i = 0, + l = elems.length; + + for ( ; i < l; i++ ) { + elem = elems[ i ]; + + if ( elem || elem === 0 ) { + + // Add nodes directly + if ( toType( elem ) === "object" ) { + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + jQuery.merge( nodes, elem.nodeType ? [ elem ] : elem ); + + // Convert non-html into a text node + } else if ( !rhtml.test( elem ) ) { + nodes.push( context.createTextNode( elem ) ); + + // Convert html into DOM nodes + } else { + tmp = tmp || fragment.appendChild( context.createElement( "div" ) ); + + // Deserialize a standard representation + tag = ( rtagName.exec( elem ) || [ "", "" ] )[ 1 ].toLowerCase(); + wrap = wrapMap[ tag ] || wrapMap._default; + tmp.innerHTML = wrap[ 1 ] + jQuery.htmlPrefilter( elem ) + wrap[ 2 ]; + + // Descend through wrappers to the right content + j = wrap[ 0 ]; + while ( j-- ) { + tmp = tmp.lastChild; + } + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + jQuery.merge( nodes, tmp.childNodes ); + + // Remember the top-level container + tmp = fragment.firstChild; + + // Ensure the created nodes are orphaned (#12392) + tmp.textContent = ""; + } + } + } + + // Remove wrapper from fragment + fragment.textContent = ""; + + i = 0; + while ( ( elem = nodes[ i++ ] ) ) { + + // Skip elements already in the context collection (trac-4087) + if ( selection && jQuery.inArray( elem, selection ) > -1 ) { + if ( ignored ) { + ignored.push( elem ); + } + continue; + } + + attached = isAttached( elem ); + + // Append to fragment + tmp = getAll( fragment.appendChild( elem ), "script" ); + + // Preserve script evaluation history + if ( attached ) { + setGlobalEval( tmp ); + } + + // Capture executables + if ( scripts ) { + j = 0; + while ( ( elem = tmp[ j++ ] ) ) { + if ( rscriptType.test( elem.type || "" ) ) { + scripts.push( elem ); + } + } + } + } + + return fragment; +} + + +var rtypenamespace = /^([^.]*)(?:\.(.+)|)/; + +function returnTrue() { + return true; +} + +function returnFalse() { + return false; +} + +// Support: IE <=9 - 11+ +// focus() and blur() are asynchronous, except when they are no-op. 
+// So expect focus to be synchronous when the element is already active, +// and blur to be synchronous when the element is not already active. +// (focus and blur are always synchronous in other supported browsers, +// this just defines when we can count on it). +function expectSync( elem, type ) { + return ( elem === safeActiveElement() ) === ( type === "focus" ); +} + +// Support: IE <=9 only +// Accessing document.activeElement can throw unexpectedly +// https://bugs.jquery.com/ticket/13393 +function safeActiveElement() { + try { + return document.activeElement; + } catch ( err ) { } +} + +function on( elem, types, selector, data, fn, one ) { + var origFn, type; + + // Types can be a map of types/handlers + if ( typeof types === "object" ) { + + // ( types-Object, selector, data ) + if ( typeof selector !== "string" ) { + + // ( types-Object, data ) + data = data || selector; + selector = undefined; + } + for ( type in types ) { + on( elem, type, selector, data, types[ type ], one ); + } + return elem; + } + + if ( data == null && fn == null ) { + + // ( types, fn ) + fn = selector; + data = selector = undefined; + } else if ( fn == null ) { + if ( typeof selector === "string" ) { + + // ( types, selector, fn ) + fn = data; + data = undefined; + } else { + + // ( types, data, fn ) + fn = data; + data = selector; + selector = undefined; + } + } + if ( fn === false ) { + fn = returnFalse; + } else if ( !fn ) { + return elem; + } + + if ( one === 1 ) { + origFn = fn; + fn = function( event ) { + + // Can use an empty set, since event contains the info + jQuery().off( event ); + return origFn.apply( this, arguments ); + }; + + // Use same guid so caller can remove using origFn + fn.guid = origFn.guid || ( origFn.guid = jQuery.guid++ ); + } + return elem.each( function() { + jQuery.event.add( this, types, fn, data, selector ); + } ); +} + +/* + * Helper functions for managing events -- not part of the public interface. + * Props to Dean Edwards' addEvent library for many of the ideas. + */ +jQuery.event = { + + global: {}, + + add: function( elem, types, handler, data, selector ) { + + var handleObjIn, eventHandle, tmp, + events, t, handleObj, + special, handlers, type, namespaces, origType, + elemData = dataPriv.get( elem ); + + // Only attach events to objects that accept data + if ( !acceptData( elem ) ) { + return; + } + + // Caller can pass in an object of custom data in lieu of the handler + if ( handler.handler ) { + handleObjIn = handler; + handler = handleObjIn.handler; + selector = handleObjIn.selector; + } + + // Ensure that invalid selectors throw exceptions at attach time + // Evaluate against documentElement in case elem is a non-element node (e.g., document) + if ( selector ) { + jQuery.find.matchesSelector( documentElement, selector ); + } + + // Make sure that the handler has a unique ID, used to find/remove it later + if ( !handler.guid ) { + handler.guid = jQuery.guid++; + } + + // Init the element's event structure and main handler, if this is the first + if ( !( events = elemData.events ) ) { + events = elemData.events = Object.create( null ); + } + if ( !( eventHandle = elemData.handle ) ) { + eventHandle = elemData.handle = function( e ) { + + // Discard the second event of a jQuery.event.trigger() and + // when an event is called after a page has unloaded + return typeof jQuery !== "undefined" && jQuery.event.triggered !== e.type ? 
+ jQuery.event.dispatch.apply( elem, arguments ) : undefined; + }; + } + + // Handle multiple events separated by a space + types = ( types || "" ).match( rnothtmlwhite ) || [ "" ]; + t = types.length; + while ( t-- ) { + tmp = rtypenamespace.exec( types[ t ] ) || []; + type = origType = tmp[ 1 ]; + namespaces = ( tmp[ 2 ] || "" ).split( "." ).sort(); + + // There *must* be a type, no attaching namespace-only handlers + if ( !type ) { + continue; + } + + // If event changes its type, use the special event handlers for the changed type + special = jQuery.event.special[ type ] || {}; + + // If selector defined, determine special event api type, otherwise given type + type = ( selector ? special.delegateType : special.bindType ) || type; + + // Update special based on newly reset type + special = jQuery.event.special[ type ] || {}; + + // handleObj is passed to all event handlers + handleObj = jQuery.extend( { + type: type, + origType: origType, + data: data, + handler: handler, + guid: handler.guid, + selector: selector, + needsContext: selector && jQuery.expr.match.needsContext.test( selector ), + namespace: namespaces.join( "." ) + }, handleObjIn ); + + // Init the event handler queue if we're the first + if ( !( handlers = events[ type ] ) ) { + handlers = events[ type ] = []; + handlers.delegateCount = 0; + + // Only use addEventListener if the special events handler returns false + if ( !special.setup || + special.setup.call( elem, data, namespaces, eventHandle ) === false ) { + + if ( elem.addEventListener ) { + elem.addEventListener( type, eventHandle ); + } + } + } + + if ( special.add ) { + special.add.call( elem, handleObj ); + + if ( !handleObj.handler.guid ) { + handleObj.handler.guid = handler.guid; + } + } + + // Add to the element's handler list, delegates in front + if ( selector ) { + handlers.splice( handlers.delegateCount++, 0, handleObj ); + } else { + handlers.push( handleObj ); + } + + // Keep track of which events have ever been used, for event optimization + jQuery.event.global[ type ] = true; + } + + }, + + // Detach an event or set of events from an element + remove: function( elem, types, handler, selector, mappedTypes ) { + + var j, origCount, tmp, + events, t, handleObj, + special, handlers, type, namespaces, origType, + elemData = dataPriv.hasData( elem ) && dataPriv.get( elem ); + + if ( !elemData || !( events = elemData.events ) ) { + return; + } + + // Once for each type.namespace in types; type may be omitted + types = ( types || "" ).match( rnothtmlwhite ) || [ "" ]; + t = types.length; + while ( t-- ) { + tmp = rtypenamespace.exec( types[ t ] ) || []; + type = origType = tmp[ 1 ]; + namespaces = ( tmp[ 2 ] || "" ).split( "." ).sort(); + + // Unbind all events (on this namespace, if provided) for the element + if ( !type ) { + for ( type in events ) { + jQuery.event.remove( elem, type + types[ t ], handler, selector, true ); + } + continue; + } + + special = jQuery.event.special[ type ] || {}; + type = ( selector ? 
special.delegateType : special.bindType ) || type; + handlers = events[ type ] || []; + tmp = tmp[ 2 ] && + new RegExp( "(^|\\.)" + namespaces.join( "\\.(?:.*\\.|)" ) + "(\\.|$)" ); + + // Remove matching events + origCount = j = handlers.length; + while ( j-- ) { + handleObj = handlers[ j ]; + + if ( ( mappedTypes || origType === handleObj.origType ) && + ( !handler || handler.guid === handleObj.guid ) && + ( !tmp || tmp.test( handleObj.namespace ) ) && + ( !selector || selector === handleObj.selector || + selector === "**" && handleObj.selector ) ) { + handlers.splice( j, 1 ); + + if ( handleObj.selector ) { + handlers.delegateCount--; + } + if ( special.remove ) { + special.remove.call( elem, handleObj ); + } + } + } + + // Remove generic event handler if we removed something and no more handlers exist + // (avoids potential for endless recursion during removal of special event handlers) + if ( origCount && !handlers.length ) { + if ( !special.teardown || + special.teardown.call( elem, namespaces, elemData.handle ) === false ) { + + jQuery.removeEvent( elem, type, elemData.handle ); + } + + delete events[ type ]; + } + } + + // Remove data and the expando if it's no longer used + if ( jQuery.isEmptyObject( events ) ) { + dataPriv.remove( elem, "handle events" ); + } + }, + + dispatch: function( nativeEvent ) { + + var i, j, ret, matched, handleObj, handlerQueue, + args = new Array( arguments.length ), + + // Make a writable jQuery.Event from the native event object + event = jQuery.event.fix( nativeEvent ), + + handlers = ( + dataPriv.get( this, "events" ) || Object.create( null ) + )[ event.type ] || [], + special = jQuery.event.special[ event.type ] || {}; + + // Use the fix-ed jQuery.Event rather than the (read-only) native event + args[ 0 ] = event; + + for ( i = 1; i < arguments.length; i++ ) { + args[ i ] = arguments[ i ]; + } + + event.delegateTarget = this; + + // Call the preDispatch hook for the mapped type, and let it bail if desired + if ( special.preDispatch && special.preDispatch.call( this, event ) === false ) { + return; + } + + // Determine handlers + handlerQueue = jQuery.event.handlers.call( this, event, handlers ); + + // Run delegates first; they may want to stop propagation beneath us + i = 0; + while ( ( matched = handlerQueue[ i++ ] ) && !event.isPropagationStopped() ) { + event.currentTarget = matched.elem; + + j = 0; + while ( ( handleObj = matched.handlers[ j++ ] ) && + !event.isImmediatePropagationStopped() ) { + + // If the event is namespaced, then each handler is only invoked if it is + // specially universal or its namespaces are a superset of the event's. 
+ if ( !event.rnamespace || handleObj.namespace === false || + event.rnamespace.test( handleObj.namespace ) ) { + + event.handleObj = handleObj; + event.data = handleObj.data; + + ret = ( ( jQuery.event.special[ handleObj.origType ] || {} ).handle || + handleObj.handler ).apply( matched.elem, args ); + + if ( ret !== undefined ) { + if ( ( event.result = ret ) === false ) { + event.preventDefault(); + event.stopPropagation(); + } + } + } + } + } + + // Call the postDispatch hook for the mapped type + if ( special.postDispatch ) { + special.postDispatch.call( this, event ); + } + + return event.result; + }, + + handlers: function( event, handlers ) { + var i, handleObj, sel, matchedHandlers, matchedSelectors, + handlerQueue = [], + delegateCount = handlers.delegateCount, + cur = event.target; + + // Find delegate handlers + if ( delegateCount && + + // Support: IE <=9 + // Black-hole SVG instance trees (trac-13180) + cur.nodeType && + + // Support: Firefox <=42 + // Suppress spec-violating clicks indicating a non-primary pointer button (trac-3861) + // https://www.w3.org/TR/DOM-Level-3-Events/#event-type-click + // Support: IE 11 only + // ...but not arrow key "clicks" of radio inputs, which can have `button` -1 (gh-2343) + !( event.type === "click" && event.button >= 1 ) ) { + + for ( ; cur !== this; cur = cur.parentNode || this ) { + + // Don't check non-elements (#13208) + // Don't process clicks on disabled elements (#6911, #8165, #11382, #11764) + if ( cur.nodeType === 1 && !( event.type === "click" && cur.disabled === true ) ) { + matchedHandlers = []; + matchedSelectors = {}; + for ( i = 0; i < delegateCount; i++ ) { + handleObj = handlers[ i ]; + + // Don't conflict with Object.prototype properties (#13203) + sel = handleObj.selector + " "; + + if ( matchedSelectors[ sel ] === undefined ) { + matchedSelectors[ sel ] = handleObj.needsContext ? + jQuery( sel, this ).index( cur ) > -1 : + jQuery.find( sel, this, null, [ cur ] ).length; + } + if ( matchedSelectors[ sel ] ) { + matchedHandlers.push( handleObj ); + } + } + if ( matchedHandlers.length ) { + handlerQueue.push( { elem: cur, handlers: matchedHandlers } ); + } + } + } + } + + // Add the remaining (directly-bound) handlers + cur = this; + if ( delegateCount < handlers.length ) { + handlerQueue.push( { elem: cur, handlers: handlers.slice( delegateCount ) } ); + } + + return handlerQueue; + }, + + addProp: function( name, hook ) { + Object.defineProperty( jQuery.Event.prototype, name, { + enumerable: true, + configurable: true, + + get: isFunction( hook ) ? + function() { + if ( this.originalEvent ) { + return hook( this.originalEvent ); + } + } : + function() { + if ( this.originalEvent ) { + return this.originalEvent[ name ]; + } + }, + + set: function( value ) { + Object.defineProperty( this, name, { + enumerable: true, + configurable: true, + writable: true, + value: value + } ); + } + } ); + }, + + fix: function( originalEvent ) { + return originalEvent[ jQuery.expando ] ? + originalEvent : + new jQuery.Event( originalEvent ); + }, + + special: { + load: { + + // Prevent triggered image.load events from bubbling to window.load + noBubble: true + }, + click: { + + // Utilize native event to ensure correct state for checkable inputs + setup: function( data ) { + + // For mutual compressibility with _default, replace `this` access with a local var. + // `|| data` is dead code meant only to preserve the variable through minification. 
+ var el = this || data; + + // Claim the first handler + if ( rcheckableType.test( el.type ) && + el.click && nodeName( el, "input" ) ) { + + // dataPriv.set( el, "click", ... ) + leverageNative( el, "click", returnTrue ); + } + + // Return false to allow normal processing in the caller + return false; + }, + trigger: function( data ) { + + // For mutual compressibility with _default, replace `this` access with a local var. + // `|| data` is dead code meant only to preserve the variable through minification. + var el = this || data; + + // Force setup before triggering a click + if ( rcheckableType.test( el.type ) && + el.click && nodeName( el, "input" ) ) { + + leverageNative( el, "click" ); + } + + // Return non-false to allow normal event-path propagation + return true; + }, + + // For cross-browser consistency, suppress native .click() on links + // Also prevent it if we're currently inside a leveraged native-event stack + _default: function( event ) { + var target = event.target; + return rcheckableType.test( target.type ) && + target.click && nodeName( target, "input" ) && + dataPriv.get( target, "click" ) || + nodeName( target, "a" ); + } + }, + + beforeunload: { + postDispatch: function( event ) { + + // Support: Firefox 20+ + // Firefox doesn't alert if the returnValue field is not set. + if ( event.result !== undefined && event.originalEvent ) { + event.originalEvent.returnValue = event.result; + } + } + } + } +}; + +// Ensure the presence of an event listener that handles manually-triggered +// synthetic events by interrupting progress until reinvoked in response to +// *native* events that it fires directly, ensuring that state changes have +// already occurred before other listeners are invoked. +function leverageNative( el, type, expectSync ) { + + // Missing expectSync indicates a trigger call, which must force setup through jQuery.event.add + if ( !expectSync ) { + if ( dataPriv.get( el, type ) === undefined ) { + jQuery.event.add( el, type, returnTrue ); + } + return; + } + + // Register the controller as a special universal handler for all event namespaces + dataPriv.set( el, type, false ); + jQuery.event.add( el, type, { + namespace: false, + handler: function( event ) { + var notAsync, result, + saved = dataPriv.get( this, type ); + + if ( ( event.isTrigger & 1 ) && this[ type ] ) { + + // Interrupt processing of the outer synthetic .trigger()ed event + // Saved data should be false in such cases, but might be a leftover capture object + // from an async native handler (gh-4350) + if ( !saved.length ) { + + // Store arguments for use when handling the inner native event + // There will always be at least one argument (an event object), so this array + // will not be confused with a leftover capture object. + saved = slice.call( arguments ); + dataPriv.set( this, type, saved ); + + // Trigger the native event and capture its result + // Support: IE <=9 - 11+ + // focus() and blur() are asynchronous + notAsync = expectSync( this, type ); + this[ type ](); + result = dataPriv.get( this, type ); + if ( saved !== result || notAsync ) { + dataPriv.set( this, type, false ); + } else { + result = {}; + } + if ( saved !== result ) { + + // Cancel the outer synthetic event + event.stopImmediatePropagation(); + event.preventDefault(); + + // Support: Chrome 86+ + // In Chrome, if an element having a focusout handler is blurred by + // clicking outside of it, it invokes the handler synchronously. 
If + // that handler calls `.remove()` on the element, the data is cleared, + // leaving `result` undefined. We need to guard against this. + return result && result.value; + } + + // If this is an inner synthetic event for an event with a bubbling surrogate + // (focus or blur), assume that the surrogate already propagated from triggering the + // native event and prevent that from happening again here. + // This technically gets the ordering wrong w.r.t. to `.trigger()` (in which the + // bubbling surrogate propagates *after* the non-bubbling base), but that seems + // less bad than duplication. + } else if ( ( jQuery.event.special[ type ] || {} ).delegateType ) { + event.stopPropagation(); + } + + // If this is a native event triggered above, everything is now in order + // Fire an inner synthetic event with the original arguments + } else if ( saved.length ) { + + // ...and capture the result + dataPriv.set( this, type, { + value: jQuery.event.trigger( + + // Support: IE <=9 - 11+ + // Extend with the prototype to reset the above stopImmediatePropagation() + jQuery.extend( saved[ 0 ], jQuery.Event.prototype ), + saved.slice( 1 ), + this + ) + } ); + + // Abort handling of the native event + event.stopImmediatePropagation(); + } + } + } ); +} + +jQuery.removeEvent = function( elem, type, handle ) { + + // This "if" is needed for plain objects + if ( elem.removeEventListener ) { + elem.removeEventListener( type, handle ); + } +}; + +jQuery.Event = function( src, props ) { + + // Allow instantiation without the 'new' keyword + if ( !( this instanceof jQuery.Event ) ) { + return new jQuery.Event( src, props ); + } + + // Event object + if ( src && src.type ) { + this.originalEvent = src; + this.type = src.type; + + // Events bubbling up the document may have been marked as prevented + // by a handler lower down the tree; reflect the correct value. + this.isDefaultPrevented = src.defaultPrevented || + src.defaultPrevented === undefined && + + // Support: Android <=2.3 only + src.returnValue === false ? + returnTrue : + returnFalse; + + // Create target properties + // Support: Safari <=6 - 7 only + // Target should not be a text node (#504, #13143) + this.target = ( src.target && src.target.nodeType === 3 ) ? 
+ src.target.parentNode : + src.target; + + this.currentTarget = src.currentTarget; + this.relatedTarget = src.relatedTarget; + + // Event type + } else { + this.type = src; + } + + // Put explicitly provided properties onto the event object + if ( props ) { + jQuery.extend( this, props ); + } + + // Create a timestamp if incoming event doesn't have one + this.timeStamp = src && src.timeStamp || Date.now(); + + // Mark it as fixed + this[ jQuery.expando ] = true; +}; + +// jQuery.Event is based on DOM3 Events as specified by the ECMAScript Language Binding +// https://www.w3.org/TR/2003/WD-DOM-Level-3-Events-20030331/ecma-script-binding.html +jQuery.Event.prototype = { + constructor: jQuery.Event, + isDefaultPrevented: returnFalse, + isPropagationStopped: returnFalse, + isImmediatePropagationStopped: returnFalse, + isSimulated: false, + + preventDefault: function() { + var e = this.originalEvent; + + this.isDefaultPrevented = returnTrue; + + if ( e && !this.isSimulated ) { + e.preventDefault(); + } + }, + stopPropagation: function() { + var e = this.originalEvent; + + this.isPropagationStopped = returnTrue; + + if ( e && !this.isSimulated ) { + e.stopPropagation(); + } + }, + stopImmediatePropagation: function() { + var e = this.originalEvent; + + this.isImmediatePropagationStopped = returnTrue; + + if ( e && !this.isSimulated ) { + e.stopImmediatePropagation(); + } + + this.stopPropagation(); + } +}; + +// Includes all common event props including KeyEvent and MouseEvent specific props +jQuery.each( { + altKey: true, + bubbles: true, + cancelable: true, + changedTouches: true, + ctrlKey: true, + detail: true, + eventPhase: true, + metaKey: true, + pageX: true, + pageY: true, + shiftKey: true, + view: true, + "char": true, + code: true, + charCode: true, + key: true, + keyCode: true, + button: true, + buttons: true, + clientX: true, + clientY: true, + offsetX: true, + offsetY: true, + pointerId: true, + pointerType: true, + screenX: true, + screenY: true, + targetTouches: true, + toElement: true, + touches: true, + which: true +}, jQuery.event.addProp ); + +jQuery.each( { focus: "focusin", blur: "focusout" }, function( type, delegateType ) { + jQuery.event.special[ type ] = { + + // Utilize native event if possible so blur/focus sequence is correct + setup: function() { + + // Claim the first handler + // dataPriv.set( this, "focus", ... ) + // dataPriv.set( this, "blur", ... ) + leverageNative( this, type, expectSync ); + + // Return false to allow normal processing in the caller + return false; + }, + trigger: function() { + + // Force setup before trigger + leverageNative( this, type ); + + // Return non-false to allow normal event-path propagation + return true; + }, + + // Suppress native focus or blur as it's already being fired + // in leverageNative. + _default: function() { + return true; + }, + + delegateType: delegateType + }; +} ); + +// Create mouseenter/leave events using mouseover/out and event-time checks +// so that event delegation works in jQuery. +// Do the same for pointerenter/pointerleave and pointerover/pointerout +// +// Support: Safari 7 only +// Safari sends mouseenter too often; see: +// https://bugs.chromium.org/p/chromium/issues/detail?id=470258 +// for the description of the bug (it existed in older Chrome versions as well). 
+jQuery.each( { + mouseenter: "mouseover", + mouseleave: "mouseout", + pointerenter: "pointerover", + pointerleave: "pointerout" +}, function( orig, fix ) { + jQuery.event.special[ orig ] = { + delegateType: fix, + bindType: fix, + + handle: function( event ) { + var ret, + target = this, + related = event.relatedTarget, + handleObj = event.handleObj; + + // For mouseenter/leave call the handler if related is outside the target. + // NB: No relatedTarget if the mouse left/entered the browser window + if ( !related || ( related !== target && !jQuery.contains( target, related ) ) ) { + event.type = handleObj.origType; + ret = handleObj.handler.apply( this, arguments ); + event.type = fix; + } + return ret; + } + }; +} ); + +jQuery.fn.extend( { + + on: function( types, selector, data, fn ) { + return on( this, types, selector, data, fn ); + }, + one: function( types, selector, data, fn ) { + return on( this, types, selector, data, fn, 1 ); + }, + off: function( types, selector, fn ) { + var handleObj, type; + if ( types && types.preventDefault && types.handleObj ) { + + // ( event ) dispatched jQuery.Event + handleObj = types.handleObj; + jQuery( types.delegateTarget ).off( + handleObj.namespace ? + handleObj.origType + "." + handleObj.namespace : + handleObj.origType, + handleObj.selector, + handleObj.handler + ); + return this; + } + if ( typeof types === "object" ) { + + // ( types-object [, selector] ) + for ( type in types ) { + this.off( type, selector, types[ type ] ); + } + return this; + } + if ( selector === false || typeof selector === "function" ) { + + // ( types [, fn] ) + fn = selector; + selector = undefined; + } + if ( fn === false ) { + fn = returnFalse; + } + return this.each( function() { + jQuery.event.remove( this, types, fn, selector ); + } ); + } +} ); + + +var + + // Support: IE <=10 - 11, Edge 12 - 13 only + // In IE/Edge using regex groups here causes severe slowdowns. + // See https://connect.microsoft.com/IE/feedback/details/1736512/ + rnoInnerhtml = /\s*$/g; + +// Prefer a tbody over its parent table for containing new rows +function manipulationTarget( elem, content ) { + if ( nodeName( elem, "table" ) && + nodeName( content.nodeType !== 11 ? content : content.firstChild, "tr" ) ) { + + return jQuery( elem ).children( "tbody" )[ 0 ] || elem; + } + + return elem; +} + +// Replace/restore the type attribute of script elements for safe DOM manipulation +function disableScript( elem ) { + elem.type = ( elem.getAttribute( "type" ) !== null ) + "/" + elem.type; + return elem; +} +function restoreScript( elem ) { + if ( ( elem.type || "" ).slice( 0, 5 ) === "true/" ) { + elem.type = elem.type.slice( 5 ); + } else { + elem.removeAttribute( "type" ); + } + + return elem; +} + +function cloneCopyEvent( src, dest ) { + var i, l, type, pdataOld, udataOld, udataCur, events; + + if ( dest.nodeType !== 1 ) { + return; + } + + // 1. Copy private data: events, handlers, etc. + if ( dataPriv.hasData( src ) ) { + pdataOld = dataPriv.get( src ); + events = pdataOld.events; + + if ( events ) { + dataPriv.remove( dest, "handle events" ); + + for ( type in events ) { + for ( i = 0, l = events[ type ].length; i < l; i++ ) { + jQuery.event.add( dest, type, events[ type ][ i ] ); + } + } + } + } + + // 2. 
Copy user data + if ( dataUser.hasData( src ) ) { + udataOld = dataUser.access( src ); + udataCur = jQuery.extend( {}, udataOld ); + + dataUser.set( dest, udataCur ); + } +} + +// Fix IE bugs, see support tests +function fixInput( src, dest ) { + var nodeName = dest.nodeName.toLowerCase(); + + // Fails to persist the checked state of a cloned checkbox or radio button. + if ( nodeName === "input" && rcheckableType.test( src.type ) ) { + dest.checked = src.checked; + + // Fails to return the selected option to the default selected state when cloning options + } else if ( nodeName === "input" || nodeName === "textarea" ) { + dest.defaultValue = src.defaultValue; + } +} + +function domManip( collection, args, callback, ignored ) { + + // Flatten any nested arrays + args = flat( args ); + + var fragment, first, scripts, hasScripts, node, doc, + i = 0, + l = collection.length, + iNoClone = l - 1, + value = args[ 0 ], + valueIsFunction = isFunction( value ); + + // We can't cloneNode fragments that contain checked, in WebKit + if ( valueIsFunction || + ( l > 1 && typeof value === "string" && + !support.checkClone && rchecked.test( value ) ) ) { + return collection.each( function( index ) { + var self = collection.eq( index ); + if ( valueIsFunction ) { + args[ 0 ] = value.call( this, index, self.html() ); + } + domManip( self, args, callback, ignored ); + } ); + } + + if ( l ) { + fragment = buildFragment( args, collection[ 0 ].ownerDocument, false, collection, ignored ); + first = fragment.firstChild; + + if ( fragment.childNodes.length === 1 ) { + fragment = first; + } + + // Require either new content or an interest in ignored elements to invoke the callback + if ( first || ignored ) { + scripts = jQuery.map( getAll( fragment, "script" ), disableScript ); + hasScripts = scripts.length; + + // Use the original fragment for the last item + // instead of the first because it can end up + // being emptied incorrectly in certain situations (#8070). + for ( ; i < l; i++ ) { + node = fragment; + + if ( i !== iNoClone ) { + node = jQuery.clone( node, true, true ); + + // Keep references to cloned scripts for later restoration + if ( hasScripts ) { + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + jQuery.merge( scripts, getAll( node, "script" ) ); + } + } + + callback.call( collection[ i ], node, i ); + } + + if ( hasScripts ) { + doc = scripts[ scripts.length - 1 ].ownerDocument; + + // Reenable scripts + jQuery.map( scripts, restoreScript ); + + // Evaluate executable scripts on first document insertion + for ( i = 0; i < hasScripts; i++ ) { + node = scripts[ i ]; + if ( rscriptType.test( node.type || "" ) && + !dataPriv.access( node, "globalEval" ) && + jQuery.contains( doc, node ) ) { + + if ( node.src && ( node.type || "" ).toLowerCase() !== "module" ) { + + // Optional AJAX dependency, but won't run scripts if not present + if ( jQuery._evalUrl && !node.noModule ) { + jQuery._evalUrl( node.src, { + nonce: node.nonce || node.getAttribute( "nonce" ) + }, doc ); + } + } else { + DOMEval( node.textContent.replace( rcleanScript, "" ), node, doc ); + } + } + } + } + } + } + + return collection; +} + +function remove( elem, selector, keepData ) { + var node, + nodes = selector ? 
jQuery.filter( selector, elem ) : elem, + i = 0; + + for ( ; ( node = nodes[ i ] ) != null; i++ ) { + if ( !keepData && node.nodeType === 1 ) { + jQuery.cleanData( getAll( node ) ); + } + + if ( node.parentNode ) { + if ( keepData && isAttached( node ) ) { + setGlobalEval( getAll( node, "script" ) ); + } + node.parentNode.removeChild( node ); + } + } + + return elem; +} + +jQuery.extend( { + htmlPrefilter: function( html ) { + return html; + }, + + clone: function( elem, dataAndEvents, deepDataAndEvents ) { + var i, l, srcElements, destElements, + clone = elem.cloneNode( true ), + inPage = isAttached( elem ); + + // Fix IE cloning issues + if ( !support.noCloneChecked && ( elem.nodeType === 1 || elem.nodeType === 11 ) && + !jQuery.isXMLDoc( elem ) ) { + + // We eschew Sizzle here for performance reasons: https://jsperf.com/getall-vs-sizzle/2 + destElements = getAll( clone ); + srcElements = getAll( elem ); + + for ( i = 0, l = srcElements.length; i < l; i++ ) { + fixInput( srcElements[ i ], destElements[ i ] ); + } + } + + // Copy the events from the original to the clone + if ( dataAndEvents ) { + if ( deepDataAndEvents ) { + srcElements = srcElements || getAll( elem ); + destElements = destElements || getAll( clone ); + + for ( i = 0, l = srcElements.length; i < l; i++ ) { + cloneCopyEvent( srcElements[ i ], destElements[ i ] ); + } + } else { + cloneCopyEvent( elem, clone ); + } + } + + // Preserve script evaluation history + destElements = getAll( clone, "script" ); + if ( destElements.length > 0 ) { + setGlobalEval( destElements, !inPage && getAll( elem, "script" ) ); + } + + // Return the cloned set + return clone; + }, + + cleanData: function( elems ) { + var data, elem, type, + special = jQuery.event.special, + i = 0; + + for ( ; ( elem = elems[ i ] ) !== undefined; i++ ) { + if ( acceptData( elem ) ) { + if ( ( data = elem[ dataPriv.expando ] ) ) { + if ( data.events ) { + for ( type in data.events ) { + if ( special[ type ] ) { + jQuery.event.remove( elem, type ); + + // This is a shortcut to avoid jQuery.event.remove's overhead + } else { + jQuery.removeEvent( elem, type, data.handle ); + } + } + } + + // Support: Chrome <=35 - 45+ + // Assign undefined instead of using delete, see Data#remove + elem[ dataPriv.expando ] = undefined; + } + if ( elem[ dataUser.expando ] ) { + + // Support: Chrome <=35 - 45+ + // Assign undefined instead of using delete, see Data#remove + elem[ dataUser.expando ] = undefined; + } + } + } + } +} ); + +jQuery.fn.extend( { + detach: function( selector ) { + return remove( this, selector, true ); + }, + + remove: function( selector ) { + return remove( this, selector ); + }, + + text: function( value ) { + return access( this, function( value ) { + return value === undefined ? 
+ jQuery.text( this ) : + this.empty().each( function() { + if ( this.nodeType === 1 || this.nodeType === 11 || this.nodeType === 9 ) { + this.textContent = value; + } + } ); + }, null, value, arguments.length ); + }, + + append: function() { + return domManip( this, arguments, function( elem ) { + if ( this.nodeType === 1 || this.nodeType === 11 || this.nodeType === 9 ) { + var target = manipulationTarget( this, elem ); + target.appendChild( elem ); + } + } ); + }, + + prepend: function() { + return domManip( this, arguments, function( elem ) { + if ( this.nodeType === 1 || this.nodeType === 11 || this.nodeType === 9 ) { + var target = manipulationTarget( this, elem ); + target.insertBefore( elem, target.firstChild ); + } + } ); + }, + + before: function() { + return domManip( this, arguments, function( elem ) { + if ( this.parentNode ) { + this.parentNode.insertBefore( elem, this ); + } + } ); + }, + + after: function() { + return domManip( this, arguments, function( elem ) { + if ( this.parentNode ) { + this.parentNode.insertBefore( elem, this.nextSibling ); + } + } ); + }, + + empty: function() { + var elem, + i = 0; + + for ( ; ( elem = this[ i ] ) != null; i++ ) { + if ( elem.nodeType === 1 ) { + + // Prevent memory leaks + jQuery.cleanData( getAll( elem, false ) ); + + // Remove any remaining nodes + elem.textContent = ""; + } + } + + return this; + }, + + clone: function( dataAndEvents, deepDataAndEvents ) { + dataAndEvents = dataAndEvents == null ? false : dataAndEvents; + deepDataAndEvents = deepDataAndEvents == null ? dataAndEvents : deepDataAndEvents; + + return this.map( function() { + return jQuery.clone( this, dataAndEvents, deepDataAndEvents ); + } ); + }, + + html: function( value ) { + return access( this, function( value ) { + var elem = this[ 0 ] || {}, + i = 0, + l = this.length; + + if ( value === undefined && elem.nodeType === 1 ) { + return elem.innerHTML; + } + + // See if we can take a shortcut and just use innerHTML + if ( typeof value === "string" && !rnoInnerhtml.test( value ) && + !wrapMap[ ( rtagName.exec( value ) || [ "", "" ] )[ 1 ].toLowerCase() ] ) { + + value = jQuery.htmlPrefilter( value ); + + try { + for ( ; i < l; i++ ) { + elem = this[ i ] || {}; + + // Remove element nodes and prevent memory leaks + if ( elem.nodeType === 1 ) { + jQuery.cleanData( getAll( elem, false ) ); + elem.innerHTML = value; + } + } + + elem = 0; + + // If using innerHTML throws an exception, use the fallback method + } catch ( e ) {} + } + + if ( elem ) { + this.empty().append( value ); + } + }, null, value, arguments.length ); + }, + + replaceWith: function() { + var ignored = []; + + // Make the changes, replacing each non-ignored context element with the new content + return domManip( this, arguments, function( elem ) { + var parent = this.parentNode; + + if ( jQuery.inArray( this, ignored ) < 0 ) { + jQuery.cleanData( getAll( this ) ); + if ( parent ) { + parent.replaceChild( elem, this ); + } + } + + // Force callback invocation + }, ignored ); + } +} ); + +jQuery.each( { + appendTo: "append", + prependTo: "prepend", + insertBefore: "before", + insertAfter: "after", + replaceAll: "replaceWith" +}, function( name, original ) { + jQuery.fn[ name ] = function( selector ) { + var elems, + ret = [], + insert = jQuery( selector ), + last = insert.length - 1, + i = 0; + + for ( ; i <= last; i++ ) { + elems = i === last ? 
this : this.clone( true ); + jQuery( insert[ i ] )[ original ]( elems ); + + // Support: Android <=4.0 only, PhantomJS 1 only + // .get() because push.apply(_, arraylike) throws on ancient WebKit + push.apply( ret, elems.get() ); + } + + return this.pushStack( ret ); + }; +} ); +var rnumnonpx = new RegExp( "^(" + pnum + ")(?!px)[a-z%]+$", "i" ); + +var getStyles = function( elem ) { + + // Support: IE <=11 only, Firefox <=30 (#15098, #14150) + // IE throws on elements created in popups + // FF meanwhile throws on frame elements through "defaultView.getComputedStyle" + var view = elem.ownerDocument.defaultView; + + if ( !view || !view.opener ) { + view = window; + } + + return view.getComputedStyle( elem ); + }; + +var swap = function( elem, options, callback ) { + var ret, name, + old = {}; + + // Remember the old values, and insert the new ones + for ( name in options ) { + old[ name ] = elem.style[ name ]; + elem.style[ name ] = options[ name ]; + } + + ret = callback.call( elem ); + + // Revert the old values + for ( name in options ) { + elem.style[ name ] = old[ name ]; + } + + return ret; +}; + + +var rboxStyle = new RegExp( cssExpand.join( "|" ), "i" ); + + + +( function() { + + // Executing both pixelPosition & boxSizingReliable tests require only one layout + // so they're executed at the same time to save the second computation. + function computeStyleTests() { + + // This is a singleton, we need to execute it only once + if ( !div ) { + return; + } + + container.style.cssText = "position:absolute;left:-11111px;width:60px;" + + "margin-top:1px;padding:0;border:0"; + div.style.cssText = + "position:relative;display:block;box-sizing:border-box;overflow:scroll;" + + "margin:auto;border:1px;padding:1px;" + + "width:60%;top:1%"; + documentElement.appendChild( container ).appendChild( div ); + + var divStyle = window.getComputedStyle( div ); + pixelPositionVal = divStyle.top !== "1%"; + + // Support: Android 4.0 - 4.3 only, Firefox <=3 - 44 + reliableMarginLeftVal = roundPixelMeasures( divStyle.marginLeft ) === 12; + + // Support: Android 4.0 - 4.3 only, Safari <=9.1 - 10.1, iOS <=7.0 - 9.3 + // Some styles come back with percentage values, even though they shouldn't + div.style.right = "60%"; + pixelBoxStylesVal = roundPixelMeasures( divStyle.right ) === 36; + + // Support: IE 9 - 11 only + // Detect misreporting of content dimensions for box-sizing:border-box elements + boxSizingReliableVal = roundPixelMeasures( divStyle.width ) === 36; + + // Support: IE 9 only + // Detect overflow:scroll screwiness (gh-3699) + // Support: Chrome <=64 + // Don't get tricked when zoom affects offsetWidth (gh-4029) + div.style.position = "absolute"; + scrollboxSizeVal = roundPixelMeasures( div.offsetWidth / 3 ) === 12; + + documentElement.removeChild( container ); + + // Nullify the div so it wouldn't be stored in the memory and + // it will also be a sign that checks already performed + div = null; + } + + function roundPixelMeasures( measure ) { + return Math.round( parseFloat( measure ) ); + } + + var pixelPositionVal, boxSizingReliableVal, scrollboxSizeVal, pixelBoxStylesVal, + reliableTrDimensionsVal, reliableMarginLeftVal, + container = document.createElement( "div" ), + div = document.createElement( "div" ); + + // Finish early in limited (non-browser) environments + if ( !div.style ) { + return; + } + + // Support: IE <=9 - 11 only + // Style of cloned element affects source element cloned (#8908) + div.style.backgroundClip = "content-box"; + div.cloneNode( true ).style.backgroundClip = ""; + 
support.clearCloneStyle = div.style.backgroundClip === "content-box"; + + jQuery.extend( support, { + boxSizingReliable: function() { + computeStyleTests(); + return boxSizingReliableVal; + }, + pixelBoxStyles: function() { + computeStyleTests(); + return pixelBoxStylesVal; + }, + pixelPosition: function() { + computeStyleTests(); + return pixelPositionVal; + }, + reliableMarginLeft: function() { + computeStyleTests(); + return reliableMarginLeftVal; + }, + scrollboxSize: function() { + computeStyleTests(); + return scrollboxSizeVal; + }, + + // Support: IE 9 - 11+, Edge 15 - 18+ + // IE/Edge misreport `getComputedStyle` of table rows with width/height + // set in CSS while `offset*` properties report correct values. + // Behavior in IE 9 is more subtle than in newer versions & it passes + // some versions of this test; make sure not to make it pass there! + // + // Support: Firefox 70+ + // Only Firefox includes border widths + // in computed dimensions. (gh-4529) + reliableTrDimensions: function() { + var table, tr, trChild, trStyle; + if ( reliableTrDimensionsVal == null ) { + table = document.createElement( "table" ); + tr = document.createElement( "tr" ); + trChild = document.createElement( "div" ); + + table.style.cssText = "position:absolute;left:-11111px;border-collapse:separate"; + tr.style.cssText = "border:1px solid"; + + // Support: Chrome 86+ + // Height set through cssText does not get applied. + // Computed height then comes back as 0. + tr.style.height = "1px"; + trChild.style.height = "9px"; + + // Support: Android 8 Chrome 86+ + // In our bodyBackground.html iframe, + // display for all div elements is set to "inline", + // which causes a problem only in Android 8 Chrome 86. + // Ensuring the div is display: block + // gets around this issue. + trChild.style.display = "block"; + + documentElement + .appendChild( table ) + .appendChild( tr ) + .appendChild( trChild ); + + trStyle = window.getComputedStyle( tr ); + reliableTrDimensionsVal = ( parseInt( trStyle.height, 10 ) + + parseInt( trStyle.borderTopWidth, 10 ) + + parseInt( trStyle.borderBottomWidth, 10 ) ) === tr.offsetHeight; + + documentElement.removeChild( table ); + } + return reliableTrDimensionsVal; + } + } ); +} )(); + + +function curCSS( elem, name, computed ) { + var width, minWidth, maxWidth, ret, + + // Support: Firefox 51+ + // Retrieving style before computed somehow + // fixes an issue with getting wrong values + // on detached elements + style = elem.style; + + computed = computed || getStyles( elem ); + + // getPropertyValue is needed for: + // .css('filter') (IE 9 only, #12537) + // .css('--customProperty) (#3144) + if ( computed ) { + ret = computed.getPropertyValue( name ) || computed[ name ]; + + if ( ret === "" && !isAttached( elem ) ) { + ret = jQuery.style( elem, name ); + } + + // A tribute to the "awesome hack by Dean Edwards" + // Android Browser returns percentage for some values, + // but width seems to be reliably pixels. 
+ // This is against the CSSOM draft spec: + // https://drafts.csswg.org/cssom/#resolved-values + if ( !support.pixelBoxStyles() && rnumnonpx.test( ret ) && rboxStyle.test( name ) ) { + + // Remember the original values + width = style.width; + minWidth = style.minWidth; + maxWidth = style.maxWidth; + + // Put in the new values to get a computed value out + style.minWidth = style.maxWidth = style.width = ret; + ret = computed.width; + + // Revert the changed values + style.width = width; + style.minWidth = minWidth; + style.maxWidth = maxWidth; + } + } + + return ret !== undefined ? + + // Support: IE <=9 - 11 only + // IE returns zIndex value as an integer. + ret + "" : + ret; +} + + +function addGetHookIf( conditionFn, hookFn ) { + + // Define the hook, we'll check on the first run if it's really needed. + return { + get: function() { + if ( conditionFn() ) { + + // Hook not needed (or it's not possible to use it due + // to missing dependency), remove it. + delete this.get; + return; + } + + // Hook needed; redefine it so that the support test is not executed again. + return ( this.get = hookFn ).apply( this, arguments ); + } + }; +} + + +var cssPrefixes = [ "Webkit", "Moz", "ms" ], + emptyStyle = document.createElement( "div" ).style, + vendorProps = {}; + +// Return a vendor-prefixed property or undefined +function vendorPropName( name ) { + + // Check for vendor prefixed names + var capName = name[ 0 ].toUpperCase() + name.slice( 1 ), + i = cssPrefixes.length; + + while ( i-- ) { + name = cssPrefixes[ i ] + capName; + if ( name in emptyStyle ) { + return name; + } + } +} + +// Return a potentially-mapped jQuery.cssProps or vendor prefixed property +function finalPropName( name ) { + var final = jQuery.cssProps[ name ] || vendorProps[ name ]; + + if ( final ) { + return final; + } + if ( name in emptyStyle ) { + return name; + } + return vendorProps[ name ] = vendorPropName( name ) || name; +} + + +var + + // Swappable if display is none or starts with table + // except "table", "table-cell", or "table-caption" + // See here for display values: https://developer.mozilla.org/en-US/docs/CSS/display + rdisplayswap = /^(none|table(?!-c[ea]).+)/, + rcustomProp = /^--/, + cssShow = { position: "absolute", visibility: "hidden", display: "block" }, + cssNormalTransform = { + letterSpacing: "0", + fontWeight: "400" + }; + +function setPositiveNumber( _elem, value, subtract ) { + + // Any relative (+/-) values have already been + // normalized at this point + var matches = rcssNum.exec( value ); + return matches ? + + // Guard against undefined "subtract", e.g., when used as in cssHooks + Math.max( 0, matches[ 2 ] - ( subtract || 0 ) ) + ( matches[ 3 ] || "px" ) : + value; +} + +function boxModelAdjustment( elem, dimension, box, isBorderBox, styles, computedVal ) { + var i = dimension === "width" ? 1 : 0, + extra = 0, + delta = 0; + + // Adjustment may not be necessary + if ( box === ( isBorderBox ? 
"border" : "content" ) ) { + return 0; + } + + for ( ; i < 4; i += 2 ) { + + // Both box models exclude margin + if ( box === "margin" ) { + delta += jQuery.css( elem, box + cssExpand[ i ], true, styles ); + } + + // If we get here with a content-box, we're seeking "padding" or "border" or "margin" + if ( !isBorderBox ) { + + // Add padding + delta += jQuery.css( elem, "padding" + cssExpand[ i ], true, styles ); + + // For "border" or "margin", add border + if ( box !== "padding" ) { + delta += jQuery.css( elem, "border" + cssExpand[ i ] + "Width", true, styles ); + + // But still keep track of it otherwise + } else { + extra += jQuery.css( elem, "border" + cssExpand[ i ] + "Width", true, styles ); + } + + // If we get here with a border-box (content + padding + border), we're seeking "content" or + // "padding" or "margin" + } else { + + // For "content", subtract padding + if ( box === "content" ) { + delta -= jQuery.css( elem, "padding" + cssExpand[ i ], true, styles ); + } + + // For "content" or "padding", subtract border + if ( box !== "margin" ) { + delta -= jQuery.css( elem, "border" + cssExpand[ i ] + "Width", true, styles ); + } + } + } + + // Account for positive content-box scroll gutter when requested by providing computedVal + if ( !isBorderBox && computedVal >= 0 ) { + + // offsetWidth/offsetHeight is a rounded sum of content, padding, scroll gutter, and border + // Assuming integer scroll gutter, subtract the rest and round down + delta += Math.max( 0, Math.ceil( + elem[ "offset" + dimension[ 0 ].toUpperCase() + dimension.slice( 1 ) ] - + computedVal - + delta - + extra - + 0.5 + + // If offsetWidth/offsetHeight is unknown, then we can't determine content-box scroll gutter + // Use an explicit zero to avoid NaN (gh-3964) + ) ) || 0; + } + + return delta; +} + +function getWidthOrHeight( elem, dimension, extra ) { + + // Start with computed style + var styles = getStyles( elem ), + + // To avoid forcing a reflow, only fetch boxSizing if we need it (gh-4322). + // Fake content-box until we know it's needed to know the true value. + boxSizingNeeded = !support.boxSizingReliable() || extra, + isBorderBox = boxSizingNeeded && + jQuery.css( elem, "boxSizing", false, styles ) === "border-box", + valueIsBorderBox = isBorderBox, + + val = curCSS( elem, dimension, styles ), + offsetProp = "offset" + dimension[ 0 ].toUpperCase() + dimension.slice( 1 ); + + // Support: Firefox <=54 + // Return a confounding non-pixel value or feign ignorance, as appropriate. + if ( rnumnonpx.test( val ) ) { + if ( !extra ) { + return val; + } + val = "auto"; + } + + + // Support: IE 9 - 11 only + // Use offsetWidth/offsetHeight for when box sizing is unreliable. + // In those cases, the computed value can be trusted to be border-box. + if ( ( !support.boxSizingReliable() && isBorderBox || + + // Support: IE 10 - 11+, Edge 15 - 18+ + // IE/Edge misreport `getComputedStyle` of table rows with width/height + // set in CSS while `offset*` properties report correct values. + // Interestingly, in some cases IE 9 doesn't suffer from this issue. 
+ !support.reliableTrDimensions() && nodeName( elem, "tr" ) || + + // Fall back to offsetWidth/offsetHeight when value is "auto" + // This happens for inline elements with no explicit setting (gh-3571) + val === "auto" || + + // Support: Android <=4.1 - 4.3 only + // Also use offsetWidth/offsetHeight for misreported inline dimensions (gh-3602) + !parseFloat( val ) && jQuery.css( elem, "display", false, styles ) === "inline" ) && + + // Make sure the element is visible & connected + elem.getClientRects().length ) { + + isBorderBox = jQuery.css( elem, "boxSizing", false, styles ) === "border-box"; + + // Where available, offsetWidth/offsetHeight approximate border box dimensions. + // Where not available (e.g., SVG), assume unreliable box-sizing and interpret the + // retrieved value as a content box dimension. + valueIsBorderBox = offsetProp in elem; + if ( valueIsBorderBox ) { + val = elem[ offsetProp ]; + } + } + + // Normalize "" and auto + val = parseFloat( val ) || 0; + + // Adjust for the element's box model + return ( val + + boxModelAdjustment( + elem, + dimension, + extra || ( isBorderBox ? "border" : "content" ), + valueIsBorderBox, + styles, + + // Provide the current computed size to request scroll gutter calculation (gh-3589) + val + ) + ) + "px"; +} + +jQuery.extend( { + + // Add in style property hooks for overriding the default + // behavior of getting and setting a style property + cssHooks: { + opacity: { + get: function( elem, computed ) { + if ( computed ) { + + // We should always get a number back from opacity + var ret = curCSS( elem, "opacity" ); + return ret === "" ? "1" : ret; + } + } + } + }, + + // Don't automatically add "px" to these possibly-unitless properties + cssNumber: { + "animationIterationCount": true, + "columnCount": true, + "fillOpacity": true, + "flexGrow": true, + "flexShrink": true, + "fontWeight": true, + "gridArea": true, + "gridColumn": true, + "gridColumnEnd": true, + "gridColumnStart": true, + "gridRow": true, + "gridRowEnd": true, + "gridRowStart": true, + "lineHeight": true, + "opacity": true, + "order": true, + "orphans": true, + "widows": true, + "zIndex": true, + "zoom": true + }, + + // Add in properties whose names you wish to fix before + // setting or getting the value + cssProps: {}, + + // Get and set the style property on a DOM Node + style: function( elem, name, value, extra ) { + + // Don't set styles on text and comment nodes + if ( !elem || elem.nodeType === 3 || elem.nodeType === 8 || !elem.style ) { + return; + } + + // Make sure that we're working with the right name + var ret, type, hooks, + origName = camelCase( name ), + isCustomProp = rcustomProp.test( name ), + style = elem.style; + + // Make sure that we're working with the right name. We don't + // want to query the value if it is a CSS custom property + // since they are user-defined. 
+ if ( !isCustomProp ) { + name = finalPropName( origName ); + } + + // Gets hook for the prefixed version, then unprefixed version + hooks = jQuery.cssHooks[ name ] || jQuery.cssHooks[ origName ]; + + // Check if we're setting a value + if ( value !== undefined ) { + type = typeof value; + + // Convert "+=" or "-=" to relative numbers (#7345) + if ( type === "string" && ( ret = rcssNum.exec( value ) ) && ret[ 1 ] ) { + value = adjustCSS( elem, name, ret ); + + // Fixes bug #9237 + type = "number"; + } + + // Make sure that null and NaN values aren't set (#7116) + if ( value == null || value !== value ) { + return; + } + + // If a number was passed in, add the unit (except for certain CSS properties) + // The isCustomProp check can be removed in jQuery 4.0 when we only auto-append + // "px" to a few hardcoded values. + if ( type === "number" && !isCustomProp ) { + value += ret && ret[ 3 ] || ( jQuery.cssNumber[ origName ] ? "" : "px" ); + } + + // background-* props affect original clone's values + if ( !support.clearCloneStyle && value === "" && name.indexOf( "background" ) === 0 ) { + style[ name ] = "inherit"; + } + + // If a hook was provided, use that value, otherwise just set the specified value + if ( !hooks || !( "set" in hooks ) || + ( value = hooks.set( elem, value, extra ) ) !== undefined ) { + + if ( isCustomProp ) { + style.setProperty( name, value ); + } else { + style[ name ] = value; + } + } + + } else { + + // If a hook was provided get the non-computed value from there + if ( hooks && "get" in hooks && + ( ret = hooks.get( elem, false, extra ) ) !== undefined ) { + + return ret; + } + + // Otherwise just get the value from the style object + return style[ name ]; + } + }, + + css: function( elem, name, extra, styles ) { + var val, num, hooks, + origName = camelCase( name ), + isCustomProp = rcustomProp.test( name ); + + // Make sure that we're working with the right name. We don't + // want to modify the value if it is a CSS custom property + // since they are user-defined. + if ( !isCustomProp ) { + name = finalPropName( origName ); + } + + // Try prefixed name followed by the unprefixed name + hooks = jQuery.cssHooks[ name ] || jQuery.cssHooks[ origName ]; + + // If a hook was provided get the computed value from there + if ( hooks && "get" in hooks ) { + val = hooks.get( elem, true, extra ); + } + + // Otherwise, if a way to get the computed value exists, use that + if ( val === undefined ) { + val = curCSS( elem, name, styles ); + } + + // Convert "normal" to computed value + if ( val === "normal" && name in cssNormalTransform ) { + val = cssNormalTransform[ name ]; + } + + // Make numeric if forced or a qualifier was provided and val looks numeric + if ( extra === "" || extra ) { + num = parseFloat( val ); + return extra === true || isFinite( num ) ? num || 0 : val; + } + + return val; + } +} ); + +jQuery.each( [ "height", "width" ], function( _i, dimension ) { + jQuery.cssHooks[ dimension ] = { + get: function( elem, computed, extra ) { + if ( computed ) { + + // Certain elements can have dimension info if we invisibly show them + // but it must have a current display style that would benefit + return rdisplayswap.test( jQuery.css( elem, "display" ) ) && + + // Support: Safari 8+ + // Table columns in Safari have non-zero offsetWidth & zero + // getBoundingClientRect().width unless display is changed. + // Support: IE <=11 only + // Running getBoundingClientRect on a disconnected node + // in IE throws an error. 
+ ( !elem.getClientRects().length || !elem.getBoundingClientRect().width ) ? + swap( elem, cssShow, function() { + return getWidthOrHeight( elem, dimension, extra ); + } ) : + getWidthOrHeight( elem, dimension, extra ); + } + }, + + set: function( elem, value, extra ) { + var matches, + styles = getStyles( elem ), + + // Only read styles.position if the test has a chance to fail + // to avoid forcing a reflow. + scrollboxSizeBuggy = !support.scrollboxSize() && + styles.position === "absolute", + + // To avoid forcing a reflow, only fetch boxSizing if we need it (gh-3991) + boxSizingNeeded = scrollboxSizeBuggy || extra, + isBorderBox = boxSizingNeeded && + jQuery.css( elem, "boxSizing", false, styles ) === "border-box", + subtract = extra ? + boxModelAdjustment( + elem, + dimension, + extra, + isBorderBox, + styles + ) : + 0; + + // Account for unreliable border-box dimensions by comparing offset* to computed and + // faking a content-box to get border and padding (gh-3699) + if ( isBorderBox && scrollboxSizeBuggy ) { + subtract -= Math.ceil( + elem[ "offset" + dimension[ 0 ].toUpperCase() + dimension.slice( 1 ) ] - + parseFloat( styles[ dimension ] ) - + boxModelAdjustment( elem, dimension, "border", false, styles ) - + 0.5 + ); + } + + // Convert to pixels if value adjustment is needed + if ( subtract && ( matches = rcssNum.exec( value ) ) && + ( matches[ 3 ] || "px" ) !== "px" ) { + + elem.style[ dimension ] = value; + value = jQuery.css( elem, dimension ); + } + + return setPositiveNumber( elem, value, subtract ); + } + }; +} ); + +jQuery.cssHooks.marginLeft = addGetHookIf( support.reliableMarginLeft, + function( elem, computed ) { + if ( computed ) { + return ( parseFloat( curCSS( elem, "marginLeft" ) ) || + elem.getBoundingClientRect().left - + swap( elem, { marginLeft: 0 }, function() { + return elem.getBoundingClientRect().left; + } ) + ) + "px"; + } + } +); + +// These hooks are used by animate to expand properties +jQuery.each( { + margin: "", + padding: "", + border: "Width" +}, function( prefix, suffix ) { + jQuery.cssHooks[ prefix + suffix ] = { + expand: function( value ) { + var i = 0, + expanded = {}, + + // Assumes a single number if not a string + parts = typeof value === "string" ? value.split( " " ) : [ value ]; + + for ( ; i < 4; i++ ) { + expanded[ prefix + cssExpand[ i ] + suffix ] = + parts[ i ] || parts[ i - 2 ] || parts[ 0 ]; + } + + return expanded; + } + }; + + if ( prefix !== "margin" ) { + jQuery.cssHooks[ prefix + suffix ].set = setPositiveNumber; + } +} ); + +jQuery.fn.extend( { + css: function( name, value ) { + return access( this, function( elem, name, value ) { + var styles, len, + map = {}, + i = 0; + + if ( Array.isArray( name ) ) { + styles = getStyles( elem ); + len = name.length; + + for ( ; i < len; i++ ) { + map[ name[ i ] ] = jQuery.css( elem, name[ i ], false, styles ); + } + + return map; + } + + return value !== undefined ? + jQuery.style( elem, name, value ) : + jQuery.css( elem, name ); + }, name, value, arguments.length > 1 ); + } +} ); + + +function Tween( elem, options, prop, end, easing ) { + return new Tween.prototype.init( elem, options, prop, end, easing ); +} +jQuery.Tween = Tween; + +Tween.prototype = { + constructor: Tween, + init: function( elem, options, prop, end, easing, unit ) { + this.elem = elem; + this.prop = prop; + this.easing = easing || jQuery.easing._default; + this.options = options; + this.start = this.now = this.cur(); + this.end = end; + this.unit = unit || ( jQuery.cssNumber[ prop ] ? 
"" : "px" ); + }, + cur: function() { + var hooks = Tween.propHooks[ this.prop ]; + + return hooks && hooks.get ? + hooks.get( this ) : + Tween.propHooks._default.get( this ); + }, + run: function( percent ) { + var eased, + hooks = Tween.propHooks[ this.prop ]; + + if ( this.options.duration ) { + this.pos = eased = jQuery.easing[ this.easing ]( + percent, this.options.duration * percent, 0, 1, this.options.duration + ); + } else { + this.pos = eased = percent; + } + this.now = ( this.end - this.start ) * eased + this.start; + + if ( this.options.step ) { + this.options.step.call( this.elem, this.now, this ); + } + + if ( hooks && hooks.set ) { + hooks.set( this ); + } else { + Tween.propHooks._default.set( this ); + } + return this; + } +}; + +Tween.prototype.init.prototype = Tween.prototype; + +Tween.propHooks = { + _default: { + get: function( tween ) { + var result; + + // Use a property on the element directly when it is not a DOM element, + // or when there is no matching style property that exists. + if ( tween.elem.nodeType !== 1 || + tween.elem[ tween.prop ] != null && tween.elem.style[ tween.prop ] == null ) { + return tween.elem[ tween.prop ]; + } + + // Passing an empty string as a 3rd parameter to .css will automatically + // attempt a parseFloat and fallback to a string if the parse fails. + // Simple values such as "10px" are parsed to Float; + // complex values such as "rotate(1rad)" are returned as-is. + result = jQuery.css( tween.elem, tween.prop, "" ); + + // Empty strings, null, undefined and "auto" are converted to 0. + return !result || result === "auto" ? 0 : result; + }, + set: function( tween ) { + + // Use step hook for back compat. + // Use cssHook if its there. + // Use .style if available and use plain properties where available. + if ( jQuery.fx.step[ tween.prop ] ) { + jQuery.fx.step[ tween.prop ]( tween ); + } else if ( tween.elem.nodeType === 1 && ( + jQuery.cssHooks[ tween.prop ] || + tween.elem.style[ finalPropName( tween.prop ) ] != null ) ) { + jQuery.style( tween.elem, tween.prop, tween.now + tween.unit ); + } else { + tween.elem[ tween.prop ] = tween.now; + } + } + } +}; + +// Support: IE <=9 only +// Panic based approach to setting things on disconnected nodes +Tween.propHooks.scrollTop = Tween.propHooks.scrollLeft = { + set: function( tween ) { + if ( tween.elem.nodeType && tween.elem.parentNode ) { + tween.elem[ tween.prop ] = tween.now; + } + } +}; + +jQuery.easing = { + linear: function( p ) { + return p; + }, + swing: function( p ) { + return 0.5 - Math.cos( p * Math.PI ) / 2; + }, + _default: "swing" +}; + +jQuery.fx = Tween.prototype.init; + +// Back compat <1.8 extension point +jQuery.fx.step = {}; + + + + +var + fxNow, inProgress, + rfxtypes = /^(?:toggle|show|hide)$/, + rrun = /queueHooks$/; + +function schedule() { + if ( inProgress ) { + if ( document.hidden === false && window.requestAnimationFrame ) { + window.requestAnimationFrame( schedule ); + } else { + window.setTimeout( schedule, jQuery.fx.interval ); + } + + jQuery.fx.tick(); + } +} + +// Animations created synchronously will run synchronously +function createFxNow() { + window.setTimeout( function() { + fxNow = undefined; + } ); + return ( fxNow = Date.now() ); +} + +// Generate parameters to create a standard animation +function genFx( type, includeWidth ) { + var which, + i = 0, + attrs = { height: type }; + + // If we include width, step value is 1 to do all cssExpand values, + // otherwise step value is 2 to skip over Left and Right + includeWidth = includeWidth ? 
1 : 0; + for ( ; i < 4; i += 2 - includeWidth ) { + which = cssExpand[ i ]; + attrs[ "margin" + which ] = attrs[ "padding" + which ] = type; + } + + if ( includeWidth ) { + attrs.opacity = attrs.width = type; + } + + return attrs; +} + +function createTween( value, prop, animation ) { + var tween, + collection = ( Animation.tweeners[ prop ] || [] ).concat( Animation.tweeners[ "*" ] ), + index = 0, + length = collection.length; + for ( ; index < length; index++ ) { + if ( ( tween = collection[ index ].call( animation, prop, value ) ) ) { + + // We're done with this property + return tween; + } + } +} + +function defaultPrefilter( elem, props, opts ) { + var prop, value, toggle, hooks, oldfire, propTween, restoreDisplay, display, + isBox = "width" in props || "height" in props, + anim = this, + orig = {}, + style = elem.style, + hidden = elem.nodeType && isHiddenWithinTree( elem ), + dataShow = dataPriv.get( elem, "fxshow" ); + + // Queue-skipping animations hijack the fx hooks + if ( !opts.queue ) { + hooks = jQuery._queueHooks( elem, "fx" ); + if ( hooks.unqueued == null ) { + hooks.unqueued = 0; + oldfire = hooks.empty.fire; + hooks.empty.fire = function() { + if ( !hooks.unqueued ) { + oldfire(); + } + }; + } + hooks.unqueued++; + + anim.always( function() { + + // Ensure the complete handler is called before this completes + anim.always( function() { + hooks.unqueued--; + if ( !jQuery.queue( elem, "fx" ).length ) { + hooks.empty.fire(); + } + } ); + } ); + } + + // Detect show/hide animations + for ( prop in props ) { + value = props[ prop ]; + if ( rfxtypes.test( value ) ) { + delete props[ prop ]; + toggle = toggle || value === "toggle"; + if ( value === ( hidden ? "hide" : "show" ) ) { + + // Pretend to be hidden if this is a "show" and + // there is still data from a stopped show/hide + if ( value === "show" && dataShow && dataShow[ prop ] !== undefined ) { + hidden = true; + + // Ignore all other no-op show/hide data + } else { + continue; + } + } + orig[ prop ] = dataShow && dataShow[ prop ] || jQuery.style( elem, prop ); + } + } + + // Bail out if this is a no-op like .hide().hide() + propTween = !jQuery.isEmptyObject( props ); + if ( !propTween && jQuery.isEmptyObject( orig ) ) { + return; + } + + // Restrict "overflow" and "display" styles during box animations + if ( isBox && elem.nodeType === 1 ) { + + // Support: IE <=9 - 11, Edge 12 - 15 + // Record all 3 overflow attributes because IE does not infer the shorthand + // from identically-valued overflowX and overflowY and Edge just mirrors + // the overflowX value there. 
+ opts.overflow = [ style.overflow, style.overflowX, style.overflowY ]; + + // Identify a display type, preferring old show/hide data over the CSS cascade + restoreDisplay = dataShow && dataShow.display; + if ( restoreDisplay == null ) { + restoreDisplay = dataPriv.get( elem, "display" ); + } + display = jQuery.css( elem, "display" ); + if ( display === "none" ) { + if ( restoreDisplay ) { + display = restoreDisplay; + } else { + + // Get nonempty value(s) by temporarily forcing visibility + showHide( [ elem ], true ); + restoreDisplay = elem.style.display || restoreDisplay; + display = jQuery.css( elem, "display" ); + showHide( [ elem ] ); + } + } + + // Animate inline elements as inline-block + if ( display === "inline" || display === "inline-block" && restoreDisplay != null ) { + if ( jQuery.css( elem, "float" ) === "none" ) { + + // Restore the original display value at the end of pure show/hide animations + if ( !propTween ) { + anim.done( function() { + style.display = restoreDisplay; + } ); + if ( restoreDisplay == null ) { + display = style.display; + restoreDisplay = display === "none" ? "" : display; + } + } + style.display = "inline-block"; + } + } + } + + if ( opts.overflow ) { + style.overflow = "hidden"; + anim.always( function() { + style.overflow = opts.overflow[ 0 ]; + style.overflowX = opts.overflow[ 1 ]; + style.overflowY = opts.overflow[ 2 ]; + } ); + } + + // Implement show/hide animations + propTween = false; + for ( prop in orig ) { + + // General show/hide setup for this element animation + if ( !propTween ) { + if ( dataShow ) { + if ( "hidden" in dataShow ) { + hidden = dataShow.hidden; + } + } else { + dataShow = dataPriv.access( elem, "fxshow", { display: restoreDisplay } ); + } + + // Store hidden/visible for toggle so `.stop().toggle()` "reverses" + if ( toggle ) { + dataShow.hidden = !hidden; + } + + // Show elements before animating them + if ( hidden ) { + showHide( [ elem ], true ); + } + + /* eslint-disable no-loop-func */ + + anim.done( function() { + + /* eslint-enable no-loop-func */ + + // The final step of a "hide" animation is actually hiding the element + if ( !hidden ) { + showHide( [ elem ] ); + } + dataPriv.remove( elem, "fxshow" ); + for ( prop in orig ) { + jQuery.style( elem, prop, orig[ prop ] ); + } + } ); + } + + // Per-property setup + propTween = createTween( hidden ? dataShow[ prop ] : 0, prop, anim ); + if ( !( prop in dataShow ) ) { + dataShow[ prop ] = propTween.start; + if ( hidden ) { + propTween.end = propTween.start; + propTween.start = 0; + } + } + } +} + +function propFilter( props, specialEasing ) { + var index, name, easing, value, hooks; + + // camelCase, specialEasing and expand cssHook pass + for ( index in props ) { + name = camelCase( index ); + easing = specialEasing[ name ]; + value = props[ index ]; + if ( Array.isArray( value ) ) { + easing = value[ 1 ]; + value = props[ index ] = value[ 0 ]; + } + + if ( index !== name ) { + props[ name ] = value; + delete props[ index ]; + } + + hooks = jQuery.cssHooks[ name ]; + if ( hooks && "expand" in hooks ) { + value = hooks.expand( value ); + delete props[ name ]; + + // Not quite $.extend, this won't overwrite existing keys. 
+ // Reusing 'index' because we have the correct "name" + for ( index in value ) { + if ( !( index in props ) ) { + props[ index ] = value[ index ]; + specialEasing[ index ] = easing; + } + } + } else { + specialEasing[ name ] = easing; + } + } +} + +function Animation( elem, properties, options ) { + var result, + stopped, + index = 0, + length = Animation.prefilters.length, + deferred = jQuery.Deferred().always( function() { + + // Don't match elem in the :animated selector + delete tick.elem; + } ), + tick = function() { + if ( stopped ) { + return false; + } + var currentTime = fxNow || createFxNow(), + remaining = Math.max( 0, animation.startTime + animation.duration - currentTime ), + + // Support: Android 2.3 only + // Archaic crash bug won't allow us to use `1 - ( 0.5 || 0 )` (#12497) + temp = remaining / animation.duration || 0, + percent = 1 - temp, + index = 0, + length = animation.tweens.length; + + for ( ; index < length; index++ ) { + animation.tweens[ index ].run( percent ); + } + + deferred.notifyWith( elem, [ animation, percent, remaining ] ); + + // If there's more to do, yield + if ( percent < 1 && length ) { + return remaining; + } + + // If this was an empty animation, synthesize a final progress notification + if ( !length ) { + deferred.notifyWith( elem, [ animation, 1, 0 ] ); + } + + // Resolve the animation and report its conclusion + deferred.resolveWith( elem, [ animation ] ); + return false; + }, + animation = deferred.promise( { + elem: elem, + props: jQuery.extend( {}, properties ), + opts: jQuery.extend( true, { + specialEasing: {}, + easing: jQuery.easing._default + }, options ), + originalProperties: properties, + originalOptions: options, + startTime: fxNow || createFxNow(), + duration: options.duration, + tweens: [], + createTween: function( prop, end ) { + var tween = jQuery.Tween( elem, animation.opts, prop, end, + animation.opts.specialEasing[ prop ] || animation.opts.easing ); + animation.tweens.push( tween ); + return tween; + }, + stop: function( gotoEnd ) { + var index = 0, + + // If we are going to the end, we want to run all the tweens + // otherwise we skip this part + length = gotoEnd ? 
animation.tweens.length : 0; + if ( stopped ) { + return this; + } + stopped = true; + for ( ; index < length; index++ ) { + animation.tweens[ index ].run( 1 ); + } + + // Resolve when we played the last frame; otherwise, reject + if ( gotoEnd ) { + deferred.notifyWith( elem, [ animation, 1, 0 ] ); + deferred.resolveWith( elem, [ animation, gotoEnd ] ); + } else { + deferred.rejectWith( elem, [ animation, gotoEnd ] ); + } + return this; + } + } ), + props = animation.props; + + propFilter( props, animation.opts.specialEasing ); + + for ( ; index < length; index++ ) { + result = Animation.prefilters[ index ].call( animation, elem, props, animation.opts ); + if ( result ) { + if ( isFunction( result.stop ) ) { + jQuery._queueHooks( animation.elem, animation.opts.queue ).stop = + result.stop.bind( result ); + } + return result; + } + } + + jQuery.map( props, createTween, animation ); + + if ( isFunction( animation.opts.start ) ) { + animation.opts.start.call( elem, animation ); + } + + // Attach callbacks from options + animation + .progress( animation.opts.progress ) + .done( animation.opts.done, animation.opts.complete ) + .fail( animation.opts.fail ) + .always( animation.opts.always ); + + jQuery.fx.timer( + jQuery.extend( tick, { + elem: elem, + anim: animation, + queue: animation.opts.queue + } ) + ); + + return animation; +} + +jQuery.Animation = jQuery.extend( Animation, { + + tweeners: { + "*": [ function( prop, value ) { + var tween = this.createTween( prop, value ); + adjustCSS( tween.elem, prop, rcssNum.exec( value ), tween ); + return tween; + } ] + }, + + tweener: function( props, callback ) { + if ( isFunction( props ) ) { + callback = props; + props = [ "*" ]; + } else { + props = props.match( rnothtmlwhite ); + } + + var prop, + index = 0, + length = props.length; + + for ( ; index < length; index++ ) { + prop = props[ index ]; + Animation.tweeners[ prop ] = Animation.tweeners[ prop ] || []; + Animation.tweeners[ prop ].unshift( callback ); + } + }, + + prefilters: [ defaultPrefilter ], + + prefilter: function( callback, prepend ) { + if ( prepend ) { + Animation.prefilters.unshift( callback ); + } else { + Animation.prefilters.push( callback ); + } + } +} ); + +jQuery.speed = function( speed, easing, fn ) { + var opt = speed && typeof speed === "object" ? 
jQuery.extend( {}, speed ) : { + complete: fn || !fn && easing || + isFunction( speed ) && speed, + duration: speed, + easing: fn && easing || easing && !isFunction( easing ) && easing + }; + + // Go to the end state if fx are off + if ( jQuery.fx.off ) { + opt.duration = 0; + + } else { + if ( typeof opt.duration !== "number" ) { + if ( opt.duration in jQuery.fx.speeds ) { + opt.duration = jQuery.fx.speeds[ opt.duration ]; + + } else { + opt.duration = jQuery.fx.speeds._default; + } + } + } + + // Normalize opt.queue - true/undefined/null -> "fx" + if ( opt.queue == null || opt.queue === true ) { + opt.queue = "fx"; + } + + // Queueing + opt.old = opt.complete; + + opt.complete = function() { + if ( isFunction( opt.old ) ) { + opt.old.call( this ); + } + + if ( opt.queue ) { + jQuery.dequeue( this, opt.queue ); + } + }; + + return opt; +}; + +jQuery.fn.extend( { + fadeTo: function( speed, to, easing, callback ) { + + // Show any hidden elements after setting opacity to 0 + return this.filter( isHiddenWithinTree ).css( "opacity", 0 ).show() + + // Animate to the value specified + .end().animate( { opacity: to }, speed, easing, callback ); + }, + animate: function( prop, speed, easing, callback ) { + var empty = jQuery.isEmptyObject( prop ), + optall = jQuery.speed( speed, easing, callback ), + doAnimation = function() { + + // Operate on a copy of prop so per-property easing won't be lost + var anim = Animation( this, jQuery.extend( {}, prop ), optall ); + + // Empty animations, or finishing resolves immediately + if ( empty || dataPriv.get( this, "finish" ) ) { + anim.stop( true ); + } + }; + + doAnimation.finish = doAnimation; + + return empty || optall.queue === false ? + this.each( doAnimation ) : + this.queue( optall.queue, doAnimation ); + }, + stop: function( type, clearQueue, gotoEnd ) { + var stopQueue = function( hooks ) { + var stop = hooks.stop; + delete hooks.stop; + stop( gotoEnd ); + }; + + if ( typeof type !== "string" ) { + gotoEnd = clearQueue; + clearQueue = type; + type = undefined; + } + if ( clearQueue ) { + this.queue( type || "fx", [] ); + } + + return this.each( function() { + var dequeue = true, + index = type != null && type + "queueHooks", + timers = jQuery.timers, + data = dataPriv.get( this ); + + if ( index ) { + if ( data[ index ] && data[ index ].stop ) { + stopQueue( data[ index ] ); + } + } else { + for ( index in data ) { + if ( data[ index ] && data[ index ].stop && rrun.test( index ) ) { + stopQueue( data[ index ] ); + } + } + } + + for ( index = timers.length; index--; ) { + if ( timers[ index ].elem === this && + ( type == null || timers[ index ].queue === type ) ) { + + timers[ index ].anim.stop( gotoEnd ); + dequeue = false; + timers.splice( index, 1 ); + } + } + + // Start the next in the queue if the last step wasn't forced. + // Timers currently will call their complete callbacks, which + // will dequeue but only if they were gotoEnd. + if ( dequeue || !gotoEnd ) { + jQuery.dequeue( this, type ); + } + } ); + }, + finish: function( type ) { + if ( type !== false ) { + type = type || "fx"; + } + return this.each( function() { + var index, + data = dataPriv.get( this ), + queue = data[ type + "queue" ], + hooks = data[ type + "queueHooks" ], + timers = jQuery.timers, + length = queue ? 
queue.length : 0; + + // Enable finishing flag on private data + data.finish = true; + + // Empty the queue first + jQuery.queue( this, type, [] ); + + if ( hooks && hooks.stop ) { + hooks.stop.call( this, true ); + } + + // Look for any active animations, and finish them + for ( index = timers.length; index--; ) { + if ( timers[ index ].elem === this && timers[ index ].queue === type ) { + timers[ index ].anim.stop( true ); + timers.splice( index, 1 ); + } + } + + // Look for any animations in the old queue and finish them + for ( index = 0; index < length; index++ ) { + if ( queue[ index ] && queue[ index ].finish ) { + queue[ index ].finish.call( this ); + } + } + + // Turn off finishing flag + delete data.finish; + } ); + } +} ); + +jQuery.each( [ "toggle", "show", "hide" ], function( _i, name ) { + var cssFn = jQuery.fn[ name ]; + jQuery.fn[ name ] = function( speed, easing, callback ) { + return speed == null || typeof speed === "boolean" ? + cssFn.apply( this, arguments ) : + this.animate( genFx( name, true ), speed, easing, callback ); + }; +} ); + +// Generate shortcuts for custom animations +jQuery.each( { + slideDown: genFx( "show" ), + slideUp: genFx( "hide" ), + slideToggle: genFx( "toggle" ), + fadeIn: { opacity: "show" }, + fadeOut: { opacity: "hide" }, + fadeToggle: { opacity: "toggle" } +}, function( name, props ) { + jQuery.fn[ name ] = function( speed, easing, callback ) { + return this.animate( props, speed, easing, callback ); + }; +} ); + +jQuery.timers = []; +jQuery.fx.tick = function() { + var timer, + i = 0, + timers = jQuery.timers; + + fxNow = Date.now(); + + for ( ; i < timers.length; i++ ) { + timer = timers[ i ]; + + // Run the timer and safely remove it when done (allowing for external removal) + if ( !timer() && timers[ i ] === timer ) { + timers.splice( i--, 1 ); + } + } + + if ( !timers.length ) { + jQuery.fx.stop(); + } + fxNow = undefined; +}; + +jQuery.fx.timer = function( timer ) { + jQuery.timers.push( timer ); + jQuery.fx.start(); +}; + +jQuery.fx.interval = 13; +jQuery.fx.start = function() { + if ( inProgress ) { + return; + } + + inProgress = true; + schedule(); +}; + +jQuery.fx.stop = function() { + inProgress = null; +}; + +jQuery.fx.speeds = { + slow: 600, + fast: 200, + + // Default speed + _default: 400 +}; + + +// Based off of the plugin by Clint Helfers, with permission. +// https://web.archive.org/web/20100324014747/http://blindsignals.com/index.php/2009/07/jquery-delay/ +jQuery.fn.delay = function( time, type ) { + time = jQuery.fx ? 
jQuery.fx.speeds[ time ] || time : time; + type = type || "fx"; + + return this.queue( type, function( next, hooks ) { + var timeout = window.setTimeout( next, time ); + hooks.stop = function() { + window.clearTimeout( timeout ); + }; + } ); +}; + + +( function() { + var input = document.createElement( "input" ), + select = document.createElement( "select" ), + opt = select.appendChild( document.createElement( "option" ) ); + + input.type = "checkbox"; + + // Support: Android <=4.3 only + // Default value for a checkbox should be "on" + support.checkOn = input.value !== ""; + + // Support: IE <=11 only + // Must access selectedIndex to make default options select + support.optSelected = opt.selected; + + // Support: IE <=11 only + // An input loses its value after becoming a radio + input = document.createElement( "input" ); + input.value = "t"; + input.type = "radio"; + support.radioValue = input.value === "t"; +} )(); + + +var boolHook, + attrHandle = jQuery.expr.attrHandle; + +jQuery.fn.extend( { + attr: function( name, value ) { + return access( this, jQuery.attr, name, value, arguments.length > 1 ); + }, + + removeAttr: function( name ) { + return this.each( function() { + jQuery.removeAttr( this, name ); + } ); + } +} ); + +jQuery.extend( { + attr: function( elem, name, value ) { + var ret, hooks, + nType = elem.nodeType; + + // Don't get/set attributes on text, comment and attribute nodes + if ( nType === 3 || nType === 8 || nType === 2 ) { + return; + } + + // Fallback to prop when attributes are not supported + if ( typeof elem.getAttribute === "undefined" ) { + return jQuery.prop( elem, name, value ); + } + + // Attribute hooks are determined by the lowercase version + // Grab necessary hook if one is defined + if ( nType !== 1 || !jQuery.isXMLDoc( elem ) ) { + hooks = jQuery.attrHooks[ name.toLowerCase() ] || + ( jQuery.expr.match.bool.test( name ) ? boolHook : undefined ); + } + + if ( value !== undefined ) { + if ( value === null ) { + jQuery.removeAttr( elem, name ); + return; + } + + if ( hooks && "set" in hooks && + ( ret = hooks.set( elem, value, name ) ) !== undefined ) { + return ret; + } + + elem.setAttribute( name, value + "" ); + return value; + } + + if ( hooks && "get" in hooks && ( ret = hooks.get( elem, name ) ) !== null ) { + return ret; + } + + ret = jQuery.find.attr( elem, name ); + + // Non-existent attributes return null, we normalize to undefined + return ret == null ? 
undefined : ret; + }, + + attrHooks: { + type: { + set: function( elem, value ) { + if ( !support.radioValue && value === "radio" && + nodeName( elem, "input" ) ) { + var val = elem.value; + elem.setAttribute( "type", value ); + if ( val ) { + elem.value = val; + } + return value; + } + } + } + }, + + removeAttr: function( elem, value ) { + var name, + i = 0, + + // Attribute names can contain non-HTML whitespace characters + // https://html.spec.whatwg.org/multipage/syntax.html#attributes-2 + attrNames = value && value.match( rnothtmlwhite ); + + if ( attrNames && elem.nodeType === 1 ) { + while ( ( name = attrNames[ i++ ] ) ) { + elem.removeAttribute( name ); + } + } + } +} ); + +// Hooks for boolean attributes +boolHook = { + set: function( elem, value, name ) { + if ( value === false ) { + + // Remove boolean attributes when set to false + jQuery.removeAttr( elem, name ); + } else { + elem.setAttribute( name, name ); + } + return name; + } +}; + +jQuery.each( jQuery.expr.match.bool.source.match( /\w+/g ), function( _i, name ) { + var getter = attrHandle[ name ] || jQuery.find.attr; + + attrHandle[ name ] = function( elem, name, isXML ) { + var ret, handle, + lowercaseName = name.toLowerCase(); + + if ( !isXML ) { + + // Avoid an infinite loop by temporarily removing this function from the getter + handle = attrHandle[ lowercaseName ]; + attrHandle[ lowercaseName ] = ret; + ret = getter( elem, name, isXML ) != null ? + lowercaseName : + null; + attrHandle[ lowercaseName ] = handle; + } + return ret; + }; +} ); + + + + +var rfocusable = /^(?:input|select|textarea|button)$/i, + rclickable = /^(?:a|area)$/i; + +jQuery.fn.extend( { + prop: function( name, value ) { + return access( this, jQuery.prop, name, value, arguments.length > 1 ); + }, + + removeProp: function( name ) { + return this.each( function() { + delete this[ jQuery.propFix[ name ] || name ]; + } ); + } +} ); + +jQuery.extend( { + prop: function( elem, name, value ) { + var ret, hooks, + nType = elem.nodeType; + + // Don't get/set properties on text, comment and attribute nodes + if ( nType === 3 || nType === 8 || nType === 2 ) { + return; + } + + if ( nType !== 1 || !jQuery.isXMLDoc( elem ) ) { + + // Fix name and attach hooks + name = jQuery.propFix[ name ] || name; + hooks = jQuery.propHooks[ name ]; + } + + if ( value !== undefined ) { + if ( hooks && "set" in hooks && + ( ret = hooks.set( elem, value, name ) ) !== undefined ) { + return ret; + } + + return ( elem[ name ] = value ); + } + + if ( hooks && "get" in hooks && ( ret = hooks.get( elem, name ) ) !== null ) { + return ret; + } + + return elem[ name ]; + }, + + propHooks: { + tabIndex: { + get: function( elem ) { + + // Support: IE <=9 - 11 only + // elem.tabIndex doesn't always return the + // correct value when it hasn't been explicitly set + // https://web.archive.org/web/20141116233347/http://fluidproject.org/blog/2008/01/09/getting-setting-and-removing-tabindex-values-with-javascript/ + // Use proper attribute retrieval(#12072) + var tabindex = jQuery.find.attr( elem, "tabindex" ); + + if ( tabindex ) { + return parseInt( tabindex, 10 ); + } + + if ( + rfocusable.test( elem.nodeName ) || + rclickable.test( elem.nodeName ) && + elem.href + ) { + return 0; + } + + return -1; + } + } + }, + + propFix: { + "for": "htmlFor", + "class": "className" + } +} ); + +// Support: IE <=11 only +// Accessing the selectedIndex property +// forces the browser to respect setting selected +// on the option +// The getter ensures a default option is selected +// when in an 
optgroup +// eslint rule "no-unused-expressions" is disabled for this code +// since it considers such accessions noop +if ( !support.optSelected ) { + jQuery.propHooks.selected = { + get: function( elem ) { + + /* eslint no-unused-expressions: "off" */ + + var parent = elem.parentNode; + if ( parent && parent.parentNode ) { + parent.parentNode.selectedIndex; + } + return null; + }, + set: function( elem ) { + + /* eslint no-unused-expressions: "off" */ + + var parent = elem.parentNode; + if ( parent ) { + parent.selectedIndex; + + if ( parent.parentNode ) { + parent.parentNode.selectedIndex; + } + } + } + }; +} + +jQuery.each( [ + "tabIndex", + "readOnly", + "maxLength", + "cellSpacing", + "cellPadding", + "rowSpan", + "colSpan", + "useMap", + "frameBorder", + "contentEditable" +], function() { + jQuery.propFix[ this.toLowerCase() ] = this; +} ); + + + + + // Strip and collapse whitespace according to HTML spec + // https://infra.spec.whatwg.org/#strip-and-collapse-ascii-whitespace + function stripAndCollapse( value ) { + var tokens = value.match( rnothtmlwhite ) || []; + return tokens.join( " " ); + } + + +function getClass( elem ) { + return elem.getAttribute && elem.getAttribute( "class" ) || ""; +} + +function classesToArray( value ) { + if ( Array.isArray( value ) ) { + return value; + } + if ( typeof value === "string" ) { + return value.match( rnothtmlwhite ) || []; + } + return []; +} + +jQuery.fn.extend( { + addClass: function( value ) { + var classes, elem, cur, curValue, clazz, j, finalValue, + i = 0; + + if ( isFunction( value ) ) { + return this.each( function( j ) { + jQuery( this ).addClass( value.call( this, j, getClass( this ) ) ); + } ); + } + + classes = classesToArray( value ); + + if ( classes.length ) { + while ( ( elem = this[ i++ ] ) ) { + curValue = getClass( elem ); + cur = elem.nodeType === 1 && ( " " + stripAndCollapse( curValue ) + " " ); + + if ( cur ) { + j = 0; + while ( ( clazz = classes[ j++ ] ) ) { + if ( cur.indexOf( " " + clazz + " " ) < 0 ) { + cur += clazz + " "; + } + } + + // Only assign if different to avoid unneeded rendering. + finalValue = stripAndCollapse( cur ); + if ( curValue !== finalValue ) { + elem.setAttribute( "class", finalValue ); + } + } + } + } + + return this; + }, + + removeClass: function( value ) { + var classes, elem, cur, curValue, clazz, j, finalValue, + i = 0; + + if ( isFunction( value ) ) { + return this.each( function( j ) { + jQuery( this ).removeClass( value.call( this, j, getClass( this ) ) ); + } ); + } + + if ( !arguments.length ) { + return this.attr( "class", "" ); + } + + classes = classesToArray( value ); + + if ( classes.length ) { + while ( ( elem = this[ i++ ] ) ) { + curValue = getClass( elem ); + + // This expression is here for better compressibility (see addClass) + cur = elem.nodeType === 1 && ( " " + stripAndCollapse( curValue ) + " " ); + + if ( cur ) { + j = 0; + while ( ( clazz = classes[ j++ ] ) ) { + + // Remove *all* instances + while ( cur.indexOf( " " + clazz + " " ) > -1 ) { + cur = cur.replace( " " + clazz + " ", " " ); + } + } + + // Only assign if different to avoid unneeded rendering. + finalValue = stripAndCollapse( cur ); + if ( curValue !== finalValue ) { + elem.setAttribute( "class", finalValue ); + } + } + } + } + + return this; + }, + + toggleClass: function( value, stateVal ) { + var type = typeof value, + isValidValue = type === "string" || Array.isArray( value ); + + if ( typeof stateVal === "boolean" && isValidValue ) { + return stateVal ? 
this.addClass( value ) : this.removeClass( value ); + } + + if ( isFunction( value ) ) { + return this.each( function( i ) { + jQuery( this ).toggleClass( + value.call( this, i, getClass( this ), stateVal ), + stateVal + ); + } ); + } + + return this.each( function() { + var className, i, self, classNames; + + if ( isValidValue ) { + + // Toggle individual class names + i = 0; + self = jQuery( this ); + classNames = classesToArray( value ); + + while ( ( className = classNames[ i++ ] ) ) { + + // Check each className given, space separated list + if ( self.hasClass( className ) ) { + self.removeClass( className ); + } else { + self.addClass( className ); + } + } + + // Toggle whole class name + } else if ( value === undefined || type === "boolean" ) { + className = getClass( this ); + if ( className ) { + + // Store className if set + dataPriv.set( this, "__className__", className ); + } + + // If the element has a class name or if we're passed `false`, + // then remove the whole classname (if there was one, the above saved it). + // Otherwise bring back whatever was previously saved (if anything), + // falling back to the empty string if nothing was stored. + if ( this.setAttribute ) { + this.setAttribute( "class", + className || value === false ? + "" : + dataPriv.get( this, "__className__" ) || "" + ); + } + } + } ); + }, + + hasClass: function( selector ) { + var className, elem, + i = 0; + + className = " " + selector + " "; + while ( ( elem = this[ i++ ] ) ) { + if ( elem.nodeType === 1 && + ( " " + stripAndCollapse( getClass( elem ) ) + " " ).indexOf( className ) > -1 ) { + return true; + } + } + + return false; + } +} ); + + + + +var rreturn = /\r/g; + +jQuery.fn.extend( { + val: function( value ) { + var hooks, ret, valueIsFunction, + elem = this[ 0 ]; + + if ( !arguments.length ) { + if ( elem ) { + hooks = jQuery.valHooks[ elem.type ] || + jQuery.valHooks[ elem.nodeName.toLowerCase() ]; + + if ( hooks && + "get" in hooks && + ( ret = hooks.get( elem, "value" ) ) !== undefined + ) { + return ret; + } + + ret = elem.value; + + // Handle most common string cases + if ( typeof ret === "string" ) { + return ret.replace( rreturn, "" ); + } + + // Handle cases where value is null/undef or number + return ret == null ? "" : ret; + } + + return; + } + + valueIsFunction = isFunction( value ); + + return this.each( function( i ) { + var val; + + if ( this.nodeType !== 1 ) { + return; + } + + if ( valueIsFunction ) { + val = value.call( this, i, jQuery( this ).val() ); + } else { + val = value; + } + + // Treat null/undefined as ""; convert numbers to string + if ( val == null ) { + val = ""; + + } else if ( typeof val === "number" ) { + val += ""; + + } else if ( Array.isArray( val ) ) { + val = jQuery.map( val, function( value ) { + return value == null ? "" : value + ""; + } ); + } + + hooks = jQuery.valHooks[ this.type ] || jQuery.valHooks[ this.nodeName.toLowerCase() ]; + + // If set returns undefined, fall back to normal setting + if ( !hooks || !( "set" in hooks ) || hooks.set( this, val, "value" ) === undefined ) { + this.value = val; + } + } ); + } +} ); + +jQuery.extend( { + valHooks: { + option: { + get: function( elem ) { + + var val = jQuery.find.attr( elem, "value" ); + return val != null ? 
+ val : + + // Support: IE <=10 - 11 only + // option.text throws exceptions (#14686, #14858) + // Strip and collapse whitespace + // https://html.spec.whatwg.org/#strip-and-collapse-whitespace + stripAndCollapse( jQuery.text( elem ) ); + } + }, + select: { + get: function( elem ) { + var value, option, i, + options = elem.options, + index = elem.selectedIndex, + one = elem.type === "select-one", + values = one ? null : [], + max = one ? index + 1 : options.length; + + if ( index < 0 ) { + i = max; + + } else { + i = one ? index : 0; + } + + // Loop through all the selected options + for ( ; i < max; i++ ) { + option = options[ i ]; + + // Support: IE <=9 only + // IE8-9 doesn't update selected after form reset (#2551) + if ( ( option.selected || i === index ) && + + // Don't return options that are disabled or in a disabled optgroup + !option.disabled && + ( !option.parentNode.disabled || + !nodeName( option.parentNode, "optgroup" ) ) ) { + + // Get the specific value for the option + value = jQuery( option ).val(); + + // We don't need an array for one selects + if ( one ) { + return value; + } + + // Multi-Selects return an array + values.push( value ); + } + } + + return values; + }, + + set: function( elem, value ) { + var optionSet, option, + options = elem.options, + values = jQuery.makeArray( value ), + i = options.length; + + while ( i-- ) { + option = options[ i ]; + + /* eslint-disable no-cond-assign */ + + if ( option.selected = + jQuery.inArray( jQuery.valHooks.option.get( option ), values ) > -1 + ) { + optionSet = true; + } + + /* eslint-enable no-cond-assign */ + } + + // Force browsers to behave consistently when non-matching value is set + if ( !optionSet ) { + elem.selectedIndex = -1; + } + return values; + } + } + } +} ); + +// Radios and checkboxes getter/setter +jQuery.each( [ "radio", "checkbox" ], function() { + jQuery.valHooks[ this ] = { + set: function( elem, value ) { + if ( Array.isArray( value ) ) { + return ( elem.checked = jQuery.inArray( jQuery( elem ).val(), value ) > -1 ); + } + } + }; + if ( !support.checkOn ) { + jQuery.valHooks[ this ].get = function( elem ) { + return elem.getAttribute( "value" ) === null ? "on" : elem.value; + }; + } +} ); + + + + +// Return jQuery for attributes-only inclusion + + +support.focusin = "onfocusin" in window; + + +var rfocusMorph = /^(?:focusinfocus|focusoutblur)$/, + stopPropagationCallback = function( e ) { + e.stopPropagation(); + }; + +jQuery.extend( jQuery.event, { + + trigger: function( event, data, elem, onlyHandlers ) { + + var i, cur, tmp, bubbleType, ontype, handle, special, lastElement, + eventPath = [ elem || document ], + type = hasOwn.call( event, "type" ) ? event.type : event, + namespaces = hasOwn.call( event, "namespace" ) ? event.namespace.split( "." ) : []; + + cur = lastElement = tmp = elem = elem || document; + + // Don't do events on text and comment nodes + if ( elem.nodeType === 3 || elem.nodeType === 8 ) { + return; + } + + // focus/blur morphs to focusin/out; ensure we're not firing them right now + if ( rfocusMorph.test( type + jQuery.event.triggered ) ) { + return; + } + + if ( type.indexOf( "." ) > -1 ) { + + // Namespaced trigger; create a regexp to match event type in handle() + namespaces = type.split( "." ); + type = namespaces.shift(); + namespaces.sort(); + } + ontype = type.indexOf( ":" ) < 0 && "on" + type; + + // Caller can pass in a jQuery.Event object, Object, or just an event type string + event = event[ jQuery.expando ] ? 
+ event : + new jQuery.Event( type, typeof event === "object" && event ); + + // Trigger bitmask: & 1 for native handlers; & 2 for jQuery (always true) + event.isTrigger = onlyHandlers ? 2 : 3; + event.namespace = namespaces.join( "." ); + event.rnamespace = event.namespace ? + new RegExp( "(^|\\.)" + namespaces.join( "\\.(?:.*\\.|)" ) + "(\\.|$)" ) : + null; + + // Clean up the event in case it is being reused + event.result = undefined; + if ( !event.target ) { + event.target = elem; + } + + // Clone any incoming data and prepend the event, creating the handler arg list + data = data == null ? + [ event ] : + jQuery.makeArray( data, [ event ] ); + + // Allow special events to draw outside the lines + special = jQuery.event.special[ type ] || {}; + if ( !onlyHandlers && special.trigger && special.trigger.apply( elem, data ) === false ) { + return; + } + + // Determine event propagation path in advance, per W3C events spec (#9951) + // Bubble up to document, then to window; watch for a global ownerDocument var (#9724) + if ( !onlyHandlers && !special.noBubble && !isWindow( elem ) ) { + + bubbleType = special.delegateType || type; + if ( !rfocusMorph.test( bubbleType + type ) ) { + cur = cur.parentNode; + } + for ( ; cur; cur = cur.parentNode ) { + eventPath.push( cur ); + tmp = cur; + } + + // Only add window if we got to document (e.g., not plain obj or detached DOM) + if ( tmp === ( elem.ownerDocument || document ) ) { + eventPath.push( tmp.defaultView || tmp.parentWindow || window ); + } + } + + // Fire handlers on the event path + i = 0; + while ( ( cur = eventPath[ i++ ] ) && !event.isPropagationStopped() ) { + lastElement = cur; + event.type = i > 1 ? + bubbleType : + special.bindType || type; + + // jQuery handler + handle = ( dataPriv.get( cur, "events" ) || Object.create( null ) )[ event.type ] && + dataPriv.get( cur, "handle" ); + if ( handle ) { + handle.apply( cur, data ); + } + + // Native handler + handle = ontype && cur[ ontype ]; + if ( handle && handle.apply && acceptData( cur ) ) { + event.result = handle.apply( cur, data ); + if ( event.result === false ) { + event.preventDefault(); + } + } + } + event.type = type; + + // If nobody prevented the default action, do it now + if ( !onlyHandlers && !event.isDefaultPrevented() ) { + + if ( ( !special._default || + special._default.apply( eventPath.pop(), data ) === false ) && + acceptData( elem ) ) { + + // Call a native DOM method on the target with the same name as the event. 
+ // Don't do default actions on window, that's where global variables be (#6170) + if ( ontype && isFunction( elem[ type ] ) && !isWindow( elem ) ) { + + // Don't re-trigger an onFOO event when we call its FOO() method + tmp = elem[ ontype ]; + + if ( tmp ) { + elem[ ontype ] = null; + } + + // Prevent re-triggering of the same event, since we already bubbled it above + jQuery.event.triggered = type; + + if ( event.isPropagationStopped() ) { + lastElement.addEventListener( type, stopPropagationCallback ); + } + + elem[ type ](); + + if ( event.isPropagationStopped() ) { + lastElement.removeEventListener( type, stopPropagationCallback ); + } + + jQuery.event.triggered = undefined; + + if ( tmp ) { + elem[ ontype ] = tmp; + } + } + } + } + + return event.result; + }, + + // Piggyback on a donor event to simulate a different one + // Used only for `focus(in | out)` events + simulate: function( type, elem, event ) { + var e = jQuery.extend( + new jQuery.Event(), + event, + { + type: type, + isSimulated: true + } + ); + + jQuery.event.trigger( e, null, elem ); + } + +} ); + +jQuery.fn.extend( { + + trigger: function( type, data ) { + return this.each( function() { + jQuery.event.trigger( type, data, this ); + } ); + }, + triggerHandler: function( type, data ) { + var elem = this[ 0 ]; + if ( elem ) { + return jQuery.event.trigger( type, data, elem, true ); + } + } +} ); + + +// Support: Firefox <=44 +// Firefox doesn't have focus(in | out) events +// Related ticket - https://bugzilla.mozilla.org/show_bug.cgi?id=687787 +// +// Support: Chrome <=48 - 49, Safari <=9.0 - 9.1 +// focus(in | out) events fire after focus & blur events, +// which is spec violation - http://www.w3.org/TR/DOM-Level-3-Events/#events-focusevent-event-order +// Related ticket - https://bugs.chromium.org/p/chromium/issues/detail?id=449857 +if ( !support.focusin ) { + jQuery.each( { focus: "focusin", blur: "focusout" }, function( orig, fix ) { + + // Attach a single capturing handler on the document while someone wants focusin/focusout + var handler = function( event ) { + jQuery.event.simulate( fix, event.target, jQuery.event.fix( event ) ); + }; + + jQuery.event.special[ fix ] = { + setup: function() { + + // Handle: regular nodes (via `this.ownerDocument`), window + // (via `this.document`) & document (via `this`). + var doc = this.ownerDocument || this.document || this, + attaches = dataPriv.access( doc, fix ); + + if ( !attaches ) { + doc.addEventListener( orig, handler, true ); + } + dataPriv.access( doc, fix, ( attaches || 0 ) + 1 ); + }, + teardown: function() { + var doc = this.ownerDocument || this.document || this, + attaches = dataPriv.access( doc, fix ) - 1; + + if ( !attaches ) { + doc.removeEventListener( orig, handler, true ); + dataPriv.remove( doc, fix ); + + } else { + dataPriv.access( doc, fix, attaches ); + } + } + }; + } ); +} +var location = window.location; + +var nonce = { guid: Date.now() }; + +var rquery = ( /\?/ ); + + + +// Cross-browser xml parsing +jQuery.parseXML = function( data ) { + var xml, parserErrorElem; + if ( !data || typeof data !== "string" ) { + return null; + } + + // Support: IE 9 - 11 only + // IE throws on parseFromString with invalid input. + try { + xml = ( new window.DOMParser() ).parseFromString( data, "text/xml" ); + } catch ( e ) {} + + parserErrorElem = xml && xml.getElementsByTagName( "parsererror" )[ 0 ]; + if ( !xml || parserErrorElem ) { + jQuery.error( "Invalid XML: " + ( + parserErrorElem ? 
+ jQuery.map( parserErrorElem.childNodes, function( el ) { + return el.textContent; + } ).join( "\n" ) : + data + ) ); + } + return xml; +}; + + +var + rbracket = /\[\]$/, + rCRLF = /\r?\n/g, + rsubmitterTypes = /^(?:submit|button|image|reset|file)$/i, + rsubmittable = /^(?:input|select|textarea|keygen)/i; + +function buildParams( prefix, obj, traditional, add ) { + var name; + + if ( Array.isArray( obj ) ) { + + // Serialize array item. + jQuery.each( obj, function( i, v ) { + if ( traditional || rbracket.test( prefix ) ) { + + // Treat each array item as a scalar. + add( prefix, v ); + + } else { + + // Item is non-scalar (array or object), encode its numeric index. + buildParams( + prefix + "[" + ( typeof v === "object" && v != null ? i : "" ) + "]", + v, + traditional, + add + ); + } + } ); + + } else if ( !traditional && toType( obj ) === "object" ) { + + // Serialize object item. + for ( name in obj ) { + buildParams( prefix + "[" + name + "]", obj[ name ], traditional, add ); + } + + } else { + + // Serialize scalar item. + add( prefix, obj ); + } +} + +// Serialize an array of form elements or a set of +// key/values into a query string +jQuery.param = function( a, traditional ) { + var prefix, + s = [], + add = function( key, valueOrFunction ) { + + // If value is a function, invoke it and use its return value + var value = isFunction( valueOrFunction ) ? + valueOrFunction() : + valueOrFunction; + + s[ s.length ] = encodeURIComponent( key ) + "=" + + encodeURIComponent( value == null ? "" : value ); + }; + + if ( a == null ) { + return ""; + } + + // If an array was passed in, assume that it is an array of form elements. + if ( Array.isArray( a ) || ( a.jquery && !jQuery.isPlainObject( a ) ) ) { + + // Serialize the form elements + jQuery.each( a, function() { + add( this.name, this.value ); + } ); + + } else { + + // If traditional, encode the "old" way (the way 1.3.2 or older + // did it), otherwise encode params recursively. + for ( prefix in a ) { + buildParams( prefix, a[ prefix ], traditional, add ); + } + } + + // Return the resulting serialization + return s.join( "&" ); +}; + +jQuery.fn.extend( { + serialize: function() { + return jQuery.param( this.serializeArray() ); + }, + serializeArray: function() { + return this.map( function() { + + // Can add propHook for "elements" to filter or add form elements + var elements = jQuery.prop( this, "elements" ); + return elements ? 
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Using AIMET Tensorflow APIs with Keras Models

+
+

Introduction

+

Currently, AIMET APIs support TensorFlow sessions. This example shows how to use AIMET with a Keras model by invoking AIMET on the back-end session and then converting the returned session back to a Keras model.

+
+
+

APIs

+

The method involves performing four steps. The steps are:

+

Step 1: Save the session returned by AIMET

+
+
+aimet_tensorflow.utils.convert_tf_sess_to_keras.save_tf_session_single_gpu(sess, path, input_tensor, output_tensor)[source]
+

Saves TF session, meta graph and variables in the provided path

+
+
Parameters
+
    +
  • sess (Session) – Input: tf.compat.v1.Session

  • +
  • path (str) – Path to save the session

  • +
  • input_tensor (str) – Name of starting op to the given graph

  • +
  • output_tensor (str) – Name of output op of the graph

  • +
+
+
Returns
+

None

+
+
+
+ +
+

+
+

Step 2: Model subclassing to load the corresponding session to Keras model

+
+
+aimet_tensorflow.utils.convert_tf_sess_to_keras.load_tf_sess_variables_to_keras_single_gpu(path, compressed_ops)[source]
+

Creates a Keras model subclass and loads the saved session, meta graph and variables to Keras model

+
+
Parameters
+
    +
  • path (str) – Path to load the tf session saved using save_session_graph_and_variables

  • +
  • compressed_ops (List[str]) – List of op names skipped in Keras model creation. These are the ops that AIMET compressed and are isolated from the rest of the graph.

  • +
+
+
Return type
+

Model

+
+
Returns
+

Subclassed Keras Model

+
+
+
+ +
+

+
+

After these two steps, the model can be used for single-GPU training. For multi-GPU training, the next two steps need to be followed:

+

Step 3: Saving the Keras model from step 2 to make it compatible with distribution strategy

+
+
+aimet_tensorflow.utils.convert_tf_sess_to_keras.save_as_tf_module_multi_gpu(loading_path, saving_path, compressed_ops, input_shape)[source]
+

Loads a Keras model and re-saves the loaded object in the form of tf.Module

+
+
Parameters
+
    +
  • loading_path (str) – Path to load the Keras Model

  • +
  • saving_path (str) – Path to save the object

  • +
  • compressed_ops (List[str]) – List of op names to skip in Keras model creation. These are the ops that AIMET compressed and are isolated from the rest of the graph.

  • +
  • input_shape (Tuple) – shape of input to the model

  • +
+
+
Returns
+

None

+
+
+
+ +
+

+
+

Step 4: Model subclassing to load the corresponding Keras model

+
+
+aimet_tensorflow.utils.convert_tf_sess_to_keras.load_keras_model_multi_gpu(loading_path, input_shape)[source]
+

This function loads the Keras model back, which can then be used for fine-tuning within a distribution strategy

+
+
Parameters
+
    +
  • loading_path (str) – Path to load the Keras Model

  • +
  • input_shape (List) – the shape of the starting tensor in the graph; for instance (224, 224, 3) for ResNet50 and MobileNetV1

  • +
+
+
Returns
+

subclassed Keras model

+
+
+
+ +
+

+
+
+
+

Code Example

+

Required imports

+
import tensorflow as tf
+from aimet_tensorflow.utils.convert_tf_sess_to_keras import save_tf_session_single_gpu, save_as_tf_module_multi_gpu, \
+    load_tf_sess_variables_to_keras_single_gpu, load_keras_model_multi_gpu
+
+
+

Steps to convert a TF session found after compression to Keras model

+
def convert_tf_session_to_keras_model():
+    """
+    Convert an AIMET  spatial SVD compressed session to a Keras model and train the Keras model with MirroredStrategy
+    """
+    sess = get_sess_from_keras_model()
+
+    # For instance, if the first conv layer in the MobileNetV1 graph is compressed, then:
+    compressed_ops = ['conv1/Conv2D']
+    compressed_sess = compress_session(sess, compressed_ops)
+
+    # Defining the input and output tensors of the session for the MobileNet model
+    input_op_name, output_op_name = "input_1:0", "act_softmax/Softmax:0"
+
+    # Step 1: Saving the compressed session (single GPU)
+    path = './saved_model_single_gpu'
+    save_tf_session_single_gpu(compressed_sess, path, input_op_name, output_op_name)
+    tf.keras.backend.clear_session()
+
+    # Step 2: Loading the corresponding Keras model
+    tf.keras.backend.set_learning_phase(1)
+    model = load_tf_sess_variables_to_keras_single_gpu(path, compressed_ops)
+
+    # Single GPU training of the loaded Keras Model
+    train(model)
+
+    # To be able to do multi-GPU training, the next two steps need to be followed:
+    # Step 3: Re-Saving the Keras model to make it compatible with distribution strategy
+    saving_path = './saved_model_multi_gpu'
+    save_as_tf_module_multi_gpu(path, saving_path, compressed_ops, input_shape=(224, 224, 3))
+
+    tf.keras.backend.clear_session()
+
+    with tf.distribute.MirroredStrategy().scope():
+        tf.keras.backend.set_learning_phase(1)
+        # Step 4: Loading the Keras model and multi-GPU training the model on the given dataset
+        model = load_keras_model_multi_gpu(saving_path, input_shape=[224, 224, 3])
+        # Train model on Multi-GPU
+        train(model)
+
+
+
+
+

Utility Functions

+

Required imports

+
import tensorflow as tf
+from tensorflow.keras.applications import MobileNet
+from keras.applications.vgg16 import preprocess_input
+
+import numpy as np
+
+from aimet_common.defs import CompressionScheme, CostMetric
+from aimet_tensorflow.defs import SpatialSvdParameters
+from aimet_tensorflow.compress import ModelCompressor
+from aimet_tensorflow.defs import ModuleCompRatioPair
+
+
+

Utility function to get session from Keras model

+
def get_sess_from_keras_model():
+    """
+    Gets TF session from keras model
+    :return: TF session
+    """
+    tf.keras.backend.clear_session()
+    tf.keras.backend.set_learning_phase(1)
+    _ = MobileNet(weights=None, input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+    return sess
+
+
+

Utility function to get a compressed session

+
def compress_session(sess, compressible_ops):
+    """
+    Compresses a TF session
+    :param sess: Tf session
+    :param compressible_ops: layers to compress
+    :return: compressed session
+    """
+    layer_a = sess.graph.get_operation_by_name(compressible_ops[0])
+    list_of_module_comp_ratio_pairs = [ModuleCompRatioPair(layer_a, 0.5)]
+    manual_params = SpatialSvdParameters.ManualModeParams(
+        list_of_module_comp_ratio_pairs=list_of_module_comp_ratio_pairs)
+    params = SpatialSvdParameters(input_op_names=['input_1'], output_op_names=['act_softmax/Softmax'],
+                                  mode=SpatialSvdParameters.Mode.manual, params=manual_params)
+    scheme = CompressionScheme.spatial_svd
+    metric = CostMetric.mac
+
+    # pylint: disable=unused-argument
+    def evaluate(sess, iterations, use_cuda):
+        return 1
+
+    sess, _ = ModelCompressor.compress_model(sess=sess,
+                                             working_dir="./",
+                                             eval_callback=evaluate,
+                                             eval_iterations=None,
+                                             input_shape=(1, 3, 224, 224),
+                                             compress_scheme=scheme,
+                                             cost_metric=metric,
+                                             parameters=params)
+    return sess
+
+
+

Utility function for training

+
def train(model):
+    """
+    Trains using fake dataset
+    :param model: Keras model
+    :return: trained model
+    """
+    # Create a fake dataset
+    x_train = np.random.rand(32, 224, 224, 3)
+    y_train = np.random.rand(32, )
+    x_train = preprocess_input(x_train)
+    y_train = tf.keras.utils.to_categorical(y_train, 1000)
+
+    model.compile('rmsprop', 'mse')
+    model.fit(x_train, y_train, epochs=1, batch_size=1, shuffle=False)
+    return model
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/index.html b/releases/1.32.2/api_docs/index.html new file mode 100644 index 00000000..786e54ca --- /dev/null +++ b/releases/1.32.2/api_docs/index.html @@ -0,0 +1,1199 @@ + + + + + + Welcome to AI Model Efficiency Toolkit API Docs! — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Welcome to AI Model Efficiency Toolkit API Docs!

+

AI Model Efficiency Toolkit (AIMET) is a software toolkit that enables users to compress +and quantize ML models. The resulting models returned by AIMET can be further trained (or fine-tuned) +to dramatically improve accuracy lost due to quantization and compression.

+

AIMET is designed to work generically on any user-provided model. At present, AIMET supports +TensorFlow, Keras, PyTorch, and ONNX frameworks.

+

Please follow the links below to see AIMET APIs for either PyTorch, TensorFlow, Keras, or ONNX.

+ + + + +
+

+

+
+
+

Note

+

This documentation is auto-generated from the AIMET codebase using Sphinx

+
+
+

Indices and tables

+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras.html b/releases/1.32.2/api_docs/keras.html new file mode 100644 index 00000000..9bacbb1d --- /dev/null +++ b/releases/1.32.2/api_docs/keras.html @@ -0,0 +1,1141 @@ + + + + + + AIMET Keras APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+ +
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_adaround.html b/releases/1.32.2/api_docs/keras_adaround.html new file mode 100644 index 00000000..cd8ddb16 --- /dev/null +++ b/releases/1.32.2/api_docs/keras_adaround.html @@ -0,0 +1,1283 @@ + + + + + + AIMET Keras AdaRound API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET Keras AdaRound API

+ + +
+

Top-level API

+
+
+
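The top-level call for Keras AdaRound is Adaround.apply_adaround, used in the Code Examples below. A minimal hedged sketch, assuming a tf.keras model named model and an AdaroundParameters object named params have already been created as shown later on this page:

from aimet_common.defs import QuantScheme
from aimet_tensorflow.keras.adaround_weight import Adaround

# Applies AdaRound to `model` using the data described by `params`, writes the parameter
# encodings to ./dummy.encodings, and returns the model with AdaRounded weights.
adarounded_model = Adaround.apply_adaround(model, params, path='./', filename_prefix='dummy',
                                           default_param_bw=4,
                                           default_quant_scheme=QuantScheme.post_training_tf_enhanced)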

Adaround Parameters

+
+
+class aimet_tensorflow.adaround.adaround_weight.AdaroundParameters(data_set, num_batches, default_num_iterations=10000, default_reg_param=0.01, default_beta_range=(20, 2), default_warm_start=0.2)[source]
+

Configuration parameters for Adaround

+
+
Parameters
+
    +
  • data_set (DatasetV2) – TF Data set

  • +
  • num_batches (int) – Number of batches

  • +
  • default_num_iterations (int) – Number of iterations to adaround each layer. Default 10000

  • +
  • default_reg_param (float) – Regularization parameter, trading off between rounding loss vs reconstruction loss. +Default 0.01

  • +
  • default_beta_range (Tuple) – Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta). +Default (20, 2)

  • +
  • default_warm_start (float) – warm up period, during which rounding loss has zero effect. Default 20% (0.2)

  • +
+
+
+
+ +
+
+

Enum Definition

+

Quant Scheme Enum

+
+
+class aimet_common.defs.QuantScheme(value)[source]
+

Enumeration of Quant schemes

+
+
+post_training_percentile = 6
+

For a Tensor, adjusted minimum and maximum values are selected based on the percentile value passed. +The Quantization encodings are calculated using the adjusted minimum and maximum value.

+
+ +
+
+post_training_tf = 1
+

For a Tensor, the absolute minimum and maximum value of the Tensor are used to compute the Quantization +encodings.

+
+ +
+
+post_training_tf_enhanced = 2
+

For a Tensor, searches and selects the optimal minimum and maximum value that minimizes the Quantization Noise. +The Quantization encodings are calculated using the selected minimum and maximum value.

+
+ +
+
+training_range_learning_with_tf_enhanced_init = 4
+

For a Tensor, the encoding values are initialized with the post_training_tf_enhanced scheme. Then, the encodings +are learned during training.

+
+ +
+
+training_range_learning_with_tf_init = 3
+

For a Tensor, the encoding values are initialized with the post_training_tf scheme. Then, the encodings are +learned during training.

+
+ +
+ +
+
+

Code Examples

+

Required imports

+

+import logging
+import numpy as np
+import tensorflow as tf
+
+from aimet_common.utils import AimetLogger
+from aimet_common.defs import QuantScheme
+from aimet_tensorflow.examples.test_models import keras_model
+from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+from aimet_tensorflow.keras.adaround_weight import Adaround
+from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters
+
+
+
+

Evaluation function

+
def dummy_forward_pass(model: tf.keras.Model, _):
+    """
+    This is intended to be the user-defined model evaluation function.
+    AIMET requires the above signature. So if the user's eval function does not
+    match this signature, please create a simple wrapper.
+    :param model: Model to be evaluated
+    :param _: These argument(s) are passed to the forward_pass_callback as-is. Up to
+            the user to determine the type of this parameter. E.g. could be simply an integer representing the number
+            of data samples to use. Or could be a tuple of parameters or an object representing something more complex.
+            If set to None, forward_pass_callback will be invoked with no parameters.
+    :return: single float number (accuracy) representing model's performance
+    """
+    input_data = np.random.rand(32, 16, 16, 3)
+    return model(input_data)
+
+
+

After applying AdaRound to the model, the AdaRounded model and associated encodings are returned

+
def apply_adaround_example():
+
+    AimetLogger.set_level_for_all_areas(logging.DEBUG)
+    tf.keras.backend.clear_session()
+
+    model = keras_model()
+    dataset_size = 32
+    batch_size = 16
+    possible_batches = dataset_size // batch_size
+    input_data = np.random.rand(dataset_size, 16, 16, 3)
+    dataset = tf.data.Dataset.from_tensor_slices(input_data)
+    dataset = dataset.batch(batch_size=batch_size)
+
+    params = AdaroundParameters(data_set=dataset, num_batches=possible_batches, default_num_iterations=10)
+
+    # W4A8
+    param_bw = 4
+    output_bw = 8
+    quant_scheme = QuantScheme.post_training_tf_enhanced
+
+    # Returns model with AdaRounded weights and saves their corresponding encodings
+    adarounded_model = Adaround.apply_adaround(model, params, path='./', filename_prefix='dummy',
+                                               default_param_bw=param_bw, default_quant_scheme=quant_scheme)
+
+    # Create QuantSim using adarounded_model
+    sim = QuantizationSimModel(adarounded_model, quant_scheme, default_output_bw=output_bw, default_param_bw=param_bw)
+
+    # Set and freeze encodings to use same quantization grid and then invoke compute encodings
+    sim.set_and_freeze_param_encodings(encoding_path='./dummy.encodings')
+    sim.compute_encodings(dummy_forward_pass, None)
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_batchnorm_re_estimation.html b/releases/1.32.2/api_docs/keras_batchnorm_re_estimation.html new file mode 100644 index 00000000..f2e11fc8 --- /dev/null +++ b/releases/1.32.2/api_docs/keras_batchnorm_re_estimation.html @@ -0,0 +1,1216 @@ + + + + + + AIMET Keras BatchNorm Re-estimation APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET Keras BatchNorm Re-estimation APIs

+ +
+

Introduction

+

AIMET functionality for Keras BatchNorm Re-estimation recalculates the batchnorm statistics based on the model after QAT. By doing so, we aim to make our model learn batchnorm statistics from stable outputs after QAT, rather than from likely noisy outputs during QAT.

+
+
+

Top-level APIs

+

API for BatchNorm Re-estimation

+
+
+aimet_tensorflow.keras.bn_reestimation.reestimate_bn_stats(model, bn_re_estimation_dataset, bn_num_batches=100)[source]
+

Top-level API for the end user to call directly

+
+
Parameters
+
    +
  • model (Model) – tf.keras.Model

  • +
  • bn_re_estimation_dataset (DatasetV2) – Training dataset

  • +
  • bn_num_batches (int) – The number of batches to be used for reestimation

  • +
+
+
Return type
+

Handle

+
+
Returns
+

Handle that undoes the effect of BN re-estimation upon handle.remove()

+
+
+
+ +

API for BatchNorm fold to scale

+
+
+aimet_tensorflow.keras.batch_norm_fold.fold_all_batch_norms_to_scale(sim)[source]
+

Fold all batch_norm layers in a model into the quantization scale parameter +of the corresponding conv layers

+
+
Parameters
+

sim (QuantizationSimModel) – QuantizationSimModel to be folded

+
+
Return type
+

List[Tuple[QcQuantizeWrapper, QcQuantizeWrapper]]

+
+
Returns
+

A list of pairs of layers [(Conv/Linear, BN layer that got folded)]

+
+
+
+ +
+
+

Code Example

+

Required imports

+
from aimet_tensorflow.keras.bn_reestimation import reestimate_bn_stats
+from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms_to_scale
+
+
+

Prepare BatchNorm Re-estimation dataset

+
batch_size = 4
+dataset = tf.data.Dataset.from_tensor_slices(x_train[0:100])
+dataset = dataset.batch(batch_size=batch_size)
+dummy_inputs = x_train[0:4]
+
+
+

Perform BatchNorm Re-estimation

+
reestimate_bn_stats(qsim.model, dataset, 1)
+
+
+
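reestimate_bn_stats returns a Handle (see the API above). A hedged sketch, reusing qsim and dataset from this example, that keeps the handle so the re-estimation can be undone later if needed:

# Re-estimate BN statistics and keep the returned handle
handle = reestimate_bn_stats(qsim.model, dataset, bn_num_batches=1)

# ... evaluate or export the model using the re-estimated statistics ...

# Undo the effect of BN re-estimation, restoring the previous statistics
handle.remove()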

Perform BatchNorm Fold to scale

+
fold_all_batch_norms_to_scale(qsim)
+
+
+
+
+

Limitations

+

Please see The AIMET Keras ModelPreparer API limitations:

+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_compression.html b/releases/1.32.2/api_docs/keras_compression.html new file mode 100644 index 00000000..2ed8da5d --- /dev/null +++ b/releases/1.32.2/api_docs/keras_compression.html @@ -0,0 +1,1519 @@ + + + + + + AIMET Keras Compression API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET Keras Compression API

+
+

Introduction

+
+
AIMET supports the following model compression techniques for Keras models
    +
  • Spatial SVD

  • +
+
+
+

To learn more about these model compression techniques, please see Model Compression User Guide

+
+
For the Spatial SVD compression techniques, there are two modes in which you can invoke the AIMET API
    +
  • +
    Auto Mode: In Auto mode, AIMET will determine the optimal way to compress each layer of

    the model given an overall target compression ratio. Greedy Compression Ratio Selection Algorithm is used to pick appropriate compression ratios for each layer.

    +
    +
    +
  • +
  • +
    Manual Mode: In Manual mode, the user can pass in the desired compression-ratio per layer

    to AIMET. AIMET will apply the specified compression technique for each of the +layers to achieve the desired compression-ratio per layer. It is recommended that +the user start with Auto mode, and then tweak per-layer compression-ratios using +Manual mode if desired.

    +
    +
    +
  • +
+
+
+
+

+
+
+
+

Top-level API for Compression

+
+
+class aimet_tensorflow.keras.compress.ModelCompressor[source]
+

aimet model compressor: Enables model compression using various schemes

+
+ +
+

+
+
+
+static ModelCompressor.compress_model(model, eval_callback, eval_iterations, compress_scheme, cost_metric, parameters, trainer=None, visualization_url=None)[source]
+

Compress a given model using the specified parameters

+
+
Parameters
+
    +
  • model (Model) – Model, represented by a tf.keras.Model, to compress

  • +
  • eval_callback (Callable[[Any, Optional[int], bool], float]) – Evaluation callback. Expected signature is evaluate(model, iterations, use_cuda). +Expected to return an accuracy metric.

  • +
  • eval_iterations – Iterations to run evaluation for.

  • +
  • compress_scheme (CompressionScheme) – Compression scheme. See the enum for allowed values

  • +
  • cost_metric (CostMetric) – Cost metric to use for the compression-ratio (either mac or memory)

  • +
  • parameters (SpatialSvdParameters) – Compression parameters specific to given compression scheme

  • +
  • trainer (Optional[Callable]) – Training function +None: If per layer fine-tuning is not required while creating the final compressed model

  • +
  • visualization_url (Optional[str]) – url the user will need to input where visualizations will appear

  • +
+
+
Return type
+

Tuple[Model, CompressionStats]

+
+
Returns
+

A tuple of the compressed model session, and compression statistics

+
+
+
+ +
+

+
+
+
+

Greedy Selection Parameters

+
+
+class aimet_common.defs.GreedySelectionParameters(target_comp_ratio, num_comp_ratio_candidates=10, use_monotonic_fit=False, saved_eval_scores_dict=None)[source]
+

Configuration parameters for the Greedy compression-ratio selection algorithm

+
+
Variables
+
    +
  • target_comp_ratio – Target compression ratio. Expressed as value between 0 and 1. +Compression ratio is the ratio of cost of compressed model to cost of the original model.

  • +
  • num_comp_ratio_candidates – Number of comp-ratio candidates to analyze per layer. More candidates allow a more granular distribution of compression at the cost of increased run-time during analysis. Default value=10. Value should be greater than 1.

  • +
  • use_monotonic_fit – If True, eval scores in the eval dictionary are fitted to a monotonically increasing +function. This is useful if you see the eval dict scores for some layers are not monotonically increasing. +By default, this option is set to False.

  • +
  • saved_eval_scores_dict – Path to the eval_scores dictionary pickle file that was +saved in a previous run. This is useful to speed-up experiments when trying +different target compression-ratios for example. aimet will save eval_scores +dictionary pickle file automatically in a ./data directory relative to the +current path. num_comp_ratio_candidates parameter will be ignored when this option is used.

  • +
+
+
+
+ +
+

+
+
+
+

Spatial SVD Configuration

+
+
+class aimet_tensorflow.defs.SpatialSvdParameters(input_op_names, output_op_names, mode, params, multiplicity=1)[source]
+

Configuration parameters for spatial svd compression

+
+
Parameters
+
    +
  • input_op_names (List[str]) – list of input op names to the model

  • +
  • output_op_names (List[str]) – List of output op names of the model

  • +
  • mode (Mode) – Either auto mode or manual mode

  • +
  • params (Union[ManualModeParams, AutoModeParams]) – Parameters for the mode selected

  • +
  • multiplicity – The multiplicity to which ranks/input channels will get rounded. Default: 1

  • +
+
+
+
+
+class AutoModeParams(greedy_select_params, modules_to_ignore=None)[source]
+

Configuration parameters for auto-mode compression

+
+
Parameters
+
    +
  • greedy_select_params (GreedySelectionParameters) – Params for greedy comp-ratio selection algorithm

  • +
  • modules_to_ignore (Optional[List[Operation]]) – List of modules to ignore (None indicates nothing to ignore)

  • +
+
+
+
+ +
+
+class ManualModeParams(list_of_module_comp_ratio_pairs)[source]
+

Configuration parameters for manual-mode spatial svd compression

+
+
Parameters
+

list_of_module_comp_ratio_pairs (List[ModuleCompRatioPair]) – List of (module, comp-ratio) pairs

+
+
+
+ +
+
+class Mode(value)[source]
+

Mode enumeration

+
+
+auto = 2
+

Auto mode

+
+ +
+
+manual = 1
+

Manual mode

+
+ +
+ +
+ +
+

+
+
+
+

Configuration Definitions

+
+
+class aimet_common.defs.CostMetric(value)[source]
+

Enumeration of metrics to measure cost of a model/layer

+
+
+mac = 1
+

Cost modeled for compute requirements

+
+
Type
+

MAC

+
+
+
+ +
+
+memory = 2
+

Cost modeled for space requirements

+
+
Type
+

Memory

+
+
+
+ +
+ +
+

+
+
+
+class aimet_common.defs.CompressionScheme(value)[source]
+

Enumeration of compression schemes supported in aimet

+
+

Note

+

Only Spatial SVD is supported for now.

+
+
+
+channel_pruning = 3
+

Channel Pruning

+
+ +
+
+spatial_svd = 2
+

Spatial SVD

+
+ +
+
+weight_svd = 1
+

Weight SVD

+
+ +
+ +
+

+
+
+
+class aimet_tensorflow.defs.ModuleCompRatioPair(module, comp_ratio)[source]
+

Pair of tf.Operation and a compression-ratio

+
+
Variables
+
    +
  • module – Module of type tf.Operation

  • +
  • comp_ratio – Compression ratio. Compression ratio is the ratio of cost of compressed model +to cost of the original model.

  • +
+
+
+
+ +
+

+
+
+
+

Code Examples

+

Required imports

+
from decimal import Decimal
+from typing import Tuple
+
+import numpy as np
+import tensorflow as tf
+from tensorflow.keras.applications.resnet import ResNet50, preprocess_input, decode_predictions
+
+# imports for AIMET
+import aimet_common.defs as aimet_common_defs
+from aimet_tensorflow.keras.compress import ModelCompressor
+import aimet_tensorflow.defs as aimet_tensorflow_defs
+
+
+

Evaluation function

+
def get_eval_func(dataset_dir, batch_size, num_iterations=50000):
+    """
+    Sample Function which returns an evaluate function callback which can be
+    called to evaluate a model on the provided dataset
+    """
+    def func_wrapper(model, iterations):
+        validation_ds = tf.keras.preprocessing.image_dataset_from_directory(
+            directory=dataset_dir,
+            labels='inferred',
+            label_mode='categorical',
+            batch_size=batch_size,
+            shuffle=False,
+            image_size=(224, 224))
+        # If no iterations specified, set to full validation set
+        if not iterations:
+            iterations = num_iterations
+        else:
+            iterations = iterations * batch_size
+        top1 = 0
+        total = 0
+        inp_data = None
+        for (img, label) in validation_ds:
+            x = preprocess_input(img)
+            inp_data = x if inp_data is None else inp_data
+            preds = model.predict(x, batch_size=batch_size)
+            label = np.where(label)[1]
+            label = [validation_ds.class_names[int(i)] for i in label]
+            cnt = sum([1 for a, b in zip(label, decode_predictions(preds, top=1)) if str(a) == b[0][0]])
+            top1 += cnt
+            total += len(label)
+            if total >= iterations:
+                break
+
+        return top1/total
+    return func_wrapper
+
+
+

Compressing using Spatial SVD in auto mode

+
def aimet_spatial_svd(model, evaluator: aimet_common_defs.EvalFunction) -> Tuple[tf.keras.Model,
+                    aimet_common_defs.CompressionStats]:
+    """
+    Compresses the model using AIMET's Keras Spatial SVD auto mode compression scheme.
+
+    :param model: The keras model to compress
+    :param evaluator: Evaluator used during compression
+    :return: A tuple of compressed sess graph and its statistics
+    """
+
+    # Desired target compression ratio using Spatial SVD
+    # This value denotes the desired compression % of the original model.
+    # To compress the model to 20% of original model, use 0.2. This would
+    # compress the model by 80%.
+    # We are compressing the model by 50% here.
+    target_comp_ratio = Decimal(0.5)
+
+    # Number of compression-ratio candidates used by the API at each layer
+    # API will evaluate 0.1, 0.2, ..., 0.9, 1.0 ratio (total 10 candidates)
+    # at each layer
+    num_comp_ratio_candidates = 10
+
+    # Creating Greedy selection parameters:
+    greedy_params = aimet_common_defs.GreedySelectionParameters(target_comp_ratio=target_comp_ratio,
+                                                                num_comp_ratio_candidates=num_comp_ratio_candidates)
+
+    # Ignoring first convolutional layer of the model for compression
+    modules_to_ignore = [model.layers[2]]
+
+    # Creating Auto mode Parameters:
+    auto_params = aimet_tensorflow_defs.SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                                            modules_to_ignore=modules_to_ignore)
+
+    # Creating Spatial SVD parameters with Auto Mode:
+    params = aimet_tensorflow_defs.SpatialSvdParameters(input_op_names=model.inputs,
+                                                        output_op_names=model.outputs,
+                                                        mode=aimet_tensorflow_defs.SpatialSvdParameters.Mode.auto,
+                                                        params=auto_params)
+
+    # Scheme is Spatial SVD:
+    scheme = aimet_common_defs.CompressionScheme.spatial_svd
+
+    # Cost metric is MAC, it can be MAC or Memory
+    cost_metric = aimet_common_defs.CostMetric.mac
+
+
+    # Calling model compression using Spatial SVD:
+    # Here evaluator is passed which is used by the API to evaluate the
+    # accuracy for various compression ratio of each layer. To speed up
+    # the process, only 10 batches of data is being used inside evaluator
+    # (by passing eval_iterations=10) instead of running evaluation on
+    # complete dataset.
+    results = ModelCompressor.compress_model(model=model,
+                                             eval_callback=evaluator,
+                                             eval_iterations=10,
+                                             compress_scheme=scheme,
+                                             cost_metric=cost_metric,
+                                             parameters=params)
+
+    return results
+
+
+
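The example above uses Auto mode. For Manual mode, a hedged sketch is shown below. It reuses the imports and the model/evaluator objects from this page, the layer index and the 0.5 compression ratio are placeholder choices, and it assumes a Keras layer is accepted in ModuleCompRatioPair (mirroring how modules_to_ignore is used in the auto-mode example):

def aimet_spatial_svd_manual(model, evaluator):
    # Pair a chosen layer with the compression ratio to apply to it (placeholder values)
    pair = aimet_tensorflow_defs.ModuleCompRatioPair(model.layers[2], 0.5)
    manual_params = aimet_tensorflow_defs.SpatialSvdParameters.ManualModeParams(
        list_of_module_comp_ratio_pairs=[pair])

    # Creating Spatial SVD parameters with Manual mode
    params = aimet_tensorflow_defs.SpatialSvdParameters(
        input_op_names=model.inputs,
        output_op_names=model.outputs,
        mode=aimet_tensorflow_defs.SpatialSvdParameters.Mode.manual,
        params=manual_params)

    # Compress using the same scheme and cost metric as the auto-mode example
    return ModelCompressor.compress_model(model=model,
                                          eval_callback=evaluator,
                                          eval_iterations=10,
                                          compress_scheme=aimet_common_defs.CompressionScheme.spatial_svd,
                                          cost_metric=aimet_common_defs.CostMetric.mac,
                                          parameters=params)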

Sample Driver Code for Spatial SVD using Resnet50

+
def compress():
+    """
+    Example Driver Function Code in which we are compressing Resnet50 model.
+    """
+    dataset_dir = '/path/to/dataset'
+    model = ResNet50(weights='imagenet')
+    eval_func = get_eval_func(dataset_dir, batch_size=16)
+    compressed_model, stats = aimet_spatial_svd(model=model, evaluator=eval_func)
+    print(stats)
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_cross_layer_equalization.html b/releases/1.32.2/api_docs/keras_cross_layer_equalization.html new file mode 100644 index 00000000..202c65f7 --- /dev/null +++ b/releases/1.32.2/api_docs/keras_cross_layer_equalization.html @@ -0,0 +1,1197 @@ + + + + + + AIMET Keras Cross Layer Equalization APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET Keras Cross Layer Equalization APIs

+ + +
+

Introduction

+
+
AIMET functionality for Keras Cross Layer Equalization supports three techniques:
    +
  • BatchNorm Folding

  • +
  • Cross Layer Scaling

  • +
  • High Bias Fold

  • +
+
+
+
+
+

Cross Layer Equalization API

+

Listed below is a comprehensive API to apply all available techniques under cross layer equalization. +It performs ‘auto’ detection of candidate layers and applies the techniques. +If there are no BatchNorm layers in a given model, BatchNorm fold and high bias fold shall be skipped.

+

API(s) for Cross Layer Equalization

+
+
+aimet_tensorflow.keras.cross_layer_equalization.equalize_model(model)[source]
+

High-level API to perform Cross-Layer Equalization (CLE) on the given model.

:param model: tf.keras.Model to equalize
:type model: Model
:rtype: Model
:return: tf.keras.Model with CLE applied

+
+ +
+
+

Code Example

+

Required imports

+
import tensorflow as tf
+from aimet_tensorflow.keras.cross_layer_equalization import equalize_model
+
+
+

Cross Layer Equalization in auto mode comprehensive

+
def cross_layer_equalization_auto():
+    input_shape = (224, 224, 3)
+    model = tf.keras.applications.ResNet50()
+
+    cle_applied_model = equalize_model(model)
+
+
+
+
+

Primitive APIs

+

If the user would like to call the APIs individually, then the following APIs can be used:

+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_layer_output_generation.html b/releases/1.32.2/api_docs/keras_layer_output_generation.html new file mode 100644 index 00000000..bdafb717 --- /dev/null +++ b/releases/1.32.2/api_docs/keras_layer_output_generation.html @@ -0,0 +1,1220 @@ + + + + + + AIMET Keras Layer Output Generation API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Keras Layer Output Generation API

+

This API captures and saves intermediate layer-outputs of a model. The model can be original (FP32) or quantsim. The layer-outputs are named according to the exported Keras model by the quantsim export API. This allows layer-output comparison amongst the FP32 model, the quantization simulated model, and the actually quantized model on the target device to debug accuracy mismatch issues.

+
+

Top-level API

+
+
+class aimet_tensorflow.keras.layer_output_utils.LayerOutputUtil(model, save_dir='./KerasLayerOutput')[source]
+

Implementation to capture and save outputs of intermediate layers of a model (fp32/quantsim)

+

Constructor for LayerOutputUtil.

+
+
Parameters
+
    +
  • model (Model) – Keras (fp32/quantsim) model.

  • +
  • save_dir (str) – Directory to save the layer outputs.

  • +
+
+
+
+ +
+

+
+

The following API can be used to Generate Layer Outputs

+
+
+LayerOutputUtil.generate_layer_outputs(input_batch)[source]
+

This method captures output of every layer of a model & saves the inputs and corresponding layer-outputs to disk.

+
+
Parameters
+

input_batch (Union[Tensor, List[Tensor], Tuple[Tensor]]) – Batch of inputs for which layer outputs need to be generated

+
+
Returns
+

None

+
+
+
+ +
+

+
+
+
+

Code Example

+

Imports

+
import tensorflow as tf
+
+from aimet_tensorflow.keras.quantsim import QuantizationSimModel
+from aimet_tensorflow.keras.layer_output_utils import LayerOutputUtil
+
+
+

Obtain Original or QuantSim model from AIMET Export Artifacts

+
# Load the model.
+model = tf.keras.models.load_model('path/to/aimet_export_artifacts/model.h5')
+
+# Use the same arguments that were used for the exported QuantSim model. For the sake of simplicity, only mandatory arguments are passed below.
+quantsim = QuantizationSimModel(model)
+
+# Load exported encodings into quantsim object.
+quantsim.load_encodings_to_sim('path/to/aimet_export_artifacts/model.encodings')
+
+# Check whether constructed original and quantsim model are running properly before using Layer Output Generation API.
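+# NOTE: dummy_input is assumed to be a user-provided batch of inputs matching the model's expected input shape.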
+_ = model.predict(dummy_input)
+_ = quantsim.predict(dummy_input)
+
+
+

Obtain inputs for which we want to generate intermediate layer-outputs

+
# Use the same input pre-processing pipeline that was used for computing the quantization encodings.
+input_batches = get_pre_processed_inputs()
+
+
+

Generate layer-outputs

+
# Use original model to get fp32 layer-outputs
+fp32_layer_output_util = LayerOutputUtil(model=model, save_dir='fp32_layer_outputs')
+
+# Use quantsim model to get quantsim layer-outputs
+quantsim_layer_output_util = LayerOutputUtil(model=quantsim.model, save_dir='quantsim_layer_outputs')
+
+for input_batch in input_batches:
+    fp32_layer_output_util.generate_layer_outputs(input_batch=input_batch)
+    quantsim_layer_output_util.generate_layer_outputs(input_batch=input_batch)
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_model_guidelines.html b/releases/1.32.2/api_docs/keras_model_guidelines.html new file mode 100644 index 00000000..c8a7b9d0 --- /dev/null +++ b/releases/1.32.2/api_docs/keras_model_guidelines.html @@ -0,0 +1,1193 @@ + + + + + + Keras Model Guidelines — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

Keras Model Guidelines

+

In order to make full use of AIMET features, there are several guidelines users are encouraged to follow when defining +Keras models.

+

Model should support the Functional or Sequential Keras API

+

If at all possible, users should define their models using the Functional or Sequential Keras API, as this is the format that AIMET expects. Below are examples of a Functional and a Sequential model, respectively:

+
import tensorflow as tf
+
+# Functional API
+def get_model():
+    inputs = tf.keras.Input(shape=(32,))
+    x = tf.keras.layers.Dense(64, activation='relu')(inputs)
+    outputs = tf.keras.layers.Dense(10)(x)
+    return tf.keras.Model(inputs=inputs, outputs=outputs)
+
+# Sequential API
+def get_model():
+    model = tf.keras.Sequential()
+    model.add(tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)))
+    model.add(tf.keras.layers.Dense(10))
+    return model
+
+
+

If the user’s model is defined using the Subclassing API, or any mix of Functional, Sequential, and Subclassing, they can still use AIMET. However, they will need to convert their model to the Functional or Sequential API before using AIMET. This can be done by using the Model Preparer API

+

Avoid reuse of class defined modules

+

Modules defined in the class definition should only be used once. If any modules are being reused, instead define a new +identical module in the class definition. +For example, if the user had:

+
def __init__(self,...):
+    ...
+    self.relu = tf.keras.layers.ReLU()
+    ...
+
+def call(...):
+    ...
+    x = self.relu(x)
+    ...
+    x2 = self.relu(x2)
+    ...
+
+
+

Users should instead define their model as:

+
def __init__(self,...):
+    ...
+    self.relu = tf.keras.layers.ReLU()
+    self.relu2 = tf.keras.layers.ReLU()
+    ...
+
+def call(...):
+    ...
+    x = self.relu(x)
+    ...
+    x2 = self.relu2(x2)
+    ...
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_model_preparer.html b/releases/1.32.2/api_docs/keras_model_preparer.html new file mode 100644 index 00000000..b2a77650 --- /dev/null +++ b/releases/1.32.2/api_docs/keras_model_preparer.html @@ -0,0 +1,1371 @@ + + + + + + Model Preparer API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

Model Preparer API

+

AIMET Keras ModelPreparer API is used to prepare a Keras model that is not using the Keras Functional or Sequential API. +Specifically, it targets models that have been created using the subclassing feature in Keras. The ModelPreparer API will +convert the subclassing model to a Keras Functional API model. This is required because the AIMET Keras Quantization API +requires a Keras Functional API model as input.

+

Users are strongly encouraged to use the AIMET Keras ModelPreparer API first and then use the returned model as input to all the AIMET Quantization features. It is mandatory to use the AIMET Keras ModelPreparer API if the model is created using the subclassing feature in Keras, if any of the submodules of the model are created via subclassing, or if any custom layers that inherit from the Keras Layer class are used in the model.

+
+

Top-level API

+
+
+aimet_tensorflow.keras.model_preparer.prepare_model(original_model, input_layer=None)[source]
+

This function prepares a Keras model before continuing on with AIMET. Specifically, it will convert the model into a purely Functional API model and copy over the original model’s weights.

+
+
Parameters
+
    +
  • original_model (Model) – The original model to be prepared

  • +
  • input_layer (Union[InputLayer, List[InputLayer], None]) – The input layer to be used for the new model. By default, the input layer is set to None. If the beginning portion of the model is subclassed, then the input layer must be passed in.

:rtype: Model
:return: The prepared model if needed, or the original model

+
+ +
+
+

Code Examples

+

Required imports

+
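The examples below assume the following imports (NumPy, TensorFlow, and the prepare_model API documented above):

import numpy as np
import tensorflow as tf

from aimet_tensorflow.keras.model_preparer import prepare_model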

Example 1: Model with Two Subclassed Layers

+

We begin with a model that has two subclassed layers - TokenAndPositionEmbedding and TransformerBlock. This model +is taken from the Transformer text classification example.

+
class TokenAndPositionEmbedding(tf.keras.layers.Layer):
+    def __init__(self, maxlen, vocab_size, embed_dim):
+        super(TokenAndPositionEmbedding, self).__init__()
+        self.token_emb = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
+        self.pos_emb = tf.keras.layers.Embedding(input_dim=maxlen, output_dim=embed_dim)
+
+    def call(self, x, **kwargs):
+        maxlen = tf.shape(x)[-1]
+        positions = tf.range(start=0, limit=maxlen, delta=1)
+        positions = self.pos_emb(positions)
+        x = self.token_emb(x)
+        x = x + positions
+        return x
+
+
+
class TransformerBlock(tf.keras.layers.Layer):
+    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
+        super(TransformerBlock, self).__init__()
+        self.att = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
+        self.ffn = tf.keras.Sequential(
+            [tf.keras.layers.Dense(ff_dim, activation="relu"), tf.keras.layers.Dense(embed_dim),]
+        )
+        self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
+        self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
+        self.dropout1 = tf.keras.layers.Dropout(rate)
+        self.dropout2 = tf.keras.layers.Dropout(rate)
+
+    def call(self, inputs, training, **kwargs):
+        attn_output = self.att(inputs, inputs)
+        attn_output = self.dropout1(attn_output, training=training)
+        out1 = self.layernorm1(inputs + attn_output)
+        ffn_output = self.ffn(out1)
+        ffn_output = self.dropout2(ffn_output, training=training)
+        return self.layernorm2(out1 + ffn_output)
+
+
+
def get_text_classificaiton_model() -> tf.keras.Model:
+    vocab_size = 20000 
+    maxlen = 200
+
+    random_input = np.random.random((10, 200)) # Random input to build the model
+
+    embed_dim = 32  # Embedding size for each token
+    num_heads = 2  # Number of attention heads
+    ff_dim = 32  # Hidden layer size in feed forward network inside transformer
+
+    inputs = tf.keras.layers.Input(shape=(maxlen,))
+    embedding_layer = TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim)
+    x = embedding_layer(inputs)
+    transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim)
+    x = transformer_block(x)
+    x = tf.keras.layers.GlobalAveragePooling1D()(x)
+    x = tf.keras.layers.Dropout(0.1)(x)
+    x = tf.keras.layers.Dense(20, activation="relu")(x)
+    x = tf.keras.layers.Dropout(0.1)(x)
+    outputs = tf.keras.layers.Dense(2, activation="softmax")(x)
+
+    model = tf.keras.Model(inputs=inputs, outputs=outputs)
+    _ = model(random_input)
+    return model
+
+
+

Run the model preparer API on the model by passing in the model.

+
def model_preparer_two_subclassed_layers() -> tf.keras.Model:
+    model = get_text_classificaiton_model()
+    model = prepare_model(model)
+    return model
+
+
+

The model preparer API will return a Keras Functional API model. +We can now use this model as input to the AIMET Keras Quantization API.

+

Example 2: Model with Subclassed Layer as First Layer

+
def get_subclass_model_with_functional_layers() -> tf.keras.Model:
+    inputs = tf.keras.Input(shape=(64,))
+    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(inputs)
+    binary_classifier = tf.keras.Model(inputs=inputs, outputs=outputs)
+
+    class MyFunctionalModel(tf.keras.Model):
+        def __init__(self):
+            super().__init__(name='my_functional_model')
+            self.dense = tf.keras.layers.Dense(64, activation="relu")
+            self.classifier = binary_classifier
+
+        def call(self, inputs, **kwargs):
+            features = self.dense(inputs)
+            return self.classifier(features)
+
+    model = MyFunctionalModel()
+    return model
+
+
+

Run the model preparer API on the model by passing in the model and an Input Layer. Note that this is an example of when +the model preparer API will require an Input Layer as input.

+
def model_preparer_subclassed_model_with_functional_layers():
+    model = get_subclass_model_with_functional_layers()
+    model = prepare_model(model, input_layer=tf.keras.Input(shape=(64,))) # Note: input layer is passed in
+    return model
+
+
+

The model preparer API will return a Keras Functional API model. +We can now use this model as input to the AIMET Keras Quantization API.

+
+
+

Limitations

+

The AIMET Keras ModelPreparer API has the following limitations:

+
    +
  • If the model starts with a subclassed layer, the AIMET Keras ModelPreparer API will need a Keras Input Layer as input. This is because the Keras Functional API requires an Input Layer as the first layer in the model. The AIMET Keras ModelPreparer API will raise an exception if the model starts with a subclassed layer and an Input Layer is not provided as input.

  • +
  • The AIMET Keras ModelPreparer API is able to convert subclass layers that have arithmetic expressions in their call function. However, this API, and Keras, will convert these operations to TFOpLambda layers, which are not currently supported by the AIMET Keras Quantization API. If possible, it is recommended to have the subclass layer’s call function resemble the Keras Functional API layers.

    +
    +

    For example, if a subclass layer has two convolution layers in its call function, the call function should look like +the following:

    +
    def call(self, x, **kwargs):
    +    x = self.conv_1(x)
    +    x = self.conv_2(x)
    +    return x
    +
    +
    +
    +
  • +
  • Subclass layers are pieces of Python code, whereas typical Functional or Sequential models are static graphs of layers. Because subclass layers lack this static-graph structure, they can cause some issues during model preparation. The model preparer uses the call function of a subclass layer to trace out the layers defined inside of it. To do this, a Keras symbolic tensor is passed through. If this symbolic tensor does not “touch” all of the layers defined inside, the prepared model can end up with missing layers/weights. In the example below, the first call function runs into this problem: the Keras symbolic tensor, represented by the variable x, never passes through the positions variable, so the weight for self.pos_emb is missing in the final prepared model. In contrast, the second call function passes the input through all of the layers, allowing the model preparer to pick up all the internal weights and layers:

    +
    def call(self, x, **kwargs):
    +    positions = tf.range(start=0, limit=self.static_patch_count, delta=1)
    +    positions = self.pos_emb(positions)
    +    x = self.token_emb(x)
    +    x = x + positions
    +    return x
    +
    +def call(self, x, **kwargs):
    +    maxlen = tf.shape( x )[-1]
    +    positions = tf.range(start=0, limit=maxlen, delta=1)
    +    positions = self.pos_emb(positions)
    +    x = self.token_emb( x )
    +    x = x + positions
    +    return x
    +
    +
    +
  • +
  • The AIMET Keras ModelPreparer API may be able to convert models that inherit from the Keras Model class, or that have layers that inherit from the Keras Model class. However, this is not guaranteed. The API checks these layers’ weights and verifies that the number of weights matches what the layer’s __init__ defines. However, if layers defined in the __init__ are not used in the call function, the API will not be able to verify the weights. Furthermore, if a layer defined in the __init__ is reused, the API will not be able to see both uses. For example, in the ResBlock class below, self.relu is used twice and the API will miss the second use. If the user defines two separate ReLUs, then the API will be able to convert the layer:

    +
    # Bad Example
    +class ResBlock(tf.keras.Model):
    +    def __init__(self, filters, kernel_size):
    +        super(ResBlock, self).__init__()
    +        self.conv1 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
    +        self.bn1 = tf.keras.layers.BatchNormalization()
    +        self.conv2 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
    +        self.bn2 = tf.keras.layers.BatchNormalization()
    +        self.relu = tf.keras.layers.ReLU()
    +
    +    def call(self, input_tensor, training=False):
    +        x = self.conv1(input_tensor)
    +        x = self.bn1(x, training=training)
    +        x = self.relu(x) # First use of self.relu
    +        x = self.conv2(x)
    +        x = self.bn2(x, training=training)
    +        x = self.relu(x) # Second use of self.relu
    +        x = tf.keras.layers.add([x, input_tensor])
    +        return x
    +
    +# Good Example
    +class ResBlock(tf.keras.Model):
    +    def __init__(self, filters, kernel_size):
    +        super(ResBlock, self).__init__()
    +        self.conv1 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
    +        self.bn1 = tf.keras.layers.BatchNormalization()
    +        self.conv2 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
    +        self.bn2 = tf.keras.layers.BatchNormalization()
    +        self.relu1 = tf.keras.layers.ReLU()
    +        self.relu2 = tf.keras.layers.ReLU()
    +
    +    def call(self, input_tensor, training=False):
    +        x = self.conv1(input_tensor)
    +        x = self.bn1(x, training=training)
    +        x = self.relu1(x) # First use of self.relu1
    +        x = self.conv2(x)
    +        x = self.bn2(x, training=training)
    +        x = self.relu2(x) # first use of self.relu2
    +        x = tf.keras.layers.add([x, input_tensor])
    +        return x
    +
    +
    +
  • +
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_primitive_apis_cle.html b/releases/1.32.2/api_docs/keras_primitive_apis_cle.html new file mode 100644 index 00000000..fc677d7c --- /dev/null +++ b/releases/1.32.2/api_docs/keras_primitive_apis_cle.html @@ -0,0 +1,1446 @@ + + + + + + AIMET Keras Cross Layer Equalization Primitive API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET Keras Cross Layer Equalization Primitive API

+
+

Introduction

+

If a user wants to modify the order of Cross Layer Equalization, not use some features, or manually tweak the list of +layers that need to be equalized, the following APIs can be used.

+

The higher-level APIs can be used to apply one or more features one after the other. They automatically find the layers to be folded or scaled.

+

The lower-level APIs can be used to manually tweak the list of layers to be folded. The user has to pass the list of layers in the order in which they appear in the model.

+

Note: Before using High Bias Fold, Cross Layer Scaling (CLS) needs to be applied, and the scaling factors obtained from CLS need to be plugged in to High Bias Fold. Also, if there are batchnorm layers, they need to be folded and the resulting information saved so it can be plugged into the High Bias Fold API.

+
+
+

Higher Level APIs for Cross Layer Equalization

+

API for Batch Norm Folding

+
+
+aimet_tensorflow.keras.batch_norm_fold.fold_all_batch_norms(model)[source]
+

Fold all batch_norm layers in a model into corresponding conv/linear layers

+
+
Parameters
+

model (Model) – model to find all batch norms for

+
+
Return type
+

Tuple[List[Tuple[Union[Conv2D, Dense, Conv2DTranspose, DepthwiseConv2D], BatchNormalization]], Model]

+
+
Returns
+

A tuple of a list of conv/linear layers with associated bn op / activation info, and a new model with the Batch Normalization layers folded

+
+ +

API for Cross Layer Scaling

+
+
+aimet_tensorflow.keras.cross_layer_equalization.CrossLayerScaling.scale_model(model)
+

Uses cross-layer scaling to scale all applicable layers in the given model.

:param model: tf.keras.Model
:type model: Model
:rtype: List[ClsSetInfo]
:return: CLS information for each CLS set

+
+ +

API for High Bias Folding

+
+
+aimet_tensorflow.keras.cross_layer_equalization.HighBiasFold.bias_fold(cls_set_info_list, bn_layers)
+

Folds bias values greater than 3 * sigma to the next layer’s bias.

:param cls_set_info_list: List of info elements for each cls set
:type cls_set_info_list: List[ClsSetInfo]
:param bn_layers: Key: Conv/Linear layer, Value: Corresponding folded BN layer
:type bn_layers: Dict[Conv2D, BatchNormalization]

+
+ +
+
+

Code Examples for Higher Level APIs

+

Required imports

+
import tensorflow as tf
+from aimet_tensorflow.keras.batch_norm_fold import fold_all_batch_norms
+from aimet_tensorflow.keras.cross_layer_equalization import HighBiasFold, CrossLayerScaling
+from aimet_tensorflow.keras.utils.model_transform_utils import replace_relu6_with_relu
+
+
+

Perform Cross Layer Equalization in auto mode step by step

+
def cross_layer_equalization_auto_stepwise():
+    """
+    Individual api calls to perform cross layer equalization one step at a time. Pairs to fold and
+    scale are found automatically.
+    1. Replace Relu6 with Relu
+    2. Fold batch norms
+    3. Perform cross layer scaling
+    4. Perform high bias fold
+    """
+
+    # Load the model to equalize
+    model = tf.keras.applications.resnet50.ResNet50(weights=None, classes=10)
+
+    # 1. Replace Relu6 layer with Relu
+    model_for_cle, _ = replace_relu6_with_relu(model)
+
+    # 2. Fold all batch norms
+    folded_pairs, model = fold_all_batch_norms(model_for_cle)
+
+    bn_dict = {}
+    for conv_or_linear, bn in folded_pairs:
+        bn_dict[conv_or_linear] = bn
+
+    # 3. Perform cross-layer scaling on applicable layer groups
+    cls_set_info_list = CrossLayerScaling.scale_model(model_for_cle)
+
+    # 4. Perform high bias fold
+    HighBiasFold.bias_fold(cls_set_info_list, bn_dict)
+
+    return model_for_cle
+
+
+
+
+

Lower Level APIs for Cross Layer Equalization

+

API for Batch Norm Folding on subsets of convolution-batchnorm layer pairs

+
+
+aimet_tensorflow.keras.batch_norm_fold.fold_given_batch_norms(model, layer_pairs)[source]
+

Fold a given set of batch_norm layers into conv_linear layers

+
+
Parameters
+
    +
  • model (Model) – Either a Keras Model or a QuantizationSimModel’s model

  • +
  • layer_pairs (List[Union[Tuple[Union[Conv2D, Dense, Conv2DTranspose, DepthwiseConv2D], BatchNormalization, bool], Tuple[BatchNormalization, Union[Conv2D, Dense, Conv2DTranspose, DepthwiseConv2D], bool]]]) – Tuple of conv, bn layers and is_batch_norm_second flag

  • +
+
+
Return type
+

Optional[Model]

+
+
Returns
+

new model with batch norm layers folded if model is a functional model, else None

+
+
+
+ +
+

+
+

API for Cross Layer Scaling on subset of conv layer groups

+
+
+aimet_tensorflow.keras.cross_layer_equalization.CrossLayerScaling.scale_cls_sets(cls_sets)
+

Scale each cls set.

:param cls_sets: Cls sets to scale
:type cls_sets: List[Union[Tuple[Conv2D, Conv2D], Tuple[Conv2D, DepthwiseConv2D, Conv2D]]]
:rtype: List[Union[ndarray, Tuple[ndarray, ndarray]]]
:return: List of scale factors corresponding to each scaled cls set

+
+ +
+

+
+

API for High bias folding

+
+
+aimet_tensorflow.keras.cross_layer_equalization.HighBiasFold.bias_fold(cls_set_info_list, bn_layers)
+

Folds bias values greater than 3 * sigma to the next layer’s bias.

:param cls_set_info_list: List of info elements for each cls set
:type cls_set_info_list: List[ClsSetInfo]
:param bn_layers: Key: Conv/Linear layer, Value: Corresponding folded BN layer
:type bn_layers: Dict[Conv2D, BatchNormalization]

+
+ +
+

+
+
+
+

Custom Datatype used

+
+
+class aimet_tensorflow.keras.cross_layer_equalization.ClsSetInfo(cls_pair_1, cls_pair_2=None)[source]
+

This class holds information about the layers in a CLS set, along with the corresponding scaling factors and other information, such as whether there is a ReLU activation function between the CLS set layers

+

The constructor takes 2 pairs if a depth-wise separable layer is being folded.

:param cls_pair_1: Pair between two conv layers, or between a conv and a depth-wise conv
:type cls_pair_1: ClsSetLayerPairInfo
:param cls_pair_2: Pair between a depth-wise conv and a point-wise conv
:type cls_pair_2: Optional[ClsSetLayerPairInfo]

+
+
+class ClsSetLayerPairInfo(layer1, layer2, scale_factor, relu_activation_between_layers)[source]
+

Models a pair of layers that were scaled using CLS. And related information.

+
+
Parameters
+
    +
  • layer1 (Conv2D) – Layer whose bias is folded

  • +
  • layer2 (Conv2D) – Layer to which the previous layer’s bias is folded

  • +
  • scale_factor (ndarray) – Scale Factor found from Cross Layer Scaling to scale BN parameters

  • +
  • relu_activation_between_layers (bool) – If the activation between layer1 and layer2 is Relu

  • +
+
+
+
+ +
+ +
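For illustration, a ClsSetInfo for a single scaled pair of convolution layers could be assembled as sketched below. This is a minimal sketch: the layer variables and scale factor are placeholders rather than values taken from a real model, and it assumes ClsSetLayerPairInfo is accessed as a nested class of ClsSetInfo, as shown in the signatures above.

import numpy as np
from aimet_tensorflow.keras.cross_layer_equalization import ClsSetInfo

def build_single_pair_cls_set_info(conv_a, conv_b, scale_factor: np.ndarray) -> ClsSetInfo:
    # Describe the scaled pair: conv_a feeds conv_b, with a ReLU activation in between
    pair = ClsSetInfo.ClsSetLayerPairInfo(layer1=conv_a,
                                          layer2=conv_b,
                                          scale_factor=scale_factor,
                                          relu_activation_between_layers=True)
    # No depth-wise separable layer is involved here, so cls_pair_2 is left as None
    return ClsSetInfo(cls_pair_1=pair)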
+

+
+
+
+

Code Example for Lower level APIs

+

Required imports

+
import tensorflow as tf
+from aimet_tensorflow.keras.batch_norm_fold import fold_given_batch_norms
+from aimet_tensorflow.keras.cross_layer_equalization import HighBiasFold, CrossLayerScaling
+from aimet_tensorflow.keras.utils.model_transform_utils import replace_relu6_with_relu
+
+
+

Perform Cross Layer Equalization in manual mode

+
def cross_layer_equalization_manual():
+    """
+    Individual api calls to perform cross layer equalization one step at a time. Pairs to fold and
+    scale are provided by the user.
+    1. Replace Relu6 with Relu
+    2. Fold batch norms
+    3. Perform cross layer scaling
+    4. Perform high bias fold
+    """
+
+    # Load the model to equalize
+    model = tf.keras.applications.resnet50.ResNet50(weights=None, classes=10)
+
+    # replace any ReLU6 layers with ReLU
+    model_for_cle, _ = replace_relu6_with_relu(model)
+
+    # pick potential pairs of conv and bn ops for fold
+    layer_pairs = get_example_layer_pairs_resnet50_for_folding(model_for_cle)
+
+    # fold given layers
+    fold_given_batch_norms(model_for_cle, layer_pairs=layer_pairs)
+
+    # Cross Layer Scaling
+    # Create a list of consecutive conv layers to be equalized
+    consecutive_layer_list = get_consecutive_layer_list_from_resnet50_for_scaling(model_for_cle)
+
+    # invoke api to perform scaling on given list of cls pairs
+    scaling_factor_list = CrossLayerScaling.scale_cls_sets(consecutive_layer_list)
+
+    # get info from bn fold and cross layer scaling in format required for high bias fold
+    folded_pairs, cls_set_info_list = format_info_for_high_bias_fold(layer_pairs,
+                                                                     consecutive_layer_list,
+                                                                     scaling_factor_list)
+
+    HighBiasFold.bias_fold(cls_set_info_list, folded_pairs)
+    return model_for_cle
+
+
+
+
+

Example helper methods to perform CLE in manual mode

+

Helper to pick layers for batchnorm fold

+
def get_example_layer_pairs_resnet50_for_folding(model: tf.keras.Model):
+    """
+    Function to pick example conv-batchnorm layer pairs for folding.
+    :param model: Keras model containing conv batchnorm pairs to fold
+    :return: pairs of conv and batchnorm layers for batch norm folding in Resnet50 model.
+    """
+
+    conv_op_1 = model.layers[2]
+    bn_op_1 = model.layers[3]
+
+    conv_op_2 = model.layers[7]
+    bn_op_2 = model.layers[8]
+
+    conv_op_3 = model.layers[10]
+    bn_op_3 = model.layers[11]
+
+    # Make a layer pair list with the potential conv op and bn_op pairs, along with a flag
+    # to indicate whether the given bn op can be folded upstream or downstream.
+    # Three pairs of conv and bn ops are shown below.
+    layer_pairs = [(conv_op_1, bn_op_1, True),
+                   (conv_op_2, bn_op_2, True),
+                   (conv_op_3, bn_op_3, True)]
+
+    return layer_pairs
+
+
+

Helper to pick layers for cross layer scaling

+
def get_consecutive_layer_list_from_resnet50_for_scaling(model: tf.keras.Model):
+    """
+    helper function to pick example consecutive layer list for scaling.
+    :param model: tf.keras.Model
+    :return: sample layers for scaling as consecutive_layer_list from Resnet50 model
+    """
+    conv_op_1 = model.layers[2]
+    conv_op_2 = model.layers[7]
+    conv_op_3 = model.layers[10]
+
+    consecutive_layer_list = [(conv_op_1, conv_op_2), (conv_op_2, conv_op_3)]
+    return consecutive_layer_list
+
+
+

Helper to format data from batchnorm fold and cross layer scaling for usage by high bias fold

+
def format_info_for_high_bias_fold(layer_pairs, consecutive_layer_list, scaling_factor_list):
+    """
+    Helper function that formats data from cross layer scaling and bn fold for usage by high bias fold
+    :param layer_pairs: info obtained after batchnorm fold
+    :param consecutive_layer_list: info obtained after cross layer scaling
+    :param scaling_factor_list: scaling params corresponding to consecutive_layer_list
+    :return: data formatted for high bias fold
+    """
+
+    # convert info after batch norm fold and cross layer scaling for usage by high bias fold api
+    folded_pairs = []
+    for (conv_op, bn_op_with_meta, _fold_upstream_flag) in layer_pairs:
+        folded_pairs.append((conv_op, bn_op_with_meta.op))
+
+    # List that holds a boolean indicating whether there were relu activations between the layers of each cross layer scaling set
+    is_relu_activation_in_cls_sets = []
+    # Note the user is expected to fill in this list manually
+
+    # Convert to a list of cls-set-info elements
+    cls_set_info_list = CrossLayerScaling.create_cls_set_info_list(consecutive_layer_list,
+                                                                   scaling_factor_list,
+                                                                   is_relu_activation_in_cls_sets)
+
+    return folded_pairs, cls_set_info_list
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_quant_analyzer.html b/releases/1.32.2/api_docs/keras_quant_analyzer.html new file mode 100644 index 00000000..049179ef --- /dev/null +++ b/releases/1.32.2/api_docs/keras_quant_analyzer.html @@ -0,0 +1,1305 @@ + + + + + + AIMET Keras Quant Analyzer API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET Keras Quant Analyzer API

+

AIMET Keras Quant Analyzer analyzes the Keras model and points out layers in the model that are sensitive to quantization. It checks model sensitivity to weight and activation quantization, and performs per-layer sensitivity and MSE analysis. It also exports per-layer encoding min and max ranges and a statistics histogram for every layer.

+
+

Top-level API

+
+
+class aimet_tensorflow.keras.quant_analyzer.QuantAnalyzer(model, forward_pass_callback, eval_callback)[source]
+

QuantAnalyzer tool provides

+
    +
  1. model sensitivity to weight and activation quantization

  2. +
  3. per layer sensitivity analysis

  4. +
  5. per layer encoding (min - max range)

  6. +
  7. per PDF analysis and

  8. +
  9. per layer MSE analysis

  10. +
+
+
Parameters
+
    +
  • model (Model) – FP32 model to analyze for quantization.

  • +
  • forward_pass_callback (CallbackFunc) – A callback function for model calibration that simply runs forward passes on the model to compute encodings (delta/offset). This callback function should use representative data and should be a subset of the entire train/validation dataset (~1000 images/samples).

  • +
  • eval_callback (CallbackFunc) – A callback function for model evaluation that determines model performance. This callback function is expected to return a scalar value representing the model performance evaluated against the entire test/evaluation dataset.

  • +
+
+
+
+
+analyze(quant_scheme=QuantScheme.post_training_tf_enhanced, rounding_mode='nearest', default_param_bw=8, default_output_bw=8, config_file=None, results_dir='./tmp/')[source]
+
+
Analyze model for quantization and point out sensitive parts/hotspots of the model by performing
    +
  1. model sensitivity to quantization,

  2. +
  3. perform per layer sensitivity analysis by enabling and disabling quant wrappers,

  4. +
  5. export per layer encodings min - max ranges,

  6. +
  7. export per layer statistics histogram (PDF) when quant scheme is TF-Enhanced,

  8. +
  9. per layer MSE analysis

  10. +
+
+
+
+
Parameters
+
    +
  • quant_scheme (QuantScheme) – Quantization scheme. Supported values are +QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced.

  • +
  • rounding_mode (str) – The rounding scheme to be used. One of: ‘nearest’ or ‘stochastic’, defaults to ‘nearest’

  • +
  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters.

  • +
  • default_output_bw (int) – Default bitwidth (4-31) to use for quantizing layer inputs and outputs.

  • +
  • config_file (Optional[str]) – Path to configuration file for model quantizers.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
+
+ +
+ +
+
+

Code Examples

+

Required imports

+
from typing import Any
+
+import numpy as np
+import tensorflow as tf
+
+from aimet_common.defs import QuantScheme
+from aimet_common.utils import CallbackFunc
+from aimet_tensorflow.keras.model_preparer import prepare_model
+from aimet_tensorflow.keras.quant_analyzer import QuantAnalyzer
+
+
+

Prepare toy dataset to run example code

+
NUM_SAMPLES = 256
+NUM_CLASSES = 1000
+INPUT_SHAPES = (224, 224, 3)
+
+images = np.random.rand(NUM_SAMPLES, *INPUT_SHAPES)
+labels = np.eye(NUM_CLASSES)[np.random.choice(NUM_CLASSES, NUM_SAMPLES)]
+
+image_dataset = tf.data.Dataset.from_tensor_slices(images)
+label_dataset = tf.data.Dataset.from_tensor_slices(labels)
+
+eval_dataset = tf.data.Dataset.zip((image_dataset, label_dataset)).batch(32)
+unlabeled_dataset = eval_dataset.map(lambda image, label: image)
+
+
+

Prepare forward pass callback

+
# NOTE: In the actual use cases, the users should implement this part to serve
+#       their own goals if necessary.
+def forward_pass_callback(model: tf.keras.Model, _: Any = None) -> None:
+    """
+    NOTE: This is intended to be the user-defined model calibration function.
+    AIMET requires the above signature. So if the user's calibration function does not
+    match this signature, please create a simple wrapper around this callback function.
+
+    A callback function for model calibration that simply runs forward passes on the model to
+    compute encoding (delta/offset). This callback function should use representative data and should
+    be subset of entire train/validation dataset (~1000 images/samples).
+
+    :param model: tf.keras.Model object.
+    :param _: Argument(s) of this callback function. Up to the user to determine the type of this parameter.
+    E.g. could be simply an integer representing the number of data samples to use. Or could be a tuple of
+    parameters or an object representing something more complex.
+    """
+    # User action required
+    # User should create data loader/iterable using representative dataset and simply run
+    # forward passes on the model.
+    _ = model.predict(unlabeled_dataset)
+
+
+

Prepare eval callback

+
# NOTE: In the actual use cases, the users should implement this part to serve
+#       their own goals if necessary.
+def eval_callback(model: tf.keras.Model, _: Any = None) -> float:
+    """
+    NOTE: This is intended to be the user-defined model evaluation function.
+    AIMET requires the above signature. So if the user's calibration function does not
+    match this signature, please create a simple wrapper around this callback function.
+
+    A callback function for model evaluation that determines model performance. This callback function is
+    expected to return scalar value representing the model performance evaluated against entire
+    test/evaluation dataset.
+
+    :param model: tf.keras.Model object.
+    :param _: Argument(s) of this callback function. Up to the user to determine the type of this parameter.
+    E.g. could be simply an integer representing the number of data samples to use. Or could be a tuple of
+    parameters or an object representing something more complex.
+    :return: Scalar value representing the model performance.
+    """
+    # User action required
+    # User should create data loader/iterable using entire test/evaluation dataset, perform forward passes on
+    # the model and return single scalar value representing the model performance.
+
+    model.compile(optimizer=tf.keras.optimizers.Adam(),
+                  loss=tf.keras.losses.CategoricalCrossentropy(),
+                  metrics=tf.keras.metrics.CategoricalAccuracy())
+
+    _, acc = model.evaluate(eval_dataset)
+    return acc
+
+
+

Prepare model

+
    model = tf.keras.applications.ResNet50()
+    prepared_model = prepare_model(model)
+
+
+

Create QuantAnalyzer object

+
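+    # NOTE: QuantAnalyzer expects CallbackFunc objects (see the parameter types above). The wrapping
+    # below assumes CallbackFunc(func, func_callback_args=None) from aimet_common.utils, imported earlier.
+    forward_pass_callback_fn = CallbackFunc(forward_pass_callback, func_callback_args=None)
+    eval_callback_fn = CallbackFunc(eval_callback, func_callback_args=None)
+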
    quant_analyzer = QuantAnalyzer(model=prepared_model,
+                                   forward_pass_callback=forward_pass_callback_fn,
+                                   eval_callback=eval_callback_fn)
+
+    # Approximately 256 images/samples are recommended for MSE loss analysis. So, if the dataset
+    # has batch_size of 64, then 4 number of batches leads to 256 images/samples.
+    quant_analyzer.enable_per_layer_mse_loss(unlabeled_dataset=unlabeled_dataset, num_batches=4)
+
+
+

Run QuantAnalyzer

+
    quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_param_bw=8,
+                           default_output_bw=8,
+                           config_file=None,
+                           results_dir="./quant_analyzer_results/")
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_quantization.html b/releases/1.32.2/api_docs/keras_quantization.html new file mode 100644 index 00000000..d7b16874 --- /dev/null +++ b/releases/1.32.2/api_docs/keras_quantization.html @@ -0,0 +1,1155 @@ + + + + + + AIMET Keras Quantization APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET Keras Quantization APIs

+

In order to make full use of AIMET Quantization features, there are several guidelines users are encouraged to follow when defining Keras models. AIMET provides APIs which can automate some of the model definition changes and check whether AIMET Quantization features can be applied to a Keras model.

+
+
+
+
Users should first invoke Model Preparer API before using any of the AIMET Quantization features.
+
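As a minimal sketch of that ordering, assuming a Keras model and the Model Preparer and QuantSim APIs documented in this release:

import tensorflow as tf

from aimet_tensorflow.keras.model_preparer import prepare_model
from aimet_tensorflow.keras.quantsim import QuantizationSimModel

model = tf.keras.applications.ResNet50()

# 1. Prepare the model first (converts subclassed parts to the Functional API where needed)
prepared_model = prepare_model(model)

# 2. Then apply a quantization feature, e.g. create a quantization simulation model
sim = QuantizationSimModel(prepared_model)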
+
AIMET Quantization for Keras provides the following functionality
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/keras_quantsim.html b/releases/1.32.2/api_docs/keras_quantsim.html new file mode 100644 index 00000000..c2a25ed2 --- /dev/null +++ b/releases/1.32.2/api_docs/keras_quantsim.html @@ -0,0 +1,1253 @@ + + + + + + AIMET Keras Quantization SIM API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET Keras Quantization SIM API

+ +
+

Top-level API

+
+
+class aimet_tensorflow.keras.quantsim.QuantizationSimModel(model, quant_scheme='tf_enhanced', rounding_mode='nearest', default_output_bw=8, default_param_bw=8, in_place=False, config_file=None, default_data_type=QuantizationDataType.int)[source]
+

Implements mechanism to add quantization simulations ops to a model. This allows for off-target simulation of +inference accuracy. Also allows the model to be fine-tuned to counter the effects of quantization.

+
+
Parameters
+
    +
  • model – Model to quantize

  • +
  • quant_scheme (Union[QuantScheme, str]) – Quantization Scheme, currently supported schemes are post_training_tf and +post_training_tf_enhanced, defaults to post_training_tf_enhanced

  • +
  • rounding_mode (str) – The rounding scheme to be used. One of: ‘nearest’ or ‘stochastic’, defaults to ‘nearest’.

  • +
  • default_output_bw (int) – bitwidth to use for activation tensors, defaults to 8

  • +
  • default_param_bw (int) – bitwidth to use for parameter tensors, defaults to 8

  • +
  • in_place (bool) – If True, then the given ‘model’ is modified in-place to add quant-sim nodes. +Only suggested use of this option is when the user wants to avoid creating a copy of the model

  • +
  • config_file (Optional[str]) – Path to a config file to use to specify rules for placing quant ops in the model

  • +
  • default_data_type (QuantizationDataType) – Default data type to use for quantizing all layer parameters. +Possible options are QuantizationDataType.int and QuantizationDataType.float. +Note that the mode default_data_type=QuantizationDataType.float is only supported with +default_output_bw=16 and default_param_bw=16

  • +
+
+
+
+ +
+

+
+

The following API can be used to Compute Encodings for Model

+
+
+QuantizationSimModel.compute_encodings(forward_pass_callback, forward_pass_callback_args)[source]
+

Computes encodings for all quantization sim nodes in the model.

:param forward_pass_callback: A callback function that is expected to run forward passes on a model.

+
+

This callback function should use representative data for the forward pass, so the calculated +encodings work for all data samples.

+
+
+
Parameters
+

forward_pass_callback_args – These argument(s) are passed to the forward_pass_callback as-is. Up to +the user to determine the type of this parameter. E.g. could be simply an integer representing the number +of data samples to use. Or could be a tuple of parameters or an object representing something more +complex.

+
+
+
+ +
+

+
+

The following API can be used to Export the Model to target

+
+
+QuantizationSimModel.export(path, filename_prefix, custom_objects=None, convert_to_pb=True)[source]
+

This method exports the quant-sim model so it is ready to be run on-target. Specifically, the following are saved:

1. The sim model is exported to a regular Keras model without any simulation ops.
2. The quantization encodings are exported to a separate JSON-formatted file that can then be imported by the on-target runtime (if desired).

+
+
+
Parameters
+
    +
  • path – Path at which to store the exported model and encodings

  • +
  • filename_prefix – Prefix to use for the filenames of the exported model and encodings files

  • +
  • custom_objects – If there are custom objects to load, Keras needs a dict of them to map them

  • +
+
+
+
+ +
+

+
+

Encoding format is described in the Quantization Encoding Specification

+
+

+
+
+
+

Code Examples

+

Required imports

+
import numpy as np
+import tensorflow as tf
+
+from aimet_tensorflow.keras import quantsim
+
+
+

Quantize with Fine tuning

+
def quantize_model():
+    model = tf.keras.applications.resnet50.ResNet50(weights=None, classes=10)
+    sim = quantsim.QuantizationSimModel(model)
+
+    # Generate some dummy data
+    dummy_x = np.random.randn(10, 224, 224, 3)
+    dummy_y = np.random.randint(0, 10, size=(10,))
+    dummy_y = tf.keras.utils.to_categorical(dummy_y, num_classes=10)
+
+    # Compute encodings
+    sim.model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.001),loss='categorical_crossentropy',metrics=['accuracy'])
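+    # NOTE: 'evaluate' below is assumed to be a user-defined forward-pass callback (not shown in this
+    # snippet) with the signature expected by compute_encodings.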
+    sim.compute_encodings(evaluate, forward_pass_callback_args=(dummy_x, dummy_y))
+
+    # Do some fine-tuning
+    sim.model.fit(x=dummy_x, y=dummy_y, epochs=10)
+
+
+
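After computing encodings (and any fine-tuning), the sim model can be exported with the export API documented above. This is a minimal sketch that reuses the sim object from the example; the output directory and filename prefix are illustrative placeholders.

import os

def export_quantized_model(sim: quantsim.QuantizationSimModel):
    # Placeholder output location; choose a path that suits your workflow
    export_dir = './exported_model'
    os.makedirs(export_dir, exist_ok=True)

    # Saves a regular Keras model (without simulation ops) plus a JSON encodings file
    sim.export(path=export_dir, filename_prefix='resnet50_quantized')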
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/onnx.html b/releases/1.32.2/api_docs/onnx.html new file mode 100644 index 00000000..3fa51775 --- /dev/null +++ b/releases/1.32.2/api_docs/onnx.html @@ -0,0 +1,1062 @@ + + + + + + AIMET ONNX APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET ONNX APIs

+ +
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/onnx_adaround.html b/releases/1.32.2/api_docs/onnx_adaround.html new file mode 100644 index 00000000..67e043a1 --- /dev/null +++ b/releases/1.32.2/api_docs/onnx_adaround.html @@ -0,0 +1,1166 @@ + + + + + + AIMET ONNX AdaRound API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET ONNX AdaRound API

+ +
+

Top-level API

+
+
+aimet_onnx.adaround.adaround_weight.Adaround.apply_adaround(model, params, path, filename_prefix, default_param_bw=4, param_bw_override_list=None, ignore_quant_ops_list=None, default_quant_scheme=QuantScheme.post_training_tf_enhanced, default_config_file=None, use_cuda=True, device=0, user_onnx_libs=None)
+

Returns model with optimized weight rounding of every module (Conv and Linear) and also saves the +corresponding quantization encodings to a separate JSON-formatted file that can then be imported by +QuantSim for inference or QAT

+
+
Parameters
+
    +
  • model (ModelProto) – Model to Adaround

  • +
  • params (AdaroundParameters) – Parameters for Adaround

  • +
  • path (str) – path where to store parameter encodings

  • +
  • filename_prefix (str) – Prefix to use for filename of the encodings file

  • +
  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters

  • +
  • param_bw_override_list (Optional[List[Tuple[str, int]]]) – List of Tuples. Each Tuple is a param name and the corresponding parameter bitwidth +to be used for that param.

  • +
  • ignore_quant_ops_list (Optional[List[str]]) – Ops listed here are skipped during the quantization needed for AdaRounding. Do not specify Conv and Linear modules in this list. Doing so will affect accuracy.

  • +
  • default_quant_scheme (QuantScheme) – Quantization scheme. Supported options are using Quant Scheme Enum +QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced

  • +
  • default_config_file (Optional[str]) – Default configuration file for model quantizers

  • +
  • use_cuda (bool) – If we should use cuda

  • +
  • device (int) – CUDA device ID

  • +
  • user_onnx_libs (Optional[List[str]]) – List of paths to all compiled ONNX custom ops libraries

  • +
+
+
Return type
+

ModelProto

+
+
Returns
+

Model with Adarounded weights and saves corresponding parameter encodings JSON file at provided path

+
+
+
+ +
+
+

Adaround Parameters

+
+
+class aimet_onnx.adaround.adaround_weight.AdaroundParameters(data_loader, num_batches, default_num_iterations=None, default_reg_param=0.01, default_beta_range=(20, 2), default_warm_start=0.2, forward_fn=None, forward_pass_callback_args=None)[source]
+

Configuration parameters for Adaround

+
+
Parameters
+
    +
  • data_loader – Data loader

  • +
  • num_batches (int) – Number of batches to be used for Adaround. +A commonly recommended value for this parameter is the smaller value among (1) len(data_loader) and (2) ceil(2000/batch_size)

  • +
  • default_num_iterations (Optional[int]) – Number of iterations to adaround each layer. +The default value is 10K for models with 8- or higher bit weights, and 15K for models with lower than 8 bit weights.

  • +
  • default_reg_param (float) – Regularization parameter, trading off between rounding loss vs reconstruction loss. +Default 0.01

  • +
  • default_beta_range (Tuple) – Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta). +Default (20, 2)

  • +
  • default_warm_start (float) – warm up period, during which rounding loss has zero effect. Default 20% (0.2)

  • +
  • forward_fn (Optional[Callable]) – Function to compute encodings for sim

  • +
  • forward_pass_callback_args – These argument(s) are passed to the forward_pass_callback as-is. Up to +the user to determine the type of this parameter. E.g. could be simply an integer representing the number +of data samples to use. Or could be a tuple of parameters or an object representing something more complex. +If set to None, forward_pass_callback will be invoked with no parameters.

  • +
+
+
+
+ +
+
+

Code Example - Adaptive Rounding (AdaRound)

+

This example shows how to use AIMET to perform Adaptive Rounding (AdaRound).

+

Required imports

+
from aimet_onnx.adaround.adaround_weight import AdaroundParameters, Adaround
+from aimet_onnx.quantsim import QuantizationSimModel
+
+
+

User should write this function to pass calibration data

+
def pass_calibration_data(model):
+    """
+    The User of the QuantizationSimModel API is expected to write this function based on their data set.
+    This is not a working function and is provided only as a guideline.
+
+    :param model:
+    """
+
+
+

Apply Adaround

+
def apply_adaround_example(model, dataloader):
+        """
+        Example code to run adaround
+
+        """
+        params = AdaroundParameters(data_loader=dataloader, num_batches=1, default_num_iterations=5,
+                                    forward_fn=pass_calibration_data,
+                                    forward_pass_callback_args=None)
+        ada_rounded_model = Adaround.apply_adaround(model, params, './', 'dummy')
+
+        sim = QuantizationSimModel(ada_rounded_model,
+                                   default_param_bw=8,
+                                   default_activation_bw=8, use_cuda=True)
+        sim.set_and_freeze_param_encodings('./dummy.encodings')
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/onnx_auto_quant.html b/releases/1.32.2/api_docs/onnx_auto_quant.html new file mode 100644 index 00000000..0caea247 --- /dev/null +++ b/releases/1.32.2/api_docs/onnx_auto_quant.html @@ -0,0 +1,1245 @@ + + + + + + AIMET ONNX AutoQuant API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET ONNX AutoQuant API

+ +
+

Top-level API

+
+
+class aimet_onnx.auto_quant_v2.AutoQuant(model, dummy_input, data_loader, eval_callback, param_bw=8, output_bw=8, quant_scheme=QuantScheme.post_training_tf_enhanced, rounding_mode='nearest', use_cuda=True, device=0, config_file=None, results_dir='/tmp', cache_id=None, strict_validation=True)[source]
+

Integrate and apply post-training quantization techniques.

+

AutoQuant includes 1) batchnorm folding, 2) cross-layer equalization, +and 3) Adaround. +These techniques will be applied in a best-effort manner until the model +meets the evaluation goal given as allowed_accuracy_drop.

+
+
Parameters
+
    +
  • model (Union[ModelProto, ONNXModel]) – Model to be quantized.

  • +
  • dummy_input (Dict[str, ndarray]) – Dummy input dict for the model.

  • +
  • data_loader (Iterable[Union[ndarray, List[ndarray], Tuple[ndarray]]]) – A collection that iterates over an unlabeled dataset, used for computing encodings

  • +
  • eval_callback (Callable[[InferenceSession, int], float]) – Function that calculates the evaluation score given the model session

  • +
  • param_bw (int) – Parameter bitwidth

  • +
  • output_bw (int) – Output bitwidth

  • +
  • quant_scheme (QuantScheme) – Quantization scheme

  • +
  • rounding_mode (str) – Rounding mode

  • +
  • use_cuda (bool) – True if using CUDA to run quantization op. False otherwise.

  • +
  • config_file (Optional[str]) – Path to configuration file for model quantizers

  • +
  • results_dir (str) – Directory to save the results of PTQ techniques

  • +
  • cache_id (Optional[str]) – ID associated with cache results

  • +
  • strict_validation (bool) – Flag set to True by default. When False, AutoQuant will proceed with execution and handle errors internally if possible. This may produce suboptimal or unintuitive results.

  • +
+
+
+
+
+run_inference()[source]
+

Creates a quantization model and performs inference

+
+
Return type
+

Tuple[QuantizationSimModel, float]

+
+
Returns
+

QuantizationSimModel, model accuracy as float

+
+
+
+ +
+
+optimize(allowed_accuracy_drop=0.0)[source]
+

Integrate and apply post-training quantization techniques.

+
+
Parameters
+

allowed_accuracy_drop (float) – Maximum allowed accuracy drop

+
+
Return type
+

Tuple[ONNXModel, float, str]

+
+
Returns
+

Tuple of (best model, eval score, encoding path)

+
+
+
+ +
+
+set_adaround_params(adaround_params)[source]
+

Set Adaround parameters. +If this method is not called explicitly by the user, AutoQuant will use +data_loader (passed to __init__) for Adaround.

+
+
Parameters
+

adaround_params (AdaroundParameters) – Adaround parameters.

+
+
Return type
+

None

+
+
+
+ +
+
+get_quant_scheme_candidates()[source]
+

Return the candidates for quant scheme search. +During optimize(), the candidate with the highest accuracy +will be selected among them.

+
+
Return type
+

Tuple[_QuantSchemePair, ...]

+
+
Returns
+

Candidates for quant scheme search

+
+
+
+ +
+
+set_quant_scheme_candidates(candidates)[source]
+

Set candidates for quant scheme search. +During optimize(), the candidate with the highest accuracy +will be selected among them.

+
+
Parameters
+

candidates (Tuple[_QuantSchemePair, ...]) – Candidates for quant scheme search

+
+
+
+ +
+ +
+
+

Code Examples

+
import math
+from typing import Optional
+
+import onnxruntime as ort
+import numpy as np
+
+from aimet_onnx.auto_quant_v2 import AutoQuant
+from aimet_onnx.adaround.adaround_weight import AdaroundParameters
+
+# Step 1. Define constants
+EVAL_DATASET_SIZE = 5000
+CALIBRATION_DATASET_SIZE = 500
+BATCH_SIZE = 32
+
+# Step 2. Prepare model and dataloader
+onnx_model = Model()
+
+input_shape = (1, 3, 224, 224)
+dummy_data = np.random.randn(*input_shape).astype(np.float32)
+dummy_input = {'input': dummy_data}
+
+# NOTE: Use your dataloader. It should iterate over unlabelled dataset.
+#       Its data will be directly fed as input to the onnx model's inference session.
+unlabelled_data_loader = DataLoader(data=data, batch_size=BATCH_SIZE,
+                                    iterations=math.ceil(CALIBRATION_DATASET_SIZE / BATCH_SIZE))
+
+# Step 3. Prepare eval callback
+# NOTE: In the actual use cases, the users should implement this part to serve
+#       their own goals, maintaining the function signature.
+def eval_callback(session: ort.InferenceSession, num_of_samples: Optional[int] = None) -> float:
+    data_loader = EvalDataLoader()
+    if num_of_samples:
+        iterations = math.ceil(num_of_samples / data_loader.batch_size)
+    else:
+        iterations = len(data_loader)
+    batch_cntr = 1
+    acc_top1 = 0
+    acc_top5 = 0
+    for input_data, target in data_loader:
+        pred = session.run(None, {'input': input_data})
+
+        batch_avg_top_1_5 = accuracy(pred, target, topk=(1, 5))
+
+        acc_top1 += batch_avg_top_1_5[0].item()
+        acc_top5 += batch_avg_top_1_5[1].item()
+
+        batch_cntr += 1
+        if batch_cntr > iterations:
+            break
+    acc_top1 /= iterations
+    acc_top5 /= iterations
+    return acc_top1
+
+# Step 4. Create AutoQuant object
+auto_quant = AutoQuant(onnx_model,
+                       dummy_input,
+                       unlabelled_data_loader,
+                       eval_callback)
+
+# Step 5. (Optional) Set AdaRound params
+ADAROUND_DATASET_SIZE = 2000
+adaround_data_loader = DataLoader(data=data, batch_size=BATCH_SIZE,
+                                  iterations=math.ceil(ADAROUND_DATASET_SIZE / BATCH_SIZE))
+adaround_params = AdaroundParameters(adaround_data_loader, num_batches=len(adaround_data_loader))
+auto_quant.set_adaround_params(adaround_params)
+
+# Step 6. Run AutoQuant
+sim, initial_accuracy = auto_quant.run_inference()
+model, optimized_accuracy, encoding_path = auto_quant.optimize(allowed_accuracy_drop=0.01)
+
+print(f"- Quantized Accuracy (before optimization): {initial_accuracy:.4f}")
+print(f"- Quantized Accuracy (after optimization):  {optimized_accuracy:.4f}")
+
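(Optional) Before calling optimize(), the quant scheme search space can be inspected or narrowed with the candidate APIs documented above. A minimal sketch, reusing the auto_quant object from Step 4; the slice kept below is purely illustrative:

# (Optional) Narrow the quant scheme search space before optimize().
# get_quant_scheme_candidates() returns a tuple of candidates; keeping a
# subset reduces the search performed during optimize().
candidates = auto_quant.get_quant_scheme_candidates()
auto_quant.set_quant_scheme_candidates(candidates[:1])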
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/onnx_cross_layer_equalization.html b/releases/1.32.2/api_docs/onnx_cross_layer_equalization.html new file mode 100644 index 00000000..6dc80745 --- /dev/null +++ b/releases/1.32.2/api_docs/onnx_cross_layer_equalization.html @@ -0,0 +1,1106 @@ + + + + + + AIMET ONNX Cross Layer Equalization APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET ONNX Cross Layer Equalization APIs

+ +
+

Introduction

+
+
AIMET functionality for Cross Layer Equalization has three features:

  • BatchNorm Folding

  • Cross Layer Scaling

  • High Bias Fold
+
+
+
+
+

Cross Layer Equalization API

+

The following API performs BatchNorm fold followed by Cross Layer Scaling followed by High Bias Fold.

+

Note: When the below API is used, High Bias Fold will not be performed if the model does not have BatchNorm layers

+

API for Cross Layer Equalization

+
+
+aimet_onnx.cross_layer_equalization.equalize_model(model)[source]
+

High-level API to perform Cross-Layer Equalization (CLE) on the given model. The model is equalized in place.

+
+
Parameters
+

model (ModelProto) – Model to equalize

+
+
+
+ +
+

+
+
+
+

Code Example

+

Required imports

+
from aimet_onnx.cross_layer_equalization import equalize_model
+
+
+

Cross Layer Equalization in auto mode

+
def cross_layer_equalization():
+    onnx_model = Model()
+    equalize_model(onnx_model)
+
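As a further illustration, a minimal end-to-end sketch that loads a serialized ONNX model, equalizes it in place, and saves the result; the file paths are placeholders:

import onnx

from aimet_onnx.cross_layer_equalization import equalize_model

# Load an ONNX model from disk (placeholder path).
model = onnx.load('path/to/model.onnx')

# Equalize the model in place: BatchNorm fold, Cross Layer Scaling, High Bias Fold.
equalize_model(model)

# Persist the equalized model for subsequent quantization steps.
onnx.save(model, 'path/to/model_equalized.onnx')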
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/onnx_layer_output_generation.html b/releases/1.32.2/api_docs/onnx_layer_output_generation.html new file mode 100644 index 00000000..780d6fef --- /dev/null +++ b/releases/1.32.2/api_docs/onnx_layer_output_generation.html @@ -0,0 +1,1145 @@ + + + + + + AIMET ONNX Layer Output Generation API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET ONNX Layer Output Generation API

+

This API captures and saves intermediate layer-outputs of a model. The model can be original (FP32) or quantsim. The layer-outputs are named according to the ONNX model exported by the quantsim export API. This allows layer-output comparison amongst the FP32 model, the quantization-simulated model, and the actual quantized model on the target device, to debug accuracy mismatch issues.

+
+

Top-level API

+
+
+class aimet_onnx.layer_output_utils.LayerOutputUtil(model, dir_path, device=0)[source]
+

Implementation to capture and save outputs of intermediate layers of a model (fp32/quantsim)

+

Constructor - It initializes the utility classes that capture and save layer-outputs

+
+
Parameters
+
    +
  • model (ModelProto) – ONNX model

  • +
  • dir_path (str) – Directory wherein layer-outputs will be saved

  • +
  • device (int) – CUDA device-id to be used

  • +
+
+
+
+ +
+

+
+

The following API can be used to Generate Layer Outputs

+
+
+LayerOutputUtil.generate_layer_outputs(input_batch)[source]
+

This method captures the output of every layer of a model and saves the inputs and corresponding layer-outputs to disk.

+
+
Parameters
+

input_batch (Union[ndarray, List[ndarray], Tuple[ndarray]]) – Batch of inputs for which we want to obtain layer-outputs.

+
+
Returns
+

None

+
+
+
+ +
+

+
+
+
+

Code Example

+

Imports

+
import onnx
+from onnxruntime import InferenceSession
+
+from aimet_onnx.quantsim import QuantizationSimModel, load_encodings_to_sim
+from aimet_onnx.layer_output_utils import LayerOutputUtil
+
+
+

Obtain Original or QuantSim model from AIMET Export Artifacts

+
# Load the model.
+model = onnx.load('path/to/aimet_export_artifacts/model.onnx')
+
+# Use same arguments as that were used for the exported QuantSim model. For sake of simplicity only mandatory arguments are passed below.
+quantsim = QuantizationSimModel(model=model, dummy_input=dummy_input_dict, use_cuda=False)
+
+# Load exported encodings into quantsim object
+load_encodings_to_sim(quantsim, 'path/to/aimet_export_artifacts/model.encodings')
+
+# Check whether constructed original and quantsim model are running properly before using Layer Output Generation API.
+_ = InferenceSession(model.SerializeToString()).run(None, dummy_input_dict)
+_ = quantsim.session.run(None, dummy_input_dict)
+
+
+

Obtain inputs for which we want to generate intermediate layer-outputs

+
# Use same input pre-processing pipeline as was used for computing the quantization encodings.
+input_batches = get_pre_processed_inputs()
+
+
+

Generate layer-outputs

+
# Use original model to get fp32 layer-outputs
+fp32_layer_output_util = LayerOutputUtil(model=model, dir_path='./fp32_layer_outputs')
+
+# Use quantsim model to get quantsim layer-outputs
+quantsim_layer_output_util = LayerOutputUtil(model=quantsim.model.model, dir_path='./quantsim_layer_outputs')
+
+for input_batch in input_batches:
+    fp32_layer_output_util.generate_layer_outputs(input_batch)
+    quantsim_layer_output_util.generate_layer_outputs(input_batch)
+
+# Note: Generate layer-outputs for the fp32 model before creating the quantsim model, because the fp32 model itself is modified to create the quantsim version.
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/onnx_quant_analyzer.html b/releases/1.32.2/api_docs/onnx_quant_analyzer.html new file mode 100644 index 00000000..3b71b288 --- /dev/null +++ b/releases/1.32.2/api_docs/onnx_quant_analyzer.html @@ -0,0 +1,1462 @@ + + + + + + AIMET ONNX Quant Analyzer API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET ONNX Quant Analyzer API

+

AIMET ONNX Quant Analyzer analyzes the ONNX model and points out the layers that are sensitive to quantization. It checks model sensitivity to weight and activation quantization, and performs per-layer sensitivity and MSE analysis. It also exports per-layer encoding min/max ranges and a statistics histogram for every layer.

+
+

Top-level API

+
+
+class aimet_onnx.quant_analyzer.QuantAnalyzer(model, dummy_input, forward_pass_callback, eval_callback)[source]
+

QuantAnalyzer provides following utilities:

+
+
  1. model sensitivity to weight and activation quantization

  2. per layer sensitivity analysis

  3. per layer encoding (min - max range)

  4. per layer quantizer histogram analysis and

  5. per layer MSE analysis
+
+
+
Parameters
+
    +
  • model (Union[ModelProto, ONNXModel]) – FP32 model to analyze for quantization.

  • +
  • dummy_input (Dict[str, ndarray]) – Dummy input to model.

  • +
  • forward_pass_callback (CallbackFunc) – A callback function for model calibration that simply runs +forward passes on the model to compute encoding (delta/offset). This +callback function should use representative data and should be subset of +entire train/validation dataset (~1000 images/samples).

  • +
  • eval_callback (CallbackFunc) – A callback function for model evaluation that determines model +performance. This callback function is expected to return scalar value +representing the model performance evaluated against entire test/evaluation dataset.

  • +
+
+
+
+ +
+

+
+
+
+QuantAnalyzer.enable_per_layer_mse_loss(unlabeled_dataset_iterable, num_batches)[source]
+

Enables per layer MSE loss analysis.

+
+
Parameters
+
    +
  • unlabeled_dataset_iterable (Iterable) – A collection (i.e. iterable with __len__) +that iterates over an unlabeled dataset. The values yielded by this iterable are expected +to be able to be passed directly to the model.

  • +
  • num_batches (int) – Number of batches. Approximately 256 samples/images are recommended, so if the batch size of the data loader is 64, then 4 batches correspond to 256 samples/images.

  • +
+
+
+
+ +
+

+
+
+
+QuantAnalyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced, default_param_bw=8, default_activation_bw=8, config_file=None, results_dir='./tmp/')[source]
+
+
Analyzes the model for quantization and points out sensitive parts/hotspots of the model by performing:
  1. model sensitivity to quantization,

  2. per layer sensitivity analysis by enabling and disabling quantizers,

  3. export of per layer encoding min - max ranges,

  4. export of per layer quantizer stats histograms,

  5. per layer MSE analysis
+
+
+
+
Parameters
+
    +
  • quant_scheme (QuantScheme) – Quantization scheme. Supported values are +QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced.

  • +
  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters.

  • +
  • default_activation_bw (int) – Default bitwidth (4-31) to use for quantizing layer inputs and outputs.

  • +
  • config_file (Optional[str]) – Path to configuration file for model quantizers.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
+
+ +
+
+

Run specific utility

+

Instead of running all the utilities that Quant Analyzer offers, we can run only those of interest. To do so, first obtain the quantsim object from ‘create_quantsim_and_encodings()’, then call the desired Quant Analyzer utility and pass the quantsim object to it, as sketched below.

+
+
+QuantAnalyzer.create_quantsim_and_encodings(quant_scheme, default_param_bw, default_activation_bw, config_file)[source]
+

Creates quantsim object and computes encodings.

+
+
Parameters
+
    +
  • quant_scheme (QuantScheme) – Quantization scheme.

  • +
  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters.

  • +
  • default_activation_bw (int) – Default bitwidth (4-31) to use for quantizing layer inputs and outputs.

  • +
  • config_file (str) – Path to configuration file for model quantizers.

  • +
+
+
Return type
+

QuantizationSimModel

+
+
Returns
+

Quantsim object.

+
+
+
+ +
+

+
+
+
+QuantAnalyzer.check_model_sensitivity_to_quantization(sim)[source]
+

Performs model sensitivity analysis to weight and activation quantization individually.

+
+
Parameters
+

sim (QuantizationSimModel) – Quantsim model.

+
+
Return type
+

Tuple[float, float, float]

+
+
Returns
+

FP32 eval score, weight-quantized eval score, act-quantized eval score.

+
+
+
+ +
+

+
+
+
+QuantAnalyzer.perform_per_layer_analysis_by_enabling_quantizers(sim, results_dir)[source]
+

Performs layer-wise quantization sensitivity analysis by enabling its quantizers

+
  1. All parameter and activation quantizers are disabled.

  2. For every layer, based on occurrence:

     a. Each layer’s parameter and activation quantizers are enabled as per the JSON config file and set to the specified bit-width.

     b. Measure and record the eval score on a subset of the dataset.

     c. Disable the quantizers enabled in step a.

  3. Returns a dictionary containing layer names and corresponding eval scores.
+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – Quantsim model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
Return type
+

Dict

+
+
Returns
+

layer wise eval score dictionary. dict[layer_name] = eval_score

+
+
+
+ +
+

+
+
+
+QuantAnalyzer.perform_per_layer_analysis_by_disabling_quantizers(sim, results_dir)[source]
+

Performs layer-wise quantization sensitivity analysis by disabling its quantizers

+
  1. All parameter and activation quantizers are enabled as per the JSON config file and set to the specified bit-width.

  2. For every layer, based on occurrence:

     a. Each layer’s parameter and activation quantizers are disabled.

     b. Measure and record the eval score on a subset of the dataset.

     c. Enable the quantizers disabled in step a.

  3. Returns a dictionary containing layer names and corresponding eval scores.
+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – Quantsim model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
Return type
+

Dict

+
+
Returns
+

layer wise eval score dictionary. dict[layer_name] = eval_score

+
+
+
+ +
+

+
+
+
+QuantAnalyzer.export_per_layer_encoding_min_max_range(sim, results_dir)[source]
+

Exports encoding min and max range for all weights and activations. After invoking this API, results_dir contains HTML files in the following layout:

results_dir
    activations.html
    weights.html

If per-channel quantization (PCQ) is enabled:

results_dir
    activations.html
    {layer_name}_{param_name}.html

+
+
+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – Quantsim model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
Return type
+

Tuple[Dict, Dict]

+
+
Returns
+

layer wise min-max range for weights and activations.

+
+
+
+ +
+

+
+
+
+QuantAnalyzer.export_per_layer_stats_histogram(sim, results_dir)[source]
+

NOTE: Invoke this API only when the quantization scheme is TF-Enhanced.

Exports a histogram that represents the PDF of the statistics collected by each quantizer. After invoking this API, results_dir should contain HTML files in the following layout for every quantizer in the model:

results_dir
    activations_pdf
        name_{input/output}_{index}.html
    weights_pdf
        name
            param_name_{channel_index}.html

+
+
+
+
+
+
+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – Quantsim model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
+
+ +
+

+
+
+
+QuantAnalyzer.export_per_layer_mse_loss(sim, results_dir)[source]
+

Exports MSE loss between fp32 and quantized output activations for each layer.

+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – Quantsim model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
Return type
+

Dict

+
+
Returns
+

layer wise MSE loss. dict[layer_name] = MSE loss.

+
+
+
+ +
+

+
+
+
+

Code Examples

+

Required imports

+
from typing import Any
+import numpy as np
+from onnxruntime import InferenceSession
+
+from aimet_common.defs import QuantScheme
+from aimet_common.utils import CallbackFunc
+
+from aimet_onnx.quant_analyzer import QuantAnalyzer
+
+
+

Prepare forward pass callback

+
# NOTE: In the actual use cases, the users should implement this part to serve
+#       their own goals if necessary.
+def forward_pass_callback(session: InferenceSession, _: Any = None) -> None:
+    """
+    NOTE: This is intended to be the user-defined model calibration function.
+    AIMET requires the above signature. So if the user's calibration function does not
+    match this signature, please create a simple wrapper around this callback function.
+
+    A callback function for model calibration that simply runs forward passes on the model to
+    compute encoding (delta/offset). This callback function should use representative data and should
+    be subset of entire train/validation dataset (~1000 images/samples).
+
+    :param session: OnnxRuntime Inference Session.
+    :param _: Argument(s) of this callback function. Up to the user to determine the type of this parameter.
+    E.g. could be simply an integer representing the number of data samples to use. Or could be a tuple of
+    parameters or an object representing something more complex.
+    """
+    # User action required
+    # User should create data loader/iterable using representative dataset and simply run
+    # forward passes on the model.
+
+
+

Prepare eval callback

+
# NOTE: In the actual use cases, the users should implement this part to serve
+#       their own goals if necessary.
+def eval_callback(session: InferenceSession, _: Any = None) -> float:
+    """
+    NOTE: This is intended to be the user-defined model evaluation function.
+    AIMET requires the above signature. So if the user's calibration function does not
+    match this signature, please create a simple wrapper around this callback function.
+
+    A callback function for model evaluation that determines model performance. This callback function is
+    expected to return scalar value representing the model performance evaluated against entire
+    test/evaluation dataset.
+
+    :param session: OnnxRuntime Inference Session.
+    :param _: Argument(s) of this callback function. Up to the user to determine the type of this parameter.
+    E.g. could be simply an integer representing the number of data samples to use. Or could be a tuple of
+    parameters or an object representing something more complex.
+    :return: Scalar value representing the model performance.
+    """
+    # User action required
+    # User should create data loader/iterable using entire test/evaluation dataset, perform forward passes on
+    # the model and return single scalar value representing the model performance.
+    return .8
+
+
+

Prepare model, callback functions and dataloader

+
    onnx_model = Model()
+
+    input_shape = (1, 3, 224, 224)
+    dummy_data = np.random.randn(*input_shape).astype(np.float32)
+    dummy_input = {'input': dummy_data}
+
+    # User action required
+    # User should pass actual argument(s) of the callback functions.
+    forward_pass_callback_fn = CallbackFunc(forward_pass_callback, func_callback_args=None)
+    eval_callback_fn = CallbackFunc(eval_callback, func_callback_args=None)
+
+    # User action required
+    # User should use unlabeled dataloader, so if the dataloader yields labels as well user should discard them.
+    unlabeled_data_loader = _get_unlabled_data_loader()
+
+
+

Create QuantAnalyzer object

+
    quant_analyzer = QuantAnalyzer(model=onnx_model,
+                                   dummy_input=dummy_input,
+                                   forward_pass_callback=forward_pass_callback_fn,
+                                   eval_callback=eval_callback_fn)
+    # Approximately 256 images/samples are recommended for MSE loss analysis. So, if the dataloader
+    # has batch_size of 64, then 4 number of batches leads to 256 images/samples.
+    quant_analyzer.enable_per_layer_mse_loss(unlabeled_dataset_iterable=unlabeled_data_loader, num_batches=4)
+
+
+

Run QuantAnalyzer

+
    quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_param_bw=8,
+                           default_activation_bw=8,
+                           config_file=None,
+                           results_dir="./quant_analyzer_results/")
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/onnx_quantization.html b/releases/1.32.2/api_docs/onnx_quantization.html new file mode 100644 index 00000000..cdf64625 --- /dev/null +++ b/releases/1.32.2/api_docs/onnx_quantization.html @@ -0,0 +1,1069 @@ + + + + + + AIMET ONNX Quantization APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET ONNX Quantization APIs

+
+
+
+
AIMET Quantization for ONNX Models provides the following functionality.
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/onnx_quantsim.html b/releases/1.32.2/api_docs/onnx_quantsim.html new file mode 100644 index 00000000..ad8563b0 --- /dev/null +++ b/releases/1.32.2/api_docs/onnx_quantsim.html @@ -0,0 +1,1195 @@ + + + + + + AIMET ONNX Quantization SIM API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET ONNX Quantization SIM API

+
+

Top-level API

+
+
+class aimet_onnx.quantsim.QuantizationSimModel(model, dummy_input=None, quant_scheme=QuantScheme.post_training_tf_enhanced, rounding_mode='nearest', default_param_bw=8, default_activation_bw=8, use_symmetric_encodings=False, use_cuda=True, device=0, config_file=None, default_data_type=QuantizationDataType.int, simplify_model=True, user_onnx_libs=None, path=None)[source]
+

Creates a QuantizationSimModel by adding quantization simulation ops to a given model

+

Constructor

+
+
Parameters
+
    +
  • model (ModelProto) – ONNX model or path to model

  • +
  • dummy_input (Optional[Dict[str, ndarray]]) – Dummy input to the model. If None, will attempt to auto-generate a dummy input

  • +
  • quant_scheme (QuantScheme) – Quantization scheme (e.g. QuantScheme.post_training_tf)

  • +
  • rounding_mode (str) – Rounding mode (e.g. nearest)

  • +
  • default_param_bw (int) – Quantization bitwidth for parameter

  • +
  • default_activation_bw (int) – Quantization bitwidth for activation

  • +
  • use_symmetric_encodings (bool) – True if symmetric encoding is used. False otherwise.

  • +
  • use_cuda (bool) – True if using CUDA to run quantization op. False otherwise.

  • +
  • config_file (Optional[str]) – Path to Configuration file for model quantizers

  • +
  • default_data_type (QuantizationDataType) – Default data type to use for quantizing all layer inputs, outputs and parameters. +Possible options are QuantizationDataType.int and QuantizationDataType.float. +Note that the mode default_data_type=QuantizationDataType.float is only supported with +default_output_bw=16 and default_param_bw=16

  • +
  • simplify_model (bool) – Default True, uses onnx simplifier to simplify model

  • +
  • user_onnx_libs (Optional[List[str]]) – List of paths to all compiled ONNX custom ops libraries

  • +
  • path (Optional[str]) – Directory to save the artifacts.

  • +
+
+
+
+ +
+

+
+

Note about quantization schemes: Since ONNX Runtime is used for optimized inference only, AIMET for ONNX supports the post-training quantization schemes, i.e. TF or TF-enhanced, to compute the encodings.

+

The following API can be used to Compute Encodings for Model

+
+
+QuantizationSimModel.compute_encodings(forward_pass_callback, forward_pass_callback_args)[source]
+

Compute and return the encodings of each tensor quantizer

+
+
Parameters
+
    +
  • forward_pass_callback – A callback function that simply runs forward passes on the model. This callback +function should use representative data for the forward pass, so the calculated encodings work for all +data samples. This callback internally chooses the number of data samples it wants to use for calculating +encodings.

  • +
  • forward_pass_callback_args – These argument(s) are passed to the forward_pass_callback as-is. Up to +the user to determine the type of this parameter. E.g. could be simply an integer representing the number +of data samples to use. Or could be a tuple of parameters or an object representing something more complex. +If set to None, forward_pass_callback will be invoked with no parameters.

  • +
+
+
+
+ +
+

+
+

The following API can be used to Export the Model to target

+
+
+QuantizationSimModel.export(path, filename_prefix)[source]
+

Compute encodings and export to files

+
+
Parameters
+
    +
  • path (str) – dir to save encoding files

  • +
  • filename_prefix (str) – filename to save encoding files

  • +
+
+
+
+ +
+

+
+
+
+

Code Examples

+

Required imports

+
from aimet_onnx.quantsim import QuantizationSimModel
+from aimet_common.defs import QuantScheme
+import numpy as np
+
+
+

User should write this function to pass calibration data

+
def pass_calibration_data(session):
+    """
+    The User of the QuantizationSimModel API is expected to write this function based on their data set.
+    This is not a working function and is provided only as a guideline.
+
+    :param session: Model's session
+    :return:
+    """
+
+    # User action required
+    # The following line of code is an example of how to use the ImageNet data's validation data loader.
+    # Replace the following line with your own dataset's validation data loader.
+    data_loader = None  # Your Dataset's data loader
+
+    # User action required
+    # For computing the activation encodings, around 1000 unlabelled data samples are required.
+    # Edit the following 2 lines based on your dataloader's batch size.
+    # batch_size * max_batch_counter should be 1024
+    batch_size = 64
+    max_batch_counter = 16
+
+    input_tensor = None  # input tensor in session
+
+    current_batch_counter = 0
+    for input_data, _ in data_loader:
+        session.run(None, input_data)
+
+        current_batch_counter += 1
+        if current_batch_counter == max_batch_counter:
+            break
+
+
+

Quantize the model and finetune (QAT)

+
def quantize_model():
+    onnx_model = Model()
+    input_shape = (1, 3, 224, 224)
+    dummy_data = np.random.randn(*input_shape).astype(np.float32)
+    dummy_input = {'input' : dummy_data}
+    sim = QuantizationSimModel(onnx_model, dummy_input, quant_scheme=QuantScheme.post_training_tf,
+                               rounding_mode='nearest', default_param_bw=8, default_activation_bw=8,
+                               use_symmetric_encodings=False, use_cuda=False)
+
+    sim.compute_encodings(pass_calibration_data, None)
+
+    # Evaluate the quant sim
+    forward_pass_function(sim.session)
+
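Once encodings have been computed, the simulated model and its encodings can be written out with the export API documented above; a minimal sketch, with placeholder directory and prefix:

# Continuing from the example above: export the model and the computed
# encodings for use on target (directory and prefix are placeholders).
sim.export(path='./output/', filename_prefix='quantized_model')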
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/quantization_encoding_specification.html b/releases/1.32.2/api_docs/quantization_encoding_specification.html new file mode 100644 index 00000000..2e85e7c8 --- /dev/null +++ b/releases/1.32.2/api_docs/quantization_encoding_specification.html @@ -0,0 +1,1478 @@ + + + + + + Encoding Format Specification — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

AIMET Quantization Simulation determines scale/offset values for activation and parameter tensors in the model. +This scale/offset information is also referred to as ‘quantization encoding’. +When a model is exported from the AIMET Quantization Simulation feature, +an encoding file is also exported that contains quantization encodings for the model. +This encoding file can then be used by an inference runtime when running the model on-target.

+

The following specification describes the format of this encoding file produced by AIMET.

+
+

Encoding Format Specification

+

The encodings from quantization simulation can be exported for use at run time. The encoding file uses JSON syntax. The file format is usable with both PyTorch and TensorFlow models; it maps tensor names to their encodings.

+
+
+

1. Versioning

+

The encoding format follows an XX.YY.ZZ versioning scheme, as described below:

+
    +
  • XX = Major Revision

  • +
  • YY = Minor Revision

  • +
  • ZZ = Patching version

  • +
+

A change in the major revision indicates a substantial change to the format. An update to the minor version indicates that an additional information element has been added to the encoding format, which might require an update to fully consume the encodings. The patch version is updated to indicate minor updates to quantization simulation, e.g. bug fixes.

+
+
+

2. Version 0.4.0 (up to)

+

The encoding format as defined below is backward compatible and applies to all exported encodings up to version 0.4. In cases where versioning information is missing, the encoding is assumed to follow the version 0.4 format.

+
+

2.1. Encoding Specification

+
“version”: “string”
+“activation_encodings”:
+{
+    <tensor_name>: [Encoding, …]
+}
+“param_encodings”
+{
+    <tensor_name>: [Encoding, …]
+}
+
+
+

Where,

+
    +
  • "version” is set to “0.4.0”

  • +
  • <tensor_name> is a string representing the tensor in onnx or tensorflow graph.

  • +
+

Encoding is as defined below,

+
Encoding:{
+   bitwidth: integer
+   is_symmetric: string
+   max: float
+   min: float
+   offset: integer
+   scale: float
+}
+
+
+

Where,

+
    +
  • bitwidth: constraints >=4 and <=32

  • +
  • is_symmetric: allowed choices “True”, “False”

  • +
+

If a tensor is assigned more than one Encoding, then the encodings are on a per-channel basis, as illustrated below.

+
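For illustration only, a hypothetical per-channel parameter entry with two channels could look like the following snippet (all values are made up):

"param_encodings": {
    "conv1.weight":
    [
        {
            "bitwidth": 8,
            "is_symmetric": "True",
            "max": 0.12,
            "min": -0.12,
            "offset": -128,
            "scale": 0.00094
        },
        {
            "bitwidth": 8,
            "is_symmetric": "True",
            "max": 0.30,
            "min": -0.30,
            "offset": -128,
            "scale": 0.00236
        }
    ]
}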
+
+

2.2. Encoding File Example for PyTorch

+

On PyTorch, the tensor names shall be derived from the ONNX named model representation as depicted below on a sample model.

+Mapping between ONNX tensor names and encodings +

Given below is the sample format with keys and values for encodings JSON output file on PyTorch.

+
{
+    “version”: “0.4.0”
+    "activation_encodings": {
+        "20":
+        [
+            {
+                "bitwidth": 8,
+                "is_symmetric": “False”,
+                "max": 2.6086959838867188,
+                "min": -2.109158515930176,
+                "offset": -114.0,
+                "scale": 0.018501389771699905
+            }
+        ],
+        "21":
+        [
+            {
+                "bitwidth": 8,
+                "is_symmetric": “False”,
+                "max": 2.558866932988167,
+                "min": -0.12636379897594452,
+                "offset": -12.0,
+                "scale": 0.010530316270887852
+            }
+        ],
+    },
+    "param_encodings": {
+        "conv2.weight":
+        [
+            {
+                "bitwidth": 8,
+                "is_symmetric": “False”,
+                "max": 0.06318144500255585,
+                "min": -0.06268782913684845,
+                "offset": -127.0,
+                "scale": 0.0004936049808748066
+            }
+        ],
+        "fc1.weight":
+         [
+            {
+                "bitwidth": 8,
+                "is_symmetric": “False”,
+                "max": 0.05589814856648445,
+                "min": -0.05546144023537636,
+                "offset": -127.0,
+                "scale": 0.0004367042565718293
+            }
+        ],
+    }
+}
+
+
+
+
+

2.3. Encoding File Example for TensorFlow

+

Given below is a sample format with the keys and values for encodings on TensorFlow graph (in JSON format).

+
{
+    “version”: “0.4.0”
+    "activation_encodings": {
+        "conv2d/Relu:0":
+        [
+            {
+                "bitwidth": 8,
+                "is_symmetric": “False”,
+                "max": 2.184721499681473,
+                "min": -0.10788747668266296,
+                "offset": 11,
+                "scale": 0.0089906234367221
+            }
+        ],
+        "conv2d_1/Relu:0":
+        [
+            {
+                "bitwidth": 8,
+                "is_symmetric": “False”,
+                "max": 2.1020304188132286,
+                "min": -0.10380396991968155,
+                "offset": 11,
+                "scale": 0.008650330936207491
+            }
+        ],
+    },
+    "param_encodings": {
+        "conv2d/Conv2D/ReadVariableOp:0":
+        [
+            {
+                "bitwidth": 8,
+                "is_symmetric": “False”,
+                "max": 0.1462666392326355,
+                "min": -0.1451239287853241,
+                "offset": 126,
+                "scale": 0.0011427081098743512
+            }
+        ],
+        "conv2d_1/Conv2D/ReadVariableOp:0":
+        [
+            {
+                "bitwidth": 8,
+                "is_symmetric": “False”,
+                "max": 0.08333279937505722,
+                "min": -0.08268175274133682,
+                "offset": 126,
+                "scale": 0.0006510374592799766
+            }
+        ]
+    }
+}
+
+
+
+
+
+

3. Version 0.5.0

+
+

3.1. Encoding Specification

+
“version”: “string”
+“activation_encodings”:
+{
+    <tensor_name>: [Encoding, …]
+}
+“param_encodings”
+{
+    <tensor_name>: [Encoding, …]
+}
+
+
+

Where,

+
    +
  • "version” is set to “0.5.0”

  • +
  • <tensor_name> is a string representing the tensor in onnx or tensorflow graph.

  • +
+

‘Encoding’ structure shall include an encoding field “dtype” to specify the datatype used for simulating the tensor.

+
Encoding:{
+    dtype: string
+    bitwidth: integer
+    is_symmetric: string
+    max: float
+    min: float
+    offset: integer
+    scale: float
+}
+
+
+

Where,

+
    +
  • dtype: allowed choices “int”, “float”

  • +
  • bitwidth: constraints >=4 and <=32

  • +
  • is_symmetric: allowed choices “True”, “False”

  • +
+

when dtype is set to ‘float’, Encoding shall have the following fields

+
Encoding:{
+    dtype: string
+    bitwidth: integer
+}
+
+
+

bitwidth defines the precision of the tensor being generated by the producer and consumed by the +downstream consumer(s).

+
+
+

3.2. Encoding File Example for PyTorch

+

Given below is a snippet of the sample format with change highlighted.

+
{
+    “version”: “0.5.0”
+    "activation_encodings": {
+        "20":
+        [
+            {
+                “dtype”: “int”
+                "bitwidth": 8,
+                 ...
+            }
+        ],
+         ...
+    },
+    "param_encodings": {
+        "conv2.weight":
+        [
+            {
+                “dtype”: “int”
+                "bitwidth": 8,
+                ...
+            }
+        ],
+         ...
+   }
+}
+
+
+
+
+

3.3. Encoding File Example for TensorFlow

+

Given below is a snippet of the sample format with change highlighted.

+
{
+    “version”: “0.5.0”
+    "activation_encodings": {
+        "conv2d/Relu:0":
+        [
+            {
+                “dtype”: “float”
+                "bitwidth": 16,
+        ],
+         ...
+    },
+    "param_encodings": {
+        "conv2d/Conv2D/ReadVariableOp:0":
+        [
+            {
+                “dtype”: “float”
+                "bitwidth": 16,
+            }
+        ],
+         ...
+}
+
+
+
+
+
+

4. Version 0.6.1

+

Adds a new field called quantizer_args to all exported encodings files.

+
+

4.1. Encoding Specification

+
“version”: “string”
+“activation_encodings”:
+{
+    <tensor_name>: [Encoding, …]
+}
+“param_encodings”
+{
+    <tensor_name>: [Encoding, …]
+}
+"quantizer_args":
+{
+     "activation_bitwidth": integer,
+     "dtype": string,
+     "is_symmetric": string,
+     "param_bitwidth": integer,
+     "per_channel_quantization": string,
+     "quant_scheme": "string"
+}
+
+
+

Where,

+
    +
  • "version” is set to “0.6.1”

  • +
  • <tensor_name> is a string representing the tensor in onnx or tensorflow graph.

  • +
+

‘Encoding’ structure shall include an encoding field “dtype” to specify the datatype used for simulating the tensor.

+
Encoding:{
+    dtype: string
+    bitwidth: integer
+    is_symmetric: string
+    max: float
+    min: float
+    offset: integer
+    scale: float
+}
+
+
+

Where,

+
    +
  • dtype: allowed choices “int”, “float”

  • +
  • bitwidth: constraints >=4 and <=32

  • +
  • is_symmetric: allowed choices “True”, “False”

  • +
+

when dtype is set to ‘float’, Encoding shall have the following fields

+
Encoding:{
+    dtype: string
+    bitwidth: integer
+}
+
+
+

bitwidth defines the precision of the tensor being generated by the producer and consumed by the +downstream consumer(s).

+

The quantizer_args structure describes the settings used to configure the quantization simulation model, and contains +usable information about how encodings were computed. +The field is auto-populated and should not require a manual edit from users. It can be broken down as follows:

+
    +
  • activation_bitwidth: Indicates the bit-width set for all activation encodings.

  • +
  • dtype: Indicates if computation occurred in floating point or integer precision.

  • +
  • is_symmetric: If set to true, it indicates that parameter encodings were computed symmetrically.

  • +
  • param_bitwidth: Indicates the bit-width set for all parameter encodings.

  • +
  • per_channel_quantization: If set to True, then quantization encodings were computed for each channel axis of the tensor.

  • +
  • quant_scheme: Indicates the quantization algorithm used, which may be one of post_training_tf or post_training_tf_enhanced.

  • +
+

The intended usage of quantizer_args is to provide debugging information for customers who may need to perform +post-quantization tasks, which could benefit from knowledge of how the encoding information was obtained.
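For illustration, a populated quantizer_args block might look as follows (the values shown are hypothetical):

"quantizer_args":
{
    "activation_bitwidth": 8,
    "dtype": "int",
    "is_symmetric": "True",
    "param_bitwidth": 8,
    "per_channel_quantization": "False",
    "quant_scheme": "post_training_tf_enhanced"
}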

+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow.html b/releases/1.32.2/api_docs/tensorflow.html new file mode 100644 index 00000000..458a5fc2 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow.html @@ -0,0 +1,1144 @@ + + + + + + AIMET TensorFlow APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ + + +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_adaround.html b/releases/1.32.2/api_docs/tensorflow_adaround.html new file mode 100644 index 00000000..db56c35e --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_adaround.html @@ -0,0 +1,1316 @@ + + + + + + AIMET TensorFlow AdaRound API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET TensorFlow AdaRound API

+ + +
+

Top-level API

+
+
+aimet_tensorflow.adaround.adaround_weight.Adaround.apply_adaround(session, starting_op_names, output_op_names, params, path, filename_prefix, default_param_bw=4, default_quant_scheme=QuantScheme.post_training_tf_enhanced, default_config_file=None)
+

Returns a TF session whose model has optimized weight rounding for every Conv and Linear op, and also saves the corresponding quantization encodings to a separate JSON-formatted file that can then be imported by QuantSim for inference or QAT

+
+
Parameters
+
    +
  • session (Session) – Tf session with model to adaround

  • +
  • starting_op_names (List[str]) – List of starting op names of the model

  • +
  • output_op_names (List[str]) – List of output op names of the model

  • +
  • params (AdaroundParameters) – Parameters for adaround

  • +
  • path (str) – path where to store parameter encodings

  • +
  • filename_prefix (str) – Prefix to use for filename of the encodings file

  • +
  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters. Default 4

  • +
  • default_quant_scheme (QuantScheme) – Quantization scheme. Supported options are QuantScheme.post_training_tf or +QuantScheme.post_training_tf_enhanced. Default QuantScheme.post_training_tf_enhanced

  • +
  • default_config_file (Optional[str]) – Default configuration file for model quantizers

  • +
+
+
Return type
+

Session

+
+
Returns
+

Tf session with Adarounded weight and saves corresponding parameter encodings JSON file +at provided path

+
+
+
+ +
+
+

Adaround Parameters

+
+
+class aimet_tensorflow.adaround.adaround_weight.AdaroundParameters(data_set, num_batches, default_num_iterations=10000, default_reg_param=0.01, default_beta_range=(20, 2), default_warm_start=0.2)[source]
+

Configuration parameters for Adaround

+
+
Parameters
+
    +
  • data_set (DatasetV2) – TF Data set

  • +
  • num_batches (int) – Number of batches

  • +
  • default_num_iterations (int) – Number of iterations to adaround each layer. Default 10000

  • +
  • default_reg_param (float) – Regularization parameter, trading off between rounding loss vs reconstruction loss. +Default 0.01

  • +
  • default_beta_range (Tuple) – Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta). +Default (20, 2)

  • +
  • default_warm_start (float) – warm up period, during which rounding loss has zero effect. Default 20% (0.2)

  • +
+
+
+
+ +
+
+

Enum Definition

+

Quant Scheme Enum

+
+
+class aimet_common.defs.QuantScheme(value)[source]
+

Enumeration of Quant schemes

+
+
+post_training_percentile = 6
+

For a Tensor, adjusted minimum and maximum values are selected based on the percentile value passed. +The Quantization encodings are calculated using the adjusted minimum and maximum value.

+
+ +
+
+post_training_tf = 1
+

For a Tensor, the absolute minimum and maximum value of the Tensor are used to compute the Quantization +encodings.

+
+ +
+
+post_training_tf_enhanced = 2
+

For a Tensor, searches and selects the optimal minimum and maximum value that minimizes the Quantization Noise. +The Quantization encodings are calculated using the selected minimum and maximum value.

+
+ +
+
+training_range_learning_with_tf_enhanced_init = 4
+

For a Tensor, the encoding values are initialized with the post_training_tf_enhanced scheme. Then, the encodings +are learned during training.

+
+ +
+
+training_range_learning_with_tf_init = 3
+

For a Tensor, the encoding values are initialized with the post_training_tf scheme. Then, the encodings are +learned during training.

+
+ +
+ +
+
+

Code Examples

+

Required imports

+

+import logging
+import numpy as np
+import tensorflow as tf
+
+from aimet_common.utils import AimetLogger
+from aimet_common.defs import QuantScheme
+from aimet_tensorflow.examples.test_models import keras_model
+from aimet_tensorflow.quantsim import QuantizationSimModel
+from aimet_tensorflow.adaround.adaround_weight import Adaround, AdaroundParameters
+
+
+
+

Evaluation function

+
def dummy_forward_pass(session: tf.compat.v1.Session, _):
+    """
+    This is intended to be the user-defined model evaluation function.
+    AIMET requires the above signature. So if the user's eval function does not
+    match this signature, please create a simple wrapper.
+    :param session: Session with model to be evaluated
+    :param _: These argument(s) are passed to the forward_pass_callback as-is. Up to
+            the user to determine the type of this parameter. E.g. could be simply an integer representing the number
+            of data samples to use. Or could be a tuple of parameters or an object representing something more complex.
+            If set to None, forward_pass_callback will be invoked with no parameters.
+    :return: single float number (accuracy) representing model's performance
+    """
+    input_data = np.random.rand(32, 16, 16, 3)
+    input_tensor = session.graph.get_tensor_by_name('conv2d_input:0')
+    output_tensor = session.graph.get_tensor_by_name('keras_model/Softmax:0')
+    output = session.run(output_tensor, feed_dict={input_tensor: input_data})
+    return output
+
+
+

After applying AdaRound to the model, the AdaRounded session and associated encodings are returned

+
def apply_adaround_example():
+
+    AimetLogger.set_level_for_all_areas(logging.DEBUG)
+    tf.compat.v1.reset_default_graph()
+
+    _ = keras_model()
+    init = tf.compat.v1.global_variables_initializer()
+    dataset_size = 32
+    batch_size = 16
+    possible_batches = dataset_size // batch_size
+    input_data = np.random.rand(dataset_size, 16, 16, 3)
+    dataset = tf.data.Dataset.from_tensor_slices(input_data)
+    dataset = dataset.batch(batch_size=batch_size)
+
+    session = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph())
+    session.run(init)
+
+    params = AdaroundParameters(data_set=dataset, num_batches=possible_batches, default_num_iterations=10)
+    starting_op_names = ['conv2d_input']
+    output_op_names = ['keras_model/Softmax']
+
+    # W4A8
+    param_bw = 4
+    output_bw = 8
+    quant_scheme = QuantScheme.post_training_tf_enhanced
+
+    # Returns session with adarounded weights and their corresponding encodings
+    adarounded_session = Adaround.apply_adaround(session, starting_op_names, output_op_names, params, path='./',
+                                                 filename_prefix='dummy', default_param_bw=param_bw,
+                                                 default_quant_scheme=quant_scheme, default_config_file=None)
+
+    # Create QuantSim using adarounded_session
+    sim = QuantizationSimModel(adarounded_session, starting_op_names, output_op_names, quant_scheme,
+                               default_output_bw=output_bw, default_param_bw=param_bw, use_cuda=False)
+
+    # Set and freeze encodings to use same quantization grid and then invoke compute encodings
+    sim.set_and_freeze_param_encodings(encoding_path='./dummy.encodings')
+    sim.compute_encodings(dummy_forward_pass, None)
+
+    session.close()
+    adarounded_session.close()
+
+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_auto_quant.html b/releases/1.32.2/api_docs/tensorflow_auto_quant.html new file mode 100644 index 00000000..67ecef10 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_auto_quant.html @@ -0,0 +1,1328 @@ + + + + + + AIMET TensorFlow AutoQuant API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET TensorFlow AutoQuant API

+ + +
+

Top-level API

+
+
+class aimet_tensorflow.auto_quant.AutoQuant(allowed_accuracy_drop, unlabeled_dataset, eval_callback, default_param_bw=8, default_output_bw=8, default_quant_scheme=QuantScheme.post_training_tf_enhanced, default_rounding_mode='nearest', default_config_file=None)[source]
+

Integrate and apply post-training quantization techniques.

+

AutoQuant includes 1) batchnorm folding, 2) cross-layer equalization, +and 3) Adaround. +These techniques will be applied in a best-effort manner until the model +meets the evaluation goal given as allowed_accuracy_drop.

+
+
Parameters
+
    +
  • allowed_accuracy_drop (float) – Maximum allowed accuracy drop.

  • +
  • unlabeled_dataset (DatasetV1) – An unlabeled dataset for encoding computation. +By default, this dataset will be also used for Adaround unless +otherwise specified by self.set_adaround_params.

  • +
  • eval_callback (Callable[[Session, Optional[int]], float]) – A function that maps a tf session and the number of samples +to the evaluation score. This callback is expected to return a +scalar value representing the model performance evaluated +against exactly N samples, where N is the number of samples +passed as the second argument of this callback. +NOTE: If N is None, the model is expected to be evaluated against +the whole evaluation dataset.

  • +
  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters.

  • +
  • default_output_bw (int) – Default bitwidth (4-31) to use for quantizing layer inputs and outputs.

  • +
  • default_quant_scheme (QuantScheme) – Quantization scheme. Supported values are +QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced.

  • +
  • default_rounding_mode (str) – Rounding mode. Supported options are ‘nearest’ or ‘stochastic’

  • +
  • default_config_file (Optional[str]) – Path to configuration file for model quantizers

  • +
+
+
+
+
+apply(fp32_sess, starting_op_names, output_op_names, results_dir='/tmp', cache_id=None)[source]
+

Apply post-training quantization techniques.

+
+
Parameters
+
    +
  • fp32_sess (Session) – tf.Session associated with the model to apply PTQ techniques.

  • +
  • starting_op_names (List[str]) – List of starting op names of the model.

  • +
  • output_op_names (List[str]) – List of output op names of the model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
Return type
+

Tuple[Session, float, str]

+
+
Returns
+

Tuple of (best session, eval score, encoding path).

+
+
+
+ +
+
+set_adaround_params(adaround_params)[source]
+

Set Adaround parameters. +If this method is not called explicitly by the user, AutoQuant will use +unlabeled_dataset (passed to __init__) for Adaround.

+
+
Parameters
+

adaround_params (AdaroundParameters) – Adaround parameters.

+
+
Return type
+

None

+
+
+
+ +
+ +
+
+

Code Examples

+

Required imports

+
from typing import Optional
+
+import numpy as np
+import tensorflow as tf
+from tensorflow.keras.applications.resnet import ResNet50
+
+from aimet_tensorflow.utils.common import iterate_tf_dataset
+from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters
+from aimet_tensorflow.auto_quant import AutoQuant
+from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+
+tf.compat.v1.disable_eager_execution()
+
+
+

Define constants and helper functions

+
EVAL_DATASET_SIZE = 5000
+CALIBRATION_DATASET_SIZE = 2000
+BATCH_SIZE = 100
+
+_sampled_datasets = {}
+
+def _create_sampled_dataset(dataset, num_samples):
+    if num_samples in _sampled_datasets:
+        return _sampled_datasets[num_samples]
+
+    with dataset._graph.as_default():
+        SHUFFLE_BUFFER_SIZE = 300 # NOTE: Adjust the buffer size as necessary.
+        SHUFFLE_SEED = 22222
+        dataset = dataset.shuffle(buffer_size=SHUFFLE_BUFFER_SIZE, seed=SHUFFLE_SEED)\
+                         .take(num_samples)\
+                         .batch(BATCH_SIZE)
+        _sampled_datasets[num_samples] = dataset
+        return dataset
+
+
+

Prepare model and dataset

+
input_shape = (224, 224, 3)
+num_classes = 1000
+
+model = ResNet50(weights='imagenet', input_shape=input_shape)
+model = update_keras_bn_ops_trainable_flag(model, False, load_save_path='./')
+
+input_tensor_name = model.input.name
+input_op_name, _ = input_tensor_name.split(":")
+output_tensor_name = model.output.name
+output_op_name, _ = output_tensor_name.split(":")
+
+# NOTE: In the actual use cases, a real dataset should be provided by the users.
+images = np.random.rand(100, *input_shape)
+labels = np.random.randint(num_classes, size=(100,))
+
+image_dataset = tf.compat.v1.data.Dataset.from_tensor_slices(images)\
+                                         .repeat()\
+                                         .take(EVAL_DATASET_SIZE)
+label_dataset = tf.compat.v1.data.Dataset.from_tensor_slices(labels)\
+                                         .repeat()\
+                                         .take(EVAL_DATASET_SIZE)
+eval_dataset = tf.compat.v1.data.Dataset.zip((image_dataset, label_dataset))
+
+
+

Prepare unlabeled dataset

+
# NOTE: In the actual use cases, the users should implement this part to serve
+#       their own goals if necessary.
+unlabeled_dataset = image_dataset.batch(BATCH_SIZE)
+                                 
+
+
+

Prepare eval callback

+
# NOTE: In the actual use cases, the users should implement this part to serve
+#       their own goals if necessary.
+def eval_callback(sess: tf.compat.v1.Session,
+                  num_samples: Optional[int] = None) -> float:
+    if num_samples is None:
+        num_samples = EVAL_DATASET_SIZE
+
+    sampled_dataset = _create_sampled_dataset(eval_dataset, num_samples)
+
+    with sess.graph.as_default():
+        sess.run(tf.compat.v1.global_variables_initializer())
+        input_tensor = sess.graph.get_tensor_by_name(input_tensor_name)
+        output_tensor = sess.graph.get_tensor_by_name(output_tensor_name)
+
+        num_correct_predictions = 0
+        for images, labels in iterate_tf_dataset(sampled_dataset):
+            prob = sess.run(output_tensor, feed_dict={input_tensor: images})
+            predictions = np.argmax(prob, axis=1)
+            num_correct_predictions += np.sum(predictions == labels)
+
+        return int(num_correct_predictions) / num_samples
+
+
+

Create AutoQuant object

+
auto_quant = AutoQuant(allowed_accuracy_drop=0.01,
+                       unlabeled_dataset=unlabeled_dataset,
+                       eval_callback=eval_callback)
+
+
+

(Optional) Set Adaround parameters

+

For setting the num_batches parameter, use the following guideline. +The number of batches is used to evaluate the model while calculating the quantization encodings. +Typically we want AdaRound to use around 2000 samples. +For example, if the batch size is 32, num_batches is 64. +If the batch size you are using is different, adjust the num_batches accordingly.

+
ADAROUND_DATASET_SIZE = 2000
+adaround_dataset = _create_sampled_dataset(image_dataset, ADAROUND_DATASET_SIZE)
+adaround_params = AdaroundParameters(adaround_dataset,
+                                     num_batches=ADAROUND_DATASET_SIZE // BATCH_SIZE)
+auto_quant.set_adaround_params(adaround_params)
+
+
+

Run AutoQuant

+
sess, accuracy, encoding_path =\
+    auto_quant.apply(tf.compat.v1.keras.backend.get_session(),
+                     starting_op_names=[input_op_name],
+                     output_op_names=[output_op_name])
+
+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_batchnorm_re_estimation.html b/releases/1.32.2/api_docs/tensorflow_batchnorm_re_estimation.html new file mode 100644 index 00000000..4539df00 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_batchnorm_re_estimation.html @@ -0,0 +1,1237 @@ + + + + + + AIMET TensorFlow BatchNorm Re-estimation APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET TensorFlow BatchNorm Re-estimation APIs

+ +
+

Introduction

+

Batch Norm (BN) Re-estimation re-estimates the statistics of BN layers after performing QAT. Using the re-estimated statistics, the BN layers are folded into the preceding Conv and Linear layers.

+
+
+

Top-level APIs

+

API for BatchNorm Re-estimation

+
+
+aimet_tensorflow.bn_reestimation.reestimate_bn_stats(sim, start_op_names, output_op_names, dataset, num_batches=100)[source]
+

Reestimate BatchNorm statistics (running mean and var).

+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – QuantizationSimModel object.

  • +
  • start_op_names (List[str]) – List of starting op names of the model

  • +
  • output_op_names (List[str]) – List of output op names of the model

  • +
  • dataset (DatasetV1) – Training dataset

  • +
  • num_batches (int) – The number of batches to be used for reestimation

  • +
+
+
Return type
+

Handle

+
+
Returns
+

Handle that undoes the effect of BN re-estimation upon handle.remove()

+
+
+
+ +

API for BatchNorm fold to scale

+
+
+aimet_tensorflow.batch_norm_fold.fold_all_batch_norms_to_scale(sim, starting_op_names, output_op_names)[source]
+

Fold all batch_norm layers in a model into the quantization scale parameter +of the corresponding conv layers

+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – tf quantized model

  • +
  • starting_op_names (List[str]) – List of starting op names of the model

  • +
  • output_op_names (List[str]) – List of output op names of the model

  • +
+
+
+
+ +
+
+

Code Example - BN-Reestimation

+

Step 1. Load the model

+

For this example, we are going to load a pretrained ResNet50 model.

+

+def load_fp32_model():
+
+    from tensorflow.compat.v1.keras.applications.resnet import ResNet50
+
+    tf.keras.backend.clear_session()
+    model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.keras.backend.get_session()
+
+    # Following lines are additional steps to make keras model work with AIMET.
+    from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
+    add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])
+
+    input_op_names = [model.input.op.name]
+    output_op_names = [model.output.op.name]
+
+    return sess, input_op_names, output_op_names
+
+
+
+

Step 2. Create QuantSim with Range Learning and Per Channel Quantization Enabled

+
  1. For an example of creating QuantSim with a Range Learning QuantScheme, please see here (a minimal sketch is also shown after this list).

  2. For how to enable Per Channel Quantization, please see here.
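As a reference, a minimal sketch of creating a QuantSim with a range-learning quant scheme, assuming the session and op names returned by Step 1 and a user-provided calibration callback; per-channel quantization is typically enabled through the quantsim config file, which is omitted here:

from aimet_common.defs import QuantScheme
from aimet_tensorflow.quantsim import QuantizationSimModel

# Range-learning scheme: encodings are initialized with TF-enhanced statistics
# and then learned during the QAT step that follows.
quant_sim = QuantizationSimModel(sess, input_op_names, output_op_names,
                                 quant_scheme=QuantScheme.training_range_learning_with_tf_enhanced_init,
                                 default_output_bw=8, default_param_bw=8, use_cuda=False)

# Compute the initial encodings using your own calibration function
# (pass_calibration_data is a placeholder for a user-defined callback).
quant_sim.compute_encodings(pass_calibration_data, None)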

Step 3. Perform QAT

+

+    update_ops_name = [op.name for op in model.updates] # Used for finetuning
+
+    # User action required
+    # The following line of code is an example of how to use an example ImageNetPipeline's train function.
+    # Replace the following line with your own pipeline's  train function.
+    ImageNetDataPipeline.finetune(quant_sim.session, update_ops_name=update_ops_name, epochs=1, learning_rate=5e-7, decay_steps=5)
+
+
+
+

Step 4 a. Perform BatchNorm Re-estimation

+

+    from aimet_tensorflow.bn_reestimation import reestimate_bn_stats
+
+    reestimate_bn_stats(quant_sim, start_op_names=input_op_names, output_op_names=output_op_names,
+                        dataset=bn_re_estimation_dataset, num_batches=100)
+
+
+
+

Step 4 b. Perform BatchNorm Fold to scale

+

+    from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms_to_scale
+
+    fold_all_batch_norms_to_scale(quant_sim, input_op_names, output_op_names)
+
+
+
+

Step 5. Export the model and encodings and test on target

+

For how to export the model and encodings, please see here
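As a reference, a minimal sketch of exporting the re-estimated, folded model together with its encodings; the directory and filename prefix below are placeholders, and the QuantSim export API should be consulted for the full set of arguments:

# Export the model and its encodings for on-target inference.
quant_sim.export(path='./output/', filename_prefix='resnet50_after_bn_reestimation')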

+
+
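As a rough sketch (the output directory and filename prefix below are placeholders), exporting typically looks like:

+    # Export the graph and the quantization encodings produced after BN re-estimation
+    # and BN fold-to-scale, so they can be taken to the target runtime.
+    quant_sim.export(path='./output', filename_prefix='resnet50_after_bn_reestimation')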
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_bias_correction.html b/releases/1.32.2/api_docs/tensorflow_bias_correction.html new file mode 100644 index 00000000..451b7d55 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_bias_correction.html @@ -0,0 +1,1478 @@ + + + + + + AIMET TensorFlow Bias Correction API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET TensorFlow Bias Correction API

+ +
+

Bias Correction API

+

Bias correction is performed after Cross Layer Equalization on models. +The main API to perform bias correction on an entire model is listed below.

+
+
+aimet_tensorflow.bias_correction.BiasCorrection.correct_bias(reference_model, bias_correct_params, quant_params, data_set, conv_bn_dict=None, perform_only_empirical_bias_corr=True)
+
+

Top level function for bias correction

+
+
+
Parameters
+
    +
  • reference_model (Session) – active tf.compat.v1.Session for the model to be corrected.

  • +
  • bias_correct_params (BiasCorrectionParams) – input params for bias correction

  • +
  • quant_params (QuantParams) – QuantParams type with params for quantization simulation for bias correction.

  • +
  • data_set (DatasetV2) – input data set

  • +
  • conv_bn_dict (Optional[Dict[Operation, ConvBnInfoType]]) – Dict of conv and bn with activation info. If None, the function looks for it. +This can be obtained on the model with bns and convs using +BiasCorrection.find_all_convs_bn_with_activation() api.

  • +
  • perform_only_empirical_bias_corr (bool) – a flag to indicate only empirical bias correction is to be performed.

  • +
+
+
Returns
+

updated session with corrected bias for given ops

+
+
+
+ +
+
+

Input Parameter Types

+

Quantization Params

+
+
+class aimet_tensorflow.bias_correction.QuantParams(quant_mode='tf_enhanced', round_mode='nearest', use_cuda=True, ops_to_ignore=None)[source]
+

Quant Params to be passed in by user

+

Constructor

+
+
Parameters
+
    +
  • quant_mode – Indicates which quantization algorithm should be used, either +‘tf’ or ‘tf_enhanced’. Defaults to ‘tf_enhanced’

  • +
  • round_mode – The rounding scheme to use. One of: ‘nearest’ or ‘stochastic’. Default is ‘nearest’.

  • +
  • use_cuda – flag to indicate if GPU is to be used

  • +
  • ops_to_ignore – ops to be ignored

  • +
+
+
+
+ +

Bias Correction Params

+
+
+aimet_tensorflow.bias_correction.BiasCorrectionParams(batch_size, num_quant_samples, num_bias_correct_samples, input_op_names, output_op_names)[source]
+

Input for bias correction to be passed by the user

+
+
Parameters
+
    +
  • batch_size (int) – input batch size to be used

  • +
  • num_quant_samples (int) – samples to be used for quantization

  • +
  • num_bias_correct_samples (int) – samples to be used for bias correction

  • +
  • input_op_names (List[str]) – list of input op names of the given model

  • +
  • output_op_names (List[str]) – list of output op names of the given model

  • +
+
+
+
+ +
+
+

Data Input Type

+

The expected format is the tf.data.Dataset type

+

A Dataset represents the input data as a tensor or a nested structure of +tensors, one per model input, along with an iterator that operates on them. A construction sketch is shown below.

+
+
+
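As an illustration, a tf.data.Dataset suitable for the bias correction APIs below can be constructed from an array of pre-processed images as follows (the random array is a stand-in for real calibration data):

+import numpy as np
+import tensorflow as tf
+
+# Placeholder calibration images; replace with real pre-processed data
+images = np.random.rand(100, 224, 224, 3).astype(np.float32)
+
+dataset = tf.data.Dataset.from_tensor_slices(images)
+dataset = dataset.batch(1)   # batch size should match BiasCorrectionParams.batch_size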

Code Examples for Bias Correction

+

Required imports

+
import tensorflow as tf
+
+from tensorflow.keras.applications.resnet50 import ResNet50
+
+# Cross layer Equalization related imports
+from aimet_tensorflow.cross_layer_equalization import equalize_model
+from aimet_tensorflow.cross_layer_equalization import GraphSearchUtils, CrossLayerScaling, HighBiasFold
+from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms
+from aimet_tensorflow.batch_norm_fold import fold_given_batch_norms
+from aimet_tensorflow.utils.graph_saver import save_and_load_graph
+from aimet_tensorflow.utils.op.conv import BiasUtils
+
+# Bias correction related imports
+from aimet_tensorflow.bias_correction import BiasCorrectionParams, QuantParams, BiasCorrection
+
+
+

Only Empirical Bias correction on a given model

+
def bias_correction_empirical(dataset: tf.data.Dataset):
+    """
+    Perform bias correction on a given model
+    :param dataset: Data passed by user as tf.Dataset type.
+    :return: None
+    """
+
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    # input parameters for bias correction
+    # populate required parameters in two data types QuantParams and BiasCorrectParams
+
+    quant_params = QuantParams(quant_mode='tf_enhanced',
+                               round_mode='nearest',
+                               use_cuda=True,
+                               ops_to_ignore=None)
+
+    bias_correction_params = BiasCorrectionParams(batch_size=1,
+                                                  num_quant_samples=10,
+                                                  num_bias_correct_samples=10,
+                                                  input_op_names=['input_1'],
+                                                  output_op_names=['fc1000/Softmax'])
+
+    with sess.as_default():
+        # run bias correction on the model
+        _new_session = BiasCorrection.correct_bias(sess, bias_correction_params, quant_params, dataset)
+    sess.close()
+
+
+

Empirical and Analytical Bias correction on a given model

+
def bias_correction_empirical_analytical(dataset: tf.data.Dataset):
+    """
+    Perform bias correction on a given model (mix of empirical and analytical)
+    :param dataset: Data passed by user as tf.Dataset type.
+    :return: None
+    """
+
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    # input parameters for bias correction
+    # populate required parameters in two data types QuantParams and BiasCorrectParams
+
+    quant_params = QuantParams(quant_mode='tf_enhanced',
+                               round_mode='nearest',
+                               use_cuda=True,
+                               ops_to_ignore=None)
+
+    bias_correction_params = BiasCorrectionParams(batch_size=1,
+                                                  num_quant_samples=10,
+                                                  num_bias_correct_samples=10,
+                                                  input_op_names=['input_1'],
+                                                  output_op_names=['fc1000/Softmax'])
+
+    with sess.as_default():
+        # run empirical and analytical bias correction on the model
+        _new_session = BiasCorrection.correct_bias(sess, bias_correction_params, quant_params,
+                                                   dataset,
+                                                   perform_only_empirical_bias_corr=False)
+    sess.close()
+
+
+

Empirical and Analytical Bias correction on a given model after performing CLE

+
def bias_correction_after_cle(dataset: tf.data.Dataset):
+    """
+    Perform bias correction on a given model (mix of empirical and analytical) after
+    cross layer equalization.
+    :param dataset: Data passed by user as tf.Dataset type.
+    :return: None
+    """
+
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    # input parameters for bias correction
+    # populate required parameters in two data types QuantParams and BiasCorrectParams
+
+    quant_params = QuantParams(quant_mode='tf_enhanced',
+                               round_mode='nearest',
+                               use_cuda=True,
+                               ops_to_ignore=None)
+
+    bias_correction_params = BiasCorrectionParams(batch_size=1,
+                                                  num_quant_samples=10,
+                                                  num_bias_correct_samples=10,
+                                                  input_op_names=['input_1'],
+                                                  output_op_names=['fc1000/Softmax'])
+
+    with sess.as_default():
+
+        # store conv bns info before performing CLE
+        conv_bn_dict = BiasCorrection.find_all_convs_bn_with_activation(sess,
+                                                                        start_op_names=['input_1'],
+                                                                        output_op_names=['fc1000/Softmax'])
+
+        # perform CLE
+        sess_after_cle = equalize_model(sess, start_op_names=['input_1'], output_op_names=['fc1000/Softmax'])
+
+        # run empirical and analytical bias correction on the model
+        _new_session = BiasCorrection.correct_bias(sess_after_cle, bias_correction_params, quant_params,
+                                                   dataset,
+                                                   conv_bn_dict=conv_bn_dict,
+                                                   perform_only_empirical_bias_corr=False)
+    sess.close()
+
+
+
+
+

Bias Correction Per Layer API

+

Empirical/analytical bias correction can also be performed on a +subset of selected layers in a given model using the APIs listed below.

+
+
+aimet_tensorflow.bias_correction.BiasCorrection.bias_correction_per_layer(reference_model, corrected_model, bias_correct_params, layer_name_to_be_corrected, data_set)
+
+

Helper function to perform empirical bias correction per layer.

+
+
+
Parameters
+
    +
  • reference_model (Session) – active tensorflow session for reference model

  • +
  • corrected_model (Session) – active tensorflow session for corrected model

  • +
  • bias_correct_params (BiasCorrectionParams) – bias correction params

  • +
  • layer_name_to_be_corrected (str) – name of layer on which bias correction is to be performed

  • +
  • quant_params – Quantization specific params from user

  • +
+
+
Return type
+

Session

+
+
Returns
+

None, updates corrected model in-place.

+
+
+
+ +
+
+aimet_tensorflow.bias_correction.BiasCorrection.analytical_bias_correction_per_layer(corrected_model, layer, preceeding_bn_layer_info, quant_params, is_first_conv=False)
+

Perform bn based bias correction (analytical bc).

+
+
Parameters
+
    +
  • corrected_model (Session) – active tensorflow session for corrected model

  • +
  • layer (Operation) – conv/linear layer to be corrected

  • +
  • preceeding_bn_layer_info (ConvBnInfoType) – corresponding preceding BN/activation info

  • +
  • quant_params (QuantParams) – Quantization specific params from user

  • +
  • is_first_conv (bool) – flag to indicate if it’s the first conv layer

  • +
+
+
Return type
+

Session

+
+
Returns
+

None, updates corrected_model in place

+
+
+
+ +
+
+

Code Example for Per-Layer Bias Correction

+

Empirical Bias correction on one layer

+
def bias_correction_single_layer_empirical(dataset: tf.data.Dataset):
+    """ perform bias correction on one layer """
+
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    # input parameters for bias correction
+    # populate required parameters in two data types QuantParams and BiasCorrectParams
+
+    quant_params = QuantParams(quant_mode='tf_enhanced',
+                               round_mode='nearest',
+                               use_cuda=True,
+                               ops_to_ignore=None)
+
+    bias_correction_params = BiasCorrectionParams(batch_size=1,
+                                                  num_quant_samples=10,
+                                                  num_bias_correct_samples=10,
+                                                  input_op_names=['input_1'],
+                                                  output_op_names=['fc1000/Softmax'])
+
+    with sess.as_default():
+        # initialize model with zero bias
+        sess = BiasUtils.initialize_model_with_bias(sess, bias_correction_params.input_op_names,
+                                                    bias_correction_params.output_op_names)
+
+        # pick a layer for bias correction
+        example_conv_layer = sess.graph.get_operation_by_name('res2a_branch2a/Conv2D')
+
+        # invoke bias correction of one layer
+        BiasCorrection.bias_correction_per_layer(reference_model=sess,
+                                                 corrected_model=sess,
+                                                 bias_correct_params=bias_correction_params,
+                                                 layer_name_to_be_corrected=example_conv_layer.name,
+                                                 quant_params=quant_params,
+                                                 data_set=dataset)
+    sess.close()
+
+
+

Analytical Bias correction on one layer

+
def bias_correction_single_layer_analytical():
+    """ perform analytical bias correction on one layer """
+
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    # input parameters for bias correction
+    # populate required parameters in two data types QuantParams and BiasCorrectParams
+
+    quant_params = QuantParams(quant_mode='tf_enhanced',
+                               round_mode='nearest',
+                               use_cuda=True,
+                               ops_to_ignore=None)
+
+    with sess.as_default():
+        # initialize model with zero bias
+        sess = BiasUtils.initialize_model_with_bias(sess, ['input_1'], ['fc1000/Softmax'])
+
+        # pick a layer for bias correction
+        example_conv_layer = sess.graph.get_operation_by_name('res2a_branch2a/Conv2D')
+
+        # get candidate conv bns in the model
+        convs_bn_activation_info_dict = BiasCorrection.find_all_convs_bn_with_activation(sess,
+                                                                                         ['input_1'],
+                                                                                         ['fc1000/Softmax'])
+
+        # make sure to pick example_conv_layer that has a bn op associated with it
+        if example_conv_layer in convs_bn_activation_info_dict.keys():
+
+            preceding_bn_layer_info = convs_bn_activation_info_dict[example_conv_layer]
+
+            # invoke analytical bias correction on this layer
+            BiasCorrection.analytical_bias_correction_per_layer(sess,
+                                                                example_conv_layer,
+                                                                preceding_bn_layer_info,
+                                                                quant_params)
+    sess.close()
+
+
+
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_compress.html b/releases/1.32.2/api_docs/tensorflow_compress.html new file mode 100644 index 00000000..6cf1ad6a --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_compress.html @@ -0,0 +1,1834 @@ + + + + + + AIMET TensorFlow Compression API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET TensorFlow Compression API

+
+

Introduction

+
+
AIMET supports the following model compression techniques for TensorFlow models:
    +
  • Spatial SVD

  • +
  • Channel Pruning

  • +
  • Weight SVD

  • +
+
+
+

To learn more about these model compression techniques, please see Model Compression User Guide

+
+
For the Spatial SVD and Channel Pruning compression techniques, there are two modes in which you can invoke the AIMET API
    +
  • +
    Auto Mode: In Auto mode, AIMET will determine the optimal way to compress each layer of the model given an overall target compression ratio. The Greedy Compression Ratio Selection Algorithm is used to pick appropriate compression ratios for each layer.

    +
    +
    +
  • +
  • +
    Manual Mode: In Manual mode, the user can pass in the desired compression-ratio per layer to AIMET. AIMET will apply the specified compression technique for each of the layers to achieve the desired compression-ratio per layer. It is recommended that the user start with Auto mode, and then tweak per-layer compression-ratios using Manual mode if desired.

    +
    +
    +
  • +
+
+
+

For Weight SVD, we use Tar-Based Rank selection. Auto and Manual modes are supported for Weight SVD as well.

+
+

+
+
+
+

Top-level API for Compression

+
+
+class aimet_tensorflow.compress.ModelCompressor[source]
+

aimet model compressor: Enables model compression using various schemes

+
+ +
+

+
+
+
+static ModelCompressor.compress_model(sess, working_dir, eval_callback, eval_iterations, input_shape, compress_scheme, cost_metric, parameters, trainer=None, visualization_url=None)[source]
+

Compress a given model using the specified parameters

+
+
Parameters
+
    +
  • sess (Session) – Model, represented by a tf.compat.v1.Session, to compress

  • +
  • working_dir (str) – File path to save compressed TensorFlow meta file

  • +
  • eval_callback (Callable[[Any, Optional[int], bool], float]) – Evaluation callback. Expected signature is evaluate(model, iterations, use_cuda). +Expected to return an accuracy metric.

  • +
  • eval_iterations – Iterations to run evaluation for

  • +
  • trainer – Training Class: Contains a callable, train_model, which takes model, layer which is being fine +tuned and an optional parameter train_flag as a parameter +None: If per layer fine tuning is not required while creating the final compressed model

  • +
  • input_shape (Union[Tuple, List[Tuple]]) – tuple or list of tuples of input shapes to the model (channels_last format)

  • +
  • compress_scheme (CompressionScheme) – Compression scheme. See the enum for allowed values

  • +
  • cost_metric (CostMetric) – Cost metric to use for the compression-ratio (either mac or memory)

  • +
  • parameters (Union[SpatialSvdParameters, ChannelPruningParameters]) – Compression parameters specific to given compression scheme

  • +
  • trainer – Training function +None: If per layer fine tuning is not required while creating the final compressed model

  • +
  • visualization_url – url the user will need to input where visualizations will appear

  • +
+
+
Return type
+

Tuple[Session, CompressionStats]

+
+
Returns
+

A tuple of the compressed model session, and compression statistics

+
+
+
+ +
+

+
+
+
+

Greedy Selection Parameters

+
+
+class aimet_common.defs.GreedySelectionParameters(target_comp_ratio, num_comp_ratio_candidates=10, use_monotonic_fit=False, saved_eval_scores_dict=None)[source]
+

Configuration parameters for the Greedy compression-ratio selection algorithm

+
+
Variables
+
    +
  • target_comp_ratio – Target compression ratio. Expressed as a value between 0 and 1. +Compression ratio is the ratio of the cost of the compressed model to the cost of the original model. For example, target_comp_ratio=0.5 aims for a compressed model whose cost (MACs or memory) is roughly half that of the original.

  • +
  • num_comp_ratio_candidates – Number of comp-ratio candidates to analyze per-layer +More candidates allows more granular distribution of compression at the cost +of increased run-time during analysis. Default value=10. Value should be greater than 1.

  • +
  • use_monotonic_fit – If True, eval scores in the eval dictionary are fitted to a monotonically increasing +function. This is useful if you see the eval dict scores for some layers are not monotonically increasing. +By default, this option is set to False.

  • +
  • saved_eval_scores_dict – Path to the eval_scores dictionary pickle file that was +saved in a previous run. This is useful to speed-up experiments when trying +different target compression-ratios for example. aimet will save eval_scores +dictionary pickle file automatically in a ./data directory relative to the +current path. num_comp_ratio_candidates parameter will be ignored when this option is used.

  • +
+
+
+
+ +
+

+
+
+
+

Spatial SVD Configuration

+
+
+class aimet_tensorflow.defs.SpatialSvdParameters(input_op_names, output_op_names, mode, params, multiplicity=1)[source]
+

Configuration parameters for spatial svd compression

+
+
Parameters
+
    +
  • input_op_names (List[str]) – list of input op names to the model

  • +
  • output_op_names (List[str]) – List of output op names of the model

  • +
  • mode (Mode) – Either auto mode or manual mode

  • +
  • params (Union[ManualModeParams, AutoModeParams]) – Parameters for the mode selected

  • +
  • multiplicity – The multiplicity to which ranks/input channels will get rounded. Default: 1

  • +
+
+
+
+
+class AutoModeParams(greedy_select_params, modules_to_ignore=None)[source]
+

Configuration parameters for auto-mode compression

+
+
Parameters
+
    +
  • greedy_select_params (GreedySelectionParameters) – Params for greedy comp-ratio selection algorithm

  • +
  • modules_to_ignore (Optional[List[Operation]]) – List of modules to ignore (None indicates nothing to ignore)

  • +
+
+
+
+ +
+
+class ManualModeParams(list_of_module_comp_ratio_pairs)[source]
+

Configuration parameters for manual-mode spatial svd compression

+
+
Parameters
+

list_of_module_comp_ratio_pairs (List[ModuleCompRatioPair]) – List of (module, comp-ratio) pairs

+
+
+
+ +
+
+class Mode(value)[source]
+

Mode enumeration

+
+
+auto = 2
+

Auto mode

+
+ +
+
+manual = 1
+

Manual mode

+
+ +
+ +
+ +
+

+
+
+
+

Channel Pruning Configuration

+
+
+class aimet_tensorflow.defs.ChannelPruningParameters(input_op_names, output_op_names, data_set, batch_size, num_reconstruction_samples, allow_custom_downsample_ops, mode, params, multiplicity=1)[source]
+

Configuration parameters for channel pruning compression

+
+
Parameters
+
    +
  • input_op_names (List[str]) – list of input op names to the model

  • +
  • output_op_names (List[str]) – List of output op names of the model

  • +
  • data_set (DatasetV2) – data set

  • +
  • batch_size (int) – batch size

  • +
  • num_reconstruction_samples (int) – number of samples to be used for reconstruction

  • +
  • allow_custom_downsample_ops (bool) – If set to True, DownSampleLayer and UpSampleLayer will be added as required

  • +
  • mode (Mode) – indicates whether the mode is manual or auto

  • +
  • params (Union[ManualModeParams, AutoModeParams]) – ManualModeParams or AutoModeParams, depending on the value of mode

  • +
  • multiplicity – The multiplicity to which ranks/input channels will get rounded. Default: 1

  • +
+
+
+
+
+class AutoModeParams(greedy_select_params, modules_to_ignore=None)[source]
+

Configuration parameters for auto-mode compression

+
+
Parameters
+
    +
  • greedy_select_params (GreedySelectionParameters) – Params for greedy comp-ratio selection algorithm

  • +
  • modules_to_ignore (Optional[List[Operation]]) – List of modules to ignore (None indicates nothing to ignore)

  • +
+
+
+
+ +
+
+class ManualModeParams(list_of_module_comp_ratio_pairs)[source]
+

Configuration parameters for manual-mode channel pruning compression

+
+
Parameters
+

list_of_module_comp_ratio_pairs (List[ModuleCompRatioPair]) – List of (module, comp-ratio) pairs

+
+
+
+ +
+
+class Mode(value)[source]
+

Mode enumeration

+
+
+auto = 2
+

aimet computes optimal comp-ratio per layer

+
+
Type
+

Auto mode

+
+
+
+ +
+
+manual = 1
+

User specifies comp-ratio per layer

+
+
Type
+

Manual mode

+
+
+
+ +
+ +
+ +
+

+
+
+
+

Configuration Definitions

+
+
+class aimet_common.defs.CostMetric(value)[source]
+

Enumeration of metrics to measure cost of a model/layer

+
+
+mac = 1
+

Cost modeled for compute requirements

+
+
Type
+

MAC

+
+
+
+ +
+
+memory = 2
+

Cost modeled for space requirements

+
+
Type
+

Memory

+
+
+
+ +
+ +
+

+
+
+
+class aimet_common.defs.CompressionScheme(value)[source]
+

Enumeration of compression schemes supported in aimet

+
+
+channel_pruning = 3
+

Channel Pruning

+
+ +
+
+spatial_svd = 2
+

Spatial SVD

+
+ +
+
+weight_svd = 1
+

Weight SVD

+
+ +
+ +
+

+
+
+
+class aimet_tensorflow.defs.ModuleCompRatioPair(module, comp_ratio)[source]
+

Pair of tf.Operation and a compression-ratio

+
+
Variables
+
    +
  • module – Module of type tf.Operation

  • +
  • comp_ratio – Compression ratio. Compression ratio is the ratio of cost of compressed model +to cost of the original model.

  • +
+
+
+
+ +
+

+
+
+
+

Code Examples

+

Required imports

+
from decimal import Decimal
+
+import numpy as np
+import tensorflow as tf
+from tensorflow.python.keras.applications.vgg16 import VGG16
+
+# Compression-related imports
+from aimet_common.defs import GreedySelectionParameters
+from aimet_common.defs import CostMetric, CompressionScheme
+from aimet_tensorflow.defs import SpatialSvdParameters, ChannelPruningParameters, ModuleCompRatioPair
+from aimet_tensorflow.compress import ModelCompressor
+
+
+

Evaluation function

+
def evaluate_model(sess: tf.compat.v1.Session, eval_iterations: int, use_cuda: bool) -> float:
+    """
+    This is intended to be the user-defined model evaluation function.
+    AIMET requires the above signature. So if the user's eval function does not
+    match this signature, please create a simple wrapper.
+
+    Note: Honoring the number of iterations is not absolutely necessary.
+    However if all evaluations run over an entire epoch of validation data,
+    the runtime for AIMET compression will obviously be higher.
+
+    :param sess: Tensorflow session
+    :param eval_iterations: Number of iterations to use for evaluation.
+            None for entire epoch.
+    :param use_cuda: If true, evaluate using gpu acceleration
+    :return: single float number (accuracy) representing model's performance
+    """
+
+    # Evaluate model should run data through the model and return an accuracy score.
+    # If the model does not have nodes to measure accuracy, they will need to be added to the graph.
+    return .5
+
+
+

Compressing using Spatial SVD in auto mode with multiplicity = 8 for rank rounding

+
def spatial_svd_auto_mode():
+
+    sess = tf.compat.v1.Session()
+    # Construct graph
+    with sess.graph.as_default():
+        _ = VGG16(weights=None, input_shape=(224, 224, 3))
+        init = tf.compat.v1.global_variables_initializer()
+    sess.run(init)
+
+    # ignore first Conv2D op
+    conv2d = sess.graph.get_operation_by_name('block1_conv1/Conv2D')
+    modules_to_ignore = [conv2d]
+
+    greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),
+                                              num_comp_ratio_candidates=10,
+                                              use_monotonic_fit=True,
+                                              saved_eval_scores_dict=None)
+
+    auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                      modules_to_ignore=modules_to_ignore)
+
+    params = SpatialSvdParameters(input_op_names=['input_1'], output_op_names=['predictions/Softmax'],
+                                  mode=SpatialSvdParameters.Mode.auto, params=auto_params, multiplicity=8)
+    input_shape = (1, 3, 224, 224)
+
+    # Single call to compress the model
+    compr_model_sess, stats = ModelCompressor.compress_model(sess=sess,
+                                                             working_dir=str('./'),
+                                                             eval_callback=evaluate_model,
+                                                             eval_iterations=10,
+                                                             input_shape=input_shape,
+                                                             compress_scheme=CompressionScheme.spatial_svd,
+                                                             cost_metric=CostMetric.mac,
+                                                             parameters=params,
+                                                             trainer=None)
+
+    print(stats)    # Stats object can be pretty-printed easily
+
+
+

Compressing using Spatial SVD in manual mode

+
def spatial_svd_manual_mode():
+
+    sess = tf.compat.v1.Session()
+    # Construct graph
+    with sess.graph.as_default():
+        _ = VGG16(weights=None, input_shape=(224, 224, 3))
+        init = tf.compat.v1.global_variables_initializer()
+    sess.run(init)
+
+    # Pick two convs to compress as examples
+    conv2d = sess.graph.get_operation_by_name('block1_conv1/Conv2D')
+    conv2d_1 = sess.graph.get_operation_by_name('block1_conv2/Conv2D')
+
+    # Specify the necessary parameters
+    manual_params = SpatialSvdParameters.ManualModeParams([ModuleCompRatioPair(module=conv2d, comp_ratio=0.5),
+                                                           ModuleCompRatioPair(module=conv2d_1, comp_ratio=0.4)])
+
+    params = SpatialSvdParameters(input_op_names=['input_1'], output_op_names=['predictions/Softmax'],
+                                  mode=SpatialSvdParameters.Mode.manual, params=manual_params)
+
+    input_shape = (1, 3, 224, 224)
+
+    # Single call to compress the model
+    compr_model_sess, stats = ModelCompressor.compress_model(sess=sess,
+                                                             working_dir=str('./'),
+                                                             eval_callback=evaluate_model,
+                                                             eval_iterations=10,
+                                                             input_shape=input_shape,
+                                                             compress_scheme=CompressionScheme.spatial_svd,
+                                                             cost_metric=CostMetric.mac,
+                                                             parameters=params,
+                                                             trainer=None)
+
+    print(stats)    # Stats object can be pretty-printed easily
+
+
+

Compressing using Channel Pruning in auto mode

+
def channel_pruning_auto_mode():
+
+    sess = tf.compat.v1.Session()
+    # Construct graph
+    with sess.graph.as_default():
+        _ = VGG16(weights=None, input_shape=(224, 224, 3))
+        init = tf.compat.v1.global_variables_initializer()
+    sess.run(init)
+
+    # ignore first Conv2D op
+    conv2d = sess.graph.get_operation_by_name('block1_conv1/Conv2D')
+    modules_to_ignore = [conv2d]
+
+    greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),
+                                              num_comp_ratio_candidates=2,
+                                              use_monotonic_fit=True,
+                                              saved_eval_scores_dict=None)
+
+    auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params,
+                                                          modules_to_ignore=modules_to_ignore)
+
+    # Create random dataset
+    batch_size = 1
+    input_data = np.random.rand(100, 224, 224, 3)
+    dataset = tf.data.Dataset.from_tensor_slices(input_data)
+    dataset = dataset.batch(batch_size=batch_size)
+
+    params = ChannelPruningParameters(input_op_names=['input_1'],
+                                      output_op_names=['predictions/Softmax'],
+                                      data_set=dataset,
+                                      batch_size=32,
+                                      num_reconstruction_samples=50,
+                                      allow_custom_downsample_ops=False,
+                                      mode=ChannelPruningParameters.Mode.auto,
+                                      params=auto_params,
+                                      multiplicity=8)
+
+    # Single call to compress the model
+    results = ModelCompressor.compress_model(sess,
+                                             working_dir=None,
+                                             eval_callback=evaluate_model,
+                                             eval_iterations=10,
+                                             input_shape=(32, 224, 224, 3),
+                                             compress_scheme=CompressionScheme.channel_pruning,
+                                             cost_metric=CostMetric.mac,
+                                             parameters=params)
+
+    compressed_model, stats = results
+    print(compressed_model)
+    print(stats)     # Stats object can be pretty-printed easily
+
+
+

Compressing using Channel Pruning in manual mode

+
def channel_pruning_manual_mode():
+
+    sess = tf.compat.v1.Session()
+
+    # Construct graph
+    with sess.graph.as_default():
+        _ = VGG16(weights=None, input_shape=(224, 224, 3))
+        init = tf.compat.v1.global_variables_initializer()
+    sess.run(init)
+
+    # Create random dataset
+    batch_size = 1
+    input_data = np.random.rand(100, 224, 224, 3)
+    dataset = tf.data.Dataset.from_tensor_slices(input_data)
+    dataset = dataset.batch(batch_size=batch_size)
+
+    #  Pick two convs to compress as examples
+    block1_conv2_op = sess.graph.get_operation_by_name('block1_conv2/Conv2D')
+    block2_conv2_op = sess.graph.get_operation_by_name('block2_conv2/Conv2D')
+
+    list_of_module_comp_ratio_pairs = [ModuleCompRatioPair(block1_conv2_op, 0.5),
+                                       ModuleCompRatioPair(block2_conv2_op, 0.5)]
+
+    manual_params = ChannelPruningParameters.ManualModeParams(list_of_module_comp_ratio_pairs=
+                                                              list_of_module_comp_ratio_pairs)
+
+    params = ChannelPruningParameters(input_op_names=['input_1'],
+                                      output_op_names=['predictions/Softmax'],
+                                      data_set=dataset,
+                                      batch_size=32,
+                                      num_reconstruction_samples=50,
+                                      allow_custom_downsample_ops=False,
+                                      mode=ChannelPruningParameters.Mode.
+                                      manual,
+                                      params=manual_params,
+                                      multiplicity=8)
+
+    # Single call to compress the model
+    results = ModelCompressor.compress_model(sess,
+                                             working_dir=None,
+                                             eval_callback=evaluate_model,
+                                             eval_iterations=10,
+                                             input_shape=(32, 224, 224, 3),
+                                             compress_scheme=CompressionScheme.channel_pruning,
+                                             cost_metric=CostMetric.mac,
+                                             parameters=params)
+
+    compressed_model, stats = results
+    print(compressed_model)
+    print(stats)     # Stats object can be pretty-printed easily
+
+
+
+
+

Weight SVD Top-level API

+
+
+class aimet_tensorflow.svd.Svd(graph, checkpoint, metric, output_file='./svd_graph', svd_type='svd', num_layers=0, layers=None, layer_ranks=None, num_ranks=20, gpu=True, debug=False, no_evaluation=False, layer_selection_threshold=0.6)[source]
+

A class for performing singular value decomposition on a tensorflow model.

+

The Svd class enables model compression through singular value decomposition (SVD). +It can analyze convolution and fully connected layers and perform +some analysis to find the optimal ranks for balancing compression and the +accuracy of the network.

+

Constructor for the Svd class

+

Constructs the Svd class from a set of options passed in at construction. The class takes +a number of named arguments which are detailed below.

+
+
Parameters
+
    +
  • graph – The file path to the meta graph.

  • +
  • checkpoint – The file path to the tensorflow checkpoint file.

  • +
  • metric – The metric to use for determining the optimal compression. Either +‘mac’ for optimizing compression to minimize multiplies and accumulates or ‘memory’ which +optimizes for overall memory footprint. Defaults to ‘memory’

  • +
  • output_file – The file path for saving the compressed tensorflow graph. +aimet will save to the directory specified, using output_file as a filename prefix

  • +
  • svd_type – Indicates which algorithm should be used, either +‘svd’ or ‘ssvd’. Defaults to ‘svd’.

  • +
  • num_layers – The number of layers to compress. Defaults to ‘0’ which uses a +heuristic to determine the optimal number of layers to compress.

  • +
  • layers – A list of op names to compress. All other layers will be ignored. +Overrides num_layers and sets it to the length of this list.

  • +
  • layer_ranks – required only if no_evaluation is set to True. A list of tuples to compress +layers specified in layers argument.

  • +
  • num_ranks – The number of ranks (compression_points) to evaluate for compression. +Defaults to 20. Value should be greater than 2.

  • +
  • gpu – Indicates if the algorithm should run on GPU or CPU. Defaults to GPU. To +use CPU set to false

  • +
  • debug – If true debug messages will be printed. Defaults to False.

  • +
  • no_evaluation – If true, ranks will be set manually from user. Defaults to False.

  • +
  • layer_selection_threshold – Threshold (0-1) to use to select the top layers in the network

  • +
+
+
Raises
+

ValueError: An error occurred processing one of the input parameters.

+
+
+
+ +
+

+
+
+
+Svd.compress_net(generator, eval_names=None, run_graph=<function evaluate_graph>, eval_func=<function default_eval_func>, error_margin=2, iterations=100)[source]
+

Compresses the network using SVD

+

Runs rank selection on the network, and compresses it using the method and parameters +passed during construction of the Svd object.

+
+
Parameters
+
    +
  • generator – The generator which should be used for generating data for quantization

  • +
  • eval_names – The list of names to use for calculating model performance

  • +
  • run_graph – The function to use for running data through the graph and evaluating +the network’s performance. This function must return only a single number representing the +avg performance of the model over the dataset batches. +See the ‘graph_eval’ module’s ‘evaluate_graph’ function for the prototype

  • +
  • eval_func – The function to use for evaluating the network performance. This function should always +return a single number that can be used for comparing different graph’s performance. +(The default is accuracy)

  • +
  • error_margin – The acceptable degradation in network accuracy from the original. +1 for 1% drop, etc. Defaults to 2%.

  • +
  • iterations – The number of iterations (data batches) to run through the network for analysis

  • +
+
+
Returns
+

An object containing compression statistics

+
+
Raises
+
    +
  • ValueError: An invalid parameter was passed

  • +
  • RuntimeError: An error occurred analyzing or compressing the network. The associated error +and other information will be returned with the error.

  • +
+
+
+
+ +
+

+
+
+
+

Code Examples for Weight SVD

+

Required imports

+
import os
+from aimet_tensorflow import svd as s
+from aimet_tensorflow.common import tfrecord_generator as tf_gen
+from aimet_tensorflow.common.tfrecord_generator import MnistParser
+
+
+

Compressing using Weight SVD in auto mode

+
def weight_svd_auto_mode(self):
+
+    # Allocate the generator you wish to use to provide the network with data
+    generator = tf_gen.TfRecordGenerator(tfrecords=[os.path.join('data', 'mnist', 'validation.tfrecords')],
+                                         parser=MnistParser())
+
+    # Allocate the SVD instance and compress the network
+    svd = s.Svd(graph=os.path.join('models', 'mnist_save.meta'), checkpoint=os.path.join('models', 'mnist_save'),
+                output_file=os.path.join('svd', 'svd_graph'), layers=[], num_ranks=20,
+                layer_selection_threshold=0.95, metric=s.CostMetric.memory)
+
+    stats = svd.compress_net(generator=generator, iterations=10)
+
+    stats.pretty_print() # Print the stats for Weight SVD compression
+
+
+

Compressing using Weight SVD in manual mode

+
def weight_svd_manual_mode(self):
+
+    # Allocate the generator you wish to use to provide the network with data
+    generator = tf_gen.TfRecordGenerator(tfrecords=[os.path.join('data', 'mnist', 'validation.tfrecords')],
+                                         parser=MnistParser())
+
+    # Only Compress Conv2d_1 and MatMul_1 with ranks 31 and 9 respectively
+    # no_evaluation should be True in Manual mode
+
+    layers = ['Conv2D_1', 'MatMul_1']
+    layer_ranks = [('Conv2D_1', 31), ('MatMul_1', 9)]
+
+    svd = s.Svd(graph=os.path.join('models', 'mnist_save.meta'), checkpoint=os.path.join('models', 'mnist_save'),
+                output_file=os.path.join('svd', 'svd_graph'), layers=layers, layer_ranks=layer_ranks, num_ranks=20,
+                no_evaluation=True, metric=s.CostMetric.memory)
+
+    stats = svd.compress_net(generator=generator, iterations=10)
+
+    stats.pretty_print() # Print the stats for Weight SVD compression
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_cross_layer_equalization.html b/releases/1.32.2/api_docs/tensorflow_cross_layer_equalization.html new file mode 100644 index 00000000..ca4d0fa3 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_cross_layer_equalization.html @@ -0,0 +1,1215 @@ + + + + + + AIMET TensorFlow Cross Layer Equalization APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET TensorFlow Cross Layer Equalization APIs

+ + +
+

Introduction

+
+
AIMET functionality for TensorFlow Cross Layer Equalization supports three techniques-
    +
  • BatchNorm Folding

  • +
  • Cross Layer Scaling

  • +
  • High Bias Fold

  • +
+
+
+
+
+

Cross Layer Equalization API

+

Listed below is a comprehensive API to apply all available techniques under cross layer equalization. +It performs ‘auto’ detection of candidate layers and applies the techniques. +If there are no BatchNorm layers in a given model, BatchNorm fold and high bias fold shall be skipped.

+

API(s) for Cross Layer Equalization

+
+
+aimet_tensorflow.cross_layer_equalization.equalize_model(sess, start_op_names, output_op_names)[source]
+

High-level API to perform Cross-Layer Equalization (CLE) on the given model. The model is equalized in place.

+
+
Parameters
+
    +
  • sess (Session) – tf.compat.v1.Session with model to equalize

  • +
  • start_op_names (Union[str, List[str]]) – Names of starting ops in the given model

  • +
  • output_op_names (Union[str, List[str]]) – List of output op names of the model, used to help ConnectedGraph determine valid ops +(to ignore training ops for example).

  • +
+
+
Return type
+

Session

+
+
Returns
+

updated session after bn fold, cls and hbf.

+
+
+
+ +
+
+

Code Example

+

Required imports

+
import tensorflow as tf
+
+from tensorflow.keras.applications.resnet50 import ResNet50
+
+# Cross layer Equalization related imports
+from aimet_tensorflow.cross_layer_equalization import equalize_model
+
+
+

Cross Layer Equalization in auto mode comprehensive

+
def cross_layer_equalization_auto():
+    """ perform auto cross layer equalization """
+
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    # get starting op name to invoke api for cle
+    input_op_name = 'input_1'
+    output_op_name = 'fc1000/Softmax'
+
+    # Equalize a model with Batchnorms
+    # Performs BatchNorm fold, replacing Relu6 with Relu, Cross layer scaling and High bias fold
+    # use the new session returned for further evaluations on TF graph
+    with sess.as_default():
+        new_session = equalize_model(sess, input_op_name, output_op_name)
+    sess.close()
+
+
+
+
+

Primitive APIs

+

If the user would like to call the APIs individually, then the following APIs can be used-

+ +
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_layer_output_generation.html b/releases/1.32.2/api_docs/tensorflow_layer_output_generation.html new file mode 100644 index 00000000..c3a054b9 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_layer_output_generation.html @@ -0,0 +1,1224 @@ + + + + + + AIMET Tensorflow Layer Output Generation API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Tensorflow Layer Output Generation API

+

This API captures and saves intermediate layer-outputs of a model. The model can be original (FP32) or quantsim. +The layer-outputs are named according to the exported TensorFlow model by the quantsim export API. This allows layer-output comparison +amongst the FP32 model, the quantization-simulated model, and the actually quantized model on the target device, to debug accuracy mismatch issues.

+
+

Top-level API

+
+
+class aimet_tensorflow.layer_output_utils.LayerOutputUtil(session, starting_op_names, output_op_names, dir_path)[source]
+

Implementation to capture and save outputs of intermediate layers of a model (fp32/quantsim)

+

Constructor for LayerOutputUtil.

+
+
Parameters
+
    +
  • session (Session) – Session containing the model whose layer-outputs are needed.

  • +
  • starting_op_names (List[str]) – List of starting op names of the model.

  • +
  • output_op_names (List[str]) – List of output op names of the model.

  • +
  • dir_path (str) – Directory wherein layer-outputs will be saved.

  • +
+
+
+
+ +
+

+
+

The following API can be used to Generate Layer Outputs

+
+
+LayerOutputUtil.generate_layer_outputs(input_batch)[source]
+

This method captures output of every layer of a model & saves the inputs and corresponding layer-outputs to disk.

+
+
Parameters
+

input_batch (Union[ndarray, List[ndarray], Tuple[ndarray]]) – Batch of inputs for which we want to obtain layer-outputs.

+
+
Returns
+

None

+
+
+
+ +
+

+
+
+
+

Code Example

+

Imports

+
import tensorflow as tf
+
+from aimet_tensorflow.quantsim import QuantizationSimModel
+from aimet_tensorflow.layer_output_utils import LayerOutputUtil
+
+
+

Obtain Original or QuantSim model session from AIMET Export Artifacts

+
# Load the model into session.
+tf.compat.v1.reset_default_graph()
+session = tf.compat.v1.Session()
+saver = tf.compat.v1.train.import_meta_graph('path/to/aimet_export_artifacts/model.meta')
+saver.restore(session, 'path/to/aimet_export_artifacts/model')
+
+# Use the same arguments that were used for the exported QuantSim model. For the sake of simplicity, only mandatory arguments are passed below.
+quantsim = QuantizationSimModel(session, starting_op_names, output_op_names, use_cuda=False)
+
+# Load exported encodings into quantsim object.
+quantsim.load_encodings_to_sim('path/to/aimet_export_artifacts/model.encodings')
+
+# Check whether constructed original and quantsim model are running properly before using Layer Output Generation API.
+_ = session.run(None, feed_dict)
+_ = quantsim.session.run(None, feed_dict)
+
+
+

Obtain inputs for which we want to generate intermediate layer-outputs

+
# Use the same input pre-processing pipeline that was used for computing the quantization encodings.
+input_batches = get_pre_processed_inputs()
+
+
+

Generate layer-outputs

+
# Use original session to get fp32 layer-outputs
+fp32_layer_output_util = LayerOutputUtil(session, starting_op_names, output_op_names, dir_path='./fp32_layer_outputs')
+
+# Use quantsim session to get quantsim layer-outputs
+quantsim_layer_output_util = LayerOutputUtil(quantsim.session, starting_op_names, output_op_names, dir_path='./quantsim_layer_outputs')
+
+for input_batch in input_batches:
+    fp32_layer_output_util.generate_layer_outputs(input_batch)
+    quantsim_layer_output_util.generate_layer_outputs(input_batch)
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_model_guidelines.html b/releases/1.32.2/api_docs/tensorflow_model_guidelines.html new file mode 100644 index 00000000..f28a6837 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_model_guidelines.html @@ -0,0 +1,1171 @@ + + + + + + TensorFlow Model Guidelines — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

TensorFlow Model Guidelines

+

In order to make full use of AIMET features, there are several guidelines users should follow when defining +TensorFlow models.

+

If model has BatchNormalization (BN) layers

+

If the model has BatchNormalization (BN) layers, the user should set their trainable flag to False and recompile the model +before AIMET usage. This is one of the limitations with TensorFlow 2.x, but if you are using TensorFlow 1.x, +then this step is not required:

+
...
+model = Model()
+from aimet_tensorflow.utils.graph import update_keras_bn_ops_trainable_flag
+model = update_keras_bn_ops_trainable_flag(model, load_save_path="./", trainable=False)
+
+
+
+
+aimet_tensorflow.utils.graph.update_keras_bn_ops_trainable_flag(model, trainable, load_save_path)[source]
+
+

helper method to update Keras BN ops trainable state in a given keras model.

+
+
+
Parameters
+
    +
  • model (Model) – Keras model to be updated with BN ops trainable flag

  • +
  • trainable (bool) – bool flag to indicate trainable to be set to true or false

  • +
  • load_save_path (str) – temp folder to perform load/save, cleans up file created

  • +
+
+
Return type
+

Model

+
+
Returns
+

updated keras model

+
+
+
+ +

If model has Recurrent (RNN, LSTM etc.) layers

+

Recurrent layers (RNN, LSTM) are not supported with TensorFlow 2.x; they are supported only with TensorFlow 1.x.

+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_primitive_apis_cle.html b/releases/1.32.2/api_docs/tensorflow_primitive_apis_cle.html new file mode 100644 index 00000000..dd046b4e --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_primitive_apis_cle.html @@ -0,0 +1,1512 @@ + + + + + + AIMET TensorFlow Cross Layer Equalization Primitive API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ +
+

AIMET TensorFlow Cross Layer Equalization Primitive API

+
+

Introduction

+

If a user wants to modify the order of Cross Layer equalization, not use some features or manually tweak the list of +layers that need to be equalized, the following APIs can be used.

+

Higher level API can be used for using one or more features one after the other. It automatically finds the layers to +be folded or scaled.

+

Lower level APIs can be used to manually tweak the list of layers to be folded. The user has to pass the list of +layers in the correct order that they appear in the model.

+

Note: Before using High Bias Fold, Cross Layer Scaling (CLS) needs to be applied, and the scaling factors obtained from +CLS need to be plugged into High Bias Fold. Also, if there are batchnorm layers, they need to be folded and the fold information +saved so that it can be plugged into the High Bias Fold API.

+
+
+

Higher Level APIs for Cross Layer Equalization

+

API for Batch Norm Folding

+
+
+aimet_tensorflow.batch_norm_fold.fold_all_batch_norms(sess, input_op_names, output_op_names)[source]
+

Fold all batch_norm layers in a model into corresponding conv layers

+
+
Parameters
+
    +
  • sess (Session) – active tf.compat.v1.Session

  • +
  • input_op_names (Union[str, List[str]]) – Name of the starting op in the given graph or a list of names in case of multi-input model

  • +
  • output_op_names (Union[str, List[str]]) – List of output op names of the model, used to help ConnectedGraph determine valid ops +(to ignore training ops for example). If None, all ops in the model are considered valid.

  • +
+
+
Return type
+

Tuple[Session, List[Tuple[Operation, Operation]]]

+
+
Returns
+

A new session with edited graph and a list of pairs of layers [(Conv/Linear, BN layer that got folded)]

+
+
+
+ +

API for Cross Layer Scaling

+
+
+aimet_tensorflow.cross_layer_equalization.CrossLayerScaling.scale_model(sess, input_op_names, output_op_names)
+

Uses cross-layer scaling to scale all applicable layers in the given model

+
+
Parameters
+
    +
  • sess (Session) – Session containing graph to scale

  • +
  • input_op_names (Union[str, List[str]]) – Names of starting ops in the model

  • +
  • output_op_names (Union[str, List[str]]) – List of output op names of the model, used to help ConnectedGraph determine valid ops +(to ignore training ops for example). If None, all ops in the model are considered valid.

  • +
+
+
Return type
+

(Session, List[ClsSetInfo])

+
+
Returns
+

updated session, CLS information for each CLS set

+
+
+
+ +

API for High Bias Folding

+
+
+aimet_tensorflow.cross_layer_equalization.HighBiasFold.bias_fold(sess, folded_pairs, cls_set_info_list)
+

Folds bias values greater than 3 * sigma to next layer’s bias

+
+
Parameters
+
    +
  • sess (Session) – Current session

  • +
  • folded_pairs (List[Tuple[Operation, Operation]]) – Key: Conv/Linear layer Value: Corresponding folded BN layer

  • +
  • cls_set_info_list (List[ClsSetInfo]) – List of info elements for each cls set

  • +
+
+
Return type
+

Session

+
+
Returns
+

updated session after graph updates from hbf

+
+
+
+ +
+
+

Code Examples for Higher Level APIs

+

Required imports

+
import tensorflow as tf
+
+from tensorflow.keras.applications.resnet50 import ResNet50
+
+from aimet_tensorflow.cross_layer_equalization import GraphSearchUtils, CrossLayerScaling, HighBiasFold
+from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms
+
+
+

Perform Cross Layer Equalization in auto mode step by step

+
def cross_layer_equalization_auto_stepwise():
+    """ Individual api calls to perform cross layer equalization one step at a time"""
+
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    # get starting op name to invoke api for cle
+    start_op_name = 'input_1'
+    output_op_name = 'fc1000/Softmax'
+
+    with sess.as_default():
+        # replace any ReLU6 layers with ReLU
+        graph_util = GraphSearchUtils(sess.graph, start_op_name, output_op_name)
+        after_relu_replace_sess = graph_util.find_and_replace_relu6_with_relu(sess)
+
+        # fold batchnorm layers
+        after_bn_fold_sess, folded_pairs = fold_all_batch_norms(after_relu_replace_sess, start_op_name, output_op_name)
+
+        # perform cross-layer scaling on applicable layer groups
+        after_cls_sess, cls_set_info_list = CrossLayerScaling.scale_model(after_bn_fold_sess, start_op_name, output_op_name)
+
+        # perform high bias fold
+        # use the session after high bias fold returned for further evaluations on TF graph
+        after_hbf_sess = HighBiasFold.bias_fold(after_cls_sess, folded_pairs, cls_set_info_list)
+    sess.close()
+
+
+
+
+

Lower Level APIs for Cross Layer Equalization

+

API for Batch Norm Folding on subsets of convolution-batchnorm layer pairs

+
+
+aimet_tensorflow.batch_norm_fold.fold_given_batch_norms(sess, input_op_names, output_op_names, layer_pairs)[source]
+

API to fold a custom set of batch norm layers in a model

Parameters

  • sess (Session) – active tensorflow session

  • input_op_names (Union[str, List[str]]) – starting op in model or a list of starting ops in the model

  • layer_pairs (List[Tuple[Operation, Operation, bool]]) – List of tuples with conv and bn op layers as tf.Operation and a flag to indicate fold upstream or downstream

  • output_op_names (Union[str, List[str]]) – List of output op names of the model, used to help ConnectedGraph determine valid ops (to ignore training ops for example).

Return type

Session

Returns

updated session after fold

+
+
+
+ +
+

+
+

API for Cross Layer Scaling on subset of conv layer groups

+
+
+aimet_tensorflow.cross_layer_equalization.CrossLayerScaling.scale_cls_sets(sess, cls_sets)
+

Scale multiple CLS sets

+
+
Parameters

  • sess (Session) – Current session

  • cls_sets (List[Union[Tuple[Operation, Operation], Tuple[Operation, Operation, Operation]]]) – List of CLS sets

Return type

List[Union[ndarray, Tuple[ndarray]]]

Returns

Scaling factors calculated and applied for each CLS set in order

+
+
+
+ +
+

+
+

API for High bias folding

+
+
+aimet_tensorflow.cross_layer_equalization.HighBiasFold.bias_fold(sess, folded_pairs, cls_set_info_list)
+

Folds bias values greater than 3 * sigma to next layer’s bias

+
+
Parameters

  • sess (Session) – Current session

  • folded_pairs (List[Tuple[Operation, Operation]]) – Key: Conv/Linear layer, Value: Corresponding folded BN layer

  • cls_set_info_list (List[ClsSetInfo]) – List of info elements for each cls set

Return type

Session

Returns

updated session after graph updates from hbf

+
+
+
+ +
+

+
+
+
+

Custom Datatype used

+
+
+class aimet_tensorflow.cross_layer_equalization.ClsSetInfo(cls_pair_1, cls_pair_2=None)[source]
+

This class holds information about the layers in a CLS set, along with the corresponding scaling factors for the CLS set layers

+class ClsSetLayerPairInfo(layer1, layer2, scale_factor, relu_activation_between_layers)[source]

Models a pair of layers that were scaled using CLS, along with related information.

Parameters

  • layer1 (Operation) – layer as tf.Operation

  • layer2 (Operation) – layer as tf.Operation

  • scale_factor (ndarray) – scale factors as np.ndarray

  • relu_activation_between_layers – list of flags per layer set indicating if they have Relu activations in-between.
+
+
+
+ +
+
+static map_cls_sets_to_new_session(tf_names_op_dict, cls_set_info_list)[source]
+
+

Helper function to update ops stored during CLS so they can be used by high bias fold with the updated session.

Parameters

  • tf_names_op_dict (Dict[str, Operation]) – map of tf op names to ops

  • cls_set_info_list – list of ClsSetInfo type

Returns

None; cls_set_info_list is updated in-place

+
+
+
+ +
+ +
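For reference, here is a minimal sketch (not taken from the AIMET examples; the op names and the scale-factor array are placeholders) of how a ClsSetInfo could be assembled manually for a pair of convolution layers that were scaled together:

import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50
from aimet_tensorflow.cross_layer_equalization import ClsSetInfo

tf.keras.backend.clear_session()
_ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
sess = tf.compat.v1.keras.backend.get_session()

# Two consecutive conv ops assumed to form one CLS set (op names are placeholders)
conv_a = sess.graph.get_operation_by_name('res2a_branch2a/Conv2D')
conv_b = sess.graph.get_operation_by_name('res2a_branch2b/Conv2D')

# Placeholder scale factors; in practice use the values returned by CrossLayerScaling.scale_cls_sets
scale_factor = np.ones(64)

pair_info = ClsSetInfo.ClsSetLayerPairInfo(conv_a, conv_b, scale_factor, True)
cls_set_info = ClsSetInfo(cls_pair_1=pair_info)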
+

+
+
+
+

Code Example for Lower level APIs

+

Required imports

+
import tensorflow as tf
+
+from tensorflow.keras.applications.resnet50 import ResNet50
+
+from aimet_tensorflow.cross_layer_equalization import GraphSearchUtils, CrossLayerScaling, HighBiasFold
+from aimet_tensorflow.batch_norm_fold import fold_given_batch_norms
+from aimet_tensorflow.utils.graph_saver import save_and_load_graph
+from aimet_tensorflow.utils.op.conv import BiasUtils
+
+
+

Perform Cross Layer Equalization in manual mode

+
def cross_layer_equalization_manual():
+    """ perform cross layer equalization using manual api"""
+
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    with sess.as_default():
+        # Batch Norm Fold
+        # pick potential pairs of conv and bn ops for fold
+        layer_pairs = get_layer_pairs_Resnet50_for_folding(sess)
+
+        # fold given layer
+        after_fold_sess = fold_given_batch_norms(sess=sess, input_op_names="input_1", output_op_names="fc1000/Softmax",
+                                                 layer_pairs=layer_pairs)
+
+        # replace any ReLU6 layers with ReLU
+        graph_search = GraphSearchUtils(after_fold_sess.graph, "input_1", "fc1000/Softmax")
+        after_relu_replace_sess = graph_search.find_and_replace_relu6_with_relu(after_fold_sess)
+
+        # Cross Layer Scaling
+        # Create a list of consecutive conv layers to be equalized
+        consecutive_layer_list = get_consecutive_layer_list_from_resnet50_for_scaling(after_relu_replace_sess)
+
+        # invoke api to perform scaling on given list of cls pairs
+        scaling_factor_list = CrossLayerScaling.scale_cls_sets(after_relu_replace_sess, consecutive_layer_list)
+
+        # get info from bn fold and cross layer scaling in format required for high bias fold
+        after_cls_sess, folded_pairs, cls_set_info_list = format_info_for_high_bias_fold(after_relu_replace_sess,
+                                                                                         layer_pairs,
+                                                                                         consecutive_layer_list,
+                                                                                         scaling_factor_list)
+
+        # perform high-bias fold
+        after_hbf_sess = HighBiasFold.bias_fold(after_cls_sess, folded_pairs, cls_set_info_list)
+    sess.close()
+
+
+
+
+

Example helper methods to perform CLE in manual mode

+

Helper to pick layers for batchnorm fold

+
def get_layer_pairs_Resnet50_for_folding(sess: tf.compat.v1.Session):
+    """
+    Helper function to pick example conv-batchnorm layer pairs for folding.
+    :param sess: tensorflow session as tf.compat.v1.Session
+    :return: pairs of conv and batchnorm layers for batch norm folding in Resnet50 model.
+    """
+
+    # pick conv and bn op pairs
+    conv_op_1 = sess.graph.get_operation_by_name('res2a_branch2a/Conv2D')
+    bn_op_1 = sess.graph.get_operation_by_name('bn2a_branch2a/cond/FusedBatchNorm_1')
+
+    conv_op_2 = sess.graph.get_operation_by_name('res2a_branch2b/Conv2D')
+    bn_op_2 = sess.graph.get_operation_by_name('bn2a_branch2b/cond/FusedBatchNorm_1')
+
+    conv_op_3 = sess.graph.get_operation_by_name('res2a_branch2c/Conv2D')
+    bn_op_3 = sess.graph.get_operation_by_name('bn2a_branch2c/cond/FusedBatchNorm_1')
+
+    # make a layer pair list with the potential conv op and bn op pairs along with a flag
+    # to indicate if the given bn op can be folded upstream or downstream.
+    # an example of three pairs of conv and bn ops is shown below
+    layer_pairs = [(conv_op_1, bn_op_1, True),
+                   (conv_op_2, bn_op_2, True),
+                   (conv_op_3, bn_op_3, True)]
+
+    return layer_pairs
+
+
+

Helper to pick layers for cross layer scaling

+
def get_consecutive_layer_list_from_resnet50_for_scaling(sess: tf.compat.v1.Session):
+    """
+    helper function to pick example consecutive layer list for scaling.
+    :param sess: tf.compat.v1.Session
+    :return: sample layers for scaling as consecutive_layer_list from Resnet50 model
+    """
+    conv1_op = sess.graph.get_operation_by_name('res2a_branch2a/Conv2D')
+    conv1_depthwise_op = sess.graph.get_operation_by_name('res2a_branch2b/Conv2D')
+    conv1_pointwise_op = sess.graph.get_operation_by_name('res2a_branch2c/Conv2D')
+
+    # conv layers for scaling (after bn fold)
+    consecutive_layer_list = [(conv1_op, conv1_depthwise_op, conv1_pointwise_op)]
+
+    return consecutive_layer_list
+
+
+

Helper to format data from batchnorm fold and cross layer scaling for usage by high bias fold

+
def format_info_for_high_bias_fold(sess, layer_pairs, consecutive_layer_list, scaling_factor_list):
+    """
+     Helper function that formats data from cross layer scaling and bn fold for usage by high bias fold.
+    :param sess: tf.compat.v1.Session type
+    :param layer_pairs: info obtained after batchnorm fold.
+    :param consecutive_layer_list: info obtained after cross layer scaling
+    :param scaling_factor_list: scaling params corresponding to consecutive_layer_list
+    :return: data formatted for high bias fold.
+    """
+
+    # convert info after batch norm fold and cross layer scaling for usage by high bias fold api
+    folded_pairs = []
+    for (conv_op, bn_op_with_meta, _fold_upstream_flag) in layer_pairs:
+        folded_pairs.append((conv_op, bn_op_with_meta.op))
+
+    # List that holds a boolean indicating whether there were relu activations between layers of each cross layer scaling set
+    is_relu_activation_in_cls_sets = []
+    # Note the user is expected to fill in this list manually
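+    # For example (hypothetical values; the exact contents depend on the model), with the single
+    # CLS set of three conv layers defined above, one could fill it as:
+    #     is_relu_activation_in_cls_sets = [(True, True)]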
+
+    # Convert to a list of cls-set-info elements
+    cls_set_info_list = CrossLayerScaling.create_cls_set_info_list(consecutive_layer_list,
+                                                                   scaling_factor_list,
+                                                                   is_relu_activation_in_cls_sets)
+
+    # load and save the updated graph after scaling
+    after_cls_sess = save_and_load_graph('./temp_cls', sess)
+
+    return after_cls_sess, folded_pairs, cls_set_info_list
+
+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_quant_analyzer.html b/releases/1.32.2/api_docs/tensorflow_quant_analyzer.html new file mode 100644 index 00000000..61915108 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_quant_analyzer.html @@ -0,0 +1,1298 @@ + + + + + + AIMET Tensorflow Quant Analyzer API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Tensorflow Quant Analyzer API

+

AIMET Tensorflow Quant Analyzer analyzes the Tensorflow model and points out ops in the model that are sensitive to quantization. It checks model sensitivity to weight and activation quantization, and performs per-op sensitivity and MSE analysis. It also exports per-op encoding min and max ranges and a statistics histogram for every op.

+
+

Top-level API

+
+
+class aimet_tensorflow.quant_analyzer.QuantAnalyzer(session, start_op_names, output_op_names, forward_pass_callback, eval_callback, use_cuda=True)[source]
+
+
QuantAnalyzer tool provides

  1. Model sensitivity to weight and activation quantization

  2. Per layer encoding (min - max range) and PDF analysis

  3. Per op sensitivity analysis

  4. Per op MSE analysis
+
+
+
+
Parameters
+
    +
  • session (Session) – The input model as session to add quantize ops to

  • +
  • start_op_names (List[str]) – List of starting op names of the model

  • +
  • output_op_names (List[str]) – List of output op names of the model

  • +
  • forward_pass_callback (CallbackFunc) – A callback function that is expected to run forward passes on a session. +This callback function should use representative data for the forward pass, so the calculated +encodings work for all data samples. This callback internally chooses the number of data samples +it wants to use for calculating encodings.

  • +
  • eval_callback (CallbackFunc) – A callback function for model evaluation that determines model +performance. This callback function is expected to return scalar value +representing the model performance evaluated against entire test/evaluation dataset.

  • +
  • use_cuda (bool) – If True, places quantization ops on GPU. Defaults to True

  • +
+
+
+
+
+analyze(quant_scheme=QuantScheme.post_training_tf_enhanced, rounding_mode='nearest', default_param_bw=8, default_output_bw=8, config_file=None, unlabeled_dataset=None, num_batches=None, results_dir='./tmp/')[source]
+
+
Analyze model for quantization and point out sensitive parts/hotspots of the model by performing

  1. model sensitivity to quantization

  2. export per layer encoding (min - max range)

  3. export per layer statistics histogram (PDF) when quant scheme is TF-Enhanced

  4. perform per op sensitivity analysis by enabling and disabling quant ops

  5. per op MSE loss between fp32 and quantized output activations
+
+
+
+
Parameters
+
    +
  • quant_scheme (QuantScheme) – Quantization Scheme, currently supported schemes are post_training_tf and +post_training_tf_enhanced, defaults to post_training_tf_enhanced

  • +
  • rounding_mode (str) – The rounding scheme to use. One of: ‘nearest’ or ‘stochastic’, defaults to ‘nearest’

  • +
  • default_param_bw (int) – bitwidth to use for parameter tensors, defaults to 8

  • +
  • default_output_bw (int) – bitwidth to use for activation tensors, defaults to 8

  • +
  • config_file (Optional[str]) – Path to a config file to use to specify rules for placing quant ops in the model

  • +
  • results_dir (str) – Directory to save the results.

  • +
  • unlabeled_dataset (Optional[DatasetV1]) – Unlabeled TF dataset +Used in per op MSE loss calculation

  • +
  • num_batches (Optional[int]) – Number of batches. Approximately 256 samples/images are recommended, so if the batch size of the data loader is 64, then 4 batches leads to 256 samples/images. Used in per op MSE loss calculation.

  • +
+
+
+
+ +
+ +
+
+

Code Example

+

Required imports

+
from typing import Any
+import numpy as np
+import tensorflow.compat.v1 as tf
+from aimet_common.defs import QuantScheme
+from aimet_tensorflow.quant_analyzer import QuantAnalyzer, CallbackFunc
+# Below import is required just for eval_callback utility
+# User can have their own implementation
+from Examples.tensorflow.utils.image_net_evaluator import ImageNetEvaluator
+
+
+

Prepare forward pass callback

+
def forward_pass_callback(session: tf.compat.v1.Session, _: Any = None) -> None:
+    """
+    NOTE: This is intended to be the user-defined model calibration function.
+    AIMET requires the above signature. So if the user's calibration function does not
+    match this signature, please create a simple wrapper around this callback function.
+
+    A callback function for model calibration that simply runs forward passes on the model to
+    compute encoding (delta/offset). This callback function should use representative data and should
+    be subset of entire train/validation dataset (~1000 images/samples).
+
+    :param session: Tensorflow session.
+    :param _: Argument(s) of this callback function. Up to the user to determine the type of this parameter.
+    E.g. could be simply an integer representing the number of data samples to use. Or could be a tuple of
+    parameters or an object representing something more complex.
+    """
+
+
+

Prepare eval callback

+
def eval_callback(session: tf.compat.v1.Session, _: Any = None) -> float:
+    """
+    NOTE: This is intended to be the user-defined model evaluation function.
+    AIMET requires the above signature. So if the user's calibration function does not
+    match this signature, please create a simple wrapper around this callback function.
+
+    A callback function for model evaluation that determines model performance. This callback function is
+    expected to return scalar value representing the model performance evaluated against entire
+    test/evaluation dataset.
+
+    :param session: Tensorflow session.
+    :param _: Argument(s) of this callback function. Up to the user to determine the type of this parameter.
+    E.g. could be simply an integer representing the number of data samples to use. Or could be a tuple of
+    parameters or an object representing something more complex.
+    :return: Scalar value representing the model performance.
+    """
+    # User action required
+    # User should create data loader/iterable using entire test/evaluation dataset, perform forward passes on
+    # the model and return single scalar value representing the model performance.
+    evaluator = ImageNetEvaluator('/path/to/tfrecords dataset', training_inputs=['keras_learning_phase:0'],
+                                  data_inputs=['input_1:0'], validation_inputs=['labels:0'],
+                                  image_size=224,
+                                  batch_size=32,
+                                  format_bgr=True)
+    return evaluator.evaluate(session, iterations=None)
+
+
+

Create session

+
    # User action required
+    # User should create a tf session from the model which has to be analyzed
+    session = tf.compat.v1.Session()
+    # User has to define start_op_names and output_op_names of model
+    start_op_names = ['model_start_op_name']
+    output_op_names = ['model_output_op_name']
+    # User action required
+    # User should pass actual argument(s) of the callback functions.
+    forward_pass_callback_fn = CallbackFunc(forward_pass_callback, func_callback_args=None)
+    eval_callback_fn = CallbackFunc(eval_callback, func_callback_args=None)
+
+
+

Create QuantAnalyzer object

+
    quant_analyzer = QuantAnalyzer(session, start_op_names=start_op_names, output_op_names=output_op_names,
+                                   forward_pass_callback=forward_pass_callback_fn, eval_callback=eval_callback_fn, use_cuda= False)
+
+
+

Create unlabeled dataset and define num_batches

+
    # Create unlabeled dataset and define num_batches to perform per op mse analysis
+    # Approximately 256 images/samples are recommended for MSE loss analysis. So, if the dataloader
+    # has a batch_size of 64, then 4 batches leads to 256 images/samples.
+    # User action required
+    # User should use an unlabeled dataloader; if the dataloader yields labels as well, the user should discard them.
+    dataset_size = 128
+    input_data = np.random.rand(dataset_size, 224, 224, 3)
+    dataset = tf.data.Dataset.from_tensor_slices(input_data)
+    batch_size = 32
+    unlabeled_dataset = dataset.batch(batch_size=batch_size)
+    num_batches = 4
+
+
+

Run QuantAnalyzer

+
    quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_param_bw=8,
+                           default_output_bw=8,
+                           config_file=None,
+                           unlabeled_dataset=unlabeled_dataset,
+                           num_batches=num_batches,
+                           results_dir="./quant_analyzer_results/")
+
+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_quantization.html b/releases/1.32.2/api_docs/tensorflow_quantization.html new file mode 100644 index 00000000..38437071 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_quantization.html @@ -0,0 +1,1147 @@ + + + + + + AIMET TensorFlow Quantization APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET TensorFlow Quantization APIs

+
+
AIMET Quantization for TensorFlow provides the following functionality
    +
  • Quantization Simulation: Allows ability to simulate inference and training on quantized hardware

  • +
  • QuantAnalyzer: Analyzes the model and points out sensitive ops to quantization

  • +
  • Adaptive Rounding: Post-training quantization technique to optimize rounding of weight tensors

  • +
  • Cross-Layer Equalization: Post-training quantization technique to equalize layer parameters

  • +
  • Bias Correction: Post-training quantization technique to correct shift in layer outputs due to quantization noise

  • +
  • AutoQuant API: Unified API that integrates the post-training quantization techniques provided by AIMET

  • +
  • BN Re-estimation APIs: APIs that Re-estimate BN layers’ statistics and fold the BN layers

  • +
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_quantsim.html b/releases/1.32.2/api_docs/tensorflow_quantsim.html new file mode 100644 index 00000000..ebe79f1e --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_quantsim.html @@ -0,0 +1,1386 @@ + + + + + + AIMET TensorFlow Quantization SIM API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET TensorFlow Quantization SIM API

+ + +
+

Top-level API

+
+
+class aimet_tensorflow.quantsim.QuantizationSimModel(session, starting_op_names, output_op_names, quant_scheme='tf_enhanced', rounding_mode='nearest', default_output_bw=8, default_param_bw=8, use_cuda=True, config_file=None, default_data_type=QuantizationDataType.int)[source]
+

Creates a QuantSim model by adding quantization simulation ops to a given model.

This enables

  1. off-target simulation of inference accuracy

  2. the model to be fine-tuned to counter the effects of quantization
+
+
Parameters
+
    +
  • session (Session) – The input model as session to add quantize ops to

  • +
  • starting_op_names (List[str]) – List of starting op names of the model

  • +
  • output_op_names (List[str]) – List of output op names of the model

  • +
  • quant_scheme (Union[str, QuantScheme]) – Quantization Scheme, currently supported schemes are post_training_tf and +post_training_tf_enhanced, defaults to post_training_tf_enhanced

  • +
  • rounding_mode (str) – The rounding scheme to use. One of: ‘nearest’ or ‘stochastic’, defaults to ‘nearest’.

  • +
  • default_output_bw (int) – bitwidth to use for activation tensors, defaults to 8

  • +
  • default_param_bw (int) – bitwidth to use for parameter tensors, defaults to 8

  • +
  • use_cuda (bool) – If True, places quantization ops on GPU. Defaults to True

  • +
  • config_file (Optional[str]) – Path to a config file to use to specify rules for placing quant ops in the model

  • +
  • default_data_type (QuantizationDataType) – Default data type to use for quantizing all layer parameters. +Possible options are QuantizationDataType.int and QuantizationDataType.float. +Note that the mode default_data_type=QuantizationDataType.float is only supported with +default_output_bw=16 and default_param_bw=16

  • +
+
+
Returns
+

An object which can be used to perform quantization on a tensorflow graph

+
+
Raises
+

ValueError: An error occurred processing one of the input parameters.

+
+
+
+ +
+

+
+
+
Note about Quantization Schemes: AIMET offers multiple Quantization Schemes -

  1. Post Training Quantization - The encodings of the model are computed using the TF or TF-Enhanced scheme

  2. Trainable Quantization - The min and max of the encodings are learnt during training.

    • Range Learning with TF initialization - Uses the TF scheme to initialize the encodings; during training these encodings are then fine-tuned to improve the accuracy of the model

    • Range Learning with TF-Enhanced initialization - Uses the TF-Enhanced scheme to initialize the encodings; during training these encodings are then fine-tuned to improve the accuracy of the model
+
+
+

The following API can be used to Compute Encodings for Model

+
+
+QuantizationSimModel.compute_encodings(forward_pass_callback, forward_pass_callback_args)[source]
+

Computes encodings for all quantization sim nodes in the model. +This is also used to set initial encodings for Range Learning.

+
+
Parameters
+
    +
  • forward_pass_callback (Callable[[Session, Any], None]) – A callback function that is expected to run forward passes on a session. This callback function should use representative data for the forward pass, so the calculated encodings work for all data samples. This callback internally chooses the number of data samples it wants to use for calculating encodings.

  • +
  • forward_pass_callback_args – These argument(s) are passed to the forward_pass_callback as-is. Up to +the user to determine the type of this parameter. E.g. could be simply an integer representing the number +of data samples to use. Or could be a tuple of parameters or an object representing something more +complex.

  • +
+
+
Returns
+

None

+
+
+
+ +
+

+
+

The following API can be used to Export the Model to target

+
+
+QuantizationSimModel.export(path, filename_prefix, orig_sess=None)[source]
+

This method exports out the quant-sim model so it is ready to be run on-target.

+

Specifically, the following are saved

  1. The sim-model is exported to a regular tensorflow meta/checkpoint without any simulation ops

  2. The quantization encodings are exported to a separate JSON-formatted file that can then be imported by the on-target runtime (if desired)
+
+
Parameters
+
    +
  • path (str) – path where to store model pth and encodings

  • +
  • filename_prefix (str) – Prefix to use for filenames of the model pth and encodings files

  • +
  • orig_sess (Optional[Session]) – optional param to pass in original session without quant nodes for export

  • +
+
+
Returns
+

None

+
+
+
+ +
+
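For reference, a minimal sketch (assumed usage; sim is a QuantizationSimModel created as in the code examples below, and the path and filename prefix are placeholders) of exporting the model:

# 'sim' is an existing QuantizationSimModel; the arguments below are placeholders
sim.export(path='./exported_model/', filename_prefix='mnist_quantized')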

+
+

Encoding format is described in the Quantization Encoding Specification

+
+

+
+
+
+

Code Examples

+

Required imports

+
import tensorflow as tf
+
+# Import the tensorflow quantisim
+from aimet_tensorflow import quantsim
+from aimet_tensorflow.common import graph_eval
+from aimet_tensorflow.utils import graph_saver
+from aimet_common.defs import QuantScheme
+from tensorflow.examples.tutorials.mnist import input_data
+
+
+

User should write this function to pass calibration data

+
def pass_calibration_data(session: tf.Session):
+    """
+    The User of the QuantizationSimModel API is expected to write this function based on their data set.
+    This is not a working function and is provided only as a guideline.
+
+    :param session: Model's session
+    :return:
+    """
+
+    # User action required
+    # The following line of code is an example of how to use the ImageNet data's validation data loader.
+    # Replace the following line with your own dataset's validation data loader.
+    data_loader = None  # Your Dataset's data loader
+
+    # User action required
+    # For computing the activation encodings, around 1000 unlabelled data samples are required.
+    # Edit the following 2 lines based on your dataloader's batch size.
+    # batch_size * max_batch_counter should be 1024
+    batch_size = 64
+    max_batch_counter = 16
+
+    input_tensor = None  # input tensor in session
+    train_tensor = None  # train tensor in session
+
+    current_batch_counter = 0
+    for input_data, _ in data_loader:
+        feed_dict = {input_tensor: input_data,
+                     train_tensor: False}
+
+        session.run([], feed_dict=feed_dict)
+
+        current_batch_counter += 1
+        if current_batch_counter == max_batch_counter:
+            break
+
+
+

Quantize the model and finetune (QAT)

+
def quantize_model():
+    """
+    Create the Quantization Simulation and finetune the model.
+    :return:
+    """
+    tf.compat.v1.reset_default_graph()
+
+    # load graph
+    sess = graph_saver.load_model_from_meta('models/mnist_save.meta', 'models/mnist_save')
+
+    # Create quantsim model to quantize the network using the default 8 bit params/activations
+    sim = quantsim.QuantizationSimModel(sess, starting_op_names=['reshape_input'], output_op_names=['dense_1/BiasAdd'],
+                                        quant_scheme=QuantScheme.post_training_tf_enhanced,
+                                        config_file='../../../TrainingExtensions/common/src/python/aimet_common/'
+                                                    'quantsim_config/default_config.json')
+
+    # Compute encodings
+    sim.compute_encodings(pass_calibration_data, forward_pass_callback_args=None)
+
+    # Do some finetuning
+
+    # User action required
+    # The following line of code illustrates that the model is getting finetuned.
+    # Replace the following train() function with your pipeline's train() function.
+    train(sim)
+
+
+

Quantize and finetune a trained model to learn the encodings (Range Learning)

+
def quantization_aware_training_range_learning():
+    """
+    Running Quantize Range Learning Test
+    """
+    tf.reset_default_graph()
+
+    # Allocate the generator you wish to use to provide the network with data
+    parser2 = tf_gen.MnistParser(batch_size=100, data_inputs=['reshape_input'])
+    generator = tf_gen.TfRecordGenerator(tfrecords=[os.path.join('data', 'mnist', 'validation.tfrecords')],
+                                         parser=parser2)
+
+    sess = graph_saver.load_model_from_meta('models/mnist_save.meta', 'models/mnist_save')
+
+    # Create quantsim model to quantize the network using the default 8 bit params/activations
+    # quant scheme set to range learning
+    sim = quantsim.QuantizationSimModel(sess, ['reshape_input'], ['dense_1/BiasAdd'],
+                                        quant_scheme=QuantScheme.training_range_learning_with_tf_init)
+
+    # Initialize the model with encodings
+    sim.compute_encodings(pass_calibration_data, forward_pass_callback_args=None)
+
+    # Train the model to fine-tune the encodings
+    g = sim.session.graph
+    sess = sim.session
+
+    with g.as_default():
+
+        parser2 = tf_gen.MnistParser(batch_size=100, data_inputs=['reshape_input'])
+        generator2 = tf_gen.TfRecordGenerator(tfrecords=['data/mnist/validation.tfrecords'], parser=parser2)
+        cross_entropy = g.get_operation_by_name('xent')
+        train_step = g.get_operation_by_name("Adam")
+
+        # do training: learn weights and architecture simultaneously
+        x = sim.session.graph.get_tensor_by_name("reshape_input:0")
+        y = g.get_tensor_by_name("labels:0")
+        fc1_w = g.get_tensor_by_name("dense_1/MatMul/ReadVariableOp:0")
+
+        perf = graph_eval.evaluate_graph(sess, generator2, ['accuracy'], graph_eval.default_eval_func, 1)
+        print('Quantized performance: ' + str(perf * 100))
+
+        ce = g.get_tensor_by_name("xent:0")
+        train_step = tf.train.AdamOptimizer(1e-3, name="TempAdam").minimize(ce)
+        graph_eval.initialize_uninitialized_vars(sess)
+        mnist = input_data.read_data_sets('./data', one_hot=True)
+
+        for i in range(100):
+            batch = mnist.train.next_batch(50)
+            sess.run([train_step, fc1_w], feed_dict={x: batch[0], y: batch[1]})
+            if i % 10 == 0:
+                perf = graph_eval.evaluate_graph(sess, generator2, ['accuracy'], graph_eval.default_eval_func, 1)
+                print('Quantized performance: ' + str(perf * 100))
+
+    # close session
+    sess.close()
+
+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/tensorflow_visualization_quantization.html b/releases/1.32.2/api_docs/tensorflow_visualization_quantization.html new file mode 100644 index 00000000..6e61d304 --- /dev/null +++ b/releases/1.32.2/api_docs/tensorflow_visualization_quantization.html @@ -0,0 +1,1225 @@ + + + + + + AIMET Visualization for Quantization for TensorFlow API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Visualization for Quantization for TensorFlow API

+
+

Top-level API for Visualization of Weight tensors

+
+
+aimet_tensorflow.plotting_utils.visualize_weight_ranges_single_layer(sess, layer, results_dir)[source]
+

Given a layer, visualizes weight ranges with scatter plots and line plots

+
+
Parameters
+
    +
  • sess – tf.compat.v1.Session

  • +
  • layer – layer with weights

  • +
  • results_dir – Directory to save the Bokeh plots

  • +
+
+
Returns
+

Bokeh plot

+
+
+
+ +
+

+
+
+
+aimet_tensorflow.plotting_utils.visualize_relative_weight_ranges_single_layer(sess, layer, results_dir)[source]
+

Publishes a line plot showing weight ranges for each layer, summary statistics +for relative weight ranges, and a histogram showing weight ranges of output channels

+
+
Parameters
+
    +
  • sess – tf.compat.v1.Session

  • +
  • layer – layer with weights

  • +
  • results_dir – Directory to save the Bokeh plots

  • +
+
+
Returns
+

bokeh plot

+
+
+
+ +
+

+
+
+
+

Code Examples for Visualization of Weight tensors

+

Required imports

+
import tensorflow as tf
+from tensorflow.keras.applications.resnet50 import ResNet50
+from aimet_tensorflow import plotting_utils
+
+
+

Visualizing weight ranges for layer

+
def visualizing_weight_ranges_for_single_layer():
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    results_dir = 'artifacts'
+
+    with sess.as_default():
+        # Getting a layer for visualizing its weight ranges
+        conv_op = sess.graph.get_operation_by_name('conv1_conv/Conv2D')
+
+        plotting_utils.visualize_weight_ranges_single_layer(sess=sess, layer=conv_op, results_dir=results_dir)
+    sess.close()
+
+
+

Visualizing Relative weight ranges for layer

+
def visualizing_relative_weight_ranges_for_single_layer():
+    # load a model
+    tf.keras.backend.clear_session()
+    _ = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
+    sess = tf.compat.v1.keras.backend.get_session()
+
+    results_dir = 'artifacts'
+
+    with sess.as_default():
+        # Getting a layer for visualizing its weight ranges
+        conv_op = sess.graph.get_operation_by_name('conv1_conv/Conv2D')
+
+        plotting_utils.visualize_relative_weight_ranges_single_layer(sess=sess, layer=conv_op,
+                                                                     results_dir=results_dir)
+    sess.close()
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch.html b/releases/1.32.2/api_docs/torch.html new file mode 100644 index 00000000..077e05db --- /dev/null +++ b/releases/1.32.2/api_docs/torch.html @@ -0,0 +1,1143 @@ + + + + + + AIMET PyTorch APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ + + +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_adaround.html b/releases/1.32.2/api_docs/torch_adaround.html new file mode 100644 index 00000000..1657d0ca --- /dev/null +++ b/releases/1.32.2/api_docs/torch_adaround.html @@ -0,0 +1,1404 @@ + + + + + + AIMET PyTorch AdaRound API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET PyTorch AdaRound API

+ + +
+

Top-level API

+
+
+aimet_torch.adaround.adaround_weight.Adaround.apply_adaround(model, dummy_input, params, path, filename_prefix, default_param_bw=4, param_bw_override_list=None, ignore_quant_ops_list=None, default_quant_scheme=QuantScheme.post_training_tf_enhanced, default_config_file=None)
+

Returns model with optimized weight rounding of every module (Conv and Linear) and also saves the +corresponding quantization encodings to a separate JSON-formatted file that can then be imported by +QuantSim for inference or QAT

+
+
Parameters
+
    +
  • model (Module) – Model to Adaround

  • +
  • dummy_input (Union[Tensor, Tuple]) – Dummy input to the model. Used to parse model graph. If the model has more than one input, +pass a tuple. User is expected to place the tensors on the appropriate device.

  • +
  • params (AdaroundParameters) – Parameters for Adaround

  • +
  • path (str) – path where to store parameter encodings

  • +
  • filename_prefix (str) – Prefix to use for filename of the encodings file

  • +
  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters

  • +
  • param_bw_override_list (Optional[List[Tuple[Module, int]]]) – List of Tuples. Each Tuple is a module and the corresponding parameter bitwidth +to be used for that module.

  • +
  • ignore_quant_ops_list (Optional[List[Module]]) – Ops listed here are skipped during quantization needed for AdaRounding. Do not +specify Conv and Linear modules in this list. Doing so, will affect accuracy.

  • +
  • default_quant_scheme (QuantScheme) – Quantization scheme. Supported options are using Quant Scheme Enum +QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced

  • +
  • default_config_file (Optional[str]) – Default configuration file for model quantizers

  • +
+
+
Return type
+

Module

+
+
Returns
+

Model with Adarounded weights and saves corresponding parameter encodings JSON file at provided path

+
+
+
+ +
+
+

Adaround Parameters

+
+
+class aimet_torch.adaround.adaround_weight.AdaroundParameters(data_loader, num_batches, default_num_iterations=None, default_reg_param=0.01, default_beta_range=(20, 2), default_warm_start=0.2, forward_fn=None)[source]
+

Configuration parameters for Adaround

+
+
Parameters
+
    +
  • data_loader (DataLoader) – Data loader

  • +
  • num_batches (int) – Number of batches to be used for Adaround. +A commonly recommended value for this parameter is the smaller value among (1) len(data_loader) and (2) ceil(2000/batch_size)

  • +
  • default_num_iterations (Optional[int]) – Number of iterations to adaround each layer. +The default value is 10K for models with 8- or higher bit weights, and 15K for models with lower than 8 bit weights.

  • +
  • default_reg_param (float) – Regularization parameter, trading off between rounding loss vs reconstruction loss. +Default 0.01

  • +
  • default_beta_range (Tuple) – Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta). +Default (20, 2)

  • +
  • default_warm_start (float) – warm up period, during which rounding loss has zero effect. Default 20% (0.2)

  • +
  • forward_fn (Optional[Callable[[Module, Any], Any]]) – Optional adapter function that performs forward pass given a model and inputs +yielded from the data loader. The function expects model as first argument and inputs to model +as second argument.

  • +
+
+
+
+ +
+
+

Enum Definition

+

Quant Scheme Enum

+
+
+class aimet_common.defs.QuantScheme(value)[source]
+

Enumeration of Quant schemes

+
+
+post_training_percentile = 6
+

For a Tensor, adjusted minimum and maximum values are selected based on the percentile value passed. +The Quantization encodings are calculated using the adjusted minimum and maximum value.

+
+ +
+
+post_training_tf = 1
+

For a Tensor, the absolute minimum and maximum value of the Tensor are used to compute the Quantization +encodings.

+
+ +
+
+post_training_tf_enhanced = 2
+

For a Tensor, searches and selects the optimal minimum and maximum value that minimizes the Quantization Noise. +The Quantization encodings are calculated using the selected minimum and maximum value.

+
+ +
+
+training_range_learning_with_tf_enhanced_init = 4
+

For a Tensor, the encoding values are initialized with the post_training_tf_enhanced scheme. Then, the encodings +are learned during training.

+
+ +
+
+training_range_learning_with_tf_init = 3
+

For a Tensor, the encoding values are initialized with the post_training_tf scheme. Then, the encodings are +learned during training.

+
+ +
+ +
+
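As a small illustration (assuming standard Python Enum semantics), a scheme can be referenced by member name or by the integer value listed above:

from aimet_common.defs import QuantScheme

scheme = QuantScheme.post_training_tf_enhanced
assert scheme is QuantScheme(2)                      # value 2, per the enum definition above
assert QuantScheme.post_training_tf.value == 1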

+
+
+
+

Code Example - Adaptive Rounding (AdaRound)

+

This example shows how to use AIMET to perform Adaptive Rounding (AdaRound).

+

Load the model

+

For this example, we are going to load a pretrained ResNet18 model from torchvision. Similarly, you can load any +pretrained PyTorch model instead.

+
    import torch
+    from torchvision import models
+
+    model = models.resnet18(pretrained=True).eval()
+
+
+
+

Prepare the model for Quantization simulation

+

AIMET quantization simulation requires the user’s model definition to follow certain guidelines. For example, functionals defined in the forward pass should be changed to equivalent torch.nn.Module instances. The AIMET user guide lists all these guidelines. The following ModelPreparer API uses the graph transformation feature available in PyTorch 1.9+ and automates the model definition changes required to comply with the above guidelines.

+

For more details, please refer: Model Preparer API:

+
    from aimet_torch.model_preparer import prepare_model
+    prepared_model = prepare_model(model)
+
+
+
+

Apply AdaRound

+

We can now apply AdaRound to this model.

+

Some of the parameters for AdaRound are described below

+
  • dataloader: AdaRound needs a dataloader to use data samples for the layer-by-layer optimization to learn the rounding vectors. Either a training or validation dataloader could be passed in.

  • num_batches: The number of batches used to evaluate the model while calculating the quantization encodings. Typically we want AdaRound to use around 2000 samples. So with a batch size of 32, this may translate to 64 batches. To speed up the execution here we are using a batch size of 1.

  • default_num_iterations: The number of iterations to adaround each layer. The default value is set to 10000 and we strongly recommend not reducing this number. But in this example we are using 32 to speed up the execution runtime.

  • +
+
    from aimet_common.defs import QuantScheme
+    from aimet_torch.quantsim import QuantizationSimModel
+    from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters
+
+    # User action required
+    # The following line of code is an example of how to use the ImageNet data's training data loader.
+    # Replace the following line with your own dataset's training data loader.
+    data_loader = ImageNetDataPipeline.get_train_dataloader()
+
+    params = AdaroundParameters(data_loader=data_loader, num_batches=4, default_num_iterations=32,
+                                default_reg_param=0.01, default_beta_range=(20, 2))
+
+    input_shape = (1, 3, 224, 224)
+    dummy_input = torch.randn(input_shape)
+
+    # Returns model with adarounded weights and their corresponding encodings
+    adarounded_model = Adaround.apply_adaround(prepared_model, dummy_input, params, path='./',
+                                               filename_prefix='resnet18', default_param_bw=4,
+                                               default_quant_scheme=QuantScheme.post_training_tf_enhanced,
+                                               default_config_file=None)
+
+
+
+

Create the Quantization Simulation Model

+

Now we use AdaRounded model and create a QuantizationSimModel. This basically means that AIMET will insert fake +quantization ops in the model graph and will configure them. A few of the parameters are explained here

+
  • default_param_bw: The QuantizationSimModel must be created with the same parameter bitwidth precision that was used in the apply_adaround() call.

  • Freezing the parameter encodings: After creating the QuantizationSimModel, the set_and_freeze_param_encodings() API must be called before calling the compute_encodings() API. While applying AdaRound, the parameter values have been rounded up or down based on the initial encodings created internally. For Quantization Simulation accuracy, it is important to freeze these encodings. If the parameter encodings are NOT frozen, the call to compute_encodings() will alter the values of the parameter encodings and Quantization Simulation accuracy will not be correct.
+
    # Use the same parameter bitwidth and quant scheme that were used for apply_adaround() above
+    # (an output bitwidth of 8 is assumed here for activations)
+    quant_scheme = QuantScheme.post_training_tf_enhanced
+    param_bw = 4
+    output_bw = 8
+    sim = QuantizationSimModel(adarounded_model, quant_scheme=quant_scheme, default_param_bw=param_bw,
+                               default_output_bw=output_bw, dummy_input=dummy_input)
+
+    # Set and freeze encodings to use same quantization grid and then invoke compute encodings
+    sim.set_and_freeze_param_encodings(encoding_path='./resnet18.encodings')
+
+
+
+
+

An example User created function that is called back from compute_encodings()

+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can +use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each +‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect +range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is +sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple - use the existing +train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any +loss metric etc. So we can just ignore the model output for this purpose. A few pointers regarding the data samples

+

In practice, we need a very small percentage of the overall data samples for computing encodings. For example, +the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.

+

It may be beneficial if the samples used for computing encoding are well distributed. It’s not necessary that all +classes need to be covered etc. since we are only looking at the range of values at every layer activation. However, +we definitely want to avoid an extreme scenario like all ‘dark’ or ‘light’ samples are used - e.g. only using pictures +captured at night might not give ideal results.

+
def pass_calibration_data(sim_model):
+    """
+    The User of the QuantizationSimModel API is expected to write this function based on their data set.
+    This is not a working function and is provided only as a guideline.
+
+    :param sim_model:
+    :return:
+    """
+
+    # User action required
+    # The following line is an example of how to use the ImageNet data's validation data loader.
+    # Replace the following line with your own dataset's validation data loader.
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+
+    # User action required
+    # For computing the activation encodings, around 1000 unlabelled data samples are required.
+    # Edit the following 2 lines based on your batch size.
+    # batch_size * max_batch_counter should be 1024
+    batch_size = 64
+    max_batch_counter = 16
+
+    sim_model.eval()
+
+    current_batch_counter = 0
+    with torch.no_grad():
+        for input_data, target_data in data_loader:
+
+            inputs_batch = input_data  # labels are ignored
+            sim_model(inputs_batch)
+
+            current_batch_counter += 1
+            if current_batch_counter == max_batch_counter:
+                break
+
+
+

Compute the Quantization Encodings

+

Now we call AIMET to use the above routine to pass data through the model and then subsequently compute the quantization +encodings. Encodings here refer to scale/offset quantization parameters.

+
    sim.compute_encodings(pass_calibration_data, forward_pass_callback_args=None)
+
+
+
+

Determine Simulated Accuracy

+

Now the QuantizationSim model is ready to be used for inference. First we can pass this model to an evaluation routine. +The evaluation routine will now give us a simulated quantized accuracy score for INT8 quantization.

+
    accuracy = ImageNetDataPipeline.evaluate(sim.model, use_cuda)
+    print(accuracy)
+
+
+
+

Export the model

+

So we have an improved model after AdaRound. Now the next step would be to actually take this model to target. For this purpose, we need to export the model with the updated weights without the fake quant ops. We also need to export the encodings (scale/offset quantization parameters) that were updated during training since we employed QAT. AIMET QuantizationSimModel provides an export API for this purpose.

+
    # Export the model which saves pytorch model without any simulation nodes and saves encodings file for both
+    # activations and parameters in JSON format
+    sim.export(path='./', filename_prefix='quantized_resnet18', dummy_input=dummy_input.cpu())
+
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_architecture_checker.html b/releases/1.32.2/api_docs/torch_architecture_checker.html new file mode 100644 index 00000000..e1ee7f63 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_architecture_checker.html @@ -0,0 +1,1280 @@ + + + + + + Architecture Checker API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +

EXPERIMENTAL

+
+

Architecture Checker API

+
+
+aimet_torch.arch_checker.arch_checker.ArchChecker.check_model_arch(model, dummy_input, result_dir=None)
+

Check each node in the model using checks in _node_check_dict. Record only the nodes and failed tests.

+
+
Parameters
+
    +
  • model (Module) – Torch model to be checked.

  • +
  • dummy_input (Union[Tensor, Tuple]) – A dummy input to the model. Can be a Tensor or a Tuple of Tensors

  • +
+
+
Return arch_checker_report
+

{op.dotted_name_op: NodeErrorReportObject }

+
+
Return type
+

ArchCheckerReport

+
+
+
+ +

AIMET PyTorch Architecture Checker helps check for sub-optimal model constructs and provides potential options to update the model to be more performant. The architecture checker currently checks for the following conditions:

  • Convolution layers for optimal channel size.

  • Activation functions that are not performant.

  • Batch Normalization layers that cannot be folded.

  • Intermediate convolution layers in a sequence of convolution layers having padding.
+

In this section, we present models failing the architecture checks, and show how to run the architecture checker.

+

Example 1: Model with not enough channels

+

We begin with the following model, which contains a convolution layer with fewer than 32 channels.

+
class ModelWithNotEnoughChannels(torch.nn.Module):
+    """ Model that prelu module. Expects input of shape (1, 3, 32, 32) """
+
+    def __init__(self):
+        super(ModelWithNotEnoughChannels, self).__init__()
+        self.conv1 = torch.nn.Conv2d(3, 31, kernel_size=2, stride=2, padding=2, bias=False)
+        self.bn1 = torch.nn.BatchNorm2d(31)
+
+    def forward(self, *inputs):
+        x = self.conv1(inputs[0])
+        x = self.bn1(x)
+        return x
+
+
+

Import the architecture checker:

+
from aimet_torch.arch_checker.arch_checker import ArchChecker
+
+
+

Run the checker on the model by passing in the model as well as the model input:

+
def example_check_for_number_of_conv_channels():
+
+    model = ModelWithNotEnoughChannels()
+    ArchChecker.check_model_arch(model, dummy_input=torch.rand(1, 3, 32, 32))
+
+
+

Since the convolution layer in the model has 31 output channels (one fewer than 32), the following logger output will appear:

+
Utils - INFO - Graph/Node: ModelWithNotEnoughChannels.conv1: Conv2d(3, 31, kernel_size=(2, 2), stride=(2, 2), padding=(2, 2), bias=False) fails check: {'_check_conv_channel_32_base', '_check_conv_channel_larger_than_32'}
+
+
+

A HTML file with the following content is generated.

HTML report content

Graph/Layer_name: ModelWithNotEnoughChannels.conv1
Issue: The channel size of input/output tensor of this convolution is smaller than 32
Recommendation: Try adjusting the channels to multiple of 32 to get better performance.

Graph/Layer_name: ModelWithNotEnoughChannels.conv1
Issue: The channel size of input/output tensor of this convolution is not a multiple of 32
Recommendation: Try adjusting the channels to multiple of 32 to get better performance.
+
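The report location can be controlled with the optional result_dir argument of check_model_arch (shown in the signature above). A minimal sketch, where the directory name is a placeholder:

def example_check_and_save_report():

    model = ModelWithNotEnoughChannels()
    # result_dir is optional; assumed here to control where the generated HTML report is saved (path is a placeholder)
    arch_checker_report = ArchChecker.check_model_arch(model, dummy_input=torch.rand(1, 3, 32, 32),
                                                       result_dir='./arch_checker_results')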

Example 2: Model with non-performant activation

+

We begin with the following model, which contains a PReLU activation that is less performant than ReLU.

+
class ModelWithPrelu(torch.nn.Module):
+    """ Model that prelu module. Expects input of shape (1, 3, 32, 32) """
+
+    def __init__(self):
+        super(ModelWithPrelu, self).__init__()
+        self.conv1 = torch.nn.Conv2d(32, 32, kernel_size=2, stride=2, padding=2, bias=False)
+        self.bn1 = torch.nn.BatchNorm2d(32)
+        self.prelu1 = torch.nn.PReLU()
+
+    def forward(self, *inputs):
+        x = self.conv1(inputs[0])
+        x = self.bn1(x)
+        x = self.prelu1(x)
+        return x
+
+
+

Run the checker on the model by passing in the model as well as the model input:

+
def example_check_for_non_performant_activations():
+
+    model = ModelWithPrelu()
+    ArchChecker.check_model_arch(model, dummy_input=torch.rand(1, 32, 32, 32))
+
+
+

Since the PReLU layer in the model is considered non-performant compared to ReLU, the following logger output will appear:

+
Utils - INFO - Graph/Node: ModelWithPrelu.prelu1: PReLU(num_parameters=1) fails check: {'_activation_checks'}
+
+
+

Example 3: Model with standalone batch normalization layer

+

We begin with the following model, which contains a batch normalization layer that cannot be folded into the preceding convolution layer because of the average pooling layer in between.

+
class ModelWithNonfoldableBN(torch.nn.Module):
+    """ Model that has non-foldable batch norm. """
+
+    def __init__(self):
+        super(ModelWithNonfoldableBN, self).__init__()
+        self.conv1 = torch.nn.Conv2d(32, 32, kernel_size=2, stride=2, padding=2, bias=False)
+        self.avg_pool1 = torch.nn.AvgPool2d(3, padding=1)
+        self.bn1 = torch.nn.BatchNorm2d(32)
+
+    def forward(self, *inputs):
+        x = self.conv1(inputs[0])
+        x = self.avg_pool1(x)
+        x = self.bn1(x)
+        return x
+
+
+

Run the checker on the model by passing in the model as well as the model input:

+
def example_check_for_standalone_bn():
+
+    model = ModelWithNonfoldableBN()
+    ArchChecker.check_model_arch(model, dummy_input=torch.rand(1, 32, 32, 32))
+
+
+

Since the AveragePool layer prevents the BatchNormalization layer from being folded into the Convolution layer, the following logger output will appear:

+
Utils - INFO - Graph/Node: ModelWithNonfoldableBN.bn1: BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) fails check: {'_check_batch_norm_fold'}
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_auto_quant.html b/releases/1.32.2/api_docs/torch_auto_quant.html new file mode 100644 index 00000000..34f72e3a --- /dev/null +++ b/releases/1.32.2/api_docs/torch_auto_quant.html @@ -0,0 +1,1295 @@ + + + + + + AIMET PyTorch AutoQuant API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET PyTorch AutoQuant API

+ + +
+

Top-level API

+
+
+class aimet_torch.auto_quant.AutoQuant(model, dummy_input, data_loader, eval_callback, param_bw=8, output_bw=8, quant_scheme=QuantScheme.post_training_tf_enhanced, rounding_mode='nearest', config_file=None, results_dir='/tmp', cache_id=None, strict_validation=True, model_prepare_required=True)[source]
+

Integrate and apply post-training quantization techniques.

+

AutoQuant includes 1) batchnorm folding, 2) cross-layer equalization, +and 3) Adaround. +These techniques will be applied in a best-effort manner until the model +meets the evaluation goal given as allowed_accuracy_drop.

+
+
Parameters
+
    +
  • model (Module) – Model to be quantized. Assumes model is on the correct device

  • +
  • dummy_input (Union[Tensor, Tuple]) – Dummy input for the model. Assumes that dummy_input is on the correct device

  • +
  • data_loader (DataLoader) – A collection that iterates over an unlabeled dataset, used for computing encodings

  • +
  • eval_callback (Callable[[Module], float]) – Function that calculates the evaluation score

  • +
  • param_bw (int) – Parameter bitwidth

  • +
  • output_bw (int) – Output bitwidth

  • +
  • quant_scheme (QuantScheme) – Quantization scheme

  • +
  • rounding_mode (str) – Rounding mode

  • +
  • config_file (Optional[str]) – Path to configuration file for model quantizers

  • +
  • results_dir (str) – Directory to save the results of PTQ techniques

  • +
  • cache_id (Optional[str]) – ID associated with cache results

  • +
  • strict_validation (bool) – Flag set to True by default. When False, AutoQuant will proceed with execution and handle errors internally if possible. This may produce unideal or unintuitive results.

  • +
  • model_prepare_required (bool) – Flag set to True by default. If False, AutoQuant will skip the model prepare block in the pipeline.

  • +
+
+
+
+ +
+
+

Code Examples

+
import random
+from typing import Optional
+
+import torch
+from torch.utils.data import Dataset, DataLoader, SubsetRandomSampler
+from torchvision import models, datasets, transforms
+
+from aimet_torch.adaround.adaround_weight import AdaroundParameters
+from aimet_torch.auto_quant_v2 import AutoQuant
+
+# Step 1. Define constants and helper functions
+EVAL_DATASET_SIZE = 5000
+CALIBRATION_DATASET_SIZE = 2000
+BATCH_SIZE = 100
+
+_subset_samplers = {}
+
+def _create_sampled_data_loader(dataset, num_samples):
+    if num_samples not in _subset_samplers:
+        indices = random.sample(range(len(dataset)), num_samples)
+        _subset_samplers[num_samples] = SubsetRandomSampler(indices=indices)
+    return DataLoader(dataset,
+                      sampler=_subset_samplers[num_samples],
+                      batch_size=BATCH_SIZE)
+
+# Step 2. Prepare model and dataset
+fp32_model = models.resnet18(pretrained=True).eval()
+
+input_shape = (1, 3, 224, 224)
+dummy_input = torch.randn(input_shape)
+
+transform = transforms.Compose((
+    transforms.ToTensor(),
+))
+# NOTE: In actual use cases, a real dataset should be provided by the user.
+eval_dataset = datasets.FakeData(size=EVAL_DATASET_SIZE,
+                                 image_size=input_shape[1:],
+                                 num_classes=1000,
+                                 transform=transform)
+
+# Step 3. Prepare unlabeled dataset
+# NOTE: In actual use cases, users should implement this part to serve
+#       their own goals if necessary.
+class UnlabeledDatasetWrapper(Dataset):
+    def __init__(self, dataset):
+        self._dataset = dataset
+
+    def __len__(self):
+        return len(self._dataset)
+
+    def __getitem__(self, index):
+        images, _ = self._dataset[index]
+        return images
+
+unlabeled_dataset = UnlabeledDatasetWrapper(eval_dataset)
+unlabeled_data_loader = _create_sampled_data_loader(unlabeled_dataset, CALIBRATION_DATASET_SIZE)
+
+# Step 4. Prepare eval callback
+# NOTE: In actual use cases, users should implement this part to serve
+#       their own goals if necessary.
+def eval_callback(model: torch.nn.Module, num_samples: Optional[int] = None) -> float:
+    if num_samples is None:
+        num_samples = len(eval_dataset)
+
+    eval_data_loader = _create_sampled_data_loader(eval_dataset, num_samples)
+
+    num_correct_predictions = 0
+    for images, labels in eval_data_loader:
+        predictions = torch.argmax(model(images.cuda()), dim=1)
+        num_correct_predictions += torch.sum(predictions.cpu() == labels)
+
+    return int(num_correct_predictions) / num_samples
+
+# Step 5. Create AutoQuant object
+auto_quant = AutoQuant(fp32_model.cuda(),
+                       dummy_input.cuda(),
+                       unlabeled_data_loader,
+                       eval_callback)
+
+# Step 6. (Optional) Set adaround params
+ADAROUND_DATASET_SIZE = 2000
+adaround_data_loader = _create_sampled_data_loader(unlabeled_dataset, ADAROUND_DATASET_SIZE)
+adaround_params = AdaroundParameters(adaround_data_loader, num_batches=len(adaround_data_loader))
+auto_quant.set_adaround_params(adaround_params)
+
+# Step 7. Run AutoQuant
+sim, initial_accuracy = auto_quant.run_inference()
+model, optimized_accuracy, encoding_path = auto_quant.optimize(allowed_accuracy_drop=0.01)
+
+print(f"- Quantized Accuracy (before optimization): {initial_accuracy:.4f}")
+print(f"- Quantized Accuracy (after optimization):  {optimized_accuracy:.4f}")
+
+
+
+

Note

+

To use auto_quant.AutoQuant (will be deprecated), apply the following code changes to step 5 and 7.

+
+
# Step 5. Create AutoQuant object
+auto_quant = AutoQuant(allowed_accuracy_drop=0.01,
+                       unlabeled_dataset_iterable=unlabeled_data_loader,
+                       eval_callback=eval_callback)
+
+# Step 6. (Optional) Set adaround params
+ADAROUND_DATASET_SIZE = 2000
+adaround_data_loader = _create_sampled_data_loader(unlabeled_dataset, ADAROUND_DATASET_SIZE)
+adaround_params = AdaroundParameters(adaround_data_loader, num_batches=len(adaround_data_loader))
+auto_quant.set_adaround_params(adaround_params)
+
+# Step 7. Run AutoQuant
+model, accuracy, encoding_path =\
+    auto_quant.apply(fp32_model.cuda(),
+                     dummy_input_on_cpu=dummy_input.cpu(),
+                     dummy_input_on_gpu=dummy_input.cuda())
+
+print(f"- Quantized Accuracy (after optimization):  {accuracy:.4f}")
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_batchnorm_re_estimation.html b/releases/1.32.2/api_docs/torch_batchnorm_re_estimation.html new file mode 100644 index 00000000..e28d60b2 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_batchnorm_re_estimation.html @@ -0,0 +1,1250 @@ + + + + + + AIMET PyTorch BatchNorm Re-estimation APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET PyTorch BatchNorm Re-estimation APIs

+ +
+

Introduction

+

Batch Norm (BN) Re-estimation re-estimates the statistics of BN layers after performing QAT. Using the re-estimated statistics, the BN layers are folded into the preceding Conv and Linear layers.

+
+
+

Top-level APIs

+

API for BatchNorm Re-estimation

+
+
+aimet_torch.bn_reestimation.reestimate_bn_stats(model, dataloader, num_batches=100, forward_fn=None)[source]
+

Reestimate BatchNorm statistics (running mean and var).

+
+
Parameters
+
    +
  • model (Module) – Model to reestimate the BN stats.

  • +
  • dataloader (DataLoader) – Training dataset.

  • +
  • num_batches (int) – The number of batches to be used for reestimation.

  • +
  • forward_fn (Optional[Callable[[Module, Any], Any]]) – Optional adapter function that performs a forward pass +given a model and an input batch yielded from the data loader.

  • +
+
+
Return type
+

Handle

+
+
Returns
+

Handle that undoes the effect of BN re-estimation upon handle.remove().

+
+
+
+ +

API for BatchNorm fold to scale

+
+
+aimet_torch.batch_norm_fold.fold_all_batch_norms_to_scale(sim)[source]
+

Fold all batch_norm layers in a model into the quantization scale parameter +of the corresponding conv layers

+
+
Parameters
+

sim (QuantizationSimModel) – QuantizationSimModel

+
+
Return type
+

List[Tuple[QcQuantizeWrapper, QcQuantizeWrapper]]

+
+
Returns
+

A list of pairs of layers [(Conv/Linear, BN layer that got folded)]

+
+
+
+ +
+
+

Code Example - BN-Reestimation

+

Step 1. Load the model

+

For this example, we are going to load a pretrained ResNet18 model from torchvision.

+

+def load_fp32_model():
+
+    import torchvision
+    from torchvision.models import resnet18
+    from aimet_torch.model_preparer import prepare_model
+
+    use_cuda = torch.cuda.is_available()
+    if use_cuda:
+        device = torch.device("cuda")
+    else:
+        device = torch.device("cpu")
+
+    model = resnet18(pretrained=True).to(device)
+    model = prepare_model(model)
+
+    return model, use_cuda
+
+
+
+

Step 2. Create QuantSim with Range Learning and Per Channel Quantization Enabled

+
    +
  1. For an example of creating QuantSim with a Range Learning QuantScheme, please see here; a minimal sketch is also shown after this list

  2. +
  3. For how to enable Per Channel Quantization, please see here

  4. +
+
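
For reference, a minimal sketch of creating QuantSim with a range-learning quant scheme is shown below. It reuses load_fp32_model from Step 1; per-channel quantization is typically enabled through the quantsim config file (not shown here), and exact defaults may differ across AIMET versions.

+
    import torch
+    from aimet_common.defs import QuantScheme
+    from aimet_torch.quantsim import QuantizationSimModel
+
+    model, use_cuda = load_fp32_model()
+    dummy_input = torch.randn(1, 3, 224, 224)
+    if use_cuda:
+        dummy_input = dummy_input.cuda()
+
+    # Range-learning quant scheme so that encodings keep getting updated during QAT
+    quant_sim = QuantizationSimModel(model,
+                                     dummy_input=dummy_input,
+                                     quant_scheme=QuantScheme.training_range_learning_with_tf_init,
+                                     default_param_bw=8,
+                                     default_output_bw=8)
+
+    # Compute initial encodings (replace dummy_input with a few real calibration batches)
+    quant_sim.compute_encodings(lambda sim_model, _: sim_model(dummy_input),
+                                forward_pass_callback_args=None)
+
+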

Step 3. Perform QAT

+

+    # User action required
+    # The following line of code is an example of how to use the ImageNetPipeline's train function.
+    # Replace the following line with your own pipeline's  train function.
+    ImageNetPipeline.train(sim.model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10], use_cuda=use_cuda)
+
+
+
+

Step 4 a. Perform BatchNorm Re-estimation

+
    from aimet_torch.bn_reestimation import reestimate_bn_stats
+
+    # User action required
+    # The following line of code is an example of how to use the ImageNet data's training data loader.
+    # Replace the following line with your own dataset's training data loader.
+    train_loader = ImageNetDataPipeline.get_train_dataloader()
+
+    reestimate_bn_stats(quant_sim.model, train_loader, forward_fn=forward_fn)
+
+
+
+
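
The forward_fn passed above is a user-provided adapter. A minimal sketch is shown below; it assumes the training data loader yields (images, labels) tuples and should be adjusted to match your data pipeline.

+
    def forward_fn(model, batch):
+        # Unpack the batch yielded by the data loader and run a plain forward pass
+        images, _ = batch
+        return model(images)
+
+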

Step 4 b. Perform BatchNorm Fold to scale

+

+    from aimet_torch.batch_norm_fold import fold_all_batch_norms_to_scale
+
+    fold_all_batch_norms_to_scale(quant_sim)
+
+
+
+

Step 5. Export the model and encodings and test on target

+

For how to export the model and encodings, please see here

+
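
For reference, a minimal export sketch is given below; the output path and filename prefix are placeholders, and dummy_input is the one created in Step 2.

+
    # Export the re-estimated model along with its encodings (placeholder path and prefix)
+    quant_sim.export(path='./output/', filename_prefix='resnet18_bn_reestimated',
+                     dummy_input=dummy_input.cpu())
+
+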
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_bias_correction.html b/releases/1.32.2/api_docs/torch_bias_correction.html new file mode 100644 index 00000000..80899f42 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_bias_correction.html @@ -0,0 +1,1306 @@ + + + + + + AIMET PyTorch Bias Correction API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET PyTorch Bias Correction API

+ +
+

Bias Correction API

+
+
+aimet_torch.bias_correction.correct_bias(model, quant_params, num_quant_samples, data_loader, num_bias_correct_samples, conv_bn_dict=None, perform_only_empirical_bias_corr=True, layers_to_ignore=None)[source]
+

Corrects bias for each Conv layer of the model (unless ignored). A combination of Analytical and Empirical Bias +Correction is used, i.e. all the layers which can be corrected using Analytical Bias Correction are corrected +that way, and the remaining layers are corrected using the Empirical method.

+

Returns an in-place corrected floating point model

+
+
Parameters
+
    +
  • model (Module) – Model to be corrected

  • +
  • quant_params (QuantParams) – Named tuple for quantization simulation for bias correction

  • +
  • num_quant_samples (int) – number of samples of images to pass through quantization sim for bias correction.

  • +
  • data_loader – data loader for the model

  • +
  • num_bias_correct_samples (int) – number of samples for Bias correction

  • +
  • conv_bn_dict (Optional[Dict[Module, ConvBnInfoType]]) – Dict of conv and bn with information related to activation. If None, the function calculates it.

  • +
  • perform_only_empirical_bias_corr (bool) – Default True. If True, only Empirical Bias Correction is performed for all layers, +irrespective of whether a layer is eligible for Analytical Bias Correction.

  • +
  • layers_to_ignore (Optional[List[Module]]) – List of layers for which bias correction should be skipped.

  • +
+
+
+
+ +
+

+
+
+
+

ConvBnInfoType

+
+
+class aimet_common.bias_correction.ConvBnInfoType(input_bn=None, output_bn=None, in_activation_type=ActivationType.no_activation, out_activation_type=ActivationType.no_activation)[source]
+

Type for holding convs with BN info and activation types. +Supported activation types are ReLU and ReLU6.

+
+
Parameters
+
    +
  • input_bn – Reference to the input BatchNorm of the layer

  • +
  • output_bn – Reference to the output BatchNorm of the layer

  • +
  • in_activation_type (ActivationType) – Type of Activation

  • +
  • out_activation_type (ActivationType) – Type of Activation

  • +
+
+
+
+ +
+

+
+
+
+

ActivationType

+
+
+class aimet_common.defs.ActivationType(value)[source]
+

Enums to identify activation type

+
+
+no_activation = 0
+

No activation

+
+ +
+
+relu = 1
+

ReLU activation

+
+ +
+
+relu6 = 2
+

ReLU6 activation

+
+ +
+ +
+
+
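
For illustration, a hedged sketch of hand-building the conv_bn_dict argument of correct_bias is shown below. The module names model.conv1 and model.bn1 are assumptions; in practice this dict is produced automatically by bias_correction.find_all_conv_bn_with_activation (see Code Example #2 below).

+
from aimet_common.defs import ActivationType
+from aimet_common.bias_correction import ConvBnInfoType
+
+# Assumed topology: model.conv1 is followed by model.bn1 and a ReLU
+conv_bn_dict = {
+    model.conv1: ConvBnInfoType(input_bn=None,
+                                output_bn=model.bn1,
+                                in_activation_type=ActivationType.no_activation,
+                                out_activation_type=ActivationType.relu)
+}
+
+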

Quantization Params

+
+
+class aimet_torch.quantsim.QuantParams(weight_bw=8, act_bw=8, round_mode='nearest', quant_scheme=QuantScheme.post_training_tf_enhanced, config_file=None)[source]
+

Data type to hold quantization related params.

+

Constructor

+
+
Parameters
+
    +
  • weight_bw (int) – Weight bitwidth (4-31) to use for quantizing layer weights. Default = 8

  • +
  • act_bw (int) – Activation bitwidth (4-31) to use for quantizing layer activations. Default = 8

  • +
  • round_mode (str) – Rounding mode. Supported options are ‘nearest’ or ‘stochastic’

  • +
  • quant_scheme (Union[QuantScheme, str]) – Quantization scheme. Supported options are ‘tf_enhanced’ or ‘tf’ or using Quant Scheme Enum +QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced

  • +
  • config_file (Optional[str]) – Path to Configuration file for model quantizers

  • +
+
+
+
+ +
+

+
+
+
+

Code Example #1 Empirical Bias Correction

+

Load the model

+
    model = MobileNetV2()
+    model.eval()
+
+
+
+

Apply Empirical Bias Correction

+
    from aimet_torch import bias_correction
+    from aimet_torch.quantsim import QuantParams
+
+    params = QuantParams(weight_bw=4, act_bw=4, round_mode="nearest", quant_scheme='tf_enhanced')
+
+    # User action required
+    # The following line of code is an example of how to use the ImageNet data's validation data loader.
+    # Replace the following line with your own dataset's validation data loader.
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+
+    # Perform Empirical Bias Correction
+    bias_correction.correct_bias(model.to(device="cuda"), params, num_quant_samples=1000,
+                                 data_loader=data_loader, num_bias_correct_samples=512)
+
+
+
+
+

+
+
+
+

Code Example #2 Analytical + Empirical Bias correction

+

Load the model

+
    model = MobileNetV2()
+    model.eval()
+
+
+
+

Find BN and Conv Modules

+

Find BN + Conv module pairs for analytical Bias Correction and remaining Conv modules for Empirical Bias Correction.

+
    module_prop_dict = bias_correction.find_all_conv_bn_with_activation(model, input_shape=(1, 3, 224, 224))
+
+
+
+

Apply Analytical + Empirical Bias Correction

+
    from aimet_torch import bias_correction
+    from aimet_torch.quantsim import QuantParams
+
+    params = QuantParams(weight_bw=4, act_bw=4, round_mode="nearest", quant_scheme='tf_enhanced')
+
+    # User action required
+    # The following line of code is an example of how to use the ImageNet data's validation data loader.
+    # Replace the following line with your own dataset's validation data loader.
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+
+    # Perform Bias Correction
+    bias_correction.correct_bias(model.to(device="cuda"), params, num_quant_samples=1000,
+                                 data_loader=data_loader, num_bias_correct_samples=512,
+                                 conv_bn_dict=module_prop_dict, perform_only_empirical_bias_corr=False)
+
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_compress.html b/releases/1.32.2/api_docs/torch_compress.html new file mode 100644 index 00000000..885eef29 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_compress.html @@ -0,0 +1,1794 @@ + + + + + + AIMET PyTorch Compression API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET PyTorch Compression API

+
+

Introduction

+
+
AIMET supports the following model compression techniques for PyTorch models
    +
  • Weight SVD

  • +
  • Spatial SVD

  • +
  • Channel Pruning

  • +
+
+
+

To learn more about these model compression techniques, please see Model Compression User Guide

+
+
For all of these compression techniques there are two modes in which you can invoke the AIMET API
    +
  • +
    Auto Mode: In Auto mode, AIMET will determine the optimal way to compress each layer of

    the model given an overall target compression ratio. Greedy Compression Ratio Selection Algorithm is used to pick appropriate compression ratios for each layer.

    +
    +
    +
  • +
  • +
    Manual Mode: In Manual mode, the user can pass in the desired compression-ratio per layer

    to AIMET. AIMET will apply the specified compression technique for each of the +layers to achieve the desired compression-ratio per layer. It is recommended that +the user start with Auto mode, and then tweak per-layer compression-ratios using +Manual mode if desired.

    +
    +
    +
  • +
+
+
+
+

+
+
+
+

Top-level API for Compression

+
+
+class aimet_torch.compress.ModelCompressor[source]
+

AIMET model compressor: Enables model compression using various schemes

+
+ +
+

+
+
+
+static ModelCompressor.compress_model(model, eval_callback, eval_iterations, input_shape, compress_scheme, cost_metric, parameters, trainer=None, visualization_url=None)[source]
+

Compress a given model using the specified parameters

+
+
Parameters
+
    +
  • model (Module) – Model to compress

  • +
  • eval_callback (Callable[[Any, Optional[int], bool], float]) – Evaluation callback. Expected signature is evaluate(model, iterations, use_cuda). +Expected to return an accuracy metric.

  • +
  • eval_iterations – Iterations to run evaluation for

  • +
  • trainer – Training class containing a callable train_model, which takes the model, the layer being fine-tuned, and an optional train_flag parameter. +Pass None if per-layer fine tuning is not required while creating the final compressed model.

  • +
  • input_shape (Tuple) – Shape of the input tensor for model

  • +
  • compress_scheme (CompressionScheme) – Compression scheme. See the enum for allowed values

  • +
  • cost_metric (CostMetric) – Cost metric to use for the compression-ratio (either mac or memory)

  • +
  • parameters (Union[SpatialSvdParameters, WeightSvdParameters, ChannelPruningParameters]) – Compression parameters specific to given compression scheme

  • +
  • visualization_url – URL, provided by the user, at which visualizations will appear

  • +
+
+
Return type
+

Tuple[Module, CompressionStats]

+
+
Returns
+

A tuple of the compressed model, and compression statistics

+
+
+
+ +
+

+
+
+
+

Greedy Selection Parameters

+
+
+class aimet_common.defs.GreedySelectionParameters(target_comp_ratio, num_comp_ratio_candidates=10, use_monotonic_fit=False, saved_eval_scores_dict=None)[source]
+

Configuration parameters for the Greedy compression-ratio selection algorithm

+
+
Variables
+
    +
  • target_comp_ratio – Target compression ratio. Expressed as value between 0 and 1. +Compression ratio is the ratio of cost of compressed model to cost of the original model.

  • +
  • num_comp_ratio_candidates – Number of comp-ratio candidates to analyze per-layer +More candidates allows more granular distribution of compression at the cost +of increased run-time during analysis. Default value=10. Value should be greater than 1.

  • +
  • use_monotonic_fit – If True, eval scores in the eval dictionary are fitted to a monotonically increasing +function. This is useful if you see the eval dict scores for some layers are not monotonically increasing. +By default, this option is set to False.

  • +
  • saved_eval_scores_dict – Path to the eval_scores dictionary pickle file that was +saved in a previous run. This is useful to speed up experiments when trying +different target compression-ratios, for example. AIMET will save the eval_scores +dictionary pickle file automatically in a ./data directory relative to the +current path. The num_comp_ratio_candidates parameter is ignored when this option is used. A usage sketch is shown after this definition.

  • +
+
+
+
+ +
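
A usage sketch for reusing a previously saved eval-scores dictionary is shown below; the pickle path is hypothetical, and Decimal / GreedySelectionParameters come from the imports listed under Code Examples.

+
# num_comp_ratio_candidates is ignored when a saved eval-scores dictionary is supplied
+greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.5),
+                                          saved_eval_scores_dict='./data/greedy_selection_eval_scores_dict.pkl')
+
+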
+

+
+
+
+

TAR Selection Parameters

+
+
+class aimet_torch.defs.TarRankSelectionParameters(num_rank_indices)[source]
+

Configuration parameters for the TAR compression-ratio selection algorithm

+
+
Variables
+

num_rank_indices – Number of rank indices for ratio selection.

+
+
+
+ +
+
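
A hedged sketch of selecting the TAR scheme for auto-mode Weight SVD is shown below; num_rank_indices=3 is an arbitrary illustrative value, and it assumes RankSelectScheme.tar is available in this AIMET version.

+
from aimet_common.defs import RankSelectScheme
+from aimet_torch.defs import WeightSvdParameters, TarRankSelectionParameters
+
+tar_params = TarRankSelectionParameters(num_rank_indices=3)
+auto_params = WeightSvdParameters.AutoModeParams(rank_select_scheme=RankSelectScheme.tar,
+                                                 select_params=tar_params)
+params = WeightSvdParameters(mode=WeightSvdParameters.Mode.auto, params=auto_params)
+
+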

+
+
+
+

Spatial SVD Configuration

+
+
+class aimet_torch.defs.SpatialSvdParameters(mode, params, multiplicity=1)[source]
+

Configuration parameters for spatial svd compression

+
+
Parameters
+
    +
  • mode (Mode) – Either auto mode or manual mode

  • +
  • params (Union[ManualModeParams, AutoModeParams]) – Parameters for the mode selected

  • +
  • multiplicity – The multiplicity to which ranks/input channels will get rounded. Default: 1

  • +
+
+
+
+
+class AutoModeParams(greedy_select_params, modules_to_ignore=None)[source]
+

Configuration parameters for auto-mode compression

+
+
Parameters
+
    +
  • greedy_select_params (GreedySelectionParameters) – Params for greedy comp-ratio selection algorithm

  • +
  • modules_to_ignore (Optional[List[Module]]) – List of modules to ignore (None indicates nothing to ignore)

  • +
+
+
+
+ +
+
+class ManualModeParams(list_of_module_comp_ratio_pairs)[source]
+

Configuration parameters for manual-mode spatial svd compression

+
+
Parameters
+

list_of_module_comp_ratio_pairs (List[ModuleCompRatioPair]) – List of (module, comp-ratio) pairs

+
+
+
+ +
+
+class Mode(value)[source]
+

Mode enumeration

+
+
+auto = 2
+

Auto mode

+
+ +
+
+manual = 1
+

Manual mode

+
+ +
+ +
+ +
+

+
+
+
+

Weight SVD Configuration

+
+
+class aimet_torch.defs.WeightSvdParameters(mode, params, multiplicity=1)[source]
+

Configuration parameters for weight svd compression

+
+
Parameters
+
    +
  • mode (Mode) – Either auto mode or manual mode

  • +
  • params (Union[ManualModeParams, AutoModeParams]) – Parameters for the mode selected

  • +
  • multiplicity – The multiplicity to which ranks/input channels will get rounded. Default: 1

  • +
+
+
+
+
+class AutoModeParams(rank_select_scheme, select_params, modules_to_ignore=None)[source]
+

Configuration parameters for auto-mode compression

+
+
Parameters
+
    +
  • rank_select_scheme (RankSelectScheme) – Supports two options: greedy and tar

  • +
  • select_params (Union[GreedySelectionParameters, TarRankSelectionParameters]) – Params for greedy/TAR comp-ratio selection algorithm

  • +
  • modules_to_ignore (Optional[List[Module]]) – List of modules to ignore (None indicates nothing to ignore)

  • +
+
+
+
+ +
+
+class ManualModeParams(list_of_module_comp_ratio_pairs)[source]
+

Configuration parameters for manual-mode weight svd compression

+
+
Parameters
+

list_of_module_comp_ratio_pairs (List[ModuleCompRatioPair]) – List of (module, comp-ratio) pairs

+
+
+
+ +
+
+class Mode(value)[source]
+

Mode enumeration

+
+
+auto = 2
+

Auto mode

+
+ +
+
+manual = 1
+

Manual mode

+
+ +
+ +
+ +
+

+
+
+
+

Channel Pruning Configuration

+
+
+class aimet_torch.defs.ChannelPruningParameters(data_loader, num_reconstruction_samples, allow_custom_downsample_ops, mode, params, multiplicity=1)[source]
+

Configuration parameters for channel pruning compression

+
+
+class AutoModeParams(greedy_select_params, modules_to_ignore=None)[source]
+

Configuration parameters for auto-mode compression

+
+
Parameters
+
    +
  • greedy_select_params (GreedySelectionParameters) – Params for greedy comp-ratio selection algorithm

  • +
  • modules_to_ignore (Optional[List[Module]]) – List of modules to ignore (None indicates nothing to ignore)

  • +
+
+
+
+ +
+
+class ManualModeParams(list_of_module_comp_ratio_pairs)[source]
+

Configuration parameters for manual-mode channel pruning compression

+
+
Parameters
+

list_of_module_comp_ratio_pairs (List[ModuleCompRatioPair]) – List of (module, comp-ratio) pairs

+
+
+
+ +
+
+class Mode(value)[source]
+

Mode enumeration

+
+
+auto = 2
+

AIMET computes optimal comp-ratio per layer

+
+
Type
+

Auto mode

+
+
+
+ +
+
+manual = 1
+

User specifies comp-ratio per layer

+
+
Type
+

Manual mode

+
+
+
+ +
+ +
+ +
+

+
+
+
+

Configuration Definitions

+
+
+class aimet_common.defs.CostMetric(value)[source]
+

Enumeration of metrics to measure cost of a model/layer

+
+
+mac = 1
+

Cost modeled for compute requirements

+
+
Type
+

MAC

+
+
+
+ +
+
+memory = 2
+

Cost modeled for space requirements

+
+
Type
+

Memory

+
+
+
+ +
+ +
+

+
+
+
+class aimet_common.defs.CompressionScheme(value)[source]
+

Enumeration of compression schemes supported in aimet

+
+
+channel_pruning = 3
+

Channel Pruning

+
+ +
+
+spatial_svd = 2
+

Spatial SVD

+
+ +
+
+weight_svd = 1
+

Weight SVD

+
+ +
+ +
+

+
+
+
+class aimet_torch.defs.ModuleCompRatioPair(module, comp_ratio)[source]
+

Pair of torch.nn.module and a compression-ratio

+
+
Variables
+
    +
  • module – Module of type torch.nn.module

  • +
  • comp_ratio – Compression ratio. Compression ratio is the ratio of cost of compressed model +to cost of the original model.

  • +
+
+
+
+ +
+

+
+
+
+

Code Examples

+

Required imports

+
import os
+from decimal import Decimal
+import torch
+
+
+# Compression-related imports
+from aimet_common.defs import CostMetric, CompressionScheme, GreedySelectionParameters, RankSelectScheme
+from aimet_torch.defs import WeightSvdParameters, SpatialSvdParameters, ChannelPruningParameters, \
+    ModuleCompRatioPair
+from aimet_torch.compress import ModelCompressor
+
+
+

Evaluation function

+
def evaluate_model(model: torch.nn.Module, eval_iterations: int, use_cuda: bool = False) -> float:
+    """
+    This is intended to be the user-defined model evaluation function.
+    AIMET requires the above signature. So if the user's eval function does not
+    match this signature, please create a simple wrapper.
+
+    Note: Honoring the number of iterations is not absolutely necessary.
+    However if all evaluations run over an entire epoch of validation data,
+    the runtime for AIMET compression will obviously be higher.
+
+    :param model: Model to evaluate
+    :param eval_iterations: Number of iterations to use for evaluation.
+            None for entire epoch.
+    :param use_cuda: If true, evaluate using gpu acceleration
+    :return: single float number (accuracy) representing model's performance
+    """
+    return .5
+
+
+

Compressing using Spatial SVD in auto mode with multiplicity = 8 for rank rounding

+
def spatial_svd_auto_mode():
+
+    # load trained MNIST model
+    model = torch.load(os.path.join('../', 'data', 'mnist_trained_on_GPU.pth'))
+
+    # Specify the necessary parameters
+    greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),
+                                              num_comp_ratio_candidates=10)
+    auto_params = SpatialSvdParameters.AutoModeParams(greedy_params,
+                                                      modules_to_ignore=[model.conv1])
+
+    params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.auto,
+                                  params=auto_params, multiplicity=8)
+
+    # Single call to compress the model
+    results = ModelCompressor.compress_model(model,
+                                             eval_callback=evaluate_model,
+                                             eval_iterations=1000,
+                                             input_shape=(1, 1, 28, 28),
+                                             compress_scheme=CompressionScheme.spatial_svd,
+                                             cost_metric=CostMetric.mac,
+                                             parameters=params)
+
+    compressed_model, stats = results
+    print(compressed_model)
+    print(stats)     # Stats object can be pretty-printed easily
+
+
+

Compressing using Spatial SVD in manual mode

+
def spatial_svd_manual_mode():
+
+    # Load a trained MNIST model
+    model = torch.load(os.path.join('../', 'data', 'mnist_trained_on_GPU.pth'))
+
+    # Specify the necessary parameters
+    manual_params = SpatialSvdParameters.ManualModeParams([ModuleCompRatioPair(model.conv1, 0.5),
+                                                           ModuleCompRatioPair(model.conv2, 0.4)])
+    params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.manual,
+                                  params=manual_params)
+
+    # Single call to compress the model
+    results = ModelCompressor.compress_model(model,
+                                             eval_callback=evaluate_model,
+                                             eval_iterations=1000,
+                                             input_shape=(1, 1, 28, 28),
+                                             compress_scheme=CompressionScheme.spatial_svd,
+                                             cost_metric=CostMetric.mac,
+                                             parameters=params)
+
+    compressed_model, stats = results
+    print(compressed_model)
+    print(stats)    # Stats object can be pretty-printed easily
+
+
+

Compressing using Weight SVD in auto mode

+
def weight_svd_auto_mode():
+
+    # Load trained MNIST model
+    model = torch.load(os.path.join('../', 'data', 'mnist_trained_on_GPU.pth'))
+
+    # Specify the necessary parameters
+    greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),
+                                              num_comp_ratio_candidates=10)
+    rank_select = RankSelectScheme.greedy
+    auto_params = WeightSvdParameters.AutoModeParams(rank_select_scheme=rank_select,
+                                                     select_params=greedy_params,
+                                                     modules_to_ignore=[model.conv1])
+
+    params = WeightSvdParameters(mode=WeightSvdParameters.Mode.auto,
+                                 params=auto_params)
+
+    # Single call to compress the model
+    results = ModelCompressor.compress_model(model,
+                                             eval_callback=evaluate_model,
+                                             eval_iterations=1000,
+                                             input_shape=(1, 1, 28, 28),
+                                             compress_scheme=CompressionScheme.weight_svd,
+                                             cost_metric=CostMetric.mac,
+                                             parameters=params)
+
+    compressed_model, stats = results
+    print(compressed_model)
+    print(stats)     # Stats object can be pretty-printed easily
+
+
+

Compressing using Weight SVD in manual mode with multiplicity = 8 for rank rounding

+
def weight_svd_manual_mode():
+
+    # Load a trained MNIST model
+    model = torch.load(os.path.join('../', 'data', 'mnist_trained_on_GPU.pth'))
+
+    # Specify the necessary parameters
+    manual_params = WeightSvdParameters.ManualModeParams([ModuleCompRatioPair(model.conv1, 0.5),
+                                                          ModuleCompRatioPair(model.conv2, 0.4)])
+    params = WeightSvdParameters(mode=WeightSvdParameters.Mode.manual,
+                                 params=manual_params, multiplicity=8)
+
+    # Single call to compress the model
+    results = ModelCompressor.compress_model(model,
+                                             eval_callback=evaluate_model,
+                                             eval_iterations=1000,
+                                             input_shape=(1, 1, 28, 28),
+                                             compress_scheme=CompressionScheme.weight_svd,
+                                             cost_metric=CostMetric.mac,
+                                             parameters=params)
+
+    compressed_model, stats = results
+    print(compressed_model)
+    print(stats)    # Stats object can be pretty-printed easily
+
+
+

Compressing using Channel Pruning in auto mode

+
def channel_pruning_auto_mode():
+
+    # Load trained MNIST model
+    model = torch.load(os.path.join('../', 'data', 'mnist_trained_on_GPU.pth'))
+
+    # Specify the necessary parameters
+    greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),
+                                              num_comp_ratio_candidates=10)
+    auto_params = ChannelPruningParameters.AutoModeParams(greedy_params,
+                                                          modules_to_ignore=[model.conv1])
+
+    data_loader = mnist_torch_model.DataLoaderMnist(cuda=True, seed=1, shuffle=True)
+    params = ChannelPruningParameters(data_loader=data_loader.train_loader,
+                                      num_reconstruction_samples=500,
+                                      allow_custom_downsample_ops=True,
+                                      mode=ChannelPruningParameters.Mode.auto,
+                                      params=auto_params)
+
+    # Single call to compress the model
+    results = ModelCompressor.compress_model(model,
+                                             eval_callback=evaluate_model,
+                                             eval_iterations=1000,
+                                             input_shape=(1, 1, 28, 28),
+                                             compress_scheme=CompressionScheme.channel_pruning,
+                                             cost_metric=CostMetric.mac,
+                                             parameters=params)
+
+    compressed_model, stats = results
+    print(compressed_model)
+    print(stats)     # Stats object can be pretty-printed easily
+
+
+

Compressing using Channel Pruning in manual mode

+
def channel_pruning_manual_mode():
+
+    # Load a trained MNIST model
+    model = torch.load(os.path.join('../', 'data', 'mnist_trained_on_GPU.pth'))
+
+    # Specify the necessary parameters
+    manual_params = ChannelPruningParameters.ManualModeParams([ModuleCompRatioPair(model.conv2, 0.4)])
+
+    data_loader = mnist_torch_model.DataLoaderMnist(cuda=True, seed=1, shuffle=True)
+    params = ChannelPruningParameters(data_loader=data_loader.train_loader,
+                                      num_reconstruction_samples=500,
+                                      allow_custom_downsample_ops=True,
+                                      mode=ChannelPruningParameters.Mode.manual,
+                                      params=manual_params)
+
+    # Single call to compress the model
+    results = ModelCompressor.compress_model(model,
+                                             eval_callback=evaluate_model,
+                                             eval_iterations=1000,
+                                             input_shape=(1, 1, 28, 28),
+                                             compress_scheme=CompressionScheme.channel_pruning,
+                                             cost_metric=CostMetric.mac,
+                                             parameters=params)
+
+    compressed_model, stats = results
+    print(compressed_model)
+    print(stats)    # Stats object can be pretty-printed easily
+
+
+

Example Training Object

+
class Trainer:
+    """ Example trainer class """
+
+    def __init__(self):
+        self._layer_db = []
+
+    def train_model(self, model, layer, train_flag=True):
+        """
+        Trains a model
+        :param model: Model to be trained
+        :param layer: layer which has to be fine tuned
+        :param train_flag: Default: True. If true, the model gets trained
+        :return:
+        """
+        if train_flag:
+            mnist_torch_model.train(model, epochs=1, use_cuda=True, batch_size=50, batch_callback=None)
+        self._layer_db.append(layer)
+
+
+

Compressing using Spatial SVD in auto mode with layer-wise fine tuning

+
def spatial_svd_auto_mode_with_layerwise_finetuning():
+
+    # load trained MNIST model
+    model = torch.load(os.path.join('../', 'data', 'mnist_trained_on_GPU.pth'))
+
+    # Specify the necessary parameters
+    greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.8),
+                                              num_comp_ratio_candidates=10)
+    auto_params = SpatialSvdParameters.AutoModeParams(greedy_params,
+                                                      modules_to_ignore=[model.conv1])
+
+    params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.auto,
+                                  params=auto_params)
+
+    # Single call to compress the model
+    results = ModelCompressor.compress_model(model,
+                                             eval_callback=evaluate_model,
+                                             eval_iterations=1000,
+                                             input_shape=(1, 1, 28, 28),
+                                             compress_scheme=CompressionScheme.spatial_svd,
+                                             cost_metric=CostMetric.mac,
+                                             parameters=params, trainer=Trainer())
+
+    compressed_model, stats = results
+    print(compressed_model)
+    print(stats)     # Stats object can be pretty-printed easily
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_cross_layer_equalization.html b/releases/1.32.2/api_docs/torch_cross_layer_equalization.html new file mode 100644 index 00000000..cf5eaee8 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_cross_layer_equalization.html @@ -0,0 +1,1210 @@ + + + + + + AIMET PyTorch Cross Layer Equalization APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET PyTorch Cross Layer Equalization APIs

+ + +
+

Introduction

+
+
AIMET functionality for PyTorch Cross Layer Equalization has 3 features-
    +
  • BatchNorm Folding

  • +
  • Cross Layer Scaling

  • +
  • High Bias Fold

  • +
+
+
+
+
+

Cross Layer Equalization API

+

The following API performs BatchNorm fold followed by Cross Layer Scaling followed by High Bias Fold.

+

Note: High Bias fold will not happen when the below API is used, if the model does not have BatchNorm layers

+

API for Cross Layer Equalization

+
+
+aimet_torch.cross_layer_equalization.equalize_model(model, input_shapes, dummy_input=None)[source]
+

High-level API to perform Cross-Layer Equalization (CLE) on the given model. The model is equalized in place.

+
+
Parameters
+
    +
  • model (Module) – Model to equalize

  • +
  • input_shapes (Union[Tuple, List[Tuple]]) – Shape of the input (can be a tuple or a list of tuples if multiple inputs)

  • +
  • dummy_input (Union[Tensor, Tuple, None]) – A dummy input to the model. Can be a Tensor or a Tuple of Tensors

  • +
+
+
Returns
+

None

+
+
+
+ +
+

+
+
+
+

Code Example

+

Required imports

+
from torchvision import models
+from aimet_torch.cross_layer_equalization import equalize_model
+
+
+

Cross Layer Equalization in auto mode

+
def cross_layer_equalization_auto():
+    model = models.resnet18(pretrained=True)
+
+    input_shape = (1, 3, 224, 224)
+
+    model = model.eval()
+
+    # Performs BatchNorm fold, Cross layer scaling and High bias folding
+    equalize_model(model, input_shape)
+
+
+
+
+

Primitive APIs

+

If the user would like to call the APIs individually, the following APIs can be used; a combined usage sketch is also shown after the list:

+ +
+
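
For reference, a combined sketch of calling the primitive APIs in sequence is shown below; exact signatures may vary slightly across AIMET versions.

+
from torchvision import models
+from aimet_torch import batch_norm_fold, cross_layer_equalization
+
+def cross_layer_equalization_step_by_step():
+    model = models.resnet18(pretrained=True).eval()
+    input_shape = (1, 3, 224, 224)
+
+    # 1. Fold BatchNorm layers and keep the (conv, bn) pairs for the high-bias fold step
+    folded_pairs = batch_norm_fold.fold_all_batch_norms(model, input_shape)
+    bn_dict = {conv: bn for conv, bn in folded_pairs}
+
+    # 2. Cross layer scaling
+    cls_set_info_list = cross_layer_equalization.CrossLayerScaling.scale_model(model, input_shape)
+
+    # 3. High bias fold
+    cross_layer_equalization.HighBiasFold.bias_fold(cls_set_info_list, bn_dict)
+
+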
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_layer_output_generation.html b/releases/1.32.2/api_docs/torch_layer_output_generation.html new file mode 100644 index 00000000..6971643f --- /dev/null +++ b/releases/1.32.2/api_docs/torch_layer_output_generation.html @@ -0,0 +1,1265 @@ + + + + + + AIMET PyTorch Layer Output Generation API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET PyTorch Layer Output Generation API

+

This API captures and saves intermediate layer-outputs of a model. The model can be original (FP32) or quantsim. +The layer-outputs are named according to the exported PyTorch/ONNX/TorchScript model by the quantsim export API. +This allows layer-output comparison amongst the FP32 model, the quantization simulated model and the actually quantized model +on the target device, to debug accuracy mismatch issues.

+
+

Top-level API

+
+
+class aimet_torch.layer_output_utils.LayerOutputUtil(model, dir_path, naming_scheme=NamingScheme.PYTORCH, dummy_input=None, onnx_export_args=None)[source]
+

Implementation to capture and save outputs of intermediate layers of a model (fp32/quantsim).

+

Constructor for LayerOutputUtil.

+
+
Parameters
+
    +
  • model (Module) – Model whose layer-outputs are needed.

  • +
  • dir_path (str) – Directory wherein layer-outputs will be saved.

  • +
  • naming_scheme (NamingScheme) – Naming scheme to be followed to name layer-outputs. There are multiple schemes as per +the exported model (pytorch, onnx or torchscript). Refer the NamingScheme enum definition.

  • +
  • dummy_input (Union[Tensor, Tuple, List, None]) – Dummy input to model. Required if naming_scheme is ‘NamingScheme.ONNX’ or ‘NamingScheme.TORCHSCRIPT’.

  • +
  • onnx_export_args (Union[OnnxExportApiArgs, Dict, None]) – Should be same as that passed to quantsim export API to have consistency between +layer-output names present in exported onnx model and generated layer-outputs. Required if naming_scheme is +‘NamingScheme.ONNX’.

  • +
+
+
+
+ +
+

+
+

The following API can be used to Generate Layer Outputs

+
+
+LayerOutputUtil.generate_layer_outputs(input_batch)[source]
+

This method captures output of every layer of a model & saves the inputs and corresponding layer-outputs to disk.

+
+
Parameters
+

input_batch (Union[Tensor, List[Tensor], Tuple[Tensor]]) – Batch of inputs for which we want to obtain layer-outputs.

+
+
Returns
+

None

+
+
+
+ +
+

+
+
+
+

Enum Definition

+

Naming Scheme Enum

+
+
+class aimet_torch.layer_output_utils.NamingScheme(value)[source]
+

Enumeration of layer-output naming schemes.

+
+
+ONNX = 2
+

Names outputs according to exported onnx model. Layer output names are generally numeric.

+
+ +
+
+PYTORCH = 1
+

Names outputs according to exported pytorch model. Layer names are used.

+
+ +
+
+TORCHSCRIPT = 3
+

Names outputs according to exported torchscript model. Layer output names are generally numeric.

+
+ +
+ +
+

+
+
+
+

Code Example

+

Imports

+
import torch
+
+from aimet_torch.quantsim import QuantizationSimModel, load_encodings_to_sim
+
+from aimet_torch.layer_output_utils import LayerOutputUtil, NamingScheme
+from aimet_torch.onnx_utils import OnnxExportApiArgs
+
+
+

Obtain Original or QuantSim model from AIMET Export Artifacts

+
# Load the model on CPU device. Ensure model definition is present in the PYTHONPATH to successfully load the model.
+# If exported on CPU, load this way.
+model = torch.load('path/to/aimet_export_artifacts/model.pth')
+# Or
+# If exported on GPU, load this way.
+# model = torch.load('path/to/aimet_export_artifacts/model.pth', map_location=torch.device('cpu'))
+
+dummy_input = torch.rand(1, 3, 224, 224)
+
+# Use same arguments as that were used for the exported QuantSim model. For sake of simplicity only mandatory arguments are passed below.
+quantsim = QuantizationSimModel(model=model, dummy_input=dummy_input)
+
+# Load exported encodings into quantsim object
+load_encodings_to_sim(quantsim, 'path/to/aimet_export_artifacts/model_torch.encodings')
+
+# Check whether constructed original and quantsim model are running properly before using Layer Output Generation API.
+_ = model(dummy_input)
+_ = quantsim.model(dummy_input)
+
+
+

Obtain inputs for which we want to generate intermediate layer-outputs

+
# Use same input pre-processing pipeline as was used for computing the quantization encodings.
+input_batches = get_pre_processed_inputs()
+
+
+

Generate layer-outputs

+
# Use original model to get fp32 layer-outputs
+fp32_layer_output_util = LayerOutputUtil(model=model, dir_path='./fp32_layer_outputs', naming_scheme=NamingScheme.ONNX,
+                                         dummy_input=dummy_input, onnx_export_args=OnnxExportApiArgs())
+# Use quantsim model to get quantsim layer-outputs
+quantsim_layer_output_util = LayerOutputUtil(model=quantsim.model, dir_path='./quantsim_layer_outputs', naming_scheme=NamingScheme.ONNX,
+                                             dummy_input=dummy_input, onnx_export_args=OnnxExportApiArgs())
+for input_batch in input_batches:
+    fp32_layer_output_util.generate_layer_outputs(input_batch)
+    quantsim_layer_output_util.generate_layer_outputs(input_batch)
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_model_guidelines.html b/releases/1.32.2/api_docs/torch_model_guidelines.html new file mode 100644 index 00000000..c825d47e --- /dev/null +++ b/releases/1.32.2/api_docs/torch_model_guidelines.html @@ -0,0 +1,1246 @@ + + + + + + PyTorch Model Guidelines — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

PyTorch Model Guidelines

+

In order to make full use of AIMET features, there are several guidelines users are encouraged to follow when defining +PyTorch models.

+

Model should support conversion to onnx

+

The model definition should support conversion to onnx. Users can check the compatibility of their model for onnx conversion as +shown below:

+
...
+model = Model()
+torch.onnx.export(model, <dummy_input>, <onnx_file_name>)
+
+
+

Model should be jit traceable

+

The model definition should be jit traceable. Users can check the compatibility of their model for jit tracing as +shown below:

+
...
+model = Model()
+torch.jit.trace(model, <dummy_input>)
+
+
+

Define layers as modules instead of using torch.nn.functional equivalents

+

When using activation functions and other stateless layers, PyTorch will allow the user to either

+
    +
  • define the layers as modules (instantiated in the constructor and used in the forward pass), or

  • +
  • use a torch.nn.functional equivalent purely in the forward pass

  • +
+

For the AIMET quantization simulation model to add simulation nodes, AIMET requires the former (layers defined as modules). +Changing the model definition to use modules instead of functionals is mathematically equivalent and does not require +the model to be retrained.

+

As an example, if the user had:

+
def forward(...):
+    ...
+    x = torch.nn.functional.relu(x)
+    ...
+
+
+

Users should instead define their model as:

+
def __init__(self,...):
+    ...
+    self.relu = torch.nn.ReLU()
+    ...
+
+def forward(...):
+    ...
+    x = self.relu(x)
+    ...
+
+
+

This will not be possible in certain cases where operations can only be represented as functionals and not as class +definitions, but should be followed whenever possible.

+

Users can also automate this by using the Model Preparer API

+

Avoid reuse of class defined modules

+

Modules defined in the class definition should only be used once. If any modules are being reused, instead define a new +identical module in the class definition. +For example, if the user had:

+
def __init__(self,...):
+    ...
+    self.relu = torch.nn.ReLU()
+    ...
+
+def forward(...):
+    ...
+    x = self.relu(x)
+    ...
+    x2 = self.relu(x2)
+    ...
+
+
+

Users should instead define their model as:

+
def __init__(self,...):
+    ...
+    self.relu = torch.nn.ReLU()
+    self.relu2 = torch.nn.ReLU()
+    ...
+
+def forward(...):
+    ...
+    x = self.relu(x)
+    ...
+    x2 = self.relu2(x2)
+    ...
+
+
+

Users can also automate this by using the Model Preparer API

+

Use only torch.Tensor or tuples of torch.Tensors as model/submodule inputs and outputs

+

Modules should use tensors or tuples of tensors for inputs and outputs in order to support conversion of the model to onnx. +For example, if the user had:

+
def __init__(self,...):
+...
+def forward(self, inputs: Dict[str, torch.Tensor]):
+    ...
+    x = self.conv1(inputs['image_rgb'])
+    rgb_output = self.relu1(x)
+    ...
+    x = self.conv2(inputs['image_bw'])
+    bw_output = self.relu2(x)
+    ...
+    return { 'rgb': rgb_output, 'bw': bw_output }
+
+
+

Users should instead define their model as:

+
def __init__(self,...):
+...
+def forward(self, image_rgb, image_bw):
+    ...
+    x = self.conv1(image_rgb)
+    rgb_output = self.relu1(x)
+    ...
+    x = self.conv2(image_bw)
+    bw_output = self.relu2(x)
+    ...
+    return rgb_output, bw_output
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_model_preparer.html b/releases/1.32.2/api_docs/torch_model_preparer.html new file mode 100644 index 00000000..92e5c0a0 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_model_preparer.html @@ -0,0 +1,1430 @@ + + + + + + Model Preparer API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

Model Preparer API

+

The AIMET PyTorch ModelPreparer API uses the graph transformation feature available in PyTorch 1.9+ and automates +the model definition changes required by the user. For example, it changes functionals defined in the forward pass to +torch.nn.Module type modules for activation and elementwise functions. Also, when torch.nn.Module type modules are reused, +it unrolls them into independent modules.

+

Users are strongly encouraged to use AIMET PyTorch ModelPreparer API first and then use the returned model as input +to all the AIMET Quantization features.

+

The AIMET PyTorch ModelPreparer API requires PyTorch version 1.9 or higher.

+
+

Top-level API

+
+
+aimet_torch.model_preparer.prepare_model(model, modules_to_exclude=None, module_classes_to_exclude=None, concrete_args=None)[source]
+

Prepare and modify the pytorch model for AIMET features using torch.FX symbolic tracing API.

+
    +
  1. Replace torch.nn.functional by module of type torch.nn.Module

  2. +
  3. Create new independent torch.nn.Module instances for reused/duplicate module

  4. +
+
+
Parameters
+
    +
  • model (Module) – pytorch Model to be modified.

  • +
  • modules_to_exclude (Optional[List[Module]]) – List of modules to exclude when tracing.

  • +
  • module_classes_to_exclude (Optional[List[Callable]]) – List of module classes to exclude when tracing.

  • +
  • concrete_args (Optional[Dict[str, Any]]) – Allows you to partially specialize your function, whether it’s to remove control flow or +data structures. If the model has control flow, torch.fx won’t be able to trace the model. Check +torch.fx.symbolic_trace API in detail.

  • +
+
+
Return type
+

GraphModule

+
+
Returns
+

Modified pytorch Model

+
+
+
+ +
+
+

Code Examples

+

Required imports

+

+import torch
+import torch.nn.functional as F
+from aimet_torch.model_preparer import prepare_model
+
+
+
+

Example 1: Model with Functional relu

+

We begin with the following model, which contains two functional relus and a relu method inside the forward method.

+
class ModelWithFunctionalReLU(torch.nn.Module):
+    """ Model that uses functional ReLU instead of nn.Modules. Expects input of shape (1, 3, 32, 32) """
+    def __init__(self):
+        super(ModelWithFunctionalReLU, self).__init__()
+        self.conv1 = torch.nn.Conv2d(3, 6, 5)
+        self.conv2 = torch.nn.Conv2d(6, 16, 5)
+        self.fc1 = torch.nn.Linear(9216, 128)
+        self.fc2 = torch.nn.Linear(128, 10)
+
+    def forward(self, x):
+        x = F.relu(self.conv1(x))
+        x = F.relu(self.conv2(x))
+        x = torch.flatten(x, 1)
+        x = F.relu(self.fc1(x))
+        x = self.fc2(x).relu()
+        return x
+
+
+

Run the model preparer API on the model by passing in the model.

+
def model_preparer_functional_example():
+
+    # Load the model and keep in eval() mode
+    model = ModelWithFunctionalReLU().eval()
+    input_shape = (1, 3, 32, 32)
+    input_tensor = torch.randn(*input_shape)
+
+    # Call to prepare_model API
+    prepared_model = prepare_model(model)
+    print(prepared_model)
+
+    # Compare the outputs of original and transformed model
+    assert torch.allclose(model(input_tensor), prepared_model(input_tensor))
+
+
+

After that, we get prepared_model, which is functionally the same as the original model. Users can verify this by comparing +the outputs of both models.

+

prepared_model should have all three functional relus now converted to torch.nn.ReLU modules which satisfy +model guidelines described here Model Guidelines.

+

Example 2: Model with reused torch.nn.ReLU module

+

We begin with the following model, which contains torch.nn.ReLU module which is used at multiple instances inside +model forward function.

+
class ModelWithReusedReLU(torch.nn.Module):
+    """ Model that uses single ReLU instances multiple times in the forward. Expects input of shape (1, 3, 32, 32) """
+    def __init__(self):
+        super(ModelWithReusedReLU, self).__init__()
+        self.conv1 = torch.nn.Conv2d(3, 6, 5)
+        self.conv2 = torch.nn.Conv2d(6, 16, 5)
+        self.relu = torch.nn.ReLU()
+        self.fc1 = torch.nn.Linear(9216, 128)
+        self.fc2 = torch.nn.Linear(128, 10)
+
+    def forward(self, x):
+        x = self.conv1(x)
+        x = self.relu(x)
+        x = self.conv2(x)
+        x = self.relu(x)
+        x = torch.flatten(x, 1)
+        x = self.fc1(x)
+        x = self.relu(x)
+        x = self.fc2(x)
+        x = self.relu(x)
+        return x
+
+
+

Run the model preparer API on the model by passing in the model.

+
def model_preparer_reused_example():
+
+    # Load the model and keep in eval() mode
+    model = ModelWithReusedReLU().eval()
+    input_shape = (1, 3, 32, 32)
+    input_tensor = torch.randn(*input_shape)
+
+    # Call to prepare_model API
+    prepared_model = prepare_model(model)
+    print(prepared_model)
+
+    # Compare the outputs of original and transformed model
+    assert torch.allclose(model(input_tensor), prepared_model(input_tensor))
+
+
+

After that, we get prepared_model, which is functionally the same as the original model. Users can verify this by comparing +the outputs of both models.

+

prepared_model should have separate independent torch.nn.Module instances which satisfy model guidelines described +here Model Guidelines.

+

Example 3: Model with elementwise Add

+

We begin with the following model, which contains elementwise Add operation inside model forward function.

+
class ModelWithElementwiseAddOp(torch.nn.Module):
+    def __init__(self):
+        super(ModelWithElementwiseAddOp, self).__init__()
+        self.conv1 = torch.nn.Conv2d(3, 6, 5, bias=False)
+        self.conv2 = torch.nn.Conv2d(3, 6, 5)
+
+    def forward(self, *inputs):
+        x1 = self.conv1(inputs[0])
+        x2 = self.conv2(inputs[1])
+        x = x1 + x2
+        return x
+
+
+

Run the model preparer API on the model by passing in the model.

+
def model_preparer_elementwise_add_example():
+
+    # Load the model and keep in eval() mode
+    model = ModelWithElementwiseAddOp().eval()
+    input_shape = (1, 3, 32, 32)
+    input_tensor = [torch.randn(*input_shape), torch.randn(*input_shape)]
+
+    # Call to prepare_model API
+    prepared_model = prepare_model(model)
+    print(prepared_model)
+
+    # Compare the outputs of original and transformed model
+    assert torch.allclose(model(*input_tensor), prepared_model(*input_tensor))
+
+
+

After that, we get prepared_model, which is functionally the same as the original model. Users can verify this by comparing +the outputs of both models.

+
+
+

Limitations of torch.fx symbolic trace API

+

Limitations of torch.fx symbolic trace: https://pytorch.org/docs/stable/fx.html#limitations-of-symbolic-tracing

+

1. Dynamic control flow is not supported by torch.fx. +This refers to loops or if-else statements whose condition may depend on some of the input values. torch.fx can only trace one execution +path, and all the other branches that weren’t traced will be ignored. For example, the following simple function, when traced, +will fail with a TraceError saying that ‘symbolically traced variables cannot be used as inputs to control flow’:

+
def f(x, flag):
+    if flag:
+        return x
+    else:
+        return x*2
+
+torch.fx.symbolic_trace(f) # Fails!
+torch.fx.symbolic_trace(f, concrete_args={'flag': True}) # Works once the control-flow input is made concrete
+
+
+

Workarounds for this problem:

+
    +
  • Many cases of dynamic control flow can simply be converted to static control flow, which is supported by torch.fx +symbolic tracing. Static control flow refers to loops or if-else statements whose outcome cannot change +across different model forward passes. Such cases can be traced by removing the data dependency on input values, i.e. by +passing concrete values via ‘concrete_args’ to specialize the forward function.

  • +
  • For truly dynamic control flow, users should wrap such pieces of code at model-level scope using the torch.fx.wrap API, +which will preserve them as nodes instead of tracing through them:

    +
    @torch.fx.wrap
    +def custom_function_not_to_be_traced(x, y):
    +    """ Function which we do not want to be traced, when traced using torch FX API, call to this function will
    +    be inserted as call_function, and won't be traced through """
    +    for i in range(2):
    +        x += x
    +        y += y
    +    return x * x + y * y
    +
    +
    +
  • +
+

2. Non-torch functions which do not use the __torch_function__ mechanism are not supported by default in symbolic +tracing.

+

Workaround for this problem:

+
    +
  • If we do not want to capture them in symbolic tracing, then users should use the torch.fx.wrap() API at module-level scope:

    +
    import torch
    +import torch.fx
    +from math import sqrt  # assumed: sqrt refers to math.sqrt; needed so the forward pass below can run
    +torch.fx.wrap('len')  # call the API at module-level scope.
    +torch.fx.wrap('sqrt') # call the API at module-level scope.
    +
    +class ModelWithNonTorchFunction(torch.nn.Module):
    +    def __init__(self):
    +        super(ModelWithNonTorchFunction, self).__init__()
    +        self.conv = torch.nn.Conv2d(3, 4, kernel_size=2, stride=2, padding=2, bias=False)
    +
    +    def forward(self, *inputs):
    +        x = self.conv(inputs[0])
    +        return x / sqrt(len(x))
    +
    +model = ModelWithNonTorchFunction().eval()
    +model_transformed = prepare_model(model)
    +
    +
    +
  • +
+

3. Customizing the behavior of tracing by overriding the Tracer.is_leaf_module() API

+

In symbolic tracing, leaf modules appear as nodes rather than being traced through, and all the standard torch.nn modules +form the default set of leaf modules. This behavior can be changed by overriding the Tracer.is_leaf_module() API.

+

The AIMET model preparer API exposes a ‘modules_to_exclude’ argument which can be used to prevent certain module(s) from being traced through. For example, let’s examine the following code snippet where we don’t want to trace CustomModule further:

+
class CustomModule(torch.nn.Module):
+    @staticmethod
+    def forward(x):
+        return x * torch.nn.functional.softplus(x).sigmoid()
+
+class CustomModel(torch.nn.Module):
+    def __init__(self):
+        super(CustomModel, self).__init__()
+        self.conv1 = torch.nn.Conv2d(3, 8, kernel_size=2)
+        self.custom = CustomModule()
+
+    def forward(self, inputs):
+        x = self.conv1(inputs)
+        x = self.custom(x)
+        return x
+
+model = CustomModel().eval()
+prepared_model = prepare_model(model, modules_to_exclude=[model.custom])
+print(prepared_model)
+
+
+

In this example, ‘self.custom’ is preserved as a node and is not traced through.

+

4. Tensor constructors are not traceable

+

For example, let’s examine the following code snippet:

+
def f(x):
+    return torch.arange(x.shape[0], device=x.device)
+
+torch.fx.symbolic_trace(f)
+
+Error traceback:
+    return torch.arange(x.shape[0], device=x.device)
+    TypeError: arange() received an invalid combination of arguments - got (Proxy, device=Attribute), but expected one of:
+    * (Number end, *, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
+    * (Number start, Number end, Number step, *, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
+
+
+

The above snippet is problematic because the arguments to torch.arange() are input dependent. Workarounds for this problem:

+
    +
  • Use deterministic constructors (hard-coded values) so that the value they produce is embedded as a constant in the graph:

    +
    def f(x):
    +    return torch.arange(10, device=torch.device('cpu'))
    +
    +
    +
  • +
  • Or use the torch.fx.wrap API to wrap torch.arange() and call the wrapped function instead:

    +
    @torch.fx.wrap
    +def do_not_trace_me(x):
    +    return torch.arange(x.shape[0], device=x.device)
    +
    +def f(x):
    +    return do_not_trace_me(x)
    +
    +torch.fx.symbolic_trace(f)
    +
    +
    +
  • +
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_model_validator.html b/releases/1.32.2/api_docs/torch_model_validator.html new file mode 100644 index 00000000..ba703e85 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_model_validator.html @@ -0,0 +1,1285 @@ + + + + + + Model Validator Utility — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

Model Validator Utility

+

AIMET provides a model validator utility to help check whether AIMET features can be applied to a PyTorch model. The model validator currently checks for the following conditions:

+
    +
  • No modules are reused

  • +
  • Operations have modules associated with them and are not defined as Functionals (excluding a set of known operations)

  • +
+

In this section, we present models that fail the validation checks, show how to run the model validator, and show how to fix the models so the validation checks pass.

+

Example 1: Model with reused modules

+

We begin with the following model, which uses the same relu module instance twice in its forward pass.

+
class ModelWithReusedNodes(torch.nn.Module):
+    """ Model that reuses a relu module. Expects input of shape (1, 3, 32, 32) """
+
+    def __init__(self):
+        super(ModelWithReusedNodes, self).__init__()
+        self.conv1 = torch.nn.Conv2d(3, 8, kernel_size=2, stride=2, padding=2, bias=False)
+        self.bn1 = torch.nn.BatchNorm2d(8)
+        self.relu1 = torch.nn.ReLU(inplace=True)
+        self.linear = torch.nn.Linear(2592, 10)
+
+    def forward(self, *inputs):
+        x = self.conv1(inputs[0])
+        x = self.relu1(x)
+        x = self.bn1(x)
+        x = self.relu1(x)
+        x = x.view(x.size(0), -1)
+        x = self.linear(x)
+        return x
+
+
+

Import the model validator:

+
from aimet_torch.model_validator.model_validator import ModelValidator
+
+
+

Run the model validator on the model by passing in the model as well as model input:

+
def validate_example_model():
+
+    # Load the model to validate
+    model = ModelWithReusedNodes()
+
+    # Output of ModelValidator.validate_model will be True if model is valid, False otherwise
+    ModelValidator.validate_model(model, model_input=torch.rand(1, 3, 32, 32))
+
+
+

For each validation check run on the model, a logger print will appear:

+
Utils - INFO - Running validator check <function validate_for_reused_modules at 0x7f127685a598>
+
+
+

If the validation check finds any issues with the model, the log will contain information on how to resolve them:

+
Utils - WARNING - The following modules are used more than once in the model: ['relu1']
+AIMET features are not designed to work with reused modules. Please redefine your model to use distinct modules for
+each instance.
+
+
+

Finally, at the end of the validation, any failing checks will be logged:

+
Utils - INFO - The following validator checks failed:
+Utils - INFO -     <function validate_for_reused_modules at 0x7f127685a598>
+
+
+

In this case, the validate_for_reused_modules check informs us that the relu1 module is used multiple times in the model. We rewrite the model by defining a separate relu instance for each usage:

+
class ModelWithoutReusedNodes(torch.nn.Module):
+    """ Model that is fixed to not reuse modules. Expects input of shape (1, 3, 32, 32) """
+
+    def __init__(self):
+        super(ModelWithoutReusedNodes, self).__init__()
+        self.conv1 = torch.nn.Conv2d(3, 8, kernel_size=2, stride=2, padding=2, bias=False)
+        self.bn1 = torch.nn.BatchNorm2d(8)
+        self.relu1 = torch.nn.ReLU(inplace=True)
+        self.relu2 = torch.nn.ReLU(inplace=True)
+        self.linear = torch.nn.Linear(2592, 10)
+
+    def forward(self, *inputs):
+        x = self.conv1(inputs[0])
+        x = self.relu1(x)
+        x = self.bn1(x)
+        x = self.relu2(x)
+        x = x.view(x.size(0), -1)
+        x = self.linear(x)
+        return x
+
+
+

Now, after rerunning the model validator, all checks pass:

+
Utils - INFO - Running validator check <function validate_for_reused_modules at 0x7ff577373598>
+Utils - INFO - Running validator check <function validate_for_missing_modules at 0x7ff5703eff28>
+Utils - INFO - All validation checks passed.
+
+
+

Example 2: Model with functionals

+

We start with the following model, which uses a torch linear functional layer in the forward pass:

+
import torch.nn.functional as F
+
+class ModelWithFunctionalLinear(torch.nn.Module):
+    """ Model that uses a torch functional linear layer. Expects input of shape (1, 3, 32, 32) """
+
+    def __init__(self):
+        super(ModelWithFunctionalLinear, self).__init__()
+        self.conv1 = torch.nn.Conv2d(3, 8, kernel_size=2, stride=2, padding=2, bias=False)
+        self.bn1 = torch.nn.BatchNorm2d(8)
+        self.relu1 = torch.nn.ReLU(inplace=True)
+        self.relu2 = torch.nn.ReLU(inplace=True)
+
+    def forward(self, *inputs):
+        x = self.conv1(inputs[0])
+        x = self.relu1(x)
+        x = self.bn1(x)
+        x = self.relu2(x)
+        x = x.view(x.size(0), -1)
+        x = F.linear(x, torch.randn(10, 2592))
+        return x
+
+
+

Running the model validator shows the validate_for_missing_modules check failing:

+
Utils - INFO - Running validator check <function validate_for_missing_modules at 0x7f9dd9bd90d0>
+Utils - WARNING - Ops with missing modules: ['matmul_8']
+This can be due to several reasons:
+1. There is no mapping for the op in ConnectedGraph.op_type_map. Add a mapping for ConnectedGraph to recognize and
+be able to map the op.
+2. The op is defined as a functional in the forward function, instead of as a class module. Redefine the op as a
+class module if possible. Else, check 3.
+3. This op is one that cannot be defined as a class module, but has not been added to ConnectedGraph.functional_ops.
+Add to continue.
+Utils - INFO - The following validator checks failed:
+Utils - INFO -      <function validate_for_missing_modules at 0x7f9dd9bd90d0>
+
+
+

The check has identified matmul_8 as an operation with a missing PyTorch module. In this case, it is due to reason #2 in the log: the layer has been defined as a functional in the forward function. We rewrite the model, defining the layer as a module instead, in order to resolve the issue.

+
class ModelWithoutFunctionalLinear(torch.nn.Module):
+    """ Model that is fixed to use a linear module instead of functional. Expects input of shape (1, 3, 32, 32) """
+
+    def __init__(self):
+        super(ModelWithoutFunctionalLinear, self).__init__()
+        self.conv1 = torch.nn.Conv2d(3, 8, kernel_size=2, stride=2, padding=2, bias=False)
+        self.bn1 = torch.nn.BatchNorm2d(8)
+        self.relu1 = torch.nn.ReLU(inplace=True)
+        self.relu2 = torch.nn.ReLU(inplace=True)
+        self.linear = torch.nn.Linear(2592, 10)
+        with torch.no_grad():
+            self.linear.weight = torch.nn.Parameter(torch.randn(10, 2592))
+
+    def forward(self, *inputs):
+        x = self.conv1(inputs[0])
+        x = self.relu1(x)
+        x = self.bn1(x)
+        x = self.relu2(x)
+        x = x.view(x.size(0), -1)
+        x = self.linear(x)
+        return x
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_multi_gpu.html b/releases/1.32.2/api_docs/torch_multi_gpu.html new file mode 100644 index 00000000..868cefbf --- /dev/null +++ b/releases/1.32.2/api_docs/torch_multi_gpu.html @@ -0,0 +1,1162 @@ + + + + + + PyTorch Multi-GPU support — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

PyTorch Multi-GPU support

+

Currently, AIMET supports models using multi-GPU in data parallel mode with the following features:

+
    +
  1. Cross-Layer Equalization (CLE)

  2. +
  3. Quantization Aware Training (QAT)

  4. +
+

A user can create a Data Parallel model using torch APIs. For example:

+
# Instantiate a torch model and pass it to DataParallel API
+model = torch.nn.DataParallel(model)
+
+
+

Multi-GPU with CLE

+

For using multi-GPU with CLE, you can pass the model created above directly to the CLE API (Cross-Layer Equalization API), as sketched below.

+

NOTE: CLE doesn’t actually make use of multi-GPU; it is only integrated as part of the workflow so that the user need not move the model back and forth between single-GPU and multi-GPU.

+
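For illustration, a minimal sketch of this flow is shown below, assuming the high-level equalize_model API from the Cross-Layer Equalization documentation; the ResNet18 model and input shape are only illustrative:

import torch
from torchvision import models
from aimet_torch.cross_layer_equalization import equalize_model

# Instantiate a torch model and wrap it in DataParallel
model = models.resnet18(pretrained=True).eval()
model = torch.nn.DataParallel(model)

# Pass the DataParallel model directly to the CLE API
equalize_model(model, input_shapes=(1, 3, 224, 224))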

Multi-GPU with Quantization Aware Training

+

For using multi-GPU with QAT, follow the steps below (a consolidated sketch is shown after the list):

+
    +
  1. Create a QuantizationSim as shown in Quantization Simulation API using a torch model (Not in DataParallel mode)

  2. +
  3. Perform compute encodings (NOTE: Do not use a forward function that moves the model to multi-gpu and back)

  4. +
  5. Move sim model to DataParallel:

    +
    sim.model = torch.nn.DataParallel(sim.model)
    +
    +
    +
  6. +
  7. Perform Eval/Training

  8. +
+
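Putting the steps above together, a minimal sketch of the flow is shown below. Here pass_calibration_data is a hypothetical user-supplied forward-pass callback, and the input shape is only illustrative:

import torch
from aimet_torch.quantsim import QuantizationSimModel

# 1. Create the QuantizationSim from the plain torch model (not DataParallel)
dummy_input = torch.rand(1, 3, 224, 224).cuda()
sim = QuantizationSimModel(model, dummy_input=dummy_input)

# 2. Compute encodings with a forward pass that stays on a single GPU
sim.compute_encodings(pass_calibration_data, forward_pass_callback_args=None)

# 3. Move the sim model to DataParallel
sim.model = torch.nn.DataParallel(sim.model)

# 4. Perform eval/training on sim.model using the existing pipeline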
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_peft_lora.html b/releases/1.32.2/api_docs/torch_peft_lora.html new file mode 100644 index 00000000..9901f104 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_peft_lora.html @@ -0,0 +1,1406 @@ + + + + + + Top-level API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Top-level API

+
+
+class aimet_torch.peft.AdapterMetaData[source]
+

Tracks meta data for lora layers: the names of lora_A and lora_B, as well as alpha values. Attributes: lora_A, lora_B, alpha

+
+ +
+

+
+

The following API can be used to replace PEFT lora layers definition with AIMET lora layers definition

+
+
+peft.replace_lora_layers_with_quantizable_layers()
+

Utility to replace lora layers with Quantizable Lora layers

+
+
Parameters
+

model (Module) – PEFT model

+
+
+
+ +
+

+
+

The following API can be used to save adapter weights if model adaptations were performed that change the names/types of modules

+
+
+peft.save_lora_weights_after_adaptation(path, filename_prefix)
+

Utility to save model weights after model adaptations

+
+
Parameters
+
    +
  • model (Module) – PEFT model

  • +
  • path (str) – path where to store model pth and encodings

  • +
  • filename_prefix (str) – Prefix to use for filenames

  • +
+
+
+
+ +
+

+
+

The following API can be used to track lora meta data, which is then passed to the PEFT utilities

+
+
+peft.track_lora_meta_data(path, filename_prefix, replaced_module_type=None)
+

Utility to track and save meta data for adapters. The meta data has adapter names and corresponding lora layers & alphas

+
+
Parameters
+
    +
  • model (Module) – PEFT model

  • +
  • path (str) – path where to store model pth and encodings

  • +
  • filename_prefix (str) – Prefix to use for filenames

  • +
  • replaced_module_type (Optional[Type[Module]]) – If the lora linear layer is replaced by another torch module, then replaced_module_type represents the type with which the linear layer was replaced. Otherwise pass None

  • +
+
+
Return type
+

Dict[str, AdapterMetaData]

+
+
+
+ +
+

+
+
+
+class aimet_torch.peft.PeftQuantUtils(adapater_name_to_meta_data, name_to_module_dict=None)[source]
+

Utilities for quantizing peft model

+

Init for Peft utilities for quantization

+
+
Parameters
+
    +
  • adapater_name_to_meta_data (Dict[str, AdapterMetaData]) – Dict mapping adapter name to meta data. Output of track_meta_data

  • +
  • name_to_module_dict – PT Name to module prepared model name mapping

  • +
+
+
+
+
+disable_lora_adapters(sim)[source]
+

Disables the adapter effect (zeroes out weights for lora A & B) on the base model by loading the modified weights into the model

+
+
Parameters
+

sim (QuantizationSimModel) – QuantSim model

+
+
+
+ +
+
+enable_adapter_and_load_weights(sim, adapter_weights_path, use_safetensor=True)[source]
+

Enables adapter effect on base model by loading weights to model

+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – QuantSim model

  • +
  • adapter_weights_path – Path to adapter weights (adapter weights should be either bin file or safetensor)

  • +
  • use_safetensor (bool) – True if the adapter weights path points to a safetensor file; False if it points to a bin file

  • +
+
+
+
+ +
+
+export_adapter_weights(sim, path, filename_prefix, onnx_model_path)[source]
+

Exports adapter weights to safetensor format

+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – QuantSim model

  • +
  • path (str) – path where to store model pth and encodings

  • +
  • filename_prefix (str) – Prefix to use for filenames of the model pth and encodings files

  • +
  • onnx_model_path (str) – Path from where we can load the exported onnx model. This can be the same path to where +QuantSim exported the ONNX model

  • +
+
+
+
+ +
+
+freeze_base_model(sim)[source]
+

Freeze entire base model

+
+
Parameters
+

sim (QuantizationSimModel) – QuantSim model

+
+
+
+ +
+
+freeze_base_model_activation_quantizers(sim)[source]
+

Freeze activation quantizers of base model

+
+
Parameters
+

sim (QuantizationSimModel) – QuantSim model

+
+
+
+ +
+
+freeze_base_model_param_quantizers(sim)[source]
+

Freeze parameter quantizers of base model

+
+
Parameters
+

sim (QuantizationSimModel) – QuantSim model

+
+
+
+ +
+
+get_quantized_lora_layer(sim)[source]
+

This function can be used to generate quantized lora layers. Use cases: 1) New quantizers can be created and assigned to a quantized lora layer.

+
+

New quantizers may be required when changing the dtype, or when switching between per-channel and per-tensor quantization. 2) New values can be assigned to the symmetric and bitwidth attributes.

+
+
+
Parameters
+

sim (QuantizationSimModel) – QuantSim model

+
+
+
+ +
+
+set_bitwidth_for_lora_adapters(sim, output_bw, param_bw)[source]
+

Sets output and param bitwidth for all Lora adapters added to the model

+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – QuantSim model

  • +
  • output_bw (int) – Output BW

  • +
  • param_bw (int) – Parameter BW

  • +
+
+
+
+ +
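For instance, a short usage sketch, assuming peft_utils and sim have been created as described in the user flow below (the bitwidth values are only illustrative):

>>> # Use 16-bit activations and 4-bit parameters for all lora adapters
>>> peft_utils.set_bitwidth_for_lora_adapters(sim, output_bw=16, param_bw=4)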
+ +
+

+
+
+
+

User flow

+

Example:

+
    +
  1. Create a PEFT model with one adapter

    +
    >>> from peft import LoraConfig, get_peft_model
    +>>> lora_config = LoraConfig(
    +>>>    lora_alpha=16,
    +>>>    lora_dropout=0.1,
    +>>>    r=4,
    +>>>    bias="none",
    +>>>    target_modules=["linear"])
    +>>> model = get_peft_model(model, lora_config)
    +
    +
    +
  2. +
  3. Replace lora layer with AIMET lora layer

    +
    >>> from aimet_torch.peft import replace_lora_layers_with_quantizable_layers
    +>>> replace_lora_layers_with_quantizable_layers(model)
    +
    +
    +
  4. +
  5. Save lora weights for adapter model

    +
    >>> from aimet_torch.peft import save_lora_weights_after_adaptation
    +>>> save_lora_weights_after_adaptation(model, tmp_dir, 'lora_weights_after_adaptation_for_adapter1')
    +
    +
    +
  6. +
  7. Track meta data for lora layers

    +
    >>> from aimet_torch.peft import track_lora_meta_data
    +>>> meta_data = track_lora_meta_data(model, tmp_dir, 'meta_data')
    +>>> ## If linear lora layers were replaced with ConvInplaceLinear then
    +>>> meta_data = track_lora_meta_data(model, tmp_dir, 'meta_data', ConvInplaceLinear)
    +
    +
    +
  8. +
  9. Create Quant utilities

    +
    >>> from aimet_torch.peft import PeftQuantUtils
    +>>> peft_utils = PeftQuantUtils(meta_data)
    +>>> ## If we are using a prepared model, then load name to module dict that gets saved as a json file
    +>>> peft_utils = PeftQuantUtils(meta_data, name_to_module_dict)
    +
    +
    +
  10. +
+

The next step will be to create a QuantSim object (steps are not shown below; please refer to the quantsim docs for reference). Once the sim is created, we can use peft_utils to modify quantization attributes for lora layers in the sim

+
    +
  1. Disable lora adapters. To compute base model encodings without the effect of adapters

    +
    >>> peft_utils.disable_lora_adapters(sim)
    +
    +
    +
  2. +
  3. Compute Encodings for sim (Not shown below, refer to quantsim docs) & freeze base model encodings for params

    +
    >>> peft_utils.freeze_base_model_param_quantizers(sim)
    +
    +
    +
  4. +
  5. Export base model and encodings

    +
    >>> sim.export(tmpdir, 'model', dummy_input=dummy_inputs, export_model=True, filename_prefix_encodings='base_encodings')
    +
    +
    +
  6. +
  7. Load adapter weights

    +
    >>> peft_utils.enable_adapter_and_load_weights(sim, 'tmpdir/lora_weights_after_adaptation_for_adapter1.safetensor', use_safetensor=True)
    +
    +
    +
  8. +
  9. Configure lora adapter quantizers

    +
    >>> for name, lora_module in peft_utils.get_quantized_lora_layer(sim):
    +>>>     ### Change bitwidth
    +>>>     lora_module.param_quantizers['weight'].bitwidth = 16
    +>>>     ### Change per tensor to per channel
    +>>>     lora_module.param_quantizers['weight'] = aimet.quantization.affine.QuantizeDequantize(shape=(1, 1, 1, 1), bitwidth=16, symmetric=True).to(lora_module.weight.device)
    +
    +
    +
  10. +
  11. Compute encodings for the model & export. Note: when exporting, the model directory should be the same for the base_model export and consecutive exports

    +
    >>> sim.export(tmpdir, 'model', dummy_input=dummy_inputs, export_model=False, filename_prefix_encodings='adapter1')
    +>>> peft_utils.export_adapter_weights(sim, tmpdir, 'adapter1_weights', 'tmpdir/model.onnx')
    +
    +
    +
  12. +
+
+ + +
+
+
+ +
+ +
+


+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_primitive_apis_cle.html b/releases/1.32.2/api_docs/torch_primitive_apis_cle.html new file mode 100644 index 00000000..29cd1067 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_primitive_apis_cle.html @@ -0,0 +1,1442 @@ + + + + + + AIMET PyTorch Cross Layer Equalization Primitive API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET PyTorch Cross Layer Equalization Primitive API

+
+

Introduction

+

If a user wants to modify the order of Cross Layer Equalization, skip some features, or manually tweak the list of layers that need to be equalized, the following APIs can be used.

+

Higher level APIs can be used to apply one or more features one after the other. They automatically find the layers to be folded or scaled.

+

Lower level APIs can be used to manually tweak the list of layers to be folded. The user has to pass the list of layers in the order in which they appear in the model.

+

Note: Before using High Bias Fold, Cross Layer Scaling (CLS) needs to be applied, and the scaling factors obtained from CLS need to be plugged into High Bias Fold. Also, if there are batchnorm layers, they need to be folded and the folding info saved so it can be plugged into the High Bias Fold API.

+
+
+

ClsSetInfo Definition

+
+
+class aimet_torch.cross_layer_equalization.ClsSetInfo(cls_pair_1, cls_pair_2=None)[source]
+

This class holds information about the layers in a CLS set, along with the corresponding scaling factors and other information such as whether there is a ReLU activation function between the CLS set layers

+

The constructor takes 2 pairs if a depth-wise separable layer is being folded

+
+
Parameters
+
+
+
+
+
+class ClsSetLayerPairInfo(layer1, layer2, scale_factor, relu_activation_between_layers)[source]
+

Models a pair of layers that were scaled using CLS, along with related information.

+
+
Parameters
+
    +
  • layer1 (Conv2d) – Layer whose bias is folded

  • +
  • layer2 (Conv2d) – Layer to which the previous layer’s bias is folded

  • +
  • scale_factor (ndarray) – Scale Factor found from Cross Layer Scaling to scale BN parameters

  • +
  • relu_activation_between_layers (bool) – If the activation between layer1 and layer2 is Relu

  • +
+
+
+
+ +
+ +
+

+
+
+
+

Higher Level APIs for Cross Layer Equalization

+

API for Batch Norm Folding

+
+
+aimet_torch.batch_norm_fold.fold_all_batch_norms(model, input_shapes, dummy_input=None)
+

Fold all batch_norm layers in a model into the weight of the corresponding conv layers

+
+
Parameters
+
    +
  • model (Module) – Model

  • +
  • input_shapes (Union[Tuple, List[Tuple]]) – Input shapes for the model (can be one or multiple inputs)

  • +
  • dummy_input (Union[Tensor, Tuple, None]) – A dummy input to the model. Can be a Tensor or a Tuple of Tensors

  • +
+
+
Return type
+

List[Tuple[Union[Linear, Conv1d, Conv2d, ConvTranspose2d], Union[BatchNorm1d, BatchNorm2d]]]

+
+
Returns
+

A list of pairs of layers [(Conv/Linear, BN layer that got folded)]

+
+
+
+ +
+

+
+

API for Cross Layer Scaling

+
+
+aimet_torch.cross_layer_equalization.CrossLayerScaling.scale_model(model, input_shapes, dummy_input=None)
+

Uses cross-layer scaling to scale all applicable layers in the given model

+
+
Parameters
+
    +
  • model (Module) – Model to scale

  • +
  • input_shapes (Union[Tuple, List[Tuple]]) – Input shape for the model (can be one or multiple inputs)

  • +
  • dummy_input (Union[Tensor, List[Tensor], None]) – Dummy input to the model. Used to parse model graph. User is expected to place the tensors on the appropriate device.

  • +
+
+
Return type
+

List[ClsSetInfo]

+
+
Returns
+

CLS information for each CLS set

+
+
+
+ +
+

+
+

API for High Bias Folding

+
+
+aimet_torch.cross_layer_equalization.HighBiasFold.bias_fold(cls_set_info_list, bn_layers)
+

Folds bias values greater than 3 * sigma to next layer’s bias

+
+
Parameters
+
    +
  • cls_set_info_list (List[ClsSetInfo]) – List of info elements for each cls set

  • +
  • bn_layers (Dict[Union[Conv2d, ConvTranspose2d], BatchNorm2d]) – Key: Conv/Linear layer Value: Corresponding folded BN layer

  • +
+
+
Returns
+

None

+
+
+
+ +
+

+
+
+
+

Code Examples for Higher Level APIs

+

Required imports

+
import torch
+from torchvision import models
+from aimet_torch import batch_norm_fold
+from aimet_torch import cross_layer_equalization
+from aimet_torch import utils
+
+
+

Cross Layer Equalization in auto mode calling each API

+
def cross_layer_equalization_auto_step_by_step():
+    model = models.resnet18(pretrained=True)
+
+    model = model.eval()
+    input_shape = (1, 3, 224, 224)
+    # Fold BatchNorm layers
+    folded_pairs = batch_norm_fold.fold_all_batch_norms(model, input_shape)
+    bn_dict = {}
+    for conv_bn in folded_pairs:
+        bn_dict[conv_bn[0]] = conv_bn[1]
+
+    # Replace any ReLU6 layers with ReLU
+    utils.replace_modules_of_type1_with_type2(model, torch.nn.ReLU6, torch.nn.ReLU)
+
+    # Perform cross-layer scaling on applicable layer sets
+    cls_set_info_list = cross_layer_equalization.CrossLayerScaling.scale_model(model, input_shape)
+
+    # Perform high-bias fold
+    cross_layer_equalization.HighBiasFold.bias_fold(cls_set_info_list, bn_dict)
+
+
+
+
+

Lower Level APIs for Cross Layer Equalization

+

API for Batch Norm Folding

+
+
+aimet_torch.batch_norm_fold.fold_given_batch_norms(model, layer_pairs)[source]
+

Fold a given set of batch_norm layers into conv layers

+
+
Parameters
+
    +
  • model – Model

  • +
  • layer_pairs – Pairs of conv and batch_norm layers to use for folding

  • +
+
+
Returns
+

None

+
+
+
+ +
+

+
+

API for Cross Layer Scaling

+
+
+aimet_torch.cross_layer_equalization.CrossLayerScaling.scale_cls_sets(cls_sets)
+

Scale multiple CLS sets

+
+
Parameters
+

cls_sets (List[Union[Tuple[Conv2d, Conv2d], Tuple[Conv2d, Conv2d, Conv2d]]]) – List of CLS sets

+
+
Return type
+

List[Union[ndarray, Tuple[ndarray]]]

+
+
Returns
+

Scaling factors calculated and applied for each CLS set in order

+
+
+
+ +
+

+
+

API for High bias folding

+
+
+aimet_torch.cross_layer_equalization.HighBiasFold.bias_fold(cls_set_info_list, bn_layers)
+

Folds bias values greater than 3 * sigma to next layer’s bias

+
+
Parameters
+
    +
  • cls_set_info_list (List[ClsSetInfo]) – List of info elements for each cls set

  • +
  • bn_layers (Dict[Union[Conv2d, ConvTranspose2d], BatchNorm2d]) – Key: Conv/Linear layer Value: Corresponding folded BN layer

  • +
+
+
Returns
+

None

+
+
+
+ +
+

+
+
+
+

Code Examples for Lower Level APIs

+

Required imports

+
from torchvision import models
+from aimet_torch import batch_norm_fold
+from aimet_torch import cross_layer_equalization
+from aimet_torch import utils
+
+
+

Cross Layer Equalization in manual mode

+
def cross_layer_equalization_manual():
+    model = models.resnet18(pretrained=True)
+
+    model = model.eval()
+
+    # Batch Norm Fold
+    # Create a list of conv/linear and BN layers for folding forward or backward
+    layer_list = [(model.conv1, model.bn1),
+                  (model.layer1[0].conv1, model.layer1[0].bn1)]
+
+    # Save the corresponding BN layers (needed only for high bias folding)
+    bn_dict = {}
+    for conv_bn in layer_list:
+        bn_dict[conv_bn[0]] = conv_bn[1]
+
+    batch_norm_fold.fold_given_batch_norms(model, layer_list)
+
+    # Replace any ReLU6 layers with ReLU
+    utils.replace_modules_of_type1_with_type2(model, torch.nn.ReLU6, torch.nn.ReLU)
+
+    # Cross Layer Scaling
+    # Create a list of consecutive conv layers to be equalized
+    consecutive_layer_list = [(model.conv1, model.layer1[0].conv1),
+                              (model.layer1[0].conv1, model.layer1[0].conv2)]
+
+    scaling_factor_list = cross_layer_equalization.CrossLayerScaling.scale_cls_sets(consecutive_layer_list)
+
+    # High Bias Fold
+    # Create a list of consecutive conv layers whose previous layers bias has to be folded to next layers bias
+    ClsSetInfo = cross_layer_equalization.ClsSetInfo
+    ClsPairInfo = cross_layer_equalization.ClsSetInfo.ClsSetLayerPairInfo
+    cls_set_info_list = [ClsSetInfo(ClsPairInfo(model.conv1, model.layer1[0].conv1, scaling_factor_list[0], True)),
+                         ClsSetInfo(ClsPairInfo(model.layer1[0].conv1, model.layer1[0].conv2, scaling_factor_list[1], True))]
+
+    cross_layer_equalization.HighBiasFold.bias_fold(cls_set_info_list, bn_dict)
+
+
+

Cross Layer Equalization in manual mode for Depthwise Separable layer

+
def cross_layer_equalization_depthwise_layers():
+    model = MobileNetV2().to(torch.device('cpu'))
+    model.eval()
+    # Batch Norm Fold
+    # Create a list of conv/linear and BN layers for folding forward or backward
+    layer_list = [(model.features[0][0], model.features[0][1]),
+                  (model.features[1].conv[0], model.features[1].conv[1]),
+                  (model.features[1].conv[3], model.features[1].conv[4])]
+
+    # Save the corresponding BN layers (needed only for high bias folding)
+    bn_dict = {}
+    for conv_bn in layer_list:
+        bn_dict[conv_bn[0]] = conv_bn[1]
+
+    batch_norm_fold.fold_given_batch_norms(model, layer_list)
+
+    # Replace any ReLU6 layers with ReLU
+    utils.replace_modules_of_type1_with_type2(model, torch.nn.ReLU6, torch.nn.ReLU)
+
+    # Cross Layer Scaling
+    # Create a list of consecutive conv layers to be equalized
+    consecutive_layer_list = [(model.features[0][0], model.features[1].conv[0], model.features[1].conv[3])]
+    scaling_factor_list = cross_layer_equalization.CrossLayerScaling.scale_cls_sets(consecutive_layer_list)
+
+    # High Bias Fold
+    # Create a list of consecutive conv layers whose previous layers bias has to be folded to next layers bias
+    ClsSetInfo = cross_layer_equalization.ClsSetInfo
+    ClsPairInfo = cross_layer_equalization.ClsSetInfo.ClsSetLayerPairInfo
+    cls_set_info_list = [ClsSetInfo(ClsPairInfo(model.features[0][0], model.features[1].conv[0], scaling_factor_list[0][0], True)),
+                         ClsSetInfo(ClsPairInfo(model.features[1].conv[0], model.features[1].conv[3], scaling_factor_list[0][1], True))]
+
+    cross_layer_equalization.HighBiasFold.bias_fold(cls_set_info_list, bn_dict)
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_quant_analyzer.html b/releases/1.32.2/api_docs/torch_quant_analyzer.html new file mode 100644 index 00000000..e5cc1af9 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_quant_analyzer.html @@ -0,0 +1,1537 @@ + + + + + + AIMET PyTorch Quant Analyzer API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET PyTorch Quant Analyzer API

+

AIMET PyTorch Quant Analyzer analyzes the PyTorch model and points out layers in the model that are sensitive to quantization. It checks model sensitivity to weight and activation quantization, and performs per-layer sensitivity and MSE analysis. It also exports per-layer encoding min and max ranges and a statistics histogram for every layer.

+ + +
+

Top-level API

+
+
+class aimet_common.utils.CallbackFunc(func, func_callback_args=None)[source]
+

Class encapsulating a callback function and its argument(s)

+
+
Parameters
+
    +
  • func (Callable) – Callable Function

  • +
  • func_callback_args – Arguments passed to the callable function as-is.

  • +
+
+
+
+ +
+

+
+
+
+class aimet_torch.quant_analyzer.QuantAnalyzer(model, dummy_input, forward_pass_callback, eval_callback, modules_to_ignore=None)[source]
+

QuantAnalyzer tool provides

+
+
    +
  1. model sensitivity to weight and activation quantization

  2. +
  3. per layer sensitivity analysis

  4. +
  5. per layer encoding (min - max range)

  6. +
  7. per layer PDF analysis and

  8. +
  9. per layer MSE analysis

  10. +
+
+
+
Parameters
+
    +
  • model (Module) – FP32 model to analyze for quantization.

  • +
  • dummy_input (Union[Tensor, Tuple]) – Dummy input to model.

  • +
  • forward_pass_callback (CallbackFunc) – A callback function for model calibration that simply runs +forward passes on the model to compute encoding (delta/offset). This +callback function should use representative data and should be subset of +entire train/validation dataset (~1000 images/samples).

  • +
  • eval_callback (CallbackFunc) – A callback function for model evaluation that determines model +performance. This callback function is expected to return scalar value +representing the model performance evaluated against entire test/evaluation dataset.

  • +
  • modules_to_ignore (Optional[List[Module]]) – Excludes certain modules from being analyzed.

  • +
+
+
+
+ +
+

+
+
+
+QuantAnalyzer.enable_per_layer_mse_loss(unlabeled_dataset_iterable, num_batches)[source]
+

Enable per layer MSE loss analysis.

+
+
Parameters
+
    +
  • unlabeled_dataset_iterable (Union[DataLoader, Collection]) – A collection (i.e. iterable with __len__) +that iterates over an unlabeled dataset. The values yielded by this iterable are expected +to be able to be passed directly to the model.

  • +
  • num_batches (int) – Number of batches. Approximately 256 samples/images are recommended, +so if batch size of data loader is 64, then 4 number of batches leads to 256 samples/images.

  • +
+
+
+
+ +
+

+
+
+
+QuantAnalyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced, default_param_bw=8, default_output_bw=8, config_file=None, results_dir='./tmp/')[source]
+
+
Analyze model for quantization and point out sensitive parts/hotspots of the model by performing
    +
  1. model sensitivity to quantization,

  2. +
  3. perform per layer sensitivity analysis by enabling and disabling quant wrappers,

  4. +
  5. export per layer encodings min - max ranges,

  6. +
  7. export per layer statistics histogram (PDF) when quant scheme is TF-Enhanced,

  8. +
  9. per layer MSE analysis

  10. +
+
+
+
+
Parameters
+
    +
  • quant_scheme (QuantScheme) – Quantization scheme. Supported values are +QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced.

  • +
  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters.

  • +
  • default_output_bw (int) – Default bitwidth (4-31) to use for quantizing layer inputs and outputs.

  • +
  • config_file (Optional[str]) – Path to configuration file for model quantizers.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
+
+ +
+

+
+
+
+

Run specific utility

+

We can avoid running all the utilities that QuantAnalyzer offers and run only those of interest. For this, we need a QuantizationSimModel object; we then call the desired QuantAnalyzer utility and pass the same object to it.

+
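For illustration, a minimal sketch of calling individual utilities is shown below; it assumes a QuantizationSimModel named sim and the quant_analyzer object constructed as in the code examples later on this page:

# Sensitivity of the model to weight-only and activation-only quantization
fp32_acc, weight_quantized_acc, act_quantized_acc = \
    quant_analyzer.check_model_sensitivity_to_quantization(sim)

# Per-layer sensitivity by enabling quant wrappers one at a time
layer_wise_eval_score_dict = \
    quant_analyzer.perform_per_layer_analysis_by_enabling_quant_wrappers(sim, results_dir="./quant_analyzer_results/")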
+
+QuantAnalyzer.check_model_sensitivity_to_quantization(sim)[source]
+

Perform the sensitivity analysis to weight and activation quantization +individually.

+
+
Parameters
+

sim (QuantizationSimModel) – Quantsim model.

+
+
Return type
+

Tuple[float, float, float]

+
+
Returns
+

FP32 eval score, weight-quantized eval score, act-quantized eval score.

+
+
+
+ +
+

+
+
+
+QuantAnalyzer.perform_per_layer_analysis_by_enabling_quant_wrappers(sim, results_dir)[source]
+

NOTE: Option 1

+
    +
  1. All quant wrappers’ parameters and activations quantizers are disabled.

  2. +
  3. +
    Based on occurrence for every quant wrappers
      +
    • Each quant wrapper’s parameters and activations quantizers are enabled as per JSON config file and set to bit-width specified.

    • +
    • Measure and record eval score on subset of dataset.

    • +
    • Disable enabled quantizers in step 1.

    • +
    +
    +
    +
  4. +
  5. Returns dictionary containing quant wrapper name and corresponding eval score.

  6. +
+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – Quantsim model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
Return type
+

Dict

+
+
Returns
+

layer wise eval score dictionary. dict[layer_name] = eval_score

+
+
+
+ +
+

+
+
+
+QuantAnalyzer.perform_per_layer_analysis_by_disabling_quant_wrappers(sim, results_dir)[source]
+

NOTE: Option 2

+
    +
  1. All quant wrappers’ parameters and activations quantizers are enabled as per JSON config file and set to bit-width specified.

  2. +
  3. +
    Based on occurrence for every quant wrappers
      +
    • Each quant wrapper’s parameters and activations quantizers are disabled.

    • +
    • Measure and record eval score on subset of dataset.

    • +
    • Enable disabled quantizers in step 1.

    • +
    +
    +
    +
  4. +
  5. Returns dictionary containing quant wrapper name and corresponding eval score.

  6. +
+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – Quantsim model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
Return type
+

Dict

+
+
Returns
+

layer wise eval score dictionary. dict[layer_name] = eval_score

+
+
+
+ +
+

+
+
+
+QuantAnalyzer.export_per_layer_encoding_min_max_range(sim, results_dir)[source]
+

Export the encoding min and max range for all weights and activations. After invoking this API, results_dir should contain html files in the following format:

+
+
-results_dir
+

-activations.html +-weights.html

+
+
+

If per channel quantization (PCQ) is enabled, then:

+
+
-results_dir
+

-activations.html +-{wrapped_module_name}_{param_name}.html

+
+
+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – Quantsim model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
Return type
+

Tuple[Dict, Dict]

+
+
Returns
+

layer wise min-max range for weights and activations.

+
+
+
+ +
+

+
+
+
+QuantAnalyzer.export_per_layer_stats_histogram(sim, results_dir)[source]
+

NOTE: Invoke this only when the quantization scheme is TF-Enhanced.

+

Export a histogram that represents the PDF of statistics collected by a quantizer for every quant wrapper. After invoking this API, results_dir should contain html files in the following format for every quantizer of the quant wrappers.

+
+
-results_dir
+
+
-activations_pdf
+

name_{input/output}_{index}.html

+
+
-weights_pdf
+
+
-name
+

param_name_{channel_index}.html

+
+
+
+
+
+
+
+
Parameters
+
    +
  • sim (QuantizationSimModel) – Quantsim model.

  • +
  • results_dir (str) – Directory to save the results.

  • +
+
+
+
+ +
+

+
+
+
+QuantAnalyzer.export_per_layer_mse_loss(sim, results_dir)[source]
+

NOTE: The same model input data needs to be passed through both the fp32 model and the quantsim model to tap the output activations of each layer.

+

Export MSE loss between fp32 and quantized output activations for each layer. Parameters: sim (QuantizationSimModel) – Quantsim model; results_dir (str) – Directory to save the results. Returns: layer wise MSE loss. dict[layer_name] = MSE loss.

+
+
Return type
+

Dict

+
+
+
+ +
+

+
+
+
+

Code Examples

+

Required imports

+
from typing import Any
+import torch
+from torchvision import models
+from aimet_common.defs import QuantScheme
+from aimet_torch.model_preparer import prepare_model
+from aimet_torch.quant_analyzer import QuantAnalyzer, CallbackFunc
+
+
+

Prepare forward pass callback

+
# NOTE: In the actual use cases, the users should implement this part to serve
+#       their own goals if necessary.
+def forward_pass_callback(model: torch.nn.Module, _: Any = None) -> None:
+    """
+    NOTE: This is intended to be the user-defined model calibration function.
+    AIMET requires the above signature. So if the user's calibration function does not
+    match this signature, please create a simple wrapper around this callback function.
+
+    A callback function for model calibration that simply runs forward passes on the model to
+    compute encoding (delta/offset). This callback function should use representative data and should
+    be subset of entire train/validation dataset (~1000 images/samples).
+
+    :param model: PyTorch model.
+    :param _: Argument(s) of this callback function. Up to the user to determine the type of this parameter.
+    E.g. could be simply an integer representing the number of data samples to use. Or could be a tuple of
+    parameters or an object representing something more complex.
+    """
+    # User action required
+    # User should create data loader/iterable using representative dataset and simply run
+    # forward passes on the model.
+
+
+

Prepare eval callback

+
# NOTE: In the actual use cases, the users should implement this part to serve
+#       their own goals if necessary.
+def eval_callback(model: torch.nn.Module, _: Any = None) -> float:
+    """
+    NOTE: This is intended to be the user-defined model evaluation function.
+    AIMET requires the above signature. So if the user's calibration function does not
+    match this signature, please create a simple wrapper around this callback function.
+
+    A callback function for model evaluation that determines model performance. This callback function is
+    expected to return scalar value representing the model performance evaluated against entire
+    test/evaluation dataset.
+
+    :param model: PyTorch model.
+    :param _: Argument(s) of this callback function. Up to the user to determine the type of this parameter.
+    E.g. could be simply an integer representing the number of data samples to use. Or could be a tuple of
+    parameters or an object representing something more complex.
+    :return: Scalar value representing the model performance.
+    """
+    # User action required
+    # User should create data loader/iterable using entire test/evaluation dataset, perform forward passes on
+    # the model and return single scalar value representing the model performance.
+    return .8
+
+
+

Prepare model and callback functions

+
    model = models.resnet18(pretrained=True).cuda().eval()
+    input_shape = (1, 3, 224, 224)
+    dummy_input = torch.randn(*input_shape).cuda()
+    prepared_model = prepare_model(model)
+
+    # User action required
+    # User should pass actual argument(s) of the callback functions.
+    forward_pass_callback_fn = CallbackFunc(forward_pass_callback, func_callback_args=None)
+    eval_callback_fn = CallbackFunc(eval_callback, func_callback_args=None)
+
+
+

Create QuantAnalyzer object

+
    quant_analyzer = QuantAnalyzer(model=prepared_model,
+                                   dummy_input=dummy_input,
+                                   forward_pass_callback=forward_pass_callback_fn,
+                                   eval_callback=eval_callback_fn)
+
+    # User action required
+    # User should use an unlabeled dataloader, so if the dataloader yields labels as well, the user should discard them.
+    unlabeled_data_loader = _get_unlabled_data_loader()
+    # Approximately 256 images/samples are recommended for MSE loss analysis. So, if the dataloader
+    # has batch_size of 64, then 4 number of batches leads to 256 images/samples.
+    quant_analyzer.enable_per_layer_mse_loss(unlabeled_dataset_iterable=unlabeled_data_loader, num_batches=4)
+
+
+

Run QuantAnalyzer

+
    quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
+                           default_param_bw=8,
+                           default_output_bw=8,
+                           config_file=None,
+                           results_dir="./quant_analyzer_results/")
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_quantization.html b/releases/1.32.2/api_docs/torch_quantization.html new file mode 100644 index 00000000..6d0ea9e9 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_quantization.html @@ -0,0 +1,1162 @@ + + + + + + AIMET PyTorch Quantization APIs — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET PyTorch Quantization APIs

+

In order to make full use of AIMET Quantization features, there are several guidelines users are encouraged to follow when defining PyTorch models. AIMET provides APIs which can automate some of the model definition changes and check whether AIMET Quantization features can be applied to a PyTorch model.

+
+
+
+
Users should first invoke Model Preparer API before using any of the AIMET Quantization features.
+
+
AIMET Quantization for PyTorch Models provides the following functionality.
+
+
If a user wants to use Multi-GPU with CLE or QAT, they can refer to:
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_quantsim.html b/releases/1.32.2/api_docs/torch_quantsim.html new file mode 100644 index 00000000..7a41adad --- /dev/null +++ b/releases/1.32.2/api_docs/torch_quantsim.html @@ -0,0 +1,1494 @@ + + + + + + AIMET PyTorch Quantization SIM API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET PyTorch Quantization SIM API

+ + +
+

Guidelines

+

AIMET Quantization Sim requires the PyTorch model definition to follow certain guidelines. These guidelines are described in detail here: Model Guidelines

+

AIMET provides the Model Preparer API to allow users to prepare a PyTorch model for AIMET Quantization features. The API and usage examples are described in detail here: Model Preparer API

+

AIMET also includes a Model Validator utility to allow users to check their model definition. Please see the API and usage examples for this utility here: Model Validator API

+
+
+

Top-level API

+
+
+class aimet_torch.quantsim.QuantizationSimModel(model, dummy_input, quant_scheme=QuantScheme.post_training_tf_enhanced, rounding_mode='nearest', default_output_bw=8, default_param_bw=8, in_place=False, config_file=None, default_data_type=QuantizationDataType.int)[source]
+

Implements mechanism to add quantization simulations ops to a model. This allows for off-target simulation of +inference accuracy. Also allows the model to be fine-tuned to counter the effects of quantization.

+

Constructor for QuantizationSimModel.

+
+
Parameters
+
    +
  • model (Module) – Model to add simulation ops to

  • +
  • dummy_input (Union[Tensor, Tuple]) – Dummy input to the model. Used to parse model graph. If the model has more than one input, +pass a tuple. User is expected to place the tensors on the appropriate device.

  • +
  • quant_scheme (Union[str, QuantScheme]) – Quantization scheme. The Quantization scheme is used to compute the Quantization encodings. +There are multiple schemes available. Please refer the QuantScheme enum definition.

  • +
  • rounding_mode (str) – Rounding mode. Supported options are ‘nearest’ or ‘stochastic’

  • +
  • default_output_bw (int) – Default bitwidth (4-31) to use for quantizing all layer inputs and outputs

  • +
  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing all layer parameters

  • +
  • in_place (bool) – If True, then the given ‘model’ is modified in-place to add quant-sim nodes. +Only suggested use of this option is when the user wants to avoid creating a copy of the model

  • +
  • config_file (Optional[str]) – Path to Configuration file for model quantizers

  • +
  • default_data_type (QuantizationDataType) – Default data type to use for quantizing all layer inputs, outputs and parameters. +Possible options are QuantizationDataType.int and QuantizationDataType.float. +Note that the mode default_data_type=QuantizationDataType.float is only supported with +default_output_bw=16 or 32 and default_param_bw=16 or 32.

  • +
+
+
+
+ +
+

+
+

The following API can be used to Compute Encodings for Model

+
+
+QuantizationSimModel.compute_encodings(forward_pass_callback, forward_pass_callback_args)[source]
+

Computes encodings for all quantization sim nodes in the model. It is also used to find initial encodings for +Range Learning

+
+
Parameters
+
    +
  • forward_pass_callback – A callback function that simply runs forward passes on the model. This callback +function should use representative data for the forward pass, so the calculated encodings work for all +data samples. This callback internally chooses the number of data samples it wants to use for calculating +encodings.

  • +
  • forward_pass_callback_args – These argument(s) are passed to the forward_pass_callback as-is. Up to +the user to determine the type of this parameter. E.g. could be simply an integer representing the number +of data samples to use. Or could be a tuple of parameters or an object representing something more complex. +If set to None, forward_pass_callback will be invoked with no parameters.

  • +
+
+
Returns
+

None

+
+
+
+ +
+

+
+

The following APIs can be used to save and restore the quantized model

+
+
+quantsim.save_checkpoint(file_path)
+

This API provides a way for the user to save a checkpoint of the quantized model which can +be loaded at a later point to continue fine-tuning e.g. +See also load_checkpoint()

+
+
Parameters
+
    +
  • quant_sim_model (QuantizationSimModel) – QuantizationSimModel to save checkpoint for

  • +
  • file_path (str) – Path to the file where you want to save the checkpoint

  • +
+
+
Returns
+

None

+
+
+
+ +
+

+
+
+
+quantsim.load_checkpoint()
+

Load the quantized model

+
+
Parameters
+

file_path (str) – Path to the file where you want to save the checkpoint

+
+
Return type
+

QuantizationSimModel

+
+
Returns
+

A new instance of the QuantizationSimModel created after loading the checkpoint

+
+
+
+ +
+

+
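For illustration, a minimal usage sketch is shown below; quant_sim is assumed to be the QuantizationSimModel created earlier, and the checkpoint path is only illustrative:

from aimet_torch.quantsim import save_checkpoint, load_checkpoint

# Save a checkpoint of the quantized model during fine-tuning
save_checkpoint(quant_sim, './checkpoints/quantsim_checkpoint.pth')

# Later, restore a new QuantizationSimModel instance from the checkpoint
quant_sim = load_checkpoint('./checkpoints/quantsim_checkpoint.pth')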
+

The following API can be used to Export the Model to target

+
+
+QuantizationSimModel.export(path, filename_prefix, dummy_input, onnx_export_args=None, propagate_encodings=False, export_to_torchscript=False, use_embedded_encodings=False, export_model=True, filename_prefix_encodings=None)[source]
+

This method exports the quant-sim model so it is ready to be run on-target.

+

Specifically, the following are saved:

+
    +
  1. The sim-model is exported to a regular PyTorch model without any simulation ops

  2. +
  3. The quantization encodings are exported to a separate JSON-formatted file that can +then be imported by the on-target runtime (if desired)

  4. +
  5. Optionally, an equivalent model in ONNX format is exported. In addition, nodes in the ONNX model are named the same as the corresponding PyTorch modules. This helps with matching ONNX nodes to their quant encodings from #2.

  6. +
+
+
Parameters
+
    +
  • path (str) – path where to store model pth and encodings

  • +
  • filename_prefix (str) – Prefix to use for filenames of the model pth and encodings files

  • +
  • dummy_input (Union[Tensor, Tuple]) – Dummy input to the model. Used to parse model graph. It is required for the dummy_input to +be placed on CPU.

  • +
  • onnx_export_args (Union[OnnxExportApiArgs, Dict, None]) – Optional export argument with onnx specific overrides provided as a dictionary or +OnnxExportApiArgs object. If not provided, defaults to “opset_version” = None, “input_names” = None, +“output_names” = None, and for torch version < 1.10.0, “enable_onnx_checker” = False.

  • +
  • propagate_encodings (bool) – If True, encoding entries for intermediate ops (when one PyTorch ops results in +multiple ONNX nodes) are filled with the same BW and data_type as the output tensor for that series of +ops. Defaults to False.

  • +
  • export_to_torchscript (bool) – If True, export to torchscript. Export to onnx otherwise. Defaults to False.

  • +
  • use_embedded_encodings (bool) – If True, another onnx model embedded with fakequant nodes will be exported

  • +
  • export_model (bool) – If True, then ONNX model is exported. When False, only encodings are exported. User should +disable (False) this flag only if the corresponding ONNX model already exists in the path +specified

  • +
  • filename_prefix_encodings (Optional[str]) – File name prefix to be used when saving encodings. +If None, then user defaults to filename_prefix value

  • +
+
+
+
+ +
+

+
+

Encoding format is described in the Quantization Encoding Specification

+
+

+
+
+
+

Enum Definition

+

Quant Scheme Enum

+
+
+class aimet_common.defs.QuantScheme(value)[source]
+

Enumeration of Quant schemes

+
+
+post_training_percentile = 6
+

For a Tensor, adjusted minimum and maximum values are selected based on the percentile value passed. +The Quantization encodings are calculated using the adjusted minimum and maximum value.

+
+ +
+
+post_training_tf = 1
+

For a Tensor, the absolute minimum and maximum value of the Tensor are used to compute the Quantization +encodings.

+
+ +
+
+post_training_tf_enhanced = 2
+

For a Tensor, searches and selects the optimal minimum and maximum value that minimizes the Quantization Noise. +The Quantization encodings are calculated using the selected minimum and maximum value.

+
+ +
+
+training_range_learning_with_tf_enhanced_init = 4
+

For a Tensor, the encoding values are initialized with the post_training_tf_enhanced scheme. Then, the encodings +are learned during training.

+
+ +
+
+training_range_learning_with_tf_init = 3
+

For a Tensor, the encoding values are initialized with the post_training_tf scheme. Then, the encodings are +learned during training.

+
+ +
+ +
+

+
+
+
+

Code Example - Quantization Aware Training (QAT)

+

This example shows how to use AIMET to perform QAT (Quantization-aware training). QAT is an AIMET feature that adds quantization simulation ops (sometimes also called fake quantization ops) to a trained ML model and uses a standard training pipeline to fine-tune or train the model for a few epochs. The resulting model should show improved accuracy on quantized ML accelerators.

+

In this scheme, simply referred to as QAT, quantization parameters like per-tensor scale/offsets for activations are computed once. During fine-tuning, the model weights are updated to minimize the effects of quantization in the forward pass, keeping the quantization parameters constant.

+

Required imports

+

+import torch
+import torch.cuda
+
+
+
+

Load the PyTorch Model

+

For this example, we are going to load a pretrained ResNet18 model from torchvision. Similarly, you can load any +pretrained PyTorch model instead.

+
    from torchvision.models import resnet18
+
+    model = resnet18(pretrained=True)
+    model = model.cuda()
+
+
+
+

Prepare the model for Quantization simulation

+

AIMET quantization simulation requires the user’s model definition to follow certain guidelines. For example, +functionals defined in forward pass should be changed to equivalent torch.nn.Module. AIMET user guide lists all these +guidelines. The following ModelPreparer API uses new graph transformation feature available in PyTorch 1.9+ version and +automates model definition changes required to comply with the above guidelines.

+

For more details, please refer: Model Preparer API:

+
    from aimet_torch.model_preparer import prepare_model
+    prepared_model = prepare_model(model)
+
+
+
+

Create the Quantization Simulation Model

+

Now we use AIMET to create a QuantizationSimModel. This basically means that AIMET will insert fake quantization ops in +the model graph and will configure them. A few of the parameters are explained here

+
    from aimet_common.defs import QuantScheme
+    from aimet_torch.quantsim import QuantizationSimModel
+    input_shape = (1, 3, 224, 224)
+    dummy_input = torch.randn(input_shape).cuda()
+
+    quant_sim = QuantizationSimModel(prepared_model, dummy_input=dummy_input,
+                                     quant_scheme=QuantScheme.post_training_tf_enhanced,
+                                     default_param_bw=8, default_output_bw=8,
+                                     config_file='../../TrainingExtensions/common/src/python/aimet_common/quantsim_config/'
+                                                 'default_config.json')
+
+
+
+

An example User created function that is called back from compute_encodings()

+

Even though AIMET has added ‘quantizer’ nodes to the model graph, the model is not ready to be used yet. Before we can +use the sim model for inference or training, we need to find appropriate scale/offset quantization parameters for each +‘quantizer’ node. For activation quantization nodes, we need to pass unlabeled data samples through the model to collect +range statistics which will then let AIMET calculate appropriate scale/offset quantization parameters. This process is +sometimes referred to as calibration. AIMET simply refers to it as ‘computing encodings’.

+

So we create a routine to pass unlabeled data samples through the model. This should be fairly simple: use the existing train or validation data loader to extract some samples and pass them to the model. We don’t need to compute any loss metric, so we can just ignore the model output for this purpose. A few pointers regarding the data samples:

+

In practice, we need a very small percentage of the overall data samples for computing encodings. For example, +the training dataset for ImageNet has 1M samples. For computing encodings we only need 500 or 1000 samples.

+

It may be beneficial if the samples used for computing encodings are well distributed. It is not necessary that every class be covered, since we are only looking at the range of values at each layer's activations. However, we definitely want to avoid extreme scenarios, such as using only 'dark' or 'light' samples - e.g. using only pictures captured at night might not give ideal results.

+
def pass_calibration_data(sim_model, forward_pass_args=None):
+    """
+    The User of the QuantizationSimModel API is expected to write this function based on their data set.
+    This is not a working function and is provided only as a guideline.
+
+    :param sim_model: Quantization simulation model whose encodings are being computed
+    :param forward_pass_args: other arguments for the forward pass (unused in this example)
+    :return:
+    """
+
+    # User action required
+    # The following line of code is an example of how to use the ImageNet data's validation data loader.
+    # Replace the following line with your own dataset's validation data loader.
+    data_loader = ImageNetDataPipeline.get_val_dataloader()
+
+    # User action required
+    # For computing the activation encodings, around 1000 unlabelled data samples are required.
+    # Edit the following 2 lines based on your batch size.
+    # batch_size * max_batch_counter should be 1024
+    batch_size = 64
+    max_batch_counter = 16
+
+    sim_model.eval()
+
+    current_batch_counter = 0
+    with torch.no_grad():
+        for input_data, target_data in data_loader:
+
+            inputs_batch = input_data  # labels are ignored
+            sim_model(inputs_batch)
+
+            current_batch_counter += 1
+            if current_batch_counter == max_batch_counter:
+                break
+
+
+

Compute the Quantization Encodings

+

Now we call AIMET to use the above routine to pass data through the model and then compute the quantization encodings. Encodings here refer to scale/offset quantization parameters.

+
    quant_sim.compute_encodings(pass_calibration_data, forward_pass_callback_args=None)
+
+
+
+
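At this point the sim model can optionally be evaluated to record a pre-finetuning quantized accuracy baseline. A minimal sketch, reusing the example pipeline's evaluate() helper that appears later in this notebook (replace it with your own evaluation routine):

    # User action required: substitute your own evaluation function here
    pre_finetune_accuracy = ImageNetDataPipeline.evaluate(quant_sim.model, use_cuda=True)
    print(pre_finetune_accuracy)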

Finetune the Quantization Simulation Model

+

To perform quantization aware training (QAT), we simply train the model for a few more epochs (typically 15-20). As with +any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning +rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a +factor of 10 every 5 epochs or so.

+

For the purpose of this example, we are going to train only for 1 epoch. But feel free to change these parameters as you +see fit.

+
    # User action required
+    # The following line of code illustrates that the model is getting finetuned.
+    # Replace the following finetune() function with your pipeline's finetune() function.
+    ImageNetDataPipeline.finetune(quant_sim.model, epochs=1, learning_rate=5e-7, learning_rate_schedule=[5, 10],
+                                  use_cuda=use_cuda)
+
+    # Determine simulated accuracy
+    accuracy = ImageNetDataPipeline.evaluate(quant_sim.model, use_cuda)
+    print(accuracy)
+
+
+
+
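For reference, here is a minimal sketch of what such a fine-tuning loop could look like if you are not using a prepackaged pipeline. It assumes a standard PyTorch train_loader yielding (image, label) batches and follows the learning-rate guidance above; it is illustrative only and not an AIMET-specific API:

import torch
from torch import nn, optim

def finetune_quantsim(sim_model, train_loader, epochs=1, lr=5e-7, use_cuda=True):
    # Train the quantization-sim model directly; gradients flow through the fake-quant ops
    device = torch.device('cuda' if use_cuda and torch.cuda.is_available() else 'cpu')
    sim_model = sim_model.to(device).train()
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(sim_model.parameters(), lr=lr, momentum=0.9)
    # Drop the learning rate by a factor of 10 every 5 epochs, as suggested above
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)
    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(sim_model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()

# Example usage with a hypothetical data loader:
# finetune_quantsim(quant_sim.model, train_loader, epochs=1, lr=5e-7)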

Export the model

+

So we have an improved model after QAT. The next step would be to take this model to target. For this purpose, we need to export the model with the updated weights and without the fake quant ops. We also need to export the encodings (scale/offset quantization parameters) that were updated during training, since we employed QAT. The AIMET QuantizationSimModel provides an export API for this purpose.

+
    # Export the model which saves pytorch model without any simulation nodes and saves encodings file for both
+    # activations and parameters in JSON format
+    quant_sim.export(path='./', filename_prefix='quantized_resnet18', dummy_input=dummy_input.cpu())
+
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_visualization_compression.html b/releases/1.32.2/api_docs/torch_visualization_compression.html new file mode 100644 index 00000000..b19cc153 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_visualization_compression.html @@ -0,0 +1,1239 @@ + + + + + + AIMET Visualization Compression API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Visualization Compression API

+
+

Top-level API Compression

+
+
+class aimet_torch.visualize_serialized_data.VisualizeCompression(visualization_url)[source]
+

Updates the Bokeh server session document and publishes graphs/tables to the server under the session id 'compression'.
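Note that VisualizeCompression expects a running Bokeh server: in the code example further below, the visualization_url it consumes is obtained from aimet_common.utils.start_bokeh_server_session(), which also returns the server process that should be terminated once visualization is complete.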

+
+ +
+

+
+
+
+VisualizeCompression.display_eval_scores(saved_eval_scores_dict_path)[source]
+

Publishes the evaluation scores table to the server.

+
+
Parameters
+

saved_eval_scores_dict_path – file path to the evaluation scores for each layer

+
+
Returns
+

None

+
+
+
+ +
+

+
+
+
+VisualizeCompression.display_comp_ratio_plot(comp_ratio_list_path)[source]
+

Publishes the optimal compression ratios to the server.

+
+
Parameters
+

comp_ratio_list_path – Path to the pkl file with compression ratios for each layer

+
+
Returns
+

None

+
+
+
+ +
+
+

Code Examples

+

Required imports

+
from decimal import Decimal
+import torch
+from torchvision import models
+import aimet_common.defs
+import aimet_torch.defs
+import aimet_torch.utils
+from aimet_common.utils import start_bokeh_server_session
+from aimet_torch.compress import ModelCompressor
+from aimet_torch.visualize_serialized_data import VisualizeCompression
+
+
+
+

Model Compression with Visualization Parameter

+
def model_compression_with_visualization(eval_func):
+    """
+    Code example for compressing a model with a visualization url provided.
+    """
+    process = None
+    try:
+        visualization_url, process = start_bokeh_server_session()
+
+        input_shape = (1, 3, 224, 224)
+        model = models.resnet18(pretrained=True).to(torch.device('cuda'))
+
+        modules_to_ignore = [model.conv1]
+
+        greedy_params = aimet_common.defs.GreedySelectionParameters(target_comp_ratio=Decimal(0.65),
+                                                                    num_comp_ratio_candidates=10,
+                                                                    saved_eval_scores_dict=
+                                                                    '../data/resnet18_eval_scores.pkl')
+
+        auto_params = aimet_torch.defs.SpatialSvdParameters.AutoModeParams(greedy_params,
+                                                                           modules_to_ignore=modules_to_ignore)
+
+        params = aimet_torch.defs.SpatialSvdParameters(aimet_torch.defs.SpatialSvdParameters.Mode.auto, auto_params,
+                                                       multiplicity=8)
+
+        # If no visualization URL is provided, no visualizations will be published during model compression.
+        ModelCompressor.compress_model(model=model, eval_callback=eval_func, eval_iterations=5,
+                                       input_shape=input_shape,
+                                       compress_scheme=aimet_common.defs.CompressionScheme.spatial_svd,
+                                       cost_metric=aimet_common.defs.CostMetric.mac, parameters=params,
+                                       visualization_url=None)
+
+        comp_ratios_file_path = './data/greedy_selection_comp_ratios_list.pkl'
+        eval_scores_path = '../data/resnet18_eval_scores.pkl'
+
+        # A user can visualize the eval scores dictionary and optimal compression ratios by executing the following code.
+        compression_visualizations = VisualizeCompression(visualization_url)
+        compression_visualizations.display_eval_scores(eval_scores_path)
+        compression_visualizations.display_comp_ratio_plot(comp_ratios_file_path)
+    finally:
+        if process:
+            process.terminate()
+            process.join()
+
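Note that the example above passes visualization_url=None to compress_model, so nothing is published while compression itself runs; to also publish visualizations during compression, pass the visualization_url returned by start_bokeh_server_session() instead.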
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/api_docs/torch_visualization_quantization.html b/releases/1.32.2/api_docs/torch_visualization_quantization.html new file mode 100644 index 00000000..8faaf339 --- /dev/null +++ b/releases/1.32.2/api_docs/torch_visualization_quantization.html @@ -0,0 +1,1287 @@ + + + + + + AIMET Visualization for Quantization API — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Visualization for Quantization API

+
+

Top-level API Quantization

+
+
+aimet_torch.visualize_model.visualize_relative_weight_ranges_to_identify_problematic_layers(model, results_dir, selected_layers=None)[source]
+

For each of the selected layers, publishes a line plot showing weight ranges for each layer, summary statistics +for relative weight ranges, and a histogram showing weight ranges of output channels +with respect to the minimum weight range.

+
+
Parameters
+
    +
  • model (Module) – pytorch model

  • +
  • results_dir (str) – Directory to save the Bokeh plots

  • +
  • selected_layers (Optional[List]) – a list of layers a user can choose to have visualized. If selected layers is None, +all Linear and Conv layers will be visualized.

  • +
+
+
Return type
+

List[figure]

+
+
Returns
+

A list of bokeh plots

+
+
+
+ +
+

+
+
+
+aimet_torch.visualize_model.visualize_weight_ranges(model, results_dir, selected_layers=None)[source]
+

Visualizes weight ranges for each layer through a scatter plot showing mean plotted against the standard deviation, +the minimum plotted against the max, and a line plot with min, max, and mean for each output channel.

+
+
Parameters
+
    +
  • model (Module) – pytorch model

  • +
  • selected_layers (Optional[List]) – a list of layers a user can choose to have visualized. If selected layers is None, +all Linear and Conv layers will be visualized.

  • +
  • results_dir (str) – Directory to save the Bokeh plots

  • +
+
+
Return type
+

List[figure]

+
+
Returns
+

A list of bokeh plots

+
+
+
+ +
+

+
+
+
+aimet_torch.visualize_model.visualize_changes_after_optimization(old_model, new_model, results_dir, selected_layers=None)[source]
+

Visualizes changes before and after some optimization has been applied to a model.

+
+
Parameters
+
    +
  • old_model (Module) – pytorch model before optimization

  • +
  • new_model (Module) – pytorch model after optimization

  • +
  • results_dir (str) – Directory to save the Bokeh plots

  • +
  • selected_layers (Optional[List]) – a list of layers a user can choose to have visualized. If selected layers is None, +all Linear and Conv layers will be visualized.

  • +
+
+
Return type
+

List[figure]

+
+
Returns
+

A list of bokeh plots

+
+
+
+ +
+

+
+
+
+

Code Examples

+

Required imports

+
import copy
+import torch
+
+from torchvision import models
+
+from aimet_torch.cross_layer_equalization import equalize_model
+
+from aimet_torch import batch_norm_fold
+from aimet_torch import visualize_model
+
+
+

Comparing Model After Optimization

+
def visualize_changes_in_model_after_and_before_cle():
+    """
+    Code example for visualizing the model before and after Cross Layer Equalization optimization
+    """
+    model = models.resnet18(pretrained=True).to(torch.device('cpu'))
+    model = model.eval()
+    # Create a copy of the model to visualize the before and after optimization changes
+    model_copy = copy.deepcopy(model)
+
+    # Specify a folder in which the plots will be saved
+    results_dir = './visualization'
+
+    batch_norm_fold.fold_all_batch_norms(model_copy, (1, 3, 224, 224))
+
+    equalize_model(model, (1, 3, 224, 224))
+    visualize_model.visualize_changes_after_optimization(model_copy, model, results_dir)
+
+
+

Visualizing weight ranges in Model

+
def visualize_weight_ranges_model():
+    """
+    Code example for model visualization
+    """
+    model = models.resnet18(pretrained=True).to(torch.device('cpu'))
+    model = model.eval()
+
+    # Specify a folder in which the plots will be saved
+    results_dir = './visualization'
+
+    batch_norm_fold.fold_all_batch_norms(model, (1, 3, 224, 224))
+
+    # It is usually observed that folding BatchNorm layers increases a layer's weight range.
+    # Visualizing the weight ranges helps to see this effect.
+    visualize_model.visualize_weight_ranges(model, results_dir)
+
+
+

Visualizing Relative weight ranges in Model

+
def visualize_relative_weight_ranges_model():
+    """
+    Code example for model visualization
+    """
+    model = models.resnet18(pretrained=True).to(torch.device('cpu'))
+    model = model.eval()
+
+    # Specify a folder in which the plots will be saved
+    results_dir = './visualization'
+
+    batch_norm_fold.fold_all_batch_norms(model, (1, 3, 224, 224))
+
+    # It is usually observed that folding BatchNorm layers increases a layer's weight range.
+    # This helps in finding layers which can be equalized for better performance on hardware.
+    visualize_model.visualize_relative_weight_ranges_to_identify_problematic_layers(model, results_dir)
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/genindex.html b/releases/1.32.2/genindex.html new file mode 100644 index 00000000..5a824451 --- /dev/null +++ b/releases/1.32.2/genindex.html @@ -0,0 +1,1705 @@ + + + + + + Index — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + +
  • +
  • +
+
+
+
+
+ + +

Index

+ +
+ + + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/install/index.html b/releases/1.32.2/install/index.html new file mode 100644 index 00000000..b15616e1 --- /dev/null +++ b/releases/1.32.2/install/index.html @@ -0,0 +1,1206 @@ + + + + + + AIMET Installation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Installation

+
+

Quick Install

+

The AIMET PyTorch GPU PyPI packages are available for environments that meet the following requirements:

+
    +
  • 64-bit Intel x86-compatible processor

  • +
  • Linux Ubuntu 22.04 LTS [Python 3.10] or Ubuntu 20.04 LTS [Python 3.8]

  • +
  • Cuda 12.0

  • +
  • Torch 2.2.2

  • +
+

Pip install:

+
apt-get install liblapacke
+python3 -m pip install aimet-torch
+
+
+
+
+

Release Packages

+

For other AIMET variants, install the latest version from the .whl files hosted at https://github.com/quic/aimet/releases.

+

PyTorch

+
# Pytorch 1.13 with CUDA 11.x
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_torch-torch_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
+# Pytorch 1.13 CPU only
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_torch-torch_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
+

TensorFlow

+
# Tensorflow 2.10 GPU with CUDA 11.x
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_tensorflow-tf_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
+# Tensorflow 2.10 CPU only
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_tensorflow-tf_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
+

Onnx

+
# ONNX 1.14 GPU
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_onnx-onnx_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
+# ONNX 1.14 CPU
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_onnx-onnx_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
+

For previous AIMET releases, browse packages at https://github.com/quic/aimet/releases. Each release includes multiple python packages of the following format:

+
# VARIANT in {torch_gpu, torch_cpu, tf_gpu, tf_cpu, onnx_gpu, onnx_cpu}
+# PACKAGE_PREFIX in {aimet_torch, aimet_tensorflow, aimet_onnx}
+<PACKAGE_PREFIX>-<VARIANT>_<VERSION>-cp38-cp38-linux_x86_64.whl
+
+
+

System Requirements

+

The AIMET package requires the following host platform setup:

+
    +
  • 64-bit Intel x86-compatible processor

  • +
  • Linux Ubuntu: 22.04 LTS

  • +
  • bash command shell

  • +
  • +
    For GPU variants:
    +
    +
    +
  • +
+

To use the GPU-accelerated training modules, an Nvidia CUDA-enabled GPU with a minimum Nvidia driver version of 455 is required. Using the latest driver is always recommended, especially with newer GPUs. Both CUDA and cuDNN (the CUDA Deep Neural Network library) enabled GPUs are supported.

+
+
+

Advanced Installation Instructions

+
+
There are two ways to set up and install AIMET:
    +
  • On your host machine

  • +
  • Using our pre-built development Docker images

  • +
+
+
+

Please click on the appropriate link for installation instructions:

+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/install/install_docker.html b/releases/1.32.2/install/install_docker.html new file mode 100644 index 00000000..cff69d5d --- /dev/null +++ b/releases/1.32.2/install/install_docker.html @@ -0,0 +1,1276 @@ + + + + + + AIMET Installation in Docker — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Installation in Docker

+

This page provides instructions to install the AIMET package inside a development Docker container.

+
+

Set variant

+
+
Set the <variant_string> to ONE of the following depending on your desired variant
    +
  1. For the PyTorch 2.1 GPU variant, use torch-gpu

  2. +
  3. For the PyTorch 2.1 CPU variant, use torch-cpu

  4. +
  5. For the PyTorch 1.13 GPU variant, use torch-gpu-pt113

  6. +
  7. For the PyTorch 1.13 CPU variant, use torch-cpu-pt113

  8. +
  9. For the TensorFlow GPU variant, use tf-gpu

  10. +
  11. For the TensorFlow CPU variant, use tf-cpu

  12. +
  13. For the ONNX GPU variant, use onnx-gpu

  14. +
  15. For the ONNX CPU variant, use onnx-cpu

  16. +
+
+
+
export AIMET_VARIANT=<variant_string>
+
+
+
+
+

Use prebuilt docker image

+

Follow these instructions to use one of the pre-built docker images:

+
WORKSPACE="<absolute_path_to_workspace>"
+docker_image_name="artifacts.codelinaro.org/codelinaro-aimet/aimet-dev:latest.${AIMET_VARIANT}"
+docker_container_name="aimet-dev-<any_name>"
+
+
+

NOTE: Feel free to modify the docker_container_name as needed.

+
+
+

Build docker image locally

+

Follow these instructions ONLY if you want to build the docker image locally. If not, skip to the next section.

+
WORKSPACE="<absolute_path_to_workspace>"
+docker_image_name="aimet-dev-docker:<any_tag>"
+docker_container_name="aimet-dev-<any_name>"
+docker build -t ${docker_image_name} -f $WORKSPACE/aimet/Jenkins/Dockerfile.${AIMET_VARIANT} .
+
+
+

NOTE: Feel free to modify the docker_image_name and docker_container_name as needed.

+
+
+

Start docker container

+

Ensure that a Docker container named $docker_container_name is not already running; otherwise, remove the existing container and then start a new container as follows:

+
docker ps -a | grep ${docker_container_name} && docker kill ${docker_container_name}
+
+docker run --rm -it -u $(id -u ${USER}):$(id -g ${USER}) \
+-v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro \
+-v ${HOME}:${HOME} -v ${WORKSPACE}:${WORKSPACE} \
+-v "/local/mnt/workspace":"/local/mnt/workspace" \
+--entrypoint /bin/bash -w ${WORKSPACE} --hostname ${docker_container_name} ${docker_image_name}
+
+
+
+
NOTE:
    +
  1. Feel free to modify the above docker run command based on the environment and filesystem on your host machine.

  2. +
  3. If nvidia-docker 2.0 is installed, then add --gpus all to the docker run commands in order to enable GPU access inside the docker container.

  4. +
  5. If nvidia-docker 1.0 is installed, then replace docker run with nvidia-docker run in order to enable GPU access inside the docker container.

  6. +
  7. Port forwarding needs to be done in order to run the Visualization APIs from docker container. This can be achieved by running the docker container as follows:

  8. +
+
+
+
port_id="<any-port-number>"
+
+docker run -p ${port_id}:${port_id} --rm -it -u $(id -u ${USER}):$(id -g ${USER}) \
+-v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro \
+-v ${HOME}:${HOME} -v ${WORKSPACE}:${WORKSPACE} \
+-v "/local/mnt/workspace":"/local/mnt/workspace" \
+--entrypoint /bin/bash -w ${WORKSPACE} --hostname ${docker_container_name} ${docker_image_name}
+
+
+
+
+

Install AIMET packages

+
+

From PyPI

+

The AIMET Torch GPU package can be installed from PyPI as follows:

+

Go to https://pypi.org/project/aimet-torch to identify a version you wish to install

+
+
    +
  • For PyTorch 1.13 GPU you should use aimet-torch==1.31.1

  • +
  • For Pytorch 2.1.2 GPU you should use aimet-torch >= 1.32.0

  • +
+
+
sudo apt-get install liblapacke -y
+pip install aimet-torch
+
+
+
+
+

From Release Package

+

Alternatively, we host .whl packages for each release at https://github.com/quic/aimet/releases. Identify the release tag +of the package you wish to install, then follow the instructions below to install AIMET from the .whl file.

+

Set the <variant_string> to ONE of the following depending on your desired variant

+
    +
  1. For the PyTorch 2.1 GPU variant, use “torch_gpu”

  2. +
  3. For the PyTorch 2.1 CPU variant, use “torch_cpu”

  4. +
  5. For the PyTorch 1.13 GPU variant, use “torch_gpu-pt113”

  6. +
  7. For the PyTorch 1.13 CPU variant, use “torch_cpu-pt113”

  8. +
  9. For the TensorFlow GPU variant, use “tf_gpu”

  10. +
  11. For the TensorFlow CPU variant, use “tf_cpu”

  12. +
  13. For the ONNX GPU variant, use “onnx_gpu”

  14. +
  15. For the ONNX CPU variant, use “onnx_cpu”

  16. +
+
export AIMET_VARIANT=<variant_string>
+
+
+

Replace <release_tag> in the steps below with the appropriate tag:

+
export release_tag=<release_tag>
+
+
+

Set the package download URL as follows:

+
export download_url="https://github.com/quic/aimet/releases/download/${release_tag}"
+
+
+

Set the common suffix for the package files as follows:

+
export wheel_file_suffix="cp310-cp310-linux_x86_64.whl"
+
+
+

Install the AIMET packages in the order specified below:

+
+
NOTE:
    +
  1. Please pre-pend the “apt-get install” and “pip3 install” commands with “sudo -H” as appropriate.

  2. +
  3. These instructions assume that pip packages will be installed in the path: /usr/local/lib/python3.10/dist-packages. If that is not the case, please modify it accordingly.

  4. +
  5. Python dependencies will automatically get installed.

  6. +
+
+
+
# Install ONE of the following depending on the variant
+python3 -m pip install ${download_url}/aimet_torch-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix} -f https://download.pytorch.org/whl/torch_stable.html
+# OR
+python3 -m pip install ${download_url}/aimet_tensorflow-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix}
+# OR
+python3 -m pip install ${download_url}/aimet_onnx-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix}
+
+
+
+
+
+

Environment setup

+

Set the common environment variables as follows:

+
source /usr/local/lib/python3.10/dist-packages/aimet_common/bin/envsetup.sh
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/install/install_host.html b/releases/1.32.2/install/install_host.html new file mode 100644 index 00000000..9c554774 --- /dev/null +++ b/releases/1.32.2/install/install_host.html @@ -0,0 +1,1345 @@ + + + + + + AIMET Installation and Setup — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Installation and Setup

+

This page provides instructions to install the AIMET package on Ubuntu 22.04 LTS with an Nvidia GPU. Please follow the instructions in the order provided, unless specified otherwise.

+
+
NOTE:
    +
  1. Please pre-pend the “apt-get install” and “pip3 install” commands with “sudo -H” as appropriate.

  2. +
  3. These instructions assume that pip packages will be installed in the path: /usr/local/lib/python3.10/dist-packages. If that is not the case, please modify it accordingly.

  4. +
+
+
+
+

Install prerequisite packages

+

Install the basic pre-requisite packages as follows:

+
apt-get update
+apt-get install python3.10 python3.10-dev python3-pip
+python3 -m pip install --upgrade pip
+apt-get install --assume-yes wget gnupg2
+
+
+

If you have multiple python versions installed, set the default python version as follows:

+
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
+update-alternatives --set python3 /usr/bin/python3.10
+
+
+
+
+

Install GPU packages

+

NOTE:

+
    +
  1. Do this section ONLY for the GPU variants.

  2. +
  3. +
    The released AIMET GPU packages were tested with the following CUDA toolkit versions:
      +
    1. PyTorch 2.1 GPU variant: CUDA Toolkit 11.8.0

    2. +
    3. PyTorch 1.13 GPU variant: CUDA Toolkit 11.7.1

    4. +
    5. TensorFlow GPU variant: CUDA Toolkit 11.8.0

    6. +
    7. ONNX GPU variant: CUDA Toolkit 11.7.1

    8. +
    +
    +
    +
  4. +
  5. The instructions in the sub-sections below correspond to our tested versions above. Visit this page https://developer.nvidia.com/cuda-toolkit-archive to obtain the correct version of the CUDA toolkit for your environment.

  6. +
+
+

Install GPU packages for PyTorch 2.1 or TensorFlow

+

NOTE:

+
    +
  1. Do this section ONLY for the PyTorch 2.1 or TensorFlow GPU variant.

  2. +
  3. Visit this page https://developer.nvidia.com/cuda-11-8-0-download-archive to obtain the exact and up-to-date installation instructions for your environment.

  4. +
+
apt-get update && apt-get install -y gnupg2
+wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
+mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
+wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
+apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
+dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
+cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
+echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 /" > /etc/apt/sources.list.d/cuda.list
+apt-get update
+
+
+
+
+

Install GPU packages for PyTorch 1.13 or ONNX

+

NOTE:

+
    +
  1. Do this section ONLY for the PyTorch 1.13 or ONNX GPU variants.

  2. +
  3. Visit this page https://developer.nvidia.com/cuda-11-7-1-download-archive to obtain the exact and up-to-date installation instructions for your environment.

  4. +
+
apt-get update && apt-get install -y gnupg2
+wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
+mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
+wget https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda-repo-ubuntu2204-11-7-local_11.7.1-515.65.01-1_amd64.deb
+apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
+dpkg -i cuda-repo-ubuntu2204-11-7-local_11.7.1-515.65.01-1_amd64.deb
+cp /var/cuda-repo-ubuntu2204-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
+echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 /" > /etc/apt/sources.list.d/cuda.list
+apt-get update
+
+
+
+
+
+

Install AIMET packages

+
+

From PyPI

+

The AIMET Torch GPU package can be installed from PyPI as follows:

+

Go to https://pypi.org/project/aimet-torch to identify a version you wish to install

+
+
    +
  • For PyTorch 1.13 GPU you should use aimet-torch==1.31.1

  • +
  • For Pytorch 2.1.2 GPU you should use aimet-torch >= 1.32.0

  • +
+
+
sudo apt-get install liblapacke -y
+pip install aimet-torch
+
+
+
+
+

From Release Package

+

Alternatively, we host .whl packages for each release at https://github.com/quic/aimet/releases. Identify the release tag +of the package you wish to install, then follow the instructions below to install AIMET from the .whl file.

+

Set the <variant_string> to ONE of the following depending on your desired variant

+
    +
  1. For the PyTorch 2.1 GPU variant, use “torch_gpu”

  2. +
  3. For the PyTorch 2.1 CPU variant, use “torch_cpu”

  4. +
  5. For the PyTorch 1.13 GPU variant, use “torch_gpu_pt113”

  6. +
  7. For the PyTorch 1.13 CPU variant, use “torch_cpu_pt113”

  8. +
  9. For the TensorFlow GPU variant, use “tf_gpu”

  10. +
  11. For the TensorFlow CPU variant, use “tf_cpu”

  12. +
  13. For the ONNX GPU variant, use “onnx_gpu”

  14. +
  15. For the ONNX CPU variant, use “onnx_cpu”

  16. +
+
export AIMET_VARIANT=<variant_string>
+
+
+

Replace <release_tag> in the steps below with the appropriate tag:

+
export release_tag=<release_tag>
+
+
+

Set the package download URL as follows:

+
export download_url="https://github.com/quic/aimet/releases/download/${release_tag}"
+
+
+

Set the common suffix for the package files as follows:

+

NOTE: Set wheel_file_suffix to cp310-cp310-linux_x86_64.whl OR cp38-cp38-linux_x86_64.whl OR cp36-cp36m-linux_x86_64 OR cp37-cp37m-linux_x86_64 OR py3-none-any as appropriate, depending on the actual wheel filename(s) at https://github.com/quic/aimet/releases.

+
export wheel_file_suffix="cp310-cp310-linux_x86_64.whl"
+
+
+

Install the AIMET packages in the order specified below:

+

NOTE: Python dependencies will automatically get installed.

+
# Install ONE of the following depending on the variant
+python3 -m pip install ${download_url}/aimet_torch-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix} -f https://download.pytorch.org/whl/torch_stable.html
+# OR
+python3 -m pip install ${download_url}/aimet_tensorflow-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix}
+# OR
+python3 -m pip install ${download_url}/aimet_onnx-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix}
+
+
+
+
+
+

Install common debian packages

+

Install the common debian packages as follows:

+
cat /usr/local/lib/python3.10/dist-packages/aimet_common/bin/reqs_deb_common.txt | xargs apt-get --assume-yes install
+
+
+

NOTE: Do the following ONLY for the PyTorch variant packages.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_torch/bin/reqs_deb_torch_common.txt | xargs apt-get --assume-yes install
+
+
+

NOTE: Do the following ONLY for the ONNX variant packages.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_onnx/bin/reqs_deb_onnx_common.txt | xargs apt-get --assume-yes install
+
+
+
+
+

Install tensorflow GPU debian packages

+

NOTE: Do this ONLY for the TensorFlow GPU package.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_tensorflow/bin/reqs_deb_tf_gpu.txt | xargs apt-get --assume-yes install
+
+
+
+
+

Install torch GPU debian packages

+

NOTE: Do this ONLY for the PyTorch GPU package.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_torch/bin/reqs_deb_torch_gpu.txt | xargs apt-get --assume-yes install
+
+
+
+
+

Install ONNX GPU debian packages

+

NOTE: Do this ONLY for the ONNX GPU package.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_onnx/bin/reqs_deb_onnx_gpu.txt | xargs apt-get --assume-yes install
+
+
+
+
+

Replace Pillow with Pillow-SIMD

+

Optional: Replace the Pillow package with Pillow-SIMD as follows:

+
python3 -m pip uninstall -y pillow
+python3 -m pip install --no-cache-dir Pillow-SIMD==9.0.0.post1
+
+
+
+
+

Replace onnxruntime with onnxruntime-gpu

+

NOTE: Do this ONLY for the PyTorch GPU package.

+
export ONNXRUNTIME_VER=$(python3 -c 'import onnxruntime; print(onnxruntime.__version__)')
+python3 -m pip uninstall -y onnxruntime
+python3 -m pip install --no-cache-dir onnxruntime-gpu==$ONNXRUNTIME_VER
+
+
+
+
+

Post installation steps

+
ln -s /usr/lib/x86_64-linux-gnu/libjpeg.so /usr/lib
+
+
+

NOTE: Do the following step ONLY for the PyTorch or Tensorflow GPU packages.

+
# NOTE: Please choose one of the below commands depending on the version of your CUDA toolkit
+ln -s /usr/local/cuda-11.7 /usr/local/cuda
+ln -s /usr/local/cuda-11.8 /usr/local/cuda
+
+
+
+
+

Environment setup

+

Set the common environment variables as follows:

+
source /usr/local/lib/python3.10/dist-packages/aimet_common/bin/envsetup.sh
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/objects.inv b/releases/1.32.2/objects.inv new file mode 100644 index 00000000..76d9550b Binary files /dev/null and b/releases/1.32.2/objects.inv differ diff --git a/releases/1.32.2/search.html b/releases/1.32.2/search.html new file mode 100644 index 00000000..7c26ad84 --- /dev/null +++ b/releases/1.32.2/search.html @@ -0,0 +1,1141 @@ + + + + + + Search — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + +
  • +
  • +
+
+
+
+
+ + + + +
+ +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/releases/1.32.2/searchindex.js b/releases/1.32.2/searchindex.js new file mode 100644 index 00000000..bfd9b4fa --- /dev/null +++ b/releases/1.32.2/searchindex.js @@ -0,0 +1 @@ +Search.setIndex({"docnames": ["Examples/onnx/quantization/adaround", "Examples/onnx/quantization/cle", "Examples/onnx/quantization/quantsim", "Examples/tensorflow/compression/channel_pruning", "Examples/tensorflow/compression/spatial_svd", "Examples/tensorflow/compression/spatial_svd_channel_pruning", "Examples/tensorflow/quantization/adaround", "Examples/tensorflow/quantization/autoquant", "Examples/tensorflow/quantization/bn_reestimation", "Examples/tensorflow/quantization/cle_bc", "Examples/tensorflow/quantization/keras/adaround", "Examples/tensorflow/quantization/keras/autoquant", "Examples/tensorflow/quantization/keras/bn_reestimation", "Examples/tensorflow/quantization/keras/keras_transformer_qat", "Examples/tensorflow/quantization/keras/model_preparer", "Examples/tensorflow/quantization/keras/qat", "Examples/tensorflow/quantization/keras/qat_range_learning", "Examples/tensorflow/quantization/keras/quant_analyzer", "Examples/tensorflow/quantization/keras/quantsim_adaround_pcq", "Examples/tensorflow/quantization/keras/quantsim_cle", "Examples/tensorflow/quantization/qat", "Examples/tensorflow/quantization/qat_range_learning", "Examples/tensorflow/quantization/quant_analyzer", "Examples/torch/compression/channel_pruning", "Examples/torch/compression/spatial_svd", "Examples/torch/compression/spatial_svd_channel_pruning", "Examples/torch/quantization/adaround", "Examples/torch/quantization/autoquant", "Examples/torch/quantization/bn_reestimation", "Examples/torch/quantization/cle_bc", "Examples/torch/quantization/qat", "Examples/torch/quantization/qat_range_learning", "Examples/torch/quantization/quant_analyzer", "api_docs/convert_tf_sess_to_keras", "api_docs/index", "api_docs/keras", "api_docs/keras_adaround", "api_docs/keras_batchnorm_re_estimation", "api_docs/keras_compression", "api_docs/keras_cross_layer_equalization", "api_docs/keras_layer_output_generation", "api_docs/keras_model_guidelines", "api_docs/keras_model_preparer", "api_docs/keras_primitive_apis_cle", "api_docs/keras_quant_analyzer", "api_docs/keras_quantization", "api_docs/keras_quantsim", "api_docs/onnx", "api_docs/onnx_adaround", "api_docs/onnx_auto_quant", "api_docs/onnx_cross_layer_equalization", "api_docs/onnx_layer_output_generation", "api_docs/onnx_quant_analyzer", "api_docs/onnx_quantization", "api_docs/onnx_quantsim", "api_docs/quantization_encoding_specification", "api_docs/tensorflow", "api_docs/tensorflow_adaround", "api_docs/tensorflow_auto_quant", "api_docs/tensorflow_batchnorm_re_estimation", "api_docs/tensorflow_bias_correction", "api_docs/tensorflow_compress", "api_docs/tensorflow_cross_layer_equalization", "api_docs/tensorflow_layer_output_generation", "api_docs/tensorflow_model_guidelines", "api_docs/tensorflow_primitive_apis_cle", "api_docs/tensorflow_quant_analyzer", "api_docs/tensorflow_quantization", "api_docs/tensorflow_quantsim", "api_docs/tensorflow_visualization_quantization", "api_docs/torch", "api_docs/torch_adaround", "api_docs/torch_architecture_checker", "api_docs/torch_auto_quant", "api_docs/torch_batchnorm_re_estimation", "api_docs/torch_bias_correction", "api_docs/torch_compress", "api_docs/torch_cross_layer_equalization", "api_docs/torch_layer_output_generation", "api_docs/torch_model_guidelines", "api_docs/torch_model_preparer", 
"api_docs/torch_model_validator", "api_docs/torch_multi_gpu", "api_docs/torch_peft_lora", "api_docs/torch_primitive_apis_cle", "api_docs/torch_quant_analyzer", "api_docs/torch_quantization", "api_docs/torch_quantsim", "api_docs/torch_visualization_compression", "api_docs/torch_visualization_quantization", "install/index", "install/install_docker", "install/install_host", "toplevelhidden", "user_guide/adaround", "user_guide/auto_quant", "user_guide/bn_reestimation", "user_guide/channel_pruning", "user_guide/compression_feature_guidebook", "user_guide/examples", "user_guide/greedy_compression_ratio_selection", "user_guide/index", "user_guide/known_issues", "user_guide/model_compression", "user_guide/model_guidelines", "user_guide/model_quantization", "user_guide/post_training_quant_techniques", "user_guide/quant_analyzer", "user_guide/quantization_aware_training", "user_guide/quantization_configuration", "user_guide/quantization_feature_guidebook", "user_guide/quantization_sim", "user_guide/release_notes", "user_guide/spatial_svd", "user_guide/visualization_compression", "user_guide/visualization_quant", "user_guide/weight_svd", "user_guide/winnowing"], "filenames": ["Examples/onnx/quantization/adaround.ipynb", "Examples/onnx/quantization/cle.ipynb", "Examples/onnx/quantization/quantsim.ipynb", "Examples/tensorflow/compression/channel_pruning.ipynb", "Examples/tensorflow/compression/spatial_svd.ipynb", "Examples/tensorflow/compression/spatial_svd_channel_pruning.ipynb", "Examples/tensorflow/quantization/adaround.ipynb", "Examples/tensorflow/quantization/autoquant.ipynb", "Examples/tensorflow/quantization/bn_reestimation.ipynb", "Examples/tensorflow/quantization/cle_bc.ipynb", "Examples/tensorflow/quantization/keras/adaround.ipynb", "Examples/tensorflow/quantization/keras/autoquant.ipynb", "Examples/tensorflow/quantization/keras/bn_reestimation.ipynb", "Examples/tensorflow/quantization/keras/keras_transformer_qat.ipynb", "Examples/tensorflow/quantization/keras/model_preparer.ipynb", "Examples/tensorflow/quantization/keras/qat.ipynb", "Examples/tensorflow/quantization/keras/qat_range_learning.ipynb", "Examples/tensorflow/quantization/keras/quant_analyzer.ipynb", "Examples/tensorflow/quantization/keras/quantsim_adaround_pcq.ipynb", "Examples/tensorflow/quantization/keras/quantsim_cle.ipynb", "Examples/tensorflow/quantization/qat.ipynb", "Examples/tensorflow/quantization/qat_range_learning.ipynb", "Examples/tensorflow/quantization/quant_analyzer.ipynb", "Examples/torch/compression/channel_pruning.ipynb", "Examples/torch/compression/spatial_svd.ipynb", "Examples/torch/compression/spatial_svd_channel_pruning.ipynb", "Examples/torch/quantization/adaround.ipynb", "Examples/torch/quantization/autoquant.ipynb", "Examples/torch/quantization/bn_reestimation.ipynb", "Examples/torch/quantization/cle_bc.ipynb", "Examples/torch/quantization/qat.ipynb", "Examples/torch/quantization/qat_range_learning.ipynb", "Examples/torch/quantization/quant_analyzer.ipynb", "api_docs/convert_tf_sess_to_keras.rst", "api_docs/index.rst", "api_docs/keras.rst", "api_docs/keras_adaround.rst", "api_docs/keras_batchnorm_re_estimation.rst", "api_docs/keras_compression.rst", "api_docs/keras_cross_layer_equalization.rst", "api_docs/keras_layer_output_generation.rst", "api_docs/keras_model_guidelines.rst", "api_docs/keras_model_preparer.rst", "api_docs/keras_primitive_apis_cle.rst", "api_docs/keras_quant_analyzer.rst", "api_docs/keras_quantization.rst", "api_docs/keras_quantsim.rst", "api_docs/onnx.rst", "api_docs/onnx_adaround.rst", 
"api_docs/onnx_auto_quant.rst", "api_docs/onnx_cross_layer_equalization.rst", "api_docs/onnx_layer_output_generation.rst", "api_docs/onnx_quant_analyzer.rst", "api_docs/onnx_quantization.rst", "api_docs/onnx_quantsim.rst", "api_docs/quantization_encoding_specification.rst", "api_docs/tensorflow.rst", "api_docs/tensorflow_adaround.rst", "api_docs/tensorflow_auto_quant.rst", "api_docs/tensorflow_batchnorm_re_estimation.rst", "api_docs/tensorflow_bias_correction.rst", "api_docs/tensorflow_compress.rst", "api_docs/tensorflow_cross_layer_equalization.rst", "api_docs/tensorflow_layer_output_generation.rst", "api_docs/tensorflow_model_guidelines.rst", "api_docs/tensorflow_primitive_apis_cle.rst", "api_docs/tensorflow_quant_analyzer.rst", "api_docs/tensorflow_quantization.rst", "api_docs/tensorflow_quantsim.rst", "api_docs/tensorflow_visualization_quantization.rst", "api_docs/torch.rst", "api_docs/torch_adaround.rst", "api_docs/torch_architecture_checker.rst", "api_docs/torch_auto_quant.rst", "api_docs/torch_batchnorm_re_estimation.rst", "api_docs/torch_bias_correction.rst", "api_docs/torch_compress.rst", "api_docs/torch_cross_layer_equalization.rst", "api_docs/torch_layer_output_generation.rst", "api_docs/torch_model_guidelines.rst", "api_docs/torch_model_preparer.rst", "api_docs/torch_model_validator.rst", "api_docs/torch_multi_gpu.rst", "api_docs/torch_peft_lora.rst", "api_docs/torch_primitive_apis_cle.rst", "api_docs/torch_quant_analyzer.rst", "api_docs/torch_quantization.rst", "api_docs/torch_quantsim.rst", "api_docs/torch_visualization_compression.rst", "api_docs/torch_visualization_quantization.rst", "install/index.rst", "install/install_docker.rst", "install/install_host.rst", "toplevelhidden.rst", "user_guide/adaround.rst", "user_guide/auto_quant.rst", "user_guide/bn_reestimation.rst", "user_guide/channel_pruning.rst", "user_guide/compression_feature_guidebook.rst", "user_guide/examples.rst", "user_guide/greedy_compression_ratio_selection.rst", "user_guide/index.rst", "user_guide/known_issues.rst", "user_guide/model_compression.rst", "user_guide/model_guidelines.rst", "user_guide/model_quantization.rst", "user_guide/post_training_quant_techniques.rst", "user_guide/quant_analyzer.rst", "user_guide/quantization_aware_training.rst", "user_guide/quantization_configuration.rst", "user_guide/quantization_feature_guidebook.rst", "user_guide/quantization_sim.rst", "user_guide/release_notes.rst", "user_guide/spatial_svd.rst", "user_guide/visualization_compression.rst", "user_guide/visualization_quant.rst", "user_guide/weight_svd.rst", "user_guide/winnowing.rst"], "titles": ["Adaptive Rounding (AdaRound)", "Cross-Layer Equalization (CLE)", "Quantization Simulation", "Model Compression Using Channel Pruning", "Model compression Using Spatial SVD", "Model Compression Using Spatial SVD Followed by Channel Pruning", "Adaptive Rounding (AdaRound)", "AutoQuant", "Quantization-Aware Training with BatchNorm Re-estimation", "Cross-Layer Equalization (CLE) and Bias Correction (BC)", "Adaptive Rounding (Adaround)", "AutoQuant", "Quantization-Aware Training with BatchNorm Re-estimation", "Quantization-Aware Training with a Keras Transformer Model", "Keras Model Preparer", "Quantization-Aware Training", "Quantization-Aware Training with Range Learning", "Quant Analyzer", "Quantsim and Adaround - Per Channel Quantization (PCQ)", "Cross-Layer Equalization (CLE) with QuantSim", "Quantization-Aware Training", "Quantization-Aware Training with Range Learning", "Quant Analyzer", "Model compression using Channel 
Pruning", "Model compression using Spatial SVD", "Model compression using Spatial SVD followed by Channel Pruning", "Adaptive Rounding (AdaRound)", "AutoQuant", "Quantization-Aware Training with BatchNorm Re-estimation", "Cross-Layer Equalization (CLE) and Bias Correction (BC)", "Quantization-Aware Training", "Quantization-Aware Training with Range Learning", "Quant Analyzer", "Using AIMET Tensorflow APIs with Keras Models", "Welcome to AI Model Efficiency Toolkit API Docs!", "AIMET Keras APIs", "AIMET Keras AdaRound API", "AIMET Keras BatchNorm Re-estimation APIs", "AIMET Keras Compression API", "AIMET Keras Cross Layer Equalization APIs", "AIMET Keras Layer Output Generation API", "Keras Model Guidelines", "Model Preparer API", "AIMET Keras Cross Layer Equalization Primitive API", "AIMET Keras Quant Analyzer API", "AIMET Keras Quantization APIs", "AIMET Keras Quantization SIM API", "AIMET ONNX APIs", "AIMET ONNX AdaRound API", "AIMET ONNX AutoQuant API", "AIMET ONNX Cross Layer Equalization APIs", "AIMET ONNX Layer Output Generation API", "AIMET ONNX Quant Analyzer API", "AIMET ONNX Quantization APIs", "AIMET ONNX Quantization SIM API", "Encoding Format Specification", "AIMET TensorFlow APIs", "AIMET TensorFlow AdaRound API", "AIMET TensorFlow AutoQuant API", "AIMET TensorFlow BatchNorm Re-estimation APIs", "AIMET TensorFlow Bias Correction API", "AIMET TensorFlow Compression API", "AIMET TensorFlow Cross Layer Equalization APIs", "AIMET Tensorflow Layer Output Generation API", "TensorFlow Model Guidelines", "AIMET TensorFlow Cross Layer Equalization Primitive API", "AIMET Tensorflow Quant Analyzer API", "AIMET TensorFlow Quantization APIs", "AIMET TensorFlow Quantization SIM API", "AIMET Visualization for Quantization for TensorFlow API", "AIMET PyTorch APIs", "AIMET PyTorch AdaRound API", "Architecture Checker API", "AIMET PyTorch AutoQuant API", "AIMET PyTorch BatchNorm Re-estimation APIs", "AIMET PyTorch Bias Correction API", "AIMET PyTorch Compression API", "AIMET PyTorch Cross Layer Equalization APIs", "AIMET PyTorch Layer Output Generation API", "PyTorch Model Guidelines", "Model Preparer API", "Model Validator Utility", "PyTorch Multi-GPU support", "Top-level API", "AIMET PyTorch Cross Layer Equalization Primitive API", "AIMET PyTorch Quant Analyzer API", "AIMET PyTorch Quantization APIs", "AIMET PyTorch Quantization SIM API", "AIMET Visualization Compression API", "AIMET Visualization for Quantization API", "AIMET Installation", "AIMET Installation in Docker", "AIMET Installation and Setup", "<no title>", "AIMET AdaRound", "AIMET AutoQuant", "AIMET BN Re-estimation", "AIMET Channel Pruning", "AIMET Compression Features Guidebook", "AIMET Examples", "AIMET Greedy Compression Ratio Selection", "AI Model Efficiency Toolkit User Guide", "AIMET Known Issues", "AIMET Model Compression", "Model Guidelines for PyTorch", "AIMET Model Quantization", "AIMET Post-Training Quantization Techniques", "AIMET QuantAnalyzer", "AIMET Quantization Aware Training", "Quantization Simulation Configuration", "AIMET Quantization Features Guidebook", "AIMET Quantization Simulation", "AIMET Release Notes", "AIMET Spatial SVD", "AIMET Visualization", "AIMET Visualization for Quantization", "AIMET Weight SVD", "AIMET Winnowing"], "terms": {"show": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 36, 37, 39, 48, 57, 58, 59, 62, 68, 69, 71, 72, 73, 74, 77, 81, 85, 87, 89, 101, 106, 110], "work": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 
13, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 34, 46, 48, 54, 59, 66, 68, 71, 81, 82, 87, 96, 99, 103, 104, 106, 109], "code": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 94], "how": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 36, 37, 39, 48, 54, 55, 57, 58, 59, 62, 68, 71, 72, 73, 74, 75, 77, 81, 85, 87, 99, 103, 106, 107, 110, 111], "us": [0, 1, 2, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 26, 27, 28, 29, 30, 31, 32, 34, 36, 37, 38, 39, 40, 41, 42, 44, 45, 46, 48, 49, 50, 51, 52, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 66, 68, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 90, 92, 96, 97, 98, 99, 100, 101, 104, 106, 107, 108, 109, 110, 111, 112, 115], "aimet": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 34, 41, 42, 55, 64, 72, 79, 80, 81, 82, 83, 101, 104, 109], "perform": [0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 13, 17, 18, 19, 22, 23, 24, 25, 26, 29, 32, 33, 36, 37, 39, 44, 48, 50, 52, 55, 57, 58, 59, 60, 61, 62, 64, 66, 68, 71, 72, 74, 75, 76, 77, 82, 83, 84, 85, 86, 87, 89, 95, 96, 97, 98, 100, 103, 105, 106, 107, 108, 110], "featur": [0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 20, 21, 22, 26, 27, 28, 29, 30, 31, 32, 41, 42, 43, 45, 50, 55, 64, 65, 71, 77, 79, 80, 81, 82, 84, 86, 87, 94, 95, 96, 99, 103, 106, 107, 111, 112, 114, 115], "typic": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 18, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 42, 58, 71, 87, 98, 105, 107, 108, 109, 111, 114], "nearest": [0, 1, 6, 9, 10, 13, 15, 16, 18, 19, 26, 29, 44, 46, 54, 58, 60, 66, 68, 73, 75, 87, 94], "techniqu": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29, 30, 31, 36, 38, 39, 45, 48, 49, 50, 53, 57, 58, 60, 61, 62, 67, 71, 73, 75, 76, 77, 85, 86, 94, 95, 97, 98, 101, 105, 107, 108, 110, 111, 112, 113, 116], "achiev": [0, 3, 4, 5, 6, 7, 8, 10, 16, 18, 23, 24, 25, 26, 27, 38, 61, 76, 91, 94, 98, 100, 113, 116], "when": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 38, 41, 42, 44, 45, 46, 50, 55, 61, 64, 66, 76, 77, 79, 80, 85, 86, 87, 94, 101, 103, 105, 106, 107, 108, 109, 110, 111, 114, 115, 117], "weight": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 29, 30, 31, 32, 33, 36, 38, 42, 43, 44, 45, 46, 52, 53, 55, 57, 58, 59, 60, 62, 65, 66, 67, 68, 71, 75, 81, 83, 84, 85, 86, 87, 89, 94, 96, 98, 103, 105, 106, 107, 108, 109, 110, 111, 115], "valu": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 36, 38, 43, 44, 52, 55, 57, 58, 61, 65, 66, 71, 75, 76, 78, 80, 83, 84, 85, 87, 94, 100, 103, 105, 106, 107, 108, 111, 113, 115, 116], "integ": [0, 1, 2, 6, 8, 9, 10, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 26, 28, 29, 30, 31, 32, 36, 44, 46, 52, 55, 57, 66, 68, 85, 87, 94, 105, 107], "optim": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32, 36, 38, 44, 45, 46, 49, 53, 54, 57, 61, 67, 71, 72, 73, 76, 86, 87, 88, 89, 94, 95, 101, 103, 105, 108, 111, 112, 114], "loss": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 18, 19, 20, 21, 26, 29, 30, 31, 36, 44, 46, 52, 
[3, 4, 5, 6, 7, 9, 20, 21, 22, 23, 24, 25, 58], "flattend": [3, 4, 5, 23, 24, 25], "2d": [3, 4, 5, 18, 23, 24, 25], "matrix": [3, 4, 5, 18, 23, 24, 25, 97], "singular": [3, 4, 5, 23, 24, 25, 61, 113, 116], "discard": [3, 4, 5, 23, 24, 25, 52, 66, 85], "least": [3, 4, 5, 23, 24, 25, 94, 97], "signific": [3, 4, 5, 23, 24, 25, 110], "diagon": [3, 4, 5, 23, 24, 25], "matric": [3, 4, 5, 23, 24, 25], "combin": [3, 4, 5, 7, 11, 23, 24, 25, 27, 75, 80, 95, 98, 103, 105, 106], "back": [3, 4, 5, 23, 24, 25, 33, 71, 82, 87, 109], "separ": [3, 4, 5, 20, 21, 23, 24, 25, 30, 31, 42, 43, 46, 57, 68, 71, 80, 81, 84, 87, 96, 107, 110, 112], "magnitud": [3, 4, 5, 23, 24, 25, 97], "feed": [3, 4, 5, 13, 14, 23, 24, 25, 42, 111], "dimens": [3, 4, 5, 23, 24, 25, 103, 110, 113, 116], "reconstruct": [3, 4, 5, 23, 24, 25, 36, 57, 61, 71], "minim": [3, 4, 5, 15, 16, 20, 21, 23, 24, 25, 30, 31, 36, 57, 61, 68, 71, 87, 101, 103, 105, 111], "distanc": [3, 4, 5, 23, 24, 25], "both": [3, 4, 5, 16, 19, 20, 21, 23, 24, 25, 30, 31, 42, 55, 71, 80, 85, 87, 90, 105, 106, 108, 109, 110, 111, 113, 117], "structur": [3, 4, 5, 17, 23, 24, 25, 32, 55, 60, 80, 103], "mac": [3, 4, 5, 23, 24, 25, 33, 38, 61, 76, 88, 98, 103, 113, 116], "memori": [3, 4, 5, 6, 7, 9, 20, 21, 22, 23, 24, 25, 38, 61, 76, 98, 103, 113, 116, 117], "either": [3, 4, 5, 6, 10, 18, 19, 23, 24, 25, 26, 34, 38, 43, 60, 61, 71, 76, 79, 83, 101, 111], "epoch": [3, 4, 5, 6, 8, 9, 12, 13, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 33, 46, 59, 61, 74, 76, 87, 101, 103, 105, 108], "close": [3, 4, 5, 23, 24, 25, 57, 60, 62, 65, 68, 69, 97, 98, 111], "folder": [3, 4, 5, 6, 7, 8, 9, 17, 20, 21, 22, 23, 24, 25, 30, 31, 32, 64, 89, 107], "pipelin": [3, 13, 40, 51, 59, 63, 68, 73, 74, 78, 87, 105, 108, 110, 111], "num_comp_ratio_candid": [3, 4, 5, 23, 24, 25, 38, 61, 76, 88], "num_eval_iter": [3, 4, 5, 23, 24, 25], "convert": [3, 4, 5, 6, 7, 8, 9, 19, 20, 21, 22, 33, 41, 42, 43, 65, 80, 95, 105, 115], "tfrecord": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 26, 61, 66, 68], "contain": [3, 4, 5, 6, 7, 8, 9, 17, 20, 21, 22, 23, 24, 25, 32, 43, 55, 61, 63, 65, 72, 76, 80, 81, 85, 90, 105, 107, 108, 109, 111], "start": [3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 20, 21, 22, 23, 24, 25, 28, 30, 31, 32, 33, 36, 38, 42, 57, 58, 59, 61, 62, 63, 65, 66, 68, 71, 76, 80, 81, 87, 94, 99, 100, 103, 109, 111], "label": [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16, 18, 20, 21, 22, 27, 38, 44, 52, 58, 66, 68, 71, 73, 85, 87, 107, 108], "tfrecords_dir": [3, 4, 5, 6, 8, 9, 20, 21, 22], "dir": [3, 4, 5, 7, 9, 10, 11, 12, 16, 17, 18, 19, 20, 21, 92], "disabl": [3, 4, 5, 6, 7, 9, 20, 21, 23, 25, 33, 44, 66, 83, 85, 87, 100, 103, 107, 109, 111], "log": [3, 4, 5, 6, 7, 8, 9, 12, 13, 15, 16, 20, 21, 22, 36, 57, 81, 107], "info": [3, 4, 5, 6, 7, 9, 20, 21, 22, 43, 60, 65, 72, 75, 81, 84, 112], "level": [3, 4, 5, 6, 7, 9, 13, 20, 21, 22, 39, 60, 62, 77, 96, 98, 100, 101, 105, 110, 114], "eager": [3, 4, 5, 6, 7, 9, 20, 21, 22], "verbos": [3, 4, 5, 6, 7, 9, 20, 21, 22], "displai": [3, 4, 5, 6, 7, 9, 13, 20, 21, 22, 99, 107, 114, 115], "erorr": [3, 4, 5, 9, 20, 21], "tensorflow": [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 32, 34, 36, 38, 39, 40, 41, 43, 44, 46, 90, 91, 94, 95, 96, 99, 101, 102, 106, 107, 109, 111, 112], "messag": [3, 4, 5, 6, 7, 9, 20, 21, 22, 61], "error": [3, 4, 5, 6, 7, 8, 9, 11, 20, 21, 22, 42, 61, 68, 73, 80, 95, 105, 108, 110, 111], "critic": [3, 4, 5, 6, 7, 9, 20, 21, 22], "tf_cpp_min_log_level": [3, 4, 5, 6, 7, 8, 9, 11, 12, 15, 16, 20, 21, 22], "todo": 
[3, 4, 5, 9, 20, 21], "compat": [3, 4, 5, 6, 7, 8, 9, 12, 13, 20, 21, 22, 33, 55, 57, 58, 59, 60, 61, 62, 63, 65, 66, 68, 69, 79, 90], "v1": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 33, 57, 58, 59, 60, 61, 62, 63, 65, 66, 68, 69], "abhijit": [3, 4], "disable_eager_execut": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 58], "set_verbos": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22], "type": [3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 17, 20, 21, 22, 23, 24, 25, 27, 30, 31, 33, 36, 37, 38, 39, 43, 44, 46, 52, 57, 58, 59, 61, 62, 64, 65, 66, 68, 71, 72, 73, 74, 75, 76, 80, 83, 84, 85, 87, 89, 105, 107, 109, 111, 114], "list": [3, 4, 5, 6, 8, 9, 12, 14, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32, 33, 37, 38, 39, 40, 42, 43, 57, 58, 59, 60, 61, 62, 63, 65, 66, 68, 71, 74, 75, 76, 77, 78, 80, 84, 85, 87, 89, 92, 100, 102, 104, 109], "image_net_train": [3, 4, 5, 6, 8, 9, 20, 21, 23, 24, 25, 28, 29, 30, 31], "imagenettrain": [3, 4, 5, 6, 8, 9, 20, 21, 23, 24, 25, 28, 29, 30, 31], "format_bgr": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 66], "int": [3, 4, 5, 6, 7, 8, 9, 11, 12, 17, 20, 21, 22, 23, 24, 25, 27, 36, 37, 38, 44, 46, 49, 55, 57, 58, 59, 60, 61, 66, 68, 71, 73, 74, 75, 76, 83, 85, 87], "bool": [3, 4, 5, 23, 24, 25, 26, 28, 29, 30, 31, 32, 38, 43, 46, 60, 61, 64, 65, 66, 68, 73, 75, 76, 80, 83, 84, 87], "maximum": [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 17, 18, 19, 20, 21, 22, 32, 36, 57, 58, 71, 87], "training_input": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 66], "keras_learning_phas": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 66], "data_input": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 66, 68], "input_1": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 33, 60, 61, 62, 65, 66], "validation_input": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 66], "update_ops_nam": [3, 4, 5, 6, 8, 9, 20, 21, 59], "str": [3, 4, 5, 6, 8, 9, 14, 20, 21, 33, 38, 40, 44, 46, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 68, 71, 73, 75, 78, 79, 80, 83, 85, 87, 89], "learning_r": [3, 4, 5, 6, 8, 9, 20, 21, 23, 24, 25, 28, 30, 31, 59, 74, 87], "decay_step": [3, 4, 5, 6, 8, 9, 20, 21, 59], "mostli": [3, 4, 5, 6, 8, 9, 20, 21, 25], "move": [3, 4, 5, 6, 8, 9, 20, 21, 82], "averag": [3, 4, 5, 6, 8, 9, 20, 21], "graphkei": [3, 4, 5, 6, 8, 9, 20, 21, 22], "update_op": [3, 4, 5, 6, 8, 9, 20, 21, 22], "alwai": [3, 4, 5, 6, 8, 9, 20, 21, 61, 90, 100], "dure": [3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 28, 30, 31, 32, 36, 37, 38, 42, 57, 61, 65, 68, 71, 76, 87, 88, 94, 101, 103, 105, 108, 109, 111, 114, 115], "rate": [3, 4, 5, 6, 8, 9, 12, 13, 14, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 42, 87, 103, 108], "adjust": [3, 4, 5, 6, 7, 8, 9, 20, 21, 29, 36, 57, 58, 71, 72, 87, 96, 97, 98, 105, 106, 110], "decai": [3, 4, 5, 6, 8, 9, 20, 21, 103], "trainer": [3, 4, 5, 6, 8, 9, 20, 21, 23, 24, 25, 28, 30, 31, 38, 61, 76, 99], "num_epoch": [3, 4, 5, 6, 8, 9, 20, 21], "resnet50": [3, 4, 5, 6, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 33, 38, 39, 43, 44, 46, 58, 59, 60, 62, 65, 69], "kera": [3, 4, 5, 6, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 34, 42, 56, 58, 59, 60, 61, 62, 64, 65, 69, 94, 96, 99, 101, 105, 106, 107, 109, 111, 112], "covert": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22], "clear_sess": [3, 4, 5, 6, 7, 8, 9, 12, 20, 21, 22, 33, 36, 59, 60, 62, 65, 69], "releas": [3, 4, 5, 6, 7, 9, 20, 21, 22, 99, 104], "global": [3, 4, 5, 6, 7, 9, 20, 21, 22, 110], "clutter": [3, 4, 5, 6, 7, 9, 20, 21, 22], "old": [3, 4, 5, 6, 7, 9, 20, 21, 22], "By": [3, 4, 5, 6, 8, 9, 12, 20, 21, 22, 26, 28, 29, 30, 31, 37, 38, 42, 58, 61, 76, 103, 109, 111], "train_op": [3, 4, 5, 6, 9, 20, 21, 22], "fold": [3, 4, 5, 7, 
22, 37, 39, 43, 50, 58, 59, 62, 65, 67, 72, 73, 74, 77, 84, 86, 89, 94, 95, 96, 105, 106, 107, 112], "applic": [3, 4, 5, 6, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 33, 38, 39, 43, 44, 46, 55, 58, 59, 60, 61, 62, 65, 69, 84, 100, 104], "backend": [3, 4, 5, 6, 7, 8, 9, 12, 20, 21, 22, 33, 36, 58, 59, 60, 62, 65, 69], "allow": [3, 4, 5, 6, 7, 9, 11, 13, 20, 21, 22, 27, 38, 40, 42, 45, 46, 51, 53, 55, 58, 61, 63, 67, 76, 78, 79, 80, 86, 87, 95, 101, 103, 105, 107, 108, 109, 110, 111, 112, 114], "easili": [3, 4, 5, 6, 7, 9, 20, 21, 22, 61, 76], "read": [3, 4, 5, 6, 7, 9, 20, 21, 22, 107], "eventu": [3, 4, 5, 6, 7, 9, 20, 21, 22], "aimet_tensorflow": [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 33, 36, 37, 38, 39, 40, 42, 43, 44, 46, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 68, 69, 90, 91, 92], "update_keras_bn_ops_trainable_flag": [3, 4, 5, 6, 7, 9, 20, 21, 22, 58, 64], "load_save_path": [3, 4, 5, 6, 7, 9, 20, 21, 22, 58, 64], "trainabl": [3, 4, 5, 6, 9, 16, 20, 21, 22, 31, 64, 68, 105], "add_image_net_computational_nodes_in_graph": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 59], "an": [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 36, 37, 38, 39, 41, 42, 44, 46, 52, 54, 55, 57, 58, 59, 60, 61, 62, 66, 68, 71, 73, 74, 75, 76, 77, 79, 80, 81, 85, 87, 90, 94, 95, 97, 100, 101, 103, 104, 105, 107, 108, 109, 110, 111, 115, 117], "softmax": [3, 4, 5, 6, 7, 9, 13, 14, 20, 21, 22, 33, 42, 57, 60, 61, 62, 65], "add_computational_nodes_in_graph": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 59], "get_sess": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 33, 58, 59, 60, 62, 65, 69], "creat": [3, 4, 5, 7, 11, 13, 17, 22, 32, 33, 36, 38, 42, 43, 44, 46, 49, 51, 52, 57, 58, 59, 61, 64, 65, 66, 68, 71, 73, 74, 76, 80, 82, 83, 84, 85, 87, 89, 94, 96, 103, 104, 105, 108, 111], "within": [3, 4, 5, 6, 7, 9, 13, 21, 33, 98, 107, 111], "images_class": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 59], "identifi": [3, 4, 5, 6, 7, 9, 20, 21, 22, 75, 81, 91, 92, 99, 107, 110, 112, 117], "input_op_nam": [3, 4, 5, 6, 7, 8, 9, 20, 21, 33, 38, 58, 59, 60, 61, 62, 65], "output_op_nam": [3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 33, 38, 57, 58, 59, 60, 61, 62, 63, 65, 66, 68], "starting_op_nam": [3, 4, 5, 6, 7, 9, 20, 21, 22, 57, 58, 59, 63, 68], "append": [3, 4, 5, 8, 43, 65, 76], "test": [3, 4, 5, 6, 7, 9, 12, 13, 14, 20, 21, 22, 44, 52, 59, 66, 68, 72, 74, 85, 92], "is_gpu_avail": [3, 4, 5, 6, 7, 9, 20, 21, 22], "cuda_onli": [3, 4, 5, 6, 7, 9, 20, 21, 22], "": [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32, 36, 38, 39, 41, 42, 43, 44, 46, 49, 52, 54, 55, 57, 59, 60, 61, 62, 64, 65, 66, 68, 71, 74, 75, 76, 80, 84, 85, 87, 89, 92, 98, 102, 103, 105, 106, 107, 108, 110, 111, 114, 115, 117], "determin": [3, 4, 5, 8, 13, 23, 24, 25, 27, 36, 38, 44, 46, 52, 55, 57, 61, 62, 65, 66, 68, 71, 76, 85, 87, 95, 98, 103, 105, 106, 107], "fp32": [3, 4, 5, 12, 13, 22, 23, 24, 25, 32, 40, 44, 51, 63, 66, 78, 85, 94, 101, 106, 107, 108, 110, 111], "defin": [3, 4, 5, 9, 14, 15, 16, 17, 18, 19, 20, 22, 23, 24, 25, 26, 29, 30, 31, 32, 33, 36, 41, 42, 44, 45, 49, 52, 55, 57, 58, 61, 64, 66, 71, 73, 76, 79, 80, 81, 85, 86, 87, 104, 105, 107, 109, 111], "target_comp_ratio": [3, 4, 5, 23, 24, 25, 38, 61, 76, 88], "desir": [3, 4, 5, 23, 24, 25, 38, 46, 52, 61, 68, 76, 85, 87, 91, 92, 98, 103, 105, 110], "compess": [3, 4, 5], "ratio": [3, 4, 5, 23, 24, 25, 38, 61, 76, 88, 97, 98, 114], "denot": [3, 4, 5, 23, 24, 25, 38, 95], "20": 
[3, 4, 5, 7, 8, 11, 12, 13, 14, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 36, 38, 42, 55, 57, 61, 71, 87, 90, 94, 108], "80": [3, 4, 5, 38], "pre": [3, 4, 5, 15, 16, 40, 51, 63, 78, 90, 91, 92, 99, 101, 106], "9": [3, 4, 5, 23, 25, 26, 29, 30, 31, 32, 38, 61, 71, 80, 87, 92, 110], "10": [3, 4, 5, 7, 8, 10, 11, 12, 13, 14, 15, 16, 18, 19, 20, 21, 23, 24, 25, 28, 30, 31, 36, 38, 41, 42, 43, 46, 57, 60, 61, 68, 74, 76, 80, 81, 87, 88, 90, 91, 92, 100, 103, 108], "part": [3, 4, 5, 23, 24, 25, 42, 44, 49, 52, 58, 66, 73, 82, 85, 103, 105, 106, 107], "variou": [3, 4, 5, 7, 11, 17, 22, 23, 24, 25, 27, 32, 38, 61, 76, 95, 98, 103, 105, 110, 111, 112, 115], "measur": [3, 4, 5, 23, 24, 25, 27, 38, 61, 76, 85], "tri": [3, 4, 5, 23, 24, 25, 98, 105], "33": [3, 4, 5, 23, 24, 25], "66": [3, 4, 5, 23, 24, 25, 98], "00": [3, 4, 5, 23, 24, 25], "higher": [3, 4, 5, 23, 24, 25, 61, 71, 76, 96, 100, 108, 110], "candid": [3, 4, 5, 23, 24, 25, 38, 39, 60, 61, 62, 76, 100, 103], "granular": [3, 4, 5, 13, 23, 24, 25, 38, 61, 76, 103, 110, 111, 115], "time": [3, 4, 5, 7, 11, 17, 22, 23, 24, 25, 27, 32, 38, 43, 55, 61, 65, 76, 80, 81, 95, 103, 104, 108, 114], "taken": [3, 4, 5, 23, 24, 25, 42, 117], "complet": [3, 4, 5, 7, 11, 22, 23, 24, 25, 27, 38, 96, 110], "modules_to_ignor": [3, 4, 5, 23, 24, 25, 38, 61, 76, 85, 88, 102], "interact": [3, 4, 5, 23, 24, 25], "too": [3, 4, 5, 23, 24, 25], "choss": [3, 4, 5, 23, 24, 25], "auto": [3, 4, 5, 23, 24, 25, 34, 38, 39, 43, 50, 55, 61, 62, 65, 76, 77, 84, 88], "analysi": [3, 4, 5, 13, 23, 24, 25, 27, 38, 44, 52, 61, 66, 76, 85, 103, 110], "much": [3, 4, 5, 7, 11, 16, 21, 23, 24, 25, 31, 117], "altern": [3, 4, 5, 23, 24, 25, 91, 92, 103], "manual": [3, 4, 5, 7, 11, 23, 24, 25, 33, 38, 55, 61, 76, 84, 95, 103], "retriev": [3, 4, 5, 23, 25], "those": [3, 4, 5, 14, 16, 21, 23, 24, 25, 31, 52, 85, 103], "num_reconstruction_sampl": [3, 4, 5, 23, 25, 61, 76], "last": [3, 4, 5, 100, 102, 110], "stage": [3, 4, 5, 95], "map": [3, 4, 5, 7, 10, 11, 12, 14, 18, 22, 44, 46, 55, 58, 65, 81, 83, 107, 109], "linear": [3, 4, 5, 37, 43, 57, 59, 60, 65, 71, 74, 80, 81, 83, 84, 89, 96, 97], "regress": [3, 4, 5, 97], "attempt": [3, 4, 5, 97, 105, 106], "done": [3, 4, 5, 8, 14, 18, 19, 20, 21, 23, 24, 25, 28, 30, 31, 41, 91, 97, 103, 109, 111, 117], "random": [3, 4, 5, 14, 27, 33, 36, 42, 44, 46, 49, 52, 54, 57, 58, 61, 66, 73, 97, 107], "ridicul": [3, 4, 5, 23, 25], "enabl": [3, 4, 5, 16, 18, 21, 23, 25, 31, 34, 38, 44, 59, 61, 66, 68, 74, 76, 83, 85, 90, 91, 96, 101, 105, 107, 109, 111, 112], "allow_custom_downsample_op": [3, 4, 5, 23, 25, 61, 76], "flag": [3, 4, 5, 23, 25, 43, 60, 64, 65, 73, 80, 87], "downsampl": [3, 4, 5, 23, 25], "consid": [3, 4, 5, 13, 23, 25, 65, 72, 94, 100, 105, 110], "bandwidth": [3, 4, 5, 23, 25, 98], "overhead": [3, 4, 5, 23, 25], "trade": [3, 4, 5, 23, 25, 36, 57, 71], "off": [3, 4, 5, 23, 25, 36, 46, 57, 68, 71, 87, 105, 106, 109], "suggest": [3, 4, 5, 23, 25, 46, 87, 100, 103, 106], "eval_callback": [3, 4, 5, 7, 11, 12, 17, 22, 23, 24, 25, 27, 32, 33, 38, 44, 49, 52, 58, 61, 66, 73, 76, 85, 88], "function_nam": [3, 4, 5, 23, 24, 25], "eval_iter": [3, 4, 5, 23, 24, 25, 33, 38, 61, 76, 88], "batch": [3, 4, 5, 7, 8, 11, 12, 22, 23, 24, 25, 28, 36, 37, 38, 40, 43, 44, 52, 54, 57, 58, 59, 60, 61, 63, 65, 66, 68, 71, 72, 74, 78, 84, 85, 87, 94, 96, 105, 106, 107], "choos": [3, 4, 5, 23, 24, 25, 66, 68, 87, 89, 97, 98, 103], "enough": [3, 4, 5, 23, 24, 25, 72], "trust": [3, 4, 5, 23, 24, 25], "callback": [3, 4, 5, 13, 15, 16, 17, 22, 23, 24, 25, 32, 38, 44, 46, 49, 52, 
58, 61, 66, 68, 73, 76, 85, 87, 107, 111], "invoc": [3, 4, 5, 23, 24, 25], "compress_schem": [3, 4, 5, 23, 24, 25, 33, 38, 61, 76, 88], "cost_metr": [3, 4, 5, 23, 24, 25, 33, 38, 61, 76, 88], "actual": [3, 4, 5, 8, 9, 12, 14, 20, 21, 28, 30, 31, 40, 44, 49, 51, 52, 58, 63, 66, 71, 73, 78, 82, 85, 87, 92, 98, 105], "greedi": [3, 4, 5, 103, 114], "select": [3, 4, 5, 36, 57, 60, 71, 87, 89, 95, 98, 107, 111, 114, 117], "among": [3, 4, 5, 71], "reach": [3, 4, 5, 7, 11, 27, 95, 98], "previou": [3, 4, 5, 9, 29, 38, 43, 61, 76, 84, 90, 98, 100, 110], "rule": [3, 4, 5, 46, 66, 68, 109], "thumb": [3, 4, 5], "found": [3, 4, 5, 14, 33, 43, 84, 108, 111], "compressionschem": [3, 4, 5, 23, 24, 25, 33, 38, 61, 76, 88], "costmetr": [3, 4, 5, 23, 24, 25, 33, 38, 61, 76, 88], "greedyselectionparamet": [3, 4, 5, 23, 24, 25, 38, 61, 76, 88], "channelpruningparamet": [3, 5, 23, 25, 61, 76], "decim": [3, 4, 5, 23, 24, 25, 38, 61, 76, 88], "greedy_param": [3, 4, 5, 23, 24, 25, 38, 61, 76, 88], "get_operation_by_nam": [3, 4, 5, 6, 8, 9, 20, 21, 22, 33, 60, 61, 65, 68, 69], "conv1_conv": [3, 4, 5, 69], "auto_param": [3, 4, 5, 23, 24, 25, 38, 61, 76, 88], "automodeparam": [3, 4, 5, 23, 24, 25, 38, 61, 76, 88], "greedy_select_param": [3, 4, 5, 23, 24, 25, 38, 61, 76], "data_set": [3, 5, 6, 9, 10, 18, 36, 57, 60, 61], "channel_prun": [3, 5, 23, 25, 38, 61, 76], "modelcompressor": [3, 4, 5, 23, 24, 25, 33, 38, 61, 76, 88], "compress_model": [3, 4, 5, 23, 24, 25, 33, 38, 61, 76, 88, 114], "relev": [3, 4, 5, 23, 24, 25], "our": [3, 4, 5, 8, 9, 12, 13, 15, 16, 19, 20, 23, 24, 25, 28, 37, 52, 85, 90, 92, 100, 110, 111], "new": [3, 5, 6, 9, 10, 14, 15, 16, 18, 19, 20, 21, 26, 29, 30, 31, 32, 41, 42, 43, 55, 62, 65, 71, 79, 80, 83, 87, 91, 105, 109, 112], "final": [3, 5, 8, 12, 13, 15, 16, 17, 22, 28, 32, 38, 42, 61, 76, 81, 88, 97, 98, 100, 108, 110, 114], "compressed_sess": [3, 4, 33], "comp_stat": [3, 4, 23, 24], "working_dir": [3, 4, 5, 33, 61], "fell": [3, 4, 5, 23, 24, 25], "sharpli": [3, 4, 5, 23, 24, 25], "15": [3, 4, 5, 8, 12, 13, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 87, 103, 108], "job": [3, 4, 5, 8, 12, 13, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 87], "hyper": [3, 4, 5, 8, 9, 12, 13, 20, 21, 23, 24, 25, 28, 30, 31, 87, 94, 108], "search": [3, 4, 5, 8, 12, 13, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 34, 36, 57, 71, 87, 100, 108, 109], "good": [3, 4, 5, 8, 12, 13, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 42, 87, 94], "end": [3, 4, 5, 8, 12, 13, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 33, 36, 37, 39, 57, 58, 59, 62, 68, 71, 73, 74, 77, 80, 81, 85, 87, 103], "drop": [3, 4, 5, 7, 8, 11, 12, 13, 15, 16, 20, 21, 23, 24, 25, 27, 28, 30, 31, 58, 61, 87, 95, 98, 103, 106, 107, 108, 110, 111], "factor": [3, 4, 5, 8, 12, 13, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 43, 65, 84, 87, 98, 103, 106], "feel": [3, 4, 5, 8, 12, 13, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 87, 91], "fit": [3, 4, 5, 8, 12, 13, 15, 16, 20, 21, 23, 24, 25, 28, 30, 31, 33, 38, 46, 61, 76, 87, 100], "reduced_": [3, 4, 5], "accordingli": [3, 4, 5, 58, 91, 92], "compr_graph_all_ops_nam": [3, 4, 5], "get_oper": [3, 4, 5], "update_ops_name_after_cp": [3, 4, 5], "op_nam": [3, 4, 5], "1e": [3, 4, 5, 13, 14, 20, 21, 42, 68, 72, 108], "finetu": [3, 4, 5, 23, 24, 25], "ofcours": [3, 4, 5, 9, 20, 21, 23, 24, 25, 29, 30, 31], "graph_sav": [3, 4, 5, 6, 60, 65, 68], "save_model_to_meta": [3, 4, 5], "meta_path": [3, 4, 5], "finetuned_model": [3, 4, 5, 23, 24], "quantiz": [3, 4, 5, 7, 11, 14, 23, 24, 25, 27, 34, 35, 36, 37, 40, 42, 44, 47, 49, 51, 52, 55, 56, 57, 58, 
59, 60, 61, 63, 66, 70, 71, 73, 74, 78, 79, 80, 82, 83, 85, 94, 95, 96, 98, 99, 101, 103, 107, 112, 114], "pytorch": [4, 5, 9, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 34, 72, 80, 81, 89, 90, 91, 94, 95, 96, 99, 101, 106, 107, 109, 111, 112], "repres": [4, 5, 17, 22, 23, 24, 25, 32, 36, 38, 42, 44, 46, 52, 55, 57, 58, 60, 61, 66, 68, 76, 79, 83, 85, 87, 100, 105, 106, 107, 108, 111], "spatialsvdparamet": [4, 5, 24, 25, 33, 38, 61, 76, 88], "spatial_svd": [4, 5, 24, 25, 33, 38, 61, 76, 88], "comp_accuraci": 4, "ssvd_compressed_sess": 5, "ssvd_comp_stat": [5, 25], "ssvd_finetuned_model": [5, 25], "further": [5, 25, 34, 62, 65, 80, 97, 101, 103, 105, 109], "similar": [5, 16, 21, 25, 31, 106, 108, 111], "out": [5, 7, 11, 14, 25, 42, 44, 45, 46, 52, 53, 66, 67, 68, 80, 83, 85, 86, 87, 95, 98, 103, 107], "ssvd_cp_compressed_sess": 5, "cp_comp_stat": [5, 25], "ok": [5, 25], "fine": [6, 7, 8, 9, 13, 15, 16, 20, 21, 30, 31, 34, 38, 46, 61, 68, 76, 87, 98, 101, 105, 108, 111], "tune": [6, 7, 8, 9, 13, 15, 16, 20, 21, 30, 31, 34, 38, 46, 61, 68, 76, 87, 98, 101, 105, 108, 111], "fold_all_batch_norm": [6, 9, 10, 15, 16, 18, 19, 20, 21, 26, 29, 30, 31, 43, 60, 65, 84, 89], "bn_folded_sess": [6, 9, 20, 21], "maintain": [6, 49], "fresh": 6, "save_and_load_graph": [6, 60, 65], "bn_folded_sess_copi": 6, "With": [6, 14, 16, 17, 21, 22, 31, 32], "enhanc": [6, 16, 17, 21, 22, 31, 32, 44, 54, 66, 68, 85, 107, 111], "input_label_tensor": [6, 8, 9, 20, 21, 22], "get_tensor_by_nam": [6, 7, 8, 9, 20, 21, 22, 57, 58, 68], "train_tensor": [6, 8, 9, 20, 21, 22, 68], "train_tensors_dict": [6, 8, 9, 20, 21, 22], "dict": [6, 8, 9, 20, 21, 22, 38, 43, 46, 60, 61, 65, 75, 76, 78, 79, 80, 83, 84, 85, 87], "fromkei": [6, 8, 9, 20, 21, 22], "eval_output": [6, 8, 9, 20, 21, 22], "top1": [6, 8, 9, 20, 21, 22, 27, 38], "acc": [6, 8, 9, 11, 12, 13, 17, 20, 21, 22, 44], "input_label": [6, 8, 9, 20, 21, 22], "input_label_tensors_dict": [6, 8, 9, 20, 21, 22], "zip": [6, 8, 9, 20, 21, 22, 38, 44, 58], "feed_dict": [6, 7, 8, 9, 20, 21, 22, 57, 58, 63, 68], "as_default": [6, 7, 8, 9, 20, 21, 22, 58, 60, 61, 62, 65, 68, 69], "ensur": [6, 18, 78, 91, 105, 110], "prior": [6, 26, 29, 30, 31, 97, 105, 107], "num_iter": [6, 7, 11, 38], "offer": [7, 11, 27, 52, 68, 85, 95], "suit": [7, 11, 27, 95], "network": [7, 11, 13, 14, 18, 27, 42, 61, 68, 95, 98, 100, 103, 105, 108, 110, 111, 114, 116], "often": [7, 11, 94, 95, 103, 108], "sequenc": [7, 11, 13, 27, 72, 95, 96, 104, 109], "better": [7, 8, 11, 18, 28, 72, 89, 94, 95, 105, 106, 108], "prone": [7, 11, 95], "consum": [7, 11, 19, 27, 55, 95, 103], "analyz": [7, 11, 27, 38, 45, 53, 61, 67, 76, 86, 95, 97, 103, 104, 107, 111, 114, 115], "amount": [7, 11, 17, 22, 27, 32, 95, 109], "toler": [7, 11, 27, 95, 98], "soon": [7, 11, 95], "threshold": [7, 11, 61, 95], "stop": [7, 11, 36, 57, 71, 95], "autom": [7, 11, 26, 29, 30, 31, 32, 45, 71, 79, 80, 86, 87, 95, 105], "input_tensor_nam": [7, 58], "output_tensor_nam": [7, 58], "section": [7, 9, 11, 12, 27, 29, 72, 81, 91, 92, 94, 96, 97, 99, 103, 105, 111], "eval_dataset_s": [7, 11, 12, 27, 49, 58, 73], "5000": [7, 11, 27, 49, 58, 73], "calibration_dataset_s": [7, 11, 27, 49, 58, 73], "_create_sampled_data_load": [7, 11, 27, 73], "_sampled_dataset": [7, 58], "_create_sampled_dataset": [7, 58], "num_sampl": [7, 11, 12, 27, 44, 58, 73], "_graph": [7, 22, 58], "shuffle_buffer_s": [7, 58], "300": [7, 58], "buffer": [7, 58], "shuffle_se": [7, 58], "22222": [7, 58], "shuffl": [7, 10, 15, 16, 18, 33, 38, 58, 76], "buffer_s": [7, 58], "seed": [7, 58, 76], 
"object": [7, 11, 12, 17, 22, 26, 29, 30, 31, 32, 33, 36, 40, 44, 46, 49, 51, 52, 57, 58, 59, 61, 63, 66, 68, 73, 76, 78, 83, 85, 87, 96, 105, 108, 111], "eval_dataset": [7, 11, 12, 44, 58, 73], "image_dataset": [7, 44, 58], "lambda": [7, 10, 11, 12, 13, 14, 18, 22, 44], "unlabeled_dataset": [7, 11, 12, 17, 22, 44, 58, 66, 73], "argument": [7, 11, 12, 17, 22, 32, 33, 36, 40, 44, 46, 51, 52, 57, 58, 61, 63, 66, 68, 71, 78, 80, 85, 87], "whole": [7, 11, 12, 58, 111], "np": [7, 8, 14, 33, 36, 38, 42, 44, 46, 49, 52, 54, 57, 58, 61, 65, 66], "iterate_tf_dataset": [7, 58], "sampled_dataset": [7, 11, 12, 17, 58], "global_variables_initi": [7, 57, 58, 61], "input_tensor": [7, 10, 18, 19, 33, 42, 54, 57, 58, 68, 80], "output_tensor": [7, 33, 57, 58], "num_correct_predict": [7, 58, 73], "prob": [7, 58], "predict": [7, 17, 38, 40, 44, 58, 61, 73, 105], "argmax": [7, 58, 73], "axi": [7, 17, 22, 32, 55, 58, 107], "sum": [7, 27, 38, 58, 73], "allowed_accuracy_drop": [7, 11, 27, 49, 58, 73], "convei": [7, 11], "seri": [7, 11, 27, 87], "auto_qu": [7, 11, 27, 49, 58, 73], "01": [7, 11, 27, 36, 49, 57, 58, 71, 73, 92, 94], "shown": [7, 11, 17, 22, 32, 43, 65, 79, 82, 83, 94, 103, 106, 107, 110], "adaround_dataset_s": [7, 11, 27, 49, 58, 73], "adaround_dataset": [7, 11, 58], "adaround_param": [7, 11, 27, 49, 58, 73], "set_adaround_param": [7, 11, 27, 49, 58, 73], "associ": [7, 11, 17, 22, 32, 36, 43, 57, 58, 60, 61, 73, 81, 105], "eval_scor": [7, 11, 38, 61, 76, 85], "cle": [7, 11, 27, 39, 60, 62, 77, 82, 86, 94, 99, 105, 110, 112], "standalon": [7, 11, 27, 72, 105], "fashion": [7, 11, 18, 27], "counter": [8, 12, 28, 45, 46, 68, 87], "potenti": [8, 12, 28, 43, 45, 65, 72, 104, 107, 114, 115], "instabl": [8, 12, 28, 45], "batchnrom": [8, 28], "varianc": [8, 12, 28, 45, 106], "recalcul": [8, 12, 28, 37], "stabl": [8, 12, 28, 37, 80, 94], "rather": [8, 12, 28, 37, 80, 114], "than": [8, 12, 13, 28, 37, 38, 43, 55, 61, 65, 71, 72, 76, 80, 81, 84, 87, 102, 108, 114], "noisi": [8, 12, 28, 37], "compar": [8, 12, 13, 14, 15, 16, 17, 22, 28, 32, 61, 72, 80, 89, 107, 108, 115], "focu": [8, 28], "itself": [8, 17, 22, 28, 32, 51, 103, 111, 113, 116], "inform": [8, 28, 43, 55, 61, 65, 75, 81, 84, 105, 107], "accuraci": [8, 12, 13, 17, 22, 27, 28, 32, 34, 36, 38, 40, 46, 49, 51, 57, 58, 61, 63, 68, 71, 73, 76, 78, 87, 94, 95, 98, 100, 101, 103, 105, 106, 107, 108, 110, 111, 112, 115, 117], "line": [8, 54, 59, 68, 69, 71, 74, 75, 87, 89, 99], "difficult": 8, "model_sess_bn_mut": 8, "easier": 8, "bn_mutabl": 8, "modify_sess_bn_mut": 8, "training_tf_placehold": 8, "unlik": [8, 28], "script": [8, 28], "didn": [8, 28], "becaus": [8, 14, 28, 42, 80], "present": [8, 14, 28, 34, 72, 78, 81, 103, 106], "statatist": [8, 28], "json": [8, 12, 17, 19, 22, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "default_config_per_channel": [8, 12], "is_output_quant": [8, 12, 109], "is_quant": [8, 12, 109], "is_symmetr": [8, 12, 55, 109], "strict_symmetr": [8, 12, 109], "unsigned_symmetr": [8, 12, 109], "per_channel_quant": [8, 12, 18, 55, 109], "op_typ": [8, 12, 109], "squeez": [8, 12], "pad": [8, 12, 42, 72, 80, 81], "supergroup": [8, 12, 109, 112], "op_list": [8, 12, 109], "relu": [8, 12, 13, 14, 41, 42, 43, 55, 62, 65, 72, 75, 79, 
80, 81, 84, 106, 109, 117], "clip": [8, 12, 109, 111], "gemm": [8, 12, 109], "model_input": [8, 12, 81, 109], "is_input_quant": [8, 12, 109], "model_output": [8, 12, 109], "config_file_path": 8, "tmp": [8, 12, 17, 22, 32, 44, 58, 66, 73, 85], "open": [8, 12], "w": [8, 12, 85, 91, 117], "f": [8, 12, 27, 49, 73, 80, 81, 91, 92], "dump": [8, 12], "training_range_learning_with_tf_init": [8, 12, 16, 21, 28, 31, 36, 57, 68, 71, 87], "config_fil": [8, 12, 17, 18, 22, 32, 44, 46, 52, 66, 68, 73, 75, 85, 87], "5e": [8, 24, 25, 28, 30, 31, 59, 74, 87], "7": [8, 12, 24, 25, 28, 30, 31, 43, 59, 73, 74, 87, 92, 117], "finetuned_accuraci": [8, 28, 30, 31], "helper": [8, 12, 28, 58, 60, 64, 73], "reestimate_bn_stat": [8, 12, 28, 37, 59, 74], "full": [8, 12, 28, 38, 41, 45, 64, 79, 86, 116], "100": [8, 12, 28, 37, 58, 59, 61, 68, 73, 74], "adapt": [8, 12, 18, 28, 45, 67, 74, 83, 86, 94, 99, 105, 107, 112], "forward": [8, 12, 13, 14, 15, 16, 17, 20, 21, 22, 26, 28, 29, 30, 31, 32, 42, 44, 46, 52, 66, 68, 71, 72, 74, 79, 80, 81, 82, 84, 85, 87, 91, 104, 107, 110, 112], "yield": [8, 12, 28, 52, 66, 71, 74, 85, 111], "directli": [8, 12, 16, 28, 37, 49, 82, 85, 107, 111], "bn_reestim": [8, 12, 28, 37, 59, 74], "real_input": 8, "vstack": 8, "from_tensor_slic": [8, 36, 37, 44, 57, 58, 61, 66], "bn_re_restimation_dataset": [8, 59], "start_op_nam": [8, 9, 22, 59, 60, 62, 65, 66], "bn_re_estimation_dataset": [8, 37, 59], "bn_num_batch": [8, 37, 59], "finetuned_accuracy_bn_reestim": [8, 28], "far": [8, 12, 28, 94], "effici": [8, 12, 28, 33, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "fold_all_batch_norms_to_scal": [8, 12, 28, 37, 59, 74], "resnet50_after_qat": [8, 20], "lead": [9, 29, 44, 52, 66, 85, 94, 96, 106, 110, 111], "shift": [9, 29, 67, 86, 106], "training_range_learning_with_tf_enhanced_init": [9, 16, 21, 31, 36, 57, 71, 87], "aimet_cl": [9, 19], "cle_applied_sess": 9, "under": [9, 29, 39, 62, 99, 107, 109, 114, 115], "hood": [9, 29], "correct_bia": [9, 29, 60, 75], "num_quant_sampl": [9, 29, 60, 75], "num_bias_correct_sampl": [9, 29, 60, 75], "bias_correct": [9, 29, 60, 75], "aimet_bc": 9, "quant_param": [9, 60, 75], "quantparam": [9, 29, 60, 75], "quant_mod": [9, 60], "round_mod": [9, 29, 60, 75], "ops_to_ignor": [9, 60], "bias_correction_param": [9, 60], "biascorrectionparam": [9, 60], "56": 9, "16": [9, 12, 29, 36, 38, 46, 54, 55, 57, 68, 71, 80, 83, 87, 94], "after_bc_sess": 9, "biascorrect": [9, 60], "bias_correct_param": [9, 60], "resnet50_after_qat_range_learn": [9, 21], "smaller": [10, 18, 71, 72, 94, 101, 110, 113, 116], "awai": [10, 18, 94], "image_net_dataset": [10, 11, 12, 17, 18, 19], "imagenetdataset": [10, 11, 12, 17, 18, 19], "get_val_dataset": [10, 11, 12, 17, 18, 19], "include_top": [10, 18, 19], "pool": [10, 12, 18, 19], "rest": [10, 15, 16, 18, 19, 33, 110], "sim": [10, 12, 13, 18, 26, 27, 36, 37, 48, 49, 57, 59, 71, 73, 74, 75, 82, 83, 85, 108, 111], "progbar": [10, 15, 16, 18, 19], "preprocess_input": [10, 15, 16, 18, 19, 33, 38], "sim_model": [10, 15, 16, 17, 18, 19, 26, 28, 29, 30, 31, 32, 71, 87], "tf_dataset": [10, 18, 19], "progbar_stat_upd": [10, 15, 16, 18, 19], "preprocess": [10, 13, 15, 16, 18, 38], "image_dataset_from_directori": [10, 15, 16, 18, 38], "ada_round_data": [10, 
18], "label_mod": [10, 15, 16, 18, 38], "categor": [10, 15, 16, 18, 38], "image_width": [10, 18], "image_height": [10, 18], "y": [10, 17, 18, 22, 32, 46, 68, 80, 91, 92, 107], "fo": [10, 18, 71], "r": [10, 18, 71, 83, 85], "Of": [10, 18], "cours": [10, 18], "resnet50_after_adaround": 10, "quick": [10, 14, 18, 99], "dictionari": [11, 12, 14, 38, 61, 76, 85, 87, 88, 100, 103, 109], "adam": [11, 12, 13, 15, 16, 17, 44, 46, 68], "categoricalcrossentropi": [11, 12, 17, 44], "categoricalaccuraci": [11, 12, 17, 44], "thi": [12, 13, 14, 33, 34, 36, 38, 39, 40, 41, 42, 43, 44, 46, 48, 49, 50, 51, 52, 54, 55, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 68, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 83, 84, 85, 87, 89, 91, 92, 94, 95, 97, 98, 100, 101, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 116, 117], "notebook": [12, 13, 14], "i": [12, 13, 14, 33, 34, 36, 38, 39, 41, 42, 43, 44, 45, 46, 48, 50, 51, 52, 54, 55, 57, 58, 59, 60, 61, 62, 64, 65, 66, 68, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 87, 88, 89, 90, 91, 92, 94, 95, 96, 97, 98, 99, 100, 101, 103, 104, 105, 106, 107, 108, 109, 110, 111, 113, 114, 115, 116, 117], "6": [12, 13, 14, 36, 42, 49, 57, 61, 71, 73, 80, 87, 108], "simul": [12, 13, 15, 16, 32, 40, 45, 46, 51, 53, 55, 60, 63, 67, 68, 71, 75, 78, 79, 82, 86, 87, 101, 105, 108, 112], "train_dataset_s": 12, "re_estimation_dataset_s": 12, "train_dataset": 12, "re_estimation_dataset": 12, "built": [12, 13, 90, 91], "sequenti": [12, 13, 14, 41, 42, 109, 110], "subclass": [12, 13, 33, 41, 42], "incompat": [12, 13], "therefor": [12, 13, 14, 98, 106], "conv1": [12, 23, 24, 25, 33, 42, 72, 76, 79, 80, 81, 84, 88], "fuse": [12, 109, 111], "maxpooling2d": 12, "conv2": [12, 42, 55, 76, 79, 80, 84], "flatten": [12, 80], "dens": [12, 13, 14, 41, 42, 43], "functional_model": [12, 13, 14], "baselin": [12, 13, 27, 100, 108], "loss_fn": 12, "qsim": [12, 37], "posit": [12, 13, 14, 42], "quantized_callback": [12, 13, 15, 16], "tensorboard": [12, 13, 15, 16], "log_dir": [12, 13, 15, 16], "histori": [12, 13, 15, 16], "validation_data": [12, 13, 15, 16], "reestim": [12, 37], "mnist_after_bn_re_estimation_qat_range_learn": 12, "standard": [13, 15, 16, 20, 21, 30, 31, 80, 87, 89], "1": [13, 33, 36, 37, 38, 42, 43, 46, 48, 49, 52, 54, 57, 58, 59, 60, 61, 64, 68, 71, 72, 73, 74, 76, 77, 78, 80, 81, 83, 84, 85, 87, 88, 89, 90, 91, 100, 102, 103, 104, 105, 109, 110, 111, 113, 116, 117], "dataset": [13, 33, 36, 37, 38, 44, 49, 52, 54, 57, 58, 59, 60, 61, 66, 68, 71, 73, 74, 75, 85, 87, 99, 105, 106, 111], "2": [13, 33, 36, 38, 42, 43, 46, 49, 54, 57, 58, 59, 61, 64, 68, 71, 72, 73, 74, 76, 78, 80, 81, 83, 84, 85, 87, 90, 91, 94, 105, 110, 111], "3": [13, 33, 36, 38, 39, 43, 44, 46, 49, 50, 52, 54, 57, 58, 59, 60, 61, 62, 65, 66, 68, 69, 71, 72, 73, 74, 75, 76, 77, 78, 80, 81, 84, 85, 87, 88, 89, 90, 98, 105, 108, 110, 117], "evalu": [13, 27, 33, 36, 38, 44, 46, 52, 54, 57, 58, 61, 62, 65, 66, 71, 73, 76, 85, 87, 88, 95, 99, 100, 103, 105, 107, 108, 111, 114], "4": [13, 17, 22, 23, 32, 33, 36, 37, 43, 44, 49, 52, 57, 58, 59, 61, 66, 71, 73, 74, 75, 76, 80, 83, 84, 85, 87, 96, 100, 105, 117], "imdb": 13, "sentiment": 13, "vocab_s": [13, 14, 42], "20000": [13, 14, 42], "20k": 13, "word": 13, "maxlen": [13, 14, 42], "200": [13, 14, 27, 42], "movi": 13, "review": 13, "x_train": [13, 33, 37], "y_train": [13, 33], "x_val": 13, "y_val": 13, "load_data": 13, "num_word": 13, "pad_sequ": 13, "embed_dim": [13, 14, 42], "embed": [13, 14, 42, 80, 87, 103, 110], "token": [13, 14, 42, 110], 
"num_head": [13, 14, 42], "attent": [13, 14, 42], "head": [13, 14, 42], "ff_dim": [13, 14, 42], "hidden": [13, 14, 42], "insid": [13, 14, 38, 42, 80, 91], "delta": [13, 14, 42, 44, 52, 66, 85, 111], "input_dim": [13, 14, 42], "output_dim": [13, 14, 42], "block": [13, 73], "multiheadattent": [13, 14, 42, 112], "key_dim": [13, 14, 42], "dropout": [13, 14, 42], "layernorm": [13, 14, 42], "epsilon": [13, 14, 42], "globalaveragepooling1d": [13, 14, 42], "functional_callback": 13, "histogram_freq": 13, "sparse_categorical_crossentropi": 13, "128": [13, 15, 16, 66, 80], "wrap": [13, 15, 16, 17, 19, 22, 32, 80], "wrapper": [13, 19, 26, 29, 30, 31, 36, 44, 52, 57, 61, 66, 76, 85], "effect": [13, 15, 16, 20, 21, 30, 31, 36, 37, 46, 57, 59, 68, 71, 74, 83, 87, 96, 105, 107, 109, 111], "visual": [13, 17, 22, 32, 34, 38, 56, 61, 70, 76, 91, 103, 105, 106, 107, 110, 112, 113, 116], "right": [13, 27, 105, 117], "multi": [13, 33, 65, 86, 112], "encount": 13, "access": [13, 26, 29, 30, 31, 91, 105], "mha": [13, 112], "accur": 13, "clone_lay": 13, "clone": [13, 99], "diagram": [13, 96, 100, 103, 111, 113, 116], "m": [13, 90, 91, 92, 99], "convert_to_pb": [13, 46], "onc": [13, 15, 16, 20, 21, 23, 24, 25, 29, 30, 31, 41, 79, 81, 83, 87, 96, 97, 103, 107, 108, 111], "inspect": 13, "1024": [13, 54, 68, 71, 87, 94, 104], "artifact": [13, 15, 16, 40, 51, 63, 69, 78, 91], "3000": 13, "model_after_qat": [13, 15, 16], "anoth": [13, 16, 21, 31, 83, 87, 116, 117], "most": [13, 109], "complex": [13, 36, 44, 46, 52, 57, 66, 68, 85, 87], "elementari": 13, "logdir": 13, "summari": [13, 69, 89, 95], "vanilla": [13, 16, 21, 27, 31, 110], "tool": [14, 44, 66, 85, 103, 106, 115, 117], "sequanti": 14, "build": [14, 42], "dicuss": 14, "text": [14, 42], "transform": [14, 26, 27, 29, 30, 31, 32, 42, 71, 73, 80, 87, 112], "tokenandpositionembed": [14, 42], "transformerblock": [14, 42], "super": [14, 42, 72, 80, 81], "att": [14, 42], "ffn": [14, 42], "layernorm1": [14, 42], "layernorm2": [14, 42], "dropout1": [14, 42], "dropout2": [14, 42], "kwarg": [14, 42], "attn_output": [14, 42], "out1": [14, 42], "ffn_output": [14, 42], "token_emb": [14, 42], "pos_emb": [14, 42], "random_input": [14, 42], "embedding_lay": [14, 42], "transformer_block": [14, 42], "token_and_position_embed": 14, "re": [14, 33, 45, 67, 86, 99, 105], "symmetr": [14, 55, 83, 109, 111], "model_prepar": [14, 26, 28, 29, 30, 31, 32, 42, 44, 71, 74, 80, 85, 87], "prepare_model": [14, 26, 28, 29, 30, 31, 32, 42, 44, 71, 74, 80, 85, 87], "input_lay": [14, 42], "begin": [14, 42, 72, 80, 81, 108, 109], "unwrap": 14, "ident": [14, 41, 79], "total": [14, 38, 100, 111], "get_weight": 14, "represent": [14, 55], "reorder": 14, "get_original_models_weights_in_functional_model_ord": 14, "original_model": [14, 42], "class_nam": [14, 38], "ndarrai": [14, 43, 63, 65, 84], "arg": [14, 87], "lookup": 14, "remov": [14, 37, 59, 74, 80, 91, 97, 101, 111, 117], "match": [14, 36, 40, 44, 51, 52, 57, 61, 63, 66, 76, 78, 85, 87, 97, 103, 107, 109, 110, 111, 117], "original_model_weight": 14, "pop": 14, "weight_nam": 14, "kei": [14, 16, 21, 31, 43, 55, 60, 65, 84, 92], "functional_model_weight_ord": 14, "enumer": [14, 36, 38, 57, 61, 71, 76, 78, 87, 96], "sort": 14, "weights_in_correct_ord": 14, "item": [14, 17, 22, 32, 49, 105], "weight_info": 14, "assert": [14, 80], "count_param": 14, "output_shap": 14, "textclassif": 14, "what": [14, 111, 114], "architectur": [14, 68, 86, 98], "model_weights_in_correct_ord": 14, "assert_array_equ": 14, "modelprepar": [14, 26, 29, 30, 31, 32, 37, 42, 71, 
80, 87], "arthmet": [14, 42], "experss": [14, 42], "tfoplambda": [14, 42], "ressembl": 14, "conv_1": [14, 42], "conv_2": [14, 42], "becuas": [14, 42, 51], "rais": [14, 42, 61, 68], "except": [14, 17, 22, 32, 42], "hopefulli": [14, 19], "min": [15, 16, 18, 44, 52, 55, 66, 68, 85, 89, 107, 111], "max": [15, 16, 18, 44, 52, 55, 66, 68, 85, 89, 103, 106, 107, 111], "keep": [15, 20, 21, 30, 31, 80, 87, 109, 110], "constant": [15, 20, 21, 30, 31, 49, 58, 73, 80, 87, 100, 105], "imagenet_dir": 15, "assign": [15, 16, 55, 83], "dataset_train": [15, 16], "dataset_valid": [15, 16], "respect": [15, 16, 41, 61, 89, 107], "categorical_crossentropi": [15, 16, 46], "being": [15, 16, 19, 38, 41, 43, 55, 61, 76, 79, 80, 81, 84, 85], "hyperparamet": [15, 16, 108], "henc": 16, "jointli": [16, 20, 21, 30, 31], "ye": [16, 92, 103], "due": [16, 34, 42, 67, 81, 86, 105, 106], "restrict": [16, 104], "prevent": [16, 72, 80, 97], "mention": 16, "continu": [16, 21, 31, 42, 81, 87, 105, 106, 108, 110], "benefit": [16, 21, 31, 55, 94], "analys": [17, 22, 32, 107], "respond": [17, 22, 32], "One": [17, 22, 26, 29, 30, 31, 32, 44, 46, 60, 66, 68, 98, 103, 113], "second": [17, 22, 32, 42, 58, 71, 109], "anyth": [17, 22, 32], "tupl": [17, 22, 32, 33, 36, 37, 38, 40, 43, 44, 46, 52, 57, 58, 61, 63, 65, 66, 68, 71, 72, 73, 74, 75, 76, 77, 78, 79, 84, 85, 87], "dummi": [17, 22, 32, 36, 46, 48, 57, 71, 72, 73, 77, 78, 84, 85, 87, 107], "val_dataset": 17, "callbackfunc": [17, 22, 32, 44, 52, 66, 85], "exactli": [17, 22, 32, 58, 111], "multipl": [17, 22, 32, 38, 61, 65, 68, 72, 76, 77, 78, 80, 81, 84, 87, 88, 90, 92, 101, 103, 105, 112], "eval_func": [17, 38, 61, 88], "v": [17, 22, 32, 36, 57, 71, 91, 100], "demonstr": [17, 22, 32], "quant_analyz": [17, 22, 32, 44, 52, 66, 85], "enable_per_layer_mse_loss": [17, 32, 44, 52, 85], "track": [17, 22, 32, 83, 107], "minimum": [17, 22, 32, 36, 57, 71, 80, 87, 89, 90], "histogram": [17, 22, 32, 44, 52, 66, 69, 85, 89, 105, 107, 111, 112], "seen": [17, 22, 32, 106, 107], "results_dir": [17, 22, 32, 44, 52, 58, 66, 69, 73, 85, 89], "html": [17, 22, 32, 80, 85, 91, 92, 98, 107, 112, 115], "plot": [17, 22, 32, 69, 89, 107], "per_layer_quant_en": [17, 32, 107], "per_layer_quant_dis": [17, 32, 107], "min_max_rang": [17, 22, 32, 107], "activations_pdf": [17, 22, 32, 107], "name_": [17, 32, 85], "index_0": [17, 32], "index_1": [17, 32], "index_n": [17, 32], "weights_pdf": [17, 22, 32, 107], "layer1": [17, 32, 43, 65, 84], "param_name_": [17, 22, 32, 85], "channel_index_0": [17, 22, 32], "channel_index_1": [17, 22, 32], "channel_index_n": [17, 32], "layer2": [17, 32, 43, 65, 84], "layern": [17, 32], "per_layer_mse_loss": [17, 32, 107], "sub": [17, 22, 32, 72, 92, 97, 103, 111, 117], "basi": [18, 55, 100, 103], "between": [18, 36, 38, 43, 57, 61, 65, 66, 71, 76, 78, 84, 85, 92, 106, 107, 109, 111], "imagin": 18, "filter": [18, 42], "kernel": [18, 97, 113, 116], "28": [18, 76], "were": [18, 27, 30, 31, 40, 43, 51, 55, 63, 65, 71, 78, 83, 84, 87, 92, 98, 105, 109, 117], "entireti": [18, 42], "contrast": [18, 42], "repeat": [18, 58, 97], "uniqu": 18, "attribut": [18, 42, 80, 83, 107], "conv2d_lay": 18, "kernel_s": [18, 42, 72, 80, 81], "snpe": [18, 19], "qnn": [18, 19], "config": [18, 46, 66, 68, 85, 109, 112], "style": 18, "mismatch": 18, "togeth": [18, 103], "pcq_quantsim_config": 18, "tell": [18, 114], "did": [18, 106], "resnet50_pcq_adaround": 18, "mimic": 19, "cle_applied_model": [19, 39], "yaml": 19, "h5": [19, 40, 101, 105], "savedmodel": 19, "protobuff": 19, "safe": 19, "resnet50_after_cl": 
19, "Then": [20, 21, 30, 31, 36, 52, 57, 71, 85, 87], "meta": [20, 33, 61, 63, 68, 83, 101, 105], "No": [22, 75, 81, 105], "func": [22, 85], "func_callback_arg": [22, 52, 66, 85], "data_pipelin": 22, "per_op_quant_en": 22, "per_op_quant_dis": 22, "quant_op_name0": 22, "quant_op_name1": 22, "quant_op_namen": 22, "op1": 22, "channel_index_x": 22, "op2": 22, "channel_index_i": 22, "opn": 22, "channel_index_z": 22, "per_op_mse_loss": 22, "nn": [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 71, 72, 73, 76, 79, 80, 81, 82, 84, 85, 87, 104, 112], "modul": [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 38, 41, 61, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 83, 84, 85, 87, 89, 90, 94, 105, 112, 117], "gpu": [23, 24, 25, 26, 28, 29, 30, 31, 32, 33, 60, 61, 66, 68, 76, 78, 86, 90, 91, 105, 112], "learning_rate_schedul": [23, 24, 25, 28, 30, 31, 74, 87], "schedul": [23, 24, 25, 28, 30, 31, 108], "max_epoch": [23, 24, 25, 28, 30, 31], "is_avail": [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 74], "aimet_torch": [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 71, 72, 73, 74, 75, 76, 77, 78, 80, 81, 83, 84, 85, 87, 88, 89, 90, 91, 92, 104], "compressed_model": [23, 24, 38, 61, 76], "15e": [23, 25], "prune": [24, 38, 98, 99, 100, 102, 103, 112, 117], "ssvd_compressed_model": 25, "ssvd_cp_compressed_model": 25, "ssvd_cp_finetuned_model": 25, "certain": [26, 29, 30, 31, 32, 71, 79, 80, 85, 87, 103, 104, 105, 109], "guidelin": [26, 29, 30, 31, 32, 34, 45, 48, 54, 56, 58, 68, 71, 80, 86, 94, 98, 108], "rand": [26, 28, 29, 30, 31, 32, 33, 36, 44, 57, 58, 61, 66, 72, 78, 81], "modif": [26, 29, 30, 31], "made": [26, 29, 30, 31, 80, 109], "overrid": [26, 29, 30, 31, 61, 80, 87], "no_grad": [26, 27, 29, 30, 31, 32, 71, 81, 87], "ptq": [27, 58, 73, 101, 105, 107, 108], "success": 27, "care": 27, "non": [27, 72, 80, 111], "expert": 27, "effort": [27, 58, 73, 95], "known": [27, 81, 100, 101], "heurist": [27, 61], "cumul": 27, "until": [27, 58, 73, 95], "val_transform": 27, "compos": [27, 73], "centercrop": 27, "totensor": [27, 73], "normal": [27, 43, 72, 96, 107], "485": 27, "456": 27, "406": 27, "std": 27, "229": 27, "225": 27, "imagenet_dataset": 27, "imagefold": 27, "root": 27, "eaxmpl": 27, "tqdm": 27, "subsetrandomsampl": [27, 73], "in_eval_mod": 27, "get_devic": 27, "_dataset": [27, 73], "logit": 27, "topk": [27, 49], "k": [27, 113], "view_a": 27, "unlabeleddatasetwrapp": [27, 73], "__getitem__": [27, 73], "unlabeled_imagenet_dataset": 27, "unlabeled_imagenet_data_load": 27, "initial_accuraci": [27, 49, 73], "run_infer": [27, 49, 73], "predefin": [27, 100], "empir": [27, 60, 106], "adaround_data_load": [27, 49, 73], "furhter": 27, "optimized_accuraci": [27, 49, 73], "train_load": [28, 74, 76], "images_dir": 28, "resnet18_after_qat": [28, 30, 31], "bc_param": 29, "weight_bw": [29, 75], "act_bw": [29, 75], "resnet18_after_cle_bc": 29, "matter": 32, "involv": [33, 105, 110], "four": [33, 111], "convert_tf_sess_to_kera": 33, "save_tf_session_single_gpu": 33, "sourc": [33, 36, 37, 38, 39, 40, 42, 43, 44, 46, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 68, 69, 71, 73, 74, 75, 76, 77, 78, 80, 83, 84, 85, 87, 88, 89, 91, 92, 110], "variabl": [33, 38, 42, 61, 76, 80, 91, 92, 99], "load_tf_sess_variables_to_keras_single_gpu": 33, "compressed_op": 33, "save_session_graph_and_vari": 33, "creation": 33, "compress": [33, 34, 35, 56, 70, 97, 99, 101, 112, 113, 115, 116, 117], "isol": 33, "strategi": 33, "save_as_tf_module_multi_gpu": 33, "loading_path": 33, "saving_path": 33, "load_keras_model_multi_gpu": 33, "funetun": 33, "instanc": [33, 61, 
80, 81, 87, 114], "moblinetv1": 33, "convert_tf_session_to_keras_model": 33, "mirroredstrategi": 33, "get_sess_from_keras_model": 33, "mobilnetv1": 33, "compress_sess": 33, "mobilenet": 33, "act_softmax": 33, "saved_model_single_gpu": 33, "correspnd": 33, "set_learning_phas": 33, "saved_model_multi_gpu": 33, "scope": [33, 80], "vgg16": [33, 61], "modulecompratiopair": [33, 38, 61, 76], "compressible_op": 33, "layer_a": 33, "list_of_module_comp_ratio_pair": [33, 38, 61, 76], "manual_param": [33, 61, 76], "manualmodeparam": [33, 38, 61, 76], "pylint": 33, "unus": 33, "to_categor": [33, 46], "rmsprop": 33, "mse": [33, 44, 52, 66, 85, 107, 111], "qualcomm": [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "innov": [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "center": [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "inc": [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "ai": [33, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "toolkit": [33, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "quantsim_config": [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "default_config": [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 
108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "softwar": [34, 101, 103], "dramat": 34, "lost": [34, 101], "At": [34, 98, 103], "onnx": [34, 55, 78, 79, 83, 87, 90, 91, 94, 95, 99, 101, 104, 105, 106, 107, 109, 111], "link": [34, 90, 94, 95, 96, 99, 106, 107, 111], "debug": [34, 35, 36, 40, 47, 51, 55, 56, 57, 61, 63, 70, 78, 110], "codebas": 34, "sphinx": 34, "page": [34, 91, 92, 98, 111, 112], "model": [35, 36, 37, 38, 39, 40, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 102, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117], "default_reg_param": [36, 57, 71], "default_beta_rang": [36, 57, 71], "default_warm_start": [36, 57, 71], "datasetv2": [36, 37, 57, 60, 61], "beta": [36, 57, 71, 94], "anneal": [36, 57, 71], "start_beta": [36, 57, 71], "end_beta": [36, 57, 71], "warm": [36, 57, 71, 94], "period": [36, 57, 71, 94], "zero": [36, 57, 60, 71, 83, 111, 112], "post_training_percentil": [36, 57, 71, 87], "percentil": [36, 57, 71, 87], "absolut": [36, 57, 61, 71, 76, 87], "nois": [36, 57, 67, 71, 86, 87, 105, 106, 107, 108, 109], "aimetlogg": [36, 57], "test_model": [36, 57], "keras_model": [36, 57], "dummy_forward_pass": [36, 57], "intend": [36, 44, 52, 55, 57, 61, 66, 76, 85, 98], "Or": [36, 44, 46, 52, 57, 66, 68, 78, 80, 85, 87, 103], "someth": [36, 44, 46, 52, 57, 66, 68, 85, 87, 103, 114], "apply_adaround_exampl": [36, 48, 57], "set_level_for_all_area": [36, 57], "dataset_s": [36, 57, 66], "possible_batch": [36, 57], "w4a8": [36, 57], "param_bw": [36, 57, 71, 73, 83], "output_bw": [36, 57, 71, 73, 83], "adarounded_model": [36, 71], "adarounded_sess": [36, 57], "grid": [36, 57, 71], "handl": [37, 59, 73, 74], "undo": [37, 59, 74], "upon": [37, 59, 74], "batch_norm": [37, 43, 59, 65, 74, 84], "qcquantizewrapp": [37, 74], "pair": [37, 38, 43, 61, 65, 74, 75, 76, 84], "got": [37, 65, 74, 80, 84], "prepar": [37, 41, 44, 45, 49, 52, 58, 66, 71, 73, 79, 83, 85, 86, 87, 95, 105, 112], "overal": [38, 61, 71, 76, 87, 98, 103, 110], "algorithm": [38, 55, 60, 61, 76, 98, 100, 103, 110, 117], "pick": [38, 42, 43, 60, 61, 65, 76, 98, 100, 103], "tweak": [38, 43, 61, 65, 76, 84], "compressor": [38, 61, 76], "static": [38, 42, 61, 65, 76, 80, 111], "visualization_url": [38, 61, 76, 88], "callabl": [38, 58, 61, 68, 71, 73, 74, 76, 80, 85], "cost": [38, 61, 76, 100, 103, 108], "url": [38, 61, 76, 88, 91, 92, 99, 114], "appear": [38, 43, 61, 65, 72, 76, 80, 81, 84], "compressionstat": [38, 61, 76], "use_monotonic_fit": [38, 61, 76], "saved_eval_scores_dict": [38, 61, 76, 88], "express": [38, 61, 76], "comp": [38, 61, 76], "greater": [38, 43, 61, 65, 76, 84], "monoton": [38, 61, 76, 100], "pickl": [38, 61, 76], "experi": [38, 61, 76, 103], "union": [38, 40, 42, 43, 46, 61, 62, 63, 65, 68, 71, 72, 73, 75, 76, 77, 78, 84, 85, 87], "rank": [38, 61, 76, 113, 116], "noth": [38, 61, 76], "space": [38, 61, 76], "weight_svd": [38, 61, 76], "comp_ratio": [38, 61, 76], "decode_predict": 38, "aimet_common_def": 38, "aimet_tensorflow_def": 38, "get_eval_func": 38, "50000": 38, "func_wrapp": 38, "validation_d": 38, "inp_data": 38, "img": 38, "pred": [38, 49], "cnt": 38, "b": [38, 59, 74, 83], "aimet_spatial_svd": 38, "evalfunct": 38, "driver": [38, 90, 92], "stat": [38, 61, 74, 76], "three": [39, 62, 80, 95, 98, 115], "comprehens": [39, 62], "detect": [39, 62, 103], "shall": [39, 55, 62], "rtype": [39, 42, 43], 
"cross_layer_equalization_auto": [39, 62, 77], "individu": [39, 43, 62, 65, 77, 85, 96, 97, 98, 100, 103, 105, 107, 110], "intermedi": [40, 51, 63, 72, 78, 87, 111], "accord": [40, 51, 63, 78, 105, 108, 109, 111], "comparison": [40, 51, 63, 78], "amongst": [40, 51, 63, 78], "miss": [40, 42, 51, 55, 63, 78, 81], "issu": [40, 42, 51, 63, 72, 78, 81, 96, 101, 104, 110, 112, 114, 115], "layer_output_util": [40, 51, 63, 78], "layeroutpututil": [40, 51, 63, 78], "save_dir": 40, "keraslayeroutput": 40, "implement": [40, 44, 46, 49, 52, 58, 63, 66, 73, 78, 85, 87, 104, 110], "constructor": [40, 43, 60, 61, 63, 75, 78, 79, 80, 84, 87], "generate_layer_output": [40, 51, 63, 78], "input_batch": [40, 51, 63, 78], "disk": [40, 63, 78], "obtain": [40, 43, 51, 52, 55, 60, 63, 65, 78, 84, 92, 97, 98, 107, 111], "aimet_export_artifact": [40, 51, 63, 78], "sake": [40, 51, 63, 78], "simplic": [40, 51, 63, 78], "mandatori": [40, 51, 63, 78], "load_encodings_to_sim": [40, 51, 63, 78], "construct": [40, 51, 61, 63, 72, 78, 104], "properli": [40, 51, 63, 78], "get_pre_processed_input": [40, 51, 63, 78], "fp32_layer_output_util": [40, 51, 63, 78], "fp32_layer_output": [40, 51, 63, 78], "quantsim_layer_output_util": [40, 51, 63, 78], "quantsim_layer_output": [40, 51, 63, 78], "sever": [41, 45, 64, 79, 81, 86, 98], "encourag": [41, 42, 45, 79, 80, 86], "format": [41, 43, 46, 57, 60, 61, 65, 68, 71, 83, 85, 87, 90, 95, 102], "get_model": 41, "mix": [41, 60], "reus": [41, 79, 80, 81], "had": [41, 79], "x2": [41, 79, 80], "relu2": [41, 42, 79, 81], "manditori": 42, "submodul": [42, 79], "inherit": 42, "pure": [42, 79], "inputlay": 42, "portion": 42, "get_text_classificaiton_model": 42, "model_preparer_two_subclassed_lay": 42, "get_subclass_model_with_functional_lay": 42, "sigmoid": [42, 80], "binary_classifi": 42, "myfunctionalmodel": 42, "my_functional_model": 42, "classifi": 42, "model_preparer_subclassed_model_with_functional_lay": 42, "resembl": 42, "piec": [42, 80], "python": [42, 61, 68, 87, 90, 91, 92], "caus": [42, 104, 110, 111], "trace": [42, 79], "symbol": 42, "touch": 42, "static_patch_count": 42, "guarante": 42, "verifi": [42, 80], "furthermor": 42, "resu": 42, "resblock": 42, "twice": 42, "bad": 42, "bn1": [42, 72, 81, 84], "bn2": 42, "relu1": [42, 79, 81], "plug": [43, 65, 84], "conv2dtranspos": 43, "depthwiseconv2d": [43, 102], "crosslayersc": [43, 60, 65, 84], "scale_model": [43, 65, 84], "clssetinfo": [43, 65], "highbiasfold": [43, 60, 65, 84], "bias_fold": [43, 65, 84], "cls_set_info_list": [43, 65, 84], "bn_layer": [43, 84], "sigma": [43, 65, 84], "element": [43, 55, 65, 84], "model_transform_util": 43, "replace_relu6_with_relu": 43, "cross_layer_equalization_auto_stepwis": [43, 65], "relu6": [43, 62, 65, 75, 84, 106], "model_for_cl": 43, "folded_pair": [43, 65, 84], "bn_dict": [43, 84], "conv_or_linear": 43, "group": [43, 65, 91, 109, 111], "fold_given_batch_norm": [43, 60, 65, 84], "layer_pair": [43, 65, 84], "conv_linear": 43, "is_batch_norm_second": 43, "scale_cls_set": [43, 65, 84], "cls_set": [43, 65, 84], "cls_pair_1": [43, 65, 84], "cls_pair_2": [43, 65, 84], "hold": [43, 65, 75, 84, 109], "along": [43, 60, 65, 84, 108, 111], "depth": [43, 84, 98, 110], "wise": [43, 76, 84, 85, 110], "clssetlayerpairinfo": [43, 65, 84], "scale_factor": [43, 65, 84], "relu_activation_between_lay": [43, 65, 84], "relat": [43, 60, 61, 62, 65, 75, 76, 84, 107, 111], "whose": [43, 63, 78, 80, 84, 106, 109, 117], "cross_layer_equalization_manu": [43, 65, 84], "get_example_layer_pairs_resnet50_for_fold": 43, 
"consecutive_layer_list": [43, 65, 84], "get_consecutive_layer_list_from_resnet50_for_sc": [43, 65], "scaling_factor_list": [43, 65, 84], "format_info_for_high_bias_fold": [43, 65], "conv_op_1": [43, 65], "bn_op_1": [43, 65], "conv_op_2": [43, 65], "bn_op_2": [43, 65], "conv_op_3": [43, 65], "bn_op_3": [43, 65], "11": [43, 55, 90, 92], "bn_op": [43, 65], "upstream": [43, 65, 97, 117], "downstream": [43, 55, 65], "usag": [43, 55, 64, 65, 81, 87, 98, 99, 103, 110], "conv_op": [43, 65, 69], "bn_op_with_meta": [43, 65], "_fold_upstream_flag": [43, 65], "boolean": [43, 65], "is_relu_activation_in_cls_set": [43, 65], "fill": [43, 65, 87], "create_cls_set_info_list": [43, 65], "quantanalyz": [44, 52, 53, 66, 67, 85, 105, 112], "pdf": [44, 66, 85, 112], "scalar": [44, 52, 58, 66, 85], "hotspot": [44, 66, 85, 107], "31": [44, 57, 58, 61, 71, 72, 75, 85, 87, 90, 91, 92], "toi": 44, "256": [44, 52, 66, 85, 107], "num_class": [44, 46, 58, 73], "ey": 44, "label_dataset": [44, 58], "own": [44, 49, 52, 54, 58, 59, 66, 68, 71, 73, 74, 75, 85, 87], "goal": [44, 49, 52, 58, 73, 85, 95], "action": [44, 52, 54, 59, 66, 68, 71, 74, 75, 85, 87, 117], "prepared_model": [44, 71, 80, 85, 87], "forward_pass_callback_fn": [44, 52, 66, 85], "eval_callback_fn": [44, 52, 66, 85], "approxim": [44, 52, 66, 85, 94, 98, 106, 107], "quant_analyzer_result": [44, 52, 66, 85], "abil": [45, 53, 67, 86, 112], "hardwar": [45, 53, 67, 86, 89, 105, 106, 111], "in_plac": [46, 87], "default_data_typ": [46, 68, 87], "quantizationdatatyp": [46, 68, 87], "mechan": [46, 80, 87], "custom_object": 46, "store": [46, 57, 60, 65, 68, 71, 83, 87], "pth": [46, 68, 76, 78, 83, 87], "prefix": [46, 57, 61, 68, 71, 83, 87], "quantize_model": [46, 54, 68], "dummy_x": 46, "dummy_i": 46, "randint": [46, 58], "lr": 46, "001": 46, "write": [48, 54, 68, 71, 87], "ada_rounded_model": 48, "math": 49, "auto_quant_v2": [49, 73], "onnx_model": [49, 50, 52, 54], "dummy_data": [49, 52, 54], "astyp": [49, 52, 54], "float32": [49, 52, 54], "Its": 49, "fed": 49, "unlabelled_data_load": 49, "ceil": [49, 71], "num_of_sampl": 49, "evaldataload": 49, "acc_top1": 49, "acc_top5": 49, "batch_avg_top_1_5": 49, "4f": [49, 73], "happen": [50, 77], "dummy_input_dict": 51, "serializetostr": 51, "dir_path": [51, 63, 78], "interest": [52, 85], "create_quantsim_and_encod": 52, "unlabeled_data_load": [52, 73, 85], "_get_unlabled_data_load": [52, 85], "unlabeled_dataset_iter": [52, 73, 85], "autoqu": [53, 67, 86, 99, 105, 108, 112], "unifi": [53, 67, 86], "integr": [53, 58, 67, 73, 82, 86, 105], "max_batch_count": [54, 68, 71, 87], "current_batch_count": [54, 68, 71, 87], "use_symmetric_encod": 54, "forward_pass_funct": 54, "syntax": 55, "usabl": 55, "xx": 55, "yy": 55, "zz": 55, "major": [55, 103], "revis": 55, "minor": [55, 112], "patch": 55, "substanti": 55, "fulli": [55, 61, 102], "bug": [55, 112], "backward": [55, 84], "assum": [55, 73, 91, 92], "string": [55, 109], "activation_encod": 55, "tensor_nam": 55, "param_encod": 55, "constraint": 55, "depict": 55, "6086959838867188": 55, "109158515930176": 55, "114": 55, "018501389771699905": 55, "21": 55, "558866932988167": 55, "12636379897594452": 55, "12": [55, 90], "010530316270887852": 55, "06318144500255585": 55, "06268782913684845": 55, "127": 55, "0004936049808748066": 55, "fc1": [55, 80], "05589814856648445": 55, "05546144023537636": 55, "0004367042565718293": 55, "184721499681473": 55, "10788747668266296": 55, "0089906234367221": 55, "conv2d_1": [55, 61], "1020304188132286": 55, "10380396991968155": 55, 
"008650330936207491": 55, "readvariableop": [55, 68], "1462666392326355": 55, "1451239287853241": 55, "126": 55, "0011427081098743512": 55, "08333279937505722": 55, "08268175274133682": 55, "0006510374592799766": 55, "includ": [55, 58, 73, 87, 90, 96, 103, 105, 107, 109, 111, 112], "field": 55, "dtype": [55, 80, 83], "datatyp": 55, "snippet": [55, 80], "highlight": [55, 106, 114, 115], "quantizer_arg": 55, "activation_bitwidth": 55, "param_bitwidth": 55, "popul": [55, 60], "broken": 55, "occur": [55, 61, 68], "who": 55, "knowledg": 55, "default_config_fil": [57, 58, 71], "conv2d_input": 57, "reset_default_graph": [57, 63, 68], "init": [57, 61, 83], "get_default_graph": 57, "default_rounding_mod": 58, "manner": [58, 73, 95], "meet": [58, 73, 90, 95, 98, 100], "datasetv1": [58, 59, 66], "unless": [58, 75, 92, 117], "n": [58, 85, 112], "andoutput": 58, "fp32_sess": 58, "cache_id": [58, 73], "explicitli": [58, 117], "preced": [59, 74, 109], "var": [59, 74, 92], "load_fp32_model": [59, 74], "imagenetpipelin": [59, 74], "quant_sim": [59, 74, 87], "main": [60, 96, 109, 112, 115], "reference_model": 60, "conv_bn_dict": [60, 75], "perform_only_empirical_bias_corr": [60, 75], "convbninfotyp": 60, "find_all_convs_bn_with_activ": 60, "nest": 60, "graphsearchutil": [60, 65], "biasutil": [60, 65], "bias_correction_empir": 60, "biascorrectparam": 60, "fc1000": [60, 62, 65], "_new_sess": 60, "analyt": [60, 106, 114, 115], "bias_correction_empirical_analyt": 60, "bias_correction_after_cl": 60, "sess_after_cl": 60, "bias_correction_per_lay": 60, "corrected_model": 60, "layer_name_to_be_correct": 60, "analytical_bias_correction_per_lay": 60, "preceeding_bn_layer_info": 60, "is_first_conv": 60, "bc": [60, 94], "preceed": [60, 96], "bias_correction_single_layer_empir": 60, "initialize_model_with_bia": 60, "example_conv_lay": 60, "res2a_branch2a": [60, 65], "bias_correction_single_layer_analyt": 60, "convs_bn_activation_info_dict": 60, "sure": [60, 100, 104], "preceding_bn_layer_info": 60, "tar": 61, "train_model": [61, 76], "train_flag": [61, 76], "channels_last": 61, "downsamplelay": 61, "upsamplelay": 61, "teh": [61, 117], "evaluate_model": [61, 76], "honor": [61, 76], "obvious": [61, 76], "spatial_svd_auto_mod": [61, 76], "block1_conv1": 61, "compr_model_sess": 61, "pretti": [61, 76], "spatial_svd_manual_mod": [61, 76], "block1_conv2": 61, "channel_pruning_auto_mod": [61, 76], "channel_pruning_manual_mod": [61, 76], "block1_conv2_op": 61, "block2_conv2_op": 61, "block2_conv2": 61, "checkpoint": [61, 68, 87, 105], "output_fil": 61, "svd_graph": 61, "svd_type": 61, "num_lay": 61, "layer_rank": 61, "num_rank": 61, "no_evalu": 61, "layer_selection_threshold": 61, "connect": [61, 97, 102, 116], "balanc": [61, 103], "multipli": [61, 98], "accumul": [61, 98], "footprint": 61, "ssvd": [61, 98], "length": 61, "compression_point": 61, "valueerror": [61, 68], "compress_net": 61, "eval_nam": 61, "run_graph": 61, "evaluate_graph": [61, 68], "default_eval_func": [61, 68], "error_margin": 61, "avg": 61, "graph_ev": [61, 68], "prototyp": 61, "accept": [61, 106, 110], "degrad": [61, 103], "invalid": [61, 80], "runtimeerror": 61, "tfrecord_gener": 61, "tf_gen": [61, 68], "mnistpars": [61, 68], "weight_svd_auto_mod": [61, 76], "alloc": [61, 68], "wish": [61, 68, 91, 92], "tfrecordgener": [61, 68], "mnist": [61, 68, 76], "parser": [61, 68], "mnist_sav": [61, 68], "95": 61, "pretty_print": 61, "weight_svd_manual_mod": [61, 76], "matmul_1": 61, "connectedgraph": [62, 65, 81], "hbf": [62, 65, 94], "new_sess": 62, "wherein": [63, 
78], "saver": 63, "import_meta_graph": 63, "restor": [63, 87, 110], "trainbl": 64, "recompil": 64, "temp": 64, "clean": 64, "recurr": [64, 112], "rnn": [64, 112], "lstm": [64, 112], "graph_util": 65, "after_relu_replace_sess": 65, "find_and_replace_relu6_with_relu": 65, "after_bn_fold_sess": 65, "after_cls_sess": 65, "after_hbf_sess": 65, "updated_sess": 65, "map_cls_sets_to_new_sess": 65, "tf_names_op_dict": 65, "get_layer_pairs_resnet50_for_fold": 65, "after_fold_sess": 65, "graph_search": 65, "bn2a_branch2a": 65, "cond": 65, "fusedbatchnorm_1": 65, "res2a_branch2b": 65, "bn2a_branch2b": 65, "res2a_branch2c": 65, "bn2a_branch2c": 65, "conv1_op": 65, "conv1_depthwise_op": 65, "conv1_pointwise_op": 65, "temp_cl": 65, "model_start_op_nam": 66, "model_output_op_nam": 66, "learnt": 68, "orig_sess": 68, "quantisim": 68, "tutori": 68, "load_model_from_meta": 68, "reshape_input": 68, "dense_1": 68, "biasadd": 68, "trainingextens": [68, 87], "src": [68, 87], "quantization_aware_training_range_learn": 68, "parser2": 68, "generator2": 68, "cross_entropi": 68, "xent": 68, "train_step": 68, "simultan": 68, "fc1_w": 68, "matmul": [68, 112], "perf": 68, "ce": 68, "adamoptim": 68, "tempadam": 68, "initialize_uninitialized_var": 68, "read_data_set": 68, "one_hot": 68, "next_batch": 68, "plotting_util": 69, "visualize_weight_ranges_single_lay": 69, "scatter": [69, 89], "bokeh": [69, 88, 89], "visualize_relative_weight_ranges_single_lay": 69, "publish": [69, 88, 89], "visualizing_weight_ranges_for_single_lay": 69, "visualiza": 69, "visualizing_relative_weight_ranges_for_single_lay": 69, "param_bw_override_list": 71, "ignore_quant_ops_list": 71, "pars": [71, 84, 87], "affect": [71, 96, 109, 117], "commonli": 71, "10k": 71, "15k": 71, "get_train_dataload": [71, 74], "quantized_resnet18": [71, 87], "experiment": [72, 103, 109], "arch_check": 72, "archcheck": 72, "check_model_arch": 72, "result_dir": 72, "_node_check_dict": 72, "record": [72, 85], "fail": [72, 80, 81, 95, 104, 105], "arch_checker_report": 72, "dotted_name_op": 72, "nodeerrorreportobject": 72, "archcheckerreport": 72, "condit": [72, 80, 81], "less": [72, 97, 100], "modelwithnotenoughchannel": 72, "prelu": 72, "stride": [72, 80, 81], "batchnorm2d": [72, 81, 84], "example_check_for_number_of_conv_channel": 72, "fewer": 72, "logger": [72, 81], "_check_conv_channel_32_bas": 72, "_check_conv_channel_larger_than_32": 72, "layer_nam": [72, 85], "modelwithprelu": 72, "prelu1": 72, "example_check_for_non_performant_activ": 72, "num_paramet": 72, "_activation_check": 72, "modelwithnonfoldablebn": 72, "foldabl": 72, "avg_pool1": 72, "avgpool2d": 72, "example_check_for_standalone_bn": 72, "averagepool": 72, "ep": 72, "05": [72, 92], "momentum": 72, "affin": [72, 83], "track_running_stat": 72, "_check_batch_norm_fold": 72, "strict_valid": 73, "model_prepare_requir": 73, "id": [73, 88, 91, 114], "cach": [73, 92], "hen": 73, "proce": 73, "unid": 73, "unintuit": 73, "_subset_sampl": 73, "sampler": 73, "fp32_model": 73, "fakedata": 73, "eval_data_load": 73, "dim": 73, "deprec": [73, 105], "dummy_input_on_cpu": 73, "dummy_input_on_gpu": 73, "layers_to_ignor": 75, "remain": [75, 100, 105, 106, 111], "calc": 75, "corr": 75, "irrespect": 75, "fact": 75, "elig": 75, "input_bn": 75, "output_bn": 75, "in_activation_typ": 75, "no_activ": 75, "out_activation_typ": 75, "hode": 75, "mobilenetv2": [75, 84], "512": 75, "module_prop_dict": 75, "find_all_conv_bn_with_activ": 75, "weightsvdparamet": 76, "tarrankselectionparamet": 76, "num_rank_indic": 76, "rank_select_schem": 
76, "select_param": 76, "rankselectschem": 76, "mnist_trained_on_gpu": 76, "rank_select": 76, "mnist_torch_model": 76, "dataloadermnist": 76, "_layer_db": 76, "ture": 76, "batch_callback": 76, "spatial_svd_auto_mode_with_layerwise_finetun": 76, "torchscript": [78, 87], "naming_schem": 78, "namingschem": 78, "onnx_export_arg": [78, 87], "onnxexportapiarg": [78, 87], "consist": [78, 95, 111, 117], "numer": 78, "onnx_util": 78, "pythonpath": [78, 99], "successfulli": [78, 104], "map_loc": 78, "model_torch": 78, "convers": [79, 110], "onnx_file_nam": 79, "jit": 79, "traceabl": [79, 80], "stateless": 79, "former": 79, "retrain": 79, "whenev": 79, "image_rgb": 79, "rgb_output": 79, "image_bw": 79, "bw_output": 79, "rgb": 79, "bw": [79, 83, 87], "elementwis": [80, 112], "unrol": 80, "independ": [80, 110], "modules_to_exclud": 80, "module_classes_to_exclud": 80, "concrete_arg": 80, "duplic": 80, "exclud": [80, 81, 85], "partial": 80, "special": 80, "control": [80, 111], "flow": [80, 82, 96, 105, 108, 110, 111], "won": 80, "symbolic_trac": 80, "graphmodul": 80, "modelwithfunctionalrelu": 80, "9216": 80, "fc2": 80, "model_preparer_functional_exampl": 80, "allclos": 80, "modelwithreusedrelu": 80, "model_preparer_reused_exampl": 80, "modelwithelementwiseaddop": 80, "x1": 80, "model_preparer_elementwise_add_exampl": 80, "dynam": [80, 106, 111, 112, 115], "statement": [80, 104], "branch": [80, 99, 109], "weren": 80, "traceerror": 80, "workaround": [80, 104], "problem": [80, 110], "across": [80, 106, 107], "Such": 80, "concret": 80, "truli": 80, "custom_function_not_to_be_trac": 80, "call_funct": 80, "__torch_function__": 80, "sqrt": 80, "modelwithnontorchfunct": 80, "model_transform": 80, "tracer": 80, "is_leaf_modul": 80, "leaf": [80, 112], "expos": [80, 94], "module_to_exclud": 80, "examin": 80, "custommodul": 80, "softplu": 80, "custommodel": 80, "arang": 80, "traceback": 80, "typeerror": 80, "receiv": 80, "proxi": 80, "layout": 80, "pin_memori": 80, "requires_grad": 80, "problemat": [80, 110, 115], "determinist": 80, "hard": 80, "do_not_trace_m": 80, "share": [81, 92], "modelwithreusednod": 81, "inplac": 81, "2592": 81, "view": [81, 94, 95, 96, 101, 104, 106, 107, 111, 114], "model_valid": 81, "modelvalid": 81, "validate_example_model": 81, "validate_model": 81, "validate_for_reused_modul": 81, "0x7f127685a598": 81, "resolv": 81, "warn": [81, 105], "redefin": 81, "distinct": 81, "rewrit": [81, 104], "modelwithoutreusednod": 81, "rerun": 81, "0x7ff577373598": 81, "validate_for_missing_modul": 81, "0x7ff5703eff28": 81, "modelwithfunctionallinear": 81, "0x7f9dd9bd90d0": 81, "matmul_8": 81, "reason": 81, "op_type_map": 81, "recogn": [81, 109, 111], "functional_op": 81, "modelwithoutfunctionallinear": 81, "parallel": 82, "dataparallel": [82, 86], "doesn": 82, "forth": 82, "peft": 83, "adaptermetadata": 83, "lora": 83, "lora_a": 83, "alpha": 83, "lora_b": 83, "replace_lora_layers_with_quantizable_lay": 83, "save_lora_weights_after_adapt": 83, "track_lora_meta_data": 83, "replaced_module_typ": 83, "peftquantutil": 83, "adapater_name_to_meta_data": 83, "name_to_module_dict": 83, "track_meta_data": 83, "pt": 83, "disable_lora_adapt": 83, "enable_adapter_and_load_weight": 83, "adapter_weights_path": 83, "use_safetensor": 83, "bin": [83, 91, 92], "safetensor": 83, "export_adapter_weight": 83, "onnx_model_path": 83, "freeze_base_model": 83, "freeze_base_model_activation_quant": 83, "freeze_base_model_param_quant": 83, "get_quantized_lora_lay": 83, "vice": [83, 111], "versa": [83, 111], 
"set_bitwidth_for_lora_adapt": 83, "loraconfig": 83, "get_peft_model": 83, "lora_config": 83, "lora_alpha": 83, "lora_dropout": 83, "target_modul": 83, "tmp_dir": 83, "lora_weights_after_adaptation_for_adapter1": 83, "meta_data": 83, "convinplacelinear": 83, "peft_util": 83, "tmpdir": 83, "export_model": [83, 87], "filename_prefix_encod": [83, 87], "base_encod": 83, "lora_modul": 83, "param_quant": 83, "quantizedequant": 83, "base_model": 83, "adapter1": 83, "adapter1_weight": 83, "conv1d": [84, 112], "convtranspose2d": 84, "batchnorm1d": 84, "cross_layer_equalization_auto_step_by_step": 84, "conv_bn": 84, "replace_modules_of_type1_with_type2": 84, "layer_list": 84, "clspairinfo": 84, "depthwis": [84, 96, 112], "cross_layer_equalization_depthwise_lay": 84, "encapsul": 85, "check_model_sensitivity_to_quant": 85, "perform_per_layer_analysis_by_enabling_quant_wrapp": 85, "occurr": [85, 97], "perform_per_layer_analysis_by_disabling_quant_wrapp": 85, "export_per_layer_encoding_min_max_rang": 85, "esults_dir": 85, "pcq": [85, 96, 107], "wrapped_module_nam": 85, "param_nam": 85, "export_per_layer_stats_histogram": 85, "ctivations_pdf": 85, "eights_pdf": 85, "am": 85, "channel_index": 85, "export_per_layer_mse_loss": 85, "tap": 85, "checker": 86, "concern": 86, "save_checkpoint": 87, "file_path": 87, "load_checkpoint": 87, "quant_sim_model": 87, "propagate_encod": 87, "export_to_torchscript": 87, "use_embedded_encod": 87, "opset_vers": 87, "enable_onnx_check": 87, "entri": [87, 109], "data_typ": 87, "fakequ": 87, "forward_pass_arg": 87, "quatiz": 87, "unction": 87, "visualize_serialized_data": 88, "visualizecompress": [88, 114], "server": [88, 99], "tabl": [88, 99, 100, 104, 114], "display_eval_scor": [88, 114], "saved_eval_scores_dict_path": 88, "display_comp_ratio_plot": [88, 114], "comp_ratio_list_path": 88, "pkl": 88, "start_bokeh_server_sess": 88, "model_compression_with_visu": 88, "65": [88, 92, 98], "resnet18_eval_scor": 88, "comp_ratios_file_path": 88, "greedy_selection_comp_ratios_list": 88, "eval_scores_path": 88, "compression_visu": 88, "termin": [88, 99], "visualize_model": 89, "visualize_relative_weight_ranges_to_identify_problematic_lay": 89, "selected_lay": 89, "figur": [89, 94, 100, 110, 117], "visualize_weight_rang": 89, "deviat": 89, "visualize_changes_after_optim": 89, "old_model": 89, "new_model": 89, "visualize_changes_in_model_after_and_before_cl": 89, "visualiz": 89, "model_copi": 89, "visualize_weight_ranges_model": 89, "usual": [89, 108], "visualize_relative_weight_ranges_model": 89, "pypi": 90, "intel": 90, "x86": 90, "processor": 90, "linux": [90, 92], "ubuntu": [90, 92], "22": [90, 92], "04": [90, 92], "lt": [90, 92], "pip": [90, 91, 92, 99], "apt": [90, 91, 92], "liblapack": [90, 91, 92], "python3": [90, 91, 92, 99], "variant": [90, 92, 94, 95, 96, 106, 107, 108, 111], "latest": [90, 91], "whl": [90, 91, 92], "host": [90, 91, 92, 112, 114], "github": [90, 91, 92, 98, 99, 112], "com": [90, 91, 92, 99, 112], "quic": [90, 91, 92, 98, 99, 112], "13": [90, 91], "torch_gpu_": 90, "cp38": [90, 92], "linux_x86_64": [90, 91, 92], "torch_cpu_": 90, "tf_gpu_": 90, "tf_cpu_": 90, "14": 90, "onnx_gpu_": 90, "onnx_cpu_": 90, "brows": 90, "torch_gpu": [90, 91, 92], "torch_cpu": [90, 91, 92], "tf_gpu": [90, 91, 92], "tf_cpu": [90, 91, 92], "onnx_gpu": [90, 91, 92], "onnx_cpu": [90, 91, 92], "package_prefix": 90, "platform": [90, 105], "setup": 90, "bash": [90, 91], "command": [90, 91, 92, 99, 114], "shell": 90, "nvidia": [90, 91, 92], "card": 90, "capabl": [90, 114, 115], "docker": 
90, "455": 90, "newer": 90, "cudnn": 90, "machin": [90, 91, 103], "develop": [90, 91, 92], "click": 90, "instruct": [91, 92, 99], "variant_str": [91, 92], "ONE": [91, 92], "pt113": 91, "aimet_vari": [91, 92], "workspac": [91, 99], "absolute_path_to_workspac": [91, 99], "docker_image_nam": 91, "codelinaro": 91, "dev": [91, 92], "docker_container_nam": 91, "any_nam": 91, "any_tag": 91, "jenkin": 91, "dockerfil": 91, "p": 91, "grep": 91, "kill": 91, "rm": 91, "passwd": 91, "ro": 91, "home": 91, "mnt": 91, "entrypoint": 91, "hostnam": 91, "filesystem": 91, "port": [91, 114], "port_id": 91, "project": [91, 92], "sudo": [91, 92, 99], "tag": [91, 92, 99, 112], "release_tag": [91, 92, 99], "download_url": [91, 92], "suffix": [91, 92], "wheel_file_suffix": [91, 92], "cp310": [91, 92], "pend": [91, 92, 99], "pip3": [91, 92], "h": [91, 92, 99, 116, 117], "usr": [91, 92], "lib": [91, 92], "dist": [91, 92], "torch_stabl": [91, 92], "OR": [91, 92], "envsetup": [91, 92], "sh": [91, 92], "local": [92, 114], "requisit": 92, "upgrad": 92, "wget": 92, "gnupg2": 92, "visit": [92, 101], "archiv": 92, "exact": [92, 96], "date": 92, "repo": [92, 99], "ubuntu2204": 92, "x86_64": 92, "pin": 92, "mv": 92, "prefer": [92, 103], "d": 92, "repositori": 92, "600": 92, "local_instal": 92, "local_11": 92, "520": 92, "61": 92, "1_amd64": 92, "deb": 92, "adv": 92, "fetch": 92, "3bf863cc": 92, "pub": 92, "dpkg": 92, "cp": [92, 98], "keyr": 92, "gpg": 92, "echo": 92, "515": 92, "torch_gpu_pt113": 92, "torch_cpu_pt113": 92, "cp36": 92, "cp36m": 92, "cp37": 92, "cp37m": 92, "py3": 92, "wheel": 92, "cat": 92, "reqs_deb_common": 92, "txt": 92, "xarg": 92, "reqs_deb_torch_common": 92, "reqs_deb_onnx_common": 92, "reqs_deb_tf_gpu": 92, "reqs_deb_torch_gpu": 92, "reqs_deb_onnx_gpu": 92, "uninstal": 92, "post1": 92, "onnxruntime_v": 92, "c": [92, 98], "__version__": 92, "ln": 92, "gnu": 92, "libjpeg": 92, "chose": 92, "bnf": 94, "coupl": 94, "moder": 94, "enter": 95, "preprat": 95, "mainli": 95, "decreas": 96, "oscil": 96, "presenc": 97, "residu": 97, "discuss": [98, 110, 111], "reduct": 98, "uncompress": 98, "latenc": 98, "vari": [98, 100, 106, 115], "io": [98, 112], "half": 98, "unknown": 98, "apriori": 98, "cssvd": 98, "75": 98, "2b": 98, "larg": [98, 108, 113, 116], "2a": 98, "revisit": 98, "ccp": 98, "csvd": 98, "becom": [99, 106], "familiar": 99, "browsabl": 99, "metapackag": 99, "ip": 99, "browser": 99, "past": 99, "mkdir": 99, "cd": 99, "packag": [99, 112], "git": 99, "www": 99, "navig": 99, "launch": 99, "ipynb": 99, "extens": 99, "therein": 99, "assess": 100, "highest": 100, "column": 100, "unmodifi": 100, "strict": [100, 109, 111], "curv": 100, "core": 100, "interpol": 100, "met": 100, "binari": 100, "solut": [100, 108, 110], "lesser": [100, 103], "fall": [100, 109], "drstical": 100, "edg": 101, "incur": [101, 107], "hw": 101, "redund": 101, "product": 101, "technologi": 101, "subsidiari": 101, "dilat": 102, "librari": 103, "guidebook": [103, 105], "advic": 103, "phase": [103, 105], "nomin": 103, "fc": 103, "term": [103, 113, 114, 115, 116], "notic": 103, "sharp": 103, "respons": 103, "carefulli": 103, "slow": 103, "searcher": 103, "strike": 103, "xiangyu": 103, "zhang": 103, "jianhua": 103, "zou": 103, "kaim": 103, "he": 103, "jian": 103, "sun": 103, "deep": 103, "ieee": [103, 106], "transact": 103, "pattern": 103, "intellig": 103, "vol": 103, "38": 103, "pp": 103, "1943": 103, "1955": 103, "oct": 103, "2016": 103, "yihui": 103, "confer": [103, 106], "vision": [103, 106], "venic": 103, "2017": 103, "1398": 103, "1406": 
103, "jaderberg": 103, "andrea": 103, "vedaldi": 103, "andrew": 103, "zisserman": 103, "expans": 103, "british": 103, "jan": 103, "2014": 103, "andrei": 103, "kuzmin": 103, "marku": [103, 106], "nagel": [103, 106], "saurabh": 103, "pitr": 103, "sandeep": 103, "pendyam": 103, "tijmen": [103, 106], "blankevoort": [103, 106], "taxonomi": 103, "primit": 104, "slice": 104, "bilinear": 104, "upsampl": 104, "129": 104, "align_corn": 104, "deconvolut": 104, "deeplabv3": 104, "address": [104, 110, 114], "introduc": [105, 109, 111], "advantag": 105, "fast": 105, "easi": [105, 107], "gap": 105, "robust": 105, "longer": [105, 108], "account": [105, 108, 110], "advis": [105, 109], "prep": 105, "align": 105, "retri": 105, "hand": 105, "satisfactori": [105, 110], "bring": 105, "onto": 105, "pb": 105, "trial": 105, "particular": [105, 109], "seem": 105, "bat": 105, "surround": 106, "big": 106, "discrep": 106, "wide": 106, "significantli": 106, "quantizaion": 106, "bottleneck": [106, 110], "hybrid": 106, "approach": [106, 111], "mart": 106, "van": 106, "baalen": 106, "seoul": 106, "octob": 106, "rune": 107, "situat": 107, "pinpoint": 107, "culprit": 107, "toss": 107, "outlier": [107, 111], "monitor": 107, "contribut": [107, 110], "mitig": [108, 111], "come": [108, 111], "accompani": 108, "throughout": [108, 109, 115], "themselv": 108, "aid": 108, "converg": 108, "divid": 108, "six": 109, "overrul": 109, "turn": 109, "empti": 109, "omit": 109, "asymmetr": [109, 111], "asid": 109, "govern": 109, "unsign": [109, 111], "convent": 109, "member": 109, "whatev": 109, "earlier": 109, "diagnost": 110, "strictli": 110, "insight": [110, 114, 115], "underperform": 110, "tackl": 110, "underli": 110, "chart": 110, "saniti": 110, "behav": 110, "ofth": 110, "kept": 110, "toward": 110, "uneven": 110, "inner": 110, "bert": 110, "reveal": 110, "resort": 110, "revert": 110, "power": 110, "ultim": 111, "ingest": 111, "000": 111, "dequant": 111, "dequantiz": 111, "hook": 111, "intercept": 111, "q": 111, "clamp": 111, "equat": 111, "textrm": 111, "dfrac": 111, "quad": 111, "strong": 111, "excess": 111, "signal": 111, "sqnr": 111, "squar": 111, "qmin": 111, "qmax": 111, "satur": 111, "erro": 111, "alongsid": 111, "wherea": 111, "ones": 111, "sign": 111, "slim": 112, "backslash": 112, "user_guid": 112, "api_doc": 112, "quantizablemultiheadattent": 112, "kyuykim": 112, "mangal": 112, "geunle": 112, "correctli": 112, "klhsieh": 112, "akhobar": 112, "resid": 112, "ashvkuma": 112, "fp16": 112, "convtranspose1d": 112, "concat": 112, "stand": [112, 113, 116], "adaptiveround": 112, "gru": 112, "instal": 112, "\ud835\udc5a": [113, 116], "\ud835\udc5b": [113, 116], "\u210e": [113, 116], "\ud835\udc64": [113, 116], "\ud835\udc58": [113, 116], "larger": [113, 116], "degre": [113, 116], "assist": [114, 115], "progress": [114, 115], "computation": [114, 115], "heavi": [114, 115], "websocket": 114, "listen": 114, "5006": 114, "lot": 115, "lose": 117, "pictori": 117, "volum": 117, "hxwx8": 117, "hxwx5": 117, "propag": 117, "That": 117, "green": 117, "color": 117, "side": 117, "pink": 117, "orang": 117}, "objects": {"aimet_common.bias_correction": [[75, 0, 1, "", "ConvBnInfoType"]], "aimet_common.defs": [[75, 0, 1, "", "ActivationType"], [61, 0, 1, "", "CompressionScheme"], [61, 0, 1, "", "CostMetric"], [76, 0, 1, "", "GreedySelectionParameters"], [87, 0, 1, "", "QuantScheme"]], "aimet_common.defs.ActivationType": [[75, 1, 1, "", "no_activation"], [75, 1, 1, "", "relu"], [75, 1, 1, "", "relu6"]], "aimet_common.defs.CompressionScheme": [[61, 1, 1, 
"", "channel_pruning"], [61, 1, 1, "", "spatial_svd"], [61, 1, 1, "", "weight_svd"]], "aimet_common.defs.CostMetric": [[61, 1, 1, "", "mac"], [61, 1, 1, "", "memory"]], "aimet_common.defs.QuantScheme": [[87, 1, 1, "", "post_training_percentile"], [87, 1, 1, "", "post_training_tf"], [87, 1, 1, "", "post_training_tf_enhanced"], [87, 1, 1, "", "training_range_learning_with_tf_enhanced_init"], [87, 1, 1, "", "training_range_learning_with_tf_init"]], "aimet_common.utils": [[85, 0, 1, "", "CallbackFunc"]], "aimet_tensorflow.adaround.adaround_weight.Adaround": [[57, 2, 1, "", "apply_adaround"]], "aimet_tensorflow.adaround.adaround_weight": [[57, 0, 1, "", "AdaroundParameters"]], "aimet_tensorflow.auto_quant": [[58, 0, 1, "", "AutoQuant"]], "aimet_tensorflow.auto_quant.AutoQuant": [[58, 3, 1, "", "apply"], [58, 3, 1, "", "set_adaround_params"]], "aimet_tensorflow.batch_norm_fold": [[65, 2, 1, "", "fold_all_batch_norms"], [59, 2, 1, "", "fold_all_batch_norms_to_scale"], [65, 2, 1, "", "fold_given_batch_norms"]], "aimet_tensorflow.bias_correction.BiasCorrection": [[60, 2, 1, "", "analytical_bias_correction_per_layer"], [60, 2, 1, "", "bias_correction_per_layer"], [60, 2, 1, "", "correct_bias"]], "aimet_tensorflow.bias_correction": [[60, 2, 1, "", "BiasCorrectionParams"], [60, 0, 1, "", "QuantParams"]], "aimet_tensorflow.bn_reestimation": [[59, 2, 1, "", "reestimate_bn_stats"]], "aimet_tensorflow.compress": [[61, 0, 1, "", "ModelCompressor"]], "aimet_tensorflow.compress.ModelCompressor": [[61, 3, 1, "", "compress_model"]], "aimet_tensorflow.cross_layer_equalization": [[65, 0, 1, "", "ClsSetInfo"], [62, 2, 1, "", "equalize_model"]], "aimet_tensorflow.cross_layer_equalization.ClsSetInfo": [[65, 0, 1, "", "ClsSetLayerPairInfo"], [65, 3, 1, "", "map_cls_sets_to_new_session"]], "aimet_tensorflow.cross_layer_equalization.CrossLayerScaling": [[65, 2, 1, "", "scale_cls_sets"], [65, 2, 1, "", "scale_model"]], "aimet_tensorflow.cross_layer_equalization.HighBiasFold": [[65, 2, 1, "id0", "bias_fold"]], "aimet_tensorflow.defs": [[61, 0, 1, "", "ChannelPruningParameters"], [61, 0, 1, "", "ModuleCompRatioPair"], [61, 0, 1, "", "SpatialSvdParameters"]], "aimet_tensorflow.defs.ChannelPruningParameters": [[61, 0, 1, "", "AutoModeParams"], [61, 0, 1, "", "ManualModeParams"], [61, 0, 1, "", "Mode"]], "aimet_tensorflow.defs.ChannelPruningParameters.Mode": [[61, 1, 1, "", "auto"], [61, 1, 1, "", "manual"]], "aimet_tensorflow.defs.SpatialSvdParameters": [[61, 0, 1, "", "AutoModeParams"], [61, 0, 1, "", "ManualModeParams"], [61, 0, 1, "", "Mode"]], "aimet_tensorflow.defs.SpatialSvdParameters.Mode": [[61, 1, 1, "", "auto"], [61, 1, 1, "", "manual"]], "aimet_tensorflow.keras.batch_norm_fold": [[43, 2, 1, "", "fold_all_batch_norms"], [37, 2, 1, "", "fold_all_batch_norms_to_scale"], [43, 2, 1, "", "fold_given_batch_norms"]], "aimet_tensorflow.keras.bn_reestimation": [[37, 2, 1, "", "reestimate_bn_stats"]], "aimet_tensorflow.keras.compress": [[38, 0, 1, "", "ModelCompressor"]], "aimet_tensorflow.keras.compress.ModelCompressor": [[38, 3, 1, "", "compress_model"]], "aimet_tensorflow.keras.cross_layer_equalization": [[43, 0, 1, "", "ClsSetInfo"], [39, 2, 1, "", "equalize_model"]], "aimet_tensorflow.keras.cross_layer_equalization.ClsSetInfo": [[43, 0, 1, "", "ClsSetLayerPairInfo"]], "aimet_tensorflow.keras.cross_layer_equalization.CrossLayerScaling": [[43, 2, 1, "", "scale_cls_sets"], [43, 2, 1, "", "scale_model"]], "aimet_tensorflow.keras.cross_layer_equalization.HighBiasFold": [[43, 2, 1, "id0", "bias_fold"]], 
"aimet_tensorflow.keras.layer_output_utils": [[40, 0, 1, "", "LayerOutputUtil"]], "aimet_tensorflow.keras.layer_output_utils.LayerOutputUtil": [[40, 3, 1, "", "generate_layer_outputs"]], "aimet_tensorflow.keras.model_preparer": [[42, 2, 1, "", "prepare_model"]], "aimet_tensorflow.keras.quant_analyzer": [[44, 0, 1, "", "QuantAnalyzer"]], "aimet_tensorflow.keras.quant_analyzer.QuantAnalyzer": [[44, 3, 1, "", "analyze"]], "aimet_tensorflow.keras.quantsim": [[46, 0, 1, "", "QuantizationSimModel"]], "aimet_tensorflow.keras.quantsim.QuantizationSimModel": [[46, 3, 1, "", "compute_encodings"], [46, 3, 1, "", "export"]], "aimet_tensorflow.layer_output_utils": [[63, 0, 1, "", "LayerOutputUtil"]], "aimet_tensorflow.layer_output_utils.LayerOutputUtil": [[63, 3, 1, "", "generate_layer_outputs"]], "aimet_tensorflow.plotting_utils": [[69, 2, 1, "", "visualize_relative_weight_ranges_single_layer"], [69, 2, 1, "", "visualize_weight_ranges_single_layer"]], "aimet_tensorflow.quant_analyzer": [[66, 0, 1, "", "QuantAnalyzer"]], "aimet_tensorflow.quant_analyzer.QuantAnalyzer": [[66, 3, 1, "", "analyze"]], "aimet_tensorflow.quantsim": [[68, 0, 1, "", "QuantizationSimModel"]], "aimet_tensorflow.quantsim.QuantizationSimModel": [[68, 3, 1, "", "compute_encodings"], [68, 3, 1, "", "export"]], "aimet_tensorflow.svd": [[61, 0, 1, "", "Svd"]], "aimet_tensorflow.svd.Svd": [[61, 3, 1, "", "compress_net"]], "aimet_tensorflow.utils.convert_tf_sess_to_keras": [[33, 2, 1, "", "load_keras_model_multi_gpu"], [33, 2, 1, "", "load_tf_sess_variables_to_keras_single_gpu"], [33, 2, 1, "", "save_as_tf_module_multi_gpu"], [33, 2, 1, "", "save_tf_session_single_gpu"]], "aimet_tensorflow.utils.graph": [[64, 2, 1, "", "update_keras_bn_ops_trainable_flag"]], "aimet_torch.adaround.adaround_weight.Adaround": [[71, 2, 1, "", "apply_adaround"]], "aimet_torch.adaround.adaround_weight": [[71, 0, 1, "", "AdaroundParameters"]], "aimet_torch.arch_checker.arch_checker.ArchChecker": [[72, 2, 1, "", "check_model_arch"]], "aimet_torch.auto_quant": [[73, 0, 1, "", "AutoQuant"]], "aimet_torch.batch_norm_fold": [[84, 2, 1, "", "fold_all_batch_norms"], [74, 2, 1, "", "fold_all_batch_norms_to_scale"], [84, 2, 1, "", "fold_given_batch_norms"]], "aimet_torch.bias_correction": [[75, 2, 1, "", "correct_bias"]], "aimet_torch.bn_reestimation": [[74, 2, 1, "", "reestimate_bn_stats"]], "aimet_torch.compress": [[76, 0, 1, "", "ModelCompressor"]], "aimet_torch.compress.ModelCompressor": [[76, 3, 1, "", "compress_model"]], "aimet_torch.cross_layer_equalization": [[84, 0, 1, "", "ClsSetInfo"], [77, 2, 1, "", "equalize_model"]], "aimet_torch.cross_layer_equalization.ClsSetInfo": [[84, 0, 1, "", "ClsSetLayerPairInfo"]], "aimet_torch.cross_layer_equalization.CrossLayerScaling": [[84, 2, 1, "", "scale_cls_sets"], [84, 2, 1, "", "scale_model"]], "aimet_torch.cross_layer_equalization.HighBiasFold": [[84, 2, 1, "id0", "bias_fold"]], "aimet_torch.defs": [[76, 0, 1, "", "ChannelPruningParameters"], [76, 0, 1, "", "ModuleCompRatioPair"], [76, 0, 1, "", "SpatialSvdParameters"], [76, 0, 1, "", "TarRankSelectionParameters"], [76, 0, 1, "", "WeightSvdParameters"]], "aimet_torch.defs.ChannelPruningParameters": [[76, 0, 1, "", "AutoModeParams"], [76, 0, 1, "", "ManualModeParams"], [76, 0, 1, "", "Mode"]], "aimet_torch.defs.ChannelPruningParameters.Mode": [[76, 1, 1, "", "auto"], [76, 1, 1, "", "manual"]], "aimet_torch.defs.SpatialSvdParameters": [[76, 0, 1, "", "AutoModeParams"], [76, 0, 1, "", "ManualModeParams"], [76, 0, 1, "", "Mode"]], 
"aimet_torch.defs.SpatialSvdParameters.Mode": [[76, 1, 1, "", "auto"], [76, 1, 1, "", "manual"]], "aimet_torch.defs.WeightSvdParameters": [[76, 0, 1, "", "AutoModeParams"], [76, 0, 1, "", "ManualModeParams"], [76, 0, 1, "", "Mode"]], "aimet_torch.defs.WeightSvdParameters.Mode": [[76, 1, 1, "", "auto"], [76, 1, 1, "", "manual"]], "aimet_torch.layer_output_utils": [[78, 0, 1, "", "LayerOutputUtil"], [78, 0, 1, "", "NamingScheme"]], "aimet_torch.layer_output_utils.LayerOutputUtil": [[78, 3, 1, "", "generate_layer_outputs"]], "aimet_torch.layer_output_utils.NamingScheme": [[78, 1, 1, "", "ONNX"], [78, 1, 1, "", "PYTORCH"], [78, 1, 1, "", "TORCHSCRIPT"]], "aimet_torch.model_preparer": [[80, 2, 1, "", "prepare_model"]], "aimet_torch.peft": [[83, 0, 1, "", "AdapterMetaData"], [83, 0, 1, "", "PeftQuantUtils"], [83, 3, 1, "", "replace_lora_layers_with_quantizable_layers"], [83, 3, 1, "", "save_lora_weights_after_adaptation"], [83, 3, 1, "", "track_lora_meta_data"]], "aimet_torch.peft.PeftQuantUtils": [[83, 3, 1, "", "disable_lora_adapters"], [83, 3, 1, "", "enable_adapter_and_load_weights"], [83, 3, 1, "", "export_adapter_weights"], [83, 3, 1, "", "freeze_base_model"], [83, 3, 1, "", "freeze_base_model_activation_quantizers"], [83, 3, 1, "", "freeze_base_model_param_quantizers"], [83, 3, 1, "", "get_quantized_lora_layer"], [83, 3, 1, "", "set_bitwidth_for_lora_adapters"]], "aimet_torch.quant_analyzer": [[85, 0, 1, "", "QuantAnalyzer"]], "aimet_torch.quant_analyzer.QuantAnalyzer": [[85, 3, 1, "", "analyze"], [85, 3, 1, "", "check_model_sensitivity_to_quantization"], [85, 3, 1, "", "enable_per_layer_mse_loss"], [85, 3, 1, "", "export_per_layer_encoding_min_max_range"], [85, 3, 1, "", "export_per_layer_mse_loss"], [85, 3, 1, "", "export_per_layer_stats_histogram"], [85, 3, 1, "", "perform_per_layer_analysis_by_disabling_quant_wrappers"], [85, 3, 1, "", "perform_per_layer_analysis_by_enabling_quant_wrappers"]], "aimet_torch.quantsim": [[75, 0, 1, "", "QuantParams"], [87, 0, 1, "", "QuantizationSimModel"], [87, 3, 1, "", "load_checkpoint"], [87, 3, 1, "", "save_checkpoint"]], "aimet_torch.quantsim.QuantizationSimModel": [[87, 3, 1, "", "compute_encodings"], [87, 3, 1, "", "export"]], "aimet_torch.visualize_model": [[89, 2, 1, "", "visualize_changes_after_optimization"], [89, 2, 1, "", "visualize_relative_weight_ranges_to_identify_problematic_layers"], [89, 2, 1, "", "visualize_weight_ranges"]], "aimet_torch.visualize_serialized_data": [[88, 0, 1, "", "VisualizeCompression"]], "aimet_torch.visualize_serialized_data.VisualizeCompression": [[88, 3, 1, "", "display_comp_ratio_plot"], [88, 3, 1, "", "display_eval_scores"]]}, "objtypes": {"0": "py:class", "1": "py:attribute", "2": "py:function", "3": "py:method"}, "objnames": {"0": ["py", "class", "Python class"], "1": ["py", "attribute", "Python attribute"], "2": ["py", "function", "Python function"], "3": ["py", "method", "Python method"]}, "titleterms": {"adapt": [0, 6, 10, 26, 48, 71], "round": [0, 6, 10, 26, 48, 71, 103], "adaround": [0, 6, 7, 10, 11, 18, 26, 27, 36, 48, 57, 71, 94], "overal": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 97], "flow": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 83, 106], "what": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32], "thi": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 
23, 24, 25, 26, 27, 28, 29, 30, 31, 32], "notebook": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 36, 37, 39, 57, 58, 59, 62, 68, 71, 73, 74, 77, 85, 87, 99], "i": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32], "dataset": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32], "1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 55, 75, 92, 112], "exampl": [0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32, 33, 36, 37, 38, 39, 40, 42, 43, 44, 46, 48, 49, 50, 51, 52, 54, 55, 57, 58, 59, 60, 61, 62, 63, 65, 66, 68, 69, 71, 73, 74, 75, 76, 77, 78, 80, 84, 85, 87, 88, 89, 99], "evalu": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32], "train": [0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32, 87, 105, 106, 108], "pipelin": [0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32], "2": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 55, 75, 92, 112], "convert": [0, 1, 2, 14], "an": [0, 1, 2], "fp32": [0, 1, 2, 6, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 26, 27, 28, 29, 30, 31], "pytorch": [0, 1, 2, 55, 70, 71, 73, 74, 75, 76, 77, 78, 79, 82, 84, 85, 86, 87, 92, 104, 105, 115], "model": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 41, 42, 64, 79, 80, 81, 101, 103, 104, 105], "onnx": [0, 1, 2, 47, 48, 49, 50, 51, 52, 53, 54, 92], "": [0, 1, 2], "baselin": [0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 15, 16, 18, 19, 20, 21, 23, 24, 25, 26, 29, 30, 31], "accuraci": [0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 15, 16, 18, 19, 20, 21, 23, 24, 25, 26, 29, 30, 31], "3": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 55], "creat": [0, 1, 2, 6, 8, 9, 10, 12, 14, 15, 16, 18, 19, 20, 21, 26, 27, 28, 29, 30, 31], "quantiz": [0, 1, 2, 6, 8, 9, 10, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 26, 28, 29, 30, 31, 32, 45, 46, 53, 54, 67, 68, 69, 75, 86, 87, 89, 105, 106, 108, 109, 110, 111, 115], "simul": [0, 1, 2, 6, 8, 9, 10, 18, 19, 20, 21, 26, 28, 29, 30, 31, 109, 111], "determin": [0, 1, 2, 6, 7, 9, 10, 11, 15, 16, 18, 19, 20, 21, 26, 29, 30, 31, 111], "fold": [0, 1, 2, 6, 8, 9, 10, 12, 15, 16, 18, 19, 20, 21, 26, 28, 29, 30, 31], "batch": [0, 1, 2, 6, 9, 10, 15, 16, 18, 19, 20, 21, 26, 29, 30, 31], "normal": [0, 1, 2, 6, 9, 10, 15, 16, 18, 19, 20, 21, 26, 29, 30, 31], "layer": [0, 1, 2, 6, 8, 9, 10, 12, 14, 15, 16, 17, 18, 19, 20, 21, 26, 28, 29, 30, 31, 32, 39, 40, 43, 50, 51, 60, 62, 63, 65, 77, 78, 84, 100, 103, 106], "sim": [0, 1, 2, 6, 8, 9, 15, 16, 19, 20, 21, 28, 29, 30, 31, 46, 54, 68, 87], "comput": [0, 2, 6, 8, 9, 15, 16, 19, 20, 21], "encod": [0, 2, 6, 8, 9, 15, 16, 17, 19, 20, 21, 22, 32, 55, 111], "4": [0, 1, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 18, 19, 20, 21, 25, 26, 28, 29, 30, 31, 55], "appli": [0, 6, 7, 10, 11, 17, 18, 22, 26, 32], "summari": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29, 30, 31], "cross": [1, 9, 19, 29, 39, 43, 50, 62, 65, 77, 84, 
106], "equal": [1, 9, 19, 29, 39, 43, 50, 62, 65, 77, 84, 106], "cle": [1, 9, 19, 29, 43, 65], "compress": [3, 4, 5, 23, 24, 25, 38, 61, 76, 88, 98, 100, 103, 114], "us": [3, 4, 5, 23, 24, 25, 33, 43, 65, 91, 94, 103, 105, 114], "channel": [3, 4, 5, 18, 23, 25, 61, 76, 97], "prune": [3, 4, 5, 23, 25, 61, 76, 97], "load": [3, 4, 5, 6, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32], "find": [3, 4, 5, 23, 24, 25], "fine": [3, 4, 5, 23, 24, 25, 103], "tune": [3, 4, 5, 23, 24, 25, 103], "post": [3, 4, 5, 23, 24, 25, 92, 105, 106], "spatial": [4, 5, 24, 25, 38, 61, 76, 113], "svd": [4, 5, 24, 25, 38, 61, 76, 113, 116], "follow": [5, 25], "after": [5, 15, 16, 25], "get": [6, 9, 10, 18, 19, 20, 21, 26, 29, 30, 31, 101, 103], "score": [6, 9, 10, 18, 19, 20, 21, 26, 29, 30, 31], "autoqu": [7, 11, 27, 49, 58, 73, 95], "pretrain": [7, 11, 15, 16, 17, 27], "defin": [7, 11, 12, 27], "constant": [7, 11, 12, 27], "helper": [7, 11, 27, 43, 65], "function": [7, 11, 12, 14, 27, 33], "prepar": [7, 11, 12, 14, 42, 80], "unlabel": 7, "callback": [7, 11, 12], "5": [7, 8, 11, 12, 15, 16, 19, 28, 55], "option": [7, 11, 27, 103], "set": [7, 11, 27, 91], "paramet": [7, 11, 27, 36, 38, 48, 57, 60, 61, 71, 76, 111], "run": [7, 11, 27, 52, 85, 99], "awar": [8, 12, 13, 15, 16, 20, 21, 28, 30, 31, 87, 108], "batchnorm": [8, 12, 28, 37, 59, 74], "re": [8, 12, 28, 37, 59, 74, 96], "estim": [8, 12, 28, 37, 59, 74, 96], "rewrit": 8, "perform": [8, 12, 15, 16, 20, 21, 28, 30, 31, 43, 65], "qat": [8, 12, 15, 16, 20, 21, 28, 30, 31, 87, 108], "reestim": [8, 28, 59, 74], "statist": [8, 17, 22, 28, 32], "export": [8, 12, 15, 16, 19, 28], "bia": [9, 29, 60, 75], "correct": [9, 29, 60, 75], "bc": [9, 29], "instanti": 12, "kera": [12, 13, 14, 33, 35, 36, 37, 38, 39, 40, 41, 43, 44, 45, 46], "quantizationsim": [12, 15, 16], "transform": 13, "subclass": 14, "show": 14, "similar": 14, "differ": 14, "between": 14, "origin": 14, "discuss": 14, "limit": [14, 37, 42, 80], "compil": [15, 16], "6": [15, 16, 55], "valid": [15, 16, 81], "7": [15, 16], "rang": [16, 17, 21, 22, 31, 32], "learn": [16, 21, 31], "quant": [17, 22, 32, 44, 52, 66, 85], "analyz": [17, 22, 32, 44, 52, 66, 85], "quantanalyz": [17, 22, 32, 107], "per": [17, 18, 22, 32, 60, 100, 103], "analysi": [17, 22, 32, 105, 107], "enabl": [17, 22, 32], "disabl": [17, 22, 32], "wrapper": [17, 32], "min": [17, 22, 32], "max": [17, 22, 32], "pdf": [17, 22, 32], "mse": [17, 22, 32], "loss": [17, 22, 32], "quantsim": [18, 19, 111], "pcq": 18, "op": [22, 111], "object": 27, "infer": 27, "optim": 27, "aimet": [33, 35, 36, 37, 38, 39, 40, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 65, 66, 67, 68, 69, 70, 71, 73, 74, 75, 76, 77, 78, 84, 85, 86, 87, 88, 89, 90, 91, 92, 94, 95, 96, 97, 98, 99, 100, 102, 103, 105, 106, 107, 108, 110, 111, 112, 113, 114, 115, 116, 117], "tensorflow": [33, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 92, 105, 115], "api": [33, 34, 35, 36, 37, 38, 39, 40, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 80, 83, 84, 85, 86, 87, 88, 89, 94, 95, 96, 106, 107, 111], "introduct": [33, 37, 38, 39, 43, 50, 59, 61, 62, 65, 74, 76, 77, 84], "code": [33, 36, 37, 38, 39, 40, 42, 43, 44, 46, 48, 49, 50, 51, 52, 54, 57, 58, 59, 60, 61, 62, 63, 65, 66, 68, 69, 71, 73, 74, 75, 76, 77, 78, 80, 84, 85, 87, 88, 89, 99], "util": [33, 52, 81, 85], "welcom": 34, "ai": [34, 101], "effici": [34, 101], 
"toolkit": [34, 101], "doc": 34, "indic": 34, "tabl": 34, "user": [36, 39, 46, 48, 49, 50, 57, 58, 60, 62, 68, 71, 73, 75, 77, 83, 85, 87, 101, 106], "guid": [36, 39, 46, 48, 49, 50, 57, 58, 60, 62, 68, 71, 73, 75, 77, 85, 87, 101], "link": [36, 37, 39, 46, 48, 49, 50, 57, 58, 59, 60, 62, 68, 71, 73, 74, 75, 77, 85, 87], "top": [36, 37, 38, 40, 42, 44, 46, 48, 49, 51, 52, 54, 57, 58, 59, 61, 63, 66, 68, 69, 71, 73, 74, 76, 78, 80, 83, 85, 87, 88, 89], "level": [36, 37, 38, 40, 42, 43, 44, 46, 48, 49, 51, 52, 54, 57, 58, 59, 61, 63, 65, 66, 68, 69, 71, 73, 74, 76, 78, 80, 83, 84, 85, 87, 88, 89], "enum": [36, 57, 71, 78, 87], "definit": [36, 38, 57, 61, 71, 76, 78, 84, 87], "greedi": [38, 61, 76, 100], "select": [38, 61, 76, 97, 100, 103], "configur": [38, 61, 76, 109, 111], "primit": [39, 43, 62, 65, 77, 84], "output": [40, 51, 63, 78], "gener": [40, 51, 63, 78], "guidelin": [41, 64, 79, 87, 104, 105], "higher": [43, 65, 84], "lower": [43, 65, 84], "custom": [43, 65], "datatyp": [43, 65], "method": [43, 65], "manual": [43, 65], "mode": [43, 65, 108], "specif": [52, 55, 85], "format": 55, "version": 55, "0": [55, 112], "up": 55, "file": [55, 109], "bn": [59, 74, 96], "input": 60, "type": 60, "data": 60, "weight": [61, 69, 76, 97, 116], "visual": [69, 88, 89, 114, 115], "tensor": 69, "architectur": 72, "checker": 72, "html": 72, "report": 72, "content": 72, "convbninfotyp": 75, "activationtyp": 75, "param": 75, "empir": 75, "analyt": 75, "tar": 76, "torch": [80, 92], "fx": 80, "symbol": 80, "trace": 80, "multi": 82, "gpu": [82, 92], "support": 82, "clssetinfo": 84, "instal": [90, 91, 92, 99, 101], "quick": 90, "releas": [90, 91, 92, 101, 112], "packag": [90, 91, 92], "system": 90, "requir": [90, 107], "advanc": 90, "instruct": 90, "docker": 91, "variant": 91, "prebuilt": 91, "imag": 91, "build": 91, "local": 91, "start": [91, 101, 114], "contain": 91, "from": [91, 92], "pypi": [91, 92], "environ": [91, 92], "setup": [91, 92], "prerequisit": 92, "13": [92, 112], "common": [92, 94], "debian": 92, "replac": 92, "pillow": 92, "simd": 92, "onnxruntim": 92, "step": 92, "case": [94, 103, 105], "terminologi": 94, "overview": [95, 96, 100, 101, 103, 106, 107, 108, 109, 111, 114, 115, 117], "workflow": [95, 96, 105, 108, 111], "procedur": 97, "winnow": [97, 117], "reconstruct": 97, "featur": [98, 101, 105, 110], "guidebook": [98, 110], "brows": 99, "jupyt": 99, "download": 99, "relat": 99, "ratio": [100, 103], "how": [100, 109, 114, 117], "work": [100, 117], "explor": 100, "inform": 101, "toc": 101, "tree": 101, "known": 102, "issu": 102, "techniqu": [103, 106], "better": 103, "result": 103, "rank": 103, "faq": [103, 106], "refer": [103, 106], "debug": 105, "tool": [105, 114], "detail": 107, "descript": 107, "recommend": 108, "structur": 109, "individu": 109, "section": 109, "nois": 111, "scheme": 111, "frequent": 111, "ask": 111, "question": 111, "note": 112, "22": 112, "21": 112, "20": 112, "19": 112, "py37": 112, "18": 112, "17": 112, "16": 112, "14": 112, "design": 114, "bokeh": 114, "server": 114, "session": 114}, "envversion": {"sphinx.domains.c": 2, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 8, "sphinx.domains.index": 1, "sphinx.domains.javascript": 2, "sphinx.domains.math": 2, "sphinx.domains.python": 3, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "nbsphinx": 4, "sphinx.ext.intersphinx": 1, "sphinx.ext.viewcode": 1, "sphinx": 57}, "alltitles": {"Adaptive Rounding (AdaRound)": [[0, "Adaptive-Rounding-(AdaRound)"], [6, 
"Adaptive-Rounding-(AdaRound)"], [26, "Adaptive-Rounding-(AdaRound)"]], "Overall flow": [[0, "Overall-flow"], [1, "Overall-flow"], [2, "Overall-flow"], [3, "Overall-flow"], [4, "Overall-flow"], [5, "Overall-flow"], [6, "Overall-flow"], [7, "Overall-flow"], [8, "Overall-flow"], [9, "Overall-flow"], [10, "Overall-flow"], [11, "Overall-flow"], [12, "Overall-flow"], [13, "Overall-flow"], [14, "Overall-flow"], [15, "Overall-flow"], [16, "Overall-flow"], [17, "Overall-flow"], [18, "Overall-flow"], [19, "Overall-flow"], [20, "Overall-flow"], [21, "Overall-flow"], [22, "Overall-flow"], [23, "Overall-flow"], [24, "Overall-flow"], [25, "Overall-flow"], [26, "Overall-flow"], [27, "Overall-flow"], [28, "Overall-flow"], [29, "Overall-flow"], [30, "Overall-flow"], [31, "Overall-flow"], [32, "Overall-flow"]], "What this notebook is not": [[0, "What-this-notebook-is-not"], [1, "What-this-notebook-is-not"], [2, "What-this-notebook-is-not"], [3, "What-this-notebook-is-not"], [4, "What-this-notebook-is-not"], [5, "What-this-notebook-is-not"], [6, "What-this-notebook-is-not"], [7, "What-this-notebook-is-not"], [8, "What-this-notebook-is-not"], [9, "What-this-notebook-is-not"], [10, "What-this-notebook-is-not"], [11, "What-this-notebook-is-not"], [15, "What-this-notebook-is-not"], [16, "What-this-notebook-is-not"], [17, "What-this-notebook-is-not"], [18, "What-this-notebook-is-not"], [19, "What-this-notebook-is-not"], [20, "What-this-notebook-is-not"], [21, "What-this-notebook-is-not"], [22, "What-this-notebook-is-not"], [23, "What-this-notebook-is-not"], [24, "What-this-notebook-is-not"], [25, "What-this-notebook-is-not"], [26, "What-this-notebook-is-not"], [27, "What-this-notebook-is-not"], [28, "What-this-notebook-is-not"], [29, "What-this-notebook-is-not"], [30, "What-this-notebook-is-not"], [31, "What-this-notebook-is-not"], [32, "What-this-notebook-is-not"]], "Dataset": [[0, "Dataset"], [1, "Dataset"], [2, "Dataset"], [3, "Dataset"], [4, "Dataset"], [5, "Dataset"], [6, "Dataset"], [7, "Dataset"], [8, "Dataset"], [9, "Dataset"], [10, "Dataset"], [11, "Dataset"], [12, "Dataset"], [15, "Dataset"], [16, "Dataset"], [17, "Dataset"], [18, "Dataset"], [19, "Dataset"], [20, "Dataset"], [21, "Dataset"], [22, "Dataset"], [23, "Dataset"], [24, "Dataset"], [25, "Dataset"], [26, "Dataset"], [27, "Dataset"], [28, "Dataset"], [29, "Dataset"], [30, "Dataset"], [31, "Dataset"], [32, "Dataset"]], "1. 
Example evaluation and training pipeline": [[0, "1.-Example-evaluation-and-training-pipeline"], [1, "1.-Example-evaluation-and-training-pipeline"], [4, "1.-Example-evaluation-and-training-pipeline"], [5, "1.-Example-evaluation-and-training-pipeline"], [7, "1.-Example-evaluation-and-training-pipeline"], [8, "1.-Example-evaluation-and-training-pipeline"], [9, "1.-Example-evaluation-and-training-pipeline"], [10, "1.-Example-evaluation-and-training-pipeline"], [11, "1.-Example-evaluation-and-training-pipeline"], [17, "1.-Example-evaluation-and-training-pipeline"], [18, "1.-Example-evaluation-and-training-pipeline"], [19, "1.-Example-evaluation-and-training-pipeline"], [20, "1.-Example-evaluation-and-training-pipeline"], [22, "1.-Example-evaluation-and-training-pipeline"], [23, "1.-Example-evaluation-and-training-pipeline"], [24, "1.-Example-evaluation-and-training-pipeline"], [25, "1.-Example-evaluation-and-training-pipeline"], [26, "1.-Example-evaluation-and-training-pipeline"], [28, "1.-Example-evaluation-and-training-pipeline"], [29, "1.-Example-evaluation-and-training-pipeline"], [30, "1.-Example-evaluation-and-training-pipeline"], [31, "1.-Example-evaluation-and-training-pipeline"], [32, "1.-Example-evaluation-and-training-pipeline"]], "2. Convert an FP32 PyTorch model to ONNX and evaluate the model\u2019s baseline FP32 accuracy": [[0, "2.-Convert-an-FP32-PyTorch-model-to-ONNX-and-evaluate-the-model's-baseline-FP32-accuracy"], [1, "2.-Convert-an-FP32-PyTorch-model-to-ONNX-and-evaluate-the-model's-baseline-FP32-accuracy"], [2, "2.-Convert-an-FP32-PyTorch-model-to-ONNX-and-evaluate-the-model's-baseline-FP32-accuracy"]], "3. Create a quantization simulation model and determine quantized accuracy": [[0, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [1, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [2, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [6, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [9, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [10, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [18, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [19, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [20, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [21, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [26, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [29, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [30, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"], [31, "3.-Create-a-quantization-simulation-model-and-determine-quantized-accuracy"]], "Fold Batch Normalization layers": [[0, "Fold-Batch-Normalization-layers"], [1, "Fold-Batch-Normalization-layers"], [2, "Fold-Batch-Normalization-layers"], [6, "Fold-Batch-Normalization-layers"], [9, "Fold-Batch-Normalization-layers"], [10, "Fold-Batch-Normalization-layers"], [15, "Fold-Batch-Normalization-layers"], [16, "Fold-Batch-Normalization-layers"], [18, "Fold-Batch-Normalization-layers"], [19, "Fold-Batch-Normalization-layers"], [20, "Fold-Batch-Normalization-layers"], [21, "Fold-Batch-Normalization-layers"], [26, "Fold-Batch-Normalization-layers"], [29, "Fold-Batch-Normalization-layers"], [30, "Fold-Batch-Normalization-layers"], [31, 
"Fold-Batch-Normalization-layers"]], "Create Quantization Sim Model": [[0, "Create-Quantization-Sim-Model"], [1, "Create-Quantization-Sim-Model"], [2, "Create-Quantization-Sim-Model"], [6, "Create-Quantization-Sim-Model"], [8, "Create-Quantization-Sim-Model"], [9, "Create-Quantization-Sim-Model"], [15, "Create-Quantization-Sim-Model"], [16, "Create-Quantization-Sim-Model"], [19, "Create-Quantization-Sim-Model"], [20, "Create-Quantization-Sim-Model"], [21, "Create-Quantization-Sim-Model"], [28, "Create-Quantization-Sim-Model"], [29, "Create-Quantization-Sim-Model"], [30, "Create-Quantization-Sim-Model"], [31, "Create-Quantization-Sim-Model"]], "Compute Encodings": [[0, "Compute-Encodings"], [2, "Compute-Encodings"], [6, "Compute-Encodings"], [8, "Compute-Encodings"], [9, "Compute-Encodings"], [15, "Compute-Encodings"], [16, "Compute-Encodings"], [19, "Compute-Encodings"], [20, "Compute-Encodings"], [21, "Compute-Encodings"]], "4. Apply Adaround": [[0, "4.-Apply-Adaround"], [10, "4.-Apply-Adaround"], [18, "4.-Apply-Adaround"], [26, "4.-Apply-Adaround"]], "Summary": [[0, "Summary"], [1, "Summary"], [2, "Summary"], [3, "Summary"], [4, "Summary"], [5, "Summary"], [6, "Summary"], [7, "Summary"], [8, "Summary"], [9, "Summary"], [10, "Summary"], [11, "Summary"], [12, "Summary"], [14, "Summary"], [15, "Summary"], [16, "Summary"], [18, "Summary"], [19, "Summary"], [20, "Summary"], [21, "Summary"], [23, "Summary"], [24, "Summary"], [25, "Summary"], [26, "Summary"], [27, "Summary"], [28, "Summary"], [29, "Summary"], [30, "Summary"], [31, "Summary"]], "Cross-Layer Equalization (CLE)": [[1, "Cross-Layer-Equalization-(CLE)"]], "4. 1 Cross Layer Equalization": [[1, "4.-1-Cross-Layer-Equalization"], [9, "4.-1-Cross-Layer-Equalization"], [29, "4.-1-Cross-Layer-Equalization"]], "Quantization Simulation": [[2, "Quantization-Simulation"]], "1. Example evaluation pipeline": [[2, "1.-Example-evaluation-pipeline"]], "Model Compression Using Channel Pruning": [[3, "Model-Compression-Using-Channel-Pruning"]], "2. Load the model and evaluate it to find the baseline accuracy": [[3, "2.-Load-the-model-and-evaluate-it-to-find-the-baseline-accuracy"], [4, "2.-Load-the-model-and-evaluate-it-to-find-the-baseline-accuracy"], [5, "2.-Load-the-model-and-evaluate-it-to-find-the-baseline-accuracy"], [23, "2.-Load-the-model-and-evaluate-it-to-find-the-baseline-accuracy"], [24, "2.-Load-the-model-and-evaluate-it-to-find-the-baseline-accuracy"], [25, "2.-Load-the-model-and-evaluate-it-to-find-the-baseline-accuracy"]], "3. Compress the model and fine-tune": [[3, "3.-Compress-the-model-and-fine-tune"], [4, "3.-Compress-the-model-and-fine-tune"], [5, "3.-Compress-the-model-and-fine-tune"], [23, "3.-Compress-the-model-and-fine-tune"], [24, "3.-Compress-the-model-and-fine-tune"], [25, "3.-Compress-the-model-and-fine-tune"]], "3.1. Compress model using Channel Pruning and evaluate it to find post-compression accuracy": [[3, "3.1.-Compress-model-using-Channel-Pruning-and-evaluate-it-to-find-post-compression-accuracy"], [4, "3.1.-Compress-model-using-Channel-Pruning-and-evaluate-it-to-find-post-compression-accuracy"], [5, "3.1.-Compress-model-using-Channel-Pruning-and-evaluate-it-to-find-post-compression-accuracy"], [23, "3.1.-Compress-model-using-Channel-Pruning-and-evaluate-it-to-find-post-compression-accuracy"]], "3.2. 
Fine-tune the model": [[3, "3.2.-Fine-tune-the-model"], [4, "3.2.-Fine-tune-the-model"], [23, "3.2.-Fine-tune-the-model"], [24, "3.2.-Fine-tune-the-model"]], "Model compression Using Spatial SVD": [[4, "Model-compression-Using-Spatial-SVD"]], "Model Compression Using Spatial SVD Followed by Channel Pruning": [[5, "Model-Compression-Using-Spatial-SVD-Followed-by-Channel-Pruning"]], "3.2. Fine-tune the model after Spatial SVD": [[5, "3.2.-Fine-tune-the-model-after-Spatial-SVD"], [25, "3.2.-Fine-tune-the-model-after-Spatial-SVD"]], "3.3. Compress model using Channel Pruning and evaluate it to find post-compression accuracy": [[5, "3.3.-Compress-model-using-Channel-Pruning-and-evaluate-it-to-find-post-compression-accuracy"], [25, "3.3.-Compress-model-using-Channel-Pruning-and-evaluate-it-to-find-post-compression-accuracy"]], "3.4. Fine-tune the model after Channel Pruning": [[5, "3.4.-Fine-tune-the-model-after-Channel-Pruning"], [25, "3.4.-Fine-tune-the-model-after-Channel-Pruning"]], "1. Example Evaluation and Training Pipeline": [[6, "1.-Example-Evaluation-and-Training-Pipeline"], [21, "1.-Example-Evaluation-and-Training-Pipeline"]], "2. Load the model and evaluate to get a baseline FP32 accuracy score": [[6, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [9, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [10, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [18, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [19, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [20, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [21, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [26, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [29, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [30, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"], [31, "2.-Load-the-model-and-evaluate-to-get-a-baseline-FP32-accuracy-score"]], "4. Apply AdaRound": [[6, "4.-Apply-AdaRound"]], "AutoQuant": [[7, "AutoQuant"], [11, "AutoQuant"], [27, "AutoQuant"]], "2. Load a pretrained FP32 model": [[7, "2.-Load-a-pretrained-FP32-model"], [11, "2.-Load-a-pretrained-FP32-model"], [15, "2.-Load-a-pretrained-FP32-model"], [16, "2.-Load-a-pretrained-FP32-model"], [17, "2.-Load-a-pretrained-FP32-model"], [27, "2.-Load-a-pretrained-FP32-model"]], "3. Determine the baseline FP32 accuracy": [[7, "3.-Determine-the-baseline-FP32-accuracy"], [11, "3.-Determine-the-baseline-FP32-accuracy"], [15, "3.-Determine-the-baseline-FP32-accuracy"], [16, "3.-Determine-the-baseline-FP32-accuracy"]], "4. Define Constants and Helper functions": [[7, "4.-Define-Constants-and-Helper-functions"], [11, "4.-Define-Constants-and-Helper-functions"]], "Prepare unlabeled dataset": [[7, "Prepare-unlabeled-dataset"]], "Prepare the evaluation callback function": [[7, "Prepare-the-evaluation-callback-function"], [11, "Prepare-the-evaluation-callback-function"], [12, "Prepare-the-evaluation-callback-function"]], "5. 
Apply AutoQuant": [[7, "5.-Apply-AutoQuant"], [11, "5.-Apply-AutoQuant"]], "Optionally set AdaRound Parameters": [[7, "Optionally-set-AdaRound-Parameters"], [11, "Optionally-set-AdaRound-Parameters"]], "Run AutoQuant": [[7, "Run-AutoQuant"], [11, "Run-AutoQuant"]], "Quantization-Aware Training with BatchNorm Re-estimation": [[8, "Quantization-Aware-Training-with-BatchNorm-Re-estimation"], [12, "Quantization-Aware-Training-with-BatchNorm-Re-estimation"], [28, "Quantization-Aware-Training-with-BatchNorm-Re-estimation"]], "2. Load FP32 model": [[8, "2.-Load-FP32-model"], [28, "2.-Load-FP32-model"]], "BatchNorm Rewriter": [[8, "BatchNorm-Rewriter"]], "3. Create a quantization simulation model and Perform QAT": [[8, "3.-Create-a-quantization-simulation-model-and-Perform-QAT"], [28, "3.-Create-a-quantization-simulation-model-and-Perform-QAT"]], "Perform QAT": [[8, "Perform-QAT"], [28, "Perform-QAT"]], "4. Perform BatchNorm Reestimation": [[8, "4.-Perform-BatchNorm-Reestimation"], [28, "4.-Perform-BatchNorm-Reestimation"]], "Re-estimate BatchNorm Statistics": [[8, "Re-estimate-BatchNorm-Statistics"], [28, "Re-estimate-BatchNorm-Statistics"]], "Fold BatchNorm Layers": [[8, "Fold-BatchNorm-Layers"], [12, "Fold-BatchNorm-Layers"], [28, "Fold-BatchNorm-Layers"]], "5. Export Model": [[8, "5.-Export-Model"], [12, "5.-Export-Model"], [28, "5.-Export-Model"]], "Cross-Layer Equalization (CLE) and Bias Correction (BC)": [[9, "Cross-Layer-Equalization-(CLE)-and-Bias-Correction-(BC)"], [29, "Cross-Layer-Equalization-(CLE)-and-Bias-Correction-(BC)"]], "4. 2 Bias Correction": [[9, "4.-2-Bias-Correction"], [29, "4.-2-Bias-Correction"]], "Adaptive Rounding (Adaround)": [[10, "Adaptive-Rounding-(Adaround)"]], "1. Instantiate the example evaluation and training pipeline": [[12, "1.-Instantiate-the-example-evaluation-and-training-pipeline"]], "2. Define Constants and Datasets Prepare": [[12, "2.-Define-Constants-and-Datasets-Prepare"]], "2. Create the model in Keras": [[12, "2.-Create-the-model-in-Keras"]], "3. Train and evaluate the model": [[12, "3.-Train-and-evaluate-the-model"]], "4. Create a QuantizationSim Model": [[12, "4.-Create-a-QuantizationSim-Model"]], "5. Perform QAT": [[12, "5.-Perform-QAT"], [15, "5.-Perform-QAT"], [16, "5.-Perform-QAT"]], "Quantization-Aware Training with a Keras Transformer Model": [[13, "Quantization-Aware-Training-with-a-Keras-Transformer-Model"]], "Keras Model Preparer": [[14, "Keras-Model-Preparer"]], "1. Creating a Keras model with subclass layers": [[14, "1.-Creating-a-Keras-model-with-subclass-layers"]], "2. Converting the Keras model with subclass layers to a Keras model with functional layers": [[14, "2.-Converting-the-Keras-model-with-subclass-layers-to-a-Keras-model-with-functional-layers"]], "3. Showing similarities and differences between the original and converted models": [[14, "3.-Showing-similarities-and-differences-between-the-original-and-converted-models"]], "4. Discussing the limitations of the Keras Model Preparer": [[14, "4.-Discussing-the-limitations-of-the-Keras-Model-Preparer"]], "Quantization-Aware Training": [[15, "Quantization-Aware-Training"], [20, "Quantization-Aware-Training"], [30, "Quantization-Aware-Training"]], "Example evaluation and training pipeline": [[15, "Example-evaluation-and-training-pipeline"], [16, "Example-evaluation-and-training-pipeline"]], "1. Load the dataset": [[15, "1.-Load-the-dataset"], [16, "1.-Load-the-dataset"]], "4. 
Create a QuantizationSim Model and determine quantized accuracy": [[15, "4.-Create-a-QuantizationSim-Model-and-determine-quantized-accuracy"], [16, "4.-Create-a-QuantizationSim-Model-and-determine-quantized-accuracy"]], "Compile the model": [[15, "Compile-the-model"], [16, "Compile-the-model"]], "Evaluate the performance of the quantized model": [[15, "Evaluate-the-performance-of-the-quantized-model"], [16, "Evaluate-the-performance-of-the-quantized-model"]], "6. Evaluate validation accuracy after QAT": [[15, "6.-Evaluate-validation-accuracy-after-QAT"], [16, "6.-Evaluate-validation-accuracy-after-QAT"]], "7. Export the encodings": [[15, "7.-Export-the-encodings"], [16, "7.-Export-the-encodings"]], "Quantization-Aware Training with Range Learning": [[16, "Quantization-Aware-Training-with-Range-Learning"], [21, "Quantization-Aware-Training-with-Range-Learning"], [31, "Quantization-Aware-Training-with-Range-Learning"]], "Quant Analyzer": [[17, "Quant-Analyzer"], [22, "Quant-Analyzer"], [32, "Quant-Analyzer"]], "3. Apply QuantAnalyzer to the model": [[17, "3.-Apply-QuantAnalyzer-to-the-model"], [22, "3.-Apply-QuantAnalyzer-to-the-model"], [32, "3.-Apply-QuantAnalyzer-to-the-model"]], "Per-layer analysis by enabling/disabling quantization wrappers": [[17, "Per-layer-analysis-by-enabling/disabling-quantization-wrappers"], [32, "Per-layer-analysis-by-enabling/disabling-quantization-wrappers"]], "Encoding min/max ranges": [[17, "Encoding-min/max-ranges"], [22, "Encoding-min/max-ranges"], [32, "Encoding-min/max-ranges"]], "PDF of statistics": [[17, "PDF-of-statistics"], [22, "PDF-of-statistics"], [32, "PDF-of-statistics"]], "Per-layer MSE loss": [[17, "Per-layer-MSE-loss"], [32, "Per-layer-MSE-loss"]], "Quantsim and Adaround - Per Channel Quantization (PCQ)": [[18, "Quantsim-and-Adaround---Per-Channel-Quantization-(PCQ)"]], "Cross-Layer Equalization (CLE) with QuantSim": [[19, "Cross-Layer-Equalization-(CLE)-with-QuantSim"]], "4 Cross Layer Equalization": [[19, "4-Cross-Layer-Equalization"]], "5 Exporting": [[19, "5-Exporting"]], "4. Perform QAT": [[20, "4.-Perform-QAT"], [21, "4.-Perform-QAT"], [30, "4.-Perform-QAT"], [31, "4.-Perform-QAT"]], "2. Load the model": [[22, "2.-Load-the-model"], [32, "2.-Load-the-model"]], "Per op analysis by enabling/disabling quantization ops": [[22, "Per-op-analysis-by-enabling/disabling-quantization-ops"]], "Per op MSE loss": [[22, "Per-op-MSE-loss"]], "Model compression using Channel Pruning": [[23, "Model-compression-using-Channel-Pruning"]], "Model compression using Spatial SVD": [[24, "Model-compression-using-Spatial-SVD"]], "3.1. Compress model using Spatial SVD and evaluate it to find post-compression accuracy": [[24, "3.1.-Compress-model-using-Spatial-SVD-and-evaluate-it-to-find-post-compression-accuracy"], [25, "3.1.-Compress-model-using-Spatial-SVD-and-evaluate-it-to-find-post-compression-accuracy"]], "Model compression using Spatial SVD followed by Channel Pruning": [[25, "Model-compression-using-Spatial-SVD-followed-by-Channel-Pruning"]], "1. Define Constants and Helper functions": [[27, "1.-Define-Constants-and-Helper-functions"]], "3. 
Run AutoQuant": [[27, "3.-Run-AutoQuant"]], "Create AutoQuant Object": [[27, "Create-AutoQuant-Object"]], "Run AutoQuant Inference": [[27, "Run-AutoQuant-Inference"]], "Set AdaRound Parameters (optional)": [[27, "Set-AdaRound-Parameters-(optional)"]], "Run AutoQuant Optimization": [[27, "Run-AutoQuant-Optimization"]], "Using AIMET Tensorflow APIs with Keras Models": [[33, "using-aimet-tensorflow-apis-with-keras-models"]], "Introduction": [[33, "introduction"], [37, "introduction"], [38, "introduction"], [39, "introduction"], [43, "introduction"], [50, "introduction"], [59, "introduction"], [61, "introduction"], [62, "introduction"], [65, "introduction"], [74, "introduction"], [76, "introduction"], [77, "introduction"], [84, "introduction"]], "APIs": [[33, "apis"]], "Code Example": [[33, "code-example"], [37, "code-example"], [39, "code-example"], [40, "code-example"], [50, "code-example"], [51, "code-example"], [62, "code-example"], [63, "code-example"], [66, "code-example"], [77, "code-example"], [78, "code-example"]], "Utility Functions": [[33, "utility-functions"]], "Welcome to AI Model Efficiency Toolkit API Docs!": [[34, "welcome-to-ai-model-efficiency-toolkit-api-docs"]], "Indices and tables": [[34, "indices-and-tables"]], "AIMET Keras APIs": [[35, "aimet-keras-apis"]], "AIMET Keras AdaRound API": [[36, "aimet-keras-adaround-api"]], "User Guide Link": [[36, "user-guide-link"], [39, "user-guide-link"], [46, "user-guide-link"], [48, "user-guide-link"], [49, "user-guide-link"], [50, "user-guide-link"], [57, "user-guide-link"], [58, "user-guide-link"], [60, "user-guide-link"], [62, "user-guide-link"], [68, "user-guide-link"], [71, "user-guide-link"], [73, "user-guide-link"], [75, "user-guide-link"], [77, "user-guide-link"], [85, "user-guide-link"], [87, "user-guide-link"]], "Examples Notebook Link": [[36, "examples-notebook-link"], [37, "examples-notebook-link"], [39, "examples-notebook-link"], [57, "examples-notebook-link"], [58, "examples-notebook-link"], [59, "examples-notebook-link"], [62, "examples-notebook-link"], [68, "examples-notebook-link"], [71, "examples-notebook-link"], [73, "examples-notebook-link"], [74, "examples-notebook-link"], [77, "examples-notebook-link"], [85, "examples-notebook-link"], [87, "examples-notebook-link"]], "Top-level API": [[36, "top-level-api"], [40, "top-level-api"], [42, "top-level-api"], [44, "top-level-api"], [46, "top-level-api"], [48, "top-level-api"], [49, "top-level-api"], [51, "top-level-api"], [52, "top-level-api"], [54, "top-level-api"], [57, "top-level-api"], [58, "top-level-api"], [63, "top-level-api"], [66, "top-level-api"], [68, "top-level-api"], [71, "top-level-api"], [73, "top-level-api"], [78, "top-level-api"], [80, "top-level-api"], [83, "top-level-api"], [85, "top-level-api"], [87, "top-level-api"]], "Adaround Parameters": [[36, "adaround-parameters"], [48, "adaround-parameters"], [57, "adaround-parameters"], [71, "adaround-parameters"]], "Enum Definition": [[36, "enum-definition"], [57, "enum-definition"], [71, "enum-definition"], [78, "enum-definition"], [87, "enum-definition"]], "Code Examples": [[36, "code-examples"], [38, "code-examples"], [42, "code-examples"], [44, "code-examples"], [46, "code-examples"], [49, "code-examples"], [52, "code-examples"], [54, "code-examples"], [57, "code-examples"], [58, "code-examples"], [61, "code-examples"], [68, "code-examples"], [73, "code-examples"], [76, "code-examples"], [80, "code-examples"], [85, "code-examples"], [88, "code-examples"], [89, "code-examples"]], "AIMET Keras BatchNorm 
Re-estimation APIs": [[37, "aimet-keras-batchnorm-re-estimation-apis"]], "Top-level APIs": [[37, "top-level-apis"], [59, "top-level-apis"], [74, "top-level-apis"]], "Limitations": [[37, "limitations"], [42, "limitations"]], "AIMET Keras Compression API": [[38, "aimet-keras-compression-api"]], "Top-level API for Compression": [[38, "top-level-api-for-compression"], [61, "top-level-api-for-compression"], [76, "top-level-api-for-compression"]], "Greedy Selection Parameters": [[38, "greedy-selection-parameters"], [61, "greedy-selection-parameters"], [76, "greedy-selection-parameters"]], "Spatial SVD Configuration": [[38, "spatial-svd-configuration"], [61, "spatial-svd-configuration"], [76, "spatial-svd-configuration"]], "Configuration Definitions": [[38, "configuration-definitions"], [61, "configuration-definitions"], [76, "configuration-definitions"]], "AIMET Keras Cross Layer Equalization APIs": [[39, "aimet-keras-cross-layer-equalization-apis"]], "Cross Layer Equalization API": [[39, "cross-layer-equalization-api"], [50, "cross-layer-equalization-api"], [62, "cross-layer-equalization-api"], [77, "cross-layer-equalization-api"]], "Primitive APIs": [[39, "primitive-apis"], [62, "primitive-apis"], [77, "primitive-apis"]], "AIMET Keras Layer Output Generation API": [[40, "aimet-keras-layer-output-generation-api"]], "Keras Model Guidelines": [[41, "keras-model-guidelines"]], "Model Preparer API": [[42, "model-preparer-api"], [80, "model-preparer-api"]], "AIMET Keras Cross Layer Equalization Primitive API": [[43, "aimet-keras-cross-layer-equalization-primitive-api"]], "Higher Level APIs for Cross Layer Equalization": [[43, "higher-level-apis-for-cross-layer-equalization"], [65, "higher-level-apis-for-cross-layer-equalization"], [84, "higher-level-apis-for-cross-layer-equalization"]], "Code Examples for Higher Level APIs": [[43, "code-examples-for-higher-level-apis"], [65, "code-examples-for-higher-level-apis"], [84, "code-examples-for-higher-level-apis"]], "Lower Level APIs for Cross Layer Equalization": [[43, "lower-level-apis-for-cross-layer-equalization"], [65, "lower-level-apis-for-cross-layer-equalization"], [84, "lower-level-apis-for-cross-layer-equalization"]], "Custom Datatype used": [[43, "custom-datatype-used"], [65, "custom-datatype-used"]], "Code Example for Lower level APIs": [[43, "code-example-for-lower-level-apis"], [65, "code-example-for-lower-level-apis"]], "Example helper methods to perform CLE in manual mode": [[43, "example-helper-methods-to-perform-cle-in-manual-mode"], [65, "example-helper-methods-to-perform-cle-in-manual-mode"]], "AIMET Keras Quant Analyzer API": [[44, "aimet-keras-quant-analyzer-api"]], "AIMET Keras Quantization APIs": [[45, "aimet-keras-quantization-apis"]], "AIMET Keras Quantization SIM API": [[46, "aimet-keras-quantization-sim-api"]], "AIMET ONNX APIs": [[47, "aimet-onnx-apis"]], "AIMET ONNX AdaRound API": [[48, "aimet-onnx-adaround-api"]], "Code Example - Adaptive Rounding (AdaRound)": [[48, "code-example-adaptive-rounding-adaround"], [71, "code-example-adaptive-rounding-adaround"]], "AIMET ONNX AutoQuant API": [[49, "aimet-onnx-autoquant-api"]], "AIMET ONNX Cross Layer Equalization APIs": [[50, "aimet-onnx-cross-layer-equalization-apis"]], "AIMET ONNX Layer Output Generation API": [[51, "aimet-onnx-layer-output-generation-api"]], "AIMET ONNX Quant Analyzer API": [[52, "aimet-onnx-quant-analyzer-api"]], "Run specific utility": [[52, "run-specific-utility"], [85, "run-specific-utility"]], "AIMET ONNX Quantization APIs": [[53, 
"aimet-onnx-quantization-apis"]], "AIMET ONNX Quantization SIM API": [[54, "aimet-onnx-quantization-sim-api"]], "Encoding Format Specification": [[55, "encoding-format-specification"]], "1. Versioning": [[55, "versioning"]], "2. Version 0.4.0 (up to)": [[55, "version-0-4-0-up-to"]], "2.1. Encoding Specification": [[55, "encoding-specification"]], "2.2. Encoding File Example for PyTorch": [[55, "encoding-file-example-for-pytorch"]], "2.3. Encoding File Example for TensorFlow": [[55, "encoding-file-example-for-tensorflow"]], "3. Version 0.5.0": [[55, "version-0-5-0"]], "3.1. Encoding Specification": [[55, "id1"]], "3.2. Encoding File Example for PyTorch": [[55, "id2"]], "3.3. Encoding File Example for TensorFlow": [[55, "id3"]], "4. Version 0.6.1": [[55, "version-0-6-1"]], "4.1. Encoding Specification": [[55, "id4"]], "AIMET TensorFlow APIs": [[56, "aimet-tensorflow-apis"]], "AIMET TensorFlow AdaRound API": [[57, "aimet-tensorflow-adaround-api"]], "AIMET TensorFlow AutoQuant API": [[58, "aimet-tensorflow-autoquant-api"]], "AIMET TensorFlow BatchNorm Re-estimation APIs": [[59, "aimet-tensorflow-batchnorm-re-estimation-apis"]], "Code Example - BN-Reestimation": [[59, "code-example-bn-reestimation"], [74, "code-example-bn-reestimation"]], "AIMET TensorFlow Bias Correction API": [[60, "aimet-tensorflow-bias-correction-api"]], "Bias Correction API": [[60, "bias-correction-api"], [75, "bias-correction-api"]], "Input Parameter Types": [[60, "input-parameter-types"]], "Data Input Type": [[60, "data-input-type"]], "Code Examples for Bias Correction": [[60, "code-examples-for-bias-correction"]], "Bias Correction Per Layer API": [[60, "bias-correction-per-layer-api"]], "Code Example for Per-Layer Bias Correction": [[60, "code-example-for-per-layer-bias-correction"]], "AIMET TensorFlow Compression API": [[61, "aimet-tensorflow-compression-api"]], "Channel Pruning Configuration": [[61, "channel-pruning-configuration"], [76, "channel-pruning-configuration"]], "Weight SVD Top-level API": [[61, "weight-svd-top-level-api"]], "Code Examples for Weight SVD": [[61, "code-examples-for-weight-svd"]], "AIMET TensorFlow Cross Layer Equalization APIs": [[62, "aimet-tensorflow-cross-layer-equalization-apis"]], "AIMET Tensorflow Layer Output Generation API": [[63, "aimet-tensorflow-layer-output-generation-api"]], "TensorFlow Model Guidelines": [[64, "tensorflow-model-guidelines"]], "AIMET TensorFlow Cross Layer Equalization Primitive API": [[65, "aimet-tensorflow-cross-layer-equalization-primitive-api"]], "AIMET Tensorflow Quant Analyzer API": [[66, "aimet-tensorflow-quant-analyzer-api"]], "AIMET TensorFlow Quantization APIs": [[67, "aimet-tensorflow-quantization-apis"]], "AIMET TensorFlow Quantization SIM API": [[68, "aimet-tensorflow-quantization-sim-api"]], "AIMET Visualization for Quantization for TensorFlow API": [[69, "aimet-visualization-for-quantization-for-tensorflow-api"]], "Top-level API for Visualization of Weight tensors": [[69, "top-level-api-for-visualization-of-weight-tensors"]], "Code Examples for Visualization of Weight tensors": [[69, "code-examples-for-visualization-of-weight-tensors"]], "AIMET PyTorch APIs": [[70, "aimet-pytorch-apis"]], "AIMET PyTorch AdaRound API": [[71, "aimet-pytorch-adaround-api"]], "Architecture Checker API": [[72, "architecture-checker-api"]], "HTML report content": [[72, "id1"]], "AIMET PyTorch AutoQuant API": [[73, "aimet-pytorch-autoquant-api"]], "AIMET PyTorch BatchNorm Re-estimation APIs": [[74, "aimet-pytorch-batchnorm-re-estimation-apis"]], "AIMET PyTorch Bias 
Correction API": [[75, "aimet-pytorch-bias-correction-api"]], "ConvBnInfoType": [[75, "convbninfotype"]], "ActivationType": [[75, "activationtype"]], "Quantization Params": [[75, "quantization-params"]], "Code Example #1 Empirical Bias Correction": [[75, "code-example-1-empirical-bias-correction"]], "Code Example #2 Analytical + Empirical Bias correction": [[75, "code-example-2-analytical-empirical-bias-correction"]], "AIMET PyTorch Compression API": [[76, "aimet-pytorch-compression-api"]], "TAR Selection Parameters": [[76, "tar-selection-parameters"]], "Weight SVD Configuration": [[76, "weight-svd-configuration"]], "AIMET PyTorch Cross Layer Equalization APIs": [[77, "aimet-pytorch-cross-layer-equalization-apis"]], "AIMET PyTorch Layer Output Generation API": [[78, "aimet-pytorch-layer-output-generation-api"]], "PyTorch Model Guidelines": [[79, "pytorch-model-guidelines"]], "Limitations of torch.fx symbolic trace API": [[80, "limitations-of-torch-fx-symbolic-trace-api"]], "Model Validator Utility": [[81, "model-validator-utility"]], "PyTorch Multi-GPU support": [[82, "pytorch-multi-gpu-support"]], "User flow": [[83, "user-flow"]], "AIMET PyTorch Cross Layer Equalization Primitive API": [[84, "aimet-pytorch-cross-layer-equalization-primitive-api"]], "ClsSetInfo Definition": [[84, "clssetinfo-definition"]], "Code Examples for Lower Level APIs": [[84, "code-examples-for-lower-level-apis"]], "AIMET PyTorch Quant Analyzer API": [[85, "aimet-pytorch-quant-analyzer-api"]], "AIMET PyTorch Quantization APIs": [[86, "aimet-pytorch-quantization-apis"]], "AIMET PyTorch Quantization SIM API": [[87, "aimet-pytorch-quantization-sim-api"]], "Guidelines": [[87, "guidelines"]], "Code Example - Quantization Aware Training (QAT)": [[87, "code-example-quantization-aware-training-qat"]], "AIMET Visualization Compression API": [[88, "aimet-visualization-compression-api"]], "Top-level API Compression": [[88, "top-level-api-compression"]], "AIMET Visualization for Quantization API": [[89, "aimet-visualization-for-quantization-api"]], "Top-level API Quantization": [[89, "top-level-api-quantization"]], "AIMET Installation": [[90, "aimet-installation"]], "Quick Install": [[90, "quick-install"]], "Release Packages": [[90, "release-packages"]], "System Requirements": [[90, "system-requirements"]], "Advanced Installation Instructions": [[90, "advanced-installation-instructions"]], "AIMET Installation in Docker": [[91, "aimet-installation-in-docker"]], "Set variant": [[91, "set-variant"]], "Use prebuilt docker image": [[91, "use-prebuilt-docker-image"]], "Build docker image locally": [[91, "build-docker-image-locally"]], "Start docker container": [[91, "start-docker-container"]], "Install AIMET packages": [[91, "install-aimet-packages"], [92, "install-aimet-packages"]], "From PyPI": [[91, "from-pypi"], [92, "from-pypi"]], "From Release Package": [[91, "from-release-package"], [92, "from-release-package"]], "Environment setup": [[91, "environment-setup"], [92, "environment-setup"]], "AIMET Installation and Setup": [[92, "aimet-installation-and-setup"]], "Install prerequisite packages": [[92, "install-prerequisite-packages"]], "Install GPU packages": [[92, "install-gpu-packages"]], "Install GPU packages for PyTorch 2.1 or TensorFlow": [[92, "install-gpu-packages-for-pytorch-2-1-or-tensorflow"]], "Install GPU packages for PyTorch 1.13 or ONNX": [[92, "install-gpu-packages-for-pytorch-1-13-or-onnx"]], "Install common debian packages": [[92, "install-common-debian-packages"]], "Install tensorflow GPU debian packages": [[92, 
"install-tensorflow-gpu-debian-packages"]], "Install torch GPU debian packages": [[92, "install-torch-gpu-debian-packages"]], "Install ONNX GPU debian packages": [[92, "install-onnx-gpu-debian-packages"]], "Replace Pillow with Pillow-SIMD": [[92, "replace-pillow-with-pillow-simd"]], "Replace onnxruntime with onnxruntime-gpu": [[92, "replace-onnxruntime-with-onnxruntime-gpu"]], "Post installation steps": [[92, "post-installation-steps"]], "AIMET AdaRound": [[94, "aimet-adaround"]], "AdaRound Use Cases": [[94, "adaround-use-cases"]], "Common terminology": [[94, "common-terminology"]], "Use Cases": [[94, "use-cases"], [105, "use-cases"]], "AdaRound API": [[94, "adaround-api"]], "AIMET AutoQuant": [[95, "aimet-autoquant"]], "Overview": [[95, "overview"], [96, "overview"], [100, "overview"], [101, "overview"], [103, "overview"], [106, "overview"], [107, "overview"], [108, "overview"], [109, "overview"], [111, "overview"], [114, "overview"], [115, "overview"], [117, "overview"]], "Workflow": [[95, "workflow"], [96, "workflow"]], "AutoQuant API": [[95, "autoquant-api"]], "AIMET BN Re-estimation": [[96, "aimet-bn-re-estimation"]], "BN Re-estimation API": [[96, "bn-re-estimation-api"]], "AIMET Channel Pruning": [[97, "aimet-channel-pruning"]], "Overall Procedure": [[97, "overall-procedure"]], "Channel Selection": [[97, "channel-selection"]], "Winnowing": [[97, "winnowing"]], "Weight Reconstruction": [[97, "weight-reconstruction"]], "AIMET Compression Features Guidebook": [[98, "aimet-compression-features-guidebook"]], "AIMET Examples": [[99, "aimet-examples"]], "Browse the notebooks": [[99, "browse-the-notebooks"]], "Running the notebooks": [[99, "running-the-notebooks"]], "Install Jupyter": [[99, "install-jupyter"]], "Download the Example notebooks and related code": [[99, "download-the-example-notebooks-and-related-code"]], "Run the notebooks": [[99, "run-the-notebooks"]], "AIMET Greedy Compression Ratio Selection": [[100, "aimet-greedy-compression-ratio-selection"]], "How it works": [[100, "how-it-works"]], "Per-layer Exploration": [[100, "per-layer-exploration"]], "Compression Ratio Selection": [[100, "compression-ratio-selection"]], "AI Model Efficiency Toolkit User Guide": [[101, "ai-model-efficiency-toolkit-user-guide"]], "Features": [[101, "features"]], "Release Information": [[101, "release-information"]], "Installation Guide": [[101, "installation-guide"]], "Getting Started": [[101, "getting-started"]], "toc tree": [[101, "toc-tree"]], "AIMET Known Issues": [[102, "aimet-known-issues"]], "AIMET Model Compression": [[103, "aimet-model-compression"]], "Use Case": [[103, "use-case"]], "Compression ratio selection": [[103, "compression-ratio-selection"]], "Model Compression": [[103, "model-compression"]], "Optional techniques to get better compression results": [[103, "optional-techniques-to-get-better-compression-results"]], "Rank Rounding": [[103, "rank-rounding"]], "Per-layer Fine-tuning": [[103, "per-layer-fine-tuning"]], "FAQs": [[103, "faqs"], [106, "faqs"]], "References": [[103, "references"], [106, "references"]], "Model Guidelines for PyTorch": [[104, "model-guidelines-for-pytorch"]], "AIMET Model Quantization": [[105, "aimet-model-quantization"]], "AIMET Quantization Features": [[105, "aimet-quantization-features"]], "Post-Training Quantization": [[105, "post-training-quantization"]], "Debugging/Analysis Tools": [[105, "debugging-analysis-tools"]], "AIMET Quantization Workflow": [[105, "aimet-quantization-workflow"]], "PyTorch": [[105, "pytorch"], [115, "pytorch"]], "Tensorflow": 
[[105, "tensorflow"]], "Debugging Guidelines": [[105, "debugging-guidelines"]], "AIMET Post-Training Quantization Techniques": [[106, "aimet-post-training-quantization-techniques"]], "User Flow": [[106, "user-flow"]], "Cross-Layer Equalization API": [[106, "cross-layer-equalization-api"]], "AIMET QuantAnalyzer": [[107, "aimet-quantanalyzer"]], "Requirements": [[107, "requirements"]], "Detailed Analysis Descriptions": [[107, "detailed-analysis-descriptions"]], "QuantAnalyzer API": [[107, "quantanalyzer-api"]], "AIMET Quantization Aware Training": [[108, "aimet-quantization-aware-training"]], "QAT workflow": [[108, "qat-workflow"]], "QAT modes": [[108, "qat-modes"]], "Recommendations for Quantization-Aware Training": [[108, "recommendations-for-quantization-aware-training"]], "Quantization Simulation Configuration": [[109, "quantization-simulation-configuration"]], "Configuration File Structure": [[109, "configuration-file-structure"]], "How to configure individual Configuration File Sections": [[109, "how-to-configure-individual-configuration-file-sections"]], "AIMET Quantization Features Guidebook": [[110, "aimet-quantization-features-guidebook"]], "AIMET Quantization Simulation": [[111, "aimet-quantization-simulation"]], "QuantSim Workflow": [[111, "quantsim-workflow"]], "Simulating Quantization Noise": [[111, "simulating-quantization-noise"]], "Determining Quantization Parameters (Encodings)": [[111, "determining-quantization-parameters-encodings"]], "Quantization Schemes": [[111, "quantization-schemes"]], "Configuring Quantization Simulation Ops": [[111, "configuring-quantization-simulation-ops"]], "Quantization Simulation APIs": [[111, "quantization-simulation-apis"]], "Frequently Asked Questions": [[111, "frequently-asked-questions"]], "AIMET Release Notes": [[112, "aimet-release-notes"]], "1.22.2": [[112, "id1"]], "1.22.1": [[112, "id2"]], "1.22.0": [[112, "id3"]], "1.21.0": [[112, "id4"]], "1.20.0": [[112, "id5"]], "1.19.1.py37": [[112, "py37"]], "1.19.1": [[112, "id6"]], "1.18.0.py37": [[112, "id7"]], "1.18.0": [[112, "id8"]], "1.17.0.py37": [[112, "id9"]], "1.17.0": [[112, "id10"]], "1.16.2.py37": [[112, "id11"]], "1.16.2": [[112, "id12"]], "1.16.1.py37": [[112, "id13"]], "1.16.1": [[112, "id14"]], "1.16.0": [[112, "id15"]], "1.14.0": [[112, "id16"]], "1.13.0": [[112, "id17"]], "AIMET Spatial SVD": [[113, "aimet-spatial-svd"]], "AIMET Visualization": [[114, "aimet-visualization"]], "Design": [[114, "design"]], "Compression": [[114, "compression"]], "Starting a Bokeh Server Session:": [[114, "starting-a-bokeh-server-session"]], "How to use the tool": [[114, "how-to-use-the-tool"]], "AIMET Visualization for Quantization": [[115, "aimet-visualization-for-quantization"]], "Quantization": [[115, "quantization"]], "TensorFlow": [[115, "tensorflow"]], "AIMET Weight SVD": [[116, "aimet-weight-svd"]], "AIMET Winnowing": [[117, "aimet-winnowing"]], "Winnowing Overview": [[117, "winnowing-overview"]], "How Winnowing Works": [[117, "how-winnowing-works"]]}, "indexentries": {"load_keras_model_multi_gpu() (in module aimet_tensorflow.utils.convert_tf_sess_to_keras)": [[33, "aimet_tensorflow.utils.convert_tf_sess_to_keras.load_keras_model_multi_gpu"]], "load_tf_sess_variables_to_keras_single_gpu() (in module aimet_tensorflow.utils.convert_tf_sess_to_keras)": [[33, "aimet_tensorflow.utils.convert_tf_sess_to_keras.load_tf_sess_variables_to_keras_single_gpu"]], "save_as_tf_module_multi_gpu() (in module aimet_tensorflow.utils.convert_tf_sess_to_keras)": [[33, 
"aimet_tensorflow.utils.convert_tf_sess_to_keras.save_as_tf_module_multi_gpu"]], "save_tf_session_single_gpu() (in module aimet_tensorflow.utils.convert_tf_sess_to_keras)": [[33, "aimet_tensorflow.utils.convert_tf_sess_to_keras.save_tf_session_single_gpu"]], "adaroundparameters (class in aimet_tensorflow.adaround.adaround_weight)": [[36, "aimet_tensorflow.adaround.adaround_weight.AdaroundParameters"], [57, "aimet_tensorflow.adaround.adaround_weight.AdaroundParameters"]], "quantscheme (class in aimet_common.defs)": [[36, "aimet_common.defs.QuantScheme"], [57, "aimet_common.defs.QuantScheme"], [71, "aimet_common.defs.QuantScheme"], [87, "aimet_common.defs.QuantScheme"]], "post_training_percentile (aimet_common.defs.quantscheme attribute)": [[36, "aimet_common.defs.QuantScheme.post_training_percentile"], [57, "aimet_common.defs.QuantScheme.post_training_percentile"], [71, "aimet_common.defs.QuantScheme.post_training_percentile"], [87, "aimet_common.defs.QuantScheme.post_training_percentile"]], "post_training_tf (aimet_common.defs.quantscheme attribute)": [[36, "aimet_common.defs.QuantScheme.post_training_tf"], [57, "aimet_common.defs.QuantScheme.post_training_tf"], [71, "aimet_common.defs.QuantScheme.post_training_tf"], [87, "aimet_common.defs.QuantScheme.post_training_tf"]], "post_training_tf_enhanced (aimet_common.defs.quantscheme attribute)": [[36, "aimet_common.defs.QuantScheme.post_training_tf_enhanced"], [57, "aimet_common.defs.QuantScheme.post_training_tf_enhanced"], [71, "aimet_common.defs.QuantScheme.post_training_tf_enhanced"], [87, "aimet_common.defs.QuantScheme.post_training_tf_enhanced"]], "training_range_learning_with_tf_enhanced_init (aimet_common.defs.quantscheme attribute)": [[36, "aimet_common.defs.QuantScheme.training_range_learning_with_tf_enhanced_init"], [57, "aimet_common.defs.QuantScheme.training_range_learning_with_tf_enhanced_init"], [71, "aimet_common.defs.QuantScheme.training_range_learning_with_tf_enhanced_init"], [87, "aimet_common.defs.QuantScheme.training_range_learning_with_tf_enhanced_init"]], "training_range_learning_with_tf_init (aimet_common.defs.quantscheme attribute)": [[36, "aimet_common.defs.QuantScheme.training_range_learning_with_tf_init"], [57, "aimet_common.defs.QuantScheme.training_range_learning_with_tf_init"], [71, "aimet_common.defs.QuantScheme.training_range_learning_with_tf_init"], [87, "aimet_common.defs.QuantScheme.training_range_learning_with_tf_init"]], "fold_all_batch_norms_to_scale() (in module aimet_tensorflow.keras.batch_norm_fold)": [[37, "aimet_tensorflow.keras.batch_norm_fold.fold_all_batch_norms_to_scale"]], "reestimate_bn_stats() (in module aimet_tensorflow.keras.bn_reestimation)": [[37, "aimet_tensorflow.keras.bn_reestimation.reestimate_bn_stats"]], "compressionscheme (class in aimet_common.defs)": [[38, "aimet_common.defs.CompressionScheme"], [61, "aimet_common.defs.CompressionScheme"]], "costmetric (class in aimet_common.defs)": [[38, "aimet_common.defs.CostMetric"], [61, "aimet_common.defs.CostMetric"]], "modelcompressor (class in aimet_tensorflow.keras.compress)": [[38, "aimet_tensorflow.keras.compress.ModelCompressor"]], "modulecompratiopair (class in aimet_tensorflow.defs)": [[38, "aimet_tensorflow.defs.ModuleCompRatioPair"], [61, "aimet_tensorflow.defs.ModuleCompRatioPair"]], "spatialsvdparameters (class in aimet_tensorflow.defs)": [[38, "aimet_tensorflow.defs.SpatialSvdParameters"], [61, "aimet_tensorflow.defs.SpatialSvdParameters"]], "spatialsvdparameters.automodeparams (class in aimet_tensorflow.defs)": [[38, 
"aimet_tensorflow.defs.SpatialSvdParameters.AutoModeParams"], [61, "aimet_tensorflow.defs.SpatialSvdParameters.AutoModeParams"]], "spatialsvdparameters.manualmodeparams (class in aimet_tensorflow.defs)": [[38, "aimet_tensorflow.defs.SpatialSvdParameters.ManualModeParams"], [61, "aimet_tensorflow.defs.SpatialSvdParameters.ManualModeParams"]], "spatialsvdparameters.mode (class in aimet_tensorflow.defs)": [[38, "aimet_tensorflow.defs.SpatialSvdParameters.Mode"], [61, "aimet_tensorflow.defs.SpatialSvdParameters.Mode"]], "auto (aimet_tensorflow.defs.spatialsvdparameters.mode attribute)": [[38, "aimet_tensorflow.defs.SpatialSvdParameters.Mode.auto"], [61, "aimet_tensorflow.defs.SpatialSvdParameters.Mode.auto"]], "channel_pruning (aimet_common.defs.compressionscheme attribute)": [[38, "aimet_common.defs.CompressionScheme.channel_pruning"], [61, "aimet_common.defs.CompressionScheme.channel_pruning"]], "compress_model() (aimet_tensorflow.keras.compress.modelcompressor static method)": [[38, "aimet_tensorflow.keras.compress.ModelCompressor.compress_model"]], "mac (aimet_common.defs.costmetric attribute)": [[38, "aimet_common.defs.CostMetric.mac"], [61, "aimet_common.defs.CostMetric.mac"]], "manual (aimet_tensorflow.defs.spatialsvdparameters.mode attribute)": [[38, "aimet_tensorflow.defs.SpatialSvdParameters.Mode.manual"], [61, "aimet_tensorflow.defs.SpatialSvdParameters.Mode.manual"]], "memory (aimet_common.defs.costmetric attribute)": [[38, "aimet_common.defs.CostMetric.memory"], [61, "aimet_common.defs.CostMetric.memory"]], "spatial_svd (aimet_common.defs.compressionscheme attribute)": [[38, "aimet_common.defs.CompressionScheme.spatial_svd"], [61, "aimet_common.defs.CompressionScheme.spatial_svd"]], "weight_svd (aimet_common.defs.compressionscheme attribute)": [[38, "aimet_common.defs.CompressionScheme.weight_svd"], [61, "aimet_common.defs.CompressionScheme.weight_svd"]], "equalize_model() (in module aimet_tensorflow.keras.cross_layer_equalization)": [[39, "aimet_tensorflow.keras.cross_layer_equalization.equalize_model"]], "layeroutpututil (class in aimet_tensorflow.keras.layer_output_utils)": [[40, "aimet_tensorflow.keras.layer_output_utils.LayerOutputUtil"]], "generate_layer_outputs() (aimet_tensorflow.keras.layer_output_utils.layeroutpututil method)": [[40, "aimet_tensorflow.keras.layer_output_utils.LayerOutputUtil.generate_layer_outputs"]], "prepare_model() (in module aimet_tensorflow.keras.model_preparer)": [[42, "aimet_tensorflow.keras.model_preparer.prepare_model"]], "clssetinfo (class in aimet_tensorflow.keras.cross_layer_equalization)": [[43, "aimet_tensorflow.keras.cross_layer_equalization.ClsSetInfo"]], "clssetinfo.clssetlayerpairinfo (class in aimet_tensorflow.keras.cross_layer_equalization)": [[43, "aimet_tensorflow.keras.cross_layer_equalization.ClsSetInfo.ClsSetLayerPairInfo"]], "bias_fold() (in module aimet_tensorflow.keras.cross_layer_equalization.highbiasfold)": [[43, "aimet_tensorflow.keras.cross_layer_equalization.HighBiasFold.bias_fold"], [43, "id0"]], "fold_all_batch_norms() (in module aimet_tensorflow.keras.batch_norm_fold)": [[43, "aimet_tensorflow.keras.batch_norm_fold.fold_all_batch_norms"]], "fold_given_batch_norms() (in module aimet_tensorflow.keras.batch_norm_fold)": [[43, "aimet_tensorflow.keras.batch_norm_fold.fold_given_batch_norms"]], "scale_cls_sets() (in module aimet_tensorflow.keras.cross_layer_equalization.crosslayerscaling)": [[43, "aimet_tensorflow.keras.cross_layer_equalization.CrossLayerScaling.scale_cls_sets"]], "scale_model() (in module 
aimet_tensorflow.keras.cross_layer_equalization.crosslayerscaling)": [[43, "aimet_tensorflow.keras.cross_layer_equalization.CrossLayerScaling.scale_model"]], "quantanalyzer (class in aimet_tensorflow.keras.quant_analyzer)": [[44, "aimet_tensorflow.keras.quant_analyzer.QuantAnalyzer"]], "analyze() (aimet_tensorflow.keras.quant_analyzer.quantanalyzer method)": [[44, "aimet_tensorflow.keras.quant_analyzer.QuantAnalyzer.analyze"]], "quantizationsimmodel (class in aimet_tensorflow.keras.quantsim)": [[46, "aimet_tensorflow.keras.quantsim.QuantizationSimModel"]], "compute_encodings() (aimet_tensorflow.keras.quantsim.quantizationsimmodel method)": [[46, "aimet_tensorflow.keras.quantsim.QuantizationSimModel.compute_encodings"]], "export() (aimet_tensorflow.keras.quantsim.quantizationsimmodel method)": [[46, "aimet_tensorflow.keras.quantsim.QuantizationSimModel.export"]], "apply_adaround() (in module aimet_tensorflow.adaround.adaround_weight.adaround)": [[57, "aimet_tensorflow.adaround.adaround_weight.Adaround.apply_adaround"]], "autoquant (class in aimet_tensorflow.auto_quant)": [[58, "aimet_tensorflow.auto_quant.AutoQuant"]], "apply() (aimet_tensorflow.auto_quant.autoquant method)": [[58, "aimet_tensorflow.auto_quant.AutoQuant.apply"]], "set_adaround_params() (aimet_tensorflow.auto_quant.autoquant method)": [[58, "aimet_tensorflow.auto_quant.AutoQuant.set_adaround_params"]], "fold_all_batch_norms_to_scale() (in module aimet_tensorflow.batch_norm_fold)": [[59, "aimet_tensorflow.batch_norm_fold.fold_all_batch_norms_to_scale"]], "reestimate_bn_stats() (in module aimet_tensorflow.bn_reestimation)": [[59, "aimet_tensorflow.bn_reestimation.reestimate_bn_stats"]], "biascorrectionparams() (in module aimet_tensorflow.bias_correction)": [[60, "aimet_tensorflow.bias_correction.BiasCorrectionParams"]], "quantparams (class in aimet_tensorflow.bias_correction)": [[60, "aimet_tensorflow.bias_correction.QuantParams"]], "analytical_bias_correction_per_layer() (in module aimet_tensorflow.bias_correction.biascorrection)": [[60, "aimet_tensorflow.bias_correction.BiasCorrection.analytical_bias_correction_per_layer"]], "bias_correction_per_layer() (in module aimet_tensorflow.bias_correction.biascorrection)": [[60, "aimet_tensorflow.bias_correction.BiasCorrection.bias_correction_per_layer"]], "correct_bias() (in module aimet_tensorflow.bias_correction.biascorrection)": [[60, "aimet_tensorflow.bias_correction.BiasCorrection.correct_bias"]], "channelpruningparameters (class in aimet_tensorflow.defs)": [[61, "aimet_tensorflow.defs.ChannelPruningParameters"]], "channelpruningparameters.automodeparams (class in aimet_tensorflow.defs)": [[61, "aimet_tensorflow.defs.ChannelPruningParameters.AutoModeParams"]], "channelpruningparameters.manualmodeparams (class in aimet_tensorflow.defs)": [[61, "aimet_tensorflow.defs.ChannelPruningParameters.ManualModeParams"]], "channelpruningparameters.mode (class in aimet_tensorflow.defs)": [[61, "aimet_tensorflow.defs.ChannelPruningParameters.Mode"]], "modelcompressor (class in aimet_tensorflow.compress)": [[61, "aimet_tensorflow.compress.ModelCompressor"]], "svd (class in aimet_tensorflow.svd)": [[61, "aimet_tensorflow.svd.Svd"]], "auto (aimet_tensorflow.defs.channelpruningparameters.mode attribute)": [[61, "aimet_tensorflow.defs.ChannelPruningParameters.Mode.auto"]], "compress_model() (aimet_tensorflow.compress.modelcompressor static method)": [[61, "aimet_tensorflow.compress.ModelCompressor.compress_model"]], "compress_net() (aimet_tensorflow.svd.svd method)": [[61, 
"aimet_tensorflow.svd.Svd.compress_net"]], "manual (aimet_tensorflow.defs.channelpruningparameters.mode attribute)": [[61, "aimet_tensorflow.defs.ChannelPruningParameters.Mode.manual"]], "equalize_model() (in module aimet_tensorflow.cross_layer_equalization)": [[62, "aimet_tensorflow.cross_layer_equalization.equalize_model"]], "layeroutpututil (class in aimet_tensorflow.layer_output_utils)": [[63, "aimet_tensorflow.layer_output_utils.LayerOutputUtil"]], "generate_layer_outputs() (aimet_tensorflow.layer_output_utils.layeroutpututil method)": [[63, "aimet_tensorflow.layer_output_utils.LayerOutputUtil.generate_layer_outputs"]], "update_keras_bn_ops_trainable_flag() (in module aimet_tensorflow.utils.graph)": [[64, "aimet_tensorflow.utils.graph.update_keras_bn_ops_trainable_flag"]], "clssetinfo (class in aimet_tensorflow.cross_layer_equalization)": [[65, "aimet_tensorflow.cross_layer_equalization.ClsSetInfo"]], "clssetinfo.clssetlayerpairinfo (class in aimet_tensorflow.cross_layer_equalization)": [[65, "aimet_tensorflow.cross_layer_equalization.ClsSetInfo.ClsSetLayerPairInfo"]], "bias_fold() (in module aimet_tensorflow.cross_layer_equalization.highbiasfold)": [[65, "aimet_tensorflow.cross_layer_equalization.HighBiasFold.bias_fold"], [65, "id0"]], "fold_all_batch_norms() (in module aimet_tensorflow.batch_norm_fold)": [[65, "aimet_tensorflow.batch_norm_fold.fold_all_batch_norms"]], "fold_given_batch_norms() (in module aimet_tensorflow.batch_norm_fold)": [[65, "aimet_tensorflow.batch_norm_fold.fold_given_batch_norms"]], "map_cls_sets_to_new_session() (aimet_tensorflow.cross_layer_equalization.clssetinfo static method)": [[65, "aimet_tensorflow.cross_layer_equalization.ClsSetInfo.map_cls_sets_to_new_session"]], "scale_cls_sets() (in module aimet_tensorflow.cross_layer_equalization.crosslayerscaling)": [[65, "aimet_tensorflow.cross_layer_equalization.CrossLayerScaling.scale_cls_sets"]], "scale_model() (in module aimet_tensorflow.cross_layer_equalization.crosslayerscaling)": [[65, "aimet_tensorflow.cross_layer_equalization.CrossLayerScaling.scale_model"]], "quantanalyzer (class in aimet_tensorflow.quant_analyzer)": [[66, "aimet_tensorflow.quant_analyzer.QuantAnalyzer"]], "analyze() (aimet_tensorflow.quant_analyzer.quantanalyzer method)": [[66, "aimet_tensorflow.quant_analyzer.QuantAnalyzer.analyze"]], "quantizationsimmodel (class in aimet_tensorflow.quantsim)": [[68, "aimet_tensorflow.quantsim.QuantizationSimModel"]], "compute_encodings() (aimet_tensorflow.quantsim.quantizationsimmodel method)": [[68, "aimet_tensorflow.quantsim.QuantizationSimModel.compute_encodings"]], "export() (aimet_tensorflow.quantsim.quantizationsimmodel method)": [[68, "aimet_tensorflow.quantsim.QuantizationSimModel.export"]], "visualize_relative_weight_ranges_single_layer() (in module aimet_tensorflow.plotting_utils)": [[69, "aimet_tensorflow.plotting_utils.visualize_relative_weight_ranges_single_layer"]], "visualize_weight_ranges_single_layer() (in module aimet_tensorflow.plotting_utils)": [[69, "aimet_tensorflow.plotting_utils.visualize_weight_ranges_single_layer"]], "adaroundparameters (class in aimet_torch.adaround.adaround_weight)": [[71, "aimet_torch.adaround.adaround_weight.AdaroundParameters"]], "apply_adaround() (in module aimet_torch.adaround.adaround_weight.adaround)": [[71, "aimet_torch.adaround.adaround_weight.Adaround.apply_adaround"]], "check_model_arch() (in module aimet_torch.arch_checker.arch_checker.archchecker)": [[72, "aimet_torch.arch_checker.arch_checker.ArchChecker.check_model_arch"]], "autoquant 
(class in aimet_torch.auto_quant)": [[73, "aimet_torch.auto_quant.AutoQuant"]], "fold_all_batch_norms_to_scale() (in module aimet_torch.batch_norm_fold)": [[74, "aimet_torch.batch_norm_fold.fold_all_batch_norms_to_scale"]], "reestimate_bn_stats() (in module aimet_torch.bn_reestimation)": [[74, "aimet_torch.bn_reestimation.reestimate_bn_stats"]], "activationtype (class in aimet_common.defs)": [[75, "aimet_common.defs.ActivationType"]], "convbninfotype (class in aimet_common.bias_correction)": [[75, "aimet_common.bias_correction.ConvBnInfoType"]], "quantparams (class in aimet_torch.quantsim)": [[75, "aimet_torch.quantsim.QuantParams"]], "correct_bias() (in module aimet_torch.bias_correction)": [[75, "aimet_torch.bias_correction.correct_bias"]], "no_activation (aimet_common.defs.activationtype attribute)": [[75, "aimet_common.defs.ActivationType.no_activation"]], "relu (aimet_common.defs.activationtype attribute)": [[75, "aimet_common.defs.ActivationType.relu"]], "relu6 (aimet_common.defs.activationtype attribute)": [[75, "aimet_common.defs.ActivationType.relu6"]], "channelpruningparameters (class in aimet_torch.defs)": [[76, "aimet_torch.defs.ChannelPruningParameters"]], "channelpruningparameters.automodeparams (class in aimet_torch.defs)": [[76, "aimet_torch.defs.ChannelPruningParameters.AutoModeParams"]], "channelpruningparameters.manualmodeparams (class in aimet_torch.defs)": [[76, "aimet_torch.defs.ChannelPruningParameters.ManualModeParams"]], "channelpruningparameters.mode (class in aimet_torch.defs)": [[76, "aimet_torch.defs.ChannelPruningParameters.Mode"]], "greedyselectionparameters (class in aimet_common.defs)": [[76, "aimet_common.defs.GreedySelectionParameters"]], "modelcompressor (class in aimet_torch.compress)": [[76, "aimet_torch.compress.ModelCompressor"]], "modulecompratiopair (class in aimet_torch.defs)": [[76, "aimet_torch.defs.ModuleCompRatioPair"]], "spatialsvdparameters (class in aimet_torch.defs)": [[76, "aimet_torch.defs.SpatialSvdParameters"]], "spatialsvdparameters.automodeparams (class in aimet_torch.defs)": [[76, "aimet_torch.defs.SpatialSvdParameters.AutoModeParams"]], "spatialsvdparameters.manualmodeparams (class in aimet_torch.defs)": [[76, "aimet_torch.defs.SpatialSvdParameters.ManualModeParams"]], "spatialsvdparameters.mode (class in aimet_torch.defs)": [[76, "aimet_torch.defs.SpatialSvdParameters.Mode"]], "tarrankselectionparameters (class in aimet_torch.defs)": [[76, "aimet_torch.defs.TarRankSelectionParameters"]], "weightsvdparameters (class in aimet_torch.defs)": [[76, "aimet_torch.defs.WeightSvdParameters"]], "weightsvdparameters.automodeparams (class in aimet_torch.defs)": [[76, "aimet_torch.defs.WeightSvdParameters.AutoModeParams"]], "weightsvdparameters.manualmodeparams (class in aimet_torch.defs)": [[76, "aimet_torch.defs.WeightSvdParameters.ManualModeParams"]], "weightsvdparameters.mode (class in aimet_torch.defs)": [[76, "aimet_torch.defs.WeightSvdParameters.Mode"]], "auto (aimet_torch.defs.channelpruningparameters.mode attribute)": [[76, "aimet_torch.defs.ChannelPruningParameters.Mode.auto"]], "auto (aimet_torch.defs.spatialsvdparameters.mode attribute)": [[76, "aimet_torch.defs.SpatialSvdParameters.Mode.auto"]], "auto (aimet_torch.defs.weightsvdparameters.mode attribute)": [[76, "aimet_torch.defs.WeightSvdParameters.Mode.auto"]], "compress_model() (aimet_torch.compress.modelcompressor static method)": [[76, "aimet_torch.compress.ModelCompressor.compress_model"]], "manual (aimet_torch.defs.channelpruningparameters.mode attribute)": [[76, 
"aimet_torch.defs.ChannelPruningParameters.Mode.manual"]], "manual (aimet_torch.defs.spatialsvdparameters.mode attribute)": [[76, "aimet_torch.defs.SpatialSvdParameters.Mode.manual"]], "manual (aimet_torch.defs.weightsvdparameters.mode attribute)": [[76, "aimet_torch.defs.WeightSvdParameters.Mode.manual"]], "equalize_model() (in module aimet_torch.cross_layer_equalization)": [[77, "aimet_torch.cross_layer_equalization.equalize_model"]], "layeroutpututil (class in aimet_torch.layer_output_utils)": [[78, "aimet_torch.layer_output_utils.LayerOutputUtil"]], "namingscheme (class in aimet_torch.layer_output_utils)": [[78, "aimet_torch.layer_output_utils.NamingScheme"]], "onnx (aimet_torch.layer_output_utils.namingscheme attribute)": [[78, "aimet_torch.layer_output_utils.NamingScheme.ONNX"]], "pytorch (aimet_torch.layer_output_utils.namingscheme attribute)": [[78, "aimet_torch.layer_output_utils.NamingScheme.PYTORCH"]], "torchscript (aimet_torch.layer_output_utils.namingscheme attribute)": [[78, "aimet_torch.layer_output_utils.NamingScheme.TORCHSCRIPT"]], "generate_layer_outputs() (aimet_torch.layer_output_utils.layeroutpututil method)": [[78, "aimet_torch.layer_output_utils.LayerOutputUtil.generate_layer_outputs"]], "prepare_model() (in module aimet_torch.model_preparer)": [[80, "aimet_torch.model_preparer.prepare_model"]], "adaptermetadata (class in aimet_torch.peft)": [[83, "aimet_torch.peft.AdapterMetaData"]], "peftquantutils (class in aimet_torch.peft)": [[83, "aimet_torch.peft.PeftQuantUtils"]], "disable_lora_adapters() (aimet_torch.peft.peftquantutils method)": [[83, "aimet_torch.peft.PeftQuantUtils.disable_lora_adapters"]], "enable_adapter_and_load_weights() (aimet_torch.peft.peftquantutils method)": [[83, "aimet_torch.peft.PeftQuantUtils.enable_adapter_and_load_weights"]], "export_adapter_weights() (aimet_torch.peft.peftquantutils method)": [[83, "aimet_torch.peft.PeftQuantUtils.export_adapter_weights"]], "freeze_base_model() (aimet_torch.peft.peftquantutils method)": [[83, "aimet_torch.peft.PeftQuantUtils.freeze_base_model"]], "freeze_base_model_activation_quantizers() (aimet_torch.peft.peftquantutils method)": [[83, "aimet_torch.peft.PeftQuantUtils.freeze_base_model_activation_quantizers"]], "freeze_base_model_param_quantizers() (aimet_torch.peft.peftquantutils method)": [[83, "aimet_torch.peft.PeftQuantUtils.freeze_base_model_param_quantizers"]], "get_quantized_lora_layer() (aimet_torch.peft.peftquantutils method)": [[83, "aimet_torch.peft.PeftQuantUtils.get_quantized_lora_layer"]], "replace_lora_layers_with_quantizable_layers() (aimet_torch.peft method)": [[83, "aimet_torch.peft.replace_lora_layers_with_quantizable_layers"]], "save_lora_weights_after_adaptation() (aimet_torch.peft method)": [[83, "aimet_torch.peft.save_lora_weights_after_adaptation"]], "set_bitwidth_for_lora_adapters() (aimet_torch.peft.peftquantutils method)": [[83, "aimet_torch.peft.PeftQuantUtils.set_bitwidth_for_lora_adapters"]], "track_lora_meta_data() (aimet_torch.peft method)": [[83, "aimet_torch.peft.track_lora_meta_data"]], "clssetinfo (class in aimet_torch.cross_layer_equalization)": [[84, "aimet_torch.cross_layer_equalization.ClsSetInfo"]], "clssetinfo.clssetlayerpairinfo (class in aimet_torch.cross_layer_equalization)": [[84, "aimet_torch.cross_layer_equalization.ClsSetInfo.ClsSetLayerPairInfo"]], "bias_fold() (in module aimet_torch.cross_layer_equalization.highbiasfold)": [[84, "aimet_torch.cross_layer_equalization.HighBiasFold.bias_fold"], [84, "id0"]], "fold_all_batch_norms() (in module 
aimet_torch.batch_norm_fold)": [[84, "aimet_torch.batch_norm_fold.fold_all_batch_norms"]], "fold_given_batch_norms() (in module aimet_torch.batch_norm_fold)": [[84, "aimet_torch.batch_norm_fold.fold_given_batch_norms"]], "scale_cls_sets() (in module aimet_torch.cross_layer_equalization.crosslayerscaling)": [[84, "aimet_torch.cross_layer_equalization.CrossLayerScaling.scale_cls_sets"]], "scale_model() (in module aimet_torch.cross_layer_equalization.crosslayerscaling)": [[84, "aimet_torch.cross_layer_equalization.CrossLayerScaling.scale_model"]], "callbackfunc (class in aimet_common.utils)": [[85, "aimet_common.utils.CallbackFunc"]], "quantanalyzer (class in aimet_torch.quant_analyzer)": [[85, "aimet_torch.quant_analyzer.QuantAnalyzer"]], "analyze() (aimet_torch.quant_analyzer.quantanalyzer method)": [[85, "aimet_torch.quant_analyzer.QuantAnalyzer.analyze"]], "check_model_sensitivity_to_quantization() (aimet_torch.quant_analyzer.quantanalyzer method)": [[85, "aimet_torch.quant_analyzer.QuantAnalyzer.check_model_sensitivity_to_quantization"]], "enable_per_layer_mse_loss() (aimet_torch.quant_analyzer.quantanalyzer method)": [[85, "aimet_torch.quant_analyzer.QuantAnalyzer.enable_per_layer_mse_loss"]], "export_per_layer_encoding_min_max_range() (aimet_torch.quant_analyzer.quantanalyzer method)": [[85, "aimet_torch.quant_analyzer.QuantAnalyzer.export_per_layer_encoding_min_max_range"]], "export_per_layer_mse_loss() (aimet_torch.quant_analyzer.quantanalyzer method)": [[85, "aimet_torch.quant_analyzer.QuantAnalyzer.export_per_layer_mse_loss"]], "export_per_layer_stats_histogram() (aimet_torch.quant_analyzer.quantanalyzer method)": [[85, "aimet_torch.quant_analyzer.QuantAnalyzer.export_per_layer_stats_histogram"]], "perform_per_layer_analysis_by_disabling_quant_wrappers() (aimet_torch.quant_analyzer.quantanalyzer method)": [[85, "aimet_torch.quant_analyzer.QuantAnalyzer.perform_per_layer_analysis_by_disabling_quant_wrappers"]], "perform_per_layer_analysis_by_enabling_quant_wrappers() (aimet_torch.quant_analyzer.quantanalyzer method)": [[85, "aimet_torch.quant_analyzer.QuantAnalyzer.perform_per_layer_analysis_by_enabling_quant_wrappers"]], "quantizationsimmodel (class in aimet_torch.quantsim)": [[87, "aimet_torch.quantsim.QuantizationSimModel"]], "compute_encodings() (aimet_torch.quantsim.quantizationsimmodel method)": [[87, "aimet_torch.quantsim.QuantizationSimModel.compute_encodings"]], "export() (aimet_torch.quantsim.quantizationsimmodel method)": [[87, "aimet_torch.quantsim.QuantizationSimModel.export"]], "load_checkpoint() (aimet_torch.quantsim method)": [[87, "aimet_torch.quantsim.load_checkpoint"]], "save_checkpoint() (aimet_torch.quantsim method)": [[87, "aimet_torch.quantsim.save_checkpoint"]], "visualizecompression (class in aimet_torch.visualize_serialized_data)": [[88, "aimet_torch.visualize_serialized_data.VisualizeCompression"]], "display_comp_ratio_plot() (aimet_torch.visualize_serialized_data.visualizecompression method)": [[88, "aimet_torch.visualize_serialized_data.VisualizeCompression.display_comp_ratio_plot"]], "display_eval_scores() (aimet_torch.visualize_serialized_data.visualizecompression method)": [[88, "aimet_torch.visualize_serialized_data.VisualizeCompression.display_eval_scores"]], "visualize_changes_after_optimization() (in module aimet_torch.visualize_model)": [[89, "aimet_torch.visualize_model.visualize_changes_after_optimization"]], "visualize_relative_weight_ranges_to_identify_problematic_layers() (in module aimet_torch.visualize_model)": [[89, 
"aimet_torch.visualize_model.visualize_relative_weight_ranges_to_identify_problematic_layers"]], "visualize_weight_ranges() (in module aimet_torch.visualize_model)": [[89, "aimet_torch.visualize_model.visualize_weight_ranges"]]}}) \ No newline at end of file diff --git a/releases/1.32.2/toplevelhidden.html b/releases/1.32.2/toplevelhidden.html new file mode 100644 index 00000000..6c7ce89c --- /dev/null +++ b/releases/1.32.2/toplevelhidden.html @@ -0,0 +1,1124 @@ + + + + + + <no title> — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/_images/AIMET_index_no_fine_tune.png b/releases/1.32.2/torch_v2/_images/AIMET_index_no_fine_tune.png new file mode 100644 index 00000000..59803a20 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/AIMET_index_no_fine_tune.png differ diff --git a/releases/1.32.2/torch_v2/_images/adaround.png b/releases/1.32.2/torch_v2/_images/adaround.png new file mode 100644 index 00000000..40c4c9a4 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/adaround.png differ diff --git a/releases/1.32.2/torch_v2/_images/auto_quant_v2_flowchart.png b/releases/1.32.2/torch_v2/_images/auto_quant_v2_flowchart.png new file mode 100644 index 00000000..3f97910b Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/auto_quant_v2_flowchart.png differ diff --git a/releases/1.32.2/torch_v2/_images/bias_correction_analytical.png b/releases/1.32.2/torch_v2/_images/bias_correction_analytical.png new file mode 100644 index 00000000..92e930ac Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/bias_correction_analytical.png differ diff --git a/releases/1.32.2/torch_v2/_images/bias_correction_empirical.png b/releases/1.32.2/torch_v2/_images/bias_correction_empirical.png new file mode 100644 index 00000000..7dfe940c Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/bias_correction_empirical.png differ diff --git a/releases/1.32.2/torch_v2/_images/bn_reestimation.png b/releases/1.32.2/torch_v2/_images/bn_reestimation.png new file mode 100644 index 00000000..93c2c2aa Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/bn_reestimation.png differ diff --git a/releases/1.32.2/torch_v2/_images/channel_pruning_1.png b/releases/1.32.2/torch_v2/_images/channel_pruning_1.png new file mode 100644 index 00000000..68953c87 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/channel_pruning_1.png differ diff --git a/releases/1.32.2/torch_v2/_images/cle_1.png b/releases/1.32.2/torch_v2/_images/cle_1.png new file mode 100644 index 00000000..56c09217 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/cle_1.png differ diff --git a/releases/1.32.2/torch_v2/_images/cle_4.png b/releases/1.32.2/torch_v2/_images/cle_4.png new file mode 100644 index 00000000..57741e47 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/cle_4.png differ diff --git a/releases/1.32.2/torch_v2/_images/cle_5.png b/releases/1.32.2/torch_v2/_images/cle_5.png new file mode 100644 index 00000000..07e35158 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/cle_5.png differ diff --git a/releases/1.32.2/torch_v2/_images/compression_flow.png b/releases/1.32.2/torch_v2/_images/compression_flow.png new file mode 100644 index 00000000..12d15895 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/compression_flow.png differ diff --git a/releases/1.32.2/torch_v2/_images/compression_use_case.PNG b/releases/1.32.2/torch_v2/_images/compression_use_case.PNG new file mode 100644 index 00000000..bc429099 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/compression_use_case.PNG differ diff --git a/releases/1.32.2/torch_v2/_images/cp_2.png b/releases/1.32.2/torch_v2/_images/cp_2.png new file mode 100644 index 00000000..d25a132b Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/cp_2.png differ diff --git a/releases/1.32.2/torch_v2/_images/cp_3.jpg b/releases/1.32.2/torch_v2/_images/cp_3.jpg new file mode 100644 index 00000000..8c02570c Binary files /dev/null and 
b/releases/1.32.2/torch_v2/_images/cp_3.jpg differ diff --git a/releases/1.32.2/torch_v2/_images/cp_4.jpg b/releases/1.32.2/torch_v2/_images/cp_4.jpg new file mode 100644 index 00000000..c047526b Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/cp_4.jpg differ diff --git a/releases/1.32.2/torch_v2/_images/flow_diagram_cle.png b/releases/1.32.2/torch_v2/_images/flow_diagram_cle.png new file mode 100644 index 00000000..a9e56767 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/flow_diagram_cle.png differ diff --git a/releases/1.32.2/torch_v2/_images/greedy_1.png b/releases/1.32.2/torch_v2/_images/greedy_1.png new file mode 100644 index 00000000..4e5afd97 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/greedy_1.png differ diff --git a/releases/1.32.2/torch_v2/_images/greedy_2.png b/releases/1.32.2/torch_v2/_images/greedy_2.png new file mode 100644 index 00000000..937d4b08 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/greedy_2.png differ diff --git a/releases/1.32.2/torch_v2/_images/greedy_3.png b/releases/1.32.2/torch_v2/_images/greedy_3.png new file mode 100644 index 00000000..0088528a Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/greedy_3.png differ diff --git a/releases/1.32.2/torch_v2/_images/greedy_4.jpg b/releases/1.32.2/torch_v2/_images/greedy_4.jpg new file mode 100644 index 00000000..653fc39f Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/greedy_4.jpg differ diff --git a/releases/1.32.2/torch_v2/_images/greedy_5.jpg b/releases/1.32.2/torch_v2/_images/greedy_5.jpg new file mode 100644 index 00000000..39b02ebe Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/greedy_5.jpg differ diff --git a/releases/1.32.2/torch_v2/_images/pytorch_model_prep_and_validate.PNG b/releases/1.32.2/torch_v2/_images/pytorch_model_prep_and_validate.PNG new file mode 100644 index 00000000..bec69113 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/pytorch_model_prep_and_validate.PNG differ diff --git a/releases/1.32.2/torch_v2/_images/quant_2.png b/releases/1.32.2/torch_v2/_images/quant_2.png new file mode 100644 index 00000000..5c81db4f Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/quant_2.png differ diff --git a/releases/1.32.2/torch_v2/_images/quant_3.png b/releases/1.32.2/torch_v2/_images/quant_3.png new file mode 100644 index 00000000..3e1bdcc9 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/quant_3.png differ diff --git a/releases/1.32.2/torch_v2/_images/quant_use_case_1.PNG b/releases/1.32.2/torch_v2/_images/quant_use_case_1.PNG new file mode 100644 index 00000000..93a7cb72 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/quant_use_case_1.PNG differ diff --git a/releases/1.32.2/torch_v2/_images/quant_use_case_2.PNG b/releases/1.32.2/torch_v2/_images/quant_use_case_2.PNG new file mode 100644 index 00000000..075ec7ee Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/quant_use_case_2.PNG differ diff --git a/releases/1.32.2/torch_v2/_images/quant_use_case_3.PNG b/releases/1.32.2/torch_v2/_images/quant_use_case_3.PNG new file mode 100644 index 00000000..c38a7d23 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/quant_use_case_3.PNG differ diff --git a/releases/1.32.2/torch_v2/_images/quantization_debugging_flow_chart.png b/releases/1.32.2/torch_v2/_images/quantization_debugging_flow_chart.png new file mode 100644 index 00000000..8ed4aba9 Binary files /dev/null and 
b/releases/1.32.2/torch_v2/_images/quantization_debugging_flow_chart.png differ diff --git a/releases/1.32.2/torch_v2/_images/quantization_workflow.PNG b/releases/1.32.2/torch_v2/_images/quantization_workflow.PNG new file mode 100644 index 00000000..1618222f Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/quantization_workflow.PNG differ diff --git a/releases/1.32.2/torch_v2/_images/quantsim_config_file.png b/releases/1.32.2/torch_v2/_images/quantsim_config_file.png new file mode 100644 index 00000000..a3d5c7a8 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/quantsim_config_file.png differ diff --git a/releases/1.32.2/torch_v2/_images/spatial_svd.png b/releases/1.32.2/torch_v2/_images/spatial_svd.png new file mode 100644 index 00000000..6686a254 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/spatial_svd.png differ diff --git a/releases/1.32.2/torch_v2/_images/vis_1.png b/releases/1.32.2/torch_v2/_images/vis_1.png new file mode 100644 index 00000000..78ace7ce Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/vis_1.png differ diff --git a/releases/1.32.2/torch_v2/_images/vis_3.png b/releases/1.32.2/torch_v2/_images/vis_3.png new file mode 100644 index 00000000..a00c69be Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/vis_3.png differ diff --git a/releases/1.32.2/torch_v2/_images/vis_4.png b/releases/1.32.2/torch_v2/_images/vis_4.png new file mode 100644 index 00000000..e6551c7c Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/vis_4.png differ diff --git a/releases/1.32.2/torch_v2/_images/vis_5.png b/releases/1.32.2/torch_v2/_images/vis_5.png new file mode 100644 index 00000000..f25ccccb Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/vis_5.png differ diff --git a/releases/1.32.2/torch_v2/_images/vis_6.png b/releases/1.32.2/torch_v2/_images/vis_6.png new file mode 100644 index 00000000..5e0dc63d Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/vis_6.png differ diff --git a/releases/1.32.2/torch_v2/_images/vis_7.png b/releases/1.32.2/torch_v2/_images/vis_7.png new file mode 100644 index 00000000..bfc33864 Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/vis_7.png differ diff --git a/releases/1.32.2/torch_v2/_images/weight_svd.png b/releases/1.32.2/torch_v2/_images/weight_svd.png new file mode 100644 index 00000000..1c2b548d Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/weight_svd.png differ diff --git a/releases/1.32.2/torch_v2/_images/winnow_1.png b/releases/1.32.2/torch_v2/_images/winnow_1.png new file mode 100644 index 00000000..ccbb3bfe Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/winnow_1.png differ diff --git a/releases/1.32.2/torch_v2/_images/winnow_2.png b/releases/1.32.2/torch_v2/_images/winnow_2.png new file mode 100644 index 00000000..bc5a123d Binary files /dev/null and b/releases/1.32.2/torch_v2/_images/winnow_2.png differ diff --git a/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/nn/base.html b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/nn/base.html new file mode 100644 index 00000000..90bdacb4 --- /dev/null +++ b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/nn/base.html @@ -0,0 +1,759 @@ + + + + + + aimet_torch.v2.nn.base — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_torch.v2.nn.base

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+"""Base class of quantized modules"""
+
+import abc
+import contextlib
+import itertools
+from typing import Type, List, Dict, Union, Iterable, Mapping, Optional
+
+import torch.nn as nn
+from torch import Tensor
+
+from aimet_torch.utils import is_vector_encoding
+from aimet_torch.v2.quantization.affine.encoding import VectorEncoding, AffineEncoding
+
+from aimet_torch.v2.quantization.tensor import QuantizedTensorBase
+from aimet_torch.v2.quantization.base import QuantizerBase
+from aimet_torch.v2.utils import (
+    patch_attr,
+    _ContextManager,
+    flatten_nn_module_list,
+)
+
+def _no_op(in_tensor):
+    return in_tensor
+
+
[docs]class BaseQuantizationMixin(abc.ABC): + """Mixin that implements quantization on top of regular pytorch modules. + + Attributes: + input_quantizers (nn.ModuleList): :class:`ModuleList` containing :class:`QuantizerBase` objects to be applied + to the layer's input tensors + output_quantizers (nn.ModuleList): :class:`ModuleList` containing :class:`QuantizerBase` objects to be applied + to the layer's output tensors + param_quantizers (nn.ModuleDict): :class:`ModuleDict` mapping parameter names to associated :class:`QuantizerBase` + objects + + """ + + input_quantizers: nn.ModuleList + output_quantizers: nn.ModuleList + param_quantizers: nn.ModuleDict + + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + self.__quant_init__() + +
[docs] def __quant_init__(self): + """Initializer for quantized module. This method will be invoked right after __init__. + + This method initializes the :attr:`input_quantizers`, :attr:`output_quantizers`, and :attr:`param_quantizers` + structures to the appropriate sizes based on the number of input tensors, output tensors, and parameters of the + base :class:`nn.Module` class. All quantizers are initialized to ``None``. + + For custom quantized classes, this method should be overridden to set the appropriate lengths of + :attr:`input_quantizers` and :attr:`output_quantizers` for the given base class. + """ + self.param_quantizers = nn.ModuleDict({ + name: None for name, _ in self.named_parameters(recurse=False) + }) + # Currently assume single input & output + self.input_quantizers = nn.ModuleList([None]) + self.output_quantizers = nn.ModuleList([None])
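    # For illustration only (not part of the original module): a subclass whose base
    # module consumes two floating-point inputs would typically override __quant_init__
    # to resize the input quantizer list, mirroring what _BaseQuantizedBinaryOpMixin
    # does later in this file:
    #
    #     def __quant_init__(self):
    #         super().__quant_init__()
    #         # Two input tensors -> two (initially disabled) input quantizer slots
    #         self.input_quantizers = nn.ModuleList([None, None])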
+ + def __call__(self, *args, **kwargs): + self._compute_param_encodings(overwrite=False) + return super().__call__(*args, **kwargs) + + @abc.abstractmethod + def forward(self, *args, **kwargs): + """Forward function for quantized module. + + This method will replace the original forward function of the base :class:`nn.Module` class and is + responsible for computing a quantized version of the base class' forward function using the configuration of + the layer's :class:`QuantizerBase` objects. + """ + return super().forward(*args, **kwargs) + + @contextlib.contextmanager + def _patch_quantized_parameters(self): + with contextlib.ExitStack() as stack: + for param_name, param_quantizer in self.param_quantizers.items(): + if param_quantizer: + orig_param = getattr(self, param_name) + quantized_param = param_quantizer(orig_param) + ctx = patch_attr(self, param_name, quantized_param) + stack.enter_context(ctx) + yield + + def _compute_param_encodings(self, overwrite: bool): + """ + :param bool overwrite: If True, the quantizers that are already initialized will also recompute encodings. + Otherwise, only the uninitialized quantizers will compute encodings. + """ + for param_name, param_quantizer in self.param_quantizers.items(): + if not param_quantizer: + continue + + if not param_quantizer._allow_overwrite: # pylint: disable=protected-access + continue + + if not param_quantizer.is_initialized() or overwrite: + param = getattr(self, param_name) + if param is not None: + with patch_attr(param_quantizer, "forward", _no_op), param_quantizer.compute_encodings(): + _ = param_quantizer(param) + + def compute_param_encodings(self): + """ Compute encodings of parameter quantizers """ + self._compute_param_encodings(overwrite=True) + +
[docs] @contextlib.contextmanager + def compute_encodings(self): + """Enters the :meth:`compute_encodings` context for all :class:`QuantizerBase` objects in the layer. + + Inside this context, each quantizer will observe all inputs passed to the quantizer and will compute + quantization encodings upon exiting the context. + + Example: + + >>> qlinear = QuantizedLinear(10, 10) + >>> qlinear.output_quantizers[0] = Quantize((1, ), 8, symmetric=False) + >>> with qlinear.compute_encodings(): + >>> qlinear(torch.randn(16, 10)) + >>> print(qlinear.output_quantizers[0].is_initialized()) + True + + """ + self._compute_param_encodings(overwrite=True) + + with contextlib.ExitStack() as stack: + input_quantizers = flatten_nn_module_list(self.input_quantizers) + output_quantizers = flatten_nn_module_list(self.output_quantizers) + + for quantizer in itertools.chain(input_quantizers, output_quantizers): + if not isinstance(quantizer, QuantizerBase): + continue + + if not quantizer._allow_overwrite: # pylint: disable=protected-access + continue + + # Set input/output quantizers into pass-through mode during compute_encodings + # NOTE: This behavior is for backward-compatibility with V1 quantsim. + stack.enter_context(patch_attr(quantizer, 'forward', _no_op)) + + ctx = quantizer.compute_encodings() + stack.enter_context(ctx) + + yield
+ + @classmethod + @abc.abstractmethod + def wrap(cls, module_cls: Type[nn.Module]): + """ + Wrap a regular module class into a quantized module class + """ + + @classmethod + def from_module(cls, module: nn.Module): + r"""Create an instance of quantized module from a regular module instance. + + The resulting quantized module contains the same attributes and parameters as the original module, but may + be assigned input, output and parameter quantizers. + + :param module: Floating point module to quantize + :return: Quantized version of the original module + + Example: + + >>> linear = torch.nn.Linear(10, 10) + >>> quantized_linear = FakeQuantizationMixin.from_module(linear) + >>> print(quantized_linear.weight is linear.weight) + True + >>> print(quantized_linear.param_quantizers) + ModuleDict( + (weight): None + (bias): None + ) + """ + # pylint: disable=protected-access + module_cls = type(module) + qtzn_module_cls = cls.cls_to_qcls.get(module_cls, None) + + if not qtzn_module_cls: + raise RuntimeError( + f'The quantized module definition of {module_cls} is not registered. ' + f'Please register the quantized module definition of {module_cls} ' + f'using `@{cls.__name__}.implements({module_cls.__name__})` decorator.' + ) + + qtzn_module = cls.__new__(qtzn_module_cls) + + qtzn_module.__dict__ = module.__dict__.copy() + qtzn_module._modules = module._modules.copy() + qtzn_module._parameters = module._parameters.copy() + qtzn_module._buffers = module._buffers.copy() + + qtzn_module.__quant_init__() + return qtzn_module + + def export_input_encodings(self) -> List[List[Dict]]: + """ + Returns a list of input encodings, each represented as a List of Dicts + """ + return [ + quantizer.get_legacy_encodings() if isinstance(quantizer, QuantizerBase) else None + for quantizer in flatten_nn_module_list(self.input_quantizers) + ] + + def import_input_encodings(self, + encodings: Mapping[str, Mapping], + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + """ + Import input encodings represented in below format: + { + '0': dict, + '1': dict, + ...
+ } + + :param encodings: Dictionary mapping quantizer index (str) to encoding (dict) + :param ignore_when_quantizer_disabled: If True, does not raise RuntimeError when a quantizer is disabled + :param disable_quantizer_without_encoding: If True, disable any quantizer without an encoding in `encodings` + :param freeze: If True, freezes the quantizer's encodings after loading + """ + for i, quantizer in enumerate(list(self.input_quantizers)): + if quantizer and not quantizer._allow_overwrite: # pylint: disable=protected-access + continue + encoding = encodings.get(str(i), None) + if not encoding: + if not partial: + # Dangling quantizers have to be removed when importing non-partial encodings + self.input_quantizers[i] = None + continue + if quantizer is None: + if strict: + raise RuntimeError + continue + if isinstance(encoding, dict): + encoding = [encoding] + quantizer.set_legacy_encodings(encoding) + + if requires_grad is not None: + quantizer.requires_grad_(requires_grad) + + quantizer.allow_overwrite(allow_overwrite) + + def export_output_encodings(self) -> List[List[Dict]]: + """ + Returns a list of output encodings, each represented as a List of Dicts + """ + return [ + quantizer.get_legacy_encodings() if isinstance(quantizer, QuantizerBase) else None + for quantizer in flatten_nn_module_list(self.output_quantizers) + ] + + def import_output_encodings(self, + encodings: Mapping[str, Mapping], + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + """ + Import output encodings represented in below format: + { + '0': dict, + '1': dict, + ... + } + + :param encodings: Dictionary mapping quantizer index (str) to encoding (dict) + :param ignore_when_quantizer_disabled: If True, does not raise RuntimeError when a quantizer is disabled + :param disable_quantizer_without_encoding: If True, disable any quantizer without an encoding in `encodings` + :param freeze: If True, freezes the quantizer's encodings after loading + """ + for i, quantizer in enumerate(list(self.output_quantizers)): + if quantizer and not quantizer._allow_overwrite: # pylint: disable=protected-access + continue + encoding = encodings.get(str(i), None) + if not encoding: + if not partial: + # Dangling quantizers have to be removed when importing non-partial encodings + self.output_quantizers[i] = None + continue + if quantizer is None: + if strict: + raise RuntimeError + continue + if isinstance(encoding, dict): + encoding = [encoding] + quantizer.set_legacy_encodings(encoding) + + if requires_grad is not None: + quantizer.requires_grad_(requires_grad) + + quantizer.allow_overwrite(allow_overwrite) + + def export_param_encodings(self) -> Dict[str, List[Dict]]: + """ + Returns a dict of {param name: param encodings}, with each encoding represented as a List of Dicts + """ + encodings = { + param_name: quantizer.get_legacy_encodings() if isinstance(quantizer, QuantizerBase) else None + for param_name, quantizer in self.param_quantizers.items() + } + for param_name, quantizer in self.param_quantizers.items(): + param = getattr(self, param_name) + if isinstance(quantizer, QuantizerBase): + e = encodings[param_name] + elif isinstance(param, QuantizedTensorBase) and param.encoding is not None: + # If parameter itself is an already-quantized tensor, + # export the encoding held by the parameter + e = param.encoding._to_legacy_format() # pylint: disable=protected-access + else: + e = None + encodings[param_name] = e + + return encodings + + def import_param_encodings(self, + encodings: 
Mapping[str, Mapping], + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + """ + Import parameter encodings represented in below format: + { + 'param_name_0': [dict, dict, ...], + 'param_name_1': [dict, dict, ...], + ... + } + + :param encodings: Dictionary mapping quantizer parameter name (str) to encodings (dict) + :param ignore_when_quantizer_disabled: If True, does not raise RuntimeError when a quantizer is disabled + :param disable_quantizer_without_encoding: If True, disable any quantizer without an encoding in `encodings` + :param freeze: If True, freezes the quantizer's encodings after loading + """ + for param_name, quantizer in dict(self.param_quantizers).items(): + if quantizer and not quantizer._allow_overwrite: # pylint: disable=protected-access + continue + encoding = encodings.get(param_name, None) + + if is_vector_encoding(encoding): + # Vector encodings will be held directly by weights, not by quantizers. + quantizer.set_legacy_encodings(encoding) + param = getattr(self, param_name) + rounded_weight = quantizer(param) + # At this point, rounded_weight is a quantized tensor with affine encoding + # since quantizer is an affine quantizer + assert isinstance(rounded_weight, QuantizedTensorBase) + assert isinstance(rounded_weight.encoding, AffineEncoding) + e = rounded_weight.encoding + # Convert affine encoding to vector encoding + vector_encoding_properties = { + "rows_per_block": encoding[0]["rows_per_block"], + "cols_per_block": encoding[0]["cols_per_block"], + "vector_dim": encoding[0]["vector_dim"], + "vector_stride": encoding[0]["vector_stride"], + "index_bw": encoding[0]["index_bw"], + } + rounded_weight.encoding = VectorEncoding(e.scale, + e.offset, + e.bitwidth, + e.signed, + e.symmetry, + block_size=None, + **vector_encoding_properties) + setattr(self, param_name, nn.Parameter(rounded_weight)) + # Remove associated quantizer since the weight is holding already-quantized values + self.param_quantizers[param_name] = None + + if not encoding: + if not partial: + # Dangling quantizers have to be removed when importing non-partial encodings + self.param_quantizers[param_name] = None + continue + if quantizer is None: + if strict: + raise RuntimeError + continue + if isinstance(encoding, dict): + encoding = [encoding] + quantizer.set_legacy_encodings(encoding) + + if requires_grad is not None: + quantizer.requires_grad_(requires_grad) + + quantizer.allow_overwrite(allow_overwrite) + + def get_original_module(self) -> nn.Module: + """Returns the floating point version of the quantized module + + Returns: + A floating point module with quantizers removed + + Example: + + >>> qlinear = QuantizedLinear(10, 20, bias=False) + >>> linear = qlinear.get_original_module() + >>> linear + Linear(in_features=10, out_features=20, bias=False) + >>> linear.weight is qlinear.weight + True + + """ + # pylint: disable=protected-access + + qtzn_module_cls = type(self) + orig_module_cls = self.qcls_to_cls.get(qtzn_module_cls) + + orig_module = self.__new__(orig_module_cls) + orig_module.__dict__ = self.__dict__.copy() + orig_module.__dict__.pop('forward', None) + + orig_module._parameters = self._parameters.copy() + orig_module._buffers = self._buffers.copy() + orig_module._modules = self._modules.copy() + del orig_module._modules['input_quantizers'] + del orig_module._modules['output_quantizers'] + del orig_module._modules['param_quantizers'] + + return orig_module + + def _remove_input_quantizers(self, indices: Union[int, Iterable[int]] = None): + 
""" + Remove input quantizers + :param indices: Indices of input quantizers to remove. + If None, all input quantizers will be removed. + """ + if isinstance(indices, int): + indices = [indices] + elif indices is None: + indices = list(range(len(self.input_quantizers))) + return _remove_quantizers(self.input_quantizers, indices) + + def _remove_param_quantizers(self, keys: Union[str, Iterable[str]] = None): + """ + Remove parameter quantizers + :param indices: Indices of parameter quantizers to remove. + If None, all input quantizers will be removed. + """ + if isinstance(keys, str): + keys = [keys] + elif keys is None: + keys = list(self.param_quantizers.keys()) + return _remove_quantizers(self.param_quantizers, keys) + + def _remove_output_quantizers(self, indices: Union[int, Iterable[int]] = None): + """ + Remove output quantizers + :param indices: Indices of input quantizers to remove. + If None, all input quantizers will be removed. + """ + if isinstance(indices, int): + indices = [indices] + elif indices is None: + indices = list(range(len(self.output_quantizers))) + return _remove_quantizers(self.output_quantizers, indices) + + def _remove_activation_quantizers(self): + """ Remove all activation quantizers """ + # pylint: disable=protected-access + ctx_1 = self._remove_output_quantizers() + ctx_2 = self._remove_input_quantizers() + return _ContextManager(action=lambda: None, + cleanup=lambda: (ctx_1._cleanup(), ctx_2._cleanup())) + + def _remove_all_quantizers(self): + """ Remove all quantizers """ + # pylint: disable=protected-access + ctx_1 = self._remove_activation_quantizers() + ctx_2 = self._remove_param_quantizers() + return _ContextManager(action=lambda: None, + cleanup=lambda: (ctx_1._cleanup(), ctx_2._cleanup()))
+ +class _BaseQuantizedUnaryOpMixin(BaseQuantizationMixin): + def forward(self, *args, **kwargs) -> Tensor: # pylint: disable=missing-function-docstring + x, *others = args + + if isinstance(x, Tensor) and x.is_floating_point() and self.input_quantizers[0]: + x = self.input_quantizers[0](x) + + with self._patch_quantized_parameters(): + output = super().forward(x, *others, **kwargs) + + if isinstance(output, Tensor) and output.is_floating_point() and self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + +class _BaseQuantizedBinaryOpMixin(BaseQuantizationMixin): + def __quant_init__(self): + super().__quant_init__() + self.input_quantizers = nn.ModuleList([None, None]) + + def forward(self, *args, **kwargs) -> Tensor: # pylint: disable=missing-function-docstring + x, y, *others = args + + if isinstance(x, Tensor) and x.is_floating_point() and self.input_quantizers[0]: + x = self.input_quantizers[0](x) + + if isinstance(y, Tensor) and y.is_floating_point() and self.input_quantizers[1]: + y = self.input_quantizers[1](y) + + with self._patch_quantized_parameters(): + output = super().forward(x, y, *others, **kwargs) + + if isinstance(output, Tensor) and output.is_floating_point() and self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +class _BaseQuantizedTernaryOpMixin(BaseQuantizationMixin): + def __quant_init__(self): + super().__quant_init__() + self.input_quantizers = nn.ModuleList([None, None, None]) + + def forward(self, *args, **kwargs) -> Tensor: # pylint: disable=missing-function-docstring + x, y, z, *others = args + + if isinstance(x, Tensor) and x.is_floating_point() and self.input_quantizers[0]: + x = self.input_quantizers[0](x) + + if isinstance(y, Tensor) and y.is_floating_point() and self.input_quantizers[1]: + y = self.input_quantizers[1](y) + + if isinstance(z, Tensor) and z.is_floating_point() and self.input_quantizers[2]: + z = self.input_quantizers[2](z) + + with self._patch_quantized_parameters(): + output = super().forward(x, y, z, *others, **kwargs) + + if isinstance(output, Tensor) and output.is_floating_point() and self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +def _remove_quantizers(quantizers, keys): + orig_quantizers = {key: quantizers[key] for key in keys} + + def restore_quantizers(): + for key, orig_qtzr in orig_quantizers.items(): + quantizers[key] = orig_qtzr + + ctx = _ContextManager(action=lambda: None, + cleanup=restore_quantizers) + + try: + for key in keys: + quantizers[key] = None + except Exception: + ctx._cleanup() # pylint: disable=protected-access + raise + else: + return ctx +
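For illustration, a minimal end-to-end sketch tying together the pieces documented above (from_module, compute_encodings, get_original_module). The QuantizeDequantize import path and constructor signature are assumptions modeled on the Quantize((1, ), 8, symmetric=False) example in the compute_encodings docstring and may differ between AIMET versions.

import torch
from torch import nn
from aimet_torch.v2.nn.fake_quant import FakeQuantizationMixin
from aimet_torch.v2.quantization.affine import QuantizeDequantize  # assumed import path

linear = nn.Linear(10, 20)
qlinear = FakeQuantizationMixin.from_module(linear)          # shares parameters with `linear`

# Attach quantizers; every slot starts out as None (pass-through)
qlinear.param_quantizers["weight"] = QuantizeDequantize((1,), 8, symmetric=True)
qlinear.output_quantizers[0] = QuantizeDequantize((1,), 8, symmetric=False)

# Observe representative data; encodings are computed when the context exits
with qlinear.compute_encodings():
    _ = qlinear(torch.randn(32, 10))

assert qlinear.output_quantizers[0].is_initialized()
fp_linear = qlinear.get_original_module()                    # floating-point view, quantizers removed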
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/nn/fake_quant.html b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/nn/fake_quant.html new file mode 100644 index 00000000..8c609f34 --- /dev/null +++ b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/nn/fake_quant.html @@ -0,0 +1,1127 @@ + + + + + + aimet_torch.v2.nn.fake_quant — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_torch.v2.nn.fake_quant

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+#pylint: disable=too-many-lines
+"""Fake-quantized modules"""
+
+from collections import OrderedDict
+from typing import Type, Optional, Tuple
+import abc
+import warnings
+
+from torch import Tensor
+import torch.nn as nn
+from torch.nn.modules.adaptive import _ASMoutput
+from torch.nn.utils.rnn import PackedSequence
+from torch.utils._pytree import tree_map
+
+import aimet_torch.elementwise_ops as aimet_ops
+
+from .base import BaseQuantizationMixin, _BaseQuantizedUnaryOpMixin, _BaseQuantizedBinaryOpMixin, _BaseQuantizedTernaryOpMixin # pylint: disable=import-error
+
+
+class FakeQuantMeta(abc.ABCMeta):
+    """Sets :meth:`forward` to :meth:`quantized_forward` if only :meth:`quantized_forward` is defined
+    """
+
+    def __new__(mcs, name, bases, namespace, **kwargs):
+        if "quantized_forward" in namespace and "forward" not in namespace:
+            warnings.warn("Support for defining `quantized_forward` in place of `forward` method will be deprecated, "
+                          "please use `forward` instead.",
+                          DeprecationWarning, stacklevel=2)
+            namespace["forward"] = namespace["quantized_forward"]
+        return super().__new__(mcs, name, bases, namespace, **kwargs)
+
+
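# For illustration: a subclass that only defines `quantized_forward` still works but
# triggers the DeprecationWarning above; new code should define `forward` directly.
# The names below are hypothetical:
#
#     class MyQuantizedOp(FakeQuantizationMixin, MyOp):
#         def forward(self, x):              # preferred spelling
#             ...
#         # def quantized_forward(self, x):  # deprecated spelling, aliased to forward
#         #     ...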
+
[docs]class FakeQuantizationMixin(BaseQuantizationMixin, metaclass=FakeQuantMeta): # pylint: disable=abstract-method + """Mixin that implements fake-quantization on top of regular pytorch modules. + + Specifically, a fake-quantized module will quantize input, output, and parameter tensors with + its held :class:`QuantizerBase` objects during the :meth:`forward` method and use the inherited :class:`torch.nn.Module` + forward method to compute the layer operation. If all input, output, and parameter quantizers are ``None``, a + fake-quantized module will behave exactly the same as its parent :class:`torch.nn.Module`. + + A fake-quantized module can be initialized from scratch using the same syntax as the parent module, or can be + formed from an existing module using the :meth:`from_module` method. + + Attributes: + input_quantizers (nn.ModuleList): :class:`ModuleList` containing :class:`QuantizerBase` objects to be applied + to the layer's input tensors + output_quantizers (nn.ModuleList): :class:`ModuleList` containing :class:`QuantizerBase` objects to be applied + to the layer's output tensors + param_quantizers (nn.ModuleDict): :class:`ModuleDict` mapping parameter names to associated :class:`QuantizerBase` + objects + + Examples: + + >>> qlinear = FakeQuantizedLinear(in_features=10, out_features=20, bias=False) + >>> print(qlinear) + FakeQuantizedLinear( + in_features=10, out_features=20, bias=False + (param_quantizers): ModuleDict( + (weight): None + ) + (input_quantizers): ModuleList( + (0): None + ) + (output_quantizers): ModuleList( + (0): None + ) + ) + + + >>> linear = torch.nn.Linear(in_features=10, out_features=20, bias=True) + >>> qlinear = FakeQuantizationMixin.from_module(linear) + >>> print(qlinear) + FakeQuantizedLinear( + in_features=10, out_features=20, bias=True + (param_quantizers): ModuleDict( + (weight): None + (bias): None + ) + (input_quantizers): ModuleList( + (0): None + ) + (output_quantizers): ModuleList( + (0): None + ) + ) + >>> qlinear.weight is linear.weight + True + + """ + + cls_to_qcls = OrderedDict() # original class -> quantized class + qcls_to_cls = OrderedDict() # quantized class -> original class + + @abc.abstractmethod + def forward(self, *args, **kwargs): + """Computes a fake-quantized version of the parent module's forward method. + + The :meth:`forward` method should perform the following logic in order: + + 1) Apply existing input quantizers to input tensors + 2) Apply existing param quantizers to the layer's parameters + 3) Call the inherited :class:`torch.nn.Module` forward method with quantized inputs and parameters + 4) Apply existing output quantizers to the outputs of the forward method + + If all input, output, and parameter quantizers are ``None``, this method will behave exactly the same as + its parent module's forward pass. + """ + return super().forward(*args, **kwargs) + + @classmethod + def wrap(cls, module_cls: Type[nn.Module]) -> Type[nn.Module]: + """ + Wrap a regular module class into a fake-quantized module class + """ + if not issubclass(module_cls, nn.Module): + raise ValueError("Expected module_cls to be a subclass of torch.nn.Module. " + f"Got {module_cls}.") + if module_cls in cls.cls_to_qcls: + return cls.cls_to_qcls[module_cls] + + quantized_cls_name = f"FakeQuantized{module_cls.__name__}" + base_classes = (cls, module_cls) + quantized_cls = type(quantized_cls_name, base_classes, {'__module__': __name__}) + return cls.implements(module_cls)(quantized_cls) +
[docs] @classmethod + def implements(cls, module_cls): + """Decorator for registering a fake-quantized implementation of the given base class. + + This decorator registers the defined class as the fake-quantized version of module_cls such that calling + :meth:`from_module` on an instance of module_cls will output an instance of the decorated class. + + Args: + module_cls: The base :class:`torch.nn.Module` class + + """ + def wrapper(quantized_cls): + cls.cls_to_qcls[module_cls] = quantized_cls + cls.qcls_to_cls[quantized_cls] = module_cls + return quantized_cls + return wrapper
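# For illustration, a hedged sketch of registering a fake-quantized definition for a
# hypothetical custom module, following the input -> params -> base forward -> output
# recipe documented in FakeQuantizationMixin.forward (MyCustomOp is a made-up name):
#
#     @FakeQuantizationMixin.implements(MyCustomOp)
#     class FakeQuantizedMyCustomOp(FakeQuantizationMixin, MyCustomOp):
#         def forward(self, x):
#             if self.input_quantizers[0]:
#                 x = self.input_quantizers[0](x)
#             with self._patch_quantized_parameters():
#                 out = super().forward(x)
#             if self.output_quantizers[0]:
#                 out = self.output_quantizers[0](out)
#             return out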
+ + +class _FakeQuantizedUnaryOpMixin(_BaseQuantizedUnaryOpMixin, FakeQuantizationMixin): # pylint: disable=abstract-method + pass + +class _FakeQuantizedBinaryOpMixin(_BaseQuantizedBinaryOpMixin, FakeQuantizationMixin): # pylint: disable=abstract-method + pass + +class _FakeQuantizedTernaryOpMixin(_BaseQuantizedTernaryOpMixin, FakeQuantizationMixin): # pylint: disable=abstract-method + pass + + + +######################## +### torch.nn.Modules ### +######################## + +# Below are the lists of modules with regular code patterns +# that takes tensors as the first N arguments and returns single tensor as output +_TORCH_NN_UNARY_MODULES = [ + nn.AdaptiveAvgPool1d, + nn.AdaptiveAvgPool2d, + nn.AdaptiveAvgPool3d, + nn.AdaptiveMaxPool1d, + nn.AdaptiveMaxPool2d, + nn.AdaptiveMaxPool3d, + nn.AlphaDropout, + nn.AvgPool1d, + nn.AvgPool2d, + nn.AvgPool3d, + nn.BatchNorm1d, + nn.BatchNorm2d, + nn.BatchNorm3d, + nn.CELU, + nn.ChannelShuffle, + nn.ConstantPad1d, + nn.ConstantPad2d, + nn.ConstantPad3d, + nn.Conv1d, + nn.Conv2d, + nn.Conv3d, + nn.ConvTranspose1d, + nn.ConvTranspose2d, + nn.ConvTranspose3d, + nn.CrossMapLRN2d, + nn.Dropout, + # nn.Dropout1d, # Not supported in torch < 1.12 + nn.Dropout2d, + nn.Dropout3d, + nn.ELU, + nn.FeatureAlphaDropout, + nn.Flatten, + nn.Fold, + nn.FractionalMaxPool2d, + nn.FractionalMaxPool3d, + nn.GELU, + nn.GLU, + nn.GroupNorm, + nn.Hardshrink, + nn.Hardsigmoid, + nn.Hardswish, + nn.Hardtanh, + nn.Identity, + nn.InstanceNorm1d, + nn.InstanceNorm2d, + nn.InstanceNorm3d, + nn.LPPool1d, + nn.LPPool2d, + nn.LayerNorm, + nn.LeakyReLU, + nn.Linear, + nn.LocalResponseNorm, + nn.LogSigmoid, + nn.LogSoftmax, + nn.MaxPool1d, + nn.MaxPool2d, + nn.MaxPool3d, + nn.MaxUnpool1d, + nn.MaxUnpool2d, + nn.MaxUnpool3d, + nn.Mish, + nn.PReLU, + nn.PixelShuffle, + nn.PixelUnshuffle, + nn.RReLU, + nn.ReLU, + nn.ReLU6, + nn.ReflectionPad1d, + nn.ReflectionPad2d, + # nn.ReflectionPad3d, # Not supported in torch < 1.10 + nn.ReplicationPad1d, + nn.ReplicationPad2d, + nn.ReplicationPad3d, + nn.SELU, + nn.SiLU, + nn.Sigmoid, + nn.Softmax, + nn.Softmax2d, + nn.Softmin, + nn.Softplus, + nn.Softshrink, + nn.Softsign, + nn.SyncBatchNorm, + nn.Tanh, + nn.Tanhshrink, + nn.Threshold, + nn.Unflatten, + nn.Unfold, + nn.Upsample, + nn.UpsamplingBilinear2d, + nn.UpsamplingNearest2d, + nn.ZeroPad2d, +] +_TORCH_NN_BINARY_MODULES = [ + nn.BCELoss, + nn.BCEWithLogitsLoss, + nn.Bilinear, + nn.CTCLoss, + nn.CosineSimilarity, + nn.CrossEntropyLoss, + nn.HingeEmbeddingLoss, + nn.HuberLoss, + nn.KLDivLoss, + nn.L1Loss, + nn.MSELoss, + nn.MultiLabelMarginLoss, + nn.MultiLabelSoftMarginLoss, + nn.MultiMarginLoss, + nn.NLLLoss, + nn.NLLLoss2d, + nn.PairwiseDistance, + nn.PoissonNLLLoss, + nn.SmoothL1Loss, + nn.SoftMarginLoss, +] +_TORCH_NN_TERNARY_MODULES = [ + nn.CosineEmbeddingLoss, + nn.GaussianNLLLoss, + nn.MarginRankingLoss, + nn.TripletMarginLoss, + nn.TripletMarginWithDistanceLoss, +] + + +def _register_global_variable(var_name, obj): + if var_name in globals(): + raise RuntimeError(f'Variable name "{var_name}" already exists in the global namespace.') + globals().update({var_name: obj}) + + +# Auto-generate quantized module definitions for regular-patterned modules +for _module_cls in _TORCH_NN_UNARY_MODULES: + _quantized_cls = _FakeQuantizedUnaryOpMixin.wrap(_module_cls) + _register_global_variable(_quantized_cls.__name__, _quantized_cls) + +for _module_cls in _TORCH_NN_BINARY_MODULES: + _quantized_cls = _FakeQuantizedBinaryOpMixin.wrap(_module_cls) + 
_register_global_variable(_quantized_cls.__name__, _quantized_cls) + +for _module_cls in _TORCH_NN_TERNARY_MODULES: + _quantized_cls = _FakeQuantizedTernaryOpMixin.wrap(_module_cls) + _register_global_variable(_quantized_cls.__name__, _quantized_cls) + + +@FakeQuantizationMixin.implements(nn.Embedding) +class FakeQuantizedEmbedding(FakeQuantizationMixin, nn.Embedding): + """ + Quantized class definition for nn.Embedding. + """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([]) # nn.Embedding takes no float input + self.output_quantizers = nn.ModuleList([None]) + + def forward(self, input: Tensor) -> Tensor: # pylint: disable=arguments-differ + """ + Quantized forward impl for nn.Embedding. + """ + # pylint: disable=redefined-builtin + + with self._patch_quantized_parameters(): + output = super().forward(input) + + if self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +@FakeQuantizationMixin.implements(nn.EmbeddingBag) +class FakeQuantizedEmbeddingBag(FakeQuantizationMixin, nn.EmbeddingBag): + """ + Quantized class definition for nn.EmbeddingBag. + """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([None]) + self.output_quantizers = nn.ModuleList([None]) + + def forward(self, # pylint: disable=arguments-differ + input: Tensor, + offsets: Optional[Tensor] = None, + per_sample_weights: Optional[Tensor] = None) -> Tensor: + """ + Quantized forward impl for nn.EmbeddingBag. + """ + # pylint: disable=redefined-builtin + + if self.input_quantizers[0]: + per_sample_weights = self.input_quantizers[0](per_sample_weights) + + with self._patch_quantized_parameters(): + output = super().forward(input, offsets, per_sample_weights) + + if self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +class _FakeQuantizedRNNBaseMixin(FakeQuantizationMixin): + def __quant_init__(self): + super().__quant_init__() + self.input_quantizers = nn.ModuleList([None, None]) + self.output_quantizers = nn.ModuleList([None, None]) + + def forward(self, input, hx: Optional[Tensor] = None): # pylint: disable=arguments-differ + """ + Quantized forward impl for nn.GRU and nn.RNN. + """ + # pylint: disable=redefined-builtin + + if self.input_quantizers[0]: + if isinstance(input, PackedSequence): + data, *others = input + quantized_data = self.input_quantizers[0](data) + input = PackedSequence(quantized_data, *others) + else: + input = self.input_quantizers[0](input) + + if hx is not None and self.input_quantizers[1]: + hx = self.input_quantizers[1](hx) + + with self._patch_quantized_parameters(): + output, hidden = super().forward(input, hx) + + if self.output_quantizers[0]: + if isinstance(output, PackedSequence): + data, *others = output + quantized_data = self.output_quantizers[0](data) + output = PackedSequence(quantized_data, *others) + else: + output = self.output_quantizers[0](output) + + if self.output_quantizers[1]: + hidden = self.output_quantizers[1](hidden) + + return output, hidden + +FakeQuantizedGRU = _FakeQuantizedRNNBaseMixin.wrap(nn.GRU) +FakeQuantizedRNN = _FakeQuantizedRNNBaseMixin.wrap(nn.RNN) + + +class _FakeQuantizedRNNCellBaseMixin(_FakeQuantizedBinaryOpMixin): + def forward(self, input: Tensor, hx: Optional[Tensor] = None) -> Tensor: # pylint: disable=arguments-differ + """ + Quantized forward impl for nn.GRUCell and nn.RNNCell. 
+ """ + # pylint: disable=redefined-builtin + + if self.input_quantizers[0]: + input = self.input_quantizers[0](input) + + if hx is not None and self.input_quantizers[1]: + hx = self.input_quantizers[1](hx) + + with self._patch_quantized_parameters(): + output = super().forward(input, hx) + + if self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + +FakeQuantizedGRUCell = _FakeQuantizedRNNCellBaseMixin.wrap(nn.GRUCell) +FakeQuantizedRNNCell = _FakeQuantizedRNNCellBaseMixin.wrap(nn.RNNCell) + + +@FakeQuantizationMixin.implements(nn.LSTM) +class FakeQuantizedLSTM(FakeQuantizationMixin, nn.LSTM): + """ + Quantized class definition for nn.LSTM. + """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([None, nn.ModuleList([None, None])]) + self.output_quantizers = nn.ModuleList([None, nn.ModuleList([None, None])]) + + def forward(self, input, hx: Optional[Tuple[Tensor, Tensor]] = None): # pylint: disable=arguments-differ + """ + Quantized forward impl for nn.LSTM. + """ + # pylint: disable=redefined-builtin + + if isinstance(input, PackedSequence) and self.input_quantizers[0]: + data, *others = input + quantized_data = self.input_quantizers[0](data) + input = PackedSequence(quantized_data, *others) + + if hx is not None: + h, c = hx + h_quantizer, c_quantizer = self.input_quantizers[1] + + if h_quantizer: + h = h_quantizer(h) + if c_quantizer: + c = c_quantizer(c) + hx = (h, c) + + with self._patch_quantized_parameters(): + output, hidden = super().forward(input, hx) + + if self.output_quantizers[0]: + if isinstance(output, PackedSequence): + data, *others = output + quantized_data = self.output_quantizers[0](data) + output = PackedSequence(quantized_data, *others) + else: + output = self.output_quantizers[0](output) + + h_n, c_n = hidden + h_quantizer, c_quantizer = self.output_quantizers[1] + + if h_quantizer: + h_n = h_quantizer(h_n) + if c_quantizer: + c_n = c_quantizer(c_n) + hidden = (h_n, c_n) + + return output, hidden + + +@FakeQuantizationMixin.implements(nn.LSTMCell) +class FakeQuantizedLSTMCell(FakeQuantizationMixin, nn.LSTMCell): + """ + Quantized class definition for nn.LSTMCell. + """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([None, nn.ModuleList([None, None])]) + self.output_quantizers = nn.ModuleList([None]) + + def forward(self, input: Tensor, hx: Optional[Tuple[Tensor, Tensor]] = None): # pylint: disable=arguments-differ + """ + Quantized forward impl for nn.LSTMCell. + """ + # pylint: disable=redefined-builtin + + if self.input_quantizers[0]: + input = self.input_quantizers[0](input) + + if hx is not None: + h, c = hx + h_quantizer, c_quantizer = self.input_quantizers[1] + if h_quantizer: + h = h_quantizer(h) + if c_quantizer: + c = c_quantizer(c) + hx = (h, c) + + with self._patch_quantized_parameters(): + output = super().forward(input, hx) + + if self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +@FakeQuantizationMixin.implements(nn.AdaptiveLogSoftmaxWithLoss) +class FakeQuantizedAdaptiveLogSoftmaxWithLoss(FakeQuantizationMixin, nn.AdaptiveLogSoftmaxWithLoss): + """ + Quantized class definition for nn.AdaptiveLogSoftmaxWithLoss. 
+ """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([None, None]) + self.output_quantizers = nn.ModuleList([None, None]) + + def forward(self, input_: Tensor, target_: Tensor) -> Tensor: # pylint: disable=arguments-differ + """ + Quantized forward impl for nn.AdaptiveLogSoftmaxWithLoss. + """ + if self.input_quantizers[0]: + input_ = self.input_quantizers[0](input_) + + if self.input_quantizers[1]: + target_ = self.input_quantizers[1](target_) + + with self._patch_quantized_parameters(): + outputs = super().forward(input_, target_) + + output, loss = outputs + + if self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + if self.output_quantizers[1]: + loss = self.output_quantizers[1](loss) + + return _ASMoutput(output, loss) + + + +# Quantized definitions of the following nn.Modules are intentionally omitted: +# * nn.MultiheadAttention +# * nn.Transformer +# * nn.TransformerDecoder +# * nn.TransformerDecoderLayer +# * nn.TransformerEncoder +# * nn.TransformerEncoderLayer + + + + + +########################### +### AIMET V1 custom ops ### +########################### + +# These class names are already occupied by torch.nn.Modules. +# To avoid name collision, we add prefix "Aimet" to the variable names as an ad-hoc workaraound. +FakeQuantizedAimetChannelShuffle = _FakeQuantizedUnaryOpMixin.wrap(aimet_ops.ChannelShuffle) +FakeQuantizedAimetMaxPool2d = _FakeQuantizedUnaryOpMixin.wrap(aimet_ops.MaxPool2d) +FakeQuantizedAimetAdaptiveAvgPool2d = _FakeQuantizedUnaryOpMixin.wrap(aimet_ops.AdaptiveAvgPool2d) +FakeQuantizedAimetAvgPool2d = _FakeQuantizedUnaryOpMixin.wrap(aimet_ops.AvgPool2d) + +_AIMET_V1_UNARY_MODULES = [ + aimet_ops.AMax, + aimet_ops.AMin, + aimet_ops.Cast, + aimet_ops.DepthToSpaceCRDMode, + aimet_ops.DepthToSpaceDCRMode, + aimet_ops.OneHot, + aimet_ops.Exponential, + aimet_ops.Erf, + aimet_ops.Sqrt, + aimet_ops.Log, + aimet_ops.Abs, + aimet_ops.Neg, + aimet_ops.ElementwiseCeil, + aimet_ops.ElementwiseFloor, + aimet_ops.Sin, + aimet_ops.Cos, + aimet_ops.Asin, + aimet_ops.Atan, + aimet_ops.Round, + aimet_ops.LogicalNot, + aimet_ops.NonZero, + aimet_ops.ElementwiseUnarySign, + aimet_ops.RSqrt, + aimet_ops.Square, + aimet_ops.Mean, + aimet_ops.Sum, + aimet_ops.Prod, + aimet_ops.Argmin, + aimet_ops.Argmax, + aimet_ops.Gather, + aimet_ops.Reshape, + aimet_ops.RoiAlign, + aimet_ops.Permute, + aimet_ops.IndexSelect, + aimet_ops.TopK, + aimet_ops.Tile, + aimet_ops.Norm, + aimet_ops.CumSum, + aimet_ops.Interpolate, + aimet_ops.Normalize, + aimet_ops.Pad, + aimet_ops.Shape, + aimet_ops.Expand, + aimet_ops.StridedSlice, +] +_AIMET_V1_BINARY_MODULES = [ + aimet_ops.MatMul, + aimet_ops.Add, + aimet_ops.Multiply, + aimet_ops.Subtract, + aimet_ops.Divide, + aimet_ops.FloorDivide, + aimet_ops.Greater, + aimet_ops.Less, + aimet_ops.GreaterEqual, + aimet_ops.LessEqual, + aimet_ops.NotEqual, + aimet_ops.Equal, + aimet_ops.Remainder, + aimet_ops.Fmod, + aimet_ops.Pow, + aimet_ops.CustomSiLU, + aimet_ops.Maximum, + aimet_ops.Max, + aimet_ops.Minimum, + aimet_ops.Min, + aimet_ops.Bmm, + aimet_ops.LogicalOr, + aimet_ops.LogicalAnd, + aimet_ops.CustomGather, + aimet_ops.GatherNd, +] +_AIMET_V1_TERNARY_MODULES = [ + aimet_ops.Baddbmm, + aimet_ops.Addmm, + aimet_ops.ScatterND, + aimet_ops.DynamicConv2d, + aimet_ops.DynamicLinear, + aimet_ops.ScatterElements, +] + +# Auto-generate quantized module definitions for regular-patterned modules +for _module_cls in _AIMET_V1_UNARY_MODULES: + 
_quantized_cls = _FakeQuantizedUnaryOpMixin.wrap(_module_cls) + _register_global_variable(_quantized_cls.__name__, _quantized_cls) + +for _module_cls in _AIMET_V1_BINARY_MODULES: + _quantized_cls = _FakeQuantizedBinaryOpMixin.wrap(_module_cls) + _register_global_variable(_quantized_cls.__name__, _quantized_cls) + +for _module_cls in _AIMET_V1_TERNARY_MODULES: + _quantized_cls = _FakeQuantizedTernaryOpMixin.wrap(_module_cls) + _register_global_variable(_quantized_cls.__name__, _quantized_cls) + + + +@FakeQuantizationMixin.implements(aimet_ops.BatchNorm) +class FakeQuantizedBatchNorm(FakeQuantizationMixin, aimet_ops.BatchNorm): # pylint: disable=abstract-method + """ + Quantized class definition for aimet_ops.BatchNorm. + """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([None, None, None, None, None]) + + def forward(self, # pylint: disable=too-many-arguments, arguments-differ + input: Tensor, + running_mean: Optional[Tensor], + running_var: Optional[Tensor], + weight: Optional[Tensor] = None, + bias: Optional[Tensor] = None, + training: bool = False, + momentum: float = 0.1, + eps: float = 1e-5) -> Tensor: + """ + Quantized forward impl for aimet_ops.BatchNorm. + """ + # pylint: disable=redefined-builtin + + if self.input_quantizers[0]: + input = self.input_quantizers[0](input) + + if running_mean is not None and self.input_quantizers[1]: + running_mean = self.input_quantizers[1](running_mean) + + if running_var is not None and self.input_quantizers[2]: + running_var = self.input_quantizers[2](running_var) + + if weight is not None and self.input_quantizers[3]: + weight = self.input_quantizers[3](weight) + + if bias is not None and self.input_quantizers[4]: + bias = self.input_quantizers[4](bias) + + output = super().forward(input, running_mean, running_var, + weight, bias, training, momentum, eps) + + if self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +@FakeQuantizationMixin.implements(aimet_ops.GroupNorm) +class FakeQuantizedAimetGroupNorm(FakeQuantizationMixin, aimet_ops.GroupNorm): # pylint: disable=abstract-method + """ + Quantized class definition for aimet_ops.GroupNorm. + """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([None, None, None, None]) + + def forward(self, # pylint: disable=arguments-differ + input: Tensor, + num_groups: int, + weight: Optional[Tensor] = None, + bias: Optional[Tensor] = None, + eps: float = 1e-5) -> Tensor: + """ + Quantized forward impl for aimet_ops.GroupNorm. + """ + # pylint: disable=redefined-builtin + + if self.input_quantizers[0]: + input = self.input_quantizers[0](input) + + if weight is not None and self.input_quantizers[2]: + weight = self.input_quantizers[2](weight) + + if bias is not None and self.input_quantizers[3]: + bias = self.input_quantizers[3](bias) + + output = super().forward(input, num_groups, weight, bias, eps) + + if self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +@FakeQuantizationMixin.implements(aimet_ops.NonMaxSuppression) +class FakeQuantizedNonMaxSuppression(FakeQuantizationMixin, aimet_ops.NonMaxSuppression): + """ + Quantized class definition for aimet_ops.NonMaxSuppression. 
+ """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([None]) + self.output_quantizers = nn.ModuleList([None]) + + def forward(self, *args) -> Tensor: # pylint: disable=arguments-differ + """ + Quantized forward impl for aimet_ops.NonMaxSuppression. + """ + boxes, scores = args # boxes are integer tensors + + if self.input_quantizers[0]: + # Use same input quantizer for all the score tensors + scores = tree_map(self.input_quantizers[0], scores) + + output = super().forward(boxes, scores) + + if self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +@FakeQuantizationMixin.implements(aimet_ops.Split) +class FakeQuantizedSplit(_FakeQuantizedUnaryOpMixin, aimet_ops.Split): # pylint: disable=abstract-method, too-many-ancestors + """ + Quantized class definition for aimet_ops.Split. + """ + def forward(self, *args, **kwargs): # pylint: disable=arguments-differ + """ + Quantized forward impl for aimet_ops.Split. + """ + x, *others = args + + if x.is_floating_point() and self.input_quantizers[0]: + x = self.input_quantizers[0](x) + + outputs = super().forward(x, *others, **kwargs) + + if self.output_quantizers[0]: + # Use same output quantizer for all the output tensors + quantize_fn = lambda out: self.output_quantizers[0](out) if out.is_floating_point() else out + outputs = tree_map(quantize_fn, outputs) + + return outputs + + +@FakeQuantizationMixin.implements(aimet_ops.Concat) +class FakeQuantizedConcat(_FakeQuantizedUnaryOpMixin, aimet_ops.Concat): # pylint: disable=too-many-ancestors + """ + Quantized class definition for aimet_ops.Concat. + """ + _num_inputs: int + + def __quant_init__(self): + super().__quant_init__() + self._num_inputs = 1 + + def export_input_encodings(self): + """ + Extends super().export to repeat input quantizer's encodings :attr:`self._num_inputs` times + """ + input_encodings = super().export_input_encodings() + return input_encodings * self._num_inputs + + def import_input_encodings(self, + encodings, + strict: bool, + partial: bool, + requires_grad: Optional[bool], + allow_overwrite: bool): + """ + Extends super().import_input_encodings to set `self._num_inputs` based on length of encodings. + """ + self._num_inputs = len(encodings) + super().import_input_encodings(encodings, + strict=strict, + partial=partial, + requires_grad=requires_grad, + allow_overwrite=allow_overwrite) + + def forward(self, *x): # pylint: disable=arguments-differ + """ + Quantized forward impl for aimet_ops.Concat. + """ + self._num_inputs = len(x) + + if self.input_quantizers[0]: + # Use same input quantizer for all the input tensors + quantize_fn = lambda inp: self.input_quantizers[0](inp) if inp.is_floating_point() else inp + x = tree_map(quantize_fn, x) + + output = super().forward(*x) + + if output.is_floating_point() and self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +@FakeQuantizationMixin.implements(aimet_ops.Where) +class FakeQuantizedWhere(FakeQuantizationMixin, aimet_ops.Where): # pylint: disable=abstract-method + """ + Quantized class definition for aimet_ops.Where. 
+ """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([None, None, None]) + self.output_quantizers = nn.ModuleList([None]) + + def forward(self, condition: Tensor, input, other, **kwargs) -> Tensor: # pylint: disable=arguments-differ + """ + Quantized forward impl for aimet_ops.MaskedFill. + """ + # pylint: disable=redefined-builtin + + if isinstance(input, Tensor) and input.is_floating_point() and self.input_quantizers[1]: + input = self.input_quantizers[1](input) + + if isinstance(other, Tensor) and other.is_floating_point() and self.input_quantizers[2]: + other = self.input_quantizers[2](other) + + output = super().forward(condition, input, other, **kwargs) + + if output.is_floating_point() and self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output + + +@FakeQuantizationMixin.implements(aimet_ops.MaskedFill) +class FakeQuantizedMaskedFill(FakeQuantizationMixin, aimet_ops.MaskedFill): # pylint: disable=abstract-method + """ + Quantized class definition for aimet_ops.MaskedFill. + """ + def __quant_init__(self): + super().__quant_init__() + # pylint: disable=attribute-defined-outside-init + self.input_quantizers = nn.ModuleList([None, None]) + self.output_quantizers = nn.ModuleList([None]) + + def forward(self, mask: Tensor, value) -> Tensor: # pylint: disable=arguments-differ + """ + Quantized forward impl for aimet_ops.MaskedFill. + """ + if isinstance(value, Tensor) and value.is_floating_point() and self.input_quantizers[1]: + value = self.input_quantizers[1](value) + + output = super().forward(mask, value) + + if output.is_floating_point() and self.output_quantizers[0]: + output = self.output_quantizers[0](output) + + return output +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/nn/true_quant.html b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/nn/true_quant.html new file mode 100644 index 00000000..ab0defac --- /dev/null +++ b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/nn/true_quant.html @@ -0,0 +1,579 @@ + + + + + + aimet_torch.v2.nn.true_quant — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

Source code for aimet_torch.v2.nn.true_quant

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+""" Quantized modules"""
+
+import contextlib
+from functools import partial
+import itertools
+from abc import abstractmethod
+from collections import OrderedDict
+from typing import Type, Any, Tuple, Dict, Optional, Callable
+from weakref import WeakKeyDictionary
+
+import torch
+import torch.nn as nn
+from torch import Tensor
+
+from aimet_torch.v2.quantization.base import QuantizerBase
+from aimet_torch.v2.quantization import affine
+from aimet_torch.v2.quantization.float import FloatQuantizeDequantize
+from aimet_torch.v2.quantization.tensor import QuantizedTensorBase
+from aimet_torch.v2.utils import patch_attr, _ContextManager, allow_recompute
+import aimet_torch.elementwise_ops as aimet_ops
+
+from .base import BaseQuantizationMixin, _BaseQuantizedUnaryOpMixin, _BaseQuantizedBinaryOpMixin # pylint: disable=import-error
+
+
+def _quantize_if_applicable(data: Any, quantizer: Optional[QuantizerBase]):
+    """
+    Quantize data if it is a quantizable type and the quantizer is not None
+    """
+    if quantizer and isinstance(data, Tensor) and data.is_floating_point():
+        if isinstance(data, QuantizedTensorBase):
+            data = data.dequantize()
+        return quantizer(data)
+
+    if isinstance(data, QuantizedTensorBase):
+        return data.quantize()
+
+    return data
+
+def _dequantize_if_applicable(data: torch.Tensor):
+    return data.dequantize() if isinstance(data, QuantizedTensorBase) else data
+
+
+_QUANTIZED_MODULES_UNDER_COMPUTE_ENCODINGS = WeakKeyDictionary()
+
+
+def _is_computing_encodings(qmodule):
+    return _QUANTIZED_MODULES_UNDER_COMPUTE_ENCODINGS.get(qmodule, 0) > 0
+
+
+def _enter_computing_encodings(qmodule):
+    if qmodule not in _QUANTIZED_MODULES_UNDER_COMPUTE_ENCODINGS:
+        _QUANTIZED_MODULES_UNDER_COMPUTE_ENCODINGS[qmodule] = 0
+    _QUANTIZED_MODULES_UNDER_COMPUTE_ENCODINGS[qmodule] += 1
+
+
+def _exit_compute_encodings(qmodule):
+    assert _QUANTIZED_MODULES_UNDER_COMPUTE_ENCODINGS[qmodule] > 0
+    _QUANTIZED_MODULES_UNDER_COMPUTE_ENCODINGS[qmodule] -= 1
+
+
+
[docs]class QuantizationMixin(BaseQuantizationMixin): # pylint: disable=abstract-method + """ + Mixin that allows dispatch to quantized operator libraries in place of native pytorch operations + """ + + cls_to_qcls = OrderedDict() # quantized class -> original class + qcls_to_cls = OrderedDict() # original class -> quantized class + + _default_kernel: Optional[Callable] = None + _kernels = WeakKeyDictionary() # instance -> instance_kernel + +
[docs] @classmethod + def set_default_kernel(cls, kernel: Callable): + """ + Set default kernel for the class. + + :param kernel: Callable object to be used as the default kernel + by all the instances of this class. + """ + cls._default_kernel = kernel
+ +
[docs] @classmethod + def get_default_kernel(cls) -> Optional[Callable]: + """ + Return the default kernel of the class + + :return: Default kernel of the class. None if the default kernel is not set. + """ + return cls._default_kernel
+ +
[docs] def set_kernel(self, kernel: Callable): + """ + Set kernel for this instance of quantized module. + + :param kernel: Callable object to be used as the underlying kernel. + """ + QuantizationMixin._kernels[self] = kernel
+ +
[docs] def get_kernel(self) -> Optional[Callable]: + """ + Return the kernel to be used by this instance of quantized module. + If the current instance does not have any kernel set, + it will try to use the default kernel of the class. + + :return: Kernel to be used by this instance. + """ + if self in QuantizationMixin._kernels: + return QuantizationMixin._kernels[self] + return self.get_default_kernel()
+ +
[docs] @contextlib.contextmanager + def compute_encodings(self): # pylint: disable=missing-function-docstring + ctx = _ContextManager(action=lambda: _enter_computing_encodings(self), + cleanup=lambda: _exit_compute_encodings(self)) + with super().compute_encodings(), ctx: + yield
+ + @contextlib.contextmanager + def _patch_dequantized_parameters(self): + with contextlib.ExitStack() as stack: + for param_name, _ in self.param_quantizers.items(): + qparam = getattr(self, param_name) + ctx = patch_attr(self, param_name, _dequantize_if_applicable(qparam)) + stack.enter_context(ctx) + yield + +
[docs] @classmethod + def wrap(cls, module_cls: Type[nn.Module]) -> Type[nn.Module]: + """ + Wrap a regular module class into a quantized module class + """ + if not issubclass(module_cls, nn.Module): + raise ValueError("Expected module_cls to be a subclass of torch.nn.Module. " + f"Got {module_cls}.") + if module_cls in cls.cls_to_qcls: + return cls.cls_to_qcls[module_cls] + + quantized_cls_name = f"Quantized{module_cls.__name__}" + base_classes = (cls, module_cls) + quantized_cls = type(quantized_cls_name, base_classes, {'__module__': __name__}) + return cls.implements(module_cls)(quantized_cls)
+ +
[docs] @classmethod + def implements(cls, module_cls): + """ + Decorator for registering quantized implementation of the given base class. + """ + + def wrapper(quantized_cls): + cls.cls_to_qcls[module_cls] = quantized_cls + cls.qcls_to_cls[quantized_cls] = module_cls + return quantized_cls + + return wrapper
+ + @contextlib.contextmanager + def _unsafe_view_quantizers_as_qdq(self): + + def _view_as_qdq(quantizer): + if not quantizer: + return contextlib.nullcontext() + + if isinstance(quantizer, affine.QuantizeDequantize): + return contextlib.nullcontext() + + if isinstance(quantizer, FloatQuantizeDequantize): + return contextlib.nullcontext() + + if 'forward' in quantizer.__dict__: + # forward is already monkey-patched probably due to compute_encodings() + # Leave it as-is + return contextlib.nullcontext() + + return patch_attr(quantizer, 'forward', + partial(affine.QuantizeDequantize.forward, quantizer)) + + with contextlib.ExitStack() as stack: + for quantizer in itertools.chain(self.input_quantizers, + self.output_quantizers, + self.param_quantizers.values()): + ctx = _view_as_qdq(quantizer) + stack.enter_context(ctx) + + yield
+ + +# pylint: disable=arguments-differ, abstract-method, too-many-ancestors + +class _QuantizedUnaryOpMixin(QuantizationMixin, _BaseQuantizedUnaryOpMixin): + def forward(self, *args, **kwargs): # pylint: disable=missing-function-docstring + kernel = self.get_kernel() + + if not kernel or _is_computing_encodings(self): + # Fast track: Fall back to fake quantization without further check + # Most of the users who never use integer kernels will always end up + # taking this path, making QuantizedModule behave the same as FakeQuantizedModule + # which is currently much more performant in terms of both speed and memory + + # NOTE: This is a quick temporary solution that may not be robust + # for the quantized modules to be added in the future. + with self._unsafe_view_quantizers_as_qdq(): + return super().forward(*args, **kwargs) + + x, *args = args + x = _quantize_if_applicable(x, self.input_quantizers[0]) + + if not isinstance(x, QuantizedTensorBase): + raise RuntimeError + + with self._patch_quantized_parameters(): + kernel_args, kernel_kwargs = self.get_functional_args(x, *args, **kwargs) + output_encodings = self.output_quantizers[0].get_encoding() if self.output_quantizers[0] else None + output = kernel(*kernel_args, **kernel_kwargs, output_encodings=output_encodings) + + return output.dequantize() + + @abstractmethod + def get_functional_args(self, x, *args, **kwargs) -> Tuple[Tuple, Dict]: + """ + Return the args and keyword args to the layer's kernel call + """ + + +class _QuantizedBinaryOpMixin(QuantizationMixin, _BaseQuantizedBinaryOpMixin): + def __quant_init__(self): + super().__quant_init__() + self.input_quantizers = nn.ModuleList([None, None]) + + def forward(self, *args, **kwargs): # pylint: disable=missing-function-docstring + kernel = self.get_kernel() + + if not kernel or _is_computing_encodings(self): + # Fast track: Fall back to fake quantization without further check + # Most of the users who never use integer kernels will always end up + # taking this path, making QuantizedModule behave the same as FakeQuantizedModule + # which is currently much more performant in terms of both speed and memory + + # NOTE: This is a quick temporary solution that may not be robust + # for the quantized modules to be added in the future. + with self._unsafe_view_quantizers_as_qdq(): + return super().forward(*args, **kwargs) + + x, y, *args = args + x = _quantize_if_applicable(x, self.input_quantizers[0]) + y = _quantize_if_applicable(y, self.input_quantizers[1]) + + if not isinstance(x, QuantizedTensorBase): + raise RuntimeError + + if not isinstance(y, QuantizedTensorBase): + raise RuntimeError + + with self._patch_quantized_parameters(): + kernel_args, kernel_kwargs = self.get_functional_args(x, y, *args, **kwargs) + output_encodings = self.output_quantizers[0].get_encoding() if self.output_quantizers[0] else None + output = kernel(*kernel_args, **kernel_kwargs, output_encodings=output_encodings) + + return output.dequantize() + + @abstractmethod + def get_functional_args(self, x, y, *args, **kwargs) -> Tuple[Tuple, Dict]: + """ + Return the args and keyword args to the layer's kernel call + """ + + +class _QuantizedConvNdMixin(_QuantizedUnaryOpMixin): # pylint: disable=too-many-ancestors + """ Quantized ConvNd """ + def __quant_init__(self): + if self.padding_mode != 'zeros': + msg = f'padding_mode other than "zeros" is currently not supported. 
(got {self.padding_mode})' + raise NotImplementedError(msg) + super().__quant_init__() + + def forward(self, *args, **kwargs): + if self.padding_mode != 'zeros': + msg = f'padding_mode other than "zeros" is currently not supported. (got {self.padding_mode})' + raise NotImplementedError(msg) + return super().forward(*args, **kwargs) + + def get_functional_args(self, x): + args = (x, self.weight) + kwargs = {"bias": self.bias, + "stride": self.stride, + "padding": self.padding, + "dilation": self.dilation, + "groups": self.groups} + return args, kwargs + + +@QuantizationMixin.implements(nn.Conv1d) +class QuantizedConv1d(_QuantizedConvNdMixin, nn.Conv1d): # pylint: disable=too-many-ancestors + """ Quantized Conv1d """ + + +@QuantizationMixin.implements(nn.Conv2d) +class QuantizedConv2d(_QuantizedConvNdMixin, nn.Conv2d): # pylint: disable=too-many-ancestors + """ Quantized Conv2d """ + + +@QuantizationMixin.implements(nn.Conv3d) +class QuantizedConv3d(_QuantizedConvNdMixin, nn.Conv3d): # pylint: disable=too-many-ancestors + """ Quantized Conv3d """ + + +@QuantizationMixin.implements(nn.Linear) +class QuantizedLinear(_QuantizedUnaryOpMixin, nn.Linear): + """ Quantized Linear """ + + # Only allow activation recompute (a.k.a activation checkpointing) for QuantizedLinear. + # This is mainly to reduce memory footprint of QAT of large language models. + @allow_recompute + def forward(self, *args, **kwargs): + return super().forward(*args, **kwargs) + + def get_functional_args(self, x): + return (x, self.weight), {"bias": self.bias} + + +@QuantizationMixin.implements(nn.GELU) +class QuantizedGELU(_QuantizedUnaryOpMixin, nn.GELU): + """ Quantized GELU """ + + def get_functional_args(self, x): + return (x, ), {"approximate": self.approximate} + + +@QuantizationMixin.implements(nn.LayerNorm) +class QuantizedLayerNorm(_QuantizedUnaryOpMixin, nn.LayerNorm): + """ Quantized LayerNorm """ + + def get_functional_args(self, x): + return (x, self.normalized_shape), {"weight": self.weight, "bias": self.bias, "eps": self.eps} + +@QuantizationMixin.implements(nn.Softmax) +class QuantizedSoftmax(_QuantizedUnaryOpMixin, nn.Softmax): + """ Quantized Softmax """ + + def get_functional_args(self, x): + return (x, self.dim), {} + +@QuantizationMixin.implements(nn.Sigmoid) +class QuantizedSigmoid(_QuantizedUnaryOpMixin, nn.Sigmoid): + """ Quantized Sigmoid """ + + def get_functional_args(self, x): + return (x, ), {} + + +@QuantizationMixin.implements(nn.Tanh) +class QuantizedTanh(_QuantizedUnaryOpMixin, nn.Tanh): + """ Quantized Tanh """ + + def get_functional_args(self, x): + return (x,), {} + + +@QuantizationMixin.implements(aimet_ops.Add) +class QuantizedAdd(_QuantizedBinaryOpMixin, aimet_ops.Add): + """ Quantized Add """ + + def get_functional_args(self, x, y): + return (x, y), {} + + +@QuantizationMixin.implements(aimet_ops.Multiply) +class QuantizedMultiply(_QuantizedBinaryOpMixin, aimet_ops.Multiply): + """ Quantized Multiply """ + + def get_functional_args(self, x, y): + return (x, y), {} + + +@QuantizationMixin.implements(aimet_ops.Subtract) +class QuantizedSubtract(_QuantizedBinaryOpMixin, aimet_ops.Subtract): + """ Quantized Subtract """ + + def get_functional_args(self, x, y): + return (x, y), {} +
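For reference, the following is a minimal sketch of attaching a custom kernel to QuantizedLinear through the kernel API above. Per _QuantizedUnaryOpMixin.forward and QuantizedLinear.get_functional_args, the kernel receives the quantized activation and weight plus bias and output_encodings keywords, and must return a tensor the caller can dequantize. The float fallback body below is purely illustrative (a real backend would compute in integer arithmetic), and it assumes the input, weight, and output quantizers are enabled so that x and weight arrive as QuantizedTensor objects and output_encodings is not None.

import torch
from aimet_torch.v2.quantization import affine
from aimet_torch.v2.quantization.tensor import QuantizedTensor

def sketch_linear_kernel(x, weight, *, bias=None, output_encodings=None):
    # Illustration only: dequantize, compute in float, then re-quantize to the
    # requested output encodings. Assumes bias (if any) is a plain float tensor.
    out = torch.nn.functional.linear(x.dequantize(), weight.dequantize(), bias)
    out = affine.quantize(out,
                          output_encodings.scale.to(out.dtype),
                          output_encodings.offset.to(out.dtype),
                          output_encodings.bitwidth,
                          output_encodings.signed)
    out = out.as_subclass(QuantizedTensor)
    out.encoding = output_encodings
    return out

# Register for all QuantizedLinear instances, or per instance via set_kernel()
QuantizedLinear.set_default_kernel(sketch_linear_kernel)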
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/affine/backends.html b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/affine/backends.html new file mode 100644 index 00000000..abd51f94 --- /dev/null +++ b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/affine/backends.html @@ -0,0 +1,528 @@ + + + + + + aimet_torch.v2.quantization.affine.backends — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + + +
  • +
  • +
+
+
+
+
+ +

Source code for aimet_torch.v2.quantization.affine.backends

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2023, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+# pylint: disable=all
+
+import math
+from typing import overload, Union, Tuple, Optional
+import torch
+from .utils import *
+
+
+@overload
+def quantize(tensor: torch.Tensor, scale: torch.Tensor, offset: torch.Tensor,
+             bitwidth: Union[int, float], signed: bool = False,
+             block_size: Optional[Tuple[int, ...]] = None):
+    ...
+
+@overload
+def quantize(tensor: torch.Tensor, scale: torch.Tensor, offset: torch.Tensor, *,
+             num_steps: int, signed: bool = False, block_size: Optional[Tuple[int, ...]] = None):
+    ...
+
+@overload
+def quantize(tensor: torch.Tensor, scale: torch.Tensor, offset: torch.Tensor, *,
+             qmin: int, qmax: int, block_size: Optional[Tuple[int, ...]] = None):
+    ...
+
+
+
[docs]def quantize(tensor: torch.Tensor, scale: torch.Tensor, offset: torch.Tensor, + *args, **kwargs): + r""" + Applies quantization to the input. + + Precisely, + + .. math:: + out = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right) + + If block size :math:`B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}` is specified, + this equation will be further generalized as + + .. math:: + out_{j_0 \cdots j_{D-1}} & = clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\ + + \text{where} \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor + + This function is overloaded with the signatures listed below: + + + .. function:: quantize(tensor, scale, offset, bitwidth, signed=False, block_size=None) + :noindex: + + Equivalent to: + + .. math:: + qmin= + \begin{cases} + -\left\lceil\frac{2^{bitwidth}-1}{2}\right\rceil,& \text{if } signed\\ + 0, & \text{otherwise (default)} + \end{cases} + qmax= + \begin{cases} + \left\lfloor\frac{2^{bitwidth}-1}{2}\right\rfloor,& \text{if } signed\\ + 2^{bitwidth}-1, & \text{otherwise (default)} + \end{cases} + + :param Tensor tensor: Tensor to quantize + :param Tensor scale: Scale for quantization + :param Tensor offset: Offset for quantization + :param int bitwidth: Bitwidth of quantized tensor based on which :math:`qmin` and :math:`qmax` will be derived + :param bool signed: If false, the output will be mapped to positive integers only. + Otherwise, it will range over both positive and negative integers. + :param block_size: Block size + :type block_size: Tuple[int, ...], optional + + .. function:: quantize(tensor, scale, offset, *, num_steps, signed=False, block_size=None) + :noindex: + + Equivalent to: + + .. math:: + qmin= + \begin{cases} + -\left\lceil\frac{num\_steps}{2}\right\rceil,& \text{if } signed\\ + 0, & \text{otherwise (default)} + \end{cases} + qmax= + \begin{cases} + \left\lfloor\frac{num\_steps}{2}\right\rfloor,& \text{if } signed\\ + num\_steps, & \text{otherwise (default)} + \end{cases} + + + :param Tensor tensor: Tensor to quantize + :param Tensor scale: Scale for quantization + :param Tensor offset: Offset for quantization + :param int num_steps: The number of steps in the quantization range based on which :math:`qmin` and :math:`qmax` will be derived + :param bool signed: If false, the output will be mapped to positive integers only. + Otherwise, it will range over both positive and negative integers. + :param block_size: Block size + :type block_size: Tuple[int, ...], optional + + .. 
function:: quantize(tensor, scale, offset, *, qmin, qmax, block_size=None) + :noindex: + + :param Tensor tensor: Tensor to quantize + :param Tensor scale: Scale for quantization + :param Tensor offset: Offset for quantization + :param int qmin: Minimum value of the quantization range + :param int qmax: Maximum value of the quantization range + :param block_size: Block size + :type block_size: Tuple[int, ...], optional + + + Examples: + + >>> import aimet_torch.v2.quantization as Q + >>> input = torch.arange(start=-0.3, end=1.3, step=0.05) + >>> print(input) + tensor([-3.0000e-01, -2.5000e-01, -2.0000e-01, -1.5000e-01, -1.0000e-01, + -5.0000e-02, -1.1921e-08, 5.0000e-02, 1.0000e-01, 1.5000e-01, + 2.0000e-01, 2.5000e-01, 3.0000e-01, 3.5000e-01, 4.0000e-01, + 4.5000e-01, 5.0000e-01, 5.5000e-01, 6.0000e-01, 6.5000e-01, + 7.0000e-01, 7.5000e-01, 8.0000e-01, 8.5000e-01, 9.0000e-01, + 9.5000e-01, 1.0000e+00, 1.0500e+00, 1.1000e+00, 1.1500e+00, + 1.2000e+00, 1.2500e+00]) + >>> scale = torch.tensor(1/15) + >>> offset = torch.tensor(0.0) + >>> Q.affine.quantize(input, scale, offset, bitwidth=4) + tensor([ 0., 0., 0., 0., 0., 0., -0., 1., 2., 2., 3., 4., 4., 5., + 6., 7., 7., 8., 9., 10., 10., 11., 12., 13., 13., 14., 15., 15., + 15., 15., 15., 15.]) + >>> Q.affine.quantize(input, scale, offset, num_steps=15) + tensor([ 0., 0., 0., 0., 0., 0., -0., 1., 2., 2., 3., 4., 4., 5., + 6., 7., 7., 8., 9., 10., 10., 11., 12., 13., 13., 14., 15., 15., + 15., 15., 15., 15.]) + >>> Q.affine.quantize(input, scale, offset, qmin=0, qmax=15) + tensor([ 0., 0., 0., 0., 0., 0., -0., 1., 2., 2., 3., 4., 4., 5., + 6., 7., 7., 8., 9., 10., 10., 11., 12., 13., 13., 14., 15., 15., + 15., 15., 15., 15.]) + """ + qmin, qmax, block_size = _parse_args(args, kwargs) + return get_backend().quantize(tensor, scale, offset, qmin, qmax, block_size)
+ + +@overload +def quantize_dequantize(tensor: torch.Tensor, scale: torch.Tensor, offset: torch.Tensor, + bitwidth: Union[int, float], signed: bool = False, + block_size: Optional[Tuple[int, ...]] = None): + ... + +@overload +def quantize_dequantize(tensor: torch.Tensor, scale: torch.Tensor, offset: torch.Tensor, *, + num_steps: int, signed: bool = False, block_size: Optional[Tuple[int, ...]] = None): + ... + +@overload +def quantize_dequantize(tensor: torch.Tensor, scale: torch.Tensor, offset: torch.Tensor, *, + qmin: int, qmax: int, block_size: Optional[Tuple[int, ...]] = None): + ... + + +
[docs]def quantize_dequantize(tensor: torch.Tensor, scale: torch.Tensor, offset: torch.Tensor, + *args, **kwargs): + r""" + Applies fake-quantization by quantizing and dequantizing the input. + + Precisely, + + .. math:: + out = (\overline{input} + offset) * scale + + where + + .. math:: + \overline{input} = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right) + + + If block size :math:`B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}` is specified, + this equation will be further generalized as + + .. math:: + out_{j_0 \cdots j_{D-1}} &= (\overline{input}_{j_0 \cdots j_{D-1}} + offset_{i_0 \cdots i_{D-1}}) * scale_{i_0 \cdots i_{D-1}}\\ + \overline{input}_{j_0 \cdots j_{D-1}} &= clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\ + + \text{where } \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor + + + This function is overloaded with the signatures listed below: + + + .. function:: quantize_dequantize(tensor, scale, offset, bitwidth, signed=False, block_size=None) + :noindex: + + Equivalent to: + + .. math:: + qmin= + \begin{cases} + -\left\lceil\frac{2^{bitwidth}-1}{2}\right\rceil,& \text{if } signed\\ + 0, & \text{otherwise (default)} + \end{cases} + qmax= + \begin{cases} + \left\lfloor\frac{2^{bitwidth}-1}{2}\right\rfloor,& \text{if } signed\\ + 2^{bitwidth}-1, & \text{otherwise (default)} + \end{cases} + + :param Tensor tensor: Tensor to quantize + :param Tensor scale: Scale for quantization + :param Tensor offset: Offset for quantization + :param int bitwidth: Bitwidth of quantized tensor based on which :math:`qmin` and :math:`qmax` will be derived + :param bool signed: If false, :math:`\overline{input}` will be mapped to positive integers only. + Otherwise, :math:`\overline{input}` will range over both positive and negative integers. + :param block_size: Block size + :type block_size: Tuple[int, ...], optional + + .. function:: quantize_dequantize(tensor, scale, offset, *, num_steps, signed=False, block_size=None) + :noindex: + + Equivalent to: + + .. math:: + qmin= + \begin{cases} + -\left\lceil\frac{num\_steps}{2}\right\rceil,& \text{if } signed\\ + 0, & \text{otherwise (default)} + \end{cases} + qmax= + \begin{cases} + \left\lfloor\frac{num\_steps}{2}\right\rfloor,& \text{if } signed\\ + num\_steps, & \text{otherwise (default)} + \end{cases} + + + :param Tensor tensor: Tensor to quantize + :param Tensor scale: Scale for quantization + :param Tensor offset: Offset for quantization + :param int num_steps: The number of steps in the quantization range based on which :math:`qmin` and :math:`qmax` will be derived + :param bool signed: If false, :math:`\overline{input}` will be mapped to positive integers only. + Otherwise, :math:`\overline{input}` will range over both positive and negative integers. + :param block_size: Block size + :type block_size: Tuple[int, ...], optional + + .. 
function:: quantize_dequantize(tensor, scale, offset, *, qmin, qmax, block_size=None) + :noindex: + + :param Tensor tensor: Tensor to quantize + :param Tensor scale: Scale for quantization + :param Tensor offset: Offset for quantization + :param int qmin: Minimum value of the quantization range + :param int qmax: Maximum value of the quantization range + :param block_size: Block size + :type block_size: Tuple[int, ...], optional + + + Examples: + + >>> import aimet_torch.v2.quantization as Q + >>> input = torch.arange(start=-0.3, end=1.3, step=0.05) + >>> print(input) + tensor([-3.0000e-01, -2.5000e-01, -2.0000e-01, -1.5000e-01, -1.0000e-01, + -5.0000e-02, -1.1921e-08, 5.0000e-02, 1.0000e-01, 1.5000e-01, + 2.0000e-01, 2.5000e-01, 3.0000e-01, 3.5000e-01, 4.0000e-01, + 4.5000e-01, 5.0000e-01, 5.5000e-01, 6.0000e-01, 6.5000e-01, + 7.0000e-01, 7.5000e-01, 8.0000e-01, 8.5000e-01, 9.0000e-01, + 9.5000e-01, 1.0000e+00, 1.0500e+00, 1.1000e+00, 1.1500e+00, + 1.2000e+00, 1.2500e+00]) + >>> scale = torch.tensor(1/15) + >>> offset = torch.tensor(0.0) + >>> Q.affine.quantize_dequantize(input, scale, offset, bitwidth=4) + tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0667, 0.1333, + 0.1333, 0.2000, 0.2667, 0.2667, 0.3333, 0.4000, 0.4667, 0.4667, 0.5333, + 0.6000, 0.6667, 0.6667, 0.7333, 0.8000, 0.8667, 0.8667, 0.9333, 1.0000, + 1.0000, 1.0000, 1.0000, 1.0000, 1.0000]) + >>> Q.affine.quantize_dequantize(input, scale, offset, num_steps=15) + tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0667, 0.1333, + 0.1333, 0.2000, 0.2667, 0.2667, 0.3333, 0.4000, 0.4667, 0.4667, 0.5333, + 0.6000, 0.6667, 0.6667, 0.7333, 0.8000, 0.8667, 0.8667, 0.9333, 1.0000, + 1.0000, 1.0000, 1.0000, 1.0000, 1.0000]) + >>> Q.affine.quantize_dequantize(input, scale, offset, qmin=0, qmax=15) + tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0667, 0.1333, + 0.1333, 0.2000, 0.2667, 0.2667, 0.3333, 0.4000, 0.4667, 0.4667, 0.5333, + 0.6000, 0.6667, 0.6667, 0.7333, 0.8000, 0.8667, 0.8667, 0.9333, 1.0000, + 1.0000, 1.0000, 1.0000, 1.0000, 1.0000]) + """ + qmin, qmax, block_size = _parse_args(args, kwargs) + return get_backend().quantize_dequantize(tensor, scale, offset, qmin, qmax, block_size)
+ + +
[docs]def dequantize(tensor: torch.Tensor, scale: torch.Tensor, offset: torch.Tensor, + block_size: Optional[Tuple[int, ...]] = None): + return get_backend().dequantize(tensor, scale, offset, block_size)
+ + +def _parse_args(args, kwargs) -> Tuple[int, int, Optional[Tuple[int, ...]]]: + bitwidth = num_steps = signed = qmin = qmax = None + block_size = kwargs.get('block_size') + + if len(args) == 2: + bitwidth, signed = args + elif len(args) == 1: + bitwidth = args[0] + signed = kwargs.get('signed', False) + else: + if 'bitwidth' in kwargs: + bitwidth, signed = kwargs['bitwidth'], kwargs.get('signed', False) + elif 'num_steps' in kwargs: + num_steps, signed = kwargs['num_steps'], kwargs.get('signed', False) + else: + qmin, qmax = kwargs['qmin'], kwargs['qmax'] + + if bitwidth is not None: + num_steps = 2 ** bitwidth - 1 + + if num_steps is not None: + if signed: + qmin = -math.ceil(num_steps/2) + qmax = math.floor(num_steps/2) + else: + qmin = 0 + qmax = num_steps + + assert qmin is not None + assert qmax is not None + + return qmin, qmax, block_size +
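As a quick cross-check of _parse_args above, the three overload forms are interchangeable. A small sketch (values chosen arbitrarily): a signed 8-bit request resolves to qmin = -128 and qmax = 127 regardless of which signature is used.

import torch
import aimet_torch.v2.quantization as Q

t = torch.randn(4, 4)
scale, offset = torch.tensor(0.01), torch.tensor(0.0)

# bitwidth/signed, num_steps/signed, and qmin/qmax forms resolve to the same range
out_a = Q.affine.quantize(t, scale, offset, 8, True)                     # bitwidth=8, signed
out_b = Q.affine.quantize(t, scale, offset, num_steps=255, signed=True)  # 2**8 - 1 steps
out_c = Q.affine.quantize(t, scale, offset, qmin=-128, qmax=127)
assert torch.equal(out_a, out_b) and torch.equal(out_b, out_c)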
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/affine/quantizer.html b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/affine/quantizer.html new file mode 100644 index 00000000..ccdc5ae2 --- /dev/null +++ b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/affine/quantizer.html @@ -0,0 +1,910 @@ + + + + + + aimet_torch.v2.quantization.affine.quantizer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + + +
  • +
  • +
+
+
+
+
+ +

Source code for aimet_torch.v2.quantization.affine.quantizer

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2023-2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+# pylint: disable=redefined-builtin
+""" Affine quantizers """
+
+import abc
+import math
+from typing import Optional, List, Dict, Tuple
+import contextlib
+import functools
+
+import torch
+from torch import nn
+
+from aimet_torch.v2.utils import patch_attr, _is_expandable, StatisticsNotFoundError
+from aimet_torch.v2.quantization.encoding_analyzer import EncodingAnalyzer, MinMaxEncodingAnalyzer
+from aimet_torch.v2.quantization.affine import AffineEncoding
+from aimet_torch.v2.quantization.tensor import QuantizedTensor, DequantizedTensor
+from aimet_torch.v2.quantization.base import QuantizerBase
+from aimet_torch.v2.quantization.affine.backends import quantize, quantize_dequantize, torch_builtins
+from aimet_torch.v2.utils import ste_round
+
+
+__all__ = ['AffineQuantizerBase', 'MinMaxQuantizer', 'Quantize', 'QuantizeDequantize', 'Dequantize',
+           'GroupedBlockQuantizeDequantize']
+
+
+class AffineQuantizerBase(QuantizerBase):
+    """
+    Base class for linear quantization modules.
+
+    Args:
+        shape (tuple): Shape of the quantization parameters
+        bitwidth (int): Quantization bitwidth
+        symmetric (bool): If True, performs symmetric quantization;
+                          otherwise, performs asymmetric quantization
+        encoding_analyzer (EncodingAnalyzer, optional): Encoding analyzer for calibrating quantization encodings
+                                                        (default: absolute min-max encoding analyzer)
+
+    """
+    def __init__(self, shape, bitwidth: int, symmetric: bool, encoding_analyzer: EncodingAnalyzer = None,
+                 block_size: Optional[Tuple[int, ...]] = None):
+        super().__init__()
+        if isinstance(shape, int):
+            shape = (shape,)
+        self.shape = shape
+        self.block_size = block_size
+        self.bitwidth = bitwidth
+        self._symmetric = symmetric
+        # We support two quantization modes: (unsigned) asymmetric and signed-symmetric
+        self._signed = symmetric
+
+        self.encoding_analyzer = encoding_analyzer or \
+                                 MinMaxEncodingAnalyzer(torch_builtins.get_encoding_shape_with_blocks(self.shape,
+                                                                                                      self.block_size))
+
+        if self.block_size is None and not _is_expandable(self.encoding_analyzer.observer.shape, self.shape):
+            raise RuntimeError(f'Encoding analyzer of shape {self.encoding_analyzer.observer.shape} '
+                               f'is incompatible with quantizer of shape {self.shape}.')
+
+    @abc.abstractmethod
+    def get_min(self, dtype=None) -> torch.Tensor:
+        """
+        Compute quantization min to be used for forward pass.
+        Return None if the quantizer is not initialized yet.
+
+        Args:
+            dtype (torch.dtype): dtype of the computed min
+
+        Returns:
+            Quantization min
+
+        """
+
+    @abc.abstractmethod
+    def get_max(self, dtype=None) -> torch.Tensor:
+        """
+        Compute quantization max to be used for forward pass.
+        Return None if the quantizer is not initialized yet.
+
+        Args:
+            dtype (torch.dtype): dtype of the computed max
+
+        Returns:
+            Quantization max
+
+        """
+
+    @abc.abstractmethod
+    def get_scale(self, dtype=None) -> torch.Tensor:
+        """
+        Compute quantization scale to be used for forward pass.
+        Return None if the quantizer is not initialized yet.
+
+        Args:
+            dtype (torch.dtype): dtype of the computed scale
+
+        Returns:
+            Quantization scale
+
+        """
+
+    @abc.abstractmethod
+    def get_offset(self, dtype=None) -> torch.Tensor:
+        """
+        Compute quantization offset to be used for forward pass.
+        Return None if the quantizer is not initialized yet.
+
+        Args:
+            dtype (torch.dtype): dtype of the computed offset
+
+        Returns:
+            Quantization offset
+
+        """
+
+    @abc.abstractmethod
+    def set_range(self, min: torch.Tensor, max: torch.Tensor):
+        """
+        Set quantization parameters to the given min-max range
+        """
+
+    def get_encoding(self) -> Optional[AffineEncoding]:
+        """
+        Return the quantizer's encodings as an AffineEncoding object
+        """
+        if self.is_initialized():
+            return AffineEncoding(self.get_scale(dtype=torch.float32),
+                                  self.get_offset(dtype=torch.float32),
+                                  self.bitwidth, self._signed, self._symmetric, self.block_size)
+        return None
+
+    @torch.no_grad()
+    def get_legacy_encodings(self) -> Optional[List[Dict]]:
+        """
+        Returns a list of encodings, each represented as a List of Dicts
+        """
+        # pylint: disable=redefined-builtin, protected-access
+
+        if not self.is_initialized():
+            return None
+
+        return self.get_encoding()._to_legacy_format()
+
+    @torch.no_grad()
+    def set_legacy_encodings(self, encodings: List[Dict]):
+        """
+        Set encodings represented in the same format as the output of get_legacy_encodings as below:
+
+        [
+            {'min': float, 'max': float, 'scale': float, 'offset': float,
+                     'bitwidth': int, 'dtype': str, 'is_symmetric': str},
+            {'min': float, 'max': float, 'scale': float, 'offset': float,
+                     'bitwidth': int, 'dtype': str, 'is_symmetric': str},
+            ...
+        ]
+        """
+        def str_to_bool(s: str):
+            s = s.lower()
+            if s == "false":
+                return False
+            if s == "true":
+                return True
+            raise ValueError
+
+        self.bitwidth = encodings[0]['bitwidth']
+        self.symmetric = str_to_bool(encodings[0]['is_symmetric'])
+        # Note: We can only accurately infer signed-ness in the symmetric case, but AIMET uses unsigned for asymmetric
+        self.signed = str_to_bool(encodings[0]['is_symmetric']) and encodings[0]["min"] != 0
+        min_ = torch.tensor([e['min'] for e in encodings]).view(self.shape)
+        max_ = torch.tensor([e['max'] for e in encodings]).view(self.shape)
+        self.set_range(min_, max_)
+
+    def extra_repr(self) -> str:
+        return f'shape={self.shape}, bitwidth={self.bitwidth}, symmetric={self.symmetric}'
+
+    @property
+    def symmetric(self) -> bool:
+        """
+        Indicates whether this quantizer uses symmetric quantization
+        """
+        return self._symmetric
+
+    @symmetric.setter
+    def symmetric(self, symmetric: bool):
+        """
+        Set the quantizer symmetry
+
+        :param symmetric: If True, use symmetric encodings. Else, use asymmetric encodings
+        """
+        self._symmetric = symmetric
+
+    @property
+    def signed(self)-> bool:
+        """
+        Indicates whether this quantizer uses signed quantization
+        """
+        return self._signed
+
+    @signed.setter
+    def signed(self, signed: bool):
+        """
+        Set the quantizer to use signed or unsigned quantization
+
+        :param signed: If True, use signed encodings, else use unsigned encodings
+        """
+        self._signed = signed
+
+
+class MinMaxQuantizer(AffineQuantizerBase): # pylint: disable=abstract-method
+    """
+    Affine quantizer with min-max as trainable parameters
+    """
+
+    min: torch.nn.Parameter
+    max: torch.nn.Parameter
+
+    def __init__(self, shape, bitwidth: int, symmetric: bool, encoding_analyzer: EncodingAnalyzer = None,
+                 block_size: Optional[Tuple[int, ...]] = None):
+        super().__init__(shape, bitwidth, symmetric, encoding_analyzer, block_size)
+
+        self.register_quantization_parameter('min', nn.Parameter(-torch.ones(self.shape)))
+        self.register_quantization_parameter('max', nn.Parameter(torch.ones(self.shape)))
+
+    @contextlib.contextmanager
+    def compute_encodings(self):
+        """
+        Observe inputs and update quantization parameters based on the input statistics.
+        While ``compute_encodings`` is enabled, the quantizer forward pass performs
+        dynamic quantization using the batch statistics.
+        """
+        if not self._allow_overwrite:
+            yield
+            return
+
+        original_forward = self.forward
+
+        @functools.wraps(original_forward)
+        def forward_wrapper(input):
+            expanded_input = torch_builtins.reshape_tensor_for_blocks(input, self.shape, self.block_size)
+            batch_statistics = self.encoding_analyzer.update_stats(expanded_input)
+            num_steps = math.pow(2, self.bitwidth) - 1
+            dynamic_min, dynamic_max =\
+                    self.encoding_analyzer.compute_encodings_from_stats(batch_statistics,
+                                                                        num_steps,
+                                                                        self.symmetric)
+            if self.block_size is not None:
+                dynamic_min = dynamic_min.view(self.min.shape)
+                dynamic_max = dynamic_max.view(self.max.shape)
+            dynamic_min = dynamic_min.to(dtype=self.min.dtype,
+                                         device=self.min.device).expand_as(self.min)
+            dynamic_max = dynamic_max.to(dtype=self.max.dtype,
+                                         device=self.max.device).expand_as(self.max)
+
+            with patch_attr(self, 'min', dynamic_min),\
+                    patch_attr(self, 'max', dynamic_max):
+                return original_forward(input)
+
+        self.encoding_analyzer.reset_stats()
+
+        try:
+            with patch_attr(self, 'forward', forward_wrapper):
+                yield
+        except: # pylint: disable=try-except-raise
+            raise
+        else:
+            try:
+                num_steps = math.pow(2, self.bitwidth) - 1
+                enc_min, enc_max = self.encoding_analyzer.compute_encodings(num_steps, self.symmetric)
+                if self.block_size is not None:
+                    enc_min = enc_min.view(self.min.shape)
+                    enc_max = enc_max.view(self.max.shape)
+
+            except StatisticsNotFoundError:
+                return
+
+            if enc_min is None or enc_max is None:
+                return
+
+            self.set_range(enc_min, enc_max)
+
+    def get_min(self, dtype=None) -> Optional[torch.Tensor]:
+        """
+        Compute quantization min to be used for forward pass.
+
+        NOTE: self.min may not be equal to self.get_min().
+              self.get_min() returns a slightly recalibrated version of self.min.
+
+        :param dtype: dtype of the computed min. Uses self.min.dtype by default.
+        :return: Quantization min
+        """
+        if not self.is_initialized():
+            return None
+        num_negative_steps = 2 ** (self.bitwidth - 1) if self._signed else 0
+        return self.get_scale(dtype) * (self.get_offset(dtype) - num_negative_steps)
+
+    def get_max(self, dtype=None) -> Optional[torch.Tensor]:
+        """
+        Compute quantization max to be used for forward pass.
+
+        NOTE: self.max may not be equal to self.get_max()
+              self.get_max() returns a slightly recalibrated version of self.max.
+
+        :param dtype: dtype of the computed max. Uses self.min.dtype by default.
+        :return: Quantization max
+        """
+        if not self.is_initialized():
+            return None
+        num_positive_steps = 2 ** (self.bitwidth - 1) - 1 if self._signed else 2 ** self.bitwidth - 1
+        return self.get_scale(dtype) * (self.get_offset(dtype) + num_positive_steps)
+
+    def get_scale(self, dtype=None) -> Optional[torch.Tensor]:
+        """
+        Compute quantization scale to be used for forward pass.
+
+        :param dtype: dtype of the computed scale. Uses self.min.dtype by default.
+        :return: Quantization scale
+        """
+        if not self.is_initialized():
+            return None
+
+        dtype = dtype or self.min.dtype
+        num_steps = 2 ** self.bitwidth - 1
+
+        scale = (self.max.to(dtype) - self.min.to(dtype)) / num_steps
+        return scale.to(dtype)
+
+    def get_offset(self, dtype=None) -> Optional[torch.Tensor]:
+        """
+        Compute quantization offset to be used for forward pass.
+
+        :param dtype: dtype of the computed offset. Uses self.min.dtype by default.
+        :return: Quantization offset
+        """
+        if not self.is_initialized():
+            return None
+
+        dtype = dtype or self.min.dtype
+
+        if self.symmetric:
+            offset = torch.zeros_like(self.min, requires_grad=False, dtype=dtype)
+        else:
+            offset = ste_round(self.min.to(dtype) / self.get_scale(dtype))
+
+            if self._signed:
+                offset += 2 ** (self.bitwidth - 1)
+
+        return offset.to(dtype)
+
+    def set_range(self, min: torch.Tensor, max: torch.Tensor):
+        """
+        Set quantization parameters to the given min-max range
+        """
+        with torch.no_grad():
+            self.min.copy_(min)
+            self.max.copy_(max)
+
+
+
[docs]class Quantize(MinMaxQuantizer): + r"""Applies quantization to the input. + + Precisely, + + .. math:: + out = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right) + + where :math:`scale` and :math:`offset` are derived from learnable parameters + :math:`\theta_{min}` and :math:`\theta_{max}`. + + If block size :math:`B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}` is specified, + this equation will be further generalized as + + .. math:: + out_{j_0 \cdots j_{D-1}} & = clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\ + + \text{where} \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor + + Args: + shape (tuple): Shape of the quantization parameters + bitwidth (int): Quantization bitwidth + symmetric (bool): If True, performs symmetric quantization; + otherwise, performs asymmetric quantization + encoding_analyzer (EncodingAnalyzer, optional): Encoding analyzer for calibrating quantization encodings + (default: absolute min-max encoding analyzer) + block_size (Tuple[int, ...], optional): Block size + + :ivar Tensor min: :math:`\theta_{min}` from which scale and offset will be derived. + :ivar Tensor max: :math:`\theta_{max}` from which scale and offset will be derived. + + .. note:: + :class:`Quantize` cannot run :meth:`forward` until :attr:`min` and :attr:`max` are properly initialized, + which can be done based on input statistics using :meth:`compute_encodings` or + by manually assigning a new value to :attr:`min` and :attr:`max`. + See the examples below. + + Examples: + + >>> import aimet_torch.v2.quantization as Q + >>> input = torch.randn(5, 10) + >>> q = Q.affine.Quantize(shape=(5, 1), bitwidth=8, symmetric=False, block_size=(1, 5)) + >>> q.is_initialized() + False + >>> with q.compute_encodings(): + ... _ = q(input) + ... + >>> q.is_initialized() + True + >>> q(input) + QuantizedTensor([[129., 64., 255., 122., 0., 192., 106., 94., 255., 0.], + [ 0., 145., 181., 255., 144., 255., 194., 0., 74., 86.], + [122., 0., 255., 150., 33., 103., 103., 0., 37., 255.], + [255., 111., 237., 218., 0., 49., 155., 255., 0., 179.], + [ 0., 66., 255., 89., 110., 17., 36., 83., 255., 0.]], + grad_fn=<AliasBackward0>) + + + >>> import aimet_torch.v2.quantization as Q + >>> input = torch.randn(5, 10) + >>> q = Q.affine.Quantize(shape=(5, 1), bitwidth=8, symmetric=False, block_size=(1, 5)) + >>> q.is_initialized() + False + >>> q.min = torch.nn.Parameter(-torch.ones_like(q.min)) + >>> q.max = torch.nn.Parameter(torch.ones_like(q.max)) + >>> q.is_initialized() + True + >>> q(input) + QuantizedTensor([[187., 186., 131., 0., 203., 64., 80., 0., 143., 152.], + [ 16., 0., 255., 0., 0., 150., 0., 255., 32., 255.], + [255., 226., 0., 255., 55., 172., 0., 255., 145., 255.], + [207., 146., 216., 238., 0., 0., 141., 178., 255., 188.], + [ 63., 59., 19., 162., 30., 255., 109., 255., 0., 255.]], + grad_fn=<AliasBackward0>) + """ +
[docs] def forward(self, input: torch.Tensor) -> QuantizedTensor: + """Quantizes the input tensor + + Args: + input (torch.Tensor): Input to quantize + + Returns: + Quantized output + + """ + if not self.is_initialized(): + raise RuntimeError( + 'Failed to run Quantize since quantization parameters are not initialized.' + ' Please initialize the quantization parameters using `compute_encodings()`.' + ) + + encoding = self.get_encoding() + output = quantize(input, + encoding.scale.to(input.dtype), + encoding.offset.to(input.dtype), + encoding.bitwidth, + encoding.signed, + block_size=self.block_size) + output = output.as_subclass(QuantizedTensor) + output.encoding = encoding + return output
+ + +
[docs]class QuantizeDequantize(MinMaxQuantizer): + r"""Applies fake-quantization by quantizing and dequantizing the input. + + Precisely, + + .. math:: + out = (\overline{input} + offset) * scale + + where + + .. math:: + \overline{input} = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right) + + and :math:`scale` and :math:`offset` are derived from learnable parameters + :math:`\theta_{min}` and :math:`\theta_{max}`. + + If block size :math:`B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}` is specified, + this equation will be further generalized as + + .. math:: + out_{j_0 \cdots j_{D-1}} &= (\overline{input}_{j_0 \cdots j_{D-1}} + offset_{i_0 \cdots i_{D-1}}) * scale_{i_0 \cdots i_{D-1}}\\ + \overline{input}_{j_0 \cdots j_{D-1}} &= clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\ + + \text{where} \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor + + Args: + shape (tuple): Shape of the quantization parameters + bitwidth (int): Quantization bitwidth + symmetric (bool): If True, performs symmetric quantization; + otherwise, performs asymmetric quantization + encoding_analyzer (EncodingAnalyzer, optional): Encoding analyzer for calibrating quantization encodings + (default: absolute min-max encoding analyzer) + block_size (Tuple[int, ...], optional): Block size + + :ivar Tensor min: :math:`\theta_{min}` from which scale and offset will be derived. + :ivar Tensor max: :math:`\theta_{max}` from which scale and offset will be derived. + + .. note:: + :class:`QuantizeDequantize` cannot run :meth:`forward` until :attr:`min` and :attr:`max` are properly initialized, + which can be done based on input statistics using :meth:`compute_encodings` or + by manually assigning a new value to :attr:`min` and :attr:`max`. + See the examples below. + + Examples: + + >>> import aimet_torch.v2.quantization as Q + >>> input = torch.randn(5, 10) + >>> qdq = Q.affine.QuantizeDequantize(shape=(5, 2), bitwidth=8, symmetric=False, block_size=(1, 5)) + >>> qdq.is_initialized() + False + >>> with qdq.compute_encodings(): + ... _ = qdq(input) + ... 
+ >>> qdq.is_initialized() + True + >>> qdq(input) + DequantizedTensor([[-0.2771, 0.3038, 1.0819, 0.9700, 0.9487, -0.1307, + -1.7894, -0.1709, -0.2212, 0.7741], + [-1.0295, -1.2265, -1.0295, 1.0564, 0.6177, -1.0386, + -0.0176, -2.6054, 1.8836, -0.1232], + [-0.8229, 0.5540, 0.3992, -0.2363, 1.2546, -1.0036, + 0.2355, 0.1741, 1.6079, 0.6247], + [-1.0115, 1.2458, 0.9157, -1.4694, -0.0639, -0.2568, + 0.0680, 1.6695, 0.7932, -0.1889], + [ 0.0158, 0.5695, 0.5220, 0.1977, -1.4475, -0.0424, + -1.1128, -0.8796, -0.1060, 1.5897]], + grad_fn=<AliasBackward0>) + + + >>> import aimet_torch.v2.quantization as Q + >>> input = torch.randn(5, 10) + >>> qdq = Q.affine.QuantizeDequantize(shape=(5, 2), bitwidth=8, symmetric=False, block_size=(1, 5)) + >>> qdq.is_initialized() + False + >>> qdq.min = torch.nn.Parameter(-torch.ones_like(qdq.min)) + >>> qdq.max = torch.nn.Parameter(torch.ones_like(qdq.max)) + >>> qdq.is_initialized() + True + >>> qdq(input) + DequantizedTensor([[-0.6196, -0.9961, 0.0549, -0.6431, 1.0039, -0.8706, + 1.0039, 0.4706, -0.2353, 0.8078], + [ 0.3451, -0.1176, -0.9961, -0.4549, -0.0549, -0.0471, + -0.5255, -0.2353, 1.0039, -0.9961], + [-0.4157, 0.0784, 0.5333, 0.1647, -0.9961, -0.9961, + -0.2118, -0.2196, 0.9176, 0.9490], + [ 1.0039, -0.7765, 0.4784, -0.8706, 1.0039, 0.6039, + -0.4157, -0.2118, -0.9961, 0.3137], + [ 1.0039, 0.3216, -0.2353, -0.7765, -0.9961, 0.8000, + 1.0039, 0.4157, 0.4392, 0.4863]], + grad_fn=<AliasBackward0>) + """ +
[docs] def forward(self, input: torch.Tensor) -> DequantizedTensor: + """Quantizes and dequantizes the input tensor + + Args: + input (torch.Tensor): Input to quantize and dequantize + + Returns: + Quantize-dequantized output + + """ + if not self.is_initialized(): + raise RuntimeError( + 'Failed to run QuantizeDequantize since quantization parameters are not initialized.' + ' Please initialize the quantization parameters using `compute_encodings()`.' + ) + + encoding = self.get_encoding() + output = quantize_dequantize(input, + encoding.scale.to(input.dtype), + encoding.offset.to(input.dtype), + encoding.bitwidth, + encoding.signed, + block_size=self.block_size) + output = output.as_subclass(DequantizedTensor) + output.encoding = encoding + return output
+ + +class Dequantize(torch.nn.Module): + """ + Applies dequantization to the input + """ + def forward(self, input: QuantizedTensor) -> DequantizedTensor: + # pylint: disable=no-self-use + """ + :param input: Input to dequantize + :return: Dequantized output + """ + return input.dequantize() + +class GroupedBlockQuantizeDequantize(QuantizeDequantize): + """ Class for performing Grouped Block Quantize Dequantize """ + def __init__(self, shape, bitwidth: int, symmetric: bool, decompressed_bw: int, + encoding_analyzer: EncodingAnalyzer = None, block_size: Optional[Tuple[int, ...]] = None, + block_grouping: Optional[Tuple[int, ...]] = None): + """ + Grouped Block Quantize Dequantize constructor. + + :param shape: Shape of the quantization parameters + :type shape: tuple + :param bitwidth: Quantization bitwidth + :type bitwidth: int + :param symmetric: If True, performs symmetric quantization; + otherwise, performs asymmetric quantization + :type symmetric: bool + :param decompressed_bw: Bitwidth used for decompression + :type decompressed_bw: int + :param encoding_analyzer: Encoding analyzer for calibrating quantization encodings + (default: absolute min-max encoding analyzer) + :type encoding_analyzer: EncodingAnalyzer, optional + :param block_size: Block size per dimension. + :type block_size: Tuple + :param block_grouping: Block grouping per dimension. If provided, every set of block_group scales will be + grouped together, and the maximum scale for all blocks in the group will be used to find + the scale in the decompressed_grid to be shared by all blocks in the group. + If no block_grouping is provided, default behavior uses a block group of 1 for all dims, + equivalent to Blockwise Quantization. + A value of -1 for a block group for a dimension is equivalent to grouping all blocks in + the dimension in one group. This is also equivalent to a block group value equal to the + number of blocks for that dimension. + :type block_grouping: Tuple + """ + super().__init__(shape, bitwidth, symmetric, encoding_analyzer, block_size) + self.decompressed_bw = decompressed_bw + self.block_grouping = block_grouping + if self.block_grouping is None: + # Default to BQ behavior with 1 for all block grouping dims if not provided + self.block_grouping = tuple(1 for _ in enumerate(self.shape)) + + if block_grouping is not None: + if len(block_grouping) != len(shape): + raise RuntimeError(f'Length of block grouping {block_grouping} must equal length of shape {shape}.') + for idx, block_group in enumerate(block_grouping): + if block_group != -1 and shape[idx] % block_group != 0: + raise RuntimeError(f'Quantizer shape dimensions must divide evenly with corresponding block ' + f'grouping values for shapes {shape} and block grouping {block_grouping}.') + + if self.decompressed_bw < self.bitwidth: + raise RuntimeError(f'Decompressed bitwidth {decompressed_bw} cannot be smaller than self.bitwidth ' + f'{bitwidth}') + + if not symmetric: + raise RuntimeError('GroupedBlockQuantizeDequantize only supports symmetric quantization.') + + def get_scale(self, dtype=None) -> torch.Tensor: + """ + Compute quantization scale to be used for forward pass. + Overrides QuantizeDequantize self.get_scale() to apply the grouped block algorithm for calculating modified + scales. + + :param dtype: dtype of the computed scale. Use of self.min.dtype by default. 
+ :return: Updated scale + """ + orig_scale = super().get_scale(dtype) + orig_scale_shape = orig_scale.shape + reshaped_scale = orig_scale.view(self.get_expanded_scale_shape()) + max_scale = torch.amax(reshaped_scale, list(range(1, len(orig_scale_shape) * 2, 2)), keepdim=True) + per_channel_scale = max_scale / 2 ** (self.decompressed_bw - self.bitwidth) + updated_scale = quantize_dequantize(reshaped_scale, + scale=per_channel_scale, + offset=torch.zeros_like(per_channel_scale), + qmin=1, + qmax=2 ** (self.decompressed_bw - self.bitwidth)) + return updated_scale.view(orig_scale_shape) + + def get_expanded_scale_shape(self) -> List[int]: + """ + Get expanded scale shape which breaks each scale dimension into a pair of dimensions with sizes + (original_shape / block_grouping, block_grouping). + + :return: Expanded scale shape + """ + expanded_shape = [] + for idx, block_group in enumerate(self.block_grouping): + # Block group of -1 is equivalent to grouping all blocks together + if block_group == -1: + expanded_shape.append(1) + expanded_shape.append(self.shape[idx]) + else: + expanded_shape.append(self.shape[idx] // block_group) + expanded_shape.append(block_group) + return expanded_shape + + def get_per_channel_scale(self, dtype=None) -> torch.Tensor: + """ + Get per channel scale. + + :return: Per channel scale + """ + orig_scale = super().get_scale(dtype) + orig_scale_shape = orig_scale.shape + reshaped_scale = orig_scale.view(self.get_expanded_scale_shape()) + max_scale = torch.amax(reshaped_scale, list(range(1, len(orig_scale_shape) * 2, 2)), keepdim=True) + per_channel_scale = max_scale / 2 ** (self.decompressed_bw - self.bitwidth) + return per_channel_scale + + def get_per_block_integer_scale(self) -> torch.Tensor: + """ + Get per block integer scale. + + :return: Per block integer scale + """ + per_channel_scale = self.get_per_channel_scale() + expanded_scale = self.get_scale().view(self.get_expanded_scale_shape()) + integer_scale = torch.round(expanded_scale / per_channel_scale).int().view(self.get_scale().shape) + return integer_scale +
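To make the grouped-block scale math above concrete, here is a minimal numeric sketch with made-up numbers, using plain torch rather than the class itself: four block scales share one group, bitwidth=4 and decompressed_bw=8, so per-block integer scales live in [1, 16], mirroring get_per_channel_scale() and get_per_block_integer_scale().

import torch

block_scales = torch.tensor([0.10, 0.22, 0.35, 0.40])      # toy per-block scales within one group
bitwidth, decompressed_bw = 4, 8

per_channel_scale = block_scales.max() / 2 ** (decompressed_bw - bitwidth)   # 0.40 / 16 = 0.025
int_scales = torch.round(block_scales / per_channel_scale)                   # tensor([ 4.,  9., 14., 16.])
int_scales = int_scales.clamp(1, 2 ** (decompressed_bw - bitwidth))          # same [1, 16] range as qmin/qmax above
effective_scales = int_scales * per_channel_scale                            # scales actually applied during QDQ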
\ No newline at end of file
diff --git a/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/base/quantizer.html b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/base/quantizer.html
new file mode 100644
index 00000000..db662e6d
--- /dev/null
+++ b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/base/quantizer.html
@@ -0,0 +1,373 @@
+aimet_torch.v2.quantization.base.quantizer — AI Model Efficiency Toolkit Documentation: ver 1.32.2

Source code for aimet_torch.v2.quantization.base.quantizer

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+""" Quantizer base class """
+
+import abc
+import copy
+from collections import OrderedDict
+import contextlib
+import weakref
+from typing import Optional, List, Dict
+
+import torch
+from torch import nn
+
+from packaging import version  # pylint: disable=wrong-import-order
+from aimet_torch.v2.quantization.base import EncodingBase
+from aimet_torch.v2.quantization.encoding_analyzer import EncodingAnalyzer
+
+
+__all__ = ['QuantizerBase']
+
+
+
[docs]class QuantizerBase(abc.ABC, torch.nn.Module): + """ + Quantizer base class + """ + encoding_analyzer: EncodingAnalyzer + + def __init__(self): + super().__init__() + + # param_name -> (weakref of initial parameter, version info of the initial parameter) + # This info will be used for judging whether the current parameter has ever been + # initialized after it was instantiated. + self._initial_parameters = OrderedDict() + self._allow_overwrite = True + +
[docs] @abc.abstractmethod + @contextlib.contextmanager + def compute_encodings(self): + """ + Observe inputs and update quantization parameters based on the input statistics. + """
+ +
[docs] @abc.abstractmethod + def get_legacy_encodings(self) -> Optional[List[Dict]]: + """ + Returns a list of encodings, each represented as a List of Dicts + """
+ +
[docs] @abc.abstractmethod + def set_legacy_encodings(self, encodings: List[Dict]): + """ + Set encodings represented in the same format as the output of get_legacy_encodings. + """
+ +
[docs] @abc.abstractmethod + def get_encoding(self) -> Optional[EncodingBase]: + """ + Return the quantizer's encodings as an EncodingBase object + """
+ +
[docs] def register_quantization_parameter(self, name: str, param: nn.Parameter): + """ + Register quantization parameter. + """ + # pylint: disable=protected-access + + self.register_parameter(name, param) + param = getattr(self, name) + self._initial_parameters[name] = (weakref.ref(param), param._version)
+ +
[docs] def is_initialized(self) -> bool: + """ + Returns true if the quantization parameters are initialized. + """ + for param_name, _ in self.named_parameters(): + if not self._is_initialized(param_name): + return False + return True
+ + def _is_initialized(self, param_name) -> bool: + # pylint: disable=protected-access + + initial_param_weakref, initial_param_version = self._initial_parameters[param_name] + initial_param = initial_param_weakref() + + if initial_param is None: + # The initial parameter object doesn't exist in memory space anymore. + return True + + current_param = getattr(self, param_name) + + if current_param is initial_param and current_param._version == initial_param_version: + # 1. Current parameter is the identical object as the initial parameter + # 2. The version nubmer of the current parameter never changed + return False + + return True + + def state_dict(self, *args, **kwargs): # pylint: disable=arguments-differ + state_dict = super().state_dict(*args, **kwargs) # pylint: disable=missing-kwoa + + if version.parse(torch.__version__) < version.parse("1.10"): + # This is for backward compatibility with torch < 1.10 + # which doesn't support get/set_extra_state() hooks + prefix = kwargs['prefix'] + state_dict[f'{prefix}extra_state'] = self.get_extra_state() + + return state_dict + + def load_state_dict(self, state_dict, strict: bool = True): # pylint:disable=arguments-differ + if '_extra_state' not in state_dict: + is_initialized = { + param_name: True for param_name in state_dict + if param_name in self._parameters + } + state_dict['_extra_state'] = is_initialized + + ret = super().load_state_dict(state_dict, strict) + + if version.parse(torch.__version__) < version.parse("1.10"): + # This is for backward compatibility with torch < 1.10 + # which doesn't support get/set_extra_state() hooks + self.set_extra_state(state_dict['_extra_state']) + + return ret + + def get_extra_state(self): + """ + Get extra state that describes which parameters are initialized. + """ + return { + param_name: self._is_initialized(param_name) + for param_name, _ in self.named_parameters() + } + + @torch.no_grad() + def set_extra_state(self, state): + """ + Set extra state that describes which parameters are initialized. + """ + is_initialized = state + for param_name, param in self._parameters.items(): + if param_name in is_initialized: + self.register_quantization_parameter(param_name, param) + + if is_initialized[param_name]: + # If the parameter has been already initialized, + # artificially increment the parameter version to mark as initialized + param.mul_(1.) + + @torch.no_grad() + def __deepcopy__(self, memo): + self_copy = self.__new__(type(self)) + self_copy.__dict__ = copy.deepcopy(self.__dict__, memo) + self_copy.set_extra_state(self.get_extra_state()) + return self_copy + + def __getstate__(self): + state = self.__dict__.copy() + state.pop('_initial_parameters') + state['is_initialized'] = self.get_extra_state() + return state + + @torch.no_grad() + def __setstate__(self, state): + self._initial_parameters = OrderedDict() + is_initialized = state.pop('is_initialized') + self.__dict__.update(state) + self.set_extra_state(is_initialized) + +
[docs] def allow_overwrite(self, mode: bool): + """ Set allow_overwite flag """ + self._allow_overwrite = mode
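The is_initialized() machinery above relies on torch's in-place version counter: register_quantization_parameter() records a weak reference to the parameter together with its _version, and any subsequent in-place write marks it as initialized. A minimal sketch of just that underlying mechanism (outside the class):

import torch

p = torch.nn.Parameter(torch.zeros(2))
version_at_registration = p._version         # what register_quantization_parameter() stores

with torch.no_grad():
    p.copy_(torch.tensor([0.5, 1.0]))        # any in-place write bumps the version counter

assert p._version > version_at_registration  # _is_initialized() would now report True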
+
\ No newline at end of file
diff --git a/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/encoding_analyzer.html b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/encoding_analyzer.html
new file mode 100644
index 00000000..26c1f6e7
--- /dev/null
+++ b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/encoding_analyzer.html
@@ -0,0 +1,794 @@
+aimet_torch.v2.quantization.encoding_analyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2

Source code for aimet_torch.v2.quantization.encoding_analyzer

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+# pylint: disable=redefined-builtin
+# pylint: disable=missing-docstring
+
+""" Computes statistics and encodings """
+
+from abc import ABC, abstractmethod
+import math
+from dataclasses import dataclass
+from typing import TypeVar, Generic, Tuple, Optional, List
+import itertools
+import torch
+from aimet_torch.v2.utils import reduce, StatisticsNotFoundError, _is_expandable
+
+
+@dataclass
+class _MinMaxRange:
+    min: Optional[torch.Tensor] = None
+    max: Optional[torch.Tensor] = None
+
+@dataclass
+class _Histogram:
+    histogram: torch.Tensor = None
+    bin_edges: torch.Tensor = None
+    min: Optional[torch.Tensor] = None
+    max: Optional[torch.Tensor] = None
+
+_Statistics = TypeVar('_Statistics', _MinMaxRange, _Histogram)
+
+class _Observer(Generic[_Statistics], ABC):
+    """
+    Observes and gathers statistics
+    """
+    def __init__(self, shape: tuple):
+        if isinstance(shape, int):
+            shape = (shape,)
+        self.shape = shape
+
+    @abstractmethod
+    def collect_stats(self, input_tensor: torch.Tensor) -> _Statistics:
+        pass
+
+    @abstractmethod
+    def merge_stats(self, stats: _Statistics):
+        pass
+
+    @abstractmethod
+    def reset_stats(self):
+        pass
+
+    @abstractmethod
+    def get_stats(self) -> _Statistics:
+        pass
+
+
+class _MinMaxObserver(_Observer[_MinMaxRange]):
+    """
+    Observer for Min-Max calibration technique
+    """
+    def __init__(self, shape: tuple):
+        super().__init__(shape)
+        self.stats = _MinMaxRange()
+
+    @torch.no_grad()
+    def collect_stats(self, input_tensor: torch.Tensor) -> _MinMaxRange:
+        new_min = reduce(input_tensor, shape=self.shape, reduce_op=torch.min).values
+        new_max = reduce(input_tensor, shape=self.shape, reduce_op=torch.max).values
+        return _MinMaxRange(new_min, new_max)
+
+    @torch.no_grad()
+    def merge_stats(self, stats: _MinMaxRange):
+        updated_min = self.stats.min
+        if stats.min is not None:
+            if updated_min is None:
+                updated_min = stats.min.clone()
+            else:
+                updated_min = torch.minimum(updated_min, stats.min)
+
+        updated_max = self.stats.max
+        if stats.max is not None:
+            if updated_max is None:
+                updated_max = stats.max.clone()
+            else:
+                updated_max = torch.maximum(updated_max, stats.max)
+
+        self.stats = _MinMaxRange(updated_min, updated_max)
+
+    def reset_stats(self):
+        self.stats = _MinMaxRange()
+
+    def get_stats(self) -> _MinMaxRange:
+        return self.stats
+
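Purely illustrative (this observer is an internal helper): collecting and merging running min/max over two batches with a per-tensor shape of (1,). The expected values in the comments assume that reduce() collapses the input down to that shape.

import torch
from aimet_torch.v2.quantization.encoding_analyzer import _MinMaxObserver  # internal class defined above

obs = _MinMaxObserver(shape=(1,))
obs.merge_stats(obs.collect_stats(torch.tensor([[-1.0, 2.0, 0.5]])))
obs.merge_stats(obs.collect_stats(torch.tensor([[-3.0, 1.0, 0.0]])))

stats = obs.get_stats()
# stats.min -> tensor([-3.]), stats.max -> tensor([2.]): the running range covers both batches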
+
+class _HistogramObserver(_Observer[_Histogram]):
+    """
+    Observer for Histogram based calibration techniques (percentile, MSE)
+    """
+    def __init__(self, shape: tuple, num_bins: int):
+        super().__init__(shape)
+        self.num_bins = num_bins
+        self.num_histograms = torch.prod(torch.Tensor(self.shape), dtype=int).item()
+        self.stats = []
+        for _ in range(self.num_histograms):
+            self.stats.append(_Histogram())
+
+
+    # pylint: disable=too-many-locals
+    @torch.no_grad()
+    def collect_stats(self, input_tensor: torch.Tensor) -> List[_Histogram]:
+        if not _is_expandable(self.shape, input_tensor.shape):
+            raise RuntimeError(f"Shape {self.shape} is incompatible with input of shape {input_tensor.shape}")
+
+        hist_stats = []
+        input_shape = tuple(input_tensor.shape)
+        histogram_shape = self.shape
+
+        padded_histogram_shape = (
+            *itertools.repeat(1, len(input_shape) - len(histogram_shape)),
+            *histogram_shape
+        )
+
+        for hist_num in range(self.num_histograms):
+            hist_input = input_tensor
+
+            for axis, dim in enumerate(padded_histogram_shape):
+                if dim == 1:
+                    continue
+                # elements in current axis, ex: could be W*C, C, or 1 for input_shape [H, W, C]
+                numel = torch.prod(torch.Tensor(padded_histogram_shape[axis+1:]), dtype=int)
+                # index at which hist_input will be sliced along the current dimension
+                index = (hist_num // numel) % dim
+                hist_input = torch.unsqueeze(torch.select(hist_input, axis, index), axis)
+
+            hist_min, hist_max = self._handle_inf_inputs(hist_input)
+            bin_edges = self._create_bin_edges(min_val=hist_min, max_val=hist_max, device=input_tensor.device)
+            histogram = torch.histc(hist_input.to(torch.float), bins=self.num_bins, min=bin_edges[0], max=bin_edges[-1])
+
+            # clip inf values to hist_min and hist_max and adjust for any fp errors
+            histogram[0] += torch.sum(hist_input < bin_edges[0])
+            histogram[-1] += torch.sum(hist_input > bin_edges[-1])
+
+            hist_stats.append(_Histogram(histogram, bin_edges, hist_min, hist_max))
+
+        return hist_stats
+
+    # pylint: disable=no-self-use
+    def _handle_inf_inputs(self, hist_input):
+        if torch.all(torch.isinf(hist_input)):
+            raise ValueError('Input tensor cannot contain only infinite values')
+
+        min = hist_input[hist_input.isfinite()].min()
+        max = hist_input[hist_input.isfinite()].max()
+
+        return min, max
+
+    def _create_bin_edges(self, min_val, max_val, device):
+        # Adjust min/max values to be in line with PyTorch's torch.histc implementation
+        if max_val == min_val:
+            min_val = min_val - 0.5
+            max_val = max_val + 0.5
+
+        min_val, max_val = min_val.float(), max_val.float()
+
+        step = (max_val - min_val) / self.num_bins
+
+        return torch.arange(0, self.num_bins + 1, device=device) * step + min_val
+
+    def _get_bin_num(self, bin_width: int, curr_min, data):
+        bin_tensor = torch.full(data.shape, self.num_bins - 1, device=data.device)
+        index_tensor = (data - curr_min) / bin_width
+        return torch.minimum(index_tensor.to(torch.int32), bin_tensor)
+
+    # pylint: disable=arguments-differ
+    # pylint: disable=too-many-locals
+    @torch.no_grad()
+    def merge_stats(self, new_stats_list: List[_Histogram], input_tensor: torch.Tensor):
+
+        if self.stats[0].histogram is None:
+            self.stats = new_stats_list
+            return
+
+        hist_inputs = torch.reshape(input_tensor, (len(new_stats_list), -1))
+        for index, new_stats in enumerate(new_stats_list):
+            curr_stats = self.stats[index]
+            curr_input = hist_inputs[index]
+
+            updated_min = min(new_stats.min, curr_stats.min)
+            updated_max = max(new_stats.max, curr_stats.max)
+
+            # if the current histogram can capture new_stats within its range
+            if updated_min == curr_stats.min and updated_max == curr_stats.max:
+                histogram_updates = curr_stats.histogram
+            else:
+
+                dest_bin_width = (updated_max - updated_min) / self.num_bins
+                src_bin_width = (curr_stats.max - curr_stats.min) / self.num_bins
+                histogram_updates = torch.zeros(self.num_bins).to(input_tensor.device)
+
+                src_bin_start = curr_stats.min + (src_bin_width * torch.arange(0, self.num_bins, device=input_tensor.device))
+                dest_bin_index = self._get_bin_num(dest_bin_width, updated_min, src_bin_start)
+                dest_bin_end = updated_min + dest_bin_width * (dest_bin_index + 1)
+
+                # split curr_hist if values in source bin cannot neatly fold into dest bin
+                split_hist_value = torch.round(((dest_bin_end - src_bin_start) / src_bin_width) * curr_stats.histogram)
+                dest_bin_updates = torch.minimum(split_hist_value, curr_stats.histogram)
+
+                # update appropriate bin with either the full or split curr_hist value
+                for i, dest_bin in enumerate(dest_bin_index):
+                    histogram_updates[dest_bin] += dest_bin_updates[i]
+
+                # if curr_hist is split, update the other bin that the remaining values fall into
+                other_bins = torch.nonzero(torch.where(dest_bin_updates < curr_stats.histogram, 1, 0))
+                other_bin_index = self._get_bin_num(dest_bin_width, updated_min, src_bin_start + dest_bin_width)
+                other_bin_updates = curr_stats.histogram - dest_bin_updates
+                for bin_num in other_bins:
+                    histogram_updates[other_bin_index[bin_num]] += other_bin_updates[bin_num]
+
+            # create histogram given input tensor and full range
+            expanded_histogram = torch.histc(curr_input.to(torch.float), bins=self.num_bins, min=updated_min, max=updated_max)
+            expanded_histogram += histogram_updates.to(expanded_histogram.device)
+
+            # clip inf values to hist_min and hist_max
+            expanded_histogram[0] += torch.sum(curr_input == -float('inf'))
+            expanded_histogram[-1] += torch.sum(curr_input == float('inf'))
+
+            expanded_bin_edges = self._create_bin_edges(min_val=updated_min, max_val=updated_max, device=expanded_histogram.device)
+            self.stats[index] = _Histogram(expanded_histogram, expanded_bin_edges, updated_min, updated_max)
+
+    def reset_stats(self):
+        self.stats = []
+        for _ in range(self.num_histograms):
+            self.stats.append(_Histogram())
+
+    def get_stats(self) -> List[_Histogram]:
+        return self.stats
+
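Again purely illustrative, assuming a single per-tensor histogram: collect_stats() buckets a small input into num_bins equal-width bins between the observed finite min and max.

import torch
from aimet_torch.v2.quantization.encoding_analyzer import _HistogramObserver  # internal class defined above

obs = _HistogramObserver(shape=(1,), num_bins=4)
stats = obs.collect_stats(torch.tensor([0., 1., 2., 3.]))

# One histogram (shape (1,) has a single element), 4 bins spanning [0, 3]:
# stats[0].histogram -> tensor([1., 1., 1., 1.])
# stats[0].bin_edges -> tensor([0.0000, 0.7500, 1.5000, 2.2500, 3.0000])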
+
[docs]class EncodingAnalyzer(Generic[_Statistics], ABC): + def __init__(self, observer: _Observer): + self.observer = observer + + @torch.no_grad() + def update_stats(self, input_tensor: torch.Tensor) -> _Statistics: + new_stats = self.observer.collect_stats(input_tensor) + self.observer.merge_stats(new_stats) + return new_stats + + def reset_stats(self) -> None: + self.observer.reset_stats() + + def compute_encodings(self, num_steps: int, is_symmetric: bool) -> torch.Tensor: + return self.compute_encodings_from_stats(self.observer.get_stats(), num_steps, is_symmetric) + + def compute_dynamic_encodings(self, input_tensor: torch.Tensor, num_steps: int, + is_symmetric: bool)-> Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]: + return self.compute_encodings_from_stats( + self.observer.collect_stats(input_tensor), num_steps, is_symmetric) + + @abstractmethod + def compute_encodings_from_stats(self, stats: _Statistics, num_steps: int, is_symmetric: bool)\ + -> Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]: + pass
+ +
[docs]class MinMaxEncodingAnalyzer(EncodingAnalyzer[_MinMaxRange]): + """ + Encoding Analyzer for Min-Max calibration technique + """ + def __init__(self, shape: tuple): + observer = _MinMaxObserver(shape) + super().__init__(observer) + + #pylint: disable=too-many-locals + @torch.no_grad() + def compute_encodings_from_stats(self, stats: _MinMaxRange, num_steps: int, is_symmetric: bool)\ + -> Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]: + if num_steps <= 0: + raise ValueError('The number of quantization bins cannot be less than or equal to 0.') + + if stats.min is None or stats.max is None: + raise StatisticsNotFoundError('No statistics present to compute encodings.') + + tiny_num = torch.finfo(stats.min.dtype).tiny + # enforces that 0 is within the min/max + min_with_zero = torch.clamp(stats.min, max=0) + max_with_zero = torch.clamp(stats.max, min=0) + + # adjusts any min/max pairing that are too close + tensor_diff = (max_with_zero - min_with_zero) / num_steps + adjustment_step = tiny_num * (tensor_diff < tiny_num) + + updated_max = max_with_zero + math.floor(num_steps / 2) * adjustment_step + updated_min = min_with_zero - math.ceil(num_steps / 2) * adjustment_step + + if is_symmetric: + num_pos_steps = math.floor(num_steps / 2) + num_neg_steps = math.ceil(num_steps / 2) + delta = torch.maximum(updated_max / num_pos_steps, -updated_min / num_neg_steps) + offset = -1 * num_neg_steps + updated_min = offset * delta + updated_max = num_pos_steps * delta + + # replace pos and neg inf respectively + updated_max = torch.clamp(updated_max, max=torch.finfo(stats.min.dtype).max) + updated_min = torch.clamp(updated_min, min=torch.finfo(stats.min.dtype).min) + return updated_min, updated_max
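A short usage sketch of the min-max analyzer above; note how the computed range is widened so that it always contains zero, even though the observed minimum here is positive.

import torch
from aimet_torch.v2.quantization.encoding_analyzer import MinMaxEncodingAnalyzer

analyzer = MinMaxEncodingAnalyzer(shape=(1,))
analyzer.update_stats(torch.tensor([[0.5, 3.0, 1.2]]))

enc_min, enc_max = analyzer.compute_encodings(num_steps=255, is_symmetric=False)
# enc_min -> tensor([0.]) (clamped to include zero), enc_max -> tensor([3.])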
+ + +def adjust_min_max(curr_min, curr_max, num_steps, is_symmetric): + # ensure that 0 is in the range + curr_min = torch.minimum(curr_min, torch.zeros_like(curr_min)) + curr_max = torch.maximum(curr_max, torch.zeros_like(curr_max)) + + # ensure that min/max are finite + curr_min.clamp_(min=torch.finfo(curr_max.dtype).min, max=0) + curr_max.clamp_(min=0, max=torch.finfo(curr_max.dtype).max) + + # ensure that min/max aren't too close + tiny_num = torch.finfo(curr_min.dtype).tiny + tensor_threshold = (curr_max - curr_min) / num_steps + curr_min[tensor_threshold < tiny_num] -= tiny_num * math.ceil(num_steps / 2) + curr_max[tensor_threshold < tiny_num] += tiny_num * math.floor(num_steps / 2) + + if is_symmetric: + num_pos_steps = math.floor(num_steps / 2) + num_neg_steps = math.ceil(num_steps / 2) + delta = max(curr_max / num_pos_steps, -curr_min / num_neg_steps) + offset = -1 * num_neg_steps + + curr_min = offset * delta + curr_max = num_pos_steps * delta + + return curr_min, curr_max + +# pylint: disable=arguments-differ +
[docs]class PercentileEncodingAnalyzer(EncodingAnalyzer[_Histogram]): + """ + Encoding Analyzer for Percentile calibration technique + """ + def __init__(self, shape: tuple, num_bins: int = 2048, percentile: float = 100): + if num_bins <= 0: + raise ValueError('Number of bins cannot be less than or equal to 0.') + + observer = _HistogramObserver(shape=shape, num_bins=num_bins) + super().__init__(observer) + self.set_percentile(percentile) + +
[docs] def set_percentile(self, percentile): + """ + Set the clipping percentile of the encoding analyzer. The encoding analyzer will clip the (100% - percentile) + largest and smallest observed values from the encoding range when computing encodings. + + :param percentile: Value from 50.0 to 100.0 indicating the clipping percentile + """ + if percentile < 50 or percentile > 100: + raise ValueError('Percentile value must be within 50-100 range') + + self.percentile = percentile
+ + @torch.no_grad() + def update_stats(self, input_tensor: torch.Tensor) -> _Statistics: + new_stats = self.observer.collect_stats(input_tensor) + self.observer.merge_stats(new_stats, input_tensor) + return new_stats + + # pylint: disable=too-many-locals + @torch.no_grad() + def compute_encodings_from_stats(self, stats: List[_Histogram], num_steps: int, is_symmetric: bool)\ + -> Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]: + + if num_steps <= 0: + raise ValueError('The number of quantization bins cannot be less than or equal to 0.') + + if stats[0].histogram is None: + raise StatisticsNotFoundError('No statistics present to compute encodings.') + + encoding_min_list = [] + encoding_max_list = [] + + for list_elem in stats: + cum_sum = torch.cumsum(list_elem.histogram, dim=0) + # trim percentile value from min and max + max_index = torch.searchsorted(cum_sum, cum_sum[-1] * self.percentile/100) + min_index = torch.searchsorted(cum_sum, cum_sum[-1] * (1 - self.percentile/100)) + + if self.percentile == 100: + min_index = 0 + max_index = -1 + curr_min = list_elem.bin_edges[min_index] + curr_max = list_elem.bin_edges[max_index] + # adjust min/max + updated_min, updated_max = adjust_min_max(curr_min, curr_max, num_steps, is_symmetric) + encoding_min_list.append(updated_min) + encoding_max_list.append(updated_max) + + encoding_min = torch.tensor(encoding_min_list, device=stats[0].histogram.device) + encoding_min = torch.reshape(encoding_min, self.observer.shape) + + encoding_max = torch.tensor(encoding_max_list, device=stats[0].histogram.device) + encoding_max = torch.reshape(encoding_max, self.observer.shape) + + return encoding_min, encoding_max
+ + +
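A usage sketch for the percentile analyzer above: with percentile=99.9, roughly the most extreme 0.1% of observed values on each tail are clipped out of the range before the usual min/max adjustment is applied.

import torch
from aimet_torch.v2.quantization.encoding_analyzer import PercentileEncodingAnalyzer

analyzer = PercentileEncodingAnalyzer(shape=(1,), num_bins=2048, percentile=99.9)
analyzer.update_stats(torch.randn(1, 10000))

enc_min, enc_max = analyzer.compute_encodings(num_steps=255, is_symmetric=False)
# enc_min/enc_max track the bulk of the distribution rather than the absolute outliers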
[docs]class SqnrEncodingAnalyzer(EncodingAnalyzer[_Histogram]): + """ + Encoding Analyzer for SQNR Calibration technique + """ + def __init__(self, + shape: tuple, + num_bins: int = 2048, *, + asymmetric_delta_candidates=17, + symmetric_delta_candidates=101, + offset_candidates=21, + max_parallelism=64, + gamma=3.0): + """ + :param shape: Shape of calculated encoding + :param num_bins: number of bins to use per histogram + :param asymmetric_delta_candidates: number of delta values to search over in asymmetric mode + :param symmetric_delta_candidates: number of delta values to search over in symmetric mode + :param offset_candidates: number of offset values to search over in asymmetric mode + :param max_parallelism: maximum number of encodings to process parallely (higher number results in higher + memory usage but faster computation) + :param gamma: weighting factor on clipping noise (higher value results in less clipping noise) + """ + if num_bins <= 0: + raise ValueError('Number of bins cannot be less than or equal to 0.') + observer = _HistogramObserver(shape=shape, num_bins=num_bins) + super().__init__(observer) + self.asym_delta_candidates = asymmetric_delta_candidates + self.sym_delta_candidates = symmetric_delta_candidates + self.num_offset_candidates = offset_candidates + self.gamma = gamma + self.max_parallelism = max_parallelism + + @torch.no_grad() + def update_stats(self, input_tensor: torch.Tensor) -> _Statistics: + new_stats = self.observer.collect_stats(input_tensor) + self.observer.merge_stats(new_stats, input_tensor) + return new_stats + + # pylint: disable=too-many-locals +
[docs] @torch.no_grad() + def compute_encodings_from_stats(self, stats: List[_Histogram], num_steps: int, is_symmetric: bool)\ + -> Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]: + """ + Searches for encodings which produce the lowest expected SQNR based on the histograms in stats + + :param stats: A list of _Histogram objects with length equal to the number of encodings to compute + :param num_steps: The number of bins the quantized range is split into + :param is_symmetric: If True, computes symmetric encodings, else computes asymmetric encodings + :return: Tuple of computed encodings (min, max) as tensors with shape self.shape + """ + if stats[0].histogram is None: + raise StatisticsNotFoundError('No statistics present to compute encodings.') + if num_steps <= 0: + raise ValueError('The number of quantization bins cannot be less than or equal to 0.') + chunked_stats = [stats[i:min(i+self.max_parallelism, len(stats))] for i in range(0, len(stats), self.max_parallelism)] + best_deltas, best_offsets = [], [] + for stats_ in chunked_stats: + test_deltas, test_offsets = self._pick_test_candidates(stats_, num_steps, is_symmetric) + best_delta, best_offset = self._select_best_candidates(test_deltas, test_offsets, stats_, num_steps) + best_deltas.append(best_delta) + best_offsets.append(best_offset) + best_offset = best_offsets[0] if is_symmetric else torch.cat(best_offsets) + best_delta = torch.cat(best_deltas) + min_enc = best_offset * best_delta + max_enc = min_enc + num_steps * best_delta + return min_enc.view(self.observer.shape).to(stats[0].max.dtype), \ + max_enc.view(self.observer.shape).to(stats[0].max.dtype)
+ + def _pick_test_candidates(self, stats, num_steps, symmetric): + # min/max.shape = (num_histograms, ) + min_vals = torch.stack([stat.min for stat in stats]) + max_vals = torch.stack([stat.max for stat in stats]) + min_vals = torch.min(min_vals, torch.zeros_like(min_vals)) + max_vals = torch.max(max_vals, torch.zeros_like(max_vals)) + max_vals = torch.max(max_vals, min_vals + torch.finfo(min_vals.dtype).tiny * num_steps) + if symmetric: + return self._pick_test_candidates_symmetric(min_vals, max_vals, num_steps) + return self._pick_test_candidates_asymmetric(min_vals, max_vals, num_steps) + + def _pick_test_candidates_asymmetric(self, min_vals, max_vals, num_steps): + """ + Selects the set of deltas and offsets over which to search for the optimal encodings + """ + # Note: casting to float32 for two reason: + # 1) float16 on CPU is not well-supported in pytorch + # 2) Computing int16 encodings using f16 can result in inf (2 ** 16 - 1 == inf in fp16) + tensor_kwargs = {"device": min_vals.device, "dtype": torch.float32} + max_delta = (max_vals - min_vals).to(torch.float32) / num_steps + observed_offset = torch.round(min_vals / max_delta) + observed_min = max_delta * observed_offset + observed_max = observed_min + max_delta * num_steps + num_deltas = self.asym_delta_candidates + search_space = torch.arange(start=1, end=(1 + num_deltas), step=1, **tensor_kwargs) + # test_deltas.shape = (num_histograms, num_tests) + test_deltas = max_delta[:, None] * search_space[None, :] / (num_deltas - 1) + # test_offsets.shape = (num_offsets) + num_offsets = min(num_steps + 2, self.num_offset_candidates) + test_offset_step = num_steps / (num_offsets - 2) # subtract 2 because we add the observed offset + test_offsets = torch.round(torch.arange(start=-num_steps, end=test_offset_step, step=test_offset_step, **tensor_kwargs)) + test_offsets = test_offsets[None, :].expand(min_vals.shape[0], -1) + # Add in the observed offset as a candidate, test_offsets.shape = (num_histograms, num_offsets + 1) + test_offsets = torch.concat((test_offsets, observed_offset[:, None]), dim=1) + return self._clamp_delta_offset_values(observed_min, observed_max, num_steps, test_deltas, test_offsets) + + def _pick_test_candidates_symmetric(self, min_vals, max_vals, num_steps): + """ + Selects the set of deltas over which to search for the optimal symmetric encodings + """ + tensor_kwargs = {"device": min_vals.device, "dtype": torch.float32} + max_delta = 2 * torch.max(max_vals, -min_vals).to(torch.float32) / num_steps + test_offsets = torch.full((1, ), (-num_steps) // 2, **tensor_kwargs) + num_deltas = self.sym_delta_candidates + search_space = torch.arange(start=1, end=(1 + num_deltas), step=1, **tensor_kwargs) + test_deltas = max_delta[:, None] * search_space[None, :] / (num_deltas - 1) + # test_deltas.shape = (num_histograms, num_deltas, 1) + # test_offsets.shape = (1, 1, 1) + min_delta = torch.Tensor([torch.finfo(test_deltas.dtype).tiny]).to(**tensor_kwargs) + test_deltas = torch.max(test_deltas, min_delta) + return test_deltas[:, :, None], test_offsets[:, None, None] + + @staticmethod + def _clamp_delta_offset_values(min_vals, max_vals, num_steps, test_deltas, test_offsets): + """ + Clamps delta/offset encodings such that represented range falls within the observed min/max range of inputs + """ + # test_min shape = (num_histograms, num_deltas, num_offsets) + test_min = test_deltas[:, :, None] * test_offsets[:, None, :] + test_max = test_min + test_deltas[:, :, None] * num_steps + # Clamp min/max to observed min/max + test_min = 
torch.max(min_vals[:, None, None], test_min) + test_max = torch.min(max_vals[:, None, None], test_max) + # Recompute delta/offset with clamped min/max + # Returned delta/offset shapes = (num_histograms, num_deltas, num_offsets) + test_deltas = (test_max - test_min) / num_steps + min_delta = torch.Tensor([torch.finfo(test_deltas.dtype).tiny]).to(device=test_deltas.device, + dtype=test_deltas.dtype) + test_deltas = torch.max(test_deltas, min_delta) + test_offsets = torch.round(test_min / test_deltas) + return test_deltas, test_offsets + + def _select_best_candidates(self, test_deltas, test_offsets, stats, num_steps): + """ + Searches all pairs of (delta, offset) in test_deltas, test_offsets to find the set with the lowest expected SQNR + """ + noise = self._estimate_clip_and_quant_noise(stats, test_deltas, test_offsets, num_steps, self.gamma) + _, min_idx = torch.min(noise.flatten(start_dim=1), dim=1) + best_delta = torch.gather(test_deltas.flatten(start_dim=1), dim=1, index=min_idx[:, None]) + if test_offsets.numel() == 1: + best_offset = test_offsets + else: + best_offset = torch.gather(test_offsets.flatten(start_dim=1), dim=1, index=min_idx[:, None]) + return best_delta, best_offset + + # pylint: disable=too-many-locals + @staticmethod + def _estimate_clip_and_quant_noise(stats: List[_Histogram], + test_deltas: torch.Tensor, + test_offsets: torch.Tensor, + num_steps: int, + gamma: float = 1.0): + """ + Calculates the error from quantization for each delta, offset pair in test_deltas, test_offsets. + We approximately reconstruct x from hists by assuming all elements within a given bin fall exactly on the + midpoint of that bin. + + :param stats: list of _Histogram objects of observed input values + :param test_deltas: Tensor holding the values of all deltas to search with shape (num_hists, num_deltas, num_offsets) + :param test_offsets: Tensor holding values of all offsets to search with shape (num_hists, num_deltas, num_offsets) + :param num_steps: Number of quantization steps, i.e., (2 ** bitwidth) - 1 + :param gamma: Fudge factor to trade off between saturation cost and quantization cost. When gamma=1.0, this + approximates the MSE of the quantization function + """ + tensor_kwargs = {"device": test_deltas.device, "dtype": test_deltas.dtype} + hists = torch.stack([stat.histogram for stat in stats]) + bin_edges = torch.stack([stat.bin_edges for stat in stats]) + hist_delta = bin_edges[:, 1] - bin_edges[:, 0] + # hist_midpoints is shape (hists, num_bins) + hist_offsets = hist_delta[:, None] * torch.arange(0, bin_edges.shape[1] - 1, **tensor_kwargs)[None, :] + hist_midpoints = (bin_edges[:, 0] + hist_delta/2)[:, None] + hist_offsets + # hists_midpoints_qdq is shape (hists, num_deltas, num_offsets, num_bins) + test_offsets_bcast = test_offsets[:, :, :, None] + test_deltas_bcast = test_deltas[:, :, :, None] + hist_midpoints_qdq = hist_midpoints[:, None, None, :].div(test_deltas_bcast).sub(test_offsets_bcast).round() + if gamma != 1.0: + clipped = torch.logical_or(hist_midpoints_qdq < 0, + hist_midpoints_qdq > num_steps) + hist_midpoints_qdq = hist_midpoints_qdq.clamp_(0, num_steps).add_(test_offsets_bcast).mul_(test_deltas_bcast) + square_error = hist_midpoints_qdq.sub_(hist_midpoints[:, None, None, :]).pow_(2).mul_(hists[:, None, None, :]) + if gamma != 1.0: + # Apply the gamma "fudge factor" to the clipped errors + square_error = torch.where(clipped, square_error * gamma, square_error) + return torch.sum(square_error, dim=-1)
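A usage sketch for the SQNR analyzer above; the grid search over (delta, offset) candidates happens inside compute_encodings(), with clipping noise weighted by gamma.

import torch
from aimet_torch.v2.quantization.encoding_analyzer import SqnrEncodingAnalyzer

analyzer = SqnrEncodingAnalyzer(shape=(1,), num_bins=512, gamma=3.0)
analyzer.update_stats(torch.randn(8, 128))

enc_min, enc_max = analyzer.compute_encodings(num_steps=255, is_symmetric=True)
# The returned (min, max) pair is the candidate with the lowest expected
# quantization-plus-clipping noise for the observed histogram.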
+
\ No newline at end of file
diff --git a/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/float/quantizer.html b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/float/quantizer.html
new file mode 100644
index 00000000..e60fd67f
--- /dev/null
+++ b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/float/quantizer.html
@@ -0,0 +1,468 @@
+aimet_torch.v2.quantization.float.quantizer — AI Model Efficiency Toolkit Documentation: ver 1.32.2

Source code for aimet_torch.v2.quantization.float.quantizer

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+# pylint: disable=redefined-builtin
+""" Float quantizers """
+
+import contextlib
+import functools
+from typing import Optional, List, Dict
+import math
+
+import torch
+from aimet_torch.v2.quantization.encoding_analyzer import EncodingAnalyzer
+from aimet_torch.v2.quantization.base import QuantizerBase
+from aimet_torch.v2.quantization.float import FloatEncoding
+from aimet_torch.v2.utils import StatisticsNotFoundError, patch_attr
+from aimet_torch.fp_quantization import fake_cast_to_ieee_float
+
+
+__all__ = ['QuantizeDequantize', 'FloatQuantizeDequantize']
+
+
+def _ieee_float_max_representable_value(exponent_bits, mantissa_bits):
+    exponent_max = 2 ** exponent_bits - 1
+    exponent_bias = exponent_max // 2
+    return (2 - 2**-mantissa_bits) * 2 ** (exponent_max - exponent_bias - 1)
+
+
+_IEEE_FLOAT16_EXPONENT_BITS = 5
+_IEEE_FLOAT16_MANTISSA_BITS = 10
+assert _ieee_float_max_representable_value(_IEEE_FLOAT16_EXPONENT_BITS, _IEEE_FLOAT16_MANTISSA_BITS) == \
+        torch.finfo(torch.float16).max
+
+_BFLOAT16_EXPONENT_BITS = 8
+_BFLOAT16_MANTISSA_BITS = 7
+assert _ieee_float_max_representable_value(_BFLOAT16_EXPONENT_BITS, _BFLOAT16_MANTISSA_BITS) == \
+        torch.finfo(torch.bfloat16).max
+
+
+
[docs]class FloatQuantizeDequantize(QuantizerBase): # pylint: disable=abstract-method + r""" + Simulates quantization by fake-casting the input + + If dtype is provided, this is equivalent to + + .. math:: + out = x.to(dtype).to(x.dtype) \\ + + + If the exponent and mantissa bits are provided, this is equivalent to + + .. math:: + out = \left\lceil\frac{x_c}{scale}\right\rfloor * scale + + where + + .. math:: + x_c &= clamp(x, -max, max) \\ + bias &= 2^{exponent} - \log_2(max) + \log_2(2 - 2^{-mantissa}) - 1 \\ + scale &= 2 ^ {\left\lfloor \log_2 |x_c| + bias \right\rfloor - mantissa - bias} \\ + + + The IEEE standard computes the maximum representable value by + + .. math:: + max = (2 - 2^{-mantissa}) * 2^{(\left\lfloor 0.5 * exponent\_max \right\rfloor)} \\ + + where + + .. math:: + exponent\_max = 2^{exponent} - 1 \\ + + Args: + exponent_bits (int): Number of exponent bits to simulate + mantissa_bits (int): Number of mantissa bits to simulate + dtype (torch.dtype): torch.dtype to simulate. This argument is mutually exclusive with exponent_bits and mantissa_bits. + encoding_analyzer (EncodingAnalyzer): If specified, the maximum value to represent will be determined dynamically based on the input statistics for finer precision. + + Examples: + + >>> import aimet_torch.v2.quantization as Q + >>> input = torch.tensor([[ 1.8998, -0.0947],[-1.0891, -0.1727]]) + >>> qdq = Q.float.FloatQuantizeDequantize(mantissa_bits=7, exponent_bits=8) + >>> # Unlike AffineQuantizer, FloatQuantizer is initialized without calling compute_encodings() + >>> qdq.is_initialized() + True + >>> qdq.is_bfloat16() + True + >>> qdq.bitwidth + 16 + >>> qdq(input) + tensor([[ 1.8984, -0.0947], [-1.0859, -0.1729]]) + + >>> from aimet_torch.v2.quantization.encoding_analyzer import MinMaxEncodingAnalyzer + >>> encoding_analyzer = MinMaxEncodingAnalyzer(shape=(1,)) + >>> qdq = Q.float.FloatQuantizeDequantize(dtype=torch.float16, encoding_analyzer=encoding_analyzer) + >>> qdq.is_float16() + True + >>> qdq.bitwidth + 16 + >>> qdq(input) + tensor([[ 1.8994, -0.0947], [-1.0889, -0.1727]]) + """ + + maxval: torch.Tensor + + def __init__(self, + exponent_bits: int = None, + mantissa_bits: int = None, + dtype: torch.dtype = None, + encoding_analyzer: EncodingAnalyzer = None): + super().__init__() + + if dtype is None: + if exponent_bits is None or mantissa_bits is None: + raise ValueError('Neither "dtype" nor "exponent/mantissa_bits" was specified.') + + if dtype is not None: + if exponent_bits is not None or mantissa_bits is not None: + raise ValueError( + 'Argument "dtype" is mutually exclusive with "exponent/mantissa_bits".') + + if dtype not in (torch.half, torch.float16, torch.bfloat16): + raise ValueError( + f"Float quantizer only supports torch.float16 and torch.bfloat16. 
Got {dtype}.") + + if dtype in (torch.half, torch.float16): + exponent_bits = _IEEE_FLOAT16_EXPONENT_BITS + mantissa_bits = _IEEE_FLOAT16_MANTISSA_BITS + else: + exponent_bits = _BFLOAT16_EXPONENT_BITS + mantissa_bits = _BFLOAT16_MANTISSA_BITS + + self.exponent_bits = exponent_bits + self.mantissa_bits = mantissa_bits + self.encoding_analyzer = encoding_analyzer + + if self.encoding_analyzer: + shape = self.encoding_analyzer.observer.shape + maxval = _ieee_float_max_representable_value(exponent_bits, mantissa_bits) + self.register_buffer('maxval', torch.full(shape, maxval)) + else: + self.register_buffer('maxval', None) + + @property + def bitwidth(self): + """ + Returns bitwidth of the quantizer + """ + return self.exponent_bits + self.mantissa_bits + 1 + + def is_float16(self): + """ + Returns true if current configuration simulates IEEE float16 + """ + return self.exponent_bits == _IEEE_FLOAT16_EXPONENT_BITS and \ + self.mantissa_bits == _IEEE_FLOAT16_MANTISSA_BITS + + def is_bfloat16(self): + """ + Returns true if current configuration simulates bfloat16 + """ + return self.exponent_bits == _BFLOAT16_EXPONENT_BITS and \ + self.mantissa_bits == _BFLOAT16_MANTISSA_BITS + + def get_legacy_encodings(self) -> Optional[List[Dict]]: + """ + :meta private: + """ + return [{'bitwidth': self.bitwidth, 'dtype': 'float'}] + + def set_legacy_encodings(self, encodings: List[Dict]): + """ + :meta private: + Set encodings represented in the same format as the output of get_legacy_encodings as below: + + [ + {'bitwidth': int, 'dtype': str}, + ... + ] + """ + if encodings[0]['bitwidth'] != 16: + raise RuntimeError(f"{self.__class__} can only import 16-bit legay encodings.") + self.exponent_bits = 5 + self.mantissa_bits = 10 + + def get_encoding(self) -> Optional[FloatEncoding]: + if self.is_initialized(): + return FloatEncoding(self.mantissa_bits, self.exponent_bits, self.maxval) + return None + + @contextlib.contextmanager + def compute_encodings(self): + """ + Observe inputs and update quantization parameters based on the input statistics. + During ``compute_encodings`` is enabled, the quantizer forward pass performs + dynamic quantization using the batch statistics. 
+ """ + if not self.encoding_analyzer or not self._allow_overwrite: + yield + return + + original_forward = self.forward + + @functools.wraps(original_forward) + def forward_wrapper(input): + batch_statistics = self.encoding_analyzer.update_stats(input) + num_steps = math.pow(2, self.bitwidth) - 1 + dynamic_min, dynamic_max =\ + self.encoding_analyzer.compute_encodings_from_stats(batch_statistics, + num_steps, + is_symmetric=False) + dynamic_absmax = torch.maximum(dynamic_min.abs(), dynamic_max.abs()) + dynamic_absmax = dynamic_absmax.to(dtype=self.maxval.dtype, + device=self.maxval.device).expand_as(self.maxval) + + with patch_attr(self, 'maxval', dynamic_absmax): + return original_forward(input) + + self.encoding_analyzer.reset_stats() + + try: + with patch_attr(self, 'forward', forward_wrapper): + yield + except: # pylint: disable=try-except-raise + raise + else: + try: + num_steps = math.pow(2, self.bitwidth) - 1 + min, max = self.encoding_analyzer.compute_encodings(num_steps, + is_symmetric=False) + except StatisticsNotFoundError: + return + + if min is None or max is None: + return + + absmax = torch.maximum(min.abs(), max.abs()).expand_as(self.maxval) + with torch.no_grad(): + self.maxval.copy_(absmax) + + def forward(self, input: torch.Tensor): + """ + :param input: Input to quantize and dequantize + :return: Quantize-dequantized output + """ + maxval = self.maxval + exponent_bits = self.exponent_bits + mantissa_bits = self.mantissa_bits + + if maxval is None: + if self.is_float16() or self.is_bfloat16(): + # Fast forward using type casting + orig_dtype = input.dtype + dtype = torch.float16 if self.is_float16() else torch.bfloat16 + return input.to(dtype).to(orig_dtype) + + maxval = _ieee_float_max_representable_value(exponent_bits, mantissa_bits) + + return fake_cast_to_ieee_float(input, maxval, exponent_bits, mantissa_bits) + + def extra_repr(self): + """ + :meta private: + """ + return f'exponent_bits={self.exponent_bits}, mantissa_bits={self.mantissa_bits}'
+ +
[docs]class QuantizeDequantize(FloatQuantizeDequantize): + r""" + Alias of FloatQuantizeDequantize + """
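As a small self-check of the maximum-representable-value formula used above (the same arithmetic as _ieee_float_max_representable_value), evaluated for IEEE float16:

import torch

exponent_bits, mantissa_bits = 5, 10               # IEEE float16
exponent_max = 2 ** exponent_bits - 1
exponent_bias = exponent_max // 2
maxval = (2 - 2 ** -mantissa_bits) * 2 ** (exponent_max - exponent_bias - 1)

assert maxval == torch.finfo(torch.float16).max    # 65504.0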
+
\ No newline at end of file
diff --git a/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/tensor.html b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/tensor.html
new file mode 100644
index 00000000..244e84b6
--- /dev/null
+++ b/releases/1.32.2/torch_v2/_modules/aimet_torch/v2/quantization/tensor.html
@@ -0,0 +1,635 @@
+aimet_torch.v2.quantization.tensor — AI Model Efficiency Toolkit Documentation: ver 1.32.2

Source code for aimet_torch.v2.quantization.tensor

+# -*- mode: python -*-
+# =============================================================================
+#  @@-COPYRIGHT-START-@@
+#
+#  Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+#
+#  Redistribution and use in source and binary forms, with or without
+#  modification, are permitted provided that the following conditions are met:
+#
+#  1. Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#
+#  2. Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#
+#  3. Neither the name of the copyright holder nor the names of its contributors
+#     may be used to endorse or promote products derived from this software
+#     without specific prior written permission.
+#
+#  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+#  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+#  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+#  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+#  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+#  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+#  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+#  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+#  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+#  POSSIBILITY OF SUCH DAMAGE.
+#
+#  SPDX-License-Identifier: BSD-3-Clause
+#
+#  @@-COPYRIGHT-END-@@
+# =============================================================================
+""" Quantized tensor class implementation """
+
+import abc
+import copy
+
+import torch
+from torch.utils._pytree import tree_map, tree_flatten
+
+from aimet_torch.v2.quantization.base import EncodingBase
+
+
+__all__ = ['QuantizedTensorBase', 'QuantizedTensor', 'DequantizedTensor', 'EncodingError']
+
+
+HANDLED_FUNCTIONS = {}
+def implements(torch_function):
+    """
+    Register an override for QuantizedTensorBase
+    """
+    def decorator(func):
+        HANDLED_FUNCTIONS[torch_function] = func
+        return func
+
+    return decorator
+
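A hedged sketch of how the implements() helper above could be used to register a torch-function override. The HANDLED_FUNCTIONS table is presumably consulted from __torch_function__, which lies outside this excerpt, so the handler below is illustrative only and not part of the library.

import torch
from aimet_torch.v2.quantization.tensor import implements, HANDLED_FUNCTIONS, QuantizedTensorBase

@implements(torch.stack)
def _quantized_stack(tensors, dim=0, *, out=None):
    # Illustrative handler: dequantize any quantized inputs, then fall back to the stock op.
    plain = [t.dequantize() if isinstance(t, QuantizedTensorBase) else t for t in tensors]
    return torch.stack(plain, dim=dim, out=out)

assert HANDLED_FUNCTIONS[torch.stack] is _quantized_stack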
+
+class QuantizedTensorBase(torch.Tensor):
+    """
+    Abstract base class to define quantized tensor behavior.
+    Represents a quantized or dequantized tensor as a subclass of :class:`torch.Tensor` which also holds the quantization encodings.
+    This object can be safely quantized or dequantized through the :meth:`quantize` and :meth:`dequantize` methods without
+    changing the represented data values.
+
+    Example:
+
+        >>> from aimet_torch.v2 import quantization as Q
+        >>> quantizer = Q.affine.Quantize(shape=(2, 1), bitwidth=8, symmetric=True)
+        >>> x = torch.tensor([[-1.20, 4.1, -0.21, 2.3],
+        ...                   [0.2, 5.6, -1.0, -.1]])
+        >>> with quantizer.compute_encodings():
+        ...     x_q = quantizer(x)
+        >>> torch.equal(x_q.encoding.scale, quantizer.get_scale())
+        True
+        >>> x_q
+        QuantizedTensor([[-37., 127.,  -7.,  71.],
+                         [  5., 127., -23.,  -2.]])
+        >>> x_q.quantized_repr()
+        tensor([[-37, 127,  -7,  71],
+                [  5, 127, -23,  -2]], dtype=torch.int8)
+        >>> x_q.dequantize()
+        DequantizedTensor([[-1.1945,  4.1000, -0.2260,  2.2921],
+                           [ 0.2205,  5.6000, -1.0142, -0.0882]])
+    """
+
+    encoding: EncodingBase
+
+    _cast_ops = [
+        torch.Tensor.half,
+        torch.Tensor.float,
+        torch.Tensor.double,
+        torch.Tensor.char,
+        torch.Tensor.short,
+        torch.Tensor.int,
+        torch.Tensor.long,
+        torch.Tensor.cuda,
+        torch.Tensor.cpu,
+        torch.Tensor.to,
+    ]
+
+    # Operations that an encoding can always pass through
+    _passthrough_ops = {
+        torch.Tensor.contiguous,
+    }
+
+    # Operations that a per-tensor encoding can pass through
+    _pertensor_passthrough_ops = {
+        torch.Tensor.__getitem__,
+        torch.Tensor.as_strided,
+        torch.Tensor.broadcast_to,
+        torch.Tensor.chunk,
+        torch.Tensor.dsplit,
+        torch.Tensor.expand,
+        torch.Tensor.expand_as,
+        torch.Tensor.flatten,
+        torch.Tensor.flip,
+        torch.Tensor.fliplr,
+        torch.Tensor.flipud,
+        torch.Tensor.gather,
+        torch.Tensor.hsplit,
+        torch.Tensor.index_select,
+        torch.Tensor.kthvalue,
+        torch.Tensor.masked_select,
+        torch.Tensor.movedim,
+        torch.Tensor.moveaxis,
+        torch.Tensor.msort,
+        torch.Tensor.narrow,
+        torch.Tensor.permute,
+        torch.Tensor.repeat,
+        torch.Tensor.reshape,
+        torch.Tensor.reshape_as,
+        torch.Tensor.resize,
+        torch.Tensor.resize_as,
+        torch.Tensor.select,
+        torch.Tensor.split,
+        torch.Tensor.squeeze,
+        torch.Tensor.swapaxes,
+        torch.Tensor.swapdims,
+        torch.Tensor.t,
+        torch.Tensor.take,
+        torch.Tensor.take_along_dim,
+        torch.Tensor.tensor_split,
+        torch.Tensor.tile,
+        torch.Tensor.transpose,
+        torch.Tensor.unflatten,
+        torch.Tensor.unsqueeze,
+        torch.Tensor.view,
+        torch.Tensor.view_as,
+        torch.as_strided,
+        torch.as_strided_copy,
+        torch.chunk,
+        torch.dsplit,
+        torch.expand_copy,
+        torch.flatten,
+        torch.flip,
+        torch.fliplr,
+        torch.flipud,
+        torch.gather,
+        torch.hsplit,
+        torch.index_select,
+        torch.masked_select,
+        torch.moveaxis,
+        torch.movedim,
+        torch.narrow,
+        torch.narrow_copy,
+        torch.permute,
+        torch.permute_copy,
+        torch.reshape,
+        torch.select,
+        torch.split,
+        torch.squeeze,
+        torch.squeeze_copy,
+        torch.swapaxes,
+        torch.swapdims,
+        torch.t,
+        torch.take,
+        torch.take_along_dim,
+        torch.tensor_split,
+        torch.tile,
+        torch.t_copy,
+        torch.unbind,
+        torch.unflatten,
+        torch.unsqueeze,
+        torch.unsqueeze_copy,
+        torch.vsplit,
+        torch.view_copy,
+    }
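+
+    # Illustrative sketch (not part of the original source): shape-only ops such as
+    # reshape or permute keep the encoding when it is per-tensor, since the
+    # quantization grid does not depend on element positions. For per-channel
+    # encodings the subclass type is kept but the encoding is dropped, so a later
+    # quantize()/dequantize() would raise EncodingError. Assumes `x_q` comes from a
+    # per-tensor quantizer.
+    #
+    #     >>> y = x_q.reshape(-1)               # still a QuantizedTensor
+    #     >>> torch.equal(y.encoding.scale, x_q.encoding.scale)
+    #     True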
+
+
+    @abc.abstractmethod
+    def quantize(self) -> "QuantizedTensor":
+        """
+        Quantizes ``self`` with the associated encoding
+
+        .. note::
+            This method must be an IDEMPOTENT function.
+            The result of calling this method multiple times should be equal to calling it only once.
+            In other words, calling this method multiple times should not result in duplicate quantization.
+        """
+        raise NotImplementedError
+
+    @abc.abstractmethod
+    def dequantize(self) -> "DequantizedTensor":
+        """
+        Dequantizes ``self`` with the associated encoding
+
+        .. note::
+            This method must be an IDEMPOTENT function.
+            The result of calling this method multiple times should be equal to calling it only once.
+            In other words, calling this method multiple times should not result in duplicate dequantization.
+        """
+        raise NotImplementedError
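+
+    # Illustrative sketch (not part of the original source): because quantize() and
+    # dequantize() are idempotent, repeated calls are safe and never re-quantize the
+    # data. Assumes `x_q` is a QuantizedTensor with a valid encoding.
+    #
+    #     >>> torch.equal(x_q.quantize(), x_q)
+    #     True
+    #     >>> torch.equal(x_q.dequantize().dequantize(), x_q.dequantize())
+    #     True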
+
+    @abc.abstractmethod
+    def quantized_repr(self) -> torch.Tensor:
+        """
+        Return the quantized representation of ``self`` as a :class:`torch.Tensor` with data type :attr:`self.encoding.dtype`
+
+        .. note::
+            The result of this function may not be able to carry a gradient depending on the quantized data type.
+            Thus, it may be necessary to call this only within an autograd function to allow for backpropagation.
+
+        Example:
+
+            >>> from aimet_torch.v2 import quantization as Q
+            >>> quantizer = Q.affine.Quantize(shape=(2, 1), bitwidth=8, symmetric=True)
+            >>> x = torch.randn((2, 4), requires_grad=True)
+            >>> with quantizer.compute_encodings():
+            ...     x_q = quantizer(x)
+            >>> x_q
+            QuantizedTensor([[  11.,  -57., -128.,   38.],
+                             [  28.,   -0., -128.,  -40.]], grad_fn=<AliasBackward0>)
+            >>> x_q.quantized_repr()
+            tensor([[  11,  -57, -128,   38],
+                    [  28,    0, -128,  -40]], dtype=torch.int8)
+        """
+        raise NotImplementedError
+
+    @classmethod
+    def __new__(cls, *args, **kwargs):
+        encoding = kwargs.pop('encoding', None)
+        ret = super().__new__(*args, **kwargs)
+        if not ret.is_floating_point():
+            raise RuntimeError(f"Non-floating point dtype `{ret.dtype}` is not allowed for quantized tensors.")
+        ret.encoding = encoding
+        return ret
+
+    def new_empty(self, size, *, dtype=None, device=None, requires_grad=False,
+                  layout=torch.strided, pin_memory=False, **kwargs) -> "QuantizedTensorBase":
+        # PyTorch requires subclasses of torch.Tensor to override this method such that
+        # it returns an instance of the subclass, not a plain torch.Tensor,
+        # for the subclass to be deep-copyable
+        encoding = kwargs.pop('encoding', None)
+        t = super().new_empty(size, dtype=dtype, device=device, requires_grad=requires_grad,
+                              layout=layout, pin_memory=pin_memory, **kwargs).as_subclass(type(self))
+        t.encoding = encoding
+        return t
+
+    @implements(torch.clone)
+    def clone(self, *, memory_format=torch.preserve_format):
+        """
+        Returns a copy of self
+
+        :param memory_format: Desired memory format of the returned tensor (default=torch.preserve_format)
+        """
+        # Note: use encoding.clone() here instead of deepcopy to propagate gradient through operation
+        encoding_clone = self.encoding._clone() # pylint:disable = protected-access
+        self_clone = super().clone(memory_format=memory_format).as_subclass(self.__class__)
+        self_clone.encoding = encoding_clone
+        return self_clone
+
+    @implements(torch.detach)
+    def detach(self) -> "QuantizedTensorBase":
+        """
+        Returns a new QuantizedTensorBase with data and encoding detached from the current graph
+        """
+        self_detached = super().detach().as_subclass(self.__class__)
+        self_detached.encoding = self.encoding._detach() # pylint:disable = protected-access
+        return self_detached
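+
+    # Illustrative sketch (not part of the original source): clone() and detach()
+    # return new tensors whose encodings are cloned/detached alongside the data, so
+    # quantization metadata is neither lost nor silently aliased. Assumes `x_q` is a
+    # QuantizedTensor with a valid encoding.
+    #
+    #     >>> x_q.detach().encoding is not None
+    #     True
+    #     >>> x_q.clone().encoding is not x_q.encoding   # encoding cloned, not shared
+    #     True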
+
+    @classmethod
+    def __torch_function__(cls, func, types, args=(), kwargs=None):
+        if func in HANDLED_FUNCTIONS:
+            kwargs = kwargs if kwargs is not None else {}
+            return HANDLED_FUNCTIONS[func](*args, **kwargs)
+        ret = super().__torch_function__(func, types, args, kwargs)
+
+        flattened_args, _ = tree_flatten((args, kwargs))
+        if any(ret is arg for arg in flattened_args):
+            # Return value is the same object as one of the arguments.
+            # This implies that func is likely (but not necessarily) an in-place operator.
+            return ret
+
+        if func in cls._cast_ops:
+            if not ret.dtype.is_floating_point:
+                raise RuntimeError(
+                    f"Type casting to non-floating point dtype `{ret.dtype}` is not allowed for quantized tensors. "
+                    "To cast quantized tensors to integer, use `qtensor.quantzed_repr()`."
+                )
+
+            # Outputs of cast ops can inherit the same encoding as its parents
+            self, *_ = args
+            ret.encoding = copy.copy(self.encoding) # shallow copy
+
+        def propagate_encoding(qtensor, encoding):
+            if isinstance(qtensor, QuantizedTensorBase):
+                qtensor.encoding = copy.copy(encoding)
+
+        if func in cls._passthrough_ops:
+            self, *_ = args
+            tree_map(lambda t: propagate_encoding(t, self.encoding), ret)
+
+        if func in cls._pertensor_passthrough_ops:
+            self, *_ = args
+            if self.encoding and self.encoding.granularity == "pertensor":
+                # Return a cls object with the same encoding which can later be quantized or dequantized
+                tree_map(lambda t: propagate_encoding(t, self.encoding), ret)
+            else:
+                # Return a cls object with no encoding
+                # If the user later tries to quantize or dequantize this, an error will be thrown
+                tree_map(lambda t: propagate_encoding(t, None), ret)
+            return ret
+
+
+        def set_encoding(qtensor):
+            if not hasattr(qtensor, 'encoding'):
+                qtensor.encoding = None
+
+            if qtensor.encoding is None:
+                # If encoding does not exist, return a plain torch.Tensor
+                return qtensor.as_subclass(torch.Tensor)
+
+            # Change device of encoding
+            # NOTE: We don't change the dtypes of encoding because scale/offset
+            #       are sensitive to dtype
+            qtensor.encoding = qtensor.encoding.to(device=qtensor.device)
+
+            return qtensor
+
+        return tree_map(lambda t: set_encoding(t) if isinstance(t, cls) else t, ret)
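+
+    # Illustrative summary (not part of the original source) of the dispatch above,
+    # for a calibrated per-channel QuantizedTensor `x_q`:
+    #
+    #     >>> x_q.float()        # cast op: result stays quantized, same encoding
+    #     >>> x_q.contiguous()   # passthrough op: encoding always propagated
+    #     >>> x_q.reshape(-1)    # per-tensor-only passthrough: subclass kept, but the
+    #     ...                    # per-channel encoding is dropped (encoding is None)
+    #     >>> x_q + 1.0          # any other op: plain torch.Tensor, no encoding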
+
+
+
+class QuantizedTensor(QuantizedTensorBase):
+    """
+    Represents a quantized tensor object. The object holds quantized values stored in a floating-point tensor along with
+    an :class:`EncodingBase` object which holds the information necessary to map the quantized values back to the
+    real/represented values.
+    """
+
+    def quantize(self) -> "QuantizedTensor":
+        """
+        Returns ``self``
+        """
+        if self.encoding is None:
+            raise EncodingError("Encoding does not exist")
+        return self
+
+    def dequantize(self) -> "DequantizedTensor":
+        """
+        Dequantizes ``self`` using :attr:`self.encoding` to produce a :class:`DequantizedTensor` with the same encoding
+        information.
+
+        Example:
+
+            >>> import aimet_torch.v2.quantization as Q
+            >>> x = torch.tensor([[2.57, -2.312],
+            ...                   [0.153, 0.205]])
+            >>> quantizer = Q.affine.Quantize(shape=(1, ), bitwidth=8, symmetric=True)
+            >>> quantizer.set_range(-128 * 0.1, 127 * 0.1)
+            >>> x_q = quantizer(x)
+            >>> x_q
+            QuantizedTensor([[ 26., -23.],
+                             [  2.,   2.]], grad_fn=<AliasBackward0>)
+            >>> x_dq = x_q.dequantize()
+            >>> x_dq
+            DequantizedTensor([[ 2.6000, -2.3000],
+                               [ 0.2000,  0.2000]], grad_fn=<AliasBackward0>)
+            >>> torch.equal(x_dq.encoding.scale, x_q.encoding.scale)
+            True
+        """
+        if self.encoding is None:
+            raise EncodingError("Encoding does not exist")
+
+        qtensor = self.encoding.dequantize(self.as_subclass(torch.Tensor))
+        qtensor = qtensor.as_subclass(DequantizedTensor)
+        qtensor.encoding = copy.copy(self.encoding)
+        return qtensor
+
+    def quantized_repr(self) -> torch.Tensor:
+        # FIXME(kyunggeu): This only works for affine encodings.
+        # Needs to be generalized for any kind of encodings
+        return self.quantize().as_subclass(torch.Tensor).to(self.encoding.dtype)
+
+
+class DequantizedTensor(QuantizedTensorBase):
+    """
+    Represents a tensor which has been quantized and subsequently dequantized. This object contains real floating point
+    data as well as an :class:`EncodingBase` object which holds information about the quantization parameters with which
+    the data was quantized. With this, a :class:`DequantizedTensor` can be converted back to its quantized representation
+    without further loss in information.
+    """
+
+    def quantize(self) -> QuantizedTensor:
+        """
+        Quantizes ``self`` using :attr:`self.encoding` to produce a :class:`QuantizedTensor` with the same encoding
+        information.
+
+        Example:
+
+            >>> import aimet_torch.v2.quantization as Q
+            >>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
+            >>> quant_dequant = Q.affine.QuantizeDequantize((1, ), 8, symmetric=False)
+            >>> quant_dequant.set_range(-10, 41)
+            >>> x_qdq = quant_dequant(x)
+            >>> x_qdq
+            DequantizedTensor([[ 0.4000, 41.0000],
+                               [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
+            >>> x_qdq.quantize()
+            QuantizedTensor([[ 52., 255.],
+                             [ 68.,  97.]], grad_fn=<AliasBackward0>)
+        """
+        if self.encoding is None:
+            raise EncodingError("Encoding does not exist")
+
+        qtensor = self.encoding.quantize(self.as_subclass(torch.Tensor))
+        qtensor = qtensor.as_subclass(QuantizedTensor)
+        qtensor.encoding = copy.copy(self.encoding)
+        return qtensor
+
+    def dequantize(self) -> "DequantizedTensor":
+        """
+        Returns ``self``
+        """
+        if self.encoding is None:
+            raise EncodingError("Encoding does not exist")
+        return self
+
+    def quantized_repr(self) -> torch.Tensor:
+        """
+        Return the quantized representation of ``self`` as a :class:`torch.Tensor` with data type :attr:`self.encoding.dtype`.
+
+        .. note::
+            The result of this function may not be able to carry a gradient depending on the quantized data type.
+            Thus, it may be necessary to call this only within an autograd function to allow for backpropagation.
+
+        Example:
+
+            >>> import aimet_torch.v2.quantization as Q
+            >>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
+            >>> quant_dequant = Q.affine.QuantizeDequantize((1, ), 8, symmetric=False)
+            >>> quant_dequant.set_range(-10, 41)
+            >>> x_qdq = quant_dequant(x)
+            >>> x_qdq
+            DequantizedTensor([[ 0.4000, 41.0000],
+                               [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
+            >>> x_qdq.quantized_repr()
+            tensor([[ 52, 255],
+                    [ 68,  97]], dtype=torch.uint8)
+        """
+        # FIXME(kyunggeu): This only works for affine encodings.
+        # Needs to be generalized for any kind of encodings
+        return self.quantize().as_subclass(torch.Tensor).to(self.encoding.dtype)
+
+
+class EncodingError(RuntimeError):
+    """Error that indicates an encoding is missing or invalid"""
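+
+
+# Illustrative sketch (not part of the original module): a typical round trip with a
+# calibrated aimet_torch.v2 affine quantizer. `quantizer` and `x` are assumed to be
+# set up as in the QuantizedTensorBase docstring above; a tensor whose encoding has
+# been dropped raises EncodingError from quantize()/dequantize().
+#
+#     >>> x_q = quantizer(x)                   # QuantizedTensor carrying an encoding
+#     >>> x_dq = x_q.dequantize()              # DequantizedTensor, same encoding
+#     >>> torch.equal(x_dq.quantize(), x_q)    # lossless round trip
+#     True
+#     >>> plain = x_q + 1.0                    # unrelated op: plain torch.Tensor
+#     >>> isinstance(plain, QuantizedTensorBase)
+#     False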
-square:before{content:""}.fa-bars:before,.fa-navicon:before,.fa-reorder:before{content:""}.fa-list-ul:before{content:""}.fa-list-ol:before{content:""}.fa-strikethrough:before{content:""}.fa-underline:before{content:""}.fa-table:before{content:""}.fa-magic:before{content:""}.fa-truck:before{content:""}.fa-pinterest:before{content:""}.fa-pinterest-square:before{content:""}.fa-google-plus-square:before{content:""}.fa-google-plus:before{content:""}.fa-money:before{content:""}.fa-caret-down:before,.icon-caret-down:before,.wy-dropdown .caret:before{content:""}.fa-caret-up:before{content:""}.fa-caret-left:before{content:""}.fa-caret-right:before{content:""}.fa-columns:before{content:""}.fa-sort:before,.fa-unsorted:before{content:""}.fa-sort-desc:before,.fa-sort-down:before{content:""}.fa-sort-asc:before,.fa-sort-up:before{content:""}.fa-envelope:before{content:""}.fa-linkedin:before{content:""}.fa-rotate-left:before,.fa-undo:before{content:""}.fa-gavel:before,.fa-legal:before{content:""}.fa-dashboard:before,.fa-tachometer:before{content:""}.fa-comment-o:before{content:""}.fa-comments-o:before{content:""}.fa-bolt:before,.fa-flash:before{content:""}.fa-sitemap:before{content:""}.fa-umbrella:before{content:""}.fa-clipboard:before,.fa-paste:before{content:""}.fa-lightbulb-o:before{content:""}.fa-exchange:before{content:""}.fa-cloud-download:before{content:""}.fa-cloud-upload:before{content:""}.fa-user-md:before{content:""}.fa-stethoscope:before{content:""}.fa-suitcase:before{content:""}.fa-bell-o:before{content:""}.fa-coffee:before{content:""}.fa-cutlery:before{content:""}.fa-file-text-o:before{content:""}.fa-building-o:before{content:""}.fa-hospital-o:before{content:""}.fa-ambulance:before{content:""}.fa-medkit:before{content:""}.fa-fighter-jet:before{content:""}.fa-beer:before{content:""}.fa-h-square:before{content:""}.fa-plus-square:before{content:""}.fa-angle-double-left:before{content:""}.fa-angle-double-right:before{content:""}.fa-angle-double-up:before{content:""}.fa-angle-double-down:before{content:""}.fa-angle-left:before{content:""}.fa-angle-right:before{content:""}.fa-angle-up:before{content:""}.fa-angle-down:before{content:""}.fa-desktop:before{content:""}.fa-laptop:before{content:""}.fa-tablet:before{content:""}.fa-mobile-phone:before,.fa-mobile:before{content:""}.fa-circle-o:before{content:""}.fa-quote-left:before{content:""}.fa-quote-right:before{content:""}.fa-spinner:before{content:""}.fa-circle:before{content:""}.fa-mail-reply:before,.fa-reply:before{content:""}.fa-github-alt:before{content:""}.fa-folder-o:before{content:""}.fa-folder-open-o:before{content:""}.fa-smile-o:before{content:""}.fa-frown-o:before{content:""}.fa-meh-o:before{content:""}.fa-gamepad:before{content:""}.fa-keyboard-o:before{content:""}.fa-flag-o:before{content:""}.fa-flag-checkered:before{content:""}.fa-terminal:before{content:""}.fa-code:before{content:""}.fa-mail-reply-all:before,.fa-reply-all:before{content:""}.fa-star-half-empty:before,.fa-star-half-full:before,.fa-star-half-o:before{content:""}.fa-location-arrow:before{content:""}.fa-crop:before{content:""}.fa-code-fork:before{content:""}.fa-chain-broken:before,.fa-unlink:before{content:""}.fa-question:before{content:""}.fa-info:before{content:""}.fa-exclamation:before{content:""}.fa-superscript:before{content:""}.fa-subscript:before{content:""}.fa-eraser:before{content:""}.fa-puzzle-piece:before{content:""}.fa-microphone:before{content:""}.fa-microphone-slash:
before{content:""}.fa-shield:before{content:""}.fa-calendar-o:before{content:""}.fa-fire-extinguisher:before{content:""}.fa-rocket:before{content:""}.fa-maxcdn:before{content:""}.fa-chevron-circle-left:before{content:""}.fa-chevron-circle-right:before{content:""}.fa-chevron-circle-up:before{content:""}.fa-chevron-circle-down:before{content:""}.fa-html5:before{content:""}.fa-css3:before{content:""}.fa-anchor:before{content:""}.fa-unlock-alt:before{content:""}.fa-bullseye:before{content:""}.fa-ellipsis-h:before{content:""}.fa-ellipsis-v:before{content:""}.fa-rss-square:before{content:""}.fa-play-circle:before{content:""}.fa-ticket:before{content:""}.fa-minus-square:before{content:""}.fa-minus-square-o:before,.wy-menu-vertical li.current>a button.toctree-expand:before,.wy-menu-vertical li.on a button.toctree-expand:before{content:""}.fa-level-up:before{content:""}.fa-level-down:before{content:""}.fa-check-square:before{content:""}.fa-pencil-square:before{content:""}.fa-external-link-square:before{content:""}.fa-share-square:before{content:""}.fa-compass:before{content:""}.fa-caret-square-o-down:before,.fa-toggle-down:before{content:""}.fa-caret-square-o-up:before,.fa-toggle-up:before{content:""}.fa-caret-square-o-right:before,.fa-toggle-right:before{content:""}.fa-eur:before,.fa-euro:before{content:""}.fa-gbp:before{content:""}.fa-dollar:before,.fa-usd:before{content:""}.fa-inr:before,.fa-rupee:before{content:""}.fa-cny:before,.fa-jpy:before,.fa-rmb:before,.fa-yen:before{content:""}.fa-rouble:before,.fa-rub:before,.fa-ruble:before{content:""}.fa-krw:before,.fa-won:before{content:""}.fa-bitcoin:before,.fa-btc:before{content:""}.fa-file:before{content:""}.fa-file-text:before{content:""}.fa-sort-alpha-asc:before{content:""}.fa-sort-alpha-desc:before{content:""}.fa-sort-amount-asc:before{content:""}.fa-sort-amount-desc:before{content:""}.fa-sort-numeric-asc:before{content:""}.fa-sort-numeric-desc:before{content:""}.fa-thumbs-up:before{content:""}.fa-thumbs-down:before{content:""}.fa-youtube-square:before{content:""}.fa-youtube:before{content:""}.fa-xing:before{content:""}.fa-xing-square:before{content:""}.fa-youtube-play:before{content:""}.fa-dropbox:before{content:""}.fa-stack-overflow:before{content:""}.fa-instagram:before{content:""}.fa-flickr:before{content:""}.fa-adn:before{content:""}.fa-bitbucket:before,.icon-bitbucket:before{content:""}.fa-bitbucket-square:before{content:""}.fa-tumblr:before{content:""}.fa-tumblr-square:before{content:""}.fa-long-arrow-down:before{content:""}.fa-long-arrow-up:before{content:""}.fa-long-arrow-left:before{content:""}.fa-long-arrow-right:before{content:""}.fa-apple:before{content:""}.fa-windows:before{content:""}.fa-android:before{content:""}.fa-linux:before{content:""}.fa-dribbble:before{content:""}.fa-skype:before{content:""}.fa-foursquare:before{content:""}.fa-trello:before{content:""}.fa-female:before{content:""}.fa-male:before{content:""}.fa-gittip:before,.fa-gratipay:before{content:""}.fa-sun-o:before{content:""}.fa-moon-o:before{content:""}.fa-archive:before{content:""}.fa-bug:before{content:""}.fa-vk:before{content:""}.fa-weibo:before{content:""}.fa-renren:before{content:""}.fa-pagelines:before{content:""}.fa-stack-exchange:before{content:""}.fa-arrow-circle-o-right:before{content:""}.fa-arrow-circle-o-left:before{content:""}.fa-caret-square-o-left:before,.fa-toggle-left:before{content:""}.fa-dot-circle-o:before{content:""}.fa-wheelchair:before{content:""}.fa-
vimeo-square:before{content:""}.fa-try:before,.fa-turkish-lira:before{content:""}.fa-plus-square-o:before,.wy-menu-vertical li button.toctree-expand:before{content:""}.fa-space-shuttle:before{content:""}.fa-slack:before{content:""}.fa-envelope-square:before{content:""}.fa-wordpress:before{content:""}.fa-openid:before{content:""}.fa-bank:before,.fa-institution:before,.fa-university:before{content:""}.fa-graduation-cap:before,.fa-mortar-board:before{content:""}.fa-yahoo:before{content:""}.fa-google:before{content:""}.fa-reddit:before{content:""}.fa-reddit-square:before{content:""}.fa-stumbleupon-circle:before{content:""}.fa-stumbleupon:before{content:""}.fa-delicious:before{content:""}.fa-digg:before{content:""}.fa-pied-piper-pp:before{content:""}.fa-pied-piper-alt:before{content:""}.fa-drupal:before{content:""}.fa-joomla:before{content:""}.fa-language:before{content:""}.fa-fax:before{content:""}.fa-building:before{content:""}.fa-child:before{content:""}.fa-paw:before{content:""}.fa-spoon:before{content:""}.fa-cube:before{content:""}.fa-cubes:before{content:""}.fa-behance:before{content:""}.fa-behance-square:before{content:""}.fa-steam:before{content:""}.fa-steam-square:before{content:""}.fa-recycle:before{content:""}.fa-automobile:before,.fa-car:before{content:""}.fa-cab:before,.fa-taxi:before{content:""}.fa-tree:before{content:""}.fa-spotify:before{content:""}.fa-deviantart:before{content:""}.fa-soundcloud:before{content:""}.fa-database:before{content:""}.fa-file-pdf-o:before{content:""}.fa-file-word-o:before{content:""}.fa-file-excel-o:before{content:""}.fa-file-powerpoint-o:before{content:""}.fa-file-image-o:before,.fa-file-photo-o:before,.fa-file-picture-o:before{content:""}.fa-file-archive-o:before,.fa-file-zip-o:before{content:""}.fa-file-audio-o:before,.fa-file-sound-o:before{content:""}.fa-file-movie-o:before,.fa-file-video-o:before{content:""}.fa-file-code-o:before{content:""}.fa-vine:before{content:""}.fa-codepen:before{content:""}.fa-jsfiddle:before{content:""}.fa-life-bouy:before,.fa-life-buoy:before,.fa-life-ring:before,.fa-life-saver:before,.fa-support:before{content:""}.fa-circle-o-notch:before{content:""}.fa-ra:before,.fa-rebel:before,.fa-resistance:before{content:""}.fa-empire:before,.fa-ge:before{content:""}.fa-git-square:before{content:""}.fa-git:before{content:""}.fa-hacker-news:before,.fa-y-combinator-square:before,.fa-yc-square:before{content:""}.fa-tencent-weibo:before{content:""}.fa-qq:before{content:""}.fa-wechat:before,.fa-weixin:before{content:""}.fa-paper-plane:before,.fa-send:before{content:""}.fa-paper-plane-o:before,.fa-send-o:before{content:""}.fa-history:before{content:""}.fa-circle-thin:before{content:""}.fa-header:before{content:""}.fa-paragraph:before{content:""}.fa-sliders:before{content:""}.fa-share-alt:before{content:""}.fa-share-alt-square:before{content:""}.fa-bomb:before{content:""}.fa-futbol-o:before,.fa-soccer-ball-o:before{content:""}.fa-tty:before{content:""}.fa-binoculars:before{content:""}.fa-plug:before{content:""}.fa-slideshare:before{content:""}.fa-twitch:before{content:""}.fa-yelp:before{content:""}.fa-newspaper-o:before{content:""}.fa-wifi:before{content:""}.fa-calculator:before{content:""}.fa-paypal:before{content:""}.fa-google-wallet:before{content:""}.fa-cc-visa:before{content:""}.fa-cc-mastercard:before{content:""}.fa-cc-discover:before{content:""}.fa-cc-amex:before{content:""}.fa-cc-paypal:before{content:""}.fa-cc-stripe:before{content:""}.fa-b
ell-slash:before{content:""}.fa-bell-slash-o:before{content:""}.fa-trash:before{content:""}.fa-copyright:before{content:""}.fa-at:before{content:""}.fa-eyedropper:before{content:""}.fa-paint-brush:before{content:""}.fa-birthday-cake:before{content:""}.fa-area-chart:before{content:""}.fa-pie-chart:before{content:""}.fa-line-chart:before{content:""}.fa-lastfm:before{content:""}.fa-lastfm-square:before{content:""}.fa-toggle-off:before{content:""}.fa-toggle-on:before{content:""}.fa-bicycle:before{content:""}.fa-bus:before{content:""}.fa-ioxhost:before{content:""}.fa-angellist:before{content:""}.fa-cc:before{content:""}.fa-ils:before,.fa-shekel:before,.fa-sheqel:before{content:""}.fa-meanpath:before{content:""}.fa-buysellads:before{content:""}.fa-connectdevelop:before{content:""}.fa-dashcube:before{content:""}.fa-forumbee:before{content:""}.fa-leanpub:before{content:""}.fa-sellsy:before{content:""}.fa-shirtsinbulk:before{content:""}.fa-simplybuilt:before{content:""}.fa-skyatlas:before{content:""}.fa-cart-plus:before{content:""}.fa-cart-arrow-down:before{content:""}.fa-diamond:before{content:""}.fa-ship:before{content:""}.fa-user-secret:before{content:""}.fa-motorcycle:before{content:""}.fa-street-view:before{content:""}.fa-heartbeat:before{content:""}.fa-venus:before{content:""}.fa-mars:before{content:""}.fa-mercury:before{content:""}.fa-intersex:before,.fa-transgender:before{content:""}.fa-transgender-alt:before{content:""}.fa-venus-double:before{content:""}.fa-mars-double:before{content:""}.fa-venus-mars:before{content:""}.fa-mars-stroke:before{content:""}.fa-mars-stroke-v:before{content:""}.fa-mars-stroke-h:before{content:""}.fa-neuter:before{content:""}.fa-genderless:before{content:""}.fa-facebook-official:before{content:""}.fa-pinterest-p:before{content:""}.fa-whatsapp:before{content:""}.fa-server:before{content:""}.fa-user-plus:before{content:""}.fa-user-times:before{content:""}.fa-bed:before,.fa-hotel:before{content:""}.fa-viacoin:before{content:""}.fa-train:before{content:""}.fa-subway:before{content:""}.fa-medium:before{content:""}.fa-y-combinator:before,.fa-yc:before{content:""}.fa-optin-monster:before{content:""}.fa-opencart:before{content:""}.fa-expeditedssl:before{content:""}.fa-battery-4:before,.fa-battery-full:before,.fa-battery:before{content:""}.fa-battery-3:before,.fa-battery-three-quarters:before{content:""}.fa-battery-2:before,.fa-battery-half:before{content:""}.fa-battery-1:before,.fa-battery-quarter:before{content:""}.fa-battery-0:before,.fa-battery-empty:before{content:""}.fa-mouse-pointer:before{content:""}.fa-i-cursor:before{content:""}.fa-object-group:before{content:""}.fa-object-ungroup:before{content:""}.fa-sticky-note:before{content:""}.fa-sticky-note-o:before{content:""}.fa-cc-jcb:before{content:""}.fa-cc-diners-club:before{content:""}.fa-clone:before{content:""}.fa-balance-scale:before{content:""}.fa-hourglass-o:before{content:""}.fa-hourglass-1:before,.fa-hourglass-start:before{content:""}.fa-hourglass-2:before,.fa-hourglass-half:before{content:""}.fa-hourglass-3:before,.fa-hourglass-end:before{content:""}.fa-hourglass:before{content:""}.fa-hand-grab-o:before,.fa-hand-rock-o:before{content:""}.fa-hand-paper-o:before,.fa-hand-stop-o:before{content:""}.fa-hand-scissors-o:before{content:""}.fa-hand-lizard-o:before{content:""}.fa-hand-spock-o:before{content:""}.fa-hand-pointer-o:before{content:""}.fa-hand-peace-o:before{content:""}.fa-trademark:before{content:""}.fa-register
ed:before{content:""}.fa-creative-commons:before{content:""}.fa-gg:before{content:""}.fa-gg-circle:before{content:""}.fa-tripadvisor:before{content:""}.fa-odnoklassniki:before{content:""}.fa-odnoklassniki-square:before{content:""}.fa-get-pocket:before{content:""}.fa-wikipedia-w:before{content:""}.fa-safari:before{content:""}.fa-chrome:before{content:""}.fa-firefox:before{content:""}.fa-opera:before{content:""}.fa-internet-explorer:before{content:""}.fa-television:before,.fa-tv:before{content:""}.fa-contao:before{content:""}.fa-500px:before{content:""}.fa-amazon:before{content:""}.fa-calendar-plus-o:before{content:""}.fa-calendar-minus-o:before{content:""}.fa-calendar-times-o:before{content:""}.fa-calendar-check-o:before{content:""}.fa-industry:before{content:""}.fa-map-pin:before{content:""}.fa-map-signs:before{content:""}.fa-map-o:before{content:""}.fa-map:before{content:""}.fa-commenting:before{content:""}.fa-commenting-o:before{content:""}.fa-houzz:before{content:""}.fa-vimeo:before{content:""}.fa-black-tie:before{content:""}.fa-fonticons:before{content:""}.fa-reddit-alien:before{content:""}.fa-edge:before{content:""}.fa-credit-card-alt:before{content:""}.fa-codiepie:before{content:""}.fa-modx:before{content:""}.fa-fort-awesome:before{content:""}.fa-usb:before{content:""}.fa-product-hunt:before{content:""}.fa-mixcloud:before{content:""}.fa-scribd:before{content:""}.fa-pause-circle:before{content:""}.fa-pause-circle-o:before{content:""}.fa-stop-circle:before{content:""}.fa-stop-circle-o:before{content:""}.fa-shopping-bag:before{content:""}.fa-shopping-basket:before{content:""}.fa-hashtag:before{content:""}.fa-bluetooth:before{content:""}.fa-bluetooth-b:before{content:""}.fa-percent:before{content:""}.fa-gitlab:before,.icon-gitlab:before{content:""}.fa-wpbeginner:before{content:""}.fa-wpforms:before{content:""}.fa-envira:before{content:""}.fa-universal-access:before{content:""}.fa-wheelchair-alt:before{content:""}.fa-question-circle-o:before{content:""}.fa-blind:before{content:""}.fa-audio-description:before{content:""}.fa-volume-control-phone:before{content:""}.fa-braille:before{content:""}.fa-assistive-listening-systems:before{content:""}.fa-american-sign-language-interpreting:before,.fa-asl-interpreting:before{content:""}.fa-deaf:before,.fa-deafness:before,.fa-hard-of-hearing:before{content:""}.fa-glide:before{content:""}.fa-glide-g:before{content:""}.fa-sign-language:before,.fa-signing:before{content:""}.fa-low-vision:before{content:""}.fa-viadeo:before{content:""}.fa-viadeo-square:before{content:""}.fa-snapchat:before{content:""}.fa-snapchat-ghost:before{content:""}.fa-snapchat-square:before{content:""}.fa-pied-piper:before{content:""}.fa-first-order:before{content:""}.fa-yoast:before{content:""}.fa-themeisle:before{content:""}.fa-google-plus-circle:before,.fa-google-plus-official:before{content:""}.fa-fa:before,.fa-font-awesome:before{content:""}.fa-handshake-o:before{content:""}.fa-envelope-open:before{content:""}.fa-envelope-open-o:before{content:""}.fa-linode:before{content:""}.fa-address-book:before{content:""}.fa-address-book-o:before{content:""}.fa-address-card:before,.fa-vcard:before{content:""}.fa-address-card-o:before,.fa-vcard-o:before{content:""}.fa-user-circle:before{content:""}.fa-user-circle-o:before{content:""}.fa-user-o:before{content:""}.fa-id-badge:before{content:""}.fa-drivers-license:before,.fa-id-card:before{content:""}.fa-drivers-license-o:before,.fa-id-card-o:before{c
ontent:""}.fa-quora:before{content:""}.fa-free-code-camp:before{content:""}.fa-telegram:before{content:""}.fa-thermometer-4:before,.fa-thermometer-full:before,.fa-thermometer:before{content:""}.fa-thermometer-3:before,.fa-thermometer-three-quarters:before{content:""}.fa-thermometer-2:before,.fa-thermometer-half:before{content:""}.fa-thermometer-1:before,.fa-thermometer-quarter:before{content:""}.fa-thermometer-0:before,.fa-thermometer-empty:before{content:""}.fa-shower:before{content:""}.fa-bath:before,.fa-bathtub:before,.fa-s15:before{content:""}.fa-podcast:before{content:""}.fa-window-maximize:before{content:""}.fa-window-minimize:before{content:""}.fa-window-restore:before{content:""}.fa-times-rectangle:before,.fa-window-close:before{content:""}.fa-times-rectangle-o:before,.fa-window-close-o:before{content:""}.fa-bandcamp:before{content:""}.fa-grav:before{content:""}.fa-etsy:before{content:""}.fa-imdb:before{content:""}.fa-ravelry:before{content:""}.fa-eercast:before{content:""}.fa-microchip:before{content:""}.fa-snowflake-o:before{content:""}.fa-superpowers:before{content:""}.fa-wpexplorer:before{content:""}.fa-meetup:before{content:""}.sr-only{position:absolute;width:1px;height:1px;padding:0;margin:-1px;overflow:hidden;clip:rect(0,0,0,0);border:0}.sr-only-focusable:active,.sr-only-focusable:focus{position:static;width:auto;height:auto;margin:0;overflow:visible;clip:auto}.fa,.icon,.rst-content .admonition-title,.rst-content .code-block-caption .headerlink,.rst-content .eqno .headerlink,.rst-content code.download span:first-child,.rst-content dl dt .headerlink,.rst-content h1 .headerlink,.rst-content h2 .headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 .headerlink,.rst-content p.caption .headerlink,.rst-content p .headerlink,.rst-content table>caption .headerlink,.rst-content tt.download span:first-child,.wy-dropdown .caret,.wy-inline-validate.wy-inline-validate-danger .wy-input-context,.wy-inline-validate.wy-inline-validate-info .wy-input-context,.wy-inline-validate.wy-inline-validate-success .wy-input-context,.wy-inline-validate.wy-inline-validate-warning .wy-input-context,.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a button.toctree-expand,.wy-menu-vertical li button.toctree-expand{font-family:inherit}.fa:before,.icon:before,.rst-content .admonition-title:before,.rst-content .code-block-caption .headerlink:before,.rst-content .eqno .headerlink:before,.rst-content code.download span:first-child:before,.rst-content dl dt .headerlink:before,.rst-content h1 .headerlink:before,.rst-content h2 .headerlink:before,.rst-content h3 .headerlink:before,.rst-content h4 .headerlink:before,.rst-content h5 .headerlink:before,.rst-content h6 .headerlink:before,.rst-content p.caption .headerlink:before,.rst-content p .headerlink:before,.rst-content table>caption .headerlink:before,.rst-content tt.download span:first-child:before,.wy-dropdown .caret:before,.wy-inline-validate.wy-inline-validate-danger .wy-input-context:before,.wy-inline-validate.wy-inline-validate-info .wy-input-context:before,.wy-inline-validate.wy-inline-validate-success .wy-input-context:before,.wy-inline-validate.wy-inline-validate-warning .wy-input-context:before,.wy-menu-vertical li.current>a button.toctree-expand:before,.wy-menu-vertical li.on a button.toctree-expand:before,.wy-menu-vertical li 
button.toctree-expand:before{font-family:FontAwesome;display:inline-block;font-style:normal;font-weight:400;line-height:1;text-decoration:inherit}.rst-content .code-block-caption a .headerlink,.rst-content .eqno a .headerlink,.rst-content a .admonition-title,.rst-content code.download a span:first-child,.rst-content dl dt a .headerlink,.rst-content h1 a .headerlink,.rst-content h2 a .headerlink,.rst-content h3 a .headerlink,.rst-content h4 a .headerlink,.rst-content h5 a .headerlink,.rst-content h6 a .headerlink,.rst-content p.caption a .headerlink,.rst-content p a .headerlink,.rst-content table>caption a .headerlink,.rst-content tt.download a span:first-child,.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a button.toctree-expand,.wy-menu-vertical li a button.toctree-expand,a .fa,a .icon,a .rst-content .admonition-title,a .rst-content .code-block-caption .headerlink,a .rst-content .eqno .headerlink,a .rst-content code.download span:first-child,a .rst-content dl dt .headerlink,a .rst-content h1 .headerlink,a .rst-content h2 .headerlink,a .rst-content h3 .headerlink,a .rst-content h4 .headerlink,a .rst-content h5 .headerlink,a .rst-content h6 .headerlink,a .rst-content p.caption .headerlink,a .rst-content p .headerlink,a .rst-content table>caption .headerlink,a .rst-content tt.download span:first-child,a .wy-menu-vertical li button.toctree-expand{display:inline-block;text-decoration:inherit}.btn .fa,.btn .icon,.btn .rst-content .admonition-title,.btn .rst-content .code-block-caption .headerlink,.btn .rst-content .eqno .headerlink,.btn .rst-content code.download span:first-child,.btn .rst-content dl dt .headerlink,.btn .rst-content h1 .headerlink,.btn .rst-content h2 .headerlink,.btn .rst-content h3 .headerlink,.btn .rst-content h4 .headerlink,.btn .rst-content h5 .headerlink,.btn .rst-content h6 .headerlink,.btn .rst-content p .headerlink,.btn .rst-content table>caption .headerlink,.btn .rst-content tt.download span:first-child,.btn .wy-menu-vertical li.current>a button.toctree-expand,.btn .wy-menu-vertical li.on a button.toctree-expand,.btn .wy-menu-vertical li button.toctree-expand,.nav .fa,.nav .icon,.nav .rst-content .admonition-title,.nav .rst-content .code-block-caption .headerlink,.nav .rst-content .eqno .headerlink,.nav .rst-content code.download span:first-child,.nav .rst-content dl dt .headerlink,.nav .rst-content h1 .headerlink,.nav .rst-content h2 .headerlink,.nav .rst-content h3 .headerlink,.nav .rst-content h4 .headerlink,.nav .rst-content h5 .headerlink,.nav .rst-content h6 .headerlink,.nav .rst-content p .headerlink,.nav .rst-content table>caption .headerlink,.nav .rst-content tt.download span:first-child,.nav .wy-menu-vertical li.current>a button.toctree-expand,.nav .wy-menu-vertical li.on a button.toctree-expand,.nav .wy-menu-vertical li button.toctree-expand,.rst-content .btn .admonition-title,.rst-content .code-block-caption .btn .headerlink,.rst-content .code-block-caption .nav .headerlink,.rst-content .eqno .btn .headerlink,.rst-content .eqno .nav .headerlink,.rst-content .nav .admonition-title,.rst-content code.download .btn span:first-child,.rst-content code.download .nav span:first-child,.rst-content dl dt .btn .headerlink,.rst-content dl dt .nav .headerlink,.rst-content h1 .btn .headerlink,.rst-content h1 .nav .headerlink,.rst-content h2 .btn .headerlink,.rst-content h2 .nav .headerlink,.rst-content h3 .btn .headerlink,.rst-content h3 .nav .headerlink,.rst-content h4 .btn .headerlink,.rst-content h4 .nav .headerlink,.rst-content h5 .btn 
.headerlink,.rst-content h5 .nav .headerlink,.rst-content h6 .btn .headerlink,.rst-content h6 .nav .headerlink,.rst-content p .btn .headerlink,.rst-content p .nav .headerlink,.rst-content table>caption .btn .headerlink,.rst-content table>caption .nav .headerlink,.rst-content tt.download .btn span:first-child,.rst-content tt.download .nav span:first-child,.wy-menu-vertical li .btn button.toctree-expand,.wy-menu-vertical li.current>a .btn button.toctree-expand,.wy-menu-vertical li.current>a .nav button.toctree-expand,.wy-menu-vertical li .nav button.toctree-expand,.wy-menu-vertical li.on a .btn button.toctree-expand,.wy-menu-vertical li.on a .nav button.toctree-expand{display:inline}.btn .fa-large.icon,.btn .fa.fa-large,.btn .rst-content .code-block-caption .fa-large.headerlink,.btn .rst-content .eqno .fa-large.headerlink,.btn .rst-content .fa-large.admonition-title,.btn .rst-content code.download span.fa-large:first-child,.btn .rst-content dl dt .fa-large.headerlink,.btn .rst-content h1 .fa-large.headerlink,.btn .rst-content h2 .fa-large.headerlink,.btn .rst-content h3 .fa-large.headerlink,.btn .rst-content h4 .fa-large.headerlink,.btn .rst-content h5 .fa-large.headerlink,.btn .rst-content h6 .fa-large.headerlink,.btn .rst-content p .fa-large.headerlink,.btn .rst-content table>caption .fa-large.headerlink,.btn .rst-content tt.download span.fa-large:first-child,.btn .wy-menu-vertical li button.fa-large.toctree-expand,.nav .fa-large.icon,.nav .fa.fa-large,.nav .rst-content .code-block-caption .fa-large.headerlink,.nav .rst-content .eqno .fa-large.headerlink,.nav .rst-content .fa-large.admonition-title,.nav .rst-content code.download span.fa-large:first-child,.nav .rst-content dl dt .fa-large.headerlink,.nav .rst-content h1 .fa-large.headerlink,.nav .rst-content h2 .fa-large.headerlink,.nav .rst-content h3 .fa-large.headerlink,.nav .rst-content h4 .fa-large.headerlink,.nav .rst-content h5 .fa-large.headerlink,.nav .rst-content h6 .fa-large.headerlink,.nav .rst-content p .fa-large.headerlink,.nav .rst-content table>caption .fa-large.headerlink,.nav .rst-content tt.download span.fa-large:first-child,.nav .wy-menu-vertical li button.fa-large.toctree-expand,.rst-content .btn .fa-large.admonition-title,.rst-content .code-block-caption .btn .fa-large.headerlink,.rst-content .code-block-caption .nav .fa-large.headerlink,.rst-content .eqno .btn .fa-large.headerlink,.rst-content .eqno .nav .fa-large.headerlink,.rst-content .nav .fa-large.admonition-title,.rst-content code.download .btn span.fa-large:first-child,.rst-content code.download .nav span.fa-large:first-child,.rst-content dl dt .btn .fa-large.headerlink,.rst-content dl dt .nav .fa-large.headerlink,.rst-content h1 .btn .fa-large.headerlink,.rst-content h1 .nav .fa-large.headerlink,.rst-content h2 .btn .fa-large.headerlink,.rst-content h2 .nav .fa-large.headerlink,.rst-content h3 .btn .fa-large.headerlink,.rst-content h3 .nav .fa-large.headerlink,.rst-content h4 .btn .fa-large.headerlink,.rst-content h4 .nav .fa-large.headerlink,.rst-content h5 .btn .fa-large.headerlink,.rst-content h5 .nav .fa-large.headerlink,.rst-content h6 .btn .fa-large.headerlink,.rst-content h6 .nav .fa-large.headerlink,.rst-content p .btn .fa-large.headerlink,.rst-content p .nav .fa-large.headerlink,.rst-content table>caption .btn .fa-large.headerlink,.rst-content table>caption .nav .fa-large.headerlink,.rst-content tt.download .btn span.fa-large:first-child,.rst-content tt.download .nav span.fa-large:first-child,.wy-menu-vertical li .btn 
button.fa-large.toctree-expand,.wy-menu-vertical li .nav button.fa-large.toctree-expand{line-height:.9em}.btn .fa-spin.icon,.btn .fa.fa-spin,.btn .rst-content .code-block-caption .fa-spin.headerlink,.btn .rst-content .eqno .fa-spin.headerlink,.btn .rst-content .fa-spin.admonition-title,.btn .rst-content code.download span.fa-spin:first-child,.btn .rst-content dl dt .fa-spin.headerlink,.btn .rst-content h1 .fa-spin.headerlink,.btn .rst-content h2 .fa-spin.headerlink,.btn .rst-content h3 .fa-spin.headerlink,.btn .rst-content h4 .fa-spin.headerlink,.btn .rst-content h5 .fa-spin.headerlink,.btn .rst-content h6 .fa-spin.headerlink,.btn .rst-content p .fa-spin.headerlink,.btn .rst-content table>caption .fa-spin.headerlink,.btn .rst-content tt.download span.fa-spin:first-child,.btn .wy-menu-vertical li button.fa-spin.toctree-expand,.nav .fa-spin.icon,.nav .fa.fa-spin,.nav .rst-content .code-block-caption .fa-spin.headerlink,.nav .rst-content .eqno .fa-spin.headerlink,.nav .rst-content .fa-spin.admonition-title,.nav .rst-content code.download span.fa-spin:first-child,.nav .rst-content dl dt .fa-spin.headerlink,.nav .rst-content h1 .fa-spin.headerlink,.nav .rst-content h2 .fa-spin.headerlink,.nav .rst-content h3 .fa-spin.headerlink,.nav .rst-content h4 .fa-spin.headerlink,.nav .rst-content h5 .fa-spin.headerlink,.nav .rst-content h6 .fa-spin.headerlink,.nav .rst-content p .fa-spin.headerlink,.nav .rst-content table>caption .fa-spin.headerlink,.nav .rst-content tt.download span.fa-spin:first-child,.nav .wy-menu-vertical li button.fa-spin.toctree-expand,.rst-content .btn .fa-spin.admonition-title,.rst-content .code-block-caption .btn .fa-spin.headerlink,.rst-content .code-block-caption .nav .fa-spin.headerlink,.rst-content .eqno .btn .fa-spin.headerlink,.rst-content .eqno .nav .fa-spin.headerlink,.rst-content .nav .fa-spin.admonition-title,.rst-content code.download .btn span.fa-spin:first-child,.rst-content code.download .nav span.fa-spin:first-child,.rst-content dl dt .btn .fa-spin.headerlink,.rst-content dl dt .nav .fa-spin.headerlink,.rst-content h1 .btn .fa-spin.headerlink,.rst-content h1 .nav .fa-spin.headerlink,.rst-content h2 .btn .fa-spin.headerlink,.rst-content h2 .nav .fa-spin.headerlink,.rst-content h3 .btn .fa-spin.headerlink,.rst-content h3 .nav .fa-spin.headerlink,.rst-content h4 .btn .fa-spin.headerlink,.rst-content h4 .nav .fa-spin.headerlink,.rst-content h5 .btn .fa-spin.headerlink,.rst-content h5 .nav .fa-spin.headerlink,.rst-content h6 .btn .fa-spin.headerlink,.rst-content h6 .nav .fa-spin.headerlink,.rst-content p .btn .fa-spin.headerlink,.rst-content p .nav .fa-spin.headerlink,.rst-content table>caption .btn .fa-spin.headerlink,.rst-content table>caption .nav .fa-spin.headerlink,.rst-content tt.download .btn span.fa-spin:first-child,.rst-content tt.download .nav span.fa-spin:first-child,.wy-menu-vertical li .btn button.fa-spin.toctree-expand,.wy-menu-vertical li .nav button.fa-spin.toctree-expand{display:inline-block}.btn.fa:before,.btn.icon:before,.rst-content .btn.admonition-title:before,.rst-content .code-block-caption .btn.headerlink:before,.rst-content .eqno .btn.headerlink:before,.rst-content code.download span.btn:first-child:before,.rst-content dl dt .btn.headerlink:before,.rst-content h1 .btn.headerlink:before,.rst-content h2 .btn.headerlink:before,.rst-content h3 .btn.headerlink:before,.rst-content h4 .btn.headerlink:before,.rst-content h5 .btn.headerlink:before,.rst-content h6 .btn.headerlink:before,.rst-content p .btn.headerlink:before,.rst-content table>caption 
.btn.headerlink:before,.rst-content tt.download span.btn:first-child:before,.wy-menu-vertical li button.btn.toctree-expand:before{opacity:.5;-webkit-transition:opacity .05s ease-in;-moz-transition:opacity .05s ease-in;transition:opacity .05s ease-in}.btn.fa:hover:before,.btn.icon:hover:before,.rst-content .btn.admonition-title:hover:before,.rst-content .code-block-caption .btn.headerlink:hover:before,.rst-content .eqno .btn.headerlink:hover:before,.rst-content code.download span.btn:first-child:hover:before,.rst-content dl dt .btn.headerlink:hover:before,.rst-content h1 .btn.headerlink:hover:before,.rst-content h2 .btn.headerlink:hover:before,.rst-content h3 .btn.headerlink:hover:before,.rst-content h4 .btn.headerlink:hover:before,.rst-content h5 .btn.headerlink:hover:before,.rst-content h6 .btn.headerlink:hover:before,.rst-content p .btn.headerlink:hover:before,.rst-content table>caption .btn.headerlink:hover:before,.rst-content tt.download span.btn:first-child:hover:before,.wy-menu-vertical li button.btn.toctree-expand:hover:before{opacity:1}.btn-mini .fa:before,.btn-mini .icon:before,.btn-mini .rst-content .admonition-title:before,.btn-mini .rst-content .code-block-caption .headerlink:before,.btn-mini .rst-content .eqno .headerlink:before,.btn-mini .rst-content code.download span:first-child:before,.btn-mini .rst-content dl dt .headerlink:before,.btn-mini .rst-content h1 .headerlink:before,.btn-mini .rst-content h2 .headerlink:before,.btn-mini .rst-content h3 .headerlink:before,.btn-mini .rst-content h4 .headerlink:before,.btn-mini .rst-content h5 .headerlink:before,.btn-mini .rst-content h6 .headerlink:before,.btn-mini .rst-content p .headerlink:before,.btn-mini .rst-content table>caption .headerlink:before,.btn-mini .rst-content tt.download span:first-child:before,.btn-mini .wy-menu-vertical li button.toctree-expand:before,.rst-content .btn-mini .admonition-title:before,.rst-content .code-block-caption .btn-mini .headerlink:before,.rst-content .eqno .btn-mini .headerlink:before,.rst-content code.download .btn-mini span:first-child:before,.rst-content dl dt .btn-mini .headerlink:before,.rst-content h1 .btn-mini .headerlink:before,.rst-content h2 .btn-mini .headerlink:before,.rst-content h3 .btn-mini .headerlink:before,.rst-content h4 .btn-mini .headerlink:before,.rst-content h5 .btn-mini .headerlink:before,.rst-content h6 .btn-mini .headerlink:before,.rst-content p .btn-mini .headerlink:before,.rst-content table>caption .btn-mini .headerlink:before,.rst-content tt.download .btn-mini span:first-child:before,.wy-menu-vertical li .btn-mini button.toctree-expand:before{font-size:14px;vertical-align:-15%}.rst-content .admonition,.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .danger,.rst-content .error,.rst-content .hint,.rst-content .important,.rst-content .note,.rst-content .seealso,.rst-content .tip,.rst-content .warning,.wy-alert{padding:12px;line-height:24px;margin-bottom:24px;background:#e7f2fa}.rst-content .admonition-title,.wy-alert-title{font-weight:700;display:block;color:#fff;background:#6ab0de;padding:6px 12px;margin:-12px -12px 12px}.rst-content .danger,.rst-content .error,.rst-content .wy-alert-danger.admonition,.rst-content .wy-alert-danger.admonition-todo,.rst-content .wy-alert-danger.attention,.rst-content .wy-alert-danger.caution,.rst-content .wy-alert-danger.hint,.rst-content .wy-alert-danger.important,.rst-content .wy-alert-danger.note,.rst-content .wy-alert-danger.seealso,.rst-content .wy-alert-danger.tip,.rst-content 
.wy-alert-danger.warning,.wy-alert.wy-alert-danger{background:#fdf3f2}.rst-content .danger .admonition-title,.rst-content .danger .wy-alert-title,.rst-content .error .admonition-title,.rst-content .error .wy-alert-title,.rst-content .wy-alert-danger.admonition-todo .admonition-title,.rst-content .wy-alert-danger.admonition-todo .wy-alert-title,.rst-content .wy-alert-danger.admonition .admonition-title,.rst-content .wy-alert-danger.admonition .wy-alert-title,.rst-content .wy-alert-danger.attention .admonition-title,.rst-content .wy-alert-danger.attention .wy-alert-title,.rst-content .wy-alert-danger.caution .admonition-title,.rst-content .wy-alert-danger.caution .wy-alert-title,.rst-content .wy-alert-danger.hint .admonition-title,.rst-content .wy-alert-danger.hint .wy-alert-title,.rst-content .wy-alert-danger.important .admonition-title,.rst-content .wy-alert-danger.important .wy-alert-title,.rst-content .wy-alert-danger.note .admonition-title,.rst-content .wy-alert-danger.note .wy-alert-title,.rst-content .wy-alert-danger.seealso .admonition-title,.rst-content .wy-alert-danger.seealso .wy-alert-title,.rst-content .wy-alert-danger.tip .admonition-title,.rst-content .wy-alert-danger.tip .wy-alert-title,.rst-content .wy-alert-danger.warning .admonition-title,.rst-content .wy-alert-danger.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-danger .admonition-title,.wy-alert.wy-alert-danger .rst-content .admonition-title,.wy-alert.wy-alert-danger .wy-alert-title{background:#f29f97}.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .warning,.rst-content .wy-alert-warning.admonition,.rst-content .wy-alert-warning.danger,.rst-content .wy-alert-warning.error,.rst-content .wy-alert-warning.hint,.rst-content .wy-alert-warning.important,.rst-content .wy-alert-warning.note,.rst-content .wy-alert-warning.seealso,.rst-content .wy-alert-warning.tip,.wy-alert.wy-alert-warning{background:#ffedcc}.rst-content .admonition-todo .admonition-title,.rst-content .admonition-todo .wy-alert-title,.rst-content .attention .admonition-title,.rst-content .attention .wy-alert-title,.rst-content .caution .admonition-title,.rst-content .caution .wy-alert-title,.rst-content .warning .admonition-title,.rst-content .warning .wy-alert-title,.rst-content .wy-alert-warning.admonition .admonition-title,.rst-content .wy-alert-warning.admonition .wy-alert-title,.rst-content .wy-alert-warning.danger .admonition-title,.rst-content .wy-alert-warning.danger .wy-alert-title,.rst-content .wy-alert-warning.error .admonition-title,.rst-content .wy-alert-warning.error .wy-alert-title,.rst-content .wy-alert-warning.hint .admonition-title,.rst-content .wy-alert-warning.hint .wy-alert-title,.rst-content .wy-alert-warning.important .admonition-title,.rst-content .wy-alert-warning.important .wy-alert-title,.rst-content .wy-alert-warning.note .admonition-title,.rst-content .wy-alert-warning.note .wy-alert-title,.rst-content .wy-alert-warning.seealso .admonition-title,.rst-content .wy-alert-warning.seealso .wy-alert-title,.rst-content .wy-alert-warning.tip .admonition-title,.rst-content .wy-alert-warning.tip .wy-alert-title,.rst-content .wy-alert.wy-alert-warning .admonition-title,.wy-alert.wy-alert-warning .rst-content .admonition-title,.wy-alert.wy-alert-warning .wy-alert-title{background:#f0b37e}.rst-content .note,.rst-content .seealso,.rst-content .wy-alert-info.admonition,.rst-content .wy-alert-info.admonition-todo,.rst-content .wy-alert-info.attention,.rst-content .wy-alert-info.caution,.rst-content 
.wy-alert-info.danger,.rst-content .wy-alert-info.error,.rst-content .wy-alert-info.hint,.rst-content .wy-alert-info.important,.rst-content .wy-alert-info.tip,.rst-content .wy-alert-info.warning,.wy-alert.wy-alert-info{background:#e7f2fa}.rst-content .note .admonition-title,.rst-content .note .wy-alert-title,.rst-content .seealso .admonition-title,.rst-content .seealso .wy-alert-title,.rst-content .wy-alert-info.admonition-todo .admonition-title,.rst-content .wy-alert-info.admonition-todo .wy-alert-title,.rst-content .wy-alert-info.admonition .admonition-title,.rst-content .wy-alert-info.admonition .wy-alert-title,.rst-content .wy-alert-info.attention .admonition-title,.rst-content .wy-alert-info.attention .wy-alert-title,.rst-content .wy-alert-info.caution .admonition-title,.rst-content .wy-alert-info.caution .wy-alert-title,.rst-content .wy-alert-info.danger .admonition-title,.rst-content .wy-alert-info.danger .wy-alert-title,.rst-content .wy-alert-info.error .admonition-title,.rst-content .wy-alert-info.error .wy-alert-title,.rst-content .wy-alert-info.hint .admonition-title,.rst-content .wy-alert-info.hint .wy-alert-title,.rst-content .wy-alert-info.important .admonition-title,.rst-content .wy-alert-info.important .wy-alert-title,.rst-content .wy-alert-info.tip .admonition-title,.rst-content .wy-alert-info.tip .wy-alert-title,.rst-content .wy-alert-info.warning .admonition-title,.rst-content .wy-alert-info.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-info .admonition-title,.wy-alert.wy-alert-info .rst-content .admonition-title,.wy-alert.wy-alert-info .wy-alert-title{background:#6ab0de}.rst-content .hint,.rst-content .important,.rst-content .tip,.rst-content .wy-alert-success.admonition,.rst-content .wy-alert-success.admonition-todo,.rst-content .wy-alert-success.attention,.rst-content .wy-alert-success.caution,.rst-content .wy-alert-success.danger,.rst-content .wy-alert-success.error,.rst-content .wy-alert-success.note,.rst-content .wy-alert-success.seealso,.rst-content .wy-alert-success.warning,.wy-alert.wy-alert-success{background:#dbfaf4}.rst-content .hint .admonition-title,.rst-content .hint .wy-alert-title,.rst-content .important .admonition-title,.rst-content .important .wy-alert-title,.rst-content .tip .admonition-title,.rst-content .tip .wy-alert-title,.rst-content .wy-alert-success.admonition-todo .admonition-title,.rst-content .wy-alert-success.admonition-todo .wy-alert-title,.rst-content .wy-alert-success.admonition .admonition-title,.rst-content .wy-alert-success.admonition .wy-alert-title,.rst-content .wy-alert-success.attention .admonition-title,.rst-content .wy-alert-success.attention .wy-alert-title,.rst-content .wy-alert-success.caution .admonition-title,.rst-content .wy-alert-success.caution .wy-alert-title,.rst-content .wy-alert-success.danger .admonition-title,.rst-content .wy-alert-success.danger .wy-alert-title,.rst-content .wy-alert-success.error .admonition-title,.rst-content .wy-alert-success.error .wy-alert-title,.rst-content .wy-alert-success.note .admonition-title,.rst-content .wy-alert-success.note .wy-alert-title,.rst-content .wy-alert-success.seealso .admonition-title,.rst-content .wy-alert-success.seealso .wy-alert-title,.rst-content .wy-alert-success.warning .admonition-title,.rst-content .wy-alert-success.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-success .admonition-title,.wy-alert.wy-alert-success .rst-content .admonition-title,.wy-alert.wy-alert-success .wy-alert-title{background:#1abc9c}.rst-content 
.wy-alert-neutral.admonition,.rst-content .wy-alert-neutral.admonition-todo,.rst-content .wy-alert-neutral.attention,.rst-content .wy-alert-neutral.caution,.rst-content .wy-alert-neutral.danger,.rst-content .wy-alert-neutral.error,.rst-content .wy-alert-neutral.hint,.rst-content .wy-alert-neutral.important,.rst-content .wy-alert-neutral.note,.rst-content .wy-alert-neutral.seealso,.rst-content .wy-alert-neutral.tip,.rst-content .wy-alert-neutral.warning,.wy-alert.wy-alert-neutral{background:#f3f6f6}.rst-content .wy-alert-neutral.admonition-todo .admonition-title,.rst-content .wy-alert-neutral.admonition-todo .wy-alert-title,.rst-content .wy-alert-neutral.admonition .admonition-title,.rst-content .wy-alert-neutral.admonition .wy-alert-title,.rst-content .wy-alert-neutral.attention .admonition-title,.rst-content .wy-alert-neutral.attention .wy-alert-title,.rst-content .wy-alert-neutral.caution .admonition-title,.rst-content .wy-alert-neutral.caution .wy-alert-title,.rst-content .wy-alert-neutral.danger .admonition-title,.rst-content .wy-alert-neutral.danger .wy-alert-title,.rst-content .wy-alert-neutral.error .admonition-title,.rst-content .wy-alert-neutral.error .wy-alert-title,.rst-content .wy-alert-neutral.hint .admonition-title,.rst-content .wy-alert-neutral.hint .wy-alert-title,.rst-content .wy-alert-neutral.important .admonition-title,.rst-content .wy-alert-neutral.important .wy-alert-title,.rst-content .wy-alert-neutral.note .admonition-title,.rst-content .wy-alert-neutral.note .wy-alert-title,.rst-content .wy-alert-neutral.seealso .admonition-title,.rst-content .wy-alert-neutral.seealso .wy-alert-title,.rst-content .wy-alert-neutral.tip .admonition-title,.rst-content .wy-alert-neutral.tip .wy-alert-title,.rst-content .wy-alert-neutral.warning .admonition-title,.rst-content .wy-alert-neutral.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-neutral .admonition-title,.wy-alert.wy-alert-neutral .rst-content .admonition-title,.wy-alert.wy-alert-neutral .wy-alert-title{color:#404040;background:#e1e4e5}.rst-content .wy-alert-neutral.admonition-todo a,.rst-content .wy-alert-neutral.admonition a,.rst-content .wy-alert-neutral.attention a,.rst-content .wy-alert-neutral.caution a,.rst-content .wy-alert-neutral.danger a,.rst-content .wy-alert-neutral.error a,.rst-content .wy-alert-neutral.hint a,.rst-content .wy-alert-neutral.important a,.rst-content .wy-alert-neutral.note a,.rst-content .wy-alert-neutral.seealso a,.rst-content .wy-alert-neutral.tip a,.rst-content .wy-alert-neutral.warning a,.wy-alert.wy-alert-neutral a{color:#2980b9}.rst-content .admonition-todo p:last-child,.rst-content .admonition p:last-child,.rst-content .attention p:last-child,.rst-content .caution p:last-child,.rst-content .danger p:last-child,.rst-content .error p:last-child,.rst-content .hint p:last-child,.rst-content .important p:last-child,.rst-content .note p:last-child,.rst-content .seealso p:last-child,.rst-content .tip p:last-child,.rst-content .warning p:last-child,.wy-alert p:last-child{margin-bottom:0}.wy-tray-container{position:fixed;bottom:0;left:0;z-index:600}.wy-tray-container li{display:block;width:300px;background:transparent;color:#fff;text-align:center;box-shadow:0 5px 5px 0 rgba(0,0,0,.1);padding:0 24px;min-width:20%;opacity:0;height:0;line-height:56px;overflow:hidden;-webkit-transition:all .3s ease-in;-moz-transition:all .3s ease-in;transition:all .3s ease-in}.wy-tray-container li.wy-tray-item-success{background:#27ae60}.wy-tray-container 
li.wy-tray-item-info{background:#2980b9}.wy-tray-container li.wy-tray-item-warning{background:#e67e22}.wy-tray-container li.wy-tray-item-danger{background:#e74c3c}.wy-tray-container li.on{opacity:1;height:56px}@media screen and (max-width:768px){.wy-tray-container{bottom:auto;top:0;width:100%}.wy-tray-container li{width:100%}}button{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle;cursor:pointer;line-height:normal;-webkit-appearance:button;*overflow:visible}button::-moz-focus-inner,input::-moz-focus-inner{border:0;padding:0}button[disabled]{cursor:default}.btn{display:inline-block;border-radius:2px;line-height:normal;white-space:nowrap;text-align:center;cursor:pointer;font-size:100%;padding:6px 12px 8px;color:#fff;border:1px solid rgba(0,0,0,.1);background-color:#27ae60;text-decoration:none;font-weight:400;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;box-shadow:inset 0 1px 2px -1px hsla(0,0%,100%,.5),inset 0 -2px 0 0 rgba(0,0,0,.1);outline-none:false;vertical-align:middle;*display:inline;zoom:1;-webkit-user-drag:none;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;user-select:none;-webkit-transition:all .1s linear;-moz-transition:all .1s linear;transition:all .1s linear}.btn-hover{background:#2e8ece;color:#fff}.btn:hover{background:#2cc36b;color:#fff}.btn:focus{background:#2cc36b;outline:0}.btn:active{box-shadow:inset 0 -1px 0 0 rgba(0,0,0,.05),inset 0 2px 0 0 rgba(0,0,0,.1);padding:8px 12px 6px}.btn:visited{color:#fff}.btn-disabled,.btn-disabled:active,.btn-disabled:focus,.btn-disabled:hover,.btn:disabled{background-image:none;filter:progid:DXImageTransform.Microsoft.gradient(enabled = false);filter:alpha(opacity=40);opacity:.4;cursor:not-allowed;box-shadow:none}.btn::-moz-focus-inner{padding:0;border:0}.btn-small{font-size:80%}.btn-info{background-color:#2980b9!important}.btn-info:hover{background-color:#2e8ece!important}.btn-neutral{background-color:#f3f6f6!important;color:#404040!important}.btn-neutral:hover{background-color:#e5ebeb!important;color:#404040}.btn-neutral:visited{color:#404040!important}.btn-success{background-color:#27ae60!important}.btn-success:hover{background-color:#295!important}.btn-danger{background-color:#e74c3c!important}.btn-danger:hover{background-color:#ea6153!important}.btn-warning{background-color:#e67e22!important}.btn-warning:hover{background-color:#e98b39!important}.btn-invert{background-color:#222}.btn-invert:hover{background-color:#2f2f2f!important}.btn-link{background-color:transparent!important;color:#2980b9;box-shadow:none;border-color:transparent!important}.btn-link:active,.btn-link:hover{background-color:transparent!important;color:#409ad5!important;box-shadow:none}.btn-link:visited{color:#9b59b6}.wy-btn-group .btn,.wy-control .btn{vertical-align:middle}.wy-btn-group{margin-bottom:24px;*zoom:1}.wy-btn-group:after,.wy-btn-group:before{display:table;content:""}.wy-btn-group:after{clear:both}.wy-dropdown{position:relative;display:inline-block}.wy-dropdown-active .wy-dropdown-menu{display:block}.wy-dropdown-menu{position:absolute;left:0;display:none;float:left;top:100%;min-width:100%;background:#fcfcfc;z-index:100;border:1px solid #cfd7dd;box-shadow:0 2px 2px 0 rgba(0,0,0,.1);padding:12px}.wy-dropdown-menu>dd>a{display:block;clear:both;color:#404040;white-space:nowrap;font-size:90%;padding:0 12px;cursor:pointer}.wy-dropdown-menu>dd>a:hover{background:#2980b9;color:#fff}.wy-dropdown-menu>dd.divider{border-top:1px solid #cfd7dd;margin:6px 
0}.wy-dropdown-menu>dd.search{padding-bottom:12px}.wy-dropdown-menu>dd.search input[type=search]{width:100%}.wy-dropdown-menu>dd.call-to-action{background:#e3e3e3;text-transform:uppercase;font-weight:500;font-size:80%}.wy-dropdown-menu>dd.call-to-action:hover{background:#e3e3e3}.wy-dropdown-menu>dd.call-to-action .btn{color:#fff}.wy-dropdown.wy-dropdown-up .wy-dropdown-menu{bottom:100%;top:auto;left:auto;right:0}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu{background:#fcfcfc;margin-top:2px}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu a{padding:6px 12px}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu a:hover{background:#2980b9;color:#fff}.wy-dropdown.wy-dropdown-left .wy-dropdown-menu{right:0;left:auto;text-align:right}.wy-dropdown-arrow:before{content:" ";border-bottom:5px solid #f5f5f5;border-left:5px solid transparent;border-right:5px solid transparent;position:absolute;display:block;top:-4px;left:50%;margin-left:-3px}.wy-dropdown-arrow.wy-dropdown-arrow-left:before{left:11px}.wy-form-stacked select{display:block}.wy-form-aligned .wy-help-inline,.wy-form-aligned input,.wy-form-aligned label,.wy-form-aligned select,.wy-form-aligned textarea{display:inline-block;*display:inline;*zoom:1;vertical-align:middle}.wy-form-aligned .wy-control-group>label{display:inline-block;vertical-align:middle;width:10em;margin:6px 12px 0 0;float:left}.wy-form-aligned .wy-control{float:left}.wy-form-aligned .wy-control label{display:block}.wy-form-aligned .wy-control select{margin-top:6px}fieldset{margin:0}fieldset,legend{border:0;padding:0}legend{width:100%;white-space:normal;margin-bottom:24px;font-size:150%;*margin-left:-7px}label,legend{display:block}label{margin:0 0 .3125em;color:#333;font-size:90%}input,select,textarea{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle}.wy-control-group{margin-bottom:24px;max-width:1200px;margin-left:auto;margin-right:auto;*zoom:1}.wy-control-group:after,.wy-control-group:before{display:table;content:""}.wy-control-group:after{clear:both}.wy-control-group.wy-control-group-required>label:after{content:" *";color:#e74c3c}.wy-control-group .wy-form-full,.wy-control-group .wy-form-halves,.wy-control-group .wy-form-thirds{padding-bottom:12px}.wy-control-group .wy-form-full input[type=color],.wy-control-group .wy-form-full input[type=date],.wy-control-group .wy-form-full input[type=datetime-local],.wy-control-group .wy-form-full input[type=datetime],.wy-control-group .wy-form-full input[type=email],.wy-control-group .wy-form-full input[type=month],.wy-control-group .wy-form-full input[type=number],.wy-control-group .wy-form-full input[type=password],.wy-control-group .wy-form-full input[type=search],.wy-control-group .wy-form-full input[type=tel],.wy-control-group .wy-form-full input[type=text],.wy-control-group .wy-form-full input[type=time],.wy-control-group .wy-form-full input[type=url],.wy-control-group .wy-form-full input[type=week],.wy-control-group .wy-form-full select,.wy-control-group .wy-form-halves input[type=color],.wy-control-group .wy-form-halves input[type=date],.wy-control-group .wy-form-halves input[type=datetime-local],.wy-control-group .wy-form-halves input[type=datetime],.wy-control-group .wy-form-halves input[type=email],.wy-control-group .wy-form-halves input[type=month],.wy-control-group .wy-form-halves input[type=number],.wy-control-group .wy-form-halves input[type=password],.wy-control-group .wy-form-halves input[type=search],.wy-control-group .wy-form-halves input[type=tel],.wy-control-group .wy-form-halves 
input[type=text],.wy-control-group .wy-form-halves input[type=time],.wy-control-group .wy-form-halves input[type=url],.wy-control-group .wy-form-halves input[type=week],.wy-control-group .wy-form-halves select,.wy-control-group .wy-form-thirds input[type=color],.wy-control-group .wy-form-thirds input[type=date],.wy-control-group .wy-form-thirds input[type=datetime-local],.wy-control-group .wy-form-thirds input[type=datetime],.wy-control-group .wy-form-thirds input[type=email],.wy-control-group .wy-form-thirds input[type=month],.wy-control-group .wy-form-thirds input[type=number],.wy-control-group .wy-form-thirds input[type=password],.wy-control-group .wy-form-thirds input[type=search],.wy-control-group .wy-form-thirds input[type=tel],.wy-control-group .wy-form-thirds input[type=text],.wy-control-group .wy-form-thirds input[type=time],.wy-control-group .wy-form-thirds input[type=url],.wy-control-group .wy-form-thirds input[type=week],.wy-control-group .wy-form-thirds select{width:100%}.wy-control-group .wy-form-full{float:left;display:block;width:100%;margin-right:0}.wy-control-group .wy-form-full:last-child{margin-right:0}.wy-control-group .wy-form-halves{float:left;display:block;margin-right:2.35765%;width:48.82117%}.wy-control-group .wy-form-halves:last-child,.wy-control-group .wy-form-halves:nth-of-type(2n){margin-right:0}.wy-control-group .wy-form-halves:nth-of-type(odd){clear:left}.wy-control-group .wy-form-thirds{float:left;display:block;margin-right:2.35765%;width:31.76157%}.wy-control-group .wy-form-thirds:last-child,.wy-control-group .wy-form-thirds:nth-of-type(3n){margin-right:0}.wy-control-group .wy-form-thirds:nth-of-type(3n+1){clear:left}.wy-control-group.wy-control-group-no-input .wy-control,.wy-control-no-input{margin:6px 0 0;font-size:90%}.wy-control-no-input{display:inline-block}.wy-control-group.fluid-input input[type=color],.wy-control-group.fluid-input input[type=date],.wy-control-group.fluid-input input[type=datetime-local],.wy-control-group.fluid-input input[type=datetime],.wy-control-group.fluid-input input[type=email],.wy-control-group.fluid-input input[type=month],.wy-control-group.fluid-input input[type=number],.wy-control-group.fluid-input input[type=password],.wy-control-group.fluid-input input[type=search],.wy-control-group.fluid-input input[type=tel],.wy-control-group.fluid-input input[type=text],.wy-control-group.fluid-input input[type=time],.wy-control-group.fluid-input input[type=url],.wy-control-group.fluid-input input[type=week]{width:100%}.wy-form-message-inline{padding-left:.3em;color:#666;font-size:90%}.wy-form-message{display:block;color:#999;font-size:70%;margin-top:.3125em;font-style:italic}.wy-form-message p{font-size:inherit;font-style:italic;margin-bottom:6px}.wy-form-message p:last-child{margin-bottom:0}input{line-height:normal}input[type=button],input[type=reset],input[type=submit]{-webkit-appearance:button;cursor:pointer;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;*overflow:visible}input[type=color],input[type=date],input[type=datetime-local],input[type=datetime],input[type=email],input[type=month],input[type=number],input[type=password],input[type=search],input[type=tel],input[type=text],input[type=time],input[type=url],input[type=week]{-webkit-appearance:none;padding:6px;display:inline-block;border:1px solid #ccc;font-size:80%;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;box-shadow:inset 0 1px 3px #ddd;border-radius:0;-webkit-transition:border .3s linear;-moz-transition:border .3s linear;transition:border 
.3s linear}input[type=datetime-local]{padding:.34375em .625em}input[disabled]{cursor:default}input[type=checkbox],input[type=radio]{padding:0;margin-right:.3125em;*height:13px;*width:13px}input[type=checkbox],input[type=radio],input[type=search]{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}input[type=search]::-webkit-search-cancel-button,input[type=search]::-webkit-search-decoration{-webkit-appearance:none}input[type=color]:focus,input[type=date]:focus,input[type=datetime-local]:focus,input[type=datetime]:focus,input[type=email]:focus,input[type=month]:focus,input[type=number]:focus,input[type=password]:focus,input[type=search]:focus,input[type=tel]:focus,input[type=text]:focus,input[type=time]:focus,input[type=url]:focus,input[type=week]:focus{outline:0;outline:thin dotted\9;border-color:#333}input.no-focus:focus{border-color:#ccc!important}input[type=checkbox]:focus,input[type=file]:focus,input[type=radio]:focus{outline:thin dotted #333;outline:1px auto #129fea}input[type=color][disabled],input[type=date][disabled],input[type=datetime-local][disabled],input[type=datetime][disabled],input[type=email][disabled],input[type=month][disabled],input[type=number][disabled],input[type=password][disabled],input[type=search][disabled],input[type=tel][disabled],input[type=text][disabled],input[type=time][disabled],input[type=url][disabled],input[type=week][disabled]{cursor:not-allowed;background-color:#fafafa}input:focus:invalid,select:focus:invalid,textarea:focus:invalid{color:#e74c3c;border:1px solid #e74c3c}input:focus:invalid:focus,select:focus:invalid:focus,textarea:focus:invalid:focus{border-color:#e74c3c}input[type=checkbox]:focus:invalid:focus,input[type=file]:focus:invalid:focus,input[type=radio]:focus:invalid:focus{outline-color:#e74c3c}input.wy-input-large{padding:12px;font-size:100%}textarea{overflow:auto;vertical-align:top;width:100%;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif}select,textarea{padding:.5em .625em;display:inline-block;border:1px solid #ccc;font-size:80%;box-shadow:inset 0 1px 3px #ddd;-webkit-transition:border .3s linear;-moz-transition:border .3s linear;transition:border .3s linear}select{border:1px solid #ccc;background-color:#fff}select[multiple]{height:auto}select:focus,textarea:focus{outline:0}input[readonly],select[disabled],select[readonly],textarea[disabled],textarea[readonly]{cursor:not-allowed;background-color:#fafafa}input[type=checkbox][disabled],input[type=radio][disabled]{cursor:not-allowed}.wy-checkbox,.wy-radio{margin:6px 0;color:#404040;display:block}.wy-checkbox input,.wy-radio input{vertical-align:baseline}.wy-form-message-inline{display:inline-block;*display:inline;*zoom:1;vertical-align:middle}.wy-input-prefix,.wy-input-suffix{white-space:nowrap;padding:6px}.wy-input-prefix .wy-input-context,.wy-input-suffix .wy-input-context{line-height:27px;padding:0 8px;display:inline-block;font-size:80%;background-color:#f3f6f6;border:1px solid #ccc;color:#999}.wy-input-suffix .wy-input-context{border-left:0}.wy-input-prefix .wy-input-context{border-right:0}.wy-switch{position:relative;display:block;height:24px;margin-top:12px;cursor:pointer}.wy-switch:before{left:0;top:0;width:36px;height:12px;background:#ccc}.wy-switch:after,.wy-switch:before{position:absolute;content:"";display:block;border-radius:4px;-webkit-transition:all .2s ease-in-out;-moz-transition:all .2s ease-in-out;transition:all .2s ease-in-out}.wy-switch:after{width:18px;height:18px;background:#999;left:-3px;top:-3px}.wy-switch 
span{position:absolute;left:48px;display:block;font-size:12px;color:#ccc;line-height:1}.wy-switch.active:before{background:#1e8449}.wy-switch.active:after{left:24px;background:#27ae60}.wy-switch.disabled{cursor:not-allowed;opacity:.8}.wy-control-group.wy-control-group-error .wy-form-message,.wy-control-group.wy-control-group-error>label{color:#e74c3c}.wy-control-group.wy-control-group-error input[type=color],.wy-control-group.wy-control-group-error input[type=date],.wy-control-group.wy-control-group-error input[type=datetime-local],.wy-control-group.wy-control-group-error input[type=datetime],.wy-control-group.wy-control-group-error input[type=email],.wy-control-group.wy-control-group-error input[type=month],.wy-control-group.wy-control-group-error input[type=number],.wy-control-group.wy-control-group-error input[type=password],.wy-control-group.wy-control-group-error input[type=search],.wy-control-group.wy-control-group-error input[type=tel],.wy-control-group.wy-control-group-error input[type=text],.wy-control-group.wy-control-group-error input[type=time],.wy-control-group.wy-control-group-error input[type=url],.wy-control-group.wy-control-group-error input[type=week],.wy-control-group.wy-control-group-error textarea{border:1px solid #e74c3c}.wy-inline-validate{white-space:nowrap}.wy-inline-validate .wy-input-context{padding:.5em .625em;display:inline-block;font-size:80%}.wy-inline-validate.wy-inline-validate-success .wy-input-context{color:#27ae60}.wy-inline-validate.wy-inline-validate-danger .wy-input-context{color:#e74c3c}.wy-inline-validate.wy-inline-validate-warning .wy-input-context{color:#e67e22}.wy-inline-validate.wy-inline-validate-info .wy-input-context{color:#2980b9}.rotate-90{-webkit-transform:rotate(90deg);-moz-transform:rotate(90deg);-ms-transform:rotate(90deg);-o-transform:rotate(90deg);transform:rotate(90deg)}.rotate-180{-webkit-transform:rotate(180deg);-moz-transform:rotate(180deg);-ms-transform:rotate(180deg);-o-transform:rotate(180deg);transform:rotate(180deg)}.rotate-270{-webkit-transform:rotate(270deg);-moz-transform:rotate(270deg);-ms-transform:rotate(270deg);-o-transform:rotate(270deg);transform:rotate(270deg)}.mirror{-webkit-transform:scaleX(-1);-moz-transform:scaleX(-1);-ms-transform:scaleX(-1);-o-transform:scaleX(-1);transform:scaleX(-1)}.mirror.rotate-90{-webkit-transform:scaleX(-1) rotate(90deg);-moz-transform:scaleX(-1) rotate(90deg);-ms-transform:scaleX(-1) rotate(90deg);-o-transform:scaleX(-1) rotate(90deg);transform:scaleX(-1) rotate(90deg)}.mirror.rotate-180{-webkit-transform:scaleX(-1) rotate(180deg);-moz-transform:scaleX(-1) rotate(180deg);-ms-transform:scaleX(-1) rotate(180deg);-o-transform:scaleX(-1) rotate(180deg);transform:scaleX(-1) rotate(180deg)}.mirror.rotate-270{-webkit-transform:scaleX(-1) rotate(270deg);-moz-transform:scaleX(-1) rotate(270deg);-ms-transform:scaleX(-1) rotate(270deg);-o-transform:scaleX(-1) rotate(270deg);transform:scaleX(-1) rotate(270deg)}@media only screen and (max-width:480px){.wy-form button[type=submit]{margin:.7em 0 0}.wy-form input[type=color],.wy-form input[type=date],.wy-form input[type=datetime-local],.wy-form input[type=datetime],.wy-form input[type=email],.wy-form input[type=month],.wy-form input[type=number],.wy-form input[type=password],.wy-form input[type=search],.wy-form input[type=tel],.wy-form input[type=text],.wy-form input[type=time],.wy-form input[type=url],.wy-form input[type=week],.wy-form label{margin-bottom:.3em;display:block}.wy-form input[type=color],.wy-form input[type=date],.wy-form 
input[type=datetime-local],.wy-form input[type=datetime],.wy-form input[type=email],.wy-form input[type=month],.wy-form input[type=number],.wy-form input[type=password],.wy-form input[type=search],.wy-form input[type=tel],.wy-form input[type=time],.wy-form input[type=url],.wy-form input[type=week]{margin-bottom:0}.wy-form-aligned .wy-control-group label{margin-bottom:.3em;text-align:left;display:block;width:100%}.wy-form-aligned .wy-control{margin:1.5em 0 0}.wy-form-message,.wy-form-message-inline,.wy-form .wy-help-inline{display:block;font-size:80%;padding:6px 0}}@media screen and (max-width:768px){.tablet-hide{display:none}}@media screen and (max-width:480px){.mobile-hide{display:none}}.float-left{float:left}.float-right{float:right}.full-width{width:100%}.rst-content table.docutils,.rst-content table.field-list,.wy-table{border-collapse:collapse;border-spacing:0;empty-cells:show;margin-bottom:24px}.rst-content table.docutils caption,.rst-content table.field-list caption,.wy-table caption{color:#000;font:italic 85%/1 arial,sans-serif;padding:1em 0;text-align:center}.rst-content table.docutils td,.rst-content table.docutils th,.rst-content table.field-list td,.rst-content table.field-list th,.wy-table td,.wy-table th{font-size:90%;margin:0;overflow:visible;padding:8px 16px}.rst-content table.docutils td:first-child,.rst-content table.docutils th:first-child,.rst-content table.field-list td:first-child,.rst-content table.field-list th:first-child,.wy-table td:first-child,.wy-table th:first-child{border-left-width:0}.rst-content table.docutils thead,.rst-content table.field-list thead,.wy-table thead{color:#000;text-align:left;vertical-align:bottom;white-space:nowrap}.rst-content table.docutils thead th,.rst-content table.field-list thead th,.wy-table thead th{font-weight:700;border-bottom:2px solid #e1e4e5}.rst-content table.docutils td,.rst-content table.field-list td,.wy-table td{background-color:transparent;vertical-align:middle}.rst-content table.docutils td p,.rst-content table.field-list td p,.wy-table td p{line-height:18px}.rst-content table.docutils td p:last-child,.rst-content table.field-list td p:last-child,.wy-table td p:last-child{margin-bottom:0}.rst-content table.docutils .wy-table-cell-min,.rst-content table.field-list .wy-table-cell-min,.wy-table .wy-table-cell-min{width:1%;padding-right:0}.rst-content table.docutils .wy-table-cell-min input[type=checkbox],.rst-content table.field-list .wy-table-cell-min input[type=checkbox],.wy-table .wy-table-cell-min input[type=checkbox]{margin:0}.wy-table-secondary{color:grey;font-size:90%}.wy-table-tertiary{color:grey;font-size:80%}.rst-content table.docutils:not(.field-list) tr:nth-child(2n-1) td,.wy-table-backed,.wy-table-odd td,.wy-table-striped tr:nth-child(2n-1) td{background-color:#f3f6f6}.rst-content table.docutils,.wy-table-bordered-all{border:1px solid #e1e4e5}.rst-content table.docutils td,.wy-table-bordered-all td{border-bottom:1px solid #e1e4e5;border-left:1px solid #e1e4e5}.rst-content table.docutils tbody>tr:last-child td,.wy-table-bordered-all tbody>tr:last-child td{border-bottom-width:0}.wy-table-bordered{border:1px solid #e1e4e5}.wy-table-bordered-rows td{border-bottom:1px solid #e1e4e5}.wy-table-bordered-rows tbody>tr:last-child td{border-bottom-width:0}.wy-table-horizontal td,.wy-table-horizontal th{border-width:0 0 1px;border-bottom:1px solid #e1e4e5}.wy-table-horizontal tbody>tr:last-child td{border-bottom-width:0}.wy-table-responsive{margin-bottom:24px;max-width:100%;overflow:auto}.wy-table-responsive 
table{margin-bottom:0!important}.wy-table-responsive table td,.wy-table-responsive table th{white-space:nowrap}a{color:#2980b9;text-decoration:none;cursor:pointer}a:hover{color:#3091d1}a:visited{color:#9b59b6}html{height:100%}body,html{overflow-x:hidden}body{font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;font-weight:400;color:#404040;min-height:100%;background:#edf0f2}.wy-text-left{text-align:left}.wy-text-center{text-align:center}.wy-text-right{text-align:right}.wy-text-large{font-size:120%}.wy-text-normal{font-size:100%}.wy-text-small,small{font-size:80%}.wy-text-strike{text-decoration:line-through}.wy-text-warning{color:#e67e22!important}a.wy-text-warning:hover{color:#eb9950!important}.wy-text-info{color:#2980b9!important}a.wy-text-info:hover{color:#409ad5!important}.wy-text-success{color:#27ae60!important}a.wy-text-success:hover{color:#36d278!important}.wy-text-danger{color:#e74c3c!important}a.wy-text-danger:hover{color:#ed7669!important}.wy-text-neutral{color:#404040!important}a.wy-text-neutral:hover{color:#595959!important}.rst-content .toctree-wrapper>p.caption,h1,h2,h3,h4,h5,h6,legend{margin-top:0;font-weight:700;font-family:Roboto Slab,ff-tisa-web-pro,Georgia,Arial,sans-serif}p{line-height:24px;font-size:16px;margin:0 0 24px}h1{font-size:175%}.rst-content .toctree-wrapper>p.caption,h2{font-size:150%}h3{font-size:125%}h4{font-size:115%}h5{font-size:110%}h6{font-size:100%}hr{display:block;height:1px;border:0;border-top:1px solid #e1e4e5;margin:24px 0;padding:0}.rst-content code,.rst-content tt,code{white-space:nowrap;max-width:100%;background:#fff;border:1px solid #e1e4e5;font-size:75%;padding:0 5px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;color:#e74c3c;overflow-x:auto}.rst-content tt.code-large,code.code-large{font-size:90%}.rst-content .section ul,.rst-content .toctree-wrapper ul,.rst-content section ul,.wy-plain-list-disc,article ul{list-style:disc;line-height:24px;margin-bottom:24px}.rst-content .section ul li,.rst-content .toctree-wrapper ul li,.rst-content section ul li,.wy-plain-list-disc li,article ul li{list-style:disc;margin-left:24px}.rst-content .section ul li p:last-child,.rst-content .section ul li ul,.rst-content .toctree-wrapper ul li p:last-child,.rst-content .toctree-wrapper ul li ul,.rst-content section ul li p:last-child,.rst-content section ul li ul,.wy-plain-list-disc li p:last-child,.wy-plain-list-disc li ul,article ul li p:last-child,article ul li ul{margin-bottom:0}.rst-content .section ul li li,.rst-content .toctree-wrapper ul li li,.rst-content section ul li li,.wy-plain-list-disc li li,article ul li li{list-style:circle}.rst-content .section ul li li li,.rst-content .toctree-wrapper ul li li li,.rst-content section ul li li li,.wy-plain-list-disc li li li,article ul li li li{list-style:square}.rst-content .section ul li ol li,.rst-content .toctree-wrapper ul li ol li,.rst-content section ul li ol li,.wy-plain-list-disc li ol li,article ul li ol li{list-style:decimal}.rst-content .section ol,.rst-content .section ol.arabic,.rst-content .toctree-wrapper ol,.rst-content .toctree-wrapper ol.arabic,.rst-content section ol,.rst-content section ol.arabic,.wy-plain-list-decimal,article ol{list-style:decimal;line-height:24px;margin-bottom:24px}.rst-content .section ol.arabic li,.rst-content .section ol li,.rst-content .toctree-wrapper ol.arabic li,.rst-content .toctree-wrapper ol li,.rst-content section ol.arabic li,.rst-content section ol li,.wy-plain-list-decimal li,article ol 
li{list-style:decimal;margin-left:24px}.rst-content .section ol.arabic li ul,.rst-content .section ol li p:last-child,.rst-content .section ol li ul,.rst-content .toctree-wrapper ol.arabic li ul,.rst-content .toctree-wrapper ol li p:last-child,.rst-content .toctree-wrapper ol li ul,.rst-content section ol.arabic li ul,.rst-content section ol li p:last-child,.rst-content section ol li ul,.wy-plain-list-decimal li p:last-child,.wy-plain-list-decimal li ul,article ol li p:last-child,article ol li ul{margin-bottom:0}.rst-content .section ol.arabic li ul li,.rst-content .section ol li ul li,.rst-content .toctree-wrapper ol.arabic li ul li,.rst-content .toctree-wrapper ol li ul li,.rst-content section ol.arabic li ul li,.rst-content section ol li ul li,.wy-plain-list-decimal li ul li,article ol li ul li{list-style:disc}.wy-breadcrumbs{*zoom:1}.wy-breadcrumbs:after,.wy-breadcrumbs:before{display:table;content:""}.wy-breadcrumbs:after{clear:both}.wy-breadcrumbs>li{display:inline-block;padding-top:5px}.wy-breadcrumbs>li.wy-breadcrumbs-aside{float:right}.rst-content .wy-breadcrumbs>li code,.rst-content .wy-breadcrumbs>li tt,.wy-breadcrumbs>li .rst-content tt,.wy-breadcrumbs>li code{all:inherit;color:inherit}.breadcrumb-item:before{content:"/";color:#bbb;font-size:13px;padding:0 6px 0 3px}.wy-breadcrumbs-extra{margin-bottom:0;color:#b3b3b3;font-size:80%;display:inline-block}@media screen and (max-width:480px){.wy-breadcrumbs-extra,.wy-breadcrumbs li.wy-breadcrumbs-aside{display:none}}@media print{.wy-breadcrumbs li.wy-breadcrumbs-aside{display:none}}html{font-size:16px}.wy-affix{position:fixed;top:1.618em}.wy-menu a:hover{text-decoration:none}.wy-menu-horiz{*zoom:1}.wy-menu-horiz:after,.wy-menu-horiz:before{display:table;content:""}.wy-menu-horiz:after{clear:both}.wy-menu-horiz li,.wy-menu-horiz ul{display:inline-block}.wy-menu-horiz li:hover{background:hsla(0,0%,100%,.1)}.wy-menu-horiz li.divide-left{border-left:1px solid #404040}.wy-menu-horiz li.divide-right{border-right:1px solid #404040}.wy-menu-horiz a{height:32px;display:inline-block;line-height:32px;padding:0 16px}.wy-menu-vertical{width:300px}.wy-menu-vertical header,.wy-menu-vertical p.caption{color:#55a5d9;height:32px;line-height:32px;padding:0 1.618em;margin:12px 0 0;display:block;font-weight:700;text-transform:uppercase;font-size:85%;white-space:nowrap}.wy-menu-vertical ul{margin-bottom:0}.wy-menu-vertical li.divide-top{border-top:1px solid #404040}.wy-menu-vertical li.divide-bottom{border-bottom:1px solid #404040}.wy-menu-vertical li.current{background:#e3e3e3}.wy-menu-vertical li.current a{color:grey;border-right:1px solid #c9c9c9;padding:.4045em 2.427em}.wy-menu-vertical li.current a:hover{background:#d6d6d6}.rst-content .wy-menu-vertical li tt,.wy-menu-vertical li .rst-content tt,.wy-menu-vertical li code{border:none;background:inherit;color:inherit;padding-left:0;padding-right:0}.wy-menu-vertical li button.toctree-expand{display:block;float:left;margin-left:-1.2em;line-height:18px;color:#4d4d4d;border:none;background:none;padding:0}.wy-menu-vertical li.current>a,.wy-menu-vertical li.on a{color:#404040;font-weight:700;position:relative;background:#fcfcfc;border:none;padding:.4045em 1.618em}.wy-menu-vertical li.current>a:hover,.wy-menu-vertical li.on a:hover{background:#fcfcfc}.wy-menu-vertical li.current>a:hover button.toctree-expand,.wy-menu-vertical li.on a:hover button.toctree-expand{color:grey}.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a 
button.toctree-expand{display:block;line-height:18px;color:#333}.wy-menu-vertical li.toctree-l1.current>a{border-bottom:1px solid #c9c9c9;border-top:1px solid #c9c9c9}.wy-menu-vertical .toctree-l1.current .toctree-l2>ul,.wy-menu-vertical .toctree-l2.current .toctree-l3>ul,.wy-menu-vertical .toctree-l3.current .toctree-l4>ul,.wy-menu-vertical .toctree-l4.current .toctree-l5>ul,.wy-menu-vertical .toctree-l5.current .toctree-l6>ul,.wy-menu-vertical .toctree-l6.current .toctree-l7>ul,.wy-menu-vertical .toctree-l7.current .toctree-l8>ul,.wy-menu-vertical .toctree-l8.current .toctree-l9>ul,.wy-menu-vertical .toctree-l9.current .toctree-l10>ul,.wy-menu-vertical .toctree-l10.current .toctree-l11>ul{display:none}.wy-menu-vertical .toctree-l1.current .current.toctree-l2>ul,.wy-menu-vertical .toctree-l2.current .current.toctree-l3>ul,.wy-menu-vertical .toctree-l3.current .current.toctree-l4>ul,.wy-menu-vertical .toctree-l4.current .current.toctree-l5>ul,.wy-menu-vertical .toctree-l5.current .current.toctree-l6>ul,.wy-menu-vertical .toctree-l6.current .current.toctree-l7>ul,.wy-menu-vertical .toctree-l7.current .current.toctree-l8>ul,.wy-menu-vertical .toctree-l8.current .current.toctree-l9>ul,.wy-menu-vertical .toctree-l9.current .current.toctree-l10>ul,.wy-menu-vertical .toctree-l10.current .current.toctree-l11>ul{display:block}.wy-menu-vertical li.toctree-l3,.wy-menu-vertical li.toctree-l4{font-size:.9em}.wy-menu-vertical li.toctree-l2 a,.wy-menu-vertical li.toctree-l3 a,.wy-menu-vertical li.toctree-l4 a,.wy-menu-vertical li.toctree-l5 a,.wy-menu-vertical li.toctree-l6 a,.wy-menu-vertical li.toctree-l7 a,.wy-menu-vertical li.toctree-l8 a,.wy-menu-vertical li.toctree-l9 a,.wy-menu-vertical li.toctree-l10 a{color:#404040}.wy-menu-vertical li.toctree-l2 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l3 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l4 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l5 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l6 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l7 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l8 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l9 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l10 a:hover button.toctree-expand{color:grey}.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a,.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a,.wy-menu-vertical li.toctree-l4.current li.toctree-l5>a,.wy-menu-vertical li.toctree-l5.current li.toctree-l6>a,.wy-menu-vertical li.toctree-l6.current li.toctree-l7>a,.wy-menu-vertical li.toctree-l7.current li.toctree-l8>a,.wy-menu-vertical li.toctree-l8.current li.toctree-l9>a,.wy-menu-vertical li.toctree-l9.current li.toctree-l10>a,.wy-menu-vertical li.toctree-l10.current li.toctree-l11>a{display:block}.wy-menu-vertical li.toctree-l2.current>a{padding:.4045em 2.427em}.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a{padding:.4045em 1.618em .4045em 4.045em}.wy-menu-vertical li.toctree-l3.current>a{padding:.4045em 4.045em}.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a{padding:.4045em 1.618em .4045em 5.663em}.wy-menu-vertical li.toctree-l4.current>a{padding:.4045em 5.663em}.wy-menu-vertical li.toctree-l4.current li.toctree-l5>a{padding:.4045em 1.618em .4045em 7.281em}.wy-menu-vertical li.toctree-l5.current>a{padding:.4045em 7.281em}.wy-menu-vertical li.toctree-l5.current li.toctree-l6>a{padding:.4045em 1.618em .4045em 8.899em}.wy-menu-vertical li.toctree-l6.current>a{padding:.4045em 
8.899em}.wy-menu-vertical li.toctree-l6.current li.toctree-l7>a{padding:.4045em 1.618em .4045em 10.517em}.wy-menu-vertical li.toctree-l7.current>a{padding:.4045em 10.517em}.wy-menu-vertical li.toctree-l7.current li.toctree-l8>a{padding:.4045em 1.618em .4045em 12.135em}.wy-menu-vertical li.toctree-l8.current>a{padding:.4045em 12.135em}.wy-menu-vertical li.toctree-l8.current li.toctree-l9>a{padding:.4045em 1.618em .4045em 13.753em}.wy-menu-vertical li.toctree-l9.current>a{padding:.4045em 13.753em}.wy-menu-vertical li.toctree-l9.current li.toctree-l10>a{padding:.4045em 1.618em .4045em 15.371em}.wy-menu-vertical li.toctree-l10.current>a{padding:.4045em 15.371em}.wy-menu-vertical li.toctree-l10.current li.toctree-l11>a{padding:.4045em 1.618em .4045em 16.989em}.wy-menu-vertical li.toctree-l2.current>a,.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a{background:#c9c9c9}.wy-menu-vertical li.toctree-l2 button.toctree-expand{color:#a3a3a3}.wy-menu-vertical li.toctree-l3.current>a,.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a{background:#bdbdbd}.wy-menu-vertical li.toctree-l3 button.toctree-expand{color:#969696}.wy-menu-vertical li.current ul{display:block}.wy-menu-vertical li ul{margin-bottom:0;display:none}.wy-menu-vertical li ul li a{margin-bottom:0;color:#d9d9d9;font-weight:400}.wy-menu-vertical a{line-height:18px;padding:.4045em 1.618em;display:block;position:relative;font-size:90%;color:#d9d9d9}.wy-menu-vertical a:hover{background-color:#4e4a4a;cursor:pointer}.wy-menu-vertical a:hover button.toctree-expand{color:#d9d9d9}.wy-menu-vertical a:active{background-color:#2980b9;cursor:pointer;color:#fff}.wy-menu-vertical a:active button.toctree-expand{color:#fff}.wy-side-nav-search{display:block;width:300px;padding:.809em;margin-bottom:.809em;z-index:200;background-color:#2980b9;text-align:center;color:#fcfcfc}.wy-side-nav-search input[type=text]{width:100%;border-radius:50px;padding:6px 12px;border-color:#2472a4}.wy-side-nav-search img{display:block;margin:auto auto .809em;height:45px;width:45px;background-color:#2980b9;padding:5px;border-radius:100%}.wy-side-nav-search .wy-dropdown>a,.wy-side-nav-search>a{color:#fcfcfc;font-size:100%;font-weight:700;display:inline-block;padding:4px 6px;margin-bottom:.809em;max-width:100%}.wy-side-nav-search .wy-dropdown>a:hover,.wy-side-nav-search>a:hover{background:hsla(0,0%,100%,.1)}.wy-side-nav-search .wy-dropdown>a img.logo,.wy-side-nav-search>a img.logo{display:block;margin:0 auto;height:auto;width:auto;border-radius:0;max-width:100%;background:transparent}.wy-side-nav-search .wy-dropdown>a.icon img.logo,.wy-side-nav-search>a.icon img.logo{margin-top:.85em}.wy-side-nav-search>div.version{margin-top:-.4045em;margin-bottom:.809em;font-weight:400;color:hsla(0,0%,100%,.3)}.wy-nav .wy-menu-vertical header{color:#2980b9}.wy-nav .wy-menu-vertical a{color:#b3b3b3}.wy-nav .wy-menu-vertical a:hover{background-color:#2980b9;color:#fff}[data-menu-wrap]{-webkit-transition:all .2s ease-in;-moz-transition:all .2s ease-in;transition:all .2s 
ease-in;position:absolute;opacity:1;width:100%;opacity:0}[data-menu-wrap].move-center{left:0;right:auto;opacity:1}[data-menu-wrap].move-left{right:auto;left:-100%;opacity:0}[data-menu-wrap].move-right{right:-100%;left:auto;opacity:0}.wy-body-for-nav{background:#fcfcfc}.wy-grid-for-nav{position:absolute;width:100%;height:100%}.wy-nav-side{position:fixed;top:0;bottom:0;left:0;padding-bottom:2em;width:300px;overflow-x:hidden;overflow-y:hidden;min-height:100%;color:#9b9b9b;background:#343131;z-index:200}.wy-side-scroll{width:320px;position:relative;overflow-x:hidden;overflow-y:scroll;height:100%}.wy-nav-top{display:none;background:#2980b9;color:#fff;padding:.4045em .809em;position:relative;line-height:50px;text-align:center;font-size:100%;*zoom:1}.wy-nav-top:after,.wy-nav-top:before{display:table;content:""}.wy-nav-top:after{clear:both}.wy-nav-top a{color:#fff;font-weight:700}.wy-nav-top img{margin-right:12px;height:45px;width:45px;background-color:#2980b9;padding:5px;border-radius:100%}.wy-nav-top i{font-size:30px;float:left;cursor:pointer;padding-top:inherit}.wy-nav-content-wrap{margin-left:300px;background:#fcfcfc;min-height:100%}.wy-nav-content{padding:1.618em 3.236em;height:100%;max-width:800px;margin:auto}.wy-body-mask{position:fixed;width:100%;height:100%;background:rgba(0,0,0,.2);display:none;z-index:499}.wy-body-mask.on{display:block}footer{color:grey}footer p{margin-bottom:12px}.rst-content footer span.commit tt,footer span.commit .rst-content tt,footer span.commit code{padding:0;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;font-size:1em;background:none;border:none;color:grey}.rst-footer-buttons{*zoom:1}.rst-footer-buttons:after,.rst-footer-buttons:before{width:100%;display:table;content:""}.rst-footer-buttons:after{clear:both}.rst-breadcrumbs-buttons{margin-top:12px;*zoom:1}.rst-breadcrumbs-buttons:after,.rst-breadcrumbs-buttons:before{display:table;content:""}.rst-breadcrumbs-buttons:after{clear:both}#search-results .search li{margin-bottom:24px;border-bottom:1px solid #e1e4e5;padding-bottom:24px}#search-results .search li:first-child{border-top:1px solid #e1e4e5;padding-top:24px}#search-results .search li a{font-size:120%;margin-bottom:12px;display:inline-block}#search-results .context{color:grey;font-size:90%}.genindextable li>ul{margin-left:24px}@media screen and (max-width:768px){.wy-body-for-nav{background:#fcfcfc}.wy-nav-top{display:block}.wy-nav-side{left:-300px}.wy-nav-side.shift{width:85%;left:0}.wy-menu.wy-menu-vertical,.wy-side-nav-search,.wy-side-scroll{width:auto}.wy-nav-content-wrap{margin-left:0}.wy-nav-content-wrap .wy-nav-content{padding:1.618em}.wy-nav-content-wrap.shift{position:fixed;min-width:100%;left:85%;top:0;height:100%;overflow:hidden}}@media screen and (min-width:1100px){.wy-nav-content-wrap{background:rgba(0,0,0,.05)}.wy-nav-content{margin:0;background:#fcfcfc}}@media print{.rst-versions,.wy-nav-side,footer{display:none}.wy-nav-content-wrap{margin-left:0}}.rst-versions{position:fixed;bottom:0;left:0;width:300px;color:#fcfcfc;background:#1f1d1d;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;z-index:400}.rst-versions a{color:#2980b9;text-decoration:none}.rst-versions .rst-badge-small{display:none}.rst-versions .rst-current-version{padding:12px;background-color:#272525;display:block;text-align:right;font-size:90%;cursor:pointer;color:#27ae60;*zoom:1}.rst-versions .rst-current-version:after,.rst-versions .rst-current-version:before{display:table;content:""}.rst-versions 
.rst-current-version:after{clear:both}.rst-content .code-block-caption .rst-versions .rst-current-version .headerlink,.rst-content .eqno .rst-versions .rst-current-version .headerlink,.rst-content .rst-versions .rst-current-version .admonition-title,.rst-content code.download .rst-versions .rst-current-version span:first-child,.rst-content dl dt .rst-versions .rst-current-version .headerlink,.rst-content h1 .rst-versions .rst-current-version .headerlink,.rst-content h2 .rst-versions .rst-current-version .headerlink,.rst-content h3 .rst-versions .rst-current-version .headerlink,.rst-content h4 .rst-versions .rst-current-version .headerlink,.rst-content h5 .rst-versions .rst-current-version .headerlink,.rst-content h6 .rst-versions .rst-current-version .headerlink,.rst-content p .rst-versions .rst-current-version .headerlink,.rst-content table>caption .rst-versions .rst-current-version .headerlink,.rst-content tt.download .rst-versions .rst-current-version span:first-child,.rst-versions .rst-current-version .fa,.rst-versions .rst-current-version .icon,.rst-versions .rst-current-version .rst-content .admonition-title,.rst-versions .rst-current-version .rst-content .code-block-caption .headerlink,.rst-versions .rst-current-version .rst-content .eqno .headerlink,.rst-versions .rst-current-version .rst-content code.download span:first-child,.rst-versions .rst-current-version .rst-content dl dt .headerlink,.rst-versions .rst-current-version .rst-content h1 .headerlink,.rst-versions .rst-current-version .rst-content h2 .headerlink,.rst-versions .rst-current-version .rst-content h3 .headerlink,.rst-versions .rst-current-version .rst-content h4 .headerlink,.rst-versions .rst-current-version .rst-content h5 .headerlink,.rst-versions .rst-current-version .rst-content h6 .headerlink,.rst-versions .rst-current-version .rst-content p .headerlink,.rst-versions .rst-current-version .rst-content table>caption .headerlink,.rst-versions .rst-current-version .rst-content tt.download span:first-child,.rst-versions .rst-current-version .wy-menu-vertical li button.toctree-expand,.wy-menu-vertical li .rst-versions .rst-current-version button.toctree-expand{color:#fcfcfc}.rst-versions .rst-current-version .fa-book,.rst-versions .rst-current-version .icon-book{float:left}.rst-versions .rst-current-version.rst-out-of-date{background-color:#e74c3c;color:#fff}.rst-versions .rst-current-version.rst-active-old-version{background-color:#f1c40f;color:#000}.rst-versions.shift-up{height:auto;max-height:100%;overflow-y:scroll}.rst-versions.shift-up .rst-other-versions{display:block}.rst-versions .rst-other-versions{font-size:90%;padding:12px;color:grey;display:none}.rst-versions .rst-other-versions hr{display:block;height:1px;border:0;margin:20px 0;padding:0;border-top:1px solid #413d3d}.rst-versions .rst-other-versions dd{display:inline-block;margin:0}.rst-versions .rst-other-versions dd a{display:inline-block;padding:6px;color:#fcfcfc}.rst-versions.rst-badge{width:auto;bottom:20px;right:20px;left:auto;border:none;max-width:300px;max-height:90%}.rst-versions.rst-badge .fa-book,.rst-versions.rst-badge .icon-book{float:none;line-height:30px}.rst-versions.rst-badge.shift-up .rst-current-version{text-align:right}.rst-versions.rst-badge.shift-up .rst-current-version .fa-book,.rst-versions.rst-badge.shift-up .rst-current-version .icon-book{float:left}.rst-versions.rst-badge>.rst-current-version{width:auto;height:30px;line-height:30px;padding:0 6px;display:block;text-align:center}@media screen and 
(max-width:768px){.rst-versions{width:85%;display:none}.rst-versions.shift{display:block}}.rst-content .toctree-wrapper>p.caption,.rst-content h1,.rst-content h2,.rst-content h3,.rst-content h4,.rst-content h5,.rst-content h6{margin-bottom:24px}.rst-content img{max-width:100%;height:auto}.rst-content div.figure,.rst-content figure{margin-bottom:24px}.rst-content div.figure .caption-text,.rst-content figure .caption-text{font-style:italic}.rst-content div.figure p:last-child.caption,.rst-content figure p:last-child.caption{margin-bottom:0}.rst-content div.figure.align-center,.rst-content figure.align-center{text-align:center}.rst-content .section>a>img,.rst-content .section>img,.rst-content section>a>img,.rst-content section>img{margin-bottom:24px}.rst-content abbr[title]{text-decoration:none}.rst-content.style-external-links a.reference.external:after{font-family:FontAwesome;content:"\f08e";color:#b3b3b3;vertical-align:super;font-size:60%;margin:0 .2em}.rst-content blockquote{margin-left:24px;line-height:24px;margin-bottom:24px}.rst-content pre.literal-block{white-space:pre;margin:0;padding:12px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;display:block;overflow:auto}.rst-content div[class^=highlight],.rst-content pre.literal-block{border:1px solid #e1e4e5;overflow-x:auto;margin:1px 0 24px}.rst-content div[class^=highlight] div[class^=highlight],.rst-content pre.literal-block div[class^=highlight]{padding:0;border:none;margin:0}.rst-content div[class^=highlight] td.code{width:100%}.rst-content .linenodiv pre{border-right:1px solid #e6e9ea;margin:0;padding:12px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;user-select:none;pointer-events:none}.rst-content div[class^=highlight] pre{white-space:pre;margin:0;padding:12px;display:block;overflow:auto}.rst-content div[class^=highlight] pre .hll{display:block;margin:0 -12px;padding:0 12px}.rst-content .linenodiv pre,.rst-content div[class^=highlight] pre,.rst-content pre.literal-block{font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;font-size:12px;line-height:1.4}.rst-content div.highlight .gp,.rst-content div.highlight span.linenos{user-select:none;pointer-events:none}.rst-content div.highlight span.linenos{display:inline-block;padding-left:0;padding-right:12px;margin-right:12px;border-right:1px solid #e6e9ea}.rst-content .code-block-caption{font-style:italic;font-size:85%;line-height:1;padding:1em 0;text-align:center}@media print{.rst-content .codeblock,.rst-content div[class^=highlight],.rst-content div[class^=highlight] pre{white-space:pre-wrap}}.rst-content .admonition,.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .danger,.rst-content .error,.rst-content .hint,.rst-content .important,.rst-content .note,.rst-content .seealso,.rst-content .tip,.rst-content .warning{clear:both}.rst-content .admonition-todo .last,.rst-content .admonition-todo>:last-child,.rst-content .admonition .last,.rst-content .admonition>:last-child,.rst-content .attention .last,.rst-content .attention>:last-child,.rst-content .caution .last,.rst-content .caution>:last-child,.rst-content .danger .last,.rst-content .danger>:last-child,.rst-content .error .last,.rst-content .error>:last-child,.rst-content .hint .last,.rst-content .hint>:last-child,.rst-content .important .last,.rst-content .important>:last-child,.rst-content .note .last,.rst-content .note>:last-child,.rst-content .seealso 
.last,.rst-content .seealso>:last-child,.rst-content .tip .last,.rst-content .tip>:last-child,.rst-content .warning .last,.rst-content .warning>:last-child{margin-bottom:0}.rst-content .admonition-title:before{margin-right:4px}.rst-content .admonition table{border-color:rgba(0,0,0,.1)}.rst-content .admonition table td,.rst-content .admonition table th{background:transparent!important;border-color:rgba(0,0,0,.1)!important}.rst-content .section ol.loweralpha,.rst-content .section ol.loweralpha>li,.rst-content .toctree-wrapper ol.loweralpha,.rst-content .toctree-wrapper ol.loweralpha>li,.rst-content section ol.loweralpha,.rst-content section ol.loweralpha>li{list-style:lower-alpha}.rst-content .section ol.upperalpha,.rst-content .section ol.upperalpha>li,.rst-content .toctree-wrapper ol.upperalpha,.rst-content .toctree-wrapper ol.upperalpha>li,.rst-content section ol.upperalpha,.rst-content section ol.upperalpha>li{list-style:upper-alpha}.rst-content .section ol li>*,.rst-content .section ul li>*,.rst-content .toctree-wrapper ol li>*,.rst-content .toctree-wrapper ul li>*,.rst-content section ol li>*,.rst-content section ul li>*{margin-top:12px;margin-bottom:12px}.rst-content .section ol li>:first-child,.rst-content .section ul li>:first-child,.rst-content .toctree-wrapper ol li>:first-child,.rst-content .toctree-wrapper ul li>:first-child,.rst-content section ol li>:first-child,.rst-content section ul li>:first-child{margin-top:0}.rst-content .section ol li>p,.rst-content .section ol li>p:last-child,.rst-content .section ul li>p,.rst-content .section ul li>p:last-child,.rst-content .toctree-wrapper ol li>p,.rst-content .toctree-wrapper ol li>p:last-child,.rst-content .toctree-wrapper ul li>p,.rst-content .toctree-wrapper ul li>p:last-child,.rst-content section ol li>p,.rst-content section ol li>p:last-child,.rst-content section ul li>p,.rst-content section ul li>p:last-child{margin-bottom:12px}.rst-content .section ol li>p:only-child,.rst-content .section ol li>p:only-child:last-child,.rst-content .section ul li>p:only-child,.rst-content .section ul li>p:only-child:last-child,.rst-content .toctree-wrapper ol li>p:only-child,.rst-content .toctree-wrapper ol li>p:only-child:last-child,.rst-content .toctree-wrapper ul li>p:only-child,.rst-content .toctree-wrapper ul li>p:only-child:last-child,.rst-content section ol li>p:only-child,.rst-content section ol li>p:only-child:last-child,.rst-content section ul li>p:only-child,.rst-content section ul li>p:only-child:last-child{margin-bottom:0}.rst-content .section ol li>ol,.rst-content .section ol li>ul,.rst-content .section ul li>ol,.rst-content .section ul li>ul,.rst-content .toctree-wrapper ol li>ol,.rst-content .toctree-wrapper ol li>ul,.rst-content .toctree-wrapper ul li>ol,.rst-content .toctree-wrapper ul li>ul,.rst-content section ol li>ol,.rst-content section ol li>ul,.rst-content section ul li>ol,.rst-content section ul li>ul{margin-bottom:12px}.rst-content .section ol.simple li>*,.rst-content .section ol.simple li ol,.rst-content .section ol.simple li ul,.rst-content .section ul.simple li>*,.rst-content .section ul.simple li ol,.rst-content .section ul.simple li ul,.rst-content .toctree-wrapper ol.simple li>*,.rst-content .toctree-wrapper ol.simple li ol,.rst-content .toctree-wrapper ol.simple li ul,.rst-content .toctree-wrapper ul.simple li>*,.rst-content .toctree-wrapper ul.simple li ol,.rst-content .toctree-wrapper ul.simple li ul,.rst-content section ol.simple li>*,.rst-content section ol.simple li ol,.rst-content section ol.simple li 
ul,.rst-content section ul.simple li>*,.rst-content section ul.simple li ol,.rst-content section ul.simple li ul{margin-top:0;margin-bottom:0}.rst-content .line-block{margin-left:0;margin-bottom:24px;line-height:24px}.rst-content .line-block .line-block{margin-left:24px;margin-bottom:0}.rst-content .topic-title{font-weight:700;margin-bottom:12px}.rst-content .toc-backref{color:#404040}.rst-content .align-right{float:right;margin:0 0 24px 24px}.rst-content .align-left{float:left;margin:0 24px 24px 0}.rst-content .align-center{margin:auto}.rst-content .align-center:not(table){display:block}.rst-content .code-block-caption .headerlink,.rst-content .eqno .headerlink,.rst-content .toctree-wrapper>p.caption .headerlink,.rst-content dl dt .headerlink,.rst-content h1 .headerlink,.rst-content h2 .headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 .headerlink,.rst-content p.caption .headerlink,.rst-content p .headerlink,.rst-content table>caption .headerlink{opacity:0;font-size:14px;font-family:FontAwesome;margin-left:.5em}.rst-content .code-block-caption .headerlink:focus,.rst-content .code-block-caption:hover .headerlink,.rst-content .eqno .headerlink:focus,.rst-content .eqno:hover .headerlink,.rst-content .toctree-wrapper>p.caption .headerlink:focus,.rst-content .toctree-wrapper>p.caption:hover .headerlink,.rst-content dl dt .headerlink:focus,.rst-content dl dt:hover .headerlink,.rst-content h1 .headerlink:focus,.rst-content h1:hover .headerlink,.rst-content h2 .headerlink:focus,.rst-content h2:hover .headerlink,.rst-content h3 .headerlink:focus,.rst-content h3:hover .headerlink,.rst-content h4 .headerlink:focus,.rst-content h4:hover .headerlink,.rst-content h5 .headerlink:focus,.rst-content h5:hover .headerlink,.rst-content h6 .headerlink:focus,.rst-content h6:hover .headerlink,.rst-content p.caption .headerlink:focus,.rst-content p.caption:hover .headerlink,.rst-content p .headerlink:focus,.rst-content p:hover .headerlink,.rst-content table>caption .headerlink:focus,.rst-content table>caption:hover .headerlink{opacity:1}.rst-content p a{overflow-wrap:anywhere}.rst-content .wy-table td p,.rst-content .wy-table td ul,.rst-content .wy-table th p,.rst-content .wy-table th ul,.rst-content table.docutils td p,.rst-content table.docutils td ul,.rst-content table.docutils th p,.rst-content table.docutils th ul,.rst-content table.field-list td p,.rst-content table.field-list td ul,.rst-content table.field-list th p,.rst-content table.field-list th ul{font-size:inherit}.rst-content .btn:focus{outline:2px solid}.rst-content table>caption .headerlink:after{font-size:12px}.rst-content .centered{text-align:center}.rst-content .sidebar{float:right;width:40%;display:block;margin:0 0 24px 24px;padding:24px;background:#f3f6f6;border:1px solid #e1e4e5}.rst-content .sidebar dl,.rst-content .sidebar p,.rst-content .sidebar ul{font-size:90%}.rst-content .sidebar .last,.rst-content .sidebar>:last-child{margin-bottom:0}.rst-content .sidebar .sidebar-title{display:block;font-family:Roboto Slab,ff-tisa-web-pro,Georgia,Arial,sans-serif;font-weight:700;background:#e1e4e5;padding:6px 12px;margin:-24px -24px 24px;font-size:100%}.rst-content .highlighted{background:#f1c40f;box-shadow:0 0 0 2px #f1c40f;display:inline;font-weight:700}.rst-content .citation-reference,.rst-content .footnote-reference{vertical-align:baseline;position:relative;top:-.4em;line-height:0;font-size:90%}.rst-content .citation-reference>span.fn-bracket,.rst-content 
.footnote-reference>span.fn-bracket{display:none}.rst-content .hlist{width:100%}.rst-content dl dt span.classifier:before{content:" : "}.rst-content dl dt span.classifier-delimiter{display:none!important}html.writer-html4 .rst-content table.docutils.citation,html.writer-html4 .rst-content table.docutils.footnote{background:none;border:none}html.writer-html4 .rst-content table.docutils.citation td,html.writer-html4 .rst-content table.docutils.citation tr,html.writer-html4 .rst-content table.docutils.footnote td,html.writer-html4 .rst-content table.docutils.footnote tr{border:none;background-color:transparent!important;white-space:normal}html.writer-html4 .rst-content table.docutils.citation td.label,html.writer-html4 .rst-content table.docutils.footnote td.label{padding-left:0;padding-right:0;vertical-align:top}html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.field-list,html.writer-html5 .rst-content dl.footnote{display:grid;grid-template-columns:auto minmax(80%,95%)}html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.field-list>dt,html.writer-html5 .rst-content dl.footnote>dt{display:inline-grid;grid-template-columns:max-content auto}html.writer-html5 .rst-content aside.citation,html.writer-html5 .rst-content aside.footnote,html.writer-html5 .rst-content div.citation{display:grid;grid-template-columns:auto auto minmax(.65rem,auto) minmax(40%,95%)}html.writer-html5 .rst-content aside.citation>span.label,html.writer-html5 .rst-content aside.footnote>span.label,html.writer-html5 .rst-content div.citation>span.label{grid-column-start:1;grid-column-end:2}html.writer-html5 .rst-content aside.citation>span.backrefs,html.writer-html5 .rst-content aside.footnote>span.backrefs,html.writer-html5 .rst-content div.citation>span.backrefs{grid-column-start:2;grid-column-end:3;grid-row-start:1;grid-row-end:3}html.writer-html5 .rst-content aside.citation>p,html.writer-html5 .rst-content aside.footnote>p,html.writer-html5 .rst-content div.citation>p{grid-column-start:4;grid-column-end:5}html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.field-list,html.writer-html5 .rst-content dl.footnote{margin-bottom:24px}html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.field-list>dt,html.writer-html5 .rst-content dl.footnote>dt{padding-left:1rem}html.writer-html5 .rst-content dl.citation>dd,html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.field-list>dd,html.writer-html5 .rst-content dl.field-list>dt,html.writer-html5 .rst-content dl.footnote>dd,html.writer-html5 .rst-content dl.footnote>dt{margin-bottom:0}html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.footnote{font-size:.9rem}html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.footnote>dt{margin:0 .5rem .5rem 0;line-height:1.2rem;word-break:break-all;font-weight:400}html.writer-html5 .rst-content dl.citation>dt>span.brackets:before,html.writer-html5 .rst-content dl.footnote>dt>span.brackets:before{content:"["}html.writer-html5 .rst-content dl.citation>dt>span.brackets:after,html.writer-html5 .rst-content dl.footnote>dt>span.brackets:after{content:"]"}html.writer-html5 .rst-content dl.citation>dt>span.fn-backref,html.writer-html5 .rst-content dl.footnote>dt>span.fn-backref{text-align:left;font-style:italic;margin-left:.65rem;word-break:break-word;word-spacing:-.1rem;max-width:5rem}html.writer-html5 .rst-content dl.citation>dt>span.fn-backref>a,html.writer-html5 
.rst-content dl.footnote>dt>span.fn-backref>a{word-break:keep-all}html.writer-html5 .rst-content dl.citation>dt>span.fn-backref>a:not(:first-child):before,html.writer-html5 .rst-content dl.footnote>dt>span.fn-backref>a:not(:first-child):before{content:" "}html.writer-html5 .rst-content dl.citation>dd,html.writer-html5 .rst-content dl.footnote>dd{margin:0 0 .5rem;line-height:1.2rem}html.writer-html5 .rst-content dl.citation>dd p,html.writer-html5 .rst-content dl.footnote>dd p{font-size:.9rem}html.writer-html5 .rst-content aside.citation,html.writer-html5 .rst-content aside.footnote,html.writer-html5 .rst-content div.citation{padding-left:1rem;padding-right:1rem;font-size:.9rem;line-height:1.2rem}html.writer-html5 .rst-content aside.citation p,html.writer-html5 .rst-content aside.footnote p,html.writer-html5 .rst-content div.citation p{font-size:.9rem;line-height:1.2rem;margin-bottom:12px}html.writer-html5 .rst-content aside.citation span.backrefs,html.writer-html5 .rst-content aside.footnote span.backrefs,html.writer-html5 .rst-content div.citation span.backrefs{text-align:left;font-style:italic;margin-left:.65rem;word-break:break-word;word-spacing:-.1rem;max-width:5rem}html.writer-html5 .rst-content aside.citation span.backrefs>a,html.writer-html5 .rst-content aside.footnote span.backrefs>a,html.writer-html5 .rst-content div.citation span.backrefs>a{word-break:keep-all}html.writer-html5 .rst-content aside.citation span.backrefs>a:not(:first-child):before,html.writer-html5 .rst-content aside.footnote span.backrefs>a:not(:first-child):before,html.writer-html5 .rst-content div.citation span.backrefs>a:not(:first-child):before{content:" "}html.writer-html5 .rst-content aside.citation span.label,html.writer-html5 .rst-content aside.footnote span.label,html.writer-html5 .rst-content div.citation span.label{line-height:1.2rem}html.writer-html5 .rst-content aside.citation-list,html.writer-html5 .rst-content aside.footnote-list,html.writer-html5 .rst-content div.citation-list{margin-bottom:24px}html.writer-html5 .rst-content dl.option-list kbd{font-size:.9rem}.rst-content table.docutils.footnote,html.writer-html4 .rst-content table.docutils.citation,html.writer-html5 .rst-content aside.footnote,html.writer-html5 .rst-content aside.footnote-list aside.footnote,html.writer-html5 .rst-content div.citation-list>div.citation,html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.footnote{color:grey}.rst-content table.docutils.footnote code,.rst-content table.docutils.footnote tt,html.writer-html4 .rst-content table.docutils.citation code,html.writer-html4 .rst-content table.docutils.citation tt,html.writer-html5 .rst-content aside.footnote-list aside.footnote code,html.writer-html5 .rst-content aside.footnote-list aside.footnote tt,html.writer-html5 .rst-content aside.footnote code,html.writer-html5 .rst-content aside.footnote tt,html.writer-html5 .rst-content div.citation-list>div.citation code,html.writer-html5 .rst-content div.citation-list>div.citation tt,html.writer-html5 .rst-content dl.citation code,html.writer-html5 .rst-content dl.citation tt,html.writer-html5 .rst-content dl.footnote code,html.writer-html5 .rst-content dl.footnote tt{color:#555}.rst-content .wy-table-responsive.citation,.rst-content .wy-table-responsive.footnote{margin-bottom:0}.rst-content .wy-table-responsive.citation+:not(.citation),.rst-content .wy-table-responsive.footnote+:not(.footnote){margin-top:24px}.rst-content .wy-table-responsive.citation:last-child,.rst-content 
.wy-table-responsive.footnote:last-child{margin-bottom:24px}.rst-content table.docutils th{border-color:#e1e4e5}html.writer-html5 .rst-content table.docutils th{border:1px solid #e1e4e5}html.writer-html5 .rst-content table.docutils td>p,html.writer-html5 .rst-content table.docutils th>p{line-height:1rem;margin-bottom:0;font-size:.9rem}.rst-content table.docutils td .last,.rst-content table.docutils td .last>:last-child{margin-bottom:0}.rst-content table.field-list,.rst-content table.field-list td{border:none}.rst-content table.field-list td p{line-height:inherit}.rst-content table.field-list td>strong{display:inline-block}.rst-content table.field-list .field-name{padding-right:10px;text-align:left;white-space:nowrap}.rst-content table.field-list .field-body{text-align:left}.rst-content code,.rst-content tt{color:#000;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;padding:2px 5px}.rst-content code big,.rst-content code em,.rst-content tt big,.rst-content tt em{font-size:100%!important;line-height:normal}.rst-content code.literal,.rst-content tt.literal{color:#e74c3c;white-space:normal}.rst-content code.xref,.rst-content tt.xref,a .rst-content code,a .rst-content tt{font-weight:700;color:#404040;overflow-wrap:normal}.rst-content kbd,.rst-content pre,.rst-content samp{font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace}.rst-content a code,.rst-content a tt{color:#2980b9}.rst-content dl{margin-bottom:24px}.rst-content dl dt{font-weight:700;margin-bottom:12px}.rst-content dl ol,.rst-content dl p,.rst-content dl table,.rst-content dl ul{margin-bottom:12px}.rst-content dl dd{margin:0 0 12px 24px;line-height:24px}.rst-content dl dd>ol:last-child,.rst-content dl dd>p:last-child,.rst-content dl dd>table:last-child,.rst-content dl dd>ul:last-child{margin-bottom:0}html.writer-html4 .rst-content dl:not(.docutils),html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple){margin-bottom:24px}html.writer-html4 .rst-content dl:not(.docutils)>dt,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt{display:table;margin:6px 0;font-size:90%;line-height:normal;background:#e7f2fa;color:#2980b9;border-top:3px solid #6ab0de;padding:6px;position:relative}html.writer-html4 .rst-content dl:not(.docutils)>dt:before,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt:before{color:#6ab0de}html.writer-html4 .rst-content dl:not(.docutils)>dt .headerlink,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt .headerlink{color:#404040;font-size:100%!important}html.writer-html4 .rst-content dl:not(.docutils) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt{margin-bottom:6px;border:none;border-left:3px solid #ccc;background:#f0f0f0;color:#555}html.writer-html4 .rst-content dl:not(.docutils) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt .headerlink,html.writer-html5 .rst-content 
dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt .headerlink{color:#404040;font-size:100%!important}html.writer-html4 .rst-content dl:not(.docutils)>dt:first-child,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt:first-child{margin-top:0}html.writer-html4 .rst-content dl:not(.docutils) code.descclassname,html.writer-html4 .rst-content dl:not(.docutils) code.descname,html.writer-html4 .rst-content dl:not(.docutils) tt.descclassname,html.writer-html4 .rst-content dl:not(.docutils) tt.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) code.descclassname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) code.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) tt.descclassname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) tt.descname{background-color:transparent;border:none;padding:0;font-size:100%!important}html.writer-html4 .rst-content dl:not(.docutils) code.descname,html.writer-html4 .rst-content dl:not(.docutils) tt.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) code.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) tt.descname{font-weight:700}html.writer-html4 .rst-content dl:not(.docutils) .optional,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .optional{display:inline-block;padding:0 4px;color:#000;font-weight:700}html.writer-html4 .rst-content dl:not(.docutils) .property,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .property{display:inline-block;padding-right:8px;max-width:100%}html.writer-html4 .rst-content dl:not(.docutils) .k,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .k{font-style:italic}html.writer-html4 .rst-content dl:not(.docutils) .descclassname,html.writer-html4 .rst-content dl:not(.docutils) .descname,html.writer-html4 .rst-content dl:not(.docutils) .sig-name,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .descclassname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .sig-name{font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;color:#000}.rst-content .viewcode-back,.rst-content .viewcode-link{display:inline-block;color:#27ae60;font-size:80%;padding-left:24px}.rst-content .viewcode-back{display:block;float:right}.rst-content p.rubric{margin-bottom:12px;font-weight:700}.rst-content 
code.download,.rst-content tt.download{background:inherit;padding:inherit;font-weight:400;font-family:inherit;font-size:inherit;color:inherit;border:inherit;white-space:inherit}.rst-content code.download span:first-child,.rst-content tt.download span:first-child{-webkit-font-smoothing:subpixel-antialiased}.rst-content code.download span:first-child:before,.rst-content tt.download span:first-child:before{margin-right:4px}.rst-content .guilabel,.rst-content .menuselection{font-size:80%;font-weight:700;border-radius:4px;padding:2.4px 6px;margin:auto 2px}.rst-content .guilabel,.rst-content .menuselection{border:1px solid #7fbbe3;background:#e7f2fa}.rst-content :not(dl.option-list)>:not(dt):not(kbd):not(.kbd)>.kbd,.rst-content :not(dl.option-list)>:not(dt):not(kbd):not(.kbd)>kbd{color:inherit;font-size:80%;background-color:#fff;border:1px solid #a6a6a6;border-radius:4px;box-shadow:0 2px grey;padding:2.4px 6px;margin:auto 0}.rst-content .versionmodified{font-style:italic}@media screen and (max-width:480px){.rst-content .sidebar{width:100%}}span[id*=MathJax-Span]{color:#404040}.math{text-align:center}@font-face{font-family:Lato;src:url(fonts/lato-normal.woff2?bd03a2cc277bbbc338d464e679fe9942) format("woff2"),url(fonts/lato-normal.woff?27bd77b9162d388cb8d4c4217c7c5e2a) format("woff");font-weight:400;font-style:normal;font-display:block}@font-face{font-family:Lato;src:url(fonts/lato-bold.woff2?cccb897485813c7c256901dbca54ecf2) format("woff2"),url(fonts/lato-bold.woff?d878b6c29b10beca227e9eef4246111b) format("woff");font-weight:700;font-style:normal;font-display:block}@font-face{font-family:Lato;src:url(fonts/lato-bold-italic.woff2?0b6bb6725576b072c5d0b02ecdd1900d) format("woff2"),url(fonts/lato-bold-italic.woff?9c7e4e9eb485b4a121c760e61bc3707c) format("woff");font-weight:700;font-style:italic;font-display:block}@font-face{font-family:Lato;src:url(fonts/lato-normal-italic.woff2?4eb103b4d12be57cb1d040ed5e162e9d) format("woff2"),url(fonts/lato-normal-italic.woff?f28f2d6482446544ef1ea1ccc6dd5892) format("woff");font-weight:400;font-style:italic;font-display:block}@font-face{font-family:Roboto Slab;font-style:normal;font-weight:400;src:url(fonts/Roboto-Slab-Regular.woff2?7abf5b8d04d26a2cafea937019bca958) format("woff2"),url(fonts/Roboto-Slab-Regular.woff?c1be9284088d487c5e3ff0a10a92e58c) format("woff");font-display:block}@font-face{font-family:Roboto Slab;font-style:normal;font-weight:700;src:url(fonts/Roboto-Slab-Bold.woff2?9984f4a9bda09be08e83f2506954adbe) format("woff2"),url(fonts/Roboto-Slab-Bold.woff?bed5564a116b05148e3b3bea6fb1162a) format("woff");font-display:block} \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/_static/doctools.js b/releases/1.32.2/torch_v2/_static/doctools.js new file mode 100644 index 00000000..527b876c --- /dev/null +++ b/releases/1.32.2/torch_v2/_static/doctools.js @@ -0,0 +1,156 @@ +/* + * doctools.js + * ~~~~~~~~~~~ + * + * Base JavaScript utilities for all Sphinx HTML documentation. + * + * :copyright: Copyright 2007-2022 by the Sphinx team, see AUTHORS. + * :license: BSD, see LICENSE for details. + * + */ +"use strict"; + +const BLACKLISTED_KEY_CONTROL_ELEMENTS = new Set([ + "TEXTAREA", + "INPUT", + "SELECT", + "BUTTON", +]); + +const _ready = (callback) => { + if (document.readyState !== "loading") { + callback(); + } else { + document.addEventListener("DOMContentLoaded", callback); + } +}; + +/** + * Small JavaScript module for the documentation. 
+ */ +const Documentation = { + init: () => { + Documentation.initDomainIndexTable(); + Documentation.initOnKeyListeners(); + }, + + /** + * i18n support + */ + TRANSLATIONS: {}, + PLURAL_EXPR: (n) => (n === 1 ? 0 : 1), + LOCALE: "unknown", + + // gettext and ngettext don't access this so that the functions + // can safely bound to a different name (_ = Documentation.gettext) + gettext: (string) => { + const translated = Documentation.TRANSLATIONS[string]; + switch (typeof translated) { + case "undefined": + return string; // no translation + case "string": + return translated; // translation exists + default: + return translated[0]; // (singular, plural) translation tuple exists + } + }, + + ngettext: (singular, plural, n) => { + const translated = Documentation.TRANSLATIONS[singular]; + if (typeof translated !== "undefined") + return translated[Documentation.PLURAL_EXPR(n)]; + return n === 1 ? singular : plural; + }, + + addTranslations: (catalog) => { + Object.assign(Documentation.TRANSLATIONS, catalog.messages); + Documentation.PLURAL_EXPR = new Function( + "n", + `return (${catalog.plural_expr})` + ); + Documentation.LOCALE = catalog.locale; + }, + + /** + * helper function to focus on search bar + */ + focusSearchBar: () => { + document.querySelectorAll("input[name=q]")[0]?.focus(); + }, + + /** + * Initialise the domain index toggle buttons + */ + initDomainIndexTable: () => { + const toggler = (el) => { + const idNumber = el.id.substr(7); + const toggledRows = document.querySelectorAll(`tr.cg-${idNumber}`); + if (el.src.substr(-9) === "minus.png") { + el.src = `${el.src.substr(0, el.src.length - 9)}plus.png`; + toggledRows.forEach((el) => (el.style.display = "none")); + } else { + el.src = `${el.src.substr(0, el.src.length - 8)}minus.png`; + toggledRows.forEach((el) => (el.style.display = "")); + } + }; + + const togglerElements = document.querySelectorAll("img.toggler"); + togglerElements.forEach((el) => + el.addEventListener("click", (event) => toggler(event.currentTarget)) + ); + togglerElements.forEach((el) => (el.style.display = "")); + if (DOCUMENTATION_OPTIONS.COLLAPSE_INDEX) togglerElements.forEach(toggler); + }, + + initOnKeyListeners: () => { + // only install a listener if it is really needed + if ( + !DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS && + !DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS + ) + return; + + document.addEventListener("keydown", (event) => { + // bail for input elements + if (BLACKLISTED_KEY_CONTROL_ELEMENTS.has(document.activeElement.tagName)) return; + // bail with special keys + if (event.altKey || event.ctrlKey || event.metaKey) return; + + if (!event.shiftKey) { + switch (event.key) { + case "ArrowLeft": + if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) break; + + const prevLink = document.querySelector('link[rel="prev"]'); + if (prevLink && prevLink.href) { + window.location.href = prevLink.href; + event.preventDefault(); + } + break; + case "ArrowRight": + if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) break; + + const nextLink = document.querySelector('link[rel="next"]'); + if (nextLink && nextLink.href) { + window.location.href = nextLink.href; + event.preventDefault(); + } + break; + } + } + + // some keyboard layouts may need Shift to get / + switch (event.key) { + case "/": + if (!DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS) break; + Documentation.focusSearchBar(); + event.preventDefault(); + } + }); + }, +}; + +// quick alias for translations +const _ = Documentation.gettext; + +_ready(Documentation.init); diff --git 
a/releases/1.32.2/torch_v2/_static/documentation_options.js b/releases/1.32.2/torch_v2/_static/documentation_options.js new file mode 100644 index 00000000..b57ae3b8 --- /dev/null +++ b/releases/1.32.2/torch_v2/_static/documentation_options.js @@ -0,0 +1,14 @@ +var DOCUMENTATION_OPTIONS = { + URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'), + VERSION: '', + LANGUAGE: 'en', + COLLAPSE_INDEX: false, + BUILDER: 'html', + FILE_SUFFIX: '.html', + LINK_SUFFIX: '.html', + HAS_SOURCE: true, + SOURCELINK_SUFFIX: '.txt', + NAVIGATION_WITH_KEYS: false, + SHOW_SEARCH_SUMMARY: true, + ENABLE_SEARCH_SHORTCUTS: true, +}; \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/_static/file.png b/releases/1.32.2/torch_v2/_static/file.png new file mode 100644 index 00000000..a858a410 Binary files /dev/null and b/releases/1.32.2/torch_v2/_static/file.png differ diff --git a/releases/1.32.2/torch_v2/_static/jquery-3.6.0.js b/releases/1.32.2/torch_v2/_static/jquery-3.6.0.js new file mode 100644 index 00000000..fc6c299b --- /dev/null +++ b/releases/1.32.2/torch_v2/_static/jquery-3.6.0.js @@ -0,0 +1,10881 @@ +/*! + * jQuery JavaScript Library v3.6.0 + * https://jquery.com/ + * + * Includes Sizzle.js + * https://sizzlejs.com/ + * + * Copyright OpenJS Foundation and other contributors + * Released under the MIT license + * https://jquery.org/license + * + * Date: 2021-03-02T17:08Z + */ +( function( global, factory ) { + + "use strict"; + + if ( typeof module === "object" && typeof module.exports === "object" ) { + + // For CommonJS and CommonJS-like environments where a proper `window` + // is present, execute the factory and get jQuery. + // For environments that do not have a `window` with a `document` + // (such as Node.js), expose a factory as module.exports. + // This accentuates the need for the creation of a real `window`. + // e.g. var jQuery = require("jquery")(window); + // See ticket #14549 for more info. + module.exports = global.document ? + factory( global, true ) : + function( w ) { + if ( !w.document ) { + throw new Error( "jQuery requires a window with a document" ); + } + return factory( w ); + }; + } else { + factory( global ); + } + +// Pass this if window is not defined yet +} )( typeof window !== "undefined" ? window : this, function( window, noGlobal ) { + +// Edge <= 12 - 13+, Firefox <=18 - 45+, IE 10 - 11, Safari 5.1 - 9+, iOS 6 - 9.1 +// throw exceptions when non-strict code (e.g., ASP.NET 4.5) accesses strict mode +// arguments.callee.caller (trac-13335). But as of jQuery 3.0 (2016), strict mode should be common +// enough that all such attempts are guarded in a try block. +"use strict"; + +var arr = []; + +var getProto = Object.getPrototypeOf; + +var slice = arr.slice; + +var flat = arr.flat ? function( array ) { + return arr.flat.call( array ); +} : function( array ) { + return arr.concat.apply( [], array ); +}; + + +var push = arr.push; + +var indexOf = arr.indexOf; + +var class2type = {}; + +var toString = class2type.toString; + +var hasOwn = class2type.hasOwnProperty; + +var fnToString = hasOwn.toString; + +var ObjectFunctionString = fnToString.call( Object ); + +var support = {}; + +var isFunction = function isFunction( obj ) { + + // Support: Chrome <=57, Firefox <=52 + // In some browsers, typeof returns "function" for HTML elements + // (i.e., `typeof document.createElement( "object" ) === "function"`). + // We don't want to classify *any* DOM node as a function. 
+ // Support: QtWeb <=3.8.5, WebKit <=534.34, wkhtmltopdf tool <=0.12.5 + // Plus for old WebKit, typeof returns "function" for HTML collections + // (e.g., `typeof document.getElementsByTagName("div") === "function"`). (gh-4756) + return typeof obj === "function" && typeof obj.nodeType !== "number" && + typeof obj.item !== "function"; + }; + + +var isWindow = function isWindow( obj ) { + return obj != null && obj === obj.window; + }; + + +var document = window.document; + + + + var preservedScriptAttributes = { + type: true, + src: true, + nonce: true, + noModule: true + }; + + function DOMEval( code, node, doc ) { + doc = doc || document; + + var i, val, + script = doc.createElement( "script" ); + + script.text = code; + if ( node ) { + for ( i in preservedScriptAttributes ) { + + // Support: Firefox 64+, Edge 18+ + // Some browsers don't support the "nonce" property on scripts. + // On the other hand, just using `getAttribute` is not enough as + // the `nonce` attribute is reset to an empty string whenever it + // becomes browsing-context connected. + // See https://github.com/whatwg/html/issues/2369 + // See https://html.spec.whatwg.org/#nonce-attributes + // The `node.getAttribute` check was added for the sake of + // `jQuery.globalEval` so that it can fake a nonce-containing node + // via an object. + val = node[ i ] || node.getAttribute && node.getAttribute( i ); + if ( val ) { + script.setAttribute( i, val ); + } + } + } + doc.head.appendChild( script ).parentNode.removeChild( script ); + } + + +function toType( obj ) { + if ( obj == null ) { + return obj + ""; + } + + // Support: Android <=2.3 only (functionish RegExp) + return typeof obj === "object" || typeof obj === "function" ? + class2type[ toString.call( obj ) ] || "object" : + typeof obj; +} +/* global Symbol */ +// Defining this global in .eslintrc.json would create a danger of using the global +// unguarded in another place, it seems safer to define global only for this module + + + +var + version = "3.6.0", + + // Define a local copy of jQuery + jQuery = function( selector, context ) { + + // The jQuery object is actually just the init constructor 'enhanced' + // Need init if jQuery is called (just allow error to be thrown if not included) + return new jQuery.fn.init( selector, context ); + }; + +jQuery.fn = jQuery.prototype = { + + // The current version of jQuery being used + jquery: version, + + constructor: jQuery, + + // The default length of a jQuery object is 0 + length: 0, + + toArray: function() { + return slice.call( this ); + }, + + // Get the Nth element in the matched element set OR + // Get the whole matched element set as a clean array + get: function( num ) { + + // Return all the elements in a clean array + if ( num == null ) { + return slice.call( this ); + } + + // Return just the one element from the set + return num < 0 ? this[ num + this.length ] : this[ num ]; + }, + + // Take an array of elements and push it onto the stack + // (returning the new matched element set) + pushStack: function( elems ) { + + // Build a new jQuery matched element set + var ret = jQuery.merge( this.constructor(), elems ); + + // Add the old object onto the stack (as a reference) + ret.prevObject = this; + + // Return the newly-formed element set + return ret; + }, + + // Execute a callback for every element in the matched set. 
+ each: function( callback ) { + return jQuery.each( this, callback ); + }, + + map: function( callback ) { + return this.pushStack( jQuery.map( this, function( elem, i ) { + return callback.call( elem, i, elem ); + } ) ); + }, + + slice: function() { + return this.pushStack( slice.apply( this, arguments ) ); + }, + + first: function() { + return this.eq( 0 ); + }, + + last: function() { + return this.eq( -1 ); + }, + + even: function() { + return this.pushStack( jQuery.grep( this, function( _elem, i ) { + return ( i + 1 ) % 2; + } ) ); + }, + + odd: function() { + return this.pushStack( jQuery.grep( this, function( _elem, i ) { + return i % 2; + } ) ); + }, + + eq: function( i ) { + var len = this.length, + j = +i + ( i < 0 ? len : 0 ); + return this.pushStack( j >= 0 && j < len ? [ this[ j ] ] : [] ); + }, + + end: function() { + return this.prevObject || this.constructor(); + }, + + // For internal use only. + // Behaves like an Array's method, not like a jQuery method. + push: push, + sort: arr.sort, + splice: arr.splice +}; + +jQuery.extend = jQuery.fn.extend = function() { + var options, name, src, copy, copyIsArray, clone, + target = arguments[ 0 ] || {}, + i = 1, + length = arguments.length, + deep = false; + + // Handle a deep copy situation + if ( typeof target === "boolean" ) { + deep = target; + + // Skip the boolean and the target + target = arguments[ i ] || {}; + i++; + } + + // Handle case when target is a string or something (possible in deep copy) + if ( typeof target !== "object" && !isFunction( target ) ) { + target = {}; + } + + // Extend jQuery itself if only one argument is passed + if ( i === length ) { + target = this; + i--; + } + + for ( ; i < length; i++ ) { + + // Only deal with non-null/undefined values + if ( ( options = arguments[ i ] ) != null ) { + + // Extend the base object + for ( name in options ) { + copy = options[ name ]; + + // Prevent Object.prototype pollution + // Prevent never-ending loop + if ( name === "__proto__" || target === copy ) { + continue; + } + + // Recurse if we're merging plain objects or arrays + if ( deep && copy && ( jQuery.isPlainObject( copy ) || + ( copyIsArray = Array.isArray( copy ) ) ) ) { + src = target[ name ]; + + // Ensure proper type for the source value + if ( copyIsArray && !Array.isArray( src ) ) { + clone = []; + } else if ( !copyIsArray && !jQuery.isPlainObject( src ) ) { + clone = {}; + } else { + clone = src; + } + copyIsArray = false; + + // Never move original objects, clone them + target[ name ] = jQuery.extend( deep, clone, copy ); + + // Don't bring in undefined values + } else if ( copy !== undefined ) { + target[ name ] = copy; + } + } + } + } + + // Return the modified object + return target; +}; + +jQuery.extend( { + + // Unique for each copy of jQuery on the page + expando: "jQuery" + ( version + Math.random() ).replace( /\D/g, "" ), + + // Assume jQuery is ready without the ready module + isReady: true, + + error: function( msg ) { + throw new Error( msg ); + }, + + noop: function() {}, + + isPlainObject: function( obj ) { + var proto, Ctor; + + // Detect obvious negatives + // Use toString instead of jQuery.type to catch host objects + if ( !obj || toString.call( obj ) !== "[object Object]" ) { + return false; + } + + proto = getProto( obj ); + + // Objects with no prototype (e.g., `Object.create( null )`) are plain + if ( !proto ) { + return true; + } + + // Objects with prototype are plain iff they were constructed by a global Object function + Ctor = hasOwn.call( proto, "constructor" ) && 
proto.constructor; + return typeof Ctor === "function" && fnToString.call( Ctor ) === ObjectFunctionString; + }, + + isEmptyObject: function( obj ) { + var name; + + for ( name in obj ) { + return false; + } + return true; + }, + + // Evaluates a script in a provided context; falls back to the global one + // if not specified. + globalEval: function( code, options, doc ) { + DOMEval( code, { nonce: options && options.nonce }, doc ); + }, + + each: function( obj, callback ) { + var length, i = 0; + + if ( isArrayLike( obj ) ) { + length = obj.length; + for ( ; i < length; i++ ) { + if ( callback.call( obj[ i ], i, obj[ i ] ) === false ) { + break; + } + } + } else { + for ( i in obj ) { + if ( callback.call( obj[ i ], i, obj[ i ] ) === false ) { + break; + } + } + } + + return obj; + }, + + // results is for internal usage only + makeArray: function( arr, results ) { + var ret = results || []; + + if ( arr != null ) { + if ( isArrayLike( Object( arr ) ) ) { + jQuery.merge( ret, + typeof arr === "string" ? + [ arr ] : arr + ); + } else { + push.call( ret, arr ); + } + } + + return ret; + }, + + inArray: function( elem, arr, i ) { + return arr == null ? -1 : indexOf.call( arr, elem, i ); + }, + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + merge: function( first, second ) { + var len = +second.length, + j = 0, + i = first.length; + + for ( ; j < len; j++ ) { + first[ i++ ] = second[ j ]; + } + + first.length = i; + + return first; + }, + + grep: function( elems, callback, invert ) { + var callbackInverse, + matches = [], + i = 0, + length = elems.length, + callbackExpect = !invert; + + // Go through the array, only saving the items + // that pass the validator function + for ( ; i < length; i++ ) { + callbackInverse = !callback( elems[ i ], i ); + if ( callbackInverse !== callbackExpect ) { + matches.push( elems[ i ] ); + } + } + + return matches; + }, + + // arg is for internal usage only + map: function( elems, callback, arg ) { + var length, value, + i = 0, + ret = []; + + // Go through the array, translating each of the items to their new values + if ( isArrayLike( elems ) ) { + length = elems.length; + for ( ; i < length; i++ ) { + value = callback( elems[ i ], i, arg ); + + if ( value != null ) { + ret.push( value ); + } + } + + // Go through every key on the object, + } else { + for ( i in elems ) { + value = callback( elems[ i ], i, arg ); + + if ( value != null ) { + ret.push( value ); + } + } + } + + // Flatten any nested arrays + return flat( ret ); + }, + + // A global GUID counter for objects + guid: 1, + + // jQuery.support is not used in Core but other projects attach their + // properties to it so it needs to exist. 
+ support: support +} ); + +if ( typeof Symbol === "function" ) { + jQuery.fn[ Symbol.iterator ] = arr[ Symbol.iterator ]; +} + +// Populate the class2type map +jQuery.each( "Boolean Number String Function Array Date RegExp Object Error Symbol".split( " " ), + function( _i, name ) { + class2type[ "[object " + name + "]" ] = name.toLowerCase(); + } ); + +function isArrayLike( obj ) { + + // Support: real iOS 8.2 only (not reproducible in simulator) + // `in` check used to prevent JIT error (gh-2145) + // hasOwn isn't used here due to false negatives + // regarding Nodelist length in IE + var length = !!obj && "length" in obj && obj.length, + type = toType( obj ); + + if ( isFunction( obj ) || isWindow( obj ) ) { + return false; + } + + return type === "array" || length === 0 || + typeof length === "number" && length > 0 && ( length - 1 ) in obj; +} +var Sizzle = +/*! + * Sizzle CSS Selector Engine v2.3.6 + * https://sizzlejs.com/ + * + * Copyright JS Foundation and other contributors + * Released under the MIT license + * https://js.foundation/ + * + * Date: 2021-02-16 + */ +( function( window ) { +var i, + support, + Expr, + getText, + isXML, + tokenize, + compile, + select, + outermostContext, + sortInput, + hasDuplicate, + + // Local document vars + setDocument, + document, + docElem, + documentIsHTML, + rbuggyQSA, + rbuggyMatches, + matches, + contains, + + // Instance-specific data + expando = "sizzle" + 1 * new Date(), + preferredDoc = window.document, + dirruns = 0, + done = 0, + classCache = createCache(), + tokenCache = createCache(), + compilerCache = createCache(), + nonnativeSelectorCache = createCache(), + sortOrder = function( a, b ) { + if ( a === b ) { + hasDuplicate = true; + } + return 0; + }, + + // Instance methods + hasOwn = ( {} ).hasOwnProperty, + arr = [], + pop = arr.pop, + pushNative = arr.push, + push = arr.push, + slice = arr.slice, + + // Use a stripped-down indexOf as it's faster than native + // https://jsperf.com/thor-indexof-vs-for/5 + indexOf = function( list, elem ) { + var i = 0, + len = list.length; + for ( ; i < len; i++ ) { + if ( list[ i ] === elem ) { + return i; + } + } + return -1; + }, + + booleans = "checked|selected|async|autofocus|autoplay|controls|defer|disabled|hidden|" + + "ismap|loop|multiple|open|readonly|required|scoped", + + // Regular expressions + + // http://www.w3.org/TR/css3-selectors/#whitespace + whitespace = "[\\x20\\t\\r\\n\\f]", + + // https://www.w3.org/TR/css-syntax-3/#ident-token-diagram + identifier = "(?:\\\\[\\da-fA-F]{1,6}" + whitespace + + "?|\\\\[^\\r\\n\\f]|[\\w-]|[^\0-\\x7f])+", + + // Attribute selectors: http://www.w3.org/TR/selectors/#attribute-selectors + attributes = "\\[" + whitespace + "*(" + identifier + ")(?:" + whitespace + + + // Operator (capture 2) + "*([*^$|!~]?=)" + whitespace + + + // "Attribute values must be CSS identifiers [capture 5] + // or strings [capture 3 or capture 4]" + "*(?:'((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\"|(" + identifier + "))|)" + + whitespace + "*\\]", + + pseudos = ":(" + identifier + ")(?:\\((" + + + // To reduce the number of selectors needing tokenize in the preFilter, prefer arguments: + // 1. quoted (capture 3; capture 4 or capture 5) + "('((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\")|" + + + // 2. simple (capture 6) + "((?:\\\\.|[^\\\\()[\\]]|" + attributes + ")*)|" + + + // 3. 
anything else (capture 2) + ".*" + + ")\\)|)", + + // Leading and non-escaped trailing whitespace, capturing some non-whitespace characters preceding the latter + rwhitespace = new RegExp( whitespace + "+", "g" ), + rtrim = new RegExp( "^" + whitespace + "+|((?:^|[^\\\\])(?:\\\\.)*)" + + whitespace + "+$", "g" ), + + rcomma = new RegExp( "^" + whitespace + "*," + whitespace + "*" ), + rcombinators = new RegExp( "^" + whitespace + "*([>+~]|" + whitespace + ")" + whitespace + + "*" ), + rdescend = new RegExp( whitespace + "|>" ), + + rpseudo = new RegExp( pseudos ), + ridentifier = new RegExp( "^" + identifier + "$" ), + + matchExpr = { + "ID": new RegExp( "^#(" + identifier + ")" ), + "CLASS": new RegExp( "^\\.(" + identifier + ")" ), + "TAG": new RegExp( "^(" + identifier + "|[*])" ), + "ATTR": new RegExp( "^" + attributes ), + "PSEUDO": new RegExp( "^" + pseudos ), + "CHILD": new RegExp( "^:(only|first|last|nth|nth-last)-(child|of-type)(?:\\(" + + whitespace + "*(even|odd|(([+-]|)(\\d*)n|)" + whitespace + "*(?:([+-]|)" + + whitespace + "*(\\d+)|))" + whitespace + "*\\)|)", "i" ), + "bool": new RegExp( "^(?:" + booleans + ")$", "i" ), + + // For use in libraries implementing .is() + // We use this for POS matching in `select` + "needsContext": new RegExp( "^" + whitespace + + "*[>+~]|:(even|odd|eq|gt|lt|nth|first|last)(?:\\(" + whitespace + + "*((?:-\\d)?\\d*)" + whitespace + "*\\)|)(?=[^-]|$)", "i" ) + }, + + rhtml = /HTML$/i, + rinputs = /^(?:input|select|textarea|button)$/i, + rheader = /^h\d$/i, + + rnative = /^[^{]+\{\s*\[native \w/, + + // Easily-parseable/retrievable ID or TAG or CLASS selectors + rquickExpr = /^(?:#([\w-]+)|(\w+)|\.([\w-]+))$/, + + rsibling = /[+~]/, + + // CSS escapes + // http://www.w3.org/TR/CSS21/syndata.html#escaped-characters + runescape = new RegExp( "\\\\[\\da-fA-F]{1,6}" + whitespace + "?|\\\\([^\\r\\n\\f])", "g" ), + funescape = function( escape, nonHex ) { + var high = "0x" + escape.slice( 1 ) - 0x10000; + + return nonHex ? + + // Strip the backslash prefix from a non-hex escape sequence + nonHex : + + // Replace a hexadecimal escape sequence with the encoded Unicode code point + // Support: IE <=11+ + // For values outside the Basic Multilingual Plane (BMP), manually construct a + // surrogate pair + high < 0 ? 
+ String.fromCharCode( high + 0x10000 ) : + String.fromCharCode( high >> 10 | 0xD800, high & 0x3FF | 0xDC00 ); + }, + + // CSS string/identifier serialization + // https://drafts.csswg.org/cssom/#common-serializing-idioms + rcssescape = /([\0-\x1f\x7f]|^-?\d)|^-$|[^\0-\x1f\x7f-\uFFFF\w-]/g, + fcssescape = function( ch, asCodePoint ) { + if ( asCodePoint ) { + + // U+0000 NULL becomes U+FFFD REPLACEMENT CHARACTER + if ( ch === "\0" ) { + return "\uFFFD"; + } + + // Control characters and (dependent upon position) numbers get escaped as code points + return ch.slice( 0, -1 ) + "\\" + + ch.charCodeAt( ch.length - 1 ).toString( 16 ) + " "; + } + + // Other potentially-special ASCII characters get backslash-escaped + return "\\" + ch; + }, + + // Used for iframes + // See setDocument() + // Removing the function wrapper causes a "Permission Denied" + // error in IE + unloadHandler = function() { + setDocument(); + }, + + inDisabledFieldset = addCombinator( + function( elem ) { + return elem.disabled === true && elem.nodeName.toLowerCase() === "fieldset"; + }, + { dir: "parentNode", next: "legend" } + ); + +// Optimize for push.apply( _, NodeList ) +try { + push.apply( + ( arr = slice.call( preferredDoc.childNodes ) ), + preferredDoc.childNodes + ); + + // Support: Android<4.0 + // Detect silently failing push.apply + // eslint-disable-next-line no-unused-expressions + arr[ preferredDoc.childNodes.length ].nodeType; +} catch ( e ) { + push = { apply: arr.length ? + + // Leverage slice if possible + function( target, els ) { + pushNative.apply( target, slice.call( els ) ); + } : + + // Support: IE<9 + // Otherwise append directly + function( target, els ) { + var j = target.length, + i = 0; + + // Can't trust NodeList.length + while ( ( target[ j++ ] = els[ i++ ] ) ) {} + target.length = j - 1; + } + }; +} + +function Sizzle( selector, context, results, seed ) { + var m, i, elem, nid, match, groups, newSelector, + newContext = context && context.ownerDocument, + + // nodeType defaults to 9, since context defaults to document + nodeType = context ? 
context.nodeType : 9; + + results = results || []; + + // Return early from calls with invalid selector or context + if ( typeof selector !== "string" || !selector || + nodeType !== 1 && nodeType !== 9 && nodeType !== 11 ) { + + return results; + } + + // Try to shortcut find operations (as opposed to filters) in HTML documents + if ( !seed ) { + setDocument( context ); + context = context || document; + + if ( documentIsHTML ) { + + // If the selector is sufficiently simple, try using a "get*By*" DOM method + // (excepting DocumentFragment context, where the methods don't exist) + if ( nodeType !== 11 && ( match = rquickExpr.exec( selector ) ) ) { + + // ID selector + if ( ( m = match[ 1 ] ) ) { + + // Document context + if ( nodeType === 9 ) { + if ( ( elem = context.getElementById( m ) ) ) { + + // Support: IE, Opera, Webkit + // TODO: identify versions + // getElementById can match elements by name instead of ID + if ( elem.id === m ) { + results.push( elem ); + return results; + } + } else { + return results; + } + + // Element context + } else { + + // Support: IE, Opera, Webkit + // TODO: identify versions + // getElementById can match elements by name instead of ID + if ( newContext && ( elem = newContext.getElementById( m ) ) && + contains( context, elem ) && + elem.id === m ) { + + results.push( elem ); + return results; + } + } + + // Type selector + } else if ( match[ 2 ] ) { + push.apply( results, context.getElementsByTagName( selector ) ); + return results; + + // Class selector + } else if ( ( m = match[ 3 ] ) && support.getElementsByClassName && + context.getElementsByClassName ) { + + push.apply( results, context.getElementsByClassName( m ) ); + return results; + } + } + + // Take advantage of querySelectorAll + if ( support.qsa && + !nonnativeSelectorCache[ selector + " " ] && + ( !rbuggyQSA || !rbuggyQSA.test( selector ) ) && + + // Support: IE 8 only + // Exclude object elements + ( nodeType !== 1 || context.nodeName.toLowerCase() !== "object" ) ) { + + newSelector = selector; + newContext = context; + + // qSA considers elements outside a scoping root when evaluating child or + // descendant combinators, which is not what we want. + // In such cases, we work around the behavior by prefixing every selector in the + // list with an ID selector referencing the scope context. + // The technique has to be used as well when a leading combinator is used + // as such selectors are not recognized by querySelectorAll. + // Thanks to Andrew Dupont for this technique. + if ( nodeType === 1 && + ( rdescend.test( selector ) || rcombinators.test( selector ) ) ) { + + // Expand context for sibling selectors + newContext = rsibling.test( selector ) && testContext( context.parentNode ) || + context; + + // We can use :scope instead of the ID hack if the browser + // supports it & if we're not changing the context. + if ( newContext !== context || !support.scope ) { + + // Capture the context ID, setting it first if necessary + if ( ( nid = context.getAttribute( "id" ) ) ) { + nid = nid.replace( rcssescape, fcssescape ); + } else { + context.setAttribute( "id", ( nid = expando ) ); + } + } + + // Prefix every selector in the list + groups = tokenize( selector ); + i = groups.length; + while ( i-- ) { + groups[ i ] = ( nid ? 
"#" + nid : ":scope" ) + " " + + toSelector( groups[ i ] ); + } + newSelector = groups.join( "," ); + } + + try { + push.apply( results, + newContext.querySelectorAll( newSelector ) + ); + return results; + } catch ( qsaError ) { + nonnativeSelectorCache( selector, true ); + } finally { + if ( nid === expando ) { + context.removeAttribute( "id" ); + } + } + } + } + } + + // All others + return select( selector.replace( rtrim, "$1" ), context, results, seed ); +} + +/** + * Create key-value caches of limited size + * @returns {function(string, object)} Returns the Object data after storing it on itself with + * property name the (space-suffixed) string and (if the cache is larger than Expr.cacheLength) + * deleting the oldest entry + */ +function createCache() { + var keys = []; + + function cache( key, value ) { + + // Use (key + " ") to avoid collision with native prototype properties (see Issue #157) + if ( keys.push( key + " " ) > Expr.cacheLength ) { + + // Only keep the most recent entries + delete cache[ keys.shift() ]; + } + return ( cache[ key + " " ] = value ); + } + return cache; +} + +/** + * Mark a function for special use by Sizzle + * @param {Function} fn The function to mark + */ +function markFunction( fn ) { + fn[ expando ] = true; + return fn; +} + +/** + * Support testing using an element + * @param {Function} fn Passed the created element and returns a boolean result + */ +function assert( fn ) { + var el = document.createElement( "fieldset" ); + + try { + return !!fn( el ); + } catch ( e ) { + return false; + } finally { + + // Remove from its parent by default + if ( el.parentNode ) { + el.parentNode.removeChild( el ); + } + + // release memory in IE + el = null; + } +} + +/** + * Adds the same handler for all of the specified attrs + * @param {String} attrs Pipe-separated list of attributes + * @param {Function} handler The method that will be applied + */ +function addHandle( attrs, handler ) { + var arr = attrs.split( "|" ), + i = arr.length; + + while ( i-- ) { + Expr.attrHandle[ arr[ i ] ] = handler; + } +} + +/** + * Checks document order of two siblings + * @param {Element} a + * @param {Element} b + * @returns {Number} Returns less than 0 if a precedes b, greater than 0 if a follows b + */ +function siblingCheck( a, b ) { + var cur = b && a, + diff = cur && a.nodeType === 1 && b.nodeType === 1 && + a.sourceIndex - b.sourceIndex; + + // Use IE sourceIndex if available on both nodes + if ( diff ) { + return diff; + } + + // Check if b follows a + if ( cur ) { + while ( ( cur = cur.nextSibling ) ) { + if ( cur === b ) { + return -1; + } + } + } + + return a ? 
1 : -1; +} + +/** + * Returns a function to use in pseudos for input types + * @param {String} type + */ +function createInputPseudo( type ) { + return function( elem ) { + var name = elem.nodeName.toLowerCase(); + return name === "input" && elem.type === type; + }; +} + +/** + * Returns a function to use in pseudos for buttons + * @param {String} type + */ +function createButtonPseudo( type ) { + return function( elem ) { + var name = elem.nodeName.toLowerCase(); + return ( name === "input" || name === "button" ) && elem.type === type; + }; +} + +/** + * Returns a function to use in pseudos for :enabled/:disabled + * @param {Boolean} disabled true for :disabled; false for :enabled + */ +function createDisabledPseudo( disabled ) { + + // Known :disabled false positives: fieldset[disabled] > legend:nth-of-type(n+2) :can-disable + return function( elem ) { + + // Only certain elements can match :enabled or :disabled + // https://html.spec.whatwg.org/multipage/scripting.html#selector-enabled + // https://html.spec.whatwg.org/multipage/scripting.html#selector-disabled + if ( "form" in elem ) { + + // Check for inherited disabledness on relevant non-disabled elements: + // * listed form-associated elements in a disabled fieldset + // https://html.spec.whatwg.org/multipage/forms.html#category-listed + // https://html.spec.whatwg.org/multipage/forms.html#concept-fe-disabled + // * option elements in a disabled optgroup + // https://html.spec.whatwg.org/multipage/forms.html#concept-option-disabled + // All such elements have a "form" property. + if ( elem.parentNode && elem.disabled === false ) { + + // Option elements defer to a parent optgroup if present + if ( "label" in elem ) { + if ( "label" in elem.parentNode ) { + return elem.parentNode.disabled === disabled; + } else { + return elem.disabled === disabled; + } + } + + // Support: IE 6 - 11 + // Use the isDisabled shortcut property to check for disabled fieldset ancestors + return elem.isDisabled === disabled || + + // Where there is no isDisabled, check manually + /* jshint -W018 */ + elem.isDisabled !== !disabled && + inDisabledFieldset( elem ) === disabled; + } + + return elem.disabled === disabled; + + // Try to winnow out elements that can't be disabled before trusting the disabled property. + // Some victims get caught in our net (label, legend, menu, track), but it shouldn't + // even exist on them, let alone have a boolean value. 
+ } else if ( "label" in elem ) { + return elem.disabled === disabled; + } + + // Remaining elements are neither :enabled nor :disabled + return false; + }; +} + +/** + * Returns a function to use in pseudos for positionals + * @param {Function} fn + */ +function createPositionalPseudo( fn ) { + return markFunction( function( argument ) { + argument = +argument; + return markFunction( function( seed, matches ) { + var j, + matchIndexes = fn( [], seed.length, argument ), + i = matchIndexes.length; + + // Match elements found at the specified indexes + while ( i-- ) { + if ( seed[ ( j = matchIndexes[ i ] ) ] ) { + seed[ j ] = !( matches[ j ] = seed[ j ] ); + } + } + } ); + } ); +} + +/** + * Checks a node for validity as a Sizzle context + * @param {Element|Object=} context + * @returns {Element|Object|Boolean} The input node if acceptable, otherwise a falsy value + */ +function testContext( context ) { + return context && typeof context.getElementsByTagName !== "undefined" && context; +} + +// Expose support vars for convenience +support = Sizzle.support = {}; + +/** + * Detects XML nodes + * @param {Element|Object} elem An element or a document + * @returns {Boolean} True iff elem is a non-HTML XML node + */ +isXML = Sizzle.isXML = function( elem ) { + var namespace = elem && elem.namespaceURI, + docElem = elem && ( elem.ownerDocument || elem ).documentElement; + + // Support: IE <=8 + // Assume HTML when documentElement doesn't yet exist, such as inside loading iframes + // https://bugs.jquery.com/ticket/4833 + return !rhtml.test( namespace || docElem && docElem.nodeName || "HTML" ); +}; + +/** + * Sets document-related variables once based on the current document + * @param {Element|Object} [doc] An element or document object to use to set the document + * @returns {Object} Returns the current document + */ +setDocument = Sizzle.setDocument = function( node ) { + var hasCompare, subWindow, + doc = node ? node.ownerDocument || node : preferredDoc; + + // Return early if doc is invalid or already selected + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( doc == document || doc.nodeType !== 9 || !doc.documentElement ) { + return document; + } + + // Update global variables + document = doc; + docElem = document.documentElement; + documentIsHTML = !isXML( document ); + + // Support: IE 9 - 11+, Edge 12 - 18+ + // Accessing iframe documents after unload throws "permission denied" errors (jQuery #13936) + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( preferredDoc != document && + ( subWindow = document.defaultView ) && subWindow.top !== subWindow ) { + + // Support: IE 11, Edge + if ( subWindow.addEventListener ) { + subWindow.addEventListener( "unload", unloadHandler, false ); + + // Support: IE 9 - 10 only + } else if ( subWindow.attachEvent ) { + subWindow.attachEvent( "onunload", unloadHandler ); + } + } + + // Support: IE 8 - 11+, Edge 12 - 18+, Chrome <=16 - 25 only, Firefox <=3.6 - 31 only, + // Safari 4 - 5 only, Opera <=11.6 - 12.x only + // IE/Edge & older browsers don't support the :scope pseudo-class. + // Support: Safari 6.0 only + // Safari 6.0 supports :scope but it's an alias of :root there. 
+ support.scope = assert( function( el ) { + docElem.appendChild( el ).appendChild( document.createElement( "div" ) ); + return typeof el.querySelectorAll !== "undefined" && + !el.querySelectorAll( ":scope fieldset div" ).length; + } ); + + /* Attributes + ---------------------------------------------------------------------- */ + + // Support: IE<8 + // Verify that getAttribute really returns attributes and not properties + // (excepting IE8 booleans) + support.attributes = assert( function( el ) { + el.className = "i"; + return !el.getAttribute( "className" ); + } ); + + /* getElement(s)By* + ---------------------------------------------------------------------- */ + + // Check if getElementsByTagName("*") returns only elements + support.getElementsByTagName = assert( function( el ) { + el.appendChild( document.createComment( "" ) ); + return !el.getElementsByTagName( "*" ).length; + } ); + + // Support: IE<9 + support.getElementsByClassName = rnative.test( document.getElementsByClassName ); + + // Support: IE<10 + // Check if getElementById returns elements by name + // The broken getElementById methods don't pick up programmatically-set names, + // so use a roundabout getElementsByName test + support.getById = assert( function( el ) { + docElem.appendChild( el ).id = expando; + return !document.getElementsByName || !document.getElementsByName( expando ).length; + } ); + + // ID filter and find + if ( support.getById ) { + Expr.filter[ "ID" ] = function( id ) { + var attrId = id.replace( runescape, funescape ); + return function( elem ) { + return elem.getAttribute( "id" ) === attrId; + }; + }; + Expr.find[ "ID" ] = function( id, context ) { + if ( typeof context.getElementById !== "undefined" && documentIsHTML ) { + var elem = context.getElementById( id ); + return elem ? [ elem ] : []; + } + }; + } else { + Expr.filter[ "ID" ] = function( id ) { + var attrId = id.replace( runescape, funescape ); + return function( elem ) { + var node = typeof elem.getAttributeNode !== "undefined" && + elem.getAttributeNode( "id" ); + return node && node.value === attrId; + }; + }; + + // Support: IE 6 - 7 only + // getElementById is not reliable as a find shortcut + Expr.find[ "ID" ] = function( id, context ) { + if ( typeof context.getElementById !== "undefined" && documentIsHTML ) { + var node, i, elems, + elem = context.getElementById( id ); + + if ( elem ) { + + // Verify the id attribute + node = elem.getAttributeNode( "id" ); + if ( node && node.value === id ) { + return [ elem ]; + } + + // Fall back on getElementsByName + elems = context.getElementsByName( id ); + i = 0; + while ( ( elem = elems[ i++ ] ) ) { + node = elem.getAttributeNode( "id" ); + if ( node && node.value === id ) { + return [ elem ]; + } + } + } + + return []; + } + }; + } + + // Tag + Expr.find[ "TAG" ] = support.getElementsByTagName ? 
+ function( tag, context ) { + if ( typeof context.getElementsByTagName !== "undefined" ) { + return context.getElementsByTagName( tag ); + + // DocumentFragment nodes don't have gEBTN + } else if ( support.qsa ) { + return context.querySelectorAll( tag ); + } + } : + + function( tag, context ) { + var elem, + tmp = [], + i = 0, + + // By happy coincidence, a (broken) gEBTN appears on DocumentFragment nodes too + results = context.getElementsByTagName( tag ); + + // Filter out possible comments + if ( tag === "*" ) { + while ( ( elem = results[ i++ ] ) ) { + if ( elem.nodeType === 1 ) { + tmp.push( elem ); + } + } + + return tmp; + } + return results; + }; + + // Class + Expr.find[ "CLASS" ] = support.getElementsByClassName && function( className, context ) { + if ( typeof context.getElementsByClassName !== "undefined" && documentIsHTML ) { + return context.getElementsByClassName( className ); + } + }; + + /* QSA/matchesSelector + ---------------------------------------------------------------------- */ + + // QSA and matchesSelector support + + // matchesSelector(:active) reports false when true (IE9/Opera 11.5) + rbuggyMatches = []; + + // qSa(:focus) reports false when true (Chrome 21) + // We allow this because of a bug in IE8/9 that throws an error + // whenever `document.activeElement` is accessed on an iframe + // So, we allow :focus to pass through QSA all the time to avoid the IE error + // See https://bugs.jquery.com/ticket/13378 + rbuggyQSA = []; + + if ( ( support.qsa = rnative.test( document.querySelectorAll ) ) ) { + + // Build QSA regex + // Regex strategy adopted from Diego Perini + assert( function( el ) { + + var input; + + // Select is set to empty string on purpose + // This is to test IE's treatment of not explicitly + // setting a boolean content attribute, + // since its presence should be enough + // https://bugs.jquery.com/ticket/12359 + docElem.appendChild( el ).innerHTML = "<a id='" + expando + "'></a>" + + "<select id='" + expando + "-\r\\' msallowcapture=''>" + + "<option selected=''></option></select>"; + + // Support: IE8, Opera 11-12.16 + // Nothing should be selected when empty strings follow ^= or $= or *= + // The test attribute must be unknown in Opera but "safe" for WinRT + // https://msdn.microsoft.com/en-us/library/ie/hh465388.aspx#attribute_section + if ( el.querySelectorAll( "[msallowcapture^='']" ).length ) { + rbuggyQSA.push( "[*^$]=" + whitespace + "*(?:''|\"\")" ); + } + + // Support: IE8 + // Boolean attributes and "value" are not treated correctly + if ( !el.querySelectorAll( "[selected]" ).length ) { + rbuggyQSA.push( "\\[" + whitespace + "*(?:value|" + booleans + ")" ); + } + + // Support: Chrome<29, Android<4.4, Safari<7.0+, iOS<7.0+, PhantomJS<1.9.8+ + if ( !el.querySelectorAll( "[id~=" + expando + "-]" ).length ) { + rbuggyQSA.push( "~=" ); + } + + // Support: IE 11+, Edge 15 - 18+ + // IE 11/Edge don't find elements on a `[name='']` query in some cases. + // Adding a temporary attribute to the document before the selection works + // around the issue. + // Interestingly, IE 10 & older don't seem to have the issue. 
+ input = document.createElement( "input" ); + input.setAttribute( "name", "" ); + el.appendChild( input ); + if ( !el.querySelectorAll( "[name='']" ).length ) { + rbuggyQSA.push( "\\[" + whitespace + "*name" + whitespace + "*=" + + whitespace + "*(?:''|\"\")" ); + } + + // Webkit/Opera - :checked should return selected option elements + // http://www.w3.org/TR/2011/REC-css3-selectors-20110929/#checked + // IE8 throws error here and will not see later tests + if ( !el.querySelectorAll( ":checked" ).length ) { + rbuggyQSA.push( ":checked" ); + } + + // Support: Safari 8+, iOS 8+ + // https://bugs.webkit.org/show_bug.cgi?id=136851 + // In-page `selector#id sibling-combinator selector` fails + if ( !el.querySelectorAll( "a#" + expando + "+*" ).length ) { + rbuggyQSA.push( ".#.+[+~]" ); + } + + // Support: Firefox <=3.6 - 5 only + // Old Firefox doesn't throw on a badly-escaped identifier. + el.querySelectorAll( "\\\f" ); + rbuggyQSA.push( "[\\r\\n\\f]" ); + } ); + + assert( function( el ) { + el.innerHTML = "<a href='' disabled='disabled'></a>" + + "<select disabled='disabled'><option/></select>"; + + // Support: Windows 8 Native Apps + // The type and name attributes are restricted during .innerHTML assignment + var input = document.createElement( "input" ); + input.setAttribute( "type", "hidden" ); + el.appendChild( input ).setAttribute( "name", "D" ); + + // Support: IE8 + // Enforce case-sensitivity of name attribute + if ( el.querySelectorAll( "[name=d]" ).length ) { + rbuggyQSA.push( "name" + whitespace + "*[*^$|!~]?=" ); + } + + // FF 3.5 - :enabled/:disabled and hidden elements (hidden elements are still enabled) + // IE8 throws error here and will not see later tests + if ( el.querySelectorAll( ":enabled" ).length !== 2 ) { + rbuggyQSA.push( ":enabled", ":disabled" ); + } + + // Support: IE9-11+ + // IE's :disabled selector does not pick up the children of disabled fieldsets + docElem.appendChild( el ).disabled = true; + if ( el.querySelectorAll( ":disabled" ).length !== 2 ) { + rbuggyQSA.push( ":enabled", ":disabled" ); + } + + // Support: Opera 10 - 11 only + // Opera 10-11 does not throw on post-comma invalid pseudos + el.querySelectorAll( "*,:x" ); + rbuggyQSA.push( ",.*:" ); + } ); + } + + if ( ( support.matchesSelector = rnative.test( ( matches = docElem.matches || + docElem.webkitMatchesSelector || + docElem.mozMatchesSelector || + docElem.oMatchesSelector || + docElem.msMatchesSelector ) ) ) ) { + + assert( function( el ) { + + // Check to see if it's possible to do matchesSelector + // on a disconnected node (IE 9) + support.disconnectedMatch = matches.call( el, "*" ); + + // This should fail with an exception + // Gecko does not error, returns false instead + matches.call( el, "[s!='']:x" ); + rbuggyMatches.push( "!=", pseudos ); + } ); + } + + rbuggyQSA = rbuggyQSA.length && new RegExp( rbuggyQSA.join( "|" ) ); + rbuggyMatches = rbuggyMatches.length && new RegExp( rbuggyMatches.join( "|" ) ); + + /* Contains + ---------------------------------------------------------------------- */ + hasCompare = rnative.test( docElem.compareDocumentPosition ); + + // Element contains another + // Purposefully self-exclusive + // As in, an element does not contain itself + contains = hasCompare || rnative.test( docElem.contains ) ? + function( a, b ) { + var adown = a.nodeType === 9 ? a.documentElement : a, + bup = b && b.parentNode; + return a === bup || !!( bup && bup.nodeType === 1 && ( + adown.contains ? 
+ adown.contains( bup ) : + a.compareDocumentPosition && a.compareDocumentPosition( bup ) & 16 + ) ); + } : + function( a, b ) { + if ( b ) { + while ( ( b = b.parentNode ) ) { + if ( b === a ) { + return true; + } + } + } + return false; + }; + + /* Sorting + ---------------------------------------------------------------------- */ + + // Document order sorting + sortOrder = hasCompare ? + function( a, b ) { + + // Flag for duplicate removal + if ( a === b ) { + hasDuplicate = true; + return 0; + } + + // Sort on method existence if only one input has compareDocumentPosition + var compare = !a.compareDocumentPosition - !b.compareDocumentPosition; + if ( compare ) { + return compare; + } + + // Calculate position if both inputs belong to the same document + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + compare = ( a.ownerDocument || a ) == ( b.ownerDocument || b ) ? + a.compareDocumentPosition( b ) : + + // Otherwise we know they are disconnected + 1; + + // Disconnected nodes + if ( compare & 1 || + ( !support.sortDetached && b.compareDocumentPosition( a ) === compare ) ) { + + // Choose the first element that is related to our preferred document + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( a == document || a.ownerDocument == preferredDoc && + contains( preferredDoc, a ) ) { + return -1; + } + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( b == document || b.ownerDocument == preferredDoc && + contains( preferredDoc, b ) ) { + return 1; + } + + // Maintain original order + return sortInput ? + ( indexOf( sortInput, a ) - indexOf( sortInput, b ) ) : + 0; + } + + return compare & 4 ? -1 : 1; + } : + function( a, b ) { + + // Exit early if the nodes are identical + if ( a === b ) { + hasDuplicate = true; + return 0; + } + + var cur, + i = 0, + aup = a.parentNode, + bup = b.parentNode, + ap = [ a ], + bp = [ b ]; + + // Parentless nodes are either documents or disconnected + if ( !aup || !bup ) { + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + /* eslint-disable eqeqeq */ + return a == document ? -1 : + b == document ? 1 : + /* eslint-enable eqeqeq */ + aup ? -1 : + bup ? 1 : + sortInput ? + ( indexOf( sortInput, a ) - indexOf( sortInput, b ) ) : + 0; + + // If the nodes are siblings, we can do a quick check + } else if ( aup === bup ) { + return siblingCheck( a, b ); + } + + // Otherwise we need full lists of their ancestors for comparison + cur = a; + while ( ( cur = cur.parentNode ) ) { + ap.unshift( cur ); + } + cur = b; + while ( ( cur = cur.parentNode ) ) { + bp.unshift( cur ); + } + + // Walk down the tree looking for a discrepancy + while ( ap[ i ] === bp[ i ] ) { + i++; + } + + return i ? + + // Do a sibling check if the nodes have a common ancestor + siblingCheck( ap[ i ], bp[ i ] ) : + + // Otherwise nodes in our document sort first + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. 
+ /* eslint-disable eqeqeq */ + ap[ i ] == preferredDoc ? -1 : + bp[ i ] == preferredDoc ? 1 : + /* eslint-enable eqeqeq */ + 0; + }; + + return document; +}; + +Sizzle.matches = function( expr, elements ) { + return Sizzle( expr, null, null, elements ); +}; + +Sizzle.matchesSelector = function( elem, expr ) { + setDocument( elem ); + + if ( support.matchesSelector && documentIsHTML && + !nonnativeSelectorCache[ expr + " " ] && + ( !rbuggyMatches || !rbuggyMatches.test( expr ) ) && + ( !rbuggyQSA || !rbuggyQSA.test( expr ) ) ) { + + try { + var ret = matches.call( elem, expr ); + + // IE 9's matchesSelector returns false on disconnected nodes + if ( ret || support.disconnectedMatch || + + // As well, disconnected nodes are said to be in a document + // fragment in IE 9 + elem.document && elem.document.nodeType !== 11 ) { + return ret; + } + } catch ( e ) { + nonnativeSelectorCache( expr, true ); + } + } + + return Sizzle( expr, document, null, [ elem ] ).length > 0; +}; + +Sizzle.contains = function( context, elem ) { + + // Set document vars if needed + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( ( context.ownerDocument || context ) != document ) { + setDocument( context ); + } + return contains( context, elem ); +}; + +Sizzle.attr = function( elem, name ) { + + // Set document vars if needed + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( ( elem.ownerDocument || elem ) != document ) { + setDocument( elem ); + } + + var fn = Expr.attrHandle[ name.toLowerCase() ], + + // Don't get fooled by Object.prototype properties (jQuery #13807) + val = fn && hasOwn.call( Expr.attrHandle, name.toLowerCase() ) ? + fn( elem, name, !documentIsHTML ) : + undefined; + + return val !== undefined ? + val : + support.attributes || !documentIsHTML ? + elem.getAttribute( name ) : + ( val = elem.getAttributeNode( name ) ) && val.specified ? 
+ val.value : + null; +}; + +Sizzle.escape = function( sel ) { + return ( sel + "" ).replace( rcssescape, fcssescape ); +}; + +Sizzle.error = function( msg ) { + throw new Error( "Syntax error, unrecognized expression: " + msg ); +}; + +/** + * Document sorting and removing duplicates + * @param {ArrayLike} results + */ +Sizzle.uniqueSort = function( results ) { + var elem, + duplicates = [], + j = 0, + i = 0; + + // Unless we *know* we can detect duplicates, assume their presence + hasDuplicate = !support.detectDuplicates; + sortInput = !support.sortStable && results.slice( 0 ); + results.sort( sortOrder ); + + if ( hasDuplicate ) { + while ( ( elem = results[ i++ ] ) ) { + if ( elem === results[ i ] ) { + j = duplicates.push( i ); + } + } + while ( j-- ) { + results.splice( duplicates[ j ], 1 ); + } + } + + // Clear input after sorting to release objects + // See https://github.com/jquery/sizzle/pull/225 + sortInput = null; + + return results; +}; + +/** + * Utility function for retrieving the text value of an array of DOM nodes + * @param {Array|Element} elem + */ +getText = Sizzle.getText = function( elem ) { + var node, + ret = "", + i = 0, + nodeType = elem.nodeType; + + if ( !nodeType ) { + + // If no nodeType, this is expected to be an array + while ( ( node = elem[ i++ ] ) ) { + + // Do not traverse comment nodes + ret += getText( node ); + } + } else if ( nodeType === 1 || nodeType === 9 || nodeType === 11 ) { + + // Use textContent for elements + // innerText usage removed for consistency of new lines (jQuery #11153) + if ( typeof elem.textContent === "string" ) { + return elem.textContent; + } else { + + // Traverse its children + for ( elem = elem.firstChild; elem; elem = elem.nextSibling ) { + ret += getText( elem ); + } + } + } else if ( nodeType === 3 || nodeType === 4 ) { + return elem.nodeValue; + } + + // Do not include comment or processing instruction nodes + + return ret; +}; + +Expr = Sizzle.selectors = { + + // Can be adjusted by the user + cacheLength: 50, + + createPseudo: markFunction, + + match: matchExpr, + + attrHandle: {}, + + find: {}, + + relative: { + ">": { dir: "parentNode", first: true }, + " ": { dir: "parentNode" }, + "+": { dir: "previousSibling", first: true }, + "~": { dir: "previousSibling" } + }, + + preFilter: { + "ATTR": function( match ) { + match[ 1 ] = match[ 1 ].replace( runescape, funescape ); + + // Move the given value to match[3] whether quoted or unquoted + match[ 3 ] = ( match[ 3 ] || match[ 4 ] || + match[ 5 ] || "" ).replace( runescape, funescape ); + + if ( match[ 2 ] === "~=" ) { + match[ 3 ] = " " + match[ 3 ] + " "; + } + + return match.slice( 0, 4 ); + }, + + "CHILD": function( match ) { + + /* matches from matchExpr["CHILD"] + 1 type (only|nth|...) + 2 what (child|of-type) + 3 argument (even|odd|\d*|\d*n([+-]\d+)?|...) + 4 xn-component of xn+y argument ([+-]?\d*n|) + 5 sign of xn-component + 6 x of xn-component + 7 sign of y-component + 8 y of y-component + */ + match[ 1 ] = match[ 1 ].toLowerCase(); + + if ( match[ 1 ].slice( 0, 3 ) === "nth" ) { + + // nth-* requires argument + if ( !match[ 3 ] ) { + Sizzle.error( match[ 0 ] ); + } + + // numeric x and y parameters for Expr.filter.CHILD + // remember that false/true cast respectively to 0/1 + match[ 4 ] = +( match[ 4 ] ? 
+ match[ 5 ] + ( match[ 6 ] || 1 ) : + 2 * ( match[ 3 ] === "even" || match[ 3 ] === "odd" ) ); + match[ 5 ] = +( ( match[ 7 ] + match[ 8 ] ) || match[ 3 ] === "odd" ); + + // other types prohibit arguments + } else if ( match[ 3 ] ) { + Sizzle.error( match[ 0 ] ); + } + + return match; + }, + + "PSEUDO": function( match ) { + var excess, + unquoted = !match[ 6 ] && match[ 2 ]; + + if ( matchExpr[ "CHILD" ].test( match[ 0 ] ) ) { + return null; + } + + // Accept quoted arguments as-is + if ( match[ 3 ] ) { + match[ 2 ] = match[ 4 ] || match[ 5 ] || ""; + + // Strip excess characters from unquoted arguments + } else if ( unquoted && rpseudo.test( unquoted ) && + + // Get excess from tokenize (recursively) + ( excess = tokenize( unquoted, true ) ) && + + // advance to the next closing parenthesis + ( excess = unquoted.indexOf( ")", unquoted.length - excess ) - unquoted.length ) ) { + + // excess is a negative index + match[ 0 ] = match[ 0 ].slice( 0, excess ); + match[ 2 ] = unquoted.slice( 0, excess ); + } + + // Return only captures needed by the pseudo filter method (type and argument) + return match.slice( 0, 3 ); + } + }, + + filter: { + + "TAG": function( nodeNameSelector ) { + var nodeName = nodeNameSelector.replace( runescape, funescape ).toLowerCase(); + return nodeNameSelector === "*" ? + function() { + return true; + } : + function( elem ) { + return elem.nodeName && elem.nodeName.toLowerCase() === nodeName; + }; + }, + + "CLASS": function( className ) { + var pattern = classCache[ className + " " ]; + + return pattern || + ( pattern = new RegExp( "(^|" + whitespace + + ")" + className + "(" + whitespace + "|$)" ) ) && classCache( + className, function( elem ) { + return pattern.test( + typeof elem.className === "string" && elem.className || + typeof elem.getAttribute !== "undefined" && + elem.getAttribute( "class" ) || + "" + ); + } ); + }, + + "ATTR": function( name, operator, check ) { + return function( elem ) { + var result = Sizzle.attr( elem, name ); + + if ( result == null ) { + return operator === "!="; + } + if ( !operator ) { + return true; + } + + result += ""; + + /* eslint-disable max-len */ + + return operator === "=" ? result === check : + operator === "!=" ? result !== check : + operator === "^=" ? check && result.indexOf( check ) === 0 : + operator === "*=" ? check && result.indexOf( check ) > -1 : + operator === "$=" ? check && result.slice( -check.length ) === check : + operator === "~=" ? ( " " + result.replace( rwhitespace, " " ) + " " ).indexOf( check ) > -1 : + operator === "|=" ? result === check || result.slice( 0, check.length + 1 ) === check + "-" : + false; + /* eslint-enable max-len */ + + }; + }, + + "CHILD": function( type, what, _argument, first, last ) { + var simple = type.slice( 0, 3 ) !== "nth", + forward = type.slice( -4 ) !== "last", + ofType = what === "of-type"; + + return first === 1 && last === 0 ? + + // Shortcut for :nth-*(n) + function( elem ) { + return !!elem.parentNode; + } : + + function( elem, _context, xml ) { + var cache, uniqueCache, outerCache, node, nodeIndex, start, + dir = simple !== forward ? "nextSibling" : "previousSibling", + parent = elem.parentNode, + name = ofType && elem.nodeName.toLowerCase(), + useCache = !xml && !ofType, + diff = false; + + if ( parent ) { + + // :(first|last|only)-(child|of-type) + if ( simple ) { + while ( dir ) { + node = elem; + while ( ( node = node[ dir ] ) ) { + if ( ofType ? 
+ node.nodeName.toLowerCase() === name : + node.nodeType === 1 ) { + + return false; + } + } + + // Reverse direction for :only-* (if we haven't yet done so) + start = dir = type === "only" && !start && "nextSibling"; + } + return true; + } + + start = [ forward ? parent.firstChild : parent.lastChild ]; + + // non-xml :nth-child(...) stores cache data on `parent` + if ( forward && useCache ) { + + // Seek `elem` from a previously-cached index + + // ...in a gzip-friendly way + node = parent; + outerCache = node[ expando ] || ( node[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ node.uniqueID ] || + ( outerCache[ node.uniqueID ] = {} ); + + cache = uniqueCache[ type ] || []; + nodeIndex = cache[ 0 ] === dirruns && cache[ 1 ]; + diff = nodeIndex && cache[ 2 ]; + node = nodeIndex && parent.childNodes[ nodeIndex ]; + + while ( ( node = ++nodeIndex && node && node[ dir ] || + + // Fallback to seeking `elem` from the start + ( diff = nodeIndex = 0 ) || start.pop() ) ) { + + // When found, cache indexes on `parent` and break + if ( node.nodeType === 1 && ++diff && node === elem ) { + uniqueCache[ type ] = [ dirruns, nodeIndex, diff ]; + break; + } + } + + } else { + + // Use previously-cached element index if available + if ( useCache ) { + + // ...in a gzip-friendly way + node = elem; + outerCache = node[ expando ] || ( node[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ node.uniqueID ] || + ( outerCache[ node.uniqueID ] = {} ); + + cache = uniqueCache[ type ] || []; + nodeIndex = cache[ 0 ] === dirruns && cache[ 1 ]; + diff = nodeIndex; + } + + // xml :nth-child(...) + // or :nth-last-child(...) or :nth(-last)?-of-type(...) + if ( diff === false ) { + + // Use the same loop as above to seek `elem` from the start + while ( ( node = ++nodeIndex && node && node[ dir ] || + ( diff = nodeIndex = 0 ) || start.pop() ) ) { + + if ( ( ofType ? + node.nodeName.toLowerCase() === name : + node.nodeType === 1 ) && + ++diff ) { + + // Cache the index of each encountered element + if ( useCache ) { + outerCache = node[ expando ] || + ( node[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ node.uniqueID ] || + ( outerCache[ node.uniqueID ] = {} ); + + uniqueCache[ type ] = [ dirruns, diff ]; + } + + if ( node === elem ) { + break; + } + } + } + } + } + + // Incorporate the offset, then check against cycle size + diff -= last; + return diff === first || ( diff % first === 0 && diff / first >= 0 ); + } + }; + }, + + "PSEUDO": function( pseudo, argument ) { + + // pseudo-class names are case-insensitive + // http://www.w3.org/TR/selectors/#pseudo-classes + // Prioritize by case sensitivity in case custom pseudos are added with uppercase letters + // Remember that setFilters inherits from pseudos + var args, + fn = Expr.pseudos[ pseudo ] || Expr.setFilters[ pseudo.toLowerCase() ] || + Sizzle.error( "unsupported pseudo: " + pseudo ); + + // The user may use createPseudo to indicate that + // arguments are needed to create the filter function + // just as Sizzle does + if ( fn[ expando ] ) { + return fn( argument ); + } + + // But maintain support for old signatures + if ( fn.length > 1 ) { + args = [ pseudo, pseudo, "", argument ]; + return Expr.setFilters.hasOwnProperty( pseudo.toLowerCase() ) ? 
+ markFunction( function( seed, matches ) { + var idx, + matched = fn( seed, argument ), + i = matched.length; + while ( i-- ) { + idx = indexOf( seed, matched[ i ] ); + seed[ idx ] = !( matches[ idx ] = matched[ i ] ); + } + } ) : + function( elem ) { + return fn( elem, 0, args ); + }; + } + + return fn; + } + }, + + pseudos: { + + // Potentially complex pseudos + "not": markFunction( function( selector ) { + + // Trim the selector passed to compile + // to avoid treating leading and trailing + // spaces as combinators + var input = [], + results = [], + matcher = compile( selector.replace( rtrim, "$1" ) ); + + return matcher[ expando ] ? + markFunction( function( seed, matches, _context, xml ) { + var elem, + unmatched = matcher( seed, null, xml, [] ), + i = seed.length; + + // Match elements unmatched by `matcher` + while ( i-- ) { + if ( ( elem = unmatched[ i ] ) ) { + seed[ i ] = !( matches[ i ] = elem ); + } + } + } ) : + function( elem, _context, xml ) { + input[ 0 ] = elem; + matcher( input, null, xml, results ); + + // Don't keep the element (issue #299) + input[ 0 ] = null; + return !results.pop(); + }; + } ), + + "has": markFunction( function( selector ) { + return function( elem ) { + return Sizzle( selector, elem ).length > 0; + }; + } ), + + "contains": markFunction( function( text ) { + text = text.replace( runescape, funescape ); + return function( elem ) { + return ( elem.textContent || getText( elem ) ).indexOf( text ) > -1; + }; + } ), + + // "Whether an element is represented by a :lang() selector + // is based solely on the element's language value + // being equal to the identifier C, + // or beginning with the identifier C immediately followed by "-". + // The matching of C against the element's language value is performed case-insensitively. + // The identifier C does not have to be a valid language name." + // http://www.w3.org/TR/selectors/#lang-pseudo + "lang": markFunction( function( lang ) { + + // lang value must be a valid identifier + if ( !ridentifier.test( lang || "" ) ) { + Sizzle.error( "unsupported lang: " + lang ); + } + lang = lang.replace( runescape, funescape ).toLowerCase(); + return function( elem ) { + var elemLang; + do { + if ( ( elemLang = documentIsHTML ? 
+ elem.lang : + elem.getAttribute( "xml:lang" ) || elem.getAttribute( "lang" ) ) ) { + + elemLang = elemLang.toLowerCase(); + return elemLang === lang || elemLang.indexOf( lang + "-" ) === 0; + } + } while ( ( elem = elem.parentNode ) && elem.nodeType === 1 ); + return false; + }; + } ), + + // Miscellaneous + "target": function( elem ) { + var hash = window.location && window.location.hash; + return hash && hash.slice( 1 ) === elem.id; + }, + + "root": function( elem ) { + return elem === docElem; + }, + + "focus": function( elem ) { + return elem === document.activeElement && + ( !document.hasFocus || document.hasFocus() ) && + !!( elem.type || elem.href || ~elem.tabIndex ); + }, + + // Boolean properties + "enabled": createDisabledPseudo( false ), + "disabled": createDisabledPseudo( true ), + + "checked": function( elem ) { + + // In CSS3, :checked should return both checked and selected elements + // http://www.w3.org/TR/2011/REC-css3-selectors-20110929/#checked + var nodeName = elem.nodeName.toLowerCase(); + return ( nodeName === "input" && !!elem.checked ) || + ( nodeName === "option" && !!elem.selected ); + }, + + "selected": function( elem ) { + + // Accessing this property makes selected-by-default + // options in Safari work properly + if ( elem.parentNode ) { + // eslint-disable-next-line no-unused-expressions + elem.parentNode.selectedIndex; + } + + return elem.selected === true; + }, + + // Contents + "empty": function( elem ) { + + // http://www.w3.org/TR/selectors/#empty-pseudo + // :empty is negated by element (1) or content nodes (text: 3; cdata: 4; entity ref: 5), + // but not by others (comment: 8; processing instruction: 7; etc.) + // nodeType < 6 works because attributes (2) do not appear as children + for ( elem = elem.firstChild; elem; elem = elem.nextSibling ) { + if ( elem.nodeType < 6 ) { + return false; + } + } + return true; + }, + + "parent": function( elem ) { + return !Expr.pseudos[ "empty" ]( elem ); + }, + + // Element/input types + "header": function( elem ) { + return rheader.test( elem.nodeName ); + }, + + "input": function( elem ) { + return rinputs.test( elem.nodeName ); + }, + + "button": function( elem ) { + var name = elem.nodeName.toLowerCase(); + return name === "input" && elem.type === "button" || name === "button"; + }, + + "text": function( elem ) { + var attr; + return elem.nodeName.toLowerCase() === "input" && + elem.type === "text" && + + // Support: IE<8 + // New HTML5 attribute values (e.g., "search") appear with elem.type === "text" + ( ( attr = elem.getAttribute( "type" ) ) == null || + attr.toLowerCase() === "text" ); + }, + + // Position-in-collection + "first": createPositionalPseudo( function() { + return [ 0 ]; + } ), + + "last": createPositionalPseudo( function( _matchIndexes, length ) { + return [ length - 1 ]; + } ), + + "eq": createPositionalPseudo( function( _matchIndexes, length, argument ) { + return [ argument < 0 ? argument + length : argument ]; + } ), + + "even": createPositionalPseudo( function( matchIndexes, length ) { + var i = 0; + for ( ; i < length; i += 2 ) { + matchIndexes.push( i ); + } + return matchIndexes; + } ), + + "odd": createPositionalPseudo( function( matchIndexes, length ) { + var i = 1; + for ( ; i < length; i += 2 ) { + matchIndexes.push( i ); + } + return matchIndexes; + } ), + + "lt": createPositionalPseudo( function( matchIndexes, length, argument ) { + var i = argument < 0 ? + argument + length : + argument > length ? 
+ length : + argument; + for ( ; --i >= 0; ) { + matchIndexes.push( i ); + } + return matchIndexes; + } ), + + "gt": createPositionalPseudo( function( matchIndexes, length, argument ) { + var i = argument < 0 ? argument + length : argument; + for ( ; ++i < length; ) { + matchIndexes.push( i ); + } + return matchIndexes; + } ) + } +}; + +Expr.pseudos[ "nth" ] = Expr.pseudos[ "eq" ]; + +// Add button/input type pseudos +for ( i in { radio: true, checkbox: true, file: true, password: true, image: true } ) { + Expr.pseudos[ i ] = createInputPseudo( i ); +} +for ( i in { submit: true, reset: true } ) { + Expr.pseudos[ i ] = createButtonPseudo( i ); +} + +// Easy API for creating new setFilters +function setFilters() {} +setFilters.prototype = Expr.filters = Expr.pseudos; +Expr.setFilters = new setFilters(); + +tokenize = Sizzle.tokenize = function( selector, parseOnly ) { + var matched, match, tokens, type, + soFar, groups, preFilters, + cached = tokenCache[ selector + " " ]; + + if ( cached ) { + return parseOnly ? 0 : cached.slice( 0 ); + } + + soFar = selector; + groups = []; + preFilters = Expr.preFilter; + + while ( soFar ) { + + // Comma and first run + if ( !matched || ( match = rcomma.exec( soFar ) ) ) { + if ( match ) { + + // Don't consume trailing commas as valid + soFar = soFar.slice( match[ 0 ].length ) || soFar; + } + groups.push( ( tokens = [] ) ); + } + + matched = false; + + // Combinators + if ( ( match = rcombinators.exec( soFar ) ) ) { + matched = match.shift(); + tokens.push( { + value: matched, + + // Cast descendant combinators to space + type: match[ 0 ].replace( rtrim, " " ) + } ); + soFar = soFar.slice( matched.length ); + } + + // Filters + for ( type in Expr.filter ) { + if ( ( match = matchExpr[ type ].exec( soFar ) ) && ( !preFilters[ type ] || + ( match = preFilters[ type ]( match ) ) ) ) { + matched = match.shift(); + tokens.push( { + value: matched, + type: type, + matches: match + } ); + soFar = soFar.slice( matched.length ); + } + } + + if ( !matched ) { + break; + } + } + + // Return the length of the invalid excess + // if we're just parsing + // Otherwise, throw an error or return tokens + return parseOnly ? + soFar.length : + soFar ? + Sizzle.error( selector ) : + + // Cache the tokens + tokenCache( selector, groups ).slice( 0 ); +}; + +function toSelector( tokens ) { + var i = 0, + len = tokens.length, + selector = ""; + for ( ; i < len; i++ ) { + selector += tokens[ i ].value; + } + return selector; +} + +function addCombinator( matcher, combinator, base ) { + var dir = combinator.dir, + skip = combinator.next, + key = skip || dir, + checkNonElements = base && key === "parentNode", + doneName = done++; + + return combinator.first ? 
+ + // Check against closest ancestor/preceding element + function( elem, context, xml ) { + while ( ( elem = elem[ dir ] ) ) { + if ( elem.nodeType === 1 || checkNonElements ) { + return matcher( elem, context, xml ); + } + } + return false; + } : + + // Check against all ancestor/preceding elements + function( elem, context, xml ) { + var oldCache, uniqueCache, outerCache, + newCache = [ dirruns, doneName ]; + + // We can't set arbitrary data on XML nodes, so they don't benefit from combinator caching + if ( xml ) { + while ( ( elem = elem[ dir ] ) ) { + if ( elem.nodeType === 1 || checkNonElements ) { + if ( matcher( elem, context, xml ) ) { + return true; + } + } + } + } else { + while ( ( elem = elem[ dir ] ) ) { + if ( elem.nodeType === 1 || checkNonElements ) { + outerCache = elem[ expando ] || ( elem[ expando ] = {} ); + + // Support: IE <9 only + // Defend against cloned attroperties (jQuery gh-1709) + uniqueCache = outerCache[ elem.uniqueID ] || + ( outerCache[ elem.uniqueID ] = {} ); + + if ( skip && skip === elem.nodeName.toLowerCase() ) { + elem = elem[ dir ] || elem; + } else if ( ( oldCache = uniqueCache[ key ] ) && + oldCache[ 0 ] === dirruns && oldCache[ 1 ] === doneName ) { + + // Assign to newCache so results back-propagate to previous elements + return ( newCache[ 2 ] = oldCache[ 2 ] ); + } else { + + // Reuse newcache so results back-propagate to previous elements + uniqueCache[ key ] = newCache; + + // A match means we're done; a fail means we have to keep checking + if ( ( newCache[ 2 ] = matcher( elem, context, xml ) ) ) { + return true; + } + } + } + } + } + return false; + }; +} + +function elementMatcher( matchers ) { + return matchers.length > 1 ? + function( elem, context, xml ) { + var i = matchers.length; + while ( i-- ) { + if ( !matchers[ i ]( elem, context, xml ) ) { + return false; + } + } + return true; + } : + matchers[ 0 ]; +} + +function multipleContexts( selector, contexts, results ) { + var i = 0, + len = contexts.length; + for ( ; i < len; i++ ) { + Sizzle( selector, contexts[ i ], results ); + } + return results; +} + +function condense( unmatched, map, filter, context, xml ) { + var elem, + newUnmatched = [], + i = 0, + len = unmatched.length, + mapped = map != null; + + for ( ; i < len; i++ ) { + if ( ( elem = unmatched[ i ] ) ) { + if ( !filter || filter( elem, context, xml ) ) { + newUnmatched.push( elem ); + if ( mapped ) { + map.push( i ); + } + } + } + } + + return newUnmatched; +} + +function setMatcher( preFilter, selector, matcher, postFilter, postFinder, postSelector ) { + if ( postFilter && !postFilter[ expando ] ) { + postFilter = setMatcher( postFilter ); + } + if ( postFinder && !postFinder[ expando ] ) { + postFinder = setMatcher( postFinder, postSelector ); + } + return markFunction( function( seed, results, context, xml ) { + var temp, i, elem, + preMap = [], + postMap = [], + preexisting = results.length, + + // Get initial elements from seed or context + elems = seed || multipleContexts( + selector || "*", + context.nodeType ? [ context ] : context, + [] + ), + + // Prefilter to get matcher input, preserving a map for seed-results synchronization + matcherIn = preFilter && ( seed || !selector ) ? + condense( elems, preMap, preFilter, context, xml ) : + elems, + + matcherOut = matcher ? + + // If we have a postFinder, or filtered seed, or non-seed postFilter or preexisting results, + postFinder || ( seed ? preFilter : preexisting || postFilter ) ? 
+ + // ...intermediate processing is necessary + [] : + + // ...otherwise use results directly + results : + matcherIn; + + // Find primary matches + if ( matcher ) { + matcher( matcherIn, matcherOut, context, xml ); + } + + // Apply postFilter + if ( postFilter ) { + temp = condense( matcherOut, postMap ); + postFilter( temp, [], context, xml ); + + // Un-match failing elements by moving them back to matcherIn + i = temp.length; + while ( i-- ) { + if ( ( elem = temp[ i ] ) ) { + matcherOut[ postMap[ i ] ] = !( matcherIn[ postMap[ i ] ] = elem ); + } + } + } + + if ( seed ) { + if ( postFinder || preFilter ) { + if ( postFinder ) { + + // Get the final matcherOut by condensing this intermediate into postFinder contexts + temp = []; + i = matcherOut.length; + while ( i-- ) { + if ( ( elem = matcherOut[ i ] ) ) { + + // Restore matcherIn since elem is not yet a final match + temp.push( ( matcherIn[ i ] = elem ) ); + } + } + postFinder( null, ( matcherOut = [] ), temp, xml ); + } + + // Move matched elements from seed to results to keep them synchronized + i = matcherOut.length; + while ( i-- ) { + if ( ( elem = matcherOut[ i ] ) && + ( temp = postFinder ? indexOf( seed, elem ) : preMap[ i ] ) > -1 ) { + + seed[ temp ] = !( results[ temp ] = elem ); + } + } + } + + // Add elements to results, through postFinder if defined + } else { + matcherOut = condense( + matcherOut === results ? + matcherOut.splice( preexisting, matcherOut.length ) : + matcherOut + ); + if ( postFinder ) { + postFinder( null, results, matcherOut, xml ); + } else { + push.apply( results, matcherOut ); + } + } + } ); +} + +function matcherFromTokens( tokens ) { + var checkContext, matcher, j, + len = tokens.length, + leadingRelative = Expr.relative[ tokens[ 0 ].type ], + implicitRelative = leadingRelative || Expr.relative[ " " ], + i = leadingRelative ? 1 : 0, + + // The foundational matcher ensures that elements are reachable from top-level context(s) + matchContext = addCombinator( function( elem ) { + return elem === checkContext; + }, implicitRelative, true ), + matchAnyContext = addCombinator( function( elem ) { + return indexOf( checkContext, elem ) > -1; + }, implicitRelative, true ), + matchers = [ function( elem, context, xml ) { + var ret = ( !leadingRelative && ( xml || context !== outermostContext ) ) || ( + ( checkContext = context ).nodeType ? + matchContext( elem, context, xml ) : + matchAnyContext( elem, context, xml ) ); + + // Avoid hanging onto element (issue #299) + checkContext = null; + return ret; + } ]; + + for ( ; i < len; i++ ) { + if ( ( matcher = Expr.relative[ tokens[ i ].type ] ) ) { + matchers = [ addCombinator( elementMatcher( matchers ), matcher ) ]; + } else { + matcher = Expr.filter[ tokens[ i ].type ].apply( null, tokens[ i ].matches ); + + // Return special upon seeing a positional matcher + if ( matcher[ expando ] ) { + + // Find the next relative operator (if any) for proper handling + j = ++i; + for ( ; j < len; j++ ) { + if ( Expr.relative[ tokens[ j ].type ] ) { + break; + } + } + return setMatcher( + i > 1 && elementMatcher( matchers ), + i > 1 && toSelector( + + // If the preceding token was a descendant combinator, insert an implicit any-element `*` + tokens + .slice( 0, i - 1 ) + .concat( { value: tokens[ i - 2 ].type === " " ? 
"*" : "" } ) + ).replace( rtrim, "$1" ), + matcher, + i < j && matcherFromTokens( tokens.slice( i, j ) ), + j < len && matcherFromTokens( ( tokens = tokens.slice( j ) ) ), + j < len && toSelector( tokens ) + ); + } + matchers.push( matcher ); + } + } + + return elementMatcher( matchers ); +} + +function matcherFromGroupMatchers( elementMatchers, setMatchers ) { + var bySet = setMatchers.length > 0, + byElement = elementMatchers.length > 0, + superMatcher = function( seed, context, xml, results, outermost ) { + var elem, j, matcher, + matchedCount = 0, + i = "0", + unmatched = seed && [], + setMatched = [], + contextBackup = outermostContext, + + // We must always have either seed elements or outermost context + elems = seed || byElement && Expr.find[ "TAG" ]( "*", outermost ), + + // Use integer dirruns iff this is the outermost matcher + dirrunsUnique = ( dirruns += contextBackup == null ? 1 : Math.random() || 0.1 ), + len = elems.length; + + if ( outermost ) { + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + outermostContext = context == document || context || outermost; + } + + // Add elements passing elementMatchers directly to results + // Support: IE<9, Safari + // Tolerate NodeList properties (IE: "length"; Safari: ) matching elements by id + for ( ; i !== len && ( elem = elems[ i ] ) != null; i++ ) { + if ( byElement && elem ) { + j = 0; + + // Support: IE 11+, Edge 17 - 18+ + // IE/Edge sometimes throw a "Permission denied" error when strict-comparing + // two documents; shallow comparisons work. + // eslint-disable-next-line eqeqeq + if ( !context && elem.ownerDocument != document ) { + setDocument( elem ); + xml = !documentIsHTML; + } + while ( ( matcher = elementMatchers[ j++ ] ) ) { + if ( matcher( elem, context || document, xml ) ) { + results.push( elem ); + break; + } + } + if ( outermost ) { + dirruns = dirrunsUnique; + } + } + + // Track unmatched elements for set filters + if ( bySet ) { + + // They will have gone through all possible matchers + if ( ( elem = !matcher && elem ) ) { + matchedCount--; + } + + // Lengthen the array for every element, matched or not + if ( seed ) { + unmatched.push( elem ); + } + } + } + + // `i` is now the count of elements visited above, and adding it to `matchedCount` + // makes the latter nonnegative. + matchedCount += i; + + // Apply set filters to unmatched elements + // NOTE: This can be skipped if there are no unmatched elements (i.e., `matchedCount` + // equals `i`), unless we didn't visit _any_ elements in the above loop because we have + // no element matchers and no seed. + // Incrementing an initially-string "0" `i` allows `i` to remain a string only in that + // case, which will result in a "00" `matchedCount` that differs from `i` but is also + // numerically zero. 
+ if ( bySet && i !== matchedCount ) { + j = 0; + while ( ( matcher = setMatchers[ j++ ] ) ) { + matcher( unmatched, setMatched, context, xml ); + } + + if ( seed ) { + + // Reintegrate element matches to eliminate the need for sorting + if ( matchedCount > 0 ) { + while ( i-- ) { + if ( !( unmatched[ i ] || setMatched[ i ] ) ) { + setMatched[ i ] = pop.call( results ); + } + } + } + + // Discard index placeholder values to get only actual matches + setMatched = condense( setMatched ); + } + + // Add matches to results + push.apply( results, setMatched ); + + // Seedless set matches succeeding multiple successful matchers stipulate sorting + if ( outermost && !seed && setMatched.length > 0 && + ( matchedCount + setMatchers.length ) > 1 ) { + + Sizzle.uniqueSort( results ); + } + } + + // Override manipulation of globals by nested matchers + if ( outermost ) { + dirruns = dirrunsUnique; + outermostContext = contextBackup; + } + + return unmatched; + }; + + return bySet ? + markFunction( superMatcher ) : + superMatcher; +} + +compile = Sizzle.compile = function( selector, match /* Internal Use Only */ ) { + var i, + setMatchers = [], + elementMatchers = [], + cached = compilerCache[ selector + " " ]; + + if ( !cached ) { + + // Generate a function of recursive functions that can be used to check each element + if ( !match ) { + match = tokenize( selector ); + } + i = match.length; + while ( i-- ) { + cached = matcherFromTokens( match[ i ] ); + if ( cached[ expando ] ) { + setMatchers.push( cached ); + } else { + elementMatchers.push( cached ); + } + } + + // Cache the compiled function + cached = compilerCache( + selector, + matcherFromGroupMatchers( elementMatchers, setMatchers ) + ); + + // Save selector and tokenization + cached.selector = selector; + } + return cached; +}; + +/** + * A low-level selection function that works with Sizzle's compiled + * selector functions + * @param {String|Function} selector A selector or a pre-compiled + * selector function built with Sizzle.compile + * @param {Element} context + * @param {Array} [results] + * @param {Array} [seed] A set of elements to match against + */ +select = Sizzle.select = function( selector, context, results, seed ) { + var i, tokens, token, type, find, + compiled = typeof selector === "function" && selector, + match = !seed && tokenize( ( selector = compiled.selector || selector ) ); + + results = results || []; + + // Try to minimize operations if there is only one selector in the list and no seed + // (the latter of which guarantees us context) + if ( match.length === 1 ) { + + // Reduce context if the leading compound selector is an ID + tokens = match[ 0 ] = match[ 0 ].slice( 0 ); + if ( tokens.length > 2 && ( token = tokens[ 0 ] ).type === "ID" && + context.nodeType === 9 && documentIsHTML && Expr.relative[ tokens[ 1 ].type ] ) { + + context = ( Expr.find[ "ID" ]( token.matches[ 0 ] + .replace( runescape, funescape ), context ) || [] )[ 0 ]; + if ( !context ) { + return results; + + // Precompiled matchers will still verify ancestry, so step up a level + } else if ( compiled ) { + context = context.parentNode; + } + + selector = selector.slice( tokens.shift().value.length ); + } + + // Fetch a seed set for right-to-left matching + i = matchExpr[ "needsContext" ].test( selector ) ? 
0 : tokens.length;
+		while ( i-- ) {
+			token = tokens[ i ];
+
+			// Abort if we hit a combinator
+			if ( Expr.relative[ ( type = token.type ) ] ) {
+				break;
+			}
+			if ( ( find = Expr.find[ type ] ) ) {
+
+				// Search, expanding context for leading sibling combinators
+				if ( ( seed = find(
+					token.matches[ 0 ].replace( runescape, funescape ),
+					rsibling.test( tokens[ 0 ].type ) && testContext( context.parentNode ) ||
+						context
+				) ) ) {
+
+					// If seed is empty or no tokens remain, we can return early
+					tokens.splice( i, 1 );
+					selector = seed.length && toSelector( tokens );
+					if ( !selector ) {
+						push.apply( results, seed );
+						return results;
+					}
+
+					break;
+				}
+			}
+		}
+	}
+
+	// Compile and execute a filtering function if one is not provided
+	// Provide `match` to avoid retokenization if we modified the selector above
+	( compiled || compile( selector, match ) )(
+		seed,
+		context,
+		!documentIsHTML,
+		results,
+		!context || rsibling.test( selector ) && testContext( context.parentNode ) || context
+	);
+	return results;
+};
+
+// One-time assignments
+
+// Sort stability
+support.sortStable = expando.split( "" ).sort( sortOrder ).join( "" ) === expando;
+
+// Support: Chrome 14-35+
+// Always assume duplicates if they aren't passed to the comparison function
+support.detectDuplicates = !!hasDuplicate;
+
+// Initialize against the default document
+setDocument();
+
+// Support: Webkit<537.32 - Safari 6.0.3/Chrome 25 (fixed in Chrome 27)
+// Detached nodes confoundingly follow *each other*
+support.sortDetached = assert( function( el ) {
+
+	// Should return 1, but returns 4 (following)
+	return el.compareDocumentPosition( document.createElement( "fieldset" ) ) & 1;
+} );
+
+// Support: IE<8
+// Prevent attribute/property "interpolation"
+// https://msdn.microsoft.com/en-us/library/ms536429%28VS.85%29.aspx
+if ( !assert( function( el ) {
+	el.innerHTML = "<a href='#'></a>";
+	return el.firstChild.getAttribute( "href" ) === "#";
+} ) ) {
+	addHandle( "type|href|height|width", function( elem, name, isXML ) {
+		if ( !isXML ) {
+			return elem.getAttribute( name, name.toLowerCase() === "type" ? 1 : 2 );
+		}
+	} );
+}
+
+// Support: IE<9
+// Use defaultValue in place of getAttribute("value")
+if ( !support.attributes || !assert( function( el ) {
+	el.innerHTML = "<input/>";
+	el.firstChild.setAttribute( "value", "" );
+	return el.firstChild.getAttribute( "value" ) === "";
+} ) ) {
+	addHandle( "value", function( elem, _name, isXML ) {
+		if ( !isXML && elem.nodeName.toLowerCase() === "input" ) {
+			return elem.defaultValue;
+		}
+	} );
+}
+
+// Support: IE<9
+// Use getAttributeNode to fetch booleans when getAttribute lies
+if ( !assert( function( el ) {
+	return el.getAttribute( "disabled" ) == null;
+} ) ) {
+	addHandle( booleans, function( elem, name, isXML ) {
+		var val;
+		if ( !isXML ) {
+			return elem[ name ] === true ? name.toLowerCase() :
+				( val = elem.getAttributeNode( name ) ) && val.specified ?
+ val.value : + null; + } + } ); +} + +return Sizzle; + +} )( window ); + + + +jQuery.find = Sizzle; +jQuery.expr = Sizzle.selectors; + +// Deprecated +jQuery.expr[ ":" ] = jQuery.expr.pseudos; +jQuery.uniqueSort = jQuery.unique = Sizzle.uniqueSort; +jQuery.text = Sizzle.getText; +jQuery.isXMLDoc = Sizzle.isXML; +jQuery.contains = Sizzle.contains; +jQuery.escapeSelector = Sizzle.escape; + + + + +var dir = function( elem, dir, until ) { + var matched = [], + truncate = until !== undefined; + + while ( ( elem = elem[ dir ] ) && elem.nodeType !== 9 ) { + if ( elem.nodeType === 1 ) { + if ( truncate && jQuery( elem ).is( until ) ) { + break; + } + matched.push( elem ); + } + } + return matched; +}; + + +var siblings = function( n, elem ) { + var matched = []; + + for ( ; n; n = n.nextSibling ) { + if ( n.nodeType === 1 && n !== elem ) { + matched.push( n ); + } + } + + return matched; +}; + + +var rneedsContext = jQuery.expr.match.needsContext; + + + +function nodeName( elem, name ) { + + return elem.nodeName && elem.nodeName.toLowerCase() === name.toLowerCase(); + +} +var rsingleTag = ( /^<([a-z][^\/\0>:\x20\t\r\n\f]*)[\x20\t\r\n\f]*\/?>(?:<\/\1>|)$/i ); + + + +// Implement the identical functionality for filter and not +function winnow( elements, qualifier, not ) { + if ( isFunction( qualifier ) ) { + return jQuery.grep( elements, function( elem, i ) { + return !!qualifier.call( elem, i, elem ) !== not; + } ); + } + + // Single element + if ( qualifier.nodeType ) { + return jQuery.grep( elements, function( elem ) { + return ( elem === qualifier ) !== not; + } ); + } + + // Arraylike of elements (jQuery, arguments, Array) + if ( typeof qualifier !== "string" ) { + return jQuery.grep( elements, function( elem ) { + return ( indexOf.call( qualifier, elem ) > -1 ) !== not; + } ); + } + + // Filtered directly for both simple and complex selectors + return jQuery.filter( qualifier, elements, not ); +} + +jQuery.filter = function( expr, elems, not ) { + var elem = elems[ 0 ]; + + if ( not ) { + expr = ":not(" + expr + ")"; + } + + if ( elems.length === 1 && elem.nodeType === 1 ) { + return jQuery.find.matchesSelector( elem, expr ) ? [ elem ] : []; + } + + return jQuery.find.matches( expr, jQuery.grep( elems, function( elem ) { + return elem.nodeType === 1; + } ) ); +}; + +jQuery.fn.extend( { + find: function( selector ) { + var i, ret, + len = this.length, + self = this; + + if ( typeof selector !== "string" ) { + return this.pushStack( jQuery( selector ).filter( function() { + for ( i = 0; i < len; i++ ) { + if ( jQuery.contains( self[ i ], this ) ) { + return true; + } + } + } ) ); + } + + ret = this.pushStack( [] ); + + for ( i = 0; i < len; i++ ) { + jQuery.find( selector, self[ i ], ret ); + } + + return len > 1 ? jQuery.uniqueSort( ret ) : ret; + }, + filter: function( selector ) { + return this.pushStack( winnow( this, selector || [], false ) ); + }, + not: function( selector ) { + return this.pushStack( winnow( this, selector || [], true ) ); + }, + is: function( selector ) { + return !!winnow( + this, + + // If this is a positional/relative selector, check membership in the returned set + // so $("p:first").is("p:last") won't return true for a doc with two "p". + typeof selector === "string" && rneedsContext.test( selector ) ? 
+				jQuery( selector ) :
+				selector || [],
+			false
+		).length;
+	}
+} );
+
+
+// Initialize a jQuery object
+
+
+// A central reference to the root jQuery(document)
+var rootjQuery,
+
+	// A simple way to check for HTML strings
+	// Prioritize #id over <tag> to avoid XSS via location.hash (#9521)
+	// Strict HTML recognition (#11290: must start with <)
+	// Shortcut simple #id case for speed
+	rquickExpr = /^(?:\s*(<[\w\W]+>)[^>]*|#([\w-]+))$/,
+
+	init = jQuery.fn.init = function( selector, context, root ) {
+		var match, elem;
+
+		// HANDLE: $(""), $(null), $(undefined), $(false)
+		if ( !selector ) {
+			return this;
+		}
+
+		// Method init() accepts an alternate rootjQuery
+		// so migrate can support jQuery.sub (gh-2101)
+		root = root || rootjQuery;
+
+		// Handle HTML strings
+		if ( typeof selector === "string" ) {
+			if ( selector[ 0 ] === "<" &&
+				selector[ selector.length - 1 ] === ">" &&
+				selector.length >= 3 ) {
+
+				// Assume that strings that start and end with <> are HTML and skip the regex check
+				match = [ null, selector, null ];
+
+			} else {
+				match = rquickExpr.exec( selector );
+			}
+
+			// Match html or make sure no context is specified for #id
+			if ( match && ( match[ 1 ] || !context ) ) {
+
+				// HANDLE: $(html) -> $(array)
+				if ( match[ 1 ] ) {
+					context = context instanceof jQuery ? context[ 0 ] : context;
+
+					// Option to run scripts is true for back-compat
+					// Intentionally let the error be thrown if parseHTML is not present
+					jQuery.merge( this, jQuery.parseHTML(
+						match[ 1 ],
+						context && context.nodeType ? context.ownerDocument || context : document,
+						true
+					) );
+
+					// HANDLE: $(html, props)
+					if ( rsingleTag.test( match[ 1 ] ) && jQuery.isPlainObject( context ) ) {
+						for ( match in context ) {
+
+							// Properties of context are called as methods if possible
+							if ( isFunction( this[ match ] ) ) {
+								this[ match ]( context[ match ] );
+
+							// ...and otherwise set as attributes
+							} else {
+								this.attr( match, context[ match ] );
+							}
+						}
+					}
+
+					return this;
+
+				// HANDLE: $(#id)
+				} else {
+					elem = document.getElementById( match[ 2 ] );
+
+					if ( elem ) {
+
+						// Inject the element directly into the jQuery object
+						this[ 0 ] = elem;
+						this.length = 1;
+					}
+					return this;
+				}
+
+			// HANDLE: $(expr, $(...))
+			} else if ( !context || context.jquery ) {
+				return ( context || root ).find( selector );
+
+			// HANDLE: $(expr, context)
+			// (which is just equivalent to: $(context).find(expr)
+			} else {
+				return this.constructor( context ).find( selector );
+			}
+
+		// HANDLE: $(DOMElement)
+		} else if ( selector.nodeType ) {
+			this[ 0 ] = selector;
+			this.length = 1;
+			return this;
+
+		// HANDLE: $(function)
+		// Shortcut for document ready
+		} else if ( isFunction( selector ) ) {
+			return root.ready !== undefined ?
+ root.ready( selector ) : + + // Execute immediately if ready is not present + selector( jQuery ); + } + + return jQuery.makeArray( selector, this ); + }; + +// Give the init function the jQuery prototype for later instantiation +init.prototype = jQuery.fn; + +// Initialize central reference +rootjQuery = jQuery( document ); + + +var rparentsprev = /^(?:parents|prev(?:Until|All))/, + + // Methods guaranteed to produce a unique set when starting from a unique set + guaranteedUnique = { + children: true, + contents: true, + next: true, + prev: true + }; + +jQuery.fn.extend( { + has: function( target ) { + var targets = jQuery( target, this ), + l = targets.length; + + return this.filter( function() { + var i = 0; + for ( ; i < l; i++ ) { + if ( jQuery.contains( this, targets[ i ] ) ) { + return true; + } + } + } ); + }, + + closest: function( selectors, context ) { + var cur, + i = 0, + l = this.length, + matched = [], + targets = typeof selectors !== "string" && jQuery( selectors ); + + // Positional selectors never match, since there's no _selection_ context + if ( !rneedsContext.test( selectors ) ) { + for ( ; i < l; i++ ) { + for ( cur = this[ i ]; cur && cur !== context; cur = cur.parentNode ) { + + // Always skip document fragments + if ( cur.nodeType < 11 && ( targets ? + targets.index( cur ) > -1 : + + // Don't pass non-elements to Sizzle + cur.nodeType === 1 && + jQuery.find.matchesSelector( cur, selectors ) ) ) { + + matched.push( cur ); + break; + } + } + } + } + + return this.pushStack( matched.length > 1 ? jQuery.uniqueSort( matched ) : matched ); + }, + + // Determine the position of an element within the set + index: function( elem ) { + + // No argument, return index in parent + if ( !elem ) { + return ( this[ 0 ] && this[ 0 ].parentNode ) ? this.first().prevAll().length : -1; + } + + // Index in selector + if ( typeof elem === "string" ) { + return indexOf.call( jQuery( elem ), this[ 0 ] ); + } + + // Locate the position of the desired element + return indexOf.call( this, + + // If it receives a jQuery object, the first element is used + elem.jquery ? elem[ 0 ] : elem + ); + }, + + add: function( selector, context ) { + return this.pushStack( + jQuery.uniqueSort( + jQuery.merge( this.get(), jQuery( selector, context ) ) + ) + ); + }, + + addBack: function( selector ) { + return this.add( selector == null ? + this.prevObject : this.prevObject.filter( selector ) + ); + } +} ); + +function sibling( cur, dir ) { + while ( ( cur = cur[ dir ] ) && cur.nodeType !== 1 ) {} + return cur; +} + +jQuery.each( { + parent: function( elem ) { + var parent = elem.parentNode; + return parent && parent.nodeType !== 11 ? 
parent : null;
+	},
+	parents: function( elem ) {
+		return dir( elem, "parentNode" );
+	},
+	parentsUntil: function( elem, _i, until ) {
+		return dir( elem, "parentNode", until );
+	},
+	next: function( elem ) {
+		return sibling( elem, "nextSibling" );
+	},
+	prev: function( elem ) {
+		return sibling( elem, "previousSibling" );
+	},
+	nextAll: function( elem ) {
+		return dir( elem, "nextSibling" );
+	},
+	prevAll: function( elem ) {
+		return dir( elem, "previousSibling" );
+	},
+	nextUntil: function( elem, _i, until ) {
+		return dir( elem, "nextSibling", until );
+	},
+	prevUntil: function( elem, _i, until ) {
+		return dir( elem, "previousSibling", until );
+	},
+	siblings: function( elem ) {
+		return siblings( ( elem.parentNode || {} ).firstChild, elem );
+	},
+	children: function( elem ) {
+		return siblings( elem.firstChild );
+	},
+	contents: function( elem ) {
+		if ( elem.contentDocument != null &&
+
+			// Support: IE 11+
+			// <object> elements with no `data` attribute has an object
+			// `contentDocument` with a `null` prototype.
+			getProto( elem.contentDocument ) ) {
+
+			return elem.contentDocument;
+		}
+
+		// Support: IE 9 - 11 only, iOS 7 only, Android Browser <=4.3 only
+		// Treat the template element as a regular one in browsers that
+		// don't support it.
+		if ( nodeName( elem, "template" ) ) {
+			elem = elem.content || elem;
+		}
+
+		return jQuery.merge( [], elem.childNodes );
+	}
+}, function( name, fn ) {
+	jQuery.fn[ name ] = function( until, selector ) {
+		var matched = jQuery.map( this, fn, until );
+
+		if ( name.slice( -5 ) !== "Until" ) {
+			selector = until;
+		}
+
+		if ( selector && typeof selector === "string" ) {
+			matched = jQuery.filter( selector, matched );
+		}
+
+		if ( this.length > 1 ) {
+
+			// Remove duplicates
+			if ( !guaranteedUnique[ name ] ) {
+				jQuery.uniqueSort( matched );
+			}
+
+			// Reverse order for parents* and prev-derivatives
+			if ( rparentsprev.test( name ) ) {
+				matched.reverse();
+			}
+		}
+
+		return this.pushStack( matched );
+	};
+} );
+var rnothtmlwhite = ( /[^\x20\t\r\n\f]+/g );
+
+
+
+// Convert String-formatted options into Object-formatted ones
+function createOptions( options ) {
+	var object = {};
+	jQuery.each( options.match( rnothtmlwhite ) || [], function( _, flag ) {
+		object[ flag ] = true;
+	} );
+	return object;
+}
+
+/*
+ * Create a callback list using the following parameters:
+ *
+ *	options: an optional list of space-separated options that will change how
+ *			the callback list behaves or a more traditional option object
+ *
+ * By default a callback list will act like an event callback list and can be
+ * "fired" multiple times.
+ *
+ * Possible options:
+ *
+ *	once:			will ensure the callback list can only be fired once (like a Deferred)
+ *
+ *	memory:			will keep track of previous values and will call any callback added
+ *					after the list has been fired right away with the latest "memorized"
+ *					values (like a Deferred)
+ *
+ *	unique:			will ensure a callback can only be added once (no duplicate in the list)
+ *
+ *	stopOnFalse:	interrupt callings when a callback returns false
+ *
+ */
+jQuery.Callbacks = function( options ) {
+
+	// Convert options from String-formatted to Object-formatted if needed
+	// (we check in cache first)
+	options = typeof options === "string" ?
+ createOptions( options ) : + jQuery.extend( {}, options ); + + var // Flag to know if list is currently firing + firing, + + // Last fire value for non-forgettable lists + memory, + + // Flag to know if list was already fired + fired, + + // Flag to prevent firing + locked, + + // Actual callback list + list = [], + + // Queue of execution data for repeatable lists + queue = [], + + // Index of currently firing callback (modified by add/remove as needed) + firingIndex = -1, + + // Fire callbacks + fire = function() { + + // Enforce single-firing + locked = locked || options.once; + + // Execute callbacks for all pending executions, + // respecting firingIndex overrides and runtime changes + fired = firing = true; + for ( ; queue.length; firingIndex = -1 ) { + memory = queue.shift(); + while ( ++firingIndex < list.length ) { + + // Run callback and check for early termination + if ( list[ firingIndex ].apply( memory[ 0 ], memory[ 1 ] ) === false && + options.stopOnFalse ) { + + // Jump to end and forget the data so .add doesn't re-fire + firingIndex = list.length; + memory = false; + } + } + } + + // Forget the data if we're done with it + if ( !options.memory ) { + memory = false; + } + + firing = false; + + // Clean up if we're done firing for good + if ( locked ) { + + // Keep an empty list if we have data for future add calls + if ( memory ) { + list = []; + + // Otherwise, this object is spent + } else { + list = ""; + } + } + }, + + // Actual Callbacks object + self = { + + // Add a callback or a collection of callbacks to the list + add: function() { + if ( list ) { + + // If we have memory from a past run, we should fire after adding + if ( memory && !firing ) { + firingIndex = list.length - 1; + queue.push( memory ); + } + + ( function add( args ) { + jQuery.each( args, function( _, arg ) { + if ( isFunction( arg ) ) { + if ( !options.unique || !self.has( arg ) ) { + list.push( arg ); + } + } else if ( arg && arg.length && toType( arg ) !== "string" ) { + + // Inspect recursively + add( arg ); + } + } ); + } )( arguments ); + + if ( memory && !firing ) { + fire(); + } + } + return this; + }, + + // Remove a callback from the list + remove: function() { + jQuery.each( arguments, function( _, arg ) { + var index; + while ( ( index = jQuery.inArray( arg, list, index ) ) > -1 ) { + list.splice( index, 1 ); + + // Handle firing indexes + if ( index <= firingIndex ) { + firingIndex--; + } + } + } ); + return this; + }, + + // Check if a given callback is in the list. + // If no argument is given, return whether or not list has callbacks attached. + has: function( fn ) { + return fn ? + jQuery.inArray( fn, list ) > -1 : + list.length > 0; + }, + + // Remove all callbacks from the list + empty: function() { + if ( list ) { + list = []; + } + return this; + }, + + // Disable .fire and .add + // Abort any current/pending executions + // Clear all callbacks and values + disable: function() { + locked = queue = []; + list = memory = ""; + return this; + }, + disabled: function() { + return !list; + }, + + // Disable .fire + // Also disable .add unless we have memory (since it would have no effect) + // Abort any pending executions + lock: function() { + locked = queue = []; + if ( !memory && !firing ) { + list = memory = ""; + } + return this; + }, + locked: function() { + return !!locked; + }, + + // Call all callbacks with the given context and arguments + fireWith: function( context, args ) { + if ( !locked ) { + args = args || []; + args = [ context, args.slice ? 
args.slice() : args ]; + queue.push( args ); + if ( !firing ) { + fire(); + } + } + return this; + }, + + // Call all the callbacks with the given arguments + fire: function() { + self.fireWith( this, arguments ); + return this; + }, + + // To know if the callbacks have already been called at least once + fired: function() { + return !!fired; + } + }; + + return self; +}; + + +function Identity( v ) { + return v; +} +function Thrower( ex ) { + throw ex; +} + +function adoptValue( value, resolve, reject, noValue ) { + var method; + + try { + + // Check for promise aspect first to privilege synchronous behavior + if ( value && isFunction( ( method = value.promise ) ) ) { + method.call( value ).done( resolve ).fail( reject ); + + // Other thenables + } else if ( value && isFunction( ( method = value.then ) ) ) { + method.call( value, resolve, reject ); + + // Other non-thenables + } else { + + // Control `resolve` arguments by letting Array#slice cast boolean `noValue` to integer: + // * false: [ value ].slice( 0 ) => resolve( value ) + // * true: [ value ].slice( 1 ) => resolve() + resolve.apply( undefined, [ value ].slice( noValue ) ); + } + + // For Promises/A+, convert exceptions into rejections + // Since jQuery.when doesn't unwrap thenables, we can skip the extra checks appearing in + // Deferred#then to conditionally suppress rejection. + } catch ( value ) { + + // Support: Android 4.0 only + // Strict mode functions invoked without .call/.apply get global-object context + reject.apply( undefined, [ value ] ); + } +} + +jQuery.extend( { + + Deferred: function( func ) { + var tuples = [ + + // action, add listener, callbacks, + // ... .then handlers, argument index, [final state] + [ "notify", "progress", jQuery.Callbacks( "memory" ), + jQuery.Callbacks( "memory" ), 2 ], + [ "resolve", "done", jQuery.Callbacks( "once memory" ), + jQuery.Callbacks( "once memory" ), 0, "resolved" ], + [ "reject", "fail", jQuery.Callbacks( "once memory" ), + jQuery.Callbacks( "once memory" ), 1, "rejected" ] + ], + state = "pending", + promise = { + state: function() { + return state; + }, + always: function() { + deferred.done( arguments ).fail( arguments ); + return this; + }, + "catch": function( fn ) { + return promise.then( null, fn ); + }, + + // Keep pipe for back-compat + pipe: function( /* fnDone, fnFail, fnProgress */ ) { + var fns = arguments; + + return jQuery.Deferred( function( newDefer ) { + jQuery.each( tuples, function( _i, tuple ) { + + // Map tuples (progress, done, fail) to arguments (done, fail, progress) + var fn = isFunction( fns[ tuple[ 4 ] ] ) && fns[ tuple[ 4 ] ]; + + // deferred.progress(function() { bind to newDefer or newDefer.notify }) + // deferred.done(function() { bind to newDefer or newDefer.resolve }) + // deferred.fail(function() { bind to newDefer or newDefer.reject }) + deferred[ tuple[ 1 ] ]( function() { + var returned = fn && fn.apply( this, arguments ); + if ( returned && isFunction( returned.promise ) ) { + returned.promise() + .progress( newDefer.notify ) + .done( newDefer.resolve ) + .fail( newDefer.reject ); + } else { + newDefer[ tuple[ 0 ] + "With" ]( + this, + fn ? 
[ returned ] : arguments + ); + } + } ); + } ); + fns = null; + } ).promise(); + }, + then: function( onFulfilled, onRejected, onProgress ) { + var maxDepth = 0; + function resolve( depth, deferred, handler, special ) { + return function() { + var that = this, + args = arguments, + mightThrow = function() { + var returned, then; + + // Support: Promises/A+ section 2.3.3.3.3 + // https://promisesaplus.com/#point-59 + // Ignore double-resolution attempts + if ( depth < maxDepth ) { + return; + } + + returned = handler.apply( that, args ); + + // Support: Promises/A+ section 2.3.1 + // https://promisesaplus.com/#point-48 + if ( returned === deferred.promise() ) { + throw new TypeError( "Thenable self-resolution" ); + } + + // Support: Promises/A+ sections 2.3.3.1, 3.5 + // https://promisesaplus.com/#point-54 + // https://promisesaplus.com/#point-75 + // Retrieve `then` only once + then = returned && + + // Support: Promises/A+ section 2.3.4 + // https://promisesaplus.com/#point-64 + // Only check objects and functions for thenability + ( typeof returned === "object" || + typeof returned === "function" ) && + returned.then; + + // Handle a returned thenable + if ( isFunction( then ) ) { + + // Special processors (notify) just wait for resolution + if ( special ) { + then.call( + returned, + resolve( maxDepth, deferred, Identity, special ), + resolve( maxDepth, deferred, Thrower, special ) + ); + + // Normal processors (resolve) also hook into progress + } else { + + // ...and disregard older resolution values + maxDepth++; + + then.call( + returned, + resolve( maxDepth, deferred, Identity, special ), + resolve( maxDepth, deferred, Thrower, special ), + resolve( maxDepth, deferred, Identity, + deferred.notifyWith ) + ); + } + + // Handle all other returned values + } else { + + // Only substitute handlers pass on context + // and multiple values (non-spec behavior) + if ( handler !== Identity ) { + that = undefined; + args = [ returned ]; + } + + // Process the value(s) + // Default process is resolve + ( special || deferred.resolveWith )( that, args ); + } + }, + + // Only normal processors (resolve) catch and reject exceptions + process = special ? + mightThrow : + function() { + try { + mightThrow(); + } catch ( e ) { + + if ( jQuery.Deferred.exceptionHook ) { + jQuery.Deferred.exceptionHook( e, + process.stackTrace ); + } + + // Support: Promises/A+ section 2.3.3.3.4.1 + // https://promisesaplus.com/#point-61 + // Ignore post-resolution exceptions + if ( depth + 1 >= maxDepth ) { + + // Only substitute handlers pass on context + // and multiple values (non-spec behavior) + if ( handler !== Thrower ) { + that = undefined; + args = [ e ]; + } + + deferred.rejectWith( that, args ); + } + } + }; + + // Support: Promises/A+ section 2.3.3.3.1 + // https://promisesaplus.com/#point-57 + // Re-resolve promises immediately to dodge false rejection from + // subsequent errors + if ( depth ) { + process(); + } else { + + // Call an optional hook to record the stack, in case of exception + // since it's otherwise lost when execution goes async + if ( jQuery.Deferred.getStackHook ) { + process.stackTrace = jQuery.Deferred.getStackHook(); + } + window.setTimeout( process ); + } + }; + } + + return jQuery.Deferred( function( newDefer ) { + + // progress_handlers.add( ... ) + tuples[ 0 ][ 3 ].add( + resolve( + 0, + newDefer, + isFunction( onProgress ) ? + onProgress : + Identity, + newDefer.notifyWith + ) + ); + + // fulfilled_handlers.add( ... 
) + tuples[ 1 ][ 3 ].add( + resolve( + 0, + newDefer, + isFunction( onFulfilled ) ? + onFulfilled : + Identity + ) + ); + + // rejected_handlers.add( ... ) + tuples[ 2 ][ 3 ].add( + resolve( + 0, + newDefer, + isFunction( onRejected ) ? + onRejected : + Thrower + ) + ); + } ).promise(); + }, + + // Get a promise for this deferred + // If obj is provided, the promise aspect is added to the object + promise: function( obj ) { + return obj != null ? jQuery.extend( obj, promise ) : promise; + } + }, + deferred = {}; + + // Add list-specific methods + jQuery.each( tuples, function( i, tuple ) { + var list = tuple[ 2 ], + stateString = tuple[ 5 ]; + + // promise.progress = list.add + // promise.done = list.add + // promise.fail = list.add + promise[ tuple[ 1 ] ] = list.add; + + // Handle state + if ( stateString ) { + list.add( + function() { + + // state = "resolved" (i.e., fulfilled) + // state = "rejected" + state = stateString; + }, + + // rejected_callbacks.disable + // fulfilled_callbacks.disable + tuples[ 3 - i ][ 2 ].disable, + + // rejected_handlers.disable + // fulfilled_handlers.disable + tuples[ 3 - i ][ 3 ].disable, + + // progress_callbacks.lock + tuples[ 0 ][ 2 ].lock, + + // progress_handlers.lock + tuples[ 0 ][ 3 ].lock + ); + } + + // progress_handlers.fire + // fulfilled_handlers.fire + // rejected_handlers.fire + list.add( tuple[ 3 ].fire ); + + // deferred.notify = function() { deferred.notifyWith(...) } + // deferred.resolve = function() { deferred.resolveWith(...) } + // deferred.reject = function() { deferred.rejectWith(...) } + deferred[ tuple[ 0 ] ] = function() { + deferred[ tuple[ 0 ] + "With" ]( this === deferred ? undefined : this, arguments ); + return this; + }; + + // deferred.notifyWith = list.fireWith + // deferred.resolveWith = list.fireWith + // deferred.rejectWith = list.fireWith + deferred[ tuple[ 0 ] + "With" ] = list.fireWith; + } ); + + // Make the deferred a promise + promise.promise( deferred ); + + // Call given func if any + if ( func ) { + func.call( deferred, deferred ); + } + + // All done! + return deferred; + }, + + // Deferred helper + when: function( singleValue ) { + var + + // count of uncompleted subordinates + remaining = arguments.length, + + // count of unprocessed arguments + i = remaining, + + // subordinate fulfillment data + resolveContexts = Array( i ), + resolveValues = slice.call( arguments ), + + // the primary Deferred + primary = jQuery.Deferred(), + + // subordinate callback factory + updateFunc = function( i ) { + return function( value ) { + resolveContexts[ i ] = this; + resolveValues[ i ] = arguments.length > 1 ? slice.call( arguments ) : value; + if ( !( --remaining ) ) { + primary.resolveWith( resolveContexts, resolveValues ); + } + }; + }; + + // Single- and empty arguments are adopted like Promise.resolve + if ( remaining <= 1 ) { + adoptValue( singleValue, primary.done( updateFunc( i ) ).resolve, primary.reject, + !remaining ); + + // Use .then() to unwrap secondary thenables (cf. gh-3000) + if ( primary.state() === "pending" || + isFunction( resolveValues[ i ] && resolveValues[ i ].then ) ) { + + return primary.then(); + } + } + + // Multiple arguments are aggregated like Promise.all array elements + while ( i-- ) { + adoptValue( resolveValues[ i ], updateFunc( i ), primary.reject ); + } + + return primary.promise(); + } +} ); + + +// These usually indicate a programmer mistake during development, +// warn about them ASAP rather than swallowing them by default. 
+var rerrorNames = /^(Eval|Internal|Range|Reference|Syntax|Type|URI)Error$/; + +jQuery.Deferred.exceptionHook = function( error, stack ) { + + // Support: IE 8 - 9 only + // Console exists when dev tools are open, which can happen at any time + if ( window.console && window.console.warn && error && rerrorNames.test( error.name ) ) { + window.console.warn( "jQuery.Deferred exception: " + error.message, error.stack, stack ); + } +}; + + + + +jQuery.readyException = function( error ) { + window.setTimeout( function() { + throw error; + } ); +}; + + + + +// The deferred used on DOM ready +var readyList = jQuery.Deferred(); + +jQuery.fn.ready = function( fn ) { + + readyList + .then( fn ) + + // Wrap jQuery.readyException in a function so that the lookup + // happens at the time of error handling instead of callback + // registration. + .catch( function( error ) { + jQuery.readyException( error ); + } ); + + return this; +}; + +jQuery.extend( { + + // Is the DOM ready to be used? Set to true once it occurs. + isReady: false, + + // A counter to track how many items to wait for before + // the ready event fires. See #6781 + readyWait: 1, + + // Handle when the DOM is ready + ready: function( wait ) { + + // Abort if there are pending holds or we're already ready + if ( wait === true ? --jQuery.readyWait : jQuery.isReady ) { + return; + } + + // Remember that the DOM is ready + jQuery.isReady = true; + + // If a normal DOM Ready event fired, decrement, and wait if need be + if ( wait !== true && --jQuery.readyWait > 0 ) { + return; + } + + // If there are functions bound, to execute + readyList.resolveWith( document, [ jQuery ] ); + } +} ); + +jQuery.ready.then = readyList.then; + +// The ready event handler and self cleanup method +function completed() { + document.removeEventListener( "DOMContentLoaded", completed ); + window.removeEventListener( "load", completed ); + jQuery.ready(); +} + +// Catch cases where $(document).ready() is called +// after the browser event has already occurred. +// Support: IE <=9 - 10 only +// Older IE sometimes signals "interactive" too soon +if ( document.readyState === "complete" || + ( document.readyState !== "loading" && !document.documentElement.doScroll ) ) { + + // Handle it asynchronously to allow scripts the opportunity to delay ready + window.setTimeout( jQuery.ready ); + +} else { + + // Use the handy event callback + document.addEventListener( "DOMContentLoaded", completed ); + + // A fallback to window.onload, that will always work + window.addEventListener( "load", completed ); +} + + + + +// Multifunctional method to get and set values of a collection +// The value/s can optionally be executed if it's a function +var access = function( elems, fn, key, value, chainable, emptyGet, raw ) { + var i = 0, + len = elems.length, + bulk = key == null; + + // Sets many values + if ( toType( key ) === "object" ) { + chainable = true; + for ( i in key ) { + access( elems, fn, i, key[ i ], true, emptyGet, raw ); + } + + // Sets one value + } else if ( value !== undefined ) { + chainable = true; + + if ( !isFunction( value ) ) { + raw = true; + } + + if ( bulk ) { + + // Bulk operations run against the entire set + if ( raw ) { + fn.call( elems, value ); + fn = null; + + // ...except when executing function values + } else { + bulk = fn; + fn = function( elem, _key, value ) { + return bulk.call( jQuery( elem ), value ); + }; + } + } + + if ( fn ) { + for ( ; i < len; i++ ) { + fn( + elems[ i ], key, raw ? 
+ value : + value.call( elems[ i ], i, fn( elems[ i ], key ) ) + ); + } + } + } + + if ( chainable ) { + return elems; + } + + // Gets + if ( bulk ) { + return fn.call( elems ); + } + + return len ? fn( elems[ 0 ], key ) : emptyGet; +}; + + +// Matches dashed string for camelizing +var rmsPrefix = /^-ms-/, + rdashAlpha = /-([a-z])/g; + +// Used by camelCase as callback to replace() +function fcamelCase( _all, letter ) { + return letter.toUpperCase(); +} + +// Convert dashed to camelCase; used by the css and data modules +// Support: IE <=9 - 11, Edge 12 - 15 +// Microsoft forgot to hump their vendor prefix (#9572) +function camelCase( string ) { + return string.replace( rmsPrefix, "ms-" ).replace( rdashAlpha, fcamelCase ); +} +var acceptData = function( owner ) { + + // Accepts only: + // - Node + // - Node.ELEMENT_NODE + // - Node.DOCUMENT_NODE + // - Object + // - Any + return owner.nodeType === 1 || owner.nodeType === 9 || !( +owner.nodeType ); +}; + + + + +function Data() { + this.expando = jQuery.expando + Data.uid++; +} + +Data.uid = 1; + +Data.prototype = { + + cache: function( owner ) { + + // Check if the owner object already has a cache + var value = owner[ this.expando ]; + + // If not, create one + if ( !value ) { + value = {}; + + // We can accept data for non-element nodes in modern browsers, + // but we should not, see #8335. + // Always return an empty object. + if ( acceptData( owner ) ) { + + // If it is a node unlikely to be stringify-ed or looped over + // use plain assignment + if ( owner.nodeType ) { + owner[ this.expando ] = value; + + // Otherwise secure it in a non-enumerable property + // configurable must be true to allow the property to be + // deleted when data is removed + } else { + Object.defineProperty( owner, this.expando, { + value: value, + configurable: true + } ); + } + } + } + + return value; + }, + set: function( owner, data, value ) { + var prop, + cache = this.cache( owner ); + + // Handle: [ owner, key, value ] args + // Always use camelCase key (gh-2257) + if ( typeof data === "string" ) { + cache[ camelCase( data ) ] = value; + + // Handle: [ owner, { properties } ] args + } else { + + // Copy the properties one-by-one to the cache object + for ( prop in data ) { + cache[ camelCase( prop ) ] = data[ prop ]; + } + } + return cache; + }, + get: function( owner, key ) { + return key === undefined ? + this.cache( owner ) : + + // Always use camelCase key (gh-2257) + owner[ this.expando ] && owner[ this.expando ][ camelCase( key ) ]; + }, + access: function( owner, key, value ) { + + // In cases where either: + // + // 1. No key was specified + // 2. A string key was specified, but no value provided + // + // Take the "read" path and allow the get method to determine + // which value to return, respectively either: + // + // 1. The entire cache object + // 2. The data stored at the key + // + if ( key === undefined || + ( ( key && typeof key === "string" ) && value === undefined ) ) { + + return this.get( owner, key ); + } + + // When the key is not a string, or both a key and value + // are specified, set or extend (existing objects) with either: + // + // 1. An object of properties + // 2. A key and value + // + this.set( owner, key, value ); + + // Since the "set" path can have two possible entry points + // return the expected data based on which path was taken[*] + return value !== undefined ? 
value : key; + }, + remove: function( owner, key ) { + var i, + cache = owner[ this.expando ]; + + if ( cache === undefined ) { + return; + } + + if ( key !== undefined ) { + + // Support array or space separated string of keys + if ( Array.isArray( key ) ) { + + // If key is an array of keys... + // We always set camelCase keys, so remove that. + key = key.map( camelCase ); + } else { + key = camelCase( key ); + + // If a key with the spaces exists, use it. + // Otherwise, create an array by matching non-whitespace + key = key in cache ? + [ key ] : + ( key.match( rnothtmlwhite ) || [] ); + } + + i = key.length; + + while ( i-- ) { + delete cache[ key[ i ] ]; + } + } + + // Remove the expando if there's no more data + if ( key === undefined || jQuery.isEmptyObject( cache ) ) { + + // Support: Chrome <=35 - 45 + // Webkit & Blink performance suffers when deleting properties + // from DOM nodes, so set to undefined instead + // https://bugs.chromium.org/p/chromium/issues/detail?id=378607 (bug restricted) + if ( owner.nodeType ) { + owner[ this.expando ] = undefined; + } else { + delete owner[ this.expando ]; + } + } + }, + hasData: function( owner ) { + var cache = owner[ this.expando ]; + return cache !== undefined && !jQuery.isEmptyObject( cache ); + } +}; +var dataPriv = new Data(); + +var dataUser = new Data(); + + + +// Implementation Summary +// +// 1. Enforce API surface and semantic compatibility with 1.9.x branch +// 2. Improve the module's maintainability by reducing the storage +// paths to a single mechanism. +// 3. Use the same single mechanism to support "private" and "user" data. +// 4. _Never_ expose "private" data to user code (TODO: Drop _data, _removeData) +// 5. Avoid exposing implementation details on user objects (eg. expando properties) +// 6. Provide a clear path for implementation upgrade to WeakMap in 2014 + +var rbrace = /^(?:\{[\w\W]*\}|\[[\w\W]*\])$/, + rmultiDash = /[A-Z]/g; + +function getData( data ) { + if ( data === "true" ) { + return true; + } + + if ( data === "false" ) { + return false; + } + + if ( data === "null" ) { + return null; + } + + // Only convert to a number if it doesn't change the string + if ( data === +data + "" ) { + return +data; + } + + if ( rbrace.test( data ) ) { + return JSON.parse( data ); + } + + return data; +} + +function dataAttr( elem, key, data ) { + var name; + + // If nothing was found internally, try to fetch any + // data from the HTML5 data-* attribute + if ( data === undefined && elem.nodeType === 1 ) { + name = "data-" + key.replace( rmultiDash, "-$&" ).toLowerCase(); + data = elem.getAttribute( name ); + + if ( typeof data === "string" ) { + try { + data = getData( data ); + } catch ( e ) {} + + // Make sure we set the data so it isn't changed later + dataUser.set( elem, key, data ); + } else { + data = undefined; + } + } + return data; +} + +jQuery.extend( { + hasData: function( elem ) { + return dataUser.hasData( elem ) || dataPriv.hasData( elem ); + }, + + data: function( elem, name, data ) { + return dataUser.access( elem, name, data ); + }, + + removeData: function( elem, name ) { + dataUser.remove( elem, name ); + }, + + // TODO: Now that all calls to _data and _removeData have been replaced + // with direct calls to dataPriv methods, these can be deprecated. 
+ _data: function( elem, name, data ) { + return dataPriv.access( elem, name, data ); + }, + + _removeData: function( elem, name ) { + dataPriv.remove( elem, name ); + } +} ); + +jQuery.fn.extend( { + data: function( key, value ) { + var i, name, data, + elem = this[ 0 ], + attrs = elem && elem.attributes; + + // Gets all values + if ( key === undefined ) { + if ( this.length ) { + data = dataUser.get( elem ); + + if ( elem.nodeType === 1 && !dataPriv.get( elem, "hasDataAttrs" ) ) { + i = attrs.length; + while ( i-- ) { + + // Support: IE 11 only + // The attrs elements can be null (#14894) + if ( attrs[ i ] ) { + name = attrs[ i ].name; + if ( name.indexOf( "data-" ) === 0 ) { + name = camelCase( name.slice( 5 ) ); + dataAttr( elem, name, data[ name ] ); + } + } + } + dataPriv.set( elem, "hasDataAttrs", true ); + } + } + + return data; + } + + // Sets multiple values + if ( typeof key === "object" ) { + return this.each( function() { + dataUser.set( this, key ); + } ); + } + + return access( this, function( value ) { + var data; + + // The calling jQuery object (element matches) is not empty + // (and therefore has an element appears at this[ 0 ]) and the + // `value` parameter was not undefined. An empty jQuery object + // will result in `undefined` for elem = this[ 0 ] which will + // throw an exception if an attempt to read a data cache is made. + if ( elem && value === undefined ) { + + // Attempt to get data from the cache + // The key will always be camelCased in Data + data = dataUser.get( elem, key ); + if ( data !== undefined ) { + return data; + } + + // Attempt to "discover" the data in + // HTML5 custom data-* attrs + data = dataAttr( elem, key ); + if ( data !== undefined ) { + return data; + } + + // We tried really hard, but the data doesn't exist. + return; + } + + // Set the data... 
+ this.each( function() { + + // We always store the camelCased key + dataUser.set( this, key, value ); + } ); + }, null, value, arguments.length > 1, null, true ); + }, + + removeData: function( key ) { + return this.each( function() { + dataUser.remove( this, key ); + } ); + } +} ); + + +jQuery.extend( { + queue: function( elem, type, data ) { + var queue; + + if ( elem ) { + type = ( type || "fx" ) + "queue"; + queue = dataPriv.get( elem, type ); + + // Speed up dequeue by getting out quickly if this is just a lookup + if ( data ) { + if ( !queue || Array.isArray( data ) ) { + queue = dataPriv.access( elem, type, jQuery.makeArray( data ) ); + } else { + queue.push( data ); + } + } + return queue || []; + } + }, + + dequeue: function( elem, type ) { + type = type || "fx"; + + var queue = jQuery.queue( elem, type ), + startLength = queue.length, + fn = queue.shift(), + hooks = jQuery._queueHooks( elem, type ), + next = function() { + jQuery.dequeue( elem, type ); + }; + + // If the fx queue is dequeued, always remove the progress sentinel + if ( fn === "inprogress" ) { + fn = queue.shift(); + startLength--; + } + + if ( fn ) { + + // Add a progress sentinel to prevent the fx queue from being + // automatically dequeued + if ( type === "fx" ) { + queue.unshift( "inprogress" ); + } + + // Clear up the last queue stop function + delete hooks.stop; + fn.call( elem, next, hooks ); + } + + if ( !startLength && hooks ) { + hooks.empty.fire(); + } + }, + + // Not public - generate a queueHooks object, or return the current one + _queueHooks: function( elem, type ) { + var key = type + "queueHooks"; + return dataPriv.get( elem, key ) || dataPriv.access( elem, key, { + empty: jQuery.Callbacks( "once memory" ).add( function() { + dataPriv.remove( elem, [ type + "queue", key ] ); + } ) + } ); + } +} ); + +jQuery.fn.extend( { + queue: function( type, data ) { + var setter = 2; + + if ( typeof type !== "string" ) { + data = type; + type = "fx"; + setter--; + } + + if ( arguments.length < setter ) { + return jQuery.queue( this[ 0 ], type ); + } + + return data === undefined ? 
+ this : + this.each( function() { + var queue = jQuery.queue( this, type, data ); + + // Ensure a hooks for this queue + jQuery._queueHooks( this, type ); + + if ( type === "fx" && queue[ 0 ] !== "inprogress" ) { + jQuery.dequeue( this, type ); + } + } ); + }, + dequeue: function( type ) { + return this.each( function() { + jQuery.dequeue( this, type ); + } ); + }, + clearQueue: function( type ) { + return this.queue( type || "fx", [] ); + }, + + // Get a promise resolved when queues of a certain type + // are emptied (fx is the type by default) + promise: function( type, obj ) { + var tmp, + count = 1, + defer = jQuery.Deferred(), + elements = this, + i = this.length, + resolve = function() { + if ( !( --count ) ) { + defer.resolveWith( elements, [ elements ] ); + } + }; + + if ( typeof type !== "string" ) { + obj = type; + type = undefined; + } + type = type || "fx"; + + while ( i-- ) { + tmp = dataPriv.get( elements[ i ], type + "queueHooks" ); + if ( tmp && tmp.empty ) { + count++; + tmp.empty.add( resolve ); + } + } + resolve(); + return defer.promise( obj ); + } +} ); +var pnum = ( /[+-]?(?:\d*\.|)\d+(?:[eE][+-]?\d+|)/ ).source; + +var rcssNum = new RegExp( "^(?:([+-])=|)(" + pnum + ")([a-z%]*)$", "i" ); + + +var cssExpand = [ "Top", "Right", "Bottom", "Left" ]; + +var documentElement = document.documentElement; + + + + var isAttached = function( elem ) { + return jQuery.contains( elem.ownerDocument, elem ); + }, + composed = { composed: true }; + + // Support: IE 9 - 11+, Edge 12 - 18+, iOS 10.0 - 10.2 only + // Check attachment across shadow DOM boundaries when possible (gh-3504) + // Support: iOS 10.0-10.2 only + // Early iOS 10 versions support `attachShadow` but not `getRootNode`, + // leading to errors. We need to check for `getRootNode`. + if ( documentElement.getRootNode ) { + isAttached = function( elem ) { + return jQuery.contains( elem.ownerDocument, elem ) || + elem.getRootNode( composed ) === elem.ownerDocument; + }; + } +var isHiddenWithinTree = function( elem, el ) { + + // isHiddenWithinTree might be called from jQuery#filter function; + // in that case, element will be second argument + elem = el || elem; + + // Inline style trumps all + return elem.style.display === "none" || + elem.style.display === "" && + + // Otherwise, check computed style + // Support: Firefox <=43 - 45 + // Disconnected elements can have computed display: none, so first confirm that elem is + // in the document. + isAttached( elem ) && + + jQuery.css( elem, "display" ) === "none"; + }; + + + +function adjustCSS( elem, prop, valueParts, tween ) { + var adjusted, scale, + maxIterations = 20, + currentValue = tween ? + function() { + return tween.cur(); + } : + function() { + return jQuery.css( elem, prop, "" ); + }, + initial = currentValue(), + unit = valueParts && valueParts[ 3 ] || ( jQuery.cssNumber[ prop ] ? 
"" : "px" ), + + // Starting value computation is required for potential unit mismatches + initialInUnit = elem.nodeType && + ( jQuery.cssNumber[ prop ] || unit !== "px" && +initial ) && + rcssNum.exec( jQuery.css( elem, prop ) ); + + if ( initialInUnit && initialInUnit[ 3 ] !== unit ) { + + // Support: Firefox <=54 + // Halve the iteration target value to prevent interference from CSS upper bounds (gh-2144) + initial = initial / 2; + + // Trust units reported by jQuery.css + unit = unit || initialInUnit[ 3 ]; + + // Iteratively approximate from a nonzero starting point + initialInUnit = +initial || 1; + + while ( maxIterations-- ) { + + // Evaluate and update our best guess (doubling guesses that zero out). + // Finish if the scale equals or crosses 1 (making the old*new product non-positive). + jQuery.style( elem, prop, initialInUnit + unit ); + if ( ( 1 - scale ) * ( 1 - ( scale = currentValue() / initial || 0.5 ) ) <= 0 ) { + maxIterations = 0; + } + initialInUnit = initialInUnit / scale; + + } + + initialInUnit = initialInUnit * 2; + jQuery.style( elem, prop, initialInUnit + unit ); + + // Make sure we update the tween properties later on + valueParts = valueParts || []; + } + + if ( valueParts ) { + initialInUnit = +initialInUnit || +initial || 0; + + // Apply relative offset (+=/-=) if specified + adjusted = valueParts[ 1 ] ? + initialInUnit + ( valueParts[ 1 ] + 1 ) * valueParts[ 2 ] : + +valueParts[ 2 ]; + if ( tween ) { + tween.unit = unit; + tween.start = initialInUnit; + tween.end = adjusted; + } + } + return adjusted; +} + + +var defaultDisplayMap = {}; + +function getDefaultDisplay( elem ) { + var temp, + doc = elem.ownerDocument, + nodeName = elem.nodeName, + display = defaultDisplayMap[ nodeName ]; + + if ( display ) { + return display; + } + + temp = doc.body.appendChild( doc.createElement( nodeName ) ); + display = jQuery.css( temp, "display" ); + + temp.parentNode.removeChild( temp ); + + if ( display === "none" ) { + display = "block"; + } + defaultDisplayMap[ nodeName ] = display; + + return display; +} + +function showHide( elements, show ) { + var display, elem, + values = [], + index = 0, + length = elements.length; + + // Determine new display value for elements that need to change + for ( ; index < length; index++ ) { + elem = elements[ index ]; + if ( !elem.style ) { + continue; + } + + display = elem.style.display; + if ( show ) { + + // Since we force visibility upon cascade-hidden elements, an immediate (and slow) + // check is required in this first loop unless we have a nonempty display value (either + // inline or about-to-be-restored) + if ( display === "none" ) { + values[ index ] = dataPriv.get( elem, "display" ) || null; + if ( !values[ index ] ) { + elem.style.display = ""; + } + } + if ( elem.style.display === "" && isHiddenWithinTree( elem ) ) { + values[ index ] = getDefaultDisplay( elem ); + } + } else { + if ( display !== "none" ) { + values[ index ] = "none"; + + // Remember what we're overwriting + dataPriv.set( elem, "display", display ); + } + } + } + + // Set the display of the elements in a second loop to avoid constant reflow + for ( index = 0; index < length; index++ ) { + if ( values[ index ] != null ) { + elements[ index ].style.display = values[ index ]; + } + } + + return elements; +} + +jQuery.fn.extend( { + show: function() { + return showHide( this, true ); + }, + hide: function() { + return showHide( this ); + }, + toggle: function( state ) { + if ( typeof state === "boolean" ) { + return state ? 
this.show() : this.hide(); + } + + return this.each( function() { + if ( isHiddenWithinTree( this ) ) { + jQuery( this ).show(); + } else { + jQuery( this ).hide(); + } + } ); + } +} ); +var rcheckableType = ( /^(?:checkbox|radio)$/i ); + +var rtagName = ( /<([a-z][^\/\0>\x20\t\r\n\f]*)/i ); + +var rscriptType = ( /^$|^module$|\/(?:java|ecma)script/i ); + + + +( function() { + var fragment = document.createDocumentFragment(), + div = fragment.appendChild( document.createElement( "div" ) ), + input = document.createElement( "input" ); + + // Support: Android 4.0 - 4.3 only + // Check state lost if the name is set (#11217) + // Support: Windows Web Apps (WWA) + // `name` and `type` must use .setAttribute for WWA (#14901) + input.setAttribute( "type", "radio" ); + input.setAttribute( "checked", "checked" ); + input.setAttribute( "name", "t" ); + + div.appendChild( input ); + + // Support: Android <=4.1 only + // Older WebKit doesn't clone checked state correctly in fragments + support.checkClone = div.cloneNode( true ).cloneNode( true ).lastChild.checked; + + // Support: IE <=11 only + // Make sure textarea (and checkbox) defaultValue is properly cloned + div.innerHTML = ""; + support.noCloneChecked = !!div.cloneNode( true ).lastChild.defaultValue; + + // Support: IE <=9 only + // IE <=9 replaces "; + support.option = !!div.lastChild; +} )(); + + +// We have to close these tags to support XHTML (#13200) +var wrapMap = { + + // XHTML parsers do not magically insert elements in the + // same way that tag soup parsers do. So we cannot shorten + // this by omitting or other required elements. + thead: [ 1, "", "
" ], + col: [ 2, "", "
" ], + tr: [ 2, "", "
" ], + td: [ 3, "", "
" ], + + _default: [ 0, "", "" ] +}; + +wrapMap.tbody = wrapMap.tfoot = wrapMap.colgroup = wrapMap.caption = wrapMap.thead; +wrapMap.th = wrapMap.td; + +// Support: IE <=9 only +if ( !support.option ) { + wrapMap.optgroup = wrapMap.option = [ 1, "" ]; +} + + +function getAll( context, tag ) { + + // Support: IE <=9 - 11 only + // Use typeof to avoid zero-argument method invocation on host objects (#15151) + var ret; + + if ( typeof context.getElementsByTagName !== "undefined" ) { + ret = context.getElementsByTagName( tag || "*" ); + + } else if ( typeof context.querySelectorAll !== "undefined" ) { + ret = context.querySelectorAll( tag || "*" ); + + } else { + ret = []; + } + + if ( tag === undefined || tag && nodeName( context, tag ) ) { + return jQuery.merge( [ context ], ret ); + } + + return ret; +} + + +// Mark scripts as having already been evaluated +function setGlobalEval( elems, refElements ) { + var i = 0, + l = elems.length; + + for ( ; i < l; i++ ) { + dataPriv.set( + elems[ i ], + "globalEval", + !refElements || dataPriv.get( refElements[ i ], "globalEval" ) + ); + } +} + + +var rhtml = /<|&#?\w+;/; + +function buildFragment( elems, context, scripts, selection, ignored ) { + var elem, tmp, tag, wrap, attached, j, + fragment = context.createDocumentFragment(), + nodes = [], + i = 0, + l = elems.length; + + for ( ; i < l; i++ ) { + elem = elems[ i ]; + + if ( elem || elem === 0 ) { + + // Add nodes directly + if ( toType( elem ) === "object" ) { + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + jQuery.merge( nodes, elem.nodeType ? [ elem ] : elem ); + + // Convert non-html into a text node + } else if ( !rhtml.test( elem ) ) { + nodes.push( context.createTextNode( elem ) ); + + // Convert html into DOM nodes + } else { + tmp = tmp || fragment.appendChild( context.createElement( "div" ) ); + + // Deserialize a standard representation + tag = ( rtagName.exec( elem ) || [ "", "" ] )[ 1 ].toLowerCase(); + wrap = wrapMap[ tag ] || wrapMap._default; + tmp.innerHTML = wrap[ 1 ] + jQuery.htmlPrefilter( elem ) + wrap[ 2 ]; + + // Descend through wrappers to the right content + j = wrap[ 0 ]; + while ( j-- ) { + tmp = tmp.lastChild; + } + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + jQuery.merge( nodes, tmp.childNodes ); + + // Remember the top-level container + tmp = fragment.firstChild; + + // Ensure the created nodes are orphaned (#12392) + tmp.textContent = ""; + } + } + } + + // Remove wrapper from fragment + fragment.textContent = ""; + + i = 0; + while ( ( elem = nodes[ i++ ] ) ) { + + // Skip elements already in the context collection (trac-4087) + if ( selection && jQuery.inArray( elem, selection ) > -1 ) { + if ( ignored ) { + ignored.push( elem ); + } + continue; + } + + attached = isAttached( elem ); + + // Append to fragment + tmp = getAll( fragment.appendChild( elem ), "script" ); + + // Preserve script evaluation history + if ( attached ) { + setGlobalEval( tmp ); + } + + // Capture executables + if ( scripts ) { + j = 0; + while ( ( elem = tmp[ j++ ] ) ) { + if ( rscriptType.test( elem.type || "" ) ) { + scripts.push( elem ); + } + } + } + } + + return fragment; +} + + +var rtypenamespace = /^([^.]*)(?:\.(.+)|)/; + +function returnTrue() { + return true; +} + +function returnFalse() { + return false; +} + +// Support: IE <=9 - 11+ +// focus() and blur() are asynchronous, except when they are no-op. 
+// So expect focus to be synchronous when the element is already active, +// and blur to be synchronous when the element is not already active. +// (focus and blur are always synchronous in other supported browsers, +// this just defines when we can count on it). +function expectSync( elem, type ) { + return ( elem === safeActiveElement() ) === ( type === "focus" ); +} + +// Support: IE <=9 only +// Accessing document.activeElement can throw unexpectedly +// https://bugs.jquery.com/ticket/13393 +function safeActiveElement() { + try { + return document.activeElement; + } catch ( err ) { } +} + +function on( elem, types, selector, data, fn, one ) { + var origFn, type; + + // Types can be a map of types/handlers + if ( typeof types === "object" ) { + + // ( types-Object, selector, data ) + if ( typeof selector !== "string" ) { + + // ( types-Object, data ) + data = data || selector; + selector = undefined; + } + for ( type in types ) { + on( elem, type, selector, data, types[ type ], one ); + } + return elem; + } + + if ( data == null && fn == null ) { + + // ( types, fn ) + fn = selector; + data = selector = undefined; + } else if ( fn == null ) { + if ( typeof selector === "string" ) { + + // ( types, selector, fn ) + fn = data; + data = undefined; + } else { + + // ( types, data, fn ) + fn = data; + data = selector; + selector = undefined; + } + } + if ( fn === false ) { + fn = returnFalse; + } else if ( !fn ) { + return elem; + } + + if ( one === 1 ) { + origFn = fn; + fn = function( event ) { + + // Can use an empty set, since event contains the info + jQuery().off( event ); + return origFn.apply( this, arguments ); + }; + + // Use same guid so caller can remove using origFn + fn.guid = origFn.guid || ( origFn.guid = jQuery.guid++ ); + } + return elem.each( function() { + jQuery.event.add( this, types, fn, data, selector ); + } ); +} + +/* + * Helper functions for managing events -- not part of the public interface. + * Props to Dean Edwards' addEvent library for many of the ideas. + */ +jQuery.event = { + + global: {}, + + add: function( elem, types, handler, data, selector ) { + + var handleObjIn, eventHandle, tmp, + events, t, handleObj, + special, handlers, type, namespaces, origType, + elemData = dataPriv.get( elem ); + + // Only attach events to objects that accept data + if ( !acceptData( elem ) ) { + return; + } + + // Caller can pass in an object of custom data in lieu of the handler + if ( handler.handler ) { + handleObjIn = handler; + handler = handleObjIn.handler; + selector = handleObjIn.selector; + } + + // Ensure that invalid selectors throw exceptions at attach time + // Evaluate against documentElement in case elem is a non-element node (e.g., document) + if ( selector ) { + jQuery.find.matchesSelector( documentElement, selector ); + } + + // Make sure that the handler has a unique ID, used to find/remove it later + if ( !handler.guid ) { + handler.guid = jQuery.guid++; + } + + // Init the element's event structure and main handler, if this is the first + if ( !( events = elemData.events ) ) { + events = elemData.events = Object.create( null ); + } + if ( !( eventHandle = elemData.handle ) ) { + eventHandle = elemData.handle = function( e ) { + + // Discard the second event of a jQuery.event.trigger() and + // when an event is called after a page has unloaded + return typeof jQuery !== "undefined" && jQuery.event.triggered !== e.type ? 
+ jQuery.event.dispatch.apply( elem, arguments ) : undefined; + }; + } + + // Handle multiple events separated by a space + types = ( types || "" ).match( rnothtmlwhite ) || [ "" ]; + t = types.length; + while ( t-- ) { + tmp = rtypenamespace.exec( types[ t ] ) || []; + type = origType = tmp[ 1 ]; + namespaces = ( tmp[ 2 ] || "" ).split( "." ).sort(); + + // There *must* be a type, no attaching namespace-only handlers + if ( !type ) { + continue; + } + + // If event changes its type, use the special event handlers for the changed type + special = jQuery.event.special[ type ] || {}; + + // If selector defined, determine special event api type, otherwise given type + type = ( selector ? special.delegateType : special.bindType ) || type; + + // Update special based on newly reset type + special = jQuery.event.special[ type ] || {}; + + // handleObj is passed to all event handlers + handleObj = jQuery.extend( { + type: type, + origType: origType, + data: data, + handler: handler, + guid: handler.guid, + selector: selector, + needsContext: selector && jQuery.expr.match.needsContext.test( selector ), + namespace: namespaces.join( "." ) + }, handleObjIn ); + + // Init the event handler queue if we're the first + if ( !( handlers = events[ type ] ) ) { + handlers = events[ type ] = []; + handlers.delegateCount = 0; + + // Only use addEventListener if the special events handler returns false + if ( !special.setup || + special.setup.call( elem, data, namespaces, eventHandle ) === false ) { + + if ( elem.addEventListener ) { + elem.addEventListener( type, eventHandle ); + } + } + } + + if ( special.add ) { + special.add.call( elem, handleObj ); + + if ( !handleObj.handler.guid ) { + handleObj.handler.guid = handler.guid; + } + } + + // Add to the element's handler list, delegates in front + if ( selector ) { + handlers.splice( handlers.delegateCount++, 0, handleObj ); + } else { + handlers.push( handleObj ); + } + + // Keep track of which events have ever been used, for event optimization + jQuery.event.global[ type ] = true; + } + + }, + + // Detach an event or set of events from an element + remove: function( elem, types, handler, selector, mappedTypes ) { + + var j, origCount, tmp, + events, t, handleObj, + special, handlers, type, namespaces, origType, + elemData = dataPriv.hasData( elem ) && dataPriv.get( elem ); + + if ( !elemData || !( events = elemData.events ) ) { + return; + } + + // Once for each type.namespace in types; type may be omitted + types = ( types || "" ).match( rnothtmlwhite ) || [ "" ]; + t = types.length; + while ( t-- ) { + tmp = rtypenamespace.exec( types[ t ] ) || []; + type = origType = tmp[ 1 ]; + namespaces = ( tmp[ 2 ] || "" ).split( "." ).sort(); + + // Unbind all events (on this namespace, if provided) for the element + if ( !type ) { + for ( type in events ) { + jQuery.event.remove( elem, type + types[ t ], handler, selector, true ); + } + continue; + } + + special = jQuery.event.special[ type ] || {}; + type = ( selector ? 
special.delegateType : special.bindType ) || type; + handlers = events[ type ] || []; + tmp = tmp[ 2 ] && + new RegExp( "(^|\\.)" + namespaces.join( "\\.(?:.*\\.|)" ) + "(\\.|$)" ); + + // Remove matching events + origCount = j = handlers.length; + while ( j-- ) { + handleObj = handlers[ j ]; + + if ( ( mappedTypes || origType === handleObj.origType ) && + ( !handler || handler.guid === handleObj.guid ) && + ( !tmp || tmp.test( handleObj.namespace ) ) && + ( !selector || selector === handleObj.selector || + selector === "**" && handleObj.selector ) ) { + handlers.splice( j, 1 ); + + if ( handleObj.selector ) { + handlers.delegateCount--; + } + if ( special.remove ) { + special.remove.call( elem, handleObj ); + } + } + } + + // Remove generic event handler if we removed something and no more handlers exist + // (avoids potential for endless recursion during removal of special event handlers) + if ( origCount && !handlers.length ) { + if ( !special.teardown || + special.teardown.call( elem, namespaces, elemData.handle ) === false ) { + + jQuery.removeEvent( elem, type, elemData.handle ); + } + + delete events[ type ]; + } + } + + // Remove data and the expando if it's no longer used + if ( jQuery.isEmptyObject( events ) ) { + dataPriv.remove( elem, "handle events" ); + } + }, + + dispatch: function( nativeEvent ) { + + var i, j, ret, matched, handleObj, handlerQueue, + args = new Array( arguments.length ), + + // Make a writable jQuery.Event from the native event object + event = jQuery.event.fix( nativeEvent ), + + handlers = ( + dataPriv.get( this, "events" ) || Object.create( null ) + )[ event.type ] || [], + special = jQuery.event.special[ event.type ] || {}; + + // Use the fix-ed jQuery.Event rather than the (read-only) native event + args[ 0 ] = event; + + for ( i = 1; i < arguments.length; i++ ) { + args[ i ] = arguments[ i ]; + } + + event.delegateTarget = this; + + // Call the preDispatch hook for the mapped type, and let it bail if desired + if ( special.preDispatch && special.preDispatch.call( this, event ) === false ) { + return; + } + + // Determine handlers + handlerQueue = jQuery.event.handlers.call( this, event, handlers ); + + // Run delegates first; they may want to stop propagation beneath us + i = 0; + while ( ( matched = handlerQueue[ i++ ] ) && !event.isPropagationStopped() ) { + event.currentTarget = matched.elem; + + j = 0; + while ( ( handleObj = matched.handlers[ j++ ] ) && + !event.isImmediatePropagationStopped() ) { + + // If the event is namespaced, then each handler is only invoked if it is + // specially universal or its namespaces are a superset of the event's. 
+ if ( !event.rnamespace || handleObj.namespace === false || + event.rnamespace.test( handleObj.namespace ) ) { + + event.handleObj = handleObj; + event.data = handleObj.data; + + ret = ( ( jQuery.event.special[ handleObj.origType ] || {} ).handle || + handleObj.handler ).apply( matched.elem, args ); + + if ( ret !== undefined ) { + if ( ( event.result = ret ) === false ) { + event.preventDefault(); + event.stopPropagation(); + } + } + } + } + } + + // Call the postDispatch hook for the mapped type + if ( special.postDispatch ) { + special.postDispatch.call( this, event ); + } + + return event.result; + }, + + handlers: function( event, handlers ) { + var i, handleObj, sel, matchedHandlers, matchedSelectors, + handlerQueue = [], + delegateCount = handlers.delegateCount, + cur = event.target; + + // Find delegate handlers + if ( delegateCount && + + // Support: IE <=9 + // Black-hole SVG instance trees (trac-13180) + cur.nodeType && + + // Support: Firefox <=42 + // Suppress spec-violating clicks indicating a non-primary pointer button (trac-3861) + // https://www.w3.org/TR/DOM-Level-3-Events/#event-type-click + // Support: IE 11 only + // ...but not arrow key "clicks" of radio inputs, which can have `button` -1 (gh-2343) + !( event.type === "click" && event.button >= 1 ) ) { + + for ( ; cur !== this; cur = cur.parentNode || this ) { + + // Don't check non-elements (#13208) + // Don't process clicks on disabled elements (#6911, #8165, #11382, #11764) + if ( cur.nodeType === 1 && !( event.type === "click" && cur.disabled === true ) ) { + matchedHandlers = []; + matchedSelectors = {}; + for ( i = 0; i < delegateCount; i++ ) { + handleObj = handlers[ i ]; + + // Don't conflict with Object.prototype properties (#13203) + sel = handleObj.selector + " "; + + if ( matchedSelectors[ sel ] === undefined ) { + matchedSelectors[ sel ] = handleObj.needsContext ? + jQuery( sel, this ).index( cur ) > -1 : + jQuery.find( sel, this, null, [ cur ] ).length; + } + if ( matchedSelectors[ sel ] ) { + matchedHandlers.push( handleObj ); + } + } + if ( matchedHandlers.length ) { + handlerQueue.push( { elem: cur, handlers: matchedHandlers } ); + } + } + } + } + + // Add the remaining (directly-bound) handlers + cur = this; + if ( delegateCount < handlers.length ) { + handlerQueue.push( { elem: cur, handlers: handlers.slice( delegateCount ) } ); + } + + return handlerQueue; + }, + + addProp: function( name, hook ) { + Object.defineProperty( jQuery.Event.prototype, name, { + enumerable: true, + configurable: true, + + get: isFunction( hook ) ? + function() { + if ( this.originalEvent ) { + return hook( this.originalEvent ); + } + } : + function() { + if ( this.originalEvent ) { + return this.originalEvent[ name ]; + } + }, + + set: function( value ) { + Object.defineProperty( this, name, { + enumerable: true, + configurable: true, + writable: true, + value: value + } ); + } + } ); + }, + + fix: function( originalEvent ) { + return originalEvent[ jQuery.expando ] ? + originalEvent : + new jQuery.Event( originalEvent ); + }, + + special: { + load: { + + // Prevent triggered image.load events from bubbling to window.load + noBubble: true + }, + click: { + + // Utilize native event to ensure correct state for checkable inputs + setup: function( data ) { + + // For mutual compressibility with _default, replace `this` access with a local var. + // `|| data` is dead code meant only to preserve the variable through minification. 
+ var el = this || data; + + // Claim the first handler + if ( rcheckableType.test( el.type ) && + el.click && nodeName( el, "input" ) ) { + + // dataPriv.set( el, "click", ... ) + leverageNative( el, "click", returnTrue ); + } + + // Return false to allow normal processing in the caller + return false; + }, + trigger: function( data ) { + + // For mutual compressibility with _default, replace `this` access with a local var. + // `|| data` is dead code meant only to preserve the variable through minification. + var el = this || data; + + // Force setup before triggering a click + if ( rcheckableType.test( el.type ) && + el.click && nodeName( el, "input" ) ) { + + leverageNative( el, "click" ); + } + + // Return non-false to allow normal event-path propagation + return true; + }, + + // For cross-browser consistency, suppress native .click() on links + // Also prevent it if we're currently inside a leveraged native-event stack + _default: function( event ) { + var target = event.target; + return rcheckableType.test( target.type ) && + target.click && nodeName( target, "input" ) && + dataPriv.get( target, "click" ) || + nodeName( target, "a" ); + } + }, + + beforeunload: { + postDispatch: function( event ) { + + // Support: Firefox 20+ + // Firefox doesn't alert if the returnValue field is not set. + if ( event.result !== undefined && event.originalEvent ) { + event.originalEvent.returnValue = event.result; + } + } + } + } +}; + +// Ensure the presence of an event listener that handles manually-triggered +// synthetic events by interrupting progress until reinvoked in response to +// *native* events that it fires directly, ensuring that state changes have +// already occurred before other listeners are invoked. +function leverageNative( el, type, expectSync ) { + + // Missing expectSync indicates a trigger call, which must force setup through jQuery.event.add + if ( !expectSync ) { + if ( dataPriv.get( el, type ) === undefined ) { + jQuery.event.add( el, type, returnTrue ); + } + return; + } + + // Register the controller as a special universal handler for all event namespaces + dataPriv.set( el, type, false ); + jQuery.event.add( el, type, { + namespace: false, + handler: function( event ) { + var notAsync, result, + saved = dataPriv.get( this, type ); + + if ( ( event.isTrigger & 1 ) && this[ type ] ) { + + // Interrupt processing of the outer synthetic .trigger()ed event + // Saved data should be false in such cases, but might be a leftover capture object + // from an async native handler (gh-4350) + if ( !saved.length ) { + + // Store arguments for use when handling the inner native event + // There will always be at least one argument (an event object), so this array + // will not be confused with a leftover capture object. + saved = slice.call( arguments ); + dataPriv.set( this, type, saved ); + + // Trigger the native event and capture its result + // Support: IE <=9 - 11+ + // focus() and blur() are asynchronous + notAsync = expectSync( this, type ); + this[ type ](); + result = dataPriv.get( this, type ); + if ( saved !== result || notAsync ) { + dataPriv.set( this, type, false ); + } else { + result = {}; + } + if ( saved !== result ) { + + // Cancel the outer synthetic event + event.stopImmediatePropagation(); + event.preventDefault(); + + // Support: Chrome 86+ + // In Chrome, if an element having a focusout handler is blurred by + // clicking outside of it, it invokes the handler synchronously. 
If + // that handler calls `.remove()` on the element, the data is cleared, + // leaving `result` undefined. We need to guard against this. + return result && result.value; + } + + // If this is an inner synthetic event for an event with a bubbling surrogate + // (focus or blur), assume that the surrogate already propagated from triggering the + // native event and prevent that from happening again here. + // This technically gets the ordering wrong w.r.t. to `.trigger()` (in which the + // bubbling surrogate propagates *after* the non-bubbling base), but that seems + // less bad than duplication. + } else if ( ( jQuery.event.special[ type ] || {} ).delegateType ) { + event.stopPropagation(); + } + + // If this is a native event triggered above, everything is now in order + // Fire an inner synthetic event with the original arguments + } else if ( saved.length ) { + + // ...and capture the result + dataPriv.set( this, type, { + value: jQuery.event.trigger( + + // Support: IE <=9 - 11+ + // Extend with the prototype to reset the above stopImmediatePropagation() + jQuery.extend( saved[ 0 ], jQuery.Event.prototype ), + saved.slice( 1 ), + this + ) + } ); + + // Abort handling of the native event + event.stopImmediatePropagation(); + } + } + } ); +} + +jQuery.removeEvent = function( elem, type, handle ) { + + // This "if" is needed for plain objects + if ( elem.removeEventListener ) { + elem.removeEventListener( type, handle ); + } +}; + +jQuery.Event = function( src, props ) { + + // Allow instantiation without the 'new' keyword + if ( !( this instanceof jQuery.Event ) ) { + return new jQuery.Event( src, props ); + } + + // Event object + if ( src && src.type ) { + this.originalEvent = src; + this.type = src.type; + + // Events bubbling up the document may have been marked as prevented + // by a handler lower down the tree; reflect the correct value. + this.isDefaultPrevented = src.defaultPrevented || + src.defaultPrevented === undefined && + + // Support: Android <=2.3 only + src.returnValue === false ? + returnTrue : + returnFalse; + + // Create target properties + // Support: Safari <=6 - 7 only + // Target should not be a text node (#504, #13143) + this.target = ( src.target && src.target.nodeType === 3 ) ? 
+ src.target.parentNode : + src.target; + + this.currentTarget = src.currentTarget; + this.relatedTarget = src.relatedTarget; + + // Event type + } else { + this.type = src; + } + + // Put explicitly provided properties onto the event object + if ( props ) { + jQuery.extend( this, props ); + } + + // Create a timestamp if incoming event doesn't have one + this.timeStamp = src && src.timeStamp || Date.now(); + + // Mark it as fixed + this[ jQuery.expando ] = true; +}; + +// jQuery.Event is based on DOM3 Events as specified by the ECMAScript Language Binding +// https://www.w3.org/TR/2003/WD-DOM-Level-3-Events-20030331/ecma-script-binding.html +jQuery.Event.prototype = { + constructor: jQuery.Event, + isDefaultPrevented: returnFalse, + isPropagationStopped: returnFalse, + isImmediatePropagationStopped: returnFalse, + isSimulated: false, + + preventDefault: function() { + var e = this.originalEvent; + + this.isDefaultPrevented = returnTrue; + + if ( e && !this.isSimulated ) { + e.preventDefault(); + } + }, + stopPropagation: function() { + var e = this.originalEvent; + + this.isPropagationStopped = returnTrue; + + if ( e && !this.isSimulated ) { + e.stopPropagation(); + } + }, + stopImmediatePropagation: function() { + var e = this.originalEvent; + + this.isImmediatePropagationStopped = returnTrue; + + if ( e && !this.isSimulated ) { + e.stopImmediatePropagation(); + } + + this.stopPropagation(); + } +}; + +// Includes all common event props including KeyEvent and MouseEvent specific props +jQuery.each( { + altKey: true, + bubbles: true, + cancelable: true, + changedTouches: true, + ctrlKey: true, + detail: true, + eventPhase: true, + metaKey: true, + pageX: true, + pageY: true, + shiftKey: true, + view: true, + "char": true, + code: true, + charCode: true, + key: true, + keyCode: true, + button: true, + buttons: true, + clientX: true, + clientY: true, + offsetX: true, + offsetY: true, + pointerId: true, + pointerType: true, + screenX: true, + screenY: true, + targetTouches: true, + toElement: true, + touches: true, + which: true +}, jQuery.event.addProp ); + +jQuery.each( { focus: "focusin", blur: "focusout" }, function( type, delegateType ) { + jQuery.event.special[ type ] = { + + // Utilize native event if possible so blur/focus sequence is correct + setup: function() { + + // Claim the first handler + // dataPriv.set( this, "focus", ... ) + // dataPriv.set( this, "blur", ... ) + leverageNative( this, type, expectSync ); + + // Return false to allow normal processing in the caller + return false; + }, + trigger: function() { + + // Force setup before trigger + leverageNative( this, type ); + + // Return non-false to allow normal event-path propagation + return true; + }, + + // Suppress native focus or blur as it's already being fired + // in leverageNative. + _default: function() { + return true; + }, + + delegateType: delegateType + }; +} ); + +// Create mouseenter/leave events using mouseover/out and event-time checks +// so that event delegation works in jQuery. +// Do the same for pointerenter/pointerleave and pointerover/pointerout +// +// Support: Safari 7 only +// Safari sends mouseenter too often; see: +// https://bugs.chromium.org/p/chromium/issues/detail?id=470258 +// for the description of the bug (it existed in older Chrome versions as well). 
+jQuery.each( { + mouseenter: "mouseover", + mouseleave: "mouseout", + pointerenter: "pointerover", + pointerleave: "pointerout" +}, function( orig, fix ) { + jQuery.event.special[ orig ] = { + delegateType: fix, + bindType: fix, + + handle: function( event ) { + var ret, + target = this, + related = event.relatedTarget, + handleObj = event.handleObj; + + // For mouseenter/leave call the handler if related is outside the target. + // NB: No relatedTarget if the mouse left/entered the browser window + if ( !related || ( related !== target && !jQuery.contains( target, related ) ) ) { + event.type = handleObj.origType; + ret = handleObj.handler.apply( this, arguments ); + event.type = fix; + } + return ret; + } + }; +} ); + +jQuery.fn.extend( { + + on: function( types, selector, data, fn ) { + return on( this, types, selector, data, fn ); + }, + one: function( types, selector, data, fn ) { + return on( this, types, selector, data, fn, 1 ); + }, + off: function( types, selector, fn ) { + var handleObj, type; + if ( types && types.preventDefault && types.handleObj ) { + + // ( event ) dispatched jQuery.Event + handleObj = types.handleObj; + jQuery( types.delegateTarget ).off( + handleObj.namespace ? + handleObj.origType + "." + handleObj.namespace : + handleObj.origType, + handleObj.selector, + handleObj.handler + ); + return this; + } + if ( typeof types === "object" ) { + + // ( types-object [, selector] ) + for ( type in types ) { + this.off( type, selector, types[ type ] ); + } + return this; + } + if ( selector === false || typeof selector === "function" ) { + + // ( types [, fn] ) + fn = selector; + selector = undefined; + } + if ( fn === false ) { + fn = returnFalse; + } + return this.each( function() { + jQuery.event.remove( this, types, fn, selector ); + } ); + } +} ); + + +var + + // Support: IE <=10 - 11, Edge 12 - 13 only + // In IE/Edge using regex groups here causes severe slowdowns. + // See https://connect.microsoft.com/IE/feedback/details/1736512/ + rnoInnerhtml = /\s*$/g; + +// Prefer a tbody over its parent table for containing new rows +function manipulationTarget( elem, content ) { + if ( nodeName( elem, "table" ) && + nodeName( content.nodeType !== 11 ? content : content.firstChild, "tr" ) ) { + + return jQuery( elem ).children( "tbody" )[ 0 ] || elem; + } + + return elem; +} + +// Replace/restore the type attribute of script elements for safe DOM manipulation +function disableScript( elem ) { + elem.type = ( elem.getAttribute( "type" ) !== null ) + "/" + elem.type; + return elem; +} +function restoreScript( elem ) { + if ( ( elem.type || "" ).slice( 0, 5 ) === "true/" ) { + elem.type = elem.type.slice( 5 ); + } else { + elem.removeAttribute( "type" ); + } + + return elem; +} + +function cloneCopyEvent( src, dest ) { + var i, l, type, pdataOld, udataOld, udataCur, events; + + if ( dest.nodeType !== 1 ) { + return; + } + + // 1. Copy private data: events, handlers, etc. + if ( dataPriv.hasData( src ) ) { + pdataOld = dataPriv.get( src ); + events = pdataOld.events; + + if ( events ) { + dataPriv.remove( dest, "handle events" ); + + for ( type in events ) { + for ( i = 0, l = events[ type ].length; i < l; i++ ) { + jQuery.event.add( dest, type, events[ type ][ i ] ); + } + } + } + } + + // 2. 
Copy user data + if ( dataUser.hasData( src ) ) { + udataOld = dataUser.access( src ); + udataCur = jQuery.extend( {}, udataOld ); + + dataUser.set( dest, udataCur ); + } +} + +// Fix IE bugs, see support tests +function fixInput( src, dest ) { + var nodeName = dest.nodeName.toLowerCase(); + + // Fails to persist the checked state of a cloned checkbox or radio button. + if ( nodeName === "input" && rcheckableType.test( src.type ) ) { + dest.checked = src.checked; + + // Fails to return the selected option to the default selected state when cloning options + } else if ( nodeName === "input" || nodeName === "textarea" ) { + dest.defaultValue = src.defaultValue; + } +} + +function domManip( collection, args, callback, ignored ) { + + // Flatten any nested arrays + args = flat( args ); + + var fragment, first, scripts, hasScripts, node, doc, + i = 0, + l = collection.length, + iNoClone = l - 1, + value = args[ 0 ], + valueIsFunction = isFunction( value ); + + // We can't cloneNode fragments that contain checked, in WebKit + if ( valueIsFunction || + ( l > 1 && typeof value === "string" && + !support.checkClone && rchecked.test( value ) ) ) { + return collection.each( function( index ) { + var self = collection.eq( index ); + if ( valueIsFunction ) { + args[ 0 ] = value.call( this, index, self.html() ); + } + domManip( self, args, callback, ignored ); + } ); + } + + if ( l ) { + fragment = buildFragment( args, collection[ 0 ].ownerDocument, false, collection, ignored ); + first = fragment.firstChild; + + if ( fragment.childNodes.length === 1 ) { + fragment = first; + } + + // Require either new content or an interest in ignored elements to invoke the callback + if ( first || ignored ) { + scripts = jQuery.map( getAll( fragment, "script" ), disableScript ); + hasScripts = scripts.length; + + // Use the original fragment for the last item + // instead of the first because it can end up + // being emptied incorrectly in certain situations (#8070). + for ( ; i < l; i++ ) { + node = fragment; + + if ( i !== iNoClone ) { + node = jQuery.clone( node, true, true ); + + // Keep references to cloned scripts for later restoration + if ( hasScripts ) { + + // Support: Android <=4.0 only, PhantomJS 1 only + // push.apply(_, arraylike) throws on ancient WebKit + jQuery.merge( scripts, getAll( node, "script" ) ); + } + } + + callback.call( collection[ i ], node, i ); + } + + if ( hasScripts ) { + doc = scripts[ scripts.length - 1 ].ownerDocument; + + // Reenable scripts + jQuery.map( scripts, restoreScript ); + + // Evaluate executable scripts on first document insertion + for ( i = 0; i < hasScripts; i++ ) { + node = scripts[ i ]; + if ( rscriptType.test( node.type || "" ) && + !dataPriv.access( node, "globalEval" ) && + jQuery.contains( doc, node ) ) { + + if ( node.src && ( node.type || "" ).toLowerCase() !== "module" ) { + + // Optional AJAX dependency, but won't run scripts if not present + if ( jQuery._evalUrl && !node.noModule ) { + jQuery._evalUrl( node.src, { + nonce: node.nonce || node.getAttribute( "nonce" ) + }, doc ); + } + } else { + DOMEval( node.textContent.replace( rcleanScript, "" ), node, doc ); + } + } + } + } + } + } + + return collection; +} + +function remove( elem, selector, keepData ) { + var node, + nodes = selector ? 
jQuery.filter( selector, elem ) : elem, + i = 0; + + for ( ; ( node = nodes[ i ] ) != null; i++ ) { + if ( !keepData && node.nodeType === 1 ) { + jQuery.cleanData( getAll( node ) ); + } + + if ( node.parentNode ) { + if ( keepData && isAttached( node ) ) { + setGlobalEval( getAll( node, "script" ) ); + } + node.parentNode.removeChild( node ); + } + } + + return elem; +} + +jQuery.extend( { + htmlPrefilter: function( html ) { + return html; + }, + + clone: function( elem, dataAndEvents, deepDataAndEvents ) { + var i, l, srcElements, destElements, + clone = elem.cloneNode( true ), + inPage = isAttached( elem ); + + // Fix IE cloning issues + if ( !support.noCloneChecked && ( elem.nodeType === 1 || elem.nodeType === 11 ) && + !jQuery.isXMLDoc( elem ) ) { + + // We eschew Sizzle here for performance reasons: https://jsperf.com/getall-vs-sizzle/2 + destElements = getAll( clone ); + srcElements = getAll( elem ); + + for ( i = 0, l = srcElements.length; i < l; i++ ) { + fixInput( srcElements[ i ], destElements[ i ] ); + } + } + + // Copy the events from the original to the clone + if ( dataAndEvents ) { + if ( deepDataAndEvents ) { + srcElements = srcElements || getAll( elem ); + destElements = destElements || getAll( clone ); + + for ( i = 0, l = srcElements.length; i < l; i++ ) { + cloneCopyEvent( srcElements[ i ], destElements[ i ] ); + } + } else { + cloneCopyEvent( elem, clone ); + } + } + + // Preserve script evaluation history + destElements = getAll( clone, "script" ); + if ( destElements.length > 0 ) { + setGlobalEval( destElements, !inPage && getAll( elem, "script" ) ); + } + + // Return the cloned set + return clone; + }, + + cleanData: function( elems ) { + var data, elem, type, + special = jQuery.event.special, + i = 0; + + for ( ; ( elem = elems[ i ] ) !== undefined; i++ ) { + if ( acceptData( elem ) ) { + if ( ( data = elem[ dataPriv.expando ] ) ) { + if ( data.events ) { + for ( type in data.events ) { + if ( special[ type ] ) { + jQuery.event.remove( elem, type ); + + // This is a shortcut to avoid jQuery.event.remove's overhead + } else { + jQuery.removeEvent( elem, type, data.handle ); + } + } + } + + // Support: Chrome <=35 - 45+ + // Assign undefined instead of using delete, see Data#remove + elem[ dataPriv.expando ] = undefined; + } + if ( elem[ dataUser.expando ] ) { + + // Support: Chrome <=35 - 45+ + // Assign undefined instead of using delete, see Data#remove + elem[ dataUser.expando ] = undefined; + } + } + } + } +} ); + +jQuery.fn.extend( { + detach: function( selector ) { + return remove( this, selector, true ); + }, + + remove: function( selector ) { + return remove( this, selector ); + }, + + text: function( value ) { + return access( this, function( value ) { + return value === undefined ? 
+ jQuery.text( this ) : + this.empty().each( function() { + if ( this.nodeType === 1 || this.nodeType === 11 || this.nodeType === 9 ) { + this.textContent = value; + } + } ); + }, null, value, arguments.length ); + }, + + append: function() { + return domManip( this, arguments, function( elem ) { + if ( this.nodeType === 1 || this.nodeType === 11 || this.nodeType === 9 ) { + var target = manipulationTarget( this, elem ); + target.appendChild( elem ); + } + } ); + }, + + prepend: function() { + return domManip( this, arguments, function( elem ) { + if ( this.nodeType === 1 || this.nodeType === 11 || this.nodeType === 9 ) { + var target = manipulationTarget( this, elem ); + target.insertBefore( elem, target.firstChild ); + } + } ); + }, + + before: function() { + return domManip( this, arguments, function( elem ) { + if ( this.parentNode ) { + this.parentNode.insertBefore( elem, this ); + } + } ); + }, + + after: function() { + return domManip( this, arguments, function( elem ) { + if ( this.parentNode ) { + this.parentNode.insertBefore( elem, this.nextSibling ); + } + } ); + }, + + empty: function() { + var elem, + i = 0; + + for ( ; ( elem = this[ i ] ) != null; i++ ) { + if ( elem.nodeType === 1 ) { + + // Prevent memory leaks + jQuery.cleanData( getAll( elem, false ) ); + + // Remove any remaining nodes + elem.textContent = ""; + } + } + + return this; + }, + + clone: function( dataAndEvents, deepDataAndEvents ) { + dataAndEvents = dataAndEvents == null ? false : dataAndEvents; + deepDataAndEvents = deepDataAndEvents == null ? dataAndEvents : deepDataAndEvents; + + return this.map( function() { + return jQuery.clone( this, dataAndEvents, deepDataAndEvents ); + } ); + }, + + html: function( value ) { + return access( this, function( value ) { + var elem = this[ 0 ] || {}, + i = 0, + l = this.length; + + if ( value === undefined && elem.nodeType === 1 ) { + return elem.innerHTML; + } + + // See if we can take a shortcut and just use innerHTML + if ( typeof value === "string" && !rnoInnerhtml.test( value ) && + !wrapMap[ ( rtagName.exec( value ) || [ "", "" ] )[ 1 ].toLowerCase() ] ) { + + value = jQuery.htmlPrefilter( value ); + + try { + for ( ; i < l; i++ ) { + elem = this[ i ] || {}; + + // Remove element nodes and prevent memory leaks + if ( elem.nodeType === 1 ) { + jQuery.cleanData( getAll( elem, false ) ); + elem.innerHTML = value; + } + } + + elem = 0; + + // If using innerHTML throws an exception, use the fallback method + } catch ( e ) {} + } + + if ( elem ) { + this.empty().append( value ); + } + }, null, value, arguments.length ); + }, + + replaceWith: function() { + var ignored = []; + + // Make the changes, replacing each non-ignored context element with the new content + return domManip( this, arguments, function( elem ) { + var parent = this.parentNode; + + if ( jQuery.inArray( this, ignored ) < 0 ) { + jQuery.cleanData( getAll( this ) ); + if ( parent ) { + parent.replaceChild( elem, this ); + } + } + + // Force callback invocation + }, ignored ); + } +} ); + +jQuery.each( { + appendTo: "append", + prependTo: "prepend", + insertBefore: "before", + insertAfter: "after", + replaceAll: "replaceWith" +}, function( name, original ) { + jQuery.fn[ name ] = function( selector ) { + var elems, + ret = [], + insert = jQuery( selector ), + last = insert.length - 1, + i = 0; + + for ( ; i <= last; i++ ) { + elems = i === last ? 
jQuery.makeArray( elements ) : this; + } ).filter( function() { + var type = this.type; + + // Use .is( ":disabled" ) so that fieldset[disabled] works + return this.name && !jQuery( this ).is( ":disabled" ) && + rsubmittable.test( this.nodeName ) && !rsubmitterTypes.test( type ) && + ( this.checked || !rcheckableType.test( type ) ); + } ).map( function( _i, elem ) { + var val = jQuery( this ).val(); + + if ( val == null ) { + return null; + } + + if ( Array.isArray( val ) ) { + return jQuery.map( val, function( val ) { + return { name: elem.name, value: val.replace( rCRLF, "\r\n" ) }; + } ); + } + + return { name: elem.name, value: val.replace( rCRLF, "\r\n" ) }; + } ).get(); + } +} ); + + +var + r20 = /%20/g, + rhash = /#.*$/, + rantiCache = /([?&])_=[^&]*/, + rheaders = /^(.*?):[ \t]*([^\r\n]*)$/mg, + + // #7653, #8125, #8152: local protocol detection + rlocalProtocol = /^(?:about|app|app-storage|.+-extension|file|res|widget):$/, + rnoContent = /^(?:GET|HEAD)$/, + rprotocol = /^\/\//, + + /* Prefilters + * 1) They are useful to introduce custom dataTypes (see ajax/jsonp.js for an example) + * 2) These are called: + * - BEFORE asking for a transport + * - AFTER param serialization (s.data is a string if s.processData is true) + * 3) key is the dataType + * 4) the catchall symbol "*" can be used + * 5) execution will start with transport dataType and THEN continue down to "*" if needed + */ + prefilters = {}, + + /* Transports bindings + * 1) key is the dataType + * 2) the catchall symbol "*" can be used + * 3) selection will start with transport dataType and THEN go to "*" if needed + */ + transports = {}, + + // Avoid comment-prolog char sequence (#10098); must appease lint and evade compression + allTypes = "*/".concat( "*" ), + + // Anchor tag for parsing the document origin + originAnchor = document.createElement( "a" ); + +originAnchor.href = location.href; + +// Base "constructor" for jQuery.ajaxPrefilter and jQuery.ajaxTransport +function addToPrefiltersOrTransports( structure ) { + + // dataTypeExpression is optional and defaults to "*" + return function( dataTypeExpression, func ) { + + if ( typeof dataTypeExpression !== "string" ) { + func = dataTypeExpression; + dataTypeExpression = "*"; + } + + var dataType, + i = 0, + dataTypes = dataTypeExpression.toLowerCase().match( rnothtmlwhite ) || []; + + if ( isFunction( func ) ) { + + // For each dataType in the dataTypeExpression + while ( ( dataType = dataTypes[ i++ ] ) ) { + + // Prepend if requested + if ( dataType[ 0 ] === "+" ) { + dataType = dataType.slice( 1 ) || "*"; + ( structure[ dataType ] = structure[ dataType ] || [] ).unshift( func ); + + // Otherwise append + } else { + ( structure[ dataType ] = structure[ dataType ] || [] ).push( func ); + } + } + } + }; +} + +// Base inspection function for prefilters and transports +function inspectPrefiltersOrTransports( structure, options, originalOptions, jqXHR ) { + + var inspected = {}, + seekingTransport = ( structure === transports ); + + function inspect( dataType ) { + var selected; + inspected[ dataType ] = true; + jQuery.each( structure[ dataType ] || [], function( _, prefilterOrFactory ) { + var dataTypeOrTransport = prefilterOrFactory( options, originalOptions, jqXHR ); + if ( typeof dataTypeOrTransport === "string" && + !seekingTransport && !inspected[ dataTypeOrTransport ] ) { + + options.dataTypes.unshift( dataTypeOrTransport ); + inspect( dataTypeOrTransport ); + return false; + } else if ( seekingTransport ) { + return !( selected = dataTypeOrTransport ); + } + } 
); + return selected; + } + + return inspect( options.dataTypes[ 0 ] ) || !inspected[ "*" ] && inspect( "*" ); +} + +// A special extend for ajax options +// that takes "flat" options (not to be deep extended) +// Fixes #9887 +function ajaxExtend( target, src ) { + var key, deep, + flatOptions = jQuery.ajaxSettings.flatOptions || {}; + + for ( key in src ) { + if ( src[ key ] !== undefined ) { + ( flatOptions[ key ] ? target : ( deep || ( deep = {} ) ) )[ key ] = src[ key ]; + } + } + if ( deep ) { + jQuery.extend( true, target, deep ); + } + + return target; +} + +/* Handles responses to an ajax request: + * - finds the right dataType (mediates between content-type and expected dataType) + * - returns the corresponding response + */ +function ajaxHandleResponses( s, jqXHR, responses ) { + + var ct, type, finalDataType, firstDataType, + contents = s.contents, + dataTypes = s.dataTypes; + + // Remove auto dataType and get content-type in the process + while ( dataTypes[ 0 ] === "*" ) { + dataTypes.shift(); + if ( ct === undefined ) { + ct = s.mimeType || jqXHR.getResponseHeader( "Content-Type" ); + } + } + + // Check if we're dealing with a known content-type + if ( ct ) { + for ( type in contents ) { + if ( contents[ type ] && contents[ type ].test( ct ) ) { + dataTypes.unshift( type ); + break; + } + } + } + + // Check to see if we have a response for the expected dataType + if ( dataTypes[ 0 ] in responses ) { + finalDataType = dataTypes[ 0 ]; + } else { + + // Try convertible dataTypes + for ( type in responses ) { + if ( !dataTypes[ 0 ] || s.converters[ type + " " + dataTypes[ 0 ] ] ) { + finalDataType = type; + break; + } + if ( !firstDataType ) { + firstDataType = type; + } + } + + // Or just use first one + finalDataType = finalDataType || firstDataType; + } + + // If we found a dataType + // We add the dataType to the list if needed + // and return the corresponding response + if ( finalDataType ) { + if ( finalDataType !== dataTypes[ 0 ] ) { + dataTypes.unshift( finalDataType ); + } + return responses[ finalDataType ]; + } +} + +/* Chain conversions given the request and the original response + * Also sets the responseXXX fields on the jqXHR instance + */ +function ajaxConvert( s, response, jqXHR, isSuccess ) { + var conv2, current, conv, tmp, prev, + converters = {}, + + // Work with a copy of dataTypes in case we need to modify it for conversion + dataTypes = s.dataTypes.slice(); + + // Create converters map with lowercased keys + if ( dataTypes[ 1 ] ) { + for ( conv in s.converters ) { + converters[ conv.toLowerCase() ] = s.converters[ conv ]; + } + } + + current = dataTypes.shift(); + + // Convert to each sequential dataType + while ( current ) { + + if ( s.responseFields[ current ] ) { + jqXHR[ s.responseFields[ current ] ] = response; + } + + // Apply the dataFilter if provided + if ( !prev && isSuccess && s.dataFilter ) { + response = s.dataFilter( response, s.dataType ); + } + + prev = current; + current = dataTypes.shift(); + + if ( current ) { + + // There's only work to do if current dataType is non-auto + if ( current === "*" ) { + + current = prev; + + // Convert response if prev dataType is non-auto and differs from current + } else if ( prev !== "*" && prev !== current ) { + + // Seek a direct converter + conv = converters[ prev + " " + current ] || converters[ "* " + current ]; + + // If none found, seek a pair + if ( !conv ) { + for ( conv2 in converters ) { + + // If conv2 outputs current + tmp = conv2.split( " " ); + if ( tmp[ 1 ] === current ) { + + // If prev 
can be converted to accepted input + conv = converters[ prev + " " + tmp[ 0 ] ] || + converters[ "* " + tmp[ 0 ] ]; + if ( conv ) { + + // Condense equivalence converters + if ( conv === true ) { + conv = converters[ conv2 ]; + + // Otherwise, insert the intermediate dataType + } else if ( converters[ conv2 ] !== true ) { + current = tmp[ 0 ]; + dataTypes.unshift( tmp[ 1 ] ); + } + break; + } + } + } + } + + // Apply converter (if not an equivalence) + if ( conv !== true ) { + + // Unless errors are allowed to bubble, catch and return them + if ( conv && s.throws ) { + response = conv( response ); + } else { + try { + response = conv( response ); + } catch ( e ) { + return { + state: "parsererror", + error: conv ? e : "No conversion from " + prev + " to " + current + }; + } + } + } + } + } + } + + return { state: "success", data: response }; +} + +jQuery.extend( { + + // Counter for holding the number of active queries + active: 0, + + // Last-Modified header cache for next request + lastModified: {}, + etag: {}, + + ajaxSettings: { + url: location.href, + type: "GET", + isLocal: rlocalProtocol.test( location.protocol ), + global: true, + processData: true, + async: true, + contentType: "application/x-www-form-urlencoded; charset=UTF-8", + + /* + timeout: 0, + data: null, + dataType: null, + username: null, + password: null, + cache: null, + throws: false, + traditional: false, + headers: {}, + */ + + accepts: { + "*": allTypes, + text: "text/plain", + html: "text/html", + xml: "application/xml, text/xml", + json: "application/json, text/javascript" + }, + + contents: { + xml: /\bxml\b/, + html: /\bhtml/, + json: /\bjson\b/ + }, + + responseFields: { + xml: "responseXML", + text: "responseText", + json: "responseJSON" + }, + + // Data converters + // Keys separate source (or catchall "*") and destination types with a single space + converters: { + + // Convert anything to text + "* text": String, + + // Text to html (true = no transformation) + "text html": true, + + // Evaluate text as a json expression + "text json": JSON.parse, + + // Parse text as xml + "text xml": jQuery.parseXML + }, + + // For options that shouldn't be deep extended: + // you can add your own custom options here if + // and when you create one that shouldn't be + // deep extended (see ajaxExtend) + flatOptions: { + url: true, + context: true + } + }, + + // Creates a full fledged settings object into target + // with both ajaxSettings and settings fields. + // If target is omitted, writes into ajaxSettings. + ajaxSetup: function( target, settings ) { + return settings ? 
+ + // Building a settings object + ajaxExtend( ajaxExtend( target, jQuery.ajaxSettings ), settings ) : + + // Extending ajaxSettings + ajaxExtend( jQuery.ajaxSettings, target ); + }, + + ajaxPrefilter: addToPrefiltersOrTransports( prefilters ), + ajaxTransport: addToPrefiltersOrTransports( transports ), + + // Main method + ajax: function( url, options ) { + + // If url is an object, simulate pre-1.5 signature + if ( typeof url === "object" ) { + options = url; + url = undefined; + } + + // Force options to be an object + options = options || {}; + + var transport, + + // URL without anti-cache param + cacheURL, + + // Response headers + responseHeadersString, + responseHeaders, + + // timeout handle + timeoutTimer, + + // Url cleanup var + urlAnchor, + + // Request state (becomes false upon send and true upon completion) + completed, + + // To know if global events are to be dispatched + fireGlobals, + + // Loop variable + i, + + // uncached part of the url + uncached, + + // Create the final options object + s = jQuery.ajaxSetup( {}, options ), + + // Callbacks context + callbackContext = s.context || s, + + // Context for global events is callbackContext if it is a DOM node or jQuery collection + globalEventContext = s.context && + ( callbackContext.nodeType || callbackContext.jquery ) ? + jQuery( callbackContext ) : + jQuery.event, + + // Deferreds + deferred = jQuery.Deferred(), + completeDeferred = jQuery.Callbacks( "once memory" ), + + // Status-dependent callbacks + statusCode = s.statusCode || {}, + + // Headers (they are sent all at once) + requestHeaders = {}, + requestHeadersNames = {}, + + // Default abort message + strAbort = "canceled", + + // Fake xhr + jqXHR = { + readyState: 0, + + // Builds headers hashtable if needed + getResponseHeader: function( key ) { + var match; + if ( completed ) { + if ( !responseHeaders ) { + responseHeaders = {}; + while ( ( match = rheaders.exec( responseHeadersString ) ) ) { + responseHeaders[ match[ 1 ].toLowerCase() + " " ] = + ( responseHeaders[ match[ 1 ].toLowerCase() + " " ] || [] ) + .concat( match[ 2 ] ); + } + } + match = responseHeaders[ key.toLowerCase() + " " ]; + } + return match == null ? null : match.join( ", " ); + }, + + // Raw string + getAllResponseHeaders: function() { + return completed ? 
responseHeadersString : null; + }, + + // Caches the header + setRequestHeader: function( name, value ) { + if ( completed == null ) { + name = requestHeadersNames[ name.toLowerCase() ] = + requestHeadersNames[ name.toLowerCase() ] || name; + requestHeaders[ name ] = value; + } + return this; + }, + + // Overrides response content-type header + overrideMimeType: function( type ) { + if ( completed == null ) { + s.mimeType = type; + } + return this; + }, + + // Status-dependent callbacks + statusCode: function( map ) { + var code; + if ( map ) { + if ( completed ) { + + // Execute the appropriate callbacks + jqXHR.always( map[ jqXHR.status ] ); + } else { + + // Lazy-add the new callbacks in a way that preserves old ones + for ( code in map ) { + statusCode[ code ] = [ statusCode[ code ], map[ code ] ]; + } + } + } + return this; + }, + + // Cancel the request + abort: function( statusText ) { + var finalText = statusText || strAbort; + if ( transport ) { + transport.abort( finalText ); + } + done( 0, finalText ); + return this; + } + }; + + // Attach deferreds + deferred.promise( jqXHR ); + + // Add protocol if not provided (prefilters might expect it) + // Handle falsy url in the settings object (#10093: consistency with old signature) + // We also use the url parameter if available + s.url = ( ( url || s.url || location.href ) + "" ) + .replace( rprotocol, location.protocol + "//" ); + + // Alias method option to type as per ticket #12004 + s.type = options.method || options.type || s.method || s.type; + + // Extract dataTypes list + s.dataTypes = ( s.dataType || "*" ).toLowerCase().match( rnothtmlwhite ) || [ "" ]; + + // A cross-domain request is in order when the origin doesn't match the current origin. + if ( s.crossDomain == null ) { + urlAnchor = document.createElement( "a" ); + + // Support: IE <=8 - 11, Edge 12 - 15 + // IE throws exception on accessing the href property if url is malformed, + // e.g. 
http://example.com:80x/ + try { + urlAnchor.href = s.url; + + // Support: IE <=8 - 11 only + // Anchor's host property isn't correctly set when s.url is relative + urlAnchor.href = urlAnchor.href; + s.crossDomain = originAnchor.protocol + "//" + originAnchor.host !== + urlAnchor.protocol + "//" + urlAnchor.host; + } catch ( e ) { + + // If there is an error parsing the URL, assume it is crossDomain, + // it can be rejected by the transport if it is invalid + s.crossDomain = true; + } + } + + // Convert data if not already a string + if ( s.data && s.processData && typeof s.data !== "string" ) { + s.data = jQuery.param( s.data, s.traditional ); + } + + // Apply prefilters + inspectPrefiltersOrTransports( prefilters, s, options, jqXHR ); + + // If request was aborted inside a prefilter, stop there + if ( completed ) { + return jqXHR; + } + + // We can fire global events as of now if asked to + // Don't fire events if jQuery.event is undefined in an AMD-usage scenario (#15118) + fireGlobals = jQuery.event && s.global; + + // Watch for a new set of requests + if ( fireGlobals && jQuery.active++ === 0 ) { + jQuery.event.trigger( "ajaxStart" ); + } + + // Uppercase the type + s.type = s.type.toUpperCase(); + + // Determine if request has content + s.hasContent = !rnoContent.test( s.type ); + + // Save the URL in case we're toying with the If-Modified-Since + // and/or If-None-Match header later on + // Remove hash to simplify url manipulation + cacheURL = s.url.replace( rhash, "" ); + + // More options handling for requests with no content + if ( !s.hasContent ) { + + // Remember the hash so we can put it back + uncached = s.url.slice( cacheURL.length ); + + // If data is available and should be processed, append data to url + if ( s.data && ( s.processData || typeof s.data === "string" ) ) { + cacheURL += ( rquery.test( cacheURL ) ? "&" : "?" ) + s.data; + + // #9682: remove data so that it's not used in an eventual retry + delete s.data; + } + + // Add or update anti-cache param if needed + if ( s.cache === false ) { + cacheURL = cacheURL.replace( rantiCache, "$1" ); + uncached = ( rquery.test( cacheURL ) ? "&" : "?" ) + "_=" + ( nonce.guid++ ) + + uncached; + } + + // Put hash and anti-cache on the URL that will be requested (gh-1732) + s.url = cacheURL + uncached; + + // Change '%20' to '+' if this is encoded form body content (gh-2658) + } else if ( s.data && s.processData && + ( s.contentType || "" ).indexOf( "application/x-www-form-urlencoded" ) === 0 ) { + s.data = s.data.replace( r20, "+" ); + } + + // Set the If-Modified-Since and/or If-None-Match header, if in ifModified mode. + if ( s.ifModified ) { + if ( jQuery.lastModified[ cacheURL ] ) { + jqXHR.setRequestHeader( "If-Modified-Since", jQuery.lastModified[ cacheURL ] ); + } + if ( jQuery.etag[ cacheURL ] ) { + jqXHR.setRequestHeader( "If-None-Match", jQuery.etag[ cacheURL ] ); + } + } + + // Set the correct header, if data is being sent + if ( s.data && s.hasContent && s.contentType !== false || options.contentType ) { + jqXHR.setRequestHeader( "Content-Type", s.contentType ); + } + + // Set the Accepts header for the server, depending on the dataType + jqXHR.setRequestHeader( + "Accept", + s.dataTypes[ 0 ] && s.accepts[ s.dataTypes[ 0 ] ] ? + s.accepts[ s.dataTypes[ 0 ] ] + + ( s.dataTypes[ 0 ] !== "*" ? 
", " + allTypes + "; q=0.01" : "" ) : + s.accepts[ "*" ] + ); + + // Check for headers option + for ( i in s.headers ) { + jqXHR.setRequestHeader( i, s.headers[ i ] ); + } + + // Allow custom headers/mimetypes and early abort + if ( s.beforeSend && + ( s.beforeSend.call( callbackContext, jqXHR, s ) === false || completed ) ) { + + // Abort if not done already and return + return jqXHR.abort(); + } + + // Aborting is no longer a cancellation + strAbort = "abort"; + + // Install callbacks on deferreds + completeDeferred.add( s.complete ); + jqXHR.done( s.success ); + jqXHR.fail( s.error ); + + // Get transport + transport = inspectPrefiltersOrTransports( transports, s, options, jqXHR ); + + // If no transport, we auto-abort + if ( !transport ) { + done( -1, "No Transport" ); + } else { + jqXHR.readyState = 1; + + // Send global event + if ( fireGlobals ) { + globalEventContext.trigger( "ajaxSend", [ jqXHR, s ] ); + } + + // If request was aborted inside ajaxSend, stop there + if ( completed ) { + return jqXHR; + } + + // Timeout + if ( s.async && s.timeout > 0 ) { + timeoutTimer = window.setTimeout( function() { + jqXHR.abort( "timeout" ); + }, s.timeout ); + } + + try { + completed = false; + transport.send( requestHeaders, done ); + } catch ( e ) { + + // Rethrow post-completion exceptions + if ( completed ) { + throw e; + } + + // Propagate others as results + done( -1, e ); + } + } + + // Callback for when everything is done + function done( status, nativeStatusText, responses, headers ) { + var isSuccess, success, error, response, modified, + statusText = nativeStatusText; + + // Ignore repeat invocations + if ( completed ) { + return; + } + + completed = true; + + // Clear timeout if it exists + if ( timeoutTimer ) { + window.clearTimeout( timeoutTimer ); + } + + // Dereference transport for early garbage collection + // (no matter how long the jqXHR object will be used) + transport = undefined; + + // Cache response headers + responseHeadersString = headers || ""; + + // Set readyState + jqXHR.readyState = status > 0 ? 4 : 0; + + // Determine if successful + isSuccess = status >= 200 && status < 300 || status === 304; + + // Get response data + if ( responses ) { + response = ajaxHandleResponses( s, jqXHR, responses ); + } + + // Use a noop converter for missing script but not if jsonp + if ( !isSuccess && + jQuery.inArray( "script", s.dataTypes ) > -1 && + jQuery.inArray( "json", s.dataTypes ) < 0 ) { + s.converters[ "text script" ] = function() {}; + } + + // Convert no matter what (that way responseXXX fields are always set) + response = ajaxConvert( s, response, jqXHR, isSuccess ); + + // If successful, handle type chaining + if ( isSuccess ) { + + // Set the If-Modified-Since and/or If-None-Match header, if in ifModified mode. 
+ if ( s.ifModified ) { + modified = jqXHR.getResponseHeader( "Last-Modified" ); + if ( modified ) { + jQuery.lastModified[ cacheURL ] = modified; + } + modified = jqXHR.getResponseHeader( "etag" ); + if ( modified ) { + jQuery.etag[ cacheURL ] = modified; + } + } + + // if no content + if ( status === 204 || s.type === "HEAD" ) { + statusText = "nocontent"; + + // if not modified + } else if ( status === 304 ) { + statusText = "notmodified"; + + // If we have data, let's convert it + } else { + statusText = response.state; + success = response.data; + error = response.error; + isSuccess = !error; + } + } else { + + // Extract error from statusText and normalize for non-aborts + error = statusText; + if ( status || !statusText ) { + statusText = "error"; + if ( status < 0 ) { + status = 0; + } + } + } + + // Set data for the fake xhr object + jqXHR.status = status; + jqXHR.statusText = ( nativeStatusText || statusText ) + ""; + + // Success/Error + if ( isSuccess ) { + deferred.resolveWith( callbackContext, [ success, statusText, jqXHR ] ); + } else { + deferred.rejectWith( callbackContext, [ jqXHR, statusText, error ] ); + } + + // Status-dependent callbacks + jqXHR.statusCode( statusCode ); + statusCode = undefined; + + if ( fireGlobals ) { + globalEventContext.trigger( isSuccess ? "ajaxSuccess" : "ajaxError", + [ jqXHR, s, isSuccess ? success : error ] ); + } + + // Complete + completeDeferred.fireWith( callbackContext, [ jqXHR, statusText ] ); + + if ( fireGlobals ) { + globalEventContext.trigger( "ajaxComplete", [ jqXHR, s ] ); + + // Handle the global AJAX counter + if ( !( --jQuery.active ) ) { + jQuery.event.trigger( "ajaxStop" ); + } + } + } + + return jqXHR; + }, + + getJSON: function( url, data, callback ) { + return jQuery.get( url, data, callback, "json" ); + }, + + getScript: function( url, callback ) { + return jQuery.get( url, undefined, callback, "script" ); + } +} ); + +jQuery.each( [ "get", "post" ], function( _i, method ) { + jQuery[ method ] = function( url, data, callback, type ) { + + // Shift arguments if data argument was omitted + if ( isFunction( data ) ) { + type = type || callback; + callback = data; + data = undefined; + } + + // The url can be an options object (which then must have .url) + return jQuery.ajax( jQuery.extend( { + url: url, + type: method, + dataType: type, + data: data, + success: callback + }, jQuery.isPlainObject( url ) && url ) ); + }; +} ); + +jQuery.ajaxPrefilter( function( s ) { + var i; + for ( i in s.headers ) { + if ( i.toLowerCase() === "content-type" ) { + s.contentType = s.headers[ i ] || ""; + } + } +} ); + + +jQuery._evalUrl = function( url, options, doc ) { + return jQuery.ajax( { + url: url, + + // Make this explicit, since user can override this through ajaxSetup (#11264) + type: "GET", + dataType: "script", + cache: true, + async: false, + global: false, + + // Only evaluate the response if it is successful (gh-4126) + // dataFilter is not invoked for failure responses, so using it instead + // of the default converter is kludgy but it works. 
+ converters: { + "text script": function() {} + }, + dataFilter: function( response ) { + jQuery.globalEval( response, options, doc ); + } + } ); +}; + + +jQuery.fn.extend( { + wrapAll: function( html ) { + var wrap; + + if ( this[ 0 ] ) { + if ( isFunction( html ) ) { + html = html.call( this[ 0 ] ); + } + + // The elements to wrap the target around + wrap = jQuery( html, this[ 0 ].ownerDocument ).eq( 0 ).clone( true ); + + if ( this[ 0 ].parentNode ) { + wrap.insertBefore( this[ 0 ] ); + } + + wrap.map( function() { + var elem = this; + + while ( elem.firstElementChild ) { + elem = elem.firstElementChild; + } + + return elem; + } ).append( this ); + } + + return this; + }, + + wrapInner: function( html ) { + if ( isFunction( html ) ) { + return this.each( function( i ) { + jQuery( this ).wrapInner( html.call( this, i ) ); + } ); + } + + return this.each( function() { + var self = jQuery( this ), + contents = self.contents(); + + if ( contents.length ) { + contents.wrapAll( html ); + + } else { + self.append( html ); + } + } ); + }, + + wrap: function( html ) { + var htmlIsFunction = isFunction( html ); + + return this.each( function( i ) { + jQuery( this ).wrapAll( htmlIsFunction ? html.call( this, i ) : html ); + } ); + }, + + unwrap: function( selector ) { + this.parent( selector ).not( "body" ).each( function() { + jQuery( this ).replaceWith( this.childNodes ); + } ); + return this; + } +} ); + + +jQuery.expr.pseudos.hidden = function( elem ) { + return !jQuery.expr.pseudos.visible( elem ); +}; +jQuery.expr.pseudos.visible = function( elem ) { + return !!( elem.offsetWidth || elem.offsetHeight || elem.getClientRects().length ); +}; + + + + +jQuery.ajaxSettings.xhr = function() { + try { + return new window.XMLHttpRequest(); + } catch ( e ) {} +}; + +var xhrSuccessStatus = { + + // File protocol always yields status code 0, assume 200 + 0: 200, + + // Support: IE <=9 only + // #1450: sometimes IE returns 1223 when it should be 204 + 1223: 204 + }, + xhrSupported = jQuery.ajaxSettings.xhr(); + +support.cors = !!xhrSupported && ( "withCredentials" in xhrSupported ); +support.ajax = xhrSupported = !!xhrSupported; + +jQuery.ajaxTransport( function( options ) { + var callback, errorCallback; + + // Cross domain only allowed if supported through XMLHttpRequest + if ( support.cors || xhrSupported && !options.crossDomain ) { + return { + send: function( headers, complete ) { + var i, + xhr = options.xhr(); + + xhr.open( + options.type, + options.url, + options.async, + options.username, + options.password + ); + + // Apply custom fields if provided + if ( options.xhrFields ) { + for ( i in options.xhrFields ) { + xhr[ i ] = options.xhrFields[ i ]; + } + } + + // Override mime type if needed + if ( options.mimeType && xhr.overrideMimeType ) { + xhr.overrideMimeType( options.mimeType ); + } + + // X-Requested-With header + // For cross-domain requests, seeing as conditions for a preflight are + // akin to a jigsaw puzzle, we simply never set it to be sure. + // (it can always be set on a per-request basis or even using ajaxSetup) + // For same-domain requests, won't change header if already provided. 
+ if ( !options.crossDomain && !headers[ "X-Requested-With" ] ) { + headers[ "X-Requested-With" ] = "XMLHttpRequest"; + } + + // Set headers + for ( i in headers ) { + xhr.setRequestHeader( i, headers[ i ] ); + } + + // Callback + callback = function( type ) { + return function() { + if ( callback ) { + callback = errorCallback = xhr.onload = + xhr.onerror = xhr.onabort = xhr.ontimeout = + xhr.onreadystatechange = null; + + if ( type === "abort" ) { + xhr.abort(); + } else if ( type === "error" ) { + + // Support: IE <=9 only + // On a manual native abort, IE9 throws + // errors on any property access that is not readyState + if ( typeof xhr.status !== "number" ) { + complete( 0, "error" ); + } else { + complete( + + // File: protocol always yields status 0; see #8605, #14207 + xhr.status, + xhr.statusText + ); + } + } else { + complete( + xhrSuccessStatus[ xhr.status ] || xhr.status, + xhr.statusText, + + // Support: IE <=9 only + // IE9 has no XHR2 but throws on binary (trac-11426) + // For XHR2 non-text, let the caller handle it (gh-2498) + ( xhr.responseType || "text" ) !== "text" || + typeof xhr.responseText !== "string" ? + { binary: xhr.response } : + { text: xhr.responseText }, + xhr.getAllResponseHeaders() + ); + } + } + }; + }; + + // Listen to events + xhr.onload = callback(); + errorCallback = xhr.onerror = xhr.ontimeout = callback( "error" ); + + // Support: IE 9 only + // Use onreadystatechange to replace onabort + // to handle uncaught aborts + if ( xhr.onabort !== undefined ) { + xhr.onabort = errorCallback; + } else { + xhr.onreadystatechange = function() { + + // Check readyState before timeout as it changes + if ( xhr.readyState === 4 ) { + + // Allow onerror to be called first, + // but that will not handle a native abort + // Also, save errorCallback to a variable + // as xhr.onerror cannot be accessed + window.setTimeout( function() { + if ( callback ) { + errorCallback(); + } + } ); + } + }; + } + + // Create the abort callback + callback = callback( "abort" ); + + try { + + // Do send the request (this may raise an exception) + xhr.send( options.hasContent && options.data || null ); + } catch ( e ) { + + // #14683: Only rethrow if this hasn't been notified as an error yet + if ( callback ) { + throw e; + } + } + }, + + abort: function() { + if ( callback ) { + callback(); + } + } + }; + } +} ); + + + + +// Prevent auto-execution of scripts when no explicit dataType was provided (See gh-2432) +jQuery.ajaxPrefilter( function( s ) { + if ( s.crossDomain ) { + s.contents.script = false; + } +} ); + +// Install script dataType +jQuery.ajaxSetup( { + accepts: { + script: "text/javascript, application/javascript, " + + "application/ecmascript, application/x-ecmascript" + }, + contents: { + script: /\b(?:java|ecma)script\b/ + }, + converters: { + "text script": function( text ) { + jQuery.globalEval( text ); + return text; + } + } +} ); + +// Handle cache's special case and crossDomain +jQuery.ajaxPrefilter( "script", function( s ) { + if ( s.cache === undefined ) { + s.cache = false; + } + if ( s.crossDomain ) { + s.type = "GET"; + } +} ); + +// Bind script tag hack transport +jQuery.ajaxTransport( "script", function( s ) { + + // This transport only deals with cross domain or forced-by-attrs requests + if ( s.crossDomain || s.scriptAttrs ) { + var script, callback; + return { + send: function( _, complete ) { + script = jQuery( " + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

{{ name | escape | underline }}

+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/_templates/autosummary/function.html b/releases/1.32.2/torch_v2/_templates/autosummary/function.html new file mode 100644 index 00000000..24b173f6 --- /dev/null +++ b/releases/1.32.2/torch_v2/_templates/autosummary/function.html @@ -0,0 +1,164 @@ + + + + + + <no title> — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +

{{ name | escape | underline }}

+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/genindex.html b/releases/1.32.2/torch_v2/genindex.html new file mode 100644 index 00000000..c1be5dd2 --- /dev/null +++ b/releases/1.32.2/torch_v2/genindex.html @@ -0,0 +1,460 @@ + + + + + + Index — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
  • + +
  • +
  • +
+
+
+
+
+ + +

Index

+ +
+ _ + | A + | B + | C + | D + | E + | F + | G + | I + | M + | O + | P + | Q + | R + | S + | W + +
+

_

+ + +
+ +

A

+ + + +
    +
  • + aimet_torch.v2.quantization.affine + +
  • +
+ +

B

+ + +
+ +

C

+ + + +
+ +

D

+ + + +
+ +

E

+ + +
+ +

F

+ + + +
+ +

G

+ + + +
+ +

I

+ + + +
+ +

M

+ + +
+ +

O

+ + +
+ +

P

+ + + +
+ +

Q

+ + + +
+ +

R

+ + +
+ +

S

+ + + +
+ +

W

+ + +
+ + + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/install/index.html b/releases/1.32.2/torch_v2/install/index.html new file mode 100644 index 00000000..06553a66 --- /dev/null +++ b/releases/1.32.2/torch_v2/install/index.html @@ -0,0 +1,247 @@ + + + + + + AIMET Installation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Installation

+
+

Quick Install

+

The AIMET PyTorch GPU PyPI packages are available for environments that meet the following requirements:

+
    +
  • 64-bit Intel x86-compatible processor

  • +
  • Linux Ubuntu 22.04 LTS [Python 3.10] or Ubuntu 20.04 LTS [Python 3.8]

  • +
  • CUDA 12.0

  • +
  • PyTorch 2.2.2

  • +
+

Pip install:

+
apt-get install liblapacke
+python3 -m pip install aimet-torch
+
+
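As a quick sanity check (assuming the pip install above completed without errors), you can confirm that the package imports cleanly and ask pip for the installed version:

python3 -c 'import aimet_torch'     # exits silently if the install succeeded
python3 -m pip show aimet-torch     # prints the installed version and location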
+
+
+

Release Packages

+

For other AIMET variants, install the latest version from the .whl files hosted at https://github.com/quic/aimet/releases.

+

PyTorch

+
# Pytorch 1.13 with CUDA 11.x
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_torch-torch_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
+# Pytorch 1.13 CPU only
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_torch-torch_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
+

TensorFlow

+
# Tensorflow 2.10 GPU with CUDA 11.x
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_tensorflow-tf_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
+# Tensorflow 2.10 CPU only
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_tensorflow-tf_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
+

Onnx

+
# ONNX 1.14 GPU
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_onnx-onnx_gpu_1.31.0-cp38-cp38-linux_x86_64.whl
+# ONNX 1.14 CPU
+python3 -m pip install https://github.com/quic/aimet/releases/download/1.31.0/aimet_onnx-onnx_cpu_1.31.0-cp38-cp38-linux_x86_64.whl
+

For previous AIMET releases, browse the packages at https://github.com/quic/aimet/releases. Each release includes multiple Python packages with the following naming format:

+
# VARIANT in {torch_gpu, torch_cpu, tf_gpu, tf_cpu, onnx_gpu, onnx_cpu}
+# PACKAGE_PREFIX in {aimet_torch, aimet_tensorflow, aimet_onnx}
+<PACKAGE_PREFIX>-<VARIANT>_<VERSION>-cp38-cp38-linux_x86_64.whl
+
+
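As a concrete illustration, substituting PACKAGE_PREFIX=aimet_torch, VARIANT=torch_gpu and VERSION=1.31.0 into this template reproduces the wheel name used in the PyTorch GPU command above:

# <PACKAGE_PREFIX>-<VARIANT>_<VERSION>-cp38-cp38-linux_x86_64.whl
aimet_torch-torch_gpu_1.31.0-cp38-cp38-linux_x86_64.whl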
+

System Requirements

+

The AIMET package requires the following host platform setup:

+
    +
  • 64-bit Intel x86-compatible processor

  • +
  • Linux Ubuntu: 22.04 LTS

  • +
  • bash command shell

  • +
  • +
    For GPU variants:
    +
    +
    +
  • +
+

To use the GPU-accelerated training modules, an Nvidia CUDA-enabled GPU with Nvidia driver version 455 or later is required. Using the latest driver is always recommended, especially if using a newer GPU. Both CUDA-enabled and cuDNN-enabled GPUs are supported (cuDNN is Nvidia's GPU-accelerated deep neural network library built on top of CUDA).

+
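If the Nvidia driver is already installed, the nvidia-smi utility shipped with it reports the active driver version, which should be 455 or later:

nvidia-smi --query-gpu=driver_version --format=csv,noheader   # prints only the driver version
nvidia-smi                                                    # full summary, including GPUs and CUDA version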
+
+

Advanced Installation Instructions

+
+
There are two ways to set up and install AIMET:
    +
  • On your host machine

  • +
  • Using our pre-built development Docker images

  • +
+
+
+

Please click on the appropriate link for installation instructions:

+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/install/install_docker.html b/releases/1.32.2/torch_v2/install/install_docker.html new file mode 100644 index 00000000..c91e3026 --- /dev/null +++ b/releases/1.32.2/torch_v2/install/install_docker.html @@ -0,0 +1,320 @@ + + + + + + AIMET Installation in Docker — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Installation in Docker

+

This page provides instructions to install the AIMET package inside a development Docker container.

+
+

Set variant

+
+
Set the <variant_string> to ONE of the following, depending on your desired variant:
    +
  1. For the PyTorch 2.1 GPU variant, use torch-gpu

  2. +
  3. For the PyTorch 2.1 CPU variant, use torch-cpu

  4. +
  5. For the PyTorch 1.13 GPU variant, use torch-gpu-pt113

  6. +
  7. For the PyTorch 1.13 CPU variant, use torch-cpu-pt113

  8. +
  9. For the TensorFlow GPU variant, use tf-gpu

  10. +
  11. For the TensorFlow CPU variant, use tf-cpu

  12. +
  13. For the ONNX GPU variant, use onnx-gpu

  14. +
  15. For the ONNX CPU variant, use onnx-cpu

  16. +
+
+
+
export AIMET_VARIANT=<variant_string>
+
+
+
+
+

Use prebuilt docker image

+

Follow these instructions to use one of the pre-built docker images:

+
WORKSPACE="<absolute_path_to_workspace>"
+docker_image_name="artifacts.codelinaro.org/codelinaro-aimet/aimet-dev:latest.${AIMET_VARIANT}"
+docker_container_name="aimet-dev-<any_name>"
+
+
+

NOTE: Feel free to modify the docker_container_name as needed.

+
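For example, with AIMET_VARIANT=torch-gpu (one of the variant strings listed above), the image reference expands as shown below; pulling it explicitly is optional, since docker run fetches it on first use:

export AIMET_VARIANT=torch-gpu
docker_image_name="artifacts.codelinaro.org/codelinaro-aimet/aimet-dev:latest.${AIMET_VARIANT}"
echo ${docker_image_name}    # artifacts.codelinaro.org/codelinaro-aimet/aimet-dev:latest.torch-gpu
docker pull ${docker_image_name}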
+
+

Build docker image locally

+

Follow these instructions ONLY if you want to build the docker image locally. If not, skip to the next section.

+
WORKSPACE="<absolute_path_to_workspace>"
+docker_image_name="aimet-dev-docker:<any_tag>"
+docker_container_name="aimet-dev-<any_name>"
+docker build -t ${docker_image_name} -f $WORKSPACE/aimet/Jenkins/Dockerfile.${AIMET_VARIANT} .
+
+
+

NOTE: Feel free to modify the docker_image_name and docker_container_name as needed.

+
+
+

Start docker container

+

Ensure that a Docker container named $docker_container_name is not already running; otherwise, remove the existing container and then start a new one as follows:

+
docker ps -a | grep ${docker_container_name} && docker kill ${docker_container_name}
+
+docker run --rm -it -u $(id -u ${USER}):$(id -g ${USER}) \
+-v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro \
+-v ${HOME}:${HOME} -v ${WORKSPACE}:${WORKSPACE} \
+-v "/local/mnt/workspace":"/local/mnt/workspace" \
+--entrypoint /bin/bash -w ${WORKSPACE} --hostname ${docker_container_name} ${docker_image_name}
+
+
+
+
NOTE:
    +
  1. Feel free to modify the above docker run command based on the environment and filesystem on your host machine.

  2. +
  3. If nvidia-docker 2.0 is installed, add --gpus all to the docker run commands in order to enable GPU access inside the Docker container.

  4. +
  5. If nvidia-docker 1.0 is installed, then replace docker run with nvidia-docker run in order to enable GPU access inside the docker container.

  6. +
  7. Port forwarding is needed in order to run the Visualization APIs from inside the Docker container. This can be achieved by running the Docker container as follows:

  8. +
+
+
+
port_id="<any-port-number>"
+
+docker run -p ${port_id}:${port_id} --rm -it -u $(id -u ${USER}):$(id -g ${USER}) \
+-v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro \
+-v ${HOME}:${HOME} -v ${WORKSPACE}:${WORKSPACE} \
+-v "/local/mnt/workspace":"/local/mnt/workspace" \
+--entrypoint /bin/bash -w ${WORKSPACE} --hostname ${docker_container_name} ${docker_image_name}
+
+
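Similarly, if nvidia-docker 2.0 is installed (see the note above), the only change needed for GPU access is the added --gpus all flag, for example:

docker run --gpus all --rm -it -u $(id -u ${USER}):$(id -g ${USER}) \
-v /etc/passwd:/etc/passwd:ro -v /etc/group:/etc/group:ro \
-v ${HOME}:${HOME} -v ${WORKSPACE}:${WORKSPACE} \
-v "/local/mnt/workspace":"/local/mnt/workspace" \
--entrypoint /bin/bash -w ${WORKSPACE} --hostname ${docker_container_name} ${docker_image_name}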
+
+
+

Install AIMET packages

+
+

From PyPI

+

The AIMET Torch GPU package can be installed from PyPI as follows:

+

Go to https://pypi.org/project/aimet-torch to identify the version you wish to install:

+
+
    +
  • For PyTorch 1.13 GPU, you should use aimet-torch==1.31.1

  • +
  • For PyTorch 2.1.2 GPU, you should use aimet-torch >= 1.32.0

  • +
+
+
sudo apt-get install liblapacke -y
+pip install aimet-torch
+
+
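If you need a release that matches an older PyTorch rather than the latest one, pin the version explicitly; for example, per the version guidance above:

pip install "aimet-torch==1.31.1"    # PyTorch 1.13 GPU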
+
+
+

From Release Package

+

Alternatively, we host .whl packages for each release at https://github.com/quic/aimet/releases. Identify the release tag +of the package you wish to install, then follow the instructions below to install AIMET from the .whl file.

+

Set the <variant_string> to ONE of the following, depending on your desired variant:

+
    +
  1. For the PyTorch 2.1 GPU variant, use “torch_gpu”

  2. +
  3. For the PyTorch 2.1 CPU variant, use “torch_cpu”

  4. +
  5. For the PyTorch 1.13 GPU variant, use “torch_gpu-pt113”

  6. +
  7. For the PyTorch 1.13 CPU variant, use “torch_cpu-pt113”

  8. +
  9. For the TensorFlow GPU variant, use “tf_gpu”

  10. +
  11. For the TensorFlow CPU variant, use “tf_cpu”

  12. +
  13. For the ONNX GPU variant, use “onnx_gpu”

  14. +
  15. For the ONNX CPU variant, use “onnx_cpu”

  16. +
+
export AIMET_VARIANT=<variant_string>
+
+
+

Replace <release_tag> in the steps below with the appropriate tag:

+
export release_tag=<release_tag>
+
+
+

Set the package download URL as follows:

+
export download_url="https://github.com/quic/aimet/releases/download/${release_tag}"
+
+
+

Set the common suffix for the package files as follows:

+
export wheel_file_suffix="cp310-cp310-linux_x86_64.whl"
+
+
+

Install the AIMET packages in the order specified below:

+
+
NOTE:
    +
  1. Please prepend the “apt-get install” and “pip3 install” commands with “sudo -H” as appropriate.

  2. +
  3. These instructions assume that pip packages will be installed in the path: /usr/local/lib/python3.10/dist-packages. If that is not the case, please modify it accordingly.

  4. +
  5. Python dependencies will be installed automatically.

  6. +
+
+
+
# Install ONE of the following depending on the variant
+python3 -m pip install ${download_url}/aimet_torch-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix} -f https://download.pytorch.org/whl/torch_stable.html
+# OR
+python3 -m pip install ${download_url}/aimet_tensorflow-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix}
+# OR
+python3 -m pip install ${download_url}/aimet_onnx-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix}
+
+
+
+
+
+

Environment setup

+

Set the common environment variables as follows:

+
source /usr/local/lib/python3.10/dist-packages/aimet_common/bin/envsetup.sh
+
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/install/install_host.html b/releases/1.32.2/torch_v2/install/install_host.html new file mode 100644 index 00000000..ca65d2a3 --- /dev/null +++ b/releases/1.32.2/torch_v2/install/install_host.html @@ -0,0 +1,386 @@ + + + + + + AIMET Installation and Setup — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Installation and Setup

+

This page provides instructions to install the AIMET package on Ubuntu 22.04 LTS with an Nvidia GPU. Please follow the instructions in the order provided, unless specified otherwise.

+
+
NOTE:
    +
  1. Please prepend the “apt-get install” and “pip3 install” commands with “sudo -H” as appropriate.

  2. +
  3. These instructions assume that pip packages will be installed in the path: /usr/local/lib/python3.10/dist-packages. If that is not the case, please modify it accordingly.

  4. +
+
+
+
+

Install prerequisite packages

+

Install the basic prerequisite packages as follows:

+
apt-get update
+apt-get install python3.10 python3.10-dev python3-pip
+python3 -m pip install --upgrade pip
+apt-get install --assume-yes wget gnupg2
+
+
+

If you have multiple Python versions installed, set the default Python version as follows:

+
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
+update-alternatives --set python3 /usr/bin/python3.10
+
+
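You can then confirm that python3 resolves to the expected interpreter:

python3 --version                       # should report Python 3.10.x
update-alternatives --display python3   # shows which alternative is currently selected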
+
+
+

Install GPU packages

+

NOTE:

+
    +
  1. Do this section ONLY for the GPU variants.

  2. +
  3. +
    The released AIMET GPU packages were tested with the following CUDA toolkit versions:
      +
    1. PyTorch 2.1 GPU variant: CUDA Toolkit 11.8.0

    2. +
    3. PyTorch 1.13 GPU variant: CUDA Toolkit 11.7.1

    4. +
    5. TensorFlow GPU variant: CUDA Toolkit 11.8.0

    6. +
    7. ONNX GPU variant: CUDA Toolkit 11.7.1

    8. +
    +
    +
    +
  4. +
  5. The instructions in the sub-sections below correspond to our tested versions above. Visit this page https://developer.nvidia.com/cuda-toolkit-archive to obtain the correct version of the CUDA toolkit for your environment.

  6. +
+
+

Install GPU packages for PyTorch 2.1 or TensorFlow

+

NOTE:

+
    +
  1. Do this section ONLY for the PyTorch 2.1 or TensorFlow GPU variant.

  2. +
  3. Visit this page https://developer.nvidia.com/cuda-11-8-0-download-archive to obtain the exact and up-to-date installation instructions for your environment.

  4. +
+
apt-get update && apt-get install -y gnupg2
+wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
+mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
+wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
+apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
+dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
+cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
+echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 /" > /etc/apt/sources.list.d/cuda.list
+apt-get update
+
+
+
+
+

Install GPU packages for PyTorch 1.13 or ONNX

+

NOTE:

+
    +
  1. Do this section ONLY for the PyTorch 1.13 or ONNX GPU variants.

  2. +
  3. Visit this page https://developer.nvidia.com/cuda-11-7-1-download-archive to obtain the exact and up-to-date installation instructions for your environment.

  4. +
+
apt-get update && apt-get install -y gnupg2
+wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
+mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
+wget https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda-repo-ubuntu2204-11-7-local_11.7.1-515.65.01-1_amd64.deb
+apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
+dpkg -i cuda-repo-ubuntu2204-11-7-local_11.7.1-515.65.01-1_amd64.deb
+cp /var/cuda-repo-ubuntu2204-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
+echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 /" > /etc/apt/sources.list.d/cuda.list
+apt-get update
+
+
+
+
+
+

Install AIMET packages

+
+

From PyPI

+

The AIMET Torch GPU package can be installed from PyPI as follows:

+

Go to https://pypi.org/project/aimet-torch to identify the version you wish to install:

+
+
    +
  • For PyTorch 1.13 GPU, you should use aimet-torch==1.31.1

  • +
  • For PyTorch 2.1.2 GPU, you should use aimet-torch >= 1.32.0

  • +
+
+
sudo apt-get install liblapacke -y
+pip install aimet-torch
+
+
+
+
+

From Release Package

+

Alternatively, we host .whl packages for each release at https://github.com/quic/aimet/releases. Identify the release tag +of the package you wish to install, then follow the instructions below to install AIMET from the .whl file.

+

Set the <variant_string> to ONE of the following, depending on your desired variant:

+
    +
  1. For the PyTorch 2.1 GPU variant, use “torch_gpu”

  2. +
  3. For the PyTorch 2.1 CPU variant, use “torch_cpu”

  4. +
  5. For the PyTorch 1.13 GPU variant, use “torch_gpu_pt113”

  6. +
  7. For the PyTorch 1.13 CPU variant, use “torch_cpu_pt113”

  8. +
  9. For the TensorFlow GPU variant, use “tf_gpu”

  10. +
  11. For the TensorFlow CPU variant, use “tf_cpu”

  12. +
  13. For the ONNX GPU variant, use “onnx_gpu”

  14. +
  15. For the ONNX CPU variant, use “onnx_cpu”

  16. +
+
export AIMET_VARIANT=<variant_string>
+
+
+

Replace <release_tag> in the steps below with the appropriate tag:

+
export release_tag=<release_tag>
+
+
+

Set the package download URL as follows:

+
export download_url="https://github.com/quic/aimet/releases/download/${release_tag}"
+
+
+

Set the common suffix for the package files as follows:

+

NOTE: Set wheel_file_suffix to cp310-cp310-linux_x86_64.whl OR cp38-cp38-linux_x86_64.whl OR cp36-cp36m-linux_x86_64.whl OR cp37-cp37m-linux_x86_64.whl OR py3-none-any.whl as appropriate, depending on the actual wheel filename(s) at https://github.com/quic/aimet/releases.

+
export wheel_file_suffix="cp310-cp310-linux_x86_64.whl"
+
+
+

Install the AIMET packages in the order specified below:

+

NOTE: Python dependencies will be installed automatically.

+
# Install ONE of the following depending on the variant
+python3 -m pip install ${download_url}/aimet_torch-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix} -f https://download.pytorch.org/whl/torch_stable.html
+# OR
+python3 -m pip install ${download_url}/aimet_tensorflow-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix}
+# OR
+python3 -m pip install ${download_url}/aimet_onnx-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix}
+
+
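As an illustration only, with hypothetical example values (always take the actual release tag and wheel filename suffix from the releases page), the torch variant command above expands to a URL of the following shape:

# hypothetical example values; check https://github.com/quic/aimet/releases for the real ones
export AIMET_VARIANT=torch_gpu
export release_tag=1.32.2
export wheel_file_suffix="cp310-cp310-linux_x86_64.whl"
echo "${download_url}/aimet_torch-${AIMET_VARIANT}_${release_tag}-${wheel_file_suffix}"
# https://github.com/quic/aimet/releases/download/1.32.2/aimet_torch-torch_gpu_1.32.2-cp310-cp310-linux_x86_64.whl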
+
+
+
+

Install common Debian packages

+

Install the common Debian packages as follows:

+
cat /usr/local/lib/python3.10/dist-packages/aimet_common/bin/reqs_deb_common.txt | xargs apt-get --assume-yes install
+
+
+

NOTE: Do the following ONLY for the PyTorch variant packages.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_torch/bin/reqs_deb_torch_common.txt | xargs apt-get --assume-yes install
+
+
+

NOTE: Do the following ONLY for the ONNX variant packages.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_onnx/bin/reqs_deb_onnx_common.txt | xargs apt-get --assume-yes install
+
+
+
+
+

Install TensorFlow GPU Debian packages

+

NOTE: Do this ONLY for the TensorFlow GPU package.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_tensorflow/bin/reqs_deb_tf_gpu.txt | xargs apt-get --assume-yes install
+
+
+
+
+

Install PyTorch GPU Debian packages

+

NOTE: Do this ONLY for the PyTorch GPU package.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_torch/bin/reqs_deb_torch_gpu.txt | xargs apt-get --assume-yes install
+
+
+
+
+

Install ONNX GPU Debian packages

+

NOTE: Do this ONLY for the ONNX GPU package.

+
cat /usr/local/lib/python3.10/dist-packages/aimet_onnx/bin/reqs_deb_onnx_gpu.txt | xargs apt-get --assume-yes install
+
+
+
+
+

Replace Pillow with Pillow-SIMD

+

Optional: Replace the Pillow package with Pillow-SIMD as follows:

+
python3 -m pip uninstall -y pillow
+python3 -m pip install --no-cache-dir Pillow-SIMD==9.0.0.post1
+
+
+
+
+

Replace onnxruntime with onnxruntime-gpu

+

NOTE: Do this ONLY for the PyTorch GPU package.

+
export ONNXRUNTIME_VER=$(python3 -c 'import onnxruntime; print(onnxruntime.__version__)')
+python3 -m pip uninstall -y onnxruntime
+python3 -m pip install --no-cache-dir onnxruntime-gpu==$ONNXRUNTIME_VER
+
+
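After swapping in onnxruntime-gpu, you can optionally check that the CUDA execution provider is visible; the exact provider list depends on your onnxruntime-gpu build and CUDA setup:

python3 -c "import onnxruntime; print(onnxruntime.get_available_providers())"
# a CUDA-capable build typically lists 'CUDAExecutionProvider' in addition to 'CPUExecutionProvider'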
+
+
+

Post installation steps

+
ln -s /usr/lib/x86_64-linux-gnu/libjpeg.so /usr/lib
+
+
+

NOTE: Do the following step ONLY for the PyTorch or TensorFlow GPU packages.

+
# NOTE: Please choose ONE of the commands below, depending on the version of your CUDA toolkit
+ln -s /usr/local/cuda-11.7 /usr/local/cuda
+ln -s /usr/local/cuda-11.8 /usr/local/cuda
+
+
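A quick check that the symlink points at the intended toolkit directory:

ls -l /usr/local/cuda    # should show a symlink to /usr/local/cuda-11.7 or /usr/local/cuda-11.8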
+
+
+

Environment setup

+

Set the common environment variables as follows:

+
source /usr/local/lib/python3.10/dist-packages/aimet_common/bin/envsetup.sh
+
+
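The exact variables exported by envsetup.sh depend on the release; to see what it changed in the current shell, you can inspect the environment afterwards, for example:

env | grep -i -E 'aimet|ld_library_path|pythonpath'    # variables the script may have set or extended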
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/objects.inv b/releases/1.32.2/torch_v2/objects.inv new file mode 100644 index 00000000..c46a5537 Binary files /dev/null and b/releases/1.32.2/torch_v2/objects.inv differ diff --git a/releases/1.32.2/torch_v2/py-modindex.html b/releases/1.32.2/torch_v2/py-modindex.html new file mode 100644 index 00000000..6efb9c3a --- /dev/null +++ b/releases/1.32.2/torch_v2/py-modindex.html @@ -0,0 +1,192 @@ + + + + + + Python Module Index — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ + +

Python Module Index

+ +
a
    aimet_torch
        aimet_torch.v2.quantization.affine
        aimet_torch.v2.quantization.float
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/search.html b/releases/1.32.2/torch_v2/search.html new file mode 100644 index 00000000..fef08014 --- /dev/null +++ b/releases/1.32.2/torch_v2/search.html @@ -0,0 +1,182 @@ + + + + + + Search — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
    +
+
+
+
+
+ + + + +
+ +
+ +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/searchindex.js b/releases/1.32.2/torch_v2/searchindex.js new file mode 100644 index 00000000..e9ce9832 --- /dev/null +++ b/releases/1.32.2/torch_v2/searchindex.js @@ -0,0 +1 @@ +Search.setIndex({"docnames": ["_templates/autosummary/class", "_templates/autosummary/function", "install/index", "install/install_docker", "install/install_host", "toplevelhidden", "torch_docs/api/nn.fake_quantization_mixin", "torch_docs/api/nn.quantization_mixin", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.Quantize", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.QuantizeDequantize", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.dequantize", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_dequantize", "torch_docs/api/quantization/affine/index", "torch_docs/api/quantization/float/FloatQuantizeDequantize", "torch_docs/api/quantization/float/index", "torch_docs/api/quantization/tensor", "torch_docs/encoding_analyzer", "torch_docs/examples/ptq", "torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.MinMaxEncodingAnalyzer", "torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer", "torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer", "torch_docs/index", "torch_docs/quantized_modules", "torch_docs/quantizer", "torch_docs/tutorials/quickstart_guide", "user_guide/adaround", "user_guide/auto_quant", "user_guide/bn_reestimation", "user_guide/channel_pruning", "user_guide/compression_feature_guidebook", "user_guide/greedy_compression_ratio_selection", "user_guide/index", "user_guide/known_issues", "user_guide/model_compression", "user_guide/model_guidelines", "user_guide/model_quantization", "user_guide/post_training_quant_techniques", "user_guide/quant_analyzer", "user_guide/quantization_aware_training", "user_guide/quantization_configuration", "user_guide/quantization_feature_guidebook", "user_guide/quantization_sim", "user_guide/release_notes", "user_guide/spatial_svd", "user_guide/visualization_compression", "user_guide/visualization_quant", "user_guide/weight_svd", "user_guide/winnowing"], "filenames": ["_templates/autosummary/class.rst", "_templates/autosummary/function.rst", "install/index.rst", "install/install_docker.rst", "install/install_host.rst", "toplevelhidden.rst", "torch_docs/api/nn.fake_quantization_mixin.rst", "torch_docs/api/nn.quantization_mixin.rst", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.Quantize.rst", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.QuantizeDequantize.rst", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.dequantize.rst", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_.rst", "torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_dequantize.rst", "torch_docs/api/quantization/affine/index.rst", "torch_docs/api/quantization/float/FloatQuantizeDequantize.rst", "torch_docs/api/quantization/float/index.rst", "torch_docs/api/quantization/tensor.rst", "torch_docs/encoding_analyzer.rst", "torch_docs/examples/ptq.rst", 
"torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.MinMaxEncodingAnalyzer.rst", "torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer.rst", "torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer.rst", "torch_docs/index.rst", "torch_docs/quantized_modules.rst", "torch_docs/quantizer.rst", "torch_docs/tutorials/quickstart_guide.rst", "user_guide/adaround.rst", "user_guide/auto_quant.rst", "user_guide/bn_reestimation.rst", "user_guide/channel_pruning.rst", "user_guide/compression_feature_guidebook.rst", "user_guide/greedy_compression_ratio_selection.rst", "user_guide/index.rst", "user_guide/known_issues.rst", "user_guide/model_compression.rst", "user_guide/model_guidelines.rst", "user_guide/model_quantization.rst", "user_guide/post_training_quant_techniques.rst", "user_guide/quant_analyzer.rst", "user_guide/quantization_aware_training.rst", "user_guide/quantization_configuration.rst", "user_guide/quantization_feature_guidebook.rst", "user_guide/quantization_sim.rst", "user_guide/release_notes.rst", "user_guide/spatial_svd.rst", "user_guide/visualization_compression.rst", "user_guide/visualization_quant.rst", "user_guide/weight_svd.rst", "user_guide/winnowing.rst"], "titles": ["<no title>", "<no title>", "AIMET Installation", "AIMET Installation in Docker", "AIMET Installation and Setup", "<no title>", "FakeQuantizationMixin", "nn.QuantizationMixin", "Quantize", "QuantizeDequantize", "dequantize", "quantize", "quantize_dequantize", "quantization.affine", "FloatQuantizeDequantize", "quantization.float", "quantization.tensor", "Encoding Analyzers", "Post-Training Quantization", "MinMaxEncodingAnalyzer", "PercentileEncodingAnalyzer", "SqnrEncodingAnalyzer", "AIMET: AI Model Efficiency Toolkit Documentation", "Quantized Modules", "Quantizers", "Quickstart Guide", "AIMET AdaRound", "AIMET AutoQuant", "AIMET BN Re-estimation", "AIMET Channel Pruning", "AIMET Compression Features Guidebook", "AIMET Greedy Compression Ratio Selection", "AI Model Efficiency Toolkit User Guide", "AIMET Known Issues", "AIMET Model Compression", "Model Guidelines for PyTorch", "AIMET Model Quantization", "AIMET Post-Training Quantization Techniques", "AIMET QuantAnalyzer", "AIMET Quantization Aware Training", "Quantization Simulation Configuration", "AIMET Quantization Features Guidebook", "AIMET Quantization Simulation", "AIMET Release Notes", "AIMET Spatial SVD", "AIMET Visualization", "AIMET Visualization for Quantization", "AIMET Weight SVD", "AIMET Winnowing"], "terms": {"name": [0, 1, 3, 6, 23, 24, 37, 42, 43, 45], "escap": [0, 1], "underlin": [0, 1], "qualcomm": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "innov": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "center": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "inc": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "ai": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 26, 
27, 28, 29, 30, 31, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "model": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 26, 27, 28, 29, 30, 31, 33, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "effici": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29, 30, 31, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "toolkit": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29, 30, 31, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "aimet_common": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "quantsim_config": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "default_config": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "json": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], "The": [2, 4, 6, 11, 12, 14, 16, 20, 21, 22, 23, 25, 26, 27, 28, 29, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 44, 45, 46, 47, 48], "pytorch": [2, 3, 6, 7, 23, 26, 27, 28, 32, 37, 38, 40, 42, 43], "gpu": [2, 3, 36, 43], "pypi": 2, "ar": [2, 6, 8, 9, 14, 23, 24, 25, 26, 27, 28, 29, 30, 31, 34, 35, 36, 37, 38, 39, 40, 41, 42, 45, 46, 48], "avail": [2, 25, 35, 38, 40, 41], "environ": 2, "meet": [2, 27, 30, 31], "follow": [2, 3, 4, 26, 27, 28, 29, 30, 31, 32, 34, 35, 36, 38, 39, 40, 42, 44, 47, 48], "64": [2, 8, 21, 24, 26], "bit": [2, 14, 25, 26, 28, 36, 41, 42, 43], "intel": 2, "x86": 2, "compat": [2, 25], "processor": 2, "linux": [2, 4], "ubuntu": [2, 4], "22": [2, 4, 25], "04": [2, 4], "lt": [2, 4], "python": [2, 3, 4], "3": [2, 11, 12, 16, 21, 25, 30, 36, 39, 41, 48], "10": [2, 3, 4, 6, 7, 8, 9, 11, 16, 23, 24, 25, 31, 34, 39], "20": [2, 6, 26, 39], "8": [2, 4, 6, 7, 8, 9, 11, 12, 14, 16, 23, 24, 25, 36, 48], "cuda": [2, 4, 25], "12": [2, 11], "0": [2, 3, 4, 6, 7, 8, 9, 11, 12, 14, 16, 20, 21, 23, 24, 25, 26, 30, 31, 35, 40], "torch": [2, 3, 6, 7, 8, 9, 11, 12, 14, 16, 22, 23, 24, 25, 35, 43], "2": [2, 3, 9, 11, 12, 14, 16, 24, 26, 36, 41, 42], "pip": [2, 3, 4, 22], "apt": [2, 3, 4, 22], "get": [2, 3, 4, 26, 29, 36, 46], "liblapack": [2, 3, 4, 22], "python3": [2, 3, 4, 22], "m": [2, 3, 4, 22], "For": [2, 3, 4, 6, 22, 23, 25, 26, 29, 30, 31, 32, 33, 34, 36, 38, 40, 42, 45, 48], "other": [2, 31, 33, 34, 36, 38, 41, 42, 43], "variant": [2, 4, 26, 27, 28, 37, 38, 39, 42], "latest": [2, 3], "version": [2, 3, 4, 6, 7, 23, 25, 32], "from": [2, 6, 8, 9, 14, 16, 20, 23, 24, 25, 26, 29, 30, 31, 35, 36, 37, 38, 39, 40, 41, 42, 45, 48], "whl": [2, 3, 4], "file": [2, 3, 4, 25, 36, 38, 39, 42, 43, 46], "host": [2, 3, 4, 43, 45], "http": [2, 3, 4, 30, 37, 43, 45], "github": [2, 3, 4, 30, 43], "com": [2, 3, 4, 43], "quic": [2, 3, 4, 30, 43], "1": [2, 3, 6, 7, 8, 9, 11, 12, 14, 16, 23, 24, 31, 33, 34, 35, 36, 40, 41, 42, 44, 47, 48], "13": [2, 3, 11], "11": [2, 4, 11, 16], "x": [2, 14, 16, 23, 25, 30, 35, 38], "download": [2, 3, 4, 25], "31": [2, 3, 4], "aimet_torch": [2, 3, 4, 
6, 7, 8, 9, 10, 11, 12, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25, 35], "torch_gpu_": 2, "cp38": [2, 4], "linux_x86_64": [2, 3, 4], "cpu": [2, 3, 4, 25, 36, 43], "onli": [2, 3, 4, 11, 12, 16, 23, 25, 28, 33, 36, 38, 39, 40, 43, 48], "torch_cpu_": 2, "tensorflow": [2, 3, 26, 27, 28, 32, 33, 37, 38, 40, 42, 43], "aimet_tensorflow": [2, 3, 4], "tf_gpu_": 2, "tf_cpu_": 2, "onnx": [2, 3, 22, 26, 27, 32, 35, 36, 37, 38, 40, 42], "14": [2, 11, 25], "aimet_onnx": [2, 3, 4], "onnx_gpu_": 2, "onnx_cpu_": 2, "previou": [2, 25, 30, 31, 41], "brows": 2, "each": [2, 3, 4, 6, 7, 23, 24, 25, 26, 27, 28, 29, 30, 31, 36, 37, 38, 39, 40, 41, 42, 46, 48], "includ": [2, 7, 28, 34, 36, 38, 40, 42, 43], "multipl": [2, 4, 23, 32, 34, 36, 43], "format": [2, 24, 27, 33], "torch_gpu": [2, 3, 4], "torch_cpu": [2, 3, 4], "tf_gpu": [2, 3, 4], "tf_cpu": [2, 3, 4], "onnx_gpu": [2, 3, 4], "onnx_cpu": [2, 3, 4], "package_prefix": 2, "_": [2, 3, 4, 8, 9, 12, 22, 23, 24, 25], "platform": [2, 36], "setup": 2, "bash": [2, 3], "command": [2, 3, 4, 45], "shell": 2, "nvidia": [2, 3, 4], "card": 2, "comput": [2, 4, 6, 7, 14, 20, 21, 25, 26, 34, 35, 36, 37, 38, 42, 45, 48], "capabl": [2, 23, 45, 46], "5": [2, 8, 9, 11, 12, 14, 23, 24, 30, 39, 41], "later": [2, 25], "docker": 2, "To": [2, 23, 25, 28, 31, 34, 35, 38, 40, 41, 42, 45, 46], "us": [2, 4, 6, 7, 8, 9, 16, 21, 22, 24, 25, 28, 29, 30, 31, 32, 35, 37, 38, 39, 40, 41, 42, 43, 46], "acceler": [2, 22, 32, 34], "train": [2, 22, 26, 27, 28, 32, 34, 41, 42, 43], "modul": [2, 6, 7, 22, 25, 26, 36, 43, 48], "an": [2, 6, 16, 22, 23, 24, 25, 26, 27, 29, 31, 32, 34, 35, 36, 38, 39, 40, 41, 42, 46, 48], "enabl": [2, 3, 22, 28, 32, 36, 38, 40, 42, 43], "minimum": [2, 11, 12, 23], "driver": [2, 4], "455": 2, "i": [2, 3, 4, 6, 7, 8, 9, 11, 12, 14, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 44, 45, 46, 47, 48], "alwai": [2, 31], "recommend": [2, 26, 28, 30, 36, 41], "especi": [2, 36, 39, 41], "newer": 2, "both": [2, 7, 11, 12, 22, 23, 36, 37, 39, 40, 41, 42, 44, 48], "cudnn": 2, "more": [2, 22, 23, 25, 29, 30, 31, 32, 34, 36, 37, 38, 39, 40, 41, 42, 45, 46], "interfac": 2, "support": [2, 29, 30, 32, 33, 34, 35, 36, 37, 40, 41, 42, 43, 44, 47, 48], "There": [2, 26, 35, 37, 39, 45, 46], "two": [2, 23, 25, 31, 32, 34, 36, 37, 38, 39, 42, 44, 45, 46, 47], "wai": [2, 25, 31], "On": 2, "your": [2, 3, 4, 22, 35], "machin": [2, 3, 34], "our": [2, 4, 25, 31, 41, 42], "pre": [2, 3, 4, 32, 37], "built": [2, 3], "develop": [2, 3, 4, 7, 23], "imag": [2, 26, 38], "pleas": [2, 3, 4, 22, 25, 26, 27, 28, 29, 32, 34, 37, 38, 42], "click": 2, "appropri": [2, 3, 4, 6, 23, 30, 31, 34, 41], "link": [2, 26, 27, 28, 37, 38, 42], "contain": [2, 6, 16, 23, 25, 36, 38, 39, 40, 42], "thi": [3, 4, 6, 7, 8, 9, 11, 12, 14, 16, 23, 24, 25, 26, 27, 29, 30, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 48], "page": [3, 4, 30, 42, 43], "provid": [3, 4, 14, 22, 23, 25, 26, 30, 31, 34, 36, 37, 38, 40, 41, 42, 45, 46, 48], "instruct": [3, 4, 22], "insid": [3, 6, 7, 23, 25], "variant_str": [3, 4], "ONE": [3, 4], "depend": [3, 4, 16, 30, 31, 36, 40, 43], "desir": [3, 4, 25, 30, 34, 36, 41], "pt113": 3, "tf": [3, 38, 42, 43], "export": [3, 4, 22, 28, 32, 34, 35, 36, 39, 42, 43], "aimet_vari": [3, 4], "one": [3, 23, 25, 29, 34, 39, 40, 43, 44, 47], "workspac": 3, "absolute_path_to_workspac": 3, "docker_image_nam": 3, "artifact": [3, 25], "codelinaro": 3, "org": [3, 4, 37], "dev": [3, 4], "docker_container_nam": 3, "any_nam": 3, "note": [3, 4, 25, 29, 30, 31, 32, 34, 35, 36, 38], 
"feel": 3, "free": [3, 36, 37, 39], "modifi": [3, 4, 36, 42, 43, 48], "need": [3, 25, 27, 30, 34, 36, 37, 38, 39, 40, 42, 43, 45, 46], "you": [3, 4, 31, 35, 44, 47], "want": 3, "If": [3, 4, 6, 7, 8, 9, 11, 12, 14, 21, 23, 24, 25, 27, 35, 36, 37, 38, 40, 41, 45, 46, 48], "skip": [3, 29], "next": [3, 25, 41], "section": [3, 4, 26, 28, 29, 34, 36, 42], "any_tag": 3, "t": [3, 26], "f": [3, 4, 25], "jenkin": 3, "dockerfil": 3, "ensur": [3, 23, 36, 41], "alreadi": [3, 31, 41], "run": [3, 8, 9, 23, 24, 28, 32, 34, 36, 37, 38, 42, 43, 45], "otherwis": [3, 4, 8, 9, 11, 12, 24, 41], "remov": [3, 6, 25, 29, 32, 42, 48], "exist": [3, 6, 36, 42], "new": [3, 8, 9, 22, 24, 25, 36, 40, 43], "p": 3, "grep": 3, "kill": 3, "rm": 3, "u": [3, 41], "id": [3, 45], "user": [3, 22, 23, 26, 27, 30, 34, 36, 38, 39, 40, 41, 42, 43, 45, 46], "g": [3, 25, 28, 30, 32, 41, 48], "v": [3, 31], "etc": [3, 4, 30, 36], "passwd": 3, "ro": 3, "group": [3, 40, 42], "home": 3, "mnt": 3, "entrypoint": 3, "bin": [3, 4, 21], "w": [3, 48], "hostnam": 3, "abov": [3, 4, 22, 27, 28, 31, 32, 34, 35, 37, 41, 42, 48], "base": [3, 6, 7, 8, 9, 11, 12, 14, 21, 23, 24, 29, 30, 36], "filesystem": 3, "add": [3, 23, 25, 40, 42, 43, 45, 46, 48], "all": [3, 6, 7, 23, 25, 29, 31, 34, 37, 38, 40, 41], "order": [3, 4, 25, 28, 29, 30, 36, 39, 42, 46], "access": [3, 36], "replac": [3, 23, 25, 37, 42], "port": [3, 45], "forward": [3, 6, 7, 8, 9, 23, 24, 25, 35, 38, 41, 43], "done": [3, 8, 9, 24, 29, 34, 40, 42, 48], "visual": [3, 34, 36, 37, 38, 41, 43, 44, 47], "api": [3, 25, 32, 35, 36, 40, 43, 45], "can": [3, 4, 6, 8, 9, 16, 22, 23, 24, 25, 27, 28, 30, 31, 32, 34, 36, 37, 38, 39, 40, 41, 42, 44, 45, 46, 47], "achiev": [3, 26, 30, 31, 44, 47], "port_id": 3, "ani": [3, 4, 7, 25, 26, 27, 40, 43], "number": [3, 6, 11, 12, 14, 21, 23, 26, 31, 32, 34, 39, 42, 43, 45, 48], "through": [3, 4, 23, 25, 37, 38, 42, 45, 46], "method": [3, 4, 6, 23, 25, 31, 34, 36, 41, 42], "go": [3, 4, 25, 45], "project": [3, 4], "identifi": [3, 4, 38, 41, 43, 48], "wish": [3, 4], "should": [3, 4, 6, 23, 25, 30, 34, 40, 45, 48], "32": [3, 4, 8, 24, 41], "sudo": [3, 4], "y": [3, 4, 25, 38], "altern": [3, 4, 34], "we": [3, 4, 23, 25, 31, 34, 36, 37, 40, 41, 42, 46], "tag": [3, 4, 43], "below": [3, 4, 8, 9, 11, 12, 23, 24, 25, 26, 27, 28, 36, 37, 38, 40, 41, 42, 48], "release_tag": [3, 4], "step": [3, 11, 12, 22, 25, 26, 27, 28, 29, 30, 31, 34, 36, 37, 39, 41, 42], "url": [3, 4, 45], "download_url": [3, 4], "common": [3, 41, 46], "suffix": [3, 4], "wheel_file_suffix": [3, 4], "cp310": [3, 4], "specifi": [3, 4, 8, 9, 11, 12, 14, 24, 25, 27, 34, 40, 42, 46], "pend": [3, 4], "pip3": [3, 4], "h": [3, 4, 47, 48], "These": [3, 4, 23, 25, 27, 28, 29, 30, 35, 36, 37, 38, 41, 42], "assum": [3, 4], "path": [3, 4], "usr": [3, 4], "lib": [3, 4], "dist": [3, 4], "case": [3, 4, 11, 12, 23, 25, 31, 37, 39, 40], "accordingli": [3, 4], "automat": [3, 4, 30, 34, 36, 38, 43], "torch_stabl": [3, 4], "html": [3, 4, 30, 38, 43, 46], "OR": [3, 4], "variabl": [3, 4, 8, 9, 24], "sourc": [3, 4, 6, 7, 8, 9, 10, 11, 12, 14, 16, 17, 19, 20, 21, 23, 24, 41], "envsetup": [3, 4], "sh": [3, 4], "unless": [4, 48], "local": [4, 45], "basic": [4, 22, 25], "requisit": 4, "updat": [4, 24, 36, 37, 39, 42, 43], "upgrad": 4, "ye": [4, 34], "wget": 4, "gnupg2": 4, "have": [4, 7, 25, 31, 34, 36, 37, 38, 41, 42], "set": [4, 6, 7, 20, 23, 24, 26, 30, 31, 32, 34, 35, 37, 38, 39, 40, 41, 42, 48], "default": [4, 7, 8, 9, 11, 12, 23, 24, 26, 31, 34, 40, 42, 43, 45], "do": [4, 25, 34, 38, 42], "were": [4, 30, 36, 40, 48], "test": 4, 
"7": [4, 11, 12, 14, 25, 48], "sub": [4, 29, 34, 42, 48], "correspond": [4, 23, 29, 31, 36, 38, 48], "visit": [4, 22, 32], "archiv": 4, "obtain": [4, 29, 30, 38, 42], "correct": [4, 25, 26, 28, 36, 37, 41], "exact": [4, 23, 28], "up": [4, 34, 39, 40, 42, 48], "date": 4, "repo": 4, "ubuntu2204": 4, "x86_64": 4, "pin": 4, "mv": 4, "prefer": [4, 34], "d": [4, 8, 9, 11, 12, 24], "repositori": 4, "600": 4, "local_instal": 4, "local_11": 4, "520": 4, "61": 4, "05": [4, 11, 12, 25], "1_amd64": 4, "deb": 4, "kei": 4, "adv": 4, "fetch": 4, "3bf863cc": 4, "pub": 4, "dpkg": 4, "cp": [4, 30], "var": 4, "keyr": 4, "gpg": 4, "share": [4, 23], "echo": 4, "list": [4, 11, 12, 21, 23, 24, 31, 33, 35, 40], "515": 4, "65": [4, 30], "01": [4, 11, 12, 26], "torch_gpu_pt113": 4, "torch_cpu_pt113": 4, "cp36": 4, "cp36m": 4, "cp37": 4, "cp37m": 4, "py3": 4, "none": [4, 6, 7, 8, 9, 10, 11, 12, 14, 23, 24, 25, 45], "actual": [4, 30, 36], "wheel": 4, "filenam": 4, "": [4, 6, 7, 22, 23, 24, 25, 30, 33, 34, 36, 37, 38, 39, 41, 42, 45, 46, 48], "cat": 4, "reqs_deb_common": 4, "txt": 4, "xarg": 4, "reqs_deb_torch_common": 4, "reqs_deb_onnx_common": 4, "reqs_deb_tf_gpu": 4, "reqs_deb_torch_gpu": 4, "reqs_deb_onnx_gpu": 4, "option": [4, 7, 8, 9, 11, 12, 21, 22, 24, 25, 26, 38, 40, 42, 45], "uninstal": 4, "cach": 4, "dir": 4, "9": [4, 11, 12, 16, 41], "post1": 4, "onnxruntime_v": 4, "c": [4, 30], "import": [4, 8, 9, 11, 12, 14, 16, 22, 23, 24, 25, 28, 29, 41], "print": [4, 6, 7, 11, 12, 23, 25, 36, 38], "__version__": 4, "ln": 4, "gnu": 4, "libjpeg": 4, "so": [4, 23, 35, 38, 45], "chose": 4, "between": [4, 23, 37, 38, 40, 42], "class": [6, 7, 8, 9, 14, 17, 19, 20, 21, 24, 25], "v2": [6, 7, 8, 9, 10, 11, 12, 14, 16, 17, 19, 20, 21, 22, 23, 24, 25], "nn": [6, 8, 9, 22, 23, 24, 25, 35, 43], "arg": [6, 7, 11, 12, 16, 23], "kwarg": [6, 7, 11, 12, 16, 23], "mixin": [6, 7, 23], "implement": [6, 7, 23, 35, 41], "fake": [6, 9, 12, 14, 23, 24, 25], "quantiz": [6, 7, 9, 10, 12, 14, 17, 19, 20, 21, 22, 26, 27, 28, 30, 32, 34, 38, 43, 45], "top": [6, 29, 45], "regular": [6, 7, 23, 26, 36, 42], "specif": [6, 25, 26, 27, 28, 30, 32, 34, 35, 36, 37, 40, 43], "input": [6, 7, 8, 9, 11, 12, 14, 23, 24, 25, 29, 34, 38, 40, 42, 44, 45, 47, 48], "output": [6, 7, 8, 9, 11, 23, 24, 25, 29, 34, 37, 38, 40, 42, 43, 44, 47, 48], "paramet": [6, 7, 8, 9, 11, 12, 14, 16, 20, 21, 23, 24, 25, 26, 28, 29, 34, 35, 36, 37, 38, 39, 40, 46], "tensor": [6, 8, 9, 10, 11, 12, 14, 21, 23, 24, 25, 26, 29, 35, 36, 38, 40, 41, 42, 43, 44, 47], "its": [6, 16, 22, 23, 25, 32, 36, 38, 42, 48], "held": [6, 25], "quantizerbas": [6, 7, 23, 24], "object": [6, 7, 16, 21, 23, 24, 25, 28, 36, 39, 42], "dure": [6, 23, 25, 26, 32, 34, 36, 39, 40, 42, 45, 46], "inherit": [6, 23], "layer": [6, 7, 23, 25, 26, 27, 28, 29, 30, 33, 35, 36, 38, 40, 41, 42, 43, 44, 45, 46, 47, 48], "oper": [6, 7, 23, 25, 35, 36, 37, 40, 41], "behav": [6, 23, 41], "exactli": [6, 23, 42], "same": [6, 16, 23, 24, 28, 37, 40, 42, 46], "parent": 6, "A": [6, 21, 23, 30, 36, 38, 39, 40, 41, 42], "initi": [6, 8, 9, 14, 23, 24, 26, 39, 41, 42], "scratch": 6, "syntax": 6, "form": 6, "from_modul": 6, "input_quant": [6, 23, 25], "modulelist": [6, 23, 25], "appli": [6, 8, 9, 11, 12, 23, 24, 25, 26, 27, 28, 31, 34, 36, 37, 39, 40, 41, 42, 43, 45, 46], "type": [6, 7, 8, 9, 16, 21, 23, 24, 36, 38, 40, 42, 45], "output_quant": [6, 7, 23, 25], "param_quant": [6, 23, 25], "moduledict": [6, 23, 25], "map": [6, 11, 12, 16, 23, 38, 40], "associ": [6, 23, 36], "exampl": [6, 7, 8, 9, 11, 12, 14, 16, 23, 24, 25, 26, 30, 31, 32, 
36, 38, 40, 42, 43, 48], "qlinear": [6, 7, 23], "fakequantizedlinear": [6, 23], "in_featur": [6, 23, 25], "out_featur": [6, 23, 25], "bia": [6, 14, 25, 26, 29, 36, 37, 40, 41, 43], "fals": [6, 7, 8, 9, 11, 12, 16, 23, 24, 25, 35, 40], "weight": [6, 21, 23, 25, 26, 28, 30, 34, 36, 37, 38, 39, 40, 41, 42, 46], "linear": [6, 23, 25, 28, 29], "true": [6, 7, 8, 9, 14, 16, 21, 23, 24, 25, 35, 40], "__quant_init__": [6, 23], "invok": [6, 23, 34, 36, 45, 46], "right": [6, 8, 9, 11, 12, 14, 23, 24, 36, 48], "after": [6, 23, 25, 26, 27, 28, 30, 34, 36, 39, 41, 45, 46], "__init__": [6, 23, 25], "structur": [6, 23, 34], "size": [6, 8, 9, 11, 12, 23, 24, 26, 34, 35, 44, 47], "initializd": [6, 23], "custom": [6, 23, 41, 42], "overridden": [6, 23], "length": [6, 21, 23], "given": [6, 7, 23, 27, 29, 31, 32, 34, 37, 44, 45, 47], "compute_encod": [6, 7, 8, 9, 14, 16, 22, 23, 24, 25], "enter": [6, 7, 23, 27], "context": [6, 7, 23, 25], "observ": [6, 7, 17, 20, 23, 24, 25, 31, 34, 36, 37, 38, 39, 42], "pass": [6, 7, 22, 23, 25, 32, 35, 36, 37, 38, 39, 41, 42, 43, 45], "encod": [6, 7, 8, 9, 16, 19, 20, 21, 22, 24, 25, 26, 28, 36, 38, 39, 43], "upon": [6, 7, 23, 25], "exit": [6, 7, 23, 25], "quantizedlinear": [6, 7, 23, 25], "symmetr": [6, 7, 8, 9, 16, 21, 23, 24, 25, 40, 42], "randn": [6, 7, 8, 9, 16, 23, 24], "16": [6, 7, 8, 14, 23, 24, 26], "is_initi": [6, 7, 8, 9, 14, 23, 24], "classmethod": [6, 7], "creat": [6, 22, 23, 25, 26, 28, 34, 35, 36, 39, 42], "instanc": [6, 7, 45], "result": [6, 16, 21, 26, 27, 29, 30, 32, 37, 38, 39, 40, 42], "attribut": [6, 23, 38], "origin": [6, 25, 29, 30, 34, 36, 37, 38, 39, 42, 45], "mai": [6, 7, 16, 23, 26, 30, 34, 36, 37, 38, 40, 41, 42], "assign": [6, 8, 9, 23, 24], "float": [6, 14, 16, 22, 23, 36, 38, 41, 42, 46], "point": [6, 16, 22, 23, 32, 34, 36, 38, 41, 42, 46], "return": [6, 7, 8, 9, 16, 21, 22, 24, 25, 27, 31, 32, 38, 42], "quantized_linear": 6, "get_original_modul": 6, "module_cl": [6, 7], "decor": [6, 7], "regist": [6, 7, 23, 24], "defin": [6, 23, 25, 35, 36, 38, 40, 42], "call": [6, 14, 16, 23, 25, 28, 34, 36, 38, 40, 42, 43, 44, 47], "featur": [7, 23, 26, 27, 28, 34, 37, 38, 42, 43, 45, 46], "under": [7, 23, 38, 40, 45, 46], "heavi": [7, 23, 45, 46], "chang": [7, 23, 25, 26, 34, 38, 39, 40, 42, 46, 48], "occur": [7, 23], "without": [7, 14, 16, 23, 27, 36, 39, 42, 48], "notic": [7, 23, 34], "futur": [7, 23], "verion": [7, 23], "ad": [7, 25, 33, 36, 40, 43], "full": [7, 23, 47], "function": [7, 11, 12, 16, 23, 25, 26, 31, 34, 35, 36, 38, 42, 43, 45, 46], "subclass": 7, "abil": [7, 43], "well": [7, 16, 23, 30, 34, 36, 37, 38, 42, 44], "allow": [7, 16, 23, 27, 32, 34, 36, 38, 39, 40, 41, 42, 43, 45], "dispatch": 7, "librari": [7, 34], "place": [7, 39, 40], "nativ": [7, 23], "get_default_kernel": 7, "kernel": [7, 23, 29, 44, 47], "callabl": 7, "get_kernel": 7, "current": [7, 29, 32, 33, 34, 35, 40, 44, 47], "doe": [7, 23, 25, 31, 33, 36, 41], "try": [7, 27, 29, 31, 34, 36, 41], "set_default_kernel": 7, "set_kernel": 7, "underli": [7, 41], "wrap": 7, "affin": [8, 9, 10, 11, 12, 16, 22, 23, 24, 25], "shape": [8, 9, 14, 16, 19, 20, 21, 23, 24, 25, 38], "bitwidth": [8, 9, 11, 12, 14, 16, 23, 24, 25, 28, 36, 41, 42], "encoding_analyz": [8, 9, 14, 17, 19, 20, 21, 24], "block_siz": [8, 9, 10, 11, 12, 24], "precis": [8, 9, 11, 12, 14, 22, 24, 36], "out": [8, 9, 11, 12, 14, 24, 27, 30, 34, 38], "clamp": [8, 9, 11, 12, 14, 24, 42], "left": [8, 9, 11, 12, 14, 24, 31, 48], "lceil": [8, 9, 11, 12, 14, 24], "frac": [8, 9, 11, 12, 14, 24], "scale": [8, 9, 10, 11, 12, 14, 16, 24, 
28, 36, 37, 38, 39, 42], "rfloor": [8, 9, 11, 12, 14, 24], "offset": [8, 9, 10, 11, 12, 21, 24, 36, 38, 39, 42], "qmin": [8, 9, 11, 12, 24, 42], "qmax": [8, 9, 11, 12, 24, 42], "where": [8, 9, 11, 12, 14, 23, 24, 25, 28, 31, 38, 39, 44, 47, 48], "deriv": [8, 9, 11, 12, 23, 24], "learnabl": [8, 9, 24], "theta_": [8, 9, 24], "min": [8, 9, 19, 21, 23, 24, 25, 38, 42], "max": [8, 9, 14, 19, 21, 23, 24, 25, 34, 37, 38, 42], "block": [8, 9, 11, 12, 24], "b": [8, 9, 11, 12, 24], "begin": [8, 9, 11, 12, 24, 39, 40], "pmatrix": [8, 9, 11, 12, 24], "b_0": [8, 9, 11, 12, 24], "b_1": [8, 9, 11, 12, 24], "cdot": [8, 9, 11, 12, 24], "b_": [8, 9, 11, 12, 24], "end": [8, 9, 11, 12, 24, 25, 34], "equat": [8, 9, 11, 12, 24, 42], "further": [8, 9, 11, 12, 16, 24, 25, 29, 32, 34, 36, 40], "gener": [8, 9, 11, 12, 24, 25, 34, 36, 38, 39, 40, 42], "out_": [8, 9, 11, 12, 24], "j_0": [8, 9, 11, 12, 24], "j_": [8, 9, 11, 12, 24], "input_": [8, 9, 11, 12, 24], "scale_": [8, 9, 11, 12, 24], "i_0": [8, 9, 11, 12, 24], "i_": [8, 9, 11, 12, 24], "offset_": [8, 9, 11, 12, 24], "text": [8, 9, 11, 12, 24], "quad": [8, 9, 11, 12, 24, 42], "forall_": [8, 9, 11, 12, 24], "leq": [8, 9, 11, 12, 24], "i_d": [8, 9, 11, 12, 24], "lfloor": [8, 9, 11, 12, 14, 24], "j_d": [8, 9, 11, 12, 24], "b_d": [8, 9, 11, 12, 24], "tupl": [8, 9, 11, 12, 21, 24], "int": [8, 9, 11, 12, 14, 21, 24], "bool": [8, 9, 11, 12, 21, 24], "perform": [8, 9, 23, 24, 25, 27, 28, 29, 30, 31, 34, 36, 37, 38, 39, 41], "asymmetr": [8, 9, 21, 24, 40, 42], "encodinganalyz": [8, 9, 14, 17, 24], "analyz": [8, 9, 19, 20, 21, 22, 23, 24, 27, 29, 34, 35, 38, 42, 45, 46], "calibr": [8, 9, 19, 20, 21, 22, 23, 24, 25, 36, 38, 39, 41, 42], "absolut": [8, 9, 24], "which": [8, 9, 11, 12, 16, 21, 22, 23, 24, 25, 26, 27, 28, 30, 31, 34, 36, 37, 38, 40, 42, 43, 44, 45, 46, 47], "cannot": [8, 9, 24], "until": [8, 9, 24, 27], "properli": [8, 9, 24, 25], "statist": [8, 9, 14, 23, 24, 25, 28, 36, 38, 46], "manual": [8, 9, 24, 27, 34], "valu": [8, 9, 11, 12, 14, 16, 20, 21, 24, 25, 26, 31, 34, 36, 37, 38, 39, 42, 44, 46, 47], "see": [8, 9, 23, 24, 25, 29, 31, 32, 34, 36, 40, 41, 42, 44, 45, 46, 47], "q": [8, 9, 11, 12, 14, 16, 23, 24, 42], "quantizedtensor": [8, 16, 24], "129": [8, 24, 35], "255": [8, 16, 24], "122": [8, 24], "192": [8, 24], "106": [8, 24], "94": [8, 24], "145": [8, 24], "181": [8, 24], "144": [8, 24], "194": [8, 24], "74": [8, 24], "86": [8, 24], "150": [8, 24], "33": [8, 24], "103": [8, 24], "37": [8, 24], "111": [8, 24], "237": [8, 24], "218": [8, 24], "49": [8, 24], "155": [8, 24], "179": [8, 24], "66": [8, 24, 30], "89": [8, 24], "110": [8, 24], "17": [8, 21, 24], "36": [8, 24], "83": [8, 24], "grad_fn": [8, 9, 16, 24], "aliasbackward0": [8, 9, 16, 24], "ones_lik": [8, 9, 24], "187": [8, 24], "186": [8, 24], "131": [8, 24], "203": [8, 24], "80": [8, 24], "143": [8, 24], "152": [8, 24], "226": [8, 24], "55": [8, 24], "172": [8, 24], "207": [8, 24], "146": [8, 24], "216": [8, 24], "238": [8, 24], "141": [8, 24], "178": [8, 24], "188": [8, 24], "63": [8, 24], "59": [8, 24], "19": [8, 24], "162": [8, 24], "30": [8, 24], "109": [8, 24], "dequant": [9, 12, 16, 22, 23, 24, 42], "overlin": [9, 12, 24], "qdq": [9, 14, 24], "dequantizedtensor": [9, 16, 24], "2771": [9, 24], "3038": [9, 24], "0819": [9, 24], "9700": [9, 24], "9487": [9, 24], "1307": [9, 24], "7894": [9, 24], "1709": [9, 24], "2212": [9, 24], "7741": [9, 24], "0295": [9, 24], "2265": [9, 24], "0564": [9, 24], "6177": [9, 24], "0386": [9, 24], "0176": [9, 24], "6054": [9, 24], "8836": [9, 24], "1232": [9, 
24], "8229": [9, 24], "5540": [9, 24], "3992": [9, 24], "2363": [9, 24], "2546": [9, 24], "0036": [9, 24], "2355": [9, 24], "1741": [9, 24], "6079": [9, 24], "6247": [9, 24], "0115": [9, 24], "2458": [9, 24], "9157": [9, 24], "4694": [9, 24], "0639": [9, 24], "2568": [9, 24], "0680": [9, 24], "6695": [9, 24], "7932": [9, 24], "1889": [9, 24], "0158": [9, 24], "5695": [9, 24], "5220": [9, 24], "1977": [9, 24], "4475": [9, 24], "0424": [9, 24], "1128": [9, 24], "8796": [9, 24], "1060": [9, 24], "5897": [9, 24], "6196": [9, 24], "9961": [9, 24], "0549": [9, 24], "6431": [9, 24], "0039": [9, 24], "8706": [9, 24], "4706": [9, 24], "2353": [9, 24], "8078": [9, 24], "3451": [9, 24], "1176": [9, 24], "4549": [9, 24], "0471": [9, 24], "5255": [9, 24], "4157": [9, 24], "0784": [9, 24], "5333": [9, 12, 24], "1647": [9, 24], "2118": [9, 24], "2196": [9, 24], "9176": [9, 24], "9490": [9, 24], "7765": [9, 24], "4784": [9, 24], "6039": [9, 24], "3137": [9, 24], "3216": [9, 24], "8000": [9, 12, 24], "4392": [9, 24], "4863": [9, 24], "overload": [11, 12], "signatur": [11, 12], "sign": [11, 12, 42], "equival": [11, 12, 14, 25], "rceil": [11, 12], "posit": [11, 12], "integ": [11, 12, 26, 36, 38], "rang": [11, 12, 20, 21, 25, 26, 28, 31, 36, 37, 38, 39, 41, 42, 43, 46], "over": [11, 12, 21, 23, 26, 31, 34, 46], "neg": [11, 12, 23], "num_step": [11, 12, 21], "num": [11, 12], "_step": [11, 12], "maximum": [11, 12, 14, 21, 23], "arang": [11, 12], "start": [11, 12, 25, 26, 31, 34, 40, 42], "0000e": [11, 12], "5000e": [11, 12], "02": [11, 12], "1921e": [11, 12], "08": [11, 12], "4": [11, 12, 16, 25, 28, 31, 36, 48], "6": [11, 12, 39], "00": [11, 12], "0500e": [11, 12], "1000e": [11, 12], "1500e": [11, 12], "2000e": [11, 12], "2500e": [11, 12], "15": [11, 12, 34, 39], "0000": [12, 16], "0667": 12, "1333": 12, "2000": [12, 16], "2667": 12, "3333": 12, "4000": [12, 16], "4667": 12, "6000": [12, 16], "6667": 12, "7333": 12, "8667": 12, "9333": 12, "exponent_bit": 14, "mantissa_bit": 14, "dtype": [14, 16], "simul": [14, 22, 23, 25, 32, 36, 39, 43], "cast": [14, 23], "expon": 14, "mantissa": 14, "x_c": 14, "log_2": 14, "ieee": [14, 34, 37], "standard": [14, 23], "represent": [14, 16], "_max": 14, "argument": 14, "mutual": 14, "exclus": 14, "repres": [14, 16, 23, 24, 25, 31, 36, 37, 38, 39, 42], "determin": [14, 23, 25, 27, 30, 34, 36, 37, 38], "dynam": [14, 37, 42, 43, 46], "finer": 14, "8998": 14, "0947": 14, "0891": 14, "1727": 14, "unlik": 14, "affinequant": 14, "floatquant": 14, "is_bfloat16": 14, "8984": 14, "0859": 14, "1729": 14, "minmaxencodinganalyz": [14, 22], "float16": 14, "is_float16": 14, "8994": 14, "0889": 14, "alia": 14, "hold": [16, 23, 40], "store": 16, "along": [16, 25, 39, 42], "encodingbas": [16, 24], "inform": [16, 36, 38], "necessari": [16, 25, 45], "back": [16, 25, 40], "real": 16, "self": [16, 21, 25], "produc": [16, 21, 31, 38, 45], "57": 16, "312": 16, "153": 16, "205": 16, "set_rang": 16, "128": [16, 25], "127": 16, "x_q": 16, "26": 16, "23": 16, "x_dq": 16, "3000": 16, "equal": [16, 21, 23, 26, 27, 30, 31, 35, 36, 38, 46], "quantized_repr": 16, "data": [16, 22, 25, 26, 28, 33, 36, 37, 38, 39, 41, 42], "rtype": 16, "abl": [16, 25, 26, 45, 46], "carri": 16, "gradient": 16, "thu": 16, "within": [16, 23, 30, 38, 42], "autograd": 16, "backpropag": 16, "requires_grad": 16, "38": [16, 34], "28": 16, "40": 16, "int8": [16, 39, 42, 46], "ha": [16, 25, 30, 31, 34, 37, 39, 42, 45, 48], "been": [16, 36, 39, 42, 48], "subsequ": [16, 35, 37, 39, 40], "about": [16, 25], "wa": [16, 29, 34, 40], "With": 16, 
"convert": [16, 25, 27, 36, 46], "loss": [16, 22, 25, 26, 32, 36, 38, 42], "39": [16, 25], "51": 16, "521": 16, "41": 16, "quant_dequ": 16, "quantizedequant": [16, 22, 23, 24, 25], "x_qdq": 16, "52": 16, "68": 16, "97": 16, "uint8": 16, "techniqu": [19, 20, 21, 22, 25, 26, 27, 29, 30, 32, 36, 38, 39, 41, 42, 43, 44, 47], "num_bin": [20, 21], "2048": [20, 21], "percentil": 20, "100": [20, 25], "set_percentil": 20, "clip": [20, 21, 40, 42], "largest": 20, "smallest": 20, "when": [20, 22, 23, 25, 26, 32, 34, 36, 37, 38, 39, 40, 41, 42, 45, 46, 48], "50": [20, 30], "indic": [20, 23, 30, 48], "asymmetric_delta_candid": 21, "symmetric_delta_candid": 21, "101": 21, "offset_candid": 21, "21": 21, "max_parallel": 21, "gamma": 21, "sqnr": [21, 42], "calcul": [21, 23, 31, 37, 38, 42], "per": [21, 23, 28, 36, 37, 38, 40, 41, 42, 43], "histogram": [21, 36, 38, 42, 43], "delta": [21, 42], "search": [21, 31, 39, 40], "mode": [21, 24, 35, 36, 40], "process": [21, 22, 25, 27, 32, 34, 36, 37, 42], "paral": 21, "higher": [21, 28, 31, 39, 41], "memori": [21, 30, 34, 44, 47, 48], "usag": [21, 22, 30, 34, 41], "faster": [21, 26, 32, 39], "factor": [21, 30, 34, 37], "nois": [21, 25, 36, 37, 38, 39, 40], "less": [21, 23, 29, 31], "compute_encodings_from_stat": 21, "stat": 21, "is_symmetr": [21, 40], "lowest": 21, "expect": [21, 25, 34, 36, 38], "_histogram": 21, "split": [21, 23], "els": [21, 25, 37], "tool": [22, 25, 34, 37, 46, 48], "compress": [22, 29, 32, 43, 44, 46, 47, 48], "essenti": 22, "deploi": [22, 42], "edg": [22, 32], "devic": [22, 25, 42], "fix": [22, 32, 36, 41, 42, 43], "post": [22, 25, 26, 27, 32, 34, 39, 42, 43], "fine": [22, 30, 32, 36, 39, 42], "tune": [22, 30, 32, 36, 39, 42], "minim": [22, 32, 34, 36, 42], "accuraci": [22, 25, 26, 27, 30, 31, 32, 34, 36, 37, 38, 39, 41, 42, 43, 46, 48], "incur": [22, 32, 38], "pictur": [22, 29, 32], "show": [22, 25, 32, 37, 41], "high": [22, 26, 28, 30, 31, 32, 37, 41, 43, 46], "level": [22, 28, 30, 31, 32, 36, 41, 45], "view": [22, 25, 26, 27, 28, 32, 35, 37, 38, 42, 45], "workflow": [22, 25, 30, 32], "low": [22, 26, 28, 34, 36, 37, 41], "infer": [22, 25, 28, 30, 32, 37, 39, 42, 43], "recov": [22, 32, 41, 42], "lost": [22, 32], "via": [22, 30, 32, 42], "torchscript": 22, "target": [22, 28, 30, 31, 32, 34, 36, 41, 42, 43], "runtim": [22, 25, 30, 32, 34, 36, 38, 40, 42, 43], "like": [22, 25, 32, 34, 36, 38, 39, 40, 45], "neural": [22, 25, 27, 30, 32, 34, 36, 39, 41, 42, 47], "sdk": [22, 25, 32], "instal": [22, 43], "sim": [22, 25, 39, 42], "quantsim": [22, 36, 39, 40, 43], "quantizationsimmodel": [22, 25, 26, 28], "sample_input": [22, 25], "sampl": [22, 23, 25, 29, 36, 37, 38, 39, 42], "data_load": [22, 25], "sample_output": 22, "out_dir": 22, "quantized_model": 22, "quickstart": 22, "guid": [22, 30, 37, 41, 43], "depth": [22, 30, 41], "adapt": [22, 25, 26, 36, 38, 43], "round": [22, 23, 26, 36, 38, 42], "adaround": [22, 27, 36, 41, 43], "sqnrencodinganalyz": 22, "percentileencodinganalyz": 22, "fakequantizationmixin": [22, 23], "quantizationmixin": [22, 23], "quantize_dequant": 22, "product": [22, 32], "technologi": [22, 32], "subsidiari": [22, 32], "effect": [23, 25, 28, 36, 38, 40, 42], "network": [23, 25, 27, 30, 31, 34, 36, 39, 41, 42, 45, 47], "reduc": [23, 29, 34, 37, 41, 43, 48], "aimet": [23, 25, 32, 35, 40], "serv": [23, 45], "drop": [23, 27, 30, 34, 37, 38, 39, 41, 42], "counterpart": 23, "behavior": [23, 25, 32], "state": [23, 25, 34], "superset": 23, "mean": [23, 25, 29, 40, 42], "extens": 23, "coverag": 23, "limit": [23, 33], "tabl": [23, 31, 
35, 45], "basequantizationmixin": 23, "control": [23, 42], "descript": [23, 35], "dict": [23, 24], "By": [23, 34, 40, 42], "index": [23, 30, 43], "respect": [23, 38], "channel": [23, 28, 30, 31, 33, 34, 37, 38, 40, 41, 42, 43, 44, 46, 47, 48], "dimens": [23, 34, 41, 44, 47], "per_channel_quant": [23, 40], "elementwis": [23, 43], "multipli": [23, 30], "second": [23, 40], "qmul": 23, "quantizedmultipli": 23, "In": [23, 25, 26, 27, 30, 31, 34, 36, 37, 39, 40, 42, 46, 48], "some": [23, 25, 26, 30, 31, 34, 35, 36, 37, 39, 41, 42], "make": [23, 31, 34, 35, 36, 42], "sens": 23, "qadd": 23, "quantizedadd": 23, "befor": [23, 25, 26, 27, 28, 34, 36, 39, 45, 46], "must": [23, 28, 32, 33, 38, 40, 48], "first": [23, 25, 30, 34, 36, 39, 45], "disabl": [23, 31, 34, 38, 40, 42], "while": [23, 26, 31, 35, 36, 39, 41, 42, 45], "activ": [23, 25, 36, 38, 39, 40, 41, 42], "them": [23, 25, 26, 48], "how": [23, 25, 34, 37, 38, 41, 42], "sever": [23, 30], "calibration_data_load": 23, "adaptiveavgpool1d": 23, "fakequantizedadaptiveavgpool1d": 23, "adaptiveavgpool2d": 23, "fakequantizedadaptiveavgpool2d": 23, "adaptiveavgpool3d": 23, "fakequantizedadaptiveavgpool3d": 23, "adaptivemaxpool1d": 23, "fakequantizedadaptivemaxpool1d": 23, "adaptivemaxpool2d": 23, "fakequantizedadaptivemaxpool2d": 23, "adaptivemaxpool3d": 23, "fakequantizedadaptivemaxpool3d": 23, "alphadropout": 23, "fakequantizedalphadropout": 23, "avgpool1d": 23, "fakequantizedavgpool1d": 23, "avgpool2d": 23, "fakequantizedavgpool2d": 23, "avgpool3d": 23, "fakequantizedavgpool3d": 23, "batchnorm1d": 23, "fakequantizedbatchnorm1d": 23, "batchnorm2d": [23, 25], "fakequantizedbatchnorm2d": 23, "batchnorm3d": 23, "fakequantizedbatchnorm3d": 23, "celu": 23, "fakequantizedcelu": 23, "channelshuffl": 23, "fakequantizedchannelshuffl": 23, "constantpad1d": 23, "fakequantizedconstantpad1d": 23, "constantpad2d": 23, "fakequantizedconstantpad2d": 23, "constantpad3d": 23, "fakequantizedconstantpad3d": 23, "conv1d": [23, 43], "fakequantizedconv1d": 23, "quantizedconv1d": 23, "conv2d": [23, 25, 29, 34, 43, 48], "fakequantizedconv2d": 23, "quantizedconv2d": [23, 25], "conv3d": 23, "fakequantizedconv3d": 23, "quantizedconv3d": 23, "convtranspose1d": [23, 43], "fakequantizedconvtranspose1d": 23, "convtranspose2d": 23, "fakequantizedconvtranspose2d": 23, "convtranspose3d": 23, "fakequantizedconvtranspose3d": 23, "crossmaplrn2d": 23, "fakequantizedcrossmaplrn2d": 23, "dropout": 23, "fakequantizeddropout": 23, "dropout2d": 23, "fakequantizeddropout2d": 23, "dropout3d": 23, "fakequantizeddropout3d": 23, "elu": 23, "fakequantizedelu": 23, "featurealphadropout": 23, "fakequantizedfeaturealphadropout": 23, "flatten": 23, "fakequantizedflatten": 23, "fold": [23, 26, 27, 28, 36, 37, 38, 43], "fakequantizedfold": 23, "fractionalmaxpool2d": 23, "fakequantizedfractionalmaxpool2d": 23, "fractionalmaxpool3d": 23, "fakequantizedfractionalmaxpool3d": 23, "gelu": 23, "fakequantizedgelu": 23, "quantizedgelu": 23, "glu": 23, "fakequantizedglu": 23, "groupnorm": 23, "fakequantizedgroupnorm": 23, "hardshrink": 23, "fakequantizedhardshrink": 23, "hardsigmoid": 23, "fakequantizedhardsigmoid": 23, "hardswish": 23, "fakequantizedhardswish": 23, "hardtanh": 23, "fakequantizedhardtanh": 23, "ident": [23, 25], "fakequantizedident": 23, "instancenorm1d": 23, "fakequantizedinstancenorm1d": 23, "instancenorm2d": 23, "fakequantizedinstancenorm2d": 23, "instancenorm3d": 23, "fakequantizedinstancenorm3d": 23, "lppool1d": 23, "fakequantizedlppool1d": 23, "lppool2d": 23, "fakequantizedlppool2d": 23, 
"layernorm": 23, "fakequantizedlayernorm": 23, "quantizedlayernorm": 23, "leakyrelu": 23, "fakequantizedleakyrelu": 23, "localresponsenorm": 23, "fakequantizedlocalresponsenorm": 23, "logsigmoid": 23, "fakequantizedlogsigmoid": 23, "logsoftmax": 23, "fakequantizedlogsoftmax": 23, "maxpool1d": 23, "fakequantizedmaxpool1d": 23, "maxpool2d": 23, "fakequantizedmaxpool2d": 23, "maxpool3d": 23, "fakequantizedmaxpool3d": 23, "maxunpool1d": 23, "fakequantizedmaxunpool1d": 23, "maxunpool2d": 23, "fakequantizedmaxunpool2d": 23, "maxunpool3d": 23, "fakequantizedmaxunpool3d": 23, "mish": 23, "fakequantizedmish": 23, "prelu": 23, "fakequantizedprelu": 23, "pixelshuffl": 23, "fakequantizedpixelshuffl": 23, "pixelunshuffl": 23, "fakequantizedpixelunshuffl": 23, "rrelu": 23, "fakequantizedrrelu": 23, "relu": [23, 25, 37, 40, 48], "fakequantizedrelu": [23, 25], "relu6": [23, 37], "fakequantizedrelu6": 23, "reflectionpad1d": 23, "fakequantizedreflectionpad1d": 23, "reflectionpad2d": 23, "fakequantizedreflectionpad2d": 23, "replicationpad1d": 23, "fakequantizedreplicationpad1d": 23, "replicationpad2d": 23, "fakequantizedreplicationpad2d": 23, "replicationpad3d": 23, "fakequantizedreplicationpad3d": 23, "selu": 23, "fakequantizedselu": 23, "silu": 23, "fakequantizedsilu": 23, "sigmoid": 23, "fakequantizedsigmoid": 23, "quantizedsigmoid": 23, "softmax": [23, 25], "fakequantizedsoftmax": 23, "quantizedsoftmax": [23, 25], "softmax2d": 23, "fakequantizedsoftmax2d": 23, "softmin": 23, "fakequantizedsoftmin": 23, "softplu": 23, "fakequantizedsoftplu": 23, "softshrink": 23, "fakequantizedsoftshrink": 23, "softsign": 23, "fakequantizedsoftsign": 23, "syncbatchnorm": 23, "fakequantizedsyncbatchnorm": 23, "tanh": 23, "fakequantizedtanh": 23, "tanhshrink": 23, "fakequantizedtanhshrink": 23, "threshold": [23, 27], "fakequantizedthreshold": 23, "unflatten": 23, "fakequantizedunflatten": 23, "unfold": 23, "fakequantizedunfold": 23, "upsampl": [23, 35], "fakequantizedupsampl": 23, "upsamplingbilinear2d": 23, "fakequantizedupsamplingbilinear2d": 23, "upsamplingnearest2d": 23, "fakequantizedupsamplingnearest2d": 23, "zeropad2d": 23, "fakequantizedzeropad2d": 23, "bceloss": 23, "fakequantizedbceloss": 23, "bcewithlogitsloss": 23, "fakequantizedbcewithlogitsloss": 23, "bilinear": [23, 35], "fakequantizedbilinear": 23, "ctcloss": 23, "fakequantizedctcloss": 23, "cosinesimilar": 23, "fakequantizedcosinesimilar": 23, "crossentropyloss": [23, 25], "fakequantizedcrossentropyloss": 23, "hingeembeddingloss": 23, "fakequantizedhingeembeddingloss": 23, "huberloss": 23, "fakequantizedhuberloss": 23, "kldivloss": 23, "fakequantizedkldivloss": 23, "l1loss": 23, "fakequantizedl1loss": 23, "mseloss": 23, "fakequantizedmseloss": 23, "multilabelmarginloss": 23, "fakequantizedmultilabelmarginloss": 23, "multilabelsoftmarginloss": 23, "fakequantizedmultilabelsoftmarginloss": 23, "multimarginloss": 23, "fakequantizedmultimarginloss": 23, "nllloss": 23, "fakequantizednllloss": 23, "nllloss2d": 23, "fakequantizednllloss2d": 23, "pairwisedist": 23, "fakequantizedpairwisedist": 23, "poissonnllloss": 23, "fakequantizedpoissonnllloss": 23, "smoothl1loss": 23, "fakequantizedsmoothl1loss": 23, "softmarginloss": 23, "fakequantizedsoftmarginloss": 23, "cosineembeddingloss": 23, "fakequantizedcosineembeddingloss": 23, "gaussiannllloss": 23, "fakequantizedgaussiannllloss": 23, "marginrankingloss": 23, "fakequantizedmarginrankingloss": 23, "tripletmarginloss": 23, "fakequantizedtripletmarginloss": 23, "tripletmarginwithdistanceloss": 23, 
"fakequantizedtripletmarginwithdistanceloss": 23, "embed": [23, 34, 41], "fakequantizedembed": 23, "embeddingbag": 23, "fakequantizedembeddingbag": 23, "gru": [23, 43], "fakequantizedgru": 23, "rnn": [23, 43], "fakequantizedrnn": 23, "grucel": 23, "fakequantizedgrucel": 23, "rnncell": 23, "fakequantizedrnncel": 23, "lstm": [23, 43], "fakequantizedlstm": 23, "lstmcell": 23, "fakequantizedlstmcel": 23, "adaptivelogsoftmaxwithloss": 23, "fakequantizedadaptivelogsoftmaxwithloss": 23, "aimet_op": 23, "fakequantizedcast": 23, "depthtospacedcrmod": 23, "fakequantizeddepthtospacedcrmod": 23, "onehot": 23, "fakequantizedonehot": 23, "exponenti": 23, "fakequantizedexponenti": 23, "erf": 23, "fakequantizederf": 23, "sqrt": 23, "fakequantizedsqrt": 23, "log": [23, 38], "fakequantizedlog": 23, "ab": [23, 37], "fakequantizedab": 23, "fakequantizedneg": 23, "elementwiseceil": 23, "fakequantizedelementwiseceil": 23, "elementwisefloor": 23, "fakequantizedelementwisefloor": 23, "sin": 23, "fakequantizedsin": 23, "co": 23, "fakequantizedco": 23, "asin": 23, "fakequantizedasin": 23, "atan": 23, "fakequantizedatan": 23, "fakequantizedround": 23, "logicalnot": 23, "fakequantizedlogicalnot": 23, "nonzero": 23, "fakequantizednonzero": 23, "elementwiseunarysign": 23, "fakequantizedelementwiseunarysign": 23, "rsqrt": 23, "fakequantizedrsqrt": 23, "squar": [23, 42], "fakequantizedsquar": 23, "fakequantizedmean": 23, "sum": [23, 25], "fakequantizedsum": 23, "prod": 23, "fakequantizedprod": 23, "argmin": 23, "fakequantizedargmin": 23, "argmax": [23, 25], "fakequantizedargmax": 23, "gather": 23, "fakequantizedgath": 23, "reshap": 23, "fakequantizedreshap": 23, "roialign": 23, "fakequantizedroialign": 23, "permut": 23, "fakequantizedpermut": 23, "indexselect": 23, "fakequantizedindexselect": 23, "topk": 23, "fakequantizedtopk": 23, "tile": 23, "fakequantizedtil": 23, "norm": [23, 26, 28, 36, 37, 38], "fakequantizednorm": 23, "cumsum": 23, "fakequantizedcumsum": 23, "interpol": [23, 31], "fakequantizedinterpol": 23, "normal": [23, 28, 38], "pad": [23, 25], "fakequantizedpad": 23, "fakequantizedshap": 23, "expand": 23, "fakequantizedexpand": 23, "stridedslic": 23, "fakequantizedstridedslic": 23, "matmul": [23, 43], "fakequantizedmatmul": 23, "fakequantizedadd": 23, "fakequantizedmultipli": 23, "subtract": 23, "fakequantizedsubtract": 23, "quantizedsubtract": 23, "divid": [23, 39], "fakequantizeddivid": 23, "floordivid": 23, "fakequantizedfloordivid": 23, "greater": 23, "fakequantizedgreat": 23, "fakequantizedless": 23, "greaterequ": 23, "fakequantizedgreaterequ": 23, "lessequ": 23, "fakequantizedlessequ": 23, "notequ": 23, "fakequantizednotequ": 23, "fakequantizedequ": 23, "remaind": 23, "fakequantizedremaind": 23, "fmod": 23, "fakequantizedfmod": 23, "pow": 23, "fakequantizedpow": 23, "customsilu": 23, "fakequantizedcustomsilu": 23, "fakequantizedmaximum": 23, "fakequantizedmax": 23, "fakequantizedminimum": 23, "fakequantizedmin": 23, "bmm": 23, "fakequantizedbmm": 23, "logicalor": 23, "fakequantizedlogicalor": 23, "logicaland": 23, "fakequantizedlogicaland": 23, "customgath": 23, "fakequantizedcustomgath": 23, "gathernd": 23, "fakequantizedgathernd": 23, "baddbmm": 23, "fakequantizedbaddbmm": 23, "addmm": 23, "fakequantizedaddmm": 23, "scatternd": 23, "fakequantizedscatternd": 23, "dynamicconv2d": 23, "fakequantizeddynamicconv2d": 23, "scatterel": 23, "fakequantizedscatterel": 23, "batchnorm": [23, 27, 37, 48], "fakequantizedbatchnorm": 23, "fakequantizedaimetgroupnorm": 23, "nonmaxsuppress": 23, 
"fakequantizednonmaxsuppress": 23, "fakequantizedsplit": 23, "concat": [23, 43], "fakequantizedconcat": 23, "fakequantizedwher": 23, "maskedfil": 23, "fakequantizedmaskedfil": 23, "allow_overwrit": 24, "allow_overwit": 24, "flag": 24, "abstract": 24, "get_encod": 24, "get_legacy_encod": 24, "register_quantization_paramet": 24, "param": [24, 40], "set_legacy_encod": 24, "tutori": 25, "simpl": [25, 36, 48], "intend": [25, 30], "most": [25, 40], "It": [25, 28, 31, 36, 37, 40, 45, 46, 48], "meant": 25, "demonstr": 25, "art": 25, "eval": [25, 31, 34, 45], "loop": [25, 41], "evalu": [25, 27, 31, 34, 36, 38, 39, 42, 45], "improv": [25, 30, 36, 39, 41, 46], "clearli": 25, "what": [25, 42, 45], "happen": 25, "let": 25, "code": [25, 26], "optim": [25, 26, 27, 32, 34, 36, 39, 42, 43, 45], "special": 25, "requir": [25, 26, 28, 30, 34, 36, 37, 40, 42], "look": [25, 45], "torchvis": 25, "is_avail": 25, "loader": [25, 26], "cifar10_train_data": 25, "dataset": [25, 36, 37, 42], "fashionmnist": 25, "tmp": 25, "cifar10": 25, "transform": [25, 43], "totensor": 25, "cifar10_test_data": 25, "train_load": 25, "util": [25, 28, 36], "dataload": [25, 38], "batch_siz": 25, "shuffl": 25, "test_load": 25, "def": 25, "super": 25, "conv1": 25, "in_channel": 25, "out_channel": 25, "kernel_s": 25, "stride": 25, "bn_1": 25, "conv2": 25, "256": [25, 38], "bn_2": 25, "dim": 25, "total": [25, 31, 42], "now": [25, 36, 43, 48], "instanti": [25, 39, 45], "few": [25, 30, 36, 41, 42], "epoch": [25, 32, 34, 36, 39], "establish": 25, "baselin": [25, 31, 39], "send": 25, "loss_fn": 25, "adam": 25, "lr": 25, "1e": [25, 39], "batch_idx": 25, "enumer": [25, 28], "backward": 25, "zero_grad": 25, "fp_accuraci": 25, "91": 25, "70999908447266": 25, "accur": 25, "coupl": [25, 26], "take": [25, 32, 34, 36, 37, 39, 40, 41, 48], "care": 25, "conform": 25, "guidelin": [25, 26, 30, 39], "math": 25, "wherea": [25, 42], "incorrectli": 25, "ignor": 25, "definit": [25, 36], "complet": [25, 28, 41], "redefin": 25, "thankfulli": 25, "model_prepar": 25, "incompat": 25, "fulli": [25, 33], "prepared_model": 25, "prepare_model": 25, "fp_accuracy_prepar": 25, "assert": 25, "2024": 25, "07": 25, "747": 25, "root": 25, "info": [25, 43], "806": 25, "modelprepar": 25, "node": [25, 39, 42], "module_relu": 25, "module_relu_1": 25, "module_softmax": 25, "graphmodul": 25, "ep": 25, "momentum": 25, "track_running_stat": 25, "12544": 25, "getattr_1": 25, "getitem": 25, "debug": [25, 41], "graph_modul": 25, "print_read": 25, "distinct": 25, "execut": [25, 31, 45], "typic": [25, 30, 36, 38, 39, 40, 42, 45], "adjac": [25, 40], "convolut": [25, 28, 30, 34, 41], "whenev": 25, "possibl": [25, 38, 40, 41], "unnecessari": [25, 48], "good": [25, 26], "idea": 25, "batch_norm_fold": 25, "iter": [25, 26, 37], "fold_all_batch_norm": 25, "input_shap": 25, "passthrough": 25, "previous": 25, "had": 25, "impact": [25, 31, 41], "readi": [25, 41], "involv": [25, 36, 41], "e": [25, 28, 30, 32, 39, 41, 48], "encount": 25, "therefor": [25, 30, 37], "theoret": 25, "could": [25, 29, 48], "entir": [25, 31, 34], "practic": [25, 34], "usual": [25, 39], "500": [25, 26, 37, 38], "1000": [25, 26, 37, 38], "estim": [25, 36, 37], "configur": [25, 30, 33, 43], "dummy_input": 25, "default_output_bw": 25, "default_param_bw": 25, "idx": 25, "break": 25, "compar": [25, 38, 39, 46], "quantized_accuraci": 25, "n": [25, 43], "1500015258789": 25, "here": [25, 30, 39, 45], "noth": 25, "than": [25, 33, 39, 45], "everi": [25, 31, 34, 39, 46], "construct": [25, 35], "discuss": [25, 30, 41, 42], "advanc": 25, 
"re": [25, 36], "satisfi": [25, 27], "One": [25, 30, 34, 44], "qat": [25, 26, 28, 32, 36, 41, 42, 43], "op": [25, 36, 40, 43], "present": [25, 34, 37], "repeat": [25, 29], "time": [25, 27, 34, 35, 39, 45], "post_qat_accuraci": 25, "92": 25, "05333709716797": 25, "happi": 25, "export_path": 25, "model_nam": 25, "fashion_mnist_model": 25, "save": [25, 27, 42, 46], "sent": 25, "nearest": 26, "figur": [26, 31, 41, 48], "singl": [26, 37], "shown": [26, 34, 37, 38, 41], "illustr": [26, 31, 36, 44, 47], "smaller": [26, 32, 41, 44, 47], "subset": [26, 28, 38, 48], "unlabel": [26, 36, 38, 42], "far": 26, "decid": [26, 45], "whether": [26, 39], "awai": 26, "closer": 26, "fp32": [26, 32, 37, 38, 39, 41, 42], "width": [26, 41, 42, 44, 47, 48], "freez": 26, "refer": [26, 27, 28, 32, 36, 38, 39, 40, 42], "bc": 26, "bnf": 26, "batch": [26, 28, 36, 37, 38], "cle": [26, 36, 41, 43], "cross": [26, 27, 35, 36, 38, 46], "hbf": 26, "awar": [26, 28, 32, 36, 41, 42], "benefit": 26, "don": 26, "But": [26, 34], "benefici": [26, 38, 39], "consid": [26, 31, 36, 41], "better": [26, 27, 36, 37, 39], "help": [26, 31, 34, 36, 37, 38, 41, 45, 46], "Not": [26, 31], "hyper": [26, 39], "expos": 26, "lead": [26, 28, 37, 41, 42], "stabl": 26, "mani": [26, 37, 42], "often": [26, 27, 34, 39], "approxim": [26, 30, 37, 38], "1024": [26, 35], "10000": 26, "moder": 26, "least": [26, 29], "beta": 26, "warm": 26, "period": 26, "kera": [26, 28, 32, 36, 37, 38, 40, 42, 43], "offer": 27, "suit": 27, "sequenc": [27, 28, 35, 40], "variou": [27, 30, 34, 36, 41, 42, 43, 46], "combin": [27, 30, 34, 36, 37], "error": [27, 36, 39, 41, 42], "prone": 27, "consum": [27, 34], "addit": [27, 36, 39, 40, 43], "amount": [27, 40], "toler": [27, 30], "As": [27, 29, 30, 31, 34, 36, 37, 38, 42, 44, 47], "soon": 27, "reach": [27, 30], "stop": 27, "summari": 27, "autom": [27, 36], "prepar": [27, 36, 43], "check": [27, 36, 39, 41], "valid": [27, 36, 43], "friendli": [27, 36, 37], "denot": 27, "select": [27, 30, 38, 42, 45, 48], "best": [27, 30, 34, 36, 42], "scheme": [27, 28, 31, 34, 38], "quantschem": 27, "preprat": 27, "mainli": 27, "consist": [27, 42, 48], "three": [27, 30, 46], "stage": 27, "effort": 27, "manner": 27, "fail": [27, 35, 36], "goal": 27, "small": [28, 32, 36], "individu": [28, 29, 30, 31, 34, 36, 38, 41], "adjust": [28, 29, 30, 36, 37, 41], "preceed": 28, "learn": [28, 34, 36, 39, 42, 43], "pcq": [28, 38], "veri": [28, 30, 34, 38, 46, 48], "NOT": [28, 48], "cover": [28, 40, 42], "scenario": [28, 34, 36, 48], "decreas": 28, "main": [28, 40, 43, 46], "issu": [28, 32, 35, 41, 43, 45, 46], "depthwis": [28, 43], "separ": [28, 38, 41, 43], "sinc": [28, 30, 31, 42], "affect": [28, 40, 48], "oscil": 28, "quant": 28, "onc": [28, 29, 34, 38, 39, 42], "flow": [28, 36, 39, 41, 42], "diagram": [28, 31, 34, 42, 44, 47], "work": [28, 34, 35, 37, 40], "explain": [29, 34, 37, 42, 48], "differ": [29, 31, 34, 36, 37, 39, 40, 41, 42], "occurr": 29, "detail": [29, 31, 32, 34, 36, 41, 42, 45, 46], "ratio": [29, 30, 45], "magnitud": 29, "choos": [29, 30, 34], "matrix": 29, "upstream": [29, 48], "also": [29, 30, 31, 36, 38, 40, 41, 42, 43, 45, 46, 48], "gain": [29, 34], "presenc": 29, "connect": [29, 33, 47], "residu": 29, "sometim": [29, 34, 37, 38], "prevent": 29, "final": [29, 30, 31, 39, 41, 45], "attempt": [29, 36, 37], "match": [29, 34, 38, 40, 41, 42, 48], "close": [29, 30, 42], "prior": [29, 36, 38], "collect": [29, 38], "random": [29, 38], "regress": 29, "document": [30, 32, 43], "svd": [30, 31, 33, 34, 43], "spatial": [30, 31, 33, 34, 43], "ssvd": 30, 
"prune": [30, 31, 33, 34, 43, 48], "accumul": 30, "mac": [30, 34, 44, 47], "reduct": 30, "uncompress": 30, "algorithm": [30, 31, 34, 41, 48], "overal": [30, 34, 41], "latenc": 30, "bandwidth": 30, "vari": [30, 31, 37, 46], "architectur": 30, "io": [30, 43], "At": [30, 34], "half": 30, "unknown": 30, "apriori": 30, "cssvd": 30, "tri": [30, 36], "75": 30, "would": [30, 34, 40, 43, 45], "pick": [30, 31, 34], "2b": 30, "rel": [30, 36, 41, 46], "avoid": 30, "larg": [30, 39, 44, 47], "2a": 30, "revisit": 30, "ccp": 30, "resnet": 30, "csvd": 30, "basi": [31, 34], "assess": 31, "sensit": [31, 36, 38, 41, 42, 43], "applic": [31, 35], "find": [31, 36, 38, 39, 42], "sure": [31, 35], "highest": 31, "remain": [31, 36, 37, 42], "dictionari": [31, 34, 40], "column": 31, "captur": 31, "predefin": 31, "candid": [31, 34], "unmodifi": 31, "score": [31, 34, 45], "last": [31, 33, 41], "known": [31, 32], "monoton": 31, "fit": 31, "strict": [31, 40, 42], "increas": [31, 37, 40], "procedur": [31, 34], "curv": 31, "core": 31, "cost": [31, 34, 39], "constant": [31, 36], "met": 31, "binari": 31, "solut": [31, 39, 41], "quickli": 31, "suggest": [31, 34, 37], "lower": [31, 36, 41], "lesser": [31, 34], "fall": [31, 40], "drstical": 31, "softwar": [32, 34], "either": [32, 42], "framework": [32, 36, 40, 42], "meta": [32, 36], "h5": [32, 36], "hw": 32, "ptq": [32, 36, 38, 39], "redund": 32, "conv": [33, 40, 43, 44, 47, 48], "dilat": 33, "modules_to_ignor": 33, "depthwiseconv2d": 33, "guidebook": [34, 36], "advic": 34, "greedi": [34, 45], "phase": [34, 36], "choic": [34, 42], "nomin": 34, "And": 34, "ml": [34, 36, 37, 45, 46], "those": 34, "fc": 34, "certain": [34, 35, 36, 40], "decompos": [34, 44, 47], "term": [34, 44, 45, 46, 47], "sharp": 34, "degrad": 34, "might": [34, 38], "respons": 34, "rate": [34, 39], "carefulli": 34, "decai": 34, "togeth": 34, "slow": 34, "someth": [34, 45], "speed": [34, 37, 43], "itself": [34, 42, 44, 47], "part": [34, 36, 37, 38], "experi": 34, "load": 34, "searcher": 34, "Or": 34, "granular": [34, 41, 42, 46], "strike": 34, "balanc": 34, "chosen": 34, "experiment": [34, 40], "major": 34, "sai": 34, "xiangyu": 34, "zhang": 34, "jianhua": 34, "zou": 34, "kaim": 34, "he": 34, "jian": 34, "sun": 34, "deep": 34, "classif": 34, "detect": 34, "transact": 34, "pattern": 34, "analysi": [34, 41], "intellig": 34, "vol": 34, "pp": 34, "1943": 34, "1955": 34, "oct": 34, "2016": 34, "yihui": 34, "intern": [34, 36, 37, 40], "confer": [34, 37], "vision": [34, 37], "iccv": [34, 37], "venic": 34, "2017": 34, "1398": 34, "1406": 34, "jaderberg": 34, "andrea": 34, "vedaldi": 34, "andrew": 34, "zisserman": 34, "expans": 34, "british": 34, "jan": 34, "2014": 34, "andrei": 34, "kuzmin": 34, "marku": [34, 37], "nagel": [34, 37], "saurabh": 34, "pitr": 34, "sandeep": 34, "pendyam": 34, "tijmen": [34, 37], "blankevoort": [34, 37], "taxonomi": 34, "cross_layer_equ": 35, "equalize_model": 35, "graph": [35, 36, 42, 45], "restrict": 35, "successfulli": 35, "potenti": [35, 38, 45, 46], "workaround": 35, "primit": 35, "around": 35, "rewrit": 35, "slice": 35, "written": [35, 36], "caus": [35, 41, 42], "statement": 35, "align_corn": 35, "deconvolut": 35, "deeplabv3": 35, "address": [35, 41, 45], "releas": 35, "hardwar": [36, 37, 42], "howev": [36, 37, 39, 40, 42], "introduc": [36, 40, 42], "due": [36, 37], "predict": 36, "oppos": [36, 40], "advantag": 36, "No": 36, "pipelin": [36, 39, 41, 42], "suffici": [36, 38, 39, 42], "even": 36, "fast": 36, "easi": [36, 38], "still": [36, 41], "gap": 36, "insert": [36, 42], "robust": 36, 
"longer": [36, 39], "account": [36, 39, 41], "trainabl": 36, "bias": 36, "reflect": [36, 42], "autoqu": [36, 39, 43], "integr": 36, "describ": [36, 37, 41, 42], "standalon": 36, "consecut": [36, 37], "bn": [36, 43], "deprec": 36, "advis": [36, 40], "instead": [36, 37], "quantanalyz": [36, 43], "understand": [36, 40, 45, 46], "prep": 36, "accord": [36, 39, 40, 42], "align": 36, "retri": 36, "continu": [36, 37, 39, 41], "warn": 36, "hand": 36, "satisfactori": [36, 41], "bring": 36, "onto": 36, "thing": 36, "item": 36, "checkpoint": 36, "pb": 36, "trial": 36, "particular": [36, 40], "seem": 36, "off": [36, 37, 40], "bat": 36, "becom": 37, "design": 37, "paper": 37, "2019": 37, "arxiv": 37, "1906": 37, "04721": 37, "surround": 37, "highlight": [37, 45, 46], "big": 37, "discrep": 37, "accept": [37, 41], "wide": 37, "varianc": 37, "across": [37, 38], "seen": [37, 38], "significantli": 37, "similar": [37, 39, 42], "quantizaion": 37, "distribut": [37, 41, 42], "did": 37, "shift": 37, "whose": [37, 40, 48], "empir": 37, "analyt": [37, 45, 46], "extract": 37, "bottleneck": [37, 41], "hybrid": 37, "approach": [37, 42], "mart": 37, "van": 37, "baalen": 37, "seoul": 37, "octob": 37, "hotspot": 38, "analys": 38, "callback": [38, 42], "mse": [38, 42], "plot": 38, "pretrain": [38, 39, 42], "dummi": 38, "label": [38, 39], "metric": [38, 42], "rune": 38, "relat": [38, 42], "doc": [38, 40, 45], "situat": 38, "pinpoint": 38, "culprit": 38, "again": [38, 39, 45], "per_layer_quant_en": 38, "per_layer_quant_dis": 38, "axi": 38, "track": 38, "directli": [38, 42], "min_max_rang": 38, "folder": 38, "enhanc": [38, 42], "toss": 38, "outlier": [38, 42], "displai": [38, 45, 46], "activations_pdf": 38, "weights_pdf": 38, "monitor": 38, "contribut": [38, 41], "read": 38, "per_layer_mse_loss": 38, "mitig": [39, 42], "come": [39, 42], "hyperparamet": 39, "accompani": 39, "found": [39, 42], "throughout": [39, 40, 46], "themselv": 39, "aid": 39, "converg": 39, "schedul": 39, "placement": 40, "rule": 40, "fuse": [40, 42], "thei": [40, 45], "six": 40, "overrul": 40, "turn": 40, "op_typ": 40, "purpos": 40, "empti": 40, "is_output_quant": 40, "is_quant": 40, "strict_symmetr": 40, "unsigned_symmetr": 40, "though": 40, "omit": 40, "altogeth": 40, "asid": 40, "govern": 40, "unsign": [40, 42], "gemm": 40, "is_input_quant": 40, "recogn": [40, 42], "keep": [40, 41], "convent": 40, "preced": 40, "supergroup": [40, 43], "made": 40, "op_list": 40, "member": 40, "sequenti": [40, 41], "branch": 40, "config": [40, 43], "entri": 40, "string": 40, "model_input": 40, "whatev": 40, "earlier": 40, "model_output": 40, "diagnost": 41, "strictli": 41, "insight": [41, 45, 46], "why": 41, "underperform": 41, "tackl": 41, "chart": 41, "saniti": 41, "similarli": 41, "ofth": 41, "independ": 41, "kept": 41, "convers": 41, "toward": 41, "signific": 41, "wise": 41, "uneven": 41, "vanilla": 41, "global": 41, "restor": 41, "rest": 41, "inner": 41, "token": 41, "bert": 41, "reveal": 41, "problemat": [41, 46], "problem": 41, "resort": 41, "revert": 41, "power": 41, "ultim": 42, "copi": 42, "ingest": 42, "feed": 42, "000": 42, "yield": 42, "dequantiz": 42, "hook": 42, "intercept": 42, "four": 42, "zero": [42, 43], "vice": 42, "versa": 42, "textrm": 42, "dfrac": 42, "whole": 42, "strong": 42, "excess": 42, "signal": 42, "satur": 42, "erro": 42, "static": 42, "alongsid": 42, "ones": 42, "just": [42, 45, 48], "non": 42, "intermedi": 42, "slim": 43, "backslash": 43, "cl": 43, "user_guid": 43, "api_doc": 43, "quantizablemultiheadattent": 43, "kyuykim": 43, "multi": 
43, "mangal": 43, "logic": 43, "geunle": 43, "bug": 43, "correctli": 43, "leaf": 43, "klhsieh": 43, "akhobar": 43, "resid": 43, "multiheadattent": 43, "ashvkuma": 43, "mha": 43, "pdf": 43, "fp16": 43, "minor": 43, "stand": [43, 44, 47], "adaptiveround": 43, "recurr": 43, "packag": 43, "decomposit": [44, 47], "singular": [44, 47], "\ud835\udc5a": [44, 47], "\ud835\udc5b": [44, 47], "\u210e": [44, 47], "\ud835\udc64": [44, 47], "give": [44, 47], "height": [44, 47, 48], "\ud835\udc58": [44, 47], "k": 44, "rank": [44, 47], "larger": [44, 47], "degre": [44, 47], "assist": [45, 46], "progress": [45, 46], "computation": [45, 46], "task": [45, 46], "websocket": 45, "tell": 45, "listen": 45, "rather": 45, "5006": 45, "compress_model": 45, "visualizecompress": 45, "display_eval_scor": 45, "display_comp_ratio_plot": 45, "directori": 46, "lot": 46, "anoth": [47, 48], "lose": 48, "much": 48, "explicitli": 48, "pictori": 48, "volum": 48, "hxwx8": 48, "hxwx5": 48, "simpli": 48, "propag": 48, "That": 48, "teh": 48, "green": 48, "color": 48, "side": 48, "action": 48, "taken": 48, "pink": 48, "orang": 48}, "objects": {"aimet_torch.v2.nn": [[6, 0, 1, "", "FakeQuantizationMixin"], [7, 0, 1, "", "QuantizationMixin"]], "aimet_torch.v2.nn.FakeQuantizationMixin": [[6, 1, 1, "", "__quant_init__"], [6, 1, 1, "", "compute_encodings"], [6, 1, 1, "", "from_module"], [6, 1, 1, "", "get_original_module"], [6, 1, 1, "", "implements"], [6, 2, 1, "", "input_quantizers"], [6, 2, 1, "", "output_quantizers"], [6, 2, 1, "", "param_quantizers"]], "aimet_torch.v2.nn.QuantizationMixin": [[7, 1, 1, "", "compute_encodings"], [7, 1, 1, "", "get_default_kernel"], [7, 1, 1, "", "get_kernel"], [7, 1, 1, "", "implements"], [7, 1, 1, "", "set_default_kernel"], [7, 1, 1, "", "set_kernel"], [7, 1, 1, "", "wrap"]], "aimet_torch.v2.nn.base": [[23, 0, 1, "", "BaseQuantizationMixin"]], "aimet_torch.v2.nn.base.BaseQuantizationMixin": [[23, 1, 1, "", "__quant_init__"], [23, 1, 1, "", "compute_encodings"], [23, 2, 1, "", "input_quantizers"], [23, 2, 1, "", "output_quantizers"], [23, 2, 1, "", "param_quantizers"]], "aimet_torch.v2.quantization": [[13, 3, 0, "-", "affine"], [15, 3, 0, "-", "float"]], "aimet_torch.v2.quantization.affine": [[8, 0, 1, "", "Quantize"], [9, 0, 1, "", "QuantizeDequantize"], [10, 4, 1, "", "dequantize"], [11, 4, 1, "", "quantize"], [12, 4, 1, "", "quantize_dequantize"]], "aimet_torch.v2.quantization.affine.Quantize": [[8, 1, 1, "", "forward"]], "aimet_torch.v2.quantization.affine.QuantizeDequantize": [[9, 1, 1, "", "forward"]], "aimet_torch.v2.quantization.affine.quantizer": [[24, 0, 1, "", "Quantize"], [24, 0, 1, "", "QuantizeDequantize"], [24, 0, 1, "", "QuantizerBase"]], "aimet_torch.v2.quantization.affine.quantizer.Quantize": [[24, 1, 1, "", "forward"]], "aimet_torch.v2.quantization.affine.quantizer.QuantizeDequantize": [[24, 1, 1, "", "forward"]], "aimet_torch.v2.quantization.affine.quantizer.QuantizerBase": [[24, 1, 1, "", "allow_overwrite"], [24, 1, 1, "", "compute_encodings"], [24, 1, 1, "", "get_encoding"], [24, 1, 1, "", "get_legacy_encodings"], [24, 1, 1, "", "is_initialized"], [24, 1, 1, "", "register_quantization_parameter"], [24, 1, 1, "", "set_legacy_encodings"]], "aimet_torch.v2.quantization.encoding_analyzer": [[17, 0, 1, "", "EncodingAnalyzer"], [19, 0, 1, "", "MinMaxEncodingAnalyzer"], [20, 0, 1, "", "PercentileEncodingAnalyzer"], [21, 0, 1, "", "SqnrEncodingAnalyzer"]], "aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer": [[20, 1, 1, "", "set_percentile"]], 
"aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer": [[21, 1, 1, "", "compute_encodings_from_stats"]], "aimet_torch.v2.quantization.float": [[14, 0, 1, "", "FloatQuantizeDequantize"], [14, 0, 1, "", "QuantizeDequantize"]], "aimet_torch.v2.quantization.tensor": [[16, 0, 1, "", "DequantizedTensor"], [16, 0, 1, "", "QuantizedTensor"]], "aimet_torch.v2.quantization.tensor.DequantizedTensor": [[16, 1, 1, "", "dequantize"], [16, 1, 1, "", "quantize"], [16, 1, 1, "", "quantized_repr"]], "aimet_torch.v2.quantization.tensor.QuantizedTensor": [[16, 1, 1, "", "dequantize"], [16, 1, 1, "", "quantize"], [16, 1, 1, "", "quantized_repr"]]}, "objtypes": {"0": "py:class", "1": "py:method", "2": "py:attribute", "3": "py:module", "4": "py:function"}, "objnames": {"0": ["py", "class", "Python class"], "1": ["py", "method", "Python method"], "2": ["py", "attribute", "Python attribute"], "3": ["py", "module", "Python module"], "4": ["py", "function", "Python function"]}, "titleterms": {"aimet": [2, 3, 4, 22, 26, 27, 28, 29, 30, 31, 33, 34, 36, 37, 38, 39, 41, 42, 43, 44, 45, 46, 47, 48], "instal": [2, 3, 4, 32], "quick": 2, "releas": [2, 3, 4, 32, 43], "packag": [2, 3, 4], "system": 2, "requir": [2, 38], "advanc": 2, "instruct": 2, "docker": 3, "set": 3, "variant": [3, 17], "us": [3, 26, 34, 36, 45], "prebuilt": 3, "imag": 3, "build": 3, "local": 3, "start": [3, 22, 32, 45], "contain": 3, "from": [3, 4], "pypi": [3, 4], "environ": [3, 4], "setup": [3, 4], "prerequisit": [4, 25], "gpu": 4, "pytorch": [4, 22, 25, 35, 36, 46], "2": [4, 25, 43], "1": [4, 25, 43], "tensorflow": [4, 36, 46], "13": [4, 43], "onnx": 4, "common": [4, 26], "debian": 4, "torch": 4, "replac": 4, "pillow": 4, "simd": 4, "onnxruntim": 4, "post": [4, 18, 36, 37], "step": 4, "fakequantizationmixin": 6, "nn": 7, "quantizationmixin": 7, "top": [7, 23, 24], "level": [7, 23, 24], "api": [7, 22, 23, 24, 26, 27, 28, 37, 38, 42], "quantiz": [8, 11, 13, 15, 16, 18, 23, 24, 25, 36, 37, 39, 40, 41, 42, 46], "quantizedequant": [9, 14], "dequant": 10, "quantize_dequant": 12, "affin": 13, "class": [13, 16, 23], "function": 13, "floatquantizedequant": 14, "float": [15, 25], "tensor": 16, "encod": [17, 23, 42], "analyz": 17, "train": [18, 25, 36, 37, 39], "minmaxencodinganalyz": 19, "percentileencodinganalyz": 20, "sqnrencodinganalyz": 21, "ai": [22, 32], "model": [22, 25, 32, 34, 35, 36], "effici": [22, 32], "toolkit": [22, 32], "document": 22, "get": [22, 32, 34], "exampl": 22, "featur": [22, 30, 32, 36, 41], "descript": [22, 38], "modul": 23, "configur": [23, 40, 42], "comput": 23, "quickstart": 25, "guid": [25, 32], "overal": [25, 29], "flow": [25, 37], "prepar": 25, "point": 25, "batchnorm": 25, "fold": 25, "fine": [25, 34], "tune": [25, 34], "awar": [25, 39], "export": 25, "quantsim": [25, 42], "adaround": 26, "case": [26, 34, 36], "terminologi": 26, "autoqu": 27, "overview": [27, 28, 31, 32, 34, 37, 38, 39, 40, 42, 45, 46, 48], "workflow": [27, 28, 36, 39, 42], "bn": 28, "re": 28, "estim": 28, "channel": 29, "prune": 29, "procedur": 29, "select": [29, 31, 34], "winnow": [29, 48], "weight": [29, 47], "reconstruct": 29, "compress": [30, 31, 34, 45], "guidebook": [30, 41], "greedi": 31, "ratio": [31, 34], "how": [31, 40, 45, 48], "work": [31, 48], "per": [31, 34], "layer": [31, 34, 37], "explor": 31, "user": [32, 37], "inform": 32, "toc": 32, "tree": 32, "known": 33, "issu": 33, "option": 34, "techniqu": [34, 37], "better": 34, "result": 34, "rank": 34, "round": 34, "faq": [34, 37], "refer": [34, 37], "guidelin": [35, 36], "debug": 36, 
"analysi": [36, 38], "tool": [36, 45], "cross": 37, "equal": 37, "quantanalyz": 38, "detail": 38, "qat": 39, "mode": 39, "recommend": 39, "simul": [40, 42], "file": 40, "structur": 40, "individu": 40, "section": 40, "nois": 42, "determin": 42, "paramet": 42, "scheme": 42, "op": 42, "frequent": 42, "ask": 42, "question": 42, "note": 43, "22": 43, "0": 43, "21": 43, "20": 43, "19": 43, "py37": 43, "18": 43, "17": 43, "16": 43, "14": 43, "spatial": 44, "svd": [44, 47], "visual": [45, 46], "design": 45, "bokeh": 45, "server": 45, "session": 45}, "envversion": {"sphinx.domains.c": 2, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 8, "sphinx.domains.index": 1, "sphinx.domains.javascript": 2, "sphinx.domains.math": 2, "sphinx.domains.python": 3, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "nbsphinx": 4, "sphinx.ext.intersphinx": 1, "sphinx.ext.viewcode": 1, "sphinx": 57}, "alltitles": {"AIMET Installation": [[2, "aimet-installation"]], "Quick Install": [[2, "quick-install"]], "Release Packages": [[2, "release-packages"]], "System Requirements": [[2, "system-requirements"]], "Advanced Installation Instructions": [[2, "advanced-installation-instructions"]], "AIMET Installation in Docker": [[3, "aimet-installation-in-docker"]], "Set variant": [[3, "set-variant"]], "Use prebuilt docker image": [[3, "use-prebuilt-docker-image"]], "Build docker image locally": [[3, "build-docker-image-locally"]], "Start docker container": [[3, "start-docker-container"]], "Install AIMET packages": [[3, "install-aimet-packages"], [4, "install-aimet-packages"]], "From PyPI": [[3, "from-pypi"], [4, "from-pypi"]], "From Release Package": [[3, "from-release-package"], [4, "from-release-package"]], "Environment setup": [[3, "environment-setup"], [4, "environment-setup"]], "AIMET Installation and Setup": [[4, "aimet-installation-and-setup"]], "Install prerequisite packages": [[4, "install-prerequisite-packages"]], "Install GPU packages": [[4, "install-gpu-packages"]], "Install GPU packages for PyTorch 2.1 or TensorFlow": [[4, "install-gpu-packages-for-pytorch-2-1-or-tensorflow"]], "Install GPU packages for PyTorch 1.13 or ONNX": [[4, "install-gpu-packages-for-pytorch-1-13-or-onnx"]], "Install common debian packages": [[4, "install-common-debian-packages"]], "Install tensorflow GPU debian packages": [[4, "install-tensorflow-gpu-debian-packages"]], "Install torch GPU debian packages": [[4, "install-torch-gpu-debian-packages"]], "Install ONNX GPU debian packages": [[4, "install-onnx-gpu-debian-packages"]], "Replace Pillow with Pillow-SIMD": [[4, "replace-pillow-with-pillow-simd"]], "Replace onnxruntime with onnxruntime-gpu": [[4, "replace-onnxruntime-with-onnxruntime-gpu"]], "Post installation steps": [[4, "post-installation-steps"]], "FakeQuantizationMixin": [[6, "fakequantizationmixin"]], "nn.QuantizationMixin": [[7, "nn-quantizationmixin"]], "Top-level API": [[7, "top-level-api"], [23, "top-level-api"], [24, "top-level-api"]], "Quantize": [[8, "quantize"]], "QuantizeDequantize": [[9, "quantizedequantize"], [14, "quantizedequantize"]], "dequantize": [[10, "dequantize"]], "quantize": [[11, "quantize"]], "quantize_dequantize": [[12, "quantize-dequantize"]], "quantization.affine": [[13, "module-aimet_torch.v2.quantization.affine"]], "Classes": [[13, "classes"], [16, "classes"]], "Functions": [[13, "functions"]], "FloatQuantizeDequantize": [[14, "floatquantizedequantize"]], "quantization.float": [[15, "module-aimet_torch.v2.quantization.float"]], "quantization.tensor": [[16, 
"quantization-tensor"]], "Encoding Analyzers": [[17, "encoding-analyzers"]], "Variants": [[17, "variants"]], "Post-Training Quantization": [[18, "post-training-quantization"], [36, "post-training-quantization"]], "MinMaxEncodingAnalyzer": [[19, "minmaxencodinganalyzer"]], "PercentileEncodingAnalyzer": [[20, "percentileencodinganalyzer"]], "SqnrEncodingAnalyzer": [[21, "sqnrencodinganalyzer"]], "AIMET: AI Model Efficiency Toolkit Documentation": [[22, "aimet-ai-model-efficiency-toolkit-documentation"]], "Getting Started": [[22, "getting-started"], [32, "getting-started"]], "Examples": [[22, null]], "Feature Descriptions": [[22, null]], "AIMET PyTorch API": [[22, null]], "Quantized Modules": [[23, "quantized-modules"]], "Configuration": [[23, "configuration"]], "Computing Encodings": [[23, "computing-encodings"]], "Quantized Module Classes": [[23, "quantized-module-classes"]], "Quantizers": [[24, "quantizers"]], "Quickstart Guide": [[25, "quickstart-guide"]], "Overall flow": [[25, "overall-flow"]], "PyTorch prerequisites": [[25, "pytorch-prerequisites"]], "Prepare the floating point model for quantization": [[25, "prepare-the-floating-point-model-for-quantization"]], "1) Model preparation": [[25, "model-preparation"]], "2) BatchNorm fold": [[25, "batchnorm-fold"]], "Quantize the model": [[25, "quantize-the-model"]], "Fine-tune the model with quantization aware training": [[25, "fine-tune-the-model-with-quantization-aware-training"]], "Export the quantsim model": [[25, "export-the-quantsim-model"]], "AIMET AdaRound": [[26, "aimet-adaround"]], "AdaRound Use Cases": [[26, "adaround-use-cases"]], "Common terminology": [[26, "common-terminology"]], "Use Cases": [[26, "use-cases"], [36, "use-cases"]], "AdaRound API": [[26, "adaround-api"]], "AIMET AutoQuant": [[27, "aimet-autoquant"]], "Overview": [[27, "overview"], [28, "overview"], [31, "overview"], [32, "overview"], [34, "overview"], [37, "overview"], [38, "overview"], [39, "overview"], [40, "overview"], [42, "overview"], [45, "overview"], [46, "overview"], [48, "overview"]], "Workflow": [[27, "workflow"], [28, "workflow"]], "AutoQuant API": [[27, "autoquant-api"]], "AIMET BN Re-estimation": [[28, "aimet-bn-re-estimation"]], "BN Re-estimation API": [[28, "bn-re-estimation-api"]], "AIMET Channel Pruning": [[29, "aimet-channel-pruning"]], "Overall Procedure": [[29, "overall-procedure"]], "Channel Selection": [[29, "channel-selection"]], "Winnowing": [[29, "winnowing"]], "Weight Reconstruction": [[29, "weight-reconstruction"]], "AIMET Compression Features Guidebook": [[30, "aimet-compression-features-guidebook"]], "AIMET Greedy Compression Ratio Selection": [[31, "aimet-greedy-compression-ratio-selection"]], "How it works": [[31, "how-it-works"]], "Per-layer Exploration": [[31, "per-layer-exploration"]], "Compression Ratio Selection": [[31, "compression-ratio-selection"]], "AI Model Efficiency Toolkit User Guide": [[32, "ai-model-efficiency-toolkit-user-guide"]], "Features": [[32, "features"]], "Release Information": [[32, "release-information"]], "Installation Guide": [[32, "installation-guide"]], "toc tree": [[32, "toc-tree"]], "AIMET Known Issues": [[33, "aimet-known-issues"]], "AIMET Model Compression": [[34, "aimet-model-compression"]], "Use Case": [[34, "use-case"]], "Compression ratio selection": [[34, "compression-ratio-selection"]], "Model Compression": [[34, "model-compression"]], "Optional techniques to get better compression results": [[34, "optional-techniques-to-get-better-compression-results"]], "Rank Rounding": [[34, 
"rank-rounding"]], "Per-layer Fine-tuning": [[34, "per-layer-fine-tuning"]], "FAQs": [[34, "faqs"], [37, "faqs"]], "References": [[34, "references"], [37, "references"]], "Model Guidelines for PyTorch": [[35, "model-guidelines-for-pytorch"]], "AIMET Model Quantization": [[36, "aimet-model-quantization"]], "AIMET Quantization Features": [[36, "aimet-quantization-features"]], "Debugging/Analysis Tools": [[36, "debugging-analysis-tools"]], "AIMET Quantization Workflow": [[36, "aimet-quantization-workflow"]], "PyTorch": [[36, "pytorch"], [46, "pytorch"]], "Tensorflow": [[36, "tensorflow"]], "Debugging Guidelines": [[36, "debugging-guidelines"]], "AIMET Post-Training Quantization Techniques": [[37, "aimet-post-training-quantization-techniques"]], "User Flow": [[37, "user-flow"]], "Cross-Layer Equalization API": [[37, "cross-layer-equalization-api"]], "AIMET QuantAnalyzer": [[38, "aimet-quantanalyzer"]], "Requirements": [[38, "requirements"]], "Detailed Analysis Descriptions": [[38, "detailed-analysis-descriptions"]], "QuantAnalyzer API": [[38, "quantanalyzer-api"]], "AIMET Quantization Aware Training": [[39, "aimet-quantization-aware-training"]], "QAT workflow": [[39, "qat-workflow"]], "QAT modes": [[39, "qat-modes"]], "Recommendations for Quantization-Aware Training": [[39, "recommendations-for-quantization-aware-training"]], "Quantization Simulation Configuration": [[40, "quantization-simulation-configuration"]], "Configuration File Structure": [[40, "configuration-file-structure"]], "How to configure individual Configuration File Sections": [[40, "how-to-configure-individual-configuration-file-sections"]], "AIMET Quantization Features Guidebook": [[41, "aimet-quantization-features-guidebook"]], "AIMET Quantization Simulation": [[42, "aimet-quantization-simulation"]], "QuantSim Workflow": [[42, "quantsim-workflow"]], "Simulating Quantization Noise": [[42, "simulating-quantization-noise"]], "Determining Quantization Parameters (Encodings)": [[42, "determining-quantization-parameters-encodings"]], "Quantization Schemes": [[42, "quantization-schemes"]], "Configuring Quantization Simulation Ops": [[42, "configuring-quantization-simulation-ops"]], "Quantization Simulation APIs": [[42, "quantization-simulation-apis"]], "Frequently Asked Questions": [[42, "frequently-asked-questions"]], "AIMET Release Notes": [[43, "aimet-release-notes"]], "1.22.2": [[43, "id1"]], "1.22.1": [[43, "id2"]], "1.22.0": [[43, "id3"]], "1.21.0": [[43, "id4"]], "1.20.0": [[43, "id5"]], "1.19.1.py37": [[43, "py37"]], "1.19.1": [[43, "id6"]], "1.18.0.py37": [[43, "id7"]], "1.18.0": [[43, "id8"]], "1.17.0.py37": [[43, "id9"]], "1.17.0": [[43, "id10"]], "1.16.2.py37": [[43, "id11"]], "1.16.2": [[43, "id12"]], "1.16.1.py37": [[43, "id13"]], "1.16.1": [[43, "id14"]], "1.16.0": [[43, "id15"]], "1.14.0": [[43, "id16"]], "1.13.0": [[43, "id17"]], "AIMET Spatial SVD": [[44, "aimet-spatial-svd"]], "AIMET Visualization": [[45, "aimet-visualization"]], "Design": [[45, "design"]], "Compression": [[45, "compression"]], "Starting a Bokeh Server Session:": [[45, "starting-a-bokeh-server-session"]], "How to use the tool": [[45, "how-to-use-the-tool"]], "AIMET Visualization for Quantization": [[46, "aimet-visualization-for-quantization"]], "Quantization": [[46, "quantization"]], "TensorFlow": [[46, "tensorflow"]], "AIMET Weight SVD": [[47, "aimet-weight-svd"]], "AIMET Winnowing": [[48, "aimet-winnowing"]], "Winnowing Overview": [[48, "winnowing-overview"]], "How Winnowing Works": [[48, "how-winnowing-works"]]}, "indexentries": 
{"fakequantizationmixin (class in aimet_torch.v2.nn)": [[6, "aimet_torch.v2.nn.FakeQuantizationMixin"]], "__quant_init__() (aimet_torch.v2.nn.fakequantizationmixin method)": [[6, "aimet_torch.v2.nn.FakeQuantizationMixin.__quant_init__"]], "compute_encodings() (aimet_torch.v2.nn.fakequantizationmixin method)": [[6, "aimet_torch.v2.nn.FakeQuantizationMixin.compute_encodings"]], "from_module() (aimet_torch.v2.nn.fakequantizationmixin class method)": [[6, "aimet_torch.v2.nn.FakeQuantizationMixin.from_module"]], "get_original_module() (aimet_torch.v2.nn.fakequantizationmixin method)": [[6, "aimet_torch.v2.nn.FakeQuantizationMixin.get_original_module"]], "implements() (aimet_torch.v2.nn.fakequantizationmixin class method)": [[6, "aimet_torch.v2.nn.FakeQuantizationMixin.implements"]], "input_quantizers (aimet_torch.v2.nn.fakequantizationmixin attribute)": [[6, "aimet_torch.v2.nn.FakeQuantizationMixin.input_quantizers"]], "output_quantizers (aimet_torch.v2.nn.fakequantizationmixin attribute)": [[6, "aimet_torch.v2.nn.FakeQuantizationMixin.output_quantizers"]], "param_quantizers (aimet_torch.v2.nn.fakequantizationmixin attribute)": [[6, "aimet_torch.v2.nn.FakeQuantizationMixin.param_quantizers"]], "quantizationmixin (class in aimet_torch.v2.nn)": [[7, "aimet_torch.v2.nn.QuantizationMixin"]], "compute_encodings() (aimet_torch.v2.nn.quantizationmixin method)": [[7, "aimet_torch.v2.nn.QuantizationMixin.compute_encodings"]], "get_default_kernel() (aimet_torch.v2.nn.quantizationmixin class method)": [[7, "aimet_torch.v2.nn.QuantizationMixin.get_default_kernel"]], "get_kernel() (aimet_torch.v2.nn.quantizationmixin method)": [[7, "aimet_torch.v2.nn.QuantizationMixin.get_kernel"]], "implements() (aimet_torch.v2.nn.quantizationmixin class method)": [[7, "aimet_torch.v2.nn.QuantizationMixin.implements"]], "set_default_kernel() (aimet_torch.v2.nn.quantizationmixin class method)": [[7, "aimet_torch.v2.nn.QuantizationMixin.set_default_kernel"]], "set_kernel() (aimet_torch.v2.nn.quantizationmixin method)": [[7, "aimet_torch.v2.nn.QuantizationMixin.set_kernel"]], "wrap() (aimet_torch.v2.nn.quantizationmixin class method)": [[7, "aimet_torch.v2.nn.QuantizationMixin.wrap"]], "quantize (class in aimet_torch.v2.quantization.affine)": [[8, "aimet_torch.v2.quantization.affine.Quantize"]], "forward() (aimet_torch.v2.quantization.affine.quantize method)": [[8, "aimet_torch.v2.quantization.affine.Quantize.forward"]], "quantizedequantize (class in aimet_torch.v2.quantization.affine)": [[9, "aimet_torch.v2.quantization.affine.QuantizeDequantize"]], "forward() (aimet_torch.v2.quantization.affine.quantizedequantize method)": [[9, "aimet_torch.v2.quantization.affine.QuantizeDequantize.forward"]], "dequantize() (in module aimet_torch.v2.quantization.affine)": [[10, "aimet_torch.v2.quantization.affine.dequantize"]], "quantize() (in module aimet_torch.v2.quantization.affine)": [[11, "aimet_torch.v2.quantization.affine.quantize"]], "quantize_dequantize() (in module aimet_torch.v2.quantization.affine)": [[12, "aimet_torch.v2.quantization.affine.quantize_dequantize"]], "aimet_torch.v2.quantization.affine": [[13, "module-aimet_torch.v2.quantization.affine"]], "module": [[13, "module-aimet_torch.v2.quantization.affine"], [15, "module-aimet_torch.v2.quantization.float"]], "floatquantizedequantize (class in aimet_torch.v2.quantization.float)": [[14, "aimet_torch.v2.quantization.float.FloatQuantizeDequantize"]], "quantizedequantize (class in aimet_torch.v2.quantization.float)": [[14, 
"aimet_torch.v2.quantization.float.QuantizeDequantize"]], "aimet_torch.v2.quantization.float": [[15, "module-aimet_torch.v2.quantization.float"]], "dequantizedtensor (class in aimet_torch.v2.quantization.tensor)": [[16, "aimet_torch.v2.quantization.tensor.DequantizedTensor"]], "quantizedtensor (class in aimet_torch.v2.quantization.tensor)": [[16, "aimet_torch.v2.quantization.tensor.QuantizedTensor"]], "dequantize() (aimet_torch.v2.quantization.tensor.dequantizedtensor method)": [[16, "aimet_torch.v2.quantization.tensor.DequantizedTensor.dequantize"]], "dequantize() (aimet_torch.v2.quantization.tensor.quantizedtensor method)": [[16, "aimet_torch.v2.quantization.tensor.QuantizedTensor.dequantize"]], "quantize() (aimet_torch.v2.quantization.tensor.dequantizedtensor method)": [[16, "aimet_torch.v2.quantization.tensor.DequantizedTensor.quantize"]], "quantize() (aimet_torch.v2.quantization.tensor.quantizedtensor method)": [[16, "aimet_torch.v2.quantization.tensor.QuantizedTensor.quantize"]], "quantized_repr() (aimet_torch.v2.quantization.tensor.dequantizedtensor method)": [[16, "aimet_torch.v2.quantization.tensor.DequantizedTensor.quantized_repr"]], "quantized_repr() (aimet_torch.v2.quantization.tensor.quantizedtensor method)": [[16, "aimet_torch.v2.quantization.tensor.QuantizedTensor.quantized_repr"]], "encodinganalyzer (class in aimet_torch.v2.quantization.encoding_analyzer)": [[17, "aimet_torch.v2.quantization.encoding_analyzer.EncodingAnalyzer"]], "minmaxencodinganalyzer (class in aimet_torch.v2.quantization.encoding_analyzer)": [[19, "aimet_torch.v2.quantization.encoding_analyzer.MinMaxEncodingAnalyzer"]], "percentileencodinganalyzer (class in aimet_torch.v2.quantization.encoding_analyzer)": [[20, "aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer"]], "set_percentile() (aimet_torch.v2.quantization.encoding_analyzer.percentileencodinganalyzer method)": [[20, "aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer.set_percentile"]], "sqnrencodinganalyzer (class in aimet_torch.v2.quantization.encoding_analyzer)": [[21, "aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer"]], "compute_encodings_from_stats() (aimet_torch.v2.quantization.encoding_analyzer.sqnrencodinganalyzer method)": [[21, "aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer.compute_encodings_from_stats"]], "basequantizationmixin (class in aimet_torch.v2.nn.base)": [[23, "aimet_torch.v2.nn.base.BaseQuantizationMixin"]], "__quant_init__() (aimet_torch.v2.nn.base.basequantizationmixin method)": [[23, "aimet_torch.v2.nn.base.BaseQuantizationMixin.__quant_init__"]], "compute_encodings() (aimet_torch.v2.nn.base.basequantizationmixin method)": [[23, "aimet_torch.v2.nn.base.BaseQuantizationMixin.compute_encodings"]], "input_quantizers (aimet_torch.v2.nn.base.basequantizationmixin attribute)": [[23, "aimet_torch.v2.nn.base.BaseQuantizationMixin.input_quantizers"]], "output_quantizers (aimet_torch.v2.nn.base.basequantizationmixin attribute)": [[23, "aimet_torch.v2.nn.base.BaseQuantizationMixin.output_quantizers"]], "param_quantizers (aimet_torch.v2.nn.base.basequantizationmixin attribute)": [[23, "aimet_torch.v2.nn.base.BaseQuantizationMixin.param_quantizers"]], "quantize (class in aimet_torch.v2.quantization.affine.quantizer)": [[24, "aimet_torch.v2.quantization.affine.quantizer.Quantize"]], "quantizedequantize (class in aimet_torch.v2.quantization.affine.quantizer)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizeDequantize"]], "quantizerbase (class in 
aimet_torch.v2.quantization.affine.quantizer)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizerBase"]], "allow_overwrite() (aimet_torch.v2.quantization.affine.quantizer.quantizerbase method)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizerBase.allow_overwrite"]], "compute_encodings() (aimet_torch.v2.quantization.affine.quantizer.quantizerbase method)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizerBase.compute_encodings"]], "forward() (aimet_torch.v2.quantization.affine.quantizer.quantize method)": [[24, "aimet_torch.v2.quantization.affine.quantizer.Quantize.forward"]], "forward() (aimet_torch.v2.quantization.affine.quantizer.quantizedequantize method)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizeDequantize.forward"]], "get_encoding() (aimet_torch.v2.quantization.affine.quantizer.quantizerbase method)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizerBase.get_encoding"]], "get_legacy_encodings() (aimet_torch.v2.quantization.affine.quantizer.quantizerbase method)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizerBase.get_legacy_encodings"]], "is_initialized() (aimet_torch.v2.quantization.affine.quantizer.quantizerbase method)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizerBase.is_initialized"]], "register_quantization_parameter() (aimet_torch.v2.quantization.affine.quantizer.quantizerbase method)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizerBase.register_quantization_parameter"]], "set_legacy_encodings() (aimet_torch.v2.quantization.affine.quantizer.quantizerbase method)": [[24, "aimet_torch.v2.quantization.affine.quantizer.QuantizerBase.set_legacy_encodings"]]}}) \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/toplevelhidden.html b/releases/1.32.2/torch_v2/toplevelhidden.html new file mode 100644 index 00000000..48b65ef3 --- /dev/null +++ b/releases/1.32.2/torch_v2/toplevelhidden.html @@ -0,0 +1,165 @@ + + + + + + <no title> — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc.

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/nn.fake_quantization_mixin.html b/releases/1.32.2/torch_v2/torch_docs/api/nn.fake_quantization_mixin.html new file mode 100644 index 00000000..28307bfb --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/nn.fake_quantization_mixin.html @@ -0,0 +1,351 @@ + + + + + + FakeQuantizationMixin — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

FakeQuantizationMixin

+
+
+class aimet_torch.v2.nn.FakeQuantizationMixin(*args, **kwargs)[source]
+

Mixin that implements fake-quantization on top of regular pytorch modules.

+

Specifically, a fake-quantized module will quantize input, output, and parameter tensors with its held QuantizerBase objects during the forward() method and use the inherited torch.nn.Module forward method to compute the layer operation. If all input, output, and parameter quantizers are None, a fake-quantized module will behave exactly the same as its parent torch.nn.Module.

+

A fake-quantized module can be initialized from scratch using the same syntax as the parent module, or can be +formed from an existing module using the from_module() method.

+
+
+input_quantizers
+

ModuleList containing QuantizerBase objects to be applied +to the layer’s input tensors

+
+
Type
+

nn.ModuleList

+
+
+
+ +
+
+output_quantizers
+

ModuleList containing QuantizerBase objects to be applied +to the layer’s output tensors

+
+
Type
+

nn.ModuleList

+
+
+
+ +
+
+param_quantizers
+

ModuleDict mapping parameter names to associated QuantizerBase +objects

+
+
Type
+

nn.ModuleDict

+
+
+
+ +

Examples

+
>>> qlinear = FakeQuantizedLinear(in_features=10, out_features=20, bias=False)
+>>> print(qlinear)
+FakeQuantizedLinear(
+  in_features=10, out_features=20, bias=False
+  (param_quantizers): ModuleDict(
+    (weight): None
+  )
+  (input_quantizers): ModuleList(
+    (0): None
+  )
+  (output_quantizers): ModuleList(
+    (0): None
+  )
+)
+
+
+
>>> linear = torch.nn.Linear(in_features=10, out_features=20, bias=True)
+>>> qlinear = FakeQuantizationMixin.from_module(linear)
+>>> print(qlinear)
+FakeQuantizedLinear(
+  in_features=10, out_features=20, bias=True
+  (param_quantizers): ModuleDict(
+    (weight): None
+    (bias): None
+  )
+  (input_quantizers): ModuleList(
+    (0): None
+  )
+  (output_quantizers): ModuleList(
+    (0): None
+  )
+)
+>>> qlinear.weight is linear.weight
+True
+
+
+
+
+__quant_init__()
+

Initializer for quantized module. This method will be invoked right after __init__.

+

This method initializes the input_quantizers, output_quantizers, and param_quantizers structures to the appropriate sizes based on the number of input tensors, output tensors, and parameters of the base nn.Module class. All quantizers are initialized to None.

+

For custom quantized classes, this method should be overridden to set the appropriate lengths of +input_quantizers and output_quantizers for the given base class.

+
+ +
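As a rough illustration only (not taken from the reference text above), the sketch below resizes the quantizer containers for a hypothetical two-input module. CustomAdd and FakeQuantizedCustomAdd are made-up names; registering the subclass so that from_module() dispatches to it is done with implements(), shown further below.

import torch
from aimet_torch.v2.nn import FakeQuantizationMixin

class CustomAdd(torch.nn.Module):              # hypothetical base class with two inputs
    def forward(self, x, y):
        return x + y

class FakeQuantizedCustomAdd(FakeQuantizationMixin, CustomAdd):
    def __quant_init__(self):
        super().__quant_init__()
        # CustomAdd consumes two input tensors and produces one output tensor,
        # so the quantizer containers are resized accordingly (all entries start as None)
        self.input_quantizers = torch.nn.ModuleList([None, None])
        self.output_quantizers = torch.nn.ModuleList([None])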
+
+compute_encodings()
+

Enters the compute_encodings() context for all QuantizerBase objects in the layer.

+

Inside this context, each quantizer will observe all inputs passed to the quantizer and will compute +quantization encodings upon exiting the context.

+

Example

+
>>> qlinear = QuantizedLinear(10, 10)
+>>> qlinear.output_quantizers[0] = Quantize((1, ), 8, symmetric=False)
+>>> with qlinear.compute_encodings():
+>>>     qlinear(torch.randn(16, 10))
+>>> print(qlinear.output_quantizers[0].is_initialized())
+True
+
+
+
+ +
+
+classmethod from_module(module)
+

Create a quantized module instance from a regular module instance.

+

The resulting quantized module contains the same attributes and parameters as the original module, but may +be assigned input, output and parameter quantizers.

+
+
Parameters
+

module (Module) – Floating point module to quantize

+
+
Returns
+

Quantized version of the original module

+
+
+

Example

+
>>> linear = torch.nn.Linear(10, 10)
+>>> quantized_linear = FakeQuantizationMixin.from_module(linear)
+>>> print(quantized_linear.weight is linear.weight)
+True
+>>> print(quantized_linear.param_quantizers)
+ModuleDict(
+    (weight): None
+    (bias): None
+)
+
+
+
+ +
+
+get_original_module()
+

Returns the floating point version of the quantized module

+
+
Return type
+

Module

+
+
Returns
+

A floating point module with quantizers removed

+
+
+

Example

+
>>> qlinear = QuantizedLinear(10, 20, bias=False)
+>>> linear = qlinear.get_original_module()
+>>> linear
+Linear(in_features=10, out_features=20, bias=False)
+>>> linear.weight is qlinear.weight
+True
+
+
+
+ +
+
+classmethod implements(module_cls)[source]
+

Decorator for registering a fake-quantized implementation of the given base class.

+

This decorator registers the defined class as the fake-quantized version of module_cls such that calling +from_module() on an instance of module_cls will output an instance of the decorated class.

+
+
Parameters
+

module_cls – The base torch.nn.Module class

+
+
+
+ +
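For illustration, a hedged sketch of this registration pattern for a hypothetical single-input module is shown below. The module name Scale and the quantize-compute-quantize structure of forward() are illustrative assumptions, not guarantees stated on this page; only the implements() and from_module() calls themselves come from the API above.

import torch
from aimet_torch.v2.nn import FakeQuantizationMixin

class Scale(torch.nn.Module):                  # hypothetical floating-point module
    def __init__(self, factor):
        super().__init__()
        self.factor = factor

    def forward(self, x):
        return x * self.factor

@FakeQuantizationMixin.implements(Scale)
class FakeQuantizedScale(FakeQuantizationMixin, Scale):
    def __quant_init__(self):
        super().__quant_init__()
        # one input tensor and one output tensor for this base class
        self.input_quantizers = torch.nn.ModuleList([None])
        self.output_quantizers = torch.nn.ModuleList([None])

    def forward(self, x):
        # Fake-quantize the input, run the inherited float forward, then fake-quantize the output
        if self.input_quantizers[0]:
            x = self.input_quantizers[0](x)
        out = super().forward(x)
        if self.output_quantizers[0]:
            out = self.output_quantizers[0](out)
        return out

# from_module() now dispatches instances of Scale to the decorated class:
qscale = FakeQuantizationMixin.from_module(Scale(0.5))
assert isinstance(qscale, FakeQuantizedScale)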
+ +
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/nn.quantization_mixin.html b/releases/1.32.2/torch_v2/torch_docs/api/nn.quantization_mixin.html new file mode 100644 index 00000000..545fd719 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/nn.quantization_mixin.html @@ -0,0 +1,277 @@ + + + + + + nn.QuantizationMixin — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Warning

+

This feature is under heavy development and API changes may occur without notice in future versions.

+
+
+

nn.QuantizationMixin

+

Mixin for adding full quantization functionality to nn.Module subclasses. This functionality includes both the ability to set input, output, and parameter quantizers and the ability to register a quantized version of the layer's forward operation.

+
+

Top-level API

+
+
+class aimet_torch.v2.nn.QuantizationMixin(*args, **kwargs)[source]
+

Mixin that allows dispatch to quantized operator libraries in place of native pytorch operations

+
+
+compute_encodings()[source]
+

Enters the compute_encodings() context for all QuantizerBase objects in the layer.

+

Inside this context, each quantizer will observe all inputs passed to the quantizer and will compute +quantization encodings upon exiting the context.

+

Example

+
>>> qlinear = QuantizedLinear(10, 10)
+>>> qlinear.output_quantizers[0] = Quantize((1, ), 8, symmetric=False)
+>>> with qlinear.compute_encodings():
+>>>     qlinear(torch.randn(16, 10))
+>>> print(qlinear.output_quantizers[0].is_initialized())
+True
+
+
+
+ +
+
+classmethod get_default_kernel()[source]
+

Return the default kernel of the class

+
+
Return type
+

Optional[Callable]

+
+
Returns
+

Default kernel of the class. None if the default kernel is not set.

+
+
+
+ +
+
+get_kernel()[source]
+

Return the kernel to be used by this instance of the quantized module. If the current instance does not have a kernel set, it will try to use the default kernel of the class.

+
+
Return type
+

Optional[Callable]

+
+
Returns
+

Kernel to be used by this instance.

+
+
+
+ +
+
+classmethod implements(module_cls)[source]
+

Decorator for registering quantized implementation of the given base class.

+
+ +
+
+classmethod set_default_kernel(kernel)[source]
+

Set default kernel for the class.

+
+
Parameters
+

kernel (Callable) – Callable object to be used as the default kernel +by all the instances of this class.

+
+
+
+ +
+
+set_kernel(kernel)[source]
+

Set the kernel for this instance of the quantized module.

+
+
Parameters
+

kernel (Callable) – Callable object to be used as the underlying kernel.

+
+
+
+ +
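As a rough sketch of the kernel plumbing only: the expected signature and semantics of a kernel callable are not specified on this page (and the feature is marked as under heavy development), so the stand-in callable below simply raises. QuantizedLinear is the quantized Linear class used in the compute_encodings() example above; its import path from aimet_torch.v2.nn is assumed here.

from aimet_torch.v2.nn import QuantizedLinear   # import path assumed

# Hypothetical placeholder; a real kernel would call into a backend operator library.
def my_linear_kernel(*args, **kwargs):
    raise NotImplementedError("backend-specific quantized linear kernel goes here")

# Class-wide default kernel shared by all QuantizedLinear instances
QuantizedLinear.set_default_kernel(my_linear_kernel)
assert QuantizedLinear.get_default_kernel() is my_linear_kernel

# Per-instance override; get_kernel() falls back to the class default when unset
qlinear = QuantizedLinear(10, 20)
qlinear.set_kernel(my_linear_kernel)
assert qlinear.get_kernel() is my_linear_kernel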
+
+classmethod wrap(module_cls)[source]
+

Wrap a regular module class into a quantized module class

+
+
Return type
+

Type[Module]

+
+
+
+ +
+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.Quantize.html b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.Quantize.html new file mode 100644 index 00000000..0124a9d6 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.Quantize.html @@ -0,0 +1,275 @@ + + + + + + Quantize — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantize

+
+
+class aimet_torch.v2.quantization.affine.Quantize(shape, bitwidth, symmetric, encoding_analyzer=None, block_size=None)[source]
+

Applies quantization to the input.

+

Precisely,

+
+\[out = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right)\]
+

where \(scale\) and \(offset\) are derived from learnable parameters +\(\theta_{min}\) and \(\theta_{max}\).

+

If block size \(B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}\) is specified, +this equation will be further generalized as

+
+\[ \begin{align}\begin{aligned}\begin{split}out_{j_0 \cdots j_{D-1}} & = clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\\end{split}\\\text{where} \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor\end{aligned}\end{align} \]
+
+
Parameters
+
    +
  • shape (tuple) – Shape of the quantization parameters

  • +
  • bitwidth (int) – Quantization bitwidth

  • +
  • symmetric (bool) – If True, performs symmetric quantization; +otherwise, performs asymmetric quantization

  • +
  • encoding_analyzer (EncodingAnalyzer, optional) – Encoding analyzer for calibrating quantization encodings +(default: absolute min-max encoding analyzer)

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
Variables
+
    +
  • min (Tensor) – \(\theta_{min}\) from which scale and offset will be derived.

  • +
  • max (Tensor) – \(\theta_{max}\) from which scale and offset will be derived.

  • +
+
+
+
+

Note

+

Quantize cannot run forward() until min and max are properly initialized, +which can be done based on input statistics using compute_encodings() or +by manually assigning a new value to min and max. +See the examples below.

+
+

Examples

+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.randn(5, 10)
+>>> q = Q.affine.Quantize(shape=(5, 1), bitwidth=8, symmetric=False, block_size=(1, 5))
+>>> q.is_initialized()
+False
+>>> with q.compute_encodings():
+...     _ = q(input)
+...
+>>> q.is_initialized()
+True
+>>> q(input)
+QuantizedTensor([[129.,  64., 255., 122.,   0., 192., 106.,  94., 255.,   0.],
+                 [  0., 145., 181., 255., 144., 255., 194.,   0.,  74.,  86.],
+                 [122.,   0., 255., 150.,  33., 103., 103.,   0.,  37., 255.],
+                 [255., 111., 237., 218.,   0.,  49., 155., 255.,   0., 179.],
+                 [  0.,  66., 255.,  89., 110.,  17.,  36.,  83., 255.,   0.]],
+                grad_fn=<AliasBackward0>)
+
+
+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.randn(5, 10)
+>>> q = Q.affine.Quantize(shape=(5, 1), bitwidth=8, symmetric=False, block_size=(1, 5))
+>>> q.is_initialized()
+False
+>>> q.min = torch.nn.Parameter(-torch.ones_like(q.min))
+>>> q.max = torch.nn.Parameter(torch.ones_like(q.max))
+>>> q.is_initialized()
+True
+>>> q(input)
+QuantizedTensor([[187., 186., 131.,   0., 203.,  64.,  80.,   0., 143., 152.],
+                 [ 16.,   0., 255.,   0.,   0., 150.,   0., 255.,  32., 255.],
+                 [255., 226.,   0., 255.,  55., 172.,   0., 255., 145., 255.],
+                 [207., 146., 216., 238.,   0.,   0., 141., 178., 255., 188.],
+                 [ 63.,  59.,  19., 162.,  30., 255., 109., 255.,   0., 255.]],
+                grad_fn=<AliasBackward0>)
+
+
+
+
+forward(input)[source]
+

Quantizes the input tensor

+
+
Return type
+

QuantizedTensor

+
+
Parameters
+

input (torch.Tensor) – Input to quantize

+
+
Returns
+

Quantized output

+
+
+
+ +
+ +
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.QuantizeDequantize.html b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.QuantizeDequantize.html new file mode 100644 index 00000000..3cd29b88 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.QuantizeDequantize.html @@ -0,0 +1,289 @@ + + + + + + QuantizeDequantize — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

QuantizeDequantize

+
+
+class aimet_torch.v2.quantization.affine.QuantizeDequantize(shape, bitwidth, symmetric, encoding_analyzer=None, block_size=None)[source]
+

Applies fake-quantization by quantizing and dequantizing the input.

+

Precisely,

+
+\[out = (\overline{input} + offset) * scale\]
+

where

+
+\[\overline{input} = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right)\]
+

and \(scale\) and \(offset\) are derived from learnable parameters +\(\theta_{min}\) and \(\theta_{max}\).

+

If block size \(B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}\) is specified, +this equation will be further generalized as

+
+\[ \begin{align}\begin{aligned}\begin{split}out_{j_0 \cdots j_{D-1}} &= (\overline{input}_{j_0 \cdots j_{D-1}} + offset_{i_0 \cdots i_{D-1}}) * scale_{i_0 \cdots i_{D-1}}\\ +\overline{input}_{j_0 \cdots j_{D-1}} &= clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\\end{split}\\\text{where} \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor\end{aligned}\end{align} \]
+
+
Parameters
+
    +
  • shape (tuple) – Shape of the quantization parameters

  • +
  • bitwidth (int) – Quantization bitwidth

  • +
  • symmetric (bool) – If True, performs symmetric quantization; +otherwise, performs asymmetric quantization

  • +
  • encoding_analyzer (EncodingAnalyzer, optional) – Encoding analyzer for calibrating quantization encodings +(default: absolute min-max encoding analyzer)

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
Variables
+
    +
  • min (Tensor) – \(\theta_{min}\) from which scale and offset will be derived.

  • +
  • max (Tensor) – \(\theta_{max}\) from which scale and offset will be derived.

  • +
+
+
+
+

Note

+

QuantizeDequantize cannot run forward() until min and max are properly initialized, +which can be done based on input statistics using compute_encodings() or +by manually assigning a new value to min and max. +See the examples below.

+
+

Examples

+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.randn(5, 10)
+>>> qdq = Q.affine.QuantizeDequantize(shape=(5, 2), bitwidth=8, symmetric=False, block_size=(1, 5))
+>>> qdq.is_initialized()
+False
+>>> with qdq.compute_encodings():
+...     _ = qdq(input)
+...
+>>> qdq.is_initialized()
+True
+>>> qdq(input)
+DequantizedTensor([[-0.2771,  0.3038,  1.0819,  0.9700,  0.9487, -0.1307,
+                    -1.7894, -0.1709, -0.2212,  0.7741],
+                   [-1.0295, -1.2265, -1.0295,  1.0564,  0.6177, -1.0386,
+                    -0.0176, -2.6054,  1.8836, -0.1232],
+                   [-0.8229,  0.5540,  0.3992, -0.2363,  1.2546, -1.0036,
+                     0.2355,  0.1741,  1.6079,  0.6247],
+                   [-1.0115,  1.2458,  0.9157, -1.4694, -0.0639, -0.2568,
+                     0.0680,  1.6695,  0.7932, -0.1889],
+                   [ 0.0158,  0.5695,  0.5220,  0.1977, -1.4475, -0.0424,
+                    -1.1128, -0.8796, -0.1060,  1.5897]],
+                  grad_fn=<AliasBackward0>)
+
+
+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.randn(5, 10)
+>>> qdq = Q.affine.QuantizeDequantize(shape=(5, 2), bitwidth=8, symmetric=False, block_size=(1, 5))
+>>> qdq.is_initialized()
+False
+>>> qdq.min = torch.nn.Parameter(-torch.ones_like(qdq.min))
+>>> qdq.max = torch.nn.Parameter(torch.ones_like(qdq.max))
+>>> qdq.is_initialized()
+True
+>>> qdq(input)
+DequantizedTensor([[-0.6196, -0.9961,  0.0549, -0.6431,  1.0039, -0.8706,
+                     1.0039,  0.4706, -0.2353,  0.8078],
+                   [ 0.3451, -0.1176, -0.9961, -0.4549, -0.0549, -0.0471,
+                    -0.5255, -0.2353,  1.0039, -0.9961],
+                   [-0.4157,  0.0784,  0.5333,  0.1647, -0.9961, -0.9961,
+                    -0.2118, -0.2196,  0.9176,  0.9490],
+                   [ 1.0039, -0.7765,  0.4784, -0.8706,  1.0039,  0.6039,
+                    -0.4157, -0.2118, -0.9961,  0.3137],
+                   [ 1.0039,  0.3216, -0.2353, -0.7765, -0.9961,  0.8000,
+                     1.0039,  0.4157,  0.4392,  0.4863]],
+                  grad_fn=<AliasBackward0>)
+
+
+
+
+forward(input)[source]
+

Quantizes and dequantizes the input tensor

+
+
Return type
+

DequantizedTensor

+
+
Parameters
+

input (torch.Tensor) – Input to quantize and dequantize

+
+
Returns
+

Quantize-dequantized output

+
+
+
+ +
+ +
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.dequantize.html b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.dequantize.html new file mode 100644 index 00000000..82b96415 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.dequantize.html @@ -0,0 +1,180 @@ + + + + + + dequantize — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

dequantize

+
+
+aimet_torch.v2.quantization.affine.dequantize(tensor, scale, offset, block_size=None)[source]
+
+ +
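No description is rendered on this page for this function. As a minimal, hedged sketch, and assuming dequantize applies the inverse of the affine mapping documented for quantize_dequantize below (roughly (input + offset) * scale), it can be used to map integer grid values back to real values:

import torch
import aimet_torch.v2.quantization as Q

q = torch.tensor([0., 7., 15.])        # integer grid values stored as a float tensor
scale = torch.tensor(1 / 15)
offset = torch.tensor(0.0)

# Maps the integer grid back to real values, roughly (q + offset) * scale
# under the assumed convention
x = Q.affine.dequantize(q, scale, offset)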
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_.html b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_.html new file mode 100644 index 00000000..a7390fb6 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_.html @@ -0,0 +1,297 @@ + + + + + + quantize — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

quantize

+
+
+aimet_torch.v2.quantization.affine.quantize(tensor, scale, offset, *args, **kwargs)[source]
+

Applies quantization to the input.

+

Precisely,

+
+\[out = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right)\]
+

If block size \(B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}\) is specified, +this equation will be further generalized as

+
+\[ \begin{align}\begin{aligned}\begin{split}out_{j_0 \cdots j_{D-1}} & = clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\\end{split}\\\text{where} \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor\end{aligned}\end{align} \]
+

This function is overloaded with the signatures listed below:

+
+
+aimet_torch.v2.quantization.affine.quantize(tensor, scale, offset, bitwidth, signed=False, block_size=None)[source]
+

Equivalent to:

+
+\[\begin{split}qmin= +\begin{cases} + -\left\lceil\frac{2^{bitwidth}-1}{2}\right\rceil,& \text{if } signed\\ + 0, & \text{otherwise (default)} +\end{cases} +qmax= +\begin{cases} + \left\lfloor\frac{2^{bitwidth}-1}{2}\right\rfloor,& \text{if } signed\\ + 2^{bitwidth}-1, & \text{otherwise (default)} +\end{cases}\end{split}\]
+
+
Parameters
+
    +
  • tensor (Tensor) – Tensor to quantize

  • +
  • scale (Tensor) – Scale for quantization

  • +
  • offset (Tensor) – Offset for quantization

  • +
  • bitwidth (int) – Bitwidth of quantized tensor based on which \(qmin\) and \(qmax\) will be derived

  • +
  • signed (bool) – If false, the output will be mapped to non-negative integers only. Otherwise, it will range over both positive and negative integers.

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
+
+ +
+
+aimet_torch.v2.quantization.affine.quantize(tensor, scale, offset, *, num_steps, signed=False, block_size=None)[source]
+

Equivalent to:

+
+\[\begin{split}qmin= +\begin{cases} + -\left\lceil\frac{num\_steps}{2}\right\rceil,& \text{if } signed\\ + 0, & \text{otherwise (default)} +\end{cases} +qmax= +\begin{cases} + \left\lfloor\frac{num\_steps}{2}\right\rfloor,& \text{if } signed\\ + num\_steps, & \text{otherwise (default)} +\end{cases}\end{split}\]
+
+
Parameters
+
    +
  • tensor (Tensor) – Tensor to quantize

  • +
  • scale (Tensor) – Scale for quantization

  • +
  • offset (Tensor) – Offset for quantization

  • +
  • num_steps (int) – The number of steps in the quantization range based on which \(qmin\) and \(qmax\) will be derived

  • +
  • signed (bool) – If false, the output will be mapped to non-negative integers only. Otherwise, it will range over both positive and negative integers.

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
+
+ +
+
+aimet_torch.v2.quantization.affine.quantize(tensor, scale, offset, *, qmin, qmax, block_size=None)[source]
+
+
Parameters
+
    +
  • tensor (Tensor) – Tensor to quantize

  • +
  • scale (Tensor) – Scale for quantization

  • +
  • offset (Tensor) – Offset for quantization

  • +
  • qmin (int) – Minimum value of the quantization range

  • +
  • qmax (int) – Maximum value of the quantization range

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
+
+ +

Examples

+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.arange(start=-0.3, end=1.3, step=0.05)
+>>> print(input)
+tensor([-3.0000e-01, -2.5000e-01, -2.0000e-01, -1.5000e-01, -1.0000e-01,
+        -5.0000e-02, -1.1921e-08,  5.0000e-02,  1.0000e-01,  1.5000e-01,
+        2.0000e-01,  2.5000e-01,  3.0000e-01,  3.5000e-01,  4.0000e-01,
+        4.5000e-01,  5.0000e-01,  5.5000e-01,  6.0000e-01,  6.5000e-01,
+        7.0000e-01,  7.5000e-01,  8.0000e-01,  8.5000e-01,  9.0000e-01,
+        9.5000e-01,  1.0000e+00,  1.0500e+00,  1.1000e+00,  1.1500e+00,
+        1.2000e+00,  1.2500e+00])
+>>> scale = torch.tensor(1/15)
+>>> offset = torch.tensor(0.0)
+>>> Q.affine.quantize(input, scale, offset, bitwidth=4)
+tensor([ 0.,  0.,  0.,  0.,  0.,  0., -0.,  1.,  2.,  2.,  3.,  4.,  4.,  5.,
+         6.,  7.,  7.,  8.,  9., 10., 10., 11., 12., 13., 13., 14., 15., 15.,
+         15., 15., 15., 15.])
+>>> Q.affine.quantize(input, scale, offset, num_steps=15)
+tensor([ 0.,  0.,  0.,  0.,  0.,  0., -0.,  1.,  2.,  2.,  3.,  4.,  4.,  5.,
+         6.,  7.,  7.,  8.,  9., 10., 10., 11., 12., 13., 13., 14., 15., 15.,
+         15., 15., 15., 15.])
+>>> Q.affine.quantize(input, scale, offset, qmin=0, qmax=15)
+tensor([ 0.,  0.,  0.,  0.,  0.,  0., -0.,  1.,  2.,  2.,  3.,  4.,  4.,  5.,
+         6.,  7.,  7.,  8.,  9., 10., 10., 11., 12., 13., 13., 14., 15., 15.,
+         15., 15., 15., 15.])
+
+
+
+ +
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_dequantize.html b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_dequantize.html new file mode 100644 index 00000000..2e28e715 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/generated/aimet_torch.v2.quantization.affine.quantize_dequantize.html @@ -0,0 +1,304 @@ + + + + + + quantize_dequantize — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

quantize_dequantize

+
+
+aimet_torch.v2.quantization.affine.quantize_dequantize(tensor, scale, offset, *args, **kwargs)[source]
+

Applies fake-quantization by quantizing and dequantizing the input.

+

Precisely,

+
+\[out = (\overline{input} + offset) * scale\]
+

where

+
+\[\overline{input} = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right)\]
+

If block size \(B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}\) is specified, +this equation will be further generalized as

+
+\[ \begin{align}\begin{aligned}\begin{split}out_{j_0 \cdots j_{D-1}} &= (\overline{input}_{j_0 \cdots j_{D-1}} + offset_{i_0 \cdots i_{D-1}}) * scale_{i_0 \cdots i_{D-1}}\\ +\overline{input}_{j_0 \cdots j_{D-1}} &= clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\\end{split}\\\text{where } \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor\end{aligned}\end{align} \]
+

This function is overloaded with the signatures listed below:

+
+
+aimet_torch.v2.quantization.affine.quantize_dequantize(tensor, scale, offset, bitwidth, signed=False, block_size=None)[source]
+

Equivalent to:

+
+\[\begin{split}qmin= +\begin{cases} + -\left\lceil\frac{2^{bitwidth}-1}{2}\right\rceil,& \text{if } signed\\ + 0, & \text{otherwise (default)} +\end{cases} +qmax= +\begin{cases} + \left\lfloor\frac{2^{bitwidth}-1}{2}\right\rfloor,& \text{if } signed\\ + 2^{bitwidth}-1, & \text{otherwise (default)} +\end{cases}\end{split}\]
+
+
Parameters
+
    +
  • tensor (Tensor) – Tensor to quantize

  • +
  • scale (Tensor) – Scale for quantization

  • +
  • offset (Tensor) – Offset for quantization

  • +
  • bitwidth (int) – Bitwidth of quantized tensor based on which \(qmin\) and \(qmax\) will be derived

  • +
  • signed (bool) – If false, \(\overline{input}\) will be mapped to non-negative integers only. Otherwise, \(\overline{input}\) will range over both positive and negative integers.

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
+
+ +
+
+aimet_torch.v2.quantization.affine.quantize_dequantize(tensor, scale, offset, *, num_steps, signed=False, block_size=None)[source]
+

Equivalent to:

+
+\[\begin{split}qmin= +\begin{cases} + -\left\lceil\frac{num\_steps}{2}\right\rceil,& \text{if } signed\\ + 0, & \text{otherwise (default)} +\end{cases} +qmax= +\begin{cases} + \left\lfloor\frac{num\_steps}{2}\right\rfloor,& \text{if } signed\\ + num\_steps, & \text{otherwise (default)} +\end{cases}\end{split}\]
+
+
Parameters
+
    +
  • tensor (Tensor) – Tensor to quantize

  • +
  • scale (Tensor) – Scale for quantization

  • +
  • offset (Tensor) – Offset for quantization

  • +
  • num_steps (int) – The number of steps in the quantization range based on which \(qmin\) and \(qmax\) will be derived

  • +
  • signed (bool) – If false, \(\overline{input}\) will be mapped to positive integers only. +Otherwise, \(\overline{input}\) will range over both positive and negative integers.

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
+
+ +
+
+aimet_torch.v2.quantization.affine.quantize_dequantize(tensor, scale, offset, *, qmin, qmax, block_size=None)[source]
+
+
Parameters
+
    +
  • tensor (Tensor) – Tensor to quantize

  • +
  • scale (Tensor) – Scale for quantization

  • +
  • offset (Tensor) – Offset for quantization

  • +
  • qmin (int) – Minimum value of the quantization range

  • +
  • qmax (int) – Maximum value of the quantization range

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
+
+ +

Examples

+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.arange(start=-0.3, end=1.3, step=0.05)
+>>> print(input)
+tensor([-3.0000e-01, -2.5000e-01, -2.0000e-01, -1.5000e-01, -1.0000e-01,
+        -5.0000e-02, -1.1921e-08,  5.0000e-02,  1.0000e-01,  1.5000e-01,
+        2.0000e-01,  2.5000e-01,  3.0000e-01,  3.5000e-01,  4.0000e-01,
+        4.5000e-01,  5.0000e-01,  5.5000e-01,  6.0000e-01,  6.5000e-01,
+        7.0000e-01,  7.5000e-01,  8.0000e-01,  8.5000e-01,  9.0000e-01,
+        9.5000e-01,  1.0000e+00,  1.0500e+00,  1.1000e+00,  1.1500e+00,
+        1.2000e+00,  1.2500e+00])
+>>> scale = torch.tensor(1/15)
+>>> offset = torch.tensor(0.0)
+>>> Q.affine.quantize_dequantize(input, scale, offset, bitwidth=4)
+tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0667, 0.1333,
+        0.1333, 0.2000, 0.2667, 0.2667, 0.3333, 0.4000, 0.4667, 0.4667, 0.5333,
+        0.6000, 0.6667, 0.6667, 0.7333, 0.8000, 0.8667, 0.8667, 0.9333, 1.0000,
+        1.0000, 1.0000, 1.0000, 1.0000, 1.0000])
+>>> Q.affine.quantize_dequantize(input, scale, offset, num_steps=15)
+tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0667, 0.1333,
+        0.1333, 0.2000, 0.2667, 0.2667, 0.3333, 0.4000, 0.4667, 0.4667, 0.5333,
+        0.6000, 0.6667, 0.6667, 0.7333, 0.8000, 0.8667, 0.8667, 0.9333, 1.0000,
+        1.0000, 1.0000, 1.0000, 1.0000, 1.0000])
+>>> Q.affine.quantize_dequantize(input, scale, offset, qmin=0, qmax=15)
+tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0667, 0.1333,
+        0.1333, 0.2000, 0.2667, 0.2667, 0.3333, 0.4000, 0.4667, 0.4667, 0.5333,
+        0.6000, 0.6667, 0.6667, 0.7333, 0.8000, 0.8667, 0.8667, 0.9333, 1.0000,
+        1.0000, 1.0000, 1.0000, 1.0000, 1.0000])
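The examples above all use the unsigned 4-bit path. As a brief, hedged sketch (outputs omitted), the same function can also be called with signed=True to derive a symmetric signed range from the bitwidth, or with block_size so that each block of input elements shares one scale/offset entry, following the blockwise indexing described above. The names blockwise_scale and blockwise_offset are illustrative only; input, scale, and offset refer to the tensors defined earlier in this example.
>>> # Signed variant: qmin/qmax are derived as the symmetric range for bitwidth=4
>>> Q.affine.quantize_dequantize(input, scale, offset, bitwidth=4, signed=True)
>>> # Blockwise variant: 32 input elements split into 8 blocks of 4 elements,
>>> # so scale and offset carry one entry per block
>>> blockwise_scale = torch.full((8,), 1/15)
>>> blockwise_offset = torch.zeros(8)
>>> Q.affine.quantize_dequantize(input, blockwise_scale, blockwise_offset, bitwidth=4, block_size=(4,))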
+
+
+
+ +
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/index.html b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/index.html new file mode 100644 index 00000000..50849e13 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/quantization/affine/index.html @@ -0,0 +1,211 @@ + + + + + + quantization.affine — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

quantization.affine

+
+

Classes

+ ++++ + + + + + + + + +

Quantize

Applies quantization to the input.

QuantizeDequantize

Applies fake-quantization by quantizing and dequantizing the input.

+
+
+

Functions

+ ++++ + + + + + + + + + + + +

quantize

Applies quantization to the input.

quantize_dequantize

Applies fake-quantization by quantizing and dequantizing the input.

dequantize

+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/quantization/float/FloatQuantizeDequantize.html b/releases/1.32.2/torch_v2/torch_docs/api/quantization/float/FloatQuantizeDequantize.html new file mode 100644 index 00000000..c07cb0b9 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/quantization/float/FloatQuantizeDequantize.html @@ -0,0 +1,236 @@ + + + + + + FloatQuantizeDequantize — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

FloatQuantizeDequantize

+
+
+class aimet_torch.v2.quantization.float.FloatQuantizeDequantize(exponent_bits=None, mantissa_bits=None, dtype=None, encoding_analyzer=None)[source]
+

Simulates quantization by fake-casting the input

+

If dtype is provided, this is equivalent to

+
+\[\begin{split}out = x.to(dtype).to(x.dtype) \\\end{split}\]
+

If the exponent and mantissa bits are provided, this is equivalent to

+
+\[out = \left\lceil\frac{x_c}{scale}\right\rfloor * scale\]
+

where

+
+\[\begin{split}x_c &= clamp(x, -max, max) \\ +bias &= 2^{exponent} - \log_2(max) + \log_2(2 - 2^{-mantissa}) - 1 \\ +scale &= 2 ^ {\left\lfloor \log_2 |x_c| + bias \right\rfloor - mantissa - bias} \\\end{split}\]
+

The IEEE standard computes the maximum representable value by

+
+\[\begin{split}max = (2 - 2^{-mantissa}) * 2^{(\left\lfloor 0.5 * exponent\_max \right\rfloor)} \\\end{split}\]
+

where

+
+\[\begin{split}exponent\_max = 2^{exponent} - 1 \\\end{split}\]
+
+
Parameters
+
    +
  • exponent_bits (int) – Number of exponent bits to simulate

  • +
  • mantissa_bits (int) – Number of mantissa bits to simulate

  • +
  • dtype (torch.dtype) – torch.dtype to simulate. This argument is mutually exclusive with exponent_bits and mantissa_bits.

  • +
  • encoding_analyzer (EncodingAnalyzer) – If specified, the maximum value to represent will be determined dynamically based on the input statistics for finer precision.

  • +
+
+
+

Examples

+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.tensor([[ 1.8998, -0.0947],[-1.0891, -0.1727]])
+>>> qdq = Q.float.FloatQuantizeDequantize(mantissa_bits=7, exponent_bits=8)
+>>> # Unlike AffineQuantizer, FloatQuantizer is initialized without calling compute_encodings()
+>>> qdq.is_initialized()
+True
+>>> qdq.is_bfloat16()
+True
+>>> qdq.bitwidth
+16
+>>> qdq(input)
+tensor([[ 1.8984, -0.0947], [-1.0859, -0.1729]])
+
+
+
>>> from aimet_torch.v2.quantization.encoding_analyzer import MinMaxEncodingAnalyzer
+>>> encoding_analyzer = MinMaxEncodingAnalyzer(shape=(1,))
+>>> qdq = Q.float.FloatQuantizeDequantize(dtype=torch.float16, encoding_analyzer=encoding_analyzer)
+>>> qdq.is_float16()
+True
+>>> qdq.bitwidth
+16
+>>> qdq(input)
+tensor([[ 1.8994, -0.0947], [-1.0889, -0.1727]])
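Lower-precision floating-point formats can be simulated the same way by choosing smaller exponent and mantissa widths. The snippet below is a minimal sketch (printed values omitted) of an FP8-style E4M3 configuration; it reuses the input tensor from the examples above and relies only on the constructor arguments documented here.
>>> # Hedged sketch: simulate an FP8-style format with 4 exponent and 3 mantissa bits
>>> qdq_fp8 = Q.float.FloatQuantizeDequantize(exponent_bits=4, mantissa_bits=3)
>>> out = qdq_fp8(input)   # fake-casts the input to the simulated low-precision float format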
+
+
+
+ +
+
+

QuantizeDequantize

+
+
+class aimet_torch.v2.quantization.float.QuantizeDequantize(exponent_bits=None, mantissa_bits=None, dtype=None, encoding_analyzer=None)[source]
+

Alias of FloatQuantizeDequantize

+
+ +
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/quantization/float/index.html b/releases/1.32.2/torch_v2/torch_docs/api/quantization/float/index.html new file mode 100644 index 00000000..5bac99bf --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/quantization/float/index.html @@ -0,0 +1,185 @@ + + + + + + quantization.float — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

quantization.float

+ ++++ + + + + + + + + +

FloatQuantizeDequantize

Simulates quantization by fake-casting the input

QuantizeDequantize

Alias of FloatQuantizeDequantize

+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/api/quantization/tensor.html b/releases/1.32.2/torch_v2/torch_docs/api/quantization/tensor.html new file mode 100644 index 00000000..ed2d61df --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/api/quantization/tensor.html @@ -0,0 +1,316 @@ + + + + + + quantization.tensor — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

quantization.tensor

+
+

Classes

+
+
+class aimet_torch.v2.quantization.tensor.QuantizedTensor(*args, **kwargs)[source]
+

Represents a quantized tensor object. The object holds quantized values stored in a floating-point tensor along with +an EncodingBase object which holds the information necessary to map the quantized values back to the +real/represented values.

+
+
+dequantize()[source]
+

Dequantizes self using self.encoding to produce a DequantizedTensor with the same encoding +information.

+

Example

+
>>> import aimet_torch.v2.quantization as Q
+>>> x = torch.tensor([[2.57, -2.312],
+...                   [0.153, 0.205]])
+>>> quantizer = Q.affine.Quantize(shape=(1, ), bitwidth=8, symmetric=True)
+>>> quantizer.set_range(-128 * 0.1, 127 * 0.1)
+>>> x_q = quantizer(x)
+>>> x_q
+QuantizedTensor([[ 26., -23.],
+                 [  2.,   2.]], grad_fn=<AliasBackward0>)
+>>> x_dq = x_q.dequantize()
+>>> x_dq
+DequantizedTensor([[ 2.6000, -2.3000],
+                   [ 0.2000,  0.2000]], grad_fn=<AliasBackward0>)
+>>> torch.equal(x_dq.encoding.scale, x_q.encoding.scale)
+True
+
+
+
+
Return type
+

DequantizedTensor

+
+
+
+ +
+
+quantize()[source]
+

Returns self

+
+
Return type
+

QuantizedTensor

+
+
+
+ +
+
+quantized_repr()[source]
+

Return the quantized representation of self as a torch.Tensor with data type self.encoding.dtype (return type: Tensor).

+
+

Note

+

The result of this function may not be able to carry a gradient depending on the quantized data type. +Thus, it may be necessary to call this only within an autograd function to allow for backpropagation.

+
+

Example

+
>>> from aimet_torch.v2 import quantization as Q
+>>> quantizer = Q.affine.Quantize(shape=(2, 1), bitwidth=8, symmetric=True)
+>>> x = torch.randn((2, 4), requires_grad=True)
+>>> with quantizer.compute_encodings():
+...     x_q = quantizer(x)
+>>> x_q
+QuantizedTensor([[  11.,  -57., -128.,   38.],
+                 [  28.,   -0., -128.,  -40.]], grad_fn=<AliasBackward0>)
+>>> x_q.quantized_repr()
+tensor([[  11,  -57, -128,   38],
+        [  28,    0, -128,  -40]], dtype=torch.int8)
+
+
+
+ +
+ +
+
+class aimet_torch.v2.quantization.tensor.DequantizedTensor(*args, **kwargs)[source]
+

Represents a tensor which has been quantized and subsequently dequantized. This object contains real floating point +data as well as an EncodingBase object which holds information about the quantization parameters with which +the data was quantized. With this, a DequantizedTensor can be converted back to its quantized representation +without further loss in information.

+
+
+dequantize()[source]
+

Returns self

+
+
Return type
+

DequantizedTensor

+
+
+
+ +
+
+quantize()[source]
+

Quantizes self using self.encoding to produce a QuantizedTensor with the same encoding +information.

+

Example

+
>>> import aimet_torch.v2.quantization as Q
+>>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
+>>> quant_dequant = Q.affine.QuantizeDequantize((1, ), 8, symmetric=False)
+>>> quant_dequant.set_range(-10, 41)
+>>> x_qdq = quant_dequant(x)
+>>> x_qdq
+DequantizedTensor([[ 0.4000, 41.0000],
+                   [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
+>>> x_qdq.quantize()
+QuantizedTensor([[ 52., 255.],
+                 [ 68.,  97.]], grad_fn=<AliasBackward0>)
+
+
+
+
Return type
+

QuantizedTensor

+
+
+
+ +
+
+quantized_repr()[source]
+

Return the quantized representation of self as a torch.Tensor with data type self.encoding.dtype (return type: Tensor).

+
+

Note

+

The result of this function may not be able to carry a gradient depending on the quantized data type. +Thus, it may be necessary to call this only within an autograd function to allow for backpropagation.

+
+

Example

+
>>> import aimet_torch.v2.quantization as Q
+>>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
+>>> quant_dequant = Q.affine.QuantizeDequantize((1, ), 8, symmetric=False)
+>>> quant_dequant.set_range(-10, 41)
+>>> x_qdq = quant_dequant(x)
+>>> x_qdq
+DequantizedTensor([[ 0.4000, 41.0000],
+                   [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
+>>> x_qdq.quantized_repr()
+tensor([[ 52, 255],
+        [ 68,  97]], dtype=torch.uint8)
+
+
+
+ +
+ +
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/encoding_analyzer.html b/releases/1.32.2/torch_v2/torch_docs/encoding_analyzer.html new file mode 100644 index 00000000..d1c8353b --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/encoding_analyzer.html @@ -0,0 +1,199 @@ + + + + + + Encoding Analyzers — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Encoding Analyzers

+
+
+class aimet_torch.v2.quantization.encoding_analyzer.EncodingAnalyzer(observer)[source]
+
+ +
+

Variants

+ ++++ + + + + + + + + + + + +

MinMaxEncodingAnalyzer(shape)

Encoding Analyzer for Min-Max calibration technique

SqnrEncodingAnalyzer(shape[, num_bins, ...])

Encoding Analyzer for SQNR Calibration technique

PercentileEncodingAnalyzer(shape[, ...])

Encoding Analyzer for Percentile calibration technique

+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/examples/ptq.html b/releases/1.32.2/torch_v2/torch_docs/examples/ptq.html new file mode 100644 index 00000000..11210ad1 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/examples/ptq.html @@ -0,0 +1,174 @@ + + + + + + Post-Training Quantization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Post-Training Quantization

+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.MinMaxEncodingAnalyzer.html b/releases/1.32.2/torch_v2/torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.MinMaxEncodingAnalyzer.html new file mode 100644 index 00000000..3c246498 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.MinMaxEncodingAnalyzer.html @@ -0,0 +1,181 @@ + + + + + + MinMaxEncodingAnalyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

MinMaxEncodingAnalyzer

+
+
+class aimet_torch.v2.quantization.encoding_analyzer.MinMaxEncodingAnalyzer(shape)[source]
+

Encoding Analyzer for Min-Max calibration technique

+
+ +
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer.html b/releases/1.32.2/torch_v2/torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer.html new file mode 100644 index 00000000..e4d722e2 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer.html @@ -0,0 +1,193 @@ + + + + + + PercentileEncodingAnalyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

PercentileEncodingAnalyzer

+
+
+class aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer(shape, num_bins=2048, percentile=100)[source]
+

Encoding Analyzer for Percentile calibration technique

+
+
+set_percentile(percentile)[source]
+

Set the clipping percentile of the encoding analyzer. When computing encodings, the encoding analyzer will clip the largest and smallest (100% - percentile) of the observed values from the encoding range.

+
+
Parameters
+

percentile – Value from 50.0 to 100.0 indicating the clipping percentile
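For context, the following is an illustrative sketch (not part of the API reference) of a PercentileEncodingAnalyzer being configured and passed to an affine QuantizeDequantize quantizer through its encoding_analyzer argument; the 99.9 percentile below is an arbitrary example value.
>>> import torch
>>> import aimet_torch.v2.quantization as Q
>>> from aimet_torch.v2.quantization.encoding_analyzer import PercentileEncodingAnalyzer
>>> analyzer = PercentileEncodingAnalyzer(shape=(1,))
>>> analyzer.set_percentile(99.9)   # ignore the top/bottom 0.1% of observed values
>>> qdq = Q.affine.QuantizeDequantize(shape=(1,), bitwidth=8, symmetric=False, encoding_analyzer=analyzer)
>>> with qdq.compute_encodings():
...     _ = qdq(torch.randn(100, 100))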

+
+
+
+ +
+ +
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer.html b/releases/1.32.2/torch_v2/torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer.html new file mode 100644 index 00000000..7a3f647a --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/generated/aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer.html @@ -0,0 +1,216 @@ + + + + + + SqnrEncodingAnalyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

SqnrEncodingAnalyzer

+
+
+class aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer(shape, num_bins=2048, *, asymmetric_delta_candidates=17, symmetric_delta_candidates=101, offset_candidates=21, max_parallelism=64, gamma=3.0)[source]
+

Encoding Analyzer for SQNR Calibration technique

+
+
Parameters
+
    +
  • shape (tuple) – Shape of calculated encoding

  • +
  • num_bins (int) – number of bins to use per histogram

  • +
  • asymmetric_delta_candidates – number of delta values to search over in asymmetric mode

  • +
  • symmetric_delta_candidates – number of delta values to search over in symmetric mode

  • +
  • offset_candidates – number of offset values to search over in asymmetric mode

  • +
  • max_parallelism – maximum number of encodings to process in parallel (higher number results in higher memory usage but faster computation)

  • +
  • gamma – weighting factor on clipping noise (higher value results in less clipping noise)

  • +
+
+
+
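As an illustrative sketch (not part of the API reference), an SqnrEncodingAnalyzer can be attached to an affine quantizer in the same way as the other encoding analyzers; the shape, num_bins, and gamma values below are arbitrary example choices.
>>> import torch
>>> import aimet_torch.v2.quantization as Q
>>> from aimet_torch.v2.quantization.encoding_analyzer import SqnrEncodingAnalyzer
>>> analyzer = SqnrEncodingAnalyzer(shape=(1,), num_bins=1024, gamma=3.0)
>>> q = Q.affine.Quantize(shape=(1,), bitwidth=8, symmetric=True, encoding_analyzer=analyzer)
>>> with q.compute_encodings():
...     _ = q(torch.randn(100, 100))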
+
+compute_encodings_from_stats(stats, num_steps, is_symmetric)[source]
+

Searches for encodings which produce the lowest expected SQNR based on the histograms in stats

+
+
Parameters
+
    +
  • stats (List[_Histogram]) – A list of _Histogram objects with length equal to the number of encodings to compute

  • +
  • num_steps (int) – The number of bins the quantized range is split into

  • +
  • is_symmetric (bool) – If True, computes symmetric encodings, else computes asymmetric encodings

  • +
+
+
Return type
+

Tuple[Optional[Tensor], Optional[Tensor]]

+
+
Returns
+

Tuple of computed encodings (min, max) as tensors with shape self.shape

+
+
+
+ +
+ +
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/index.html b/releases/1.32.2/torch_v2/torch_docs/index.html new file mode 100644 index 00000000..1cfe39a5 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/index.html @@ -0,0 +1,250 @@ + + + + + + AIMET: AI Model Efficiency Toolkit Documentation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET: AI Model Efficiency Toolkit Documentation

+

AI Model Efficiency Toolkit (AIMET) provides tools enabling users to quantize and compress PyTorch models. Quantization +is an essential step when deploying models to edge devices with fixed-point AI accelerators.

+

AIMET provides both post-training and fine-tuning techniques to minimize accuracy loss incurred when quantizing +floating-point models.

+../_images/AIMET_index_no_fine_tune.png +

The above picture shows a high-level view of the workflow when using AIMET. The user passes a trained floating-point +model to AIMET’s APIs for quantization. AIMET returns a new PyTorch model simulating low-precision inference, which users +can fine-tune to recover lost accuracy. Users can then export the quantized model via ONNX/torchscript to an on-target +runtime like Qualcomm® Neural Processing SDK.

+
+

Getting Started

+
+
+

Pip Installation:

+
apt-get install liblapacke
+python3 -m pip install aimet-torch
+
+
+

For more installation options, please visit the AIMET installation instructions.

+

Basic Usage:

+
import aimet_torch.v2 as aimet
+
+# Create quantization simulation model for your model
+sim = aimet.quantsim.QuantizationSimModel(model, sample_input)
+
+# Calibrate quantization encodings on sample data
+with aimet.nn.compute_encodings(sim.model):
+    for data, _ in data_loader:
+        sim.model(data)
+
+# Simulate quantized inference
+sample_output = sim.model(sample_input)
+
+# Export model and quantization encodings
+sim.export("./out_dir", "quantized_model", sample_input)
+
+
+

Please view the Quickstart Guide for a more in-depth guide to using AIMET quantsim.

+
+

Examples

+ +
+
+

Feature Descriptions

+ +
+ +
+
AI Model Efficiency Toolkit is a product of Qualcomm Innovation Center, Inc.
+
Qualcomm® Neural Processing SDK is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
+
+
+
+ + +
+
+
+ +
+ +
+

+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/quantized_modules.html b/releases/1.32.2/torch_v2/torch_docs/quantized_modules.html new file mode 100644 index 00000000..9168577d --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/quantized_modules.html @@ -0,0 +1,1200 @@ + + + + + + Quantized Modules — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Warning

+

This feature is under heavy development and API changes may occur without notice in future versions.

+
+
+

Quantized Modules

+

To simulate the effects of running networks at a reduced bitwidth, AIMET provides quantized versions of +standard torch.nn.Modules. These quantized modules serve as drop-in replacements for their PyTorch counterparts, but can +hold input, output, and parameter quantizers to perform quantization operations during the +module’s forward pass and compute quantization encodings.

+

A quantized module inherits both from an AIMET-defined quantization mixin type as well as a native pytorch nn.Module type. The +exact behavior and capabilities of the quantized module are determined by which type of quantization mixin it inherits from.

+

AIMET defines two types of quantization mixin:

+
+
    +
  • FakeQuantizationMixin: Simulates quantization by performing quantize-dequantize +operations on tensors and calling into native pytorch floating-point operations

  • +
  • QuantizationMixin: Allows the user to register a custom kernel to perform +a quantized forward pass and dequantizes the output. If no kernel is registered, the module will perform fake-quantization.

  • +
+
+

The functionality and state of a QuantizationMixin are a superset of those of a FakeQuantizationMixin, meaning that if one does not register a custom kernel, a QuantizationMixin-derived module behaves exactly the same as a FakeQuantizationMixin-derived module. AIMET provides extensive coverage of FakeQuantizationMixin for torch.nn.Module layer types, and more limited coverage for QuantizationMixin layers. See the table below for a full list of module coverage.

+
+

Top-level API

+
+
+class aimet_torch.v2.nn.base.BaseQuantizationMixin(*args, **kwargs)[source]
+

Mixin that implements quantization on top of regular pytorch modules.

+
+
+input_quantizers
+

ModuleList containing QuantizerBase objects to be applied +to the layer’s input tensors

+
+
Type
+

nn.ModuleList

+
+
+
+ +
+
+output_quantizers
+

ModuleList containing QuantizerBase objects to be applied +to the layer’s output tensors

+
+
Type
+

nn.ModuleList

+
+
+
+ +
+
+param_quantizers
+

ModuleDict mapping parameter names to associated QuantizerBase +objects

+
+
Type
+

nn.ModuleDict

+
+
+
+ +
+
+__quant_init__()[source]
+

Initializer for quantized module. This method will be invoked right after __init__.

+

This method initializes the input_quantizers, output_quantizers, and param_quantizers structures to the appropriate sizes based on the number of input tensors, output tensors, and parameters of the base nn.Module class. All quantizers are initialized to None.

+

For custom quantized classes, this method should be overridden to set the appropriate lengths of +input_quantizers and output_quantizers for the given base class.
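As a hedged sketch of that guidance (not an official example), a custom quantized class might size its quantizer containers as shown below. The CustomOp module, the FakeQuantizedCustomOp name, and the import path for FakeQuantizationMixin are assumptions made for illustration; any registration step needed for QuantizationSimModel to pick up such a class is not shown.
import torch
from aimet_torch.v2.nn import FakeQuantizationMixin  # assumed import path

class CustomOp(torch.nn.Module):
    # Hypothetical module taking two inputs and producing one output
    def forward(self, x, y):
        return x + 2 * y

class FakeQuantizedCustomOp(FakeQuantizationMixin, CustomOp):  # sketch only
    def __quant_init__(self):
        super().__quant_init__()
        # CustomOp consumes two inputs and produces one output, so size the
        # quantizer containers accordingly; every entry starts out as None
        self.input_quantizers = torch.nn.ModuleList([None, None])
        self.output_quantizers = torch.nn.ModuleList([None])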

+
+ +
+
+compute_encodings()[source]
+

Enters the compute_encodings() context for all QuantizerBase objects in the layer.

+

Inside this context, each quantizer will observe all inputs passed to the quantizer and will compute +quantization encodings upon exiting the context.

+

Example

+
>>> qlinear = QuantizedLinear(10, 10)
+>>> qlinear.output_quantizers[0] = Quantize((1, ), 8, symmetric=False)
+>>> with qlinear.compute_encodings():
+>>>     qlinear(torch.randn(16, 10))
+>>> print(qlinear.output_quantizers[0].is_initialized())
+True
+
+
+
+ +
+ +
+
+

Configuration

+

The quantization behavior of a quantized module is controlled by the quantizers contained within the input, output, +and parameter quantizer attributes listed below.

+ +++++ + + + + + + + + + + + + + + + + + + + + +

Attribute

Type

Description

input_quantizers

torch.nn.ModuleList

List of quantizers for input tensors

param_quantizers

torch.nn.ModuleDict

Dict mapping parameter names to quantizers

output_quantizers

torch.nn.ModuleList

List of quantizers for output tensors

+

By assigning and configuring quantizers to these structures, we define the type of quantization applied to the corresponding +input index, output index, or parameter name. By default, all the quantizers are set to None, meaning that no quantization +will be applied to the respective tensor.

+
+
Example: Create a linear layer which performs only per-channel weight quantization
>>> import aimet_torch.v2 as aimet
+>>> import aimet_torch.quantization as Q
+>>> qlinear = aimet.nn.QuantizedLinear(out_features=10, in_features=5)
+>>> # Per-channel weight quantization is performed over the `out_features` dimension, so encodings are shape (10, 1)
+>>> per_channel_quantizer = Q.affine.QuantizeDequantize(shape=(10, 1), bitwidth=8, symmetric=True)
+>>> qlinear.param_quantizers["weight"] = per_channel_quantizer
+
+
+
+
Example: Create an elementwise multiply layer which quantizes only the output and the second input
>>> qmul = aimet.nn.QuantizedMultiply()
+>>> qmul.output_quantizers[0] = Q.affine.QuantizeDequantize(shape=(1, ), bitwidth=8, symmetric=False)
+>>> qmul.input_quantizers[1] = Q.affine.QuantizeDequantize(shape=(1, ), bitwidth=8, symmetric=False)
+
+
+
+
+

In some cases, it may make sense for multiple tensors to share the same quantizer. In this case, we can assign the same +quantizer to multiple indices.

+
+
Example: Create an elementwise add layer which shares the same quantizer between its inputs
>>> qadd = aimet.nn.QuantizedAdd()
+>>> quantizer = Q.affine.QuantizeDequantize(shape=(1, ), bitwidth=8, symmetric=False)
+>>> qadd.input_quantizers[0] = quantizer
+>>> qadd.input_quantizers[1] = quantizer
+
+
+
+
+
+
+

Computing Encodings

+

Before a module can compute a quantized forward pass, all quantizers must first be calibrated inside a compute_encodings +context. When a quantized module enters the compute_encodings context, it first disables all input and output quantization +while the quantizers observe the statistics of the activation tensors passing through them. Upon exiting the context, +the quantizers calculate appropriate quantization encodings based on these statistics (exactly how the encodings are +computed is determined by each quantizer’s encoding analyzer).

+
+
Example:
>>> qlinear = aimet.nn.QuantizedLinear(out_features=10, in_features=5)
+>>> qlinear.output_quantizers[0] = Q.affine.QuantizeDequantize((1, ), bitwidth=8, symmetric=False)
+>>> qlinear.param_quantizers[0] = Q.affine.QuantizeDequantize((10, 1), bitwidth=8, symmetric=True)
+>>> with qlinear.compute_encodings():
+...     # Pass several samples through the layer to ensure representative statistics
+...     for x, _ in calibration_data_loader:
+...         qlinear(x)
+>>> print(qlinear.output_quantizers[0].is_initialized())
+True
+>>> print(qlinear.param_quantizers["weight"].is_initialized())
+True
+
+
+
+
+
+
+

Quantized Module Classes

+ +++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

nn.Module

FakeQuantizationMixin

QuantizationMixin

torch.nn.AdaptiveAvgPool1d

FakeQuantizedAdaptiveAvgPool1d

torch.nn.AdaptiveAvgPool2d

FakeQuantizedAdaptiveAvgPool2d

torch.nn.AdaptiveAvgPool3d

FakeQuantizedAdaptiveAvgPool3d

torch.nn.AdaptiveMaxPool1d

FakeQuantizedAdaptiveMaxPool1d

torch.nn.AdaptiveMaxPool2d

FakeQuantizedAdaptiveMaxPool2d

torch.nn.AdaptiveMaxPool3d

FakeQuantizedAdaptiveMaxPool3d

torch.nn.AlphaDropout

FakeQuantizedAlphaDropout

torch.nn.AvgPool1d

FakeQuantizedAvgPool1d

torch.nn.AvgPool2d

FakeQuantizedAvgPool2d

torch.nn.AvgPool3d

FakeQuantizedAvgPool3d

torch.nn.BatchNorm1d

FakeQuantizedBatchNorm1d

torch.nn.BatchNorm2d

FakeQuantizedBatchNorm2d

torch.nn.BatchNorm3d

FakeQuantizedBatchNorm3d

torch.nn.CELU

FakeQuantizedCELU

torch.nn.ChannelShuffle

FakeQuantizedChannelShuffle

torch.nn.ConstantPad1d

FakeQuantizedConstantPad1d

torch.nn.ConstantPad2d

FakeQuantizedConstantPad2d

torch.nn.ConstantPad3d

FakeQuantizedConstantPad3d

torch.nn.Conv1d

FakeQuantizedConv1d

QuantizedConv1d

torch.nn.Conv2d

FakeQuantizedConv2d

QuantizedConv2d

torch.nn.Conv3d

FakeQuantizedConv3d

QuantizedConv3d

torch.nn.ConvTranspose1d

FakeQuantizedConvTranspose1d

torch.nn.ConvTranspose2d

FakeQuantizedConvTranspose2d

torch.nn.ConvTranspose3d

FakeQuantizedConvTranspose3d

torch.nn.CrossMapLRN2d

FakeQuantizedCrossMapLRN2d

torch.nn.Dropout

FakeQuantizedDropout

torch.nn.Dropout2d

FakeQuantizedDropout2d

torch.nn.Dropout3d

FakeQuantizedDropout3d

torch.nn.ELU

FakeQuantizedELU

torch.nn.FeatureAlphaDropout

FakeQuantizedFeatureAlphaDropout

torch.nn.Flatten

FakeQuantizedFlatten

torch.nn.Fold

FakeQuantizedFold

torch.nn.FractionalMaxPool2d

FakeQuantizedFractionalMaxPool2d

torch.nn.FractionalMaxPool3d

FakeQuantizedFractionalMaxPool3d

torch.nn.GELU

FakeQuantizedGELU

QuantizedGELU

torch.nn.GLU

FakeQuantizedGLU

torch.nn.GroupNorm

FakeQuantizedGroupNorm

torch.nn.Hardshrink

FakeQuantizedHardshrink

torch.nn.Hardsigmoid

FakeQuantizedHardsigmoid

torch.nn.Hardswish

FakeQuantizedHardswish

torch.nn.Hardtanh

FakeQuantizedHardtanh

torch.nn.Identity

FakeQuantizedIdentity

torch.nn.InstanceNorm1d

FakeQuantizedInstanceNorm1d

torch.nn.InstanceNorm2d

FakeQuantizedInstanceNorm2d

torch.nn.InstanceNorm3d

FakeQuantizedInstanceNorm3d

torch.nn.LPPool1d

FakeQuantizedLPPool1d

torch.nn.LPPool2d

FakeQuantizedLPPool2d

torch.nn.LayerNorm

FakeQuantizedLayerNorm

QuantizedLayerNorm

torch.nn.LeakyReLU

FakeQuantizedLeakyReLU

torch.nn.Linear

FakeQuantizedLinear

QuantizedLinear

torch.nn.LocalResponseNorm

FakeQuantizedLocalResponseNorm

torch.nn.LogSigmoid

FakeQuantizedLogSigmoid

torch.nn.LogSoftmax

FakeQuantizedLogSoftmax

torch.nn.MaxPool1d

FakeQuantizedMaxPool1d

torch.nn.MaxPool2d

FakeQuantizedMaxPool2d

torch.nn.MaxPool3d

FakeQuantizedMaxPool3d

torch.nn.MaxUnpool1d

FakeQuantizedMaxUnpool1d

torch.nn.MaxUnpool2d

FakeQuantizedMaxUnpool2d

torch.nn.MaxUnpool3d

FakeQuantizedMaxUnpool3d

torch.nn.Mish

FakeQuantizedMish

torch.nn.PReLU

FakeQuantizedPReLU

torch.nn.PixelShuffle

FakeQuantizedPixelShuffle

torch.nn.PixelUnshuffle

FakeQuantizedPixelUnshuffle

torch.nn.RReLU

FakeQuantizedRReLU

torch.nn.ReLU

FakeQuantizedReLU

torch.nn.ReLU6

FakeQuantizedReLU6

torch.nn.ReflectionPad1d

FakeQuantizedReflectionPad1d

torch.nn.ReflectionPad2d

FakeQuantizedReflectionPad2d

torch.nn.ReplicationPad1d

FakeQuantizedReplicationPad1d

torch.nn.ReplicationPad2d

FakeQuantizedReplicationPad2d

torch.nn.ReplicationPad3d

FakeQuantizedReplicationPad3d

torch.nn.SELU

FakeQuantizedSELU

torch.nn.SiLU

FakeQuantizedSiLU

torch.nn.Sigmoid

FakeQuantizedSigmoid

QuantizedSigmoid

torch.nn.Softmax

FakeQuantizedSoftmax

QuantizedSoftmax

torch.nn.Softmax2d

FakeQuantizedSoftmax2d

torch.nn.Softmin

FakeQuantizedSoftmin

torch.nn.Softplus

FakeQuantizedSoftplus

torch.nn.Softshrink

FakeQuantizedSoftshrink

torch.nn.Softsign

FakeQuantizedSoftsign

torch.nn.SyncBatchNorm

FakeQuantizedSyncBatchNorm

torch.nn.Tanh

FakeQuantizedTanh

torch.nn.Tanhshrink

FakeQuantizedTanhshrink

torch.nn.Threshold

FakeQuantizedThreshold

torch.nn.Unflatten

FakeQuantizedUnflatten

torch.nn.Unfold

FakeQuantizedUnfold

torch.nn.Upsample

FakeQuantizedUpsample

torch.nn.UpsamplingBilinear2d

FakeQuantizedUpsamplingBilinear2d

torch.nn.UpsamplingNearest2d

FakeQuantizedUpsamplingNearest2d

torch.nn.ZeroPad2d

FakeQuantizedZeroPad2d

torch.nn.BCELoss

FakeQuantizedBCELoss

torch.nn.BCEWithLogitsLoss

FakeQuantizedBCEWithLogitsLoss

torch.nn.Bilinear

FakeQuantizedBilinear

torch.nn.CTCLoss

FakeQuantizedCTCLoss

torch.nn.CosineSimilarity

FakeQuantizedCosineSimilarity

torch.nn.CrossEntropyLoss

FakeQuantizedCrossEntropyLoss

torch.nn.HingeEmbeddingLoss

FakeQuantizedHingeEmbeddingLoss

torch.nn.HuberLoss

FakeQuantizedHuberLoss

torch.nn.KLDivLoss

FakeQuantizedKLDivLoss

torch.nn.L1Loss

FakeQuantizedL1Loss

torch.nn.MSELoss

FakeQuantizedMSELoss

torch.nn.MultiLabelMarginLoss

FakeQuantizedMultiLabelMarginLoss

torch.nn.MultiLabelSoftMarginLoss

FakeQuantizedMultiLabelSoftMarginLoss

torch.nn.MultiMarginLoss

FakeQuantizedMultiMarginLoss

torch.nn.NLLLoss

FakeQuantizedNLLLoss

torch.nn.NLLLoss2d

FakeQuantizedNLLLoss2d

torch.nn.PairwiseDistance

FakeQuantizedPairwiseDistance

torch.nn.PoissonNLLLoss

FakeQuantizedPoissonNLLLoss

torch.nn.SmoothL1Loss

FakeQuantizedSmoothL1Loss

torch.nn.SoftMarginLoss

FakeQuantizedSoftMarginLoss

torch.nn.CosineEmbeddingLoss

FakeQuantizedCosineEmbeddingLoss

torch.nn.GaussianNLLLoss

FakeQuantizedGaussianNLLLoss

torch.nn.MarginRankingLoss

FakeQuantizedMarginRankingLoss

torch.nn.TripletMarginLoss

FakeQuantizedTripletMarginLoss

torch.nn.TripletMarginWithDistanceLoss

FakeQuantizedTripletMarginWithDistanceLoss

torch.nn.Embedding

FakeQuantizedEmbedding

torch.nn.EmbeddingBag

FakeQuantizedEmbeddingBag

torch.nn.GRU

FakeQuantizedGRU

torch.nn.RNN

FakeQuantizedRNN

torch.nn.GRUCell

FakeQuantizedGRUCell

torch.nn.RNNCell

FakeQuantizedRNNCell

torch.nn.LSTM

FakeQuantizedLSTM

torch.nn.LSTMCell

FakeQuantizedLSTMCell

torch.nn.AdaptiveLogSoftmaxWithLoss

FakeQuantizedAdaptiveLogSoftmaxWithLoss

aimet_ops.ChannelShuffle

FakeQuantizedChannelShuffle

aimet_ops.MaxPool2d

FakeQuantizedMaxPool2d

aimet_ops.AdaptiveAvgPool2d

FakeQuantizedAdaptiveAvgPool2d

aimet_ops.AvgPool2d

FakeQuantizedAvgPool2d

aimet_ops.Cast

FakeQuantizedCast

aimet_ops.DepthToSpaceDCRMode

FakeQuantizedDepthToSpaceDCRMode

aimet_ops.OneHot

FakeQuantizedOneHot

aimet_ops.Exponential

FakeQuantizedExponential

aimet_ops.Erf

FakeQuantizedErf

aimet_ops.Sqrt

FakeQuantizedSqrt

aimet_ops.Log

FakeQuantizedLog

aimet_ops.Abs

FakeQuantizedAbs

aimet_ops.Neg

FakeQuantizedNeg

aimet_ops.ElementwiseCeil

FakeQuantizedElementwiseCeil

aimet_ops.ElementwiseFloor

FakeQuantizedElementwiseFloor

aimet_ops.Sin

FakeQuantizedSin

aimet_ops.Cos

FakeQuantizedCos

aimet_ops.Asin

FakeQuantizedAsin

aimet_ops.Atan

FakeQuantizedAtan

aimet_ops.Round

FakeQuantizedRound

aimet_ops.LogicalNot

FakeQuantizedLogicalNot

aimet_ops.NonZero

FakeQuantizedNonZero

aimet_ops.ElementwiseUnarySign

FakeQuantizedElementwiseUnarySign

aimet_ops.RSqrt

FakeQuantizedRSqrt

aimet_ops.Square

FakeQuantizedSquare

aimet_ops.Mean

FakeQuantizedMean

aimet_ops.Sum

FakeQuantizedSum

aimet_ops.Prod

FakeQuantizedProd

aimet_ops.Argmin

FakeQuantizedArgmin

aimet_ops.Argmax

FakeQuantizedArgmax

aimet_ops.Gather

FakeQuantizedGather

aimet_ops.Reshape

FakeQuantizedReshape

aimet_ops.RoiAlign

FakeQuantizedRoiAlign

aimet_ops.Permute

FakeQuantizedPermute

aimet_ops.IndexSelect

FakeQuantizedIndexSelect

aimet_ops.TopK

FakeQuantizedTopK

aimet_ops.Tile

FakeQuantizedTile

aimet_ops.Norm

FakeQuantizedNorm

aimet_ops.CumSum

FakeQuantizedCumSum

aimet_ops.Interpolate

FakeQuantizedInterpolate

aimet_ops.Normalize

FakeQuantizedNormalize

aimet_ops.Pad

FakeQuantizedPad

aimet_ops.Shape

FakeQuantizedShape

aimet_ops.Expand

FakeQuantizedExpand

aimet_ops.StridedSlice

FakeQuantizedStridedSlice

aimet_ops.MatMul

FakeQuantizedMatMul

aimet_ops.Add

FakeQuantizedAdd

QuantizedAdd

aimet_ops.Multiply

FakeQuantizedMultiply

QuantizedMultiply

aimet_ops.Subtract

FakeQuantizedSubtract

QuantizedSubtract

aimet_ops.Divide

FakeQuantizedDivide

aimet_ops.FloorDivide

FakeQuantizedFloorDivide

aimet_ops.Greater

FakeQuantizedGreater

aimet_ops.Less

FakeQuantizedLess

aimet_ops.GreaterEqual

FakeQuantizedGreaterEqual

aimet_ops.LessEqual

FakeQuantizedLessEqual

aimet_ops.NotEqual

FakeQuantizedNotEqual

aimet_ops.Equal

FakeQuantizedEqual

aimet_ops.Remainder

FakeQuantizedRemainder

aimet_ops.Fmod

FakeQuantizedFmod

aimet_ops.Pow

FakeQuantizedPow

aimet_ops.CustomSiLU

FakeQuantizedCustomSiLU

aimet_ops.Maximum

FakeQuantizedMaximum

aimet_ops.Max

FakeQuantizedMax

aimet_ops.Minimum

FakeQuantizedMinimum

aimet_ops.Min

FakeQuantizedMin

aimet_ops.Bmm

FakeQuantizedBmm

aimet_ops.LogicalOr

FakeQuantizedLogicalOr

aimet_ops.LogicalAnd

FakeQuantizedLogicalAnd

aimet_ops.CustomGather

FakeQuantizedCustomGather

aimet_ops.GatherNd

FakeQuantizedGatherNd

aimet_ops.Baddbmm

FakeQuantizedBaddbmm

aimet_ops.Addmm

FakeQuantizedAddmm

aimet_ops.ScatterND

FakeQuantizedScatterND

aimet_ops.DynamicConv2d

FakeQuantizedDynamicConv2d

aimet_ops.ScatterElements

FakeQuantizedScatterElements

aimet_ops.BatchNorm

FakeQuantizedBatchNorm

aimet_ops.GroupNorm

FakeQuantizedAimetGroupNorm

aimet_ops.NonMaxSuppression

FakeQuantizedNonMaxSuppression

aimet_ops.Split

FakeQuantizedSplit

aimet_ops.Concat

FakeQuantizedConcat

aimet_ops.Where

FakeQuantizedWhere

aimet_ops.MaskedFill

FakeQuantizedMaskedFill

+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/quantizer.html b/releases/1.32.2/torch_v2/torch_docs/quantizer.html new file mode 100644 index 00000000..23932b81 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/quantizer.html @@ -0,0 +1,452 @@ + + + + + + Quantizers — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantizers

+
+

Top-level API

+
+
+class aimet_torch.v2.quantization.affine.quantizer.QuantizerBase[source]
+

Quantizer base class

+
+
+allow_overwrite(mode)[source]
+

Set the allow_overwrite flag

+
+ +
+
+abstract compute_encodings()[source]
+

Observe inputs and update quantization parameters based on the input statistics.

+
+ +
+
+abstract get_encoding()[source]
+

Return the quantizer’s encodings as an EncodingBase object

+
+
Return type
+

Optional[EncodingBase]

+
+
+
+ +
+
+abstract get_legacy_encodings()[source]
+

Returns a list of encodings, each represented as a List of Dicts

+
+
Return type
+

Optional[List[Dict]]

+
+
+
+ +
+
+is_initialized()[source]
+

Returns true if the quantization parameters are initialized.

+
+
Return type
+

bool

+
+
+
+ +
+
+register_quantization_parameter(name, param)[source]
+

Register quantization parameter.

+
+ +
+
+abstract set_legacy_encodings(encodings)[source]
+

Set encodings represented in the same format as the output of get_legacy_encodings.

+
+ +
+ +
+
+class aimet_torch.v2.quantization.affine.quantizer.QuantizeDequantize(shape, bitwidth, symmetric, encoding_analyzer=None, block_size=None)[source]
+

Applies fake-quantization by quantizing and dequantizing the input.

+

Precisely,

+
+\[out = (\overline{input} + offset) * scale\]
+

where

+
+\[\overline{input} = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right)\]
+

and \(scale\) and \(offset\) are derived from learnable parameters +\(\theta_{min}\) and \(\theta_{max}\).

+

If block size \(B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}\) is specified, +this equation will be further generalized as

+
+\[ \begin{align}\begin{aligned}\begin{split}out_{j_0 \cdots j_{D-1}} &= (\overline{input}_{j_0 \cdots j_{D-1}} + offset_{i_0 \cdots i_{D-1}}) * scale_{i_0 \cdots i_{D-1}}\\ +\overline{input}_{j_0 \cdots j_{D-1}} &= clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\\end{split}\\\text{where} \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor\end{aligned}\end{align} \]
+
+
Parameters
+
    +
  • shape (tuple) – Shape of the quantization parameters

  • +
  • bitwidth (int) – Quantization bitwidth

  • +
  • symmetric (bool) – If True, performs symmetric quantization; +otherwise, performs asymmetric quantization

  • +
  • encoding_analyzer (EncodingAnalyzer, optional) – Encoding analyzer for calibrating quantization encodings +(default: absolute min-max encoding analyzer)

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
Variables
+
    +
  • min (Tensor) – \(\theta_{min}\) from which scale and offset will be derived.

  • +
  • max (Tensor) – \(\theta_{max}\) from which scale and offset will be derived.

  • +
+
+
+
+

Note

+

QuantizeDequantize cannot run forward() until min and max are properly initialized, +which can be done based on input statistics using compute_encodings() or +by manually assigning a new value to min and max. +See the examples below.

+
+

Examples

+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.randn(5, 10)
+>>> qdq = Q.affine.QuantizeDequantize(shape=(5, 2), bitwidth=8, symmetric=False, block_size=(1, 5))
+>>> qdq.is_initialized()
+False
+>>> with qdq.compute_encodings():
+...     _ = qdq(input)
+...
+>>> qdq.is_initialized()
+True
+>>> qdq(input)
+DequantizedTensor([[-0.2771,  0.3038,  1.0819,  0.9700,  0.9487, -0.1307,
+                    -1.7894, -0.1709, -0.2212,  0.7741],
+                   [-1.0295, -1.2265, -1.0295,  1.0564,  0.6177, -1.0386,
+                    -0.0176, -2.6054,  1.8836, -0.1232],
+                   [-0.8229,  0.5540,  0.3992, -0.2363,  1.2546, -1.0036,
+                     0.2355,  0.1741,  1.6079,  0.6247],
+                   [-1.0115,  1.2458,  0.9157, -1.4694, -0.0639, -0.2568,
+                     0.0680,  1.6695,  0.7932, -0.1889],
+                   [ 0.0158,  0.5695,  0.5220,  0.1977, -1.4475, -0.0424,
+                    -1.1128, -0.8796, -0.1060,  1.5897]],
+                  grad_fn=<AliasBackward0>)
+
+
+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.randn(5, 10)
+>>> qdq = Q.affine.QuantizeDequantize(shape=(5, 2), bitwidth=8, symmetric=False, block_size=(1, 5))
+>>> qdq.is_initialized()
+False
+>>> qdq.min = torch.nn.Parameter(-torch.ones_like(qdq.min))
+>>> qdq.max = torch.nn.Parameter(torch.ones_like(qdq.max))
+>>> qdq.is_initialized()
+True
+>>> qdq(input)
+DequantizedTensor([[-0.6196, -0.9961,  0.0549, -0.6431,  1.0039, -0.8706,
+                     1.0039,  0.4706, -0.2353,  0.8078],
+                   [ 0.3451, -0.1176, -0.9961, -0.4549, -0.0549, -0.0471,
+                    -0.5255, -0.2353,  1.0039, -0.9961],
+                   [-0.4157,  0.0784,  0.5333,  0.1647, -0.9961, -0.9961,
+                    -0.2118, -0.2196,  0.9176,  0.9490],
+                   [ 1.0039, -0.7765,  0.4784, -0.8706,  1.0039,  0.6039,
+                    -0.4157, -0.2118, -0.9961,  0.3137],
+                   [ 1.0039,  0.3216, -0.2353, -0.7765, -0.9961,  0.8000,
+                     1.0039,  0.4157,  0.4392,  0.4863]],
+                  grad_fn=<AliasBackward0>)
+
+
+
+
+forward(input)[source]
+

Quantizes and dequantizes the input tensor

+
+
Return type
+

DequantizedTensor

+
+
Parameters
+

input (torch.Tensor) – Input to quantize and dequantize

+
+
Returns
+

Quantize-dequantized output

+
+
+
+ +
+ +
+
+class aimet_torch.v2.quantization.affine.quantizer.Quantize(shape, bitwidth, symmetric, encoding_analyzer=None, block_size=None)[source]
+

Applies quantization to the input.

+

Precisely,

+
+\[out = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right)\]
+

where \(scale\) and \(offset\) are derived from learnable parameters +\(\theta_{min}\) and \(\theta_{max}\).

+

If block size \(B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}\) is specified, +this equation will be further generalized as

+
+\[ \begin{align}\begin{aligned}\begin{split}out_{j_0 \cdots j_{D-1}} & = clamp\left( + \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor + - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\\end{split}\\\text{where} \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor\end{aligned}\end{align} \]
+
+
Parameters
+
    +
  • shape (tuple) – Shape of the quantization parameters

  • +
  • bitwidth (int) – Quantization bitwidth

  • +
  • symmetric (bool) – If True, performs symmetric quantization; +otherwise, performs asymmetric quantization

  • +
  • encoding_analyzer (EncodingAnalyzer, optional) – Encoding analyzer for calibrating quantization encodings +(default: absolute min-max encoding analyzer)

  • +
  • block_size (Tuple[int, ...], optional) – Block size

  • +
+
+
Variables
+
    +
  • min (Tensor) – \(\theta_{min}\) from which scale and offset will be derived.

  • +
  • max (Tensor) – \(\theta_{max}\) from which scale and offset will be derived.

  • +
+
+
+
+

Note

+

Quantize cannot run forward() until min and max are properly initialized, +which can be done based on input statistics using compute_encodings() or +by manually assigning a new value to min and max. +See the examples below.

+
+

Examples

+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.randn(5, 10)
+>>> q = Q.affine.Quantize(shape=(5, 1), bitwidth=8, symmetric=False, block_size=(1, 5))
+>>> q.is_initialized()
+False
+>>> with q.compute_encodings():
+...     _ = q(input)
+...
+>>> q.is_initialized()
+True
+>>> q(input)
+QuantizedTensor([[129.,  64., 255., 122.,   0., 192., 106.,  94., 255.,   0.],
+                 [  0., 145., 181., 255., 144., 255., 194.,   0.,  74.,  86.],
+                 [122.,   0., 255., 150.,  33., 103., 103.,   0.,  37., 255.],
+                 [255., 111., 237., 218.,   0.,  49., 155., 255.,   0., 179.],
+                 [  0.,  66., 255.,  89., 110.,  17.,  36.,  83., 255.,   0.]],
+                grad_fn=<AliasBackward0>)
+
+
+
>>> import aimet_torch.v2.quantization as Q
+>>> input = torch.randn(5, 10)
+>>> q = Q.affine.Quantize(shape=(5, 1), bitwidth=8, symmetric=False, block_size=(1, 5))
+>>> q.is_initialized()
+False
+>>> q.min = torch.nn.Parameter(-torch.ones_like(q.min))
+>>> q.max = torch.nn.Parameter(torch.ones_like(q.max))
+>>> q.is_initialized()
+True
+>>> q(input)
+QuantizedTensor([[187., 186., 131.,   0., 203.,  64.,  80.,   0., 143., 152.],
+                 [ 16.,   0., 255.,   0.,   0., 150.,   0., 255.,  32., 255.],
+                 [255., 226.,   0., 255.,  55., 172.,   0., 255., 145., 255.],
+                 [207., 146., 216., 238.,   0.,   0., 141., 178., 255., 188.],
+                 [ 63.,  59.,  19., 162.,  30., 255., 109., 255.,   0., 255.]],
+                grad_fn=<AliasBackward0>)
+
+
+
+
+forward(input)[source]
+

Quantizes the input tensor

+
+
Return type
+

QuantizedTensor

+
+
Parameters
+

input (torch.Tensor) – Input to quantize

+
+
Returns
+

Quantized output

+
+
+
+ +
+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/torch_docs/tutorials/quickstart_guide.html b/releases/1.32.2/torch_v2/torch_docs/tutorials/quickstart_guide.html new file mode 100644 index 00000000..4bb689e7 --- /dev/null +++ b/releases/1.32.2/torch_v2/torch_docs/tutorials/quickstart_guide.html @@ -0,0 +1,568 @@ + + + + + + Quickstart Guide — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quickstart Guide

+

In this tutorial, we will go through the end-to-end process of using AIMET and PyTorch to create, calibrate, and export +a simple quantized model. Note that this is intended to show the most basic workflow in AIMET. It is not meant to +demonstrate the most state-of-the-art techniques available in AIMET.

+
+

Overall flow

+
    +
  1. Define the basic floating-point PyTorch model, training, and eval loops
  2. Prepare the trained model for quantization
  3. Create quantization simulation (quantsim) model in AIMET to simulate the effects of quantization
  4. Calibrate the quantsim model on training data and evaluate the quantized accuracy
  5. Fine-tune the quantized model to improve the quantized accuracy
  6. Export the quantized model
+
+
+

PyTorch prerequisites

+

To see clearly what happens inside AIMET, let’s first start with some simple PyTorch code for defining, training, and +evaluating a model. The code below is adapted from PyTorch’s +basic optimization tutorial. +Note that AIMET does not have any special requirement on what these training/eval loops look like.

+
import torch
+import torchvision
+import torch.nn.functional as F
+
+device = "cuda:0" if torch.cuda.is_available() else "cpu"
+
+# 1) Start with some data loaders to train, evaluate, and calibrate the model
+
+cifar10_train_data = torchvision.datasets.FashionMNIST('/tmp/cifar10', train=True, download=True, transform=torchvision.transforms.ToTensor())
+cifar10_test_data = torchvision.datasets.FashionMNIST('/tmp/cifar10', train=False, download=True, transform=torchvision.transforms.ToTensor())
+
+train_loader = torch.utils.data.DataLoader(cifar10_train_data, batch_size=128, shuffle=True)
+test_loader = torch.utils.data.DataLoader(cifar10_test_data, batch_size=128, shuffle=True)
+
+# 2) Define a simple model to train on this dataset
+
+class Network(torch.nn.Module):
+
+    def __init__(self):
+        super().__init__()
+        self.conv1 = torch.nn.Conv2d(in_channels=1, out_channels=128, kernel_size=3, padding=1, stride=2)
+        self.bn_1 = torch.nn.BatchNorm2d(128)
+        self.conv2 = torch.nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1, stride=2)
+        self.bn_2 = torch.nn.BatchNorm2d(256)
+        self.linear = torch.nn.Linear(in_features=7*7*256, out_features=10)
+
+    def forward(self, x):
+        x = self.conv1(x)
+        x = F.relu(self.bn_1(x))
+        x = self.conv2(x)
+        x = F.relu(self.bn_2(x))
+        x = self.linear(x.view(x.shape[0], -1))
+        return F.softmax(x, dim=-1)
+
+
+# 3) Define an evaluation loop for the model
+
+def evaluate(model, data_loader):
+    model.eval()
+    correct = total = 0
+    for x, y in data_loader:
+        x, y = x.to(device), y.to(device)
+        output = model(x)
+        correct += (torch.argmax(output, dim=1) == y).sum()
+        total += x.shape[0]
+
+    accuracy = correct / total * 100.
+    return accuracy
+
+
+

Now, let’s instantiate a network and train for a few epochs on our dataset to establish a baseline floating-point model

+
# Create a model
+model = Network()
+
+# Send the model to the desired device (optional)
+model.to(device)
+
+# Define some loss function and optimizer
+loss_fn = torch.nn.CrossEntropyLoss()
+optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
+
+# Train for 4 epochs
+model.train()
+for epoch in range(4):
+    for batch_idx, (x, y) in enumerate(train_loader):
+        x, y = x.to(device), y.to(device)
+        output = model(x)
+        loss = loss_fn(output, y)
+        loss.backward()
+        optimizer.step()
+        optimizer.zero_grad()
+
+# Evaluate the floating-point model
+model.eval()
+fp_accuracy = evaluate(model, test_loader)
+print(f"Floating point accuracy: {fp_accuracy}")
+
+
+
Floating point accuracy: 91.70999908447266
+
+
+
+
+

Prepare the floating point model for quantization

+

Before we can (accurately) simulate quantization, there are a couple important steps to take care of:

+
+

1) Model preparation

+

AIMET’s quantization simulation tool (QuantizationSimModel) expects the floating point model to conform to some +specific guidelines. For example, QuantizationSimModel is only able to quantize math operations performed by +torch.nn.Module objects, whereas torch.nn.functional calls will be (incorrectly) ignored.

+

If we look back at our previous model definition, we see it calls F.relu() and F.softmax() in the forward +function. Does this mean we need to completely redefine our model to use AIMET? Thankfully, no. AIMET provides the +model_preparer API to transform our incompatible model into a new fully-compatible model.

+
from aimet_torch import model_preparer
+
+prepared_model = model_preparer.prepare_model(model)
+print(prepared_model)
+
+# Note: This transformation should not change the model's forward function at all
+fp_accuracy_prepared = evaluate(prepared_model, test_loader)
+assert fp_accuracy_prepared == fp_accuracy
+
+
+
2024-05-07 14:39:22,747 - root - INFO - AIMET
+2024-05-07 14:39:22,806 - ModelPreparer - INFO - Functional         : Adding new module for node: {module_relu}
+2024-05-07 14:39:22,806 - ModelPreparer - INFO - Functional         : Adding new module for node: {module_relu_1}
+2024-05-07 14:39:22,806 - ModelPreparer - INFO - Functional         : Adding new module for node: {module_softmax}
+GraphModule(
+  (conv1): Conv2d(1, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
+  (bn_1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+  (conv2): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
+  (bn_2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+  (linear): Linear(in_features=12544, out_features=10, bias=True)
+  (module_relu): ReLU()
+  (module_relu_1): ReLU()
+  (module_softmax): Softmax(dim=-1)
+)
+
+
+
+def forward(self, x):
+    conv1 = self.conv1(x);  x = None
+    bn_1 = self.bn_1(conv1);  conv1 = None
+    module_relu = self.module_relu(bn_1);  bn_1 = None
+    conv2 = self.conv2(module_relu);  module_relu = None
+    bn_2 = self.bn_2(conv2);  conv2 = None
+    module_relu_1 = self.module_relu_1(bn_2);  bn_2 = None
+    getattr_1 = module_relu_1.shape
+    getitem = getattr_1[0];  getattr_1 = None
+    view = module_relu_1.view(getitem, -1);  module_relu_1 = getitem = None
+    linear = self.linear(view);  view = None
+    module_softmax = self.module_softmax(linear);  linear = None
+    return module_softmax
+
+# To see more debug info, please use `graph_module.print_readable()`
+
+
+

Note how the prepared model now contains distinct modules for the relu() and softmax() operations.

+
+
+

2) BatchNorm fold

+

When models are executed in a quantized runtime, batchnorm layers are typically folded into the weight and bias of +an adjacent convolution layer whenever possible in order to remove unnecessary computations. To accurately simulate +inference in these runtimes, it is generally a good idea to perform this batchnorm folding on the floating point model +before applying quantization. AIMET provides the batch_norm_fold tool to do this.

+
from aimet_torch import batch_norm_fold
+
+sample_input, _ = next(iter(train_loader))
+batch_norm_fold.fold_all_batch_norms(prepared_model, input_shapes=sample_input.shape)
+
+print(prepared_model)
+
+
+
GraphModule(
+  (conv1): Conv2d(1, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
+  (bn_1): Identity()
+  (conv2): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
+  (bn_2): Identity()
+  (linear): Linear(in_features=12544, out_features=10, bias=True)
+  (module_relu): ReLU()
+  (module_relu_1): ReLU()
+  (module_softmax): Softmax(dim=-1)
+)
+
+
+
+def forward(self, x):
+    conv1 = self.conv1(x);  x = None
+    bn_1 = self.bn_1(conv1);  conv1 = None
+    module_relu = self.module_relu(bn_1);  bn_1 = None
+    conv2 = self.conv2(module_relu);  module_relu = None
+    bn_2 = self.bn_2(conv2);  conv2 = None
+    module_relu_1 = self.module_relu_1(bn_2);  bn_2 = None
+    getattr_1 = module_relu_1.shape
+    getitem = getattr_1[0];  getattr_1 = None
+    view = module_relu_1.view(getitem, -1);  module_relu_1 = getitem = None
+    linear = self.linear(view);  view = None
+    module_softmax = self.module_softmax(linear);  linear = None
+    return module_softmax
+
+# To see more debug info, please use `graph_module.print_readable()`
+
+
+

Note that the model now has Identity (passthrough) layers where it previously had BatchNorm2d layers. Like the +model_preparer step, this operation should not impact the model’s accuracy.

+
+
+
+

Quantize the model

+

Now, we are ready to use AIMET’s QuantizationSimModel to simulate quantizing the floating point model. This +involves two steps:

+
    +
  1. Add quantizers to simulate quantization noise during the model’s forward pass

  2. +
  3. Calibrate the quantizer encodings (e.g., min/max ranges) on some sample inputs

  4. +
+

Calibration is necessary to determine the range of values each activation quantizer is likely to encounter in the +model’s forward pass, and should therefore be able to represent. Theoretically, we could pass the entire training +dataset through the model for calibration, but in practice we usually only need about 500-1000 representative samples +to accurately estimate the ranges.

+
import aimet_torch.v2 as aimet
+from aimet_torch.v2 import quantsim
+
+# QuantizationSimModel will convert each nn.Module in prepared_model into a quantized equivalent module and configure the module's quantizers
+# In this case, we will quantize all parameters to 4 bits and all activations to 8 bits.
+sim = quantsim.QuantizationSimModel(prepared_model,
+                                    dummy_input=sample_input.to(device),
+                                    default_output_bw=8,                                # Simulate 8-bit activations
+                                    default_param_bw=4)                                 # Simulate 4-bit weights
+
+# Inside the compute_encodings context, quantizers will observe the statistics of the activations passing through them. These statistics will be used
+# to compute properly calibrated encodings upon exiting the context.
+with aimet.nn.compute_encodings(sim.model):
+    for idx, (x, _) in enumerate(train_loader):
+        x = x.to(device)
+        sim.model(x)
+        if idx >= 10:
+            break
+
+# Compare the accuracy before and after quantization:
+quantized_accuracy = evaluate(sim.model, test_loader)
+
+print(sim.model)
+
+print(f"Floating point model accuracy: {fp_accuracy} %\n"
+      f"Quantized model accuracy: {quantized_accuracy} %")
+
+
+
GraphModule(
+  (conv1): QuantizedConv2d(
+    1, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)
+    (param_quantizers): ModuleDict(
+      (weight): QuantizeDequantize(shape=[1], bitwidth=4, symmetric=True)
+      (bias): None
+    )
+    (input_quantizers): ModuleList(
+      (0): QuantizeDequantize(shape=[1], bitwidth=8, symmetric=False)
+    )
+    (output_quantizers): ModuleList(
+      (0): None
+    )
+  )
+  (bn_1): Identity()
+  (module_relu): FakeQuantizedReLU(
+    (param_quantizers): ModuleDict()
+    (input_quantizers): ModuleList(
+      (0): None
+    )
+    (output_quantizers): ModuleList(
+      (0): QuantizeDequantize(shape=[1], bitwidth=8, symmetric=False)
+    )
+  )
+  (conv2): QuantizedConv2d(
+    128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)
+    (param_quantizers): ModuleDict(
+      (weight): QuantizeDequantize(shape=[1], bitwidth=4, symmetric=True)
+      (bias): None
+    )
+    (input_quantizers): ModuleList(
+      (0): None
+    )
+    (output_quantizers): ModuleList(
+      (0): None
+    )
+  )
+  (bn_2): Identity()
+  (module_relu_1): FakeQuantizedReLU(
+    (param_quantizers): ModuleDict()
+    (input_quantizers): ModuleList(
+      (0): None
+    )
+    (output_quantizers): ModuleList(
+      (0): QuantizeDequantize(shape=[1], bitwidth=8, symmetric=False)
+    )
+  )
+  (linear): QuantizedLinear(
+    in_features=12544, out_features=10, bias=True
+    (param_quantizers): ModuleDict(
+      (weight): QuantizeDequantize(shape=[1], bitwidth=4, symmetric=True)
+      (bias): None
+    )
+    (input_quantizers): ModuleList(
+      (0): None
+    )
+    (output_quantizers): ModuleList(
+      (0): QuantizeDequantize(shape=[1], bitwidth=8, symmetric=False)
+    )
+  )
+  (module_softmax): QuantizedSoftmax(
+    dim=-1
+    (param_quantizers): ModuleDict()
+    (input_quantizers): ModuleList(
+      (0): None
+    )
+    (output_quantizers): ModuleList(
+      (0): QuantizeDequantize(shape=[1], bitwidth=8, symmetric=False)
+    )
+  )
+)
+
+
+
+def forward(self, x):
+    conv1 = self.conv1(x);  x = None
+    bn_1 = self.bn_1(conv1);  conv1 = None
+    module_relu = self.module_relu(bn_1);  bn_1 = None
+    conv2 = self.conv2(module_relu);  module_relu = None
+    bn_2 = self.bn_2(conv2);  conv2 = None
+    module_relu_1 = self.module_relu_1(bn_2);  bn_2 = None
+    getattr_1 = module_relu_1.shape
+    getitem = getattr_1[0];  getattr_1 = None
+    view = module_relu_1.view(getitem, -1);  module_relu_1 = getitem = None
+    linear = self.linear(view);  view = None
+    module_softmax = self.module_softmax(linear);  linear = None
+    return module_softmax
+
+# To see more debug info, please use `graph_module.print_readable()`
+Floating point model accuracy: 91.70999908447266 %
+Quantized model accuracy: 91.1500015258789 %
+
+
+

Here, we can see that sim.model is nothing more than the prepared_model with every layer replaced with a +quantized version of the layer. The quantization behavior of each module is determined by the configuration of its +held quantizers.

+

For example, we can see that sim.model.linear has a 4-bit weight quantizer and an 8-bit output quantizer, as specified during construction, while sim.model.conv2 has a 4-bit weight quantizer and no output quantizer. We will discuss more advanced ways to configure these quantizers to optimize performance and accuracy in a later tutorial.

+
+
+

Fine-tune the model with quantization aware training

+

If we’re not satisfied with our accuracy after applying quantization, there are some steps we can take to further +optimize the quantized accuracy. One such step is quantization aware training (QAT), during which the model is trained +with the fake-quantization ops present.

+

Let’s repeat our floating-point training loop for one more epoch, but this time use the quantized model.

+
# Define some loss function and optimizer
+loss_fn = torch.nn.CrossEntropyLoss()
+optimizer = torch.optim.Adam(sim.model.parameters(), lr=1e-4)
+
+# Train for one more epoch on the quantsim model
+for epoch in range(1):
+    for batch_idx, (x, y) in enumerate(train_loader):
+        x, y = x.to(device), y.to(device)
+        output = sim.model(x)
+        loss = loss_fn(output, y)
+        loss.backward()
+        optimizer.step()
+        optimizer.zero_grad()
+
+
+# Compare the accuracy before and after QAT:
+post_QAT_accuracy = evaluate(sim.model, test_loader)
+
+print(f"Original quantized model accuracy: {quantized_accuracy} %\n"
+      f"Post-QAT model accuracy: {post_QAT_accuracy} %")
+
+
+
Original quantized model accuracy: 91.1500015258789 %
+Post-QAT model accuracy: 92.05333709716797 %
+
+
+
+
+

Export the quantsim model

+

Now that we are happy with our quantized model’s accuracy, we are ready to export the model with its quantization parameters.

+
export_path = "/tmp/"
+model_name = "fashion_mnist_model"
+sample_input, _ = next(iter(train_loader))
+
+sim.export(export_path, model_name, dummy_input=sample_input)
+
+
+

This export method will save the model with quantization nodes removed, along with an encodings file containing +quantization parameters for each activation and weight tensor in the model. These artifacts can then be sent to a +quantized runtime such as Qualcomm® Neural Processing SDK.

+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/adaround.html b/releases/1.32.2/torch_v2/user_guide/adaround.html new file mode 100644 index 00000000..505e239b --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/adaround.html @@ -0,0 +1,255 @@ + + + + + + AIMET AdaRound — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET AdaRound

+
+

AIMET quantization features, by default, use the “nearest rounding” technique for achieving quantization. In the following figure, a single weight value in a weight tensor is shown as an illustrative example. When using the “nearest rounding” technique, this weight value is quantized to the nearest integer value. The Adaptive Rounding (AdaRound) feature uses a smaller subset of the unlabelled training data to adaptively round the weights of modules with weights. In the following figure, the weight value is instead quantized to the integer value farther from it. AdaRound optimizes a loss function using the unlabelled training data to adaptively decide whether to quantize a specific weight to the nearer or the farther integer value. Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.

+

When creating a QuantizationSimModel from the AdaRounded model, use the API provided by QuantizationSimModel to set and freeze the parameter encodings before computing the encodings (a brief sketch is shown below). Please refer to the code example in the AdaRound API section.

+
+../_images/adaround.png +
+
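A minimal sketch of this flow for the PyTorch variant is shown below. Here model, data_loader, dummy_input, and forward_pass_callback are user-provided placeholders, and the exact argument names should be checked against the AdaRound API page for your AIMET version.

# Minimal sketch (PyTorch variant). `model`, `data_loader`, `dummy_input`, and
# `forward_pass_callback` are placeholders supplied by the user.
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters

params = AdaroundParameters(data_loader=data_loader, num_batches=16)

# Apply AdaRound; this returns a model with adaptively rounded weights and writes
# the corresponding parameter encodings to '<path>/<filename_prefix>.encodings'
adarounded_model = Adaround.apply_adaround(model, dummy_input, params,
                                           path='./adaround/',
                                           filename_prefix='adaround',
                                           default_param_bw=4)

# Create a QuantizationSimModel from the AdaRounded model, then set and freeze
# the AdaRounded parameter encodings BEFORE computing the remaining encodings
sim = QuantizationSimModel(adarounded_model, dummy_input=dummy_input,
                           quant_scheme=QuantScheme.post_training_tf_enhanced,
                           default_param_bw=4, default_output_bw=8)
sim.set_and_freeze_param_encodings(encoding_path='./adaround/adaround.encodings')
sim.compute_encodings(forward_pass_callback, forward_pass_callback_args=None)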

AdaRound Use Cases

+
+
+

Common terminology

+
+
    +
  • BC - Bias Correction

  • +
  • BNF - Batch Norm Folding

  • +
  • CLE - Cross Layer Equalization

  • +
  • HBF - High Bias Folding

  • +
  • QAT - Quantization Aware Training

  • +
  • { } - An optional step in the use case

  • +
+
+
+
+

Use Cases

+
+
    +
  1. +
    {BNF} –> {CLE} –> AdaRound

Applying BNF and CLE before AdaRound is optional. Some models benefit from applying CLE, while others do not see any benefit.

    +
    +
    +
  2. +
  3. +
    AdaRound –> QAT

AdaRound is a post-training quantization feature, but applying BNF and CLE may not be beneficial for some models. For these models, performing QAT after AdaRound may help. AdaRound serves as a better weight-initialization step, which enables faster QAT convergence.

    +
    +
    +
  4. +
+

Not recommended

+
+
+
    +
  1. AdaRound –> BC

  2. +
  3. BC –> AdaRound

  4. +
+

AdaRound Hyper parameters guidelines

+
+
+

A few hyperparameters required during AdaRound optimization are exposed to users. Most of them have default values that lead to good and stable results across many models, and changing them often is not recommended.

+

The following are guidelines for the hyperparameters (a brief sketch follows this list):

+
    +
1. Hyperparameters to be changed often: number of batches (approximately 500-1000 images; e.g., if the data loader batch size is 64, then 16 batches corresponds to 1024 images), number of iterations (default 10000)

  2. +
3. Hyperparameters to be changed moderately: regularization parameter (default 0.01)

  4. +
5. Hyperparameters to be changed least often: beta range (default (20, 2)), warm start period (default 20%)

  6. +
+
+
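As an illustration (PyTorch variant), the hyperparameters above map onto the AdaroundParameters constructor roughly as follows; data_loader is a placeholder, and the values shown are the defaults listed above.

from aimet_torch.adaround.adaround_weight import AdaroundParameters

# `data_loader` is a placeholder for an unlabeled data loader (batch size 64 assumed)
params = AdaroundParameters(data_loader=data_loader,
                            num_batches=16,                # changed often: 16 batches x 64 images = 1024 images
                            default_num_iterations=10000,  # changed often: number of optimization iterations
                            default_reg_param=0.01,        # changed moderately: regularization parameter
                            default_beta_range=(20, 2),    # changed least often: beta range
                            default_warm_start=0.2)        # changed least often: warm start period (20%)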

+
+
+
+

AdaRound API

+

Please refer to the links below to view the AdaRound API for each AIMET variant:

+
    +
  • AdaRound for PyTorch

  • +
  • AdaRound for Tensorflow

  • +
  • AdaRound for Keras

  • +
  • AdaRound for ONNX

  • +
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/auto_quant.html b/releases/1.32.2/torch_v2/user_guide/auto_quant.html new file mode 100644 index 00000000..2f33c173 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/auto_quant.html @@ -0,0 +1,210 @@ + + + + + + AIMET AutoQuant — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET AutoQuant

+
+

Overview

+

AIMET offers a suite of neural network post-training quantization techniques. Often, applying these techniques in a specific sequence results in better accuracy and performance. Without the AutoQuant feature, the AIMET user needs to manually try out various combinations of AIMET quantization features; this manual process is error-prone and often time-consuming.

+

The AutoQuant feature analyzes the model, determines the sequence of AIMET quantization techniques, and applies these techniques. In addition, the user can specify in the AutoQuant API the amount of accuracy drop that can be tolerated. As soon as this accuracy threshold is reached, AutoQuant stops applying further quantization techniques. In summary, the AutoQuant feature saves time and automates the quantization of neural networks.

+
+
+

Workflow

+

Before entering the optimization workflow, AutoQuant performs the following preparation steps:

+
+
    +
  1. Check the validity of the model and convert it into an AIMET quantization-friendly format (denoted as Prepare Model below).

  2. +
  3. Select the best-performing quantization scheme for the given model (denoted as QuantScheme Selection below)

  4. +
+
+

After the preparation steps, AutoQuant mainly consists of the following three stages:

+
+
    +
  1. BatchNorm folding

  2. +
  3. Cross-Layer Equalization

  4. +
  5. AdaRound

  6. +
+
+

These techniques are applied in a best-effort manner until the model meets the allowed accuracy drop. If AutoQuant fails to satisfy the evaluation goal, it returns the model with the best combination of the above techniques applied (a rough API sketch is shown below the flowchart).

+
+
../_images/auto_quant_v2_flowchart.png +
+
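As a rough illustration for the PyTorch variant, the flow looks approximately like the sketch below; the model, dummy input, unlabeled data loader, and evaluation callback are placeholders, and the constructor and method names should be verified against the AutoQuant API page for your AIMET version.

from aimet_torch.auto_quant import AutoQuant

# `model`, `dummy_input`, `unlabeled_data_loader`, and `eval_callback` are placeholders;
# eval_callback(model) is assumed to return an accuracy metric.
auto_quant = AutoQuant(model,
                       dummy_input=dummy_input,
                       data_loader=unlabeled_data_loader,
                       eval_callback=eval_callback)

# Allow at most a 1% accuracy drop; AutoQuant applies BatchNorm folding, CLE, and
# AdaRound as needed and returns the best model found, its accuracy, and the encodings path.
optimized_model, accuracy, encoding_path = auto_quant.optimize(allowed_accuracy_drop=0.01)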
+
+

AutoQuant API

+

Please refer to the links below to view the AutoQuant API for each AIMET variant:

+
    +
  • AutoQuant for PyTorch

  • +
  • AutoQuant for Tensorflow

  • +
  • AutoQuant for ONNX

  • +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/bn_reestimation.html b/releases/1.32.2/torch_v2/user_guide/bn_reestimation.html new file mode 100644 index 00000000..cddac07c --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/bn_reestimation.html @@ -0,0 +1,212 @@ + + + + + + AIMET BN Re-estimation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET BN Re-estimation

+
+

Overview

+

The BN Re-estimation feature utilizes a small subset of training data to individually re-estimate the statistics of the Batch Normalization (BN) layers in a model. These BN statistics are then used to adjust the quantization scale parameters of the preceding Convolution or Linear layers. Effectively, the BN layers are folded.

+

The BN Re-estimation feature is applied after performing Quantization Aware Training (QAT) with Range Learning, with Per Channel Quantization (PCQ) enabled. It is very important NOT to fold the BN layers before performing QAT. The BN layers are folded ONLY after QAT and the re-estimation of the BN statistics is completed. The Workflow section below covers the exact sequence of steps.

+

The BN Re-estimation feature is specifically recommended for the following scenarios:

+
    +
  • Low-bitwidth weight quantization (e.g., 4-bits)

  • +
  • Models for which Batch Norm Folding leads to decreased performance.

  • +
  • Models where the main issue is weight quantization (including higher bitwidth quantization)

  • +
  • Low bitwidth quantization of depthwise separable layers since their Batch Norm Statistics are affected by oscillations

  • +
+
+
+

Workflow

+

BN-Re-estimation requires that

+
    +
  1. BN layers not be folded before QAT.

  2. +
  3. Per Channel Quantization is enabled.

  4. +
+

To use the BN Re-estimation feature, the following sequence of steps must be followed in the correct order (a condensed sketch follows below).

+
    +
  1. Create the QuantizationSimModel object with Range Learning Quant Scheme

  2. +
  3. Perform QAT with Range Learning

  4. +
  5. Re-estimate the BN statistics

  6. +
  7. Fold the BN layers

  8. +
  9. Using the QuantizationSimModel, export the model and encodings.

  10. +
+

Once the above steps are completed, the model can be run on the target for inference.

+
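A condensed sketch of this sequence for the PyTorch variant is shown below; the model, data loaders, training loop, and calibration callback are placeholders, and exact signatures should be checked against the BN Re-estimation API page.

from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.bn_reestimation import reestimate_bn_stats
from aimet_torch.batch_norm_fold import fold_all_batch_norms_to_scale

# 1. Create the QuantizationSimModel with a Range Learning quant scheme
#    (per-channel quantization is enabled through the quantsim config file, not shown here)
sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           quant_scheme=QuantScheme.training_range_learning_with_tf_init)
sim.compute_encodings(forward_pass_callback, forward_pass_callback_args=None)

# 2. Perform QAT with Range Learning (standard training loop over sim.model; placeholder)
train(sim.model)

# 3. Re-estimate the BN statistics using a small amount of training data
reestimate_bn_stats(sim.model, bn_reestimation_data_loader, num_batches=100)

# 4. Fold the BN layers into the scale parameters of the preceding conv/linear layers
fold_all_batch_norms_to_scale(sim)

# 5. Export the model and encodings (dummy_input is assumed to be on CPU)
sim.export(path='./output/', filename_prefix='model_after_bn_reestimation', dummy_input=dummy_input)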

The following high-level call flow diagram enumerates the workflow for PyTorch. The workflow is the same for TensorFlow and Keras.

+../_images/bn_reestimation.png +
+
+

BN Re-estimation API

+

Please refer to the links below to view the BN Re-estimation API for each AIMET variant:

+
    +
  • BN Re-estimation for PyTorch

  • +
  • BN Re-estimation for Tensorflow

  • +
  • BN Re-estimation for Keras

  • +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/channel_pruning.html b/releases/1.32.2/torch_v2/user_guide/channel_pruning.html new file mode 100644 index 00000000..5f633c76 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/channel_pruning.html @@ -0,0 +1,195 @@ + + + + + + AIMET Channel Pruning — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Channel Pruning

+

Channel Pruning is a model compression technique that removes less-important input channels from layers in a given model. Currently, AIMET supports Channel Pruning of Conv2d layers.

+
+

Overall Procedure

+

The following picture explains the different steps in Channel Pruning a given layer. These steps are repeated for all layers selected to be compressed in the order of their occurrence from the top of the model.

+../_images/channel_pruning_1.png +

These individual steps are explained in more detail in the following sub-sections.

+
+
+

Channel Selection

+

For a given layer and a given compression ratio, Channel Selection analyzes the magnitude of each input channel (based on the kernel weights for that channel) and chooses the channels with the least magnitude to be pruned (a simplified illustration follows below).

+
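As a simplified illustration of the idea (generic PyTorch, not AIMET's internal implementation), a per-input-channel magnitude can be computed for a Conv2d weight tensor and the lowest-magnitude channels selected for pruning:

import torch

conv = torch.nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)

# Conv2d weights have shape [out_channels, in_channels, kH, kW];
# compute an L2 magnitude per input channel across all other dimensions
per_input_channel_magnitude = conv.weight.detach().norm(p=2, dim=(0, 2, 3))

# Select, e.g., the 25% of input channels with the smallest magnitude for pruning
num_to_prune = int(0.25 * conv.in_channels)
channels_to_prune = torch.argsort(per_input_channel_magnitude)[:num_to_prune]
print(channels_to_prune)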
+
+

Winnowing

+

Winnowing is used to remove the input channels of the weight matrix identified by Channel Selection, resulting in compressed tensors.

+../_images/cp_2.png +

Once one or more input channels of a layer are removed, the corresponding output channels of an upstream layer can also be removed to gain further compression. Note that the presence of skip-connections or residuals sometimes prevents upstream layers from being output-pruned.

+../_images/cp_3.jpg +

For more details on winnowing, please see this

+
+ +
+
+
+

Weight Reconstruction

+

As a final step in Channel Pruning, AIMET adjusts the weight and bias parameters of a pruned layer so that its outputs closely match the outputs prior to pruning. This is done by collecting random samples of the layer's output from the original model and the corresponding input samples from the pruned model for that layer. AIMET then performs linear regression to adjust the layer parameters.

+../_images/cp_4.jpg +
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/compression_feature_guidebook.html b/releases/1.32.2/torch_v2/user_guide/compression_feature_guidebook.html new file mode 100644 index 00000000..e8937557 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/compression_feature_guidebook.html @@ -0,0 +1,200 @@ + + + + + + AIMET Compression Features Guidebook — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Compression Features Guidebook

+

This document provides typical workflows for compressing a network using AIMET. A more in-depth discussion of the various techniques and their usage is provided in the User Guide

+

AIMET supports network compression using the following techniques: Weight SVD, Spatial SVD (SSVD) and Channel Pruning (CP). These techniques are intended for Multiply-and-Accumulate (MAC) reduction of convolution layers in a neural network. Based on a configured desired MAC reduction ratio, i.e., the ratio of MACs in the compressed model to MACs in the uncompressed model, the compression algorithms automatically compress each individual convolution layer in the network to approximately reach the overall desired MAC reduction. Note that the actual on-target inference latency of a model depends on several factors such as MACs, memory, memory bandwidth, quantization, etc. Therefore, the improvement in runtime latency from MAC-reduction-based compression may vary depending on the specific model architecture. Performance results for some typical models are provided at https://quic.github.io/aimet-pages/index.html. For best performance, a combination of Spatial SVD followed by Channel Pruning is recommended. At a high level, the following steps should be performed to compress a network using the SSVD + CP combination:

+../_images/compression_flow.png +
    +
  1. Determine the target compression ratio (C), which is the ratio of MACs in final compressed model to the MACs in the original uncompressed model. For example, target compression ratio = 0.5 indicates that the final model MACs are half of the original model MACs.

  2. +
  3. Perform compression using Spatial SVD technique as follows:

  4. +
+
+
    +
1. Since the target compression ratio C is for the final SSVD+CP compressed model, the compression that should be targeted or can be achieved via SSVD alone is unknown a priori. As a result, a few target compression ratios (Cssvd) need to be tried out. Choose a few Cssvd > C targets and perform SSVD. E.g., if C = 0.5, Cssvd = {0.5, 0.65, 0.75} can typically be used. This would result in three SSVD-compressed models.

  2. +
3. For each of the SSVD-compressed models obtained from the previous step, perform fine-tuning to improve model accuracy. Guidelines on fine-tuning are provided here [].

  4. +
+
+
    +
1. Pick a model (or a few models) from step 2b that provide high accuracy. For example, if the tolerable accuracy drop of the SSVD+CP compression relative to the original uncompressed model is X % (X = accuracy of uncompressed model (%) - accuracy of compressed model (%)), then a model (or models) whose accuracy is within a few percent (X-5 %) of the original uncompressed model accuracy should be selected, to avoid a very large drop in accuracy after the CP step.

  2. +
+
+
    +
1. Note that if step 2b results in a very large accuracy drop, or in a drop well within the tolerable limit, then steps 2a/2b should be revisited first by appropriately adjusting the compression ratios.

  2. +
+
+
    +
1. Perform compression using the Channel Pruning technique as follows:

  2. +
+
+
    +
1. Perform compression with a few target compression ratios (Ccp). One can set the compression ratio(s) based on the Cssvd of the model obtained from the SSVD step such that Cssvd * Ccp is approximately equal to C.

  2. +
  3. Perform fine-tuning to improve model accuracy.

  4. +
+
+
    +
1. In the final step, a model is selected whose MAC ratio relative to the original uncompressed model is close to C and which also meets the user's accuracy requirements. For example, for the ResNet-50 results provided at https://quic.github.io/aimet-pages/index.html, Cssvd = 0.75 and Ccp = 0.66 were used to achieve an overall compression of C = 0.5.

  2. +
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/greedy_compression_ratio_selection.html b/releases/1.32.2/torch_v2/user_guide/greedy_compression_ratio_selection.html new file mode 100644 index 00000000..481d1abd --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/greedy_compression_ratio_selection.html @@ -0,0 +1,203 @@ + + + + + + AIMET Greedy Compression Ratio Selection — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Greedy Compression Ratio Selection

+
+

Overview

+

The model compression methods Spatial SVD and Channel Pruning work on a per-layer basis. Not all layers in a given model are equally compressible, and compressing individual layers can have a varying impact on the final accuracy of the model. The Greedy Per-Layer Compression Ratio Selection algorithm is used to assess the sensitivity of applicable layers to compression and to find an appropriate compression ratio for each individual layer. The algorithm makes sure that the entire model retains the highest possible accuracy while also meeting the given target compression ratio.

+
+
+

How it works

+

The Greedy Compression Ratio Selection algorithm executes the following two steps:

+
    +
  • Per-layer exploration

  • +
  • Compression-ratio selection

  • +
+

The following figure provides a high level overview and is followed by details for each step.

+../_images/greedy_1.png +
+

+
+../_images/greedy_2.png +
+

+
+

where the Eval dictionary is represented as:

+../_images/greedy_3.png +
+
+

Per-layer Exploration

+

For each layer, this step produces a column in the compression-ratio vs. model-performance table. The column captures the overall network performance as the layer is compressed over a predefined range of compression-ratio candidates, while all other layers are left unmodified.

+../_images/greedy_4.jpg +

In the above figure, you see an example model with 4 layers and 10 compression-ratio candidates (the default setting). Note that the table does not capture the eval score for the last candidate, which is always compression-ratio=1.0, since this is the baseline score and is already known.

+

Monotonic Fit: In some cases, it is observed that model performance is not a strictly increasing function of the compression ratio. To help with the greedy selection procedure, AIMET can apply a curve-fit scheme to try to fit the model-performance numbers for a given layer using a monotonically increasing function. This functionality is disabled by default (see the sketch below).

+
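For reference, these settings surface in the compression API roughly as shown below (PyTorch variant); the import path and parameter names should be verified against the compression API documentation for your AIMET version.

from decimal import Decimal
from aimet_torch.defs import GreedySelectionParameters

greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.5),  # desired overall compression ratio
                                          num_comp_ratio_candidates=10,    # default number of candidates per layer
                                          use_monotonic_fit=True)          # enable the optional monotonic curve fit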
+
+

Compression Ratio Selection

+

This step is the core of the algorithm. It considers the compression-ratio vs. model-performance table for each applicable layer from the previous step, the target compression ratio, and a function to calculate the cost of the compressed model depending on the compression method used (Spatial SVD, Channel Pruning). It starts with a constant accuracy and finds the corresponding compression ratio for every applicable layer by interpolating from the compression-ratio vs. model-performance evaluation table. The algorithm then calculates the total cost of the model to check whether the target compression ratio has been met. A binary search is used to find the solution quickly. Finally, it returns the list of selected compression ratios for all applicable layers. This way, the algorithm achieves the highest remaining accuracy for the compressed model while meeting the target compression ratio.

+

The following figure illustrates that for a given accuracy, the compression ratio for each layer is different.

+../_images/greedy_5.jpg +

As suggested by the above diagram, the algorithm picks a lower compression ratio (higher compression) for layers that are more compressible and a higher compression ratio (lower compression) for layers that are less compressible (for less compressible layers, the accuracy falls drastically if the compression ratio is lowered).

+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/index.html b/releases/1.32.2/torch_v2/user_guide/index.html new file mode 100644 index 00000000..fcf70332 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/index.html @@ -0,0 +1,225 @@ + + + + + + AI Model Efficiency Toolkit User Guide — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AI Model Efficiency Toolkit User Guide

+
+

Overview

+

AI Model Efficiency Toolkit (AIMET) is a software toolkit that enables users to quantize and compress models. +Quantization is a must for efficient edge inference using fixed-point AI accelerators.

+

AIMET optimizes pre-trained models (e.g., FP32 trained models) using post-training and fine-tuning techniques that +minimize accuracy loss incurred during quantization or compression.

+

AIMET currently supports PyTorch, TensorFlow, and Keras models.

+../_images/AIMET_index_no_fine_tune.png +

The above picture shows a high-level view of the workflow when using AIMET. The user will start with a trained +model in either the PyTorch, TensorFlow, or Keras training framework. This trained model is passed to AIMET using APIs +for compression and quantization. AIMET returns a compressed/quantized version of the model +that the users can fine-tune (or train further for a small number of epochs) to recover lost accuracy. Users can then +export via ONNX/meta/h5 to an on-target runtime like Qualcomm® Neural Processing SDK.

+
+
+

Features

+

AIMET supports two sets of model optimization techniques:

+
    +
  • Model Quantization: AIMET can simulate behavior of quantized HW for a given trained +model. This model can be optimized using Post-Training Quantization (PTQ) and fine-tuning (Quantization Aware Training +- QAT) techniques.

  • +
  • Model Compression: AIMET supports multiple model compression techniques that allow the +user to take a trained model and remove redundancies, resulting in a smaller model that runs faster on target.

  • +
+
+
+

Release Information

+

For information specific to this release, please see Release Notes and Known Issues.

+
+
+

Installation Guide

+

Please visit the AIMET Installation for more details.

+
+
+

Getting Started

+

Please refer to the following documentation:

+ +
+

toc tree

+
+
+
+

+
+
+

+
+
+
AI Model Efficiency Toolkit is a product of Qualcomm Innovation Center, Inc.
+
Qualcomm® Neural Processing SDK is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
+
+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/known_issues.html b/releases/1.32.2/torch_v2/user_guide/known_issues.html new file mode 100644 index 00000000..41911cfc --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/known_issues.html @@ -0,0 +1,179 @@ + + + + + + AIMET Known Issues — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Known Issues

+

Known issues and limitations for Qualcomm AI Model Efficiency ToolKit (AIMET)

+
    +
  • AIMET Spatial SVD currently does not support Fully Connected layers

  • +
  • +
    AIMET Channel Pruning
      +
    • Does not support Conv layers with dilation other than (1,1). Conv layers with dilation other than (1,1) must be added to Channel Pruning Configuration’s modules_to_ignore list.

    • +
    • Does not support channel pruning of DepthwiseConv2d layers.

    • +
    • For TensorFlow, supports only models with “Channels Last” data format

    • +
    +
    +
    +
  • +
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/model_compression.html b/releases/1.32.2/torch_v2/user_guide/model_compression.html new file mode 100644 index 00000000..631a1493 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/model_compression.html @@ -0,0 +1,256 @@ + + + + + + AIMET Model Compression — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Model Compression

+
+

Overview

+

AIMET provides a model compression library that can be used to reduce a model’s MAC and memory costs with a minimal +drop in accuracy. AIMET supports various compression schemes like Weight SVD, Spatial SVD and Channel Pruning.

+
+
+

Please see the Compression Guidebook, which includes some practical advice on using the compression features and how to combine them.

+
+
+

Use Case

+

AIMET allows the user to take a trained model and compress it to a desired compression ratio; the compressed model can then be further fine-tuned and exported to a target. All of the compression schemes in AIMET use a two-step process: compression ratio selection followed by model compression.

+../_images/compression_use_case.PNG +

The following sub-sections explain these steps in more detail.

+
+
+

Compression ratio selection

+
+
+
    +
  • Greedy Compression Ratio Selection: During this phase, individual layers of the original model are analyzed to determine optimal compression ratios per layer. Currently AIMET supports the Greedy Compression Ratio Selection method.

  • +
• Manual Compression Ratio Selection: As an alternative to AIMET automatically selecting optimal compression ratios per layer, the user can specify compression ratios manually per layer. The suggested procedure is to use the Greedy Compression Ratio Selection method to get a nominal set of compression ratios first, and then use this as the starting point for manually changing compression ratios for one or more layers.

  • +
+

To visualize the various usages of the compression tool, we can use:

+ +
+
+

Model Compression

+

In this phase, AIMET will apply the compression ratios per layer to create a compressed model. +Currently, AIMET supports the following model compression algorithms.

+ +
+
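To make the two-step flow concrete, below is a minimal Spatial SVD sketch for the PyTorch variant; the model, evaluation callback, and input shape are placeholders, and the full set of arguments is described in the compression API documentation.

from decimal import Decimal
from aimet_common.defs import CompressionScheme, CostMetric
from aimet_torch.defs import SpatialSvdParameters, GreedySelectionParameters
from aimet_torch.compress import ModelCompressor

# Step 1: greedy compression-ratio selection targeting a 50% MAC reduction
greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.5),
                                          num_comp_ratio_candidates=10)
auto_params = SpatialSvdParameters.AutoModeParams(greedy_params, modules_to_ignore=[])
params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.auto, params=auto_params)

# Step 2: model compression; `eval_callback(model, iterations, use_cuda)` is a
# user-provided placeholder that returns an accuracy metric
compressed_model, stats = ModelCompressor.compress_model(model,
                                                         eval_callback=eval_callback,
                                                         eval_iterations=10,
                                                         input_shape=(1, 3, 224, 224),
                                                         compress_scheme=CompressionScheme.spatial_svd,
                                                         cost_metric=CostMetric.mac,
                                                         parameters=params)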
+

Optional techniques to get better compression results

+

AIMET supports the following techniques that can be optionally used to get better compression results

+
    +
  • Rank-rounding

  • +
  • Per-layer fine-tuning

  • +
+
+

Rank Rounding

+

ML runtime software, like that used for embedded ML accelerators, often prefers the dimensions of layers like Conv2d or FC to be of a certain multiplicity. Matching the expected dimension size results in optimal runtime for that layer. AIMET techniques like Weight/Spatial SVD or Channel Pruning try to decompose or reduce layers, specifically in terms of output channels and input channels. The rank-rounding feature in AIMET will try to reduce layers to match a user-provided multiplicity. By default, this feature is disabled. At present, AIMET allows the user to specify a multiplicity factor for the entire model, not on a per-layer basis.

+

Users can make use of this feature to generate more optimal models for running on embedded targets.

+
+
+

Per-layer Fine-tuning

+

Given a user model and a desired compression ratio, the user may sometimes notice a sharp degradation in accuracy after compression but before fine-tuning. One technique that might help in such scenarios is a feature called per-layer fine-tuning. When this feature is selected, AIMET invokes a user-provided fine-tuning function after compressing every layer that was selected for compression. This is done during the Model Compression phase in the diagram shown above.

+

Note: The user is responsible for choosing appropriate learning-rates and other training parameters for fine-tuning. Using this feature may require the user to carefully pick the learning rates and learning-rate-decay parameters to be used during fine-tuning.

+
+
+
+

FAQs

+
    +
  1. Which technique is the best technique to use for compression?

    +

    We see best results when Spatial SVD is performed followed by Channel Pruning.

    +
  2. +
  3. Can we combine the different techniques?

    +

    Yes, as stated in 1, different techniques can be combined together to get better accuracy. Compression can be combined with Post-training Quantization techniques as well to get a better model for target.

    +
  4. +
  5. How to take a model to target after compression?

    +

To take a model to target, it first needs to be compressed using the above techniques, and then quantized and exported to the target.

    +
  6. +
  7. Greedy rank selection is very slow. Can something be done to speed it up?

    +

Greedy rank selection in itself is not time-consuming. The time-consuming part is creating the eval-score dictionary. For different experiments, the eval-score dictionary can be generated once and then loaded into the searcher. Alternatively, one can reduce the number of candidates over which the eval-score dictionary is created; however, the fewer the candidates, the lower the granularity. To strike a balance, the value of 10 candidates was chosen.

    +
  8. +
  9. Is per-layer fine tuning helpful?

    +

Per-layer fine-tuning is an experimental technique. We have not observed major gains from using it, but one can try it to see if it works for a particular model. In practice, we have observed that the best combination is to do, say, 1 epoch of fine-tuning per layer and then 10-15 epochs of fine-tuning for the entire compressed model at the end.

    +
  10. +
+
+
+

References

+
    +
  1. Xiangyu Zhang, Jianhua Zou, Kaiming He, and Jian Sun. “Accelerating Very Deep Convolutional Networks for Classification and Detection.” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 10, pp. 1943-1955, 1 Oct. 2016.

  2. +
  3. Yihui He, Xiangyu Zhang, and Jian Sun. “Channel Pruning for Accelerating Very Deep Neural Networks.” IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 1398-1406.

  4. +
  5. Max Jaderberg, Andrea Vedaldi, and Andrew Zisserman. “Speeding up Convolutional Neural Networks with Low Rank Expansions.” British Machine Vision Conference, Jan. 2014.

  6. +
  7. Andrey Kuzmin, Markus Nagel, Saurabh Pitre, Sandeep Pendyam, Tijmen Blankevoort, Max Welling. “Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks.”

  8. +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/model_guidelines.html b/releases/1.32.2/torch_v2/user_guide/model_guidelines.html new file mode 100644 index 00000000..4657e35c --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/model_guidelines.html @@ -0,0 +1,232 @@ + + + + + + Model Guidelines for PyTorch — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Model Guidelines for PyTorch

+

To implement the Cross Layer Equalization API, aimet_torch.cross_layer_equalization.equalize_model(), AIMET creates a computing graph to analyze the sequence of operations in the model. If your model is defined using certain constructs, AIMET may be unable to successfully create and analyze the computing graph. The following table lists the potential issues and workarounds (an illustration of the slicing workaround follows the table).

+

Note: These restrictions are not applicable if you are using the Primitive APIs.

+ +++++ + + + + + + + + + + + + + + + + + + + + + + + + +

Potential Issue

Description

Work Around

ONNX Export

Use torch.onnx.export() +to export your model. +Make sure ONNX export passes

If ONNX export fails, rewrite the +specific layer so that ONNX +export passes

Slicing Operation

Some models use +torch.tensor.view() in the +forward function as follows: +x = x.view(-1, 1024) +If view function is written +as above, it causes an issue +while creating the +computing graph

Rewrite the x.view() statement +as follows: +x = x.view(x.size(0), -1)

Bilinear, upsample +operation

Some models use the upsample +operation in the forward +function as: x= +torch.nn.functional.upsample( +x, size=torch.Size([129,129]) +, mode = ‘bilinear’, +align_corners=True)

Set the align_corners parameter to +False as follows: +x = +torch.nn.functional.upsample(x, +size=torch.Size([129, 129]), +mode=’bilinear’, +align_corners=False)

Deconvolution operation

The deconvolution operation +is used in DeepLabV3 model. +This is currently not +supported by AIMET

There is no workaround available +at this time. This issue will be +addressed in a subsequent AIMET +release.

+
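For example, the slicing workaround from the table amounts to the following change in a model's forward function (a generic PyTorch illustration):

import torch

class ExampleModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(1024, 10)

    def forward(self, x):
        # Problematic: x = x.view(-1, 1024) can break graph analysis.
        # Preferred: derive the batch dimension from the input instead.
        x = x.view(x.size(0), -1)
        return self.linear(x)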
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/model_quantization.html b/releases/1.32.2/torch_v2/user_guide/model_quantization.html new file mode 100644 index 00000000..92916f7b --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/model_quantization.html @@ -0,0 +1,387 @@ + + + + + + AIMET Model Quantization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Model Quantization

+

Models are generally trained on floating-point hardware like CPUs and GPUs. However, when these trained models are run on quantized hardware that supports fixed-precision operations, the model parameters must be converted from floating-point precision to fixed precision. For example, when running on hardware that supports 8-bit integer operations, the floating-point parameters in the trained model need to be converted to 8-bit integers. For some models, running on an 8-bit fixed-precision runtime introduces a loss in accuracy due to the noise added by the use of fixed-precision parameters and fixed-precision operations.

+

AIMET provides multiple techniques and tools which help to create quantized models with a minimal loss in accuracy +relative to floating-point models.

+

This section provides information on typical use cases and AIMET’s quantization features.

+
+

Use Cases

+

1. Predict on-target accuracy: AIMET enables a user to simulate the effects of quantization to get a first order +estimate of the model’s accuracy when run on quantized targets. This is useful to get an estimate of on-target accuracy +without needing an actual target platform. Note that to create a simulation model, AIMET uses representative data +samples to compute per-layer quantization encodings.

+
+
../_images/quant_use_case_1.PNG +
+

2. Post-Training Quantization (PTQ): PTQ techniques attempt to make a model more quantization friendly without +requiring model re-training/fine-tuning. PTQ (as opposed to fine-tuning) is recommended as a first step in a +quantization workflow due to the following advantages:

+
    +
  • No need for the original training pipeline; an evaluation pipeline is sufficient

  • +
  • Only requires a small unlabeled dataset for calibration (can even be data-free in some scenarios)

  • +
  • Fast, simple, and easy to use

    +
    +
    ../_images/quant_use_case_3.PNG +
    +
  • +
+

Note that with PTQ techniques, the quantized model accuracy may still have a gap relative to the floating-point model. +In such a scenario, or to even further improve the model accuracy, fine-tuning is recommended.

+

3. Quantization-Aware Training (QAT)/Fine-Tuning: QAT enables a user to fine-tune a model with quantization +operations inserted in network graph, which in effect adapts the model parameters to be robust to quantization noise. +While QAT requires access to a training pipeline and dataset, and takes longer to run due to needing a few epochs of +fine-tuning, it can provide better accuracy especially at low bitwidths. A typical QAT workflow is illustrated below.

+
+
../_images/quant_use_case_2.PNG +
+
+
+

AIMET Quantization Features

+
+
+
    +
  • +
    Quantization Simulation:

    QuantSim enables a user to modify a model by adding quantization simulation ops. When an evaluation is run on a +model with these quantization simulation ops, the user can observe a first-order simulation of expected accuracy on +quantized hardware.

    +
    +
    +
  • +
  • +
    Quantization-Aware Training (QAT):

    QAT allows users to take a QuantSim model and further fine-tune the model parameters by taking quantization into +account.

    +

    Two modes of QAT are supported:

    +
      +
    • +
      Regular QAT:

      Fine-tuning of model parameters. Trainable parameters such as module weights, biases, etc. can be +updated. The scale and offset quantization parameters for activation quantizers remain constant. Scale and +offset parameters for weight quantizers will update to reflect new weight values after each training step.

      +
      +
      +
    • +
    • +
      QAT with Range Learning:

In addition to trainable module weights and scale/offset parameters for weight quantizers, scale/offset parameters for activation quantizers are also updated during each training step (see the sketch after this list).

      +
      +
      +
    • +
    +
    +
    +
  • +
+
+
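As an illustration (PyTorch variant), the QAT mode is selected through the quant scheme passed when constructing the QuantSim model; the model and dummy input are placeholders, and the enum values below should be checked against the QuantSim API documentation for your AIMET version.

from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

# Regular QAT: encodings are computed once with a post-training scheme;
# activation scale/offset parameters stay fixed during fine-tuning
sim_regular = QuantizationSimModel(model, dummy_input=dummy_input,
                                   quant_scheme=QuantScheme.post_training_tf_enhanced)

# QAT with Range Learning: activation scale/offset parameters are also learned during training
sim_range_learning = QuantizationSimModel(model, dummy_input=dummy_input,
                                          quant_scheme=QuantScheme.training_range_learning_with_tf_init)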

Post-Training Quantization

+
    +
  • Post-Training Quantization (PTQ) Techniques:

    +
    +

    Post-training quantization techniques help a model improve quantized accuracy without needing to re-train.

    +
    +
    +
      +
    • +
      AutoQuant:

      AIMET provides an API that integrates the post-training quantization techniques described below. AutoQuant is +recommended for PTQ. If desired, individual techniques can be invoked using standalone feature specific APIs.

      +
      +
      +
    • +
    • +
      Adaptive Rounding (AdaRound):

      Determines optimal rounding for weight tensors to improve quantized performance.

      +
      +
      +
    • +
    • +
      Cross-Layer Equalization:

      Equalizes weight ranges in consecutive layers.

      +
      +
      +
    • +
    • +
      BN Re-estimation:

      Re-estimates Batch Norm layer statistics before folding the Batch Norm layers.

      +
      +
      +
    • +
    • +
      Bias Correction [Deprecated]:

      Bias Correction is considered deprecated. It is advised to use AdaRound instead.

      +
      +
      +
    • +
    +
    +
  • +
+
+
+

Debugging/Analysis Tools

+
+
+
    +
  • +
    Debugging/Analysis Tools
      +
    • +
      QuantAnalyzer:

      Automated debugging of the model to understand sensitivity to weight and/or activation quantization, individual +layer sensitivity, etc.

      +
      +
      +
    • +
    • +
      Visualizations:

      Visualizations and histograms of weight and activation ranges.

      +
      +
      +
    • +
    +
    +
    +
  • +
+
+
+
+

AIMET Quantization Workflow

+

This section describes the recommended workflow for quantizing a neural network.

+
+
../_images/quantization_workflow.PNG +
+

1. Model prep and validation

+

Before attempting quantization, ensure that models have been defined in accordance to model guidelines. These guidelines +depend on the ML framework the model is written in.

+
+

PyTorch

+
+
+
    +
  • Pytorch:

    +
    +

    PyTorch Model Guidelines

    +
    +

In the case of PyTorch, the Model Validator utility automates the checking of certain PyTorch model requirements, and the Model Preparer utility automates updating the model definition to align with certain requirements.

    +

    In this model prep and validation phase, we advise the following flow:

    +../_images/pytorch_model_prep_and_validate.PNG +

    Users can use the model validator utility first to check if the model can be run with AIMET. If validator checks +fail, users can first try using model preparer in their pipeline, an automated feature for updating models, and +retry the model validator to see if checks now pass. If the validator continues to print warnings, users will need +to update the model definition by hand prior to using AIMET features.

    +

    For more information on model validator and preparer, refer to the corresponding sections in +AIMET PyTorch Quantization APIs.

    +
    +
    +
  • +
+
+
+

Tensorflow

+
+
+
    +
  • +
    Tensorflow:

    TensorFlow Model Guidelines

    +
    +
    +
  • +
+

2. PTQ/AutoQuant

+

The user can apply various PTQ techniques to the model to adjust model parameters and make the model more robust to +quantization. We recommend trying AutoQuant first, a PTQ feature which internally tries various other PTQ methods and +finds the best combination of methods to apply. Refer to the +AIMET Quantization Features section for more details on PTQ/AutoQuant.

+

3. QAT

+

If model accuracy is still not satisfactory after PTQ/AutoQuant, the user can use QAT to fine-tune the model. Refer to +the AIMET Quantization Features section for more details on QAT.

+

4. Exporting models

+

In order to bring the model onto the target, users will need two things:

+
    +
  • a model with updated weights

  • +
  • an encodings file containing quantization parameters associated with each quantization op

  • +
+

AIMET QuantSim provides export functionality to generate both items. The exported model type will differ based on the ML +framework used:

+
    +
  • .onnx for PyTorch

  • +
  • meta/checkpoint for TensorFlow

  • +
  • .h5 and .pb for Keras

  • +
+

Depending on which AIMET Quantization features were used, the user may need to take different steps to export the model +and encodings file. For example, calling AutoQuant will automatically export the model and encodings file as part of its +processing. If QAT is used, users will need to call .export() on the QuantSim object. If lower level PTQ techniques like +CLE are used, users will need to first create a QuantSim object from the modified model, and then call .export() on the +QuantSim object.

+
+
+
+

Debugging Guidelines

+
+
+

Applying AIMET Quantization features may involve some trial and error in order to find the best optimizations to apply +on a particular model. We have included some debugging steps in the Quantization Guidebook +that can be tried when quantization accuracy does not seem to improve right off the bat.

+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/post_training_quant_techniques.html b/releases/1.32.2/torch_v2/user_guide/post_training_quant_techniques.html new file mode 100644 index 00000000..0e07f23e --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/post_training_quant_techniques.html @@ -0,0 +1,255 @@ + + + + + + AIMET Post-Training Quantization Techniques — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Post-Training Quantization Techniques

+
+

Overview

+

It is observed that some ML models show reduced inference accuracy when run on quantized hardware due to approximation noise. AIMET provides post-training quantization techniques that help adjust the parameters in the model such that the model becomes more quantization-friendly. AIMET post-training quantization techniques are designed to be applied to pre-trained ML models. These techniques are explained in the “Data-Free Quantization Through Weight Equalization and Bias Correction” paper from ICCV 2019 - https://arxiv.org/abs/1906.04721

+
+
+

User Flow

+
+
../_images/flow_diagram_cle.png +
+
    +
1. BatchNorm Folding: This feature folds batch-norm layers (if present) into surrounding layers.

    +
    +
    +

    +
    +
    +
  2. +
  3. Quantization Visualization: AIMET provides visualization tools that help guide the user to determine if AIMET post-training quantization techniques are useful for a given model. Specifically, the visualization tools will show per-channel ranges of parameters to highlight if there is big discrepancy in ranges between different channels in a layer.

    +
    +
    ../_images/cle_5.png +
    +

    +
    +
    +
  4. +
  5. Replace ReLU6 with ReLU: This feature replaces ReLU6 layers with ReLU layers. This is needed for the subsequent cross-layer scaling step. However, this replacement can lead to a drop in accuracy for some models. If this drop in accuracy is not acceptable, the user may be better off not using the post-training quantization techniques.

    +
    +
    +

    +
    +
    +
  6. +
7. Cross Layer Scaling: In some models, the parameter ranges for different channels in a layer show a wide variance. This feature attempts to equalize the parameter ranges across different channels. As seen below, the ranges of weights per channel in a layer can vary significantly. Cross-Layer Scaling scales the per-channel weights of consecutive layers. This helps increase the range for layers with a low range and reduce the range for layers with a high range. As a result, different channels have similar ranges, and the same quantization parameters can be used for weights across all channels (see the API sketch after this section).

    +
    +
    ../_images/cle_1.png +
    +

    As shown below, AIMET takes in a model and equalizes the distribution of weights per channel of consecutive layers. The scaling factor is calculated and used to scale the weights. The output of the model remains the same and the dynamic range of weight distribution is reduced.

    +
    +
    ../_images/cle_4.png +
    +

    +
    +
    +
  8. +
9. High Bias Fold: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer's parameters. High-bias fold requires batch-norm parameters to operate on. If the original model did not have batch-norm parameters for a given layer, the high-bias fold technique will not be applied to that layer.

    +
    +
    +

    +
    +
    +
  10. +
11. Bias Correction: Quantization sometimes leads to a shift in layer outputs. This technique helps correct the shift by adjusting the bias parameters of that layer. The bias parameter is iteratively corrected/updated for each layer. The layer whose bias has to be corrected, and all the layers above it, are quantized. Two techniques, namely Empirical Bias Correction and Analytical Bias Correction, are supported for bias correction.

  12. +
+

In the empirical bias correction technique, representative data is passed through both the FP32 model and the quantized model. Outputs for the layer to be corrected are extracted from both models and used to correct the bias parameter, as shown below for a single layer. This process continues for all layers in the model.

+../_images/bias_correction_empirical.png +

In analytical bias correction, data from BatchNorm layers, when present, is used for correction factor estimation instead of passing data through the model as in the empirical case.

+../_images/bias_correction_analytical.png +
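The end-to-end flow pictured above (batch norm folding, cross layer scaling and high bias folding) can typically be invoked through a single high-level API call. The snippet below is a minimal sketch for the PyTorch variant, assuming a torchvision ResNet18 and the aimet_torch equalize_model entry point; exact signatures are documented in the API links below.

import torch
from torchvision.models import resnet18
from aimet_torch.cross_layer_equalization import equalize_model

model = resnet18(pretrained=True).eval()

# Runs the high-level CLE flow (batch norm folding, cross layer scaling,
# high bias folding) in place on the model; input_shapes describes a
# representative input, assumed here to be (1, 3, 224, 224) for ResNet18.
equalize_model(model, input_shapes=(1, 3, 224, 224))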
+
+

Cross-Layer Equalization API

+

Please refer to the links below to view the Cross-Layer Equalization API for each AIMET variant:

+
    +
  • Cross-Layer Equalization for PyTorch

  • +
  • Cross-Layer Equalization for Tensorflow

  • +
  • Cross-Layer Equalization for Keras

  • +
  • Cross-Layer Equalization for ONNX

  • +
+
+
+

FAQs

+
    +
  1. +
    How many samples of data are required to perform Bias Correction?

    Bias Correction requires a representative set of data samples. We have observed that providing 500-1000 samples works well.

    +
    +
    +
  2. +
  3. +
    Which is better: Empirical Bias Correction or Analytical + Empirical Bias Correction?

    If speed is not a bottleneck, Empirical Bias Correction is suggested; otherwise, use the hybrid approach of combining both.

    +
    +
    +
  4. +
+
+
+

References

+
    +
  1. Markus Nagel, Mart van Baalen, Tijmen Blankevoort, Max Welling. “Data-Free Quantization Through Weight Equalization and Bias Correction.” IEEE International Conference on Computer Vision (ICCV), Seoul, October 2019.

  2. +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/quant_analyzer.html b/releases/1.32.2/torch_v2/user_guide/quant_analyzer.html new file mode 100644 index 00000000..6a0e90d2 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/quant_analyzer.html @@ -0,0 +1,254 @@ + + + + + + AIMET QuantAnalyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET QuantAnalyzer

+
+

Overview

+

The QuantAnalyzer feature analyzes the model for quantization and points out sensitive parts/hotspots in the model. +The analyses are performed automatically and only require the user to pass in callbacks for performing forward passes and evaluation, and optionally a dataloader for MSE loss analysis.

+

For each analysis, QuantAnalyzer outputs json and/or html files containing data and plots for easy visualization.

+
+
+

Requirements

+
+
To call the QuantAnalyzer API, users need to provide the following:
    +
  • An FP32 pretrained model for analysis

  • +
  • A dummy input for the model which can contain random values, but must match the shape of the model’s expected input

  • +
  • A user defined function for passing 500-1000 representative data samples through the model for quantization calibration.

  • +
  • A user defined function for passing labeled data through the model for evaluation, returning an accuracy metric

  • +
  • (Optional, for running MSE loss analysis) A dataloader providing unlabeled data to be passed through the model

  • +
+
+
+

Other quantization related settings are also provided in the call to analyze a model. +Please refer to PyTorch QuantAnalyzer API Docs for more information on how to call the QuantAnalyzer feature.

+

Note: Typically on quantized runtimes, batch normalization layers will be folded where possible. +So that users do not have to call a separate API to do so, QuantAnalyzer automatically performs Batch Norm Folding prior to running its analyses.
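As a rough sketch of how these requirements come together, the snippet below shows a typical way to construct and run the PyTorch QuantAnalyzer. The model, dummy_input, data loaders and the two callback functions (calibration_callback and eval_callback_fn) are placeholders for user code, and the exact signatures are in the PyTorch QuantAnalyzer API Docs referenced above.

from aimet_common.defs import QuantScheme
from aimet_common.utils import CallbackFunc
from aimet_torch.quant_analyzer import QuantAnalyzer

# Wrap the user-defined calibration and evaluation functions (placeholders here).
forward_pass_callback = CallbackFunc(calibration_callback, func_callback_args=unlabeled_data_loader)
eval_callback = CallbackFunc(eval_callback_fn, func_callback_args=labeled_data_loader)

quant_analyzer = QuantAnalyzer(model, dummy_input, forward_pass_callback, eval_callback)

# Optional: provide an unlabeled dataloader to enable the per layer MSE loss analysis.
quant_analyzer.enable_per_layer_mse_loss(unlabeled_data_loader, num_batches=4)

quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
                       default_param_bw=8,
                       default_output_bw=8,
                       results_dir="./quant_analyzer_results")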

+
+
+

Detailed Analysis Descriptions

+

QuantAnalyzer performs the following analyses:

+
    +
  1. +
    Sensitivity analysis to weight and activation quantization:

    QuantAnalyzer compares the accuracies of the original FP32 model, an activation-only quantized model, and a weight-only quantized model.

    +

    This helps users determine which AIMET quantization technique(s) will be more beneficial for the model. +For example, in situations where the model is more sensitive to activation quantization, PTQ techniques like Adaptive Rounding or Cross Layer Equalization might not be very helpful.

    +

    Accuracy values for each model are printed as part of AIMET logging.

    +
    +
    +
  2. +
  3. +
    Per layer quantizer enablement analysis:

    Sometimes the accuracy drop incurred from quantization can be attributed to only a subset of quantizers within the model. +QuantAnalyzer performs analyses to find such layers by enabling and disabling individual quantizers to observe how the model accuracy changes.

    +

    The following two types of quantizer enablement analyses are performed:

    +
      +
    1. Disable all quantizers across the model and, for each layer, enable only that layer’s output quantizer and perform evaluation with the provided callback. +This results in accuracy values obtained for each layer in the model when only that layer’s quantizer is enabled, allowing users to observe effects of individual layer quantization and pinpoint culprit layer(s) and hotspots.

    2. +
    3. Enable all quantizers across the model and, for each layer, disable only that layer’s output quantizer and perform evaluation with the provided callback. +Once again, accuracy values are produced for each layer in the model when only that layer’s quantizer is disabled.

    4. +
    +

    As a result of these analyses, AIMET outputs per_layer_quant_enabled.html and per_layer_quant_disabled.html respectively, containing plots mapping layers on the x-axis to model accuracy on the y-axis.

    +

    JSON files per_layer_quant_enabled.json and per_layer_quant_disabled.json are also produced, containing the data shown in the .html plots.

    +
    +
    +
  4. +
  5. +
    Per layer encodings min-max range analysis:

    As part of quantization, encoding parameters for each quantizer must be obtained. +These parameters include scale, offset, min, and max, and are used for mapping floating point values to quantized integer values.

    +

    QuantAnalyzer tracks the min and max encoding parameters computed by each quantizer in the model as a result of forward passes through the model with representative data (from which the scale and offset values can be directly obtained).

    +

    As a result of this analysis, AIMET outputs html plots and json files for each activation quantizer and each parameter quantizer (contained in the min_max_ranges folder), containing the encoding min/max values for each.

    +

    If Per Channel Quantization (PCQ) is enabled, encoding min and max values for all the channels of each weight will be shown.

    +
    +
    +
  6. +
  7. +
    Per layer statistics histogram:

    Under the TF Enhanced quantization scheme, encoding min/max values for each quantizer are obtained by collecting a histogram of tensor values seen at that quantizer and potentially tossing out outliers.

    +

    When this quantization scheme is selected, QuantAnalyzer will output plots for each quantizer in the model, displaying the histogram of tensor values seen at that quantizer. +These plots are available as part of the activations_pdf and weights_pdf folders, containing a separate .html plot for each quantizer.

    +
    +
    +
  8. +
  9. +
    Per layer MSE loss:

    An optional analysis QuantAnalyzer can do is to monitor each layer’s output in the original FP32 model as well as the corresponding layer output in the quantized model, and calculate the MSE loss between the two. +This helps identify which layers may contribute more to quantization noise.

    +

    To enable this optional analysis, users need to pass in a dataloader for QuantAnalyzer to read from. +Approximately 256 samples/images are sufficient.

    +

    A per_layer_mse_loss.html file will be generated containing a plot mapping layer quantizers on the x-axis to MSE loss on the y-axis. +A corresponding per_layer_mse_loss.json file will also be generated containing data corresponding to the .html file.

    +
    +
    +
  10. +
+
+
+

QuantAnalyzer API

+

Please refer to the links below to view the QuantAnalyzer API for each AIMET variant:

+
    +
  • QuantAnalyzer for PyTorch

  • +
  • QuantAnalyzer for Tensorflow

  • +
  • QuantAnalyzer for Keras

  • +
  • QuantAnalyzer for ONNX

  • +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/quantization_aware_training.html b/releases/1.32.2/torch_v2/user_guide/quantization_aware_training.html new file mode 100644 index 00000000..c3f0540c --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/quantization_aware_training.html @@ -0,0 +1,225 @@ + + + + + + AIMET Quantization Aware Training — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Quantization Aware Training

+
+

Overview

+

In cases where PTQ techniques are not sufficient for mitigating quantization error, users can use quantization-aware +training (QAT). QAT models the quantization noise during training and allows the model to find better solutions +than post-training quantization. However, the higher accuracy comes with the usual costs of neural +network training, i.e. longer training times, need for labeled data and hyperparameter search.

+
+
+

QAT workflow

+

The QAT workflow is largely similar to the flow for using Quantization Simulation for inference. The only difference is +that a user can take the sim.model and use it in their training pipeline in order to fine-tune model parameters while +taking quantization noise into account. The user’s training pipeline will not need to change in order to train the +sim.model compared to training the original model.

+

A typical pipeline is as follows:

+
    +
  1. Create a QuantSim sim object from a pretrained model.

  2. +
  3. Calibrate the sim using representative data samples to come up with initial encoding values for each quantizer node.

  4. +
  5. Pass the sim.model into a training pipeline to fine-tune the model parameters.

  6. +
  7. Evaluate the sim.model using an evaluation pipeline to check whether model accuracy has improved.

  8. +
  9. Export the sim to generate a model with updated weights and no quantization nodes, along with the accompanying +encodings file containing quantization scale/offset parameters for each quantization node.

  10. +
+

Observe that as compared to QuantSim inference, step 3 is the only addition when performing QAT.
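A minimal sketch of this pipeline for the PyTorch variant is shown below. It assumes a torchvision ResNet18 and feeds random tensors only to keep the example self-contained; in practice the calibration callback should pass representative data, and steps 3 and 4 are the user's existing training and evaluation loops.

import torch
from torchvision.models import resnet18
from aimet_torch.quantsim import QuantizationSimModel

model = resnet18(pretrained=True).eval()
dummy_input = torch.randn(1, 3, 224, 224)

# Step 1: create the QuantSim object from a pretrained model
sim = QuantizationSimModel(model, dummy_input=dummy_input)

# Step 2: calibrate with representative data to compute initial encodings
def pass_calibration_data(sim_model, _):
    sim_model.eval()
    with torch.no_grad():
        for _ in range(16):                      # replace with real calibration samples
            sim_model(torch.randn(8, 3, 224, 224))

sim.compute_encodings(pass_calibration_data, forward_pass_callback_args=None)

# Step 3: fine-tune sim.model with the existing training pipeline (placeholder)
# train(sim.model, train_loader, epochs=15)

# Step 4: evaluate sim.model with the existing evaluation pipeline (placeholder)
# accuracy = evaluate(sim.model, val_loader)

# Step 5: export the model with updated weights plus the encodings file
sim.export(path="./output", filename_prefix="resnet18_after_qat", dummy_input=dummy_input)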

+
+
+

QAT modes

+

There are two variants of QAT, referred to as QAT without Range Learning and QAT with Range Learning.

+

In QAT without Range Learning, encoding values for activation quantizers are found once in the beginning during the +calibration step after QuantSim has been instantiated, and are not updated again subsequently throughout training.

+

In QAT with Range Learning, encoding values for activation quantizers are initially set during the calibration step, but +are free to update during training, allowing a more optimal set of scale/offset quantization parameters to be found +as training takes place.

+

In both variants, parameter quantizer encoding values will continue to update in accordance with the parameters +themselves updating during training.

+
+
+

Recommendations for Quantization-Aware Training

+

Here are some general guidelines that can aid in improving performance or faster convergence with Quantization-aware Training (QAT):

+
    +
  • +
    Initialization:
      +
    • Often it can be beneficial to first apply post training quantization techniques like AutoQuant before applying QAT. +This is especially beneficial if there is large drop in INT8 performance compared to the FP32 baseline.

    • +
    +
    +
    +
  • +
  • +
    Hyper-parameters:
      +
    • Number of epochs: 15-20 epochs are generally sufficient for convergence

    • +
    • Learning rate: Comparable (or one order higher) to the FP32 model’s final learning rate at convergence. +Results in AIMET were obtained with learning rates on the order of 1e-6.

    • +
    • Learning rate schedule: Divide learning rate by 10 every 5-10 epochs

    • +
    +
    +
    +
  • +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/quantization_configuration.html b/releases/1.32.2/torch_v2/user_guide/quantization_configuration.html new file mode 100644 index 00000000..dcd1c714 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/quantization_configuration.html @@ -0,0 +1,447 @@ + + + + + + Quantization Simulation Configuration — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantization Simulation Configuration

+
+

Overview

+

AIMET allows the configuration of quantizer placement and settings in accordance with a set of rules specified in a json configuration file, applied when the Quantization Simulation API is called.

+

Settings such as quantizer enablement, per channel quantization, symmetric quantization, and specifying fused ops when quantizing can be configured. +The general use case for this file would be for users to match the quantization rules for a particular runtime they would like to simulate.

+

For examples on how to provide a specific configuration file to AIMET Quantization Simulation, +refer to the API docs for PyTorch Quantsim, TensorFlow Quantsim, and Keras Quantsim.

+

It is advised for the user to begin with the default configuration file under

+

aimet_common/quantsim_config/default_config.json

+

For most users of AIMET, no additional changes to the default configuration file should be needed.
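For reference, the configuration file is passed to Quantization Simulation when the sim object is constructed. The sketch below is for the PyTorch variant and assumes 'model' and 'dummy_input' already exist; the other AIMET variants accept an equivalent config_file argument (see the API docs linked above).

from aimet_torch.quantsim import QuantizationSimModel

# 'model' and 'dummy_input' are placeholders for the user's model and a representative input.
sim = QuantizationSimModel(model,
                           dummy_input=dummy_input,
                           config_file="aimet_common/quantsim_config/default_config.json")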

+
+
+

Configuration File Structure

+

The configuration file contains six main sections, in increasing amounts of specificity:

+../_images/quantsim_config_file.png +

Rules defined in a more general section can be overruled by subsequent rules defined in a more specific case. +For example, one may specify in “defaults” that no layers should be quantized, but then turn on quantization for specific layers in the “op_type” section.

+
+
+

How to configure individual Configuration File Sections

+

When working with a new runtime with different rules, or for experimental purposes, users can refer to this section to understand how to configure individual sections in a configuration file.

+
    +
  1. defaults:

    +
    +
    {"defaults": {
    +    "ops": {                                # Required dictionary, but can be empty
    +        "is_output_quantized": "True",      # Optional: Possible settings: True
    +        "is_symmetric": "False"             # Optional: Possible settings: True, False
    +    },
    +    "params": {                             # Required dictionary, but can be empty
    +        "is_quantized": "True",             # Optional: Possible settings: True, False
    +        "is_symmetric": "True"              # Optional: Possible settings: True, False
    +    },
    +    "strict_symmetric": "False",            # Optional: Possible settings: True, False
    +    "unsigned_symmetric": "True",           # Optional: Possible settings: True, False
    +    "per_channel_quantization": "False"     # Optional: Possible settings: True, False
    +    },
    +
    +
    +

    In the defaults section, it is required to include an “ops” dictionary and a “params” dictionary (though these dictionaries may be empty).

    +

    The “ops” dictionary holds settings that will apply to all activation quantizers in the model. +In this section, the following settings are available:

    +
    +
      +
    • +
      is_output_quantized:

      An optional parameter. If included, it must be set to “True”. +Including this setting will turn on all output activation quantizers by default. +If not specified, all activation quantizers will start off as disabled.

      +

      For cases when the runtime quantizes input activations, we typically see this only done for certain op types. +Configuring these settings for specific op types is covered in sections further below.

      +
      +
      +
    • +
    • +
      is_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will place all activation quantizers in symmetric mode by default. +A “False” setting, or omitting the parameter altogether, will set all activation quantizers to asymmetric mode by default.

      +
      +
      +
    • +
    +
    +

    The “params” dictionary holds settings that will apply to all parameter quantizers in the model. +In this section, the following settings are available:

    +
    +
      +
    • +
      is_quantized:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will turn on all parameter quantizers by default. +A “False” setting, or omitting the parameter altogether, will disable all parameter quantizers by default.

      +
      +
      +
    • +
    • +
      is_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will place all parameter quantizers in symmetric mode by default. +A “False” setting, or omitting the parameter altogether, will set all parameter quantizers to asymmetric mode by default.

      +
      +
      +
    • +
    +
    +

    Aside from the “ops” and “params” dictionary, additional settings governing quantizers in the model are available:

    +
      +
    • +
      strict_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +When set to “True”, quantizers which are configured in symmetric mode will use strict symmetric quantization. +When set to “False” or omitting the parameter altogether, quantizers which are configured in symmetric mode will not use strict symmetric quantization.

      +
      +
      +
    • +
    • +
      unsigned_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +When set to “True”, quantizers which are configured in symmetric mode will use unsigned symmetric quantization when available. +When set to “False” or omitting the parameter altogether, quantizers which are configured in symmetric mode will not use unsigned symmetric quantization.

      +
      +
      +
    • +
    • +
      per_channel_quantization:

      An optional parameter. If included, possible settings include “True” and “False”. +When set to “True”, parameter quantizers will use per channel quantization as opposed to per tensor quantization. +When set to “False” or omitting the parameter altogether, parameter quantizers will use per tensor quantization.

      +
      +
      +
    • +
    +
    +
  2. +
  3. params:

    +
    +
        "params": {                         # Can specify 0 or more param types
    +        "weight": {
    +            "is_quantized": "True",     # Optional: Possible settings: True, False
    +            "is_symmetric": "True"      # Optional: Possible settings: True, False
    +        }
    +    },
    +
    +
    +

    In the params section, settings can be configured for certain types of parameters throughout the model. +For example, adding settings for “weight” will affect all parameters of type “weight” in the model. +Currently supported parameter types include:

    +
    +
      +
    • weight

    • +
    • bias

    • +
    +
    +

    For each parameter type, the following settings are available:

    +
    +
      +
    • +
      is_quantized:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will turn on all parameter quantizers of that type. +A “False” setting, will disable all parameter quantizers of that type. +By omitting the setting, the parameter will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    • +
      is_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will place all parameter quantizers of that type in symmetric mode. +A “False” setting will place all parameter quantizers of that type in asymmetric mode. +By omitting the setting, the parameter will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    +
    +
    +
  4. +
  5. op_type:

    +
    +
        "op_type": {                                # Can specify 0 or more ONNX op types
    +        "Gemm": {
    +            "is_input_quantized": "True",       # Optional: Possible settings: True
    +            "is_output_quantized": "False",     # Optional: Possible settings: True, False
    +            "per_channel_quantization": "True", # Optional: Possible settings: True, False
    +            "params": {                         # Optional, can specify 1 or more param types
    +                "weight": {
    +                    "is_quantized": "True",     # Optional: Possible settings: True, False
    +                    "is_symmetric": "True"      # Optional: Possible settings: True, False
    +                }
    +            },
    +        },
    +    },
    +
    +
    +

    In the op type section, settings affecting particular op types can be specified. +The configuration file recognizes ONNX op types, and will internally map the type to a PyTorch or TensorFlow op type +depending on which framework is used.

    +

    For each op type, the following settings are available:

    +
    +
      +
    • +
      is_input_quantized:

      An optional parameter. If included, it must be set to “True”. +Including this setting will turn on input quantization for all ops of this op type. +Omitting the setting will keep input quantization disabled for all ops of this op type.

      +
      +
      +
    • +
    • +
      is_output_quantized:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will turn on output quantization for all ops of this op type. +A “False” setting will disable output quantization for all ops of this op type. +By omitting the setting, output quantizers of this op type will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    • +
      is_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will place all quantizers of this op type in symmetric mode. +A “False” setting will place all quantizers of this op type in asymmetric mode. +By omitting the setting, quantizers of this op type will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    • +
      per_channel_quantization:

      An optional parameter. If included, possible settings include “True” and “False”. +When set to “True”, parameter quantizers of this op type will use per channel quantization as opposed to per tensor quantization. +When set to “False”, parameter quantizers of this op type will use per tensor quantization. +By omitting the setting, parameter quantizers of this op type will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    +
    +

    For a particular op type, settings for particular parameter types can also be specified. +For example, specifying settings for weight parameters of a Conv op type will affect only Conv weights and not weights +of Gemm op types.

    +

    To specify settings for param types of this op type, include a “params” dictionary under the op type. +Settings for this section follow the same convention as settings for parameter types in the preceding “params” section, however will only affect parameters for this op type.

    +
    +
  6. +
  7. supergroups:

    +
    +
        "supergroups": [    # Can specify 0 or more supergroup lists made up of ONNX op types
    +        {
    +            "op_list": ["Conv", "Relu"]
    +        },
    +        {
    +            "op_list": ["Conv", "Clip"]
    +        },
    +        {
    +            "op_list": ["Add", "Relu"]
    +        },
    +        {
    +            "op_list": ["Gemm", "Relu"]
    +        }
    +    ],
    +
    +
    +

    Supergroups are a sequence of operations which are fused during quantization, meaning no quantization noise is introduced between members of the supergroup. +For example, specifying [“Conv”, “Relu”] as a supergroup disables quantization between any adjacent Conv and Relu ops in the model.

    +

    When searching for supergroups in the model, only sequential groups of ops with no branches in between will be matched with supergroups defined in the list. +Using [“Conv”, “Relu”] as an example, if there was a Conv op in the model whose output is used by both a Relu op and a second op, the supergroup would not take effect for these Conv and Relu ops.

    +

    To specify supergroups in the config file, add each entry as a list of op type strings. +The configuration file recognizes ONNX op types, and will internally map the types to PyTorch or TensorFlow op types depending on which framework is used.

    +
    +
  8. +
  9. model_input:

    +
    +
        "model_input": {
    +        "is_input_quantized": "True"    # Optional: Possible settings: True
    +    },
    +
    +
    +

    The “model_input” section is used to configure the quantization of inputs to the model. +In this section, the following setting is available:

    +
      +
    • +
      is_input_quantized:

      An optional parameter. If included, it must be set to “True”. +Including this setting will turn on quantization for input quantizers to the model. +Omitting the setting will keep input quantizers set to whatever setting they were in as a result of applying configurations from earlier sections.

      +
      +
      +
    • +
    +
    +
  10. +
  11. model_output:

    +
    +
        "model_output": {
    +        "is_output_quantized": "True"   # Optional: Possible settings: True
    +    }
    +
    +
    +

    The “model_output” section is used to configure the quantization of outputs of the model. +In this section, the following setting is available:

    +
      +
    • +
      is_output_quantized:

      An optional parameter. If included, it must be set to “True”. +Including this setting will turn on quantization for output quantizers of the model. +Omitting the setting will keep output quantizers set to whatever setting they were in as a result of applying configurations from earlier sections.

      +
      +
      +
    • +
    +
    +
  12. +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/quantization_feature_guidebook.html b/releases/1.32.2/torch_v2/user_guide/quantization_feature_guidebook.html new file mode 100644 index 00000000..dd1ce367 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/quantization_feature_guidebook.html @@ -0,0 +1,210 @@ + + + + + + AIMET Quantization Features Guidebook — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Quantization Features Guidebook

+

AIMET supports various neural network quantization techniques. A more in-depth discussion on various techniques and +their usage is provided in User Guide

+

After applying an AIMET Quantization feature, if the model’s performance is still not satisfactory, we recommend a set +of diagnostics steps to identify the bottlenecks and improve the performance. While this is not strictly an algorithm, +these debugging steps can provide insights on why a quantized model underperforms and help to tackle the underlying +issues. These steps are shown as a flow chart in figure 9 and are described in more detail below:

+

FP32 sanity check +An important initial debugging step is to ensure that the floating-point and quantized model behave similarly in the +forward pass, especially when using custom quantization pipelines. Set the quantized model bit-width to 32 bits for +both weights and activations, or bypass the quantization operation, if possible, and check that the accuracy matches +that of the FP32 model.

+

Weights or activations quantization +The next debugging step is to identify how activation or weight quantization impacts the performance independently. Does +performance recover if all weights are quantized to a higher bit-width while activations are kept at a lower bit-width, +or conversely if all activations use a high bit-width and weights a low bit-width? This step can show the relative +contribution of activation and weight quantization to the overall performance drop and point us towards the +appropriate solution.

+

Fixing weight quantization +If the previous step shows that weight quantization causes a significant accuracy drop, then there are a few solutions +to try: +1. Apply CLE if not already implemented, especially for models with depth-wise separable convolutions. +2. Try per-channel quantization. This will address the issue of uneven per-channel weight distribution. +3. Apply bias correction or AdaRound if calibration data is available.

+../_images/quantization_debugging_flow_chart.png +

Fixing activation quantization +To reduce the quantization error from activation quantization, we can also try using different range setting methods or +adjust CLE to take activation quantization ranges into account, as vanilla CLE can lead to uneven activation +distribution.

+

Per-layer analysis +If the global solutions have not restored accuracy to acceptable levels, we consider each quantizer individually. We set +each quantizer sequentially to the target bit-width while keeping the rest of the network at 32 bits +(see the inner for loop in the figure above).

+

Visualizing layers +If the quantization of an individual tensor leads to a significant accuracy drop, we recommend visualizing the tensor +distribution at different granularities, e.g. per-channel as in figure 5, and dimensions, e.g., per-token or per-embedding +for activations in BERT.

+

Fixing individual quantizers +The visualization step can reveal the source of the tensor’s sensitivity to quantization. Some common solutions involve +custom range setting for this quantizer or allowing a higher bit-width for the problematic quantizer. If the problem is +fixed and the accuracy recovers, we continue to the next quantizer. If not, we may have to resort to other methods, +such as quantization-aware training (QAT).

+

After completing the above steps, the last step is to quantize the complete model to the desired bit-width. If the +accuracy is acceptable, we have our final quantized model ready to use. Otherwise, we can consider higher bit-widths and +smaller granularities or revert to more powerful quantization methods, such as quantization-aware training.

+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/quantization_sim.html b/releases/1.32.2/torch_v2/user_guide/quantization_sim.html new file mode 100644 index 00000000..295ae509 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/quantization_sim.html @@ -0,0 +1,312 @@ + + + + + + AIMET Quantization Simulation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Quantization Simulation

+
+

Overview

+

AIMET’s Quantization Simulation feature provides functionality to simulate the effects of quantized hardware. This +allows the user to then apply post-training and/or fine-tuning techniques in AIMET to recover the loss in accuracy, and +ultimately deploy the model on the target device.

+

When applying QuantSim by itself, optimal quantization scale/offset parameters for each quantizer are found, but no +techniques for mitigating accuracy loss from quantization are applied. Users can either pass their original model +directly to QuantSim to simulate quantization noise on the starting model, or apply Post-Training Quantization +techniques to obtain an updated model to then pass into QuantSim to observe a difference in quantization accuracy as a +result of applying the techniques.

+

Once a QuantSim object has been created, users can fine-tune the model within the QuantSim object using their +existing pipeline. This method is described in the Quantization Aware Training page.

+

The quantization nodes used in QuantSim are custom quantizers defined in AIMET, and are not recognized by targets. +QuantSim provides an export functionality that will save a copy of the model with quantization nodes removed, as well as +generate an encodings file containing quantization scale/offset parameters for each activation and weight tensor in +the model.

+

A hardware runtime can ingest the encodings file and match it with the exported model to find what scale/offset values +to apply on each tensor in the model.

+
+
+

QuantSim Workflow

+

A typical workflow for using AIMET quantization simulation to simulate on-target quantized accuracy is described below.

+
    +
  1. The user starts with a pretrained floating-point FP32 model.

  2. +
  3. AIMET creates a simulation model by inserting quantization simulation ops into the model graph as explained in the +sub-section below.

  4. +
  5. AIMET also configures the inserted simulation ops. The configuration of these ops can be controlled via a +configuration file as discussed in sub-section below.

  6. +
  7. AIMET finds optimal quantization parameters, such as scale/offsets, for the inserted quantization simulation ops. To +do this, AIMET requires the user to provide a callback method that feeds a few representative data samples through +the model. These samples can either be from the training or calibration datasets. Generally, samples on the order of +1,000-2,000 have been sufficient for AIMET to find optimal quantization parameters.

  8. +
  9. AIMET returns a quantization simulation model that can be used as a drop-in replacement for the original model in +their evaluation pipeline. Running this simulation model through the evaluation pipeline yields a quantized accuracy +metric that closely simulates on-target accuracy.

  10. +
  11. The user can call .export() on the sim object to save a copy of the model with quantization nodes removed, along with +an encodings file containing quantization scale/offset parameters for each activation and weight tensor in the model.

  12. +
+
+
+

Simulating Quantization Noise

+

The diagram below explains how quantization noise is introduced to a model when its input, output or parameters are +quantized and dequantized.

+
+
../_images/quant_3.png +
+

Since the dequantized value may not be exactly the same as the original floating-point value, the difference between the two values is the +quantization noise.
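The quantize-dequantize operation can be sketched in a few lines of PyTorch to make the noise concrete; the encoding values below (delta, offset, bit-width) are purely illustrative.

import torch

x = torch.tensor([0.1234, -0.3071, 0.7289])      # original FP32 values
delta, offset, bitwidth = 0.01, 100, 8           # illustrative encoding

q = torch.clamp(torch.round(x / delta) + offset, 0, 2 ** bitwidth - 1)   # quantize
x_dq = (q - offset) * delta                                              # dequantize

quantization_noise = x_dq - x    # approx. tensor([-0.0034, -0.0029,  0.0011])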

+

In order to simulate quantization noise, AIMET QuantSim adds quantizer ops to the PyTorch/TensorFlow/Keras model graph. +The resulting model graph can be used as is in the user’s evaluation or training pipeline.

+
+
+

Determining Quantization Parameters (Encodings)

+

Using a QuantSim model, AIMET analyzes and determines the optimal quantization encodings (scale and offset parameters) +for each quantizer op.

+

To do this, AIMET passes some calibration samples through the model. Using hooks, tensor data is intercepted while +flowing through the model. A histogram is created to model the distribution of the floating point numbers in the output +tensor for each layer.

+../_images/quant_2.png +

Using the distribution of the floating point numbers in the output tensor for each layer, quantization encodings are +computed using the specified quantization calibration technique. An encoding for a layer consists of four numbers:

+
    +
  • Min (qmin): Numbers below these are clamped

  • +
  • Max (qmax): Numbers above these are clamped

  • +
  • Delta: Granularity of the fixed point numbers (is a function of the bit-width selected)

  • +
  • Offset: Offset from zero

  • +
+
+
The Delta and Offset can be calculated using Min and Max and vice versa using the equations:

\(\textrm{Delta} = \dfrac{\textrm{Max} - \textrm{Min}}{{2}^{\textrm{bitwidth}} - 1} \quad \textrm{Offset} = \dfrac{-\textrm{Min}}{\textrm{Delta}}\)
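As an illustrative example with made-up numbers: if an 8-bit quantizer observes Min = -1.0 and Max = 1.55 during calibration, then Delta = (1.55 - (-1.0)) / (2^8 - 1) = 0.01 and Offset = -(-1.0) / 0.01 = 100, so the real value -1.0 maps to integer 0, 0.0 maps to 100, and 1.55 maps to 255.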

+
+
+
+
+

Quantization Schemes

+

AIMET supports various techniques for coming up with min and max values for encodings, also called quantization schemes:

+
    +
  • Min-Max: Also referred to as “TF” in AIMET (The name TF represents the origin of this technique and +has no relation to what framework the user is using). To cover the whole dynamic range of the tensor, we can define +the quantization parameters Min and Max to be the observed Min and Max during the calibration process. This leads to +no clipping error. However, this approach is sensitive to outliers, as strong outliers may cause excessive rounding +errors.

  • +
  • Signal-to-Quantization-Noise (SQNR): Also referred to as “TF Enhanced” in AIMET (The name TF +represents the origin of this technique and has no relation to what framework the user is using). The SQNR approach is +similar to the Mean Square Error (MSE) minimization approach. In the SQNR range setting method, we find qmin and qmax +that minimize the total MSE between the original and the quantized tensor. Quantization noise and saturation noise are +different types of errors which are weighted differently.

  • +
+

For each quantization scheme, there are “post training” and “training range learning” variants. The “post training” +variants are used during regular QuantSim inference as well as QAT without Range Learning, to come up with initial +encoding values for each quantization node. In QAT without Range Learning, encoding values for activation quantizers +will remain static (encoding values for parameter quantizers will change in accordance with changing parameter values +during training).

+

The “training range learning” variants are used during QAT with Range Learning. The schemes define how to come up with +initial encoding values for each quantization node, but also allow encoding values for activations to be learned +alongside parameter quantizer encodings during training.

+

For more details on QAT, refer to Quantization Aware Training.
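In the PyTorch variant, the quantization scheme is selected when constructing the QuantSim object, as sketched below; 'model' and 'dummy_input' are placeholders for user code.

from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

# QuantScheme.post_training_tf selects the Min-Max ("TF") scheme,
# QuantScheme.post_training_tf_enhanced selects the SQNR ("TF Enhanced") scheme,
# and the training_range_learning_* variants are used for QAT with Range Learning.
sim = QuantizationSimModel(model,
                           dummy_input=dummy_input,
                           quant_scheme=QuantScheme.post_training_tf_enhanced)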

+
+
+

Configuring Quantization Simulation Ops

+

Different hardware and on-device runtimes may support different quantization choices for neural network inference. For +example, some runtimes may support asymmetric quantization for both activations and weights, whereas other ones may +support asymmetric quantization just for weights.

+

As a result, we need to make quantization choices during simulation that best reflect our target runtime and hardware. +AIMET provides a default configuration file, which can be modified. This file is used during quantization simulation if +no other configuration file is specified. By default, the following configuration is used for quantization simulation:

+
    +
  • Weight quantization: Per-channel, symmetric quantization, INT8

  • +
  • Activation or layer output quantization: Per-tensor, asymmetric quantization, INT8

  • +
+

Quantization options that can be controlled via the configuration file include the following:

+
    +
  • Enabling/disabling of input and output quantizer ops

  • +
  • Enabling/disabling of parameter quantizer ops

  • +
  • Enabling/disabling of model input quantizer

  • +
  • Enabling/disabling of model output quantizer

  • +
  • Symmetric/Asymmetric quantization

  • +
  • Unsigned/signed symmetric quantization

  • +
  • Strict/non strict symmetric quantization

  • +
  • Per channel/per tensor quantization

  • +
  • Defining groups of layers to be fused (no quantization done on intermediate tensors within fused layers)

  • +
+

Please see the Quantization Simulation Configuration page which describes the configuration +options in detail.

+
+
+

Quantization Simulation APIs

+

Please refer to the links below to view the Quantization Simulation API for each AIMET variant:

+
    +
  • Quantization Simulation for PyTorch

  • +
  • Quantization Simulation for Tensorflow

  • +
  • Quantization Simulation for Keras

  • +
  • Quantization Simulation for ONNX

  • +
+
+
+

Frequently Asked Questions

+
    +
  • +
    Q: How many samples are needed in the calibration step (compute encodings)?

    A: 1,000 - 2,000 unlabeled representative data samples are sufficient.

    +
    +
    +
  • +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/release_notes.html b/releases/1.32.2/torch_v2/user_guide/release_notes.html new file mode 100644 index 00000000..77123754 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/release_notes.html @@ -0,0 +1,432 @@ + + + + + + AIMET Release Notes — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Release Notes

+

Release Notes for Qualcomm AI Model Efficiency ToolKit (AIMET)

+
+

1.22.2

+

Tensorflow

+
    +
  • Added support for supergroups : MatMul + Add

  • +
  • Added support for TF-Slim BN name with backslash

  • +
  • Added support for Depthwise + Conv in CLS

  • +
+

Documentation

+ +
+
+

1.22.1

+
    +
  • Added support for QuantizableMultiHeadAttention for PyTorch nn.transformer layers by @quic-kyuykim

  • +
  • Support functional conv2d in model preparer by @quic-kyuykim

  • +
  • Enable qat with multi gpu by @quic-mangal

  • +
  • Optimize forward pass logic of PyTorch QAT 2.0 by @quic-geunlee

  • +
  • Fix functional depthwise conv support on model preparer by @quic-kyuykim

  • +
  • Fix bug in model validator to correctly identify functional ops in leaf module by @quic-klhsieh

  • +
  • Support dynamic functional conv2d in model preparer by @quic-kyuykim

  • +
  • Added updated default runtime config, also a per-channel one. Fixed n… by @quic-akhobare

  • +
  • Include residing module info in model validator by @quic-klhsieh

  • +
  • Support for Keras MultiHeadAttention Layer by @quic-ashvkuma

  • +
+

Documentation

+ +
+
+

1.22.0

+
    +
  • Support for simulation and QAT for PyTorch transformer models (including support for torch.nn mha and encoder layers)

  • +
+

Documentation

+ +
+
+

1.21.0

+
    +
  • New feature: PyTorch QuantAnalyzer - Visualize per-layer sensitivity and per-quantizer PDF histograms

  • +
  • New feature: TensorFlow AutoQuant - Automatically apply various AIMET post-training quantization techniques

  • +
  • PyTorch QAT with Range Learning: Added support for Per Channel Quantization

  • +
  • PyTorch: Enabled exporting of encodings for multi-output leaf module

  • +
  • +
    TensorFlow Adaround
      +
    • Added ability to use configuration file in API to adapt to a specific runtime target

    • +
    • Added Per-Channel Quantization support

    • +
    +
    +
    +
  • +
  • TensorFlow QuantSim: Added support for FP16 inference and QAT

  • +
  • +
    TensorFlow Per Channel Quantization
      +
    • Fixed speed and accuracy issues

    • +
    • Fixed zero accuracy for 16-bits per channel quantization

    • +
    • Added support for DepthWise Conv2d Op

    • +
    +
    +
    +
  • +
  • Multiple other bug fixes

  • +
+

Documentation

+ +
+ +
+

1.19.1.py37

+
    +
  • PyTorch: Added CLE support for Conv1d, ConvTranspose1d and Depthwise Separable Conv1d layers

  • +
  • PyTorch: Added High-Bias Fold support for Conv1D layer

  • +
  • PyTorch: Modified Elementwise Concat Op to support any number of tensors

  • +
  • Minor dependency fixes

  • +
+

Documentation

+ +
+
+

1.19.1

+
    +
  • PyTorch: Added CLE support for Conv1d, ConvTranspose1d and Depthwise Separable Conv1d layers

  • +
  • PyTorch: Added High-Bias Fold support for Conv1D layer

  • +
  • PyTorch: Modified Elementwise Concat Op to support any number of tensors

  • +
  • Minor dependency fixes

  • +
+

Documentation

+ +
+
+

1.18.0.py37

+
    +
  • Multiple bug fixes

  • +
  • Additional feature examples for PyTorch and TensorFlow

  • +
+

Documentation

+ +
+
+

1.18.0

+
    +
  • Multiple bug fixes

  • +
  • Additional feature examples for PyTorch and TensorFlow

  • +
+

Documentation

+ +
+
+

1.17.0.py37

+
    +
  • Add Adaround TF feature

  • +
  • Added Examples for Torch quantization, and Channel Pruning & Spatial SVD compression

  • +
+

Documentation

+ +
+
+

1.17.0

+
    +
  • Add Adaround TF feature

  • +
  • Added Examples for Torch quantization, and Channel Pruning & Spatial SVD compression

  • +
+

Documentation

+ +
+
+

1.16.2.py37

+
    +
  • Added a new post-training quantization feature called AdaRound, which stands for AdaptiveRounding

  • +
  • Quantization simulation and QAT now also support recurrent layers (RNN, LSTM, GRU)

  • +
+

Documentation

+ +
+
+

1.16.2

+
    +
  • Added a new post-training quantization feature called AdaRound, which stands for AdaptiveRounding

  • +
  • Quantization simulation and QAT now also support recurrent layers (RNN, LSTM, GRU)

  • +
+

Documentation

+ +
+ +
+

1.16.1

+
    +
  • Added separate packages for CPU and GPU models. This allows users with CPU-only hosts to run AIMET.

  • +
  • Added separate packages for PyTorch and TensorFlow. Reduces the number of dependencies that users would need to install.

  • +
+

Documentation

+ +
+ + + +
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/spatial_svd.html b/releases/1.32.2/torch_v2/user_guide/spatial_svd.html new file mode 100644 index 00000000..5416fa77 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/spatial_svd.html @@ -0,0 +1,170 @@ + + + + + + AIMET Spatial SVD — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Spatial SVD

+

Spatial SVD is a tensor decomposition technique which decomposes one large layer (in terms of mac or memory) into two smaller layers. SVD stands for Singular Value Decomposition.

+

Given a conv layer with kernel (𝑚,𝑛,ℎ,𝑤), where 𝑚 is the input channels, 𝑛 the output channels, and ℎ, 𝑤 the height and width of the kernel itself, Spatial SVD will decompose the kernel into two kernels: one of size (𝑚,𝑘,ℎ,1) and one of size (𝑘,𝑛,1,𝑤), where 𝑘 is called the rank. The smaller the value of 𝑘, the larger the degree of compression achieved.
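As a rough illustration with arbitrarily chosen numbers: a Conv kernel with 𝑚 = 64, 𝑛 = 64, ℎ = 𝑤 = 3 has 64 × 64 × 3 × 3 = 36,864 parameters. Decomposing it with rank 𝑘 = 16 gives 64 × 16 × 3 × 1 = 3,072 plus 16 × 64 × 1 × 3 = 3,072 parameters, roughly a 6x reduction for that layer.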

+

The following diagram illustrates this visually. As you can see, Spatial SVD decomposes both the output channel dimension as well as the size of the conv kernel itself. Spatial SVD is currently supported for Conv layers in AIMET.

+../_images/spatial_svd.png +
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/visualization_compression.html b/releases/1.32.2/torch_v2/user_guide/visualization_compression.html new file mode 100644 index 00000000..cb4741f1 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/visualization_compression.html @@ -0,0 +1,246 @@ + + + + + + AIMET Visualization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Visualization

+
+

Overview

+

AIMET Visualization adds analytical capability to the AIMET tool (which helps quantize and compress ML models) through visualization. It provides more detailed insights into AIMET features as users are able to analyze a model’s layers in terms of compressibility and also highlight potential issues when applying quantization. The tool also assists in displaying progress for computationally heavy tasks.

+
+
+

Design

+

Given a model, a user can start a Bokeh server session and then invoke functions which will produce visualizations to help analyze and understand the model before using AIMET quantization and compression features.

+../_images/vis_1.png +
+
+

Compression

+

Evaluation scores during compression are displayed in a table as they are computed, and users can see the progress displayed while computing these scores. After Greedy Selection has run, the optimal compression ratios are also displayed in a graph.

+../_images/vis_4.png +../_images/vis_5.png +../_images/vis_6.png +../_images/vis_7.png +
+
+

Starting a Bokeh Server Session:

+

Start a Bokeh server by typing this command: bokeh serve --allow-websocket-origin=<host name>:<port number> --port=<port number>

+

--allow-websocket-origin tells the Bokeh server which network addresses to listen on; it is not needed when the visualizations are only viewed locally.

+

--port tells the Bokeh server what network port to listen on rather than the default port of 5006.

+
+
+

How to use the tool

+

Model Compression

+
    +
  1. Start a Bokeh server by typing this command: bokeh serve --allow-websocket-origin=<host name>:<port number> --port=<port number>

  2. +
  3. +
    To visualize eval scores and compression ratios during execution time:
      +
    1. +
      Input a visualization URL into the top level function: compress_model. This url is http://<host name>:<port number>/
        +
      1. For model compression, the visualization url is passed through compress_model. If no visualizations are necessary then the url has a default option for None.

      2. +
      +
      +
      +
    2. +
    3. +
      Finally, go to the URL to see the visualizations.
        +
      1. The session-id here is: compression. So the URL would look something like this:

      2. +
      3. http://<host name>:<port number>/?&bokeh-session-id=compression

      4. +
      +
      +
      +
    4. +
    +
    +
    +
  4. +
  5. +
    To visualize eval scores and compression ratios after execution:
      +
    1. +
      Use API doc to decide which functions to use. They should be under “Model Compression.”
        +
      1. First instantiate a VisualizeCompression instance by passing in a visualization URL. This url is http://<host name>:<port number>/

      2. +
      +
      +
      +
    2. +
    3. +
      There are two functions:
        +
      1. display_eval_scores

      2. +
      3. display_comp_ratio_plot

      4. +
      +
      +
      +
    4. +
    5. +
      Finally, go to the URL to see the visualizations
        +
      1. The session-id here is: compression. So the URL would look something like this:

      2. +
      3. http://<host name>:<port number>/?&bokeh-session-id=compression

      4. +
      +
      +
      +
    6. +
    +
    +
    +
  6. +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/visualization_quant.html b/releases/1.32.2/torch_v2/user_guide/visualization_quant.html new file mode 100644 index 00000000..3c89cb7a --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/visualization_quant.html @@ -0,0 +1,196 @@ + + + + + + AIMET Visualization for Quantization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Visualization for Quantization

+
+

Overview

+

AIMET Visualization adds analytical capability to the AIMET tool (which helps quantize and compress ML models) through visualization. It provides more detailed insights into AIMET features as users are able to analyze a model’s layers in terms of compressibility and also highlight potential issues when applying quantization. The tool also assists in displaying progress for computationally heavy tasks. The visualizations get saved as an HTML file under the specified directory.

+
+
+

Quantization

+

During quantization, common parameters are used throughout a layer for converting the floating point weight values to INT8. If the dynamic range in weights is very high, the quantization will not be very granular. To equalize the weight range we apply Cross Layer Equalization. +In order to understand if we need to apply Cross Layer Equalization, we can visualize the weight range for every channel in a layer. If the weight range varies a lot across the various channels, applying Cross Layer Equalization helps improve the quantization accuracy.

+../_images/vis_3.png +
+

PyTorch

+

In PyTorch, we can visualize the weights for a model. We can also visualize the weight ranges for a model before and after Cross Layer Equalization. +There are three main functions a user can invoke:

+
    +
  1. User can analyze relative weight ranges of model to see potentially problematic layers for quantization

  2. +
  3. User can understand each layer in the model

  4. +
  5. User can visualize the model, comparing weights before and after quantization.

  6. +
+
+
+

TensorFlow

+

In TensorFlow, we can visualize the weight ranges and relative weight ranges over various channels in a layer. +User can also use the same functions to see the changes in a layer weight ranges before and after Cross Layer Equalization.

+

There are two main functions a user can invoke:

+
    +
  1. User can analyze relative weight ranges of a layer to see potentially problematic layers for quantization

  2. +
  3. User can visualize weight ranges of a layer and see the various statistics for weights

  4. +
+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/weight_svd.html b/releases/1.32.2/torch_v2/user_guide/weight_svd.html new file mode 100644 index 00000000..50db36d7 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/weight_svd.html @@ -0,0 +1,170 @@ + + + + + + AIMET Weight SVD — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Weight SVD

+

Weight SVD is a tensor decomposition technique which decomposes one large layer (in terms of mac or memory) into two smaller layers. SVD stands for Singular Value Decomposition.

+

Given a neural network layer with kernel (𝑚,𝑛,ℎ,𝑤), where 𝑚 is the input channels, 𝑛 the output channels, and ℎ, 𝑤 the height and width of the kernel itself, Weight SVD will decompose the kernel into one of size (𝑚,𝑘,1,1) and another of size (𝑘,𝑛,ℎ,𝑤), where 𝑘 is called the rank. The smaller the value of 𝑘, the larger the degree of compression achieved.
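As a rough illustration with arbitrarily chosen numbers: a Conv kernel with 𝑚 = 64, 𝑛 = 64, ℎ = 𝑤 = 3 has 64 × 64 × 3 × 3 = 36,864 parameters. Decomposing it with rank 𝑘 = 16 gives 64 × 16 × 1 × 1 = 1,024 plus 16 × 64 × 3 × 3 = 9,216 parameters, roughly a 3.6x reduction for that layer.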

+

The following diagram illustrates this visually. As you can see, Weight SVD decomposes the output channel dimension. Weight SVD is currently supported for Conv and Fully-connected layers in AIMET.

+../_images/weight_svd.png +
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/torch_v2/user_guide/winnowing.html b/releases/1.32.2/torch_v2/user_guide/winnowing.html new file mode 100644 index 00000000..b405ddc3 --- /dev/null +++ b/releases/1.32.2/torch_v2/user_guide/winnowing.html @@ -0,0 +1,184 @@ + + + + + + AIMET Winnowing — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Winnowing

+
+

Overview

+

The model compression algorithm, Channel Pruning, identifies modules in a model whose subset of input channels could be pruned without losing much accuracy. Unless explicitly removed, these input channels take up memory and add unnecessary computation. For each identified module, the Winnow tool removes the input channels that were selected for pruning. Only Conv2D layers are supported for winnowing.

+
+
+

Winnowing Overview

+

The following figure provides a pictorial overview of Winnowing. In this example, a module in a model has an input volume of HxWx8, where H = Height, W = Width and Number of input Channels = 8. The Channel Pruning algorithm identifies that for this module, input channels 1, 4 and 7 should be pruned. Winnowing removes the identified input channels from this module. The module’s input volume is now reduced to HxWx5.

+../_images/winnow_1.png +
+
+

How Winnowing Works

+

When the number of input channels of a Conv module is reduced, the output channels of the module above it must also be modified. If the module above is another Conv layer, that Conv layer's output channels are reduced to match the number of input channels of the winnowed Conv module. If the module above is NOT a Conv layer (e.g., BatchNorm, ReLU), that module simply propagates the changes upstream. That is, both the output and the input channels of the BatchNorm and ReLU modules are winnowed to match the winnowed channels of the Conv layer just below them.

+

The following figure explains a very simple scenario, in which a Conv module has been identified for winnowing a subset of its input channels. This is indicated by the green color on the left side of the figure. The right side of the figure indicates the actions taken by Winnowing, which consists of the following changes to the 3 affected modules.

+

The identified Conv module's subset of input channels is removed. This is indicated by the pink color on the right side of the figure. The module just above the winnowed Conv module is NOT a Conv module; it could be a ReLU or a BatchNorm module. For this module, the corresponding output and input channels are winnowed. This is indicated by the orange color on the right side of the figure. The module above the ReLU/BatchNorm is another Conv module. This Conv module's output channels are winnowed. This is indicated by the pink color on the right side of the figure.

+../_images/winnow_2.png +
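To make the mechanics concrete, here is a minimal PyTorch sketch (illustration only, not the AIMET winnow API) that removes input channels 1, 4 and 7 from a downstream Conv layer and the matching output channels from the Conv layer just above it:

import torch
import torch.nn as nn

keep = [0, 2, 3, 5, 6]                                    # keep 5 of 8 channels (drop 1, 4, 7)

upstream = nn.Conv2d(3, 8, kernel_size=3, padding=1)      # produces 8 channels
downstream = nn.Conv2d(8, 16, kernel_size=3, padding=1)   # consumes 8 channels

# Downstream Conv: remove the pruned *input* channels from its weight (dim 1).
new_down = nn.Conv2d(5, 16, kernel_size=3, padding=1)
new_down.weight.data = downstream.weight.data[:, keep, :, :].clone()
new_down.bias.data = downstream.bias.data.clone()

# Upstream Conv: remove the matching *output* channels (dim 0) and bias entries.
new_up = nn.Conv2d(3, 5, kernel_size=3, padding=1)
new_up.weight.data = upstream.weight.data[keep, :, :, :].clone()
new_up.bias.data = upstream.bias.data[keep].clone()

x = torch.randn(1, 3, 32, 32)
print(new_down(new_up(x)).shape)                          # torch.Size([1, 16, 32, 32])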
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/adaround.html b/releases/1.32.2/user_guide/adaround.html new file mode 100644 index 00000000..a380f4b1 --- /dev/null +++ b/releases/1.32.2/user_guide/adaround.html @@ -0,0 +1,1215 @@ + + + + + + AIMET AdaRound — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET AdaRound

+
+

AIMET quantization features, by default, use the “nearest rounding” technique for achieving quantization. In the following figure, a single weight value in a weight tensor is shown as an illustrative example. When using the “nearest rounding” technique, this weight value is quantized to the nearest integer value. The Adaptive Rounding (AdaRound) feature uses a smaller subset of the unlabelled training data to adaptively round the weights of modules with weights. In the following figure, the weight value is quantized to the integer value farther from it. AdaRound optimizes a loss function using the unlabelled training data to adaptively decide whether to quantize a specific weight to the integer value near it or away from it. Using AdaRound quantization, a model is able to achieve an accuracy closer to the FP32 model, while using low bit-width integer quantization.

+

When creating a QuantizationSimModel from the AdaRounded model, use the API provided by QuantizationSimModel for setting and freezing parameter encodings before computing the encodings. Please refer to the code example in the AdaRound API section; a minimal sketch of this flow is also shown below.

+
+../_images/adaround.png +
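For reference, the following is a minimal sketch of that flow based on the aimet_torch 1.x API; other AIMET variants have analogous APIs. The names model, calib_loader and calibration_callback are user-provided placeholders, and exact signatures should be confirmed against the AdaRound API documentation.

# Minimal sketch based on the aimet_torch 1.x AdaRound API; check the API docs
# of your AIMET variant for exact signatures. 'model', 'calib_loader' and
# 'calibration_callback' are user-provided placeholders.
import torch
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters

dummy_input = torch.randn(1, 3, 224, 224)
params = AdaroundParameters(data_loader=calib_loader, num_batches=16)

# Apply AdaRound; this returns a model with adarounded weights and writes the
# parameter encodings to ./adaround.encodings.
ada_model = Adaround.apply_adaround(model, dummy_input, params,
                                    path='./', filename_prefix='adaround',
                                    default_param_bw=4,
                                    default_quant_scheme=QuantScheme.post_training_tf_enhanced)

# Create QuantSim from the adarounded model, then set and FREEZE the parameter
# encodings before computing the remaining (activation) encodings.
sim = QuantizationSimModel(ada_model, dummy_input, default_param_bw=4)
sim.set_and_freeze_param_encodings(encoding_path='./adaround.encodings')
sim.compute_encodings(calibration_callback, forward_pass_callback_args=None)
sim.export(path='./output', filename_prefix='adarounded_model', dummy_input=dummy_input)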
+

AdaRound Use Cases

+
+
+

Common terminology

+
+
    +
  • BC - Bias Correction

  • +
  • BNF - Batch Norm Folding

  • +
  • CLE - Cross Layer Equalization

  • +
  • HBF - High Bias Folding

  • +
  • QAT - Quantization Aware Training

  • +
  • { } - An optional step in the use case

  • +
+
+
+
+

Use Cases

+
+
    +
  1. +
    {BNF} –> {CLE} –> AdaRound

    Applying BNF and CLE are optional steps before applying AdaRound. Some models benefit from applying CLE while others do not.

    +
    +
    +
  2. +
  3. +
    AdaRound –> QAT

    AdaRound is a post-training quantization feature, but applying BNF and CLE may not be beneficial for some models. For these models, QAT after AdaRound may be beneficial. AdaRound is considered a better weight-initialization step, which helps QAT converge faster.

    +
    +
    +
  4. +
+

Not recommended

+
+
+
    +
  1. AdaRound –> BC

  2. +
  3. BC –> AdaRound

  4. +
+

AdaRound Hyper parameters guidelines

+
+
+

A couple of hyper parameters are required during AdaRound optimization and are exposed to users. Most of them have default values which lead to good and stable results over many models, and it is not recommended to change them often.

+

The following is a guideline for the hyper parameters (see the sketch after this list):

+
    +
  1. Hyper parameters to be changed often: number of batches (approximately 500-1000 images; if the batch size of the data loader is 64, then 16 batches leads to 1024 images), number of iterations (default 10000)

  2. +
  3. Hyper parameters to be changed moderately: regularization parameter (default 0.01)

  4. +
  5. Hyper parameters to be changed least: beta range (default (20, 2)), warm start period (default 20%)

  6. +
+
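The sketch below maps these guidelines onto the corresponding arguments of AdaroundParameters in aimet_torch; argument names may differ slightly between AIMET versions and variants, and calib_loader is a user-provided data loader.

from aimet_torch.adaround.adaround_weight import AdaroundParameters

params = AdaroundParameters(data_loader=calib_loader,
                            num_batches=16,                  # changed often: ~500-1000 images
                            default_num_iterations=10000,    # changed often
                            default_reg_param=0.01,          # changed moderately
                            default_beta_range=(20, 2),      # changed least
                            default_warm_start=0.2)          # changed least: 20% warm start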
+

+
+
+
+

AdaRound API

+

Please refer to the links below to view the AdaRound API for each AIMET variant:

+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/auto_quant.html b/releases/1.32.2/user_guide/auto_quant.html new file mode 100644 index 00000000..f12f1204 --- /dev/null +++ b/releases/1.32.2/user_guide/auto_quant.html @@ -0,0 +1,1178 @@ + + + + + + AIMET AutoQuant — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET AutoQuant

+
+

Overview

+

AIMET offers a suite of neural network post-training quantization techniques. Often, applying these techniques in a specific sequence results in better accuracy and performance. Without the AutoQuant feature, the AIMET user needs to manually try out various combinations of AIMET quantization features. This manual process is error-prone and often time-consuming.

+

The AutoQuant feature analyzes the model, determines the sequence of AIMET quantization techniques, and applies these techniques. In addition, the user can specify, in the AutoQuant API, the amount of accuracy drop that can be tolerated. As soon as this accuracy threshold is reached, AutoQuant stops applying further quantization techniques. In summary, the AutoQuant feature saves time and automates the quantization of neural networks.

+
+
+

Workflow

+

Before entering the optimization workflow, AutoQuant performs the following preparation steps:

+
+
    +
  1. Check the validity of the model and convert it into an AIMET quantization-friendly format (denoted as Prepare Model below).

  2. +
  3. Select the best-performing quantization scheme for the given model (denoted as QuantScheme Selection below)

  4. +
+
+

After the preparation steps, AutoQuant mainly consists of the following three stages:

+
+
    +
  1. BatchNorm folding

  2. +
  3. Cross-Layer Equalization

  4. +
  5. AdaRound

  6. +
+
+

These techniques are applied in a best-effort manner until the model meets the allowed accuracy drop. If AutoQuant fails to satisfy the evaluation goal, it will return the model to which the best combination of the above techniques has been applied.

+
+
../_images/auto_quant_v2_flowchart.png +
+
+
+

AutoQuant API

+

Please refer to the links below to view the AutoQuant API for each AIMET variant:

+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/bn_reestimation.html b/releases/1.32.2/user_guide/bn_reestimation.html new file mode 100644 index 00000000..392e1adf --- /dev/null +++ b/releases/1.32.2/user_guide/bn_reestimation.html @@ -0,0 +1,1180 @@ + + + + + + AIMET BN Re-estimation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET BN Re-estimation

+
+

Overview

+

The BN Re-estimation feature utilizes a small subset of training data to individually re-estimate the statistics of the Batch Normalization (BN) layers in a model. These BN statistics are then used to adjust the quantization scale parameters of the preceding Convolution or Linear layers. Effectively, the BN layers are folded.

+

The BN Re-estimation feature is applied after performing Quantization Aware Training (QAT) with Range Learning, with Per Channel Quantization (PCQ) enabled. It is very important NOT to fold the BN layers before performing QAT. The BN layers are folded ONLY after QAT and the re-estimation of the BN statistics is completed. The Workflow section below covers the exact sequence of steps.

+

The BN Re-estimation feature is specifically recommended for the following scenarios:

+
    +
  • Low-bitwidth weight quantization (e.g., 4-bits)

  • +
  • Models for which Batch Norm Folding leads to decreased performance.

  • +
  • Models where the main issue is weight quantization (including higher bitwidth quantization)

  • +
  • Low bitwidth quantization of depthwise separable layers since their Batch Norm Statistics are affected by oscillations

  • +
+
+
+

Workflow

+

BN-Re-estimation requires that

+
    +
  1. BN layers not be folded before QAT.

  2. +
  3. Per Channel Quantization is enabled.

  4. +
+

To use the BN Re-estimation feature, the following sequence of steps must be followed in the correct order (a minimal sketch of this sequence is shown below).

+
    +
  1. Create the QuantizationSimModel object with Range Learning Quant Scheme

  2. +
  3. Perform QAT with Range Learning

  4. +
  5. Re-estimate the BN statistics

  6. +
  7. Fold the BN layers

  8. +
  9. Using the QuantizationSimModel, export the model and encodings.

  10. +
+

Once the above steps are completed, the model can be run on the target for inference.

+
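The following is a minimal sketch of this sequence using the aimet_torch API; the function names follow the aimet_torch documentation, model, train and the data loaders are user-provided placeholders, and exact signatures should be confirmed against the BN Re-estimation API documentation.

import torch
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.bn_reestimation import reestimate_bn_stats
from aimet_torch.batch_norm_fold import fold_all_batch_norms_to_scale

dummy_input = torch.randn(1, 3, 224, 224)

# 1. QuantSim with a Range Learning quant scheme (Per Channel Quantization is
#    enabled through the config file passed to QuantizationSimModel, not shown).
sim = QuantizationSimModel(model, dummy_input,
                           quant_scheme=QuantScheme.training_range_learning_with_tf_init)
sim.compute_encodings(calibration_callback, forward_pass_callback_args=None)

train(sim.model)                                        # 2. QAT with Range Learning

reestimate_bn_stats(sim.model, bn_data_loader)          # 3. re-estimate BN statistics

fold_all_batch_norms_to_scale(sim)                      # 4. fold BN layers

sim.export(path='./output', filename_prefix='model_bn_reestimated',
           dummy_input=dummy_input)                     # 5. export model and encodings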

The following high-level call flow diagram enumerates the workflow for PyTorch. The workflow is the same for TensorFlow and Keras.

+../_images/bn_reestimation.png +
+
+

BN Re-estimation API

+

Please refer to the links below to view the BN Re-estimation API for each AIMET variant:

+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/channel_pruning.html b/releases/1.32.2/user_guide/channel_pruning.html new file mode 100644 index 00000000..6b84894b --- /dev/null +++ b/releases/1.32.2/user_guide/channel_pruning.html @@ -0,0 +1,1163 @@ + + + + + + AIMET Channel Pruning — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Channel Pruning

+

Channel Pruning is a model compression technique that removes less-important input channels from layers in a given model. Currently, AIMET supports Channel Pruning of Conv2d layers.

+
+

Overall Procedure

+

The following picture explains the different steps in Channel Pruning a given layer. These steps are repeated for all layers selected to be compressed in the order of their occurrence from the top of the model.

+../_images/channel_pruning_1.png +

These individual steps are explained in more detail in the following sub-sections.

+
+
+

Channel Selection

+

For a given layer and a given compression ratio, Channel Selection analyzes the magnitude of each input channel (based on the kernel weights for that channel) and chooses the channels with the least magnitude to be pruned. A minimal sketch of this selection is shown below.

+
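For illustration, a minimal NumPy sketch of magnitude-based channel selection for a Conv weight of shape (output channels, input channels, h, w); the shapes and the compression ratio are arbitrary example values:

import numpy as np

weight = np.random.randn(16, 8, 3, 3)                 # (out_channels, in_channels, h, w)
comp_ratio = 0.625                                    # keep 5 of the 8 input channels

magnitude = np.abs(weight).sum(axis=(0, 2, 3))        # one magnitude score per input channel
num_keep = int(round(comp_ratio * weight.shape[1]))
keep = np.sort(np.argsort(magnitude)[-num_keep:])     # input channels with largest magnitude
prune = np.setdiff1d(np.arange(weight.shape[1]), keep)
print("channels selected for pruning:", prune)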
+
+

Winnowing

+

Winnowing is used to remove the input channels of the weight matrix identified by Channel Selection, resulting in compressed tensors.

+../_images/cp_2.png +

Once one or more input channels of a layer are removed, the corresponding output channels of an upstream layer can also be removed to get further compression gains. Note that the presence of skip-connections or residuals sometimes prevents upstream layers from getting output-pruned.

+../_images/cp_3.jpg +

For more details on winnowing, please see this

+
+ +
+
+
+

Weight Reconstruction

+

As a final step in Channel Pruning, AIMET adjusts the weight and bias parameters of a layer that was pruned in an attempt to make the outputs of that layer closely match the outputs prior to pruning. This is done by collecting random samples of the output of the layer from the original model and the corresponding input samples from the pruned model for that layer. AIMET then performs linear regression to adjust the layer parameters. A least-squares sketch of this idea is shown below.

+../_images/cp_4.jpg +
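The following is a minimal least-squares sketch of this reconstruction idea (illustration only, not the AIMET implementation); the collected input/output samples are random stand-ins here:

import numpy as np

n_samples, pruned_in, out = 2048, 5, 16

X = np.random.randn(n_samples, pruned_in)     # layer inputs sampled from the pruned model
Y = np.random.randn(n_samples, out)           # layer outputs sampled from the original model

# Append a column of ones so the bias is reconstructed together with the weights.
X1 = np.hstack([X, np.ones((n_samples, 1))])
solution, *_ = np.linalg.lstsq(X1, Y, rcond=None)
W_new, b_new = solution[:-1].T, solution[-1]  # (out x pruned_in) weight and (out,) bias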
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/compression_feature_guidebook.html b/releases/1.32.2/user_guide/compression_feature_guidebook.html new file mode 100644 index 00000000..cb3adc30 --- /dev/null +++ b/releases/1.32.2/user_guide/compression_feature_guidebook.html @@ -0,0 +1,1168 @@ + + + + + + AIMET Compression Features Guidebook — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Compression Features Guidebook

+

This document provides typical workflows for compressing a network using AIMET. A more in-depth discussion of the various techniques and their usage is provided in the User Guide.

+

AIMET supports network compression using the following techniques: Weight SVD, Spatial SVD (SSVD) and Channel Pruning (CP). These techniques are intended for Multiply-and-Accumulate (MAC) reduction of convolution layers in a neural network. Based on a configured desired MAC reduction ratio, i.e., the ratio of MACs in the compressed model to MACs in the uncompressed model, the compression algorithms automatically compress each individual convolution layer in the network to approximately reach the overall desired MAC reduction. Note that the actual on-target inference latency of a model depends on several factors: MACs, memory, memory bandwidth, quantization, etc. Therefore, the latency improvement obtained from MAC-reduction-based compression may vary depending on the specific model architecture. Performance results for some typical models are provided in https://quic.github.io/aimet-pages/index.html. For best performance, a combination of Spatial SVD followed by Channel Pruning is recommended. At a high level, the following steps should be performed to compress a network using the SSVD + CP combination:

+../_images/compression_flow.png +
    +
  1. Determine the target compression ratio (C), which is the ratio of MACs in final compressed model to the MACs in the original uncompressed model. For example, target compression ratio = 0.5 indicates that the final model MACs are half of the original model MACs.

  2. +
  3. Perform compression using Spatial SVD technique as follows:

  4. +
+
+
    +
  1. Since the target compression ratio C is for the final SSVD+CP compressed model, the compression that should be targeted or can be achieved via SSVD is unknown a priori. As a result, a few target compression ratios (Cssvd) need to be tried out. Choose a few Cssvd > C targets and perform SSVD. E.g., if C = 0.5, Cssvd = {0.5, 0.65, 0.75} can typically be used. This would result in three SSVD compressed models.

  2. +
  3. For each of the SSVD compressed models obtained from the previous step, perform fine-tuning to improve model accuracy. Guidelines on fine-tuning are provided here [].

  4. +
+
+
    +
  1. Pick a model (or a few models) that provide high accuracy from step 2b. For example, if the tolerable accuracy drop of SSVD+CP compression relative to the original uncompressed model is X % (X = accuracy of uncompressed model (%) minus accuracy of compressed model (%)), then a model (or models) whose accuracy is within a few % (X-5 %) of the original uncompressed model accuracy should be selected, to avoid a very large drop in accuracy after the CP step.

  2. +
+
+
    +
  1. Note that if step 2b results in a very large accuracy drop, or a drop well within the tolerable accuracy drop, then steps 2a/2b should be revisited first by appropriately adjusting the compression ratios.

  2. +
+
+
    +
  1. Perform compression using Channel Pruning technique as follows:

  2. +
+
+
    +
  1. Perform compression with a few target compression ratios (Ccp). One can set the compression ratio(s) based on the Cssvd of the model obtained from the SSVD step such that Cssvd * Ccp is approximately equal to C.

  2. +
  3. Perform fine-tuning to improve model accuracy.

  4. +
+
+
    +
  1. In the final step, a model is selected whose MAC ratio relative to the original uncompressed model is close to C and which also meets the user’s accuracy requirements. For example, for the ResNet-50 results provided on https://quic.github.io/aimet-pages/index.html, Cssvd = 0.75 and Ccp = 0.66 were used to achieve an overall compression of C = 0.5.

  2. +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/examples.html b/releases/1.32.2/user_guide/examples.html new file mode 100644 index 00000000..7e17e7ef --- /dev/null +++ b/releases/1.32.2/user_guide/examples.html @@ -0,0 +1,1277 @@ + + + + + + AIMET Examples — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ + ../_images/logo-quic-on%40h68.png +
+

AIMET Examples

+

AIMET Examples provide reference code (in the form of Jupyter Notebooks) to learn how to +apply AIMET quantization and compression features. It is also a quick way to become +familiar with AIMET usage and APIs.

+

For more details on each of the features and APIs, please refer to: Links to User Guide and API Documentation

+
+

Browse the notebooks

+

The following table has links to browsable versions of the notebooks for different features.

+

Model Quantization Examples

+ +++++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Features

PyTorch

TensorFlow

Keras

ONNX

Quantsim / Quantization-Aware Training (QAT)

Link

Link

Link (no training)

QAT with Range Learning

Link

Link

Cross-Layer Equalization (CLE)

Link

Link

Link

Adaptive Rounding (AdaRound)

Link

Link

Link

AutoQuant

Link

Link

+
+

+
+

Model Compression Examples

+ +++++ + + + + + + + + + + + + + + + + + + + + +

Features

PyTorch

TensorFlow

Channel Pruning

Link

Link

Spatial SVD

Link

Link

Spatial SVD + Channel Pruning

Link

Link

+
+

+
+
+
+

Running the notebooks

+
+

Install Jupyter

+
    +
  • Install the Jupyter metapackage as follows (prepend with “sudo -H” if appropriate):

  • +
+

python3 -m pip install jupyter

+
    +
  • Start the notebook server as follows (please customize the command line options if appropriate):

  • +
+

jupyter notebook --ip=* --no-browser &

+
    +
  • The above command will generate and display a URL in the terminal. Copy and paste it into your browser.

  • +
+
+ +
+

Run the notebooks

+
    +
  • Navigate to one of the following paths under the Examples directory and launch your chosen Jupyter Notebook (.ipynb extension):

    +
      +
    • Examples/torch/quantization/

    • +
    • Examples/torch/compression/

    • +
    • Examples/tensorflow/quantization/

    • +
    • Examples/tensorflow/compression/

    • +
    +
  • +
  • Follow the instructions therein to execute the code.

  • +
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/greedy_compression_ratio_selection.html b/releases/1.32.2/user_guide/greedy_compression_ratio_selection.html new file mode 100644 index 00000000..729e3cad --- /dev/null +++ b/releases/1.32.2/user_guide/greedy_compression_ratio_selection.html @@ -0,0 +1,1171 @@ + + + + + + AIMET Greedy Compression Ratio Selection — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Greedy Compression Ratio Selection

+
+

Overview

+

The model compression methods Spatial SVD and Channel Pruning work on a per-layer basis. Not all layers in a given model are equally compressible, and compressing individual layers can have a varying impact on the final accuracy of the model. The Greedy Per-Layer Compression Ratio Selection algorithm is used to assess the sensitivity of applicable layers to compression and to find an appropriate compression ratio for each individual layer. The algorithm ensures that the compressed model retains the highest possible accuracy while meeting the given target compression ratio.

+
+
+

How it works

+

The Greedy Compression Ratio Selection algorithm executes the following two steps:

+
    +
  • Per-layer exploration

  • +
  • Compression-ratio selection

  • +
+

The following figure provides a high level overview and is followed by details for each step.

+../_images/greedy_1.png +
+

+
+../_images/greedy_2.png +
+

+
+

where the Eval dictionary is represented as:

+../_images/greedy_3.png +
+
+

Per-layer Exploration

+

For each layer, per-layer exploration produces a column in the compression-ratio vs. model-performance table. This column captures the overall network performance as the layer is compressed over a predefined range of compression-ratio candidates, while all other layers are left unmodified.

+../_images/greedy_4.jpg +

In the above figure, you see an example model with 4 layers and 10 compression-ratio candidates (the default setting). Note that the table does not capture the eval score for the last candidate, which is always compression-ratio=1.0, since this score is the baseline and is already known.

+

Monotonic Fit: In some cases it is observed that the model performance is not a strictly increasing function of the compression ratio. To help with the greedy selection procedure, AIMET can apply a curve-fitting scheme to fit the model-performance numbers for a given layer using a monotonically increasing function. This functionality is disabled by default.

+
+
+

Compression Ratio Selection

+

This step is the core of the algorithm. It considers the compression-ratio vs. model-performance table for each applicable layer from the previous step, the target compression ratio, and a function that calculates the cost of the compressed model depending on the compression method used (Spatial SVD, Channel Pruning). It starts with a constant accuracy and finds the corresponding compression ratio for every applicable layer by interpolating from the compression-ratio vs. model-performance evaluation table. The algorithm then calculates the total cost of the model to check whether the target compression ratio has been met. A binary search is used to find the solution quickly. Finally, it returns the list of selected compression ratios for all applicable layers. This way, the algorithm achieves the highest remaining accuracy of the compressed model while meeting the target compression ratio. A simplified sketch of this selection loop is shown below.

+
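The following is a simplified Python sketch of this selection loop (illustration only, not the AIMET implementation). Here eval_table maps each layer to a dictionary of {candidate compression ratio: model accuracy} from the per-layer exploration step, and layer_costs maps each layer to its share of the model cost (e.g. MACs); AIMET additionally interpolates between candidates, which is omitted for brevity.

def pick_ratio(curve, acc_threshold):
    """Smallest candidate ratio (most compression) whose accuracy meets the threshold."""
    ok = [ratio for ratio, acc in curve.items() if acc >= acc_threshold]
    return min(ok) if ok else 1.0            # fall back to "no compression" for this layer

def greedy_select(eval_table, layer_costs, target_ratio, iters=30):
    lo = min(min(c.values()) for c in eval_table.values())   # worst observed accuracy
    hi = max(max(c.values()) for c in eval_table.values())   # best observed accuracy
    for _ in range(iters):                                   # binary search on accuracy
        acc = (lo + hi) / 2
        ratios = {layer: pick_ratio(curve, acc) for layer, curve in eval_table.items()}
        model_ratio = (sum(layer_costs[l] * ratios[l] for l in ratios)
                       / sum(layer_costs.values()))
        if model_ratio > target_ratio:
            hi = acc            # not compressed enough: accept a lower accuracy threshold
        else:
            lo = acc            # compressed enough: try to keep more accuracy
    return ratios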

The following figure illustrates that for a given accuracy, the compression ratio for each layer is different.

+../_images/greedy_5.jpg +

As suggested by the above diagram, the algorithm picks a lower compression ratio (higher compression) for layers which are more compressible, and a higher compression ratio (lower compression) for layers which are less compressible (for less compressible layers, the accuracy falls drastically if the compression ratio is lowered).

+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/index.html b/releases/1.32.2/user_guide/index.html new file mode 100644 index 00000000..40e8187e --- /dev/null +++ b/releases/1.32.2/user_guide/index.html @@ -0,0 +1,1189 @@ + + + + + + AI Model Efficiency Toolkit User Guide — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AI Model Efficiency Toolkit User Guide

+
+

Overview

+

AI Model Efficiency Toolkit (AIMET) is a software toolkit that enables users to quantize and compress models. +Quantization is a must for efficient edge inference using fixed-point AI accelerators.

+

AIMET optimizes pre-trained models (e.g., FP32 trained models) using post-training and fine-tuning techniques that +minimize accuracy loss incurred during quantization or compression.

+

AIMET currently supports PyTorch, TensorFlow, and Keras models.

+../_images/AIMET_index_no_fine_tune.png +

The above picture shows a high-level view of the workflow when using AIMET. The user will start with a trained +model in either the PyTorch, TensorFlow, or Keras training framework. This trained model is passed to AIMET using APIs +for compression and quantization. AIMET returns a compressed/quantized version of the model +that the users can fine-tune (or train further for a small number of epochs) to recover lost accuracy. Users can then +export via ONNX/meta/h5 to an on-target runtime like Qualcomm® Neural Processing SDK.

+
+
+

Features

+

AIMET supports two sets of model optimization techniques:

+
    +
  • Model Quantization: AIMET can simulate behavior of quantized HW for a given trained +model. This model can be optimized using Post-Training Quantization (PTQ) and fine-tuning (Quantization Aware Training +- QAT) techniques.

  • +
  • Model Compression: AIMET supports multiple model compression techniques that allow the +user to take a trained model and remove redundancies, resulting in a smaller model that runs faster on target.

  • +
+
+
+

Release Information

+

For information specific to this release, please see Release Notes and Known Issues.

+
+
+

Installation Guide

+

Please visit the AIMET Installation for more details.

+
+
+

Getting Started

+

Please refer to the following documentation:

+ +
+

toc tree

+
+
+
+

+
+
+

+
+
+
AI Model Efficiency Toolkit is a product of Qualcomm Innovation Center, Inc.
+
Qualcomm® Neural Processing SDK is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
+
+
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/known_issues.html b/releases/1.32.2/user_guide/known_issues.html new file mode 100644 index 00000000..f8010f45 --- /dev/null +++ b/releases/1.32.2/user_guide/known_issues.html @@ -0,0 +1,1138 @@ + + + + + + AIMET Known Issues — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Known Issues

+

Known issues and limitations for Qualcomm AI Model Efficiency ToolKit (AIMET)

+
    +
  • AIMET Spatial SVD currently does not support Fully Connected layers

  • +
  • +
    AIMET Channel Pruning
      +
    • Does not support Conv layers with dilation other than (1,1). Conv layers with dilation other than (1,1) must be added to Channel Pruning Configuration’s modules_to_ignore list.

    • +
    • Does not support channel pruning of DepthwiseConv2d layers.

    • +
    • For TensorFlow, supports only models with “Channels Last” data format

    • +
    +
    +
    +
  • +
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/model_compression.html b/releases/1.32.2/user_guide/model_compression.html new file mode 100644 index 00000000..400dd2c6 --- /dev/null +++ b/releases/1.32.2/user_guide/model_compression.html @@ -0,0 +1,1223 @@ + + + + + + AIMET Model Compression — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Model Compression

+
+

Overview

+

AIMET provides a model compression library that can be used to reduce a model’s MAC and memory costs with a minimal +drop in accuracy. AIMET supports various compression schemes like Weight SVD, Spatial SVD and Channel Pruning.

+
+
+

Please see the Compression Guidebook, which includes practical advice on using the compression features and how to combine them.

+
+
+

Use Case

+

AIMET allows a user to take a trained model and compress it to a desired compression ratio; the compressed model can then be further fine-tuned and exported to a target. All of the compression schemes in AIMET use a two-step process: compression-ratio selection followed by model compression.

+../_images/compression_use_case.PNG +

The following sub-sections explain these steps in more detail.

+
+
+

Compression ratio selection

+
+
+
    +
  • Greedy Compression Ratio Selection: During this phase, individual layers of the original model are analyzed to determine optimal compression ratios per layer. Currently AIMET supports the Greedy Compression Ratio Selection method.

  • +
  • Manual Compression Ratio Selection: As an alternative to AIMET automatically selecting optimal compression ratios per layer, the user has a choice to specify compression ratios manually per layer. The suggested procedure would be to use the Greedy Compression Ratio Selection method to get a nominal set of compression ratios first. And then use this as the starting point for manually changing compression ratios for one or more layers.

  • +
+

To visualize various usages of the compression tool, we can use:

+ +
+
+

Model Compression

+

In this phase, AIMET will apply the compression ratios per layer to create a compressed model. +Currently, AIMET supports the following model compression algorithms.

+ +
+
+

Optional techniques to get better compression results

+

AIMET supports the following techniques that can be optionally used to get better compression results

+
    +
  • Rank-rounding

  • +
  • Per-layer fine-tuning

  • +
+
+

Rank Rounding

+

Often, ML runtime software, like that for embedded ML accelerators, prefers the dimensions of layers like Conv2d or FC to be of a certain multiplicity. Matching the expected dimension size results in optimal runtime for that layer. AIMET techniques like Weight/Spatial SVD or Channel Pruning decompose or reduce layers, specifically in terms of output channels and input channels. The rank-rounding feature in AIMET will try to round the layer dimensions to match a user-provided multiplicity. By default this feature is disabled. At present, AIMET allows the user to specify a multiplicity factor for the entire model, not on a per-layer basis.

+

Users can make use of this feature to generate more optimal models for running on embedded targets.

+
+
+

Per-layer Fine-tuning

+

Given a user model and desired compression ratio, the user may sometimes notice a sharp degradation in accuracy after compression but before fine-tuning. One technique that might help in such scenarios is a feature called per-layer fine-tuning. When this feature is selected, AIMET invokes a user-provided fine-tuning function after compressing every layer that was selected for compression. This is done during the Model Compression phase in the diagram shown above.

+

Note: The user is responsible for choosing appropriate learning-rates and other training parameters for fine-tuning. Using this feature may require the user to carefully pick the learning rates and learning-rate-decay parameters to be used during fine-tuning.

+
+
+
+

FAQs

+
    +
  1. Which technique is the best technique to use for compression?

    +

    We see best results when Spatial SVD is performed followed by Channel Pruning.

    +
  2. +
  3. Can we combine the different techniques?

    +

    Yes, as stated in 1, different techniques can be combined together to get better accuracy. Compression can be combined with Post-training Quantization techniques as well to get a better model for target.

    +
  4. +
  5. How to take a model to target after compression?

    +

    To take a model to target, it first needs to be compressed using the above techniques, and then it should be quantized and exported to the target.

    +
  6. +
  7. Greedy rank selection is very slow. Can something be done to speed it up?

    +

    Greedy rank selection in itself is not time-consuming. The time-consuming part is creating the eval-score dictionary. For different experiments, the eval-score dictionary can be generated once and then loaded into the searcher. Alternatively, one can reduce the number of candidates over which the eval-score dictionary is created, but the fewer the candidates, the lower the granularity. To strike a balance, the value of 10 candidates was chosen.

    +
  8. +
  9. Is per-layer fine tuning helpful?

    +

    Per-layer fine-tuning is an experimental technique. We have not observed major gains by using it, but one can try it out to see whether it works for their model. In practice, we have observed that the best combination is to do, say, 1 epoch of fine-tuning per layer and then 10-15 epochs of fine-tuning for the entire compressed model at the end.

    +
  10. +
+
+
+

References

+
    +
  1. Xiangyu Zhang, Jianhua Zou, Kaiming He, and Jian Sun. “Accelerating Very Deep Convolutional Networks for Classification and Detection.” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 10, pp. 1943-1955, 1 Oct. 2016.

  2. +
  3. Yihui He, Xiangyu Zhang, and Jian Sun. “Channel Pruning for Accelerating Very Deep Neural Networks.” IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 1398-1406.

  4. +
  5. Max Jaderberg, Andrea Vedaldi, and Andrew Zisserman. “Speeding up Convolutional Neural Networks with Low Rank Expansions.” British Machine Vision Conference, Jan. 2014.

  6. +
  7. Andrey Kuzmin, Markus Nagel, Saurabh Pitre, Sandeep Pendyam, Tijmen Blankevoort, Max Welling. “Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks.”

  8. +
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/model_guidelines.html b/releases/1.32.2/user_guide/model_guidelines.html new file mode 100644 index 00000000..13f4b5ef --- /dev/null +++ b/releases/1.32.2/user_guide/model_guidelines.html @@ -0,0 +1,1191 @@ + + + + + + Model Guidelines for PyTorch — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Model Guidelines for PyTorch

+

To implement the Cross Layer Equalization API, aimet_torch.cross_layer_equalization.equalize_model(), AIMET creates a computing graph to analyze the sequence of operations in the model. If your model is defined using certain constructs, it prevents AIMET from successfully creating and analyzing the computing graph. The following table lists the potential issues and workarounds.

+

Note: These restrictions are not applicable if you are using the Primitive APIs.

+ +++++ + + + + + + + + + + + + + + + + + + + + + + + + +

Potential Issue

Description

Work Around

ONNX Export

Use torch.onnx.export() +to export your model. +Make sure ONNX export passes

If ONNX export fails, rewrite the +specific layer so that ONNX +export passes

Slicing Operation

Some models use +torch.tensor.view() in the +forward function as follows: +x = x.view(-1, 1024) +If view function is written +as above, it causes an issue +while creating the +computing graph

Rewrite the x.view() statement +as follows: +x = x.view(x.size(0), -1)

Bilinear, upsample +operation

Some models use the upsample +operation in the forward +function as: x= +torch.nn.functional.upsample( +x, size=torch.Size([129,129]) +, mode = ‘bilinear’, +align_corners=True)

Set the align_corners parameter to +False as follows: +x = +torch.nn.functional.upsample(x, +size=torch.Size([129, 129]), +mode=’bilinear’, +align_corners=False)

Deconvolution operation

The deconvolution operation +is used in DeepLabV3 model. +This is currently not +supported by AIMET

There is no workaround available +at this time. This issue will be +addressed in a subsequent AIMET +release.

+
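For the Slicing Operation row above, the following short sketch shows the problematic and the recommended forms side by side (the tensor shape is an arbitrary example):

import torch

x = torch.randn(4, 64, 4, 4)
y_bad = x.view(-1, 1024)          # hard-coded flatten size: can break graph analysis/ONNX export
y_good = x.view(x.size(0), -1)    # recommended: derive the batch dimension from the input
print(y_bad.shape, y_good.shape)  # both torch.Size([4, 1024])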
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/model_quantization.html b/releases/1.32.2/user_guide/model_quantization.html new file mode 100644 index 00000000..ba2e1f61 --- /dev/null +++ b/releases/1.32.2/user_guide/model_quantization.html @@ -0,0 +1,1354 @@ + + + + + + AIMET Model Quantization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Model Quantization

+

Models are generally trained on floating-point hardware like CPUs and GPUs. However, when these trained models are run on quantized hardware that supports fixed-precision operations, model parameters are converted from floating-point precision to fixed precision. As an example, when running on hardware that supports 8-bit integer operations, the floating-point parameters in the trained model need to be converted to 8-bit integers. It is observed that for some models, running on an 8-bit fixed-precision runtime introduces a loss in accuracy due to noise added from the use of fixed-precision parameters and fixed-precision operations.

+

AIMET provides multiple techniques and tools which help to create quantized models with a minimal loss in accuracy +relative to floating-point models.

+

This section provides information on typical use cases and AIMET’s quantization features.

+
+

Use Cases

+

1. Predict on-target accuracy: AIMET enables a user to simulate the effects of quantization to get a first order +estimate of the model’s accuracy when run on quantized targets. This is useful to get an estimate of on-target accuracy +without needing an actual target platform. Note that to create a simulation model, AIMET uses representative data +samples to compute per-layer quantization encodings.

+
+
../_images/quant_use_case_1.PNG +
+

2. Post-Training Quantization (PTQ): PTQ techniques attempt to make a model more quantization friendly without +requiring model re-training/fine-tuning. PTQ (as opposed to fine-tuning) is recommended as a first step in a +quantization workflow due to the following advantages:

+
    +
  • No need for the original training pipeline; an evaluation pipeline is sufficient

  • +
  • Only requires a small unlabeled dataset for calibration (can even be data-free in some scenarios)

  • +
  • Fast, simple, and easy to use

    +
    +
    ../_images/quant_use_case_3.PNG +
    +
  • +
+

Note that with PTQ techniques, the quantized model accuracy may still have a gap relative to the floating-point model. +In such a scenario, or to even further improve the model accuracy, fine-tuning is recommended.

+

3. Quantization-Aware Training (QAT)/Fine-Tuning: QAT enables a user to fine-tune a model with quantization +operations inserted in network graph, which in effect adapts the model parameters to be robust to quantization noise. +While QAT requires access to a training pipeline and dataset, and takes longer to run due to needing a few epochs of +fine-tuning, it can provide better accuracy especially at low bitwidths. A typical QAT workflow is illustrated below.

+
+
../_images/quant_use_case_2.PNG +
+
+
+

AIMET Quantization Features

+
+
+
    +
  • +
    Quantization Simulation:

    QuantSim enables a user to modify a model by adding quantization simulation ops. When an evaluation is run on a +model with these quantization simulation ops, the user can observe a first-order simulation of expected accuracy on +quantized hardware.

    +
    +
    +
  • +
  • +
    Quantization-Aware Training (QAT):

    QAT allows users to take a QuantSim model and further fine-tune the model parameters by taking quantization into +account.

    +

    Two modes of QAT are supported:

    +
      +
    • +
      Regular QAT:

      Fine-tuning of model parameters. Trainable parameters such as module weights, biases, etc. can be +updated. The scale and offset quantization parameters for activation quantizers remain constant. Scale and +offset parameters for weight quantizers will update to reflect new weight values after each training step.

      +
      +
      +
    • +
    • +
      QAT with Range Learning:

      In addition to trainable module weights and scale/offset parameters for weight quantizers, scale/offset +parameters for activation quantizers are also updated during each training step.

      +
      +
      +
    • +
    +
    +
    +
  • +
+
+

Post-Training Quantization

+
    +
  • Post-Training Quantization (PTQ) Techniques:

    +
    +

    Post-training quantization techniques help a model improve quantized accuracy without needing to re-train.

    +
    +
    +
      +
    • +
      AutoQuant:

      AIMET provides an API that integrates the post-training quantization techniques described below. AutoQuant is +recommended for PTQ. If desired, individual techniques can be invoked using standalone feature specific APIs.

      +
      +
      +
    • +
    • +
      Adaptive Rounding (AdaRound):

      Determines optimal rounding for weight tensors to improve quantized performance.

      +
      +
      +
    • +
    • +
      Cross-Layer Equalization:

      Equalizes weight ranges in consecutive layers.

      +
      +
      +
    • +
    • +
      BN Re-estimation:

      Re-estimates Batch Norm layer statistics before folding the Batch Norm layers.

      +
      +
      +
    • +
    • +
      Bias Correction [Deprecated]:

      Bias Correction is considered deprecated. It is advised to use AdaRound instead.

      +
      +
      +
    • +
    +
    +
  • +
+
+
+

Debugging/Analysis Tools

+
+
+
    +
  • +
    Debugging/Analysis Tools
      +
    • +
      QuantAnalyzer:

      Automated debugging of the model to understand sensitivity to weight and/or activation quantization, individual +layer sensitivity, etc.

      +
      +
      +
    • +
    • +
      Visualizations:

      Visualizations and histograms of weight and activation ranges.

      +
      +
      +
    • +
    +
    +
    +
  • +
+
+
+
+

AIMET Quantization Workflow

+

This section describes the recommended workflow for quantizing a neural network.

+
+
../_images/quantization_workflow.PNG +
+

1. Model prep and validation

+

Before attempting quantization, ensure that models have been defined in accordance with the model guidelines. These guidelines depend on the ML framework the model is written in.

+
+

PyTorch

+
+
+
    +
  • Pytorch:

    +
    +

    PyTorch Model Guidelines

    +
    +

    In the case of PyTorch, there exists the Model Validator utility, to automate the checking of certain PyTorch model +requirements, as well as the Model Preparer utility, to automate the updating of the model definition to align with +certain requirements.

    +

    In this model prep and validation phase, we advise the following flow:

    +../_images/pytorch_model_prep_and_validate.PNG +

    Users can use the model validator utility first to check if the model can be run with AIMET. If validator checks +fail, users can first try using model preparer in their pipeline, an automated feature for updating models, and +retry the model validator to see if checks now pass. If the validator continues to print warnings, users will need +to update the model definition by hand prior to using AIMET features.

    +

    For more information on model validator and preparer, refer to the corresponding sections in +AIMET PyTorch Quantization APIs.

    +
    +
    +
  • +
+
+
+

Tensorflow

+
+
+ +

2. PTQ/AutoQuant

+

The user can apply various PTQ techniques to the model to adjust model parameters and make the model more robust to +quantization. We recommend trying AutoQuant first, a PTQ feature which internally tries various other PTQ methods and +finds the best combination of methods to apply. Refer to the +AIMET Quantization Features section for more details on PTQ/AutoQuant.

+

3. QAT

+

If model accuracy is still not satisfactory after PTQ/AutoQuant, the user can use QAT to fine-tune the model. Refer to +the AIMET Quantization Features section for more details on QAT.

+

4. Exporting models

+

In order to bring the model onto the target, users will need two things:

+
    +
  • a model with updated weights

  • +
  • an encodings file containing quantization parameters associated with each quantization op

  • +
+

AIMET QuantSim provides export functionality to generate both items. The exported model type will differ based on the ML +framework used:

+
    +
  • .onnx for PyTorch

  • +
  • meta/checkpoint for TensorFlow

  • +
  • .h5 and .pb for Keras

  • +
+

Depending on which AIMET Quantization features were used, the user may need to take different steps to export the model +and encodings file. For example, calling AutoQuant will automatically export the model and encodings file as part of its +processing. If QAT is used, users will need to call .export() on the QuantSim object. If lower level PTQ techniques like +CLE are used, users will need to first create a QuantSim object from the modified model, and then call .export() on the +QuantSim object.

+
+
+
+

Debugging Guidelines

+
+
+

Applying AIMET Quantization features may involve some trial and error in order to find the best optimizations to apply +on a particular model. We have included some debugging steps in the Quantization Guidebook +that can be tried when quantization accuracy does not seem to improve right off the bat.

+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/post_training_quant_techniques.html b/releases/1.32.2/user_guide/post_training_quant_techniques.html new file mode 100644 index 00000000..8f43754a --- /dev/null +++ b/releases/1.32.2/user_guide/post_training_quant_techniques.html @@ -0,0 +1,1223 @@ + + + + + + AIMET Post-Training Quantization Techniques — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Post-Training Quantization Techniques

+
+

Overview

+

It is observed that some ML models show reduced inference accuracy when run on quantized hardware due to approximation noise. AIMET provides post-training quantization techniques that help adjust the parameters in the model such that the model becomes more quantization-friendly. AIMET post-training quantization techniques are designed to be applied to pre-trained ML models. These techniques are explained as part of the “Data-Free Quantization Through Weight Equalization and Bias Correction” paper at ICCV 2019 - https://arxiv.org/abs/1906.04721

+
+
+

User Flow

+
+
../_images/flow_diagram_cle.png +
+
    +
  1. BatchNorm Folding: This feature folds batch-norm layers (if present) into surrounding layers.

    +
    +
    +

    +
    +
    +
  2. +
  3. Quantization Visualization: AIMET provides visualization tools that help guide the user to determine if AIMET post-training quantization techniques are useful for a given model. Specifically, the visualization tools will show per-channel ranges of parameters to highlight if there is big discrepancy in ranges between different channels in a layer.

    +
    +
    ../_images/cle_5.png +
    +

    +
    +
    +
  4. +
  5. Replace ReLU6 with ReLU: This feature replaces ReLU6 layers with ReLU layers. This is needed for the subsequent cross-layer scaling step. However, this replacement can lead to a drop in accuracy for some models. If this drop in accuracy is not acceptable, the user may be better off not using the post-training quantization techniques.

    +
    +
    +

    +
    +
    +
  6. +
  7. Cross Layer Scaling: In some models, the parameter ranges for different channels in a layer show a wide variance. This feature attempts to equalize the parameter ranges across different channels. As seen below, the ranges of weights per channel in a layer vary significantly. Cross-Layer Scaling scales the per-channel weights of consecutive layers. This helps increase the range for layers with a low range and reduce the range for layers with a high range. As a result, different channels have similar ranges and the same quantization parameters can be used for the weights across all channels (a small numerical sketch of this scaling is given at the end of this section).

    +
    +
    ../_images/cle_1.png +
    +

    As shown below, AIMET takes in a model and equalizes the distribution of weights per channel of consecutive layers. The scaling factor is calculated and used to scale the weights. The output of the model remains the same and the dynamic range of weight distribution is reduced.

    +
    +
    ../_images/cle_4.png +
    +

    +
    +
    +
  8. +
  9. High Bias Fold: Cross-layer scaling may result in high bias parameter values for some layers. This technique folds some of the bias of a layer into the subsequent layer’s parameters. High-bias fold requires batch-norm parameters to operate on. If the original model did not have batch-norm parameters for a given layer, the high-bias fold technique will not be applied to that layer.

    +
    +
    +

    +
    +
    +
  10. +
  11. Bias Correction: Quantization sometimes leads to a shift in layer outputs. This technique helps correct the shift by adjusting the bias parameters of that layer. The bias parameter is iteratively corrected/updated for each layer: the layer whose bias has to be corrected, and all the layers above it, are quantized. There are two supported techniques, namely Empirical Bias Correction and Analytical Bias Correction.

  12. +
+

In the empirical bias correction technique, representative data is passed through both the FP32 model and the quantized model. Outputs are extracted from both models for the layer to be corrected and used to correct the bias parameter, as shown below for a single layer. This process continues for all layers in the model.

+../_images/bias_correction_empirical.png +

In analytical bias correction, data from BatchNorms, when present, is used to estimate the correction factor instead of passing data through the model as in the empirical case.

+../_images/bias_correction_analytical.png +
+
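As referenced above, here is a small numerical sketch of cross-layer scaling for two consecutive fully-connected layers separated by a ReLU; it illustrates the idea from the referenced paper and is not the AIMET implementation:

import numpy as np

W1 = np.random.randn(64, 32) * np.random.uniform(0.1, 5.0, size=(64, 1))  # uneven channel ranges
b1 = np.random.randn(64)
W2 = np.random.randn(16, 64)

r1 = np.abs(W1).max(axis=1)              # range of each output channel of layer 1
r2 = np.abs(W2).max(axis=0)              # range of the matching input channel of layer 2
s = np.sqrt(r1 * r2) / r2                # per-channel scaling factors

W1_eq, b1_eq = W1 / s[:, None], b1 / s   # scale layer 1 down ...
W2_eq = W2 * s[None, :]                  # ... and layer 2 up, so both ranges become sqrt(r1*r2)

relu = lambda v: np.maximum(v, 0)        # ReLU is equivariant to positive per-channel scaling
x = np.random.randn(32)
print(np.allclose(W2 @ relu(W1 @ x + b1), W2_eq @ relu(W1_eq @ x + b1_eq)))   # True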
+

Cross-Layer Equalization API

+

Please refer to the links below to view the Cross-Layer Equalization API for each AIMET variant:

+ +
+
+

FAQs

+
    +
  1. +
    How many samples of data are required to perform Bias Correction?

    Bias Correction requires a representative dataset. We have observed that providing 500-1000 samples works well.

    +
    +
    +
  2. +
  3. +
    Which is better Empirical Bias Correction or Analytical + Empirical Bias Correction?

    If speed is not a bottleneck, it is suggested to use Empirical Bias Correction; otherwise, use the hybrid approach of combining both.

    +
    +
    +
  4. +
+
+
+

References

+
    +
  1. Markus Nagel, Mart van Baalen, Tijmen Blankevoort, Max Welling. “Data-Free Quantization Through Weight Equalization and Bias Correction.” IEEE International Conference on Computer Vision (ICCV), Seoul, October 2019.

  2. +
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/quant_analyzer.html b/releases/1.32.2/user_guide/quant_analyzer.html new file mode 100644 index 00000000..c3cfa842 --- /dev/null +++ b/releases/1.32.2/user_guide/quant_analyzer.html @@ -0,0 +1,1222 @@ + + + + + + AIMET QuantAnalyzer — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET QuantAnalyzer

+
+

Overview

+

The QuantAnalyzer feature analyzes the model for quantization and points out sensitive parts/hotspots in the model. The analyses are performed automatically and only require the user to pass in callbacks for performing the forward pass and evaluation, and optionally a dataloader for MSE loss analysis.

+

For each analysis, QuantAnalyzer outputs json and/or html files containing data and plots for easy visualization.

+
+
+

Requirements

+
+
To call the QuantAnalyzer API, users need to provide the following:
    +
  • An FP32 pretrained model for analysis

  • +
  • A dummy input for the model which can contain random values, but must match the shape of the model’s expected input

  • +
  • A user defined function for passing 500-1000 representative data samples through the model for quantization calibration.

  • +
  • A user defined function for passing labeled data through the model for evaluation, returning an accuracy metric

  • +
  • (Optional, for running MSE loss analysis) A dataloader providing unlabeled data to be passed through the model

  • +
+
+
+

Other quantization related settings are also provided in the call to analyze a model. +Please refer to PyTorch QuantAnalyzer API Docs for more information on how to call the QuantAnalyzer feature.

+

Note: Typically on quantized runtimes, batch normalization layers will be folded where possible. +So that users do not have to call a separate API to do so, QuantAnalyzer automatically performs Batch Norm Folding prior to running its analyses.

+
+
+

Detailed Analysis Descriptions

+

QuantAnalyzer performs the following analyses:

+
    +
  1. +
    Sensitivity analysis to weight and activation quantization:

    QuantAnalyzer compares the accuracies of the original FP32 model, an activation-only quantized model, and a weight-only quantized model.

    +

    This helps users determine which AIMET quantization technique(s) will be more beneficial for the model. +For example, in situations where the model is more sensitive to activation quantization, PTQ techniques like Adaptive Rounding or Cross Layer Equalization might not be very helpful.

    +

    Accuracy values for each model are printed as part of AIMET logging.

    +
    +
    +
  2. +
  3. +
    Per layer quantizer enablement analysis:

    Sometimes the accuracy drop incurred from quantization can be attributed to only a subset of quantizers within the model. +QuantAnalyzer performs analyses to find such layers by enabling and disabling individual quantizers to observe how the model accuracy changes.

    +

    The following two types of quantizer enablement analyses are performed:

    +
      +
    1. Disable all quantizers across the model and, for each layer, enable only that layer’s output quantizer and perform evaluation with the provided callback. +This results in accuracy values obtained for each layer in the model when only that layer’s quantizer is enabled, allowing users to observe effects of individual layer quantization and pinpoint culprit layer(s) and hotspots.

    2. +
    3. Enable all quantizers across the model and, for each layer, disable only that layer’s output quantizer and perform evaluation with the provided callback. +Once again, accuracy values are produced for each layer in the model when only that layer’s quantizer is disabled.

    4. +
    +

    As a result of these analyses, AIMET outputs per_layer_quant_enabled.html and per_layer_quant_disabled.html respectively, containing plots mapping layers on the x-axis to model accuracy on the y-axis.

    +

    JSON files per_layer_quant_enabled.json and per_layer_quant_disabled.json are also produced, containing the data shown in the .html plots.

    +
    +
    +
  4. +
  5. +
    Per layer encodings min-max range analysis:

    As part of quantization, encoding parameters for each quantizer must be obtained. +These parameters include scale, offset, min, and max, and are used for mapping floating point values to quantized integer values.

    +

    QuantAnalyzer tracks the min and max encoding parameters computed by each quantizer in the model as a result of forward passes through the model with representative data (from which the scale and offset values can be directly obtained).

    +

    As a result of this analysis, AIMET outputs html plots and json files for each activation quantizer and each parameter quantizer (contained in the min_max_ranges folder), containing the encoding min/max values for each.

    +

    If Per Channel Quantization (PCQ) is enabled, encoding min and max values for all the channels of each weight will be shown.

    +
    +
    +
  6. +
  7. +
    Per layer statistics histogram:

    Under the TF Enhanced quantization scheme, encoding min/max values for each quantizer are obtained by collecting a histogram of tensor values seen at that quantizer and potentially tossing out outliers.

    +

    When this quantization scheme is selected, QuantAnalyzer will output plots for each quantizer in the model, displaying the histogram of tensor values seen at that quantizer. +These plots are available as part of the activations_pdf and weights_pdf folders, containing a separate .html plot for each quantizer.

    +
    +
    +
  8. +
  9. +
    Per layer MSE loss:

QuantAnalyzer can optionally monitor each layer’s output in the original FP32 model along with the corresponding layer output in the quantized model, and calculate the MSE loss between the two. This helps identify which layers contribute most to quantization noise.

    +

To enable this optional analysis, users need to pass in a dataloader for QuantAnalyzer to read from. Approximately 256 samples/images are sufficient (a minimal usage sketch appears after this list).

    +

    A per_layer_mse_loss.html file will be generated containing a plot mapping layer quantizers on the x-axis to MSE loss on the y-axis. +A corresponding per_layer_mse_loss.json file will also be generated containing data corresponding to the .html file.

    +
    +
    +
  10. +
+
+
+
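As a rough illustration of the analyses described above, the sketch below assumes the PyTorch variant (aimet_torch). The model, dummy input, data loaders and the calibrate/evaluate callbacks are placeholders, and the exact class and method signatures shown are assumptions that should be verified against the QuantAnalyzer API links below.

import torch
from aimet_common.defs import QuantScheme
from aimet_common.utils import CallbackFunc
from aimet_torch.quant_analyzer import QuantAnalyzer

# Placeholders: a pretrained model and callbacks from the user's own pipeline.
model = get_pretrained_model().eval()                                   # hypothetical helper
dummy_input = torch.randn(1, 3, 224, 224)

forward_pass_callback = CallbackFunc(calibrate, unlabeled_data_loader)  # user-defined calibration fn
eval_callback = CallbackFunc(evaluate_accuracy, labeled_data_loader)    # user-defined evaluation fn

analyzer = QuantAnalyzer(model, dummy_input, forward_pass_callback, eval_callback)

# Optional per-layer MSE loss analysis; ~256 unlabeled samples are enough.
analyzer.enable_per_layer_mse_loss(unlabeled_data_loader, num_batches=4)

# Runs the analyses and writes the .html / .json artifacts described above.
analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
                 default_param_bw=8,
                 default_output_bw=8,
                 config_file=None,
                 results_dir='./quant_analyzer_results')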

QuantAnalyzer API

+

Please refer to the links below to view the QuantAnalyzer API for each AIMET variant:

+ +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/quantization_aware_training.html b/releases/1.32.2/user_guide/quantization_aware_training.html new file mode 100644 index 00000000..1f334024 --- /dev/null +++ b/releases/1.32.2/user_guide/quantization_aware_training.html @@ -0,0 +1,1193 @@ + + + + + + AIMET Quantization Aware Training — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Quantization Aware Training

+
+

Overview

+

In cases where PTQ techniques are not sufficient for mitigating quantization error, users can use quantization-aware +training (QAT). QAT models the quantization noise during training and allows the model to find better solutions +than post-training quantization. However, the higher accuracy comes with the usual costs of neural +network training, i.e. longer training times, need for labeled data and hyperparameter search.

+
+
+

QAT workflow

+

The QAT workflow is largely similar to the flow for using Quantization Simulation for inference. The only difference is +that a user can take the sim.model and use it in their training pipeline in order to fine-tune model parameters while +taking quantization noise into account. The user’s training pipeline will not need to change in order to train the +sim.model compared to training the original model.

+

A typical pipeline is as follows:

+
    +
  1. Create a QuantSim sim object from a pretrained model.

  2. +
  3. Calibrate the sim using representative data samples to come up with initial encoding values for each quantizer node.

  4. +
  5. Pass the sim.model into a training pipeline to fine-tune the model parameters.

  6. +
  7. Evaluate the sim.model using an evaluation pipeline to check whether model accuracy has improved.

  8. +
  9. Export the sim to generate a model with updated weights and no quantization nodes, along with the accompanying +encodings file containing quantization scale/offset parameters for each quantization node.

  10. +
+

Observe that, compared to QuantSim inference, the fine-tuning step (passing sim.model into a training pipeline) is the only addition when performing QAT.
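As an illustration, here is a minimal sketch of this pipeline using the PyTorch variant (aimet_torch). The training and evaluation functions, data loaders and output paths are placeholders from the user’s existing pipeline, and the argument choices are assumptions to be checked against the QuantSim API documentation.

import torch
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

model = get_pretrained_model()                          # hypothetical helper
dummy_input = torch.randn(1, 3, 224, 224)

# Create a QuantSim object (a *_range_learning scheme corresponds to QAT with Range Learning)
sim = QuantizationSimModel(model,
                           dummy_input=dummy_input,
                           quant_scheme=QuantScheme.training_range_learning_with_tf_init,
                           default_param_bw=8,
                           default_output_bw=8)

# Calibrate with representative data to compute initial encodings for each quantizer
def pass_calibration_data(sim_model, _):
    sim_model.eval()
    with torch.no_grad():
        for images, _ in calibration_loader:            # user-provided data loader
            sim_model(images)

sim.compute_encodings(pass_calibration_data, forward_pass_callback_args=None)

# Fine-tune sim.model with the existing training pipeline, then evaluate it
train(sim.model, train_loader, epochs=15)               # user-defined training loop
accuracy = evaluate(sim.model, val_loader)              # user-defined evaluation

# Export the model with updated weights plus the encodings file
sim.export(path='./output', filename_prefix='model_qat', dummy_input=dummy_input.cpu())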

+
+
+

QAT modes

+

There are two variants of QAT, referred to as QAT without Range Learning and QAT with Range Learning.

+

In QAT without Range Learning, encoding values for activation quantizers are found once in the beginning during the +calibration step after QuantSim has been instantiated, and are not updated again subsequently throughout training.

+

In QAT with Range Learning, encoding values for activation quantizers are initially set during the calibration step, but +are free to update during training, allowing a more optimal set of scale/offset quantization parameters to be found +as training takes place.

+

In both variants, parameter quantizer encoding values will continue to update in accordance with the parameters +themselves updating during training.

+
+
+

Recommendations for Quantization-Aware Training

+

Here are some general guidelines that can aid in improving performance or faster convergence with Quantization-aware Training (QAT):

+
    +
  • +
    Initialization:
      +
• Often it can be beneficial to first apply post training quantization techniques like AutoQuant before applying QAT. This is especially beneficial if there is a large drop in INT8 performance compared to the FP32 baseline.

    • +
    +
    +
    +
  • +
  • +
    Hyper-parameters:
      +
    • Number of epochs: 15-20 epochs are generally sufficient for convergence

    • +
• Learning rate: Comparable to (or one order higher than) the FP32 model’s final learning rate at convergence. Results in AIMET are obtained with learning rates on the order of 1e-6.

    • +
• Learning rate schedule: Divide the learning rate by 10 every 5-10 epochs (a sketch of one such setup follows this list)

    • +
    +
    +
    +
  • +
+
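A minimal sketch of an optimizer and schedule following these guidelines, written in plain PyTorch. The starting learning rate, step size and epoch count are assumptions chosen to fall inside the ranges above, and `sim` and `train_one_epoch` stand in for the user’s own QuantSim object and training loop.

import torch

# Assumes `sim` is a QuantizationSimModel that has already been created and calibrated.
optimizer = torch.optim.SGD(sim.model.parameters(), lr=1e-5, momentum=0.9)

# Divide the learning rate by 10 every 10 epochs, training for ~20 epochs in total.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(20):
    train_one_epoch(sim.model, train_loader, optimizer)   # user-defined training step
    scheduler.step()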
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/quantization_configuration.html b/releases/1.32.2/user_guide/quantization_configuration.html new file mode 100644 index 00000000..5f5b31b6 --- /dev/null +++ b/releases/1.32.2/user_guide/quantization_configuration.html @@ -0,0 +1,1406 @@ + + + + + + Quantization Simulation Configuration — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

Quantization Simulation Configuration

+
+

Overview

+

AIMET allows the configuration of quantizer placement and settings in accordance with a set of rules specified in a json configuration file, applied when the Quantization Simulation API is called.

+

Settings such as quantizer enablement, per channel quantization, symmetric quantization, and specifying fused ops when quantizing can be configured. The general use case for this file would be for users to match the quantization rules for a particular runtime they would like to simulate.

+

For examples on how to provide a specific configuration file to AIMET Quantization Simulation, +refer to the API docs for PyTorch Quantsim, TensorFlow Quantsim, and Keras Quantsim.

+

Users are advised to begin with the default configuration file under

+

aimet_common/quantsim_config/default_config.json

+

For most users of AIMET, no additional changes to the default configuration file should be needed.

+
+
+

Configuration File Structure

+

The configuration file contains six main sections, in increasing amounts of specificity:

+../_images/quantsim_config_file.png +

Rules defined in a more general section can be overruled by subsequent rules defined in a more specific section. For example, one may specify in the “defaults” section that no layers be quantized, but then turn on quantization for specific layers in the “op_type” section (as illustrated in the sketch below).
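As an illustration of this precedence, the sketch below writes a minimal configuration in which parameter quantizers are on by default, activation quantizers are left off in “defaults”, and output quantization is then enabled only for Conv ops in “op_type”. It assumes the configuration schema requires all six top-level sections and that the file can be handed to QuantSim via its config_file argument (PyTorch variant shown); check the Quantsim API docs referenced above for the exact behavior.

import json
import torch
from aimet_torch.quantsim import QuantizationSimModel

custom_config = {
    "defaults": {
        "ops": {},                                   # activation quantizers stay disabled by default
        "params": {"is_quantized": "True"}           # parameter quantizers enabled by default
    },
    "params": {},
    "op_type": {
        "Conv": {"is_output_quantized": "True"}      # ...but enable output quantization for Conv ops
    },
    "supergroups": [],
    "model_input": {},
    "model_output": {}
}

with open('custom_config.json', 'w') as f:
    json.dump(custom_config, f)

sim = QuantizationSimModel(model,                    # user's pretrained model (placeholder)
                           dummy_input=torch.randn(1, 3, 224, 224),
                           config_file='custom_config.json')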

+
+
+

How to configure individual Configuration File Sections

+

When working with a new runtime with different rules, or for experimental purposes, users can refer to this section to understand how to configure individual sections in a configuration file.

+
    +
  1. defaults:

    +
    +
    {"defaults": {
    +    "ops": {                                # Required dictionary, but can be empty
    +        "is_output_quantized": "True",      # Optional: Possible settings: True
    +        "is_symmetric": "False"             # Optional: Possible settings: True, False
    +    },
    +    "params": {                             # Required dictionary, but can be empty
    +        "is_quantized": "True",             # Optional: Possible settings: True, False
    +        "is_symmetric": "True"              # Optional: Possible settings: True, False
    +    },
    +    "strict_symmetric": "False",            # Optional: Possible settings: True, False
    +    "unsigned_symmetric": "True",           # Optional: Possible settings: True, False
    +    "per_channel_quantization": "False"     # Optional: Possible settings: True, False
    +    },
    +
    +
    +

    In the defaults section, it is required to include an “ops” dictionary and a “params” dictionary (though these dictionaries may be empty).

    +

    The “ops” dictionary holds settings that will apply to all activation quantizers in the model. +In this section, the following settings are available:

    +
    +
      +
    • +
      is_output_quantized:

      An optional parameter. If included, it must be set to “True”. +Including this setting will turn on all output activation quantizers by default. +If not specified, all activation quantizers will start off as disabled.

      +

      For cases when the runtime quantizes input activations, we typically see this only done for certain op types. +Configuring these settings for specific op types is covered in sections further below.

      +
      +
      +
    • +
    • +
      is_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will place all activation quantizers in symmetric mode by default. +A “False” setting, or omitting the parameter altogether, will set all activation quantizers to asymmetric mode by default.

      +
      +
      +
    • +
    +
    +

    The “params” dictionary holds settings that will apply to all parameter quantizers in the model. +In this section, the following settings are available:

    +
    +
      +
    • +
      is_quantized:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will turn on all parameter quantizers by default. +A “False” setting, or omitting the parameter altogether, will disable all parameter quantizers by default.

      +
      +
      +
    • +
    • +
      is_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will place all parameter quantizers in symmetric mode by default. +A “False” setting, or omitting the parameter altogether, will set all parameter quantizers to asymmetric mode by default.

      +
      +
      +
    • +
    +
    +

    Aside from the “ops” and “params” dictionary, additional settings governing quantizers in the model are available:

    +
      +
    • +
      strict_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +When set to “True”, quantizers which are configured in symmetric mode will use strict symmetric quantization. +When set to “False” or omitting the parameter altogether, quantizers which are configured in symmetric mode will not use strict symmetric quantization.

      +
      +
      +
    • +
    • +
      unsigned_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +When set to “True”, quantizers which are configured in symmetric mode will use unsigned symmetric quantization when available. +When set to “False” or omitting the parameter altogether, quantizers which are configured in symmetric mode will not use unsigned symmetric quantization.

      +
      +
      +
    • +
    • +
      per_channel_quantization:

      An optional parameter. If included, possible settings include “True” and “False”. +When set to “True”, parameter quantizers will use per channel quantization as opposed to per tensor quantization. +When set to “False” or omitting the parameter altogether, parameter quantizers will use per tensor quantization.

      +
      +
      +
    • +
    +
    +
  2. +
  3. params:

    +
    +
        "params": {                         # Can specify 0 or more param types
    +        "weight": {
    +            "is_quantized": "True",     # Optional: Possible settings: True, False
    +            "is_symmetric": "True"      # Optional: Possible settings: True, False
    +        }
    +    },
    +
    +
    +

    In the params section, settings can be configured for certain types of parameters throughout the model. +For example, adding settings for “weight” will affect all parameters of type “weight” in the model. +Currently supported parameter types include:

    +
    +
      +
    • weight

    • +
    • bias

    • +
    +
    +

    For each parameter type, the following settings are available:

    +
    +
      +
    • +
      is_quantized:

An optional parameter. If included, possible settings include “True” and “False”. A “True” setting will turn on all parameter quantizers of that type. A “False” setting will disable all parameter quantizers of that type. By omitting the setting, the parameter will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    • +
      is_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will place all parameter quantizers of that type in symmetric mode. +A “False” setting will place all parameter quantizers of that type in asymmetric mode. +By omitting the setting, the parameter will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    +
    +
    +
  4. +
  5. op_type:

    +
    +
        "op_type": {                                # Can specify 0 or more ONNX op types
    +        "Gemm": {
    +            "is_input_quantized": "True",       # Optional: Possible settings: True
    +            "is_output_quantized": "False",     # Optional: Possible settings: True, False
    +            "per_channel_quantization": "True", # Optional: Possible settings: True, False
    +            "params": {                         # Optional, can specify 1 or more param types
    +                "weight": {
    +                    "is_quantized": "True",     # Optional: Possible settings: True, False
    +                    "is_symmetric": "True"      # Optional: Possible settings: True, False
    +                }
    +            },
    +        },
    +    },
    +
    +
    +

    In the op type section, settings affecting particular op types can be specified. +The configuration file recognizes ONNX op types, and will internally map the type to a PyTorch or TensorFlow op type +depending on which framework is used.

    +

    For each op type, the following settings are available:

    +
    +
      +
    • +
      is_input_quantized:

      An optional parameter. If included, it must be set to “True”. +Including this setting will turn on input quantization for all ops of this op type. +Omitting the setting will keep input quantization disabled for all ops of this op type.

      +
      +
      +
    • +
    • +
      is_output_quantized:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will turn on output quantization for all ops of this op type. +A “False” setting will disable output quantization for all ops of this op type. +By omitting the setting, output quantizers of this op type will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    • +
      is_symmetric:

      An optional parameter. If included, possible settings include “True” and “False”. +A “True” setting will place all quantizers of this op type in symmetric mode. +A “False” setting will place all quantizers of this op type in asymmetric mode. +By omitting the setting, quantizers of this op type will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    • +
      per_channel_quantization:

      An optional parameter. If included, possible settings include “True” and “False”. +When set to “True”, parameter quantizers of this op type will use per channel quantization as opposed to per tensor quantization. +When set to “False”, parameter quantizers of this op type will use per tensor quantization. +By omitting the setting, parameter quantizers of this op type will fall back to the setting specified by the defaults section.

      +
      +
      +
    • +
    +
    +

    For a particular op type, settings for particular parameter types can also be specified. +For example, specifying settings for weight parameters of a Conv op type will affect only Conv weights and not weights +of Gemm op types.

    +

To specify settings for param types of this op type, include a “params” dictionary under the op type. Settings for this section follow the same convention as settings for parameter types in the preceding “params” section; however, they will only affect parameters for this op type.

    +
    +
  6. +
  7. supergroups:

    +
    +
        "supergroups": [    # Can specify 0 or more supergroup lists made up of ONNX op types
    +        {
    +            "op_list": ["Conv", "Relu"]
    +        },
    +        {
    +            "op_list": ["Conv", "Clip"]
    +        },
    +        {
    +            "op_list": ["Add", "Relu"]
    +        },
    +        {
    +            "op_list": ["Gemm", "Relu"]
    +        }
    +    ],
    +
    +
    +

Supergroups are a sequence of operations which are fused during quantization, meaning no quantization noise is introduced between members of the supergroup. For example, specifying [“Conv”, “Relu”] as a supergroup disables quantization between any adjacent Conv and Relu ops in the model.

    +

    When searching for supergroups in the model, only sequential groups of ops with no branches in between will be matched with supergroups defined in the list. +Using [“Conv”, “Relu”] as an example, if there was a Conv op in the model whose output is used by both a Relu op and a second op, the supergroup would not take effect for these Conv and Relu ops.

    +

    To specify supergroups in the config file, add each entry as a list of op type strings. +The configuration file recognizes ONNX op types, and will internally map the types to PyTorch or TensorFlow op types depending on which framework is used.

    +
    +
  8. +
  9. model_input:

    +
    +
        "model_input": {
    +        "is_input_quantized": "True"    # Optional: Possible settings: True
    +    },
    +
    +
    +

    The “model_input” section is used to configure the quantization of inputs to the model. +In this section, the following setting is available:

    +
      +
    • +
      is_input_quantized:

      An optional parameter. If included, it must be set to “True”. +Including this setting will turn on quantization for input quantizers to the model. +Omitting the setting will keep input quantizers set to whatever setting they were in as a result of applying configurations from earlier sections.

      +
      +
      +
    • +
    +
    +
  10. +
  11. model_output:

    +
    +
        "model_output": {
    +        "is_output_quantized": "True"   # Optional: Possible settings: True
    +    }
    +
    +
    +

    The “model_output” section is used to configure the quantization of outputs of the model. +In this section, the following setting is available:

    +
      +
    • +
      is_output_quantized:

      An optional parameter. If included, it must be set to “True”. +Including this setting will turn on quantization for output quantizers of the model. +Omitting the setting will keep output quantizers set to whatever setting they were in as a result of applying configurations from earlier sections.

      +
      +
      +
    • +
    +
    +
  12. +
+
+
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/quantization_feature_guidebook.html b/releases/1.32.2/user_guide/quantization_feature_guidebook.html new file mode 100644 index 00000000..82a0e3fc --- /dev/null +++ b/releases/1.32.2/user_guide/quantization_feature_guidebook.html @@ -0,0 +1,1178 @@ + + + + + + AIMET Quantization Features Guidebook — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Quantization Features Guidebook

+

AIMET supports various neural network quantization techniques. A more in-depth discussion on the various techniques and their usage is provided in the User Guide.

+

After applying an AIMET Quantization feature, if the model’s performance is still not satisfactory, we recommend a set of diagnostic steps to identify the bottlenecks and improve the performance. While this is not strictly an algorithm, these debugging steps can provide insights on why a quantized model underperforms and help to tackle the underlying issues. These steps are shown as a flow chart in the figure that follows and are described in more detail below:

+

FP32 sanity check An important initial debugging step is to ensure that the floating-point and quantized model behave similarly in the forward pass, especially when using custom quantization pipelines. Set the quantized model bit-width to 32 bits for both weights and activations, or bypass the quantization operation if possible, and check that the accuracy matches that of the FP32 model.

+

Weights or activations quantization The next debugging step is to identify how activation or weight quantization impacts the performance independently. Does performance recover if all weights are quantized to a higher bit-width while activations are kept at a lower bit-width, or conversely if all activations use a high bit-width and weights a low bit-width? This step can show the relative contribution of activation and weight quantization to the overall performance drop and point us towards the appropriate solution.

+

Fixing weight quantization If the previous step shows that weight quantization does cause a significant accuracy drop, then there are a few solutions to try: 1. Apply CLE if not already implemented, especially for models with depth-wise separable convolutions (a sketch of this call follows below). 2. Try per-channel quantization. This will address the issue of uneven per-channel weight distribution. 3. Apply bias correction or AdaRound if calibration data is available.
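For instance, applying CLE in the PyTorch variant is a single call, sketched below; the input shape is a placeholder, and bias correction and AdaRound have their own APIs described elsewhere in this documentation.

from aimet_torch.cross_layer_equalization import equalize_model

# Folds batch norms, equalizes weight ranges across consecutive layers,
# and performs high-bias folding on the model in place.
equalize_model(model, input_shapes=(1, 3, 224, 224))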

+../_images/quantization_debugging_flow_chart.png +

Fixing activation quantization +To reduce the quantization error from activation quantization, we can also try using different range setting methods or +adjust CLE to take activation quantization ranges into account, as vanilla CLE can lead to uneven activation +distribution.

+

Per-layer analysis If the global solutions have not restored accuracy to acceptable levels, we consider each quantizer individually. We set each quantizer sequentially to the target bit-width while keeping the rest of the network at 32 bits (see the inner for loop in the figure above).

+

Visualizing layers If the quantization of an individual tensor leads to a significant accuracy drop, we recommend visualizing the tensor distribution at different granularities, e.g. per-channel, and at different dimensions, e.g. per-token or per-embedding for activations in BERT.

+

Fixing individual quantizers The visualization step can reveal the source of the tensor’s sensitivity to quantization. Some common solutions involve custom range setting for this quantizer or allowing a higher bit-width for the problematic quantizer. If the problem is fixed and the accuracy recovers, we continue to the next quantizer. If not, we may have to resort to other methods, such as quantization-aware training (QAT).

+

After completing the above steps, the last step is to quantize the complete model to the desired bit-width. If the +accuracy is acceptable, we have our final quantized model ready to use. Otherwise, we can consider higher bit-widths and +smaller granularities or revert to more powerful quantization methods, such as quantization-aware training.

+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/quantization_sim.html b/releases/1.32.2/user_guide/quantization_sim.html new file mode 100644 index 00000000..066778bd --- /dev/null +++ b/releases/1.32.2/user_guide/quantization_sim.html @@ -0,0 +1,1280 @@ + + + + + + AIMET Quantization Simulation — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Quantization Simulation

+
+

Overview

+

AIMET’s Quantization Simulation feature provides functionality to simulate the effects of quantized hardware. This +allows the user to then apply post-training and/or fine-tuning techniques in AIMET to recover the loss in accuracy, and +ultimately deploy the model on the target device.

+

When applying QuantSim by itself, optimal quantization scale/offset parameters for each quantizer are found, but no +techniques for mitigating accuracy loss from quantization are applied. Users can either pass their original model +directly to QuantSim to simulate quantization noise on the starting model, or apply Post-Training Quantization +techniques to obtain an updated model to then pass into QuantSim to observe a difference in quantization accuracy as a +result of applying the techniques.

+

Once a QuantSim object has been created, users can fine-tune the model within the QuantSim object using their +existing pipeline. This method is described in the Quantization Aware Training page.

+

The quantization nodes used in QuantSim are custom quantizers defined in AIMET, and are not recognized by targets. +QuantSim provides an export functionality that will save a copy of the model with quantization nodes removed, as well as +generate an encodings file containing quantization scale/offset parameters for each activation and weight tensor in +the model.

+

A hardware runtime can ingest the encodings file and match it with the exported model to find what scale/offset values +to apply on each tensor in the model.

+
+
+

QuantSim Workflow

+

A typical workflow for using AIMET quantization simulation to simulate on-target quantized accuracy is described below.

+
    +
  1. The user starts with a pretrained floating-point FP32 model.

  2. +
  3. AIMET creates a simulation model by inserting quantization simulation ops into the model graph as explained in the +sub-section below.

  4. +
  5. AIMET also configures the inserted simulation ops. The configuration of these ops can be controlled via a +configuration file as discussed in sub-section below.

  6. +
7. AIMET finds optimal quantization parameters, such as scale/offsets, for the inserted quantization simulation ops. To do this, AIMET requires the user to provide a callback method that feeds a few representative data samples through the model. These samples can either be from the training or calibration datasets. Generally, on the order of 1,000-2,000 samples have been sufficient for AIMET to find optimal quantization parameters.

  8. +
  9. AIMET returns a quantization simulation model that can be used as a drop-in replacement for the original model in +their evaluation pipeline. Running this simulation model through the evaluation pipeline yields a quantized accuracy +metric that closely simulates on-target accuracy.

  10. +
11. The user can call .export() on the sim object to save a copy of the model with quantization nodes removed, along with an encodings file containing quantization scale/offset parameters for each activation and weight tensor in the model. A minimal sketch of this workflow appears after this list.

  12. +
+
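The sketch below condenses this workflow for the PyTorch variant. The model, calibration data loader and evaluation function are placeholders from the user’s own pipeline, and the argument choices are assumptions to be verified against the Quantization Simulation API links at the end of this page.

import torch
from aimet_torch.quantsim import QuantizationSimModel

dummy_input = torch.randn(1, 3, 224, 224)
sim = QuantizationSimModel(model, dummy_input=dummy_input)   # model: user's pretrained FP32 model

# Feed ~1,000-2,000 representative samples through the model so AIMET can
# compute scale/offset encodings for every inserted quantization op.
def pass_calibration_data(sim_model, _):
    sim_model.eval()
    with torch.no_grad():
        for images, _ in calibration_loader:                 # user-provided samples
            sim_model(images)

sim.compute_encodings(pass_calibration_data, forward_pass_callback_args=None)

# sim.model is a drop-in replacement in the existing evaluation pipeline.
quantized_accuracy = evaluate(sim.model, val_loader)         # user-defined evaluation

# Export the model (quantization nodes removed) plus the encodings file.
sim.export(path='./output', filename_prefix='model', dummy_input=dummy_input.cpu())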
+
+

Simulating Quantization Noise

+

The diagram below explains how quantization noise is introduced to a model when its input, output or parameters are +quantized and dequantized.

+
+
../_images/quant_3.png +
+

Since the dequantized value may not be exactly the same as the original value, the difference between the two values is the quantization noise.

+

In order to simulate quantization noise, AIMET QuantSim adds quantizer ops to the PyTorch/TensorFlow/Keras model graph. +The resulting model graph can be used as is in the user’s evaluation or training pipeline.

+
+
+

Determining Quantization Parameters (Encodings)

+

Using a QuantSim model, AIMET analyzes and determines the optimal quantization encodings (scale and offset parameters) +for each quantizer op.

+

To do this, AIMET passes some calibration samples through the model. Using hooks, tensor data is intercepted while +flowing through the model. A histogram is created to model the distribution of the floating point numbers in the output +tensor for each layer.

+../_images/quant_2.png +

Using the distribution of the floating point numbers in the output tensor for each layer, quantization encodings are +computed using the specified quantization calibration technique. An encoding for a layer consists of four numbers:

+
    +
• Min (qmin): Numbers below this are clamped

  • +
• Max (qmax): Numbers above this are clamped

  • +
  • Delta: Granularity of the fixed point numbers (is a function of the bit-width selected)

  • +
  • Offset: Offset from zero

  • +
+
+
The Delta and Offset can be calculated using Min and Max and vice versa using the equations:

\(\textrm{Delta} = \dfrac{\textrm{Max} - \textrm{Min}}{{2}^{\textrm{bitwidth}} - 1} \quad \textrm{Offset} = \dfrac{-\textrm{Min}}{\textrm{Delta}}\)
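A small worked example of these equations in plain Python. The numbers are illustrative, and the quantize/dequantize convention shown is one common choice rather than necessarily AIMET’s internal one.

# Worked example of the Delta / Offset equations above.
observed_min, observed_max = -0.8, 1.2      # from the calibration histogram
bitwidth = 8

delta = (observed_max - observed_min) / (2 ** bitwidth - 1)   # ~0.00784
offset = -observed_min / delta                                # 102.0 (rounded to an integer in practice)

# Quantize-dequantize round trip for one value; the residual is the quantization noise.
x = 0.5
q = min(2 ** bitwidth - 1, max(0, round(x / delta + offset)))  # 166
x_hat = (q - offset) * delta                                   # ~0.50196
noise = x - x_hat                                              # ~-0.002, bounded by ~delta/2 inside [min, max]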

+
+
+
+
+

Quantization Schemes

+

AIMET supports various techniques for coming up with min and max values for encodings, also called quantization schemes:

+
    +
  • Min-Max: Also referred to as “TF” in AIMET (The name TF represents the origin of this technique and +has no relation to what framework the user is using). To cover the whole dynamic range of the tensor, we can define +the quantization parameters Min and Max to be the observed Min and Max during the calibration process. This leads to +no clipping error. However, this approach is sensitive to outliers, as strong outliers may cause excessive rounding +errors.

  • +
• Signal-to-Quantization-Noise (SQNR): Also referred to as “TF Enhanced” in AIMET (The name TF represents the origin of this technique and has no relation to what framework the user is using). The SQNR approach is similar to the Mean Square Error (MSE) minimization approach. In the SQNR range setting method, we find qmin and qmax that minimize the total MSE between the original and the quantized tensor. Quantization noise and saturation noise are different types of errors which are weighted differently (a small numeric illustration of this trade-off follows this list).

  • +
+
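The snippet below is a small numeric illustration of the outlier sensitivity mentioned above, using plain numpy with made-up data: covering a single strong outlier with Min-Max inflates the step size for the entire tensor. An SQNR/MSE-based scheme instead searches for qmin/qmax that trade a little clipping error for finer steps on the bulk of the distribution.

import numpy as np

rng = np.random.default_rng(0)
bulk = rng.normal(0.0, 0.2, size=10_000)          # typical tensor values
with_outlier = np.append(bulk, 8.0)               # one strong outlier
bitwidth = 8

def minmax_delta(x):
    # Step size when Min/Max are taken directly from the observed range
    return (x.max() - x.min()) / (2 ** bitwidth - 1)

print(minmax_delta(bulk))           # ~0.006: fine-grained steps for the bulk of the values
print(minmax_delta(with_outlier))   # ~0.035: roughly 6x coarser, just to cover one outlier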

For each quantization scheme, there are “post training” and “training range learning” variants. The “post training” +variants are used during regular QuantSim inference as well as QAT without Range Learning, to come up with initial +encoding values for each quantization node. In QAT without Range Learning, encoding values for activation quantizers +will remain static (encoding values for parameter quantizers will change in accordance with changing parameter values +during training).

+

The “training range learning” variants are used during QAT with Range Learning. The schemes define how to come up with +initial encoding values for each quantization node, but also allow encoding values for activations to be learned +alongside parameter quantizer encodings during training.

+

For more details on QAT, refer to Quantization Aware Training.

+
+
+

Configuring Quantization Simulation Ops

+

Different hardware and on-device runtimes may support different quantization choices for neural network inference. For +example, some runtimes may support asymmetric quantization for both activations and weights, whereas other ones may +support asymmetric quantization just for weights.

+

As a result, we need to make quantization choices during simulation that best reflect our target runtime and hardware. AIMET provides a default configuration file, which can be modified. This file is used during quantization simulation if no other configuration file is specified. By default, the following configuration is used for quantization simulation:

+
    +
  • Weight quantization: Per-channel, symmetric quantization, INT8

  • +
  • Activation or layer output quantization: Per-tensor, asymmetric quantization, INT8

  • +
+

Quantization options that can be controlled via the configuration file include the following:

+
    +
  • Enabling/disabling of input and output quantizer ops

  • +
  • Enabling/disabling of parameter quantizer ops

  • +
  • Enabling/disabling of model input quantizer

  • +
  • Enabling/disabling of model output quantizer

  • +
  • Symmetric/Asymmetric quantization

  • +
  • Unsigned/signed symmetric quantization

  • +
  • Strict/non strict symmetric quantization

  • +
  • Per channel/per tensor quantization

  • +
  • Defining groups of layers to be fused (no quantization done on intermediate tensors within fused layers)

  • +
+

Please see the Quantization Simulation Configuration page which describes the configuration +options in detail.

+
+
+

Quantization Simulation APIs

+

Please refer to the links below to view the Quantization Simulation API for each AIMET variant:

+ +
+
+

Frequently Asked Questions

+
    +
  • +
    Q: How many samples are needed in the calibration step (compute encodings)?

    A: 1,000 - 2,000 unlabeled representative data samples are sufficient.

    +
    +
    +
  • +
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/release_notes.html b/releases/1.32.2/user_guide/release_notes.html new file mode 100644 index 00000000..a0cee6c0 --- /dev/null +++ b/releases/1.32.2/user_guide/release_notes.html @@ -0,0 +1,1391 @@ + + + + + + AIMET Release Notes — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Release Notes

+

Release Notes for Qualcomm AI Model Efficiency ToolKit (AIMET)

+
+

1.22.2

+

Tensorflow

+
    +
  • Added support for supergroups : MatMul + Add

  • +
  • Added support for TF-Slim BN name with backslash

  • +
  • Added support for Depthwise + Conv in CLS

  • +
+

Documentation

+ +
+
+

1.22.1

+
    +
  • Added support for QuantizableMultiHeadAttention for PyTorch nn.transformer layers by @quic-kyuykim

  • +
  • Support functional conv2d in model preparer by @quic-kyuykim

  • +
  • Enable qat with multi gpu by @quic-mangal

  • +
  • Optimize forward pass logic of PyTorch QAT 2.0 by @quic-geunlee

  • +
  • Fix functional depthwise conv support on model preparer by @quic-kyuykim

  • +
  • Fix bug in model validator to correctly identify functional ops in leaf module by @quic-klhsieh

  • +
  • Support dynamic functional conv2d in model preparer by @quic-kyuykim

  • +
  • Added updated default runtime config, also a per-channel one. Fixed n… by @quic-akhobare

  • +
  • Include residing module info in model validator by @quic-klhsieh

  • +
  • Support for Keras MultiHeadAttention Layer by @quic-ashvkuma

  • +
+

Documentation

+ +
+
+

1.22.0

+
    +
  • Support for simulation and QAT for PyTorch transformer models (including support for torch.nn mha and encoder layers)

  • +
+

Documentation

+ +
+
+

1.21.0

+
    +
  • New feature: PyTorch QuantAnalyzer - Visualize per-layer sensitivity and per-quantizer PDF histograms

  • +
  • New feature: TensorFlow AutoQuant - Automatically apply various AIMET post-training quantization techniques

  • +
  • PyTorch QAT with Range Learning: Added support for Per Channel Quantization

  • +
  • PyTorch: Enabled exporting of encodings for multi-output leaf module

  • +
  • +
    TensorFlow Adaround
      +
    • Added ability to use configuration file in API to adapt to a specific runtime target

    • +
    • Added Per-Channel Quantization support

    • +
    +
    +
    +
  • +
  • TensorFlow QuantSim: Added support for FP16 inference and QAT

  • +
  • +
    TensorFlow Per Channel Quantization
      +
    • Fixed speed and accuracy issues

    • +
    • Fixed zero accuracy for 16-bits per channel quantization

    • +
    • Added support for DepthWise Conv2d Op

    • +
    +
    +
    +
  • +
  • Multiple other bug fixes

  • +
+

Documentation

+ +
+ +
+

1.19.1.py37

+
    +
  • PyTorch: Added CLE support for Conv1d, ConvTranspose1d and Depthwise Separable Conv1d layers

  • +
  • PyTorch: Added High-Bias Fold support for Conv1D layer

  • +
  • PyTorch: Modified Elementwise Concat Op to support any number of tensors

  • +
  • Minor dependency fixes

  • +
+

Documentation

+ +
+
+

1.19.1

+
    +
  • PyTorch: Added CLE support for Conv1d, ConvTranspose1d and Depthwise Separable Conv1d layers

  • +
  • PyTorch: Added High-Bias Fold support for Conv1D layer

  • +
  • PyTorch: Modified Elementwise Concat Op to support any number of tensors

  • +
  • Minor dependency fixes

  • +
+

Documentation

+ +
+
+

1.18.0.py37

+
    +
  • Multiple bug fixes

  • +
  • Additional feature examples for PyTorch and TensorFlow

  • +
+

Documentation

+ +
+
+

1.18.0

+
    +
  • Multiple bug fixes

  • +
  • Additional feature examples for PyTorch and TensorFlow

  • +
+

Documentation

+ +
+
+

1.17.0.py37

+
    +
  • Add Adaround TF feature

  • +
  • Added Examples for Torch quantization, and Channel Pruning & Spatial SVD compression

  • +
+

Documentation

+ +
+
+

1.17.0

+
    +
  • Add Adaround TF feature

  • +
  • Added Examples for Torch quantization, and Channel Pruning & Spatial SVD compression

  • +
+

Documentation

+ +
+
+

1.16.2.py37

+
    +
  • Added a new post-training quantization feature called AdaRound, which stands for AdaptiveRounding

  • +
  • Quantization simulation and QAT now also support recurrent layers (RNN, LSTM, GRU)

  • +
+

Documentation

+ +
+
+

1.16.2

+
    +
  • Added a new post-training quantization feature called AdaRound, which stands for AdaptiveRounding

  • +
  • Quantization simulation and QAT now also support recurrent layers (RNN, LSTM, GRU)

  • +
+

Documentation

+ +
+ +
+

1.16.1

+
    +
  • Added separate packages for CPU and GPU models. This allows users with CPU-only hosts to run AIMET.

  • +
  • Added separate packages for PyTorch and TensorFlow. Reduces the number of dependencies that users would need to install.

  • +
+

Documentation

+ +
+ + + +
+ + +
+
+
+ +
+ +
+

© Copyright 2020, Qualcomm Innovation Center, Inc..

+
+ + Built with Sphinx using a + theme + provided by Read the Docs. + + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/spatial_svd.html b/releases/1.32.2/user_guide/spatial_svd.html new file mode 100644 index 00000000..2b9c6d68 --- /dev/null +++ b/releases/1.32.2/user_guide/spatial_svd.html @@ -0,0 +1,1138 @@ + + + + + + AIMET Spatial SVD — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Spatial SVD

+

Spatial SVD is a tensor decomposition technique which decomposes one large layer (in terms of mac or memory) into two smaller layers. SVD stands for Singular Value Decomposition.

+

Given a conv layer, with kernel (𝑚,𝑛,ℎ,𝑤) where 𝑚 is the input channels, 𝑛 the output channels, and ℎ, 𝑤 giving the height and width of the kernel itself, Spatial SVD will decompose the kernel into two kernels. One of size (𝑚,𝑘,ℎ,1) and one of size (𝑘,𝑛,1,𝑤), where k is called the rank. The smaller the value of k the larger the degree of compression achieved.

+

The following diagram illustrates this visually. As you can see, Spatial SVD decomposes both the output channel dimension as well as the size of the conv kernel itself. Spatial SVD is currently supported for Conv layers in AIMET.

+../_images/spatial_svd.png +
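A quick parameter count for an illustrative Conv layer (the layer sizes and rank below are arbitrary choices, not AIMET defaults), showing how the rank k controls the degree of compression:

m, n, h, w = 64, 128, 3, 3        # input channels, output channels, kernel height/width
k = 32                            # chosen rank

original = m * n * h * w                        # 73,728 weights
spatial_svd = m * k * h * 1 + k * n * 1 * w     # (m,k,h,1) + (k,n,1,w) = 6,144 + 12,288 = 18,432

print(1 - spatial_svd / original)               # => 0.75, i.e. a 4x reduction at this rank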
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/visualization_compression.html b/releases/1.32.2/user_guide/visualization_compression.html new file mode 100644 index 00000000..0e0e5b73 --- /dev/null +++ b/releases/1.32.2/user_guide/visualization_compression.html @@ -0,0 +1,1214 @@ + + + + + + AIMET Visualization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Visualization

+
+

Overview

+

AIMET Visualization adds analytical capability to the AIMET tool (which helps quantize and compress ML models) through visualization. It provides more detailed insights into AIMET features, as users are able to analyze a model’s layers in terms of compressibility and also highlight potential issues when applying quantization. The tool also assists in displaying progress for computationally heavy tasks.

+
+
+

Design

+

Given a model, a user can start a Bokeh server session and then invoke functions which will produce visualizations to help analyze and understand the model before using AIMET quantization and compression features.

+../_images/vis_1.png +
+
+

Compression

+

Evaluation scores during compression are displayed in a table as they are computed, and users can see progress being displayed while these scores are computed. After Greedy Selection has run, the optimal compression ratios are also displayed in a graph.

+../_images/vis_4.png +../_images/vis_5.png +../_images/vis_6.png +../_images/vis_7.png +
+
+

Starting a Bokeh Server Session:

+

Start a Bokeh server by typing this command: bokeh serve --allow-websocket-origin=<host name>:<port number> --port=<port number>

+

--allow-websocket-origin tells the Bokeh server which network addresses to accept websocket connections from; it is not needed just to view locally.

+

--port tells the Bokeh server what network port to listen on, rather than the default port of 5006.

+
+
+

How to use the tool

+

Model Compression

+
    +
1. Start a Bokeh server by typing this command: bokeh serve --allow-websocket-origin=<host name>:<port number> --port=<port number>

  2. +
  3. +
    To visualize eval scores and compression ratios during execution time:
      +
    1. +
      Input a visualization URL into the top level function: compress_model. This url is http://<host name>:<port number>/
        +
1. For model compression, the visualization URL is passed through compress_model. If no visualizations are necessary, the URL defaults to None.

      2. +
      +
      +
      +
    2. +
    3. +
      Finally, go to the URL to see the visualizations.
        +
      1. The session-id here is: compression. So the URL would look something like this:

      2. +
      3. http://<host name>:<port number>/?&bokeh-session-id=compression

      4. +
      +
      +
      +
    4. +
    +
    +
    +
  4. +
  5. +
    To visualize eval scores and compression ratios after execution:
      +
    1. +
      Use API doc to decide which functions to use. They should be under “Model Compression.”
        +
      1. First instantiate a VisualizeCompression instance by passing in a visualization URL. This url is http://<host name>:<port number>/

      2. +
      +
      +
      +
    2. +
    3. +
      There are two functions:
        +
      1. display_eval_scores

      2. +
      3. display_comp_ratio_plot

      4. +
      +
      +
      +
    4. +
    5. +
      Finally, go to the URL to see the visualizations
        +
      1. The session-id here is: compression. So the URL would look something like this:

      2. +
      3. http://<host name>:<port number>/?&bokeh-session-id=compression

      4. +
      +
      +
      +
    6. +
    +
    +
    +
  6. +
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/visualization_quant.html b/releases/1.32.2/user_guide/visualization_quant.html new file mode 100644 index 00000000..d0e471b1 --- /dev/null +++ b/releases/1.32.2/user_guide/visualization_quant.html @@ -0,0 +1,1164 @@ + + + + + + AIMET Visualization for Quantization — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Visualization for Quantization

+
+

Overview

+

AIMET Visualization adds analytical capability to the AIMET tool (which helps quantize and compress ML models) through visualization. It provides more detailed insights into AIMET features as users are able to analyze a model’s layers in terms of compressibility and also highlight potential issues when applying quantization. The tool also assists in displaying progress for computationally heavy tasks. The visualizations get saved as an HTML file under the specified directory.

+
+
+

Quantization

+

During quantization, common parameters are used throughout a layer for converting the floating point weight values to INT8. If the dynamic range of the weights is very high, the quantization will not be very granular. To equalize the weight range we apply Cross Layer Equalization. In order to understand whether we need to apply Cross Layer Equalization, we can visualize the weight range for every channel in a layer. If the weight range varies a lot across the various channels, applying Cross Layer Equalization helps in improving the quantization accuracy.

+../_images/vis_3.png +
+

PyTorch

+

In PyTorch, we can visualize the weights for a model. We can also visualize the weight ranges for a model before and after Cross Layer Equalization. +There are three main functions a user can invoke:

+
    +
  1. User can analyze relative weight ranges of model to see potentially problematic layers for quantization

  2. +
  3. User can understand each layer in the model

  4. +
  5. User can visualize the model, comparing weights before and after quantization.

  6. +
+
+
+

TensorFlow

+

In TensorFlow, we can visualize the weight ranges and relative weight ranges over various channels in a layer. +User can also use the same functions to see the changes in a layer weight ranges before and after Cross Layer Equalization.

+

There are two main functions a user can invoke:

+
    +
  1. User can analyze relative weight ranges of a layer to see potentially problematic layers for quantization

  2. +
  3. User can visualize weight ranges of a layer and see the various statistics for weights

  4. +
+
+
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/weight_svd.html b/releases/1.32.2/user_guide/weight_svd.html new file mode 100644 index 00000000..f5c3e821 --- /dev/null +++ b/releases/1.32.2/user_guide/weight_svd.html @@ -0,0 +1,1138 @@ + + + + + + AIMET Weight SVD — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+
+ +
+
+
+
+ +
+

AIMET Weight SVD

+

Weight SVD is a tensor decomposition technique which decomposes one large layer (in terms of mac or memory) into two smaller layers. SVD stands for Singular Value Decomposition.

+

Given a neural network layer, with kernel (𝑚,𝑛,ℎ,𝑤) where 𝑚 is the input channels, 𝑛 the output channels, and ℎ, 𝑤 giving the height and width of the kernel itself, Weight SVD will decompose the kernel into one of size (𝑚,𝑘,1,1) and another of size (𝑘,𝑛,h,𝑤), where 𝑘 is called the rank. The smaller the value of 𝑘 the larger the degree of compression achieved.

+

The following diagram illustrates this visually. As you can see, Weight SVD decomposes the output channel dimension. Weight SVD is currently supported for Conv and Fully-connected layers in AIMET.

+../_images/weight_svd.png +
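The same kind of parameter count for an illustrative layer (the sizes and rank are arbitrary choices, not AIMET defaults), showing the reduction Weight SVD achieves at a given rank k:

m, n, h, w = 64, 128, 3, 3        # input channels, output channels, kernel height/width
k = 32                            # chosen rank

original = m * n * h * w                        # 73,728 weights
weight_svd = m * k * 1 * 1 + k * n * h * w      # (m,k,1,1) + (k,n,h,w) = 2,048 + 36,864 = 38,912

print(1 - weight_svd / original)                # => ~0.47, i.e. roughly half the weights at this rank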
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/1.32.2/user_guide/winnowing.html b/releases/1.32.2/user_guide/winnowing.html new file mode 100644 index 00000000..b4c535c3 --- /dev/null +++ b/releases/1.32.2/user_guide/winnowing.html @@ -0,0 +1,1153 @@ + + + + + + AIMET Winnowing — AI Model Efficiency Toolkit Documentation: ver 1.32.2 + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+
+ +
+
+ +
+

AIMET Winnowing

+
+

Overview

+

The model compression algorithm, Channel Pruning, identifies modules in a model for which a subset of input channels could be pruned without losing much accuracy. Unless explicitly removed, these input channels take up memory and add unnecessary computation. For each identified module, the Winnow tool removes the input channels that were selected for pruning. Only Conv2D layers are supported for winnowing.

+
+
+

Winnowing Overview

+

The following figure provides a pictorial overview of Winnowing. In this example, a module in a model has an input volume of HxWx8, where H = Height, W = Width and the number of input channels = 8. The Channel Pruning algorithm identifies that for this module, input channels 1, 4 and 7 should be pruned. Winnowing removes the identified input channels from this module. The module’s input volume is thereby reduced to HxWx5.

+../_images/winnow_1.png +
+
+

How Winnowing Works

+

When the number of input channels of a Conv module is reduced, the output channels of the module above it must also be modified. If the module above is another Conv layer, that Conv layer’s output channels are also reduced to match the number of input channels of the winnowed Conv module. If the module above is NOT a Conv layer (e.g., BatchNorm, ReLU), that module simply propagates the changes upstream. That is, both the output and input channels of the BatchNorm and ReLU modules are winnowed to match the winnowed channels of the Conv layer just below them.

+

The following figure explains a very simple scenario. In this scenario, a Conv module has been identified for winnowing a subset of its input channels. This is indicated by the green color on the left side of the figure. The right side of the figure indicates the actions taken by Winnowing. Winnowing consists of the following changes made to the 3 affected modules.

+

The identified Conv module’s subset of input channels is removed. This is indicated by the pink color on the right side of the figure. The module just above the winnowed Conv module is NOT a Conv module. It could be a ReLU or a BatchNorm module. For this module, the corresponding output and input channels are winnowed. This is indicated by the orange color on the right side of the figure. The module above the ReLU/BatchNorm is another Conv module. This Conv module’s output channels are winnowed. This is indicated by the pink color on the right side of the figure.

+../_images/winnow_2.png +
+
+ + +
+
+ +
+
+
+
+ + + + \ No newline at end of file diff --git a/releases/latest b/releases/latest index 925bfd9b..9d406f95 120000 --- a/releases/latest +++ b/releases/latest @@ -1 +1 @@ -1.32.1 \ No newline at end of file +1.32.2 \ No newline at end of file